Telephone Numbers puzzle discussion

CodinGame65 · June 17, 2014, 8:13am

Feel free to send your feedback or ask some help here!

Twibby · October 21, 2014, 6:10pm

Hi,
I post here because i need some advice for this puzzle.

My first idea for this challenge was obviously to use trees so i created :

struct Tree{
char data; // a number (between 0 and 9)
vector< Tree > childs; // following of any phone numbers starting in the same way
};

In that way, i thought the following picture was perfectly illustrated.
http://ide.codingame.com/fileservlet?id=266985027077

So, I created also 3 little functions :

int mem(char number, vector childs); // return the index of the child which data equals number, or -1 if number not found

Tree create(string tel); // called when mem returns -1, simply create unary tree (each node is one number of tel) --recursive

Tree add(Tree mainTree, string phone, int* length); // use in the main function, add a new phone number to the tree and increase the length when create() is called (in that way length not increased until mem returns -1) --recursive

My programs works perfectly for the 4 first tests but fails the last one and i can see in the console that the program stops at about the 750th number (on 10.000).

So i suppose my idea was not optimal (or maybe functions implementations) and i’m looking for some new leads, or hints. I think about using struct Tree {map< int, Tree>} maybe… ?
Thank you for your help !

Edit : Made in C++

rkj · October 26, 2014, 12:17am

I think you’re overkilling it, we only need to deal with strings and substrings here.

If you decompose a given string (phone number), with all its prefixes, you get exactly phone.length() substrings. 0123 -> 0, 01, 012, 0123 - so the idea is to check the existence of each prefix in a common list of prefixes. If it’s new, then you need to count it (+1), if it’s already there, do not count it (0). For example we already have 0123 stored, now you compute 01299 -> 0, 01, 012, 0129, 01299, the first three give a result of 0 (already in our list), but both 0129 and 01299 are new, so the whole string would compute as 0+0+0+1+1

This idea allowed me to compute 7k5 phone numbers before timeout… Not quite enough for the 10k needed, had to use another trick to reduce the computing time from around 17" to 3" (it took me a little while though, but the trick is simple enough)

therealemm · October 27, 2014, 8:10am

I did the same thing at first with a recursive search, and by the end it was taking far too long. What rkj said is key! Without giving away the answer, think on the fastest way to count all the string/substring entries. In the end this puzzle is the shortest code block I’ve written for any puzzle at CodinGame. The longest list test completes on avg in about 80 milliseconds. (this is coded in Go)

arnaudus · October 27, 2014, 4:09pm

I did it using the “tree” algorithm and I passed all tests in C++, so I guess it is more an implementation issue than an algorithmic problem. However, I did not use a vector, but a Tree* table[10] filled with NULL in the constructor. Then you can fill the table only when you need it, which can probably speed the algorithm up to a factor of 10.

Combining the “string” method to a binary search (perhaps using a map container, which uses an efficient hash method) looks like an interesting candidate for reaching even faster speed…

minioim · October 27, 2014, 5:48pm

hi, I used the same system as rkj (almost) -> decompose a String Phone number in substring and try to add each substring to a list if it doesn’t already exist.

at the end return the size of the list.

with Java and the Set Interface, processing the 10k phone numbers requires 180ms.

so that’s definitively a good solution (and the shortest I wrote in Codingame…)

clojure_addict · October 30, 2014, 11:01pm

Actually, you don’t even need additional storage. Simply sorting the input (using an appropriate comparator) and then iterating once over the pairs of consecutive numbers is enough. 20 lines of Clojure (I know, I know… Using Clojure is kind of cheating :P)

NewboO · October 30, 2014, 11:45pm

20 lines is actually long for this puzzle.

clojure_addict · October 31, 2014, 12:22am

Oh noes!!! *clojure_addict heads over to a dark corner and starts crying uncontrollably not being able to face the humiliation…*

Seriously though, 20 lines of what language and using what approach?

EDIT: Actually, I just realized that I could improve it more. I noticed that I don’t need a custom comparator after all (the built-in string comparator is enough), and I also optimized away a variable I used only once. I now am at 10 lines!

minioim · November 5, 2014, 5:24pm

I need 12 lines with Java.

clojure_addict · November 9, 2014, 6:32am

12 lines sounds pretty good for a Java solution. I’m curious, how much of it is boilerplate (e.g. imports, class definitions, method signatures etc…) and how much is actual problem solving code? For example, ATM, I have 7 lines of Clojure but the body of my main function only spans 3 lines. The rest are namespace declarations, newlines for readability etc…

nickmao · November 10, 2014, 12:36pm

Things get so much easier after reading tips on solving puzzles.

.NET Spoiler Alert
http://pastebin.com/rVXnA3kq

UPD: ugly not formatted non-readable and non-usable one-liner :-D:

Console.WriteLine(new HashSet<string>(Enumerable.Range(0, int.Parse(Console.ReadLine())).Select(t => Console.ReadLine()).SelectMany(t => Enumerable.Range(1, t.Length).Select(index => t.Substring(0, index)))).Count);

CvxFous · November 14, 2014, 2:10pm

As @nickmaovich prove it, you shouldn’t count in lines but instead count in characters

minioim · November 20, 2014, 7:10pm

yep, in fact, the complete code needs 19 lines :

4 for imports and new line,

1 for “class solution”
1 for " public static void main"
3 to read input (scanner etc…)
7 really doing the job,
1 for output
and 2 more for closing “}”

Eryns · December 4, 2014, 7:20am

A problem here, my implementation go through the first tests but not the last one.
My solution is similar of @arnaudus’. But I get a Segmentation fault (only on the last test) that I don’t understand… I would be glad if one of you can have a look.

//Tree structure
struct Node{
     Node* children[10]={NULL};
     int elm=-1;
 };
 typedef struct Node Node;

 //add a new node and increment the compter
 Node* add(Node* dad, int _elm,int& cpt){
    Node* son;
    son=(Node*)malloc(sizeof(Node));
    son->elm=_elm;
    dad->children[_elm]=son;
    cpt++;
    return son;
 }

int main(){   
    int cpt=0;
    Node* phone_book;
    phone_book=(Node*)malloc(sizeof(Node));
    int N;
    cin >> N; cin.ignore();
    for (int i = 0; i < N; i++){
        string telephone;
        cin >> telephone; cin.ignore();
        Node* cnode=phone_book;//phone_book cursor
        for(string::iterator it=telephone.begin();it!=telephone.end();++it){
            Node* child=cnode->children[*it];//Segmentation fault
            if(child==NULL)
                cnode=add(cnode,*it,cpt);
            else
                cnode=child;
        }
    }
    cout << cpt << endl;

I think I will try the solution proposed above (that seems a bit cheaty to me but should work easily)
I think that the problem comes from "malloc’ but I’m still very naive about dynamic memory allocation so I cannot figure out where is the bug.

Thank’s in advance.

CvxFous · December 4, 2014, 10:07am

Ok, so here’s what i’m gonna do: I’m gonna explain why what you did isn’t good for C++ and then i’m gonna explain two ways to do it, the way to do it properly, and the way to it your way.

PLEASE refrain from using another code, yes you’ll pass, get 100% and badges, but you won’t have learn anything.

So why wrong?

What you did is C, malloc is a good function when you know how to use it, but only in C.
In C++ we almost never use malloc, for simple reasons that I’ll explain later.

Wrong too because you reinvent the wheel, which is fine in C, but not in C++, you try to implement a trie and that’s good, but to do linked list, use <list>, <vector> etc…

Solutions

Good one:

class Node {
public:
    char val;
    Node(new_val) : val(new_val) {}
    vector<Node *> children;
};

And in the main

vector<Node *> list

and when you need to add a new elem:

list.push_back(new Node(char_value))

Bad one:

struct Node{
     Node* children[10]={NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL};
     int elm=-1;
};

Your children was badly initialized, which cause

Node* child=cnode>children[*it];//Segmentation fault

To fail.

If you want to continue coding like this, please use C, doing C++ won’t do you any good if you keep using C coding style. if not please look at this: Vector Reference

Hope I helped

Eryns · December 5, 2014, 2:17am

Thank you for your quick answer @CvxFous (as always).
I know I’m still C oriented and I try to get read of it, but it’s never easy to forget the language you learned first. Well, enough excuses.

So basicaly, you told me to replace my malloc by a constructor. It’s the only difference between your class and my struct, am I wrong ? I’ll try it !
Then I still don’t understand why my code only crashes on test 5 and not before. And what is the purpose of your “list” (which is a vector… notation very disturbing to me).
I know what a C++ vector is, thank you, and that is why I’m not using it here. In fact, I need a random acces in O(1) that’s why I use an array.

Don’t fear, you always help, and I thank you again for that.

CvxFous · December 5, 2014, 9:00am

Because in your struct you didn’t initialize all the 10 elements of children.
Let’s take an example:

Node* child=cnode->children[*it];
if(child==NULL)
    cnode=add(cnode,*it,cpt);
else
    cnode=child;

if (*it) == 2 then cnode->children[2], because it is unitialized, may be an invalid pointer like 0xFFF02521. so child isn’t NULL and you affect cnode to the value of child.
Then you do another loop, and again:

Node* child=cnode->children[*it];

and there you got cnode which is 0xFFF02521 and you try to access to the value that it is pointing cnode->children (which means (*cnode).children). Not only cnode is random, but *cnode is more and cnode->children is another layer of random, and chances are that you access invalid memory adress -> segfault.

Now there’s a second issue with your program.
Truth is, C++ is a good guy, even if you didn’t initialize everything, it MAY have do it for you, so MAYBE the problem isn’t there.
But one thing sure is:

for(string::iterator it=telephone.begin();it!=telephone.end();++it)
    Node* child=cnode->children[*it];//Segmentation fault

*it doesn’t range from 0 to 9, it range from ‘0’ to ‘9’, which is from 48 to 57. Even if C++ has initialize to NULL the first ten value of your array, it didn’t for THAT far away.

Fix: *it - 48 of *it - ‘0’

Now, second part

C++ vector isn’t only a list, is also the equivalent of array in C.
You can do stuff like tab[5] with it.

And you didn’t do a hastable of O(1), you did a Trie, each element contain up to N sons, and each sons contain up to N sons. The complexity is O(N) to access the element. for you N is 10 so it is trivial and you’ve found a great way to do the exercice. (I did it that way too)

Back to C++.

My point is that with c++, you don’t need to create your own add function, neither do any malloc:

vector<Node*> phone_book(10, NULL);
    int N;
    cin >> N; cin.ignore();
    for (int i = 0; i < N; i++){
        string telephone;
        cin >> telephone; cin.ignore();
        vector<Node*> &cnode = phone_book;//phone_book cursor
        for(string::iterator it=telephone.begin();it!=telephone.end();++it){
            Node* child=cnode[*it - '0'];
            if(child==NULL)
            {
                Node *n = new Node(*it - '0')
                cnode.push_back(b);
                cnode = n->children;
            }
            else
                cnode=child->children;
        }
    }
    cout << cpt << endl;

Something like that

Eryns · December 8, 2014, 2:45am

Thank you for your time !

This was the big problem, I was starting to desperate! I’m kind’of blind in this case… I’m glad you enlighted this point for me. Really BIG thank’s.

For me, N always refer to the input, so O(10) ~ O(1) ^^"
But basically you’re right, I’ll become more vector friendly, I promise !

I wish there were more people like you on the forum. Hope I’m not a too stupid student.
Eryns.

arnaudus · December 11, 2014, 12:48pm

Actually, when reading your code, it seems obvious to me that a clean and elegant solution would be to use a map<char, Node*> instead of the vector. No need to initialize it and to fill it with NULL, or even to save some space for all these NULL elements. And this will work for any character, and not only for 0-9 (think about the + in international phone numbers ). And no need to assume that 0-9 are contingous and ordered in the character table.

There is probably a small cost in using map instead of vector, but a lot of time is saved when constructing a new node, so I wonder if the map thing is not faster.