Feel free to send your feedback or ask some help here!
Hi,
I post here because i need some advice for this puzzle.
My first idea for this challenge was obviously to use trees so i created :
struct Tree{
Ā Ā Ā char data; // a number (between 0 and 9)
Ā Ā Ā vector< Tree > childs; // following of any phone numbers starting in the same way
};
In that way, i thought the following picture was perfectly illustrated.
http://ide.codingame.com/fileservlet?id=266985027077
So, I created also 3 little functions :
int mem(char number, vector childs); // return the index of the child which data equals number, or -1 if number not found
Tree create(string tel); // called when mem returns -1, simply create unary tree (each node is one number of tel) --recursive
Tree add(Tree mainTree, string phone, int* length); // use in the main function, add a new phone number to the tree and increase the length when create() is called (in that way length not increased until mem returns -1) --recursive
My programs works perfectly for the 4 first tests but fails the last one and i can see in the console that the program stops at about the 750th number (on 10.000).
So i suppose my idea was not optimal (or maybe functions implementations) and iām looking for some new leads, or hints. I think about using struct Tree {map< int, Tree>} maybeā¦ ?
Thank you for your help !
Edit : Made in C++
I think youāre overkilling it, we only need to deal with strings and substrings here.
If you decompose a given string (phone number), with all its prefixes, you get exactly phone.length()
substrings. 0123 -> 0, 01, 012, 0123
- so the idea is to check the existence of each prefix in a common list of prefixes. If itās new, then you need to count it (+1), if itās already there, do not count it (0). For example we already have 0123
stored, now you compute 01299 -> 0, 01, 012, 0129, 01299
, the first three give a result of 0 (already in our list), but both 0129
and 01299
are new, so the whole string would compute as 0+0+0+1+1
This idea allowed me to compute 7k5 phone numbers before timeoutā¦ Not quite enough for the 10k needed, had to use another trick to reduce the computing time from around 17" to 3" (it took me a little while though, but the trick is simple enough)
I did the same thing at first with a recursive search, and by the end it was taking far too long. What rkj said is key! Without giving away the answer, think on the fastest way to count all the string/substring entries. In the end this puzzle is the shortest code block Iāve written for any puzzle at CodinGame. The longest list test completes on avg in about 80 milliseconds. (this is coded in Go)
I did it using the ātreeā algorithm and I passed all tests in C++, so I guess it is more an implementation issue than an algorithmic problem. However, I did not use a vector, but a Tree* table[10] filled with NULL in the constructor. Then you can fill the table only when you need it, which can probably speed the algorithm up to a factor of 10.
Combining the āstringā method to a binary search (perhaps using a map container, which uses an efficient hash method) looks like an interesting candidate for reaching even faster speedā¦
hi, I used the same system as rkj (almost) -> decompose a String Phone number in substring and try to add each substring to a list if it doesnāt already exist.
at the end return the size of the list.
with Java and the Set Interface, processing the 10k phone numbers requires 180ms.
so thatās definitively a good solution (and the shortest I wrote in Codingameā¦)
Actually, you donāt even need additional storage. Simply sorting the input (using an appropriate comparator) and then iterating once over the pairs of consecutive numbers is enough. 20 lines of Clojure (I know, I knowā¦ Using Clojure is kind of cheating :P)
20 lines is actually long for this puzzle.
Oh noes!!! *clojure_addict heads over to a dark corner and starts crying uncontrollably not being able to face the humiliationā¦*
Seriously though, 20 lines of what language and using what approach?
EDIT: Actually, I just realized that I could improve it more. I noticed that I donāt need a custom comparator after all (the built-in string comparator is enough), and I also optimized away a variable I used only once. I now am at 10 lines!
I need 12 lines with Java.
12 lines sounds pretty good for a Java solution. Iām curious, how much of it is boilerplate (e.g. imports, class definitions, method signatures etcā¦) and how much is actual problem solving code? For example, ATM, I have 7 lines of Clojure but the body of my main function only spans 3 lines. The rest are namespace declarations, newlines for readability etcā¦
Things get so much easier after reading tips on solving puzzles.
.NET Spoiler Alert
http://pastebin.com/rVXnA3kq
UPD: ugly not formatted non-readable and non-usable one-liner :-D:
Console.WriteLine(new HashSet<string>(Enumerable.Range(0, int.Parse(Console.ReadLine())).Select(t => Console.ReadLine()).SelectMany(t => Enumerable.Range(1, t.Length).Select(index => t.Substring(0, index)))).Count);
As @nickmaovich prove it, you shouldnāt count in lines
but instead count in characters
yep, in fact, the complete code needs 19 lines :
4 for imports and new line,
1 for āclass solutionā
1 for " public static void main"
3 to read input (scanner etcā¦)
7 really doing the job,
1 for output
and 2 more for closing ā}ā
A problem here, my implementation go through the first tests but not the last one.
My solution is similar of @arnaudusā. But I get a Segmentation fault (only on the last test) that I donāt understandā¦ I would be glad if one of you can have a look.
//Tree structure
struct Node{
Node* children[10]={NULL};
int elm=-1;
};
typedef struct Node Node;
//add a new node and increment the compter
Node* add(Node* dad, int _elm,int& cpt){
Node* son;
son=(Node*)malloc(sizeof(Node));
son->elm=_elm;
dad->children[_elm]=son;
cpt++;
return son;
}
int main(){
int cpt=0;
Node* phone_book;
phone_book=(Node*)malloc(sizeof(Node));
int N;
cin >> N; cin.ignore();
for (int i = 0; i < N; i++){
string telephone;
cin >> telephone; cin.ignore();
Node* cnode=phone_book;//phone_book cursor
for(string::iterator it=telephone.begin();it!=telephone.end();++it){
Node* child=cnode->children[*it];//Segmentation fault
if(child==NULL)
cnode=add(cnode,*it,cpt);
else
cnode=child;
}
}
cout << cpt << endl;
I think I will try the solution proposed above (that seems a bit cheaty to me but should work easily)
I think that the problem comes from "mallocā but Iām still very naive about dynamic memory allocation so I cannot figure out where is the bug.
Thankās in advance.
Ok, so hereās what iām gonna do: Iām gonna explain why what you did isnāt good for C++ and then iām gonna explain two ways to do it, the way to do it properly, and the way to it your way.
PLEASE refrain from using another code, yes youāll pass, get 100% and badges, but you wonāt have learn anything.
So why wrong?
What you did is C, malloc is a good function when you know how to use it, but only in C.
In C++ we almost never use malloc, for simple reasons that Iāll explain later.
Wrong too because you reinvent the wheel, which is fine in C, but not in C++, you try to implement a trie and thatās good, but to do linked list, use <list>, <vector>
etcā¦
Solutions
Good one:
class Node {
public:
char val;
Node(new_val) : val(new_val) {}
vector<Node *> children;
};
And in the main
vector<Node *> list
and when you need to add a new elem:
list.push_back(new Node(char_value))
Bad one:
struct Node{
Node* children[10]={NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL};
int elm=-1;
};
Your children was badly initialized, which cause
Node* child=cnode>children[*it];//Segmentation fault
To fail.
If you want to continue coding like this, please use C, doing C++ wonāt do you any good if you keep using C coding style. if not please look at this: Vector Reference
Hope I helped
Thank you for your quick answer @CvxFous (as always).
I know Iām still C oriented and I try to get read of it, but itās never easy to forget the language you learned first. Well, enough excuses.
So basicaly, you told me to replace my malloc by a constructor. Itās the only difference between your class and my struct, am I wrong ? Iāll try it !
Then I still donāt understand why my code only crashes on test 5 and not before. And what is the purpose of your ālistā (which is a vectorā¦ notation very disturbing to me).
I know what a C++ vector is, thank you, and that is why Iām not using it here. In fact, I need a random acces in O(1) thatās why I use an array.
Donāt fear, you always help, and I thank you again for that.
Because in your struct you didnāt initialize all the 10 elements of children.
Letās take an example:
Node* child=cnode->children[*it];
if(child==NULL)
cnode=add(cnode,*it,cpt);
else
cnode=child;
if (*it) == 2 then cnode->children[2], because it is unitialized, may be an invalid pointer like 0xFFF02521. so child isnāt NULL and you affect cnode to the value of child.
Then you do another loop, and again:
Node* child=cnode->children[*it];
and there you got cnode which is 0xFFF02521 and you try to access to the value that it is pointing cnode->children (which means (*cnode).children). Not only cnode is random, but *cnode is more and cnode->children is another layer of random, and chances are that you access invalid memory adress -> segfault.
Now thereās a second issue with your program.
Truth is, C++ is a good guy, even if you didnāt initialize everything, it MAY have do it for you, so MAYBE the problem isnāt there.
But one thing sure is:
for(string::iterator it=telephone.begin();it!=telephone.end();++it)
Node* child=cnode->children[*it];//Segmentation fault
*it doesnāt range from 0 to 9, it range from ā0ā to ā9ā, which is from 48 to 57. Even if C++ has initialize to NULL the first ten value of your array, it didnāt for THAT far away.
Fix: *it - 48 of *it - ā0ā
Now, second part
C++ vector isnāt only a list, is also the equivalent of array in C.
You can do stuff like tab[5] with it.
And you didnāt do a hastable of O(1), you did a Trie, each element contain up to N sons, and each sons contain up to N sons. The complexity is O(N) to access the element. for you N is 10 so it is trivial and youāve found a great way to do the exercice. (I did it that way too)
Back to C++.
My point is that with c++, you donāt need to create your own add function, neither do any malloc:
vector<Node*> phone_book(10, NULL);
int N;
cin >> N; cin.ignore();
for (int i = 0; i < N; i++){
string telephone;
cin >> telephone; cin.ignore();
vector<Node*> &cnode = phone_book;//phone_book cursor
for(string::iterator it=telephone.begin();it!=telephone.end();++it){
Node* child=cnode[*it - '0'];
if(child==NULL)
{
Node *n = new Node(*it - '0')
cnode.push_back(b);
cnode = n->children;
}
else
cnode=child->children;
}
}
cout << cpt << endl;
Something like that
Thank you for your time !
This was the big problem, I was starting to desperate! Iām kindāof blind in this caseā¦ Iām glad you enlighted this point for me. Really BIG thankās.
For me, N always refer to the input, so O(10) ~ O(1) ^^"
But basically youāre right, Iāll become more vector friendly, I promise !
I wish there were more people like you on the forum. Hope Iām not a too stupid student.
Eryns.
Actually, when reading your code, it seems obvious to me that a clean and elegant solution would be to use a map<char, Node*>
instead of the vector. No need to initialize it and to fill it with NULL, or even to save some space for all these NULL elements. And this will work for any character, and not only for 0-9 (think about the + in international phone numbers ). And no need to assume that 0-9 are contingous and ordered in the character table.
There is probably a small cost in using map instead of vector, but a lot of time is saved when constructing a new node, so I wonder if the map thing is not faster.