Two community puzzles were recently added which use multibyte characters (é, £, €, …) in some of the validators. I’m talking about “7-segment display” (validator #2) and “Snake encoding” (validator #4).
This makes the puzzles more difficult to solve for some languages, and I don’t think that it’s very interesting.
Nevertheless, when such characters are used, please:
add at least one visible test case with multibyte characters;
maybe say clearly in the problem statement that such characters are allowed;
certainly don’t say that “the input text is composed of ASCII characters” (Snake encoding), since it’s not true.
Sorry. I voted to approve both puzzles, and because of my language choice when solving the problem, I didn’t notice the issue. I will try to pay more attention to this in the future. For now, I agree with you that something should be done to rectify the situation for these two puzzles.
Not only this, but the “Auto-generated code [that] aims at helping you parse the standard input according to the problem statement” in C uses “char LINE[21]; fgets(LINE, 21, stdin);”, which is wrong if multibyte characters are allowed: that call reads at most 20 bytes, and a line of multibyte characters can need more bytes than it has characters.
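To illustrate the mismatch between bytes and characters, here is a minimal sketch that counts UTF-8 characters (code points) in a byte string. The function name `utf8_count` is made up for this example, and the code assumes the input is valid UTF-8.

```c
#include <stddef.h>

/* Count UTF-8 code points in a NUL-terminated byte string.
   Continuation bytes have the bit pattern 10xxxxxx and are skipped,
   so only the first byte of each encoded character is counted.
   Assumption: the input is valid UTF-8 (no validation is done). */
size_t utf8_count(const char *s)
{
    size_t n = 0;
    for (; *s != '\0'; ++s) {
        if (((unsigned char)*s & 0xC0) != 0x80)
            ++n;
    }
    return n;
}
```

For the UTF-8 string “héhé” this counts 4 characters, while strlen reports 6 bytes, so a buffer sized for 20 characters plus the terminator can be too small.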
Yes, I did try that, but for some reason it didn’t work (I used getwchar instead of scanf, and it kept returning -1, I don’t know why).
I have now solved the problem by writing a getutf8 and a pututf8 function, returning and taking an unsigned int, so I only have to #include <stdio.h>. Now I have seen your code, and you can see mine.
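For readers who want to try this approach, here is a rough sketch of what such a pair of functions might look like. It is not the code referred to above: the signatures, the `UTF8_EOF` sentinel and the error handling are assumptions made for this example, and overlong or otherwise invalid sequences are not fully rejected.

```c
#include <stdio.h>

/* Hypothetical sentinel returned at end of input or on a malformed sequence. */
#define UTF8_EOF ((unsigned int)-1)

/* Read one UTF-8 encoded code point from `in`. */
unsigned int getutf8(FILE *in)
{
    int c = getc(in);
    unsigned int cp;
    int extra;                         /* number of continuation bytes expected */

    if (c == EOF) return UTF8_EOF;
    if (c < 0x80)                { cp = (unsigned int)c; extra = 0; }
    else if ((c & 0xE0) == 0xC0) { cp = (unsigned int)c & 0x1F; extra = 1; }
    else if ((c & 0xF0) == 0xE0) { cp = (unsigned int)c & 0x0F; extra = 2; }
    else if ((c & 0xF8) == 0xF0) { cp = (unsigned int)c & 0x07; extra = 3; }
    else return UTF8_EOF;              /* stray continuation or invalid lead byte */

    while (extra-- > 0) {
        c = getc(in);
        if (c == EOF || (c & 0xC0) != 0x80) return UTF8_EOF;
        cp = (cp << 6) | ((unsigned int)c & 0x3F);
    }
    return cp;
}

/* Write code point `cp` to `out` as UTF-8. Returns 0 on success, EOF on failure. */
int pututf8(unsigned int cp, FILE *out)
{
    unsigned char buf[4];
    size_t n;

    if (cp < 0x80) {
        buf[0] = (unsigned char)cp;
        n = 1;
    } else if (cp < 0x800) {
        buf[0] = (unsigned char)(0xC0 | (cp >> 6));
        buf[1] = (unsigned char)(0x80 | (cp & 0x3F));
        n = 2;
    } else if (cp < 0x10000) {
        buf[0] = (unsigned char)(0xE0 | (cp >> 12));
        buf[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        buf[2] = (unsigned char)(0x80 | (cp & 0x3F));
        n = 3;
    } else if (cp < 0x110000) {
        buf[0] = (unsigned char)(0xF0 | (cp >> 18));
        buf[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
        buf[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        buf[3] = (unsigned char)(0x80 | (cp & 0x3F));
        n = 4;
    } else {
        return EOF;                    /* beyond the Unicode range */
    }
    return fwrite(buf, 1, n, out) == n ? 0 : EOF;
}
```

Because these functions only use byte I/O, they do not depend on the locale at all, which sidesteps the setlocale questions discussed in this thread.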
That’s a rather subtle difference… and I’m not even sure it is correct. With the input string “héhé” in UTF-8, getwchar is called 6 times (two times for each accented letter). If I call scanf on a wchar_t buffer with that same input string, wcslen also returns 6, and I can print its contents character by character with printf ("%lc", …), in which case printf is again called 6 times. Likewise I can iterate through that buffer and print its contents with putwchar, that must be called 6 times again.
Got it. Actually you should use setlocale(LC_CTYPE, "en_US.UTF-8") instead of setlocale(LC_CTYPE, ""), unless your default encoding happens to be UTF-8. Otherwise scanf, wcslen and friends have no way to magically detect that the input is UTF-8.
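A small sketch of how the locale changes the character count. The helper name `wide_length` is invented for this example; it relies on the standard mbsrtowcs, which reports how many wide characters a multibyte string decodes to under the current LC_CTYPE. Whether "en_US.UTF-8" is installed depends on the machine, so the function returns -1 when it is not.

```c
#include <locale.h>
#include <string.h>
#include <wchar.h>

/* Return the number of wide characters that `bytes` decodes to under the
   locale `locale_name`, or -1 if that locale is unavailable or the bytes
   are not valid in its encoding.
   Note: this changes the program's global LC_CTYPE as a side effect. */
long wide_length(const char *bytes, const char *locale_name)
{
    mbstate_t state;
    const char *src = bytes;
    size_t n;

    if (setlocale(LC_CTYPE, locale_name) == NULL)
        return -1;                        /* locale not installed */

    memset(&state, 0, sizeof state);      /* initial conversion state */
    n = mbsrtowcs(NULL, &src, 0, &state); /* NULL destination: count only */
    return n == (size_t)-1 ? -1 : (long)n;
}
```

Under a UTF-8 locale, the 6 bytes of “héhé” should decode to 4 wide characters. Under a single-byte locale such as ISO-8859-1, every byte is its own character, which would explain the count of 6 reported above.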
A last reply to myself: actually setlocale(LC_CTYPE, "en_US.UTF-8") doesn’t even work as I would have expected: you have to use either scanf and printf or getwchar and putwchar, you cannot mix them. At least it doesn’t work reliably on my computer, whose local encoding is ISO-8859-1: for example, putwchar-printf does not output the characters of the printf, and printf-putwchar-printf outputs the characters of both.
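One likely explanation for the mixing problem is stream orientation: in standard C, the first I/O operation on a stream makes it either byte-oriented or wide-oriented, and functions of the other kind must not be used on it afterwards. The sketch below only queries the orientation with fwide; the helper name `orientation` is made up for this example.

```c
#include <stdio.h>
#include <wchar.h>

/* Report the orientation of a stream without changing it:
   1 for wide-oriented, -1 for byte-oriented, 0 if not yet decided.
   Calling fwide with mode 0 only queries the current orientation. */
int orientation(FILE *f)
{
    int o = fwide(f, 0);
    return (o > 0) - (o < 0);
}
```

A fresh stream reports 0. After a putwchar or fputwc it reports 1, and after a printf or fputs it reports -1. So once stdout has been used with putwchar, a later printf on it is not guaranteed to work, and the reverse holds too.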
AFAIU, setlocale(LC_CTYPE, "") makes sense in the CG IDE, as the code is executed on the server side, probably with the same locale as the puzzle text.
On your local machine, the locale might of course be different.
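To check what setlocale(LC_CTYPE, "") actually selects on a given machine, something like the following sketch can help. It assumes a POSIX system, since nl_langinfo and CODESET come from <langinfo.h> and are not part of standard C. The function name `environment_codeset` is invented for this example.

```c
#include <langinfo.h>
#include <locale.h>
#include <stddef.h>

/* Adopt the LC_CTYPE named by the environment (LANG, LC_ALL, ...) and
   return the name of its character encoding, e.g. "UTF-8".
   Returns NULL if the environment names a locale that is not installed.
   Assumption: POSIX nl_langinfo is available. */
const char *environment_codeset(void)
{
    if (setlocale(LC_CTYPE, "") == NULL)
        return NULL;
    return nl_langinfo(CODESET);
}
```

Printing the result once at the start of a program, both locally and in the online IDE, shows whether the two environments agree on the encoding.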