[Community Puzzle] Byte pair encoding

https://www.codingame.com/training/medium/byte-pair-encoding

Send your feedback or ask for help here!

Created by @fpringle,validated by @murrayr,@xspeedasx and @Radu_Sav.
If you have any issues, feel free to ping them.

Continuing the discussion from [Community Puzzle] Byte pair encoding:

What about “aabdaabcab” case?
Can you expane what answer should be for this input?

aabdaabcab:

YdYcZ
Z = ab
Y = aZ

input -----------------------
aabdaabcab

process 1 -----------------------

  • input: aabdaabcab
  • most common byte pair: ab with 3 occurrences
  • replaced by: Z
  • output: aZdaZcZ

process 2 -----------------------

  • intput: aZdaZcZ
  • most common byte pair: aZ with 2 occurrences
  • replaced by: Y
  • output: YdYcZ

return -----------------------
YdYcZ
Z = ab
Y = aZ

Not passing last test case any hint

I don’t think there is anything special about the last test case…

I would suggest looking at the expected output to see where the encoding rules differ from your answer
(you can use the “Show testcases” button to see the full expected output for each test).

1 Like

thanks! never knew we can see test case expected like that(Im new to the platform).
And Finally solved!!

1 Like

Continuing the discussion from [Community Puzzle] Byte pair encoding:

Hi,
Could you help me ? I don’t get the third test:

word = aaaaaaaaaaaaaaabbbbbbbbbbbbbbbcccccccccc

First try:
pair identified : “aa”
==> word = ZZZZZZZabbbbbbbbbbbbbbbcccccccccc
Second try :
pair identified : “ZZ”
==> word = YYYZabbbbbbbbbbbbbbbcccccccccc

I don’t get why the word should be “ZZZZZZZaYYYYYYYbcccccccccc” as the first pair is “ZZ”

I may have missed a rule…

Your “second try” is incorrect. “ZZ” is not the most common pair.

value: aaaaaaaaaaaaaaabbbbbbbbbbbbbbbcccccccccc

most common pair: aa, replacement Z
value: ZZZZZZZabbbbbbbbbbbbbbbcccccccccc

most common pair: bb, replacement Y
value: ZZZZZZZaYYYYYYYbcccccccccc

most common pair: cc, replacement X
value: ZZZZZZZaYYYYYYYbXXXXX

most common pair: ZZ, replacement W
value: WWWZaYYYYYYYbXXXXX

most common pair: YY, replacement V
value: WWWZaVVVYbXXXXX

most common pair: XX, replacement U
value: WWWZaVVVYbUUX

2 Likes

Recursion is not needed to solve this problem. A single while loop that just checks if there is more to be done is sufficient.