[Community Puzzle] Sokoban

IndecisionTree · July 3, 2021, 6:14pm

This makes a lot more sense. With the way I’m currently doing it, there are too many valid, but “useless” states considered, like just moving the character random directions w/o pushing a box, and that’s probably why my node count is so high. Guess this requires a whole rewrite of everything I’ve got.

Figuring out unreachable positions and reconstructing the character path sounds really expensive though?

If you ignore the character position and just generate box pushes, how can you tell if a push is unreachable without also knowing where the character is? Do you keep some kind of character reachable area info with each state?

Also, do you do a separate BFS within each iteration to scan for reachable positions and then one at the end to reconstruct the character path to each box push?

pardouin · July 3, 2021, 6:27pm

Djoum’s approach is very interesting but not mandatory at all, in fact I wanted to try this but finally didn’t had too.

If you want to go Djoum’s way, just keep track of the player position after each push. At each step you can use a small DFS or BFS (doesn’t matter) to find the next possible pushes.

IndecisionTree · July 3, 2021, 6:51pm

EDIT: Disregard the previous post. I finally started tracking how many iterations my BFS was doing vs the actual size of my visited nodes. I was iterating more times than I was actually finding unique nodes, despite pruning before appending to the queue. Added a check at the top for uniqueness (before any computation work was done) and suddenly the whole things runs much faster and passes all tests. No idea why as I’m very much an Algo noob. Thanks for the help everyone.

Djoums · July 3, 2021, 7:26pm

Reconstructing the character path is a chained BFS itself, but we’re talking a few hundred iterations at worst so the cost is minimal.
For the unreachable cells, I number areas on the map (each one is separate from the other) and I check the moves that have the same number as the character. Any time a box is moved, you know the cell where the character would be and you can include it in your BFS state. Doing this incurs some cost, but it dropped my number of iterations so much that it ended up being a massive gain.

geppoz · July 4, 2021, 8:24am

Solved in JS with simplest BFS (just a tree and a hash for already visited states), with map in a wxh long string ( ), no heuristics, no metrics, just “don’t push in a corner please”.
Test cases timings:
3-5: 312k nodes, 2.0s
3-7: 194k nodes, 1.3s
3-10: 263k nodes, 1.7s

Q12 · July 12, 2021, 3:19pm

Great puzzle!! It was very fun and the puzzle is very good for learning BFS,
both for searching in maze and generating possible states.
With a good pruning you don’t have to use the second given in the first turn, I didn’t use it.

_BZH_TidyMaze · August 7, 2021, 10:58pm

Very nice puzzle! Took me quite a long time to optimise and understand why my “alreadySeen” cache wasn’t working properly (coded a custom hash for state, without handling collisions, in go).

Also thanks for the hint that boxes order should have been ignored when hashing.

TBali · February 6, 2022, 2:13pm

To ~~torture myself~~ practice, I translated my original solution to several languages.
As these version use exactly the same (deterministic) search order, it can be used as a kind of language benchmark. Of course, relative speed depends on MANY things, so these results should not be generalized or taken too seriously.

These are the average running times for IDE TEST #25 in turn 0 (when the complete solution is found) on my local machine:

language	msec	relative
C++ (optimize pragma)	85	1.00
C++ (default options)	242	2.85
Java	421	4.95
PHP	3,747	44.08
Python	1,776	20.89

Notes:

My code searches 226,877 nodes for this input.
The pragma line for the optimized C++ run was: #pragma GCC optimize "O3,omit-frame-pointer,inline,unroll-loops"
All solutions use standard library data types (vector + unordered_map, Vector + HashMap, array, list + dict), and don’t use language specific code optimizations. The game state representation is encoded to a single int64, involving lots of bitwise operations.
Online results in CG IDE was quite similar. However Java seems to be twice as fast locally. I don’t know if this comes from Java v8 vs Java v17, or from different default settings.)
While other languages running times are fairly stable, Java result fluctuated by ±15%, most likely because of the garbage collector.

Remi · October 23, 2022, 10:08am

Hi, i try to solve it, but this is my score :

100% on “scoria” tests
80% on “scoria 2” tests (2 timeout)
0% on “scoria 3” test (10 timeout)

I optimized my code as much as possible, but i don’t think this is the problem.
I saw that some of you need less than 100,000 iterations to find a solution, and i’m over 1,000,000 (2,429,671 on scoria 2 level 10 and i don’t timeout).
Is there a heuristic to change the BFS into A* ? Or did I miss something else ?

Remi · October 23, 2022, 10:17am

(I wanted to do an A* but I didn’t find the heuristic, since we don’t know in advance which box is associated with which target)

Remi · October 23, 2022, 1:45pm

How do you compute a hash ignoring boxes order ?

pardouin · October 23, 2022, 2:52pm

I just have a straigthforward BFS in Python.
My nodes are tuples of 3 int: player x, player y and a bitboard of boxes positions.

Remi · October 23, 2022, 3:50pm

Same here, but all in only one int64.
But i don’t know how to bitboard boxes position AND ignore boxes order.

cedricdd · October 23, 2022, 4:15pm

Hi,

For reference, I’m finding the solution (85 moves) for scoria 2 level 10 with BFS in 49,894 iterations.
I think that “ignoring the boxes order” is referring to if you have 2 boxes on positions x1;y1 & x2;y2 it doesn’t matter if it’s box ID 1 on x1;y1 and box ID 2 on x2;y2 or if it’s ID 2 on x1;y1 and box ID 1 on x2;y2

Remi · October 23, 2022, 4:18pm

Yes, that is my problem. My hash is computed like that :

for (int i=0; i<n; ++i)
   hash |= (box[i].x<<4|box[i].y)<<(8*i);

So, the order matters here.

cedricdd · October 23, 2022, 4:30pm

What about sorting boxes from top-left to bottom-right before generating the hash?

Remi · October 23, 2022, 4:34pm

I tryied to sort, but it was too long (timeout on the very first test). Maybe i don’t know the good way to sort.
I think I’ll try to use an int128 to bitboard boxes with “0” for empty cell and “1” with a cell with a box.

Remi · October 24, 2022, 7:28pm

I didn’t try yet the 128 bits integer, but I avoid corners, then I tryied that :

The result is better but i still have timeouts.

Test	Simple BFS	Multi BFS	Multi BFS + gcc optim
Scoria level 1	6ms	5ms	(not played)
Scoria level 2	3ms	1ms	(not played)
Scoria level 3	3ms	3ms	(not played)
Scoria level 4	7ms	9ms	(not played)
Scoria level 5	4ms	4ms	(not played)
Scoria level 6	5ms	8ms	(not played)
Scoria level 7	9ms	21ms	(not played)
Scoria level 8	17ms	21ms	(not played)
Scoria level 9	3ms	4ms	(not played)
Scoria level 10	12ms	5ms	(not played)
Scoria 2 level 1	1,153s	1,083s	(not played)
Scoria 2 level 2	1,241s	1,165s	(not played)
Scoria 2 level 3	0,430s	0,191s	(not played)
Scoria 2 level 4	0,504s	0,180s	(not played)
Scoria 2 level 5	1,071s	0,772s	(not played)
Scoria 2 level 6	0,841s	0,732s	(not played)
Scoria 2 level 7	0,572s	0,224s	(not played)
Scoria 2 level 8	0,154s	0,081s	(not played)
Scoria 2 level 9	timeout	timeout	timeout
Scoria 2 level 10	0,734s	0,303s	0,063s
Scoria 3 level 1	timeout	3,5s	0,700s
Scoria 3 level 2	timeout	1,67s	0,414s
Scoria 3 level 3	timeout	8,5s	1,89s
Scoria 3 level 4	timeout	timeout	timeout
Scoria 3 level 5	timeout	timeout	timeout
Scoria 3 level 6	timeout	timeout	timeout
Scoria 3 level 7	timeout	timeout	timeout
Scoria 3 level 8	timeout	8,2s	1,85s
Scoria 3 level 9	timeout	5,99s	1,24s
Scoria 3 level 10	timeout	10,01s	2,04s

Is there another heuristic to avoid bad positions (like corners) ?

cedricdd · October 24, 2022, 9:54pm

Similar to avoiding corners you can avoid segments when you know that the box would be stuck on that segment.
A segment is a part of a row with only consecutive empty cells (no target) with a wall on the left and the right. If at the row below or above all the positions of the segment are walls then no boxes should be pushed on any of the positions of the segment.

For example Scoria - Level 10 is:

#######
#...###
#.....#
##*.*.#
#.*.#.#
#.....#
#######

On line 1 you have a segment with positions: 1, 2 & 3, since at line 0 there’s a wall for the positions 1, 2 & 3 you know you can forbid the position 2;1 (positions 1;1 & 3;1 are already excluded because they are corners)
On line 5 you have a segment with positions: 1, 2, 3, 4 & 5, since at line 6 there’s a wall for the positions 1, 2, 3, 4 & 5 you know you can forbid the position 2;5 & 3;5 & 4;5 (positions 1;5 & 5;5 are already excluded because they are corners)

The same logic applies to segments on columns, so you can also forbid the positions 5;3 & 5;4

TBali · October 25, 2022, 7:58pm

Sorting 5 numbers should not take too much time… When (and only when) any box is moved, I re-sort the box positions before re-packing them into an int64 game state. It is good not just for the hashmap lookup in the BFS to avoid revisiting same game state twice, but also to make checking for the goal easier, i.e. a simple int comparison.
Pruning the state space, as cerdicdd described, is also essential.
Depending on language choice, turn off garbage collection if possible.
Pre-allocating big vectors, instead of letting them grow only as needed also saves some time.