First go through the grid to find trivial cell values (cells with only one possible candidate) until there are none left.

Then do brute-force backtracking on the remaining grid.

Iâ€™ve managed to speed up the brute-force backtracking considerably, but it is still way to slow to solve test grids 3-6. Grid 3 requires 3,907,366,771 iterations to complete (after filling 5 trivial cell values), taking 82 seconds on my machine.

How can this be improved? What is the required â€śsmartâ€ť backtracking part here? How can I prioritize exploring the candidates during backtracking? Any hints?

For Test3 Iâ€™m only making 3 guesses in total so you are missing quite a few in your first step.
Before I have to make any guess my grid looks like that:

You say you are taking care of cells with only one possible candidate, you also need to use the fact that you know that in each row, col & block every letter appears only once, ie: if you can only place the letter in one cell of the row, col or block, it has to go there.

I must be doing something fundamentally wrong then. For grid 3, I am only able to find 10 trivial cell values now (after fixing a silly mistake), after which my grid looks like this:

When testing the trivial cell values, I of course combine what is already in the row plus in the column plus in the block, and if only one letter is missing across all 3, that is the only solution. I go through the whole grid and repeat until I find no more trivial cell values.