The goal of this sequence of patches is to end up representing a
position as a single integer index into the grid instead of an (x, y)
pair. I believe this will have better performance.
First step move all calls of x(), y() into calls on methods of Results &
Grid as in the future these will need to use the length of the grid.
We abstract away the types being used from 'int' and 'char'. The one
exception being for output.
This allows us to experiment to see if using different types can improve
performance.
It turns out it does - and that using 64-bit integers everywhere is a
good idea.
We now return the results in a separate structure to the grid we have
been working on.
This ultimately makes it easier to play with the implementation of the
solution finder.
Previously our implementation was recursive (although less so than the
OCaml version).
Moving to an iterative implementation improves performance as we have
fewer function calls, and so fewer function prologues and epilogues.
The initial implementation of the code stored squares in the grid in a
"pretty" form. This is a complicated bit of code and slowed down the
code slightly.
Now we just store the size of the current square in the grid. This
reduces the code complexity, and offers a slight performance
improvement.
We still output a prettified version of the grid, as we can generate
that from a grid populated with just square sizes.
This is another optimisation, as it improves memory layout and we are
not using a reference counted string.
On a M1 Macbook Air this brings runtime for solving the 9 cell down to
under 3m30s. Which is about a 15% performance improvement.
We were not taking into account that the largest_square may be
constrained by the y position as well as the x. i.e. is there enough
vertical space for the square.
This commit adds that check.
We were working before because C++ is not memory safe, and we weren't
running debug builds.
Grid::next_pos was starting its scan from the position after the current
one.
This is inefficient as we know how large the square we've just placed is
as we scan those positions.
This optimisation just skips that square when starting the scan.
As an optimisation we know that if a square with side length N fits in a
certain position, then one of side length N - 1 does too. And
conversely, if a square of side length N does not fit then one of side
length N + 1 does not.
So our implementation which asked if every square fitted in a certain
position called Grid::fits() too many times, once for each side length.
We replace this by a function Grid::largest_square which gives the
appropriate side length to start with.
Naïvely the optimisation here is that instead of checking whether N
squares fit, and so having to do (N * (N + 1)) / 2 comparisons in fits
per position, we instead do one scan in largest_square and do at most N
comparisons.
A performance improvement is also seen in real life.
We stop using function calls to move down an index but instead use a for
loop.
This actually provides a slight performance improvement as we are not
repeated making and destroying function frames.