As an optimisation we know that if a square with side length N fits in a
certain position, then one of side length N - 1 does too. And
conversely, if a square of side length N does not fit then one of side
length N + 1 does not.
So our implementation which asked if every square fitted in a certain
position called Grid::fits() too many times, once for each side length.
We replace this by a function Grid::largest_square which gives the
appropriate side length to start with.
Naïvely the optimisation here is that instead of checking whether N
squares fit, and so having to do (N * (N + 1)) / 2 comparisons in fits
per position, we instead do one scan in largest_square and do at most N
comparisons.
A performance improvement is also seen in real life.
We stop using function calls to move down an index but instead use a for
loop.
This actually provides a slight performance improvement as we are not
repeated making and destroying function frames.