The online project notebook of
Mark Van Dine
Owner and Proprieter




- Current
- Notebook
- College

A collection of quotes and excerpts.


  Light Verse for a Heavy Universe
  Puzzle Machine
  Google
  Wikipedia
  Purportal.com
  Yahoo!
  Professional Writers & Editors
  BBC News
  Drama Queen
  Commoner's View


A Perl Solver for Sudoku

Elsewhere on the site there is a Sudoku Solver that employs a simple recursive algorithm to solve these enjoyable number placement puzzles. I recently got an email from a reader saying first that he'd found a puzzle that could stump my solver AND that he'd found a solver better than mine.

Bah, humbug. And it did not help that he was right on both counts.

The difficult puzzle he'd found was this geographically impressive example on the UK's Sky One site. I have no idea if this thing is real or a Photoshop miracle (although they claim Guinness Book of World Records authenticity), but I once again tip my hat to British creativity.


The puzzle itself has you guessing within the first several moves. I dislike this sort of Sudoku, preferring those that yield to logical deduction. But the recursion in my algorithm is designed to handle exactly this problem ... otherwise a simple loop would be the far simpler approach to take. The trouble is that PHP is not a great scripting choice for deep recursion. I'm no authority on the subject, but I've followed developer discussions about this and I'm inclined to believe it (it depends on whether you use PHP 4 or 5, but under 4 the available memory space for recursion is very constrained.)

The Sky One puzzle requires no fewer than 10 nested guesses (about a 1 in 1000 chance of getting all the correct guesses right the first time if the solution is unique ... which it isn't, but we'll save that math lesson for another time). A Sudoku board is usually considered very difficult if it requires two guesses, so a requirement of 10 is pretty much over the top. Still, my PHP algorithm, running in its Apache environment, stumbles on it, so here I'd like to provide the Perl script on which it was based. Perl will support very deep recursion, if needed.

Recursion, for those who aren't familiar with the term in this context, occurs when a function in a program calls itself in order to do what it is supposed to do. The old joke is that if you look up "recursion" in a programmer's dictionary, the definition says, "See recursion". (That actually defines an infinite loop, which is an everyday threat when working with recursion, but as a friend of mine says, "If you buy the premise, then you buy the joke.") A practical example often given is for a function that computes the factorial of an integer N. For its code, such a function "factorial(N)" might simply return the value "N * factorial(N - 1)". As long as the same function tests for the special case of N = 1, and just returns 1 in that case, it is an extremely compact way to capture the functionality

First part of the script:

#!/usr/bin/perl -w
use strict;


our $move_ctr;

Sure, uninspired, but it must be done. Of course, if your perl is installed elsewhere, update that first line as needed. Next, I'm going to show the main body of the program. In practice, I always enter the subroutine definitions first, and that's how it will be in the actual file.

# +-----------------------------------------------------------+
# | M A I N program.                                          |
# |                                                           |
# +-----------------------------------------------------------+

my $board;
my $solution;
my $guesses;
my $answer;
my $state;


print_time("Start time");
    
# Gather new board from STDIN

$board = $ARGV[0];

$answer = $board;
$move_ctr = 0;

($solution,$guesses,$state) = solve($answer,0,0);
if ($state == 0) {
    die("Could not solve the board!\n");
}
    
print "Solved after $guesses guesses and $move_ctr moves:\n\n";
print_board("Before",$board);
print_board("After",$solution);

print_time("Stop time");

We gather an 81-character string from the command line that represents the board, done this way just to keep it utterly simple. The string starts with the top row, each row has nine characters corresponding to the numbers in that row. If a cell is blank, we put a "0" (zero) in its place. So the command to solve the Sky One puzzle will look something like:

perl sudokusolve.pl 506020903008000500000000000600285009000903000800761004000000000004000300201050607

We get the board string from the @ARGV array, but since we may want to compare the original board to the solution, we leave it as is and make a copy of it into the variable "$answer", which we will update with our solution.

The print_time subroutine takes a label for an argument and simply prints out a timestamp. I am always wondering how long things take to complete, so I'm in the habit of printing this before and after big steps. Otherwise, all the main program does is call the "solve" subroutine, which takes as arguments a game board, a 'level' counter (to track recursion depth ... we start at level 0), and a counter for the number of guesses made on the way to a solution. A guess happens when the best alternative on the board still requires you to guess among two or more possible values for that cell.

The solve subroutine looks like this:

# +-------------------------------------------------------------+
# | Subroutine solve                                            |
# |                                                             |
# +-------------------------------------------------------------+
sub solve
{
    my $board     = $_[0];
    my $level     = $_[1];
    my $guess_ctr = $_[2];
    my $nextrow;
    my $nextcol;
    my $score;
    my @legal;
    my $nlegal;
    my $val;
    my $i;
    my $c;
    my $newstr;
    my $solution;
    my $guesses;
    my $state;
    
    ($score, $nextrow, $nextcol, @legal) = score_board($board);
    
    if ($score == 0) {
        return($board,$guess_ctr,0);
    }
    
    # If $score is still 10, then all of the spaces are filled! This should be a
    # winning board. "state" returned is 1.
    
    if ($score == 10) {
        return($board,$guess_ctr,1);
    }
    
    # Otherwise, we take our best alternative and push ahead ...
    $nlegal = @legal;
    
    $move_ctr++;
    
    # If we have a cell with only one valid response, fill it in and go on ...
    if ($nlegal == 1) {
        $val = $legal[0];
        substr($board,($nextrow * 9) + $nextcol,1) = $val;
        
        return(solve($board,$level + 1,$guess_ctr));
    }
    
    # Otherwise, we will try a set of alternatives until we find success
    # (or  run out of alternatives!)
    
    $guess_ctr++;
    
    # The following is an optional diagnostic for looking at recursion depth.
    print STDERR "Guess $guess_ctr, checking $nlegal possibilities.\n";
    
    for ($i = 0;$i < $nlegal;$i++) {
        
        $val = $legal[$i];
        substr($board,($nextrow * 9) + $nextcol,1) = $val;
        
        ($solution,$guesses,$state) = solve($board,$level + 1,$guess_ctr);
        
        # See if that path led to a solution
        if ($state == 1) {
            return($solution,$guesses,$state);
        }
    }
    
    # We only get here if all of our alternatives failed.  Return the last failure.
    return($solution,$guesses,$state);
}

The solve subroutine invokes the score_puzzle subroutine, and acts on its results. score_puzzle simply looks at each and every square in the game board. If the square is vacant (represented by a zero in the argument string), it analyzes the Sudoku board to determine what numbers could be placed there. It looks at the column the cell is in, the row, and the 'local' 3x3 grid that the cell is in, and eliminates any of the integers from 1 through 9 already found in any of those locations. It then counts how many possible integers remain. The subroutine keeps track of the row and column with the lowest possible alternatives.

# +-------------------------------------------------------------+
# | Subroutine score_board                                      |
# |                                                             |
# +-------------------------------------------------------------+
sub score_board
{
    my $str = $_[0];
    my @board;
    my @scores;
    my $i;
    my $j;
    my $cell;
    my @legal;
    my $nlegal;
    my $lowscore = 10;
    my $lowi     =  0;
    my $lowj     =  0;
    my @lowlegal = ();
    
    # The @scores array is a 9x9 grid representing the board $str.
    
    # If a cell has an @scores value of -1, then it already has an assigned value
    # on @board.
    
    # Otherwise the cell value is the number of unused numbers that could be assigned
    # at that row and column.  If this value is zero (0), then
    # we are at a dead-end.  Otherwise, we want to return the row and column
    # of the cell with the lowest number of alternatives, and the list of those
    # alternatives, to try in order.
    
    for ($i = 0;$i < 9;$i++) {
        for ($j = 0;$j < 9;$j++) {
            $board[$i][$j] = substr($str,($i * 9) + $j,1);
        }
    }
    
    for ($i = 0;$i < 9;$i++) {
        for ($j = 0;$j < 9;$j++) {
            
            $cell = $board[$i][$j];
            if ($cell != 0) {
                $scores[$i][$j] = -1;
                next;
            }
            
            @legal  = legal_values($str,$i,$j);
            $nlegal = @legal;
            
            $scores[$i][$j] = $nlegal;
            if ($nlegal < $lowscore) {
                $lowscore = $nlegal;
                $lowi     = $i;
                $lowj     = $j;
                
                @lowlegal = @legal;
            }
        }
    }
    
    return ($lowscore, $lowi, $lowj, @lowlegal);

}

If the $lowscore returned by score_board is zero, then there is a cell on the board that can not be filled with a legal value ... which means the board is at a dead end and can't be solved, so the 'solve' subroutine will return a state that says "Give Up!". If no blank cells are found, that means that the board IS solved, and "solve" returns a state that says "Found It!"

If score_board returns a one, it means it has found a cell where only one possible number fits. The "solve" subroutine places this number in the indicated coordinates, and then calls itself to analyze the new board. If the return value is greater than one, then the subroutine iterates through the list of possible legal values until a solution is found (or all the alternatives fail).

Finally, for those interested, the following is the PHP translation of the main functions that I came up with when scripting the page. Other code is involved, including a real Frankenstein's Monster of a function to extract the board values from the form into the array, but the main solve and score_board functions are here ... connecting the pieces will be straightforward.


$str          = "000004050000003700208006000540000000100000003000000028000200106003700000070500000";
$original     = $str;
$nctr         = 0;       // Counts steps in solution
$solution_ctr = 0;       // Counts number of solutions found
$solutions    = Array(); // Array of solutions found.

function score_board($str) {
    
    global $nctr;
    
    $lowscore = 10;
    $lowi     =  0;
    $lowj     =  0;
    $lowalts  = "0123456789";
    
    $nctr++;
    
    # The $scores array is the same 9x9 grid as the $board array.  If a cell
    # has an $scores value of -1, then it already has an assigned value
    # on $board.  Otherwise, the cell value is the number of unused numbers that
    # could be assigned at that row and column.  If this value is zero (0), then
    # we are at a dead-end.  Otherwise, we want to return the row and column
    # of the cell with the lowest number of alternatives, and the list of those
    # alternatives, to try in order.
    
    for ($i = 0;$i < 9;$i++) {
        for ($j = 0;$j < 9;$j++) {
            $board[$i][$j] = substr($str,($i * 9) + $j,1);
        }
    }
    
    // Initialize $alts to allocate some space to it.
    $alts  = "0123456789";
    
    for ($i = 0;$i < 9;$i++) {
        
        for ($j = 0;$j < 9;$j++) {
            
            $cell = $board[$i][$j];
            if ($cell != 0) {
                $scores[$i][$j] = -1;
                continue;
            }
            
            # Prep array to keep track of numbers that can and can't be played
            # in this particular location.  Note that '0' denotes an unused
            # cell ... the numbers 'in play' are 1 through 9.
            
            for ($k = 0;$k <= 9;$k++) {
                $ctr[$k] = 0;
            }
            
            
            # Define the row and column of the top left element of the 'local'
            # 3x3 grid.
            
            $localr = $i - ($i % 3);
            $localc = $j - ($j % 3);
            
            # Mark all numbers already used in the 'local' 3x3 grid.
            for ($ii = $localr;$ii < ($localr + 3);$ii++) {
                for ($jj = $localc;$jj < ($localc + 3);$jj++) {
                    $value = $board[$ii][$jj];
                    if ($value != 0) {
                        $ctr[$value] = 1;
                    }
                }
            }
            
            # Mark all numbers already used in the current row.
            for ($jj = 0;$jj < 9;$jj++) {
                $value = $board[$i][$jj];
                if ($value != 0) {
                    $ctr[$value] = 1;
                }
            }
            
            # Mark all numbers already used in the current column.
            for ($ii = 0;$ii < 9;$ii++) {
                $value = $board[$ii][$j];
                if ($value != 0) {
                    $ctr[$value] = 1;
                }
            }
            
            # Now count up possible moves and record the score.  A number is
            # available to play in a current space if ther is a zero in its
            # place in the @ctr array.
            #
            # If this is the current Low Score, keep track of it!
            
            $moves = 0;
            for ($k = 1;$k < 10;$k++) {
                if ($ctr[$k] == 0) {
                    $alts[$moves] = $k;
                    $moves++;
                }
            }
            
            $scores[$i][$j] = $moves;
            if ($moves < $lowscore) {
                $lowscore = $moves;
                
                $lowi     = $i;
                
                $lowj     = $j;
                
                
                $lowalts = substr($alts,0,$moves);
                
            }
        }
    }
    
    $results[0] = $lowscore;
    $results[1] = $lowi;
    $results[2] = $lowj;
    $results[3] = substr($lowalts,0,$lowscore);
    
    return $results;
}

function solve_sudoku($str,$level) {
    global $solution_ctr;
    global $solutions;
    global $nctr;
    
    $results = score_board($str);
    
    $score   = $results[0];
    $nextrow = $results[1];
    $nextcol = $results[2];
    $alts    = $results[3];
    
    if ($score == 10) {
        
        $solution_ctr++;
        $solutions[] = $str;
        
        return;
    }
    
    for ($i = 0;$i < $score;$i++) {
        
        $c = substr($alts,$i,1);
        
        $newstr = $str;
        $newstr[($nextrow * 9) + $nextcol] = $c;
        
        solve_sudoku($newstr,$level + 1);
    }
}

function PushSolveButton($request_page,$request_ip) {
    global $nctr;
    global $solution_ctr;
    global $solutions;
    global $original;
    
    $before = gather_board($_POST["grid"]);
    $original = $before;
    
    //      Database stuff removed.  The final application checks a database of 'known' 
    //      puzzles to see if has already solved the board and if so just retrieves and
    //      displays the solution.
    //
    //      Otherwise, assume $ctr resolves to 0 (board not found in the database, so solve).
    //      'gather_board' is just a function that gets the data from a grid.  I think the
    //      other functions are pretty clear about what that array looks like!
    
    if ($ctr == 0) { // The board was not found in the database.  Run the solver.
        
        $nctr         = 0;       // Counts steps in solution
        $solution_ctr = 0;       // Counts number of solutions found
        $solutions    = Array(); // Array of solutions found.
        
        solve_sudoku($before,0);
        
        if ($solution_ctr > 0) {
            $after = $solutions[0];
            
            // Check the database, and add this board/solution if it is new
            update_database($original,$after,$nctr);
            
            showboardform($after,$before);
        }
        else {
            showboardform($before,$before);
        }
    }
    else {
        showboardform($after,$before);
    }
    
}

The full Perl script is provided here (in a ZIP file so it won't attempt to execute when you press the link ... extract the .pl script before submitting it to Perl on your machine!)

 

Copyright 2006 Mark Van Dine, All Rights Reserved

 
     

Recent additions:

- Tough Sudoku

- Scramble Redux

- Sudoku

- Cryptograms III

- Cryptograms II

- Cryptograms I

- Peg Solitaire

- Mining Words

- What's My Line?

- Solve the Jumble!

- Scramble Squares

- SRAT Fever



Check out 'The Puzzle Machine'!

 

home - contact