Sudoku

From Wikipedia, the free encyclopedia

This is an old revision of this page, as edited by Robert Merkel (talk | contribs) at 00:29, 28 May 2005 (→‎Mathematics of Sudoku: 17-given sudoku link). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

File:Jan1.gif
A difficult Sudoku puzzle.

Sudoku, sometimes spelled Su doku, is a number placement puzzle, also known as Number Place in the United States. The aim of the puzzle is to enter a number from 1 through 9 in each square of a grid, frequently a 9×9 grid made up of 3×3 subgrids (called "regions"), starting with various numbers given in some cells (the "givens"). Each row, column and region must only contain one instance of each number. Completing the puzzle requires only patience and modest logical ability (although some puzzles can be fiendishly difficult). Its classic grid layout is reminiscent of other newspaper puzzles such as crosswords and chess. Sudoku initially became immensely popular in Japan in 1986 and in the UK in 2005, stimulating international interest and making it the fastest growing puzzle in the world.

History and terminology

The puzzle seems to have first been published in New York in the late 1970s by the specialist puzzle publisher Dell in its magazine Math Problems and Logic Puzzles. Dell referred to it as Number Place. The person who devised the first puzzle is not recorded. The description "number place" is somewhat misleading as there is nothing unique about the choice of numbers: the puzzle would still work if each specific number were to be substituted by another specific number (like swapping all the 8s and 4s, etc.) or to be replaced by an arbitrary non-numerical image. Indeed, Penny Press uses letters in their version called Scramblets.

The puzzle was introduced in Japan by Nikoli in the paper Monthly Nikolist April 1984 as "Suuji wa dokushin ni kagiru (数字は独身に限る)" named by KAJI Maki (鍜治 真起) the president of Nikoli, which means 'number is limited only single (unmarried).' At a later date, the name was abbreviated to Sudoku (pronounced sue-doe-koo). The name is Japanese; in kanji it is written 数独 (su - "number" - and doku - "single"), which implies "multiple isolations". In 1986, Nikoli introduced two innovations which guaranteed the popularity of the puzzle: the number of givens was restricted to no more than 30 and puzzles became symmetrical. It is now published in mainstream Japanese periodicals, such as the Asahi Shimbun. It became immensely popular in British newspapers in 2005. It has gained popularity internationally being dubbed the "Rubik's cube of the 21st century".

Bringing the process full-circle, Kappa reprints Nikoli Sudoku in GAMES Magazine under the name Squared Away; the New York Post is now also publishing the puzzle. It is also often included in puzzle anthologies, such as The Giant 1001 Puzzle Book (under the title Nine Numbers).

The challenge

The puzzle is played on a grid, most frequently a 9×9 grid made up of 3×3 subgrids (called "regions") and starting with various numbers in the range of 1 through 9 given in some cells (the "givens"). The goal is to fill in the empty cells, one number in each, so that every column, row, and region each contains the numbers 1 through 9 once. Therefore, each number in the solution is unique - or alone - in each of three "directions", hence "multiple isolations". The attraction of the puzzle is that the completion rules are so simple, yet the overall task can be difficult. Each puzzle can be ranked in terms of difficulty depending upon how many numbers are given and how easy it is to logically determine subsequent numbers once the puzzle has been started.

Sudoku is recommended by some teachers as a exercise in logical reasoning. The level of difficulty of the puzzles can be selected to suit the class. The puzzles are often free to use from published sources and may also be custom-generated using software.

Solution methods

File:Sudoku cross hatching example.png
The 3×3 region in the top-left corner must contain a 7. But where? By hatching across and up from 7s located elsewhere in the grid, we can eliminate all of the empty squares in the top-left corner which cannot contain 7. This leaves only one possible square. (It is highlighted in green.)

The strategy for solving a puzzle may be regarded as comprising three processes: scanning, marking up and analysing.

Scanning

Scanning is performed at the outset and periodically throughout the solution. It is an easy even tedious process being primarily visual. Puzzles which can be solved by scanning alone are classified as "easy" puzzles. More difficult puzzles by definition cannot be solved by scanning alone and scans may have to be performed several times in between the analysis periods. Scanning comprises two basic techniques, cross-hatching and counting, which may be used in either order:

  • Cross-hatching, being the reading of rows (or columns) to identify firstly which three-square line must have a certain number, and secondly, reading down the columns (or rows), discovering which of these squares must contain the number. For fastest results, the numbers should be scanned in order of their frequency. It is important to perform this process systematically checking all of the digits 1-9.
  • Counting 1-9 in regions, rows, and columns to identify missing numbers. Counting based upon the last number discovered may speed up the search. It can also be the case (typically in tougher puzzles) that the value of an individual square can be determined by counting in reverse - that is, scanning its region, row, and column for values it cannot be to see which is left.

Advanced solvers look for "contingencies" while scanning - that is, narrowing a number's location within a row, column, or region to two or three squares. When those squares all lie within the same row (or column) and region, they can be used for elimination purposes during cross-hatching and counting (Contingency example at Puzzle Japan). Particularly challenging puzzles may require multiple contingencies to be recognized, perhaps in multiple directions or even intersecting - relegating most solvers to marking up (as described below).

Marking up

Scanning comes to a halt when no further numbers can be discovered. From this point, it is necessary to engage in some logical analysis. It is essential to guide this analysis by marking candidate numbers in the blank squares. There are two popular notations: subscripts and dots. In the subscript notation the candidate numbers are written in subscript in the squares. The drawback to this is that original puzzles printed in a newspaper usually are too small to accommodate more than a few digits of normal handwriting. If using the subscript notation solvers often create a larger copy of the puzzle or employ a sharp or revolving pencil. The second notation is a pattern of dots with a dot in the top left hand corner representing a 1 and a dot in the bottom right hand corner representing a 9. The dot notation has the advantage that it can be used on the original puzzle. Dexterity is required in placing the dots since misplaced dots or inadvertent marks inevitably lead to confusion and may not be easy to erase without adding to the confusion.

Analysing

There are two main analysis approaches - elimination and what-if.

  • In elimination, progress is made by successively eliminating candidate numbers from one or more squares to leave just one choice. After each answer has been achieved, another scan may be performed - usually checking to see the effect of the latest number. There are a number of elimination tactics. One of the most common is "unmatched candidate deletion". Squares with identical sets of candidate numbers are said to be matched if the quantity of candidate numbers in each is equal to the number of squares containing them. For example, squares are said to be matched within a particular row, column, or region if two squares contain the same pair of candidate numbers (p,q) and no others, or if three squares contain the same triple of candidate numbers (p,q,r) and no others. These are essentially coincident contingencies. These numbers (p,q,r) appearing as candidates elsewhere in the same row, column, or region in unmatched squares can be deleted.
  • In the what-if approach, a square with only two candidate numbers is selected and a guess is made. Then repeat the steps as above unless a duplication is found, in which case choose the alternative candidate number and repeat. This approach requires a pencil and eraser. This approach may be frowned on by logical purists as too much trial and error but it can arrive at solutions fairly rapidly.

Ideally one needs to find a combination of techniques which avoids some of the drawbacks of the above elements. The counting of regions, rows, and columns can feel boring. Writing candidate numbers into empty squares can be time-consuming. The what-if approach can be confusing unless you are well organised. The Holy Grail is to find a technique which minimises counting, marking up, and rubbing out.

Computer solutions

For computer programmers it is relatively simple to build a brute force search. Typically this would assign a value (say, 1, or the nearest available number to 1) to the first available square (say, the top left hand corner) and then move on to assign the next available value (say, 2) to the next available square. This continues until a duplication is discovered in which case the next alternative value is placed in the last field changed. This isn't remotely computationally efficient, but it will work. A more efficient program could keep track of potential values for cells, eliminating impossible values until only one value remains for a cell, then filling that cell in and using that information for more eliminations, and so on until the puzzle is solved. This more closely emulates the way a human would solve the puzzle without resorting to trial and error.

Coding the search for impossibilities based on contingencies and even multiple contingencies (as would be required for the hardest of Sudoku) is quite complex to construct by hand. However, such complications are unnecessary. A more efficient way to find solutions involves the use of finite domain constraint programming. A constraint program requires the programmer only to specify the constraints on the solution (the fact that every number in each row, each column, and each 3x3 subgrid must be unique, and the provided starting values); the finite domain solver does all the work of propagating additional information about possible values to narrow down the solution space until a unique solution is found. The self-imposed constraints of most Sudoku puzzle publishers even ensures that the backtracking search capabilities of the finite domain solver are not required.

Construction

It is commonly believed that Dell Number Place puzzles are computer-generated; they typically have over 30 givens placed in an apparently random scatter, some of which can even be deduced from other givens. They also have no authoring credits. Nikoli Sudoku are hand-constructed, with the author being credited beside each puzzle; the givens are always found in a symmetrical pattern. (Building a Sudoku with symmetrical givens can be achieved by determining in advance where the givens will be located, and only assigning an actual number to them as needed.) Dell Number Place Challenger puzzles also list author credits. The Sudoku puzzles printed in most UK newspapers are apparently computer-generated but employ symmetrical givens, implying a more humanistic algorithm; The Guardian states that its puzzles are hand-constructed "in Japan", though it does not include authoring credits.

Note that it is possible to set starting grids with more than one solution and to set grids with no solution, but such are not considered proper Sudoku puzzles; like most other pure-logic puzzles, a unique solution is expected. Great caution is required in constructing a Sudoku puzzle, as failing to recognize where a number can be logically deduced at any point in construction - regardless of how torturous that logic may be - can result in an unsolvable puzzle when defining a future given contradicts what has already been built.

Variants

Although the 9×9 grid with 3×3 regions is by far the most common, numerous variations abound: sample puzzles can be 4×4 grids with 2×2 regions; 5×5 grids with pentomino regions have been published under the name Logi-5; the World Puzzle Championship has previously featured a 6×6 grid with 2×3 regions and a 7×7 grid with six heptomino regions and a disjoint region. Even the 9×9 grid is not always standard, with Ebb regularly publishing some of those with nonomino regions. Larger grids are also possible, with Dell regularly publishing 16×16-grid Number Place Challenger puzzles and Nikoli proffering 25×25 Sudoku the Giant behemoths. Another common variant is for the numbers in the main diagonals of the grid to also be required to be unique; all Dell Number Place Challenger puzzles are of this variant.

Mathematics of Sudoku

Sudoku puzzles are known to be NP-complete [1].

Mathematicians would say that solving Sudoku puzzles is essentially a problem in graph coloring. The aim of the puzzle in its standard form is to construct a proper 9-coloring of a particular graph, given a partial 9-coloring. The graph in question has 81 vertices, one vertex for each square of the grid. The vertices can be labelled with the ordered pairs , where x and y are integers between 1 and 9. In this case, the vertices labelled by and are joined by an edge if and only if:

  • or,
  • or,
  • and .

The puzzle is then completed by assigning an integer between 1 and 9 to each vertex, in such a way that vertices that are joined by an edge do not have the same integer assigned to them.

The number of valid Sudoku solution grids for the standard 9×9 grid with 3×3 regions was calculated by Bertram Felgenhauer to be 6,670,903,752,021,072,936,960 [2]. This number is equivalent to 9! × 72^2 × 2^7 × 27,704,267,971, the last factor of which is prime. The result was derived through logic and brute force computer attack. Whilst the order of magnitude of this result has been confirmed, the exact figure is awaiting independent validation.

The maximum number of givens that can be provided while still not rendering the solution unique, regardless of variation, is four short of a full grid; if two instances of two numbers each are missing and the squares they are to occupy are the corners of an orthogonal rectangle, there are two ways the numbers can be added. The inverse of this - the fewest givens that render a solution unique - is an unsolved problem, although the lowest number yet found for the standard variation (without a symmetry constraint) is 17, a number of which have been found by Japanese puzzle enthusiasts [3].

Popularity in British newspapers

In 2004 and 2005, Sudoku experienced a surge of popularity in the United Kingdom. British newspapers had a long history of publishing crosswords and other puzzles but before 2004 Sudoku was not a common feature on puzzle pages. Wayne Gould, a New Zealander and former Hong Kong High Court Judge, discovered Sudoku in Japan and promoted the idea to The Times, launching it on 12 November 2004. Three days later the Daily Mail began to publish the puzzle under the name 'Codenumber'.

By April and May 2005 the puzzle had become popular in these publications and it was rapidly introduced to several other national British newspapers including The Daily Telegraph, The Independent, The Guardian, The Sun, and The Daily Mirror. As the name 'Sudoku' became well-known in Britain the Daily Mail adopted it in place of its earlier name 'Codenumber'; some commentators have suggested that the paper initially avoided using a Japanese name so as not to put off its traditionally right-wing readers. Newspapers competed to promote their Sudoku puzzles, with The Times and the Daily Mail each claiming to have been the first to feature Sudoku, and The Guardian claiming that its hand-made puzzles offered a superior experience to the computer-generated grids found in other papers. The rapid rise of Sudoku from relative obscurity in Britain to a front-page feature in national newspapers attracted commentary in the media (see External links below) and parody (such as when The Guardian's G2 section advertised itself as the first newspaper supplement with a Sudoku grid on every page [4]). Sudoku became particularly prominent in newspapers soon after the 2005 general election; some commentators suggested that it was promoted heavily by the newspapers in order to fill the gaps previously filled by election coverage. As of 2005 it remains to be seen whether Sudoku will remain a fixture in British newspapers.

See also

Other Nikoli logic-based puzzles:

References

External links