The mathematical analysis of Sudoku falls into two main areas: analyzing the properties of a) completed grids and b) puzzles. Grid analysis has largely focused on counting (enumerating) possible solutions for different ''variants''. Puzzle analysis centers on the initial ''given'' values. The techniques used in either are largely the same: Combinatorics and Permutation Group Theory , augmented by the dexterous application of programming tools.
There are many Sudoku Variants , (partially) characterized by the size (''N'') and shape of their regions. For classic Sudoku, ''N''=9 and the regions are 3x3 squares (blocks). A rectangular Sudoku uses rectangular regions of row-column dimension ''R''×''C''. For ''R''×1 (and 1×''C''), i.e. where the region is a row or column, Sudoku becomes a Latin Square .
Other Sudoku variants also exist, such as those with irregularly-shaped regions or with additional constraints ( Hypercube ) or different ( Samunamupure ) constraint types. See Sudoku - Variants for a discussion of variants and Sudoku Terms And Jargon for an expanded listing.
Mathematics uses elegant and succinct symbolism, often meaningful only with formal training. For this article, an abstract mathematical statement is used at the upper levels. Detail descriptions are deferred to sub-sections, usually with the term 'detail' in the section heading, and can be skipped where the condensed mathematical statement is (deemed) sufficient. The extended explanations can also be used as a practical introduction to the mathematical concepts involved.
The mathematics of Sudoku is relatively new area of exploration, mirroring the popularity of Sudoku itself. and expand the material here.
The general problem of solving ''Sudoku'' puzzles on ''n''2 x ''n''2 boards of ''n'' x ''n'' blocks is known to be that knows the entire Game Tree .
Solving ''Sudoku'' puzzles can be expressed as a Graph Colouring problem. Let us talk about the 9x9 = 32x32 case. The aim of the puzzle in its standard form is to construct a proper 9-colouring of a particular graph, given a partial 9-colouring. The graph in question has 81 vertices, one vertex for each cell of the grid. The vertices can be labelled with the ordered pairs
, where x and y are integers between 1 and 9. In this case, two distinct vertices labelled by and are joined by an edge if and only if:
- (same column) or,
- (same row) or,
- and (same 3x3 cell)
The puzzle is then completed by assigning an integer between 1 and 9 to each vertex, in such a way that vertices that are joined by an edge do not have the same integer assigned to them.
A valid ''Sudoku'' solution grid is also a . The result was derived through logic and Brute Force Computation . The derivation of this result was considerably simplified by analysis provided by Frazer Jarvis and the figure has been confirmed independently by Ed Russell. Russell and Jarvis also showed that when symmetries were taken into account, there were 5,472,730,538 solutions {Link without Title} . The number of valid ''Sudoku'' solution grids for the 16×16 derivation is not known.
The maximum number of givens that can be provided while still not rendering the solution unique, regardless of variation, is four short of a full grid; if two instances of two numbers each are missing and the cells they are to occupy are the corners of an orthogonal rectangle, and exactly two of these cells are within one region, there are two ways the numbers can be added. The inverse of this—the fewest givens that render a solution unique—is an Unsolved Problem , although the lowest number yet found for the standard variation without a symmetry constraint is 17, a number of which have been found by Japanese puzzle enthusiasts [http://www.csse.uwa.edu.au/~gordon/sudokumin.php , and 18 with the givens in rotationally symmetric cells.
A ''puzzle'' is a partially completed ''grid''. The initial puzzle values are known as ''givens'' or ''clues''. The ''regions'' are also called ''boxes'' or ''blocks''. The use of the term ''square'' is generally avoided because of ambiguity.
Horizontally adjacent blocks constitute a ''band'' (the vertical equivalent is called a ''stack'').
A ''proper'' puzzle has a unique solution. The constraint'' 'each digit appears in each row, column and region' '' is called the ''One Rule''
See basic terms or the List Of Sudoku Terms And Jargon for an expanded list of terminology.
Sudoku types can be identified by their ''region'' dimensions. "3x3" Sudokus denotes the classic format. ''R''x''C'' denotes a rectangular region with ''R'' rows and ''C'' columns. The implied grid configuration has ''R'' blocks per band (''C'' blocks per stack), ''C''×''R'' bands×stacks and grid dimensions ''N''x''N'', with ''N'' = ''R'' · ''C''. Region dimensions are preferable to region size for characterisation, as size can be ambiguous, e.g. 2×6 and 3×4 regions both have size 12.
The following framework can be used to
structure the relation of Sudoku variants:
- region size
- ---rectangular regions, with a sub-types ordered by region perimeter length as needed
--square regions
- ---irregular ( Polyomino ) regions, used for prime region sizes and other variants
The application of additional constraints can be used to generate further specializations.
The three principle (sub)classes of Sudoku are rectangular, square, prime (region).
The size ordering of each can be used to define an integer series, e.g. for square Sudoku, the integer series of possible solutions ().
Sudoku with square ''N''x''N'' regions are more symmetrical than immediately obvious from the ''One Rule''. Each row and column intersects N regions and shares N cells with each. The number of bands and stacks also equals N. Rectangular Sudoku do not have these properties. The "3x3" Sudoku is additionally unique: N is also the number of row-column-region constraints from the ''One Rule'' (i.e. there are N=3 types of ''units'').
See the List Of Sudoku Terms And Jargon for an expanded list and classification of variants.
|   |
1s2=2 s3=3 s4=4 s5=5 s6=6 s7=7 s8=8 s9=9=
|
|   |
2s11=B1s12= s13= s14=B2s15= s16= s17=B3s18= =
|
|   |
3s20=&nbsps21= s22= s23=&nbsps24= s25= s26=&nbsps27= =
|
|   |
4s29= s30= s31= s32= s33=c s34= s35=&nbsps36= =
|
|   |
5s38=B4s39= s40= s41=B5s42=5 s43= s44=B6s45= =
|
|   |
6s47= s48= s49=r s50=5 s51=6 s52= s53=&nbsps54= =
|
|   |
7s56= s57= s58= s59=&nbsps60= s61= s62=&nbsps63= =
|
|   |
8s65=B7s66= s67= s68=B8s69= s70= s71=B9s72= =
|
|   |
9s74= s75= s76= s77=&nbsps78= s79= s80=&nbsps81= =
|
|   |
yellowc14=yellowc17=yellowc38=yellowc41=yellowc44=yellow=
|
|   |
yellowc68=yellowc71=yellow=
|
|   |
e8f0ffc42=e8f0ffc49=e8f0ffc50=e8f0ffc51=e8f0ff=
|
|
Let
|
where ''b''
R, C is the number of ways of completing a
Sudoku band of ''R'' horizontally adjacent ''R''×''C'' boxes.
Pettersen's algorithm , as implemented by
Silver , is the currently the fastest known technique for exact evaluation of these ''b''
R, C.
The band counts are listed below. As in the previous section, "Dimensions" are those of the regions.
:
: same author, different method
The expression for the 4×C case is:
- \sum_{\begin{matrix}k_{12},k_{13},k_{14},\k_{23},k_{24},k_{34}\end{matrix}} {b\choose k_{13}}{c \choose k_{14}}{c \choose k_{23}}{b \choose k_{24}}{a \choose k_{34}} }
ight)^2 }
where:
: the outer summand is taken over all ''a'',''b'',''c'' such that 0<=''a'',''b'',''c'' and ''a''+''b''+''c''=2''C''
: the inner summand is taken over all ''k''
12,''k''
13,''k''
14,''k''
23,''k''
24,''k''
34 ≥ 0 such that
:: ''k''
12,''k''
34 ≤ ''a'' and
:: ''k''
13,''k''
24 ≤ ''b'' and
:: ''k''
14,''k''
23 ≤ ''c'' and
:: ''k''
12+''k''
13+''k''
14 = ''a''-''k''
12+''k''
23+''k''
24 = ''b''-''k''
13+''c''-''k''
23+''k''
34 = ''c''-''k''
14+''b''-''k''
24+''a''-''k''
34 = ''C''
The following are all restrictions of 3x3 Sudokus. The type names have not been standardised: click on the attribution links to see the definitions.
All Sudokus remain valid (no repeated numbers in any row, column or region) under the action of the
Sudoku Preserving Symmetries (see also
Jarvis ). Some Sudokus are special in that some operations merely have the effect of relabelling the digits; several of these are enumerated below.
Further calculations of this ilk combine to show that the number of essentially different Sudoku grids is 5,472,730,538, for example as demonstrated by
Jarvis/Russell and verified by
Pettersen . Similar methods have been applied to the 2x3 case, where
Jarvis/Russell showed that there are 49 essentially different grids (see also the article by
Bailey, Cameron and Connelly ), and to the 2x4 case, where
Russell showed that there are 1,673,187 essentially different grids (not independently confirmed).
have a unique solution. An '''irreducible puzzle''' (or '''minimal puzzle''') is a proper puzzle from which no givens can removed leaving it a proper puzzle (with a single solution). It is possible to construct minimal puzzles with different numbers of givens. This section discusses the minimum number of givens for proper puzzles.
The lowest known is 17 givens in general Sudoku, or 18 when the positions of the givens are constrained to be half-turn rotationally symmetric. It is conjectured that these are the best possible, evidence for which stems from extensive randomised searching:
- Gordon Royle has compiled a list of (currently) 36628 17-clue puzzles , each of which is unique up to Isomorphism . None of these was isomorphic to a symmetric puzzle, nor contained a 16-clue puzzle.
- An independent construction of 700 distinct 17-clue puzzles found only 33 not already on an earlier, size 32930, version of Royle's list. From that, the MLE estimate of the number of 17-clue grids was ''c.'' 34550. If the construction methods were truly random and independent then this would imply a negligible probability of there being a 16-clue puzzle waiting to be discovered -- since any such "16" would give rise to 65 17-clue puzzles, all of which were somehow missed in the search.
- The most fruitful set of clue positions, in terms of number of distinct 17-clue puzzles they admit, from Royle's list have been exhaustively searched for 17-clue puzzles. All 36 puzzles found by this process were already in Royle's list. All 34 puzzles on the next most fruitful set of clue positions were also in Royle's list.
- The most fruitful solution grids (in the same sense) have been exhaustively searched for 16-clue puzzles using CHECKER with no success. This includes one, " strangely familiar ", grid that yields at least 29 different 17-clue puzzles.
Additional constraints (here, on 3×3 Sudokus) lead to a smaller minimum number of clues.
- 3doku: ''no results for this variant''
- Disjoint Groups: some 12-clue puzzles have been demonstrated by Glenn Fowler. It is not known if this is the best possible.
- Hypercube: various 8-clue puzzles (the best possible) have been demonstrated by Guenter Stertenbrink.
- Magic Sudoku: a 7-clue example has been provided by Guenter Stertenbrink. It is not known if this is the best possible.
- Sudoku X: a 13-clue example has been provided by Gordon Royle. It is not known if this is the best possible.
- NRC Sudoku: an 11-clue example has been provided by Andries Brouwer. It is not known if this is the best possible.
"Du-sum-oh" (a.k.a. "geometry number place") puzzles replace the 3×3 (or ''R''×''C'') regions of Sudoku with irregular shapes of a fixed size. Bob Harris has
proved that it is always possible to create ''N''-1 clue du-sum-ohs on an ''N''×''N'' grid, and has constructed several examples.
In sum number place (Samunamupure), the regions are of irregular shape and various sizes. The usual constraints of no repeated value in any row, column or region apply. The clues are given as sums of values within regions (e.g. a 4-cell region with sum 10 must consist of values 1,2,3,4 in some order). The minimum number of clues for Samunamupure is not known, nor even conjectured.
A variant on Miyuki Misawa's web site replaces sums with relations: the clues are symbols , '''<''' and '''>''' showing the relative values of (some but not all) adjacent region sums. She demonstrates an example with only eight relations. It is not known whether this is the best possible.
The approach described here was the historically first strategy employed to enumerate the Sudoku 9×9 grid solutions, as published by Felgenhauer and Jarvis in
Enumerating possible Sudoku grids .
The methods and algorithms used are very straight forward and provide a practical introduction to several mathematical concepts. The development is presented here for those wishing to explore these topics.
The strategy begins by analyzing the
Permutations of the top band used in valid solutions. Once the Band1
Symmetries and
Equivalence Class for the partial solutions are identified, the completions of the lower two bands are constructed and counted for each equivalence class. Summing the completions over the equivalence classes gives the total number of solutions as 6,670,903,752,021,072,936,960 (c. 6.67×10
21).
The Band1 algorithm proceeds as follows:
- Choose a canonical labeling of the digits by assigning values for B1, e.g.
1 2 3
4 5 6
7 8 9
:Compute the rest of the Band1 permutations relative to the B1 canonical choice.
- Compute the permutations of B2 by Partitioning the B1 cell values over the B2 row Triplets . From the triplet combinations compute the B2 permutations. There are k=0..3 ways to choose the:
:B1 r11 values for B2 r22, the rest must go to r23,
:B1 r12 values for B2 r23, the rest must go to r21,
:B1 r13 values for B2 r21, the rest must go to r22, i.e.
|   |
whiteb2c=d8f0ffalign=right=
|
|   |
Triplet rBR(box/row) Labelsalign=right=
|
|   |
r s2=1 s3=1 s4=r s5=2 s6=1 s7=r s8=3 s9=1=
|
|   |
rs11=1s12=2s13=rs14=2s15=2s16=rs17=3s18=2=
|
|   |
rs20=1s21=3s22=rs23=2s24=3s25=rs26=3s27=3=
|
|
|
|   |
whitealign=right=
|
|   |
Case 0 Matching Cells Tripletsalign=right=
|
|   |
1 s2=2 s3=3 s4=7 s5=8 s6=9 s7=4 s8=5 s9=6=
|
|   |
4s11=5s12=6s13=1s14=2s15=3s16=7s17=8s18=9=
|
|   |
7s20=8s21=9s22=4s23=5s24=6s25=1s26=2s27=3=
|
|   |
pinkb3c=pink=
|
|
|
|   |
whiteb2c=d8f0ffalign=right=
|
|   |
Case 1 Match - Triplet Cell Options=
|
|   |
1 s2=2 s3=3 s4=3 s5=3 s6=2 s7=3 s8=2 s9=1=
|
|   |
4s11=5s12=6s13=1s14=3s15=2s16=3s17=2s18=1=
|
|   |
7s20=8s21=9s22=1s23=2s24=1s25=3s26=2s27=1=
|
|   |
yellowc2=yellowc3=yellowc10=pinkc11=pinkc12=pink=
|
|   |
limec20=limec21=lime=
|
|   |
pinkc5=limec6=lime=
|
|   |
limec14=yellowc15=yellow=
|
|   |
yellowc23=pinkc24=pink=
|
|
|
|   |
whiteb2c=d8f0ffalign=right=
|
|   |
Case 2 Match - Triplet Cell Options=
|
|   |
1 s2=2 s3=3 s4=3 s5=2 s6=3 s7=3 s8=2 s9=1=
|
|   |
4s11=5s12=6s13=2s14=1s15=3s16=3s17=2s18=1=
|
|   |
7s20=8s21=9s22=2s23=1s24=1s25=3s26=2s27=1=
|
|   |
yellowc2=yellowc3=yellow=
|
|   |
pinkc11=pinkc12=pink=
|
|   |
limec20=limec21=lime=
|
|   |
pinkc5=pinkc6=lime=
|
|   |
limec14=limec15=yellow=
|
|   |
yellowc23=yellowc24=pink=
|
|
|
|   |
Sb : the size of Sb, be the number of sb elements (permutations) in {Link without Title}
|
|   |
{Sb} : be the number of equivalence classes
|
|
The Band1 symmetries (above) are ''solution permutation symmetries'' defined so that a permuted solution is also a solution. For the purpose of enumerating solutions, a ''counting symmetry'' for grid completion can be used to define band equivalence classes that yield a minimal number of classes.
''Counting symmetry'' partitions valid Band1 permutations into classes that place the same completion constraints on lower bands; all members of a band ''counting symmetry'' equivalence class must have the same number of grid completions since the completion constraints are equivalent. Counting symmetry constraints are identified by the Band1 column triplets (a column value set, no implied element order). Using band counting symmetry, a minimal generating
set of 44 equivalence classes was established.
|   |
whiteb2c=d8f0ffalign=right=
|
|   |
Column Triplets=
|
|   |
1 s2=2 s3=3 s4=4 s5=1 s6=2 s7=5 s8=1 s9=3=
|
|   |
4s11=5s12=6s13=5s14=3s15=6s16=7s17=2s18=6=
|
|   |
7s20=8s21=9s22=9s23=8s24=7s25=8s26=4s27=9=
|
|   |
pinkc14=pinkc23=pink=
|
|   |
pinkc17=pinkc26=pink=
|
|
|
|   |
whiteb3c=d8f0ffalign=right=
|
|   |
Ordered Column Triplets=
|
|   |
1 s2=2 s3=3 s4=1 s5=3 s6=5 s7=1 s8=2 s9=4=
|
|   |
4s11=5s12=6s13=2s14=6s15=7s16=3s17=6s18=5=
|
|   |
7s20=8s21=9s22=4s23=9s24=8s25=8s26=7s27=9=
|
|   |
pinkc13=pinkc22=pink=
|
|   |
pinkc16=pinkc25=pink=
|
|
|
|