Mathematics Of Sudoku Article Index for
Mathematics Of
Website Links For
Mathematics
 

Information About

Mathematics Of Sudoku





OVERVIEW



The mathematical analysis of Sudoku falls into two main areas: analyzing the properties of a) completed grids and b) puzzles. Grid analysis has largely focused on counting (enumerating) possible solutions for different ''variants''. Puzzle analysis centers on the initial ''given'' values. The techniques used in either are largely the same: Combinatorics and Permutation Group Theory , augmented by the dexterous application of programming tools.

There are many Sudoku Variants , (partially) characterized by the size (''N'') and shape of their regions. For classic Sudoku, ''N''=9 and the regions are 3x3 squares (blocks). A rectangular Sudoku uses rectangular regions of row-column dimension ''R''×''C''. For ''R''×1 (and 1×''C''), i.e. where the region is a row or column, Sudoku becomes a Latin Square .

Other Sudoku variants also exist, such as those with irregularly-shaped regions or with additional constraints ( Hypercube ) or different ( Samunamupure ) constraint types. See Sudoku - Variants for a discussion of variants and Sudoku Terms And Jargon for an expanded listing.

Mathematics uses elegant and succinct symbolism, often meaningful only with formal training. For this article, an abstract mathematical statement is used at the upper levels. Detail descriptions are deferred to sub-sections, usually with the term 'detail' in the section heading, and can be skipped where the condensed mathematical statement is (deemed) sufficient. The extended explanations can also be used as a practical introduction to the mathematical concepts involved.

The mathematics of Sudoku is relatively new area of exploration, mirroring the popularity of Sudoku itself. and expand the material here.


Mathematical context


The general problem of solving ''Sudoku'' puzzles on ''n''2 x ''n''2 boards of ''n'' x ''n'' blocks is known to be that knows the entire Game Tree .

Solving ''Sudoku'' puzzles can be expressed as a Graph Colouring problem. Let us talk about the 9x9 = 32x32 case. The aim of the puzzle in its standard form is to construct a proper 9-colouring of a particular graph, given a partial 9-colouring. The graph in question has 81 vertices, one vertex for each cell of the grid. The vertices can be labelled with the ordered pairs
(x,\, y), where x and y are integers between 1 and 9. In this case, two distinct vertices labelled by (x,\, y) and (x',\, y') are joined by an edge if and only if:
  • x = x'\, (same column) or,

  • y = y'\, (same row) or,

  • \lceil x/3 ceil = \lceil x'/3 ceil and \lceil y/3 ceil = \lceil y'/3 ceil (same 3x3 cell)

  • The puzzle is then completed by assigning an integer between 1 and 9 to each vertex, in such a way that vertices that are joined by an edge do not have the same integer assigned to them.


A valid ''Sudoku'' solution grid is also a . The result was derived through logic and Brute Force Computation . The derivation of this result was considerably simplified by analysis provided by Frazer Jarvis and the figure has been confirmed independently by Ed Russell. Russell and Jarvis also showed that when symmetries were taken into account, there were 5,472,730,538 solutions {Link without Title} . The number of valid ''Sudoku'' solution grids for the 16×16 derivation is not known.

The maximum number of givens that can be provided while still not rendering the solution unique, regardless of variation, is four short of a full grid; if two instances of two numbers each are missing and the cells they are to occupy are the corners of an orthogonal rectangle, and exactly two of these cells are within one region, there are two ways the numbers can be added. The inverse of this—the fewest givens that render a solution unique—is an Unsolved Problem , although the lowest number yet found for the standard variation without a symmetry constraint is 17, a number of which have been found by Japanese puzzle enthusiasts [http://www.csse.uwa.edu.au/~gordon/sudokumin.php , and 18 with the givens in rotationally symmetric cells.


Terminology

A ''puzzle'' is a partially completed ''grid''. The initial puzzle values are known as ''givens'' or ''clues''. The ''regions'' are also called ''boxes'' or ''blocks''. The use of the term ''square'' is generally avoided because of ambiguity.
Horizontally adjacent blocks constitute a ''band'' (the vertical equivalent is called a ''stack'').

A ''proper'' puzzle has a unique solution. The constraint'' 'each digit appears in each row, column and region' '' is called the ''One Rule''

See basic terms or the List Of Sudoku Terms And Jargon for an expanded list of terminology.


Type identification and classification

Sudoku types can be identified by their ''region'' dimensions. "3x3" Sudokus denotes the classic format. ''R''x''C'' denotes a rectangular region with ''R'' rows and ''C'' columns. The implied grid configuration has ''R'' blocks per band (''C'' blocks per stack), ''C''×''R'' bands×stacks and grid dimensions ''N''x''N'', with ''N'' = ''R'' · ''C''. Region dimensions are preferable to region size for characterisation, as size can be ambiguous, e.g. 2×6 and 3×4 regions both have size 12.

The following framework can be used to
structure the relation of Sudoku variants:
  • region size

  • ---rectangular regions, with a sub-types ordered by region perimeter length as needed


  • --square regions

  • ---irregular ( Polyomino ) regions, used for prime region sizes and other variants

  • The application of additional constraints can be used to generate further specializations.


The three principle (sub)classes of Sudoku are rectangular, square, prime (region).
The size ordering of each can be used to define an integer series, e.g. for square Sudoku, the integer series of possible solutions ().

Sudoku with square ''N''x''N'' regions are more symmetrical than immediately obvious from the ''One Rule''. Each row and column intersects N regions and shares N cells with each. The number of bands and stacks also equals N. Rectangular Sudoku do not have these properties. The "3x3" Sudoku is additionally unique: N is also the number of row-column-region constraints from the ''One Rule'' (i.e. there are N=3 types of ''units'').

See the List Of Sudoku Terms And Jargon for an expanded list and classification of variants.


Definition of terms and labels

  S1 1s2=2 s3=3 s4=4 s5=5 s6=6 s7=7 s8=8 s9=9=
  s10 2s11=B1s12= s13= s14=B2s15= s16= s17=B3s18= =
  s19 3s20=&nbsps21= s22= s23=&nbsps24= s25= s26=&nbsps27= =
  s28 4s29= s30= s31= s32= s33=c s34= s35=&nbsps36= =
  s37 5s38=B4s39= s40= s41=B5s42=5 s43= s44=B6s45= =
  s46 6s47= s48= s49=r s50=5 s51=6 s52= s53=&nbsps54= =
  s55 7s56= s57= s58= s59=&nbsps60= s61= s62=&nbsps63= =
  s64 8s65=B7s66= s67= s68=B8s69= s70= s71=B9s72= =
  s73 9s74= s75= s76= s77=&nbsps78= s79= s80=&nbsps81= =
  c11 yellowc14=yellowc17=yellowc38=yellowc41=yellowc44=yellow=
  c65 yellowc68=yellowc71=yellow=
  c33 e8f0ffc42=e8f0ffc49=e8f0ffc50=e8f0ffc51=e8f0ff=


Let
  • s be a solution to a Sudoku grid with specific dimensions, satisfying the ''One Rule'' constraints

  • S = {s}, be the set of all solutions

  { Border 1 cellpadding=2 cellspacing=0 align="center"


where ''b''R, C is the number of ways of completing a Sudoku band of ''R'' horizontally adjacent ''R''×''C'' boxes. Pettersen's algorithm , as implemented by Silver , is the currently the fastest known technique for exact evaluation of these ''b''R, C.

The band counts for problems whose full Sudoku grid-count is unknown are listed below. As in the previous section, "Dimensions" are those of the regions.

:# : same author, different method

The expression for the 4×C case is:

where:
: the outer summand is taken over all ''a'',''b'',''c'' such that 0<=''a'',''b'',''c'' and ''a''+''b''+''c''=2''C''
: the inner summand is taken over all ''k''12,''k''13,''k''14,''k''23,''k''24,''k''34 ≥ 0 such that
:: ''k''12,''k''34 ≤ ''a''    and
:: ''k''13,''k''24 ≤ ''b''    and
:: ''k''14,''k''23 ≤ ''c''    and
:: ''k''12+''k''13+''k''14 = ''a''-''k''12+''k''23+''k''24 = ''b''-''k''13+''c''-''k''23+''k''34 = ''c''-''k''14+''b''-''k''24+''a''-''k''34 = ''C''


Sudoku with additional constraints

The following are all restrictions of 3x3 Sudokus. The type names have not been standardised: click on the attribution links to see the definitions.

All Sudokus remain valid (no repeated numbers in any row, column or region) under the action of the Sudoku Preserving Symmetries (see also Jarvis ). Some Sudokus are special in that some operations merely have the effect of relabelling the digits; several of these are enumerated below.

Further calculations of this ilk combine to show that the number of essentially different Sudoku grids is 5,472,730,538, for example as demonstrated by Jarvis/Russell and verified by Pettersen . Similar methods have been applied to the 2x3 case, where Jarvis/Russell showed that there are 49 essentially different grids (see also the article by Bailey, Cameron and Connelly ), and to the 2x4 case, where Russell showed that there are 1,673,187 essentially different grids (not independently confirmed).


MINIMUM NUMBER OF GIVENS

Proper puzzles have a unique solution. An '''irreducible puzzle''' (or '''minimal puzzle''') is a proper puzzle from which no givens can removed leaving it a proper puzzle (with a single solution). It is possible to construct minimal puzzles with different numbers of givens. This section discusses the minimum number of givens for proper puzzles.


Ordinary Sudoku

The lowest known is 17 givens in general Sudoku, or 18 when the positions of the givens are constrained to be half-turn rotationally symmetric. It is conjectured that these are the best possible, evidence for which stems from extensive randomised searching:


Sudoku with additional constraints

Additional constraints (here, on 3×3 Sudokus) lead to a smaller minimum number of clues.


Sudoku with irregular regions

"Du-sum-oh" (a.k.a. "geometry number place") puzzles replace the 3×3 (or ''R''×''C'') regions of Sudoku with irregular shapes of a fixed size. Bob Harris has proved that it is always possible to create ''N''-1 clue du-sum-ohs on an ''N''×''N'' grid, and has constructed several examples.


Sum number place ("Killer Sudoku")

In sum number place (Samunamupure), the regions are of irregular shape and various sizes. The usual constraints of no repeated value in any row, column or region apply. The clues are given as sums of values within regions (e.g. a 4-cell region with sum 10 must consist of values 1,2,3,4 in some order). The minimum number of clues for Samunamupure is not known, nor even conjectured.

A variant on Miyuki Misawa's web site replaces sums with relations: the clues are symbols =, '''<''' and '''>''' showing the relative values of (some but not all) adjacent region sums. She demonstrates an example with only eight relations. It is not known whether this is the best possible.


METHOD AND ALGORITHM DETAILS FOR THE 9×9 GRID DIRECT ENUMERATION

The approach described here was the historically first strategy employed to enumerate the Sudoku 9×9 grid solutions, as published by Felgenhauer and Jarvis in
Enumerating possible Sudoku grids .
The methods and algorithms used are very straight forward and provide a practical introduction to several mathematical concepts. The development is presented here for those wishing to explore these topics.

The strategy begins by analyzing the Permutations of the top band used in valid solutions. Once the Band1 Symmetries and Equivalence Class for the partial solutions are identified, the completions of the lower two bands are constructed and counted for each equivalence class. Summing the completions over the equivalence classes gives the total number of solutions as 6,670,903,752,021,072,936,960 (c. 6.67×1021).

Counting the top band permutations

The Band1 algorithm proceeds as follows:

1 2 3
4 5 6
7 8 9

:Compute the rest of the Band1 permutations relative to the B1 canonical choice.

:B1 r12 values for B2 r23, the rest must go to r21,
:B1 r13 values for B2 r21, the rest must go to r22, i.e.
  {{Sudoku 3x9 Bandb1c whiteb2c=d8f0ffalign=right=
  title Triplet rBR(box/row) Labelsalign=right=
  s1 r s2=1 s3=1 s4=r s5=2 s6=1 s7=r s8=3 s9=1=
  s10 rs11=1s12=2s13=rs14=2s15=2s16=rs17=3s18=2=
  s19 rs20=1s21=3s22=rs23=2s24=3s25=rs26=3s27=3=


  {{Sudoku 3x9 Bandb1c whitealign=right=
  title Case 0 Matching Cells Tripletsalign=right=
  s1 1 s2=2 s3=3 s4=7 s5=8 s6=9 s7=4 s8=5 s9=6=
  s10 4s11=5s12=6s13=1s14=2s15=3s16=7s17=8s18=9=
  s19 7s20=8s21=9s22=4s23=5s24=6s25=1s26=2s27=3=
  b2c pinkb3c=pink=


  {{Sudoku 3x9 Bandb1c whiteb2c=d8f0ffalign=right=
  title Case 1 Match - Triplet Cell Options=
  s1 1 s2=2 s3=3 s4=3 s5=3 s6=2 s7=3 s8=2 s9=1=
  s10 4s11=5s12=6s13=1s14=3s15=2s16=3s17=2s18=1=
  s19 7s20=8s21=9s22=1s23=2s24=1s25=3s26=2s27=1=
  c1 yellowc2=yellowc3=yellowc10=pinkc11=pinkc12=pink=
  c19 limec20=limec21=lime=
  c4 pinkc5=limec6=lime=
  c13 limec14=yellowc15=yellow=
  c22 yellowc23=pinkc24=pink=


  {{Sudoku 3x9 Bandb1c whiteb2c=d8f0ffalign=right=
  title Case 2 Match - Triplet Cell Options=
  s1 1 s2=2 s3=3 s4=3 s5=2 s6=3 s7=3 s8=2 s9=1=
  s10 4s11=5s12=6s13=2s14=1s15=3s16=3s17=2s18=1=
  s19 7s20=8s21=9s22=2s23=1s24=1s25=3s26=2s27=1=
  c1 yellowc2=yellowc3=yellow=
  c10 pinkc11=pinkc12=pink=
  c19 limec20=limec21=lime=
  c4 pinkc5=pinkc6=lime=
  c13 limec14=limec15=yellow=
  c22 yellowc23=yellowc24=pink=


  Sbz Sb : the size of Sb, be the number of sb elements (permutations) in {Link without Title}
  {Sb}z {Sb} : be the number of equivalence classes



Solution and counting permutation symmetry

The Band1 symmetries (above) are ''solution permutation symmetries'' defined so that a permuted solution is also a solution. For the purpose of enumerating solutions, a ''counting symmetry'' for grid completion can be used to define band equivalence classes that yield a minimal number of classes.

''Counting symmetry'' partitions valid Band1 permutations into classes that place the same completion constraints on lower bands; all members of a band ''counting symmetry'' equivalence class must have the same number of grid completions since the completion constraints are equivalent. Counting symmetry constraints are identified by the Band1 column triplets (a column value set, no implied element order). Using band counting symmetry, a minimal generating set of 44 equivalence classes was established.
  {{Sudoku 3x9 Bandb1c whiteb2c=d8f0ffalign=right=
  title Column Triplets=
  s1 1 s2=2 s3=3 s4=4 s5=1 s6=2 s7=5 s8=1 s9=3=
  s10 4s11=5s12=6s13=5s14=3s15=6s16=7s17=2s18=6=
  s19 7s20=8s21=9s22=9s23=8s24=7s25=8s26=4s27=9=
  c5 pinkc14=pinkc23=pink=
  c8 pinkc17=pinkc26=pink=


  {{Sudoku 3x9 Bandb1c whiteb3c=d8f0ffalign=right=
  title Ordered Column Triplets=
  s1 1 s2=2 s3=3 s4=1 s5=3 s6=5 s7=1 s8=2 s9=4=
  s10 4s11=5s12=6s13=2s14=6s15=7s16=3s17=6s18=5=
  s19 7s20=8s21=9s22=4s23=9s24=8s25=8s26=7s27=9=
  c4 pinkc13=pinkc22=pink=
  c7 pinkc16=pinkc25=pink=