Type Inference




Type inference is the ability to deduce automatically, either partially or fully, the type of the value that an expression eventually evaluates to. Because this deduction is performed at compile time, the compiler can often infer the type of a variable or the type signature of a function without explicit type annotations having been given. In many cases type annotations can be omitted from a program entirely, provided the type inference system is robust enough or the program or language is simple enough.

To obtain the information required to infer the type of an expression that lacks an explicit type annotation, the compiler either aggregates and then reduces the type annotations given for its subexpressions (which may themselves be variables or functions), or relies on an implicit understanding of the types of various atomic values (e.g., () : Unit). In the presence of features such as polymorphism, however, it is not always possible for the compiler to infer as much, and type annotations are occasionally necessary for disambiguation.
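Both situations can be illustrated with a small Haskell sketch. The names flag, negate' and roundTrip are hypothetical and chosen only for this illustration. The first two definitions carry no annotations and their types follow from the atomic value True and from the known type of not. The third needs an inner annotation, because nothing in show (read s) determines the intermediate type; a compiled module would typically be rejected as ambiguous without it.

```haskell
-- No annotation needed: the type of 'flag' follows from the literal True,
-- so the compiler infers flag :: Bool.
flag = True

-- Also inferred: 'not' has type Bool -> Bool, hence negate' :: Bool -> Bool.
negate' b = not b

-- Ambiguous without help: 'read' can produce any type with a Read instance
-- and 'show' accepts any type with a Show instance, so the intermediate
-- type of `show (read s)` is not determined. The annotation `:: Int`
-- resolves the ambiguity.
roundTrip :: String -> String
roundTrip s = show (read s :: Int)
```

Loaded into GHCi, roundTrip "42" evaluates to "42".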


EXAMPLE

For example, let us consider the Haskell function length, which may be defined as:
length [] = 0
length (first:rest) = 1 + length rest

From this, it is evident that the function handles lists as inputs, and the base case of this recursive function returns an integer (Haskell "Int"). So we can reliably construct a type signature
length :: [a] -> Int

Since the function definition uses no ad-hoc polymorphic subfunctions on the list elements, the function is parametrically polymorphic: it works uniformly for lists of any element type.
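As a quick check, the definition can be written out with its signature and applied to lists of different element types. The function is renamed len here (a hypothetical name) only to avoid clashing with the Prelude's length. Note that if the signature is left out, GHC infers a slightly more general numeric result type, because the literals 0 and 1 are overloaded; the signature pins the result to Int.

```haskell
-- Named 'len' to avoid clashing with the Prelude's 'length'.
-- The type variable 'a' is never inspected, so any element type works.
len :: [a] -> Int
len []         = 0
len (_ : rest) = 1 + len rest
```

In GHCi, len "hello" evaluates to 5 and len [True, False] evaluates to 2, using the same definition at two different element types.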


HINDLEY-MILNER TYPE INFERENCE ALGORITHM

The algorithm most commonly used to perform type inference is the one now referred to as the Hindley-Milner or Damas-Milner algorithm.

The origin of this algorithm is the type inference algorithm for the simply typed lambda calculus, which was devised by Haskell B. Curry and Robert Feys in 1958.

In 1969 Roger Hindley extended this work and proved that their algorithm always inferred the most general type.

In 1978 Robin Milner, independently of Hindley's work, provided an equivalent algorithm.

In 1985 Luis Damas finally proved that Milner's algorithm is complete and extended it to support systems with polymorphic references.


The Algorithm

The algorithm proceeds in two steps. First, we need to generate a number of equations to solve, then we need to solve them.


Generating the equations

The algorithm used for generating the equations is similar to a regular type checker, so let's consider first a regular type checker for the typed lambda calculus given by

e \, ::= E \mid v \mid (\lambda v:\tau.\ e) \mid (e\, e)

and

\tau \, ::= T \mid \tau \to \tau

where E is a primitive expression (such as "3") and T is a primitive type (such as "Integer").

We want to construct a function f of type \epsilon \to t \to \tau, where \epsilon is the type of type environments and t is the type of terms. We assume that this function is already defined on primitives. The other cases are:

f\, \Gamma\, v = \tau whenever the binding v\, :\, \tau is in \Gamma

f\, \Gamma\, (g\, e) = \tau whenever \tau_1 = \tau_2 \to \tau where \tau_1 = f\, \Gamma\, g and \tau_2 = f\, \Gamma\, e.

f\, \Gamma\, (\lambda v:\tau.\ e) = \tau \to \tau_e where \tau_e = f\, \Gamma'\, e and \Gamma' is \Gamma extended by the binding v \,:\, \tau.

In this simple type-checking algorithm, the check simply fails whenever anything goes wrong.
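The three cases above translate almost directly into Haskell. The following is a minimal sketch rather than a definitive implementation: the names Type, Term, Env and typeOf are hypothetical, TInt stands in for the only primitive type T, integer literals stand in for the primitive expressions E, and the environment \Gamma is represented as an association list. Failure of the check is modelled by Nothing.

```haskell
-- Types: one primitive type (an assumption of this sketch) and function types.
data Type = TInt | TFun Type Type
  deriving (Eq, Show)

-- Terms: primitive E, variable v, annotated lambda, application.
data Term
  = Lit Integer
  | Var String
  | Lam String Type Term
  | App Term Term
  deriving Show

-- The type environment Gamma; newer bindings shadow older ones.
type Env = [(String, Type)]

-- typeOf env t plays the role of  f Gamma t.
-- It returns Nothing whenever one of the side conditions is not met.
typeOf :: Env -> Term -> Maybe Type
typeOf _   (Lit _)      = Just TInt
typeOf env (Var v)      = lookup v env
typeOf env (Lam v ty e) = TFun ty <$> typeOf ((v, ty) : env) e
typeOf env (App g e)    = do
  t1 <- typeOf env g
  t2 <- typeOf env e
  case t1 of
    TFun a r | a == t2 -> Just r   -- tau_1 = tau_2 -> tau
    _                  -> Nothing  -- not a function, or argument mismatch
```

For instance, typeOf [] (Lam "x" TInt (Var "x")) yields Just (TFun TInt TInt), while typeOf [] (App (Lit 3) (Lit 4)) yields Nothing because 3 is not a function.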


Now we develop a more sophisticated algorithm that can deal with type variables and constraints on them. To this end, we extend the set T of primitive types to include an infinite supply of type variables, denoted by lowercase Greek letters \alpha, \beta, ...

This is a limited overview. For now, refer to Types and Programming Languages by Benjamin Pierce, Sections 22.1-4.
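To give a flavour of what generating equations can look like, here is a rough Haskell sketch. It is an assumption-laden illustration, not the algorithm as specified in any particular source: lambdas are taken to be unannotated, type variables are numbered integers, fresh variables are produced by threading a counter, and the names Type, Term, Constraint and gen are all hypothetical. Instead of failing when an application does not fit, the function records an equation to be solved later.

```haskell
-- Types extended with type variables (alpha, beta, ... numbered here).
data Type = TInt | TVar Int | TFun Type Type
  deriving (Eq, Show)

-- Terms with unannotated lambdas (an assumption of this sketch).
data Term = Lit Integer | Var String | Lam String Term | App Term Term
  deriving Show

type Env        = [(String, Type)]
type Constraint = (Type, Type)   -- an equation  lhs = rhs

-- gen env n t returns (type of t, equations, next unused variable index),
-- or Nothing if t mentions a variable that is not bound in env.
gen :: Env -> Int -> Term -> Maybe (Type, [Constraint], Int)
gen _   n (Lit _)   = Just (TInt, [], n)
gen env n (Var v)   = do
  ty <- lookup v env
  Just (ty, [], n)
gen env n (Lam v e) = do
  let a = TVar n                       -- fresh variable for the parameter
  (tb, cs, n') <- gen ((v, a) : env) (n + 1) e
  Just (TFun a tb, cs, n')
gen env n (App g e) = do
  (tg, cs1, n1) <- gen env n g
  (te, cs2, n2) <- gen env n1 e
  let r = TVar n2                      -- fresh variable for the result
  -- Record the equation  type(g) = type(e) -> r  instead of checking it.
  Just (r, cs1 ++ cs2 ++ [(tg, TFun te r)], n2 + 1)
```

For the term Lam "x" (App (Var "x") (Lit 1)), calling gen [] 0 produces the type TFun (TVar 0) (TVar 1) together with the single equation (TVar 0, TFun TInt (TVar 1)), that is, \alpha_0 = Int \to \alpha_1.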


Solving the equations

Solving the equations proceeds by unification, which is, perhaps surprisingly, a rather simple algorithm. The function u operates on a set of type equations and returns a structure called a "substitution". A substitution is simply a mapping from type variables to types. Substitutions can be composed and extended in the obvious ways.

Unifying the empty set of equations is easy enough: u\, \emptyset = \mathbf{i}, where \mathbf{i} is the identity substitution.

Unifying a variable with a type goes this way: u\, ([\alpha = T] \cup C) = u\, (C') \cdot (\alpha \mapsto T), where \cdot is the substitution composition operator, and C' is the set of remaining constraints C with the new substitution \alpha \mapsto T applied to it.

Of course, u\, ([T = \alpha] \cup C) = u\, ([\alpha = T] \cup C).

The interesting case remains as u\, ([S \to S' = T \to T'] \cup C) = u\, (\{[S = T], [S' = T']\} \cup C).
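The cases of u can be sketched in Haskell as follows. This is a minimal illustration under stated assumptions: equations are kept in a list rather than a set, type variables are numbered integers, and the names Type, Subst, apply, compose, occurs and unify are hypothetical. Three details are additions not spelled out in the text above: trivially true equations are dropped, a variable is never bound to a type that contains it (the occurs check), and any remaining mismatch makes unification fail with Nothing.

```haskell
data Type = TInt | TVar Int | TFun Type Type
  deriving (Eq, Show)

type Constraint = (Type, Type)
type Subst      = [(Int, Type)]   -- variable index |-> type

-- Apply a substitution to a type.
apply :: Subst -> Type -> Type
apply s (TVar a)   = maybe (TVar a) id (lookup a s)
apply s (TFun x y) = TFun (apply s x) (apply s y)
apply _ TInt       = TInt

-- compose s2 s1 acts like applying s1 first and then s2.
compose :: Subst -> Subst -> Subst
compose s2 s1 = [(a, apply s2 t) | (a, t) <- s1] ++ s2

-- Does variable a occur inside the given type?
occurs :: Int -> Type -> Bool
occurs a (TVar b)   = a == b
occurs a (TFun x y) = occurs a x || occurs a y
occurs _ TInt       = False

unify :: [Constraint] -> Maybe Subst
unify [] = Just []                         -- identity substitution
unify ((s, t) : c)
  | s == t = unify c                       -- trivial equation (addition)
unify ((TVar a, t) : c)
  | occurs a t = Nothing                   -- occurs check (addition)
  | otherwise  = do
      let s1 = [(a, t)]
      s2 <- unify [(apply s1 l, apply s1 r) | (l, r) <- c]   -- u(C')
      Just (compose s2 s1)                 -- u(C') . (alpha |-> T)
unify ((t, TVar a) : c) = unify ((TVar a, t) : c)
unify ((TFun s s', TFun t t') : c) = unify ((s, t) : (s', t') : c)
unify _ = Nothing                          -- e.g. Int = S -> S'
```

For example, unify [(TFun (TVar 0) TInt, TFun TInt (TVar 1))] yields Just [(0,TInt),(1,TInt)], whereas unify [(TInt, TFun TInt TInt)] yields Nothing.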

