Type inference

from Wikipedia, the free encyclopedia

With type inference (from English type "(data) type" or " data type " and inference "conclusion"), also called type derivation , a lot of paperwork can be saved in some (strongly typed) programming languages by not having to write down type specifications, which can be derived (reconstructed) from the remaining information and the typing rules; the same rules are used for this purpose, which are also used for type testing, the further development of which the type inference is to be seen in a certain way. During the implementation, the most general type ( principal type ) of a term is determined by unification .

The development of type inference (for ML ) by Milner was a milestone in the development of programming languages. It required, but at the same time enabled more sophisticated type systems , which thus gained considerably in importance. Certain language properties such as type matching and sometimes overloading have been pushed back because they collide with type inference.

Languages ​​that use type inference are for example: C ++ 11 , C # , Clean , Crystal , D , F # , FreeBASIC , Go , Haskell , Julia , Kotlin , ML , Nim , OCaml , Opa , R , Python , Rust , Scala , Swift , Vala , and Visual Basic .

example

The following code is given:

int a;
c = a + b;

The type system of the respective programming language (if it has a corresponding type system including a strict set of rules) can now automatically infer that the variable b must have the type int , since the variable a is already of the type int and the operator + does not have two different values Type can be used. In addition, the variable c must also be of type int , since the result of an int addition is also of type int .

Type inference is important to help the programmer spot careless mistakes quickly. For example, in a strictly typed language like SML , it is not possible to compare whole numbers with Boolean values . In order to avoid and find precisely such errors, types are derived for all expressions using type inference, and a check is carried out to determine whether all operations performed on the expressions conform to the type (if, for example, as assumed above, additions are only allowed between two numbers of the same type are).

A slightly more complex example of type inference is the derivation of the type of function f in the following SML code:

fun f(a, b, c) = if b then a(b) else c + 1

First, the variable b must be of the type bool (truth value), since it is placed after the if in the if-then-else statement . Second, we can say that the entire function must return an int , since both the then and the else part must be of the same type and c + 1of type int, since 1 is of type int , and therefore also - because of the Plus operator - c must be of type int. Now one can say that a has to be a function, since a is applied to b in the then part. However, since the then and else parts must have the same return type, the result of the function a applied to b must also be of type int . The following type results for the function f:

f : ((bool->int) * bool * int) -> int

Note: In the example above, the typing rules of the SML language were used explicitly . Other languages ​​such as C ++ or Java have different typing rules so that the type derivation may look different there or may not be possible at all due to the weak typing of the language (several different types are allowed in many places).

Hindley-Milner

Hindley-Milner (HM) is a classic method of type inference with parametric polymorphism for the lambda calculus . It was first described by J. Roger Hindley and later rediscovered by Robin Milner . Luis Damas contributed a detailed formal analysis and proof of the method in his doctoral thesis, which is why the procedure is also known as Damas-Milner . Among the salient features of the HM are completeness and the ability to determine the most general type of a given source without the addition of annotations or other cues. HM is an efficient method that can determine the typing in almost linear time with respect to the size of the source, which makes it practical for creating large programs. HM is preferably used in functional languages . It was first implemented as part of the ML programming language. Since then, HM has been expanded in a variety of ways, including limited types such as those used in Haskell .

See also

literature

Remarks

  1. He has rediscovered what had already been found by Hindley (1969), drawing on ideas from Curry from the 1950s. Pierce, TAPL, 336.

Individual evidence

  1. ^ R. Hindley: The Principal Type-Scheme of an Object in Combinatory Logic . In: Transactions of the American Mathematical Society , Vol. 146, 1969, pp. 29-60 jstor.org
  2. Milner: A Theory of Type Polymorphism in Programming . In: Journal of Computer and System Science (JCSS), 17, 1978, pp. 348–374, citeseerx.ist.psu.edu (PDF)
  3. Luis Damas: Type Assignment in Programming Languages . PhD thesis, University of Edinburgh, 1985 (CST-33-85)
  4. Damas, Milner: Principal type schemes for functional programs . Retrieved May 21, 2020 . (PDF) 9th Symposium on Principles of programming languages ​​(POPL'82), ACM, 1982, pp. 207-212.