Hindley-Milner type inference

from Wikipedia, the free encyclopedia

Hindley-Milner (HM) is a method of type inference with parametric polymorphism for the lambda calculus. It was first described by J. Roger Hindley and later rediscovered by Robin Milner. Luis Damas contributed a detailed formal analysis and a proof of the method in his doctoral thesis, which is why the procedure is also known as Damas-Milner. Among the salient features of HM are its completeness and its ability to determine the most general type of a given source without annotations or other hints. HM is an efficient method that can determine the typing in almost linear time with respect to the size of the source, making it practical for typing large programs. HM is chiefly used in functional languages such as OCaml / Reason. It was first implemented as part of the type system of the ML programming language. Since then, HM has been extended in a number of ways, notably with constrained types such as those used in Haskell.

Introduction

In the structure of their original publication, Damas and Milner separate two quite different tasks: the first is to describe which types an expression can have; the other is to provide an algorithm that actually determines a type.

Considering both aspects separately makes it possible to concentrate on the logic (i.e. the meaning) behind the algorithm, and at the same time to establish a test for the properties of the method. How expressions and types fit together is described with the help of an inference system. Like any proof system, it allows a conclusion to be reached in different ways. Since one and the same expression can justifiably have different types, the deduction system can also come to quite different conclusions about an expression. In contrast, in the type inference method itself (algorithm W), each execution step is uniquely determined. Of course, decisions may have been made in the construction of the algorithm that do not appear in the logic and that require closer examination and justification, which would not be visible without the above differentiation.

The syntax

Expressions:

  e ::= x | e₁ e₂ | λ x. e | let x = e₁ in e₂

Types:

  mono τ ::= α | C τ₁ … τₙ
  poly σ ::= τ | ∀ α. σ

Logic and algorithm share the terms "expression" and "type", the form of which is specified by the syntax.

The expressions to be typed are those of the lambda calculus , extended by a let expression.

The form e₁ e₂ represents the usual written application of functions, while the abstraction λ x. e denotes the anonymous function or function literal, which is now available in many contemporary programming languages, if perhaps only in a more verbose spelling.
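
The expression syntax can be modeled as a small abstract syntax tree; this is a sketch, and the constructor names (Var, App, Abs, Let) are illustrative, not taken from the original publication.

```python
from dataclasses import dataclass

# Abstract syntax of the lambda calculus extended with let:
# e ::= x | e1 e2 | λx. e | let x = e1 in e2

@dataclass
class Var:            # a value variable x
    name: str

@dataclass
class App:            # application e1 e2
    fn: object
    arg: object

@dataclass
class Abs:            # abstraction λx. e, the anonymous function
    param: str
    body: object

@dataclass
class Let:            # let x = e1 in e2
    name: str
    defn: object
    body: object

# the identity function introduced by a let: let id = λx. x in id
example = Let("id", Abs("x", Var("x")), Var("id"))
```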

The set of types is divided into two groups, called mono- and polytypes.

Monotypes, syntactically terms, always designate particular types, in the sense that a monotype is equal only to itself and different from all others. Characteristic representatives of the monotypes are type constants such as int or string. Types can be parametric, such as Map (Set string) int. All of these types are examples of applications of type functions, e.g. string⁰, →², Map² in the previous examples, where the superscript indicates the number of type parameters. While the choice of type functions is in principle arbitrary, in the context of HM it must contain at least →², the function type, conveniently written infix. For example, a function that maps integers to strings has the type int → string.

Perhaps somewhat surprisingly, type variables are also monotypes. A type variable α standing alone denotes a type that is just as concrete as int or string, and different from both. A type variable occurring as a monotype behaves like a type constant about which one simply has no further information. Accordingly, a function with the type α → α maps values of the type α to itself only, and can be applied only to values of this type and to no others.

In contrast to this, a function with the polytype ∀α. α → α can map any value to a value of the same type. The identity function is a value of this type. Another example, ∀α. (Set α) → int, is the type of a function that maps any finite set to an integer; the function returning the number of elements in the set is a value of this type. Note that quantifiers may only appear at the top level, i.e. a type such as ∀α. α → ∀β. β is excluded by the syntax. Furthermore, monotypes are included among the polytypes, so that types as a whole have the general form ∀α₁ … ∀αₙ. τ, with n ≥ 0.
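
The division into mono- and polytypes can be modeled directly; this is a sketch, and the constructor names (TVar, TApp, Forall) and helpers are illustrative.

```python
from dataclasses import dataclass
from typing import Tuple

# Monotypes: type variables and applications of type functions
@dataclass(frozen=True)
class TVar:                 # a type variable α
    name: str

@dataclass(frozen=True)
class TApp:                 # C τ1 … τn, e.g. int (n = 0) or τ1 → τ2 (C = "->")
    name: str
    args: Tuple = ()

# Polytypes: a monotype under a (possibly empty) top-level quantifier prefix
@dataclass(frozen=True)
class Forall:               # ∀α1 … ∀αn. τ
    vars: Tuple[str, ...]
    mono: object

def fn(a, b):
    """Convenience constructor for the function type a -> b."""
    return TApp("->", (a, b))

int_t = TApp("int")
str_t = TApp("string")

# int -> string: the type of a function mapping integers to strings
int_to_string = fn(int_t, str_t)

# ∀α. α -> α: the polytype of the identity function
id_type = Forall(("a",), fn(TVar("a"), TVar("a")))
```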

Free type variables


In a type ∀ α₁ … ∀ αₙ. τ, the symbol ∀ is the quantifier binding the type variables α₁, …, αₙ in the monotype τ. These variables are called quantified. Every occurrence of a quantified variable in τ is called bound, and all unbound type variables are called free. As in the lambda calculus, the notion of free and bound variables is fundamental for understanding the meaning of types.
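
The distinction can be computed mechanically; this sketch collects the free type variables of a type, using illustrative ad-hoc constructors (TVar, TApp, Forall).

```python
from dataclasses import dataclass
from typing import Tuple

@dataclass(frozen=True)
class TVar:                 # a type variable
    name: str

@dataclass(frozen=True)
class TApp:                 # an application of a type function
    name: str
    args: Tuple = ()

@dataclass(frozen=True)
class Forall:               # top-level quantification
    vars: Tuple[str, ...]
    mono: object

def free_vars(t):
    """All type variables occurring unbound in t."""
    if isinstance(t, TVar):
        return {t.name}
    if isinstance(t, TApp):
        return set().union(set(), *(free_vars(a) for a in t.args))
    if isinstance(t, Forall):
        # quantified variables are bound, hence removed
        return free_vars(t.mono) - set(t.vars)
    raise TypeError(t)

# In ∀γ. γ -> α, the variable γ is bound while α remains free
t = Forall(("g",), TApp("->", (TVar("g"), TVar("a"))))
assert free_vars(t) == {"a"}
```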

This is perhaps the hardest part of HM, possibly because polytypes that contain free type variables cannot be expressed in programming languages like Haskell. Likewise, there are no free variables in Prolog clauses. In particular, developers experienced with both languages, who in fact know all the ingredients of HM, can easily overlook this point. In Haskell, for example, all type variables are implicitly quantified, i.e. the Haskell type a -> a means ∀α. α → α here. Since types like α → α, although they occur during inference in Haskell, cannot be expressed there, they are easily confused with their quantified versions.

Which functions, then, can have a type containing a mixture of bound and unbound type variables, and what does a free type variable in such a type mean?

Example 1

See Example 1, with type annotations in square brackets. Obviously, the parameter of the inner abstraction is not used in its body; instead, a variable bound in the enclosing abstraction is. As a consequence, the inner function accepts any value as an argument, while it delivers a value that is bound outside of it, and therefore also that value's type.

In contrast, the outer function has a type in which all type variables occur bound. If one evaluates it for an example argument, the result is a function whose type perfectly reflects that the monotype in the inner function's type was refined by the call.

In this example, the free monotype variable in the type of the inner function acquires meaning through the quantification in the outer scope, namely in the type of the enclosing function. So within this example, the same type variable occurs both bound and free in different types. Consequently, a free type variable cannot be interpreted without knowledge of the context, beyond stating that it is a monotype. In other words, a type is in general not meaningful without knowledge of its context.

Context and typing


In order to bring the previously separate parts of the syntax, expressions and types, together in a meaningful way, a third part is required: the context. Syntactically, the context is a list of pairs x : σ, called assumptions, which assign a type to each free value variable of an expression. All three parts together form a typing of the form Γ ⊢ e : σ, which states that under the assumptions Γ the expression e has the type σ.

Since the complete syntax is now available, a meaningful statement can be made about the type of the inner function in Example 1. In contrast to the previous formulations, the monotype variable is no longer free, i.e. meaningless, but bound in the context as the type of a value variable. Whether a type variable occurs free or bound in the context obviously plays an important role for a type within a typing, so this is specified in the box on the side.

Notes on expressiveness

Since the expression syntax may seem far too weak to a reader unfamiliar with the lambda calculus, and since the examples may reinforce this prejudice, it should be noted that HM does not deal with toy languages. A key result of research on computability is that the above expression syntax (even without the let clause) is strong enough to describe any computable function. In addition, all further constructions of programming languages can be translated fairly directly into expressions of the lambda calculus. This is why these simple expressions serve as a model for programming languages. A method that works well for the lambda calculus can, by means of the transformations just mentioned, be transferred to all, or at least many, other syntactic constructions of a particular programming language.

For example, the additional expression variant let x = e₁ in e₂ can be transformed to (λ x. e₂) e₁. It was added to the expression syntax of HM only to support generalization during type inference, not because the syntax lacks computational power. So HM deals with the inference of types in programs in general, and the various functional languages that use this method show how well a result formulated only for the syntax of the lambda calculus can be extended to syntactically richer languages.
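
The transformation just described can be written down as a small recursive rewrite; the constructor names (Var, App, Abs, Let) are illustrative, not from the paper.

```python
from dataclasses import dataclass

@dataclass
class Var:
    name: str

@dataclass
class App:
    fn: object
    arg: object

@dataclass
class Abs:
    param: str
    body: object

@dataclass
class Let:
    name: str
    defn: object
    body: object

def desugar(e):
    """Recursively replace let x = e1 in e2 by (λx. e2) e1."""
    if isinstance(e, Let):
        return App(Abs(e.name, desugar(e.body)), desugar(e.defn))
    if isinstance(e, App):
        return App(desugar(e.fn), desugar(e.arg))
    if isinstance(e, Abs):
        return Abs(e.param, desugar(e.body))
    return e

# let x = one in x  becomes  (λx. x) one
assert desugar(Let("x", Var("one"), Var("x"))) == App(Abs("x", Var("x")), Var("one"))
```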

In contrast to the impression that the expressions are too weak for practical use, they are actually too expressive to be typed at all (in general). This is a consequence of the undecidability of semantic statements about expressions of the lambda calculus. Correspondingly, computing the type of arbitrary programs is in general a hopeless endeavor. Depending on the nature of the type system, it may either fail to terminate or otherwise refuse service.

HM belongs to the latter group of type systems. A collapse of the type machinery appears here as a rather subtle situation, in which only one and the same type is determined for the expressions of interest. This is not a flaw of HM, but a property inherent in the problem of typing itself, and it can easily be reproduced in any strongly typed language: one encodes an evaluator (i.e. the universal function) for the "too simple" expressions. This yields a single concrete type that represents the universal data type as it occurs in untyped languages. The host language's type machinery has then collapsed and can no longer distinguish between the various kinds of values passed to the evaluator or received from it. In this situation, the type machinery still determines or checks types, but always the same one, just as if the type system no longer existed.

Order of polymorphic types

While the equality of monotypes is purely syntactic, polytypes stand in a richer relationship to other types, expressed by a specialization relation that states when one type is more special than another.

When a polymorphic function is applied to a value, it must adapt its form to the specific type of that value; during this adjustment it also changes its type to match that of the parameter. For example, if the identity function with the type ∀α. α → α is applied to a number of the type int, the two cannot initially work together, since all types are different and do not match. What is needed in this situation is a function of the type int → int. For this purpose, the polymorphic identity is transformed into a monomorphic version of itself as part of the application. In terms of specialization, ∀α. α → α is specialized to int → int.

The change of shape of polymorphic values is not completely arbitrary, but limited by the original polytype. Following the process in this example, the specialization rule could be formulated thus: a polymorphic type ∀α. τ is specialized by consistently replacing every occurrence of α in τ, after which the quantification is dropped. While this rule works well as long as the substitute is a monotype, it fails when a polytype is substituted. If one tries this with the identity type, one obtains an unsyntactic type with nested quantification. And not only that: even if nested quantification were allowed in the syntax, the result of such a replacement would no longer have the property of the original type that parameter and result of the function are linked, since the two subtypes would have become independent of each other and would each allow specialization with different types, which would hardly be right for an identity function.

The syntactic restriction of quantification to the top level prevents this unwanted generalization during specialization. In such a case, the more specific type must be substituted instead.

The specialization just made can be cancelled by another one that substitutes a fresh quantified variable, i.e. a renaming. With respect to the relation, this means that syntactically different polytypes are equal up to the renaming of their quantified variables.

Specialization rule

If one now concentrates only on the question of whether one type is more special than another, and no longer on what a more special type is used for, then specialization can be summarized as in the adjacent box. Going around clockwise, a type is specialized by consistently replacing each of its quantified variables with an arbitrary monotype, so that a monotype is obtained overall. Finally, type variables that did not occur free in the original type may optionally be quantified.

The specialization rule ensures that no free variable, i.e. no monotype variable in the original type, is unintentionally bound by a quantifier. Variables that were originally quantified can, however, be replaced (consistently) at will, including by types that introduce new quantified or unquantified variables.

Starting from a polytype, a specialization can replace a quantified variable in the body either with another quantified variable, ultimately a renaming, or with a type constant (including the function type), whose parameters, if any, are each filled with a monotype or a quantified type variable. As soon as a quantified variable has been replaced by a type application, this can no longer be undone by further specialization, as was possible with the replacement by a quantified variable: type applications are there to stay. Only if they contain a further quantified type variable can the specialization be continued by further replacement.
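
The specialization step can be sketched as a substitution that replaces quantified variables by monotypes and drops the quantifier; the constructor and function names here are illustrative.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

@dataclass(frozen=True)
class TVar:
    name: str

@dataclass(frozen=True)
class TApp:
    name: str
    args: Tuple = ()

@dataclass(frozen=True)
class Forall:
    vars: Tuple[str, ...]
    mono: object

def subst(t, s: Dict[str, object]):
    """Consistently replace type variables in a monotype."""
    if isinstance(t, TVar):
        return s.get(t.name, t)
    return TApp(t.name, tuple(subst(a, s) for a in t.args))

def specialize(sigma: Forall, s: Dict[str, object]):
    """Replace quantified variables by monotypes, then drop the quantifier."""
    # only quantified variables may be replaced; free variables must stay
    assert all(v in sigma.vars for v in s)
    return subst(sigma.mono, s)

int_t = TApp("int")
id_type = Forall(("a",), TApp("->", (TVar("a"), TVar("a"))))

# ∀a. a -> a specialized with a ↦ int yields int -> int
assert specialize(id_type, {"a": int_t}) == TApp("->", (int_t, int_t))
```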

Apart from the renaming already mentioned, specialization does not introduce any further equalities on polytypes: polytypes are syntactically equal exactly up to the renaming of their bound variables. The specialization relation is reflexive and transitive, and antisymmetric up to this renaming; it is thus a partial order on polytypes modulo renaming of bound variables.

The deduction machinery

The syntax of the rules

The syntax of HM is extended by the syntax of the inference rules, which form the body of the formal system, using the typings as judgments. The rules define which conclusions can be drawn from which premises. In addition to judgments, some of the conditions introduced above may also be used as premises.

A proof using the rules is a sequence of judgments such that all premises are listed before their conclusions. Examples 2 and 3 below show a possible format for proofs. From left to right, each line shows the conclusion, the name of the rule applied, and its premises, either by referring to an earlier line, if the premise is a judgment, or by explicitly stating the predicate.

The typing rules

Declarative rule system

The box on the side shows the deduction rules of the HM type system. They can roughly be divided into two groups:

The first four rules are centered around the syntax, providing one rule for each of the four expression forms (variable, application, abstraction, let). Their meaning is fairly obvious at first glance, as they decompose each expression, prove their subexpressions, and combine the individual types from the premises into the type in the conclusion.

The second group is formed by the two remaining rules, which handle the specialization and generalization of types. While the specialization rule has already been described in the section on specialization, the generalization rule complements it in the opposite direction: it allows the quantification of a monotype variable that is not bound in the context. The need for this restriction was motivated in the section on free type variables.

The following two examples show the rule system in action.

Example 2 : A proof for with can be written as follows:

Example 3 : To demonstrate the generalization, the following is shown:

Main type

As mentioned in the introduction, the rules allow different types to be deduced for the same expression; see Example 2, steps 1 and 2, and Example 3, steps 2 and 3, for three different types of the same expression. Obviously, the different results are not completely independent; they are linked by the type order. It is an essential property of the rule system and the type order that, whenever more than one type can be deduced for an expression, a uniquely defined most general type exists among them (modulo equality), in the sense that all the others are specializations of it. While the rule system must allow specialized types to be derived, a type inference algorithm should return this most general, or principal, type as its result.

Let polymorphism

It is not immediately evident that the set of rules encodes under which circumstances a type may be generalized and when not. This is achieved by a slight variation in the use of mono- and polytypes in the rules for abstraction and for let.

The abstraction rule adds the value variable for the function's parameter to the context with a monomorphic type, while the let rule enters the variable into the environment in polymorphic form. Although in both cases the presence of the variable in the context prevents the use of the generalization rule for any monotype variable in its assumption, this regulation forces the parameter of a function to remain monomorphic within the expression, while the variable of a let expression can already be introduced polymorphically, which makes specializations possible.

As a consequence of this regulation, no type can be deduced for an expression that applies the parameter of a λ-abstraction at two different types, since the parameter is in a monomorphic position, while the corresponding let expression receives a type, because the let-bound variable is treated polymorphically. Note that this behavior is in stark contrast to the common definition of let as a mere textual abbreviation, which is why the let variant was included in the expression syntax in the first place. This distinction is called let polymorphism and is peculiar to HM.

Transition to the algorithm

Now that the inference system of HM is at hand, one could present an algorithm and check it against the rules. Alternatively, the algorithm can be derived by looking more closely at how the rules interact and how proofs are formed. This path is followed in the remainder of this article, focusing on the choices one can make while proving a typing.

Degrees of freedom in rule selection

If one isolates the places in a proof where no choice is possible at all, one obtains the first group of rules, centered around the expression syntax: they determine the skeleton of the proof, since each of these inference rules corresponds to exactly one rule of the syntax, while between the premises and conclusions of these fixed rule applications, chains of specialization and generalization steps may occur. All proofs must have this outlined shape.

Since these chains of specialization and generalization are the only choice points in a proof with regard to rule selection, the shape of the proofs suggests the question of whether one can pin down exactly where these chains are needed. This is indeed possible and leads to a variant of the rule system without these two rules.

Syntax-controlled rule system


A contemporary treatment of HM uses a purely syntax-directed rule system, following Clement, as an intermediate step. In this system, the specialization is placed directly after the variable rule and merged into it, while the generalization is incorporated into the let rule. There, generalization is performed by a function that always produces the most general type by quantifying all monotype variables not bound in the context.
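
The generalization function just mentioned can be sketched directly: quantify every variable free in the type but not free in the context. The constructor and function names here are illustrative.

```python
from dataclasses import dataclass
from typing import Set, Tuple

@dataclass(frozen=True)
class TVar:
    name: str

@dataclass(frozen=True)
class TApp:
    name: str
    args: Tuple = ()

@dataclass(frozen=True)
class Forall:
    vars: Tuple[str, ...]
    mono: object

def free_vars(t) -> Set[str]:
    if isinstance(t, TVar):
        return {t.name}
    return set().union(set(), *(free_vars(a) for a in t.args))

def generalize(env_free: Set[str], t):
    """Quantify all monotype variables of t that are not bound in the context;
    env_free is the set of type variables occurring free in the context."""
    qs = tuple(sorted(free_vars(t) - env_free))
    return Forall(qs, t) if qs else t

arrow = TApp("->", (TVar("a"), TVar("a")))
# with an empty context, a -> a generalizes to ∀a. a -> a
assert generalize(set(), arrow) == Forall(("a",), arrow)
# if a occurs free in the context, it must stay unquantified
assert generalize({"a"}, arrow) == arrow
```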

To verify that this new rule system is equivalent to the original, one has to show that the two systems derive the same typings, which breaks down into two partial proofs:

  • Everything provable in the syntax-directed system is provable in the original system ( Correctness )
  • Everything provable in the original system is provable in the syntax-directed system ( Completeness )

While correctness can be seen by decomposing each rule of the syntax-directed system into proofs in the original, it is also apparent that the syntax-directed system is incomplete, since one cannot, for example, derive λ x. x : ∀α. α → α in it, but at best only λ x. x : α → α. However, an only slightly weaker version of completeness can be shown, namely:

which means that the principal type of an expression can be derived in the syntax-directed system if one is allowed to generalize the proof at the end.

Note that, compared to the original system, the syntax-directed system contains only monotypes in the judgments of its rules.

Degrees of freedom in rule instantiation

Given an expression, one is free to choose instances for all variables of the rules that are not already determined by the expression. These are essentially the instances of the type variables in the rules. If one works toward the principal type, the choice can be restricted to picking suitable types at the places where fresh type variables are introduced. The decision for an appropriate choice cannot be made locally; its quality only becomes visible in the premises of the application rule, the only rule in which two different types, that of the formal and that of the actual parameter, must come together as one.

Therefore, a general strategy for finding a proof would be to make the most general assumption, a fresh type variable, at each such place, and to refine these choices until all the constraints imposed by the rules are satisfied. Fortunately, neither iteration nor trial and error is necessary, since an effective method is known that computes all the necessary decisions: unification, due to Robinson, in combination with the so-called union-find algorithm.

To summarize the union-find method briefly: given the set of all types in a proof, it allows them to be partitioned into equivalence classes by means of a union procedure, and a representative of each class to be determined by means of a find procedure. Emphasizing the word procedure in the sense of side effect, we leave the realm of logic here in order to prepare an effective algorithm. The representative of a union of two types is determined in such a way that, if both are type variables, the representative can be either of them, while in the union of a variable and a type application, the application becomes the representative. Given such an implementation of union-find, the unification of two monotypes can be formulated as follows:

unify(ta, tb):
  ta = find(ta)
  tb = find(tb)
  if ta and tb are both applications of the form D p1..pn with identical D and n then
    unify(ta[i], tb[i]) for each corresponding i-th parameter
  else
  if at least one of ta, tb is a type variable then
    union(ta, tb)
  else
    error 'The types ta and tb do not match.'
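
The pseudocode above can be transcribed directly into a runnable sketch; the type representation (TVar, TApp) is ad hoc, and the occurs check is omitted here.

```python
from dataclasses import dataclass
from typing import Tuple

@dataclass(frozen=True)
class TVar:                      # a type variable
    name: str

@dataclass(frozen=True)
class TApp:                      # an application D p1 .. pn
    name: str
    args: Tuple = ()

parent = {}                      # union-find forest over types

def find(t):
    """Representative of t's equivalence class."""
    while t in parent:
        t = parent[t]
    return t

def union(a, b):
    a, b = find(a), find(b)
    if a == b:
        return
    if isinstance(a, TApp):      # an application wins as representative
        parent[b] = a
    else:
        parent[a] = b

def unify(ta, tb):
    ta, tb = find(ta), find(tb)
    if isinstance(ta, TApp) and isinstance(tb, TApp):
        if ta.name != tb.name or len(ta.args) != len(tb.args):
            raise TypeError(f"the types {ta} and {tb} do not match")
        for x, y in zip(ta.args, tb.args):
            unify(x, y)
    elif isinstance(ta, TVar) or isinstance(tb, TVar):
        union(ta, tb)
    else:
        raise TypeError(f"the types {ta} and {tb} do not match")

# unifying a with int -> b makes the application the representative of a
a, b = TVar("a"), TVar("b")
unify(a, TApp("->", (TApp("int"), b)))
assert find(a) == TApp("->", (TApp("int"), b))
unify(b, TApp("int"))
assert find(b) == TApp("int")
```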

Algorithm W

Algorithm W

The presentation of algorithm W shown in the box on the side not only deviates significantly from the original, but also constitutes a considerable abuse of the notation of logical rules, since it includes side effects. This is legitimate here because it enables a direct comparison with the syntax-directed system and at the same time expresses an efficient implementation. The rules now specify a procedure with parameters, yielding a type in the conclusion, with the execution of the premises proceeding from left to right. As an alternative to a procedure, the rules can be read as an attribution of the syntax tree (with the same remarks regarding the side effects).

The instantiation procedure specializes a polytype by copying the term and consistently replacing the bound type variables it contains with (globally) new monotype variables; a second procedure creates such a (globally) new monotype variable. Similarly, during generalization a copy of the type has to be made, introducing (globally) new type variables for the quantification, in order to avoid unwanted capture.

Overall, the algorithm proceeds by always making the most general possible choice and leaving the specialization to unification, which itself produces the most general possible result. As noted above, the result has to be generalized again at the end to obtain the principal type of a given expression.
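
Putting the pieces together, a compact side-effecting sketch of the algorithm might look as follows; all identifiers (infer, inst, newvar, and the constructor names) are illustrative, not the notation of the original publication, and the occurs check is again omitted.

```python
from dataclasses import dataclass
from itertools import count
from typing import Tuple

@dataclass(frozen=True)
class TVar: name: str

@dataclass(frozen=True)
class TApp:
    name: str
    args: Tuple = ()

@dataclass(frozen=True)
class Forall:
    vars: Tuple[str, ...]
    mono: object

@dataclass
class Var: name: str

@dataclass
class App: fn: object; arg: object

@dataclass
class Abs: param: str; body: object

@dataclass
class Let: name: str; defn: object; body: object

_counter = count()
def newvar():
    return TVar(f"t{next(_counter)}")         # a (globally) new monotype variable

parent = {}                                    # union-find substitution

def find(t):
    while t in parent:
        t = parent[t]
    if isinstance(t, TApp):                    # resolve recursively under applications
        return TApp(t.name, tuple(find(a) for a in t.args))
    return t

def unify(a, b):
    a, b = find(a), find(b)
    if isinstance(a, TApp) and isinstance(b, TApp):
        if a.name != b.name or len(a.args) != len(b.args):
            raise TypeError("type mismatch")
        for x, y in zip(a.args, b.args):
            unify(x, y)
    elif a != b:                               # at least one side is a variable
        parent[a if isinstance(a, TVar) else b] = b if isinstance(a, TVar) else a

def inst(sigma):
    """Specialize a polytype with (globally) new variables for its quantifiers."""
    if not isinstance(sigma, Forall):
        return find(sigma)
    m = {v: newvar() for v in sigma.vars}
    def walk(t):
        if isinstance(t, TVar):
            return m.get(t.name, t)
        return TApp(t.name, tuple(walk(a) for a in t.args))
    return walk(sigma.mono)

def free(t):
    if isinstance(t, TVar):
        return {t.name}
    return set().union(set(), *(free(a) for a in t.args))

def free_env(env):
    """Type variables occurring free in the context."""
    out = set()
    for s in env.values():
        out |= (free(find(s.mono)) - set(s.vars)) if isinstance(s, Forall) else free(find(s))
    return out

def infer(env, e):
    if isinstance(e, Var):                     # instantiate the assumption
        return inst(env[e.name])
    if isinstance(e, App):                     # formal and actual parameter must fit
        t_res = newvar()
        unify(infer(env, e.fn), TApp("->", (infer(env, e.arg), t_res)))
        return find(t_res)
    if isinstance(e, Abs):                     # the parameter stays monomorphic
        t_par = newvar()
        t_body = infer({**env, e.param: t_par}, e.body)
        return TApp("->", (find(t_par), find(t_body)))
    if isinstance(e, Let):                     # generalize the definition's type
        t_def = find(infer(env, e.defn))
        qs = tuple(sorted(free(t_def) - free_env(env)))
        return infer({**env, e.name: Forall(qs, t_def)}, e.body)

# let id = λx. x in id id — each use of id receives a fresh instance of ∀a. a -> a
t = infer({}, Let("id", Abs("x", Var("x")), App(Var("id"), Var("id"))))
assert isinstance(t, TApp) and t.name == "->" and t.args[0] == t.args[1]
```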

Since the procedures used in the algorithm have nearly constant cost, the total cost of the method is nearly linear in the size of the expression whose type is to be inferred. This is in stark contrast to many other attempts at type inference methods, which have often turned out to be NP-hard, if not undecidable with regard to termination. HM thus achieves the same throughput as the best fully informed type-checking methods. Type checking here means that the algorithm does not have to find a proof, but only to validate a given one.

The efficiency is slightly reduced for two reasons. First, the bindings of the type variables in the context have to be maintained to allow the computation of the free type variables during generalization. Second, an occurs check is required to prevent recursive types from arising during unification. An example of such a case is λ x. x x, for which no type can be derived with HM. Since in practice types are only small terms and do not build up expansive structures, they can be treated in the complexity analysis as bounded by a constant, so that the O(1) costs are preserved.

The algorithm W in the original

In the original publication, the algorithm is presented more formally in a substitution style, rather than with side effects as in the method above. In the side-effecting form, the substitution invisibly takes care of all places where type variables are used. Explicit use of substitutions not only makes the algorithm harder to read, because the side effects then appear virtually everywhere, but it also gives the false impression that the method is expensive. If, on the other hand, the algorithm is to be implemented by purely functional means, or the equivalence with the deduction system is to be proved, full explicitness is of course necessary, and the original formulation is a necessary refinement.

Other topics

Recursive Definitions

A central property of the lambda calculus is that recursive definitions are not elementary, but can instead be expressed with a fixed-point combinator. The original publication notes that recursion can be realized with such a combinator of the type ∀α. (α → α) → α, added as a primitive. A recursive definition can thus be formulated by applying this combinator to an abstraction over the recursively defined variable.
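
The combinator itself is not typeable in HM, which is why it must be supplied as a primitive with the type ∀α. (α → α) → α. Its behavior can nevertheless be sketched in an untyped host; this uses the strict Z variant of the fixed-point combinator, runnable in Python.

```python
# the Z combinator, a strict variant of the fixed-point combinator;
# HM cannot type it, so it is assumed as a primitive of type ∀α. (α → α) → α
def fix(f):
    return (lambda x: f(lambda v: x(x)(v)))(lambda x: f(lambda v: x(x)(v)))

# the factorial, written without elementary recursion as
# fix (λfact. λn. if n = 0 then 1 else n * fact (n - 1)); here α = int → int
fact = fix(lambda fact: lambda n: 1 if n == 0 else n * fact(n - 1))
assert fact(5) == 120
```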

Alternatively, an extension of the expression syntax by a recursive let variant, together with an additional typing rule for it, is possible.

Herein, the abstraction and let rules are essentially mixed together: the recursively defined variables are treated monomorphically where they occur to the left of the in, but polymorphically to the right of it. This formulation perhaps best sums up the essence of let polymorphism.

Remarks

  1. Polytypes are referred to as type schemes in the original publication.
  2. The parametric types were not present in the original publication on HM and are not required for the presentation of the method: none of the inference rules takes note of them or would notice their absence. The same applies to the non-parametric "primitive types" in the paper concerned. The whole machinery of polymorphic type inference can be defined without them. They are included here partly for the sake of the examples, but mainly because the nature of HM revolves entirely around polymorphic types. The parametric polymorphism appears in the original only in the special form of the function type hard-wired into the inference rules, which has two type parameters and is treated here merely as a special case.

Individual evidence

  1. ^ R. Hindley, (1969) The Principal Type-Scheme of an Object in Combinatory Logic, Transactions of the American Mathematical Society, Vol. 146, pp. 29-60 JSTOR 1995158
  2. Milner, (1978) A Theory of Type Polymorphism in Programming. Journal of Computer and System Science (JCSS) 17, pp. 348-374 online
  3. Luis Damas (1985): Type Assignment in Programming Languages. PhD thesis, University of Edinburgh (CST-33-85)
  4. Damas, Milner (1982): Principal type-schemes for functional programs. 9th Symposium on Principles of Programming Languages (POPL'82), pp. 207–212, ACM
  5. Clement, (1987). The Natural Dynamic Semantics of Mini-Standard ML. TAPSOFT'87, Vol 2. LNCS, Vol. 250, pp 67-81
  6. Jeff Vaughan, A proof of correctness for the Hindley-Milner type inference algorithm ( Memento of March 24, 2012 in the Internet Archive ) (PDF, 152 KB)