Row Polymorphism: The First 20 Years
References
See
http://www.citeulike.org/user/tov/tag/records
or
Bibtex in this page.
Records in untyped world
Let L be a set of labels.
Then we can think of a record as a partial function,
L -> V
Sometimes we'll make them total, L -> V + {absent}.
Operations we might have:
Selection (-).a: where r.a = a(r) (in the metalanguage)
A null record: null = ^a . absent
Extension, {- with a = -}:
{r with a = v} = ^b . if b = a then v else r(b)
Restriction, (-)\a: r\a = ^b . if b = a then absent else r(a).
Concatenation, (- || -):
r || r' = ^a . match r'(a) with absent -> r(a) | v -> v
What does going wrong mean?
- We shouldn't ever see "absent".
- We shouldn't go wrong in whatever other ways our language cares
about.
- We may choose to make extension, restriction, or concatenation
strict, by adding side conditions
{r with !a = v} | a not in dom r
r \ !a | a in dom r
r || r' | dom r \cup dom r' = 0
(this is also called symmetric concatenation)
What kind of functions would we like to write? We'll need some data.
Let mine = { name = "Mini", age = 2, plate = "34BC78" }
hers = { name = "Volvo", age = 13, plate = "56EF90" }
trailer = { name = "Bumpy", age = "old" }
Let moving = { hers with pulling = trailer }
Let trailer' = { trailer with age = 29 }
Okay:
plate x = x.plate
(reject trailer, trailer')
age x = x.age
(allow all)
next_year x = x.age + 1
(reject trailer)
older x y = x.age > y.age (magic >)
(reject trailer with non-trailer)
choice x y = (if c then x else y).name
(allow all)
choice' x y = if c then x.name else y.name
(allow all)
field_eq f x y = f x == f y
(reject age/trailer/non-trailer or plate/trailer)
tricky x y = (x || y).age + 1
(okay as long as y as age:int or (y has no age and x has age:int))
(don't put trailer in the y position)
We'd like to type this. Some criteria:
- be conservative, but as little as possible.
- don't require anything fancy. in other words, do it in ML(ish).
- no extra typing rules for records -- record operations are constants
- we don't want to add subtyping (Cardelli does this)
- we'd like to use ML polymorphism
But we'll make some concessions:
- We'll add kinds.
- We might need to add some equational axioms on types.
Wand's Solution
Assume finite L for now.
Our record types will be records of types. That is, the type of
a record will be a function,
L -> Type + {absent}
We'll add a type constructor Pi:
Pi :: [L => Type + {absent}] => Type
Yuck! Rémy to the rescue -- we can recover an ordinary algebraic
signature by adding a new kind:
Pi :: Field^L => Type
abs :: Field
pre :: Type => Field
Since L is finite, we actually make Pi just have arity |L|:
Pi :: Field x Field x Field => Type
null : Pi(absent, ..., absent)
(-).a : Pi(f1, ..., pre(t), ..., fn) -> t
{- with a = -} :
Pi(f1, ..., fn) -> t -> Pi(f1, ..., pre(t), ..., fn)
(-)\ : Pi(f1, ..., fn) -> Pi(f1, ..., abs, ..., fn)
We can also type the strict variants:
{- with !a = -} :
Pi(f1, ..., abs, ..., fn) -> t -> Pi(f1, ..., pre(t), ..., fn)
(-)\!a : Pi(f1, ..., pre(t), ..., fn) -> Pi(f1, ..., abs, ..., fn)
This is a pretty standard type system, except for the funny kind
structure.
L = {name, age, plate, trailer}
plate : Pi(x, y, z, pre(a)) -> a
age : Pi(x, pre(a), y, z) -> a
next_year : Pi(x, pre(int), y, z) -> int
older : Pi(x, pre(a), y, z) -> Pi(u, pre(a), v, w) -> bool
choice : Pi(pre(a), x, y, z) -> Pi(pre(a), x, y, z) -> a
(too restricitive!)
choice' : Pi(pre(a), x, y, z) -> Pi(pre(a), u, v, w) -> a
field_eq : (a -> b) -> a -> a -> bool
(too restrictive!)
tricky?
Hm . . .
tricky x y = (x || y).age + 1
tricky : Pi(x, pre(int), y, z) -> Pi(u, abs, v, w) -> int
tricky : Pi(x, abs, y, z) -> Pi(u, pre(int), v, w) -> int
tricky : Pi(x, x', y, z) -> Pi(u, pre(int), v, w) -> int
If tricky has a principle type, it must be at least as general as any of
the above, which means:
*tricky : Pi(x, x', y, z) -> Pi(u, u', v, w) -> int
This is unreasonable. Thus, || loses principle types!
Consider a formulation of HM as a set of constraints to solve. E.g,
for each application (M N), generate the constraint
t_M = t_N -> t_(M N)
If our system of constraints has a solution, it will be unique -- we
have principle types. We can state the rule for using equations like
this. For each (M || N),
t_M = Pi(x_1, ..., x_n)
t_N = Pi(y_1, ..., y_n)
t_(M||N) = Pi(z_1, ..., z_n)
(y_i = pre(t_i) ^ z_i = y_i) v (y_i = abs ^ z_i = x_i),
for all i in 1, ..., n
Convert our whole system of equations into disjunctive normal form, and
solve _each_ disjunct. Some will have no solutions, but some will yield
typings. We can think of the type of (M || N) as the disjunction of
these finitely many solutions.
Theoretically we could get many typings, but in practice (hopefully) we
won't.
Now, what if we want L to be infinite? (Separate compilation, or
the Pi types are just getting too big.)
Before we could represent Field^L as a tuple, but don't want infinite
tuples.
Only a finite subset of L can appear in the program, so let's assign
these some numbering 1, ..., n. Then we can represent the infinite
product types finitely:
Pi(x1, ..., xn, abs, abs, ...) = Pi[x1, ..., xn]empty
Pi(x1, ..., xn, y1, y2, ...) = Pi[x1, ..., xn]r
We've added a new kind of type variable, r, which we call an extension.
We've recovered finity:
Pi : Field^n x Extension => Type
abs : Field
pre : Type => Field
empty : Extension
We can type our constants:
null : Pi[abs, ..., abs]empty
(-).a : Pi[f1, ..., pre(t), ..., fn]r -> t
{- with a = -} :
Pi[f1, ..., fn]r -> t -> Pi[f1, ..., pre(t), ..., fn]r
(-)\a : Pi[f1, ..., fn]r -> Pi[f1, ..., abs, ..., fn]r
- The finite representations have unifiers iff the infinite thing does
- The translation of most general unifier = m.g.u. of the translation
To get (- || -), we add row variables to our system of equations:
t_M = Pi(x_1, ..., x_n)r_1
t_N = Pi(y_1, ..., y_n)r_2
t_(M||N) = Pi(z_1, ..., z_n)r_3
(y_i = pre(t_i) ^ z_i = y_i) v (y_i = abs ^ z_i = x_i),
for all i in 1, ..., n
r_1 || r_2 = r_3
We can solve this just like before, letting all the r_i = empty.
This still sucks. We want separate compilation.
Let's write explicit labels in our Pi nodes:
null : Pi[empty]
(-).a : Pi[a : pre(t), r] -> t
{- with a = -} :
Pi[a : f, r] -> t -> Pi[a : pre(t), r]
(-)\a : Pi[a : f, r] -> Pi[a : abs, r]
We can again do the strict variants as well:
{- with !a = -} :
Pi[a : abs, r] -> t -> Pi[a : pre(t), r]
(-)\!a : Pi[a : pre(t), r] -> Pi[a : abs, r]
Pi :: Row(0) => Type
abs :: Field
pre :: Type => Field
[a : -, -] :: Field x Row(K U! {a}) => Row(K)
for each a in L, K in P_fin(L \ {a})
empty :: Row(K)
for each K in P_fin(L)
Now we need to constrain how we use extension variables. Basically, we
give each r a kind that reflects which labels _are not_ in its domain.
We can enforce this as follows:
- If r appears in different Pi nodes, each must have the same explicit
labels. (Our constants follow this.)
- When we try to unify two Pi nodes, they must have the same explicit
labels first. (This identifies their extension variables.)
- When we have an extension constraint (r_1 || r_2) = r_3, each must
have the same label-kind.
Suppose we're unifying two terms and encountered corresponding record
types with different labels. We need to "pad" them out to give them the
same labels, so we find every missing label a, and for every Pi in
either term that's missing a, we do:
Pi[...]empty -> Pi[..., a : abs]empty
Pi[...]r -> Pi[..., a : f]r
Then unify normally.
Rémy views this differently. In his record algebra, we have an
equivalence on terms:
Commutativity:
Pi[..., a : f, b : g, ...] =_E Pi[..., b : g, a : f, ...]
Distributivity:
Pi[...]empty =_E Pi[..., a : abs]empty (for a not in ...)
Pi[...]r =_E Pi[..., a : f]r' (for a not in ... and r' fresh)
Then we extend ML typing with one new inference rule:
A |- M : s s =_E t
-------------------------
A |- M : t
We still have the same power as we had with finite L, but we have
separate compilation and less ugly types.
Remy's Extensions
Remember choice?
choice x y = (if c then x else y).name
We'd like to be able to write (choice mine trailer), which shouldn't go
wrong, but we can't type it.
We'll add a new kind!
Pi :: Row(0) => Type
abs :: Flag
pre :: Flag
- . - :: Flag x Type => Field
[a^K : -, -] :: Field x Row(K U! {a}) => Row(K)
for each a in L, K in P_fin(L \ {a})
empty^K :: Row(K)
for each K in P_fin(L)
Now types look like:
null : Pi[empty]
(-).a : Pi[a : pre.b, r] -> b
{- with a = -} :
Pi[a : f, r] -> b -> Pi[a : e.b, r]
(-)\a : Pi[a : f, r] -> Pi[a : abs.b, r]
So we give types like
{ name = "Mini", age = 2, plate = "34BC78" } :
Pi[ name : e.string, age : f.int, plate : g.string, empty ]
We can now type choice:
choice : Pi[ name : pre.a, r ] -> Pi[ name : pre.a, r ] -> a
We can go one better:
mine = { name = "Mini", age = 2, plate = "34BC78" }
: Pi[ name : e.string, age : f.int, plate : g.string, empty ]
trailer = { name = "Bumpy", age = "old" }
: Pi[ name : e.string, age : f.string, empty ]
trailer' = { name = "Bumpy", age = 29 }
: Pi[ name : e.string, age : f.int, empty ]
if c then mine else trailer' : ?
if c then mine else trailer : ?
We pad trailer' and trailer with label plate:
trailer : Pi[ name : e.string, age : f.string, plate : abs.b, empty ]
trailer' : Pi[ name : e.string, age : f.int, plate : abs.b, empty ]
Then attempt to unify each with mine:
mine : Pi[ name : e.string, age : f.*nt, plate : abs.string, empty ]
trailer : Pi[ name : e.string, age : f.*tring, plate : abs.string, empty ]
mine : Pi[ name : e.string, age : f.int, plate : abs.string, empty ]
trailer' : Pi[ name : e.string, age : f.int, plate : abs.string, empty ]
if c then mine else trailer'
: Pi[ name : e.string, age : f.int, plate : abs.string, empty ]
*if c then mine else trailer
We can only forget once -- note that the type of the if-then-else won't
unify with Pi[ plate : abs.int, r ], for example.
We can generally fix this with (-)\a:
(if c then mine else trailer') \ plate
: Pi[ name : e.string, age : f.int, plate : abs.b, empty ]
A nice example:
type tree(e) =
| Leaf of int
| Node of Pi[ left: pre.tree(e), right: pre.tree(e),
annot:e.int, empty ]
So a tree(pre) has annotations and a tree(abs) does not!
There's a general principle at work here:
If you want pre to be a subtype of abs, represent pre by pre in
negative positions and by a variable in positive positions.
(Pucella, Fluet '06)
Rémy also proposes a version with subtyping, with the atom coercion
pre < abs,
but I can't figure out how that would work any better. (We still have
the abs.remembering problem.) He says that pre(a) < abs wouldn't work.
Concatenation for Free
Rémy says we have concatenation in any language with extension, and
keep principle types.
Given a language L with extension, we create a language L^{||} with:
M ::= ... | null | { a = M } | (M || M) | M.a
We treat singleton and concat as primitive, and can encode
{ M with a = N } as M || { a = N }
Free extension => Asymmetric concatenation
Strict extension => Symmetric concatenation
Now the L^{||} to L translation:
[[ null ]] = ^u . u
[[ { a = M } ]] = ^u . { u with a = [[ M ]] }
[[ M || N ]] = [[ N ]] o [[ M ]]
This satisfies the equation:
[[ r ]] = ^u . u || r
In other words, we now represent records as the function that adds their
fields to another record. [[ - ]] is "record abstraction".
r = [[ r ]] null
so
[[ M.a ]] = ([[ M ]] null).a
We translate other terms by mapping [[ - ]] through them:
[[ ^x . M ]] = ^x . [[ M ]]
[[ M N ]] = [[ M ]] [[ N ]]
and so on. . . .
We're not completely satisfied, because the encoding is not injective:
[[ null ]] = ^u . u
[[ ^u . u ]] = ^u . u
Assume there exists a constructor/destructor pair that is private to
our translation:
Tag : a -> Tagged a
Untag : Tagged a -> a
Untag (Tag v) => v (or we erase them!)
Then we wrap/unwrap to make things injective:
[[ null ]] = Tag (^u . u)
[[ { a = M } ]] = Tag (^u . { u with a = [[ M ]] })
[[ M || N ]] = Tag (Untag [[ N ]] o Untag [[ M ]])
[[ M.a ]] = (Untag [[ M ]] null).a
Here's the beautify thing:
If we translate L|| programs into L{Tag,Untag} and then type check
them, we get a nice type system for L|| !
In L||, the types a records will actually be a function on record types
that describes the extension that any record does when concatenated.
Let { r => r' } = Tagged of Pi(r) -> Pi(r'). Then for L|| where L is our
earlier calculus, we have constants:
null : { r => r }
(-).a : { r => a: pre(b), r' } -> b
{ a = - } : b -> { a: f, r => a: pre(b), r }
(-)\a : { a: e, r => a: f, r' } -> { a: g, r => a: g, r' }
(- || -) : { r => r' } -> { r' => r'' } -> { r => r'' }
And strictly:
{ !a = - } : b -> { a: abs, r => a: pre(b), r }
(-)\!a : { a: e, r => a: pre(b), r' } -> { a: g, r => a: g, r' }
Note || is no longer symmetric or asymmetric. It depends on
individual fields:
{ !a = 1, b = true }
: { a: abs, b: f, r => a: pre(int), b: pre(bool), r }
This is still weaker than Wand's system. Claim is Wand can type and
this can't:
^r. r.a + ({ a = 1 } || r).a
Rémy can recover that power in this system by losing principle types (of
course)!
Modern Row-typed Systems
Ocaml has row types with the restriction that any two rows ending the
same extension variable must unify. This rules out extension and
concatenation, but has the benefit that row variables can be anonymous,
and no explicit kinds. No explicit abs either.
Types are like:
Pi[ a: int, b: string, 0 ]
Pi[ a: int, b: string, 1 ]
How does this work?
When ocaml finds it has to unify two row variables, it instead unifies
the whole rows, creating sharing.
No need to name the extension variable, because the whole type is
shared. It prints shared type structure using "as":
(Pi[ a: int, 1 ] as b) -> b
Abs isn't needed because once two rows unify, it can be folded out:
- In positive positions, we forget everything
- We only need extensions in negative positions to say "we'll allow more"
Ocaml also has the dual, open variants types. These look like:
Sum[ a: int, b: bool, +1 ] : have this or more (co-)
Sum[ a: int, b: bool, 0 ] : exact (in-)
Sum[ a: int, b: bool, -1 ] : accept this or less (contra-)
MLPolyR (Matthias Blume, et al.)
The dual of record extension is case extension (extensible sum
destruction); The dual of record destruction is sum construction.
Let's construct our pattern matches and use them later:
handle_a c = case `A () => "A"
| default: c
handle_b c = case `B () => "B"
| default: c
handle_a : (Sum[beta] ~> string) -> (Sum[`A : unit, beta] ~> string)
handle_b : (Sum[beta] ~> string) -> (Sum[`B : unit, beta] ~> string)
handle_ab c = handle_a (handle_b c)
handle_ab : (Sum[beta] ~> string) ->
(Sum[`A : unit, `B : unit, beta] ~> string)
nilcase : Sum[] ~> a
case_ab = handle_ab nilcase
case_ab : (Sum[`A : unit, `B : unit] ~> string)
match `B () with case_ab
Okay, now a bigger example:
kv2kb kv = ^v . `App(kv, [v])
kb2kv kb = `Lam([xr], kb(`Var xr))
where xr = gensym()
cvt_lam(xs, e) = `Lam(k::xs, cvt(e, kv2kb(`Var k)))
where k = gensym()
cvt(e, kb) = match e with
cases `Con i -> kb(`Con i)
| `Var x -> kb(`Var x)
| `Lam(xs, e) -> kb(cvt_lam(xs, e))
| `App(e, es) -> cvt_app(e, es, kb2kv(kb))
cvt_app(e, es, kv) = let
lc([], kb) = kb []
lc(e::es, kb) = pc(e, es, ^(v, vs) . kb(v::vs))
pc(e, es, kb) = cvt(e, ^v . lc(es, ^vs . kb(v, vs)))
in
pc(e, es, ^(v, vs) . `App(v, kv::vs)
convert e = cvs_lam([], e)
What does this do?
It has a neat type:
convert : forall a, forall xi : {`App},
forall sigma : {`Con, `Lam, `Var}.
(e as Sum[ `App : e * list e, `Con : a,
`Lam : list int * e, `Var : int ]) ->
(nu as Sum[ `Con : a,
`Lam : list int * Sum[ `App : nu * list nu, xi ],
`Var : int, sigma)
Abstract syntax:
Products:
{ l = e, ... } | e * { l = e } | e / l | e.l
Sums:
{ l x => e, ... } | e + { l x = e } | e - l | l e | match e with e
[[ { l x => e, ... } ]] = { l = ^x . [[ e ]], ... }
[[ e + { l x = e } ]] = [[ e ]] * { l = ^x . [[ e ]] }
[[ e - l ] = [[ e ]] / l
[[ l e ]] = let x = [[ e ]] in ^c . c.l x
[[ match e1 with e2 ]] = [[ e1 ]] [[ e2 ]]
The types translate in the obvious way.
References
See
http://www.citeulike.org/user/tov/tag/records.
@misc{citeulike:1652943,
abstract = {A (mini-)ML with type inference, polymorphic records, and functional record update},
author = {Blume, Matthias },
citeulike-article-id = {1652943},
howpublished = {for CMSC 22620/23620},
keywords = {types, records, pl},
priority = {0},
school = {University of Chicago},
title = {MLPolyR},
url = {http://ttic.uchicago.edu/~blume/classes/spr2005/cmsc22620/docs/langspec.pdf},
year = {2005}
}
@inproceedings{citeulike:1652932,
address = {New York, NY, USA},
author = {Blume, Matthias and Acar, Umut A. and Chae, Wonseok },
booktitle = {ICFP '06: Proceedings of the eleventh ACM SIGPLAN international conference on Functional programming},
citeulike-article-id = {1652932},
doi = {10.1145/1159803.1159836},
issn = {0362-1340},
keywords = {types, records, pl},
pages = {239--250},
priority = {0},
publisher = {ACM Press},
title = {Extensible programming with first-class cases},
url = {http://dx.doi.org/10.1145/1159803.1159836},
year = {2006}
}
@article{citeulike:1642457,
abstract = {We investigate a technique from the literature, called the phantom-types technique, that uses parametric polymorphism, type constraints, and unification of polymorphic types to model a subtyping hierarchy. Hindley-Milner type systems, such as the one found in Standard ML, can be used to enforce the subtyping relation, at least for first-order values. We show that this technique can be used to encode any finite subtyping hierarchy (including hierarchies arising from multiple interface inheritance). We formally demonstrate the suitability of the phantom-types technique for capturing first-order subtyping by exhibiting a type-preserving translation from a simple calculus with bounded polymorphism to a calculus embodying the type system of SML.},
author = {Fluet, Matthew and Pucella, Riccardo },
citeulike-article-id = {1642457},
journal = {Journal of Functional Programming},
keywords = {types, records, pl},
month = {June},
number = {06},
pages = {751--791},
priority = {0},
publisher = {Cambridge University Press},
title = {Phantom types and subtyping},
url = {http://journals.cambridge.org/action/displayAbstract?fromPage=online\&aid=524448},
volume = {16},
year = {2006}
}
@techreport{citeulike:1628537,
abstract = {We describe an extension of ML with records where inheritance is given by ML generic polymorphism. All common operations on records but concatenation are supported, in particular the free extension of records. Other operations such as renaming of fields are added. The solution relies on an extension of ML, where the language of types is sorted and considered modulo equations, and on a record extension of types. The solution is simple and modular and the type inference algorithm is efficient in...},
author = {Rémy, Didier },
citeulike-article-id = {1628537},
keywords = {types, records, pl},
number = {RR-1431},
priority = {0},
title = {Type inference for records in a natural extension of ML},
url = {http://citeseer.ist.psu.edu/remy91type.html}
}
@article{citeulike:1628532,
abstract = {Objective ML is a small practical extension to ML with objects and top level classes. It is fully compatible with ML; its type system is based on ML polymorphism, record types with polymorphic access, and a better treatment of type abbreviations. Objective ML allows for most features of object-oriented languages including multiple inheritance, methods returning self and binary methods as well as parametric classes. This demonstrates that objects can be added to strongly typed languages based on ML polymorphism. © 1998 John Wiley \& Sons, Inc.},
address = {INRIA-Rocquencourt, B.P. 105, F-78153 Le Chesnay Cedex, France},
author = {Rémy, Didier and Vouillon, Jérôme },
citeulike-article-id = {1628532},
doi = {10.1002/(SICI)1096-9942(1998)4:1<27::AID-TAPO3>3.0.CO;2-4},
journal = {Theory and Practice of Object Systems},
keywords = {types, records, pl},
number = {1},
pages = {27--50},
priority = {0},
title = {Objective ML: An effective object-oriented extension to ML},
url = {http://dx.doi.org/10.1002/(SICI)1096-9942(1998)4:1<27::AID-TAPO3>3.0.CO;2-4},
volume = {4},
year = {1998}
}
@inproceedings{citeulike:600745,
address = {New York, NY, USA},
author = {Harper, Robert and Pierce, Benjamin },
booktitle = {POPL '91: Proceedings of the 18th ACM SIGPLAN-SIGACT symposium on Principles of programming languages},
citeulike-article-id = {600745},
doi = {10.1145/99583.99603},
isbn = {0897914198},
keywords = {types, records, pl},
pages = {131--142},
priority = {0},
publisher = {ACM Press},
title = {A record calculus based on symmetric concatenation},
url = {http://portal.acm.org/citation.cfm?id=99603},
year = {1991}
}
@inproceedings{citeulike:600741,
address = {New York, NY, USA},
author = {Rémy, Didier },
booktitle = {POPL '92: Proceedings of the 19th ACM SIGPLAN-SIGACT symposium on Principles of programming languages},
citeulike-article-id = {600741},
doi = {10.1145/143165.143202},
isbn = {0897914538},
keywords = {types, records, pl},
pages = {166--176},
priority = {0},
publisher = {ACM Press},
title = {Typing record concatenation for free},
url = {http://portal.acm.org/citation.cfm?id=143202},
year = {1992}
}
@inproceedings{wand91,
abstract = {The author shows that the type inference problem for a lambda calculus with records, including a record concatenation operator, is decidable. He shows that this calculus does not have principal types but does have finite complete sets of type, that is, for any term M in the calculus, there exists an effectively generable finite set of type schemes such that every typing for M is an instance of one of the schemes in the set. The author shows how a simple model of object-oriented programming, including hidden instance variables and multiple inheritance, may be coded in this calculus. The author concludes that type inference is decidable for object-oriented programs, even with multiple inheritance and classes as first-class values},
author = {Wand, M. },
booktitle = {Logic in Computer Science, 1989. LICS '89, Proceedings., Fourth Annual Symposium on},
citeulike-article-id = {1628251},
journal = {Logic in Computer Science, 1989. LICS '89, Proceedings., Fourth Annual Symposium on},
keywords = {types, shcaml, records, pl},
pages = {92--97},
priority = {0},
title = {Type inference for record concatenation and multiple inheritance},
url = {http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=39162},
year = {1989}
}