Thinking Semantically

Consider the syntax of a simply typed lambda calculus, with integers and pairs as base types:

\[\def\alt{\mathrel{|}} \def\ePair#1#2{(#1,#2)} \def\ePrj#1#2{\mathop{\mathrm{prj}_{#1}}({#2})} \def\eLam#1#2#3{\lambda({#1}\mathrel{:}{#2}).\,{#3}} \def\eApp#1#2{{#1}\;{#2}} \def\tInt{\mathsf{int}} \def\tPair#1#2{{#1}\times{#2}} \def\tFun#1#2{{#1}\to{#2}}\]
\[\begin{alignat*}{2} &e&{}\mathrel{::=}{}& n \alt \ePair{e}{e} \alt \ePrj{i}{e} \alt x \alt \eLam{x}{\tau}{e} \alt \eApp{e}{e}\\ &\tau&{}\mathrel{::=}{}& \tInt \alt \tPair{\tau}{\tau} \alt \tFun{\tau}{\tau}\\ \end{alignat*}\]

An obvious question to ask at this juncture is «what does any of this mean?». Grammar gives us only raw syntax, without interpretation. Terms may be given meaning via a number of styles of semantics, which I shall not be concerned with here. Types, rather, will be the focus of this post.

Syntactic Type Soundness

The dominant approach today to providing the semantics of types is to define a syntactic type system, laying out a number of proof rules which decide whether a particular term has a particular type:

\[\def\types#1#2#3{{#1}\vdash{#2}:{#3}}\]
\[\frac{\begin{gather*}\end{gather*}}{\types{\Gamma}{n}{\tInt}}\] \[\frac{\begin{gather*}\types{\Gamma}{e_1}{\tau_1} \qquad \types{\Gamma}{e_2}{\tau_2}\end{gather*}}{\types{\Gamma}{\ePair{e_1}{e_2}}{\tPair{\tau_1}{\tau_2}}}\] \[\frac{\begin{gather*}\types{\Gamma}{e}{\tPair{\tau_1}{\tau_2}}\end{gather*}}{\types{\Gamma}{\ePrj{i}{e}}{\tau_i}}\] \[\frac{\begin{gather*}(x : \tau) \in \Gamma\end{gather*}}{\types{\Gamma}{x}{\tau}}\] \[\frac{\begin{gather*}\types{\Gamma, (x : \tau)}{e}{\tau'}\end{gather*}}{\types{\Gamma}{\eLam{x}{\tau}{e}}{\tFun{\tau}{\tau'}}}\] \[\frac{\begin{gather*}\types{\Gamma}{e}{\tFun{\tau}{\tau'}} \qquad \types{\Gamma}{e'}{\tau}\end{gather*}}{\types{\Gamma}{\eApp{e}{e'}}{\tau'}}\]

However, there is something fundamentally and deeply unsatisfying about this approach. The system of typing rules does not give any meaning to the types themselves; rather, types remain meaningless pieces of syntax, which the typing rules use to enforce some structure on the program itself. If I ask what it means for a term \(e\) to be an \(\tInt\), the best answer I can give is that «there is a valid typing derivation which ascribes the type \(\tInt\) to \(e\)». Worse still, the annotation \((x : \tau)\) on a lambda doesn't directly mean anything like «the value substituted for \(x\) will be an integer», and indeed, depending on the operational semantics such may not even be necessary for type safety.

\[\def\redc#1#2{{#1}\longrightarrow{#2}}\]

If one is working syntactically, a common next step after describing a syntactic type system is to prove syntactic type soundness. This approach, introduced by Wright and Felleisen 1994, proves type soundness by means of two lemmata, progress and preservation. Supposing that we are armed with some small-step reduction relation \(\redc{e}{e'}\):

Lemma 1 (Preservation): If \(\types{\cdot}{e}{\tau}\) and \(\redc{e}{e'}\), then \(\types{\cdot}{e'}{\tau}\).

Lemma 2 (Progress): If \(\types{\cdot}{e}{\tau}\), then either e is a value, or there exists some \(e'\) such that \(\redc{e}{e'}\).

Together, these theorems ensure that «a well-typed program won't go wrong», at least in the (very limited) sense that evaluation will not get stuck.

Armed with these results, one might begin to think along the lines of «oh, the \(\tInt\) type means that a program with that type will reduce to an actual integer». Unfortunately, this breaks down quickly for functions: reducing to a function value just means reducing to a lambda whose body is syntactically well-typed—the highly syntactic rules of the type system remain the only source of meaning in the program.

At the end of the day, the syntactic approach simply cannot give any satisfying answer as to what types truly mean. A type is just a piece of syntax, and if there are any interesting invariants preserved about terms at that type, they are never clearly stated, but rather ensured by careful design of the syntactic typing rules and operational semantics.

The essence of the semantic approach is to instead give meanings directly to types. These results are not only more aesthetically pleasing, but also quite useful, since complex invariants on the inhabitants of a type may be stated relatively directly.

Aside: Denotational Semantics

Since this simple language does not have recursion, it is easy to give it a denotational set-theoretic model. In the traditional denotational world, as in all semantic models of types, we begin by directly giving meaning to types as some sort of mathematical object:

\[\def\llbracket{\left[\!\left[} \def\rrbracket{\right]\!\right]} \def\semT#1{\mathcal{T}\llbracket #1\rrbracket} \def\semG#1{\mathcal{G}\llbracket #1\rrbracket} \def\semE#1#2{\mathcal{E}\llbracket #1\rrbracket({#2})}\]
\[\begin{align*} \semT{\tInt} &{} = \mathbb{Z}\\ \semT{\tPair{\tau_1}{\tau_2}} &{}= \semT{\tau_1} \times \semT{\tau_2}\\ \semT{\tFun{\tau_1}{\tau_2}} &{}= \semT{\tau_1} \to \semT{\tau_2} \end{align*}\]

Note that in the second equation, the cross on the left is the syntactic type, and the cross on the right is the Cartesian product; and similarly for the arrow case.

Armed with the denotation of types, it is simple to extend this notion to a denotation of contexts as functions which map each name in \(\Gamma\) to an element of the semantic domain of the appropriate type:

\[\begin{align*} \semG{\cdot} &= \{ \varnothing \}\\ \semG{\Gamma,(x:\tau)} &= \{ \gamma[v/x] \mid v \in \semT{\tau} \land \gamma \in \semG{\Gamma} \} \end{align*}\]

We then give (open) terms a denotation as functions from the denotation of the context to the denotation of the type:

\[\begin{align*} \semE{n}{\gamma} &= n\\ \semE{\ePair{e_1}{e_2}}{\gamma} &= (\semE{e_1}{\gamma}, \semE{e_2}{\gamma})\\ \semE{\ePrj{e}{i}}{\gamma} &= \pi_i (\semE{e}{\gamma})\\ \semE{x}{\gamma} &= \gamma(x)\\ \semE{\eLam{x}{\tau}{e}}{\gamma} &= v \mapsto \semE{e}{\gamma[v/x]}\\ \semE{\eApp{e_1}{e_2}}{\gamma} &= \semE{e_1}{\gamma}(\semE{e_2}{\gamma})\\ \end{align*}\]

The syntactic typing rules are then treated as a proof system for this semantic model, rather than as the source of truth themselves:

Lemma 3 (Typing Rules Are Sound): If \(\types{\Gamma}{e}{\tau}\), then \(\llbracket e \rrbracket \in \semG{\Gamma} \to \semT{\tau}\).

Denotational-operational Models

I follow Ahmed 2004 in referring to the models often called «logical relations models» as denotational-operational models, as this is arguably a better description of how they work. A denotational-operational model is a semantic model of types, as discussed above, but denotes types into sets of terms with an operational semantics.

For our simple language, our model contains two denotations of types into terms. First, we specify the meaning of a type in terms of what values behave like it:

\[\def\semV#1{\mathcal{V}\llbracket #1\rrbracket} \def\semE#1{\mathcal{E}\llbracket #1\rrbracket} \def\redn#1#2{{#1}\longrightarrow^*{#2}}\]
\[\begin{align*} \semV{\tInt} &= \{ n \}\\ \semV{\tPair{\tau_1}{\tau_2}} &= \{ \ePair{v_1}{v_2} \mid v_1 \in \semV{\tau_1} \land v_2 \in \semV{\tau_2} \}\\ \semV{\tFun{\tau_1}{\tau_2}} &= \{ \eLam{x}{\tau}{e} \mid \forall v \in \semV{\tau_1}.\; \eApp{(\eLam{x}{\tau}{e})}{v} \in \semE{\tau_2} \}\\ \end{align*}\]

This semantic definition of types as sets of values references another interpretation of types, as expressions instead of values. An expression inhabits a type if and only if any value that it might reduce to inhabits the type under the interpretation above:

\[\begin{align*} \semE{\tau} &= \{ e \mid \forall v.\; \redn{e}{v} \Rightarrow v \in \semV{\tau} \} \end{align*}\]

This general approach can be extended to contexts and open terms in a natural way:

\[\begin{align*} \semG{\cdot} &= \{ \varnothing \}\\ \semG{\Gamma,(x:\tau)} &= \{ \gamma[v/x] \mid v \in \semV{\tau} \land \gamma \in \semG{\Gamma} \}\\ \llbracket \types{\Gamma}{e}{\tau} \rrbracket &= \forall \gamma \in \semG{\Gamma}.\; \gamma(e) \in \semE{\tau}\\ \end{align*}\]

Unlike the syntactic type system, this approach directly gives semantic meaning (in terms of the operational semantics): for example, it explicitly states that a function is something which, when applied to a value in its domain type, produces a value in its codomain type.

In this semantic world, type soundness still (unlike in the syntactic world) states that the syntactic typing system is a sound proof system for this model:

Lemma 4 (Typing Rules Are Sound): If \(\types{\Gamma}{e}{\tau}\), then \(\llbracket \types{\Gamma}{e}{\tau} \rrbracket\).

References

A.K. Wright and M. Felleisen. 1994. A Syntactic Approach to Type Soundness. Inf. Comput. 115, 1 (Nov. 15, 1994), 38–94. https://doi.org/10.1006/inco.1994.1093

A.J. Ahmed. 2004. Semantics of Types for Mutable State. [Dissertation]. Princeton University.