InterimReport.tex

   1 \documentclass[article, a4paper]{article}
   2 \usepackage[T1]{fontenc}
   3 \usepackage{fixltx2e}
   4 \usepackage{enumitem}
   5 \usepackage[normalem]{ulem}
   6 \usepackage{graphicx}
   7 \usepackage{subcaption}
   8
   9 % Unicode
  10 \usepackage{ucs}
  11 \usepackage[utf8x]{inputenc}
  12 % \usepackage{autofe}
  13
  14 %% -----------------------------------------------------------------------------
  15 %% Fonts
  16 %% \usepackage[osf]{libertine}
  17 %% \usepackage{helvet}
  18 %% \usepackage{lmodern}
  19 %% \IfFileExists{microtype.sty}{\usepackage{microtype}}{}
  20
  21 %% -----------------------------------------------------------------------------
  22 %% Diagrams
  23 % \usepackage{tikz}
  24 % \usetikzlibrary{positioning}
  25 %% \usetikzlibrary{shapes}
  26 %% \usetikzlibrary{arrows}
  27 %% \usepackage{tikz-cd}
  28 %% \usepackage{pgfplots}
  29 \usepackage[all]{xy}
  30
  31 %% -----------------------------------------------------------------------------
  32 %% Symbols
  33 \usepackage[fleqn]{amsmath}
  34 \usepackage{amssymb}
  35 \usepackage{wasysym}
  36 \usepackage{turnstile}
  37 \usepackage{centernot}
  38 \usepackage{stmaryrd}
  39 \usepackage{bm}
  40
  41 %% -----------------------------------------------------------------------------
  42 %% Utils
  43 \usepackage{bussproofs}
  44 \usepackage{umoline}
  45 \usepackage[export]{adjustbox}
  46 \usepackage{multicol}
  47 \usepackage{listingsutf8}
  48 \lstset{
  49   basicstyle=\small\ttfamily,
  50   extendedchars=\true,
  51   inputencoding=utf8x,
  52   xleftmargin=1cm
  53 }
  54 \usepackage{epigraph}
  55
  56 %% -----------------------------------------------------------------------------
  57 %% Bibtex
  58 \usepackage{natbib}
  59
  60 %% Numbering depth
  61 %% \setcounter{secnumdepth}{0}
  62
  63 %% -----------------------------------------------------------------------------
  64 % We will generate all images so they have a width \maxwidth. This means
  65 % that they will get their normal width if they fit onto the page, but
  66 % are scaled down if they would overflow the margins.
  67 \makeatletter
  68 \def\maxwidth{
  69   \ifdim\Gin@nat@width>\linewidth\linewidth
  70   \else\Gin@nat@width\fi
  71 }
  72 \makeatother
  73
  74 %% -----------------------------------------------------------------------------
  75 %% Links
  76 \usepackage[
  77   setpagesize=false, % page size defined by xetex
  78   unicode=false, % unicode breaks when used with xetex
  79   xetex
  80 ]{hyperref}
  81
  82 \hypersetup{
  83   breaklinks=true,
  84   bookmarks=true,
  85   pdfauthor={Francesco Mazzoli <fm2209@ic.ac.uk>},
  86   pdftitle={Observational Equality - Interim Report},
  87   colorlinks=false,
  88   pdfborder={0 0 0}
  89 }
  90
  91 % Make links footnotes instead of hotlinks:
  92 % \renewcommand{\href}[2]{#2\footnote{\url{#1}}}
  93
  94 % avoid problems with \sout in headers with hyperref:
  95 \pdfstringdefDisableCommands{\renewcommand{\sout}{}}
  96
  97 \title{Observational Equality - Interim Report}
  98 \author{Francesco Mazzoli \href{mailto:fm2209@ic.ac.uk}{\nolinkurl{<fm2209@ic.ac.uk>}}}
  99 \date{December 2012}
 100
 101 \begin{document}
 102
 103 \maketitle
 104
 105 \setlength{\tabcolsep}{12pt}
 106
 107 \begin{abstract}
 108 The marriage between programming and logic has been a very fertile one.  In
 109 particular, since the simply typed lambda calculus (STLC), a number of type
 110 systems have been devised with increasing expressive power.
 111
 112 In section \ref{sec:types} I will give a very brief overview of STLC, and then
 113 illustrate how it can be interpreted as a natural deduction system.  Section
 114 \ref{sec:itt} will introduce Inutitionistic Type Theory (ITT), which expands on
 115 this concept, employing a more expressive logic.  The exposition is quite dense
 116 since there is a lot of material to cover; for a more complete treatment of the
 117 material the reader can refer to \citep{Thompson1991, Pierce2002}. Section
 118 \ref{sec:practical} will describe additions common in programming languages
 119 based on ITT.
 120
 121 Finally, in section \ref{sec:equality} I will explain why equality has always
 122 been a tricky business in these theories, and talk about the various attempts
 123 that have been made to make the situation better.  One interesting development
 124 has recently emerged: Observational Type theory.  I propose to explore the ways
 125 to turn these ideas into useful practices for programming and theorem proving.
 126 \end{abstract}
 127
 128 \tableofcontents
 129
 130 \section{Simple and not-so-simple types}
 131 \label{sec:types}
 132
 133 \subsection{A note on notation}
 134
 135 \newcommand{\appsp}{\hspace{0.07cm}}
 136 \newcommand{\app}[2]{#1\appsp#2}
 137 \newcommand{\absspace}{\hspace{0.03cm}}
 138 \newcommand{\abs}[2]{\lambda #1\absspace.\absspace#2}
 139 \newcommand{\termt}{t}
 140 \newcommand{\termm}{m}
 141 \newcommand{\termn}{n}
 142 \newcommand{\termp}{p}
 143 \newcommand{\termf}{f}
 144 \newcommand{\separ}{\ \ |\ \ }
 145 \newcommand{\termsyn}{\mathit{term}}
 146 \newcommand{\axname}[1]{\textbf{#1}}
 147 \newcommand{\axdesc}[2]{\axname{#1} \fbox{$#2$}}
 148 \newcommand{\lcsyn}[1]{\mathrm{\underline{#1}}}
 149 \newcommand{\lccon}[1]{\mathsf{#1}}
 150 \newcommand{\lcfun}[1]{\mathbf{#1}}
 151 \newcommand{\tyarr}{\to}
 152 \newcommand{\tysyn}{\mathit{type}}
 153 \newcommand{\ctxsyn}{\mathit{context}}
 154 \newcommand{\emptyctx}{\cdot}
 155
 156 In the following sections I will introduce a lot of syntax, typing, and
 157 reduction rules, along with examples.  To make the exposition clearer, usually
 158 the rules are preceded by what the rule looks like and what it shows (for
 159 example \axdesc{typing}{\Gamma \vdash \termsyn : \tysyn}).
 160
 161 In the languages presented I will also use different fonts for different things:
 162 \begin{center}
 163   \begin{tabular}{c | l}
 164     $\lccon{Sans}$  & Sans serif, capitalised, for type constructors. \\
 165     $\lccon{sans}$  & Sans serif, not capitalised, for data constructors. \\
 166     $\lcsyn{roman}$ & Roman, underlined, for the syntax of the language. \\
 167     $\lcfun{roman}$ & Roman, bold, for defined functions and values. \\
 168     $math$          & Math mode font for quantified variables and syntax elements.
 169   \end{tabular}
 170 \end{center}
 171
 172 Moreover, I will from time to time give examples in the Haskell programming
 173 language as defined in \citep{Haskell2010}, which I will typeset in
 174 \texttt{teletype} font.  I assume that the reader is already familiar with
 175 Haskell, plenty of good introductions are available \citep{LYAH,ProgInHask}.
 176
 177 \subsection{Untyped $\lambda$-calculus}
 178
 179 Along with Turing's machines, the earliest attempts to formalise computation
 180 lead to the $\lambda$-calculus \citep{Church1936}.  This early programming
 181 language encodes computation with a minimal syntax and no `data' in the
 182 traditional sense, but just functions.
 183
 184 The syntax of $\lambda$-terms consists of three things: variables, abstractions,
 185 and applications:
 186
 187 \begin{center}
 188 \axname{syntax}
 189 $$
 190 \begin{array}{rcl}
 191   \termsyn & ::= & x \separ (\abs{x}{\termsyn}) \separ (\app{\termsyn}{\termsyn}) \\
 192          x & \in & \text{Some enumerable set of symbols, e.g.}\ \{x, y, z, \dots , x_1, x_2, \dots\}
 193 \end{array}
 194 $$
 195 \end{center}
 196
 197 I will use $\termt,\termm,\termn,\dots$ to indicate a generic term, and $x,y$
 198 for variables.  Parenthesis will be omitted in the usual way:
 199 $\app{\app{t}{m}}{n} = \app{(\app{t}{m})}{n}$. I will also assume that all
 200 variable names in a term are unique to avoid problems with name capturing.
 201 Intuitively, abstractions ($\abs{x}{\termt}$) introduce functions with a named
 202 parameter ($x$), and applications ($\app{\termt}{\termm}$) apply a function
 203 ($\termt$) to an argument ($\termm$).
 204
 205 The `applying' is more formally explained with a reduction rule:
 206
 207 \newcommand{\bred}{\leadsto}
 208 \newcommand{\bredc}{\bred^*}
 209
 210 \begin{center}
 211 \axdesc{reduction}{\termsyn \bred \termsyn}
 212 $$\app{(\abs{x}{\termt})}{\termm} \bred \termt[\termm / x]$$
 213 \end{center}
 214
 215 Where $\termt[\termm / x]$ expresses the operation that substitutes all
 216 occurrences of $x$ with $\termm$ in $\termt$.  In the future, I will use
 217 $[\termt]$ as an abbreviation for $[\termt / x]$.  In the systems presented, the
 218 $\bred$ relation also includes reduction of subterms, for example if $\termt
 219 \bred \termm$ then $\app{\termt}{\termn} \bred \app{\termm}{\termn}$, and so on.
 220
 221 These few elements are of remarkable expressiveness, and in fact Turing
 222 complete.  As a corollary, we must be able to devise a term that reduces forever
 223 (`loops' in imperative terms):
 224 \[
 225   \app{(\abs{x}{\app{x}{x}})}{(\abs{x}{\app{x}{x}})} \bred \app{(\abs{x}{\app{x}{x}})}{(\abs{x}{\app{x}{x}})} \bred \dotsb
 226 \]
 227 Terms that can be reduced only a finite number of times (the non-looping ones)
 228 are said to be \emph{strongly normalising}, and the `final' term (where no
 229 reductions are possible on the term or on its subterms) is called \emph{normal
 230   form}.
 231
 232 From now on I will omit the `strongly' in `strongly normalising', however weaker
 233 notions of normalisation have been described.  More generally, it is often
 234 convenient not to reduce subterms indiscriminately, and have a more principled
 235 approach to evaluation, or `evaluation strategy'.  Common evaluation strategies
 236 include `call by value' (or `strict'), where arguments of abstractions are
 237 reduced before being applied to the abstraction; and conversely `call by name'
 238 (or `lazy'), where we reduce only when we need to do so to proceed - in other
 239 words when we have an application where the function is still not a $\lambda$.
 240 In both these reduction strategies we never reduce under an abstraction, and a
 241 different notion of normalisation (`weak head normal form') if often used, where
 242 every abstraction is a weak head normal form.
 243
 244 \subsection{The simply typed $\lambda$-calculus}
 245 \label{sec:stlc}
 246
 247 \newcommand{\tya}{A}
 248 \newcommand{\tyb}{B}
 249 \newcommand{\tyc}{C}
 250
 251 One way to `discipline' $\lambda$-terms is to assign \emph{types} to them, and
 252 then check that the terms that we are forming make sense given our typing rules
 253 \citep{Curry1934}.
 254
 255 We wish to introduce rules of the form $\Gamma \vdash \termt : \tya$, which
 256 reads `in context $\Gamma$, term $\termt$ has type $\tya$'.
 257
 258 The syntax for types is as follows:
 259
 260 \begin{center}
 261   \axname{syntax}
 262 $$
 263 \begin{array}{rcl}
 264   \tysyn & ::= & \alpha \separ \tysyn \tyarr \tysyn \\
 265   \alpha & \in & \text{A set of base types provided by the language}
 266 \end{array}
 267 $$
 268 \end{center}
 269
 270 I will use $\tya,\tyb,\dots$ to indicate a generic type, carrying the convention
 271 from set theory of denoting member of sets with lowercase letters and sets
 272 themselves with capitals.
 273
 274 A context $\Gamma$ is a map from variables to types.  I will use the notation
 275 $\Gamma; x : \tya$ to augment it and to `extract' pairs from it.
 276
 277 Predictably, $\tya \tyarr \tyb$ is the type of a function from $\tya$ to
 278 $\tyb$.  We extend the syntax and allow to decorate our abstractions with
 279 types:
 280 \begin{center}
 281   \axname{syntax}
 282    $$\termsyn ::= x \separ (\abs{x{:}\tysyn}{\termsyn}) \separ (\app{\termsyn}{\termsyn})$$
 283 \end{center}
 284 Now we are ready to give the typing judgements:
 285
 286 \begin{center}
 287   \axdesc{typing}{\Gamma \vdash \termsyn : \tysyn}
 288
 289   \vspace{0.5cm}
 290
 291   \begin{tabular}{c c c}
 292     \AxiomC{$\Gamma; x : \tya \vdash x : \tya$}
 293     \DisplayProof
 294     &
 295     \AxiomC{$\Gamma; x : \tya \vdash \termt : \tyb$}
 296     \UnaryInfC{$\Gamma \vdash \abs{x{:}\tya}{\termt} : \tya \tyarr \tyb$}
 297     \DisplayProof
 298   \end{tabular}
 299
 300   \vspace{0.5cm}
 301
 302   \begin{tabular}{c}
 303     \AxiomC{$\Gamma \vdash \termt : \tya \tyarr \tyb$}
 304     \AxiomC{$\Gamma \vdash \termm : \tya$}
 305     \BinaryInfC{$\Gamma \vdash \app{\termt}{\termm} : \tyb$}
 306     \DisplayProof
 307   \end{tabular}
 308 \end{center}
 309
 310 This typing system takes the name of `simply typed lambda calculus' (STLC),
 311 and enjoys a number of properties.  Two of them are expected in most type
 312 systems \citep{Pierce2002}:
 313 \begin{description}
 314 \item[Progress] A well-typed term is not stuck - it is either a variable, or its
 315   constructor does not appear on the left of the $\bred$ relation (currently only
 316   $\lambda$), or it can take a step according to the evaluation rules.
 317 \item[Preservation] If a well-typed term takes a step of evaluation, then the
 318   resulting term is also well-typed, and preserves the previous type.
 319 \end{description}
 320
 321 However, STLC buys us much more: every well-typed term is normalising.  It is
 322 easy to see that we can't fill the blanks if we want to give types to the
 323 non-normalising term shown before:
 324 \begin{equation*}
 325   \app{(\abs{x : ?}{\app{x}{x}})}{(\abs{x : ?}{\app{x}{x}})}
 326 \end{equation*}
 327
 328 \newcommand{\lcfix}[2]{\lcsyn{fix} \appsp #1\absspace.\absspace #2}
 329
 330 This makes the STLC Turing incomplete.  We can recover the ability to loop by
 331 adding a combinator that recurses:
 332 $$
 333   \termsyn ::= \dotsb \separ  \lcfix{x : \tysyn}{\termsyn}
 334 $$
 335 \begin{center}
 336   \begin{prooftree}
 337     \AxiomC{$\Gamma;x : \tya \vdash \termt : \tya$}
 338     \UnaryInfC{$\Gamma \vdash \lcfix{x : \tya}{\termt} : \tya$}
 339   \end{prooftree}
 340 \end{center}
 341 $$
 342   \lcfix{x : \tya}{\termt} \bred \termt[\lcfix{x : \tya}{\termt}]
 343 $$
 344 This will deprive us of normalisation, which is a particularly bad thing if we
 345 want to use the STLC as described in the next section.
 346
 347 \subsection{The Curry-Howard correspondence}
 348 \label{sec:curry-howard}
 349
 350 \newcommand{\lcunit}{\lcfun{\langle\rangle}}
 351
 352 It turns out that the STLC can be seen a natural deduction system for
 353 propositional logic.  Terms are proofs, and their types are the propositions
 354 they prove.  This remarkable fact is known as the Curry-Howard correspondence,
 355 or isomorphism.
 356
 357 The `arrow' ($\to$) type corresponds to implication.  If we wish to prove that
 358 $(\tya \tyarr \tyb) \tyarr (\tyb \tyarr \tyc) \tyarr (\tya \tyarr \tyc)$, all we
 359 need to do is to devise a $\lambda$-term that has the correct type:
 360 \begin{equation*}
 361   \abs{f : (\tya \tyarr \tyb)}{\abs{g : (\tyb \tyarr \tyc)}{\abs{x : \tya}{\app{g}{(\app{f}{x})}}}}
 362 \end{equation*}
 363 That is, function composition.  We might want extend our bare lambda calculus
 364 with a couple of terms former to make our natural deduction more pleasant to
 365 use.  For example, tagged unions (\texttt{Either} in Haskell) are disjunctions,
 366 and tuples (or products) are conjunctions.  We also want to be able to express
 367 falsity ($\bot$): that can done by introducing a type inhabited by no closed
 368 terms.  If evidence of such a type is presented (necessarily under an
 369 abstraction), then we can derive any type, which expresses absurdity.
 370 Conversely, truth ($\top$) is the type with just one trivial element
 371 ($\lcunit$).
 372
 373 \newcommand{\lcinl}{\lccon{inl}\appsp}
 374 \newcommand{\lcinr}{\lccon{inr}\appsp}
 375 \newcommand{\lccase}[3]{\lcsyn{case}\appsp#1\appsp\lcsyn{of}\appsp#2\appsp#3}
 376 \newcommand{\lcfst}{\lcfun{fst}\appsp}
 377 \newcommand{\lcsnd}{\lcfun{snd}\appsp}
 378 \newcommand{\orint}{\vee I_{1,2}}
 379 \newcommand{\orintl}{\vee I_{1}}
 380 \newcommand{\orintr}{\vee I_{2}}
 381 \newcommand{\orel}{\vee E}
 382 \newcommand{\andint}{\wedge I}
 383 \newcommand{\andel}{\wedge E_{1,2}}
 384 \newcommand{\botel}{\bot E}
 385 \newcommand{\lcabsurd}{\lcfun{absurd}\appsp}
 386 \newcommand{\lcabsurdd}[1]{\lcfun{absurd}_{#1}\appsp}
 387
 388 % TODO make parens bold with \bm
 389 \begin{center}
 390   \axname{syntax}
 391   $$
 392   \begin{array}{rcl}
 393     \termsyn & ::= & \dotsb \\
 394              &  |  & \lcinl \termsyn \separ \lcinr \termsyn \separ \lccase{\termsyn}{\termsyn}{\termsyn} \\
 395              &  |  & (\termsyn , \termsyn) \separ \lcfst \termsyn \separ \lcsnd \termsyn \\
 396              &  |  & \lcunit \\
 397     \tysyn & ::= & \dotsb \separ \tysyn \vee \tysyn \separ \tysyn \wedge \tysyn \separ \bot \separ \top
 398   \end{array}
 399   $$
 400 \end{center}
 401 \begin{center}
 402   \axdesc{typing}{\Gamma \vdash \termsyn : \tysyn}
 403   \begin{prooftree}
 404     \AxiomC{$\Gamma \vdash \termt : \tya$}
 405     \UnaryInfC{$\Gamma \vdash \lcinl \termt : \tya \vee \tyb$}
 406     \noLine
 407     \UnaryInfC{$\Gamma \vdash \lcinr \termt : \tyb \vee \tya$}
 408   \end{prooftree}
 409   \begin{prooftree}
 410     \AxiomC{$\Gamma \vdash \termt : \tya \vee \tyb$}
 411     \AxiomC{$\Gamma \vdash \termm : \tya \tyarr \tyc$}
 412     \AxiomC{$\Gamma \vdash \termn : \tyb \tyarr \tyc$}
 413     \TrinaryInfC{$\Gamma \vdash \lccase{\termt}{\termm}{\termn} : \tyc$}
 414   \end{prooftree}
 415
 416   \begin{tabular}{c c}
 417     \AxiomC{$\Gamma \vdash \termt : \tya$}
 418     \AxiomC{$\Gamma \vdash \termm : \tyb$}
 419     \BinaryInfC{$\Gamma \vdash (\tya , \tyb) : \tya \wedge \tyb$}
 420     \DisplayProof
 421     &
 422     \AxiomC{$\Gamma \vdash \termt : \tya \wedge \tyb$}
 423     \UnaryInfC{$\Gamma \vdash \lcfst \termt : \tya$}
 424     \noLine
 425     \UnaryInfC{$\Gamma \vdash \lcsnd \termt : \tyb$}
 426     \DisplayProof
 427   \end{tabular}
 428
 429   \vspace{0.5cm}
 430
 431   \begin{tabular}{c c}
 432     \AxiomC{$\Gamma \vdash \termt : \bot$}
 433     \UnaryInfC{$\Gamma \vdash \lcabsurdd{\tya} \termt : \tya$}
 434     \DisplayProof
 435     &
 436     \AxiomC{}
 437     \UnaryInfC{$\Gamma \vdash \lcunit : \top$}
 438     \DisplayProof
 439   \end{tabular}
 440 \end{center}
 441 \begin{center}
 442   \axdesc{reduction}{\termsyn \bred \termsyn}
 443   \begin{eqnarray*}
 444     \lccase{(\lcinl \termt)}{\termm}{\termn} & \bred & \app{\termm}{\termt} \\
 445     \lccase{(\lcinr \termt)}{\termm}{\termn} & \bred & \app{\termn}{\termt} \\
 446     \lcfst (\termt , \termm)                 & \bred & \termt \\
 447     \lcsnd (\termt , \termm)                 & \bred & \termm
 448   \end{eqnarray*}
 449 \end{center}
 450
 451 With these rules, our STLC now looks remarkably similar in power and use to the
 452 natural deduction we already know.  $\neg A$ can be expressed as $A \tyarr
 453 \bot$.  However, there is an important omission: there is no term of the type $A
 454 \vee \neg A$ (excluded middle), or equivalently $\neg \neg A \tyarr A$ (double
 455 negation), or indeed any term with a type equivalent to those.
 456
 457 This has a considerable effect on our logic and it's no coincidence, since there
 458 is no obvious computational behaviour for laws like the excluded middle.
 459 Theories of this kind are called \emph{intuitionistic}, or \emph{constructive},
 460 and all the systems analysed will have this characteristic since they build on
 461 the foundation of the STLC\footnote{There is research to give computational
 462   behaviour to classical logic, but I will not touch those subjects.}.
 463
 464 Finally, going back to our $\lcsyn{fix}$ combinator, it's now easy to see how
 465 we would want to exclude such a thing if we want to use STLC as a logic, since
 466 it allows us to prove everything: $(\lcfix{x : \tya}{x}) : \tya$ clearly works
 467 for any $A$!  This is a crucial point: in general we wish to have systems that
 468 do not let the user devise a term of type $\bot$, otherwise our logic will be
 469 inconsistent\footnote{Obviously such a term can be present under a $\lambda$.}.
 470
 471 \subsection{Extending the STLC}
 472
 473 \newcommand{\lctype}{\lccon{Type}}
 474 \newcommand{\lcite}[3]{\lcsyn{if}\appsp#1\appsp\lcsyn{then}\appsp#2\appsp\lcsyn{else}\appsp#3}
 475 \newcommand{\lcbool}{\lccon{Bool}}
 476 \newcommand{\lcforallz}[2]{\forall #1 \absspace.\absspace #2}
 477 \newcommand{\lcforall}[3]{(#1 : #2) \tyarr #3}
 478 \newcommand{\lcexists}[3]{(#1 : #2) \times #3}
 479
 480 The STLC can be made more expressive in various ways.  \cite{Barendregt1991}
 481 succinctly expressed geometrically how we can add expressively:
 482
 483 $$
 484 \xymatrix@!0@=1.5cm{
 485   & \lambda\omega \ar@{-}[rr]\ar@{-}'[d][dd]
 486   & & \lambda C \ar@{-}[dd]
 487   \\
 488   \lambda2 \ar@{-}[ur]\ar@{-}[rr]\ar@{-}[dd]
 489   & & \lambda P2 \ar@{-}[ur]\ar@{-}[dd]
 490   \\
 491   & \lambda\underline\omega \ar@{-}'[r][rr]
 492   & & \lambda P\underline\omega
 493   \\
 494   \lambda{\to} \ar@{-}[rr]\ar@{-}[ur]
 495   & & \lambda P \ar@{-}[ur]
 496 }
 497 $$
 498 Here $\lambda{\to}$, in the bottom left, is the STLC.  From there can move along
 499 3 dimensions:
 500 \begin{description}
 501 \item[Terms depending on types (towards $\lambda{2}$)] We can quantify over
 502   types in our type signatures: $(\abs{A : \lctype}{\abs{x : A}{x}}) :
 503   \lcforallz{A}{A \tyarr A}$.  The first and most famous instance of this idea
 504   has been System F.  This gives us a form of polymorphism and has been wildly
 505   successful, also thanks to a well known inference algorithm for a restricted
 506   version of System F known as Hindley-Milner.  Languages like Haskell and SML
 507   are based on this discipline.
 508 \item[Types depending on types (towards $\lambda{\underline{\omega}}$)] We have
 509   type operators: $(\abs{A : \lctype}{\abs{R : \lctype}{(A \to R) \to R}}) :
 510   \lctype \to \lctype \to \lctype$.
 511 \item[Types depending on terms (towards $\lambda{P}$)] Also known as `dependent
 512   types', give great expressive power: $(\abs{x : \lcbool}{\lcite{x}{\mathbb{N}}{\mathbb{Q}}}) : \lcbool \to \lctype$.
 513 \end{description}
 514
 515 All the systems preserve the properties that make the STLC well behaved.  The
 516 system we are going to focus on, Intuitionistic Type Theory, has all of the
 517 above additions, and thus would sit where $\lambda{C}$ sits in the
 518 `$\lambda$-cube'.
 519
 520 \section{Intuitionistic Type Theory and dependent types}
 521 \label{sec:itt}
 522
 523 \newcommand{\lcset}[1]{\lccon{Type}_{#1}}
 524 \newcommand{\lcsetz}{\lccon{Type}}
 525 \newcommand{\defeq}{\cong}
 526
 527 \subsection{A Bit of History}
 528
 529 Logic frameworks and programming languages based on type theory have a long
 530 history.  Per Martin-L\"{o}f described the first version of his theory in 1971,
 531 but then revised it since the original version was inconsistent due to its
 532 impredicativity\footnote{In the early version there was only one universe
 533   $\lcsetz$ and $\lcsetz : \lcsetz$, see section \ref{sec:core-tt} for an
 534   explanation on why this causes problems.}.  For this reason he gave a revised
 535 and consistent definition later \citep{Martin-Lof1984}.
 536
 537 A related development is the polymorphic $\lambda$-calculus, and specifically
 538 the previously mentioned System F, which was developed independently by Girard
 539 and Reynolds.  An overview can be found in \citep{Reynolds1994}.  The surprising
 540 fact is that while System F is impredicative it is still consistent and strongly
 541 normalising.  \cite{Coquand1986} further extended this line of work with the
 542 Calculus of Constructions (CoC).
 543
 544 \subsection{A core type theory}
 545 \label{sec:core-tt}
 546
 547 The calculus I present follows the exposition in \citep{Thompson1991}, and is
 548 quite close to the original formulation of predicative ITT as found in
 549 \citep{Martin-Lof1984}.
 550
 551 \begin{center}
 552   \axname{syntax}
 553   \begin{eqnarray*}
 554   \termsyn & ::= & x \\
 555          &  |  & \lcforall{x}{\termsyn}{\termsyn} \separ \abs{x : \termsyn}{\termsyn} \separ \app{\termsyn}{\termsyn} \\
 556          &  |  & \lcexists{x}{\termsyn}{\termsyn} \separ (\termsyn , \termsyn)_{x.\termsyn} \separ \lcfst \termsyn \separ \lcsnd \termsyn \\
 557          &  |  & \bot \separ \lcabsurd_{\termsyn} \termsyn \\
 558          &  |  & \lcset{n} \\
 559    n     & \in & \mathbb{N}
 560  \end{eqnarray*}
 561
 562   \axdesc{typing}{\Gamma \vdash \termsyn : \termsyn}
 563
 564   \vspace{0.5cm}
 565
 566   \begin{tabular}{c c c}
 567     \AxiomC{$\Gamma;x : \tya \vdash x : \tya$}
 568     \DisplayProof
 569     &
 570     \AxiomC{$\Gamma \vdash \termt : \tya$}
 571     \AxiomC{$\tya \defeq \tyb$}
 572     \BinaryInfC{$\Gamma \vdash \termt : \tyb$}
 573     \DisplayProof
 574     &
 575     \AxiomC{$\Gamma \vdash \termt : \bot$}
 576     \UnaryInfC{$\Gamma \vdash \lcabsurdd{\tya} \termt : \tya$}
 577     \DisplayProof
 578   \end{tabular}
 579
 580   \vspace{0.5cm}
 581
 582   \begin{tabular}{c c}
 583     \AxiomC{$\Gamma;x : \tya \vdash \termt : \tya$}
 584     \UnaryInfC{$\Gamma \vdash \abs{x : \tya}{\termt} : \lcforall{x}{\tya}{\tyb}$}
 585     \DisplayProof
 586     &
 587     \AxiomC{$\Gamma \vdash \termt : \lcforall{x}{\tya}{\tyb}$}
 588     \AxiomC{$\Gamma \vdash \termm : \tya$}
 589     \BinaryInfC{$\Gamma \vdash \app{\termt}{\termm} : \tyb[\termm ]$}
 590     \DisplayProof
 591   \end{tabular}
 592
 593   \vspace{0.5cm}
 594
 595   \begin{tabular}{c c}
 596     \AxiomC{$\Gamma \vdash \termt : \tya$}
 597     \AxiomC{$\Gamma \vdash \termm : \tyb[\termt ]$}
 598     \BinaryInfC{$\Gamma \vdash (\termt, \termm)_{x.\tyb} : \lcexists{x}{\tya}{\tyb}$}
 599     \DisplayProof
 600     &
 601     \AxiomC{$\Gamma \vdash \termt: \lcexists{x}{\tya}{\tyb}$}
 602     \UnaryInfC{$\hspace{0.7cm} \Gamma \vdash \lcfst \termt : \tya \hspace{0.7cm}$}
 603     \noLine
 604     \UnaryInfC{$\Gamma \vdash \lcsnd \termt : \tyb[\lcfst \termt ]$}
 605     \DisplayProof
 606   \end{tabular}
 607
 608   \vspace{0.5cm}
 609
 610   \begin{tabular}{c c}
 611     \AxiomC{}
 612     \UnaryInfC{$\Gamma \vdash \lcset{n} : \lcset{n + 1}$}
 613     \DisplayProof
 614     &
 615     \AxiomC{$\Gamma \vdash \tya : \lcset{n}$}
 616     \AxiomC{$\Gamma; x : \tya \vdash \tyb : \lcset{m}$}
 617     \BinaryInfC{$\Gamma \vdash \lcforall{x}{\tya}{\tyb} : \lcset{n \sqcup m}$}
 618     \noLine
 619     \UnaryInfC{$\Gamma \vdash \lcexists{x}{\tya}{\tyb} : \lcset{n \sqcup m}$}
 620     \DisplayProof
 621   \end{tabular}
 622
 623   \vspace{0.5cm}
 624
 625   \axdesc{reduction}{\termsyn \bred \termsyn}
 626   \begin{eqnarray*}
 627     \app{(\abs{x}{\termt})}{\termm} & \bred & \termt[\termm ] \\
 628     \lcfst (\termt, \termm) & \bred & \termt \\
 629     \lcsnd (\termt, \termm) & \bred & \termm
 630   \end{eqnarray*}
 631 \end{center}
 632
 633 When showing examples types will be omitted when this can be done without loss
 634 of clarity, for example $\abs{x}{x}$ in place of $\abs{A : \lcsetz}{\abs{x :
 635     A}{x}}$.  I will also use $\lcsetz$ for $\lcset{0}$.
 636
 637 There are a lot of new factors at play here. The first thing to notice is that
 638 the separation between types and terms is gone.  All we have is terms, that
 639 include both values (terms of type $\lcset{0}$) and types (terms of type
 640 $\lcset{n}$, with $n > 0$).  This change is reflected in the typing rules.
 641 While in the STLC values and types are kept well separated (values never go
 642 `right of the colon'), in ITT types can freely depend on values (they are
 643 `dependent types').
 644
 645 This relation is expressed in the typing rules for $\tyarr$ and $\times$: if a
 646 function has type $\lcforall{x}{\tya}{\tyb}$, $\tyb$ can depend on $x$.
 647 Examples will make this clearer once some base types are added in section
 648 \ref{sec:base-types}.
 649
 650 The Curry-Howard correspondence runs through ITT as it did with the STLC with
 651 the difference that ITT corresponds to an higher order propositional
 652 logic. $\tyarr$ and $\times$ are at the core of the machinery of ITT:
 653
 654 \begin{description}
 655 \item[`forall' ($\tyarr$)] is a generalisation of $\tyarr$ in the STLC and
 656   expresses universal quantification in our logic, where $A \tyarr B$ can be
 657   expressed as $(- : A) \tyarr B$.  It is often written with a syntax closer to
 658   logic, e.g. $\forall x : \tya. \tyb$.  In the literature this is also known as
 659   `dependent product' and shown as $\Pi$, following the interpretation of
 660   functions as infinitary products. I will just call it `dependent function',
 661   reserving `product' for $\exists$.
 662
 663 \item[`exists' ($\times$)] is a generalisation of $\wedge$ in the extended STLC
 664   of section \ref{sec:curry-howard}, and thus I will call it `dependent
 665   product'.  Like $\wedge$, it is formed by providing a pair of things.  In our
 666   logic, it represents existential quantification.  Similarly to $\tyarr$, it is
 667   often written as $\exists x : \tya. \tyb$.
 668
 669   For added confusion, in the literature that uses $\Pi$ for $\tyarr$, $\times$
 670   is often named `dependent sum' and shown as $\Sigma$.  This is following the
 671   interpretation of $\exists$ as a generalised, infinitary $\vee$, where the
 672   first element of the pair is the `tag' that determines which type the second
 673   element will have.
 674 \end{description}
 675
 676 Another thing to notice is that types are very `first class': we are free to
 677 create functions that accept and return types.  For this reason we define
 678 $\defeq$ as the smallest equivalence relation extending $\bredc$, where $\bredc$
 679 is the reflexive transitive closure of $\bred$; and we treat types that are
 680 related according to $\defeq$ as the same.  Another way of seeing $\defeq$ is
 681 this: when we want to compare two terms for equality, we reduce them as far as
 682 possible and then check if they are equal\footnote{Note that when comparing
 683   terms we do it up to $\alpha$-renaming.  That is, we do not consider renaming
 684   of bound variables as a difference - for example $\abs{x : A}{x} \defeq \abs{y
 685     : A}{y}$.}.  This works since not only each term has a normal form (ITT is
 686 strongly normalising), but the normal form is also unique; or in other words
 687 $\bred$ is confluent (if $\termt \bredc \termm$ and $\termt \bredc \termn$, then
 688 there is a $p$ such that $\termm \bredc \termp$ and $\termn \bredc \termp$).
 689 This measure makes sure that, for instance, $\app{(\abs{x :
 690     \lctype}{x})}{\lcbool} \defeq \lcbool$.  The theme of equality is central
 691 and will be analysed better in section \ref{sec:equality}.
 692
 693 The theory presented is \emph{stratified}.  We have a hierarchy of types
 694 $\lcset{0} : \lcset{1} : \lcset{2} : \dots$, so that there is no `type of all
 695 types', and our theory is predicative.  Constructors like $\tyarr$ and $\times$
 696 take the least upper bound $\sqcup$ of the contained types. The layers of the
 697 hierarchy are called `universes'.  Theories where $\lcsetz : \lcsetz$ are
 698 inconsistent due to Girard's paradox \citep{Hurkens1995}, and thus lose their
 699 well-behavedness.  Impredicativity sometimes has its place, either because the
 700 theory retain good properties (normalization, consistency, etc.) anyway, like in
 701 System F and CoC; or because we are at a stage at which we do not care.
 702
 703 \subsection{Base Types}
 704 \label{sec:base-types}
 705
 706 \newcommand{\lctrue}{\lccon{true}}
 707 \newcommand{\lcfalse}{\lccon{false}}
 708 \newcommand{\lcw}[3]{\lccon{W} #1 : #2 \absspace.\absspace #3}
 709 \newcommand{\lcnode}[4]{#1 \lhd_{#2 . #3} #4}
 710 \newcommand{\lcnodez}[2]{#1 \lhd #2}
 711 \newcommand{\lcited}[5]{\lcsyn{if}\appsp#1/#2\appsp.\appsp#3\appsp\lcsyn{then}\appsp#4\appsp\lcsyn{else}\appsp#5}
 712 \newcommand{\lcrec}[4]{\lcsyn{rec}\appsp#1/#2\appsp.\appsp#3\appsp\lcsyn{with}\appsp#4}
 713 \newcommand{\lcrecz}[2]{\lcsyn{rec}\appsp#1\appsp\lcsyn{with}\appsp#2}
 714 \newcommand{\AxiomL}[1]{\Axiom$\fCenter #1$}
 715 \newcommand{\UnaryInfL}[1]{\UnaryInf$\fCenter #1$}
 716
 717 While the ITT presented is a fairly complete logic, it is not that useful for
 718 programming.  If we wish to make it better, we can add some base types to
 719 represent the data structures we know and love, such as numbers, lists, and
 720 trees.  Apart from some unsurprising data types, we introduce $\lccon{W}$, a
 721 very general tree-like structure useful to represent inductively defined types
 722 and that will allow us to express structural recursion in an equally general
 723 manner.
 724
 725 \begin{center}
 726   \axname{syntax}
 727   \begin{eqnarray*}
 728   \termsyn & ::= & \dots \\
 729            &  |  & \top \separ \lcunit \\
 730            &  |  & \lcbool \separ \lctrue \separ \lcfalse \separ \lcited{\termsyn}{x}{\termsyn}{\termsyn}{\termsyn} \\
 731            &  |  & \lcw{x}{\termsyn}{\termsyn} \separ \lcnode{\termsyn}{x}{\termsyn}{\termsyn} \separ \lcrec{\termsyn}{x}{\termsyn}{\termsyn}
 732  \end{eqnarray*}
 733
 734   \axdesc{typing}{\Gamma \vdash \termsyn : \termsyn}
 735
 736   \vspace{0.5cm}
 737
 738   \begin{tabular}{c c c}
 739     \AxiomC{}
 740     \UnaryInfC{$\hspace{0.2cm}\Gamma \vdash \top : \lcset{0} \hspace{0.2cm}$}
 741     \noLine
 742     \UnaryInfC{$\Gamma \vdash \lcbool : \lcset{0}$}
 743     \DisplayProof
 744     &
 745     \AxiomC{}
 746     \UnaryInfC{$\Gamma \vdash \lcunit : \top$}
 747     \DisplayProof
 748     &
 749     \AxiomC{}
 750     \UnaryInfC{$\Gamma \vdash \lctrue : \lcbool$}
 751     \noLine
 752     \UnaryInfC{$\Gamma \vdash \lcfalse : \lcbool$}
 753     \DisplayProof
 754   \end{tabular}
 755
 756   \vspace{0.5cm}
 757
 758   \begin{tabular}{c}
 759     \AxiomC{$\Gamma \vdash \termt : \lcbool$}
 760     \AxiomC{$\Gamma \vdash \termm : \tya[\lctrue]$}
 761     \AxiomC{$\Gamma \vdash \termn : \tya[\lcfalse]$}
 762     \TrinaryInfC{$\Gamma \vdash \lcited{\termt}{x}{\tya}{\termm}{\termn} : \tya[\termt]$}
 763     \DisplayProof
 764   \end{tabular}
 765
 766   \vspace{0.5cm}
 767
 768   \begin{tabular}{c}
 769     \AxiomC{$\Gamma \vdash \tya : \lcset{n}$}
 770     \AxiomC{$\Gamma; x : \tya \vdash \tyb : \lcset{m}$}
 771     \BinaryInfC{$\Gamma \vdash \lcw{x}{\tya}{\tyb} : \lcset{n \sqcup m}$}
 772     \DisplayProof
 773   \end{tabular}
 774
 775   \vspace{0.5cm}
 776
 777   \begin{tabular}{c}
 778     \AxiomC{$\Gamma \vdash \termt : \tya$}
 779     \AxiomC{$\Gamma \vdash \termf : \tyb[\termt ] \tyarr \lcw{x}{\tya}{\tyb}$}
 780     \BinaryInfC{$\Gamma \vdash \lcnode{\termt}{x}{\tyb}{\termf} : \lcw{x}{\tya}{\tyb}$}
 781     \DisplayProof
 782   \end{tabular}
 783
 784   \vspace{0.5cm}
 785
 786   \begin{tabular}{c}
 787     \AxiomC{$\Gamma \vdash \termt: \lcw{x}{\tya}{\tyb}$}
 788     \noLine
 789     \UnaryInfC{$\Gamma \vdash \lcforall{\termm}{\tya}{\lcforall{\termf}{\tyb[\termm] \tyarr \lcw{x}{\tya}{\tyb}}{(\lcforall{\termn}{\tyb[\termm]}{\tyc[\app{\termf}{\termn}]}) \tyarr \tyc[\lcnodez{\termm}{\termf}]}}$}
 790     \UnaryInfC{$\Gamma \vdash \lcrec{\termt}{x}{\tyc}{\termp} : \tyc[\termt]$}
 791     \DisplayProof
 792   \end{tabular}
 793
 794   \vspace{0.5cm}
 795
 796   \axdesc{reduction}{\termsyn \bred \termsyn}
 797   \begin{eqnarray*}
 798     \lcited{\lctrue}{x}{\tya}{\termt}{\termm} & \bred & \termt \\
 799     \lcited{\lcfalse}{x}{\tya}{\termt}{\termm} & \bred & \termm \\
 800     \lcrec{\lcnodez{\termt}{\termf}}{x}{\tya}{\termp} & \bred & \app{\app{\app{\termp}{\termt}}{\termf}}{(\abs{\termm}{\lcrec{\app{f}{\termm}}{x}{\tya}{\termp}})}
 801   \end{eqnarray*}
 802 \end{center}
 803
 804 The introduction and elimination for $\top$ and $\lcbool$ are unsurprising.
 805 Note that in the $\lcite{\dotsb}{\dotsb}{\dotsb}$ construct the type of the
 806 branches are dependent on the value of the conditional.
 807
 808 The rules for $\lccon{W}$, on the other hand, are quite an eyesore.  The idea
 809 behind $\lccon{W}$ types is to build up `trees' where the number of `children'
 810 of each node is dependent on the value (`shape') in the node.  This is captured
 811 by the $\lhd$ constructor, where the argument on the left is the value, and the
 812 argument on the right is a function that returns a child for each possible value
 813 of $\tyb[\text{node value}]$.  The recursor $\lcrec{\termt}{x}{\tyc}{\termp}$
 814 uses $p$ to inductively prove that $\tyc[\termt]$ holds.
 815
 816 \subsection{Examples}
 817
 818 Now we can finally provide some meaningful examples.  Apart from omitting types,
 819 I will also use some abbreviations:
 820 \begin{itemize}
 821   \item $\_\mathit{operator}\_$ to define infix operators
 822   \item $\abs{x\appsp y\appsp z : \tya}{\dotsb}$ to define multiple abstractions at the same
 823     time
 824 \end{itemize}
 825
 826 \subsubsection{Sum types}
 827
 828 We would like to recover our sum type, or disjunction, $\vee$.  This is easily
 829 achieved with $\times$:
 830 \[
 831 \begin{array}{rcl}
 832   \_\vee\_ & = &  \abs{\tya\appsp\tyb : \lcsetz}{\lcexists{x}{\lcbool}{\lcite{x}{\tya}{\tyb}}} \\
 833   \lcinl   & = & \abs{x}{(\lctrue, x)} \\
 834   \lcinr   & = & \abs{x}{(\lcfalse, x)} \\
 835   \lcfun{case} & = & \abs{x : \tya \vee \tyb}{\abs{f : \tya \tyarr \tyc}{\abs{g : \tyb \tyarr \tyc}{ \\
 836            &   & \hspace{0.5cm} \app{(\lcited{\lcfst x}{b}{(\lcite{b}{A}{B}) \tyarr C}{f}{g})}{(\lcsnd x)}}}}
 837 \end{array}
 838 \]
 839 What is going on here?  We are using $\exists$ with $\lcbool$ as a tag, so that
 840 we can choose between one of two types in the second element.  In
 841 $\lcfun{case}$ we use $\lcite{\lcfst x}{\dotsb}{\dotsb}$ to discriminate on the
 842 tag, that is, the first element of $x : \tya \vee \tyb$.  If the tag is true,
 843 then we know that the second element is of type $\tya$, and we will apply $f$.
 844 The same applies to the other branch, with $\tyb$ and $g$.
 845
 846 \subsubsection{Naturals}
 847
 848 Now it's time to showcase the power of $\lccon{W}$ types.
 849 \begin{eqnarray*}
 850   \lccon{Nat}  & = & \lcw{b}{\lcbool}{\lcite{b}{\top}{\bot}}  \\
 851   \lccon{zero} & = & \lcfalse \lhd \abs{z}{\lcabsurd z} \\
 852   \lccon{suc}  & = & \abs{n}{(\lctrue \lhd \abs{\_}{n})}  \\
 853   \lcfun{plus} & = & \abs{x\appsp y}{\lcrecz{x}{(\abs{b}{\lcite{b}{\abs{\_\appsp f}{\app{\lccon{suc}}{(\app{f}{\lcunit})}}}{\abs{\_\appsp\_}{y}}})}}
 854 \end{eqnarray*}
 855 Here we encode natural numbers through $\lccon{W}$ types.  Each node contains a
 856 $\lcbool$.  If the $\lcbool$ is $\lcfalse$, then there are no children, since we
 857 have a function $\bot \tyarr \lccon{Nat}$: this is our $0$.  If it's $\lctrue$,
 858 we need one child only: the predecessor.  We recurse giving $\lcsyn{rec}$ a
 859 function that handles the `base case' 0 when the $\lcbool$ is $\lctrue$, and the
 860 inductive case otherwise.
 861
 862 I postpone more complex examples after the introduction of inductive families
 863 and pattern matching, since $\lccon{W}$ types get unwieldy very quickly.
 864
 865 \subsection{Propositional Equality}
 866 \label{sec:propeq}
 867
 868 I can finally introduce one of the central subjects of my project: propositional
 869 equality.
 870
 871 \newcommand{\lceq}[3]{#2 =_{#1} #3}
 872 \newcommand{\lceqz}[2]{#1 = #2}
 873 \newcommand{\lcrefl}[2]{\lccon{refl}_{#1}\appsp #2}
 874 \newcommand{\lcsubst}[3]{\lcfun{subst}\appsp#1\appsp#2\appsp#3}
 875
 876 \begin{center}
 877   \axname{syntax}
 878   \begin{eqnarray*}
 879   \termsyn & ::= & ... \\
 880            &  |  & \lceq{\termsyn}{\termsyn}{\termsyn} \separ \lcrefl{\termsyn}{\termsyn} \separ \lcsubst{\termsyn}{\termsyn}{\termsyn}
 881  \end{eqnarray*}
 882
 883   \axdesc{typing}{\Gamma \vdash \termsyn : \termsyn}
 884
 885   \vspace{0.5cm}
 886
 887   \begin{tabular}{c}
 888     \AxiomC{$\Gamma \vdash \termt : \tya$}
 889     \UnaryInfC{$\Gamma \vdash \lcrefl{\tya}{\termt} : \lceq{\tya}{\termt}{\termt}$}
 890     \DisplayProof
 891   \end{tabular}
 892
 893   \vspace{0.5cm}
 894
 895   \begin{tabular}{c}
 896     \AxiomC{$\Gamma \vdash \termp : \lceq{\tya}{\termt}{\termm}$}
 897     \AxiomC{$\Gamma \vdash \tyb : \tya \tyarr \lcsetz$}
 898     \AxiomC{$\Gamma \vdash \termn : \app{\tyb}{\termt}$}
 899     \TrinaryInfC{$\Gamma \vdash \lcsubst{\termp}{\tyb}{\termn} : \app{\tyb}{\termm}$}
 900     \DisplayProof
 901   \end{tabular}
 902
 903   \vspace{0.5cm}
 904
 905   \axdesc{reduction}{\termsyn \bred \termsyn}
 906   \begin{eqnarray*}
 907     \lcsubst{(\lcrefl{\tya}{\termt})}{\tyb}{\termm} \bred \termm
 908   \end{eqnarray*}
 909 \end{center}
 910
 911 Propositional equality internalises equality as a type.  The only way to
 912 introduce an equality proof is by reflexivity ($\lccon{refl}$).  The eliminator
 913 is in the style of `Leibnitz's law' of equality: `equals can be substituted for
 914 equals' ($\lcfun{subst}$).  The computation rule refers to the fact that every
 915 proof of equality will be a $\lccon{refl}$.  We can use $\neg
 916 (\lceq{\tya}{x}{y})$ to express inequality.
 917
 918 These elements conclude our presentation of a `core' type theory.  For an
 919 extended example of a similar theory in use see Section 6.2 of
 920 \citep{Thompson1991}\footnote{Note that while I attempted to formalise the proof
 921   in Agda, I found a bug in the book!  See the errata for details:
 922   \url{http://www.cs.kent.ac.uk/people/staff/sjt/TTFP/errata.html}.}.  The above
 923 language and examples have been codified in Agda\footnote{More on Agda in the
 924   next section.}, see appendix \ref{app:agda-code}.
 925
 926 \section{A more useful language}
 927 \label{sec:practical}
 928
 929 While our core type theory equipped with $\lccon{W}$ types is very useful
 930 conceptually as a simple but complete language, things get messy very fast,
 931 since handling $\lccon{W}$ types directly is incredibly cumbersome.  In this
 932 section I will present the elements that are usually included in theorem provers
 933 or programming languages to make them usable by mathematicians or programmers.
 934
 935 All the features are presented following the second version of the Agda system
 936 \citep{Norell2007, Bove2009}.  Agda follows a tradition of theorem provers based
 937 on ITT with inductive families and pattern matching.  The closest relative of
 938 Agda, apart from previous versions, is Epigram \citep{McBride2004, EpigramTut}.
 939 Coq is another notable theorem prover based on the same concepts, and the first
 940 and most popular \citep{Coq}.
 941
 942 The main difference between Coq and Agda/Epigram is that the former focuses on
 943 theorem proving, while the latter also wants to be useful as functional
 944 programming languages, while still offering good tools to express
 945 mathematics\footnote{In fact, currently, Agda is used almost exclusively to
 946   express mathematics, rather than to program.}.  This is reflected in a series
 947 of differences that I will not discuss here (most notably the absence of tactics
 948 but better support for pattern matching in Agda/Epigram).
 949
 950 Again, all the examples presented have been codified in Agda, see appendix
 951 \ref{app:agda-code}.
 952
 953 \subsection{Type inference}
 954
 955 The theory I presented is fully explicit in the sense that the user has to
 956 specify every type when forming abstractions, products, etc.
 957
 958 For the programmer used to Hindley-Milner as in Haskell and SML (and for any
 959 human being), this is a great burden. Complete inference is undecidable - which
 960 is hardly surprising considering the role that types play - but partial
 961 inference in the style of \cite{Pierce2000}, also called `bidirectional type
 962 checking', will have to be deployed in a practical system.
 963
 964 Agda gives users an explicit way to indicate which fields should be implicit, by
 965 wrapping them in curly braces in type signatures: $\{A : \lcsetz\} \tyarr
 966 \dotsb$.  It also allows to omit types of arguments altimeter, if they can be
 967 inferred by other arguments: $\{A\} \tyarr (x : A) \tyarr \dotsb$.
 968
 969 \subsection{Inductive families}
 970 \label{sec:inductive-families}
 971
 972 \newcommand{\lcdata}[1]{\lcsyn{data}\appsp #1}
 973 \newcommand{\lcdb}{\ |\ }
 974 \newcommand{\lcind}{\hspace{0.2cm}}
 975 \newcommand{\lcwhere}{\appsp \lcsyn{where}}
 976 \newcommand{\lclist}[1]{\app{\lccon{List}}{#1}}
 977 \newcommand{\lcnat}{\lccon{Nat}}
 978 \newcommand{\lcvec}[2]{\app{\app{\lccon{Vec}}{#1}}{#2}}
 979
 980 Inductive families were first introduced by \cite{Dybjer1991}.  For the reader
 981 familiar with the recent developments present in the GHC compiler for Haskell,
 982 inductive families will look similar to GADTs (Generalised Abstract Data Types)
 983 \citep[Section 7.4.7]{GHC}.
 984 Haskell-style data types provide \emph{parametric polymorphism}, so that we can
 985 define types that range over type parameters:
 986 \begin{lstlisting}
 987 List a = Nil | Cons a (List a)
 988 \end{lstlisting}
 989 In this way we define the \texttt{List} type once while allowing elements to be
 990 of any type.  In Haskell \texttt{List} will be a type constructor of kind
 991 \texttt{* -> *}, while \texttt{Nil :: List a} and \texttt{Cons :: a -> List a ->
 992   List a}\footnote{Note that the \texttt{a}s are implicitly quantified type
 993   variables.}.
 994
 995 Inductive families bring this concept one step further by allowing some of the
 996 parameters to be constrained by constructors.  We call these parameters
 997 `indices'.  For example we might define natural numbers:
 998 \[
 999 \begin{array}{l}
1000 \lcdata{\lcnat} : \lcsetz \lcwhere \\
1001   \begin{array}{l@{\ }l}
1002     \lcind \lccon{zero} &: \lcnat \\
1003     \lcind \lccon{suc}  &: \lcnat \tyarr \lcnat
1004   \end{array}
1005 \end{array}
1006 \]
1007 And then define a family for lists indexed by length:
1008 \[
1009 \begin{array}{l}
1010 \lcdata{\lccon{Vec}}\appsp (A : \lcsetz) : \lcnat \tyarr \lcsetz \lcwhere \\
1011   \begin{array}{l@{\ }l}
1012     \lcind \lccon{nil} &: \lcvec{A}{\lccon{zero}} \\
1013     \lcind \_::\_ &: \{n : \lccon{Nat}\} \tyarr A \tyarr \lcvec{A}{n} \tyarr \lcvec{A}{\app{(\lccon{suc}}{n})}
1014   \end{array}
1015 \end{array}
1016 \]
1017 Note that contrary to the Haskell ADT notation, with inductive families we
1018 explicitly list the types for the type and data constructors.  In $\lccon{Vec}$,
1019 $A$ is a parameter (same across all constructors) while the $\lcnat$ is an
1020 index.  In our syntax, in the $\lcsyn{data}$ declaration, things to the left of
1021 the colon are parameters, while on the right we can define the type of indices.
1022 Also note that the parameters' identifiers will be in scope across all
1023 constructors, while indices' won't.
1024
1025 In the $\lccon{Vec}$ example, when we form a new list the length is
1026 $\lccon{zero}$.  When we append a new element to an existing list of length $n$,
1027 the new list is of length $\app{\lccon{suc}}{n}$, that is, one more than the
1028 previous length.  Once $\lccon{Vec}$ is defined we can do things much more
1029 safely than with normal lists.  For example, we can define an $\lcfun{head}$
1030 function that returns the first element of the list:
1031 \[
1032 \begin{array}{l}
1033 \lcfun{head} : \{A\ n\} \tyarr \lcvec{A}{(\app{\lccon{suc}}{n})} \tyarr A
1034 \end{array}
1035 \]
1036 Note that we will not be able to provide this function with a
1037 $\lcvec{A}{\lccon{zero}}$, since it needs an index which is a successor of some
1038 number.
1039
1040 \newcommand{\lcfin}[1]{\app{\lccon{Fin}}{#1}}
1041
1042 If we wish to index a $\lccon{Vec}$ safely, we can define an inductive family
1043 $\lcfin{n}$, which represents the family of numbers smaller than a given number
1044 $n$:
1045 \[
1046 \begin{array}{l}
1047   \lcdata{\lccon{Fin}} : \lcnat \tyarr \lcsetz \lcwhere \\
1048   \begin{array}{l@{\ }l}
1049     \lcind \lccon{fzero} &: \{n\} \tyarr \lcfin{(\app{\lccon{suc}}{n})} \\
1050     \lcind \lccon{fsuc}  &: \{n\} \tyarr \lcfin{n} \tyarr \lcfin{(\app{\lccon{suc}}{n})}
1051   \end{array}
1052 \end{array}
1053 \]
1054 $\lccon{fzero}$ is smaller than any number apart from $\lccon{zero}$.  Given
1055 the family of numbers smaller than $n$, we can produce the family of numbers
1056 smaller than $\app{\lccon{suc}}{n}$.  Now we can define an `indexing' operation
1057 for our $\lccon{Vec}$:
1058 \[
1059 \begin{array}{l}
1060 \_\lcfun{!!}\_ : \{A\ n\} \tyarr \lcvec{A}{n} \tyarr \lcfin{n} \tyarr A
1061 \end{array}
1062 \]
1063 In this way we will be sure that the number provided as an index will be smaller
1064 than the length of the $\lccon{Vec}$ provided.
1065
1066 \subsubsection{Computing with inductive families}
1067
1068 I have carefully avoided defining the functions that I mentioned as examples,
1069 since one question still is unanswered: `how do we work with inductive
1070 families'?  The most basic approach is to use eliminators, that can be
1071 automatically provided by the programming language for each data type defined.
1072 For example the induction principle on natural numbers will serve as an
1073 eliminator for $\lcnat$:
1074 \[
1075 \begin{array}{l@{\ } c@{\ } l}
1076   \lcfun{NatInd} & : & (A : \lcsetz) \tyarr \\
1077                  &   & A \tyarr (\lcnat \tyarr A \tyarr A) \tyarr \\
1078                  &   & \lcnat \tyarr A
1079 \end{array}
1080 \]
1081
1082 That is, if we provide an $A$ (base case), and if given a number and an $A$ we
1083 can provide the next $A$ (inductive step), then an $A$ can be computed for every
1084 number.  This can be expressed easily in Haskell:
1085 \begin{lstlisting}
1086 data Nat = Zero | Suc Nat
1087 natInd :: a -> (Nat -> a -> a) -> Nat -> a
1088 natInd z _  Zero   = z
1089 natInd z f (Suc n) = f n (natInd z f n)
1090 \end{lstlisting}
1091 However, in a dependent setting, we can do better:
1092 \[
1093 \begin{array}{l@{\ } c@{\ } l}
1094   \lcfun{NatInd} & : & (P : \lcnat \tyarr \lcsetz) \tyarr \\
1095                  &   & \app{P}{\lccon{zero}} \tyarr ((n : \lcnat) \tyarr \app{P}{n} \tyarr \app{P}{(\app{\lccon{suc}}{n})}) \tyarr \\
1096                  &   & (n : \lcnat) \tyarr \app{P}{n}
1097 \end{array}
1098 \]
1099 This expresses the fact that the resulting type can be dependent on the number.
1100 In other words, we are proving that $P$ holds for all $n : \lcnat$.
1101
1102 Naturally a reduction rule will be associated with each eliminator:
1103 \[
1104 \begin{array}{l c l}
1105 \app{\app{\app{\app{\lcfun{NatInd}}{P}}{z}}{f}}{\lccon{zero}} & \bred & z \\
1106 \app{\app{\app{\app{\lcfun{NatInd}}{P}}{z}}{f}}{(\app{\lccon{suc}}{n})} & \bred & \app{\app{f}{n}}{(\app{\app{\app{\app{\lcfun{NatInd}}{P}}{z}}{f}}{n})}
1107 \end{array}
1108 \]
1109 Which echoes the \texttt{natInd} function defined in Haskell.  An extensive
1110 account on combinators and inductive families can be found in \citep{McBride2004}.
1111
1112 \subsubsection{Pattern matching and guarded recursion}
1113
1114 However, combinators are far more cumbersome to use than the techniques usually
1115 employed in functional programming: pattern matching and recursion.
1116 \emph{General} recursion (exemplified by the $\lcsyn{fix}$ combinator in section
1117 \ref{sec:stlc}) cannot be added if we want to keep our theory free of $\bot$.
1118 The common solution to this problem is to allow recursive calls only if the
1119 arguments are structurally smaller than what the function received.  For
1120 example, if we have a $\lccon{Tree}$ family with a $\lccon{node}\appsp l \appsp
1121 r$ constructor, functions that work on $\lccon{Tree}$ will be able to make
1122 recursive calls on $l$ and $r$.
1123
1124 Pattern matching on the other hand gains considerable power with inductive
1125 families, since when we match a constructor we are gaining information on the
1126 indices of the family.  Thus matching constructors will enable us to restrict
1127 patterns of other arguments.
1128
1129 Following this discipline defining $\lcfun{head}$ becomes easy:
1130 \[
1131 \begin{array}{l}
1132 \lcfun{head} : \{A\ n\} \tyarr \lcvec{A}{(\app{\lccon{suc}}{n})} \tyarr A \\
1133 \lcfun{head}\appsp(x :: \_) = x
1134 \end{array}
1135 \]
1136 We need no case for $\lccon{nil}$, since the type checker knows that the
1137 equation $n = \lccon{zero} = \app{\lccon{suc}}{n'}$ cannot be satisfied.
1138
1139 More details on the implementations of inductive families can be found in
1140 \citep{McBride2004, Norell2007}.
1141
1142 \subsection{Friendlier universes}
1143
1144 Universes as presented in section \ref{sec:core-tt} are quite annoying to work
1145 with.  Consider the example where we define an inductive family for booleans and
1146 then we want to define an `if then else' eliminator:
1147 \[
1148 \begin{array}{l}
1149 \lcdata{\lcbool} : \lcsetz \lcwhere \\
1150   \begin{array}{l@{\ }l}
1151     \lcind \lccon{true}  &: \lcbool \\
1152     \lcind \lccon{false} &: \lcbool
1153   \end{array}\\
1154 \\
1155 \lcfun{ite} : \{A : \lcset{0}\} \tyarr \lcbool \tyarr A \tyarr A \tyarr A \\
1156 \lcfun{ite} \appsp \lccon{true} \appsp x \appsp \_ = x \\
1157 \lcfun{ite} \appsp \lccon{false} \appsp \_ \appsp x = x
1158 \end{array}
1159 \]
1160
1161 What if we want to use $\lcfun{ite}$ with types, for example $\lcfun{ite} \appsp
1162 b \appsp \lcnat \appsp \lccon{Bool}$?  Clearly $\lcnat$ is not of type
1163 $\lcset{0}$, so we'd have to redefine $\lcfun{ite}$ with exactly the same body,
1164 but different type signature.  The problem is even more evident with type
1165 families: for example our $\lccon{List}$ can only contain values, and if we want
1166 to build lists of types we need to define another identical, but differently
1167 typed, family.
1168
1169 One option is to have a \emph{cumulative} theory, where $\lcset{n} : \lcset{m}$
1170 iff $n < m$.  Then we can have a sufficiently large level in our type signature
1171 and forget about it.  Moreover, levels in this setting can be inferred
1172 mechanically \citep{Pollack1990}, and thus we can lift the burden of specifying
1173 universes from the user.  This is the approach taken by Epigram.
1174
1175 Another more expressive (but currently more cumbersome) way is to expose
1176 universes more, giving the user a way to quantify them and to take the least
1177 upper bound of two levels.  This is the approach taken by Agda, where for
1178 example we can define a level-polymorphic $\times$:
1179 \[
1180 \begin{array}{l}
1181 \lcdata{\_\times\_}\ \{a\ b : \lccon{Level}\}\ (A : \lcset{a})\ (B : A \tyarr \lcset{b}) : \lcset{a \sqcup b} \lcwhere \\
1182   \lcind \_ , \_ : (x : A) \tyarr \app{B}{x} \tyarr A \times \app{B}{x}
1183 \end{array}
1184 \]
1185 Levels can be made implicit as shown and can be almost always inferred.
1186 However, having to decorate each type signature with quantified levels adds
1187 quite a lot of noise.  An inference algorithm that automatically quantifies and
1188 instantiates levels (much like Hindley-Milner for types) seems feasible, but is
1189 currently not implemented anywhere.  The ideal situation would be polymorphic,
1190 cumulative levels, with an easy way to omit explicit treatment of levels unless
1191 in the (possibly few) cases where the inference algorithm breaks down.
1192
1193 \subsection{Coinduction}
1194
1195 One thing that the Haskell programmer will miss in ITT as presented is the
1196 possibility to work with infinite data.  The classic example is the manipulation
1197 of infinite streams:
1198 \begin{lstlisting}
1199 zipWith :: (a -> b -> c) -> [a] -> [b] -> [c]
1200
1201 fibs :: [Int]
1202 fibs = 0 : 1 : zipWith (+) fibs (tail fibs)
1203 \end{lstlisting}
1204 While we can clearly write useful programs of this kind, we need to be careful,
1205 since \texttt{length fibs}, for example, does not make much sense\footnote{Note
1206   that if instead of machine \texttt{Int}s we used naturals as defined
1207   previously, getting the length of an infinite list would be a productive
1208   definition.}.
1209
1210 In less informal terms, we need to distinguish between \emph{productive} and
1211 non-productive definitions.  A productive definition is one for which pattern
1212 matching on data will never diverge, which is the case for \texttt{fibs} but
1213 not, for example, for \texttt{let x = x in x :: [Int]}.  It is very desirable to
1214 recover \emph{only} the productive definition so that total programs working
1215 with infinite data can be written.
1216
1217 This desire has lead to separate the notion of (finite) data and \emph{codata},
1218 which can be worked on by \emph{coinduction} - an overview is given in
1219 \citep{Jacobs1997}.  Research is very active on this subject since coinduction
1220 as implemented by Coq and Agda is not satisfactory in different ways.
1221
1222 \section{Many equalities}
1223 \label{sec:equality}
1224
1225 \subsection{Revision, and function extensionality}
1226 \label{sec:fun-ext}
1227
1228 \epigraph{\emph{Half of my time spent doing research involves thinking up clever
1229   schemes to avoid needing functional extensionality.}}{@larrytheliquid}
1230
1231 Up to this point, I have introduced two equalities: \emph{definitional} equality
1232 ($\defeq$) and \emph{propositional} equality ($=_{\tya}$).
1233
1234 Definitional equality relates terms that the type checker identifies as equal.
1235 The definition in section \ref{sec:core-tt} consisted of `reduce the two terms
1236 to their normal forms, then check if they are equal (up to $\alpha$-renaming)'.
1237 We can extend this in other ways so that more terms will be identified as equal.
1238 Common tricks include doing $\eta$-expansion, which converts partial appications
1239 of functions, and congruence laws for $\lambda$:
1240 \begin{center}
1241   \begin{tabular}{c c c}
1242     \AxiomC{$\Gamma \vdash \termf : \tya \tyarr \tyb$}
1243     \UnaryInfC{$\Gamma \vdash \termf \defeq \abs{x}{\app{\termf}{x}}$}
1244     \DisplayProof
1245     &
1246     \AxiomC{$\Gamma; x : \tya \vdash \termf \defeq g$}
1247     \UnaryInfC{$\Gamma \vdash \abs{x}{\termf} \defeq \abs{x}{g}$}
1248     \DisplayProof
1249   \end{tabular}
1250 \end{center}
1251 Definitional equality is used in the type checking process, and this means that
1252 in dependently typed languages type checking and evaluation are intertwined,
1253 since we need to reduce types to normal forms to check if they are equal.  It
1254 also means that we need be able to compute under binders, for example to
1255 determine that $\abs{x}{\app{(\abs{y}{y})}{\tya}} \defeq \abs{x}{\tya}$.  This
1256 process goes on `under the hood' and is outside the control of the user.
1257
1258 Propositional equality, on the other hand, is available to the user to reason
1259 about equality, internalising it as a type.  As we have seen in section
1260 \ref{sec:propeq} propositional equality is introduced by reflexivity and
1261 eliminated with a `Leibnitz's law' style rule ($\lcfun{subst}$).  Note that now
1262 that we have inductive families and dependent pattern matching we do not need
1263 hard coded rules to express this concepts\footnote{Here I use Agda notation, and
1264   thus I cannot redefine $=$ and use subscripts, so I am forced to use $\equiv$
1265   with implicit types.  After I will carry on using the old notation.}:
1266 \[
1267 \begin{array}{l}
1268   \lcdata{\lccon{\_\equiv\_}} : \{A : \lcsetz\} : A \tyarr A \tyarr \lcsetz \lcwhere \\
1269   \begin{array}{l@{\ }l}
1270     \lcind \lccon{refl} &: \{x : A\} \tyarr x \equiv x
1271   \end{array} \\
1272   \\
1273   \lcfun{subst} : \{A : \lcsetz\} \tyarr \{t\ m : A\} \tyarr t \equiv m \tyarr (B : A \tyarr \lcsetz) \tyarr \app{B}{t} \tyarr \app{B}{m} \\
1274   \lcfun{subst} \appsp \lccon{refl}\appsp B \appsp n = n
1275 \end{array}
1276 \]
1277 Here matching $\lccon{refl}$ tells the type checker that $t \defeq m$, and thus
1278 $\app{B}{t} \defeq \app{B}{m}$, so we can just return $n$.  This shows the
1279 connection between type families indices and propositional equality, also
1280 highlighted in \citep{McBride2004}.
1281
1282 It is worth noting that all $\lcfun{subst}$s, in ITT, are guaranteed to reduce
1283 at the top level.  This is because $\lccon{refl}$ is the only constructor for
1284 propositional equality, and thus without false assumptions every closed proof
1285 will have that shape.  Extending this idea to other types, in ITT, at the top
1286 level, expressions have \emph{canonical} values - a property known as
1287 \emph{canonicity}.  We call canonical those values formed by canonical
1288 constructors, that do not appear on the left of $\bred$: $\lambda$, $(,)$,
1289 $\lhd$, etc.  In other words a value is canonical if it's not something that is
1290 supposed to reduce (an eliminator) but is stuck on some variable.
1291
1292 While propositional equality is a very useful construct, we can prove less terms
1293 equal than we would like to.  For example, if we have the usual functions
1294 \[
1295 \begin{array}{l@{\ } l}
1296 \lcfun{id} & : \{A : \lcsetz\} \tyarr A \tyarr A \\
1297 \lcfun{map} & : \{A\ B : \lcsetz\} \tyarr (A \tyarr B) \tyarr \app{\lccon{List}}{A} \tyarr \app{\lccon{List}}{B}
1298 \end{array}
1299 \]
1300 we would certainly like to have a term of type
1301 \[\{A\ B : \lcsetz\} \tyarr
1302 \app{\lcfun{map}}{\lcfun{id}} =_{\lclist{A} \tyarr \lclist{B}} \lcfun{id}\]
1303 We cannot do this in ITT, since the things on the sides of $=$ look too
1304 different at the term level.  What we can have is
1305 \[
1306 \{A\ B : \lcsetz\} \tyarr (x : \lclist{A}) \tyarr \app{\app{\lcfun{map}}{\lcfun{id}}}{x}
1307 =_{\lclist{B}} \app{\lcfun{id}}{x}
1308 \]
1309 Since the type checker has something to compute with (even if only a variable)
1310 and reduce the two lists to equal normal forms.  The root of this problem is
1311 that \emph{function extensionality} does not hold:
1312 \[
1313 \begin{array}{l@{\ } l}
1314 \lcfun{ext} : & \{A\ B : \lcsetz\} \tyarr (f\ g : A \tyarr B) \tyarr \\
1315               & ((x : A) \tyarr \app{f}{x} =_{A} \app{g}{x}) \tyarr f =_{A \tyarr B} g
1316 \end{array}
1317 \]
1318 Which is a long standing issue in ITT.
1319
1320 \subsection{Extensional Type Theory and its problems}
1321
1322 One way to `solve' the problem and gain extensionality is to allow the type
1323 checker to derive definitional equality from propositional equality, introducing
1324 what is known as the `equality reflection' rule:
1325 \begin{center}
1326   \begin{prooftree}
1327     \AxiomC{$\Gamma \vdash t =_A m$}
1328     \UnaryInfC{$\Gamma \vdash t \defeq m$}
1329   \end{prooftree}
1330 \end{center}
1331 This jump from types to a metatheoretic relation has deep consequences.
1332 Firstly, let's get extensionality out of the way.  Given $\Gamma = \Gamma';
1333 \lcfun{eq} : (x : A) \tyarr \app{f}{x} = \app{g}{x}$, we have:
1334 \begin{center}
1335   \begin{prooftree}
1336     \AxiomC{$\Gamma; x : A \vdash \app{\lcfun{eq}}{x} : \app{f}{x} = \app{g}{x}$}
1337     \RightLabel{(equality reflection)}
1338     \UnaryInfC{$\Gamma; x : A \vdash \app{f}{x} \defeq \app{g}{x}$}
1339     \RightLabel{(congruence for $\lambda$)}
1340     \UnaryInfC{$\Gamma \vdash \abs{x : A}{\app{f}{x}} \defeq \abs{x : A}{\app{g}{x}}$}
1341     \RightLabel{($\eta$-law)}
1342     \UnaryInfC{$\Gamma \vdash f \defeq g$}
1343     \RightLabel{($\lccon{refl}$)}
1344     \UnaryInfC{$\Gamma \vdash f = g$}
1345   \end{prooftree}
1346 \end{center}
1347 Since the above is possible, theories that include the equality reflection rule
1348 are often called `Extensional Type Theories', or ETTs.  A notable exponent of
1349 this discipline is the NuPRL system \citep{NuPRL}.  Moreover, equality
1350 reflection simplifies $\lcfun{subst}$-like operations, since if we have $t = m$
1351 and $\app{A}{t}$, then by equality reflection clearly $\app{A}{t} \defeq
1352 \app{A}{m}$.
1353
1354 However equality reflection comes at a great price.  The first hit is that it is
1355 now unsafe to compute under binders, since we can send the type checker into a
1356 loop given that under binders we can have any equality proof we might want
1357 (think about what happens with $\abs{x : \bot}{\dotsb}$), and thus we can
1358 convince the type checker that a term has any type we might want.  Secondly,
1359 equality reflection is not syntax directed, thus the type checker would need to
1360 `invent' proofs of propositional equality to prove two terms definitionally
1361 equal, rendering the type checking process undecidable.  Thus the type checker
1362 needs to carry complete derivations for each typing judgement, instead of
1363 relying on terms only.
1364
1365 This is troubling if we want to retain the good computational behaviour of ITT
1366 and has kept languages like Agda, Epigram, and Coq from adopting the equality
1367 reflection rule.
1368
1369 \subsection{Observational equality}
1370
1371 \newcommand{\lcprop}{\lccon{Prop}}
1372 \newcommand{\lcdec}[1]{\llbracket #1 \rrbracket}
1373 \newcommand{\lcpropsyn}{\mathit{prop}}
1374 \newcommand{\lcpropf}[3]{\forall #1 : #2.\appsp #3}
1375 \newcommand{\lcparr}{\Rightarrow}
1376
1377 A recent development by \citet{Altenkirch2007} promises to keep the well
1378 behavedness of ITT while being able to gain many useful equality
1379 proofs\footnote{It is suspected that Observational Type Theory gains \emph{all}
1380   the equality proofs of ETT, but no proof exists yet.}, including function
1381 extensionality.  The main idea is to give the user the possibility to
1382 \emph{coerce} (or transport) values from a type $A$ to a type $B$, if the type
1383 checker can prove structurally that $A$ and $B$ are equal; and providing a
1384 value-level equality based on similar principles.
1385
1386 Starting from a theory similar to the one presented in section \ref{sec:itt} but
1387 with only $\lcset{0}$ and without propositional equality, a propositional
1388 subuniverse of $\lcsetz$ is introduced, plus a `decoding' function $\lcdec{\_}$:
1389
1390 \begin{center}
1391   \axname{syntax}
1392   $$
1393   \begin{array}{rcl}
1394     \termsyn   & ::= & \dotsb \separ \lcprop \separ \lcdec{\lcpropsyn} \\
1395     \lcpropsyn & ::= & \bot \separ \top \separ \lcpropsyn \wedge \lcpropsyn \separ \lcpropf{x}{\termsyn}{\lcpropsyn}
1396   \end{array}
1397   $$
1398
1399   \axname{typing}
1400
1401   \vspace{0.5cm}
1402
1403   \begin{tabular}{c c c}
1404     \AxiomC{}
1405     \UnaryInfC{$\Gamma \vdash \lcprop : \lcsetz$}
1406     \DisplayProof
1407     &
1408     \AxiomC{}
1409     \UnaryInfC{$\Gamma \vdash \bot : \lcprop$}
1410     \noLine
1411     \UnaryInfC{$\Gamma \vdash \top : \lcprop$}
1412     \DisplayProof
1413     &
1414     \AxiomC{$\Gamma \vdash P : \lcprop$}
1415     \AxiomC{$\Gamma \vdash Q : \lcprop$}
1416     \BinaryInfC{$\Gamma \vdash P \wedge Q : \lcprop$}
1417     \DisplayProof
1418   \end{tabular}
1419
1420   \vspace{0.5cm}
1421
1422   \begin{tabular}{c c}
1423     \AxiomC{$\Gamma \vdash S : \lcsetz$}
1424     \AxiomC{$\Gamma \vdash Q : \lcprop$}
1425     \BinaryInfC{$\Gamma \vdash \lcpropf{x}{S}{Q} : \lcprop$}
1426     \DisplayProof
1427     &
1428     \AxiomC{$\Gamma \vdash P : \lcprop$}
1429     \UnaryInfC{$\Gamma \vdash \lcdec{P} : \lcsetz$}
1430     \DisplayProof
1431   \end{tabular}
1432
1433   \vspace{0.5cm}
1434
1435   \axdesc{reduction}{\termsyn \bred \termsyn}
1436   \begin{eqnarray*}
1437     \lcdec{\bot}              & \bred & \bot \\
1438     \lcdec{\top}              & \bred & \top \\
1439     \lcdec{P \wedge Q}        & \bred & \lcdec{P} \times \lcdec{Q} \\
1440     \lcdec{\lcpropf{x}{S}{P}} & \bred & (x : S) \tyarr \lcdec{P}
1441   \end{eqnarray*}
1442 \end{center}
1443 I will use $P \lcparr Q$ as an abbreviation for $\lcpropf{\_}{P}{Q}$.  Note that
1444 $\lcprop$ has no `data', but only proofs.  This has the consequence that when
1445 compiling code using $\lcprop$ it will be safe to erase every $\lcprop$, as long
1446 as we don't  compute under binders - and we most likely don't need to compute
1447 under binders after we type checked and we just need to run the code.  Another
1448 consequence of the fact that $\lcprop$ is irrelevant computationally is that we
1449 can extend $\lcprop$ with other axioms while retaining canonicity.  Note that
1450 this works as long as $\lcdec{\bot}$ is unhabited: if we add inconsistent axioms
1451 then we can `pull' data out of $\lcprop$.
1452
1453 \newcommand{\lccoe}[4]{\lcfun{coe}\appsp#1\appsp#2\appsp#3\appsp#4}
1454 \newcommand{\lccoh}[4]{\lcfun{coh}\appsp#1\appsp#2\appsp#3\appsp#4}
1455
1456 Once we have $\lcprop$, we can define the operations that will let us form
1457 equalities and coerce between types and values:
1458 \begin{center}
1459   \axname{typing}
1460
1461   \vspace{0.5cm}
1462
1463   \begin{tabular}{c c}
1464     \AxiomC{$\Gamma \vdash S : \lcsetz$}
1465     \AxiomC{$\Gamma \vdash T : \lcsetz$}
1466     \BinaryInfC{$\Gamma \vdash S = T : \lcprop$}
1467     \DisplayProof
1468     &
1469     \AxiomC{$\Gamma \vdash Q : \lcdec{S = T}$}
1470     \AxiomC{$\Gamma \vdash s : S$}
1471     \BinaryInfC{$\Gamma \vdash \lccoe{S}{T}{Q}{s} : T$}
1472     \DisplayProof
1473   \end{tabular}
1474
1475   \vspace{0.5cm}
1476
1477   \begin{tabular}{c c}
1478     \AxiomC{$\Gamma \vdash s : S$}
1479     \AxiomC{$\Gamma \vdash t : T$}
1480     \BinaryInfC{$\Gamma \vdash (s : S) = (t : T) : \lcprop$}
1481     \DisplayProof
1482     &
1483     \AxiomC{$\Gamma \vdash Q : \lcdec{S = T}$}
1484     \AxiomC{$\Gamma \vdash s : S$}
1485     \BinaryInfC{$\Gamma \vdash \lccoh{S}{T}{Q}{s} : \lcdec{(s : S) = (\lccoe{S}{T}{Q}{s})}$}
1486     \DisplayProof
1487   \end{tabular}
1488 \end{center}
1489
1490 In the first row, $=$ forms equality between types, and $\lcfun{coe}$ (`coerce')
1491 transports values of equal types.  On the second row, $=$ forms equality between
1492 values, and $\lcfun{coh}$ (`coherence') guarantees that all equalities are
1493 really between equal things.  Note that the value equality is quite different
1494 from the propositional equality that we are used to, since it allows equality
1495 between arbitrary types.  This kind of equality is called `heterogeneous'
1496 equality and was introduced by \cite{McBride1999}.  Now the tricky part is to
1497 define reduction rules to reduce the proofs of equality to something
1498 $\lcdec{\_}$ can work with, being careful that proofs of equality will exist
1499 only between equal things.
1500
1501 Let's start with type-level $=$:
1502 $$
1503 \begin{array}{r@{\ } c@{\ } l c l}
1504   \bot & = & \bot       & \bred & \top \\
1505   \top & = & \top       & \bred & \top \\
1506   \lcbool & = & \lcbool & \bred & \top \\
1507   (s : S) \times T & = & (s' : S') \times T'  & \bred & \\
1508   \multicolumn{5}{l}{
1509     \hspace{0.8cm}
1510     S = S' \wedge \lcpropf{s}{S}{\lcpropf{s'}{S'}{(s : S) = (s' : S') \lcparr T[x] = T'[x']}}
1511   } \\
1512   (s : S) \tyarr T & = & (s' : S') \tyarr T'  & \bred & \\
1513     \multicolumn{5}{l}{
1514       \hspace{0.8cm}
1515       S' = S \wedge \lcpropf{s'}{S'}{\lcpropf{s}{S}{(s' : S') = (s : S) \lcparr T[x] = T'[x']}}
1516     } \\
1517   \lcw{s}{S}{T} & = & \lcw{s'}{S'}{T'}  & \bred & \\
1518     \multicolumn{5}{l}{
1519       \hspace{0.8cm}
1520       S = S' \wedge \lcpropf{s}{S}{\lcpropf{s'}{S'}{(s : S) = (s' : S') \lcparr T'[x'] = T[x]}}
1521     } \\
1522   S & = & T & \bred & \bot\ \text{for other canonical types}
1523 \end{array}
1524 $$
1525 The rule for $\times$ is unsurprising: it requires the left types to be equal,
1526 and the right types to be equal when the left values are equal.  The rules for
1527 $\tyarr$ and $\lccon{W}$ are similar but with some twists to make the rules for
1528 $\lcfun{coe}$ simpler:
1529 $$
1530 \begin{array}{r@{} l@{} l@{} l c l}
1531   \lccoe{&\bot}{&\bot}{&Q}{z} & \bred & z \\
1532   \lccoe{&\top}{&\top}{&Q}{u} & \bred & u \\
1533   \lccoe{&\lcbool}{&\lcbool}{&Q}{b} & \bred & b \\
1534   \lccoe{&((s : S) \times T)}{&((s' : S') \times T')}{&Q}{p} & \bred & \\
1535   \multicolumn{6}{l}{
1536       \lcind
1537       \begin{array}{l@{\ } l@{\ } c@{\ } l@{\ }}
1538         \lcsyn{let} & s   & \mapsto & \lcfst p : S \\
1539                     & t   & \mapsto & \lcsnd p : T[s] \\
1540                     & Q_S & \mapsto & \lcfst Q : \lcdec{S = S'} \\
1541                     & s'  & \mapsto & \lccoe{S}{S'}{Q}{s} : S' \\
1542                     & Q_T & \mapsto & \lcsnd Q \appsp s \appsp s' \appsp (\lccoh{S}{S'}{Q_S}{s}) : \lcdec{T[s] = T[s']} \\
1543                     & t'  & \mapsto & \lccoe{T[t]}{T'[s']}{Q_T}{t} : T'[s'] \\
1544         \multicolumn{4}{l}{\lcsyn{in}\ (s', t')}
1545       \end{array}
1546     }\\
1547   \lccoe{&((x : S) \tyarr T)}{&((x' : S') \tyarr T')}{&Q}{f} & \bred & \dotsb \\
1548   \lccoe{&(\lcw{x}{S}{T})}{&(\lcw{x'}{S'}{T'})}{&Q}{(s \lhd f)} & \bred & \dotsb \\
1549   \lccoe{&S}{&T}{&Q}{x} & \bred & \lcabsurdd{T}{Q}
1550 \end{array}
1551 $$
1552 The rule for $\times$ is hairy but straightforward: we are given a $(s, t) : (x
1553 : S) \times T$ and, given the reduction rules specified before, a $$Q : \lcdec{S
1554   = S'} \times ((s : S) \tyarr (s' : S') \tyarr \lcdec{(s : S) = (s' : S')}
1555 \tyarr \lcdec{T[x] = T'[x']})$$ We need to obtain a $(x' : S') \times T'$.  We
1556 can easily get the left part with the left of $Q$ and a coercion, and the right
1557 with the help of $\lcfun{coh}$ and the right of $Q$.  The rules for the other
1558 binders are similar but not reproduced here for brevity.  Note that
1559 $\lcfun{coe}$ computes on $Q$ and on the value very lazily, never matching on
1560 the pairs, meaning that $\lcfun{coe}$ will always reduce when working with
1561 canonical values and types.
1562
1563 Since $\lcfun{coh}$ is a propositional axiom, we can leave it without reduction
1564 rules.  Now we are left with value equality:
1565 $$
1566 \begin{array}{r@{\ } c@{\ } l c l}
1567   (z : \bot) &=& (z' : \bot) & \bred & \top \\
1568   (u : \top) &=& (u' : \top) & \bred & \top \\
1569   (\lctrue : \lcbool) &=& (\lctrue : \lcbool) & \bred & \top \\
1570   (\lctrue : \lcbool) &=& (\lcfalse : \lcbool) & \bred & \top \\
1571   (\lcfalse : \lcbool) &=& (\lctrue : \lcbool) & \bred & \bot \\
1572   (\lcfalse : \lcbool) &=& (\lcfalse : \lcbool) & \bred & \top \\
1573   (f : (s : S) \tyarr T) &=& (f' : (s' : S') \tyarr T) & \bred & \\
1574   \multicolumn{5}{l}{
1575     \hspace{1cm}
1576     \lcpropf{s}{S}{\lcpropf{s'}{S'}{(s : S) = (s' : S') \lcparr (\app{f}{s} : T[s]) = (\app{f'}{s'} : T'[s'])}}
1577   }\\
1578   (p : (s : S) \times T) &=& (p : (s' : S') \times T') & \bred & \\
1579   \multicolumn{5}{l}{
1580     \hspace{1cm}
1581     (\lcfst p : S) = (\lcfst p' : S') \wedge (\lcsnd p : T[\lcfst p]) = (\lcsnd p' : T'[\lcfst p'])
1582   }\\
1583   (s \lhd f : \lcw{s}{S}{T}) &=& (s' \lhd f' : \lcw{s'}{S'}{T'}) & \bred & \\
1584   \multicolumn{5}{l}{
1585     \hspace{1cm}
1586     (s : S) = (s' : S') \wedge (\lcpropf{t}{T[s]}{\lcpropf{t'}{T'[s']}{(t : T[s]) = (t' : T'[s']) \lcparr}}
1587   }\\
1588   \multicolumn{5}{l}{
1589     \hspace{4.5cm}
1590       (\app{f}{s} : \lcw{s}{S}{T}) = (\app{f'}{s'} : \lcw{s}{S'}{T'}))
1591   }\\
1592   (s : S) &=& (s' : S') & \bred & \bot\ \text{for other canonical types}
1593 \end{array}
1594 $$
1595 Functions are equal if they take equal inputs to equal outputs, which satisfies
1596 our need of extensionality.  $\times$ and $\lccon{W}$ work similiarly.  The
1597 authors called this kind of equality `observational' since it computes based on
1598 the structure of the values, and theories implementing it take the name
1599 `Observational Type Theories' (OTTs).  For lack of space, I have glossed over
1600 many details explained in the paper showing why OTT works, but hopefully my
1601 explanaition should give an idea on how the pieces fit together - the reader
1602 should refer to \citep{Altenkirch2007} for the complete story.
1603
1604 \section{What to do}
1605
1606 My goal is to advance the practice of OTT.  Conor McBride and other
1607 collaborators already implemented OTT and other measures as part of efforts
1608 towards a new version of Epigram\footnote{Available at
1609   \url{http://www.e-pig.org/darcs/Pig09/web/}.}.  However development is stale
1610 and it is not clear when and if Epigram 2 will be released.
1611
1612 The first thing that would be very useful to do is to have a small, core
1613 language that supports OTT, on which a more expressive language can be built.
1614 This follows an established tradition in functional programming and theorem
1615 proving of layering several languages of increasing complexity, so that type
1616 checking at a lower level is much simpler and less prone to bugs.  For example
1617 GHC Haskell uses an internal language, System F\textsubscript{C}
1618 \citep{Sulzmann2007}, which includes only minimal features compared to Haskell.
1619
1620 I have already implemented a type theory as described in section
1621 \ref{sec:itt}\footnote{Available at \url{https://github.com/bitonic/mfixed}.} to
1622 make myself comfortable with how such systems work.  From that, a good starting
1623 point to build something more useful could be $\Pi\Sigma$
1624 \citep{Altenkirch2007}, a core partial language, devised to be a good target
1625 for high level languages like Agda, with facilities to implement inductive
1626 families and corecursion.  Starting to think about how OTT would work in such a
1627 language is my immediate goal.
1628
1629 If these attempts are successful, I can work towards understanding what a higher
1630 level language would look like and how it would be `elaborated' to the lower
1631 level core theory.  Epigram 2 can certainly be an inspiration, although it
1632 employs a sophisticated reflection system \citep{Chapman2010} that I do not plan
1633 to reproduce.  In any case, there are many things to think about, with the most
1634 salients point being how to treat inductive families and pattern matching,
1635 corecursion, and what a friendly interface for OTT would be from the
1636 programmer/mathematician's perspective (and whether it's feasible to always
1637 infer coercions and/or equality proofs automatically).
1638
1639 Interestingly, the mentioned System F\textsubscript{C} was introduced in GHC 7
1640 to be able to implement GADTs, which as said in section
1641 \ref{sec:inductive-families} bear many similarities to inductive families.  To
1642 do that it uses a system of coercions that is not too far away from OTT.  This
1643 is not a coincidence, since indices and propositional equality are often
1644 connected, and offers a lot of food for thought.  For instance, GHC
1645 automatically generates and applies equality proofs, and the inference engine
1646 that serves this purpose is quite complex \citep{Vytiniotis2011} and often very
1647 confusing to the user; we would like to have a less `magic' but clearer system.
1648
1649 \bibliographystyle{authordate1}
1650 \bibliography{InterimReport}
1651
1652 \appendix
1653 \section{Agda code}
1654 \label{app:agda-code}
1655
1656 \lstset{
1657   xleftmargin=0pt
1658 }
1659
1660 \lstinputlisting{InterimReport.agda}
1661
1662 \end{document}