docs/background.tex

   1 \documentclass[article, a4paper]{article}
   2 \usepackage[T1]{fontenc}
   3 \usepackage{fixltx2e}
   4 \usepackage{enumitem}
   5 \usepackage[normalem]{ulem}
   6 \usepackage{graphicx}
   7 \usepackage{subcaption}
   8
   9 % Unicode
  10 \usepackage{ucs}
  11 \usepackage[utf8x]{inputenc}
  12 % \usepackage{autofe}
  13
  14 %% -----------------------------------------------------------------------------
  15 %% Fonts
  16 %% \usepackage[osf]{libertine}
  17 %% \usepackage{helvet}
  18 %% \usepackage{lmodern}
  19 %% \IfFileExists{microtype.sty}{\usepackage{microtype}}{}
  20
  21 %% -----------------------------------------------------------------------------
  22 %% Diagrams
  23 % \usepackage{tikz}
  24 % \usetikzlibrary{positioning}
  25 %% \usetikzlibrary{shapes}
  26 %% \usetikzlibrary{arrows}
  27 %% \usepackage{tikz-cd}
  28 %% \usepackage{pgfplots}
  29 \usepackage[all]{xy}
  30
  31 %% -----------------------------------------------------------------------------
  32 %% Symbols
  33 \usepackage[fleqn]{amsmath}
  34 \usepackage{amssymb}
  35 \usepackage{wasysym}
  36 \usepackage{turnstile}
  37 \usepackage{centernot}
  38
  39 %% -----------------------------------------------------------------------------
  40 %% Utils
  41 \usepackage{bussproofs}
  42 \usepackage{umoline}
  43 \usepackage[export]{adjustbox}
  44 \usepackage{multicol}
  45 \usepackage{listings}
  46 \lstset{
  47   basicstyle=\small\ttfamily,
  48   extendedchars=\true,
  49   inputencoding=utf8x,
  50   xleftmargin=1cm
  51 }
  52
  53 %% -----------------------------------------------------------------------------
  54 %% Bibtex
  55 \usepackage{natbib}
  56
  57 %% Numbering depth
  58 %% \setcounter{secnumdepth}{0}
  59
  60 %% -----------------------------------------------------------------------------
  61 % We will generate all images so they have a width \maxwidth. This means
  62 % that they will get their normal width if they fit onto the page, but
  63 % are scaled down if they would overflow the margins.
  64 \makeatletter
  65 \def\maxwidth{
  66   \ifdim\Gin@nat@width>\linewidth\linewidth
  67   \else\Gin@nat@width\fi
  68 }
  69 \makeatother
  70
  71 %% -----------------------------------------------------------------------------
  72 %% Links
  73 \usepackage[
  74   setpagesize=false, % page size defined by xetex
  75   unicode=false, % unicode breaks when used with xetex
  76   xetex
  77 ]{hyperref}
  78
  79 \hypersetup{
  80   breaklinks=true,
  81   bookmarks=true,
  82   pdfauthor={Francesco Mazzoli <fm2209@ic.ac.uk>},
  83   pdftitle={Observational Equality},
  84   colorlinks=false,
  85   pdfborder={0 0 0}
  86 }
  87
  88 % Make links footnotes instead of hotlinks:
  89 % \renewcommand{\href}[2]{#2\footnote{\url{#1}}}
  90
  91 % avoid problems with \sout in headers with hyperref:
  92 \pdfstringdefDisableCommands{\renewcommand{\sout}{}}
  93
  94 \title{Observational Equality}
  95 \author{Francesco Mazzoli \href{mailto:fm2209@ic.ac.uk}{\nolinkurl{<fm2209@ic.ac.uk>}}}
  96 \date{December 2012}
  97
  98 \begin{document}
  99
 100 \maketitle
 101
 102 \setlength{\tabcolsep}{12pt}
 103
 104 \begin{abstract}
 105 The marriage between programming and logic has been a very fertile one.  In
 106 particular, since the simply typed lambda calculus (STLC), a number of type
 107 systems have been devised with increasing expressive power.
 108
 109 Section \ref{sec:types} I will give a very brief overview of STLC, and then
 110 illustrate how it can be interpreted as a natural deduction system.  Section
 111 \ref{sec:itt} will introduce Inutitionistic Type Theory (ITT), which expands on
 112 this concept, employing a more expressive logic.  The exposition is quite dense
 113 since there is a lot of material to cover; for a more complete treatment of the
 114 material the reader can refer to \citep{Thompson1991, Pierce2002}. Section
 115 \ref{sec:practical} will describe the status of practical programming languages
 116 based on ITT.
 117
 118 Finally, I will explain why equality has been a tricky business in these
 119 theories, and talk about the various attempts have been made to make the
 120 situation better.  One interesting development has recently emerged:
 121 Observational Type theory.  I propose to explore the ways to turn these ideas
 122 into useful practices for programming and theorem proving.
 123 \end{abstract}
 124
 125 \tableofcontents
 126
 127 \section{Simple and not-so-simple types}
 128 \label{sec:types}
 129
 130 \subsection{A note on notation}
 131
 132 \newcommand{\appsp}{\hspace{0.07cm}}
 133 \newcommand{\app}[2]{#1\appsp#2}
 134 \newcommand{\absspace}{\hspace{0.03cm}}
 135 \newcommand{\abs}[2]{\lambda #1\absspace.\absspace#2}
 136 \newcommand{\termt}{t}
 137 \newcommand{\termm}{m}
 138 \newcommand{\termn}{n}
 139 \newcommand{\termp}{p}
 140 \newcommand{\termf}{f}
 141 \newcommand{\separ}{\ \ |\ \ }
 142 \newcommand{\termsyn}{\mathit{term}}
 143 \newcommand{\axname}[1]{\textbf{#1}}
 144 \newcommand{\axdesc}[2]{\axname{#1} \fbox{$#2$}}
 145 \newcommand{\lcsyn}[1]{\mathrm{\underline{#1}}}
 146 \newcommand{\lccon}[1]{\mathsf{#1}}
 147 \newcommand{\lcfun}[1]{\mathbf{#1}}
 148 \newcommand{\tyarr}{\to}
 149 \newcommand{\tysyn}{\mathit{type}}
 150 \newcommand{\ctxsyn}{\mathit{context}}
 151 \newcommand{\emptyctx}{\cdot}
 152
 153 In the following sections I will introduce a lot of syntax, typing, and
 154 reduction rules, along with examples.  To make the exposition clearer, usually
 155 the rules are preceded by what the rule looks like and what it shows (for
 156 example \axdesc{typing}{\Gamma \vdash \termsyn : \tysyn}).
 157
 158 In the languages presented I will also use different fonts for different things:
 159 \begin{tabular}{c | l}
 160   $\lccon{Sans}$  & Sans serif, capitalised, for type constructors. \\
 161   $\lccon{sans}$  & Sans serif, not capitalised, for data constructors. \\
 162   $\lcsyn{roman}$ & Roman, underlined, for the syntax of the language. \\
 163   $\lcfun{roman}$ & Roman, bold, for defined functions and values. \\
 164   $math$          & Math mode font for quantified variables and syntax elements.
 165 \end{tabular}
 166
 167 \subsection{Untyped $\lambda$-calculus}
 168
 169 Along with Turing's machines, the earliest attempts to formalise computation
 170 lead to the $\lambda$-calculus \citep{Church1936}.  This early programming
 171 language encodes computation with a minimal syntax and no `data' in the
 172 traditional sense, but just functions.
 173
 174 The syntax of $\lambda$-terms consists of just three things: variables,
 175 abstractions, and applications:
 176
 177 \begin{center}
 178 \axname{syntax}
 179 $$
 180 \begin{array}{rcl}
 181   \termsyn & ::= & x \separ (\abs{x}{\termsyn}) \separ (\app{\termsyn}{\termsyn}) \\
 182          x & \in & \text{Some enumerable set of symbols, e.g.}\ \{x, y, z, \dots , x_1, x_2, \dots\}
 183 \end{array}
 184 $$
 185 \end{center}
 186
 187
 188 % I will omit parethesis in the usual manner. %TODO explain how
 189
 190 I will use $\termt,\termm,\termn,\dots$ to indicate a generic term, and $x,y$
 191 for variables.  I will also assume that all variable names in a term are unique
 192 to avoid problems with name capturing.  Intuitively, abstractions
 193 ($\abs{x}{\termt}$) introduce functions with a named parameter ($x$), and
 194 applications ($\app{\termt}{\termm}$) apply a function ($\termt$) to an argument
 195 ($\termm$).
 196
 197 The `applying' is more formally explained with a reduction rule:
 198
 199 \newcommand{\bred}{\leadsto}
 200 \newcommand{\bredc}{\bred^*}
 201
 202 \begin{center}
 203 \axdesc{reduction}{\termsyn \bred \termsyn}
 204 $$\app{(\abs{x}{\termt})}{\termm} \bred \termt[\termm / x]$$
 205 \end{center}
 206
 207 Where $\termt[\termm / x]$ expresses the operation that substitutes all
 208 occurrences of $x$ with $\termm$ in $\termt$.  In the future, I will use
 209 $[\termt]$ as an abbreviation for $[\termt / x]$.  In the systems presented, the
 210 $\bred$ relation also includes reduction of subterms, for example if $\termt
 211 \bred \termm$ then $\app{\termt}{\termn} \bred \app{\termm}{\termn}$, and so on.
 212
 213 These few elements are of remarkable expressiveness, and in fact Turing
 214 complete.  As a corollary, we must be able to devise a term that reduces forever
 215 (`loops' in imperative terms):
 216 \[
 217   \app{(\abs{x}{\app{x}{x}})}{(\abs{x}{\app{x}{x}})} \bred \app{(\abs{x}{\app{x}{x}})}{(\abs{x}{\app{x}{x}})} \bred \dotsb
 218 \]
 219 Terms that can be reduced only a finite number of times (the non-looping ones)
 220 are said to be \emph{normalising}, and the `final' term is called \emph{normal
 221   form}.  These concepts (reduction and normal forms) will run through all the
 222 material analysed.
 223
 224 \subsection{The simply typed $\lambda$-calculus}
 225
 226 \newcommand{\tya}{A}
 227 \newcommand{\tyb}{B}
 228 \newcommand{\tyc}{C}
 229
 230 One way to `discipline' $\lambda$-terms is to assign \emph{types} to them, and
 231 then check that the terms that we are forming make sense given our typing rules
 232 \citep{Curry1934}.
 233
 234 We wish to introduce rules of the form $\Gamma \vdash \termt : \tya$, which
 235 reads `in context $\Gamma$, term $\termt$ has type $\tya$'.
 236
 237 The syntax for types is as follows:
 238
 239 \begin{center}
 240   \axname{syntax}
 241    $$\tysyn ::= x \separ \tysyn \tyarr \tysyn$$
 242 \end{center}
 243
 244 I will use $\tya,\tyb,\dots$ to indicate a generic type.
 245
 246 A context $\Gamma$ is a map from variables to types.  I will use the notation
 247 $\Gamma; x : \tya$ to augment it and to `extract' pairs from it.
 248
 249 Predictably, $\tya \tyarr \tyb$ is the type of a function from $\tya$ to
 250 $\tyb$.  We need to be able to decorate our abstractions with
 251 types\footnote{Actually, we don't need to: computers can infer the right type
 252   easily, but that is another story.}:
 253 \begin{center}
 254   \axname{syntax}
 255    $$\termsyn ::= x \separ (\abs{x : \tysyn}{\termsyn}) \separ (\app{\termsyn}{\termsyn})$$
 256 \end{center}
 257 Now we are ready to give the typing judgements:
 258
 259 \begin{center}
 260   \axdesc{typing}{\Gamma \vdash \termsyn : \tysyn}
 261
 262   \vspace{0.5cm}
 263
 264   \begin{tabular}{c c c}
 265     \AxiomC{\phantom{1em}}
 266     \UnaryInfC{$\Gamma; x : \tya \vdash x : \tya$}
 267     \DisplayProof
 268     &
 269     \AxiomC{$\Gamma; x : \tya \vdash \termt : \tyb$}
 270     \UnaryInfC{$\Gamma \vdash \abs{x : \tya}{\termt} : \tya \tyarr \tyb$}
 271     \DisplayProof
 272   \end{tabular}
 273
 274   \vspace{0.5cm}
 275
 276   \begin{tabular}{c}
 277     \AxiomC{$\Gamma \vdash \termt : \tya \tyarr \tyb$}
 278     \AxiomC{$\Gamma \vdash \termm : \tya$}
 279     \BinaryInfC{$\Gamma \vdash \app{\termt}{\termm} : \tyb$}
 280     \DisplayProof
 281   \end{tabular}
 282 \end{center}
 283
 284 This typing system takes the name of `simply typed lambda calculus' (STLC),
 285 and enjoys a number of properties.  Two of them are expected in most type
 286 systems \citep{Pierce2002}:
 287 \begin{description}
 288 \item[Progress] A well-typed term is not stuck - either it is a value or it can
 289   take a step according to the evaluation rules.  With `value' we mean a term
 290   whose subterms (including itself) don't appear to the left of the $\bred$
 291   relation.
 292 \item[Preservation] If a well-typed term takes a step of evaluation, then the
 293   resulting term is also well typed.
 294 \end{description}
 295
 296 However, STLC buys us much more: every well-typed term
 297 is normalising.  It is easy to see that we can't fill the blanks if we want to
 298 give types to the non-normalising term shown before:
 299 \begin{equation*}
 300   \app{(\abs{x : ?}{\app{x}{x}})}{(\abs{x : ?}{\app{x}{x}})}
 301 \end{equation*}
 302
 303 \newcommand{\lcfix}[2]{\lcsyn{fix} \appsp #1\absspace.\absspace #2}
 304
 305 This makes the STLC Turing incomplete.  We can recover the ability to loop by
 306 adding a combinator that recurses:
 307 $$
 308   \termsyn ::= \dotsb \separ  \lcfix{x : \tysyn}{\termsyn}
 309 $$
 310 \begin{center}
 311   \begin{prooftree}
 312     \AxiomC{$\Gamma;x : \tya \vdash \termt : \tya$}
 313     \UnaryInfC{$\Gamma \vdash \lcfix{x : \tya}{\termt} : \tya$}
 314   \end{prooftree}
 315 \end{center}
 316 $$
 317   \lcfix{x : \tya}{\termt} \bred \termt[\lcfix{x : \tya}{\termt}]
 318 $$
 319 Which will deprive us of normalisation for every term.
 320
 321 \subsection{The Curry-Howard correspondence}
 322 \label{sec:curry-howard}
 323
 324 \newcommand{\lcunit}{\lcfun{\langle\rangle}}
 325
 326 It turns out that the STLC can be seen a natural deduction system for
 327 propositional logic.  Terms are proofs, and their types are the propositions
 328 they prove.  This remarkable fact is known as the Curry-Howard correspondence,
 329 or isomorphism.
 330
 331 The `arrow' ($\to$) type corresponds to implication.  If we wish to prove that
 332 $(\tya \tyarr \tyb) \tyarr (\tyb \tyarr \tyc) \tyarr (\tya \tyarr \tyc)$, all we
 333 need to do is to devise a $\lambda$-term that has the correct type:
 334 \begin{equation*}
 335   \abs{f : (\tya \tyarr \tyb)}{\abs{g : (\tyb \tyarr \tyc)}{\abs{x : \tya}{\app{g}{(\app{f}{x})}}}}
 336 \end{equation*}
 337 That is, function composition.  We might want extend our bare lambda calculus
 338 with a couple of terms former to make our natural deduction more pleasant to
 339 use.  For example, tagged unions (\texttt{Either} in Haskell) are disjunctions,
 340 and tuples (or products) are conjunctions.  We also want to be able to express
 341 falsity: that can done by introducing a type inhabited by no terms.  If evidence
 342 of such a type is presented, then we can derive any type, which expresses
 343 absurdity.  Conversely, truth ($\top$) is the type with just one trivial
 344 element ($\lcunit$).
 345
 346 \newcommand{\lcinl}{\lccon{inl}\appsp}
 347 \newcommand{\lcinr}{\lccon{inr}\appsp}
 348 \newcommand{\lccase}[3]{\lcsyn{case}\appsp#1\appsp\lcsyn{of}\appsp#2\appsp#3}
 349 \newcommand{\lcfst}{\lcfun{fst}\appsp}
 350 \newcommand{\lcsnd}{\lcfun{snd}\appsp}
 351 \newcommand{\orint}{\vee I_{1,2}}
 352 \newcommand{\orintl}{\vee I_{1}}
 353 \newcommand{\orintr}{\vee I_{2}}
 354 \newcommand{\orel}{\vee E}
 355 \newcommand{\andint}{\wedge I}
 356 \newcommand{\andel}{\wedge E_{1,2}}
 357 \newcommand{\botel}{\bot E}
 358 \newcommand{\lcabsurd}{\lcfun{absurd}\appsp}
 359 \newcommand{\lcabsurdd}[1]{\lcfun{absurd}_{#1}\appsp}
 360
 361
 362 \begin{center}
 363   \axname{syntax}
 364   $$
 365   \begin{array}{rcl}
 366     \termsyn & ::= & \dotsb \\
 367              &  |  & \lcinl \termsyn \separ \lcinr \termsyn \separ \lccase{\termsyn}{\termsyn}{\termsyn} \\
 368              &  |  & (\termsyn , \termsyn) \separ \lcfst \termsyn \separ \lcsnd \termsyn \\
 369              &  |  & \lcunit \\
 370     \tysyn & ::= & \dotsb \separ \tysyn \vee \tysyn \separ \tysyn \wedge \tysyn \separ \bot \separ \top
 371   \end{array}
 372   $$
 373 \end{center}
 374 \begin{center}
 375   \axdesc{typing}{\Gamma \vdash \termsyn : \tysyn}
 376   \begin{prooftree}
 377     \AxiomC{$\Gamma \vdash \termt : \tya$}
 378     \RightLabel{$\orint$}
 379     \UnaryInfC{$\Gamma \vdash \lcinl \termt : \tya \vee \tyb$}
 380     \noLine
 381     \UnaryInfC{$\Gamma \vdash \lcinr \termt : \tyb \vee \tya$}
 382   \end{prooftree}
 383   \begin{prooftree}
 384     \AxiomC{$\Gamma \vdash \termt : \tya \vee \tyb$}
 385     \AxiomC{$\Gamma \vdash \termm : \tya \tyarr \tyc$}
 386     \AxiomC{$\Gamma \vdash \termn : \tyb \tyarr \tyc$}
 387     \RightLabel{$\orel$}
 388     \TrinaryInfC{$\Gamma \vdash \lccase{\termt}{\termm}{\termn} : \tyc$}
 389   \end{prooftree}
 390
 391   \begin{tabular}{c c}
 392     \AxiomC{$\Gamma \vdash \termt : \tya$}
 393     \AxiomC{$\Gamma \vdash \termm : \tyb$}
 394     \RightLabel{$\andint$}
 395     \BinaryInfC{$\Gamma \vdash (\tya , \tyb) : \tya \wedge \tyb$}
 396     \DisplayProof
 397     &
 398     \AxiomC{$\Gamma \vdash \termt : \tya \wedge \tyb$}
 399     \RightLabel{$\andel$}
 400     \UnaryInfC{$\Gamma \vdash \lcfst \termt : \tya$}
 401     \noLine
 402     \UnaryInfC{$\Gamma \vdash \lcsnd \termt : \tyb$}
 403     \DisplayProof
 404   \end{tabular}
 405
 406   \vspace{0.5cm}
 407
 408   \begin{tabular}{c c}
 409     \AxiomC{$\Gamma \vdash \termt : \bot$}
 410     \RightLabel{$\botel$}
 411     \UnaryInfC{$\Gamma \vdash \lcabsurdd{\tya} \termt : \tya$}
 412     \DisplayProof
 413     &
 414     \AxiomC{}
 415     \RightLabel{$\top I$}
 416     \UnaryInfC{$\Gamma \vdash \lcunit : \top$}
 417     \DisplayProof
 418   \end{tabular}
 419 \end{center}
 420 \begin{center}
 421   \axdesc{reduction}{\termsyn \bred \termsyn}
 422   \begin{eqnarray*}
 423     \lccase{(\lcinl \termt)}{\termm}{\termn} & \bred & \app{\termm}{\termt} \\
 424     \lccase{(\lcinr \termt)}{\termm}{\termn} & \bred & \app{\termn}{\termt} \\
 425     \lcfst (\termt , \termm)                 & \bred & \termt \\
 426     \lcsnd (\termt , \termm)                 & \bred & \termm
 427   \end{eqnarray*}
 428 \end{center}
 429
 430 With these rules, our STLC now looks remarkably similar in power and use to the
 431 natural deduction we already know.  $\neg A$ can be expressed as $A \tyarr
 432 \bot$.  However, there is an important omission: there is no term of the type $A
 433 \vee \neg A$ (excluded middle), or equivalently $\neg \neg A \tyarr A$ (double
 434 negation), or indeed any term with a type equivalent to those.
 435
 436 This has a considerable effect on our logic and it's no coincidence, since there
 437 is no obvious computational behaviour for laws like the excluded middle.
 438 Theories of this kind are called \emph{intuitionistic}, or \emph{constructive},
 439 and all the systems analysed will have this characteristic since they build on
 440 the foundation of the STLC\footnote{There is research to give computational
 441   behaviour to classical logic, but I will not touch those subjects.}.
 442
 443 Finally, going back to our $\lcsyn{fix}$ combinator, it's now easy to see how
 444 we would want to exclude such a thing if we want to use STLC as a logic, since
 445 it allows us to prove everything: $(\lcfix{x : \tya}{x}) : \tya$ clearly works
 446 for any $A$!  This is a crucial point: in general we wish to have systems that
 447 do not let the user devise a term of type $\bot$, otherwise our logic will be
 448 unsound\footnote{Obviously such a term can be present under a $\lambda$.}.
 449
 450 \subsection{Extending the STLC}
 451
 452 \newcommand{\lctype}{\lccon{Type}}
 453 \newcommand{\lcite}[3]{\lcsyn{if}\appsp#1\appsp\lcsyn{then}\appsp#2\appsp\lcsyn{else}\appsp#3}
 454 \newcommand{\lcbool}{\lccon{Bool}}
 455 \newcommand{\lcforallz}[2]{\forall #1 \absspace.\absspace #2}
 456 \newcommand{\lcforall}[3]{(#1 : #2) \tyarr #3}
 457 \newcommand{\lcexists}[3]{(#1 : #2) \times #3}
 458
 459 The STLC can be made more expressive in various ways.  Henk Barendregt
 460 succinctly expressed geometrically how we can expand our type system:
 461
 462 $$
 463 \xymatrix@!0@=1.5cm{
 464   & \lambda\omega \ar@{-}[rr]\ar@{-}'[d][dd]
 465   & & \lambda C \ar@{-}[dd]
 466   \\
 467   \lambda2 \ar@{-}[ur]\ar@{-}[rr]\ar@{-}[dd]
 468   & & \lambda P2 \ar@{-}[ur]\ar@{-}[dd]
 469   \\
 470   & \lambda\underline\omega \ar@{-}'[r][rr]
 471   & & \lambda P\underline\omega
 472   \\
 473   \lambda{\to} \ar@{-}[rr]\ar@{-}[ur]
 474   & & \lambda P \ar@{-}[ur]
 475 }
 476 $$
 477 Here $\lambda{\to}$, in the bottom left, is the STLC.  From there can move along
 478 3 dimensions:
 479 \begin{description}
 480 \item[Terms depending on types (towards $\lambda{2}$)] We can quantify over
 481   types in our type signatures: $(\abs{A : \lctype}{\abs{x : A}{x}}) :
 482   \lcforallz{A}{A \tyarr A}$.  The first and most famous instance of this idea
 483   has been System F.  This gives us a form of polymorphism and has been wildly
 484   successful, also thanks to a well known inference algorithm for a restricted
 485   version of System F known as Hindley-Milner.  Languages like Haskell and SML
 486   are based on this discipline.
 487 \item[Types depending on types (towards $\lambda{\underline{\omega}}$)] We have
 488   type operators: $(\abs{A : \lctype}{\abs{R : \lctype}{(A \to R) \to R}}) :
 489   \lctype \to \lctype \to \lctype$.
 490 \item[Types depending on terms (towards $\lambda{P}$)] Also known as `dependent
 491   types', give great expressive power: $(\abs{x : \lcbool}{\lcite{x}{\mathbb{N}}{\mathbb{Q}}}) : \lcbool \to \lctype$.
 492 \end{description}
 493
 494 All the systems preserve the properties that make the STLC well behaved.  The
 495 system we are going to focus on, Intuitionistic Type Theory, has all of the
 496 above additions, and thus would sit where $\lambda{C}$ sits in the
 497 `$\lambda$-cube'.
 498
 499 \section{Intuitionistic Type Theory and dependent types}
 500 \label{sec:itt}
 501
 502 \newcommand{\lcset}[1]{\lccon{Type}_{#1}}
 503 \newcommand{\lcsetz}{\lccon{Type}}
 504 \newcommand{\defeq}{\cong}
 505
 506 \subsection{A Bit of History}
 507
 508 Logic frameworks and programming languages based on type theory have a long
 509 history.  Per Martin-L\"{o}f described the first version of his theory in 1971,
 510 but then revised it since the original version was too impredicative and thus
 511 inconsistent\footnote{In the early version $\lcsetz : \lcsetz$, see section
 512   \ref{sec:core-tt} for an explanation on why this causes problems.}.  For this
 513 reason he gave a revised and consistent definition later \citep{Martin-Lof1984}.
 514
 515 A related development is the polymorphic $\lambda$-calculus, and specifically
 516 the previously mentioned System F, which was developed independently by Girard
 517 and Reynolds.  An overview can be found in \citep{Reynolds1994}.  The surprising
 518 fact is that while System F is impredicative it is still consistent and strongly
 519 normalising.  \cite{Coquand1986} further extended this line of work with the
 520 Calculus of Constructions (CoC).
 521
 522 \subsection{A Core Type Theory}
 523 \label{sec:core-tt}
 524
 525 The calculus I present follows the exposition in \citep{Thompson1991}, and is
 526 quite close to the original formulation of predicative ITT as found in
 527 \citep{Martin-Lof1984}.
 528
 529 \begin{center}
 530   \axname{syntax}
 531   \begin{eqnarray*}
 532   \termsyn & ::= & x \\
 533          &  |  & \lcforall{x}{\termsyn}{\termsyn} \separ \abs{x : \termsyn}{\termsyn} \separ \app{\termsyn}{\termsyn} \\
 534          &  |  & \lcexists{x}{\termsyn}{\termsyn} \separ (\termsyn , \termsyn)_{x.\termsyn} \separ \lcfst \termsyn \separ \lcsnd \termsyn \\
 535          &  |  & \bot \separ \lcabsurd_{\termsyn} \termsyn \\
 536          &  |  & \lcset{n} \\
 537    n     & \in & \mathbb{N}
 538  \end{eqnarray*}
 539
 540   \axdesc{typing}{\Gamma \vdash \termsyn : \termsyn}
 541
 542   \vspace{0.5cm}
 543
 544   \begin{tabular}{c c c}
 545     \AxiomC{\phantom{1em}}
 546     \UnaryInfC{$\Gamma;x : \tya \vdash x : \tya$}
 547     \DisplayProof
 548     &
 549     \AxiomC{$\Gamma \vdash \termt : \tya$}
 550     \AxiomC{$\tya \defeq \tyb$}
 551     \BinaryInfC{$\Gamma \vdash \termt : \tyb$}
 552     \DisplayProof
 553     &
 554     \AxiomC{$\Gamma \vdash \termt : \bot$}
 555     \UnaryInfC{$\Gamma \vdash \lcabsurdd{\tya} \termt : \tya$}
 556     \DisplayProof
 557   \end{tabular}
 558
 559   \vspace{0.5cm}
 560
 561   \begin{tabular}{c c}
 562     \AxiomC{$\Gamma;x : \tya \vdash \termt : \tya$}
 563     \UnaryInfC{$\Gamma \vdash \abs{x : \tya}{\termt} : \lcforall{x}{\tya}{\tyb}$}
 564     \DisplayProof
 565     &
 566     \AxiomC{$\Gamma \vdash \termt : \lcforall{x}{\tya}{\tyb}$}
 567     \AxiomC{$\Gamma \vdash \termm : \tya$}
 568     \BinaryInfC{$\Gamma \vdash \app{\termt}{\termm} : \tyb[\termm ]$}
 569     \DisplayProof
 570   \end{tabular}
 571
 572   \vspace{0.5cm}
 573
 574   \begin{tabular}{c c}
 575     \AxiomC{$\Gamma \vdash \termt : \tya$}
 576     \AxiomC{$\Gamma \vdash \termm : \tyb[\termt ]$}
 577     \BinaryInfC{$\Gamma \vdash (\termt, \termm)_{x.\tyb} : \lcexists{x}{\tya}{\tyb}$}
 578     \noLine
 579     \UnaryInfC{$\phantom{1em}$}
 580     \DisplayProof
 581     &
 582     \AxiomC{$\Gamma \vdash \termt: \lcexists{x}{\tya}{\tyb}$}
 583     \UnaryInfC{$\hspace{0.7cm} \Gamma \vdash \lcfst \termt : \tya \hspace{0.7cm}$}
 584     \noLine
 585     \UnaryInfC{$\Gamma \vdash \lcsnd \termt : \tyb[\lcfst \termt ]$}
 586     \DisplayProof
 587   \end{tabular}
 588
 589   \vspace{0.5cm}
 590
 591   \begin{tabular}{c c}
 592     \AxiomC{$\phantom{1em}$}
 593     \UnaryInfC{$\Gamma \vdash \lcset{n} : \lcset{n + 1}$}
 594     \noLine
 595     \UnaryInfC{$\phantom{1em}$}
 596     \DisplayProof
 597     &
 598     \AxiomC{$\Gamma \vdash \tya : \lcset{n}$}
 599     \AxiomC{$\Gamma; x : \tya \vdash \tyb : \lcset{m}$}
 600     \BinaryInfC{$\Gamma \vdash \lcforall{x}{\tya}{\tyb} : \lcset{n \sqcup m}$}
 601     \noLine
 602     \UnaryInfC{$\Gamma \vdash \lcexists{x}{\tya}{\tyb} : \lcset{n \sqcup m}$}
 603     \DisplayProof
 604   \end{tabular}
 605
 606   \vspace{0.5cm}
 607
 608   \axdesc{reduction}{\termsyn \bred \termsyn}
 609   \begin{eqnarray*}
 610     \app{(\abs{x}{\termt})}{\termm} & \bred & \termt[\termm ] \\
 611     \lcfst (\termt, \termm) & \bred & \termt \\
 612     \lcsnd (\termt, \termm) & \bred & \termm
 613   \end{eqnarray*}
 614 \end{center}
 615
 616 When showing examples types will be omitted when this can be done without loss
 617 of clarity, for example $\abs{x}{x}$ in place of $\abs{A : \lcsetz}{\abs{x :
 618     A}{x}}$.  I will also use $\lcsetz$ as $\lcset{0}$.
 619
 620 There are a lot of new factors at play here. The first thing to notice is that
 621 the separation between types and terms is gone.  All we have is terms, that
 622 include both values (terms of type $\lcset{0}$) and types (terms of type
 623 $\lcset{n}$, with $n > 0$).  This change is reflected in the typing rules.
 624 While in the STLC values and types are kept well separated (values never go
 625 `right of the colon'), in ITT types can freely depend on values.
 626
 627 This relation is expressed in the typing rules for $\tyarr$ and $\times$: if a
 628 function has type $\lcforall{x}{\tya}{\tyb}$, $\tyb$ can depend on $x$.
 629 Examples will make this clearer once some base types are added in section
 630 \ref{sec:base-types}.
 631
 632 $\tyarr$ and $\times$ are at the core of the machinery of ITT:
 633
 634 \begin{description}
 635 \item[`forall' ($\tyarr$)] is a generalisation of $\tyarr$ in the STLC and
 636   expresses universal quantification in our logic.  It is often written with a
 637   syntax closer to logic, e.g. $\forall x : \tya. \tyb$.  In the
 638   literature this is also known as `dependent product' and shown as $\Pi$,
 639   following the interpretation of functions as infinitary products. I will just
 640   call it `dependent function', reserving `product' for $\exists$.
 641
 642 \item[`exists' ($\times$)] is a generalisation of $\wedge$ in the extended STLC
 643   of section \ref{sec:curry-howard}, and thus I will call it `dependent
 644   product'.  Like $\wedge$, it is formed by providing a pair of things.  In our
 645   logic, it represents existential quantification.  Similarly to $\tyarr$, it is
 646   always written as $\exists x : \tya. \tyb$.
 647
 648   For added confusion, in the literature that calls forall $\Pi$, $\exists$ is
 649   often named `dependent sum' and shown as $\Sigma$.  This is following the
 650   interpretation of $\exists$ as a generalised, infinitary $\vee$, where the
 651   first element of the pair is the `tag' that determines which type the second
 652   element will have.
 653 \end{description}
 654
 655 Another thing to notice is that types are very `first class': we are free to
 656 create functions that accept and return types.  For this reason we define
 657 $\defeq$ as the smallest equivalence relation extending $\bredc$, where $\bredc$
 658 is the reflexive transitive closure of $\bred$; and we treat types that are
 659 relate according to $\defeq$ as the same.  Another way of seeing $\defeq$ is
 660 this: when we want to compare two types for equality, we reduce them as far as
 661 possible and then check if they are equal\footnote{Note that when comparing
 662   terms we do it up to $\alpha$-renaming.  That is, we do not consider
 663   relabelling of variables as a difference - for example $\abs{x : A}{x} \defeq
 664   \abs{y : A}{y}$.}.  This works since not only each term has a normal form (ITT
 665 is strongly normalising), but the normal form is also unique; or in other words
 666 $\bred$ is confluent (if $\termt \bredc \termm$ and $\termt \bredc \termn$, then
 667 there is a $p$ such that $\termm \bredc \termp$ and $\termn \bredc \termp$).
 668 This measure makes sure that, for instance, $\app{(\abs{x :
 669     \lctype}{x})}{\lcbool} \defeq \lcbool$.  The theme of equality is central
 670 and will be analysed better later.
 671
 672 The theory presented is \emph{stratified}.  We have a hierarchy of types
 673 $\lcset{0} : \lcset{1} : \lcset{2} : \dots$, so that there is no `type of all
 674 types', and our theory is predicative.  The layers of the hierarchy are called
 675 `universes'.  $\lcsetz : \lcsetz$ ITT is inconsistent due to Girard's paradox
 676 \citep{Hurkens1995}, and thus loses its well-behavedness.  Some impredicativity
 677 sometimes has its place, either because the theory retain good properties
 678 (normalization, consistency, etc.) anyway, like in System F and CoC; or because
 679 we are at a stage at which we do not care - we will see instances of the latter
 680 later.
 681
 682 Note that the Curry-Howard correspondence runs through ITT as it did with the
 683 STLC with the difference that ITT corresponds to an higher order propositional
 684 logic.
 685
 686 \subsection{Base Types}
 687 \label{sec:base-types}
 688
 689 \newcommand{\lctrue}{\lccon{true}}
 690 \newcommand{\lcfalse}{\lccon{false}}
 691 \newcommand{\lcw}[3]{\lccon{W} #1 : #2 \absspace.\absspace #3}
 692 \newcommand{\lcnode}[4]{#1 \lhd_{#2 . #3} #4}
 693 \newcommand{\lcnodez}[2]{#1 \lhd #2}
 694 \newcommand{\lcited}[5]{\lcsyn{if}\appsp#1/#2\appsp.\appsp#3\appsp\lcsyn{then}\appsp#4\appsp\lcsyn{else}\appsp#5}
 695 \newcommand{\lcrec}[4]{\lcsyn{rec}\appsp#1/#2\appsp.\appsp#3\appsp\lcsyn{with}\appsp#4}
 696 \newcommand{\lcrecz}[2]{\lcsyn{rec}\appsp#1\appsp\lcsyn{with}\appsp#2}
 697 \newcommand{\AxiomL}[1]{\Axiom$\fCenter #1$}
 698 \newcommand{\UnaryInfL}[1]{\UnaryInf$\fCenter #1$}
 699
 700 While the ITT presented is a fairly complete logic, it is not that useful for
 701 programming.  If we wish to make it better, we can add some base types to
 702 represent the data structures we know and love, such as numbers, lists, and
 703 trees.  Apart from some unsurprising data types, we introduce $\lccon{W}$, a
 704 very general tree-like structure useful to represent inductively defined types
 705 and that will allow us to express structural recursion in an equally general
 706 manner.
 707
 708 \begin{center}
 709   \axname{syntax}
 710   \begin{eqnarray*}
 711   \termsyn & ::= & \dots \\
 712            &  |  & \top \separ \lcunit \\
 713            &  |  & \lcbool \separ \lctrue \separ \lcfalse \separ \lcited{\termsyn}{x}{\termsyn}{\termsyn}{\termsyn} \\
 714            &  |  & \lcw{x}{\termsyn}{\termsyn} \separ \lcnode{\termsyn}{x}{\termsyn}{\termsyn} \separ \lcrec{\termsyn}{x}{\termsyn}{\termsyn}
 715  \end{eqnarray*}
 716
 717   \axdesc{typing}{\Gamma \vdash \termsyn : \termsyn}
 718
 719   \vspace{0.5cm}
 720
 721   \begin{tabular}{c c c}
 722     \AxiomC{}
 723     \UnaryInfC{$\hspace{0.2cm}\Gamma \vdash \top : \lcset{0} \hspace{0.2cm}$}
 724     \noLine
 725     \UnaryInfC{$\Gamma \vdash \lcbool : \lcset{0}$}
 726     \DisplayProof
 727     &
 728     \AxiomC{}
 729     \UnaryInfC{$\Gamma \vdash \lcunit : \top$}
 730     \noLine
 731     \UnaryInfC{\phantom{1em}}
 732     \DisplayProof
 733     &
 734     \AxiomC{}
 735     \UnaryInfC{$\Gamma \vdash \lctrue : \lcbool$}
 736     \noLine
 737     \UnaryInfC{$\Gamma \vdash \lcfalse : \lcbool$}
 738     \DisplayProof
 739   \end{tabular}
 740
 741   \vspace{0.5cm}
 742
 743   \begin{tabular}{c}
 744     \AxiomC{$\Gamma \vdash \termt : \lcbool$}
 745     \AxiomC{$\Gamma \vdash \termm : \tya[\lctrue]$}
 746     \AxiomC{$\Gamma \vdash \termn : \tya[\lcfalse]$}
 747     \TrinaryInfC{$\Gamma \vdash \lcited{\termt}{x}{\tya}{\termm}{\termn} : \tya[\termt]$}
 748     \DisplayProof
 749   \end{tabular}
 750
 751   \vspace{0.5cm}
 752
 753   \begin{tabular}{c}
 754     \AxiomC{$\Gamma \vdash \tya : \lcset{n}$}
 755     \AxiomC{$\Gamma; x : \tya \vdash \tyb : \lcset{m}$}
 756     \BinaryInfC{$\Gamma \vdash \lcw{x}{\tya}{\tyb} : \lcset{n \sqcup m}$}
 757     \DisplayProof
 758   \end{tabular}
 759
 760   \vspace{0.5cm}
 761
 762   \begin{tabular}{c}
 763     \AxiomC{$\Gamma \vdash \termt : \tya$}
 764     \AxiomC{$\Gamma \vdash \termf : \tyb[\termt ] \tyarr \lcw{x}{\tya}{\tyb}$}
 765     \BinaryInfC{$\Gamma \vdash \lcnode{\termt}{x}{\tyb}{\termf} : \lcw{x}{\tya}{\tyb}$}
 766     \DisplayProof
 767   \end{tabular}
 768
 769   \vspace{0.5cm}
 770
 771   \begin{tabular}{c}
 772     \AxiomC{$\Gamma \vdash \termt: \lcw{x}{\tya}{\tyb}$}
 773     \noLine
 774     \UnaryInfC{$\Gamma \vdash \lcforall{\termm}{\tya}{\lcforall{\termf}{\tyb[\termm] \tyarr \lcw{x}{\tya}{\tyb}}{(\lcforall{\termn}{\tyb[\termm]}{\tyc[\app{\termf}{\termn}]}) \tyarr \tyc[\lcnodez{\termm}{\termf}]}}$}
 775     \UnaryInfC{$\Gamma \vdash \lcrec{\termt}{x}{\tyc}{\termp} : \tyc[\termt]$}
 776     \DisplayProof
 777   \end{tabular}
 778
 779   \vspace{0.5cm}
 780
 781   \axdesc{reduction}{\termsyn \bred \termsyn}
 782   \begin{eqnarray*}
 783     \lcited{\lctrue}{x}{\tya}{\termt}{\termm} & \bred & \termt \\
 784     \lcited{\lcfalse}{x}{\tya}{\termt}{\termm} & \bred & \termm \\
 785     \lcrec{\lcnodez{\termt}{\termf}}{x}{\tya}{\termp} & \bred & \app{\app{\app{\termp}{\termt}}{\termf}}{(\abs{\termm}{\lcrec{\app{f}{\termm}}{x}{\tya}{\termp}})}
 786   \end{eqnarray*}
 787 \end{center}
 788
 789 The introduction and elimination for $\top$ and $\lcbool$ are unsurprising.
 790 Note that in the $\lcite{\dotsb}{\dotsb}{\dotsb}$ construct the type of the
 791 branches are dependent on the value of the conditional.
 792
 793 % TODO: explain better here
 794
 795 The rules for $\lccon{W}$, on the other hand, are quite an eyesore.  The idea
 796 behind $\lccon{W}$ types is to build up `trees' where the number of `children'
 797 of each node is dependent on the value in the node.  This is captured by the
 798 $\lhd$ constructor, where the argument on the left is the value, and the
 799 argument on the right is a function that returns a child for each possible value
 800 of $\tyb[\text{node value}]$, if $\lcw{x}{\tya}{\tyb}$.  The recursor
 801 $\lcrec{\termt}{x}{\tyc}{\termp}$ uses $p$ to inductively prove that
 802 $\tyc[\termt]$ holds.
 803
 804 \subsection{Some examples}
 805
 806 Now we can finally provide some meaningful examples.  Apart from omitting types,
 807 I will also use some abbreviations:
 808 \begin{itemize}
 809   \item $\_\mathit{operator}\_$ to define infix operators
 810   \item $\abs{x\appsp y\appsp z : \tya}{\dotsb}$ to define multiple abstractions at the same
 811     time
 812 \end{itemize}
 813
 814 \subsubsection{Sum types}
 815
 816 We would like to recover our sum type, or disjunction, $\vee$.  This is easily
 817 achieved with $\times$:
 818 \[
 819 \begin{array}{rcl}
 820   \_\vee\_ & = &  \abs{\tya\appsp\tyb : \lcsetz}{\lcexists{x}{\lcbool}{\lcite{x}{\tya}{\tyb}}} \\
 821   \lcinl   & = & \abs{x}{(\lctrue, x)} \\
 822   \lcinr   & = & \abs{x}{(\lcfalse, x)} \\
 823   \lcfun{case} & = & \abs{x : \tya \vee \tyb}{\abs{f : \tya \tyarr \tyc}{\abs{g : \tyb \tyarr \tyc}{ \\
 824            &   & \hspace{0.5cm} \app{(\lcited{\lcfst x}{b}{(\lcite{b}{A}{B}) \tyarr C}{f}{g})}{(\lcsnd x)}}}}
 825 \end{array}
 826 \]
 827 What is going on here?  We are using $\exists$ with $\lcbool$ as a tag, so that
 828 we can choose between one of two types in the second element.  In
 829 $\lcfun{case}$ we use $\lcite{\lcfst x}{\dotsb}{\dotsb}$ to discriminate on the
 830 tag, that is, the first element of $x : \tya \vee \tyb$.  If the tag is true,
 831 then we know that the second element is of type $\tya$, and we will apply $f$.
 832 The same applies to the other branch, with $\tyb$ and $g$.
 833
 834 \subsubsection{Naturals}
 835
 836 Now it's time to showcase the power of $\lccon{W}$ types.
 837 \begin{eqnarray*}
 838   \lccon{Nat}  & = & \lcw{b}{\lcbool}{\lcite{b}{\top}{\bot}}  \\
 839   \lccon{zero} & = & \lcfalse \lhd \abs{z}{\lcabsurd z} \\
 840   \lccon{suc}  & = & \abs{n}{(\lctrue \lhd \abs{\_}{n})}  \\
 841   \lcfun{plus} & = & \abs{x\appsp y}{\lcrecz{x}{(\abs{b}{\lcite{b}{\abs{\_\appsp f}{\app{\lccon{suc}}{(\app{f}{\lcunit})}}}{\abs{\_\appsp\_}{y}}})}}
 842 \end{eqnarray*}
 843 Here we encode natural numbers through $\lccon{W}$ types.  Each node contains a
 844 $\lcbool$.  If the $\lcbool$ is $\lcfalse$, then there are no children, since we
 845 have a function $\bot \tyarr \lccon{Nat}$: this is our $0$.  If it's $\lctrue$,
 846 we need one child only: the predecessor.  We recurse giving $\lcsyn{rec}$ a
 847 function that handles the `base case' 0 when the $\lcbool$ is $\lctrue$, and the
 848 inductive case otherwise.
 849
 850 We postpone more complex examples after the introduction of inductive families
 851 and pattern matching, since $\lccon{W}$ types get unwieldy very quickly.
 852
 853 \subsection{Propositional Equality}
 854 \label{sec:propeq}
 855
 856 We can finally introduce one of the central subjects of my project:
 857 propositional equality.
 858
 859 \newcommand{\lceq}[3]{#2 =_{#1} #3}
 860 \newcommand{\lceqz}[2]{#1 = #2}
 861 \newcommand{\lcrefl}[2]{\lccon{refl}_{#1}\appsp #2}
 862 \newcommand{\lcsubst}[3]{\lcfun{subst}\appsp#1\appsp#2\appsp#3}
 863
 864 \begin{center}
 865   \axname{syntax}
 866   \begin{eqnarray*}
 867   \termsyn & ::= & ... \\
 868            &  |  & \lceq{\termsyn}{\termsyn}{\termsyn} \separ \lcrefl{\termsyn}{\termsyn} \separ \lcsubst{\termsyn}{\termsyn}{\termsyn}
 869  \end{eqnarray*}
 870
 871   \axdesc{typing}{\Gamma \vdash \termsyn : \termsyn}
 872
 873   \vspace{0.5cm}
 874
 875   \begin{tabular}{c c}
 876     \AxiomC{$\Gamma \vdash \termt : \tya$}
 877     \UnaryInfC{$\Gamma \vdash \lcrefl{\tya}{\termt} : \lceq{\tya}{\termt}{\termt}$}
 878     \DisplayProof
 879     &
 880     \AxiomC{$\Gamma \vdash \termp : \lceq{\tya}{\termt}{\termm}$}
 881     \AxiomC{$\Gamma \vdash \tyb : \tya \tyarr \lcsetz$}
 882     \AxiomC{$\Gamma \vdash \termn : \app{\tyb}{\termt}$}
 883     \TrinaryInfC{$\Gamma \vdash \lcsubst{\termp}{\tyb}{\termn} : \app{\tyb}{\termm}$}
 884     \DisplayProof
 885   \end{tabular}
 886
 887   \vspace{0.5cm}
 888
 889   \axdesc{reduction}{\termsyn \bred \termsyn}
 890   \begin{eqnarray*}
 891     \lcsubst{(\lcrefl{\tya}{\termt})}{\tyb}{\termm} \bred \termm
 892   \end{eqnarray*}
 893 \end{center}
 894
 895 Propositional equality internalises equality as a type.  The only way to
 896 introduce an equality proof is by reflexivity ($\lccon{refl}$).  The eliminator
 897 is in the style of `Leibnitz's law' of equality: `equals can be substituted for
 898 equals' ($\lcfun{subst}$).  The computation rule refers to the fact that every
 899 proof of equality will be a $\lccon{refl}$.  We can use $\neg
 900 (\lceq{\tya}{x}{y})$ to express inequality.
 901
 902 These elements conclude our presentation of a `core' type theory.  For an
 903 extended example of a similar theory in use see Section 6.2 of
 904 \cite{Thompson1991}\footnote{Note that while I attempted to formalise the proof
 905   in Agda, I found a bug in the book!  See the errata for details:
 906   \url{http://www.cs.kent.ac.uk/people/staff/sjt/TTFP/errata.html}.}.  The above
 907 language and examples have been codified in Agda\footnote{More on Agda in the
 908   next section.}, see appendix \ref{app:agda-code}.
 909
 910 \section{A more useful language}
 911 \label{sec:practical}
 912
 913 While our core type theory equipped with $\lccon{W}$ types is very usefully
 914 conceptually as a simple but complete language, things get messy very fast.  In
 915 this section I will present the elements that are usually included in theorem
 916 provers or programming languages to make them usable by mathematicians or
 917 programmers.
 918
 919 All the features presented are present in the second version of the Agda system
 920 \citep{Norell2007, Bove2009}.  Agda follows a tradition of theorem provers based
 921 on ITT with inductive families and pattern matching.  The closest relative of
 922 Agda, apart from previous versions, is Epigram \citep{McBride2004, EpigramTut}.
 923 Coq is another notable theorem prover based on the same concepts, and the first
 924 and most popular \citep{Coq}.
 925
 926 The main difference between Coq and Agda/Epigram is that the former focuses on
 927 theorem proving, while the latter also want to be useful as functional
 928 programming languages, while still offering good tools to express
 929 mathematics\footnote{In fact, currently, Agda is used almost exclusively to
 930   express mathematics, rather than to program.}.  This is reflected in a series
 931 of differences that I will not discuss here (most notably the absence of tactics
 932 but better support for pattern matching in Agda/Epigram).  I will take the same
 933 approach as Agda/Epigram and will give a perspective focused on functional
 934 programming rather than theorem proving.  Every feature will be presented as it
 935 is Agda.
 936
 937 \subsection{Friendlier universes}
 938
 939 Universes can be inferred mechanically \citep{Pollack1990}. It is also
 940 convenient to have a \emph{cumulative} theory, where $\lcset{n} : \lcset{m}$ iff
 941 $n < m$.  We eschew these measures to keep the presentation simple.
 942
 943 \subsection{Bidirectional type checking}
 944
 945 Lastly, the theory I present is fully explicit in the sense that the user has to
 946 specify every type when forming abstractions, products, etc.  This can be a
 947 great burden if one wants to use the theory directly.  Complete inference is
 948 undecidable (which is hardly surprising considering the role that types play)
 949 but partial inference (also called `bidirectional type checking' in this
 950 context) in the style of \cite{Pierce2000} will have to be deployed in a
 951 practical system.
 952
 953 \subsection{Inductive families}
 954
 955 \newcommand{\lcdata}[1]{\lcsyn{data}\appsp #1}
 956 \newcommand{\lcdb}{\ |\ }
 957 \newcommand{\lcind}{\hspace{0.2cm}}
 958 \newcommand{\lcwhere}{\appsp \lcsyn{where}}
 959 \newcommand{\lclist}[1]{\app{\lccon{List}}{#1}}
 960 \newcommand{\lcnat}{\app{\lccon{Nat}}}
 961 \newcommand{\lcvec}[2]{\app{\app{\lccon{Vec}}{#1}}{#2}}
 962
 963 Inductive families were first introduced by \cite{Dybjer1991}.  For the reader
 964 familiar with the recent developments present in the GHC compiler for Haskell,
 965 inductive families will look similar to GADTs (Generalised Abstract Data Types)
 966 \citep[Section 7.4.7]{GHC}.
 967 Haskell style data types provide \emph{parametric polymorphism}, so that we can
 968 define types that range over type parameters:
 969 \[
 970 \begin{array}{l}
 971 \lcdata{\lclist{A}} = \lccon{Nil} \lcdb A :: \lclist{A}
 972 \end{array}
 973 \]
 974 In this way we define the $\lccon{List}$ type once while allowing elements to
 975 be of any type.  $\lccon{List}$ will be a type constructor of type $\lcsetz
 976 \tyarr \lcsetz$, while $\lccon{nil} : \lcforall{A}{\lcsetz}{\lclist{A}}$ and
 977 $\_::\_ : \lcforall{A}{\lcsetz}{A \tyarr \lclist{A} \tyarr \lclist{A}}$.
 978
 979 Inductive families bring this concept one step further by allowing some of the
 980 parameters to be constrained by constructors.  We call these `variable'
 981 parameters `indices'.  For example we might define natural numbers
 982 \[
 983 \begin{array}{l}
 984 \lcdata{\lcnat} : \lcsetz \lcwhere \\
 985   \begin{array}{l@{\ }l}
 986     \lcind \lccon{zero} &: \lcnat \\
 987     \lcind \lccon{suc}  &: \lcnat \tyarr \lcnat
 988   \end{array}
 989 \end{array}
 990 \]
 991 And then define a type for lists indexed by length:
 992 \[
 993 \begin{array}{l}
 994 \lcdata{\lccon{Vec}}\appsp (A : \lcsetz) : \lcnat \tyarr \lcsetz \lcwhere \\
 995   \begin{array}{l@{\ }l}
 996     \lcind \lccon{nil} &: \lcvec{A}{\lccon{zero}} \\
 997     \lcind \_::\_ &: \{n : \lccon{Nat}\} \tyarr A \tyarr \lcvec{A}{n} \tyarr \lcvec{A}{\app{(\lccon{suc}}{n})}
 998   \end{array}
 999 \end{array}
1000 \]
1001 Note that contrary to the Haskell ADT notation, with inductive families we
1002 explicitly list the types for the type constructor and data constructors.  In
1003 $\lccon{Vec}$, $A$ is a parameter (same across all constructors) while the
1004 $\lcnat$ is an index.  In our syntax, in the $\lcsyn{data}$ declaration, things
1005 to the left of the colon are parameters, while on the right we can define the
1006 type of indices.  Also note that the parameters' identifiers will be in scope
1007 across all constructors, while indices won't.
1008
1009 In this $\lccon{Vec}$ example, when we form a new list the length is
1010 $\lccon{zero}$.  When we append a new element to an existing list of length $n$,
1011 the new list is of length $\app{\lccon{suc}}{n}$, that is, one more than the
1012 previous length.  Also note that we are using the $\{n : \lcnat\} \tyarr \dotsb$
1013 syntax to indicate an argument that we will omit when using $\_::\_$.  In the
1014 future I shall also omit the type of these implicit parameters, in line with how
1015 Agda works.
1016
1017 Once we have $\lccon{Vec}$ we can do things much more safely than with normal
1018 lists.  For example, we can define an $\lcfun{head}$ function that returns the
1019 first element of the list:
1020 \[
1021 \begin{array}{l}
1022 \lcfun{head} : \{A\ n\} \tyarr \lcvec{A}{(\app{\lccon{suc}}{n})} \tyarr A
1023 \end{array}
1024 \]
1025 Note that we will not be able to provide this function with a
1026 $\lcvec{A}{\lccon{zero}}$, since it needs an index which is a successor of some
1027 number.
1028
1029 \newcommand{\lcfin}[1]{\app{\lccon{Fin}}{#1}}
1030
1031 If we wish to index a $\lccon{Vec}$ safely, we can define an inductive family
1032 $\lcfin{n}$, which represents the family of numbers smaller than a given number
1033 $n$:
1034 \[
1035 \begin{array}{l}
1036   \lcdata{\lccon{Fin}} : \lcnat \tyarr \lcsetz \lcwhere \\
1037   \begin{array}{l@{\ }l}
1038     \lcind \lccon{fzero} &: \{n\} \tyarr \lcfin{(\app{\lccon{suc}}{n})} \\
1039     \lcind \lccon{fsuc}  &: \{n\} \tyarr \lcfin{n} \tyarr \lcfin{(\app{\lccon{suc}}{n})}
1040   \end{array}
1041 \end{array}
1042 \]
1043 $\lccon{fzero}$ is smaller than any number apart from $\lccon{zero}$.  Given
1044 the family of numbers smaller than $n$, we can produce the family of numbers
1045 smaller than $\app{\lccon{suc}}{n}$.  Now we can define an `indexing' operation
1046 for our $\lccon{Vec}$:
1047 \[
1048 \begin{array}{l}
1049 \_\lcfun{!!}\_ : \{A\ n\} \tyarr \lcvec{A}{n} \tyarr \lcfin{n} \tyarr A
1050 \end{array}
1051 \]
1052 In this way we will be sure that the number provided as an index will be smaller
1053 than the length of the $\lccon{Vec}$ provided.
1054
1055 \subsubsection{Computing with inductive families}
1056
1057 I have carefully avoided to define the functions that I mentioned as example,
1058 since one question still is unanswered: `how do we work with inductive
1059 families'?  The most basic approach is to work with eliminators, that can be
1060 automatically provided by the programming language for each data type defined.
1061 For example the induction principle on natural numbers will serve as an
1062 eliminator for $\lcnat$:
1063 \[
1064 \begin{array}{l@{\ } c@{\ } l}
1065   \lcfun{NatInd} & : & (A : \lcsetz) \tyarr \\
1066                  &   & A \tyarr (\lcnat \tyarr A \tyarr A) \tyarr \\
1067                  &   & \lcnat \tyarr A
1068 \end{array}
1069 \]
1070
1071 That is, if we provide an $A$ (base case), and then given a number and an $A$ we
1072 give the next $A$ (inductive step), then an $A$ can be computed for every
1073 number.  This can be expressed easily in Haskell:
1074 \begin{lstlisting}
1075 data Nat = Zero | Suc Nat
1076 natInd :: a -> (Nat -> a -> a) -> Nat -> a
1077 natInd z _  Zero   = z
1078 natInd z f (Suc n) = f n (natInd z f n)
1079 \end{lstlisting}
1080 However, in a dependent setting, we can do better:
1081 \[
1082 \begin{array}{l@{\ } c@{\ } l}
1083   \lcfun{NatInd} & : & (P : \lcnat \tyarr \lcsetz) \tyarr \\
1084                  &   & \app{P}{\lccon{zero}} \tyarr ((n : \lcnat) \tyarr \app{P}{n} \tyarr \app{P}{(\app{\lccon{suc}}{n})}) \tyarr \\
1085                  &   & (n : \lcnat) \tyarr \app{P}{n}
1086 \end{array}
1087 \]
1088 This expresses the fact that the resulting type can be dependent on the number.
1089 In other words, we are proving that $P$ holds for all $n : \lcnat$.
1090
1091 Naturally a reduction rule will be associated with each eliminator:
1092 $$
1093 \begin{array}{l c l}
1094 \app{\app{\app{\app{\lcfun{NatInd}}{P}}{z}}{f}}{\lccon{zero}} & \bred & z \\
1095 \app{\app{\app{\app{\lcfun{NatInd}}{P}}{z}}{f}}{(\app{\lccon{suc}}{n})} & \bred & \app{\app{f}{n}}{(\app{\app{\app{\app{\lcfun{NatInd}}{P}}{z}}{f}}{n})}
1096 \end{array}
1097 $$
1098 Which echoes the \texttt{natInd} function defined in Haskell.  An extensive
1099 account on combinators and inductive families can be found in \cite{McBride2004}.
1100
1101 \subsubsection{Pattern matching and guarded recursion}
1102
1103 However, combinators are far more cumbersome to use than the techniques usually
1104 employed in functional programming: pattern matching and recursion.
1105 \emph{General} recursion cannot be added if we want to keep our theory free of
1106 $\bot$.  The common solution to this problem is to allow recursive calls only if
1107 the arguments are structurally smaller than what the function received.  Pattern
1108 matching on the other hand gains considerable power with inductive families,
1109 since when we match a constructor we are gaining information on the indices of
1110 the family.  Thus matching constructors will enable us to restrict patterns of
1111 other arguments.
1112
1113 Following this discipline defining $\lcfun{head}$ becomes easy:
1114 \[
1115 \begin{array}{l}
1116 \lcfun{head} : \{A\ n\} \tyarr \lcvec{A}{(\app{\lccon{suc}}{n})} \tyarr A \\
1117 \lcfun{head}\appsp(x :: \_) = x
1118 \end{array}
1119 \]
1120 We need no case for $\lccon{nil}$, since the type checker knows that the
1121 equation $n = \lccon{zero} = \app{\lccon{suc}}{n'}$ cannot be satisfied.
1122
1123 More details on the implementations of inductive families can be found in
1124 \citep{McBride2004, Norell2007}.
1125
1126 \subsection{Corecursion}
1127
1128 \section{Why dependent types?}
1129
1130 \section{Many equalities}
1131
1132 \subsection{Revision}
1133
1134 Up to this point, I have introduced two equalities: \emph{definitional} equality
1135 ($\defeq$) and \emph{propositional} equality ($=_{\tya}$).
1136
1137 Definitional equality relates terms that the type checker identifies as equal.
1138 Our definition in section \ref{sec:core-tt} consisted of `reduce the two terms
1139 to their normal forms, then check if they are equal (up to $\alpha$-renaming)'.
1140 We can extend this definition in other way so that more terms will be identified
1141 as equal.  Common ones include doing $\eta$-expansion, which converts partial
1142 appications of functions, and congruence laws for $\lambda$:
1143 \begin{center}
1144   \begin{tabular}{c c c}
1145     \AxiomC{$\Gamma \vdash \termf : \tya \tyarr \tyb$}
1146     \UnaryInfC{$\Gamma \vdash \termf \defeq \abs{x}{\app{\termf}{x}}$}
1147     \DisplayProof
1148     &
1149     \AxiomC{$\Gamma; x : \tya \vdash \termf \defeq g$}
1150     \UnaryInfC{$\Gamma \vdash \abs{x}{\termf} \defeq \abs{x}{g}$}
1151     \DisplayProof
1152   \end{tabular}
1153 \end{center}
1154 Definitional equality is used in the type checking process, and this means that
1155 in dependently typed languages type checking and evaluation are intertwined,
1156 since we need to reduce types to normal forms to check if they are equal.  It
1157 also means that we need be able to compute under $\lambda$s, for example to
1158 determine that $\abs{x}{\app{(\abs{y}{y})}{\tya}} \defeq \abs{x}{\tya}$.  This
1159 process goes on `under the hood' and is outside the control of the user.
1160
1161 Propositional equality, on the other hand, is available to the user to reason
1162 about equality, internalising it as a type.  As we have seen in section
1163 \ref{sec:propeq} propositional equality is introduced by reflexivity and
1164 eliminated with a `Leibnitz's law' style rule ($\lcfun{subst}$).  Now that we
1165 have inductive families and dependent pattern matching we do not need hard coded
1166 rules to express this concepts\footnote{Here I use Agda notation, and thus I
1167   cannot redefine $=$ and use subscripts, so I am forced to use $\equiv$ with
1168   implicit types.  After I will carry on using the old notation.}:
1169 \[
1170 \begin{array}{l}
1171   \lcdata{\lccon{\_\equiv\_}} : \{A : \lcsetz\} : A \tyarr A \tyarr \lcsetz \lcwhere \\
1172   \begin{array}{l@{\ }l}
1173     \lcind \lccon{refl} &: \{x : A\} \tyarr x \equiv x
1174   \end{array} \\
1175   \\
1176   \lcfun{subst} : \{A : \lcsetz\} \tyarr \{t\ m : A\} \tyarr t \equiv m \tyarr (B : A \tyarr \lcsetz) \tyarr \app{B}{t} \tyarr \app{B}{m} \\
1177   \lcfun{subst} \appsp \lccon{refl}\appsp B \appsp n = n
1178 \end{array}
1179 \]
1180 Here matching $\lccon{refl}$ tells the type checker that $t \defeq m$, and thus
1181 $\app{B}{t} \cong \app{B}{m}$, so we can just return $n$.
1182
1183 While propositional equality is a very useful construct, we can prove less terms
1184 equal than we would like to.  For example, if we have the usual functions
1185 \[
1186 \begin{array}{l@{\ } l}
1187 \lcfun{id} & : \{A : \lcsetz\} \tyarr A \tyarr A \\
1188 \lcfun{map} & : \{A\ B : \lcsetz\} \tyarr (A \tyarr B) \tyarr \app{\lccon{List}}{A} \tyarr \app{\lccon{List}}{B}
1189 \end{array}
1190 \]
1191 we would certainly like to have a term of type
1192 \[\{A\ B : \lcsetz\} \tyarr
1193 \app{\lcfun{map}}{\lcfun{id}} =_{\lclist{A} \tyarr \lclist{B}} \lcfun{id}\]
1194 We cannot do this in ITT, since the things on the sides of $=$ look too
1195 different at the term level.  What we can have is
1196 \[
1197 \{A\ B : \lcsetz\} \tyarr (x : \lclist{A}) \tyarr \app{\app{\lcfun{map}}{\lcfun{id}}}{x}
1198 =_{\lclist{B}} \app{\lcfun{id}}{x}
1199 \]
1200 Since the type checker has something to compute with (even if only a variable)
1201 and reduce the two lists to equal normal forms.
1202
1203 \subsection{Extensional type theory and its problems}
1204
1205 \subsection{Observational equality}
1206
1207 \section{What to do}
1208
1209
1210 \bibliographystyle{authordate1}
1211 \bibliography{background}
1212
1213 \appendix
1214 \section{Agda code}
1215 \label{app:agda-code}
1216
1217 \end{document}