the war is over johnny

diff --git a/thesis.lagda b/thesis.lagda

index e517f8a23a6620c770c9014a194cedb79736de04..77fdd95cc1dfc67942c122feb02a32a9f03dc888 100644
--- a/thesis.lagda
+++ b/thesis.lagda
@@ -2,29 +2,34 @@
  %% THIS LATEX HURTS YOUR EYES.  DO NOT READ.
  
  
-% TODO side conditions
-
  \documentclass[11pt, fleqn, twoside]{article}
  \usepackage{etex}
  
+\usepackage[usenames,dvipsnames]{xcolor}
+
  \usepackage[sc,slantedGreek]{mathpazo}
  % \linespread{1.05}
  % \usepackage{times}
  
  % \oddsidemargin .50in
  % \evensidemargin -.25in
-\oddsidemargin 0in
-\evensidemargin 0in
-\textheight 9.5in 
-\textwidth     6.2in
-\topmargin     -7mm  
-%% \parindent  10pt
+% % \oddsidemargin 0in
+% % \evensidemargin 0in
+% \textheight 9.5in 
+% \textwidth   6.2in
+% \topmargin   -9mm  
+% %% \parindent        10pt
  
-\headheight 0pt
-\headsep 0pt
+% \headheight 0pt
+% \headsep 0pt
  
-\usepackage{amsthm}
+\usepackage[hmargin=2cm,vmargin=2.5cm]{geometry}
+\geometry{textwidth=390pt}
+\geometry{bindingoffset=1.5cm}
  
+\raggedbottom
+
+\usepackage{amsthm}
  
  %% Bibtex
  \usepackage{natbib}
@@ -68,6 +73,18 @@
  % \usepackage{tikz-cd}
  % \usepackage{pgfplots}
  
+\usepackage{titlesec}
+
+% custom section
+\titleformat{\section}
+{\normalfont\huge\scshape}
+{\thesection\hskip 9pt\textpipe\hskip 9pt}
+{0pt}
+{}
+
+\newcommand{\sectionbreak}{\clearpage}
+
+
  
  %% -----------------------------------------------------------------------------
  %% Commands for Agda
@@ -116,14 +133,13 @@
  \newcommand{\myITE}[3]{\myfun{If}\, #1\, \myfun{Then}\, #2\, \myfun{Else}\, #3}
  \newcommand{\mycumul}{\preceq}
  
-\FrameSep0.2cm
  \newcommand{\mydesc}[3]{
    \noindent
    \mbox{
      \parbox{\textwidth}{
        {\mysmall
          \vspace{0.2cm}
-        \hfill \textup{\textbf{#1}} $#2$
+        \hfill \textup{\phantom{ygp}\textbf{#1}} $#2$
          \framebox[\textwidth]{
            \parbox{\textwidth}{
              \vspace{0.1cm}
@@ -281,6 +297,17 @@
  \pgfdeclarelayer{foreground}
  \pgfsetlayers{background,main,foreground}
  
+\definecolor{webgreen}{rgb}{0,.5,0}
+\definecolor{webbrown}{rgb}{.6,0,0}
+\definecolor{webyellow}{rgb}{0.98,0.92,0.73}
+
+\hypersetup{
+colorlinks=true, linktocpage=true, pdfstartpage=3, pdfstartview=FitV,
+breaklinks=true, pdfpagemode=UseNone, pageanchor=true, pdfpagemode=UseOutlines,
+plainpages=false, bookmarksnumbered, bookmarksopen=true, bookmarksopenlevel=1,
+hypertexnames=true, pdfhighlight=/O, urlcolor=webbrown, linkcolor=black, citecolor=webgreen}
+
+
  %% -----------------------------------------------------------------------------
  
  \title{\mykant: Implementing Observational Equality}
@@ -295,7 +322,8 @@
  
  \begin{document}
  
-\begin{titlepage}
+\pagenumbering{gobble}
+
  \begin{center}
  
  
@@ -328,47 +356,44 @@
  {\large \today}
  
  \end{center}
-\end{titlepage}
  
-\pagenumbering{gobble}
+\clearpage
  
-\newpage{}
  \mbox{}
-\newpage{}
-
-\thispagestyle{empty}
+\clearpage
  
  \begin{abstract}
-  The marriage between programming and logic has been a very fertile
-  one.  In particular, since the definition of the simply typed
+  The marriage between programming and logic has been a fertile one.  In
+  particular, since the definition of the simply typed
    $\lambda$-calculus, a number of type systems have been devised with
    increasing expressive power.
  
-  Among this systems, Inutitionistic Type Theory (ITT) has been a very
+  Among these systems, Intuitionistic Type Theory (ITT) has been a
    popular framework for theorem provers and programming languages.
    However, reasoning about equality has always been a tricky business in
-  ITT and related theories.  In this thesis we will explain why this is
+  ITT and related theories.  In this thesis we shall explain why this is
    the case, and present Observational Type Theory (OTT), a solution to
    some of the problems with equality.
  
    To bring OTT closer to the current practice of interactive theorem
  provers, we describe \mykant, a system featuring OTT in a setting
  closer to the one found in widely used provers such as Agda and Coq.
-  Nost notably, we feature used defined inductive and record types and a
+  Most notably, we feature user defined inductive and record types and a
    cumulative, implicit type hierarchy.  Having implemented part of
    $\mykant$ as a Haskell program, we describe some of the implementation
    issues faced.
  \end{abstract}
  
-\newpage{}
+\clearpage
+
  \mbox{}
-\newpage{}
+\clearpage
  
  \renewcommand{\abstractname}{Acknowledgements}
  \begin{abstract}
    I would like to thank Steffen van Bakel, my supervisor, who was brave
-  enough to believe in my project and who provided much advice and
-  support.
+  enough to believe in my project and who provided support and
+  invaluable advice.
  
  I would also like to thank the Haskell and Agda communities on
    \texttt{IRC}, which guided me through the strange world of types; and
@@ -378,25 +403,21 @@
    exist without him.  Before them, Tony Field introduced me to Haskell,
    unknowingly filling most of my free time from that time on.
  
-  Finally, much of the work stems from the research of Conor McBride,
+  Finally, most of the work stems from the research of Conor McBride,
    who answered many of my doubts through these months.  I also owe him
    the colours.
  \end{abstract}
  
-\newpage{}
+\clearpage
  \mbox{}
-\newpage{}
-
-\thispagestyle{empty}
+\clearpage
  
  \tableofcontents
  
-\clearpage
+\section{Introduction}
  
  \pagenumbering{arabic}
  
-\section{Introduction}
-
  Functional programming is in good shape.  In particular the `well-typed'
  line of work originating from Milner's ML has been extremely fruitful,
  in various directions.  Nowadays functional, well-typed programming
@@ -408,7 +429,7 @@ motivator for ML's existence---is the advancement of the practice of
  
  An interactive theorem prover, or proof assistant, is a tool that lets
  the user develop formal proofs with the confidence of the machine
-checking it for correctness.  While the effort towards a full
+checking them for correctness.  While the effort towards a full
  formalisation of mathematics has been ongoing for more than a century,
  theorem provers have been the first class of software whose
  implementation depends directly on these theories.
@@ -418,45 +439,44 @@ functional programming and proving theorems in an \emph{intuitionistic}
  logic are the same activity.  Under this discipline, the types in our
  programming language can be interpreted as propositions in our logic; and
  the programs implementing the specification given by the types as their
-proofs.  This fact stimulated a very active transfer of techniques and
+proofs.  This fact stimulated an active transfer of techniques and
  knowledge between logic and programming language theory, in both
  directions.
  
-Mathematics could provide programming with a huge wealth of abstractions
-and constructs developed over centuries.  Moreover, identifying our
-types with a logic lets us focus on foundational questions regarding
+Mathematics could provide programming with a wealth of abstractions and
+constructs developed over centuries.  Moreover, identifying our types
+with a logic lets us focus on foundational questions regarding
  programming with a much more solid approach, given the years of rigorous
  study of logic.  Programmers, on the other hand, had already developed a
-wealth of approaches to effectively collaborate with computers, through
+number of approaches to effectively collaborate with computers, through
  the study of programming languages.
  
-We will follow the discipline of Intuitionistic Type Theory, or
-Martin-L\"{o}f Type Theory, after its inventor.  First formulated in the
-70s and then adjusted through a series of revisions, it has endured as
-the core of many practical systems widely in use today, and it is
-probably the most prominent instance of the proposition-as-types and
-proofs-as-programs discipline.  One of the most debated subjects in this
-field has been regarding what notion of \emph{equality} should be
+In this space, we shall follow the discipline of Intuitionistic Type
+Theory, or Martin-L\"{o}f Type Theory, after its inventor.  First
+formulated in the 70s and then adjusted through a series of revisions,
+it has endured as the core of many practical systems widely in use
+today, and it is the most prominent instance of the propositions-as-types
+and proofs-as-programs paradigm.  One of the most debated subjects in
+this field has been what notion of equality should be
  exposed to the user.
  
-The tension in the field of equality springs from the fact that there is
-a divide between what the user can prove equal \emph{inside} the
-theory---what is \emph{propositionally} equal--- and what the theorem
-prover identifies as equal in its meta-theory---what is
+The tension when studying equality in type theory springs from the fact
+that there is a divide between what the user can prove equal
+\emph{inside} the theory---what is \emph{propositionally} equal---and
+what the theorem prover identifies as equal in its meta-theory---what is
  \emph{definitionally} equal.  If we want our system to be well behaved
-(for example if we want type checking to be decidable) we must keep the
-two notions separate, with definitional equality inducing propositional
+(mostly if we want to keep type checking decidable) we must keep the two
+notions separate, with definitional equality inducing propositional
  equality, but not the reverse.  However in this scenario propositional
  equality is weaker than we would like: we can only prove terms equal
-based on their syntactical structure, and not based on their observable
-behaviour.
+based on their syntactical structure, and not based on their behaviour.
  
  This thesis is concerned with exploring a new approach in this area,
  \emph{observational} equality.  Promising to provide a more adequate
  propositional equality while retaining well-behavedness, it still is a
  relatively unexplored notion.  We set ourselves to change that by
-studying it in a setting more akin the one found in currently available
-theorem provers.
+studying it in a setting more akin to the one found in currently
+available theorem provers.
  
  \subsection{Structure}
  
@@ -483,23 +503,25 @@ equality causes problems.
  Section \ref{sec:ott} will introduce observational equality, following
  closely the original exposition by \cite{Altenkirch2007}.  The
  presentation is free-standing but glosses over the meta-theoretic
-properties of OTT, focusing on the mechanism that make it work.
+properties of OTT, focusing on the mechanisms that make it work.
  
-Section \ref{sec:kant-theory} will describe \mykant, a system we have
-developed incorporating OTT along constructs usually present in modern
-theorem provers.  Along the way, we describe these additional features
-and their advantages.  Section \ref{sec:kant-practice} will describe an
-implementation implementing part of \mykant.  A high level design of the
-software is given, along with a few specific implementation issues
+Section \ref{sec:kant-theory} is the central part of the thesis and will
+describe \mykant, a system we have developed incorporating OTT along
+with constructs usually present in modern theorem provers.  Along the
+way, we discuss these additional features and their trade-offs.  Section
+\ref{sec:kant-practice} will describe an implementation of part of
+\mykant.  A high level design of the software is given, along
+with a few specific implementation issues.
  
  Finally, Section \ref{sec:evaluation} will assess the decisions made in
-designing and implementing \mykant, and Section \ref{sec:future-work}
-will give a roadmap to bring \mykant\ on par and beyond the
-competition.
+designing and implementing \mykant\ and the results achieved; and Section
+\ref{sec:future-work} will give a roadmap to bring \mykant\ on par with
+and beyond the competition.
  
-\subsection{Contribution}
+\subsection{Contributions}
+\label{sec:contributions}
  
-The goal of this thesis is threefold:
+The contribution of this thesis is threefold:
  
  \begin{itemize}
  \item Provide a description of observational equality `in context', to
@@ -524,9 +546,71 @@ The goal of this thesis is threefold:
    theory implementor.
  \end{itemize}
  
+The system developed as part of this thesis, \mykant, incorporates OTT
+with features that are familiar to users of existing theorem provers
+adopting the proofs-as-programs mantra.  The defining features of
+\mykant\ are:
+
+\begin{description}
+\item[Full dependent types] In ITT, types are a very `first class' notion
+  and can be the result of computation---they can \emph{depend} on
+  values, thus the name \emph{dependent types}.  \mykant\ espouses this
+  notion to its full consequences.
+
+\item[User defined data types and records] Instead of forcing the user
+  to choose from a restricted toolbox, we let her define types for
+  greater flexibility.  We have two kinds of user defined types:
+  inductive data types, formed by various data constructors whose type
+  signatures can contain recursive occurrences of the type being
+  defined; and records, where we have just one data constructor and
+  projections to extract each field in said constructor.
+
+\item[Consistency] Our system is meant to be consistent with respect to
+  the logic it embodies.  For this reason, we restrict recursion to
+  \emph{structural} recursion on the defined inductive types, through
+  the use of operators (destructors) computing on each type.  Following
+  the types-as-propositions interpretation, each destructor expresses an
+  induction principle on the data type it operates on.  To achieve the
+  consistency of these operations we make sure that our recursive data
+  types are \emph{strictly positive}.
+
+\item[Bidirectional type checking] We take advantage of a
+  \emph{bidirectional} type inference system in the style of
+  \cite{Pierce2000}.  This cuts down the type annotations by a
+  considerable amount in an elegant way and at a very low cost.
+  Bidirectional type checking is usually employed in core calculi, but
+  in \mykant\ we extend the concept to user defined data types.
+
+\item[Type hierarchy] In set theory we have to treat powerset-like
+  objects with care, if we want to avoid paradoxes.  However, the
+  working mathematician is rarely concerned by this, and the consistency
+  in this regard is implicitly assumed.  In the tradition of
+  \cite{Russell1927}, in \mykant\ we employ a \emph{type hierarchy} to
+  make sure that these size issues are taken care of; and we employ a
+  system so that the user will be free from thinking about the
+  hierarchy, just like the mathematician is.
+
+\item[Observational equality] The motivator of this thesis, \mykant\
+  incorporates a notion of observational equality, modifying the
+  original presentation by \cite{Altenkirch2007} to fit our more
+  expressive system.  As mentioned, we reconcile OTT with user defined
+  types and a type hierarchy. 
+
+\item[Type holes] When building up programs interactively, it is useful
+  to leave parts unfinished while exploring the current context.  This
+  is what type holes are for.
+\end{description}
+
+\subsection{Notation and syntax}
+
+Appendix \ref{app:notation} describes the notation and syntax used in
+this thesis.
+
  \section{Simple and not-so-simple types}
  \label{sec:types}
  
+\epigraph{\emph{Well typed programs can't go wrong.}}{Robin Milner}
+
  \subsection{The untyped $\lambda$-calculus}
  \label{sec:untyped}
  
@@ -569,11 +653,11 @@ $\beta$-reduction and substitution for the $\lambda$-calculus.
      \myapp{(\myabs{\myb{x}}{\mytmm})}{\mytmn} \myred \mysub{\mytmm}{\myb{x}}{\mytmn}\text{ \textbf{where}} \\
      \myind{2}
      \begin{array}{l@{\ }c@{\ }l}
-      \mysub{\myb{x}}{\myb{x}}{\mytmn} & = & \mytmn \\
-      \mysub{\myb{y}}{\myb{x}}{\mytmn} & = & y\text{ \textbf{with} } \myb{x} \neq y \\
-      \mysub{(\myapp{\mytmt}{\mytmm})}{\myb{x}}{\mytmn} & = & (\myapp{\mysub{\mytmt}{\myb{x}}{\mytmn}}{\mysub{\mytmm}{\myb{x}}{\mytmn}}) \\
-      \mysub{(\myabs{\myb{x}}{\mytmm})}{\myb{x}}{\mytmn} & = & \myabs{\myb{x}}{\mytmm} \\
-      \mysub{(\myabs{\myb{y}}{\mytmm})}{\myb{x}}{\mytmn} & = & \myabs{\myb{z}}{\mysub{\mysub{\mytmm}{\myb{y}}{\myb{z}}}{\myb{x}}{\mytmn}} \\
+      \mysub{\myb{y}}{\myb{x}}{\mytmn} \mymetaguard \myb{x} = \myb{y} & \mymetagoes & \mytmn \\
+      \mysub{\myb{y}}{\myb{x}}{\mytmn} & \mymetagoes & \myb{y} \\
+      \mysub{(\myapp{\mytmt}{\mytmm})}{\myb{x}}{\mytmn} & \mymetagoes & (\myapp{\mysub{\mytmt}{\myb{x}}{\mytmn}}{\mysub{\mytmm}{\myb{x}}{\mytmn}}) \\
+      \mysub{(\myabs{\myb{x}}{\mytmm})}{\myb{x}}{\mytmn} & \mymetagoes & \myabs{\myb{x}}{\mytmm} \\
+      \mysub{(\myabs{\myb{y}}{\mytmm})}{\myb{x}}{\mytmn} & \mymetagoes & \myabs{\myb{z}}{\mysub{\mysub{\mytmm}{\myb{y}}{\myb{z}}}{\myb{x}}{\mytmn}} \\
        \multicolumn{3}{l}{\myind{2} \text{\textbf{with} $\myb{x} \neq \myb{y}$ and $\myb{z}$ not free in $\myapp{\mytmm}{\mytmn}$}}
      \end{array}
    \end{array}
@@ -588,22 +672,22 @@ These few elements have a remarkable expressiveness, and are in fact
  Turing complete.  As a corollary, we must be able to devise a term that
  reduces forever (`loops' in imperative terms):
  \[
-  (\myapp{\omega}{\omega}) \myred (\myapp{\omega}{\omega}) \myred \cdots \text{, with $\omega = \myabs{x}{\myapp{x}{x}}$}
+  (\myapp{\omega}{\omega}) \myred (\myapp{\omega}{\omega}) \myred \cdots \text{, \textbf{where} $\omega = \myabs{x}{\myapp{x}{x}}$}
  \]
  \begin{mydef}[redex]
    A \emph{redex} is a term that can be reduced.
  \end{mydef}
  In the untyped $\lambda$-calculus this will be the case for an
  application in which the first term is an abstraction, but in general we
-call aterm reducible if it appears to the left of a reduction rule.
+call a term reducible if it appears to the left of a reduction rule.
  \begin{mydef}[normal form]
    A term that contains no redexes is said to be in \emph{normal form}.
  \end{mydef}
  \begin{mydef}[normalising terms and systems]
    Terms that reduce in a finite number of reduction steps to a normal
    form are \emph{normalising}.  A system in which all terms are
-  normalising is said to be have the \emph{normalisation property}, or
-  to be normalising.
+  normalising is said to have the \emph{normalisation property}, or
+  to be \emph{normalising}.
  \end{mydef}
  Given the reduction behaviour of $(\myapp{\omega}{\omega})$, it is clear
  that the untyped $\lambda$-calculus does not have the normalisation
@@ -616,17 +700,18 @@ systematically. Common evaluation strategies include \emph{call by
  before being applied to the abstraction; and conversely \emph{call by
    name} (or \emph{lazy}), where we reduce only when we need to do so to
  proceed---in other words when we have an application where the function
-is still not a $\lambda$. In both these reduction strategies we never
-reduce under an abstraction: for this reason a weaker form of
-normalisation is used, where both abstractions and normal forms are said
-to be in \emph{weak head normal form}.
+is still not a $\lambda$. In both these strategies we never
+reduce under an abstraction.  For this reason a weaker form of
+normalisation is used, where all abstractions are said to be in
+\emph{weak head normal form} even if their body is not.
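+
+As a small illustration (the term below is our own example, chosen only
+to make the distinction concrete), consider
+\[
+  \myabs{\myb{x}}{\myapp{(\myabs{\myb{y}}{\myb{y}})}{\myb{x}}}
+\]
+This term is an abstraction, and is thus in weak head normal form.  It
+is however not in normal form, since its body still contains the redex
+$\myapp{(\myabs{\myb{y}}{\myb{y}})}{\myb{x}}$, which reduces to
+$\myb{x}$.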
  
  \subsection{The simply typed $\lambda$-calculus}
  
-A convenient way to `discipline' and reason about $\lambda$-terms is to assign
-\emph{types} to them, and then check that the terms that we are forming make
-sense given our typing rules \citep{Curry1934}.The first most basic instance
-of this idea takes the name of \emph{simply typed $\lambda$-calculus} (STLC).
+A convenient way to `discipline' and reason about $\lambda$-terms is to
+assign \emph{types} to them, and then check that the terms that we are
+forming make sense given our typing rules \citep{Curry1934}.  The first
+and most basic instance of this idea takes the name of \emph{simply typed
+  $\lambda$-calculus} (STLC).
  \begin{mydef}[Simply typed $\lambda$-calculus]
    The syntax and typing rules for the STLC are given in Figure \ref{fig:stlc}.
  \end{mydef}
@@ -672,17 +757,18 @@ $\lambda$-calculus.
    \label{fig:stlc}
  \end{figure}
  
-In the typing rules, a context $\myctx$ is used to store the types of bound
-variables: $\myctx; \myb{x} : \mytya$ adds a variable to the context and
-$\myctx(x)$ extracts the type of the rightmost occurrence of $x$.
+In the typing rules, a context $\myctx$ is used to store the types of
+bound variables: $\myemptyctx$ is the empty context, and $\myctx;
+\myb{x} : \mytya$ adds a variable to the context.  $\myctx(x)$ extracts
+the type of the rightmost occurrence of $x$.
  
  This typing system takes the name of `simply typed lambda calculus' (STLC), and
  enjoys a number of properties.  Two of them are expected in most type systems
  \citep{Pierce2002}:
  \begin{mydef}[Progress]
-  A well-typed term is not stuck---it is either a variable, or it
-  does not appear on the left of the $\myred$ relation (currently
-  only $\lambda$), or it can take a step according to the evaluation rules.
+  A well-typed term is not stuck---it is either a variable, or it does
+  not appear on the left of the $\myred$ relation, or it can take a
+  step according to the evaluation rules.
  \end{mydef}
  \begin{mydef}[Subject reduction]
    If a well-typed term takes a step of evaluation, then the
@@ -729,8 +815,7 @@ Another important property of the STLC is the Church-Rosser property:
    and $\mytmn$ can be reduced.
  \end{mydef}
  Given that the STLC has the normalisation property and the Church-Rosser
-property, each term has a unique normal form for definitional equality
-to be decidable.
+property, each term has a \emph{unique} normal form.
  
  \subsection{The Curry-Howard correspondence}
  
@@ -752,7 +837,7 @@ beyond arrow types, we can extend our bare lambda calculus with useful
  types to represent other logical constructs.
  \begin{mydef}[The extended STLC]
    Figure \ref{fig:natded} shows syntax, reduction, and typing rules for
-  the \emph{extendend simply typed $\lambda$-calculus}.
+  the \emph{extended simply typed $\lambda$-calculus}.
  \end{mydef}
  
  \begin{figure}[t]
@@ -842,7 +927,7 @@ types to represent other logical constructs.
        \DisplayProof
      \end{tabular}
  }
-\caption{Rules for the extendend STLC.  Only the new features are shown, all the
+\caption{Rules for the extended STLC.  Only the new features are shown, all the
    rules and syntax for the STLC apply here too.}
    \label{fig:natded}
  \end{figure}
@@ -860,9 +945,10 @@ and $\mysnd$ to $\wedge$ elimination.
  The trivial type $\myunit$ corresponds to the logical $\top$ (true), and
  dually $\myempty$ corresponds to the logical $\bot$ (false).  $\myunit$
  has one introduction rule ($\mytt$), and thus one inhabitant; and no
-eliminators.  $\myempty$ has no introduction rules, and thus no
-inhabitants; and one eliminator ($\myabsurd{ }$), corresponding to the
-logical \emph{ex falso quodlibet}.
+eliminators---we cannot gain any information from a witness of the
+single member of $\myunit$.  $\myempty$ has no introduction rules, and
+thus no inhabitants; and one eliminator ($\myabsurd{ }$), corresponding
+to the logical \emph{ex falso quodlibet}.
  
  With these rules, our STLC now looks remarkably similar in power and use to the
  natural deduction we already know.
@@ -956,10 +1042,14 @@ inductive data.
  \section{Intuitionistic Type Theory}
  \label{sec:itt}
  
+\epigraph{\emph{Martin-L{\"o}f's type theory is a well established and
+    convenient arena in which computational Christians are regularly
+    fed to logical lions.}}{Conor McBride}
+
  \subsection{Extending the STLC}
  
-The STLC can be made more expressive in various ways.  \cite{Barendregt1991}
-succinctly expressed geometrically how we can add expressivity:
+\cite{Barendregt1991} succinctly expressed geometrically how we can add
+expressivity to the STLC:
  $$
  \xymatrix@!0@=1.5cm{
    & \lambda\omega \ar@{-}[rr]\ar@{-}'[d][dd]
@@ -988,19 +1078,36 @@ Here $\lambda{\to}$, in the bottom left, is the STLC.  From there can move along
    This form of polymorphism has been wildly successful, also thanks
    to a well known inference algorithm for a restricted version of System
    F known as Hindley-Milner \citep{milner1978theory}.  Languages like
-  Haskell and SML are based on this discipline.
+  Haskell and SML are based on this discipline.  In Haskell the above
+  example would be
+  \begin{Verbatim}
+id :: a -> a
+id x = x
+  \end{Verbatim}
+  where \texttt{a} implicitly quantifies over a type, and will be
+  instantiated automatically thanks to the inference.
  \item[Types depending on types (towards $\lambda{\underline{\omega}}$)] We have
    type operators.  For example we could define a function that given types $R$
    and $\mytya$ forms the type that represents a value of type $\mytya$ in
    continuation passing style:
-  \[\displaystyle(\myabss{\myb{A} \myar \myb{R}}{\mytyp}{(\myb{A}
+  \[\displaystyle(\myabss{\myb{R} \myarr \myb{A}}{\mytyp}{(\myb{A}
      \myarr \myb{R}) \myarr \myb{R}}) : \mytyp \myarr \mytyp \myarr \mytyp
    \]
+  In Haskell we can define type operators of sorts, although we must
+  pair them with data constructors, to keep inference manageable:
+  \begin{Verbatim}
+newtype Cont r a = Cont ((a -> r) -> r)
+  \end{Verbatim}
+  where the `type' (kind in Haskell parlance) of \texttt{Cont} will be
+  \texttt{* -> * -> *}, with \texttt{*} signifying the type of types in
+  Haskell.
  \item[Types depending on terms (towards $\lambda{P}$)] Also known as `dependent
    types', these give great expressive power.  For example, we can have values
    whose type depends on a boolean:
    \[\displaystyle(\myabss{\myb{x}}{\mybool}{\myite{\myb{x}}{\mynat}{\myrat}}) : \mybool
-  \myarr \mytyp\]
+  \myarr \mytyp\] We cannot give a Haskell example that expresses this
+  concept since Haskell does not support dependent types---it would be a
+  very different language if it did.
  \end{description}
  
  All the systems preserve the properties that make the STLC well behaved.  The
@@ -1028,22 +1135,23 @@ fact is that while System F is impredicative it is still consistent and strongly
  normalising.  \cite{Coquand1986} further extended this line of work with the
  Calculus of Constructions (CoC).
  
-Most widely used interactive theorem provers are based on ITT.  Popular ones
-include Agda \citep{Norell2007, Bove2009}, Coq \citep{Coq}, and Epigram
-\citep{McBride2004, EpigramTut}.
+Most widely used interactive theorem provers are based on ITT.  Popular
+ones include Agda \citep{Norell2007}, Coq \citep{Coq}, Epigram
+\citep{McBride2004, EpigramTut}, Isabelle \citep{Paulson1990}, and many
+others.
  
  \subsection{A simple type theory}
  \label{sec:core-tt}
  
-The calculus I present follows the exposition in \citep{Thompson1991},
-and is quite close to the original formulation of predicative ITT as
-found in \citep{Martin-Lof1984}.
+The calculus I present follows the exposition in \cite{Thompson1991},
+and is quite close to the original formulation of \cite{Martin-Lof1984}.
+Agda and \mykant\ renditions of the presented theory and all the
+examples (even the ones presented only as type signatures) are
+reproduced in Appendix \ref{app:itt-code}.
  \begin{mydef}[Intuitionistic Type Theory (ITT)]
  The syntax and reduction rules are shown in Figure \ref{fig:core-tt-syn}.
  The typing rules are presented piece by piece in the following sections.
  \end{mydef}
-Agda and \mykant\ renditions of the presented theory and all the
-examples is reproduced in Appendix \ref{app:itt-code}.
  
  \begin{figure}[t]
  \mydesc{syntax}{ }{
@@ -1125,23 +1233,29 @@ uniformly in the syntax.
  
  While the usefulness of doing this will become clear soon, a consequence is
  that since types can be the result of computation, deciding type equality is
-not immediate as in the STLC.  For this reason we define \emph{definitional
+not immediate as in the STLC.
+\begin{mydef}[Definitional equality]
+  We define \emph{definitional
    equality}, $\mydefeq$, as the congruence relation extending
-$\myred$---moreover, when comparing types syntactically we do it up to
-renaming of bound names ($\alpha$-renaming).  For example under this
-discipline we will find that
+$\myred$.  Moreover, when comparing types syntactically we do it up to
+renaming of bound names ($\alpha$-renaming).
+\end{mydef}
+For example under this discipline we will find that
  \[
-\myabss{\myb{x}}{\mytya}{\myb{x}} \mydefeq \myabss{\myb{y}}{\mytya}{\myb{y}}
+\begin{array}{@{}l}
+  \myabss{\myb{x}}{\mytya}{\myb{x}} \mydefeq \myabss{\myb{y}}{\mytya}{\myb{y}} \\
+  \myapp{(\myabss{\myb{f}}{\mytya \myarr \mytya}{\myb{f}})}{(\myabss{\myb{y}}{\mytya}{\myb{y}})} \mydefeq \myabss{\myb{quux}}{\mytya}{\myb{quux}}
+\end{array}
  \]
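+As a sketch of the reasoning behind the second equation (spelled out
+here for illustration only), the left-hand side reduces in one step,
+\[
+  \myapp{(\myabss{\myb{f}}{\mytya \myarr \mytya}{\myb{f}})}{(\myabss{\myb{y}}{\mytya}{\myb{y}})} \myred \myabss{\myb{y}}{\mytya}{\myb{y}}
+\]
+and the result differs from the right-hand side only in the name of the
+bound variable, which $\alpha$-renaming lets us ignore.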
  Types that are definitionally equal can be used interchangeably.  Here
  the `conversion' rule is not syntax directed, but it is possible to
  employ $\myred$ to decide term equality in a systematic way, comparing
  terms by reducing to their normal forms and then comparing them
  syntactically; so that a separate conversion rule is not needed.
-Another thing to notice is that considering the need to reduce terms to
-decide equality it is essential for a dependently typed system to be
-terminating and confluent for type checking to be decidable, since every
-type needs to have a \emph{unique} normal form.
+Another thing to notice is that, considering the need to reduce terms to
+decide equality, for type checking to be decidable a dependently typed
+system must be terminating and confluent, since every type needs to have
+a unique normal form for definitional equality to be decidable.
  
  Moreover, we specify a \emph{type hierarchy} to talk about `large'
  types: $\mytyp_0$ will be the type of types inhabited by data:
@@ -1258,8 +1372,8 @@ are a necessary, well-typed, danger.
  However, in a more expressive system, we can do better: the branches' type can
  depend on the value of the scrutinised boolean.  This is what the typing rule
  expresses: the user provides a type $\mytya$ ranging over an $\myb{x}$
-representing the scrutinised boolean type, and the branches are typechecked with
-the updated knowledge on the value of $\myb{x}$.
+representing the scrutinised boolean type, and the branches are type checked with
+the updated knowledge of the value of $\myb{x}$.
  
  \subsubsection{$\myarr$, or dependent function}
  \label{sec:depprod}
@@ -1284,19 +1398,20 @@ the updated knowledge on the value of $\myb{x}$.
      \end{tabular}
  }
  
-Dependent functions are one of the two key features that perhaps most
-characterise dependent types---the other being dependent products.  With
-dependent functions, the result type can depend on the value of the
-argument.  This feature, together with the fact that the result type
-might be a type itself, brings a lot of interesting possibilities.
-Following this intuition, in the introduction rule, the return type is
-typechecked in a context with an abstracted variable of lhs' type, and
-in the elimination rule the actual argument is substituted in the return
-type.  Keeping the correspondence with logic alive, dependent functions
-are much like universal quantifiers ($\forall$) in logic.
+Dependent functions are one of the two key features that characterise
+dependent types---the other being dependent products.  With dependent
+functions, the result type can depend on the value of the argument.
+This feature, together with the fact that the result type might be a
+type itself, brings a lot of interesting possibilities.  In the
+introduction rule, the return type is type checked in a context with an
+abstracted variable of the domain's type; and in the elimination rule the
+actual argument is substituted in the return type.  Keeping the
+correspondence with logic alive, dependent functions are much like
+universal quantifiers ($\forall$) in logic.
  
  For example, assuming that we have lists and natural numbers in our
-language, using dependent functions we can write functions of types
+language, using dependent functions we can write functions of
+types
  \[
  \begin{array}{l}
  \myfun{length} : (\myb{A} {:} \mytyp_0) \myarr \myapp{\mylist}{\myb{A}} \myarr \mynat \\
@@ -1311,25 +1426,13 @@ language, using dependent functions we can write functions of types
  function. $\myarg\myfun{$>$}\myarg$ is a function that takes two naturals
  and returns a type: if the lhs is greater than the rhs, $\myunit$ is
  returned, $\myempty$ otherwise.  This way, we can express a
-`non-emptyness' condition in $\myfun{head}$, by including a proof that
+`non-emptiness' condition in $\myfun{head}$, by including a proof that
  the length of the list argument is non-zero.  This allows us to rule out
  the `empty list' case, so that we can safely return the first element.
  
-% TODO fix this
-Again, we need to make sure that the type hierarchy is respected, which
+Finally, we need to make sure that the type hierarchy is respected, which
  is the reason why a type formed by $\myarr$ will live in the least upper
-bound of the levels of argument and return type.  If this was not the
-case, we would be able to form a `powerset' function along the lines of
-\[
-\begin{array}{@{}l}
-\myfun{P} : \mytyp_0 \myarr \mytyp_0 \\
-\myfun{P} \myappsp \myb{A} \mapsto \myb{A} \myarr \mytyp_0
-\end{array}
-\]
-Where the type of $\myb{A} \myarr \mytyp_0$ is in $\mytyp_0$ itself.
-Using this and similar devices we would be able to derive falsity
-\citep{Hurkens1995}.  This trend will continue with the other type-level
-binders, $\myprod$ and $\mytyc{W}$.
+bound of the levels of argument and return type.
  
  \subsubsection{$\myprod$, or dependent product}
  \label{sec:disju}
@@ -1361,14 +1464,16 @@ binders, $\myprod$ and $\mytyc{W}$.
  If dependent functions are a generalisation of $\myarr$ in the STLC,
  dependent products are a generalisation of $\myprod$ in the STLC.  The
  improvement is that the second element's type can depend on the value of
-the first element.  The corrispondence with logic is through the
+the first element.  The correspondence with logic is through the
  existential quantifier: $\exists x \in \mathbb{N}. even(x)$ can be
  expressed as $\myexi{\myb{x}}{\mynat}{\myapp{\myfun{even}}{\myb{x}}}$.
  The first element will be a number, and the second evidence that the
  number is even.  This highlights the fact that we are working in a
  constructive logic: if we have an existence proof, we can always ask for
  a witness.  This means, for instance, that $\neg \forall \neg$ is not
-equivalent to $\exists$.
+equivalent to $\exists$.  Additionally, we need to specify the type of
+the second element (ranging over the first element) explicitly when
+using $\mypair{\myarg}{\myarg}$.
  
  Another perhaps more `dependent' application of products, paired with
  $\mybool$, is to offer choice between different types.  For example we
@@ -1414,14 +1519,15 @@ can easily recover disjunctions:
  }
  
  Finally, the well-order type, or in short $\mytyc{W}$-type, which will
-let us represent inductive data in a general (but clumsy) way.  We can
-form `nodes' of the shape $\mytmt \mynode{\myb{x}}{\mytyb} \myse{f} :
-\myw{\myb{x}}{\mytya}{\mytyb}$ that contain data ($\mytmt$) of type and
-one `child' for each member of $\mysub{\mytyb}{\myb{x}}{\mytmt}$.  The
+let us represent inductive data in a general way.  We can form `nodes'
+of the shape \[\mytmt \mynode{\myb{x}}{\mytyb} \myse{f} :
+\myw{\myb{x}}{\mytya}{\mytyb}\] where $\mytmt$ is of type $\mytya$ and
+is the data present in the node, and $\myse{f}$ specifies a `child' of
+the node for each member of $\mysub{\mytyb}{\myb{x}}{\mytmt}$.  The
  $\myfun{rec}\ \myfun{with}$ acts as an induction principle on
  $\mytyc{W}$, given a predicate and a function dealing with the inductive
-case---we will gain more intuition about inductive data in ITT in
-Section \ref{sec:user-type}.
+case---we will gain more intuition about inductive data in Section
+\ref{sec:user-type}.
  
  For example, if we want to form natural numbers, we can take
  \[
@@ -1458,25 +1564,28 @@ And with a bit of effort, we can recover addition:
    \end{array}
    \]
    Note how we explicitly have to type the branches to make them match
-  with the definition of $\mynat$.  This gives a taste of the
-  `clumsiness' of $\mytyc{W}$-types but not the whole story: well-orders
-  are inadequate not only because they are verbose, but present deeper
-  problems because the notion of equality present in most type theory
-  (which we will present in the next section) is too weak
-  \citep{dybjer1997representing}.  The `better' equality we will present
-  in Section \ref{sec:ott} helps but does not fully resolve these
-  issues.\footnote{See \url{http://www.e-pig.org/epilogue/?p=324}, which
-    concludes with `W-types are a powerful conceptual tool, but they’re
-    no basis for an implementation of recursive datatypes in decidable
-    type theories.'}  For this reasons \mytyc{W}-types have remained
-  nothing more than a reasoning tool, and practical systems implement
-  more expressive ways to represent data.
+  with the definition of $\mynat$.  This gives a taste of the clumsiness
+  of $\mytyc{W}$-types but not the whole story.  Well-orders are
+  inadequate not only because they are verbose, but also because they
+  face deeper problems due to the weakness of the notion of equality
+  present in most type theories, which we will present in the next
+  section \citep{dybjer1997representing}.  The `better' equality we will
+  present in Section \ref{sec:ott} helps but does not fully resolve
+  these issues.\footnote{See \url{http://www.e-pig.org/epilogue/?p=324},
+    which concludes with `W-types are a powerful conceptual tool, but
+    they’re no basis for an implementation of recursive data types in
+    decidable type theories.'}  For these reasons \mytyc{W}-types have
+  remained nothing more than a reasoning tool, and practical systems
+  must implement more expressive ways to represent data.
  
  \section{The struggle for equality}
  \label{sec:equality}
  
-In the previous section we saw how a type checker (or a human) needs a
-notion of \emph{definitional equality}.  Beyond this meta-theoretic
+\epigraph{\emph{Half of my time spent doing research involves thinking up clever
+  schemes to avoid needing functional extensionality.}}{@larrytheliquid}
+
+In the previous section we learnt how a type checker for ITT needs
+a notion of \emph{definitional equality}.  Beyond this meta-theoretic
  notion, in this section we will explore the ways of expressing equality
  \emph{inside} the theory, as a reasoning tool available to the user.
  This area is the main concern of this thesis, and in general a very
@@ -1487,7 +1596,8 @@ formalised in Agda in Appendix \ref{app:agda-itt}.
  \subsection{Propositional equality}
  
  \begin{mydef}[Propositional equality] The syntax, reduction, and typing
-  rules for propositional equality and related constructs is defined as:
+  rules for propositional equality and related constructs are defined
+  as:
  \end{mydef}
  \mynegder
  \noindent
@@ -1504,11 +1614,11 @@ formalised in Agda in Appendix \ref{app:agda-itt}.
  }
  \end{minipage} 
  \begin{minipage}{0.5\textwidth}
-\mydesc{reduction:}{\mytmsyn \myred \mytmsyn}{
+\mydesc{\phantom{y}reduction:}{\mytmsyn \myred \mytmsyn}{
      $
      \myjeq{\myse{P}}{(\myapp{\myrefl}{\mytmm})}{\mytmn} \myred \mytmn
      $
-  \vspace{1.1cm}
+  \vspace{1.05cm}
  }
  \end{minipage}
  \mynegder
@@ -1536,18 +1646,20 @@ formalised in Agda in Appendix \ref{app:agda-itt}.
        \DisplayProof
      \end{tabular}
  }
+\ \\
  
  To express equality between two terms inside ITT, the obvious way to do
  so is to have equality to be a type.  Here we present what has survived
-as the dominating form of equality in systems based on ITT up to the
-present day.
+as the dominating form of equality in systems based on ITT from
+\cite{Martin-Lof1984} up to the present day.
  
-Our type former is $\mypeq$, which given a type (in this case
-$\mytya$) relates equal terms of that type.  $\mypeq$ has one introduction
-rule, $\myrefl$, which introduces an equality relation between definitionally
-equal terms.
+Our type former is $\mypeq$, which given a type relates equal terms of
+that type.  $\mypeq$ has one introduction rule, $\myrefl$, which
+introduces an equality relation between definitionally equal terms.
  
-Finally, we have one eliminator for $\mypeq$, $\myjeqq$.  $\myjeq{\myse{P}}{\myse{q}}{\myse{p}}$ takes
+Finally, we have one eliminator for $\mypeq$, $\myjeqq$ (also known as
+the `\myfun{J} axiom' in the literature).
+$\myjeq{\myse{P}}{\myse{q}}{\myse{p}}$ takes
  \begin{itemize}
  \item $\myse{P}$, a predicate working with two terms of a certain type (say
    $\mytya$) and a proof of their equality;
@@ -1557,16 +1669,17 @@ Finally, we have one eliminator for $\mypeq$, $\myjeqq$.  $\myjeq{\myse{P}}{\mys
    twice, plus the trivial proof by reflexivity showing that $\myse{m}$
    is equal to itself.
  \end{itemize}
-Given these ingredients, $\myjeqq$ retuns a member of $\myse{P}$ applied
-to $\mytmm$, $\mytmn$, and $\myse{q}$.  In other words $\myjeqq$ takes a
-witness that $\myse{P}$ works with \emph{definitionally equal} terms,
-and returns a witness of $\myse{P}$ working with \emph{propositionally
-  equal} terms.  Invokations of $\myjeqq$ will vanish when the equality
-proofs will reduce to invocations to reflexivity, at which point the
-arguments must be definitionally equal, and thus the provided
+Given these ingredients, $\myjeqq$ returns a member of $\myse{P}$
+applied to $\mytmm$, $\mytmn$, and $\myse{q}$.  In other words $\myjeqq$
+takes a witness that $\myse{P}$ works with \emph{definitionally equal}
+terms, and returns a witness of $\myse{P}$ working with
+\emph{propositionally equal} terms.  Given its reduction rule,
+invocations of $\myjeqq$ will vanish when the equality proofs reduce to
+invocations of reflexivity, at which point the arguments must
+be definitionally equal, and thus the provided
  $\myapp{\myapp{\myapp{\myse{P}}{\mytmm}}{\mytmm}}{(\myapp{\myrefl}{\mytmm})}$
  can be returned.  This means that $\myjeqq$ will not compute with
-hypotetical proofs, which makes sense given that they might be false.
+hypothetical proofs, which makes sense given that they might be false.
  
  While the $\myjeqq$ rule is slightly convoluted, we can derive many more
  `friendly' rules from it, for example a more obvious `substitution' rule, that
@@ -1578,8 +1691,9 @@ replaces equal for equal in predicates:
    \myjeq{(\myabs{\myb{x}\ \myb{y}\ \myb{q}}{\myapp{\myb{P}}{\myb{y}}})}{\myb{p}}{\myb{q}}
  \end{array}
  \]
-Once we have $\myfun{subst}$, we can easily prove more familiar laws regarding
-equality, such as symmetry, transitivity, congruence laws, etc.
+Once we have $\myfun{subst}$, we can easily prove more familiar laws
+regarding equality, such as symmetry, transitivity, congruence laws,
+etc.\footnote{For definitions of these functions, refer to Appendix \ref{app:itt-code}.}
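+
+As a sketch of how such laws follow, assume that $\myfun{subst}$ takes a
+predicate, an equality proof, and a member of the predicate applied to
+the left-hand side of the equality, leaving the type $\myb{A}$ and the
+end points implicit (this argument order is an assumption of the sketch,
+not necessarily the one used in the appendix).  Given
+$\myb{q} : \mypeq \myappsp \myb{A} \myappsp \myb{x} \myappsp \myb{y}$ for symmetry, and
+$\myb{q_1} : \mypeq \myappsp \myb{A} \myappsp \myb{x} \myappsp \myb{y}$ together with
+$\myb{q_2} : \mypeq \myappsp \myb{A} \myappsp \myb{y} \myappsp \myb{z}$ for transitivity,
+we might write
+\[
+\begin{array}{l}
+\myfun{sym} \myappsp \myb{q} \mapsto \myfun{subst} \myappsp (\myabs{\myb{w}}{\mypeq \myappsp \myb{A} \myappsp \myb{w} \myappsp \myb{x}}) \myappsp \myb{q} \myappsp (\myapp{\myrefl}{\myb{x}}) \\
+\myfun{trans} \myappsp \myb{q_1} \myappsp \myb{q_2} \mapsto \myfun{subst} \myappsp (\myabs{\myb{w}}{\mypeq \myappsp \myb{A} \myappsp \myb{x} \myappsp \myb{w}}) \myappsp \myb{q_2} \myappsp \myb{q_1}
+\end{array}
+\]
+In $\myfun{sym}$ the predicate holds at $\myb{x}$ by reflexivity and is
+transported along $\myb{q}$ to $\myb{y}$, giving a proof of
+$\mypeq \myappsp \myb{A} \myappsp \myb{y} \myappsp \myb{x}$.  In $\myfun{trans}$ the
+predicate holds at $\myb{y}$ thanks to $\myb{q_1}$ and is transported
+along $\myb{q_2}$ to $\myb{z}$, giving a proof of
+$\mypeq \myappsp \myb{A} \myappsp \myb{x} \myappsp \myb{z}$.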
  
  \subsection{Common extensions}
  
@@ -1590,11 +1704,11 @@ automatically extend propositional equality, given how $\myrefl$ works.
  \subsubsection{$\eta$-expansion}
  \label{sec:eta-expand}
  
-A simple extension to our definitional equality is $\eta$-expansion.
+A simple extension to our definitional equality is achieved by $\eta$-expansion.
  Given an abstract variable $\myb{f} : \mytya \myarr \mytyb$ the aim is
  to have that $\myb{f} \mydefeq
  \myabss{\myb{x}}{\mytya}{\myapp{\myb{f}}{\myb{x}}}$.  We can achieve
-this by `expanding' terms based on their types, a process also known as
+this by `expanding' terms depending on their types, a process known as
  \emph{quotation}---a term borrowed from the practice of
  \emph{normalisation by evaluation}, where we embed terms in some host
  language with an existing notion of computation, and then reify them
@@ -1602,16 +1716,16 @@ back into terms, which will `smooth out' differences like the one above
  \citep{Abel2007}.
  
  The same concept applies to $\myprod$, where we expand each inhabitant
-by reconstructing it by getting its projections, so that $\myb{x}
+by reconstructing it from its projections, so that $\myb{x}
  \mydefeq \mypair{\myfst \myappsp \myb{x}}{\mysnd \myappsp \myb{x}}$.
  Similarly, all one inhabitants of $\myunit$ and all zero inhabitants of
  $\myempty$ can be considered equal. Quotation can be performed in a
  type-directed way, as we will witness in Section \ref{sec:kant-irr}.
  
  \begin{mydef}[Congruence and $\eta$-laws]
-To justify quotation in our type system we will add a congruence law
-for abstractions and a similar law for products, plus the fact that all
-elements of $\myunit$ or $\myempty$ are equal.
+  To justify quotation in our type system we add a congruence law for
+  abstractions and a similar law for products, plus the fact that all
+  elements of $\myunit$ or $\myempty$ are equal.
  \end{mydef}
  \mynegder
  \mydesc{definitional equality:}{\myjud{\mytmm \mydefeq \mytmn}{\mytmsyn}}{
@@ -1660,16 +1774,15 @@ are by reflexivity.
    \DisplayProof
  }
  
-\cite{Hofmann1994} showed that $\myfun{K}$ is not derivable from the
-$\myjeqq$ axiom that we presented, and \cite{McBride2004} showed that it is
-needed to implement `dependent pattern matching', as first proposed by
-\cite{Coquand1992}.  Thus, $\myfun{K}$ is derivable in the systems that
-implement dependent pattern matching, such as Epigram and Agda; but for
-example not in Coq.
+\cite{Hofmann1994} showed that $\myfun{K}$ is not derivable from
+$\myjeqq$, and \cite{McBride2004} showed that it is needed to implement
+`dependent pattern matching', as first proposed by \cite{Coquand1992}.\footnote{See Section \ref{sec:future-work} for more on dependent pattern matching.}
+Thus, $\myfun{K}$ is derivable in the systems that implement dependent
+pattern matching, such as Epigram and Agda; but for example not in Coq.
  
  $\myfun{K}$ is controversial mainly because it is at odds with
  equalities that include computational behaviour, most notably
-Voevodsky's `Univalent Foundations', which includes a \emph{univalence}
+Voevodsky's \emph{Univalent Foundations}, which feature a \emph{univalence}
  axiom that identifies isomorphisms between types with propositional
  equality.  For example we would have two isomorphisms, and thus two
  equalities, between $\mybool$ and $\mybool$, corresponding to the two
@@ -1681,22 +1794,23 @@ research.\footnote{More information about univalence can be found at
  
  \subsection{Limitations}
  
-\epigraph{\emph{Half of my time spent doing research involves thinking up clever
-  schemes to avoid needing functional extensionality.}}{@larrytheliquid}
-
  Propositional equality as described is quite restricted when
  reasoning about equality beyond the term structure, which is what definitional
  equality gives us (extensions notwithstanding).
  
+\begin{mydef}[Extensional equality]
+Given two functions $\myse{f}$ and $\myse{g}$ of type $\mytya \myarr \mytyb$, they are said to be \emph{extensionally equal} if
+\[ (\myb{x} {:} \mytya) \myarr \mypeq \myappsp \mytyb \myappsp (\myse{f} \myappsp \myb{x}) \myappsp (\myse{g} \myappsp \myb{x}) \]
+\end{mydef}
+
  The problem is best exemplified by \emph{function extensionality}.  In
-mathematics, we would expect to be able to treat functions that give equal
-output for equal input as the same.  When reasoning in a mechanised framework
-we ought to be able to do the same: in the end, without considering the
-operational behaviour, all functions equal extensionally are going to be
-replaceable with one another.
-
-However this is not the case, or in other words with the tools we have we have
-no term of type
+mathematics, we would expect to be able to treat functions that give
+equal output for equal input as equal.  When reasoning in a mechanised
+framework we ought to be able to do the same: in the end, without
+considering the operational behaviour, all functions equal extensionally
+are going to be replaceable with one another.
+
+However, this is not the case: with the tools we have there is no closed term of type
  \[
  \myfun{ext} : \myfora{\myb{A}\ \myb{B}}{\mytyp}{\myfora{\myb{f}\ \myb{g}}{
      \myb{A} \myarr \myb{B}}{
@@ -1714,28 +1828,41 @@ prove that
  \[
  \myfora{\myb{x}}{\mynat}{\mypeq \myappsp \mynat \myappsp (0 \mathrel{\myfun{$+$}} \myb{x}) \myappsp (\myb{x} \mathrel{\myfun{$+$}} 0)}
  \]
-By analysis on the $\myb{x}$.  However, the two functions are not
-definitionally equal, and thus we won't be able to get rid of the
-quantification.
+By induction on $\mynat$ applied to $\myb{x}$.  However, the two
+functions are not definitionally equal, and thus we will not be able to get
+rid of the quantification.
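+
+The induction mentioned above can be sketched as follows, assuming that
+$\myfun{$+$}$ recurses on its first argument, and writing $\mydc{suc}$
+for the successor and $\myfun{cong}$ for the congruence law (both names
+are assumptions of this sketch).  In the base case both sides reduce to
+$0$, and reflexivity suffices.  In the inductive case the two sides
+reduce as
+\[
+\begin{array}{lcl}
+0 \mathrel{\myfun{$+$}} (\mydc{suc} \myappsp \myb{x}) & \myred & \mydc{suc} \myappsp \myb{x} \\
+(\mydc{suc} \myappsp \myb{x}) \mathrel{\myfun{$+$}} 0 & \myred & \mydc{suc} \myappsp (\myb{x} \mathrel{\myfun{$+$}} 0)
+\end{array}
+\]
+so that the goal becomes
+$\mypeq \myappsp \mynat \myappsp (\mydc{suc} \myappsp \myb{x}) \myappsp (\mydc{suc} \myappsp (\myb{x} \mathrel{\myfun{$+$}} 0))$.
+This is inhabited by applying $\myfun{cong} \myappsp \mydc{suc}$ to the
+inductive hypothesis, whose type reduces to
+$\mypeq \myappsp \mynat \myappsp \myb{x} \myappsp (\myb{x} \mathrel{\myfun{$+$}} 0)$.
+Note that the proof proceeds pointwise, one $\myb{x}$ at a time: it
+never relates the two functions as a whole.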
  
  For the reasons given above, theories that offer a propositional equality
  similar to what we presented are called \emph{intensional}, as opposed
  to \emph{extensional}.  Most systems widely used today (such as Agda,
-Coq, and Epigram) are of this kind.
-
-This is quite an annoyance that often makes reasoning awkward to execute.  It
-also extends to other fields, for example proving bisimulation between
-processes specified by coinduction, or in general proving equivalences based
-on the behaviour of a term.
+Coq, and Epigram) are of the former kind.
+
+This is quite an annoyance that often makes reasoning awkward or
+impossible to execute.  For example, we might want to represent terms of
+some language in Agda and give their denotation by embedding them in
+Agda---if we have $\lambda$-terms, functions will be Agda functions,
+application will be Agda's function application, and so on.  Then we
+would like to perform optimisation passes on the terms, and verify that
+they are sound by proving that the denotation of the optimised version
+is equal to the denotation of the starting term.
+
+But if the embedding uses functions---and it probably will---we are
+stuck with an equality that identifies as equal only syntactically equal
+functions!  Since the point of optimising is to preserve the
+denotational behaviour of terms while changing their operational
+behaviour, our equality falls short of our needs.  Moreover, the problem extends to
+other fields beyond functions, such as bisimulation between processes
+specified by coinduction, or in general proving equivalences based on
+the behaviour of a term.
  
  \subsection{Equality reflection}
  
-One way to `solve' this problem is by identifying propositional equality with
-definitional equality.
+One way to `solve' this problem is by identifying propositional equality
+with definitional equality.
  
  \begin{mydef}[Equality reflection]\end{mydef}
  \mydesc{typing:}{\myjud{\mytmsyn}{\mytmsyn}}{
-    \AxiomC{$\myjud{\myse{q}}{\mytmm \mypeq{\mytya} \mytmn}$}
+    \AxiomC{$\myjud{\myse{q}}{\mypeq \myappsp \mytya \myappsp \mytmm \myappsp \mytmn}$}
      \UnaryInfC{$\myjud{\mytmm \mydefeq \mytmn}{\mytya}$}
      \DisplayProof
  }
@@ -1752,15 +1879,6 @@ causes:
    computing under false assumptions becomes unsafe, since we can derive any
    equality proof and then use equality reflection and the conversion
    rule to have terms of any type.
-
-  For example, assuming that we are in a context that contains
-  \[
-  \myb{A} : \mytyp; \myb{q} : \mypeq \myappsp \mytyp
-  \myappsp (\mytya \myarr \mytya) \myappsp \mytya
-  \]
-  we can write a looping term similar to the one we saw in Section
-  \ref{sec:untyped}:
-  % TODO dot this
  \end{itemize}
  
  Given these facts theories employing equality reflection, like NuPRL
@@ -1773,25 +1891,24 @@ using the extensions we gave above.  Assuming that $\myctx$ contains
  We can then derive
  \begin{prooftree}
    \mysmall
-  \AxiomC{$\hspace{1.1cm}\myjudd{\myctx; \myb{x} : \myb{A}}{\myapp{\myb{q}}{\myb{x}}}{\myapp{\myb{f}}{\myb{x}} \mypeq{} \myapp{\myb{g}}{\myb{x}}}\hspace{1.1cm}$}
+  \AxiomC{$\myjudd{\myctx; \myb{x} : \myb{A}}{\myb{q}}{\mypeq \myappsp \myb{A} \myappsp (\myapp{\myb{f}}{\myb{x}}) \myappsp (\myapp{\myb{g}}{\myb{x}})}$}
    \RightLabel{equality reflection}
    \UnaryInfC{$\myjudd{\myctx; \myb{x} : \myb{A}}{\myapp{\myb{f}}{\myb{x}} \mydefeq \myapp{\myb{g}}{\myb{x}}}{\myb{B}}$}
    \RightLabel{congruence for $\lambda$s}
    \UnaryInfC{$\myjud{(\myabs{\myb{x}}{\myapp{\myb{f}}{\myb{x}}}) \mydefeq (\myabs{\myb{x}}{\myapp{\myb{g}}{\myb{x}}})}{\myb{A} \myarr \myb{B}}$}
    \RightLabel{$\eta$-law for $\lambda$}
-  \UnaryInfC{$\hspace{1.45cm}\myjud{\myb{f} \mydefeq \myb{g}}{\myb{A} \myarr \myb{B}}\hspace{1.45cm}$}
+  \UnaryInfC{$\myjud{\myb{f} \mydefeq \myb{g}}{\myb{A} \myarr \myb{B}}$}
    \RightLabel{$\myrefl$}
-  \UnaryInfC{$\myjud{\myapp{\myrefl}{\myb{f}}}{\myb{f} \mypeq{} \myb{g}}$}
+  \UnaryInfC{$\myjud{\myapp{\myrefl}{\myb{f}}}{\mypeq \myappsp (\myb{A} \myarr \myb{B}) \myappsp \myb{f} \myappsp \myb{g}}$}
  \end{prooftree}
-
-Now, the question is: do we need to give up well-behavedness of our theory to
+For this reason, theories employing equality reflection are often
+grouped under the name of \emph{Extensional Type Theory} (ETT).  Now,
+the question is: do we need to give up well-behavedness of our theory to
  gain extensionality?
  
  \section{The observational approach}
  \label{sec:ott}
  
-% TODO add \begin{mydef}s
-
  A recent development by \citet{Altenkirch2007}, \emph{Observational Type
    Theory} (OTT), promises to keep the well behavedness of ITT while
  being able to gain many useful equality proofs,\footnote{It is suspected
@@ -1881,7 +1998,7 @@ ad-hoc conditional for types, where the reduction rule is the obvious
  one.
  
  However, we have an addition: a universe of \emph{propositions},
-$\myprop$.  $\myprop$ isolates a fragment of types at large, and
+$\myprop$.\footnote{Note that we do not need syntax for the type of props, $\myprop$, since the user cannot abstract over them.  In fact, we do not need syntax for $\mytyp$ either, for the same reason.}  $\myprop$ isolates a fragment of types at large, and
  indeed we can `inject' any $\myprop$ back in $\mytyp$ with $\myprdec{\myarg}$.
  \begin{mydef}[Proposition decoding]\ \end{mydef}
  \mydesc{proposition decoding:}{\myprdec{\mytmsyn} \myred \mytmsyn}{
@@ -1905,7 +2022,7 @@ indeed we can `inject' any $\myprop$ back in $\mytyp$ with $\myprdec{\myarg}$.
    Propositions are what we call the types of \emph{proofs}, or types
    whose inhabitants contain no `data', much like $\myunit$.  The goal
    when isolating \mytyc{Prop} is twofold: erasing all top-level
-  propositions when compiling; and to identify all equivalent
+  propositions when compiling; and identifying all equivalent
    propositions as the same, as we will see later.
  
    Why did we choose what we have in $\myprop$?  Given the above
@@ -1917,9 +2034,14 @@ indeed we can `inject' any $\myprop$ back in $\mytyp$ with $\myprdec{\myarg}$.
    decoding will be a constant function for propositional content.  The
    only threat is $\mybot$, by which we can fabricate anything we want:
    however if we are consistent there will be no closed term of type
-  $\mybot$ at, which is what we care about regarding proof erasure and
+  $\mybot$ at all, which is enough regarding proof erasure and
    term equality.
  
+  As an example of types that are \emph{not} propositional, consider
+  $\mydc{Bool}$eans, which are the quintessential `relevant' data, since
+  they are often used to decide the execution path of a program through
+  $\myfun{if}\myarg\myfun{then}\myarg\myfun{else}\myarg$ constructs.
+
  \subsection{Equality proofs}
  
  \begin{mydef}[Equality proofs and related operations]\ \end{mydef}
@@ -1978,7 +2100,7 @@ indeed we can `inject' any $\myprop$ back in $\mytyp$ with $\myprdec{\myarg}$.
  While isolating a propositional universe as presented can be a useful
  exercise on its own, what we are really after is a useful notion of
  equality.  In OTT we want to maintain that things judged to be equal are
-still always repleaceable for one another with no additional
+still always replaceable for one another with no additional
  changes. Note that this is not the same as saying that they are
  definitionally equal, since as we saw extensionally equal functions,
  while satisfying the above requirement, are not.
@@ -2015,34 +2137,44 @@ Before introducing the core ideas that make OTT work, let us distinguish
  between \emph{canonical} and \emph{neutral} terms and types.
  
  \begin{mydef}[Canonical and neutral types and terms]
-  \emph{Canonical} types are those arising from the ground types
-  ($\myempty$, $\myunit$, $\mybool$) and the three type formers
-  ($\myarr$, $\myprod$, $\mytyc{W}$).  \emph{Neutral} types are those
-  formed by $\myfun{If}\myarg\myfun{Then}\myarg\myfun{Else}\myarg$.
-  Correspondingly, canonical terms are those inhabiting data
-  constructors ($\mytt$, $\mytrue$, $\myfalse$,
-  $\myabss{\myb{x}}{\mytya}{\mytmt}$, ...); all the others being
-  neutral, including eliminators and abstracted variables.
+  In a type theory, \emph{neutral} terms are those formed by an
+  abstracted variable or by an eliminator (including function
+  application).  Everything else is \emph{canonical}.
+
+  In the current system, data constructors ($\mytt$, $\mytrue$,
+  $\myfalse$, $\myabss{\myb{x}}{\mytya}{\mytmt}$, ...) will be
+  canonical, the rest neutral.  Correspondingly, canonical types are
+  those arising from the ground types ($\myempty$, $\myunit$, $\mybool$)
+  and the three type formers ($\myarr$, $\myprod$, $\mytyc{W}$).
+  Neutral types are those formed by
+  $\myfun{If}\myarg\myfun{Then}\myarg\myfun{Else}\myarg$.
  \end{mydef}
-
-In the current system (and hopefully in well-behaved systems), all
-closed terms reduce to a canonical term (as a consequence or
-normalisation), and all canonical types are inhabited by canonical
-terms (a property known as \emph{canonicity}).
+\begin{mydef}[Canonicity]
+  If in a system all canonical types are inhabited by canonical terms
+  the system is said to have the \emph{canonicity} property.
+\end{mydef}
+The current system, like well-behaved systems in general, has the
+canonicity property.  Moreover, as a consequence of normalisation, all
+closed terms will reduce to a canonical term.
  
  \subsubsection{Type equality, and coercions}
  
  The plan is to decompose type-level equalities between canonical types
  into decodable propositions containing equalities regarding the
-subterms, and to use coerce recursively on the subterms using the
-generated equalities.  This interplay between the canonicity of equated
-types, type equalities, and \myfun{coe} ensures that invocations of
-$\myfun{coe}$ will vanish when we have evidence of the structural
-equality of the types we are transporting terms across.  If the type is
-neutral, the equality will not reduce and thus $\myfun{coe}$ will not
-reduce either.  If we come an equality between different canonical
-types, then we reduce the equality to bottom, making sure that no such
-proof can exist, and providing an `escape hatch' in $\myfun{coe}$.
+subterms.  So if we are equating two product types, the equality will
+reduce to two subequalities regarding the first and second type.  Then,
+we can \myfun{coe}rce to transport values between equal types.
+Following the subequalities, \myfun{coe} will proceed recursively on the
+subterms.
+
+This interplay between the canonicity of equated types, type
+equalities, and \myfun{coe}, ensures that invocations of $\myfun{coe}$
+will vanish when we have evidence of the structural equality of the
+types we are transporting terms across.  If the type is neutral, the
+equality will not reduce and thus $\myfun{coe}$ will not reduce either.
+If we come across an equality between different canonical types, then we
+reduce the equality to bottom, making sure that no such proof can exist,
+and providing an `escape hatch' in $\myfun{coe}$.
  
  \begin{figure}[t]
  
@@ -2106,14 +2238,13 @@ proof can exist, and providing an `escape hatch' in $\myfun{coe}$.
  
  \begin{mydef}[Type equalities reduction, and \myfun{coe}rcions] Figure
    \ref{fig:eqred} illustrates the rules to reduce equalities and to
-  coerce terms.
+  coerce terms.  We use a $\mysyn{let}$ syntax for legibility.
  \end{mydef}
  For ground types, the proof is the trivial element, and \myfun{coe} is
  the identity.  For $\myunit$, we can do better: we return its only
-member without matching on the term.  For the three type binders, things
-are similar but subtly different---the choices we make in the type
-equality are dictated by the desire of writing the $\myfun{coe}$ in a
-natural way.
+member without matching on the term.  For the three type binders the
+choices we make in the type equality are dictated by the desire of
+writing the $\myfun{coe}$ in a natural way.
  
  $\myprod$ is the easiest case: we decompose the proof into proofs that
  the first element's types are equal ($\mytya_1 \myeq \mytya_2$), and a
@@ -2121,18 +2252,18 @@ proof that given equal values in the first element, the types of the
  second elements are equal too
  ($\myprfora{\myb{x_1}}{\mytya_1}{\myprfora{\myb{x_2}}{\mytya_2}{\myjm{\myb{x_1}}{\mytya_1}{\myb{x_2}}{\mytya_2}}
    \myimpl \mytyb_1[\myb{x_1}] \myeq \mytyb_2[\myb{x_2}]}$).\footnote{We
-  are using $\myimpl$ to indicate a $\forall$ where we discard the first
-  value.  We write $\mytyb_1[\myb{x_1}]$ to indicate that the
+  are using $\myimpl$ to indicate a $\forall$ where we discard the
+  quantified value.  We write $\mytyb_1[\myb{x_1}]$ to indicate that the
    $\myb{x_1}$ in $\mytyb_1$ is re-bound to the $\myb{x_1}$ quantified by
    the $\forall$, and similarly for $\myb{x_2}$ and $\mytyb_2$.}  This
  also explains the need for heterogeneous equality, since in the second
-proof we need to equate terms of possibly different types.  In the respective $\myfun{coe}$ case, since
-the types are canonical, we know at this point that the proof of
-equality is a pair of the shape described above.  Thus, we can
-immediately coerce the first element of the pair using the first element
-of the proof, and then instantiate the second element with the two first
-elements and a proof by coherence of their equality, since we know that
-the types are equal.
+proof we need to equate terms of possibly different types.  In the
+respective $\myfun{coe}$ case, since the types are canonical, we know at
+this point that the proof of equality is a pair of the shape described
+above.  Thus, we can immediately coerce the first element of the pair
+using the first element of the proof, and then instantiate the second
+element with the two first elements and a proof by coherence of their
+equality, since we know that the types are equal.
  
  The cases for the other binders are omitted for brevity, but they follow
  the same principle with some twists to make $\myfun{coe}$ work with the
@@ -2142,13 +2273,14 @@ generated proofs; the reader can refer to the paper for details.
  \label{sec:lazy}
  
  It is important to notice that the reduction rules for $\myfun{coe}$
-are never obstructed by the proofs: with the exception of comparisons
-between different canonical types we never `pattern match' on the proof
-pairs, but always look at the projections.  This means that, as long as
-we are consistent, and thus as long as we don't have $\mybot$-inducing
-proofs, we can add propositional axioms for equality and $\myfun{coe}$
-will still compute.  Thus, we can take $\myfun{coh}$ as axiomatic, and
-we can add back familiar useful equality rules:
+are never obstructed by the structure of the proofs.  With the exception
+of comparisons between different canonical types we never `pattern
+match' on the proof pairs, but always look at the projections.  This
+means that, as long as we are consistent, and thus as long as we don't
+have $\mybot$-inducing proofs, we can add propositional axioms for
+equality and $\myfun{coe}$ will still compute.  Thus, we can take
+$\myfun{coh}$ as axiomatic, and we can add back familiar useful equality
+rules:
  
  \mydesc{typing:}{\myjud{\mytmsyn}{\mytmsyn}}{
    \AxiomC{$\myjud{\mytmt}{\mytya}$}
@@ -2169,7 +2301,7 @@ abstracting over a value we can substitute equal for equal---this lets
  us recover $\myfun{subst}$.  Note that while we need to provide ad-hoc
  rules in the restricted, non-hierarchical theory that we have, if our
  theory supports abstraction over $\mytyp$s we can easily add these
-axioms as abstracted variables.
+axioms as top-level abstracted variables.
  
  \subsubsection{Value-level equality}
  
@@ -2208,6 +2340,11 @@ propositional data, such as $\myempty$ and $\myunit$, we automatically
  return the trivial type, since if a type has zero or one members, all
  members will be equal.  When matching on data-bearing types, such as
  $\mybool$, we check that such data matches, and return bottom otherwise.
+When matching on records and functions, we rebuild the records to
+achieve $\eta$-expansion, and relate functions if they are extensionally
+equal---exactly what we wanted.  The case for \mytyc{W} is omitted but
+unsurprising: it checks that equal data in the nodes will bring equal
+children.
  
  \subsection{Proof irrelevance and stuck coercions}
  \label{sec:ott-quot}
@@ -2238,47 +2375,27 @@ coerce and return $\myb{x}$ as it is.
  \section{\mykant: the theory}
  \label{sec:kant-theory}
  
+\epigraph{\emph{The construction itself is an art, its application to the world an evil parasite.}}{Luitzen Egbertus Jan `Bertus' Brouwer}
+
  \mykant\ is an interactive theorem prover developed as part of this thesis.
  The plan is to present a core language which would be capable of serving as
  the basis for a more featureful system, while still presenting interesting
  features and more importantly observational equality.
  
-We will first present the features of the system, and then describe the
+We will first present the features of the system, along with motivations
+and trade-offs for the design decisions made. Then we describe the
  implementation we have developed in Section \ref{sec:kant-practice}.
-
-The defining features of \mykant\ are:
-
-\begin{description}
-\item[Full dependent types] As we would expect, we have dependent a system
-  which is as expressive as the `best' corner in the lambda cube described in
-  Section \ref{sec:itt}.
-
-\item[Implicit, cumulative universe hierarchy] The user does not need to
-  specify universe level explicitly, and universes are \emph{cumulative}.
-
-\item[User defined data types and records] Instead of forcing the user to
-  choose from a restricted toolbox, we let her define inductive data types,
-  with associated primitive recursion operators; or records, with associated
-  projections for each field.
-
-\item[Bidirectional type checking] While no `fancy' inference via
-  unification is present, we take advantage of a type synthesis system
-  in the style of \cite{Pierce2000}, extending the concept for user
-  defined data types.
-
-\item[Observational equality] As described in Section \ref{sec:ott} but
-  extended to work with the type hierarchy and to admit equality between
-  arbitrary data types.
-
-\item[Type holes] When building up programs interactively, it is useful
-  to leave parts unfinished while exploring the current context.  This
-  is what type holes are for.  We do not describe holes rigorously, but
-  provide more information about them in Section \ref{sec:type-holes}.
-
-\end{description}
-
-We will analyse the features one by one, along with motivations and
-tradeoffs for the design decisions made.
+For an overview of the features of \mykant, see
+Section \ref{sec:contributions}; here we present them one by one.  The
+exception is type holes, which we do not describe rigorously, but
+provide more information about in Section \ref{sec:type-holes}.
+
+Note that in this section we will present \mykant\ terms in a fancy
+\LaTeX\ dress to keep up with the presentation, but every term, with its
+syntax reduced to the concrete syntax, is a valid \mykant\ term accepted
+by \mykant\ the software, and not only \mykant\ the theory.  Appendix
+\ref{app:kant-examples} displays most of the terms in this section in
+their concrete syntax.
  
  \subsection{Bidirectional type checking}
  
@@ -2297,8 +2414,9 @@ To introduce the concept and notation, we will revisit the STLC in a
  bidirectional style.  The presentation follows \cite{Loh2010}.  The
  syntax for our bidirectional STLC is the same as the untyped
  $\lambda$-calculus, but with an extra construct to annotate terms
-explicitly---this will be necessary when having top-level canonical
-terms.  The types are the same as those found in the normal STLC.
+explicitly---this will be necessary when dealing with top-level
+canonical terms.  The types are the same as those found in the normal
+STLC.
  
  \begin{mydef}[Syntax for the annotated $\lambda$-calculus]\ \end{mydef}
  \mynegder
@@ -2309,13 +2427,18 @@ terms.  The types are the same as those found in the normal STLC.
    \end{array}
    $
  }
+
  We will have two kinds of typing judgements: \emph{inference} and
  \emph{checking}.  $\myinf{\mytmt}{\mytya}$ indicates that $\mytmt$
  infers the type $\mytya$, while $\mychk{\mytmt}{\mytya}$ can be checked
-against type $\mytya$.  The type of variables in context is inferred,
-and so are annotate terms.  The type of applications is inferred too,
-propagating types down the applied term.  Abstractions are checked.
-Finally, we have a rule to check the type of an inferrable term.
+against type $\mytya$.  The arrows signify the direction of the type
+checking---inference pushes types up, checking propagates types
+down.
+
+The type of variables in context is inferred, and so are annotated terms.
+The type of applications is inferred too, propagating types down the
+applied term.  Abstractions are checked.  Finally, we have a rule to
+check the type of an inferrable term.
  
  \begin{mydef}[Bidirectional type checking for the STLC]\ \end{mydef}
  \mynegder
@@ -2351,11 +2474,14 @@ Finally, we have a rule to check the type of an inferrable term.
  For example, if we wanted to type function composition (in this case for
  naturals), we would have to annotate the term:
  \[
-  \myfun{comp} \mapsto (\myabs{\myb{f}\, \myb{g}\, \myb{x}}{\myb{f}\myappsp(\myb{g}\myappsp\myb{x})}) : (\mynat \myarr \mynat) \myarr (\mynat \myarr \mynat) \myarr \mynat \myarr \mynat
+\begin{array}{@{}l}
+  \myfun{comp} :  (\mynat \myarr \mynat) \myarr (\mynat \myarr \mynat) \myarr \mynat \myarr \mynat \\
+  \myfun{comp} \mapsto (\myabs{\myb{f}\, \myb{g}\, \myb{x}}{\myb{f}\myappsp(\myb{g}\myappsp\myb{x})})
+\end{array}
  \]
  But we would not have to annotate functions passed to it, since the type would be propagated to the arguments:
  \[
-   \myfun{comp}\myappsp (\myabs{\myb{x}}{\myb{x} \mathrel{\myfun{$+$}} 3}) \myappsp (\myabs{\myb{x}}{\myb{x} \mathrel{\myfun{$*$}} 4}) \myappsp 42
+   \myfun{comp}\myappsp (\myabs{\myb{x}}{\myb{x} \mathrel{\myfun{$+$}} 3}) \myappsp (\myabs{\myb{x}}{\myb{x} \mathrel{\myfun{$*$}} 4}) \myappsp 42 : \mynat
  \]
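+
+As a sketch of how this propagation works (assuming that the
+application rule infers the type of the function and then checks each
+argument against the corresponding domain), we first infer
+\[
+  \myinf{\myfun{comp}}{(\mynat \myarr \mynat) \myarr (\mynat \myarr \mynat) \myarr \mynat \myarr \mynat}
+\]
+from the context.  Each argument is then checked against the type that
+the function expects, so that no annotation is needed on the
+abstractions:
+\[
+\begin{array}{@{}l}
+  \mychk{(\myabs{\myb{x}}{\myb{x} \mathrel{\myfun{$+$}} 3})}{\mynat \myarr \mynat} \\
+  \mychk{(\myabs{\myb{x}}{\myb{x} \mathrel{\myfun{$*$}} 4})}{\mynat \myarr \mynat} \\
+  \mychk{42}{\mynat}
+\end{array}
+\]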
  
  \subsection{Base terms and types}
@@ -2401,12 +2527,12 @@ names can be associated with a body.
        \UnaryInfC{$\myvalid{\myemptyctx}$}
        \DisplayProof
        &
-      \AxiomC{$\myjud{\mytya}{\mytyp}$}
+      \AxiomC{$\mychk{\mytya}{\mytyp}$}
        \AxiomC{$\mynamesyn \not\in \myctx$}
        \BinaryInfC{$\myvalid{\myctx ; \mynamesyn : \mytya}$}
        \DisplayProof
        &
-      \AxiomC{$\myjud{\mytmt}{\mytya}$}
+      \AxiomC{$\mychk{\mytmt}{\mytya}$}
        \AxiomC{$\myfun{f} \not\in \myctx$}
        \BinaryInfC{$\myvalid{\myctx ; \myfun{f} \mapsto \mytmt : \mytya}$}
        \DisplayProof
@@ -2472,8 +2598,8 @@ Section \ref{sec:kant-irr}.
        \UnaryInfC{$\myinf{\mytyp}{\mytyp}$}
        \DisplayProof
        &
-    \AxiomC{$\myinf{\mytya}{\mytyp}$}
-    \AxiomC{$\myinff{\myctx; \myb{x} : \mytya}{\mytyb}{\mytyp}$}
+    \AxiomC{$\mychk{\mytya}{\mytyp}$}
+    \AxiomC{$\mychkk{\myctx; \myb{x} : \mytya}{\mytyb}{\mytyp}$}
      \BinaryInfC{$\myinf{(\myb{x} {:} \mytya) \myarr \mytyb}{\mytyp}$}
      \DisplayProof
  
@@ -2510,15 +2636,17 @@ although with some differences.
  
  \begin{mydef}[Term vector]
    A \emph{term vector} is a series of terms.  The empty vector is
-  represented by $\myemptyctx$, and a new element is added with a
-  semicolon, similarly to contexts---$\vec{t};\mytmm$.
+  represented by $\myemptyctx$, and a new element is added with
+  $\myarg;\myarg$, similarly to contexts---$\vec{t};\mytmm$.
  \end{mydef}
  
-We use term vectors to refer to a series of term applied to another. For
-example $\mytyc{D} \myappsp \vec{A}$ is a shorthand for $\mytyc{D}
-\myappsp \mytya_1 \cdots \mytya_n$, for some $n$.  $n$ is consistently
-used to refer to the length of such vectors, and $i$ to refer to an
-index in such vectors.
+We denote term vectors with the usual arrow notation,
+e.g. $\vec{\mytmt}$, $\myvec{\mytmt};\mytmm$, etc.  We often use term
+vectors to refer to a series of terms applied to another. For example
+$\mytyc{D} \myappsp \vec{A}$ is a shorthand for $\mytyc{D} \myappsp
+\mytya_1 \cdots \mytya_n$, for some $n$.  $n$ is consistently used to
+refer to the length of such vectors, and $i$ to refer to an index such
+that $1 \le i \le n$.
  
  \begin{mydef}[Telescope]
    A \emph{telescope} is a series of typed bindings.  The empty telescope
@@ -2527,14 +2655,13 @@ index in such vectors.
  \end{mydef}
  
  To present the elaboration and operations on user defined data types, we
-frequently make use what de Bruijn called \emph{telescopes}
-\citep{Bruijn91}, a construct that will prove useful when dealing with
-the types of type and data constructors.  We refer to telescopes with
-$\mytele$, $\mytele'$, $\mytele_i$, etc.  If $\mytele$ refers to a
-telescope, $\mytelee$ refers to the term vector made up of all the
-variables bound by $\mytele$.  $\mytele \myarr \mytya$ refers to the
-type made by turning the telescope into a series of $\myarr$.  For
-example we have that
+frequently make use of what \cite{Bruijn91} called \emph{telescopes}, a
+construct that will prove useful when dealing with the types of type and
+data constructors.  We refer to telescopes with $\mytele$, $\mytele'$,
+$\mytele_i$, etc.  If $\mytele$ refers to a telescope, $\mytelee$ refers
+to the term vector made up of all the variables bound by $\mytele$.
+$\mytele \myarr \mytya$ refers to the type made by turning the telescope
+into a series of $\myarr$.  For example we have that
  \[
     (\myb{x} {:} \mynat); (\myb{p} : \myapp{\myfun{even}}{\myb{x}}) \myarr \mynat =
     (\myb{x} {:} \mynat) \myarr (\myb{p} : \myapp{\myfun{even}}{\myb{x}}) \myarr \mynat
@@ -2556,10 +2683,10 @@ We make use of various operations to manipulate telescopes:
    \myapp{\myfun{even}}{42})$.
  \end{itemize}
  
-Additionally, when presenting syntax elaboration, I'll use $\mytmsyn^n$
-to indicate a term vector composed of $n$ elements, or
-$\mytmsyn^{\mytele}$ for one composed by as many elements as the
-telescope.
+Additionally, when presenting syntax elaboration, we use $\mytmsyn^n$ to
+indicate a term vector composed of $n$ elements.  When clear from the
+context, we use telescopes to signify their length,
+e.g. $\mytmsyn^{\mytele}$, or $1 \le i \le \mytele$.
  
  \subsubsection{Declarations syntax}
  
@@ -2584,13 +2711,17 @@ In \mykant\ we have four kind of declarations:
  \item[Defined value] A variable, together with a type and a body.
  \item[Abstract variable] An abstract variable, with a type but no body.
  \item[Inductive data] A \emph{data type}, with a \emph{type constructor}
-  and various \emph{data constructors}, quite similar to what we find in
-  Haskell.  A primitive \emph{eliminator} (or \emph{destructor}, or
-  \emph{recursor}) will be used to compute with each data type.
-\item[Record] A \emph{record}, which consists of one data constructor
-  and various \emph{fields}, with no recursive occurrences.  The
+  (denoted in blue, capitalised, sans serif: $\mytyc{D}$) and various
+  \emph{data constructors} (denoted in red, lowercase, sans serif:
+  $\mydc{c}$), quite similar to what we find in Haskell.  A primitive
+  \emph{eliminator} (or \emph{destructor}, or \emph{recursor}; denoted
+  by green, lowercase, roman: \myfun{elim}) will be used to compute with
+  each data type.
+\item[Record] A \emph{record}, which like data types consists of a type
+  constructor but only one data constructor.  The user can also define
+  various \emph{fields}, with no recursive occurrences of the type.  The
    functions extracting the fields' values from an instance of a record
-  are called \emph{projections}.
+  are called \emph{projections} (denoted in the same way as destructors).
  \end{description}
  
  Elaborating defined variables consists of type checking the body against
@@ -2669,11 +2800,11 @@ data Nat = Zero | Suc Nat
    Moreover, each data constructor is prefixed by the type constructor
    name, since we need to retrieve the type constructor of a data
    constructor when type checking.  This measure aids in the presentation
-  of various features but it is not needed in the implementation, where
-  we can have a dictionary to lookup the type constructor corresponding
+  of the theory but it is not needed in the implementation, where
+  we can have a dictionary to look up the type constructor corresponding
    to each data constructor.  When using data constructors in examples I
    will omit the type constructor prefix for brevity, in this case
-  writing $\mydc{zero}$ instead of $\mynat.\mydc{suc}$ and $\mydc{suc}$ instead of
+  writing $\mydc{zero}$ instead of $\mynat.\mydc{zero}$ and $\mydc{suc}$ instead of
    $\mynat.\mydc{suc}$.
  
    Along with user defined constructors, $\mykant$\ automatically
@@ -2741,9 +2872,10 @@ $\mynat$---the type system is far too weak.
    \end{array}
    \]
    The problem with this approach is that creating terms is incredibly
-  verbose and dull, since we would need to specify the type parameters
-  each time.  For example if we wished to create a $\mytree \myappsp
-  \mynat$ with two nodes and three leaves, we would write
+  verbose and dull, since we would need to specify the type parameter of
+  $\mytyc{Tree}$ each time.  For example if we wished to create a
+  $\mytree \myappsp \mynat$ with two nodes and three leaves, we would
+  write
    \[
    \mydc{node} \myappsp \mynat \myappsp (\mydc{node} \myappsp \mynat \myappsp (\mydc{leaf} \myappsp \mynat) \myappsp (\myapp{\mydc{suc}}{\mydc{zero}}) \myappsp (\mydc{leaf} \myappsp \mynat)) \myappsp \mydc{zero} \myappsp (\mydc{leaf} \myappsp \mynat)
    \]
@@ -2817,8 +2949,8 @@ $\mynat$---the type system is far too weak.
    new to the \{Haskell, SML, OCaml, functional\} programmer.  However
    dependent types let us express much more than that.  A useful example
    is the type of ordered lists. There are many ways to define such a
-  thing, but we will define our type to store the bounds of the list,
-  making sure that $\mydc{cons}$ing respects that.
+  thing, but we will define ours to store the bounds of the list, making
+  sure that $\mydc{cons}$ing respects that.
  
    First, using $\myunit$ and $\myempty$, we define a type expressing the
    ordering on natural numbers, $\myfun{le}$---`less or equal'.
@@ -2854,20 +2986,21 @@ $\mynat$---the type system is far too weak.
              \myind{2}\myind{2} \myb{l_1} \\
              \myind{2}\myind{2} (\myabs{\myarg}{\mytyc{Lift} \myarr \mytyp}) \\
              \myind{2}\myind{2} (\myabs{\myarg}{\myunit}) \\
-            \myind{2}\myind{2} (\myabs{\myb{n_1}\, \myb{n_2}}{
+            \myind{2}\myind{2} (\myabs{\myb{n_1}\, \myb{l_2}}{
                \mytyc{Lift}.\myfun{elim} \myappsp \myb{l_2} \myappsp (\myabs{\myarg}{\mytyp}) \myappsp \myempty \myappsp (\myabs{\myb{n_2}}{\myfun{le} \myappsp \myb{n_1} \myappsp \myb{n_2}}) \myappsp \myunit
              }) \\
-            \myind{2}\myind{2} (\myabs{\myb{n_1}\, \myb{n_2}}{
+            \myind{2}\myind{2} (\myabs{\myb{l_2}}{
                \mytyc{Lift}.\myfun{elim} \myappsp \myb{l_2} \myappsp (\myabs{\myarg}{\mytyp}) \myappsp \myempty \myappsp (\myabs{\myarg}{\myempty}) \myappsp \myunit
              })
      \end{array}
      \]
-  Finally, we can defined a type of ordered lists.  The type is
-  parametrised over two values representing the lower and upper bounds
-  of the elements, as opposed to the type parameters that we are used
-  to.  Then, an empty list will have to have evidence that the bounds
-  are ordered, and each time we add an element we require the list to
-  have a matching lower bound:
+    Finally, we can define a type of ordered lists.  The type is
+    parametrised over two \emph{values} representing the lower and upper
+    bounds of the elements, as opposed to the \emph{type} parameters
+    that we are used to in Haskell or similar languages.  An empty
+    list will have to have evidence that the bounds are ordered, and
+    each time we add an element we require the list to have a matching
+    lower bound:
    \[
      \begin{array}{@{}l}
        \myadt{\mytyc{OList}}{\myappsp (\myb{low}\ \myb{upp} {:} \mytyc{Lift})}{\\ \myind{2}}{
@@ -2887,7 +3020,7 @@ $\mynat$---the type system is far too weak.
  
  \item[Dependent products] Apart from $\mysyn{data}$, $\mykant$\ offers
    us another way to define types: $\mysyn{record}$.  A record is a
-  datatype with one constructor and `projections' to extract specific
+  data type with one constructor and `projections' to extract specific
    fields of the said constructor.
  
    For example, we can recover dependent products:
@@ -2924,11 +3057,13 @@ $\mynat$---the type system is far too weak.
      \end{tabular}
    \end{center}
    What we have defined here is equivalent to ITT's dependent products.
+
  \end{description}
  
  \begin{figure}[p]
+  \vspace{-.5cm}
      \mydesc{syntax}{ }{
-      \footnotesize
+      \small
        $
        \begin{array}{l}
          \mynamesyn ::= \cdots \mysynsep \mytyc{D} \mysynsep \mytyc{D}.\mydc{c} \mysynsep \mytyc{D}.\myfun{f}
@@ -2939,7 +3074,7 @@ $\mynat$---the type system is far too weak.
      \mynegder
  
    \mydesc{syntax elaboration:}{\mydeclsyn \myelabf \mytmsyn ::= \cdots}{
-    \footnotesize
+    \small
        $
        \begin{array}{r@{\ }l}
           & \myadt{\mytyc{D}}{\mytele}{}{\cdots\ |\ \mydc{c}_n : \mytele_n } \\
@@ -2956,7 +3091,7 @@ $\mynat$---the type system is far too weak.
      \mynegder
  
    \mydesc{context elaboration:}{\myelab{\mydeclsyn}{\myctx}}{
-        \footnotesize
+        \small
  
        \AxiomC{$
          \begin{array}{c}
@@ -2999,7 +3134,7 @@ $\mynat$---the type system is far too weak.
      \mynegder
  
    \mydesc{reduction elaboration:}{\mydeclsyn \myelabf \myctx \vdash \mytmsyn \myred \mytmsyn}{  
-        \footnotesize
+        \small
          $\myadt{\mytyc{D}}{\mytele}{}{ \cdots \ |\ \mydc{c}_n : \mytele_n } \ \ \myelabf$
        \AxiomC{$\mytyc{D} : \mytele \myarr \mytyp \in \myctx$}
        \AxiomC{$\mytyc{D}.\mydc{c}_i : \mytele;\mytele_i \myarr \myapp{\mytyc{D}}{\mytelee} \in \myctx$}
@@ -3019,7 +3154,7 @@ $\mynat$---the type system is far too weak.
      \mynegder
  
      \mydesc{syntax elaboration:}{\myelab{\mydeclsyn}{\mytmsyn ::= \cdots}}{
-          \footnotesize
+          \small
      $
      \begin{array}{r@{\ }c@{\ }l}
        \myctx & \myelabt & \myreco{\mytyc{D}}{\mytele}{}{ \cdots, \myfun{f}_n : \myse{F}_n } \\
@@ -3035,7 +3170,7 @@ $\mynat$---the type system is far too weak.
      \mynegder
  
  \mydesc{context elaboration:}{\myelab{\mydeclsyn}{\myctx}}{
-      \footnotesize
+      \small
      \AxiomC{$
        \begin{array}{c}
          \myinf{\mytele \myarr \mytyp}{\mytyp}\hspace{0.8cm}
@@ -3057,7 +3192,7 @@ $\mynat$---the type system is far too weak.
      \mynegder
  
    \mydesc{reduction elaboration:}{\mydeclsyn \myelabf \myctx \vdash \mytmsyn \myred \mytmsyn}{
-        \footnotesize
+        \small
            $\myreco{\mytyc{D}}{\mytele}{}{ \cdots, \myfun{f}_n : \myse{F}_n } \ \ \myelabf$
            \AxiomC{$\mytyc{D} \in \myctx$}
            \UnaryInfC{$\myctx \vdash \myapp{\mytyc{D}.\myfun{f}_i}{(\mytyc{D}.\mydc{constr} \myappsp \vec{t})} \myred t_i$}
@@ -3089,7 +3224,10 @@ are strictly positive, which ensures the consistency of the theory.  To
  achieve that we employ a syntactic check to make sure that this is
  the case---in fact the check is stricter than necessary for simplicity,
  given that we allow recursive occurrences only at the top level of data
-constructor arguments.
+constructor arguments.  For example a definition of the $\mytyc{W}$ type
+is accepted in Agda but rejected in \mykant.  This is to make the
+eliminator generation simpler, and in practice it is seldom an
+impediment.
  
  Without these precautions, we can easily derive any type with no
  recursion:
@@ -3115,8 +3253,8 @@ destructors, we store their types in full in the context, and then
  instantiate when due.
  \end{mydef}
  \mynegder
-\mydesc{typing:}{\myctx \vdash \mytmsyn \Updownarrow \mytmsyn}{
-    \AxiomC{$
+\mydesc{typing:}{\myctx
+  \vdash \mytmsyn \Updownarrow \mytmsyn}{ \AxiomC{$
        \begin{array}{c}
          \mytyc{D} : \mytele \myarr \mytyp \in \myctx \hspace{1cm}
          \mytyc{D}.\mydc{c} : \mytele \mycc \mytele' \myarr
@@ -3140,19 +3278,22 @@ instantiate when due.
          \myse{F})(\vec{A};\mytmt)}$}
      \DisplayProof
    }
+Note that for 0-ary type constructors, like $\mynat$, we do not need to
+check canonical terms: we can automatically infer that $\mydc{zero}$ and
+$\mydc{suc}\myappsp n$ are of type $\mynat$.  \mykant\ implements this measure, even
+if it is not shown in the typing rule for simplicity.
  
  \subsubsection{Why user defined types?  Why eliminators?}
  
-The hardest design choice when designing $\mykant$\ was to decide
-whether user defined types should be included, and how to handle them.
-In the end, as we saw, we can devise general structures like $\mytyc{W}$
-that can express all inductive structures.  However, using those tools
-beyond very simple examples is near-impossible for a human user.  Thus
-most theorem provers in the wild provide some means for the user to
-define structures tailored to specific uses.
+The hardest design choice in developing $\mykant$\ was to decide whether
+user defined types should be included, and how to handle them.  As we
+saw, while we can devise general structures like $\mytyc{W}$, they are
+unsuitable both for direct usage and `mechanical' usage.  Thus most
+theorem provers in the wild provide some means for the user to define
+structures tailored to specific uses.
  
  Even if we take user defined types for granted, while there is not much
-debate on how to handle record, there are two broad schools of thought
+debate on how to handle records, there are two broad schools of thought
  regarding the handling of data types:
  \begin{description}
  \item[Fixed points and pattern matching] The road chosen by Agda and Coq.
@@ -3166,9 +3307,14 @@ regarding the handling of data types:
    pioneered by the Epigram line of work.  The advantage is that we can
    reduce every data type to simple definitions which guarantee
    termination and are simple to reduce and type.  It is however more
-  cumbersome to use than pattern maching, although \cite{McBride2004}
+  cumbersome to use than pattern matching, although \cite{McBride2004}
    has shown how to implement an expressive pattern matching interface on
    top of a larger set of combinators than those provided by \mykant.
+
+  We can go even further down this road and elaborate the declarations
+  for data types themselves to a small set of primitives, so that our `core'
+  language will be very small and manageable
+  \citep{dagand2012elaborating, chapman2010gentle}.
  \end{description}
  
  We chose the safer and easier to implement path, given the time
@@ -3235,19 +3381,18 @@ types too.
  
  \begin{mydef}[Cumulativity for \mykant' base types]
    Figure \ref{fig:cumulativity} gives a formal definition of
-  cumulativity for the base types.  Similar measures can be taken for
-  user defined types, withe the type living in the least upper bound of
-  the levels where the types contained data live.
+  \emph{cumulativity} for the base types.  Similar measures can be taken
+  for user defined types, with the type living in the least upper bound
+  of the levels where the types of the contained data live.
  \end{mydef}
-
  For example we might define our disjunction to be
  \[
    \myarg\myfun{$\vee$}\myarg : \mytyp_{100} \myarr \mytyp_{100} \myarr \mytyp_{100}
  \]
  And hope that $\mytyp_{100}$ will be large enough to fit all the types
  that we want to use with our disjunction.  However, there are two
-problems with this.  First, there is the obvious clumsyness of having to
-manually specify the size of types.  More importantly, if we want to use
+problems with this.  First, the clumsiness of having to manually specify the
+size of types is still there.  More importantly, if we want to use
  $\myfun{$\vee$}$ itself as an argument to other type-formers, we need to
  make sure that those allow for types at least as large as
  $\mytyp_{100}$.
@@ -3255,29 +3400,31 @@ $\mytyp_{100}$.
  A better option is to employ a mechanised version of what Russell called
  \emph{typical ambiguity}: we let the user live under the illusion that
  $\mytyp : \mytyp$, but check that the statements about types are
-consistent under the hood.  $\mykant$\ implements this along the lines
-of \cite{Huet1988}.  See also \cite{Harper1991} for a published
-reference, although describing a more complex system allowing for both
-explicit and explicit hierarchy at the same time.
+consistent under the hood.  $\mykant$\ implements this following the
+plan given by \cite{Huet1988}.  See also \cite{Harper1991} for a
+published reference, although describing a more complex system allowing
+for both explicit and implicit hierarchy at the same time.
  
  We define a partial ordering on the levels, with both weak ($\le$) and
-strong ($<$) constraints---the laws governing them being the same as the
+strong ($<$) constraints, the laws governing them being the same as the
  ones governing $<$ and $\le$ for the natural numbers.  Each occurrence
-of $\mytyp$ is decorated with a unique reference, and we keep a set of
-constraints and add new constraints as we type check, generating new
-references when needed.
+of $\mytyp$ is decorated with a unique reference.  We keep a set of
+constraints regarding the ordering of each occurrence of $\mytyp$, each
+represented by its unique reference.  We add new constraints as we type
+check, generating new references when needed.
  
  For example, when type checking the type $\mytyp\, r_1$, where $r_1$
  denotes the unique reference assigned to that term, we will generate a
-new fresh reference $\mytyp\, r_2$, and add the constraint $r_1 < r_2$
-to the set.  When type checking $\myctx \vdash
+new fresh reference and return the type $\mytyp\, r_2$, adding the
+constraint $r_1 < r_2$ to the set.  When type checking $\myctx \vdash
  \myfora{\myb{x}}{\mytya}{\mytyb}$, if $\myctx \vdash \mytya : \mytyp\,
  r_1$ and $\myctx; \myb{x} : \mytya \vdash \mytyb : \mytyp\,r_2$, we will
  generate a new reference $r$ and add $r_1 \le r$ and $r_2 \le r$ to the
  set.
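+
+As a small worked example of the rules just described (the reference
+names $r_3$, $r_4$ and $r$ are chosen here purely for illustration),
+consider type checking $\myfora{\myb{x}}{\mytyp\, r_1}{\mytyp\, r_2}$.
+Checking the domain and the codomain yields
+$\mytyp\, r_1 : \mytyp\, r_3$ and $\mytyp\, r_2 : \mytyp\, r_4$, and the
+binder rule generates a fresh $r$, so that we accumulate the constraints
+\[
+  r_1 < r_3 \qquad r_2 < r_4 \qquad r_3 \le r \qquad r_4 \le r
+\]
+This set is consistent: it is satisfied, for instance, by taking
+$r_1 = r_2 = 0$ and $r_3 = r_4 = r = 1$.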
  
  If at any point the constraint set becomes inconsistent, type checking
-fails.  Moreover, when comparing two $\mytyp$ terms we equate their
+fails.  Moreover, when comparing two $\mytyp$ terms---during the process
+of deciding definitional equality for two terms---we equate their
  respective references with two $\le$ constraints.  Implementation
  details are given in Section \ref{sec:hier-impl}.
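+
+For instance, assuming the laws for $<$ and $\le$ on natural numbers
+mentioned above, a set containing $r_1 < r_2$ becomes inconsistent as
+soon as we equate $\mytyp\, r_1$ and $\mytyp\, r_2$, since the two added
+constraints give us
+\[
+  r_1 < r_2 \qquad r_1 \le r_2 \qquad r_2 \le r_1
+\]
+and $r_1 < r_2 \le r_1$ would imply $r_1 < r_1$.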
  
@@ -3289,7 +3436,7 @@ expressed:
  \myarg\myfun{$\vee$}\myarg : (l_1\, l_2 : \mytyc{Level}) \myarr \mytyp_{l_1} \myarr \mytyp_{l_2} \myarr \mytyp_{l_1 \mylub l_2}
  \]
  Inference algorithms to automatically derive this kind of relationship
-are currently subject of research.  We chose less flexible but more
+are currently a subject of research.  We choose a less flexible but more
  concise way, since it is easier to implement and better understood.
  
  \subsection{Observational equality, \mykant\ style}
@@ -3302,19 +3449,17 @@ is that we let the user define inductive types and records.
  Reconciling propositions for OTT and a hierarchy had already been
  investigated by Conor McBride,\footnote{See
    \url{http://www.e-pig.org/epilogue/index.html?p=1098.html}.} and we
-follow his broad design plan, although with some innovation.  Most of
-the work, as an extension of elaboration, is to handle reduction rules
-and coercions for data types---both type constructors and data
-constructors.
+follow some of his suggestions, with some innovation.  Most of the dirty
+work, as an extension of elaboration, is to handle reduction rules and
+coercions for data types---both type constructors and data constructors.
  
  \subsubsection{The \mykant\ prelude, and $\myprop$ositions}
  
  Before defining $\myprop$, we define some basic types inside $\mykant$,
  as the target for the $\myprop$ decoder.
-
  \begin{mydef}[\mykant' propositional prelude]\ \end{mydef}
  \[
-\begin{array}{l}
+\begin{array}{@{}l}
    \myadt{\mytyc{Empty}}{}{ }{ } \\
    \myfun{absurd} : (\myb{A} {:} \mytyp) \myarr \mytyc{Empty} \myarr \myb{A} \mapsto \\
    \myind{2} \myabs{\myb{A}\ \myb{bot}}{\mytyc{Empty}.\myfun{elim} \myappsp \myb{bot} \myappsp (\myabs{\_}{\myb{A}})} \\
@@ -3401,12 +3546,13 @@ equalities.
        \myind{2} \mytya_1 \myeq \mytya_2 \myand \myprfora{\myb{x_1}}{\mytya_1}{\myprfora{\myb{x_2}}{\mytya_2}{\myjm{\myb{x_1}}{\mytya_1}{\myb{x_2}}{\mytya_2}} \myimpl \myapp{\mytyb_1}{\myb{x_1}} \myeq \myapp{\mytyb_2}{\myb{x_2}}}
      \end{array}
    \]
-  The difference here is that in the original presentation of OTT
-  the type binders are explicit, while here $\mytyb_1$ and $\mytyb_2$ are
+  The difference here is that in the original presentation of OTT the
+  type binders are explicit, while here $\mytyb_1$ and $\mytyb_2$ are
    functions returning types.  We can do this thanks to the type
    hierarchy, and this hints at the fact that heterogeneous equality will
-  have to allow $\mytyp$ `to the right of the colon', and in fact this
-  provides the solution to simplify the equality above.
+  have to allow $\mytyp$ `to the right of the colon'.  Indeed,
+  heterogeneous equalities involving abstractions over types will
+  provide the solution to simplify the equality above.
  
    If we take, just like we saw previously in OTT
    \[
@@ -3418,18 +3564,30 @@ equalities.
           }}
      \end{array}
    \]
-  Then we can simply take
+  Then we can simply have
    \[
      \begin{array}{@{}l}
        \mysigma \myappsp \mytya_1 \myappsp \mytyb_1 \myeq \mysigma \myappsp \mytya_2 \myappsp \mytyb_2 \myred \\ \myind{2} \mytya_1 \myeq \mytya_2 \myand \myjm{\mytyb_1}{\mytya_1 \myarr \mytyp}{\mytyb_2}{\mytya_2 \myarr \mytyp}
      \end{array}
    \]
-  Which will reduce to precisely what we desire.  For what
-  concerns coercions and quotation, things stay the same (apart from the
-  fact that we apply to the second argument instead of substituting).
-  We can recognise records such as $\mysigma$ as such and employ
-  projections in value equality and coercions; as to not
-  impede progress if not necessary.
+  Which will reduce to precisely what we desire, but with
+  heterogeneous equalities relating types instead of values:
+  \[
+  \begin{array}{@{}l}
+    \mytya_1 \myeq \mytya_2 \myand \myjm{\mytyb_1}{\mytya_1 \myarr \mytyp}{\mytyb_2}{\mytya_2 \myarr \mytyp} \myred \\
+    \mytya_1 \myeq \mytya_2 \myand
+    \myprfora{\myb{x_1}}{\mytya_1}{\myprfora{\myb{x_2}}{\mytya_2}{
+        \myjm{\myb{x_1}}{\mytya_1}{\myb{x_2}}{\mytya_2} \myimpl
+        \myjm{\myapp{\mytyb_1}{\myb{x_1}}}{\mytyp}{\myapp{\mytyb_2}{\myb{x_2}}}{\mytyp}
+      }}
+  \end{array}
+  \]
+  If we pretend for the moment that those heterogeneous equalities were
+  type equalities, things run smoothly. For what concerns coercions and
+  quotation, things stay the same (apart from the fact that we apply to
+  the second argument instead of substituting).  We can recognise
+  records such as $\mysigma$ as such and employ projections in value
+  equality and coercions, so as not to impede progress unnecessarily.
  
  \item[Lists] Now for finite lists, which will give us a taste for data
    constructors:
@@ -3465,10 +3623,6 @@ equalities.
        (& \mydc{cons} \myappsp \mytmm_1 \myappsp \mytmn_1 & : & \myapp{\mylist}{\mytya_1} &) & \myeq & (& \mydc{nil} & : & \myapp{\mylist}{\mytya_2} &) \myred \mybot
      \end{array}
    \]
-
-\item[Evil type]
-  Now for something useless but complicated. % TODO finish
-
  \end{description}
  
  \subsubsection{Only one equality}
@@ -3477,51 +3631,58 @@ Given the examples above, a more `flexible' heterogeneous equality must
  emerge, owing to the fact that in $\mykant$ we regain the possibility
  of abstracting and in general handling types in a way that was not
  possible in the original OTT presentation.  Moreover, we found that the
-rules for value equality work very well if used with user defined type
+rules for value equality work well if used with user defined type
  abstractions---for example in the case of dependent products we recover
-the original definition with explicit binders, in a very simple manner.
+the original definition with explicit binders, in a natural manner.
+
+\begin{mydef}[Propositions, coercions, coherence, equalities and
+  equality reduction for \mykant] See Figure \ref{fig:kant-eq-red}.
+\end{mydef}
+
+\begin{mydef}[Type equality in \mykant]
+  We define $\mytya \myeq \mytyb$ as an abbreviation for
+  $\myjm{\mytya}{\mytyp}{\mytyb}{\mytyp}$.
+\end{mydef}
  
  In fact, we can drop a separate notion of type-equality, which will
-simply be served by $\myjm{\mytya}{\mytyp}{\mytyb}{\mytyp}$, from now on
-abbreviated as $\mytya \myeq \mytyb$.  We shall still distinguish
-equalities relating types for hierarchical purposes.  The full rules for
-equality reductions, along with the syntax for propositions, are given
-in figure \ref{fig:kant-eq-red}.  We exploit record to perform
-$\eta$-expansion.  Moreover, given the nested $\myand$s, values of data
-types with zero constructors (such as $\myempty$) and records with zero
-destructors (such as $\myunit$) will be automatically always identified
-as equal.
+simply be served by $\myjm{\mytya}{\mytyp}{\mytyb}{\mytyp}$.  We shall
+still distinguish equalities relating types for hierarchical
+purposes.  We exploit records to perform $\eta$-expansion.  Moreover,
+given the nested $\myand$s, values of data types with zero constructors
+(such as $\myempty$) and records with zero destructors (such as
+$\myunit$) will automatically be identified as equal.  As in the
+original OTT, and for the same reasons, we can take $\myfun{coh}$ as
+axiomatic.
+
  
  \begin{figure}[p]
  \mydesc{syntax}{ }{
    \small
    $
    \begin{array}{r@{\ }c@{\ }l}
+    \mytmsyn & ::= & \cdots \mysynsep \mycoee{\mytmsyn}{\mytmsyn}{\mytmsyn}{\mytmsyn} \mysynsep
+                     \mycohh{\mytmsyn}{\mytmsyn}{\mytmsyn}{\mytmsyn} \\
      \myprsyn & ::= & \cdots \mysynsep \myjm{\mytmsyn}{\mytmsyn}{\mytmsyn}{\mytmsyn} \\
    \end{array}
    $
  }
  
-    % \mytmsyn & ::= & \cdots \mysynsep \mycoee{\mytmsyn}{\mytmsyn}{\mytmsyn}{\mytmsyn} \mysynsep
-    %                  \mycohh{\mytmsyn}{\mytmsyn}{\mytmsyn}{\mytmsyn} \\
-    % \myprsyn & ::= & \cdots \mysynsep \myjm{\mytmsyn}{\mytmsyn}{\mytmsyn}{\mytmsyn} \\
-
-% \mynegder
-
-% \mydesc{typing:}{\myctx \vdash \mytmsyn \Leftrightarrow \mytmsyn}{
-%   \small
-%   \begin{tabular}{cc}
-%     \AxiomC{$\myjud{\myse{P}}{\myprdec{\mytya \myeq \mytyb}}$}
-%     \AxiomC{$\myjud{\mytmt}{\mytya}$}
-%     \BinaryInfC{$\myinf{\mycoee{\mytya}{\mytyb}{\myse{P}}{\mytmt}}{\mytyb}$}
-%     \DisplayProof
-%     &
-%     \AxiomC{$\myjud{\myse{P}}{\myprdec{\mytya \myeq \mytyb}}$}
-%     \AxiomC{$\myjud{\mytmt}{\mytya}$}
-%     \BinaryInfC{$\myinf{\mycohh{\mytya}{\mytyb}{\myse{P}}{\mytmt}}{\myprdec{\myjm{\mytmt}{\mytya}{\mycoee{\mytya}{\mytyb}{\myse{P}}{\mytmt}}{\mytyb}}}$}
-%     \DisplayProof
-%   \end{tabular}
-% }
+\mynegder
+
+\mydesc{typing:}{\myctx \vdash \mytmsyn \Leftrightarrow \mytmsyn}{
+  \small
+  \begin{tabular}{cc}
+    \AxiomC{$\mychk{\myse{P}}{\myprdec{\mytya \myeq \mytyb}}$}
+    \AxiomC{$\mychk{\mytmt}{\mytya}$}
+    \BinaryInfC{$\myinf{\mycoee{\mytya}{\mytyb}{\myse{P}}{\mytmt}}{\mytyb}$}
+    \DisplayProof
+    &
+    \AxiomC{$\mychk{\myse{P}}{\myprdec{\mytya \myeq \mytyb}}$}
+    \AxiomC{$\mychk{\mytmt}{\mytya}$}
+    \BinaryInfC{$\myinf{\mycohh{\mytya}{\mytyb}{\myse{P}}{\mytmt}}{\myprdec{\myjm{\mytmt}{\mytya}{\mycoee{\mytya}{\mytyb}{\myse{P}}{\mytmt}}{\mytyb}}}$}
+    \DisplayProof
+  \end{tabular}
+}
  
  \mynegder
  
@@ -3567,7 +3728,7 @@ as equal.
  }
  
  \mynegder
-% TODO add syntax and types for coe and coh
+
  \mydesc{equality reduction:}{\myctx \vdash \myprsyn \myred \myprsyn}{
    \small
      \begin{tabular}{cc}
@@ -3767,12 +3928,9 @@ annoyance lies:
  \end{array}
  \]
  
-
-% TODO finish
-
  \subsubsection{$\myprop$ and the hierarchy}
  
-We shall have, at earch universe level, not only a $\mytyp_l$ but also a
+We shall have, at each universe level, not only a $\mytyp_l$ but also a
  $\myprop_l$.  Where will propositions be placed in the type hierarchy?  The
  main indicator is the decoding operator, since it converts into things
  that already live in the hierarchy.  For example, if we have
@@ -3781,7 +3939,7 @@ that already live in the hierarchy.  For example, if we have
    \mytop \myand ((\myb{x}\, \myb{y} : \mynat) \myarr \mytop \myarr \mytop)
  \]
  we had better make sure that the `to be decoded' is at a level compatible
-(read: larger) with its reduction.  In the example above, we'll have
+(read: larger) with its reduction.  In the example above, we will have
  that proposition to be at least as large as the type of $\mynat$, since
  the reduced proof will abstract over it.  Pretending that we had
  explicit, non cumulative levels, it would be tempting to have
@@ -3857,9 +4015,9 @@ would not hold.  Consider for instance
  which reduces to
  \[\myjm{\mynat}{\mytyp_0}{\mybool}{\mytyp_0} : \myprop_0 \]
  We need members of $\myprop_0$ to be members of $\myprop_1$ too, which
-will be the case with cumulativity.  This is not the most elegant of
-systems, but it buys us a cheap type level equality without having to
-replicate functionality with a dedicated construct.
+will be the case with cumulativity.  This buys us a cheap type level
+equality without having to replicate functionality with a dedicated
+construct.
  
  \subsubsection{Quotation and definitional equality}
  \label{sec:kant-irr}
@@ -3875,7 +4033,8 @@ We want to:
  \item As a consequence of the previous point, identify all records with
  no projections as equal, since they will have only one element.
  
-\item Identify all members of types with no elements as equal.
+\item Identify all members of types with no constructors (and thus no
+  elements) as equal.
  
  \item Identify all equivalent proofs as equal---with `equivalent proof'
  we mean those proving the same propositions.
@@ -3884,22 +4043,29 @@ we mean those proving the same propositions.
  \end{itemize}
  Towards these goals and following the intuition behind bidirectional
  type checking we define two mutually recursive functions, one quoting
-canonical terms against their types (since we need the type to typecheck
+canonical terms against their types (since we need the type to type check
  canonical terms), one quoting neutral terms while recovering their
-types.  The full procedure for quotation is shown in Figure
-\ref{fig:kant-quot}. We $\boxed{\text{box}}$ the neutral proofs and
+types.
+\begin{mydef}[Quotation for \mykant]
+The full procedure for quotation is shown in Figure
+\ref{fig:kant-quot}.
+\end{mydef}
+We $\boxed{\text{box}}$ the neutral proofs and
  neutral members of empty types, following the notation in
  \cite{Altenkirch2007}, and we make use of $\mydefeq_{\mybox}$ which
  compares terms syntactically up to $\alpha$-renaming, but also up to
  equivalent proofs: we consider all boxed content as equal.
  
  Our quotation will work on normalised terms, so that all defined values
-will have been replaced.  Moreover, we match on datatype eliminators and
-all their arguments, so that $\mynat.\myfun{elim} \myappsp \mytmm
+will have been replaced.  Moreover, we match on data type eliminators
+and all their arguments, so that $\mynat.\myfun{elim} \myappsp \mytmm
  \myappsp \myse{P} \myappsp \vec{\mytmn}$ will stand for
  $\mynat.\myfun{elim}$ applied to the scrutinised $\mynat$, the
  predicate, and the two cases.  This measure can be easily implemented by
  checking the head of applications and `consuming' the needed terms.
+Thus, we gain proof irrelevance, and not only for a more useful
+definitional equality, but also for example to eliminate all
+propositional content when compiling.
  
  \begin{figure}[t]
    \mydesc{canonical quotation:}{\mycanquot(\myctx, \mytmsyn : \mytmsyn) \mymetagoes \mytmsyn}{
@@ -3907,7 +4073,8 @@ checking the head of applications and `consuming' the needed terms.
      $
      \begin{array}{@{}l@{}l}
        \mycanquot(\myctx,\ \mytmt : \mytyc{D} \myappsp \vec{A} &) \mymetaguard \mymeta{empty}(\myctx, \mytyc{D}) \mymetagoes \boxed{\mytmt} \\
-      \mycanquot(\myctx,\ \mytmt : \mytyc{D} \myappsp \vec{A} &) \mymetaguard \mymeta{record}(\myctx, \mytyc{D}) \mymetagoes  \mytyc{D}.\mydc{constr} \myappsp \cdots \myappsp \mycanquot(\myctx, \mytyc{D}.\myfun{f}_n : (\myctx(\mytyc{D}.\myfun{f}_n))(\vec{A};\mytmt)) \\
+      \mycanquot(\myctx,\ \mytmt : \mytyc{D} \myappsp \vec{A} &) \mymetaguard \mymeta{record}(\myctx, \mytyc{D}) \mymetagoes 
+     \mytyc{D}.\mydc{constr} \myappsp \cdots \myappsp \mycanquot(\myctx, \mytyc{D}.\myfun{f}_n : (\myctx(\mytyc{D}.\myfun{f}_n))(\vec{A};\mytmt)) \\
        \mycanquot(\myctx,\ \mytyc{D}.\mydc{c} \myappsp \vec{t} : \mytyc{D} \myappsp \vec{A} &) \mymetagoes \cdots \\
        \mycanquot(\myctx,\ \myse{f} : \myfora{\myb{x}}{\mytya}{\mytyb} &) \mymetagoes \myabs{\myb{x}}{\mycanquot(\myctx; \myb{x} : \mytya, \myapp{\myse{f}}{\myb{x}} : \mytyb)} \\
        \mycanquot(\myctx,\ \myse{p} : \myprdec{\myse{P}} &) \mymetagoes \boxed{\myse{p}}
@@ -3928,7 +4095,8 @@ checking the head of applications and `consuming' the needed terms.
        \myneuquot(\myctx,\ \myfora{\myb{x}}{\mytya}{\mytyb} & ) \mymetagoes
         \myfora{\myb{x}}{\myneuquot(\myctx, \mytya)}{\myneuquot(\myctx; \myb{x} : \mytya, \mytyb)} : \mytyp \\
        \myneuquot(\myctx,\ \mytyc{D} \myappsp \vec{A} &) \mymetagoes \mytyc{D} \myappsp \cdots \mycanquot(\myctx, \mymeta{head}((\myctx(\mytyc{D}))(\mytya_1 \cdots \mytya_{n-1}))) : \mytyp \\
-      \myneuquot(\myctx,\ \myprdec{\myjm{\mytmm}{\mytya}{\mytmn}{\mytyb}} &) \mymetagoes \myprdec{\myjm{\mycanquot(\myctx, \mytmm : \mytya)}{\mytya'}{\mycanquot(\myctx, \mytmn : \mytyb)}{\mytyb'}} : \mytyp \\
+      \myneuquot(\myctx,\ \myprdec{\myjm{\mytmm}{\mytya}{\mytmn}{\mytyb}} &) \mymetagoes \\
+      \multicolumn{2}{l}{\myind{2}\myprdec{\myjm{\mycanquot(\myctx, \mytmm : \mytya)}{\mytya'}{\mycanquot(\myctx, \mytmn : \mytyb)}{\mytyb'}} : \mytyp} \\
        \multicolumn{2}{@{}l}{\myind{2}\text{\textbf{where}}\ \mytya' : \myarg = \myneuquot(\myctx, \mytya)} \\
        \multicolumn{2}{@{}l}{\myind{2}\phantom{\text{\textbf{where}}}\ \mytyb' : \myarg = \myneuquot(\myctx, \mytyb)} \\
        \myneuquot(\myctx,\ \mytyc{D}.\myfun{f} \myappsp \mytmt &) \mymetaguard \mymeta{record}(\myctx, \mytyc{D}) \mymetagoes \mytyc{D}.\myfun{f} \myappsp \mytmt' : (\myctx(\mytyc{D}.\myfun{f}))(\vec{A};\mytmt) \\
@@ -3960,7 +4128,7 @@ automatically, and in fact in some sense we already do during equality
  reduction and quotation.  However, this has the considerable
  disadvantage that we can never identify abstracted
  variables\footnote{And in general neutral terms, although we currently
-  don't have neutral propositions apart from equalities on neutral
+  do not have neutral propositions apart from equalities on neutral
    terms.} of type $\mytyp$ as $\myprop$, thus forbidding the user to
  talk about $\myprop$ explicitly.
  
@@ -3974,35 +4142,41 @@ type theories \citep{Jacobs1994}.
  \section{\mykant : the practice}
  \label{sec:kant-practice}
  
+\epigraph{\emph{It's alive!}}{Henry Frankenstein}
+
  The codebase consists of around 2500 lines of Haskell,\footnote{The full
    source code is available under the GPL3 license at
    \url{https://github.com/bitonic/kant}.  `Kant' was a previous
    incarnation of the software, and the name remained.} as reported by
-the \texttt{cloc} utility.  The high level design is inspired by the
-work on various incarnations of Epigram, and specifically by the first
-version as described \citep{McBride2004}.
-
-The author learnt the hard way the implementation challenges for such a
-project, and ran out of time while implementing observational equality.
-While the constructs and typing rules are present, the machinery to make
-it happen (equality reduction, coercions, quotation, etc.) is not
-present yet.  Everything else presented is implemented and working
-reasonably well, and given the detailed plan in the previous section,
-finishing off should not prove too much work.
+the \texttt{cloc} utility.
+
+We implement the type theory as described in Section
+\ref{sec:kant-theory}.  The author learnt the hard way the
+implementation challenges for such a project, and ran out of time while
+implementing observational equality.  While the constructs and typing
+rules are present, the machinery to make it happen (equality reduction,
+coercions, quotation, etc.) is not present yet.
+
+This considered, everything else presented in Section
+\ref{sec:kant-theory} is implemented and working well---and in fact all
+the examples presented in this thesis, apart from the ones that are
+equality related, have been encoded in \mykant\ in the Appendix.
+Moreover, given the detailed plan in the previous section, finishing off
+should not prove too much work.
  
  The interaction with the user takes place in a loop living in and
  updating a context of \mykant\ declarations, which presents itself as in
  Figure \ref{fig:kant-web}.  Files with lists of declarations can also be
-loaded. The REPL is a available both as a commandline application and in
+loaded.  The REPL is available both as a command-line application and in
  a web interface, which is available at \url{bertus.mazzo.li}.
  
-A REPL cycle starts with the user inputing a \mykant\
+A REPL cycle starts with the user inputting a \mykant\
  declaration or another REPL command, which then goes through various
  stages that can end up in a context update, or in failures of various
  kinds.  The process is described diagrammatically in Figure
  \ref{fig:kant-process}.
  
-\begin{figure}[t]
+\begin{figure}[b!]
  {\small\begin{Verbatim}[frame=leftline,xleftmargin=3cm]
  B E R T U S
  Version 0.0, made in London, year 2013.
@@ -4047,22 +4221,22 @@ Type: Nat
  \item[Reference] Occurrences of $\mytyp$ get decorated by a unique reference,
    which is necessary to implement the type hierarchy check.
  
-\item[Elaborate] Converts the declaration to some context items, which
-  might be a value declaration (type and body) or a data type
-  declaration (constructors and destructors).  This phase works in
-  tandem with \textbf{Type checking}, which in turns needs to
-  \textbf{Evaluate} terms.
+\item[Elaborate/Typecheck/Evaluate] \textbf{Elaboration} converts the
+  declaration to some context items, which might be a value declaration
+  (type and body) or a data type declaration (constructors and
+  destructors).  This phase works in tandem with \textbf{Type checking},
+  which in turn needs to \textbf{Evaluate} terms.
  
  \item[Distill] and report the result.  `Distilling' refers to the
-  process of converting a core term back to a sugared version that the
-  user can visualise.  This can be necessary both to display errors
+  process of converting a core term back to a sugared version that we
+  can show to the user.  This can be necessary both to display errors
    including terms and to display the results of evaluations or type checking
    that the user has requested.  Among the other things in this stage we
    go from nameless back to names by recycling the names that the user
    used originally, as to fabricate a term which is as close as possible
    to what it originated from.
  
-\item[Pretty print] Format the terms in a nice way, and display the result to
+\item[Pretty print] Format the terms in a nice way, and display them to
    the user.
  
  \end{description}
@@ -4123,12 +4297,13 @@ theorem prover.
  \subsection{Syntax}
  
  The syntax of \mykant\ is presented in Figure \ref{fig:syntax}.
-Examples showing the usage of most of the constructs are present in
-Appendices \ref{app:kant-itt}, \ref{app:kant-examples}, and
-\ref{app:hurkens}; plus a tutorial in Section \ref{sec:type-holes}.  The
-syntax has grown organically with the needs of the language, and thus it
-is not very sophisticated, being specified in and processed by a parser
-generated with the \texttt{happy} parser generated for Haskell.
+Examples showing the usage of most of the constructs---excluding the
+OTT-related ones---are present in Appendices \ref{app:kant-itt},
+\ref{app:kant-examples}, and \ref{app:hurkens}; plus a tutorial in
+Section \ref{sec:type-holes}.  The syntax has grown organically with the
+needs of the language, and thus is not very sophisticated.  The grammar
+is specified in and processed by the \texttt{happy} parser generator for
+Haskell.\footnote{Available at \url{http://www.haskell.org/happy}.}
  Tokenisation is performed by a simple hand written lexer.
  
  \begin{figure}[p]
@@ -4212,18 +4387,18 @@ variables, and thus substituting:
    \ref{sec:untyped}.  The problem is that avoiding name capturing is a
    nightmare, both in the sense that it is not performant---given that we
    need to rename and substitute each time we `enter' a binder---but
-  most importantly given the fact that in even more slightly complicated
+  most importantly given the fact that in even slightly more complicated
    systems it is very hard to get right, even for experts.
  
-  One of the sore spots of explicit names is comparing terms up
+  One of the sore spots of explicit names is comparing terms up to
    $\alpha$-renaming, which again generates a huge amount of
-  substitutions and requires special care.  We can represent the
-  relationship between variables and their binders, by...
+  substitutions and requires special care.  
  
-\item[Nameless] ...getting rid of names altogether, and representing
+\item[Nameless] We can capture the relationship between variables and
+  their binders by getting rid of names altogether, and representing
    bound variables with an index referring to the `binding' structure, a
-  notion introduced by \cite{de1972lambda}.  Classically $0$ represents
-  the variable bound by the innermost binding structure, $1$ the
+  notion introduced by \cite{de1972lambda}.  Usually $0$ represents the
+  variable bound by the innermost binding structure, $1$ the
    second-innermost, and so on.  For instance with simple abstractions we
    might have
    \[
@@ -4237,29 +4412,30 @@ variables, and thus substituting:
    usability,\footnote{With some people going as far as defining it akin
    to an inverse Turing test.} it is much more convenient as an
    internal representation to deal with terms mechanically---at least in
-  simple cases.  Moreover, $\alpha$ renaming ceases to be an issue, and
+  simple cases.  $\alpha$-renaming ceases to be an issue, and
    term comparison is purely syntactical.
  
-  Nonetheless, more complex, constructs such as pattern matching put
+  Nonetheless, more complex constructs such as pattern matching put
    some strain on the indices and many systems end up using explicit
-  names anyway (Agda, Coq, \dots).
+  names anyway.
  
  \end{description}
  
  In the past decade or so advancements in Haskell's type system and
-in general the spread new programming practices have enabled to make the
-second option much more amenable.  \mykant\ thus takes the second path
+in general the spread of new programming practices have made the nameless
+option much more amenable.  \mykant\ thus takes the nameless path
  through the use of Edward Kmett's excellent \texttt{bound}
  library.\footnote{Available at
-\url{http://hackage.haskell.org/package/bound}.}  We decribe its
-advantages but also pitfalls in the previously relatively unknown
-territory of dependent types---\texttt{bound} being created mostly to
-handle more simply typed systems.
-
-\texttt{bound} builds on the work of \cite{Bird1999}, who suggest to
-parametrising the term type over the type of the variables, and `nest'
-the type each time we enter a scope.  If we wanted to define a term for
-the untyped $\lambda$-calculus, we might have
+  \url{http://hackage.haskell.org/package/bound}.}  We describe the
+advantages of \texttt{bound}'s approach, but also its pitfalls in the
+previously relatively unknown territory of dependent
+types---\texttt{bound} being created mostly to handle more simply typed
+systems.
+
+  \texttt{bound} builds on the work of \cite{Bird1999}, who suggested
+  parametrising the term type over the type of the variables, and `nesting'
+  the type each time we enter a scope.  If we wanted to define a term
+  for the untyped $\lambda$-calculus, we might have
  \begin{Verbatim}
  -- A type with no members.
  data Empty
@@ -4285,14 +4461,14 @@ can be represented as
  -- Empty))'.
  Lam (Lam (V (Free Bound)))
  \end{Verbatim}
-This allows us to reflect the of a type `nestedness' at the type level,
+This allows us to reflect the `nestedness' of a type at the type level,
  and since we usually work with functions polymorphic on the parameter
  \texttt{v} it's very hard to make mistakes by putting terms of the wrong
-nestedness where they don't belong.
+nestedness where they do not belong.
  
  Even more interestingly, the substitution operation is perfectly
  captured by the \verb|>>=| (bind) operator of the \texttt{Monad}
-typeclass:
+type class:
  \begin{Verbatim}
  class Monad m where
    return :: m a
@@ -4327,12 +4503,11 @@ subst v m n = n >>= \v' -> if v == v' then m else return v'
  
  -- Replace the variable bound by `s' with term `t'.
  inst :: Monad m => m v -> m (Var v) -> m v
-inst t s = do v <- s
-              case v of
-                Bound   -> t
-                Free v' -> return v'
+inst t s = s >>= \v -> case v of
+                           Bound   -> t
+                           Free v' -> return v'
  \end{Verbatim}
-The beauty of this technique is that in a few simple function we have
+The beauty of this technique is that with a few simple functions we have
  defined all the core operations in a general and `obviously correct'
  way, with the extra confidence of having the type checker watching our
  back.  For what concerns term equality, we can just ask the Haskell
@@ -4340,9 +4515,9 @@ compiler to derive the instance for the \verb|Eq| type class and since
  we are nameless that will be enough (modulo fancy quotation).
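
  The pieces above can be assembled into a small self-contained sketch.
  It is only an illustration and not the code used in \mykant: we write
  the \texttt{Monad} instance by hand, using a hypothetical local helper
  \texttt{bump} that leaves the newly bound variable alone and weakens
  the substituted terms with \texttt{fmap Free}, and we assume that the
  derived \texttt{Eq} instance is all we need to compare terms.
  \begin{Verbatim}
  import Control.Monad (ap)

  -- A variable is either bound by the nearest binder, or free.
  data Var v = Bound | Free v
    deriving (Eq, Show)

  -- Terms of the untyped lambda calculus, nested at each binder.
  data Tm v
    = V v
    | App (Tm v) (Tm v)
    | Lam (Tm (Var v))
    deriving (Eq, Show)

  instance Functor Var where
    fmap _ Bound    = Bound
    fmap f (Free v) = Free (f v)

  instance Functor Tm where
    fmap f (V v)     = V (f v)
    fmap f (App m n) = App (fmap f m) (fmap f n)
    fmap f (Lam s)   = Lam (fmap (fmap f) s)

  instance Applicative Tm where
    pure  = V
    (<*>) = ap

  -- Bind is substitution: replace every variable with a term.
  instance Monad Tm where
    V v     >>= f = f v
    App m n >>= f = App (m >>= f) (n >>= f)
    Lam s   >>= f = Lam (s >>= bump)
      where
        -- Under a binder the bound variable is untouched, while
        -- substituted terms are weakened by one level.
        bump Bound    = V Bound
        bump (Free v) = fmap Free (f v)

  -- Replace the variable bound by `s' with term `t'.
  inst :: Tm v -> Tm (Var v) -> Tm v
  inst t s = s >>= \v -> case v of
    Bound   -> t
    Free v' -> return v'
  \end{Verbatim}
  With these definitions, instantiating the body
  \verb|App (V Bound) (V (Free "y"))| with \verb|V "z"| should yield
  \verb|App (V "z") (V "y")|, and the result can be compared against the
  expected term using the derived \verb|==|.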
  
  Moreover, if we take the top level term type to be \verb|Tm String|, we
-get for free a representation of terms with top-level, definitions;
-where closed terms contain only \verb|String| references to said
-definitions---see also \cite{McBride2004b}.
+get a representation of terms with top-level definitions, where closed
+terms contain only \verb|String| references to said definitions---see
+also \cite{McBride2004b}.
  
  What are then the pitfalls of this seemingly invincible technique?  The
  most obvious impediment is the need for polymorphic recursion.
@@ -4367,7 +4542,7 @@ telescopes, which are a central tool to deal with contexts and other
  constructs.  In Haskell we can give them a faithful representation
  with a data type along the lines of
  \begin{Verbatim}
-data Tele m v = End (m v) | Bind (m v) (Tele (Var v))
+data Tele m v = Empty (m v) | Bind (m v) (Tele m (Var v))
  type TeleTm = Tele Tm
  \end{Verbatim}
  The problem here is that what we want to substitute for variables in
@@ -4388,9 +4563,9 @@ Appendix \ref{app:termrep}.  The fact that propositions cannot be
  factored out in another data type is an instance of the problem
  described above.  However the real pain is during elaboration, where we
  are forced to treat everything as a type while we would much rather have
-telescopes.  Future work would include writing a library that marries a
-nice interface similar to the one of \verb|bound| with a more flexible
-interface.
+telescopes.  Future work would include writing a library that marries
+more flexibility with a nice interface similar to the one of
+\verb|bound|.
  
  We also make use of a `forgetful' data type (as provided by
  \verb|bound|) to store user-provided variables names along with the
@@ -4404,17 +4579,17 @@ originally used.
  Another source of contention related to term representation is dealing
  with evaluation.  Here \mykant\ does not make bold moves, and simply
  employs substitution.  When type checking we match types by reducing
-them to their wheak head normal form, as to avoid unnecessary evaluation.
+them to their weak head normal form, as to avoid unnecessary evaluation.
  
  We treat data type eliminators and record projections in a uniform
  way, by elaborating declarations in a series of \emph{rewriting rules}:
  \begin{Verbatim}
  type Rewr =
      forall v.
-    TmRef v   -> -- Term to which the destructor is applied
-    [TmRef v] -> -- List of other arguments
+    Tm v   ->    -- Term to which the destructor is applied
+    [Tm v] ->    -- List of other arguments
      -- The result of the rewriting, if the eliminator reduces.
-    Maybe [TmRef v]
+    Maybe [Tm v]
  \end{Verbatim}
  A rewriting rule is polymorphic in the variable type, guaranteeing that
  it just pattern matches on the structure of terms and rearranges them in some
@@ -4423,15 +4598,109 @@ reducing a series of applications we match the first term and check if
  it is a destructor, and if that's the case we apply the reduction rule
  and reduce further if it yields a new list of terms.
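
  As a rough sketch of how such rules could drive reduction, consider
  the following.  Everything except the shape of \texttt{Rewr} is an
  assumption made for illustration: the simplified \texttt{Tm} type
  with \texttt{Con} and \texttt{Destr}, the \texttt{Rule} wrapper, the
  \texttt{fstRule} projection rule, the \texttt{rules} lookup table and
  \texttt{reduceSpine} are all hypothetical and do not mirror the actual
  codebase.
  \begin{Verbatim}
  {-# LANGUAGE RankNTypes #-}

  -- Hypothetical, simplified term type.
  data Tm v
    = V v
    | Con String [Tm v]  -- A data constructor with its arguments
    | Destr String       -- A destructor (eliminator or projection)
    deriving (Eq, Show)

  type Rewr =
      forall v.
      Tm v   ->    -- Term to which the destructor is applied
      [Tm v] ->    -- List of other arguments
      -- The result of the rewriting, if the eliminator reduces.
      Maybe [Tm v]

  -- Wrapper so that polymorphic rules can be stored and looked up.
  newtype Rule = Rule Rewr

  -- Hypothetical rule for the first projection of a pair.
  fstRule :: Rewr
  fstRule (Con "pair" [a, _]) rest = Just (a : rest)
  fstRule _                   _    = Nothing

  -- Hypothetical table mapping destructor names to their rules.
  rules :: String -> Maybe Rule
  rules "fst" = Just (Rule fstRule)
  rules _     = Nothing

  -- Reduce a series of applications, given as a list of terms: if
  -- the head is a destructor whose rule fires, continue with the
  -- resulting list, otherwise return the list unchanged.
  reduceSpine :: (String -> Maybe Rule) -> [Tm v] -> [Tm v]
  reduceSpine table (Destr d : t : args)
    | Just (Rule rewr) <- table d
    , Just ts          <- rewr t args
    = reduceSpine table ts
  reduceSpine _ ts = ts
  \end{Verbatim}
  Since a rule is polymorphic in \texttt{v} it cannot inspect variables,
  so all it can do is rearrange the terms it is given.  A stuck
  application, such as a projection applied to a variable, is returned
  as it is.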
  
-This has the advantage of being very simple, but has the disadvantage of
-being quite poor in terms of performance and that we need to do
-quotation manually.  An alternative that solves both of these is the
-already mentioned \emph{normalization by evaluation}, where we would
-compute by turning terms into Haskell values, and then reify back to
-terms to compare them---a useful tutorial on this technique is given by
-\cite{Loh2010}.
+This has the advantage of simplicity, at the expense of poor
+performance and of having to do quotation manually.  An
+alternative that solves both of these is the already mentioned
+\emph{normalization by evaluation}, where we would compute by turning
+terms into Haskell values, and then reify back to terms to compare
+them---a useful tutorial on this technique is given by \cite{Loh2010}.
+
+However, this alternative has its disadvantages.  The most obvious one is that
+it is less simple: we need to set up some infrastructure to handle the
+quotation and reification, while with substitution we have a uniform
+representation through the process of type checking.  The second is that
+performance advantages can be rendered less effective by the continuous
+quoting and reifying, although this can probably be mitigated with some
+heuristics.
+
+\subsubsection{Parametrise everything!}
+\label{sec:parame}
+
+Throughout the life of a REPL cycle we need to execute two broad
+`effectful' actions:
+\begin{itemize}
+\item Retrieve, add, and modify elements in an environment.  The
+  environment will contain not only types, but also the rewriting rules
+  presented in the previous section, and a counter to generate fresh
+  references for the type hierarchy.
  
-\subsection{Turning constraints into graphs}
+\item Throw various kinds of errors when something goes wrong: parsing,
+  type checking, input/output error when reading files, and more.
+\end{itemize}
+Haskell taught us the value of monads in programming languages, and in
+\mykant\ we keep this lesson in mind.  All of the plumbing required to do
+the two actions above is provided by a very general \emph{monad
+  transformer} that we use throughout the codebase, \texttt{KMonadT}:
+\begin{Verbatim}
+newtype KMonadT f v m a = KMonadT (StateT (f v) (ErrorT KError m) a)
+
+data KError
+    = OutOfBounds Id
+    | DuplicateName Id
+    | IOError IOError
+    | ...
+\end{Verbatim}
+Without delving into the details of what a monad transformer
+is,\footnote{See
+  \url{https://en.wikibooks.org/wiki/Haskell/Monad_transformers}.} this
+is what \texttt{KMonadT} works with and provides:
+\begin{itemize}
+\item The \verb|v| parameter represents the parametrised variable for
+  the term type that we spoke about at the beginning of this section.
+  More on this later.
+
+\item The \verb|f| parameter indicates what kind of environment we are
+  holding.  Sometimes we want to traverse terms without carrying the
+  entire environment, for various reasons---\texttt{KMonadT} lets us do
+  that.  Note that \verb|f| is itself parametrised over \verb|v|.  The
+  inner \verb|StateT| monad transformer lets us retrieve and modify this
+  environment at any time.
+
+\item The \verb|m| is the `inner' monad that we can `plug in' to be able
+  to perform more effectful actions in \texttt{KMonadT}.  For example if we
+  plug the \texttt{IO} monad in, we will be able to do input/output.
+
+\item The inner \verb|ErrorT| lets us throw errors at any time.  The
+  error type is \verb|KError|, which describes all the possible errors
+  that a \mykant\ process can throw.
+
+\item Finally, the \verb|a| parameter represents the return type of the
+  computation we are executing.
+\end{itemize}
+
+The clever trick in \texttt{KMonadT} is to have it parametrised
+over the same type as the term type.  This way, we can easily carry the
+environment while traversing under binders.  For example, if we only
+needed to carry types of bound variables in the environment, we could
+quickly set up the following infrastructure:
+\begin{Verbatim}
+data Tm v = ...
+
+-- A context is a mapping from variables to types.
+newtype Ctx v = Ctx (v -> Tm v)
+
+-- A context monad holds a context.
+type CtxMonad v m = KMonadT Ctx v m
+
+-- Enter into a scope binding a type to the variable, execute a
+-- computation there, and then exit the scope, returning to the `current'
+-- context.
+nestM :: Monad m => Tm v -> CtxMonad (Var v) m a -> CtxMonad v m a
+nestM = ...
+\end{Verbatim}
+Again, the types guard our back guaranteeing that we add a type when we
+enter a scope, and we discharge it when we get out.  The author
+originally started with a more traditional representation and often
+forgot to add the right variable at the right moment.  Using these
+practices it is very difficult to make that mistake---we achieve
+correctness through types.
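
  To make the idea more concrete, here is a minimal sketch of the pure
  core of entering a scope, leaving the monadic plumbing aside.  It is
  an assumption about how such an operation might look, not the
  definition of \texttt{nestM}: the cut-down \texttt{Tm} type and the
  names \texttt{lookupCtx} and \texttt{nestCtx} are hypothetical.
  \begin{Verbatim}
  data Var v = Bound | Free v
    deriving (Eq, Show)

  -- Hypothetical, cut-down term type standing in for the real one.
  data Tm v = V v | App (Tm v) (Tm v)
    deriving (Eq, Show)

  instance Functor Tm where
    fmap f (V v)     = V (f v)
    fmap f (App m n) = App (fmap f m) (fmap f n)

  -- A context is a mapping from variables to types.
  newtype Ctx v = Ctx (v -> Tm v)

  -- Retrieve the type of a variable.
  lookupCtx :: Ctx v -> v -> Tm v
  lookupCtx (Ctx f) = f

  -- Extend a context with the type of a newly bound variable.  All
  -- types, old and new, are weakened with `fmap Free', since they
  -- now live one level deeper.
  nestCtx :: Tm v -> Ctx v -> Ctx (Var v)
  nestCtx ty (Ctx f) = Ctx $ \v -> fmap Free $ case v of
    Bound   -> ty
    Free v' -> f v'
  \end{Verbatim}
  The result type \verb|Ctx (Var v)| forces every lookup in the nested
  scope to account for the new variable, which is the kind of guarantee
  described above.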
+
+In the actual \mykant\ codebase, we have also abstracted the concept of
+`context' further, so that we can easily embed contexts into other
+structures and write generic operations on all context-like
+structures.\footnote{See the \texttt{Kant.Cursor} module for details.}
+
+\subsection{Turning a hierarchy into some graphs}
  \label{sec:hier-impl}
  
  In this section we will explain how to implement the typical ambiguity
@@ -4454,22 +4723,21 @@ equality constraint ($x = y$), which can be reduced to two constraints
  $x \le y$ and $y \le x$.
  
  Given this specification, we have implemented a standalone Haskell
-module---that we plan to release as a standalone library---to
-efficiently store and check the consistency of constraints.  The problem
-predictably reduces to a graph algorithm, and for this reason we also
-implement a library for labelled graphs, since the existing Haskell
-graph libraries fell short in different areas.\footnote{We tried the
-\texttt{Data.Graph} module in
-\url{http://hackage.haskell.org/package/containers}, and the much more
-featureful \texttt{fgl} library
-\url{http://hackage.haskell.org/package/fgl}.}.  The interfaces for
+module---that we plan to release as a library---to efficiently store and
+check the consistency of constraints.  The problem predictably reduces
+to a graph algorithm, and for this reason we also implement a library
+for labelled graphs, since the existing Haskell graph libraries fell
+short in different areas.\footnote{We tried the \texttt{Data.Graph}
+  module in \url{http://hackage.haskell.org/package/containers}, and the
+  much more featureful \texttt{fgl} library
+  \url{http://hackage.haskell.org/package/fgl}.}  The interfaces for
  these modules are shown in Appendix \ref{app:constraint}.  The graph
  library is implemented as a modification of the code described by
  \cite{King1995}.
  
  We represent the set by building a graph where vertices are variables,
  and edges are constraints between them, labelled with the appropriate
-constraint: $x < y$ gives rise to a $<$-labelled edge from $x$ to $y$<
+constraint: $x < y$ gives rise to a $<$-labelled edge from $x$ to $y$,
  and $x \le y$ to a $\le$-labelled edge from $x$ to $y$.  As we add
  constraints, $\le$ constraints are replaced by $<$ constraints, so that
  if we started with an empty set and added
@@ -4554,10 +4822,13 @@ in \ref{fig:graph-one-after}.
    \label{fig:graph-one}
  \end{figure}
  
-Each time we add a new constraint, we check if any strongly connected
-component (SCC) arises, a SCC being a subset $V$ of vertices where for
-each $v_1,v_2 \in V$ there is a path from $v_1$ to $v_2$.  The SCCs in
-the graph for the constraints above is shown in Figure
+\begin{mydef}[Strongly connected component]
+  A \emph{strongly connected component} in a graph with vertices $V$ is
+  a subset of $V$, say $V'$, such that for each $(v_1,v_2) \in V' \times
+  V'$ there is a path from $v_1$ to $v_2$.
+\end{mydef}
+
+The SCCs in the graph for the constraints above are shown in Figure
  \ref{fig:graph-one-scc}.  If we have a strongly connected component with
  a $<$ edge---say $x < y$---in it, we have an inconsistency, since there
  must also be a path from $y$ to $x$, and by transitivity it must either
@@ -4569,21 +4840,194 @@ of said SCC are equal, since for every $x \le y$ we have a path from $y$
  to $x$, which again by transitivity means that $y \le x$.  Thus, we can
  \emph{condense} the SCC to a single vertex, by choosing a variable among
  the SCC as a representative for all the others.  This can be done
-efficiently with disjoint set data structure.
+efficiently with a disjoint set data structure, and is crucial to keep the
+graph compact, given the very large number of constraints generated when
+type checking.
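+
+The consistency criterion can be sketched as follows.  This is only an
+illustration under simplifying assumptions: it checks a whole list of
+constraints in one go instead of incrementally, it uses the stock
+\texttt{Data.Graph} module from \texttt{containers} rather than our
+labelled graph library, and the names \texttt{Rel}, \texttt{Constraint}
+and \texttt{consistent} are invented for the example:
+\begin{Verbatim}
+import           Data.Graph (flattenSCC, stronglyConnComp)
+import qualified Data.Map   as M
+
+data Rel = Lt | Le deriving (Eq, Show)
+
+-- (x, r, y) stands for the constraint `x r y'.
+type Constraint = (String, Rel, String)
+
+-- A set of constraints is consistent if no `<' edge has both of its
+-- endpoints in the same strongly connected component.
+consistent :: [Constraint] -> Bool
+consistent cs = not (any strictInsideSCC cs)
+  where
+    -- Adjacency lists; every variable gets an entry, even sinks.
+    adj  = M.fromListWith (++)
+             (concat [[(x, [y]), (y, [])] | (x, _, y) <- cs])
+    sccs = stronglyConnComp [(v, v, ys) | (v, ys) <- M.toList adj]
+    -- Map each variable to the index of its component.
+    comp = M.fromList
+             [ (v, i)
+             | (i, scc) <- zip [0 :: Int ..] sccs, v <- flattenSCC scc ]
+    strictInsideSCC (x, Lt, y) = M.lookup x comp == M.lookup y comp
+    strictInsideSCC _          = False
+\end{Verbatim}
+For instance, \texttt{consistent [("x", Le, "y"), ("y", Le, "x")]} holds,
+since the two variables are simply equal, while replacing the first
+constraint with \texttt{("x", Lt, "y")} makes the set inconsistent.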
+
+\subsection{(Web) REPL}
+
+Finally, we take a break from the types by giving a brief account of the
+design of our REPL, being a good example of modular design using various
+constructs dear to the Haskell programmer.
+
+Keeping in mind the \texttt{KMonadT} monad described in Section
+\ref{sec:parame}, the REPL is represented as a function in
+\texttt{KMonadT} consuming input and hopefully producing output.  Then,
+frontends can very easily be written by marshalling data in and out of the
+REPL:
+\begin{Verbatim}
+data Input
+    = ITyCheck String           -- Type check a term
+    | IEval String              -- Evaluate a term
+    | IDecl String              -- Declare something
+    | ...
+
+data Output
+    = OTyCheck TmRefId [HoleCtx] -- Type checked term, with holes
+    | OPretty TmRefId            -- Term to pretty print, after evaluation
+      -- Just holes, classically after loading a file
+    | OHoles [HoleCtx]
+    | ... 
+    
+-- KMonadT is parametrised over the type of the variables, which depends
+-- on how deep in the term structure we are.  For the REPL, we only deal
+-- with top-level terms, and thus only `Id' variables---top level names.
+type REPL m = KMonadT Id m
+
+repl :: ReadFile m => Input -> REPL m Output
+repl = ...
+\end{Verbatim}
+The \texttt{ReadFile} type class embodies the only `extra' action that we
+need to have access to when running the REPL: reading files.  We could
+simply use the \texttt{IO} monad, but this will not serve us well when
+implementing a front end facing untrusted parties accessing the application
+running on our servers.  In our case we expose the REPL as a web
+application, and we want the user to be able to load only from a
+pre-defined directory, not from the entire file system.
+
+For this reason we specify \texttt{ReadFile} to have just one function:
+\begin{Verbatim}
+class Monad m => ReadFile m where
+    readFile' :: FilePath -> m (Either IOError String)
+\end{Verbatim}
+While in the command-line application we will use the \texttt{IO} monad
+and have \texttt{readFile'} work in the `obvious' way---by reading
+the file corresponding to the given file path---in the web prompt we
+will have it accept only a file name, not a path, and read it from a
+pre-defined directory:
+\begin{Verbatim}
+-- The monad that will run the web REPL.  The `ReaderT' holds the
+-- filepath to the directory where the files loadable by the user live.
+-- The underlying `IO' monad will be used to actually read the files.
+newtype DirRead a = DirRead (ReaderT FilePath IO a)
+
+instance ReadFile DirRead where
+    readFile' fp =
+        do -- We get the base directory in the `ReaderT' with `ask'
+           dir <- DirRead ask
+           -- Is the filepath provided an unqualified file name?
+           if snd (splitFileName fp) == fp
+              -- If yes, go ahead and read the file, by lifting
+              -- `readFile'' into the IO monad
+              then DirRead (lift (readFile' (dir </> fp)))
+              -- If not, return an error
+              else return (Left (strMsg ("Invalid file name `" ++ fp ++ "'")))
+\end{Verbatim}
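+For comparison, the `obvious' instance for the command-line application
+might look like the following sketch, which repeats the class
+declaration to be self-contained.  Using \texttt{try} from
+\texttt{Control.Exception} is an assumption about how the error is
+captured, not necessarily what the \mykant\ codebase does:
+\begin{Verbatim}
+import Control.Exception (try)
+
+class Monad m => ReadFile m where
+    readFile' :: FilePath -> m (Either IOError String)
+
+-- Read the file at the given path, turning an `IOError' raised while
+-- opening it into a `Left'.  Note that `readFile' is lazy, so errors
+-- occurring later, while the contents are consumed, are not caught.
+instance ReadFile IO where
+    readFile' fp = try (readFile fp)
+\end{Verbatim}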
+Once this light-weight infrastructure is in place, adding a web
+interface was an easy exercise.  We use Jasper Van der Jeugt's
+\texttt{websockets} library\footnote{Available at
+  \url{http://hackage.haskell.org/package/websockets}.} to create a
+proxy that receives \texttt{JSON}\footnote{\texttt{JSON} is a popular data interchange
+  format, see \url{http://json.org} for more info.}  messages with the
+user input, turns them into \texttt{Input} messages for the REPL, and
+then sends back a \texttt{JSON} message with the response.  Moreover, each client
+is handled in a separate thread, so crashes of the REPL for a certain
+client will not bring the whole application down.
+
+On the frontend side, we had to write some JavaScript to accept input
+from a form, and to make the responses appear on the screen.  The web
+prompt is publicly available at \url{http://bertus.mazzo.li}, and a sample
+session is shown in Figure \ref{fig:web-prompt-one}.
+
+\begin{figure}[t]
+  \includegraphics[width=\textwidth]{web-prompt.png}
+  \caption{A sample run of the web prompt.}
+  \label{fig:web-prompt-one}
+\end{figure}
  
-\subsection{Tooling}
  
-\subsubsection{A type holes tutorial}
+
+\section{Evaluation}
+\label{sec:evaluation}
+
+Going back to our goals in Section \ref{sec:contributions}, we feel that
+this thesis fills a gap in the description of observational type theory.
+In the design of \mykant\ we willingly patterned the core features
+against the ones present in Agda, with the hope that future implementors
+will be able to refer to this document without embarking on the same
+adventure themselves.  We gave an original account of heterogeneous
+equality by showing that in a cumulative hierarchy we can keep
+equalities as small as we would be able to with a separate notion of
+type equality.  As a side effect of developing \mykant, we also gave an
+original account of bidirectional type checking for user defined types,
+which gets rid of many type annotations while keeping the language very simple.
+
+Through the design of the theory of \mykant\ we have followed an
+approach where study and implementation were continuously interleaved,
+as a `reality check' for the ideas that we wished to implement.  Given
+the great effort necessary to build a theorem prover capable of
+`real-world' proofs we have not attempted to compare \mykant's
+capabilities to those of Agda and Coq, the theorem provers that the
+author is most familiar with and in general two of the main players in
+the field.  However we have ported a lot of simpler examples to check
+that the key features are working, some of which have been used in the
+previous sections and are reproduced in the appendices\footnote{The full
+list is available in the repository:
+\url{https://github.com/bitonic/kant/tree/master/data/samples/good}.}.
+A full example of interaction with \mykant\ is given in Section
+\ref{sec:type-holes}.
+
+The main culprits for the delays in the implementation are two issues
+that revealed themselves to be far less obvious than what the author
+predicted.  The first, as we have already remarked in Section
+\ref{sec:term-repr}, is to have an adequate term representation that
+lets us express the right constructs in a safe way.  There is still no
+widely accepted solution to this problem, which is approached in many
+different ways both in the literature and in the programming
+practice. The second aspect is the treatment of user defined data types.
+Again, the best techniques to implement them in a dependently typed
+setting still have not crystallised and implementors reinvent many
+wheels each time a new system is built.  The author is still conflicted
+on whether having user defined types at all is the right decision:
+while they are essential, a recently discovered paper by
+\cite{dagand2012elaborating} describes a way to efficiently encode
+user-defined data types into a set of core primitives---an option that
+seems very attractive.
+
+In general, implementing dependently typed languages is still a poorly
+understood practice, and almost every stage requires experimentation on
+behalf of the author.  Another example is the treatment of the implicit
+hierarchy, where no resources are present describing the problem from an
+implementation perspective (we described our approach in Section
+\ref{sec:hier-impl}).  Hopefully this state of things will change in the
+near future, and recent publications are promising in this direction,
+for example an unpublished paper by \cite{Brady2013} describing his
+implementation of the Idris programming language.  Our ultimate goal is
+to be a part of this collective effort.
+
+\subsection{A type holes tutorial}
  \label{sec:type-holes}
  
+As a taster and showcase for the capabilities of \mykant, we present an
+interactive session with the \mykant\ REPL.  While doing so, we present
+a feature that we still have not covered: type holes.
+
  Type holes are, in the author's opinion, one of the `killer' features of
  interactive theorem provers, and one that is begging to be exported to
-the word of mainstream programming.  The idea is that when we are
-developing a proof or a program we can insert a hole to have the
-software tell us the type expected at that point.  Furthermore, we can
-ask for the type of variables in context, to better understand our
-surroundings.  Here we give a short tutorial in \mykant\ of this tool to
-give an idea of its usefulness.
+mainstream programming---although it is much more effective in a
+well-typed, functional setting.  The idea is that when we are developing
+a proof or a program we can insert a hole to have the software tell us
+the type expected at that point.  Furthermore, we can ask for the type
+of variables in context, to better understand our surroundings.
+
+In \mykant\ we use type holes by putting them where a term should go.
+We need to specify a name for the hole and then we can put as many terms
+as we like in it.  \mykant\ will tell us which type it is expecting for
+the term where the hole is, and the type for each  term that we have
+included.  For example if we had:
+\begin{Verbatim}
+plus [m n : Nat] : Nat ⇒ (
+    {| h1 m n |}
+)
+\end{Verbatim}
+And we loaded the file in \mykant, we would get:
+\begin{Verbatim}[frame=leftline]
+>>> :l plus.ka
+Holes:
+  h1 : Nat
+    m : Nat
+    n : Nat
+\end{Verbatim}
  
  Suppose we wanted to define the `less or equal' ordering on natural
  numbers as described in Section \ref{sec:user-type}.  We will
@@ -4705,7 +5149,7 @@ Unit
  >>> :e le (suc (suc zero)) (suc zero)
  Empty
  \end{Verbatim}
-Another functionality of type holes is examining types of things in
+The other functionality of type holes is examining types of things in
  context.  Going back to the examples in Section \ref{sec:term-types}, we can
  implement the safe \texttt{head} function with our newly defined
  \texttt{le}:
@@ -4750,41 +5194,76 @@ head [A : ⋆] [l : List A] : gt (length A l) zero → A ⇒ (
      (λ x _ _ _ ⇒ x)
  )
  \end{Verbatim}
-
+Now, if we try to get the head of an empty list, we face a problem:
+\begin{Verbatim}[frame=leftline]
+>>> :t head Nat nil
+Type: Empty → Nat
+\end{Verbatim}
+We would have to provide something of type \texttt{Empty}, which
+hopefully should be impossible.  For non-empty lists, on the other hand,
+things run smoothly:
+\begin{Verbatim}[frame=leftline]
+>>> :t head Nat (cons zero nil)
+Type: Unit → Nat
+>>> :e head Nat (cons zero nil) tt
+zero
+\end{Verbatim}
  This should give a vague idea of why type holes are so useful and in
  more in general about the development process in \mykant.  Most
  interactive theorem provers offer some kind of facility
  to... interactively develop proofs, usually much more powerful than the
  fairly bare tools present in \mykant.  Agda in particular offers a
-particularly powerful mode for the \texttt{Emacs} text editor.
-
-\subsubsection{Web REPL}
-
-\section{Evaluation}
-\label{sec:evaluation}
+celebrated interactive mode for the \texttt{Emacs} text editor.
  
  \section{Future work}
  \label{sec:future-work}
  
-As mentioned, the first move that the author plans to make is to work
-towards a simple but powerful term representation, and then adjust
-\mykant\ to this new representation.  A good plan seems to be to
-associate each type (terms, telescopes, etc.) with what we can
-substitute variables with, so that the term type will be associated with
-itself, while telescopes and propositions will be associated to terms.
-This can probably be accomplished elegantly with Haskell's \emph{type
-  families} \citep{chakravarty2005associated}.  After achieving a more
-solid machinery for terms, implementing observational equality fully
-should prove relatively easy.
+The first move that the author plans to make is to work towards a simple
+but powerful term representation.  A good plan seems to be to associate
+each type (terms, telescopes, etc.) with what we can substitute
+variables with, so that the term type will be associated with itself,
+while telescopes and propositions will be associated to terms.  This can
+probably be accomplished elegantly with Haskell's \emph{type families}
+\citep{chakravarty2005associated}.  After achieving a more solid
+machinery for terms, implementing observational equality fully should
+prove relatively easy.
  
-Beyond this steps, \mykant\ would still need many additions to compete
-as a reasonable alternative to the existing systems:
+Beyond these steps, we can go in many directions to improve the
+system that we described---here we review the main ones.
  
  \begin{description}
-\item[Pattern matching] Eliminators are very clumsy to use, and
+\item[Pattern matching and recursion] Eliminators are very clumsy,
+  and using them can be especially frustrating if we are used to writing
+  functions via explicit recursion.  \cite{Gimenez1995} showed how to
+  reduce well-founded recursive definitions to primitive recursors.
+  Intuitively, defining a function through an eliminator corresponds to
+  pattern matching and recursively calling the function on the recursive
+  occurrences of the type we matched against.
+
+  Nested pattern matching can be justified by identifying a notion of
+  `structurally smaller', and allowing recursive calls on all smaller
+  arguments.  Epigram goes all the way and actually implements recursion
+  exclusively by providing a convenient interface to the two constructs
+  above \citep{EpigramTut, McBride2004}.
+
+  However, as we extend the flexibility of our recursion, elaborating
+  definitions to eliminators becomes more and more laborious.  For
+  example we might want mutually recursive definitions and definitions
+  that terminate relying on the structure of two arguments instead of
+  just one.  For this reason both Agda and Coq (Agda putting in more
+  effort) let the user write recursive definitions freely, and then
+  employ an external syntactic check on the recursive calls to ensure that
+  the definitions are terminating.
+
+  Moreover, if we want to use dependently typed languages for
+  programming purposes, we will probably want to sidestep the
+  termination checker and write a possibly non-terminating function;
+  maybe because proving termination is particularly difficult.  With
+  explicit recursion this amounts to turning off a check; if we have
+  only eliminators it is impossible.
  
  \item[More powerful data types] A popular improvement on basic data
-  types are inductive families \cite{Dybjer1991}, where the parameters
+  types are inductive families \citep{Dybjer1991}, where the parameters
    for the type constructors can change based on the data constructors,
    which lets us express naturally types such as $\mytyc{Vec} : \mynat
    \myarr \mytyp$, which given a number returns the type of lists of that
@@ -4794,51 +5273,113 @@ as a reasonable alternative to the existing systems:
    by adding equalities concerning the parameters of the type
    constructors as arguments to the data constructor, in much the same
    way that Generalised Abstract Data Types \citep{GHC} are handled in
-  Haskell, where interestingly the modified version of System F that
-  lies at the core of recent versions of GHC features coercions similar
-  to those found in OTT \citep{Sulzmann2007}.
-
-  The notion of inductive family also yields a more interesting notion
-  of pattern matching, since matching on an argument influences the
-  value of the parameters of the type of said argument.  This means that
-  pattern matching influences the context, which can be exploited to
-  constraint the possible constructors in \emph{other} arguments
-  \cite{McBride2004}.
-
-  Another popular extension introduced by \cite{dybjer2000general} is
+  Haskell.  Interestingly the modified version of System F that lies at
+  the core of recent versions of GHC features coercions reminiscent of
+  those found in OTT, motivated precisely by the need to implement GADTs
+  in an elegant way \citep{Sulzmann2007}.
+
+  Another concept introduced by \cite{dybjer2000general} is
    induction-recursion, where we define a data type in tandem with a
    function on that type.  This technique has proven extremely useful to
    define embeddings of other calculi in an host language, by defining
    the representation of the embedded language as a data type and at the
    same time a function decoding from the representation to a type in the
-  host language.
-
-  It is also worth mentionning that in recent times there has been work
-  by \cite{dagand2012elaborating, chapman2010gentle} to show how to
-  define a set of primitives that data types can be elaborated into,
-  with the additional advantage of having the possibility of having a
-  very powerful notion of generic programming by writing functions
-  working on the `primitive' types as to be workable by all `compatible'
-  user-defined data types.  This has been a considerable problem in the
-  dependently type world, where we often define types which are more
-  `strongly typed' version of similar structures,\footnote{For example
-    the $\mytyc{OList}$ presented in Section \ref{sec:user-type} being a
-    `more typed' version of an ordinary list.} and then find ourselves
-  forced to redefine identical operations on both types.
-
-\item[Type inference] While bidirectional type checking helps, for a
-  syts \cite{miller1992unification} \cite{gundrytutorial}
-  \cite{huet1973undecidability}.
-
-\item[Coinduction] \cite{cockett1992charity} \cite{mcbride2009let}.
+  host language.  The decoding function is then used to define the data
+  type for the embedding itself, for example by reusing the host
+  language's functions to describe functions in the embedded language,
+  with decoded types as arguments.
+
+  It is also worth mentioning that in recent times there has been work
+  \citep{dagand2012elaborating, chapman2010gentle} to show how to define
+  a set of primitives that data types can be elaborated into.  The big
+  advantage of the approach proposed is enabling a very powerful notion
+  of generic programming, by writing functions working on the
+  `primitive' types so as to be workable by all the other `compatible'
+  elaborated user defined types.  This has been a considerable problem
+  in the dependently typed world, where we often define types which are
+  more `strongly typed' versions of similar structures,\footnote{For
+    example the $\mytyc{OList}$ presented in Section \ref{sec:user-type}
+    being a `more typed' version of an ordinary list.} and then find
+  ourselves forced to redefine identical operations on both types.
+
+\item[Pattern matching and inductive families] The notion of inductive
+  family also yields a more interesting notion of pattern matching,
+  since matching on an argument influences the value of the parameters
+  of the type of said argument.  This means that pattern matching
+  influences the context, which can be exploited to constrain the
+  possible data constructors for \emph{other} arguments
+  \citep{McBride2004}.
+
+\item[Type inference] While bidirectional type checking helps at a very
+  low cost of implementation and complexity, a much more powerful weapon
+  is found in \emph{pattern unification}, which allows Hindley-Milner
+  style inference for dependently typed languages.  Unification for
+  higher order terms is undecidable and unification problems do not
+  always have a most general unifier \citep{huet1973undecidability}.
+  However \cite{miller1992unification} identified a decidable fragment
+  of higher order unification commonly known as pattern unification,
+  which is employed in most theorem provers to drastically reduce the
+  number of type annotations.  \cite{gundrytutorial} provide a tutorial
+  on this practice.
+
+\item[Coinductive data types] When we specify inductive data types, we
+  do it by specifying their \emph{constructors}---functions with the type
+  we are defining as codomain.  Then, we are offered a way to compute by
+  recursively \emph{destructing} or \emph{eliminating} a member of the
+  defined data type.
+
+  Coinductive data types are the dual of this approach.  We specify ways
+  to destruct data, and we are given a way to generate the defined type
+  by repeatedly `unfolding' starting from some seed data.  For example,
+  we could define infinite streams by specifying $\myfun{head}$ and
+  $\myfun{tail}$ destructors---here using a syntax reminiscent of
+  \mykant\ records:
+  \[
+  \begin{array}{@{}l}
+    \mysyn{codata}\ \mytyc{Stream}\myappsp (\myb{A} {:} \mytyp)\ \mysyn{where} \\
+    \myind{2} \{ \myfun{head} : \myb{A}, \myfun{tail} : \mytyc{Stream} \myappsp \myb{A}\}
+  \end{array}
+  \]
+  which will hopefully give us something like
+  \[
+  \begin{array}{@{}l}
+    \myfun{head} : (\myb{A}{:}\mytyp) \myarr \mytyc{Stream} \myappsp \myb{A} \myarr \myb{A} \\
+    \myfun{tail} : (\myb{A}{:}\mytyp) \myarr \mytyc{Stream} \myappsp \myb{A} \myarr \mytyc{Stream} \myappsp \myb{A} \\
+    \mytyc{Stream}.\mydc{unfold} : (\myb{A}\, \myb{B} {:} \mytyp) \myarr (\myb{A} \myarr \myb{B} \myprod \myb{A}) \myarr \myb{A} \myarr \mytyc{Stream} \myappsp \myb{B}
+  \end{array}
+  \]
+  Where, in $\mydc{unfold}$, $\myb{B} \myprod \myb{A}$ represents the
+  fields of $\mytyc{Stream}$ but with the recursive occurrence replaced
+  by the `seed' type $\myb{A}$.
+
+  Beyond simple infinite types like $\mytyc{Stream}$, coinduction is
+  particularly useful to write non-terminating programs like servers or
+  software interacting with a user, while guaranteeing their liveness.
+  Moreover it lets us model possibly non-terminating computations in an
+  elegant way \citep{Capretta2005}, enabling for example the study of
+  operational semantics for non-terminating languages
+  \citep{Danielsson2012}.
+ 
+  \cite{cockett1992charity} pioneered this approach in their programming
+  language Charity, and coinduction has since been adopted in systems
+  such as Coq \citep{Gimenez1996} and Agda.  However these
+  implementations are unsatisfactory, since Coq's breaks subject
+  reduction; and Agda, to avoid this problem, does not allow types to
+  depend on the unfolding of codata.  \cite{mcbride2009let} has shown
+  how observational equality can help to resolve these issues, since we
+  can reason about the unfoldings in a better way, like we reason about
+  functions' extensional behaviour.
  \end{description}
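+
+To give a flavour of the $\mytyc{Stream}$ example discussed under
+coinductive data types, here is a sketch in Haskell, where laziness
+lets an ordinary data type play the role of codata (without any of the
+totality guarantees that a proper treatment would give).  The names
+\texttt{headS}, \texttt{tailS}, \texttt{takeS} and \texttt{nats} are
+chosen for this example only:
+\begin{Verbatim}
+-- A stream is given by its two destructors.
+data Stream a = Cons { headS :: a, tailS :: Stream a }
+
+-- Generate a stream from a seed: each step yields one element and
+-- the seed for the rest of the stream.
+unfold :: (s -> (b, s)) -> s -> Stream b
+unfold step seed = Cons x (unfold step seed')
+  where (x, seed') = step seed
+
+-- Observe a finite prefix of a stream.
+takeS :: Int -> Stream a -> [a]
+takeS n s
+    | n <= 0    = []
+    | otherwise = headS s : takeS (n - 1) (tailS s)
+
+-- The stream of natural numbers, unfolded from the seed 0.
+nats :: Stream Integer
+nats = unfold (\n -> (n, n + 1)) 0
+\end{Verbatim}
+Here \texttt{unfold} mirrors the type given above for
+$\mytyc{Stream}.\mydc{unfold}$, with the seed type written \texttt{s}.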
  
-% TODO coinduction (obscoin, gimenez, jacobs), pattern unification (miller,
-% gundry), partiality monad (NAD)
+The author looks forward to the study and possibly the implementation of
+these ideas in the years to come.
+
+\newpage{}
  
  \appendix
  
  \section{Notation and syntax}
+\label{app:notation}
  
  Syntax, derivation rules, and reduction rules, are enclosed in frames describing
  the type of relation being established and the syntactic elements appearing,
@@ -4848,8 +5389,8 @@ for example
    Typing derivations here.
  }
  
-In the languages presented and Agda code samples I also highlight the syntax,
-following a uniform color and font convention:
+In the languages presented and Agda code samples we also highlight the syntax,
+following a uniform colour, capitalisation, and font style convention:
  
  \begin{center}
    \begin{tabular}{c | l}
@@ -4862,15 +5403,15 @@ following a uniform color and font convention:
    \end{tabular}
  \end{center}
  
-When presenting grammars, I will use a word in $\mysynel{math}$ font
+When presenting grammars, we use a word in $\mysynel{math}$ font
  (e.g. $\mytmsyn$ or $\mytysyn$) to indicate indicate
-nonterminals. Additionally, I will use quite flexibly a $\mysynel{math}$
+nonterminals. Additionally, we use quite flexibly a $\mysynel{math}$
  font to indicate a syntactic element in derivations or meta-operations.
  More specifically, terms are usually indicated by lowercase letters
  (often $\mytmt$, $\mytmm$, or $\mytmn$); and types by an uppercase
  letter (often $\mytya$, $\mytyb$, or $\mytycc$).
  
-When presenting type derivations, I will often abbreviate and present multiple
+When presenting type derivations, we often abbreviate and present multiple
  conclusions, each on a separate line:
  \begin{prooftree}
    \AxiomC{$\myjud{\mytmt}{\mytya \myprod \mytyb}$}
@@ -4878,8 +5419,7 @@ conclusions, each on a separate line:
    \noLine
    \UnaryInfC{$\myjud{\myapp{\mysnd}{\mytmt}}{\mytyb}$}
  \end{prooftree}
-
-I will often present `definitions' in the described calculi and in
+We often present `definitions' in the described calculi and in
  $\mykant$\ itself, like so:
  \[
  \begin{array}{@{}l}
@@ -4887,7 +5427,7 @@ $\mykant$\ itself, like so:
    \myfun{name} \myappsp \myb{arg_1} \myappsp \myb{arg_2} \myappsp \cdots \mapsto \mytmsyn
  \end{array}
  \]
-To define operators, I use a mixfix notation similar
+To define operators, we use a mixfix notation similar
  to Agda, where $\myarg$s denote arguments:
  \[
  \begin{array}{@{}l}
@@ -4895,12 +5435,10 @@ to Agda, where $\myarg$s denote arguments:
    \myb{b_1} \mathrel{\myfun{$\wedge$}} \myb{b_2} \mapsto \cdots
  \end{array}
  \]
-
-In explicitly typed systems, I will also omit type annotations when they
+In explicitly typed systems, we omit type annotations when they
  are obvious, e.g. by not annotating the type of parameters of
-abstractions or of dependent pairs.
-
-I will introduce multiple arguments in one go in arrow types:
+abstractions or of dependent pairs.\\
+We introduce multiple arguments in one go in arrow types:
  \[
    (\myb{x}\, \myb{y} {:} \mytya) \myarr \cdots = (\myb{x} {:} \mytya) \myarr (\myb{y} {:} \mytya) \myarr \cdots
  \]
@@ -4908,12 +5446,13 @@ and in abstractions:
  \[
  \myabs{\myb{x}\myappsp\myb{y}}{\cdots} = \myabs{\myb{x}}{\myabs{\myb{y}}{\cdots}}
  \]
-I will also omit arrows to abbreviate types:
+We also omit arrows to abbreviate types:
  \[
  (\myb{x} {:} \mytya)(\myb{y} {:} \mytyb) \myarr \cdots =
  (\myb{x} {:} \mytya) \myarr (\myb{y} {:} \mytyb) \myarr \cdots
  \]
-Meta operations names will be displayed in $\mymeta{smallcaps}$ and
+
+Meta operations names are displayed in $\mymeta{smallcaps}$ and
  written in a pattern matching style, also making use of boolean guards.
  For example, a meta operation operating on a context and terms might
  look like this:
@@ -4925,8 +5464,8 @@ look like this:
  \end{array}
  \]
  
-I will from time to time give examples in the Haskell programming
-language as defined in \citep{Haskell2010}, which I will typeset in
+From time to time we give examples in the Haskell programming
+language as defined by \cite{Haskell2010}, which we typeset in
  \texttt{teletype} font.  I assume that the reader is already familiar
  with Haskell, plenty of good introductions are available
  \citep{LYAH,ProgInHask}.
@@ -4934,9 +5473,11 @@ with Haskell, plenty of good introductions are available
  Examples of \mykant\ code will be typeset nicely with \LaTeX in Section
  \ref{sec:kant-theory}, to adjust with the rest of the presentation; and
  in \texttt{teletype} font in the rest of the document, including Section
-\ref{sec:kant-practice} and in the appendices.  Snippets of sessions in
-the \mykant\ prompt will be displayed with a left border, to distinguish
-them from snippets of code:
+\ref{sec:kant-practice} and in the appendices.  All the \mykant\ code
+shown is meant to be working and ready to be inputted in a \mykant\
+prompt or loaded from a file. Snippets of sessions in the \mykant\
+prompt will be displayed with a left border, to distinguish them from
+snippets of code:
  \begin{Verbatim}[frame=leftline]
  >>> :t ⋆
  Type: ⋆
@@ -5206,7 +5747,7 @@ efficiently with hash maps.
  \verbatiminput{constraint.hs}
  }
  
-
+\newpage{}
  
  \bibliographystyle{authordate1}
  \bibliography{thesis}