\input{lagda-preamble} \title{Euclid's theorem in Agda} \begin{document} \maketitle This document describes a proof in Agda that there are infinitely many primes, or more precisely, that for any natural number $n$ there exists a prime $p$ such that $p>n$. We have explained the code in great detail, as an introduction to Agda for mathematicians. As Agda is based on intuitionistic logic, we can only prove the existence of $p$ by constructing it, in some sense or other. Our approach is as follows: \begin{itemize} \item[(a)] We prove that for any natural number $m>1$ there is a smallest natural number $p>1$ such that $p$ divides $m$, and that this $p$ is prime. \item[(b)] Given an arbitrary natural number $n$ we show that the number $m=n!+1$ is larger than $1$, so we can apply~(a) to get a prime $p$ dividing $m$. \item[(c)] We then prove that this $p$ is larger than $n$. In more detail, we suppose that $p\leq n$, and then note that $p$ divides both $n!$ and $n!+1$, so $p$ divides $1$, so $p=1$, which is a contradiction because $1$ is not prime. This proves the double negation of the desired inequality $p>n$, which is enough even in Agda's constructive framework because the order relation on $ℕ$ is decidable. \end{itemize} The proof relies on a number of modules in the Agda standard library, and some new modules written for this project. The basic definitions of Peano arithmetic are set up in the Agda standard library (specifically, in the module \verb+Data.Nat+ and its submodules). Facts about divisibility and primes are proved in \verb+Data.Nat.Divisibility+ and \verb+Data.Nat.Primality+. The new modules are as follows: \begin{itemize} \item[(1)] The module \verb+Extra.Data.Nat.LargerPrime+ defined in this file assembles the ingredients mentioned below to prove the main result. \item[(2)] The module \verb+Extra.Data.Nat.LeastDivisor+ implements step~(a) in the above outline. It relies on a more general framework for finding least elements defined in the module \verb+Extra.Data.Nat.Minimiser+. \item[(3)] The module \verb+Extra.Data.Nat.Primality+ proves some basic facts about primes that are not covered by the standard library module \verb+Data.Nat.Primality+. \item[(4)] The module \verb+Extra.Data.Nat.Factorial+ defines the factorial function and proves some basic properties. \end{itemize} We start by declaring the module: \begin{code} module Extra.Data.Nat.LargerPrime where \end{code} Modules provide a framework for packaging together related definitions. In ordinary mathematical writing it is common to quote results from different sources with incompatible notational conventions; one then has to redefine some of the notation in the sources to avoid ambiguity or clashes. Agda modules provide reasonably good support for maneuvers of this kind. We next list some standard library modules whose results we need to use. Note that we use the keyword \verb+open+ as well as \verb+import+. If we just say \verb+import Data.Nat.Primality+ then we can use the definition \verb+Prime+ given in the module \verb+Data.Nat.Primality+ but we need to refer to it as \verb+Data.Nat.Primality.Prime+. If we then say \verb+open Data.Nat.Primality+ then we can use the short form \verb+Prime+ rather than \verb+Data.Nat.Primality.Prime+. The line \verb+open import Data.Nat.Primality+ is equivalent to \verb+import+ followed by \verb+open+. The code for the module \verb+Data.Nat.Primality+ is in the file \verb+src/Data/Nat/Primality.agda+ in the standard library. The location of the standard library depends on the details of your installation. If you already have Agda working under emacs then the command \verb+C-h v agda2-include-dirs+ should tell you where to look. \begin{code} open import Data.Empty open import Data.Nat open import Data.Nat.Properties open import Data.Nat.Divisibility open import Data.Nat.Primality open import Relation.Nullary.Core open import Relation.Binary.PropositionalEquality \end{code} The module \verb+Data.Empty+ defines the empty set. The next four modules should be self-explanatory. The module \verb+Relation.Nullary.Core+ contains the negation operator and some auxiliary definitions for decidable relations that will be discussed later. The module \verb+Relation.Binary.PropositionalEquality+ contains some code that makes it easier to formalise proofs that use chains of equalities, like \begin{align*} a &= b \hspace{4em} \text{ by Lemma 1 } \\ &= c \hspace{4em} \text{ by Lemma 2 } \\ &= d \hspace{4em} \text{ by Lemma 3. } \end{align*} It also contains many things that will not concern us for the moment. Next, we will need to use the fact that addition of natural numbers is commutative. In order to keep the standard library tidy, this is not represented as an isolated fact, but is packaged with various other facts in a proof that $ℕ$ is a commutative semiring. We therefore need to import the definition of a semiring to gain access to the fact that we need. We use the following code: \begin{code} open import Algebra.Structures open IsCommutativeSemiring isCommutativeSemiring using ( +-comm ) \end{code} The effect is that the name \verb!+-comm! now refers to a proof that $n+m=m+n$ for all $n,m\in ℕ$. (Note that names in Agda can use almost any character other than a space. It is common for names to include mathematical symbols or other punctuation.) The details take some explanation, which could be skipped by the impatient. Firstly, \verb+IsCommutativeSemigroup+ (with a capital \verb+I+) is a function of eight arguments that is defined in the module \verb+Algebra.Structures+. The first three arguments are given in curly brackets in the function definition, which means that in most circumstances these arguments need not be given explicitly because they can be deduced from the remaining arguments. The first two arguments are levels, which are used by Agda to avoid Russell-type paradoxes. One can usually ignore these completely. The third (implicit) argument is a set $A$. The remaining (explicit) arguments are a relation on $A$ (written as $≈$), two binary operations on $A$ (written as addition and multiplication) and two elements $0,1\in A$. The value of the function \verb+IsCommutativeSemigroup+ at these arguments is, roughly speaking, the set of proofs that $≈$ is an equivalence relation and that $A/≈$ is a commutative semiring under the given operations. Of course this set could be empty, if the operations do not in fact make $A/≈$ into a commutative semiring. (For reasons that we will not discuss here, algebraic structures are always represented in Agda in this fashion, with operations defined on a set $A$, and axioms that hold only modulo a specified equivalence relation on $A$.) Next, the natural numbers zero and one, and the operations of addition and multiplication of natural numbers, are defined in \verb+Data.Nat+. We also have the relation of equality on $ℕ$. Feeding these ingredients into the function \verb+IsCommutativeSemiring+ we get the set $P$ of proofs that $ℕ$ is a commutative semiring. The module \verb+Data.Nat.Properties+ constructs such a proof, which is a particular element called \verb+Data.Nat.Properties.isCommutativeSemiring+ (with a lower case \verb+i+) of the set $P$. As we have imported and opened the module \verb+Data.Nat.Properties+, we can omit the prefix and use the name \verb+isCommutativeSemiring+. We described \verb+IsCommutativeSemiring+ as a set-valued function, which is essentially correct, but in more technical terms it is a parametrised record type. Agda automatically creates an associated module for each parametrised record type. The statement \begin{verbatim} open IsCommutativeSemiring isCommutativeSemiring using ( +-comm ) \end{verbatim} is a variant of the operation of opening a module, which we have seen before. The proof \verb+isCommutativeSemigroup+ is a package of proofs of all the different axioms for a commutative semiring, and the effect of the above \verb+open+ statement is to enable us to refer to the part of the package called \verb!+-comm!, which is a proof that addition is commutative. This process of opening a record is similar to the usual practice of saying ``consider a semiring $R$'' and then taking all semiring-related notation to refer to $R$ unless otherwise specified. We need one more module from the standard library, called \verb+∣-Reasoning+. This is defined as a submodule of \verb+Data.Nat.Divisibility+, and it makes it easier to reason with chains of divisibility statements, analogous to the chains of equalities that we described previously. \begin{code} open ∣-Reasoning \end{code} We also import three new modules that we mentioned previously: \begin{code} open import Extra.Data.Nat.LeastDivisor open import Extra.Data.Nat.Primality open import Extra.Data.Nat.Factorial \end{code} We next specify what we hope to construct. In Agda, whenever we construct an object with certain properties, we need to package it together with a collection of proofs that it does indeed have those properties. In our case we construct a prime $p$ that is larger than $n$, and we need to remember a proof of that inequality. In Agda, the proposition that $p$ is greater than $n$ is denoted by \verb+p > n+, with spaces between the symbols. Like all propositions in Agda, this is a type, whose terms (if any) are the proofs that $p>n$. The string \verb+p>n+ (with no space) is a valid variable name in Agda, and it is a standard idiom to use this name to refer to an arbitrary term of type \verb+p > n+. With this notation, we hope to construct a record consisting of a natural number $p$, a proof (denoted by \verb+p-prime+) that $p$ is prime, and a proof (denoted by \verb+p>n+) that $p>n$. The next block of code defines \verb+LargerPrime n+ (with a capital L) to be the set of such records. \begin{code} record LargerPrime (n : ℕ) : Set where field p : ℕ p-prime : Prime p p>n : p > n \end{code} We now want to define a function \verb+largerPrime+ (with a lower case l) which constructs an element \verb+largerPrime n+ of \verb+LargerPrime n+ for each natural number $n$. In Agda's notation, this means that \verb+largerPrime+ is a term of type \verb+(n : ℕ) → LargerPrime n+. Agda's type checking system is strong enough that to ensure that any term that it accepts as having the above type gives a valid proof of Euclid's theorem. The next few lines are housekeeping: \begin{code} largerPrime : (n : ℕ) → LargerPrime n largerPrime n = record { p = p ; p-prime = p-prime ; p>n = p>n } where \end{code} The rest of the code is indented at least as much as the keyword \verb+where+, which means that it is treated as a single block. The above lines tell Agda that this block will define \verb+p+, \verb+p-prime+ and \verb+p>n+ with the advertised types. We next invoke the \verb+leastDivisor+ function defined in the module \verb+Extra.Data.Nat.LeastDivisor+. \begin{code} L = leastDivisor (1 + (n !)) (s≤s (n!≥1 n)) \end{code} The \verb+leastDivisor+ function takes two arguments: a natural number and a proof that that number is at least two. For the number we take $1+(n!)$. The module \verb+Extra.Data.Nat.Factorial+ defines a function \verb+n!≥1+ such that \verb+n!≥1 n+ is a proof that \verb+1 ≤ n!+, and \verb+Data.Nat+ defines a function \verb+s≤s+ that converts proofs of \verb+a ≤ b+ to proofs of \verb!1 + a ≤ 1 + b!. We can thus use \verb+s≤s (n!≥1 n)+ as the second argument to \verb+leastDivisor+, and we define \verb+L+ to be the value returned by this function. Now \verb+L+ is a record with seven fields. There is a field called \verb+p+ whose value is a natural number, a field called \verb+p-prime+ whose value is a proof that \verb+p+ is prime, and a field called \verb+p-least+ whose value is a proof that no number with $2\leq kn+. We first declare the type of \verb+p>n+: \begin{code} p>n : p > n \end{code} We next have a \verb+with+ clause: \begin{code} p>n with (suc n) ≤? p \end{code} Here \verb+suc+ is the successor function, taking $n$ to $1+n$. Next, there is a function called \verb+_≤?_+ defined in \verb+Data.Nat+. This is an example of a useful notational mechanism in Agda: when the name of a function involves underscores, the position of those underscores indicates how the arguments are placed relative to the function symbol. Because the name \verb+_≤?_+ starts and ends with an underscore, we can write \verb+n ≤? m+ for the value of the function at a pair of natural numbers \verb+n+ and \verb+m+. That value will either have the form \verb+yes r+ (where \verb+r+ is a proof that $n\leq m$) or \verb+no q+ (where \verb+q+ is a proof that $n\not\leq m$). (These are terms of the type \verb+Dec (n ≤ m)+, which is defined in \verb+Relation.Nullary.Core+.) The clause \verb+p>n with (suc n) ≤? p+ indicates that we will define \verb+p>n+ using two separate cases, depending on the value of \verb+(suc n) ≤? p+. If this value has the form \verb+yes r+ then \verb+r+ must be a proof that \verb+p > n+ and we can just take \verb+p>n+ to be \verb+r+. (This relies on the fact that \verb+p > n+ is \emph{by definition} the same as \verb+(suc n) ≤ p+, as we see by reading the code for \verb+Data.Nat+.) This is implemented by the following line: \begin{code} p>n | yes r = r \end{code} Now what if \verb+(suc n) ≤? p+ has the form \verb+no q+ for some \verb+q+? Here \verb+q+ would have to be a proof that \verb+p ≤ n+, which cannot actually happen. Nonetheless, Agda forces use to deal with this hypothetical case, and prove that it is vacuous. This works as follows. Agda notation for the empty type is \verb+⊥+. For any set \verb+W+ there is a term \verb+⊥-elim+ of type \verb+(⊥ → W)+. Logically, this corresponds to the principle that a falsity implies everything; set-theoretically, it corresponds to the fact that the empty set is initial in the category of sets. (The set \verb+W+ is an implicit argument to the \verb+⊥-elim+ function, given in curly brackets when that function is defined in \verb+Data.Empty+.) Next, the function \verb+Prime+ defined in \verb+Data.Nat.Primality+ sends a natural number $n$ to the set of proofs that $n$ is prime. When $n$ is given as a double successor, the set \verb+(Prime n)+ is defined in an obvious way in terms of divisibility properties. However, the two initial cases $n=0$ and $n=1$ are handled separately, with \verb+Prime 0 = Prime 1 = ⊥+ by definition. We handle the \verb+no q+ case by constructing from \verb+q+ a proof that $1$ is prime, or in other words a term \verb+1-prime+ of type \verb+Prime 1 = ⊥+. We then apply \verb+⊥-elim+ to \verb+1-prime+ to get a term of whatever type we want; in this case, a term of type \verb+(p > n)+. The line \begin{code} p>n | no q = ⊥-elim 1-prime where \end{code} indicates that we will handle the \verb+no q+ case by the general approach discussed above, and promises that the following block of code will eventually define \verb+1-prime+. Note that the type checker knows what type \verb+p>n+ is supposed to have, and supplies this type as the implicit argument to \verb+⊥-elim+. It also knows that \verb+1-prime+ must have type \verb+⊥+ in order to be an acceptable explicit argument to \verb+⊥-elim+, and when we get to the definition of \verb+1-prime+ it will check that we have indeed provided a term of type \verb+⊥+. The rest of the code forms another \verb+where+ block nested inside the \verb+where+ block that we introduced earlier, so it is indented by two more spaces. We now construct a proof that \verb+p+ divides \verb+n !+. The module \verb+Data.Nat.Factorial+ defines a function \verb+k∣n!+ that takes a pair of inequalities \verb+1 ≤ k+ and \verb+k ≤ n+, and returns a proof that \verb+k+ divides \verb+n !+. We have a proof called \verb+p-prime+ that \verb+p+ is prime, and the function \verb+prime≥1+ defined in \verb+Extra.Data.Nat.Primality+ can be used to convert this to a proof that \verb+1 ≤ p+. By hypothesis we have a proof called \verb+q+ that \verb+p > n+ is false, and using the functions \verb+≰⇒>+ and \verb+≤-pred+ from \verb+Data.Nat+ we convert this to a proof that \verb+p ≤ n+ is true. After feeding these ingredients to the function \verb+k∣n!+ we obtain the required proof that \verb+p+ divides \verb+n !+. We denote this proof by \verb+p∣n!+. \begin{code} p∣n! : p ∣ n ! p∣n! = k∣n! (prime≥1 p-prime) (≤-pred (≰⇒> q)) \end{code} We now need a proof that \verb+p+ divides \verb|n ! + 1|. We have already extracted from the record \verb+L+ a proof (denoted by \verb|p∣1+n!|) that \verb+p+ divides \verb|1 + n !|. Although \verb|n ! + 1| is provably equal to \verb|1 + n !|, these two terms have a different internal representation so we cannot just use \verb|p∣1+n!| as a proof that \verb|p ∣ n ! + 1|. However, we also extracted a function called \verb!+-comm! from \verb+Data.Nat.Properties.isCommutativeSemiring+, and \verb|(+-comm 1 (n !))| is a proof that \verb|1 + n ! ≡ n ! + 1|. Using notation established in the module \verb+∣-Reasoning+ we can string these proofs together: \begin{code} p∣n!+1 : p ∣ n ! + 1 p∣n!+1 = begin p ∣⟨ p∣1+n! ⟩ 1 + (n !) ≡⟨ +-comm 1 (n !) ⟩ (n !) + 1 ∎ \end{code} We next use the function \verb+∣-∸+ defined in \verb+Data.Nat.Divisibility+. Given proofs that \verb+i+ divides \verb|n + m| and \verb+n+, this returns a proof that \verb+i+ divides \verb+m+. Thus, we can pass in our proofs that \verb+p+ divides \verb+n !+ and \verb|n ! + 1|, and get back a proof that \verb+p+ divides \verb+1+. \begin{code} p∣1 : p ∣ 1 p∣1 = ∣-∸ p∣n!+1 p∣n! \end{code} The module \verb+Data.Nat.Divisibility+ also defines a function called \verb+∣1⇒≡1+, which can be used to convert our proof that $p$ divides $1$ to a proof that $p=1$. \begin{code} p≡1 : p ≡ 1 p≡1 = ∣1⇒≡1 p∣1 \end{code} We conclude by using the function \verb+subst+ defined in \verb+Relation.Binary.PropositionalEquality.Core+. Roughly speaking, this works as follows. The first argument is a set \verb+A+ depending on a parameter \verb+x+. The second argument is a proof of \verb+u ≡ v+, where \verb+u+ and \verb+v+ are possible values of \verb+x+. The third argument is an element of the set \verb+(A u)+, and the return value is the same thing regarded as an element of \verb+(A v)+. We have a proof that \verb+p ≡ 1+ and a proof that \verb+p+ is prime, so we can use \verb+subst+ to produce a proof that \verb+1+ is prime. \begin{code} 1-prime : Prime 1 1-prime = subst Prime p≡1 p-prime \end{code} As discussed previously, this element \verb+1-prime+ can be fed into \verb+⊥-elim+ to give a tautological proof of our theorem in the vacuous case where \verb+p ≤ n+. \end{document}