Recursion

Something is recursive if it is defined using itself. A simple (albeit hardly useful and contrived) example is the following function:

Haskell
Python

1
f :: Int -> Int
2
f n = f (n + 1)

1
def f(n):
2
    return f(n + 1)

As defined, the body of function f invokes itself. In other words, it is defined using itself. Readers who are unconvinced that f is a recursive definition may see that it is analogous to the following mathematical definition, which is clearly recursive:

Data types can also be defined recursively:

Haskell
Python

1
data SinglyLinkedList a = Empty
2
                        | Node a (SinglyLinkedList a)
3
  deriving (Show, Eq)

1
from abc import ABC
2
from dataclasses import dataclass
3

4
class SinglyLinkedList(ABC):
5
    pass
6

7
class Empty(SinglyLinkedList):
8
    pass
9

10
@dataclass
11
class Node(SinglyLinkedList):
12
    head: object
13
    tail: SinglyLinkedList

Likewise, you can see that the SinglyLinkedList class has a subclass Node which itself holds another SinglyLinkedList. This makes SinglyLinkedList a recursive data structure.

The core idea we present in this section is that we can write recursive functions by thinking structural-inductively.

Induction

We shall begin by describing a proof by induction for a statement over the natural numbers. The principle of a proof by induction is as follows: given a predicate over the natural numbers, if we can show:

is true
(for all natural numbers , implies )

Then is true for all natural numbers . This works because of modus ponens.

Modus Ponens codifies the following idea: if a proposition is true, and if implies , then is true. To show how this allows proofs by induction, we see that we have a proof of . Since we also know that implies , by modus ponens, is true. We also know that implies , and since from earlier is true, by modus ponens, is also true, and so on.

Let us attempt to write a proof by induction. We start with an implementation of the factorial function, then prove that it is correct:

Haskell
Python

1
factorial :: Int -> Int
2
factorial n
3
  | n == 0    = 1
4
  | otherwise = n * factorial (n - 1)

1
def factorial(n):
2
    return 1 if not n else \
3
           n * factorial(n - 1)

Proposition 1.

Let be the proposition that factorial(n) returns . Then, for all natural numbers , is true.

Proof.

We prove and separately.

Basis. Trivial. . Furthermore, by definition, factorial(0) returns 1. In other words, is true.

Inductive. Suppose for some natural number , factorial(k) returns .

By definition of factorial, factorial(k + 1) returns (k + 1) * factorial(k).
By our supposition, this evaluates to , which is, by definition, .
Thus, if for some , factorial(k) returns , then factorial(k + 1) returns . In other words, .

As such, since we have proven and , we have proven by induction.

Recursion via Inductive Reasoning

Naturally (haha), the next question to ask would be, “how do we make use of induction to write recursive functions?” As above, the recipe for a proof by induction involves (broadly) two steps:

Proof of the basis, e.g.
The inductive proof, e.g. . Typically, the inductive step is completed by supposing for some , and showing .

We can write recursive functions similarly by providing:

Non-recursive computation for the result of the base-case, e.g. ;
Recursive computation of based on the result of assuming that gives the correct result.

Example: Natural Numbers

Let us start with a simple description of the natural numbers:

In our usual understanding of the natural numbers, .

A formulation of the natural numbers in Haskell/Python might be the following:

Haskell
Python

1
data Nat = Zero
2
         | Succ Nat
3
  deriving (Show, Eq)

1
class Nat: pass
2

3
@dataclass
4
class Zero(Nat): pass
5

6
@dataclass
7
class Succ(Nat):
8
    pred: Nat

In which case, the number 3 can be written as follows:

Haskell
Python

1
three = Succ (Succ (Succ Zero))

1
three = Succ(Succ(Succ(Zero())))

Let us attempt to define addition over the natural numbers as we have formulated above, recursively:

Haskell
Python

ghci> three = Succ (Succ (Succ Zero))
ghci> two = Succ (Succ Zero)
ghci> add three two
Succ (Succ (Succ (Succ (Succ Zero))))

>>> three = Succ(Succ(Succ(Zero())))
>>> two = Succ(Succ(Zero()))
>>> add(three, two)
Succ(pred=Succ(pred=Succ(pred=Succ(pred=Succ(pred=Zero())))))

We might decide to perform recursion on the first addend (doing so on the second addend is fine as well). In computing there are two possibilities for what could be:

, or
the successor of some natural number .

The first case is straightforward since itself is non-recursive (see the definition of Zero above), and is just . In the other case of where for some , assuming (via our inductive hypothesis) that add(k, n) correctly gives , then is which can be done by Succ(add(k, n)).

Therefore, we arrive at the following solution:

Haskell
Python

1
add :: Nat -> Nat -> Nat
2
add Zero n = n
3
add (Succ k) n = Succ (add k n)

9
def add(m, n):
10
    return n if m == Zero() else \
11
           Succ(add(m.pred, n))

In the Python definition, using structural pattern matching which we present in Chapter 2.4 (Pattern Matching), we may also write the following definition which might be more intuitive:

def add(m, n):
    return n if m == Zero() else \
           Succ(add(m.pred, n))
    match m:
        case Zero(): return n
        case Succ(k): return Succ(add(k, n))

Example: Singly Linked Lists

At this point you might be wondering why we had given such an odd formulation of the natural numbers in Python, when we could have used the int type instead (we totally could). One core idea we would like to make apparent in this formulation, is that recursion via inductive reasoning can be done over the structure of data. Our formulation shows that natural numbers are recursive data structures, where the successor of a natural number has a predecessor who is also, likewise, a natural number. This should make writing recursive functions over other kinds of recursive data structures not too great of a leap from writing recursive functions over natural numbers. To show this, consult our SinglyLinkedList data structure from above before we proceed to write recursive functions over them using inductive reasoning.

First, we shall write a function that appends an element to the end of a singly linked list.

Haskell
Python

ghci> append 1 Empty
Node 1 Empty
ghci> append 2 $ append 1 Empty
Node 1 (Node 2 Empty)

>>> append(1, Empty())
Node(head=1,tail=Empty())
>>> append(2, append(1, Empty()))
Node(head=1,tail=Node(head=2,tail=Empty()))

We can perform recursion over the structure of the list. There are two possible structures of the list:

The empty list
A node of a head element and a tail list

In the former, we append to an empty list, which should give the singleton. Note once again that because the empty list is non-recursive, our solution for appending to the empty list likewise requires no recursion. For the second case of (shorthand for ), assume that our solution is correct for the substructure of the Node, i.e. . Our goal is to have

Observe that:

Therefore, we can write:

Haskell
Python

4
append :: a -> SinglyLinkedList a -> SinglyLinkedList a
5
append y Empty = Node y Empty
6
append y (Node x xs) = Node x (append y xs)

14
def append(x, ls):
15
    if ls == Empty():
16
        return Node(x, Empty())
17
    return Node(ls.head, append(x, ls.tail))
18

19
# Using structural pattern matching:
20
def append2(x, ls):
21
    match ls:
22
        case Empty():
23
            return Node(x, Empty())
24
        case Node(e1, xs):
25
            return Node(e1, append2(x, xs))

We shall give another example by writing list reversals recursively, going straight into our derivation. Reversing the empty list gives the empty list. For nonempty lists our goal is to have . Assuming that , we can see that , giving us the following formulation:

Haskell
Python

7
reverse' :: SinglyLinkedList a -> SinglyLinkedList a
8
reverse' Empty = Empty
9
reverse' (Node x xs) = append x (reverse' xs)

26
def reverse(ls):
27
    if ls == Empty():
28
        return Empty()
29
    return append(ls.head, reverse(ls.tail))
30

31
# Using structural pattern matching:
32
def reverse2(ls):
33
    match ls:
34
        case Empty(): return Empty()
35
        case Node(e1, xs): return append(e1, reverse2(xs))

By this point you should be able to see that recursion can be done via the following based on the structure of the data:

If the structure of the data is non-recursive, provide a non-recursive computation that computes the result directly
If the structure of the data is recursive, recursively solve the problem on the substructure(s) of the data (e.g. pred or tail of the natural number or list), and include its result in your main result

You should be well aware that data structures may be more complex. For example, solving a problem for a structure may require more than one recursive calls, one non-recursive call and one recursive call, etc. To make this apparent, let us look at a formulation of a binary tree of integers:

Haskell
Python

1
data Tree = EmptyTree | TreeNode Tree Int Tree
2
  deriving (Show, Eq)

1
class Tree: pass
2

3
@dataclass
4
class EmptyTree(Tree): pass
5

6
@dataclass
7
class TreeNode(Tree):
8
    left: Tree
9
    val: int
10
    right: Tree

Now let us attempt to write a function that sums all integers in the tree. Again there are two possible structures a tree can have: the first being the empty tree, which has sum 0. For tree nodes, we have two subtrees, left and right, from whom we may recursively obtain their sums using our function. Then, the sum of the entire tree is just the total of the value at the node, the sum of the left subtree and the sum of the right subtree:

Haskell
Python

3
sumTree :: Tree -> Int
4
sumTree EmptyTree = 0
5
sumTree (TreeNode l v r) = sumTree l + v + sumTree r

11
def sum_tree(t):
12
    if t == EmptyTree():
13
        return 0
14
    return t.val + sum_tree(t.left) + sum_tree(t.right)
15

16
# Structural pattern matching
17
def sum_tree(t):
18
    match t:
19
        case EmptyTree(): return 0
20
        case TreeNode(l, v, r):
21
            return sum_tree(l) + v + sum_tree(r)

In summary, our formulation of the natural numbers reveals that numbers are also structurally recursive, and therefore, are amenable to recursive computations. We can extend this idea to all recursive structures, which as you will see in these notes, is very common.