Type Systems
Abstract
As per the course title, one of the most important aspects of functional programming that we shall cover is types. In this chapter, we shall describe what types are, how they are useful and how we can use type information to write code, and some aspects of types that allow us to reduce boilerplate code while still retaining type safety. In addition, we describe how we can define our own data types in Haskell, and a neat feature known as pattern matching that is extremely useful in the presence of algebraic data types. We also offer some examples of how we can incorporate these concepts in Python.
As the course title suggests, Haskell is a typed functional programming language—in particular, it uses a statically-typed type system. This begs the question, “what is a type system?”
An online search for definitions might give you the following:
A type system is a tractable syntactic method for proving the absence of certain program behaviours by classifying phrases according to the kinds of values they compute.
Let us unpack the highlighted phrases in Definition 1.
-
Tractable more or less means easy, or polynomial time. Method refers to a formal method, which means it is a kind of mathematically formal process. The fact that it is a syntactic method means that this formal analysis can be done syntactically, without the need to appeal to a semantic analysis (although, static type checking is done against the static semantics of the type system). More or less, it can be performed without executing any of the code it is analyzing.
-
In the case of type systems, this usually means that the type system is used to prove the absence of type errors. The realm of program analysis is broken down into roughly two kinds: over-approximation analyses, and under-approximation analyses. Notice that both perform approximations of program behaviour—this is because obtaining a precise specification of any program is undecidable. Generally, static analyses, like type checking, perform an over-approximation of program behaviour. An analogy of how this works is as follows: assume true program behaviour is
and buggy behaviour is at (these are all positive numbers, let’s say). We then over-approximate the true program behaviour, giving us . If we can show that , then we can guarantee that , so the program is not buggy. A more concrete example is as follows. Let’s suppose we have the following code snippet in Python:
y: int = 0 if f() else 'abc'print(y + 1)Notice that if we can determine that
f
always returnsTrue
, then we know for sure that there will be no type errors. However, it is not possible to make this determination in general. Thus, we over-approximate program behaviour by assuming that it is possible thatf
may return eitherTrue
orFalse
leading us to show that we cannot prove the absence of type errors in this program. Instead, if we had written the following:y: int = 0 if f() else 'abc'y: int = 0 if f() else 1print(y + 1)Then even by assuming that both branches of the conditional expression may be the result, we can conclusively show that
y
will always be anint
. Our over-approximation of program behaviour doesn’t have type errors, meaning, that our actual program really does not have type errors. -
Types essentially classify the kinds of values they compute. As we will informally define later, types group data/things in the program that all behave similarly or have similar characteristics. In some other sense, types can be seen as abstractions over terms.
Simply put, a type system is a formal system that lets us show that there won’t be type errors. As we have seen, the nature of (static) type systems forces us to program in a different way (at least compared to dynamically typed languages like Python), and this is what we will explore in this chapter.
Types
Type systems are systems of types; but what is a type? In essence, a type is like a kind of thing, or a high-level description of what something is. Types
- give meaning to some data, and
- describe what its members are like.
Since you have already programmed in Python, you should have some inkling of what types are. In Python, everything is an object. Thus, in Python, the type of an object is the class from which it was instantiated.
The following is some sample output showing the types of various objects. The output of all these function calls are classes.
>>> x = 1>>> type(x)<class 'int'>>>> type('abc')<class 'str'>>>> class A: pass>>> type(A())<class '__main__.A'>
This is very apt—classes are blueprints for creating objects, and (for the most part), all instances of a class will abide by the specification as laid out in the class. Therefore, Python’s type system based on classes is very appropriate for our purposes. In fact, this is not unique to Python. Many other languages with OO features also have classes as types.
Type Annotations
In Python, we mainly think of types as being bound to objects, that is, objects have reified types that can be accessed at runtime. We have never thought of assigning types to variables or function parameters, since when we are investigating the type of a variable, what we are really doing is investigating the type of the object that is referred to by the variable. However, Python actually does allow us to annotate variables, function parameters etc with types to document “suggestions” as to what the types of the objects assigned to them should be.
Observe the following program fragment.
def f(x: int) -> str: y: int = x * 2 return f'{x} * 2 = {y}'z: intz = 3s: str = f(z)print(s) # 3 * 2 = 6
This program fragment contains several type annotations.
- In the function header, we have a specification for
f
to receive anint
and return astr
. That is, if the type annotations make sense, then passing anint
intof
will always result in astr
. - In the function body, we also
have an annotation for the variable
y
stating that it is also anint
. This makes sense—ifx
is anint
, then so willx * 2
. Actually, the type ofy
can be inferred (a type checker can determine the type ofy
automatically), so our type annotation for it is not necessary. - Outside the function body we have other type annotations, documenting what the types of the other variables are. On visual inspection, we can see that all the type annotations make sense and we have adhered to them fully; we are thus guaranteed that we have no type errors.
While Haskell also provides the capability for type annotations, a notable distinction lies in Haskell’s enforcement of adherence to these annotations. Consequently, it might be more fitting to refer to them as type declarations. Nevertheless, the core concept remains unchanged: specifying the types of variables, functions, or terms ensures that, when adhered to correctly, our program will be well-typed.
The following code snippet shows some Haskell code with type declarations.
f :: Int -> Stringf x = show x ++ " * 2 = " ++ show y where y = x * 2z :: Intz = 3s :: Strings = f(z) -- 3 * 2 = 6
A natural question would be to ask, what types can we declare variables
to be of? We have looked at some basic types earlier, Int
,
String
(which is an alias for [Char]
),
Char
, [Int]
, Bool
,
Double
etc. There are many other types in Haskell’s Prelude,
and later on we will see how we can create our own types.
Declaring types for functions is slightly different. In Python, when
writing type annotations for functions, we are really annotating the
types of its parameters, and its return type. In Haskell, we are
declaring the type of the function itself. The difference is actually
not as large as one might imagine. If the function receives a type f
receives an Int
and returns a
String
, then f
itself is of the type
Int -> String
.
Haskell has roots in formal systems, in particular, System
The T-Abs rule is an inference rule stating that if the premise above the line is true, then the conclusion below the line will also be true.
- Let’s first parse the premise.
- The part to the left of
is the typing environment, more or less describing the type declarations we have at the point of analysis of the program. Specifically, is the actual type environment, while is an additional assumption that a variable has type . - The part to the right of
describes the judgement of the type of being . Overall, the premise states “given what we have so far, if in assuming is of type we get that is of type , …”.
- The part to the left of
- The conclusion can be understood similarly: it states that the typing environment
will show that the function has type .
Putting these together, the rule states that “given typing
environment
A simple demonstration in Python is as follows: suppose we have x
and x * 2
. If we assume that x
is of type int
, then we know that x * 2
will also
be an int
. Therefore, the type of lambda x: x * 2
is int -> int
.
What about multi-parameter functions? Remember that in Haskell, all
functions are curried, thus, all functions in Haskell are single
parameter functions. Curried functions receive one parameter, and return
a function closure that receives the remaining variables and
eventually will return the final result. Therefore the (+)
function actually looks more like:
# Pythondef add(x): return lambda y: x + y
The type of add
is more like int -> (int -> int)
.
This is (more or less) the type of (+)
in Haskell, which
(more or less) has type Int -> Int -> Int
. Note that
->
is right-associative, so Int -> Int -> Int
is
the same as Int -> (Int -> Int)
.
In Haskell, the types of everything are fixed. This should be unsurprising since everything in Haskell is immutable, but it is a restriction that can also be found in other less restrictive languages like Java and C++. In this environment, we have to, perhaps ahead of time, decide what the type of a variable, function, function parameter is, then write the implementation of your function around those restrictions.
The following code
snippet first declares the type of f
before showing its
implementation. It is not only good practice to declare types above
their implementation, but it can be a nice way to frame your mind around
the implementation of your function—start by providing a high-level
specification of your function, then work on the implementation to
describe what the function is actually trying to achieve.
f :: Int -> String -- explicit type declarationf x = show x ++ "!"g x = x + 1 -- type of g is inferred
However, observe that the type of g
is not defined. This
does not mean that the type of g
is dynamic or is not being
checked; rather, Haskell can infer the principal (most liberal) type
of g
via a process known as type inference. That still
means that the implementation of g
itself must be well-typed
(its implementation does not break any of the typing rules), and that
any users of g
must abide by its static type signature.
Generally speaking, it is good practice to declare the types of
top-level bindings—that is, nested bindings of functions, variables (for
example, in let
expressions) do not need type declarations
and can often be inferred. The example above of the declaration of
f
is a perfectly idiomatic way of defining and declaring a
function, unlike g
which lacks a type declaration.
Programming with Types
When learning Python, you might not have had to think very much about
types; this is because Python does not care about type annotations. For
example, you can happily annotate a variable to be an int
but
then assign a string into it. This is very much unlike Haskell, where
adherence to type declarations and well-typedness is enforced by the
compiler—the compiler will reject any program that is not well-typed.
Observe the following program fragment:
f :: Int -> String -- explicit type declarationf x = show x ++ "!"
g = f "1" -- compiler throws type error as f receives Int, not String
The definition of f
is well-typed since it abides by all the
typing rules, and all the types make sense. However, since f
only receives Int
, passing a String
into it is a
clear violation of the rules. Thus, the entire program is ill-typed and
will not be compiled. Try this for yourself!
Programming in such a strict and formal language can feel restrictive, but these restrictions actually feel more like “guard rails” or “seatbelts”; if your program passes the checks done by the compiler, you can be quite assured that it works. As the saying goes, in Haskell, “if it compiles, it works”. Although this is not necessarily true, Haskell’s robust and expressive type system allows you to rule out a large class of bugs, and often, results in correct programs. However, one question to ask is: how do we go about programming with static types?
The first step of being able to program with types is understanding the typing rules. We shall elide explanation on how typing works with inference, typeclasses, polymorphism etc. and focus solely on the simplest typing rules:
- In a binding
x = e
, the type ofx
must be the same as the type ofe
- In a conditional expression
if x then y else z
, the type ofx
must beBool
and the types ofy
andz
must both be equal to some typea
; the type of the entire expression isa
- In a function application expression
f x
the type off
must bea -> b
for somea
andb
,x
must be of typea
, and the type of the expression isb
- (Without loss of generality of number of parameters) For a function
binding
f x = e
the type off
must bea -> b
for somea
andb
and in assumingx
to be of typea
,e
must be of typeb
.
Try calculating the types of every expression in the following code snippet. Can you get it all right?
f :: Int -> Int -> [Int]f x n = if n == 0 then [] else let r = f x (n - 1) in x : r
Let’s work through this example.
- We are declaring
f
to be of typeInt -> Int -> [Int]
, so it stands to reason that in the definition off
we are assuming thatx
andn
are both of typeInt
. - For this to be well-typed, we must ensure that the conditional expression evaluates to
[Int]
, that means both branches must themselves evaluate to[Int]
. - First we observe the condition
n == 0
; the(==)
function receives two numbers and returns aBool
, so this is well-typed. - Looking at the
True
branch, we see that we are returning the empty list, which matches the type of[Int]
. - In the
False
branch, we have alet
expression, so we must ensure thatx : r
evaluates to[Int]
too. - The
let
binding contains a bindingr = f x (n - 1)
; knowing that (by our own declaration)f
has typeInt -> Int -> [Int]
, knowing thatx
andn - 1
are of typeInt
means we can safely conclude thatr
has type[Int]
(of course, the(-)
function receives two integers and returns an integer). - The
(:)
function receives anInt
and a[Int]
and returns a[Int]
, so all the types match.
Overall, we have seen that we successfully determined the types of every expression in the program fragment, and concluded that it is well-typed.
Now that you are familiar with the basic typing rules and (roughly) how types are inferred, the next step is to get comfortable writing programs with static types. Generally this comes with practice, but one great way to get you started with typeful programming is to try letting the types guide your programming.
Suppose we are trying to define a function f
that receives
an integer x
and returns a string showing the result of
multiplying x
by 2:
ghci> f 3"3 * 2 = 6"ghci> f 5"5 * 2 = 10"
Let us try implementing this function.
-
The first thing we have to consider is the type of
f
itself, which by definition, should receive anInt
and return aString
. As such, we may start with the type declarationf :: Int -> String
.Example.hs f :: Int -> Stringf x = undefined -
Next, we know we are eventually going to have to convert
x
into aString
. We know that there is ashow
function that does that. Its type signature (modified) isInt -> String
, so we know thatshow x
is aString
.Example.hs f :: Int -> Stringf x = undefinedf x =let sx :: String = show xin undefined -
We also know that we need to multiply
x
by 2. For this, we can use the(*)
function, which has a (modified) type signature ofInt -> Int -> Int
. Thus, we can writex * 2
and that gives us anInt
.Example.hs f :: Int -> Stringf x =let sx :: String = show xy :: Int = x * 2in undefined -
Knowing that we eventually need to display it as a
String
, once again, we can rely on theshow
function.Example.hs f :: Int -> Stringf x =let sx :: String = show xy :: Int = x * 2sy :: String = show yin undefined -
Now we have all the numbers we need in
String
form, we need to concatenate them together. For this, we can rely on our trusty(++)
function that receives twoString
s and returns aString
. Using this allows us to concatenate all our desired strings together. Since our original functionf
was meant to return aString
, we can return it as our final result.Example.hs f :: Int -> Stringf x =let sx :: String = show xy :: Int = x * 2sy :: String = show yin undefinedin sx ++ " * 2 = " ++ sy
This is a simple example of using types to guide your programming. While seemingly trivial, this skill can be incredibly useful for defining recursive functions!
Suppose we are trying to define a function that sums the integers in a list:
-
As always, we must decide what the type of this function is. As per our definition, it receives a list of integers and returns the final sum, which should be an integer as well. This gives us the type declaration
sum' :: [Int] -> Int
.Sum.hs sum' :: [Int] -> Int -
First, let us define the base case. We should be quite clear on what the condition for the base case is: it should be when the input list is empty. What should we return in the base case? By our type declaration, we must return an
Int
, so we must express our base result in that type. The result is0
, which matches our type declaration.Sum.hs sum' :: [Int] -> Intsum' ls =if null lsthen 0else undefined -
Next we must define the recursive case. This one might be tricky initially. We know that we can make our recursive call, passing in the tail of the input list. This might look something like
sum' (tail ls)
. We must be very clear about the type of this expression; as per the type declaration, the result is anInt
, and not anything else.Sum.hs sum' :: [Int] -> Intsum' ls =if null lsthen 0else undefinedelse let r :: Int = sum' (tail ls)in undefined -
We also know that we want to add the head of the input list to the result of the recursive call. In doing so we get an
Int
.Sum.hs sum' :: [Int] -> Intsum' ls =if null lsthen 0else let r :: Int = sum' (tail ls)hd :: Int = head lsin undefined -
Finally, we can add the results together, giving us an
Int
, which matches our return type.Sum.hs sum' :: [Int] -> Intsum' ls =if null lsthen 0else let r :: Int = sum' (tail ls)hd :: Int = head lsin undefinedin hd + r
By getting used to types, having a statically-typed system no longer feels like a chore or a hurdle to cross, and instead feels like a support system that makes everything you are doing clear! Many developers (including myself) love statically-typed programming languages for this very reason, so much so that people have gone to great lengths to add static typing to otherwise dynamically typed languages like JavaScript (the typed variant of JavaScript is TypeScript).
Python is no different. Several static type checkers are out there to
help us analyze the well-typedness of our program. One of the most
popular analyzers is mypy
, which was heavily developed by Dropbox.
However, I recommend pyright
because at the time of writing, it has
implemented bleeding edge features that we need for further discussion
of types which we shall see very shortly.
Let’s see pyright
in action. We shall write an ill-typed
program and see if it catches the potential bug:
def f(x: int, y: int) -> int: z = x / y return z
Running pyright
on this program will reveal an error message:
~ $ pyright main.py/home/main.py /home/main.py:4:12 - error: Expression of type "float" is incompatible with return type "int" "float" is incompatible with "int" (reportReturnType)1 error, 0 warnings, 0 informations
Great! This makes sense because assuming x
and y
are of type int
, the type of z
should actually be
float
! Let’s correct the program and try running pyright
against the new program:
def f(x: int, y: int) -> int: z = x / y z = x // y return z
~ $ pyright main.py0 errors, 0 warnings, 0 informations
Very well! We have now learnt how to program with types in Haskell and
in Python, and since Python does not come with a type-checker, we are
able to use tools like pyright
to do the type checking for us!
One additional great feature about pyright
is that it is actually also a
language server. As such, you can include pyright
in your favourite text
editors so that it can catch bugs while writing programs!