Type Systems
Abstract
As per the course title, one of the most important aspects of functional programming that we shall cover is types. In this chapter, we shall describe what types are, how they are useful and how we can use type information to write code, and some aspects of types that allow us to reduce boilerplate code while still retaining type safety. In addition, we describe how we can define our own data types in Haskell, and a neat feature known as pattern matching that is extremely useful in the presence of algebraic data types. We also offer some examples of how we can incorporate these concepts in Python.
As the course title suggests, Haskell is a typed functional programming language—in particular, it uses a statically-typed type system. This begs the question, “what is a type system?”
An online search for definitions might give you the following:
A type system is a tractable syntactic method for proving the absence of certain program behaviours by classifying phrases according to the kinds of values they compute.
Let us unpack the highlighted phrases in Definition 1.
-
Tractable more or less means easy, or polynomial time. Method refers to a formal method, which means it is a kind of mathematically formal process. The fact that it is a syntactic method means that this formal analysis can be done syntactically, without the need to appeal to a semantic analysis (although, static type checking is done against the static semantics of the type system). More or less, it can be performed without executing any of the code it is analyzing.
-
In the case of type systems, this usually means that the type system is used to prove the absence of type errors. The realm of program analysis is broken down into roughly two kinds: over-approximation analyses, and under-approximation analyses. Notice that both perform approximations of program behaviour—this is because obtaining a precise specification of any program is undecidable. Generally, static analyses, like type checking, perform an over-approximation of program behaviour. An analogy of how this works is as follows: assume true program behaviour is
and buggy behaviour is at (these are all positive numbers, let’s say). We then over-approximate the true program behaviour, giving us . If we can show that , then we can guarantee that , so the program is not buggy. A more concrete example is as follows. Let’s suppose we have the following code snippet in Python:
y: int = 0 if f() else 'abc'print(y + 1)Notice that if we can determine that
falways returnsTrue, then we know for sure that there will be no type errors. However, it is not possible to make this determination in general. Thus, we over-approximate program behaviour by assuming that it is possible thatfmay return eitherTrueorFalseleading us to show that we cannot prove the absence of type errors in this program. Instead, if we had written the following:y: int = 0 if f() else 'abc'y: int = 0 if f() else 1print(y + 1)Then even by assuming that both branches of the conditional expression may be the result, we can conclusively show that
ywill always be anint. Our over-approximation of program behaviour doesn’t have type errors, meaning, that our actual program really does not have type errors. -
Types essentially classify the kinds of values they compute. As we will informally define later, types group data/things in the program that all behave similarly or have similar characteristics. In some other sense, types can be seen as abstractions over terms.
Simply put, a type system is a formal system that lets us show that there won’t be type errors. As we have seen, the nature of (static) type systems forces us to program in a different way (at least compared to dynamically typed languages like Python), and this is what we will explore in this chapter.
Types
Type systems are systems of types; but what is a type? In essence, a type is like a kind of thing, or a high-level description of what something is. Types
- give meaning to some data, and
- describe what its members are like.
Since you have already programmed in Python, you should have some inkling of what types are. In Python, everything is an object. Thus, in Python, the type of an object is the class from which it was instantiated.
The following is some sample output showing the types of various objects. The output of all these function calls are classes.
>>> x = 1>>> type(x)<class 'int'>>>> type('abc')<class 'str'>>>> class A: pass>>> type(A())<class '__main__.A'>This is very apt—classes are blueprints for creating objects, and (for the most part), all instances of a class will abide by the specification as laid out in the class. Therefore, Python’s type system based on classes is very appropriate for our purposes. In fact, this is not unique to Python. Many other languages with OO features also have classes as types.
Type Annotations
In Python, we mainly think of types as being bound to objects, that is, objects have reified types that can be accessed at runtime. We have never thought of assigning types to variables or function parameters, since when we are investigating the type of a variable, what we are really doing is investigating the type of the object that is referred to by the variable. However, Python actually does allow us to annotate variables, function parameters etc with types to document “suggestions” as to what the types of the objects assigned to them should be.
Observe the following program fragment.
def f(x: int) -> str: y: int = x * 2 return f'{x} * 2 = {y}'z: intz = 3s: str = f(z)print(s) # 3 * 2 = 6This program fragment contains several type annotations.
- In the function header, we have a specification for
fto receive anintand return astr. That is, if the type annotations make sense, then passing anintintofwill always result in astr. - In the function body, we also
have an annotation for the variable
ystating that it is also anint. This makes sense—ifxis anint, then so willx * 2. Actually, the type ofycan be inferred (a type checker can determine the type ofyautomatically), so our type annotation for it is not necessary. - Outside the function body we have other type annotations, documenting what the types of the other variables are. On visual inspection, we can see that all the type annotations make sense and we have adhered to them fully; we are thus guaranteed that we have no type errors.
While Haskell also provides the capability for type annotations, a notable distinction lies in Haskell’s enforcement of adherence to these annotations. Consequently, it might be more fitting to refer to them as type declarations. Nevertheless, the core concept remains unchanged: specifying the types of variables, functions, or terms ensures that, when adhered to correctly, our program will be well-typed.
The following code snippet shows some Haskell code with type declarations.
f :: Int -> Stringf x = show x ++ " * 2 = " ++ show y where y = x * 2z :: Intz = 3s :: Strings = f(z) -- 3 * 2 = 6A natural question would be to ask, what types can we declare variables
to be of? We have looked at some basic types earlier, Int,
String (which is an alias for [Char]),
Char, [Int], Bool,
Double etc. There are many other types in Haskell’s Prelude,
and later on we will see how we can create our own types.
Declaring types for functions is slightly different. In Python, when
writing type annotations for functions, we are really annotating the
types of its parameters, and its return type. In Haskell, we are
declaring the type of the function itself. The difference is actually
not as large as one might imagine. If the function receives a type f receives an Int and returns a
String, then f itself is of the type
Int -> String.
Haskell has roots in formal systems, in particular, System
The T-Abs rule is an inference rule stating that if the premise above the line is true, then the conclusion below the line will also be true. Let’s dive into this in a bit more detail:
- Let’s first parse the premise.
- The part to the left of
is the typing environment, more or less describing the type declarations we have at the point of analysis of the program. Specifically, is the actual type environment, while is an additional assumption that a variable has type . - The part to the right of
describes the judgement of the type of being . Overall, the premise states “given what we have so far, if in assuming is of type we get that is of type , …”.
- The part to the left of
- The conclusion can be understood similarly: it states that the typing environment
will show that the function has type .
Putting these together, the rule states that “given typing
environment
A simple demonstration in Python is as follows: suppose we have x and x * 2. If we assume that x
is of type int, then we know that x * 2 will also
be an int. Therefore, the type of lambda x: x * 2 is int -> int.
What about multi-parameter functions? Remember that in Haskell, all
functions are curried, thus, all functions in Haskell are single
parameter functions. Curried functions receive one parameter, and return
a function closure that receives the remaining variables and
eventually will return the final result. Therefore the (+)
function actually looks more like:
# Pythondef add(x): return lambda y: x + yThe type of add is more like int -> (int -> int).
This is (more or less) the type of (+) in Haskell, which
(more or less) has type Int -> Int -> Int. Note that
-> is right-associative, so Int -> Int -> Int is
the same as Int -> (Int -> Int).
In Haskell, the types of everything are fixed. This should be unsurprising since everything in Haskell is immutable, but it is a restriction that can also be found in other less restrictive languages like Java and C++. In this environment, we have to, perhaps ahead of time, decide what the type of a variable, function, function parameter is, then write the implementation of your function around those restrictions.
The following code
snippet first declares the type of f before showing its
implementation. It is not only good practice to declare types above
their implementation, but it can be a nice way to frame your mind around
the implementation of your function—start by providing a high-level
specification of your function, then work on the implementation to
describe what the function is actually trying to achieve.
f :: Int -> String -- explicit type declarationf x = show x ++ "!"g x = x + 1 -- type of g is inferredHowever, observe that the type of g is not defined. This
does not mean that the type of g is dynamic or is not being
checked; rather, Haskell can infer the principal (most liberal) type
of g via a process known as type inference. That still
means that the implementation of g itself must be well-typed
(its implementation does not break any of the typing rules), and that
any users of g must abide by its static type signature.
Generally speaking, it is good practice to declare the types of
top-level bindings—that is, nested bindings of functions, variables (for
example, in let expressions) do not need type declarations
and can often be inferred. The example above of the declaration of
f is a perfectly idiomatic way of defining and declaring a
function, unlike g which lacks a type declaration.
Programming with Types
When learning Python, you might not have had to think very much about
types; this is because Python does not care about type annotations. For
example, you can happily annotate a variable to be an int but
then assign a string into it. This is very much unlike Haskell, where
adherence to type declarations and well-typedness is enforced by the
compiler—the compiler will reject any program that is not well-typed.
Observe the following program fragment:
f :: Int -> String -- explicit type declarationf x = show x ++ "!"
g = f "1" -- compiler throws type error as f receives Int, not StringThe definition of f is well-typed since it abides by all the
typing rules, and all the types make sense. However, since f
only receives Int, passing a String into it is a
clear violation of the rules. Thus, the entire program is ill-typed and
will not be compiled. Try this for yourself!
Programming in such a strict and formal language can feel restrictive, but these restrictions actually feel more like “guard rails” or “seatbelts”; if your program passes the checks done by the compiler, you can be quite assured that it works. As the saying goes, in Haskell, “if it compiles, it works”. Although this is not necessarily true, Haskell’s robust and expressive type system allows you to rule out a large class of bugs, and often, results in correct programs. However, one question to ask is: how do we go about programming with static types?
The first step of being able to program with types is understanding the typing rules. We shall elide explanation on how typing works with inference, typeclasses, polymorphism etc. and focus solely on the simplest typing rules:
- In a binding
x = e, the type ofxmust be the same as the type ofe - In a conditional expression
if x then y else z, the type ofxmust beBooland the types ofyandzmust both be equal to some typea; the type of the entire expression isa - In a function application expression
f xthe type offmust bea -> bfor someaandb,xmust be of typea, and the type of the expression isb - (Without loss of generality of number of parameters) For a function
binding
f x = ethe type offmust bea -> bfor someaandband in assumingxto be of typea,emust be of typeb.
Try calculating the types of every expression in the following code snippet. Can you get it all right?
f :: Int -> Int -> [Int]f x n = if n == 0 then [] else let r = f x (n - 1) in x : rLet’s work through this example.
- Observe line 1:
We are declaring
Example.hs f :: Int -> Int -> [Int]fto be of typeInt -> Int -> [Int], so it stands to reason that in the definition offwe are assuming thatxandnare both of typeInt. - Observe the body of
f:For this to be well-typed, we must ensure that the conditional expression evaluates toExample.hs if n == 0then []elselet r = f x (n - 1)in x : r[Int], that means both branches must themselves evaluate to[Int]. - First we observe the condition
n == 0:TheExample.hs if n == 0(==)function receives two numbers and returns aBool, so this is well-typed. - Looking at the
Truebranch:We see that we are returning the empty list, which matches the type ofExample.hs then [][Int]. - In the
Falsebranch:we have aExample.hs let r = f x (n - 1)in x : rletexpression, so we must ensure thatx : revaluates to[Int]too. - The
letbinding:contains a bindingExample.hs let r = f x (n - 1)r = f x (n - 1); knowing that (by our own declaration)fhas typeInt -> Int -> [Int], knowing thatxandn - 1are of typeIntmeans we can safely conclude thatrhas type[Int](of course, the(-)function receives two integers and returns an integer). - In the
letbody:TheExample.hs in x : r(:)function receives anIntand a[Int]and returns a[Int], so all the types match.
Overall, we have seen that we successfully determined the types of every expression in the program fragment, and concluded that it is well-typed.
Now that you are familiar with the basic typing rules and (roughly) how types are inferred, the next step is to get comfortable writing programs with static types. Generally this comes with practice, but one great way to get you started with typeful programming is to try letting the types guide your programming.
Suppose we are trying to define a function f that receives
an integer x and returns a string showing the result of
multiplying x by 2:
ghci> f 3"3 * 2 = 6"ghci> f 5"5 * 2 = 10"Let us try implementing this function.
-
The first thing we have to consider is the type of
fitself, which by definition, should receive anIntand return aString. As such, we may start with the type declarationf :: Int -> String.Example.hs f :: Int -> Stringf x = undefined -
Next, we know we are eventually going to have to convert
xinto aString. We know that there is ashowfunction that does that. Its type signature (modified) isInt -> String, so we know thatshow xis aString.Example.hs f :: Int -> Stringf x = undefinedf x =let sx :: String = show xin undefined -
We also know that we need to multiply
xby 2. For this, we can use the(*)function, which has a (modified) type signature ofInt -> Int -> Int. Thus, we can writex * 2and that gives us anInt.Example.hs f :: Int -> Stringf x =let sx :: String = show xy :: Int = x * 2in undefined -
Knowing that we eventually need to display it as a
String, once again, we can rely on theshowfunction.Example.hs f :: Int -> Stringf x =let sx :: String = show xy :: Int = x * 2sy :: String = show yin undefined -
Now we have all the numbers we need in
Stringform, we need to concatenate them together. For this, we can rely on our trusty(++)function that receives twoStrings and returns aString. Using this allows us to concatenate all our desired strings together. Since our original functionfwas meant to return aString, we can return it as our final result.Example.hs f :: Int -> Stringf x =let sx :: String = show xy :: Int = x * 2sy :: String = show yin undefinedin sx ++ " * 2 = " ++ sy
This is a simple example of using types to guide your programming. While seemingly trivial, this skill can be incredibly useful for defining recursive functions!
Suppose we are trying to define a function that sums the integers in a list:
-
As always, we must decide what the type of this function is. As per our definition, it receives a list of integers and returns the final sum, which should be an integer as well. This gives us the type declaration
sum' :: [Int] -> Int.Sum.hs sum' :: [Int] -> Int -
First, let us define the base case. We should be quite clear on what the condition for the base case is: it should be when the input list is empty. What should we return in the base case? By our type declaration, we must return an
Int, so we must express our base result in that type. The result is0, which matches our type declaration.Sum.hs sum' :: [Int] -> Intsum' ls =if null lsthen 0else undefined -
Next we must define the recursive case. This one might be tricky initially. We know that we can make our recursive call, passing in the tail of the input list. This might look something like
sum' (tail ls). We must be very clear about the type of this expression; as per the type declaration, the result is anInt, and not anything else.Sum.hs sum' :: [Int] -> Intsum' ls =if null lsthen 0else undefinedelse let r :: Int = sum' (tail ls)in undefined -
We also know that we want to add the head of the input list to the result of the recursive call. In doing so we get an
Int.Sum.hs sum' :: [Int] -> Intsum' ls =if null lsthen 0else let r :: Int = sum' (tail ls)hd :: Int = head lsin undefined -
Finally, we can add the results together, giving us an
Int, which matches our return type.Sum.hs sum' :: [Int] -> Intsum' ls =if null lsthen 0else let r :: Int = sum' (tail ls)hd :: Int = head lsin undefinedin hd + r
By getting used to types, having a statically-typed system no longer feels like a chore or a hurdle to cross, and instead feels like a support system that makes everything you are doing clear! Many developers (including myself) love statically-typed programming languages for this very reason, so much so that people have gone to great lengths to add static typing to otherwise dynamically typed languages like JavaScript (the typed variant of JavaScript is TypeScript).
Python is no different. Several static type checkers are out there to
help us analyze the well-typedness of our program. One of the most
popular analyzers is mypy, which was heavily developed by Dropbox.
However, I recommend pyright because at the time of writing, it has
implemented bleeding edge features that we need for further discussion
of types which we shall see very shortly.
Let’s see pyright in action. We shall write an ill-typed
program and see if it catches the potential bug:
def f(x: int, y: int) -> int: z = x / y return zRunning pyright on this program will reveal an error message:
~ $ pyright main.py/home/main.py /home/main.py:4:12 - error: Expression of type "float" is incompatible with return type "int" "float" is incompatible with "int" (reportReturnType)1 error, 0 warnings, 0 informationsGreat! This makes sense because assuming x and y
are of type int, the type of z should actually be
float! Let’s correct the program and try running pyright
against the new program:
def f(x: int, y: int) -> int: z = x / y z = x // y return z~ $ pyright main.py0 errors, 0 warnings, 0 informationsVery well! We have now learnt how to program with types in Haskell and
in Python, and since Python does not come with a type-checker, we are
able to use tools like pyright to do the type checking for us!
One additional great feature about pyright is that it is actually also a
language server. As such, you can include pyright in your favourite text
editors so that it can catch bugs while writing programs!