Validation

One of the most common uses of applicatives is validation. From our example at the start of this chapter, we have several data structures and we want to be able to parse them from strings:

data Email = Email { emailUsername :: String
                   , emailDomain   :: String }
  deriving (Eq, Show)

data Salary = SInt Int
            | SDouble Double
  deriving (Eq, Show)

data User = User { username   :: String
                 , userEmail  :: Email
                 , userSalary :: Salary }
  deriving (Eq, Show)

Parsing them from strings may not always succeed, so our parsing functions cannot promise to return the desired data structure. Instead, we have them return their results in the Maybe context to express the possibility of failure. This gives our parsing functions the following type signatures:

parseEmail :: String -> Maybe Email
parseSalary :: String -> Maybe Salary
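As a minimal sketch of what a Maybe-returning parser can look like, the hypothetical parseAge below (it is not part of this chapter's program) relies on readMaybe from Text.Read, which returns Nothing rather than crashing when the string is not a valid number.

```haskell
import Text.Read (readMaybe)

-- Hypothetical example, not part of the chapter's program:
-- parse a non-negative age from a string.
parseAge :: String -> Maybe Int
parseAge s =
  case readMaybe s of
    Just n | n >= 0 -> Just n   -- a number that is also a sensible age
    _               -> Nothing  -- not a number, or a negative one
```

In GHCi, parseAge "42" evaluates to Just 42, while parseAge "abc" and parseAge "-1" both evaluate to Nothing.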

Given these functions, we should be able to define a function that parses a User from three strings: the user name (which requires no parsing), the email (which is parsed using parseEmail) and the salary (which is parsed using parseSalary). One way we can implement this parseUser function is by receiving the three strings, performing parsing on the email and salary (in parallel¹), then constructing our User term with the usual Functor and Applicative methods.

parseUser :: String -- name
             -> String -- email
             -> String -- salary
             -> Maybe User -- user
parseUser name email salary =
    let e = parseEmail email
        s = parseSalary salary
    in  User name <$> e <*> s

Now our parsing function works just fine!

ghci> parseUser "Foo" "yong@qi.com" "1000"
Just (User {username = "Foo", userEmail = Email {emailUsername = "yong", emailDomain = "qi.com"}, userSalary = SInt 1000})
ghci> parseUser "Foo" "yong" "1000"
Nothing
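To see what an expression of the shape User name <$> e <*> s does in the Maybe context, here is a minimal sketch on a smaller, hypothetical Point type (the type and the function names are assumptions made for illustration, not part of the chapter's program). The three definitions are intended to behave identically: the result is a Just only when every argument is a Just.

```haskell
import Control.Applicative (liftA2)

-- Hypothetical two-field type, used only for this illustration.
data Point = Point Int Int
  deriving (Eq, Show)

-- Applicative style, as in parseUser: the constructor is mapped over the
-- first argument and then applied to the second.
mkPoint :: Maybe Int -> Maybe Int -> Maybe Point
mkPoint mx my = Point <$> mx <*> my

-- The same function written with liftA2.
mkPoint' :: Maybe Int -> Maybe Int -> Maybe Point
mkPoint' = liftA2 Point

-- The same function written with explicit pattern matching.
mkPoint'' :: Maybe Int -> Maybe Int -> Maybe Point
mkPoint'' (Just x) (Just y) = Just (Point x y)
mkPoint'' _        _        = Nothing
```

For instance, mkPoint (Just 1) (Just 2) gives Just (Point 1 2), whereas mkPoint Nothing (Just 2) gives Nothing.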

Validation with Error Messages

However, a bare Nothing is not always helpful, since several things could go wrong when parsing a user: (1) the supplied email is invalid, (2) the supplied salary is invalid, or (3) both. Therefore, let’s have our parsing functions return an error message instead of Nothing. For this, we rely on the Either type, which is either a Left of something sad (like an error message) or a Right of something happy (the desired result). We show the definitions of Either and its supporting typeclass instances here.

data Either a b = Left a -- sad
                | Right b -- happy

instance Functor (Either a) where
    fmap :: (b -> c) -> Either a b -> Either a c
    fmap _ (Left x) = Left x
    fmap f (Right x) = Right $ f x

instance Applicative (Either a) where
    pure :: b -> Either a b
    pure = Right

    (<*>) :: Either a (b -> c) -> Either a b -> Either a c
    Left f <*> _ = Left f
    _ <*> Left x = Left x
    Right f <*> Right x = Right $ f x

Let us change the context in which our parsing functions return their results. Parts of the implementations of parseEmail and parseSalary need to change to produce descriptive error messages, and so do their type signatures.

- parseEmail :: String -> Maybe Email
+ parseEmail :: String -> Either String Email
  parseEmail email =
      if   ...
-     then Nothing
-     else Just $ Email ...
+     then Left $ "error: " ++ email ++ " is not an email"
+     else Right $ Email ...

- parseSalary :: String -> Maybe Salary
+ parseSalary :: String -> Either String Salary
  parseSalary salary =
      if   ...
-     then Nothing
-     else Just $ SInt ...
+     then Left $ "error: " ++ salary ++ " is not a number"
+     else Right $ SInt ...

The great thing is that although we have changed the return types of our individual parsing functions, the implementation of parseUser does not change, because it relies only on the typeclass methods of Functor and Applicative. Since Either a is also an Applicative, the definition stays as it is, and only the type signature of parseUser needs to be updated.

  parseUser :: String -- name
               -> String -- email
               -> String -- salary
-              -> Maybe User -- user
+              -> Either String User -- user
  parseUser name email salary =
      let e = parseEmail email
          s = parseSalary salary
      in  User name <$> e <*> s

Now, users of our parseUser function will get more descriptive error messages when parsing fails!

ghci> parseUser "Foo" "yong@qi.com" "1000"
Right (User {username = "Foo", userEmail = Email {emailUsername = "yong", emailDomain = "qi.com"}, userSalary = SInt 1000})
ghci> parseUser "Foo" "yong" "1000"
Left "error: yong is not an email"
ghci> parseUser "Foo" "yong@qi.com" "x"
Left "error: x is not a number"
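The reason parseUser survives the change of context can be made explicit by abstracting over the context itself. In the sketch below, parseUserWith is a hypothetical helper (it does not appear in the chapter) that receives the two field parsers as arguments and works for any Applicative f. The types are simplified stand-ins for the chapter's types, and the parsers are deliberately simplistic placeholders.

```haskell
import Text.Read (readMaybe)

-- Simplified stand-ins for the chapter's types.
newtype Email  = Email String  deriving (Eq, Show)
newtype Salary = Salary Int    deriving (Eq, Show)
data User = User String Email Salary deriving (Eq, Show)

-- Hypothetical helper: parseUser with the context f and the field parsers
-- passed in. Only Functor and Applicative methods are used.
parseUserWith :: Applicative f
              => (String -> f Email)
              -> (String -> f Salary)
              -> String -> String -> String -> f User
parseUserWith pEmail pSalary name email salary =
  User name <$> pEmail email <*> pSalary salary

-- Placeholder parsers in the Maybe context.
emailMaybe :: String -> Maybe Email
emailMaybe s = if '@' `elem` s then Just (Email s) else Nothing

salaryMaybe :: String -> Maybe Salary
salaryMaybe s = Salary <$> readMaybe s

-- Placeholder parsers in the Either context, built from the Maybe ones.
emailEither :: String -> Either String Email
emailEither s =
  maybe (Left ("error: " ++ s ++ " is not an email")) Right (emailMaybe s)

salaryEither :: String -> Either String Salary
salaryEither s =
  maybe (Left ("error: " ++ s ++ " is not a number")) Right (salaryMaybe s)
```

Here parseUserWith emailMaybe salaryMaybe produces a Maybe User and parseUserWith emailEither salaryEither produces an Either String User, with no change to the body of parseUserWith.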

Accumulating Error Messages

However, there is one case that is not handled in our validation function. Let’s see what that is:

ghci> parseUser "Foo" "abc" "x"
Left "error: abc is not an email"

Notice that although both the email and the salary are invalid, the error message only mentions the invalid email address. This is misleading because the salary is invalid as well, and the user of this function has no way of knowing that!

The reason for this lies in the definition of the typeclass instance Applicative (Either a). Notice that in the case of Left f <*> Left x, the result is Left f, ignoring the other error message Left x! In other words, Either is a fail-fast Applicative, and this is not what we want for our parsing function!
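The fail-fast behaviour can be observed directly with the Either type from the Prelude, whose Applicative instance behaves like the one defined above. The names in this small sketch are made up for illustration.

```haskell
-- Both arguments are failures; Either keeps only the first one.
bothBad :: Either String Int
bothBad = (+) <$> Left "first error" <*> Left "second error"

-- Only the second argument is a failure.
secondBad :: Either String Int
secondBad = (+) <$> Right 1 <*> Left "second error"

-- Both arguments succeed.
bothGood :: Either String Int
bothGood = (+) <$> Right 1 <*> Right 2
```

Evaluating bothBad gives Left "first error", so the second message is lost; secondBad gives Left "second error" and bothGood gives Right 3.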

As briefly stated earlier, although the Applicative laws describe how an Applicative should behave, there can in fact be more than one “most obvious” way for an instance to behave. In particular, we can define a data structure that is not fail-fast and yet is still a valid Applicative: one that allows us to collect all error messages! Let us give this a try.
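Before doing so, the claim that the same kind of structure can admit more than one lawful Applicative can be checked on a familiar case. The base library provides two Applicative instances for list-shaped data: the ordinary list instance and the ZipList wrapper from Control.Applicative. The sketch below (with names chosen only for illustration) applies the same function to the same lists under both.

```haskell
import Control.Applicative (ZipList (..))

-- The standard list Applicative pairs every function with every argument.
cartesian :: [Int]
cartesian = (+) <$> [1, 2] <*> [10, 20]

-- ZipList wraps the same lists but combines elements position by position.
zipped :: [Int]
zipped = getZipList ((+) <$> ZipList [1, 2] <*> ZipList [10, 20])
```

Here cartesian is [11,21,12,22] while zipped is [11,22]: same data, two different but equally valid notions of application.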

The first step is to define an ADT called Validation that is practically the same as (isomorphic to) Either, since that structure is still useful for our purposes. Its Functor instance is also the same as that of Either.

data Validation err a = Success a
                      | Failure err

instance Functor (Validation err) where
    fmap _ (Failure e) = Failure e
    fmap f (Success x) = Success $ f x
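The isomorphism with Either can be witnessed by a pair of conversion functions that undo each other. The sketch below repeats the Validation definition so that it stands alone (with a deriving clause added only so that results can be compared and printed); toEither and fromEither are hypothetical names, not functions defined in this chapter.

```haskell
data Validation err a = Success a
                      | Failure err
  deriving (Eq, Show)

-- Hypothetical conversion helpers (the names are assumptions).
toEither :: Validation err a -> Either err a
toEither (Failure e) = Left e
toEither (Success x) = Right x

fromEither :: Either err a -> Validation err a
fromEither (Left e)  = Failure e
fromEither (Right x) = Success x
```

Converting in one direction and then back returns the original value, which is what makes the two types interchangeable as data. They differ only in the typeclass instances we choose to give them.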

Notice that err remains a type variable, instead of a pre-defined error collection type like [String]. This is because, as always, we want to keep our types as general as possible so that they can be used liberally. However, it is now incumbent on us to constrain err so that error messages can be collected in an obvious way. In essence, we just need err to have an associative binary operation, that is, an operation ⊕ where (x ⊕ y) ⊕ z = x ⊕ (y ⊕ z) for all x, y and z.

For this, we introduce the Semigroup typeclass which represents just that!

class Semigroup a where
    -- must be associative
    (<>) :: a -> a -> a

Any type can be a semigroup as long as it comes with an associative binary operation on its values. With this, as long as our error type is a semigroup, we can use it as the error type of Validation! Let us define the Applicative instance:

instance Semigroup err => Applicative (Validation err) where
    pure :: a -> Validation err a
    pure = Success

    (<*>) :: Validation err (a -> b) -> Validation err a -> Validation err b
    Failure l <*> Failure r = Failure (l <> r)
    Failure l <*> _ = Failure l
    _ <*> Failure r = Failure r
    Success f <*> Success x = Success (f x)

Notice the double-failure case—the errors are combined or aggregated using the semigroup binary operation (<>). This way, no information is lost if both operands are Failure cases since they are accumulated together.
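The accumulation also works across more than two arguments, because each use of <*> merges whatever errors have been gathered so far with the next one. The following self-contained sketch repeats the Validation definitions from above (without the instance signatures, and with a deriving clause added for printing) and applies them to a hypothetical three-argument check; positive and volume are names invented for this illustration.

```haskell
data Validation err a = Success a
                      | Failure err
  deriving (Eq, Show)

instance Functor (Validation err) where
  fmap _ (Failure e) = Failure e
  fmap f (Success x) = Success (f x)

instance Semigroup err => Applicative (Validation err) where
  pure = Success
  Failure l <*> Failure r = Failure (l <> r)
  Failure l <*> _         = Failure l
  _         <*> Failure r = Failure r
  Success f <*> Success x = Success (f x)

-- Hypothetical check used only in this illustration.
positive :: String -> Int -> Validation [String] Int
positive label n
  | n > 0     = Success n
  | otherwise = Failure [label ++ " must be positive"]

-- Combine three independent checks; every failure is reported, in order.
volume :: Int -> Int -> Int -> Validation [String] Int
volume w h d =
  (\a b c -> a * b * c) <$> positive "width" w
                        <*> positive "height" h
                        <*> positive "depth" d
```

With these definitions, volume 2 3 4 is Success 24, and volume 0 3 (-1) is a Failure carrying both the width message and the depth message, in that order.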

Assuredly, using a list of strings as our error log is fine because concatenation is an associative binary operation over lists!

instance Semigroup [a] where
    (<>) :: [a] -> [a] -> [a]
    (<>) = (++)
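Lists are not the only possible choice for the error type. As a small sketch, the NonEmpty type from Data.List.NonEmpty in base is also a Semigroup, so it could serve as err equally well. The chapter itself uses [String]; the alternative and the names below are shown only for illustration.

```haskell
import Data.List.NonEmpty (NonEmpty (..))

-- Lists: (<>) is (++), so error logs are concatenated in order.
listLog :: [String]
listLog = ["bad email"] <> ["bad salary"]

-- NonEmpty lists are also a Semigroup, and cannot be empty, which rules
-- out a failure that carries no message at all.
nonEmptyLog :: NonEmpty String
nonEmptyLog = ("bad email" :| []) <> ("bad salary" :| [])

-- Associativity on sample values: the grouping does not matter.
associativeOnSamples :: Bool
associativeOnSamples =
  (["a"] <> ["b"]) <> ["c"] == ["a"] <> (["b"] <> ["c"])
```

Here listLog is ["bad email","bad salary"], nonEmptyLog is "bad email" :| ["bad salary"], and associativeOnSamples is True.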

Therefore, with these definitions we can now amend our parsing functions to use our new Validation Applicative. First, as per usual, we amend parseEmail and parseSalary so that they use Validation instead of Either.

- parseEmail :: String -> Either String Email
+ parseEmail :: String -> Validation [String] Email
  parseEmail email =
      if   ...
-     then Left $ "error: " ++ email ++ " is not an email"
-     else Right $ Email ...
+     then Failure ["error: " ++ email ++ " is not an email"]
+     else Success $ Email ...

- parseSalary :: String -> Either String Salary
+ parseSalary :: String -> Validation [String] Salary
  parseSalary salary =
      if   ...
-     then Left $ "error: " ++ salary ++ " is not a number"
-     else Right $ SInt ...
+     then Failure ["error: " ++ salary ++ " is not a number"]
+     else Success $ SInt ...

Once again, our parseUser function does not need to change, except for the type signature.

  parseUser :: String -- name
               -> String -- email
               -> String -- salary
-              -> Either String User -- user
+              -> Validation [String] User -- user
  parseUser name email salary =
      let e = parseEmail email
          s = parseSalary salary
      in  User name <$> e <*> s

Now, our parsing function works exactly as we want!

ghci> parseUser "Foo" "yong@qi.com" "1000"
Success (User {username = "Foo", userEmail = Email {emailUsername = "yong", emailDomain = "qi.com"}, userSalary = SInt 1000})
ghci> parseUser "Foo" "yong" "1000"
Failure ["error: yong is not an email"]
ghci> parseUser "Foo" "yong@qi.com" "x"
Failure ["error: x is not a number"]
ghci> parseUser "Foo" "abc" "x"
Failure ["error: abc is not an email","error: x is not a number"]

Hands-On

In this chapter, we went from parsing with Maybes to parsing with Eithers and finally to parsing with Validations. Give this a try for yourself!

Written below is the full program for parsing users with Maybe. Try replacing the Maybes with Eithers, then with Validations and see the outcome of running the program each time!

-- InstanceSigs permits the type signatures written inside the instances
-- below; it is already enabled by default under GHC2021.
{-# LANGUAGE InstanceSigs #-}
module Main where

import Control.Applicative
import Text.Read
import System.IO

-- edit these!
parseEmail :: String -> Maybe Email
parseEmail email =
    if '@' `elem` email && length e == 2 && '.' `elem` last e
    -- edit the following two lines when replacing Maybe with
    -- Either or Validation
    then Just $ Email (head e) (last e)
    else Nothing
  where e = split '@' email

parseSalary :: String -> Maybe Salary
parseSalary s =
  let si = SInt <$> readMaybe s
      sf = SDouble <$> readMaybe s
  in  case si <|> sf of
        Just x -> Just x -- change the RHS `Just x` when replacing
                         -- Maybe with Either or Validation
        Nothing -> Nothing -- change the RHS `Nothing` when replacing
                           -- Maybe with Either or Validation

-- you should only need to change the type of `parseUser` when
-- replacing Maybe with Either or Validation
parseUser :: String -- name
          -> String -- email
          -> String -- salary
          -> Maybe User
parseUser name email salary =
    let e = parseEmail email
        s = parseSalary salary
    in  User name <$> e <*> s

-- no need to edit the rest!

-- the data structures
data Email = Email { emailUsername :: String,
                     emailDomain :: String    }
  deriving (Eq, Show)

data Salary = SInt Int | SDouble Double
  deriving (Eq, Show)

data User = User { username :: String,
                   userEmail :: Email,
                   userSalary :: Salary }
  deriving (Eq, Show)

-- user input with a prompt
input :: String -> IO String
input prompt = do
  putStr prompt
  hFlush stdout
  getLine

-- splitting strings
split :: Char -> String -> [String]
split _ [] = [""]
split delim (x : xs)
    | x == delim = "" : xs'
    | otherwise  = (x : head xs') : tail xs'
  where xs' = split delim xs

-- validation
data Validation err a = Success a
                      | Failure err
    deriving (Eq, Show)

instance Functor (Validation err) where
    fmap :: (a -> b) -> Validation err a -> Validation err b
    fmap _ (Failure e) = Failure e
    fmap f (Success x) = Success $ f x

instance Semigroup err => Applicative (Validation err) where
    pure :: a -> Validation err a
    pure = Success

    (<*>) :: Validation err (a -> b) -> Validation err a -> Validation err b
    Failure l <*> Failure r = Failure (l <> r)
    Failure l <*> _ = Failure l
    _ <*> Failure r = Failure r
    Success f <*> Success x = Success (f x)

main :: IO ()
main = do
  n <- input "Enter name: "
  e <- input "Enter email: "
  s <- input "Enter salary: "
  print $ parseUser n e s

¹ It is important to note that the use of the word “parallel” in this chapter has nothing to do with parallelism. The word “parallel” is only used to describe the notion of merging parallel railways into a single rail line via <*>.
