Updated

Software Transactional Memory

Concurrency and parallelism is, generally, really hard. This is because the ordering of concurrent and parallel evaluation is nondeterministic, and the traditional threaded model of concurrent programming with threads make working with concurrent operations and composing them is very difficult and error-prone. MVars and Haskell's runtime make this slightly safer (if you have done concurrency in other languages like C before, you might be able to see why), but are still vulnerable to the same issues that plague concurrent and parallel programs.

To give you a toy example, suppose we have two threads, one which acquires two MVars a and b and adds the value of a to b, and another which swaps their values. One possible implementation might be the following, with a deadly vulnerability hidden in plain sight:

swap :: MVar a -> MVar a -> Chan () -> IO ()
swap a b chan = do
    x <- takeMVar a
    y <- takeMVar b
    putMVar a y
    putMVar b x
    writeChan chan () -- signal done

addToMVar :: Num a => MVar a -> MVar a -> Chan () -> IO ()
addToMVar a b chan = do
    y <- takeMVar b
    x <- takeMVar a
    let z = x + y
    putMVar b z
    putMVar a x
    writeChan chan () -- signal done

main :: IO ()
main = do
    a <- newMVar 1 :: IO (MVar Int)
    b <- newMVar 2 :: IO (MVar Int)
    chan <- newChan :: IO (Chan ())
    forkIO $ addToMVar a b chan
    forkIO $ swap a b chan
    _ <- readChan chan
    _ <- readChan chan
    x <- takeMVar a
    y <- takeMVar b
    print x
    print y
    return ()

In this program, several things could happen:

  • swap starts first, and is able to acquire the values from both MVars a and b, thus executing completely and putting new values to a and b for addToMVar to use
  • addToMVar starts first and is able to acquire the values from both MVars a and b, thus executing completely and putting new values to a and b for swap to use
  • swap starts first and acquires a, shortly thereafter addToMVar begins and acquires b. Now swap is waiting for b, and addToMVar is waiting for a.
  • addToMVar starts first and acquires b, shortly thereafter swap begins and acquires a. Now swap is waiting for b, and addToMVar is waiting for a.

The last two scenarios result in something known as a deadlock and causes all these threads to wait and to be unable to continue. In particular, this deadlock was caused by a lock ordering inversion, a very common mistake that is usually undetectable by the compiler, and only starts causing problems at runtime! Scenarios like these are known as race conditions, and yes, while there are tools to detect race conditions, detecting all race conditions is undecidable, and thus is an impossible problem to solve. Are there tools to help us reduce of likelihood of running into race conditions?

Haskell supports something known as software transactional memory (STM) (Harris et al.; 2005), which is very similar to transactions in databases with ACID guarantees. Notice that this deadlock situation could go away if swap and addToMVar acquired both locks in one atomic operation, so that neither thread can interleave an MVar acquisition! STM provides such facilities to allow us to define, compose and work with atomic transactions. All we need to do is to install the stm package!

Key Ideas

Instead of working with the IO monad, STM constructs work within the STM monad. Under the hood, the stm implementation handles all the coordination, so as programmers, as long as we are working within the STM monad, we can regard these operations as atomic. In other words, an STM transaction appears to take place indivisibly. All transactional operations are within STM, and can only be escaped to IO using atomically:

ghci> import Control.Concurrent.STM
ghci> :t atomically
atomically :: STM a -> IO a

An atomically block is treated as a single I/O operation, so the STM operations cannot interleave. In addition, the atomically block executes a transaction entirely, or not at all.

Now let us try using STM for communications between threads. We are going to create a transaction for atomically acquiring both MVars. Of course, instead of MVar, which operates in the IO monad, the Control.Concurrent.STM module exposes a TMVar, sort of like a transactional MVar that lives in the STM monad. Let us write this transaction:

takeBothTMVars :: TMVar a -> TMVar b -> STM (a, b)
takeBothTMVars a b = do
  x <- takeTMVar a
  y <- takeTMVar b
  return (x, y)

As you can see, this looks just like an equivalent version written for MVars:

takeBothMVars :: MVar a -> MVar b -> IO (a, b)
takeBothMVars a b = do
  x <- takeMVar a
  y <- takeMVar b
  return (x, y)

Now let us rewrite our original deadlocked program using TMVars and STM, focusing temporarily on the swap function. Recall that we want to take both TMVars as a single atomic operation, hence we defined an STM operation takeBothTMVars that does so. To actually perform this operation as a single I/O action, we have to use atomically which performs the transaction, atomically:

swap :: TMVar a -> TMVar a -> Chan () -> IO ()
swap a b chan = do
    (x, y) <- atomically $ takeBothTMVars a b
    -- ...

This way, the transaction is done in one fell swoop, and if either TMVars are empty, the thread running swap will block until both become available. We can do the same for addToMVar, but this time, we are going to introduce lock-order inversion again by swapping the arguments to takeBothTMVars:

addToMVar :: Num a => TMVar a -> TMVar a -> Chan () -> IO ()
addToMVar a b chan = do
    (y, x) <- atomically $ takeBothTMVars b a

Although we swapped the arguments to takeBothTMVars, thereby introducing lock-order inversion, operationally, there is no difference, since takeBothTMVars is regarded as a single atomic operation anyway. We then continue defining the rest of the program which should be similar to before. Importantly, note that to create a new TMVar within IO for coordination, we use the newTMVarIO function;

import Control.Concurrent
import Control.Concurrent.Chan
import Control.Concurrent.STM

takeBothTMVars :: TMVar a -> TMVar b -> STM (a, b)
takeBothTMVars a b = do
  x <- takeTMVar a
  y <- takeTMVar b
  return (x, y)

putBothTMVars :: TMVar a -> a -> TMVar b -> b -> STM ()
putBothTMVars a x b y = do
  putTMVar a x
  putTMVar b y

swap :: TMVar a -> TMVar a -> Chan () -> IO ()
swap a b chan = do
    (x, y) <- atomically $ takeBothTMVars a b 
    atomically $ putBothTMVars a y b x
    writeChan chan ()

addToMVar :: Num a => TMVar a -> TMVar a -> Chan () -> IO ()
addToMVar a b chan = do
    (y, x) <- atomically $ takeBothTMVars b a
    let z = x + y
    atomically $ putBothTMVars a x b z
    writeChan chan ()

main :: IO ()
main = do
    a <- newTMVarIO 1 :: IO (TMVar Int)
    b <- newTMVarIO 2 :: IO (TMVar Int)
    chan <- newChan :: IO (Chan ())
    forkIO $ addToMVar a b chan
    forkIO $ swap a b chan
    _ <- readChan chan
    _ <- readChan chan
    x <- atomically $ takeTMVar a
    y <- atomically $ takeTMVar b
    print x
    print y

We don't have to only use STM for coordination between threads (although that is certainly a great use case). As long as we want atomic memory transactions, it is highly likely that STM is applicable.

For example, suppose we have some in-memory shared state, such as a counter, and users (perhaps across the network) can modify this counter. Modifying the counter requires two things: (1) reading the existing counter, (2) modifying the read value, (3) updating the counter with the modified value. To prevent data races, we want all these operations to be done in one fell swoop (i.e. as a single transaction).

incVar :: TVar Int -> STM Int
incVar v = do
  x <- readTVar v
  let y = x + 1
  writeTVar v y
  return y

Now we're not afraid to compose incVar with other STM operations, even if they are done concurrently!

import Control.Concurrent
import Control.Concurrent.STM

-- Increments a 'TVar'
incVar :: TVar Int -> STM Int
incVar v = do
  x <- readTVar v
  let y = x + 1
  writeTVar v y
  return y

-- IO Action that increments a TVar five times
aIncVar :: TVar Int -> IO ()
aIncVar v = aux 5 where
  aux :: Int -> IO ()
  aux 0 = return ()
  aux n = do
    r <- atomically $ incVar v
    print r
    aux (n - 1)

main :: IO ()
main = do
    n <- getNumCapabilities
    putStrLn $ "Number of cores: " ++ show n
    -- Initialize the counter
    counter <- newTVarIO 0 :: IO (TVar Int)
    -- For example, run four threads that increment the counter 5 times
    forkIO $ aIncVar counter
    forkIO $ aIncVar counter
    forkIO $ aIncVar counter
    forkIO $ aIncVar counter
    -- Sleep so we can wait for the other threads to complete
    threadDelay 1000000

When executing this program, you should notice that the counter is being incremented correctly, with a final value of 20.

The stm library provides many other useful facilities for writing transactional programs. Refer to the library documentation or the original paper for more details.

Concurrent and Parallel Programming in Haskell

In summary, concurrent and parallel programming in Haskell is, generally, not too dissimilar to that in other general-purpose languages. However, because Haskell is a purely functional and non-strict evaluation language, there are several neat things at our disposable. For one, it is relatively straightforward to fork an I/O action to be performed concurrently, and to use synchronizing variables like MVar for communication between threads. Importantly, the Haskell runtime ensures that MVars are only taken from or put by one thread, so synchronization is inherent in its implementation. However, using MVars alone can get cumbersome especially when dealing with multiple concurrent operations that do not compose well; hence, the introduction of STM for atomic transactions to reduce the likelihood of accidentally introducing race conditions and deadlocks. In addition, because Haskell has non-strict evaluation, parallelizing it is fairly straightforward, by simply annotating the functions with par and pseq function applications to describe what operation should be done in parallel with what else.

Most importantly, ideas like I/O actions, STM transactions and even parallel evaluation strategies are all exposed as monads, and programs that are written with these can make use of all the guarantees and conveniences that monads have to offer. As before, monads are some of the most powerful concepts in programming, and it helps dramatically to have programming languages that make working with them easy.

Lastly, concurrency and parallelism are huge topics in Computer Science in and of itself. Since many of what is described in this course are not as generally applicable to other general-purpose languages, many of the details are omitted. More information is readily available online and in the original papers describing the various systems like Concurrent and Parallel Haskell, and STM. This may be useful if you are interested in a pursuing a career involving Haskell development, or wish to learn, more deeply, about some of the ideas we have presented.

References

Tim Harris, Simon Marlow, Simon Peyton-Jones, and Maurice Herlihy. 2005. Composable memory transactions. In Proceedings of the tenth ACM SIGPLAN symposium on Principles and practice of parallel programming (PPoPP '05). Association for Computing Machinery, New York, NY, USA, 48–60. https://doi.org/10.1145/1065944.1065952.