Mutable State

When programming in other languages, assigment (a.k.a. mutation) can be used to achieve several programming goals, such as convenience, sloppiness, and performance. Consider how far we've come without the need to rely on mutation.

Haskell does, however, provide mechanisms for mutation, albeit in a manner distinct from most other languages. In a nutshell, Haskell tracks the "extent" or "scope" of mutable references, using the type system to ensure that they do not "escape" their "computational threads." As we will see, this allows mutation to be used locally without allowing them to leak outside of the kinds of nice, pure functional boundaries we have grown to know and love.

Example: Fibonacci

As a very simple example, implement Fibonacci naively:

fib_slow :: Integer -> Integer
fib_slow 1 = 1
fib_slow 2 = 1
fib_slow n = fib_slow (n-1) + fib_slow (n-2)

> map fib_slow [1..30]     -- slow
> map fib_slow [1..100]    -- hopeless

A much more efficient version keeps track of previous two answers:

fib_fast :: Integer -> Integer
fib_fast 1 = 1
fib_fast 2 = 1
fib_fast n = foo 1 1 (n-2) where
  foo i _ 0 = i
  foo i j m = foo (i+j) i (m-1)

> map fib_fast [1..100]
> map fib_fast [1..1000]

Even though we don't need to, let's implement this using mutable variables (like we probably would in C or Java).

ST (State Threads)

There are two components to mutable references in Haskell:

  1. A "thread" of computation, denoted by Control.Monad.ST s a, where s is the (abstract) "id" of a thread and a is the type of value produced; and

  2. A reference cell of type Data.STRef s a points to a memory cell that stores a value of type a and s is the thread id.

Might help to remember more informative type variable names: ST id a and STRef id a.

Unlike Control.Monad.State s a ~= s -> (a, s), this "state monad" really is stateful.

ST s is a Monad, so can sequence ST s a actions.

fib_impure :: Integer -> Integer
fib_impure 1 = 1
fib_impure 2 = 1
fib_impure n = runST $ do
  x <- newSTRef 1        -- every cell initialized to some value
  y <- newSTRef 1

  replicateM_ (n-2) $ do   -- review Control.Monad!
     i <- readSTRef x
     j <- readSTRef y
     writeSTRef x j
     writeSTRef y (i+j)

  readSTRef y

> map fib_impure [1..30]
> map fib_impure [1..300]
> map fib_impure [1..300] == map fib_fast [1..300]

Okay, so how do these functions work? Notice that new/read/write return values "in the ST monad."

newSTRef   :: a -> ST s (STRef s a)
readSTRef  :: STRef s a -> ST s a
writeSTRef :: STRef s a -> a -> ST s ()
runST      :: (forall s. ST s a) -> a

Whoa, explicit forall s. .... How is this type different than usual?

Higher-Rank Polymorphism

Let's look at a simpler example first

Type error:

foo :: ([a] -> b) -> Bool -> b
foo f True  = f "abcd"
foo f False = f [1,2]

Remember where the implicit foralls are. (Set the ExplicitForAll language flag to allow implict foralls to be written.)

foo :: forall a. forall b. ([a] -> b) -> Bool -> b

But foo uses the first call to f at type [Char] -> b and the second call at type Num n => [n] -> b.

If we move the a quantifier inside, then it typechecks! (Writing foralls anywhere but all the way to the left in a type requires the flag RankNTypes.).

foo :: forall b. (forall a. [a] -> b) -> Bool -> b

By default, all quantifiers are at "the top-level" of a type cannot appear to the left on arrow. "Higher-rank" types allow quantifiers to be nested inside, to the left of arrows

      all                            all
      / \                            /  \
     a   all                        b  ------>
        /   \                         /       \
       b  ------>                   all       ->
         /       \                  / \      /  \
       ->        ->                a  ->   Bool   b
      /  \      /  \                 /  \
     [a]  b   Bool  b               [a]  b

We can call foo with any functions of type (forall a. [a] -> b). For example:

> foo length True
> foo length False
> foo (const 0) True
> foo (const 0) False

"Normal" foralls allow (require) the caller to choose the type arguments, so language/compiler cannot make any assumptions about choice.

But foralls to the left of arrow prevent caller from knowing anything about the type variable. Used to encode encapsulation (e.g. objects). Allows compiler to make decisions about representation.

Back to ST

The forall in runST prevents output from depending at all on thread. This approach allows us to "locally" use mutation. But only pure values at our function boundaries.

For example:

bad :: Bool
bad = runST $ do
  x <- newSTRef True
  pure $ runST $ do
    writeSTRef x False  -- inner thread can't access vars from outer
  readSTRef x

Even the following fails "because type variable 's' would escape its scope" (something like "x :: exists s. STRef s Bool").

bad' :: ()
bad' =
  let x = runST $ newSTRef True in
  ()

Can define aliases:

aliasEx = runST $ do
  x <- newSTRef True
  let y = x
  writeSTRef y False
  readSTRef x

References prevented from "escaping" scope. So can use references internally (e.g. within a function), but must be pure at boundaries. Best of both worlds! Haskell can enforce this, whereas most type systems cannot either (a) cannot prevent references from crossing boundaries (b) require different types depending on implementation.

"Impure" features do exist in Haskell, but they are to be used with restraint and caution:

  • ST (State Transformer / State Thread) monads and STRef
  • IO (and see unsafe IO functions)
  • IORef, in addition to mutation, allows exceptions and I/O. But cannot (safely) escape IO like you can ST.
  • Control.Concurrent.MVars

Source Files

results matching ""

    No results matching ""