Mutable State
When programming in other languages, assigment (a.k.a. mutation) can be used to achieve several programming goals, such as convenience, sloppiness, and performance. Consider how far we've come without the need to rely on mutation.
Haskell does, however, provide mechanisms for mutation, albeit in a manner distinct from most other languages. In a nutshell, Haskell tracks the "extent" or "scope" of mutable references, using the type system to ensure that they do not "escape" their "computational threads." As we will see, this allows mutation to be used locally without allowing them to leak outside of the kinds of nice, pure functional boundaries we have grown to know and love.
Example: Fibonacci
As a very simple example, implement Fibonacci naively:
fib_slow :: Integer -> Integer
fib_slow 1 = 1
fib_slow 2 = 1
fib_slow n = fib_slow (n-1) + fib_slow (n-2)
> map fib_slow [1..30] -- slow
> map fib_slow [1..100] -- hopeless
A much more efficient version keeps track of previous two answers:
fib_fast :: Integer -> Integer
fib_fast 1 = 1
fib_fast 2 = 1
fib_fast n = foo 1 1 (n-2) where
foo i _ 0 = i
foo i j m = foo (i+j) i (m-1)
> map fib_fast [1..100]
> map fib_fast [1..1000]
Even though we don't need to, let's implement this using mutable variables (like we probably would in C or Java).
ST (State Threads)
There are two components to mutable references in Haskell:
A "thread" of computation, denoted by
Control.Monad.ST s a
, wheres
is the (abstract) "id" of a thread anda
is the type of value produced; andA reference cell of type
Data.STRef s a
points to a memory cell that stores a value of typea
ands
is the thread id.
Might help to remember more informative type variable names:
ST id a
and
STRef id a
.
Unlike Control.Monad.State s a ~= s -> (a, s)
,
this "state monad" really is stateful.
ST s
is a Monad
, so can sequence ST s a
actions.
fib_impure :: Integer -> Integer
fib_impure 1 = 1
fib_impure 2 = 1
fib_impure n = runST $ do
x <- newSTRef 1 -- every cell initialized to some value
y <- newSTRef 1
replicateM_ (n-2) $ do -- review Control.Monad!
i <- readSTRef x
j <- readSTRef y
writeSTRef x j
writeSTRef y (i+j)
readSTRef y
> map fib_impure [1..30]
> map fib_impure [1..300]
> map fib_impure [1..300] == map fib_fast [1..300]
Okay, so how do these functions work?
Notice that new/read/write
return values "in the ST
monad."
newSTRef :: a -> ST s (STRef s a)
readSTRef :: STRef s a -> ST s a
writeSTRef :: STRef s a -> a -> ST s ()
runST :: (forall s. ST s a) -> a
Whoa, explicit forall s. ...
.
How is this type different than usual?
Higher-Rank Polymorphism
Let's look at a simpler example first
Type error:
foo :: ([a] -> b) -> Bool -> b
foo f True = f "abcd"
foo f False = f [1,2]
Remember where the implicit foralls are.
(Set the ExplicitForAll
language flag to allow
implict forall
s to be written.)
foo :: forall a. forall b. ([a] -> b) -> Bool -> b
But foo
uses the first call to f
at type [Char] -> b
and the second call at type Num n => [n] -> b
.
If we move the a quantifier inside, then it typechecks!
(Writing forall
s anywhere but all the way to the left
in a type requires the flag RankNTypes
.).
foo :: forall b. (forall a. [a] -> b) -> Bool -> b
By default, all quantifiers are at "the top-level" of a type cannot appear to the left on arrow. "Higher-rank" types allow quantifiers to be nested inside, to the left of arrows
all all
/ \ / \
a all b ------>
/ \ / \
b ------> all ->
/ \ / \ / \
-> -> a -> Bool b
/ \ / \ / \
[a] b Bool b [a] b
We can call foo
with any functions of type (forall a. [a] -> b)
.
For example:
> foo length True
> foo length False
> foo (const 0) True
> foo (const 0) False
"Normal" forall
s allow (require) the caller to choose the type
arguments, so language/compiler cannot make any assumptions about
choice.
But forall
s to the left of arrow prevent caller from knowing
anything about the type variable.
Used to encode encapsulation (e.g. objects).
Allows compiler to make decisions about representation.
Back to ST
The forall
in runST
prevents output from depending
at all on thread.
This approach allows us to "locally" use mutation.
But only pure values at our function boundaries.
For example:
bad :: Bool
bad = runST $ do
x <- newSTRef True
pure $ runST $ do
writeSTRef x False -- inner thread can't access vars from outer
readSTRef x
Even the following fails "because type variable 's' would escape its scope" (something like "x :: exists s. STRef s Bool").
bad' :: ()
bad' =
let x = runST $ newSTRef True in
()
Can define aliases:
aliasEx = runST $ do
x <- newSTRef True
let y = x
writeSTRef y False
readSTRef x
References prevented from "escaping" scope. So can use references internally (e.g. within a function), but must be pure at boundaries. Best of both worlds! Haskell can enforce this, whereas most type systems cannot either (a) cannot prevent references from crossing boundaries (b) require different types depending on implementation.
"Impure" features do exist in Haskell, but they are to be used with restraint and caution:
ST
(State Transformer / State Thread) monads andSTRef
IO
(and see unsafeIO
functions)IORef
, in addition to mutation, allows exceptions and I/O. But cannot (safely) escapeIO
like you canST
.Control.Concurrent.MVars