Zippers
Programming in a purely functional languages means no mutation. So, "updating" a data structure means creating a copy with the desired changes.
insertAt :: Int -> a -> [a] -> [a]
insertAt 0 x ys = x : ys
insertAt i x (y:ys) | i > 0 = y : insertAt (pred i) x ys
> "12345" |> insertAt 3 'A' |> insertAt 4 'B' |> insertAt 5 'C'
As you may learn in subsequent courses on data structures in, or compilers for, functional languages, a good compiler will generate code that does use pointers to reuse memory when parts of one data structured is shared with another.
In the above example, the linked list with the numbers 4
and 5
will
be shared by all four lists (the original one and the result of each
call to insertAt
). But there will be four separate linked lists
for the first three numbers 1
and 3
. Three separate A
nodes (two
duplicates) will be created and two separate B
nodes (one duplicate)
will be created.
At the end of the today, some of this re-allocation is unavoidable. But, in cases where multiple updates occur at a similar place in the data structure, it would be nice to avoid the extra, unnecessary copies --- as well as the time to repeatedly traverse to the same location --- making a batch of changes before "moving" to another part of the data structure. That is, want to maintain a "focus" or "cursor" or "context" within the data structure, to allow fast traversal and changes around it, and then "rewind" or "zip up" the data structure as normal later on.
The key to the zipper pattern is to maintain links from a cursor back to the root, thus inverting the typical flow of pointers in the data structure. All "pointers" should point out from the cursor.
Lists
data ListZipper a
= Z { cursor :: Int -- cursor == length revFront
, revFront :: [a]
, back :: [a]
}
deriving (Show)
Creating and removing zippers:
fromList :: [a] -> ListZipper a
fromList xs = Z 0 [] xs
toList :: ListZipper a -> [a]
toList (Z _ revFront back) = reverse revFront ++ back
Traversing left and right:
goLeft, goRight :: ListZipper a -> ListZipper a
goLeft (Z i (x:revFront) back) = Z (pred 1) revFront (x:back)
goRight (Z i revFront (x:back)) = Z (succ 1) (x:revFront) back
Insertion, before and after:
insertAfterCursor x (Z i revFront back) = Z i revFront (x:back)
insertAtCursor x (Z i revFront back) = Z (succ 1) (x:revFront) back
> :{
"12345"
|> fromList
|> goRight |> goRight |> goRight
|> insertAtCursor 'A'
|> insertAtCursor 'B'
|> insertAtCursor 'C'
|> toList
:}
"123ABC45"
Binary Trees
data BinaryTree a
= Node a (BinaryTree a) (BinaryTree a)
| Empty
deriving (Show)
data Path a
= Top
| LeftStep
a -- this node val
(Path a) -- path up
(BinaryTree a) -- right child
| RightStep
a -- this node val
(Path a) -- path up
(BinaryTree a) -- left child
deriving (Show)
data BinaryTreeZipper a = Z (BinaryTree a, Path a) deriving (Show)
An alternative to LeftStep
/ RightStep
would be for the
direction to name the sibling, rather than the step:
Step a (Path a) (Either (BinaryTree a) (BinaryTree a))
Regardless of this choice, notice the type structure of Path
;
could represent with a list instead:
type Direction = L | R
type Path a = [(a, Direction, BinaryTree a)]
Or:
type Path a = [(a, Either (BinaryTree a) (BinaryTree a))]
Let's stick with the first definition.
Creating and removing zippers:
fromTree :: BinaryTree a -> BinaryTreeZipper a
fromTree t = Z (t, Top)
toTree :: BinaryTreeZipper a -> BinaryTree a
toTree (Z (t, Top)) = t
toTree (Z (t, LeftStep x up right)) = toTree (Z (Node x t right, up))
toTree (Z (t, RightStep x up left)) = toTree (Z (Node x left t, up))
Navigation:
goLeft, goRight, goUp, goDownLeft, goDownRight :: BinaryTreeZipper a -> BinaryTreeZipper a
goLeft (Z (t, RightStep x up left)) = Z (left, LeftStep x up t)
goRight (Z (t, LeftStep x up right)) = Z (right, RightStep x up t)
goUp (Z (t, LeftStep x up right)) = Z (Node x t right, up)
goUp (Z (t, RightStep x up left)) = Z (Node x left t, up)
goDownLeft (Z (Node x left right, up)) = Z (left, LeftStep x up right)
goDownRight (Z (Node x left right, up)) = Z (right, RightStep x up left)
Multi Trees
Generalize binary trees to ones with arbitrary number of children:
data MultiTree a
= Node a [MultiTree a] -- children intended to be non-empty
| Empty
deriving (Show)
data Path a
= Top
| Step
a -- this node val
(Path a) -- path up
[MultiTree a] -- left siblings, in reverse order
[MultiTree a] -- right siblings
deriving (Show)
data MultiTreeZipper a = Z (MultiTree a, Path a)
deriving (Show)
Navigation; notice that goUp
is the expensive case, requiring a
call to reverse
in order to zip the entire list of children
back up:
goLeft, goRight, goUp, goDownFirst :: MultiTreeZipper a -> MultiTreeZipper a
goLeft (Z (t, Step x up (l:left) right)) = Z (l, Step x up left (t:right))
goRight (Z (t, Step x up left (r:right))) = Z (r, Step x up (t:left) right)
goUp (Z (t, Step x up left right)) = Z (Node x (reverse left ++ [t] ++ right), up)
goDownFirst (Z (Node x (t:children), p)) = Z (t, Step x p [] children)