Sunday, December 20, 2015

A whirlwind tour of Haskell, day 6

Day 6: precise types, arrays, and ST

This puzzle asks us to parse and execute a sequence of instructions. The instructions are represented as text strings, so in a dynamic language I would probably use regular expressions to determine which instruction I'm looking at, using capturing groups like the following to extract the instruction's parameters.

/^command1 ([0-9]+) ([0-9]+)$/

In Haskell, however, I prefer to define a data type describing the set of all possible instructions, like this:

data Rectangle = Rectangle (V2 Integer) (V2 Integer)
  deriving (Show, Eq)

data Operation = TurnOn | TurnOff | Toggle
  deriving (Show, Eq)

data Instruction = Instruction Operation Rectangle
  deriving (Show, Eq)

The above says that an Instruction consists of an Operation, which can either be TurnOn, TurnOff or Toggle, followed by a Rectangle consisting of two 2D coordinates (two opposite corners). I prefer this representation over a simple string because it is more precise, in the sense that the set of values of this type more closely ressembles the set of valid operations than the set of strings does. There are tons of strings such as "flubberdoodle" which do not represent a valid instruction at all, while with this precise representation, the worse I can do is a value like Instruction TurnOn (V2 0 1000) (V2 50 1050), which at least looks like an instruction, and is only invalid because those coordinates are outside the bounds of the grid specified by the puzzle.

The advantage of having a precise representation is in making sure that we handle all the possibilities. If I was implementing the puzzle's instructions using a function taking a string as input, I'd have to write something like this:

runInstruction :: String -> ...
runInstruction s
  | "turn on"  `isPrefixOf` s = ...
  | "turn off" `isPrefixOf` s = ...
  | "toggle"   `isPrefixOf` s = ...
  | otherwise                 = error "unrecognized instruction"

And then if a new instruction was added, it would be easy to forget to add a handler for it in runInstruction, especially if there were many other functions to update throughout the codebase. By using a more precise algebraic type, I can write runInstruction like this instead:

runInstruction :: Instruction -> ...
runInstruction (Instruction TurnOn  ...) = ...
runInstruction (Instruction TurnOff ...) = ...
runInstruction (Instruction Toggle  ...) = ...

And then if a new instruction gets added, I'll get a warning at compile-time instead of an "unrecognized instruction" error at runtime. Assuming you're compiling with warnings enabled, of course, which isn't the default with the .cabal file generated by cabal init. I always add the line ghc-options: -W -Wall to the end of the file it generates.

Anyway, the instructions are about flipping bits in a bit array, so let's talk about Haskell arrays. The array library defines many different kinds of arrays, and in this implementation I'll be using two different kinds, UArray (V2 Integer) Bool and STUArray s (V2 Integer) Bool.

import Data.Array.ST
import Data.Array.Unboxed

type Index = V2 Integer
type Grid     =   UArray   Index Bool
type STGrid s = STUArray s Index Bool

The Bool means that each cell of the array will contain a boolean, and the V2 Integer means that we'll be indexing into the array using 2D coordinates, so this is a 2D array. The U means "unboxed". Unboxed arrays are only defined for value types which can be tightly-packed; in our case for example, booleans can be efficiently packed as a sequence of bits.

The ST in STUArray comes from yet another monadic EDSL, called ST. If you were disappointed in Day 1 when I said that the State EDSL was limited to a single stateful variable, you'll be happy to hear that ST is a variant which allows you to allocate an arbitrary number of stateful references and to modify them in-place. This is also the EDSL we'll need to use in order to modify our array in-place.

The three operations requested by the puzzle are quite straightforward to implement: we simply read and write booleans to the array at the given index. The unboxed array implementation takes care of translating those boolean reads and writes into the corresponding bit-fiddling shifts and masks which will touch the correct bits.

import Control.Monad.ST

turnOn :: STGrid s -> Index -> ST s ()
turnOn g i = writeArray g i True

turnOff :: STGrid s -> Index -> ST s ()
turnOff g i = writeArray g i False

toggle :: STGrid s -> Index -> ST s ()
toggle g i = do
    b <- readArray g i
    writeArray g i (not b)

Each puzzle instruction requires executing one of our three operations on all the cells inside a particular rectangular area. This is straightforward as well, using the range method. This method comes from a type class named Ix, short for "index", because a type needs to have an Ix instance in order for its values to be used as indices into an array. The Ix instance for Integer, used for 1D arrays, has a range method which simply lists all the integers between two bounds. The Ix instance for V2 Integer, used for 2D arrays, has a range method which enumerates all the 2D coordinates between the corners of a rectangle, which is exactly what we want.

Since ST implements the Monad type class, we can use our trusty forM_ to execute our chosen instruction on every index in the resulting list of indices.

listIndices :: Rectangle -> [Index]
listIndices (Rectangle c1 c2) = range (c1,c2)

runInstruction :: STGrid s -> Instruction -> ST s ()
runInstruction g = go
    go (Instruction TurnOn  r) = forM_ (listIndices r) (turnOn  g)
    go (Instruction TurnOff r) = forM_ (listIndices r) (turnOff g)
    go (Instruction Toggle  r) = forM_ (listIndices r) (toggle  g)

Now that we can run a single instruction, we can run the entire list of instructions in order to obtain the final light pattern. To do this, I allocate a new array of the dimensions described in the puzzle, I execute each of the instructions, and I returned a frozen copy of the STUArray. This converts the STUArray, which cannot outlive the ST computation in which it was created, into an immutable UArray, which can live outside of ST but can no longer be modified in-place.

runInstructions :: [Instruction] -> Grid
runInstructions instructions = runST $ do
    g <- newArray (V2 0 0, V2 999 999) False
    forM_ instructions (runInstruction g)
    freeze g

Finally, to solve the puzzle, I parse the instruction strings into a list of precisely-typed Instructions, I run those instructions to get a light pattern, I enumerate the bits, keep those which are True, and count how many there were.

day6 = parseInstructions
   >>> runInstructions
   >>> elems
   >>> filter id
   >>> length

I've omitted the code for parsing the instructions, it's an important topic but I'll keep it for later. I've also omitted Day 6.5 because it is too similar to Day 6.


No comments: