## Saturday, December 19, 2015

### A whirlwind tour of Haskell, day 1

##### Day 1, imperatively: `transformers` and monads

The first puzzle asks us to increment and decrement a floor number every time an opening or closing parenthesis is encountered. In an imperative language, we'd probably begin by initializing a counter, then we'd iterate over the characters of the input string, incrementing or decrementing the counter depending on which character we encounted. Implementing such a solution in Haskell is possible, but is a bit verbose:

``````import Control.Monad

day1 :: String -> Int
day1 input = execState go 0       -- counter starts at 0
where
go :: State Int ()
go = do
forM_ [0 .. length input - 1] \$ \i -> do
when (input !! i == '(') \$ do -- when we encounter '('...
modify (+ 1)                -- ...increment the counter
when (input !! i == ')') \$ do -- when we encounter ')'...
modify (subtract 1)         -- ...decrement the counter
``````

Here I have used the `transformers` library to define a stateful computation. That's right, Haskell is such a pure functional language that it doesn't even have builtin syntax for declaring and incrementing a counter, you need to use a library in order to do that! I'll explain why the library is called "transformers" in the next section.

Haskell has tons of libraries like this defining specialized embeded domain specific languages (EDSLs). In fact, the word "monad" which is always brought up when discussing Haskell is the name of an interface implemented by many such EDSLs. This shared interface allows generic constructs like the above `forM_` and `when` to work with any EDSL implementing the interface.

##### Day 1, functionally: pattern-matching and composition

A functional solution would instead transform the input using a series of mathematical functions, in such a way that the final transformation gives the desired result. Here, I'm using pattern-matching to map the two parenthesis characters to the numbers one and minus one, and the final result is the sum of all those numbers. Much more succinct!

``````delta '(' = 1
delta ')' = -1

day1 = sum . map delta
``````

The mathematical transformations are applied from right to left: first, `map delta` transforms the input string (a list of characters) into a list of numbers, and then `sum` adds those numbers together.

It is customary to give type signatures to top-level definitions. This makes the code a bit less succinct, but it makes it much easier to understand functions in isolation.

``````delta :: Char -> Int
delta '(' = 1
delta ')' = -1

day1 :: String -> Int
day1 = sum . map delta
``````

With those type signature, it is clearer that `day1` is transforming a `String` into an `Int`, using two intermediate transformations `map delta` and `sum`. I happen to know that `map` takes a transformation from `a` to `b` and produces a transformation from a list of `a` to a list of `b`, so `map delta` transforms the `String`, which is also a `[Char]`, into an `[Int]`, which `sum` then transforms into an `Int`.

It is also customary to write total functions, meaning that if the signature of `delta` claims that it transforms `Char`s into `Int`s, it better be able to handle any `Char`. The above implementation of `delta` is only handling two possible characters, and so even though Advent of Code's input only uses those two characters, I feel compelled to complete my implementation so that it ignores non-parenthesis characters.

``````delta :: Char -> Int
delta '(' = 1
delta ')' = -1
delta _   = 0
``````

Finally, while mathematicians like to write composition from right to left (`f (g x)` first applies `g` to `x`, and then applies `f` to the result, and so the composition is written `f . g`), in this case I find it more convenient to write down my transformations in the order in which they apply, so I'd rather have a left-to-right composition operator. Such an operator exists in the `Control.Arrow` module, along with a few other operators for combining and manipulating various kinds of transformations (Haskell supports more kinds of transformations than just mathematical functions).

``````import Control.Arrow

day1 = map delta
>>> sum
``````

Notice that I've dropped the type signature for `day1`. Its type is still `String -> Int`, but leaving the type out makes it easier for me to visualize intermediate results, using the following technique. I first write the first half of the implementation, `day1 = map delta`. The type of this transformation is `String -> [Int]`, and so my `ghcid` setup will display the resulting list of numbers. After visually inspecting that the result seems correct so far, I add the second line, and `ghcid` updates its output to display the final result. Keeping the type signature would have discouraged this kind of incremental development, because `ghcid` would have displayed a type error instead of the intermediate result.

##### Day 1.5, imperatively: `either` and monad transformers

The next part of the puzzle asks us to count the number of steps until the counter reaches `-1`. An imperative solution would build on the previous one by using a second counter to track the number of steps, and by aborting the loop as soon as the floor number reaches `-1`.

Perhaps surprisingly, the EDSL we used earlier for expressing a stateful computation is intentionally limited to a single stateful variable, and so we cannot use it to express our new two-counters algorithm. Instead, we need to combine two instances of the EDSL, each tracking one of the two counters, with a third EDSL called `either` implementing the ability to abort the computation early. The fact that we can do that is why the library is named `transformers`: the EDSLs it provides are not only independent EDSLs each providing their own feature such as stateful computation, they also have the ability to transform an input EDSL into a slightly more complex EDSL by adding stateful operations to the list of operations supported by the input EDSL. This idea of building more complex systems out of simpler systems is pervasive in Haskell, and if you're interested in learning more, I made a video about combinator libraries which introduces the subject to non-Haskellers.

Now, using three EDSLs for such a simple problem might seem a bit overkill, and it is. As before, the imperative solution ends up being much more verbose in Haskell than the functional implementation. But let's take a look anyway, as building custom EDSLs is an important Haskell skill which becomes invaluable when structuring larger programs.

``````import Control.Arrow
import Data.Functor.Identity

type MyEDSL = StateT Int (StateT Int (EitherT Int Identity))

runMyEDSL :: MyEDSL () -> Int
runMyEDSL = flip evalStateT 0  -- step counter starts at 0
>>> flip evalStateT 0  -- floor number starts at 0
>>> eitherT return (\() -> error "floor -1 never reached")
>>> runIdentity
``````

I first define my custom EDSL, which I name `MyEDSL`. To build it, I start (from right to left) with the `Identity` EDSL, which has no special operations. I combine it with `EitherT Int`, which adds the ability to abort the computation early. Finally, I add two layers of `StateT Int`, for the two counters. I also define a way to run `MyEDSL` computations, by running the operations of all three underlying layers (four if we count `Identity`).

``````getSteps :: MyEDSL Int
getSteps = get

incrSteps :: MyEDSL ()
incrSteps = modify (+1)

getFloor :: MyEDSL Int
getFloor = lift \$ get

incrFloor :: MyEDSL ()
incrFloor = lift \$ modify (+ 1)

decrFloor :: MyEDSL ()
decrFloor = lift \$ modify (subtract 1)

returnEarly :: Int -> MyEDSL ()
returnEarly x = lift \$ lift \$ EitherT.left x
``````

Next, I define operations which are specific to `MyEDSL`, in terms of the simpler EDSLs it is made of: `get` and `modify` are the stateful operations we've already seen in Day 1, and `EitherT.left` is an operator from `either` which aborts the computation early. The function `lift` is used to specify which layer the operation should run in: `op` runs in the outermost `StateT Int` layer, `lift \$ op` runs in the second `StateT Int` layer, and `lift \$ lift \$ op` runs in the third layer, the `EitherT Int` one.

``````day1_5 :: String -> Int
day1_5 input = runMyEDSL go
where
go :: MyEDSL ()
go = do
forM_ [0 .. length input - 1] \$ \i -> do
incrSteps

when (input !! i == '(') \$ incrFloor
when (input !! i == ')') \$ decrFloor

floorNumber <- getFloor
when (floorNumber == -1) \$ do
steps <- getSteps
returnEarly steps
``````

Once all the operations are defined, I can describe my algorithm in terms of my custom operations: iterate through the input string, count the steps, increment and decrement the floor counter, and return early when floor `-1` is reached.

##### Day 1.5, functionally: laziness and higher-order functions

As promised, the functional solution is much more succinct:

``````-- |
-- >>> day1_5 ")"
-- 1
-- >>> day1_5 "()())"
-- 5
day1_5 = map delta
>>> scanl (+) 0
>>> takeWhile (/= -1)
>>> length
``````

As before, I begin by converting the list of characters into a list of `+1`s and `-1`s. This time, however, instead of taking the sum of all the numbers, I use `scanl` to take a running sum, meaning I get a list of all the intermediate sums as each number is added to the total. I then cut that list short just before the first `-1`, and I count how many items are in that trimmed list. Oh, and thanks to laziness (Haskell only evaluates the part of the code which needs to be evaluated to obtain the answer), this implementation is just as efficient as the imperative version: even though it would seem like the running sum would need to produce its entire output list before `takeWhile` could begin to trim it, `map` and `scanl` actually stop consuming their input lists after the sum reaches `-1`, as if by magic.

`map`, `scanl` and `takeWhile` are three of many higher-order functions provided by Haskell's standard library. A "higher-order" function is simply one which takes another function as one of its inputs, in this case in order to produce a modified function. We've already seen how `map` converts a function on elements into a function on lists. The higher-order function `scanl` takes a binary function, in this case `(+)`, and an initial value, in this case `0`, and produces a function from a list of input elements to a list of outputs obtained by repeatedly applying the input function to the value so far and the next input element. So in this case, since the input function is addition, the result is a running sum. Our last higher-order function, `takeWhile`, takes a predicate (a function returning a boolean) and produces a function which trims its input list by keeping all the elements as long as they satisfy the predicate, and then dropping the rest of the elements as soon as it encounters one which doens't satisfy the predicate. In this case, we're keeping all the elements which are not `-1`, and then we're droping `-1` and the rest of the elements.

High-order functions are another example of the principle of constructing more complex systems, in this case functions, by combining and transforming simpler systems, in this case simpler functions. The function `(+)` is quite simple, it simply adds two numbers. The number `0` is even simpler. By applying the higher-order function `scanl` to `(+)` and `0`, we obtain a more sophisticated function which keeps a running sum. And by composing the four functions `map delta`, `scanl (+) 0`, `takeWhile (/= 1)` and `length`, we obtain a very specialized function which solves Day 1.5's puzzle.