Saturday, December 19, 2015

A whirlwind tour of Haskell, day 3

Day 3: linear and containers

In this puzzle, we need to move a cursor in 2D space and count the number of distinct coordinates we visited.

import qualified Data.Set as Set
import Linear.V2

direction :: Char -> V2 Int
direction '<' = V2 (-1) 0
direction '>' = V2 1 0
direction '^' = V2 0 1
direction 'v' = V2 0 (-1)
direction _   = V2 0 0

-- |
-- >>> day3 ">"
-- 2
-- >>> day3 "^>v<"
-- 4
-- >>> day3 "^v^v^v^v^v"
-- 2
day3 = map direction
   >>> scanl (+) 0
   >>> Set.fromList
   >>> length

I use V2 from the linear library to represent the 2D coordinates, because its vector space addition allows me to add both coordinates at once. To get the list of visited coordinates, I use scanl (+) 0 again: it's still a running sum, but this time we're adding directional unit vectors to a coordinate which starts at "zero", the origin of the 2D plane. This works because (+) and 0 are obtained via methods of the Num type class, and Haskell infers that I want to use the Num instance for V2 Int.

Interestingly, this is a parameterized instance, which means that the Num instance for V2 Int delegates part of its implementation to the Num instance for Int, and similarly for V2 Float, etc. This mechanism allows type class instances to be automatically constructed for complex composed types which have never been seen before, without requiring the programmer to write a type class instance for this particular composition of types. For example, our MyEDSL from earlier was composed of four layers of monad transformers, each of which defined their own Monad instance, but we did not have to define a Monad instance for MyEDSL in order to be able to use forM_ and when.

Next, to obtain distinct coordinates from the list of all coordinates, I convert the list into a set, defined by the containers library. There is also a function called nub in the standard library which eliminates duplicates from a list, but for technical reasons nub executes in quadratic time while Set.fromList runs in O(n*log n). The containers library also has an even more efficient IntSet which can only contain Ints, but I rarely use it because Set Int is more versatile (I can easily switch to a Set Float if I need to) and it's fast enough for me.

Finally, I compute the number of elements in the set using length. It's the same length we used earlier to compute the size of a list because, you guessed it, length is part of yet another type class. It's a type class called Foldable, and its methods include or can be used to derive length, sum, forM_, and other functions which require iterating over the elements of a container without modifying them.

Day 3.5: infinite lists, lens, and constraints

For the second part of the puzzle, we now need to move two cursors, and the ASCII instructions alternate between giving a direction for the first cursor and giving a direction for the second. We need to count the number of distinct coordinates which at least one of the cursors has visited.

{-# LANGUAGE RankNTypes #-}

import Control.Lens
import Data.Set.Lens
import Data.List

day3_5 = map direction
     >>> zip [0..]
     >>> partition (fst >>> even)
     >>> over (both.mapped) snd
     >>> over both (scanl (+) 0)
     >>> setOf (both.folded)
     >>> length

The sequence of functions is getting a bit long, so let's split it into smaller helper functions.

-- |
-- >>> fromAlternating [1..6]
-- ([1,3,5],[2,4,6])
-- >>> fromAlternating "a1b2c3"
-- ("abc","123")
fromAlternating :: [a] -> ([a], [a])
fromAlternating = zip [0..]
              >>> partition (fst >>> even)
              >>> over (both.mapped) snd

This first helper splits the lists of instructions into two sublists, one for each cursor. To do that, we first number the elements of the list from 0 to n, then we partition the list into two sublists according to whether the element number is even or odd, and finally, we drop the element numbers and keep only the elements.

zip [0..] is a common Haskell idiom for numbering the elements of a list. The way it works is by creating the infinite list [0,1,2,...] and using zip to pair each element of this list with each element of the original list [x,y,z], resulting in the list of pairs [(0,x), (1,y), (2,z)]. The resulting list has only three elements, not an inifinite number, because zip stops after the shortest list is exhausted. Infinite structures are very common in Haskell, because they allow us to focus on the contents and not on the boundary conditions: in this case, we focus on the fact that the list contains an increasing list of integers, and we don't bother specifying that we need as many integers as there are elements in our original list. Haskell's laziness makes sure that it will not get stuck trying to allocate an infinite list, and will instead only allocate as much as is necessary to compute the resulting finite list of pairs.

partition is a function from the standard library which partitions a list into two sublists according to a predicate: one sublist contains the elements which satisfy the predicate, and the other sublist contains the elements which don't. In this case, our predicate is that the first element of the pair, that is, the element number, should be even. This causes the even-numbered pairs to end up in the first list, and the odd-numbered pairs to end up in the second.

As this point we have two lists of pairs, with each pair containing a number and an element. We'd rather have lists of elements, because the numbers are just an intermediate annotation we used for the partitioning. So we'd like to apply snd, the function which only keeps the second part of a pair, to all the elements of both lists. I could have writen (\(xs,ys) -> (map snd xs, map snd ys)), it would have done the job, but I'd rather to take the opportunity to demonstrate the lens library.

The lens library deals with nested structures, in this case a pair of lists of elements. I think the name comes from the fact that a real-world lens allows you to focus light on a smaller area, and that you can compose multiple lenses to focus the light on an even smaller area. Similarly, a Lens from the lens library allows you to manipulate a structure by focusing on manipulating a smaller part of that nested structure, and Lenses can be composed to allow you to focus on a deeply-nested element hidden deep inside a nested structure.

The lens library also offers a variety of other "optics", which can focus on something other than a single element. In this case, I am constructing a Setter, which allows multiple elements in a nested structure to be modified at the same time. I am constructing it by composing both, a setter which focuses on both halves of a pair simultaneously, with mapped, a setter which focuses on all the elements of a list. The net result is that I am focusing on all the elements of both of my lists, that is, I am focusing on all the pairs on which I want to apply snd.

-- |
-- >>> runningSum [1,2,3]
-- [0,1,3,6]
-- >>> runningSum (map direction "^>>v")
-- [V2 0 0,V2 0 1,V2 1 1,V2 2 1,V2 2 0]
runningSum :: Num a => [a] -> [a]
runningSum = scanl (+) 0

This second helper is our good friend scanl (+) 0, which we have used to produce running sums of integers and running sums of 2D coordinates. The Num a constraint in the type signature indicates that while this helper function can indeed work on list of elements of various types, it will only work if that element type has a Num instance.

-- |
-- >>> distinctCountOf folded [1,2,3,2,2,3]
-- 3
-- >>> distinctCountOf both (1,2)
-- 2
-- >>> distinctCountOf both (1,1)
-- 1
distinctCountOf :: Ord a => Fold s a -> s -> Int
distinctCountOf elems = setOf elems
                    >>> length

Our final helper is a generalization of the Set.fromList >>> length technique we've been using so far to count the distinct elements of a list. This time, however, the elements we want to count are distributed among two separate lists, so a simple Set.fromList is not going to work. Instead, I'm using setOf, a function from lens which takes a Fold and constructs a Set out of the elements on which the Fold focuses. A Fold is yet another kind of optic, one which also focuses on multiple elements simultaneously, but instead of modifying all of them at once like a Setter, it examines them using methods from Foldable.

-- |
-- >>> day3_5 "^v"
-- 3
-- >>> day3_5 "^>v<"
-- 3
-- >>> day3_5 "^v^v^v^v^v"
-- 11
day3_5 = map direction
     >>> fromAlternating
     >>> over both runningSum
     >>> distinctCountOf (both.folded)

Finally, we compose our helpers into one sophisticated function which solves the puzzle: we parse the directions into unit vectors, we split the alternating directions into one list for each of the two cursors, we list the visited coordinates for each, and we count how many distinct coordinates have been visited overall.


No comments: