## Saturday, December 19, 2015

### A whirlwind tour of Haskell, day 5

##### Day 5: infix notation and list comprehensions

This puzzle lists a bunch of nonsense conditions on the characters of a string, and we must check whether the string satisfies all of them. This should be quite easy in any language, so it doesn't give me the opportunity to demonstrate another powerful Haskell library. Instead, I think I will try to make my implementation as readable as possible:

``````-- |
-- >>> numberOfTruePropositions [2+2 == 4, 3+3 == 10, 2*5 == 10]
-- 2
numberOfTruePropositions :: [Bool] -> Int
numberOfTruePropositions = filter (== True)
>>> length

-- |
-- >>> 2 `orMoreAreTrue` [True,True,False]
-- True
-- >>> 3 `orMoreAreTrue` [True,True,False]
-- False
orMoreAreTrue :: Int -> [Bool] -> Bool
orMoreAreTrue n = numberOfTruePropositions
>>> (>= n)

-- |
-- >>> 2 `atMostAreTrue` [True,True,False]
-- True
-- >>> 1 `atMostAreTrue` [True,True,False]
-- False
atMostAreTrue :: Int -> [Bool] -> Bool
atMostAreTrue n = numberOfTruePropositions
>>> (<= n)

-- |
-- >>> isVowel 'a'
-- True
-- >>> isVowel 'b'
-- False
isVowel :: Char -> Bool
isVowel c = c `elem` "aeiou"

-- |
-- >>> doubledLetter 'a'
-- "aa"
doubledLetter :: Char -> String
doubledLetter c = printf "%c%c" c c

-- |
-- >>> atLeastThreeLettersAreVowels "aaa"
-- True
-- >>> atLeastThreeLettersAreVowels "abc"
-- False
atLeastThreeLettersAreVowels :: String -> Bool
atLeastThreeLettersAreVowels s =
3 `orMoreAreTrue` [ isVowel c
| c <- s
]

-- |
-- >>> hasSomeDoubledLetter "aaa"
-- True
-- >>> hasSomeDoubledLetter "aba"
-- False
hasSomeDoubledLetter :: String -> Bool
hasSomeDoubledLetter s =
1 `orMoreAreTrue` [ doubledLetter c `isInfixOf` s
| c <- ['a'..'z']
]

-- |
-- >>> hasNoForbiddenSubstrings "aaa"
-- True
-- >>> hasNoForbiddenSubstrings "abc"
-- False
hasNoForbiddenSubstrings :: String -> Bool
hasNoForbiddenSubstrings s =
0 `atMostAreTrue` [ cc `isInfixOf` s
| cc <- ["ab", "cd", "pq", "xy"]
]

-- |
-- >>> day5 "ugknbfddgicrmopn"
-- True
-- >>> day5 "aaa"
-- True
-- >>> day5 "jchzalrnumimnmhp"
-- False
-- >>> day5 "haegwjzuvuyypxyu"
-- False
-- >>> day5 "dvszwmarrgswjxmb"
-- False
day5 s = atLeastThreeLettersAreVowels s
&& hasSomeDoubledLetter s
&& hasNoForbiddenSubstrings s
``````

To make the implementation readable, I defined many helper functions with long meaningful names, I gave examples clarifying their meaning, I used backticks to turn a binary function into an infix operator, making the code closer to an English sentence, and I used list comprehension syntax (`[... | x <- [1,2,3]]`) to define a list of similar expressions. Since the above is supposed to be extra readable, I guess I won't bother explaining it further.

Instead, I'll give a few more examples of list comprehensions. Here's how to try every combination of `x` and `y`:

``````> [x + y | x <- [100,2000]
, y <- [4,5,6]
]
[104,105,106,2004,2005,2006]
``````

Here's an easy way to generate all the triples of distinct numbers between 1 and 4:

``````>  [(x,y,z) | x <- [1..4]
, y <- [x+1..4]
, z <- [y+1..4]]
[(1,2,3),(1,2,4),(1,3,4),(2,3,4)]
``````

And here's a quick way to filter the results:

``````> [(x,y,z) | x <- [1..10]
, y <- [x..10]
, z <- [y..10]
, x*x + y*y == z*z
]
[(3,4,5),(6,8,10)]
``````
##### Day 5.5: non-determinism

This version of the puzzle is much harder, because instead of having to check whether particular substrings do or do not appear in the input, we have to find substrings which repeat later in the string. This is harder because whether the substring occurs later in the string depends both on the chosen substring and on the position in the original string in which the first copy of the substring appears.

The easiest way to solve this puzzle is probably to use back references in a perl-compatible regular expression. That's not very Haskell-specific, so instead, here's an approach which is more general, in the sense that it can find patterns even if the question has nothing to do with strings: non-determinism.

``````-- |
-- >>> repeatingPairs "aabbaabb"
-- [('a','a'),('a','b'),('b','b')]
repeatingPairs :: String -> [(Char,Char)]
repeatingPairs s = do
i <- [0 .. length s - 2]
(c1:c2:s') <- return (drop i s)
let pair = printf "%c%c" c1 c2
guard (pair `isInfixOf` s')
return (c1, c2)
``````

Here I am non-deterministically choosing the index `i` at which the desired `pair` of characters begins, and then I use `guard` to retro-actively eliminate the choices of `i` for which `pair` does not occur in the remainder of the string. You can think of it as trying all the possibilities at once, like a non-deterministic finite automaton. Except computers are deterministic, so while we are describing a non-deterministic computation, that computation is implemented by trying one possibility at a time and backtracking whenever a guard condition fails.

Here's another example in which `guard` eliminates the paths in which the chosen letter doesn't reoccur two characters later.

``````-- |
-- >>> repeatingLetters "aba-bcd-def"
repeatingLetters :: String -> [Char]
repeatingLetters s = do
i <- [0 .. length s - 1]
(c:_:c':_) <- return (drop i s)
guard (c == c')
return c
``````

The rest of the puzzle is easy: to check whether one of the patterns is present in the input string, list all the matching patterns and check if the list is non-empty. Thanks to laziness, the backtracking algorithm lists all the possibilities when we ask for them, but when we only ask if the list is non-empty, it behaves more efficiently by stopping after the first successful pattern is found.

``````-- |
-- >>> hasRepeatingPair "xyxy"
-- True
-- >>> hasRepeatingPair "aaa"
-- False
hasRepeatingPair :: String -> Bool
hasRepeatingPair = not . null . repeatingPairs

-- |
-- >>> hasRepeatingLetter "xyx"
-- True
-- >>> hasRepeatingLetter "xyz"
-- False
hasRepeatingLetter :: String -> Bool
hasRepeatingLetter = not . null . repeatingLetters

-- |
-- >>> day5_5 "qjhvhtzxzqqjkmpb"
-- True
-- >>> day5_5 "xxyxx"
-- True
-- >>> day5_5 "uurcxstgmygtbstg"
-- False
-- >>> day5_5 "ieodomkazucvgmuy"
-- False
day5_5 s = hasRepeatingPair s
&& hasRepeatingLetter s
``````