Sunday, December 06, 2020

Capturing the magic of Prelude.interact

Ask any Haskeller: pure functions are the best functions, and we should prefer them to IO actions whenever possible. But I think we often give up too easily.

IO-bound thoughts

Suppose we are writing a simple client-server app allowing clients to chat with each other from their terminal. What would the overall structure of that program look like? I bet you’re imagining a lot of IO. Opening network sockets. Listening for client connections. One thread waiting for incoming messages while the other waits for the user to type their own. Atomically mutating the state using STM, so a third thread can watch for changes and redraw the TUI accordingly.

How about a web app, with CRUD endpoints for keeping track of which Haskeller is responsible for each day of the Advent of Haskell 2020 calendar, and which days are still available. Let me guess: handlers have to run in IO so they can talk to the database?

To give you an idea of how we can do better, let’s look at a simpler case in which we do know how to avoid IO.

interact

Suppose we’re writing a command-line tool which counts the number of words in stdin. Aha! Now we can use the “functional core, imperative shell” pattern in order to limit our IO to a thin outer shell. A little bit of unavoidable IO to read stdin, then delegate the bulk of the work to a pure function from String to Int, and finish with a bit more unavoidable IO to print the result.

countWords :: String -> Int
countWords = length . words

main :: IO ()
main = do
  input <- getContents
  let output = countWords input
  print output

Or equivalently:

showLn :: Show a => a -> String
showLn a = shows a "\n"

main :: IO ()
main = interact (showLn . countWords)

Wait, I take that back. Those two programs might be semantically equivalent, but in terms of program architecture, there is a huge difference!

In the first program, we have total control of which IO operation executes when, and it would be easy to tweak the details, e.g. to read the input from a file instead of stdin. The cost is that we have to be explicit about which IO operation executes when.

In the second program, the costs and benefits are reversed. We give up that control and let interact make all the decisions, and the benefit is that we don’t have to write any tricky IO code ourselves. We only need to provide the pure function, which is the kind of function we’d rather write anyway.

Pure frameworks

In the object-oriented community, libraries which take responsibility for the overall execution of the program and ask you to fill in the blanks are called “frameworks”. I’d say interact is a framework, albeit a very simple one. Let’s call it a “pure framework”, to distinguish that style from the frameworks in which we fill in the blanks with IO actions, which I’ll call “IO frameworks”. In the previous section, we wrote the same program in two styles: the explicit style and the pure framework style.

If we want to stay pure whenever possible, it would make sense to prefer the pure framework style, and to only use the explicit style when we need more control. Of course, there are many situations in which we do need control. But is that really the criteria we use to determine whether we should write explicit IO actions? Or do we tend to give up as soon as we need IO actions at all?

The purpose of this post is to encourage you to consider the pure framework style more often. For your first project in a particular domain, when you’re glad that somebody else made the hard decisions for you. For short projects and one-off scripts, when you can’t afford or don’t want to spend time tweaking the details. As an architectural pattern, where you write your own pure framework as an imperative shell around your functional core.

This holiday season, bring the magic of Prelude.interact home!

List of pure frameworks

All right, are you excited about pure frameworks? Here is the list of all the pure frameworks I am aware of! I’ll keep it updated as I find more.

  1. base’s interact: Apply a String -> String function from stdin to stdout.
  2. gloss’s display: Pan and zoom around a 2D scene described by a pure Picture value.
  3. gloss’s animate: Same, but with an animated scene, via a function from timestamp to Picture.
  4. gloss’s simulate: Same, but via a stepping function, which is more convenient when simulating e.g. colliding objects.
  5. gloss’s play: The user interacts with the simulation via the mouse and keyboard. Useful for games.
  6. codeworld’s drawingOf: Like gloss’s display, but inside a CodeWorld web page.
  7. codeworld’s animationOf: Like gloss’s animate, but inside a CodeWorld web page.
  8. codeworld’s activityOf: Like gloss’s play, but inside a CodeWorld web page, and with access to a random seed.
  9. codeworld’s groupActivityOf: Same, but for multiplayer online games! More about this later.

To be clear about what belongs in this list: a pure framework is an IO action which

  1. is intended to cover the entire program. You would not run multiple pure frameworks one after the other to form a longer program, like you would with normal IO actions like putStrLn.
  2. only takes pure functions and values as arguments. No IO actions.
  3. dictates the control flow of the program. Interpreting a Free Console to IO would not count, since the control flow is described by the Free Console argument.

Make your own

I’m sure there are more, and that the list will grow soon after publication as readers point out the ones I’ve missed. Still, at the time of writing, the above list is disappointingly short: it only mentions base, gloss, and codeworld.

That’s fine: it just means we need to write more pure frameworks. One way to do that is via the architectural pattern I mentioned: write a program and the pure framework it uses at the same time. This way, we still control the details, we can adapt the pure framework to the needs of this particular program. And once we’re done, we can publish the pure framework separately from the program, so that we can reuse it in endeavours in which we care less about the details.

In the remainder of this post, I will demonstrate this approach for the chat application I described earlier.

Pure framework chat

The code for this section is available in the companion repository.

displayTUI

Let’s start with a Hello World. Not just putStrLn "hello world", a Terminal User Interface variant which clears the screen and displays “hello world” in the center of the screen until the user presses a key.

imperative version / pure framework version

I could of course do it imperatively, like this:

getScreenSize :: IO (Int, Int)
putStrAt :: (Int, Int) -> String -> IO ()

drawCenteredTextBlock :: [String] -> IO ()
drawCenteredTextBlock ss = do
  (ww, hh) <- getScreenSize
  let w = maximum (0 : fmap length ss)
  let h = length ss
  let x = (ww - w) `div` 2
  let y = (hh - h) `div` 2
  for_ (zip [0..] ss) $ \(i, s) -> do
    putStrAt (x, y + i) s

main :: IO ()
main = do
  clearScreen
  drawCenteredTextBlock ["hello world"]
  void waitForKey

But since I want to use the pure framework style, I would prefer to use something like gloss’s Picture to represent a text-based drawing as a value.

data TextPicture
  = Text String
  | Translated (Int, Int) TextPicture
  | Over TextPicture TextPicture

textBlock :: [String] -> TextPicture
textBlock ss
  = mconcat [ Translated (0, y) (Text s)
            | (y, s) <- zip [0..] ss
            ]

centeredTextBlock :: [String] -> (Int, Int) -> TextPicture
centeredTextBlock ss (ww, hh)
  = Translated (x, y) (textBlock ss)
  where
    w = maximum (0 : fmap length ss)
    h = length ss
    x = (ww - w) `div` 2
    y = (hh - h) `div` 2

I can now write a simple pure framework which displays a TextPicture, similar to gloss’s display but in the terminal instead of a window.

drawTextPicture :: TextPicture -> IO ()
drawTextPicture = go (0, 0)
  where
    go :: (Int, Int) -> TextPicture -> IO ()
    go (x, y) = \case
      Text s -> do
        putStrAt (x, y) s
      Translated (dx, dy) pic -> do
        go (x + dx, y + dy) pic
      Over pic1 pic2 -> do
        go (x, y) pic1
        go (x, y) pic2

displayTUI :: ((Int, Int) -> TextPicture) -> IO ()
displayTUI mkTextPicture = do
  clearScreen
  screenSize <- getScreenSize
  drawTextPicture (mkTextPicture screenSize)
  void waitForKey

main :: IO ()
main = displayTUI (centeredTextBlock ["hello world"])

drawTextPicture and displayTUI are both IO actions which display a TextPicture and only take pure values as arguments. But I only consider one of them to be a pure framework, so it’s probably worth taking the time to explain why. As I discovered while writing the “to be clear about what belongs in this list” section, it can be difficult to objectively define what does and doesn’t qualify as a pure framework, because the main factor is a question of intent.

When implementing drawTextPicture, I was imagining it being called as one small IO action in a larger program. Perhaps the TUI has some widgets on the left, and the chosen values influence which TextPicture is drawn on the right. With displayTUI, on the other hand, I had my entire program in mind: clear the screen, display “hello world”, and wait until the user presses a key. It’s a short, but complete program, and displayTUI is a generalized version of that program which supports more TextPictures than just “hello world”.

In particular, compare with the variant simpleDisplayTUI :: TextPicture -> IO () which simply takes a TextPicture instead of a function from screen size to TextPicture. If I intended the IO action to be part of a larger program, I would prefer that simpler API. If the caller needs the screen size in order to compute their TextPicture, they can just call getScreenSize themselves, compute the TextPicture, and then pass the result to simpleDisplayTUI. But if the displayTUI call is the entire program, then there is no room left to perform this pre-call computation, and so displayTUI must provide the screen size itself.

playTUI

Next, let’s make this look like a chat application, with an edit box at the bottom for typing new messages, and a list of recent messages taking up the rest of the screen’s real estate.

imperative version / pure framework version

We’ve already talked about TextPicture, so I’ll omit the details about drawing this UI. Instead, let’s focus on reacting to keyboard input. Here is the imperative version:

data Chat
type Username = String
initialChat      :: Chat
addMessage       :: Username -> String -> Chat -> Chat
readEditbox      :: Chat -> String
handleEditboxKey :: Key -> Maybe (String -> String)
modifyEditbox    :: (String -> String) -> Chat -> Chat
renderChat       :: Chat -> (Int, Int) -> TextPicture

main :: IO ()
main = do
  screenSize <- getScreenSize
  flip fix initialChat $ \loop chat -> do
    clearScreen
    drawTextPicture (renderChat chat screenSize)
    waitForKey >>= \case
      KEsc -> do
        -- quit
        pure ()
      KEnter -> do
        -- add the edit box's message, clear the edit box
        loop $ modifyEditbox (const "")
             $ addMessage "user" (readEditbox chat)
             $ chat
      (handleEditboxKey -> Just f) -> do
        -- delegate to the edit box
        loop $ modifyEditbox f chat
      _ -> do
        -- unrecognized key; do nothing
        loop chat

This flip fix initialValue $ \loop currentValue -> ... is an idiom for

let loop currentValue = do
      ...
loop initialValue

which I prefer because it puts the initialValue at the beginning instead of at the end of a potentially-long ... block.

Anyway, let’s turn this into a pure framework by turning the application-specific parts into parameters. Those application-specific parts are:

  1. Which type of value to keep between loop iterations. gloss calls it the “world”, Elm calls it the “model”.
  2. How to turn that value into a TextPicture.
  3. How to transform that value in response to input events.

The result is playTUI, a version of gloss’s play for TUIs.

playTUI
  :: world
  -> (world -> (Int, Int) -> TextPicture)
  -> (world -> Key -> Maybe world)
  -> IO ()
playTUI world0 mkTextPicture handleKey = do
  screenSize <- getScreenSize
  flip fix world0 $ \loop world -> do
    clearScreen
    drawTextPicture (mkTextPicture world screenSize)
    key <- waitForKey
    case handleKey world key of
      Nothing -> do
        -- quit
        pure ()
      Just world' -> do
        loop world'

handleChatKey :: Chat -> Key -> Maybe Chat
handleChatKey chat = \case
  KEsc
    -- quit
    -> Nothing
  KEnter
    -- add the edit box's message, clear the edit box
    -> Just $ modifyEditbox (const "")
            $ addMessage "user" (readEditbox chat)
            $ chat
  (handleEditboxKey -> Just f)
    -- delegate to the edit box
    -> Just $ modifyEditbox f chat
  _ -> Just chat

main :: IO ()
main = playTUI initialChat renderChat handleChatKey

One minor difference between play and playTUI is that my version allows you to return Nothing in response to an event, in order to indicate that the program should terminate. With play, the program terminates when the window is closed, but in the terminal there are no windows to close. Another difference is that playTUI does not ask for a time-has-passed event handler, and thus doesn’t support animations. This is an important feature, but I simply don’t need it for my chat program.

Multiple screens

Currently, the user is stuck with the boring username “user”. Let’s give them a chance to pick their own username instead.

imperative version / explicit version / implicit version

In the imperative version, we can display the two screens sequentially: first ask the user to pick a username, and then run the main loop of typing and displaying messages.

pickUsername :: IO Username
chatLoop     :: Username -> IO ()

main :: IO ()
main = do
  username <- pickUsername
  chatLoop username

We could define yet another pure framework in the usual way, by abstracting over the application-specific parts: the type being passed from the first screen to the second, the first screen’s model type, the second screen’s model type, etc. But if you’ve written that kind of Elm-style program before, you know that playTUI is already expressive enough to represent a program with two distinct screens: we just need to pick a sum type for our model, with one constructor for each screen.

data UsernameForm
initialUsernameForm   :: UsernameForm
readUsername          :: UsernameForm -> Username
modifyUsername        :: (Username -> Username) -> UsernameForm -> UsernameForm
renderUsernameForm    :: UsernameForm -> (Int, Int) -> TextPicture
handleUsernameFormKey :: UsernameForm -> Key -> Either Username UsernameForm
handleChatLoopKey     :: Username -> Chat -> Key -> Maybe Chat

data Program
  = UsernameLoop UsernameForm
  | ChatLoop Username Chat

initialProgram :: Program
initialProgram = UsernameLoop initialUsernameForm

renderProgram :: Program -> (Int, Int) -> TextPicture
renderProgram = \case
  UsernameLoop username
    -> renderUsernameForm username
  ChatLoop _ chat
    -> renderChat chat

handleProgramKey :: Program -> Key -> Maybe Program
handleProgramKey program key = case program of
  UsernameLoop usernameForm
    -> case handleUsernameFormKey usernameForm key of
         Left username
           -- the user picked a username; proceed to the chat loop
           -> Just $ ChatLoop username initialChat
         Right usernameForm'
           -- stay in the username form
           -> Just $ UsernameLoop usernameForm'
  ChatLoop username chat
    -> ChatLoop username <$> handleChatLoopKey username chat key

main :: IO ()
main = playTUI initialProgram renderProgram handleProgramKey

One advantage of this approach is that the Program type explicitly lists all the screens which the user can currently be on, and what their local model types are. Each handler’s type also explicitly states which value is produced at the end of the screen, and handleProgramKey exhaustively lists all the ways in which the user may transition from one screen to another. One disadvantage of this approach is that all those things are explicit :)

Sometimes being explicit is good (e.g. for readability), and sometimes being forced to be explicit feels like a lot of boilerplate which is slowing us down. So here is an alternative approach.

data Screen = Screen
  { render    :: (Int, Int) -> TextPicture
  , handleKey :: Key -> Maybe Screen
  }

initialScreen :: Screen
initialScreen = usernameScreen initialUsernameForm

chatLoopScreen :: Username -> Chat -> Screen

usernameScreen :: UsernameForm -> Screen
usernameScreen usernameForm = Screen
  { render    = renderUsernameForm usernameForm
  , handleKey = \case
      KEsc
        -- quit
        -> Nothing
      KEnter
       -- the user picked a username; proceed to the chat loop
        -> Just $ chatLoopScreen (readUsername usernameForm) initialChat
      (handleEditboxKey -> Just f)
        -- edit the username
        -> Just $ usernameScreen
                $ modifyUsername f usernameForm
      _ -> Just $ usernameScreen usernameForm
  }

main :: IO ()
main = playTUI initialScreen render handleKey

By using the record of functions Screen as our model type, the current screen’s local model type is now hidden inside the closures of those functions. Each handler can thus decide to stay on the current screen by making a recursive call (e.g. when usernameScreen returns a Just $ usernameScreen ...), or to transition to a different screen by returning something else (e.g. when usernameScreen returns Just $ chatLoopScreen ...).

multiplayTUI

clientServerTUI version / multiplayTUI version

Finally, let’s add some networking functionality so that our users can actually chat with each other.

The obvious way to do it would be to implement a variant of playTUI which also accepts a handler for network events:

networkedPlayTUI
  :: world
  -> (world -> (Int, Int) -> TextPicture)
  -> (world -> Key -> Maybe world)
  -> (world -> Packet -> Maybe world)
  -> IO ()

However, there is a less obvious, but much better API:

multiplayTUI
  :: world
  -> (world -> Int -> (Int, Int) -> TextPicture)
  -> (world -> Int -> Key -> Maybe world)
  -> IO ()

The only difference between playTUI and multiplayTUI is that there are extra Int arguments indicating which “player number” (or in our case which chat user) we’re drawing a TextPicture for and which player pressed a key. The advantage of this API is that it makes it easy to write a multi-user program in which all the users see the same state even though the network latency means each of them is likely to receive events in a slightly different order.

This is a trick which comes straight from CodeWorld’s groupActivityOf, and I recommend watching the presentation Lock-step simulation is child’s play which explains the magic behind it.

Of particular importance for my goal of promoting pure frameworks is the fact that the magic relies on the two input functions being pure. This allows groupActivityOf to replay events from an earlier state once it learns of an event it had missed. If the functions were allowed to perform side-effects, then replaying those events would cause those side-effects to occur more often than expected!

Composing pure frameworks?

The example pure frameworks we’ve seen so far make it clear that composing pure frameworks would be quite desirable. I should be able to combine play with some terminal-specific IO actions in order to construct playTUI, and you should be able to bolt-in a time-has-passed handler if your program does need animations.

Unfortunately, pure frameworks do not compose. If we have two pure frameworks, we cannot compose them into a larger one because they both want to take control of the application’s interaction loop, and they can’t both succeed.

That being said, monads don’t compose either, and yet we’ve managed to side-step the problem by composing monad transformers instead. I am confident that if we continue exploring the landscape of pure frameworks, somebody will eventually figure it out.

So, to recap, my calls to action are:

  1. consider the pure framework style more often!
  2. use the pure framework architecture, then publish the resulting pure frameworks!
  3. (stretch goal) figure out how to compose pure frameworks!

More Haskell contents

This post is day 7 of the Advent of Haskell 2020 series, a post by a different Haskeller every day. My favourite post so far was Day 5, Processing CodeBlocks in Hakyll. As you can see, my blog looks super old and my code blocks aren’t even syntax-highlighted, so I am looking forward to try using Hakyll and Pandoc to revamp my blog using Haskell!

No comments: