The goal of this post is to demonstrate that I'm not crazy. One of my latest discoveries involves a function called "inhabitants", but that's impossible, so says the textbooks, and I can't explain my discovery any further because I crumble under the mathematically rigorous objections. You know, the way researchers whose latest discovery involves a perpetual motion engine tend not to be taken very seriously.

So I took the time to read about those mathematically rigorous objections, and I'm happy to report that my conclusion is that I am *not* a crackpot. It's just a misunderstanding. A bit as if my perpetual motion engine was supposed to cycle more slowly at every turn, without ever stopping (hence the name) and without producing useful work. It may or may not be an interesting discovery, but it has nothing to do with what people usually call perpetual motion engines.

Now that I'm convinced of my own sanity, the hard part is to convince others. I'll practice with you, foreign and unknown readers, who are probably too far away to cause me trouble if I fail.

First of, I'd like to explain what people usually mean by "inhabitants". The type inhabitation problem is the task of determining, for any given type in a particular programming language, whether there exists an expression in that language which has that type. Such an expression is said to "inhabit" its type. It would be even more useful if an actual inhabitant could be returned as a witness, and that's precisely what Djinn does for Haskell's type system.

Some people have wondered whether Djinn could be extended to enumerate all the inhabitants of that type, instead of just one. Well, no. And here is a diagonalization argument to prove it.

First, it makes things much simpler if we cast everything in terms of integers. Types and expressions can both easily be encoded as integers, for example by interpreting their ascii representations as base 256 numbers. I'll use the notation ⟨`x`⟩ to denote the integer representing the type or expression `x`.

Even though our expressions now masquerade as integers, we should still be able to evaluate them.

The above code is the specification for `eval`, not its implementation. It is an incomplete specification, because we don't care about the behaviour of `eval` when it is given integers which don't represent expressions (rendering the left-hand side meaningless) or when these expressions are of incompatible types (rendering the right-hand side meaningless). In what follows, `eval` should always be used implicitly whenever the code appears to be applying an integer to another.

Next, by contradiction, suppose we did have a version of Djinn which enumerated all of the inhabitants of a type.

Instead of returning a list of all the results, `inhabitants` takes an extra integer `i` as a second argument and returns the `i`^{th} member of its enumeration. If there are less than `i` members in the enumeration, then we don't care what `inhabitants` returns.

Again, that's because the specification is incomplete. It is *very* incomplete. It doesn't even forbid `inhabitants` from returning expressions with the wrong type in addition to those who do. It only requires `inhabitants` not to miss any of the expressions which do have the requested type, because that's all I need in order to derive a contradiction. In fact, I only need to find a single type and a single expression of that type which `inhabitants` misses. The diagonal, for example.

This time I did not merely give a specification, but an explicit definition.

The function `diag` has type `Int -> Int`, so the specification of `inhabitants` requires that `inhabitants `⟨`Int -> Int`⟩` i == `⟨`diag`⟩ for some `i`. Both sides of this equation should evaluate to the same integer, ⟨`diag`⟩. And thus, the expressions represented by those identical integers should also be the same. In particular, the expressions should behave the same on all inputs, including input `i`. But on that input, `diag` made sure to behave differently, by returning a value that is off by one from the left-hand side's value! Thus, `inhabitants `⟨`Int -> Int`⟩ must have missed `diag`, as needed.

Now that we know that implementing `inhabitants` correctly is impossible, what's wrong with the following attempt?

Okay, so it's not a very serious attempt, but it does satisfy the specification. Remember how I said that I didn't need to forbid `inhabitants` from returning too many expressions? Well here I am, returning *all* expressions, all the time, regardless of the type requested. How could I miss any?

Let's walk through the argument again. There is some `diag` function which I'm supposed to miss. The specification requires `inhabitants `⟨`Int -> Int`⟩` i == `⟨`diag`⟩ for some `i`. And with this implementation, this happens when `i` is equal to ⟨`diag`⟩. At this point, `diag` is supposed to stab us in the back by adding one!

The foolish `diag`, however, was stabbed by our infiltrated agents first. It turns out that calling `inhabitants `⟨`Int -> Int`⟩` i` caused `diag i` not to terminate! Adding one to ⊥ has no effect, and `diag` failed to distinguish its behaviour from `inhabitants `⟨`Int -> Int`⟩` i`, avoiding a contradiction.

What this tells us, is that the original argument implicitly assumed that `inhabitants` only enumerated terminating expressions. In turn, this tells us that the type inhabitation problem implicitly requires inhabitants to terminate. And with hindsight, it's rather obvious than they should. Otherwise, the type inhabitation problem would be trivial: every single type would be inhabited!

And now, I can finally reveal what it is that makes my perpetual motion engine possible: the `inhabitants` function involved in my discovery enumerates expressions which, syntactically, have the correct type, but whether the enumerated expressions terminate or not is irrelevant to my discovery. The fact that I won't be able to produce useful work out of it doesn't mean it can't also be a nice theoretical device. But maybe I really should have called it a "perpetually slowing-down engine", to avoid confusion.

**update**: In other news, it is also impossible for Djinn to exist. But it's only impossible for versions of Djinn which cover strong enough type systems, like Haskell's. From this, I can conclude that Djinn covers a useful *fragment* of Haskell, and you can conclude that I don't use Djinn very often.

## 4 comments:

What's really going on is Haskell's type system is inconsistent because Bottom inhabits all the types. See David Turner's paper "total functional programming" which you can find on google easily. It is pretty readable.

If I may summarize my understanding of the paper for the other visitors, Turner is basically saying that like most code-related proofs, my diagonalization argument only works for total languages.

In order to cover a partial language like Haskell, I would need to add extra cases in order to cover ⊥. And the way to cover those cases is by requiring the purported "inhabitants" implementation to enumerate terminating expressions only.

What's really,

reallygoing on is that "inhabitants" is only hard if all inhabitants are required to terminate, like I said. Obviously, in total languages, they are.That being said, thanks for the reference! The first part is indeed quite readable, but the part on codata wasn't. That's unfortunate, because I really like dual things.

For the interested reader, sigfpe wrote a nice simplification of that part of the paper. Go read it now!

The most impressive implementation of "inhabitants" I have seen is miniKanren, a term-searching library which does a lot more than listing all terms of a given type. Their demo lists lambda-calculus terms, but can be easily adapted to search terms in whichever language you implement in it.

Post a Comment