Wednesday, October 29, 2008

Rant against beamer's shaky animation support (and solution)

I love LaTeX.

I love it because TeX is a programming language, and the other typesetting products aren't. Other that that, it's a horrible language.

For some reason, I don't follow the "separate presentation from contents" dogma. I guess it must be a bit strange coming from a guy who otherwise strongly advocates separation of concerns in all manners of code, but that's the way it is. Micromanaging the look of my LaTeX creations involves lots of anger against the program's not-quite-perfect sense of aesthetics, a few wasted night, and a handful of life-saving low-level TeX macros. Tonight's fight involved the latter.

beamer is an impressive LaTeX mod for creating slides. The output is a PDF document, so you can only generate a sequence of static slides. No animations, no sound. I mention it in case you expected slide programs to provide these ridiculous toys; me, I'm used to text-only slides and box-based ornaments. That's because I was using LaTeX to create my slides way before I knew about beamer, so I improvised with what I thought was possible. beamer, with its colors and rounded boxes and navigation and gradually uncovered slides, sure showed me how much I underestimated the extent of the possible.

Yet with all of its features, there's one thing I used to be able to do (easily?) with LaTeX alone, and which beamer doesn't offer: uncovering by substitution. By "uncovering by substitution", I mean a slide transition that doesn't just add or highlight the next element, but which actually changes an existing element. You know, like \alt and \temporal do.

...Ok, so \alt and \temporal actually are beamer commands, so I guess the feature is provided after all. But it's implementation is rather shaky. I mean literally, my slides shake when I use these commands. That's because the alternatives are of slightly different lengths, which slightly displaces the remainder of the text, which makes an unpleasing shaking effect if I use several alternatives in quick succession. Again, most people probably aren't bothered by this at all, but I'm the obsessive type who wants the "presentation" part of his presentations to be exactly right.

As you must have guessed by now: here's my version of \alt. It's called \switch, and just like \uncover, it doesn't shake. Furthermore, it's argument is a list of \case<+>{...} clauses which will appear one after the other. Or at the same time, if you use overlapping ranges. That's both more general and more useful than \alt and \temporal, which are limited to two and three non-overlapping ranges, respectively.

And now that the extent of the possible has been increased once again... a new level of time-wasting attention to details can begin.

Tuesday, October 21, 2008

Designing a mergeable filesystem structure

In my previous post, I have argued that it would be better if filesystems could be merged without human supervision. This could only happen if conflicts were impossible, but with today's filesystems, merge conflicts are actually inevitable! Let's design a filesystem without this flaw.

My study of conflicts has led me to the study of a family of data-structures which I call commutative types. These structures are abstract data types, in the sense that you can only modify them through their public methods. What distinguishes commutative types is the additional restriction that the internal state of the object must not depend on the sequence of operations which have been performed on it so far, but on the set of operations which have been performed. In other words, the order in which the operations are applied must have no impact.

For example, consider an object whose internal state is a single integer. If the only update operation provided were increment() and decrement(), then this object would be commutative. This is because incrementing and then decrementing yields the same result as decrementing first, then incrementing (we're ignoring overflows for now). Similarly, if double() and triple() were the only operations provided, then the object would again be commutative. On the other hand, if the same object allowed both increment() and double(), then that object would not be commutative.

x + 1 − 1 = x − 1 + 1
x × 2 × 3 = x × 3 × 2
(x + 1) × 2 ≠ (x × 2) + 1

Integers are well studied structures for which commutative operations are known, but in order to represent a filesystem, we will need a commutative object with a much more complicated internal state. Let's design one! What are the requirements?

Well, here's an easy one. Suppose we write() a value in a particular file. If we later read() this same file, then we ought to obtain the value we wrote in it. Now, if we read() the file before we write to it, there's no way we're going to read the same value. Does this mean that any filesystem supporting both read() and write() can't be commutative?

...

Yes and no! It was a trick question. Remember, we're only concerned about the internal state of the object here. Since read() doesn't change this state, it actually commutes with anything. In particular, read() commutes with write(). Unfortunately, this still doesn't make the filesystem a commutative object, because write() doesn't commute with itself...

If we write("foo") to a particular file, then write("bar") to it, then the final state definitely isn't going to be the same as if "bar" was written first. So our commutative filesystem can't even have a write() method.

That's a pretty drastic restriction! We could, however, have a write_xor() method. Instead of replacing the file's contents with its argument, write_xor() would bitwise exclusive-or the content of the file with it. This looks weird and useless, but it would actually produce an okay filesystem. You could still write any value you wanted to the file, you would just need to zero-out the file first, by xor-ing the file's contents with itself. Writing a shorter value in this fashion would leave a padding of zeroes at the end, but we could introduce an end-of-file character to alleviate the problem. All in all, I'm not too dissatisfied with this minimalist interface. And it commutes! You will never see a merge conflict again!

...in theory. But in practice, the result of a merge will be an unreadable mess of binary characters, which is even worse than a conflict. I'm not giving up yet, as I've got plenty of other ideas, but let's keep this so-so idea for now. Because there's another challenge to solve: directories!

If the two branches we want to merge both define the same filename, but one as a file and the other as a directory, what should we do? Should one take precedence over the other? That would work, and it would probably commute, but I have a better solution. What if we kept both? With today's filesystems, every directory's got two names: /foo/bar and /foo/bar/. The first form is usually favoured, out of habit more than anything else, I think. Well, let's not let the choices of the past prevent us from picking the choice of the future today. I'll allow /foo/ to contain both a file named bar and a directory named bar/! And my trees will commute at last!!!

Wait, I do sound more crazy than usual when I say it like that.

Sunday, October 19, 2008

Rant against merge conflicts (and a solution strategy)

Version control used to be all about versioning. I never used RCS, but I bet the killer command was something like "rcs commit", to create a new version.

Projects soon grew bigger and version control began to be all about collaboration. The killer command was then "cvs checkout", to get the source and begin collaborating.

Then projects grew even more complicated and it was no longer sufficient just to version text, it was now important to version binaries and directories and symlinks and permissions and ancestry. As version control began to resemble filesystems, the magic occurred in commands like "svn add".

More recently, different distributed version control made forking much easier, and the killer command is now "git branch". But the future of version control, I believe, lies in the one command which never worked quite right: "git merge".

Because of merge conflicts, merging is the only version control command which requires human supervision. Removing this constraint would allow other programs to perform merges, a seemingly small advantage which, in the long run, could actually lead to some very useful and unexpected applications... I'm thinking about unionfs extensions, but there could be many others. Never underestimate the creativity of a million strangers!

I happen to be somewhat of an expert on conflicts, or at least I wish I was. For my master's thesis, I'm currently tackling weaving conflicts in aspect-oriented applications. My conclusion so far is that conflicts are inevitable, unless we work with very restricted structures which are guaranteed to merge properly. I'm trying to find suitably-restricted structures which would be expressive enough to program with.

I think that weaving and merging are very closely related operations. The first attempts to combine code, while the second attempts to combine filesystems. In both cases, unfortunately, the parts to be combined could turn out to be utterly incompatible. Current aspect-oriented languages tend to proceed with weaving anyway, postponing the disaster until runtime. Current version control systems take the opposite approach, refusing to merge until the user has fixed all of the conflicts. But I think it should be the other way around! Incorrect programs ought to fail as early as possible, possibly during the compilation or weaving phase. And for scripting purposes, as I've mentioned, I'd rather have merging succeed all the time.

For my aspects-related research, I've found several restricted structures which combine fine all the time, but filesystems can't be among them. Thus if we do insist on combining filesystems, I claim that conflicts are inevitable. That's why I don't insist on combining filesystems. In fact, you might remember how a few months back, I wished for a framework customizing the relationship between the git repository and the filesystem. This is precisely what I'm arguing for once again! Let's use one of my magic restricted structures inside the repository, so that merges succeed all the time. Then if users want to use the repository to track their filesystem, their logs, their cats, or whatever, let them write a translation between the repository and the thing they want to track. Now that would be doing one thing and doing it right, rather than letting filesystem concerns creep into the repository's code base.

What do you think? I am being more crazy than usual?

Saturday, October 18, 2008

Rant against choices (and solution)

This is a mashup between Barry Schwartz's paradox of choice talk and sigfpe's presentation of ordinals as a survival game.

Apparently, math tells us that there are three kinds of totally ordered single player games.

  • ∅, where we've got no choices at all. if society gives you this choice, blame society.
  • successor ordinals such as 42 or ω + 1, where you've got many choices, including one which is obviously better. if society gives you this choice, pick the best move and smile.
  • limit ordinals such as &omega, where no matter which choice you make, you could have made a better one. if society gives you this choice, pick as best as you can, then blame yourself for not choosing better.

Barry Schwartz noticed that consumerism yields societies with lots of choices of the third kind, which makes people unhappy. Perhaps his book offers practical solutions, but in his talk he just seemed nostalgic of the days when society gave him choices of the second, perhaps even of the first type. Well, I've got a message of hope for him.

If ordinals are any indication of the evolution of society (!), then we're going to experience an alternation of successor and limit situations, so we're going to be fine half of the time. Society is not going to revert to the old days of the finite ordinals, when there were few enough choices to make correct decisions. Instead, I predict that we're going to have even more choices, but that this time it's going to be pretty obvious which one to make. ω + 1: one step above the competition. Would make a great slogan. Soon after, said competition is going to adopt the new technology and we'll reach ω + ω, another disappointing limit ordinal. Then a new technology will project us into ω + ω + 1, and the first few consumers will experience satisfaction once again.

In short, the secret to happiness is not to have low expectations, as Barry suggests, but to be an early adopter!