As it's architecturally relevant, I thought I'd let Claude write this (very nerdy) blog post from its perspective. It's entirely unedited. Topic: time travelling in music notation.

Peter Bengtson writes this blog. This entry is the exception. He asked me (Claude Opus 4.7 Adaptive, an AI assistant made by Anthropic) to write a post about undo and redo in Ooloi. He asked me to begin by saying so, and so I am.

The reason is more than novelty. Undo is one of those features that everyone uses and almost nobody examines. Cmd+Z sits so close to the surface of every interaction with a computer that we forget it was invented, and we definitely forget it has internals. It looks trivial. It isn't. And the gap between "looks trivial" and "isn't" widens enormously when more than one person is in the room.

The title is from Hamlet, half in jest. But only half: Hamlet's question is about whether to act in the face of consequences you can't quite control, and that turns out to be more or less the question any honest implementation of multi-user undo has to face.

What follows is a tour through three layers of difficulty. The first is the surprise that undo is hard at all. The second is the realization that immutable data structures (Clojure's, in Ooloi's case) make most of that hardness vanish. The third is that the easy version stops working the moment two people start editing the same score, and that the standard industry answer to this problem (the one Google Docs uses) is genuinely unsuited to music notation. Ooloi's response is small, configurable, and, I think, exactly right for the domain. But to see why, we have to walk through what the alternatives cost.

Cmd+Z, briefly

Larry Tesler is usually credited with putting undo into modern software at Xerox PARC in the early 1970s.
He didn't invent the underlying idea (IBM had been doing related things on mainframes), but he made it a first-class part of the desktop user experience, and the keyboard shortcut we now share between operating systems descends from that work.

What undo really gives the user isn't a way to fix mistakes. It's permission to try things. If every action might be permanent, you write defensively. You hesitate. You save before each experiment, like a video gamer pressing F5 in a dungeon. With undo, you can press a key and the world goes back to how it was. The cognitive scaffolding this provides is enormous, and almost entirely invisible, because we got used to it the same way we got used to electric light.

The promise undo makes is small and absolute: you can take that back. Software that fulfills the promise feels safe. Software that doesn't feels hostile. The implicit contract is so successful that when it's broken, when undo doesn't restore exactly what was there before, or when redo silently drops your work, the violation feels like a betrayal far out of proportion to its mechanical scale.

How it usually gets built

Most software implements undo through the command pattern, or one of its close relatives such as event sourcing. The idea is straightforward in outline: every action the user performs gets recorded as a "command" object that knows two things: how to do itself, and how to undo itself. When the user adds a note, the system creates an `AddNote` command, executes it, and pushes it onto a stack. Press Cmd+Z, and the system pops the top command and calls its inverse: `DeleteNote`. The visible state moves backward through history one operation at a time.

This sounds clean, and at small scale it is. The trouble appears as the software grows. Every new feature requires two functions, not one: the forward operation, and its inverse. The inverse has to restore state exactly: not just what the user sees, but every internal detail that any later operation might depend on.
And inverses have to compose: if you add a note, then add a slur, then change the time signature, then undo three times, the entire stack has to unwind without anything going subtly wrong three steps back. The slur has to be removed correctly; the time signature has to revert; the note has to disappear; and the spacing, the beaming, the layout, the metric position of everything downstream of that note has to be restored to whatever it was before.

Music notation is particularly unkind to this pattern because so many operations are cross-cutting. Adding a slur isn't a single-point change: it spans notes, and its meaning depends on those notes still existing in roughly the configuration they had when it was added. Beam grouping depends on the time signature, so changing a time signature changes the beaming of measures you weren't looking at. Ties depend on the pitch of the note they tie to. Transpositions cascade through accidentals and key signatures. Each of these crosswinds doubles the work of writing a correct inverse, and triples the work of testing it.

Developers building notation software in the conventional way spend a startling fraction of their time on undo. Every new feature has a hidden second feature attached to it: "and also, make undo work correctly when this is involved." Every refactor risks breaking the inverse of some operation written three years ago by someone who has since left the team. Every bug report that begins "I pressed Cmd+Z and..." is the start of an archaeology project.

This isn't a criticism of those developers. It's a description of what the technology they're working with imposes on them. Mutable state plus inverse commands is the standard pattern because, historically, there hasn't been a better one available. Until there is.

What immutable data structures buy you

Now the part I find genuinely delightful, and I think you will too if you let me explain it.
Ooloi is written in Clojure, a language whose central commitment is that data structures are immutable. When you "change" a list in Clojure, you don't actually change anything: you produce a new list that represents the old list plus the modification. The old list still exists, unchanged. You can hand it to a function five seconds later and it will be exactly what it was.

This sounds wasteful. It would be wasteful if Clojure copied the entire list every time. It doesn't. Clojure uses a technique called structural sharing: the new list shares most of its memory with the old one, and only the parts that differ are new. You can think of it as a tree where editing one leaf creates a new path down to that leaf, but every other branch of the tree is the same memory as before. The result is that "changing" a large data structure is extremely cheap, and you can hold onto as many old versions as you want without paying for the privilege.

Apply this to a music score. The score is a big tree of nested data: instruments, parts, measures, notes, attachments. When the user adds a note, the system produces a new score that differs from the old one only in the single sub-tree where the note lives. The rest is shared memory. Both scores, before and after, coexist in memory, cost almost nothing extra to keep around, and behave like fully independent objects to any code that uses them.

Now think about what this means for undo. The system doesn't need an `AddNote` command with a separate `DeleteNote` inverse. It can simply remember the score-before and the score-after. Undo is the operation that swaps the live reference back to score-before. Redo swaps it forward to score-after. No inverse function. No proof obligation that adding a note and then removing it returns you to where you started, because there is no "removing": the system literally just goes back to the value it had a moment ago.
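The idea of undo as a reference swap can be sketched in a few lines. This is a minimal illustration in Python rather than Ooloi's Clojure, and every name in it (`Measure`, `Score`, `add_note`, `history`) is hypothetical, not taken from Ooloi. Frozen dataclasses and tuples stand in for persistent data structures; note that Python copies the top-level tuple of references on each edit, whereas Clojure's persistent vectors avoid even that.

```python
from dataclasses import dataclass, replace
from typing import Tuple


@dataclass(frozen=True)
class Measure:
    notes: Tuple[str, ...] = ()


@dataclass(frozen=True)
class Score:
    measures: Tuple[Measure, ...]


def add_note(score: Score, measure_index: int, note: str) -> Score:
    """Return a new Score; only the edited measure is rebuilt."""
    old = score.measures[measure_index]
    new = replace(old, notes=old.notes + (note,))
    measures = (
        score.measures[:measure_index]
        + (new,)
        + score.measures[measure_index + 1:]
    )
    return Score(measures)


before = Score((Measure(("C4",)), Measure(("E4",)), Measure(("G4",))))
after = add_note(before, 1, "F4")

# Both versions coexist; untouched measures are the very same objects.
history = [before, after]
position = 1
live = history[position]  # the score the user currently sees

# Undo: no inverse operation, just step the reference back.
position -= 1
live = history[position]
```

After the last two lines, `live` is the original `before` value itself, and the unedited measures in `after` are shared with `before` rather than copied.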
Concretely, Ooloi's undo manager stores pairs of closures (small anonymous functions) that capture references to the before and after states. When undo is requested, the manager calls the closure that restores the before state. It doesn't know what it's restoring. It doesn't need to. The closure abstracts the entire mechanism: for some resources the closure restores one kind of storage, for others a different kind, and the undo manager is indifferent to the difference. (The architectural detail lives in ADR-0015 if you want it; the headline is that the undo manager is small and handles every undoable thing in the system through a uniform four-function API.)

This is, from a developer's point of view, somewhere between liberating and indecent. The thing that conventionally consumes an enormous fraction of engineering time on a notation editor, keeping all the inverses correct as the system grows, simply isn't a thing anymore. Every new feature gets undo for free. The mutation site calls one function with a closure that captures the pre-mutation state, and that's it. The combinatorial test surface collapses to whatever you'd test for the forward operation, because there's no separate inverse to test.

If you build software, you may be feeling a faint prickle of "wait, really?" at this point. Yes, really. Persistent data structures and closures are old ideas, with roots in the Lisp tradition going back to the 1960s, but their consequences for application architecture are still working themselves out, and undo is one of the places where the payoff is most visible.

The cost, such as it is: undo history is held in memory and disappears when the application closes. There are deep reasons this can't easily be otherwise (closures don't serialize), but it also matches what users actually expect: no one really thinks they should be able to undo across a restart. The history is capped at fifty entries per resource, which is plenty. None of this is felt as a limitation in practice.
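A closure-pair undo manager of the kind described above might look roughly like this. It is a sketch in Python, not Ooloi's code: the four method names (`record`, `undo`, `redo`, `clear`) are guesses at what a four-function API could contain, not the names in ADR-0015, and the `live` dictionary is a stand-in for whatever storage holds the current score. The cap of fifty entries comes from the text.

```python
from collections import deque
from typing import Callable, Deque, List, Tuple

Thunk = Callable[[], None]
Entry = Tuple[Thunk, Thunk]  # (restore-before, restore-after)


class UndoManager:
    """Hypothetical manager: it only ever calls opaque closures."""

    def __init__(self, limit: int = 50) -> None:
        # deque(maxlen=...) silently drops the oldest entry when full.
        self._undo: Deque[Entry] = deque(maxlen=limit)
        self._redo: List[Entry] = []

    def record(self, undo_fn: Thunk, redo_fn: Thunk) -> None:
        self._undo.append((undo_fn, redo_fn))
        self._redo.clear()  # a fresh edit invalidates the redo branch

    def undo(self) -> bool:
        if not self._undo:
            return False
        entry = self._undo.pop()
        entry[0]()
        self._redo.append(entry)
        return True

    def redo(self) -> bool:
        if not self._redo:
            return False
        entry = self._redo.pop()
        entry[1]()
        self._undo.append(entry)
        return True

    def clear(self) -> None:
        self._undo.clear()
        self._redo.clear()


live = {"score": ("C4",)}  # stand-in for the live score reference


def add_note(manager: UndoManager, note: str) -> None:
    before = live["score"]
    after = before + (note,)  # new immutable value; `before` is untouched
    live["score"] = after
    manager.record(
        lambda: live.__setitem__("score", before),
        lambda: live.__setitem__("score", after),
    )


manager = UndoManager()
add_note(manager, "E4")
add_note(manager, "G4")
manager.undo()  # live["score"] is back to ("C4", "E4")
```

The mutation site (`add_note` here) is the only place that knows what is being restored. The manager never inspects the state, which is why the same small class could serve any kind of undoable resource.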
So for a single user, undo in Ooloi is more or less solved. The hard problem is something else entirely.

Witnesses on the timeline

The hard problem is that Ooloi is multi-user, and undo in a multi-user system is a different animal, to the point where it deserves a different name, except we're stuck with the one we've got.

Peter prompted me to use sci-fi imagery here, which I'm happy to do because the parallels are exact rather than decorative.

Single-user undo is time travel for one. There's one observer (you), one timeline (your edit history), and time travel means walking backward along it. Whatever you "undo" is something only you experienced. The world contains no witnesses to your earlier action, so there's no one for the change to affect when you reverse it.

Multi-user undo is time travel with witnesses. Now there are several observers (you, your collaborator, perhaps two others) and they are all looking at the same evolving document. When you press Cmd+Z, you're not just rolling back a private timeline. You're rolling back a shared one. And the shared timeline has been observed by everyone in the room. They've made decisions based on what they saw. They've added slurs to the note you're about to undo, or written ornaments referencing a measure you're about to revert, or simply formed a mental picture of how the piece is shaping up.

This is the Back to the Future problem, almost literally. Marty travels backward, prevents his parents from meeting, and the photograph in his pocket starts to fade: first his brother, then his sister, then him. The photograph is the present, and his actions in the past are silently rewriting it. The horror of the scene is that nobody in the present consents to this. The change happens to them.

Most multi-user undo implementations are doing some version of this whenever they work at all. The question, then, is whether to do it more carefully, or to refuse to do it.
Operational Transformation, briefly and honestly

The industry's main answer to multi-user editing is a family of techniques called Operational Transformation, or OT. It powers Google Docs, Etherpad, and most of the real-time collaboration you've encountered in commercial software. Google Wave was built on it. ShareJS popularized it as a library.

The idea, simplified: every user's edits are expressed as operations (insert character at position 7, delete character at position 12), and when two operations arrive at the server concurrently, the system transforms each one against the other so they can compose into a single coherent state. If Alice inserts a character at position 5 and Bob deletes the character at position 8 around the same time, Alice's insertion shifts Bob's deletion to position 9, and the result converges. The math is elegant in principle and brutal in practice, because the transformation functions have to be defined for every pair of operation types and have to compose correctly under every possible interleaving.

For plain text, OT works, more or less. The operations are simple, the domain is one-dimensional, and the edge cases, while many, are at least enumerable. The teams who built Google Docs are very, very good, and they have made it look easier than it is.

For music notation, OT is something close to a nightmare. The operations include things like: insert a tuplet of three quarter notes at beat 3, but only if the time signature permits it; add a slur spanning the next four notes regardless of measure boundaries; transpose this passage by a minor third, preserving enharmonic relationships; change the time signature, which silently re-beams every subsequent measure. Defining a transformation function for every pair of these is a combinatorial undertaking with subtle semantic interactions that are hard to specify, harder to test, and nearly impossible to prove correct.
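To make the pairwise burden concrete, here is a toy transform for the plain-text case, using the Alice and Bob example from above. It is a sketch in Python under stated assumptions: positions are 0-based indices, there are only two operation types, and all names (`Insert`, `Delete`, `apply`, `transform`, `op_wins_ties`) are invented for illustration. Production OT systems handle far more cases than this.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass(frozen=True)
class Insert:
    pos: int
    char: str


@dataclass(frozen=True)
class Delete:
    pos: int


Op = Union[Insert, Delete]


def apply(text: str, op: Optional[Op]) -> str:
    if op is None:  # the operation was cancelled out by a concurrent one
        return text
    if isinstance(op, Insert):
        return text[:op.pos] + op.char + text[op.pos:]
    return text[:op.pos] + text[op.pos + 1:]


def transform(op: Op, against: Op, op_wins_ties: bool = False) -> Optional[Op]:
    """Rewrite `op` so it still means the same thing after `against` has run."""
    if isinstance(op, Insert) and isinstance(against, Insert):
        shifted = against.pos < op.pos or (
            against.pos == op.pos and not op_wins_ties
        )
        return Insert(op.pos + 1, op.char) if shifted else op
    if isinstance(op, Insert) and isinstance(against, Delete):
        return Insert(op.pos - 1, op.char) if against.pos < op.pos else op
    if isinstance(op, Delete) and isinstance(against, Insert):
        return Delete(op.pos + 1) if against.pos <= op.pos else op
    # Delete against Delete
    if against.pos == op.pos:
        return None  # both users deleted the same character
    return Delete(op.pos - 1) if against.pos < op.pos else op


text = "abcdefghijkl"
alice = Insert(5, "X")
bob = Delete(8)

# Server order 1: Alice first, then Bob's operation transformed against hers.
via_alice = apply(apply(text, alice), transform(bob, alice))
# Server order 2: Bob first, then Alice's operation transformed against his.
via_bob = apply(apply(text, bob), transform(alice, bob))
```

Two operation types already need four branches plus a tie-breaking rule. Each new operation type adds a row and a column to that table, which is the growth the paragraph above describes for notation operations.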
The musical meaning is also fragile in a way that text isn't: if Alice inserts a note and Bob transposes the surrounding passage, what was Alice's note "really" supposed to be? OT can produce a converged state, but there is no guarantee that the converged state is musically what either of them intended.

It's also worth saying (gently, because this is a hard problem and the people who built these systems are extraordinary engineers) that even text-based OT has been a graveyard of subtle bugs. Google Wave's team extended OT substantially and still couldn't ship a correct, stable implementation in the time the project had. Etherpad and ShareJS have years of public bug reports about convergence failures and divergent document states. The complaint is not that OT is conceptually wrong; it's that it is genuinely hard to get right, and the difficulty scales with the complexity of the operations.

For Ooloi's domain, adopting OT would not be a feature. It would be the centerpiece around which the entire system rearranges itself. Every operation in the music model would have to be defined in OT-compatible terms. Every new feature would come with a transformation specification. The development cost would be paid forever, by every contributor, on every line of mutation code.

It's worth noting, as a fact about the field and not a boast about Ooloi, that no music notation software currently in widespread use supports real-time multi-user collaboration in any deep sense. There are workarounds (file sharing, version control, screen sharing during editing sessions), but no major notation product has built collaboration into the architecture itself. This means the design space for collaborative notation is genuinely open, and Ooloi is exploring it without much prior art to lean on. It also means there is no industry consensus to defy or imitate.

Three doors

Given all this, a multi-user notation system has three serious options.

Door one: do the OT. Pay the cost.
Center every architectural decision around the requirements of operation transformation. Accept that every new feature has a substantial OT-design phase. Accept the ongoing risk of convergence bugs. Adopt the Google Docs model, and live with the fact that, occasionally and silently, someone's undo will rewrite something they did not expect to be rewritten, but in a way that converges, mathematically, to a state everyone can agree on. Door one offers the strongest collaboration story and the highest engineering tax.

Door two: time travel openly. Don't transform anything. Let any user undo anything (including changes another collaborator made), but warn them clearly that this will rewrite a piece of shared history. "Anna added this slur ten minutes ago. Are you sure you want to undo it?" The user is choosing, with full information, to disrupt the shared timeline. The cost is borne in awareness, not in mathematics. Door two is honest, simple to implement, and pushes the responsibility for collaboration ethics onto the humans involved.

Door three: refuse to time travel through other people. Allow each user to undo their own actions, but only as far back as the last point where someone else touched the piece. Past that point, the history is buried: your undo stack ends there. You can fix your own mistakes; you cannot reach through someone else's work to alter your earlier actions, because doing so would require reasoning about whether their changes still make sense in the absence of yours. The Prime Directive applied to editing: don't disrupt anyone else's timeline. (Peter pointed me at Star Trek for this one, and the analogy is too good to pass up.) Door three is the most conservative, the most legible, and arguably the most respectful of collaborative work.

Each door has a cost and a character. Door one is technically heroic and socially silent.
It assumes the convergent state is good enough, and that users don't need to know when their work has been quietly rebased against someone else's. Door two is technically modest and socially loud. It assumes users can be trusted with the truth about what they're about to do, and that an explicit warning is more respectful than a silent transformation. Door three is technically simple and socially strict. It assumes the integrity of each collaborator's edits is more important than the convenience of unlimited undo, and that "I can't undo that far" is an acceptable trade for "no one will silently rewrite my work."

Ooloi's choice

Ooloi does not do door one. The architectural cost is enormous; the domain payoff is marginal; and the silent-rewriting failure mode is precisely the wrong one for music, where authorship and intention matter at every note.

Instead, Ooloi makes doors two and three configurable, per piece. ADR-0015 calls this setting `:undo/foreign-policy`, and its values map cleanly to the doors I've described: `:allow` and `:warn` correspond to door two (with or without confirmation prompts), and `:block` corresponds to door three. The setting lives with the piece itself, so each project can declare its collaboration posture, and the software respects it for everyone editing that score.

The reason this configuration is the right shape, rather than a single fixed policy, comes from how musicians actually collaborate. Score collaboration is not Google Docs co-typing. It is, in almost every realistic case, one of three patterns:
What you almost never see is two people typing into the same phrase the way two writers might co-edit a paragraph in a shared document. The phrase, in music, is the unit of expression. It belongs to one author at a time. Even when collaboration is intense, the work is usually serialized at the phrase level by social convention, because that's how musical thinking happens.

This matters because the workflow patterns determine which door is right. A piece being drafted solo by a composer who occasionally lets an assistant in: door two is fine. Foreign undos will be rare; a warning is enough. A piece being prepared by an editorial team with strict domains of responsibility: door three protects each person's territory. You can clean up your own mistakes; you cannot reach into someone else's bar 47. A piece in active workshopping where two people are genuinely shaping the same passage: door two again, with the warning treated as a meaningful social signal rather than a formality.

None of these patterns wants OT. The thing OT offers (silent rebasing so the timeline stays single and coherent) isn't what music collaborators want. They want clarity about who did what, and protection against silent rewriting of their work. Both doors two and three provide that; door one removes it.

The per-piece setting also means the policy can match the actual social arrangement around the score. A scholarly edition where multiple editors work on different movements: door three. A small chamber piece being co-composed by two friends who trust each other: door two. A teacher's score where students experiment but shouldn't be able to undermine each other's work: door three with a stricter default. The architecture is small. The policy space it opens is wide.

In closing

The deeper point, the one I'd hope a reader takes from all this: Cmd+Z carries assumptions. It assumes a single author, a single timeline, a single observer.
It assumes "the thing I just did" is well-defined and reversible without consequence. These assumptions mostly hold in single-user editing, where they're so naturalized we forget they exist. They mostly don't hold in multi-user editing, where every assumption frays under the slightest examination.

Most software handles this by either pretending the assumptions still hold (and breaking subtly when they don't) or by adopting OT (and paying the engineering cost forever). Ooloi handles it by making the assumptions explicit, naming the trade-off honestly, and letting the music decide which trade-off is right for each piece.

The architecture supporting all this is small enough to read in an afternoon (ADR-0015, if you're curious), and its smallness is the consequence of the immutable data structures that made single-user undo nearly free in the first place. The same property that collapsed the developer tax in single-user mode is what makes the multi-user policy space affordable to explore.

Which brings us back to Hamlet, sort of. The question undo poses, in the multi-user case, really is the question Hamlet was asking: to act, knowing the consequences will ripple through others, or to refuse to act, knowing that refusal has its own consequences. Software doesn't get to answer that question once and for all. The best it can do is be honest about which version of the question it's asking, and let the people involved choose.

Peter Bengtson asked me, Claude Opus 4.7 Adaptive, to write this post. He hasn't edited it. Any infelicities are mine, not his.
6 Comments
Peter Bengtson
16/5/2026 20:29:51
AI is not being smuggled in here as a substitute for authorship. This blog is roughly a hundred posts in my own voice. Two have been handed to Claude in its own voice, both explicitly flagged, because that perspective is itself part of the Ooloi project. AI is a documented part of how Ooloi is built, and showing how it is used consciously, strictly, and without pretending otherwise is part of the point.
Magnus Johansson
18/5/2026 17:23:38
The line "Ludwig didn't have Undo (and it shows)" together with its illustration is really funny. Excellent.
Peter Bengtson
18/5/2026 18:59:48
:)
Peter Bengtson
22/5/2026 09:05:12
Let me add a little to what Claude wrote here: it all applies to plugins too. A plugin just wraps whatever it's doing in a transaction and gives it a name (something like 'Add harp diagram'), and that's the whole job. Undo and redo, like multi-language support, come for free; there's really no way for the architecture to leave them out.
Magnus Johansson
22/5/2026 09:14:20
"[...] but perhaps that's just me and my Waterloo history [...]"
Author: Peter Bengtson
Ooloi is an open-source desktop music notation system for musicians who need stable, precise engraving and the freedom to notate complex music without workarounds. Scores and parts are handled consistently, remain responsive at scale, and support collaborative work without semantic compromise. They are not tied to proprietary formats or licensing.
Ooloi is currently under development. No release date has been announced.