> The REG algorithm excels with its fast local update speeds and eliminate concerns about tombstone collection in CRDTs. For instance, if an operation has been synchronized across all endpoints, no new operations will occur concurrently with it, allowing it to be safely removed from the history.
If you remove these ops from history, does that remove the ability to time travel (per the home page "An antidote to regret, enabling historical edits traversal") or merge branches? How can we be sure an operation is synchronized?
If dropping these ops is necessary for speed/storage optimization but disables time-travel, is it possible to put the removed historical/tombstone ops into a "cold storage" that's optional and only loaded for time-travel use?
Hi! I invented replayable event graphs. I'm writing a paper at the moment about it, which hopefully should be out in a month or so. Send me a private email and I can mail you the current draft if you like.
> If you remove these ops from history, does that remove the ability to time travel
Yes it does. You also need the ops from history to be able to merge changes. You can only merge changes so long as you have the operations going back to the point at which the fork happened.
> is it possible to put the removed historical/tombstone ops into a "cold storage" that's optional and only loaded for time-travel use?
Absolutely. And this is very practically useful. For example, you could have a web page which loads the current state of a document (just a string. Unlike CRDTs, it needs no additional metadata!). Then if some merge happens while you have the document open, the browser could just fetch the operations from the server back as far as it needs to be able to merge. But in normal operation, none of the historical operations need to be loaded at all.
All this said, with text documents the overhead of just keeping the historical operations is pretty tiny anyway. In my testing using diamond types (same algorithm, different library), storing the entire set of historical operations usually increases the file size by less than 50% compared to just storing the final text string. Its much more efficient on disk than git, and more efficient than other CRDTs like automerge and Yjs. So I think most of the time its easier to just keep the history around and not worry about the complexity.
I think you can argue that it’s both a special case of an OT system, and a special case of a CRDT at the same time. It stores the original operations rather than transformed operations (like an OT system would). Then it does OT in bulk on those operations on every peer by constructing a crdt state object on the fly.
So it’s very closely related to both crdt and OT based systems. I think if you squint your eyes you could argue that it’s both. It’s a grow only set crdt, storing operations that we do OT on. We do OT by embedding another crdt implementation inside each peer.
> If you remove these ops from history, does that remove the ability to time travel (per the home page "An antidote to regret, enabling historical edits traversal") or merge branches?
Yes. But squash can be supported.
> How can we be sure an operation is synchronized?
In Loro, we not only record the real-world timestamp efficiently, similar to Git, but also capture the DAG information. This approach ensures that if an operation (op) is particularly old, it will have many other ops depending on it. By utilizing both pieces of information, we can determine the operations that are likely synced across all peers. For peers like servers, it's feasible to preserve all operations. However, we can remove some operations in scenarios such as opening the document online for the first time.
> If dropping these ops is necessary for speed/storage optimization but disables time-travel, is it possible to put the removed historical/tombstone ops into a "cold storage" that's optional and only loaded for time-travel use?
Yes. This is not supported at the moment, but we hope to implement it before version 1.0.
> if an operation has been synchronized across all endpoints, no new operations will occur concurrently with it, allowing it to be safely removed from the history.
This assumes that the set of endpoints (really, nodes) is both well-known by all other nodes in the network, and stable over time (meaning new nodes will never be added).
Even if this assumption can be made safely (which is not a given) the GC process described here is still an optimization, which would be subverted when even a single node in the network became slow or broken.
It's also basically orthogonal to the concept of "tombstones", which are still required if you want to delete anything from the data structure.
Similar to OT, in certain scenarios, it's sufficient to ensure that only a subset of peers have the complete data, while others don't need the full history. For instance, in real-time collaboration scenarios with a central server, we can, just like OT, allow clients to hold only a shallow clone instead of the complete history. This approach results in minimal overhead for the clients.
I guess it all depends on how you define "client" and "shallow history", and the guarantees you provide around propagation of that history.
But if you have a central server that is considered to be the authoritative source of state, and assuming clients interact with that central server directly, then I'm not sure what is accomplished by modeling your data with CRDTs in the first place?
It's great work improvising over Peritext using joseph's latest CRDT work. Much needed literature in "applying CRDTs for richtext" space.
But I'm surprised why this one too hasn't focussed a lot on rich-text block elements (like lists, tables & sections) as much as it focussed on text attributes (like bold and italics).
I've gone through the Peritext paper several times, and attempted to implement support for lists/tables/sections myself, but found it to be difficult owing to the limited scope of the original paper. I reached out to Martin Kleppman about this, and this is what he told me:
"Thanks for your message. I've written up a document on how to extend Patreon with nested block elements such as bullet points. It's not properly published yet, but you can find a draft here: https://martinkl.notion.site/Block-elements-in-rich-text-CRD...
My colleagues are currently in the process of integrating this algorithm into Automerge, and creating bindings to the ProseMirror rich text editor."
Personally, I found the document to be very helpful.
I've recently implemented a rich text editor on top of Automerge and Peritext that supports blocks, inline-blocks (or whatever you call them: images, tables, etc.), unaware that this was in the works. The implementation ended up _very_ close to what's described in the linked post.
Coming from Google Wave, I still think of this problem the way we dealt with it in wave. In Wave, a document was a sequence of items + annotations. Items were either characters (collectively making up normal text), an item could be an embedded child item, like a table, image, nested document, etc. Embedded items were "in the document" just like any other document content, and could be inserted or deleted the same as text.
Then annotations were used for formatting, like bolded regions or links. This is how peritext and quill work. There are no "annotation items" (since managing them sounds tedious).
It looks like loro works slightly differently. I'd like to take my own stab at rich text in diamond types soon, since I need it for a project. I think there's a cleaner way to do it - though until I write the code, who knows how it'll turn out.
As others have said, once nice thing about this model is that embedded items can themselves be targets of editing events. For example, you could add a map to google wave and then users could collaboratively add and remove pins to the map.
By "annotation items" - you mean the special "boundary characters" that Loro CRDT uses to mark formatting boundaries right?
Yes, Google docs doesn't use such special annotation characters for bold/italics-like formattings. But it does use special boundary characters for marking "comment" boundaries. These characters simplify a few things from a comment perspective (as they shouldn't merge like formatting options). Zoho Writer uses a similar design for differentiating formatting boundaries vs comment boundaries.
> I'd like to take my own stab at rich text in diamond types soon, since I need it for a project. I think there's a cleaner way to do it - though until I write the code, who knows how it'll turn out.
Curious to take a look at your approach. Do write more on your blog Joseph :)
> But I'm surprised why this one too hasn't focussed a lot on rich-text block elements (like lists, tables & sections)
This is where the real strength of crdts is, IMO. Tree-like structures turn into DAGs once you have multiple edits happening (two users editing a tree node create children with two parents) and are much more interesting than linear data like a string. It's definitely not the most efficient way to store data, but it's incredibly convenient for reconciling versions of pretty complex UI.
A few years ago I threw together a report-writing app for car crash data that let you drop different graphs, maps, visualizations, table excerpts (eg "select top 5 worst roads") that could all reference each other. You make a report for your hometown, and some other team could fork it to see the results for their area.
Data were all update live, even if it was pinned to a specific time, since things could be updated after the fact. That other team could then add in a new section of the report or change the conclusion section, and still get the first team's updates to the shared sections (with an option to revert).
You can do like... most stuff like that, if you want. Slides, tables, drawings, whatever. It's not great for graphs but idk what is good for graphs.
The demo they show uses the Quill rich-text editor, which handles block elements analogously to text attributes: a block's type is determined by the attributes on its trailing newline. E.g., for a block that is part of an ordered list, the newline's format is `{ list: "ordered" }`.
Looks neat! Would there be a way of intercepting state and making 'snapshots' into a more traditional format, like SQL, or even a JSON file?
It sounds like this defaults to the server storing the whole state in their binary format, ditto the client-side portion of it. Nothing wrong with the format, but this is an early project, and nobody wants their data in something that's potentially unstable, or something that might get corrupted.
We are carefully stabilizing our encoding format and will have a clear storage format documentation introduced in version 1.0. I agree that a more transparent format can provide users with a better sense of control, and we will try to create a human-readable format for exporting CRDT data (the kind that includes operation history). As for the application state, Loro already supports direct export in json format.
Slightly off-topic - I don't think real-time collaboration is suitable for text-based formats. I believe collaboration similar to working with git is superior:
1. Fork the text
2. Submit proposal
3. Review
4. Merge/Cancel
EDIT: To slightly expand on this - there are many reasons for this intuition - the main, IMO, is that people like to work on text privately before showing it to people. Also, the mental fear of your text interrupted by someone else. There might be even more reasons.
What's great about CRDTs is that they're really good at both real-time and asynchronous collaboration, unlike the operational transform system used by Google Docs. Asynchronous collaboration is Peritext's major motivation [1]:
> We interviewed eight people who regularly collaborate professionally on documents such as news articles, and several told us that they found real-time collaboration a stressful experience: they felt performative, self-conscious of others witnessing their messy work-in-progress, or irritated when a collaborator acted on suggestions before the editing pass was complete. When doing creative work, they preferred to have space to ideate and experiment in private, sharing their progress only when they are ready to do so.
> With asynchronous collaboration, this is possible: a user may work in isolation on their own copy of a document for a while, without seeing other users’ real-time updates; sometime later, when they are ready to share their work, they can choose to merge it with their collaborators’ edits. Several such copies may exist side-by-side, and some might never be merged (e.g. if the user changed their mind about a set of edits).
So they mention my exact concerns and addressed them, very cool. But their solution in practice doesn't look very different from what we can already achieve with git (apart from seeing your collaborator changes in real-time, which I'm not sure how substantial it is), or am I missing something?
At the end of the day, how will this look to the end user?
> a user may work in isolation on their own copy of a document for a while, without seeing other users’ real-time updates; sometime later, when they are ready to share their work, they can choose to merge it with their collaborators’ edits. Several such copies may exist side-by-side, and some might never be merged (e.g. if the user changed their mind about a set of edits).
Again, this sounds almost exactly like what we already achieve using git, so why do we need CRDTs for that?
Another issue which git won't solve is if you collaborate on text which has "structure". For example when merging indented (todo) lists as flat text, bullets might end up in the wrong place, e.g.
[ ] Project
[ ] Don't do this
[ ] Task A
While offline, user #1 adds '[ ] Task B' and '[ ] rm -rf /' under "Don't Do This", while user #2 adds '[ ] Do This' under 'Project'. Obviously you don't want to merge this as:
[ ] Project
[ ] Don't do this
[ ] A
[ ] Do This
[ ] B
[ ] rm -rf /
We're working on an app [1] which needs to deal with this, but in general it also makes git less suitable for things like outliners or other collaborative text editors where people can work on lists, tables, and so on (structured data basically).
Does the app store structured text in a structured way? You probably need to for this to work the way you want.
We are writing a version controlled database (Dolt) and we recently implemented automatic merging of the contents of JSON documents, seems related to what you're doing:
I believe that handling merges like this correctly was a motive for designing pijul: https://pijul.org
See the item on the splash page about 'merge correctness'. Unfortunately I wasn't able to find the post detailing the behavior with a bit of searching.
Git is just the underlying mechanism. What you show to the user depends on your UX/UI. You don't need to show users the words "fork/merge" even. How about:
For one thing, Git is abysmal at merging rich text formats that need balancing open/close annotation markers, like HTML <strong> and similar. With line/character oriented merge algorithms syntax errors from merge are inevitable. Once you start pulling that thread, like “ok can we clean up such mistakes automatically with a better merge algorithm that understands the content a bit more?” you’ll end up back here at CRDTs.
I’ve tried managing markdown docs in git a few times. Even with a team of senior+ engineer git experts it became a pain on frequently changing documents because conflicts were so common — if you wrap markdown at 80 characters, rewording a sentence is more likely to merge conflict than changing a variable in code since it may re-wrap several lines; and if you don’t wrap at all, any change in the paragraph will conflict on the entire paragraph.
Yep. And git can't at all handle realtime editing (since you'd need to commit & push after every keystroke). Or handle any non-text data. Even markdown, as you mentioned, can be a pain.
CRDTs should be able to solve all of these problems. One thing we're still missing in CRDTs is git style conflicts when merging long running branches. Should be possible - but still nobody has solved that as far as I know.
Yeah; and the merging is correct in all cases. You don't get spurious conflicts like you can with git.
Also the same system can work both in realtime and offline scenarios. And CRDTs can handle a lot more than just plain text editing.
That said, one thing git does that I like is that sometimes I want conflict markers to be added to my document when concurrent edits happen on the same line. CRDTs (and REGs like this) store strictly more information than git does, so theoretically it should be possible to make a CRDT which adds conflict markers too like git does. But as far as I know, nobody has built that yet! I really hope someone does, because it would be really neat.
I really want a git style version control system built on top of CRDTs with optional merge conflicts.
I'd say most of the time the merging is incorrect, that is, it is in contrast to both of the parties intention. Unlike git, which actually notifies them, Loro just silently modifies both changes into something nonsense/wrong.
My correctness point is that Git can end up with commits which spuriously conflict with themselves. Git's merge algorithms simply aren't very good.
CRDTs (and REG) are correct in that everyone always ends up seeing the same document state. But I take your point - which is essentially that Loro (and automerge, yjs, diamond types, etc) don't correctly preserve intent in all cases. As you point out, if two users concurrently edit the same word, you can get nonsense.
In practice thats usually fine when users edit in realtime - since users notice and fix it. But its often not ok while users edit asyncronously (eg while offline). Thats exactly why I want a CRDT type system which can emit git style conflicts.
A while ago we would have said a computer program obviously can't guess the intent behind edits. But now we do have such programs: Language models. I wonder whether anyone has already used one to automatically solve merge conflicts.
That’s a good idea that I’ve heard a lot of people suggest. But as far as I know nobody has actually implemented merging using a llm yet.
Part of the problem is that CRDTs want to have strong eventual consistency - so, after merging everyone should be looking at the same final result. It’s hard to guarantee that if you pass the conflicting data to a LLM.
Exactly. This process of putting users in control by notifying them about changes and allowing them to decide how to merge is essential when dealing with text.
It sounds like Loro can do that too, but then again, is it worth the complexity over git?
darcs [0] patch theory was a predecessor to OTs/CRDTs (and a predecessor to git as well; in some ways it is the "smart" to which git was named "dumb"). When it works and performs well it is still sometimes version control magic.
pijul [1] is an interesting experiment to watch, trying to keep the patch theory flag flying and also trying to bring in updates from OTs and CRDTs as it can.
Asynchronous editing on a centralized platform only infrequently causes actual conflicts in practice, like multiple users editing the same sentence at the same time. Either the editing is actually sequential in time, or independent parts of the document are being edited. With asynchronous decentralized/offline editing, conflicts become more likely.
Hey, sounds like you'd love what we're building @ Ellipsus — https://ellipsus.com
We started off your same exact assumptions and built a text editor that combines the best of both worlds: you can collaborate in real-time on the same piece; or branch off of it, work on your own and then have it reviewed and merged back into the main branch. That and other features that should make writing together a pleasure.
Reach out if you want to give it a spin, we'd appreciate your feedback!
Git is still great for developers, who are comfortable with the command-line and are willing to learn all of Git's concepts.
But let's be honest, Git isn't the most friendly tool for most computer users, many of whom have never touched any command-line interface, let alone Git. And yes, there's git-GUIs, but I find those are even more complex.
> The REG algorithm excels with its fast local update speeds and eliminate concerns about tombstone collection in CRDTs. For instance, if an operation has been synchronized across all endpoints, no new operations will occur concurrently with it, allowing it to be safely removed from the history.
If you remove these ops from history, does that remove the ability to time travel (per the home page "An antidote to regret, enabling historical edits traversal") or merge branches? How can we be sure an operation is synchronized?
If dropping these ops is necessary for speed/storage optimization but disables time-travel, is it possible to put the removed historical/tombstone ops into a "cold storage" that's optional and only loaded for time-travel use?