
Slightly off-topic - I don't think real-time collaboration is suitable for text-based formats. I believe collaboration similar to working with git is superior:

1. Fork the text

2. Submit proposal

3. Review

4. Merge/Cancel

EDIT: To slightly expand on this - there are many reasons for this intuition - the main one, IMO, is that people like to work on text privately before showing it to others. There is also the fear of having your text interrupted by someone else mid-edit. There might be even more reasons.



What's great about CRDTs is that they're really good at both real-time and asynchronous collaboration, unlike the operational transform system used by Google Docs. Asynchronous collaboration is Peritext's major motivation [1]:

> We interviewed eight people who regularly collaborate professionally on documents such as news articles, and several told us that they found real-time collaboration a stressful experience: they felt performative, self-conscious of others witnessing their messy work-in-progress, or irritated when a collaborator acted on suggestions before the editing pass was complete. When doing creative work, they preferred to have space to ideate and experiment in private, sharing their progress only when they are ready to do so.

> With asynchronous collaboration, this is possible: a user may work in isolation on their own copy of a document for a while, without seeing other users’ real-time updates; sometime later, when they are ready to share their work, they can choose to merge it with their collaborators’ edits. Several such copies may exist side-by-side, and some might never be merged (e.g. if the user changed their mind about a set of edits).

[1]: https://www.inkandswitch.com/peritext/


So they mention my exact concerns and address them, very cool. But their solution in practice doesn't look very different from what we can already achieve with git (apart from seeing your collaborators' changes in real-time, and I'm not sure how substantial that is), or am I missing something?

At the end of the day, how will this look to the end user?

> a user may work in isolation on their own copy of a document for a while, without seeing other users’ real-time updates; sometime later, when they are ready to share their work, they can choose to merge it with their collaborators’ edits. Several such copies may exist side-by-side, and some might never be merged (e.g. if the user changed their mind about a set of edits).

Again, this sounds almost exactly like what we already achieve using git, so why do we need CRDTs for that?


Another issue which git won't solve is if you collaborate on text which has "structure". For example when merging indented (todo) lists as flat text, bullets might end up in the wrong place, e.g.

  [ ] Project
      [ ] Don't do this
          [ ] Task A

While offline, user #1 adds '[ ] Task B' and '[ ] rm -rf /' under "Don't do this", while user #2 adds '[ ] Do This' under "Project". Obviously you don't want to merge this as:

  [ ] Project
      [ ] Don't do this
          [ ] Task A
      [ ] Do This
          [ ] Task B
          [ ] rm -rf /

We're working on an app [1] which needs to deal with this, but in general it also makes git less suitable for things like outliners or other collaborative text editors where people can work on lists, tables, and so on (structured data basically).

[1] https://thymer.com/
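
A minimal sketch of why a structure-aware merge avoids this, assuming every item carries a stable ID and a pointer to its parent. The `Item` class, the ID scheme, and ordering siblings by ID are all hypothetical choices for illustration, not how Thymer or any particular CRDT library stores data. Deletes, moves, and text edits are left out.

```python
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass(frozen=True)
class Item:
    id: str                 # stable unique ID (hypothetical scheme)
    parent: Optional[str]   # ID of the parent item, None for top level
    text: str


def merge(*replicas: Set[Item]) -> Set[Item]:
    """Union the replicas; every item keeps the parent its author chose."""
    merged: Set[Item] = set()
    for replica in replicas:
        merged |= replica
    return merged


def render(items: Set[Item], parent: Optional[str] = None, depth: int = 0) -> List[str]:
    """Walk the tree depth-first; siblings are ordered by ID (an assumption)."""
    lines = []
    children = sorted((i for i in items if i.parent == parent), key=lambda i: i.id)
    for child in children:
        lines.append("    " * depth + "[ ] " + child.text)
        lines.extend(render(items, child.id, depth + 1))
    return lines


base = {
    Item("1", None, "Project"),
    Item("2", "1", "Don't do this"),
    Item("3", "2", "Task A"),
}
# Offline edits from the example: user #1 adds under "Don't do this",
# user #2 adds under "Project".
user1 = base | {Item("u1-a", "2", "Task B"), Item("u1-b", "2", "rm -rf /")}
user2 = base | {Item("u2-a", "1", "Do This")}

print("\n".join(render(merge(user1, user2))))
```

Because each new item names its parent explicitly, 'Task B' and 'rm -rf /' stay under "Don't do this" no matter where 'Do This' lands in a flat-text view.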


Does the app store structured text in a structured way? You probably need to for this to work the way you want.

We are writing a version controlled database (Dolt) and we recently implemented automatic merging of the contents of JSON documents, seems related to what you're doing:

https://www.dolthub.com/blog/2024-01-16-announcing-json-merg...
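
For readers unfamiliar with the idea, here is a rough sketch of a three-way merge over JSON-like values. It only illustrates the general technique and is not Dolt's implementation; `merge_json` and `MISSING` are made-up names, and lists are treated as opaque values.

```python
MISSING = object()  # sentinel for "key not present on this side"


def merge_json(base, ours, theirs):
    """Three-way merge of JSON-like values; raises ValueError on a conflict."""
    if ours == theirs:
        return ours            # both sides agree (or neither changed)
    if ours == base:
        return theirs          # only their side changed
    if theirs == base:
        return ours            # only our side changed
    if all(isinstance(v, dict) for v in (base, ours, theirs)):
        merged = {}
        for key in ours.keys() | theirs.keys():
            value = merge_json(
                base.get(key, MISSING),
                ours.get(key, MISSING),
                theirs.get(key, MISSING),
            )
            if value is not MISSING:   # MISSING means the key was deleted
                merged[key] = value
        return merged
    raise ValueError("conflict: both sides changed the same value")


base = {"title": "Doc", "meta": {"tags": ["a"], "owner": "joe"}}
ours = {"title": "Doc v2", "meta": {"tags": ["a"], "owner": "joe"}}
theirs = {"title": "Doc", "meta": {"tags": ["a", "b"], "owner": "joe"}}

print(merge_json(base, ours, theirs))
```

Edits to different keys merge cleanly; two different edits to the same leaf value raise instead of silently picking one.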


I believe that handling merges like this correctly was a motive for designing pijul: https://pijul.org

See the item on the splash page about 'merge correctness'. Unfortunately I wasn't able to find the post detailing the behavior with a bit of searching.


Plenty of people would rather spend their time on the job at hand than fight with git or spend time learning it.

Suggesting that everyone should use git is like the famous HN comment that dismissed Dropbox because “you could just use rsync”. IMO.

Like yeah, you could. But no, plenty of people will never want to do that.


Git is just the underlying mechanism. What you show to the user depends on your UX/UI. You don't even need to show users the words "fork" or "merge". How about:

1. "Edit Joe's text"

2. "Ask Joe for a review"

3. "Do you want to add Alice's changes?"


For one thing, Git is abysmal at merging rich text formats that need balancing open/close annotation markers, like HTML <strong> and similar. With line/character oriented merge algorithms syntax errors from merge are inevitable. Once you start pulling that thread, like “ok can we clean up such mistakes automatically with a better merge algorithm that understands the content a bit more?” you’ll end up back here at CRDTs.

I’ve tried managing markdown docs in git a few times. Even with a team of senior+ engineer git experts it became a pain on frequently changing documents because conflicts were so common — if you wrap markdown at 80 characters, rewording a sentence is more likely to merge conflict than changing a variable in code since it may re-wrap several lines; and if you don’t wrap at all, any change in the paragraph will conflict on the entire paragraph.


Yep. And git can't at all handle realtime editing (since you'd need to commit & push after every keystroke). Or handle any non-text data. Even markdown, as you mentioned, can be a pain.

CRDTs should be able to solve all of these problems. One thing we're still missing in CRDTs is git style conflicts when merging long running branches. Should be possible - but still nobody has solved that as far as I know.


Re: markdown in vcs

Put each sentence and/or clause on its own line and it will avoid a large fraction of issues. Not all though.
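
A rough sketch of that reflow, assuming a naive sentence boundary ('.', '!' or '?' followed by whitespace). Abbreviations such as "e.g." will be split in the wrong place, so treat `one_sentence_per_line` as a hypothetical helper rather than a finished tool.

```python
import re


def one_sentence_per_line(paragraph: str) -> str:
    """Undo hard wrapping, then put each sentence on its own line."""
    text = " ".join(paragraph.split())             # collapse newlines and runs of spaces
    sentences = re.split(r"(?<=[.!?])\s+", text)   # split after ., ! or ?
    return "\n".join(sentences)


wrapped = "Git is line based. Rewording one\nsentence rewraps the rest! Does this help?"
print(one_sentence_per_line(wrapped))
```

With this layout, rewording one sentence touches exactly one line, so a line-based merge only conflicts when two people edit the same sentence.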


The merging is automatic, for one


Yeah; and the merging is correct in all cases. You don't get spurious conflicts like you can with git.

Also the same system can work both in realtime and offline scenarios. And CRDTs can handle a lot more than just plain text editing.

That said, one thing git does that I like is that sometimes I want conflict markers to be added to my document when concurrent edits happen on the same line. CRDTs (and REGs like this) store strictly more information than git does, so theoretically it should be possible to make a CRDT which adds conflict markers too like git does. But as far as I know, nobody has built that yet! I really hope someone does, because it would be really neat.

I really want a git style version control system built on top of CRDTs with optional merge conflicts.
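
To make the desired output concrete, here is a minimal sketch of a line-level three-way merge that emits git-style conflict markers. It is not a CRDT and not any library's API; `merge_lines` is a hypothetical helper, and it assumes both sides only edited lines in place (no inserted or deleted lines).

```python
def merge_lines(base, ours, theirs):
    """Three-way merge of equal-length line lists with git-style markers."""
    if not (len(base) == len(ours) == len(theirs)):
        raise ValueError("this sketch only handles in-place line edits")
    out = []
    for b, o, t in zip(base, ours, theirs):
        if o == t or t == b:
            out.append(o)      # both sides agree, or only ours changed the line
        elif o == b:
            out.append(t)      # only theirs changed the line
        else:                  # both changed the same line differently
            out += ["<<<<<<< ours", o, "=======", t, ">>>>>>> theirs"]
    return out


base = ["Title", "The cat sat.", "The end."]
ours = ["Title", "The dog sat.", "The end!"]
theirs = ["Title", "The cat slept.", "The end."]

print("\n".join(merge_lines(base, ours, theirs)))
```

A CRDT-backed version would have the full edit history available to decide which concurrent edits overlap, instead of comparing whole lines.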


> Yeah; and the merging is correct in all cases.

No, it's obviously not.

I'd say most of the time the merging is incorrect, that is, it goes against the intentions of both parties. Unlike git, which actually notifies them, Loro just silently turns both changes into something nonsensical or wrong.


My correctness point is that Git can end up with commits which spuriously conflict with themselves. Git's merge algorithms simply aren't very good.

CRDTs (and REG) are correct in that everyone always ends up seeing the same document state. But I take your point - which is essentially that Loro (and automerge, yjs, diamond types, etc) don't correctly preserve intent in all cases. As you point out, if two users concurrently edit the same word, you can get nonsense.
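
A toy illustration of both halves of that point, using a deliberately simplistic sequence CRDT where each character is tagged with a (position, replica) pair and merging is a set union. This is an assumption made for the sketch, not the algorithm used by Loro, automerge, yjs, or diamond types, and it interleaves concurrent inserts more aggressively than those do.

```python
def merge(a, b):
    """Set union is commutative, associative and idempotent."""
    return a | b


def text(state):
    """Order characters by (position, replica) and join them."""
    return "".join(ch for _pos, _replica, ch in sorted(state))


base = {(1.0, "base", "["), (2.0, "base", "]")}

# Both users concurrently type a word into the same gap.
alice = base | {(1.25, "alice", "d"), (1.5, "alice", "o"), (1.75, "alice", "g")}
bob = base | {(1.25, "bob", "c"), (1.5, "bob", "a"), (1.75, "bob", "t")}

print(text(alice))              # what Alice sees before merging
print(text(bob))                # what Bob sees before merging
print(text(merge(alice, bob)))  # what both see after merging
print(merge(alice, bob) == merge(bob, alice))
```

Both replicas converge to the same state regardless of merge order, yet the merged word matches neither user's intent.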

In practice that's usually fine when users edit in realtime - since users notice and fix it. But it's often not ok while users edit asynchronously (e.g. while offline). That's exactly why I want a CRDT type system which can emit git style conflicts.


A while ago we would have said a computer program obviously can't guess the intent behind edits. But now we do have such programs: Language models. I wonder whether anyone has already used one to automatically solve merge conflicts.


That’s a good idea that I’ve heard a lot of people suggest. But as far as I know nobody has actually implemented merging using an LLM yet.

Part of the problem is that CRDTs want to have strong eventual consistency - so, after merging, everyone should be looking at the same final result. It’s hard to guarantee that if you pass the conflicting data to an LLM, because each peer would have to get exactly the same answer.
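
A small sketch of that constraint, assuming each replica resolves the conflict locally. `lww_merge` is a standard last-writer-wins register; `prefer_local` is a hypothetical stand-in for any resolver whose answer depends on who runs it (an actual LLM call is not modeled here).

```python
def lww_merge(a, b):
    """Last-writer-wins: pick the larger (timestamp, replica, text) tuple."""
    return max(a, b)


def prefer_local(local, remote):
    """Stand-in for a resolver whose result depends on the caller's side."""
    return local


alice_edit = (1, "alice", "colour")
bob_edit = (1, "bob", "color")

# Deterministic, order-independent merge: both replicas end up identical.
print(lww_merge(alice_edit, bob_edit) == lww_merge(bob_edit, alice_edit))

# Order-dependent resolver: Alice's replica and Bob's replica diverge.
print(prefer_local(alice_edit, bob_edit) == prefer_local(bob_edit, alice_edit))
```

Any resolver plugged into a CRDT has to behave like the first function: same inputs, same output, on every peer and in any order.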


Exactly. This process of putting users in control by notifying them about changes and allowing them to decide how to merge is essential when dealing with text.

It sounds like Loro can do that too, but then again, is it worth the complexity over git?


darcs [0] patch theory was a predecessor to OTs/CRDTs (and a predecessor to git as well; in some ways it is the "smart" to which git was named "dumb"). When it works and performs well it is still sometimes version control magic.

pijul [1] is an interesting experiment to watch, trying to keep the patch theory flag flying and also trying to bring in updates from OTs and CRDTs as it can.

[0] https://darcs.net

[1] https://pijul.org/


What's more realistic: Git switching to some form of CRDT, or Git being superseded by a different system which is based on CRDTs?


This might be true for writing an essay. But plenty of collaborative text-editing happens on a much smaller/informal/ad hoc scale.

My personal experience of using Google Docs/Sheets strongly disagrees with you.


Asynchronous editing on a centralized platform only infrequently causes actual conflicts in practice, like multiple users editing the same sentence at the same time. Either the editing is actually sequential in time, or independent parts of the document are being edited. With asynchronous decentralized/offline editing, conflicts become more likely.


Hey, sounds like you'd love what we're building @ Ellipsus — https://ellipsus.com

We started off with your exact same assumptions and built a text editor that combines the best of both worlds: you can collaborate in real-time on the same piece, or branch off of it, work on your own, and then have it reviewed and merged back into the main branch. That and other features should make writing together a pleasure.

Reach out if you want to give it a spin, we'd appreciate your feedback!


Looks awesome! Love the website. I signed up for the wait list.


Git is still great for developers, who are comfortable with the command-line and are willing to learn all of Git's concepts.

But let's be honest, Git isn't the most friendly tool for most computer users, many of whom have never touched any command-line interface, let alone Git. And yes, there are Git GUIs, but I find those even more complex.



