Surprised to see no discussion of other data structures like dicts/maps, or arra...

filleokus · 2025-05-21T20:48:42 1747860522

> Surprised to see no discussion of other data structures like dicts/maps, or arrays of arbitrary type. Hopefully they'd be a straightforward extension. IME, apps need collaborative data structures more often than they need pure collaborative text editing.

Totally agree. I guess an array of "atomic" objects, where the properties of the objects can't be changed, can be done just by replacing the string with your own type. Changes inside the object are probably trickier, but maybe it's just about efficiently storing/traversing the tree?

I've also always thought it should be possible to create something where the consumer of the helper library (per OP terminology) can hook in their own lightweight "semantic model" logic, to prevent/manage invalid states. For example, a todo item can't have both isDone: true and state: inProgress at the same time. This is similar to the rich text formatting semantics mentioned in the linked article.
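
A minimal sketch of what such a hook could look like, assuming the library hands the merged state to consumer-supplied rules. The names `normalize` and `done_implies_state_done` are hypothetical and not taken from any existing library; the only real requirement is that each rule is a pure, deterministic function so every replica repairs the state the same way.

```python
from typing import Any, Callable, Dict, List

State = Dict[str, Any]
Rule = Callable[[State], State]


def done_implies_state_done(todo: State) -> State:
    # Hypothetical invariant: isDone=True is only valid together with state "done".
    # The repair is a pure function of the state, so all replicas agree on it.
    if todo.get("isDone") and todo.get("state") != "done":
        return {**todo, "state": "done"}
    return todo


def normalize(state: State, rules: List[Rule]) -> State:
    # Apply the consumer's rules in a fixed order after a merge.
    for rule in rules:
        state = rule(state)
    return state


# A merged state that violates the todo invariant from the comment above.
merged = {"title": "write docs", "isDone": True, "state": "inProgress"}
fixed = normalize(merged, [done_implies_state_done])
```

Whether the repair should favour `isDone` or `state` is an application decision; the sketch just picks one.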

SkiFire13 · 2025-05-22T06:11:51 1747894311

CRDTs essentially work by deterministically picking one side when a conflict arises. The issue is that in general this does not guarantee the lack of data loss nor data being valid (you can resolve the conflict between two pieces of valid data and get invalid data as a result).
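
To make that concrete, here is a small sketch assuming a per-field last-writer-wins map, which is only one of many CRDT designs. The `merge` function and the tuple layout are illustrative assumptions, not any particular library's API. Each replica holds a valid todo item, yet the merged result is invalid.

```python
from typing import Any, Dict, Tuple

# Assumed layout: each field stores (timestamp, replica_id, value).
# The highest (timestamp, replica_id) pair wins, independently per field.
Field = Tuple[int, str, Any]
Doc = Dict[str, Field]


def merge(a: Doc, b: Doc) -> Doc:
    out = dict(a)
    for key, field in b.items():
        if key not in out or field[:2] > out[key][:2]:
            out[key] = field
    return out


# Replica A marks the item done (valid on its own).
replica_a = {"isDone": (2, "A", True), "state": (2, "A", "done")}
# Replica B concurrently moves it to inProgress (also valid on its own).
replica_b = {"isDone": (1, "A", False), "state": (3, "B", "inProgress")}

merged = merge(replica_a, replica_b)
values = {key: field[2] for key, field in merged.items()}
# values now combines isDone=True from A with state="inProgress" from B.
```

The merge is deterministic and order-independent, but nothing in it knows that the two fields are related.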

Imagine if every git merge conflict you got was resolved automatically by picking one side. Most of the time it would do the wrong thing, sometimes even leading to code that fails to compile. Now imagine you were not there to fix the issue: it would lead to even more chaotic results!

This is why CRDTs are not more widespread, because they only fix the problem you think you have, not the problem you actually have, which is to fix conflicts in a way that preserves data, its validity and meaning.

And arguably they make this issue even worse because they restrict the ways you can solve these conflicts to only those that can be replicated deterministically.

josephg · 2025-05-22T06:51:24 1747896684

> This is why CRDTs are not more widespread, because they only fix the problem you think you have, not the problem you actually have, which is to fix conflicts in a way that preserves data, its validity and meaning.

I’ve been saying this for years, but there’s no reason you couldn’t make a crdt which emitted conflict ranges like git does. CRDTs have strictly more information than git when merging branches. It should be pretty easy to make a crdt which has a “merge and emit conflicts” mode for merging branches. It’s just nobody has implemented it yet.
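
A rough sketch of the idea for a key/value map rather than text, under the assumption that every write carries a version vector. `Write`, `dominates` and `merge_with_conflicts` are hypothetical names, not an existing library's API. The merge still picks a deterministic winner, but it also reports every pair of concurrent writes so a human or the application can review them.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Tuple

Clock = Dict[str, int]  # version vector: replica id -> counter


@dataclass
class Write:
    value: Any
    clock: Clock
    replica: str


def dominates(a: Clock, b: Clock) -> bool:
    # True if the write with clock `a` has seen everything `b` has seen.
    return all(a.get(r, 0) >= n for r, n in b.items())


def merge_with_conflicts(
    a: Dict[str, Write], b: Dict[str, Write]
) -> Tuple[Dict[str, Write], List[Tuple[str, Any, Any]]]:
    merged: Dict[str, Write] = {}
    conflicts: List[Tuple[str, Any, Any]] = []
    for key in sorted(a.keys() | b.keys()):
        wa, wb = a.get(key), b.get(key)
        if wa is None or wb is None:
            merged[key] = wb if wa is None else wa
        elif dominates(wa.clock, wb.clock):
            merged[key] = wa
        elif dominates(wb.clock, wa.clock):
            merged[key] = wb
        else:
            # Concurrent writes: pick a deterministic winner (highest replica id)
            # and also report both values, ordered by replica id.
            lo, hi = sorted((wa, wb), key=lambda w: w.replica)
            merged[key] = hi
            conflicts.append((key, lo.value, hi.value))
    return merged, conflicts


base_done = Write(False, {"A": 1}, "A")
side_a = {"title": Write("Buy milk", {"A": 2}, "A"), "done": base_done}
side_b = {"title": Write("Buy oat milk", {"A": 1, "B": 1}, "B"), "done": base_done}

merged, conflicts = merge_with_conflicts(side_a, side_b)
```

A fuller version would also join the two clocks on the winning write so the same conflict is not reported again on later merges; that step is omitted here to keep the sketch short.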

(At this point I’ve been saying this for about 5 years. Maybe I need to finally code this up if only to demonstrate it)

hansworst · 2025-05-22T07:58:30 1747900710

Automerge has this: https://automerge.org/automerge/api-docs/js/functions/getCon...

josephg · 2025-05-22T08:30:10 1747902610

Reading the docs, it looks like that only works for object keys that have been concurrently set to different values. Not text documents.

motorest · 2025-05-23T05:28:40 1747978120

> I’ve been saying this for years, but there’s no reason you couldn’t make a crdt which emitted conflict ranges like git does.

I don't get your point. The C in CRDT stands for "conflict-free". Why would a CRDT have conflict ranges?

josephg · 2025-05-23T06:26:05 1747981565

Because automatic merging isn’t always perfect. Especially when merging long-lived changes to code in git, sometimes you need to intervene manually to figure out the result.

SkiFire13 · 2025-05-22T09:58:00 1747907880

That might be an improvement over git, but not automatically fixing the conflicts will be a dealbreaker for most people.

Fundamentally, people want something that automatically fixes conflicts and does so the way they expect, but this just doesn't exist yet.