January 19, 2021
This will be the first post in a series that I'm working on about the CRDT algorithm that I developed known as AnchorLogoot. As a derivative of Logoot, it's easiest to understand Logoot before trying to understand my algorithm. I found the Logoot paper to be a bit on the technical side, so this is explanation will hopefully be a bit more user-friendly.
If you prefer the more mathematically rigorous explanation, you can find the original article here. This section will serve as a more practical explanation of Logoot for those who are not yet familiar.
Logoot distills the problem of collaborative editing to a problem of ordering. Logoot splits the document into single,
indivisible atoms. These could be characters, Unicode grapheme clusters, images, bullet points, or pretty much
anything. Consider the string "hello"
. If this is stored as an array, I could assign numbers 0-4 (inclusive) to each
letter. This can be modeled as a mapping from a single, numeric index to the data (a letter or nothing) stored at that
index. I can modify the value at each index, which allows me to quite easily append characters to each end of the
string. However, it is quite difficult to insert a character in the middle of my string: I would have to shift over
every index by replacing the values to make space for the new value. However, this is inefficient and will not be able
to handle concurrent edits in a reasonable way.
So, there needs to be a way of storing positions between other positions. Mathematically, we know that there are infinite decimals between any two integers. Variable precision decimals are represented using floating points, but using these for positions wouldn't work too well, though: You'd quickly run out of available positions since floating point precision is not infinite. These are the same issues that would be encountered when using standard, fixed-precision integers.
The solution, seems to be to allow an infinitely long integer to specify more points with indefinite precision. Instead
of defining each of our positions as a single integer, we can define them as an array of integers. The map would look
something like this: (for the string "abd"
)
[0] -> 'a'
[1] -> 'b'
[2] -> 'd'
To add a 'c'
between the 'b'
and 'd'
, we simply need a position between them. Finding one easy: Let's call it
[1,1]
:
[0] -> 'a'
[1] -> 'b'
[1,1] -> 'c'
[2] -> 'd'
We can keep doing this by adding more and more elements to the array as necessary. We will always be able to find a position between any other two. It is worth noting having many atoms can cause memory issues, which will be discussed in more depth later.
Here, I used [1,1]
as an example position between [1]
and [2]
. Especially when there's just two numbers in the
array, it's easy to mistake these for decimals. However, there's one key difference: Adding another element in the array
always makes the position greater. In other words, mathematically, 1
is the same number as 1.0
, but with these
positions, that is not the case. The position [1,0]
is greater than [1]
. Furthermore, the same is true even if
the second number is negative: [1,-9000] > [1]
. Like decimals, the goal of this approach is to make infinite space
between two existing positions, however, they behave quite differently.
Removing these atoms is simple. If we wanted to remove the letter b
from the example above, all we would need to do is
record that the position [1]
has been removed. It would be just as easy to record a new operation to add a new
character back into the same position. So long as the events are replayed in order, the same order should always result.
For this to work, the edits must be applied in exactly the same order. Iin distributed systems, this can become
complicated. Consider that a user inserts the letter a
, then deletes it, then inserts the letter b
. If the order is
randomized, then the state of that region of text could be quite different: It could end up being a
, b
, or it could
end up being removed entirely.
EDIT: The fix detailed here involving Lamport clocks here isn't in the original Logoot paper. It's just a trivial fix to the ordering problem. The downside of this is that more memory is used as the old "tombstones" have to be stored, though they can be removed if an insertion is created in one's place. Rather, the original paper assumes that all events are already ordered.
Rather than ensuring that all of the operations are properly ordered, it's easier to properly handle an unordered
operation when it is processed. To do this, each atom is assigned a Lamport clock.
Whenever a new character is inserted over an old one, this clock is incremented. When a removal is recorded, the clock
takes the same Lamport clock of the data being removed. When receiving an operation, atoms with a higher Lamport clock
will take precedence. If they're equal, then the removal will take precedence. Now, in our previous example, the
following operations could be recorded: (Where the number after the @
is the new Lamport clock)
INSERT 'a' -> [0] @ 0
REMOVE [0] @ 0
INSERT 'b' -> [1] @ 1
If the order is reversed, the insertion of b
is received and added to the document. Next, the removal of [0] @ 0
is
received. No action is taken since the Lamport clock of 0
is lower than the existing atom's clock of 1
. The same would go for the insertion of a
.
If the order is the same, except that the removal is seen first, the receiver would actually have to record a removal at
[0] @ 0
and store it in memory. This is because the receiver would have to know to ignore the a
when it is processed
next to ensure that the removal has the desired effect. That allows a removal to be received before the content that
it's supposed to remove. Finally, the b
is inserted over the removal (which can now be forgotten) since it has a
higher Lamport clock.
There was a predecessor to AnchorLogoot known simply as Logootish. Logootish stopped here and implemented Logoot as
described above, but added a detection mechanism for nodes inserted by users on parallel "branches." This was the
algorithm that I talked about on Matrix Live.
However, I'd already come up with AnchorLogoot a while ago. Nearly three months later, I realized that Logootish was too
complicated and that implementing AnchorLogoot was likely less work for an overall better algorithm. So, I started the
kb1rd-breaking-stuff
branch to work on AnchorLogoot instead. The original Logootish code is
here.
While this all works well for one user, we're still missing one small thing. Let's say that two users edit textvat the
same location, but they're both offline. When they sync up again, one user's edits will overwrite the other's. If only
there were some way to divide up positions so that only one user can edit a particular position, the problem would
seemingly be solved. Of course, there is! Let's assume that there are three users, U1-3. Instead of using integers for
each element of the position array, we can use a tuple of (pos, user)
. To order two of these tuples, first the numeric
position is ordered. If these are the same, then the tuples are ordered by the user field. Here's an example of a valid
ordering:
[(0, U1)], [(0, U2)], [(0, U1),(1, U1)], [(0, U1),(1, U3)], [(1, U1)]
This changes allocations slightly. If a user wants to insert an atom, they must insert it such that the last element of
the array has the user component of the tuple set to the current user. This ensures that it's impossible for two users
to concurrently insert at the same position. As with the old positions, it's always possible to find a position between
two others. The main difference in allocation methods shows up when there are two consecutive atoms from different
users. Consider the positions [(0, U1)]
and [(0, U2)]
. If U3
wants to insert between these two atoms, they can use
the position [(0,U1), (0,U3)]
.
Hopefully, Logoot makes a bit more sense now. In the next post in this series, I'll explore some of the problems with Logoot and potential fixes for them.