A concurrent map using the lock-free left-right algorithm, written in Go

Left-Right Map in Go

This is experimental code for educational purposes. Do not use in production!

It appears that this project became a testbed for me to try out Go's recent new features. That is, it began as:

  • an exercise to prove that programming in Go without generics, but with crude sed substitution, is perfectly fine and gives higher development velocity than using actual generics.

  • I got the idea to emulate "weak pointers" by wrapping the read handlers in an outer shim to which I attached a finalizer (runtime.SetFinalizer) that calls the inner Close() on GC cleanup. The idea was that creating read handlers is expensive and should be amortized by using a sync.Pool; however, without such a weak pointer + finalizer, nothing would ever call the Close method.

  • Then the experimental Go 2 generics prototype came out, and I used that.

  • Then Go generics reached beta, and I used those.

  • Then atomic.Pointer became a thing, and I used that. Before, I used unsafe.Pointer with uint64(uintptr) casts and atomic.*Uint64 crutches.

  • Then the iter interfaces were introduced, and I implemented them.

  • Then weak.Pointer was introduced and finalizers were superseded by runtime.AddCleanup. I ported the code to both, making the complex outer/inner construct that emulated weak pointers obsolete, which makes the code a lot more understandable again.

  • There is a good chance that the next niche Go feature, with barely any use case except in extremely non-trivial edge cases, will provide some value in this code. Which is ironic, because I don't have a good use case for this code except testing niche Go features.

Seriously, DO NOT USE THIS CODE anywhere near production or even at all! (I don't)

How the concurrency characteristics of this code relate to Go's memory model is not clear.

Inspiration and prior art

Inspired by Jon Gjengset's left-right.

While trying to learn Rust, I watched Jon's stream on how he refactored his rust-evmap into a generic version, then named left-right. This made me want to see in how few lines one could write a reasonable implementation in (non-generic) Go. I also planned to use this project as a toy for learning the then-upcoming type parameters (generics) in Go.

Apparently there is prior work by Pedro Ramalhete and Andreia Correia (2013): "Left Right: A Concurrency Control Technique with Wait-Free Population Oblivious Reads" blog post, preprint paper (pdf) (Warning: Sourceforge).

What is Left-Right?

A left-right collection (in this experimental implementation, a generic map over [K, V]) is an attempt to eliminate locks entirely from hot paths in concurrent applications where read operations are massive and write operations relatively rare. The authors above show that in their implementations, reads scale linearly with the number of readers.

A metaphorical explanation

There are two "arenas" (left and right). When a reader "enters", it reads a sign (an atomically read pointer) that points to the arena it should go to. Once inside an arena, the reader is welcome to do as many reads as it pleases while the arena presents a consistent, read-only state. Meanwhile, the writer is free to do any write operations on the other arena, undisturbed by the presence of readers. Every write operation is also appended to an operations log (like a redo log or write-ahead log). Once the writer is done with its write operation, the sign is swapped, and any reader that enters goes to the recently-written-to arena. The writer waits until all readers that were still in the other arena have left. Once the writer is convinced all readers have left the other arena, all operations from the log are replayed to bring both arenas to the same state. Then the writer can continue doing write operations. Rinse, repeat.

The writer does synchronize its actions with a central mutex. Keys and values are kept at most three times: once per arena, plus any uncommitted value in the operations log. The writer is in control of how long that operations log gets: it is truncated after each commit. The latency of each commit depends on the readers: they must periodically leave the arena to allow the writer to carry out the commit.

The "sign", which is in fact a simple pointer, is written by the writer and read by the readers with atomic primitives. Signaling where the readers are is done with an unsigned integer counter per reader (named "serial") that is atomically incremented by 1 on entering an arena and again on leaving it. Thus the serial is odd if a reader is inside an arena, and even once it has left. The writer atomically reads each serial and spins on every serial that happened to be odd at that first read, until each of those serials differs from its recorded odd value. That ensures every reader has read the pointer at least once after the writer swapped it (and it also handles the case of overflowed serials).

Performance

Once a reader has entered, all reads are as cheap as normal native reads from the collection (a map in this case), which is as fast as it gets. So fast, in fact, that in my early tests all the read operations were inlined in tight loops, such that Go's scheduler was unable to yield to other goroutines (see runtime.Gosched()).

However, readers need to preempt themselves by regularly Enter()'ing and Leave()'ing, which, although it encounters no locks, is not free, because the atomic operations impact cache lines. It is entirely possible that sync.RWMutex performs similarly, so benchmarking the actual use case is obligatory. Note, though, that sync.RWMutex does not allow simultaneous reading and writing, and read concurrency drains to zero when a writer wants to write.

Creating readers is relatively expensive. Each reader registers itself with the writer (which involves locks), because the writer must observe whether any readers are in the live arena. If this algorithm is considered for, e.g., HTTP handlers, each request is handled in its own goroutine and thus also needs its own reader. Creating one reader per request quickly destroys any benefit. This implementation offers a sync.Pool of reusable readers to alleviate this bottleneck, but whether that provides a net benefit can only be answered with use-case-specific benchmarks.

I did not perform systematic benchmarks myself.