A simple tool for incremental sync of atproto repo CAR files

Clone this repository

https://tangled.org/retr0.id/carsync
git@tangled.org:retr0.id/carsync

carsync

Python script to efficiently† refresh an outdated copy of an atproto repo CAR file using com.atproto.sync.getBlocks

$ carsync
Usage: carsync <src_car> <dst_car> <pds_url>
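For reference, here is a rough sketch of what a com.atproto.sync.getBlocks request looks like on the wire. The endpoint takes a `did` and one or more repeated `cids` query parameters and returns the requested blocks as a CAR file. The helper name `get_blocks_url` and the example host are made up for illustration and are not taken from the script; where the DID comes from is also an assumption, since the CLI only takes the PDS URL.

```python
from urllib.parse import urlencode


def get_blocks_url(pds_url: str, did: str, cids: list[str]) -> str:
    """Build a com.atproto.sync.getBlocks URL (hypothetical helper, not from carsync)."""
    # `cids` is an array parameter, so it is repeated once per CID.
    query = urlencode([("did", did)] + [("cids", cid) for cid in cids])
    return f"{pds_url.rstrip('/')}/xrpc/com.atproto.sync.getBlocks?{query}"


if __name__ == "__main__":
    # Placeholder host, DID and CIDs, for illustration only.
    print(get_blocks_url("https://pds.example", "did:plc:abc", ["bafy1", "bafy2"]))
```

The response body is itself a CAR file, so the caller still has to parse it to get at the individual blocks.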

†Caveats:

  • Every missing block is fetched sequentially via getBlocks - there is no batching or concurrency.
  • The whole CAR file is read and re-written.

The latter can be solved by storing the repo in SQLite (or maybe RocksDB) instead of a CAR file, and doing an MST diff rather than a full MST traversal (as is currently the case). Solving the former would probably require some galaxy-brain concurrent MST diff impl.
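To make the caveats concrete, here is a toy sketch of the "full traversal, fetch what's missing" approach. It is not the script's actual code. Blocks are modelled as plain dicts with a `links` list instead of real DAG-CBOR MST nodes, the new root CID is assumed to be known already, and `sync_blocks` and `fetch_block` are hypothetical names. It shows why the fetches end up sequential: you only learn a node's children after you have the node.

```python
from typing import Callable

# Toy stand-in for a decoded block: {"links": [child CIDs]}.
Block = dict


def sync_blocks(
    root_cid: str,
    local: dict[str, Block],
    fetch_block: Callable[[str], Block],
) -> tuple[dict[str, Block], list[str]]:
    """Walk everything reachable from root_cid, fetching blocks absent from `local`."""
    dst: dict[str, Block] = {}
    fetched: list[str] = []
    stack = [root_cid]
    while stack:
        cid = stack.pop()
        if cid in dst:
            continue  # already copied
        block = local.get(cid)
        if block is None:
            # One request per missing block: no batching, no concurrency.
            block = fetch_block(cid)
            fetched.append(cid)
        dst[cid] = block
        # Children are only known once the parent block is in hand.
        stack.extend(block.get("links", []))
    return dst, fetched


if __name__ == "__main__":
    old = {"root1": {"links": ["a", "b1"]}, "a": {"links": []},
           "b1": {"links": ["c"]}, "c": {"links": []}}
    new = {"root2": {"links": ["a", "b2"]}, "a": {"links": []},
           "b2": {"links": ["c"]}, "c": {"links": []}}
    dst, fetched = sync_blocks("root2", old, new.__getitem__)
    print(sorted(dst), fetched)
```

Note that unchanged subtrees (`a` and `c` here) are still visited, just not re-fetched. An MST diff would skip them entirely.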

Despite these limitations, it's still practical and fast even for large-ish repos.

P.S. In theory it could resolve the PDS URL automatically, but I didn't implement that, so you have to pass it manually.
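If someone wants to add that, the usual route is to resolve the repo's DID to its DID document and read the PDS service entry. Below is a minimal sketch of the extraction step only, with no network access. `pds_endpoint_from_did_doc` is a hypothetical helper and the sample document is made up. Fetching the document itself (for example from a PLC directory for `did:plc`, or from `/.well-known/did.json` for `did:web`) is left out.

```python
from typing import Optional


def pds_endpoint_from_did_doc(did_doc: dict) -> Optional[str]:
    """Return the PDS serviceEndpoint from a DID document, or None if absent."""
    for service in did_doc.get("service", []):
        # atproto identifies the PDS entry by this id suffix and service type.
        if (
            str(service.get("id", "")).endswith("#atproto_pds")
            and service.get("type") == "AtprotoPersonalDataServer"
        ):
            return service.get("serviceEndpoint")
    return None


if __name__ == "__main__":
    # Made-up DID document, for illustration only.
    doc = {
        "id": "did:plc:abc",
        "service": [
            {
                "id": "#atproto_pds",
                "type": "AtprotoPersonalDataServer",
                "serviceEndpoint": "https://pds.example",
            }
        ],
    }
    print(pds_endpoint_from_did_doc(doc))
```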