carsync
Python script to efficiently† refresh an outdated copy of an atproto repo CAR file using com.atproto.sync.getBlocks
$ carsync
Usage: carsync <src_car> <dst_car> <pds_url>
†Caveats:
- Every missing block is fetched sequentially via getBlocks; there is no batching or concurrency.
- The whole CAR file is read and re-written.
The latter can be solved by storing the repo in SQLite (or maybe RocksDB) instead of a CAR file, and by doing an MST diff rather than the full MST traversal it does currently. Solving the former would probably require some galaxy-brain concurrent MST diff implementation.
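As a rough illustration of the SQLite idea (not something carsync does today), here is a minimal sketch of a block store keyed by CID. The table layout, the function names, and the byte-string CIDs are all assumptions made up for this example; the point is only that new blocks can be appended without rewriting the ones already stored.

```python
import sqlite3


def open_block_store(path=":memory:"):
    """Open (or create) a SQLite database with one row per repo block.

    The schema is hypothetical: the raw CID bytes are the primary key.
    """
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS blocks ("
        "cid BLOB PRIMARY KEY, data BLOB NOT NULL)"
    )
    return db


def has_block(db, cid):
    """Return True if a block with this CID is already stored locally."""
    row = db.execute("SELECT 1 FROM blocks WHERE cid = ?", (cid,)).fetchone()
    return row is not None


def put_blocks(db, blocks):
    """Insert (cid, data) pairs; blocks already present are left untouched."""
    with db:  # commits on success, rolls back on error
        db.executemany(
            "INSERT OR IGNORE INTO blocks (cid, data) VALUES (?, ?)", blocks
        )


# Placeholder CIDs and payloads, not real atproto data.
db = open_block_store()
put_blocks(db, [(b"cid-a", b"old block")])
# A later sync only adds what is new; the existing row is not rewritten.
put_blocks(db, [(b"cid-a", b"old block"), (b"cid-b", b"new block")])
```

With a store like this, a sync only has to write the blocks it fetched, instead of re-serialising the whole repo.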
Despite these limitations, it's still practical and fast even for large-ish repos.
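To make the first caveat concrete, here is a sketch of what one-request-per-block fetching looks like. It is not carsync's actual code: the function names and the injected `http_get` callable are assumptions for this example. The endpoint and its `did` and `cids` query parameters come from the com.atproto.sync.getBlocks lexicon, which returns a CAR file containing the requested blocks.

```python
from urllib.parse import urlencode


def get_blocks_url(pds_url, did, cids):
    """Build a com.atproto.sync.getBlocks URL for the given CID strings."""
    # doseq=True repeats the key for list values: cids=a&cids=b
    query = urlencode({"did": did, "cids": cids}, doseq=True)
    return f"{pds_url.rstrip('/')}/xrpc/com.atproto.sync.getBlocks?{query}"


def fetch_sequentially(pds_url, did, missing_cids, http_get):
    """Fetch each missing block with its own request, one after another.

    http_get is a caller-supplied function (url -> response bytes), so the
    sketch stays independent of any particular HTTP library. Each response
    body is a small CAR file holding the requested block.
    """
    fetched = {}
    for cid in missing_cids:
        fetched[cid] = http_get(get_blocks_url(pds_url, did, [cid]))
    return fetched
```

Since `cids` is an array parameter, a batched variant could pass several CIDs to `get_blocks_url` in one call. The hard part is knowing which CIDs are missing before the parent blocks have been fetched.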
P.S. In theory it could resolve the PDS URL automatically, but I didn't implement that, so you have to pass it manually.
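For the curious, automatic resolution might look roughly like the sketch below, assuming the repo's DID is already known. Nothing here is part of carsync and the function names are made up. It relies on two atproto conventions: did:plc documents are served by plc.directory and did:web documents at /.well-known/did.json, and the PDS is listed as the `#atproto_pds` service entry in the DID document.

```python
from urllib.parse import unquote


def did_doc_url(did):
    """Return the URL where the DID document for this DID can be fetched."""
    if did.startswith("did:plc:"):
        return f"https://plc.directory/{did}"
    if did.startswith("did:web:"):
        # did:web percent-encodes the port colon, e.g. localhost%3A3000
        host = unquote(did[len("did:web:"):])
        return f"https://{host}/.well-known/did.json"
    raise ValueError(f"unsupported DID method: {did}")


def pds_url_from_did_doc(doc):
    """Pick the PDS endpoint out of an already-parsed DID document (a dict)."""
    for service in doc.get("service", []):
        if (
            service.get("id", "").endswith("#atproto_pds")
            and service.get("type") == "AtprotoPersonalDataServer"
        ):
            return service["serviceEndpoint"]
    raise ValueError("DID document has no atproto PDS service entry")
```

Fetching the document itself is left out: any HTTP client can GET `did_doc_url(did)` and pass the parsed JSON to `pds_url_from_did_doc`.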