There are some ideas floating around regarding the current limitations of atproto and the additional requirements of tangled. I think most of us can agree that atproto by itself is not enough for tangled's needs. The easiest example is shared-private records. It's also worth noting that we already have several off-protocol records in knotstream and spindlestream: sh.tangled.git.refUpdate, sh.tangled.pipeline and sh.tangled.pipeline.status.
So I think it would be quite reasonable for tangled to define its own protocol for off-protocol records (sorry for my limited language skills, hopefully you get the point).
I don't have a single concrete idea, but let me list my atomic thoughts.
Git refs
Currently, when a git ref changes on a knot via push, the knot emits sh.tangled.git.refUpdate event records. First, this name violates the lexicon style guidelines, so it should be sh.tangled.git.ref.update. Going further, it seems reasonable to just have a sh.tangled.git.ref record and stream create/update/delete events for it.
We don't need to store git refs as JSON. We can fetch them from the git repository itself on request.
Not sure if we can make an atproto PDS-compatible server serve these imaginary records... But do we even need to?
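To make the sh.tangled.git.ref idea a bit more concrete, here is a minimal sketch of how create/update/delete events could be derived by comparing the refs before and after a push. RefEvent and diff_refs are hypothetical names (nothing like this is claimed to exist in knot), and the SHAs are made up.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class RefEvent:
    action: str              # "create", "update", or "delete"
    ref: str                 # e.g. "refs/heads/main"
    old_sha: Optional[str]   # None when the ref is created
    new_sha: Optional[str]   # None when the ref is deleted


def diff_refs(before: Dict[str, str], after: Dict[str, str]) -> List[RefEvent]:
    """Compare two {ref name: commit SHA} snapshots and list what changed."""
    events = []
    for ref in sorted(before.keys() | after.keys()):
        old, new = before.get(ref), after.get(ref)
        if old is None:
            events.append(RefEvent("create", ref, None, new))
        elif new is None:
            events.append(RefEvent("delete", ref, old, None))
        elif old != new:
            events.append(RefEvent("update", ref, old, new))
    return events


# Made-up snapshots taken before and after a push.
before = {"refs/heads/main": "a1b2c3", "refs/heads/old": "d4e5f6"}
after = {"refs/heads/main": "0f9e8d", "refs/heads/feature": "112233"}
events = diff_refs(before, after)
```

Nothing is persisted here: the snapshots would come straight from git, and only the resulting events would be streamed.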
Pipeline events
A similar technique can be applied to the sh.tangled.pipeline and sh.tangled.pipeline.status records.
- On event trigger, we can create sh.tangled.pipeline.workflow records for each job. These will be owned by the spindle [1] and will be stored somewhere [2].
- Each workflow will hold a log UUID, which can be used to request a real-time log stream from the spindle.
- There won't be sh.tangled.pipeline.status records. sh.tangled.pipeline.workflow itself will hold its one and only state, and any change to it will be broadcast. No state prioritization needed. This way, it will be easier to garbage-collect broken workflow runs with a timeout state, and pipelines become Sync 1.1 compatible. We can even backfill old pipelines!
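A rough sketch of the single-state workflow idea from the list above. Everything here is an assumption for illustration: Workflow and WorkflowStore are hypothetical in-memory stand-ins, the state names other than timeout are invented, and broadcast is just a callback where a real spindle would emit to its event stream.

```python
import uuid
from dataclasses import dataclass, field
from typing import Callable, List

# Hypothetical state names; only "timeout" is mentioned in the proposal.
TERMINAL_STATES = {"success", "failed", "timeout"}


@dataclass
class Workflow:
    name: str
    state: str = "pending"      # the one and only state of this workflow
    updated_at: float = 0.0     # time of the last state change, in seconds
    # UUID a client would use to request the real-time log stream.
    log_id: str = field(default_factory=lambda: str(uuid.uuid4()))


class WorkflowStore:
    def __init__(self, broadcast: Callable[[Workflow], None]) -> None:
        self._workflows: List[Workflow] = []
        self._broadcast = broadcast

    def add(self, wf: Workflow) -> None:
        self._workflows.append(wf)
        self._broadcast(wf)

    def set_state(self, wf: Workflow, state: str, now: float) -> None:
        """Overwrite the state in place and broadcast the change."""
        if wf.state in TERMINAL_STATES:
            raise ValueError(f"{wf.name} already finished as {wf.state}")
        wf.state = state
        wf.updated_at = now
        self._broadcast(wf)

    def collect_stale(self, now: float, max_age: float) -> List[Workflow]:
        """Mark unfinished workflows that have been silent too long as timed out."""
        stale = [
            wf for wf in self._workflows
            if wf.state not in TERMINAL_STATES and now - wf.updated_at > max_age
        ]
        for wf in stale:
            self.set_state(wf, "timeout", now)
        return stale
```

Because the state lives in the record, a consumer only ever needs the latest version of each workflow, and garbage collection is just one more state change.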
Collaborative Records
Before talking about shared-private records, this one needs some love. Honestly it deserves its own thread, but I'm leaving it here.
In a git forge, it's pretty common for multiple people to modify the same object. Like, that's the whole point of the platform: collaboration.
Examples of Collaborative Records:
- author and maintainers can modify the issue title
- author and maintainers can change the issue state
- author and maintainers can modify the PR branch (when the author allows it)
To allow these operations, a record should be:
- modifiable by arbitrary user with correct permission
- versioned and fully traceable (we can track atproto records by CID, but the Sync protocol doesn't support version history)
For issue/PR state, we already have a solution using standard atproto records: the sh.tangled.repo.{issue,pull}.state records. But the same technique cannot be applied to the title or other fields; that would be too complex. So ideally all issue/PR state and content would be managed in-record, just like how I'm suggesting we remove sh.tangled.pipeline.status.
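As an illustration of "versioned and fully traceable", here is a minimal sketch of a record that keeps its whole edit history, with each revision pointing at the hash of the previous one. This is only an assumption about what such a record could look like: Revision and VersionedRecord are hypothetical, the DIDs are placeholders, and a plain SHA-256 over JSON stands in for a real CID. Permission checks are deliberately left out.

```python
import hashlib
import json
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Revision:
    editor: str           # DID of whoever made this change
    fields: dict          # full record content after the change
    prev: Optional[str]   # digest of the previous revision, None for the first

    def digest(self) -> str:
        # Stand-in for a CID: SHA-256 over a canonical JSON encoding.
        payload = {"editor": self.editor, "fields": self.fields, "prev": self.prev}
        encoded = json.dumps(payload, sort_keys=True).encode("utf-8")
        return hashlib.sha256(encoded).hexdigest()


class VersionedRecord:
    def __init__(self, author: str, **fields) -> None:
        self.revisions: List[Revision] = [Revision(author, dict(fields), None)]

    @property
    def current(self) -> dict:
        return self.revisions[-1].fields

    def edit(self, editor: str, **changes) -> None:
        """Append a new revision; state and content live in the same record."""
        head = self.revisions[-1]
        self.revisions.append(Revision(editor, {**head.fields, **changes}, head.digest()))

    def verify(self) -> bool:
        """Check that every revision points at the digest of its predecessor."""
        pairs = zip(self.revisions, self.revisions[1:])
        return all(cur.prev == prev.digest() for prev, cur in pairs)


issue = VersionedRecord("did:example:alice", title="Crash on push", state="open")
issue.edit("did:example:bob", title="Crash on force-push")
issue.edit("did:example:bob", state="closed")
```

Title and state changes go through the same path here, which is the point of keeping everything in-record.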
Access Control System
"modifiable by arbitrary user with correct permission"
The obvious follow-up question is "how can we check the permission for an operation?".
Right now, we don't have much of a permission system. Only the author can modify things. Even repo collaborators have pretty limited capabilities, and we have no way to give fine-grained control over each collaborator's permissions.
I can see two solutions for this:
- Knot (or a 'RepositoryServer', if we want to leave knot as just a git server) will store all permissions, and all requests will be proxied through this server. For example, when a repo collaborator tries to trigger the pipeline, the request will pass through the 'RepositoryServer' to the spindle.
- Introduce a publishable permission rule spec. We can publish casbin permissions in the atmosphere so distributed services can follow them. This reminds me of how Leaf just sends raw SQL queries between services.
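The second option could look roughly like this: rules are plain data that any service can load and evaluate locally. This is a hand-rolled sketch and does not use the casbin library or its model format; Rule, Policy, the role names, and the DIDs are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Set


@dataclass(frozen=True)
class Rule:
    role: str       # e.g. "maintainer"
    resource: str   # e.g. "issue" or "pipeline"
    action: str     # e.g. "edit-title" or "trigger"


class Policy:
    """A published rule set plus role assignments for one repository."""

    def __init__(self, rules: Iterable[Rule], roles: Dict[str, Set[str]]) -> None:
        self._rules = set(rules)
        self._roles = roles   # DID -> set of role names

    def allows(self, did: str, resource: str, action: str) -> bool:
        """True if any role held by `did` grants `action` on `resource`."""
        return any(
            Rule(role, resource, action) in self._rules
            for role in self._roles.get(did, set())
        )


policy = Policy(
    rules=[
        Rule("maintainer", "issue", "edit-title"),
        Rule("maintainer", "pipeline", "trigger"),
        Rule("collaborator", "pipeline", "trigger"),
    ],
    roles={
        "did:example:alice": {"maintainer"},
        "did:example:bob": {"collaborator"},
    },
)
```

With something like this, a spindle could answer policy.allows(did, "pipeline", "trigger") on its own, without a round trip to the server that owns the repository.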
Tangled's current RBAC implementation is slightly incorrect, and there is a fix for it. Please see this commit or the sl/spindle-rewrite branch for the correct implementation.
Conclusion
This situation makes me feel the need for one or two new protocols. Not sure if we can make them generic while implementing tangled-specific logic like access control.
Thankfully, the at-uri spec doesn't restrict URIs to the /nsid/rkey pattern. The at-uri format in lexicons is restricted, yes, but nothing is stopping us from defining a new URI scheme like git+at://.
As I said, I don't have a concrete idea yet, so I'm looking for others' thoughts.
I think Repo as DID would allow these records to live on-protocol while still allowing for collaboration. A knot as a PDS is possible, and it doesn’t need to fully comply with the standard PDS implementation. It can have its own XRPC endpoints; it just needs to implement enough to connect to relays and get/sync records and blobs.