
# didplcbft

didplcbft is an experimental PLC implementation. It uses BFT consensus (via CometBFT) to decentralize the hosting and maintenance of did:plc credentials.

⚠️ Despite using blockchain technology, this is not and will not be a cryptocurrency. It is currently in an experimental phase and is not intended for production use.

## The concept

The current plc.directory is a centralized point in the otherwise increasingly decentralized AT Protocol ecosystem. didplcbft explores an alternative approach where the operation of the PLC is distributed across a network of independent validators (home labs, PDS hosts, or other community members) rather than a single organization.

From a foundational point of view, didplcbft is designed to operate independently from plc.directory, such that it could eventually replace it, becoming the de facto PLC service. However, as a transitional approach, didplcbft can act as more of a mirror of an authoritative PLC implementation, namely, plc.directory. This transitional period can last indefinitely; with didplcbft being a consensus-based distributed system, this period would last until a sufficient quorum of participants agrees that it should operate independently from the centralized directory.

Even though a bespoke peer-to-peer protocol for data replication and consensus could be developed for this use case, for reasons of convenience and practicality didplcbft is built on top of existing blockchain-oriented technology, most notably CometBFT. CometBFT is used across multiple cryptocurrency and non-cryptocurrency projects deployed in the real world, and it seems to offer a sufficiently battle-tested solution for replicating an arbitrary deterministic application across computers that are not necessarily operated by trusted parties. Some aspects of CometBFT are desirable for our purposes, while others may be considered unnecessary or undesirable, but so far it has proven flexible enough that most common criticisms of "blockchain" can be mitigated or avoided.
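The property CometBFT provides can be illustrated with a minimal sketch (all types and names below are illustrative, not didplcbft's actual code): as long as the application is a deterministic state machine, every replica that applies the same ordered transactions ends up with identical state, regardless of who operates it.

```go
package main

import (
	"fmt"
	"sort"
)

// Tx is a simplified PLC write: a DID and one operation for its log.
type Tx struct {
	DID string
	Op  string
}

// State is the replicated application state. Every replica that applies the
// same ordered transactions must end up with an identical State.
type State struct {
	ops map[string][]string // DID -> operation log
}

func NewState() *State { return &State{ops: make(map[string][]string)} }

// Apply is the deterministic transition function; in a CometBFT-based system
// this logic runs inside the ABCI callbacks invoked with each agreed block.
func (s *State) Apply(tx Tx) error {
	if tx.DID == "" {
		return fmt.Errorf("empty DID")
	}
	s.ops[tx.DID] = append(s.ops[tx.DID], tx.Op)
	return nil
}

// Summary renders the state in a canonical order so that two replicas can be
// compared byte-for-byte.
func (s *State) Summary() string {
	dids := make([]string, 0, len(s.ops))
	for d := range s.ops {
		dids = append(dids, d)
	}
	sort.Strings(dids)
	out := ""
	for _, d := range dids {
		out += fmt.Sprintf("%s:%d ", d, len(s.ops[d]))
	}
	return out
}

func main() {
	// Two replicas applying the same ordered transactions converge.
	a, b := NewState(), NewState()
	txs := []Tx{{"did:plc:aaa", "genesis"}, {"did:plc:aaa", "rotate"}, {"did:plc:bbb", "genesis"}}
	for _, tx := range txs {
		a.Apply(tx)
		b.Apply(tx)
	}
	fmt.Println(a.Summary() == b.Summary()) // true
}
```

Consensus then reduces to agreeing on the order of transactions; the state itself never needs to be voted on directly.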

## What is the point, now?

It is safe to assume that a network of didplcbft nodes will not become the official PLC implementation any time soon (and probably never will), so one may wonder: why bother? Even in an initial phase where it is non-authoritative, didplcbft can prove useful:

  • For AT Protocol service operators and researchers, the hope is that didplcbft can be a sufficiently efficient local replica of the centralized PLC directory. This can be useful for applications with a high volume of read queries, where any latency and reliability issues associated with accessing a more "distant" internet service become a concern.
  • For AT Protocol users who may not necessarily want to trust the data provided by the official plc.directory at face value, didplcbft can offer a possibly more trustworthy record of changes, through its deterministic and consensus-backed state transitions. In a sense, it can help "keep plc.directory in check." This is somewhat similar to what initiatives like plcbundle offered, but didplcbft is more ambitious in that it tries to be a path towards full decentralization of the directory, including the creation of PLC operations, rather than being a mere read-only replica.
  • From an R&D point of view, didplcbft is a research artifact for exploring whether a blockchain-based approach can handle the transactional volume of plc.directory without introducing excessive overhead or reliability issues compared to a centralized solution.

In a hypothetical future where AT Protocol stewardship decides that didplcbft should become the effective PLC implementation, meaning that the didplcbft network becomes the authoritative source of PLC data, it could offer some benefits over the current centralized approach, most of which are common to many blockchain implementations:

  • Elimination of a single point of failure
    • Better multi-region availability
    • User privacy preservation (no centralized point where statistics/telemetry can be collected)
  • (Arguable) censorship resistance
    • With transparency around the transitions that led to the current state
  • Looking very far into the future, interoperability between different node implementations could help highlight/avoid certain classes of bugs

## Isn't did:web the answer to decentralizing identities?

did:web is certainly a valid option for participating in ATProto without depending on plc.directory. However, did:plc was developed precisely because did:web presented usability and scalability limitations in the context of ATProto. did:web probably "isn't for everyone;" for example, its use for identifiers controlled by individuals who wish to remain private has been discouraged. A decentralized identifier for one's (social) identities feels too fundamental to depend heavily on what is essentially a different form of identifier (a DNS name), particularly when DNS at the internet level is ultimately controlled by the root zone administrators or, in more practical terms, the administrators of each TLD (which very much feels like a "pick your poison" scenario).

There is also the more practical argument that, between did:web and did:plc, the latter is the one that has millions of active users - regardless of whether we like this situation or the path that led to it. It would certainly be nice if all of those identities could be preserved in a decentralized way, without the identities themselves changing to a different method.

We are aware that, when it comes to DID methods in general and not just those "blessed" for use in ATProto, options exist besides did:plc and did:web, including did:webvh, which greatly reduces the dependency on DNS when compared to did:web, and did:fid, which purposefully builds on top of cryptocurrency networks, something didplcbft intentionally avoids.

The field of decentralized identifiers is not a battle where there can only be one winner. We certainly don't think didplcbft is the only path to the decentralization of the identifiers used in ATProto, or the best one. On its own, did:plc definitely has its merits, and to further improve its chance of long-term success, ideally even beyond ATProto, we believe that sooner or later a concrete path to the decentralization of the PLC must be developed. We are trying to work on precisely that.

## Wouldn't a blockchain-based approach be something that warrants its own DID method, rather than piggybacking on did:plc?

We could definitely see our work being the basis for a new, independent identifier method, e.g. "did:blc"/"did:dlc"/[insert even funnier three letter combo here], which could largely keep did:plc's semantics while being internally based on distributed consensus. However, it seemed much more appealing to build a system that can act as a drop-in replacement for the centralized service in certain roles, a system that can resolve real identifiers that already exist right now, instead of a system which would forever wait for user (and service) adoption to be minimally relevant.

We doubt that convincing the ATProto ecosystem to support a different DID method would be easy. Nobody would want to run a node for a DID method used by just a few dozen people, so it would never become a truly decentralized network - the incentives just wouldn't be there. Offering a more gradual path towards decentralization, one that keeps existing identifiers working without their owners having to do anything, seems more realistic. It has the added benefit that, if didplcbft gets to a point where it can be used "in production" as a mirror of plc.directory and gains some user adoption as a result, then the hard work (achieving real-world adoption and a truly decentralized network) will be basically done.

## Features

### Currently implemented

  • PLC HTTP API: Full support for GET and POST operations (resolving DID docs, logs, and audit logs, and submitting new operations), with the caveat that only sequence-based export pagination is supported
    • didplcbft is already able to work as a decentralized, standard-compliant PLC implementation, independently from plc.directory
    • Supports WebSocket-based export equivalent to that of the official plc.directory service (described here).
  • Validation: On-chain validation of PLC operations to ensure history integrity.
  • Authoritative Mirroring: An "authoritative import" mechanism to pull historical data from the official plc.directory.
  • Node-to-node fast syncing: Support for snapshot-based sync to quickly bring new replicas online, making use of the facilities offered by CometBFT.
    • A custom compact serialization format is used, able to archive/transmit the entire directory using around 30 GB of space/data transfer as of January 2026.

### In progress

  • Reputation System: (mostly ready, pending validation) A "proof of useful storage" system where nodes submit periodic proofs to maintain validator status.
    • Due to the way CometBFT works, it is ideal if there is a limit to how many nodes perform validator duties. The reputation system is meant to select, over time, the "best" nodes for this task, and manage their "voting power."
    • The reputation system will have some resistance against Sybil attacks - a single entity operating N identities should not be able to get away with putting in less effort than that of N different honest entities combined.
  • Dynamic Validator Sets: (mostly ready, pending validation) automatic management of which nodes perform validator duties based on their reputation.

### Planned

  • Bi-directional Sync: submitting operations observed on the didplcbft network back to the official plc.directory, while still deferring to operations served by the latter in case of conflict.
  • Spam Prevention: developing a non-currency-based throttling mechanism.
    • For example, by gossiping hashes of IP addresses and AS numbers across the network in order to limit how quickly spammers can create new identities in the PLC. The challenge is that certain entities (e.g. Bluesky's own official PDSs) will naturally need to create many more identities than others... maybe some sort of allowlisting mechanism would need to be implemented.
  • Public Testnet: moving from local clusters to a distributed internet-based environment.
  • Not really a feature, but more of a cross-cutting aspect: proper test coverage (will probably require refactoring the code first to better separate concerns), and perhaps some proper specs around how the proofs and reputation system work.

### Out of scope

  • Financialization: no tokens, no fees, and no "store of value" features will be added.
    • Separation of concerns: why bring finance into an identity system? Identities exist irrespective of their financials.
    • This will hopefully ensure that didplcbft does not run afoul of financial regulations.
    • This encourages participants to join the network not for direct financial gain, but because the network proves useful to them or their service/business.
    • For better or for worse, this also effectively means we can't depend on transaction fees as a rate-limiting mechanism.
  • Timestamp-based pagination on the export endpoint: this would complicate the storage layer (requiring extra space for an additional index, at the very least), and is slated for eventual deprecation in the official PLC implementation.

## Technical notes

  • Consensus-backed data is stored in an IAVL+ tree, which can cryptographically prove and disprove the presence of specific leaves.
  • This tree, along with some non-consensus-backed auxiliary/derived data, lives in a couple of Badger key-value stores, using settings specifically tuned for the use cases at hand.
  • The PLC HTTP API reads the IAVL+ tree directly for read operations, and submits write operations as blockchain transactions.
  • The current implementation of the "proofs of useful storage" involves zkSNARK proofs, mostly as the means to attain a "bounded proof of work" mechanism, where proofs require some effort to produce but can be verified instantly. By "bounded proof of work" we mean that honest nodes don't gain anything from throwing more resources at the system (so it shouldn't devolve into an arms race like Bitcoin's proof of work), and dishonest nodes attempting a Sybil attack will need to spend as much compute to operate each Sybil identity as a single honest node spends operating its own honest identity.
    • Why not enforce that each directory copy is unique through node-specific XOR operations, as with Filecoin sealing? Because this appeared to make the implementation much more complex, and would probably couple the storage layer to the proving mechanism in a way that wouldn't be easy to back out of. Additionally, after some research, it looked like all workable existing "proof of space and time" systems (a la Filecoin) required excessive resources for their "sealing" processes (dozens of gigabytes of RAM, etc.), and they did not seem adequate for somewhat dynamic data, as is the case with PLC operations (which can be modified after creation, when being nullified, and in ultra-rare circumstances where history is rewritten).
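The "prove and disprove the presence of specific leaves" property mentioned above can be illustrated with a plain binary Merkle inclusion proof. Real IAVL+ proofs additionally encode tree heights and key ordering, so treat this as a simplified stand-in:

```go
package main

import (
	"bytes"
	"crypto/sha256"
	"fmt"
)

// h hashes the concatenation of its arguments with SHA-256.
func h(parts ...[]byte) []byte {
	hh := sha256.New()
	for _, p := range parts {
		hh.Write(p)
	}
	return hh.Sum(nil)
}

// Step is one sibling hash on the path from a leaf up to the root.
type Step struct {
	Sibling []byte
	Left    bool // true if the sibling sits on the left of the current node
}

// Verify recomputes the root from a leaf and its path and compares it to the
// trusted root: a match proves the leaf is in the tree, a mismatch refutes it.
func Verify(root, leaf []byte, path []Step) bool {
	cur := h(leaf)
	for _, s := range path {
		if s.Left {
			cur = h(s.Sibling, cur)
		} else {
			cur = h(cur, s.Sibling)
		}
	}
	return bytes.Equal(cur, root)
}

func main() {
	// Build a tiny two-leaf tree by hand.
	a, b := h([]byte("op-a")), h([]byte("op-b"))
	root := h(a, b)
	fmt.Println(Verify(root, []byte("op-a"), []Step{{Sibling: b}})) // true
	fmt.Println(Verify(root, []byte("op-x"), []Step{{Sibling: b}})) // false
}
```

The practical upshot for a PLC mirror: a client holding only the consensus-agreed root hash can check individual responses without trusting the serving node.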

## Getting started

This project is at a stage where it is very much oriented towards further development work, and not yet towards the operation of "real" nodes.

Note: the code currently contains various debug prints and unstructured logging. There are multiple // TODO comments spread throughout. It is definitely a work in progress.

### Contribution prerequisites

### Running locally

For development and testing, use the provided helper scripts:

  • For a single node: `./startfresh.sh`
    • This will clear the data and configuration in the `didplcbft-data` directory and start a single node.
    • At least at the time this README was written, the node will immediately begin importing operations from plc.directory (1000 operations/block).
      • To disable this, comment out the relevant code in `PrepareProposal` - the `if req.Height == 2 {` condition.
      • With this disabled, didplcbft will act as a fully independent PLC, only serving the operations you create through it.
  • For a local testnet (multiple nodes): `./startfresh-testnet.sh`
    • This will clear the data and configuration for multiple nodes in the `testnet` directory and start 4 nodes by default (you can pass the number of nodes as the first argument to the script).

The PLC API server listens on `127.0.0.1:28080` by default. When running a testnet, the script makes it so that only one of the nodes serves this API.

To easily import operations, you can play around with the "test" in `importer/importer_test.go`. Note that it imports operations by creating blockchain transactions directly rather than using the PLC API, but you can probably ask your favorite LLM to change it as you see fit. You can also use your favorite existing tools for interacting with the PLC (e.g. goat) by pointing them at the PLC API endpoint mentioned above, rather than plc.directory.
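For a programmatic starting point, resolving a DID document from a local node is a plain GET of `/{did}` against the API base URL, per the standard PLC HTTP API. A minimal sketch in Go (the DID below is a placeholder; substitute one that exists in your node's data):

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
)

// resolveURL builds the PLC resolution endpoint: GET {base}/{did} returns
// the DID document as JSON.
func resolveURL(base, did string) string {
	return base + "/" + did
}

// resolve fetches and decodes a DID document from a PLC-compatible server.
func resolve(base, did string) (map[string]any, error) {
	resp, err := http.Get(resolveURL(base, did))
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("unexpected status: %s", resp.Status)
	}
	var doc map[string]any
	if err := json.NewDecoder(resp.Body).Decode(&doc); err != nil {
		return nil, err
	}
	return doc, nil
}

func main() {
	// Placeholder DID; replace with a real one served by your node.
	doc, err := resolve("http://127.0.0.1:28080", "did:plc:aaaaaaaaaaaaaaaaaaaaaaaa")
	if err != nil {
		fmt.Println("could not resolve (is a local node running?):", err)
		return
	}
	fmt.Println("resolved:", doc["id"])
}
```

Because the node speaks the same API as plc.directory, existing PLC client code should work unchanged once pointed at the local base URL.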

## Contact

didplcbft is being brought to you by us - meaning gbl08ma and his spare time 👋 Unsurprisingly, you can reach me on Bluesky, and from there we can move to other communication platforms as we see fit.

## Disclaimer

If you happen to know who my employer is - this is not endorsed or condoned by them; this wasn't developed on company time, nor using company resources; no, I don't think this constitutes a conflict of interest; no, it doesn't make use of any trade secrets, proprietary, or confidential information, nor does it make use of any skills specific to my role (as much as it may look like it, from an uninformed point of view). If you don't know, no, I won't tell you who my employer is.