---
title: Repo format
description: Specification of the Repo format
layout: ../../../layouts/MainLayout.astro
---

**All our protocols and formats use the binary codec called [BARE](https://baremessages.org/)**.

Only the formats that have been implemented so far are listed here. The whole design is bigger than that.

## CommitBody

List of CommitBodyV0 already implemented.

### Repository

First commit published in the root branch, signed by the repository key.

- For the Root repo of a store, the convergence_key should be derived from: ("NextGraph Data BLAKE3 key", RepoId + RepoWriteCapSecret)
- For a private store root repo, the RepoWriteCapSecret can be omitted

```rust
struct RepositoryV0 {
    /// Repo public key ID
    id: RepoId,

    /// Verification program (WASM)
    verification_program: Vec,

    /// Optional serialization of a ReadBranchLink
    fork_of: Vec,

    /// User ID who created this repo
    creator: Option,

    /// Immutable App-specific metadata (unused for now)
    metadata: Vec,
}
```

- fork_of : ReadBranchLink (of a root branch or a transactional branch), if the repository is a fork of another one. The transactional branches of this new repo will then be able to reference the commits of the forked repo/branches as DEPS in their singleton Branch commit.

### RootBranch

Second commit in the root branch, signed by the repository key.

It is also used to update the root branch definition when users are removed, quorum(s) are changed, or the repo is moved to another store. It is signed by its author, and requires an additional SyncSignature by the total_order_quorum or by the owners_quorum.

**DEPS**: Reference to the previous root branch definition commit, if it is an update.
```rust
struct RootBranchV0 {
    /// Branch public key ID, equal to the repo_id
    id: PubKey,

    /// Reference to the repository commit, in order to get
    /// the verification_program and other immutable details
    repo: ObjectRef,

    /// Store ID the repo belongs to
    /// the identity is checked by verifiers (check overlay is matching)
    store: StoreOverlay,

    /// signature of repoId with store's partial_order signature
    /// in order to verify that the store recognizes this repo as part of itself.
    /// only if not a store root repo itself
    store_sig: Option,

    /// Pub/sub topic ID for publishing events about the root branch
    topic: TopicId,

    /// topic private key (a BranchWriteCapSecret)
    topic_privkey: Vec,

    inherit_perms_users_and_quorum_from_store: Option,

    /// Quorum definition ObjectRef
    quorum: Option,

    /// BEC periodic reconciliation interval. zero deactivates it
    reconciliation_interval: RelTime,

    /// list of owners
    owners: Vec,

    owners_write_cap: Vec,

    /// Mutable App-specific metadata (not used)
    metadata: Vec,
}
```

- topic_privkey : topic private key (a BranchWriteCapSecret), encrypted with a nonce = 0 and a key derived as follows: BLAKE3 derive_key ("NextGraph Branch WriteCap Secret BLAKE3 key", RepoWriteCapSecret, TopicId, BranchId ) so that only editors of the repo can decrypt the privkey. Not encrypted for individual store repo.

- inherit_perms_users_and_quorum_from_store : if set, permissions are inherited from the referenced Store Repo (only set if this repo is not the store repo itself). Check that it matches the self.store. Can only be committed by an owner. It generates a new certificate. Owners are not inherited from the store. A new RootBranch commit should be published (RootCapRefresh, only read_cap changes) every time the store read cap changes. Empty for private repos.

- owners : all of them are required to sign any RootBranch that modifies the list of owners or the inherit_perms_users_and_quorum_from_store field.

- owners_write_cap : when the list of owners is changed, a crypto_box containing the RepoWriteCapSecret should be included here for each owner. This should also be done at creation time, with the UserId of the first owner, except for an individual private store repo, because it doesn't have a RepoWriteCapSecret. The vector has the same order and size as the owners one. Each owner finds their write_cap here.

### Branch

Transactional Branch definition.

First commit in a branch, signed by the branch key. In case of a fork, the commit **DEPS** indicate the previous branch heads, and the **ACKS** are empty.

Can also be used to update the branch definition when users are removed (in order to refresh the ReadCap). In this case, the total_order quorum is needed, and **DEPS** indicates the BranchCapRefresh commit.

```rust
struct BranchV0 {
    /// Branch public key ID
    id: PubKey,

    crdt: BranchCrdt,

    /// Reference to the repository commit
    repo: ObjectRef,

    root_branch_readcap_id: ObjectId,

    /// Pub/sub topic for publishing events
    topic: PubKey,

    topic_privkey: Vec,

    pulled_from: Vec,

    /// App-specific metadata (not used)
    metadata: Vec,
}
```

- root_branch_readcap_id : object ID of the current root_branch commit (ReadCap), in order to keep this branch in sync with the root_branch. The key is not provided because external readers should not be able to access the root branch definition. It is only used by verifiers (who have the key already).

- topic_privkey : a BranchWriteCapSecret, encrypted with a nonce = 0 and a key derived as follows: BLAKE3 derive_key ("NextGraph Branch WriteCap Secret BLAKE3 key", RepoWriteCapSecret, TopicId, BranchId ) so that only editors of the repo can decrypt the privkey. For individual store repo, the RepoWriteCapSecret is zero.

- pulled_from : optional: this branch is the result of a pull request coming from another repo.
  Contains a serialization of a ReadBranchLink of a transactional branch from another repo.

### AddBranch

Add a branch to the repository.

If it is part of the Repository creation, which needs to create several branches (Store, Overlay or User branches, and Header branch), then it is also signed by the SyncSignature.

**DEPS**: if update branch: previous AddBranch commit of the same BranchId

```rust
struct AddBranchV0 {
    /// the new topic_id
    topic_id: Option,

    /// the new branch definition commit
    branch_read_cap: Option,

    crdt: BranchCrdt,

    branch_id: BranchId,

    branch_type: BranchType,

    fork_of: Option,

    merged_in: Option,
}
```

- topic_id : the new topic_id (will be needed immediately by future readers in order to subscribe to the pub/sub). Should be identical to the one in the Branch definition. None if merged_in.

- branch_read_cap : the new branch definition commit (we need the ObjectKey in order to open the pub/sub Event). None if merged_in.

- crdt : one of (the string indicates the primary class of the branch)

  - Graph(String),
  - YMap(String),
  - YArray(String),
  - YXml(String),
  - YText(String),
  - Automerge(String),
  - Elmer(String),

- branch_type : one of
  - Header,
  - Main,
  - Transactional,
  - Store,
  - Overlay,
  - User,
  - Chat,
  - Stream,
  - Comments,
  - BackLinks,
  - Context,

### SyncSignature

Sync Threshold Signature of a single commit or a chain of commits. Points to the new Signature Object. Based on the total order quorum (or owners quorum).

Mandatory for UpdateRootBranch, UpdateBranch, some AddBranch, RemoveBranch, RemoveMember, RemovePermission, Quorum, Compact, sync Transaction, RootCapRefresh, BranchCapRefresh

**DEPS**: the last signed commit in chain

**ACKS**: previous head before the chain of signed commit(s).
should be identical to the HEADS (marked as DEPS) of first commit in chain

```rust
enum SyncSignature {
    V0(ObjectRef),
}
```

### AsyncSignature

Async Threshold Signature of a commit (or commits) based on the partial order quorum

Can sign Transaction, AddFile, and Snapshot, after they have been committed to the DAG.

**DEPS**: the signed commits

```rust
enum AsyncSignature {
    V0(ObjectRef),
}
```

### Signature

A Signature Object (it is not a commit), referenced in AsyncSignature or SyncSignature. Contains all the information that the signers have prepared.

```rust
enum Signature {
    V0(SignatureV0),
}

struct SignatureV0 {
    /// the content that is signed
    content: SignatureContent,

    /// The threshold signature itself. can come from 3 different sets
    threshold_sig: ThresholdSignatureV0,

    /// Reference to the Certificate that must be used to verify this signature.
    certificate_ref: ObjectRef,
}

enum SignatureContent {
    V0(SignatureContentV0),
}

struct SignatureContentV0 {
    /// list of all the "end of chain" commit for each branch
    /// when doing a SyncSignature, or
    /// a list of arbitrary commits to sign, for AsyncSignature.
    commits: Vec,
}

// the threshold signature itself. with indication which set was used
enum ThresholdSignatureV0 {
    PartialOrder(threshold_crypto::Signature),
    TotalOrder(threshold_crypto::Signature),
    Owners(threshold_crypto::Signature),
}
```

### Certificate

A Certificate object (not a commit) containing all the information needed to verify a signature.

Certificates form a chain that represent the change of signing epoch, with at the root of the chain, the public key of the repo (RepoID)

```rust
enum Certificate {
    V0(CertificateV0),
}

struct CertificateV0 {
    /// content of the certificate, which is signed here
    /// below by the previous certificate signers.
    content: CertificateContentV0,

    /// signature over the content.
    sig: CertificateSignatureV0,
}

enum CertificateSignatureV0 {
    /// the root CertificateContentV0 is signed with the PrivKey of the Repo
    Repo(Sig),

    /// Any other certificate in the chain of trust is signed by the
    /// total_order quorum of the previous certificate
    /// hence establishing the chain of trust.
    TotalOrder(threshold_crypto::Signature),

    Owners(threshold_crypto::Signature),

    Store,
}

/// A Certificate content, signed by the previous certificate signers.
struct CertificateContentV0 {
    previous: ObjectRef,

    readcap_id: ObjectId,

    /// PublicKey used by the Owners. verifier uses this PK
    /// if the signature was issued by the Owners.
    owners_pk_set: threshold_crypto::PublicKey,

    /// the two "orders" PublicKeys (total_order and partial_order)
    orders_pk_sets: OrdersPublicKeySetsV0,
}

/// Enum for "orders" PKsets.
enum OrdersPublicKeySetsV0 {
    Store(ObjectRef),
    Repo(
        (
            threshold_crypto::PublicKey,
            Option,
        ),
    ),
    None,
}
```

- CertificateSignatureV0.TotalOrder : indicates that the total_order quorum set has been used to sign the certificate.

- CertificateSignatureV0.Owners : indicates that the owners set signed the certificate. If the previous cert's total order PKset has a threshold value of 0 or 1 (1 or 2 signers in the quorum), then it is allowed that the next certificate (this one) will be signed by the owners PKset instead. This is for a simple reason: if a user is removed from the list of signers in the total_order quorum, then in those 2 cases, the excluded signer will probably not cooperate in their own exclusion, and will not sign the new certificate. To avoid deadlocks, we allow the owners to step in and sign the new cert instead. The Owners are also used when there is no quorum/signer defined (OrdersPublicKeySetsV0::None).

- CertificateSignatureV0.Store : in case the new certificate being signed is an update on the store certificate (OrdersPublicKeySetsV0::Store(ObjectRef) has changed from previous cert) then the signature is in that new store certificate, and not here.
  Nothing else should have changed in the CertificateContent, and the validity of the new store cert has to be checked.

- CertificateContentV0.previous : the previous certificate in the chain of trust. Can be another Certificate or the Repository commit's body when we are at the root of the chain of trust.

- CertificateContentV0.readcap_id : the Commit Id of the latest RootBranch definition (= the ReadCap ID) in order to keep in sync with the options for signing. Not used for verifying (this is why the secret is not present).

- OrdersPublicKeySetsV0 : Can be inherited from the store. In this case, it is an ObjectRef pointing to the latest Certificate of the store. Or it can be 2 PublicKeys defined specially for this repo:

  - OrdersPublicKeySetsV0::Repo.0 : one for the total_order (first one).
  - OrdersPublicKeySetsV0::Repo.1 : the other for the partial_order (second one; it is optional, as some repos are forcefully totally ordered and do not have this set).

- OrdersPublicKeySetsV0::None : the total_order quorum is not defined (yet, or anymore). There are no signers for the total_order, nor for the partial_order. The owners replace them.

### StoreUpdate

Updates the ReadCap of the public, protected, Group and Dialog stores of the User.

This is used to speed up joining the overlay of such stores, for new devices on new brokers (or for the web-app that always needs it), so they don't have to read the whole pub/sub of the StoreRepo in order to get the last ReadCap.

**DEPS** : to the previous ones (if any)

```rust
struct StoreUpdateV0 {
    // id of the store.
    store: StoreRepo,

    store_read_cap: ReadCap,

    overlay_branch_read_cap: ReadCap,

    /// Metadata (not used)
    metadata: Vec,
}
```

### AddRepo

Adds a repo into the store branch.

The repo's `store` field should match the destination store.

**DEPS**: to the previous AddRepo commit(s) if it is an update.
in this case, repo_id of the referenced rootbranch definition(s) should match

```rust
struct AddRepoV0 {
    read_cap: ReadCap,

    /// Metadata (not used)
    metadata: Vec,
}
```

### AddSignerCap

Adds a SignerCap into the user branch.

So that a user can share with all its device a new signing capability that was just created.

The cap's `epoch` field should be dereferenced and the user must be part of the quorum/owners.

**DEPS**: to the previous AddSignerCap commit(s) if it is an update. in this case, repo_ids have to match, and the referenced rootbranch definition(s) should have compatible causal past (the newer AddSignerCap must have a newer epoch compared to the one of the replaced cap )

```rust
struct AddSignerCapV0 {
    cap: SignerCap,

    /// Metadata (not used)
    metadata: Vec,
}

/// when a signing capability is removed, a new SignerCap should be
/// committed to User branch, with the removed key set to None
struct SignerCap {
    repo: RepoId,

    /// latest RootBranch commit or Quorum commit that defines the signing epoch
    epoch: ObjectId,

    owner: Option,

    total_order: Option,

    partial_order: Option,
}
```

### Transaction

Commit used for both AsyncTransaction and SyncTransaction on a transactional branch.

Contains a serialization of CRDT operation contained in a TransactionBody

```rust
enum Transaction {
    V0(TransactionV0),
}

type TransactionV0 = Vec;

struct TransactionBody {
    body_type: TransactionBodyType,
    graph: Option,
    discrete: Option,
}

// Triple is an oxigraph::oxrdf::triple::Triple
struct GraphTransaction {
    inserts: Vec,
    removes: Vec,
}

enum DiscreteTransaction {
    /// A serialization of a yrs::Update
    YMap(Vec),
    YArray(Vec),
    YXml(Vec),
    YText(Vec),
    /// An automerge::Patch
    Automerge(Vec),
}
```

- TransactionBodyType : one of :
  - Graph
  - Discrete
  - Both

### AddFile

Add a new binary file in a branch

**FILES**: the file ObjectRef

```rust
struct AddFileV0 {
    /// an optional name.
    /// does not conflict
    /// (not unique across the branch nor repo)
    name: Option,

    /// Metadata (not used)
    metadata: Vec,
}
```

### Snapshot

Snapshot of a Branch.

Contains a reference to a Snapshot Object computed from the commits at the specified head.

```rust
struct SnapshotV0 {
    // Branch heads the snapshot was made from
    heads: Vec,

    /// Reference to Object containing Snapshot data structure
    content: ObjectRef,
}
```

The referenced Object content is a JSON serialization (UTF8) as a string.

The JSON has the form:

```json
{
  "discrete": "", // depends on the CrdtType
  "graph": ["one triple in Turtle format", "another triple in Turtle format"]
}
```

## Blocks and Commits format

### Commit

Commit object

Signed by member private key authorized to publish the commitBody's type

```rust
struct CommitV0 {
    /// Commit content
    content: CommitContent,

    /// Signature over the content by the author. an editor (UserId)
    sig: Sig,
}

enum CommitContent {
    V0(CommitContentV0),
}

struct CommitContentV0 {
    /// Commit author (a hash of UserId)
    author: Digest,

    /// BranchId the commit belongs to
    branch: BranchId,

    perms: Vec,

    header_keys: Option,

    /// This commit can only be accepted if signed by this quorum
    quorum: QuorumType,

    timestamp: Timestamp,

    /// App-specific metadata (not used)
    metadata: Vec,

    /// reference to an Object with a CommitBody inside.
    body: ObjectRef,
}

enum CommitHeaderKeys {
    V0(CommitHeaderKeysV0),
}

struct CommitHeaderKeysV0 {
    acks: Vec,

    deps: Vec,

    /// head commits that are invalid
    nacks: Vec,

    files: Vec,
}
```

- CommitHeaderKeysV0.files : list of Files that are referenced in this commit. Exceptionally this is an ObjectRef, because even if the CommitHeader is omitted, we want the Files to be openable.

CommitContentV0:

- author : Commit author, a BLAKE3 keyed hash of UserId. key is a BLAKE3 derive_key ("NextGraph UserId Hash Overlay Id for Commit BLAKE3 key", overlayId).
  The hash will be different from the one used for ForwardedPeerAdvertV0, so that core brokers dealing with public sites won't be able to correlate commits and editing peers (via a common author's hash). Only the brokers of the authors that pin a repo for Outer Overlay exposure will be able to correlate. It is also a different hash than the OuterOverlayId, which is good to prevent correlation when the RepoId is used as author (for Repository, RootBranch and Branch commits).

- branch : BranchId the commit belongs to (not a ref, as readers do not need to access the branch definition)

- perms : optional list of dependencies on some commits in the root branch that contain the write permission needed for this commit

- header_keys : Keys counterpart of all the references present in the header (deps, acks, files, etc...) as the header only has the IDs.

- quorum : one of :

  - NoSigning
  - PartialOrder
  - TotalOrder
  - Owners
  - IamTheSignature

- body: When the commit is reverted or erased (after compaction), the CommitBody is deleted, creating a dangling reference

### CommitHeader

Header of a Commit, can be embedded or used as a reference.

Contains only the IDs of the references present in the Header (acks, deps, files, etc...). The keys are in CommitContent.

The Brokers can read the CommitHeader.

On the ExtProtocol, the CommitHeader is stripped from the blocks, so that external readers cannot reconstruct the DAG (they will only be able to read the commit they have been given access to). If the full branch needs to be shared on the ExtProtocol, then a ReadCap of a Branch should be shared instead.

The decision to embed or reference is made according to the space left in the CommitContent. Most of the time, it will be embedded.

```rust
struct CommitHeaderV0 {
    /// current valid commits in HEAD before inserting new commit
    acks: Vec,

    /// Other objects this commit strongly depends on
    deps: Vec,

    /// dependency that is removed after this commit.
    /// used for reverts
    ndeps: Vec,

    compact: bool,

    /// head commits that are invalid
    nacks: Vec,

    /// list of Files that are referenced in this commit
    files: Vec,

    /// list of Files that are not referenced anymore after this commit
    /// the commit(s) that created the files should be in deps
    nfiles: Vec,
}
```

- compact : tells brokers that this is a hard snapshot and that all the ACKs and the full causal past should be treated as ndeps (their body removed). Brokers will only perform the deletion of bodies after this commit has been ACKed by at least one subsequent commit. But if the next commit is a nack, the deletion is aborted.

### RandomAccessFile

A Random Access File is an immutable binary file that can be stored in the repo, and that can be read in random access.

If the file is big, there is no extra cost when seeking to a specific position inside the file or when decrypting it: everything is streamable and can also be run in parallel. The uploading is also streamable and can be done concurrently.

Each block that composes the file is directly addressable. There is no need to decrypt the whole file first, nor to have the whole file in memory before being able to read it.

```rust
enum RandomAccessFileMeta {
    V0(RandomAccessFileMetaV0),
}

struct RandomAccessFileMetaV0 {
    content_type: String,
    metadata: Vec,
    total_size: u64,
    chunk_size: u32,
    arity: u16,
    depth: u8,
}
```

### Object

An object is constructed in the following way:

- an ObjectContent is serialized
- the resulting buffer is chunked into blocks
- each block is encrypted separately
- the blocks are assembled in a merkle tree, the root of which is the identifier for the object.
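
The construction steps above can be illustrated with a minimal sketch. It is not the actual implementation: `stand_in_id`, `build_tree` and the `u64` block ID are hypothetical names introduced here, the standard library hasher replaces BLAKE3, the per-block encryption step is left out, and the rule for grouping child IDs by `arity` is an assumption.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Stand-in for a block ID (assumption: the real format uses a BLAKE3 digest).
type BlockId = u64;

/// Hypothetical helper: derives a stand-in ID from the bytes of a block.
fn stand_in_id(bytes: &[u8]) -> BlockId {
    let mut hasher = DefaultHasher::new();
    bytes.hash(&mut hasher);
    hasher.finish()
}

/// Chunks `content`, then groups block IDs `arity` at a time until one root remains.
/// Returns (root ID, depth of the tree, total number of blocks).
fn build_tree(content: &[u8], chunk_size: usize, arity: usize) -> (BlockId, usize, usize) {
    assert!(chunk_size > 0 && arity >= 2);

    // Leaves: one block per chunk of the serialized content.
    // The encryption of each block is omitted in this sketch.
    let mut level: Vec<BlockId> = if content.is_empty() {
        vec![stand_in_id(&[])]
    } else {
        content.chunks(chunk_size).map(stand_in_id).collect()
    };

    let mut depth = 0;
    let mut blocks = level.len();

    // Internal nodes: each parent is identified from the IDs of its children.
    while level.len() > 1 {
        level = level
            .chunks(arity)
            .map(|children| {
                let bytes: Vec<u8> = children
                    .iter()
                    .flat_map(|id| id.to_le_bytes())
                    .collect();
                stand_in_id(&bytes)
            })
            .collect();
        blocks += level.len();
        depth += 1;
    }

    (level[0], depth, blocks)
}
```

With a 10-byte buffer, a chunk size of 4 and an arity of 2, this sketch produces 3 leaves, 2 intermediate nodes and 1 root, so a depth of 2 and 6 blocks in total. A buffer that fits in one chunk yields a single block, which is both leaf and root.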

```rust
enum ObjectContent {
    V0(ObjectContentV0),
}

enum ObjectContentV0 {
    Commit(Commit),
    CommitBody(CommitBody),
    CommitHeader(CommitHeader),
    Quorum(Quorum),
    Signature(Signature),
    Certificate(Certificate),
    SmallFile(SmallFile),
    RandomAccessFileMeta(RandomAccessFileMeta),
    RefreshCap(RefreshCap),
    Snapshot(Vec), // JSON serialization (UTF8)
}
```

### Block

Immutable block with encrypted content, of maximal size 1MB.

An `ObjectContent` is chunked and stored as `Block`s in a Merkle tree. Each Block is a Merkle tree node.

The brokers can see the children's IDs, but cannot read their content, as the keys are encrypted.

```rust
enum Block {
    V0(BlockV0),
}

/// unencrypted part of the Block
struct BlockV0 {
    commit_header_key: Option,

    content: BlockContent,
}

enum BlockContent {
    V0(BlockContentV0),
}

struct BlockContentV0 {
    commit_header: CommitHeaderObject,

    /// Block IDs for child nodes in the Merkle tree,
    children: Vec,

    encrypted_content: Vec,
}

enum CommitHeaderObject {
    /// if header is a reference
    Id(ObjectId),

    /// if header is embedded
    EncryptedContent(Vec),

    None,

    RandomAccess,
}

enum ChunkContentV0 {
    /// list of Keys of the child nodes
    InternalNode(InternalNode),

    /// one chunk of an ObjectContentV0
    DataChunk(Vec),
}

type InternalNode = Vec
```

- BlockV0.commit_header_key : optional Key needed to open the CommitHeader. can be omitted if the Commit is shared without its causal past, or if the block is not a root block of commit, or that commit is a root commit (first in branch)

- BlockContentV0.commit_header : Reference (actually, only its ID or an embedded block if the size is small enough) to a CommitHeader of the root Block of a commit that contains references to other objects (e.g. Commit deps & acks). Only set if the block is a commit (and it is the root block of the Object). It is an easy way to know if the Block is a commit (but be careful because some root commits can be without a header).

- BlockContentV0.children : Block IDs for child nodes in the Merkle tree.
  It is empty if ObjectContent fits in one block or this block is a leaf. in both cases, encrypted_content is then not empty

- BlockContentV0.encrypted_content : contains an encrypted ChunkContentV0, encrypted using convergent encryption with ChaCha20: nonce = 0 and key = BLAKE3 keyed hash (convergence_key, plaintext of ChunkContentV0), with convergence_key = BLAKE3 derive_key ("NextGraph Data BLAKE3 key", StoreRepo + store's repo ReadCapSecret ) which is basically similar to the InnerOverlayId but not hashed, so that brokers cannot do "confirmation of a file" attacks.

## Common types

```rust
/// 32-byte Blake3 hash digest
type Blake3Digest32 = [u8; 32];

enum Digest {
    Blake3Digest32(Blake3Digest32),
}

/// ChaCha20 symmetric key
type ChaCha20Key = [u8; 32];

enum SymKey {
    ChaCha20Key(ChaCha20Key),
}

/// Curve25519 public key Edwards form
type Ed25519PubKey = [u8; 32];

/// Curve25519 public key Montgomery form
type X25519PubKey = [u8; 32];

/// Curve25519 private key Edwards form
type Ed25519PrivKey = [u8; 32];

/// Curve25519 private key Montgomery form
type X25519PrivKey = [u8; 32];

enum PubKey {
    Ed25519PubKey(Ed25519PubKey),
    X25519PubKey(X25519PubKey),
}

enum PrivKey {
    Ed25519PrivKey(Ed25519PrivKey),
    X25519PrivKey(X25519PrivKey),
}

type Ed25519Sig = [[u8; 32]; 2];

enum Sig {
    Ed25519Sig(Ed25519Sig),
}

/// Timestamp: absolute time in minutes since 2022-02-22 22:22 UTC
type Timestamp = u32;

const EPOCH_AS_UNIX_TIMESTAMP: u64 = 1645568520;

type RepoId = PubKey;

type RepoHash = Digest;

type TopicId = PubKey;

type UserId = PubKey;

type BranchId = PubKey;

type BlockId = Digest;

type BlockKey = SymKey;

struct BlockRef {
    /// Block ID
    id: BlockId,

    /// Key for decrypting the Block
    key: BlockKey,
}

type ObjectId = BlockId;

type ObjectKey = BlockKey;

type ObjectRef = BlockRef;

type ReadCap = ObjectRef;

type ReadCapSecret = ObjectKey;

type RepoWriteCapSecret = SymKey;

type BranchWriteCapSecret = PrivKey;

type OuterOverlayId = Digest;

type InnerOverlayId = Digest;

enum OverlayId {
    Outer(Blake3Digest32),
    Inner(Blake3Digest32),
    Global,
}

enum StoreOverlayV0 {
    PublicStore(PubKey),
    ProtectedStore(PubKey),
    PrivateStore(PubKey),
    Group(PubKey),
    Dialog(Digest),
}

enum StoreOverlay {
    V0(StoreOverlayV0),
    OwnV0(StoreOverlayV0),
}

enum StoreRepoV0 {
    PublicStore(RepoId),
    ProtectedStore(RepoId),
    PrivateStore(RepoId),
    Group(RepoId),
    Dialog((RepoId, Digest)),
}
```

- ReadCap : Read capability (for a commit, branch, whole repo, or store)

  - For a store: A ReadCap to the root repo of the store
  - For a repo: A reference to the latest RootBranch definition commit
  - For a branch: A reference to the latest Branch definition commit
  - For a commit or object, the ObjectRef is itself the read capability

- ReadCapSecret : Read capability secret (for a commit, branch, whole repo, or store). It is already included in the ReadCap (it is the key part of the reference).

- RepoWriteCapSecret : Write capability secret (for a whole repo)

- BranchWriteCapSecret : Write capability secret (for a branch's topic)

- Overlay ID :

  - for outer overlays that need to be discovered by public key: BLAKE3 hash over the public key of the store repo
  - for inner overlays: BLAKE3 keyed hash over the public key of the store repo with key = BLAKE3 derive_key ("NextGraph Overlay ReadCapSecret BLAKE3 key", store repo's overlay's branch ReadCapSecret), except for Dialog Overlays where the Hash is computed from 2 secrets.

- StoreOverlay::OwnV0 : The repo is a store, so the overlay can be derived from its own ID. In this case, the branchId of the `overlay` branch is entered here as PubKey of the StoreOverlayV0 variants.
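
The `Timestamp` type in the common types counts minutes since 2022-02-22 22:22 UTC, and `EPOCH_AS_UNIX_TIMESTAMP` gives that instant in Unix seconds. As a minimal sketch of the conversion in both directions, the following could be used. The two function names are hypothetical and not part of the format; the rounding down to the minute is an assumption.

```rust
use std::convert::TryFrom;

/// Timestamp: absolute time in minutes since 2022-02-22 22:22 UTC
type Timestamp = u32;

/// Unix time, in seconds, of 2022-02-22 22:22 UTC (value taken from the common types).
const EPOCH_AS_UNIX_TIMESTAMP: u64 = 1645568520;

/// Hypothetical helper: converts a format Timestamp (minutes) to Unix seconds.
fn timestamp_to_unix_secs(ts: Timestamp) -> u64 {
    EPOCH_AS_UNIX_TIMESTAMP + (ts as u64) * 60
}

/// Hypothetical helper: converts Unix seconds to a format Timestamp,
/// rounding down to the minute (assumption).
/// Returns None for instants before the epoch or beyond the u32 range.
fn unix_secs_to_timestamp(unix_secs: u64) -> Option<Timestamp> {
    let secs_since_epoch = unix_secs.checked_sub(EPOCH_AS_UNIX_TIMESTAMP)?;
    u32::try_from(secs_since_epoch / 60).ok()
}
```

For example, a `Timestamp` of 1 maps to Unix time 1645568580, and any Unix time earlier than the epoch cannot be represented.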