Client Protocol

master
Niko PLP 3 weeks ago
parent 6118df7188
commit 22a4963720
Files changed:
- src/pages/en/design.md (2)
- src/pages/en/ecosystem.md (4)
- src/pages/en/framework/crdts.md (66)
- src/pages/en/framework/semantic.md (22)
- src/pages/en/specs/format-repo.md (116)
- src/pages/en/specs/protocol-client.md (478)
- src/pages/en/specs/protocol-ext.md (6)
- src/pages/en/verifier.md (2)

@ -6,7 +6,7 @@ layout: ../../layouts/MainLayout.astro
In the previous chapters, we already introduced the concepts of:
- [Local-First](/en/local-first) and [Encryption](/en/encryption) where we explain why we need a novel E2EE sync protocol, and what is the “2-tier” topology of our network.
- [Data-First](/en/framework/data-first) and [CRDTs](/en/framework/crdts) that bring interoperability, and how [DID URIs](/en/framework/nuri) bring portability to the [Documents](/en/documents) managed by our [Semantic Web](/en/framework/semantic) database.

@ -12,11 +12,11 @@ NextGraph ecosystem is composed of
- A Framework for App developers, with SDK for nodeJS/Deno, web apps in Svelte and React, an SDK for Rust, and soon for Tauri apps too.
- An App store integrated inside our official apps, so that App developers can easily reach their audience, and users can easily install those apps. Those apps will run inside an iframe embedded in our official apps.
- App developers can also build standalone apps with our framework, without being embedded inside our official apps. They can be shipped separately, and have total control on their GUIs. In this case, they will still need to integrate with our Capabilities APIs, in order to request permissions for using the users' data. More on that soon.
- An Open Network that anybody can join, by running their own self-hosted broker, or by using our Open Protocol (with the libraries we already provide or with new bindings they develop).
All the pieces of this ecosystem are being built at the moment. We are currently focusing our efforts on readying the Framework for App developers. Stay tuned!

@ -64,62 +64,62 @@ As you will see in the **branch** section below, the programmer decides which CR
Now let's have a look at what those CRDTs have in common and what is different between them. We have marked 🔥 the features that are unique to each model and that we find very cool.
| | Graph (RDF) | Yjs | Automerge |
| --- | --- | --- | --- |
| &nbsp; | | | |
| &nbsp; | | | |
| **key/value** | ✅ | ✅ | ✅ |
| <td colspan=3> (aka property/value, and in RDF, it is called predicate/object) this is the basic feature that the 3 models offer. You have a Document, and you can add key/value pairs to it. Also known as Map or Object. |
| **property names** | ✅ 🔥 predicate | string | string |
| <td colspan=3> Thanks to the Ontology/Schema mechanism of RDF (OWL), the schema information is embedded inside the data (with what we call a Predicate), thus avoiding any need for schema migration. Also, it brings full data interoperability as many common ontologies already exist and can be reused, and new ones can be published or shared |
| **nested** | ✅ blank nodes | ✅ | ✅ |
| <td colspan=3> key/value pairs can be nested (like in JSON) |
| **sequence** | ❌ \* | ✅ | ✅ |
| <td colspan=3> And like in JSON or Javascript, some keys can be Arrays (aka list), which preserve the ordering of the elements. (\*) In RDF, storing arrays is a bit more tricky. For that purpose, Collections can encode order, but are not CRDT based nor concurrency safe |
| **sets** | 🔺 multiset | ✅ | ✅ |
| <td colspan=3> RDF predicates (the equivalent of properties or keys in discrete documents) are not unique. They can have multiple values. That's the main difference between the discrete and graph models. We will offer the option to enforce rules on RDF data (with SHACL/SHEX) that could force unicity of keys, but that would require the use of **Synchronous Transactions**. Sets are usually represented in JS/JSON with a map, of which the values are not used (set to null or false), because keys are unique, so we use the keys to represent the set. In RDF, keys are not unique, but a set can be represented by choosing a single key (a predicate) and its many values will represent the set, as a pair "key/value" is unique (aka a "predicate/object" pair). |
| **unique key** | ❌ | ✅ | ✅ |
| <td colspan=3> related to the above |
| **conflict resolution** | ✅ | Lamport clock? | [higher actor ID](https://automerge.org/docs/documents/conflicts/) |
| <td colspan=3> because RDF has no unique keys, it cannot conflict. |
| **CRDT strings in property values** | ❌ | ❌ | ✅ 🔥 |
| <td colspan=3> allows concurrent edits on the value of a property that is a string. This feature is only offered by Automerge. Very useful for collaborative forms or tables/spreadsheets, for example! |
| **multi-lingual strings** | ✅ 🔥 | ❌ | ❌ |
| <td colspan=3> Store the value of a string property in several languages / translations |
| **Counter CRDT** | ❌ | ❌ | ✅ 🔥 |
| <td colspan=3> Counters are a safe way to manage integers in a concurrent system. Automerge is the only one offering counters. Please note that CRDT types in general are "eventually consistent" only (BASE model). If you need stronger guarantees like the ones provided by ACID systems (especially guaranteeing the sequencing of operations, very useful for preventing double-spending) then you have to use a **Synchronous Transaction** in NextGraph. |
| **link/ref values (foreign key)** | ✅ 🔥 | ❌ \* | ❌ \* |
| <td colspan=3> (\*) discrete data cannot link to external documents. This is the reason why all Documents in NextGraph have a Graph part, in order to enable inter-linking of data and documents across the Global Giant Graph of Linked Data / Semantic Web |
| **Float values** | ✅ | 🟧 | ✅ |
| <td colspan=3> Yjs doesn't enforce strong typing on values. They can be any valid JSON (and Floats are just Numbers). |
| **Date values** | ✅ | ❌ | ✅ |
| <td colspan=3> JSON doesn't support the JS Date datatype. So for the same reason as above, Yjs doesn't support Dates. |
| **Binary buffer values** | ✅ \* | ✅ | ✅ |
| <td colspan=3> (\*) as base64 or hex encoded. Please note that for all purposes of storing binary data, you should use the **binary files** facility of each Document instead, which is much more efficient. |
| **boolean, integer values** | ✅ | ✅ | ✅ |
| **NULL values** | ❌ | ✅ | ✅ |
| **strongly typed decimal values** | ✅ | ❌ | ❌ |
| <td colspan=3> signed, unsigned, and different sizes of integers |
| **revisions, diff, revert** | 🟧 | 🟧 | 🟧 |
| <td colspan=3> 🔥 implemented at the NextGraph level. Work in progress. You will be able to see the diffs and access all the historical data of the Document, and also revert to previous versions. |
| **compact** | ✅ | ✅ | ❓ |
| <td colspan=3> compacting is always available as a feature at the NextGraph level (and will compact Graph and Discrete parts alike). Yjs tends to garbage collect deleted content. Not sure if Automerge does it. Compact will remove all the historical data and deleted content (you won't be able to see diffs nor revert for all the causal past happening before the compact operation, but normal CRDT behaviour can resume after). This is a synchronous operation. |
| **snapshot** | ✅ | ✅ | ✅ |
| <td colspan=3> take a snapshot of the data at given HEADs, and store it in a non-CRDT way so it can be opened quickly. Also removes all the historical and deleted data. But a snapshot cannot be used to continue collaborating on the document. See it as something similar to "export as a static file". |
| **isolated transactions** | ✅ | ✅ | ✅ |
| <td colspan=3> A NextGraph transaction can atomically mutate both the Graph and the Discrete data in a single isolated transaction. Can be useful to enforce consistency and keep the information stored in the discrete and graph parts of the same document in sync. But: transactions cannot span multiple documents (for that matter, see **smart contracts**). When a SPARQL Update spans across Documents, then the Transaction is split into several ones (one for each target Document) and each one is applied separately, meaning, not atomically. Also, keep in mind, as explained above in the "Counter" section, that CRDTs are eventually consistent. If you need ACID guarantees, use a synchronous transaction instead. |
| **Svelte5 reactive Store (Runes)** | 🟧 | 🟧 | 🟧 |
| <td colspan=3> 🔥 this is planned. Will be available shortly. The store will be **writable** and will allow a bidirectional binding of the data to some javascript reactive variables in Svelte (same could be done for React) and we are considering the use of **Valtio** for a generic reactive store, that would also work on nodeJS and Deno |
| **queries across documents** | ✅ 🔥 SPARQL | 🟧 \* | 🟧 \* |
| <td colspan=3> (\*) support is planned at the NextGraph level, to be able to query discrete data too in SPARQL. (GraphQL support could then be added) |
| **export/import JSON** | ✅ JSON-LD | ✅ | ✅ |
| &nbsp; | | | |
| **Rich Text** | N/A | attributes on XMLElement | [Marks and Block Markers](https://automerge.org/docs/documents/rich_text/) and [here](https://automerge.org/docs/under-the-hood/rich_text_schema/) |
| <td colspan=3> Yjs integration for ProseMirror and Milkdown is quite stable. Peritext is newer and only offers ProseMirror integration. For this reason we use Yjs for Rich Text. Performance considerations should be evaluated too. |
| **Referencing rich text from outside** | N/A | ✅ 🔥 [Relative Position](https://docs.yjs.dev/api/relative-positions) | ✅ [get_cursor](https://automerge.org/automerge/automerge/trait.ReadDoc.html#tymethod.get_cursor) |
| <td colspan=3> useful for anchoring comments, quotes and annotations (as you shouldn't have to modify a document in order to add a comment or annotation to it). |
| **shared cursor** | N/A | 🟧 \* | 🟧 \* |
| <td colspan=3> (\*) available in lib but not yet integrated in NextGraph |
| **undo/redo** | N/A | 🟧 \* | 🟧 \* |
Keep on reading about how to handle the [schema](/en/framework/schema) of your data, and what the [Semantic Web](/en/framework/semantic) is all about.

@ -8,11 +8,11 @@ Maybe you heard about Semantic Web, Linked Data or RDF before. Maybe you didn't
The 3 terms are synonyms. RDF is the data format used to describe Resources. Semantic Web just means that as we have the web of pages (linked by hyper references (href) using the HTTP protocol), we can also create a web of data, and that's basically what the Semantic Web is all about. We say semantic because it gets close enough to the data to bring **meaning** to otherwise format-less strings of text.
Linked Data says exactly the same, and is just another name for a concept that Tim Berners-Lee envisioned back in 1999 (the Semantic Web) and reformulated with the term Linked Data in 2006 and the Giant Global Graph in 2007.
SPARQL is the query language of choice for RDF data, and NextGraph supports both of them at its core. You can see SPARQL as the equivalent of SQL for a relational database.
OWL is a language (based on RDF) that expresses the schema (also called Ontology or Vocabulary) of the data.
When we are talking about publicly available data, most of the time concerning immutable facts like academic data, we use the term LOD for Linked Open Data. But Semantic data doesn't have to be public nor open, and in NextGraph we use RDF to store private and encrypted data that very few people will ever see.
@ -63,7 +63,7 @@ As you can see, the predicate's names are often written with 2 words separated b
In the classical semantic web, this URI is a URL, in NextGraph it is a Nuri (a NextGraph DID URI) or it can also be a URL if needed.
So this "file" that contains the ontology, most often in the OWL format, which is also some RDF, describes the classes, properties, and how they can be combined (which properties belong to which classes, the cardinality of relationships, etc).
Each entry in the ontology gets a name that can be used later on as a predicate, like `label` that can be found in the OWL ontology here [https://www.w3.org/2000/01/rdf-schema#label](https://www.w3.org/2000/01/rdf-schema#label)
@ -73,33 +73,35 @@ When we retrieve the triples, we can give some prefixes and the SPARQL engine wi
What is really interesting here is that Ontologies can be shared across documents, and also across triplestores. In fact, there already exists a good list of ontologies that have been adopted worldwide to represent the most common properties and relationships that we use in data.
Then some specialized ontologies have also emerged, often created in cooperation between several actors of a specific field of concern, and those ontologies became standards.
They form some **shared schema** that has been agreed globally and that can be reused, and amended of course if needed, by anybody.
If need be, it is always possible to also define your own ontologies. If they are of any interest to others, they might be published too, and reused.
This mechanism tends to foster **interoperability**.
First of all, because the technology itself, with predicates encoded as URIs, inherently supports interoperability.
But also because groups of interest tend to gather and establish standard ontologies in many fields of concern, which enhances the interoperability and portability of data even more.
At NextGraph, we strive to gather the best existing ontologies out there and propose them to you so you can reuse them. We also make it easy for you to create new ones with graphical tools and editors, so you don't have to have a headache when trying to understand exactly how OWL works. If you know about UML, object-oriented programming, or modelling of data, then you can easily create new ontologies. **It is all about defining classes, properties, and relationships between them**.
One last word about RDF and ontologies: because the schema of the data is encoded inside each triple (in the predicate part), there is no need to keep track of the schema of your data, as we normally do with relational databases or even in JSON. The advantage of RDF is that predicates get assigned globally unique identifiers too, so there is never any ambiguity about the schema of the data. No separate schema definition that sits outside of the data itself (with problems keeping the 2 in sync). No migrations needed. No data inconsistency either.
### SPARQL
SPARQL is the query language for RDF. And as we have explained earlier in the [Data-First](/en/framework/data-first) chapter, we want to offer to the developer a very simple interface for accessing and modifying the data: in the form of a reactive store with Javascript objects (POJOs).
But sometimes, you also need to run complex queries. Not only to query the data and traverse the graph, but also to update it with specific conditions that SPARQL will help you with, more efficiently than going inside the reactive store.
In any case, rest assured that with our framework, you always have access to your data in both ways: via the reactive store, and also via SPARQL.
SPARQL is a query language that looks a bit off-putting at first contact. Many developers do not like it at first. This happened to me too in the past. But I can tell you from experience that once you get to learn it a little bit, everything gets much simpler and also very powerful. We will also provide some graphical tools that will generate SPARQL queries for you, and we also plan to add GraphQL support, which is a query language that more developers know about.
It is not for us to give you a course on SPARQL, but we will try to give you the basics. You can refer to [this page](https://jena.apache.org/tutorials/sparql.html) as a starter, and there are many other tutorials online.
**What is important to understand, for someone coming from SQL and relational databases, is that RDF does not need you to plan in advance all the relations you will need, to normalize them, and add the corresponding foreign keys in your tables, so you can later do some JOINs across tables. Instead, all the RDF data, all the triples, are JOINABLE by default, and you don't need to plan it ahead. We believe that this is a very important feature!**
The 2 main types of queries are `SELECT` and `CONSTRUCT`. Select is similar to SQL, and will return you a table with columns representing the variables that you have defined in your SPARQL query, and one row for each record that has been found. The thing to understand is that in the `WHERE` part, you put filters and also patterns for traversing the graph, which means that you can ask the SPARQL engine to "navigate" inside your RDF data and hop from one triple to another, from one resource to another, until you reach the desired combination of "match patterns". For example, you can ask to find "all the contacts that I have and that live in Barcelona and who are software engineers and who live less than 100m from an ice cream parlor, and you want to see their name, date of birth, phone number, and profile picture." This obviously will hardly work because we usually don't store information about the ice cream parlors on a geo map. But if you had the data, then it would work! You can see from this example that the SPARQL engine needs to go through several resources before being able to deliver the results. From Contact -> it goes to -> City -> and then to Shop, and in those 3 types of Documents, it checks some properties (Contact.job="software_engineer" and keeps their date_of_birth, phone_number, profile_pic and geo_location for later, then follows the "lives_in" predicate of the contact, that redirects us to a city, then filters by City.label="Barcelona", then follows the predicates "has_shop" and filters by Shop.type="ice_cream_parlor" and by Shop.geo_loc that is within 100m of Contact.geo_location).
This is exactly the same as doing JOINS in SQL, except that you do not need to normalize your tables in advance in order to establish foreign keys. Instead, all the semantic data is always "JOINABLE" by all its predicates.

@ -759,6 +759,54 @@ type InternalNode = Vec<BlockKey>
- BlockContentV0.encrypted_content : contains an encrypted ChunkContentV0, encrypted using convergent encryption with ChaCha20: nonce = 0 and key = BLAKE3 keyed hash (convergence_key, plaintext of ChunkContentV0), with convergence_key = BLAKE3 derive_key ("NextGraph Data BLAKE3 key", StoreRepo + store's repo ReadCapSecret ) which is basically similar to the InnerOverlayId but not hashed, so that brokers cannot do "confirmation of a file" attacks.
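To make this construction more concrete, here is a minimal sketch of the convergent encryption using the `blake3` and `chacha20` crates. It is an illustration only, not the actual ng-repo implementation: the caller is assumed to provide the serialized `StoreRepo + ReadCapSecret` bytes and the serialized `ChunkContentV0` plaintext.

```rust
// Sketch of the convergent encryption of a ChunkContentV0 (illustrative, not the real code).
use chacha20::cipher::{KeyIvInit, StreamCipher};
use chacha20::ChaCha20;

fn encrypt_chunk_content(
    chunk_plaintext: &[u8],       // serialized ChunkContentV0
    store_repo_and_secret: &[u8], // serialized StoreRepo + store's repo ReadCapSecret
) -> Vec<u8> {
    // convergence_key = BLAKE3 derive_key("NextGraph Data BLAKE3 key", StoreRepo + ReadCapSecret)
    let convergence_key = blake3::derive_key("NextGraph Data BLAKE3 key", store_repo_and_secret);
    // key = BLAKE3 keyed hash(convergence_key, plaintext of ChunkContentV0)
    let key = blake3::keyed_hash(&convergence_key, chunk_plaintext);
    // ChaCha20 with nonce = 0
    let nonce = [0u8; 12];
    let mut ciphertext = chunk_plaintext.to_vec();
    let mut cipher = ChaCha20::new(key.as_bytes().into(), &nonce.into());
    cipher.apply_keystream(&mut ciphertext);
    ciphertext
}
```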
## Event
An event is a commit with some additional meta-data needed for sending it into the pub/sub topic.
It can also contain additional blocks that are sent together with the commit (additional_blocks).
And it can also contain a list of ObjectIds that are associated with this commit, but whose content isn't pushed in the pub/sub. Interested parties will have to get those blocks separately. This is useful mainly for the broker, to keep a good reference counting on blocks.
```rust
enum Event {
V0(EventV0),
}
struct EventV0 {
content: EventContentV0,
/// Signature over content by topic priv key
topic_sig: Sig,
/// Signature over content by publisher PeerID priv key
peer_sig: Sig,
}
struct EventContentV0 {
/// Pub/sub topic
pub topic: TopicId,
// on public repos, should be obfuscated
pub publisher: PeerId,
/// Commit sequence number of publisher
pub seq: u64,
pub blocks: Vec<Block>,
pub file_ids: Vec<BlockId>,
pub key: Vec<u8>,
}
```
- EventContentV0.blocks : Blocks with encrypted content. First in the list is always the commit block, followed by its children, then its optional header and body blocks (and their children, if any).
- EventContentV0.file_ids : Ids of additional Blocks (FILES or Objects) that are not to be pushed in the pub/sub. They will be retrieved later separately by interested users (with BlocksGet)
- EventContentV0.key : Encrypted key for the Commit object (the first Block in `blocks` vec)
The ObjectKey is encrypted using ChaCha20 with a nonce = commit_seq and a key = BLAKE3 derive_key ("NextGraph Event Commit ObjectKey ChaCha20 key", RepoId + BranchId + branch_secret(ReadCapSecret of the branch) + publisher)
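As a rough illustration of the key wrapping just described (a sketch under assumptions, not the ng-repo code: the exact layout of `seq` inside the 12-byte ChaCha20 nonce and the serialization of the key material are guesses):

```rust
use chacha20::cipher::{KeyIvInit, StreamCipher};
use chacha20::ChaCha20;

// Encrypts the ObjectKey of the Commit object carried in EventContentV0.key (illustrative only).
fn encrypt_commit_object_key(
    object_key: &[u8],   // serialized ObjectKey of the Commit object
    commit_seq: u64,     // EventContentV0.seq
    key_material: &[u8], // RepoId + BranchId + branch ReadCapSecret + publisher
) -> Vec<u8> {
    let key = blake3::derive_key("NextGraph Event Commit ObjectKey ChaCha20 key", key_material);
    // nonce = commit_seq (assumed little-endian in the first 8 bytes of the 12-byte nonce)
    let mut nonce = [0u8; 12];
    nonce[..8].copy_from_slice(&commit_seq.to_le_bytes());
    let mut encrypted_key = object_key.to_vec();
    let mut cipher = ChaCha20::new(&key.into(), &nonce.into());
    cipher.apply_keystream(&mut encrypted_key);
    encrypted_key
}
```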
## Common types
```rust
@ -876,8 +924,74 @@ enum StoreRepoV0 {
Dialog((RepoId, Digest)),
}
enum PeerAdvert {
V0(PeerAdvertV0),
}
struct PeerAdvertV0 {
/// Peer advertisement content
content: PeerAdvertContentV0,
/// Signature over content by peer's private key
sig: Sig,
}
struct PeerAdvertContentV0 {
/// Peer ID
peer: PeerId,
/// Id of the broker that is forwarding the peer
forwarded_by: Option<DirectPeerId>,
/// Network addresses, must be empty for forwarded peers
address: Vec<NetAddr>,
/// Version number
version: u32,
/// metadata (not used)
metadata: Vec<u8>,
}
type DirectPeerId = PubKey;
type ForwardedPeerId = PubKey;
enum PeerId {
Direct(DirectPeerId),
Forwarded(ForwardedPeerId),
ForwardedObfuscated(Digest),
}
enum NetAddr {
IPTransport(IPTransportAddr),
}
struct IPTransportAddr {
ip: IP,
port: u16,
protocol: TransportProtocol,
}
enum TransportProtocol {
WS,
QUIC,
Local,
}
enum IP {
IPv4(IPv4),
IPv6(IPv6),
}
type IPv6 = [u8; 16];
type IPv4 = [u8; 4];
```
- PeerId::ForwardedObfuscated : BLAKE3 keyed hash over a ForwardedPeerId with key = BLAKE3 derive_key ("NextGraph ForwardedPeerId Hash Overlay Id BLAKE3 key", overlayId)
- ReadCap : Read capability (for a commit, branch, whole repo, or store)
- For a store: A ReadCap to the root repo of the store
@ -902,3 +1016,5 @@ enum StoreRepoV0 {
except for Dialog Overlays where the Hash is computed from 2 secrets.
- StoreOverlay::OwnV0 : The repo is a store, so the overlay can be derived from its own ID. In this case, the branchId of the `overlay` branch is entered here as PubKey of the StoreOverlayV0 variants.
- RepoHash : a BLAKE3 Digest over the RepoId

@ -6,4 +6,480 @@ layout: ../../../layouts/MainLayout.astro
**All our protocols and formats use the binary codec called [BARE](https://baremessages.org/)**.
The Client Protocol is used by the Verifier in order to contact the Broker of the User.
It maintains this connection throughout the session that was opened by the User (by opening the wallet in the app, for example).
From this connection, the Verifier gets all the pushed updates, after it has subscribed to some branches.
The Verifier also sends the updates that it wants to publish, in the form of an Event to the Pub/Sub Topic, and the Broker deals with forwarding this Event to all the other devices and users that have subscribed to this topic.
The communication on the Client Protocol uses a WebSocket, encrypted from within with the Noise Protocol.
In addition, all the Events that are sent and received with this protocol are also encrypted end-to-end.
For now, the Verifier only connects to one Broker, but for redundancy and failsafe purposes, it will be possible in the future for it to connect to several Brokers.
But one rule should always be followed: for any given Overlay, a User can only participate in this Overlay from one and only one Broker at the same time.
Let's dive into the format of the messages and actions/commands that can be exchanged on the Client Protocol.
The initiation of the connection is common to all protocols, and involves some Noise handshake. It isn't detailed here, please refer to the code for now. We will provide more documentation on that part later on.
For a reference of the common types, please refer to the [Repo format documentation](/en/specs/format-repo/#common-types)
### ClientMessage
All the subsequent messages sent and received on this protocol are encapsulated inside a `ClientMessage`.
The `ClientRequestV0.id` is set by the requester in an incremental way. Request IDs must be unique by session. They should start from 1 after every start of a new session. This ID is present in the response, in order to match requests and responses.
```rust
enum ClientMessage {
V0(ClientMessageV0),
}
struct ClientMessageV0 {
overlay: OverlayId,
content: ClientMessageContentV0,
/// Optional padding
padding: Vec<u8>,
}
enum ClientMessageContentV0 {
ClientRequest(ClientRequest),
ClientResponse(ClientResponse),
ForwardedEvent(Event),
ForwardedBlock(Block),
}
enum ClientRequest {
V0(ClientRequestV0),
}
struct ClientRequestV0 {
/// Request ID
id: i64,
/// Request content
content: ClientRequestContentV0,
}
enum ClientRequestContentV0 {
OpenRepo(OpenRepo),
PinRepo(PinRepo),
UnpinRepo(UnpinRepo),
RepoPinStatusReq(RepoPinStatusReq),
// once repo is opened or pinned:
TopicSub(TopicSub),
TopicUnsub(TopicUnsub),
BlocksExist(BlocksExist),
BlocksGet(BlocksGet),
CommitGet(CommitGet),
TopicSyncReq(TopicSyncReq),
// For Pinned Repos only :
ObjectPin(ObjectPin),
ObjectUnpin(ObjectUnpin),
ObjectDel(ObjectDel),
// For InnerOverlay's only :
BlocksPut(BlocksPut),
PublishEvent(PublishEvent),
WalletPutExport(WalletPutExport),
}
enum ClientResponse {
V0(ClientResponseV0),
}
struct ClientResponseV0 {
/// Request ID
id: i64,
/// Result (including but not limited to ServerError)
result: u16,
/// Response content
content: ClientResponseContentV0,
}
enum ClientResponseContentV0 {
EmptyResponse,
Block(Block),
RepoOpened(RepoOpened),
TopicSubRes(TopicSubRes),
TopicSyncRes(TopicSyncRes),
BlocksFound(BlocksFound),
RepoPinStatus(RepoPinStatus),
}
```
- ClientResponseV0.result :
- 0 means success
- 1 means PartialContent (the response is a stream. each element in the stream will have this result code)
- 2 means EndOfStream (that's the last response in the stream)
- 3 means False
- 4 and above are errors. for the list, see `ng-repo/src/errors.rs` starting at line 265.
When an error occurs (result >= 3), the content is of the type ClientResponseContentV0::EmptyResponse
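As an illustration, here is how a client might wrap a request into a `ClientMessageV0` and interpret the result code, using the types defined above. The incremental `id` counter and the helper names are placeholders for this sketch, not part of the protocol.

```rust
// Hypothetical helper: wrap the next request of the session into a ClientMessageV0.
fn wrap_request(
    overlay: OverlayId,
    next_id: &mut i64, // per-session counter, starts at 1
    content: ClientRequestContentV0,
) -> ClientMessageV0 {
    let id = *next_id;
    *next_id += 1;
    ClientMessageV0 {
        overlay,
        content: ClientMessageContentV0::ClientRequest(ClientRequest::V0(ClientRequestV0 {
            id,
            content,
        })),
        padding: vec![],
    }
}

// Interpreting ClientResponseV0.result as documented above.
fn describe_result(result: u16) -> &'static str {
    match result {
        0 => "success",
        1 => "partial content (one element of a stream)",
        2 => "end of stream",
        3 => "false",
        _ => "server error (see ng-repo/src/errors.rs)",
    }
}
```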
### BlocksGet
Request a Block by ID.
Can be used to retrieve the content of a ReadCap, or any Commit that didn't arrive from the Pub/Sub already. Images and Binary files also use BlocksGet when opened, read and downloaded. ReadCaps are preferably retrieved with CommitGet though.
`commit_header_key` is always set to None in the reply when the request is made on the OuterOverlay of Protected or Group overlays.
#### Request
```rust
enum BlocksGet {
V0(BlocksGetV0),
}
struct BlocksGetV0 {
/// Block IDs to request
ids: Vec<BlockId>,
/// Whether or not to include all children recursively
include_children: bool,
topic: Option<TopicId>,
}
```
- topic : (optional) Topic the object is referenced from, if it is known by the requester. Can be used by the Broker to do a BlockSearchTopic in the core overlay, in case the block is not present locally in the Broker.
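For example, a request for one block and all its children could be built like this, using the types above (`block_id` and `topic_id` are assumed to be already known by the requester; this is an illustration, not code from the implementation):

```rust
// Hypothetical BlocksGet request (illustrative only).
fn blocks_get_request(block_id: BlockId, topic_id: Option<TopicId>) -> ClientRequestContentV0 {
    ClientRequestContentV0::BlocksGet(BlocksGet::V0(BlocksGetV0 {
        ids: vec![block_id],    // the Block ID(s) to fetch
        include_children: true, // also stream all children recursively
        topic: topic_id,        // lets the broker do a BlockSearchTopic if it misses the block
    }))
}
```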
#### Response
The response is a stream of `Block`s.
### CommitGet
Request a Commit by ID
Replied with a stream of `Block`s.
`commit_header_key` of the replied blocks is always set to None when the request is made on the OuterOverlay of Protected or Group overlays.
The difference with BlocksGet is that the Broker will try to return all the commit blocks as they were sent in the Pub/Sub Event, if it has it.
This will help in having all the blocks (including the header and body blocks) in one ClientProtocol message, while a BlocksGet would inevitably return only the blocks of the ObjectContent,
and not the header nor the body. And the load() would fail with CommitLoadError::MissingBlocks. That's what happens when the Commit is not present in the pubsub,
and we need to default to using BlocksGet instead.
#### Request
```rust
enum CommitGet {
V0(CommitGetV0),
}
struct CommitGetV0 {
/// Commit ID to request
id: ObjectId,
/// Topic the commit is referenced from, if it is known by the requester.
/// can be used to do a BlockSearchTopic in the core overlay.
topic: Option<TopicId>,
}
```
### RepoPinStatus
Request the status of pinning for a repo on the broker (for the current user's session).
Returns an error code if not pinned, otherwise returns a RepoPinStatusV0.
The overlay entered in the ClientMessage is important: if it is the outer overlay, only outer pinning will be checked; if it is the inner overlay, only the inner pinning will be checked.
#### Request
```rust
struct RepoPinStatusReqV0 {
/// Repo Hash
hash: RepoHash,
}
```
#### Response
```rust
enum RepoPinStatus {
V0(RepoPinStatusV0),
}
struct RepoPinStatusV0 {
/// Repo Hash
hash: RepoHash,
/// only possible for Inner overlays
expose_outer: bool,
/// list of topics that are subscribed to
topics: Vec<TopicSubRes>,
}
enum TopicSubRes {
V0(TopicSubResV0),
}
struct TopicSubResV0 {
/// Topic subscribed
topic: TopicId,
/// current HEADS at the broker
known_heads: Vec<ObjectId>,
/// whether the topic was subscribed as publisher
publisher: bool,
commits_nbr: u64,
}
```
- commits_nbr : used for a subsequent TopicSyncReq in order to properly size the bloomfilter
### PinRepo
Request to pin a repo on the broker.
When the client disconnects, the subscriptions and PublisherAdvert of the topics will remain active on the broker.
Replied with a RepoOpened
#### Request
```rust
enum PinRepo {
V0(PinRepoV0),
}
struct PinRepoV0 {
/// Repo Hash
hash: RepoHash,
overlay: OverlayAccess,
overlay_root_topic: Option<TopicId>,
expose_outer: bool,
peers: Vec<PeerAdvert>,
max_peer_count: u16,
ro_topics: Vec<TopicId>,
rw_topics: Vec<PublisherAdvert>,
}
enum PublisherAdvert {
V0(PublisherAdvertV0),
}
struct PublisherAdvertV0 {
content: PublisherAdvertContentV0,
/// Signature over content by topic key
sig: Sig,
}
struct PublisherAdvertContentV0 {
/// Topic public key
topic: TopicId,
/// Peer public key
peer: DirectPeerId,
}
enum OverlayAccess {
ReadOnly(OverlayId),
ReadWrite((OverlayId, OverlayId)),
WriteOnly(OverlayId),
}
```
- OverlayAccess::ReadOnly : The repo will be accessed on the Outer Overlay in Read Only mode.
This can be used for Public, Protected or Group overlays. Value should be an OverlayId::Outer
- OverlayAccess::ReadWrite : The repo will be accessed on the Inner Overlay in Write mode, and the associated Outer overlay is also given. This is used for Public, Protected and Group overlays. First value in tuple should be the OverlayId::Inner, second the OverlayId::Outer. The overlay that should be used in the ClientMessageV0 is the InnerOverlay
- OverlayAccess::WriteOnly : The repo will be accessed on the Inner Overlay in Write mode, and it doesn't have an Outer overlay.
This is used for Private and Dialog overlays. Value should be an OverlayId::Inner
- PinRepoV0.overlay_root_topic : Root topic of the overlay, used to listen to overlay refreshes. Only set for inner (RW or WO) overlays. Not implemented yet
- PinRepoV0.expose_outer : only possible for RW overlays. not allowed for private or dialog overlay. not implemented yet
- PinRepoV0.peers : Broker peers to connect to in order to join the overlay. If the repo has previously been opened (during the same session) or if it is a private overlay, then peers info can be omitted. If there are no known peers in the overlay yet, vector is left empty (creation of a store, or repo in a store that is owned by user).
- PinRepoV0.max_peer_count : Maximum number of peers to connect to for this overlay (only valid for an inner (RW/WO) overlay). not implemented yet
- PinRepoV0.ro_topics : list of topics that should be subscribed to. If the repo has previously been opened (during the same session) and the list of RO topics does not need to be modified, then ro_topics info can be omitted
- PinRepoV0.rw_topics : list of topics for which the client will be a publisher.
Only possible with inner (RW or WO) overlays.
If the repo has previously been opened (during the same session) then rw_topics info can be omitted
#### Response
```rust
type RepoOpened = Vec<TopicSubRes>;
```
for more details about TopicSubRes see above in [RepoPinStatus](#repopinstatus).
### TopicSub
Request subscription to a `Topic` of an already opened or pinned Repo.
Replied with a TopicSubRes containing the current heads that should be used to do a TopicSync.
#### Request
```rust
enum TopicSub {
V0(TopicSubV0),
}
struct TopicSubV0 {
/// Topic to subscribe to
topic: TopicId,
/// Hash of the repo that was previously opened or pinned
repo_hash: RepoHash,
/// Publisher needs to provide a signed `PublisherAdvert`
/// for the PeerId of the broker
publisher: Option<PublisherAdvert>,
}
```
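A minimal sketch of a read-only subscription, using the types above (`topic_id` and `repo_hash` are placeholders; a publisher would additionally provide a signed `PublisherAdvert`):

```rust
// Hypothetical TopicSub request for a read-only subscriber (illustrative only).
fn topic_sub_request(topic_id: TopicId, repo_hash: RepoHash) -> ClientRequestContentV0 {
    ClientRequestContentV0::TopicSub(TopicSub::V0(TopicSubV0 {
        topic: topic_id,
        repo_hash,
        publisher: None, // Some(PublisherAdvert) only when subscribing as a publisher
    }))
}
```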
#### Response
A `TopicSubRes`. see above in [RepoPinStatus](#repopinstatus) for more details.
### TopicSyncReq
Topic synchronization request
The broker will send all the missing events (or commits if it cannot find events) that are absent from the DAG of the Client.
The events/commits are ordered respecting their causal relation one with another.
Replied with a stream of `TopicSyncRes`
#### Request
```rust
enum TopicSyncReq {
V0(TopicSyncReqV0),
}
struct TopicSyncReqV0 {
/// Topic public key
topic: TopicId,
/// Fully synchronized until these commits
known_heads: Vec<ObjectId>,
/// Stop synchronizing when these commits are met.
/// if empty, the local HEAD at the responder is used instead
target_heads: Vec<ObjectId>,
/// optional Bloom filter of all the commit IDs present locally
known_commits: Option<BloomFilter>,
}
```
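For instance, a sync request asking the broker for everything missing up to its local HEAD might look like the following sketch, using the types above (how the optional `BloomFilter` is built and sized from `commits_nbr` is not shown here):

```rust
// Hypothetical TopicSyncReq (illustrative only).
fn topic_sync_request(
    topic: TopicId,
    known_heads: Vec<ObjectId>,         // commits we are already fully synchronized up to
    known_commits: Option<BloomFilter>, // optional filter of all locally known commit IDs
) -> ClientRequestContentV0 {
    ClientRequestContentV0::TopicSyncReq(TopicSyncReq::V0(TopicSyncReqV0 {
        topic,
        known_heads,
        target_heads: vec![], // empty: sync up to the broker's local HEAD
        known_commits,
    }))
}
```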
#### Response
```rust
enum TopicSyncRes {
V0(TopicSyncResV0),
}
enum TopicSyncResV0 {
Event(Event),
Block(Block),
}
```
### BlocksExist
Request to know if some blocks are present locally on the responder.
Used by a Client before publishing an event with FILES, to know what to push, and save bandwidth if the blocks are already present on the Broker (content deduplication). Commits without FILES cannot be deduplicated because they are unique, due to their unique position in the DAG, and the embedded BranchId.
Replied with `BlocksFound`
#### Request
```rust
enum BlocksExist {
V0(BlocksExistV0),
}
struct BlocksExistV0 {
/// Ids of Blocks to check
blocks: Vec<BlockId>,
}
```
#### Response
```rust
enum BlocksFound {
V0(BlocksFoundV0),
}
struct BlocksFoundV0 {
/// Ids of Blocks that were found locally
/// this might be deprecated soon (as it is useless)
found: Vec<BlockId>,
/// Ids of Blocks that were missing locally
missing: Vec<BlockId>,
}
```
### BlocksPut
Request to store one or more blocks on the broker
Replied with an `EmptyResponse`
#### Request
```rust
enum BlocksPut {
V0(BlocksPutV0),
}
struct BlocksPutV0 {
/// Blocks to store
blocks: Vec<Block>,
}
```
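Putting BlocksExist and BlocksPut together, a client publishing FILES could first ask which blocks are missing and only push those. Below is a rough sketch of that deduplication flow; the `send_request` helper and the `Block::id()` accessor are assumptions for illustration, not part of the actual API.

```rust
// Hypothetical deduplication flow before publishing an event with FILES (illustrative only).
// `send_request` is an assumed helper that sends a ClientRequestContentV0 and returns the response.
fn push_missing_blocks(
    all_file_blocks: Vec<Block>,
    send_request: &mut impl FnMut(ClientRequestContentV0) -> ClientResponseContentV0,
) {
    // 1. Ask the broker which of these blocks it already has.
    let ids: Vec<BlockId> = all_file_blocks.iter().map(|b| b.id()).collect();
    let response = send_request(ClientRequestContentV0::BlocksExist(BlocksExist::V0(
        BlocksExistV0 { blocks: ids },
    )));

    // 2. Only push the blocks reported as missing (content deduplication).
    if let ClientResponseContentV0::BlocksFound(BlocksFound::V0(found)) = response {
        let to_push: Vec<Block> = all_file_blocks
            .into_iter()
            .filter(|b| found.missing.contains(&b.id()))
            .collect();
        if !to_push.is_empty() {
            send_request(ClientRequestContentV0::BlocksPut(BlocksPut::V0(BlocksPutV0 {
                blocks: to_push,
            })));
        }
    }
}
```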
### PublishEvent
Request to publish an event in pub/sub
Replied with an `EmptyResponse`
#### Request
```rust
struct PublishEvent(Event);
```

@ -8,9 +8,9 @@ The Ext Protocol is used by users who want to access a repository anonymously an
For now, it is only possible to use the ExtProtocol for fetching an Object by its ID, given also its OverlayID.
### ExtObjectGet
#### Request
```rust
struct ExtObjectGetV0 {
@ -27,7 +27,7 @@ struct ExtObjectGetV0 {
All the children blocks of each object will also be sent in the stream of responses.
#### Response
```rust
Vec<Block>

@ -108,3 +108,5 @@ This Forwarding Client Protocol is not coded yet (but it is just an add-on to th
Also, the Relay/Tunnel feature is not finished. But very few tasks remain in order to have it running.
Finally, the CoreProtocol, between Core brokers, has not been coded yet, and will need more work. It implements the LoCaPs algorithm for guaranteeing partial causal order of delivery of the events in the pub/sub, while minimizing the need for direct connectivity, as only one stable path within the core network is needed between 2 core brokers, in order to guarantee correct delivery.
More on the [Client Protocol here](/en/specs/protocol-client)
