master
Niko PLP 3 months ago
parent 56b1e82a4b
commit 0af195ec58
Changed files:
- src/pages/en/encryption.md (2 changes)
- src/pages/en/framework/crdts.md (12 changes)
- src/pages/en/framework/data-first.md (50 changes)
- src/pages/en/framework/schema.md (2 changes)
- src/pages/en/framework/semantic.md (173 changes)

@@ -32,7 +32,7 @@ Pure Peer to Peer (P2P) is not practical in real life. Even though we claim Next
### End-to-End Encryption of everything
Now comes another problem. If the concept of a “server” is back in the picture, what about the guarantees on privacy and single point of failure (SPOF)? Well, we have thought about that. Concerns about SPOF are easy to clear if you allow many brokers to be available at the same time, if the broker can be self-hosted by anyone, and if the user can switch to failover brokers transparently. This is what we did in our protocol design. And about concerns for privacy, the solution is very simple (at least, on paper): **End-to-End-Encryption**, sometimes called zero-knowledge. The broker only receives and forwards messages that are E2E encrypted. It cannot read what is in the messages. The only thing it can see is the IP of the devices, and their PeerID (a public key). Only the replicas (the clients) can decrypt the messages. So basically, we reinvented a group E2EE protocol? Yes! And why not reuse the Signal protocol, Matrix protocol, OMEMO or OpenMLS, which already have a very well established Double-Ratchet E2EE mechanism? Well, this also could not have been done, because of the following issues.
### Group encryption is not enough

@@ -16,17 +16,17 @@ For now, we offer:
- **Graph** CRDT: the Semantic Web / Linked Data / RDF format, made available as a local-first CRDT model thanks to an OR-set logic (Observed-Remove Set) formalized in the paper [SU-set (SPARQL Update set)](https://inria.hal.science/hal-00686484/document). This allows the programmer to link data across documents, globally, and privately. Any NextGraph document features a Graph, that enables connecting it with other documents, and representing its data following any Ontology/vocabulary of the RDF world. Links in RDF are known as predicates and help establish relationships between Documents while qualifying such relations. Learn more about RDF and how it allows each individual key/value to point to another Document/Resource, similar to foreign keys in the SQL world. With the SPARQL language you can then traverse the Graph and navigate between Documents.
- **Automerge** CRDT: A versatile and compact format that offers sequence and set operations in an integrated way, and that lets all types be combined in one document. It is based on the formal research of [Martin Kleppmann](https://martin.kleppmann.com/2020/07/06/crdt-hard-parts-hydra.html), Geoffrey Litt et al. and the Ink & Switch lab, implemented by Alex Good, and follows the [RGA algorithm](https://liangrunda.com/posts/automerge-internal-1/) ([Replicated Growable Array](https://www.sciencedirect.com/science/article/abs/pii/S0743731510002716)). Their work brought the formalization of Byzantine Eventual Consistency, upon which NextGraph builds its own design. Automerge offers a rich text API (Peritext) but we do not expose it in NextGraph for now, preferring that of Yjs for all rich text purposes.
- **Yjs** CRDT: A well-known and widely-used library based on the [YATA](https://www.bartoszsypytkowski.com/yata-move/) model. It offers an efficient and robust format for sequences and sets. On top of those primitives, the library offers 4 formats: XML (used by all rich-text applications), plain text, maps (like JSON objects), and arrays (aka lists). We use the XML format for both the "Rich text" and "MarkDown" documents, and offer the 4 formats as separate **classes** of discrete documents for the programmer to use. We all thank [Kevin Jahns](https://blog.kevinjahns.de/are-crdts-suitable-for-shared-editing/) for bringing this foundational work to the CRDT world. His paper (et al.) is [available here](https://files.gitter.im/y-js/yjs/yCYx/GROUP2016-_6_.pdf).
In the future, we might want to integrate more CRDTs into NextGraph, especially the promising LORO or json-joy, or older CRDTs like Diamond Types.
If you are not familiar with CRDTs, you can read some nice introductions to the subject [here](https://www.farley.ai/posts/causal) and [here](https://www.bartoszsypytkowski.com/the-state-of-a-state-based-crdts/).
## Semantic Web, Linked Data, RDF
If the Semantic Web and RDF are still unknown to you, we have selected some good introductory materials [here](/en/framework/semantic).
The essential information to understand about RDF is that it encodes the data in the form of triples, which are the composition of 3 elements.
@@ -40,7 +40,7 @@ In addition, the values (aka, the Object part of the triple) can also be a refer
If we have a triple saying `Alice -> lives_in -> Wonderland`, then we can also say that `Bob -> is_friend_with -> Alice`. Here we have linked the 2 Resources together, and the Predicate `is_friend_with` is not just a property, but it is in fact a `relationship`. If Alice also considers Bob as a friend, then we could say the inverse relationship `Alice -> is_friend_with -> Bob`.
We understand that the Predicates of the RDF world, are corresponding to the keys and properties of the JS/JSON world. Values of the JS world are equivalent to Objects of RDF, although in JS, you cannot encode a link to another (possibly remote) Resource as a value.
Finally, the Subject of a resource is its unique identifier. NextGraph assigns a unique ID to each Document, and that's the Subject of the triples it contains. As an analogy, the same could be done in JSON by giving a unique name to a JSON file, or to a JS variable (POJO) holding some map of key/values.
@@ -121,3 +121,5 @@ Now let's have a look at what those CRDTs have in common and what is different b
| **shared cursor** | N/A | 🟧 \* | 🟧 \* |
| **undo/redo** | N/A | 🟧 \* | 🟧 \* |
| <td colspan=3> (\*) available in lib but not yet integrated in NextGraph |
Keep on reading about how to handle the [schema](/en/framework/schema) of your data, and what the [Semantic Web](/en/framework/semantic) is all about.

@@ -4,4 +4,52 @@ description: NextGraph is a data-first framework where applications are just vie
layout: ../../../layouts/MainLayout.astro
---
NextGraph is a framework for app developers who want to enjoy all the benefits of local-first, end-to-end encryption, decentralization, and the powerful combination of Semantic Data (Linked Data) with CRDTs.
We believe, like many others, that data should come first. In our data-centric approach, the user owns their data, no matter what the app is doing.
It is the same concept that has been advertised by the Solid project with their PODS (Personal Online Data Stores).
The User always remains in control of their data, and chooses a repository/server where all their data is stored; they then authorize some apps to access part of this data store. This way, the app is just a tool that manipulates the data or creates new data, but the user can always access it all, and revoke the authorization too.
With NextGraph, we push this concept even further. The data is not managed by a POD provider/server that can see all of it in the clear, as this has some negative consequences on privacy. Instead, the data in NextGraph is local-first, which means that it is primarily stored and available locally on the devices of the users. The apps can use and modify this data directly and locally, without the need for remote API calls or a backend.
In order to work properly, we use [CRDTs](/en/framework/crdts) that prevent merge conflicts when the data is synchronized between devices.
We also use a Broker, which helps with synchronization and only deals with end-to-end encrypted data, which means that it cannot read the data that is exchanged through it. You can read more about the rationale behind that in the [Encryption chapter](/en/encryption).
For the developer of an App, this means that the data is accessed and manipulated with usual JavaScript APIs, and that all the syncing, encryption, and offline management happens in the background, transparently to the developer.
Developers of modern apps today often use some front-end frameworks like **React** and **Svelte**.
The data they manipulate is often stored in a **reactive store** (such as Redux for React, or Runes in Svelte) that needs to be configured and plugged into a backend system with some APIs that can range from WebSocket and GraphQL to HTTP/REST APIs, MySQL, and so on...
> With NextGraph, we provide the developer with a reactive store (for React, Svelte, and Deno/Node) and that's all they have to worry about. NextGraph transparently synchronizes, encrypts and deals with permissions for you.
The developer can then focus all their efforts on the business logic, design of their app, and efficient coding with modern technologies, with very short time-to-market.
Furthermore, the User will always remain in control of their data, and interoperability is available by default to every app developer, who can also reuse, and request access to, some data that was generated by other apps.
By leveraging the power of **Linked Data and RDF**, we achieve the interconnection of many different kinds of data structures, formats, and schemas, and usher in the advent of a Giant Global Graph of data, with strong guarantees on privacy, security, and availability.
### Class, Viewer and Editor
Developing an App in NextGraph is as simple as defining one or several new **Classes** of data, or selecting existing classes that the app can read and/or edit, and then connecting those Classes with some Viewers and Editors.
When you define your class, you have to choose among the 3 data models that we support, and define a list of ontologies that you support. Read more about that in the [schema](/en/framework/schema) section.
The Viewers and Editors can be written in React or Svelte, and more frameworks (including vanilla JS/HTML) will be provided in the future.
We tend to prefer Svelte as this is the framework we use internally in NextGraph and it is also with Svelte that we have developed a set of Official Apps that can already handle the basic Classes and most frequent use-cases.
[Check out what we already did](/en/features), what is planned, and what we are currently working on, because there might be no need for you to redo an app that already exists or is already planned. Instead, you could join our community of developers and participate in creating or improving existing apps.
As the goal of NextGraph is to give the smoothest experience to our users, we put a lot of effort into offering them a tightly integrated environment, which you can try when you open our [official app](/en/getting-started).
If you have innovative ideas for a new app and you want to work with permissive licenses, you can make a proposal for a new app in [our forum](https://forum.nextgraph.org) and we will work together to bring your vision to life.
If we code this new app in Svelte, it will have more chances of being integrated into the official apps and becoming part of the basic features every user can enjoy.
But of course, it is always possible to code in React, and also to publish your app in any other way, without coordinating your efforts with us. Read more about that in the [Ecosystem chapter](/en/ecosystem).
Now let's read more about [CRDTs and the 3 data models](/en/framework/crdts) that we offer.

@@ -4,4 +4,4 @@ description: Describe the Schema of your data with a set of Semantic Ontologies
layout: ../../../layouts/MainLayout.astro
---
As explained in the previous chapter about the [Semantic Web and Ontologies](/en/framework/semantic), NextGraph is based on RDF, and OWL is used to define Ontologies, which are equivalent to a Schema definition.

@@ -4,4 +4,175 @@ description: NextGraph is based on the Semantic Web principles and uses RDF as i
layout: ../../../layouts/MainLayout.astro
---
Maybe you have heard about the Semantic Web, Linked Data or RDF before. Maybe you haven't, and that's why you arrived on this page.
The 3 terms are synonyms. RDF is the data format used to describe Resources. Semantic Web just means that as we have the web of pages (linked by hyper references (href) using the HTTP protocol), we also can create a web of data, and that's basically what the Semantic Web is all about. We say semantic because it gets close enough to the data to bring **meaning** to otherwise format-less strings of text.
Linked Data says exactly the same, and is just another name for what Tim Berners-Lee envisioned back in 1999 (the Semantic Web) and reformulated with the term Linked Data in 2006 and the Giant Global Graph in 2007.
SPARQL is the query language of choice for RDF data, and NextGraph supports both of them at its core. You can see SPARQL as the equivalent of SQL for a relational database.
OWL is a language (based on RDF) that expresses the schema (also called Ontology) of the data.
When we are talking about publicly available data, most of the time concerning immutable facts like academic data, we use the term LOD, for Linked Open Data. But Semantic data doesn't have to be public nor open, and in NextGraph we use RDF to store private and encrypted data that very few people will ever see.
In the end, RDF is just another data format, like JSON, XML or CSV, except that it has its own characteristics, which are very interesting, and which we will now detail here.
You can find an introduction to [RDF here](https://www.bobdc.com/articles/rdfstandards/) and more details about [SPARQL here](https://jena.apache.org/tutorials/sparql.html).
RDF data is organized in the form of triples. And when we use a database to store and query those triples, we call this database a triplestore.
### Triples
The essential information to understand about RDF is that it encodes the data in the form of triples, which are the composition of 3 elements.
The 3 elements are called: `Subject -> Predicate -> Object`. That's one triple. The semantic database is just a set of triples.
- The **Subject** represents the Resource we are establishing facts about.
- The **Predicate** indicates the "key" or "property" we want to specify about the Resource.
- And the **Object** represents the "value".
Hence, if we want to say that "Bob owns a bicycle", then we write it this way: `Bob -> Owns -> Bicycle`.
`Bob` is the subject. `Owns` is the predicate, `Bicycle` is the object.
We can also say that `Bob -> color_of_eyes -> Blue` and so on.
In addition, the values (aka, the Object part of the triple) can also be a reference to another Resource. So basically we can link Resources together.
If we have a triple saying `Alice -> lives_in -> Wonderland`, and we know that `Alice` and `Wonderland` are 2 RDF resources that have their own triples, then we say that `lives_in` is a predicate that represents a **relationship** between the 2 RDF resources Alice and Wonderland.
Then let's say there is another resource in the system called `Bob` and we also want to say that `Bob -> is_friend_with -> Alice`.
Here we have linked the 2 Resources together, and the Predicate `is_friend_with` is not just a property, but it is in fact a `relationship`.
If Alice also considers Bob as a friend, then we could say the inverse relationship `Alice -> is_friend_with -> Bob`.
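For readers who prefer to see it written down, here is a minimal sketch of these facts as a SPARQL update. The `ex:` prefix and all of its terms are hypothetical, purely for illustration:
```sparql
PREFIX ex: <http://example.org/vocab#>

INSERT DATA {
  # plain properties: the Object is a literal value
  ex:Bob ex:color_of_eyes "Blue" .

  # relationships: the Object is another Resource
  ex:Bob   ex:owns            ex:Bicycle .
  ex:Bob   ex:is_friend_with  ex:Alice .
  ex:Alice ex:is_friend_with  ex:Bob .
  ex:Alice ex:lives_in        ex:Wonderland .
}
```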
We understand that the Predicates of the RDF world, are corresponding to the keys and properties of the JS/JSON world. Values of the JS world are equivalent to Objects of RDF, although in JS, you cannot encode a link to another (possibly remote) Resource as a value, while in RDF, you can.
Finally, the Subject of a resource is its unique identifier. NextGraph assigns a unique ID to each Document, and that's the Subject of the triples it contains.
In the classical Semantic Web, Resources are identified with URLs, but because NextGraph is not using HTTP, we identify the Resources with unique IDs of the form `did:ng:o:[44 chars of the ID]`, for example `did:ng:o:EghEnCqhpzp4Z7KXdbTx0LkQ1dUaaqwC0DGVS-0BAKAA`, as explained in the [Documents chapter](/en/documents).
So in fact, we don't use names like Alice, Wonderland, or Bob as subjects or objects, but we use their `did:ng:...` identifiers instead.
Then of course, we attach a nice and easy-to-read text label to each resource, so we can see and understand what the resource is about. This is often done with a predicate called `rdfs:label` which is used pervasively in the semantic web for defining a "title" to anything.
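Putting the two previous points together, a minimal sketch of such a label triple could look like this (the `did:ng` ID is the sample one from above):
```sparql
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

INSERT DATA {
  # the Subject is the Document's unique ID, not a human-readable name
  <did:ng:o:EghEnCqhpzp4Z7KXdbTx0LkQ1dUaaqwC0DGVS-0BAKAA> rdfs:label "Alice" .
}
```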
### Ontologies
As you can see, predicate names are often written as 2 parts separated by a colon, like `rdfs:label`. This means that we are referring to the prefix `rdfs` and to the fragment `label` inside it. The prefix `rdfs` must have been defined somewhere else before, and it always points to a full URI that contains the ontology.
In the classical semantic web, this URI is a URL; in NextGraph it is a NURI (a NextGraph DID URI), or it can also be a URL if needed.
So this "file" that contains the ontology, most often in the OWL format (which is itself RDF), describes the classes, properties, and how they can be combined (which properties belong to which classes, the cardinality of relationships, etc).
Each entry in the ontology gets a name that can be used later on as a predicate, like `label`, which can be found in the RDFS ontology here: [https://www.w3.org/2000/01/rdf-schema#label](https://www.w3.org/2000/01/rdf-schema#label).
When this predicate is saved in the triplestore, it is the long-form "fully qualified" version that is saved (`http://www.w3.org/2000/01/rdf-schema#label`) and not the `rdfs:label` version, because prefixes can change, so we replace the prefix with its real value before saving the triple.
When we retrieve the triples, we can give some prefixes and the SPARQL engine will do the reverse operation of changing the long-form to the prefixed form.
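For example, the two queries below are equivalent: the engine expands `rdfs:label` to its long form before matching, and can compact the results back to the prefixed form:
```sparql
# with a prefix...
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?doc ?label WHERE { ?doc rdfs:label ?label . }

# ...or fully qualified: exactly the same query
SELECT ?doc ?label WHERE {
  ?doc <http://www.w3.org/2000/01/rdf-schema#label> ?label .
}
```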
What is really interesting here is that Ontologies can be shared across documents, and also across triplestores. In fact, there already exists a good list of ontologies that have been adopted worldwide to represent the most common properties and relationships that we use in data.
Then some specialized ontologies have also emerged, often created in cooperation between several actors of a specific field of concern, and those ontologies become standards.
They form a **shared schema** that has been agreed upon globally and that can be reused, and amended if needed of course, by anybody.
It is also always possible to define your own ontologies. If they are of any interest to others, they might be published too, and reused.
This mechanism tends to foster interoperability.
First of all, because the technology itself, with predicates encoded as URIs, very much supports interoperability by itself.
But also because groups of interest tend to gather and establish standard ontologies in many fields of concern, which enhances interoperability and portability of data even more.
At NextGraph, we strive to gather the best existing ontologies out there and propose them to you so you can reuse them. We also make it easy for you to create new ones with graphical tools and editors, so you don't get a headache trying to understand exactly how OWL works. If you know about UML, object-oriented programming, or modelling of data, then you can easily create new ontologies. It is all the same thing. We define classes, properties, and relationships between them.
One last word about RDF and ontologies: because the schema of the data is encoded inside each triple (in the predicate part), there is no need to keep track of the schema of your data, as we normally do with relational databases or even in JSON. The advantage of RDF is that predicates are assigned globally unique identifiers too, so there is never any ambiguity about the schema of the data. No migrations needed. No data inconsistencies either.
### SPARQL
SPARQL is the query language for RDF. And as we have explained earlier in the [Data-First](/en/framework/data-first) chapter, we want to offer the developer a very simple interface for accessing and modifying the data: in the form of a reactive store of JavaScript objects (POJOs).
But sometimes, you also need to run complex queries. Not only to query the data and traverse the graph, but also to update it with specific conditions, which SPARQL will help you with much more than going through the reactive store.
In any case, rest assured that with our framework, you always have access to your data in both ways: via the reactive store, and also via SPARQL.
SPARQL is a query language that looks a bit off-putting at first contact. Many developers do not like it at first. This happened to me too in the past. But I can tell you from experience that once you get to learn it a little bit, everything gets much simpler and also very powerful.
It is not for us to give you a course on SPARQL here. You can refer to [this page](https://jena.apache.org/tutorials/sparql.html) as a starter, and there are many other tutorials online.
The 2 main types of queries are `SELECT` and `CONSTRUCT`. Select is similar to SQL, and will return you a table with columns representing the variables that you have defined in your SPARQL query, and one row for each record that has been found. The thing to understand is that in the `WHERE` part, you put filters and also patterns for traversing the graph, which means that you can ask the SPARQL engine to "navigate" inside your RDF data and hop from one triple to another, from one resource to another, until it reaches the desired combination of "match patterns". For example, you can ask to find "all the contacts that I have who live in Barcelona, who are software engineers, and who live less than 100m from an ice cream parlor, and you want to see their name, date of birth, phone number, and profile picture". This obviously will hardly work, because we usually don't store information about ice cream parlors on a geo map. But if you had the data, then it would work! You can see from this example that the SPARQL engine needs to go through several resources before being able to deliver the results. From Contact it goes to City, and then to Shop, and in those 3 types of Documents, it checks some properties: it filters by Contact.job="software_engineer" and keeps the date_of_birth, phone_number, profile_pic and geo_location for later; then it follows the "lives_in" predicate of the contact, which redirects us to a city; then it filters by City.label="Barcelona"; then it follows the predicate "has_shop" and filters by Shop.type="ice_cream_parlor" and by Shop.geo_loc being within 100m of Contact.geo_location. This is exactly the same as doing JOINs in SQL, except that you do not need to normalize your tables in advance in order to establish foreign keys. Instead, all the semantic data is always "JOINABLE" by all its predicates.
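As a sketch, such a query could look like the following. Every `ex:` term here is hypothetical (in reality it would come from the ontologies your documents actually use), and `ex:distance` stands in for whatever geo function the engine provides:
```sparql
PREFIX ex: <http://example.org/vocab#>

SELECT ?name ?date_of_birth ?phone_number ?profile_pic WHERE {
  ?contact ex:job           "software_engineer" ;
           ex:name          ?name ;
           ex:date_of_birth ?date_of_birth ;
           ex:phone_number  ?phone_number ;
           ex:profile_pic   ?profile_pic ;
           ex:geo_location  ?contact_loc ;
           ex:lives_in      ?city .          # hop: Contact -> City
  ?city    ex:label         "Barcelona" ;
           ex:has_shop      ?shop .          # hop: City -> Shop
  ?shop    ex:type          "ice_cream_parlor" ;
           ex:geo_loc       ?shop_loc .
  # hypothetical distance function: "less than 100m away"
  FILTER(ex:distance(?contact_loc, ?shop_loc) < 100)
}
```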
`CONSTRUCT` queries are a special type of query that always returns full triples. They work the same as SELECT with a WHERE clause, but you cannot have arbitrary variable projections. They will always return triples of the form `subject predicate object`. But you can of course tell which filters and patterns you want to follow.
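A minimal sketch, reusing the hypothetical `ex:` terms from above:
```sparql
PREFIX ex: <http://example.org/vocab#>

# returns triples, not a table: here, one friendship triple per match
CONSTRUCT { ?contact ex:is_friend_with ?friend . }
WHERE {
  ?contact ex:is_friend_with ?friend .
  ?friend  ex:lives_in       ?city .
  ?city    ex:label          "Barcelona" .
}
```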
Until now, we explained that each Document can hold some RDF triples, but we didn't explain how they are stored, nor how the SPARQL engine can run queries that span all the RDF Documents that are present locally.
There is in fact an option in the "SPARQL Query" tool (in the Document Menu, under "Graph" / "View as ...") that lets you query all the documents that you have present locally at once. If you do not toggle this option, you will only get results about the triples of the current Document. With this "Query all docs" option activated, the SPARQL engine will search in all your documents, regardless of whether they are in the Public store, Protected store, Private store, or in any Group or Dialog store.
What matters is that the documents must be present locally.
When you are using the native app, all your documents of all your stores are always present locally, because they are stored in the UserStorage.
With the webapp, you only have locally the documents that you manually opened since the last login. This is because the webapp, for now, cannot store all your documents locally, as there is not enough room for that. This will be improved at some point, but it needs more work on our side.
In the future, we will also be able to run federated queries, which means that part or all of the query is going to be run on someone else's data, remotely. This is not ready yet, but that's the goal of NextGraph. If we want to query the social graph, for example, we have to go to our contacts, friends, and followers, and run some queries there on their data.
Of course, this will only work if we got the permission to run those remote queries. And about that, you should read our chapter about [permissions](/en/framework/permissions).
In the same manner, we just explained that when you query with "Query all docs", you directly have access to all your local documents. First of all, we have to be precise and say that the set of documents you have access to is scoped by user/identity. If you have several Identities in your wallet, then only the current identity will be queried.
Secondly, we also have to clarify that only the official apps can have such unlimited access to all your documents. This is because those apps have been coded and audited by us, they are open source, and we know they are not going to do malicious things with your data. They will never connect to remote machines and send your data there. They will never gather statistics or track you. They can only manipulate your data locally, and they have to behave! You also trust those apps because you trust us. If you didn't trust us in general, then there would be absolutely no point in using our products. Your trust, though, is not just "requested" from you: you can easily check our claims about security, privacy and encryption, as all our code is open source, and you can also compile it yourself, or ask an expert programmer to audit it or compile it for you.
Then, when you install third-party apps, those apps will NOT get unlimited access to all your data. Those applications will have to request permissions from you before being able to read or write some of your data. At any moment, you can revoke this grant. More about that in the [permissions](/en/framework/permissions) chapter.
Those 3rd party apps are mostly safe, because we also review them. But because you also have the option to bypass the App Store and install any app that you want, those apps will obviously not be reviewed by us, so as a matter of principle, any third-party app needs to present some capability in order to access your data, even if you are the author of such an app.
It should be noted that permissions can span a whole UserStorage, or a whole Store, or a set of Documents, or even, for read permission only, a specific branch or block.
### Includes
As we already mentioned briefly when we talked about blocks [here](/en/documents#blocks-and-branches), you can include some other blocks inside a document. This will have the effect of also including all the triples of such a block inside the document where the include is declared.
This is very handy when you are in Document A, and you need to access some extra data coming from another Document or Branch B, and you want to make sure that anybody who reads the current Document A will also fetch and include the other Document or Block B automatically. If you have some logic in your document that depends on such data, in a SPARQL query for example, this **include** mechanism will solve the headache of fetching and caching the foreign data.
The included block can be from the same document, from another document in the same store or not, and even from someone else's document, on a remote machine. Thanks to the pub/sub and synchronization mechanism of NextGraph, this foreign block will always stay up to date, as it will synchronize with the original source.
### Named graphs and blank nodes
And this leads us to an explanation about what happens to named graphs in NextGraph.
Named Graphs are an RDF and SPARQL feature that lets you organize your triples into a bag. This bag contains your triples, and we call this bag a Graph. It also gets an ID in the form of a URI (normally a URL; in NextGraph, a NURI).
SPARQL has options to specify which named graph(s) you want the query to relate to.
Theoretically, a triplestore can put any kind of triple in any graph. By the way, when the triplestore understands the concept of a named graph, we call it a quadstore, because then each triple has a 4th part telling in which graph the triple is stored. So triples become quads (they have 4 parts instead of 3), and the triplestore becomes a quadstore.
NextGraph is a quadstore, but there are some limitations.
We do not let the user create graphs manually and arbitrarily. Instead, we associate each new resource/document (with its unique subject ID of the form `did:ng:o:...`) with a new graph, **of the same name as the resource**.
Even creating a new resource or document doesn't happen freely in the quadstore. Instead, there is a special API call for creating a new document, which must be called before any triple can be inserted in this Document. This API call returns the newly generated document ID.
Then it is possible to add new triples in this Document, and the ID of the document has to be passed to the SPARQL query, as a named graph, or as an argument in the API call itself. This way, we always know in which named graph the data should be saved or retrieved from.
In this Document/Named Graph, the user can add triples, and most of the time, they will add triples that have this Document ID as subject. That's what we call **authoritative triples**: because we know that the subject is the same ID as the named graph (as the document ID), and because we have signatures on every commit, and also threshold signatures, we can prove that those triples have been authored by the users that claim to be members of such a document.
In order to facilitate adding those kinds of authoritative triples with a SPARQL UPDATE, or retrieving them with a SPARQL QUERY, the user has access to the BASE shortcut, which is `<>` in SPARQL, and which represents the current document. It will be replaced internally by the exact ID of the current document. This placeholder is handy and helps you manipulate the authoritative triples of your document.
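For instance, here is a minimal sketch of an authoritative insert (the label value is arbitrary):
```sparql
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

INSERT DATA {
  # `<>` is replaced internally by the current Document's did:ng:o:... ID,
  # so this triple is authoritative
  <> rdfs:label "My new document" .
}
```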
If we had stopped here, there would be no real interest in having a named graph mechanism.
But you are also able to add triples in the Document/Named Graph that are not authoritative. Those are the triples that have as subject some other ID than the current Document.
What is this useful for? RDF lets anybody establish facts about any resource. If there is a foreign Document that I am using in my system, and I want to add extra information about this resource, but I don't have write permission on that foreign Document, I can add the triples in one of the Documents that I own. External people who would see those triples that I added would immediately understand that they are not authoritative, because they were not signed with the private key of the Document ID that they establish facts about (the subject of the triples). So it is possible to say, for example, that `London -> belongs_to -> African continent`, but of course, this is not the official point of view of the author that manages the `London` Document. It only is "my point of view", and people who will see this triple will also be notified that it isn't authoritative (I think they can easily understand that by themselves, without the need for signatures).
Then we have other use cases for extra triples in the Document:
- fragments, which are prefixed with the authoritative ID, and followed by a hash and a string (like in our previous example `#label`).
- blank nodes that have been skolemized. They get a NURI of the form `did:ng:o:...:u:...`. This is because blank nodes cannot exist in a local-first system, as we need to give them a unique ID. This is done with the [skolemization procedure](https://www.w3.org/TR/rdf11-concepts/#section-skolemization). For the user or programmer, skolemization is transparent: you can use blank nodes in SPARQL UPDATE and they will be automatically translated to skolems (see the sketch below). For SPARQL QUERY anyway, blank nodes are just hidden variables, so there is no impact.
But those extra triples (fragments and skolems) are all prefixed with the authoritative ID, so they are considered authoritative too.
Note that non-authoritative triples can also have fragments and skolemized blank nodes, but their prefix will be a foreign ID, so they won't be considered authoritative either.
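Here is a minimal sketch of the skolemization behaviour described above (the `ex:` terms are hypothetical, and the exact skolem ID that gets generated is up to the engine):
```sparql
PREFIX ex: <http://example.org/vocab#>

INSERT DATA {
  # the blank node [ ... ] has no identity of its own; on insertion,
  # NextGraph skolemizes it into a NURI of the form did:ng:o:...:u:...
  <> ex:author [ ex:name "Alice" ] .
}
```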
Now that we've explored the Semantic Web and OWL, we can dive more into [Schema definition](/en/framework/schema) in NextGraph.
