Adds a README for the benchmark

pull/21/head
Tpt 5 years ago
parent 7938feaa1a
commit 0f3208d8fa
  1. README.md (2 lines changed)
  2. bench/README.md (38 lines changed)
  3. bench/bsbm-plot.py (4 lines changed)
  4. bench/bsbm.businessIntelligence.svg (1798 lines changed)
  5. bench/bsbm.explore.svg (1567 lines changed)
  6. bench/bsbm_oxigraph.sh (5 lines changed)

@ -22,6 +22,8 @@ Are currently implemented:
* [Turtle](https://www.w3.org/TR/turtle/), [TriG](https://www.w3.org/TR/trig/), [N-Triples](https://www.w3.org/TR/n-triples/), [N-Quads](https://www.w3.org/TR/n-quads/) and [RDF XML](https://www.w3.org/TR/rdf-syntax-grammar/) RDF serialization formats for both data ingestion and retrieval using the [Rio library](https://github.com/Tpt/rio).
* [SPARQL Query Results XML Format](http://www.w3.org/TR/rdf-sparql-XMLres/) and [SPARQL Query Results JSON Format](https://www.w3.org/TR/sparql11-results-json/).
A preliminary benchmark [is provided](bench/README.md).
## Run the web server
### Build

@ -0,0 +1,38 @@
BSBM
====
The [Berlin SPARQL Benchmark (BSBM)](http://wifo5-03.informatik.uni-mannheim.de/bizer/berlinsparqlbenchmark/) is a simple SPARQL benchmark.
It provides a dataset generator and multiple sets of queries grouped by "use cases".
## Results
Here we compare Oxigraph with some existing SPARQL implementations (Blazegraph, Virtuoso and GraphDB).
The dataset used in the following charts is generated with 10k "products" (see [its spec](http://wifo5-03.informatik.uni-mannheim.de/bizer/berlinsparqlbenchmark/spec/Dataset/index.html)), which yields about 3.5M triples.
The benchmark was run on a Dell Precision 5520 with 16GB of RAM. For Oxigraph, available memory was limited to 1GB.
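
As a rough sanity check on the dataset size, the triple count can be estimated from the number of products. The sketch below assumes about 350 triples per product, which is the ratio implied by the figures above (10k products, 3.5M triples). The function name `estimate_triples` is illustrative and is not part of the benchmark tooling.

```python
# Assumption: roughly 350 triples per product, the ratio implied by the
# figures above (10k products -> about 3.5M triples).
TRIPLES_PER_PRODUCT = 350


def estimate_triples(products, triples_per_product=TRIPLES_PER_PRODUCT):
    """Return a rough estimate of the dataset size in triples."""
    return products * triples_per_product


print(estimate_triples(10_000))  # 3500000
```

The real count depends on the generator's output, so treat this as an order-of-magnitude estimate only.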
### Explore
The [explore use case](http://wifo5-03.informatik.uni-mannheim.de/bizer/berlinsparqlbenchmark/spec/ExploreUseCase/index.html) is composed of 11 queries that do simple data retrieval.
Query 6 existed in previous versions of the benchmark but has since been removed.
![explore use case results](bsbm.explore.svg)
### Business Intelligence
The [business intelligence use case](http://wifo5-03.informatik.uni-mannheim.de/bizer/berlinsparqlbenchmark/spec/BusinessIntelligenceUseCase/index.html) is composed of 8 complex analytics queries.
Query 4 appears to fail on Virtuoso, and query 5 on Blazegraph and GraphDB.
![business intelligence use case results](bsbm.businessIntelligence.svg)
## How to reproduce the benchmark
The benchmark code is in the `bsbm-tools` submodule. Fetch it with `git submodule update` before running the benchmark.
To run the benchmark for Oxigraph, run `bash bsbm_oxigraph.sh`. It compiles the current Oxigraph code and runs the benchmark against it.
You can tweak the number of products in the dataset and the available memory using the environment variables at the beginning of `bsbm_oxigraph.sh`.
To generate the plots, run `python3 bsbm-plot.py`.
Scripts are also provided for the other systems (`bsbm_blazegraph.sh`, `bsbm_graphdb.sh` and `bsbm_virtuoso.sh`).
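
The plotting script reads the average query execution time (`aqet`) of each query from the BSBM XML result files and skips queries whose value is not positive. Below is a minimal sketch of that extraction. It assumes each `query` element carries an `nr` attribute and an `aqet` child, as the calls in `bsbm-plot.py` suggest. The sample XML and the function name `read_aqet` are illustrative and are not taken from the benchmark output.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed result file used only for illustration.
SAMPLE = """<bsbm>
  <queries>
    <query nr="1"><aqet>0.012</aqet></query>
    <query nr="2"><aqet>0.0</aqet></query>
    <query nr="3"><aqet>0.250</aqet></query>
  </queries>
</bsbm>"""


def read_aqet(xml_text):
    """Map query number -> average execution time, skipping non-positive values."""
    root = ET.fromstring(xml_text)
    result = {}
    for query in root.iter("query"):
        val = float(query.find("aqet").text)
        if val > 0:  # a non-positive value is treated as "no measurement"
            result[int(query.attrib["nr"])] = val
    return result


print(read_aqet(SAMPLE))  # {1: 0.012, 3: 0.25}
```

Skipping non-positive values keeps unmeasured queries out of the charts, which matters for the business intelligence plot because it uses a log scale.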

@ -27,7 +27,7 @@ for file in glob('bsbm.explore.*.xml'):
         val = float(query.find('aqet').text)
         if val > 0:
             aqet[run][int(query.attrib['nr'])] = val
-plot_y_per_x_per_plot(aqet, 'query id', 'execution time (s)', 'bsbm.explore.png')
+plot_y_per_x_per_plot(aqet, 'query id', 'execution time (s)', 'bsbm.explore.svg')
 # BSBM business intelligence
 aqet = defaultdict(dict)
@ -37,6 +37,6 @@ for file in glob('bsbm.businessIntelligence.*.xml'):
         val = float(query.find('aqet').text)
         if val > 0:
             aqet[run][int(query.attrib['nr'])] = val
-plot_y_per_x_per_plot(aqet, 'query id', 'execution time (s) - log scale', 'bsbm.businessIntelligence.png', log=True)
+plot_y_per_x_per_plot(aqet, 'query id', 'execution time (s) - log scale', 'bsbm.businessIntelligence.svg', log=True)
 plt.show()

File diff suppressed because it is too large

File diff suppressed because it is too large

@ -1,7 +1,8 @@
 #!/usr/bin/env bash
-DATASET_SIZE=100000
-MEMORY_SIZE=1000000
+DATASET_SIZE=100000 # number of products in the dataset. Around 350 triples are generated per product.
+MEMORY_SIZE=1000000 # available memory for Oxigraph in KB (1000000 KB is about 1 GB). Useful to simulate low-RAM machines.
 cd bsbm-tools
 ./generate -fc -pc ${DATASET_SIZE} -s nt -fn "explore-${DATASET_SIZE}"
 cargo build --release --manifest-path="../../server/Cargo.toml"
