Optimizing storage of RDF archives using bidirectional delta chains

(2022) SEMANTIC WEB. 13(4). p.705-734
Author
Ruben Taelman, Thibault Mahieu, Martin Vanbrabant and Ruben Verborgh
Abstract
Linked Open Datasets on the Web that are published as RDF can evolve over time. There is a need to be able to store such evolving RDF datasets, and query across their versions. Different storage strategies are available for managing such versioned datasets, each being efficient for specific types of versioned queries. In recent work, a hybrid storage strategy has been introduced that combines these different strategies to lead to more efficient query execution for all versioned query types at the cost of increased ingestion time. While this trade-off is beneficial in the context of Web querying, it suffers from exponential ingestion times in terms of the number of versions, which becomes problematic for RDF datasets with many versions. As such, there is a need for an improved storage strategy that scales better in terms of ingestion time for many versions. We have designed, implemented, and evaluated a change to the hybrid storage strategy where we make use of a bidirectional delta chain instead of the default unidirectional delta chain. In this article, we introduce a concrete architecture for this change, together with accompanying ingestion and querying algorithms. Experimental results from our implementation show that the ingestion time is significantly reduced. As an additional benefit, this change also leads to lower total storage size and even improved query execution performance in some cases. This work shows that modifying the structure of delta chains within the hybrid storage strategy can be highly beneficial for RDF archives. In future work, other modifications to this delta chain structure deserve to be investigated, to further improve the scalability of ingestion and querying of datasets with many versions.
Keywords
Linked Data, RDF archiving, Semantic Data Versioning, storage, indexing, WEB

Downloads

  • publisher version.pdf: full text (Published version), open access, PDF, 2.22 MB

Citation

Please use this url to cite or link to this publication:

MLA
Taelman, Ruben, et al. “Optimizing Storage of RDF Archives Using Bidirectional Delta Chains.” SEMANTIC WEB, vol. 13, no. 4, 2022, pp. 705–34, doi:10.3233/SW-210449.
APA
Taelman, R., Mahieu, T., Vanbrabant, M., & Verborgh, R. (2022). Optimizing storage of RDF archives using bidirectional delta chains. SEMANTIC WEB, 13(4), 705–734. https://doi.org/10.3233/SW-210449
Chicago author-date
Taelman, Ruben, Thibault Mahieu, Martin Vanbrabant, and Ruben Verborgh. 2022. “Optimizing Storage of RDF Archives Using Bidirectional Delta Chains.” SEMANTIC WEB 13 (4): 705–34. https://doi.org/10.3233/SW-210449.
Chicago author-date (all authors)
Taelman, Ruben, Thibault Mahieu, Martin Vanbrabant, and Ruben Verborgh. 2022. “Optimizing Storage of RDF Archives Using Bidirectional Delta Chains.” SEMANTIC WEB 13 (4): 705–734. doi:10.3233/SW-210449.
Vancouver
1. Taelman R, Mahieu T, Vanbrabant M, Verborgh R. Optimizing storage of RDF archives using bidirectional delta chains. SEMANTIC WEB. 2022;13(4):705–34.
IEEE
[1] R. Taelman, T. Mahieu, M. Vanbrabant, and R. Verborgh, “Optimizing storage of RDF archives using bidirectional delta chains,” SEMANTIC WEB, vol. 13, no. 4, pp. 705–734, 2022.
BibTeX
@article{8724591,
  abstract     = {{Linked Open Datasets on the Web that are published as RDF can evolve over time. There is a need to be able to store such evolving RDF datasets, and query across their versions. Different storage strategies are available for managing such versioned datasets, each being efficient for specific types of versioned queries. In recent work, a hybrid storage strategy has been introduced that combines these different strategies to lead to more efficient query execution for all versioned query types at the cost of increased ingestion time. While this trade-off is beneficial in the context of Web querying, it suffers from exponential ingestion times in terms of the number of versions, which becomes problematic for RDF datasets with many versions. As such, there is a need for an improved storage strategy that scales better in terms of ingestion time for many versions. We have designed, implemented, and evaluated a change to the hybrid storage strategy where we make use of a bidirectional delta chain instead of the default unidirectional delta chain. In this article, we introduce a concrete architecture for this change, together with accompanying ingestion and querying algorithms. Experimental results from our implementation show that the ingestion time is significantly reduced. As an additional benefit, this change also leads to lower total storage size and even improved query execution performance in some cases. This work shows that modifying the structure of delta chains within the hybrid storage strategy can be highly beneficial for RDF archives. In future work, other modifications to this delta chain structure deserve to be investigated, to further improve the scalability of ingestion and querying of datasets with many versions.}},
  author       = {{Taelman, Ruben and Mahieu, Thibault and Vanbrabant, Martin and Verborgh, Ruben}},
  issn         = {{1570-0844}},
  journal      = {{SEMANTIC WEB}},
  keywords     = {{Linked Data,RDF archiving,Semantic Data Versioning,storage,indexing,WEB}},
  language     = {{eng}},
  number       = {{4}},
  pages        = {{705--734}},
  title        = {{Optimizing storage of RDF archives using bidirectional delta chains}},
  url          = {{http://dx.doi.org/10.3233/SW-210449}},
  volume       = {{13}},
  year         = {{2022}},
}
