Substring filtering for low-cost linked data interfaces

Joachim Van Herwegen (UGent) , Laurens De Vocht (UGent) , Ruben Verborgh (UGent) , Erik Mannens (UGent) and Rik Van de Walle (UGent)
Abstract
Recently, Triple Pattern Fragments (TPF) were introduced as a low-cost server-side interface when high numbers of clients need to evaluate SPARQL queries. Scalability is achieved by moving part of the query execution to the client, at the cost of elevated query times. Since the TPF interface purposely does not support complex constructs such as SPARQL filters, queries that use them need to be executed mostly on the client, resulting in long execution times. We therefore investigated the impact of adding a literal substring matching feature to the TPF interface, with the goal of improving query performance while maintaining low server cost. In this paper, we discuss the client/server setup and compare the performance of SPARQL queries on multiple implementations, including Elastic Search and case-insensitive FM-index. Our evaluations indicate that these improvements allow for faster query execution without significantly increasing the load on the server. Offering the substring feature on TPF servers allows users to obtain faster responses for filter-based SPARQL queries. Furthermore, substring matching can be used to support other filters such as complete regular expressions or range queries.
Keywords
Linked data, SEARCH, SPARQL, String matching, Regular expressions

Citation

MLA
Van Herwegen, Joachim et al. “Substring Filtering for Low-cost Linked Data Interfaces.” Lecture Notes in Computer Science. Vol. 9366. CHAM: SPRINGER INT PUBLISHING AG, 2015. 128–143. Print.
APA
Van Herwegen, J., De Vocht, L., Verborgh, R., Mannens, E., & Van de Walle, R. (2015). Substring filtering for low-cost linked data interfaces. Lecture Notes in Computer Science (Vol. 9366, pp. 128–143). Presented at the 14th International Semantic Web Conference (ISWC), CHAM: SPRINGER INT PUBLISHING AG.
Chicago author-date
Van Herwegen, Joachim, Laurens De Vocht, Ruben Verborgh, Erik Mannens, and Rik Van de Walle. 2015. “Substring Filtering for Low-cost Linked Data Interfaces.” In Lecture Notes in Computer Science, 9366:128–143. CHAM: SPRINGER INT PUBLISHING AG.
Vancouver
1. Van Herwegen J, De Vocht L, Verborgh R, Mannens E, Van de Walle R. Substring filtering for low-cost linked data interfaces. Lecture Notes in Computer Science. CHAM: SPRINGER INT PUBLISHING AG; 2015. p. 128–43.
IEEE
[1] J. Van Herwegen, L. De Vocht, R. Verborgh, E. Mannens, and R. Van de Walle, “Substring filtering for low-cost linked data interfaces,” in Lecture Notes in Computer Science, Bethlehem, PA, 2015, vol. 9366, pp. 128–143.
BibTeX
@inproceedings{7241110,
  abstract     = {Recently, Triple Pattern Fragments (TPF) were introduced as a low-cost server-side interface when high numbers of clients need to evaluate SPARQL queries. Scalability is achieved by moving part of the query execution to the client, at the cost of elevated query times. Since the TPF interface purposely does not support complex constructs such as SPARQL filters, queries that use them need to be executed mostly on the client, resulting in long execution times. We therefore investigated the impact of adding a literal substring matching feature to the TPF interface, with the goal of improving query performance while maintaining low server cost. In this paper, we discuss the client/server setup and compare the performance of SPARQL queries on multiple implementations, including Elastic Search and case-insensitive FM-index. Our evaluations indicate that these improvements allow for faster query execution without significantly increasing the load on the server. Offering the substring feature on TPF servers allows users to obtain faster responses for filter-based SPARQL queries. Furthermore, substring matching can be used to support other filters such as complete regular expressions or range queries.},
  author       = {Van Herwegen, Joachim and De Vocht, Laurens and Verborgh, Ruben and Mannens, Erik and Van de Walle, Rik},
  booktitle    = {Lecture Notes in Computer Science},
  isbn         = {978-3-319-25007-6},
  issn         = {0302-9743},
  keywords     = {Linked data,SEARCH,SPARQL,String matching,Regular expressions},
  language     = {eng},
  location     = {Bethlehem, PA},
  pages        = {128--143},
  publisher    = {SPRINGER INT PUBLISHING AG},
  title        = {Substring filtering for low-cost linked data interfaces},
  url          = {http://dx.doi.org/10.1007/978-3-319-25007-6_8},
  volume       = {9366},
  year         = {2015},
}