
A novel approach to assess and improve syntactic interoperability in data integration

Rihem Nasfi (UGent) , Antoon Bronselaer (UGent) and Guy De Tré (UGent)
Abstract
Data integration is essential to enrich a database with external information. One effective approach is to match shared identifiers across diverse databases. However, a lack of syntactic interoperability, which refers to the ability to match data based on their syntax, can pose challenges. In this paper, we present a novel method to evaluate and enhance syntactic interoperability, considering associated costs. First, we introduce the linking index and completeness index as generic measures of fine-grained syntactic interoperability. Second, we analyze the data consistency level of the identifiers using a rule-based framework for data quality assessment. Third, we propose a data integration strategy that strikes a balance between fixing data inconsistencies and the resulting benefits, as measured by the linking and completeness indices. The approach is illustrated through two use cases: bibliographic databases and clinical trial registries. The results demonstrate that standardizing identifiers' representations can significantly improve syntactic interoperability in certain scenarios, while in others the standardization process does not yield improvements, hence discouraging integration decisions. By conducting a cost-benefit analysis of improving data interoperability, the approach enables data integrators to make informed decisions regarding the feasibility and advantages of proceeding with data integration.
Keywords
Library and Information Sciences, Management Science and Operations Research, Computer Science Applications, Media Technology, Information Systems, Relational databases, Interoperability, Data quality

Citation

MLA
Nasfi, Rihem, et al. “A Novel Approach to Assess and Improve Syntactic Interoperability in Data Integration.” INFORMATION PROCESSING & MANAGEMENT, vol. 60, no. 6, 2023, doi:10.1016/j.ipm.2023.103522.
APA
Nasfi, R., Bronselaer, A., & De Tré, G. (2023). A novel approach to assess and improve syntactic interoperability in data integration. INFORMATION PROCESSING & MANAGEMENT, 60(6). https://doi.org/10.1016/j.ipm.2023.103522
Chicago author-date
Nasfi, Rihem, Antoon Bronselaer, and Guy De Tré. 2023. “A Novel Approach to Assess and Improve Syntactic Interoperability in Data Integration.” INFORMATION PROCESSING & MANAGEMENT 60 (6). https://doi.org/10.1016/j.ipm.2023.103522.
Chicago author-date (all authors)
Nasfi, Rihem, Antoon Bronselaer, and Guy De Tré. 2023. “A Novel Approach to Assess and Improve Syntactic Interoperability in Data Integration.” INFORMATION PROCESSING & MANAGEMENT 60 (6). doi:10.1016/j.ipm.2023.103522.
Vancouver
1. Nasfi R, Bronselaer A, De Tré G. A novel approach to assess and improve syntactic interoperability in data integration. INFORMATION PROCESSING & MANAGEMENT. 2023;60(6).
IEEE
[1] R. Nasfi, A. Bronselaer, and G. De Tré, “A novel approach to assess and improve syntactic interoperability in data integration,” INFORMATION PROCESSING & MANAGEMENT, vol. 60, no. 6, 2023.
@article{01HF6JBSHA3244PFGSK4XH3X33,
  abstract     = {{Data integration is essential to enrich a database with external information. One effective approach is to match shared identifiers across diverse databases. However, a lack of syntactic interoperability, which refers to the ability to match data based on their syntax, can pose challenges. In this paper, we present a novel method to evaluate and enhance syntactic interoperability, considering associated costs. First, we introduce the linking index and completeness index as generic measures of fine-grained syntactic interoperability. Second, we analyze the data consistency level of the identifiers using a rule-based framework for data quality assessment. Third, we propose a data integration strategy that strikes a balance between fixing data inconsistencies and the resulting benefits, as measured by the linking and completeness indices. The approach is illustrated through two use cases: bibliographic databases and clinical trial registries. The results demonstrate that standardizing identifiers' representations can significantly improve syntactic interoperability in certain scenarios, while in others the standardization process does not yield improvements, hence discouraging integration decisions. By conducting a cost-benefit analysis of improving data interoperability, the approach enables data integrators to make informed decisions regarding the feasibility and advantages of proceeding with data integration.}},
  articleno    = {{103522}},
  author       = {{Nasfi, Rihem and Bronselaer, Antoon and De Tré, Guy}},
  issn         = {{0306-4573}},
  journal      = {{INFORMATION PROCESSING \& MANAGEMENT}},
  keywords     = {{Library and Information Sciences,Management Science and Operations Research,Computer Science Applications,Media Technology,Information Systems,Relational databases,Interoperability,Data quality}},
  language     = {{eng}},
  number       = {{6}},
  pages        = {{23}},
  title        = {{A novel approach to assess and improve syntactic interoperability in data integration}},
  url          = {{http://doi.org/10.1016/j.ipm.2023.103522}},
  volume       = {{60}},
  year         = {{2023}},
}
