A possibilistic approach to string comparison

Antoon Bronselaer (UGent) and Guy De Tré (UGent)
Abstract
In this paper, comparison of strings is tackled from a possibilistic point of view. Instead of using the concept of similarity between strings, coreference between strings is adopted. The possibility of coreference is estimated by means of a possibilistic comparison operator. In the literature, two important classes of comparison methods for strings have been distinguished: character-based methods and token-based methods. The first class treats a string as a sequence of characters, while the second class treats a string as a vector of substrings. The first contribution of this paper is to propose a new character-based method that is able to detect typographical errors and abbreviations. The main advantage of the proposed technique is its very low complexity in comparison with existing character-based techniques. In a second contribution, two-level systems are investigated and a new approach is described. The novelty of the proposed two-level system is the use of multiset comparison rather than vector comparison. It is shown how an ordered weighted conjunctive operator that uses a parameterized fuzzy quantifier to deliver weights is competitive with frequency-based weights. In addition, the use of a quantifier is significantly faster than the use of existing weight techniques. In a third contribution, a novel class of hybrid techniques is proposed that combines the advantages of several methods. Finally, comparative tests regarding accuracy and execution time are performed and reported.
Keywords
fuzzy logic, algorithms, operators (mathematics), possibility theory, string matching, information integration, sequence

Citation

MLA
Bronselaer, Antoon, and Guy De Tré. “A Possibilistic Approach to String Comparison.” IEEE Transactions on Fuzzy Systems 17.1 (2009): 208–223. Print.
APA
Bronselaer, A., & De Tré, G. (2009). A possibilistic approach to string comparison. IEEE Transactions on Fuzzy Systems, 17(1), 208–223.
Chicago author-date
Bronselaer, Antoon, and Guy De Tré. 2009. “A Possibilistic Approach to String Comparison.” IEEE Transactions on Fuzzy Systems 17 (1): 208–223.
Vancouver
1. Bronselaer A, De Tré G. A possibilistic approach to string comparison. IEEE Transactions on Fuzzy Systems. 2009;17(1):208–23.
IEEE
[1] A. Bronselaer and G. De Tré, “A possibilistic approach to string comparison,” IEEE Transactions on Fuzzy Systems, vol. 17, no. 1, pp. 208–223, 2009.
BibTeX
@article{597044,
  abstract     = {In this paper, comparison of strings is tackled from a possibilistic point of view. Instead of using the concept of similarity between strings, coreference between strings is adopted. The possibility of coreference is estimated by means of a possibilistic comparison operator. In the literature, two important classes of comparison methods for strings have been distinguished: character-based methods and token-based methods. The first class treats a string as a sequence of characters, while the second class treats a string as a vector of substrings. The first contribution of this paper is to propose a new character-based method that is able to detect typographical errors and abbreviations. The main advantage of the proposed technique is its very low complexity in comparison with existing character-based techniques. In a second contribution, two-level systems are investigated and a new approach is described. The novelty of the proposed two-level system is the use of multiset comparison rather than vector comparison. It is shown how an ordered weighted conjunctive operator that uses a parameterized fuzzy quantifier to deliver weights is competitive with frequency-based weights. In addition, the use of a quantifier is significantly faster than the use of existing weight techniques. In a third contribution, a novel class of hybrid techniques is proposed that combines the advantages of several methods. Finally, comparative tests regarding accuracy and execution time are performed and reported.},
  author       = {Bronselaer, Antoon and De Tré, Guy},
  issn         = {1063-6706},
  journal      = {IEEE Transactions on Fuzzy Systems},
  keywords     = {fuzzy logic,algorithms,operators (mathematics),possibility theory,string matching,information integration,sequence},
  language     = {eng},
  number       = {1},
  pages        = {208--223},
  title        = {A possibilistic approach to string comparison},
  url          = {http://dx.doi.org/10.1109/TFUZZ.2008.2008025},
  volume       = {17},
  year         = {2009},
}