
Lazy low-resource coreference resolution : a study on leveraging black-box translation tools

Abstract
Large annotated corpora for coreference resolution are available for few languages. For machine translation, however, strong black-box systems exist for many languages. We empirically explore the appealing idea of leveraging such translation tools for bootstrapping coreference resolution in languages with limited resources. Two scenarios are analyzed, in which a large coreference corpus in a high-resource language is used for coreference predictions in a smaller language, i.e., by machine translating either the training corpus or the test data. In our empirical evaluation of coreference resolution using the two scenarios on several medium-resource languages, we find no improvement over monolingual baseline models. Our analysis of the various sources of error inherent to the studied scenarios, reveals that in fact the quality of contemporary machine translation tools is the main limiting factor.

Downloads

  • Published proceedings paper.pdf
    • full text (Published version) | open access | PDF | 280.98 KB

Citation

Please use this url to cite or link to this publication:

MLA
Bitew, Semere Kiros, et al. “Lazy Low-Resource Coreference Resolution : A Study on Leveraging Black-Box Translation Tools.” Proceedings of the Fourth Workshop on Computational Models of Reference, Anaphora and Coreference (CRAC 2021), edited by Maciej Ogrodniczuk et al., Association for Computational Linguistics (ACL), 2021, pp. 57–62, doi:10.18653/v1/2021.crac-1.6.
APA
Bitew, S. K., Deleu, J., Develder, C., & Demeester, T. (2021). Lazy low-resource coreference resolution : a study on leveraging black-box translation tools. In M. Ogrodniczuk, S. Pradhan, M. Poesio, Y. Grishina, & V. Ng (Eds.), Proceedings of the Fourth Workshop on Computational Models of Reference, Anaphora and Coreference (CRAC 2021) (pp. 57–62). https://doi.org/10.18653/v1/2021.crac-1.6
Chicago author-date
Bitew, Semere Kiros, Johannes Deleu, Chris Develder, and Thomas Demeester. 2021. “Lazy Low-Resource Coreference Resolution : A Study on Leveraging Black-Box Translation Tools.” In Proceedings of the Fourth Workshop on Computational Models of Reference, Anaphora and Coreference (CRAC 2021), edited by Maciej Ogrodniczuk, Sameer Pradhan, Massimo Poesio, Yulia Grishina, and Vincent Ng, 57–62. Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2021.crac-1.6.
Chicago author-date (all authors)
Bitew, Semere Kiros, Johannes Deleu, Chris Develder, and Thomas Demeester. 2021. “Lazy Low-Resource Coreference Resolution : A Study on Leveraging Black-Box Translation Tools.” In Proceedings of the Fourth Workshop on Computational Models of Reference, Anaphora and Coreference (CRAC 2021), edited by Maciej Ogrodniczuk, Sameer Pradhan, Massimo Poesio, Yulia Grishina, and Vincent Ng, 57–62. Association for Computational Linguistics (ACL). doi:10.18653/v1/2021.crac-1.6.
Vancouver
1. Bitew SK, Deleu J, Develder C, Demeester T. Lazy low-resource coreference resolution : a study on leveraging black-box translation tools. In: Ogrodniczuk M, Pradhan S, Poesio M, Grishina Y, Ng V, editors. Proceedings of the Fourth Workshop on Computational Models of Reference, Anaphora and Coreference (CRAC 2021). Association for Computational Linguistics (ACL); 2021. p. 57–62.
IEEE
[1] S. K. Bitew, J. Deleu, C. Develder, and T. Demeester, “Lazy low-resource coreference resolution : a study on leveraging black-box translation tools,” in Proceedings of the Fourth Workshop on Computational Models of Reference, Anaphora and Coreference (CRAC 2021), Punta Cana, Dominican Republic, 2021, pp. 57–62.
BibTeX
@inproceedings{8721922,
  abstract     = {{Large annotated corpora for coreference resolution are available for few languages. For machine translation, however, strong black-box systems exist for many languages. We empirically explore the appealing idea of leveraging such translation tools for bootstrapping coreference resolution in languages with limited resources. Two scenarios are analyzed, in which a large coreference corpus in a high-resource language is used for coreference predictions in a smaller language, i.e., by machine translating either the training corpus or the test data. In our empirical evaluation of coreference resolution using the two scenarios on several medium-resource languages, we find no improvement over monolingual baseline models. Our analysis of the various sources of error inherent to the studied scenarios, reveals that in fact the quality of contemporary machine translation tools is the main limiting factor.}},
  author       = {{Bitew, Semere Kiros and Deleu, Johannes and Develder, Chris and Demeester, Thomas}},
  booktitle    = {{Proceedings of the Fourth Workshop on Computational Models of Reference, Anaphora and Coreference (CRAC 2021)}},
  editor       = {{Ogrodniczuk, Maciej and Pradhan, Sameer and Poesio, Massimo and Grishina, Yulia and Ng, Vincent}},
  isbn         = {{9781955917025}},
  language     = {{eng}},
  location     = {{Punta Cana, Dominican Republic}},
  pages        = {{57--62}},
  publisher    = {{Association for Computational Linguistics (ACL)}},
  title        = {{Lazy low-resource coreference resolution : a study on leveraging black-box translation tools}},
  url          = {{http://doi.org/10.18653/v1/2021.crac-1.6}},
  year         = {{2021}},
}
