Alleviating manual feature engineering for part-of-speech tagging of Twitter microposts using distributed word representations

Abstract
Many algorithms for natural language processing rely on manual feature engineering. In this paper, we show that we can achieve state-of-the-art performance for part-of-speech tagging of Twitter microposts by solely relying on automatically inferred word embeddings as features and a neural network. By pre-training the neural network with large amounts of automatically labeled Twitter microposts to initialize the weights, we achieve a state-of-the-art accuracy of 88.9% when tagging Twitter microposts with Penn Treebank tags.
Keywords
neural networks, distributed word representations, Twitter, microposts, part-of-speech tagging

Downloads

  • (...).pdf | full text | UGent only | PDF | 520.07 KB

Citation

Please use this url to cite or link to this publication:

MLA
Godin, Fréderic, et al. “Alleviating Manual Feature Engineering for Part-of-Speech Tagging of Twitter Microposts Using Distributed Word Representations.” Workshop on Modern Machine Learning and Natural Language Processing, NIPS 2014, Proceedings, 2014.
APA
Godin, F., Vandersmissen, B., Jalalvand, A., De Neve, W., & Van de Walle, R. (2014). Alleviating manual feature engineering for part-of-speech tagging of Twitter microposts using distributed word representations. Workshop on Modern Machine Learning and Natural Language Processing, NIPS 2014, Proceedings. Presented at the Workshop on Modern Machine Learning and Natural Language Processing (NIPS 2014), Montréal, Canada.
Chicago author-date
Godin, Fréderic, Baptist Vandersmissen, Azarakhsh Jalalvand, Wesley De Neve, and Rik Van de Walle. 2014. “Alleviating Manual Feature Engineering for Part-of-Speech Tagging of Twitter Microposts Using Distributed Word Representations.” In Workshop on Modern Machine Learning and Natural Language Processing, NIPS 2014, Proceedings.
Chicago author-date (all authors)
Godin, Fréderic, Baptist Vandersmissen, Azarakhsh Jalalvand, Wesley De Neve, and Rik Van de Walle. 2014. “Alleviating Manual Feature Engineering for Part-of-Speech Tagging of Twitter Microposts Using Distributed Word Representations.” In Workshop on Modern Machine Learning and Natural Language Processing, NIPS 2014, Proceedings.
Vancouver
1. Godin F, Vandersmissen B, Jalalvand A, De Neve W, Van de Walle R. Alleviating manual feature engineering for part-of-speech tagging of Twitter microposts using distributed word representations. In: Workshop on Modern Machine Learning and Natural Language Processing, NIPS 2014, Proceedings. 2014.
IEEE
[1] F. Godin, B. Vandersmissen, A. Jalalvand, W. De Neve, and R. Van de Walle, “Alleviating manual feature engineering for part-of-speech tagging of Twitter microposts using distributed word representations,” in Workshop on Modern Machine Learning and Natural Language Processing, NIPS 2014, Proceedings, Montréal, Canada, 2014.
BibTeX
@inproceedings{5817855,
  abstract     = {{Many algorithms for natural language processing rely on manual feature engineering. In this paper, we show that we can achieve state-of-the-art performance for part-of-speech tagging of Twitter microposts by solely relying on automatically inferred word embeddings as features and a neural network. By pre-training the neural network with large amounts of automatically labeled Twitter microposts to initialize the weights, we achieve a state-of-the-art accuracy of 88.9\% when tagging Twitter microposts with Penn Treebank tags.}},
  author       = {{Godin, Fréderic and Vandersmissen, Baptist and Jalalvand, Azarakhsh and De Neve, Wesley and Van de Walle, Rik}},
  booktitle    = {{Workshop on Modern Machine Learning and Natural Language Processing, NIPS 2014, Proceedings}},
  keywords     = {{neural networks,distributed word representations,Twitter,microposts,part-of-speech tagging}},
  language     = {{eng}},
  location     = {{Montréal, Canada}},
  pages        = {{5}},
  title        = {{Alleviating manual feature engineering for part-of-speech tagging of Twitter microposts using distributed word representations}},
  year         = {{2014}},
}