
How to classify, detect, and manage univariate and multivariate outliers, with emphasis on pre-registration

Abstract
Researchers often lack knowledge about how to deal with outliers when analyzing their data. Even more frequently, researchers do not pre-specify how they plan to manage outliers. In this paper we aim to improve research practices by outlining what you need to know about outliers. We start by providing a functional definition of outliers. We then lay down an appropriate nomenclature/classification of outliers. This nomenclature is used to understand what kinds of outliers can be encountered and serves as a guideline to make appropriate decisions regarding the conservation, deletion, or recoding of outliers. These decisions might impact the validity of statistical inferences as well as the reproducibility of our experiments. To be able to make informed decisions about outliers you first need proper detection tools. We remind readers why the most common outlier detection methods are problematic and recommend the use of the median absolute deviation to detect univariate outliers, and of the Mahalanobis-MCD distance to detect multivariate outliers. An R package was created that can be used to easily perform these detection tests. Finally, we promote the use of pre-registration to avoid flexibility in data analysis when handling outliers.
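The univariate recommendation in the abstract can be illustrated with a minimal sketch. This is not the authors' R package; it is a plain Python illustration of median-absolute-deviation (MAD) screening. The function name `mad_outliers`, the cutoff of 3, and the choice to flag nothing when the MAD is zero are assumptions made for this example. The constant 1.4826 is the usual scaling factor that makes the MAD comparable to a standard deviation under normality.

```python
from statistics import median


def mad_outliers(values, threshold=3.0, scale=1.4826):
    """Flag values whose distance from the median exceeds threshold * MAD.

    threshold and the zero-MAD behaviour are illustrative assumptions,
    not settings taken from the paper or its R package.
    """
    values = list(values)
    center = median(values)
    # Scaled median of the absolute deviations from the median.
    mad = scale * median(abs(v - center) for v in values)
    if mad == 0:
        # More than half the values equal the median: no usable scale,
        # so this sketch flags nothing rather than dividing by zero.
        return [False] * len(values)
    return [abs(v - center) / mad > threshold for v in values]


if __name__ == "__main__":
    scores = [1, 2, 3, 4, 5, 6, 100]
    print(mad_outliers(scores))
```

Because both the center (median) and the spread (MAD) are robust statistics, a single extreme value such as 100 cannot inflate the cutoff the way it would with a mean-and-standard-deviation rule. Whatever threshold is used should be fixed before seeing the data, in line with the abstract's emphasis on pre-registration.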
Keywords
PSYCHOLOGY, TRANSPARENCY, outliers, preregistration, robust detection, Mahalanobis distance, median absolute deviation, minimum covariance determinant

Downloads

  • 8637569.pdf (full text, open access, PDF, 546.55 KB)

Citation


MLA
Leys, Christophe, et al. “How to Classify, Detect, and Manage Univariate and Multivariate Outliers, with Emphasis on Pre-Registration.” INTERNATIONAL REVIEW OF SOCIAL PSYCHOLOGY, vol. 32, no. 1, 2019, doi:10.5334/irsp.289.
APA
Leys, C., Delacre, M., Mora, Y. L., Lakens, D., & Ley, C. (2019). How to classify, detect, and manage univariate and multivariate outliers, with emphasis on pre-registration. INTERNATIONAL REVIEW OF SOCIAL PSYCHOLOGY, 32(1). https://doi.org/10.5334/irsp.289
Chicago author-date
Leys, Christophe, Marie Delacre, Youri L. Mora, Daniel Lakens, and Christophe Ley. 2019. “How to Classify, Detect, and Manage Univariate and Multivariate Outliers, with Emphasis on Pre-Registration.” INTERNATIONAL REVIEW OF SOCIAL PSYCHOLOGY 32 (1). https://doi.org/10.5334/irsp.289.
Chicago author-date (all authors)
Leys, Christophe, Marie Delacre, Youri L. Mora, Daniel Lakens, and Christophe Ley. 2019. “How to Classify, Detect, and Manage Univariate and Multivariate Outliers, with Emphasis on Pre-Registration.” INTERNATIONAL REVIEW OF SOCIAL PSYCHOLOGY 32 (1). doi:10.5334/irsp.289.
Vancouver
1. Leys C, Delacre M, Mora YL, Lakens D, Ley C. How to classify, detect, and manage univariate and multivariate outliers, with emphasis on pre-registration. INTERNATIONAL REVIEW OF SOCIAL PSYCHOLOGY. 2019;32(1).
IEEE
[1] C. Leys, M. Delacre, Y. L. Mora, D. Lakens, and C. Ley, “How to classify, detect, and manage univariate and multivariate outliers, with emphasis on pre-registration,” INTERNATIONAL REVIEW OF SOCIAL PSYCHOLOGY, vol. 32, no. 1, 2019.
@article{8619726,
  abstract     = {{Researchers often lack knowledge about how to deal with outliers when analyzing their data. Even more frequently, researchers do not pre-specify how they plan to manage outliers. In this paper we aim to improve research practices by outlining what you need to know about outliers. We start by providing a functional definition of outliers. We then lay down an appropriate nomenclature/classification of outliers. This nomenclature is used to understand what kinds of outliers can be encountered and serves as a guideline to make appropriate decisions regarding the conservation, deletion, or recoding of outliers. These decisions might impact the validity of statistical inferences as well as the reproducibility of our experiments. To be able to make informed decisions about outliers you first need proper detection tools. We remind readers why the most common outlier detection methods are problematic and recommend the use of the median absolute deviation to detect univariate outliers, and of the Mahalanobis-MCD distance to detect multivariate outliers. An R package was created that can be used to easily perform these detection tests. Finally, we promote the use of pre-registration to avoid flexibility in data analysis when handling outliers.}},
  articleno    = {{5}},
  author       = {{Leys, Christophe and Delacre, Marie and Mora, Youri L. and Lakens, Daniel and Ley, Christophe}},
  issn         = {{2397-8570}},
  journal      = {{INTERNATIONAL REVIEW OF SOCIAL PSYCHOLOGY}},
  keywords     = {{PSYCHOLOGY,TRANSPARENCY,outliers,preregistration,robust detection,Mahalanobis distance,median absolute deviation,minimum covariance determinant}},
  language     = {{eng}},
  number       = {{1}},
  pages        = {{10}},
  title        = {{How to classify, detect, and manage univariate and multivariate outliers, with emphasis on pre-registration}},
  url          = {{http://dx.doi.org/10.5334/irsp.289}},
  volume       = {{32}},
  year         = {{2019}},
}
