Selecting predictors for discriminant analysis of species performance: an example from an amphibious softwater plant

(2012) PLANT BIOLOGY. 14(2). p.271-277
Abstract
Selecting an appropriate variable subset in linear multivariate methods is an important methodological issue for ecologists. Interest often exists in obtaining general predictive capacity or in finding causal inferences from predictor variables. Because of a lack of solid knowledge on a studied phenomenon, scientists explore predictor variables in order to find the most meaningful (i.e. discriminating) ones. As an example, we modelled the response of the amphibious softwater plant Eleocharis multicaulis using canonical discriminant function analysis. We asked how variables can be selected through comparison of several methods: univariate Pearson chi-square screening, principal components analysis (PCA) and step-wise analysis, as well as combinations of some methods. We expected PCA to perform best. The selected methods were evaluated through fit and stability of the resulting discriminant functions and through correlations between these functions and the predictor variables. The chi-square subset, at P < 0.05, followed by a step-wise sub-selection, gave the best results. In contrast to expectations, PCA performed poorly, as did step-wise analysis. The different chi-square subset methods all yielded ecologically meaningful variables, while probable noise variables were also selected by PCA and step-wise analysis. We advise against the simple use of PCA or step-wise discriminant analysis to obtain an ecologically meaningful variable subset; the former because it does not take into account the response variable, the latter because noise variables are likely to be selected. We suggest that univariate screening techniques are a worthwhile alternative for variable selection in ecology.
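As a rough illustration of the univariate screening step named in the abstract, the sketch below keeps only predictors whose association with the response passes a Pearson chi-square test at P < 0.05. It is not the authors' procedure. The median split of each predictor, the two-class response, the function names and the toy data are all assumptions made for the example, and the subsequent step-wise sub-selection and discriminant analysis are not shown.

```python
import math
from statistics import median


def chi_square_2x2(table):
    """Pearson chi-square statistic and p-value for a 2x2 table (1 df, no continuity correction)."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row = (a + b, c + d)
    col = (a + c, b + d)
    if 0 in row or 0 in col:
        return 0.0, 1.0  # degenerate table: treat as no evidence of association
    stat = 0.0
    for i, obs_row in enumerate(table):
        for j, obs in enumerate(obs_row):
            expected = row[i] * col[j] / n
            stat += (obs - expected) ** 2 / expected
    # Survival function of the chi-square distribution with 1 degree of freedom.
    p = math.erfc(math.sqrt(stat / 2.0))
    return stat, p


def screen_predictors(predictors, response, alpha=0.05):
    """Return the names of predictors whose median split is associated with the response."""
    selected = []
    for name, values in predictors.items():
        cut = median(values)  # assumed binning rule, not taken from the paper
        table = [[0, 0], [0, 0]]
        for x, y in zip(values, response):
            table[int(x > cut)][int(bool(y))] += 1
        _, p = chi_square_2x2(table)
        if p < alpha:
            selected.append(name)
    return selected


# Hypothetical toy data: 10 sites in one response class, 10 in the other.
response = [1] * 10 + [0] * 10
predictors = {
    "signal": list(range(20, 0, -1)),  # high values coincide with response = 1
    "noise": [1, 2] * 10,              # unrelated to the response
}
print(screen_predictors(predictors, response))
```

Because each predictor is tested against the response on its own, a variable unrelated to the response (here "noise") is unlikely to pass the threshold, which is the property the abstract contrasts with PCA.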
Keywords
model evaluation, Data mining, overfitting, principal components analysis, goodness-of-fit test, Pearson chi-square test, step-wise analysis, DISTRIBUTION MODELS, PRINCIPAL-COMPONENTS, ECOLOGICAL THEORY, DISTRIBUTIONS, REGRESSION, VARIABLES, NICHE

Citation

Chicago
Vanderhaeghe, Floris, Alfons JP Smolders, Jan GM Roelofs, and Maurice Hoffmann. 2012. “Selecting Predictors for Discriminant Analysis of Species Performance: An Example from an Amphibious Softwater Plant.” Plant Biology 14 (2): 271–277.
APA
Vanderhaeghe, F., Smolders, A. J. P., Roelofs, J. G. M., & Hoffmann, M. (2012). Selecting predictors for discriminant analysis of species performance: an example from an amphibious softwater plant. Plant Biology, 14(2), 271–277.
Vancouver
1. Vanderhaeghe F, Smolders AJP, Roelofs JGM, Hoffmann M. Selecting predictors for discriminant analysis of species performance: an example from an amphibious softwater plant. Plant Biology. 2012;14(2):271–7.
MLA
Vanderhaeghe, Floris, Alfons JP Smolders, Jan GM Roelofs, and Maurice Hoffmann. “Selecting Predictors for Discriminant Analysis of Species Performance: An Example from an Amphibious Softwater Plant.” Plant Biology 14.2 (2012): 271–277. Print.
@article{1989459,
  abstract     = {Selecting an appropriate variable subset in linear multivariate methods is an important methodological issue for ecologists. Interest often exists in obtaining general predictive capacity or in finding causal inferences from predictor variables. Because of a lack of solid knowledge on a studied phenomenon, scientists explore predictor variables in order to find the most meaningful (i.e. discriminating) ones. As an example, we modelled the response of the amphibious softwater plant Eleocharis multicaulis using canonical discriminant function analysis. We asked how variables can be selected through comparison of several methods: univariate Pearson chi-square screening, principal components analysis (PCA) and step-wise analysis, as well as combinations of some methods. We expected PCA to perform best. The selected methods were evaluated through fit and stability of the resulting discriminant functions and through correlations between these functions and the predictor variables. The chi-square subset, at P {\textless} 0.05, followed by a step-wise sub-selection, gave the best results. In contrast to expectations, PCA performed poorly, as did step-wise analysis. The different chi-square subset methods all yielded ecologically meaningful variables, while probable noise variables were also selected by PCA and step-wise analysis. We advise against the simple use of PCA or step-wise discriminant analysis to obtain an ecologically meaningful variable subset; the former because it does not take into account the response variable, the latter because noise variables are likely to be selected. We suggest that univariate screening techniques are a worthwhile alternative for variable selection in ecology.},
  author       = {Vanderhaeghe, Floris and Smolders, Alfons JP and Roelofs, Jan GM and Hoffmann, Maurice},
  issn         = {1435-8603},
  journal      = {PLANT BIOLOGY},
  keyword      = {model evaluation,Data mining,overfitting,principal components analysis,goodness-of-fit test,Pearson chi-square test,step-wise analysis,DISTRIBUTION MODELS,PRINCIPAL-COMPONENTS,ECOLOGICAL THEORY,DISTRIBUTIONS,REGRESSION,VARIABLES,NICHE},
  language     = {eng},
  number       = {2},
  pages        = {271--277},
  title        = {Selecting predictors for discriminant analysis of species performance: an example from an amphibious softwater plant},
  url          = {http://dx.doi.org/10.1111/j.1438-8677.2011.00497.x},
  volume       = {14},
  year         = {2012},
}
