A protocol for automated timber species identification using metabolome profiling
- Author
- Victor Deklerck, Thomas Mortier (UGent) , Nathalie Goeders, RB Cody, Willem Waegeman (UGent) , E Espinoza, Joris Van Acker (UGent) , Jan Van den Bulcke (UGent) and H Beeckman
- Abstract
- Using chemical fingerprints for timber species identification is a relatively new, but promising technique. However, little is known about the effect of pre-processing spectral data parameter settings on the timber species classification accuracy. Therefore, this study presents an extensive and automated analysis method using the random forest machine learning algorithm on a set of highly valuable timber species from the Meliaceae family. Metabolome profiles were collected using direct analysis in real-time (DART (TM)) ionisation coupled with time-of-flight mass spectrometry (TOFMS) analysis of heartwood specimens for 175 individuals (representing 10 species). In order to analyse variability in classification accuracy, 110 sets of data pre-processing parameter combinations consisting of mass tolerance for binning and relative abundance cut-off thresholds were tested. Furthermore, for each set of parameters (designated binning/threshold setting), a random search for one hyperparameter of interest was performed, i.e. the number of variables (in this case ions) drawn randomly for each random forest analysis. The best classification accuracy (82.2%) was achieved with 47 variables and a binning and threshold combination of 40 mDa and 4%, respectively. Entandrophragma angolense is mostly confused with Entandrophragma candollei and Khaya anthotheca, and several Swietenia species are confused with each other due to the high similarity of their chemical fingerprints. Entandrophragma cylindricum, Entandrophragma utile, Khaya ivorensis, Lovoa trichilioides and Swietenia macrophylla are easy to discriminate and show fewer misclassifications. The choice of parameter settings, whether in the data pre-processing (binning and threshold) or in the classification algorithm (hyperparameters), results in variability in classification accuracy. Therefore, a preliminary parameter screening is proposed before constructing the final model when using the random forest algorithm for classification. Overall, DART-TOFMS in combination with random forest is a powerful tool for species identification.
- Keywords
- MAHOGANY SWIETENIA-MACROPHYLLA, NEAR-INFRARED SPECTROSCOPY, MESOAMERICAN POPULATIONS, WOOD IDENTIFICATION, MASS-SPECTROMETRY, GENETIC-STRUCTURE, REAL-TIME, ORIGIN, TRADE, SPECIMENS
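The two pre-processing parameters named in the abstract (mass tolerance for binning and relative abundance cut-off) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not specify the binning procedure, so fixed-width bins stand in for the mass tolerance, and the function name `preprocess_spectrum`, its parameters, and the peak values in the usage note are all hypothetical.

```python
from collections import defaultdict


def preprocess_spectrum(peaks, bin_width_mda, cutoff_pct):
    """Threshold and bin one mass spectrum (illustrative assumption only).

    peaks: list of (m/z in Da, absolute abundance) pairs.
    bin_width_mda: integer bin width in millidalton, a stand-in for the
        mass tolerance described in the abstract.
    cutoff_pct: peaks below this percentage of the base peak are dropped.
    Returns {bin_index: summed relative abundance in percent}.
    """
    if not peaks:
        return {}
    base = max(abundance for _, abundance in peaks)
    if base <= 0:
        return {}
    binned = defaultdict(float)
    for mz, abundance in peaks:
        # Express abundance relative to the most intense (base) peak.
        rel = 100.0 * abundance / base
        if rel < cutoff_pct:
            continue
        # Convert m/z to integer millidalton, then assign a fixed-width bin.
        bin_index = int(round(mz * 1000.0)) // bin_width_mda
        binned[bin_index] += rel
    return dict(binned)
```

For example, calling `preprocess_spectrum([(301.010, 1000.0), (301.030, 500.0), (301.050, 200.0), (450.200, 30.0)], 40, 4.0)` merges the first two made-up peaks into one bin and drops the last peak, whose relative abundance is 3%. Repeating this over a grid of bin widths and cut-offs would give one feature table per binning/threshold setting, each of which could then be passed to a classifier.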
Citation
Please use this url to cite or link to this publication: http://hdl.handle.net/1854/LU-8620241
- MLA
- Deklerck, Victor, et al. “A Protocol for Automated Timber Species Identification Using Metabolome Profiling.” WOOD SCIENCE AND TECHNOLOGY, vol. 53, no. 4, 2019, pp. 953–65, doi:10.1007/s00226-019-01111-1.
- APA
- Deklerck, V., Mortier, T., Goeders, N., Cody, R., Waegeman, W., Espinoza, E., … Beeckman, H. (2019). A protocol for automated timber species identification using metabolome profiling. WOOD SCIENCE AND TECHNOLOGY, 53(4), 953–965. https://doi.org/10.1007/s00226-019-01111-1
- Chicago author-date
- Deklerck, Victor, Thomas Mortier, Nathalie Goeders, RB Cody, Willem Waegeman, E Espinoza, Joris Van Acker, Jan Van den Bulcke, and H Beeckman. 2019. “A Protocol for Automated Timber Species Identification Using Metabolome Profiling.” WOOD SCIENCE AND TECHNOLOGY 53 (4): 953–65. https://doi.org/10.1007/s00226-019-01111-1.
- Chicago author-date (all authors)
- Deklerck, Victor, Thomas Mortier, Nathalie Goeders, RB Cody, Willem Waegeman, E Espinoza, Joris Van Acker, Jan Van den Bulcke, and H Beeckman. 2019. “A Protocol for Automated Timber Species Identification Using Metabolome Profiling.” WOOD SCIENCE AND TECHNOLOGY 53 (4): 953–965. doi:10.1007/s00226-019-01111-1.
- Vancouver
- 1. Deklerck V, Mortier T, Goeders N, Cody R, Waegeman W, Espinoza E, et al. A protocol for automated timber species identification using metabolome profiling. WOOD SCIENCE AND TECHNOLOGY. 2019;53(4):953–65.
- IEEE
- [1] V. Deklerck et al., “A protocol for automated timber species identification using metabolome profiling,” WOOD SCIENCE AND TECHNOLOGY, vol. 53, no. 4, pp. 953–965, 2019.
@article{8620241,
  abstract     = {{Using chemical fingerprints for timber species identification is a relatively new, but promising technique. However, little is known about the effect of pre-processing spectral data parameter settings on the timber species classification accuracy. Therefore, this study presents an extensive and automated analysis method using the random forest machine learning algorithm on a set of highly valuable timber species from the Meliaceae family. Metabolome profiles were collected using direct analysis in real-time (DART (TM)) ionisation coupled with time-of-flight mass spectrometry (TOFMS) analysis of heartwood specimens for 175 individuals (representing 10 species). In order to analyse variability in classification accuracy, 110 sets of data pre-processing parameter combinations consisting of mass tolerance for binning and relative abundance cut-off thresholds were tested. Furthermore, for each set of parameters (designated binning/threshold setting), a random search for one hyperparameter of interest was performed, i.e. the number of variables (in this case ions) drawn randomly for each random forest analysis. The best classification accuracy (82.2%) was achieved with 47 variables and a binning and threshold combination of 40mDa and 4%, respectively. Entandrophragma angolense is mostly confused with Entandrophragma candollei and Khaya anthotheca, and several Swietenia species are confused with each other due to the high similarity of their chemical fingerprints. Entandrophragma cylindricum, Entandrophragma utile, Khaya ivorensis, Lovoa trichilioides and Swietenia macrophylla are easy to discriminate and show less misclassifications. The choice of parameter settings, whether it is in the data pre-processing (binning and threshold) or classification algorithm (hyperparameters), results in variability in classification accuracy. Therefore, a preliminary parameter screening is proposed before constructing the final model when using the random forest algorithm for classification. Overall, DART-TOFMS in combination with random forest is a powerful tool for species identification.}},
  author       = {{Deklerck, Victor and Mortier, Thomas and Goeders, Nathalie and Cody, RB and Waegeman, Willem and Espinoza, E and Van Acker, Joris and Van den Bulcke, Jan and Beeckman, H}},
  issn         = {{0043-7719}},
  journal      = {{WOOD SCIENCE AND TECHNOLOGY}},
  keywords     = {{MAHOGANY SWIETENIA-MACROPHYLLA,NEAR-INFRARED SPECTROSCOPY,MESOAMERICAN POPULATIONS,WOOD IDENTIFICATION,MASS-SPECTROMETRY,GENETIC-STRUCTURE,REAL-TIME,ORIGIN,TRADE,SPECIMENS}},
  language     = {{eng}},
  number       = {{4}},
  pages        = {{953--965}},
  title        = {{A protocol for automated timber species identification using metabolome profiling}},
  url          = {{http://doi.org/10.1007/s00226-019-01111-1}},
  volume       = {{53}},
  year         = {{2019}},
}