
In search of relevant predictors for marine species distribution modelling using the MarineSPEED benchmark dataset

(2018) DIVERSITY AND DISTRIBUTIONS. 24(2). p.144-157
Author
Samuel Bosch, Lennert Tyberghein, Klaas Deneudt, Francisco Hernandez and Olivier De Clerck
Abstract
Aim: Ideally, datasets for species distribution modelling (SDM) contain evenly sampled records covering the entire distribution of the species, confirmed absences and auxiliary ecophysiological data allowing informed decisions on relevant predictors. Unfortunately, these criteria are rarely met for marine organisms for which distributions are too often only scantly characterized and absences generally not recorded. Here, we investigate predictor relevance as a function of modelling algorithms and settings for a global dataset of marine species.
Location: Global marine.
Methods: We selected well-studied and identifiable species from all major marine taxonomic groups. Distribution records were compiled from public sources (e.g., OBIS, GBIF, Reef Life Survey) and linked to environmental data from Bio-ORACLE and MARSPEC. Using this dataset, predictor relevance was analysed under different variations of modelling algorithms, numbers of predictor variables, cross-validation strategies, sampling bias mitigation methods, evaluation methods and ranking methods. SDMs for all combinations of predictors from eight correlation groups were fitted and ranked, from which the top five predictors were selected as the most relevant.
Results: We collected two million distribution records from 514 species across 18 phyla. Mean sea surface temperature and calcite are, respectively, the most relevant and irrelevant predictors. A less clear pattern was derived from the other predictors. The biggest differences in predictor relevance were induced by varying the number of predictors, the modelling algorithm and the sample selection bias correction. The distribution data and associated environmental data are made available through the R package marinespeed and at http://marinespeed.org.
Main conclusions: While temperature is a relevant predictor of global marine species distributions, considerable variation in predictor relevance is linked to the SDM set-up. We promote the usage of a standardized benchmark dataset (MarineSPEED) for methodological SDM studies.
Keywords
benchmark dataset, ecological niche modelling, marine, spatial cross-validation, species distribution modelling, variable importance, ECOLOGICAL NICHE MODELS, PSEUDO-ABSENCE DATA, SAMPLING BIAS, SELECTION BIAS, BALTIC SEA, HABITAT, IMPROVE, CLIMATE, SCALE, PERFORMANCE

Downloads

  • (...).pdf: full text | UGent only | PDF | 2.16 MB

Citation

MLA
Bosch, Samuel et al. “In Search of Relevant Predictors for Marine Species Distribution Modelling Using the MarineSPEED Benchmark Dataset.” DIVERSITY AND DISTRIBUTIONS 24.2 (2018): 144–157. Print.
APA
Bosch, S., Tyberghein, L., Deneudt, K., Hernandez, F., & De Clerck, O. (2018). In search of relevant predictors for marine species distribution modelling using the MarineSPEED benchmark dataset. DIVERSITY AND DISTRIBUTIONS, 24(2), 144–157.
Chicago author-date
Bosch, Samuel, Lennert Tyberghein, Klaas Deneudt, Francisco Hernandez, and Olivier De Clerck. 2018. “In Search of Relevant Predictors for Marine Species Distribution Modelling Using the MarineSPEED Benchmark Dataset.” Diversity and Distributions 24 (2): 144–157.
Vancouver
1. Bosch S, Tyberghein L, Deneudt K, Hernandez F, De Clerck O. In search of relevant predictors for marine species distribution modelling using the MarineSPEED benchmark dataset. DIVERSITY AND DISTRIBUTIONS. 2018;24(2):144–57.
IEEE
[1] S. Bosch, L. Tyberghein, K. Deneudt, F. Hernandez, and O. De Clerck, “In search of relevant predictors for marine species distribution modelling using the MarineSPEED benchmark dataset,” DIVERSITY AND DISTRIBUTIONS, vol. 24, no. 2, pp. 144–157, 2018.
@article{8556861,
  abstract     = {{Aim: Ideally, datasets for species distribution modelling (SDM) contain evenly sampled records covering the entire distribution of the species, confirmed absences and auxiliary ecophysiological data allowing informed decisions on relevant predictors. Unfortunately, these criteria are rarely met for marine organisms for which distributions are too often only scantly characterized and absences generally not recorded. Here, we investigate predictor relevance as a function of modelling algorithms and settings for a global dataset of marine species. 
Location: Global marine. 
Methods: We selected well-studied and identifiable species from all major marine taxonomic groups. Distribution records were compiled from public sources (e.g., OBIS, GBIF, Reef Life Survey) and linked to environmental data from Bio-ORACLE and MARSPEC. Using this dataset, predictor relevance was analysed under different variations of modelling algorithms, numbers of predictor variables, cross-validation strategies, sampling bias mitigation methods, evaluation methods and ranking methods. SDMs for all combinations of predictors from eight correlation groups were fitted and ranked, from which the top five predictors were selected as the most relevant. 
Results: We collected two million distribution records from 514 species across 18 phyla. Mean sea surface temperature and calcite are, respectively, the most relevant and irrelevant predictors. A less clear pattern was derived from the other predictors. The biggest differences in predictor relevance were induced by varying the number of predictors, the modelling algorithm and the sample selection bias correction. The distribution data and associated environmental data are made available through the R package marinespeed and at http://marinespeed.org. 
Main conclusions: While temperature is a relevant predictor of global marine species distributions, considerable variation in predictor relevance is linked to the SDM set-up. We promote the usage of a standardized benchmark dataset (MarineSPEED) for methodological SDM studies.}},
  author       = {{Bosch, Samuel and Tyberghein, Lennert and Deneudt, Klaas and Hernandez, Francisco and De Clerck, Olivier}},
  issn         = {{1366-9516}},
  journal      = {{DIVERSITY AND DISTRIBUTIONS}},
  keywords     = {{benchmark dataset,ecological niche modelling,marine,spatial cross-validation,species distribution modelling,variable importance,ECOLOGICAL NICHE MODELS,PSEUDO-ABSENCE DATA,SAMPLING BIAS,SELECTION BIAS,BALTIC SEA,HABITAT,IMPROVE,CLIMATE,SCALE,PERFORMANCE}},
  language     = {{eng}},
  number       = {{2}},
  pages        = {{144--157}},
  title        = {{In search of relevant predictors for marine species distribution modelling using the MarineSPEED benchmark dataset}},
  url          = {{http://dx.doi.org/10.1111/ddi.12668}},
  volume       = {{24}},
  year         = {{2018}},
}
