Environmental Statistics and Spatial Modeling for Conservation

Biostatistics + Machine Learning for Ecological Modeling

Author

Dr Tahir Ali | AG de Meaux | TRR341 Ecological Genomics | University of Cologne

Audience: Researchers in Ecology, Environmental Science, Conservation Biology
Prerequisites: Introductory Statistics, Ecological Foundations, Basic R (helpful but not required)
Format: 5 sessions x 90 minutes (30 min theory + 60 min hands-on R practice)

Course Overview

This advanced course integrates key concepts from spatial ecology, biostatistics, and machine learning to build statistically rigorous ecological models.

The workshop combines ecological theory with practical R-based spatial modeling workflows, enabling participants to design robust sampling strategies and build reproducible species distribution models (SDMs).

Core topics include:

  • Spatial sampling theory
  • Biostatistical diagnostics
  • BioClim climate predictors
  • Multivariate ecological analysis
  • Machine learning for SDM
  • Spatial cross-validation

Participants will learn to interpret models using biological reasoning and statistical rigor.


Spatial Ecology & Ecological Modeling Framework

Topic: Spatial Ecology & Ecological Modeling
Tools: sf, terra, ggplot2, dplyr, spatstat, spdep, tmap, dismo, randomForest, blockCV, mlr3
Focus: Spatial sampling theory, spatial point-pattern analysis, BioClim climate predictors, multivariate analysis, and machine learning for species distribution modeling with spatial cross-validation

Schedule & Sessions

Session 1 - Spatial Sampling & Ecological Inference

  • Random, systematic, stratified, transect sampling designs
  • Road and river sampling bias
  • Dendritic network dependence
  • Moran’s I and spatial autocorrelation
  • Effective sample size
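To make the last two items concrete, here is a minimal base-R sketch of Moran's I with a permutation test on simulated data. The grid, the gradient, and the variable names are all made up for illustration. In the hands-on part the same statistic would more likely come from `spdep` (`moran.test()` or `moran.mc()`). The effective-sample-size line is only a crude heuristic borrowed from first-order autoregressive series and is an assumption of this sketch, not a method stated in the course description.

```r
# Simulated example: 10 x 10 grid of sites with a west-east gradient plus noise
set.seed(42)
coords <- expand.grid(x = 1:10, y = 1:10)
n <- nrow(coords)
z <- coords$x + rnorm(n, sd = 1)          # spatially structured variable

# Neighbour weights: sites within distance 1.5 (the 8 surrounding cells)
d <- as.matrix(dist(coords))
w <- (d > 0 & d <= 1.5) * 1
w <- w / rowSums(w)                        # row-standardise the weights

# Moran's I = (n / S0) * sum_ij w_ij (z_i - zbar)(z_j - zbar) / sum_i (z_i - zbar)^2
morans_i <- function(z, w) {
  zc <- z - mean(z)
  (length(z) / sum(w)) * sum(w * outer(zc, zc)) / sum(zc^2)
}
I_obs <- morans_i(z, w)

# Permutation test: shuffle values across sites to break the spatial structure
n_perm <- 499
I_perm <- replicate(n_perm, morans_i(sample(z), w))
p_value <- (sum(I_perm >= I_obs) + 1) / (n_perm + 1)

# Rough heuristic (assumption): treat I as a lag-1 correlation, as for AR(1) series
n_eff <- n * (1 - I_obs) / (1 + I_obs)

round(c(moran_I = I_obs, p_value = p_value, n = n, n_eff = n_eff), 3)
```

A strongly positive I means that neighbouring sites carry overlapping information, so the number of independent observations is smaller than the number of sites.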

Session 2 - Data Exploration & Normalization

  • Distribution diagnostics (histograms, QQ plots, density plots)
  • Transformations: log, Box-Cox, scaling
  • Outlier detection (Isolation Forest)
  • Ecological interpretation of transformations
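As a rough illustration of the diagnostics-then-transform workflow in this session, the sketch below simulates a right-skewed predictor, estimates a Box-Cox exponent with `MASS::boxcox()`, and then standardises the result. The variable `precip` and its distribution are hypothetical, and the small `skewness()` helper is defined here only for the example. Isolation Forest is not shown.

```r
library(MASS)  # boxcox(); MASS ships with standard R installations

set.seed(1)
# Hypothetical right-skewed predictor, e.g. annual precipitation in mm
precip <- rlnorm(200, meanlog = 6, sdlog = 0.6)

# Simple numeric diagnostic: sample skewness (0 for a symmetric distribution)
skewness <- function(x) mean((x - mean(x))^3) / sd(x)^3
skewness(precip)

# Box-Cox: profile log-likelihood over lambda for an intercept-only model
bc <- boxcox(precip ~ 1, lambda = seq(-2, 2, by = 0.05), plotit = FALSE)
lambda_hat <- bc$x[which.max(bc$y)]

# Apply the transformation; lambda = 0 corresponds to the log transform
precip_bc <- if (abs(lambda_hat) < 1e-8) {
  log(precip)
} else {
  (precip^lambda_hat - 1) / lambda_hat
}
skewness(precip_bc)

# Scaling: centre to mean 0 and rescale to standard deviation 1
precip_z <- as.numeric(scale(precip_bc))
```

The visual counterparts are `hist(precip)`, `plot(density(precip))`, and `qqnorm(precip_bc); qqline(precip_bc)`. An estimated lambda near 0 suggests the variable acts multiplicatively, which is often the ecologically meaningful scale for quantities such as precipitation or abundance.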

Session 3 - Collinearity & Regularization

  • Correlation matrices
  • Variance Inflation Factor (VIF)
  • Ecological trade-offs in variable selection
  • Ridge regression, LASSO, Elastic Net
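The following sketch shows one way the VIF can be computed by hand, using simulated predictors in which two temperature variables are nearly collinear. All variable names and values are hypothetical. `car::vif()` returns the same quantity for a fitted model. The Elastic Net step is guarded so that it only runs if `glmnet` is installed. In `glmnet`, `alpha = 0` gives ridge, `alpha = 1` gives LASSO, and values in between give the Elastic Net.

```r
set.seed(7)
n <- 300
# Hypothetical climate predictors: temp_max is almost a copy of temp_mean
temp_mean <- rnorm(n, mean = 10, sd = 3)
temp_max  <- temp_mean + rnorm(n, mean = 5, sd = 0.5)
precip    <- rnorm(n, mean = 800, sd = 150)
env <- data.frame(temp_mean, temp_max, precip)

# Correlation matrix as a first look
round(cor(env), 2)

# VIF_j = 1 / (1 - R_j^2), with R_j^2 from regressing predictor j on the others
vif_manual <- sapply(names(env), function(v) {
  f  <- reformulate(setdiff(names(env), v), response = v)
  r2 <- summary(lm(f, data = env))$r.squared
  1 / (1 - r2)
})
round(vif_manual, 1)

# Elastic Net on a simulated response (runs only if glmnet is available)
if (requireNamespace("glmnet", quietly = TRUE)) {
  y <- 0.5 * temp_mean + 0.002 * precip + rnorm(n)
  x <- scale(as.matrix(env))
  fit_cv <- glmnet::cv.glmnet(x, y, alpha = 0.5)
  print(coef(fit_cv, s = "lambda.1se"))
}
```

A common rule of thumb treats VIF values above roughly 5 to 10 as a sign of problematic collinearity. Which of two collinear variables to drop is an ecological decision rather than a purely statistical one.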

Session 4 - Multivariate & Dimensionality Reduction

  • PCA interpretation in ecological space
  • Loadings and gradient interpretation
  • UMAP / t-SNE visualization
  • Climate niche space analysis
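A small PCA example may help with reading loadings as environmental gradients. The sketch uses base R `prcomp()` on simulated data. The column names follow the WorldClim BioClim convention (bio1 annual mean temperature, bio5 maximum temperature of the warmest month, bio12 annual precipitation, bio15 precipitation seasonality), but the values are invented and driven by two hypothetical latent gradients. UMAP and t-SNE are not shown.

```r
set.seed(3)
n <- 200
# Two hypothetical latent gradients behind the climate variables
temperature_gradient <- rnorm(n)
moisture_gradient    <- rnorm(n)

clim <- data.frame(
  bio1  = 10  + 4   * temperature_gradient + rnorm(n, sd = 0.5),
  bio5  = 22  + 5   * temperature_gradient + rnorm(n, sd = 0.8),
  bio12 = 900 + 200 * moisture_gradient    + rnorm(n, sd = 40),
  bio15 = 40  - 8   * moisture_gradient    + rnorm(n, sd = 3)
)

# Centre and scale: the variables are measured in different units
pca <- prcomp(clim, center = TRUE, scale. = TRUE)

# Proportion of variance explained by each component
var_explained <- pca$sdev^2 / sum(pca$sdev^2)
round(var_explained, 2)

# Loadings: which variables define each axis of climate space
round(pca$rotation[, 1:2], 2)

# Site scores on the first two axes, i.e. positions in climate niche space
scores <- pca$x[, 1:2]
head(scores)
```

In this simulation each of the first two components should load mainly on either the temperature pair or the moisture pair. The sign of a component is arbitrary, so interpretation rests on which variables load together and in which relative direction.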

Session 5 - Habitat Suitability / Species Distribution Modeling (SDM)

  • GLM and GAM foundations
  • Random Forest and Boosted Trees
  • Regularized MaxEnt modeling
  • Variable importance and interpretation
  • Spatial cross-validation
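The mechanics of spatial cross-validation can be sketched in base R as follows. The landscape, the species response, and the block size are all hypothetical, and the block-to-fold assignment is a simple hand-rolled diagonal pattern rather than the procedure used by `blockCV` or `spatialsample`. The model is a binomial GLM with a quadratic temperature term. The sketch only demonstrates how the folds are built and evaluated, not how large the gap between random and spatial estimates will be on real data.

```r
set.seed(11)
n <- 600
sites <- data.frame(x = runif(n, 0, 100), y = runif(n, 0, 100))

# Hypothetical temperature surface: west-east gradient plus local noise
sites$temp <- 5 + 0.15 * sites$x + rnorm(n, sd = 1)

# Simulated species with a unimodal temperature response (optimum at 12)
p_true <- plogis(1.5 - 0.3 * (sites$temp - 12)^2)
sites$presence <- rbinom(n, 1, p_true)

# AUC from the rank-sum (Mann-Whitney) statistic
auc <- function(obs, pred) {
  r  <- rank(pred)
  n1 <- sum(obs == 1)
  n0 <- sum(obs == 0)
  (sum(r[obs == 1]) - n1 * (n1 + 1) / 2) / (n1 * n0)
}

# Fit on all folds but k, evaluate on fold k
cv_auc <- function(fold) {
  sapply(sort(unique(fold)), function(k) {
    train <- sites[fold != k, ]
    test  <- sites[fold == k, ]
    fit <- glm(presence ~ poly(temp, 2), family = binomial, data = train)
    auc(test$presence, predict(fit, newdata = test, type = "response"))
  })
}

# Random folds: neighbouring sites can end up in both training and test sets
random_fold <- sample(rep(1:4, length.out = n))

# Spatial folds: 25 x 25 blocks, assigned to 4 folds along diagonals
col_id <- pmin(floor(sites$x / 25), 3)
row_id <- pmin(floor(sites$y / 25), 3)
spatial_fold <- (col_id + row_id) %% 4 + 1

auc_random  <- cv_auc(random_fold)
auc_spatial <- cv_auc(spatial_fold)
round(rbind(random = auc_random, spatial = auc_spatial), 3)
```

The same loop should work with any model that provides a `predict()` method, for example a GAM or a random forest, by swapping the fitting call. With real, spatially autocorrelated data, random folds tend to give more optimistic scores than blocked folds.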

Software Requirements

Please install R >= 4.2 and RStudio before Session 1.

Install Required R Packages

# Data manipulation and visualization
install.packages(c("tidyverse","dplyr","tidyr","ggplot2"))

# Spatial data
install.packages(c("sf","terra","spdep","spatstat","tmap"))

# Statistical modeling
install.packages(c("MASS","car","glmnet","mgcv"))

# Machine learning
install.packages(c("caret","randomForest","gbm","dismo"))

# Multivariate analysis
install.packages(c("vegan","FactoMineR","factoextra"))

# Outlier detection (Isolation Forest; isotree is one CRAN implementation)
install.packages("isotree")

# Spatial cross-validation
install.packages(c("blockCV","spatialsample"))

# Advanced machine learning workflow
install.packages("mlr3")
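After installation, a quick check like the one below can confirm that the R version is recent enough and that the packages load. The `pkgs` vector is only an illustrative subset and can be extended to cover the full list above.

```r
# Check the R version against the stated minimum
if (getRversion() < "4.2.0") {
  message("R ", getRversion(), " detected; please update to R >= 4.2.")
}

# Illustrative subset of the workshop packages; extend as needed
pkgs <- c("sf", "terra", "glmnet", "mgcv", "blockCV")

# requireNamespace() returns TRUE if the package can be loaded
installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)

missing_pkgs <- names(installed)[!installed]
if (length(missing_pkgs) > 0) {
  message("Not yet installed: ", paste(missing_pkgs, collapse = ", "))
} else {
  message("All listed packages are available.")
}
```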

Workshop Format & Materials

Each session includes:

  • 30 minutes theoretical concepts
  • 60 minutes hands-on R analysis
  • datasets and scripts provided beforehand
  • reproducible ecological modeling workflows

Sessions build progressively:

  1. Spatial sampling
  2. Data diagnostics
  3. Collinearity and regularization
  4. Multivariate ecological space
  5. Species distribution modeling

The final session includes model comparison and spatial cross-validation.

Session links will be activated one week before each workshop.


Learning Objectives

By the end of this workshop participants will be able to:

  • Design ecologically meaningful spatial sampling strategies
  • Diagnose the distributions of ecological variables and choose appropriate transformations
  • Detect multicollinearity in environmental predictors
  • Apply dimensionality reduction to ecological datasets
  • Build species distribution models using multiple algorithms
  • Interpret model predictions in ecological context
  • Apply spatial cross-validation to detect overfitting and obtain performance estimates that are not inflated by spatial autocorrelation

License

These workshop materials are shared under CC BY-SA 4.0.

