Introduction

This educational webpage explores the mathematical modeling of alien species co-invasions based on the paper "Mixed additive modeling of global alien species co-invasions." The paper presents a sophisticated statistical framework for understanding how plants and insects co-invade different regions around the world.

The key innovation in the paper is the use of a mixed additive Relational Event Model (REM) that can handle:

This webpage will help you understand the mathematical concepts, inference methods, and simulation techniques used in the paper. We'll also extend the analysis to fungi species, providing a foundation for further research in this area.

Figure 1: Cumulative number of first records of alien species over time, showing the acceleration of invasions.

Mathematical Framework

Relational Event Model (REM)

The paper models alien species invasions as a marked point process, where each "event" represents the first record of a particular species in a specific region. This approach allows us to model the timing and patterns of invasions.

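The fundamental concept is the hazard function, which gives the instantaneous rate at which an invasion event occurs at time \(t\), given that it has not occurred before \(t\). It is a rate per unit time rather than a probability, so it can exceed 1: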

\[ \lambda(t; \theta) = \lim_{\Delta t \to 0} \frac{P(t \leq T < t + \Delta t | T \geq t)}{\Delta t} \] (1)

Where \(T\) is the random variable representing the time of invasion, and \(\theta\) represents the model parameters.
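As a minimal sketch of Eq. (1), the limit can be approximated numerically with a small \(\Delta t\) and compared against a known closed form. The Weibull waiting-time distribution and all parameter values below are illustrative assumptions, not choices made in the paper.

```python
import math

def weibull_survival(t, shape, scale):
    """S(t) = P(T >= t) for a Weibull waiting time (illustrative assumption)."""
    return math.exp(-((t / scale) ** shape))

def hazard_from_definition(t, shape, scale, dt=1e-6):
    """Approximate Eq. (1): P(t <= T < t + dt | T >= t) / dt for a small dt."""
    s_t = weibull_survival(t, shape, scale)
    s_next = weibull_survival(t + dt, shape, scale)
    return (s_t - s_next) / (s_t * dt)

def weibull_hazard(t, shape, scale):
    """Closed-form Weibull hazard, used only to check the approximation."""
    return (shape / scale) * (t / scale) ** (shape - 1)

numeric = hazard_from_definition(2.0, shape=1.5, scale=3.0)
exact = weibull_hazard(2.0, shape=1.5, scale=3.0)
print(numeric, exact)

# A hazard is a rate, so values above 1 are legitimate.
print(weibull_hazard(1.0, shape=1.0, scale=0.5))
```

The finite-difference value and the closed form agree to several decimal places, which is all the limit in Eq. (1) asserts.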

Mixed Additive Model

The paper uses a mixed additive model to incorporate various factors affecting invasion rates:

\[ \lambda_{ij}(t) = \exp\left( \alpha + \sum_{k=1}^{K} \beta_k x_{ijk}(t) + \sum_{l=1}^{L} f_l(z_{ijl}(t)) + u_i + v_j \right) \] (2)

Where:

  • \(\lambda_{ij}(t)\) is the hazard rate for species \(i\) invading region \(j\) at time \(t\)
  • \(\alpha\) is the intercept, i.e. the log of the baseline hazard rate (the baseline rate itself is \(\exp(\alpha)\))
  • \(\beta_k\) are coefficients for linear effects \(x_{ijk}(t)\)
  • \(f_l(z_{ijl}(t))\) are smooth functions of covariates \(z_{ijl}(t)\)
  • \(u_i\) and \(v_j\) are random effects for species \(i\) and region \(j\)
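The terms of Eq. (2) can be assembled as in the following sketch. The function name `hazard`, the quadratic smooth, and every numeric value are hypothetical placeholders chosen for illustration; the paper estimates these quantities from data.

```python
import math

def hazard(alpha, betas, xs, smooths, zs, u_i, v_j):
    """Evaluate Eq. (2) for one species-region pair at one time point."""
    linear = sum(b * x for b, x in zip(betas, xs))       # sum_k beta_k * x_ijk(t)
    smooth = sum(f(z) for f, z in zip(smooths, zs))      # sum_l f_l(z_ijl(t))
    return math.exp(alpha + linear + smooth + u_i + v_j)

# Hypothetical inputs: one linear covariate, one smooth term, two random effects.
rate = hazard(
    alpha=-5.0,
    betas=[0.8],
    xs=[1.0],
    smooths=[lambda z: -0.1 * z ** 2],
    zs=[2.0],
    u_i=0.3,
    v_j=-0.2,
)
print(rate)
```

Because every term enters inside the exponential, each one acts multiplicatively on the rate: a random effect \(u_i = 0.3\) scales the hazard of species \(i\) by \(\exp(0.3)\) in every region.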

Time-Varying Effects

One of the key innovations in the paper is modeling how the effects of covariates change over time:

\[ \beta_k(t) = \sum_{m=1}^{M} \gamma_{km} B_m(t) \] (3)

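Where \(B_m(t)\) are basis functions (such as B-splines) and \(\gamma_{km}\) are coefficients to be estimated. Substituting \(\beta_k(t)\) for the constant \(\beta_k\) in Eq. (2) lets the effect of covariate \(k\) change over calendar time.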
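To make Eq. (3) concrete, here is a small sketch that evaluates a time-varying coefficient from a cubic B-spline basis using the Cox-de Boor recursion. The knot placement (1900, 1950, 2000) and the \(\gamma\) values are assumptions for illustration only, and the basis index runs from 0 in the code rather than from 1.

```python
def bspline_basis(m, degree, knots, t):
    """Cox-de Boor recursion for the m-th B-spline basis function (0-indexed)."""
    if degree == 0:
        return 1.0 if knots[m] <= t < knots[m + 1] else 0.0
    left_den = knots[m + degree] - knots[m]
    right_den = knots[m + degree + 1] - knots[m + 1]
    left = 0.0
    right = 0.0
    if left_den > 0:  # repeated knots give 0/0, treated as 0 by convention
        left = (t - knots[m]) / left_den * bspline_basis(m, degree - 1, knots, t)
    if right_den > 0:
        right = ((knots[m + degree + 1] - t) / right_den
                 * bspline_basis(m + 1, degree - 1, knots, t))
    return left + right

def beta_t(gammas, degree, knots, t):
    """Eq. (3): beta_k(t) = sum_m gamma_km * B_m(t)."""
    return sum(g * bspline_basis(m, degree, knots, t) for m, g in enumerate(gammas))

# Clamped cubic basis on [1900, 2000) with one interior knot: 5 basis functions.
knots = [1900.0] * 4 + [1950.0] + [2000.0] * 4
gammas = [0.2, 0.5, 0.9, 0.6, 0.3]  # hypothetical coefficients

for year in (1900.0, 1925.0, 1975.0):
    print(year, beta_t(gammas, 3, knots, year))
```

The basis functions sum to 1 inside the knot range, so setting all \(\gamma_{km}\) equal recovers a constant effect. With the half-open intervals used here, the right endpoint (2000) itself evaluates to 0 and would need special handling.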

Co-Invasion Effects

The paper models how the presence of one species affects the invasion probability of another:

\[ x_{ijk}^{\text{co-invasion}}(t) = \sum_{i' \in S_{i}} \mathbb{1}(T_{i'j} < t) \] (4)

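Where \(S_{i}\) is the set of species related to species \(i\), \(T_{i'j}\) is the first-record time of species \(i'\) in region \(j\), and \(\mathbb{1}(T_{i'j} < t)\) equals 1 if species \(i'\) was recorded in region \(j\) before time \(t\) and 0 otherwise. The covariate therefore counts the related species already present in the region.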
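The count in Eq. (4) can be sketched as follows. The species names, region codes, and years are made-up placeholders, and storing first records in a dictionary keyed by (species, region) is an assumption about the data layout rather than the paper's implementation.

```python
def co_invasion_count(first_records, related_species, region, t):
    """Eq. (4): number of related species first recorded in `region` strictly before t."""
    return sum(
        1
        for species in related_species
        if (species, region) in first_records
        and first_records[(species, region)] < t
    )

# Hypothetical first-record years keyed by (species, region).
first_records = {
    ("plant_A", "NZ"): 1890,
    ("plant_B", "NZ"): 1950,
    ("plant_A", "AU"): 1920,
}
related = {"plant_A", "plant_B", "plant_C"}  # stands in for S_i

print(co_invasion_count(first_records, related, "NZ", 1900))
print(co_invasion_count(first_records, related, "NZ", 1950))
print(co_invasion_count(first_records, related, "NZ", 1951))
```

The inequality is strict, so a related species recorded in the same year as \(t\) does not count toward the covariate until later.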

Figure 2: Number of first records by decade, showing temporal patterns in invasion rates.

Inference Methods

Case-Control Sampling

Estimating the full model with all possible species-region-time combinations would be computationally infeasible. The paper uses case-control sampling to make the estimation tractable:

The log-likelihood for the case-control sample is:

\[ \ell(\theta) = \sum_{(i,j,t) \in \mathcal{D}} \log \lambda_{ij}(t) - \sum_{(i,j,t) \in \mathcal{D} \cup \mathcal{C}} \log(1 + \lambda_{ij}(t)) \] (5)

Where \(\mathcal{D}\) is the set of observed invasions (cases) and \(\mathcal{C}\) is the set of sampled non-invasions (controls).
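Eq. (5) has the form of a logistic-regression log-likelihood in which \(\lambda_{ij}(t)\) plays the role of the odds. The sketch below evaluates it directly and checks that equivalence; the function names and the rate values are illustrative assumptions.

```python
import math

def case_control_loglik(case_rates, control_rates):
    """Eq. (5): sum over cases of log(rate) minus sum over cases and controls of log(1 + rate)."""
    case_term = sum(math.log(r) for r in case_rates)
    all_term = sum(math.log1p(r) for r in list(case_rates) + list(control_rates))
    return case_term - all_term

def logistic_loglik(case_rates, control_rates):
    """Same quantity written with p = rate / (1 + rate) as a Bernoulli likelihood."""
    cases = sum(math.log(r / (1.0 + r)) for r in case_rates)
    controls = sum(math.log(1.0 - r / (1.0 + r)) for r in control_rates)
    return cases + controls

# Hypothetical hazard values for three observed invasions and three sampled controls.
case_rates = [2.0, 0.5, 1.2]
control_rates = [0.1, 0.4, 0.05]

print(case_control_loglik(case_rates, control_rates))
print(logistic_loglik(case_rates, control_rates))
```

The two functions return the same value up to rounding, which is why standard logistic-regression software can be used to fit a likelihood of this form.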

Penalized Maximum Likelihood

To estimate the smooth functions and prevent overfitting, the paper uses penalized maximum likelihood:

\[ \ell_p(\theta) = \ell(\theta) - \frac{1}{2} \sum_{l=1}^{L} \lambda_l \int [f_l''(z)]^2 dz \] (6)

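Where \(\lambda_l\) are smoothing parameters (not to be confused with the hazard \(\lambda_{ij}(t)\)) that control the trade-off between fit and smoothness. Larger values penalize curvature more heavily and pull \(f_l\) toward a straight line.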
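The penalty term in Eq. (6) can be illustrated with a crude numerical approximation: second differences on a grid stand in for \(f_l''\), and a Riemann sum stands in for the integral. This is a sketch for intuition only; the grid size, the test functions, and the names `roughness_penalty` and `penalized_loglik` are assumptions, and spline software normally computes the penalty exactly from the basis coefficients.

```python
def roughness_penalty(f, lower, upper, n=101):
    """Approximate the integral of f''(z)^2 over [lower, upper] with second differences."""
    h = (upper - lower) / (n - 1)
    values = [f(lower + i * h) for i in range(n)]
    total = 0.0
    for i in range(1, n - 1):  # interior grid points only
        second_diff = (values[i + 1] - 2.0 * values[i] + values[i - 1]) / h ** 2
        total += second_diff ** 2 * h
    return total

def penalized_loglik(loglik, smooths, smoothing_params, lower, upper):
    """Eq. (6): log-likelihood minus half the weighted roughness penalties."""
    penalty = sum(
        s * roughness_penalty(f, lower, upper)
        for f, s in zip(smooths, smoothing_params)
    )
    return loglik - 0.5 * penalty

straight = roughness_penalty(lambda z: 3.0 * z + 1.0, 0.0, 1.0)  # no curvature
curved = roughness_penalty(lambda z: z ** 2, 0.0, 1.0)           # f'' = 2, exact integral is 4
print(straight, curved)

# Hypothetical log-likelihood of -10 with one smooth term and smoothing parameter 2.
print(penalized_loglik(-10.0, [lambda z: z ** 2], [2.0], 0.0, 1.0))
```

A straight line incurs essentially no penalty, while the quadratic is penalized in proportion to its squared curvature, so raising the smoothing parameter favors flatter fits.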

Goodness-of-Fit Evaluation

The paper evaluates model fit using several metrics:

Figure 3: Top regions by number of first records, showing geographic patterns in invasions.

Data Analysis

Global Alien Species First Record Database

The analysis uses the Global Alien Species First Record Database, which contains records of when alien species were first detected in different regions around the world.

Key Database Statistics:

  • Total records: 61,751
  • Date range: Ancient times to 2020
  • Plants (Tracheophyta): 32,098 records (52%)
  • Insects (Insecta): 10,801 records (17.5%)
  • Fungi: 681 records (1.1%)

Temporal Patterns

The analysis reveals clear temporal patterns in invasion rates:

Figure 4: Heatmap showing invasion patterns across regions and decades.

Geographic Patterns

The analysis also reveals geographic patterns in invasions:

Co-Invasion Patterns

The analysis examines co-invasion patterns between plants and insects:

Figure 5: Top species by number of regions invaded, showing the most widespread invasive species.

Extension to Fungi Species

Fungi in the Database

The Global Alien Species First Record Database contains 681 fungi records (1.1% of all records), primarily from two phyla:

Figure 6: Correlation between fungi and plant invasions, showing potential ecological relationships.

Fungi Invasion Patterns

The analysis of fungi invasions reveals several interesting patterns:

Figure 7: Top fungi species by number of regions invaded.

Adapting the Mathematical Framework to Fungi

To extend the mixed additive REM framework to fungi invasions, we need to consider several factors:

1. Specific Covariates for Fungi

2. Random Effects Structure

3. Co-invasion Dynamics

4. Time-varying Effects

5. Data Challenges

Figure 8: Correlation between fungi and insect invasions.

Interactive Simulations

Hazard Function Simulation

This simulation demonstrates how the hazard function changes with different parameter values. The hazard function gives the instantaneous rate at which an invasion event occurs at a given time, conditional on the event not having occurred yet.
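A non-interactive stand-in for this kind of simulation is sketched below. It assumes the simplest possible case, a hazard that is constant in time with a single covariate, so that waiting times are exponentially distributed; the parameter values, the seed, and the function names are hypothetical, and the sketch does not reproduce the page's interactive widget.

```python
import math
import random

def invasion_rate(alpha, beta, x):
    """Constant-in-time hazard exp(alpha + beta * x) with a single covariate."""
    return math.exp(alpha + beta * x)

def simulate_waiting_times(rate, n, seed=0):
    """Draw n waiting times to first record under a constant hazard `rate`."""
    rng = random.Random(seed)
    return [rng.expovariate(rate) for _ in range(n)]

rate = invasion_rate(alpha=-3.0, beta=0.5, x=2.0)
times = simulate_waiting_times(rate, n=20000)
mean_wait = sum(times) / len(times)

# Under a constant hazard the expected waiting time is 1 / rate.
print(rate, mean_wait, 1.0 / rate)

# Changing a parameter shifts the hazard multiplicatively.
for beta in (0.0, 0.5, 1.0):
    print(beta, invasion_rate(-3.0, beta, 2.0))
```

Raising \(\beta\) or the covariate value increases the rate and shortens the expected waiting time. A time-varying hazard would need a different sampler, for example thinning.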