Demographic Inference Workshop

Ekaterina (Katya) Noskova

19 November 2024

Demographic History

Demographic History Visualization

drawn by demes [Gower et al. 2022]

Why Reconstruct Demographic History?

Understand population history

Conservation biology studies

Starting Tutorial

  • Download the repository with all the materials:
    • $ git clone https://github.com/noscode/GADMA_workshops.git
  • Go to:
    • $ cd GADMA_workshops/2024-11-Demographic_Inference_Worshop/tutorials
  • Activate your conda environment:
    • $ conda activate gadma_env
  • Optionally you can install and run jupyter notebook:
    • Install: $ pip install notebook
    • Run it: $ jupyter notebook
  • You can always view those notebooks online here

Demes Tutorial

Go to the directory: $ cd 1_demes_tutorial

Notebook: 1_demes_tutorial.ipynb

Online link

Data: Allele Frequency Spectrum

A derived allele is a new allele that arose by mutation, as opposed to the ancestral allele.

The allele frequency spectrum (AFS) of P populations is the joint distribution of derived allele frequencies across those P populations for a given set of loci (SNPs).
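
One way to write this definition down, assuming $n_p$ sequences are sampled from population $p$ and $d_p(s)$ denotes the derived allele count of SNP $s$ in population $p$ (both symbols are introduced here for illustration only):

$$ A[i_1, \dots, i_P] = \#\{\, s : d_1(s) = i_1, \dots, d_P(s) = i_P \,\}, \qquad 0 \le i_p \le n_p $$

Under these assumptions $A$ is a $P$-dimensional array of shape $(n_1 + 1) \times \dots \times (n_P + 1)$.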

Allele Frequency Spectrum Example

Reference: ATACGTC

               Population 1   Population 2
Individual 1   ATCCGAC        ACACTTC
Individual 2   ACACGTC        ACACGTT
Individual 3   GCACGTC

Position          1  2  3  4  5  6  7
Derived allele    G  C  C  -  T  A  T
Freq. in pop. 1   1  2  1  -  0  1  0
Freq. in pop. 2   0  2  0  -  1  0  1
Filling in the matrix one SNP at a time (positions 1, 2, 3, 5, 6 and 7) gives the final spectrum. Rows are indexed by the derived allele count in population 1 (3 down to 0), columns by the count in population 2 (0 to 2):

$$ A = \begin{matrix} \scriptsize 3 \\ \scriptsize 2 \\ \scriptsize 1 \\ \scriptsize 0 \\ \\ \end{matrix} \begin{matrix} \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 1 \\ 3 & 0 & 0 \\ 0 & 2 & 0 \end{pmatrix} \\ \begin{matrix} \scriptsize 0 & \scriptsize 1 & \scriptsize 2 \end{matrix} \\ \end{matrix} $$
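
The counting in this example can be reproduced with a few lines of Python. This is a minimal sketch, not the code used by easySFS or any other tool: the function name `joint_afs` is hypothetical, the reference base is assumed to be the ancestral allele, each individual is treated as a single haploid sequence, and monomorphic sites are skipped as on the slide.

```python
# Toy data from the example: one haploid sequence per individual.
reference = "ATACGTC"
populations = [
    ["ATCCGAC", "ACACGTC", "GCACGTC"],  # population 1 (3 sequences)
    ["ACACTTC", "ACACGTT"],             # population 2 (2 sequences)
]


def joint_afs(reference, populations):
    """Count SNPs by their derived allele count in each of two populations.

    Assumes the reference base is ancestral and that every site carries
    at most one derived allele.
    """
    n1, n2 = (len(pop) for pop in populations)
    # afs[i][j] = number of SNPs with i derived copies in pop 1 and j in pop 2
    afs = [[0] * (n2 + 1) for _ in range(n1 + 1)]
    for pos, ancestral in enumerate(reference):
        counts = [sum(seq[pos] != ancestral for seq in pop) for pop in populations]
        if sum(counts) == 0:
            continue  # monomorphic site (position 4 in the example)
        afs[counts[0]][counts[1]] += 1
    return afs


afs = joint_afs(reference, populations)
# Print rows from 3 down to 0 so the layout matches the matrix on the slide.
for count_pop1 in reversed(range(len(afs))):
    print(count_pop1, afs[count_pop1])
```

The list `afs` is indexed as `afs[count in pop 1][count in pop 2]`, so the slide's matrix is the same data with the row order reversed.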

EasySFS Tutorial

Go to the directory: $ cd ../2_easySFS_tutorial

Notebook: 2_easySFS_tutorial.ipynb

Online link

Demographic Inference

Demographic Inference Tools

Examples:

  • $\partial a \partial i$ [Gutenkunst et al. 2009]
  • moments [Jouganous et al. 2017]
  • momentsLD [Ragsdale and Gravel 2019, 2020]
  • momi2 [Kamm et al. 2020]
  • fastsimcoal2 [Excoffier et al. 2013, 2021]
  • diCal2 [Steinrücken et al. 2019]

Issues of Existing Tools

Issue 1: Model Specification

Specification for $\partial a \partial i$

Issue 2: Model Selection

Issue 3: Optimization

Most tools use local search optimization algorithms:

  • BFGS
  • Nelder–Mead method
  • Powell's method
  • EM, ECM

They require an initial estimate of the parameters and search only for a local optimum near it.
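
The dependence on the starting point can be illustrated with a toy example. The sketch below uses a simple one-dimensional compass search as a stand-in for the methods listed above (it is not BFGS, Nelder–Mead or Powell's method), and the objective `f` is an arbitrary function with two minima chosen for illustration, not a likelihood.

```python
def f(x):
    """Toy objective with a local minimum near 1.13 and a global one near -1.30."""
    return x**4 - 3 * x**2 + x


def local_search(f, x0, step=0.5, tol=1e-6):
    """Compass search: move to a better neighbour, else halve the step."""
    x = x0
    while step > tol:
        for candidate in (x - step, x + step):
            if f(candidate) < f(x):
                x = candidate
                break
        else:
            step /= 2  # no neighbour improves: refine the search
    return x


from_right = local_search(f, x0=2.0)
from_left = local_search(f, x0=-2.0)
print(round(from_right, 2), round(f(from_right), 2))
print(round(from_left, 2), round(f(from_left), 2))
```

Started from 2.0 the search settles in the local minimum near 1.13, while from -2.0 it reaches the better minimum near -1.30: the answer depends entirely on the initial estimate.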

Local vs Global Optimization

Demographic Inference for Four and Five Populations

Challenges:

  • Time-expensive likelihood evaluations
  • Many phylogenetic tree topologies
  • Great number of model parameters

GADMA — Global search Algorithm for Demographic Model Analysis

  • Several likelihood engines ($\partial a \partial i$, moments, momi2, momentsLD)
  • Common interface
  • New model specification
  • Effective global optimization

[Noskova et al. 2020]     [Noskova et al. 2023]

New Model Specification

GADMA offers a new type of model that is specified only by the number of epochs.

Available for up to three populations.

Flexible Dynamics

The new model in GADMA has flexible dynamics of population size change.

Population size dynamics can be:

  • Constant (sudden change)
  • Linear
  • Exponential
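
The three dynamics can be written as simple functions of time within an epoch. This is an illustrative sketch only: the function `size_at` and its parameter names are hypothetical and are not part of GADMA's interface. It assumes that in the constant (sudden change) case the size jumps to the new value at the start of the epoch.

```python
def size_at(dynamic, n_start, n_end, t, duration):
    """Population size t time units after an epoch begins (0 <= t <= duration).

    n_start is the size just before the epoch, n_end the size at its end.
    """
    frac = t / duration
    if dynamic == "constant":
        # Sudden change: the size jumps to n_end and stays there.
        return n_end
    if dynamic == "linear":
        return n_start + (n_end - n_start) * frac
    if dynamic == "exponential":
        return n_start * (n_end / n_start) ** frac
    raise ValueError(f"unknown dynamic: {dynamic!r}")


# Size halfway through an epoch in which the population grows from 1000 to 4000.
for dynamic in ("constant", "linear", "exponential"):
    print(dynamic, size_at(dynamic, 1000, 4000, t=50, duration=100))
```

All three variants agree at the end of the epoch; they differ only in how the size gets there.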

Additional controls

Global Optimization: Genetic Algorithm

Genetic algorithm:

  • Widely used global optimization method.
  • Uses ideas of evolution and natural selection.
  • Can discover solutions in a large search space.

GADMA implements a combination of the genetic algorithm followed by a local search method.
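
The idea of a genetic algorithm followed by a local search can be sketched on a toy problem. The code below is not GADMA's implementation: the objective `f`, the selection, crossover and mutation rules, and all parameter values are assumptions made for this illustration.

```python
import random


def f(x):
    """Toy objective with a local minimum near 1.13 and a global one near -1.30."""
    return x**4 - 3 * x**2 + x


def local_search(f, x, step=0.1, tol=1e-6):
    """Compass search used to polish the best solution found by the GA."""
    while step > tol:
        for candidate in (x - step, x + step):
            if f(candidate) < f(x):
                x = candidate
                break
        else:
            step /= 2
    return x


def genetic_then_local(f, low=-3.0, high=3.0, pop_size=50, generations=30, seed=0):
    rng = random.Random(seed)
    population = [rng.uniform(low, high) for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=f)
        parents = population[: pop_size // 5]  # selection: keep the fittest 20%
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            child = (a + b) / 2              # crossover: average two parents
            child += rng.gauss(0.0, 0.3)     # mutation: small random shift
            children.append(min(high, max(low, child)))
        population = parents + children
    best = min(population, key=f)
    return local_search(f, best)             # local refinement of the GA result


best_x = genetic_then_local(f)
print(round(best_x, 2), round(f(best_x), 2))
```

Because the initial population covers the whole search interval, the procedure does not need a good starting guess; the final local search only refines the best individual.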

The hyperparameters of the genetic algorithm were tuned with SMAC.

Global Optimization: Bayesian Optimization

Machine learning-based technique for optimizing expensive functions.

Enables demographic inference for 4 and 5 populations in GADMA.

[Noskova and Borovitskiy, 2023]

GADMA Tutorial

Go to the directory: $ cd ../3_GADMA_tutorial

Notebook: 3_GADMA_tutorial.ipynb

Online link

Model Selection Tutorial

Go to the directory: $ cd ../4_Model_Selection_tutorial

Notebook: 4_Model_Selection_tutorial.ipynb

Online link

Thank you!

Slides:
ekaterina.e.noskova@gmail.com
enoskova.me