Hidden Diffusion: Joint Inference of Selection and Demography from Time-series Data

Ekaterina Noskova

Madleina Caduff, Andreas Fueglistaler, Anna Parker, Youssef Tawfik,
Christoph Leuenberger, Daniel Wegmann

11 July 2024

Genome-wide Selection Scans

Predicting Selection from Time-series Data

Challenges

  • Demography influences selection inference
  • Computational challenge
  • Neighbouring loci are linked and show similar absolute values of selection

SweepLinkHD

Hidden Markov Model (HMM)

$\text{Hidden States} = \Big\{$ $,\ $ $\Big\}$
$\text{Observations} = \Big\{$ $,\ $ $\Big\}$

An MCMC method can be used to learn the HMM parameters $\bar{\theta}$
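As a minimal sketch of the quantity such a sampler would evaluate at each step, the forward algorithm below computes the log-likelihood of an observation sequence under a discrete HMM. The two-state model, its probabilities, and the function name are illustrative assumptions, not the actual states or parameters of SweepLinkHD.

```python
import math

def forward_log_likelihood(init, trans, emit, obs):
    """Log-likelihood of `obs` under a discrete HMM (scaled forward algorithm).

    init[i]     : probability of starting in hidden state i
    trans[i][j] : probability of moving from state i to state j
    emit[i][o]  : probability of emitting symbol o from state i
    """
    n = len(init)
    # Forward variable for the first observation.
    alpha = [init[i] * emit[i][obs[0]] for i in range(n)]
    log_lik = 0.0
    for o in obs[1:]:
        # Rescale to avoid numerical underflow on long sequences.
        scale = sum(alpha)
        log_lik += math.log(scale)
        alpha = [a / scale for a in alpha]
        # Propagate one step through the chain, then weight by the emission.
        alpha = [
            sum(alpha[i] * trans[i][j] for i in range(n)) * emit[j][o]
            for j in range(n)
        ]
    return log_lik + math.log(sum(alpha))

# Hypothetical two-state example (state 0 = "neutral", state 1 = "selected").
init = [0.9, 0.1]
trans = [[0.95, 0.05], [0.10, 0.90]]
emit = [[0.8, 0.2], [0.3, 0.7]]
obs = [0, 0, 1, 1, 1]
log_lik = forward_log_likelihood(init, trans, emit, obs)
```

An MCMC sampler would typically propose new parameter values, recompute this likelihood, and accept or reject the proposal based on the likelihood ratio and the prior.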

SweepLinkHD: Two-layer HMM

SweepLinkHD Layer 1: Genome-wise

Ideas for Transition Matrix

  • Most positions in the genome are neutral
  • Neighbouring loci should have similar absolute selection
  • Infer absolute selection $|s|$ and sign $\sigma$

SweepLinkHD Layer 2: Time-wise

Hidden Diffusion Model

The transition density $p(\Delta t, x, y)$ is defined by the demographic history
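To illustrate what a transition density of this kind represents, the sketch below uses a discrete Wright-Fisher model as a stand-in for the diffusion. It assumes a single population of constant size (so the demographic history reduces to one hypothetical parameter, `n_copies`) and a selected allele with relative fitness $1 + s$. The function names and parameter values are illustrative and are not taken from SweepLinkHD.

```python
import math

def wright_fisher_matrix(n_copies, s):
    """One-generation transition matrix between allele counts 0..n_copies.

    Entry [i][j] is P(j copies next generation | i copies now) under
    binomial sampling, where the selected allele has relative fitness 1 + s.
    """
    matrix = []
    for i in range(n_copies + 1):
        x = i / n_copies
        # Allele frequency after selection, before drift.
        x_sel = x * (1 + s) / (1 + s * x)
        row = [
            math.comb(n_copies, j) * x_sel**j * (1 - x_sel) ** (n_copies - j)
            for j in range(n_copies + 1)
        ]
        matrix.append(row)
    return matrix

def mat_mul(a, b):
    """Plain matrix product of two lists of lists."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]

def transition_over(matrix, generations):
    """Transition probabilities over `generations` steps."""
    n = len(matrix)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(generations):
        result = mat_mul(result, matrix)
    return result

# Hypothetical values: 20 gene copies, 10 generations between samples.
n_copies, start = 20, 10
p_neutral = transition_over(wright_fisher_matrix(n_copies, 0.0), 10)
p_selected = transition_over(wright_fisher_matrix(n_copies, 0.1), 10)

# Expected allele count after 10 generations, starting from `start` copies.
mean_neutral = sum(j * p for j, p in enumerate(p_neutral[start]))
mean_selected = sum(j * p for j, p in enumerate(p_selected[start]))
```

Row `start` of each matrix plays the role of $p(\Delta t, x, \cdot)$ for $x$ = `start / n_copies`. Under neutrality the expected count stays at its starting value, while positive selection shifts it upwards.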

SweepLinkHD Overview

Two-layer Hidden Markov Model that:

  • Handles multiple populations and a wide range of demographic models
  • Infers demography and selection jointly
  • Infers linked selection

Results

Performance Illustration on Simulated Data

Performance Benchmarking

Real Data

Real Human Data

Real Data: Results

Conclusions

SweepLinkHD:

  • addresses classical challenges for selection inference
  • captures linkage between loci
  • is faster and more accurate than existing tools
  • enables genome-wide scans for selection

Thank you!

Acknowledgements

  • Joachim Burger
  • Jens Bloecher
  • Laura Winkelbach
  • Benedickt Kirsch-Gerweck

Workshop on
Demographic Inference

18-19 November 2024

Hybrid (UniFR, Switzerland)

ekaterina.e.noskova@gmail.com                       enoskova.me

Proposed Transition Matrix

$$Q(d_l) = \exp(d_l \cdot \Lambda),$$

$$\text{where } \Lambda = \kappa \begin{pmatrix} -1 & 1 & 0 & 0 & \ldots & 0 & 0 & 0\\ \mu & -1-\mu & 1 & 0 & \ldots & 0 & 0 & 0 \\ 0 & \mu & -1-\mu & 1 & \ldots & 0 & 0 & 0 \\ \vdots & \vdots & \vdots & \vdots & & \vdots & \vdots & \vdots\\ 0 & 0 & 0 & 0 & \ldots & \mu & -1-\mu & 1\\ 0 & 0 & 0 & 0 & \ldots & 0 & \nu \mu & -\nu \mu \end{pmatrix}$$

The first row corresponds to $|s| = |s|_{\max}$, the second-to-last row to $|s| = |s|_0$, and the last row to the attractor state ($|s| = 0$).

Matrix $Q$ is parameterized by:

  • $\kappa$, which reflects the recombination rate
  • $\nu$, which reflects the fraction of neutral positions
  • $\mu$, which reflects the strength of selection
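The construction of $Q(d_l) = \exp(d_l \cdot \Lambda)$ can be sketched in a few lines. The code below builds $\Lambda$ with the row structure shown above and exponentiates it with a simple scaling-and-squaring Taylor series. It assumes that $d_l$ is the distance between neighbouring loci (the slides do not define it); the number of states and all parameter values are hypothetical, and the exponentiation routine is a generic one chosen for self-containment, not the method used by SweepLinkHD.

```python
import math

def build_generator(n_states, kappa, mu, nu):
    """Generator matrix Lambda; row 0 is |s|_max, the last row is the attractor."""
    last = n_states - 1
    lam = [[0.0] * n_states for _ in range(n_states)]
    for i in range(n_states):
        if i < last:
            lam[i][i + 1] = 1.0  # step towards the attractor
        if i > 0:
            # Step away from the attractor; the attractor itself is left at rate nu * mu.
            lam[i][i - 1] = nu * mu if i == last else mu
        lam[i][i] = -sum(lam[i])  # each row of a generator sums to zero
    return [[kappa * v for v in row] for row in lam]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def mat_mul(a, b):
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]

def mat_exp(m, terms=20):
    """exp(m) by scaling and squaring with a truncated Taylor series."""
    n = len(m)
    norm = max(sum(abs(v) for v in row) for row in m)
    squarings = max(0, math.ceil(math.log2(norm)) + 1) if norm > 0 else 0
    scaled = [[v / 2**squarings for v in row] for row in m]
    result, term = identity(n), identity(n)
    for k in range(1, terms + 1):
        term = [[v / k for v in row] for row in mat_mul(term, scaled)]
        result = [[r + t for r, t in zip(rr, tr)] for rr, tr in zip(result, term)]
    for _ in range(squarings):
        result = mat_mul(result, result)
    return result

def transition_matrix(lam, distance):
    """Q(d) = exp(d * Lambda)."""
    return mat_exp([[distance * v for v in row] for row in lam])

# Hypothetical parameter values for illustration only.
lam = build_generator(5, kappa=0.5, mu=0.2, nu=0.1)
q = transition_matrix(lam, 2.0)
```

Because every row of $\Lambda$ sums to zero, every row of $Q(d_l)$ sums to one, and $Q(0)$ is the identity: loci at zero distance share the same state, while distant loci become increasingly independent.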