Hidden Diffusion: Joint Inference of Selection and Demography from Time-series Data

Ekaterina Noskova

Madleina Caduff, Andreas Fueglistaler, Anna Parker, Youssef Tawfik,
Christoph Leuenberger, Daniel Wegmann

11 July 2024

Genome-wide Selection Scans

Predicting Selection from Time-series Data

Challenges

Demography influences selection inference
Computational challenge
Neighbouring loci are linked and show similar absolute values of selection

SweepLinkHD

Hidden Markov Model (HMM)

$\text{Hidden States} = \Big\{\,\text{[image]},\ \text{[image]}\,\Big\}$

$\text{Observations} = \Big\{\,\text{[image]},\ \text{[image]}\,\Big\}$

An MCMC method can be used to learn the HMM parameters $\bar{\theta}$
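An MCMC sampler over $\bar{\theta}$ has to evaluate the likelihood of the observations for each proposed parameter set, and for an HMM the forward algorithm is the standard way to do so. The sketch below is a generic discrete-state forward pass, not the SweepLinkHD implementation; the function name and the `init`, `trans`, `emit` arguments are hypothetical.

```python
import math


def forward_log_likelihood(init, trans, emit, observations):
    """Log-likelihood of an observation sequence under a discrete HMM.

    init[i]     : probability of starting in hidden state i
    trans[i][j] : probability of moving from hidden state i to j
    emit[i][o]  : probability of emitting symbol o from hidden state i
    (all three are illustrative inputs, assumed for this sketch)
    """
    n = len(init)
    # Unnormalised forward variables for the first observation.
    alpha = [init[i] * emit[i][observations[0]] for i in range(n)]
    scale = sum(alpha)
    log_lik = math.log(scale)
    # Rescale at every step to avoid numerical underflow.
    alpha = [a / scale for a in alpha]
    for obs in observations[1:]:
        alpha = [
            emit[j][obs] * sum(alpha[i] * trans[i][j] for i in range(n))
            for j in range(n)
        ]
        scale = sum(alpha)
        log_lik += math.log(scale)
        alpha = [a / scale for a in alpha]
    return log_lik
```

A Metropolis-Hastings step would compare this value for the current and the proposed $\bar{\theta}$ to decide whether to accept the proposal.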

SweepLinkHD: Two-layer HMM

SweepLinkHD Layer 1: Genome-wise

Ideas for Transition Matrix

Most positions in the genome are neutral
Absolute selection of neighbouring loci should be close
Infer absolute selection $|s|$ and sign $\sigma$

SweepLinkHD Layer 2: Time-wise

Hidden Diffusion Model

Transition density $p(\Delta t, x, y)$ is defined by demographic history
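To illustrate how a transition density over allele frequencies depends on population size and selection, here is a minimal sketch that uses the discrete Wright-Fisher model as a stand-in for the diffusion. This is an assumption made for illustration: the slides do not specify how $p(\Delta t, x, y)$ is computed, and `n_copies`, `s` and both function names are hypothetical.

```python
import math


def wf_transition_matrix(n_copies, s):
    """One-generation Wright-Fisher transition matrix with genic selection.

    Entry [i][j] is the probability of going from i to j copies of the
    selected allele among n_copies gene copies. Requires s > -1.
    """
    matrix = []
    for i in range(n_copies + 1):
        x = i / n_copies
        # Expected frequency after selection (relative fitness 1 + s vs 1).
        p = x * (1 + s) / (1 + s * x)
        # Binomial sampling of the next generation models genetic drift.
        row = [
            math.comb(n_copies, j) * p**j * (1 - p) ** (n_copies - j)
            for j in range(n_copies + 1)
        ]
        matrix.append(row)
    return matrix


def propagate(dist, matrix, generations):
    """Push a distribution over allele counts forward in time."""
    size = len(dist)
    for _ in range(generations):
        dist = [
            sum(dist[i] * matrix[i][j] for i in range(size))
            for j in range(size)
        ]
    return dist
```

In this toy version the demographic history enters only through `n_copies`: a change in population size between sampling times would mean switching to a matrix built with a different `n_copies`.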

SweepLinkHD Overview

Two-layer Hidden Markov Model that:

Handles multiple populations and a wide range of demographic models
Infers demography and selection jointly
Infers linked selection

Results

Performance Illustration on Simulated Data

Performance Benchmarking

Real Data

Real Human Data

Real Data: Results

Conclusions

SweepLinkHD:

addresses classical challenges for selection inference
captures linkage between loci
is faster and more accurate than existing tools
enables genome-wide scans for selection

Thank you!

Acknowledgements

Joachim Burger
Jens Bloecher
Laura Winkelbach
Benedickt Kirsch-Gerweck

Workshop on Demographic Inference

18-19 November 2024

Hybrid (UniFR, Switzerland)

ekaterina.e.noskova@gmail.com enoskova.me

Proposed Transition Matrix

$$Q(d_l) = \exp(d_l \cdot \Lambda),$$

$$\footnotesize \text{where } \Lambda = \kappa \begin{pmatrix} -1 & 1 & 0 & 0 & \ldots & 0 & 0 & 0\\ \mu & -1-\mu & 1 & 0 & \ldots & 0 & 0 & 0 \\ 0 & \mu & -1-\mu & 1 & \ldots & 0 & 0 & 0 \\ \vdots & \vdots & \vdots & \vdots & & \vdots & \vdots & \vdots\\ 0 & 0 & 0 & 0 & \ldots & \mu & -1-\mu & 1\\ 0 & 0 & 0 & 0 & \ldots & 0 & \nu \mu & -\nu \mu\\ \end{pmatrix} \begin{matrix} \leftarrow |s|=|s|_{\max} \phantom{-} \phantom{-} \phantom{-} \phantom{.}\\ \\ \\ \phantom{\ldots}\\ \\ \leftarrow |s|=|s|_0 \phantom{-} \phantom{-} \phantom{-} \phantom{-} \phantom{.}\\ \leftarrow \text{attractor (} |s|=0\text{)}\\ \end{matrix} $$

Matrix $Q$ is parameterized by:

$\kappa$ reflects the recombination rate
$\nu$ reflects the fraction of neutral positions
$\mu$ reflects the strength of selection
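The construction of $Q(d_l)$ can be sketched in a few lines. The code below builds $\Lambda$ row by row as written above (first row $|s|_{\max}$, last row the neutral attractor) and exponentiates it with a plain scaling-and-squaring Taylor series. It is an illustrative sketch only: the function names, the `n_states` argument and the reading of `distance` as the distance $d_l$ between neighbouring loci are assumptions, and a production implementation would more likely rely on a library matrix exponential.

```python
import math


def rate_matrix(n_states, kappa, mu, nu):
    """Build Lambda; row 0 is |s|_max, the last row is the attractor."""
    lam = [[0.0] * n_states for _ in range(n_states)]
    for i in range(n_states):
        if i > 0:
            # Step towards stronger selection (rate nu * mu out of the attractor).
            lam[i][i - 1] = nu * mu if i == n_states - 1 else mu
        if i < n_states - 1:
            # Step towards the attractor at rate 1.
            lam[i][i + 1] = 1.0
        # Diagonal makes every row sum to zero.
        lam[i][i] = -sum(lam[i])
    return [[kappa * v for v in row] for row in lam]


def mat_mul(a, b):
    n = len(a)
    return [
        [sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
        for i in range(n)
    ]


def mat_exp(a, terms=20):
    """exp(a) by scaling and squaring with a truncated Taylor series."""
    n = len(a)
    norm = max(sum(abs(v) for v in row) for row in a)
    squarings = max(0, math.ceil(math.log2(norm))) + 1 if norm > 0 else 0
    scaled = [[v / 2**squarings for v in row] for row in a]
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms + 1):
        # term holds scaled^k / k!
        term = [[v / k for v in row] for row in mat_mul(term, scaled)]
        result = [
            [result[i][j] + term[i][j] for j in range(n)] for i in range(n)
        ]
    for _ in range(squarings):
        result = mat_mul(result, result)
    return result


def transition_matrix(distance, kappa, mu, nu, n_states):
    """Q(d) = exp(d * Lambda) for two loci separated by `distance`."""
    lam = rate_matrix(n_states, kappa, mu, nu)
    return mat_exp([[distance * v for v in row] for row in lam])
```

Because every row of $\Lambda$ sums to zero and its off-diagonal entries are non-negative, each row of the resulting $Q$ is a probability distribution, and $Q$ approaches the identity as the distance goes to zero, so closely spaced loci tend to share the same $|s|$ state.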
