Postdoctoral Researcher

Manu Saraswat

Mechanistic AI for Biological Discovery

I build AI systems where the model's internal representations are biological entities, not abstract features—enabling predictions that reveal mechanisms, get validated in the lab, and translate to therapeutic strategies. Currently at Memorial Sloan Kettering Cancer Center (MSKCC) and German Cancer Research Center (DKFZ), working with Dana Pe'er and Oliver Stegle.

Research Vision

Representation learning where the representations are the biology.

Biology is at a turning point. Multi-scale, multi-modal measurements—genomes, single cells, spatial assays, imaging—are converging rapidly, but turning this data into insight requires AI that can operate across scales.

My approach is mechanistically grounded representation learning. I encode biological structure directly into architecture, so the model's weights and latent variables correspond to real entities—transcription factors, enhancers, cell states, regulatory circuits. Interpretability isn't retrofitted after training; it is the model's internal language from the start.
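The structure-as-interpretability idea can be sketched in a few lines. This is a hypothetical toy, not any published model: all TF names, target genes, activities, and weights below are made up for illustration. The point is that when each latent variable is a named transcription factor and weights are restricted to prior TF→target edges, every term of the reconstruction is a readable regulatory statement.

```python
# Toy sketch of a mechanistically grounded linear decoder.
# Hypothetical prior: which TF may regulate which genes, with
# illustrative regulatory weights (all values invented).
grn = {
    "SOX2":   {"NANOG": 0.6, "KLF4": 0.4},
    "POU5F1": {"NANOG": 0.5, "KLF4": 0.3},
    "GATA1":  {"HBB": 0.8, "TAL1": 0.7},
}

def decode(tf_activity: dict[str, float]) -> dict[str, float]:
    """Reconstruct expression as a sum of TF contributions.

    Each latent dimension is a real TF, so every term in the sum is an
    interpretable (TF, gene, weight) statement about regulation.
    """
    expression: dict[str, float] = {}
    for tf, activity in tf_activity.items():
        for gene, weight in grn.get(tf, {}).items():
            expression[gene] = expression.get(gene, 0.0) + activity * weight
    return expression

# A pluripotency-high cell: GATA1 is off, so its targets stay silent.
cell = {"SOX2": 1.0, "POU5F1": 1.0, "GATA1": 0.0}
print(decode(cell))
```

Because the decoder's nonzero weights are the candidate regulatory network, "interpreting" the model and "reading off the biology" are the same operation; no post-hoc attribution step is needed.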

Across cancer, immunology, neurodevelopment, and population genetics, I've learned that the most impactful models aren't necessarily the most complex—they're the ones scientists trust enough to act on. My goal is to build AI that accelerates the full arc from measurement to mechanism to therapeutic impact—whether through specialized architectures or by shaping how foundation models represent biological systems.

This philosophy shapes three active research directions:

01

Gene Regulatory Networks

Inferring how transcription factors, enhancers, and signaling control cell identity—from single cells to spatial tissue context.

Single-cell · Spatial · Cancer · Neurodevelopment
02

Genetic Variation & Disease

Connecting DNA variants to gene expression and disease risk across individuals and ancestries.

Sequence-to-function models · Population Genetics · Precision Medicine · Biobanks
03

Perturbation & Causal Inference

Modeling how genetic and chemical perturbations alter cell state and validating regulatory predictions through experimental intervention—an emerging direction building on my GRN work.

Perturb-seq · Causal Modeling · Experimental Validation
News
Dec 2025

Invited seminar, "Interpretable deep learning for single-cell genomics," at Genentech in South San Francisco

Nov 2025

Attending scverse Conference at Stanford

Sep 2025

Started visiting Dana Pe'er's lab at Memorial Sloan Kettering in NYC as a postdoc

Jul 2025

Successfully defended my PhD at DKFZ/EMBL Heidelberg 🎓

Jul 2025

Participated in the Leena Peltonen School of Human Genetics in Cambridge, UK

Selected Projects

Tools & Frameworks

Open-source methods developed for the research community, with a focus on biological interpretability and clinical translation.

scDoRI

Under review at Nature

Single-cell Deep Omics Regulatory Inference

A mechanistically constrained autoencoder that reconstructs enhancer-driven gene regulatory networks from single-cell ATAC–RNA data. Applied to >1M glioblastoma cells, scDoRI identified repressive transcription factor circuits regulating tumor plasticity—validated in vivo by slowing tumor growth and increasing survival in mouse models. This computation → wet-lab validation → therapeutic hypothesis pipeline demonstrates AI that doesn't just predict, but explains and enables intervention.

Prediction → wet-lab validation → in vivo efficacy
Autoencoders · Multi-omics · GRN Inference · Cancer Plasticity

DeepGenoXcan

In development

Personalized Gene Expression Prediction

A modular deep learning system predicting cell-type-specific gene expression from personalized genomes, combining genomic sequence with chromatin accessibility. Unlike Enformer/Borzoi (which predict from reference genomes), DeepGenoXcan is designed for individual genetic backgrounds—critical for precision medicine. Outperforms both linear TWAS models and deep neural network models on donor-level generalization.

Outperforms linear models and fine-tuned Enformer/Borzoi on personalized expression prediction
Population Genetics · TWAS · Precision Medicine

ExplaiNN

Published

Explainable Neural Networks for Genomics

One of the first fully transparent deep learning architectures for genomics. Demonstrates that high predictive performance and complete interpretability can coexist, producing motif-level explanations of transcription factor binding and chromatin accessibility predictions.

Full interpretability without sacrificing performance
CNN · Interpretability · TF Binding · Motif Discovery

Biologically Relevant Transfer Learning

Published

Domain-Informed Training Strategies

Systematic evaluation of how biological priors affect deep learning training. Tested TF family relationships, DNA-binding domain similarity, and cofactor information as pre-training signals. Established benchmarks showing biologically informed training outperforms naive approaches, especially in low-data regimes.

Evaluation methodology: domain knowledge improves training strategies
Transfer Learning · TF Binding · Training Strategies
Publications

Selected Papers

* denotes equal contribution / co-first authorship

2025

Decoding Plasticity Regulators and Transition Trajectories in Glioblastoma with Single-cell Multiomics Co-first

Manu Saraswat*, Laura Rueda-Gensini*, Elisa Heinzelmann*, Tannia Gracia*, Fani Memi* et al.

bioRxiv (under review at Nature)

📌 Interpretable architecture → wet-lab validation → in vivo therapeutic efficacy

2023

ExplaiNN: interpretable and transparent neural networks for genomics Co-first

Gherman Novakovsky*, Oriol Fornes*, Manu Saraswat*, Sara Mostafavi, WW Wasserman

Genome Biology

2021

Biologically relevant transfer learning improves transcription factor binding prediction Co-first

Gherman Novakovsky*, Manu Saraswat*, Oriol Fornes*, Sara Mostafavi, WW Wasserman

Genome Biology

2021

Discovery of widespread transcription initiation at microsatellites predictable by sequence-based deep neural network Co-first

Mathys Grapotte*, Manu Saraswat*, Chloé Bessière*, FANTOM Consortium, Laurent Brehelin, Charles-Henri Lecellier

Nature Communications

Presentations

Talks & Seminars

Upcoming

Jan 2026
Talk

DKFZ Epigenetics Meeting

Heidelberg, Germany

Invited Talks

Dec 2025
Invited Talk

Genentech Research Seminar

Host: Avantika Lal · South San Francisco, CA

Dec 2025
Invited Talk

New York Genome Center

Host: Neville Sanjana · New York, NY

Oct 2025
Invited Talk

Dana-Farber Cancer Institute / Harvard

Host: Sasha Gusev · Boston, MA

Oct 2025
Invited Talk

MILA Multi-omics Reading Group

Valence Labs · Virtual

Oct 2025
Invited Talk

McGill University (Single Cell Open Club)

Virtual

2024
Invited Talk

Stanford University (Anshul Kundaje Lab)

Virtual

2024
Invited Talk

Sanger Institute (Roser Vento-Tormo Lab)

Virtual

2024
Invited Talk

EMBL (Judith Zaugg Lab)

Heidelberg, Germany

Conference Talks

Sep 2025
Talk

Dissecting Cellular Plasticity in Glioblastoma via Deep Learning of Single-Cell Gene Regulatory Networks

Keystone Symposia: AI in Molecular Biology · Santa Fe, NM

Jul 2025
Talk

Sequence-to-Expression Mapping in Personalized Genomes using Interpretable Deep Learning

Leena Peltonen School of Human Genetics · Cambridge, UK

2022
Talk

scDoRI: Gene regulatory inference from single-cell multi-omics data using interpretable deep learning

RECOMB/ISCB Regulatory & Systems Genomics · Las Vegas, USA

2020
Talk

Convolutional Additive Models: a fully interpretable approach to deep learning in genomics

Machine Learning in Computational Biology (MLCB) · Virtual

Posters

Nov 2025
Poster

Dissecting Cellular Plasticity in Glioblastoma via Deep Learning of Single-Cell Gene Regulatory Networks

scverse Conference · Stanford, CA

Oct 2025
Poster

Sequence-to-Expression Mapping in Personalized Genomes using Interpretable Deep Learning

ASHG Annual Meeting · Boston, USA

Schools & Workshops

Jul 2025
Participant

Leena Peltonen School of Human Genetics

Wellcome Genome Campus · Cambridge, UK

Background

Path & Experience

Six institutions, five countries—building interpretable AI for biology.

2025–present

Postdoctoral Researcher

MSKCC & DKFZ · New York / Heidelberg

Working with Dana Pe'er and Oliver Stegle on interpretable deep learning for spatial and perturbation-based single-cell data, modeling tumor–immune interactions.

2020–2025

PhD in Computational Biology

DKFZ / EMBL · Heidelberg, Germany

Thesis: "Decoding gene regulation from single cells to populations with interpretable deep learning." Supervised by Oliver Stegle and Moritz Mall.

2023

ML Research Intern

Genentech · South San Francisco, CA

Contributed to foundation models integrating DNA sequence, RNA expression, and chromatin accessibility from single-cell multi-omic atlases. Learned how AI tools are developed and deployed in pharmaceutical R&D.

2019–2020

Research Software Developer

University of British Columbia · Vancouver, Canada

Built interpretable deep learning models for TF binding and chromatin accessibility prediction. Two co-first author publications in Genome Biology.

2018–2019

Master's Thesis Research

IGMM, CNRS · Montpellier, France

First extended research experience. Developed deep learning models for transcription initiation at microsatellites. Resulted in co-first author Nature Communications paper.

2018

Summer Analyst

Goldman Sachs · Bangalore, India

Learned to communicate technical concepts to diverse stakeholders in a fast-paced environment.

2014–2019

M.Sc. Mathematics & B.E. Computer Science

BITS Pilani · India

Dual degree.