Postdoctoral Researcher

Manu Saraswat

Mechanistic AI for Biological Discovery

I build representation learning models where the latent spaces encode biology—transcription factors, cell states, regulatory circuits—not abstract features. My goal: AI that doesn't just predict, but generates hypotheses, guides experiments, and translates to real therapeutic impact. Currently at Memorial Sloan Kettering and DKFZ, working with Dana Pe'er and Oliver Stegle.

News
Dec 2025

Invited seminar on interpretable deep learning for single-cell genomics at Genentech in South San Francisco

Nov 2025

Attending scverse Conference at Stanford

Sep 2025

Started visiting Dana Pe'er's lab at Memorial Sloan Kettering in NYC as a postdoc

Jul 2025

Successfully defended my PhD at DKFZ/EMBL Heidelberg 🎓

Jul 2025

Participated in the Leena Peltonen School of Human Genetics in Cambridge, UK

Research Vision

Representation learning where the representations are the biology.

Biology is at a turning point. Multi-scale, multi-modal measurements—genomes, single cells, spatial assays, imaging—are converging rapidly, but turning this data into insight requires AI that can operate across scales.

My approach is mechanistically grounded representation learning. I encode biological structure directly into architecture, so the model's weights and latent variables correspond to real entities—transcription factors, enhancers, cell states, regulatory circuits. Interpretability isn't retrofitted after training; it is the model's internal language from the start.

Across cancer, immunology, neurodevelopment, and population genetics, I've learned that the most impactful models aren't necessarily the most complex—they're the ones scientists trust enough to act on. My goal is to build AI that accelerates the full arc from measurement to mechanism to therapeutic impact—whether through specialized architectures or by shaping how foundation models represent biological systems.

01

Single-Cell Regulatory Networks

Inferring enhancer-driven GRNs from multi-omic data to uncover plasticity regulators. Applied across glioblastoma, neurodevelopment, and immune contexts.

scDoRI · VAEs · Cancer · Neuro
02

Population-Scale Genetics

Connecting DNA variants to gene expression and disease risk across individuals. Designed for diverse ancestries and clinical translation.

DeepGenoXcan · TWAS · Biobanks
03

Spatial & Perturbation Biology

Modeling tumor–immune interactions and cellular communication using spatial transcriptomics and perturbation screens.

Spatial TX · Perturb-seq · TME
Selected Projects

Tools & Frameworks

Open-source methods developed for the research community, with a focus on biological interpretability and clinical translation.

scDoRI

Under review at Nature

Single-cell Deep Omics Regulatory Inference

A mechanistically constrained VAE that reconstructs enhancer-driven gene regulatory networks from single-cell ATAC–RNA data. Applied to >1M glioblastoma cells, scDoRI identified repressive transcription factor circuits regulating tumor plasticity; in vivo validation in mouse models showed slower tumor growth and increased survival. This computation → wet-lab validation → therapeutic hypothesis pipeline demonstrates AI that doesn't just predict, but explains and enables intervention.

Prediction → wet-lab validation → in vivo efficacy
VAE · Multi-omics · GRN Inference · Cancer Plasticity

DeepGenoXcan

In development

Personalized Gene Expression Prediction

A modular deep learning system that predicts cell-type-specific gene expression from personalized genomes by combining genomic sequence with chromatin accessibility. Unlike Enformer/Borzoi (which predict from reference genomes), DeepGenoXcan is designed for individual genetic backgrounds, which is critical for precision medicine. It outperforms both linear TWAS models and deep neural network models (fine-tuned Enformer/Borzoi) on donor-level generalization.

Outperforms linear models and fine-tuned Enformer/Borzoi on personalized expression prediction
Population Genetics · TWAS · Precision Medicine

Biologically-Informed Transfer Learning

Published

Using Biology to Design Training Strategies

Demonstrated that training strategies informed by biological knowledge, specifically transcription factor (TF) family relationships and DNA-binding domain similarity, systematically outperform naive transfer learning. Biology isn't just the application domain; it's a source of insight for how to train models better.

Biology informs ML, not just the reverse
Transfer Learning · TF Binding · Training Strategies

ExplaiNN

Published

Explainable Neural Networks for Genomics

One of the first fully transparent deep learning architectures for genomics. Demonstrates that high predictive performance and complete interpretability can coexist, producing motif-level explanations of transcription factor binding and chromatin accessibility predictions.

Published in Genome Biology (2023)
CNN · Interpretability · TF Binding · Motif Discovery
Publications

Selected Papers

* denotes equal contribution / co-first authorship

2025

Decoding Plasticity Regulators and Transition Trajectories in Glioblastoma with Single-cell Multiomics
Co-first

Manu Saraswat*, Laura Rueda-Gensini*, Elisa Heinzelmann*, Tannia Gracia*, Fani Memi* et al.

bioRxiv (under review at Nature)

2023

ExplaiNN: interpretable and transparent neural networks for genomics
Co-first

Gherman Novakovsky*, Oriol Fornes*, Manu Saraswat*, Sara Mostafavi, WW Wasserman

Genome Biology

2021

Biologically relevant transfer learning improves transcription factor binding prediction
Co-first

Gherman Novakovsky*, Manu Saraswat*, Oriol Fornes*, Sara Mostafavi, WW Wasserman

Genome Biology

📌 Uses biological knowledge to design training strategies — demonstrating how domain expertise can improve ML model training

2021

Discovery of widespread transcription initiation at microsatellites predictable by sequence-based deep neural network
Co-first

Mathys Grapotte*, Manu Saraswat*, Chloé Bessière*, FANTOM Consortium, Laurent Brehelin, Charles-Henri Lecellier

Nature Communications

Presentations

Talks & Seminars

Upcoming

Jan 2026
Talk

DKFZ Epigenetics Meeting

Heidelberg, Germany

Invited Talks

Dec 2025
Invited Talk

Genentech Research Seminar

Host: Avantika Lal · South San Francisco, CA

Dec 2025
Invited Talk

New York Genome Center

Host: Neville Sanjana · New York, NY

Oct 2025
Invited Talk

Dana-Farber Cancer Institute / Harvard

Host: Sasha Gusev · Boston, MA

2025
Invited Talk

Stanford University (Anshul Kundaje Lab)

Stanford, CA

2024
Invited Talk

Sanger Institute (Roser Vento-Tormo Lab)

Cambridge, UK

2024
Invited Talk

EMBL (Judith Zaugg Lab)

Heidelberg, Germany

Conference Talks

Sep 2025
Talk

Dissecting Cellular Plasticity in Glioblastoma via Deep Learning of Single-Cell Gene Regulatory Networks

Keystone Symposia: AI in Molecular Biology · Santa Fe, NM

2022
Talk

scDoRI: Gene regulatory inference from single-cell multi-omics data using interpretable deep learning

RECOMB/ISCB Regulatory & Systems Genomics · Las Vegas, USA

Posters

Nov 2025
Poster

Dissecting Cellular Plasticity in Glioblastoma via Deep Learning of Single-Cell Gene Regulatory Networks

scverse Conference · Stanford, CA

Oct 2025
Poster

Sequence-to-Expression Mapping in Personalized Genomes using Interpretable Deep Learning

ASHG Annual Meeting · Boston, USA

Schools & Workshops

Jul 2025
Participant

Leena Peltonen School of Human Genetics

Wellcome Genome Campus · Cambridge, UK

Background

Path & Experience

Six institutions, five countries—building interpretable AI for biology.

2025–present

Postdoctoral Researcher

MSKCC & DKFZ · New York / Heidelberg

Working with Dana Pe'er and Oliver Stegle on interpretable deep learning for spatial and perturbation-based single-cell data, modeling tumor–immune interactions.

2020–2025

PhD in Computational Biology

DKFZ / EMBL · Heidelberg, Germany

Thesis: "Decoding gene regulation from single cells to populations with interpretable deep learning." Supervised by Oliver Stegle and Moritz Mall.

2023

ML Research Intern

Genentech · South San Francisco, CA

Contributed to foundation models integrating DNA sequence, RNA expression, and chromatin accessibility from single-cell multi-omic atlases. Learned how AI tools are developed and deployed in pharmaceutical R&D.

2019–2020

Research Software Developer

University of British Columbia · Vancouver, Canada

Built interpretable deep learning models for TF binding and chromatin accessibility prediction. Two co-first author publications in Genome Biology.

2018–2019

Master's Thesis Research

IGMM, CNRS · Montpellier, France

First extended research experience. Developed deep learning models for transcription initiation at microsatellites. Resulted in a co-first-author Nature Communications paper.

2014–2019

M.Sc. Mathematics & B.E. Computer Science

BITS Pilani · India

Dual degree.