Postdoctoral Researcher

Manu Saraswat

Mechanistic AI for Biological Discovery

I build AI systems where the model's internal representations are biological entities, not abstract features—enabling predictions that reveal mechanisms, get validated in the lab, and translate to therapeutic strategies. Currently at Memorial Sloan Kettering Cancer Center (MSKCC) and German Cancer Research Center (DKFZ), working with Dana Pe'er and Oliver Stegle.

Research Vision

Representation learning where the representations are the biology.

Biology is at a turning point. Multi-scale, multi-modal measurements—genomes, single cells, spatial assays, imaging—are converging rapidly, but turning this data into insight requires AI that can operate across scales.

My approach is mechanistically grounded representation learning. I encode biological structure directly into architecture, so the model's weights and latent variables correspond to real entities—transcription factors, enhancers, cell states, regulatory circuits. Interpretability isn't retrofitted after training; it is the model's internal language from the start.
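The structure-as-interpretability idea can be sketched in a few lines. This is a hypothetical toy, not any published model: all TF names, target genes, activities, and weights below are made up for illustration. The point is that when each latent variable is a named transcription factor and weights are restricted to prior TF→target edges, every term of the reconstruction is a readable regulatory statement.

```python
# Toy sketch of a mechanistically grounded linear decoder.
# Hypothetical prior: which TF may regulate which genes, with
# illustrative regulatory weights (all values invented).
grn = {
    "SOX2":   {"NANOG": 0.6, "KLF4": 0.4},
    "POU5F1": {"NANOG": 0.5, "KLF4": 0.3},
    "GATA1":  {"HBB": 0.8, "TAL1": 0.7},
}

def decode(tf_activity: dict[str, float]) -> dict[str, float]:
    """Reconstruct expression as a sum of TF contributions.

    Each latent dimension is a real TF, so every term in the sum is an
    interpretable (TF, gene, weight) statement about regulation.
    """
    expression: dict[str, float] = {}
    for tf, activity in tf_activity.items():
        for gene, weight in grn.get(tf, {}).items():
            expression[gene] = expression.get(gene, 0.0) + activity * weight
    return expression

# A pluripotency-high cell: GATA1 is off, so its targets stay silent.
cell = {"SOX2": 1.0, "POU5F1": 1.0, "GATA1": 0.0}
print(decode(cell))
```

Because the decoder's nonzero weights are the candidate regulatory network, "interpreting" the model and "reading off the biology" are the same operation; no post-hoc attribution step is needed.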

Across cancer, immunology, neurodevelopment, and population genetics, I've learned that the most impactful models aren't necessarily the most complex—they're the ones scientists trust enough to act on. My goal is to build AI that accelerates the full arc from measurement to mechanism to therapeutic impact—whether through specialized architectures or by shaping how foundation models represent biological systems.

This philosophy shapes three active research directions:

01

Gene Regulatory Networks

Inferring how transcription factors, enhancers, and signaling control cell identity—from single cells to spatial tissue context.

Single-cell · Spatial · Cancer · Neurodevelopment
02

Genetic Variation & Disease

Connecting DNA variants to gene expression and disease risk across individuals and ancestries.

Sequence-to-function models · Population Genetics · Precision Medicine · Biobanks
03

Perturbation & Causal Inference

Modeling how genetic and chemical perturbations alter cell state and validating regulatory predictions through experimental intervention—an emerging direction building on my GRN work.

Perturb-seq · Causal Modeling · Experimental Validation
News
Dec 2025

Invited seminar, "Interpretable deep learning for single-cell genomics," at Genentech in South San Francisco

Nov 2025

Attending scverse Conference at Stanford

Sep 2025

Started visiting Dana Pe'er's lab at Memorial Sloan Kettering in NYC as a postdoc

Jul 2025

Successfully defended my PhD at DKFZ/EMBL Heidelberg 🎓

Jul 2025

Participated in the Leena Peltonen School of Human Genetics in Cambridge, UK

Selected Projects

Tools & Frameworks

Open-source methods developed for the research community, with a focus on biological interpretability and clinical translation.

scDoRI

Under review at Nature

Single-cell Deep Omics Regulatory Inference

A mechanistically constrained autoencoder that reconstructs enhancer-driven gene regulatory networks from single-cell ATAC–RNA data. Applied to >1M glioblastoma cells, scDoRI identified repressive transcription factor circuits regulating tumor plasticity—validated in vivo by slowing tumor growth and increasing survival in mouse models. This computation → wet-lab validation → therapeutic hypothesis pipeline demonstrates AI that doesn't just predict, but explains and enables intervention.

Prediction → wet-lab validation → in vivo efficacy
Autoencoders · Multi-omics · GRN Inference · Cancer Plasticity

DeepGenoXcan

In development

Personalized Gene Expression Prediction

A modular deep learning system predicting cell-type-specific gene expression from personalized genomes, combining genomic sequence with chromatin accessibility. Unlike Enformer/Borzoi (which predict from reference genomes), DeepGenoXcan is designed for individual genetic backgrounds—critical for precision medicine. Outperforms both linear TWAS models and deep neural network models on donor-level generalization.

Outperforms linear models and fine-tuned Enformer/Borzoi on personalized expression prediction
Population Genetics · TWAS · Precision Medicine

ExplaiNN

Published

Explainable Neural Networks for Genomics

One of the first fully transparent deep learning architectures for genomics. Demonstrates that high predictive performance and complete interpretability can coexist, producing motif-level explanations of transcription factor binding and chromatin accessibility predictions.

Full interpretability without sacrificing performance
CNN · Interpretability · TF Binding · Motif Discovery

Biologically Relevant Transfer Learning

Published

Domain-Informed Training Strategies

Systematic evaluation of how biological priors affect deep learning training. Tested TF family relationships, DNA-binding domain similarity, and cofactor information as pre-training signals. Established benchmarks showing biologically informed training outperforms naive approaches, especially in low-data regimes.

Evaluation methodology: domain knowledge improves training strategies
Transfer Learning · TF Binding · Training Strategies
Publications

Selected Papers

* denotes equal contribution / co-first authorship

2025

Decoding Plasticity Regulators and Transition Trajectories in Glioblastoma with Single-cell Multiomics Co-first

Manu Saraswat*, Laura Rueda-Gensini*, Elisa Heinzelmann*, Tannia Gracia*, Fani Memi* et al.

bioRxiv (under review at Nature)

📌 Interpretable architecture → wet-lab validation → in vivo therapeutic efficacy

2023

ExplaiNN: interpretable and transparent neural networks for genomics Co-first

Gherman Novakovsky*, Oriol Fornes*, Manu Saraswat*, Sara Mostafavi, WW Wasserman

Genome Biology

2021

Biologically relevant transfer learning improves transcription factor binding prediction Co-first

Gherman Novakovsky*, Manu Saraswat*, Oriol Fornes*, Sara Mostafavi, WW Wasserman

Genome Biology

2021

Discovery of widespread transcription initiation at microsatellites predictable by sequence-based deep neural network Co-first

Mathys Grapotte*, Manu Saraswat*, Chloé Bessière*, FANTOM Consortium, Laurent Brehelin, Charles-Henri Lecellier

Nature Communications

Presentations

Talks & Seminars

Upcoming

Jan 2026
Talk

DKFZ Epigenetics Meeting

Heidelberg, Germany

Invited Talks

Dec 2025
Invited Talk

Genentech Research Seminar

Host: Avantika Lal · South San Francisco, CA

Dec 2025
Invited Talk

New York Genome Center

Host: Neville Sanjana · New York, NY

Oct 2025
Invited Talk

Dana-Farber Cancer Institute / Harvard

Host: Sasha Gusev · Boston, MA

Oct 2025
Invited Talk

MILA Multi-omics Reading Group

Valence Labs · Virtual

Oct 2025
Invited Talk

McGill University (Single Cell Open Club)

Virtual

2024
Invited Talk

Stanford University (Anshul Kundaje Lab)

Virtual

2024
Invited Talk

Sanger Institute (Roser Vento-Tormo Lab)

Virtual

2024
Invited Talk

EMBL (Judith Zaugg Lab)

Heidelberg, Germany

Conference Talks

Sep 2025
Talk

Dissecting Cellular Plasticity in Glioblastoma via Deep Learning of Single-Cell Gene Regulatory Networks

Keystone Symposia: AI in Molecular Biology · Santa Fe, NM

Jul 2025
Talk

Sequence-to-Expression Mapping in Personalized Genomes using Interpretable Deep Learning

Leena Peltonen School of Human Genetics · Cambridge, UK

2022
Talk

scDoRI: Gene regulatory inference from single-cell multi-omics data using interpretable deep learning

RECOMB/ISCB Regulatory & Systems Genomics · Las Vegas, USA

2020
Talk

Convolutional Additive Models: a fully interpretable approach to deep learning in genomics

Machine Learning in Computational Biology (MLCB) · Virtual

Posters

Nov 2025
Poster

Dissecting Cellular Plasticity in Glioblastoma via Deep Learning of Single-Cell Gene Regulatory Networks

scverse Conference · Stanford, CA

Oct 2025
Poster

Sequence-to-Expression Mapping in Personalized Genomes using Interpretable Deep Learning

ASHG Annual Meeting · Boston, USA

Schools & Workshops

Jul 2025
Participant

Leena Peltonen School of Human Genetics

Wellcome Genome Campus · Cambridge, UK

Background

Path & Experience

Six institutions, five countries—building interpretable AI for biology.

2025–present

Postdoctoral Researcher

MSKCC & DKFZ · New York / Heidelberg

Working with Dana Pe'er and Oliver Stegle on interpretable deep learning for spatial and perturbation-based single-cell data, modeling tumor–immune interactions.

2020–2025

PhD in Computational Biology

DKFZ / EMBL · Heidelberg, Germany

Thesis: "Decoding gene regulation from single cells to populations with interpretable deep learning." Supervised by Oliver Stegle and Moritz Mall.

2023

ML Research Intern

Genentech · South San Francisco, CA

Contributed to foundation models integrating DNA sequence, RNA expression, and chromatin accessibility from single-cell multi-omic atlases. Learned how AI tools are developed and deployed in pharmaceutical R&D.

2019–2020

Research Software Developer

University of British Columbia · Vancouver, Canada

Built interpretable deep learning models for TF binding and chromatin accessibility prediction. Two co-first author publications in Genome Biology.

2018–2019

Master's Thesis Research

IGMM, CNRS · Montpellier, France

First extended research experience. Developed deep learning models for transcription initiation at microsatellites. Resulted in co-first author Nature Communications paper.

2018

Summer Analyst

Goldman Sachs · Bangalore, India

Learned to communicate technical concepts to diverse stakeholders in a fast-paced environment.

2014–2019

M.Sc. Mathematics & B.E. Computer Science

BITS Pilani · India

Dual degree.