February 1, 2026 · Podcast · 1h 4min
The AI-Powered Biohub: Mark Zuckerberg and Priscilla Chan's Bet on Virtual Cells
What happens when you pair a frontier AI lab with a frontier biology lab, designing experiments specifically to generate the data that models need? That’s the thesis behind the Chan Zuckerberg Initiative’s next decade, and it represents a fundamentally different approach from what DeepMind did with AlphaFold, which relied on data other scientists had collected over 30 years.
The Conversation
This crossover episode from the Latent Space podcast (hosted by Swyx and Alessio) features Mark Zuckerberg and Priscilla Chan at CZI’s imaging institute, marking the 10-year anniversary of the Chan Zuckerberg Initiative. The conversation is notably nerdy for a Zuckerberg appearance: no talk of Meta, social media, or VR. Instead, it’s a deep dive into why they’re making science, specifically AI-powered biology, the singular focus of their philanthropy going forward.
The dynamic between the two is revealing. Mark thinks in systems and abstractions (“it’s like building up different levels of pattern matching”). Priscilla brings clinical urgency (“I’m a pediatrician, I think about babies, and very sad things happen to very small people”). Together they make a case that’s both technically grounded and emotionally compelling.
Why Tools, Not Cures
CZI’s strategic bet is counterintuitive: rather than funding research on specific diseases, they build scientific infrastructure. Mark frames this through a historical lens: major scientific advances are almost always preceded by new instruments. The telescope enabled astronomy. The microscope enabled microbiology. CZI’s bet is that new computational and imaging tools will enable the next wave of biological breakthroughs.
This is deliberately different from how the Gates Foundation operates (focused on translational work and public health delivery) and from how the NIH funds research (small grants to individual investigators). CZI builds institutions, hires scientists and engineers directly, and operates labs. The first Biohub connected Stanford, UCSF, and Berkeley, enabling cross-institutional collaboration that sounds obvious but was actually a significant innovation in how scientific research gets organized.
“Our mission is to cure, prevent all diseases, and that’s not going to happen just in our four walls. So the strategy has to be: how do we make every single scientist better and more effective?”
The timeline question produces the episode’s best laugh line. When they initially set “by the end of the century” as a target, biologists thought it was wildly ambitious. AI researchers thought it was comically unambitious. Mark notes the actual timeline will probably depend more on the pace of AI progress than the pace of biological research.
The Human Cell Atlas and the Billion-Cell Project
CZI’s first decade produced the Human Cell Atlas: 125 million cells catalogued, with CZI responsible for roughly 25% of the data and the broader scientific community contributing the rest through CELLxGENE. Priscilla highlights a startling fact: until recently, scientists didn’t even know how many cell types exist in the human body. They know it’s in the billions, and they’ve only characterized a fraction, mostly in healthy states.
The permutations are staggering: species, ancestry, age, gender, environmental exposures, healthy vs. disease states. Each dimension multiplies the complexity.
What took a decade for the first 125 million cells is now accelerating dramatically. The billion-cell project is happening in months at a fraction of the cost. But single-cell transcriptomics is only one dimension. CZI is now adding spatial imaging (through custom-built microscopes, not off-the-shelf equipment), and the next frontier is temporal data: imaging living cells over time, not just frozen slices. They’re even imaging see-through zebrafish as living models, then using AI to translate findings to human biology.
Frontier Biology Meets Frontier AI
Mark introduces a framework that captures the core thesis: a “frontier biology lab” working in sync with a “frontier AI lab.” The distinction from AlphaFold is important. DeepMind applied frontier AI to existing biological data that had accumulated over decades. CZI’s approach is designing biological experiments and instruments specifically to generate the types of data that will make AI models smarter.
This is an inversion of traditional scientific thinking. Classically, scientists generate datasets to look through and make discoveries. CZI’s approach: generate data to train models that will make discoveries you couldn’t have made by looking at the data directly.
“I’m now going to do this so I can help train this other thing to be better and create more advances. That is a little bit of an inversion in the thinking.”
The concrete instantiation is the virtual cell, built hierarchically:
- Molecules to Proteins: Understanding molecular interactions
- Proteins to Cells: Modeling how proteins interact within cells (enabled by ESM3 and the cryoEM spatial model)
- Cells to Systems: Simulating multi-cell interactions (e.g., the virtual immune system)
- Systems to Organism: The eventual “biological omni-model”
Mark draws an explicit analogy to how language models evolved: separate modalities (text, image, audio) were developed independently, then merged, with positive transfer making everything stronger. The same merging process should happen with biological models across scales.
EvolutionaryScale and the AI Leadership Bet
The episode coincides with the announcement that EvolutionaryScale (creators of ESM3, one of the leading protein models) is joining the Biohub, with CEO Alex Rives leading the combined AI-biology program. Mark frames the decision to put an AI person in charge of the overall program as a signal of how central they believe AI will be.
CZI was “probably the first to build out a large-scale compute cluster for biological research” and plans to release frontier models. Combining their existing modeling team with EvolutionaryScale’s protein expertise is meant to cover the full hierarchy from molecular to cellular.
The Validation Problem
One of the most interesting threads is the feedback loop between models and wet labs. In language models, you can run tens of thousands of tests cheaply. In biology, validation requires physical experiments.
Priscilla is candid about the challenge: throughput for established wet lab metrics is improving with parallelized experimentation, but it’s nowhere near the tens of thousands of verifications that AI models typically need. They’ll need to be “smart about how we do it.”
Mark pushes back against the idea that virtual models will eliminate wet lab work entirely, calling it “the biological version of ‘AI is going to automate every single thing in society.’” The more realistic near-term value: models generate hypotheses, scientists apply taste to choose which ones to test, results feed back into the model.
“Right now, because the wet lab is so expensive and relatively slow, people are going for hypotheses that are singles or doubles. But if we have a model that can help de-risk some of the bigger, riskier ideas, that’s going to move science faster.”
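The loop described above (the model proposes, scientists choose, the wet lab measures, results feed back) can be sketched as a toy program. This is only an illustration under stated assumptions: `ToyModel`, `run_wet_lab`, `closed_loop`, the `pick` callback standing in for scientific taste, and all the numbers are hypothetical and do not come from the episode or from any CZI system.

```python
from dataclasses import dataclass, field


@dataclass
class ToyModel:
    """Hypothetical stand-in for a virtual cell model: it holds a predicted
    effect per hypothesis and overwrites it when a measurement arrives."""
    predictions: dict[str, float]
    tested: dict[str, float] = field(default_factory=dict)

    def propose(self, shortlist_size: int) -> list[str]:
        # Rank untested hypotheses by predicted effect, highest first.
        untested = [h for h in self.predictions if h not in self.tested]
        ranked = sorted(untested, key=lambda h: self.predictions[h], reverse=True)
        return ranked[:shortlist_size]

    def update(self, hypothesis: str, measured: float) -> None:
        # Feed the experimental result back into the model's state.
        self.tested[hypothesis] = measured
        self.predictions[hypothesis] = measured


def run_wet_lab(hypothesis: str) -> float:
    # Placeholder for a slow, expensive physical experiment (made-up values).
    ground_truth = {"A": 0.1, "B": 0.9, "C": 0.4, "D": 0.7}
    return ground_truth[hypothesis]


def closed_loop(model, rounds, shortlist_size, pick):
    """Each round: the model shortlists, `pick` (the scientist) chooses
    what to test, and the measurements update the model."""
    order = []
    for _ in range(rounds):
        shortlist = model.propose(shortlist_size)
        for hypothesis in pick(shortlist):
            model.update(hypothesis, run_wet_lab(hypothesis))
            order.append(hypothesis)
    return order


model = ToyModel(predictions={"A": 0.8, "B": 0.6, "C": 0.2, "D": 0.5})
# Assumption: the scientist can afford one experiment per round
# and takes the top-ranked item from the shortlist.
tested_order = closed_loop(model, rounds=2, shortlist_size=2,
                           pick=lambda shortlist: shortlist[:1])
```

In this toy run the model's top guess turns out to be wrong, and the measurement corrects it before the next round. The experiment budget, not the model, is the limiting resource, which is the constraint the section describes.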
N-of-1: The Precision Medicine Vision
Priscilla’s most passionate segment is about “variants of unknown significance.” When someone gets genetic testing for a diagnostic mystery, they typically learn about a few unusual genetic variants, with the devastating caveat: “we don’t know what they mean.” Should you panic? Should you not?
Virtual cell models could change this by simulating what each variant actually does at the cellular level, whether it impacts a disease pathway or is benign. Modeling each person in the lab is currently too expensive and too hard, but computational models could make it feasible.
She uses depression as a concrete example. Current treatment is purely empirical: try an antidepressant (usually whichever the doctor is most familiar with), wait months to see if it works, try another if it doesn’t. Meanwhile the patient suffers. The vision: models that can predict which medication will work for a specific individual based on their biology.
“That’s the future I want to live in, where we can actually understand individuals as individuals and use the biology and science very directly to keep them well.”
The ultimate goal is N-of-1 medicine: treatments designed for each person’s unique biology rather than population-level statistics. Mark envisions this taking the form of a “biological omni-model” that merges virtual cell models across all dimensions, built up over a 5-10 year period.
The Virtual Immune System
The immune system is CZI’s first target for system-level modeling, and Priscilla makes a compelling case for why. It’s a system of individual cells interacting with each other, including cell types like B cells, T cells, and NK cells that aren’t fully understood. It’s a “privileged system” that can travel anywhere in the body. And it has enormous clinical leverage in both directions: when it works, it keeps you healthy; when it malfunctions, you get autoimmune diseases such as MS and lupus, and it may also play a role in dementia.
The New York Biohub is already doing cellular engineering along these lines: programming immune cells to enter a patient’s heart, check for problematic plaques, encode the result into their DNA, self-lyse, and release the signal as cell-free DNA for a binary yes/no diagnostic read. Then, in theory, other engineered immune cells could go in and clear the plaques.
“I know it sounds sci-fi. It is realistic. It is happening.”
The Role of Doctors and Proactive Healthcare
When asked what happens to doctors in this future, Priscilla reframes the question. AI is already excellent at detecting skin lesions and retinal issues. The future doctor’s role shifts toward care, compassion, and walking patients through understanding, returning to the original calling of physicians as healers who use great tools.
Mark zooms out further: the entire healthcare system needs to shift from reactive (showing up when you’re sick) to proactive. Precision medicine doesn’t mean eliminating all bacteria or preventing every infection. It means catching mutations before they become cancerous, understanding disease risks before symptoms appear, and managing health continuously rather than in crisis mode.
Some Thoughts
The host Nathan Labenz’s intro adds important context that the conversation itself doesn’t cover: this is a moment when the US federal government is cutting research budgets and prominent AI researchers (Jeff Dean, Dario Amodei, Chris Olah) are speaking out against civil rights violations. Private scientific philanthropy isn’t just nice to have; it’s becoming a crucial counterweight to institutional dysfunction.
Mark’s insight that the timeline for curing all diseases depends more on AI progress than biology progress is a quietly radical claim. It means the biggest variable isn’t how fast we can build microscopes or run experiments; it’s how fast models get good enough to reason about biological complexity.
The “frontier biology + frontier AI in sync” framework is genuinely novel. Most AI-for-science initiatives apply AI to existing data. Designing the data collection around what the models need is an underappreciated strategic advantage, and it explains why CZI chose to build institutions rather than issue grants.
Priscilla’s framing of the cell atlas work as “not glamorous” and unlikely to get anyone a tenure-track paper reveals a structural problem in academic science: the incentive system rewards novelty over infrastructure, even when infrastructure is what accelerates the whole field.
The longevity question produces a revealing exchange. Mark is interested; Priscilla explicitly isn’t, preferring to focus on pediatric disease. Their decision not to verticalize on any specific disease but instead build horizontal tools means the longevity question is for someone else to answer with their tools. It’s a rare case of a tech billionaire exercising genuine strategic restraint.
There’s a data bottleneck that even the smartest AI can’t solve by reasoning from first principles. “A lot of human knowledge comes empirically, not from first principles reasoning.” In a field obsessed with scaling compute, this is a useful corrective: for biology, you need to scale data collection too, and that requires actual instruments, actual experiments, and actual scientists sitting next to each other.