Mark Zuckerberg & Priscilla Chan: How AI Will Cure All Disease

AI transcript
0:00:04 This is a space that, I mean, there’s just gonna be a huge amount of leverage with AI.
0:00:08 It still seems like there could be a lot more effort in this space around building tools,
0:00:14 and it’s kind of this crazy thing that we’re, you know, here in, you know, 2025,
0:00:18 and there’s not the kind of periodic table of elements equivalent for biology.
0:00:23 We think that this is, like, probably one of the most important sets of tools that you need to build.
0:00:27 When we first set out the goal to cure and prevent disease by the end of the century,
0:00:31 people, like, honestly, most scientists couldn’t look at us with a straight face.
0:00:33 They’re like, you’re crazy.
0:00:40 Yes, and it was true, because if you just decided to spend the money funding the next best grant
0:00:45 for every single lab in the country, like, there was no pathway to that being true.
0:00:48 The biology folks, I think, looked at it as if it were crazy ambitious,
0:00:53 and then the AI folks are like, well, that’s kind of boring.
0:00:55 That’s just automatically gonna happen.
0:00:58 I know, that’s like, okay, there’s something in between there that needs to be bridged.
0:01:04 The scientific community needs fundamentally new tools to cure disease, not just more funding.
0:01:09 For decades, biological research has been constrained by the same limitations.
0:01:14 Small grants that fund incremental progress, isolated labs working on narrow questions,
0:01:18 and a lack of shared infrastructure to tackle the biggest challenges in medicine.
0:01:20 But what if we could change that?
0:01:25 Today, you’ll hear from Priscilla Chan and Mark Zuckerberg on their 10-year journey building the
0:01:27 infrastructure for modern biological research.
0:01:31 We discuss how they accidentally created the standard for biology data with their Cell Atlas
0:01:35 project, cataloging millions of cells in an open-source format.
0:01:40 We explore why they’re betting on virtual cells that let scientists test high-risk hypotheses in silico
0:01:43 before investing in extensive wet lab work.
0:01:49 And we dive into Biohub, their play to accelerate discovery by pairing frontier biology with frontier AI.
0:01:50 Hope you enjoy.
0:01:56 Mark, Priscilla, welcome to the a16z podcast.
0:01:57 Thanks for having us.
0:01:58 Yeah, great to be here.
0:01:58 Excited.
0:01:59 All right.
0:02:00 Excited to have you.
0:02:01 You’re doing exciting stuff.
0:02:01 Yeah.
0:02:06 To that end, almost a decade ago, you guys started the Chan Zuckerberg Initiative with the mission
0:02:09 and intent to cure, prevent, manage all disease by the end of this century.
0:02:14 There’s a lot of missions that you guys could have poured your time and resources into.
0:02:16 Take us behind the conversations of why you guys picked this one.
0:02:20 Maybe, Priscilla, why don’t we start with you and your side of the story?
0:02:24 It always surprises people when I talk about how we work in basic science research.
0:02:28 I trained as a pediatrician, and people always think, oh, it must be about medicine.
0:02:34 And for me, I went into medicine because I wanted to improve people’s lives.
0:02:35 I wanted to make a difference.
0:02:36 I wanted to be able to help others.
0:02:46 And I think training as a pediatrician at UCSF, I met a lot of patients and, frankly, like little kids and families
0:02:49 for which we just had no idea what the problem was.
0:02:54 And they might have a specific gene that they could name if they were lucky.
0:03:02 Or they could be grouped into a bunch of other diseases, and there’d be a general sort of PDF they’d print out, like, this is what we know.
0:03:08 And then it was my job as an intern or resident to try to translate, like, a few lines of information
0:03:10 to how we were supposed to take care of the patient.
0:03:21 And for me, that’s when I really realized the power of basic science and how we need to work on basic science to advance the forefront of what’s possible.
0:03:23 I think of it as the pipeline of hope.
0:03:27 And why did you think you could cure all disease?
0:03:29 Because that’s, like, a very, like, aggressive goal.
0:03:31 Do you want to answer that one?
0:03:31 Yeah.
0:03:35 Well, I mean, we’re not going to cure all diseases, to be clear.
0:03:40 I mean, the strategy is to help scientists and the scientific community cure all diseases.
0:03:45 So the strategy is really one of accelerating the pace of basic science.
0:03:57 And the theory that we had was, if you look at the history of science, most major breakthroughs are basically preceded by the invention of a new tool to observe phenomena in a new way.
0:03:57 Right?
0:03:59 So think about things like the microscope, right?
0:03:59 Right.
0:04:04 Being able to observe bacteria or other fields, the telescope or, you know.
0:04:13 But it’s, just to use an engineering example, without those kind of tools, it’s kind of like you’re coding without being able to step through the code and see both of these, right?
0:04:14 That’s like the old days.
0:04:16 Yeah, yeah, yeah.
0:04:24 So our whole approach on this is basically, let’s help build tools that will accelerate the pace of the whole field.
0:04:34 And I think that there’s a niche that I think fits that, because if you look at how funding works in science, the vast majority of funding comes from the government and NIH.
0:04:42 It’s parceled out into these relatively small grants that allow individual investigators to investigate usually pretty near-term things.
0:04:54 And the development of these kind of new types of tools, whether it’s imaging or building now a lot of AI things like virtual cell models, are longer-term, oftentimes more expensive to develop.
0:05:06 So think about, like, on the order of maybe $100 million to a billion dollars over a 10- to 15-year period, and then you try to unlock those tools and give them to the scientific community to accelerate the pace.
0:05:07 So that’s kind of the theory.
0:05:07 Right.
0:05:13 And it seems like there’s also something that is you don’t really get credit for the tools in a lot of ways.
0:05:17 I mean, we have companies that use your tools and they’re very happy about it.
0:05:19 But I didn’t even know that that was the case.
0:05:20 And so—
0:05:21 That’s why it’s philanthropy.
0:05:22 Yeah, well, it is.
0:05:25 But most people do philanthropy to get credit, too.
0:05:27 I mean, that’s kind of a part of it.
0:05:30 So, I guess, did you think about that?
0:05:34 Or were you just like, no, like, this is going to work, and if it works, that’s all we need?
0:05:46 We’re super focused on, like, actually making every scientist better, and beyond science, like, startups, startup founders, because the point is we can’t do this alone.
0:05:54 And when we first set out the goal to cure and prevent disease by the end of the century, people—like, honestly, most scientists couldn’t look at us with a straight face.
0:05:57 They’re like, you’re crazy.
0:06:09 Yes, and it was true, because if you just decided to spend the money funding the next best grant for every single lab in the country, like, there’s no pathway to that being true.
0:06:21 But if you forced people to really think about this and, like, okay, what is the most credible pathway to doing this, and what are the barriers to that credible pathway, then we sort of got somewhere, right?
0:06:27 They were like, well, like, there’s no shared tools, or we’re not working on big projects and building the right data sets.
0:06:31 And we’re like, okay, well, then we can start doing something about that.
0:06:36 And so, that’s where the idea of building shared tools, because no one right now in the science—
0:06:37 Well, that’s so interesting.
0:06:41 So, basically, you’re like, we’re going to cure all disease, and they’re like, can’t be done.
0:06:42 Why can’t it be done?
0:06:44 Well, because we don’t have the tools, okay?
0:06:47 That’s a pretty cool sequence.
0:06:57 Yeah, I mean, there’s also this funny thing where the biology folks, I think, looked at it as if it were crazy ambitious, and then the AI folks are like, well, that’s kind of boring.
0:06:59 That’s just automatically going to happen.
0:07:00 So, I know, that’s like, okay,
0:07:02 there’s something in between there that needs to be bridged.
0:07:09 And if you can, like, kind of use the kind of modern AI tools in order to build the types of tools that biologists need.
0:07:13 So, that’s a big part of how we think about our work is—
0:07:17 AI has got to be the most overestimated and underestimated technology ever, like, simultaneously.
0:07:20 I mean, yeah, well, probably like the internet early on.
0:07:28 But we kind of think about ourselves and the work that we’re doing at the Biohub as frontier biology paired with frontier AI, right?
0:07:34 So, there are labs that do frontier AI that basically, you know, are building the most advanced models.
0:07:48 And then there are lots of biological research organizations that effectively do very leading-edge research, either to discover new datasets or to look into certain challenges.
0:07:53 But so far, there hasn’t been anyone who’s tried to do both of those at once.
0:07:56 And when you look at—I mean, even something like AlphaFold, which is amazing, right?
0:08:02 It was built off of this dataset that was a public dataset that had been produced decades ago, right?
0:08:13 And what I think you have the opportunity to do, if you do both of those together, is produce specific datasets for the purpose of training AI models to build virtual cells that can do specific things.
0:08:17 So, I think that that’s, like, a pretty interesting zone to be in.
0:08:23 And of all the things that we’ve worked on, you know, actually, when we started CZI, we kind of actually focused on a number of areas.
0:08:27 And what we found is just that the science research has had by far the biggest returns.
0:08:32 We’ve just doubled down on it over and over and over until now we’re at the point that we’re 10 years in.
0:08:36 And Biohub is really the main focus of our philanthropy at this point.
0:08:39 But, yeah, I mean, that’s kind of—that’s basically the focus.
0:08:44 Maybe you’re not giving yourselves enough credit because you’re sort of saying, well, there’s bite-sized science.
0:08:44 We don’t want to do that.
0:08:50 There’s century-scale science, and that seemed like a long time horizon but achievable, ambitious.
0:08:57 But you’ve actually identified, which I think is really fantastic, grand scientific challenges that are right in between.
0:09:05 They’re 10- to 15-year horizons, at least per kind of the way you communicate about them and the way you energize the scientific community about them.
0:09:15 And 10-to-15 is kind of an interesting time horizon, sort of like similar to the time horizon of a venture-backed company, similar to the time horizon on which a team can work together for that period of time.
0:09:17 How did you get to that number?
0:09:23 And then how are you thinking about the challenges that you take on in each 10-to-15-year wave?
0:09:26 Because that’s concrete, achievable.
0:09:30 You build a lot of credibility around it the way that you’ve announced those challenges.
0:09:32 Well, I’m curious how you guys think about it.
0:09:43 But for us, when we looked at the grand challenges on the 10-to-15-year time horizon, it needs to be like—when you look at it, you’re like, I see a path.
0:09:47 Not everything needs to be solved for us to take it on.
0:09:49 In fact, if everything’s solved, then that feels like that should just go.
0:09:50 And it wasn’t ambitious enough.
0:09:53 Yeah, like we have some risk appetite, right?
0:10:04 So we want things where we’re like, there’s a credible pathway, someone who is at the helm who can do this, and there’s enough ambiguity where we feel like we could take on that risk.
0:10:08 And if we do it, like the returns could be higher than even expected.
0:10:14 And the way we modeled that in the biohubs is we have three biohubs.
0:10:18 We have one in San Francisco, one in Chicago, one in New York.
0:10:20 The one in New York works on cell engineering.
0:10:25 Can we engineer cells to go in and detect signals, read it out, or to take certain actions?
0:10:30 In Chicago, we’re building tissues and looking at cell communications within tissues.
0:10:34 And then in San Francisco, we’re looking at deep imaging and transcriptomics.
0:10:39 And that work, the locations are not by accident.
0:10:51 We also look at the partner universities because we have folks who come to the biohubs to do this work, collaborative, interdisciplinary, and sort of unconstrained by the traditional lab.
0:10:58 But we also build off of the labs at these academic institutes that support the work.
0:11:05 And so that’s how we sort of choose the grand challenge and the locations.
0:11:20 And then the sort of layering in of large language models and AI coming into the picture has been so interesting because we were already building tools to measure interesting data, building the data sets.
0:11:22 But we didn’t really know what to do with them yet.
0:11:29 And large language models coming onto the scene, we’re like, wow, we can make sense of all of this now.
0:11:35 So I’m curious what you view success as in the therapeutic realm.
0:11:38 So, you know, we think a lot about understanding biology.
0:11:44 And sometimes we bet on startups that want to unlock completely new biological areas, diseases, where we don’t know what’s going wrong.
0:11:50 And then there’s another group of folks who kind of say, hey, okay, now that we understand what’s going wrong, let’s fix it.
0:11:52 Let’s come in with a drug.
0:11:54 Let’s come in with a new type of chemistry, a new type of antibody.
0:12:06 How do you, what do you think success for the CZ Biohub looks like 10, 20, 50 years from now in terms of the new medicines that you’ve enabled?
0:12:16 We want there to be like an explosion of a community who are building these, just the new wave of what it means to be deploying precision medicine.
0:12:24 Like, I think for rare diseases and common diseases alike, you’re really talking about individual biology that we sort of lump together.
0:12:30 And we often don’t know how it happens, right?
0:12:35 We know that you have this mutation or the worst nightmare is you have a variant of unknown significance.
0:12:37 What does that even mean?
0:12:38 The horrible VUS.
0:12:39 Yeah, it’s horrible.
0:12:43 And you’re like, you tell someone you kind of know something, but we don’t know what it means.
0:12:54 But if you look at the way we’ve been able to look at variants and look at single cell transcriptomics, we’re starting to be able to say, okay, this variant actually impacts this set of downstream cells.
0:13:02 And then we start looking at the proteins that get expressed and how it looks similar or different to what a healthy cell would look like.
0:13:06 Then you can start targeting, okay, like, let’s look at that as a target.
0:13:20 And you both know the specificity of the target you want to build based on the ability to connect mutation to protein expression, as well as to be able to predict off target effects.
0:13:21 What are the side effects?
0:13:27 Because you also know where else that drug will be able to interact with the body.
0:13:38 And so those are rare, like, but I really think most diseases should be thought of as rare diseases because each one of our biology is different.
0:13:40 And right now we just get lumped, right?
0:13:47 We get lumped based on age, demographics, ancestry, if we’re lucky to have that level of understanding.
0:13:50 But truly, each one of our biology is different.
0:13:58 And say, like, if you look at hypertension or depression, like, we kind of just go by trial and error and saying, like, let’s just try that drug and see what happens.
0:14:06 But what should really happen is being able to precisely and accurately and quickly treat people by looking at individuals’ biology.
0:14:16 We want to enable the basic science, and we would be thrilled if people picked up the models that we build to be able to build the diagnostics, the therapeutics that need to come.
0:14:19 You’ve built amazing data sets, I have to say.
0:14:29 Like, I mean, you may not hear the feedback from the startup community and the pharma community and the R&D community, but it’s there because you’ve committed to open source.
0:14:35 And so people may not be, they may not all be writing papers, but they are using those tools.
0:14:40 There’s a startup in our portfolio working on idiopathic pulmonary fibrosis.
0:14:42 The name tells you how vexing the disease is.
0:14:43 It’s idiopathic.
0:14:45 We don’t know why it happens.
0:15:01 IPF is named that way, and so, you know, he was telling me that he used your CELLxGENE atlases to look at millions of single cells in patients with disease, without disease, try to pinpoint the fibroblasts, double-click on the fibroblasts and their gene expression.
0:15:02 It’s incredible.
0:15:12 And try to, you know, use that to inform, hey, where could I go after a new drug target in this disease that’s fundamentally a strange clump of idiopathic origin?
0:15:24 So I think there’s a huge, there’s a huge group of innovators who are, who love the tools, the visualizations, the query systems, and really the software approach that you’ve built to making that data incredibly accessible.
0:15:25 So thank you.
0:15:28 CELLxGENE is like almost an accident, though.
0:15:30 Tell us more.
0:15:33 So do you want to share a little bit about CELLxGENE, or do you want me to start?
0:15:47 Well, I mean, I don’t know which part you want to get into, but I mean, but the Cell Atlas work overall, and it’s kind of this crazy thing that we’re, you know, here in, you know, 2025, and there’s not the kind of periodic table of elements equivalent for biology, right?
0:15:58 So that was sort of a lot of the inspiration of it was, all right, how do we both, through work that we’re going to do in the Biohub and through other grants, be able to pull together and standardize a format where you can have all this data?
0:16:06 And when we were starting off, we didn’t even necessarily have in mind that we were going to use that to build virtual cell models.
0:16:13 I think that that’s sort of just come into focus as the AI work has advanced, but that’s a very exciting thing.
0:16:18 We should definitely spend a bunch of time on the virtual cell models, but I’m not sure what you wanted to get into on the Cell Atlas.
0:16:25 Well, the single-cell work was one of our first RFAs when we started 10 years ago, and we were like, okay, we think this is possible.
0:16:31 We actually funded the methodology for it to standardize how it was going to be done.
0:16:32 So that was 10 years ago.
0:16:43 And we then, we seeded a few labs to start building out that data set, but we were like, there are like millions or billions of different cell types and different permutations.
0:16:46 Like, how are we going to do this?
0:16:49 And especially with like a burgeoning technique.
0:16:55 And so we ended up seeding a few groups, and they started doing work.
0:16:56 And then they told us they had a problem.
0:17:03 There was a bottleneck in their workflow because they couldn’t annotate the data fast enough.
0:17:07 And so we built, CELLxGENE was an annotation tool.
0:17:10 That’s the original source of this.
0:17:17 So we built the annotation tool to make it easy for people who are doing single-cell science to be able to annotate the data.
0:17:21 And then we put the data that we collected publicly so people could share.
0:17:27 But because everyone started using the same annotation tool, everyone was standardized then on the same data formats.
0:17:35 And then there started being a community around the tool, and they wanted to share back and build the atlas.
0:17:43 So now after 10 years, there are millions of cells that have been built into this shared resource for the entire scientific community.
0:17:45 We only funded about 75% of it.
0:17:46 Sorry, that’s wrong.
0:17:49 We’ve only funded 25% of it.
0:17:58 75% came from the broader community saying, this is useful, and there’s an easy way for us to standardize and build this together.
0:17:59 We have the same metadata.
0:17:59 Yeah.
0:18:00 That’s right.
0:18:03 It’s like an interesting, what you’d call a network effect.
0:18:06 Yeah, I was going to say, it sounds like the internet.
0:18:10 Come for the annotation, stay for the virtual cell model.
0:18:16 Well, it was very important when we were getting started with the work to have everyone who was doing it have a consistent format.
0:18:18 So that way, it could be used and portable.
0:18:23 And then once that kind of took off as the way that it would get done, then other people just found it valuable.
0:18:29 Yeah, and even relative to prior databases like GEO and whatnot, they’re just simply not as standardized or QC’d.
0:18:30 Yeah.
0:18:35 Let’s get into virtual cells, one of the grand challenges you’ve focused on.
0:18:39 Maybe talk about what is the promise or the hope and maybe some of the challenges or where we’re at with it.
0:18:44 Yeah, I mean, we think that this is going to be one of the most important tools at this point.
0:18:58 It’s basically building up the kind of hierarchy from proteins to just different structures within the cell to a whole virtual immune system or different levels of hierarchy.
0:19:08 And we think that this is going to end up being like a very important set of tools for people to effectively generate hypotheses for different science work.
0:19:16 You know, even before you get to the point where you’re really running full experiments in it, you can come up with some estimate of how that might run.
0:19:24 It will be useful for some of the precision medicine type examples that Priscilla was talking about a few minutes ago.
0:19:30 But we think that this is like probably one of the most important sets of tools that you need to build.
0:19:34 And it’s not a single thing, right?
0:19:36 So there’s different angles to come at this from.
0:19:40 The cell atlas data is helpful for understanding things on a cellular level.
0:19:49 One of the kind of most important things that we’re doing right now, there’s this great company, EvolutionaryScale.
0:19:53 We actually had a bunch of researchers who’d formerly worked at Meta on protein folding models.
0:20:03 It’s joining Biohub, and Alex Rives, the leader of it, is actually going to be the kind of head of the whole science program.
0:20:08 Which is actually kind of interesting when you think about it, where it’s like you have AI and biology coming together.
0:20:13 And really, it’s like an AI person who understands biology is running it rather than a biologist who has some understanding of AI.
0:20:19 I think it just kind of speaks a little bit to where we think the relative weight of these things is.
0:20:24 But, I mean, we basically view, you know, like Priscilla was saying with the different biohubs,
0:20:33 then New York doing cellular engineering will basically make it so that you can have cells that can record different things that are going on around the body.
0:20:36 And share that data, and then you can build that into models.
0:20:44 The Chicago biohub being able to record inflammation and basically study that in order to kind of help understand.
0:20:47 Like, that’s a different data set.
0:20:51 We have the Imaging Institute, which is, we just trained our first set of models around that.
0:20:59 Which are the first, like, spatial models around understanding, like, the way that kind of cells look in different states.
0:21:08 And eventually, just like you have this analogy on the kind of the industry side or on language models,
0:21:12 where you have different capabilities, and then over time you train them into models, and it gets more and more general.
0:21:14 That’s kind of the idea here.
0:21:18 So we’ll build the biohubs around grand biological challenges.
0:21:22 The biohubs will build tools that will generate novel data sets.
0:21:28 We will build models based on those, and then eventually combine the models into an increasingly general view
0:21:36 of a virtual cell that will be useful, both for scientists and hopefully startups and companies that are working on finding drugs,
0:21:40 which is not our part of the whole thing, but I think is obviously a really important part of what needs to happen.
0:21:46 Yeah, and, you know, you guys think about risk all the time, in terms of when you make investments.
0:21:56 Like, I think the promise of being able to do virtual biology using a virtual cell model is you can actually take on riskier ideas.
0:22:02 Right now, like, grant funding can be hard to come by, and the wet lab work is expensive and slow,
0:22:05 and it’s not just, you know, money, it’s also time.
0:22:11 And so you have to choose something that you think is going to have some likelihood of success
0:22:13 to keep your lab career going.
0:22:19 And so it naturally lends people to take on, like, some risk, but not a lot of risk,
0:22:24 because they need to make sure that they are hitting, like, a certain percentage of the time
0:22:28 to make tenure or publish or whatever they need to do.
0:22:33 But if you had a virtual cell model where you could simulate really high-quality biology,
0:22:38 you could actually then start testing and tinkering on the computational side
0:22:42 and, like, ask riskier questions, things that would have been expensive and costly
0:22:44 in terms of time and resources to do in the lab,
0:22:49 and actually see if there is promise doing the experiments in silico
0:22:53 before you make the time and money investment in the wet lab.
0:22:54 Do you think of it kind of like a model organism?
0:22:55 Yeah.
0:22:56 Like it’s the new fruit fly?
0:22:57 Yeah.
0:23:05 I was going to ask, given the complexity of a cell, like, how close, like, how accurate
0:23:07 do you think you’ll get the model to?
0:23:11 I mean, just assuming, I mean, maybe you get it to, like, a perfectly accurate representation
0:23:16 of a cell, but, like, how accurate to be useful would the virtual cell have to be?
0:23:20 I think it will obviously iterate and get better and better, because right now we, like, right
0:23:23 now we’re still just talking about transcriptomics.
0:23:30 We’re expanding into different ways of looking at the cell, but you get more and more accuracy.
0:23:35 But I don’t think it needs to be 100% accurate to be useful, because you just want to be able
0:23:38 to de-risk the idea on the front end a little bit.
0:23:43 And the more and more you de-risk it, the more efficient it gets, obviously.
0:23:46 But it will be useful if you even get directional signal.
0:23:54 And yes, we do think about it, like, as a model organism, but in a way that’s, like, has fidelity
0:23:55 to the human body.
0:23:57 Like, you know, like, I don’t want to…
0:23:58 All models are wrong.
0:23:59 Some are useful.
0:23:59 Yeah.
0:24:00 Yeah.
0:24:00 Yes.
0:24:03 Hopefully it has utility on certain axes.
0:24:08 And just like the language models, you build in specific capabilities.
0:24:08 So it’s not…
0:24:16 So, for example, you know, one of the models that we’re publishing is VariantFormer, right?
0:24:21 It basically, you know, makes it so that it’s trained on a bunch of effectively pairs.
0:24:25 If you have a cell, you apply CRISPR to it in a place, you see what comes out at the other
0:24:25 side.
0:24:28 So it basically is able to make that kind of a prediction.
0:24:33 Like, okay, if you have this edit that you’re doing to a cell, what is likely going to happen?
0:24:37 Another one of the models is it’s this diffusion model.
0:24:41 Basically, you can describe a type of cell that you would like it to simulate, then it will just
0:24:44 produce a kind of synthetic model of the cell.
0:24:51 Again, I mean, it’s kind of interesting because to Priscilla’s point before about how everyone
0:24:57 is different and, like, and different cells have kind of, you know, you want to be able
0:25:03 to simulate these kind of rare configurations, having at least a synthetic version of what
0:25:04 that could look like is interesting.
0:25:06 And then you can test against that.
0:25:08 The cryo model, I think, is interesting because it’s spatial.
0:25:12 So it kind of gives you a sense of there are all these different models that you can have
0:25:16 that allow you to basically look at different kinds of things.
0:25:19 And then you just train them in to be increasingly general over time.
0:25:21 Yeah, no, very interesting.
0:25:28 And is the modeling technology basically LLMs or, like, is there a reasoning model?
0:25:29 Is it like a just a…
0:25:31 Oh, that’s actually, yeah, no, that’s a fascinating one, too.
0:25:38 Because one of the new models, I think this one is very early, but it’s basically the first
0:25:39 reasoning model over biology.
0:25:46 So the idea is that, yeah, you effectively have these models that kind of simulate world
0:25:47 models in different ways.
0:25:54 And then you want it to be able to not just be able to spit out correlations, right, in
0:25:59 terms of, like, what it’s found, but actually be able to kind of reason through how things
0:26:01 would evolve and why things would happen.
0:26:09 I know it’s quite early, but it is interesting conceptually as what I think is clearly going
0:26:13 to be an important direction in terms of how these models evolve.
0:26:20 Yeah, no, because that’s what I was thinking, you know, that if it doesn’t work, the next
0:26:21 question you have is why.
0:26:22 Yeah.
0:26:22 You know, like…
0:26:25 But I think what you find in reasoning, the analogy to…
0:26:26 Because you’re married to your hypothesis.
0:26:28 Well, yeah, sure, sure.
0:26:30 Yeah, I mean, the…
0:26:34 Yeah, I thought you were saying if the reasoning model doesn’t work, why?
0:26:35 I mean, I think the…
0:26:36 Well, yeah, yeah, yeah.
0:26:36 That’s kind of…
0:26:37 You’re way into the details, yeah.
0:26:42 I mean, the language model analogy for that would be you need better kind of world models
0:26:45 or better pre-trained models in order to get the reasoning to be good.
0:26:49 But it’s, yeah, you just, you build more, you build more capabilities into it.
0:26:51 And I think that there’s probably an order, too.
0:26:58 So, the work that Alex and the EvolutionaryScale folks worked on, a lot of it is protein,
0:27:04 which is interesting because that’s at a kind of smaller resolution, obviously, than the
0:27:06 cellular data, the cell atlas.
0:27:11 But part of the hypothesis is that you can look at all these different cells and you can kind
0:27:16 of simulate how they might behave, but you’re going to have a somewhat shallow understanding
0:27:22 unless you actually have this hierarchical understanding of what, how the subcomponents of the cells
0:27:23 are going to interact.
0:27:30 So, our view is that you basically want to build up a state-of-the-art protein model and
0:27:33 then have that be a part of the state-of-the-art cellular model.
0:27:37 And then once you have that, you build things like the virtual immune system, which allows
0:27:40 you to simulate much more complicated systems.
0:27:45 But it’s sort of this, like, hierarchical approach to building up these virtual models.
0:27:47 That makes a lot of sense.
0:27:52 Because also, as you get into personalization, you’ve got, like, common proteins combining
0:27:54 into a unique cell.
0:28:01 So, that makes it, like, from a systems standpoint, that makes it, like, much more manageable.
0:28:02 That makes a lot of sense.
0:28:03 Interesting.
0:28:03 Yeah.
0:28:04 Wow.
0:28:06 Yeah, no, it’s very fascinating stuff.
0:28:06 Yeah.
0:28:09 So, you guys are announcing some big news this week.
0:28:10 Do you want to give us a sneak preview?
0:28:18 Well, the big news is thinking about how we are going to be coming together as one team.
0:28:25 And, you know, in the past, we’ve run Biohubs and we’ve built software.
0:28:26 We’ve done some AI research.
0:28:31 But all of it has been a little bit decentralized.
0:28:39 But now, under Alex’s leadership, we are going to come together as the Biohub, an operating
0:28:45 philanthropy where we are doing the science in service of a singular goal together.
0:28:52 And how do we actually advance the state of biology and research at the intersection of
0:28:52 AI and biology?
0:28:53 Amazing.
0:28:54 Alex is amazing.
0:28:56 Yeah, no, he’s great.
0:29:00 And then the other thing is that the piece that I mentioned earlier, which is just, yeah,
0:29:02 I mean, CZI has focused on a number of different things.
0:29:05 We’ve really just found over time that we feel like we’ve been able to make the biggest
0:29:06 difference in science.
0:29:08 So we’ve just kept on doubling down on it.
0:29:11 And we’re going to continue doing work in education.
0:29:15 We’re going to continue supporting local communities and those different pieces.
0:29:20 But going forward, the Biohub is really going to be the main thrust of our philanthropy.
0:29:24 And we’re very excited about that because, you know,
0:29:29 when we started, the mission was to see if we could help the scientific community cure and prevent
0:29:30 diseases by the end of the century.
0:29:36 I do think with the advances in AI, that should be possible to do significantly sooner.
0:29:41 And that is a very worthy and important and very exciting goal, and we think we kind of
0:29:46 have a unique place in the ecosystem where we can help empower others to make fast progress
0:29:47 on that.
0:29:53 So there’s obviously, like, plenty of advantages to decentralization from a management
0:29:54 communication overhead and so forth.
0:30:01 And so, like, what are you trying to add by adding this kind of new layer slash unification
0:30:02 on top?
0:30:04 Like, what are the outputs?
0:30:06 And then I guess, what are the complexities to that?
0:30:10 Because that’s, I’m sorry to ask a CEO question.
0:30:12 No, no, I mean, I’m like, I’m obsessed with stuff.
0:30:13 I’m obsessed with stuff.
0:30:14 We think about this.
0:30:14 You want to go for it?
0:30:15 And then I can jump in.
0:30:15 Yeah.
0:30:22 So there are obviously amazing groups doing frontier AI and a lot of groups doing great frontier
0:30:22 biology.
0:30:28 And what we think we can do uniquely is actually tie these two together.
0:30:32 And we’ve funded data sets, we’ve built data sets.
0:30:38 We’re, like, building the instrumentation now to be able to look at the cell, whether it’s,
0:30:44 you know, at the level of tissue and cell-cell communication, or with cryo-EM, where we can look at the cell at
0:30:45 nearly atomic level.
0:30:53 So we have the ability to not only build the data sets, but actually shape and form them the
0:30:58 way we want based on what we see as necessary to complement the existing body of knowledge.
0:31:04 And so we have amazing teams doing that work and we’re building these AI models.
0:31:11 And so the reason to do it together is that then we can actually complete the flywheel.
0:31:16 Like, you know, the model is looking like it has some gaps and blind spots in this area.
0:31:17 Okay, who do we talk to?
0:31:20 How do we build the next data set?
0:31:26 And, you know, we’re seeing this in the lab, like, the metadata is going to be so rich that
0:31:28 we can feed back into the way that we do this modeling.
0:31:35 And so if we can close that loop, which is our goal in bringing everyone together, it’s,
0:31:37 I think it’s going to be incredibly powerful.
0:31:41 And it’s more than just, like, you know, writing down a spec and
0:31:42 saying, like, please deliver this.
0:31:49 Like, these people need to be sort of working shoulder to shoulder and shaping each other’s
0:31:55 work for this to actually be a more and more accurate model of how the human cell works.
0:31:59 Well, you know, it’s so interesting because that’s been the biggest
0:32:04 surprise in the industry for us in the AI world, like, forget biology for one second, is that
0:32:11 the domain-specific models have been, like, super interesting.
0:32:16 Like, the original thesis was, well, some AIs are going to get so smart, they’re
0:32:17 going to be smarter than everybody at everything.
0:32:24 But, like, on video models, like, every video model is best at something, but not everything.
0:32:29 And so knowing what problem you’re solving actually turns out to be sort of ironically
0:32:37 very important in AI because you can actually get to a way better result if you put the two
0:32:38 together.
0:32:44 Like, yeah, we’re seeing that over and over and over again in a way that is, I would say,
0:32:48 very counterintuitive to the whole narrative kind of going into it.
0:32:52 And in biology, it used to be the, or at least, you know, one assumption was, well, the data
0:32:53 sets aren’t on the internet.
0:32:57 So part of the reason you need a domain-specific model is that the data sets are not public.
0:33:02 You guys are kind of bucking that trend, too, by creating a lot of open-source access to
0:33:03 the data.
0:33:07 And then even then, it sounds like you’re betting, you know, on the trend that we’re
0:33:08 seeing in other industries.
0:33:12 But still, there will be nuance in how you annotate that data, curate that data.
0:33:13 Well, and how you talk to a scientist, right?
0:33:18 Like, so, because you have to not only know the data and the model and so forth, but, like,
0:33:22 the conversation is what we keep finding out ends up being very, very important, right?
0:33:24 So rich and so important in how you actually.
0:33:28 A scientist isn’t going to talk to it like, you know, I talk to ChatGPT or whatever.
0:33:30 Well, this is the flip fly you can talk to.
0:33:30 Yeah.
0:33:33 That’s really, that’s super exciting.
0:33:36 And the user interface is actually really important.
0:33:40 You talked about, you guys have a founder who’s using CELLxGENE.
0:33:46 That user interface was intentionally designed to not need to have a computational or really
0:33:49 a very deep biological background to be able to use.
0:33:53 Because you want people coming from different fields to look at the problem.
0:33:56 It’s like, look here, help us solve problems here.
0:34:01 And so building that user interface in a way where it’s not a very high barrier to entry,
0:34:07 to be able to poke around and learn something and bring knowledge back to your work, that’s intentional.
0:34:21 And we’re really hoping when we build these virtual models that we get to a place where we can allow a lower and lower barrier to entry for people to say, like, you know, I have some knowledge about this.
0:34:21 Maybe I can contribute.
0:34:28 A very pertinent example is, turns out, I think immunology has a ton to do with neurodegeneration, right?
0:34:31 Seems like immunology is behind all of this.
0:34:31 Everything.
0:34:33 So it might be part of your century vision.
0:34:42 So you need to be able to allow the immunologists to come in and understand neurodegeneration and understand how their world fits in.
0:34:50 And so the more you lower the barrier to entry, the more it allows people to actually think in a truly collaborative and interdisciplinary way.
0:34:53 So will the Biohub grow as a team?
0:34:57 Like, will you employ more people at the Biohub proper?
0:35:04 Or are you moving towards more of a network model with more sites, more labs, more community-driven data sets?
0:35:06 Like, which is the thrust?
0:35:07 Or maybe it’s both?
0:35:08 Probably a little of both.
0:35:11 And we’ve added new Biohubs over time.
0:35:17 And then we’re also building up more of this, like, central AI team.
0:35:17 Cool.
0:35:24 But I think that these organizational questions of how you set this up are fascinating.
0:35:31 And a lot of our approach is sort of informed by what the rest of the field is doing.
0:35:34 Because you kind of think about science as this portfolio, right?
0:35:37 Society has a portfolio of stuff that it’s trying to do.
0:35:45 And in terms of philanthropy, you want to be the most additive that you can be by trying to figure out what else is underrepresented.
0:35:49 So science by default is very decentralized, right?
0:35:54 That’s kind of the way that granting has worked, and the way that I think scientists by default want to work.
0:36:09 So I think a lot of what we’ve found is that figuring out ways to encourage collaboration in ways that otherwise seem very simple but weren’t happening before can unlock a lot of value.
0:36:13 So with the very first Biohub, there were two kind of interesting things.
0:36:17 One was it was this collaboration between UCSF, Stanford, and Berkeley.
0:36:27 And there were all these really smart people at all these different places who previously, I guess in theory, they could have figured out a way to work together, but there was not really a formal construct for them to do that.
0:36:29 And this just allowed a lot more collaboration.
0:36:36 The other one is cross-discipline, basically having biologists sit next to engineers.
0:36:48 And this view that, like, these two disciplines are things that need to... I mean, I’m sure you’ve seen this in a lot of the companies, but, like, there’s so many interesting...
0:36:50 And the companies, they always set them apart.
0:36:57 Well, it’s interesting, no, it’s interesting how many organizational questions or problems you can fix just by having two teams sit together, right?
0:36:59 It’s like, it doesn’t matter what the org chart is or like whatever.
0:37:04 It’s like, you guys need to sit next to each other until you get this thing to work.
0:37:07 And that’s something I really believe in.
0:37:08 And you have time.
0:37:08 Yeah.
0:37:09 You have 10 to 15 years.
0:37:14 Well, no, it’s all like communication is such an underrated problem in general.
0:37:14 Yeah.
0:37:18 In all kinds of, in building anything or solving anything.
0:37:21 So that’s pretty neat.
0:37:21 Yeah.
0:37:24 And it’s just really kind of simple stuff.
0:37:27 But I think it’s sort of novel as a model.
0:37:34 So we’ve now copied this from the first Biohub to the Biohub network and expanded it to other models.
0:37:43 But it’s also just been neat to see other folks who are working in the field also adopt similar models because it’s a pretty intuitive thing.
0:37:49 But, you know, at some point you’ll reach the point where, you know, actually it’s really good to have decentralized work too, right?
0:37:52 So we’re not saying that this is, like, the way that all science should work.
0:37:59 We’re just saying that there’s a space for this that can unlock a lot of value because it, for whatever reason, hasn’t been the default.
0:37:59 Yeah.
0:38:02 And we still rely on like.
0:38:05 Yeah, there’s, like, famous stories at the MIT lab about that.
0:38:07 That’s how they invented lasers and so forth.
0:38:10 You put a bunch of people from different departments in the same space.
0:38:11 The Media Lab.
0:38:11 Yeah.
0:38:14 Well, actually, physics is where we got a lot of the inspiration.
0:38:22 Like physics has just historically been, like labs have just rallied around big projects and big shared resources.
0:38:34 And, you know, we are relatively centralized, but we still depend on a lot of labs who are doing frontier work or complementary work to come together to support those.
0:38:35 There’s that.
0:38:41 But one more thought on your expansion question is like, and maybe this is like the modern AI lab.
0:38:47 We are not expanding like a lot of square footage per se, but we’re expanding our compute.
0:38:48 Yeah.
0:38:50 The researchers, they don’t want employees working for them.
0:38:51 They don’t want space.
0:38:52 Yeah.
0:38:53 They just want GPUs.
0:38:53 Yeah.
0:38:57 So it’s just like, in a sense, that’s new lab space.
0:39:00 It’s much more expensive than wet lab space.
0:39:02 And you guys have always been creative on that.
0:39:07 Even in the last few years, you’ve created ways to share access to compute.
0:39:13 You’ve enabled academic labs to, you know, I forgot the name of your program.
0:39:13 Yeah.
0:39:16 Scientists in Residence or something like that.
0:39:28 The core of it is, you know, if you look at individual labs, like, a large lab would have tens of GPUs.
0:39:33 And we were the first to really build a large-scale compute cluster.
0:39:38 A thousand now, and we have plans to move to the 10,000 range.
0:39:44 And that, one, requires a different type of project, obviously.
0:39:46 You’re able to ask different types of questions.
0:40:03 And it’s a resource that we use, but also we’ve invited scientists to apply and say, like, what question do you have that could use this amount of resource, and be able to sort of seed collaborations that way.
0:40:17 And so if a scientist is out there listening who’s not employed by the Biohub or working at the Biohub but wants to collaborate with the Biohub, you’re going to create really interesting doors to utilize the resources.
0:40:18 That’s awesome.
0:40:18 Yeah.
0:40:20 I mean, the GPUs are somewhat zero-sum, right?
0:40:22 So the data isn’t.
0:40:23 So, yeah.
0:40:24 Yeah.
0:40:24 Fair enough.
0:40:30 So you’re about to celebrate 10 years doing this.
0:40:41 As you look out in the years to come, what else can you tell us about either things that you’re thinking about for the future or maybe even principles or a North Star that’s going to guide how you guys grow and evolve going forward?
0:40:53 You know, it’s been really interesting in the past 10 years because I actually spent the first few years completely envious of people working for for-profit companies because there’s so much clarity.
0:41:00 Like, the market will tell you, whether or not it’s private or public, will tell you if you’re doing a good job.
0:41:01 If they think you’re doing a good job.
0:41:02 If they think you’re doing a good job.
0:41:03 They’re not always right.
0:41:05 They’re not always right.
0:41:06 No, it’s a big difference.
0:41:11 But I was still envious because that was, I was like, I craved that feedback, like, am I doing a good job?
0:41:27 And, you know, 10 years in, the reason why we’re doubling down on biology is, like, not only did we achieve what we said we were going to do when we set out on these projects, it actually delivered more than we thought it was going to.
0:41:30 And I was like, okay, that’s a signal I can latch on to.
0:41:35 And, like, that’s a signal we can really continue doubling down and doing more of that.
0:41:42 And so I think it’s continuing to tolerate the early ambiguity when you’re like, okay, I’m going to do more of this.
0:41:53 And being patient, being willing to have a long time horizon, but being impatient at the same time.
0:42:06 Because it’s all those iterations along the way that have sort of allowed us to get to this place where, you know, we got lucky and were ready, having built data sets, to take advantage of AI and large language models.
0:42:08 It’s because of all the work that we have been doing.
0:42:18 And so being able to continue moving forward in this ambiguity and sometimes lack of signal on a big goal, like, I think we sort of set the DNA for that.
0:42:19 Oh, no pun intended.
0:42:20 Yeah.
0:42:23 But we get to see how many people use the tools and the feedback.
0:42:25 Yeah, yeah, yeah.
0:42:26 Yeah, you have customers, which is pretty cool.
0:42:27 Yeah.
0:42:29 For philanthropy, like, that’s awesome.
0:42:29 Yeah.
0:42:32 No, it’s one of the fun things about building tools.
0:42:37 It’s like, you kind of get to see how valuable do people find the tools?
0:42:40 Do people use the tools in order to publish important work?
0:42:40 Right.
0:42:41 Right, right, right.
0:42:42 Yeah.
0:42:45 Well, I mean, our feedback is, they’re awesome.
0:42:46 Our feedback is great.
0:42:49 And completely unique, by the way.
0:42:53 So, like, the other thing is, like, what would you use if you didn’t have this?
0:42:55 It’s like, there’s nothing.
0:42:58 No, yeah, it’s a real kind of void.
0:43:12 I mean, there’s this whole pipeline that needs to exist, from accelerating basic science, to funding a lot of people to use it, to the biotechs that can start to work on coming up with novel therapies.
0:43:14 And then you get the pharma companies that do them at scale.
0:43:23 And then there’s a space for philanthropy on the other side of public health of basically taking the therapies and kind of bringing them out to everyone in the world.
0:43:28 But this is a space that, I mean, there’s just going to be a huge amount of leverage with AI.
0:43:39 And, yeah, it still seems like there could be a lot more effort in this space around building tools and just accelerating the whole thing a lot.
0:43:44 Yeah, and I do think it is the place where you are completely unique, right?
0:43:49 The other things, there are other people who can do that, but there’s nobody doing what you’re doing.
0:43:50 It’s got good founder market.
0:43:51 Yes, founder market.
0:43:55 If we didn’t exist, would it be a problem?
0:43:55 Yes.
0:43:58 Like, those questions really land.
0:43:59 Yeah.
0:43:59 As a VC.
0:44:03 It’s like one of us is an engineer, the other one is a scientist, doctor.
0:44:03 Yeah.
0:44:05 Very happy in this direction.
0:44:05 Yeah.
0:44:12 We thank you very much, not only for our companies, but for us as humans for working on this work.
0:44:13 It’s amazing work.
0:44:14 Oh, thank you.
0:44:14 Thank you.
0:44:15 Thank you guys.
0:44:16 Thank you so much.
0:44:21 Thanks for listening to this episode of the a16z Podcast.
0:44:28 If you liked this episode, be sure to like, comment, subscribe, leave us a rating or review, and share it with your friends and family.
0:44:32 For more episodes, go to YouTube, Apple Podcasts, and Spotify.
0:44:39 Follow us on X at a16z, and subscribe to our Substack at a16z.substack.com.
0:44:41 Thanks again for listening, and I’ll see you in the next episode.
0:44:46 As a reminder, the content here is for informational purposes only.
0:44:52 It should not be taken as legal, business, tax, or investment advice, or be used to evaluate any investment or security,
0:44:56 and is not directed at any investors or potential investors in any a16z fund.
0:45:01 Please note that a16z and its affiliates may also maintain investments in the companies discussed in this podcast.
0:45:09 For more details, including a link to our investments, please see a16z.com/disclosures.

Priscilla Chan and Mark Zuckerberg join a16z’s Ben Horowitz, Erik Torenberg, and Vineeta Agarwala to share how the Chan Zuckerberg Initiative is building the computational tools that will accelerate the cure, prevention, and management of all disease by century’s end. They explain why basic science needs $100 million-scale projects that traditional NIH grants can’t fund, how their Cell Atlas became biology’s missing periodic table with millions of cells catalogued in open-source format, and why their new virtual cell models will let scientists test high-risk hypotheses in silico before investing in expensive wet lab work. Plus: the organizational shift unifying the Biohub under AI leadership, what happens when biologists and engineers sit side-by-side, and why modern biology labs are expanding compute instead of square footage.

 

Timestamps:

4:17 – Building tools to accelerate scientific discovery

5:47 – The credible path to funding basic science

7:21 – Biohub = Frontier Biology + Frontier AI

9:05 – Challenges building on a 10-15 year timeline

9:43 – How CZI chooses what to work on

11:15 – Making sense of science with LLMs

11:31 – Measuring success in the therapeutic realm

13:32 – “Most diseases should be thought of as rare diseases”

15:39 – Inspiration: building a periodic table for biology

19:27 – Why virtual cells?

21:17 – The Biohub Master Plan

21:51 – How virtual cell models allow more risk taking

28:15 – Bringing CZI & Biohub together

30:32 – Why Biohub matters

33:36 – The importance of interface design in democratizing scientific discovery

35:34 – How Biohub encourages cross-functional collaboration

40:38 – Looking ahead: the broader impact of AI on biotech

Stay Updated: 

If you enjoyed this episode, be sure to like, subscribe, and share with your friends!

Find a16z on X: https://x.com/a16z

Find a16z on LinkedIn: https://www.linkedin.com/company/a16z

Listen to the a16z Podcast on Spotify: https://open.spotify.com/show/5bC65RDvs3oxnLyqqvkUYX

Listen to the a16z Podcast on Apple Podcasts: https://podcasts.apple.com/us/podcast/a16z-podcast/id842818711

Follow our host: https://x.com/eriktorenberg

Please note that the content here is for informational purposes only; should NOT be taken as legal, business, tax, or investment advice or be used to evaluate any investment or security; and is not directed at any investors or potential investors in any a16z fund. a16z and its affiliates may maintain investments in the companies discussed. For more details please see a16z.com/disclosures.
