AI transcript
0:00:11 [MUSIC]
0:00:13 >> Hello, and welcome to the NVIDIA AI podcast.
0:00:15 I’m your host, Noah Kravitz.
0:00:19 My guest today is CEO and co-founder of Recursion,
0:00:22 one of the world’s leading biotech,
0:00:25 or TechBio as they call it, companies in the world.
0:00:29 Chris Gibson started Recursion based on the work he developed while working on
0:00:32 his joint MD-PhD at the University of Utah,
0:00:37 and today the company is dedicated to coding biology in the name of radically improving lives.
0:00:41 Recursion is building one of the largest proprietary datasets in their field.
0:00:44 They just took the wraps off one of the most powerful supercomputers in the world,
0:00:48 and they’re one of the companies leading the growing field of AI-powered drug discovery.
0:00:52 Chris is here to tell us all about it and then some, so let’s dive right in.
0:00:56 Chris Gibson, welcome, and thank you so much for joining the NVIDIA AI podcast.
0:00:58 >> Thanks, Noah. I’m delighted to be here.
0:01:01 >> As is often the case with these episodes,
0:01:04 I try to pack everything I can into the intro to set the stage.
0:01:07 You’ve done so much, you’re working on so much,
0:01:08 but let me turn it over to you.
0:01:12 Maybe we can start by you explaining to the audience what Recursion is all about.
0:01:14 >> Yeah. I want to tell you about Recursion,
0:01:17 but first I want to tell you about the problem that we’re solving.
0:01:17 >> First.
0:01:22 >> Everybody knows somebody who has a disease where there’s not a good treatment.
0:01:24 You’ve got a relative who’s died of cancer,
0:01:26 you have somebody who’s suffering from Alzheimer’s or Parkinson’s,
0:01:29 or some of these other devastating diseases.
0:01:33 Today, despite all of the incredible technology in the world,
0:01:37 90 percent of drugs that go into clinical trials fail.
0:01:40 Nine out of 10 drugs that our industry puts into clinical development
0:01:42 fail before they get to patients.
0:01:48 What that tells me is that biology is just this massive quagmire of complexity.
0:01:53 It’s just dramatically complex to a level where despite all of these hundreds of
0:01:57 thousands of scientists around the world and about $50 billion a year of R&D
0:02:01 investment from the industry, we still aren’t that good at it.
0:02:07 We imagined that perhaps there was a way that you could bring together new technologies
0:02:10 to try and take a less biased approach to biology,
0:02:12 to try and step back and say,
0:02:16 instead of trying to understand how every gene interacts with every gene and
0:02:21 every drug interacts with every gene and build it all in our heads,
0:02:23 which is the way it’s been done traditionally.
0:02:25 Could we actually take an industrial approach,
0:02:28 build maps of biology using automation,
0:02:33 using AI, and then leverage those to tell us where to go,
0:02:37 to essentially have the algorithm tell us where to develop a medicine.
0:02:40 That’s what we’ve been working on for the last 10 years.
0:02:43 Now we’re what’s called a clinical stage biotech company.
0:02:46 We’ve got five drugs that are in the clinic ourselves.
0:02:49 We’ve got big partnerships with companies like Roche Genentech and
0:02:53 Bayer to bring drugs into really hard areas of biology.
0:02:56 We just built this incredible supercomputer with NVIDIA,
0:02:58 one of our partners as well.
0:03:03 So we’re I think really one of the companies leading at the intersection of biology,
0:03:06 but also of technology.
0:03:08 >> A million things to get into.
0:03:09 But before we go forward,
0:03:11 I want to ask: you used the word map.
0:03:14 You talked about an industrial approach mapping biology,
0:03:17 and that made me think of things like mapping the human genome.
0:03:19 Is there a parallel there?
0:03:21 Is it a similar approach or how does that work?
0:03:23 >> Yeah, so traditionally in our industry,
0:03:25 people have worked on one disease at a time.
0:03:26 >> Okay.
0:03:29 >> And so if we wanted to go after a disease together,
0:03:31 we’d go read the literature,
0:03:32 we’d build a team,
0:03:37 and then we would build specific experiments for that one disease.
0:03:37 >> Right.
0:03:40 >> And we would generate data and five to 10 years later,
0:03:42 maybe we would have a drug going into the clinic and
0:03:44 you know what the success rate is there.
0:03:45 It’s a 90% failure.
0:03:47 There’s a different kind of approach,
0:03:49 which is instead of working on one thing at a time,
0:03:55 can we build vast data sets that span very large scales?
0:03:58 And I think the human genome project is a great example.
0:04:00 People said let’s map the entire human genome.
0:04:03 And now today there’s tens of millions of genomes
0:04:04 that have been mapped
0:04:06 and we can compare them and contrast them.
0:04:08 We’re using some of those same data.
0:04:12 And so we’ve taken this approach: to invest more initially
0:04:15 to build really large, complex data sets
0:04:16 that at the very beginning,
0:04:18 you’re paying a lot to build this data set
0:04:21 and you don’t yet have enough data to actually make any progress
0:04:22 against any disease.
0:04:23 But over time,
0:04:26 you start to build these network effects where today,
0:04:29 if we run an experiment in our automated laboratory
0:04:30 and we get some result,
0:04:34 we can compare it to over 250 million experiments
0:04:35 that we’ve run over the past few years
0:04:37 and all of that data is relatable.
0:04:38 So instead of one disease at a time,
0:04:40 instead of slices of biology,
0:04:42 we’re actually building like a volume
0:04:45 and we’re sparsely sampling this volume
0:04:47 and then using AI to sort of fill in
0:04:50 what we can predict about the rest of it.
0:04:53 – Right, is that a unique approach in the industry?
0:04:55 – I think it’s pretty unique in the industry.
0:04:56 Yeah, there’s a handful of companies
0:04:58 that are taking similar approaches.
0:05:01 I’m unaware of any company that has generated a data set
0:05:03 that is this broad.
0:05:05 So we’ve knocked out with CRISPR-Cas9
0:05:07 that some of your listeners may have heard of.
0:05:11 It’s like a molecular scissors that lets us cut out genes.
0:05:13 We’ve knocked out every gene in the human genome
0:05:16 in multiple different human cell types.
0:05:18 We’ve profiled millions of molecules
0:05:21 and all of these data exist now in multiple layers.
0:05:23 We call it omics.
0:05:24 People may have heard of like genomics.
0:05:27 Well, we’re building phenomics and transcriptomics
0:05:29 and invivomics and proteomics
0:05:31 and all of these omics data layers.
0:05:32 And you can think of it as,
0:05:34 go back to like the Google Maps days, right?
0:05:36 Started out with maps from airplanes
0:05:38 and we got kind of where all the streets were
0:05:40 and then eventually you had street view
0:05:42 and cars were driving around.
0:05:43 We’re building all these same layers
0:05:45 but instead of doing it in the physical world,
0:05:49 we’re using a massive automated laboratory full of robots
0:05:50 to do millions of experiments
0:05:52 to figure out what the roads and streets
0:05:54 and intersections are of biology and chemistry.
0:05:56 And it’s a complex space.
0:05:58 I mean, I think it’s about as complex a problem
0:05:59 as one can work on.
0:06:01 – I can only imagine.
0:06:03 Maybe by the end of this conversation,
0:06:06 I’ll know more about what I don’t know at the very least.
0:06:09 So maybe we can start with the graduate work
0:06:11 that kind of seeded Recursion.
0:06:12 And if this is the wrong approach,
0:06:13 take a different one,
0:06:16 but I’m imagining that might help me and the audience
0:06:19 sort of to understand the problem,
0:06:20 the scope of the problem
0:06:23 and how and why you started with this approach.
0:06:24 – Yeah, of course.
0:06:26 I think this is a great way to kind of explain
0:06:27 what we’re doing at Recursion.
0:06:30 So I joined the lab of a guy named Dean Li.
0:06:32 Dean’s actually now the president of Merck Research Laboratories.
0:06:35 So he’s like the head scientist of Merck.
0:06:37 A physician-scientist, brilliant guy.
0:06:39 The lab was so diverse.
0:06:43 There were physicians and engineers and geneticists
0:06:45 and molecular cellular biologists.
0:06:47 And we were working on all these cool, hard problems.
0:06:49 And one of the things we were working on
0:06:52 was this rare genetic disease
0:06:54 called cerebral cavernous malformation.
0:06:57 And I bet about 1% of your audience has heard of it
0:07:01 because about 1% of the people in the world
0:07:03 have this disease.
0:07:06 So it’s like a rare disease, but it’s not that rare.
0:07:10 It’s six times more prevalent than cystic fibrosis.
0:07:12 But there’s no drug, there’s no treatment.
0:07:13 And so because of that,
0:07:15 people don’t know about the disease as much.
0:07:18 And we were trying to figure out how this disease works.
0:07:20 And we used traditional molecular
0:07:21 and cellular biology approaches
0:07:24 where I could hop on the whiteboard that’s behind me
0:07:27 and I could draw protein X goes to protein Y
0:07:28 goes to protein Z.
0:07:30 And we think protein Z gets too high
0:07:32 and that causes the disease.
0:07:35 And after a decade of working on this,
0:07:37 we think we figured it out.
0:07:40 We’re sitting in lab and we think this protein called RhoA
0:07:42 is what’s causing the disease.
0:07:46 And we take a RhoA inhibitor and we put it in mice.
0:07:47 And five months later, I remember sitting in lab
0:07:50 meeting, we unveiled the data.
0:07:51 Oh, we changed the mice.
0:07:53 We changed the mice in the wrong direction.
0:07:54 They got worse.
0:07:56 They got more of these lesions.
0:07:58 And this is one of those problems with biology.
0:08:02 It’s like as humans, we are reductionist problem solvers.
0:08:04 We take a really complex system
0:08:06 and we try and reduce it down to these core elements
0:08:08 of protein A, protein B, protein C
0:08:09 so that we can put it on a whiteboard
0:08:11 or put it in a Nature paper.
0:08:12 The reality is, in biology,
0:08:14 there’s probably hundreds of interconnections,
0:08:17 thousands of feedback loops that are all working together.
0:08:19 And if you take a reductionist approach, I would argue
0:08:23 that’s maybe why we’re failing 90% of the time in the clinic.
0:08:26 That the way we need to actually explore biology
0:08:28 is not to take a reductionist approach,
0:08:30 but to up-level our understanding of biology,
0:08:32 to understand the whole complex system
0:08:35 and to build maps that truly embrace
0:08:37 how every gene interacts with every other gene.
0:08:38 And so that’s hard.
0:08:41 I mean, we’ve been working at this for 10 years,
0:08:44 but we took a very early version of this approach
0:08:47 in Dean’s lab after that failure.
0:08:50 We took microscopy images of human cells
0:08:52 where we were modeling this disease
0:08:54 and we took microscopy images of human cells
0:08:56 that were healthy and we trained
0:08:59 a basic machine learning classifier to recognize the two.
0:09:01 And then we added thousands of drugs to the disease cells
0:09:04 and we simply asked the machine learning classifier,
0:09:06 do any of the disease cells look healthy again?
0:09:07 Without any understanding
0:09:09 of what else was happening in biology.
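The screen described here (train a classifier on healthy versus disease cells, then ask whether any drug-treated disease cells look healthy again) can be sketched in a few lines. This is only a toy illustration under loud assumptions: the two-number "features" stand in for whatever was extracted from the microscopy images, the nearest-centroid rule stands in for the unspecified "basic machine learning classifier", and the compound names and their effects are invented.

```python
import random

def make_cells(center, n, rng, spread=0.5):
    """Simulate n cells as 2-D feature vectors scattered around a center (synthetic data)."""
    return [(rng.gauss(center[0], spread), rng.gauss(center[1], spread))
            for _ in range(n)]

def centroid(cells):
    """Mean feature vector of a group of cells."""
    n = len(cells)
    return (sum(c[0] for c in cells) / n, sum(c[1] for c in cells) / n)

def sq_dist(a, b):
    return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2

rng = random.Random(0)

# "Training": learn one centroid per class from labeled example cells.
healthy_centroid = centroid(make_cells((0.0, 0.0), 200, rng))
disease_centroid = centroid(make_cells((5.0, 5.0), 200, rng))

def looks_healthy(cell):
    """Nearest-centroid rule: is the cell closer to the healthy class?"""
    return sq_dist(cell, healthy_centroid) < sq_dist(cell, disease_centroid)

# Hypothetical screen: disease cells after treatment with each compound.
treated = {
    "compound_A": make_cells((5.0, 5.0), 100, rng),  # no effect
    "compound_B": make_cells((0.3, 0.3), 100, rng),  # shifts cells back toward healthy
    "compound_C": make_cells((5.0, 4.0), 100, rng),  # changes cells, but not toward healthy
}

def healthy_fraction(cells):
    return sum(looks_healthy(c) for c in cells) / len(cells)

# A "hit" is a compound after which most treated cells are classified healthy.
hits = [name for name, cells in treated.items() if healthy_fraction(cells) > 0.9]
print(hits)
```

Note that nothing in the sketch encodes a mechanism: the classifier only asks whether treated cells resemble healthy ones, which is the point being made in the conversation.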
0:09:13 And today, Recursion is a few months away
0:09:15 from reading out a phase two clinical trial
0:09:19 of a drug that we discovered doing that work.
0:09:23 It was a totally surprising way that this drug was working.
0:09:25 It was not going after RhoA or anything else.
0:09:27 And so I guess the point of this is
0:09:28 when you embrace the complexity of biology
0:09:31 and let biology give you the answer,
0:09:33 it can be surprising, it can challenge dogma.
0:09:35 But if you’re willing to follow that,
0:09:37 our belief is that eventually
0:09:39 with enough technology and investment,
0:09:41 this could be a more sustainable industrial way.
0:09:44 And so I finished my PhD,
0:09:45 I took a leave of absence from medical school,
0:09:47 subsequently dropped out
0:09:49 because Recursion ended up kind of taking off
0:09:50 and started the company.
0:09:52 And we’ve been building for the last 10 years
0:09:55 at this interface of tech and bio.
0:09:58 – So 10 years ago, or thereabouts,
0:10:00 when you got these lab results back
0:10:04 and things were going in the wrong direction,
0:10:05 was there a sense either with you
0:10:09 or colleagues in the lab or just colleagues in general,
0:10:11 was there a sense of like,
0:10:14 we know that there’s a different approach
0:10:16 to embrace the complexity,
0:10:18 but we just can’t do it right now
0:10:20 ’cause it’s too big for the human brain
0:10:24 or even dozens of the best human brains working in parallel.
0:10:27 It’s just too much information to sift through.
0:10:28 And did you know then,
0:10:30 I mean, you said you used an ML classifier,
0:10:32 but was it a thing of like,
0:10:33 if only the tech was a little better
0:10:35 or what was kind of the mind state then?
0:10:37 – Yeah, I think it was,
0:10:40 so we had this idea to use technology
0:10:43 to try and take a less biased approach.
0:10:45 We’re not the first people to have done this.
0:10:46 There’s a handful of other people
0:10:48 that were working on similar things at the time,
0:10:50 but AI wasn’t really being,
0:10:52 this is 2011-ish,
0:10:55 like AI is not really being thrown around as a term,
0:10:57 it’s kind of machine learning at the time.
0:10:59 People aren’t doing a lot of work in neural nets.
0:11:01 I mean, like ImageNet hasn’t even come out yet.
0:11:03 And so we were like,
0:11:06 among a very early wave and certainly in biology,
0:11:09 among a deeply early wave of people saying,
0:11:12 let’s use like computer vision to look at images.
0:11:13 And the work had actually really been pioneered
0:11:16 by a woman named Anne Carpenter at the Broad Institute.
0:11:18 She built the software tool that we used
0:11:19 that helped us go fast.
0:11:21 And yeah, we wanted the software to be better,
0:11:22 but at the same time,
0:11:25 we were pushing up against the frontier
0:11:27 at the time in biology.
0:11:29 And so what we’ve done now at Recursion
0:11:32 is we’ve pioneered the industrialization of this approach.
0:11:35 And what I joke about is if you visited our headquarters here
0:11:37 and you saw a robotic laboratory,
0:11:39 there’s like a sad and exciting fact.
0:11:41 And that is that this robotic laboratory
0:11:44 does the equivalent of all the experiments
0:11:46 I did in my entire five years of my PhD,
0:11:49 every 15 minutes on average.
0:11:51 And so like that is both sad in some ways
0:11:52 and kind of exciting in others.
0:11:54 – I can only imagine you’re humbled.
0:11:55 And at the same time,
0:11:57 you’re also like excited in a way.
0:11:58 Yeah, yeah, it’s amazing.
0:12:01 So I don’t know if you want to jump all the way
0:12:03 to the present or what the best way is to walk us through,
0:12:06 but I want to get to what Recursion is doing now.
0:12:09 – Yeah, I mean, I think we can jump to where we are today,
0:12:11 which is we’ve taken this philosophy
0:12:14 of creating virtuous cycles
0:12:15 of what we call wet lab and dry lab.
0:12:18 And what that means is empirical data generation.
0:12:20 So a wet lab where we’re doing real experiments
0:12:21 with human cells and a dry lab,
0:12:23 which is our supercomputer system
0:12:25 and all of our software tools and AI tools.
0:12:28 And I think that what we’re doing here at Recursion
0:12:31 is analogous to so many technology companies.
0:12:34 So Netflix is recording what you’re watching,
0:12:35 when you’re watching it, when you turn it on,
0:12:38 when you turn it off, who’s watching it in the household,
0:12:40 which scenes you’re turning it off on.
0:12:43 And they’re actually now then making predictions
0:12:45 and sort of A/B testing and going back through this loop
0:12:49 to the point today where Netflix is generating content
0:12:51 based on an algorithmic suggestion
0:12:54 of what’s gonna be popular for people.
0:12:56 This is like the drug discovery version of that.
0:12:59 We’re doing experiments, we’re breaking genes
0:13:01 and adding compounds and combinations of those things
0:13:02 in different human cell types.
0:13:06 And our A/B experiment is we make a bunch of predictions
0:13:08 about what genes and what drugs are connected to each other.
0:13:10 And then the next week we go back
0:13:11 and we test those predictions
0:13:13 and create this kind of flywheel approach.
0:13:16 The problem, of course, is that biology
0:13:19 is just so, so, so complex.
0:13:22 There’s this combinatorial explosion
0:13:23 of what we always joke about.
0:13:26 There’s about 21,000 human genes.
0:13:28 And in biology, there’s this really cool thing
0:13:30 called like synthetic lethality
0:13:32 or like synthetic relationships
0:13:36 where you can start to predict that two genes are related.
0:13:37 If you break both those genes
0:13:40 and you get an unexpectedly large or small effect,
0:13:41 it tells you that they might be
0:13:43 in some kind of feedback loop together.
0:13:45 If we were gonna do this synthetic experiment
0:13:49 of knocking out every gene with every other possible gene,
0:13:52 it’s about 250 million experiments.
0:13:52 It’s doable.
0:13:54 We’ve done 250 million experiments.
0:13:56 It’s taken us 10 years to get there,
0:13:58 but we now do up to 2.2 million a week.
0:14:00 So like this is feasible.
0:14:02 But if I just said, what if we did gene by gene by gene?
0:14:03 Three genes.
0:14:06 Now you’re talking about trillions of experiments.
0:14:07 And if you imagine doing four genes,
0:14:10 like you instantly in biology to really explore this,
0:14:12 you get to this combinatorial explosion
0:14:13 where we can’t brute force it.
0:14:15 We’re not gonna be able to empirically do it.
0:14:17 And so this is the beauty of AI,
0:14:20 just like in all of these other technology fields.
0:14:23 If you can sparsely fill this massive volume,
0:14:26 this matrix of genes and compounds and cell types
0:14:30 and interactions, ML and AI tools are often,
0:14:32 if the data is robust, really good
0:14:33 at helping you fill in and predict
0:14:35 what’s gonna happen in between.
0:14:36 And I think that’s what we’re trying to build.
0:14:39 That’s what that map is to us.
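The combinatorial explosion described above is easy to check with a little arithmetic. The sketch below uses the two figures quoted in the conversation (about 21,000 genes and up to 2.2 million experiments a week) and assumes, purely for illustration, one experiment per unordered gene combination in a single cell type with no replicates.

```python
from math import comb

GENES = 21_000          # approximate human gene count quoted above
PER_WEEK = 2_200_000    # peak weekly experiment throughput quoted above

# Number of distinct gene combinations when knocking out k genes at once.
counts = {k: comb(GENES, k) for k in (2, 3, 4)}

for k, n in counts.items():
    years = n / PER_WEEK / 52
    print(f"{k} genes at a time: {n:.2e} combinations, roughly {years:,.0f} years at peak rate")
```

Under these assumptions, pairs come to about 2.2 × 10^8 combinations, the same order of magnitude as the roughly 250 million quoted, and about two years of work at the peak rate. Triples come to about 1.5 × 10^12, which is more than ten thousand years at the same rate, and that is why brute force stops being an option.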
0:14:40 – And so forgive me,
0:14:42 this is a very sort of simplistic question,
0:14:45 but is it a case of where you’re running experiments
0:14:48 in the dry lab, doing it all in the AI,
0:14:49 you know, the computerized system,
0:14:52 and then sort of the ones that have the best chance
0:14:54 of going somewhere go into the wet lab?
0:14:55 Is that kind of the basic?
0:14:57 – Yeah, that’s exactly right.
0:15:01 Yeah, and then what’s cool about these virtuous cycles
0:15:04 is if the prediction we made from the dry lab
0:15:07 goes to the wet lab and it works, it’s really robust.
0:15:08 Awesome, okay, we’ve got a new program.
0:15:10 Like let’s take it forward towards patients.
0:15:12 If it doesn’t work,
0:15:14 we’ve now that week generated a bunch of new data
0:15:17 upon which we can retrain the model.
0:15:20 And that new data is existing in a space
0:15:22 where the model was making a poor prediction.
0:15:24 And so by definition, week over week,
0:15:26 as you keep kind of going through this virtuous cycle,
0:15:28 despite the complexity of biology and chemistry,
0:15:30 you start to make progress
0:15:32 because there are areas of biology
0:15:35 that are kind of like, you know, like the Midwest,
0:15:38 like, you know, cornfields as far as you can see,
0:15:42 it kind of looks reasonably similar 100 miles apart.
0:15:44 And there’s other places in biology
0:15:47 that are like, you know, the Rocky Mountains or, you know,
0:15:49 whatever, and it’s like, you go 10 feet to the left
0:15:50 or 10 feet to the right,
0:15:52 dramatically different situation.
0:15:55 And so we can help to hone in with this iterative approach
0:15:58 around the parts of biology where we need more data
0:16:00 and the parts where we need less.
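The cycle described above (predict in the dry lab, test in the wet lab, feed the misses back in) can be illustrated with a toy adaptive-sampling loop. Everything here is an assumption made for illustration: `wet_lab` is an invented one-dimensional landscape that is flat on one half and sharply curved on the other, linear interpolation stands in for the predictive model, and the tolerance is arbitrary.

```python
def wet_lab(x):
    """Hypothetical ground truth: flat for x <= 0.5 ("cornfields"), sharply curved after."""
    return 0.0 if x <= 0.5 else 8.0 * (x - 0.5) ** 2

def run_cycles(tol=0.05):
    """Repeat predict -> test -> refine until every prediction is within tol."""
    measured = {0.0: wet_lab(0.0), 1.0: wet_lab(1.0)}  # initial experiments
    open_intervals = [(0.0, 1.0)]  # regions where the model is still unverified
    rounds = 0
    while open_intervals:
        rounds += 1
        next_intervals = []
        for lo, hi in open_intervals:
            mid = (lo + hi) / 2
            predicted = (measured[lo] + measured[hi]) / 2  # dry lab: interpolate
            observed = wet_lab(mid)                        # wet lab: run the experiment
            measured[mid] = observed                       # new data is kept either way
            if abs(predicted - observed) > tol:
                # Poor prediction: this region needs more data, so refine both halves.
                next_intervals += [(lo, mid), (mid, hi)]
        open_intervals = next_intervals
    return measured, rounds

measured, rounds = run_cycles()
flat = sorted(x for x in measured if x < 0.5)
curved = sorted(x for x in measured if x > 0.5)
print(rounds, len(flat), len(curved))
```

With these made-up numbers the loop stops after four rounds, having spent two measurements on the flat half and eight on the curved half. The extra data accumulates exactly where the early predictions were wrong, which is the behavior the Midwest and Rocky Mountains analogy describes.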
0:16:04 – Are there certain areas of biology that are,
0:16:06 and again, I’m gonna be overly simplistic here,
0:16:09 that are easier or harder to,
0:16:12 and I don’t know if it’s to map out or to understand
0:16:15 or to be able to, you know, sort of act on?
0:16:17 – For sure, so there are areas of biology
0:16:19 that we think are easier to start with.
0:16:20 And there are areas of biology
0:16:24 where we know exactly what is causing the disease.
0:16:26 So for me, there’s three areas there.
0:16:30 One is genetic diseases, like cystic fibrosis.
0:16:34 We know that mutations in the CFTR gene cause cystic fibrosis.
0:16:37 Another good example are some cancers
0:16:39 where we know that mutations in certain genes
0:16:40 cause those cancers.
0:16:42 And the third, infectious diseases.
0:16:45 We know that like this virus or this bacteria
0:16:46 causes this disease.
0:16:48 What we like about those is that if we model those
0:16:52 in cells in our lab, you know, we add the virus
0:16:53 or we break the gene,
0:16:56 we know that we’ve at least partially recapitulated
0:16:58 something relevant about the biology.
0:17:01 The harder areas are like in neuroscience.
0:17:03 So take like Alzheimer’s.
0:17:05 We don’t really know what causes Alzheimer’s.
0:17:07 There’s debate, some people think they do,
0:17:09 but most of the drugs against those targets have failed.
0:17:12 So like we don’t really know what causes it.
0:17:15 And so how do you go about trying to understand Alzheimer’s?
0:17:17 And that’s actually neuroscience broadly
0:17:20 is probably one of the areas that’s hardest
0:17:22 because like we really don’t know
0:17:24 what causes a lot of different neurologic diseases.
0:17:28 And so we’ve actually partnered with Roche and Genentech,
0:17:30 the, you know, one of the biggest biopharma companies
0:17:32 there is, one of the most innovative
0:17:36 in a decade-long collaboration to go map the genome
0:17:39 in neural-specific cells.
0:17:42 So we can start to uncover some of these answers.
0:17:45 But those areas, those are the hard parts of biology.
0:17:46 – Right, right.
0:17:47 Should we talk supercomputers?
0:17:49 – Yeah, let’s talk supercomputers.
0:17:51 – All right, so I would imagine taking, you know,
0:17:54 somewhat of a rogue isn’t the right word,
0:17:56 but a different approach from the beginning, right?
0:17:59 The unbiased sort of industrialized approach.
0:18:02 And then now talking about, you know, mapping things out
0:18:05 and there’s so much. We had a guest on recently
0:18:08 who talked about, you know, he had a great metaphor
0:18:09 for the difference between what we know
0:18:12 and what we don’t know about his field, right?
0:18:14 And so similar thing, right?
0:18:17 So I would imagine that the amount of data,
0:18:19 the amount of compute, it’s like with any other problem,
0:18:20 it just goes up and up and up.
0:18:23 So over the years, maybe you can talk a little bit
0:18:25 about kind of the technological path
0:18:29 that led you to, we’re at BioHive 2 now.
0:18:31 – All right, so maybe you can walk us through that a little bit.
0:18:34 – Yeah, I mean, we started out on my laptop
0:18:38 and then we moved to a server we built in-house called Golgi,
0:18:41 which we eventually mourned when we deprecated that.
0:18:44 But today we’ve got about 50 petabytes
0:18:47 of proprietary biological data.
0:18:48 And these data are a little bit different
0:18:51 from sort of like text and other things, some of it’s text.
0:18:54 But a lot of it is images, they’re very memory rich,
0:18:57 memory intensive, you know, they’re like big images.
0:19:00 And so some of the traditional cloud-based approaches
0:19:03 don’t quite work for us because you end up starving the GPU
0:19:05 from just the memory side.
0:19:10 And so we in 2020 decided we needed to build a supercomputer
0:19:13 in order to make the best use of this giant dataset
0:19:15 we’ve accumulated and built.
0:19:17 And so we built a supercomputer called BioHive 1.
0:19:21 It was at the time on the TOP500 list when it came out,
0:19:24 I think like 85th or 58th or something like that.
0:19:27 And what we found was that the scaling laws apply,
0:19:30 the bitter lesson holds in biology.
0:19:33 And that is that more data and more compute
0:19:35 both give you better outcomes.
0:19:37 And we’re generating more data every day.
0:19:40 And so we’re, I think in biology,
0:19:44 probably data is the bigger bottleneck than compute.
0:19:48 You know, there’s not the same arms race as in other fields
0:19:51 because there’s not yet enough relatable data
0:19:54 like we’re building for everybody to be doing this.
0:19:55 We are building the data,
0:19:57 but we also knew we needed more compute.
0:20:00 And so just maybe six months ago,
0:20:03 we announced actually NVIDIA invested in us last summer.
0:20:06 And then as we continued building our relationship with them,
0:20:09 we ultimately decided to build BioHive 2,
0:20:12 which just came out on the TOP500 list.
0:20:14 It’s number 35.
0:20:18 It’s about 23 petaflops and it’s 504 H100s.
0:20:23 And now we just deprecated BioHive 1 with our 300 plus A100s
0:20:26 and we’re moving those over to the new facility.
0:20:31 And so we’ll have this Frankenstein A100 H100 giant supercomputer,
0:20:33 at least in our field. Little Recursion,
0:20:38 this biotech, TechBio startup in Salt Lake City, now owns and operates
0:20:41 the fastest supercomputer in the world of any biopharma company,
0:20:44 you know, from Pfizer, you know, all the way down.
0:20:46 -Amazing. -It’s pretty cool.
0:20:48 -Yeah, yeah, no, it’s amazing.
0:20:50 We, you know, over the years of doing the podcast,
0:20:56 this phrase democratization of tools has always been almost like a mantra.
0:21:00 And, you know, for a while it was thinking about like a lawyer,
0:21:04 his computer in his apartment office has a GPU
0:21:07 and he realized he could use it to do some machine learning stuff.
0:21:09 And, you know, similar thing, right?
0:21:11 You guys are a much bigger scale than that.
0:21:15 But as you said, compared to the longstanding giants of the industry,
0:21:17 you know, for you to be able to build that is amazing.
0:21:20 Do you have a good story or a lesson learned
0:21:26 or something sort of surprising, specific to building BioHive 2?
0:21:29 And that process, that was something that, you know,
0:21:32 a hurdle you had to overcome or something you were expecting
0:21:34 was going to go one way and it turned out to be something different
0:21:36 or, you know, maybe just an anecdote?
0:21:39 -Yeah, I mean, building BioHive 2 was a tremendous effort.
0:21:43 Multiple teams coming together: the Recursion team, the NVIDIA team,
0:21:48 other suppliers’ teams. We ended up wanting to get this done
0:21:51 in time for submission to the most recent TOP500 list.
0:21:53 -Okay. -And, you know, as is the case right now
0:21:56 with the popularity of GPUs and the importance of AI,
0:21:59 we got our GPUs, we got our cluster,
0:22:03 but we were missing racks and cables, certain racks and certain cables,
0:22:08 and those ended up coming in 15 days before the TOP500 submission.
0:22:12 And so, our team was sleeping on cots in the data center
0:22:16 and the NVIDIA team joined us, like the amount of effort that they put in.
0:22:22 We got that thing up and running, burned in and benchmarked in 15 days
0:22:23 from getting all the materials.
0:22:29 It ended up being, it’s the first on-slab H100 setup, as far as I’m aware,
0:22:32 and it ended up having really, really good performance.
0:22:33 -Excellent.
0:22:35 My guest today is Chris Gibson.
0:22:39 Chris is co-founder and CEO of Recursion,
0:22:42 one of the world’s foremost biotech,
0:22:44 TechBio, I want to ask you about that in a second,
0:22:46 companies in the world based out of Utah,
0:22:48 and as Chris was just detailing,
0:22:53 they are now the proud owners and maintainers.
0:22:56 Was it the 35th ranked?
0:22:58 -Yeah, 35th. -Supercomputer in the world, BioHive 2.
0:23:03 TechBio, I assume, it’s just kind of a flip on the term biotech
0:23:07 that the tech is kind of leading the way forward?
0:23:08 -Yeah, that’s right.
0:23:10 I mean, the term biotech was coined in the ’80s
0:23:14 to talk about using human proteins as drugs.
0:23:16 -Okay. -And at the time, that was tech.
0:23:18 And it was Genentech that really started this off.
0:23:20 And so credit to them, they were first.
0:23:23 They coined their industry biotech.
0:23:25 We would have called it biotech, but that was taken.
0:23:28 And so somebody in our field called it TechBio,
0:23:30 and the tech is kind of at the forward side.
0:23:34 And it’s a good term to describe the few hundred companies now
0:23:37 that are more digitally native, that are building in this space,
0:23:40 where technology is really at the foundation of what they build.
0:23:41 -Got it.
0:23:44 Also in prepping for this conversation,
0:23:48 I noticed the name BioHive being used.
0:23:50 It’s the name of a public-private partnership
0:23:52 that I think you also have a hand in.
0:23:54 So maybe you can talk a little bit about that.
0:23:56 As I was listening to you earlier,
0:23:59 I was thinking about some of the recent podcasts we’ve done.
0:24:02 We’ve done one or two others in the drug discovery space
0:24:06 and a bunch of others in the sciences.
0:24:08 And more and more, we’re hearing about,
0:24:10 I think the technology is a big part of it,
0:24:13 obviously, in making an open-source collaboration
0:24:16 and just other collaboration with large amounts of data,
0:24:20 easier to do across the world.
0:24:23 So I’m interested both in the specifics of BioHive
0:24:27 and then also just kind of your take on the current state of,
0:24:31 you know, science companies, biotech companies,
0:24:33 working in sort of a capitalist environment
0:24:37 where you’re trying to, you know, compete, you are competing.
0:24:38 But then at the same time,
0:24:39 given the work that you’re doing,
0:24:41 sharing is obviously quite important.
0:24:43 So maybe you can start with BioHive
0:24:46 and kind of get into just what collaboration is like
0:24:47 in the field today.
0:24:50 -Yeah, I think this new generation of TechBio companies,
0:24:54 we really do believe that we will all do better
0:24:57 if we all are collaborating at least
0:24:59 at certain stages of the process.
0:25:01 Because, you know, like at the end of the day,
0:25:03 if there’s a disease that doesn’t have a treatment,
0:25:05 we all still have work to do.
0:25:08 And so we believe a lot in investing
0:25:09 in the communities in which we work.
0:25:12 We acquired a company last year called Valence
0:25:16 that now basically is building out a community
0:25:19 for people to host, you know, different foundation models
0:25:22 and other sorts of things that are important across biology.
0:25:23 They’re hosting data sets.
0:25:27 They’re really investing in building that community.
0:25:30 And we also believe that’s true in the locations we work.
0:25:35 And BioHive is what we call the kind of ground-level
0:25:38 organization that’s helping to brand, bring together
0:25:41 and build the life science ecosystem in Utah.
0:25:43 We also do investments in our other, you know,
0:25:47 we’ve got offices in Toronto, Montreal, London,
0:25:49 and Milpitas, California.
0:25:50 And so we also make investments in those places
0:25:53 because we feel like we’re on like a 30 year…
0:25:56 I mean, we look at NVIDIA 30 years in,
0:25:59 and we feel like we’re 10 years into our 30-year journey
0:26:00 to kind of prove out this vision
0:26:03 that we think could be so impactful for the world.
0:26:07 And we know if you’re on that kind of trajectory
0:26:08 over that kind of timeline,
0:26:11 you’ve got to build community around you as you go.
0:26:13 And so these are important investments for us.
0:26:16 Open-sourcing data sets has been a critical piece
0:26:18 of how we’ve not only helped build the community
0:26:21 but also attracted talent over the years.
0:26:24 And so I think we’re going to continue pushing the industry,
0:26:26 the pharma and biotech industry,
0:26:29 to be a little bit more open to these kinds of approaches.
0:26:31 – Excellent, I love to hear that.
0:26:34 There’s data, there’s compute, there’s hardware.
0:26:37 There’s also software, the tools that make it all run.
0:26:41 I wanted to ask you about a Recursion tool called LOWE
0:26:42 that I heard a little bit about.
0:26:44 My understanding is it kind of helps orchestrate
0:26:47 the workflows and that it uses GenAI in some way.
0:26:48 What’s that all about?
0:26:51 – Yeah, so we’ve been building all these software tools.
0:26:53 Some of them are leveraging neural nets
0:26:54 and other sorts of things.
0:26:56 And we’ve now got dozens of these tools
0:26:58 and it’s become complicated enough
0:27:00 that if you’re a scientist at recursion,
0:27:03 you can’t keep up with all the versions of all the tools.
0:27:05 It’s just, you look on your iPhone,
0:27:07 there’s tens of thousands of apps.
0:27:09 It’s hard to know how to use all of them.
0:27:11 Same kind of thing is happening here in science.
0:27:15 And so what we did was build an LLM and tune it
0:27:18 to actually interact with the APIs for all of these tools
0:27:22 and to have a sense of when to use different tools
0:27:23 based on natural language.
0:27:25 So I can go in and I can say,
0:27:29 “Give me five novel targets in non-small cell lung cancer.”
0:27:31 And I just type that in and hit enter.
0:27:33 And then the LLM knows which of the software tools
0:27:36 we’ve built at Recursion can go look at our data,
0:27:39 at public data, can look for like arbitrage
0:27:41 between those data sets and it can just surface back to you
0:27:44 some insights about what targets you might want to go after.
0:27:46 And then you can say, “Design a drug
0:27:48 that would inhibit one of these targets.”
0:27:51 And it’ll use GenAI and the protein structure
0:27:55 that AlphaFold or others have predicted for that target.
0:27:58 And then it’ll help design a molecule that can bind
0:27:59 into the binding sites of that target
0:28:01 and, we would predict, inhibit it.
0:28:02 And you can do all of this with natural language.
0:28:06 And then you can say, “Design and execute an experiment
0:28:08 to validate this interaction.”
0:28:12 And then it can order the chemical from our suppliers.
0:28:13 It can design an experiment
0:28:15 that we can run in our automated wet lab.
0:28:16 From a security perspective,
0:28:18 we make a human approve that experiment
0:28:19 because we want to make sure
0:28:21 that we’re not running any bad experiments
0:28:22 that could do something bad.
0:28:24 And also you don’t want to accidentally run
0:28:28 like a $6 million experiment because nobody approved it.
0:28:29 So we have a human in the loop.
0:28:31 But if somebody approves that, you can go run the experiment.
0:28:34 And so this for us is, I think this is a lot like
0:28:37 the late ’70s and early ’80s in personal computing
0:28:39 where you moved from like the Apple I,
0:28:41 where you had to be this expert user,
0:28:43 you had to know how to solder and all this stuff,
0:28:46 to then with the Lisa, the GUI,
0:28:47 you could actually start
0:28:49 to have this democratization of these tools.
0:28:51 And we think the same thing is happening now,
0:28:53 but instead of a graphical user interface,
0:28:56 it’s like a discovery user interface.
0:28:58 And LOWE is our take on this.
0:29:00 There’s a big pharma company, GSK,
0:29:02 that’s building one of these.
0:29:04 There’s a couple other startups building these.
0:29:06 But ultimately we think these kinds of tools
0:29:09 are gonna mean that even if you don’t have 30 years
0:29:12 of experience in chemistry in the biopharma industry,
0:29:13 you’re like fresh out of school,
0:29:16 you’ll still be able to make bigger contributions faster
0:29:17 with this kind of approach.
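The tool-routing and human-approval flow described above can be sketched roughly as follows. This is a minimal illustration, not Recursion's implementation: the tool names, the keyword matching (a simple stand-in for the tuned LLM deciding which tool to call), and the `approved_by` parameter are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Tool:
    name: str
    keywords: tuple                 # phrases that suggest this tool (stand-in for LLM routing)
    run: Callable[[str], str]
    needs_approval: bool = False    # True for steps that spend money or touch the wet lab


# Placeholder tool bodies; real tools would call internal APIs (hypothetical here).
def find_targets(request: str) -> str:
    return "ranked list of candidate targets (placeholder)"


def design_molecule(request: str) -> str:
    return "candidate molecule for the chosen target (placeholder)"


def run_experiment(request: str) -> str:
    return "experiment queued in the automated wet lab (placeholder)"


# Listed from most specific to least specific; the first match wins.
TOOLS = [
    Tool("experiment_runner", ("experiment", "validate"), run_experiment, needs_approval=True),
    Tool("molecule_designer", ("design a drug", "molecule"), design_molecule),
    Tool("target_finder", ("target",), find_targets),
]


def route(request: str) -> Optional[Tool]:
    """Pick a tool for a natural-language request, or None if nothing matches."""
    text = request.lower()
    for tool in TOOLS:
        if any(keyword in text for keyword in tool.keywords):
            return tool
    return None


def handle(request: str, approved_by: Optional[str] = None) -> str:
    """Run the matching tool, holding gated tools until a human has approved."""
    tool = route(request)
    if tool is None:
        return "no matching tool"
    if tool.needs_approval and approved_by is None:
        return f"{tool.name}: waiting for human approval"
    return f"{tool.name}: {tool.run(request)}"
```

With these assumptions, the three prompts quoted in the conversation route to the target finder, the molecule designer, and the experiment runner in turn, and the experiment only runs once `approved_by` is supplied.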
0:29:18 – That’s fantastic.
0:29:21 And you kind of just spoke to my next question,
0:29:23 but I’ll ask it anyway.
0:29:25 Jensen said something recently,
0:29:28 it was a couple of months ago, I think now,
0:29:30 in an interview where somebody asked about
0:29:32 the future of the computer science field.
0:29:35 And he said something along the lines of,
0:29:39 my advice is go study a field
0:29:40 that you’re interested in,
0:29:42 develop your domain expertise,
0:29:45 because we’re already on this path
0:29:48 and we’re getting to a spot where you’re not,
0:29:51 and I’m paraphrasing, he didn’t use these exact words,
0:29:53 you’re not necessarily gonna have to learn
0:29:57 to write Python scripts or learn R to wrangle data.
0:29:59 You’re gonna use natural language prompts.
0:30:01 And then your domain expertise
0:30:03 is really what’s gonna become valuable
0:30:07 ’cause you’ll be able to work with the AI systems,
0:30:08 vet their output, et cetera, et cetera.
0:30:12 It sounds like that’s kind of where Recursion’s at
0:30:14 to some extent anyway.
0:30:16 What’s your take kind of on that trajectory
0:30:19 and maybe just expound a little on what you just said
0:30:23 about even new graduates being able to contribute more
0:30:26 to the field because of the natural language prompting?
0:30:27 – Yeah, no, I think he’s exactly right.
0:30:29 I think he was on stage with us
0:30:31 at an event we hosted in January
0:30:32 where he said something similar.
0:30:33 – Okay, yeah.
0:30:35 – And what we talked about was how important
0:30:37 a classical education will be
0:30:39 in learning how to interpret problems,
0:30:40 identify the right problem,
0:30:43 and then how to ask and answer questions
0:30:44 about that problem.
0:30:46 That’s what’ll matter, it won’t be coding
0:30:47 because everything will be natural language.
0:30:48 So I agree with him.
0:30:50 And so what we’re looking for now
0:30:51 are people who are really good
0:30:53 at operating at the interface.
0:30:56 We actually don’t need somebody that’s memorized
0:30:58 the entire molecular cellular biology textbook.
0:31:00 Just like we don’t need doctors anymore
0:31:04 to memorize every single possible disease,
0:31:05 we are gonna have these tools
0:31:07 that mean you can type in,
0:31:08 here’s the symptoms the patient has
0:31:10 or here’s what I’m seeing in the data,
0:31:11 and then those tools are gonna help
0:31:13 pull out all of that deep information
0:31:15 that nobody should have to memorize.
0:31:16 And what’s gonna be critical is somebody
0:31:18 who can take that information
0:31:19 and say, okay, what do I do with this?
0:31:20 Like what’s the next step?
0:31:22 What’s the killer experiment that I can go run?
0:31:24 Or like, how would I treat this patient?
0:31:26 And so yeah, we’re gonna move to a place
0:31:27 where people who are good
0:31:30 at integrating lots of different ideas and data,
0:31:33 those are the people that we’re looking for.
0:31:36 We’re looking for not just the biology PhD,
0:31:38 but the biology PhD who loves using
0:31:39 all the different new AI tools
0:31:41 or the computational biologists.
0:31:43 People who are really working at those interfaces.
0:31:44 – Right, right.
0:31:46 So I think you mentioned at the top,
0:31:49 Recursion has created some drugs that are in trials now.
0:31:52 – Yeah, we’ve got five programs that are in clinical trials
0:31:54 and a few more on the way to the clinic as well.
0:31:57 – Okay, so I’m asking that kind of as context
0:32:01 for what the future holds, both for Recursion
0:32:03 and then also sort of for the industry,
0:32:05 for the field more broadly.
0:32:07 Is the plan on Recursion’s end
0:32:10 to keep developing more drugs,
0:32:12 getting more actual solutions, to put it that way,
0:32:14 out into trials?
0:32:16 What’s sort of the, I don’t know,
0:32:19 the MO for Recursion over the coming few years?
0:32:20 – It’s a great question.
0:32:23 I mean, at the end of the day, what matters
0:32:25 is you get a medicine to a patient
0:32:26 and it makes the patient healthy again, right?
0:32:28 So somebody’s gotta develop a product.
0:32:30 When we started Recursion,
0:32:31 we believed we would build all these tools
0:32:34 to help people identify what products to build
0:32:36 and that they would then go build the products.
0:32:38 What we were surprised by, what we got wrong,
0:32:40 was the reticence of this industry
0:32:42 toward these new technology tools.
0:32:43 And I think it’s a combination
0:32:47 of the regulatory environment with the FDA and EMA.
0:32:49 It’s a combination of that with just the general
0:32:51 conservative nature of an industry that spends,
0:32:56 it’s $2.6 billion of invested R&D
0:32:59 per new drug approval every year in our industry.
0:33:02 So like, and remember, that cost is that high
0:33:05 because 90% of the drugs that people take into the clinic fail.
0:33:06 So like, it doesn’t cost that much
0:33:08 to develop the drug that succeeds.
0:33:11 It’s just that you’ve gotta, you know,
0:33:12 spend all the money on all the failures.
0:33:14 And so the industry’s been conservative
0:33:15 and we kind of felt like
0:33:18 they’re not really taking this up as fast as we would hope.
0:33:21 Do we want to let these drugs we believe in die?
0:33:22 – Right.
0:33:23 – Or do we want to let them see the light of day?
0:33:25 And so ultimately, the biggest pivot we’ve made
0:33:27 as a company is we now have our own drugs
0:33:28 and that’s required more capital.
0:33:30 We’ve had to build new parts of our team
0:33:32 and kind of build the culture in that direction
0:33:35 towards clinical development and interacting with the FDA.
0:33:38 But sometimes you got to go vertical, right?
0:33:39 Sometimes you got to go vertical
0:33:41 if you really believe in something.
0:33:44 And, you know, a great example of this is like Tesla
0:33:46 building the supercharging network.
0:33:49 Like, I guarantee you they didn’t want to build a supercharging network.
0:33:51 They’re like, of course gas stations are gonna see
0:33:52 in the future that people will use electric cars
0:33:54 and they’ll put in superchargers.
0:33:56 No, nobody was ready for this.
0:33:58 So they had to build it, and that kind of sucked, I’m sure.
0:34:00 We’ve got to build it too.
0:34:01 But at the end of the day,
0:34:05 if we can get medicines to patients, we’ll be happy.
0:34:07 – So before we wrap up here, Chris,
0:34:09 you mentioned, you know, a couple of times
0:34:11 and just now sort of the shift in the field
0:34:14 and kind of the reticence, you know, dating back
0:34:16 to when you were first doing your graduate work,
0:34:18 even up until now and the reason you had to go vertical
0:34:21 and start actually producing the drugs yourself.
0:34:23 People are resistant to change.
0:34:26 Technology takes time to get used to, sort of culturally.
0:34:30 Globally, we’re almost certainly in the opening stages
0:34:34 of this big industrial revolution, you know,
0:34:36 spurred by automation and AI and everything.
0:34:39 Along that sort of cultural front,
0:34:42 have there been kind of big, I mean, is that a big thing?
0:34:46 Is that kind of the, you know, the data, the compute,
0:34:48 all of that stuff, but kind of culturally are you,
0:34:51 do you feel like you’re building kind of a different way
0:34:54 to do things and are there bumps in the road along,
0:34:55 you know, that come with that?
0:34:58 – Yeah, absolutely. I would say that the data,
0:34:59 the compute, those are moats for us,
0:35:01 but our biggest moat is our culture.
0:35:04 And we’ve had to take software engineers, data scientists,
0:35:06 biologists, chemists, drug developers
0:35:09 and supporting functions and have all of these different
0:35:13 folks working in these deep technical domains come together
0:35:15 and it’s basically meant that we’ve had to create
0:35:17 a new language, a new language where all of these people
0:35:20 can build in the same environment,
0:35:22 both literally and figuratively.
0:35:25 And so I think that we had a C-level exec
0:35:27 from a big pharma company here a few years ago.
0:35:29 And on his way out the door, he said,
0:35:31 we could never replicate this culture.
0:35:33 That’s your biggest competitive advantage.
0:35:35 Everything else with enough money we could do,
0:35:36 but that we couldn’t do.
0:35:38 And I think that’s probably true.
0:35:41 And this is why sometimes those up-and-coming
0:35:44 young companies end up out-innovating, you know,
0:35:46 some of the bigger ones. It’s about the culture
0:35:47 as much as anything else.
0:35:48 – No, absolutely.
0:35:50 How big is Recursion now, how many people?
0:35:52 – We’re about 550 people now.
0:35:54 – Wow, amazing.
0:35:57 I mean, it’s relatively small for what you’re doing.
0:35:59 I mean, maybe it’s very, very small for what you’re doing,
0:36:01 but that’s just amazing.
0:36:02 For folks who would like to learn more,
0:36:04 we covered a lot of ground, we could do an hour
0:36:07 just on how you built BioHive-2, obviously.
0:36:10 But for folks who want more of the Recursion story,
0:36:13 want to dive into some more of what you’re working on,
0:36:16 the specifics, maybe even are looking for, you know,
0:36:18 a new role, a great culture to join.
0:36:20 Where can they go to find out more?
0:36:21 – Yeah, go to recursion.com.
0:36:23 You can find out about careers.
0:36:25 You can read all of our papers.
0:36:29 And you can also go to rxrx.ai,
0:36:32 which is where we have lots of large data sets.
0:36:33 If you’re a data scientist and you want to play
0:36:36 with some biology, you can go download them there.
0:36:37 – Fantastic.
0:36:40 Chris Gibson, this was a pleasure.
0:36:41 I learned a lot that’s going to leave me
0:36:43 with many more questions as I process it.
0:36:46 So we may have to do this again,
0:36:48 but appreciate you taking the time to come on the podcast,
0:36:49 tell everybody about the work you’re doing.
0:36:53 And, you know, it feels like an understatement
0:36:56 and a cliche to say, but you’re working on solutions
0:36:58 that can literally change people’s lives
0:36:59 through better health.
0:37:02 So all the best of luck to you and your teams.
0:37:04 – Thanks Noah, really appreciate it.
0:37:06 (upbeat music)
0:00:13 >> Hello, and welcome to the NVIDIA AI podcast.
0:00:15 I’m your host, Noah Kravitz.
0:00:19 My guest today is CEO and co-founder of Recursion,
0:00:22 one of the world’s leading biotech,
0:00:25 or tech bio as they call it, companies in the world.
0:00:29 Chris Gibson started Recursion based on the work he developed while working on
0:00:32 his joint MD-PhD at the University of Utah,
0:00:37 and today the company is dedicated to coding biology in the name of radically improving lives.
0:00:41 Recursion is building one of the largest proprietary datasets in their field.
0:00:44 They just took the wraps off one of the most powerful supercomputers in the world,
0:00:48 and they’re one of the company’s leading the growing field of AI-powered drug discovery.
0:00:52 Chris is here to tell us all about it and then some, so let’s dive right in.
0:00:56 Chris Gibson, welcome, and thank you so much for joining the NVIDIA AI podcast.
0:00:58 >> Thanks, Noah. I’m delighted to be here.
0:01:01 As is often the case with these episodes,
0:01:04 I try to pack everything I can into the intro to set the stage.
0:01:07 You’ve done so much, you’re working on so much,
0:01:08 but let me turn it over to you.
0:01:12 Maybe we can start by you explaining to the audience what Recursion is all about.
0:01:14 >> Yeah. I want to tell you about Recursion,
0:01:17 but first I want to tell you about the problem that we’re solving.
0:01:17 >> First.
0:01:22 >> Everybody knows somebody who has a disease where there’s not a good treatment.
0:01:24 You’ve got a relative who’s died of cancer,
0:01:26 you have somebody who’s suffering from Alzheimer’s or Parkinson’s,
0:01:29 or some of these other devastating diseases.
0:01:33 Today, despite all of the incredible technology in the world,
0:01:37 90 percent of drugs that go into clinical trials fail.
0:01:40 Nine out of 10 drugs that our industry puts into clinical development
0:01:42 fail before they get to patients.
0:01:48 What that tells me is that biology is just this massive quagmire of complexity.
0:01:53 It’s just dramatically complex to a level where despite all of these hundreds of
0:01:57 thousands of scientists around the world and about $50 billion a year of R&D
0:02:01 investment from the industry, we still aren’t that good at it.
0:02:07 We imagined that perhaps there was a way that you could bring together new technologies
0:02:10 to try and take a less biased approach to biology,
0:02:12 to try and step back and say,
0:02:16 instead of trying to understand how every gene interacts with every gene and
0:02:21 every drug interacts with every gene and build it all in our heads,
0:02:23 which is the way it’s been done traditionally.
0:02:25 Could we actually take an industrial approach,
0:02:28 build maps of biology using automation,
0:02:33 using AI, and then leverage those to tell us where to go,
0:02:37 to essentially have the algorithm tell us where to develop a medicine.
0:02:40 That’s what we’ve been working on for the last 10 years.
0:02:43 Now we’re what’s called a clinical stage biotech company.
0:02:46 We’ve got five drugs that are in the clinic ourselves.
0:02:49 We’ve got big partnerships with companies like Roche Genentech and
0:02:53 Buyer to bring drugs into really hard areas of biology.
0:02:56 We just built this incredible supercomputer with Nvidia,
0:02:58 one of our partners as well.
0:03:03 So we’re I think really one of the companies leading at the intersection of biology,
0:03:06 but also of technology.
0:03:08 A million things to get into.
0:03:09 But before we go forward,
0:03:11 I want to ask you use the word map.
0:03:14 You talked about an industrial approach mapping biology,
0:03:17 and that made me think of things like mapping the human genome.
0:03:19 Is there a parallel there?
0:03:21 Is it a similar approach or how does that work?
0:03:23 Yeah, so traditionally in our industry,
0:03:25 people have worked on one disease at a time.
0:03:26 Okay.
0:03:29 And so if we wanted to go after a disease together,
0:03:31 we’d go read the literature,
0:03:32 we’d build a team,
0:03:37 and then we would build specific experiments for that one disease.
0:03:37 Right.
0:03:40 And we would generate data and five to 10 years later,
0:03:42 maybe we would have a drug going into the clinic and
0:03:44 you know what the success rate is there.
0:03:45 It’s a 90% failure.
0:03:47 There’s a different kind of approach,
0:03:49 which is instead of working on one thing at a time,
0:03:55 can we build vast data sets that span very large scales?
0:03:58 And I think the human genome project is a great example.
0:04:00 People said let’s map the entire human genome.
0:04:03 And now today there’s tens of millions of genomes
0:04:04 that have been that have been mapped
0:04:06 and we can compare them and contrast them.
0:04:08 We’re using some of those same data.
0:04:12 And so we’ve taken this approach that to invest more initially
0:04:15 to build really large, complex data sets
0:04:16 that at the very beginning,
0:04:18 you’re paying a lot to build this data set
0:04:21 and you don’t yet have enough data to actually make any progress
0:04:22 against any disease.
0:04:23 But over time,
0:04:26 you start to build these network effects where today,
0:04:29 if we run an experiment in our automated laboratory
0:04:30 and we get some result,
0:04:34 we can compare it to over 250 million experiments
0:04:35 that we’ve run over the past few years
0:04:37 and all of that data is relatable.
0:04:38 So instead of one disease at a time,
0:04:40 instead of slices of biology,
0:04:42 we’re actually building like a volume
0:04:45 and we’re sparsely sampling this volume
0:04:47 and then using AI to sort of fill in
0:04:50 what we can predict about the rest of it.
0:04:53 – Right, is that a unique approach in the industry?
0:04:55 – I think it’s pretty unique in the industry.
0:04:56 Yeah, there’s a handful of companies
0:04:58 that are taking similar approaches.
0:05:01 I’m unaware of any company that has generated a data set
0:05:03 that is this broad.
0:05:05 So we’ve knocked out with CRISPR-Cas9
0:05:07 that some of your listeners may have heard of.
0:05:11 It’s like a molecular scissors that lets us cut out genes.
0:05:13 We’ve knocked out every gene in the human genome
0:05:16 in multiple different human cell types.
0:05:18 We’ve profiled millions of molecules
0:05:21 and all of these data exists now in multiple layers.
0:05:23 We call it omics.
0:05:24 People may have heard of like genomics.
0:05:27 Well, we’re building phenomics and transcriptomics
0:05:29 and invivomics and proteomics
0:05:31 and all of these omics data layers.
0:05:32 And you can think of it as,
0:05:34 go back to like the Google map days, right?
0:05:36 Started out with maps from airplanes
0:05:38 and we got kind of where all the streets were
0:05:40 and then eventually you had street view
0:05:42 and cars were driving around.
0:05:43 We’re building all these same layers
0:05:45 but instead of doing it in the physical world,
0:05:49 we’re using a massive automated laboratory full of robots
0:05:50 to do millions of experiments
0:05:52 to figure out what the roads and streets
0:05:54 and intersections are of biology and chemistry.
0:05:56 And it’s a complex space.
0:05:58 I mean, I think it’s about as complex a problem
0:05:59 as one can work on.
0:06:01 – I can only imagine.
0:06:03 Maybe by the end of this conversation,
0:06:06 I’ll know more about what I don’t know at the very least.
0:06:09 So maybe we can start with the graduate work,
0:06:11 the kind of seated recursion.
0:06:12 And if this is the wrong approach,
0:06:13 take a different one,
0:06:16 but I’m imagining that might help me and the audience
0:06:19 sort of to understand the problem,
0:06:20 the scope of the problem
0:06:23 and how and why you started with this approach.
0:06:24 – Yeah, of course.
0:06:26 I think this is a great way to kind of explain
0:06:27 what we’re doing at recursion.
0:06:30 So I joined the lab of a guy named Dean Lee.
0:06:32 Dean’s actually now the president of Merck Research Lab.
0:06:35 So he’s like the head scientist of Merck.
0:06:37 Very physician scientist, brilliant guy.
0:06:39 The lab was so diverse.
0:06:43 There were physicians and engineers and geneticists
0:06:45 and molecular cellular biologists.
0:06:47 And we were working on all these cool, hard problems.
0:06:49 And one of the things we were working on
0:06:52 was this rare genetic disease
0:06:54 called cerebral cavernous malformation.
0:06:57 And I bet about 1% of your audience has heard of it
0:07:01 because about 1% of the people in the world
0:07:03 have this disease.
0:07:06 So it’s like a rare disease, but it’s not that rare.
0:07:10 It’s six times more prevalent than cystic fibrosis.
0:07:12 But there’s no drug, there’s no treatment.
0:07:13 And so because of that,
0:07:15 people don’t know about the disease as much.
0:07:18 And we were trying to figure out how this disease works.
0:07:20 And we use traditional molecular
0:07:21 and cellular biology approaches
0:07:24 where I could hop on the whiteboard that’s behind me
0:07:27 and I could draw protein X goes to protein Y
0:07:28 goes to protein Z.
0:07:30 And we think protein Z gets too high
0:07:32 and that causes the disease.
0:07:35 And after a decade of working on this,
0:07:37 we think we figured it out.
0:07:40 We’re sitting in lab and we think this protein called row A
0:07:42 is what’s causing the disease.
0:07:46 And we take a row A inhibitor and we put it in mice.
0:07:47 And five months later, I remember sitting in lab
0:07:50 meaning we unveiled the data.
0:07:51 Oh, we changed the mice.
0:07:53 We changed the mice in the wrong direction.
0:07:54 They got worse.
0:07:56 They got more of these lesions.
0:07:58 And this is one of those problems with biology.
0:08:02 It’s like as humans, we are reductionist problem solvers.
0:08:04 We take a really complex system
0:08:06 and we try and reduce it down to these core elements
0:08:08 of protein A, protein B, protein C
0:08:09 so that we can put it on a whiteboard
0:08:11 or put it in a nature paper.
0:08:12 The reality isn’t biology.
0:08:14 There’s probably hundreds of interconnections,
0:08:17 thousands of feedback loops that are all working together.
0:08:19 And if you take a reductionist approach, I would argue
0:08:23 that’s maybe why we’re failing 90% of the time in the clinic.
0:08:26 That the way we need to actually explore biology
0:08:28 is not to take a reductionist approach,
0:08:30 but to up-level our understanding of biology,
0:08:32 to understand the whole complex system
0:08:35 and to build maps that truly embrace
0:08:37 how every gene interacts with every other gene.
0:08:38 And so that’s hard.
0:08:41 I mean, we’ve been working at this for 10 years,
0:08:44 but we took a very early version of this approach
0:08:47 in Dean’s lab after that failure.
0:08:50 We took microscopy images of human cells
0:08:52 where we were modeling this disease
0:08:54 and we took microscopy images of human cells
0:08:56 that were healthy and we trained
0:08:59 a basic machine learning classifier to recognize the two.
0:09:01 And then we added thousands of drugs to the disease cells
0:09:04 and we simply asked the machine learning classifier,
0:09:06 do any of the disease cells look healthy again?
0:09:07 Without any understanding
0:09:09 of what else was happening in biology.
0:09:13 And today, recursion is a few months away
0:09:15 from reading out a phase two clinical trial
0:09:19 against a drug that we discovered doing that work.
0:09:23 It was a totally surprising way that this drug was working.
0:09:25 It was not going after ROA or anything else.
0:09:27 And so I guess the point of this is
0:09:28 when you embrace the complexity of biology
0:09:31 and let biology give you the answer,
0:09:33 it can be surprising, it can challenge dogma.
0:09:35 But if you’re willing to follow that,
0:09:37 that our belief is that eventually
0:09:39 with enough technology and investment,
0:09:41 this could be a more sustainable industrial way.
0:09:44 And so I finished my PhD,
0:09:45 I took a leave of absence from medical school,
0:09:47 subsequently dropped out
0:09:49 because recursion ended up kind of taking off
0:09:50 and started the company.
0:09:52 And we’ve been building for the last 10 years
0:09:55 at this interface of tech and bio.
0:09:58 – So 10 years ago, or thereabouts,
0:10:00 when you got these lab results back
0:10:04 and things were going in the wrong direction,
0:10:05 was there a sense either with you
0:10:09 or colleagues in the lab or just colleagues in general,
0:10:11 was there a sense of like,
0:10:14 we know that there’s a different approach
0:10:16 to embrace the complexity,
0:10:18 but we just can’t do it right now
0:10:20 ’cause it’s too big for the human brain
0:10:24 or even dozens of the best human brains working in parallel.
0:10:27 It’s just too much information to sift through.
0:10:28 And did you know then,
0:10:30 I mean, you said you used an ML classifier,
0:10:32 but was it a thing of like,
0:10:33 if only the tech was a little better
0:10:35 or what was kind of the mind state then?
0:10:37 – Yeah, I think it was,
0:10:40 so we had this idea to use technology
0:10:43 to try and take a less biased approach.
0:10:45 We’re not the first people to have done this.
0:10:46 There’s a handful of other people
0:10:48 that were working on similar things at the time,
0:10:50 but AI wasn’t really being,
0:10:52 this is 2011-ish,
0:10:55 like AI is not really being thrown around as a term,
0:10:57 it’s kind of machine learning at the time.
0:10:59 People aren’t doing a lot of work in neural nets.
0:11:01 I mean, like ImageNet hasn’t even come out yet.
0:11:03 And so we were like,
0:11:06 among a very early wave and certainly in biology,
0:11:09 among a deeply early wave of people saying,
0:11:12 let’s use like computer vision to look at images.
0:11:13 And the work had actually really been pioneered
0:11:16 by a woman named Anne Carpenter at the Broad Institute.
0:11:18 She built the software tool that we used
0:11:19 that helped us go fast.
0:11:21 And yeah, we wanted the software to be better,
0:11:22 but at the same time,
0:11:25 we were pushing up against the frontier
0:11:27 at the time in biology.
0:11:29 And so what we’ve done now at recursion
0:11:32 is we’ve pioneered the industrialization of this approach.
0:11:35 And what I joke about is if you visited our headquarters here
0:11:37 and you saw a robotic laboratory,
0:11:39 there’s like a sad and exciting fact.
0:11:41 And that is that this robotic laboratory
0:11:44 does the equivalent of all the experiments
0:11:46 I did in my entire five years of my PhD,
0:11:49 every 15 minutes on average.
0:11:51 And so like that is both sad in some ways
0:11:52 and kind of exciting in others.
0:11:54 I can only imagine you’re humbled.
0:11:55 And at the same time,
0:11:57 you’re also like excited in the wave.
0:11:58 Yeah, yeah, it’s amazing.
0:12:01 So I don’t know if you want to jump all the way
0:12:03 to the present or what the best way is to walk us through,
0:12:06 but I want to get to what recursion is doing now.
0:12:09 Yeah, I mean, I think we can jump to where we are today,
0:12:11 which is we’ve taken this philosophy
0:12:14 of creating virtuous cycles
0:12:15 of what we call wet lab and dry lab.
0:12:18 And what that means is empirical data generation.
0:12:20 So a wet lab where we’re doing real experiments
0:12:21 with human cells and a dry lab,
0:12:23 which is our supercomputer system
0:12:25 and all of our software tools and AI tools.
0:12:28 And I think that what we’re doing here at recursion
0:12:31 is analogous to so many technology companies.
0:12:34 So Netflix is recording what you’re watching,
0:12:35 when you’re watching it, when you turn it on,
0:12:38 when you turn it off, who’s watching it in the household,
0:12:40 which scenes you’re turning it off on.
0:12:43 And they’re actually now then making predictions
0:12:45 and sort of A/B testing and going back through this loop
0:12:49 to the point today where Netflix is generating content
0:12:51 based on an algorithmic suggestion
0:12:54 of what’s gonna be popular for people.
0:12:56 This is like the drug discovery version of that.
0:12:59 We’re doing experiments, we’re breaking genes
0:13:01 and adding compounds and combinations of the things
0:13:02 that different human cell types.
0:13:06 And our A/B experiment is we make a bunch of predictions
0:13:08 about what genes and what drugs are connected to each other.
0:13:10 And then the next week we go back
0:13:11 and we test those predictions
0:13:13 and create this kind of flywheel approach.
0:13:16 The problem, of course, is that biology
0:13:19 is just so, so, so complex.
0:13:22 There’s this combinatorial explosion
0:13:23 of what we always joke about.
0:13:26 There’s about 21,000 human genes.
0:13:28 And in biology, there’s this really cool thing
0:13:30 called like synthetic lethality
0:13:32 or like synthetic relationships
0:13:36 where you can start to predict that two genes are related.
0:13:37 If you break both those genes
0:13:40 and you get an unexpectedly large or small effect,
0:13:41 it tells you that they might be
0:13:43 in some kind of feedback loop together.
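The double-knockout reasoning described here can be sketched as a small scoring function. This is a minimal illustration, assuming a multiplicative null model (a common convention in genetic interaction studies, not something stated in the conversation); the function names, the threshold, and the fitness values are hypothetical.

```python
def interaction_score(fitness_a: float, fitness_b: float, fitness_ab: float) -> float:
    """Observed double-knockout fitness minus the value expected for independent genes."""
    expected = fitness_a * fitness_b  # multiplicative null model (an assumption)
    return fitness_ab - expected


def classify(score: float, threshold: float = 0.1) -> str:
    """Label the deviation; the threshold is arbitrary and for illustration only."""
    if score < -threshold:
        return "negative interaction (worse than expected, e.g. synthetic lethality)"
    if score > threshold:
        return "positive interaction (milder than expected)"
    return "no clear interaction"


# Hypothetical relative fitness values (1.0 = unperturbed cells).
print(classify(interaction_score(0.9, 0.8, 0.05)))  # double knockout nearly lethal
print(classify(interaction_score(0.9, 0.8, 0.72)))  # matches the independent expectation
```

A large deviation in either direction is the signal that the two genes may sit in a shared pathway or feedback loop.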
0:13:45 If we were gonna do this synthetic experiment
0:13:49 of knocking out every gene with every other possible gene,
0:13:52 it’s about 250 million experiments.
0:13:52 It’s doable.
0:13:54 We’ve done 250 million experiments.
0:13:56 It’s taken us 10 years to get there,
0:13:58 but we now do up to 2.2 million a week.
0:14:00 So like this is feasible.
0:14:02 But if I just said, what if we did gene by gene by gene?
0:14:03 Three genes.
0:14:06 Now you’re talking about trillions of experiments.
0:14:07 And if you imagine doing four genes,
0:14:10 like you instantly in biology to really explore this,
0:14:12 you get to this combinatorial explosion
0:14:13 where we can’t brute force it.
0:14:15 We’re not gonna be able to empirically do it.
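The scale of this combinatorial explosion can be checked with a few lines of arithmetic. The sketch below uses the two figures quoted in the conversation (about 21,000 genes, up to 2.2 million experiments a week) and assumes one experiment per gene combination in a single cell type, which is a simplification; the function names are illustrative.

```python
import math

GENES = 21_000                     # approximate human gene count quoted above
EXPERIMENTS_PER_WEEK = 2_200_000   # peak weekly throughput quoted above


def knockout_combinations(order: int, genes: int = GENES) -> int:
    """Number of distinct unordered sets of `order` genes."""
    return math.comb(genes, order)


def years_at_peak_rate(experiments: int) -> float:
    """Years needed to run that many experiments at the quoted peak weekly rate."""
    return experiments / EXPERIMENTS_PER_WEEK / 52


for order in (2, 3, 4):
    n = knockout_combinations(order)
    print(f"{order} genes: {n:.3e} combinations, "
          f"~{years_at_peak_rate(n):,.0f} years at peak rate")
```

Under these assumptions the pairwise count comes out at roughly 220 million, the same order of magnitude as the figure quoted, while triples already exceed a trillion.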
0:14:17 And so this is the beauty of AI,
0:14:20 just like in all of these other technology fields.
0:14:23 If you can sparsely fill this massive volume,
0:14:26 this matrix of genes and compounds and cell types
0:14:30 and interactions, ML and AI tools are often,
0:14:32 if the data is robust, really good
0:14:33 at helping you fill in and predict
0:14:35 what’s gonna happen in between.
0:14:36 And I think that’s what we’re trying to build.
0:14:39 That’s what that map is to us.
0:14:40 – And so forgive me,
0:14:42 this is a very sort of simplistic question,
0:14:45 but is it a case of where you’re running experiments
0:14:48 in the dry lab, doing it all in the AI,
0:14:49 you know, the computerized system,
0:14:52 and then sort of the ones that have the best chance
0:14:54 of going somewhere go into the wet lab?
0:14:55 Is that kind of the basic?
0:14:57 – Yeah, that’s exactly right.
0:15:01 Yeah, and then what’s cool about these virtuous cycles
0:15:04 is if the prediction we made from the dry lab
0:15:07 goes to the wet lab and it works, it’s really robust.
0:15:08 Awesome, okay, we’ve got a new program.
0:15:10 Like let’s take it forward towards patients.
0:15:12 If it doesn’t work,
0:15:14 we’ve now that week generated a bunch of new data
0:15:17 upon which we can retrain the model.
0:15:20 And that new data is existing in a space
0:15:22 where the model was making a poor prediction.
0:15:24 And so by definition, week over week,
0:15:26 as you keep kind of going through this virtuous cycle,
0:15:28 despite the complexity of biology and chemistry,
0:15:30 you start to make progress
0:15:32 because there are areas of biology
0:15:35 that are kind of like, you know, like the Midwest,
0:15:38 like, you know, cornfields as far as you can see,
0:15:42 it kind of looks reasonably similar 100 miles apart.
0:15:44 And there’s other places in biology
0:15:47 that are like, you know, the Rocky Mountains or, you know,
0:15:49 whatever, and it’s like, you go 10 feet to the left
0:15:50 or 10 feet to the right,
0:15:52 dramatically different situation.
0:15:55 And so we can help to hone in with this iterative approach
0:15:58 around the parts of biology where we need more data
0:16:00 and the parts where we need less.
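The idea of spending experiments where the landscape is rough can be sketched as a toy adaptive-sampling loop. This is a simplified stand-in, not Recursion's method: the one-dimensional "biology" function, the disagreement heuristic, and all names are assumptions made for illustration.

```python
import math


def run_experiment(x: float) -> float:
    """Stand-in for a wet-lab measurement: flat below 0.5, rapidly varying above."""
    return 0.0 if x < 0.5 else math.sin(40 * (x - 0.5))


def next_experiment(known: dict) -> float:
    """Pick the midpoint of the adjacent pair whose measurements disagree most."""
    xs = sorted(known)
    a, b = max(zip(xs, xs[1:]), key=lambda pair: abs(known[pair[1]] - known[pair[0]]))
    return (a + b) / 2


# Start from a coarse, evenly spaced set of measurements.
known = {x: run_experiment(x) for x in (0.0, 0.25, 0.5, 0.75, 1.0)}
for _ in range(10):  # ten rounds of the predict-then-test loop
    x = next_experiment(known)
    known[x] = run_experiment(x)

flat = sum(1 for x in known if x < 0.5)
rough = sum(1 for x in known if x > 0.5)
print(f"samples in flat region: {flat}, in rough region: {rough}")
```

In this toy setting every new measurement lands in the rapidly varying half, which mirrors the point that the flat "cornfield" regions need far less data than the "Rocky Mountain" ones.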
0:16:04 – Are there certain areas of biology that are,
0:16:06 and again, I’m gonna be overly simplistic here,
0:16:09 that are easier or harder to,
0:16:12 and I don’t know if it’s to map out or to understand
0:16:15 or to be able to, you know, sort of act on?
0:16:17 – For sure, so there are areas of biology
0:16:19 that we think are easier to start with.
0:16:20 And there are areas of biology
0:16:24 where we know exactly what is causing the disease.
0:16:26 So for me, there’s three areas there.
0:16:30 One is genetic diseases, like cystic fibrosis.
0:16:34 We know that mutations in the CFTR gene cause cystic fibrosis.
0:16:37 Another good example are some cancers
0:16:39 where we know that mutations in certain genes
0:16:40 cause those cancers.
0:16:42 And the third, infectious diseases.
0:16:45 We know that like this virus or this bacteria
0:16:46 causes this disease.
0:16:48 What we like about those is that if we model those
0:16:52 in cells in our lab, you know, we add the virus
0:16:53 or we break the gene,
0:16:56 we know that we’ve at least partially recapitulated
0:16:58 something relevant about the biology.
0:17:01 The harder areas are like in neuroscience.
0:17:03 So take like Alzheimer’s.
0:17:05 We don’t really know what causes Alzheimer’s.
0:17:07 There’s debate, some people think they do,
0:17:09 but most of the drugs against those targets have failed.
0:17:12 So like we don’t really know what causes it.
0:17:15 And so how do you go about trying to understand Alzheimer’s?
0:17:17 And that’s actually neuroscience broadly
0:17:20 is probably one of the areas that’s hardest
0:17:22 because like we really don’t know
0:17:24 what causes a lot of different neurologic diseases.
0:17:28 And so we’ve actually partnered with Roche and Genentech,
0:17:30 the, you know, one of the biggest biopharma companies
0:17:32 there is, one of the most innovative
0:17:36 in a decade-long collaboration to go map the genome
0:17:39 in neural specific cells.
0:17:42 So we can start to uncover some of these answers.
0:17:45 But those areas, those are the hard parts of biology.
0:17:46 – Right, right.
0:17:47 Should we talk supercomputers?
0:17:49 – Yeah, let’s talk supercomputers.
0:17:51 – All right, so I would imagine taking, you know,
0:17:54 somewhat of a rogue isn’t the right word,
0:17:56 but a different approach from the beginning, right?
0:17:59 The unbiased sort of industrialized approach.
0:18:02 And then now talking about, you know, mapping things out
0:18:05 and there’s so much we had a guest on recently
0:18:08 who talked about, you know, he had a great metaphor
0:18:09 for the difference between what we know
0:18:12 and what we don’t know about his field, right?
0:18:14 And so similar thing, right?
0:18:17 So I would imagine that the amount of data,
0:18:19 the amount of compute, it’s like with any other problem,
0:18:20 it just goes up and up and up.
0:18:23 So over the years, maybe you can talk a little bit
0:18:25 about kind of the technological path
0:18:29 that led you to, we’re at BioHive 2 now.
0:18:31 – All right, so maybe you can walk us through that a little bit.
0:18:34 – Yeah, I mean, we started out on my laptop
0:18:38 and then we moved to a server we built in-house called Golgi,
0:18:41 which we eventually mourned when we deprecated that.
0:18:44 But today we’ve got about 50 petabytes
0:18:47 of proprietary biological data.
0:18:48 And these data are a little bit different
0:18:51 from sort of like text and other things, some of it’s text.
0:18:54 But a lot of it is images, they’re very memory rich,
0:18:57 memory intensive, you know, they’re like big images.
0:19:00 And so some of the traditional cloud-based approaches
0:19:03 don’t quite work for us because you end up starving the GPU
0:19:05 from just the memory side.
0:19:10 And so we in 2020 decided we needed to build a supercomputer
0:19:13 in order to make the best use of this giant dataset
0:19:15 we’ve accumulated and built.
0:19:17 And so we built a supercomputer called BioHive 1.
0:19:21 It was at the time on the TOP500 list when it came out,
0:19:24 I think like 85th or 58th or something like that.
0:19:27 And what we found was that the scaling laws apply,
0:19:30 the bitter lesson holds in biology.
0:19:33 And that is that more data and more compute
0:19:35 both give you better outcomes.
0:19:37 And we’re generating more data every day.
0:19:40 And so we’re, I think in biology,
0:19:44 probably data is the bigger bottleneck than compute.
0:19:48 You know, there’s not the same arms race as in other fields
0:19:51 because there’s not yet enough relatable data
0:19:54 like we’re building for everybody to be doing this.
0:19:55 We are building the data,
0:19:57 but we also knew we needed more compute.
0:20:00 And so just maybe six months ago,
0:20:03 we announced actually NVIDIA invested in us last summer.
0:20:06 And then as we continued building our relationship with them,
0:20:09 we ultimately decided to build BioHive 2,
0:20:12 which just came out on the TOP500 list.
0:20:14 It’s number 35.
0:20:18 It’s about 23 petaflops and it’s 504 H100s.
0:20:23 And now we just deprecated BioHive 1 with our 300 plus A100s
0:20:26 and we’re moving those over to the new facility.
0:20:31 And so we’ll have this Frankenstein A100 H100 giant supercomputer,
0:20:33 at least in our field, little Recursion.
0:20:38 This biotech, tech bio startup in Salt Lake City now owns and operates
0:20:41 the fastest supercomputer in the world of any biopharma company,
0:20:44 you know, from Pfizer, you know, all the way down.
0:20:46 -Amazing. -It’s pretty cool.
0:20:48 -Yeah, yeah, no, it’s amazing.
0:20:50 We, you know, over the years of doing the podcast,
0:20:56 this phrase democratization of tools has always been almost like a mantra.
0:21:00 And, you know, for a while it was thinking about like a lawyer,
0:21:04 his computer and his apartment office has a GPU
0:21:07 and he realized he could use it to do some machine learning stuff.
0:21:09 And, you know, similar thing, right?
0:21:11 You guys are a much bigger scale than that.
0:21:15 But as you said, compared to the longstanding giants of the industry,
0:21:17 you know, for you to be able to build that is amazing.
0:21:20 Do you have a good story or a lesson learned
0:21:26 or something sort of surprising, specific to building BioHive 2?
0:21:29 And that process, that was something that, you know,
0:21:32 a hurdle you had to overcome or something you were expecting
0:21:34 was going to go one way and it turned out to be something different
0:21:36 or, you know, maybe just an anecdote?
0:21:39 -Yeah, I mean, building BioHive 2 was a tremendous effort.
0:21:43 Multiple teams coming together, the Recursion team, the NVIDIA team,
0:21:48 other suppliers’ teams, we ended up wanting to get this done
0:21:51 in time for submission to the most recent TOP500 list.
0:21:53 -Okay. -And, you know, as is the case right now
0:21:56 with the popularity of GPUs and the importance of AI,
0:21:59 we got our GPUs, we got our cluster,
0:22:03 but we were missing racks and cables, certain racks and certain cables,
0:22:08 and those ended up coming in 15 days before the TOP500 submission.
0:22:12 And so, our team was sleeping on cots in the data center
0:22:16 and the NVIDIA team joined us, like the amount of effort that they put in.
0:22:22 We got that thing up and running, burned in and benchmarked in 15 days
0:22:23 from getting all the materials.
0:22:29 It ended up being, it’s the first on-slab H100 setup, as far as I’m aware,
0:22:32 and it ended up having really, really good performance.
0:22:33 -Excellent.
0:22:35 My guest today is Chris Gibson.
0:22:39 Chris is co-founder and CEO of Recursion,
0:22:42 one of the world’s foremost biotech,
0:22:44 TechBio, I want to ask you about that in a second,
0:22:46 companies in the world, based out of Utah,
0:22:48 and as Chris was just detailing,
0:22:53 they are now the proud owners and maintainers.
0:22:56 Was it the 35th ranked?
0:22:58 -Yeah, 35th. -Supercomputer in the world, BioHive 2.
0:23:03 TechBio, I assume, it’s just kind of a flip on the term biotech
0:23:07 that the tech is kind of leading the way forward?
0:23:08 -Yeah, that’s right.
0:23:10 I mean, the term biotech was coined in the ’80s
0:23:14 to talk about using human proteins as drugs.
0:23:16 -Okay. -And at the time, that was tech.
0:23:18 And it was Genentech that really started this off.
0:23:20 And so credit to them, they were first.
0:23:23 They coined their industry biotech.
0:23:25 We would have called it biotech, but that was taken.
0:23:28 And so somebody in our field called it TechBio,
0:23:30 and the tech is kind of at the forward side.
0:23:34 And it’s a good term to describe the few hundred companies now
0:23:37 that are more digitally native, that are building in this space,
0:23:40 where technology is really at the foundation of what they build.
0:23:41 -Got it.
0:23:44 Also in prepping for this conversation,
0:23:48 I noticed the name BioHive being used.
0:23:50 It’s the name of a public-private partnership
0:23:52 that I think you also have a hand in.
0:23:54 So maybe you can talk a little bit about that.
0:23:56 As I was listening to you earlier,
0:23:59 I was thinking about some of the recent podcasts we’ve done.
0:24:02 We’ve done one or two others in the drug discovery space
0:24:06 and a bunch of others in the sciences.
0:24:08 And more and more, we’re hearing about,
0:24:10 I think the technology is a big part of it,
0:24:13 obviously, in making an open-source collaboration
0:24:16 and just other collaboration with large amounts of data,
0:24:20 easier to do across the world.
0:24:23 So I’m interested both in the specifics of BioHive
0:24:27 and then also just kind of your take on the current state of,
0:24:31 you know, science companies, biotech companies,
0:24:33 working in sort of a capitalist environment
0:24:37 where you’re trying to, you know, compete, you are competing.
0:24:38 But then at the same time,
0:24:39 given the work that you’re doing,
0:24:41 sharing is obviously quite important.
0:24:43 So maybe you can start with BioHive
0:24:46 and kind of get into just what collaboration is like
0:24:47 in the field today.
0:24:50 -Yeah, I think this new generation of tech bio companies,
0:24:54 we really do believe that we will all do better
0:24:57 if we all are collaborating at least
0:24:59 at certain stages of the process.
0:25:01 Because, you know, like at the end of the day,
0:25:03 if there’s a disease that doesn’t have a treatment,
0:25:05 we all still have work to do.
0:25:08 And so we believe a lot in investing
0:25:09 in the communities in which we work.
0:25:12 We acquired a company last year called Valence
0:25:16 that now basically is building out a community
0:25:19 for people to host, you know, different foundation models
0:25:22 and other sorts of things that are important across biology.
0:25:23 They’re hosting data sets.
0:25:27 They’re really investing in building that community.
0:25:30 And we also believe that’s true in the locations we work.
0:25:35 And BioHive is what we call the kind of ground-level
0:25:38 organization that’s helping to brand, bring together
0:25:41 and build the life science ecosystem in Utah.
0:25:43 We also do investments in our other, you know,
0:25:47 we’ve got offices in Toronto, Montreal, London,
0:25:49 and Milpitas, California.
0:25:50 And so we also make investments in those places
0:25:53 because we feel like we’re on like a 30 year…
0:25:56 I mean, we look at NVIDIA 30 years in,
0:25:59 and we feel like we’re 10 years into our 30-year journey
0:26:00 to kind of prove out this vision
0:26:03 that we think could be so impactful for the world.
0:26:07 And we know if you’re on that kind of trajectory
0:26:08 over that kind of timeline,
0:26:11 you’ve got to build community around you as you go.
0:26:13 And so these are important investments for us.
0:26:16 Open-sourcing data sets has been a critical piece
0:26:18 of how we’ve not only helped build the community
0:26:21 but also attracted talent over the years.
0:26:24 And so I think we’re going to continue pushing the industry,
0:26:26 the pharma and biotech industry,
0:26:29 to be a little bit more open to these kinds of approaches.
0:26:31 – Excellent, I love to hear that.
0:26:34 There’s data, there’s compute, there’s hardware.
0:26:37 There’s also software, the tools that make it all run.
0:26:41 I wanted to ask you about a Recursion tool called LOWE
0:26:42 that I heard a little bit about.
0:26:44 My understanding is it kind of helps orchestrate
0:26:47 the workflows and uses GenAI in some way.
0:26:48 What’s that all about?
0:26:51 – Yeah, so we’ve been building all these software tools.
0:26:53 Some of them are leveraging neural nets
0:26:54 and other sorts of things.
0:26:56 And we’ve now got dozens of these tools
0:26:58 and it’s become complicated enough
0:27:00 that if you’re a scientist at Recursion,
0:27:03 you can’t keep up with all the versions of all the tools.
0:27:05 It’s just, you look on your iPhone,
0:27:07 there’s tens of thousands of apps.
0:27:09 It’s hard to know how to use all of them.
0:27:11 Same kind of thing is happening here in science.
0:27:15 And so what we did was build an LLM and tuned it
0:27:18 to actually interact with the APIs for all of these tools
0:27:22 and to have a sense of when to use different tools
0:27:23 based on natural language.
0:27:25 So I can go in and I can say,
0:27:29 “Give me five novel targets in non-small cell lung cancer.”
0:27:31 And I just type that in and hit enter.
0:27:33 And then the LLM knows which of the software tools
0:27:36 we’ve built at Recursion can go look at our data,
0:27:39 at public data, can look for like arbitrage
0:27:41 between those data sets and it can just surface back to you
0:27:44 some insights about what targets you might want to go after.
0:27:46 And then you can say, “Design a drug
0:27:48 that would inhibit one of these targets.”
0:27:51 And it’ll use Gen AI and the protein structure
0:27:55 that AlphaFold or others have predicted for that target.
0:27:58 And then it’ll help design a molecule that can bind
0:27:59 into the binding sites of that target
0:28:01 and we would predict inhibit it.
0:28:02 And you can do all of this with natural language.
0:28:06 And then you can say, “Design and execute an experiment
0:28:08 to validate this interaction.”
0:28:12 And then it can order the chemical from our suppliers.
0:28:13 It can design an experiment
0:28:15 that we can run in our automated wet lab.
0:28:16 From a security perspective,
0:28:18 we make a human approve that experiment
0:28:19 because we want to make sure
0:28:21 that we’re not running any bad experiments
0:28:22 that could do something bad.
0:28:24 And also you don’t want to accidentally run
0:28:28 like a $6 million experiment because nobody approved it.
0:28:29 So we have a human in the loop.
0:28:31 But if somebody approves that, you can go run the experiment.
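The workflow described here, natural-language requests routed to internal tools with a human approving anything costly, can be sketched roughly as follows. Everything in this example is hypothetical: the tool names are invented, and simple keyword matching stands in for the LLM's tool selection, which in practice would be far more capable.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Tool:
    name: str
    keywords: tuple               # crude stand-in for an LLM's tool selection
    run: Callable[[str], str]
    needs_approval: bool = False  # True for actions that cost money or touch the lab


# Hypothetical tools; none of these names come from Recursion's software.
TOOLS = [
    Tool("find_targets", ("novel targets",), lambda q: f"candidate targets for: {q}"),
    Tool("design_molecule", ("design a drug",), lambda q: f"molecule proposal for: {q}"),
    Tool("run_experiment", ("execute an experiment",), lambda q: f"experiment queued: {q}",
         needs_approval=True),
]


def handle(request: str, approver: Callable[[Tool, str], bool]) -> str:
    """Route a natural-language request to a tool, gating costly actions on a human."""
    text = request.lower()
    for tool in TOOLS:
        if any(keyword in text for keyword in tool.keywords):
            if tool.needs_approval and not approver(tool, request):
                return f"{tool.name}: blocked, awaiting human approval"
            return tool.run(request)
    return "no matching tool"


deny = lambda tool, request: False     # a reviewer who has not signed off
approve = lambda tool, request: True   # a reviewer who has signed off

print(handle("Give me five novel targets in non-small cell lung cancer", deny))
print(handle("Design and execute an experiment to validate this interaction", deny))
print(handle("Design and execute an experiment to validate this interaction", approve))
```

Keeping the approval check inside the router, rather than inside each tool, means a newly added tool only has to declare `needs_approval` to be covered by the same gate.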
0:28:34 And so this for us is, I think this is a lot like
0:28:37 the late ’70s and early ’80s in personal computing
0:28:39 where you moved from like the Apple I,
0:28:41 where you had to be this expert user,
0:28:43 you had to know how to solder and all this stuff,
0:28:46 to then with the Lisa, the GUI,
0:28:47 you could actually start
0:28:49 to have this democratization of these tools.
0:28:51 And we think the same thing is happening now,
0:28:53 but instead of a graphical user interface,
0:28:56 it’s like a discovery user interface.
0:28:58 And LOWE is our take on this.
0:29:00 There’s a big pharma company, GSK,
0:29:02 that’s building one of these.
0:29:04 There’s a couple other startups building these.
0:29:06 But ultimately we think these kinds of tools
0:29:09 are gonna mean that even if you don’t have 30 years
0:29:12 of experience in chemistry in the biopharma industry,
0:29:13 you’re like fresh out of school,
0:29:16 you’ll still be able to make bigger contributions faster
0:29:17 with this kind of approach.
0:29:18 That’s fantastic.
0:29:21 And you kind of just spoke to my next question,
0:29:23 but I’ll ask it anyway.
0:29:25 Jensen said something in,
0:29:28 recently it was a couple of months ago, I think now,
0:29:30 in an interview where somebody asked about,
0:29:32 the future of the computer science field.
0:29:35 And he said something along the lines of,
0:29:39 my advice is go study a field
0:29:40 that you’re interested in,
0:29:42 develop your domain expertise,
0:29:45 because we’re already on this path
0:29:48 and we’re getting to a spot where you’re not,
0:29:51 and I’m paraphrasing, he didn’t use these exact words,
0:29:53 you’re not necessarily gonna have to learn
0:29:57 to write Python scripts or learn R to wrangle data.
0:29:59 You’re gonna use natural language prompts.
0:30:01 And then your domain expertise
0:30:03 is really what’s gonna become valuable
0:30:07 ’cause you’ll be able to work with the AI systems,
0:30:08 vet their output, et cetera, et cetera.
0:30:12 It sounds like that’s kind of where Recursion’s at
0:30:14 to some extent anyway.
0:30:16 What’s your take kind of on that trajectory
0:30:19 and maybe just expound a little on what you just said
0:30:23 about even new graduates being able to contribute more
0:30:26 to the field because of the natural language prompting?
0:30:27 Yeah, I know, I think he’s exactly right.
0:30:29 I think he was on stage with us
0:30:31 at an event we hosted in January
0:30:32 where he said something similar.
0:30:33 Okay, yeah.
0:30:35 And what we talked about was how important
0:30:37 a classical education will be
0:30:39 in learning how to interpret problems,
0:30:40 identify the right problem,
0:30:43 and then how to ask and answer questions
0:30:44 about that problem.
0:30:46 That’s what’ll matter, it won’t be coding
0:30:47 because everything will be natural language.
0:30:48 So I agree with him.
0:30:50 And so what we’re looking for now
0:30:51 are people who are really good
0:30:53 at operating at the interface.
0:30:56 We actually don’t need somebody that’s memorized
0:30:58 the entire molecular cellular biology textbook.
0:31:00 Just like we don’t need doctors anymore
0:31:04 to memorize every single possible disease,
0:31:05 we are gonna have these tools
0:31:07 that mean you can type in,
0:31:08 here’s the symptoms the patient has
0:31:10 or here’s what I’m seeing in the data,
0:31:11 and then those tools are gonna help
0:31:13 pull out all of that deep information
0:31:15 that nobody should have to memorize.
0:31:16 And what’s gonna be critical is somebody
0:31:18 who can take that information
0:31:19 and say, okay, what do I do with this?
0:31:20 Like what’s the next step?
0:31:22 What’s the killer experiment that I can go run?
0:31:24 Or like, how would I treat this patient?
0:31:26 And so yeah, we’re gonna move to a place
0:31:27 where people who are good
0:31:30 at integrating lots of different ideas and data,
0:31:33 those are the people that we’re looking for.
0:31:36 We’re looking for not just the biology PhD,
0:31:38 but the biology PhD who loves using
0:31:39 all the different new AI tools
0:31:41 or the computational biologists.
0:31:43 People who are really working at those interfaces.
0:31:44 – Right, right.
0:31:46 So I think you mentioned at the top,
0:31:49 Recursion has created some drugs that are in trials now.
0:31:52 – Yeah, we’ve got five programs that are in clinical trials
0:31:54 and a few more on the way to the clinic as well.
0:31:57 – Okay, so I’m asking that kind of as context
0:32:01 for what the future holds, both for Recursion
0:32:03 and then also sort of for the industry
0:32:05 for the field more broadly.
0:32:07 Is the plan on Recursion’s end
0:32:10 to keep developing more drugs,
0:32:12 getting more actual solutions to put it that way
0:32:14 out into trials?
0:32:16 What’s sort of the, I don’t know,
0:32:19 the MO for Recursion over the coming few years?
0:32:20 – It’s a great question.
0:32:23 I mean, at the end of the day, what matters
0:32:25 is you get a medicine to a patient
0:32:26 and it makes the patient healthy again, right?
0:32:28 So somebody’s gotta develop a product.
0:32:30 When we started Recursion,
0:32:31 we believed we would build all these tools
0:32:34 to help people identify what products to build
0:32:36 and that they would then go build the products.
0:32:38 What we were surprised by, what we got wrong
0:32:40 was the reticence of this industry
0:32:42 toward these new technology tools.
0:32:43 And I think it’s a combination
0:32:47 of the regulatory environment with the FDA and EMA.
0:32:49 It’s a combination of that with just the general
0:32:51 conservative nature of an industry that spends,
0:32:56 it’s $2.6 billion of invested R&D
0:32:59 per new drug approval every year in our industry.
0:33:02 So like, and remember, that cost is that high
0:33:05 because 90% of the drugs that people take into the clinic fail.
0:33:06 So like, it doesn’t cost that much
0:33:08 to develop the drug that succeeds.
0:33:11 It’s just that you’ve gotta, you know,
0:33:12 spend all the money on all the failures.
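The point that failures dominate the cost can be made concrete with a deliberately crude model. The sketch assumes every candidate entering the clinic costs the same and succeeds independently with the quoted 10 percent rate; real cost estimates also include preclinical work and cost of capital, so the per-candidate figure below is illustrative only, and the function names are hypothetical.

```python
def cost_per_approval(cost_per_candidate: float, success_rate: float) -> float:
    """Expected spend per approved drug when every candidate costs the same."""
    return cost_per_candidate / success_rate


def implied_cost_per_candidate(cost_per_approval_usd: float, success_rate: float) -> float:
    """Invert the model: average spend per candidate implied by a per-approval cost."""
    return cost_per_approval_usd * success_rate


QUOTED_COST_PER_APPROVAL = 2.6e9   # USD, figure quoted in the conversation
QUOTED_SUCCESS_RATE = 0.10         # 90 percent of clinical candidates fail

per_candidate = implied_cost_per_candidate(QUOTED_COST_PER_APPROVAL, QUOTED_SUCCESS_RATE)
print(f"implied average spend per candidate: ${per_candidate / 1e6:,.0f} million")

# Under the same model, doubling the success rate would halve the cost per approval.
print(f"cost per approval at 20% success: "
      f"${cost_per_approval(per_candidate, 0.20) / 1e9:.1f} billion")
```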
0:33:14 And so the industry’s been conservative
0:33:15 and we kind of felt like
0:33:18 they’re not really taking this up as fast as we would hope.
0:33:21 Do we want to let these drugs we believe in die?
0:33:22 – Right.
0:33:23 – Or do we want to let them see the light of day?
0:33:25 And so ultimately, the biggest pivot we’ve made
0:33:27 as a company is we now have our own drugs
0:33:28 and that’s required more capital.
0:33:30 We’ve had to build new parts of our team
0:33:32 and kind of build the culture in that direction
0:33:35 towards clinical development and interacting with the FDA.
0:33:38 But sometimes you got to go vertical, right?
0:33:39 Sometimes you got to go vertical
0:33:41 if you really believe in something.
0:33:44 And, you know, a great example of this is like Tesla
0:33:46 building the supercharging network.
0:33:49 Like, I guarantee you they didn’t want to build a supercharging network.
0:33:51 They’re like, of course gas stations are gonna see
0:33:52 in the future that people will use electric cars
0:33:54 and they’ll put in superchargers.
0:33:56 No, nobody was ready for this.
0:33:58 So they had to build it, and that kind of sucked, I’m sure.
0:34:00 We’ve got to build it too.
0:34:01 But at the end of the day,
0:34:05 if we can get medicines to patients, we’ll be happy.
0:34:07 – So before we wrap up here, Chris,
0:34:09 you mentioned, you know, a couple of times
0:34:11 and just now sort of the shift in the field
0:34:14 and kind of the reticence, you know, dating back
0:34:16 to when you were first doing your graduate work,
0:34:18 even up until now and the reason you had to go vertical
0:34:21 and start actually producing the drugs yourself.
0:34:23 People are resistant to change.
0:34:26 Technology takes time to get used to sort of culturally,
0:34:30 globally we’re almost certainly in the opening stages
0:34:34 of this big industrial revolution, you know,
0:34:36 spurred by automation and AI and everything.
0:34:39 Along that sort of cultural front,
0:34:42 have there been kind of big, I mean, is that a big thing?
0:34:46 Is that kind of the, you know, the data, the compute,
0:34:48 all of that stuff, but kind of culturally are you,
0:34:51 do you feel like you’re building kind of a different way
0:34:54 to do things and are there bumps in the road along,
0:34:55 you know, that come with that?
0:34:58 – Yeah, absolutely, I would say that the data,
0:34:59 the compute, those are moats for us,
0:35:01 but our biggest moat is our culture.
0:35:04 And we’ve had to take software engineers, data scientists,
0:35:06 biologists, chemists, drug developers
0:35:09 and supporting functions and have all of these different
0:35:13 folks working in these deep technical domains come together
0:35:15 and it’s basically meant that we’ve had to create
0:35:17 a new language, a new language where all of these people
0:35:20 can build in the same environment,
0:35:22 both literally and figuratively.
0:35:25 And so I think that we had a C level exec
0:35:27 from a big pharma company here a few years ago.
0:35:29 And on his way out the door, he said,
0:35:31 we could never replicate this culture.
0:35:33 That’s your biggest competitive advantage.
0:35:35 Everything else with enough money we could do,
0:35:36 but that we couldn’t do.
0:35:38 And I think that’s probably true.
0:35:41 And it’s, this is why sometimes those up and coming
0:35:44 young companies end up out-innovating, you know,
0:35:46 some of the bigger ones is that it’s about the culture
0:35:47 as much as anything else.
0:35:48 – No, absolutely.
0:35:50 How big is Recursion now, how many people?
0:35:52 – We’re about 550 people now.
0:35:54 – Wow, amazing.
0:35:57 I mean, it’s relatively small to what you’re doing.
0:35:59 I mean, maybe it’s very, very small to what you’re doing,
0:36:01 but that’s just amazing.
0:36:02 For folks who would like to learn more,
0:36:04 we covered a lot of ground, we could do an hour
0:36:07 just on how you built BioHive 2, obviously.
0:36:10 But for folks who want more of the Recursion story,
0:36:13 want to dive into some more of what you’re working
0:36:16 on the specifics, maybe even are looking for, you know,
0:36:18 a new role, a great culture to join.
0:36:20 Where can they go to find out more?
0:36:21 – Yeah, go to recursion.com.
0:36:23 You can find out about careers.
0:36:25 You can read all of our papers.
0:36:29 And you can also go to rxrx.ai,
0:36:32 which is where we have lots of large data sets.
0:36:33 If you’re a data scientist and you want to play
0:36:36 with some biology, you can go download them there.
0:36:37 – Fantastic.
0:36:40 Chris Gibson, this was a pleasure.
0:36:41 I learned a lot that’s going to leave me
0:36:43 with many more questions as I process it.
0:36:46 So we may have to do this again,
0:36:48 but appreciate you taking the time to come on the podcast,
0:36:49 tell everybody about the work you’re doing.
0:36:53 And, you know, it feels like an understatement
0:36:56 and a cliche to say, but you’re working on solutions
0:36:58 that can literally change people’s lives
0:36:59 through better health.
0:37:02 So all the best of luck to you and your teams.
0:37:04 – Thanks Noah, really appreciate it.
0:37:06 (upbeat music)
Techbio is a field combining data, technology and biology to enhance scientific processes — and AI has the potential to supercharge the biopharmaceutical industry further. In this episode of NVIDIA’s AI Podcast, host Noah Kravitz speaks with Chris Gibson, cofounder and CEO of Recursion, about how the company uses AI and machine learning to accelerate drug discovery and development at scale. Tune in to hear Gibson discuss how AI is transforming the biopharmaceutical industry by increasing efficiency and lowering discovery costs.