Author: The AI Podcast

  • How Georgia Tech’s AI Makerspace Is Preparing the Future Workforce for AI – Ep. 229

    AI transcript
    0:00:10 [MUSIC]
    0:00:13 >> Hello, and welcome to the NVIDIA AI podcast.
    0:00:15 I’m your host, Noah Kravitz.
    0:00:21 Georgia Tech has unveiled a new AI maker space built in collaboration with NVIDIA.
    0:00:23 Housed in Georgia Tech’s College of Engineering,
    0:00:29 the artificial intelligence supercomputer hub is dedicated exclusively to teaching students,
    0:00:31 initially focusing on undergraduates.
    0:00:35 Combining massive compute with classwork and other educational resources,
    0:00:39 the maker space is designed as a hands-on sandbox to give students
    0:00:43 experience with AI and better position them for life after graduation.
    0:00:47 Here to tell us more about how Georgia Tech is reimagining the present and future of
    0:00:51 higher education in the AI era is Arijit Raychowdhury,
    0:00:53 professor and Steve W. Chaddick
    0:00:58 School Chair of Electrical and Computer Engineering at Georgia Tech’s College of Engineering.
    0:01:02 Arijit, thank you for joining the NVIDIA AI podcast, and welcome.
    0:01:05 >> Thank you so much, Noah, and thanks a lot for having me.
    0:01:06 >> So this is really exciting.
    0:01:10 There are some great blogs and a great video with a tour of
    0:01:14 the maker space on the Georgia Tech site that I’d encourage everybody to check out.
    0:01:15 But let’s hear it direct.
    0:01:17 Tell us all about the maker space.
    0:01:18 >> Yeah, absolutely.
    0:01:21 I mean, this is a very exciting new maker space in the College of Engineering,
    0:01:23 which I am expecting will have
    0:01:26 a very large impact on the entire community of students in Georgia Tech.
    0:01:29 As you know, we have a very strong engineering program,
    0:01:31 one of the largest in the country,
    0:01:34 and over the last many decades,
    0:01:37 I should say we have been teaching students about AI and
    0:01:42 ML and principles of machine learning and data processing.
    0:01:44 As you all have seen in the last few years,
    0:01:46 there have been several inflection points,
    0:01:49 the first one being deep learning and second,
    0:01:51 I’d say in the last couple of years, more with generative AI.
    0:01:53 Some of the problems and some of
    0:01:57 the engineering tasks that you can do with AI have exploded.
    0:01:57 >> Sure.
    0:01:59 >> Of course, our curriculum,
    0:02:02 our teaching requirements and what we teach our students have
    0:02:05 also morphed and changed following that trajectory.
    0:02:06 One of the things that we have been noticing
    0:02:09 of late is that the amount of data and
    0:02:12 the amount of computational resources that you need to be able to
    0:02:17 solve at scale realistic problems in AI have also grown quite significantly.
    0:02:22 Although we were using online tools and open source hardware,
    0:02:25 or even limited amounts of hardware to train
    0:02:27 students and teach them in classes and so on,
    0:02:30 we found that that was one of the things which was missing,
    0:02:32 that we would not necessarily have
    0:02:36 the resources that the students need to be able to solve real-life problems.
    0:02:41 Georgia Tech always had this notion that we want to augment
    0:02:45 our theoretical education with hands-on practical training.
    0:02:49 All the schools in the College of Engineering have their own maker spaces.
    0:02:52 ECE, which is my school, we have a maker space,
    0:02:55 and that maker space is the electrical engineering maker space,
    0:02:59 where you can find anything from hammers and screwdrivers to oscilloscopes,
    0:03:04 to very high end equipment for doing electrical experiments.
    0:03:07 These are tools of the trade.
    0:03:09 These are tools of engineering.
    0:03:11 If you ask me what’s the tool of AI,
    0:03:13 that’s computing, that’s computational resources.
    0:03:18 We found that we are at a point where we really needed to step up
    0:03:23 our game for the practical hands-on experience for the students in the AI space.
    0:03:26 We needed to provide them with a digital sandbox,
    0:03:29 I’d say, which would allow students to learn in the classroom setting,
    0:03:32 for senior design, for capstone projects,
    0:03:35 or even just do independent study.
    0:03:38 Like if they want to study something on their own,
    0:03:42 or even pursue some entrepreneurial adventures or ventures,
    0:03:46 they would be able to have the resources to be able to do that and pursue that.
    0:03:48 That was the genesis of the AI maker space,
    0:03:50 and like any other maker space,
    0:03:53 this is dedicated purely to students for student use,
    0:03:59 and we were super excited to partner with NVIDIA as well as Penguin Solutions
    0:04:08 to bring this to life and it’s a huge computational resource of data center,
    0:04:12 AI purpose built for AI that the students have access to,
    0:04:19 and at this point we are ramping up on the usage as well as deploying it across the campus.
    0:04:26 This is the latest addition to our series of physical maker spaces that we have,
    0:04:30 and this is the first virtual maker space that we have.
    0:04:32 >> Fantastic. As we record this,
    0:04:36 I don’t know what academic calendar Georgia Tech goes on.
    0:04:41 It’s early May as we record this and so lots of schools are winding down.
    0:04:43 You said that the maker space is ramping up,
    0:04:48 so it’s open now, when did it open and what’s that ramp-up schedule?
    0:04:50 >> Sure. As you know,
    0:04:58 getting something of this size and magnitude and this order of complexity up and running is a long process.
    0:05:00 We started working with NVIDIA almost a year back,
    0:05:03 on the GPUs and the switches and all that.
    0:05:06 Even reference designs are hard,
    0:05:10 so we needed the expertise of NVIDIA and Penguin Solutions working with us.
    0:05:13 We have an existing supercomputer cluster,
    0:05:15 but it’s mostly dedicated to CPUs.
    0:05:20 The team worked with NVIDIA and Penguin to work on reference designs.
    0:05:22 All of that happened late last year.
    0:05:27 I would say in September, October, November of last year, we placed orders.
    0:05:32 The GPUs and everything started coming to us in December, right before the holidays,
    0:05:34 and from the beginning of this semester,
    0:05:36 which was end of January,
    0:05:39 we started slowly putting things together,
    0:05:41 and about a month or so back,
    0:05:43 this became functional.
    0:05:50 The first phase of the makerspace is now functional in what is to be a three-phase design.
    0:05:56 In this first phase we have essentially 160 H100 GPUs.
    0:05:58 >> You read my mind, I was going to say,
    0:06:01 we don’t always get into the speeds and feeds on this show, so to speak,
    0:06:03 but I mean, we have to in this case.
    0:06:06 Tell us what’s in the space.
    0:06:11 >> Yeah, so it’s essentially 20 HGX boxes, a total of 160 GPUs.
    0:06:13 I forget the memory specification.
    0:06:16 I think it’s about two terabytes per node,
    0:06:18 and then this is the first phase of this.
    0:06:21 Then this is already fully up and running.
    0:06:24 There is one class which is using it already,
    0:06:27 and then there are a bunch of other classes which are,
    0:06:29 the projects on those classes are kind of incorporating them.
    0:06:34 We haven’t opened it out completely to students yet for general use.
    0:06:36 We plan to do that by the end of this year.
    0:06:41 We are now ramping up on the orders for the phase two.
    0:06:43 So by fall of this year,
    0:06:45 I would say maybe the middle of fall,
    0:06:48 we would have the second phase of the GPUs here,
    0:06:53 which would be the plan is to have H200s at that point and a very similar number.
    0:06:54 By the end of this year,
    0:06:57 we should have the first phase one and phase two of
    0:06:59 the AI makerspace all set up and deployed,
    0:07:05 which will essentially be opened up to all 50,000 graduate and
    0:07:07 undergraduate students in Georgia Tech, so all the students.
    0:07:08 >> That’s amazing. I’m having
    0:07:11 flashbacks listening to you to my own undergraduate experience.
    0:07:14 To be fair, I was a liberal arts student,
    0:07:15 not an engineering student,
    0:07:20 but I would go to the VAX terminals in the lab to do my problem sets,
    0:07:22 and a little bit different of experience.
    0:07:26 One of the things that struck me that I think is great,
    0:07:27 that was mentioned in some of the literature,
    0:07:30 is that typically you see this type of
    0:07:33 compute reserved for research projects,
    0:07:35 or at least the preference given to researchers,
    0:07:37 which is obviously important.
    0:07:42 The Georgia Tech makerspace is focused on undergrad students.
    0:07:45 What went into that decision and what’s the larger thinking
    0:07:50 around the role, or to use the phrase, the intersection of
    0:07:52 higher ed, undergraduate education,
    0:07:54 and this explosion,
    0:07:56 these inflection points with AI that you described?
    0:07:59 >> Yeah. That’s a great question.
    0:08:00 I think that goes back to some of
    0:08:03 the things that we were already doing in the research space.
    0:08:05 We have computational resources for research.
    0:08:09 We have a lot of researchers working in the AI space,
    0:08:11 not only in the theoretical aspects of AI,
    0:08:12 but the applications of AI.
    0:08:15 We have lots of research going on with NVIDIA as well,
    0:08:16 on the hardware designed for
    0:08:18 the next generation of AI chips, for example.
    0:08:21 We have computational resources that the faculty have,
    0:08:24 and the graduate students and research faculty,
    0:08:25 they have access to.
    0:08:29 The intent for the AI makerspace was to democratize AI.
    0:08:31 What we see today is what
    0:08:33 computing was probably 20-25 years back.
    0:08:36 At that point in time, everybody who started getting into
    0:08:38 college started to realize that they need to
    0:08:40 know a little bit about computing and need to
    0:08:42 do a little bit of programming.
    0:08:43 For example, if you look at
    0:08:46 our university system of Georgia today,
    0:08:48 in Georgia Tech, if you do any major,
    0:08:50 it doesn’t matter what major you are in.
    0:08:51 It can be law, it can be design,
    0:08:53 it can be humanities.
    0:08:55 You have to take a programming course,
    0:08:57 because you need to do programming,
    0:08:58 because it’s a way of thinking.
    0:09:00 It’s a logical way of thinking.
    0:09:00 What we feel is that AI is at that point
    0:09:07 in our trajectory of human evolution at this point in time.
    0:09:10 Everybody needs to know a little bit about AI.
    0:09:13 Either they would be pushing the boundaries and envelopes of AI,
    0:09:16 then they would be inventing the next new models,
    0:09:18 the next new data structures,
    0:09:20 and the databases.
    0:09:23 They will be the ones who would push the frontiers of AI,
    0:09:26 and then there will also be a large portion of
    0:09:27 our student population when they grow up,
    0:09:29 they pursue their careers,
    0:09:31 which would not be directly related to AI,
    0:09:33 but they will be using AI as a tool.
    0:09:35 Whether you’re doing creative writing,
    0:09:36 whether you’re doing creative design,
    0:09:39 you will be using some form of AI.
    0:09:41 What we wanted to do was to make sure that all our students,
    0:09:43 no matter what their discipline is,
    0:09:45 have a low barrier to AI,
    0:09:49 and not only the theoretical understanding that they need,
    0:09:51 but also practical hands on experience,
    0:09:55 on how to use AI for their particular degree program,
    0:09:57 whatever their major or minor is.
    0:10:01 This is a larger initiative within the College of Engineering,
    0:10:03 and we are working with other colleges as well,
    0:10:05 where we have now started a new AI minor,
    0:10:08 like for students to do more courses in AI.
    0:10:13 We have more and more courses getting retrofitted with AI content.
    0:10:18 We are using AI for a lot of ECE courses,
    0:10:19 like engineering courses as well,
    0:10:21 where we are using AI for students
    0:10:26 not only to use AI as a means to extract intelligence from data,
    0:10:30 but also using practical AI for some of the signal processing classes.
    0:10:34 Some of them are using things like conversational AI,
    0:10:37 like large language models to design high-level synthesis programs.
    0:10:39 They’re using it for programming, like a copilot.
    0:10:41 These are all things that are already happening
    0:10:44 in a very natural, organic way.
    0:10:47 The AI makerspace is one of those additions
    0:10:50 to that overall broader effort,
    0:10:52 where we are essentially trying to provide access.
    0:10:54 That was the motivation,
    0:10:59 that AI is not something which is only dedicated to research,
    0:11:02 to graduate students who have an understanding of what the systems are,
    0:11:04 when trying to push the research boundaries,
    0:11:06 but we are trying to make sure that AI is a tool
    0:11:10 that can be used by anyone and everyone who comes to Georgia Tech
    0:11:12 and has access to computational resources
    0:11:13 that are super critical at the moment.
    0:11:16 -Absolutely. -That’s the kind of the genesis
    0:11:19 of our initial thought process.
    0:11:25 I want to ask you in a little bit about the future of higher education,
    0:11:27 but also what happens after graduation,
    0:11:31 and this notion that I think you’re more than hinting at,
    0:11:33 but I don’t want to put words in your mouth,
    0:11:36 but sort of preparing students to be AI-native and AI-ready,
    0:11:41 and as you said, it’s, again, much like when I was coming out of college,
    0:11:44 the internet was just starting too, right?
    0:11:47 And now it’s, you know, there aren’t many jobs
    0:11:49 where you’re not using the internet at least some way,
    0:11:51 so there’s a similar phenomenon I think happening.
    0:11:53 But I want to ask you about the faculty.
    0:11:56 Was there upskilling, or is there, I should say,
    0:11:59 upskilling involved in the faculty?
    0:12:01 You know, I would imagine in the College of Engineering,
    0:12:04 you might have folks who are already using some of these tools
    0:12:06 to advance their own research,
    0:12:08 because it’s kind of been in that domain for a while,
    0:12:13 but what’s the faculty sort of reaction and enthusiasm been like?
    0:12:15 Yeah, I mean, that’s a great question.
    0:12:19 And I think, like, you know, we have a large faculty body,
    0:12:23 which means, you know, we have very good mini-cosmos of society itself,
    0:12:25 so we have an entire spectrum, right?
    0:12:30 But as an institute, we have sort of embraced AI in all possible ways.
    0:12:33 So I think there is, at least from an institutional policy,
    0:12:38 or, you know, on our people’s intent on how they want to use AI,
    0:12:39 there is no controversy.
    0:12:40 You know, we are not saying that, okay,
    0:12:43 AI is something that you cannot use.
    0:12:46 I think Georgia Tech was one of the first schools that started telling students
    0:12:50 that you can use AI for writing your college essay,
    0:12:53 as long as you know how you’re using it, and as long as you’re using it right.
    0:12:56 So I think, you know, should we tell our students not to use AI
    0:13:01 because that opens up, you know, opens students up to kind of things like cheating and all?
    0:13:03 I don’t think that’s the right argument.
    0:13:06 I think the right argument should be, how do you use AI better?
    0:13:07 Like, how do you write prompts better
    0:13:10 if you are using conversational language models?
    0:13:14 So I think that’s why we need to embrace and teach students how to use AI better.
    0:13:18 And I think as an institute and as the faculty body, we are all on the same page.
    0:13:19 That’s where we need to be.
    0:13:22 Now, how we get there, you know, at what speed we get there
    0:13:25 and what are the tools that we use to train ourselves?
    0:13:28 You know, it varies from discipline to discipline.
    0:13:31 So as I can say, you know, electrical and computer engineering,
    0:13:35 you know, we have some of the people who are actually at the forefront of AI research itself
    0:13:37 and it’s kind of a natural thing for them.
    0:13:40 They have been teaching these courses in computer vision and conversational language
    0:13:42 for many years.
    0:13:45 So those are kind of, you know, they are the pioneers in the field.
    0:13:49 So it’s very easy for them to incorporate them in the classroom setting as well.
    0:13:53 Those of us who are not exactly in that domain, you know, are, I think,
    0:13:58 very well calibrated with what’s going on and how our fields are getting impacted by that.
    0:14:02 So we are trying to use, you know, AI for some of these courses as well.
    0:14:06 Even, I would say, we have courses on technical writing and we have courses
    0:14:09 in electrical engineering on professional, you know, how do you,
    0:14:13 how do you write professional essays or technical essays and stuff like that.
    0:14:18 And we are using, you know, the instructors in those classes are also using language models now
    0:14:21 to teach students how to use it better.
    0:14:24 They are learning the process and they are working with, you know, companies,
    0:14:27 including NVIDIA to kind of learn how the containers would work
    0:14:29 and how their course can fit into that.
    0:14:33 So some of the, you know, tools and software that we are seeing an emergence of new companies
    0:14:37 in the startups in that domain on the intersection of AI and education,
    0:14:39 which are also, you know, our partners.
    0:14:43 So we are working with a whole bunch of people who are learning and teaching at the same time.
    0:14:47 And I think that’s a very interesting and fun place to be in.
    0:14:52 Absolutely. And you mentioned that Georgia Tech now has, or has unveiled,
    0:14:56 I don’t know if it’s available yet, but their first minor degree program in AI
    0:14:58 and machine learning, I believe.
    0:15:03 And then there’s also the creation and kind of reimagining of a dozen or so
    0:15:04 undergraduate courses.
    0:15:06 Yeah, are those new courses?
    0:15:09 Can you talk a little bit about some of those new courses
    0:15:12 and maybe how, you know, AI is at the core of them?
    0:15:13 Yeah, absolutely. Absolutely.
    0:15:19 I think there are some courses which, you know, sort of had AI already in them,
    0:15:21 but now with the availability of the AI maker space,
    0:15:25 the kind of projects and kind of hands-on work that the students can do
    0:15:27 are kind of, you know, have expanded.
    0:15:31 So the students would be able to do things like, you know, segmentation models.
    0:15:35 So just to give you an example, in a computer vision class,
    0:15:37 they would be doing segmentation models on static images.
    0:15:39 They’ll click on a particular point and they’ll do segmentation.
    0:15:43 They understand how, you know, a Segment Anything kind of model would work,
    0:15:47 but you would not know how to do segmentation on a real-life video stream
    0:15:50 because you would not have the computation resources to do that.
    0:15:54 Now, you know, as we move from an open source platform to the AI maker space,
    0:15:57 the students have taken that particular project in that particular class
    0:16:00 from a static, you know, click here and do segmentation.
    0:16:02 This is the algorithm to building an actual system
    0:16:06 where they are taking in, you know, video streams from a vehicle
    0:16:09 and processing it on the fly, on the AI maker space and doing segmentation
    0:16:11 and you are trying to do navigation and all of that.
    0:16:14 So you can see the complexity of the projects
    0:16:16 that the students can do have kind of exploded.
    0:16:17 That’s one.
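    To make that jump concrete, here is a minimal illustrative sketch, not the course's actual code, of running a stock pretrained segmentation model over a video stream frame by frame. It assumes PyTorch, torchvision, and OpenCV are available, and the "dashcam.mp4" file and DeepLabV3 backbone are stand-ins for whatever the class really uses.

    ```python
    # Illustrative only: a generic per-frame segmentation loop, not Georgia Tech's code.
    # Assumes PyTorch, torchvision >= 0.13, and OpenCV; "dashcam.mp4" is a placeholder.
    import cv2
    import torch
    from torchvision.models.segmentation import deeplabv3_resnet50, DeepLabV3_ResNet50_Weights
    from torchvision.transforms.functional import normalize, to_tensor

    device = "cuda" if torch.cuda.is_available() else "cpu"
    model = deeplabv3_resnet50(weights=DeepLabV3_ResNet50_Weights.DEFAULT).eval().to(device)

    cap = cv2.VideoCapture("dashcam.mp4")  # hypothetical vehicle video stream
    while cap.isOpened():
        ok, frame = cap.read()
        if not ok:
            break
        rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
        x = normalize(to_tensor(rgb), mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
        with torch.no_grad():
            logits = model(x.unsqueeze(0).to(device))["out"]   # (1, num_classes, H, W)
        mask = logits.argmax(dim=1).squeeze(0).cpu().numpy()   # per-pixel class labels
        # ...hand `mask` off to downstream logic (e.g., navigation) here...
    cap.release()
    ```

    The point is only that, with enough GPU throughput behind it, the same segmentation call a student once applied to a single static image can sit inside a real-time loop over a stream.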
    0:16:20 And then there are courses which did not have AI.
    0:16:21 Like, I’ll give you an example.
    0:16:26 I teach a course in VLSI because that’s my area in circuits and VLSI design.
    0:16:31 We, you know, we build the hardware for AI research and AI work,
    0:16:36 but we do not necessarily use AI a lot in designing of chips, for example.
    0:16:37 But that is changing. But that is changing.
    0:16:39 If you look at the latest tools, they are using AI
    0:16:42 for doing in a floor planning placement, that kind of stuff.
    0:16:43 So in the last time I was teaching this course,
    0:16:46 we have started discussing, you know, how AI would, you know,
    0:16:48 impact some of the tools and the flows,
    0:16:53 how you can potentially use a language model to write Verilog code, for example.
    0:16:57 So these are things that we are kind of preparing the students.
    0:16:59 That, you know, when you go and work in the industry,
    0:17:01 you will be faced with a new reality where many of these tools
    0:17:04 will have some AI component and you can use AI better.
    0:17:06 And then there are, I would say,
    0:17:10 and then the third category would be we are introducing new courses.
    0:17:12 And one of the things that I am very excited about
    0:17:15 is we are trying to, again, you know, most of the AI-related
    0:17:18 or machine learning-related courses at Georgia Tech
    0:17:22 were at the 3,000 or 4,000 level, which means juniors and seniors.
    0:17:24 – Okay. – And then, of course, graduate students.
    0:17:26 But now I think from fall of this year,
    0:17:30 ECE is going to introduce a new course at the 2,000 level,
    0:17:33 which is like AI for everybody, like AI for all students.
    0:17:36 This would be like for, you know, sophomores or even freshmen
    0:17:38 would be able to take the courses.
    0:17:39 And the idea is to kind of, you know,
    0:17:42 take the students who are coming in as teenagers
    0:17:45 or, you know, right out of high school
    0:17:48 and give them some exposure to what AI means,
    0:17:50 you know, what are the mathematical principles of AI.
    0:17:52 It’s not magic, it’s something very structured.
    0:17:55 Maybe there is some, there is an element of black box in there,
    0:17:58 but, you know, this is how we write software for AI.
    0:18:00 This is how you can use AI for your curriculum.
    0:18:03 So getting them exposed to the principles of AI right on
    0:18:06 at the very beginning of their journey
    0:18:09 so that they are better prepared on using AI for whatever tasks
    0:18:12 or, you know, courses that they eventually take.
    0:18:15 So I say all these categories, you know, projects,
    0:18:18 the courses that have AI are becoming more, I’d say, hands-on
    0:18:20 because of the AI makerspace.
    0:18:22 Classes that did not have AI are now using AI
    0:18:25 because, you know, that’s what the, that’s the,
    0:18:27 that’s the direction in which we are moving anyways.
    0:18:29 And then we are also introducing new classes,
    0:18:31 particularly at a younger, for younger students,
    0:18:35 just to get exposed to AI and start using AI as early as they can.
    0:18:36 Fantastic.
    0:18:38 I’m speaking with Arijit Raychowdhury.
    0:18:41 Arijit is professor and Steve W. Chaddick School
    0:18:44 Chair of Electrical and Computer Engineering
    0:18:47 at Georgia Tech’s College of Engineering.
    0:18:51 And we were talking about Georgia Tech’s new AI makerspace,
    0:18:55 which is open in phase one and continuing to ramp up.
    0:18:58 They’ve got a whole, just a lot of compute in there.
    0:19:03 It’s a data center on campus for students to go hands-on.
    0:19:07 And as Arijit was saying, to be able to go from being able to
    0:19:11 do segmentation frame by frame to working on video in real time.
    0:19:14 There was a great quote in one of the materials I was reading
    0:19:18 in prep that talked about it would take one of these nodes,
    0:19:21 I think a second to come up with the question that it would take
    0:19:25 your students decades to answer, which puts things in perspective.
    0:19:28 I want to shift gears a little bit, or maybe it’s not shifting
    0:19:32 gears so much as accelerating to what happens after graduation.
    0:19:34 And the topic of preparing students,
    0:19:38 preparing people for what’s probably going to be a very new
    0:19:44 kind of workforce or at least one that the day-to-day work
    0:19:49 may be quite different from what you and I grew up on.
    0:19:52 A lot of opinions flying around over the past couple of years.
    0:19:55 And I think this is a little different because you’re talking about,
    0:19:59 and I love that you were talking about the AI for everybody class,
    0:20:02 the 2000 level class, where students get exposed.
    0:20:04 It’s not just how do you use it,
    0:20:07 but what are the mathematical principles underneath it?
    0:20:09 Where did this stuff come from?
    0:20:12 And as sort of a scientist, so to speak,
    0:20:13 what does it look like and how do you work with it, right?
    0:20:15 Which is so important.
    0:20:20 How do you see the workforce transforming these individual roles,
    0:20:24 kind of a collective idea of, you know, and take it where you will,
    0:20:27 whether it’s electrical engineering or somewhere else.
    0:20:30 And I keep thinking as we’re talking of, you know,
    0:20:32 within the past couple of months, there was a quote.
    0:20:36 Jensen was being interviewed and he said something about
    0:20:40 if you want to prepare yourself, it’s not so much learning how to code.
    0:20:43 It’s getting expert in a discipline, right, in a domain.
    0:20:47 Follow your passion, you know, dive in to what you do,
    0:20:49 get really good at it because these AI tools
    0:20:53 are going to be part and parcel of whatever it is that you do.
    0:20:55 How do you see all of this and how do you talk about
    0:21:00 that intersection of AI, higher education, moving into the workforce?
    0:21:01 Yeah, again, a great question.
    0:21:04 And I think what Jensen was saying was absolutely spot on.
    0:21:08 I think AI is here to stay and this is again going to be one of those tools.
    0:21:11 I am, of course, you know, I think one of those things we need to teach our students
    0:21:14 is not only what AI can do, but also what AI cannot do.
    0:21:18 And I think that’s as important as kind of, you know, understanding
    0:21:20 where to use AI and where not to use AI.
    0:21:23 So I think a part of our education process itself and training process
    0:21:26 for the students would be to kind of take AI, you know,
    0:21:29 with the capabilities that it comes with in areas and disciplines
    0:21:33 where you can use it properly and also kind of understand areas
    0:21:36 where human creativity and ingenuity will still be important
    0:21:38 and you have to think outside the box.
    0:21:40 So you would still be, you know, if you’re working on algorithms,
    0:21:43 you will still need to be able to find out how an algorithm works
    0:21:45 or understand the complexity of an algorithm.
    0:21:48 If you’re a computer scientist, maybe you don’t need to code anymore,
    0:21:53 but that does not mean that you would not need to know how the algorithm works.
    0:21:55 So you would be able to use AI as a tool
    0:21:58 and you would be able to use it very, very efficiently,
    0:22:02 but you have to understand, A, how to use it, B, where not to use it.
    0:22:04 And C, if you want to make enhancements,
    0:22:07 if you want to kind of better human society or your discipline,
    0:22:10 you have to understand, you know, the fundamental and the basic principles,
    0:22:13 which is not something that you can delegate to AI.
    0:22:18 So the foundation of knowledge that we need, I think, you know, is super important.
    0:22:21 And if you just look at, you know, how much data AI consumes,
    0:22:24 I think it’s just, you know, it’s an insane amount of data, that’s true.
    0:22:26 But also, I think, you know, making sense of the data
    0:22:30 or making interpretations out of data is something that is purely human,
    0:22:31 even, you know, even today.
    0:22:36 AI cannot help you explain data, for example, very efficiently, right?
    0:22:40 So those are the things where, you know, human experts will still need
    0:22:44 expertise in their disciplines, where you would be able to kind of, you know,
    0:22:46 do your job much more efficiently,
    0:22:51 but still need to understand the core principles of your discipline well.
    0:22:53 So I see the landscape for the workforce
    0:22:57 and how, you know, students today are going to be, you know,
    0:22:58 the professionals for tomorrow,
    0:23:00 how they are going to do their job differently.
    0:23:03 But at the core of it, I don’t see things changing dramatically.
    0:23:07 You know, some jobs will become obsolete, some new jobs will come in its place,
    0:23:09 which has always happened, you know, if you look at, you know,
    0:23:13 going back to our evolution of technology, you know, that happens every time.
    0:23:17 So I am not worried that, you know, that AI is going to take away all jobs.
    0:23:19 I don’t think that’s true. No technology does that.
    0:23:24 It just, you know, you can reimagine that the jobs spectrum is going to change
    0:23:26 and evolve. But more importantly, I feel like, you know,
    0:23:30 the students just need to be aware of how to use AI better and efficiently
    0:23:33 and in their own disciplines and jobs.
    0:23:38 Are the types of internships, jobs, other opportunities
    0:23:43 that students from the engineering college are going to?
    0:23:45 Are they are they changing right now?
    0:23:51 Is there a, you know, I wonder if people feel sort of caught a little bit
    0:23:54 in almost like in a sandwich between what was and what’s coming,
    0:24:01 but we’re not quite there yet with, you know, relative to industry adopting AI.
    0:24:04 And again, that’s a little bit different, obviously, depending on the discipline.
    0:24:07 And but how are you seeing that in your domain?
    0:24:12 Yeah, I think like every industry or every company now needs to have an AI policy.
    0:24:15 Like they are all talking about an AI strategy, whether they need it or not.
    0:24:19 So there is, of course, going to be some noise in the transient
    0:24:23 where people are trying to figure out how exactly to use AI or eventually maybe the
    0:24:26 you know, at the end of it, they will figure out that they don’t need to use AI
    0:24:30 for their particular, you know, work or particular company or particular,
    0:24:31 you know, discipline.
    0:24:34 But there is, of course, everybody is interested in AI
    0:24:37 because it has been so disruptive in so many areas.
    0:24:40 So I think we see as the students are going for internships,
    0:24:43 co-ops or even full-time positions, there are lots of students who are working
    0:24:46 in the AI space and kind of, you know, becoming either AI software engineers
    0:24:51 and hardware engineers doing AI or even in other disciplines, not just engineering.
    0:24:54 You know, they are using AI and data sciences in very interesting new ways.
    0:24:57 Like, you know, the whole area of bioinformatics has kind of exploded.
    0:25:00 You know, students are working in biology and biological systems.
    0:25:04 A lot of them are using AI as tools for the data sciences aspect of what they do.
    0:25:09 So I see new possibilities, new jobs at the intersection of computing and data
    0:25:11 and other disciplines.
    0:25:15 That’s what seems to be the one area which is exploding and growing very, very fast.
    0:25:20 But have we settled down on where, you know, where the future would essentially settle down to?
    0:25:21 I don’t think we are there yet.
    0:25:23 And it’ll take some time.
    0:25:27 But I think, you know, it's not something that, you know,
    0:25:32 one person or one company or one school or university will figure out before the rest.
    0:25:36 Everybody in society will come up with the same kind of understanding of where to use AI,
    0:25:39 where not to use AI, how to use it and how to use it better.
    0:25:43 And I think there are still going to be questions around ethics and bias and compliance
    0:25:47 and all of that policy, which will continue to be a topic of discussion.
    0:25:50 I don’t think it’s going to go away in the next few years.
    0:25:53 It’s going to be something we’ll keep on discussing for decades, maybe.
    0:25:57 So so I think our students today would need to be a part and parcel.
    0:26:00 And they would be the leaders who will be leading some of these discussions in 10 years.
    0:26:04 So they need to kind of understand the whole spectrum better.
    0:26:09 As you look ahead to the next couple of years and whether it’s the upcoming phases of the
    0:26:16 makerspace project, the impact of all this technology on electrical engineering, whatever
    0:26:20 it might be, what gets you the most excited?
    0:26:26 What is it that you just really can't wait for, this thing or this particular kind
    0:26:28 of area of your work to evolve?
    0:26:33 And you know that AI, machine learning, deep learning, all of this is giving it,
    0:26:36 maybe giving it a nudge that wasn’t available before.
    0:26:40 Yeah, I think that’s a, of course, that’s a fascinating question.
    0:26:42 It’s more of a science fiction kind of a question.
    0:26:42 Of course.
    0:26:48 More like, you know, boots on the ground kind of thing.
    0:26:52 And on our end, I think I’m very excited because the AI makerspace with phase two will also be
    0:26:57 connected to our engineering makerspaces, which means, you know, all our robotic arms,
    0:27:02 everything that you see around in the engineering makerspaces now will have this gigantic brain
    0:27:05 on the back end that can see and do all kinds of crazy things.
    0:27:09 So I’m most excited about, you know, the possibilities of new things, new applications
    0:27:13 at the intersection of AI and data sciences and the physical world.
    0:27:17 Where I think, you know, we can see new applications, new ways the students are going to use AI.
    0:27:20 And we as a society, you know, not all of these need to be huge models.
    0:27:25 There are probably going to be these small models, you know, which will be ubiquitous, you know,
    0:27:28 embedded throughout the world, but will have intelligence and smartness.
    0:27:34 So I'm very excited about the possibilities of the intersection of AI, you know,
    0:27:39 as a whole and the AI makerspace in particular on our campus with the physical aspects of,
    0:27:42 you know, design and engineering that we are very familiar with.
    0:27:44 So that makes me really excited.
    0:27:45 Excellent.
    0:27:47 And last question for you.
    0:27:51 Let’s say there’s a teenager listening a high school student or a high school student’s
    0:27:55 parent for that matter, who’s listening and thinking, oh man, this sounds great.
    0:27:56 And this is where the future is headed.
    0:27:59 And I'm interested in science and engineering and robotics.
    0:28:07 Like, what do I do now as a 14, 15 year old to try to prepare myself to be able to
    0:28:12 maybe get a spot in Georgia Tech’s engineering program in a few years.
    0:28:14 Yeah, absolutely.
    0:28:16 I would encourage you to kind of learn the basics.
    0:28:18 And I think that’s, that’s important.
    0:28:22 Learn math, learn statistics, learn physics, whatever that core discipline you want to pursue.
    0:28:24 If you want to pursue engineering in Georgia Tech, for example,
    0:28:28 you don’t have to be an expert when you, when you come in, you know, that’s not the goal.
    0:28:32 The goal is to be able to have a good understanding of core disciplines,
    0:28:34 because those are the foundational technologies that you need.
    0:28:40 If you have a chance to kind of, you know, take online courses on AI or just on data sciences,
    0:28:44 a lot of these will also have like, you know, programming courses that you can do along with
    0:28:48 that, just to be able to kind of, you know, get yourself some degree of familiarity
    0:28:49 with the field.
    0:28:52 I think that would be great, not needed, but great.
    0:28:57 Here, if you're local to Georgia Tech, in the city, in this area, we have some partnership
    0:29:01 programs with high schools, where high school students spend the summer here and, you know,
    0:29:03 work with us and our faculty.
    0:29:07 One of the things I did not mention about the AI maker space:
    0:29:08 I talked about the phase one and the phase two.
    0:29:10 I did not talk about the phase three.
    0:29:14 The phase three is where we are hoping to be able to have an impact outside of the
    0:29:15 walls of Georgia Tech.
    0:29:20 We are essentially trying to open it up to our local high schools and middle school students
    0:29:25 who want to come here and spend time, or even to local, you know, HBCUs and HSI’s,
    0:29:30 you know, so that, you know, universities and colleges that may not have the resources to
    0:29:35 have such large computational, you know, capacity, would be able to use our resources
    0:29:36 and teach their own students.
    0:29:40 So I think that, I think, you know, as I was telling someone that, you know,
    0:29:43 this is a huge computational power, but as we know, with power comes responsibility.
    0:29:45 So I think we need to do our part.
    0:29:50 So if you are a student, you know, if you are in high school
    0:29:54 and want to learn more, drop by, you know, visit us on our website.
    0:29:57 If you can physically come over, we would love to kind of talk to you and show you around.
    0:30:03 But more importantly, my only advice would be, if you’re a student and aspiring to become
    0:30:07 an engineer, learn the basics well, because, you know, over the lifetime
    0:30:09 of a person who is a high school student now,
    0:30:11 there will be many inflection points in technology.
    0:30:13 I can’t even imagine.
    0:30:17 Things will keep on changing, but the basics of math and science and
    0:30:18 physics will remain the same.
    0:30:20 So you need to understand that better.
    0:30:21 Fantastic.
    0:30:23 I alluded earlier to a couple of articles.
    0:30:24 You just mentioned the website.
    0:30:27 Is that coe.gatech.edu?
    0:30:29 Yes, that’s our college website.
    0:30:33 And you can find links to all the makerspaces, including the AI makerspace.
    0:30:34 Excellent.
    0:30:36 Arijit, this has been great.
    0:30:39 Again, you know, if you’re listening, go check out the blogs.
    0:30:43 There's a video that kind of shows, you know, the data center and the racks and all that,
    0:30:46 but there are students talking about using the makerspace.
    0:30:49 And, you know, as you said, that’s what it’s all about.
    0:30:53 It’s the power, the responsibility to share and democratize access to all of this so that,
    0:30:56 you know, the children can make the world a better place going forward.
    0:30:59 Arijit, again, thank you so much for taking the time to join the podcast.
    0:31:05 Wish you all the best with phase one, phase two, phase three, and whatever else is to
    0:31:06 come going forward.
    0:31:07 Thank you so much.
    0:31:09 It was a real pleasure talking to you.
    0:31:09 Thank you.
    0:31:19 [Music]

    AI is set to transform the workforce — and the Georgia Institute of Technology’s new AI Makerspace is helping tens of thousands of students get ahead of the curve. In this episode of NVIDIA’s AI Podcast, host Noah Kravitz speaks with Arijit Raychowdhury, professor and Steve W. Chaddick School Chair of Electrical and Computer Engineering at Georgia Tech’s College of Engineering, about the supercomputer hub, which provides students with the computing resources to reinforce their coursework and gain hands-on experience with AI. Built in collaboration with NVIDIA, the AI Makerspace underscores Georgia Tech’s commitment to preparing students for an AI-driven future, while fostering collaboration with local schools and universities.

  • Paige Cofounder Thomas Fuchs’ Diagnosis on Improving Cancer Patient Outcomes With AI – Ep. 228

    AI transcript
    0:00:11 [MUSIC]
    0:00:14 >> Hello, and welcome to the NVIDIA AI podcast.
    0:00:16 I’m your host, Noah Kravitz.
    0:00:20 What do NASA’s Mars rovers have in common with the quest to cure cancer?
    0:00:21 A lot, as it turns out.
    0:00:26 Some of the same technology used in the Mars rover missions is now being used to
    0:00:29 help detect, predict, and treat cancerous cells.
    0:00:31 Which technology, and how does it work?
    0:00:34 Here to talk about it is Thomas Fuchs.
    0:00:39 Thomas is the Dean of Artificial Intelligence and Human Health at Mount Sinai in New York City.
    0:00:43 And he’s also the co-founder and Chief Scientific Officer at PAGE,
    0:00:48 the first company with an FDA-approved AI tool for cancer diagnosis.
    0:00:53 Thomas, welcome to the NVIDIA AI podcast, and thanks so much for joining us.
    0:00:54 >> Thank you so much for having me, Noah.
    0:00:57 Very much looking forward to the conversation.
    0:00:59 Likewise, there’s a lot to get into.
    0:01:00 Let’s start with PAGE.
    0:01:03 What is the company all about, and how did it get started?
    0:01:08 >> So PAGE is the leading AI company in pathology.
    0:01:10 It was founded in 2017.
    0:01:14 We spun it out of Memorial Sloan Kettering.
    0:01:19 And at its core, it does AI in cancer research,
    0:01:24 and especially clinical care, and there within pathology.
    0:01:27 So we look at these digitized pathology slides.
    0:01:33 We find cancer, classify cancer, predict response to treatment, predict outcome,
    0:01:37 help pathologists to do their job not just faster and better,
    0:01:39 but especially help them to do things they can’t do yet,
    0:01:44 and then help oncologists to find better treatments for their patients.
    0:01:46 PAGE is actually an acronym.
    0:01:50 It stands for Pathology, Artificial Intelligence Guidance Engine.
    0:01:51 And that’s what we do.
    0:01:57 And these days, PAGE is used across the globe on four continents,
    0:02:03 and with thousands of patients treated last year already based on diagnosis
    0:02:06 that were rendered with the help of PAGE.
    0:02:09 >> So applications of machine learning, deep learning, AI,
    0:02:13 to use the umbrella term in medicine and health and wellness
    0:02:16 have really been exploding over the past few years in particular.
    0:02:23 And for my money are one of the most interesting and useful applications of the technology.
    0:02:27 We’ve had some different folks, some folks who are working to try to battle cancer
    0:02:33 and some other folks doing other things with machine learning on the podcast before.
    0:02:36 Let’s dive into how PAGE works a little bit.
    0:02:40 My understanding is there’s sort of a combination of the PAGE tools
    0:02:46 and then also third-party applications that are used throughout the diagnosis process.
    0:02:48 Maybe you can take us in a little bit deeper.
    0:02:50 >> Yeah, of course.
    0:02:52 I think it’s always good to start with the patient.
    0:02:57 So suppose you have some pain in the chest or somewhere else.
    0:03:02 First you get the radiology, and then if the radiologist sees something,
    0:03:06 the first thing that’s usually done is to take a biopsy.
    0:03:10 So a needle biopsy at the location.
    0:03:14 And then what comes out of there is some tissue.
    0:03:19 And in the last 150 years in which pathology didn’t change much,
    0:03:23 the pathologist looked at the tissue through a microscope
    0:03:26 and then decided the diagnosis.
    0:03:29 So it’s very subjective, of course, in many ways.
    0:03:33 So what PAGE does is really help to digitize the whole process
    0:03:37 and then build AI on top of it from its beginning.
    0:03:41 The whole thing is part of the field of computational pathology.
    0:03:44 We coined the term over 15 years ago,
    0:03:49 and since then the field exploded, but of course back then it was very niche.
    0:03:54 So after you have the biopsy, the tissue is then put on microscope slides,
    0:03:58 and these are digitized and end up being enormously large images.
    0:04:03 So 100,000 pixels times 100,000 pixels with millions and millions of cells.
    0:04:08 So you could fit all your holiday snaps on one of these single slides.
    0:04:11 And large institutions produce a lot of them.
    0:04:14 So at Mount Sinai we produce over a million of these slides per year.
    0:04:18 Then pathologists really have to look for the needle in the haystack.
    0:04:21 If you have, for example, a mastectomy is a breast cancer,
    0:04:23 you could end up with hundreds of these slides
    0:04:27 and you’re really looking for a few cells or a larger lesion
    0:04:29 that actually is cancer or not cancer.
    0:04:32 And that’s a very long and cumbersome process.
    0:04:35 And that’s exactly where our AI comes in.
    0:04:39 So we trained AI at scale to actually find cancer,
    0:04:45 and that also led to the first and only FDA approval in pathology, you mentioned.
    0:04:50 So to do that, you trained these very large computer vision models first.
    0:04:53 These, of course, transformer models these days.
    0:04:57 And to do so, we digitized enormous amount of slides over the years,
    0:05:02 linked it to all kinds of clinical data and pathology report data,
    0:05:07 to train models directly from the image against these reports.
    0:05:13 To do so in 2017, we actually built a dedicated compute cluster with NVIDIA DGX.
    0:05:18 So the first ones that came out back in the day, which was, of course, great.
    0:05:25 And that allowed us to build a model based on 60,000 of these slides for the FDA approval.
    0:05:26 And let’s use clinically.
    0:05:29 But these days, of course, that’s not enough.
    0:05:35 And we build now a very large foundation models from millions of these slides.
    0:05:39 And that’s done in partnership with Microsoft and Microsoft Research.
    0:05:44 So we can actually use thousands of GPUs to build these very large computer vision models.
    0:05:48 Since you brought it up, let’s talk about that foundation model.
    0:05:54 My understanding is it’s one of or perhaps the largest image-based foundation model in the world.
    0:05:56 So it’s by far the largest in pathology.
    0:06:01 And what we are training now, based on four million slides,
    0:06:05 is for sure one of the largest publicly announced computer vision models.
    0:06:11 What you have to do is actually separate these very large slides into images of normal size,
    0:06:14 like you would have in ImageNet or other databases.
    0:06:19 And in that space, we have 50 billion images.
    0:06:24 So again, ImageNet is only 14 million, so it’s 50 billion images.
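    As a rough sketch of the tiling step described above, the toy example below cuts a large image array into fixed-size patches and drops mostly-white background. It is purely illustrative: real whole-slide pipelines read pyramidal formats with libraries such as OpenSlide, and the 224-pixel tile size and brightness threshold here are assumptions, not the company's actual parameters.

    ```python
    # Toy illustration of tiling a whole-slide image into patches before training.
    # Real pipelines use pyramidal WSI formats (e.g., via OpenSlide); the tile size
    # and background threshold below are arbitrary, illustrative choices.
    import numpy as np

    def tile_slide(slide: np.ndarray, size: int = 224, white_thresh: float = 235.0):
        """Yield non-background patches from an H x W x 3 uint8 slide array."""
        h, w, _ = slide.shape
        for y in range(0, h - size + 1, size):
            for x in range(0, w - size + 1, size):
                patch = slide[y:y + size, x:x + size]
                if patch.mean() < white_thresh:  # skip mostly-white (empty) regions
                    yield patch

    # A fake 2,048 x 2,048 "slide"; real slides are on the order of 100,000 x 100,000 pixels.
    fake_slide = np.random.randint(0, 256, (2048, 2048, 3), dtype=np.uint8)
    print(sum(1 for _ in tile_slide(fake_slide)), "tissue patches kept")
    ```

    Even after background is discarded, each 100,000 x 100,000-pixel slide yields thousands of such patches, which is how a few million slides become tens of billions of training images.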
    0:06:27 I take a lot of photos as a dad, but to your point earlier,
    0:06:30 I think that’s plenty of room for my holiday snaps.
    0:06:35 Well, it depends how many kids you have, you know.
    0:06:37 But that’s remarkable, 50 billion images.
    0:06:43 Yeah, so if you count actually the pixels, it’s 10 times more data than all of Netflix.
    0:06:46 So think of all the shows you watch, all the movies.
    0:06:52 So if you count the pixels throughout, that’s just 10% of the image data we pipe through these models.
    0:06:56 And the whole point of that is you end up with a foundation model
    0:07:02 that truly understands microscopic morphology of tissue very well,
    0:07:04 not only of cancer, but also of normal tissue.
    0:07:09 And then like in language models, you get these embeddings of these images,
    0:07:11 and you can use them for all kinds of tasks.
    0:07:15 Again, cancer detection, classification, segmentation.
    0:07:18 We also predict molecular changes, for example.
    0:07:23 If you have a specific mutation that can be targeted by some therapy,
    0:07:28 or sometimes you predict outcome to a therapy or to a treatment regime.
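    To illustrate the general "frozen embeddings feeding many downstream tasks" pattern described here, the sketch below stands in a stock torchvision backbone and synthetic tile labels. It is not the company's foundation model or production pipeline, just the shape of training a light classifier on top of fixed image embeddings.

    ```python
    # Generic sketch of reusing frozen image embeddings for a downstream task.
    # Backbone, tiles, and labels are placeholders, not the company's models or data.
    import torch
    from torchvision.models import resnet50, ResNet50_Weights
    from torchvision.transforms.functional import normalize
    from sklearn.linear_model import LogisticRegression

    backbone = resnet50(weights=ResNet50_Weights.DEFAULT)
    backbone.fc = torch.nn.Identity()  # drop the ImageNet head; keep 2048-d embeddings
    backbone.eval()

    @torch.no_grad()
    def embed(tiles: torch.Tensor) -> torch.Tensor:
        """tiles: (N, 3, 224, 224) float tensor in [0, 1], e.g. patches cut from slides."""
        x = normalize(tiles, mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
        return backbone(x)

    # Synthetic stand-ins for labeled tiles (e.g., tumor vs. normal).
    tiles = torch.rand(64, 3, 224, 224)
    labels = torch.randint(0, 2, (64,)).numpy()

    features = embed(tiles).numpy()
    clf = LogisticRegression(max_iter=1000).fit(features, labels)  # light downstream head
    print("predicted labels for first 5 tiles:", clf.predict(features[:5]))
    ```

    The same pattern is also what makes the few-shot work on rare cancers mentioned later in the conversation plausible: with a strong enough foundation model, the downstream head can stay small and be trained on comparatively few labeled examples.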
    0:07:31 This is making me think of a few episodes we’ve done in the past.
    0:07:39 One was with, to bring up NASA, some astronomers who were taking, looking at images, or using AI.
    0:07:43 I shouldn’t say they were looking, they were processing images taken.
    0:07:48 I think this was with the James Webb, one of the very powerful telescopes, right?
    0:07:54 And another one more recently talking about cardiac care and early detection of cardiac disease,
    0:07:58 using AI to power the techniques.
    0:08:06 And something that stuck with me was this notion that not only can these models
    0:08:10 churn through data with the speed and scale that humans just couldn’t do
    0:08:13 and detect things that humans couldn’t with the naked eye.
    0:08:18 But that in some cases, and I think this is specific to the cardiac care episode now that I think about it,
    0:08:23 they were actually able to detect things that, and forgive my lack of medical knowledge here,
    0:08:30 that were sort of in areas and types of things that researchers and medical professionals
    0:08:36 hadn’t even thought to look for before, because they were almost sort of hidden in the cells,
    0:08:40 in sort of the walls of the heart, or what have you.
    0:08:45 Is that the kind of thing that you’re finding with your research, and can you speak to that a little bit?
    0:08:49 Yes, so you’re pointing actually at the very, very interesting development.
    0:08:54 So some of the models we built, especially also the clinical ones that are FDA approved,
    0:08:57 that more or less mimic what the human does.
    0:09:03 So the pathologist does visual pattern recognition to, for example, do Gleason grading in prostate,
    0:09:06 or sub-classify breast cancer.
    0:09:12 And that can be replicated based on the training data, or for example, the pathology reports.
    0:09:15 But it gets really interesting when you go beyond that.
    0:09:18 I mentioned that example for mutations.
    0:09:23 So we have also sequence data of 100,000 patients where we have the slides,
    0:09:26 and then the somatic sequencing of the tumor.
    0:09:31 And for some of the mutations, you can actually predict them based on the image.
    0:09:34 So the mutations lead to a change of pattern in the image.
    0:09:38 And most of these are not recognizable by humans.
    0:09:42 So we were not trained to do that, and pathologists are not trained to do that.
    0:09:46 Some of them are very obvious, also humans can see, but others not.
    0:09:50 And there, of course, it also gets more into basic science,
    0:09:53 because then it becomes very interesting in reverse.
    0:09:55 Can we find out what the model looked like?
    0:10:01 So was it, for example, as you said, some subcellular morphology at the nucleus,
    0:10:06 or was it larger vessel structures or a combination of different kinds of cells?
    0:10:10 For example, for the immuno-oncology work we do,
    0:10:14 you want to detect these tumor-infiltrating lymphocytes.
    0:10:17 Is there inflammation close to the cancer or within the cancer?
    0:10:23 And then you have very high-level or complicated or nonlinear interactions
    0:10:26 that are actually quite difficult for us humans to understand.
    0:10:32 So model introspection becomes something very important to know what’s going on.
    0:10:36 But philosophically, if you actually spin it a little bit further into the future,
    0:10:43 the time will come when these AI models might be able to perform very, very well for one of these tasks.
    0:10:47 For example, predict if the drug will work for you as a patient or not.
    0:10:55 And we as humans might not be able to understand the intrinsic combinations these models look at.
    0:11:02 And then the question is, how can you then still assure that these AI’s are safe and effective and equitable?
    0:11:07 And that’s where all the whole FDA framework and all the efforts into testing these models comes in.
    0:11:14 And I do think that I hope actually we’re going to get to that place because I want to go beyond the state of the art.
    0:11:21 And there’s so much more, of course, in biology and physics, which are underlying these processes that produce that tissue
    0:11:27 than we humans can grasp today, but that can really, really help patients to get better outcomes.
    0:11:31 Full disclosure, I come from a liberal arts background, not a science background per se.
    0:11:40 But all of the talk about all the scientific progress that these technologies are helping further and talk of new discoveries
    0:11:45 is what really gets me excited and interested in where this stuff is headed.
    0:11:49 I want to come back to the technical aspects of this in a moment.
    0:11:54 But I want to ask you, where is PAGE at now in terms of results?
    0:11:56 What are you able to do?
    0:12:00 I don’t know if success rate is the proper way to phrase it.
    0:12:05 But what kinds of actual on the ground results are you seeing in using the tech?
    0:12:10 So I think a good place to study is actually the FDA study we ran.
    0:12:16 And there we could show that the error rate was reduced by 70%.
    0:12:22 So pathologists find much more cancer than they would have before without increase of false positives.
    0:12:24 So that’s very important.
    0:12:33 And we also showed across the globe on 14,000 patients out of 800 different institutions out of 45 countries
    0:12:37 that the AI really works everywhere across the globe.
    0:12:42 That’s because at that scale that we train, it really focuses on the morphology.
    0:12:49 It’s really on the growth patterns that it’s not thrown off by all kinds of nuisance variables or bubbles or air
    0:12:52 or all kinds of things you see on these slides.
    0:13:00 And that has been replicated at DA, at Cornell, in Brazil, in Europe, everywhere across the world.
    0:13:01 That’s great.
    0:13:05 So what’s of course interesting here is that if you think about FDA approvals of AI,
    0:13:09 you have hundreds of AI’s approved in radiology.
    0:13:11 I think they’re up to 500 by now.
    0:13:14 And in pathology, you still have only one.
    0:13:18 There’s just one that is actually safe and effective.
    0:13:20 And that’s of course a huge difference.
    0:13:25 And one reason is that the FDA looks differently at the two domains,
    0:13:28 radiology to a large degree is a screening step.
    0:13:33 You can always go back to another x-ray or another mammography.
    0:13:37 But in pathology, it analyzed the diagnosis.
    0:13:40 And you don’t have cancer until the pathologist says so.
    0:13:44 So it’s much more dramatic in the treatment pathway.
    0:13:51 But it also shows that there’s this huge chasm between all the hype we have in AI,
    0:13:57 15,000 biotech AI companies and all that fluff and everything,
    0:14:00 and then the reality on the ground.
    0:14:04 So if you’re a physician in the trenches and you want to use something that’s safe and effective,
    0:14:05 you have the choice of one.
    0:14:07 And that’s of course something we have to change.
    0:14:12 So Paige is changing that drastically and actually expanding to all kinds of different cancer types.
    0:14:15 So we have, for example, a breakthrough designation for the breast system.
    0:14:18 We can do bladder lymph nodes and so forth.
    0:14:23 And then we use these very large foundation models we discussed, especially the new one.
    0:14:28 It’s actually named after the founder of pathology, Rudolf Virchow.
    0:14:33 So it’s the Virchow model, the V1 model, and V2 will follow soon.
    0:14:38 And we can actually use it for pan-cancer applications.
    0:14:42 So instead of just doing what we just described for prostate cancer and breast,
    0:14:45 you can address nearly all cancers.
    0:14:49 And 50% of cancers are rare cancers.
    0:14:52 So if you’re afflicted with cancer, it’s a coin toss:
    0:14:55 either you have a big one where there are a lot of treatment options,
    0:14:58 or one of these rare ones where there’s nothing available.
    0:15:04 And the foundation models, because they understand morphology in the body of tissue
    0:15:09 so well, allow us to hit the ground running with few-shot learning on rarer cancers
    0:15:14 and rarer conditions, and then also produce AI for these.
    0:15:16 And that’s very, very promising.
    0:15:17 Absolutely.
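
    As a rough, hypothetical sketch of the few-shot idea Fuchs describes (fitting a small classifier on embeddings from a frozen foundation model, using only a handful of labeled examples of a rare cancer), the Python snippet below may help. The encoder, the tile data, and the names here are illustrative placeholders, not Paige’s actual models or APIs.

    import torch
    import torch.nn as nn

    def embed_tiles(encoder: nn.Module, tiles: torch.Tensor) -> torch.Tensor:
        """Run image tiles through a frozen encoder to get embeddings."""
        encoder.eval()
        with torch.no_grad():
            return encoder(tiles)  # shape: (n_tiles, embed_dim)

    def fit_linear_probe(embeddings: torch.Tensor, labels: torch.Tensor,
                         n_classes: int, epochs: int = 200) -> nn.Linear:
        """Fit a small classification head on frozen embeddings (few-shot regime)."""
        head = nn.Linear(embeddings.shape[1], n_classes)
        opt = torch.optim.Adam(head.parameters(), lr=1e-3)
        loss_fn = nn.CrossEntropyLoss()
        for _ in range(epochs):
            opt.zero_grad()
            loss = loss_fn(head(embeddings), labels)
            loss.backward()
            opt.step()
        return head  # only this small head is trained; the large encoder stays frozen

    # Hypothetical usage with placeholder loaders (not a real Paige API):
    # encoder = load_pretrained_pathology_encoder()
    # tiles, labels = load_labeled_rare_cancer_tiles()   # e.g. ~20 labeled tiles
    # head = fit_linear_probe(embed_tiles(encoder, tiles), labels, n_classes=2)
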
    0:15:20 You sort of touched on this mentioning the few-shot learning,
    0:15:23 but are there specific types of cancers,
    0:15:27 and please replace types of cancers with a more precise term,
    0:15:33 but are there specific variants that are more difficult to detect
    0:15:35 because of technical problems?
    0:15:37 And how are you trying to get around some of those?
    0:15:38 Yes.
    0:15:41 So there are usually two reasons.
    0:15:43 First off, they might be rare.
    0:15:51 So you have the big ones like breast, prostate, lung, derm, and so forth.
    0:15:57 And then rarer ones like cholangiocarcinoma, so that’s bile duct cancer, which is terrible.
    0:16:02 And there we usually didn’t have the numbers to actually build robust systems that generalize.
    0:16:05 And that’s what the foundation model is changing.
    0:16:10 And then you also have within the large cancers, you have subtypes
    0:16:13 that can be very rare and difficult to diagnose.
    0:16:18 We just published results in breast cancer that the foundation model now
    0:16:22 can actually really find these super rare forms as well,
    0:16:26 which you wouldn’t do with a normal breast AI.
    0:16:31 And that actually shows that it’s worthwhile building these very large systems,
    0:16:34 although of course they are tremendously expensive.
    0:16:38 Just the production of the data that go into these systems is very cumbersome.
    0:16:43 But it really leads to the benefits for the patients at the end of the day.
    0:16:44 Amazing.
    0:16:46 And a question that popped into my mind earlier.
    0:16:52 Are your systems, your techniques, sort of backwards compatible with older,
    0:16:57 lower resolution, or not necessarily older, but just kind of lower resolution images?
    0:17:02 Or is it all built atop kind of a, I don’t know, proprietary,
    0:17:08 but sort of current very high res way of scanning for cancer cells?
    0:17:10 So that’s a good question.
    0:17:14 So most scanners that are out there, the high throughput scanners,
    0:17:20 either have a 20 times magnification, 20x, or 40x.
    0:17:23 And we can work with both.
    0:17:24 Okay.
    0:17:28 On the cancer detection side, so we could show that it’s equivalent.
    0:17:33 Sometimes if you have very specific tasks that are really tailored towards,
    0:17:37 I don’t know, the membrane of the nucleus, then of course you want to have higher resolution.
    0:17:39 So it depends on the task.
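
    As a small, hypothetical illustration of handling both scanner magnifications, a preprocessing step might downsample a 40x tile to an effective 20x before inference. The sketch below uses Pillow on a single saved tile; real pipelines work on whole-slide image formats rather than individual files.

    from PIL import Image

    def to_effective_20x(tile_path: str, source_magnification: int) -> Image.Image:
        """Downsample a 40x tile so its pixel spacing matches a 20x scan."""
        tile = Image.open(tile_path)
        if source_magnification == 40:
            # Halving each dimension roughly doubles the field of view per pixel.
            tile = tile.resize((tile.width // 2, tile.height // 2), Image.LANCZOS)
        return tile
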
    0:17:41 I’m speaking with Thomas Fuchs.
    0:17:46 Thomas is the Dean of Artificial Intelligence and Human Health at Mount Sinai Hospital in New York City.
    0:17:50 And he’s also the co-founder and Chief Scientific Officer at PAGE,
    0:17:54 as we’ve been discussing, PAGE is an AI company focused on pathology.
    0:18:00 And they are the first company with an FDA approved AI tool for cancer diagnosis.
    0:18:06 Thomas, I want to shift gears a little bit, kind of jog your memory back to something we tease in the opener.
    0:18:10 You helped out with some of the technologies in the Mars rover.
    0:18:13 I think was it perseverance in particular?
    0:18:14 Yeah.
    0:18:20 Yeah. So the techniques are used today actually for Perseverance, and Curiosity before that.
    0:18:24 The images we worked with were on the previous rovers.
    0:18:25 Okay.
    0:18:31 And it was the Athena class, so Spirit and Opportunity, and then also from orbit.
    0:18:36 So I had the enormous privilege to work at JPL, for NASA, for a few years.
    0:18:38 Did JPL, the Jet Propulsion Lab?
    0:18:39 Yes, yes.
    0:18:44 Yes, the Jet Propulsion Lab, which is of course handling the Mars rovers.
    0:18:55 And what we did is we, for example, used the imagery from the navigation cameras to differentiate sand from gravel and from other terrain types so they don’t get stuck.
    0:18:56 Right?
    0:18:58 So Spirit got stuck in this powder.
    0:18:59 Oh, right, right.
    0:19:00 Yeah, yeah.
    0:19:04 And you want to, for example, avoid that, and make sure it doesn’t hit obstacles and so forth.
    0:19:10 And we also had imagery from the Mars Reconnaissance Orbiter, so which is the satellite around Mars.
    0:19:18 And the images it takes of the ground actually have quite good resolutions, so 30 centimeters per pixel in the best case.
    0:19:19 Okay.
    0:19:23 And they are nearly exactly the same size as these histology slides.
    0:19:25 So they’re really enormously large.
    0:19:33 And you also try to classify different terrain types to, for example, look for good landing spots, for navigation, and so forth.
    0:19:36 And yeah, that’s the beauty of machine learning and AI.
    0:19:37 Right.
    0:19:40 Using it from the microscopic to the macroscopic level.
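
    As a toy sketch of the terrain-classification idea (not JPL’s actual system), one could train a classic classifier on simple texture features computed from labeled image patches of sand, gravel, and rock. Everything here, including the feature choices, is illustrative.

    import numpy as np
    from sklearn.ensemble import RandomForestClassifier

    def texture_features(patch: np.ndarray) -> np.ndarray:
        """Crude descriptors for a grayscale patch: brightness stats plus edge energy."""
        gy, gx = np.gradient(patch.astype(float))
        edge_energy = float(np.mean(np.hypot(gx, gy)))
        return np.array([patch.mean(), patch.std(), edge_energy])

    def train_terrain_classifier(patches, labels):
        """patches: list of 2D arrays; labels: e.g. 0 = sand, 1 = gravel, 2 = rock."""
        features = np.stack([texture_features(p) for p in patches])
        return RandomForestClassifier(n_estimators=100).fit(features, labels)

    # A real rover- or orbiter-scale system would use far richer features (or a CNN)
    # and would have to run on the limited, radiation-hardened hardware discussed next.
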
    0:19:50 And so with the rovers, how does it work sort of training and updating the model and then running inference from that distance?
    0:19:58 Were these, you know, systems all on the rovers and you would push an update out? Or, I’m getting a little over my head here.
    0:20:04 But how did that work kind of, you know, doing all this work to help real time navigation but from so far away?
    0:20:05 Right.
    0:20:16 So most of the research we did back in the day was, of course, more basic research to show that capability, or again to do it from orbit for all kinds of purposes like looking for landing sites.
    0:20:24 One of the big problems, of course, in space exploration is that as soon as you send something out there, it has to be radiation hardened.
    0:20:30 And that means the machines are usually 10 years behind what you actually have.
    0:20:31 Right.
    0:20:33 There’s no GPU out there.
    0:20:48 I worked on another project where we actually did computer vision for flyby missions at asteroids or, for example, Pluto, where the light gap is so large that you can’t send meaningful commands out there.
    0:20:52 And then it really automatically has to, for example, find the asteroid to take pictures.
    0:21:00 And there also JPL had worked on building specific FPGAs and ASICs that could do that in the future.
    0:21:05 But that’s certainly one limitation that you can’t use the…
    0:21:06 Right.
    0:21:08 And maybe that’s something for NVIDIA, right?
    0:21:11 Have radiation hardened GPUs for that specific…
    0:21:12 No, you read my mind.
    0:21:17 I was thinking about the, you know, special edition 40 series radiation hardened.
    0:21:30 So in addition to the work you’re doing now with Paige and Mount Sinai and the JPL stuff, you’ve had a hand in at least a couple of other really interesting and sort of high-profile projects.
    0:21:34 The Memorial Sloan Kettering AI Research Department.
    0:21:38 And then also there’s a supercomputer in New Jersey, I believe.
    0:21:39 Yes.
    0:21:40 Tell us a little bit about those.
    0:21:41 Yeah, sure.
    0:21:43 So that was the one I was referring to.
    0:21:46 We built it in 2017-18 specifically for Paige.
    0:21:56 So it’s still the largest standalone supercomputer only for pathology, and we extended it over the years, but it’s of course not big enough to train these humongous foundation models.
    0:21:57 Okay.
    0:22:10 That’s why we needed a relationship with Microsoft, not only for compute, but because of course they’re fabulous engineers and scientists and it’s really a great partnership to drive that forward on modern hardware to build that.
    0:22:25 Because of course we are always, always limited with compute, right, be it at PAGE, but also at Mount Sinai, of course, there’s so much fantastic research that could be drastically accelerated if the research community had access to more compute.
    0:22:40 But it was, of course, a very interesting exercise. I mean, we wrote papers that were published in Nature Medicine and well-cited and so forth. But the secret behind all of that is of course engineering, right? It’s MLOps.
    0:22:53 How do you actually build a stack? So usually in that case, the problem is not even the GPUs, but it’s I/O. How can you pipe petabytes of data quickly enough to the GPUs to actually train these very large models?
    0:23:07 And so there’s an enormous software stack to build computer vision models at scale and enormous experience that PAGE has in doing these applications that are, of course, broadly applicable in all kinds of computer vision tasks.
    0:23:12 Or now, of course, we do multi-modal approaches with HACS and radiology and so forth.
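
    A minimal sketch of the I/O point: when training is bottlenecked by how fast data reaches the GPUs, the data loader does much of the heavy lifting. The PyTorch snippet below is a generic illustration, not Paige’s stack; the dataset class and file layout are hypothetical.

    import torch
    from torch.utils.data import DataLoader, Dataset

    class TileDataset(Dataset):
        """Placeholder dataset of pre-cut image tiles saved as tensors on fast storage."""
        def __init__(self, tile_paths):
            self.tile_paths = tile_paths

        def __len__(self):
            return len(self.tile_paths)

        def __getitem__(self, idx):
            # In practice, tiles would be decoded here from sharded archives or object storage.
            return torch.load(self.tile_paths[idx])

    def make_loader(tile_paths, batch_size=256):
        return DataLoader(
            TileDataset(tile_paths),
            batch_size=batch_size,
            num_workers=16,           # parallel decoding so the GPU never waits on disk
            pin_memory=True,          # faster host-to-GPU copies
            prefetch_factor=4,        # keep several batches queued ahead of the GPU
            persistent_workers=True,  # avoid re-spawning workers every epoch
        )
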
    0:23:14 Your own background is originally in engineering?
    0:23:22 Yeah, so my undergrad was in mathematics in Austria, which you can hear from the Arnold Schwarzenegger accent.
    0:23:23 Sure.
    0:23:27 After that, I did a PhD in machine learning at a time when it wasn’t cool yet.
    0:23:28 Yeah, yeah.
    0:23:37 NeurIPS was 400 people and it was very niche, and combining, of course, AI with pathology was even more niche back then.
    0:23:40 And I was going to ask, how did you find your way into working in pathology?
    0:23:54 Yeah, so that was at the beginning of my PhD. One day a pathologist from the hospital we worked with came into the lab and just wanted some high-level statistics based on their Excel sheets, but they showed us a few images.
    0:23:55 Right.
    0:24:01 And as a naive student, I saw these and said, oh, these are just roundish objects, cells should be easy to detect.
    0:24:04 And now we are 20 years later and we still can’t do it.
    0:24:10 But at least I stuck with it and we can do it quite well.
    0:24:11 Amazing.
    0:24:12 These origin stories are the best.
    0:24:14 So many of them are so similar to that.
    0:24:18 You know, the student working on this, somebody wandered into the lab.
    0:24:21 “I was the only person there.” And then 20 years later,
    0:24:26 I’m working with Microsoft to build the world’s largest foundational image model to fight cancer.
    0:24:27 Amazing.
    0:24:31 A lot of times we like to end these shows and kind of a forward looking note.
    0:24:33 I’m going to give you a two part thing here.
    0:24:45 The second part, I want to kind of open it up given your background and, you know, your hands on engineering prowess as well as obviously all the things you’re doing in the real world to ask you about AI more broadly.
    0:24:53 But before that, Paige, your work at Mount Sinai, you talked about this a little bit, but to kind of put a point on it.
    0:25:03 Where do you see the future of AI in medicine and healthcare? And we can narrow it down specifically to, you know, using AI in pathology.
    0:25:04 Where do you see things headed?
    0:25:15 What are kind of the big obstacles, beyond, of course, everybody needs more compute, the big hurdles that you’re looking to get over near term? And take it out as far as you want in terms of what you see in the future.
    0:25:26 Yeah, of course. Let’s start with pathology. So pathology has a very unique challenge in that for the last 150 years, it really didn’t change.
    0:25:30 So it first has to go digital before it can be completely AI-based.
    0:25:39 So part of what AI does, sorry, what Paige does, it also helps large labs and academic institutions to actually go digital.
    0:25:46 So get these scanners, link them to your LIS, make sure everything flows, and then it’s a SaaS service in the cloud.
    0:25:51 You can access it from everywhere and you can diagnose from everywhere and so forth.
    0:25:55 So that’s something that has to be driven in pathology.
    0:26:03 Another part is, of course, it’s important that the FDA, as they want to do now, also takes oversight.
    0:26:15 So there are some loopholes that other companies, who don’t validate at the level of scrutiny I mentioned, can use, and these LDT loopholes should be closed.
    0:26:21 So in healthcare much broader and that’s what brings us to Mount Sinai, which is of course a very unique place.
    0:26:27 It’s the largest health system in New York City with eight hospitals and 200 locations and so forth.
    0:26:32 There we are trying to more or less replicate what we did in pathology for all these different areas.
    0:26:46 So really level up in cardiology, in radiology, in psychiatry, and AI now plays a role in all these areas, from basic research in drug discovery and drug development to the bedside.
    0:27:00 So Mount Sinai has more than 20 models that are actually used in the hospital to direct the care teams to the right patients who had risk for seizure or malnutrition or for other things.
    0:27:21 And then on the basic research side there’s work in cardiology based on EKGs and ECGs to predict all kinds of cardiology issues early, or in ophthalmology, where we look at fundus images to try to even predict neurodegenerative diseases from that.
    0:27:31 Mount Sinai is also building a neuropathology AI center where 10,000 brains are gonna be scanned and that’s of course for work in Parkinson’s and in Alzheimer’s.
    0:27:32 Amazing.
    0:27:41 And so in all these areas AI is key to providing capabilities we don’t have yet, be it in care or in research.
    0:27:48 Besides all these doomsday scares that AI is an existential threat for us and so forth.
    0:28:03 I think especially in healthcare we have the moral obligation to develop AI as fast and as safe as we can because patients are dying not because of AI but because of the lack of AI.
    0:28:13 In the US alone 800 patients die or are permanently disabled because of diagnostic errors.
    0:28:21 And as we just discussed for example in pathology as we proved out these are things that can be addressed to a large degree with AI.
    0:28:30 So we have to build the tools for the physicians to give better treatment and a slowdown there is certainly not responsible.
    0:28:35 Now I’m gonna ask the old-school engineering PhD in you.
    0:28:47 Looking at the current state and looking ahead a little bit when it comes to the field and everything going on technically, it’s been a, what’s the word, it’s been a whiz-bang.
    0:28:59 There’s probably a better word, but it’s been a whiz-bang couple of years in particular, at least as far as, you know, public exposure to everything going on. But I’ve had the chance to talk to a bunch of engineering leaders who’ve just been saying,
    0:29:05 I’ve never seen this kind of rate of acceleration in tech, you know, in my career anyway.
    0:29:10 What are you excited about? What kinds of things, on the technical, ground level,
    0:29:15 are you looking forward to, you know, in the next, I don’t know, a year, or two years, five years?
    0:29:34 On the very mundane side, it’s just to get these tools into the clinic, right? That’s the first huge hurdle, and usually it’s not capabilities, it’s not regulation, usually it’s reimbursement, very, very petty monetary issues, where health systems are hard to change.
    0:29:41 So on the AI side, of course a lot of the current excitement is driven by language models.
    0:29:47 And that’s because, for good reasons, we humans over-index on language.
    0:29:54 Everybody who can string five sentences together in a coherent way seems to be intelligent.
    0:30:03 And that’s why we certainly assign a lot of capability or properties to large language models that might or might not be there.
    0:30:17 But at the end of the day, language is of course a human technology. It’s produced by our brain, so our relatively feeble wet brain with its few neurons can produce language and reason about language.
    0:30:24 But if you go beyond that, for example in biology and cancer research and if you look at the issues there.
    0:30:30 Just at a single cell all these processes are produced by physics and by biology.
    0:30:41 Just think of all the proteins that are at play at a single moment in a single cell; that is a complexity so far beyond language
    0:30:49 that there’s a whole universe out there, literally our physical universe and then the biological one, that goes far beyond language.
    0:30:59 You see it even now in pathology with the large models we built, which we touched upon: we humans are missing the vocabulary to even describe it in language.
    0:31:07 And then usually we come up with all kinds of comparisons, where these cells look like people in single file, or these cells look like that.
    0:31:10 But there’s so much more going on.
    0:31:18 What we start to capture with AI, be it image-based, be it genomics-based, and so forth, goes beyond our capabilities with language.
    0:31:27 And I think that space is going to be dramatically exciting, because that will deliver new drugs, better care, better outcomes.
    0:31:31 It’s an exciting time to be alive to see that transformation.
    0:31:32 Amazing.
    0:31:34 Thomas, for listeners
    0:31:39 who, while they ponder all of this, want to learn more about what Paige is doing,
    0:31:42 maybe what’s going on at Mount Sinai and the other things we touched upon,
    0:31:47 are there some resources on the web where you would direct folks to go?
    0:31:48 Of course. For Paige,
    0:31:53 just go to paige.ai, P-A-I-G-E dot A-I.
    0:32:02 And at Mount Sinai, if you look for Mount Sinai, you’ll see our initiatives there and the AI centers, and if you want to go beyond pathology, that’s of course the place to go.
    0:32:03 Excellent.
    0:32:05 Thomas this has been an absolute pleasure.
    0:32:06 Thanks so much for making the time.
    0:32:10 There’s a lot to chew on here and a lot to be optimistic about.
    0:32:15 So we appreciate all the work you and your teams are doing and for taking a few minutes to tell us about it.
    0:32:16 Thank you so much Noah.
    0:32:17 It was really a pleasure.
    0:32:19 Thank you for the great questions.
    0:32:21 [Music]

    Improved cancer diagnostics — and improved patient outcomes — could be among the changes generative AI will bring to the healthcare industry, thanks to Paige, the first company with an FDA-approved tool for cancer diagnosis. In this episode of NVIDIA’s AI Podcast, host Noah Kravitz speaks with Paige cofounder and Chief Scientific Officer Thomas Fuchs. He’s also dean of artificial intelligence and human health at the Icahn School of Medicine at Mount Sinai.

    Tune in to hear Fuchs on machine learning and AI applications and how technology brings better precision and care to the medical industry.

  • How Roblox Uses Generative AI to Enhance User Experiences – Ep. 227

    AI transcript
    0:00:10 [MUSIC]
    0:00:13 Hello, and welcome to the NVIDIA AI podcast.
    0:00:16 I’m your host, Noah Kravitz.
    0:00:21 Since it launched in 2006, Roblox has grown into one of the biggest gaming and
    0:00:26 digital experiences platforms in the world, by basically any metric you want to look at.
    0:00:29 According to the company’s fourth quarter 2023 financial reporting,
    0:00:33 Roblox boasts more than 71 million average daily users,
    0:00:38 accounting for 15.5 billion hours of engagement last year.
    0:00:41 Those numbers include both players and creators,
    0:00:45 as user-created content has always been central to the Roblox experience.
    0:00:48 Providing both cutting edge tools for creators and
    0:00:52 quality experiences for users at scale is no easy feat.
    0:00:56 And with so many of Roblox users being under the age of 16, including my own son,
    0:01:01 who can often be found drifting virtual supercars through various of the Roblox games.
    0:01:05 Keeping things civil and safe for everyone has to be a high priority.
    0:01:08 How do you run a platform like this at such a massive scale?
    0:01:11 And how does AI play a part in the process?
    0:01:14 Here to take us under the hood of Roblox is Anupam Singh,
    0:01:17 Vice President of AI and Growth Engineering.
    0:01:22 Anupam, welcome and thank you so much for joining the NVIDIA AI podcast.
    0:01:23 >> Hey, Noah, thank you very much.
    0:01:28 My son, I mentioned in the intro, made sure I did my homework on this one.
    0:01:31 He’s very excited we’re having this conversation.
    0:01:34 Maybe very briefly, because there’s a lot to dig into
    0:01:38 in terms of the infrastructure and the things that are under your purview.
    0:01:41 Again, from both the creator and end user side of things.
    0:01:44 But for folks who might not be familiar with Roblox,
    0:01:46 and I think there are at least a few out there,
    0:01:49 maybe give us an overview of what Roblox is all about.
    0:01:55 So at a vision level, as our founder has talked about it for the last 10 years,
    0:01:59 it’s about reimagining the way people come together.
    0:02:01 You and I want to hang out somewhere.
    0:02:04 Obviously, we have many, many different ways to hang out.
    0:02:09 But our vision is to bring a billion people together
    0:02:12 with optimism and civility.
    0:02:16 Now, especially the civility part is very important
    0:02:21 because that gets us to deep technical innovation
    0:02:24 to keep everything safe, everything civil.
    0:02:27 And from day one, we’ve invested a lot in that area.
    0:02:31 And optimism is about, if you go into one of our experiences,
    0:02:37 you’ll be amazed at how encouraging and interesting they can be.
    0:02:40 There’s the latest one, something called Dusty Trip.
    0:02:45 There’s another one called Pet Simulator, where you hang with your pet.
    0:02:48 But there’s a lot of other things going on like Valentine’s Day,
    0:02:49 St. Patrick’s Day.
    0:02:53 So it’s all about connecting a billion people with optimism and civility.
    0:02:55 And it’s a group experience, right?
    0:03:02 There’s, as you put it, it’s as much, it seems as about hanging out with your friends
    0:03:07 as it is about this incredibly diverse array of games and experiences.
    0:03:09 Yes, absolutely.
    0:03:10 And so how do people hang out?
    0:03:15 Is text chat, voice chat, how does the interaction happen?
    0:03:20 Yeah, so it starts with you open the Roblox app or the Roblox web page.
    0:03:25 And there are hundreds of experiences there for you to select from.
    0:03:29 You know, think of a movie that you have to select to watch.
    0:03:32 But after that, you’re inside the movie, that’s the difference.
    0:03:39 And so you press play like you would do on any other app where you consume content.
    0:03:42 But then you get into the universe.
    0:03:48 And my favorite ones are where you can actually immediately create an avatar.
    0:03:51 You know, you get feathers if you’re willing to.
    0:03:55 And if you’re interested, you get like a lovely feather dress.
    0:03:58 And, you know, there’s a lot of audio going on.
    0:04:05 And so another favorite ones of mine are where you can dance with people.
    0:04:10 So I must say ballroom dance is one of my favorites right now.
    0:04:10 Excellent.
    0:04:13 And I just keep learning dancing.
    0:04:18 Something that honestly, I wouldn’t be able to do in my normal weekday.
    0:04:20 But I can do a ballroom dance in the evening.
    0:04:26 So let’s talk about your role, what’s under your purview and how that, you know,
    0:04:30 sort of informs the experience for the millions and millions of Roblox users.
    0:04:35 You one of the things, as I understand it, that you’re responsible for is managing
    0:04:38 and optimizing infrastructure across the platform.
    0:04:43 And there’s a lot happening now with AI and generative AI tools in particular.
    0:04:46 So maybe you can you can take us a little bit into that.
    0:04:48 Yeah. So two part answer.
    0:04:52 When your son plays it, he doesn’t realize that we have to connect
    0:04:56 him to a data center, which is hopefully close by,
    0:04:58 so that the experience is amazing.
    0:05:02 Remember, your son is of a generation that believes that pressing play does not involve buffering.
    0:05:06 And I’m sure looking at you, you remember the time when, you know,
    0:05:09 you would press play and some buffering would happen.
    0:05:11 So that’s the first magical experience.
    0:05:14 Oh, I remember that the AOL dial-up sound.
    0:05:14 That’s what I’m thinking about.
    0:05:16 There you go.
    0:05:22 But this generation of users on the Internet, just expect to press play.
    0:05:25 Then we have to place him inside the experience. Behind the scenes,
    0:05:29 what’s happening is he’s connecting with one of our edge data centers.
    0:05:34 And then you have to decide who are the best people your son could play with.
    0:05:37 You know, it has to be age appropriate.
    0:05:38 It has to be interesting.
    0:05:40 Maybe he already has friends on the platform.
    0:05:43 So I have to quickly figure out where are his friends?
    0:05:46 Are they in the same data center or they’re in a different data center,
    0:05:50 placing them in the right instance of the experience?
    0:05:55 Our biggest experience sometimes has 40 to 50,000 copies running worldwide
    0:05:57 on 20 plus data centers.
    0:06:00 So we have to place them in the right instance.
    0:06:03 After that, let’s say he wants to chat with people.
    0:06:04 So we have to enable that chat.
    0:06:08 Let’s say he wants to do a voice call with one of them.
    0:06:09 We have to enable that.
    0:06:14 Let’s say he actually wants to talk to somebody in India,
    0:06:16 which means he has to talk in Hindi.
    0:06:19 But assuming, let’s say, your son doesn’t know Hindi.
    0:06:25 We have real-time translation to facilitate chatting with somebody in India.
    0:06:26 So in Hindi.
    0:06:31 So that’s the kind of stuff that we in the infrastructure team build.
    0:06:33 Now, we like to be invisible.
    0:06:38 You should never think about the data center that you’re connecting to.
    0:06:40 How does machine learning play a role?
    0:06:41 Broad question.
    0:06:42 Very broad question.
    0:06:47 So we think about machine learning, typically, to start with,
    0:06:50 either for the creator or for the user.
    0:06:52 So you have to create these magical experiences.
    0:06:55 We talked about ballroom dance.
    0:06:57 We talked about dress interest.
    0:06:59 There’s a lot of storytelling in there.
    0:07:03 But most of our creators have to write code to build that world.
    0:07:06 Anything that facilitates their building of the world.
    0:07:08 It’s their idea.
    0:07:10 We don’t build our own experiences.
    0:07:13 It’s millions of developers who build experiences.
    0:07:17 We need to facilitate that really, really quickly.
    0:07:19 And I’m sure we’ll go into those examples.
    0:07:22 On the other side are the users.
    0:07:24 The users just join the experience.
    0:07:29 And if we want a safe environment, if we want a civil environment,
    0:07:33 a very simple example, you and I are chatting on Roblox.
    0:07:37 Every letter that you type, every word that you type,
    0:07:40 goes through a text filter for a civility check.
    0:07:45 And just that portion of our world is–
    0:07:46 I have to get this right.
    0:07:56 It’s 90.7 billion messages are translated, and 2.5 billion chat messages are sent every day.
    0:08:01 And every one of these have to be checked whether you’re being civil to me,
    0:08:02 whether I’m being civil to you.
    0:08:07 And this is in real time across all the different languages,
    0:08:09 all the different data centers and geographic locations,
    0:08:13 and the private chats and the public chats and just the scale.
    0:08:14 Every time, yeah.
    0:08:19 Now, of course, people choose to sample this in other settings.
    0:08:22 But from day one, we’ve not sampled.
    0:08:25 Every one of these goes through a text filter,
    0:08:31 which means you need a machine learning solution to look at 2.5 billion chat messages.
    0:08:35 And sometimes, as you said, in a private setting,
    0:08:38 we might be more colorful with each other.
    0:08:42 But in a public setting, we’re not going to be colorful.
    0:08:44 We should not be colorful.
    0:08:51 So making all those decisions in real time would have only been possible with machine learning.
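
    A minimal, hypothetical sketch of the kind of per-message civility check Singh describes, using an off-the-shelf open-source toxicity classifier from Hugging Face. This is not Roblox’s actual filter; the model name is just a public example, and a production system would serve its own distilled, fine-tuned models at very low latency.

    from transformers import pipeline

    # Example public toxicity classifier; a real deployment would swap in its own model.
    toxicity = pipeline("text-classification", model="unitary/toxic-bert", top_k=None)

    def is_civil(message: str, threshold: float = 0.5) -> bool:
        """Return True if no toxicity category scores above the threshold."""
        scores = toxicity(message)[0]  # list of {"label": ..., "score": ...} dicts
        return all(s["score"] < threshold for s in scores)

    # Example usage: is_civil("see you after the race!")
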
    0:08:57 How do you approach a problem sort of at that scale?
    0:08:59 How do you get– has that been from day one?
    0:09:04 Has the importance of real time civility tools to put it that way?
    0:09:07 Has that always been baked into Roblox’s mission,
    0:09:10 or is it something that kind of evolved necessity?
    0:09:14 Yeah. So one thing for us is it’s always been baked into it.
    0:09:19 So our engineers never start building something, saying maybe we should sample,
    0:09:25 maybe we should just do it for Noah, but not for Anupam, which is arbitrary.
    0:09:26 It’s for every user.
    0:09:32 So first, your design principle should be clear that I want to do it for every user, every chat message.
    0:09:38 Once you’ve established that, every few years, we have had to change our machine learning model.
    0:09:45 So a few years ago, there was a model that was– AI people have to have acronyms, by the way, Noah.
    0:09:48 So one of them is SOTA, state of the art.
    0:09:52 The problem is, the SOTA changes every year.
    0:09:57 So we used the state-of-the-art model, which would do a billion messages.
    0:10:02 So we took the model, we trained it with our own data,
    0:10:07 and then we had to make it smaller to run faster.
    0:10:10 Right, right. It could be billions of parameters,
    0:10:15 and then you sort of put it on a diet so it can run faster.
    0:10:20 And so we go from, let’s say, 70 plus billion to 7 billion parameters.
    0:10:26 But what’s happened in the last, I would say, 24 months, which is exciting,
    0:10:28 is the entire model architecture has changed.
    0:10:34 So it’s not just about, hey, let’s take one more model and that does natural language processing.
    0:10:39 You now have all these generative AI models that can do text summarization,
    0:10:41 that can do text moderation, et cetera.
    0:10:47 So we take those bigger models, and to run it from scale, we take every technique possible.
    0:10:49 We have to distill it to make it smaller.
    0:10:53 We have to fine tune it to take our data and fix it.
    0:10:55 And then we run it in production.
    0:11:00 And honestly, once you start running things in production, you have many face-palm moments.
    0:11:02 You say, wow, we could have done this.
    0:11:04 We could have optimized memory.
    0:11:05 We could have batched things.
    0:11:09 So yeah, it’s our design thinking and then running it in production.
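
    A minimal sketch of the distillation step Singh alludes to: training a smaller student model to match a large teacher’s temperature-softened outputs while still fitting the hard labels. The loss below is the standard knowledge-distillation recipe, shown generically; it is not Roblox’s specific training code.

    import torch
    import torch.nn.functional as F

    def distillation_loss(student_logits: torch.Tensor,
                          teacher_logits: torch.Tensor,
                          labels: torch.Tensor,
                          T: float = 2.0, alpha: float = 0.5) -> torch.Tensor:
        """Blend the usual hard-label loss with a KL term against the teacher."""
        soft_teacher = F.log_softmax(teacher_logits / T, dim=-1)
        soft_student = F.log_softmax(student_logits / T, dim=-1)
        kd = F.kl_div(soft_student, soft_teacher, reduction="batchmean",
                      log_target=True) * (T * T)
        ce = F.cross_entropy(student_logits, labels)
        return alpha * kd + (1 - alpha) * ce

    # In training, the large teacher runs in eval mode with no gradients, and only
    # the smaller student is updated with this loss, batch by batch.
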
    0:11:15 Now, I know that, broadly speaking, and also in Roblox, and as I understand it,
    0:11:18 even within the purview of scaling, safety and civility,
    0:11:24 multimodal LLMs are, I don’t want to say the new thing,
    0:11:30 but they’re becoming more and more important and talked about in all phases of technology.
    0:11:32 How do multimodal LLMs play a part?
    0:11:35 Multimodal models play a part in your work.
    0:11:40 And this may or may not be relevant, but it’s something that popped into my mind talking about civility.
    0:11:43 When you were talking about running the chats through text filters,
    0:11:52 it made me think about slang and lingo and how you sort of keep up with words that I might not understand,
    0:11:56 but might be extremely colorful to use that word to people in the know.
    0:11:59 So how do you just keep up with all of that?
    0:12:03 And then second question is about non-text things.
    0:12:13 So if an avatar has something questionable, or I don’t know actually if you’re allowed to share images and memes and that sort of thing.
    0:12:14 But yeah, you can talk about that.
    0:12:25 Yeah, so let’s talk about multimodality and then we’ll talk about, you know, what I would call internet slang, things that you and I might not know are offensive, but they are offensive.
    0:12:35 But multimodality, you’re 100% right that in the last two years, we just assume text models are there.
    0:12:42 Any of us can access ChatGPT, any of us can access, you know, any model that you want to pick up, right?
    0:12:47 What’s exciting about the last one year is that the use cases are also expanding.
    0:12:52 So more and more on our platform, people would love to use voice.
    0:13:07 So what we did around two years ago, we decided that we will treat voice the same way as we treat text, which means, you know, you want to check for profanity and other offensive words in speech.
    0:13:18 And so we use a very large model, thankfully, large models are available both open source and closed source, to convert voice to text first.
    0:13:25 And then we use all our text knowledge from 10-plus years to moderate voice.
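
    A hypothetical sketch of the two-stage voice pipeline described here: transcribe speech with an open-source ASR model, then hand the text to the existing text-moderation stack. Whisper is used only as an example ASR model, and the civility check is a trivial stand-in for a real trained classifier.

    import whisper  # pip install openai-whisper; used here only as an example ASR model

    asr_model = whisper.load_model("base")

    def is_civil(text: str) -> bool:
        """Trivial stand-in for a real text-moderation classifier."""
        deny_list = {"exampleslur"}  # hypothetical; a real filter is a trained model
        return not any(word in text.lower() for word in deny_list)

    def moderate_voice_clip(audio_path: str) -> bool:
        """Transcribe a voice clip, then reuse the text moderation stack on it."""
        transcript = asr_model.transcribe(audio_path)["text"]
        return is_civil(transcript)
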
    0:13:29 The next question that you have is even more interesting, which is images.
    0:13:46 We’ve always had image moderation on our platform, for 10-plus years. Our founder likes to tell the story that he used to do the moderation himself maybe 15 years ago; he doesn’t anymore, but he used to.
    0:13:50 We now moderate billions of these.
    0:13:56 And the exciting part is each one of these large models has the wisdom of the internet packed into them.
    0:13:58 Earlier, we might have to train it from scratch.
    0:14:01 Maybe five years ago, Roblox would have to train it from scratch.
    0:14:06 So now let’s say there is a new slang term or a new offensive image.
    0:14:13 Earlier, I wouldn’t get the benefit of somebody else having already seen it in a large, state-of-the-art, open-source model.
    0:14:24 But now we start many of our initiatives with a large open source model that has the wisdom of the internet and is able to update to newer things.
    0:14:29 Then we fine-tune it, we distill it to Roblox, as we would say.
    0:14:36 So image, voice, and text are all in production for us.
    0:14:46 Then if you’re going to take multi-modality further out, and I know you’ve talked about this in some of your other podcasts, it’s really that it’s not just a picture.
    0:14:51 That is, you know, I could make something really interesting.
    0:14:54 And I don’t want to just focus on civility here.
    0:15:01 It’s just the beauty of what you can create on the platform and how quickly you can create it.
    0:15:07 So we are very focused on 3D and 4D as the next modality.
    0:15:14 And then, honestly, we don’t see a lot of models that have already been built in open source.
    0:15:19 There is no easy wisdom of the internet as far as 3D and 4D.
    0:15:26 And so you’re using kind of a cross all these things I’m generalizing, but you’re using a mix of available open source,
    0:15:32 you know, available models that you find, train and otherwise, but then you’re also building models in-house.
    0:15:42 Yeah, OK, so for example, for recommendations, our model is internally built for certain safety use cases, our model is internally built.
    0:15:45 We have a coding assistant, but we have our own language.
    0:15:52 So we take an open source model and then we combine it with our language and build a large model.
    0:15:59 I’m speaking with Anupam Singh. Anupam is vice president of AI and growth engineering at Roblox.
    0:16:04 Roblox is one of the leading online platforms in the world, bar none.
    0:16:13 But certainly when it comes to creating and using all kinds of digital experiences from games to role-play simulators,
    0:16:20 ballroom dancing, as you said, Anupam, it’s just amazing, all the things that are out there and being built literally every day.
    0:16:29 You touched on this briefly and I want to sort of flip over to the creation experience and how, you know, you mentioned Roblox has its own code,
    0:16:32 the Roblox Studio, I believe, is the platform for creating.
    0:16:41 But actually first, I wanted to start with talking about avatars, because my experience has been, you know, as a user, as a parent of a user,
    0:16:49 but then as a user myself, I’ll go on and play games. I love the driving games. There’s some fun and big paintball is one of my favorites.
    0:16:56 But the whole idea of an avatar is just incredibly essential and important.
    0:17:05 And, you know, it’s vital to so many people’s online identities broadly, but particularly on your platform.
    0:17:10 You know, every time my son will show me like, hey, check out my new avatar’s outfit.
    0:17:18 And it’ll be, you know, his favorite footballer’s uniform or it’ll be, you know, some some just wild costume or this or that.
    0:17:26 So can you talk a little bit about the technology sort of behind being able to create and customize avatars?
    0:17:32 And, you know, how that relates to, I mean, basically anything that that is important to you, but talking about civility.
    0:17:39 But then, you know, more talking about just like giving users the tools to express themselves.
    0:17:48 Yeah, avatars are such a rich theme. We could spend hours talking about it, but let’s start with a very simple application.
    0:17:53 There’s a likelihood that you want the avatar to reflect some of your personality, right?
    0:18:02 Including you have this lovely backdrop behind you, which people can’t see on the audio podcast of these lovely electronic devices.
    0:18:07 And so let’s start with taking your photo and creating an avatar out of it.
    0:18:13 To be simple, if it was only 2D, 2D filters are on the phone, you can do them today.
    0:18:17 Great thing, right? By all the great phone manufacturers.
    0:18:20 But you want it to moron, you want it to mo.
    0:18:29 That’s when, you know, creation of these photos of the generation of 2D and 3D are major innovations.
    0:18:35 And it is ongoing. There, our technology has to, you know,
    0:18:43 both be built from scratch, but also use some of the existing 2D to 3D, 2D to 3D innovation.
    0:18:47 Then, okay, we took your photo, we will take an avatar, but we want to set it up.
    0:18:52 You know, we want to give it a phone, which is beyond just your maybe face.
    0:18:59 That’s how to set it up. Next, we want to track your upper body, because your upper body movements.
    0:19:04 I love to gesture a lot. Maybe you gesture less. How do we capture that?
    0:19:09 And after all of that, you need to put it in a word, in a cohesive way.
    0:19:16 So each one of these, think of these as almost separate modalities.
    0:19:19 Movement is a different modality from texture.
    0:19:23 It is a different modality from what you’re wearing, between material.
    0:19:27 And none of that, to you or your son, is interesting.
    0:19:34 It is our job as an AI platform, that the right models get called, the right models get orchestrated.
    0:19:39 Of course, in an ideal world, and I’m looking very, very into the future,
    0:19:42 all of this should be answered by one model.
    0:19:50 It can do photo talk, it can set up your upper body, it can set up your face and body coordination,
    0:19:56 it can set up your gestures, and then put you into a world where you’re dancing to impress or dressing to impress.
    0:20:01 But in reality, I tell you, each one of these are different models being called.
    0:20:04 So there are specific, specialized models for each thing to do.
    0:20:07 – Let’s talk about creators. – Yeah.
    0:20:11 You know, you mentioned you’ve got your own language that creators can code in.
    0:20:15 There’s obviously, as we’ve alluded to during the conversation,
    0:20:21 all kinds of visual, 3D, physics engines, all kinds of things happening in these experiences.
    0:20:27 How are you incorporating generative AI into the creator tools?
    0:20:33 And what are some of the, not just concerns, but some of the exciting things that are happening,
    0:20:39 whether they’re at the level of an individual creator, or if we’re talking about orchestrating all of this at this massive scale?
    0:20:47 Yeah, so the creator journey is very fascinating, because many of our creators are unbelievable at storytelling.
    0:20:52 Right, and forgive me for interrupting, but just to sort of emphasize,
    0:20:56 it’s been like a wake-up call to sort of how, you know, this generation,
    0:21:01 and for me, it’s the generation of my son, but obviously your user span ages.
    0:21:09 But just how important storytelling and just sort of the like personal, narrative, emotional aspects of these experiences really are, right?
    0:21:14 I’ll watch my son, you know, role play, buying a house and decorating it.
    0:21:20 And just doing, to me, seem like mundane things, but he’s hanging out and living this experience.
    0:21:22 And it’s amazing.
    0:21:24 Oh, yes, building a house on Roblox, it’s so much fun.
    0:21:27 We could go on, but let’s go to the creator.
    0:21:29 So there are creators.
    0:21:35 One of my favorite experiences right now is being able to land on any airport in the world,
    0:21:39 which is, you know, airport simulators has been around for a long while,
    0:21:41 but combining it with Roblox is storytelling.
    0:21:46 Now, but think about the creator, what they have to do is they have to start building the airport.
    0:21:48 And I always speak by break.
    0:21:50 We want to facilitate that really, really quickly.
    0:21:56 It all starts with the most popular use case today in the industry is coding assistant.
    0:21:59 Just let me code faster, right?
    0:22:04 But after I’ve coded the world, now I want to give it texture.
    0:22:07 Now I want to give it material, right?
    0:22:11 You know, you’re a creator.
    0:22:15 You’re giving your users lovely aviator glasses so they could be pilots, right?
    0:22:16 Talk to your friends.
    0:22:21 And these aviators should have textures, they should have sound.
    0:22:23 The material should be maybe metal.
    0:22:26 All of this takes some time for our creators.
    0:22:30 And they’re in a hurry to tell a beautiful story, but this takes time.
    0:22:36 So for every step of their way, generative AI starts helping them.
    0:22:40 The coding assistant is a pure text model.
    0:22:47 But the material and texture models are much richer and more interesting.
    0:22:52 And now every one of our creators would love to have their players worldwide.
    0:22:53 Right, of course.
    0:22:55 Now, should they spend time translating?
    0:22:59 And that’s where Roblox comes in, our translation is automated.
    0:23:06 You, if Noah is creating a brilliant experience, but now you want to translate it
    0:23:08 everywhere, we just make it happen.
    0:23:13 And now once people are into your experience, we do all the chat translation too.
    0:23:18 So from the time of coding to the time of release, we’re helping you everywhere,
    0:23:19 every step of the way.
    0:23:22 >> That reminds me of something I wanted to ask you earlier.
    0:23:28 When we talk about real time in the context of, say, translation or
    0:23:31 even civility filters to put it that way.
    0:23:35 Do you have a benchmark for what real time means?
    0:23:41 >> So a real time for us is literally as you’re typing, the translation happens.
    0:23:45 So as long as the user doesn’t see the thing we talked about,
    0:23:50 when you press play, you shouldn’t see buffering, that’s what we aspire to.
    0:23:51 So we shouldn’t even seem to exist.
    0:23:56 A great example of that, and our founder loves to throw this example:
    0:24:00 Noah and Anupam get on a one-on-one on Roblox.
    0:24:04 We have a Roblox office; you and I get on a one-on-one.
    0:24:06 >> Noah says something colorful.
    0:24:06 >> Yeah.
    0:24:07 >> In real time.
    0:24:09 >> As I want to do.
    0:24:10 >> Yeah.
    0:24:14 >> In real time, you will get a nudge from our voice moderation system.
    0:24:14 >> Right.
    0:24:16 >> Then you say something more colorful.
    0:24:20 You will now be suspended maybe for a few minutes.
    0:24:20 >> Right.
    0:24:22 >> So that is what real time means to us.
    0:24:26 It is not a warning that will come to your inbox a day later.
    0:24:29 >> As we start to wrap up here and ask you the question,
    0:24:32 we always ask it in these conversations.
    0:24:33 What’s next?
    0:24:37 What are you either working on now or looking ahead to?
    0:24:40 You mentioned, of course, way out in the future,
    0:24:44 the one model that can handle all these different modalities and things.
    0:24:47 Are there either problems that you’re trying to tackle right now or
    0:24:53 something in the near horizon that you’re really excited about that you might give us a glimpse into?
    0:24:55 >> Huge excitement is 4D.
    0:24:59 I know you’ve talked about 3D and 2D in many of your podcasts.
    0:25:03 The fourth dimension, and I sound science-fiction-y,
    0:25:07 but the fourth dimension is you want a car in your experience.
    0:25:09 The car doors have hinges.
    0:25:11 It has to open the right way.
    0:25:15 My favorite example is, I live in San Francisco.
    0:25:19 I want to recreate all of San Francisco in its glory.
    0:25:24 Alcatraz Island, the Pyramid building, the Painted Ladies, Victorian houses, on Roblox.
    0:25:27 I can create that, but it will take me some time.
    0:25:30 But once I create the Victorian house, I want to enter it.
    0:25:32 I want to open the door.
    0:25:38 I want to hear the echo, because the sound is different when I enter a Victorian house versus a glass house.
    0:25:42 All of that takes effort on our current platform.
    0:25:43 >> Of course.
    0:25:48 >> What if you could just talk to it and then you start building the story?
    0:25:50 You’ve seen all these movies like in San Francisco, right?
    0:25:53 From romantic comedies to action thriller.
    0:25:56 What if you could do that storytelling much faster?
    0:26:00 Because the base of San Francisco was created so quickly for you.
    0:26:03 That’s what I’m excited about, where creators don’t think about models.
    0:26:06 They don’t think about coding assistants, world templates, etc.
    0:26:11 They just think about storytelling, and the model is in the background just assisting them continuously.
    0:26:15 It’s not even called an assistant because it is so much in the background.
    0:26:21 >> We could talk for hours at least about any of these topics, as you alluded to throughout.
    0:26:24 Perhaps we’ll get the opportunity to do this again in the near future.
    0:26:31 I don’t think Roblox is going anywhere, so it would be great to check back in as technology advances
    0:26:32 and the experiences advances.
    0:26:37 But we do have quite a few technical-minded listeners in the audience.
    0:26:44 So for folks who want more, want to dig into how all of these things, the models and the filters
    0:26:49 and the real-time translation, are happening or are happening at such scale, are there places online?
    0:26:53 Obviously Roblox has a website and a blog, but are there specific places
    0:26:55 that you would direct folks to learn more?
    0:26:58 >> Oh, yes. We have a Tech Talks podcast.
    0:26:59 >> Oh, fantastic.
    0:27:04 >> That is hosted by our CEO and founder, Dave Baszucki.
    0:27:06 Every one of the topics that we talked about,
    0:27:09 whether it’s just outcomes, growth engineering, safety, civility,
    0:27:14 we have 40 to 45 minutes of deep dive by Dave himself.
    0:27:17 >> Great. And that’s Roblox Tech Talks?
    0:27:17 >> Yes.
    0:27:20 >> Excellent. There’s a developer blog as well?
    0:27:22 >> Yes, there’s a blog.roblox.com.
    0:27:23 >> Easy enough.
    0:27:26 >> You can go and look at what we are talking about.
    0:27:28 >> Great. Well, Anupam, thank you again.
    0:27:34 This podcast is not about me, but you’ve made me a star in my own household at least for a day,
    0:27:37 getting to talk to you, so I appreciate that.
    0:27:40 But more importantly, appreciate your coming on the podcast.
    0:27:46 Just to give people a little bit of an insight into everything that has to go on behind the scenes,
    0:27:51 so that whether you’re chatting with a friend, being kept safe and civil,
    0:27:56 or experiencing these amazing 3D experiences,
    0:27:59 you don’t have to wait for things to buffer. They just happen.
    0:28:00 >> Yes. Thank you very much.
    0:28:04 I’m an avid listener of your podcast, so this was great.
    0:28:05 >> I appreciate that.
    0:28:08 Look forward to chatting again and all the best to you and your teams.
    0:28:08 >> Thank you.
    0:28:18 [MUSIC]

    Roblox is a colorful online platform that aims to reimagine the way that people come together — now that vision is being augmented by generative AI. In this episode of NVIDIA’s AI Podcast, host Noah Kravitz speaks with Anupam Singh, vice president of AI and growth engineering at Roblox, on how the company is using the technology to enhance virtual experiences with features such as automated chat filters and real-time text translation, which help build inclusivity and user safety. Singh also discusses how generative AI can be used to power coding assistants that help creators focus more on creative expression, rather than spending time manually scripting world-building features.

  • Michael Rubloff Explains How Neural Radiance Fields Turn 2D Images Into 3D Models – Ep. 226

    AI transcript
    0:00:00 [MUSIC]
    0:00:10 >> Hello, and welcome to the NVIDIA AI podcast.
    0:00:13 I’m your host, Noah Kravitz.
    0:00:15 We’re coming to you from GTC 2024 in San Jose, California,
    0:00:20 and we’re here to talk about nerfs.
    0:00:22 No, not foam footballs and dart guns,
    0:00:24 but neural radiance fields.
    0:00:27 What is this kind of nerf?
    0:00:28 It’s a technology that might just be changing the nature of images forever.
    0:00:32 Here to explain more is Michael Rubloff.
    0:00:35 Michael is the founder and managing editor of radiancefields.com,
    0:00:39 a news site covering the progression of radiance field based technologies,
    0:00:42 including neural radiance fields, aka nerfs,
    0:00:45 and something called 3D Gaussian Splatting that I’ll leave to Michael to explain.
    0:00:50 Michael, thanks so much for taking time out of GTC to join the AI podcast.
    0:00:54 >> Of course, thank you so much for having me.
    0:00:55 So first things first, goofy football jokes aside, what is a nerf?
    0:01:00 What does that mean?
    0:01:01 >> Yeah, so essentially, you can think of a nerf as they allow you to take a series
    0:01:06 of 2D images or video, and what you can do from that is you can actually
    0:01:11 create a hyper realistic 3D model.
    0:01:13 And what that allows for is once you have it created, it’s like a photograph,
    0:01:17 but it’s perfect from any imaginable angle.
    0:01:20 Your composition is no longer a bottleneck.
    0:01:22 You can do whatever it is that you would like with that file and it will look lifelike.
    0:01:27 >> So if I were to take, I don’t know how many, two, three,
    0:01:32 five pictures of the two of us sitting here right now in this podcast room,
    0:01:37 if you will, I could put those together into a nerf.
    0:01:42 And then I would have a file that I can look at from different perspectives.
    0:01:48 I can sort of move through.
    0:01:50 How does that work from the user experience side?
    0:01:53 >> Yeah, so typically, the recommended amount is somewhere between like,
    0:01:57 I’d say, 40 to 100 images.
    0:01:59 It’s really easy from a video because then you can just slice up
    0:02:03 individual frames from that video.
    0:02:05 But there are some methods actually that are going all the way down to three
    0:02:09 images and it’s actually able to reconstruct, which is just mind blowing.
    0:02:13 >> I’m too used to few shot and zero shot learning and things like that.
    0:02:18 So I’m like, one image.
    0:02:19 >> Yes, they are getting there.
    0:02:20 And there’s been a ton of amazing work, one called Reconfusion from Google,
    0:02:25 which is just shocking.
    0:02:26 It can go down as far as three images and it’s still very compelling.
    0:02:31 But yeah, so once you actually have gone ahead and taken your images or video,
    0:02:36 you would run it through a Radiance Field Pipeline,
    0:02:38 whether that’s a nerf-based one or a Gaussian-spotting-based one.
    0:02:41 There are several cloud-based options where it’s just drag and drop your images
    0:02:45 and it does all the work for you.
    0:02:47 And the resulting image or resulting file, yeah, you have autonomy over it.
    0:02:52 And you can kind of experience that whatever you’ve captured from whatever
    0:02:57 angle that you would like or whatever your use case might be for it.
    0:03:01 >> How are they created without getting too technical about it?
    0:03:07 Can you kind of give an overview of kind of what’s going on behind the scenes to
    0:03:10 put these together?
    0:03:11 >> Sure, so once you have your initial images,
    0:03:14 the first step on both nerfs and Gaussian-splatting is running it through
    0:03:18 something called structure from motion,
    0:03:19 where essentially you’re taking all the images and kind of aligning them in a
    0:03:23 space with one another.
    0:03:24 So that’s kind of taking a look, saying like if image X is over here and
    0:03:28 image Y is over here, here’s how they overlap and converge with one another.
    0:03:31 And so that’s kind of the baseline approach for both methods.
    0:03:34 And from there, they each have their own training methods where nerfs have
    0:03:39 a neural network involved in the training of them,
    0:03:41 whereas Gaussian-splatting uses a rasterization.
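
    As a minimal sketch of the NeRF side of that comparison: the core of a NeRF is an MLP that maps a positionally encoded 3D point plus a viewing direction to a density and a color, which a volume renderer then integrates along camera rays. The toy model below shows just that mapping; a full pipeline would add ray sampling, volume rendering, and training against the posed images.

    import torch
    import torch.nn as nn

    def positional_encoding(x: torch.Tensor, n_freqs: int = 10) -> torch.Tensor:
        """Map coordinates to sin/cos features so the MLP can represent fine detail."""
        feats = [x]
        for i in range(n_freqs):
            feats += [torch.sin((2 ** i) * x), torch.cos((2 ** i) * x)]
        return torch.cat(feats, dim=-1)

    class TinyNeRF(nn.Module):
        def __init__(self, n_freqs: int = 10, hidden: int = 256):
            super().__init__()
            in_dim = 3 + 3 * 2 * n_freqs          # xyz plus its sin/cos encodings
            self.n_freqs = n_freqs
            self.mlp = nn.Sequential(
                nn.Linear(in_dim + 3, hidden), nn.ReLU(),  # +3 for the view direction
                nn.Linear(hidden, hidden), nn.ReLU(),
                nn.Linear(hidden, 4),                      # density (sigma) + RGB
            )

        def forward(self, xyz: torch.Tensor, view_dir: torch.Tensor) -> torch.Tensor:
            h = torch.cat([positional_encoding(xyz, self.n_freqs), view_dir], dim=-1)
            out = self.mlp(h)
            sigma = torch.relu(out[..., :1])    # density must be non-negative
            rgb = torch.sigmoid(out[..., 1:])   # color in [0, 1]
            return torch.cat([sigma, rgb], dim=-1)
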
    0:03:44 >> Okay, and I guess that kind of begs the question of beyond what you just said or
    0:03:48 maybe that’s it.
    0:03:48 But what’s the difference between a nerf and a Gaussian-splatting?
    0:03:52 >> Yeah, so NeRFs essentially were created first.
    0:03:55 They were created as a joint effort between UC Berkeley and Google.
    0:04:00 >> Okay.
    0:04:00 >> So NeRFs have an implicit representation and
    0:04:06 they're trained through a neural network.
    0:04:07 So there’s a lot of work being done to get higher and
    0:04:11 higher frame rates associated with that.
    0:04:13 Whereas Gaussian splatting uses just direct rasterization.
    0:04:16 And so you’re able to have a much more efficient rendering pipeline where
    0:04:21 you can really easily get 100 plus FPS and
    0:04:25 you can use them with a lot of different methods as well.
    0:04:27 So they're very compatible with Three.js and React Three Fiber, where you can see them
    0:04:33 being used in website design now and on platforms like Spline.
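    One way to picture what that rasterizer is doing: for each pixel, the projected Gaussians are sorted by depth and blended front to back with standard alpha compositing, which is also what NeRF's volume rendering reduces to once it is discretized. A toy sketch of that per-pixel blend, with made-up colors and opacities rather than a real renderer:

```python
# Toy sketch of front-to-back alpha compositing for one pixel:
# color = sum_i T_i * alpha_i * c_i, where T_i = prod_{j<i} (1 - alpha_j).
import numpy as np

def composite(colors, alphas):
    transmittance = 1.0
    out = np.zeros(3)
    for c, a in zip(colors, alphas):  # contributions sorted near-to-far
        out += transmittance * a * np.asarray(c, dtype=float)
        transmittance *= (1.0 - a)
    return out

# A mostly opaque red splat in front of a blue one: the result is mostly red.
print(composite([(1.0, 0.0, 0.0), (0.0, 0.0, 1.0)], [0.6, 0.9]))
```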
    0:04:36 >> Very cool, and so that kind of gets into the next question,
    0:04:39 which is how are they being used now?
    0:04:40 I saw some examples on, I don’t know if it was the NVIDIA developer blog.
    0:04:45 And they used a kind of an obscure song that my wife and I like a lot.
    0:04:50 So it was perfect, right?
    0:04:51 But it was a NeRF, I believe, of a couple walking down memory lane.
    0:04:56 They were walking outside in the foliage around them.
    0:05:00 And you were able to, I was able to sort of zoom around from different points of
    0:05:03 view, see their front, see their back, look at the trees, that kind of thing.
    0:05:07 But beyond sort of a demo scene like that,
    0:05:10 how are NeRFs being used out in the world?
    0:05:11 >> Yeah, so that specific one actually is of my parents that I took.
    0:05:14 >> Oh, it’s your parents, that’s very cool.
    0:05:16 >> And so I took, because that’s one of the major use cases for me.
    0:05:18 I want to be able to document my life not only in two dimensions, but
    0:05:22 I want to have a hyper realistic three dimensional moment in time frozen.
    0:05:26 And so for me, that’s one of the personal use cases.
    0:05:29 But on a more commercial basis, where you’re seeing a lot of the early adoption
    0:05:34 is in the media and entertainment world as well as the gaming world too.
    0:05:39 So for instance, Shutterstock has been putting together a library
    0:05:43 of radiance fields. So say, hypothetically,
    0:05:47 you want to film at Grand Central Terminal,
    0:05:52 but you cannot afford to shut down all the traffic and all the trains and
    0:05:55 all the foot traffic to film there.
    0:05:58 What you can do is use a radiance field.
    0:06:01 If you capture it once, you’re able to then bring that file into Unreal Engine and
    0:06:06 into a virtual production environment.
    0:06:08 And then you can film infinitely.
    0:06:10 And there’s no more rush outside of the actual rental rate.
    0:06:13 And you can go and get the shot that you actually need.
    0:06:16 And that’s where it’s starting to get adopted pretty early on.
    0:06:20 And similarly, on the gaming side of things, through generative radiance
    0:06:24 fields, you are able to create these from text and images,
    0:06:28 and newly from video as well.
    0:06:31 Now you're able to drop in these assets; I think the fastest method
    0:06:35 currently takes about half a second to create a full 3D model,
    0:06:39 and you can put that straight into one of the game engines.
    0:06:43 >> Right, and so let me sort of play this back and
    0:06:46 see if I’m grasping it correctly.
    0:06:48 So if I were to go to Grand Central and do my very short,
    0:06:53 relatively speaking, very quick shoot,
    0:06:56 and come away with enough images to create a NeRF,
    0:07:00 I would then be able in a virtual production environment to kind of
    0:07:05 create scenes or put elements into scenes from all of these different points of
    0:07:09 view, not just from a single perspective.
    0:07:11 Is that kind of the big?
    0:07:13 >> Yes, yeah, that’s correct.
    0:07:14 Where essentially you're able to sync up the NeRF to the actual camera.
    0:07:17 And then you can use the full virtual production pipeline to go ahead and create.
    0:07:23 >> Wow, this is something I probably should have asked you at the top.
    0:07:25 So listeners, forgive me for not having a more scientific inquiry kind of way of
    0:07:31 organizing my own thoughts.
    0:07:32 But a radiance field, what does that term mean?
    0:07:36 >> That’s a great question.
    0:07:38 So essentially, you could think about a radiance field as,
    0:07:43 well, if you just break it down into the two simple words:
    0:07:46 The radiance is just what that individual color would look like based
    0:07:52 upon your viewing direction.
    0:07:53 So say if you’re looking at like a, say a glass or something and
    0:07:57 you can see that there’s an actual reflection there.
    0:08:00 Depending on how you look at it. And what radiance fields offer is
    0:08:03 something called view-dependent effects.
    0:08:05 So as you move your head around, just as you do in real life, light changes,
    0:08:09 light shifts and reflects.
    0:08:10 And just like that, radiance fields are able to model that effect.
    0:08:14 So you could think of radiance fields as the actual shift in colors at a given
    0:08:17 point in space.
    0:08:18 The radiance at a specific point is the radiance part,
    0:08:22 and it is being contained inside of a field.
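    For anyone who wants the formal version, in the original NeRF formulation a radiance field is a function that returns a view-dependent color and a density at every point, and a pixel is rendered by integrating those values along the camera ray:

```latex
F_\Theta:\ (\mathbf{x}, \mathbf{d}) \;\mapsto\; (\mathbf{c}, \sigma)

C(\mathbf{r}) = \int_{t_n}^{t_f} T(t)\,\sigma\big(\mathbf{r}(t)\big)\,\mathbf{c}\big(\mathbf{r}(t), \mathbf{d}\big)\,dt,
\qquad
T(t) = \exp\!\Big(-\!\int_{t_n}^{t} \sigma\big(\mathbf{r}(s)\big)\,ds\Big)
```

    The dependence of the color c on the viewing direction d is exactly the view-dependent effect described here: look at the same point from a different angle and the returned color can change, which is how reflections and highlights are reproduced.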
    0:08:25 >> Got it, okay.
    0:08:26 So you said something a minute ago about generative AI,
    0:08:30 using generative AI to create radiance fields.
    0:08:33 And you may have used a term that slipped my mind, forgive me.
    0:08:36 Is it the same basic principle as doing a text to image,
    0:08:42 using a text-to-image model, ChatGPT or DALL-E, Stable Diffusion, whatever it is?
    0:08:47 Is it that same principle that I enter a text prompt and
    0:08:50 then the system can create a radiance field?
    0:08:53 Or is it creating a series of images?
    0:08:55 Or can you get more, kind of more control than that over it?
    0:08:59 >> Yeah, so exactly.
    0:09:01 What it does is it will create a series of images of a singular object.
    0:09:06 And actually, in some cases, they’re starting to release some papers where
    0:09:09 they’re creating multiple objects.
    0:09:11 But each object itself is either a NeRF or, say, a Gaussian splatting file.
    0:09:16 And from that, that’s what’s actually being used to train the actual resulting
    0:09:20 3D model, but they’re able to do that in a fraction of a second.
    0:09:23 >> Sure.
    0:09:25 So earlier this week, you hosted a session at GTC.
    0:09:29 And I, unfortunately, wasn’t able to make it.
    0:09:31 I wasn’t in town just yet.
    0:09:33 We were talking about it before we hit record.
    0:09:35 And you spoke to, if I’ve got this right, some of the artistic implications,
    0:09:39 possibilities around using nerfs.
    0:09:42 And then also some more kind of business enterprise oriented applications.
    0:09:47 How did that go?
    0:09:48 What kinds of things did you talk about?
    0:09:49 And then I’m kind of curious what the audience reaction was.
    0:09:52 Either that or kind of more generally, as people learn about
    0:09:58 NeRFs and Gaussian splatting, what the reaction is?
    0:10:02 And does it spark imagination?
    0:10:04 And sort of what are some of the implications?
    0:10:06 >> Yeah, I was surprised by how many people actually came out to attend.
    0:10:11 And so it seemed like there was an extreme amount of interest in terms of
    0:10:15 just like visualizing itself.
    0:10:17 So I had a roughly 20-minute video of just looping different examples of
    0:10:23 Radiance Fields that I’ve created and some of the people in the community have as well.
    0:10:27 And so I think there was a lot of interest across a wide variety of industries,
    0:10:31 where I spoke to professors, I spoke to people working on offshore drilling sites.
    0:10:36 I spoke to physicians, people of really diverse backgrounds and use cases.
    0:10:41 But I think all of them will be affected by Radiance Fields.
    0:10:44 >> And is the interest in some of those, because I want to ask you about sort of
    0:10:48 artistic creative implications as well, but we’ll put a pin in that for a second.
    0:10:52 Is the interest from, say, a physician or the offshore drilling site makes me think
    0:10:57 of use cases of robots and drones to be able to go to places more safely
    0:11:04 than sending a human to inspect something?
    0:11:06 And is it along those lines of being able to create a Radiance Field and
    0:11:12 then from a, quote, safe environment be able to inspect different aspects
    0:11:17 of the offshore site from different angles?
    0:11:20 Is it that kind of thing, or is it something totally different?
    0:11:22 >> Yeah, no, it actually is quite similar where if you have a predetermined camera
    0:11:27 path or you give the necessary information to the model,
    0:11:32 it will be able to create a hyper realistic view of what it sees.
    0:11:36 And from that, you can then flag for a human if they need to go and
    0:11:41 take a visit for actual maintenance or repairs.
    0:11:44 And so you’re able to really give a hyper realistic look for
    0:11:48 that specific use case for asynchronous maintenance.
    0:11:51 >> Right, right.
    0:11:53 Are there implications with VR and extended reality and augmented reality?
    0:11:57 >> Yes, yes, and so that was actually one of the demos that we were showing.
    0:12:00 It’s just like VR applications because they still retain their
    0:12:06 view dependent effects when you’re in VR.
    0:12:08 And so as you move around a scene, we as humans expect light to behave in a certain
    0:12:13 way, and with this, that continues to hold true.
    0:12:17 And with radiance fields, you can actually walk through the entire scene.
    0:12:22 And so it is, I think, the closest thing to actually stepping back into
    0:12:26 a moment in time that we have.
    0:12:27 >> Yeah, amazing.
    0:12:29 I’m speaking with Michael Rubloff.
    0:12:30 Michael is the founder and managing editor of RadianceFields.com,
    0:12:34 a website that’s covering the progression of Radiance Fields based technologies.
    0:12:38 And we've been talking about neural radiance fields, NeRFs, and
    0:12:42 3D Gaussian Splatting, these techniques that allow us to stitch together 2D images
    0:12:46 and create a hyper realistic 3D model, 3D environment that we can do all these
    0:12:52 different things with.
    0:12:53 I mentioned, wanted to ask you about some of the artistic implications.
    0:12:56 And this might be a weird leaping off point, so redirect me if it is.
    0:12:59 But recently, we recorded a podcast with a woman from a company called Cubrix.
    0:13:06 And I’m going to get this wrong, forgive me.
    0:13:07 But it’s basically kind of like a digital sound stage, sort of an advanced
    0:13:13 digital green screen type thing that you can use in filmmaking,
    0:13:17 video making, and powered by generative AI, kind of similar things.
    0:13:22 And I remember asking her, what advice do you have for burgeoning filmmakers who
    0:13:29 are interested in creating but wondering how to go about it in this age with all
    0:13:34 these AI tools now becoming available and advancing so quickly?
    0:13:38 And her answer was not what I expected, but it was really interesting.
    0:13:40 She said, well, the first thing you should do when you’re thinking about using
    0:13:44 generative AI in filmmaking is really delve into your own subconscious.
    0:13:48 And if I had understood her correctly, I think she was talking about the
    0:13:52 capabilities of what types of images and moving images you can create with
    0:13:59 generative AI tools at your disposal go well beyond what you can create without
    0:14:05 them, and you’re not limited to capturing reality, so to speak.
    0:14:09 You can create reality, which people have been able to do with technology for
    0:14:14 a while now, but easier, faster, perhaps better.
    0:14:18 What do radiance fields do?
    0:14:20 What do you think that they are doing and could do for creative applications?
    0:14:24 Yeah, I see a very large creative opportunity for radiance fields going
    0:14:29 forward.
    0:14:30 I think that they allow people to take larger risks or be able to actually…
    0:14:36 I actually wouldn't even call them risks, because what you can do with them is, once
    0:14:41 you have the capture, you can film in post.
    0:14:44 You don’t actually need to film up front, you just need to capture it.
    0:14:48 And I think that that allows a lot more room to think, where you're not
    0:14:53 being constrained for time to think, like, what are the actual camera movements
    0:14:57 that we want, or what's the best way to actually tell this story?
    0:15:00 And you have the ability to align and say, if you are a director, you can go to
    0:15:04 your director of photography and say, here's the exact camera movement I'm trying
    0:15:07 to convey in this or here’s the exact thing I’m trying to show.
    0:15:11 And I think that that’s going to really supercharge productions and in the same
    0:15:16 way, I think that it’s also going to allow more stories to be told in places
    0:15:22 that we’ve never been before because we can be transported to these places
    0:15:28 and be exposed to the natural lifelike interpretation of wherever you want to
    0:15:35 take an audience.
    0:15:36 And in the same way, I think for students and for independent filmmakers, it
    0:15:41 represents such a massive opportunity because you will have the ability to go
    0:15:45 to locations and tell stories in locations that you may have always dreamed
    0:15:49 of, like shutting down the Las Vegas Strip, for instance.
    0:15:51 But now you’re actually going to be able to do that.
    0:15:53 Right, right.
    0:15:54 Where are we at in terms of the technology when it comes to the resolution
    0:15:59 of images and particularly in the backgrounds of the images?
    0:16:03 Are we just constrained by the quality of your camera and the available
    0:16:09 compute or is it more intricate than that?
    0:16:13 No, I think that where we are right now, it’s fascinating.
    0:16:15 There's a Meta Reality Labs paper that was released late last year called VR-NeRF,
    0:16:20 and essentially what they did was they created this camera rig,
    0:16:25 known as the Eyeful Tower, which has 22, I think, Sony A9 cameras
    0:16:31 strapped onto it, all facing different directions.
    0:16:33 And then from that, what they would do is they’d go into a room and then they’d
    0:16:36 push it through the room and each camera would take nine bracketed exposure
    0:16:41 shots across different exposures.
    0:16:44 Then each one of those images would be compiled into a single HDR image,
    0:16:47 and those HDR images would then be used to train the NeRF.
    0:16:50 And the resulting image quality of that approaches the IMAX level quality.
    0:16:55 And it’s able to be reconstructed.
    0:16:57 It’s not a bottleneck in terms of the visual fidelity.
    0:17:00 It’s able to handle that.
    0:17:02 It’s more of a compute issue right now.
    0:17:04 But obviously, as time goes on, we’re going to get more and more efficient
    0:17:08 computers as well.
    0:17:10 And so it's more a proof of concept to me than anything, that, you know,
    0:17:15 when we get to that level, that's the floor.
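    As an illustration of the bracketed-exposure step described here, below is a minimal sketch of merging one camera position's exposure brackets into a single HDR frame using OpenCV's standard Debevec tools. The file names and exposure times are placeholders, and this is a hedged sketch, not the actual VR-NeRF pipeline.

```python
# Minimal sketch: merge nine bracketed exposures from one viewpoint into a
# single HDR image before radiance field training. File names and exposure
# times are placeholders, not the actual VR-NeRF capture data.
import cv2
import numpy as np

files = [f"bracket_{i}.jpg" for i in range(9)]
images = [cv2.imread(f) for f in files]
exposure_times = np.array(
    [1/1000, 1/500, 1/250, 1/125, 1/60, 1/30, 1/15, 1/8, 1/4], dtype=np.float32
)

# Estimate the camera response curve, then merge the brackets into one HDR frame.
response = cv2.createCalibrateDebevec().process(images, exposure_times)
hdr = cv2.createMergeDebevec().process(images, exposure_times, response)

cv2.imwrite("frame_0000.hdr", hdr)  # one Radiance-format HDR image per viewpoint
```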
    0:17:17 Yeah, right, right, right.
    0:17:19 Beyond capturing moments in time for your own use,
    0:17:24 are you doing other things yourself with the technology right now?
    0:17:27 Yeah, so I’ve been doing some consulting work for businesses
    0:17:30 that want to implement radiance field-based technology into their offerings.
    0:17:34 And I can't talk too much about the work.
    0:17:38 But one of the ones I’m really excited to be working on is Shutterstock,
    0:17:43 where it’s just, you know, how do we create these assets
    0:17:46 that are available for people to actually use today?
    0:17:49 Right, amazing.
    0:17:50 So kind of out in, you know, in the mainstream world, so to speak, right,
    0:17:53 in the pop culture world, I would imagine that we've probably seen NeRFs in action
    0:17:58 and just didn’t realize it or didn’t know how to name it.
    0:18:02 Is that off base?
    0:18:03 Or are there some examples of things out there that listeners might have seen?
    0:18:06 Yeah, there have actually been some pretty high-profile uses of radiance fields,
    0:18:10 where earlier this year, the Phoenix Suns, actually, the entire team was NeRFed.
    0:18:15 And so there are NeRFs of Kevin Durant, Devin Booker, the entire team.
    0:18:20 And they actually are using it as part of their introductory video for this season.
    0:18:26 But you’re like during the games and starting lineups?
    0:18:28 Yes, yes, and it really showcases a way that, you know,
    0:18:31 if you as a business can help bring fans closer to the action
    0:18:35 because, you know, you have these lifelike interpretations
    0:18:37 that are doing the most insane camera movements that, you know.
    0:18:40 And so is it like KD's going up for a shot
    0:18:43 and the camera sort of seems to, and I don’t know if the motion stops or not,
    0:18:48 but the camera sort of stops and then tracks sort of around him from a different angle,
    0:18:53 like that kind of thing?
    0:18:54 Yeah, that’s actually extremely close to one of the examples where he’s about to, you know,
    0:18:58 dunk and it kind of flies up around him and circles around,
    0:19:01 which would be very difficult to go ahead and create.
    0:19:05 But, you know, radiance fields make that actually very surprisingly easy.
    0:19:10 Yeah, amazing.
    0:19:11 Yeah, and there’s also been a lot of other high profile use cases within the music industry.
    0:19:15 So Zayn Malik, for instance, has a music video for Love Like This,
    0:19:20 which I think got something like 10 million views in the first 24 hours,
    0:19:24 which is just insane.
    0:19:25 There's a music video for RL Grime's Pour Your Heart Out,
    0:19:29 which is actually comprised of over 700 individual NeRFs,
    0:19:33 which is just an insane endeavor.
    0:19:35 And every single shot in that music video is a NeRF.
    0:19:39 There's also Usher, whose most recent single, I think it's called Ruin,
    0:19:43 has a few different examples of Gaussian splatting in it.
    0:19:47 And then Polo G has a song that just released called Sorrys and Ferraris
    0:19:51 that also uses Gaussian splatting.
    0:19:54 And then Chris Brown has one as well.
    0:19:58 And J. Cole and Drake put out a music video, I think,
    0:20:02 within the last week or two.
    0:20:03 And there is a NeRF hidden in there.
    0:20:05 Nice.
    0:20:05 And I don’t know if this counts or not, but in Jensen’s most recent keynote,
    0:20:10 there actually is a NeRF in the background of one of his slides.
    0:20:14 All right, should we make it a contest or listeners to spot it
    0:20:17 or do you want to call it out so we can go look?
    0:20:20 If you guys want to, or if you guys want to pause it
    0:20:23 and see if you guys can spot it from the two hours.
    0:20:27 This is great.
    0:20:28 Everybody, you can go queue up your YouTube playlist
    0:20:30 with all the music videos you just mentioned.
    0:20:32 Yes.
    0:20:33 And then, you know, rewind the podcast, listen back,
    0:20:36 while you're spotting the NeRFs.
    0:20:37 And then when you’re done with the pod, go back, rewatch Jensen’s keynote
    0:20:40 and see if you can find it.
    0:20:41 Yes, but it’s in this slide where I think he’s taking a look
    0:20:44 at the different modalities in which Nvidia operates.
    0:20:46 And it’s under the 3D tab and it’s like a coastal cliff view.
    0:20:51 And it was actually taken by one of my good friends, Jonathan Stevens.
    0:20:54 And so it was really cool.
    0:20:55 That was the first time, I think,
    0:20:57 that we've seen NeRFs being featured in the keynote.
    0:20:59 Right, excellent.
    0:21:00 Shout out to Jonathan.
    0:21:02 Michael, closing thoughts: NeRFs, radiance fields, 3D Gaussian splatting.
    0:21:08 For the listener who, you know, never heard of this stuff before,
    0:21:12 listened to our conversation, and, you know, it rang some bells,
    0:21:15 sparked some ideas in their head.
    0:21:17 They’re thinking about going out and, you know, exploring some
    0:21:20 of the music videos we talked about.
    0:21:22 Where do you think this is going to go over the next,
    0:21:25 whatever the time period is, couple of years, 10 years, generations?
    0:21:28 Are all of our photos going to become 3D multi-perspective,
    0:21:33 you know, sort of models going forward?
    0:21:38 Are we forever living in a world of 2D images?
    0:21:42 And then we figure out ways to make them more like 3D models.
    0:21:46 Where’s the future of imaging and sort of post-processing headed?
    0:21:50 Yeah, it’s a great question.
    0:21:51 And, you know, my opinion is that we now have the ability to no longer be constrained
    0:21:56 to 2D, and, you know, 2D is not actually how we experience our lives.
    0:22:01 And it’s, to me, I feel like, you know, it should not be the final frontier of imaging.
    0:22:06 And now that we have the technology to do so, I think it’s really time to begin
    0:22:12 exploring how we can actually document, you know, our lives in a life-like way.
    0:22:17 It’s the way that we actually experience life.
    0:22:18 Because not only can you create, you know, static NeRFs or Gaussian splatting files,
    0:22:23 but you can also create dynamic versions of them, too.
    0:22:27 And so if you can imagine, like, you know, an analogy of, you know, static NeRFs to photos,
    0:22:32 you can also do the same with videos.
    0:22:34 And so I think that, you know, we’re really entering into an age where imaging is not
    0:22:41 the same as it has been for the last, since the inception of photography.
    0:22:46 You know, obviously, it progressed significantly, but I think that now the technology is there
    0:22:50 where we can just take a fundamental leap forwards into an entirely new dimension.
    0:22:54 Come to GTC and you leave in a new dimension, that’s how it works.
    0:22:59 Michael Rubloff, thank you so much for stopping by the podcast.
    0:23:02 Your website, again, is called radiancefields.com.
    0:23:06 Are there other places for people who want to follow your work, learn more about the space,
    0:23:10 you direct them to go, other websites, social media accounts, anything?
    0:23:13 Yeah.
    0:23:14 All my social media handles are just @radiancefields.
    0:23:17 And so that’s just generally LinkedIn, Twitter are the two big ones that I mainly post on.
    0:23:24 But yeah, I would just encourage all listeners just to try downloading some of the platforms
    0:23:30 themselves and some of the really good ones to get started.
    0:23:33 You can take a look at Luma AI, Polycam.
    0:23:36 If you're on Windows, you can download Postshot or Nerfstudio.
    0:23:39 And you know, they’re all free right now.
    0:23:42 And it’s not as bad as you would imagine to actually capture everything.
    0:23:45 It’s actually quite straightforward and is pretty forgiving.
    0:23:48 So yeah, give it a try.
    0:23:49 If you can take a picture, you can make a NeRF.
    0:23:52 Exactly.
    0:23:53 Excellent.
    0:23:54 Thank you again.
    0:23:55 A pleasure talking to you.
    0:23:56 Thank you so much.
    0:23:56 Thank you.

    Let’s talk about NeRFs — no, not the neon-colored foam dart blasters, but neural radiance fields, a technology that might just change the nature of images forever. In this episode of NVIDIA’s AI Podcast recorded live at GTC, host Noah Kravitz speaks with Michael Rubloff, founder and managing editor of radiancefields.com, about radiance field-based technologies. NeRFs allow users to take a series of 2D images or video to create a hyperrealistic 3D model — something like a photograph of a scene, but that can be looked at from multiple angles. Tune in to learn more about the technology’s creative and commercial applications and how it might transform the way people capture and experience the world.

  • Yotta CEO Sunil Gupta on Supercharging India’s Fast-Growing AI Market – Ep. 225

    AI transcript
    0:00:00 [MUSIC]
    0:00:10 >> Hello, and welcome to the NVIDIA AI podcast.
    0:00:13 I’m your host, Noah Kravitz.
    0:00:15 We’re recording at GTC24 in San Jose,
    0:00:18 California, and we’re here to talk about
    0:00:20 the fast growing AI market in India.
    0:00:23 My guest is Sunil Gupta.
    0:00:25 Sunil is the co-founder,
    0:00:26 managing director and CEO of Yotta Data Services.
    0:00:30 Yotta is the first Indian cloud service provider
    0:00:32 member of the NVIDIA partner network program,
    0:00:35 and the company's Shakti Cloud offering
    0:00:37 is India's fastest AI supercomputing infrastructure,
    0:00:41 featuring 16 exaflops of AI compute capacity
    0:00:44 supported by more than 16,000 NVIDIA H100 GPUs.
    0:00:49 Sunil has been a busy man this week,
    0:00:51 but he’s taken some time to stop by
    0:00:53 the podcast studio here at GTC to tell us all about
    0:00:57 Yotta and its role in India's fast-growing AI sector.
    0:01:01 So let’s get right to it.
    0:01:02 Sunil Gupta, thank you so much for stopping by.
    0:01:05 Welcome to the NVIDIA AI podcast.
    0:01:07 >> Thank you for having me. Thank you.
    0:01:08 >> So let’s start with the basics for listeners
    0:01:11 who maybe just heard of Yotta for the first time this week.
    0:01:13 You’ve been in the news, a lot to congratulate you on.
    0:01:16 What is Yotta? What does the company do?
    0:01:18 >> Great. So Yotta is a managed data center and
    0:01:21 cloud service provider that has been operating in India for the last five years.
    0:01:24 We have been running our own self-designed and
    0:01:28 engineered and constructed data centers.
    0:01:31 That way, we offer traditional data center
    0:01:33 co-location, managed hosting, cloud,
    0:01:35 and a variety of managed services there.
    0:01:38 We have been offering GPU services for the last four years.
    0:01:42 Ever since we started, but those were the smaller GPUs,
    0:01:44 the A40s, V40s, and T4s,
    0:01:46 and the use cases used to be creating
    0:01:49 content and game creations and things like that.
    0:01:52 Possibly those were the credentials: that I have such
    0:01:55 large data center campuses,
    0:01:56 and I have got the experience of handling GPUs,
    0:02:00 and delivering it to enterprise customers for different use cases,
    0:02:03 and from an Indian scale point of view,
    0:02:05 if I’m having 700 GPUs today in my data center,
    0:02:08 that still possibly is the largest deployment of GPUs in India,
    0:02:11 and those were the credentials which
    0:02:12 possibly attracted the eye of NVIDIA India,
    0:02:14 and they suggested to Jensen that possibly these are the right guys,
    0:02:17 having data centers that have the power and the right skill sets
    0:02:20 and exposure to GPUs, and then,
    0:02:23 as I said in one of my interviews just recently,
    0:02:26 I'm hungry and ambitious,
    0:02:28 and I want to take the plunge into this market,
    0:02:32 which possibly can become the largest or one of
    0:02:34 the largest markets in the world,
    0:02:35 and it can also become a garage as
    0:02:38 a service provider to the rest of the world for developing AI models.
    0:02:41 So, we took the jump, and today, we have got our first set of deliveries already,
    0:02:49 and possibly by end of March,
    0:02:51 I’ll have all the deliveries completed,
    0:02:53 and by around 15th May,
    0:02:54 we are targeting to go live giving customers our GPU services.
    0:02:58 >> Fantastic. The 15th of May, you say?
    0:03:00 >> Yeah.
    0:03:00 >> Is that Shakti Cloud to go live?
    0:03:03 >> That’s Shakti Cloud, right.
    0:03:04 >> So, can you tell us about Shakti Cloud?
    0:03:05 >> Yes. So, first, yeah.
    0:03:07 So, we are running data centers and other cloud and managed services for four years,
    0:03:11 we have been giving GPUs for four years,
    0:03:13 but yes, Shakti Cloud is something which is essentially a GPU-based cloud.
    0:03:18 It has got 16,384 H100 GPUs, to be precise,
    0:03:22 and it has got a couple of thousand L40S GPUs,
    0:03:28 which are mainly for inferencing purposes.
    0:03:30 We have put up this entire GPU Cloud on
    0:03:33 NVIDIA’s reference architecture to the T,
    0:03:34 we are not deviating from that even for a single element of it;
    0:03:40 we are not deviating from it at all.
    0:03:41 So, essentially, there’s an InfiniBand layer,
    0:03:43 which is actually in a leaf and spine architecture.
    0:03:45 So, we are creating pods of 2,000 GPUs,
    0:03:50 256 nodes put together, create one pod,
    0:03:54 and there’s a core layer on the top of that,
    0:03:56 which essentially means that I can connect eight such pods of 2,000 GPUs
    0:04:04 into one super pod of 16,384 GPUs.
    0:04:04 And that essentially means that on one hand,
    0:04:07 I can give to a small startup, a single GPU or a single node
    0:04:12 or even a partial element of the GPU,
    0:04:16 that is one end of the capabilities for some use cases.
    0:04:18 But on the other hand, if some last-scale customer comes and he says,
    0:04:22 "My model requires 16,000 GPUs working in parallel to train a very large language model,"
    0:04:28 we can develop even that as well.
    0:04:29 So, that is essentially one element in terms of the underlying processing capability.
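    A quick back-of-the-envelope check of the pod arithmetic just described, with one assumption flagged: the per-node GPU count is not stated here, so the standard 8-GPU HGX H100 node is assumed.

```python
# Pod math for the cluster described above. The 8-GPUs-per-node figure is an
# assumption based on a standard HGX H100 baseboard; node and pod counts are
# as stated in the conversation.
gpus_per_node = 8       # assumption: HGX H100 node
nodes_per_pod = 256     # stated: 256 nodes make one pod
pods = 8                # stated: eight pods under one core layer

gpus_per_pod = gpus_per_node * nodes_per_pod
total_gpus = gpus_per_pod * pods
print(gpus_per_pod, total_gpus)  # 2048 16384 -> the "about 2,000" per pod and 16,384 total
```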
    0:04:35 Then, you know, you need as much high-speed special type of storage.
    0:04:39 You know, the data need to be trained on the GPU,
    0:04:42 so it has to be on a high-speed storage.
    0:04:47 So, that is what we have put in from WEKA.
    0:04:47 Then, what we have done, we have put a complete NVIDIA AI enterprise stack,
    0:04:51 software stack on the top of this.
    0:04:53 We have, because we have been working to develop our own sovereign cloud in India
    0:04:57 for the last two years, so that experience came in very handy.
    0:04:59 We, a lot of that coding, actually, we sort of repurposed
    0:05:03 to develop our own orchestration layer, our own self-service portal.
    0:05:07 And so, today, my self-service portal has got entire NVIDIA AI software stack,
    0:05:14 you know, bundled into that.
    0:05:16 I’m putting a whole lot of open-source software libraries.
    0:05:20 I'm putting in a capability for people to invoke the models from, let's say,
    0:05:25 Hugging Face and then bring them into my orchestration layer,
    0:05:28 and then people to bring their own data, you know, annotate the data, clean the data,
    0:05:32 maybe create more synthetic data, do all that stuff.
    0:05:35 It’s essentially MLops operations, and then use an existing model in my marketplace,
    0:05:41 which is there in the portal, to fine-tune their own model, right?
    0:05:44 And once they fine-tune their model, then they can put that model, you know,
    0:05:48 directly for an inferencing purpose in the marketplace,
    0:05:50 or they can take it to wherever they want to take it,
    0:05:51 or they can build that model through APIs into one of their enterprise applications.
    0:05:55 So, essentially, what I have tried to do in Shakti Cloud is that on one side,
    0:05:59 you are creating a very, I would say, reasonably large-size underlying
    0:06:04 infrastructure layer of compute and storage and network.
    0:06:07 And on the top of that, you are trying to create a complete, I would say,
    0:06:11 you know, self-contained orchestration layer, where users can come online,
    0:06:16 they can consume GPUs from one GPU to a partial GPU, thousands of GPUs.
    0:06:20 They can decide to create, you know, their own clusters.
    0:06:24 So, I give a capability to them to make online Kubernetes clusters,
    0:06:27 or they can go for Slurm as a cluster.
    0:06:29 And once they have done that, and then, essentially, they bring in the data
    0:06:33 and train the models, either they make a model grounds up,
    0:06:37 or they use one of the pre-existing models in the marketplace,
    0:06:39 and then fine-tune that with their own data.
    0:06:41 So, that is essentially what we are trying to do in Shakti Cloud.
    0:06:45 Can you tell us a little bit, and for the audience, for myself,
    0:06:48 what the digital economy, what the AI scene is like in India,
    0:06:53 and what, you know, you talked about building a sovereign cloud,
    0:06:56 and clearly, you know, Shakti Cloud is offering, as you said,
    0:07:00 services ranging from a partial GPU all the way up.
    0:07:03 What does it mean to the Indian economy, to the tech sector in India?
    0:07:09 You know, what is this presence going to do?
    0:07:11 Well, see, as you know, I just delivered a speech in the conference today,
    0:07:15 and the topic of my speech was that it is AI from India, for India,
    0:07:21 and also for the world, right?
    0:07:22 So, essentially, there were two elements to my speech.
    0:07:24 One was that India itself is potentially going to be the world’s first
    0:07:29 or the second or third largest market,
    0:07:31 and there are some reasons why I am saying so.
    0:07:33 And the second part of this was that, just like India has dominated the world scene
    0:07:38 as the garage for delivering IT software and services for the last three decades,
    0:07:42 there is no reason to believe that India cannot be also an AI garage for the world, right?
    0:07:48 What India has already, if I just give you some dynamics on the first part,
    0:07:52 why India has the potential to become one of the largest market in the world,
    0:07:55 see, you just see the scale of digital adoption.
    0:07:57 Today, India has got 900 million internet users,
    0:08:01 possibly more than the population of some continents, right?
    0:08:04 Out of this 900 million, 600 million are smartphone users,
    0:08:07 essentially the users who are consuming videos,
    0:08:10 who are consuming images, who are generating videos,
    0:08:12 generating images and actually pushing it out, right?
    0:08:15 India today is far ahead in terms of adopting digital payments.
    0:08:19 You are doing 100 billion payment transactions per month,
    0:08:23 which is like 10 times bigger than any other economies in the world.
    0:08:26 Right.
    0:08:27 India has got more than 5 million trained IT professionals.
    0:08:31 This count keeps on increasing and out of which I think 420,000 plus
    0:08:34 are actually AI trained professionals.
    0:08:35 Okay.
    0:08:36 This is the skill set side and the economic side.
    0:08:38 By the way, Indian economy is growing,
    0:08:40 which is the bright spot in the world scene.
    0:08:42 It’s 7.5% CAGR and as per all the various reports for the next few years,
    0:08:47 because the demographic dividend which India is enjoying,
    0:08:49 the population which is earning is more than the population which is dependent,
    0:08:53 India is expected to grow with the same speed for next couple of years.
    0:08:57 So when you combine all these factors together
    0:08:59 and also couple that with the factor that on the infrastructure side,
    0:09:02 where typically India was seen as maybe a laggard economy,
    0:09:05 in the last couple of years,
    0:09:06 India is building infrastructure like anything in terms of digital infrastructure,
    0:09:09 not only the network and the fiber going to the last mile,
    0:09:12 in the 5G going to the last mile,
    0:09:13 the Indian data center scene, which was just about 200 megawatts as of 2015;
    0:09:19 today India is boasting around 1 gigawatt of data center capacity,
    0:09:23 which is ready and in production, and there's another roughly 700 megawatts
    0:09:27 which is going to be going live in the next couple of months.
    0:09:30 And this is something which is growing at a 40% CAGR.
    0:09:33 So and because all these data centers came up just in the last 7-8 years
    0:09:37 because of the hyperscalers coming and putting up the shops in India,
    0:09:40 so you can just imagine that these data centers have been built
    0:09:43 as per the specification of the hyperscalers.
    0:09:45 So these are world-class, latest generation data centers as good as anywhere in the world.
    0:09:50 And you know, with some retrofitting and extra engineering,
    0:09:55 many of these data centers can be customized to handle the GPU load.
    0:10:01 So if you see, combine the skill stats, growing economy, the digital adoption
    0:10:06 and also that India also is now building up on its infrastructure
    0:10:11 and I now couple that with Shakti Cloud getting launched,
    0:10:15 India will also have a very large supercomputer for the purpose
    0:10:20 of training AI models and putting them for inferencing.
    0:10:23 There’s no reason why India will not, first of all,
    0:10:26 become a very, very large AI market itself.
    0:10:29 We can become a consumer of AI,
    0:10:30 but it will also become a very, very big service provider for the rest of the world.
    0:10:35 You know, to consume AI from India.
    0:10:37 One of the things I noticed in reading about Shakti Cloud
    0:10:39 and listening to some of the media coverage
    0:10:41 was the importance of self-service.
    0:10:44 Can you talk a little bit about that?
    0:10:46 About being able to offer a complete end-to-end, high-performance, AI-tuned platform,
    0:10:51 especially considering the diversity of customers that you’re going to be serving,
    0:10:55 you know, startups, academics, researchers, and obviously industry at scale.
    0:10:59 Yes. So this is one of the things which we focused on big time.
    0:11:03 And I can see, you know, possibly we are following Nvidia’s footsteps.
    0:11:06 So if you see Jensen’s keynote also two days back, you know,
    0:11:09 while he was announcing the launch of GB200,
    0:11:12 which was a hardware part, I can say,
    0:11:13 but 70% of his talk actually was focusing on software, right?
    0:11:17 And as much capabilities you are giving to the end users, you know,
    0:11:21 which is making their life easy to develop their AI model
    0:11:24 and put them for inferencing for different industrial use cases.
    0:11:27 And as much you make it easy, you know, for them to do this, you know,
    0:11:31 and that is where the software comes in.
    0:11:33 So as I said that we have been working to develop our own sovereign cloud in India
    0:11:38 for the last two years. This is something which is the need of the hour for India.
    0:11:42 We actually use that capability to also very, very quickly,
    0:11:46 possibly just in about three months, we actually developed our own self service portal,
    0:11:50 an orchestration layer, completely homegrown,
    0:11:52 which essentially means that end users can come in.
    0:11:56 It can be a small startup or it can be a very, very large organization
    0:11:59 wanting thousands of GPUs for everybody.
    0:12:02 You can just come online, create your own account, authenticate yourself.
    0:12:06 If you are an end user, you can authenticate using my AD.
    0:12:09 If you are a large enterprise, we can plug in your AD
    0:12:11 and you can authenticate with that.
    0:12:13 And once you have done that, then starting from, for example,
    0:12:17 say you want to have a bare metal node.
    0:12:20 So whether you want to have a single bare metal node
    0:12:22 or whether you want to have thousands of nodes, you can subscribe it online.
    0:12:25 You know, it will show your drop down.
    0:12:26 You’ll show the types of GPUs available and you can just subscribe it.
    0:12:29 You want to put an operating system on this?
    0:12:31 So we have got images of various operating systems for that.
    0:12:33 Now you want to create a Kubernetes cluster on that.
    0:12:35 Or you want to create a Slurm cluster on that.
    0:12:37 We are giving those capabilities.
    0:12:38 You select your master nodes and how many worker nodes you want.
    0:12:40 And then, you know, the cluster gets made right then and there.
    0:12:44 And then you can monitor the various health parameters of your clusters.
    0:12:46 The other part is serverless, for example, that especially during inferencing,
    0:12:51 how I’m seeing the requirement coming in from our customers
    0:12:54 that during training time, possibly they need, you know,
    0:12:56 maybe a couple of hundreds, a couple of thousands of GPUs dedicated for themselves
    0:12:59 for train their model.
    0:13:01 But once they have done that over a, let’s say, a period of 60 days or 90 days,
    0:13:04 or whatever the time it takes, depending on the size of their model.
    0:13:07 After that, they’re putting the model for inferencing.
    0:13:09 And inferencing traffic is like, like any other application put on the internet
    0:13:13 where users are coming and consuming your model of the application.
    0:13:17 Now that traffic is going to be very, very bursty.
    0:13:19 There will be peaks where there’s a huge user traffic
    0:13:21 and there will be troughs where basically the traffic is not there.
    0:13:24 So essentially, the users are saying that instead of we paying
    0:13:27 for a dedicated committed capacity of GPUs,
    0:13:29 you rather give it to your GPU on demand,
    0:13:31 essentially what we call as a serverless function, you know,
    0:13:33 where, you know, multiple users are actually vying for the same GPU capacity
    0:13:37 and everybody gets satisfied because there’s enough capacity
    0:13:40 and not everybody is peaking at the same time.
    0:13:42 So that’s for serverless functionality.
    0:13:44 Also, we have put in the ability to develop containers on the fly
    0:13:46 and put the same GPU capacity to the inferencing function as well.
    0:13:51 Then what we did is that, you know, while there’ll be certain people
    0:13:55 who would like to make grounds up large language models,
    0:13:58 they don’t need any of my software other than access to GPUs and clustering.
    0:14:01 But majority of the people possibly will be coming
    0:14:04 to consume one of the foundational model which is available in the marketplace.
    0:14:07 So we have a big marketplace in the orchestration layer,
    0:14:10 and they would like to bring in their own data, possibly to clean the data,
    0:14:13 annotate the data, maybe create some more synthetic data.
    0:14:15 And then they will put the data in one of the pre-existing models
    0:14:19 and actually create their own fine tune models.
    0:14:20 So that I think will be a much, much bigger market going forward.
    0:14:24 Most of the enterprises may not be creating their own foundational models.
    0:14:28 They would rather be using some of the foundational models
    0:14:30 but putting in their own company-specific or industry-specific data
    0:14:34 and create their own, you know, use case-specific models.
    0:14:37 So that is the whole environment we have created in this.
    0:14:40 So there are a whole lot of pre-trained models available,
    0:14:44 whether it is something which is a part of the NVIDIA AI stack,
    0:14:47 which I am integrating into my this,
    0:14:49 or whether you invoke some of the pre-trained models
    0:14:51 from Hugging Face or any of the public platforms.
    0:14:53 In fact, many of my startup customers who are creating
    0:14:57 their own foundational models by default,
    0:14:59 they also are going to put those models also in my marketplace.
    0:15:03 So that their end customers can come to a marketplace
    0:15:05 and start consuming their models to build their own fine-tuned models.
    0:15:08 So if you see this entire capability of this end-to-end software,
    0:15:14 there’s no end to that.
    0:15:15 I mean, I myself don’t know what all we’ll end up doing in the next one year
    0:15:18 because every day is a new discovery as to what all is possible.
    0:15:21 What essentially we are trying to do is that, yes, right now,
    0:15:25 majority of the GPU usage seems to be the domain of the tech companies,
    0:15:31 the startups who are actually trying to train the models,
    0:15:34 either a foundational horizontal model,
    0:15:36 a language model, or an industry-specific LLM.
    0:15:39 But gradually, I will see if the market will start moving more and more
    0:15:43 towards fine-tuned models,
    0:15:44 especially to the needs of a company or enterprise or an industry.
    0:15:47 And that will be put up for inferencing.
    0:15:49 So how we are able to meet the demand of this wider audience?
    0:15:53 You know, the enterprises and the wider startups,
    0:15:55 instead of just focusing on some of the bulk customers up front,
    0:15:58 and that is where the importance of this software comes in:
    0:16:00 multiple people coming in without any manual intervention
    0:16:03 or me trying to do something for them,
    0:16:04 which delays the whole thing.
    0:16:06 They just come and start consuming whatever they want.
    0:16:08 Yeah, yeah.
    0:16:09 Does your approach to security and reliability change going from,
    0:16:15 you know, data centers sort of of the past, if you will,
    0:16:18 to rolling out this large, powerful GPU cloud
    0:16:23 that’s serving so many diverse, you know, types of users and use cases.
    0:16:27 Are there specific things for kind of the AI age
    0:16:31 that you have to take into account when it comes to security?
    0:16:34 Mostly not.
    0:16:35 I mean, and again, I can keep on saying
    0:16:37 that we are all ourselves into a learning phase
    0:16:39 and we are everyday discovering what are the new dynamics coming in.
    0:16:41 But as far as my understanding as on today’s concern,
    0:16:44 there’s not a major difference.
    0:16:45 Because one is that, yes, you know,
    0:16:48 that when you are training the model
    0:16:50 or even when you’re putting the model for fine tuning or for inferencing,
    0:16:53 you know, there was a specific concern of the customers
    0:16:56 that when I’m putting my data, you know,
    0:16:58 it is going to be sometime a very, very secretive,
    0:17:01 you know, very, very sensitive data or sensitive data for a company,
    0:17:04 they want to be doubly sure that even the operator that is ourselves
    0:17:08 are not able to have access to the data.
    0:17:10 So that is where NVIDIA in their HGX H100 boxes
    0:17:13 have come out with a confidential VM feature.
    0:17:16 That is something which gives a confidence to the end users
    0:17:18 that, you know, my data is not exposed.
    0:17:19 And that was very, very important, you know.
    0:17:21 This is one key thing which is specific to AI,
    0:17:23 specific to GPUs, which NVIDIA also brought in
    0:17:25 and which is very, very helpful.
    0:17:27 And many of my customers actually asked for this feature.
    0:17:29 But if you go beyond this, once you have trained the model
    0:17:33 and you put the model for inferencing,
    0:17:35 either the model directly goes to inferencing
    0:17:37 with some console layer through which users can consume the model,
    0:17:40 or you plug it into one of the enterprise applications.
    0:17:42 I think after that it becomes like any other enterprise application
    0:17:45 which end users are coming and consuming.
    0:17:47 So you will have good users who are genuine users
    0:17:49 and you have bad users who are trying to attack it
    0:17:51 and who are trying to make it, you know, go bad.
    0:17:54 So from that point of view, all the security services
    0:17:57 and I’ve got a big practice of cybersecurity,
    0:17:59 you know, as a service for my end customers
    0:18:01 who are today actually hosting there maybe SAP ERP
    0:18:04 or Oracle or some of the intranets and portals.
    0:18:07 So for that, you know, many of my existing services
    0:18:10 is like I give, you know, a SOC as a service.
    0:18:13 There's a SIEM and SOAR.
    0:18:14 So there are firewalls as a service.
    0:18:15 There’s an IDS/IPS as a service.
    0:18:17 You know, users want their end users to be coming
    0:18:20 and they should be authenticated.
    0:18:21 So there’ll be a sort of a PIM layer or a PAM layer.
    0:18:23 So there are a whole lot of security layers.
    0:18:25 There’s a DDoS layer, for example.
    0:18:27 So yes, while I've sized up my internet backbone,
    0:18:31 which connects my data center to the rest of the world through the internet,
    0:18:34 for the particular use cases which were present till date.
    0:18:36 But now with AI coming in
    0:18:38 and once some good, serious good models are put to inferencing use,
    0:18:43 I presume the traffic which will be coming
    0:18:45 and into my data center to use this model will be humongous.
    0:18:48 So from that point of view, expanding and then increasing
    0:18:52 the size of my internet backing make it much more robust
    0:18:55 and also then also protecting it with a layer of follow-alls
    0:18:58 and the DDoS protection layer is something which we are enhancing.
    0:19:01 But if you ask me fundamentally, you know,
    0:19:04 that the approach to security remains the same
    0:19:06 as you would put for an enterprise application.
    0:19:07 A certain specific thing which is specific to AI,
    0:19:09 like I talked about confidential VM feature in H100,
    0:19:12 is something which is new.
    0:19:13 Right.
    0:19:14 I’m speaking with Sunil Gupta.
    0:19:15 Sunil is the co-founder, managing director
    0:19:18 and CEO of Yotta Data Services.
    0:19:20 Yotta Data Services is the first Indian cloud service provider member
    0:19:24 of the NVIDIA partner network program.
    0:19:27 And we've been talking about Yotta's Shakti Cloud
    0:19:30 which is launching, well, by the time you hear this,
    0:19:32 it may be launched already,
    0:19:33 offering India's fastest AI supercomputing infrastructure,
    0:19:37 an end-to-end environment for basically everything you want to do,
    0:19:41 whether you're a startup or a huge player already in India,
    0:19:43 and as you said, Sunil, beyond providing AI services out to the world.
    0:19:48 You’ve been in working with data centers for a while.
    0:19:51 As AI has started to explode into the mainstream consciousness
    0:19:55 and since generative AI in particular
    0:19:57 has really captured people’s imagination,
    0:19:59 there's been more of a light shone on the sustainability of data centers.
    0:20:04 The power, you know, as compute grows,
    0:20:07 power needs grow and these data centers grow physically.
    0:20:09 How does that sort of coexist with a world
    0:20:14 where we’re talking about climate change
    0:20:16 and we’re talking about sustainability
    0:20:18 and where is the power coming from
    0:20:19 and clean power and unclean power and all that kinds of things.
    0:20:22 What are your thoughts?
    0:20:23 What’s your approach to building a giant data center
    0:20:26 and thinking about how sustainability factors in?
    0:20:29 No, no, absolutely.
    0:20:30 I think it is a very, very fine balance
    0:20:33 between your need for growth as well as the need for sustainability
    0:20:37 and how you actually balance it is the real key.
    0:20:39 I often say that sometimes, when you are starting late in an industry,
    0:20:43 it is like a boon sometimes.
    0:20:45 There can be disadvantages to that,
    0:20:46 but there are advantages to that as well.
    0:20:48 So you are able to go to the latest technologies,
    0:20:52 right in one shot, you are able to take that jump up front.
    0:20:54 And similarly, there is a certain
    0:20:56 concern about an industry like the data center industry:
    0:20:59 data centers are known as power guzzlers.
    0:21:00 You know, even today, I think 3% of the world's energy
    0:21:02 is used by data centers, and it is projected to cross two digits
    0:21:05 if the growth of cloud and AI just keeps on happening,
    0:21:09 the way it’s happening.
    0:21:10 So in India, because we started the data center industry late,
    0:21:13 we started scaling it up late and now AI also has come up.
    0:21:15 So by default, we are baking in the technologies
    0:21:18 or the frameworks where not only we build larger data centers
    0:21:23 to handle this type of workload,
    0:21:24 but we also take care of the concerns, so
    0:21:26 that you use a lesser amount of power
    0:21:28 and also you use the right type of power,
    0:21:30 which is the green power.
    0:21:31 So those type of things are getting baked into our design
    0:21:33 right from day one to just to give you an idea.
    0:21:36 So number one, when I designed my data centers,
    0:21:38 I designed it at a PUE level, which was less than 1.5.
    0:21:41 Now for an Indian tropical environment,
    0:21:44 you know, having a PUE of 1.5,
    0:21:45 you know, compared to what India used to have traditionally
    0:21:48 as 1.8 is something which was good.
    0:21:50 It was saving you lots of power.
    0:21:51 And now the attempt is by way of adopting
    0:21:55 latest technologies like a direct liquid cooling
    0:21:57 or immersion cooling,
    0:21:57 you can potentially bring this PUE down to 1.2
    0:22:00 or possibly 1 also.
    0:22:01 So that would be a big,
    0:22:03 big, big contribution to environment
    0:22:04 when you start using lesser power itself, right?
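    To put numbers on those PUE figures: power usage effectiveness is simply the ratio of everything the facility draws to the power that actually reaches the IT equipment,

```latex
\mathrm{PUE} = \frac{P_{\text{total facility}}}{P_{\text{IT equipment}}}
```

    So for a hypothetical 10 MW IT load, a PUE of 1.8 means roughly 18 MW drawn from the grid, 1.5 means 15 MW, 1.2 means 12 MW, and a PUE approaching 1.0 means nearly all of the power goes to the compute itself, which is why rear-door heat exchangers, direct liquid cooling, and immersion cooling come up here in that order.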
    0:22:07 And that’s what we are trying to do even for Shakti Cloud
    0:22:09 that we have right now because the power per rack
    0:22:14 from a traditional 6 kilowatt per rack,
    0:22:16 now you are handling a 48 to 60 kilowatt per rack,
    0:22:19 there’s so much of power being put into the same rack.
    0:22:22 So we have actually taken the liquid,
    0:22:24 the water right up to behind the rack,
    0:22:25 which is called a rear door heat exchanger,
    0:22:28 you know, which reduces the PUE
    0:22:30 and also makes sure that I’m able to handle that workload.
    0:22:32 That’s what I’ve done for the Shakti Cloud
    0:22:33 environment in one of the floors in my data center.
    0:22:36 In my second phase of deployment,
    0:22:38 I’m already getting the direct liquid cooling,
    0:22:41 which is potentially bringing the PUE to 1.2.
    0:22:44 And once NVIDIA certifies,
    0:22:45 and that is something which I hope NVIDIA will,
    0:22:48 I potentially would like to use immersion cooling
    0:22:50 because immersion cooling is something
    0:22:52 where you are just dipping the chips
    0:22:54 directly into a liquid or water.
    0:22:56 And there are no fans and practically the PUE becomes one,
    0:23:02 which essentially means that you are not spending
    0:23:04 any extra power for cooling the equipment, right?
    0:23:07 So this is something which is in terms of you
    0:23:10 trying to reduce as low a power as possible.
    0:23:12 The second part is the type of power you are using,
    0:23:15 are you using more of a coal or thermal based power
    0:23:17 or you are using green power?
    0:23:18 So the good part is that even today in my data centers,
    0:23:22 I’m using more than 50% of the power
    0:23:24 which I’m using in my data center,
    0:23:25 actually are green sources, it’s lots of hydro sources,
    0:23:27 a lot of solar and wind sources.
    0:23:29 And because our approach, since the start when we started
    0:23:32 Yotta, was to build large-scale data center campuses,
    0:23:35 which can have multiple buildings,
    0:23:37 and I can serve customers with megawatts
    0:23:39 and megawatts of power and racks.
    0:23:41 So we took some fundamental steps.
    0:23:43 We actually took power distribution licenses in India.
    0:23:45 If you want to have your own power distribution,
    0:23:47 you have to take certain licenses from the government.
    0:23:49 So I ended up taking those distribution licenses
    0:23:51 for both my Mumbai and Delhi campuses.
    0:23:53 Now that is giving me the leverage to decide my sources of power.
    0:23:57 At any point in time, I can decide which type of power I consume.
    0:24:00 So we actually decided to have more and more green
    0:24:04 in our power consumption.
    0:24:06 And yeah, so I would say that as we get more and more GPUs
    0:24:11 into our data centers,
    0:24:12 the amount of power we’ll be consuming in the buildings
    0:24:15 is just going to scale up much faster than I had imagined earlier.
    0:24:20 So my race to take my sources of power from 50% to hopefully 100% green
    0:24:27 is going to become faster.
    0:24:29 So possibly in a very short period of time,
    0:24:31 I would ideally love to have my power source be completely 100% green.
    0:24:35 Great.
    0:24:35 You’ve spoken about it a little bit in this conversation already,
    0:24:38 but kind of just to put a point on it,
    0:24:40 what does Yotta’s partnership with NVIDIA mean,
    0:24:43 kind of both to the company,
    0:24:44 but also to Yotta’s position sort of now
    0:24:47 and going forward in the global tech landscape?
    0:24:50 Well, today, if you’re talking AI, you’re talking NVIDIA, right?
    0:24:53 NVIDIA is practically holding, I think, 88% of the whole market share,
    0:24:58 right?
    0:24:58 And with the type of announcements
    0:25:01 which Jensen is making, both on the hardware front,
    0:25:03 the GB200 is just taking it so many notches
    0:25:06 above the competition.
    0:25:07 And of course, the bigger focus is clearly
    0:25:11 how do you bring the GPUs to the practical use cases of the industry,
    0:25:14 and you guys are doing so much in terms of the software libraries
    0:25:19 which you are bringing in.
    0:25:20 So that capability, that I’m aligned with NVIDIA
    0:25:24 and you are becoming an NCP, an NVIDIA
    0:25:27 Cloud Partner, you know,
    0:25:29 and you’re following the reference architecture to a T.
    0:25:32 That is something which gives huge confidence to the end customer,
    0:25:35 that these guys are bringing in exactly what NVIDIA is doing
    0:25:38 in their own DGX Cloud.
    0:25:39 I’m practically a replica of that.
    0:25:41 In fact, it is NVIDIA’s professional services team.
    0:25:44 You know, so first I engaged the NVIDIA design team
    0:25:47 to design my entire GPU cloud.
    0:25:49 Now it’s the NVIDIA professional services team
    0:25:51 who will be landing in India in the first week of April,
    0:25:52 and they’ll be there till the 15th of May,
    0:25:55 and they will actually be doing the commissioning also,
    0:25:57 and then they’ll be putting the whole CUDA layer
    0:25:58 and all the software layers on top of that,
    0:25:59 and then they’ll be taking this whole cloud
    0:26:01 through a huge, you know, I would say sequence of testing,
    0:26:06 and then we’ll be publishing the performance benchmarks on MLPerf.
    0:26:10 And we are expecting, because we have followed
    0:26:12 the exact, you know, reference design of NVIDIA
    0:26:16 with the same software layer as well,
    0:26:17 that my performance benchmarks for Shakti Cloud
    0:26:19 ideally should be almost the same
    0:26:20 as what NVIDIA has published for DGX Cloud also.
    0:26:22 So in all my conversations
    0:26:25 with end customers, the moment I tell them this,
    0:26:28 coupled with my software layer
    0:26:29 and coupled with the managed services
    0:26:30 which I’m providing to the end customer in my data center,
    0:26:32 it is something that gives huge confidence
    0:26:34 that yes, these guys are bringing in absolutely top-notch
    0:26:37 capability, the same as what is available in any part of the world.
    0:26:41 So that is helping very, very big time.
    0:26:43 That is from my side,
    0:26:45 but I think if you see it the reverse way also, for NVIDIA
    0:26:47 it’s like a marriage of two consenting partners.
    0:26:51 Right.
    0:26:51 For NVIDIA also, I think, you know, the Indian market
    0:26:54 is a huge market, you know, not to be ignored.
    0:26:58 After the US, India has the highest potential
    0:27:01 to become the market for AI.
    0:27:03 NVIDIA also needed a partner who knows the local situation
    0:27:07 and who has the right set of capabilities
    0:27:09 and the infrastructure
    0:27:10 to take NVIDIA’s capabilities to the end customers.
    0:27:12 So I think it is a great situation to be in.
    0:27:15 We are talking to a whole lot of customers across segments,
    0:27:19 whether it is government customers or startups
    0:27:21 or enterprises or educational institutions
    0:27:22 like the IITs in India,
    0:27:24 or for that matter a whole lot of customers
    0:27:26 from other parts of the world,
    0:27:28 because scarcity is everywhere.
    0:27:29 So whether it is Europe or the Middle East
    0:27:31 or some of the APAC countries, you know,
    0:27:33 I’m getting requirements from everywhere.
    0:27:35 And yes, because it is NVIDIA,
    0:27:37 because it is the NVIDIA reference architecture,
    0:27:39 and because I’ve got all the software stack out there,
    0:27:41 and NVIDIA, the company, is fully behind Yotta
    0:27:44 in our success.
    0:27:45 That is something which is coming in very, very handy.
    0:27:48 That’s tremendous.
    0:27:49 Kind of to wrap up
    0:27:50 and again everything you’ve said so far sort of leads into this
    0:27:53 but to ask the question, what’s the vision?
    0:27:56 What’s the vision for Yotta in India and globally
    0:27:59 once Shakti Cloud is deployed?
    0:28:02 What comes next?
    0:28:02 Where do you see this going?
    0:28:04 And if the answer is like you said,
    0:28:06 I don’t know because everything’s changing so fast,
    0:28:08 that’s okay too.
    0:28:08 Yeah, somehow I believe that you can never have a fixed strategy.
    0:28:12 You know, the days of having a fixed strategy
    0:28:15 and just keeping on working on that strategy are gone.
    0:28:16 I think if you’re changing your strategy
    0:28:18 even after every three months,
    0:28:20 essentially it is not a change of strategy.
    0:28:22 It’s like responding to the market needs, right?
    0:28:24 So you have to change yourself constantly,
    0:28:26 unlearn old things and relearn new things
    0:28:28 which are relevant to the market.
    0:28:30 So if you ask me, my thought process today
    0:28:33 is that, first, India possibly needs 100,000
    0:28:38 or maybe multiple 100,000s of GPUs.
    0:28:40 Today these 4,000 to 16,000 look big compared to India’s scale,
    0:28:44 but if I see it compared to the US market, it is too small.
    0:28:47 But with all the dynamics which I talked to you
    0:28:50 about, the Indian digital adoption,
    0:28:52 there’s no reason to believe that India
    0:28:54 cannot become a much, much larger market.
    0:28:56 Of course.
    0:28:56 So how do I keep on building the GPU capacity,
    0:29:00 the latest generation GPU capacity?
    0:29:02 Jensen himself has committed
    0:29:03 that we will be among the early cloud service providers
    0:29:07 who will get access to GB200.
    0:29:09 So I’d definitely like to have the latest generation of GPUs,
    0:29:12 the latest generation of networking products,
    0:29:14 available in Shakti Cloud.
    0:29:16 We’ll keep on developing the software capabilities.
    0:29:19 We’ll keep on adding more and more pretrained models into this,
    0:29:22 and there’ll be a whole lot of elements coming in.
    0:29:24 With our startup customers and also other customers,
    0:29:27 we are actually signing contracts with them
    0:29:29 that whatever models you are creating,
    0:29:31 we are again asking them, requesting them,
    0:29:33 to publish those models in my marketplace
    0:29:35 so that it becomes one overall ecosystem.
    0:29:37 The second part is I’m not going to restrict myself to India.
    0:29:40 Of course, after Mumbai, which is my starting point,
    0:29:42 I’ll actually put my Shakti Cloud node in Delhi,
    0:29:45 where my second campus is.
    0:29:46 And as per the needs and demand of the customers,
    0:29:49 if at all there are location-specific needs,
    0:29:50 tomorrow nothing stops me from launching it,
    0:29:52 let’s say in Bangalore or some other part of the country.
    0:29:55 What I am very much interested in,
    0:29:57 and these are the talks we are having even for our other cloud,
    0:29:59 which is Yantra Cloud, my normal hyperscale cloud,
    0:30:02 and now also applicable to Shakti Cloud, is that
    0:30:04 there are customers, maybe from a data residency concerns point of view,
    0:30:09 who would like to have the capabilities of Shakti Cloud,
    0:30:12 but they would like the GPUs to be available in their country.
    0:30:15 So, in any case, in my growth plan as a DC operator,
    0:30:19 I already had the vision of constructing data centers
    0:30:22 in some of the APAC countries,
    0:30:23 some of the Middle East and African countries.
    0:30:25 Maybe that will go on a little back burner,
    0:30:29 because constructing a data center in a foreign territory
    0:30:32 takes its own sweet time, but for Shakti Cloud,
    0:30:35 I can still outsource the underlying data center colocation capacity
    0:30:40 from one of the partners in those respective territories.
    0:30:42 And then I can definitely have the GPUs and this whole
    0:30:45 Shakti Cloud layer delivered, implemented
    0:30:47 and commissioned in the respective countries, still managed from India.
    0:30:52 So, essentially, the idea is: one, vertically keep on adding to
    0:30:57 the capabilities in terms of software.
    0:30:59 Second is expanding to more and more GPUs,
    0:31:03 in one data center, two data centers and multiple data centers in India.
    0:31:07 And three, taking Shakti Cloud as a product
    0:31:11 to multiple territories across the world wherever there’s demand.
    0:31:13 So much, there’s so much happening.
    0:31:15 So much, so much happening in the future.
    0:31:17 It’s an amazing time.
    0:31:19 Sunil, for listeners who are digesting this and thinking,
    0:31:23 I need to learn more about Yotta,
    0:31:25 this is something I need to keep an eye on:
    0:31:27 there’s the website, there’s been media coverage.
    0:31:29 Where should they look online to find out more?
    0:31:32 So, you can find out more about Shakti Cloud on our website.
    0:31:35 The URL is www.shakticloud.ai.
    0:31:39 You can read quite a lot of information there; it’s a website
    0:31:43 whose latest version has just gone live, just three days back.
    0:31:46 And right now our online portal, where you can access
    0:31:51 all the services online, is still being put up.
    0:31:55 So it may take maybe a couple more days
    0:31:56 before users can start coming in and consuming the services online as well.
    0:32:00 But there are enough resources for you to learn about Shakti Cloud on ShaktiCloud.ai.
    0:32:04 Fantastic.
    0:32:05 Sunil Gupta, thank you so much for taking the time.
    0:32:08 Obviously, busy, busy times for you; congratulations on all of it.
    0:32:11 But thanks for taking a little time to come out and talk to the podcast audience
    0:32:14 so they can stay abreast, keep an eye on the burgeoning,
    0:32:18 I mean, to everything you said, the Indian market,
    0:32:21 the Indian scene, set to explode.
    0:32:24 And it’s a global world now,
    0:32:25 so the benefits will be felt all over.
    0:32:27 Thank you so much.
    0:32:28 Thank you for having me.
    0:32:29 [Music]
    0:32:39 [Music]
    0:32:49 [Music]
    0:32:59 [Music]
    0:33:07 [Music]

    India’s AI market is expected to be massive. Yotta Data Services is setting its sights on supercharging it. In this episode of NVIDIA’s AI Podcast, Sunil Gupta, cofounder, managing director and CEO of Yotta Data Services, speaks with host Noah Kravitz about the company’s Shakti Cloud offering, which provides scalable GPU services for enterprises of all sizes. Yotta is the first Indian cloud services provider in the NVIDIA Partner Network, and its Shakti Cloud is India’s fastest AI supercomputing infrastructure, with 16 exaflops of compute capacity supported by over 16,000 NVIDIA H100 Tensor Core GPUs. Tune in to hear Gupta’s insights on India’s potential as a major AI market and how to balance data center growth with sustainability and energy efficiency.

  • How Two Stanford Students Are Building Robots for Handling Household Chores – Ep. 224

    AI transcript
    0:00:00 [MUSIC]
    0:00:10 Hello, and welcome to the NVIDIA AI podcast.
    0:00:13 I’m your host, Noah Kravitz.
    0:00:15 We’re recording from NVIDIA GTC24 back live and
    0:00:18 in person at the San Jose Convention Center in San Jose, California.
    0:00:23 And now we get to talk about robots.
    0:00:25 With me are Eric Li and Josiah David Wong, who are here at the conference to help
    0:00:29 us all answer the question, what should robots do for us?
    0:00:32 They’ve been teaching robots to perform a thousand everyday activities.
    0:00:36 And I, for one, cannot wait for a laundry folding assistant.
    0:00:39 Maybe with a dry sense of humor, to become a thing in my own household.
    0:00:42 So let’s get right into it.
    0:00:44 Eric and Josiah, thanks so much for taking the time to join the podcast.
    0:00:48 How’s your GTC been so far?
    0:00:49 I know that you hosted a session bright and early on Monday morning,
    0:00:53 had a couple days since.
    0:00:54 How’s the week treating you?
    0:00:55 >> Yeah, thanks, Noah.
    0:00:57 Our GTC has been going really well.
    0:00:59 Thanks for inviting us to the podcast.
    0:01:01 We had a really great turnout yesterday.
    0:01:04 People have been very engaged and
    0:01:05 people also ask a bunch of questions towards the end in the Q&A section.
    0:01:08 I guess people joined because they’re really tired of household chores.
    0:01:11 >> Who’s not, right?
    0:01:12 >> Yeah, common problem.
    0:01:13 >> So before we get a little deeper into your session and
    0:01:16 what you’re doing with training robots, maybe we can start a little bit of
    0:01:20 background about yourselves, who you are, what you’re working on and where, and
    0:01:24 we’ll go from there.
    0:01:25 >> Yeah, my name is Chengshu Li and I also go by Eric.
    0:01:27 I’m a fourth year PhD student at the Stanford Vision and Learning Lab,
    0:01:30 advised by Professor Fei-Fei Li and Silvio Savarese.
    0:01:34 In the past couple of years, I’ve been working on building simulation platforms
    0:01:38 and developing robotics algorithm for robots to solve household tasks.
    0:01:42 >> Yeah, and I’m Josiah.
    0:01:44 Similar to Eric, I’m one year behind him, so I’m a third year PhD student,
    0:01:48 also advised by Fei-Fei Li.
    0:01:49 And similar to him, I’ve also been working with this behavior project that we’re
    0:01:52 going to talk about today for the past couple of years.
    0:01:55 And I’m really excited to, I don’t know, see robots working in real life.
    0:01:58 And we’re hoping that this is a good milestone towards that goal.
    0:02:01 >> Excellent.
    0:02:02 Before we dive in, for those out there thinking like these guys are studying
    0:02:06 right at the heart of it all and they’re in the lab and
    0:02:08 they’ve got these amazing advisors, I’m going to put you on the spot.
    0:02:11 One thing people might find surprising or interesting or
    0:02:15 fun about the day to day life of a PhD student researcher working in the Stanford
    0:02:22 Vision Lab.
    0:02:23 >> One interesting thing, I think for me, I’ll say one interesting thing is that
    0:02:28 I didn’t expect it to be this collaborative.
    0:02:29 It might be unique to our project, but I sort of imagined that you do a PhD,
    0:02:33 you sort of just grind away on your own in a sad corner of the room with no windows
    0:02:37 and you just never see any sunlight, but our room is really beautiful.
    0:02:40 And I think we get to hang out.
    0:02:42 So I think I’m really lucky to have people that I can call my friends as well as
    0:02:45 lab mates, and we also get to work closely together.
    0:02:47 So I think it’s something I wouldn’t have thought that I would have at a place
    0:02:51 like Stanford, I guess.
    0:02:52 >> Yeah, that’s awesome.
    0:02:53 >> How about you?
    0:02:54 >> Exactly, I want to echo that.
    0:02:55 I think it’s part of because of the nature of our work, which is really a very
    0:02:59 complex and immense amount of work that we have to assemble a team of a few
    0:03:03 dozen people, which is very uncommon in academic labs, right, setup.
    0:03:08 So it’s, it feels to me that it very much works like a very fast-paced
    0:03:12 startup where people share the same goal and people have different skillsets
    0:03:16 complementing each other.
    0:03:17 So yeah, I think we had a great run so far.
    0:03:21 >> Very cool.
    0:03:21 Community is always a good thing.
    0:03:23 So let’s talk about your work.
    0:03:24 Should we start with the session, or do you want to start further back with the
    0:03:30 work you’re doing in the lab and what led up to the session?
    0:03:32 What’s the best way to talk about it?
    0:03:34 >> Yeah, I guess we can start maybe two, three years back when we first
    0:03:39 had this preliminary idea of what our project is, which is called BEHAVIOR.
    0:03:43 I think our professor Fei-Fei Li had this amazing benchmark called ImageNet from
    0:03:49 before in the computer vision community that essentially accelerated the progress in
    0:03:53 that field and essentially set a benchmark where everybody can compete fairly and
    0:03:57 in a reproducible way and push the whole vision field forward.
    0:04:02 In the robotics field, on the other hand, we were seeing that,
    0:04:06 because of the involvement of hardware,
    0:04:09 each academic paper seems to be a little bit segregated on its own.
    0:04:13 They will work on a few tasks that are different from each other and
    0:04:16 it’s really, really hard to compare results and move the field forward.
    0:04:20 So we started this project thinking that we should hopefully establish a common
    0:04:27 ground, a simulation benchmark that is very accessible, very useful,
    0:04:31 that everybody can use. It has to be large scale so
    0:04:34 that if it works on this benchmark, hopefully it shows some general capability.
    0:04:39 And it should be human-centered, like the robots should work on tasks that
    0:04:43 actually matter to our day-to-day life.
    0:04:45 It shouldn’t be some very contrived example that we researchers came up with
    0:04:50 and, in fact, maybe nobody cares about.
    0:04:52 So that’s very important.
    0:04:53 So we set out to build this benchmark that we have been working on for the past couple of years.
    0:04:57 Why a simulated environment?
    0:05:02 Why not just start working with robots, training them out in the real world?
    0:05:05 And as the hardware and the software and the systems that drive the robots
    0:05:09 increase in capacity, you can do more.
    0:05:12 Why work with simulations?
    0:05:13 Yeah, that’s a great question.
    0:05:15 I think we get this all the time and there are a couple of answers.
    0:05:18 I think one is that, to Eric’s earlier point,
    0:05:21 the hardware is not quite where the software is currently.
    0:05:23 So we have all these really powerful brains with ChatGPT
    0:05:26 and stuff that can sort of generate really generalizable, really rich content.
    0:05:30 But you don’t have the hardware to support that yet.
    0:05:32 And so I think part of the issue is that it’s expensive to sort of iterate on that.
    0:05:35 Sure.
    0:05:36 And along those lines, I think there’s the safety component where,
    0:05:39 because like Eric was mentioning, a lot of the tasks we want to care about are the ones
    0:05:43 that are human-centric.
    0:05:44 It’s like your household tasks where you want to fold laundry or do the dishes,
    0:05:47 or stuff where you would probably have humans or multiple humans in the vicinity
    0:05:51 of the robot, and you don’t want a researcher to be trying to hack
    0:05:54 together an algorithm and then it just lashes out and hits you in the face.
    0:05:57 And that’s just, you know, you’re going to get sued to the ground.
    0:05:59 So I think simulation provides a really nice way for us to be able to prototype
    0:06:05 and sort of explore the different options similar to how these other foundation
    0:06:07 models were developed sort of in the cloud and then deployed into your life
    0:06:11 once you know that they’re stable, once you know that they’re ready to be used.
    0:06:13 And like Eric mentioned earlier, like I think there’s this aspect of reproducibility
    0:06:17 where if you all are using the sort of same environments,
    0:06:19 then you know that the results can transfer and can be validated
    0:06:22 by other labs and other people.
    0:06:23 Whereas if you build a bespoke robot and you say it does something,
    0:06:26 and you can’t really validate it unless you buy the robot and, you know,
    0:06:29 completely reproduce it.
    0:06:30 So yeah, a few different benefits we think that are pretty important.
    0:06:32 Now, there are existing simulation engines.
    0:06:36 I don’t know if you’d call them that, but game engines, Unreal, Unity,
    0:06:40 that are used beyond game development, obviously,
    0:06:43 and you can simulate things in those environments.
    0:06:45 Why not go with one of those?
    0:06:48 Right, yeah. Another great question; I think it’s a natural thought that a lot of people have.
    0:06:52 I think there is a couple limitations with the current set of simulators that we have.
    0:06:57 On the one hand, I think you have sort of the, like you mentioned,
    0:07:00 the very well-known game engines like Unity and stuff.
    0:07:04 And I think the thing is that you get really hyper-realistic visuals.
    0:07:08 I think it’s, you know, you get these amazing games that are really immersive
    0:07:11 and it feels like real life.
    0:07:13 But I think when it comes down to the actual interactive capabilities,
    0:07:16 like what you can actually do with your, you know, PS5 controller or whatnot in the game,
    0:07:20 I think it’s definitely curated experiences by the developers.
    0:07:23 And so there’s a clear distinction between what you can and can’t do.
    0:07:27 And that’s not how real life works, right?
    0:07:28 We’re like, you know, there could be a tape that says, caution, do not enter,
    0:07:32 but you can just, you know, walk through that tape in real life.
    0:07:33 And, you know, you’ll have to take the consequences, but you can still do that.
    0:07:36 And I think that’s what we want robots to be able to do, where, again,
    0:07:40 to Eric’s point, we don’t want to pre-define a set of things that we want to teach it.
    0:07:44 We want it to learn a general idea about how the world operates.
    0:07:47 And so I think that necessitates the need for a simulator where everything in the world
    0:07:51 is interactive, where it’s, you know, a cup on a table or a laptop or, you know, like a door.
    0:07:55 And so there’s no sort of distinction between like, OK, we curated this one room
    0:07:59 and so this is really realistic and it works really well.
    0:08:02 But if you try to walk outside of it, then it’s going to not work.
    0:08:04 We want it all to work. Yeah. Yeah.
    0:08:05 And so did you build your own simulator?
    0:08:08 It’s more accurate to say we built on top of a simulator.
    0:08:11 And I think this is where we have to give NVIDIA so much credit, where, you know,
    0:08:14 they have this really powerful ecosystem called Omniverse, where it’s sort of
    0:08:18 supposed to be this one-stop shop where you can get hyper-realistic rendering.
    0:08:22 They have a powerful physics back end.
    0:08:24 They can simulate cloth. They can simulate fluid.
    0:08:26 They can do all these things where, you know, it’s stuff that we would want to do in real life,
    0:08:30 you know, like fold laundry, pour water into a cup, that kind of stuff.
    0:08:34 And so they provide sort of the core engine, let’s say, that we build upon.
    0:08:38 And then we provide the additional functionality that they don’t support.
    0:08:41 And I think together it gives us, you know, a very rich set of features
    0:08:44 where we can simulate a bunch of stuff that robots would have to do
    0:08:47 if you want to put them in our households. Yeah. Anything to add, Eric?
    0:08:51 Yeah, I do. We really want to extend our gratitude to the Omniverse team.
    0:08:56 I think they have hundreds of engineers really putting together this, you know,
    0:09:00 physics engine and rendering engine that works remarkably well on GPUs
    0:09:04 and that can also be parallelized, which is actually the next step on our roadmap
    0:09:08 to make our things run even faster, given its powerful capabilities.
    0:09:12 And it’s just impossible to do many of these household tasks
    0:09:17 without the support of this platform.
    0:09:19 You mentioned Omniverse, obviously.
    0:09:20 And so there was a simulation environment called, it was called Gibson, iGibson.
    0:09:26 And then you extended that to create OmniGibson.
    0:09:30 Am I getting it right? Right, yeah.
    0:09:32 We definitely, so iGibson, just for the audience,
    0:09:35 is a predecessor of OmniGibson that we developed, you know, three, four years ago.
    0:09:39 And at that time, Omniverse hadn’t launched yet.
    0:09:42 So we wrote our own renderer and then we were building
    0:09:45 on a previous physics engine called PyBullet,
    0:09:48 which works very well for rigid body interactions.
    0:09:50 And then Omniverse launched; that was two years ago.
    0:09:55 That’s also when we decided to kind of tackle a much larger scale of household tasks.
    0:10:01 We decided to work on, for example, one thousand different activities
    0:10:04 that we do in our daily homes, and we quickly realized
    0:10:07 that it had gone beyond the capability of what our previous physics engine,
    0:10:11 PyBullet, can do: it doesn’t handle fluid,
    0:10:14 it supports some level of cloth, but it’s not very realistic,
    0:10:18 and the speed is sort of slow.
    0:10:20 Now we see this brand new toy, I guess, that came right out of the oven.
    0:10:24 And then we thought, let’s try this out.
    0:10:26 So we pretty much kind of started clean from a new project on top of Omniverse.
    0:10:32 Many, many things did change.
    0:10:33 We carried over some of the design choices that we had already made in iGibson
    0:10:36 that had proven to work quite well in our research world.
    0:10:40 We inherited a lot of ideas, but we also changed a bunch of stuff
    0:10:44 to make things more usable and more powerful as well in OmniGibson.
    0:10:48 So let’s talk about robots doing chores.
    0:10:51 How does one go about training a robot, whether in a simulation
    0:10:55 or in the physical world, to learn how to do household chores?
    0:11:00 Can you can you walk us through a little bit of what that’s like?
    0:11:02 Oh, that’s a great question, a very open-ended question.
    0:11:05 It’s an obvious, it’s what I do.
    0:11:07 I ask the open-ended questions and sit back.
    0:11:09 I think to make it easy for the audience,
    0:11:10 you can think of it as maybe two broad approaches that are generally tackled right now.
    0:11:14 One is essentially you throw a robot in and you sort of let it do what it wants.
    0:11:19 And you can think of it as maybe learning a bit from play, where you give rewards.
    0:11:24 So you can think of it like teaching a child: like, you know, they don’t really know what to do.
    0:11:27 And so you have to sort of, you know, punish them when they don’t do something good,
    0:11:30 you give them like a timeout, and then when they do something good,
    0:11:32 you know, you give them like a cookie or, you know, some kind of reward.
    0:11:34 And it’s similar for robots, where you throw them in and, naively,
    0:11:37 the AI model doesn’t know what to do.
    0:11:39 So it just tries random things; it tries touching the table,
    0:11:41 it tries, you know, touching a cup or something.
    0:11:43 But let’s say what you really wanted it to do is to, you know, pick up the cup
    0:11:46 and then pour water into something else.
    0:11:48 And so you can reward the things where it’s closer to what you want it to do.
    0:11:51 Like if it touches the cup, you can give it a good reward, like a positive reward.
    0:11:55 And if it, say, knocks over the cup and spills the water, you give it a negative reward.
    0:11:58 And so that’s one approach I think that researchers are trying.
    0:12:01 And another approach is where we learn directly from humans,
    0:12:04 where a human can actually, let’s say, teleoperate.
    0:12:07 So like, let’s say you have a video game controller and can control the robot’s arm
    0:12:10 to actually just directly pick up the cup, pour some water into something else
    0:12:13 and then they call it a day.
    0:12:14 And then the robot can look at the collected data and sort of train on that,
    0:12:17 and see, okay, I saw that the human moved my arm to here and sort of poured it,
    0:12:20 so I’m going to try to reproduce that action.
    0:12:22 And so it’s these two different approaches towards sort of scaffolding directly from scratch
    0:12:26 versus scaffolding based on like human intelligence.
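    [For reference: a minimal sketch, in Python, of the two training signals just described: a shaped reward for trial-and-error learning, and a behavioral-cloning update that imitates teleoperated demonstrations. The state layout, thresholds and tiny linear policy are hypothetical illustrations, not code from the BEHAVIOR project.]

        # Two training signals, illustrated with toy data. Everything here
        # (state layout, thresholds, the linear policy) is a hypothetical example.
        import numpy as np

        def shaped_reward(gripper_pos, cup_pos, cup_upright: bool) -> float:
            """Reward-based learning: closer to the cup is better, knocking it over is penalized."""
            distance = np.linalg.norm(gripper_pos - cup_pos)
            reward = -distance                 # encourage approaching the cup
            if distance < 0.05:
                reward += 1.0                  # bonus for reaching grasping range
            if not cup_upright:
                reward -= 5.0                  # penalty for spilling / knocking it over
            return reward

        def behavioral_cloning_step(policy_w, states, expert_actions, lr=0.01):
            """Imitation learning: nudge a linear policy toward teleoperated expert actions."""
            predictions = states @ policy_w            # what the policy would do
            error = predictions - expert_actions       # distance from the human demo
            grad = states.T @ error / len(states)      # mean-squared-error gradient
            return policy_w - lr * grad

        # Toy usage with random "demonstration" data.
        rng = np.random.default_rng(0)
        states = rng.normal(size=(32, 6))          # e.g. gripper + cup positions
        expert_actions = rng.normal(size=(32, 3))  # e.g. commanded end-effector deltas
        w = np.zeros((6, 3))
        for _ in range(100):
            w = behavioral_cloning_step(w, states, expert_actions)
        print("example reward:", shaped_reward(np.array([0.1, 0.0, 0.2]),
                                               np.array([0.1, 0.02, 0.2]), True))

    [In practice the two signals are often combined: demonstrations give the policy a reasonable starting point, and reward-driven fine-tuning improves on it.]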
    0:12:29 Right. Yeah.
    0:12:30 And if you’re stringing together a series of actions, like let’s say,
    0:12:34 I mean, even your example of picking up the cup and then pouring the water into a different vessel,
    0:12:39 is it one sort of fluid sequence, or are you teaching sort of modular tasks
    0:12:46 that you then can string together?
    0:12:48 Yeah, that’s a, it’s another design decision, right?
    0:12:50 Like I think there’s something called task planning where you can imagine
    0:12:53 that every individual step is a different training pipeline.
    0:12:56 So like, I’m just going to focus on learning to pick the cup and I’m not going to do anything
    0:12:59 with it, but I’m just going to repeat that action over and over.
    0:13:01 And then let’s say you can plug it in with something else, which says,
    0:13:03 okay, and I’m going to do like a pouring action over and over.
    0:13:05 And then if we just string them together, then maybe, let’s say,
    0:13:08 you can get the combination of those two skills.
    0:13:10 But others have looked at sort of the end-to-end, what we call, process, where, you know,
    0:13:14 you look at the task as a whole: it’s just, pick up the cup and pour it into another vessel.
    0:13:18 And you just try to do it from the very beginning to the very end.
    0:13:21 And I think it’s still unclear which way is better.
    0:13:24 But again, like it’s a bunch of design decisions and there’s a ton of problems.
    0:13:27 I agree, I agree.
    0:13:28 There’s not really a consensus; researchers have really been poking here
    0:13:32 and there and trying their luck.
    0:13:33 And there’s pros and cons on both sides.
    0:13:36 For example, if you do the end-to-end approach, if it works, it works really well.
    0:13:40 But because the task is longer, it’s more data hungry
    0:13:43 and it’s more difficult to converge.
    0:13:46 On the other hand, if you do a more modular approach,
    0:13:49 then each skill can work really well.
    0:13:51 But like the transition point is actually very brittle, right?
    0:13:54 You might reach some bottleneck where you try to chain a couple of skills together
    0:13:57 and then it breaks in the middle, and it’s very hard to recover from there.
    0:14:00 So I think we’re still figuring this out as a community.
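    [For reference: a hypothetical sketch of the two decompositions being compared, chaining separately trained skills versus running one end-to-end policy. None of these functions come from a real library; they only illustrate the control flow and where the brittle hand-offs between skills sit.]

        # Hypothetical illustration: modular skill chaining vs. one end-to-end policy.

        def run_skill(name: str, env_state: dict) -> dict:
            """Stand-in for a separately trained skill policy (e.g. 'pick_cup', 'pour')."""
            print(f"executing skill: {name}")
            return dict(env_state, last_skill=name)

        def modular_approach(env_state: dict) -> dict:
            """Each skill works well on its own; the transitions are the brittle part."""
            for skill in ["navigate_to_counter", "pick_cup", "pour_into_bowl"]:
                env_state = run_skill(skill, env_state)
                if env_state.get("failed"):      # a bad hand-off can strand the whole plan
                    print(f"recovery needed after {skill}")
                    break
            return env_state

        def end_to_end_approach(env_state: dict, policy) -> dict:
            """One policy maps observations to low-level actions for the whole task."""
            for _ in range(3):                   # a few control steps, for illustration
                action = policy(env_state)
                env_state = dict(env_state, last_action=action)
            return env_state

        state = {"cup_on_counter": True}
        modular_approach(state)
        end_to_end_approach(state, policy=lambda s: "move_arm_delta")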
    0:14:04 What were some of the hardest household tasks for the robots to pick up?
    0:14:09 Or easiest ones or even just sort of the ones that kind of you remember
    0:14:14 because it was interesting, the process was sort of unexpected and interesting.
    0:14:18 I was going to say the folding laundry example I mentioned is one that
    0:14:22 maybe it’s just, you know, the platforms I hang out on, the algorithms
    0:14:28 know that I don’t like folding laundry and I’m terrible at it.
    0:14:31 I can’t fold a shirt the same way twice.
    0:14:33 Oh, yeah. But every once in a while, you know,
    0:14:35 I feel like I’ll see a video of a system that’s like
    0:14:38 gotten a little bit closer, but it seems to be a difficult problem.
    0:14:42 Yeah, it’s really challenging.
    0:14:43 I think to be clear to the audience, we haven’t solved all the thousand tasks.
    0:14:47 That’s our goal also.
    0:14:49 I think the first step is just providing these a thousand tasks
    0:14:51 in like a really reproducible way and the platforms so they can actually simulate them.
    0:14:55 But for me personally, I think what immediately comes to mind is like one of the top five tasks.
    0:14:59 So to give a bit of context, like like Eric mentioned,
    0:15:01 we don’t want to just predefined tasks.
    0:15:03 We want to actually see what people care about.
    0:15:05 So we actually, like, polled thousands of people online
    0:15:08 and we asked them, you know, what would you want a robot to do for you?
    0:15:11 And so we had a bunch of tasks more than a thousand
    0:15:13 and we whittled them down to a thousand based on the results that people gave.
    0:15:16 OK. And one of the top five tasks was clean up after a wild party.
    0:15:19 And so the way we visualized in our simulator was we had this, you know,
    0:15:22 living room and just tons of glass bottles, beer bottles,
    0:15:25 like, you know, like just random objects scattered in the floor.
    0:15:28 And that’s just a distinct memory in my mind because I think it really
    0:15:31 sets the stage for like how much disdain we have for a very certain task.
    0:15:35 And it was clear that people like ranked it very highly because it’s, you know,
    0:15:38 it’s very undesirable or clean laundry or excuse me, full laundry.
    0:15:42 But I’m getting flashbacks out.
    0:15:43 I’m wondering if you have taught a robot to patch a hole in the wall.
    0:15:47 Oh, God, that’s that’s a story I’m not going to get into.
    0:15:49 A hole that the robot maybe made itself when it was trying to do something
    0:15:52 in the real world. Exactly, exactly.
    0:15:53 Make some mistake. Yeah. Any thoughts, Eric?
    0:15:55 What do you think? Yeah.
    0:15:56 I guess some of the cooking tasks seem pretty difficult.
    0:15:59 Oh, yeah. A good share
    0:16:00 of our household tasks are cooking related.
    0:16:03 And we did spend quite a lot of effort kind of implementing these complex processes.
    0:16:08 I guess we try to do a bit of simplification, but we want to capture the
    0:16:12 high level kind of chemical reaction that’s happening in a cooking process.
    0:16:16 For example, baking a pie or making a stew, for example,
    0:16:19 those kind of things in our platform.
    0:16:21 Yeah. And these tasks are pretty challenging too, right?
    0:16:24 Like you need to have this kind of common sense knowledge about what does it take,
    0:16:27 you know, to cook a specific dish, you know, what are the ingredients,
    0:16:30 how much you put in, you don’t want to put in too much salt.
    0:16:33 Also not too little salt. Right, it’s right.
    0:16:34 You don’t understand how much time you put into the oven,
    0:16:37 how long to wait and make sure you don’t spill anything else.
    0:16:39 Yeah, that’s some of the longest, longest horizon tasks.
    0:16:42 Yeah. I forgive me, I’m sure there’s a better way to ask this question,
    0:16:45 but what’s difficult about, I can imagine it’s incredibly difficult,
    0:16:49 but what’s difficult about cooking for the robot to learn?
    0:16:54 Is it that there’s so many steps and objects happening?
    0:16:58 Is it something about the motions involved?
    0:17:01 No, you’re asking it in a very brilliant way.
    0:17:03 I think both, actually; both are kind of,
    0:17:06 they’re both challenging aspects.
    0:17:08 One of them is that it involves many concepts or like symbolically,
    0:17:11 you can think of it involves many types of objects.
    0:17:14 And you just kind of chain them together,
    0:17:16 make sure you use the right tool at the right time.
    0:17:18 And also the motions are difficult.
    0:17:20 Like imagine you need to cut an onion into small dice to make,
    0:17:26 you know, some sort of dish; that can come off as a good example.
    0:17:28 But then the motion itself is very dexterous, right?
    0:17:32 Imagine that, sometimes humans cut their fingers when they’re cooking.
    0:17:35 First thing I thought of.
    0:17:36 Exactly. That’s pretty tough.
    0:17:38 Yeah. And then I think also it just needs to have some understanding
    0:17:41 of things that aren’t explicit.
    0:17:43 Like if I put these two chemicals together,
    0:17:45 it actually creates something third that you didn’t see before.
    0:17:47 And I think a lot of times in current research,
    0:17:49 you sort of assume that the robot already knows everything.
    0:17:52 And so what it’s given, it can only do stuff like combinatorially.
    0:17:55 But I think cooking is an interesting example where,
    0:17:58 you know, you put in, I don’t know, dough into the oven
    0:18:00 and outcomes like it just transforms into bread.
    0:18:02 And like, I think it’s, it’s, you know, there’s the joke about, you know,
    0:18:04 you put in a piece of bread and outcomes toast.
    0:18:06 And then there’s like a comic where, you know, Calvin from Calvin and Hobbes is like,
    0:18:09 oh, like I wonder how this machine works.
    0:18:10 It just somehow transforms it into this new object.
    0:18:13 And like, I can’t see where it’s stored, right?
    0:18:15 And so I think the idea that a robot has to learn that is also quite challenging too.
    0:18:22 I’m speaking with Eric Li and Josiah David Wong.
    0:18:22 Eric and Josiah are PhD students at Stanford who are here at GTC24.
    0:18:28 They presented a session early in the week.
    0:18:30 We’re talking about it now.
    0:18:31 They’re attempting to teach robots how to do a thousand common household tasks
    0:18:36 that humans just don’t want to do if we can help it,
    0:18:38 which is just one of the many potential future avenues for robotics in our lives.
    0:18:43 But it’s a good one.
    0:18:44 I’m looking forward to it.
    0:18:45 One thing I want to ask you about, LLMs are everywhere right now.
    0:18:49 And, you know, a lot of the recent podcasts and guests I’ve been talking to,
    0:18:53 and just people I’ve been talking to at the show,
    0:18:55 are talking about LLMs as relates to different fields, right?
    0:19:01 Scientific discovery and genome sequencing and drug discovery and all kinds of things.
    0:19:06 There’s been some stuff in the media lately about some high-profile stuff
    0:19:11 about robots that have an LLM integrated, chat GPT integrated,
    0:19:17 so you can ask the robot in natural language to do something.
    0:19:21 You can interact with you, that sort of thing.
    0:19:24 How do you think about something like that?
    0:19:26 From the outside, I sort of at the same time can easily imagine what that is.
    0:19:32 But then in my brain, it almost stutters when I try to imagine,
    0:19:38 like I’ve used enough chatbots and text-to-image models and that kind of thing
    0:19:43 to sort of understand, you know, I type in a prompt and it predicts the output that I want.
    0:19:48 When we’re talking about equipping, you know, a robot with these capabilities,
    0:19:54 is it a similar process?
    0:19:56 Is the robot, when we were talking about cooking, I was imagining, you know,
    0:20:00 can an LLM in some way give a robot the ability to sort of see the larger picture of, you know,
    0:20:07 now remember, when you take the dough out, it’s going to look totally different
    0:20:11 because it’s become a pizza.
    0:20:12 Is that a thing or is that me in my human brain just trying to make sense
    0:20:18 of just this rapid pace of acceleration in this thing we call AI
    0:20:22 that’s actually touching so many different, you know, disciplines all at once?
    0:20:27 Oh, yeah, I do think the development of large models,
    0:20:31 not just large language model, but also like large language vision models,
    0:20:36 will really accelerate the progress of robotics.
    0:20:39 And people actually, researchers in our field have adopted these approaches
    0:20:43 for the last two years and things are fast, you know, things are moving very fast
    0:20:47 and we’re very excited.
    0:20:48 I think one of the challenges is that what these LLMs have been good at
    0:20:53 is still at the symbolic level.
    0:20:54 So, kind of, in the virtual world, you can think of it as having these concepts,
    0:20:59 you know, like what ingredients to put in to make a great pizza, for example.
    0:21:04 But there are still these low-level, difficult robot skills, motions you would call them:
    0:21:10 how to roll dough into a flattened thing,
    0:21:13 how you sprinkle stuff on the pizza so that it’s evenly spread out.
    0:21:18 All those little details are the crux of a successful pizza.
    0:21:22 Top-notch pizza, not just any pizza.
    0:21:24 An edible pizza.
    0:21:25 I don’t even get edible, all right.
    0:21:27 I hope the listeners can hear it.
    0:21:28 I can see in your face as you’re talking the like, you have to get this right.
    0:21:34 And I’m with you.
    0:21:35 And so the actual physical implementation of doing those motions is something that,
    0:21:41 you know, the robotics field I’m sure is working on, has been working on,
    0:21:44 but work in progress.
    0:21:46 Exactly.
    0:21:46 I think you hit the nail exactly on the head where it is the execution
    0:21:51 where you can think of it as, you know, theoretical knowledge.
    0:21:54 Like you can, you’re human, like the same thing.
    0:21:57 Like, okay, you’re planning in your head,
    0:21:59 you’re planning the chores that you could do,
    0:22:00 so you list them out and you know exactly what you’re supposed to do.
    0:22:02 But then you actually have to go and execute them.
    0:22:04 And so I think the LLM, because it’s not plugged in with a physics simulator,
    0:22:08 it doesn’t actually know, okay, if I do this, if I, you know,
    0:22:12 pick up the cup, then, you know, it will not spill any water.
    0:22:15 But if the cup has a hole in the bottom that you don’t see, and then you do that
    0:22:18 and stuff falls out, then you have to readjust your plan.
    0:22:20 And I think if you just have an LLM, you don’t know exactly what the outcomes
    0:22:25 are going to be along the way.
    0:22:26 And so like Eric was saying, I think it needs to sort of, we say like closing the
    0:22:29 loop, so to speak, we’re like, you plan and then you try it out and then you plan again.
    0:22:33 And I think with that extra execution step, I think is something that’s still
    0:22:38 sort of an open research problem that we’re both hoping to tackle.
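    [For reference: a hypothetical sketch of the 'closing the loop' idea described here: a symbolic planner, which could be an LLM, proposes steps, a simulator executes them, and the plan is revised when the outcome contradicts the plan's assumptions. propose_plan, execute_in_sim and the dictionary keys are invented for illustration.]

        # Hypothetical closed-loop planner: plan symbolically, execute in simulation,
        # update beliefs from the observed outcome, and replan. All names are invented.

        def propose_plan(goal: str, belief: dict) -> list[str]:
            """Stand-in for an LLM call that returns symbolic steps for the goal."""
            if belief.get("cup_has_hole"):
                return ["fetch_new_cup", "fill_cup", "carry_to_table"]
            return ["pick_cup", "fill_cup", "carry_to_table"]

        def execute_in_sim(step: str, world: dict) -> tuple[bool, str]:
            """Stand-in for running one step in a physics simulator and observing it."""
            if step == "fetch_new_cup":
                world["cup_has_hole"] = False
            if step == "fill_cup" and world["cup_has_hole"]:
                return False, "water spilled"
            return True, "ok"

        def closed_loop(goal: str, world: dict, max_replans: int = 3) -> str:
            belief: dict = {}
            for _ in range(max_replans):
                for step in propose_plan(goal, belief):
                    ok, feedback = execute_in_sim(step, world)
                    if not ok:
                        belief["cup_has_hole"] = True   # learn from the failure, then replan
                        break
                else:
                    return "success"
            return "gave up"

        # The world has a leaky cup that the planner does not initially know about.
        print(closed_loop("serve water", {"cup_has_hole": True}))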
    0:22:41 Right.
    0:22:42 And so where are you now in the quest for a thousand chores to put it that way?
    0:22:47 Is it all in a simulation environment?
    0:22:50 Are you having robots in the physical world go out and, you know,
    0:22:54 have you gotten to the point where the experiments feel stable enough to try them
    0:22:58 out in the physical world?
    0:22:59 Where are you on that timeline?
    0:23:01 So when we originally posted our work, which was a couple of years ago and we’ve
    0:23:05 done, you know, a bunch of work since then, one of our experiments was actually
    0:23:08 putting up what we call a digital twin, which is we have like a real room in the
    0:23:13 real world with a real robot.
    0:23:15 And we essentially try to replicate it as closely as we can in simulation with
    0:23:19 the same robot virtualized.
    0:23:20 And I think we were able to show that by training the robot at the level of
    0:23:25 telling it, okay, grasp this object, now put it here,
    0:23:27 and then having it learn within that loop in simulation, we could actually have
    0:23:30 it work in the real world.
    0:23:31 So we tested it in the real world and we did see non-zero success.
    0:23:34 So I think the task was like putting away trash or something.
    0:23:36 I think so.
    0:23:37 Yeah.
    0:23:37 So we had to throw away like a bottle or like a red solo cup into like a trash
    0:23:40 can.
    0:23:40 And so that requires like navigation, moving around the room, picking up
    0:23:43 stuff and then also like moving it back and also like dropping it in a specific
    0:23:46 way.
    0:23:46 And so I think that’s a, it’s a good signal to show that, you know, robots can
    0:23:50 learn potentially.
    0:23:51 But of course this is, I think, one of the easier tasks, whereas if it’s folding
    0:23:54 laundry, which even we can, you know, not do well, then, you know, how much
    0:23:57 harder is it going to be for a robot to do?
    0:23:58 Yeah.
    0:23:58 So I think there are still a lot of unknown questions to actually hit, you know,
    0:24:02 even a hundred of the tasks, much less all one thousand of them.
    0:24:04 So, but I think we have seen some progress.
    0:24:06 So I hope that we can, you know, start to scaffold up from there.
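    [For reference: a hypothetical sketch of the digital-twin idea described above: the task script is written against a small vocabulary of high-level primitives such as grasp and place, and the same script can be backed either by the simulator or by the real robot. The class and method names are invented for illustration.]

        # Hypothetical primitive interface shared by the simulated twin and the real robot.

        class PrimitiveInterface:
            """Shared action vocabulary; concrete backends decide how to execute it."""
            def grasp(self, object_name: str) -> bool: ...
            def place(self, location: str) -> bool: ...

        class SimBackend(PrimitiveInterface):
            def grasp(self, object_name):
                print(f"[sim] grasping {object_name}")
                return True
            def place(self, location):
                print(f"[sim] placing at {location}")
                return True

        class RealRobotBackend(PrimitiveInterface):
            def grasp(self, object_name):
                print(f"[real] grasping {object_name}")   # a driver call would go here
                return True
            def place(self, location):
                print(f"[real] placing at {location}")
                return True

        def throw_away_trash(robot: PrimitiveInterface) -> bool:
            """Same task script runs in the digital twin and on the physical robot."""
            return robot.grasp("red_solo_cup") and robot.place("trash_can")

        throw_away_trash(SimBackend())        # train and evaluate in simulation first
        throw_away_trash(RealRobotBackend())  # then run the identical script on hardware

    [Because the task only sees the primitive interface, whatever is learned against the simulated backend can be re-run unchanged against the hardware backend, which is roughly the kind of transfer being described.]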
    0:24:09 Yeah.
    0:24:09 So what’s, what’s the rest of, I don’t know, the semester of the year, like for
    0:24:16 you guys, is it all heads down on this project?
    0:24:19 What’s the timeline?
    0:24:20 Yeah.
    0:24:21 I think, I think we have just had our first official release two days ago and
    0:24:26 I think things are at a stage where we have all our 1,000 tasks ready for
    0:24:30 researchers and scientists to try out.
    0:24:33 I think our immediate next steps are to try some of these ourselves, you
    0:24:38 know, like what Google calls dogfooding your own products, right?
    0:24:41 So we’re, you know, robot learning researchers ourselves.
    0:24:44 We want to see how the current state-of-the-art robot learning or robotics
    0:24:49 models work on these tasks.
    0:24:51 What are some of the pitfalls?
    0:24:52 So I think that’s number one; that will essentially tell us where the
    0:24:57 low-hanging fruits are that can really significantly improve our performance.
    0:25:00 And the second is that we’re also thinking about potentially, you know,
    0:25:04 hosting a challenge for these, where researchers can take part.
    0:25:09 So everything is even more modular, so that people can participate from all over
    0:25:13 the world to make progress on this benchmark.
    0:25:17 And I think that’s also in our roadmap to make it happen.
    0:25:20 Well, if you need a volunteer to create the wild party mess to clean up,
    0:25:25 we know who to ask.
    0:25:26 You know, I think along the lines of the challenge, I think a goal is to
    0:25:30 sort of democratize this sort of research and allow more people to explore.
    0:25:33 And so we’ve actually put together like a demo that anyone can try.
    0:25:36 So for the audience listening, like, you know, if you’re technically inclined
    0:25:39 if you’re a researcher, but even if you’re just like, oh, I don’t know.
    0:25:41 I don’t want to say lay person, but you know, a person that’s normally not
    0:25:44 involved with AI, but want to sort of like, just see what we’re all about.
    0:25:46 We do actually have something that you can just try out immediately,
    0:25:49 and you can see sort of the robot in the simulation and what it looks like.
    0:25:52 So I assume there’ll be some links, hopefully associated with this.
    0:25:55 But yeah, we hope, you know, if you know the link offhand,
    0:25:58 you can speak it out now, but we’ll include it as well.
    0:26:01 Yeah, it’s at, what do we say,
    0:26:03 behavior.stanford.edu.
    0:26:04 Yeah.
    0:26:04 Great.
    0:26:04 Okay.
    0:26:05 And that reminds me to ask, you mentioned it briefly, but what is BEHAVIOR?
    0:26:09 Yeah.
    0:26:09 That’s a good distinction.
    0:26:10 So OmniGibson, to be clear, is the simulation platform where we simulate
    0:26:13 this stuff.
    0:26:14 Right.
    0:26:14 And I think, overarching that, this whole project is called BEHAVIOR-1K,
    0:26:17 representing the thousand tasks that we hope robots can solve in the near future.
    0:26:20 Right.
    0:26:21 Yeah.
    0:26:21 That’s the distinction.
    0:26:22 I guess, is that it’s the all-encompassing thing, which is not just the simulator,
    0:26:24 but also the tasks, and sort of the whole ethos of the whole project is
    0:26:28 called BEHAVIOR.
    0:26:28 Okay.
    0:26:29 Okay.
    0:26:29 Yeah.
    0:26:29 All right.
    0:26:30 And before I let you go, we always like to wrap these conversations up with a
    0:26:34 little bit of a forward looking.
    0:26:35 What do you think your work?
    0:26:38 You know, how do you think your work will affect the industry?
    0:26:40 What do you think the future of et cetera, et cetera is robots?
    0:26:44 So it’s 2024.
    0:26:46 Let’s say by 2030.
    0:26:48 Are robots in the physical world going to be, you know, to some extent, we’ve got,
    0:26:53 you know, Roombas and, you know, vacuum cleaner robots, that kind of thing.
    0:26:57 And certainly at a show like this, you see robots out there.
    0:27:00 But, you know, in there, there was an NVIDIA robot.
    0:27:02 I saw yesterday that’s out in children’s hospitals interacting with patients.
    0:27:07 And yeah, it’s down on the, I’m pointing.
    0:27:09 Nobody can see in the radio, but I’m pointing out to the show floor.
    0:27:11 It’s out there.
    0:27:12 You can see it.
    0:27:13 What do you think society is going to be in, you know, five, six years from now as
    0:27:17 relates to the quantity and sort of level of interactions with robots in our
    0:27:22 lives, or maybe it’s more than five, six years.
    0:27:25 Maybe it’s 2050 or further down the line.
    0:27:27 I think it’s hard to predict because if we look at the autonomous driving industry,
    0:27:32 let’s say as a predecessor, I think even though it’s been hyped up as like the
    0:27:37 next ubiquitous thing for what it’s taking now, like for quite a while, right?
    0:27:40 But we’re still not quite, I mean, it’s become much more commonplace, but we’re
    0:27:43 still not at like, what is it?
    0:27:44 Level five autonomy, let’s say.
    0:27:46 And so I imagine something similar will happen with, you know, human robots or
    0:27:51 something that you see with everyday, you know, interactive household robots
    0:27:54 where I can imagine it, we’ll start seeing them in real life.
    0:27:57 But I don’t think you’ll be ubiquitous until, you know, decades.
    0:28:00 It’s my take, but I don’t know if you’re more optimistic.
    0:28:03 Yeah, I think to keep it pessimistic is actually a good thing, because I think,
    0:28:07 in general, exactly the reason why these things are so hard is
    0:28:13 because humans have very high standards.
    0:28:15 Exactly.
    0:28:15 It’s like the self-driving cars.
    0:28:16 People are okay drivers or decent drivers,
    0:28:19 so you only have an accident every X number of miles.
    0:28:22 You want the robot to be better at it, much better at it actually.
    0:28:24 So I think we’re still a bit far away from, you know, household robots that can
    0:28:30 be very versatile, meaning do many things and also do many things reliably.
    0:28:35 Very consistently.
    0:28:36 You don’t want it to break, you know, 20% of the time.
    0:28:38 That’s right, or in 20% of conditions.
    0:28:40 Yeah, that’s right.
    0:28:41 So I think it’s hard because we have high standards, but hopefully these robots
    0:28:46 can kind of come into our lives incrementally, maybe first in more
    0:28:50 structured environments like warehouses and so on, doing like reshuffling
    0:28:54 or restocking shelves or, you know, putting Amazon packages here
    0:28:58 and there, and then hopefully soon we can, you know, in time,
    0:29:03 in time, have a laundry-folding robot.
    0:29:06 Excellent.
    0:29:07 Good enough.
    0:29:07 Eric, Josiah, guys, thank you so much for dropping by and taking time out of your busy
    0:29:11 week to join the podcast.
    0:29:12 This is a lot of fun for me.
    0:29:14 I’m sure for the audience for listeners who want more, want to learn more
    0:29:18 about the work you’re doing, more about what’s going on at the lab, read
    0:29:22 some published research, that kind of thing.
    0:29:24 Are there good starting points online where we can direct them?
    0:29:27 I think what Eric mentioned earlier, like just go to behavior.stanford.edu
    0:29:31 and that’s sort of the entry point where you can see, you know, all this stuff
    0:29:34 about this project, but more, you know, widely you can also then from there
    0:29:37 get to see what else is going on at Stanford.
    0:29:39 That’s exciting.
    0:29:39 So yeah, definitely check it out if you’re so inclined.
    0:29:41 Perfect.
    0:29:42 All right.
    0:29:43 Well guys, thanks again.
    0:29:44 Enjoy the rest of your show and good luck with the research.
    0:29:47 Thanks Noah.
    0:29:47 Thanks for having us.
    0:29:49 [music]
    0:29:51 [music]
    0:29:53 [music]
    0:29:55 [music]
    0:29:56 [music]
    0:29:57 [music]
    0:29:58 [music]
    0:29:59 [music]
    0:30:00 [music]
    0:30:01 [music]
    0:30:06 [music]
    0:30:07 [music]
    0:30:08 [music]
    0:30:33 [music]
    0:30:35 [music]

    Imagine having a robot that could help you clean up after a party — or fold heaps of laundry. Chengshu Eric Li and Josiah David Wong, two Stanford University Ph.D. students advised by renowned American computer scientist Professor Fei-Fei Li, are making that a ‌dream come true. In this episode of the AI Podcast, host Noah Kravitz spoke with the two about their project, BEHAVIOR-1K, which aims to enable robots to perform 1,000 household chores, including picking up fallen objects or cooking. To train the robots, they’re using the NVIDIA Omniverse platform, as well as reinforcement and imitation learning techniques. Listen to hear more about the breakthroughs and challenges Li and Wong experienced along the way.

  • Basecamp’s Phil Lorenz on Combining AI With Biodiversity Data – Ep. 223

    AI transcript
    0:00:00 [MUSIC]
    0:00:11 Hello, and welcome to the NVIDIA AI podcast.
    0:00:14 I’m your host, Noah Kravitz.
    0:00:15 The intersection of AI and biology is one of the most fascinating and
    0:00:19 promising areas of modern technology and research.
    0:00:22 My guest today is working at the leading edge of this field in his role as CTO of
    0:00:26 Basecamp Research. Basecamp, who’s a member of the NVIDIA Inception Program for
    0:00:30 startups, is leveraging their unprecedented knowledge of the natural world
    0:00:35 to create better food, better medicines, and better products for the planet.
    0:00:39 Basecamp has collected an unprecedented data set,
    0:00:41 capturing orders of magnitude more diverse biological data than any public
    0:00:45 resources, and they’re leveraging this data for deep learning and
    0:00:48 generative AI applications. Here to shed light on what that means for his company and
    0:00:52 for all of us is Phil Lorenz, Chief Technology Officer at Basecamp.
    0:00:57 Phil, thanks so much for taking the time to join the podcast.
    0:01:00 >> Thanks so much for having me.
    0:01:01 >> How’s your trip been so far?
    0:01:02 We’re recording on, I guess this is day three of GTC, so
    0:01:05 you’ve been in town for a few days. How’s the conference?
    0:01:08 >> It’s amazing. It’s great to see some friends who I haven’t seen in a while.
    0:01:12 So that’s actually been great meeting with folks from NVIDIA that we’ve been
    0:01:16 working with for a while. And yeah, I mean, lots of great networking opportunities,
    0:01:20 including people that are really far outside the life science industry.
    0:01:24 But it’s great to see what everyone else is doing, so really exciting, yeah.
    0:01:28 >> It is, it’s nice to be back in person after several years here.
    0:01:31 So let’s start with the basics.
    0:01:33 Maybe you can tell us what Basecamp research is, how you were founded, what you do.
    0:01:38 >> Yeah, of course. Maybe just take a step back with respect to why we’re doing what
    0:01:42 we’re doing and how we thought of this.
    0:01:44 I guess if you think about kind of the life science,
    0:01:47 it’s probably one of the most exciting domains to apply AI to, I think.
    0:01:50 And we obviously have a lot of kind of human clinical data collected
    0:01:54 in the last few years and decades.
    0:01:56 But when it comes to kind of life on Earth and biology as a whole, we actually
    0:02:00 haven’t because there’s probably about 10 to the 26 species out there.
    0:02:04 Which is a lot. And we’ve sequenced a few million.
    0:02:08 And so if you kind of make that comparison in terms of what we know about life on
    0:02:12 Earth, that’s about five drops of water compared to the Atlantic Ocean,
    0:02:15 which is what we don’t know.
    0:02:16 So if that’s the kind of place you’re starting with and
    0:02:19 everything the life science industry has ever built,
    0:02:22 is based on that tiny knowledge of life on Earth, that kind of slice.
    0:02:26 We thought that if you want to do deep learning for the life sciences really well,
    0:02:30 there’s exciting algorithms being built, exciting architectures that we can do.
    0:02:34 But at the same time, we feel like, okay, there’s actually a big data problem to begin with.
    0:02:39 And so to do this from first principles, what we’ve done over the last two or
    0:02:44 three years is we’ve built partnerships with nature parks across five continents.
    0:02:49 Including places like the Antarctic and Rainforest and Volcanic Islands and
    0:02:54 you name it.
    0:02:55 And we have professional explorers that go to these places and
    0:02:58 they sequence the biodiversity in these areas.
    0:03:01 Lot of microbes, because that’s where the greatest diversity of life on Earth is.
    0:03:06 And we do this in partnerships with these nature parks.
    0:03:09 >> When you say sequence the biodiversity,
    0:03:12 am I getting that right, sequence the biodiversity that they found?
    0:03:15 What does that mean for kind of a lay person?
    0:03:17 >> Absolutely, yeah, realizing that I’m very much a life scientist.
    0:03:20 >> I’m saying the lay person, I really mean me.
    0:03:22 >> No, that’s completely fair.
    0:03:23 Sequence basically means every organism has a genome, has a DNA.
    0:03:27 And so sequencing the genomes of all of these unknown organisms.
    0:03:32 >> Okay, and out in, is that just to kind of take attention for a second,
    0:03:36 can that be done out in the field remotely?
    0:03:39 Or are we at the point with gene sequencing where you can do that on site or
    0:03:42 how does that work?
    0:03:43 >> Yeah, so when the founders started Basecamp,
    0:03:45 they actually did that on site.
    0:03:47 They were on an ice cap in Iceland with a solar powered tent.
    0:03:52 That’s how the company started.
    0:03:53 Now, because we are doing this at scale, we’re extracting the DNA there and
    0:03:57 then we’re sequencing the DNA at a much larger scale in Europe.
    0:04:01 Yeah, and I think what’s exciting is, again, this is underappreciated:
    0:04:06 the vastness of unknown life on Earth.
    0:04:08 We now have a database just within two years that has samples collected from
    0:04:14 the Antarctic and volcanic islands and all the life that exists in these places.
    0:04:20 That’s now a few orders of magnitude more diverse than all public data combined.
    0:04:25 And what we’re doing on top of this is not just collecting hundreds of millions of
    0:04:30 new protein or DNA sequences, but also capturing the chemical environment and
    0:04:34 the geological environment, and connecting all of that information together in
    0:04:39 a knowledge graph that now has about six billion relationships.
    0:04:42 And so that really gives us entirely new,
    0:04:46 never-before-seen information.
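    As a rough illustration of the knowledge-graph idea described above, here is a minimal sketch that links a field sample, its environmental context, and the sequences recovered from it as nodes and edges. The node names, attributes, and the use of networkx are illustrative assumptions, not Basecamp’s actual schema or tooling.

    ```python
    # Minimal sketch: a sample, its environment, and recovered sequences linked
    # in a small knowledge graph. Hypothetical schema, for illustration only.
    import networkx as nx

    g = nx.MultiDiGraph()

    # A sample node with (made-up) environmental metadata attached.
    g.add_node("sample:ICE-001", kind="sample", site="Iceland ice cap",
               temperature_c=-4.0, ph=6.8)

    # A sequence recovered from that sample, plus a predicted function.
    g.add_node("protein:P0001", kind="protein_sequence")
    g.add_node("function:hydrolase", kind="predicted_function")

    # Edges encode relationships; at scale these become billions of relationships.
    g.add_edge("sample:ICE-001", "protein:P0001", relation="contains")
    g.add_edge("protein:P0001", "function:hydrolase", relation="predicted_to_have")

    # Example query: which proteins came from sub-zero environments?
    cold_samples = [n for n, d in g.nodes(data=True)
                    if d.get("kind") == "sample" and d.get("temperature_c", 100) < 0]
    for s in cold_samples:
        for _, protein, d in g.out_edges(s, data=True):
            if d["relation"] == "contains":
                print(protein, "recovered from", s)
    ```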
    0:04:49 >> So to go back to your analogy about drops of water in the ocean,
    0:04:53 if we were at sort of five drops of water compared to the ocean of knowledge,
    0:04:58 how much more knowledge have you been able to accrue in these past couple of years?
    0:05:02 >> Yeah, we’re probably still a few orders of magnitude away from the Atlantic Ocean.
    0:05:09 We’re trying to be maybe a cup of water getting close to that.
    0:05:13 That’s kind of the goal in the next few months.
    0:05:16 >> That’s a lot of progress, that’s amazing.
    0:05:17 And so what do you do with the data?
    0:05:19 >> There’s several things, I mean, we have a huge,
    0:05:21 there’s just a huge engineering effort to just annotate all of this data,
    0:05:24 to organize all of this data because that’s, it’s now bigger than most public
    0:05:28 databases, and that’s actually a big undertaking.
    0:05:31 The exciting application really is to see, okay,
    0:05:33 there’s some architectures for deep learning and
    0:05:35 biology such as structure prediction of proteins.
    0:05:38 And now that we have so much more diverse data,
    0:05:40 what we can do is actually leverage some of the algorithms or
    0:05:43 architectures that exist and apply that to our data advantage.
    0:05:46 And so one thing that we’ve now built is called BaseFold,
    0:05:49 which is using a similar architecture to AlphaFold.
    0:05:52 But we can actually be up to six times more accurate because we have so
    0:05:55 much more diverse and additional sequence information.
    0:05:58 And so that’s something that’s really exciting because there’s obviously a lot
    0:06:02 of work and effort being done on using a different algorithm,
    0:06:07 different methods, but actually data,
    0:06:09 especially in the life science makes such a big difference.
    0:06:11 And so that’s something that we’re really excited to use.
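    The point above is essentially “same architecture, better inputs”: AlphaFold-style models condition on a multiple sequence alignment, so more diverse homologs mean more evolutionary signal for the network. Below is a toy sketch of checking how much extra diversity a private sequence collection adds before it is handed to a structure predictor; the sequences, identity metric, and threshold are illustrative assumptions, not the actual BaseFold pipeline.

    ```python
    # Toy sketch: quantify how much sequence diversity a private collection adds
    # to the homologs used as model input. Not the actual BaseFold pipeline.

    def identity(a: str, b: str) -> float:
        """Fraction of matching positions between two equal-length aligned sequences."""
        matches = sum(x == y for x, y in zip(a, b))
        return matches / max(len(a), 1)

    def novel_homologs(public_msa: list[str], private_seqs: list[str],
                       max_identity: float = 0.9) -> list[str]:
        """Keep private sequences that are <90% identical to everything public,
        i.e. the ones that actually add evolutionary signal."""
        return [seq for seq in private_seqs
                if all(identity(seq, ref) < max_identity for ref in public_msa)]

    public_msa = ["MKTAYIAKQR", "MKTAYIAKQK", "MKTAYLAKQR"]      # near-duplicates
    private_seqs = ["MKSGYLVQKR", "MKTAYIAKQR", "MRTDFLVNKE"]    # mixed novelty

    extra = novel_homologs(public_msa, private_seqs)
    print(f"{len(extra)} of {len(private_seqs)} private sequences add new diversity")
    # The enriched alignment (public_msa + extra) is what an AlphaFold-style
    # model would consume as input.
    ```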
    0:06:14 >> Whatever the right way to get into this is,
    0:06:15 I want to ask about what your sort of day-to-day life is as the CTO.
    0:06:19 But then I’m also curious, what happens?
    0:06:21 Are you working with partners across academia, industry to leverage the data,
    0:06:28 to create better medicines, better foods,
    0:06:30 those kinds of things we talked about the intro?
    0:06:33 So either way, talk us through kind of what you do as CTO.
    0:06:37 And then maybe from there we can talk about some of the partnerships.
    0:06:40 >> My role is that I do a lot of very different things at the same time.
    0:06:43 The one thing I’m not doing anymore at all is coding.
    0:06:46 >> [LAUGH]
    0:06:47 >> I did that at the start, but that’s kind of the one thing I’m not really doing.
    0:06:50 But that’s probably a good thing.
    0:06:52 I think my role has kind of three, four main things.
    0:06:55 The first is actually making sure that the data collection process and
    0:07:00 how we enter that into our database.
    0:07:02 We have a genomics team, that’s amazing.
    0:07:03 And they’re doing incredible work dealing with all of this data and
    0:07:07 having really high quality annotations for that.
    0:07:10 That’s kind of one thing.
    0:07:11 And the data engineering, how we organize all of our infrastructure,
    0:07:14 the deep learning work, applying this data advantage to the most exciting
    0:07:18 AI applications, and then the product team.
    0:07:20 And that’s actually what you mentioned with respect to what our
    0:07:24 partnerships look like with therapeutics or biotech companies.
    0:07:28 Some of them, they will ask us, do you have a protein that can do this
    0:07:32 function, like an enzyme that can break down plastic?
    0:07:34 And then we work with them on that.
    0:07:35 >> Right.
    0:07:36 >> Or sometimes gene editing systems that will cure genetic diseases.
    0:07:41 And we have these in our database occurring naturally.
    0:07:44 But because again of our data advantage,
    0:07:46 we can use generative algorithms to actually generate these assets.
    0:07:50 And then they license them from us, and then they use them for
    0:07:53 downstream clinical applications or whatever that might be.
    0:07:56 >> I’m familiar with the basic idea of gene editing, but not much beyond that.
    0:08:02 If a company comes to you and for instances they need an enzyme
    0:08:05 that can break down plastics.
    0:08:07 >> Yeah.
    0:08:07 >> And let’s say, I don’t know if that occurs in the natural world or not.
    0:08:11 But let’s say it doesn’t.
    0:08:13 What can you do then?
    0:08:14 How does that work?
    0:08:15 >> So there’s kind of multiple ways in which we look at this.
    0:08:20 The first one is, let’s say someone wants an enzyme to create plastic.
    0:08:23 In many cases, that assumption of let’s say,
    0:08:26 we’re not sure whether this exists in nature or not.
    0:08:29 It’s actually that assumption is made on what we know from public data.
    0:08:32 And actually, we are making new biological discoveries based on our data set all the time.
    0:08:37 So that’s kind of one argument.
    0:08:39 But then there’s another way of thinking about this, is that because of the data
    0:08:43 advantage, when we use deep learning to optimize or
    0:08:46 generate these enzymes for a specific function.
    0:08:50 Because we have explored sequence space and evolution so much more,
    0:08:53 we can actually understand how to get towards even these unnatural reactions.
    0:08:58 Much easier in many situations.
    0:09:00 >> So you said it’s been about two, a little over two years.
    0:09:02 >> Yeah. >> You’ve been gathering the data sets.
    0:09:04 How much has AI, the technology that’s available that compute as well as data,
    0:09:12 how much has that world changed in those couple of years relative to you thinking
    0:09:19 about, well, we’re collecting this enormous amount of data and it’s fantastic.
    0:09:24 And it can open so many doors.
    0:09:26 But do we have enough compute?
    0:09:28 Do we have the right algorithms to be able to work with it?
    0:09:31 From the outside, and especially we talk a lot about generative AI these days.
    0:09:35 For the mainstream, that world has exploded in that time period.
    0:09:39 But for the kind of work you’re doing, has the change been as dramatic?
    0:09:42 >> Absolutely, there were a couple of situations a few years ago even where we
    0:09:49 had maybe not the same size of the data set, but the same foundational architecture
    0:09:53 with the long genome context and all of this metadata that we collect.
    0:09:58 Where I was actually thinking, oh, damn, I don’t think there’s an architecture out
    0:10:01 there that can deal with our data.
    0:10:03 And that is slowly starting to change.
    0:10:06 And I think one of the most exciting things in kind of the deep learning applied
    0:10:10 to biological tasks in the past few months and years is that what all people have
    0:10:14 done is thinking about what additional biological context can I include into my
    0:10:20 language model architecture or something.
    0:10:21 So not just use something from a different domain and
    0:10:25 force that architecture onto a biological question, but think about how can I change
    0:10:29 that model architecture in a way that represents biology much better.
    0:10:33 And I think that kind of trend in the last few months is really exciting.
    0:10:37 And it yields much better results as well, and that’s great.
    0:10:39 >> Are you taking off the shelf models and fine training them for your,
    0:10:44 I mean fine tuning, excuse me, for your own use?
    0:10:47 Or are you building models from scratch?
    0:10:50 How does that work?
    0:10:52 >> We’re doing both.
    0:10:53 So on the folding problem, we use pretty much the AlphaFold architecture that exists
    0:10:57 because I think it’s pretty good and it works.
    0:11:00 And for that, it’s purely just doing a much better job because of the data that we have.
    0:11:05 We’ve also built our own architectures and models, one for annotation.
    0:11:10 There’s a lot of what are called functional dark matter sequences where we have a sequence,
    0:11:14 but we have no idea what it does.
    0:11:16 And if you have sequences from the Antarctic that have never been seen before,
    0:11:19 it’s actually important for us to be able to computationally say what they do.
    0:11:22 And so for that, we’ve developed some contrastive deep learning algorithms
    0:11:27 to annotate them at a pretty high accuracy, and we presented that at NeurIPS last year.
    0:11:31 So we kind of do both.
    0:11:32 It depends what we feel like is worth building something from scratch
    0:11:37 versus just leveraging our data advantage.
    0:11:39 But even when we built something from scratch,
    0:11:41 we’re always leveraging our data advantage as well.
    0:11:43 So it’s kind of, we’re doing both, whatever works.
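    As a rough illustration of the contrastive approach mentioned above for annotating “dark matter” sequences, here is a minimal InfoNCE-style loss over paired sequence and function embeddings in PyTorch. The encoders, batch construction, and temperature are stand-in assumptions; the published method will differ in its details.

    ```python
    # Minimal sketch of a contrastive (InfoNCE-style) objective that pulls a
    # protein-sequence embedding toward its matching function-label embedding
    # and pushes it away from the other labels in the batch. Illustrative only.
    import torch
    import torch.nn.functional as F

    def info_nce(seq_emb: torch.Tensor, label_emb: torch.Tensor,
                 temperature: float = 0.07) -> torch.Tensor:
        """seq_emb, label_emb: (batch, dim) tensors where row i of each is a
        matching (sequence, function) pair; other rows act as negatives."""
        seq_emb = F.normalize(seq_emb, dim=-1)
        label_emb = F.normalize(label_emb, dim=-1)
        logits = seq_emb @ label_emb.T / temperature   # (batch, batch) similarities
        targets = torch.arange(seq_emb.size(0))        # diagonal = positive pairs
        return F.cross_entropy(logits, targets)

    # Toy usage with random tensors standing in for real encoder outputs.
    batch, dim = 8, 128
    loss = info_nce(torch.randn(batch, dim), torch.randn(batch, dim))
    print(float(loss))
    ```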
    0:11:46 >> I’m speaking with Phil Lorenz.
    0:11:48 Phil is the Chief Technology Officer at Basecamp Research.
    0:11:51 And we’re speaking high above the show floor here at GTC 2024,
    0:11:56 our podcast recording area has a nice window view of the show floor coming to life this morning.
    0:12:02 Phil, you came from an academic background.
    0:12:05 You were at University of Oxford before joining Basecamp.
    0:12:08 Maybe you can walk us through your journey a little bit.
    0:12:11 And then we can talk a little bit about what’s the same,
    0:12:13 what’s different from moving from academia into your role now.
    0:12:17 >> Yeah, definitely. I’m a traditional life scientist at heart.
    0:12:20 You have a lot of people in healthcare and life sciences now
    0:12:23 that have a computer science background walking into that.
    0:12:25 And that’s amazing. That’s incredible.
    0:12:27 I’m a little bit from the kind of molecular biology, traditional background,
    0:12:33 which is what I did during my undergrad.
    0:12:34 And then moved towards more kind of deep learning applied to
    0:12:38 genomics sequencing data for my PhD.
    0:12:41 I discovered a couple of new human genes and transcription start sites
    0:12:45 using deep learning during my PhD.
    0:12:47 So that’s kind of where I worked on for a while.
    0:12:49 One thing I always really cared about, whatever you do in the life science
    0:12:53 of healthcare industry, is thinking about the problem you’re trying to solve first.
    0:12:58 And then going backwards and thinking, OK, what kind of technologies, tools
    0:13:02 can you use to address that problem?
    0:13:04 That’s kind of always thinking about what you’re trying to do in that kind of
    0:13:07 order of events. And that’s still how I think about this now.
    0:13:10 Even though I wouldn’t really say I’m a traditional life scientist anymore,
    0:13:14 but always thinking about what you’re trying to solve and then what do you need
    0:13:17 to do in that kind of philosophy is still something I think about quite a lot.
    0:13:21 Even though my PhD was quite applied, so it wasn’t too academic,
    0:13:25 which is maybe a good thing.
    0:13:28 But yeah, that’s kind of how I came into what I’m doing now.
    0:13:31 Did you learn to code when you were younger, kind of out of an interest
    0:13:34 in computer science and learning how to code?
    0:13:37 Or was it more that, in your work in the life sciences,
    0:13:40 You kind of hit a point where you realized, oh, this will be faster
    0:13:43 if I learn how to write scripts.
    0:13:44 It was almost, I started coding almost out of a necessity.
    0:13:49 When I got my first kind of big data sets and at some point I was like,
    0:13:54 yeah, I’m not going to use Excel for that.
    0:13:56 So it was almost kind of, I wouldn’t say I was forced to, but at some point I was like,
    0:14:00 well, it’s just going to make everything more efficient faster and so on.
    0:14:03 So it was kind of great because I felt like I was coding always with a purpose
    0:14:08 to do something specific for what I wanted to do with my project.
    0:14:11 So in that way, I kind of always felt really motivated to go after it.
    0:14:16 So that’s kind of good.
    0:14:17 So how long have you been at Basecamp now?
    0:14:19 Basically from the very beginning for almost three years now.
    0:14:22 OK.
    0:14:23 And how big is your team now?
    0:14:25 How has it grown out over that time?
    0:14:27 The company as a whole is about 32 people.
    0:14:30 My team is 15, so a bit less than half the company.
    0:14:34 Can you tell a story or kind of explain a discovery along the way at Basecamp
    0:14:41 that will blow our listeners’ minds?
    0:14:44 Is there something about sequences from Antarctica,
    0:14:48 or something undiscovered about the way ecosystems work in the desert,
    0:14:54 or for that matter here in San Jose, I don’t know.
    0:14:56 But something that just really sticks out is like, you know…
    0:15:00 Yeah, I mean, one thing that I still think is a nice thing to share
    0:15:05 is actually from the very beginning, the very first few days of Basecamp
    0:15:10 was, I mentioned this earlier,
    0:15:12 but the two founders of Basecamp, Glenn and Ollie,
    0:15:15 and big kudos to them for having this kind of vision,
    0:15:17 they love exploring the world.
    0:15:19 They go out in the wild, they go climbing, they go up mountains, whatever.
    0:15:23 I’m more fragile, I like to stay behind the screen.
    0:15:26 And so at some point, a couple of months before they started Basecamp,
    0:15:29 they spent over a month on an ice cap, fully off-grid in Iceland.
    0:15:34 And Glenn did a lot of sequencing, DNA sequencing during his PhD.
    0:15:38 So he brought these kind of mini flow cells with him,
    0:15:41 these portable sequencing devices.
    0:15:43 And he was just sequencing the ice cap to see what was in there.
    0:15:46 You probably don’t expect much life to be happening there.
    0:15:49 And so they came back with this data and they were like,
    0:15:52 “Oh, Phil, you do a lot of this analysis and you do coding in your PhD.
    0:15:54 Can you have a look at what’s in there?”
    0:15:56 And I analyzed this data, I annotated this,
    0:16:00 and something like 97% of it has never been seen before in any public database.
    0:16:05 This was completely novel, had 0% similarity to anything that’s ever been seen before.
    0:16:10 And that was just a random spot, Glenn did, somewhere on an ice cap.
    0:16:13 We didn’t look for something new, it was just random.
    0:16:16 And so from that, we just realized,
    0:16:18 “Oh my God, the vastness of life on Earth is just so huge.”
    0:16:21 And the opportunities lost in the life science industry by not leveraging this data more systematically,
    0:16:26 that was kind of a great origin story for us to realize.
    0:16:30 Like, let’s do this systematically and with deep learning in mind,
    0:16:33 because a lot of the data that we have in the life sciences is us,
    0:16:37 you know, having kind of all these academic endeavors sequencing here or there,
    0:16:41 and fingers crossed that they’ve collected the right data, right?
    0:16:43 And so we’ve kind of made this systematic and in partnership with all of these nature parks,
    0:16:47 which is exciting.
    0:16:49 Yeah, that’s amazing.
    0:16:50 From the technological side, we talked a little bit before about the infrastructure,
    0:16:54 the compute, the technique, sort of, I don’t want to say catching up to,
    0:16:58 but sort of keeping pace with, you know, the size of your data set along the way.
    0:17:04 What are some other AI machine learning related challenges that you’ve encountered at base camp
    0:17:10 that you got passed or perhaps that you’re sort of, you know, still grinding on now?
    0:17:15 Yeah, I mean, one thing that I am most excited by that has been kind of addressed in the last few months
    0:17:20 is how to deal with bigger context sizes.
    0:17:23 So that’s kind of something in the language model field,
    0:17:26 there’s a couple of architectures that were developed,
    0:17:30 especially, I think, in Stanford last year, like Hyena and Mamba,
    0:17:33 that I’m super excited by because what we’re collecting is not just protein sequences,
    0:17:38 but these really long-range genomic context windows, which aren’t really that common in other public data.
    0:17:44 And so can I ask you to explain what that means?
    0:17:46 Yes, I mean, when people sequence environmental data, not like a strain from a Petri dish,
    0:17:52 but kind of these wild environmental samples where you might have 10,000 species
    0:17:57 in a tiny, tiny piece of soil or something.
    0:17:59 When you sequence that, usually what happens in public data is you get maybe a few thousand base pairs.
    0:18:04 And if you’re lucky, there’s one gene on there.
    0:18:06 What we’ve done is we’ve really tried to optimize this process in a way where we get near-full genomes.
    0:18:12 So the entire genome of every single organism in that sample.
    0:18:15 So hundreds of thousands of base pairs with tens of thousands of genes or whatever that might be.
    0:18:20 And so with that, we’re actually understanding a lot more about the interaction of all of these genes,
    0:18:25 what they do to work together and also understand more complex behavior.
    0:18:28 For example, how you can use this for gene editing or therapeutic applications.
    0:18:33 But modeling this with language models hasn’t really been that straightforward
    0:18:37 because there wasn’t really that many architectures out there to deal with that kind of information.
    0:18:41 And with Hyena and Mamba, we’re now really excited that this is now possible
    0:18:45 and that we have the data sets that we can apply this to.
    0:18:48 So that’s something I think in terms of dealing with long context,
    0:18:51 probably the most exciting development for very selfish reasons, basically.
    0:18:55 But I’m super excited about that.
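    To ground what a “long genomic context window” means for a language model, here is a small sketch of turning a long assembled contig into k-mer tokens, the kind of preprocessing that would sit in front of a long-context sequence model such as a Hyena- or Mamba-style architecture. The k-mer size, stride, and vocabulary handling are illustrative assumptions rather than Basecamp’s actual tokenization.

    ```python
    # Toy sketch: k-mer tokenization of a long DNA contig for a long-context
    # sequence model. Parameters and vocabulary handling are illustrative.
    from itertools import product

    K = 6  # 6-mers are a common choice for DNA language models

    # Vocabulary of all 4^K possible k-mers plus an unknown token.
    vocab = {"".join(kmer): i for i, kmer in enumerate(product("ACGT", repeat=K))}
    UNK = len(vocab)

    def tokenize(contig: str, k: int = K, stride: int = K) -> list[int]:
        """Slide a window of size k across the contig and map each k-mer to an id.
        With stride == k the tokens do not overlap, so a 600,000 bp contig becomes
        a 100,000-token sequence -- the long-context regime discussed above."""
        contig = contig.upper()
        return [vocab.get(contig[i:i + k], UNK)
                for i in range(0, len(contig) - k + 1, stride)]

    contig = "ACGTAGGCTTACGATCGGATCCTAGGCATCG" * 1000  # stand-in for an assembled contig
    tokens = tokenize(contig)
    print(f"{len(contig):,} bp -> {len(tokens):,} tokens")
    ```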
    0:18:56 Very cool, very cool.
    0:18:58 So you kind of hinted at this in that answer when you mentioned gene editing.
    0:19:03 What are some of the applications going forward for the work that Basecamp’s doing?
    0:19:09 And then I don’t know and I’m not asking you to compare to competitors or what have you.
    0:19:14 But as the available data sets sampled from biology, from nature, grow,
    0:19:21 as they continue to grow, what are some of the implications and applications
    0:19:25 for everyday folks like me downstream from the work that you’re doing?
    0:19:29 No, I think gene editing and, at some point, gene writing technologies
    0:19:34 are going to change not just medicine but health in general.
    0:19:39 We accumulate millions of mutations every day just by existing,
    0:19:43 by breathing and eating, and some of them are non-significant,
    0:19:46 some of them are bad, some of them may be good.
    0:19:48 But for us to be able to accurately change them or write new DNA into the human genome
    0:19:53 to make a change, that’s really I think kind of the next wave
    0:19:58 of big therapeutic changes that we can make.
    0:20:01 And the machines that can do this actually often are derived
    0:20:04 from the way bacteria and viruses fight with each other.
    0:20:07 So bacteria and phages, those are viruses that infect bacteria.
    0:20:11 They kind of have biological warfare going against each other in the wild, in nature.
    0:20:16 It’s not something you can measure in a petri dish with a sterile strain or whatever,
    0:20:19 but in nature, they have warfare against each other.
    0:20:22 And those machines that enact this warfare, those gene editing systems,
    0:20:26 CRISPR/Cas nucleases, that’s like one of the kind of major headlines that came out of that.
    0:20:31 There’s hundreds of millions of these systems that haven’t been discovered yet
    0:20:34 and leveraging them in a way that we do, but also in a way
    0:20:38 that at some point by having enough of these that we can generate them
    0:20:42 or design them through language models, for example.
    0:20:44 That’s a development I think is really exciting.
    0:20:48 You know, I’m abstracting to the level that I can comprehend.
    0:20:51 But the last bit that you said, I was thinking, so my kids, maybe my grandkids
    0:20:56 might be able to prompt a model and edit their genes or rewrite their genes.
    0:21:02 And I know it’s maybe not quite like that, but is that a future we’re headed towards?
    0:21:07 I think, I mean, I live in Europe, so there’s always a lot of regulations to think about, maybe?
    0:21:15 So I don’t know, I come at all of this from that side.
    0:21:17 No, but joking aside, I think one of the things I can definitely imagine
    0:21:19 is that if the way we monitor our DNA and our mutations
    0:21:24 and the way we can address these mutations, let’s say in 20, 30 years time,
    0:21:28 is something we can do in real time, I can imagine, where because of, you know,
    0:21:33 sequencing in the body as we live and breathe,
    0:21:36 where through some device, we detect a harmful mutation and are able to fix it within two hours.
    0:21:42 Like, this sounds crazy, but I do think this is kind of where this is going in 20, 30 years.
    0:21:47 And that’s kind of the science fiction scenario for therapeutics and gene technologies.
    0:21:52 Amazing.
    0:21:53 NVIDIA Inception, you’re part of it.
    0:21:55 I’m not asking you to plug anything, but how’s that been?
    0:21:58 And, you know, being sort of a startup on the leading edge of life sciences
    0:22:05 must be, you know, in some ways, similar to other startups with similar concerns
    0:22:09 around growth and funding and, you know, keeping things running and all that kind of stuff.
    0:22:15 But I’m sure there’s something unique to being a startup,
    0:22:19 working on, you know, discovering novel science.
    0:22:22 What’s that like and what’s it been like working with Inception?
    0:22:25 It’s been amazing.
    0:22:26 We’ve been working with NVIDIA for two years, almost two years now.
    0:22:30 I think it’s really exciting because a lot is happening.
    0:22:33 And so, I mean, sometimes I open, you know, bioRxiv or PubMed or something.
    0:22:39 It’s like, damn, can everyone please stop publishing?
    0:22:41 There’s just so much happening.
    0:22:42 But I actually think it’s exciting because everyone has their strengths, everyone.
    0:22:47 And by having these networks of lots of life science companies and everyone has a different product,
    0:22:52 they have a different strategy.
    0:22:54 And so actually, some people think like, oh, are these, you know, startups all competitive with each other?
    0:22:59 And in some cases, maybe, but I’m actually a lot more excited by the fact
    0:23:02 that what’s really happening is we’re all growing the field.
    0:23:05 We’re all growing the market.
    0:23:07 And so some people offer a software, some people offer an asset.
    0:23:10 Some people will offer a service, whatever it might be.
    0:23:12 And so actually, the products are different, the technologies are different.
    0:23:15 And so just the space growing as a whole is something that’s super exciting.
    0:23:19 And NVIDIA and Inception, they’re connecting everyone and making it happen.
    0:23:22 And so that’s something I’m super excited about.
    0:23:24 That’s fantastic.
    0:23:25 So you mentioned coming up with something of a traditional,
    0:23:29 I hesitate to call it old school because as we’re sitting at the table,
    0:23:32 I’m not going to guess, but I know you’re far younger than I am.
    0:23:35 So if you’re old school, I don’t want to think about what I am.
    0:23:38 But coming up with more of a traditional life sciences background
    0:23:42 and then kind of moving to a place where you moved into applying it and using technology in that way.
    0:23:47 Jensen said something in the media a couple of weeks ago about giving advice to young folks,
    0:23:53 you know, to focus on a domain and develop domain expertise.
    0:23:57 Because, you know, the the computing language of the future is just speaking, right?
    0:24:02 It’s natural language.
    0:24:03 And so the tools will progress so that you can leverage them in your domain.
    0:24:08 Right, yeah.
    0:24:09 What advice would you give to a young life scientist,
    0:24:12 somebody studying biology in high school or going into undergraduate work in this age
    0:24:18 where everything’s moving so fast and technology is such a big part of it?
    0:24:21 No, I guess, yeah, I mean, I speak to a lot of kind of biologists, but also computer scientists
    0:24:27 that are crossing over into the field. Basically, to the biologists,
    0:24:28 my main advice is always: do what you care about.
    0:24:32 There’s a lot of biologists that have something they really care about,
    0:24:35 but then they go into oncology because that’s where big pharma is, or whatever it is.
    0:24:40 And that’s great.
    0:24:41 But I actually think there’s so many there’s so many areas in the life science industry
    0:24:46 where the problem is not that they’re not relevant.
    0:24:48 The problem is that they’re not relevant yet because by making more discoveries,
    0:24:52 we’re always going to find something that’s clinically relevant.
    0:24:55 Like gene editing was found through studying bacterial immunology,
    0:24:58 which is like no one ever thought was going to be relevant therapeutically a few years later.
    0:25:02 Right.
    0:25:02 So I think it’s always better to do something that you’re passionate about and make it relevant
    0:25:07 rather than trying to find something that oh, this is what therapeutics cares about
    0:25:10 and just running after that.
    0:25:12 That’s my advice to biologists. For computer scientists,
    0:25:15 the main thing I find is, when we have people apply to Basecamp or in interviews and so on,
    0:25:21 often what I hear is people kind of saying, oh,
    0:25:24 I really care about this specific type of diffusion model and I want to apply this.
    0:25:28 And my counter argument often is kind of, let’s discuss what we’re trying to solve first
    0:25:33 and then does that help or should we think about something else?
    0:25:37 Or should it be a language model or should it be, you know, regression?
    0:25:40 I don’t know.
    0:25:41 But basically, this is often what I say when I speak to people from kind of a more technical background,
    0:25:47 and this goes back to Jensen’s point about the domain: interact well with the domain,
    0:25:52 the problem they’re trying to solve.
    0:25:53 And then absolutely, yeah, talk about the technology. Identify the problem
    0:25:58 and the domain first and then apply the tools.
    0:26:00 Yeah, it makes good sense.
    0:26:01 One thing I haven’t really talked about much is you probably know about things like the New York Times
    0:26:06 suing OpenAI because they didn’t ask for their data.
    0:26:09 So one thing that we’re doing that’s different to any other life science organization is that
    0:26:13 we’re not just asking all these nature parks for consent because all the public databases they never did.
    0:26:18 So we don’t just ask them for consent.
    0:26:19 What we’re also doing is when we license something to a partner or whatever,
    0:26:24 we often do something like a revenue share with them to basically make sure that all the progress of life sciences
    0:26:29 or the AI that comes out of all of this data, we share this with the stakeholders that the data originates from.
    0:26:35 So to protect biodiversity, we also have a different model where there’s kind of good data governance for it.
    0:26:42 I don’t know whether, yeah, but that’s kind of one thing our team does, yeah.
    0:26:45 When you started approaching the nature parks about this, were they receptive or they confused?
    0:26:52 Did they understand what you were talking about?
    0:26:54 And I don’t mean to be disparaging to them, it’s unique.
    0:26:57 So, not to just say the global south, because that’s a lot of vastly different players,
    0:27:02 but a majority of biodiversity is obviously like South America, Africa and so on.
    0:27:06 They are on top of this.
    0:27:07 There’s the Nagoya Protocol, which actually means you have to not just ask for consent but share benefits with them.
    0:27:13 And so they are aware of this.
    0:27:14 They’re almost waiting for the West to ask them and work with them.
    0:27:18 So a lot of them are super on top of this.
    0:27:20 Interesting, yeah.
    0:27:21 We’re just the only ones.
    0:27:22 We have a team that is part of the United Nations Convention on Biological Diversity.
    0:27:26 So that’s a long title.
    0:27:29 But they’re working with nature parks, local governments, national governments to make these deals.
    0:27:34 And sometimes that takes months to have an agreement.
    0:27:37 But that way we know that for every single data point in our database,
    0:27:45 we don’t just have consent and permission for where that comes from.
    0:27:45 But also when we see a commercial success through something, we can share some of that with them.
    0:27:50 And that incentivizes obviously an even bigger data supply chain, which is exciting.
    0:27:56 Phil, for listeners who want to find out more about what Basecamp Research is up to,
    0:28:00 there’s a website, should they go there or where should they go?
    0:28:03 Yeah, absolutely.
    0:28:04 Our website, very simple, Basecamp-research.com.
    0:28:08 I think we have LinkedIn, Twitter and so on as well.
    0:28:10 You can email me if you want.
    0:28:13 Fantastic.
    0:28:14 Absolutely, yeah.
    0:28:15 Great.
    0:28:15 Well, thanks again for taking the time out of GTC to speak with us.
    0:28:19 It goes without saying, but it’s fascinating, fascinating work you’re doing.
    0:28:22 And I only understand it on the level of a couple of drops,
    0:28:25 not the whole ocean, but can’t wait to see what the rest of the year holds for you and for Basecamp.
    0:28:30 Thank you so much.
    0:28:30 This has been great.
    0:28:31 Cheers.
    0:28:32 [MUSIC PLAYING]

    Basecamp Research is on a mission to capture the vastness of life on Earth at an unprecedented scale. Phil Lorenz, chief technology officer at Basecamp Research, discusses using AI and biodiversity data to advance fields like medicine and environmental conservation with host Noah Kravitz in this AI Podcast episode recorded live at the NVIDIA GTC global AI conference. Lorenz explains Basecamp’s systematic collection of biodiversity data in partnership with nature parks worldwide and its use of deep learning to analyze and apply it for use cases such as protein structure prediction and gene editing. He also emphasizes the importance of ethical data governance and touches on technological advancements that will help drive the future of AI in biology.

  • Media.Monks’ Lewis Smithingham on Enhancing Media and Marketing With AI – Ep. 222

    AI transcript
    0:00:00 [MUSIC]
    0:00:10 Hello and welcome to the Nvidia AI podcast. I’m your host, Noah Kravitz.
    0:00:15 Earlier this year, a robot named Wormhole made a splash at the CES show in Las Vegas,
    0:00:20 catching showgoers’ attention with its edgy humor and alien likeness right out of the Men in Black
    0:00:25 movies. Wormhole was created in part to show off Monks.Flow, an AI-powered platform for marketers
    0:00:30 created by an agency called Media Monks. And both Wormhole and some of the humans at Media Monks
    0:00:36 are here at GTC, talking about the power of generative AI in media, marketing, and beyond.
    0:00:41 Here to break it all down for us is Lewis Smithingham, SVP of Innovation and Special Ops at Media Monks.
    0:00:47 Lewis is presenting a session here at GTC entitled, “Revolutionizing Fan Engagement,
    0:00:52 Unleashing the Power of AI and Software-Defined Production.” And he’s been kind enough to stop
    0:00:57 by the podcast as well, so let’s get right to it. Lewis, welcome to the Nvidia AI podcast.
    0:01:02 Thanks so much for stopping by. Hey, Noah. Thank you so much for having me. I’m
    0:01:05 really excited to be here. One thing I will caveat, so while you know how crabs evolved
    0:01:11 convergently five separate times, like crustaceans evolved that way, Worm-like characters can evolve
    0:01:17 separately outside of properties that may or may not be owned by Universal Pictures as well.
    0:01:22 I had full faith that it was going somewhere, but when you said crabs, I was like, “Hey.”
    0:01:27 I’m in a current, like my current meme collection is exclusively crab memes about the fact that
    0:01:33 crabs evolve separately five different times, which says, like, I don’t know what that says
    0:01:36 about my algorithm. I want to ask you about that, but I’m going to do that when we’re done talking
    0:01:42 about Wormhole. How about that? Yeah, absolutely. Excellent. So, first off, Media Monks. What is
    0:01:48 Media Monks? Somebody asked me that a second ago, and I said, “What isn’t Media Monks?” Media Monks
    0:01:52 are a 7,700-person company that’s a global company that solves great solutions for media,
    0:01:59 for entertainment, for advertising, for large-scale tech companies. We’re a combination of
    0:02:05 advertising. We’re a combination of technical services, data services, heavy-duty technical
solutions. We’re a use case engine founded in 2001 by Wesley ter Haar and Victor Knaap in a basement
    0:02:19 in Hilversum, Radio City, Hilversum. Where’s Hilversum? Hilversum is southeast of Amsterdam.
    0:02:25 It’s an industrial town that was where, during the Second World War, radio signal towers came
    0:02:31 out of. So, all the radios in the UK, my parents are British, and all the radios in the UK have the
    0:02:35 word Hilversum on them as a place where antennas were. So, later on in 2018, Sir Martin Sorrell
    0:02:44 acquired us, and we’ve since then merged with, I think, upwards of 26 other different businesses
    0:02:50 to create one holistic group. One of the things that’s really exciting about the presence we have
    0:02:55 here at GTC is, I think, there’s no more than two people from any one of the original merged
    0:03:00 companies. I was in a meeting earlier today, and they were like, “Well, who do you guys work for?”
    0:03:05 And all of us were from different mergers, and we’re all working together on one team to create
    0:03:10 one thing over one P&L. And it creates really cool opportunities where I can go talk to a team that
    0:03:15 does massive, scaled campaigns for cars or for shoes, and then walk over to a team that builds
    0:03:22 heavy-duty CDNs for video production systems. And so, yeah, the answer to MediaMonks, MediaMonks
    0:03:28 are a massive global, or not massive, we’re a global company that works to create solutions
    0:03:34 for our clients across a wide range of services, whether it is marketing,
    0:03:37 technical services, or innovation spaces as well.
    0:03:41 Very cool. And so, I assume a lot of the work you do is in the digital realm,
    0:03:45 but also you at least build one alien robot, right?
    0:03:48 Well, I mean, that is one differentiating thing. I think a lot of the other folks in our industry
    0:03:53 are focused on physical services or services that aren’t digital. We’re digital focused. And so,
    0:03:59 as a part of that, we build things all with the purpose of driving more digital innovation,
digital content. And we do have a robotics team down in Latin America, in Brazil, and
you go down there and it’s wild. They had flying cars at one point. I mean, we did a show,
    0:04:17 my first job, or third job with MediaMonks, I used to direct for them. And we were doing this
    0:04:22 old spice commercial, I don’t know, commercial is the right word for it. It actually did turn
    0:04:26 into a commercial that was aired during the Super Bowl, but it’s for old spice foam zone.
    0:04:30 It’s Wes’ favorite project. And it was insane. It was a 16-hour game show that they built these.
    0:04:39 And I made these crazy. The thing that was great about MediaMonks as a director is you could pair
    0:04:44 up with them. And the dumbest, craziest ideas you pitched would actually happen. And you’re like,
    0:04:50 “Dude, is that safe? Are we sure we want to do that?” Yeah, we did. And we built this 26,
    0:04:58 I think it’s 36-foot tall stairs that you climb and trip down. And so, the team in Latin America
    0:05:04 in Brazil, it’s called the Shop.Monks. And they have a robotics team. And they’re an
    0:05:10 in-house robotics team. And Wormhole came out of a project that the robotics team,
    0:05:15 because they’re all geniuses, geniuses get bored really quickly. And so, they started spinning out
    0:05:20 tooling and they started figuring out how they could experiment more than a Metronics. And
    0:05:24 I think one of the dreams of that team are to start building things for parks. And so,
    0:05:28 we’re like, “Well, let’s just build something and see what happens with it.” And the guys down
    0:05:33 there sent me a text with it. And we’re like, “Yeah, do you think we could… Could you show this to
    0:05:37 people? Are you kidding?” And it’s a great thing because it’s no context. And we went, we showed
    0:05:43 up at re-invent with it. This is the first time it was in public. And we didn’t have a space for it.
    0:05:48 So, we just like wheeled it into a hallway. And we’re like, “Yeah, you want to talk to this robot?”
    0:05:55 You know, it’s funny you say that because earlier, not to interrupt you, but I was walking from,
    0:05:59 you know, to get lunch or something. And one of the Boston Dynamics robots was just kind of
scooting up and down the hall. Yeah. I mean, that’s the way to get attention if you run it in a hall.
    0:06:08 And so, then again, what’s great about Monks is there was another team in the Experience.Monks group.
    0:06:14 And they were experimenting with conversational systems. And they’re like, “Well, why don’t we run
    0:06:18 Bedrock on this? Why don’t we figure out a way where we can use this to like actually talk to you?”
    0:06:23 Yeah. So, let’s break it down. What did it describe for the listener? It’s, you know,
    0:06:26 we’re old school, it’s audio only, no visuals. Describe Wormhole.
    0:06:30 I’m not going to do this ASMR, but I will lean into what he looks like. So, Wormhole looks like
    0:06:38 somebody that got kicked out of a casting session for a late ’90s Jim Henson picture.
    0:06:45 So, he’s like three feet tall. He sits on this crate. He would be great to have at your theme park
    0:06:51 talking to guests as they walk through the lines, perhaps. And he is a Worm. He’s like pinkish blue.
    0:06:58 He really likes coffee. And he uses a conversational AI to have conversations with you running on
    0:07:05 Bedrock and a number of other large language models. He also runs on some EC2 instances as well.
    0:07:10 And what I think is really exciting about him is his small footprint. He sits on a crate.
    0:07:17 And we use cloud-based systems and we use GPU to really empower how he talks to folks,
    0:07:22 which is fun. And just looking at part of what I get excited about Wormhole as well is, you know,
    0:07:27 I have a lot of thoughts about the uncanny valley and I think humanoid systems and
    0:07:32 metahumans are absolutely outstanding. We have one on the floor right now that has
    0:07:36 conversationally is absolutely incredible. But because of the deeply ingrained evolutionary
    0:07:42 systems we have around the uncanny valley, there’s still stuff that feels off when you
    0:07:47 talk to a humanoid character. I even think humanoid robots are still a little bit off.
    0:07:52 But when you have a cartoonish looking Worm thing, it’s not weird at all. And you’re like,
    0:07:59 cool, I’ll talk to this Worm, whatever. That’s cool. And it means that you get past so many
    0:08:05 of the barriers that you would in an AI piece when you’re working with something that’s not
    0:08:10 humanoid. Right. And is that kind of the reaction that you observe from folks when,
you know, they encounter Wormhole at re:Invent? Yeah, absolutely. I mean, the thing that’s funny
    0:08:18 about this Wormhole, the original Wormhole had a cup of coffee and the engineers had the amazing
    0:08:23 idea to put actual coffee in it at one point. And so original Wormhole is just like flinging
    0:08:28 coffee around the room. And we have changed that so it’s not ruining people’s shirts anymore.
    0:08:33 Yeah, I was in a meeting with that guy about six years ago. Yeah, he’s just like this.
    0:08:36 Why are you doing that? But the reaction tends to be like people want to interact,
    0:08:43 people want to talk to him. It’s weird, it’s cool, it’s fun, it’s whimsical. And then like you start
    0:08:50 thinking immediately about use cases and you start getting into these systems where, and I think a
    0:08:54 lot of what MediaMonks does is as a case study engine, we use the fact that we have thousands
    0:09:00 of clients across different sectors where we produce thousands of projects a year
    0:09:04 where, and literally some teams produce millions of assets a year, that means you can try out a
    0:09:10 lot of stuff. Right. And then you start getting to a point where like, you know, does Wormhole like
    0:09:15 actually function well as somebody in a medical setting that talks to people? Right. Have you
    0:09:19 tried that? I mean, like I saw some in the keynote yesterday, like who were doing that. And I think
I would much rather get my diagnoses from Wormhole than somebody else.
    0:09:29 Then from like a just slightly off looking doctor. Yeah, I’d much rather have Wormhole be like, well,
    0:09:34 you know, you do actually need to get a colonoscopy this year versus some like humanoid robot thing.
    0:09:39 Like I think that’s more fun. And I think that that’s entertaining and that feels oddly more
    0:09:43 personal. But it’s important because of like sort of the co-iification of the way we engage with
    0:09:49 stuff. Has MediaMonks built robots for clients, conversational robots, I should say?
    0:09:54 We have. We have absolutely built them. We’ve built several pretty large scale
    0:09:59 conversational robots for movie properties. And there’s a couple of TV shows down in Latin America
    0:10:04 we have built those characters for. There’s like Wormhole 2.0 is coming up soon. I think we’ll be
    0:10:09 walking and talking, which terrifies me to a certain degree. And I’m concerned about like how
    0:10:15 they’re going to do the walking physics with him, but it should be fun. Right. It might be too early
    0:10:20 for this, but are there learnings you’ve taken away? I mean, other than, I’d rather hear about
    0:10:25 a colonoscopy from an alien worm than an uncanny humanoid. Yeah. I mean, that is a major learning.
    0:10:31 I think the other learnings that we’ve had are around latency, around how he speaks. I’d love
    0:10:36 to look at like using directional audio with him. I’d love to look at like other languages as well.
    0:10:42 Yeah. Is it English only? Well, I think he actually speaks Portuguese.
    0:10:45 A little Portuguese. Yeah. Nice. Yeah. He does know my name, which was weird.
    0:10:51 Is that terrifying? Yeah. It was like, “Oh, what’s up, dude?” I think somebody,
    0:10:57 there’s another project going on. I saw one of my team members who was annoyed I wasn’t joining
    0:11:01 calls this week had sent in a request this morning to synthesize my voice. Right. And
    0:11:07 apparently it got approved, which was like, got this like docusign from legal being like,
“Hey, MediaMonks now reserves the right to clone your voice.” Okay, sounds great. Does that mean
    0:11:17 I don’t have to go to that meeting? Right. Yeah. But I think the other learnings we’ve learned
    0:11:20 about it is A, it works. It’s fun. And that leans into like, I have a life philosophy around like
    0:11:28 whimsy and the idea that like most things you need to ask yourself when you create things like,
    0:11:33 “Would I like this? Does this suck?” And then also like, is this something we want to bring into the
    0:11:40 world? And like, I think wormhole and things like wormhole is. And where I get really excited is
    0:11:45 the idea of as you get to spaces where, you know, connectivity does get better over time and we
    0:11:49 start looking at some of the advances that were even made overnight, the idea of having these
    0:11:54 conversational robots is really exciting. Yeah. And fascinating and particularly if they look like
wormhole. Right. It’s going to be even more fun. Cool. So let’s talk about Monks.Flow. I’m going to
    0:12:04 read from your website here because that’s what I put in my notes. It’s an AI-centric managed
    0:12:08 service that streamlines how humans and machines work together. What does that mean?
    0:12:12 Yeah. Is it a platform?
Well, so what Monks.Flow does, Monks.Flow is a node-based pipeline system. I sort of think of it,
    0:12:21 I come from VFX originally, so I think about it a lot like Nuke or like we’re looking at different
    0:12:26 nodes. And when you look at, and this was great in the keynote yesterday, they talked, I was sitting
    0:12:31 next to our founder and Jensen goes into like this microservices portion and I’m like hitting him with
    0:12:37 my elbow being like, see, see, see this is what’s going on. And so what it does is it takes all the
    0:12:45 services we provide and it breaks them down into microservice-based systems and individualized nodes
    0:12:51 flow together completely as a flow, as a pipeline within that space. And so it does function as a
    0:12:58 managed service, but it also functions with a level of automation and it runs on its own
    0:13:03 levels of learning. And so whether it has your transcription, background generation, foreground
generation, texture generation. And then what’s really exciting about Monks and what we’ve been
doing at Monks is the fact that we have teams, we’re a glass-to-glass organization. So we have teams,
    0:13:20 we have on-staff camera operators and literally people. Glass-to-glass camera to screen.
    0:13:26 Camera to screen, yeah. And so we have people who shoot film with cameras and then we have people
    0:13:31 who design the systems and design the performance systems that those run on. And so we have the
    0:13:37 opportunity to take performance data for our clients, feed that back in all the way back to the
    0:13:42 origin. That’s part of the stuff we’re working on with Holoscan for media as well with the idea
    0:13:47 that, and I come from TV and films and in that space, you never knew what your audience was
    0:13:53 thinking and being able to figure out how you create that content because there’s so much,
    0:13:58 such an opportunity to really personalize with that. And that’s where what Monx Flow does is it
    0:14:03 just takes that opportunity, effectively marketing and content creation services, technical services,
    0:14:08 builds them into an entire pipeline, breaks them out into microservices with the idea that we think
    0:14:14 our industry’s model historically has been around selling hours, human being hours. So this human
    0:14:20 being can sit at the desk for 40 hours a week legally and that’s it. And we think that we
    0:14:29 should be selling based on outputs and the quality of those outputs and also on the quantity of those
    0:14:34 outputs. And so it gives us the opportunity and we see AI as this massive opportunity to just scale
    0:14:41 output of individual creators and to really lean into that. And so Flow enables that from end to
    0:14:46 end. And so, I mean, you asked if it was a platform. It is a platform. I think it’s less
    0:14:51 a platform and more a pipeline. Right. Right. No, it makes sense. And so the outputs from Flow are
    0:14:58 video? Wide, wide, wide ranging outputs. So it could be everything from a data set. So individual
    0:15:05 text-based data sets. It could be my voice being synthesized. It could be video assets. It could
    0:15:11 be real-time assets. It could be stills. We’ve worked on a pipeline to build all of that. I mean,
    0:15:17 arguably or not arguably, Wormhole is a portion of Flow as well. It’s a conversational system in
    0:15:24 Flow. And so you’re using that performance system, the learnings across those systems to be able to
    0:15:29 use it for a use case like an avatar that talks to you, an avatar that does translation. All those
    0:15:36 different opportunities come together and we think of them as part of an overall ecosystem.
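To illustrate the node-based, microservice-style pipeline idea described here, below is a toy sketch where each step (transcription, background generation, asset assembly) is a small node function and the pipeline is just their composition. The node names and payload fields are invented for illustration; this shows the general pattern of chaining independent services into one flow, not the Monks.Flow implementation.

```python
# Toy sketch of a node-based content pipeline: each node is an independent
# step (in production, plausibly a separate microservice) and the flow is
# their composition. Node names and payload fields are invented.
from typing import Callable

Payload = dict
Node = Callable[[Payload], Payload]

def transcribe(p: Payload) -> Payload:
    p["transcript"] = f"transcript of {p['source_video']}"
    return p

def generate_background(p: Payload) -> Payload:
    p["background"] = f"background styled for {p['brand']}"
    return p

def assemble_asset(p: Payload) -> Payload:
    p["asset"] = f"{p['background']} + captions from {p['transcript']}"
    return p

def run_flow(nodes: list[Node], payload: Payload) -> Payload:
    # Each node receives the accumulated payload and enriches it, so nodes
    # can be swapped, reordered, or scaled independently.
    for node in nodes:
        payload = node(payload)
    return payload

result = run_flow([transcribe, generate_background, assemble_asset],
                  {"source_video": "spot_v1.mp4", "brand": "acme"})
print(result["asset"])
```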
    0:15:41 And we work with partners like NVIDIA, like Adobe, like AWS, like all sorts of different
    0:15:45 spaces where, I mean, I tend to think that, and I’ve made my career off of building pipelines
    0:15:52 using the tools that people offer in the sense that I would imagine that NVIDIA’s R&D budget is
    0:15:59 probably the size of our overall company. And so why not let them do the R&D and then use that
    0:16:07 to run our own pipelines where we can empower our creatives, which are some of the absolute best
    0:16:11 in the world, to better use those tools to scale that information. So how do the creatives, and I
    0:16:16 don’t know how long folks have been monks, been at monks with monks? Yeah. I’ve been a monk for
    0:16:24 five years, depending upon how you do the math, five years officially, eight years unofficially.
    0:16:29 I was waiting for a third tier, like 12 years metaphysically. Yeah, I mean, probably psychologically
    0:16:35 at least 12 years. People aren’t in the room with me, but I can see that. I’ve been a Westite
    0:16:39 for that long, for sure. So I’d imagine you’ve got folks who are younger and have been maybe more
    0:16:46 tech adventurous and been trying out tools, and then maybe folks who are a little older,
    0:16:50 been in the business longer and are used to doing things a different way. What’s the response been
    0:16:55 like with the creatives trying out, adapting to, maybe rejecting, maybe being coerced into
    0:17:02 using AI? What’s the vibe around the creatives using it? So we do not coerce anybody to do anything.
    0:17:07 Certainly do not use robots to do that coercion. With that said, with our creatives, and I will
    0:17:14 very specifically caveat that with our creatives, the response has been overwhelmingly positive.
    0:17:21 Why? So, I mean, Media.Monks thinks differently, and I know that that’s a very, like, cliched thing
    0:17:26 for any company to say, but part of it is that our founder, Wesley ter Haar, came right out of the gates
    0:17:32 in this wave. And we were like, I remember the first meetings around this during the beginnings
    0:17:37 of the wave, but we were working with ChatGPT as far back. I mean, I remember projects going on with
    0:17:43 GPT-1, but we’ve been building in these spaces for a very long time. And as a part of that,
    0:17:50 Wes had this slide early on where it’s about Kasparov being beaten by Deep Blue. And if you
    0:17:56 think that your job, your creative idea, is smarter than a robot that can do chess, like,
    0:18:05 that’s pretty cool, dude, but like, maybe check yourself a little bit there. And so we think,
    0:18:12 and this is a bad metaphor that I came up with when I was, had a fever at another conference,
    0:18:17 but like, we sort of think about it like, and in creatives, there’s a great movie,
    0:18:21 Pacific Rim, check it out if you haven’t seen it. Excellent film. Excellent film. And it’s about
    0:18:26 people that sync their minds with gigantic robots to fight monsters. And we use AI,
    0:18:32 it’s cliched to think of it that way, but we use AI as a way to empower creatives and a way
    0:18:36 to scale our creatives. And, you know, it’s, I think I’m not worried at all about jobs going away
    0:18:44 for that sort of stuff. I’m worried about like, I think the big idea, which is a thing that is
    0:18:49 bandied about in advertising and in films for a long time, I don’t think that is a thing
    0:18:54 that’s going to last very long. I don’t think it has. And I don’t, I think it’s arguably
    0:18:59 dead already in the sense that like, in the way that singular commercials on television used to
    0:19:04 create a monoculture moment across our systems, we don’t have monoculture anymore.
    0:19:10 But also like, we don’t have a singular zeitgeist moment with the level of frequency that we used
    0:19:15 to other than perhaps like literally Taylor Swift. But with that said, like our, our belief is that
    0:19:23 this will come with great scale and great scale will require curation, great scale will require
    0:19:28 usage, great scale will require tooling and all those sort of things are where human beings will
    0:19:33 be involved. And somebody else has said this, but like there’s advertising around like Microsoft
    0:19:39 Excel coming out in the ’90s. And like, it’s all these ads about like, fire your accountant. Accountants
    0:19:48 are all going to die. And like the growth of the accounting business sector has been exponential
    0:19:53 since then. So like, if we think that like, this is going to come and destroy anything, if we can
    0:19:59 learn absolutely anything from history, it’s that with proliferation comes a need for more. And so,
    0:20:06 you know, you’re going to go from doing six things a year to 600,000 to 6 million, and being able to
    0:20:12 manage and operate those systems and engage. And then honestly, interact with that data is where
    0:20:20 our creatives see this as a massive opportunity to do more. And that’s, it’s really exciting for us
    0:20:27 across the company. And we have, I think our AI Slack channel, AI Collective is the name of the
    0:20:32 channel is like nearly the whole company at this point. Right, sure. And like, it’s not censored.
    0:20:39 And like, it’s like, there’s none of the negativity that I would expect and have seen elsewhere.
    0:20:45 It’s more just like, hey, this is a tool, let’s use this tool, let’s do stuff that’s really exciting.
    0:20:50 And I think if anything, we’re excited about the fact that, you know, it democratizes access to
    0:20:55 tooling in the same way I think about mini DV in the 90s. Right, right. I haven’t thought about
    0:21:00 mini DV. I think about it far too often. I’m sure you do. My second mini DV conversation of the day
    0:21:05 for the record. Nice. That’s why you come to a tech conference in 2024. 100%. Yeah. Yeah. You know,
    0:21:10 I’ve, I’ve done work with marketing departments at different types of companies and agencies and
    0:21:15 such. And recently, so I feel like I know enough to be worried. I’m going to say something that’s
    0:21:21 incredibly wrong, but I’m saying it anyway. Say it. I keep hearing that one of the holy grails with AI
    0:21:27 and marketing and advertising is being able to deliver hyper personalized content. And I see
    0:21:33 nodding. Is that, is that where things, is that where we think things are headed? I’ve heard things
    0:21:38 about, you know, well, the web is going to look totally different because of, you know, generative
    0:21:41 AI and other things that are popping up around it. And we’ll go from me actively searching for
    0:21:48 something to a hyper personalized version of a webpage being delivered to me de facto.
    0:21:54 Is that something you see happening? Is that something that advertisers and marketers are,
    0:22:00 you know, sort of pining for? Is that kind of the model that floats around in your head when
    0:22:05 you think about the future of things like the internet? Aren’t we there already? I don’t know.
    0:22:11 What are the top three meme types that you get on your reels? Mine is lots of scuba diving stuff,
    0:22:16 a lot of stuff about fossils, jokes about battleships. And right now it’s a lot of memes about
    0:22:24 crabs convergently evolving. So we’re there. We’re there already. And like,
    0:22:31 I think the thing is, is that we’re all individuals and this isn’t me like trying to make some
    0:22:37 just drummer paraphrase sort of thing, but like we are unique people, we’re intersectional people,
    0:22:43 we have our own identities and marketers, advertisers, brands, leagues, sports, IP holders
    0:22:48 need to start realizing that we are that and we expect that and we want that differently. And
    0:22:53 so when you start looking at the fact that we get into spaces where, not we Media.Monks,
    0:23:00 but a platform, may have upwards of 3,500 individual data points on you. There’s a point and this is
    0:23:07 part of that uncanny valley moment where it stops being creepy when it
    0:23:13 starts being meaningful and it starts being something you want. Right. Yeah. I don’t want to
    0:23:18 give away my data until you give me something back. Yeah. And like if you’re giving me something back,
    0:23:22 like memes about crabs convergently evolving, then like or Taylor Swift being a World War II
    0:23:27 battleship that just happens to be my favorite type of camouflage, Dazzle Camo, because I was
    0:23:31 searching for Dazzle Camo socks. Like that gets really meaningful and that’s exciting and that’s
    0:23:37 where you get out of creepiness and you get into utility and you get into meaning. And so
    0:23:42 like I think the problem with the advertising industry and I’ll clarify, we’re not an agency,
    0:23:49 we’re a services firm. We work with agencies, but we also work with brands directly. We do,
    0:23:55 we work with IP holders as well. We need to start thinking about people as those intersectional
    0:24:02 identities and serving those systems because if we don’t, and I’m talking about sports highlights
    0:24:08 here as well. And like I’m a massive NHL fan, go Canes, and part of that is like when I watch
    0:24:16 a broadcast right now, I have a choice that’s home fans and away fans. What else, what other
    0:24:24 identity is that binary? Like we have to start realizing that there’s all sorts of different
    0:24:29 systems that we can build on because we’re living in this space where tools are digitized or at least
    0:24:34 most industries are moving into that space. The opportunity to personalize is massive and so we
    0:24:39 do have teams that produce 1.5 million assets a year for a single piece of IP with the idea that
    0:24:46 as you get into that, you can get more and more performant, you can get more and more personalized
    0:24:50 and you produce meaningful content. Let me go back for a second and ask you to break it down.
    0:24:53 Home fans, away fans, what’s the in-between? What are some of the things you’re talking about that
    0:25:00 exist between them? So like my identity, like you just heard, I’m a scuba diver, I’m super
    0:25:07 into punk music and heavy metal. I think that’s probably a cornerstone of my identity. I like
    0:25:12 things that move fast and are exciting. And so the idea of watching a broadcast that’s
    0:25:19 tailored to that viewership base and is engaging with that because when you think about what other
    0:25:23 media today do you watch that isn’t personalized, and so looking at how a sports broadcast, for
    0:25:30 example, could start using a system. And I think one of the things that if anybody took anything
    0:25:35 away from this conference other than the fact that Jensen may be coming out with the jacuzzi
    0:25:40 product, which I will be buying if that is available, an NVIDIA branded jacuzzi sounds amazing.
    0:25:45 And there are data centers, by the way, like he made a joke about that. I’m sorry, for those that
    0:25:49 missed this joke, the new giant processors, they’re liquid cooled and they take in, I
    0:25:56 believe it was at room temperature, and they output it at about 45 degrees, the temperature of a jacuzzi.
    0:26:00 Two liters, I believe it was two liters per second or per minute. And jacuzzi, he made a
    0:26:07 joke that yeah, we’re going to have a jacuzzi peripheral and buy a data center and get a jacuzzi.
    0:26:10 What’s amazing though is there are data centers in the UK where they’re built next door to
    0:26:15 council estate pools and they heat swimming pools. And they heat them, yeah. But before I
    0:26:20 tangent onto more things, like, what RAG is, a retrieval augmented generation system.
    0:26:28 And so that’s where you take multiple disparate data sets, you query that and you start to take
    0:26:33 those systems together and combine them to figure out a result. And so, for example, in sports,
    0:26:40 if you’re a sport on a digital platform, and if you look at the way sports are right now,
    0:26:44 the rights are being bought by digital platforms almost exclusively, and they’re being brought
    0:26:49 into these digital spaces. And when you take tooling like Holoscan for Media, for example,
    0:26:53 now at literal camera level, you’re able to get immediately into a database space and a space
    0:26:59 where analysis can be run. And so, okay, so you have the fact that sports in general,
    0:27:05 since, like, people threw rocks, have been gambled upon. And so, there’s data sets around
    0:27:11 the sport. And most sports right now, particularly professional sports have like accelerometers
    0:27:15 on the athletes. Like, we have really fine data. All kinds of data, yeah. Yeah, all kinds. And like,
    0:27:19 I mean, depending upon if you’ve gone up north to San Francisco this week, like all the athletes
    0:27:23 are scanned as well anyway. So, we have that data set. You have the fact that the platforms
    0:27:28 have their own algorithms already, you know, and you start thinking about the sort of algorithms
    0:27:33 that are on those platforms as well. Then you have the fact that individual personas, individuals,
    0:27:40 and we have a tool called Persona Flow that I’ll dig into in a second,
    0:27:43 have their own discrete data sets, like 3,500 sets of data that we talked about a second ago.
    0:27:48 And then you have the fact that the images themselves can now be broken down with things
    0:27:53 like Segment Anything. I just saw the team, I really love Meta’s segmentation tool,
    0:27:58 Segment Anything, a really powerful open source tool that works really well and is a really fun
    0:28:04 tool to play with to learn a bit about segmentation. And it’s also an extremely powerful tool.
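    As a rough illustration of the segmentation layer mentioned above, a minimal use of Meta’s open-source Segment Anything model might look like the sketch below. It assumes the segment-anything Python package (with PyTorch), OpenCV and a locally downloaded ViT-H checkpoint; the frame filename is a placeholder.

```python
# Minimal sketch: run Segment Anything's automatic mask generator on one
# broadcast frame. Assumes the segment-anything and opencv-python packages
# plus the published ViT-H checkpoint saved locally as sam_vit_h_4b8939.pth.
import cv2
from segment_anything import sam_model_registry, SamAutomaticMaskGenerator

# Load a frame as an RGB array (the filename here is a placeholder).
image = cv2.cvtColor(cv2.imread("frame.jpg"), cv2.COLOR_BGR2RGB)

sam = sam_model_registry["vit_h"](checkpoint="sam_vit_h_4b8939.pth")
mask_generator = SamAutomaticMaskGenerator(sam)
masks = mask_generator.generate(image)  # one dict per detected segment

print(len(masks), "segments; largest covers",
      max(m["area"] for m in masks), "pixels")
```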
    0:28:09 So, we have these four data sets that we can take and we can say, hey, okay,
    0:28:13 let’s literally ask, yo, what would Lewis like to watch if he’s watching a basketball game?
    0:28:21 And so, I like heavy metal. I like intensity. I like combat sports. And I like sneakers.
    0:28:28 So, it’s going to spit out a highlight reel that’s like whenever people are yelling at each
    0:28:32 other, set to heavy metal, probably with a bunch of shots of sneakers in there. And all of a sudden,
    0:28:38 I’m like, yo, do I like basketball now? What’s going on, NBA, if you’re listening? I love basketball,
    0:28:44 but to that same extent, the opportunity is to really create that personalization and create access points.
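    The retrieval step being described, pulling a fan persona and a pool of game moments together and surfacing the moments that match, can be sketched minimally as below. The embedding function is a toy stand-in for a real text or video encoder, and every moment description and persona tag is invented for illustration.

```python
# Toy retrieval-augmented sketch: embed a persona and a set of game moments,
# then rank moments by cosine similarity to the persona "query". The ranked
# clips would feed whatever generation step assembles the highlight reel.
from dataclasses import dataclass
import numpy as np

@dataclass
class Moment:
    description: str
    embedding: np.ndarray  # in practice this comes from a real encoder

def embed(text: str, dim: int = 64) -> np.ndarray:
    # Stand-in embedding: hash words into a fixed-size, normalized vector.
    vec = np.zeros(dim)
    for word in text.lower().split():
        vec[hash(word) % dim] += 1.0
    norm = np.linalg.norm(vec)
    return vec / norm if norm else vec

def retrieve(persona: str, moments: list[Moment], k: int = 3) -> list[Moment]:
    query = embed(persona)
    ranked = sorted(moments, key=lambda m: float(query @ m.embedding), reverse=True)
    return ranked[:k]

moments = [Moment(d, embed(d)) for d in [
    "players shouting at each other near the bench",
    "slow motion close up of limited edition sneakers",
    "quiet free throw in the second quarter",
    "dunk followed by the crowd going wild",
]]
for m in retrieve("heavy metal intensity combat sports sneakers shouting", moments):
    print(m.description)
```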
    0:28:48 In the same sense, we saw the Super Bowl this year, and if you look at the
    0:28:53 stats on the Super Bowl, they haven’t been great lately, but we had the single most watched telecast
    0:28:58 since, like, the moon landing. Oh, was this year’s? Yeah, since the moon landing. And it’s not
    0:29:04 because like the game was particularly great. Don’t say Taylor Swift. It was, but here’s the
    0:29:08 thing. No, no, no, but hold on. I don’t like Taylor Swift either. No, no, I don’t dislike her.
    0:29:13 But it’s not my vibe at all. It’s not my thing. I think she’s a perfectly fine professional,
    0:29:17 but dude, I like Slayer. It’s not my vibe. With that said, though, it’s perfectly okay to be into
    0:29:26 the Super Bowl because of Taylor Swift. And so here’s the thing. We live in rich media to the
    0:29:31 same point that my phone is full of crab memes and yours isn’t, and Sarah’s certainly is not.
    0:29:36 Why should I have to even know Taylor Swift exists within that universe? Why can’t I just
    0:29:43 have it tailored to me? Does that create, is there a downside though that it puts us all on
    0:29:48 our own little information bubbles? I mean, I think it probably does, but I’m not a philosopher
    0:29:52 and I’m a person who leans into fun and excitement and enjoyment. And so I’m a person that tends to
    0:29:57 just think that if your core motivation is not to hurt people and is just to make people happy
    0:30:03 and have fun, then it’s probably fine. I’m sure there are silos and stuff like that. But I mean,
    0:30:08 I live in the silo where I’ve listened to the same music since I was 14 and I’m perfectly
    0:30:13 happy with that and it’s great. And leaning into that is where there’s massive opportunities because
    0:30:20 you create more joy for people and you create these spaces where we take ourselves a little bit
    0:30:24 less seriously and we realize that sports, like music, is a culture vehicle and you have the
    0:30:30 opportunity to engage with culture. And so to that end, one of the tools that I’m most excited
    0:30:34 about right now is our Persona Flow tool. Yes. Which you announced, I think it was this week.
    0:30:40 I’m looking over at Sarah to confirm that we had our formal launch of it and it does…
    0:30:45 We can launch it right now. Oh, it’s launched right now. Perfect, great.
    0:30:48 Sorry about that, Henry. Persona flow, brand new, tell us about it.
    0:30:51 So what it does is it takes the idea of a conversational system with the idea that you
    0:30:57 can build that 3,500 data point system and then talk to it as an advertiser,
    0:31:04 as a broadcaster, as an IP holder and basically say like you can make a Lewis bot and be like,
    0:31:09 “Lewis, what would you want to watch?” Well, and it would respond accordingly and you can have those
    0:31:14 conversations. It’s a way to focus group without having a focus group. It’s a way to create deeply
    0:31:20 personalized content. I’m really excited based on some of the announcements yesterday where we
    0:31:25 could start actually screening video for it. And so it could be like, “Yo, Lewis, do you like this
    0:31:29 clip? Do you think this is exciting?” Right. Then use that data to feed it back and create more
    0:31:33 generated clips and roll and roll and roll until you get to something that I would like eventually.
    0:31:38 And so it’s a form of the AI universe where my agent or my team of agents are running around out
    0:31:44 there doing all the stuff for me so that when they come back, when I actually log on, so to speak,
    0:31:50 and engage, I’m only getting the good stuff because they’ve already done all of that.
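    The screen-and-regenerate loop being described can be sketched as a toy, with the caveat that the real Persona Flow internals are not public: a persona built from a handful of data points scores candidate clips, and the scores steer what gets generated next. Every name, tag and data point below is invented.

```python
# Toy persona feedback loop: "generate" candidate clip profiles, let a persona
# score them, keep the best, and repeat. Not the actual Persona Flow.
import random

PERSONA = {
    "music": {"heavy metal", "punk"},
    "interests": {"combat sports", "sneakers", "scuba diving"},
}

def persona_score(clip_tags: set[str]) -> float:
    # Stand-in for asking the persona "do you like this clip?":
    # here it is just tag overlap with the persona's data points.
    likes = PERSONA["music"] | PERSONA["interests"]
    return len(clip_tags & likes) / max(len(clip_tags), 1)

def generate_candidates(seed_tags: set[str], n: int = 5) -> list[set[str]]:
    # Stand-in for a generation step that riffs on what scored well so far.
    vocab = ["heavy metal", "sneakers", "free throws", "crowd noise",
             "combat sports", "interviews", "scuba diving"]
    return [seed_tags | {random.choice(vocab)} for _ in range(n)]

best = {"crowd noise"}
for _ in range(10):
    best = max(generate_candidates(best), key=persona_score)
print("clip profile the persona converged on:", sorted(best))
```

    In a fuller version, the scoring call could be a conversational model conditioned on the persona’s data points and the candidates could be generated clips; the loop structure of generate, screen and feed back is what the sketch is showing.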
    0:31:54 Yeah, it functions like a chat bot in a sense that… And I think you get into really interesting
    0:31:59 insights as well. Sarah says it wouldn’t be an interview without me, without me bringing this
    0:32:03 up. But I think the best marketing activation of all time ever was a video game called ChexQuest
    0:32:09 that came out in the ’90s. ChexQuest. ChexQuest, greatest thing ever.
    0:32:13 What is it? Was it? So ChexQuest, John Oliver stole this bit from me. But ChexQuest is a… No,
    0:32:19 Stephen Colbert stole it, right? I think it was Stephen Colbert. Anyway, so it was a video game
    0:32:24 that in the ’90s, General Mills was not trending well with kids because it was Chex cereal. And
    0:32:31 Chex cereal is great. My view of Chex is permanently tainted by this video game, tainted in a positive
    0:32:36 manner. And it… I was 12 years old. I really wanted to play Doom. It was too violent. I wasn’t
    0:32:43 allowed to play Doom. I watched all my friends’ older brothers and sisters play Doom. I wanted
    0:32:48 to do it. And Chex went to John Carmack and John Romero and said, “Hey, can we license your game?”
    0:32:53 And they were like, “This is a weird idea, but whatever. Cool. I don’t care.” And I don’t know
    0:32:58 if you’ve seen pictures of John Carmack and John Romero at that period of time, but you can totally
    0:33:01 see how they would respond. It’s mostly them with flame backdrops and swords and clearly listening
    0:33:07 to Manowar. So Chex, the cereal brand, licensed Doom. And it’s literally Doom with a Chex cereal
    0:33:17 character running around shooting boogers at other boogers or shooting milk at boogers.
    0:33:22 And for real shooting boogers? Yeah, dude. That’s legit the plot. And as a 12-year-old who couldn’t
    0:33:28 play Doom, I could now play Doom and it was amazing. And I played that to death. Like I bought.
    0:33:35 And what’s interesting is it corresponded to a 300% increase in the sale of the cereal.
    0:33:40 And to this day, if I see Chex anywhere, I’m like, “Yo, ChexQuest. ChexQuest.” And so what
    0:33:47 that does and what Persona Flow will probably do is you get to this point where you start
    0:33:52 thinking about wants and needs and what people are interested in. And if you start asking like,
    0:33:56 “Hey, what are 12-year-olds wanting to do right now? Well, they want to try to play Fortnite,
    0:34:00 but they’re not allowed to for whatever reason. Well, how would we do it?” You start creating
    0:34:04 that sort of stuff. I mean, that’s what we… I was trying to think of the name and I couldn’t,
    0:34:08 but the Fortnite stand-in that we tried to pass off on my younger son before we gave in and said,
    0:34:15 “Yeah, you can just play Fortnite.” Yeah, you should just play Fortnite. I played far too much
    0:34:19 Fortnite as well, but it’s my generation’s form of golf. Totally, totally. We always like to
    0:34:25 end these conversations by asking the guests like you, “What’s next?” I feel like we’ve been talking
    0:34:31 about what’s next for the past half an hour. But is there, whether it’s a technical problem,
    0:34:37 cultural problem, any problem, and I’m using that word loosely, right? Something that a solution
    0:34:43 could help with. Technical, cultural, communicative, something in the media pipeline itself,
    0:34:49 I don’t know. Is there something that you are grappling with, excited to overcome,
    0:34:55 kind of see in the future? I mean, yeah. So I think, and I’ll hit like a philosophical
    0:35:01 beat and then I’ll go into something more technical. I think we’re about to enter the age
    0:35:05 of the subculture and we’re entering an age where within those subcultures you start to have,
    0:35:11 because, again, the death of monoculture has basically occurred at this point. It hasn’t
    0:35:15 hit big media yet. But what you’re going to start to get, and this is really exciting, is, I miss
    0:35:21 the early 2000s and ’90s where there was this massive swath of $2 million budget movies
    0:35:26 that were really meaningful and really interesting. I think as you get to these spaces where these
    0:35:31 layers of personalization and these subcultures and these audience segments are more and more
    0:35:35 defined, you start getting to the space where you can make a $2 million movie and make revenue
    0:35:40 off of that and you can make these smaller pieces of content that are more tailored and more personalized
    0:35:45 to individuals and therefore make people happier and make people excited about those pieces of
    0:35:50 content. And I think, you mentioned earlier, there’s a lot of talk of like, oh, isn’t this
    0:35:55 going to break people up? But I think it’ll be the exact opposite. Yeah, like a subculture is a
    0:36:01 community. And as we build that, it’ll be really exciting. So it wouldn’t be that you and I
    0:36:05 would see slightly different versions of the same movie tailored to each of our tastes,
    0:36:10 but it would be more like you and I might find each other in the audience of a heavy metal
    0:36:16 movie about hockey in North Carolina. 100%. Like you got to think like, and I know that
    0:36:20 we all want to think we’re very individual and we are, every individual matters. I used to know
    0:36:25 that I’m getting older. But when you do the math of like, if there’s 3,500 data points and there’s
    0:36:30 eight billion people, like there’s going to be more than a few that have more than 70% of those
    0:36:38 data points aligned. And so you do, you start getting into these points where like, you just
    0:36:42 literally described Violent Gentlemen, which is a hockey streetwear brand that was
    0:36:48 founded by the guys. And Every Time I Die started making movies. That’s what you would get. There
    0:36:54 you go. And so I think you get into those spaces where we create community through that. And we
    0:36:58 have the opportunity to create like these infinite culture reels and these infinite storytelling
    0:37:03 pieces. And I really love that as a part of AI. And I think as artists start to grapple with it
    0:37:10 more and start to engage with storytelling around it, you get to something really meaningful.
    0:37:14 And that’s really exciting. I think we are in and I’m a member of the World Economic Forum.
    0:37:20 And I think this beat is overhyped, but it’s very, very true. And sitting yesterday in the keynote
    0:37:26 in the SAP Center, which had more seats filled than the average Sharks game, that was me just
    0:37:33 being, that was me being a mean Canes fan. I’m sorry, Sharks. We took Brent Burns from you.
    0:37:39 But the, I don’t know what that means, but I’ve been to some Oakland A’s games.
    0:37:43 So here’s the thing, like you start getting into this space where we are in the midst of a
    0:37:49 fourth industrial revolution. And in my personal life, I’m reading, I committed myself to reading
    0:37:55 a lot about Ludditism and the meaning of Ludditism and why Luddites existed, which was
    0:38:00 like a legit cultural movement. And people died and they smashed machinery and stuff.
    0:38:06 But what we are in is this, as the shift occurs, I think we have the opportunity to create more
    0:38:13 and make more. And as we use systems like RAG where you start to unearth these intersectional
    0:38:18 systems, it gets really, really cool when you start having more things like Wormhole or like
    0:38:23 highlights that are focused directly at you. And I think things get less lame as well in the sense
    0:38:29 that we have the opportunity to deliver things to people that they want. I think we talk a lot
    0:38:33 about waste, but I don’t think people think enough about how much time is wasted watching
    0:38:38 shit we don’t want to watch. And like, that sucks, man. Like how much of my life was spent
    0:38:44 watching toilet paper commercials after not having been convinced to buy toilet paper by that
    0:38:48 toilet paper commercial. What you get to is the opportunity to reduce waste. And that is something
    0:38:54 like there’s a lot of stuff around sustainability and AI. And I think we have to realize that most
    0:38:58 of the cloud and the data center systems that we have in this country and globally are in sustainable
    0:39:04 facilities. And so the opportunity to get to a more sustainable future through that is really
    0:39:09 exciting. I’m also like, this is my last beat on that. I’m also really excited about what’s
    0:39:15 going on with transmission technology and particularly mobile first societies. And you get
    0:39:20 into like, you look at like places like São Paulo, Brazil, or places like Mumbai, India,
    0:39:26 where it’s a society that mostly grew up on phones rather than necessarily on PCs. And the way in
    0:39:31 which culture is consumed at this like rapid fire level and the way in which content is consumed
    0:39:37 in an innately social manner is so exciting and so interesting. And you look at the way like
    0:39:43 WhatsApp is used and how those systems work together. It’s so interesting to see.
    0:39:49 And I do fundamentally believe that access to internet is a basic human right. And as you see
    0:39:54 these groups, more groups get access to internet and get access to a voice. Storytelling is so
    0:39:59 fascinating. So yeah, that’s, I mean, I’m very optimistic about things. The keynote yesterday
    0:40:05 was mind boggling and really, really fun and exciting. And I think it’s really cool to see
    0:40:10 NVIDIA on the stage that they’re at right now. It’s a company I believed in like most of my adult
    0:40:16 life, to be honest. I fired my first financial advisor ever for not letting me invest in NVIDIA
    0:40:21 back in 2015. And I think that, you know, it’s interesting seeing this, I was talking to our
    0:40:29 founder yesterday and he sees this as becoming like a traditional marketing beat of our year.
    0:40:33 Right. Where this is going to be like the CES part two. And like, you know, you see this,
    0:40:38 this was, this used to have the vibe of SIGGRAPH. Yeah. And now you see these people who are like
    0:40:44 legit business people standing next to people like Joelle Pineau and like people like Rev where
    0:40:49 you’re like, whoa, that’s, that’s that person. Oh my goodness. Nerd superheroes. And it’s,
    0:40:55 it’s exciting to see that culture grow and to see the way the company’s been built. And I’m really
    0:40:59 excited. I think the culture is, is amazing and it’s been really fun. Excellent. We’re going to
    0:41:03 leave it there. Lewis Smithingham, MediaMonks, people who want to learn more about MediaMonks
    0:41:08 about any of the stuff we’ve been talking about. Where would you direct them to go online? Probably
    0:41:13 our website, media.monks.com. But you can also like hit us up on any different platform,
    0:41:19 anywhere, LinkedIn, Instagram, wherever you want to hit us up. We’re very easy to find. My last
    0:41:25 name is, I have the SEO on lock. So I’m pretty easy to find. Good enough. And Wormhole, is Wormhole
    0:41:32 making more appearances? Wormhole will appear all over the world and there may or may not be more
    0:41:36 than one wormhole. Love it. Teaser at the end. Lewis, thank you so much for joining the pod.
    0:41:41 Enjoy the rest of your time. Good luck with your session. Thank you.
    0:41:44 [Music]
    0:41:48 [Music]
    0:41:52 [Music]
    0:42:04 [Music]
    0:42:16 [Music]

    Meet Media.Monks’ Wormhole, an alien-like, conversational robot with a quirky personality and the ability to offer keen marketing expertise. Lewis Smithingham, senior vice president of innovation and special ops at Media.Monks, a global marketing and advertising company, discusses the creation of Wormhole and AI’s potential to enhance media and entertainment with host Noah Kravitz in this AI Podcast episode recorded live at the NVIDIA GTC global AI conference. Wormhole was designed to showcase Monks.Flow, an AI-powered platform that streamlines marketing and content creation workflows. Smithingham delves into Media.Monks’ platforms for media, entertainment and advertising and speaks to its vision for a future where AI enhances creativity and allows for more personalized, scalable content creation.

    https://blogs.nvidia.com/blog/media-monks-ai-podcast/

  • Performance AI: Insights from Arthur’s Adam Wenchel – Ep. 221

    In this episode of the NVIDIA AI Podcast, recorded live at the GTC 2024, host Noah Kravitz sits down with Adam Wenchel, co-founder and CEO of Arthur. Arthur enhances the performance of AI systems across various metrics like accuracy, explainability, and fairness. Wenchel shares insights into the challenges and opportunities of deploying generative AI. The discussion spans a range of topics, including AI bias, the observability of AI systems, and the practical implications of AI in business. For more on Arthur, visit arthur.ai.

  • AI2’s Christopher Bretherton Discusses Using Machine Learning for Climate Modeling – Ep. 220

    Can machine learning help predict extreme weather events and climate change? Christopher Bretherton, senior director of climate modeling at the Allen Institute for Artificial Intelligence, or AI2, explores the technology’s potential to enhance climate modeling with AI Podcast host Noah Kravitz in an episode recorded live at the NVIDIA GTC global AI conference. Bretherton explains how machine learning helps overcome the limitations of traditional climate models and underscores the role of localized predictions in empowering communities to prepare for climate-related risks. Through ongoing research and collaboration, Bretherton and his team aim to improve climate modeling and enable society to better mitigate and adapt to the impacts of climate change.