GTC DC ’25 Pregame – Chapter 2: Agentic AI for Every Industry

0:00:16 Hello, and welcome to a special GTC edition of the NVIDIA AI podcast. This is the second
0:00:22 of five episodes on the road to GTC Live in Washington, D.C. Bonus conversations you won’t
0:00:27 hear anywhere else. Today, we’re exploring agentic AI for every industry. Intelligent
0:00:32 systems are beginning to plan, reason, and act, reshaping how industries work. In this
0:00:37 episode, builders share how these intelligent capabilities are moving from research into
0:00:44 real-world impact. Enjoy the conversation and visit ai-podcast.nvidia.com afterwards to check
0:00:51 out our library of over 275 episodes of the NVIDIA AI podcast. You know, when people talk
0:00:56 about winning the AI race, it’s not just about faster chips or bigger models. It’s about scale
0:01:01 and deploying the American technology stack across the world. That’s right, Brad. From
0:01:06 semiconductors to frameworks, from the cloud to the developers who built it, I believe America
0:01:13 is currently winning that race. And that dominance in AI is really fueling a new era where millions
0:01:21 of AI agents will exist to help us in every part of business and in life. AI is now more
0:01:28 than a single application. AI systems now decide, design, and deliver. Across sectors, autonomous
0:01:35 agents are transforming how work gets done from strategy to execution. To discuss how agentic AI is
0:01:41 transforming every industry, we’ve put together another incredible panel. Starting with Aravind Srinivas,
0:01:51 co-founder and CEO of Perplexity. Shiv Rao, founder and CEO of Abridge. Scott Wu, founder and CEO of
0:02:00 Cognition. And of course, George Kurtz, the founder and CEO of CrowdStrike. You know, Aravind, let’s start with you.
0:02:11 Nobody has innovated more on the chat bot, on search, on the browser now in AI than Perplexity. You’ve consistently been a step ahead,
0:02:23 although fighting maybe up a mountain against bigger incumbents. So tell us now what comes next for the agent? What do you see out there? What is hiding in plain sight?
0:02:34 Yeah, first of all, thanks for having me here. Our vision for the browser is really not to launch yet another browser. We think of Comet, our browser,
0:02:51 as a personal assistant for all of us here. Essentially, a second brain to delegate all the mundane, boring work. So it gives us a lot more time to explore and just be ourselves on the web. The internet is just a lot better if you can ask questions from wherever you are, whether you’re on a web page,
0:03:14 in a Slack workspace, or actually on an AI tool. It doesn’t matter. You can just ask questions from wherever you are. So that’s what we learned the first time, when we launched Comet. The number of questions a user asks on Comet is 6 to 18x more than what they ask on Perplexity even on other browsers. So that’s just because the AI is there with them everywhere.
0:03:32 And they’re starting to do a lot of awesome things like setting up their own Shopify stores, setting up their own Facebook ads, listing items on Facebook marketplace, all those sort of things. So we’re just beginning to see this explosion of people getting a lot more agency and autonomy on their own.
0:03:42 I think we’re just at the beginning. We’re going to come back and talk a lot more. I’m going to ask you some of the questions I asked the last panel on when I’m going to get my agent to book my hotel for me. And I know you have opinions on that.
0:03:52 But Scott, Cognition is one of the fastest growing startups in history, building coding agents that are helping to power some of the country’s largest enterprises.
0:04:03 You know, there are a lot of people worried that the AI hype is ahead of the substance, right? You’re on the front lines selling to America’s biggest enterprises, a solution that’s helping improve their business.
0:04:15 So from those front lines, help us understand where is the substance of AI coding today? How is it transforming these companies? And do you think it’s going to keep up with all the excitement and hype?
0:04:26 Yeah, absolutely. I mean, I think right now, you know, in code, especially, you know, you really feel this, which is, you are just faster as a software engineer, if you’re working with the best AI tools and doing the most with that.
0:04:39 And there’s a range of kind of the productivity gains you see on different use cases, you know, on some of the more kind of gritty, very particular, you know, use cases, you might see speed ups on the order of 20%, 30%, 50%, something like that.
0:04:45 For a lot of what we call the engineering toil, you know, that’s things like migrations and replatforms and modernization.
0:04:56 Honestly, we’re seeing gains that are in the neighborhood of 6 to 10x, where basically one hour of an engineer’s time using the best tools corresponds to about 6 to 10 hours not using the tools.
0:05:04 And so, you know, I think the gains are very clear. And I think the thing that’s really exciting about it is every team everywhere has so much more software to build.
0:05:18 And that’s the best part of it, right? And every team has, you know, you told me this, I think, a year or two ago, which is every engineering team out there has 50 projects that they want to go work on, but they have to choose four, because that’s how things are with engineering, right?
0:05:22 And so, you know, the ability to speed up and to do a lot more is really, really exciting for us all.
0:05:31 So, Shiv, I spent, was fortunate enough to spend 10 years on the board of Austin’s largest hospital.
0:05:37 And I was the tech guy coming in, and man, it was like time stood still.
0:05:40 Things were slow, getting paid for things.
0:05:40 If there was a box sitting in the hallway, you would get JCAHO to come after you.
0:05:57 I’m curious, though, with all those pressures inside of our healthcare systems, how is AI putting the patient in front of the line here?
0:06:00 Because sometimes that gets overlooked in the bureaucracy.
0:06:01 Yeah, absolutely.
0:06:06 Well, it’s been a wildly historic moment for AI and healthcare over the last few years.
0:06:10 And part of the moment is the problem, is the pain point.
0:06:14 Two out of five doctors don’t want to be doctors in the next two to three years.
0:06:19 There’s a JAMA article that suggested that 30% of nurses don’t want to be nurses in the next 12 months.
0:06:21 So we have a public health emergency.
0:06:22 We have to do something about it.
0:06:22 And that’s where AI comes in, because with AI, and in our case with Abridge, we’re able to unburden clinicians so that they can make eye contact,
0:06:35 so that they can be fully present with their patients, knowing full well that a lot of the clerical work that they’ve got to do,
0:06:40 that that’s getting taken care of for them, so that they can just focus on the person, focus on the care that they’re delivering.
0:06:42 That’s incredible.
0:06:45 George, great to see you.
0:06:45 You too.
0:06:46 Thanks for coming on.
0:07:00 So in every major inflection point, whether it was mainframe, to mini, to client server, PC, social, local, mobile, we fractalize our applications and our software.
0:07:03 And what happens in every one of those is security gets more difficult.
0:07:09 How in the age of AI is the risk higher?
0:07:09 And then how is AI being used to actually help us be more secure?
0:07:18 Well, when we think about technology, and this is the great part about where we are today,
0:07:24 if you look at the slope of the technology innovation curve, security has to parallel the slope of that curve.
0:07:28 So in every inflection point that you just mentioned, you have to have security.
0:07:30 30 years ago, it was an afterthought.
0:07:31 It was a bolt-on.
0:07:34 Now, thankfully, it’s being integrated into the stack.
0:07:41 And I think what we’ve seen over time is where security, where technologies have sort of seams,
0:07:44 that’s where the adversary lives, where you’re trying to connect things together.
0:07:47 So if we can build it in foundationally, it’s going to be much better off.
0:07:50 But from an AI perspective, what does it really buy you?
0:07:56 My view is that data is the key to solving almost every security use case.
0:07:56 So the more data you have, the more use cases you can solve.
0:08:04 And obviously, AI seems to be a good opportunity to deal with lots of that data.
0:08:11 So from our perspective, what we try to do and what we’ve seen in the adversary universe is
0:08:17 the time has dramatically been cut for the adversary to actually be able to find vulnerabilities,
0:08:19 exploit them, get in, and pivot.
0:08:22 It used to be months, then weeks, then days, now minutes.
0:08:28 And in one of the cases, we found within 51 seconds, an adversary had dropped onto a system
0:08:29 and pivoted off.
0:08:34 So the only way you’re going to keep up with that is the automated SOC, the AI native SOC,
0:08:40 where you’re driving AI agents doing the work of a security analyst that cannot keep up with
0:08:40 the threats.
0:08:48 And the challenge that we have right now is that AI, in many senses, is great because we’re able to deal with these threats.
0:08:54 But it’s minting new adversaries, because it’s now democratized destruction.
0:08:59 And it’s given this level of sophistication to a much broader group that are not as sophisticated.
0:09:12 So, George, I’ll call it the value chain of security threats goes all the way from a $5 IoT endpoint to the hyperscaler data center and everything in between.
0:09:19 I’m curious, is there a central place that we can secure everything?
0:09:27 Or do you need to have these every step, every link, let’s call it confidential computing?
0:09:30 It’s not an easy problem to solve.
0:09:33 And if you think about it, I always jest, you know, I’m on all these panels every year.
0:09:37 And, you know, for the last 30 years, we still talk about bad passwords and identity.
0:09:39 Like, we still haven’t solved it, right?
0:09:40 I mean, that’s kind of the state we’re in.
0:09:41 We’re getting better.
0:09:44 But you have to look at where compute happens.
0:09:46 Obviously, cloud is a big element.
0:09:48 But now, with AI, it’s being pushed more to the edge.
0:09:52 You know, it was at the edge, and it’s in the cloud, then it goes back and forth.
0:09:54 Now it’s all over the place.
0:10:01 So from my perspective, you’ve got to apply the right security technologies to each of those technologies.
0:10:04 And then you’ve got to connect the dots across them.
0:10:06 There isn’t one magic bullet that’s out there.
0:10:07 There isn’t one company.
0:10:07 There isn’t one technology that can secure everything.
0:10:13 So it’s using the right security for the right technologies at the right time.
0:10:18 Shiv, when you think about, I love your story, right?
0:10:25 A cardiologist by day and, you know, leveraging AI agents to solve healthcare problems by night.
0:10:30 Today, it’s about translating those doctor-patient conversations into the health record.
0:10:36 But when you look ahead and think agentically, where is AI going to have the most impact on your practice?
0:10:36 What are you most excited about as the next step for Abridge and what’s happening out there?
0:10:42 Yeah, absolutely.
0:10:49 I think a big part of our thesis is that in the next five to 10 years, we’re not going to be able to fully automate a doctor or a nurse.
0:10:57 And if we’re not fully automating them, then the conversations that they’re having with their patients are really upstream of so many of the workflows that happen in healthcare.
0:10:57 So if I see a patient in clinic without this technology, my back is turned to them and I’m not really paying attention to them.
0:11:05 I’m typing the whole time.
0:11:08 I’m not really like making eye contact.
0:11:09 I’m not being present.
0:11:11 With this technology, I’m fully focused on them.
0:11:14 My documentation is getting done for me.
0:11:14 But also, in this country and around the world, we as clinicians are not compensated for the care that we deliver.
0:11:22 We’re compensated for the care that we documented that we deliver.
0:11:22 So essentially, these notes are bills in healthcare, and being able to generate the note the right way, in the most compliant way that checks off all the boxes for those billing and coding experts, as they call them, means that you can keep the lights on for the health system.
0:11:45 Now you can start to remove any amount of waste in the system that is basically taken up by inefficient offshore services.
0:11:47 Go ahead.
0:11:55 Scott, the meme out there is that developer tools are going to mean that we don’t need any developers anymore.
0:12:01 And, you know, it’s funny, if you go back to the old days, machine language to COBOL.
0:12:04 Once COBOL showed up, it was, we’re not going to need any more developers.
0:12:22 What happened is every successive generation of tools, we moved to IDEs, and now we have some pretty amazing products, including yours, that can help do code assist to accelerate time to good program and a lot more.
0:12:25 I’m curious, what’s the end game here?
0:12:28 How far can this be pushed?
0:12:28 Is it like what we saw with the iPhone meme, which was that no photographers are going to have a job?
0:12:38 And I see photographers out there.
0:12:46 But what it did do is smartphones did democratize taking pictures that looked, you know, pretty good.
0:12:49 So how far does it go in the programming space?
0:12:51 Yeah, no, it’s a great point.
0:12:58 And to your point, I mean, I think software, it’s assembly and COBOL, and maybe a long time ago, it was punch cards.
0:13:02 And, you know, obviously now we have Python, and things are running on the cloud.
0:13:04 And there’s been a lot of form factor changes.
0:13:11 But I think if you think about what is software development, at the end of the day, all you’re trying to do is it’s telling your computer what to do, right?
0:13:15 That’s kind of how I would describe programming, software engineering, whatever you call it.
0:13:18 It’s telling your computer, here’s exactly what I want it to do, and having it go and do that.
0:13:25 And, you know, I think it will always be up to us as humans to decide what the computers should do, right?
0:13:33 And I think getting to this kind of platonic ideal where you can really just work with your computer, you know, Jarvis style and tell your computer what to do is where this is going.
0:13:33 And I think, you know, people talk about the Jevons paradox, right?
0:13:46 And I think code is perhaps the best example of it, which is, again, as you’re saying, you know, we’ve already made software so much more efficient over the last several decades, right?
0:13:55 I mean, every software engineer, what we all love doing as programmers is going on and automating processes and figuring out how to make this part faster and make this cleaner and simpler and so on, right?
0:14:01 And despite that, you know, the number of software engineers has just gone up and up because we have so much more demand for software.
0:14:01 Aravind, as you think about, I mean, as you know, one of the places I’m most excited is how this is going to transform the consumer experience.
0:14:15 I think all of us have come of age in the age of 10 blue links, right?
0:14:21 And Google was such a breakthrough in terms of information retrieval, transformed all of our lives.
0:14:28 But for my, you know, 17-year-old, they wouldn’t dream of spending their time looking at 10 blue links when they could just get answers.
0:14:31 Where do we go?
0:14:36 I asked the question on the last set, when am I going to be able to book my hotel in D.C.
0:14:41 And, you know, just talk to Perplexity or talk to ChatGPT.
0:14:43 Say, book it next Tuesday at the lowest price.
0:14:45 It already knows what hotel I like to stay at.
0:14:47 It already knows the room I want.
0:14:50 How far are we from actions, not just answers?
0:14:54 A few months to actually do that particular prompt you’re asking for.
0:14:58 So hopefully next time you come here, you can use our product and get it done.
0:15:00 But here’s the thing.
0:15:05 Like three years ago, we started this transition from 10 blue links to answers when we launched.
0:15:11 And it had a big impact in the sense that even Google is starting to do the same thing.
0:15:19 But the real transition entirely from just going and booking your hotel through a keyword on Google
0:15:25 to just asking your agent to go do that for you is only going to be possible with something like a browser agent.
0:15:27 So that’s essentially why we wanted to build Comet.
0:15:33 And essentially what you’re talking about is it should have a deep understanding of you.
0:15:34 It’s your personal AI.
0:15:41 So it should have a sense of what rooms you like, what kind of views you like, what is your budget, where do you typically stay,
0:15:41 that is, what is the set of hotels you typically stay at, in case it has to deal with constraints and availability.
0:15:55 And then it has to actually go to the website of the hotel and check for these availabilities and actually go and use your card and make the transaction on your behalf.
0:16:10 So it’s essentially advantageous for a company like us, which has access to the web, we have our own index, we have our own browsing infrastructure, to be able to do all this with the help of the most capable frontier reasoning models.
0:16:19 And as these models are advancing in their reliability and their capability every few months, it’s possible to actually create this end user experience.
0:16:29 And our goal is that like, even if it takes a few minutes to get it done, you should be able to take your mobile app and just speak out the task, forget about it, delegate it.
0:16:29 It’s running in the background on the server asynchronously, comes back to you, elicits feedback whenever it’s not sure, and actually gets stuff done.
0:16:37 That’s our vision for the mobile version of our Comet browser.
0:16:41 It should be running in the background asynchronously and able to multitask and do like hundreds of different tasks like these.
0:16:53 There’s been a complaint of some browser-based agents that they’re just slow, right?
0:17:03 They go in, they kick the tires, and I needed to get groceries, and my gosh, it didn’t fully understand what I wanted, and I could have done it quicker.
0:17:10 Very related, I think, to your travel, when will we get to the point where it’s faster?
0:17:13 Or is that not important because it can be doing it in the background?
0:17:17 Yeah, so that’s the key distinction between agents and chat.
0:17:22 It’s not necessary for agents to be living in the chat UI.
0:17:26 In fact, it should be more asynchronous and running in the background.
0:17:28 Chat is synchronous interaction.
0:17:34 So when you give your co-worker or an intern or your assistant a task, it’s not like they finish it in one second.
0:17:36 So why would you want the AI to do that?
0:17:38 Definitely not my interns.
0:17:44 Yeah, but the key point I’m trying to make is it can parallelize, it can multitask, you can do hundreds of tasks.
0:17:51 You can call a plumber, you can call five different plumbers at the same time, find the best option for you.
0:17:55 Whoever can come earliest at the best price and get it done.
0:17:59 It’s basically humanly impossible to do five calls at the same time for one person.
0:18:02 So that’s the kind of thing that we are imagining for agents.
0:18:08 It’s still not a paradigm that everybody is used to, where you ping somebody on Slack, give them a task.
0:18:10 You don’t expect them to reply instantly.
0:18:14 But on a chatbot, you want the answer fast and you’re like, oh, this product is pretty slow.
0:18:18 It’s slow because it actually takes, physically it takes time to get things done.
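The "five plumbers at the same time" idea can be sketched with Python's asyncio. This is only an illustration of fanning out requests concurrently and picking the best reply, not how Comet or any Perplexity agent is built. The plumber names, arrival times, prices, and the `request_quote` helper are all hypothetical, and the "call" is simulated with a short sleep.

```python
import asyncio

# Hypothetical quotes: (name, hours until arrival, price in dollars).
# These values are invented for illustration only.
PLUMBERS = [
    ("plumber_a", 4, 180),
    ("plumber_b", 2, 220),
    ("plumber_c", 2, 150),
    ("plumber_d", 6, 90),
    ("plumber_e", 3, 160),
]


async def request_quote(name, hours_until_arrival, price):
    """Stand-in for a slow phone call or web form; sleeps instead of doing real I/O."""
    await asyncio.sleep(0.01)
    return {"name": name, "hours": hours_until_arrival, "price": price}


async def find_best_plumber(plumbers):
    # Start every request at once and wait for all of them together.
    quotes = await asyncio.gather(*(request_quote(*p) for p in plumbers))
    # Prefer the earliest arrival, breaking ties by the lower price.
    return min(quotes, key=lambda q: (q["hours"], q["price"]))


if __name__ == "__main__":
    best = asyncio.run(find_best_plumber(PLUMBERS))
    print(best["name"], best["hours"], best["price"])
```

Because the five requests overlap, the total wait is roughly that of the slowest single request rather than the sum of all five, which is the advantage of background agents described above.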
0:18:19 Great point.
0:18:26 George, I think security is one of the most important areas in AI for part of the reason that you laid out,
0:18:30 which is we’ve democratized the business for attackers.
0:18:32 Two questions.
0:18:38 Number one, if data really is kind of the primitive to AI being able to help defend,
0:18:45 then does this mean that advantage goes to the security companies that are at the largest scale,
0:18:47 that have access to the most data?
0:18:52 So I’ve heard your company, I’ve heard Palo Alto talk about platform systems,
0:18:54 now increasing advantages of scale.
0:18:54 And then the second question, I think, is if you’re a startup, you know, we’re an investor in XBOW,
0:19:01 which has, you know, built an AI hacker to help companies offensively try to get ahead of this.
0:19:11 Do startups have the opportunity, given that scale disadvantage,
0:19:15 to continue to be the disruptors in security that they’ve been in the past?
0:19:17 Yeah, two good questions.
0:19:22 So I do think scale is incredibly important now more than ever.
0:19:22 When you look at competitive advantages and moats, one of them is scale.
0:19:31 It’s just, you know, not only the amount of data you have, but the customers you can touch, right?
0:19:33 And that’s the whole platform play, as you know.
0:19:40 I think from a data perspective, there’s a lot talked about, well, we collect data or they collect data.
0:19:46 It’s how you collect it, it’s how you curate it, it’s how you label it, and it’s what you do with it.
0:19:47 It’s not just a pile of data.
0:19:49 And the context is really important.
0:19:56 And one of the things that we focused on since I started the company was never losing the context of when we collect data at the endpoint to the cloud.
0:20:00 We’ve got a mini graph on the endpoint, we’ve got big graphs in the cloud, we never lose context.
0:20:02 And this is the key.
0:20:05 It’s not about collecting a pile of data and putting it somewhere.
0:20:07 It’s never losing the context.
0:20:08 That’s number one.
0:20:18 And I think that has served us well because it allowed us to solve new use cases by creating new modules and with very little effort in terms of the modules that we add on, right?
0:20:22 Because we’ve already collected it once and it ties into the business model.
0:20:26 I think with respect to startups, I think it’s one of the best times to be a startup.
0:20:29 There’s so many things that you can do today in security that you couldn’t do.
0:20:31 We had to build all this stuff, right?
0:20:33 We were like pioneers in AWS.
0:20:35 They didn’t have all these services.
0:20:36 We had to build the hard stuff.
0:20:40 We couldn’t use all these APIs and services that were out there.
0:20:40 So I think, and the startup you mentioned, XBOW, is really cool because now they’re at the top of the leaderboard for finding vulnerabilities with all the bug bounty programs.
0:20:52 It’s actually really cool technology.
0:20:56 So if you’re a startup in security, I think you carve out your niche.
0:20:56 You have a lot of advantages that we, some of the bigger players, didn’t have.
0:21:01 You have speed as your advantage.
0:21:04 And you can connect to all these APIs that just weren’t available.
0:21:12 So if you do what you do really well, either you’re going to get really big and expand horizontally, or you’re going to be part of a broader company.
0:21:13 Two good outcomes.
0:21:20 But I think now is one of the best times to be a startup in security because you can focus on the areas that really matter.
0:21:26 And then companies like ours and others that are out there look at those and go, okay, we want that to be part of our platform.
0:21:26 George, I want to go up a level a little. We’re here in DC, obviously.
0:21:40 Is there something that Washington can do to help make our country more secure in this new age of AI?
0:21:41 There’s a lot that we can do.
0:21:42 I had some meetings already yesterday.
0:21:43 I think there’s two things.
0:21:47 One is there’s a technology piece, which I’ll come back to.
0:21:50 But the first is the procurement piece.
0:21:50 And in most cases, and I’ve been selling into Washington for the better part of 30 plus years, they’re buying technology that’s five years old.
0:22:00 Because their procurement cycles take so long.
0:22:01 It takes forever to get through it all, right?
0:22:05 So we’ve got to come up with a better way to procure these.
0:22:09 And instead of most big enterprises, you guys deal with them.
0:22:10 You know what they do?
0:22:13 They buy once for all their subsidiaries and their companies.
0:22:16 And the government, it’s a little piece here or there, and it’s just piecemeal.
0:22:19 So they’ve got to figure out the procurement.
0:22:22 And then on the technology side, they need to be forward-leaning.
0:22:22 And I think we’re in a position currently now with this administration where they are forward-leaning.
0:22:28 They think like a business, right?
0:22:29 Not like a government.
0:22:29 And the key is they need to be implementing the agentic SOC.
0:22:47 They need to be pioneering areas that haven’t been done before with companies like ours and others to drive automation and implement technology that’s, you know, future-proof for the next X number of years, not technology from the last five years.
0:22:59 So if you combine speed of procurement with the ability to deploy technologies and partner public-private partnership, I think we can get to what I would call security AGI.
0:23:00 This is my goal.
0:23:02 How do we get to security AGI?
0:23:05 And how do we create the autonomous SOC?
0:23:08 I mean, I think it’s a really important point.
0:23:08 And of course, our AI czar, David Sacks, is looking for ways that he can drive further efficiencies in the administration.
0:23:16 And I think that’s a great example.
0:23:18 Procurement’s not sexy.
0:23:26 But if we’re buying technologies that are five years old in the age of AI, in AI dog years, that’s like 50 years.
0:23:26 Exactly.
0:23:30 And so that’s a suggestion we’ll certainly take to him.
0:23:38 I want to come back to this just question we had on the last panel about, you know, power and the primitives and the cost of inference, right?
0:23:41 All your companies are big consumers of inference.
0:23:50 What do you see happening in the cost that, you know, that effectively is a cost of goods sold for many of your companies?
0:23:53 What do you see happening in the cost on the inference side?
0:23:56 Are we bringing down the cost fast enough?
0:24:04 What are things you might be doing, you know, creatively with clouds, with on-prem, et cetera, in order to drop that inference cost?
0:24:06 Or is it something that you think about at all?
0:24:06 Maybe starting with you, Aravind.
0:24:07 Yeah, so I think like a lot of people predicted the cost would roughly halve every three months or something like that.
0:24:16 I don’t know if that trend is still continuing.
0:24:20 Costs will continue to drop as good models continue to emerge.
0:24:20 For example, you could see Haiku 4.5 from Anthropic was pretty good.
0:24:30 What we’re doing is driving a lot of our own inference workloads.
0:24:34 So we work with NVIDIA and building our own inference libraries.
0:24:37 And we use that to serve the best open source models.
0:24:42 We collect a lot of tokens from all the several tens of millions of users that we serve.
0:24:42 And we use that to post-train our own versions of these open source models and serve them.
0:24:49 And that helps to bring down the cost a little bit.
0:24:57 We are really hopeful for GB200s to be like, you know, much more efficient compared to the H200s that we are currently serving on.
0:25:01 And that’s hopefully going to lead to some reduction in cost too.
0:25:05 In addition to that, we’re also introducing new subscription tiers.
0:25:10 So we have a Perplexity Max subscription, which costs $200 a month.
0:25:13 But we’re introducing the concept of background agents there.
0:25:17 Agents that will reply on your behalf, on your emails, draft your replies while you’re sleeping.
0:25:26 You can just add that agent to your email address, just schedule your meeting for you, automatically tag your emails as different categories.
0:25:31 Imagine that sort of agent booking your tickets, booking your flights, booking your restaurant reservations.
0:25:31 I think $2,000 a year is pretty cheap for something that can do all of this in parallel at the same time with all your personal context, right?
0:25:43 And that’s going to obviously need a really frontier reasoning model that’s going to be expensive to serve.
0:25:49 But you’re going to make way more in return because people are going to use it to make their life a lot better.
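The prediction mentioned at the start of this answer, that cost would roughly halve every three months, compounds quickly. The sketch below only works through that arithmetic. The three-month halving period is the figure quoted in the conversation, which the speaker himself is unsure still holds, and the function name and the starting cost of 1.0 are illustrative assumptions.

```python
def projected_cost(initial_cost: float, months: float,
                   halving_period_months: float = 3.0) -> float:
    """Cost after `months` if it halves once every `halving_period_months`."""
    return initial_cost * 0.5 ** (months / halving_period_months)


if __name__ == "__main__":
    # Four halvings in twelve months leaves 1/16 of the starting cost.
    for months in (0, 3, 6, 12):
        print(months, projected_cost(1.0, months))
```

Under that assumption a year brings four halvings, so a workload would cost one sixteenth of what it did at the start.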
0:25:53 Scott, maybe for you guys, what are you doing?
0:25:59 You know, I know a lot of the deep reasoning that you’re using consumes a lot of inference.
0:26:01 What are you seeing out there and what are you thinking about in the world?
0:26:02 Yeah, yeah.
0:26:07 So one big thing to call out is, you know, agents especially are extremely compute hungry.
0:26:12 And maybe one way to think about it is, you know, you go to ChatGPT and you say, all right, who was the fourth president, right?
0:26:13 And it gives you the answer.
0:26:15 That’s one query, one answer, right?
0:26:18 If you go to Devin and you say, hey, I’ve got this bug.
0:26:26 Can you go click through the product yourself, reproduce the bug, check the logs, see what went wrong, maybe try and make some fixes and then go and test the code and make sure all that works.
0:26:31 That’s hundreds of queries or even thousands of queries that come from just one human ask.
0:26:34 And so, you know, for better or for worse, they are extremely compute hungry.
0:26:43 With that said, you know, the way that I like to say it is the productivity gains that we’re getting are so massive that obviously, you know, we’re not going to have the GPUs and just say, oh, just turn off the GPUs.
0:26:44 We’re not getting enough value out of them, right?
0:26:49 And so I think a lot of what it looks like is kind of optimizing on that spectrum.
0:26:53 And I think the models are getting smaller and faster and smarter, you know, all the time.
0:27:06 But maybe one thing I would point out is there’s kind of this curve of intelligence that always exists where, you know, the absolute smartest model that you could have is also pretty slow and bulky and so on.
0:27:08 And then you have something maybe that’s almost as smart, but is much faster.
0:27:13 And then you have like a really, really fast one that’s, you know, that next level.
0:27:24 And one of the big things that we have to think about with Devin, because Devin is a compound system that uses multiple models, is basically at each point in time, always using the right model for the job, right?
0:27:28 And so, you know, hardest step of this debugging problem, you want to put all the reasoning into it.
0:27:29 You want to do the smartest thing.
0:27:36 If it’s just clicking around the website and doing steps one, two, three, you know, something that’s fast and just kind of like efficient gets the job done, right?
0:27:38 So it’s kind of finding that mix between them.
0:27:49 So, you know, this ensemble model approach is a mechanism you’re deploying in order to not only drive down the cost of inference, but drive down the cost of model use.
0:27:49 That’s right.
0:27:59 Yeah, so you’re able to kind of use this frontier of models where, you know, the biggest and most expensive models, you only use in the times that you absolutely need them.
0:28:07 And then for a lot of these other kind of day-to-day tasks that don’t need the maximum intelligence, you’re able to do that faster, cheaper, and so on.
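The "right model for the job" idea described above can be sketched as a small router that sends each step to the cheapest tier able to handle it. This is not Devin's actual implementation. The tier names, difficulty scores, and relative costs are hypothetical, and a real system would have to estimate difficulty rather than receive it as a number.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelTier:
    name: str
    max_difficulty: int   # hardest step (1-10) this tier is trusted with
    relative_cost: float  # cost per call relative to the cheapest tier


# Hypothetical tiers, ordered from cheapest to most expensive.
TIERS = [
    ModelTier("fast-small", 3, 1.0),
    ModelTier("balanced", 7, 5.0),
    ModelTier("frontier", 10, 25.0),
]


def pick_model(difficulty: int) -> ModelTier:
    """Return the cheapest tier whose ceiling covers the step's difficulty."""
    for tier in TIERS:
        if difficulty <= tier.max_difficulty:
            return tier
    # Anything harder than every ceiling falls through to the largest model.
    return TIERS[-1]


if __name__ == "__main__":
    steps = [
        ("click through the site", 2),
        ("read the logs", 5),
        ("find the root cause", 9),
    ]
    for task, difficulty in steps:
        print(task, "->", pick_model(difficulty).name)
```

With these made-up numbers, only the root-cause step pays the frontier price, while routine clicking and log reading run on the cheaper tiers.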
0:28:08 There’s a lot of chatter out there.
0:28:11 That Cursor may be building their own model or attempting to build their own model.
0:28:14 Is that something in the future for Cognition as well?
0:28:18 Yeah, so we do a lot of our own post-training of various models.
0:28:23 And, you know, we produce models that are really well-suited for our particular tasks.
0:28:29 And I think especially when you get into the depth of software engineering, obviously, you know, a lot of the models have a lot of code data trained into them.
0:28:30 And that’s one thing.
0:28:37 But, you know, if you said, hey, my Kubernetes pod is, you know, is going wrong, and could you please just, like, take a look at my logs and see what happened?
0:28:43 That’s obviously a very specific task, which you can train a smaller and faster model to do very well.
0:28:48 And so, you know, specialization of tasks is where we see a lot of this model training coming into play.
0:28:55 So, Shiv, I mean, healthcare, the expenditures are massive, but there’s so much pressure.
0:29:00 How does the cost of inference impact you and what you’re doing at Abridge?
0:29:02 Yeah, it’s a similar playbook for us.
0:29:06 We’re live in over 200 of the largest health systems in the country.
0:29:10 We’re touching well over 70 million patients, you know, every year.
0:29:17 And we’re going really, really deep on a very narrow use case, but it allows us to get a lot of edits from users on a daily basis.
0:29:27 So, our playbooks are similar in the sense that we might hit a model maybe 19 times, let’s say, 20 times to create one set of outputs for any given doctor-patient encounter.
0:29:33 If all of those hits were to an off-the-shelf commercial model, we’d be out of business, obviously.
0:29:40 And so, so much of our playbook is around distillation, it’s around open models, it’s around fine-tuning, and it’s around post-training.
0:29:47 And so, being able to even create that feedback loop in healthcare is easier said than done because of, obviously, privacy and security being so important.
0:29:49 This information is sacrosanct.
0:30:02 So, a lot of what we have to do is build pipelines that allow us to de-identify data, then certify that we’ve done it the right way, and then, you know, build the systems that allow our models to continually improve with every single encounter across the country.
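A toy sketch of the de-identify-then-certify step described above. The regex patterns, placeholder labels, and sample note are all hypothetical; real clinical de-identification covers far more identifier types and is not shown here, and nothing in this block reflects Abridge's actual pipeline.

```python
import re

# Hypothetical patterns for a few identifier types (illustration only).
PATTERNS = {
    "PHONE": re.compile(r"\b\d{3}-\d{3}-\d{4}\b"),
    "DATE": re.compile(r"\b\d{1,2}/\d{1,2}/\d{4}\b"),
    "MRN": re.compile(r"\bMRN[:\s]*\d+\b"),
}


def deidentify(text: str) -> str:
    """Replace every match of each pattern with a bracketed placeholder."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text


def certify(text: str) -> bool:
    """Return True only if none of the known patterns still match."""
    return not any(pattern.search(text) for pattern in PATTERNS.values())


# Made-up sample note, not real patient data.
note = "Patient (MRN: 483920) seen 3/14/2025; callback 555-867-5309."
clean = deidentify(note)

# Only text that passes the check would flow into a training feedback loop.
if certify(clean):
    print(clean)
```

Keeping the check separate from the scrubbing step mirrors the "de-identify, then certify" order in the transcript: the second pass is an independent gate, not a trust in the first.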
0:30:07 Now, Aravind, I’m going to give you just a softball question, just kidding.
0:30:13 So, there’s been talk about agentic systems replacing operating systems.
0:30:22 We’ve heard Meta talk about it, I’ve heard Qualcomm talk about it, anybody with a headset as well, an XR headset.
0:30:29 What are the odds or the chances that this could become a reality?
0:30:32 So, could you repeat that?
0:30:32 Sorry.
0:30:41 Yeah, so, essentially, instead of having a full-blown operating system like we have with Windows, iOS, it’s an agentic system replacing it.
0:30:52 Yeah, so, the way we think about it is the browser, one of the reasons we decided to work on a browser is there’s no other way to ship a personal agent on the phone.
0:30:56 Because the phone, there’s only two operating systems that you can have on your phone.
0:30:58 It’s either iOS or Android.
0:31:07 And while Android might appear like an open operating system, technically, it’s not, because what actually gets shipped on the device is controlled by Google.
0:31:15 So, the browser lets you access third-party apps without actually having to do that at the OS level on the phone.
0:31:29 We think cracking that problem is way more important than going out and building new hardware because your new hardware should still connect to your phone via Bluetooth, and you’re still controlled by all the permissions that your OS on your phone lets you do.
0:31:30 I appreciate that.
0:31:30 I appreciate that.
0:31:32 Yeah, Jony Ive is working on that.
0:31:34 It’s OpenAI right now.
0:31:37 It’s not necessarily a device like this, but who knows?
0:31:37 Is it a headset?
0:31:39 Is it a pair of goggles?
0:31:42 By the way, I’m really bullish on other hardware.
0:31:50 The glasses are a really amazing form factor to just visually see things and ask questions based on what you see.
0:31:55 That totally breaks the interaction mode of just asking things from a stream of text.
0:32:03 At the same time, I feel like access to the web, access to browsing, access to tools, these are not going away.
0:32:06 So, you’ve got to work on problems that are hardware agnostic as a software company.
0:32:08 It makes sense, browser-based.
0:32:12 Well, I would say we could sit here and spend another 30 minutes.
0:32:18 Some big news this morning out of OpenAI and Microsoft, what they’re going to do in healthcare.
0:32:25 So, I know you guys need to get off set and to pay attention to the stuff going on in the world, but it’s been great to have you here.
0:32:27 Thanks for spending time with us.
0:33:06 Thank you.
0:33:07 Thank you.

Bonus coverage from the NVIDIA GTC DC ’25 Pregame Show

Chapter 2: Agentic AI for Every Industry

Intelligent systems are beginning to plan, reason, and act, reshaping how industries work. Builders share how these capabilities are moving from research into real-world impact.

Catch up with GTC DC on-demand: ⁠https://www.nvidia.com/en-us/on-demand/⁠
