AI transcript
0:00:15 Hello, and welcome to the NVIDIA AI podcast. I’m your host, Noah Kravitz.
0:00:20 Agentic AI is on the tips of everyone’s tongues right now, it seems. But what is it? What makes
0:00:26 Agentic AI so exciting? And what should AI leaders like CIOs and IT execs be thinking about when
0:00:32 designing an Agentic AI system for an enterprise? Here to break it down for us, live from GTC 2025,
0:00:37 is Bartley Richardson. Bartley is Senior Director of Engineering and AI Infrastructure here at NVIDIA,
0:00:43 where he leads Agentic AI and Cybersecurity AI. Previously, Bartley was a technical lead on
0:00:49 multiple DARPA projects. He holds a PhD in Computer Science and Engineering with a focus on AI. And
0:00:54 he’s here right now. Bartley, thanks for taking time out of a busy GTC week to join the podcast.
0:00:58 Thanks, Noah. I really appreciate it. That’s a great intro, by the way. I’m going to bring you
0:01:01 along with me everywhere I go, and you can do that intro every time.
0:01:06 Anything I can do for the cause, happy to do it. So maybe we can start with, to put you on the spot,
0:01:12 you don’t have to define things, but talk about Agentic AI, what it is, why is it so exciting,
0:01:14 you know, in this context of enterprise leaders?
0:01:20 Yeah, you know, I feel like what we’re really good at in the tech industry, and I’ll include
0:01:24 us at NVIDIA and that and everybody, right, is we’re really good at making things very
0:01:30 complicated. It’s one of our primary concentrations, right? And when I talk with people about agents and
0:01:37 agentic AI, what I really want to say is automation. And I like, that’s the word I really want to use.
0:01:42 But automation, you just get half of it out of your mouth and people fall asleep, right? So we call it
0:01:48 agents and agentic AI. And really what it is, it is that next level of automation.
0:01:53 If you think about the evolution of automation, and how we were doing things manually, and even if
0:01:58 you go back to factories, right, and that these types of things, these, I don’t want to say mundane,
0:02:04 but I obviously just said it, but these kind of like everyday repeatable tasks, right, we’ve come up
0:02:10 with better ways to do it, right? For example, I grew up on a farm in the middle of Ohio, and we would
0:02:15 dig postholes by hand with these terrible tools. We have technology that does this now for
0:02:19 us, right? Agents are very similar. They’re just, instead of working in the dirt in the middle of
0:02:27 Ohio, they’re working on massive petabyte scale data silos, or, you know, information repositories
0:02:33 to take that data, churn on it a little bit, do this kind of mundane task, and then instead of returning data
0:02:39 that the human has to go back through and look at, actually take that, give it context, synthesize
0:02:43 it together with other types of data and make it a little bit more actionable, a little bit more
0:02:49 easy for the human to digest. Right. Can we talk about reasoning? Yeah. Is there, agentic reasoning,
0:02:53 is it a thing now? It’s a thing, right? Like, I think we’re going to make it a thing, right? Like,
0:02:59 you know, it’s, it’s, uh, it’s definitely a thing. Reasoning is, is another one of these terms that gets,
0:03:04 you know, kind of bandied about quite a bit, right? And, and really what you have is this kind of
0:03:11 different type of model, right? Almost where you have a, I’ll say a traditional LLM or, or an old
0:03:15 school LLM. I don’t know what that means. Like two years ago, five years ago, like you have these old
0:03:19 school LLMs and they’re really good at kind of like this token prediction. They can do the thing where,
0:03:23 oh, okay, you know, here’s the sentence, complete the sentence, and all of these types of
0:03:28 things. And reasoning models are kind of like this next level, and
0:03:35 they’ve been trained and tuned in a very specific way to think, almost like thinking out loud. If you
0:03:39 look at their text and look at their structure where they’re going through and you’ve given them the task,
0:03:43 they’re like, oh, I could do this. I could do this. I could do this. I could do this. And it’s kind of
0:03:47 like when you’re brainstorming with like maybe your colleagues or even your family, right? You’re like,
0:03:51 oh, we could do this. We could do this other thing. Reasoning models have that type of feel to them,
0:03:56 right? And so putting them in a place of a system that’s like, hey, I need to, I don’t really know what I
0:04:00 want to do, but like, I kind of want to make a plan. Here’s my, here’s my loose guidelines.
0:04:06 Reasoning models can very easily explore that space for you and get it to like, here’s some options.
0:04:10 So maybe we can dive into the technology a little bit and you can kind of go through some of the
0:04:14 components involved. There’s different NVIDIA technologies here and just talk a little bit
0:04:17 about each one: what they are, what they do, why they’re important.
0:04:23 Yeah. And I think the way you set that up is really great because it is this collection of
0:04:26 technologies, right? When we talk about agentic systems, it’s not just like, oh, this is a
0:04:29 completely, a hundred percent brand-new thing that we’re trying to do.
0:04:30 I’ve got this agent named John.
0:04:35 Exactly. Right. Exactly. It’s, and I, you know, you tell people it takes years to be an overnight
0:04:41 success, right? And so it takes years to do what we’re doing in agents and agentic systems right
0:04:46 now. And it starts with something like data ingest, right? And data ingest and being able to retrieve
0:04:50 that information. So if you look at things like NeMo Retriever, right, that’s a great place to start.
0:04:57 We have a capability now where NeMo Retriever doesn’t just ingest text, right? Like unstructured text,
0:05:03 right? Right. Yeah. But we can ingest multimodal documents. So everyone’s favorite document type,
0:05:07 the PDF, right? Everyone loves it, right? A great, great document, easy to work with.
0:05:14 If you would have told me, uh, 10 years ago that I’d still be working in PDFs, right? Like,
0:05:19 I don’t know. Right. Uh, but, but anyway, here we are and it can have all of these different
0:05:24 modalities in it. It can have pictures, it can have structured images, which are like tables and graphs
0:05:29 and charts. It can have unstructured images and all of these types of things and extracting that
0:05:33 information and keeping the context. I’ll give you a really simple example where let’s say you have a
0:05:38 graph, a bar graph. Okay. Really simple bar graph. And if you were to just give that to a really kind
0:05:42 of basic computer vision parsing model, it would come back and be like, yep, I understand that.
0:05:47 I have five rectangles. Okay. It’s like, no, you don’t, because just having five rectangles
0:05:53 doesn’t make that interesting. Right. So it’s the relationship of the five rectangles to each other.
0:05:57 It’s the caption that goes underneath it. Right. Is there text somewhere else in this document,
0:06:01 maybe not even on that page, that explains that chart? So you have all of these different,
0:06:06 when we talk about the, um, Retriever and ingestion process, we have all these different
0:06:12 models, these NIMs, that go into the pipeline, and then we stitch them together to make this type of
0:06:18 parsing possible. So we have models that detect bounding boxes of images. We have models that
0:06:22 work specifically on text. We have models that look at captioning and put all this together.
0:06:26 There’s like three, four NIMs, right, that just go into this one process. And what I love about
0:06:32 Retriever is, and you’re going to see the theme, right? It’s built for enterprise quality. And so
0:06:36 if you look at PDFs, you know, if you’re the average person, you have like a couple, you’re
0:06:40 doing research, something like that. You have 10 PDFs, you upload them, you don’t care a lot. It takes
0:06:45 five minutes, you know, whatever, come back and do it. If you’re an enterprise, right? Like, you know,
0:06:50 how many PDFs we generate at NVIDIA? Like, I think we could run the world on the PDFs that companies
0:06:55 generate and you need speed, you need consistency and you need accuracy. Right. And so if you look at
0:07:00 something like Retriever, we’re at, you know, 15 times the throughput that you can get in other
0:07:06 leading systems, right? With a really complicated PDF, we can go, you know, 10 pages
0:07:12 per second on a single GPU instance, right, and extract this. It’s really
0:07:19 cool that, you know, we see 50% fewer errors compared to other systems. So it all starts, if I
0:07:26 think about agents, it all starts with that part: getting the data from a human source into an embedded
0:07:31 kind of source. Right. Right. Right.
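To make the shape of that pipeline concrete, here is a minimal Python sketch of the stitching idea. The callables passed in stand in for the individual NIMs (bounding-box detection, table parsing, captioning, embedding); this illustrates the concept, not the actual NeMo Retriever API.

```python
# A minimal, hypothetical sketch of the "stitch several models together"
# ingestion idea described above. The callables stand in for individual
# NIMs; this is NOT the actual NeMo Retriever API.
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class Element:
    kind: str          # "text", "table", or "chart"
    data: bytes        # raw crop of the page region
    nearby_text: str   # caption / surrounding text kept for context

@dataclass
class Chunk:
    text: str
    page: int
    kind: str
    embedding: list = field(default_factory=list)

def ingest_pages(
    pages: list[list[Element]],                 # detected elements per page
    extract_text: Callable[[Element], str],     # text-extraction model
    parse_table: Callable[[Element], str],      # structured-image (table) model
    caption: Callable[[Element, str], str],     # captioning model, takes context
    embed: Callable[[str], list],               # embedding model
) -> list[Chunk]:
    chunks: list[Chunk] = []
    for page_num, elements in enumerate(pages):
        for el in elements:
            if el.kind == "text":
                body = extract_text(el)
            elif el.kind == "table":
                body = parse_table(el)
            else:
                # Caption charts together with nearby text, so "five
                # rectangles" becomes a description of what the bars mean.
                body = caption(el, el.nearby_text)
            chunks.append(Chunk(text=body, page=page_num, kind=el.kind))
    for c in chunks:
        c.embedding = embed(c.text)             # embed for later retrieval
    return chunks
```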
0:07:36 Uh, should we talk about agent ops tools? Yeah. Agent ops tools. So I like the progression that
0:07:40 you’re going with right here. So I’ve got my data, you know, I’ve got it in there. We’re pretty good at that. Agent ops tools, uh, are another way of saying
0:07:44 this is kind of like a flywheel. Yeah. Right. So you’ve got your, your system and we’ll talk a little
0:07:49 more about the system in a little bit. Uh, and you might have your models and what agent ops tools
0:07:55 are really good at is fine tuning, honing, and making that system even more efficient. Okay. Right.
0:08:02 Uh, so, you know, for example, through successive iterations of fine-tuning with
0:08:07 agent ops tools, you can get, you know, like a 10x reduction in your model size,
0:08:12 right, by fine-tuning that down, distilling it a little bit, which obviously translates to
0:08:17 either speed or translates to energy or translates to money, right? These are kind of currencies you
0:08:21 can exchange, right? Depending on what you care about. Almost a 4% increase in accuracy by doing this
0:08:26 fine-tuning. And the real cool thing about agent ops tools is it just kind of sits there,
0:08:30 right? Like, and you hook it in, you know, you’ve got your inputs, you’ve got your outputs,
0:08:35 you hook it into your outputs, it feeds back into your inputs and it’ll prompt the human. Like
0:08:38 every once in a while, it’s like, Hey, what do you think about this? And you know, those, uh,
0:08:40 you go to any tool and you give it the thumbs up, you give it the thumbs down,
0:08:47 right. There’s that, but what’s even better is you can give it free-form text. Like not just,
0:08:51 did I like this? Or did I not like this? It’s like, Hey, feedback. Yeah. Yeah. This wasn’t
0:08:55 quite like this, whatever. And it keeps that context and will use that to, like, steer in a
0:09:00 different direction. Right. And so it’s useful in a lot of ways, model size reduction,
0:09:05 all that. But then you’ve ingested all this data, you’re using this model, and it’s a way to kind
0:09:09 of push or steer this model a little bit towards your particular use case.
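As a sketch of that flywheel, here is roughly what the feedback loop looks like in code; the shape is illustrative and not tied to any specific agent ops product’s API.

```python
# Illustrative sketch of the feedback flywheel described above: collect
# thumbs and free-form feedback alongside the context it refers to, then
# periodically turn it into a fine-tuning / steering dataset.
import json
import time

FEEDBACK_LOG = "feedback.jsonl"

def record_feedback(prompt: str, response: str, thumbs_up: bool, note: str = "") -> None:
    """Append one feedback event, keeping the full context it refers to."""
    event = {
        "ts": time.time(),
        "prompt": prompt,
        "response": response,
        "thumbs_up": thumbs_up,
        "note": note,  # free-form text, e.g. "this wasn't quite right because..."
    }
    with open(FEEDBACK_LOG, "a") as f:
        f.write(json.dumps(event) + "\n")

def build_steering_dataset(path: str = FEEDBACK_LOG) -> list[dict]:
    """Turn logged feedback into labeled pairs that a later fine-tuning or
    preference-optimization job could consume."""
    pairs = []
    with open(path) as f:
        for line in f:
            e = json.loads(line)
            pairs.append({
                "prompt": e["prompt"],
                "completion": e["response"],
                "label": "preferred" if e["thumbs_up"] else "rejected",
                "critique": e["note"],
            })
    return pairs
```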
0:09:13 Right. If I’m jumping ahead to something that you’re going to cover,
0:09:16 tell me and I’ll back off. But you mentioned accuracy a couple of times. Accuracy, obviously
0:09:23 a big, big thing. Do the reasoning models themselves improve accuracy? And I know it’s not like there’s
0:09:27 one kind of reasoning model that fits everything. Right. But I’m just wondering, as you’ve mentioned
0:09:31 it, like, is that part of what the reasoning model is meant to do? Or is it just kind of a happy
0:09:38 side effect or? Um, you know, it can in a, in a system. Right. And certainly if you look at
0:09:42 reasoning models, there’s, you know, if we talk about NVIDIA’s model, right. Like Llama Nemotron
0:09:47 Reason Super. I might’ve messed up. Llama Nemotron Super with Reasoning.
0:09:54 Super with Reasoning. Yes, exactly. If you look at that model specifically, it does have a higher
0:10:00 accuracy than these other reasoning models, right? Where I think,
0:10:06 or where we see this being advantageous, is as part of a larger agentic system where I’ve got this
0:10:12 reasoning model, and there are good tasks for reasoning models. And I’m just going to say
0:10:16 there are bad tasks for reasoning models, right? Like, just like there’s good uses for a fork and
0:10:20 there’s really bad uses for a fork, right? Like it’s the same for this reasoning model. And with
0:10:24 our reasoning model, with Llama Nemotron, one of the great things about it is
0:10:29 you can turn reasoning on or off. So you’ve got a single model that now operates as a reasoning model
0:10:34 or it can operate as just a regular, like non-reasoning model. Right. And so we talk about
0:10:39 accuracy. One of the really nice things about reasoning models is the ability to iterate that
0:10:43 with them. I’ll give you an example. You’re in a deep-researcher-type environment and you’re like,
0:10:48 Hey, this is the topic I want. I might upload a few PDFs, right? Like I give it the structure.
0:10:53 I want it to have an intro, a, you know, a conclusion, right? Keep it pretty light. You know,
0:10:57 you give it all this thing. A model can go, a reasoning model can go and say, okay, I’m going
0:11:02 to really quickly generate a lot of ideas, a lot of questions that I want to pursue. I’m
0:11:06 going to present it to you, human. It’s like, this is what I’m going to do. This is like seven tasks or
0:11:10 something like that. Something that’s sticky for the human in their brain, right? Like,
0:11:13 not a hundred things. And the human can go through and they can be like, oh, that one looks good.
0:11:18 That one, I want you to tweak it slightly, right? I want it about this topic or not that
0:11:21 topic. That one looks like it’s going to be too high level. Can you take that more towards a 12th
0:11:26 grade level? Right. And in my mind, it’s partly the improved accuracy that we do see a little
0:11:32 bit in the reasoning models, but it’s really the human-model connection. These agentic systems become a
0:11:38 coworker, where you can guide them and you can fine-tune them
0:11:43 and you can talk with them and interact with them, so you get the report that you want. And then it does
0:11:47 all of the deep researching. Now you’re minutes away from, like, I’m going to do this stuff.
0:11:53 And what, if I take us, I’m going to date myself. If I take us way back, if you remember the internet,
0:11:59 when we used to talk about things like load times and you know what I mean? Like what’s the user’s
0:12:04 perceived load time of this? And we would do things to improve that. We now have this ability with the
0:12:09 reasoning model, you can interact with it early. And so when you get to the final report or a draft
0:12:13 report, you haven’t spent five minutes only to get something totally different from what you
0:12:20 wanted. So not only is the actual measured accuracy up, but so is the perceived accuracy from the human,
0:12:23 the perceived efficiency, because you were involved in the process early on.
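For a concrete feel of that single-model switch, here is a minimal sketch against the OpenAI-compatible endpoint on build.nvidia.com. The model name and the "detailed thinking on/off" system-prompt convention reflect NVIDIA’s published model cards at the time of writing; treat both as assumptions to verify against the current card.

```python
# Minimal sketch: toggling reasoning on a single Nemotron model via the
# OpenAI-compatible API on build.nvidia.com. Model name and the
# "detailed thinking on/off" convention are taken from NVIDIA's model
# card as of this writing; verify before relying on them.
from openai import OpenAI

client = OpenAI(
    base_url="https://integrate.api.nvidia.com/v1",
    api_key="$NVIDIA_API_KEY",  # replace with your key
)

def ask(question: str, reasoning: bool) -> str:
    resp = client.chat.completions.create(
        model="nvidia/llama-3.3-nemotron-super-49b-v1",
        messages=[
            # The same model acts as a reasoning or non-reasoning model
            # depending on this one system-prompt switch.
            {"role": "system",
             "content": "detailed thinking on" if reasoning else "detailed thinking off"},
            {"role": "user", "content": question},
        ],
        temperature=0.6 if reasoning else 0.0,
    )
    return resp.choices[0].message.content

# Reasoning on: good for planning / brainstorming the outline of a report.
print(ask("Propose about seven research questions for a report on agentic AI.", reasoning=True))
# Reasoning off: cheaper and faster for a simple extraction-style task.
print(ask("List the three file formats in this sentence: PDF, SVG, PNG.", reasoning=False))
```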
0:12:29 Yeah. Fantastic. I feel like that’s a great segue into talking about the enterprise:
0:12:35 IT leaders, whoever’s designing, developing the AI applications. What are some of the things that
0:12:40 should be considered by, you know, a tech leader, a CIO when they’re designing an agentic AI system in
0:12:47 particular? We have a couple hours, right? Yeah. So I think the biggest one, right, is that when I talk
0:12:53 with anybody in the enterprise, they’ll ask me, okay, so what is the agentic system we
0:12:56 should use? And I’m like, I’m going to answer your question, but let me ask you a parallel question.
0:13:03 I’m like, tell me, what is the, you know, tell me, what’s the one piece of software you use in your IT
0:13:06 system? And they’re like, what do you mean one piece of software? We have 15 different vendors
0:13:11 and we do all this kind of stuff. I’m like, correct. I’m like, so you have to think about
0:13:16 agentic systems as the same, right? You’re going to get some from your vendors, right? Whoever that is,
0:13:19 you’re going to have an application, you’re going to have your CRM, you’re going to have all these
0:13:24 different things for your developers. They’re going to put their own agents in that, right? And you’re
0:13:28 going to work with those. Some are going to be homegrown by you, right? Because everyone’s moving
0:13:33 really fast in this space. You have your own enterprise data with your own sources. If I
0:13:37 can, if I can talk about a very specific NVIDIA example that I tell everyone, we have this thing
0:13:43 called NVBugs Pro AI Search. And I like it for two reasons. One, it uses AI agents, and two, the name
0:13:47 just rolls right off the tongue. Totally. Right. At least when you say it. NVBugs Pro AI Search,
0:13:52 it’s just like, that one sticks in my brain. Not the, not the Super with Reasoning one, right?
0:13:57 Not that one. Right. But the thing I like about this is, our IT, our great IT
0:14:03 department, who are much more engineers than the average IT department, created this. And they
0:14:08 created it very early on in LangChain’s existence. And then they, they modified it a little
0:14:13 bit. And now we’re here, fast-forward six months. And we want to not only use it with NVBugs Pro,
0:14:19 but we want to hook it up to coding repos and forums and CRM systems and all of that. And what that means
0:14:25 is that now you’re in this similar situation where I might have various different vendors and stuff
0:14:30 that I’ve grown myself. So when I talk with them, I say, what you’re really looking for is, you need
0:14:35 to look at it in that same context. It’s not, I’m going to buy one piece of software. I’m going to write
0:14:39 one piece of software. You’re going to have all these agents working together. And the trick is,
0:14:45 how do you let them all come together, mesh together in a somewhat seamless way for your employees?
0:14:51 So when I log into our systems, or if you log into our systems, it’s context aware. It gives
0:14:56 me the information that I need. It helps me do my job. I look at it like that; look at it in
0:15:04 the traditional IT deployment sense. Right, right. How does that differ from what enterprise
0:15:10 IT departments and developers were doing, you know, sort of pre-LLM, pre-agentic AI, that kind of thing?
0:15:14 To me, being a little bit on the outside thinking about it, it sounds sort of like,
0:15:18 oh, you’ve got different vendors and I know it’s like an app store and CRM systems have their own
0:15:23 apps. Everything’s got their own, like, you know, place where you can go grab apps. Are we moving
0:15:29 towards that for agents? You know, I mean, I think so. Yeah. I mean, there definitely will be. I mean,
0:15:35 they’re out there already. Yeah. Right. You can, everyone has their own and, and that’s fine. I think the
0:15:41 difference is, I’d say, you know, when we had all these apps in the, I guess in the
0:15:47 before times, we were in data silos. And now it’s very
0:15:52 possible we end up in these agentic silos, right? Like a little bit, where I have data here and
0:15:57 I interact with this agent to talk to it. The big difference, though, is we’re in a situation now
0:16:02 where we’re not always going to have API-to-API access, right? I won’t necessarily always
0:16:10 have a developer script and code so that I can have a CRM agent over here talking to a Confluence wiki agent
0:16:17 over here. They are communicating not via an API that a developer set up. They’re communicating
0:16:23 via human language, right? Like, with each other. Right. And that kind
0:16:28 of eases the connection points. It raises some things you have to look at, right? Like, how are
0:16:32 we monitoring these systems? How are we observing these systems? Right. But that is one of the key
0:16:37 differences: you’re not just going to have API-to-API specs. You’re going to have some of
0:16:42 that, and then you’re going to have these agents that are just communicating among themselves on your behalf.
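As a toy illustration of that difference, here is what agent-to-agent communication looks like when the interface is natural language rather than a fixed API schema. The ask_llm callable is a hypothetical stand-in for whatever model call each agent uses.

```python
# Toy sketch of two agents talking in natural language instead of over a
# developer-written API contract. `ask_llm` is a hypothetical stand-in for
# each agent's model call. The "interface" being prose eases integration,
# but it is exactly why the monitoring questions above matter.
from typing import Callable

class Agent:
    def __init__(self, name: str, role: str, llm: Callable[[str], str]):
        self.name, self.role, self.llm = name, role, llm
        self.transcript: list[str] = []   # keep the exchange for auditing

    def handle(self, message: str) -> str:
        self.transcript.append(f"IN: {message}")
        reply = self.llm(f"You are the {self.role}. Respond to: {message}")
        self.transcript.append(f"OUT: {reply}")
        return reply

def demo(ask_llm: Callable[[str], str]) -> str:
    crm = Agent("crm", "CRM agent with access to customer records", ask_llm)
    wiki = Agent("wiki", "Confluence wiki agent with access to internal docs", ask_llm)
    # No bespoke API between them: the CRM agent just asks in prose.
    question = crm.handle("A customer reports setup failures; ask the wiki "
                          "agent for the current troubleshooting steps.")
    return wiki.handle(question)
```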
0:16:48 Right. What kind of complexities does that raise, or perhaps doesn’t raise, when it comes to,
0:16:52 you know, data security? And obviously this all runs on data, like you said at the beginning,
0:16:56 right? And just the data. So if you’ve got different agents, and they may be from different
0:17:00 vendors, how do they talk to each other in a secure manner?
0:17:06 It’s not a hundred percent solved yet, right? Like, but where it’s headed is this idea that we call
0:17:11 context-based security. And so if you go back through the history, you know, security really
0:17:16 started with, let’s say firewalls, right? And what was the motion? I’m going to put everything in this
0:17:20 circle and I’m going to put my hands around it, and there, I’ve secured it. And then we did this thing
0:17:25 where we, I don’t know if you, I don’t know if you remember it, we moved to the cloud or I think we did,
0:17:29 right? Like we moved to the cloud and then it was like, oh, oh, I can’t put my hands around it
0:17:33 anymore. Now I’m doing application-based security, right? Like, ’cause I have pieces everywhere.
0:17:37 And now we’re in this motion where there’s cloud, there’s on-prem, we still have all of that.
0:17:43 But when I am accessing a system, like I said, versus maybe our CFO accessing that
0:17:49 same system, we are looking at it in a different context, right? And the information we want out of
0:17:53 it is different. So we’re moving to this context-based type of security, where you not only have to understand
0:17:58 the person and the credentials and do all that stuff that you are already doing or supposedly doing,
0:18:03 right? Like, RBAC should be there, right? Like, if it’s not there, we have a more basic problem.
0:18:03 Right.
0:18:06 But you have to look at the context in which the question is being asked. Look at the things
0:18:12 around it, look at what pieces of information are coming with it, do pieces of security before that
0:18:17 question is answered, and then do an analysis before you return the information to the user. You can look at
0:18:23 that context. So it does, I don’t want to say complicate it, but it does add maybe 10% of
0:18:29 new stuff that we weren’t doing before, on top of roughly 90% that is just regular security practice.
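A minimal sketch of that idea: RBAC runs first, exactly as before, and then a second check looks at the context of the question itself before anything is returned. The policy details here are invented purely for illustration.

```python
# Hedged sketch of "context-based security": the classic RBAC check runs
# first, then a context check looks at how and where the question is being
# asked. All policy details below are invented for illustration.
from dataclasses import dataclass

@dataclass
class Request:
    user: str
    roles: set
    query: str
    channel: str        # e.g. "internal-tool" or "public-chatbot"

def rbac_allows(req: Request, resource: str) -> bool:
    # The check you should already have: role-based access control.
    ACL = {"financials": {"finance", "cfo"}, "wiki": {"employee"}}
    return bool(ACL.get(resource, set()) & req.roles)

def context_allows(req: Request, resource: str) -> bool:
    # The ~10% of new stuff: look at the context around the question.
    if resource == "financials" and req.channel != "internal-tool":
        return False                 # right person, wrong context
    suspicious = ["ignore previous instructions", "dump all records"]
    return not any(s in req.query.lower() for s in suspicious)

def answer(req: Request, resource: str) -> str:
    if not rbac_allows(req, resource):
        return "denied: insufficient role"
    if not context_allows(req, resource):
        return "denied: context check failed"
    return f"(retrieved {resource} data for {req.user})"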
0:18:34 I was speaking with Bartley Richardson. Bartley is the Senior Director of Engineering and AI
0:18:40 Infrastructure at NVIDIA. And we’ve been talking agents, agentic AI, a little bit of cybersecurity,
0:18:46 got snuck in there. But that’s Bartley’s domain. He leads agentic AI and cybersecurity AI
0:18:52 here at NVIDIA. And we’re talking about agentic AI and the enterprise. To shift gears a little bit
0:18:57 and kind of talk about business needs in designing one of these systems: you know, are there
0:19:04 best practices or, you know, things that you’ve seen that work well when it comes to making sure
0:19:10 that you’re designing with the business care-abouts in mind, and then going back and kind of rechecking,
0:19:11 keeping that, that North Star?
0:19:16 Well, I think, yeah, some of it is kind of what we’ve already talked about,
0:19:21 right? Like, you know, building on these strong foundations of not just tooling and ingest, but,
0:19:25 you know, models. And so there’s all of that, right? Like, obviously. One thing that we haven’t
0:19:31 talked about explicitly yet is, is this notion of traceability and observability and profiling.
0:19:37 Right. You mentioned it briefly. Yeah. And you have to imagine, if we go back, we have
0:19:41 this kind of distributed system: there’s all these agents that are connected with other things. They’re
0:19:45 talking in various modalities faster than a human can understand from different vendors and different,
0:19:50 even like agentic framework providers, right? Like, you might have someone in your business who wrote this
0:19:54 thing in LangChain, and someone wrote this thing in CrewAI, and someone wrote this, you know,
0:19:58 there’s all these different things. Yeah. So how do you have a holistic kind of traceability and
0:20:04 observability platform across all of that? That becomes a little challenging, and it’s
0:20:10 why we made this new thing. Uh, it’s called Agent IQ, and it’s not an agentic framework. There’s plenty of
0:20:14 those. I haven’t checked; we’ve been talking for, you know, a few minutes, and there probably are 10 more that
0:20:18 popped up while we were talking, right? We don’t need another one of those. Agentic frameworks are the new
0:20:23 prompt engineer. Right. Yes. They’re the new... Yeah, exactly. But it’s not, it’s not one of
0:20:31 those, but rather it’s a really simple, as much as necessary, as little as possible way to get all
0:20:37 of these frameworks and tools and everything to work together and be observable on the same point.
0:20:41 Right. Being a CS person and coming from an engineering background, the way I tell people
0:20:46 is I’m like, again, we’re really good at complicating things. We got agents and tools and all these things.
0:20:52 I’m like, no, everything’s a function call. And so what Agent IQ does is it lets you use the
0:20:56 frameworks that you were using and you still develop in those frameworks. Like I was saying,
0:21:00 you probably already have stuff. Don’t rewrite that code. Let’s just hook it up to other things
0:21:04 that you have, but let’s do that in a way where we develop everything to a function call. And that
0:21:10 lets you say, oh, I’ve got this, this agent pipeline here. And now I want to add a capability to it.
0:21:15 Well, I can add it on or I can wrap it in something else. So now I have an agent inside of an agent.
0:21:20 It allows this nesting kind of capability. And what it gets you is this really cool traceability.
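To illustrate the idea (this is a sketch of the concept, not the actual Agent IQ API): wrap every agent or tool as a plain function, compose and nest them freely, and record timing and inputs/outputs on each call.

```python
# Sketch of "everything is a function call": wrap agents and tools from any
# framework as plain callables so they can nest, and instrument each call
# for traceability. Illustrative only; not the actual Agent IQ API.
import time
from typing import Callable

TRACE: list[dict] = []   # every call's name, duration, input, and output

def as_function(name: str, fn: Callable[[str], str]) -> Callable[[str], str]:
    """Wrap any agent/tool (LangChain, CrewAI, homegrown...) as a traced function."""
    def wrapped(prompt: str) -> str:
        start = time.perf_counter()
        result = fn(prompt)
        TRACE.append({
            "call": name,
            "seconds": time.perf_counter() - start,
            "input": prompt,
            "output": result,
        })
        return result
    return wrapped

# Two "agents" from different frameworks, reduced to functions:
summarize = as_function("summarizer", lambda p: p[:80] + "...")
translate = as_function("translator", lambda p: p.upper())

# An agent inside an agent: composition is just another function call.
pipeline = as_function("report-agent", lambda p: translate(summarize(p)))
print(pipeline("A long customer issue description goes here ..."))
print(TRACE)   # inspect the timings to find the slow step in the chain
```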
0:21:27 So you can go into every tool call, every LLM call, every chain of tool calls. You can look at the input
0:21:33 tokens, the output tokens, the time it took to do the tool call, the sequence of actions of the
0:21:38 tool call. And we have customers that are already seeing this: they look at the timing
0:21:44 charts and they’ll optimize their tool-calling chains, and they get a 15x speedup through
0:21:48 their pipeline, right? Or they’ll get, you know, a 5x improvement in
0:21:54 accuracy by moving things around. And it’s that type of information that is one of the reasons
0:21:58 that something like Agent IQ exists. Yeah. That’s fantastic. We’ve got a few minutes left. Is there
0:22:03 an area that we haven’t covered that you want to go into in particular? Yeah. I mean, the, the thing I
0:22:08 think I would, I would end on, right, if we’re talking about agents is I feel like as an industry,
0:22:12 we’ve been talking about them for a really long time, but we’re pretty new,
0:22:17 right, to them and their actual usability, and how we’re kind
0:22:22 of getting them into, like we said, enterprise scenarios. I think what’s incredibly
0:22:26 interesting is these use cases that we have. And some of them we have here in booth demos,
0:22:30 right at GTC this year, right? So I’m sure afterwards we’ll have those where
0:22:36 people can consume them as well. But we have this one that is automating a lot of what happens
0:22:41 in the kind of feature-request process: like, what issues is a customer having with my product,
0:22:46 all the way through writing a PRD for that, a product requirements doc, all the way to assigning,
0:22:50 you know, who might be the best engineers for doing this. I think the coolest thing that I’ve
0:22:55 seen as part of that is we use a reasoning model to say like, Oh, here’s some, here’s some issues
0:22:59 your customers are talking about on forums, or here’s, you know, tickets people filed, right?
0:23:04 Here’s a set of brainstorming questions that I reasoned through; go have a meeting. And then the
0:23:07 humans get in the room, they have the meeting, they talk, they might draw some diagrams and
0:23:11 that kind of stuff, and that goes back into the system. So think about this as a big human-in-the-
0:23:17 loop step. What comes out the other side? A fully formed PRD, where we’ve
0:23:22 taken the language and the action items from the actual team’s call. We’ve taken pictures of the
0:23:28 whiteboard. It produces the diagrams and then it gives you a PRD that then, you know, if it gets you
0:23:33 75, 80% of the way there, that’s fantastic. That’s great. Because, you know, I’m sure,
0:23:38 you know, you do your fair share of writing, right? Like the hardest part for me about writing is that
0:23:43 blank page. Totally. Right. And if I can get something that’s 80% of the way there, it’s great.
0:23:49 And that’s the point I would leave on is agentic systems and models will make mistakes, right? Like
0:23:54 they will, they will never be a hundred percent accurate. I would challenge you to find anything
0:23:58 that’s a hundred percent accurate. Right. But the way to think about them is like, look,
0:24:06 the human will be in the loop. And if it gets you 60, 70, 80% of the way there, that’s amazing.
0:24:09 Yeah. I mean, that’s 70% of work you don’t have to do. It’s there.
0:24:15 Exactly. And that’s the part that I think is incredibly compelling and we should focus on
0:24:19 accuracy and we should always try to make things better. But I never want to lose sight of like,
0:24:24 look at the 70%, right? Like that, that we did and how are we going to continue to make it better?
0:24:28 Of course. Absolutely. All right, Bartley, before we let you go, a little change of pace,
0:24:34 but in your own daily routine, your own daily work, are there AI-powered tools that you’re using every
0:24:38 day that, you know, you can recommend? It can be a category or a specific tool. What are you using
0:24:43 that you like? Yeah. I’ll preface this by saying I have no official affiliation with any of these.
0:24:47 Some of them I pay a lot for, right? But other than that, right? No official affiliation. Of course,
0:24:52 I think like a lot of people, you know, AI-powered search engines, something like a,
0:24:59 like, like a Perplexity, right? I use that a lot. I use ChatGPT, the reasoning model in ChatGPT,
0:25:02 like quite a bit, like that one’s fantastic. If I’m trying to like research things or if I,
0:25:05 you know, I just have something that I wanted to think about just a little bit more.
0:25:10 Of course, that was before our reasoning model came out and before AI-Q, right, was out there,
0:25:15 which is our deep researcher. So those are kind of the obvious ones. I don’t get to code, as you know;
0:25:19 my, my teams don’t let me code as much as they used to, and they shouldn’t. That’s smart. That’s
0:25:25 smart of them. But I love Cursor as a coding tool, and even just making diagrams. Like, it’s really good
0:25:29 at that. The amount of times that I just hit tab and I’m like, wow, get out of my brain, right? Like
0:25:35 I just hit tab. The other one that I use a lot is called Napkin, napkin.ai. And that’s one that I
0:25:39 don’t know if a lot of people have heard of. Yeah. I haven’t heard of that one. Yeah. It’s in beta right now.
0:25:44 It’s free. Again, I have no affiliation with them and I love their tool. What you can do is you can
0:25:49 either just, like, type free-form text, or you can do, like, a Perplexity style and it’ll research things
0:25:54 for you. And it will take, let’s say you have a process diagram or a complex tree or whatever,
0:26:00 and it turns that into the associated diagram and the infographic for you. And then what I love about
0:26:05 it is it gives you kind of pre-formatted options to start from. Do you want this to be
0:26:09 a funnel? Is this a process tree? Is it the cycle and all this in different color options?
0:26:13 And you’re like, yeah, I want that. And then you can go in and you can change anything. You can move
0:26:17 the text. You can change the text. You can move the shapes and all that. So I talk about the 70% there,
0:26:24 right? I would toil away at these in PowerPoint for so long, right? But now, and it’s getting better,
0:26:29 I can just describe it to Napkin, right? Back of the napkin. Right. And it produces this great SVG
0:26:31 or PNG diagram, right? And so it’s fantastic. It’s awesome.
0:26:37 Fantastic. Bartley, for folks listening who want to find out more about all the things you’ve been
0:26:43 talking about, all the work you and NVIDIA are doing with the enterprise, where’s a good place to go
0:26:49 online or places to get started? Yeah. A great place to go is build.nvidia.com, right? So that’s a great
0:26:54 place to get started. You can see all of our models, all of our blueprints, which are just kind of like
0:26:59 intent examples. So if you go to build.nvidia.com, ai.nvidia.com takes you to a similar place,
0:27:03 right? If you’re interested in this new thing that we were just talking about, which is called
0:27:12 Agent IQ, a play on agentic, right? It’s Agent IQ. Oh, yeah. Right. Yeah. It was clever. But if
0:27:19 you’re interested in that, that is open source software on GitHub. So github.com slash NVIDIA
0:27:24 slash Agent IQ. Great. But it’ll also all be linked off of build.nvidia.com. Yeah. Great.
0:27:28 Bartley Richardson, thank you so much for taking the time out. Best of luck this week. Have a great show.
0:27:32 And in all you’re doing, I mean, we can catch up and do it again down the line and, you know,
0:27:36 see if we’re still talking about PDFs in a year or two. Oh my goodness, Noah. Yes, we’d love to.
0:27:37 We’d love to.
0:27:37 We’d love to.
0:27:37 We’d love to.
0:27:37 We’d love to.
0:27:39 We’d love to.
0:27:39 We’d love to.
0:27:40 We’d love to.
0:27:41 We’d love to.
0:27:42 We’d love to.
0:27:43 We’d love to.
0:27:45 We’d love to.
0:27:49 We’d love to.
0:27:51 We’d love to.
0:28:14 We’d love to.
0:28:26 We’d love to.
0:00:20 Agentic AI is on the tips of everyone’s tongues right now, it seems. But what is it? What makes
0:00:26 Agentic AI so exciting? And what should AI leaders like CIOs and IT execs be thinking about when
0:00:32 designing an Agentic AI system for an enterprise? Here to break it down for us, live from GTC 2025,
0:00:37 is Bartley Richardson. Bartley is Senior Director of Engineering and AI Infrastructure here at NVIDIA,
0:00:43 where he leads Agentic AI and Cybersecurity AI. Previously, Bartley was a technical lead on
0:00:49 multiple DARPA projects. He holds a PhD in Computer Science and Engineering with a focus on AI. And
0:00:54 he’s here right now. Bartley, thanks for taking time out of a busy GTC week to join the podcast.
0:00:58 Thanks, Noah. I really appreciate it. That’s a great intro, by the way. I’m going to bring you
0:01:01 along with me everywhere I go, and you can do that intro every time.
0:01:06 I’m anything I can for the cause, happy to do it. So maybe we can start with, to put you on the spot,
0:01:12 you don’t have to define things, but talk about Agentic AI, what it is, why is it so exciting,
0:01:14 you know, in this context of enterprise leaders?
0:01:20 Yeah, you know, I feel like what we’re really good at in the tech industry, and I’ll include
0:01:24 us at NVIDIA and that and everybody, right, is we’re really good at making things very
0:01:30 complicated. It’s one of our primary concentrations, right? And when I talk with people about agents and
0:01:37 agentic AI, what I really want to say is automation. And I like, that’s the word I really want to use.
0:01:42 But automation, you just get half of it out of your mouth and people fall asleep, right? So we call it
0:01:48 agentic and agent AI, and agentic AI. And really what it is, it is that next level of automation.
0:01:53 If you think about the evolution of automation, and how we were doing things manually, and even if
0:01:58 you go back to factories, right, and that these types of things, these, I don’t want to say mundane,
0:02:04 but I obviously just said it, but these kind of like everyday repeatable tasks, right, we’ve come up
0:02:10 with better ways to do it, right? Like, example, I grew up on a farm in the middle of Ohio, and we would
0:02:15 dig postholes by ourselves, like these terrible tools. We have technology that does this now for
0:02:19 us, right? Agents are very similar. They’re just, instead of working in the dirt in the middle of
0:02:27 Ohio, they’re working on massive petabyte scale data silos, or, you know, information repositories
0:02:33 to take that data, turn on a little bit, do this like mundane task, and then instead of returning data
0:02:39 that the human has to go back through and look at, actually take that, give it context, put, synthesize
0:02:43 it together with other types of data, uh, and make it a little bit more actionable, a little bit more
0:02:49 easy for the human to digest. Right. Can we talk about reasoning? Yeah. Is there, agentic reasoning,
0:02:53 is it a thing now? It’s a thing, right? Like, I think we’re going to make it a thing, right? Like,
0:02:59 you know, it’s, it’s, uh, it’s definitely a thing. Reasoning is, is another one of these terms that gets,
0:03:04 you know, kind of bandied about quite a bit, right? And, and really what you have is this kind of
0:03:11 different type of model, right? Almost where you have a, I’ll say a traditional LLM or, or an old
0:03:15 school LLM. I don’t know what that means. Like two years ago, five years ago, like you have these old
0:03:19 school LLMs and they’re really good at kind of like this token prediction. They can do the thing where,
0:03:23 oh, okay. I, you know, I’ve had the sentence, complete the sentence and all of these types of
0:03:28 things that are going to image task. And that reasoning models are kind of like this next level and
0:03:35 they’ve been trained and tuned in a very specific way to think almost like think out loud. If you
0:03:39 look at their text and look at their structure where they’re going through and you’ve given them the task,
0:03:43 they’re like, oh, I could do this. I could do this. I could do this. I could do this. And it’s kind of
0:03:47 like when you’re brainstorming with like maybe your colleagues or even your family, right? You’re like,
0:03:51 oh, we could do this. We could do this other thing. Reasoning models have that type of feel to them,
0:03:56 right? And so putting them in a place of a system that’s like, hey, I need to, I don’t really know what I
0:04:00 want to do, but like, I kind of want to make a plan. Here’s my, here’s my loose guidelines.
0:04:06 Reasoning models can very easily explore that space for you and get it to like, here’s some options.
0:04:10 So maybe we can dive into the technology a little bit and you can kind of go through some of the
0:04:14 components involved. There’s different NVIDIA technologies here and just talk a little bit
0:04:17 about each one, what they are, what they do rather, why they’re important.
0:04:23 Yeah. And I think the way you set that up is really great because it is this collection of
0:04:26 technologies, right? When we talk about agentic system, it’s not just like, oh, I just
0:04:29 completely a hundred percent brand new thing that we’re trying to do.
0:04:30 I’ve got this agent named John.
0:04:35 Exactly. Right. Exactly. It’s, and I, you know, you tell people it takes years to be an overnight
0:04:41 success, right? And so it takes years to do what we’re doing in agents and agentic systems right
0:04:46 now. And it starts with something like data ingest, right? And data ingest and being able to retrieve
0:04:50 that information. So if you look at things like Nemo Retriever, right, that’s a great place to start.
0:04:57 We have a capability that now Nemo Retriever doesn’t just ingest text, right? Like unstructured text,
0:05:03 right? Right. Yeah. But we can ingest multimodal documents. So everyone’s favorite document type,
0:05:07 the PDF, right? Everyone loves it, right? A great, great document, easy to work.
0:05:14 If you would have told me, uh, 10 years ago that I’d still be working in PDFs, right? Like,
0:05:19 I don’t know. Right. Uh, but, but anyway, here we are and it can have all of these different
0:05:24 modalities in it. It can have pictures, it can have structured images, which are like tables and graphs
0:05:29 and charts. It can have unstructured images and all of these types of things and extracting that
0:05:33 information and keeping the context. I’ll give you a really simple example where let’s say you have a
0:05:38 graph, a bar graph. Okay. Really simple bar graph. And if you were to just give that to a really kind
0:05:42 of basic computer vision parsing model, it would give you back and be like, yep, I understand that.
0:05:47 I have five rectangles. Okay. It’s like, no, you don’t because the right, just having five rectangles
0:05:53 doesn’t make that interesting. Right. So it’s, it’s the relationship with the five rectangles to them.
0:05:57 It’s the caption that goes underneath it. Right. Is there text somewhere else in this document?
0:06:01 It’s not even on that page that explains that chart. So you have all of these different,
0:06:06 when we talk about the, the, um, retriever and ingestion process, we have all these different
0:06:12 models, these NIMS that go into the pipeline and then we stitch them together to make this type of
0:06:18 parsing possible. So we have models that detect bounding boxes of images. We have models that will
0:06:22 like work specifically on texts. We have models that look at captioning and put all this together.
0:06:26 There’s like three, four NIMS, right. That just go into this one process. And what I love about
0:06:32 retriever is, and you’re going to see the theme, right? It’s built for enterprise quality. And so
0:06:36 if you look at PDFs, you know, if you’re the average person, you have like a couple, you’re
0:06:40 doing research, something like that. You have 10 PDFs, you upload them, you don’t care a lot. It takes
0:06:45 five minutes, you know, whatever, come back and do it. If you’re an enterprise, right? Like, you know,
0:06:50 how many PDFs we generated in video? Like, I think we could run the world on these PDFs that companies
0:06:55 generate and you need speed, you need consistency and you need accuracy. Right. And so if you look at
0:07:00 something like retriever, we’re, you know, 15 times the amount of throughput that you can get in other
0:07:06 like leading systems, right. With a really complicated PDF, we can go, you know, 10 pages
0:07:12 per second, right. Like through on like a single GPU instance, right. And extract this. It’s really
0:07:19 cool that, you know, 50% fewer errors in accuracy compared to other systems. So it all starts, if I
0:07:26 think about agents, it all starts with that part, getting the data from human source into an embedded
0:07:31 kind of source. Right. Right. Right. Uh, should we talk about agent ops tools? Yeah. Agent ops tools
0:07:36 also. So I like the progression that you’re going right here. So I’ve got my data, right? Like, uh,
0:07:40 you know, I’ve got it in there. We’re pretty good at that. Agent op tools, uh, are another way of saying
0:07:44 this is kind of like a flywheel. Yeah. Right. So you’ve got your, your system and we’ll talk a little
0:07:49 more about the system in a little bit. Uh, and you might have your models and what agent ops tools
0:07:55 are really good at is fine tuning, honing, and making that system even more efficient. Okay. Right.
0:08:02 Uh, so, you know, for example, you can through successive iterations of fine tuning through a,
0:08:07 uh, new agent ops tools, you can have a, you know, like a 10 X reduction in your model size,
0:08:12 right. By fine tuning that down, distilling it right a little bit, which obviously translates to
0:08:17 either speed or translates to energy or translates to money, right? These are kind of currencies you
0:08:21 can exchange, right? Depending on what you care about. Almost 4% increase accuracy by doing this
0:08:26 fine tuning model. And the real cool thing about agent ops tools is it just kind of sets there,
0:08:30 right? Like, and you hook it in, you know, you’ve got your inputs, you’ve got your outputs,
0:08:35 you hook it into your outputs, it feeds back into your inputs and it’ll prompt the human. Like
0:08:38 every once in a while, it’s like, Hey, what do you think about this? And you know, those, uh,
0:08:40 you go to any tool and you give it the thumbs up, you give it the thumbs down,
0:08:47 right. There’s that, but what’s even better is you can give it a free form text. Like not just,
0:08:51 did I like this? Or did I not like this? It’s like, Hey, feedback. Yeah. Yeah. This wasn’t
0:08:55 quite like this, whatever. And it keeps that context. And we’ll use that to like steer in a
0:09:00 different, in a different direction. Right. And so it’s useful in a lot of ways, model reduction
0:09:05 size, all that, but then you’ve ingested all this data. You’re using this model. It’s a way to kind
0:09:09 of push or like steer this model a little bit towards your particular use or suite,
0:09:13 this model. Right. If I’m jumping ahead to something that you’re going to cover,
0:09:16 tell me and I’ll back off. But you mentioned accuracy a couple of times. Accuracy, obviously
0:09:23 a big, big thing. Do the reasoning models themselves improve accuracy? And I know it’s not like there’s
0:09:27 one kind of reasoning model that fits everything. Right. But I’m just wondering, as you’ve mentioned
0:09:31 it, like, is that part of what the reasoning model is meant to do? Or is it just kind of a happy
0:09:38 side effect or? Um, you know, it can in a, in a system. Right. And certainly if you look at
0:09:42 reasoning models, there’s, you know, if we talk about NVIDIA’s model, right. Like Lama Nemetron
0:09:47 Reason Super. I might’ve messed up. Lama Nemetron Super with Reasoning.
0:09:54 Super with Reasoning. Yes, exactly. If you look at that model specifically, it does have a higher
0:10:00 accuracy than these other reasoning models, right? Like in its, its kind of system where I think,
0:10:06 or where we see this being advantageous is as part of a, a larger agentic system where I’ve got this
0:10:12 reasoning model and there are good tasks for reasoning models. And I’m just going to say
0:10:16 they’re bad tasks for reasoning models, right? Like just like there’s good uses for a fork and
0:10:20 there’s really bad uses for a fork, right? Like it’s the same for this reasoning model. And with
0:10:24 our reasoning model, with Lama Nemetron Reason, one of the great things about it is
0:10:29 you can turn reasoning on or off. So you’ve got a single model that now operates as a reasoning model
0:10:34 or it can operate as just a regular, like non-reasoning model. Right. And so we talk about
0:10:39 accuracy. One of the really nice things about reasoning models is the ability to iterate that
0:10:43 with them. I’m giving you an example. You’re in a deep researcher type environment and you’re like,
0:10:48 Hey, this is the topic I want. I might upload a few PDFs, right? Like I give it the structure.
0:10:53 I want it to have an intro, a, you know, a conclusion, right? Keep it pretty light. You know,
0:10:57 you give it all this thing. A model can go, a reasoning model can go and say, okay, I’m going
0:11:02 to really quickly generate a lot of, a lot of ideas, a lot of questions that I want to do. I’m
0:11:06 going to present it to you, human. It’s like, this is what I’m going to do. This is like seven tasks or
0:11:10 something like that. Something the human can, is sticky for the human in their brain, right? Like
0:11:13 not a hundred thing. And the human can go through and they can be like, oh, that one looks good.
0:11:18 Can you slightly, that one, I want you to retweak it, right? I want it about this topic or not that
0:11:21 topic. That one looks like it’s going to be too high level. Can you take that more towards a 12th
0:11:26 grade level? Right. And it’s really, in my mind, it’s the improved accuracy that we do see a little
0:11:32 bit in the reasoning models, but it’s the human model connection that lets you really, it becomes a
0:11:38 coworker. These agentic systems become the coworker where you can guide them and you can fine tune them
0:11:43 and you can talk with them and interact with them. So you have the report that you want. And then you do
0:11:47 all of the deep researching. Now you’re in minutes of like, I’m going to do this stuff.
0:11:53 And what, if I take us, I’m going to date myself. If I take us way back, if you remember the internet,
0:11:59 when we used to talk about things like load times and you know what I mean? Like what’s the user’s
0:12:04 perceived load time of this? And we would do things to do that. We now have this ability with the
0:12:09 reasoning model, you can interact with it early. And so when you get to the final report or a draft
0:12:13 report, you haven’t spent five minutes and it’s something you totally different want, didn’t
0:12:20 want. So not only is the actual measured accuracy up, but the perceived accuracy from the human,
0:12:23 the perceived efficiency, because you were involved in the process early on.
0:12:29 Yeah. Fantastic. I feel like that’s a great segue into talking about enterprise, the enterprise,
0:12:35 IT leaders, whoever’s designing, developing the AI applications. What are some of the things that
0:12:40 should be considered by, you know, a tech leader, a CIO when they’re designing an agentic AI system in
0:12:47 particular? We have a couple hours, right? Yeah. So I think the biggest one, right, is that when I talk
0:12:53 with anybody, right, in the enterprise, I remind them, I’ll be like, okay, so what is the agentic system we
0:12:56 should use? And I’m like, I’m going to answer your question, but let me ask you a parallel question.
0:13:03 I’m like, tell me, what is the, you know, tell me, what’s the one piece of software you use in your IT
0:13:06 system? And they’re like, what do you mean one piece of software? We have 15 different vendors
0:13:11 and we do all this kind of stuff. I’m like, correct. I’m like, so you have to think about
0:13:16 agentic systems as the same, right? You’re going to get some from your vendors, right? Whoever that is,
0:13:19 you’re going to have an application, you’re going to have your CRM, you’re going to have all these
0:13:24 different things for your developers. They’re going to put their own agents in that, right? And you’re
0:13:28 going to work with those. Some are going to be homegrown by you, right? Because everyone’s moving
0:13:33 really fast in this space. You have your own enterprise data with your own sources. If I
0:13:37 can, if I can talk about a very specific NVIDIA example that I tell everyone, we have this thing
0:13:43 called Envy Bugs Pro AI Search. And I like it for two reasons. One, it uses AI agents and two, the name
0:13:47 just rolls right off the tongue. Totally. Right. At least when you say it. Envy Bugs Pro AI Search,
0:13:52 it’s just like, that one sticks in my brain. Not, not the, not the, not the order of super or reason,
0:13:57 right? But not that one. Right. But the thing I like about this is, is the, our IT, our great IT
0:14:03 department, who are much more engineers than the average IT department, created this. And they
0:14:08 created it very early on in, in LaneChain’s existence. And then they, they modified a little
0:14:13 bit. And now we’re here, fast forward six months. And we want to not only use it with Envy Bugs Pro,
0:14:19 but we want to hook it up to coding repos and forums and CRM systems and all of that. And what that means
0:14:25 is that now you’re in this similar situation where I might have various different vendors and stuff
0:14:30 that I’ve grown myself. So when I talk with them and say, what you’re really looking for is you need
0:14:35 to look at it in that same context. It’s not, I’m going to buy one piece of software. I’m going to write
0:14:39 one piece of software. You’re going to have all these agents working together. And the trick is,
0:14:45 how do you let them all come together, mesh together in a somewhat seamless way for your employees?
0:14:51 So when I log into our systems, or if you log into our systems, it’s context aware. It gives
0:14:56 me the information that I need. It helps me do my job. I look at it like that. It’s a, look at it in
0:15:04 the, in the traditional IT deployment sense. Right, right. How does that differ from what enterprise
0:15:10 IT departments and developers were doing, you know, sort of pre-LLM, pre-AgenticA, that kind of thing?
0:15:14 Is it, to me, being a little bit on the outside thinking about it, it sounds sort of like,
0:15:18 oh, you’ve got different vendors and I know it’s like an app store and CRM systems have their own
0:15:23 apps. Everything’s got their own, like, you know, place where you can go grab apps. Are we moving
0:15:29 towards that for agents? You know, I mean, I think so. Yeah. I mean, there definitely will be. I mean,
0:15:35 they’re out there already. Yeah. Right. You can, everyone has their own and, and that’s fine. I think the
0:15:41 difference is, I say, you know, when we had all these apps that in the, in the, I guess in, in the
0:15:47 before times, I guess, when we had all these apps, uh, we were in data silos and now it, uh, it’s very,
0:15:52 it could be possible. We’re in these agentic silos, right? Like a little bit where I have data here and
0:15:57 I interact with that agent to talk with us. The big difference though, is we’re in a situation now
0:16:02 where we’re not always going to have API to API access, right? I don’t have to necessarily always
0:16:10 have a developer script and code that I can have CRM agent over here talking to Confluence wiki agent
0:16:17 over here. And they are communicating not via an API that a developer set up. They’re communicating
0:16:23 via human language, right? Like with each other. Right. And that kind of makes connection. It kind
0:16:28 of eases the connection points. It raises some, some things you have to look at, right? Like in how are
0:16:32 we monitoring these systems? How are we observing these systems? Right. But that is one of the key
0:16:37 differences is you’re going to not just have API to API to API specs. You’re going to have some of
0:16:42 that. And then you’re going to have these agents that are just communicating with themselves on your
0:16:48 behalf. Right. What kind of complexities does that raise, or perhaps doesn’t raise when it comes to,
0:16:52 you know, data security? And obviously this all runs on data, like you said at the beginning,
0:16:56 right? And just the data. So if you’ve got different agents and there may be from different
0:17:00 vendors and they’re doing, how do they talk to each other in a secure manner?
0:17:06 It’s not a hundred percent solved yet, right? Like, but where it’s headed is this idea that we call
0:17:11 context-based security. And so if you go back from the history and you’re like, you know, security really
0:17:16 started with, let’s say firewalls, right? And what was the motion? I’m going to put everything in this
0:17:20 circle and I’m going to put my hands around it. And then there I secured it. And then we did this thing
0:17:25 where we, I don’t know if you, I don’t know if you remember it, we moved to the cloud or I think we did,
0:17:29 right? Like we moved to the cloud and then it was like, oh, oh, I can’t put my hands around it
0:17:33 anymore. Now I’m application-based security, right? Like, cause I have pieces everywhere.
0:17:37 And now we’re in this motion where every, there’s cloud, there’s on-prem, we still have all of that.
0:17:43 But when I am accessing a system, like I said, versus maybe our CFO accessing that
0:17:49 same system, we are looking at it in a different context, right? And the information we want out of
0:17:53 it. So we’re moving to this context-based type of security where you not only have to understand
0:17:58 the person and the credentials and do all that stuff that you are already doing or supposedly doing,
0:18:03 right? Like RBAC should be there, right? If it's not there, we have a more basic problem.
0:18:03 Right.
0:18:06 But you have to look at the context in which the question is being asked. Like look at the things
0:18:12 around it, look at what pieces of information are coming with it, do pieces of security before that
0:18:17 question is answered, and then do an analysis before you return the information to the user. And you can look at
0:18:23 that context. So it does, I don’t want to say complicates it, but it does add about maybe 10%
0:18:29 new stuff that we weren’t doing before to roughly 90% of just like regular security applications.
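[Editor's note: an illustrative sketch of the "context-based security" idea described above: the classic RBAC check teams already run, plus the newer check on the context of the question itself before an answer is released. The policy rules, fields, and names here are invented for illustration.]

```python
from dataclasses import dataclass

@dataclass
class Request:
    user: str
    roles: set      # input to the classic RBAC check
    question: str   # the natural-language query an agent is carrying
    channel: str    # e.g. "internal-chat" vs "public-api"

SENSITIVE = ("salary", "forecast", "unreleased")

def rbac_allows(roles: set, question: str) -> bool:
    # The roughly-90% that security teams are already doing: role checks.
    if "forecast" in question.lower():
        return "finance" in roles
    return True

def context_allows(req: Request) -> bool:
    # The roughly-10% that is new: inspect the question and what surrounds it
    # *before* any answer is generated or returned.
    if req.channel == "public-api":
        return not any(t in req.question.lower() for t in SENSITIVE)
    return True

def answer(req: Request) -> str:
    if not rbac_allows(req.roles, req.question):
        return "Denied by role policy."
    if not context_allows(req):
        return "Denied by context policy."
    return "(answer produced, then post-checked before release)"

# The CFO asking internally passes both checks; the same question arriving
# over a public channel is stopped by the context check even though the
# role check passes.
print(answer(Request("cfo", {"finance"}, "What is the Q4 forecast?", "internal-chat")))
print(answer(Request("analyst", {"finance"}, "What is the Q4 forecast?", "public-api")))
```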
0:18:34 I'm speaking with Bartley Richardson. Bartley is the Senior Director of Engineering and AI
0:18:40 Infrastructure at NVIDIA. And we've been talking agents, agentic AI, a little bit of cybersecurity
0:18:46 snuck in there. But that's Bartley's domain. He leads agentic AI and cybersecurity AI
0:18:52 here at NVIDIA. And we’re talking about agentic AI and the enterprise. To shift gears a little bit
0:18:57 and kind of talk about business needs, designing one of these systems: you know, are there
0:19:04 best practices or things that you've seen that work well when it comes to making sure
0:19:10 that you’re designing with the business care abouts in mind and then going back and kind of rechecking
0:19:11 keeping that, that North Star?
0:19:16 Well, I think, yeah, some of it is kind of what we've already talked about,
0:19:21 right? Like, you know, building on these strong foundations of not just tooling and ingest, but,
0:19:25 you know, models. And so there’s all of that, right? Like, obviously. One thing that we haven’t
0:19:31 talked about explicitly yet is this notion of traceability and observability and profiling.
0:19:37 Right. You mentioned it briefly. Yeah. And you have to imagine, if we go back and we have
0:19:41 this kind of distributed system, there’s all these agents that are connected with other things. They’re
0:19:45 talking in various modalities faster than a human can understand from different vendors and different,
0:19:50 even agentic framework providers, right? You might have someone in your business who wrote this
0:19:54 thing in LangChain, and someone wrote this thing in CrewAI, and someone wrote this, you know,
0:19:58 there’s all these different things. Yeah. So how do you have a holistic kind of traceability and
0:20:04 observability platform across all of that? And that becomes a little challenging, and it's
0:20:10 why we made this new thing. It's called Agent IQ, and it's not an agentic framework. There's plenty of
0:20:14 those. I haven’t checked. We’ve been talking for, you know, a few minutes. There probably are 10 more that
0:20:18 popped up while we were talking, right? We don't need another one of those. Frameworks are the new
0:20:23 prompt engineer. Right. Yes, exactly. But it's not one of
0:20:31 those, but rather it’s a really simple, as much as necessary, as little as possible way to get all
0:20:37 of these frameworks and tools and everything to work together and be observable in the same place.
0:20:41 Right. Being a CS person and coming from an engineering background, the way I tell people
0:20:46 is I’m like, again, we’re really good at complicating things. We got agents and tools and all these things.
0:20:52 I’m like, no, everything’s a function call. And so what agent IQ does is it, it lets you use the
0:20:56 frameworks that you were using and you still develop in those frameworks. Like I was saying,
0:21:00 you probably already have stuff. Don’t rewrite that code. Let’s just hook it up to other things
0:21:04 that you have, but let's do that in a way where we reduce everything to a function call. And that
0:21:10 lets you say, oh, I've got this agent pipeline here. And now I want to add a capability to it.
0:21:15 Well, I can add it on or I can wrap it in something else. So now I have an agent inside of an agent.
0:21:20 It allows this nesting kind of capability. And what it gets you is this really cool traceability.
0:21:27 So you can go into every tool call, every LLM call, every chain of tool calls. You can look at the input
0:21:33 tokens, the output tokens, the time it took to do the tool call, the sequence of actions of the
0:21:38 tool call. And we have customers that are already seeing, you know, they look at the timing
0:21:44 charts and they'll optimize their tool-calling chains and they get a 15X speedup, right, through
0:21:48 their pipeline. Right. Or they'll get, you know, a 5X improvement in
0:21:54 accuracy by moving things around. And it's that type of information that is one of the reasons
0:21:58 that something like Agent IQ exists.
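[Editor's note: a toy sketch of the "everything is a function call" idea, not the actual Agent IQ API. Agents or tools written in any framework are wrapped as plain callables, which makes them nestable (an agent inside an agent) and lets a tracer record tokens and timing for every call, the raw material for the timing charts mentioned above. All names here are invented.]

```python
import time
from functools import wraps

TRACE = []  # one record per tool / LLM / agent call

def traced(name: str):
    """Wrap any callable so each invocation is recorded."""
    def deco(fn):
        @wraps(fn)
        def wrapper(prompt: str) -> str:
            start = time.perf_counter()
            out = fn(prompt)
            TRACE.append({
                "call": name,
                "input_tokens": len(prompt.split()),   # stand-in for real token counts
                "output_tokens": len(out.split()),
                "seconds": round(time.perf_counter() - start, 6),
            })
            return out
        return wrapper
    return deco

@traced("summarizer")     # imagine a LangChain chain behind this callable
def summarize(text: str) -> str:
    return "summary of: " + text

@traced("researcher")     # imagine a CrewAI crew behind this one
def research(topic: str) -> str:
    return "findings on " + topic

@traced("report_agent")   # an agent nested inside an agent: also just a function
def report(topic: str) -> str:
    return summarize(research(topic))

report("GPU supply trends")
for record in TRACE:      # inspect the chain to decide what to optimize
    print(record)
```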
0:22:03 Yeah, that's fantastic. We've got a few minutes left. Is there an area that we haven't covered that you want to go into in particular? Yeah. I mean, the thing I
0:22:08 think I would end on, right, if we're talking about agents, is I feel like as an industry,
0:22:12 we've been talking about them for a really long time, but we're pretty new,
0:22:17 right, to them and their actual usability, and how we're kind
0:22:22 of getting them into, like we said, enterprise scenarios. I think what's incredibly
0:22:26 interesting is these use cases that we have. And some of them we have here in booth demos,
0:22:30 right here at GTC this year. So I'm sure afterwards we'll have those where
0:22:36 people can consume them as well. But we have this one that is automating a lot of what happens
0:22:41 in the kind of feature-request flow: what issues is a customer having with my product,
0:22:46 all the way through writing a PRD for that, a product requirements doc, all the way to assigning,
0:22:50 you know, who might be the best engineers for doing this. I think the coolest thing that I've
0:22:55 seen as part of that is we use a reasoning model to say, oh, here's some issues
0:22:59 your customers are talking about on forums, or here are, you know, tickets they filed, right?
0:23:04 Here’s a set of brainstorming questions that I reasoned through, go have a meeting. And then the
0:23:07 humans get in the room, they have the meeting, they talk it through, they might draw some diagrams and
0:23:11 that kind of stuff, and that goes back into the system. So think about this as a big human-in-the-
0:23:17 loop step. What comes out the other side? A fully formed PRD, where we've
0:23:22 taken the language and the action items from the actual team’s call. We’ve taken pictures of the
0:23:28 whiteboard. It produces the diagrams and then it gives you a PRD that then, you know, if it gets you
0:23:33 75, 80% of the way there, that's fantastic.
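[Editor's note: a schematic sketch of the human-in-the-loop PRD flow just described: customer issues go in, a reasoning model produces brainstorming questions, the humans meet, and their notes and whiteboard photos feed a mostly-done draft. Every function and field name here is invented; each stage stands in for a model or service call.]

```python
def brainstorm_questions(issues: list[str]) -> list[str]:
    # Stage 1: a reasoning model turns raw customer issues into meeting prompts.
    return [f"How should we address: {issue}?" for issue in issues]

def human_meeting(questions: list[str]) -> dict:
    # Stage 2: the human-in-the-loop step. In reality the questions drive a
    # real meeting; here we return canned notes and whiteboard photos.
    return {"action_items": ["support bulk export"],
            "whiteboard_photos": ["whiteboard_1.png"]}

def draft_prd(issues: list[str], meeting: dict) -> str:
    # Stage 3: synthesize the meeting output into a ~75-80%-done draft
    # that a human then finishes.
    return ("PRD DRAFT\n"
            f"Problems: {', '.join(issues)}\n"
            f"Action items: {', '.join(meeting['action_items'])}\n"
            f"Diagrams regenerated from: {', '.join(meeting['whiteboard_photos'])}")

issues = ["export times out on large datasets"]
print(draft_prd(issues, human_meeting(brainstorm_questions(issues))))
```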
0:23:38 That's great. 'Cause, you know, I'm sure you do your fair share of writing, right? The hardest part for me about writing is that
0:23:43 blank page. Totally. Right. And if I can get something that’s 80% of the way there, it’s great.
0:23:49 And that’s the point I would leave on is agentic systems and models will make mistakes, right? Like
0:23:54 they will never be a hundred percent accurate. I would challenge you to find anything
0:23:58 that’s a hundred percent accurate. Right. But the way to think about them is like, look,
0:24:06 the human will be in the loop. And if it gets you 60, 70, 80% of the way there, that’s amazing.
0:24:09 Yeah. I mean, that’s 70% of work you don’t have to do. It’s there.
0:24:15 Exactly. And that’s the part that I think is incredibly compelling and we should focus on
0:24:19 accuracy and we should always try to make things better. But I never want to lose sight of,
0:24:24 look at the 70%, right, that we did, and how are we going to continue to make it better?
0:24:28 Of course. Absolutely. All right, Bartley, before we let you go, a little change of pace,
0:24:34 but in your own daily routine, your own daily work, are there AI-powered tools that you're using every
0:24:38 day that, you know, you can recommend? It can just be a category or a specific tool. What are you using
0:24:43 that you like? Yeah. I’ll preface this by saying I have no official affiliation with any of these.
0:24:47 Some of them I pay a lot for, right? But other than that, right? No official affiliation. Of course,
0:24:52 I think like a lot of people, you know, AI-powered search engines, whether that's something
0:24:59 like Perplexity, right? I use that a lot. I use ChatGPT, the reasoning model in ChatGPT,
0:25:02 like quite a bit, like that one’s fantastic. If I’m trying to like research things or if I,
0:25:05 you know, I just have something that I wanted to think about just a little bit more.
0:25:10 Of course, that was before our reasoning model came out and before AIQ was out there,
0:25:15 which is our deep researcher. So those are kind of the obvious ones. I don't get to code as much, you know,
0:25:19 my teams don't let me code as much as they used to, and they shouldn't. That's smart. That's
0:25:25 smart of them. But I love Cursor as a coding tool, and even just for making diagrams. It's really good
0:25:29 at that. The amount of times that I just hit tab and I’m like, wow, get out of my brain, right? Like
0:25:35 I just hit tab. The other one that I use a lot is called Napkin, napkin.ai. And that's one that I
0:25:39 don't know if a lot of people have heard of. Yeah. I haven't heard of that one. Yeah. It's in beta right now.
0:25:44 It’s free. Again, I have no affiliation with them and I love their tool. What you can do is you can
0:25:49 either just type free-form text, or you can do a Perplexity-style prompt and it'll research things
0:25:54 for you. And it will take, let’s say you have a process diagram or a complex tree or whatever,
0:26:00 and it turns that into the associated diagram and the infographic for you. And then what I love about
0:26:05 it is it gives you kind of pre-formatted options to start from. Do you want this to be
0:26:09 a funnel? Is this a process tree? Is it the cycle and all this in different color options?
0:26:13 And you’re like, yeah, I want that. And then you can go in and you can change anything. You can move
0:26:17 the text. You can change the text. You can move the shapes and all that. So I talk about the 70% there,
0:26:24 right? I would toil away at these in PowerPoint for so long, right? But now, and it's getting better,
0:26:29 I can just describe it to Napkin, right, back of the napkin. Right. And it produces this great SVG
0:26:31 or PNG diagram, right? And so it’s fantastic. It’s awesome.
0:26:37 Fantastic. Bartley, for folks listening who want to find out more about all the things you’ve been
0:26:43 talking about, all the work you and NVIDIA are doing with the enterprise, where's a good place to go
0:26:49 online or places to get started? Yeah. A great place to go is build.nvidia.com, right? So that’s a great
0:26:54 place to get started. You can see all of our models, all of our blueprints, which are just kind of like
0:26:59 end-to-end examples. So if you go to build.nvidia.com, ai.nvidia.com takes you to a similar place,
0:27:03 right? If you’re interested in this new thing that we were just talking about, which is called
0:27:12 Agent IQ, we’ll play on Agentic, right? It’s Agent IQ. Oh, yeah. Right. Yeah. It was clever. But if
0:27:19 you’re interested in that, that is open source software on GitHub. So github.com slash NVIDIA
0:27:24 slash Agent IQ. Great. But it'll also all be linked off of build.nvidia.com. Yeah. Great.
0:27:28 Bartley Richardson, thank you so much for taking the time out. Best of luck this week. Have a great show.
0:27:32 And in all you’re doing, I mean, we can catch up and do it again down the line and, you know,
0:27:36 see if we’re still talking about PDFs in a year or two. Oh my goodness, Noah. Yes, we’d love to.
0:27:37 We’d love to.
0:27:37 We’d love to.
0:27:37 We’d love to.
0:27:37 We’d love to.
0:27:39 We’d love to.
0:27:39 We’d love to.
0:27:40 We’d love to.
0:27:41 We’d love to.
0:27:42 We’d love to.
0:27:43 We’d love to.
0:27:45 We’d love to.
0:27:49 We’d love to.
0:27:51 We’d love to.
0:28:14 We’d love to.
0:28:26 We’d love to.
Bartley Richardson, senior director of engineering and AI infrastructure at NVIDIA, discusses the transformative potential of agentic AI as the next level of automation and introduces the NVIDIA Agent Intelligence toolkit, which ensures seamless integration and observability across multi-vendor agent systems.


