NVIDIA’s Rama Akkiraju on Building the Right AI Infrastructure for Enterprise Success – Ep. 255

AI transcript
0:00:16 Hello, and welcome to the NVIDIA AI podcast. I’m your host, Noah Kravitz. As AI continues
0:00:22 to evolve, organizations need to think carefully about how best to develop and integrate scalable
0:00:28 AI systems within their existing infrastructure. This is where the AI platform architect becomes
0:00:33 so important. It’s a role that’s part technical expert, part strategic thinker, and critical to
0:00:38 driving AI innovation and transformation within a company. With us to explore the importance of the
0:00:45 AI platform architect and AI infrastructure in the enterprise more broadly is Rama Akkiraju. Rama is
0:00:52 VP of IT for AI and ML at NVIDIA, where she leads AI and ML initiatives for enterprise use cases.
0:00:58 She’s a former IBM fellow who worked on the Watson artificial intelligence project as part of her two-plus decades
0:01:04 at IBM. Rama has also been honored as a top 20 woman in AI by Forbes and A-team for AI by Fortune
0:01:10 Magazine, among numerous other accolades. Rama, it’s a pleasure to have you here. Welcome to the
0:01:15 NVIDIA AI podcast. Thank you so much, Noah. Thanks for having me here. Thank you for taking the time.
0:01:20 So maybe you could set the stage for the audience by talking a little bit about what your role is.
0:01:26 What do you oversee? What are your teams working on? And what are some of these enterprise use cases
0:01:31 that you’re leading the delivery of? Sure. I lead the enterprise AI and automation team at NVIDIA,
0:01:39 where I drive AI adoption across the company, enhancing developer productivity, IT operations,
0:01:45 and enterprise workflows. So as part of this effort, my team and I build chatbots, co-pilots,
0:01:52 AI agents for improving our own employee productivity and developer productivity. We also build AI platforms that
0:01:57 are enterprise-grade for everyone in the company to use. So let me give you a couple of examples of the
0:02:03 kind of things that we develop. There is one that sits on our intranet, called NVIDIA Info,
0:02:10 a chatbot that answers employees’ questions about many things related to the company: internal documentation,
0:02:16 company policies and financial information and so on. And then we build chatbots and
0:02:22 co-pilots that sit within our developer productivity tools like our bug management tools. And we also work
0:02:28 with our sales, supply chain, marketing, finance, and other teams to help them build their own
0:02:35 transformation and AI projects and such using the platforms that our team builds. We built a generative AI platform
0:02:41 in the company and everybody can use it both with APIs and low-code, no-code type of interfaces to build
0:02:45 their own generative AI solutions and deploy them for various other use cases.
0:02:48 And you’ve been with NVIDIA just about three years now?
0:02:49 Not yet. Two and a half.
0:02:56 And so prior to that, you were at IBM for quite a while and worked on some amazing things. Can we take
0:03:01 just a second and maybe speak to your background a little bit and what led you to come to NVIDIA?
0:03:10 Sure. Well, I worked at IBM Research and IBM Product Divisions for several years. And throughout my
0:03:15 career, I have been actually applying AI for solving different kinds of problems. More recently, just
0:03:21 before coming to NVIDIA, I led AIOps, which is applying AI to IT operations management,
0:03:27 the product suite development in the IBM Watson division. And prior to that, I led a pre-ChatGPT
0:03:34 natural language understanding suite of services that were part of IBM’s Watson.
0:03:40 My passion and interests have always been about solving real-world problems with AI.
0:03:48 When the opportunity came from NVIDIA to look at all of the enterprise-related use cases, and,
0:03:53 you know, in the new era, ChatGPT wasn’t there yet, it arrived within a few months
0:03:59 after I came, but the previous versions of GPT were all there. So we were at this inflection
0:04:05 point where AI could really now deliver on the promise that it had all along, that people,
0:04:10 you know, talked about, but now it was about to happen. And this role and this opportunity to
0:04:15 look at pretty much all the use cases in a company and apply AI to solving them and to transforming them
0:04:21 and to rethink the way problems are solved and business processes are done and implemented was
0:04:22 just too exciting to pass up.
0:04:27 Right. It sounds like it hit that sweet spot of the leading edge research, but being applied,
0:04:32 as you said, to solving real problems. That’s great. So you alluded to this a little bit. I’m just
0:04:37 talking about your own background, but can you talk a little bit about the evolution of AI kind of,
0:04:41 you know, talking about machine learning and generative AI, as you alluded to, and some of the
0:04:46 things going on now in the enterprise. But with that lens of the enterprise, what this
0:04:49 evolution of AI over time has meant and means for the enterprise.
0:04:55 Yeah. It’s been fascinating. Actually, you know, since I’ve been in this space before natural
0:05:01 language processing and understanding really even kind of took off, you know, from grad school days on,
0:05:08 we, you know, we were looking at perception AI, right? Where computer vision models are running on,
0:05:14 you know, whatever available compute at the time, and then mostly applied to structured data with data
0:05:19 mining as the main discipline that, you know, was in vogue. And then statistical machine learning was
0:05:24 slowly starting to make an impact in the enterprise, which was being actively applied for, you know,
0:05:28 various use cases for supply chain and, you know, those types of use cases. So that’s the era of
0:05:33 classic ML or perception AI, if you will. Computer vision was there, but, you know, not enough data,
0:05:39 not enough compute and all of that. Right. Cut short now to generative AI era that happened with the
0:05:45 generative pre-trained transformers, the GPT models and the generative AI era, where the language
0:05:52 understanding really took off, right, with even multimodal inputs, not only text, but images and all of that.
0:05:57 So that really unleashed a whole slew of other use cases in the enterprise where you can now start to tap into
0:06:03 the unstructured data, which was pretty much off limits for any kind of automated insight derivation before,
0:06:09 which was 80 percent of enterprise data. So you go from doing data mining only on structured data,
0:06:15 which constitutes about 20 percent of enterprise data, to now opening it up to all data, including the unstructured data.
0:06:20 Can you give an example kind of from the enterprise of the difference between structured and unstructured data?
0:06:27 Yeah, sure. So structured data is, you know, everything that you are capturing in your, you know, database tables,
0:06:34 transactions, your purchases and your sales data and all of that. Right. Whereas unstructured data is, you know,
0:06:38 you have all this, hey, this is how my company runs. This is a documentation about my product.
0:06:46 And these are my notes from this meeting. And so all of this information that sits in, say, you put them in Google,
0:06:51 on Google Drive or SharePoint or Confluence pages or, you know, they’re sitting there.
0:06:56 But humans have to read and process them and then create some kind of structured tables if you have to
0:07:00 derive some insights from them. And that was a laborious process.
0:07:05 And it lay untapped, in a way, because we couldn’t tap into it and we couldn’t question it, we couldn’t get insights from it easily.
0:07:09 It was all kind of manual effort and people spend a lot of time reading it.
0:07:13 You know, if you have to go to do a customer presentation, you would have to read so many documents.
0:07:18 You have to go find where all that information is in the enterprise and pull all of them and create some succinct summary
0:07:21 that pertains to what you need to prepare for this particular customer meeting. Right.
0:07:29 So all of that you could now imagine with, you know, tools like ChatGPT or even DeepSeek models, you can just, you know, ask it.
0:07:33 You upload a few of the documents and say, create me a nice summary for this particular customer meeting,
0:07:36 given that these were the meeting minutes from the last meeting. Right.
0:07:45 Your productivity shoots up significantly because you are able to now tap into all of that information. That’s one example with unstructured data.
0:07:54 Yeah. So from that classic ML, perception AI, to generative AI, which is where the majority of the use cases are being explored, with large language models,
0:08:00 with multimodal inputs even, where you can also include PDF documents and PPTs and those sorts of things to derive insights.
0:08:08 To now we are already in agentic era where, you know, people are talking about AI systems and models that can reason and plan and act autonomously.
0:08:14 Right. And they can integrate into various tools, workflows, and you can automate things.
0:08:21 So examples being, you know, you could say I have these five what-if scenarios to run in my supply chain planning.
0:08:31 Do them and tell me if there is a, you know, disruption to my supply chain from Hong Kong for this week because of whatever that’s happening, geopolitical situations.
0:08:32 Right.
0:08:40 What changes do I need to make, let’s say you would start with those kind of what-if scenarios and the system would do planning and figure out all of those things.
0:08:55 In fact, you could even go further upstream and say, what should be the what-if scenarios that I should run in order to get to the best allocation of my demand, or, you know, to ensure that my supply meets the demand.
0:09:01 Given these particular set of situations that are unfolding around me, between my supply network partners.
0:09:07 So that is even before you can run the what-if scenarios, these AI systems can help you plan and reason.
0:09:09 So that’s an agentic AI example in the enterprise.
0:09:15 And then, of course, the physical AI with more sensors and there you have digital twins types of things.
0:09:26 If you’re modeling your data centers and you can have a, or even your inventory warehouse, you can monitor everything, rack to shelves, to things that are sitting on it and have full monitoring.
0:09:31 So with physical AI, that takes it to the next level combined with agentic AI and generative AI.
0:09:39 So there is the AI evolution, while it took maybe 25 to 30 years from perception AI to get to generative AI.
0:09:42 To go from generative AI to agentic AI, it took two years.
0:09:45 And from agentic AI to physical AI, already work is underway.
0:09:48 So it’s having a huge impact on the enterprise.
0:09:57 And, you know, if I have to summarize the impact on the enterprise, it’s not just summarization and, you know, doing things the same way, but automating things a little bit more.
0:09:58 It’s no longer that.
0:10:03 It’s actually enabling us to fundamentally rethink the way we do things and write business logic.
0:10:05 Because you can now do more personalization.
0:10:07 You can now do more reasoning and planning.
0:10:16 So even the things that you weren’t able to do before are possible now, and multiple business processes can all be combined into a single process that’s a lot more efficient.
0:10:19 So that’s why people are even saying, you know, SaaS may be dead.
0:10:24 The business applications need to be rethought because the logic now starts to go into the AI layers.
0:10:31 And the business logic starts to expand into the AI layers, leveraging the AI capabilities, right?
0:10:33 So we need to rethink.
0:10:36 So the enterprises are up for huge transformation.
0:10:39 And we’re only at the beginning of that rethinking journey.
0:10:54 And it requires not only reformulating the use cases as they are known, but rethinking processes fundamentally, and also upskilling and reskilling in the enterprise, and looking across all of the business functions: HR, IT, finance, legal, marketing, sales, everything.
0:10:56 All of those functions are up for transformation.
0:11:13 And in doing so, to enable everybody to explore, experiment, and build faster, building platforms is going to be a very critical aspect of it, because the platforms are the ones that enable you to quickly leverage these capabilities; the stacks are still very complex to build and to test and all of that.
0:11:22 So the more they’re fully baked and tested, the easier it becomes for everybody to really leverage the capabilities to transform their use cases.
0:11:24 Right. So much has happened.
0:11:26 And I was thinking of it as you as you were speaking.
0:11:27 I didn’t want to interrupt.
0:11:38 And I also always hesitate to speak to somebody who’s in the thick of the technical building, the things that we’re talking about, the technologies to say, you know, well, from the outside, it seems like it’s moving so fast.
0:11:46 But, I mean, you said it two years, give or take, from, you know, after a 30-year gap from perception to generative, that two years to agentic just really seems like a blur.
0:11:52 I want to ask you to get into some of the details about developing agentic AI applications for the enterprise.
0:11:58 But first, and this is kind of going up a level again, so if it’s not a fair question to ask, that’s fine.
0:12:12 But can you kind of quantify for the business user, the technology leader at an organization who’s been hearing about agentic AI and understands, you know, oh, it can reason, it can plan and do these more complex tasks.
0:12:29 But is there a way to sort of talk about how big of a step forward of a leap, you know, going from entering a prompt into a chatbot with an LLM behind it and going kind of step by step like that to using agentic AI and talking about some of the workflows and applications that you’re working with in the enterprise?
0:12:32 Yeah, it’s a non-trivial process, definitely.
0:12:36 And we need more tools and automations to simplify that to get there.
0:12:38 But enterprise data is complex, right?
0:12:39 Let’s start there first.
0:12:44 If you look at enterprise data, structured and unstructured data are the two sets that we talked about.
0:12:48 There is something that’s a combination, semi-structured data. And all of that is human-generated data.
0:12:56 There is machine-generated data, right, like the logs, tickets, alerts, metrics, and all of the IT systems and infrastructure and others generate.
0:13:06 So, first of all, enterprise data is complex, and moreover, the data tends to be kind of very distributed in different places, and there is access control permissions that one has to deal with and all of that.
0:13:25 So, going from a prompt, where you put something in a prompt, to really getting value out of it requires a full-blown, multi-layered stack, right, starting with enterprise data ingestion pipelines, which ingest all of the enterprise data so that we can enable this fresh enterprise data
0:13:31 and make it available to LLMs to use, because LLMs are trained on public domain data.
0:13:33 They don’t know anything about your enterprise data.
0:13:38 So, we need to leverage all of the enterprise data and still leverage the goodness of LLMs, right?
0:13:52 And the way to do that is there are, you know, techniques like retrieval augmented generation, where you essentially load all of your enterprise data into these vector databases, where all this unstructured data is vectorized, and structured data can still stay in the structured tables.
0:14:16 And then you put these AI pipelines on top, which, at inference time, go retrieve the relevant documents for a given question or an insight that you need to generate, either from the unstructured data, through the vector database pipeline, or from the structured data, through talk-to-your-data types of pipelines that generate text-to-SQL, like talking to your supply chain data.
0:14:24 So, that is one way where you can make these pipelines have access to fresh enterprise data.
0:14:37 So, first you have to solve for all of the things related to that, which includes continuous ingestion of your data into the right kind of databases, vector databases, and so on, so that you can do the information retrieval and insights generation from it.
0:14:41 And that requires role-based access control management and all of that.
0:14:43 So, that whole layer needs to be managed and built.
0:14:44 Right.
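The ingestion-and-retrieval layer described above can be sketched in miniature. This is a toy illustration rather than NVIDIA’s actual platform: the bag-of-words embedding stands in for a real embedding model, the in-memory list stands in for a vector database, and the sample documents are made up.

```python
import math
from collections import Counter

def chunk(text, size=40, overlap=10):
    """Split a document into overlapping word chunks (a toy chunking policy)."""
    words = text.split()
    step = size - overlap
    return [" ".join(words[i:i + size]) for i in range(0, max(len(words) - overlap, 1), step)]

def embed(text):
    """Stand-in embedding: bag-of-words counts. A real pipeline would call an embedding model."""
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# "Ingest": chunk each document and store (chunk, vector) pairs in an in-memory index.
docs = ["Expense reports must be filed within 30 days of travel.",
        "GPU clusters are reserved through the internal scheduling portal."]
index = [(c, embed(c)) for d in docs for c in chunk(d)]

def retrieve(question, k=1):
    """Return the k chunks most similar to the question."""
    q = embed(question)
    return [c for c, v in sorted(index, key=lambda p: cosine(q, p[1]), reverse=True)[:k]]

print(retrieve("How do I file an expense report?"))
```

A production pipeline would run this continuously, with role-based access control enforced at ingestion and retrieval time, as described above.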
0:14:59 Then the actual accuracy of insights generation from unstructured data is in itself still very much an art, where you have to tune various parameters, like how do you process the data, how do you chunk it, and how good is the retrieval relevancy?
0:15:04 How are you going to re-rank the results that you get, and what embedding models do you use?
0:15:06 Are LLMs doing the right thing or hallucinating?
0:15:08 Which model is the right model for me and all of that?
0:15:10 So, that whole layer needs to be put together.
0:15:10 Yeah.
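The re-ranking step mentioned above can likewise be sketched. The term-overlap scorer below is only a stand-in for a real cross-encoder re-ranker, and the candidate texts are made up.

```python
def rerank(question, candidates):
    """Second-pass re-ranking: score each candidate jointly against the question.
    A real system would use a cross-encoder model here; this toy scorer counts
    shared terms, normalized by candidate length."""
    q_terms = set(question.lower().split())
    def score(c):
        c_terms = set(c.lower().split())
        return len(q_terms & c_terms) / len(c_terms)
    return sorted(candidates, key=score, reverse=True)

# First-pass retrieval might return these in the wrong order; re-ranking fixes it.
candidates = [
    "the policy portal lists every policy the company has ever published",
    "vacation policy means employees accrue vacation days monthly",
]
print(rerank("what is the vacation policy", candidates)[0])
```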
0:15:18 Then, once you have the initial set, you need to make sure that you have the full stack of testing, validation, pipelines, and all of that built out.
0:15:26 Then you need to make sure that no overly shared sensitive documents are lying around in the company, which could accidentally be, you know, revealing some sensitive information.
0:15:26 So, you have to solve for that with enterprise content security kinds of platforms.
0:15:43 So, you have to build out this whole stack of platforms, and only then you are able to now build your agentic workflows on top of this stack, you know, with all of the enterprise controls that passes your security with the right kind of guardrails and everything.
0:15:50 So, the stack is pretty complex, and that’s where the architecture and the platform story comes in, really, to build that up.
0:16:02 And so, pretty much right now, every company is having to either build that platform or get it from some kind of a vendor who supplies that platform, but still, they have to hook it up with all their enterprise data because that nobody can do it for you.
0:16:09 You have to find the right kind of hoses and the plug points to connect to them, right, so that the data starts flowing into the stack.
0:16:12 And that’s when you start building the workflows; then you have to test them.
0:16:13 The accuracy has to be good.
0:16:18 That’s when you then do the pilots and then do the deploy and then start to observe the usage.
0:16:30 Sometimes what you build needs to be embedded, right, as co-pilots into the development environments or whatever environment where the users are doing their work, for them to really be able to access these tools without friction, right?
0:16:36 Otherwise, if it’s something that changes their workflow significantly, they won’t use it because they have to go out of their way.
0:16:48 So, you have to think about all those deployment-related aspects to deploy it, and then ensure that people are using it, monitor the usage metrics, and based on that, compute the productivity gains that you’re getting or not getting,
0:16:51 and why, and take the feedback and continuously improve the models.
0:16:57 And for that also, you need a whole platform stack, which is the data flywheels continuously improving from user feedback data.
0:16:59 If need be, improve the prompts.
0:17:02 If need be, improve the models by fine-tuning them.
0:17:06 Or if need be, improve the retrieval relevance accuracy or embedding models.
0:17:13 So, there are many control points along the way in each one of these layers in the stack that need to be carefully looked at for continuous improvements as well.
0:17:23 So, it’s a non-trivial process to go from a use case that you think about to making it really work in the enterprise and platforms will play a significant role.
0:17:24 Right.
0:17:26 And this is where the role of the architect comes in.
0:17:26 Exactly.
0:17:32 I mean, the role of architects and, you know, any kind of a platform architect in an enterprise IT has always been there.
0:17:42 But what is now kind of taking it to the next level is this opportunity where you can really derive insights from all of your enterprise data as opposed to what was originally only structured data.
0:17:54 And the opportunity to now significantly rethink the way the business processes are done because now you can plan and reason and automate a lot of things and workflow automations and all.
0:18:07 So, to build all of those things out, on top of your existing stack, which always has been there and will continue to be there, you need whole new levels of stacks, of platforms, if you will, that are specific to AI, to generative AI.
0:18:09 And that’s where this role becomes very important.
0:18:11 Like, for example, vector databases management.
0:18:17 Container orchestration has been there, but now you have to, you’re talking about GPU-level container orchestration.
0:18:17 Okay.
0:18:18 Right, right.
0:18:24 And for that, you need GPU optimizations, quantization, and you are talking about new microservices that you now have to manage.
0:18:31 Then there is observability: you already have, maybe, application performance monitoring and observability platforms in your IT, but now you have to monitor the LLM.
0:18:34 So, you need the next level of LLM observability.
0:18:48 The pipelines will write logs: this is the prompt that came in, these are the retrieved chunks, this is how I re-ranked them, and this is how the new prompt got constructed to send to an LLM.
0:18:50 This is how the citations were generated.
0:18:52 There could be mistakes anywhere in the pipeline.
0:18:53 So, how do you know?
0:18:56 You have to have the LLM-level observability at that level.
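One minimal way to get that per-stage observability is to emit one structured log record per pipeline stage, tied together by a trace ID, so a mistake can be localized to the stage that made it. The stage and field names below are illustrative assumptions, not a standard schema.

```python
import json
import time
import uuid

def log_stage(trace_id, stage, **payload):
    """Emit one structured log record for a single pipeline stage.
    Field names here are illustrative, not a standard schema."""
    record = {"trace_id": trace_id, "ts": time.time(), "stage": stage, **payload}
    print(json.dumps(record))  # a real system would ship this to a log store
    return record

# One trace follows a single question through every stage of the pipeline.
trace = str(uuid.uuid4())
log_stage(trace, "prompt_in", prompt="What is our travel policy?")
log_stage(trace, "retrieval", chunk_ids=[12, 7, 31], scores=[0.91, 0.88, 0.80])
log_stage(trace, "rerank", chunk_ids=[7, 12, 31])
rec = log_stage(trace, "llm_call", model="some-llm", final_prompt_tokens=812)
log_stage(trace, "answer_out", citations=["doc-7"], latency_ms=430)
```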
0:18:58 You have to have an auto-evaluation framework.
0:19:09 And if you’re calling external LLMs, you have to have an LLM gateway so that you are monitoring the cost and subscriptions and all of that, and also, you know, carefully making sure that no sensitive data is going out.
0:19:16 You need different kinds of storage for storing your training data and even the checkpoints of the models that you may be fine-tuning, right?
0:19:20 And you may need agentic frameworks like LangChain, LangGraph, or those kind of things.
0:19:28 So, it’s a whole slew of new things that you now have to have in your platform to manage the AI workflows that you might be building.
0:19:40 As I was listening, I kind of went from thinking, so is there a sort of difference in, I don’t know, obviously in skill set, it’s continual, you know, upskilling and staying on top of things.
0:19:49 But in terms of, like, personality and strategic approach from a sort of pre-AI platform architect as opposed to an AI platform architect like we were talking.
0:20:01 But then by the end of what you were just saying now, I thought, oh, we can just take what Rama is saying and unpack it afterwards, run it through an LLM perhaps, and build out a blueprint for all the things that this role should cover.
0:20:01 So, it’s terrific.
0:20:06 But is it a matter of, you know, an IT professional who’s a platform architect?
0:20:23 It sounds like, I mean, we know this, that AI and everything you’ve been talking about has brought us to this moment of real transformation for, as you said, how businesses do business, how we think and rethink our processes, even how we rethink software.
0:20:27 So, is it more of a, is it an evolution of a platform architect?
0:20:30 Is it a sidestep into just a whole new world?
0:20:47 It definitely is a whole new set of skills, whether the same existing architects upskill themselves to learn all these things and get there, or new roles that are emerging. You know, if I look at my own team, for example, we have this unique program within NVIDIA where we, you know, kind of drink our own champagne.
0:21:02 So, we test a lot of NVIDIA’s own hardware and software technologies using the program to give early feedback to our product teams, but also to really build out enterprise-grade solutions on our own stack, to provide that customer-zero reference to the market.
0:21:05 So, this is how, you know, this is how you can build things.
0:21:14 So, as part of that, our team had to be pretty sophisticated in terms of really understanding and very quickly taking very early technology that our product teams are putting out to test them out.
0:21:33 So, in our case, yes, we ended up actually creating a machine learning team who deeply understands all of these technologies, both from an engineering perspective and also from a data science perspective in terms of how best to evaluate the models, how best to improve the accuracy, how to construct the pipelines, what are the control points and all of that.
0:21:48 So, for a company that doesn’t have this kind of drink-your-own-champagne program, or isn’t building these kinds of actual LLM models and the full-stack set of software like NVIDIA does, you may not have the luxury to hire, you know, sophisticated ML teams to build this out.
0:22:10 So, that’s why it becomes all the more important to build platforms that are as easy as possible to operate, so that an engineer who is skilled in some of the core technologies, cloud maybe, and, you know, managing Kubernetes and those sorts of things, can upskill themselves to leverage these platforms to build things out.
0:22:31 I do want to add, from what I’m seeing, my observation perspective, though, at least most medium to large companies have some amount of machine learning teams that are actively exploring, experimenting and building things out and kind of paving the way, defining and deriving the recipes within their own companies to help other teams come along.
0:22:42 So, there is some sort of the central AIML team that either builds the platforms or tests and derives the best practices and creates those recipes to enable others to run fast.
0:22:43 Right.
0:22:52 That, I think, has to happen for the foreseeable future until these tools and platforms become so easy that anybody can, you know, operate them.
0:22:53 Yeah, right.
0:22:54 Yeah.
0:22:54 Makes sense.
0:22:59 What are the key components that go into an AI stack these days?
0:23:08 And how does an IT leader, a platform architect, go about deciding between cloud-based and on-prem on-premises solutions for their organization?
0:23:09 Yeah.
0:23:12 So, if you look at the stack, I mentioned a few of those things before.
0:23:14 Say, for example, enterprise data management.
0:23:18 First, you have to have that, you know, either for structured data or unstructured data.
0:23:19 Where does my enterprise data sit?
0:23:22 Is it properly protected with role-based access control?
0:23:28 And are there the right kind of APIs available for me to fetch that data and load it into either my vector databases or something else, right?
0:23:30 So, that’s enterprise data management.
0:23:32 It’s something that every company has to have.
0:23:35 And that has to be the foundational starting point for anything.
0:23:42 Then, you know, some of the other things like enterprise content security, again, ensuring that you cannot bypass any of those things.
0:23:50 You have to make sure that any data that is being used for any particular use case is access control protected and also sensitive data is protected in the company.
0:23:55 Because sometimes people put confidential documents out in the open by mistake.
0:24:00 These powerful tools can now go and search all of that data and could accidentally expose sensitive information.
0:24:02 So, that has to be there as well.
0:24:05 So, enterprise data management, content security.
0:24:06 Then comes the rest of the AI stack.
0:24:10 Like, for example, you need a vector database if you’re tapping into unstructured data.
0:24:21 And you need a retriever that sits on top of the vector database, that allows you to operate with embedding models, vectorize the data, and get the insights and the chunks or documents from it.
0:24:25 Then you need, of course, the full compute and infrastructure that needs to be set up.
0:24:31 If you’re going with SaaS applications, then, of course, the SaaS vendors will provide all of that for you as a service.
0:24:38 But if you are setting up on-prem for whatever reason, could be data sensitive and cannot move to cloud or whatever those reasons may be.
0:24:44 And some companies have regulatory requirements and such and they need to be on-prem within the country and different kinds of restrictions.
0:24:53 And if that is the case, you have to have the full compute and infra management layers for setting up the data centers, the operating systems, the container platforms, the storage and all of that.
0:24:55 So, those are all part of the stack.
0:24:59 Again, if you’re doing it on-prem, there is the AI model inference serving.
0:25:12 How do you serve the GPU-optimized versions of the inference models, whatever you pick, either a public open-access one, like the Llama models that are, for example, available from NVIDIA as NVIDIA inference microservices, NIMs.
0:25:20 Or if you’re calling some external ones, OpenAI’s GPT-4 or Claude or anything else, then you may want to go through an LLM gateway.
0:25:22 So, LLM Gateway should be part of the stack.
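A toy sketch of what such a gateway might do: redact sensitive patterns before a prompt leaves the company, and tally spend per model. The redaction patterns, model names, and per-token prices are all made-up placeholders, not any real provider’s API or pricing.

```python
import re

# Illustrative per-1K-token prices; real pricing varies by provider and model.
PRICES = {"external-model-a": 0.01, "external-model-b": 0.003}
SENSITIVE = [re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),          # SSN-like pattern
             re.compile(r"\bCONFIDENTIAL\b", re.I)]          # doc classification marker

class LLMGateway:
    """Toy gateway: redacts sensitive text, tallies spend per model."""
    def __init__(self):
        self.spend = {}

    def call(self, model, prompt, send):
        for pat in SENSITIVE:
            prompt = pat.sub("[REDACTED]", prompt)
        tokens = len(prompt.split())            # crude token estimate
        cost = tokens / 1000 * PRICES[model]
        self.spend[model] = self.spend.get(model, 0) + cost
        return send(prompt)                     # `send` stands in for the real provider call

gw = LLMGateway()
out = gw.call("external-model-a", "Summarize: CONFIDENTIAL plan 123-45-6789", send=lambda p: p)
print(out)
```

The same choke point is also where subscription limits and per-team quotas would be enforced.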
0:25:26 Then, as I mentioned, LLM observability tools need to be there or you build them.
0:25:31 And then the basic core container and application performance monitoring has to be there.
0:25:42 Then auto-evaluation framework is a critical part of that stack because whenever you build something, you want to automatically evaluate it against accuracy, you know, how helpful the answer is, are the citations good?
0:25:48 Maybe use LLM itself as a judge to compute all of this and then measure the latency.
0:25:52 So, all of that automated evaluation framework should be part of the CI/CD, part of the development process.
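An auto-evaluation harness along those lines might look like the sketch below. The judge here is a trivial stub standing in for an LLM-as-judge call, and the quality threshold is illustrative; the test cases are made up.

```python
def judge(question, answer, citations):
    """Stub judge. In practice this would prompt a strong LLM to grade
    helpfulness and citation quality; here we use a trivial heuristic."""
    helpful = 1.0 if answer.strip() else 0.0
    cited = 1.0 if citations else 0.0
    return {"helpfulness": helpful, "citations": cited}

def evaluate(cases):
    """Average each judged dimension over a regression set of Q/A cases."""
    scores = [judge(c["q"], c["a"], c["cites"]) for c in cases]
    n = len(scores)
    return {k: sum(s[k] for s in scores) / n for k in scores[0]}

cases = [
    {"q": "What is the travel policy?", "a": "File within 30 days.", "cites": ["doc-7"]},
    {"q": "How do I reserve GPUs?", "a": "Use the scheduling portal.", "cites": []},
]
report = evaluate(cases)
print(report)

# Gate the CI/CD pipeline on a minimum quality threshold (illustrative value):
assert report["helpfulness"] >= 0.9
```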
0:26:01 Then any kind of agentic frameworks: you have LangChain, LangGraph, and with NVIDIA, we now have AgentIQ, the AIQ toolkit, which was released at the GTC conference.
0:26:10 So, these are agentic frameworks that sit on top, allow you to quickly configure the agents and tools so that you can set up things for planning and reasoning.
0:26:17 Then, probably, if you’re talking about an enterprise, you want to have some kind of a user experience, so that everybody doesn’t have to go build and figure it out.
0:26:21 So, there is a UI that’s built into this platform that comes ready.
0:26:27 If you want to build a chatbot, you get a chatbot with a feedback gathering mechanism, you know, all of those things baked in.
0:26:35 The platform should have a basic one and people can choose to build their own on top of it or refine it, but there should be a basic one in there.
0:26:51 And then when everybody is building all these agents and workflows and all that, you need some way to monitor them all, and to make sure that you have usage and all the metrics tracked, dashboards and other things, so that you know what kind of engagement is happening with which workflow, which AI tool, which chatbot or copilot, whatnot.
0:26:57 So, you need a repository where all these things are actually available and you set up for some kind of an automation.
0:27:05 And if people want to build on it, it would be good to have some low-code, no-code kind of a layer on top for people to quickly build their agentic flows on top.
0:27:14 And then last but not least, I would say there has to be something for model fine-tuning and refinement because, you know, no AI model would start with 100% accuracy.
0:27:21 Very likely it will be somewhere in the upper 70s to 80s, or closer to 90.
0:27:26 Maybe that’s kind of where you start and then you have to slowly use user feedback, put it in production, then improve and so on.
0:27:31 So, there has to be the full pipeline and frameworks for model fine-tuning and data flywheel management.
0:27:34 So, these are all the full set of things that are included in a stack.
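The data flywheel idea at the end of that list, using production feedback to curate the next fine-tuning dataset, can be sketched as follows. The names and the thumbs-up filter are illustrative assumptions, not a specific NVIDIA pipeline.

```python
# Sketch of a data flywheel: log production interactions with user
# feedback, then filter the positively rated ones into a dataset for
# the next fine-tuning round.
from dataclasses import dataclass

@dataclass
class Interaction:
    prompt: str
    answer: str
    thumbs_up: bool

def build_finetune_set(log, min_examples=1):
    examples = [
        {"input": i.prompt, "target": i.answer}
        for i in log
        if i.thumbs_up  # keep only answers users rated as helpful
    ]
    if len(examples) < min_examples:
        raise ValueError("not enough curated examples to fine-tune yet")
    return examples

log = [
    Interaction("How do I reset my VPN?", "Open the IT portal and...", True),
    Interaction("How do I reset my VPN?", "I don't know.", False),
]
dataset = build_finetune_set(log)
print(len(dataset))  # only the thumbs-up example survives
```

Each fine-tune/deploy/collect-feedback cycle is one turn of the flywheel, which is how a model that starts in the 70s or 80s inches toward 90-plus.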
0:27:43 Yeah. Talking about all of this software and talking about, you know, no-code, low-code tools, building platforms that developers can build agentic applications on everything.
0:27:47 What are your thoughts on the role of AI in software development itself?
0:27:50 And kind of specifically, how does that pertain to infrastructure?
0:27:56 How are you, what are you doing to prepare infrastructure for this increased role of AI in developing the software?
0:27:58 AI for software development. Wow.
0:28:01 So, yes, fundamentally reshaping software development, right?
0:28:06 Not just how we code, but how we plan, how we debug, how we test, and even how we think about it.
0:28:13 And how we write our product requirements documents and, you know, everything in the entire product lifecycle management, right?
0:28:19 So, first, we are seeing AI-assisted development, of course, you know, co-pilots, code generation for test generation and automation and all of that.
0:28:23 This boosts developer productivity and velocity in building code quickly.
0:28:29 So, from writing code, you are basically getting to a point of reviewing code and guiding the AI-generated code and improving productivity.
0:28:34 Next part of it is that AI is becoming a part of application logic in itself.
0:28:40 And, you know, this is where I was saying that AI now starts to go down into the business layers.
0:28:44 So, we’re building more and more, as an industry, more AI-native apps.
0:28:50 So, the components like retrieval, personalization, LLMs, and agents, these will all be inside of the business processes.
0:28:53 And they may need to be completely rethought.
0:28:56 Traditional backend systems may need to integrate more deeply with ML components.
0:28:59 So, AI in software development is fundamentally reshaping it.
0:29:03 And, you know, we need to build out the multi-model pipelines for this.
0:29:06 And we need to scale out the GPU-backed infrastructure.
0:29:08 We need to build out the RAG pipelines.
0:29:13 We need to build out the talk-to-your-data pipelines, you know, text-to-SQL or text-to-code generation pipelines.
0:29:18 And whatever the models are, if you're not using vendor platforms, you need to build all of this internally.
0:29:26 So, ultimately, you know, treat AI like a new layer in the development stack, which is fundamentally reshaping the way we write software.
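The RAG retrieval step mentioned above can be sketched minimally. The bag-of-words "embedding" here is purely illustrative; a real pipeline uses an embedding model and a vector database, and all names are assumptions.

```python
# Minimal retrieval-augmented generation sketch: score documents
# against the query, prepend the best match as context for the LLM.
from collections import Counter
from math import sqrt

def embed(text: str) -> Counter:
    # Toy "embedding": word counts. Real systems use dense vectors
    # from an embedding model stored in a vector database.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

docs = [
    "GPU clusters are scheduled with a GPU-aware scheduler.",
    "The cafeteria menu changes every Monday.",
]
query = "How are GPU clusters scheduled?"
best = max(docs, key=lambda d: cosine(embed(query), embed(d)))
augmented_prompt = f"Context: {best}\n\nQuestion: {query}"
print(best)
```

The augmented prompt is then what actually gets sent to the model, which is why retrieval quality bounds answer quality.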
0:29:31 Right. Not to hit you with one more kind of big-picture question, particularly as we start to wind down here.
0:29:36 Looking ahead to, say, the next five years, but feel free to adjust that timeframe to something more appropriate.
0:29:43 What trends, what are some of the, if you can look out, you know, that far in this rapidly evolving world we’re talking about,
0:29:48 what are the trends that you foresee shaping AI infrastructure in the, you know, next few years?
0:29:52 And is there anything you can do now, listeners can do now to prepare for them?
0:29:55 Yeah. Well, AI infrastructure is evolving very rapidly.
0:30:04 Yeah. I just, we’re audio only, but I’ve just been nodding my head constantly listening to you because I didn’t want to interrupt, let alone, you know, take you off course.
0:30:05 But there’s so much happening.
0:30:11 Yeah. Yeah. Maybe we can break it down into three kind of major trends that will emerge from this.
0:30:12 Okay.
0:30:20 One is over time, what is now a specialized AI architecture should become more of a native integrated enterprise architecture.
0:30:29 So that’s, that will happen because this AI will now become a very integral part of building any kind of a business process.
0:30:44 So that is something that we see, which means that, you know, shared orchestration layers, GPU-aware schedulers, unified logging, observability, and vector databases, everything that we talked about, should all become part of AI-native platforms.
0:30:58 Then we can almost think of it as likely specialization of hardware and models, where domain-specific models, smaller, faster, private models for specific use cases, sit alongside the general-purpose LLMs.
0:31:07 They may happen because the cost economics eventually have to work out and, you know, smaller models that are fine-tuned may do the job just as well.
0:31:21 So we may start to see more of these domain-specific models emerging, and also very specialized hardware, with LLMs tuned for edge, mobile, or even browser-based AI, may start to emerge.
0:31:32 So, again, different hardware and models, for which we need to have the right kind of infrastructure and computing stack: parameter-efficient tuning, quantization, and, you know, smart model routing, all of these things have to happen.
0:31:38 And that is where NVIDIA is building out a lot of the stuff related to Dynamo and other things that are coming up.
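The "smart model routing" idea can be sketched as below. The length-based heuristic and model names are stand-ins for illustration; real routers use classifiers or LLM-based difficulty scoring.

```python
# Toy model router: send cheap requests to a small fine-tuned domain
# model and complex ones to a large general-purpose model, so the
# cost economics work out.
def route(prompt: str) -> str:
    # Prompt length as a crude proxy for complexity (illustrative only).
    return "small-domain-model" if len(prompt.split()) <= 12 else "large-general-model"

print(route("Reset my VPN password"))
print(route("Compare three GPU scheduling strategies and explain "
            "the trade-offs for multi-tenant training clusters"))
```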
0:31:44 Then I would say the third one is agentic systems, which are becoming increasingly autonomous, right?
0:32:00 So we will start to see more enterprise use cases go beyond chatbots; multi-step agents with reasoning, planning, action-taking, and decision-making will start to become more prominent.
0:32:12 And for that, from a platform perspective, one has to be prepared with more long-term memory management, context management, and workflow chaining, and, you know, all of those kinds of things will start to emerge.
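The long-term memory and context management a platform would supply to such agents can be sketched as follows. The word-overlap relevance score and the character budget are illustrative assumptions; real systems use vector search over an external store.

```python
# Sketch of agent long-term memory: store past turns, retrieve the
# most relevant ones for a query, and trim to a context budget.
class AgentMemory:
    def __init__(self, budget_chars: int = 200):
        self.turns: list[str] = []
        self.budget = budget_chars

    def remember(self, turn: str) -> None:
        self.turns.append(turn)

    def recall(self, query: str) -> str:
        # Naive relevance: words shared with the query (illustrative).
        scored = sorted(
            self.turns,
            key=lambda t: len(set(t.lower().split()) & set(query.lower().split())),
            reverse=True,
        )
        context, used = [], 0
        for turn in scored:
            if used + len(turn) > self.budget:
                break  # stop once the context budget is exhausted
            context.append(turn)
            used += len(turn)
        return "\n".join(context)

mem = AgentMemory()
mem.remember("User prefers answers with citations.")
mem.remember("Ticket #123 was about GPU quota limits.")
recalled = mem.recall("What did the user say about GPU quota?")
print(recalled)
```

Workflow chaining then passes this recalled context from one agent step to the next instead of replaying the entire history.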
0:32:17 So I would say those three things: you know, AI stacks becoming more native.
0:32:22 So unification of, you know, traditional and AI stacks, that becomes more commonplace in the enterprise.
0:32:30 Then specialized models, domain-specific models, and maybe specialized hardware and models for edge mobile and those kind of things.
0:32:41 And then finally, more of the proliferation of the agentic AI use cases and more autonomous use cases will require the platforms to really have long-term memory, context management, and, you know, workflow management, and all of that.
0:32:48 And the AI infrastructure and the platforms have to evolve to really take advantage of the advancements happening in the field of AI.
0:32:54 Rama, you've covered a lot of ground in a short time, but we're literally talking about building the foundation.
0:33:05 I mean, not just for AI success, but I think for the future of how we work, how the enterprise does work, how we all rethink the way that we’re doing everything from processes to developing software and everything in between.
0:33:08 That’s fantastic. Thank you so much for taking the time to share.
0:33:17 For listeners who want to go deeper, want to learn more about anything related to AI and the enterprise and platform architecture and everything,
0:33:23 is there a good place online on the NVIDIA website, maybe on social media, that you would direct listeners to start?
0:33:35 I think if you just go to AI.NVIDIA.com, you’ll see a lot of these models as well as workflows and blueprints that NVIDIA is releasing available for developers to try them out right there.
0:33:36 That’s a great place to start.
0:33:41 And there are many example workflows that are given along with example code samples and everything.
0:33:49 You can just simply download them into the environments that are also provided right there to try them out quickly and then, you know, replicate them in your own environment.
0:33:50 So that’s a good place to start.
0:33:52 Perfect. Rama Akkiraju, thank you again.
0:33:56 It was a pleasure, and all the best with all the work that you and your teams are doing.
0:33:59 Thank you so much. It’s a pleasure to be on your podcast, Noah.
0:34:00 Thanks for having me.

Rama Akkiraju, VP of IT for AI and ML at NVIDIA, discusses the transformative power of AI in enterprises. Akkiraju highlights the rapid evolution from perception AI to agentic AI and emphasizes the importance of treating AI as a new layer in the development stack. She also shares insights on the role of AI platform architects and key trends shaping the future of AI infrastructure.

