AI transcript
>> Hello, and welcome to the NVIDIA AI podcast.
I’m your host, Noah Kravitz.
One of the big transformations being
enabled by AI is the way we create software.
From coding co-pilots to
in-development systems built to translate
plain language requests into fully functional applications,
generative AI is fueling a new wave
of tools to help us create software faster.
Our guest today is Kanjun Qiu.
Kanjun is co-founder and CEO of Imbue,
a three-and-a-half-year-old company founded in 2021,
that is building AI agents that work with us to
translate our ideas into code and bring them to life.
There’s a lot more to it than that,
but why hear it from me when you can hear it directly from Kanjun.
Kanjun Qiu, welcome.
Thank you so much for taking the time to join the NVIDIA AI podcast.
>> Thank you, Noah. It’s great to be here.
Let’s talk about software development and AI,
and Imbue, and all of that good stuff.
But maybe let’s start with agents.
Agents are, I don’t want to use the word “hot”
because I don’t want it to sound fluffy, right?
But agents are a thing right now.
We’ve had some guests on recently talking about
agents in different contexts,
and Imbue’s approach to agents is something worth delving into.
Maybe we can start there. What are AI agents?
Why do we need them? Why is Imbue working on them?
>> Yeah. Agents are really popular right now.
So Imbue was founded in early 2021,
and at that time,
our goal was to figure out how to make and work on general AI agents.
At that time, people thought we were totally crazy.
Like, what are agents?
Everyone’s working on AGI.
AGI is going to rule everything.
But what we were really interested in is,
how can we have systems that we as people can mold,
shape to what we’re wanting?
As opposed to, oh, this external intelligence that knows all the answers.
And so we started as a research lab at that time
because the technology was certainly not good enough
to build general agents that could reliably do anything
that we wanted them to do.
And, you know, in the very beginning,
we thought of agents in a similar way
to how a lot of people think about agents today.
You know, think about what an agent is.
Often people think about kind of an autonomous,
personal-assistant-type thing.
You ask it to do something, it does it for you,
it comes back to you.
Now everyone has their own personal assistant.
And actually, a lot of our learning has been
that that’s a really tricky user experience.
And we experience it ourselves with human agents,
which is that often when I delegate something,
it comes back, it’s not quite what I wanted.
And now I have to negotiate how to get what I wanted.
I was listening, as I told you before we recorded,
to a little bit of your fireside chat
with Bryan Catanzaro from GTC this year,
which, listeners, go check out.
It’s a great listen.
And you were talking about the difficulty,
and I can relate to this so much,
the difficulty inherent in delegating work
to someone else, right?
And to your point, thinking of it as humans,
you have to break the problem down,
you have to sort of figure out,
well, how much do I tell them exactly what to do?
– Yeah.
– Yeah, all that, yeah.
– What context does it need ahead of time?
What instruction should I give?
Delegation is actually a really tricky paradigm
because it actually puts all the onus
on the person who’s delegating
to define the problem, define the scope.
Of course, the person being delegated to, the agent,
might come back with some questions and stuff like that,
but it’s a very tricky thing to trust.
So what we’ve learned over the years
working on general agents is we’ve actually started
to think about agents in a very different way.
And this is both from a kind of pragmatic business,
like user perspective and also from a mission perspective.
The way that we think about agents is,
if you think about what an agent is,
what this personal assistant is doing,
what it is is it’s kind of this intermediary layer
between you and your computer.
And you’re kind of telling it stuff
or interfacing with it in some way,
whether it’s a UI or natural language,
and it’s interfacing with your computer.
And the most effective way to interface with your computer
is by writing code.
That’s what your computer is made of.
And there are really kind of two types of code.
There’s like everything is hard-coded,
which is the default for software today.
Everything is hard-coded.
So now your agent can only do a subset of things.
Or now with language models, you can generate code.
You can write code that does new stuff
that’s not hard-coded.
And now you have an agent that’s much more general.
It’s able to do sets of tasks
that I didn’t program into it ahead of time,
but now it’s able to do.
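To make the contrast concrete, here is a minimal, hypothetical sketch in Python. The model call is stubbed out, and `exec` stands in for however a real agent would run the code it generates; a real system would sandbox this step.

```python
def hard_coded_agent(request: str) -> str:
    # Hard-coded: the agent only handles the fixed set of requests
    # its author anticipated ahead of time.
    handlers = {
        "time": lambda: "12:00",
        "greet": lambda: "hello",
    }
    if request not in handlers:
        return "unsupported request"
    return handlers[request]()

def ask_model(prompt: str) -> str:
    # Hypothetical LLM call, stubbed so this example runs offline:
    # pretend the model wrote this code for the user's request.
    return "result = ' '.join(word.capitalize() for word in text.split())"

def generative_agent(request: str, text: str) -> str:
    # Generated: the agent writes new code for a request nobody
    # hard-coded, then executes it against the user's data.
    code = ask_model(f"Write Python that does: {request}")
    scope = {"text": text}
    exec(code, scope)  # a real system would sandbox this
    return scope["result"]

print(hard_coded_agent("title-case this"))                  # unsupported request
print(generative_agent("title-case this", "hello world"))   # Hello World
```

The hard-coded agent fails on anything outside its handler table, while the generative agent handles a task its author never programmed in, which is the "much more general" agent described above.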
And so the way that we think about what agents are
is actually as this intermediary layer
between you and your computer,
and it’s essentially a layer of abstraction
on top of programming that allows us as regular people
to be able to program our computers
without even thinking about what we’re doing as programming.
And so we think of our mission
as essentially trying to reinvent the personal computer
and really deeply empower everyone to be able to create
in this computing medium of the future,
because this digital medium
is becoming more important in our lives.
And what we want, or actually, what I want,
at least personally,
is not a super centralized assistant
that someone else has decided
is able to do this, integrate with that.
What I actually want is something that I can make and own
that is mine, and that does what I wanted to do,
kind of serves me.
And today we’re kind of in a world
where like all of our software is rented,
it serves other people.
So that’s kind of what we think agents are:
this layer of abstraction on top of programming
that makes it so that it’s very intuitive
for everyone to program.
And that actually requires quite a bit of invention.
So we can get into that and its historical context.
– Yeah, well, the way you were
describing it,
I know you weren’t describing the sort of AI,
the assistant, the agent layer,
all the A-words, AI, assistant, agent,
you weren’t describing the agent layer
as a replacement for a user interface.
So you mentioned, you know, UI,
but that’s kind of what I was thinking of,
right, that abstractive layer that’s kind of in between.
So how is Imbue approaching it sort of on the ground,
and are you working primarily with enterprise clients?
– No, we work primarily with, I would say, prosumers.
So people who are,
so the way to think about what we’re doing is,
instead of thinking of agents as automation systems,
right now we’re in kind of the agent,
automation, agent paradigm,
we think of agents as collaborative systems.
So how do we enable a system that empowers me
and helps me do more of what I want to do,
something I can work with?
And so in the beginning,
we’re actually, you know, enabling you to write code.
Right now, these models are not that good at writing code.
As they write code, you actually have to go check it.
So you have to understand how to write code to go,
to see, oh, did the agent do a good job?
But over time, as these models get better at writing code,
now you don’t have to check it anymore.
And so in the beginning,
we start with software engineers as the primary users,
or, as we call them, software builders.
So you don’t have to be an engineer.
I’m a software builder.
I’m not a software engineer.
We start with software builders who can read code.
– I’m a prototype builder then.
I wouldn’t go as far as software.
– Okay.
So you could probably be a soon-to-be user,
once we get to a place where you don’t have to read,
write, and touch the code so much.
Right now, we’re targeting software builders
who can read and touch the code and be like,
okay, well, it’s not quite right.
We want to adjust it in this way.
And over time, as the models get better,
and you don’t have to be so low level in the code,
now more and more types of creatives,
types of builders can use these systems
to really mold and shape their computer
to what they want it to be.
– And what level of complexity,
what level of depth, is the actual software
that users are building with Imbue?
There’s the issue of accuracy, obviously,
as you were saying,
none of the models are creating 100% perfect code yet.
But I also wonder how complex can these things get?
And that kind of brings us to talking about reasoning.
We don’t have to get there yet,
but kind of edging towards that conversation.
– Yeah, I think one of actually,
our biggest learnings has been that if as a user,
I have a code base that is fairly modular
and I’ve kind of broken things down,
then the model is actually pretty good at dealing
with a very complex system,
because it doesn’t have to load that much stuff
into its head and it doesn’t have to cross check
all of these dependencies.
So just like with humans building software,
where you don’t want a ton of dependencies,
also, if you have a slightly more isolated system,
it’ll do a better job.
Similarly, there’s a lot of kind of user expertise
in using this kind of product.
So our product, it feels very collaborative.
It’s almost like a document editor
and kind of like interfacing and interacting
with an agent in that way.
And so as users, we basically
learn to give it tasks
it’s more likely to succeed at.
And we learn to structure our work
so that we can delegate it to agents.
And we’ve seen this with other AI tools as well,
like Copilot. Our team definitely writes code
in a slightly different way,
so that Copilot can work well for them.
– Right, right.
– To your question of complexity,
it like really depends on the user,
like some of us can make it work with really complex things.
– Yeah, yeah.
Where are you seeing agents being used the most
or perhaps it’s more that they’re having
a dramatic impact where they are being used
and how does that translate into businesses
remaining competitive, having a competitive edge?
Folks that I’ve talked to keep talking about
2022, last year,
as kind of the year of the models coming out
and the mainstream taking notice of gen AI.
And perhaps this year has been the year
where people are trying to figure out,
what apps do I build to leverage these things,
as a software building business
or as a business that does other things
that wants to leverage this.
So where is Imbue seeing impact being made
or even looking to areas in the near future?
– You know, the interesting thing about agents
is it’s such an ill-defined term right now.
And people are calling all sorts
of very trivial things agents and that’s fine.
But I think there’s a spectrum of kind of agent,
usefulness, effectiveness.
There’s like a system that scrapes the web
and then aggregates the data
and pulls the data out in some way.
This is basically just a gen AI model.
Like, you know, it’s very similar to ChatGPT,
but you put something on top of it
to dump the output into a different system.
You could call that an agent.
So some people call that an agent.
We see that kind of thing being implemented
in all sorts of places.
But I think the much more exciting thing
when it comes to agents is these more general agents
that enable people to kind of start doing things
that they previously didn’t even imagine that they could do.
You know, I think like some of the really simple examples
right now are for us, like some researcher or scientist,
a biologist has a ton of data that they need to process
and they’re not a software engineer,
but they’re technical enough
that they can kind of, like, pull in the data
and then get something out of it,
get something that helps them out of it.
If they’re able to use something like this
that lets them work at this slightly higher level
Or, you know, kind of over time,
a very exciting thing is as we start to build
the tools that we need. An example: my grandmother
gets a bunch of scam calls in Chinese,
but all of her calls are in Chinese.
And if I want to build a piece of software
that filters out her scam calls from her other calls,
like this is very hard right now,
even for me as someone who knows how to build software.
And it’s such a niche market.
Like no one else is going to build that software for her.
We’ve tried to find software like that in the US,
doesn’t really exist.
And so, exactly.
So right now we’re in this world where software is built by–
– Not to interrupt you.
If it exists in the US for English language spam,
it doesn’t work that well either for my–
– Exactly, exactly.
So, you know, right now we’re in this world
where other people build software for us.
We have to rely on other people to build software for us.
And it’s actually really strange.
Like we don’t really own our digital environments
in that way.
Like everything is kind of built by someone else
because it’s too hard for us to like build or own things.
And I think there is a future in which like I could actually
pretty easily build something for my grandmother
or for my community or for my group of friends
or for my church to manage registrations or whatever it is.
And that can be really tailored to my particular use case
and to me, my community, my friends.
And so, I think the really exciting thing about
this future is like all of this like bespoke software
as opposed to today where we have this
kind of centralized software.
It’s almost like people don’t often think
of their digital environment in this way,
but the digital environment is like the physical environment.
And today it’s as if we all live in corporate housing.
– I used to be so excited that I could listen
to any music I wanted for, you know, 10 bucks a month.
And now I’m thinking like,
“But I don’t own any of it.”
They could take it away at any moment.
– Yeah, and honestly, I think a lot of the kind
of frustration people have about big tech, about technology
is that we don’t feel like, and I don’t feel like
I have control over these things
that are a huge part of my life.
And so that’s what, at Imbue, we wanna do:
give that control and power back to the people.
And we do that by creating these tools and systems
that collaborate with you to help you
be able to create stuff for yourself.
– So how hard is it to build these things?
As you mentioned, there are many different, you know,
voices, individuals, companies,
talking about agents, agentic AI,
and a lot of them are defining it,
talking about it, at least slightly differently.
I’m sure taking, you know, different approaches,
kind of under the hood.
What are the challenges?
We can get a little bit technical here if you like.
What are some of the problems that, you know,
you and your teams are solving to make it easier
for the rest of us to translate our ideas into software?
– Yeah, so some problems I would say
are kind of universal across all of these different types
of people who are building agents.
And some problems are unique to us
and what we’re trying to do.
So I would say, you know, most people are building agents
in this kind of like workflow automation paradigm
I mentioned earlier.
And so for that paradigm,
robustness, reliability is really important.
Like, okay, you know, I built a thing
that responds to customer service tickets.
But if, 3% of the time,
it says something really terrible to the user,
like this is not a usable agent.
For us, reliability and robustness is important,
but it’s actually a little bit less important.
As it gets better, the user experience just gets better.
As a user, I don’t have to check stuff as much.
But even if it’s not the best, it’s still okay.
Like I can still use it as the user
and I’ll like fix the bug that, you know,
the model produced and that’s okay.
So a lot of what we think about is kind of like,
how do we get agents to meet
both the model capabilities and also users
where they are today?
– So that expectation is built in
that we’re not at the stage yet where it’s error-free.
And as a user, you need to know that.
And it’s not just like, okay, you have to accept that,
but it’s actually like your experience
is gonna wind up being better, right?
‘Cause you know you’re a part of it.
And it’s a, again, as you said,
it’s not sending it off to do something
and, you know, giving us the final result
we had no part in.
– Yeah, exactly.
I think, you know, people often think of agents
as a research problem, but we think of it
as both a research problem and a user experience problem.
And this user experience part
is really about setting the right expectations
with the experience,
so that it’s like, I’m not expecting it to go off
on its own for 10 minutes or 10 hours or 10 days
and come back with something magically correct.
Instead, I’m kind of iteratively working with it
and seeing, oh, it’s kind of like clay, you know,
I’m molding it, shaping it, shaping the output.
I think for the workflow automation agents,
some of these agents,
the bar is higher
for how accurate they have to be,
because what we have found is that as a user,
the longer it takes to come back with an answer,
the more I expect the answer to be correct
and I’ll get frustrated if it takes a really long time.
So we’re very much on the, like, highly interactive,
don’t take a super long time to come back with something,
be an agent that really works with the user side.
– Thinking about the idea of moving from
what I’m expecting from a computer, right?
The utmost accuracy: my calculator always says
two plus two is four.
Kind of moving from that to just a different frame of mind
as an end user saying, okay, we’re prioritizing, you know,
I don’t want to put words in your mouth,
but kind of the speed and experience
and you know, going in, it’s not gonna get them all right.
Is that something that, you know,
because this is the nature of AI and gen AI,
and people are kind of used to that
by now, are people accepting of it?
Or is there still kind of a, I don’t know,
maybe I’m just old, is there still kind of a mental hurdle
to getting past that expectation?
– Yeah, so one of our core philosophies
is that we need to meet people where they are
in terms of what mental models we have
in our heads as people.
And so actually a good historical analogy
is that back before the personal computer,
people were really excited about the supercomputer.
And when the first personal computer came out,
everyone made fun of it.
They were like, this is a hobbyist toy.
And the supercomputer, you know,
you accessed it with a terminal,
you were time-sharing on these supercomputers.
It was not especially usable.
And so a very small set of people were able to use it.
But as time went on, a small group of people at Xerox PARC
actually invented a lot of these primitives
that allowed the personal computer to be usable by people.
They invented the desktop, files, folders.
These are like concepts that we understood
as humans at that time.
And so for us, you know,
part of actually building a good user experience
around agents requires invention.
It requires inventing concepts that match kind of like,
are able to map this technology
to what we currently understand as humans.
So earlier I was saying,
it’s kind of like a document editor right now,
you know, our current product experience.
And it may not ultimately be that way,
but a document is something that I as a person today
understand how to edit and work with.
And it’s almost like an interactive editor
that helps me accomplish my tasks.
And to your question of how users receive it,
one thing that’s really interesting that we observe
is, with software builders, the feedback so far has been,
wow, like this is really interesting.
It lets me work at this higher level.
I don’t have to dig so far into the code all the time.
I can kind of stay thinking at this higher level.
And it’s actually able to go down into the code
and surface to me, like do the task and then I check it.
And that’s pretty cool.
It like lets me move a lot faster.
And that’s really interesting.
You know, that for us, that’s kind of the primary thing.
Like I want people to be able to work
at our human problem level of abstraction with software
instead of having to get so deep into the weeds.
Yeah, yeah.
No, I can relate from the standpoint
of learning how to use these tools when I’m writing,
when I’m not on the podcast, asking people like you questions.
You know, a lot of my work is writing.
And if I’m working from a document, a transcript,
source materials, it’s that same thing
when I can use the tool to just kind of surface to me.
You know, did anywhere in the document
or in the transcript Kanjun talk about pineapple on pizza?
You know, and just being able to surface that back, right?
It saves all the time of going through the document.
I don’t need the exact words.
I don’t need it 100%.
Then we’ll get to that later,
but I can just kind of go back and check.
Like, oh, right.
She said she isn’t a fan of pepperoni.
You know, yeah.
And it’s incredibly helpful.
You know, it’s not just about saving time,
but I think you said it really well.
It allows you to stay at that level of thinking.
Exactly. Yeah.
I think our core, the thing I really care about
is helping people be able to be creative in this way.
And often there’s such a barrier between our ideas
and the execution of those ideas
that we never get to do a lot of things
that we really want to do with our lives.
And so I think the like true power of computing
as a medium hasn’t really been unleashed yet
and we want to unleash it.
And what that looks like is that people are able
to kind of like take their ideas.
Like you can take your ideas about writing this piece
or aggregating all of the pieces you’ve written
and being able to draw insights out of them
in order to create your book, like take these ideas
and actually be able to work with them at this higher level
so that you’re not always bogged down in the weeds.
And I think the true power of AI,
the true power of computers, like that’s what it enables.
And we’re not there yet, but we can get there.
And it’s not just about automation or business automation,
business workflow automation.
Right, right.
Now, in your conversation with Bryan from GTC,
what was it that he said?
I’ve got all these ideas and I sit down to code
and I’m like import, what do I want to import, right?
And you’re, I think that was great
’cause you’re derailed immediately
and I can relate in the work that I do that, you know, yeah.
Yeah, 100%.
Yeah, one of our users said, wow,
I never realized how much context switching I do
when I’m writing code from high to low level.
Same as when you’re writing normal stuff.
I’m speaking with Kanjun Qiu.
Kanjun is co-founder and CEO of Imbue
and we have been talking about Imbue’s work on AI agents
that help people code and really fascinating approach
that, you know, as we’ve been talking about,
I think goes beyond just expressing in code,
but code being the way that we interface with computers
and get them to do the things we want them to do.
Well, I want to ask you about AI models and reasoning.
And then I also want to ask you sort of about scale
and what goes into, you know, building an agent for yourself
and then what goes into building agents and multiple agents
and agents that collaborate for users at scale.
Is there an order we should go in?
Should we talk about reasoning first?
Is there a relation?
That’s interesting.
Let’s talk about scale first.
Okay, cool.
Yeah, so one way that people think about scale with agents
is a lot of agents interacting with each other
and kind of what that looks like.
And some people do that by giving different agents
different prompts so they have different personalities
and things like that.
And honestly, I think that’s a little bit, it’s interesting.
It’s a little bit limited because we already have agents today.
All software is agentic.
The whole point of what an agent is is it something
that takes action and it uses your computer
to kind of like execute something.
And so almost all software is executing something.
It’s kind of like changing the state of your computer,
website, data, et cetera.
Now, the difference between most software
and like AI agents, what we call AI agents today
is that AI agents can process stuff
in a way that’s like not fully deterministic.
But even so, we still had AI agents in the Facebook news feed,
for example; the recommendation engine is an agent
that non-deterministically decides what to show you.
So we’ve had agents since forever, since we had software.
And so, you know, kind of the way I think about scale
and agents is actually about,
is the same as the way I think about scale of software.
So in the future, in the next 10 years,
I think there’s gonna be this explosion of software,
and the software is gonna be slightly less hard-coded
than before.
It’s gonna be able to work with more ambiguous input.
It might be more interactive.
Hopefully a lot of people are going to be able
to create it if we succeed.
And so now we end up with this giant ecosystem,
like world of software that’s far beyond what we have today.
And in that world, what happens is now you have a lot
of different automated systems interacting with each other.
And this is actually, could be super exciting.
Every person could have their own automated systems
that do things for themselves, for their lives.
It’s, you know, I’m supported by the software
that surrounds me, as opposed to today,
maybe being bothered by it.
(laughing)
– I was listening to you and I was thinking of how to phrase,
how to try to phrase the question,
getting back to your point about, you know, sort of,
I was thinking of it as sort of one size fits all software,
you know, that’s deterministic and it does what it does
versus, you know, me being able to create
and reshape things as we go.
And you answered it for me.
So that’s that.
– Oh, that’s great.
I love this, I love this one size fits all software today
versus future bespoke software.
That’s a great term.
– Are you worried at all about,
I think, I don’t know if the term AI slop applies to code,
but this idea of, you know, AI models creating texts
that’s kind of meaningless or value-less,
but it’s being automatically put out onto the web and stuff
for a very sort of, you know, relatively ignorant point
of view, the notion of model-generated software
that can run itself being deployed out on the public web,
you know, is a little scary to me,
but also I’m sure there’s stuff out there,
but how do you think about that?
– Yeah, the way I think about kind of AI slop for software
is automated systems that like infringe on us.
So scam calling or spam calling is a good example
of an automated system that infringes on us,
or like apps that have lots of notifications
or, you know, games that are meant to be extractive.
Those are our systems that infringe on us.
And, you know, I actually think the default path
of where things are going with centralized AI systems
and the kind of, like, returns to scale
of improvements of the underlying models,
is that we will kind of, as humans,
be a little bit disempowered and beholden
to whoever controls the automated systems.
It’s not necessary, you know, I’ve kind of told you
about this beautiful vision of a future
where everyone can create software
and everyone creates bespoke software,
and like that’s the future we want to build,
but it’s not necessarily the future
that is the default path.
The default path could be that there’s a lot of,
like even more software out there today
that’s trying to get our attention
and trying to extract from us.
And what we need, I think, what I want
is for people to create defenses
and to kind of like get rid of that stuff
and disrupt it.
You know, hopefully when I can build my own software,
I can actually disrupt a lot of the stuff
that’s being built today,
so that I have software that’s serving my own interests.
I have agents that are serving my own interests
and helping me do what I want to do.
So to your question of AI slop,
I think there’s definitely going to be people
making more automated systems
that are bots and bother people.
And just like in cybersecurity,
I think there’s an attack defense dynamic,
and what we want is to enable people
to create opposing systems for themselves
that like help defend their environment and themselves
and help kind of protect us so we can do what we want
and live the lives we want.
And hopefully there’s also kind of some,
you know, there’s already some regulatory aspect of this
that exists and, you know, there hopefully will be more
in response to what we see.
So that effect is real.
– Right, all right.
There’s been a lot in the generative AI news cycles.
There’s a thing, that’s a thing.
This year in particular about models and reasoning
and future models, models being trained
that have reasoning capacity, that kind of thing.
Recently OpenAI launched a new iteration of a model,
talking about reasoning capabilities.
Is reasoning something that does or can happen in an LLM?
Is it something that the agents bring to the table,
so to speak?
How do you think about reasoning?
How does Imbue approach building reasoning capabilities
into your products?
– Yeah, that’s a great question.
So yeah, reasoning is a buzzword right now
and models definitely do reasoning.
That is, you know, the underlying LLMs
definitely do reasoning, in the way that humans kind of,
well, it’s not exactly how we do reasoning, maybe,
and it’s not always perfectly correct.
And it’s often continuing to justify its own reasoning,
although humans do that too.
– That sounds familiar.
– So unclear how similar or different it is to humans,
but the underlying LLM definitely does some reasoning.
One key difference that we observe right now
is that the underlying LLM isn’t necessarily
as good at verifying if its own answer is correct.
And as a person when I’m doing a task,
I actually, we don’t notice this,
but we’re always constantly checking like,
is this right?
Did I do that right?
Is this what I expected?
And there’s not that loop.
So that loop is kind of added by the agentic piece.
– Okay.
– Yeah, we think a lot, actually, about our research direction
as around this kind of verification.
Is it correct?
Did I do it right?
And if I have an agent that’s writing code for me,
I do want it to check itself like,
hey, did I do that right?
I did.
Oh, I didn’t.
Let me fix this mistake and then come back to you.
And so the better it is at verifying its own answers,
the better the user experience is.
And so, when we talk about reasoning,
we mostly talk about this kind of like verification
and robustness.
Is it able to verify what it’s doing?
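That verify-and-fix loop can be sketched roughly as follows. Here `generate`, `verify`, and `repair` are hypothetical stand-ins for model calls, stubbed with a toy arithmetic task so the sketch runs; this is an illustration of the loop being described, not Imbue’s actual implementation.

```python
def generate(task):
    # Stub for a model's first attempt; deliberately wrong here.
    return {"code": "2 + 2 == 5", "attempt": 1}

def verify(candidate):
    # "Did I do that right?" -- e.g. run tests, type-check,
    # or ask a critic model. Here: just evaluate the claim.
    return eval(candidate["code"])

def repair(candidate):
    # Stub for "let me fix this mistake and come back to you."
    return {"code": "2 + 2 == 4", "attempt": candidate["attempt"] + 1}

def agent(task, max_rounds=3):
    candidate = generate(task)
    for _ in range(max_rounds):
        if verify(candidate):
            return candidate  # only surface work the agent has checked
        candidate = repair(candidate)
    return candidate  # give up and show the user what we have

result = agent("add two and two")
print(result["attempt"])  # 2: one failed check, one repair
```

The point of the loop is exactly the one above: the better `verify` is, the less often a wrong answer ever reaches the user.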
We’ve actually learned some pretty interesting things
when working on verification around kind of,
it turns out that, so in software development,
when you write a function,
you also often write software tests.
You’re testing, okay, did the software
have the expected behavior?
And given really good tests,
actually the underlying models are pretty good
at creating the function or creating the piece of software.
– Right.
– But given the piece of software,
the underlying models are not very good
at creating good tests.
Which is kind of interesting.
– Yeah.
– Any idea why?
– Yeah, one, you know, it’s partly because the models
are probably not trained that much on this particular task
of creating tests.
Two, though, maybe it’s possible.
We don’t know, we’re not 100% sure,
but it’s possible that actually verifying
if something is correct is a harder reasoning problem
than kind of creating the thing in the first place.
– Yeah.
– So it kind of requires this like analysis and judgment.
And so our research direction is primarily focused
on verification: how do we get to models
that are actually able to properly verify
that the output is correct and is
what the user wanted?
And we think of that as the hard problem
in reasoning for agents.
– At the same time, Imbue has, did you pre-train a model?
Did you build a foundation?
You didn’t build the model from scratch,
but it’s a 70 billion parameter model.
– That’s right, we actually did pre-train
a 70 billion parameter model from scratch.
– It is from scratch, okay, my mistake.
– Yeah, we actually learned a ton from that process.
And one of the things we learned was,
actually, we don't know if we need to do tons
of pre-training going into the future.
We’ll see. – Yeah.
– But we got a lot out of post-training on that model.
And so for a lot of the verification work,
we’re actually really interested in post-training,
fine-tuning, reinforcement learning
on top of the underlying models.
– That seems like a good place to ask this question.
What does the future look like?
I almost want to leave it just there, but that’s not fair.
What does the future of AI agents look like?
What is Imbue's approach?
Do you have a roadmap you can share any of?
Where is this heading?
And I know that’s an impossible question to answer
in many ways, but I’m also guessing
you have something of a vision, so.
– Yeah, that’s a great question.
So I talked today about trying to enable everyone
to build software, but really internally,
the way we think about it is all software
in the future will be agents, basically.
I mean, not all software, but most software.
It’ll be a little bit smarter,
kind of just like living software.
And what we want to do is enable everyone
to be able to build agents, so that in the world,
we're all able to build our own agents for ourselves,
or use each other's, copying someone's
and modifying it a little bit for ourselves.
And that’s kind of what our product
is meant to do over the long term.
And so, alongside our 70 billion parameter model,
we released a set of blog posts that taught people
how to set up infrastructure to train such models.
We expect that most of us won't train our own models,
but it’s kind of part of this desire
to democratize a lot of these capabilities.
We also released a toolkit for people doing evaluations
of their models, with clean data and everything.
And so in terms of kind of what’s the future
of building agents, my hope is that agent building,
unlike software building, is not something
that only a few people are able to do well.
My hope is that it's something
that's widely democratized,
where everyone is empowered
to be able to create their own agents.
And I think right now we have this very scary view
that somebody else is going to create a system
that automates my job.
And that sucks.
That’s like really disempowering.
I don’t want that for my job.
– Yep.
– But the thing I love doing is automating parts
of my own job.
– Yeah.
– I just, you know, love making it better
in all of these different ways.
And that’s what we want to enable people to be able to do.
Like by giving you the tools to make your own agents,
that means that you can make your own things
that automate parts of your job.
And now your job can be higher leverage and higher level.
And now you can do a lot more.
– Right.
– And so, you know,
someone else automating my job is very disempowering to me,
but someone giving me the tools
so that I can make my own tools for myself:
that's very empowering for me.
And I think this mentality shift
is actually really important.
– Amen.
Kanjun, for folks listening
who would like to learn more about Imbue.
You mentioned a blog as well.
Where should they start online?
Website, social media, podcast.
There’s a podcast, I think.
Where should they go?
– Great question.
So imbue.com is where we are on the internet.
And you can follow our Twitter account imbue.ai.
And we will have, you know,
as we start to release the product a little bit more
publicly, we’ll probably have announcements
and things where you can start experimenting
with what we’re doing.
So please do follow us.
There’s also a newsletter sign up
where we send extremely rare emails.
(laughing)
Because we mostly focus on building.
– You're not an application trying to constantly extract attention?
– No, no, not trying to get your attention,
trying to make a useful product.
– Good, good, good.
Ken June, this is delightful.
Thank you so much for taking the time to come on the pod.
And best of luck with everything you’re doing.
Maybe we can check in again down the road.
– Definitely.
Thank you, Noah.
This is super fun.
(upbeat music)
In this episode of the NVIDIA AI Podcast, Kanjun Qiu, CEO of Imbue, explores the emerging era where individuals can create and utilize their own AI agents. Drawing a parallel to the personal computer revolution of the late 1970s and 80s, Qiu discusses how modern AI systems are evolving to work collaboratively with users, enhancing their capabilities rather than just automating tasks.