Marc Andreessen and Amjad Masad: English As the New Programming Language


AI transcript
0:00:02 We’re dealing with magic here that we I think probably all would have thought was
0:00:05 impossible five years ago, or certainly ten years ago. This is the most amazing
0:00:07 technology ever and it’s moving really fast and yet we’re still like really
0:00:11 disappointed. Like it’s not moving fast enough and like it’s maybe right
0:00:13 on the verge of stalling out. We should both be like hyper excited but also on
0:00:17 the verge of like slitting our wrists because like you know the gravy train is coming to an end.
0:00:21 Right. It is faster but it’s not at computer speed right? Right. What we
0:00:25 expect computer speed to be. It’s sort of like watching a person work. It’s like
0:00:32 watching John Carmack on cocaine. The world’s best programmer on a stimulant.
0:00:37 On a stimulant. Yeah that’s right. Every few decades programming takes a massive leap
0:00:42 forward and this might be the biggest one yet. In this episode Marc Andreessen and I
0:00:47 are joined by Amjad Masad, CEO and founder of Replit, to talk about how AI agents are
0:00:52 changing what it means to code. We discuss the end of syntax, the rise of agents that
0:00:56 can think and build software for hours, and how reinforcement learning and verification
0:01:01 loops are pushing AI towards something that looks a lot like reasoning. And finally, Amjad
0:01:05 shares his story from hacking his university database in Jordan to building one of the
0:01:09 most powerful developer tools in the world. Let’s get into it.
0:01:16 So let’s start with, let’s assume that I’m a sort of a novice programmer. So maybe I’m a
0:01:19 student or maybe I’m just somebody I took a few coding classes and I’ve hacked around a
0:01:23 little bit or I don’t know, I do Excel macros or something like that. But I’m not, like,
0:01:28 a master craftsman of coding. And somebody tells me about Replit and
0:01:32 specifically AI and Replit. What’s my experience when I launch in with what Replit is today with AI?
0:01:37 Yeah, I think the experience of someone with no coding experience or some coding experience
0:01:42 is largely the same when you go into Replit. The first thing we try to do is get all the
0:01:46 nonsense away from setting up development environment and all of that stuff and just have you focus on
0:01:51 your idea. So what do you want to build? Do you want to build a product? Do you want to solve a problem?
0:01:55 Do you want to do a data visualization? So the prompt box is really open for you. You can put in
0:02:01 anything there. So let’s say you want to build a startup. You have an idea for a startup. I would
0:02:07 start with a paragraph-long kind of description of what I want to build. The agent will read that and
0:02:12 will punch it out. Standard English. Standard English. You just type it in. I want to sell
0:02:16 crepes online. So you just, like, type in: I want to sell crepes online. It literally could be just
0:02:22 those four or five words. Okay. Or, if you have a programming language you prefer or a stack
0:02:25 you prefer. You could do that. But we actually prefer for you not to do that because we’re going to pick
0:02:32 the best thing for you — we’re going to classify the best stack for that request. If it’s a data app, we’ll
0:02:37 pick Python, Streamlit, whatever. If it’s like a web app, we’ll pick JavaScript and Postgres and things
0:02:41 like that. So you just type that. Or you can decide. You can say: I want to do it in Python — I know Python, or I’m
0:02:45 learning Python in school. That’s right. The cool thing about Replit is
0:02:49 we’ve been around for almost 10 years now and we’ve built all this infrastructure. Replit runs any
0:02:53 programming language. So if you’re comfortable with Python, you can go in and do that for sure.
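A minimal sketch of the stack classification described here. The keyword heuristic, the categories, and the stack choices below are hypothetical stand-ins — Replit’s actual classifier presumably uses the model itself — but the shape of the decision is the same:

```python
# Hypothetical sketch: route a plain-English request to a default stack.
# A real classifier would be an LLM call; this keyword version just
# illustrates picking a stack so the user doesn't have to.

def classify_stack(prompt: str) -> dict:
    p = prompt.lower()
    if any(w in p for w in ("chart", "visualization", "dashboard", "data")):
        return {"language": "Python", "framework": "Streamlit", "db": None}
    if any(w in p for w in ("sell", "store", "shop", "site", "app")):
        return {"language": "JavaScript", "framework": "React", "db": "Postgres"}
    return {"language": "Python", "framework": "Flask", "db": "SQLite"}

print(classify_stack("I want to sell crepes online"))
# {'language': 'JavaScript', 'framework': 'React', 'db': 'Postgres'}
```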
0:02:57 Okay. And then just again — I know this is obvious to people who have used it — but, like, I’m dealing in English.
0:03:03 Yes. Go ahead. Yes. You’re fully in English. I mean, just a little bit of a sort of background here.
0:03:10 Like, when I came here and pitched to you, like, 10 years ago or whatever — seven years ago — what we were
0:03:16 saying, we were exactly describing this future: that everyone would want to build software. And the
0:03:21 thing that’s kind of getting in people’s way is all the, what Fred Brooks called the accidental
0:03:25 complexity of programming, right? There’s, like, essential complexity, which is like, how do I
0:03:32 bring my startup to market and how do I build a business and all of that? Accidental complexity
0:03:37 is what package manager do I use and all of that stuff. We’ve been abstracting away that for so many
0:03:43 years. And the last thing we had to abstract away is code. Right. I had this realization last year,
0:03:47 which is, I think we built an amazing platform, but the business is not performing. And the reason
0:03:52 the business is not performing is that code is the bottleneck. Right. Yes. All the other stuff is
0:03:57 important to solve, but syntax is still an issue. Syntax is just an unnatural thing for people. So
0:04:00 ultimately English is the programming language. Right.
0:04:05 By the way, just to clarify, does it work with other world languages other than English at this point?
0:04:08 Yes. You can write in Japanese and we have a lot of users, especially Japanese. Okay.
0:04:12 That tends to be very popular. So does it support these days, like does AI support every language or is
0:04:16 it still, do you still have to do custom work to craft a new language? No, most mainstream languages
0:04:20 that have a hundred million plus speakers, AI is pretty good at. Okay. Yeah. Yeah.
0:04:24 Wow. So I did a bit of historical research recently, for some reason. I just want to
0:04:29 understand the moment we’re in, and because it’s such a special moment, it’s important to contextualize it.
0:04:36 And I read this quote from Grace Hopper. So Grace Hopper invented the compiler, as you know.
0:04:42 At the time, people were programming machine code, and that’s what programmers did. That’s what the
0:04:47 specialists do. Yes. And she said: specialists will always be specialists, they have to learn
0:04:52 the underlying machinery of computers, but I want to get to a world where people are programming in English.
0:04:57 That’s what she said. That’s before Karpathy, right? That’s 75 years ago. And that’s why
0:05:02 she invented the compiler. And in her mind, like, C programming is English, but that was just the
0:05:07 start of it. You had C and then you go higher level, Python, JavaScript. And I think we’re at
0:05:13 a moment where it’s the next step. Right. Instead of typing syntax, you’re actually typing thoughts,
0:05:15 you know, which is what we ultimately want. Right. And the machine writes the code.
0:05:17 And the machine writes the code. Right, right. Yeah.
0:05:20 I remember it. You’re probably not old enough to remember, but I remember when I was a kid,
0:05:24 there were higher level languages by the seventies, like BASIC and so forth, and Fortran and C, but you
0:05:28 still would run into people who were doing assembly programming, assembly language — which, by the way,
0:05:31 you still see: game companies or whatever still do assembly to squeeze performance out.
0:05:33 And they were hating on the kids that were doing BASIC.
0:05:36 So the assembly people were hating on the kids doing BASIC, but there were also older coders who hated on
0:05:39 the assembly programmers for doing assembly and not, and not doing it.
0:05:41 No, no, no, not doing direct machine code. Right.
0:05:45 Not doing direct zeros-and-ones machine code. So, for people who don’t know, assembly language is sort of this
0:05:48 very low level programming language that sort of compiles to actual machine code. Yeah.
0:05:52 It’s incomprehensible gibberish to most programmers, even most programmers writing in octal or something.
0:05:54 You’re writing like very, very close to the hardware, but even still,
0:05:57 it’s still a language that compiles to zeros and ones. Right.
0:06:00 Whereas the actual real programmers actually wrote in zeros and ones. Yeah.
0:06:03 And so there’s always this tendency for the pros to look down their noses. Yeah.
0:06:06 And say the new people are being basically sloppy. They don’t understand what’s happening.
0:06:08 They don’t really understand the machine. And then of course,
0:06:11 what the higher level abstractions do is democratize. Right.
0:06:15 The absolute irony is I was part of the JavaScript revolution. I was at Facebook
0:06:20 before starting Replit and we built the modern JavaScript stack. We built React.js and all the
0:06:25 tooling around it. And we got a lot of hate from the programmers saying you should type vanilla
0:06:30 JavaScript directly. And I was like, okay, whatever. And yeah, now that’s mainstream. And now those
0:06:37 guys that built their careers on the last wave we invented are hating on this new wave. People never change.
0:06:40 Okay. Got it. Okay. So you’re typing English. I want to sell crepes online. I want to do this.
0:06:43 I want to have a t-shirt, whatever the business is. Okay. What happens then?
0:06:49 Yeah. And then Replit Agent will show you what it understood. So it’s trying to build
0:06:53 a common understanding between you and it. And I think there’s a lot of things we can do better
0:06:58 there in terms of UI. But for now, I’ll show you a list of tasks. I’ll tell you, I’m going to go set
0:07:04 up a database because you need to store your data somewhere. We need to set up Shopify or Stripe because we need to
0:07:08 accept payments. And then it shows you this list and gives you two options initially.
0:07:13 Do you want to start with a design so that we can iterate back and forth to get the design locked down?
0:07:17 Or do you want to build a full thing? Hey, if you want to build a full thing, we’ll go for 20,
0:07:23 30, 40 minutes. And the agent will tell you, go here, install the app. I’m going to go set up the
0:07:28 database, do the migrations, write the SQL, build the site. I’m going to also test it. So this is a recent
0:07:34 innovation we did with Agent 3 is that after it writes the software, it spins up a browser and
0:07:39 goes around and tests in the browser. And then for any issue, it kind of iterates, kind of goes and fixes the
0:07:43 code. So we’ll spend 20, 30 minutes building that. I’ll send you a notification. I’ll tell you the app
0:07:48 is ready. So you can test it on your phone or go back to your computer. You’ll see, maybe you’ll find
0:07:53 a bug or an issue. You’ll describe it to the agent. I’ll say, hey, it’s not exactly doing what I expected.
0:07:58 Or if it’s perfect, you’re ready to go. And that’s it. By the way, there’s a lot of examples
0:08:02 where people just get their idea in 20, 30 minutes, which is amazing. You just hit publish.
0:08:09 You hit publish, a couple of clicks, you’ll be up in the cloud. We’ll set up a virtual machine in
0:08:14 the cloud. The database is deployed. Everything’s done. Now you have a production database. So think
0:08:19 about the steps needed just two or three years ago in order to get to that step. You have to set up
0:08:23 your local development environment. You have to sign up for an AWS account. You have to provision
0:08:27 the databases, the virtual machines. You have to create the entire deployment pipeline.
0:08:32 All of that is done for you. And it’s just, a kid can do it. A layperson can do it.
0:08:37 If you’re a programmer and you’re curious about what the agent did, the cool thing about Replit,
0:08:42 because we have this history of being an IDE, you can peel the layers. You can open the file tree,
0:08:47 and you can look at the files. You can open Git. You can push to GitHub. You can connect it to your
0:08:51 editor if you want. You can open it in Emacs. So the cool thing about Replit, yes, it is a
0:08:55 vibe coding platform that abstracts away all the complexities, but all the layers are there for you
0:08:59 to look at. That was great. But let’s go back to what you said. You say, I’ve got my idea. You plug it
0:09:02 in and it gives you this list of things. And then when you described it, you said, I’m going
0:09:06 to do this, I’m going to do that. The I there, in that case, was the agent as opposed to the user.
0:09:10 Yes. And so the agent lists the set of things that it’s going to do. And then the agent actually does
0:09:15 those things. Agent does those things. Okay. Yeah. That’s a very important point. When we did this
0:09:21 shift, we hadn’t realized internally at Replit how much the actual user stopped being the human user,
0:09:26 and it’s actually the agent programmer. Right. So one really funny thing happened is we had servers in
0:09:32 Asia. And the reason we had servers in Asia is because we wanted our Indian or Japanese users to have a
0:09:37 shorter time to the servers. When we launched the agent, their experience got significantly worse.
0:09:41 And we’re like, what happened? Like, it’s supposed to be faster. Well, it turns out it’s worse. It’s
0:09:46 because the AIs are sitting in the United States. And so the programmer is actually in the United States.
0:09:50 You’re sending the request to the programmer and the programmer is interfacing with the machine
0:09:56 across the world. And so, yes, suddenly the agent is the programmer. Okay. So the new terminology,
0:09:59 agent is a software program that is basically using the rest of the system
0:10:05 as if it were a human user, but it’s not. It’s a bot. That’s right. It has access to tools such
0:10:12 as write a file, edit a file, delete a file, search the package index, install a package,
0:10:18 provision a database, provision object storage. It is a programmer that has the tools and
0:10:23 an interface that is very similar to a human programmer’s.
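To make that concrete: a toy version of such a tool interface, in Python. The tool names and the scripted call sequence are hypothetical; in a real agent an LLM chooses each next call after seeing the previous observation, but the dispatch loop looks roughly like this:

```python
import json

# Hypothetical tool registry: the agent "uses the system" only through tools
# like these (write a file, read a file, install a package, and so on).
def write_file(path: str, content: str) -> str:
    with open(path, "w") as f:
        f.write(content)
    return f"wrote {len(content)} bytes to {path}"

def read_file(path: str) -> str:
    with open(path) as f:
        return f.read()

TOOLS = {"write_file": write_file, "read_file": read_file}

def run_agent(tool_calls):
    """Dispatch a sequence of (tool_name, kwargs) decisions. In a real agent,
    the model picks each next call after seeing the previous observation."""
    observations = []
    for name, kwargs in tool_calls:
        result = TOOLS[name](**kwargs)
        observations.append({"tool": name, "result": result})
    return observations

# A scripted stand-in for what the model would decide step by step:
print(json.dumps(run_agent([
    ("write_file", {"path": "app.py", "content": "print('hello')\n"}),
    ("read_file", {"path": "app.py"}),
]), indent=2))
```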
0:10:26 And then, you know, we’ll talk more about how this all works, but a debate inside the AI industry
0:10:31 is with this idea now of having agents that do things
0:10:34 on your behalf and then go out and kind of accomplish missions.
0:10:38 There’s this, you know, kind of debate, which is — okay, obviously, you know,
0:10:42 it’s a big deal even to have an AI agent that can do relatively simple things; to do complex things,
0:10:44 of course, is, you know, one of the great technical challenges of the last 80 years,
0:10:48 you know, to do that. And then there’s this sort of this question of like, can the agent go out and
0:10:53 run and operate on its own for five minutes, you know, for 15 minutes, for an hour, for eight hours,
0:10:58 and meaning like, sort of like, how long does it maintain coherence? Like, how long does it actually
0:11:03 like stay in full control of its faculties and not kind of spin out? Because at least the early agents
0:11:07 or the early AIs, if you set them off to do this, they might be able to run for two or three minutes,
0:11:10 and then they would, they would start to get confused and go down rabbit holes and,
0:11:16 you know, kind of spin out. More recently, you know, we’ve seen that agents can run a lot longer
0:11:22 and do more complex tasks. Like, where are we on the curve of agents being able to run for how long
0:11:28 and for what complexity tasks before they break? That’s absolutely the, I think the main metric
0:11:33 we’re looking at. Even back in 2023 — you know, I’ve had the idea for software agents, you know,
0:11:38 four or five years ago now — the problem every time we attempted them was the problem of coherence:
0:11:44 you know, they’ll go on for a minute or two, and then they’ll just, you know, compound errors
0:11:48 in a way that they just can’t recover from. And you can actually see it, right? Because
0:11:52 they actually, if you watch them operate, they get increasingly confused, and then, you know,
0:11:57 maybe even deranged. Yeah, very deranged, and they go into very weird areas, and sometimes
0:12:03 they start speaking Chinese and doing really weird things. But I would say sometime around
0:12:11 last year, we maybe crossed a three, four, five minute mark. And it felt to us that, okay,
0:12:18 we’re on a path where long, you know, long horizon reasoning is getting solved. And so we made
0:12:22 a bet. And I tell my team… So, sorry, long horizon reasoning, meaning,
0:12:29 reasoning meaning like dealing in like facts and logic in a sort of complex way,
0:12:33 and the long horizon being over a long period of time. Yes. With many, many steps to a reasoning
0:12:37 process. Yeah, that’s right. So if you think about the way large language models work, is that they
0:12:43 have a context. This context is basically the memory, all the texts, all your prompts, and also all the
0:12:48 internal talk that the AI is doing as its reasoning. So when the AI is reasoning, it’s actually talking to
0:12:53 itself. It’s like, oh, now I need to go set up a database. Well, what kind of tool do I have? Oh,
0:12:59 there’s a tool here that says Postgres. Okay, let me try using that. Okay, I use that. I got feedback.
0:13:06 Let me look at the feedback and read it. And so that prompt box or context
0:13:14 is where both the user input, the environment input, and the internal thoughts of the machine
0:13:20 all live. It’s sort of like a program’s memory, a memory space. And so reasoning over that
0:13:26 was the challenge for a long time. That’s where AIs just, like, went off track. And now they’re able
0:13:32 to kind of think through this entire thing and maintain coherence. And there’s now techniques
0:13:38 around compression of contexts. So the context length is still a problem, right? So
0:13:45 I would say LLMs today, you know, they’re marketed as a million token length, which is like a million words
0:13:51 almost. In reality, it’s about 200,000. And then they start to struggle. So we do a lot of,
0:13:58 you know, we stop, we compress the memory. So if a portion of the memory is saying that
0:14:04 I’m getting all the logs from the database, you can summarize, you know, paragraphs of logs with one
0:14:09 statement — the database is set up, that’s it, right? And so every once in a while, we’ll compress the context
0:14:14 so that we make sure we maintain coherence. So there’s a lot of innovation that happened outside
0:14:19 of the foundation models as well in order to enable that long context coherence.
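A minimal sketch of that compression step, assuming a crude word count in place of a real tokenizer and a hypothetical summarize() helper (in practice, another model call). The idea is just to swap old, verbose tool output — like database logs — for one-line summaries once the context nears its budget:

```python
# Sketch of context compaction. The token count is a crude proxy and
# summarize() is a stand-in for a model call that writes the summary.

MAX_TOKENS = 200_000  # roughly where, per the discussion, coherence degrades

def tokens(msg: dict) -> int:
    return len(msg["text"].split())  # stand-in for a real tokenizer

def summarize(msg: dict) -> dict:
    # Hypothetical: a production system would ask an LLM for a one-line summary.
    return {"role": msg["role"],
            "text": f"[summary of {tokens(msg)} tokens of {msg['role']} output]"}

def compact(context: list[dict]) -> list[dict]:
    """Replace the oldest verbose tool outputs with summaries until we fit."""
    context = list(context)
    i = 0
    while sum(map(tokens, context)) > MAX_TOKENS and i < len(context):
        if context[i]["role"] == "tool" and tokens(context[i]) > 50:
            context[i] = summarize(context[i])
        i += 1
    return context

# Demo: a huge database log gets collapsed; the user's request is untouched.
ctx = [
    {"role": "tool", "text": "db log line " * 100_000},  # ~300k "tokens"
    {"role": "user", "text": "now add a checkout page"},
]
print([m["text"][:40] for m in compact(ctx)])
```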
0:14:22 And what was the key technical breakthrough in the foundation models
0:14:23 that made this possible, do you think?
0:14:29 I think it’s RL. I think it’s reinforcement learning. So the way pre-training works —
0:14:38 you know, pre-training is the first step of training a large language model — it reads a piece
0:14:43 of text, covers the last word, and tries to guess it. That’s how it’s trained. That doesn’t really
0:14:50 imply long context reasoning. It, you know, turns out to be very, very effective. It can
0:14:57 learn language that way. But the reason we weren’t able to move past that limitation is that that
0:15:04 modality of training just wasn’t good enough. And what you want is a type of problem solving
0:15:12 over long context. So what reinforcement learning, especially from code execution,
0:15:21 gave us is the ability for the LLM to roll out what we call trajectories in AI.
0:15:29 So a trajectory is a step-by-step reasoning chain in order to reach a solution. The way,
0:15:34 as I understand it, reinforcement learning works is they put the LLM in a programming environment
0:15:41 like Replit and say, hey, here’s a code base, here’s a bug in the code base,
0:15:46 and we want you to solve it. Now the human trainer already knows what the solution would look
0:15:50 like. So we have a pull request that we have on GitHub, so we know exactly — or we have a unit test
0:15:55 that we can run and verify the solution. So what it does is it rolls out a lot of different trajectories.
0:16:01 They sample the model, and a lot of those trajectories will
0:16:08 just go off track, but maybe one of them will reach the solution by solving the bug, and it reinforces
0:16:12 on that. So that gets a reward, and the model gets trained that, okay, you know, this is how you
0:16:16 solve these types of problems. And so that’s how we were able to extend these reasoning chains.
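Schematically, the training loop being described might look like the following. The toy policy, the verifier, and the reward bookkeeping are all stand-ins for what the labs actually run — real systems sample from the LLM and update weights with policy-gradient methods — but the shape (roll out many trajectories, reward the ones the tests accept) is the one described:

```python
import random

# Schematic RL-from-code-execution loop. Everything here is a stand-in.

def run_unit_tests(candidate_patch: str) -> bool:
    # The verifier: in SWE-bench-style setups this runs the repo's real tests.
    return "fix" in candidate_patch

def rl_step(policy, bug_report: str, n_rollouts: int = 8):
    # Roll out many attempts at the same problem...
    trajectories = [policy(bug_report) for _ in range(n_rollouts)]
    rewards = [1.0 if run_unit_tests(t) else 0.0 for t in trajectories]
    # ...and "reinforce": keep (in real training, upweight) the passing ones.
    return [t for t, r in zip(trajectories, rewards) if r > 0]

toy_policy = lambda bug: random.choice(["apply off-by-one fix", "refactor only"])
print(rl_step(toy_policy, "IndexError in pagination"))
```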
0:16:23 Got it. And, as a two-part question: how good are the models now at long
0:16:28 reasoning? And, I would say, how do we know — how is that established?
0:16:39 There is a nonprofit called METR that has a benchmark to
0:16:46 measure how long a model runs while maintaining coherence and doing useful things, whether
0:16:51 it’s programming or other benchmark tasks that they’ve done. And they put up a paper,
0:17:00 I think late last year, that said every seven months the minutes that a model can run
0:17:06 is doubling. So you go from two minutes to, you know, four minutes in seven months. I think they’ve
0:17:10 vastly underestimated that. Is that right? Vastly. It’s doubling more often than every seven
0:17:16 months. So with Agent 3, we measure that, you know, very closely. And we measure that in
0:17:21 real tasks from real users. So we’re not doing benchmarking. We’re actually doing AB tests and
0:17:27 we’re looking at the data on how users are successful or not. For us, the absolute sign
0:17:30 of success is you made an app and you published it, because when you publish it, you’re paying extra
0:17:34 money. You’re saying this app is economically useful. I’m going to publish it. So that’s as
0:17:40 clear cut as possible. And so what we’re seeing is, in Agent 1, the agent can run for two minutes
0:17:47 and then perhaps struggle. Agent 2 came out in February. It ran for 20 minutes. Agent 3,
0:17:54 200 minutes. Okay. Some users are pushing it to like 12 hours and things like that. I’m less confident
0:18:00 that it is as good when it goes to those stratospheres, but at a, like, two-, three-hour
0:18:05 timeline, it is really, it’s insanely good. And the main innovation
0:18:13 outside of the models is a verification loop. Actually, I remember reading a research
0:18:21 paper from Nvidia. What Nvidia did is they were trying to write GPU kernels using DeepSeek.
0:18:26 And that was like perhaps seven months ago, when DeepSeek came out. And what they found is that if
0:18:32 we add a verifier in the loop — if we can run the kernel and verify it’s working — we’re able to run DeepSeek
0:18:39 for like 20 minutes. And it was generating actually optimized kernels. And so I was like, okay, the next
0:18:47 thing for us — obviously, as a sort of an agent lab, an applied lab, our company, we’re not doing the
0:18:51 foundation model stuff, but we’re doing a lot of research on top of that — and so, okay, we know that
0:18:58 agents can run for 10, 20 minutes now, or LLMs can stay coherent for longer, but for you to push them to
0:19:05 200, 300 minutes, you need a verifier in the loop. So that’s why we spend all our time creating
0:19:11 scaffolds to make it so that the agent can spin up a browser and do computer-use-style testing. So once
0:19:18 you put that in the middle, what’s happening is it works for 20 minutes, it spins up another agent,
0:19:24 spins up a browser, tests the work of the previous agent. So it’s a multi-agent system. And if it
0:19:31 finds a bug, it starts a new trajectory and says, okay, good work. Let’s summarize what you
0:19:37 did the last 20 minutes. Now that, plus the bug that we found, that’s a prompt for a new
0:19:41 trajectory, right? So you stack those on each other and you can go endlessly. But it’s like a marathon,
0:19:46 like setting up a marathon or like a relay race. As long as each step was done properly,
0:19:49 you could do a sort of infinite number of steps. That’s right. That’s right. You can always
0:19:53 compress the previous step into a paragraph and that becomes a prompt. So it’s an agent
0:19:57 prompting the next agent. Right, right, right. That’s amazing.
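A sketch of that relay-race structure, with hypothetical build, test, and summarize helpers standing in for agent and model calls. Each leg’s work, plus any bugs the tester finds, is compressed into the prompt that starts the next leg:

```python
# Hypothetical relay loop: a builder agent works a leg, a tester agent
# verifies in a browser, and a compressed summary of (work + bugs found)
# becomes the prompt for the next leg. All three helpers are stand-ins.

def build(prompt: str) -> str:
    return f"app built from: {prompt}"       # stand-in builder agent

def test_in_browser(app: str) -> list[str]:
    return []                                 # stand-in tester agent

def summarize_leg(app: str, bugs: list[str]) -> str:
    return f"Previous leg: {app}. Open bugs: {bugs}."  # compression step

def relay(initial_prompt: str, max_legs: int = 10) -> str:
    prompt = initial_prompt
    app = ""
    for _ in range(max_legs):
        app = build(prompt)
        bugs = test_in_browser(app)
        if not bugs:
            return app                        # verifier satisfied: stop
        prompt = summarize_leg(app, bugs)     # seed the next trajectory
    return app

print(relay("online crepe storefront with Stripe checkout"))
```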
0:20:00 So then when a modern agent, running on modern LLMs that are trained this way —
0:20:05 let’s say it runs for 200 minutes — when you watch the agent run, is it
0:20:11 processing through logic and tasks at the same pace that a human being does, or slower, or faster?
0:20:18 It is actually, I would say, faster, but not that significantly faster. It’s not
0:20:21 at computer speed, right? Right. What we expect computer speed to be.
0:20:24 It’s like — if you watch it, if it’s describing what it’s doing,
0:20:29 it’s sort of like watching a person work. It’s like watching John Carmack on cocaine work.
0:20:36 The world — okay. The world’s best programmer. Yeah. The world’s best programmer on a
0:20:40 stimulant. On a stimulant. Yeah, that’s right. Okay. Working for you. Working for you. Yeah.
0:20:46 So it’s very fast and you can see the file edits running through, but every once in a while it’ll
0:20:50 stop and it’ll start thinking — it’ll show you the reasoning. Yeah. It’s like, I did this and I did
0:20:55 this. Am I on the right track? It kind of really tries to reflect. Right. And then it might review its
0:21:01 work and decide the next step, or it might kick into the testing agent or, you know, so, so you’re
0:21:05 seeing it do all of that. And every once in a while it calls a tool — for example, it stops and says,
0:21:14 well, we ran into an issue, you know, Postgres 15 is not compatible with this, you know,
0:21:20 database or npm package that I have. Okay, this is a problem I haven’t seen before. I’m going
0:21:25 to go search the web. So it has a web search. We’ll go do that. And so it looks like a human
0:21:29 programmer. Right. And it’s really fascinating to watch it. So one of my favorite things to do is
0:21:34 just to watch the tool chain and reasoning chain and the testing chain. And it is, yeah, it is like
0:21:39 watching a hyper-productive programmer. Right. So, you know, we’re kind of getting into, here,
0:21:43 kind of the holy grail of AI, which is sort of, you know, generalized reasoning by the
0:21:48 machine. So, you mentioned this a couple of times, this idea of
0:21:53 verification. So just for folks listening to the podcast who maybe aren’t in the details,
0:21:56 let me try to describe this and see if I have it right. So, just a large
0:22:00 language model, the way you would have experienced it with, like, ChatGPT out of
0:22:04 the gate two years ago or whatever, would have been: it’s incredible how fluid
0:22:09 it is at language. It’s incredible how good it is at, like, writing Shakespearean sonnets or rap lyrics.
0:22:12 It’s amazing how good it is at human conversation. But if you start to ask it,
0:22:17 like, problems that involve rational thinking or problem solving, all of a sudden,
0:22:20 you know — or math — the whole show falls apart. And in the very beginning,
0:22:23 if you asked it very basic math problems, you know, it would
0:22:26 not be able to do them. That’s right. But then even when it got better at those,
0:22:30 if you started to ask it, you know — it could maybe add two small numbers together,
0:22:33 but it couldn’t add two large numbers together. Or if it could add two large numbers,
0:22:36 it couldn’t multiply them. Yeah. And it’s just like, all right, this is true. And then there was
0:22:39 this sort of famous strawberry test, the famous
0:22:42 strawberry test, which is: how many R’s are in the word strawberry. That’s right. And there was this
0:22:46 long period where it would just guess wrong. It would say there are only two R’s
0:22:51 in the word strawberry. And then it turns out there are three. So it was this thing.
0:22:55 And so there was even this term being used — kind of the slur that was
0:23:00 being used at the time was stochastic parrot. Yeah. I was thinking clanker. Well, clanker is
0:23:05 the new slur. Clanker is just the full-on racial slur. I guess AI is a species.
0:23:10 But the technical critique was so-called stochastic parrot — stochastic means random.
0:23:14 Yeah. So, sort of random parrot, meaning basically that
0:23:18 the large language models were like a mirage, where they were repeating
0:23:20 back to you things that they thought that you wanted to hear, but they didn’t.
0:23:23 And in a way it’s true in the, in the pure pre-training LLM world.
0:23:26 Right. For the, for the very basic layer. But then what happened is, as you said,
0:23:30 over the last year or something, there was this layering in of reinforcement learning. And then,
0:23:31 but the key to-
0:23:34 And it’s not new, crucially, it’s like, it’s AlphaGo, right?
0:23:34 Right. So,
0:23:36 Okay. Describe, so describe that for a second.
0:23:43 Yeah. So we had this breakthrough before. 2015 was the AlphaGo breakthrough —
0:23:49 I think 2015, 2016 — where there was a merging of, you know — you would know
0:23:54 a lot better than me — the old AI debate between the connectionists, the people who
0:24:00 think neural networks are the true sort of way of doing AI, and the symbolic systems people,
0:24:05 I think, the people that think that, you know, discrete reasoning or if statements and
0:24:09 knowledge bases, whatever, this is the way to go. And so there was a merging of these two
0:24:15 worlds, where the way AlphaGo worked is it had a neural network, but it had a Monte Carlo tree search
0:24:20 algorithm on top of that. So the neural network would generate
0:24:27 a list of potential moves. And then you had a more discrete algorithm sort those moves and find the
0:24:33 best based on just a tree search — again, this is sort of a verifier
0:24:41 in the loop — trying to verify which move might yield the best result based on a more classical way of doing
0:24:48 algorithms. And so this is a resurgence of that movement, where we have this amazing
0:24:55 generative neural network that is the LLM. And now let’s layer on more discrete ways of trying to
0:24:59 verify whether it’s doing the right thing or not. And let’s put that in a training loop. And once you do
0:25:05 that, the LLM will start gaining new capabilities, such as reasoning over math and code and things
0:25:09 like that. Exactly. Right.
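A tiny illustration of that generate-then-verify pattern: propose() stands in for a policy network suggesting candidate moves, and score() for the discrete search that evaluates them. This is not AlphaGo’s actual algorithm — just the shape of pairing a generative network with a more classical verifier:

```python
import random

# Toy generate-then-verify move selection. propose() stands in for a neural
# policy suggesting candidates; score() stands in for the discrete search
# (in AlphaGo, Monte Carlo tree search) that evaluates each proposal.

def propose(position: str, k: int = 5) -> list[str]:
    return [f"move_{i}" for i in range(k)]

def score(position: str, move: str) -> float:
    rng = random.Random(hash((position, move)))  # deterministic toy evaluator
    return rng.random()

def pick_move(position: str) -> str:
    candidates = propose(position)
    return max(candidates, key=lambda m: score(position, m))

print(pick_move("midgame position"))
```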
0:25:14 Okay. And then that’s great. And then the key thing there, though — for RL to work, for LLMs to reason — the key is that it be a problem statement
0:25:20 for which there is a defined and verifiable answer. Is that right? And so, you might think
0:25:23 about this as, like — let’s give a bunch of examples. In medicine, this might be, you know,
0:25:28 a diagnosis that a panel of human doctors agrees with, or, by the way, a diagnosis
0:25:33 that actually, you know, solves the condition. In law, this would be, you know, an
0:25:37 argument that in front of a jury actually results in an acquittal, or something
0:25:42 like that. In math, it’s an equation that actually solves properly. In physics,
0:25:45 it’s a result that actually works in the real world. I don’t know. In civil engineering,
0:25:48 it’s a bridge that doesn’t collapse. Right. So there’s always some
0:25:49 test.
0:25:55 The caveat is that the first two do not work very well just yet. Like,
0:26:01 I would say law and healthcare, they’re still a little too squishy, a little too soft.
0:26:02 Okay.
0:26:07 It’s unlike math or code. Like, the way that they’re training on math, they’re using this
0:26:11 sort of provable programming language called Lean for proofs, right? So you can
0:26:17 run a Lean statement. You can run computer code. Perhaps you can run a physics simulation
0:26:22 or a civil engineering sort of physics simulation, but you can’t run a diagnosis.
0:26:22 Okay.
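For a sense of what “running a Lean statement” means: a proof either checks or it doesn’t, which is what makes it machine-gradable. A trivial Lean 4 example, using the core lemma Nat.add_comm:

```lean
-- Lean 4: the proof checker verifies this mechanically. Either the proof
-- elaborates and the theorem is accepted, or it fails — a clean true/false
-- signal of exactly the kind RL training can reward.
theorem add_comm' (a b : Nat) : a + b = b + a :=
  Nat.add_comm a b
```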
0:26:24 So, I would say that.
0:26:26 But you could verify it with human answers, or not.
0:26:29 Yeah. So that’s more RLHF, in a way.
0:26:30 Okay.
0:26:35 So it is not, like, the sort of autonomous RL training that’s fully scalable.
0:26:36 Okay.
0:26:41 Which is why coding is moving faster than any other domain — because we
0:26:45 can generate these problems and verify them on the fly.
0:26:48 But with coding — as anybody who’s coded knows —
0:26:50 there’s two tests. One is, does the code compile?
0:26:50 Right.
0:26:52 And then the other is, does it produce the right output?
0:26:52 Right.
0:26:54 And just because it compiles doesn’t mean it produces the right output.
0:26:55 Right.
0:26:57 And — you tell me — verifying that it’s the correct output is harder.
0:27:07 Yeah. So SWE-bench is a collection of verified pull requests and tests.
0:27:10 So it is not just about compiling.
0:27:13 It was made by a group of scientists.
0:27:19 SWE-bench is the main benchmark used to test whether AI is good at software engineering tasks.
0:27:21 And we’re almost saturating that.
0:27:27 So last year it was at, like, maybe 5% — early ’24 — or less.
0:27:32 And now we’re at, like, 82% or something like that with Claude Sonnet 4.5; that’s state of the art.
0:27:36 And that’s like a really nice hill climb that’s happening right now.
0:27:39 And basically they went and looked on GitHub.
0:27:42 They found the, you know, most complex repositories.
0:27:45 They found bug statements that are very clear.
0:27:50 And they found pull requests that actually solve those bug statements, with unit tests and everything.
0:27:56 So there is an existing corpus on GitHub of tasks that the AIs can solve.
0:27:58 And you can also generate them.
0:28:04 Those are not too hard to generate — you know, what’s called synthetic data.
0:28:07 But you’re right.
0:28:12 It’s not infinitely scalable, because some human verifiers still need to
0:28:16 kind of look at the task, but maybe the foundation models have found a way to have
0:28:18 the synthetic training go all the way.
0:28:18 Right.
0:28:21 And then what’s happening, I think, is the foundation model
0:28:23 companies, in some cases, are hiring.
0:28:26 They’re actually hiring human experts to generate new training data.
0:28:26 Yes.
0:28:29 So they’re actually hiring mathematicians and physicists and coders to basically sit.
0:28:33 And, you know, they’re hiring
0:28:34 human programmers, putting them on the cocaine.
0:28:34 Yes.
0:28:37 And having them, probably coffee.
0:28:37 Yeah.
0:28:41 And having them actually write code, and write code in a way
0:28:44 where there’s a known result of the code running, such that this RL loop
0:28:45 can be trained properly.
0:28:45 That’s right.
0:28:48 And then the other thing these companies are doing
0:28:52 is, as you said, they’re building systems where the software itself generates
0:28:56 the training data, generates the tests, generates the validated results.
0:28:58 And then that’s so-called synthetic training data.
0:28:59 That’s right.
0:29:04 And, but yeah, again, those work in the very hard domains; it works to some
0:29:06 extent in the software domains.
0:29:06 Mm-hmm.
0:29:08 And I think there’s some transfer learning.
0:29:12 We can, you can see the reasoning work when it comes to, you know, tools like deep research
0:29:13 and things like that.
0:29:18 But we’re not making as fast progress in the more soft domains.
0:29:23 So, soft domains, meaning, like, domains in which it’s harder or
0:29:28 even impossible to actually verify correctness of a result in a sort of deterministic, factual,
0:29:30 grounded, non-controversial way.
0:29:36 Like if you have a chronic disease — you could have, you know, POTS or,
0:29:43 you know, whatever, EDS syndrome — they’re all clusters, and it’s
0:29:48 because it is the domain of abstraction. It is not as concrete as code and math and things
0:29:51 like that. So I think there’s still a long ways to go there.
0:29:55 Right. So, sort of, the more concrete the problem — like, it’s the concreteness of the problem that
0:29:58 is the key variable, not the difficulty of the problem. Would that be a way to think about it?
0:30:04 Yeah. Yeah. I think the concreteness, in the sense of: can you get a true-or-false
0:30:05 verifiable answer? Right.
0:30:09 But like in any domain, in any domain of human effort in which there’s a verifiable answer,
0:30:10 we should expect extremely rapid progress.
0:30:11 Yes. Right. Okay.
0:30:13 Yes, absolutely. And I think that’s what we’re seeing.
0:30:15 Right. And that, and that for sure includes math, that for sure includes physics,
0:30:18 for sure includes chemistry, for sure includes large areas of code.
0:30:20 That’s right. Right. What else does that include, do you think?
0:30:23 Yeah. Bio, like we’re seeing with proteins.
0:30:26 Like genomics, yeah. Yeah, yeah. Things like that.
0:30:35 I think some, some areas of robotics, there’s a clear outcome, but it’s not that many. I mean,
0:30:37 surprisingly. Well, it depends.
0:30:42 Yeah. Depends on your point of view. Some people might say that’s a lot. So, and then you
0:30:46 mentioned that, you mentioned the pace of improvement. So what would you expect from
0:30:48 the pace of improvement going forward for this?
0:30:54 I think we’re ripping on coding. Like, I think it’s just going — like, I think
0:31:01 it’s going to be like what we’re working on with Agent 4 right now, which is, by next year,
0:31:06 we think you’re going to be sitting in front of Replit and you’re shooting off
0:31:12 multiple agents at a time. You’re, like, planning a new feature. So: I want to, you know, add a social network
0:31:20 on top of my storefront. And another one is like, hey, refactor the database. And you’re running
0:31:24 parallel agents. So you have about five, 10 agents kind of working in the background and they’re merging
0:31:29 the code and taking care of all of that. But you also have a really nice interface on top of that,
0:31:34 that you’re doing design and you’re interacting with AI in a more creative way, maybe using visuals
0:31:39 and charts and things like that. So there’s a multimodal angle of that, of that interaction.
0:31:47 So I think, you know, creating software is going to be such an exciting area. And I
0:31:53 think that the layperson will be as good as what a senior software engineer that works at Google
0:32:00 is today. So I think that’s happening very soon. But, you know — and I’d
0:32:05 be curious about your point of view — but, like, my experience on, sort of,
0:32:12 you know, the, let’s say, healthcare side, or the more, you know, “write me an essay” side, the more creative
0:32:17 side: I haven’t seen as much of a rapid improvement as what we’re seeing in code. So I think
0:32:23 code is going to go to the moon. Math is probably as well. Some, some, you know, scientific domains,
0:32:27 bio, things like that. Those are, are going to move really fast. Yeah. So this is, there’s this,
0:32:31 there’s this weird dynamic. Let’s see if you agree with this — and Erik, I’d also be curious about your
0:32:33 point of view on this. Like, there’s this weird dynamic that we have, and we have this in the
0:32:37 office here a lot. And I also have this with like leading entrepreneurs a lot, which is this thing of
0:32:41 like, like, wow, this is the most amazing technology ever. And it’s moving really fast. And yet we’re
0:32:46 still like really disappointed. Um, and like, it’s not moving fast enough. And like, it’s like,
0:32:49 like maybe right on the verge of stalling out and like, you know, we should both be like
0:32:52 hyper excited, but also on the verge of like slitting our wrists because like, you know,
0:32:57 the gravy train is coming to an end. Right. And I always wonder, it’s like, you know,
0:33:00 on the one hand, it’s like, okay, like, you know, not all, I don’t know, ladders go to the moon,
0:33:03 like just cause something, you know, looks like it works or, you know, doesn’t mean it’s going to,
0:33:06 you know, be able to scale, you’re gonna be able to scale it up and have it work,
0:33:10 you know, to the fullest extent. Um, uh, you know, so like, it’s important to like recognize
0:33:14 practical limits and then not just extrapolate everything to infinity. Um, on the other hand,
0:33:17 like, you know, we’re dealing with magic here that we, I think probably all would have thought
0:33:20 was impossible five years ago or certainly 10 years ago. Like, I didn’t — you know,
0:33:24 look, I got my CS degree in the late eighties, early nineties. I never,
0:33:26 I didn’t think I would live to see any of this. Right. Like, this is just amazing that this is
0:33:32 actually happening in my lifetime. But there’s a huge bet on AGI, right?
0:33:36 Like, whether it’s the foundation models — I think, you know, now the entire U.S. economy is
0:33:43 sort of a bet on AGI — and there are crucial questions to ask about whether we are on track to AGI or not,
0:33:47 right? Because there are some ways that, I can tell you, it doesn’t seem like we’re on track
0:33:53 to AGI, because there doesn’t seem to be transfer learning across these domains that are,
0:33:57 you know, significant. Right. And so if we get a lot better at code,
0:34:03 we’re not immediately getting better at, like, generalized reasoning. We need to also go,
0:34:09 you know, get training data and create RL environments for bio or chemistry or physics or math or law.
0:34:15 And this has been the sort of point of discussion now in the AI community after
0:34:21 the Dwarkesh and Richard Sutton interview, where, you know, Richard Sutton kind of poured
0:34:27 cold water on the bitter lesson. So everyone was citing this essay that he wrote
0:34:34 called “The Bitter Lesson.” The idea is that there are infinitely scalable ways of doing
0:34:42 AI research, and anytime you can pour in more compute and more data and get more
0:34:48 performance out, you’re just — you know, that’s the ultimate way of getting to AGI. And some people,
0:34:56 you know, interpreted that interview as: perhaps he’s doubtful that we’re even on a bitter
0:35:03 lesson path here, and perhaps the current training regime is actually very much the opposite,
0:35:08 in which we are so dependent on human data and human annotation and all of that stuff.
0:35:14 So I think that I agree with you. I mean, as a company, we’re, we’re excited about where things
0:35:20 are headed, but there’s a question of, like, are we on track to AGI or not? And I’d be curious what
0:35:25 you think. So, you know, Ilya — Ilya Sutskever — makes a specific
0:35:27 form of this argument, which is basically, like, we’re just literally running out of training
0:35:30 data. It’s a fossil fuel argument, right? Like, we’ve slurped all the training data —
0:35:34 fundamentally, we’ve slurped all the data off the internet. That is where almost all the data
0:35:37 is at this point. There’s a little bit more data that’s in like, you know, private dark pools
0:35:40 somewhere that we’re going to go get, but we have it all. And then, right, we’re, we’re in this
0:35:43 business now trying to generate new data, but generating new data is hard and expensive,
0:35:47 you know, compared to just like slurping things off the internet. So there are these arguments. Um,
0:35:50 you know, having said that, you get into definitional questions here really quick,
0:35:53 which are kind of a rabbit hole, but having said that — like you mentioned, transfer learning. So
0:35:57 transfer learning is the ability of the machine to be an expert in one domain and then
0:36:01 generalize that into another domain. My answer to that is, like: have you met people?
0:36:03 Yeah. And how many people do you know,
0:36:07 are able to do transfer learning? Yeah, not many. Not many, right? Well, because
0:36:11 they’re quite the opposite, actually. The nerdier they are in a certain domain,
0:36:16 they kind of, you know, often they have blind spots. We joke about how everyone’s just retarded in one
0:36:21 area, or they make some, like, massive mistake and you don’t trust them on this or that other topic,
0:36:21 you know? Right. Yeah. Well,
0:36:24 and this is a well-known thing among, like, for example, public intellectuals. So this happens —
0:36:27 there’s actually been whole books written about this, on so-called public intellectuals. So
0:36:31 you get these people who show up on TV and they’re experts. And what happens is they’re, like, an expert
0:36:34 in economics, right? And then they show up on TV and they talk about politics and they don’t know
0:36:37 anything about politics, right? Or they don’t know anything about like medicine,
0:36:40 or they don’t know anything about the law, or they don’t know anything about computers.
0:36:44 You know, this is Paul Krugman talking about how the internet is going to be no more significant
0:36:45 than the fax machine. Fax, yeah.
0:36:47 He’s a brilliant economist. He has no idea how a computer works.
0:36:48 Is he a brilliant economist?
0:36:56 Well, at one point. At one point — let’s see, even if he’s brilliant —
0:36:59 well, this is the thing: like, what does that mean? Okay, should a brilliant economist be able
0:37:04 to extrapolate, you know, the internet? It’s a good question. But the point being, like,
0:37:07 even if he is brilliant, you know — or take anybody. Oh, by the way, like,
0:37:12 Einstein’s actually my favorite example. I think you’d agree, Einstein was a brilliant physicist.
0:37:12 Yeah.
0:37:16 He was a Stalinist. Like — yeah, he was a socialist and he was
0:37:18 a Stalinist. And he was like — well, he thought Stalin was fantastic.
0:37:19 Well, he’s already out, so.
0:37:21 Yeah. Okay. All right.
0:37:22 True socialism.
0:37:28 All right. All right, Einstein. You know, I’ll take your word for it.
0:37:31 But, like, once he got into politics, he was just, like, totally loopy — or, you know,
0:37:33 right or wrong, it’s just, he just sounded all of a sudden
0:37:37 like an undergraduate lunatic, like somebody in a dorm room. There was no transfer
0:37:42 learning from physics into politics. Like — right or wrong — there was
0:37:46 clearly nothing new in his political analysis. It was the same rote, routine
0:37:50 bullshit you get out of, you know. Yeah. So in a way, the argument you’re making is, like,
0:37:55 we may be already at human-level AI. I mean, perhaps the definition of AGI is something
0:38:00 totally different. It’s, like, above human level — something that truly generalizes across domains.
0:38:01 It’s not something that we’ve seen.
0:38:04 Yeah. Like we’ve idealized — yeah. As I said — and, you know,
0:38:07 look, we should shoot big — but we’ve idealized a goal
0:38:12 that may be idealized in a way that, like, number one, it’s just so
0:38:15 far beyond what people can do that it’s no longer a relevant
0:38:19 comparison to people. And usually AGI is defined as, you know, able to do everything better than a
0:38:22 person can. And it’s like, well, okay. So as for doing everything better than a person can —
0:38:24 it’s like, if a person can’t do any transfer learning at all —
0:38:25 Right.
0:38:29 You know, doing even a little bit, a marginal bit, might actually be better.
0:38:32 Or it might not matter just because no, no human can do it. And so therefore you just,
0:38:36 you just stack up the domains. There’s also this well-known phenomenon in AI — you know,
0:38:39 typically this works the other way — which is a phenomenon AI engineers always complain
0:38:43 about, scientists always complain about, which is: the definition of AI is always the next thing
0:38:47 that the machine can’t do. And so, like, the definition of AI for a long time was, like,
0:38:50 can it beat humans at chess? And then the minute it could beat humans at chess,
0:38:53 that was no longer AI. That was just like, oh, that’s just, like, boring.
0:38:55 That’s computer chess. It became an entire discipline.
0:38:57 Yeah, computer chess. It’s just, like, boring. And now it’s an app on your iPhone and nobody,
0:39:00 and nobody cares. Right. And it’s immediately that.
0:39:01 Turing test was the next one.
0:39:02 The Turing test.
0:39:03 And then we passed it and nobody —
0:39:04 We blew — this is a really big deal.
0:39:05 There was no celebration.
0:39:08 There were no parties. That’s exactly right. There was no party.
0:39:10 For 80 years, the Turing test, I mean, they made a movie about it,
0:39:13 like the whole thing. That was the thing. And like, we blew right through it,
0:39:16 and nobody even registered it. Nobody cares. It gets no credit for it.
0:39:18 We’re just like, ah, it’s still a complete piece of shit.
0:39:18 Like I said, yeah.
0:39:23 Right. And so there’s this thing where, so the AI scientists are used to complaining,
0:39:27 basically, that they’re always being judged against the next thing as opposed to all the
0:39:31 things they’ve already solved. But that’s maybe the other side of it,
0:39:33 which is they’re also setting out for themselves —
0:39:35 An unreasonable goal.
0:39:38 An unreasonable goal. And then doing this sort of self-flagellation kind of along the way.
0:39:38 That’s right.
0:39:41 And I kind of wonder, yeah, I wonder kind of which way that cuts.
0:39:46 Yeah. Yeah. It’s an interesting question. Like, I started thinking about this idea that
0:39:54 it doesn’t matter whether it’s truly AGI. And the way I define AGI is that you put an AI system in
0:39:56 any environment and it efficiently learns. Right.
0:40:00 You know, it doesn’t have to have that much prior knowledge in order to kind of learn something,
0:40:03 but also can transfer that knowledge across different domains.
0:40:09 But, you know, we can get to, like, functional AGI. And what functional AGI is, is just, yeah:
0:40:18 collect data on every useful economic activity in the world today and train an LLM on top of that,
0:40:24 or train the same foundation model on top of that. And we’ll target every sector of the economy,
0:40:29 and you can automate a big part of labor that way. So I think, yeah, I think we’re on that
0:40:31 track. Right. For sure. Right.
0:40:36 Um, you tweeted after GPT-5 came out that you were feeling the diminishing returns. Yeah. What were you
0:40:40 expecting and what needs to be done? Do we need another breakthrough to get back to the pace of
0:40:45 growth, or what are your thoughts? I mean, this whole discussion is sort of about that. And my
0:40:52 feeling is that, you know, GPT-5 got good at verifiable domains. It didn’t feel that much
0:40:58 better at anything else. On the more human angle of it, it felt like it regressed. And, like, you had this
0:41:06 sort of Reddit pitchfork movement against Sam and OpenAI, because
0:41:13 they felt like they lost a friend. GPT-4o felt a lot more human and closer. Whereas GPT-5 felt a lot
0:41:18 more robotic, you know, very in its head, kind of trying to think through everything. And
0:41:26 so I would have just expected — like, when we went from GPT-2 to 3, it was clear it was
0:41:32 getting a lot more human. It was a lot closer to our experience. You can feel like it
0:41:37 actually gets me. Like there’s something about it that understands the world better. Similarly,
0:41:46 three to four. Four to five didn’t feel like it was a better overall being, as it were.
0:41:52 But is that — is the question there, like, is it emotionality?
0:41:59 Partly emotionality, but again, partly — like, I like to ask models very controversial
0:42:07 things. Can it reason through — I don’t know how deep we want to go here, but, like —
0:42:12 what happened with World Trade Center 7? Right, right. Sure.
0:42:17 That’s an interesting question, right? Like, I’m not putting out a theory, but, like,
0:42:22 it’s interesting. Like, how did it — you know — and can it think through controversial
0:42:27 questions in the same way that it can go think through a coding problem? Right.
0:42:31 And there hasn’t been any movement there. Like, all the reasoning and all of
0:42:36 that stuff — I haven’t seen it. And not just that — you know, that’s a cute example — but, like, COVID,
0:42:42 right? Like, you know, the origins of COVID, right? You know, go dig up
0:42:47 GPT-4 or other models and go to GPT-5. You’re not going to find that much difference of,
0:42:51 okay, let’s reason together, let’s try to figure out what the origins of COVID were, because it’s
0:42:55 still an unanswered question, you know. And I don’t see them making progress on that.
0:42:58 I mean, you play a lot with them. Do you feel like — I use it differently. I don’t know, maybe
0:43:03 I have different expectations. The way I — my main use case actually is sort of
0:43:08 a PhD in everything at my beck and call. And so I’m trying to get it to explain things
0:43:12 to me more than I’m trying to, like, you know, have conversations with it. Maybe I’m just
0:43:16 unusual with that, but that gets better. Well, so what I found specifically is,
0:43:20 a combination of, like, GPT-5 Pro plus deep reasoning, or, like, Grok 4 Heavy —
0:43:24 you know, the highest-end models, like that. You know,
0:43:29 they basically generate 30-to-40-page, you know, essentially books on demand, on any topic.
0:43:33 And so anytime I get curious about something — you’re just taking it, maybe it’s my version
0:43:37 of it, but it’s something like — I don’t know, here’s a good example. When
0:43:40 an advanced economy puts a tariff on, you know,
0:43:44 a raw material or on a finished good, like, who pays? You know, is it the consumer?
0:43:48 Is it the importer? Is it the exporter, or is it the producer? And this is actually,
0:43:51 it turns out, a very complicated question. It’s a big, big thing that economists
0:43:55 study a lot. And it’s just like, okay, who, you know, who pays? And what I found for that kind
0:43:59 of thing is it’s outstanding. Well, well, but, but it’s outstanding at, um, sort of going out
0:44:04 on the web, getting information, synthesizing it. Correct. It gives me, it gives me a synthesized
0:44:09 20, 30, 40 page. It basically tops out at 40 pages, 40 pages of PDF. Yeah. Um, uh, but it,
0:44:13 I can get up to 40 pages of PDF, but it’s completely coherent. And as far as I can tell
0:44:18 for everything I’ve cross-checked, I completely like, like world-class, like if I hired, you know,
0:44:23 for a question like that, if I hired like a great, you know, econ PhD postdoc at Stanford who just like,
0:44:27 went out and did that work, like it would maybe be that good. Yeah. Um, but then, but then of
0:44:31 course the significance is it’s like, it’s like, you know, at least for this, this is true for many
0:44:34 domains, you know, kind of a PhD in everything. And so, but, but this is synthesizing knowledge,
0:44:37 not trying to create new knowledge. Well, but this, this, this gets to the,
0:44:41 this sort of, you know, of course you get into the angels dancing on the head of a pin thing,
0:44:44 which is like, what, what, you know, what’s the difference? How many, how much new knowledge
0:44:49 ever actually is there anyway? What do you actually expect from people when you ask them questions? Um,
0:44:52 and so what I’m looking for is like, yes, explain this to me and like the, the, the clearest,
0:44:57 most sophisticated, most complex, most like complete way that it’s possible for somebody to,
0:45:01 you know, for a real expert to be able to, to, to explain things to me. Um, and that’s what I use
0:45:04 it for. And at least, and again, as far as I can tell from the cross-checking, like I’m getting,
0:45:07 you know, like almost like basically a hundred out of a hundred, like, I don’t even think I’ve had an
0:45:11 issue in months, um, where it’s like, had a, had a, had a problem in it. Yeah. And it’s like,
0:45:14 yeah, you can say, yeah, it’s synthesizing as opposed to creating new information,
0:45:18 but like it’s, it’s generating a 40 page, it’s basically generating a 40 page book. That’s amazing.
0:45:22 That’s like incredibly like fluid. It’s, you know,
0:45:26 the logical coherence of the entire thing, like, it’s great writing. Like if, if you,
0:45:31 if you evaluated a, a human author on it, you would say, wow, that’s a great author.
0:45:35 Yeah. You know, do, are people who write books, you know, creating new knowledge? Well,
0:45:39 yeah, well, sort of not because a lot of what they’re doing is building on everything that came
0:45:43 before them as, as synthesizing, but also like a book is a creative accomplishment. Right. And so,
0:45:49 yeah, one of the things I’m, I’m interested in, I’m hoping AI could help us solve this,
0:45:56 just like how confusing the information ecosystem is right now, you know, everything feels like propaganda,
0:45:58 like it doesn’t feel like you’re getting real information from anywhere. So I,
0:46:03 I really want an AI that could help me reason from first principles about what’s happening in
0:46:08 the world. Right. For me to actually get real information. And, and maybe that’s an unreasonable
0:46:14 sort of ask of, of the AI researchers, but, but I don’t think we’re, we have made any progress there.
0:46:17 So maybe I’m over focused, maybe I’m over, maybe in my, my line of work,
0:46:20 maybe I’m over focused on arguing, arguing with people as opposed to, um, trying to get to,
0:46:24 trying to get to underlying truths. But, well, here, here’s the thing I, I do a lot with this,
0:46:28 is I just say like, take, take a provocative point of view, um, and then steel man the position.
0:46:31 Uh, take your COVID thing. Steel man, so I, I often, I have a pair of these: steel man
0:46:35 the position that it was a lab leak, um, and steel man the position that it was natural origins.
0:46:38 Mm-hmm. Um, and, and again, like, is this creativity or not? I don’t know. But like,
0:46:42 what comes back is like 30 pages each of like, wow, like that is like the most compelling case
0:46:45 in the world I can imagine with like every, you know, everything marshaled against it,
0:46:48 like the argument structured in like the most possible, part of the reason that’s not happening
0:46:52 is because it stopped being taboo to talk about a human origin.
0:46:52 Yes.
0:46:58 When it was taboo, the, the AIs would like talk, well, you know, talk down to you. It’s like,
0:47:04 oh, you’re a conspiracy theorist. And so, uh, there’s a, there’s a, you know, a period of time. And so,
0:47:08 to take something truly controversial and they actually, they, they can’t reason about it because
0:47:13 of all the RLHF and all sorts of limitations. And as, as you know, I won’t pick any specific ones here,
0:47:15 but like there, there are certain, certain big models that will still lecture you.
0:47:16 Yeah.
0:47:19 That you’re a bad person for asking that question. But, but, you know, like I, I just, there,
0:47:23 some of them are just like really, really open now to, you know, being able to do these things.
0:47:23 Mm-hmm.
0:47:30 Um, and then, um, yeah. So, okay. So there’s this.
0:47:34 Yeah. So, so basically like ultimately what you’re looking for, like the ultimate thing would be
0:47:38 if there’s something that’s like, I don’t think anybody’s really defined this really well.
0:47:42 Cause it’s not, cause again, it’s like conventional, all the conventional definitions of AGI are like
0:47:43 basically comparing to people.
0:47:43 Yeah.
0:47:48 And there, there, and there it’s always like, you know, it’s, it’s the conventional explanations
0:47:52 of, um, of AGI always, for me, strike me a lot. Like the debate around like
0:47:56 whether a self-driving car works or not, which is, is, is, does a self-driving car work because
0:48:00 it’s a perfect driver? Uh, or does it work because it is, it is better than the human driver and
0:48:04 better than the human driver, I think is actually quite, you know, just like with the, the chess
0:48:07 thing and the go thing. I actually think like that, that that’s like a real thing. And then,
0:48:07 and then, and then there’s the like, is it a perfect driver,
0:48:11 which is, you know, obviously what the, the self-driving
0:48:14 car companies are working for. But then I think you’re looking for something beyond
0:48:17 the perfect driver. You’re looking for the car who like knows where to go.
0:48:22 So I, I, I, I’m of two minds, right? Okay. So one mind is the sort of practical entrepreneur.
0:48:28 Right. Uh, and I just, I have so many toys to play with, to build, like stop AI progress today.
0:48:32 Yeah. And Replit will continue to get better for the next five years. Like, there’s so much
0:48:37 we can do just on the app, uh, app layer and the infrastructure layer. So, you know, I, but,
0:48:42 but I think that it will, you know, the, the foundation models will continue to get better as well.
0:48:46 And so it’s, it’s a very exciting time in our industry. Um, the other mind is more academic
0:48:52 because as a kid, I’ve always been interested in the nature of consciousness, nature of intelligence.
0:48:58 I was always interested in AI and reading the literature there. And I would point to the RL,
0:49:02 uh, literature. So Richard Sutton, and there’s another guy, I think a co-founder of DeepMind,
0:49:08 Shane Legg, wrote a, wrote a paper trying to define what AGI is. Um, and I, in there,
0:49:15 I think that the definition of AGI, I think is the, is the original perhaps correct one, which is, uh,
0:49:21 efficient continual learning. Like if you, if you truly want to build an artificial general intelligence
0:49:27 that you can drop in any domain, you can drop in a car without that much prior knowledge about cars.
0:49:33 Right. And within, um, you know, how long does it take a human to, to learn how to drive?
0:49:36 Right. You know, within months to be able to drive a car really well.
0:49:37 Right. You know, and learn.
0:49:41 Generalized skill, sort of generalized skill acquisition, generalized understanding acquisition.
0:49:42 Yeah.
0:49:43 Generalized reasoning acquisition.
0:49:47 And I think that’s the thing that would like truly change the world.
0:49:49 Right. That’s the thing that would give us
0:49:53 a better understanding of, of the human mind, of human consciousness.
0:49:59 And that’s the thing that will like propel us to the next, uh, level of human civilization.
0:50:03 I think so, so on a civilization level, I think that’s, that’s a really deep question.
0:50:04 Yeah.
0:50:08 But separate from the economy and the industry, which is all exciting, but, but there’s
0:50:10 an academic aspect of it that I’m really.
0:50:13 And so what, and what odds, what, if we’re on, if we’re on Kalshi today,
0:50:14 what, what odds do we place on that?
0:50:22 I, I’m kind of bearish on, on a true AGI breakthrough because what we built is so useful
0:50:25 and economically valuable.
0:50:26 Uh, so in a way.
0:50:27 Oh, good enough.
0:50:29 Good enough is the enemy.
0:50:29 Yeah.
0:50:30 Yeah.
0:50:31 Do you remember that essay?
0:50:32 Um.
0:50:32 Worse is better.
0:50:33 Worse is better.
0:50:34 Worse is better.
0:50:34 Worse is better.
0:50:34 Right.
0:50:35 And, and, and.
0:50:36 So there’s like a local, there’s like a trap.
0:50:38 There’s like a local, local maximum trap.
0:50:38 That’s right.
0:50:39 We’re in a local maximum trap.
0:50:43 Local maximum trap where it’s, because it’s, because it’s good enough for so much economically,
0:50:44 productive work.
0:50:45 Yes.
0:50:48 It relieves the pressure, um, in the system to create the generalized answer.
0:50:49 Yes.
0:50:53 And then you have the weirdos like Rich Sutton and others that are still trying to go that,
0:50:54 down that path and maybe they’ll succeed.
0:50:55 Right.
0:50:59 Uh, but there’s enormous optimization energy behind the current thing.
0:50:59 Right.
0:51:01 That we’re hill climbing on this like local maximum.
0:51:02 Right, right, right.
0:51:05 And the irony of it is everybody’s worried about like the, you know, the gazillions of dollars
0:51:06 going into building out all this stuff.
0:51:09 And, and so the, the, the, the most ironic thing in the world would be if the gazillions
0:51:11 of dollars are going into the local maximum.
0:51:11 That’s right.
0:51:15 As, as opposed to a counterfactual world in which they’re going into solving the general
0:51:15 problem.
0:51:15 Yeah.
0:51:17 But, but, but it’s also potentially rational.
0:51:17 Yeah.
0:51:20 Like maybe the general problem is actually, you know, not within our lifetimes.
0:51:21 Right.
0:51:21 Who knows?
0:51:22 Right, right, right.
0:51:26 Um, so how much further do you think, like, do you think we squeeze most of the juice out
0:51:27 of, out of LLMs in general then?
0:51:31 Or are there any other research directions that you’re particularly, um, excited about?
0:51:38 Well, that’s the thing that I think the problem is there aren’t that, that many, I think the,
0:51:42 the, the breakthroughs in RL are incredibly exciting, but we also knew about them now for
0:51:48 like over 10 years where you marry generative, uh, systems with, uh, with tree search and things
0:51:48 like that.
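
To make that "marry generative systems with tree search" pattern concrete, here is a toy, self-contained TypeScript sketch of the general idea only, not any lab's actual method: a stand-in policy proposes candidate actions, and a search loop keeps whichever ones a stand-in verifier scores best. The arithmetic task and every name here (proposeActions, score, beamSearch, TARGET) are invented for the example.

// Toy "generative proposals + tree search + verifier" loop. A stand-in
// policy proposes edits to a number; beam search keeps the states a
// simple verifier scores best, i.e. closest to a target value.
type State = { value: number; trace: string[] };

const TARGET = 42;

// Stand-in for a generative model: propose candidate next actions.
function proposeActions(s: State): Array<[string, number]> {
  return [["+3", s.value + 3], ["*2", s.value * 2], ["-1", s.value - 1]];
}

// Stand-in for a verifier: higher is better, 0 means solved.
const score = (s: State): number => -Math.abs(TARGET - s.value);

function beamSearch(root: State, depth: number, width: number): State {
  let frontier: State[] = [root];
  for (let d = 0; d < depth; d++) {
    const next: State[] = [];
    for (const s of frontier)
      for (const [name, v] of proposeActions(s))
        next.push({ value: v, trace: [...s.trace, name] });
    next.sort((a, b) => score(b) - score(a)); // verifier ranks candidates
    frontier = next.slice(0, width);          // keep only the best branches
  }
  return frontier[0];
}

// Prints the best state found; with enough depth the trace reaches 42.
console.log(beamSearch({ value: 1, trace: [] }, 8, 5));

The property worth noticing is the one the conversation keeps returning to: the verifier, not the generator, decides what survives.
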
0:51:51 Um, but, but there’s a lot more to go there.
0:51:56 And I think, again, the, the original minds behind reinforcement learning are
0:52:00 trying to go down that path and try to kind of bootstrap intelligence from scratch.
0:52:05 Uh, Carmack is, is going down that path. As, as far as I understand Carmack, you guys may
0:52:09 be invested, but the, the, you know, they’re, they’re not trying to go down the LLM path.
0:52:15 So there are people that are trying to do that, but I’m not seeing a lot of progress or outcome
0:52:16 there, but I, I watch it kind of from far.
0:52:19 Although, you know, for all we know, it’s already, there’s already a bot on X somewhere.
0:52:20 What’s that?
0:52:22 Maybe, you know, you know, you never, you never, you never know.
0:52:23 It might not be a big announcement.
0:52:26 It might just be a, you know, one day there’s just like a bot on X that starts winning
0:52:27 all the arguments.
0:52:28 Yeah, it could be.
0:52:34 Or a coder, a user on Reddit, and all of a sudden it’s generating incredible software.
0:52:37 Um, okay. Let’s, uh, let’s spend our remaining minutes. Let’s, let’s, let’s talk, let’s talk
0:52:41 about you. So, uh, so, uh, so how, so yeah, take us and start from the beginning with your,
0:52:44 uh, with your life. And how, how did you get, how did you get from being born and being in
0:52:44 Silicon Valley?
0:52:45 Okay.
0:52:48 In two minutes.
0:52:48 Yeah.
0:52:54 I’m just, I’m joking, but yeah, I, I got introduced to computers, uh, very, very early on.
0:53:00 And so for whatever reason, so I was born in Amman, Jordan, and for whatever reason, my,
0:53:07 my dad, who was just a government engineer at the time, uh, decided that computers were important.
0:53:12 And he didn’t have a lot of money, took out a debt, bought a computer. It was the first computer
0:53:18 in their, in our neighborhood, first computer of anyone I know. And I just, one of my earliest
0:53:24 memories, I was six years old, just watching my dad unpack this machine and sort of open up this huge
0:53:29 manual and kind of finger type, CD, LS, MKDIR. And like, I would, you know,
0:53:35 be behind his shoulder and just like watching him, you know, type these commands and seeing the sort
0:53:40 of machine kind of respond and do exactly what he’s asked it to do. Um,
0:53:51 popping, popping, popping Tylenol. Exactly. Autism activated, of course, you have to, you have to.
0:53:59 Exactly. What kind of, um, what kind of computer was it? Uh, it was, uh, an IBM, as far as I remember.
0:54:04 IBM PC. Yeah, it was IBM PC. So what year was this about? 1993. 1993.
0:54:08 Okay, so it’s, uh, so did it have Windows at that point or? No, it didn’t have Windows.
0:54:11 It was right before Windows. Right before Windows. But I think Windows had been out, but you would,
0:54:16 It was an add-on. It was an add-on. You wouldn’t boot it up. So we, I think we bought the disk for,
0:54:23 uh, for, uh, for Windows and you had to kind of, uh, bootload it, you know, from the disk and, and it
0:54:27 will open Windows and you can click around. It wasn’t that interesting because there wasn’t a lot on it.
0:54:32 So a lot of time I just spent it in DOS and writing batch files and opening games and messing around with that.
0:54:40 Um, but it wasn’t until Visual Basic that I started. So like after Windows 95, that I started making real
0:54:47 software. Right. Uh, and the first idea I had was, um, I, I used to be a huge gamer. So I used, I used to go
0:54:54 to these, uh, LAN gaming cafes and play Counter-Strike and I would go there and, you know, the whole thing
0:54:58 is full of computers, but they don’t use any software to run their business. It was just like people running
0:55:03 around just like writing down that your machine number, how much time you spent on it and how
0:55:07 much did you pay and kind of tapping your shoulders. Like, Hey, you need to pay a little more for that.
0:55:11 And I asked them like, why don’t you like just build a piece of software that allows me to log in and
0:55:15 have a timer or whatever. And they were like, yeah, we don’t know how to do that. And I was like, okay,
0:55:20 I think I know how to do that. So I spent, I was like 12 or something like that. I spent like two years
0:55:25 building that, uh, and then went out and tried to sell it and was able to sell it. Uh, I was making
0:55:32 so much money. I remember McDonald’s opened, uh, in Jordan, uh, around the time when I was 13, 14,
0:55:36 I took my entire class at McDonald’s. It was very expensive, but I was bawling, you know, all this
0:55:45 money and I was showing off. Um, and, uh, and so that was, that was the first, uh, business that I, uh,
0:55:52 created. And then when it came to, and at the time I started kind of learning about AI, you know,
0:55:58 reading sci-fi and all of that stuff. And when it came time to go to college, uh, I didn’t want
0:56:03 to go to computer science because I felt like coding is on its way to get automated. I remember using
0:56:09 these, um, wizards. Do you remember? Yes. Wizards basically. It’s like extremely crude,
0:56:14 early, um, bots that generate code. Yeah. And I remember you could like, you know,
0:56:17 type in a few things like here’s my project, here’s what it does, whatever. And then click,
0:56:21 click, click. And we just like scaffold a lot of code. I was like, oh, I think that’s the future.
0:56:25 Yeah. Like coding is such a. It’s almost solved. Yeah. It’s solved. You know,
0:56:29 why, why should I go into coding? I was like, okay, if AI can do the code, what should I do?
0:56:33 Well, someone needs to build and maintain the computers. And so I went into computer engineering
0:56:39 and, and did that for a while. Uh, but then rediscovered my love for, for programming,
0:56:44 uh, reading, uh, reading Paul Graham essays on Lisp and things like that. And, uh, started messing
0:56:50 around with Scheme and programming languages like that. Um, but then I found it incredibly difficult
0:56:55 to just like learn different programming languages. I didn’t have a laptop at the time. And so every
0:57:02 time I go to like wanting to learn Python or Java, I would go to the computer lab, download gigabytes of
0:57:08 software, try to set it up, type a little bit of code, try to run it, you know, run into missing
0:57:16 DLL issue or, and I was like, man, this is so primitive. Like at the time it was 2008, something
0:57:24 like that. You know, we had, uh, Google docs, we had Gmail. You could like open the browser, uh, and
0:57:29 partly thanks to you and be able to kind of, uh, use software on the internet. And I thought the
0:57:34 web is the ultimate software platform. Like everything should go on the web. Okay. Who’s
0:57:39 building an online development environment? And no one. Right. And it felt like I, I found like
0:57:44 a hundred dollar bill on the, you know, on the floor of Grand Central Station. Like surely someone
0:57:48 should be building this, but no, no one was building this. And so I was like, okay, I’ll, I’ll try to build
0:57:55 it. And I got something done in like a couple hours, uh, which was a text box. You type in some
0:58:00 JavaScript. And there’s a, there’s a button that says eval, you click eval and it evaluates,
0:58:06 it shows you on a, in an alert box, right? So one plus one, two, I was like, oh, I have a programming
0:58:10 environment. I showed it to my friends. People started using it. I added a few additional things
0:58:14 like saving the program. I was like, okay, all right, this is, there’s, there’s a real idea here.
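
The prototype he is describing is small enough to reconstruct. A minimal sketch in browser TypeScript, assuming the same shape he outlines (a text box, an eval button, an alert with the result); this is illustrative, not the original code:

// A text box whose contents are passed to eval(), with the result
// shown in an alert box: typing "1 + 1" and clicking eval alerts "2".
const box = document.createElement("textarea");
const button = document.createElement("button");
button.textContent = "eval";
button.onclick = () => {
  alert(String(eval(box.value))); // eval runs whatever JavaScript was typed
};
document.body.append(box, button);
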
0:58:20 People love it. And then again, it took me two, two or three years to actually be able to build
0:58:25 anything because, you know, the browser can only run JavaScript. And it took a breakthrough at the
0:58:33 time, uh, Mozilla had a research project called Emscripten that allowed you to, uh, compile different,
0:58:39 uh, programming languages like C, C++ into JavaScript. And for the browser to be able to run something
0:58:44 like Python, I needed to compile CPython to JavaScript. So I was the first to do it in the world.
0:58:49 Uh, so built, uh, contributed to that project and built a lot of the scaffolding around it.
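
For context, the Emscripten workflow he is describing looks roughly like this; the add function, file names, and flags below are an invented example (flag spelling varies across Emscripten versions), not Replit's actual build:

// C side (add.c): int add(int a, int b) { return a + b; }
// Compiled to JavaScript with Emscripten, roughly:
//   emcc add.c -o add.js -sEXPORTED_FUNCTIONS=_add -sEXPORTED_RUNTIME_METHODS=ccall
// The generated add.js defines a global Module; once the page loads it,
// the compiled C function is callable from ordinary script:
declare const Module: {
  ccall(name: string, ret: string, argTypes: string[], args: unknown[]): unknown;
};
const sum = Module.ccall("add", "number", ["number", "number"], [2, 3]);
console.log(sum); // 5
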
0:58:57 And we, uh, my friends and I compiled Python into JavaScript and it was like, okay, we did it for
0:59:01 Python. Let’s do it for Ruby. Let’s do it for Lua. So, and that’s how the emergence of the idea for
0:59:07 Replit came is that when you need a REPL, you should get it. You should REPL it. And so a REPL is the most
0:59:12 primitive programming environment possible. So I added all these programming languages. And again, all this time,
0:59:18 my friends were using it and excited about it. And I was on GitHub at the time. And just my standard
0:59:22 thing is like when I make a piece of software is open source it. And so I was open sourcing all the
0:59:26 things I was, you know, years building, just like this underlying infrastructure to be able to just
0:59:34 run code in the browser. And then it went viral, uh, went viral on Hacker News and it coincided with
0:59:40 the MOOC era. So massive open online courses, Udacity was coming online, Coursera and, and
0:59:45 most famously Codecademy. So Codecademy, uh, was the first kind of website that allowed you to
0:59:50 code in the browser interactively and learn how to code. And they built a lot of it on my software
0:59:53 that I was open sourcing all the way from Jordan. And so I remember seeing them on Hacker News and
0:59:58 they were going super viral. I was like, Hey, that’s, you know, I recognize this. What are you using?
1:00:02 And so I left a Hacker News comment. I was like, Oh, you’re using my, my open source package.
1:00:05 And so they reached out to me, they, uh, they’re like, Hey, we’d like to hire you. I was like,
1:00:08 I’m not interested. I want to start a startup. I want to start this thing called Replit.
1:00:14 And, and they’re like, well, no, you know, you should come work with us. We can, we can do the same
1:00:19 stuff. And I kept saying, no, I was like, okay, I’ll contract with you. They were paying me $12 an
1:00:24 hour. I was really excited about it back from Amman. Um, but they came out to their, to their credit.
1:00:29 They came out of Jordan to recruit me and spent a few, a few days there. And then I, you know,
1:00:35 I kept saying no. And then they gave me an offer I can’t refuse. Um, and they got me an O-1 visa,
1:00:38 came to the United States. That’s when you moved. So when, when, when was the first,
1:00:43 cause you were born, what year? 1987. What was the first year that you could remember where you had
1:00:47 the idea that you might not live your life in Jordan and you might, you might actually move to the U.S.?
1:00:50 Uh, when I watched Pirates of Silicon Valley. Is that right? Okay.
1:00:55 I got it. All right. Um, maybe 98 or 99. I don’t know when it came out.
1:00:56 Okay. That might be a good place. Yeah.
1:01:00 Is it worth telling the hacker story? Because there’s a version of the world where you didn’t
1:01:02 actually, like if that changed differently, maybe you wouldn’t have gone to America.
1:01:07 Right. Right. Yeah. So, uh, in, in school I was programming the whole time. You know,
1:01:11 so I, I just want to start businesses. I just like, I’m exploding with ideas all the time.
1:01:15 And like the reason Replet exists is because I have ideas all the time. I just want to go type
1:01:20 it on the computer and like build them. Um, so I wasn’t going to school. It was like incredibly boring
1:01:24 for me. Uh, and part of the reason why Replet has a mobile app today is because I always wanted to
1:01:26 program under the desk. Right.
1:01:33 Yeah. It’s like just to do things. Um, and so the, at school they kept failing me, uh, for attendance.
1:01:38 No. So I would get A’s, but I just didn’t show up. And so they, they would fail me. And so I felt
1:01:44 that it was incredibly unfair and all my friends were graduating. Now this year, this was like 2011,
1:01:49 I’ve been like for six years in college. It should be like a three or four year. And I was just like
1:01:55 incredibly depressed. I really wanted to be in Silicon Valley. And so I was like, oh, what if I
1:02:04 changed my grades? Uh, there we go. The university database. And, um, and, and so I went into my
1:02:13 parents’, uh, basement, uh, and, uh, implemented, uh, the polyphasic sleep. Are you
1:02:18 familiar with that? I, I am. Uh, Leonardo da Vinci’s, uh, polyphasic sleep. I didn’t hear it from
1:02:23 Leonardo da Vinci. I heard it from Seinfeld. Cause, uh, there’s an episode where Kramer goes
1:02:28 on polyphasic sleep. 20 minutes, every four hours. Yes. 20 minutes, every four hours.
1:02:32 Yeah. And yes. And this, this somehow is going to work well in it. Yeah. And, and hacking,
1:02:35 if you’ve ever done anything, as the meme goes, this, this has never worked for anybody else,
1:02:42 but it might work, but it might work for me. Yes. And a lot of what hacking is, is that you’re,
1:02:46 you’re coming up with ideas for like finding certain security holes and like writing a script
1:02:50 and then running that script. And that script will take you like a 20, 30 minutes to run. And so you’ll take
1:02:54 that, you know, 20, 30 minutes to sleep and go on. And so I spent two weeks just going mad,
1:03:01 like trying to hack into the university database. And, uh, finally I found, um, a way I found a SQL
1:03:07 injection somewhere on the site. Uh, and I found a way to like be able to edit the records, but I didn’t
1:03:12 want to risk it. So I went to my neighbor who was going to the same school. Uh, I think till this day,
1:03:16 no one caught him, but I went to him and I said, um, Hey, uh, I have this way to change grades.
1:03:20 Like, would you want to be my guinea pig? And I was honest about it. I was like, I’m not going to do
1:03:25 it. Are you open to do it? He’s like, yeah, yeah, yeah. They call those human trials.
1:03:28 This is how medicine works.
1:03:35 So, so we, we went and, and, uh, we went and changed his grades and he, he went and pulled
1:03:41 his transcript and the, you know, the update wasn’t, wasn’t there and went back to the basement.
1:03:46 Well, turns out that I had access to the, uh, slave database. I didn’t have access to the master
1:03:52 database. So I found a way through the network, privilege escalation. It was an Oracle database
1:03:56 that had a vulnerability and then found the real database. And then I just, you know,
1:04:03 did it for myself, uh, changed the grades and went and pulled my transcripts. And sure enough,
1:04:09 it actually changed, went and bought the, the, the, the gown, went to the graduation parties,
1:04:16 uh, did all that. And we’re graduating. Um, and then one day I’m at home, it’s like maybe
1:04:23 six or seven PM. I get a, you know, the, the telephone at home rings, I’m gonna, I’m gonna,
1:04:24 I’m gonna ring sound.
1:04:30 Yes. Uh, well, um, hello. And he’s like, Hey, this is the university registration system.
1:04:34 And I knew the guy that run it. Uh, it’s like, look, you know, we, we, we’re having this problem.
1:04:39 The system’s been down all day and it keeps coming back to your record.
1:04:43 There’s an anomaly in your record where you’ve both passed. You have a passing grade,
1:04:48 but you’re also banned from the, uh, final exam of the subject. I was like, oh,
1:04:53 shit. Well, it turns out the database is not normalized. So typically that when they ban you
1:04:58 from an exam, the grades resets to 35 out of a hundred, but apparently there’s a Boolean flag.
1:05:02 And by the way, all the column names in the database are single, single letters.
1:05:05 So that was the hardest thing. It’s security by obscurity. Right.
1:05:09 Right. And it turns out there’s a flag that I didn’t check. So when, when, when you go over
1:05:16 attendance, um, uh, when, when you don’t attend and they, they, they want to fail you, they, they ban you
1:05:21 from the final exam. So I changed the grades and that, that, that created, uh, an issue and brought
1:05:27 down the system. So they were calling me and I thought at the time I was like, you know, I could,
1:05:33 I could potentially lie and it’ll, it’ll be a huge issue. Or I just like, I’ll just, I’ll just
1:05:38 fess up. Yeah. So I said, Hey, listen, look, um, yeah, I might know something about it. Hey,
1:05:42 let me, let me come, uh, tomorrow and kind of talk to you about what happened.
1:05:47 So I go in and I opened the door and it’s the deans of all the, uh, all the schools. It’s like
1:05:51 computer science computer. They were all working on it for like days because it’s like, it’s like,
1:05:56 it’s a very computer heavy, you know, university. And it was like a problem. And they’re all kind
1:06:02 of really intrigued about what happened. And so I pull up a whiteboard and started explaining what I
1:06:07 did and, and everyone was engaged. I gave them a lecture, basically your oral exam for your PhD.
1:06:13 Yeah. They were, they were really excited. And, uh, and I think it was endearing to them. I was
1:06:19 like, Oh wow, this is a, this is a very interesting problem. Um, and then I was like, okay, great.
1:06:25 Thank you. And they were like, Hey, wait, wait, we don’t know what to do with you. Do we send you to jail?
1:06:32 And, uh, they were like, Hey, we have to escalate to the university, um, uh, president. And, and he,
1:06:38 he was a great man. And I think, uh, he gave me a second chance in life and I went to him and I,
1:06:42 uh, you know, I, I explained the situation. I said, like, I’m really frustrated. I need to graduate.
1:06:48 I need to get on with my life. I’ve been here for six years and I just can’t sit in, in, in school.
1:06:53 The stuff I already know, I’m a really good programmer. Uh, and, and he gave me a Spider-Man
1:06:57 line at the time. It’s like, with great power comes great responsibility, and you have a great power.
1:07:02 And, and, and it really affected me. And I think he was right at the moment. And, and so he said,
1:07:07 well, we’re going to let you go, but you’re going to have to help the system administrators secure the
1:07:12 system. There we go. Uh, for the summer. I was like, I’m happy to do it. And I show up and all the
1:07:18 programmers there hate me, hate my guts. And, uh, they, they would lock me out. Like I would see them,
1:07:20 they would be outside. I would knock on the door and no one would listen. It was like,
1:07:24 they don’t want to let me in. I tried to help them a little bit. They, they weren’t collaborative.
1:07:30 And so I was like, all right, whatever. Uh, and so it, it, it came time for me to actually
1:07:36 graduate. It was the final project. And one of the computer science deans came to me and he said,
1:07:40 look, I, I need to call in a favor. I was a big part of the reason we kind of let you go and we didn’t
1:07:46 kind of prosecute you. Uh, so I want you to work with me on the, um, on the final project. And it’s
1:07:51 going to be around security and hacking. I was like, no, I’m, I’m done with that. Like, I just want to,
1:07:56 I just want to build programming environments and things like that. Uh, and he’s like, no,
1:07:59 you have to do it. I was like, okay. So I, I thought I’d do something more productive.
1:08:03 So I wrote a security scanner, uh, that I was very proud of, that kind of crawls the different
1:08:09 sites and tries SQL injection and all sorts of things. Um, and actually my security scanner found
1:08:10 another vulnerability in this system. Amazing.
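
For anyone unfamiliar with the bug class in this story: a SQL injection happens when user input is concatenated into the query string, so the input can rewrite the query itself. A minimal sketch in TypeScript with node-postgres; the table and column names are made up for illustration:

import { Client } from "pg";

async function getGrades(client: Client, studentId: string) {
  // Vulnerable shape: input spliced into SQL. An input like
  //   x' OR '1'='1
  // turns the WHERE clause into a tautology and returns every row.
  //   client.query(`SELECT * FROM grades WHERE student_id = '${studentId}'`)

  // Safe shape: a parameterized query keeps input as data, never as SQL.
  return client.query("SELECT * FROM grades WHERE student_id = $1", [studentId]);
}
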
1:08:16 And so I went to the defense and he’s like, you need to run this security scanner live and
1:08:20 show that there’s a vulnerability. And I didn’t understand what was going on at the time, but I
1:08:25 just, okay. So I give the presentation about how the system works and I was like, oh, let’s run it.
1:08:29 And it showed that there’s a security vulnerability. Okay, let’s, let’s try to get a shell. So the
1:08:36 system automatically runs all the security stuff and it gets you, gets you a shell. And then the other dean,
1:08:41 it turned out he was given the mandate to secure the system. And now I started to realize
1:08:48 I’m a pawn in some kind of rivalry here. And, and his, his face turned red and it’s like, no,
1:08:55 it’s impossible. You know, we secure the system. You’re lying. I was like, you know, you’re accusing
1:09:01 me of lying. All right. What should we look up? Should we look up your, um, uh, your salary or your password?
1:09:05 What do you want me to look up? And he was like, yeah, look up my password. So I, I look up his,
1:09:09 his password. Uh, and it was like gibberish. It was encrypted. And he was like, oh, that’s not
1:09:13 my password. See, you’re lying. I was like, well, there’s a decrypt function that the programmers
1:09:18 put in there. So I do decrypt and it shows his password. And it was something embarrassing.
1:09:24 I forgot what it was. And so he gets up, really angry, shakes my hand and leaves to change his
1:09:30 password. Uh, so that I, I was able to hack into the university another time. Luckily I, I was able
1:09:36 to graduate, give them the software that secured the system. But, um, but yeah, later on, I would
1:09:40 realize that, yeah, he wanted to embarrass the other guy, which is why I was in the middle.
1:09:44 Well, I think the moral, the moral of the story is if, if you can successfully hack into
1:09:48 your school system and change your grade, you deserve the grade and you deserve to graduate.
1:09:49 I think so.
1:09:52 And, and, and, and just for any, for any parents out there, just, yeah, or children out there,
1:09:55 you can just, you can cite me as the moral authority, the moral authority in this.
1:10:06 One maybe lesson I think that is very relevant for the AI age. Uh, I think that the traditional
1:10:13 sort of more conformist path is paying less and less dividends. And I think, uh, you know,
1:10:18 kids coming up today should use all the tools available to be able to discover and chart their
1:10:25 own paths. Cause I feel like just, you know, listening to the traditional advice and doing
1:10:30 the same things that people have always done is just not, it’s not working out as much as,
1:10:34 as we’d like. Yeah. Thanks for coming to the podcast. Thank you, man. Fantastic.
1:10:42 Thanks for listening to this episode of the A16Z podcast. If you liked this episode, be sure to like,
1:10:46 comment, subscribe, leave us a rating or review and share it with your friends and family.
1:10:52 For more episodes, go to YouTube, Apple podcasts, and Spotify. Follow us on X,
1:10:58 @a16z and subscribe to our Substack at a16z.substack.com. Thanks again for listening.
1:11:03 And I’ll see you in the next episode. As a reminder, the content here is for informational
1:11:09 purposes only, should not be taken as legal, business, tax, or investment advice, or be used to evaluate any
1:11:15 investment or security and is not directed at any investors or potential investors in any A16Z fund.
1:11:19 Please note that A16Z and its affiliates may also maintain investments in the companies discussed in
1:11:36 this podcast. For more details, including a link to our investments, please see A16Z.com/disclosures.

Amjad Masad, founder and CEO of Replit, joins a16z’s Marc Andreessen and Erik Torenberg to discuss the new world of AI agents, the future of programming, and how software itself is beginning to build software.

They trace the history of computing to the rise of AI agents that can now plan, reason, and code for hours without breaking, and explore how Replit is making it possible for anyone to create complex applications in natural language. Amjad explains how RL unlocked reasoning for modern models, why verification loops changed everything, whether LLMs are hitting diminishing returns — and if “good enough” AI might actually block progress toward true general intelligence.

 

Resources:

Follow Amjad on X: https://x.com/amasad

Follow Marc on X: https://x.com/pmarca

Follow Erik on X: https://x.com/eriktorenberg

 

Stay Updated: 

If you enjoyed this episode, be sure to like, subscribe, and share with your friends!

Find a16z on X: https://x.com/a16z

Find a16z on LinkedIn: https://www.linkedin.com/company/a16z

Listen to the a16z Podcast on Spotify: https://open.spotify.com/show/5bC65RDvs3oxnLyqqvkUYX

Listen to the a16z Podcast on Apple Podcasts: https://podcasts.apple.com/us/podcast/a16z-podcast/id842818711

Follow our host: https://x.com/eriktorenberg

Please note that the content here is for informational purposes only; should NOT be taken as legal, business, tax, or investment advice or be used to evaluate any investment or security; and is not directed at any investors or potential investors in any a16z fund. a16z and its affiliates may maintain investments in the companies discussed. For more details please see a16z.com/disclosures.
