AI transcript
0:00:09 For anyone who’s saying that these models do not reason and they’re just like parrots or whatever, no.
0:00:16 So the average IQ range for adults is between 90 and 109, with 100 being the theoretical average.
0:00:23 So the estimated IQ for O3 is 157. This is like top of the top smartest people in the world level.
0:00:30 Hey, welcome to the Next Way Podcast. I’m Matt Wolf. I’m here with Nathan Lanz.
0:00:36 And there has been a lot happening in the AI world, including an announcement that a lot of people
0:00:42 think is just this monumental, huge, massive announcement in the fact that we got this O3
0:00:49 model from OpenAI. Well, I guess I can’t really say we got it. We got to see it. We got to see
0:00:55 the O3 model from OpenAI. But a lot of people have been, you know, touting it as like this
0:00:59 massive breakthrough. I’ve even seen like YouTube videos give it the title of like,
0:01:06 we finally have AGI. And everybody’s saying that O3 is AGI. So in this episode, we’re going to talk
0:01:12 about that a little bit. We’re going to talk about our predictions for 2025. And Nathan’s going to
0:01:17 show off something pretty cool that he’s been working on using some of these models. So let’s
0:01:24 get into some of the like 2025 predictions. You and I obviously talk a lot. So chances are our
0:01:29 predictions are going to be fairly close to each other. But let’s go ahead and dive in.
0:01:34 Yeah, let’s start on the models, I guess. I think by the end of the year, a few things are going
0:01:40 to have happened. I think xAI is going to roll out Grok 3. And it’s probably, in terms of like a
0:01:44 fast LLM that you can just chat with for daily use, it’s going to be really, really, really good.
0:01:48 The fact that they’re just, they’re trying to train it on more data than almost anyone else.
0:01:53 And they’ve, you know, they’re buying more chips from NVIDIA than almost anyone else.
0:01:56 I think it’s going to be really good for like daily, just like, just chatting with AI.
0:01:58 It’s going to, I think it’s going to surprise people. But in terms of like,
0:02:02 Do you think that’ll replace some advanced voice mode for ChatGPT for you?
0:02:07 I don’t know. I don’t think so. I don’t, I’m not, I’m not sure. Like I think for power users,
0:02:13 I’m pretty convinced ChatGPT and OpenAI are going to be even further ahead by the end of 2025.
0:02:19 Like, I think there’s a decent chance that, because like O3, if it’s just O1 with more
0:02:22 compute thrown at it, well, they’ve definitely been working on other stuff. They’re not just
0:02:26 sitting there like, Hey, let’s take what we’ve created and throw more compute at it. And like,
0:02:31 that’s all, that’s all we’re doing for the year. Like, it makes sense to me that like
0:02:36 something like Orion, like a larger model underneath as the baseline will be part of
0:02:41 some new model by the end of 2025. And I think there’s a decent chance that they actually
0:02:44 release it in 2025, like by the end of 2025.
0:02:49 Well, Sam Altman hinted that he believes AGI is coming in 2025, right? So yeah,
0:02:56 yeah, like I said, like I think O1 Pro is like some kind of basic AGI already. That’s my definition.
0:02:59 Some people think that’s like stupid. Everyone has different definitions of this stuff.
0:03:05 But I think by the time you get to O4, when not only can it like respond quickly to like basic
0:03:10 questions and have like incredibly good answers, but also if you ask it something complicated,
0:03:14 it’ll give you something back in a minute or two that’s amazing. Most people are going to call
0:03:20 that AGI. And I think that could be coming by the end of 2025. And that will make all this stuff
0:03:24 we talked about, like in terms of agents possible, like agents that go off and do market research
0:03:28 for you, come back and bring you some charts and show you what, how you should be approaching a
0:03:32 market and what’s the best strategy. That stuff will be agents. How you interact with email,
0:03:38 do email, like your assistant, that’ll probably be an agent by the end of 2025.
0:03:41 And that’s going to be a dramatic change. That’s going to be like where we’re going to start to
0:03:47 have some real questions, societal questions by the end of 2025. And it’s going to be interesting
0:03:51 to see how that plays out because, you know, I have said in the past, well, Donald Trump’s
0:03:58 administration is going to be very pro-AI. But also there was that recent thing with the port,
0:04:03 what was it, the port unions, the unions for the port workers, where he kind of backed down from them,
0:04:06 saying that we need to protect their jobs and we shouldn’t automate the ports.
0:04:10 And we might see more of that because like there may be such
0:04:17 cultural backlash to some of the job loss that, you know, it may become a pretty tense
0:04:21 conversation by the end of 2025, because I do think we’ll start to actually see dramatic
0:04:25 changes in the job market. Like maybe everyone’s not going to fire people, but you’re going to see
0:04:30 like, oh, I’m not hiring as many engineers anymore. Like there are fewer job openings, fewer
0:04:37 assistants hired, things like this. Yeah, I agree. I do think like the big buzzword of 2025
0:04:42 is going to be agents though. I think like every company is going to be talking about AI agents.
0:04:47 I think you’re going to be hearing Google, OpenAI, Claude, X, you know, you name it,
0:04:53 they’re all going to be talking about agentic workflows and tool use and being able to like
0:04:58 do things on your computer, on your behalf. And like, I think that’s going to be the conversation
0:05:03 of 2025. I think if there’s any one prediction that I’m fairly certain on, it’s that. I also
0:05:06 think the video models, I don’t think we’ve seen anything yet. I think, you know, eight-second
0:05:12 videos that look pretty decent. I don’t think we’ve seen anything yet. I think 2025 is going to be
0:05:16 wild with what we see with AI video. I mean, Matt, like, let’s just try to step back for a
0:05:19 second. Like, realize, you know, we’ve been talking about this stuff for like over two
0:05:25 years now. Like, you know, in 2023, the videos looked horrible. I was sharing
0:05:29 threads about them just because they looked cool because of the technology underneath them. I was
0:05:34 like, this is amazing that you’ve asked AI to dream up a video of this scene. That’s
0:05:40 incredible. My co-founder for one of my previous startups, JR Bedard, unfortunately, he passed
0:05:44 away, but he used to be fascinated with this stuff. He would like, he would like read books about
0:05:47 like how AI was going to be able to generate arc and he would show me all this stuff. This was
0:05:52 maybe like 10 years ago. Right. Oh, that’s cool. It can make a little pattern and stuff.
0:05:55 Yeah. Yeah. Yeah. And it’s like, that’s where it was 10 years ago. And it’s like, that was like
0:05:59 the seeds of what we have now, but it was like, oh, it can make a little, a cool pattern.
0:06:02 And I was like, oh, that’s cool. And I was like, do you think that’s going to be like a real thing?
0:06:06 Like in the next like five or 10 years, and it was like, you know, maybe in like 10 years,
0:06:11 and it’s about where it is now. And the videos looked like they were cool if you understood
0:06:16 the technology, but they looked horrible two years ago. And now are they perfect to know? But
0:06:21 like, man, they already look like like 90, 95% of what you might see in a film, like at least
0:06:27 like a short scene or something. Yeah. And, and it feels like too, that, you know, I think I may
0:06:32 have said this offhand the other day, but I believe that where the video models are right now, they’re
0:06:37 kind of like in GPT-4 level in terms of they’re just trained on massive amounts of data, but they’re
0:06:43 not doing a whole lot in terms of reasoning about that data, about that output. And so once you
0:06:47 start to, and this is why I think Sora is probably still going to be very good.
0:06:52 Because if OpenAI is that good with their internal models, as soon as they apply that to video as
0:06:57 well, like reasoning on top of it, you’re going to get so much more control of the output of these
0:07:01 videos. You’ll be able to like, yeah, I wanted the character to turn that way. I wanted to do this,
0:07:04 and also make sure that the characters are consistent, because then you can reason upon
0:07:08 the output and make sure that it’s consistent. That quality is going to go up dramatically.
0:07:14 Well, I mean, we’ve got AI large language models now that will basically generate an output,
0:07:20 double check its output before sending it to you. When you look at like AI art generation,
0:07:28 the big way of doing AI art early on was GANs, Generative Adversarial Networks, where it would
0:07:35 essentially, you had a discriminator and a generator, and the generator would generate a
0:07:42 piece of art, and then the discriminator would go, you would give it a prompt like draw me a cat,
0:07:45 and the generator would draw a cat, and the discriminator would go, that didn’t look like
0:07:48 a cat, try again, and it would draw another cat, and the discriminator would go, that doesn’t look
0:07:54 like, and it would go back and forth until the discriminator couldn’t tell the difference between
0:08:00 a real photo of a cat and an image of a cat. We haven’t seen that kind of tech inside of video
0:08:07 yet, so I think we’re going to have something equivalent to like a GAN, or like what we’re
0:08:12 seeing with like O3, which is kind of the same idea, right? It does a generation, double checks
0:08:16 itself, goes, all right, there are things that need to be fixed, fixes them, double checks
0:08:21 itself again, and then finally presents the output once it’s confident in the output that it gave you,
0:08:26 right? Very, very similar conceptually. Well, I think we’re going to get to a point where
0:08:32 we do that with video models, you give it a prompt of like a unicorn flying in space with a cat
0:08:37 riding on its back, it will generate that, it will watch the video back itself and go,
0:08:40 that doesn’t look right, generate it again, that doesn’t look right, generate it again,
0:08:46 and eventually it spits out a video that looks exactly what you had in your mind of what a
0:08:51 unicorn flying through space with a cat riding on its back would look like, right? Like, I think
0:09:01 we are still at the like Midjourney v2 level of AI video, right? And if you remember, if you
0:09:07 followed along with like AI art, Midjourney from, I think it was, v3 to v4 was this massive leap,
0:09:11 right? Once v4 came out, that’s when people started fooling people on Facebook with like
0:09:16 the pope and like a big old puffer jacket and stuff like that. And it was this giant leap where
0:09:22 all of a sudden people no longer could tell what was AI generated and what wasn’t. I think we have
0:09:28 not seen that leap with video yet. And I think in 2025, we will see that leap where it gets so good
0:09:34 that, you know, people won’t even see it coming. And the pace at which this has happened
0:09:39 is so mind-blowing to me, because like you were mentioning, very beginning of 2023, we were
0:09:45 looking at tools like ModelScope and ZeroScope, and they were generating like two-second clips,
0:09:51 and they were like 280p resolution, right? Like, they were like 280p.
0:09:56 -The object is kind of jittering around. -Yeah, and like you can barely tell what was
0:10:01 actually happening in the video. And that was like 23 months ago,
0:10:07 and now we’re getting what we get out of Veo. Like, holy crap, like that was a massive leap in two years.
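The GAN loop described a minute earlier (generator draws, discriminator judges, repeat until the discriminator can't tell real from fake) can be shown as a toy sketch. This is a minimal one-dimensional illustration, matching a simple Gaussian rather than generating cat images; the model shapes and hyperparameters are made up for demonstration, not any production GAN.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

# Generator g(z) = a*z + b tries to mimic "real" data ~ N(4, 1).
# Discriminator d(x) = sigmoid(w*x + c) tries to tell real from fake.
a, b = 0.5, 0.0   # generator parameters
w, c = 0.1, 0.0   # discriminator parameters
lr, batch = 0.05, 64

for _ in range(4000):
    x_real = rng.normal(4.0, 1.0, batch)
    z = rng.normal(0.0, 1.0, batch)
    x_fake = a * z + b

    # Discriminator step: push d(real) toward 1 and d(fake) toward 0.
    d_r = sigmoid(w * x_real + c)
    d_f = sigmoid(w * x_fake + c)
    w -= lr * (np.mean(-(1 - d_r) * x_real) + np.mean(d_f * x_fake))
    c -= lr * (np.mean(-(1 - d_r)) + np.mean(d_f))

    # Generator step: push d(fake) toward 1 (non-saturating GAN loss).
    d_f = sigmoid(w * x_fake + c)
    a -= lr * np.mean(-(1 - d_f) * w * z)
    b -= lr * np.mean(-(1 - d_f) * w)

# b is the mean of the generator's output; it should drift toward 4,
# the mean of the real data, as the two networks push on each other.
print(b)
```

The same back-and-forth is what the transcript describes with cats: the generator keeps redrawing until the discriminator can no longer tell the fake from the real.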
0:10:12 Yeah, and then the crazy thing is if you look at like public perception, people are like,
0:10:19 “Yeah, Sora sucks.” And like for me, it’s just like, “Oh my god, I was, you know,
0:10:25 one of the first people like tweeting a lot about AI video on X, and I remember the comments of,
0:10:29 “Oh, but it’ll be like 10 years before you get anything that’s even close to like a cinematic shot.”
0:10:32 -No, we’re there. -No, we’re there.
0:10:36 Yeah, if you just need like a two or three second like B-roll shot for something,
0:10:40 it might take a few re-rolls, but you can get it right now, you know?
0:10:44 Yeah, and the fact that there’s no like kind of reasoning model on top of that, like really
0:10:49 judging the output, and like as soon as you apply that, like you said, it’s going to get way better.
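That idea of a reasoning model judging the output and re-rolling until it passes can be sketched as a simple loop. `generate_video` and `looks_right` below are hypothetical stand-ins (a real system would call a video model and a reasoning model); the quality scores are fabricated so the loop is runnable.

```python
import random

# Fixed seed so the fake quality scores are deterministic.
_rng = random.Random(42)

def generate_video(prompt, feedback=None):
    # Stand-in: pretend quality improves once the critic's feedback
    # is fed back into the generation.
    base = 0.4 if feedback is None else 0.75
    return {"prompt": prompt, "quality": base + _rng.random() * 0.3}

def looks_right(video, threshold=0.8):
    # Stand-in critic: score the output, return a pass/fail plus feedback.
    ok = video["quality"] >= threshold
    feedback = None if ok else "characters inconsistent, re-roll"
    return ok, feedback

def generate_until_good(prompt, max_rolls=10):
    video, feedback = None, None
    for attempt in range(1, max_rolls + 1):
        video = generate_video(prompt, feedback)
        ok, feedback = looks_right(video)
        if ok:
            return video, attempt
    return video, max_rolls

video, attempts = generate_until_good(
    "a unicorn flying in space with a cat riding on its back")
print(attempts)
```

Conceptually the shape is the same whether the critic is a GAN discriminator or a reasoning model double-checking its own output before presenting it.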
0:10:53 Now, one thing that with all of this is going to be interesting is the cost.
0:10:58 Like, okay, it can get way better when you reason about the output, like how much does that cost?
0:11:03 And we’ve already seen like with like, you know, okay, with AGI, they eventually want to make it
0:11:08 where everyone has access to it. But it seems likely that like people who have money are going
0:11:14 to get access to that way sooner, just because of the cost to run all of this. Like Pro is $200.
0:11:20 I think that the Chief Financial Officer of OpenAI, she said, “We’re open to experimenting with
0:11:25 things like $2,000 a month.” And I think you probably will see that, right? You’ll see like,
0:11:29 okay, there’ll always be a model that everyone can use. It’s like around like a 20 bucks a month
0:11:33 or whatever. You might have something slightly better, you know, or like, you know, significantly
0:11:39 better for like 200 bucks a month. And if you’re a professional, you might be like paying two grand.
0:11:43 Yeah, well, I mean, most of them need to have free models, right? You can use Claude for free,
0:11:48 I believe, right? You can use ChatGPT for free. You can use Grok for free now. That’s brand new.
0:11:54 Grok only became free like two weeks ago or something. Like Gemini, I’m pretty sure, yeah,
0:12:00 they have Gemini and Gemini Advanced. The non-Advanced Gemini is free, right? I think you could even
0:12:06 test the Gemini 2.0 right now for free inside of their AI studio, right? So there’s always going
0:12:10 to be like that free plan that’s like, anybody can access it, but it’s not going to quite be the
0:12:16 state-of-the-art model. And the more state-of-the-art, the further you go up that sort of tier of how
0:12:22 state-of-the-art it is, the more expensive it’s going to be. Yeah, I mean, so from using O1 Pro,
0:12:27 which in my opinion right now is the best by far, it takes a long time to give you answers back.
0:12:33 And so something I’ve been realizing is, while you’re waiting, like, what do you do? Because
0:12:37 sometimes it’s like taking like three to five minutes and it’s like, okay, you know, I go, what
0:12:40 do I do? It’s like back in the day when you would code, and something would take a
0:12:45 long time to compile, and you’re like, okay, what do I do while that’s compiling? I go and check
0:12:47 something. I mean, I still run into that when I’m trying to render my videos, right? It takes
0:12:52 10 minutes to render out a video that I make for YouTube or something. And I’m like, well,
0:12:58 my GPU is in use right now. So like, I can’t really do a whole lot with my computer. So I go
0:13:01 walk away and do something else, you know, right? Well, but in this case, you’re using their GPU,
0:13:07 so you can still use your computer. That’s true. And so, you know, it slightly reminds me of like,
0:13:11 so when I was younger, I think I told you this before, like probably 15, I made money playing
0:13:14 EverQuest. And there was a lot of people who made money playing EverQuest back in the day.
0:13:19 I was as far as I know, I was like a top 10 player on the game. I was like, I was a well-known
0:13:24 necromancer, dark elf necromancer on the game. And one of the strategies we would employ is, you
0:13:29 know, like if you went with a party in the game, and you were with a bunch of other people, if
0:13:33 something good dropped, you had to roll on it. And if you were doing it to make money,
0:13:37 that sucked to have to share it with those people. And you might get unlucky and you might not get
0:13:43 this stuff. So what became very common was to multi box, which was basically where you would
0:13:48 be running multiple computers, multiple accounts, and then you get a huge financial benefit from
0:13:52 doing so. And you would also make alliances with other people who were multi-boxing,
0:13:58 right? So like maybe if it was like, okay, it’s hard to run six computers, but like the, you know,
0:14:02 if we could put five people, maybe one person runs three, one runs two, and then that’s a better
0:14:08 split than six random people that I’m sharing with. And so we would do that. And you would make
0:14:12 a lot more money from doing that. And so I do wonder like, okay, in this new world, where
0:14:18 if you can spend more money, you get way better outputs, but you’re still tied to like the one
0:14:24 account, will we see like some really clever people start multi-accounting AI?
0:14:29 Yeah, yeah. Well, you know what you’re describing here too, it is also, and I don’t want to get
0:14:34 too in the weeds on this, but it also feels like a fairly decent use for like blockchain tech,
0:14:40 right? Because you can distribute the compute, right? You can have other people like you ask
0:14:45 O3 a question and it goes off and it uses somebody else’s account and they get sort of tokens back
0:14:50 for allowing you to use the account, kind of thing. I can see something like that where it’s almost
0:14:56 like this distributed network of you’re using other people’s O3 accounts to generate what you
0:15:00 want to generate, because they’re not currently using their account. And whenever their account is
0:15:07 inactive, you use it and you pay some sort of cryptocurrency to use it and that’s their form
0:15:12 of mining, you know? Yeah, that sounds good conceptually. I’ve worked with blockchain stuff
0:15:20 in the past. It’s often harder than it sounds, but I guess I’ll ask O3 or O4 for advice on
0:15:25 whether that’s the best way to go and we’ll… Yeah, maybe AI will develop that software for us instead
0:15:31 of a human actually developing it. Do you think O3 is AGI? Like from my definition of AGI, which I
0:15:36 guess is like, can do most things humans could do. If you didn’t put all this stuff into the models
0:15:41 where they tell you that they’re AI, I’m not a human. If you didn’t have any of that in there,
0:15:47 I think you could trick most people into believing that this was a human doing stuff for you. I
0:15:54 really do. And I already thought like 01 Pro was close to that. So like based on the benchmarks
0:15:59 they’re showing for O3, where it passed the ARC AGI benchmark, where it got like an
0:16:08 87.5 and the average human taking it gets like an 85. That’s for me pretty close to AGI.
0:16:13 I could call that the beginning, like a first version of AGI to me.
0:16:17 Yeah. Well, I looked it up on a livestream earlier today where we were talking about O3
0:16:23 and that ARC AGI benchmark; the average human gets about a 76% on it.
0:16:31 Oh, really? Okay. And so if we look at the O3 model that just came out, the O3 low model,
0:16:36 what do they call it? There’s like a smaller model that used less training and then like a
0:16:43 larger model. The smaller model got a 76%. So the smaller model hit the average that
0:16:49 a normal human would hit. The larger model, the one that used the insane amounts of compute,
0:16:55 got like an 86% or something like that. So smarter than the average human.
0:17:01 Yeah, which is wild. This is it looking at complicated logical puzzles and things like
0:17:05 that and solving them. For anyone who’s saying that these models do not reason and they’re just
0:17:12 like parrots or whatever, no. There is some reasoning going on here and this is the breakthrough.
0:17:16 Like this is the fact that you now have like the large language model, but you’ve got this compute,
0:17:21 you know, test-time compute, whatever they’re calling it. Like that’s the big unlock where
0:17:25 like you combine the two. It’s like the two sides of a brain. Well, I just pulled up this chart.
0:17:33 This is from the ARC prize website here about OpenAI’s O3 model. So their low model scored about
0:17:38 what an average human would score. But one thing that I find super interesting about this is the
0:17:44 cost per task down here, right? So it’s like this low model, you know, this is a logarithmic scale,
0:17:49 right? So it goes from $1 to $10 to $100 to $1,000. So if you were to like chart this,
0:17:55 it would have like a sort of curving up slope, right? Yeah. So this isn’t like
0:18:04 $11 for 76%. This is like, this is probably something like $30 per task right here, if I
0:18:11 had to guess, right? And if you look at the O3 high model, this is $1,000. If there was another line
0:18:16 drawn right here on the chart, that would be at $10,000 following the, you know, the same sort of
0:18:24 logarithmic scale. So this 88% is actually halfway between $1,000 and $10,000. So in the range of
0:18:32 somewhere around $5,000 per task to be able to get that O3 high model actually working,
0:18:38 yeah, very, very, very expensive right now to be able to use this model as of right now. I mean,
0:18:43 like my understanding, like the amazing thing about this is at least from like what people are
0:18:48 like kind of, you know, inferring from like tweets from people from OpenAI is that O3,
0:18:54 which they skipped O2 because of some trademark stuff, apparently. Well, there’s the O2 Arena in London,
0:18:58 right? There’s a big company called O2 that owns the licensing for that arena, and yeah. Yeah.
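One note on reading that log-scale cost chart from a couple of minutes earlier: a point plotted visually halfway between the $1,000 and $10,000 gridlines sits at the geometric mean, roughly $3,200 per task, rather than the arithmetic midpoint of $5,500, so the "around $5,000" eyeball estimate is in the right ballpark but a bit high.

```python
import math

lo, hi = 1_000, 10_000

# On a logarithmic axis, the point drawn halfway between two gridlines
# is their geometric mean, not their arithmetic mean.
log_mid = math.sqrt(lo * hi)   # 10**3.5, about 3162
arith_mid = (lo + hi) / 2      # 5500

print(round(log_mid, 2), arith_mid)
```

Either way, the headline conclusion stands: the high-compute O3 run costs thousands of dollars per task.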
0:19:05 So it seems that O3 is literally just O1 with more compute thrown at it. And if that’s the case,
0:19:08 I mean, yeah, that’s what you would expect. The cost to go up dramatically and you get,
0:19:11 it’s amazing that you can just get more performance from it though. Like you do get
0:19:17 more intelligence, the more compute you throw at it, which means that we are now in a world where
0:19:21 like all the stuff we’ve talked about in the past, we’re like, okay, race for more compute, race for
0:19:28 more energy. Yeah. Those things are majorly in play now because we’ve now proven that you probably
0:19:35 can get to beyond AGI just with more compute. That’s been proven now. And so that’s, that’s a
0:19:40 big revelation like, okay, we thought that might be the case. It appears to be the case now. It is
0:19:46 impressive. Like, it shows, so here’s a benchmark in competition math, which is like
0:19:50 the kind of math that like we have no idea how to do. This is like people who have
0:19:54 PhDs in math solving incredibly challenging problems. You know, this is not just like basic algebra.
0:20:01 And so like that, it’s, you know, it’s frontier math, where GPT-4o scored like a 13.4, but like
0:20:08 O3 got a 96.7. So I mean, this is an incredibly high score. This is like top of the top smartest
0:20:14 people in the world level. And I found this fascinating, which this is, I think this is,
0:20:18 I’m not sure if this is like official, like who came up with these IQ rankings for the AI,
0:20:25 but some people ran some numbers and it was shown that like, you know, GPT-4o scored like an estimated
0:20:31 115 IQ, which is considered like a somewhat smart person, not like a super smart, like a
0:20:37 relatively smart person. 130 or over is considered smart, you know, maybe 140-plus is like
0:20:43 genius. Einstein’s estimated to be like around 160 IQ. So the average IQ range for adults
0:20:51 is between 90 and 109, with 100 being the theoretical average. So 100 is about what you’d
0:20:59 expect average to be. Yeah, yeah. Yeah, in America, average is about 100. So the estimate for O3 is
0:21:05 157. And like I said, you know, it’s estimated, and, you know, IQ, it’s highly debated, like
0:21:10 how useful it is. But, you know, Einstein was estimated to be 160. And so this is like pulling
0:21:16 in like close to 157. So that means we’re possibly, you know, in the ballpark where like this will
0:21:22 actually be helping us solve frontier math problems, like problems in physics and just basic understanding
0:21:26 of the universe that we live in. This may be approaching that; like by the time we get to O4,
0:21:33 maybe that is already able to do that. Also, this appears to still not be a larger model that they’ve
0:21:39 trained. It still seems O1 and O3 are not dramatically larger models. They’ve
0:21:44 just come up with this new paradigm of like applying test-time compute on top of the
0:21:50 existing models. So if that’s the case, well, then maybe what was called Orion or GPT-5 or whatever,
0:21:58 maybe that’s still in the works. And so maybe by the end of next year, not only will we possibly
0:22:03 get O3, or maybe it’ll be like a slightly less compute-intensive O3. So maybe it’s like three
0:22:09 times better than O1 Pro, you know, or something. Just to clarify, what you’re saying is like,
0:22:17 they’re still using like a GPT-4 level model, but they’re just throwing more compute during the sort
0:22:22 of inference phase, right? It’s already, it’s the same sort of trained model that they trained with
0:22:27 maybe GPT-4, maybe it’s some sort of like 4.5 or something, right? Like maybe it’s a slightly
0:22:33 newer model. But the inference, when you actually give it a prompt and ask a question, it’s throwing
0:22:38 a ton of compute at it to like process the question and then double check itself and then
0:22:43 double check itself and then double check itself until it’s fairly certain that it gave like the
0:22:49 proper output that you’re looking for, right? But it’s still essentially using like a GPT-4,
0:22:54 maybe a slightly better model. Yeah, yeah, that’s, that’s what people believe. And that kind of,
0:22:58 that makes sense to me, honestly. Like if they have discovered this new paradigm of like,
0:23:02 yeah, not only do you get the result from the LLM, but then you’d like reason on top of that,
0:23:07 you could see, you know, it appears you can get like major improvements just from that.
0:23:12 And if that’s the case, it could be that, yes, they do have a, the model that they’ve been training
0:23:16 for a while that’s on a, you know, has massively more amounts of data that’s been trained upon,
0:23:21 that could still be in the works. And so when Sam Altman is saying that we’re going to see
0:23:28 huge increases in 2025, that could be what it is. They still have the new model coming. And then
0:23:34 they’re just going to slap this new reasoning model on top of that. Yeah, yeah, it’s like GPT-5
0:23:40 combined with like the O3 reasoning model. And like all of a sudden O3 gets even further
0:23:45 up the benchmarks and does even better. And the only thing that changed was the underlying model
0:23:49 before the reasoning portion happened. Yeah, because the baseline that it’s
0:23:55 reasoning upon is way higher, right? Right, right, right, right. So you could in theory see dramatic increases
0:23:59 beyond what we’re seeing in these benchmarks. I mean, so you might see like, okay, yeah,
0:24:06 we get like a 200 IQ plus model by the end of next year, which obviously dramatically changes
0:24:11 things because like, you know, I’ve been testing O1 Pro and people are kind of now saying that maybe
0:24:17 O1 Pro is basically in between O1 and O3. It’s like basically like when they were
0:24:21 testing O1. It’s the O2 that we’re not allowed to talk about. Yeah, it’s basically
0:24:26 like when you take O1 and put more compute towards it, what happens; it’s some first version of that
0:24:32 test. Right, right. And it seems like that, like I think they showed, I forgot where the benchmark
0:24:37 was, but I think somebody did show that like there was a major difference between O1 and O1 Pro
0:24:43 on that ARC AGI test. And I’ve been seeing that too, like if you throw a lot
0:24:50 of code at O1 Pro and tell it to reason about that code and to help rearchitect that code,
0:24:57 it can do it where O1 fails. Because one of the things that I keep on hearing come up is like,
0:25:02 it’s a huge advancement and it’s really cool, but what are the actual practical use cases? And I
0:25:08 think the most obvious practical use case is probably coding, right? I think that’s going to be,
0:25:13 like as far as like a general consumer, the most actionable use case. But I do think
0:25:19 there’s a lot of other use cases for something like O3 beyond that. They’re just not necessarily
0:25:23 use cases that the average general consumer is going to care that much about, right? That they’re
0:25:29 not going to try to tackle themselves using O3. Things like, you know, trying to solve climate
0:25:36 problems or, you know, trying to cure diseases or things like that. But I just wanted to hear,
0:25:41 like from your perspective, like outside of coding, let’s just set coding aside, outside of coding,
0:25:48 what do you see like O3 really helping the world accomplish? I think that when you start
0:25:54 having agents powered by O3, that’s when you’ll see like real, like regular people getting benefits
0:25:59 from these reasoning models. I do think they’re like, okay, if Orion comes out and it is what’s
0:26:04 powering like an O4 or whatever, maybe there is like a standalone non-reasoning model too,
0:26:09 or they have some kind of smart router that knows, okay, I can respond super fast to this basic
0:26:15 query. And if it’s a hard question, let’s spend some time thinking about it. I hope that’s coming.
0:26:19 Because right now, like the O3, like the O1 Pro, like the average person, if they used it,
0:26:24 and they’re asking it like how to bake something, you know, or whatever they’re asking it,
0:26:28 like they’re going to have a bad experience. Like it’s going to be like, oh, this is like,
0:26:32 you can get just as good of a response out of GPT-4, basically. Yeah, or like maybe,
0:26:35 maybe O1 Pro would be like slightly better, but like it would take so much longer that
0:26:40 the perception would be that it was not good, for sure. And so I think that the magic’s
0:26:45 going to be when it starts powering the agents, you know, because then you can be like, okay, I hate
0:26:50 dealing with email, let’s just have the agent handle all my email. I don’t actually want to do
0:26:55 email anymore. That thought just makes me laugh because it’s like, what happens when everybody’s
0:27:00 doing that? It’s just like, nobody’s writing or checking their emails anymore, it’s just agents
0:27:06 doing it all for them. I think that’s where we’re heading though. Like, you know, it’ll be super
0:27:12 rare that you get like a handwritten email from somebody, you know. It’s like, not a handwritten
0:27:17 letter, but like a handwritten email. Right, right. A hand typed email. A hand typed, yeah,
0:27:21 hand typed email will be a rare thing, probably in like two or three years. I think that’s where
0:27:24 we’re headed because like everybody hates email. There’s all these things that we do in our daily
0:27:30 lives that like almost everybody hates. And there haven’t been major improvements to these systems.
0:27:35 So I think the systems will continue to exist, but we’ll start to have agents actually do those
0:27:40 things for us that we don’t want to do. And so I think that’ll probably, I think in 2025, we will
0:27:45 see O3 come out and it will be what makes agents really work. And that’s when people see a change
0:27:50 in their daily lives. Once we have robots and like, reasoning models are going to be fundamental to
0:27:54 having robots that can actually do things in your house that you feel safe about.
0:28:00 Yeah. And where I think like the sort of like O1 and O3 are really powerful for agents is the
0:28:04 fact that they’ll essentially double check themselves, right? Like going back to the coding
0:28:10 thing, right? It can write code for you and then look at your code and look to see if it made any
0:28:14 mistakes and then fix the code and then double check to see if it made any mistakes and then go,
0:28:18 okay, I’m pretty sure the code’s right now and give it back to you. Right. So the fact that it’s
0:28:24 sort of like constantly double checking its work is one of the biggest sort of jumps that I think
0:28:29 this O1 and O3 are creating is that, you know, the reason it’s taking so long is it’s almost
0:28:33 like kind of double checking its work and then triple checking and then quadruple checking and
0:28:37 the ones that are using insane amounts of compute, it's like, all right, we just double-checked this
0:28:42 a hundred times. We’re sure this is the right answer, you know? Yeah. It’s like multiple AIs talking
0:28:46 to one another and then like finding the best, you know, and double checking everything to make
0:28:51 sure it’s all accurate. Like it hallucinates way less. Like I’ve been playing around. I’ll show
0:28:56 you something later, like a very basic game demo. Don't expect much; this is not a real Unreal Engine 5 game
0:29:00 or anything like that. It’s very, very basic. But I’ve been playing around with creating a game
0:29:05 demo just because I got so excited when I tested O1 Pro and realized how good it was at like actually
0:29:09 architecting systems with code. But you find like it’s amazing at some things, but there’s still
0:29:14 some things that Claude's good at and there's some things that even like Gemini 2.0 is good at.
0:29:19 Right. But I do find that O1 Pro is the best at architecting hard systems, like actually like
0:29:24 architecting the code, making major changes or refactors to the code and in terms of like
0:29:29 not hallucinating about helping you solve something. The more and more I tested Claude,
0:29:33 like it’d be very good at some things. Like some things like in terms of like modifying my UI or
0:29:37 stuff like that, Claude was the best for some reason. But also it would just hallucinate
0:29:41 really weird things. Like I would show it a bunch of code and say, "Hey, what do you think is
0:29:46 the problem with this? I've got a bug. Here's the bug." And it was like, "Oh, you need to change this."
0:29:51 And then I look at my code and that's already what my code is. It's like, you need to
0:29:55 change this to this. And I look at my code and it's like, "What you're telling me to change it to
0:30:00 is my code." Yeah, I've seen that in cursor before. I've seen that because I have my cursor set up
0:30:06 for Claude. It was Sonnet, right? I have it set up for Sonnet. And I've seen that before where it's
0:30:11 like, “Oh, I spotted the problem. You need to change this to this.” And I’m like, “They’re the same thing.
0:30:16 You didn’t change anything.” I know. And it’s like, so that’s a crazy hallucination where it’s
0:30:19 literally like what you just pasted to it. It imagined that what you pasted was something else
0:30:26 and that what it gave back was a solution. Yeah, yeah. And I have not had O1 Pro ever do that.
0:30:33 So that's a major change, where like O1 Pro will likely catch it. It might still hallucinate,
0:30:37 but then it catches its hallucination before getting back to you, right? Because there’s a
0:30:41 lot of stuff that’s going on with the O1 model behind the scenes, right? That they’re not showing
0:30:47 you the entire thinking. So it's very possible that it is hallucinating, then double-checking
0:30:53 what it just responded with, noticing it hallucinated, and then fixing itself before responding to you.
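The hallucinate-then-catch-it behavior being described is basically a draft, verify, revise loop. Here's a minimal sketch of that control flow; `call_model` is a hypothetical stub standing in for a real LLM API, wired to produce one bad draft and then correct it so the loop can run on its own.

```python
# Sketch of a draft -> verify -> revise loop, as described above.
# call_model is a hypothetical stub standing in for a real LLM call.

def call_model(prompt: str) -> str:
    if prompt.startswith("DRAFT"):
        return "2 + 2 = 5"  # deliberate first-pass hallucination
    if prompt.startswith("CHECK"):
        answer = prompt.split(":", 1)[1]
        return "OK" if "= 4" in answer else "WRONG"
    if prompt.startswith("REVISE"):
        return "2 + 2 = 4"  # the stub's corrected answer
    raise ValueError(f"unknown prompt: {prompt}")

def answer_with_self_check(question: str, max_rounds: int = 3) -> str:
    """Draft an answer, verify it, and revise until the check passes."""
    draft = call_model(f"DRAFT:{question}")
    for _ in range(max_rounds):
        if call_model(f"CHECK:{draft}") == "OK":
            return draft  # verified before the user ever sees it
        draft = call_model(f"REVISE:{draft}")  # caught its own mistake
    return draft  # give up after max_rounds and return the last draft

print(answer_with_self_check("What is 2 + 2?"))  # prints: 2 + 2 = 4
```

The user only ever sees the final, verified draft, which is why the visible latency goes up while the hallucination rate goes down.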
0:30:58 Yeah. Yeah. So that’s going to make these models way more reliable. Like anything that needs to be
0:31:02 reliable, they’re going to get reliable. Like because probably the more compute you throw at it,
0:31:06 the more reliable it will get. The more times it thinks about the answer before it gives it to you,
0:31:13 maybe it's not going to be 100%, but we'll probably approach like 99.9% accuracy
0:31:16 with these models probably the next year with reasoning. So I guess the big question is like,
0:31:21 okay, for 2025, like with like models, like all the models are getting really good. And like now
0:31:25 Google’s like teasing, they’ve got like a reasoning model and they’re kind of making it sound like
0:31:30 it’s not a big deal. And they’re even saying that they’re probably going to beat OpenAI in terms of
0:31:36 coding next year. I do wonder if that’s true or if like Google’s overestimating or underestimating
0:31:41 what OpenAI has accomplished. So like OpenAI is saying like, it’s not as simple as you think,
0:31:48 like we’ve discovered some things. The thing about OpenAI though, is that I do feel like a lot of
0:31:53 the other companies could potentially catch up fairly quickly, just due to the amount of brain
0:31:58 drain they have at that company, right? Like how many people are helping develop this stuff
0:32:02 and then walking away from OpenAI like pretty close to the time they actually announce it,
0:32:07 right? We’re seeing it constantly. Yeah, maybe, but I mean, I don’t know. That could just be a
0:32:11 narrative too though, because like that’s always happened in Silicon Valley. Like talent moves
0:32:15 around because like, if you’re at a company for a while, what happens is you get some stock there
0:32:19 and the people at other companies just offer you ridiculous amounts of money. And so then it makes
0:32:25 sense just to like keep your stock, go get more stock. This has always happened in Silicon Valley.
0:32:29 Like people usually move around every two years. Like people rarely stay at a company more than
0:32:34 two years in Silicon Valley. I do wonder, like that could be the case. You could be right,
0:32:37 but it also feels to me like maybe it’s something that’s kind of been exaggerated,
0:32:41 just because it’s great for tweets and things like that. And it’s just like more of what’s
0:32:46 always happened in Silicon Valley. And there’s constantly, like who’s to say they’re not getting
0:32:52 higher quality talent coming in as other talent leaves, because now they can attract the best of
0:32:58 the best. Whereas when they started the company, you know, maybe Karpathy is
0:33:01 amazing, but maybe he's not the best at creating this despite having a huge persona online.
0:33:07 Yeah. And so in Silicon Valley, that’s always like resulted in like issues with founders and
0:33:11 people leaving and things like that. I really think it’s going to come down to like three major
0:33:16 players. I think really it’s going to be Google, Anthropic and OpenAI. And the reason I think
0:33:20 Anthropic is in the mix is because they’re just, they’re getting infinite amounts of cash from
0:33:25 Amazon, right? So it’s like, you’ve got, you’ve got Anthropic, who’s essentially owned by Amazon
0:33:30 at this point, OpenAI, who’s essentially owned by Microsoft at this point. And then Google and
0:33:35 DeepMind, which, you know, all under the same umbrella. I think you’re really missing XAI.
0:35:40 Oh, you're right. You're right. I am missing them. I do think they are going to,
0:33:43 you’re right. I am missing that one as well. I do think they’re going to do a lot, especially
0:33:48 with, you know, Elon building like the largest server farm on the planet and whatnot. So yeah,
0:33:52 yeah. And it’s going to be interesting because it sounds like Elon is thinking he’s going to
0:33:59 be able to create the largest model. But if OpenAI has discovered that like, okay,
0:34:03 having a bigger model is good and does improve something, but reasoning is actually the more
0:34:08 important part. You know, is Elon going to be able to like kind of slightly,
0:34:11 is he going to be able to catch up and figure out how they did it? Or does OpenAI really
0:34:15 have some kind of secret sauce there that it'll take people time to figure out?
0:34:22 I do often underestimate and take for granted XAI because I feel like they’re still trying to
0:34:25 catch up, right? Like the current models that we have available to us, which are,
0:34:30 you know, Grok inside of X and then they have their new like Aurora image model.
0:34:36 Neither of like their large language model or their image model feel as good as what else is
0:34:41 available. But also I'm not counting them out, because obviously Elon, I believe,
0:34:46 is still the richest person on the planet, at least last time I checked, right? He's got $6 billion
0:34:53 in investment. He’s building like the largest GPU cluster on the planet, right? We’re going to see
0:34:58 them make some waves and forgot to mention, he’s best friends with Donald Trump at this point.
0:35:04 So, you know, he’s got government inroads as well. So I think, yeah, I do think we are going to see
0:35:08 a lot more out of XAI. It’s just, I always forget about them because I don’t personally use them on
0:35:11 a daily basis, you know? Right. Yeah. They don’t have the consumer adoption yet, but like people
0:35:16 are showing that like that has been going up. I think I saw a chart showing that it was going
0:35:21 up a lot in Japan, which makes sense because Twitter, or X, is huge in Japan. But like I saw like
0:35:25 searches for Grok as well, where it's like going up dramatically in Japan. I think Elon shared
0:35:31 that chart last year almost like it was a global trend, and people were like,
0:35:35 that's Japan. But you know, Japan's very important, like a top three market. And so I
0:35:39 am curious to see how that plays out. But yeah, I think we have at least four major players
0:35:45 going after it. And I think all of them are going to make major progress in 2025. So,
0:35:50 which will make all of them make progress faster. They’re all, you know, people have said like,
0:35:56 oh, OpenAI had to show O3 on the final day of the 12 days of OpenAI, because Google has been so
0:36:00 freaking impressive with their announcements over the last month. Do you think that they had something
0:36:05 else intended for the 12th day? And they pivoted to show off 03 because of Google like sort of
0:36:09 matching them day for day with something equally as big, if not bigger? I think it’s, I think it’s
0:36:13 possible because you know, it was also you started seeing, you know, we’ve had Logan on the podcast
0:36:17 before and he was starting to like tweet out some stuff, you know, kind of kind of hinting
0:36:23 that like Google's not behind, maybe they're ahead. And then when
0:36:27 the benchmarks were shown for O3, like all these OpenAI people started sharing memes like, you
0:36:32 know, OpenAI was never behind. You thought they were behind. Everybody's saying they're behind.
0:36:40 What are we behind? You know, the most used model, and with like the stuff they have internally,
0:36:45 still the best. Well, they needed something like that to really sort of get people excited again,
0:36:49 because I feel like during the 12 days, there was a couple cool announcements, but most of them were
0:36:57 kind of like, for most people, it doesn't impact them too much. Like I thought the 1-800-ChatGPT thing
0:37:05 was sort of like a gimmicky novelty sort of thing. We got Santa voice. Yeah, the Santa voice.
0:37:10 I mean, I think the big moves they made were putting out Sora, but then Vio came out like
0:37:16 two days later and sort of knocked Sora off the top of like being the best AI video tool at the
0:37:21 moment. Sora Turbo, right? The other Sora might still be a lot better. I don't know. But then
0:37:26 they also announced the vision inside of the advanced voice mode. And I think that was probably
0:37:33 the most impactful thing that they did was the advanced voice mode with vision. We also got the
0:37:40 $200 a month plan, which gave us access to O1 Pro. Yeah, I mean, for me, O1 Pro is
0:37:45 the biggest one. But like I said, like most people, number one, they’re not going to pay $200 a month.
0:37:49 And number two, even if they did, you can see that everybody’s getting very different results.
0:37:54 Like it depends on what you're trying to use this for. For hard coding stuff or science,
0:37:58 people are blown away. If they’re trying to use it like they use just chat GPT on a regular daily
0:38:05 basis, they're like, this sucks. Yeah, yeah. I'm curious. I want to see this
0:38:09 game that you've been talking about. It's super simple. So don't expect much, but I'm
0:38:18 not a game designer at all. I’m expecting Baldur’s Gate 4 meets Elden Ring. Baldur’s Gate meets Elden
0:38:25 Ring. Let’s see it. No, no, it’s not that. But you’ve been using 01 to develop this or 01 Pro.
0:38:31 I’ve been using 01 Pro. I’ve been using Claude. I’ve been using Gemini 2.0. I’ve been using Sora.
0:38:38 Are you using an IDE like Cursor or Visual Studio? I'm using Cursor, but I feel like I'm
0:38:42 kind of a newbie using Cursor, to be quite honest. Now that they added, you can put O1
0:38:47 into Cursor, I do want to try it again, because that's probably better than the outputs I was getting
0:38:52 from Claude. I feel like right now, 01 is like slightly better than Sonnet. 01 Pro is significantly
0:38:56 better, but way slower. But I’ve been using everything. That’s why I was saying about the whole
0:39:06 multi-boxing thing, the multi-accounting. I think the wizards of AI, they are going to be putting
0:39:10 together all these different tools and maybe multiple LLMs based on what they’re good at.
0:39:15 It’s happening already. I think you’re going to see ChatGPT give you an output,
0:39:19 and then it spits it over to Gemini to double check it, and then it spits it to Claude to double
0:39:24 check it. It goes round robin through all these various LLMs and then spits back out
0:39:31 like a sort of aggregated response based on what all the LLMs said. This is very basic,
0:39:38 so don’t judge me. This is all super ugly, like you train a character, you go into a little town,
0:39:43 which is like all Midjourney generated. It looks sick, dude. It does look really good.
0:39:51 And dude, I use Midjourney for this. The special effects, I asked O1 Pro
0:39:57 what to do, and they're just like, "Okay, let's put some snow here. Okay, cool. Let's do that."
0:40:01 So it's like a static image, but then it coded in snow over the front.
0:40:05 Of course, you could make it look better. There's this one game called Darkest Dungeon,
0:40:09 where they do something like this, where you have a static image of a town,
0:40:13 and when you highlight over them, they light up and stuff like that.
0:40:16 So I’ll eventually do some kind of special effects where when you go over it, it lights up.
0:40:21 Here, I go build my party. I have not worked on the UI, so all this is just super ugly,
0:40:27 but okay, recruit my team, random. So how many hours do you have into this right now?
0:40:32 It’s a decent amount, but not a crazy amount. I have a hard time estimating,
0:40:35 because it’s been the kind of thing where I’ll do it while I’m doing other stuff.
0:40:40 I'll have O1 Pro running, and I'll just be totally doing other stuff. And I'll be like,
0:40:44 “Hey, improve this.” Not only like improve this, but sometimes it’s like,
0:40:48 "What should we improve?" So like, yeah, here you have some kind of a world map,
0:40:53 you know. It's all not that great for now, but like you go through it Mario style.
0:40:57 We have different nodes that you go through to get to the final boss.
0:41:00 And then what I’ve kind of done gameplay-wise is I was like,
0:41:04 “Well, what’s something simple I could actually implement without like making,
0:41:07 you know, Baldur’s Gate 3, which is super complicated.”
0:41:12 And I was like, "Well, I used to play this game called Puzzle & Dragons, actually Taizo Son's game,
0:41:15 where it's basically like a match-3 game with an RPG element."
0:41:19 But I always felt like, "Oh, but it doesn't have real RPG elements in it. Like,
0:41:22 they're very, very light." I was like, "What if you made something
0:41:27 in between Baldur's Gate, like more RPG elements, and a basic
0:41:30 match-3 puzzle that everyone understands, and some people love, some people hate?"
0:41:33 And so I just created something like that. And then you have like,
0:41:36 you basically have like a party here, like your party with like health bars,
0:41:39 it looks really good. All this is mid-journey art.
0:41:44 This is all mid-journey. Yeah. I used, I started to use scenario a little bit,
0:41:48 but this is mostly mid-journey. And, you know, and it’s just like a very basic match game where
0:41:53 you charge stuff up, you got different abilities. I hadn’t worked on the special effects at all.
0:41:56 I literally just asked 01pro, like, “Put some special effects in there.”
0:41:59 It looks really good. It’s like a, like a-
0:42:02 All the special effects, it just came up with. I didn’t, I didn’t do any of that yet.
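For context, the core of a match-3 mechanic like the one on screen is just scanning a grid for runs of three or more identical tiles. A toy sketch of that check, not the actual demo's code, and the element names are made up:

```python
# Minimal match-3 detection: find all cells that are part of a run
# of three or more identical tiles, horizontally or vertically.

def find_matches(grid):
    """Return the set of (row, col) cells that belong to a run of 3+."""
    matched = set()
    rows, cols = len(grid), len(grid[0])
    for r in range(rows):                       # horizontal runs
        for c in range(cols - 2):
            if grid[r][c] == grid[r][c + 1] == grid[r][c + 2]:
                matched.update({(r, c), (r, c + 1), (r, c + 2)})
    for c in range(cols):                       # vertical runs
        for r in range(rows - 2):
            if grid[r][c] == grid[r + 1][c] == grid[r + 2][c]:
                matched.update({(r, c), (r + 1, c), (r + 2, c)})
    return matched

board = [
    ["fire", "water", "water", "water"],
    ["leaf", "fire",  "leaf",  "fire"],
    ["leaf", "fire",  "water", "fire"],
    ["leaf", "water", "fire",  "fire"],
]
print(len(find_matches(board)))  # prints: 9
```

A real game loop would then clear the matched cells, drop tiles down, and rescan until no matches remain; charging abilities per cleared element sits naturally on top of this.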
0:42:08 It’s almost like a Candy Crush kind of concept in the actual like puzzle game almost,
0:42:12 but then you’ve got a whole bunch of like fantasy RPG elements also.
0:42:15 Yeah, yeah. I'm not going to walk through the entire game logic.
0:42:18 You’re getting hooked on it. You’re like, “I’m playing now. I’m doing good.”
0:42:23 This is not work now. Just play games. You know,
0:42:26 then you like have different choices for going to different rooms.
0:42:30 I probably would have the AI, I’d probably like actually like generate the dungeon,
0:42:31 so it’s not the same every time.
0:42:32 Yeah, yeah.
0:42:35 And then you have different, you know, different stuff you buy.
0:42:40 And so, to be honest with you, I’m kind of surprised how good it is.
0:42:44 It looks really good. I mean, everything just looks pretty solid.
0:42:50 Yeah, dude. Like considering, like, that's like 90% O1 Pro
0:42:54 with little bits of Gemini 2.0 and Claude for different uses.
0:42:58 Well, you lowered my expectations where you were like, “Yeah, don’t expect too much.”
0:43:01 And then what you showed me is better than I expected.
0:43:03 So I think it’s been kind of good for me to do this too.
0:43:05 Because, you know, I think I told you off camera.
0:43:09 Like I've been kind of deciding like, what else should I work on beyond the podcast?
0:43:13 Like, should I be building another startup or should I be doing a YouTube?
0:43:17 And I kind of decided like, well, I think I just want to be playing more of these AI tools.
0:43:21 I think it’ll just be more useful for the podcast for me to actually have hands-on experience.
0:43:24 Maybe these things turn into real projects. Maybe they don’t, but they’re like.
0:43:27 I can see it spinning off into a real like game studio.
0:43:28 And you’re actually selling them.
0:43:32 It’s been really useful for me though, to actually understand how good, like,
0:43:35 what are these things good at currently? What are the limitations?
0:43:43 What would be really cool is if a tool like a cursor sort of knew what each model was good at,
0:43:47 and then would send whatever your prompt is to the best model for that specific thing.
0:43:50 Or you could create your own logic that’d be kind of cool too.
0:43:53 Like, okay, there are suggestions, but you can kind of define it yourself, like,
0:43:55 yeah, when I’m doing UI, I want to use this.
0:44:01 Yeah. I can see cursor building in something like that, where it kind of,
0:44:05 it’s already model agnostic anyway. You just pick which model you want to use.
0:44:08 It doesn’t seem like too hard of an extra step to basically say,
0:44:11 if the question is related to this, send it to this model.
0:44:13 If it’s related to this, send it to this model.
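That kind of routing could start as nothing more than keyword rules. A minimal sketch; the keywords and model names here are invented placeholders, and a real tool would dispatch the prompt to each provider's API instead of returning a name:

```python
# Toy prompt router: keyword rules decide which model handles a request.
# Keywords and model names are placeholders, not any real product's config.

ROUTES = [
    ({"ui", "css", "layout"}, "claude-sonnet"),   # UI tweaks
    ({"refactor", "architecture"}, "o1-pro"),     # hard restructuring
    ({"search", "docs"}, "gemini-2.0"),           # lookups
]
DEFAULT_MODEL = "o1"

def pick_model(prompt: str) -> str:
    """Return the model name a prompt should be routed to."""
    words = set(prompt.lower().split())
    for keywords, model in ROUTES:
        if words & keywords:  # any rule keyword appears in the prompt
            return model
    return DEFAULT_MODEL

print(pick_model("refactor the battle system"))  # prints: o1-pro
print(pick_model("write a poem about snow"))     # prints: o1
```

In practice you would probably let an LLM classify the prompt rather than match keywords, but the shape of the logic, rules first, then a default, stays the same.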
0:44:16 Yeah. So doing this little test demo project
0:44:19 made me realize, like, you know, maybe a year or two ago, I was like,
0:44:24 oh, you're going to see like one- or two-person startups. That hasn't super happened yet.
0:44:28 But I feel like after like actually trying the newest stuff and like realizing how much
0:44:31 better it’s about to get, I think we probably are going to get there where you’re going to see
0:44:35 like one or two people, maybe five, you know, like a small group of people
0:44:38 creating massive projects, stuff that would have taken,
0:44:43 you know, a hundred people, a thousand people. You're going to see small teams of people
0:44:46 with the concentrated focus being able to crank out amazing projects.
0:44:52 And the fascinating thing is too, is they may be able to do multiple projects as well.
0:44:56 Like I’ve been finding like, I can totally work on other stuff while I’m doing this,
0:44:58 because it’s more like, it’s almost more like I’m like the director.
0:45:01 Yeah, yeah, yeah, yeah. You’re basically going and telling your team members,
0:45:04 all right, we need to create this next and they go and do it.
0:45:04 Yeah.
0:45:06 And you’re just steering the ship.
0:45:07 You’re not doing the things.
0:45:12 It is fun. And I think most people don’t know how to tie this stuff together
0:45:16 as well as I do. So I think that’s going to get easier and more people will know how to do that.
0:45:20 I totally agree. I mean, we were talking about this before hitting record.
0:45:23 I think a lot of SaaS companies are going to be in trouble.
0:45:28 I think it’s, you know, a lot of companies are going to be able to just like build
0:45:32 tailor-made tools for their needs without needing to go and, you know,
0:45:36 pay for a SaaS company to do the thing for them, right?
0:45:38 I’m already finding myself doing that when I have like a need.
0:45:41 Like I was talking on a previous episode of like,
0:45:44 I want something that no matter what file type I put into it,
0:45:46 it converts it to a JPEG.
0:45:51 So if I throw in a PNG, it converts it to a JPEG; a WebP, JPEG; an AVIF file, JPEG.
0:45:52 I don’t care what file type it is.
0:45:55 If I throw it, even if I throw in a JPEG,
0:45:58 I want it to output it as a JPEG in the selected folder
0:46:02 so that I can bulk throw images in there and convert them to JPEGs.
0:46:04 I guarantee there’s already tools out there that do it,
0:46:08 but it took me three minutes to build that using cursor and Claude.
0:46:11 So I just built it real quick and, you know,
0:46:17 I don’t have to go and use some either insanely ad-ridden website for it
0:46:19 or pay for some SaaS to do it.
0:46:22 I just have a tool that does it on my desktop now
0:46:23 and it works really, really well.
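A bulk everything-to-JPEG tool like the one described really is only a few lines. A rough sketch, assuming Pillow (`pip install pillow`) for the actual image decoding; the folder names are just examples:

```python
# Bulk "anything in, JPEG out" converter sketch. Path mapping is pure
# stdlib; only the pixel conversion itself needs Pillow.
from pathlib import Path

def jpeg_dest(src: Path, out_dir: Path) -> Path:
    """Map any input file to its .jpg destination in the output folder."""
    return out_dir / (src.stem + ".jpg")

def convert_folder(in_dir: str, out_dir: str) -> None:
    from PIL import Image  # deferred so the path helper runs without Pillow
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    for src in Path(in_dir).iterdir():
        if src.is_file():
            # JPEG has no alpha channel, so flatten to RGB first.
            Image.open(src).convert("RGB").save(jpeg_dest(src, out), "JPEG")

# e.g. convert_folder("drop_images_here", "converted")
print(jpeg_dest(Path("drop_images_here/cat.webp"), Path("converted")))
```

Pillow's `Image.open` sniffs the format from the file contents, so PNG, WebP, and even JPEG inputs all come out as JPEGs; AVIF may need an extra plugin package.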
0:46:26 And I think you’re going to see more and more and more of that
0:46:28 where people are just like, I need this problem solved.
0:46:32 I’m going to go and just code up a solution real quick
0:46:36 because all it is is a prompt and the prompt will code up a solution, you know?
0:46:37 Yeah, it’ll get faster and faster too.
0:46:40 Right? Like eventually this will be almost instant
0:46:43 where within a second you’ve got that output of what you wanted.
0:46:45 A tool that does what you want.
0:46:48 I think in the O3 mini demonstration they showed
0:46:50 where like it basically set up its own server
0:46:52 and then it set up all this different stuff and it's like...
0:46:55 It set up its own server and created...
0:46:57 It came up with a benchmark for itself
0:46:59 and then set up a server to run that, or something like this?
0:47:00 It's like, I built a benchmark.
0:47:03 Look, I'm really good because I beat my own benchmark that I just built to test.
0:47:07 But still, it is fascinating that it’s able to do these kind of things.
0:47:10 Right? It’s mind-blowing.
0:47:11 Yeah, for sure.
0:47:14 Yeah, I mean, I think it’s exciting times right now.
0:47:17 I think 2025 is going to be even more exciting.
0:47:18 So a lot of good stuff.
0:47:22 I think this is probably a good spot to wrap this one up,
0:47:24 our first episode of the year.
0:47:28 Thank you so much for tuning in for our show in 2024
0:47:31 and hopefully you’ll be with us for 2025.
0:47:34 Make sure you subscribe wherever you listen to podcasts.
0:47:38 We prefer YouTube because we try to make this visual and show stuff on our screen
0:47:41 and show off what we’re talking about a little bit.
0:47:45 But if you prefer audio, you can also find us wherever you listen to podcasts.
0:47:47 So thanks so much.
0:47:50 Happy New Year to everybody and we’ll see you in the next one.
0:47:51 Yeah, happy New Year’s.
0:48:09 [Music]
Episode 40: What will AI look like in 2025 and how will it change our daily lives? Matt Wolfe (https://x.com/mreflow) and Nathan Lands (https://x.com/NathanLands) dive deep into AI’s future with perspectives on emerging models and innovations.
In this episode, Nathan and Matt explore the vast potential of AI models like OpenAI’s upcoming o3, the future advancements expected by 2025, and the significant societal implications of these powerful technologies. They discuss potential impacts on the job market, daily life, and how AI-driven tools might handle routine tasks, freeing up human creativity for more complex endeavors.
Check out The Next Wave YouTube Channel if you want to see Matt and Nathan on screen: https://lnk.to/thenextwavepd
—
Show Notes:
- (00:00) Basic AGI emerging by 2025, transforming tasks.
- (05:14) Discussing AI-generated art and personal connection.
- (08:59) AI video advancements to reach new realism by 2025.
- (10:41) Improved AI reasoning entails significant cost challenges.
- (13:04) Top EverQuest player used multiboxing for profit.
- (18:57) More compute increases performance and intelligence.
- (21:09) New paradigm might solve frontier math problems.
- (25:46) o3 agents may eventually benefit regular users.
- (27:54) Agents constantly self-check, improving code accuracy.
- (32:04) Frequent job changes commonplace in Silicon Valley.
- (35:32) Japan remains important; major progress expected 2025.
- (38:36) Exploring AI tools; curious about updates.
- (41:08) Created an RPG-infused match-three puzzle game.
- (44:13) Small teams creating massive projects efficiently now.
- (45:51) Quickly built desktop tool converts files to JPEG.
—
Mentions:
- OpenAI: https://openai.com/
- Claude: https://claude.ai/
- EverQuest: https://www.everquest.com/
- Midjourney: https://www.midjourney.com/
Get the guide to build your own Custom GPT: https://clickhubspot.com/tnw
—
Check Out Matt’s Stuff:
• Future Tools – https://futuretools.beehiiv.com/
• Blog – https://www.mattwolfe.com/
• YouTube- https://www.youtube.com/@mreflow
—
Check Out Nathan’s Stuff:
- Newsletter: https://news.lore.com/
- Blog – https://lore.com/
The Next Wave is a HubSpot Original Podcast // Brought to you by The HubSpot Podcast Network // Production by Darren Clarke // Editing by Ezra Bakker Trupiano