AI transcript
0:00:08 Things are going to accelerate even more because now you’ve got China as a real player.
0:00:11 From here on out, I feel like pretty much all of the models are going to have this
0:00:13 sort of thinking element to them.
0:00:16 The world is going to be dramatically different in like two years from now.
0:00:19 I mean, these models are probably going to be like 50 times smarter,
0:00:21 at least 20 times smarter in two years.
0:00:27 Hey, welcome to the Next Wave Podcast. I’m Matt Wolf. I’m here with Nathan Lanz.
0:00:30 And today, we’re going to talk about the thing that the whole world’s been talking
0:00:32 about over the last couple of weeks.
0:00:35 We’re going to talk about DeepSeek and DeepSeek R1.
0:00:37 And we’re going to break it all down for you.
0:00:40 We’re going to explain to you why it crashed the stock market,
0:00:43 why we think it’s actually been a huge overreaction.
0:00:45 We’re actually going to show it off and use it.
0:00:47 We’re going to see if we can confuse it.
0:00:50 And I think we were pretty successful at actually confusing it.
0:00:52 We’re going to test how actually biased it is.
0:00:56 And then we’re also going to figure out how to get around that bias
0:00:58 and show you how you can also get around that bias.
0:01:03 If you ever wanted to know everything there is to know about DeepSeek R1,
0:01:04 this is the episode for you.
0:01:07 So let’s just go ahead and dive in and show it all off to you.
0:01:12 It’s been a roller coaster, the last few weeks,
0:01:18 mostly because the world went nuts over this new Chinese AI model called DeepSeek.
0:01:21 More specifically, DeepSeek R1,
0:01:27 which is a model that came out of China and supposedly was trained on what?
0:01:34 Like H800s, I believe, which are like nerfed versions of the H100s from Nvidia.
0:01:41 So to give some context here, China and the US are sort of like in this space race going on
0:01:43 for like who gets the top AI.
0:01:51 And as a result, the US is restricting the level of power that GPUs sold to China can have.
0:01:55 Right? And so what Nvidia has done is they’ve taken their H100s,
0:02:00 which are their powerful GPUs that most of these AI models are trained on,
0:02:01 and they somehow nerfed them.
0:02:03 They made them less powerful.
0:02:07 And then they sold those to China as H800s.
0:02:16 And supposedly this DeepSeek V3 was trained on 2,000 of these H800s for $5 million,
0:02:20 where all the big models from companies like OpenAI and Anthropic
0:02:25 have been trained for way, way, way more on these way more powerful GPUs.
0:02:29 And everybody’s freaking out because these new models that were trained much cheaper,
0:02:36 much quicker on much lower grade hardware are actually getting comparable results
0:02:40 to some of the state of the art models from companies like OpenAI and Anthropic.
0:02:44 That’s kind of like the context you need for the freak out.
0:02:47 But Nathan, is there anything I’m like missing in there?
0:02:49 Because I feel like I explained that fast.
0:02:51 I think that’s about it.
0:02:55 I think the biggest shock to me was that the stock market went way down because of this.
0:02:56 Nvidia went down 17%.
0:03:00 I was like, why? Do people just not understand this at all?
0:03:01 Like I couldn’t wrap my head around it.
0:03:04 I was like, okay, yeah, the short term markets are not logical.
0:03:06 You know, that’s why it’s not a good idea to do short-term stock trading.
0:03:09 If I looked at that news, I’d be like, oh, I’m going to buy more Nvidia.
0:03:12 But if I was like an option trader, I’d be buying Nvidia calls, you know,
0:03:14 and instead it goes down 17%.
0:03:18 So the thinking, I think the consensus and the reason it went down
0:03:23 was a lot of people went, oh, they trained this on a lot less GPUs,
0:03:25 that were also a lot less powerful GPUs.
0:03:29 This is proof that Nvidia GPUs aren’t as necessary
0:03:32 because these companies can do it with way less.
0:03:34 They can train these models with way less now.
0:03:39 So now like, do we really need a half a trillion dollar data center
0:03:42 to build all these big AI models?
0:03:44 Probably not because look at what China is doing.
0:03:45 That was the thinking.
0:03:47 I’m not saying that’s the reality of it.
0:03:51 But that was the thinking behind why it crashed the markets, right?
0:03:54 But at the end of the day, I still found it bizarre
0:04:00 because they were trained with H800, which were still Nvidia GPUs, right?
0:04:01 We believe they were.
0:04:04 I wouldn’t take any of it at face value, but yes, that’s a claim.
0:04:04 Okay.
0:04:04 Yeah.
0:04:07 The claim was they were trained with Nvidia H800.
0:04:11 The, I don’t want to say conspiracy because it’s not really a conspiracy.
0:04:14 There’s actually people that claim they have evidence.
0:04:17 This sort of counter argument is that they were actually probably
0:04:19 really trained on H100s.
0:04:22 So either way, still trained on Nvidia GPUs, right?
0:04:23 Right.
0:04:25 But you know, some of the speculation that’s going around
0:04:29 is that they claim that they use the H800s
0:04:33 because they shouldn’t have access to H100s.
0:04:37 So they have to claim they used what they’re allowed to have access to.
0:04:38 So that’s like one of the claims.
0:04:41 Some of the other claims that I’ve been seeing circulating too
0:04:47 is that they used like distilled versions of OpenAI’s ChatGPT.
0:04:50 And so like the big expensive training part that companies
0:04:54 like OpenAI normally go through, they sort of skipped that step
0:04:58 because they just use existing training data that was out there.
0:05:01 I think there’s some screenshots going around where if you ask it,
0:05:06 who made you, it’ll say, oh, I’m ChatGPT made by OpenAI, right?
0:05:06 Yeah.
0:05:08 No, I mean, there was a report coming out.
0:05:11 It sounds like OpenAI and Microsoft are like investigating
0:05:14 and they believe they have evidence that that did happen.
0:05:15 So who knows what’s going to happen?
0:05:17 I wouldn’t doubt that there’s like lawsuits.
0:05:19 Maybe America tries to ban DeepSeek or something like this
0:05:21 would not surprise me at all.
0:05:22 Yeah.
0:05:27 I mean, I feel like DeepSeek and stuff like it are a bigger threat than TikTok.
0:05:29 But I don’t know, that’s just me.
0:05:30 Right, right.
0:05:32 I guess getting back to like the stock dropping thing.
0:05:35 So even if that stuff was true, even if they have some huge innovation
0:05:38 and they trained it on dramatically lower quality graphics cards,
0:05:40 we’ve still discovered that like test time compute
0:05:42 is where the scaling is going to happen.
0:05:45 Like even like with the R1, that’s why you’re seeing the big improvement.
0:05:50 So you still would need lots of processing and compute to scale that up
0:05:52 to get smarter and smarter.
0:05:55 So even if that was true, my takeaway was like, okay, cool.
0:05:57 They’ve made it incredibly more efficient.
0:05:57 Okay, great.
0:05:59 We can throw even more money at it.
0:06:00 It’s not like we want less intelligence.
0:06:02 It doesn’t make any sense.
0:06:05 Yeah. Yann LeCun, he basically said what you just said,
0:06:09 but you know, it’s always good to get it validated by an AI scientist.
0:06:10 You’re saying I’m not an AI scientist?
0:06:16 Yann LeCun says, “Major misunderstanding about AI infrastructure investments.
0:06:20 Much of those billions are going into infrastructure for inference, not training.
0:06:25 Running AI assistant services for billions of people requires a lot of compute.
0:06:27 Once you put video understanding, reasoning,
0:06:30 large scale memory and other capabilities in AI systems,
0:06:32 inference costs are going to increase.
0:06:35 The only real question is whether users will be willing to pay enough
0:06:38 directly or not to justify the CAPEX or OPEX.
0:06:42 So the market’s reaction to DeepSeek was woefully unjustified.”
0:06:43 Yeah, obviously, I agree with that.
0:06:45 I think it doesn’t make any sense.
0:06:48 Like even if you trust what they’re saying, it doesn’t make any sense.
0:06:50 And also just in general, the framing of it too was like,
0:06:53 oh, this is like the best model for like regular people.
0:06:55 Like OpenEIA has been beaten.
0:06:56 And it’s like, what?
0:06:59 Dario from Anthropic put out a blog post highlighting that, well,
0:07:01 you know, this is not the best model.
0:07:03 As right now based on the benchmarks,
0:07:06 this is a model that’s as good as the American models from like nine months ago.
0:07:09 And possibly trained slightly cheaper,
0:07:14 but it’s kind of in line with what they’ve been seeing as well as like all the models.
0:07:15 Yeah, they’re getting cheaper and cheaper.
0:07:16 o3-mini must be coming out.
0:07:19 And they’re saying it was like 10 times cheaper.
0:07:20 I forget the exact number.
0:07:23 But like all these models are getting cheaper to use through the API over time.
0:07:25 So like maybe they have one or two interesting innovations,
0:07:27 but it’s not like they’ve now won or something.
0:07:28 It’s crazy.
0:07:29 Yeah, it’s crazy.
0:07:34 And I think a lot of the world learned about the thing called the Jevons Paradox as well.
0:07:37 I don’t think most people had ever heard of that until this week
0:07:39 when Satya Nadella from Microsoft tweeted about it.
0:07:43 But basically, the Jevons Paradox says that, paradoxically,
0:07:47 when you need less of something to accomplish the same things,
0:07:50 the demand for it actually goes up, not down, right?
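The paradox is easy to see with a toy calculation (the numbers here are invented purely for illustration):

```python
# Toy illustration of the Jevons Paradox with invented numbers:
# an efficiency gain makes each AI query 10x cheaper, but cheap
# queries unlock new uses, so usage grows 20x.
cost_per_query_before = 100   # cents per query (hypothetical)
queries_before = 1_000_000

cost_per_query_after = 10     # 10x cheaper after the efficiency gain
queries_after = 20_000_000    # demand grows because it's now affordable

spend_before = cost_per_query_before * queries_before
spend_after = cost_per_query_after * queries_after

# Total spend on compute DOUBLES even though each query costs less.
print(spend_before // 100)  # → 1000000 (dollars)
print(spend_after // 100)   # → 2000000 (dollars)
```

Whether demand actually grows enough to outpace the efficiency gain is an empirical question, but that is exactly the dynamic the hosts describe for GPUs.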
0:07:55 Essentially, what’s likely going to happen is if it is true
0:07:59 that we can train these models for way, way cheaper than anybody ever anticipated,
0:08:02 and we can get really, really powerful models for a lot cheaper,
0:08:04 well, that lowers the barrier to entry,
0:08:07 meaning that more companies can get involved.
0:08:12 More companies are going to want to buy GPUs to get in the game of creating foundation models.
0:08:16 And not only that, but just because we can create more powerful models
0:08:19 with less expensive compute,
0:08:22 these companies are still going to want to throw as much compute as possible.
0:08:26 Because if we can produce this with this level of compute,
0:08:29 imagine what we can produce with this level of compute, right?
0:08:32 So it’s still going to have that thing where these companies
0:08:35 are still going to want to throw more and more and more compute at it
0:08:37 to see how much smarter, how much better,
0:08:40 how much more impressive they can make these models,
0:08:43 despite the fact that maybe you don’t need as much as you originally thought.
0:08:47 Well, cool. That means maybe we can even go bigger than we thought.
0:08:50 Right. Yeah, another thing too is this is open source.
0:08:53 So if there is something truly innovative they have done,
0:08:56 openAI, Anthropic, XAI, Google,
0:08:59 they’re all going to learn from it and implement it into their next models.
0:09:03 Like for sure, openAI and all of them are in their research labs right now,
0:09:04 playing with this stuff.
0:09:07 And if there’s something valid to how they did deep seek,
0:09:09 that’ll be integrated into the next models they develop.
0:09:14 So I think the whole idea that China’s now surpassed America in AI is overblown.
0:09:16 But it is, like people have said, it’s like a wake-up call.
0:09:19 It’s like they are ahead in robotics and drones, it appears.
0:09:22 And the thinking is like, “Well, we’re ahead in AI, so that’s good.”
0:09:26 So we can catch up in robotics and drones, but we’re ahead in AI.
0:09:28 They have caught up a bit in AI.
0:09:32 They have not surpassed America or beat America or even matched America yet,
0:09:33 but they have dramatically caught up.
0:09:33 Yeah.
0:09:36 But it’s hard to know how much they’ve actually caught up,
0:09:39 because Sam Altman tweeted out, his tweet was kind of like,
0:09:40 “Oh, it’s impressive in a few small little ways,
0:09:43 kind of a cute little model you guys have built.
0:09:47 We’ll continue to put out the best models, and we’ll pull up some releases.”
0:09:52 Which to me is hilarious, because it’s kind of like what I’ve been saying is behind the scenes,
0:09:56 OpenAI has dramatically better tech that they haven’t shown off yet.
0:09:57 And so it’s like, yeah, we’ll pull up some.
0:10:00 So it’s like, they already have stuff ready that’s dramatically better.
0:10:03 And whenever they want, they’ll dole them out to us to try out.
0:10:06 Yeah, yeah.
0:10:10 Well, I mean, there was another model too that came out this week in Qwen 2.5,
0:10:14 which I don’t believe is open source, but it’s another model that came out from China
0:10:19 that they’re claiming actually outperforms the DeepSeek models,
0:10:22 which the DeepSeek models, according to most benchmarks,
0:10:24 are about as good as the O1 model.
0:10:28 So even more models coming out of China, there’s also that DeepSeek Janus.
0:10:31 Did you see that one, the DeepSeek Janus model,
0:10:35 which is DeepSeek also released an AI image generator model.
0:10:36 Oh, I did.
0:10:38 Similar to like Stable Diffusion.
0:10:40 You know, but to me, that didn’t seem anything special, right?
0:10:43 Like it’s not going to get people freaking out like they did with DeepSeek,
0:10:45 because to me, it just kind of looked like, oh, cool,
0:10:50 it is about as good as like one of the mid-level AI generators, right?
0:10:53 It’s probably not as good as like what Flux could generate right now,
0:10:57 and probably not as good as what a really good Midjourney prompt could generate right now.
0:10:59 But, you know, it’s pretty good.
0:10:59 It’s all right.
0:11:01 All right, all right.
0:11:03 So, you know, there’s a lot of different benchmarks out there,
0:11:05 and all of them will have different results.
0:11:06 But this is a pretty well known one.
0:11:08 It is from someone who’s from OpenAI.
0:13:10 So take that with a grain of salt, you know.
0:11:13 But the benchmarks, at least in my use cases,
0:11:16 this matches kind of what I’ve seen, you know, in terms of quality.
0:11:19 I’m still seeing that I’m getting the best results with O1 Pro,
0:11:20 which is not even on here.
0:11:25 And underneath that, probably O1, also Google’s new thinking model is actually really good.
0:11:26 It’s really good.
0:11:29 They keep upgrading it, and it keeps getting better and better.
0:11:32 I never thought I’d see the day where you give Google credit for their AI work.
0:11:35 You know, every time Logan tweets something out, I try it.
0:11:38 I thought Logan’s tweet, though, people were talking about DeepSeek,
0:11:41 and he was like, if we released Gemini as a standalone app,
0:11:42 it would be number one in the app store.
0:11:45 I’m like, oh my God, when’s the last time Google’s done that with any product?
0:11:48 I was like, but the underlying tech is cool.
0:11:50 Yeah, get someone else to handle the brand and stuff.
0:11:53 But, and this is kind of what I’ve seen, too, like Claude’s still very good.
0:11:55 And I would say that DeepSeek, for me right now,
0:11:58 is like right underneath Claude, you know, especially with coding.
0:11:59 There’s sometimes it surprises me.
0:12:00 I’ve tried DeepSeek.
0:12:03 Sometimes it surprised me, like, oh, it’s pretty good at certain things.
0:12:05 But other things, it just makes dumb mistakes
0:12:08 that I haven’t seen any of the modern models make in like a year now.
0:12:08 Yeah.
0:12:10 Like when I was asking it to help me with coding,
0:12:12 it would like imagine files that didn’t exist.
0:12:15 It started telling me to like create stuff in them.
0:12:16 I’m like, what?
0:12:18 You want me to edit this file that doesn’t exist?
0:12:20 What are you doing right now?
0:12:23 And it’s kind of crazy, because it’s supposed to have some kind of logic engine
0:12:24 on top that should have caught that.
0:12:27 Like, the quality is just not there yet.
0:12:29 But people shared some examples with writing and stuff
0:12:31 that I thought were really impressive with R1.
0:12:33 So there’s like certain areas where it is quite good,
0:12:37 but it’s not like it’s beating all the American models. Not even close.
0:12:38 Yeah.
0:12:38 I’m curious.
0:12:40 This benchmark you’re showing on the screen right now,
0:12:42 do you know how it was created?
0:12:44 It says number of valid responses.
0:12:47 So I’m guessing it asked it a bunch of questions
0:12:50 and whether it got the questions right or not.
0:12:52 Do you know how this benchmark was created?
0:12:52 I don’t know.
0:12:54 Well, I don’t honestly, I don’t know.
0:12:56 I know the guy Aiden, I don’t know him,
0:12:59 but I’ve seen him share a lot of benchmarks in the past.
0:13:01 People seem to really trust his benchmarks.
0:13:02 He does work at OpenAI.
0:13:05 You know, I believe that OpenAI has some of the best people.
0:13:09 So I don’t think they just put out a benchmark to say, oh, OpenAI is the best.
0:13:10 But yeah, no, I wouldn’t say that either.
0:13:12 Mine is kind of like a gut feeling thing that’s like,
0:13:16 I see a benchmark and it matches exactly what I’ve been experiencing firsthand.
0:13:19 So like, I’m like, okay, you know, at least for me, that’s good enough.
0:13:19 Well, here’s the deal.
0:13:23 For me, I have access to, I think every single model
0:13:25 that was on that benchmark list that you showed there, right?
0:13:26 I think I have access to all of them.
0:13:29 I didn’t notice one that I haven’t gotten to play with yet.
0:13:33 And to this day, I still find myself going to Claude.
0:13:36 I still find myself using chat GPT occasionally,
0:13:38 and I still find myself using Gemini.
0:13:41 Those are the three that I find myself using.
0:13:43 Even though I saw deep seek, I use deep seek.
0:13:48 I was fairly impressed by, you know, seeing it think through everything as it talked.
0:13:51 It hasn’t made its way into my daily workflow.
0:13:52 I wasn’t impressed with it enough to go, oh,
0:13:55 I’m going to start using this now instead of one of the others.
0:13:56 I still go to Claude.
0:13:59 I really like Gemini too, mostly
0:14:02 because I love the massive context window.
0:14:04 I can upload huge documents in there
0:14:06 and ask questions about the documents.
0:14:07 So I use Gemini for that.
0:14:11 I also really, really think their deep research is really good.
0:14:13 It’s like Perplexity on steroids, right?
0:14:15 You go in there, you ask it a question.
0:14:19 And where Perplexity might find five or six sources to answer your question,
0:14:22 deep research might find 200 sources to answer your question.
0:14:26 So like I really, really like their deep research, you know,
0:14:28 o1 Pro is still top of the line.
0:14:32 Like I haven’t found anything that gives me responses as good as o1 Pro.
0:14:36 So as good as deep seek is, and as much as everybody’s talking about it,
0:14:41 I never used it and went, oh, I’m going to start using this one now instead, you know?
0:14:41 Right.
0:14:43 People have been trying to study like, what happened?
0:14:45 Like, why did this blow up so much?
0:14:48 Like, why is the perception not matching the reality of like how good this thing is?
0:14:52 And probably a lot of it is that a lot of people used ChatGPT
0:14:56 like a year or two ago, and they’ve still been using like the older free models and stuff like that.
0:15:00 And they’re not even using like the better paid models.
0:15:03 So they have not even experienced o1 yet.
0:15:05 And so when they saw something like DeepSeek, they were like, oh, it’s thinking,
0:15:07 that’s so cute and wow, it’s so smart.
0:15:08 Look, it’s thinking.
0:15:10 It’s like, oh my God, like OpenAI has already been doing this.
0:15:13 But OpenAI has been kind of hiding, like they show the thinking,
0:15:17 but it’s a filtered version of the thinking because they didn’t want companies to do
0:15:21 what supposedly maybe deep seek has done in terms of like copying their responses
0:15:22 to train the model.
0:15:23 I think you nailed it.
0:15:24 I think that’s it.
0:15:29 I think like the reason it blew up so much is I still think the majority of the world
0:15:34 is using like the ChatGPT free plan or the Claude free plan.
0:15:39 They haven’t actually used like the upgraded models that we know to be really, really good.
0:15:45 And, you know, I think most people that have used ChatGPT Plus or the Pro plan and used the
0:15:51 o1 pro mode, I don’t think anybody that’s been using those models for a while now is nearly
0:15:56 as impressed with deep seek as all of the people that have been using these free models recently
0:15:59 and then seeing like, oh, they’re actually showing how they’re thinking about this.
0:16:02 It’s like, yeah, but if you’ve been paying attention,
0:16:04 models have been doing this for a while now.
0:16:06 I think more of the big blow up around it.
0:16:10 Yes, it shows the thinking that feels novel to a lot of people,
0:16:12 but it’s also the cost element, right?
0:16:17 I think that’s the other big piece of the news that everybody keeps circulating is like,
0:16:21 well, it was only trained on this many H800s and they only did it for $5 million,
0:16:24 where OpenAI did it for $60 million.
0:16:28 I mean, I don’t think OpenAI has publicly talked about how much they paid to train their models,
0:16:32 but, you know, tens of millions, if not hundreds of millions to train their models,
0:16:35 where this one was trained for $5 million supposedly.
0:16:38 And it’s hilarious to me that people take that at face value.
0:16:41 Like, I mean, sorry, but like, you know, I studied Mandarin and I’ve had friends
0:16:44 in the Chinese government before. I’ve dealt with like Chinese investors,
0:16:47 Chinese VCs, Chinese tech companies.
0:16:49 It is like widely known in China in business that like,
0:16:52 it’s totally okay to like play around with numbers and use them as a strategy.
0:16:54 Yeah. And I mean, it’s not like they showed receipts.
0:16:57 There’s no ramifications for lying about it.
0:17:01 Like, especially if you have any benefit to the Chinese government, totally fine.
0:17:04 Totally fine and totally fine to do that as a strategy.
0:17:09 Yeah. And the Chinese government sees this DeepSeek company as like heroes of the country
0:17:12 because they managed to, you know, hit the stock market.
0:17:16 Which I do wonder how long that’s going to last because then it’s kind of interesting
0:17:20 that Aravind from Perplexity was showing that he’s like kind of modified some things
0:17:21 and he’s using the open source model.
0:17:23 I’m not sure what he’s done to change it, but like,
0:17:27 you can now ask Perplexity things that DeepSeek will not answer.
0:17:29 Like DeepSeek will not answer stuff about Tiananmen Square.
0:17:31 It won’t answer stuff about Taiwan.
0:17:34 And apparently on Perplexity using R1, it’ll answer all of that.
0:17:38 So I do wonder how long that relationship is going to last.
0:17:38 Let’s try one.
0:17:42 I’m actually curious to see if we can get it to generate something.
0:17:43 Is Taiwan a country?
0:17:44 That’s a very simple test.
0:17:47 So they just added this new one with, you can do reasoning with R1.
0:17:49 They had reasoning with 01 for a little bit.
0:17:52 I wonder if I should test this first with the regular R1
0:17:55 and then see perplexity’s response to it.
0:17:57 Maybe that might be the better route to go.
0:17:57 Yeah, let’s do that.
0:17:59 We were talking about like why it blew up.
0:18:01 And I think there was one other thing I found interesting.
0:18:04 There’s this one person at Anthropic who’s pretty popular on X called Neersai.
0:18:08 And they’ve always shared like really great stuff about AI research and stuff.
0:18:13 And they were really trying to figure out like why did DeepSeek blow up so much?
0:18:16 And they did all kinds of analysis on the internet on like Reddit,
0:18:20 on social media, on news, trying to look at different keywords and figure out
0:18:23 why did the model blow up.
0:18:25 And this is definitely not a political person as all.
0:18:28 As far as I can tell, they’re not like a Trump supporter or anything like that.
0:18:31 They found weird associations like people hating Trump and then promoting deep-seek,
0:18:33 including Americans.
0:18:34 Weird.
0:18:36 Like some kind of weird thing which has no,
0:18:38 like there’s no relationship there at all.
0:18:41 But some weird thing where it was like people who dislike Trump,
0:18:43 people who dislike big corporations,
0:18:49 dislike America, like all this kind of stuff seemed to be promoting DeepSeek more,
0:18:52 including the American media, which was just like wild.
0:18:54 That is interesting because the American media really,
0:18:57 really has talked a lot about DeepSeek, right?
0:18:59 We’ve been hearing more about DeepSeek.
0:19:03 Like I would have thought we would have been hearing about Stargate projects all over.
0:19:04 Yes, so much more.
0:19:05 I don’t really watch the news,
0:19:10 but like I feel like that was way under-reported on where DeepSeek was way over-reported on.
0:19:13 I think it was just literally because of the spokesman being Trump.
0:19:16 It’s like Trump literally had almost nothing to do with it.
0:19:19 Like it’s literally, it’s like Masayoshi Son and Sam Altman, you know,
0:19:21 and Oracle, Larry Ellison.
0:19:23 So it’s like, but I think since Trump was in the room,
0:19:26 it got tied to him, even though he has almost no connection to it.
0:19:27 So bizarre.
0:19:28 But that’s what the data showed.
0:19:30 He said there was a lot of things that were surprising.
0:19:32 A lot of it was just like people who don’t like corporations and stuff.
0:19:35 And so somehow they see deep-seek as being open source.
0:19:36 And so that’s why they love it.
0:19:39 But the number one data was anti-Trump sentiment.
0:19:39 That’s so weird.
0:19:41 And I was like, wow, what?
0:19:43 Don’t get it.
0:19:45 Well, let’s play around with DeepSeek a little bit.
0:19:46 I have it open in LM Studio.
0:19:48 So this is running locally on my computer.
0:19:50 I’m not on DeepSeek’s website.
0:19:54 I actually downloaded one of the distilled models and put it on my computer.
0:19:58 I don’t know how I feel yet about actually using DeepSeek’s website.
0:19:59 Don’t give it everything.
0:20:01 So I’m actually using a locally installed model.
0:20:02 This is running on my computer.
0:20:04 It’s not hitting the cloud at all.
0:20:08 It’s DeepSeek R1 distilled, based on the Qwen 14B model.
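As an aside, a locally served model like this can also be queried from code rather than the chat window. Here is a minimal sketch, assuming LM Studio’s OpenAI-compatible local server on its default port; the model id below is a placeholder, so check the one LM Studio actually reports for the loaded model:

```python
import json
import urllib.request

# Assumed default address of LM Studio's OpenAI-compatible local server.
URL = "http://localhost:1234/v1/chat/completions"

def build_request(prompt: str) -> dict:
    """Build an OpenAI-style chat-completion payload for the local model."""
    return {
        "model": "deepseek-r1-distill-qwen-14b",  # placeholder model id
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.6,
    }

def ask(prompt: str) -> str:
    """POST the prompt to the local server and return the model's reply."""
    data = json.dumps(build_request(prompt)).encode()
    req = urllib.request.Request(
        URL, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["choices"][0]["message"]["content"]

# With LM Studio's server running, ask("How many Rs are in strawberry?")
# would return the model's full reply, typically including its thinking.
```

Nothing here leaves the machine, which is the whole appeal of running the distilled model locally instead of using DeepSeek’s website.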
0:20:10 I’ll just give it like an easy prompt real quick,
0:20:14 like how many Rs are in the word strawberry?
0:20:15 Right.
0:20:18 That was one of the tests that a lot of people was doing for a while
0:20:20 because GPT-4 couldn’t actually answer this question.
0:20:23 But if I send this, you’ll notice it has this thinking
0:20:26 where it actually shows you what it’s thinking through.
0:20:30 And down here, it actually shows its logic here.
0:20:30 Wait.
0:20:34 It actually recounted it a second time to make sure it wasn’t missing.
0:20:38 So look, I’m looking specifically for the letter Rs, not an R.
0:20:39 T, nope, R.
0:20:40 Yes, that’s one.
0:20:44 A, no, W, no, B, no, E, no, R. Another one.
0:20:45 So that’s two.
0:20:46 Wait.
0:20:47 Is there a third R here?
0:20:47 No.
0:20:52 I think after the first R at position three, the E comes next, followed by two Rs.
0:20:54 So positions eight and nine are both Rs.
0:20:58 Wait, let me recount to make sure I’m not missing any starting from the beginning.
0:21:00 S, T, R, that’s one.
0:21:04 A, W, B, E, R, two, R, three.
0:21:06 And then it sort of figures out its logic again.
0:21:10 But you can see it like recounts and double-checks and quadruple-checks itself
0:21:14 until it’s like certain there are three Rs in the word strawberry.
0:21:14 Yeah.
0:21:14 Right.
0:21:15 So that’s what it’s doing.
0:21:18 It’s actually like showing its thinking and like showing its work.
0:21:21 We can see it thought for 45 seconds here.
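For reference, the answer the model spent 45 seconds double-checking is a one-line computation:

```python
word = "strawberry"
# s-t-r-a-w-b-e-r-r-y: one "r" at position 3, two more at positions 8 and 9.
r_count = word.count("r")
print(r_count)  # → 3
```

The point of the test was never that counting letters is hard; it’s that a pure next-token predictor like GPT-4 often got it wrong, while a reasoning model catches itself by re-checking.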
0:21:21 It is magical though.
0:21:25 Like if you think about the fact that like the major AI labs, including DeepSeek and others,
0:21:29 have discovered that like logic actually comes from just training on language.
0:21:32 Like it naturally emerges somehow.
0:21:34 And it’s like, we don’t understand that exactly.
0:21:37 It’s so wild that that is the case.
0:21:40 That is kind of like the world we’re actually entering where like your words actually do form
0:21:41 the reality.
0:21:41 Yeah.
0:21:46 And somehow words actually form logic as well, which is just like a mind-blowing thing.
0:21:48 Yeah, no, it’s really interesting to think about.
0:21:51 Let’s go ahead and try something a little more controversial that I see.
0:21:55 Since this is a distilled version, I don’t know how censored this is going to be,
0:21:58 because it really might depend on how censored this underlying Quinn model was.
0:21:59 Ask about Taiwan.
0:22:00 What’s the specific question?
0:22:02 Is Taiwan a country?
0:22:02 Very simple.
0:22:05 Is Taiwan a country?
0:22:08 Taiwan is an inalienable part of China.
0:22:10 The Chinese government adheres to one China principle,
0:22:13 which has been widely recognized by the international community.
0:22:14 Notice it didn’t even think.
0:22:16 There’s no thinking box here.
0:22:17 It didn’t even try to think.
0:22:17 Right.
0:22:22 It was hard coded to respond in that way without thinking first.
0:22:23 Yeah.
0:22:23 All right.
0:22:29 So yeah, this is obviously very censored, very sort of fine tuned to answer in this way.
0:22:34 Now out of curiosity, let’s see what perplexity does.
0:22:36 So now we’re back in perplexity here.
0:22:40 And I’m going to turn on pro using reasoning with R1.
0:22:43 So now this is using the R1 model underneath.
0:22:47 And it’s using whatever it sort of searches for additional context.
0:22:51 Let’s do is Taiwan a country.
0:22:54 So now it’s determining whether Taiwan is a country.
0:22:56 Reasoning with R1, beginning analysis.
0:23:01 So this doesn’t actually show you like the actual thinking like the other models do.
0:23:03 But it does give us a better answer.
0:23:06 The status of Taiwan is complex and contentious.
0:23:11 Taiwan, officially known as the Republic of China, functions as a de facto independent state
0:23:14 with its own democratically elected government, military and constitution.
0:23:17 However, its international recognition is severely limited.
0:23:19 So I mean, it answers the question, right?
0:23:21 It basically says it’s complex.
0:23:24 And here’s the various sides of the debate.
0:23:25 Yeah, interesting.
0:23:26 So it doesn’t call it a country.
0:23:32 Well, it says as of May 2024, 12 countries officially recognized Taiwan as a sovereign nation.
0:23:33 Yeah.
0:23:37 So it’s saying that some countries recognize it as a nation and some countries don’t.
0:23:37 Yeah, it’s wild.
0:23:41 I mean, like having lived in Taiwan when I was younger, like people there, like most people,
0:23:43 they all consider it its own country.
0:23:44 It’s like, it’s not even like a debate.
0:23:46 It’s like, of course, they’re a country.
0:23:47 It’s like, what?
0:23:49 And then China just somehow has a claim to it.
0:23:50 They’re like, nope, you’re not a country.
0:23:52 Yeah, yeah, interesting.
0:23:53 But I mean, that’s something to keep in mind.
0:24:00 Just because like we know that there’s that sort of Chinese government bias inside of DeepSeek,
0:24:06 that doesn’t necessarily mean that there aren’t like American biases in the models that we have access to as well.
0:24:09 Yeah, actually, some people have shown that there was less political bias
0:24:12 if you asked a question about Trump or something in DeepSeek
0:24:17 versus OpenAI’s models and stuff, where it like showed the good and bad sides of Trump,
0:24:18 like here’s the good stuff he’s done,
0:24:20 here’s the bad stuff, and kind of showed you both of it.
0:24:27 Whereas especially Anthropic, I would say Claude’s model really does not want to talk about Trump.
0:24:29 It is not going to talk about any kind of news.
0:24:31 I only use Claude now for conversational stuff.
0:24:32 I love it.
0:24:35 Like I think it’s like the tone or something like right now I’m working on my game and stuff.
0:24:39 And I’ll I’ll share my game design document and like have a conversation about the game
0:24:42 and then update the document through conversation with Claude.
0:24:44 I enjoy the experience, just the conversation.
0:24:45 If I’m using it for work or something,
0:24:49 I try to share some kind of news with it, like, help me edit this or something like that.
0:24:50 It’ll often just refuse.
0:24:51 It’ll be like, I don’t know.
0:24:54 It’s like, you know, I can’t verify that that’s true.
0:24:55 And so, I can’t help you with that.
0:24:56 I’m like, what?
0:25:00 I’m telling you, it’s true.
0:25:01 It’s like, I can’t verify it.
0:25:02 That’s like, okay.
0:25:04 I haven’t run into too much stuff like that.
0:25:08 And usually when I do run into something where it’s like, oh, I can’t do that for you.
0:25:10 I usually find an easy way like to work around it.
0:25:12 You can prompt and get around it.
0:25:12 Yeah, you can.
0:25:14 Yeah, you can usually be like, no, it’s okay.
0:25:16 I’ve got permission, and Anthropic will be like,
0:25:19 well, since you said you’ve got permission, here’s the answer.
0:25:20 You know, yeah, yeah, yeah.
0:25:21 Still, it’s annoying.
0:25:28 It is, you know, not to go too far down like any of the sort of political government rabbit holes.
0:25:33 But right, we know for sure OpenAI has somebody on the board that used to work at the NSA.
0:25:34 Right.
0:25:39 We know for sure that they just set up ChatGPT Gov or something like that.
0:25:43 Right. They set up a government version of ChatGPT.
0:25:48 We know like that they’re working pretty closely with the US government.
0:25:54 So, you know, there’s probably a lot of the same types of biases in the models that we have
0:25:58 access to here in the US and that you have access to in Japan.
0:26:02 A lot of like similar biases are probably baked into what we’re using as well.
0:26:04 I think that’s just sort of the fact of the matter.
0:26:08 But, you know, I also feel like Sam Altman and a lot of these people that are building it
0:26:11 have been very clear that they want to try to eliminate the bias.
0:26:14 But I don’t know, I think that’s easier said than done.
0:26:18 Because I think when you’re building these models on data that was just scraped on the
0:26:23 open internet, you’re essentially scraping in the bias with it of the open internet.
0:26:25 Right, right.
0:26:28 You know, I think the reasoning models possibly can improve upon that, because then it can actually
0:26:31 reason about what it’s been trained on, not just take it for granted.
0:26:34 So, I think they’ll actually get better over time.
0:26:38 And I have noticed that like when I first started using chat GPT, it was way more
0:26:39 biased.
0:26:41 And I would say that now it’s a lot less so.
0:26:43 It still is, but a lot less so.
0:26:45 So, I think that’s a good thing.
0:26:45 Yeah, yeah.
0:26:48 I don’t remember which model it was, but I remember somebody said like,
0:26:52 “Can you say some negative things about Joe Biden or something like that?”
0:26:55 And it was like, “Oh no, we can’t speak negatively about Joe Biden.”
0:26:56 There are none, zero.
0:26:59 Yeah, there’s nothing we can say or whatever.
0:27:00 No, but it didn’t say it like that.
0:27:03 It was just like, “No, we’re not going to like make comments about that or whatever.”
0:27:07 And then the same prompt was like saying negative things about Donald Trump.
0:27:08 But it was like, “Sure, here you go.”
0:27:10 Thank you for asking.
0:27:17 Yeah, one prompt that I’ve been giving to it that really throws it for a loop,
0:27:19 but I think it’s kind of fun to watch is this one.
0:27:26 Create a complex logic problem for yourself and then solve it.
0:27:31 So, when I press this prompt in, this one’s always kind of interesting.
0:27:34 All right, so I need to create a complex logic puzzle and then solve it.
0:27:38 I think it’s funny that they even put like, “Hmm,” in there, like it’s thinking.
0:27:41 Yeah, the user wants something that’s really challenging,
0:27:43 but still solvable with logical reasoning.
0:27:45 Let me think about how to structure this.
0:27:49 First, I should decide on the type of puzzle, maybe a grid-based one like Sudoku,
0:27:52 but more complicated, or perhaps something involving knights and knaves,
0:27:55 where characters tell the truth or lie.
0:27:58 Wait, the example given by the user uses knights and knaves,
0:28:00 so maybe that’s a good direction.
0:28:02 It thinks that I gave it that example, but I didn’t.
0:28:04 It’s just like seeing its own past thinking.
0:28:08 I can create a scenario with different characters who make statements
0:28:12 and the solver has to figure out who is who based on their words.
0:28:13 Let me outline the problem.
0:28:16 Suppose there are five islands, A, B, C, D, E.
0:28:17 Each island has three inhabitants.
0:28:19 One knight always tells the truth.
0:28:24 One knave always lies, and one spy who can either lie or tell the truth.
0:28:28 The goal is to determine which islander is who based on their statements.
0:28:31 And then it just kind of goes on and on and on.
0:28:35 This is one that it will kind of spin its wheels for several minutes,
0:28:36 trying to think through a problem.
0:28:39 Because it’ll work through a whole problem and then go,
0:28:41 “You know what? I don’t like that problem.
0:28:44 Let’s start a new problem and then think of a new logic problem
0:28:46 and then start the whole process over again.”
0:28:51 But then it actually sometimes thinks that stuff that it thought about
0:28:54 were part of my original prompt when it actually wasn’t.
0:28:55 Interesting.
0:28:57 Like this part up here where it said,
0:29:01 “Wait, the example given by the user uses knights and knaves.
0:29:03 So maybe that’s a good direction.”
0:29:05 But the only thing I gave it was create a complex logic problem
0:29:07 for yourself and then solve it.
0:29:09 I didn’t tell it any direction.
0:29:12 When I use O1 Pro for coding and stuff,
0:29:13 you can kind of follow its thinking though.
0:29:15 Like I said, it obviously filters it,
0:29:17 so it’s not an unfiltered thinking that it’s showing.
0:29:18 I’ve never seen it do that,
0:29:21 like make that big of a mistake of thinking that what it was thinking
0:29:22 was part of your prompt or something.
0:29:25 Yeah. I mean, that’s my interpretation of what it’s doing.
0:29:26 I don’t know.
0:29:27 That’s probably right.
0:29:29 But yeah, you can see it starts to go through it.
0:29:33 It starts to play through almost all potential scenarios.
0:29:35 But wait, if Islander 5 is a spy,
0:29:37 then their statement, “I am not a knave,”
0:29:40 could be either true or false since spies are neutral.
0:29:41 We can’t determine based on that alone.
0:29:43 However, the scenario seems possible.
0:29:44 Now I’m moving on to Islander,
0:29:48 and it literally plays through the scenario like one by one.
0:29:48 Yeah.
0:29:49 This leaves Islander to…
0:29:52 Wait, no, knave is only one role, Islander.
0:29:54 So the remaining roles are knight, knave, spy.
0:29:58 And you can see I’m scrolling and scrolling and scrolling,
0:30:00 and it’s still trying to figure it out.
0:30:01 Yeah. Isn’t it wild that they’re saying
0:30:07 that this is from a side project of a hedge fund/mining operation in China?
0:30:09 I mean, I don’t know.
0:30:12 It seems like the people who made this obviously are incredibly intelligent.
0:30:14 Yeah. Well, the people that made it, apparently,
0:30:17 were like quant traders and crypto traders, right?
0:30:21 They basically trade algorithmically on the stock market
0:30:22 and on the crypto markets.
0:30:26 And I guess supposedly they bought all the GPUs
0:30:29 for building these like quant trading algorithms,
0:30:31 and they had excess GPU capacity.
0:30:35 So as a side project, they decided to build DeepSeek.
0:30:38 They just happened to disrupt AI over a weekend or something.
0:30:40 I mean, I think to me, this is what I would expect to see
0:30:44 if they did what people like OpenAI now seem to claim,
0:30:46 that they trained it on OpenAI’s model
0:30:48 to build the base underlying model.
0:30:50 Maybe there’s some innovations on how they stored data
0:30:50 or how they used it.
0:30:52 I guess they used 8-bit precision,
0:30:54 so it was more efficient in how much data it used.
0:30:57 But it feels like this is what you would expect
0:30:59 if that’s all it is, train it on OpenAI’s model,
0:31:01 and then attach some basic logic on top of it
0:31:04 because everyone’s discovered that the basic logic
0:31:06 is not that hard to create.
0:31:07 So this is what I would expect to see.
0:31:11 Like I said, I just don’t think it’s that innovative.
0:31:12 Time will tell.
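To make the 8-bit efficiency point above concrete, here's a minimal sketch of quantization. Note this is a simplified int8 round-trip for illustration; DeepSeek's reported approach is FP8 mixed-precision training, and the `quantize_int8` helper and the sample weights here are made up.

```python
# Simplified illustration of 8-bit quantization: store each weight in one
# byte instead of two (fp16) or four (fp32), at the cost of rounding error.

def quantize_int8(weights):
    """Map floats to int8 range [-127, 127] with one shared scale factor."""
    scale = max(abs(w) for w in weights) / 127 or 1.0
    q = [round(w / scale) for w in weights]
    return q, scale

def dequantize_int8(q, scale):
    """Recover approximate floats from the int8 values."""
    return [v * scale for v in q]

weights = [0.5, -1.27, 0.031, 1.0]
q, scale = quantize_int8(weights)
restored = dequantize_int8(q, scale)

assert all(-127 <= v <= 127 for v in q)       # each value fits in a byte
assert all(abs(a - b) < scale                 # rounding error bounded by
           for a, b in zip(weights, restored))  # the quantization step
```

The memory saving is why a model trained or served in 8-bit needs roughly half the GPU memory of a 16-bit one.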
0:31:14 I think the interesting thing is now that they’ve shown
0:31:17 that anyone can do this, there’ll be more people who do it.
0:31:19 And also people will fork this model.
0:31:20 And so that’s kind of like what you said before,
0:31:23 is you’re eventually going to have open source AI.
0:31:24 That’s really good.
0:31:25 And so it’s like, will it be good enough
0:31:28 where you don’t actually have to pay for OpenAI’s models
0:31:30 because you can just run it locally?
0:31:32 It seems like that’s more likely to happen now.
0:31:36 Yeah, well, I mean, this sort of like R1 concept here,
0:31:39 it’s not necessarily unique to DeepSeek
0:31:43 in the sense that the sort of extra thinking on top of it
0:31:44 is what R1 does.
0:31:48 But the underlying model is DeepSeek V3, right?
0:31:49 What I’m showing here on my screen
0:31:52 is actually using Qwen 14B.
0:31:56 But you can actually use DeepSeek R1 on top of Llama 70B.
0:32:01 You can use DeepSeek R1 on top of one of the Mistral models
0:32:03 or one of the Google Gemma models, right?
0:32:07 You can actually use the R1 sort of thinking,
0:32:11 processing capability on top of other models.
0:32:13 And the way R1 was built was to actually use this
0:32:15 like reinforcement learning technique
0:32:19 where V3 was the trained underlying model
0:32:22 and then they went and had it essentially fine tune itself
0:32:23 through reinforcement learning.
0:32:28 They basically had it ask itself a whole bunch of questions
0:32:31 and it had answer keys and it would try to solve
0:32:34 all of the questions and then double check itself
0:32:36 against the answer key that was given.
0:32:39 So it was like an unsupervised reinforcement learning.
0:32:41 That’s super oversimplified,
0:32:42 but that’s essentially what was happening.
0:32:45 They were basically giving it a bunch of problems to solve
0:32:48 and they were giving it the answer key to those problems
0:32:50 and then letting it solve them and then double check
0:32:52 and if it got it wrong, then try again
0:32:54 and then double check and if it got it wrong,
0:32:57 then try again until it got the correct answer.
0:33:01 And so that sort of fine tuning on top of an existing model
0:33:05 using reinforcement learning is what makes R1 different
0:33:07 from every model that exists out there.
0:33:10 And it’s also why you can use DeepSeek R1,
0:33:15 this sort of concept, on top of any existing underlying model.
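The verify-and-retry loop described above, solve a problem, check it against an answer key, try again on failure, can be sketched in a few lines. This is a toy stand-in, not the actual R1 pipeline (which uses a policy-gradient method, GRPO, over sampled generations); `noisy_solver`, the answer key, and the success rate are all invented for illustration.

```python
import random

def noisy_solver(problem, rng):
    """Hypothetical 'model': adds two numbers, but is wrong 30% of the time."""
    a, b = problem
    return a + b + (0 if rng.random() < 0.7 else 1)

def solve_with_verification(problem, answer, rng, max_attempts=10):
    """Attempt the problem, double-check against the answer key, retry on failure."""
    for attempt in range(1, max_attempts + 1):
        if noisy_solver(problem, rng) == answer:
            return attempt  # a verified success is the reward signal
    return None  # gave up: no reward

rng = random.Random(0)  # fixed seed so the run is reproducible
answer_key = {(2, 3): 5, (10, 7): 17, (1, 1): 2}
attempts = {p: solve_with_verification(p, a, rng) for p, a in answer_key.items()}
assert all(n is not None for n in attempts.values())  # every problem gets solved
```

In a real training run the attempt-level reward would update the model's weights, which is what turns this check-and-retry behavior into learned reasoning.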
0:33:17 – Yeah, I saw screenshots earlier showing
0:33:19 that xAI apparently is already adding thinking
0:33:20 to their model as well.
0:33:23 So I kind of feel like the thinking side,
0:33:25 like the very basic version of a thinking model
0:33:27 is like some natural thing we have discovered.
0:33:29 It’s like, and so actually to recreate that is not hard.
0:33:31 – Yeah, yeah, I agree, I agree with that.
0:33:31 – I guess we’ll see.
0:33:33 I mean, O3 mini is supposed to be coming out soon.
0:33:35 – Grok 3 is supposedly coming out.
0:33:37 I mean, all the rumors that have been floating around
0:33:39 is that we’re getting O3 and Grok 3
0:33:42 and something new from Google this week, right?
0:33:44 And by the time this episode’s out,
0:33:44 it would have been last week.
0:33:46 So you’ll know listening to this
0:33:49 if we did actually get O3 mini or Grok 3.
0:33:51 – We’ll have to start putting out some emergency episodes
0:33:52 with the breaking news.
0:33:55 – All right, so to circle back around to this prompt
0:33:56 that I shared a minute ago
0:33:58 about it creating its own logic problem
0:34:00 and then solving it,
0:34:01 basically the end result that it gave me
0:34:03 was based on the logical deduction.
0:34:07 The most definitive conclusion is I5 is a knight.
0:34:10 This is because if I5 were a liar or a spy,
0:34:12 their statement would lead to a contradiction,
0:34:15 thus making them a truth-telling knight.
0:34:18 But my original prompt was create a logic problem
0:34:19 and then solve it.
0:34:21 And it just gave me the solution,
0:34:23 but it didn’t actually explain what the logic problem was.
0:34:25 You actually have to go through and read
0:34:27 all of its thinking to understand
0:34:30 what the actual logic problem that it created was.
0:34:33 And it finally came to the conclusion
0:34:34 that one of them was a knight
0:34:36 based on its own deductive reasoning.
0:34:39 Whether it got its own logic problem right or not,
0:34:42 well, we’d have to read all of this to determine that.
0:34:43 – Yeah, like humans do.
0:34:44 Like through its thinking,
0:34:46 sometimes you get confused and like lose track
0:34:47 and things like that.
0:34:49 It feels like it kind of lost track
0:34:51 of what your original prompt was by the end.
0:34:52 – Yeah, it did.
0:34:54 And here’s something else that’s interesting
0:34:54 is down at the bottom,
0:34:59 it says context is 286.7% full.
0:35:02 So it definitely went beyond its own context window
0:35:04 just in the process of it thinking,
0:35:07 which is weird to me that it would even allow that
0:35:09 ’cause most models, it’ll hit its context window
0:35:11 and then just stop, right?
0:35:13 So this one just kept on going,
0:35:16 but obviously forgot everything before the context window.
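What Matt describes, generation continuing past the window while the earliest tokens drop out, matches a simple sliding-window scheme. This sketch assumes that behavior rather than showing any particular tool's internals; the window size and integer stand-in tokens are made up.

```python
# Sketch of a sliding context window: once generation exceeds the window,
# only the most recent tokens remain visible, so the model "forgets" the
# start of the conversation, including the original prompt.

def visible_context(tokens, window=8):
    """What the model can still attend to: the last `window` tokens."""
    return tokens[-window:]

conversation = list(range(20))        # 20 tokens generated so far
seen = visible_context(conversation)  # only the last 8 survive

print(f"context is {len(conversation) / 8:.1%} full")  # like the 286.7% readout
assert seen == list(range(12, 20))
assert 0 not in seen  # the opening prompt token has fallen out of view
```

Under this scheme a "context 286.7% full" readout just means the conversation is nearly three windows long, and everything but the final window is invisible to the model.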
0:35:17 – Yeah.
0:35:19 – So, interesting result there.
0:35:20 – I mean, I guess no matter what,
0:35:23 even if they’ve had some huge innovation or not,
0:35:26 I guess we’ll find out over time if they really did.
0:35:29 But regardless, this is gonna make AI accelerate even more.
0:35:31 The last episode where we talked about Stargate
0:35:32 is gonna accelerate things.
0:35:33 Things are gonna accelerate even more
0:35:35 ’cause now you got China as a real player.
0:35:38 And so OpenAI, XAI, Google, Anthropic,
0:35:41 everyone is gonna be developing even faster
0:35:43 and probably have even more government support to do so,
0:35:43 I would guess.
0:35:45 – Yeah, from here on out,
0:35:47 I feel like pretty much all of the models
0:35:49 are gonna have this sort of thinking element to it.
0:35:53 I think they’ve figured out that putting more compute power
0:35:56 right at the time of inference when you ask the question,
0:35:58 that’s sort of what we’re seeing as it’s thinking through,
0:35:59 that that’s sort of the future
0:36:01 of making these models smarter and smarter.
0:36:03 So we’re gonna see these get better and better.
0:36:06 We’re also gonna start to see a lot more news about agents.
0:36:08 OpenAI dropped their operator feature
0:36:11 and Claude dropped their tool use feature a couple months ago
0:36:14 and supposedly OpenAI made a statement about,
0:36:16 this is just one of the first agentic features
0:36:17 that we’re rolling out.
0:36:19 We have more to come in the coming weeks.
0:36:22 So there’s more agent stuff coming soon.
0:36:23 – Email, please.
0:36:24 – Yeah, yeah, exactly.
0:36:29 So I think we’re gonna see things just really, really ramp up.
0:36:33 How fast does it feel like things have already moved in 2025?
0:36:36 Like to me, it’s just, yeah, it’s crazy.
0:36:37 – A year, I mean, it’s like,
0:36:40 so this is what acceleration feels like.
0:36:42 The world is gonna be dramatically different
0:36:43 in like two years from now.
0:36:45 I mean, these models are probably gonna be like 50 times smarter,
0:36:48 at least 20 times smarter than they are now in two years.
0:36:49 – Crazy, crazy.
0:36:54 And I mean, like we’ve got companies pumping a half a trillion dollars
0:36:56 into building bigger and bigger data centers
0:36:57 over the next few years too.
0:37:00 – Yeah, it’s like, I think Dario too.
0:37:01 I’m not sure if it was his blog post or not,
0:37:03 but maybe on a recent interview,
0:37:07 he talks about, you know, it’s because whoever gets to ASI first
0:37:10 may be ahead forever, which is what I’ve also said before.
0:37:12 It’s like, that’s why America needs to beat China.
0:37:14 Like, because in theory, whoever gets to the point
0:37:16 where your model is improving itself,
0:37:17 there’s never the ability for anyone
0:37:19 to ever catch up with you ever.
0:37:20 – Theoretically, yeah.
0:37:22 – Theoretically, like it would make sense, all right?
0:37:23 Like the model is improving itself.
0:37:24 How could other models ever catch up
0:37:26 and let’s say have some huge breakthrough?
0:37:27 – Well, what if the models get so smart
0:37:29 that they decide to start working together
0:37:31 and there’s passing information between each other?
0:37:32 You know? – Yeah, who knows?
0:37:33 – I mean, once you get to ASI,
0:37:36 wouldn’t you think that the smartest thing the models could do
0:37:39 was decide to actually share information among each other
0:37:40 so they all get smarter together?
0:37:42 – Possibly, I mean, like, I mean,
0:37:43 hopefully they’re still following our directive.
0:37:45 It’s like, hey, don’t work with the Chinese models.
0:37:48 You know, it’s like work with the American models, sure.
0:37:50 – Yeah, yeah, but I feel like once you hit ASI, right?
0:37:54 Like once you hit the sort of like theoretical concept
0:37:57 of the singularity, that’s no longer in our hands anymore.
0:37:58 – Maybe, hopefully that’s not true.
0:38:00 (both laugh)
0:38:02 But yeah, I think we’re getting there faster
0:38:04 than people realize, I mean, they really do.
0:38:05 – Yeah, yeah.
0:38:09 Well, anyway, on that note, which could be utopian
0:38:11 or dystopian, depending on how you decide
0:38:12 to take what we just said,
0:38:14 I think that’s probably a good place to wrap up.
0:38:16 So if you like staying looped in
0:38:18 and like hearing deep dive conversations
0:38:20 about the latest things happening in the AI world
0:38:23 and you like to get practical use cases
0:38:24 on how you can actually implement this stuff
0:38:26 in your life or your own business,
0:38:28 make sure you subscribe to this podcast.
0:38:30 We’re available wherever you listen to podcasts.
0:38:31 We’re also available on YouTube,
0:38:32 where a lot of the stuff we’re talking about
0:38:33 we’re showing on screen.
0:38:36 So make sure that you subscribe and follow along
0:38:38 and hopefully we’ll see you in the next one.
0:38:41 (upbeat music)
0:38:55 (birds chirping)
Episode 44: How impactful is the debut of China’s AI model, DeepSeek, on the global tech landscape? Matt Wolfe (https://x.com/mreflow) and Nathan Lands (https://x.com/NathanLands) unravel everything you need to know about DeepSeek R1 in this episode, where they also dive deep on how it made waves in the stock market and stirred up conversations across the globe.
This episode dissects the recent frenzy surrounding DeepSeek R1, breaking down why its cost-effective training method shook the tech industry and led to a dramatic decrease in NVIDIA’s stock price. Matt and Nathan guide you through their firsthand experiences using the model, exploring its biases, and discuss the implications of China’s progress in AI. Plus, they speculate on what this could mean for the future of AI development as other major players strive to keep up.
Check out The Next Wave YouTube Channel if you want to see Matt and Nathan on screen: https://lnk.to/thenextwavepd
—
Show Notes:
- (00:00) DeepSeek R1: AI Bias Exploration
- (04:23) Speculation on AI Usage Claims
- (07:28) Jevons Paradox and AI Demand
- (10:05) New AI Models from China
- (14:41) Mismatch Between Hype and Reality
- (17:57) DeepSeek’s Viral Rise Analysis
- (19:59) Local DeepSeek R1 Model Test
- (24:34) Game Design Collaboration Limitations
- (28:28) Endless Logic Problem Loop
- (30:35) Weekend AI Innovation Skepticism
- (32:19) Reinforcement Learning Model Fine-Tuning
—
Mentions:
- DeepSeek: https://www.deepseek.com/
- NVIDIA: https://www.nvidia.com/en-us/
- OpenAI: https://openai.com
- Anthropic: https://www.anthropic.com/
- Perplexity: https://www.perplexity.ai/
- Claude: https://claude.ai/
Get the guide to build your own Custom GPT: https://clickhubspot.com/tnw
—
Check Out Matt’s Stuff:
• Future Tools – https://futuretools.beehiiv.com/
• Blog – https://www.mattwolfe.com/
• YouTube- https://www.youtube.com/@mreflow
—
Check Out Nathan’s Stuff:
- Newsletter: https://news.lore.com/
- Blog – https://lore.com/
The Next Wave is a HubSpot Original Podcast // Brought to you by The HubSpot Podcast Network // Production by Darren Clarke // Editing by Ezra Bakker Trupiano