AI transcript
0:00:06 they’re going to come to market faster, they’re going to be able to ramp faster,
0:00:11 they’re going to have better negotiations, whether it’s with TSMC or SK Hynix on the memory and silicon side
0:00:15 or all the rack people or like copper cables, everything, they’re going to have better cost efficiency.
0:00:17 So you can’t just like do the same thing as NVIDIA.
0:00:20 You have to really leap forward in some other way.
0:00:22 You have to be like 5x better.
0:00:28 Today, we’re talking AI, hardware, chips, and the infrastructure powering the next wave of models
0:00:30 with three people at the center of it all.
0:00:36 Dylan Patel, founder and CEO of SemiAnalysis, one of the sharpest voices on chips, data centers,
0:00:39 and the economics driving AI’s explosive growth.
0:00:45 Erin Price-Wright, general partner at a16z, investing in the technologies and infrastructure shaping the future.
0:00:51 Guido Appenzeller, partner at a16z with decades on the front lines of AI, cloud, and networking.
0:00:57 From GPT-5’s launch to NVIDIA’s dominance, custom silicon and the global race for compute,
0:00:59 we’re covering what’s happening behind the scenes.
0:01:00 Let’s get into it.
0:01:04 Dylan, welcome to the podcast.
0:01:06 Thank you for having me.
0:01:07 We’ve been trying to get you for a while.
0:01:09 You’re a busy man, but it worked out.
0:01:13 Guido, why don’t you introduce why we’re so excited to have Dylan on the podcast and what we’re excited to discuss?
0:01:18 I think, Dylan, you’ve done an exceptional job in covering what’s happening in the AI hardware space,
0:01:21 AI semi-space, and now more and more data center space as well.
0:01:28 And just looking at it, currently the most valuable company on the planet is an AI semi-company, right?
0:01:33 The, I think, biggest IPO so far in AI was an AI cloud company.
0:01:34 This is currently where it’s happening, right?
0:01:37 And in any gold rush, in the early days it’s the picks and shovels that make money.
0:01:39 And I think this is the stage that we’re in.
0:01:40 So super excited to have you here today.
0:01:41 Awesome.
0:01:42 Thank you.
0:01:44 Happy to talk about my favorite topics.
0:01:45 Amazing.
0:01:46 Well, maybe let’s start with GPT-5.
0:01:50 We just had some of the researchers, Christina and Isabella, on here last week.
0:01:51 You said it was disappointing.
0:01:55 Why don’t you share your reactions or what capabilities you were hoping to see or overall?
0:01:58 I think it depends on what tier of user you are, right?
0:02:03 If you’re just using GPT-5 and before you were a $20 or $200 a month subscriber,
0:02:09 you no longer have access to 4.5, which, in my opinion, is still a better pre-trained model for certain things.
0:02:17 Or you no longer have access to O3, which would think for 30 seconds on average, maybe, right?
0:02:22 Whereas GPT-5, even when you’re using thinking, only thinks for like 5 to 10 seconds on average, right?
0:02:24 Which is an interesting sort of phenomenon, right?
0:02:28 But basically, like, GPT-5 is not spending more compute per se.
0:02:31 The model did get a little bit better on a vanilla basis, right?
0:02:33 4.0 to 5 is actually quite a bit better.
0:02:38 But when you think about, you know, what is this curve of intelligence, right?
0:02:41 It’s like, the more compute you spend, the better the model gets.
0:02:44 And that’s whether it’s a bigger model, which GPT-5 isn’t, right?
0:02:46 You can see it’s not a bigger model.
0:02:49 It’s roughly the same size, you know, or you think more, right?
0:02:53 But again, like, this is something that OpenAI’s first thinking models, you know,
0:02:59 the first few generations of O1, O3, would think for a long time and waste a lot of tokens, if you will.
0:03:02 And when you look at, for example, Anthropic’s thinking models,
0:03:05 even when you put them in thinking mode, they think a lot less, right?
0:03:07 To get to the same results or better results, right?
0:03:08 Than OpenAI’s.
0:03:15 And so OpenAI, I think, like, optimized a lot of like, well, if I ask, like, I think the silliest one I had asked was like,
0:03:17 I asked O3 once, is pork red meat or white meat?
0:03:19 And it thought for like 48 seconds.
0:03:20 It was like, what are you doing?
0:03:22 Like, they should just like, tell me the answer.
0:03:27 And so like, the nice thing is that GPT-5 will think a lot less, even if you select thinking manually.
0:03:33 But more importantly, they have the sort of auto functionality, the router, which lets them decide whether or not,
0:03:36 hey, do I route to the regular model?
0:03:39 Do I route to maybe mini if you’re at a rate limit?
0:03:41 Or do I route to thinking, right?
0:03:42 And how much do I think?
0:03:44 But in general, the thinking model will think less.
0:03:50 So there’s less compute going into a power user’s average query than before.
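The routing behavior described above can be sketched as a toy function. Everything here, the tiers, the difficulty scoring, and the thresholds, is invented for illustration; OpenAI has not published how the GPT-5 router actually decides.

```python
# Toy sketch of the query-routing idea from the conversation.
# Tiers, scoring, and thresholds are made up; this is not OpenAI's router.

def estimate_difficulty(query: str) -> float:
    """Stand-in difficulty score: longer, task-heavy queries score higher."""
    score = min(len(query) / 200, 1.0)
    if any(w in query.lower() for w in ("prove", "debug", "compare", "plan")):
        score += 0.5
    return min(score, 1.5)

def route(query: str, user_at_rate_limit: bool = False) -> str:
    """Pick a model tier: cheap 'mini', the base model, or a 'thinking' model."""
    if user_at_rate_limit:
        return "mini"            # graceful degradation under load
    d = estimate_difficulty(query)
    if d < 0.3:
        return "mini"            # "why is the sky blue" style queries
    elif d < 1.0:
        return "base"
    return "thinking"            # spend more compute only when it pays off

print(route("Why is the sky blue?"))                     # mini
print(route("Debug this race condition in my scheduler "
            "and propose a locking plan. " * 3))         # thinking
```

The point the sketch makes is economic, not technical: the router is a cost dial, so the provider can trade compute for quality per query instead of per subscription tier.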
0:03:55 But isn’t it even more interesting that OpenAI can now control how much compute it wants to allocate to you, right?
0:04:00 If they’re in a high-load situation, maybe they tune the router a little bit so it thinks less, right?
0:04:02 Maybe I have no idea what they’re doing behind the curtain.
0:04:06 But there’s this meme out there at the moment that basically all they did, which is a meme, right?
0:04:07 It’s not true.
0:04:13 But all they did is take O3 plus a couple of smaller models, put a router in front and offer that at a lower blended price, essentially, right?
0:04:15 I think there’s a little bit of that, right?
0:04:19 Cost suddenly matters, and they figured out a way how they can steer that.
0:04:24 I think, yeah, I mean, and they talked about how they’ve been able to dramatically increase their infrastructure capacity.
0:04:29 Because I myself was just regularly using O3 or 4.5, right?
0:04:34 And now I’m forced to use auto, which sometimes gives me the O3 equivalent thinking model,
0:04:37 but sometimes gives me just a regular base, which sucks.
0:04:40 But I think for the free user, it’s actually quite interesting, right?
0:04:46 The free user was not getting thinking models pretty much ever or not using them.
0:04:48 Or in many cases, they just opened the website and asked their query.
0:04:50 And now sometimes their query gets routed there.
0:04:51 So sometimes they get a way better model.
0:04:56 But now OpenAI can gracefully degrade them if they need to, right?
0:05:00 And I think the router points to the future of OpenAI as a business, right?
0:05:02 Like, you can look at sort of the model companies, right?
0:05:04 Anthropic is fully focused on B2B, right?
0:05:07 API, code, et cetera, right?
0:05:08 Or Claude Code, whatever it is, right?
0:05:13 OpenAI, yes, they have that business, Codex and API business.
0:05:16 But really, the majority of the revenue is consumer, right?
0:05:17 And it’s consumer subscriptions.
0:05:21 But they have no way to upsell, you know, to make money off of all the free users, right?
0:05:26 In any other application, consumer app, the free user still pays via ads.
0:05:29 But this is not compatible with AI, right?
0:05:30 Like, it’s a helpful assistant.
0:05:34 You can’t just make the user’s result worse by injecting ads.
0:05:36 Banner ads don’t really work in AI either.
0:05:39 So it’s like, how do you now monetize them?
0:05:44 And I think with the router, they’re getting really close to figuring out how to monetize that user, right?
0:05:48 With the new CEO of applications, if you saw her product that she launched at Shopify,
0:05:52 I think it was Shopify, was an agent for shopping, right?
0:05:56 And now this, like, immediately clicks, like, oh, if the user asks a low-value query,
0:05:59 hey, why is the sky blue, just route them to mini, right?
0:06:01 The model can answer perfectly fine.
0:06:03 And that is a chunk of queries, right?
0:06:06 But if they ask, what’s the best DUI lawyer near me, right?
0:06:11 All of a sudden, this is like, you know, you’re in jail, you have one shot, you’re like, screw it.
0:06:13 Let me ask ChatGPT what the best DUI lawyer is.
0:06:15 And now all of a sudden, the model’s not capable of it today.
0:06:18 But soon enough, it’ll be able to contact all the lawyers in the area
0:06:24 and figure out what their results are and maybe search their, like, court filings and whatever, right?
0:06:26 Book the best lawyer for you or an airplane ticket.
0:06:28 Negotiate a cut as part of that.
0:06:29 Yeah, of course, they’re going to take a cut, right?
0:06:32 But there’s a much better way of monetizing the free user.
0:06:34 It’s like, you know, it’s like Etsy.
0:06:35 10% of their traffic now comes from ChatGPT.
0:06:38 And OpenAI makes nothing off of that.
0:06:41 But they really, really will soon, right?
0:06:43 And partially that’s because Amazon blocks ChatGPT.
0:06:48 But there’s a way to make money from shopping decisions, whether it’s booking flights or looking
0:06:48 for items.
0:06:51 And those you now say, free user, I don’t care.
0:06:52 I’m going to send you to my best model.
0:06:53 I’m going to send you to agents.
0:06:58 I’m going to spend ungodly amounts of compute on you because I can make money off of this.
0:07:02 But if it’s a query that’s like, help me with my homework, I’ll send you like a decent model,
0:07:02 right?
0:07:04 I don’t need to spend money on you.
0:07:08 And so this is how I think, like, OpenAI can finally make money off of the free user.
0:07:11 And I think that’s the biggest, like, thing about the router, right?
0:07:12 This is super interesting.
0:07:17 I think this is the first time that we’ve seen that there’s a launch of a new model where,
0:07:20 to some degree, cost is the headline item, right?
0:07:22 I mean, so far, I was always like, who has the smartest model?
0:07:23 Who has the highest MMLU score?
0:07:27 Now we suddenly have people who use models for coding eight hours a day and are surprised
0:07:32 that a large context window plus the best model creates thousands of dollars of cost a month.
0:07:34 So cost matters.
0:07:37 And so, to some degree, where you are on the Pareto frontier between cost and performance
0:07:42 is the new benchmark for model competitiveness, no longer capability alone.
0:07:43 Is that what we’re seeing here?
0:07:44 I mean, I think definitely, right?
0:07:48 Like, OpenAI said they doubled their rate limits for a big chunk of users.
0:07:52 They’ve dramatically increased the number of tokens they’re serving from this launch, which
0:07:54 effectively says this is an economic release.
0:07:56 Probably also means the tokens are now cheaper, right?
0:07:58 Yeah, yeah, for sure, for sure.
0:08:02 I think the funniest thing is this whole cost thing you mentioned is like, we’ve seen this
0:08:03 in the code space, right?
0:08:06 Cursor had to pull away the unlimited cloud code.
0:08:10 Initially, they have this super expensive plan and it had like unlimited rates and then they
0:08:11 were only like a weekly rate limit.
0:08:13 Now they have like hour-based rate limits.
0:08:19 And I saw the craziest like thread on Twitter where this guy said he changed his sleep schedule,
0:08:19 right?
0:08:22 Modeled after, like, how solo sailors sleep.
0:08:24 If you’re sailing, you can’t sleep, right?
0:08:25 Like solo sailing.
0:08:29 They’ll take like power naps when they get to the right spots so that they can like still
0:08:30 be safe.
0:08:31 In the morning when it’s not very windy.
0:08:34 Well, but like they can’t sleep uninterrupted, right?
0:08:39 And so because Anthropic had to put rate limits that are like not just week-based, but like
0:08:40 number of hours based.
0:08:46 And like he like basically sleeps multiple times a day, but small chunks just so he can maximize
0:08:47 the usage.
0:08:51 And there’s also a leaderboard on Reddit where people are like competing to see how many
0:08:52 tokens they’re using through their subscription.
0:08:55 And there’s like a dude spending like $30,000 a month.
0:08:59 So I’m going to find some developer in India that I can do pair programming with so I can
0:09:03 get the day cycle, he can get the night cycle, and we both can maximize together the quota
0:09:03 for the account.
0:09:04 Is that the future then?
0:09:09 I mean, but it’s clear like people are taking advantage of the negative gross margin, like
0:09:11 sort of subscriptions that are offered.
0:09:14 I think Anthropic probably makes a positive gross margin off of my subscription.
0:09:15 I don’t code enough.
0:09:17 But there’s plenty of people that are definitely losing money.
0:09:19 And so as you said, it’s an economic.
0:09:22 It’ll push more and more to, I think, just usage-based pricing, right?
0:09:27 If you have an underlying commodity that you’re reselling to some degree that has, that is
0:09:29 that large a part of your cost of goods, right?
0:09:30 You need to go to usage-based pricing.
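A rough back-of-envelope on why flat “unlimited” plans collide with heavy usage. All the numbers below (serving cost, subscription price, token volumes) are invented round figures, not any provider’s actual rates.

```python
# Back-of-envelope on why flat-fee coding subscriptions break down.
# All prices are invented round numbers, not any provider's actual rates.

COST_PER_M_TOKENS = 10.0          # assumed blended serving cost, $ per 1M tokens
FLAT_FEE = 200.0                  # assumed "unlimited" monthly subscription, $

def monthly_margin(tokens_millions: float) -> float:
    """Provider gross margin on one flat-fee subscriber for the month."""
    return FLAT_FEE - tokens_millions * COST_PER_M_TOKENS

casual = monthly_margin(5)        # light user: 5M tokens/month
power = monthly_margin(3000)      # leaderboard user burning ~$30K of compute

print(casual)   # 150.0  -> profitable subscriber
print(power)    # -29800.0 -> deeply negative gross margin
```

With variance that wide between subscribers, either the price has to track usage or the rate limits have to, which is exactly the shift described above.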
0:09:35 How much do you think the like customer capture and stickiness for these code products is?
0:09:37 I’m curious what you think on that, right?
0:09:42 Once you use an IDE, once you integrate one of the CLI products in, like how sticky is it?
0:09:43 Or is it people just switch like that?
0:09:44 That is a billion dollar question.
0:09:45 That’s a very conservative estimate.
0:09:52 Look, Andrej Karpathy has this great slide where he basically says, if you’re building an agentic system today, right?
0:09:54 But fundamentally, what it is is this loop, right?
0:09:57 Where half of the loop is the model thinking, right?
0:09:58 And it’s trying to do something.
0:10:01 The other half is then the user verifying what did the agent do?
0:10:01 Is it the right thing?
0:10:03 Providing feedback and trying to steer it in the right direction.
0:10:05 Because it can’t run forever.
0:10:06 Eventually, you need to steer it back.
0:10:08 One half of that is the model provider, right?
0:10:09 They’re trying to build the best models.
0:10:14 The other half is really about, I think, designing the best possible UI to enable the user to give feedback.
0:10:15 And I think there’s value in that.
0:10:17 So I think there’s a certain amount of stickiness in there, right?
0:10:20 So what are all the different tools, like in terms of visual, like, say, take code editing, right?
0:10:23 How can I most easily visualize what the code changes are?
0:10:25 How can I most easily visualize, you know, what they impact?
0:10:26 Which files?
0:10:31 You know, how can I, for small changes, get very quick feedback versus for complex ones, you know, get complex feedback?
0:10:33 There’s some tools that actually draw diagrams for you of what they do, right?
0:10:35 So I think this will be the battle.
0:10:36 I think there’s stickiness in that, right?
0:10:37 How much exactly?
0:10:42 So in that sense, like, people should be doing subscriptions to get people locked in, right?
0:10:45 Instead of moving to usage-based pricing.
0:10:50 Well, I think it’s the customers that don’t want to do usage-based pricing because it’s so hard to guarantee.
0:10:52 It’s so hard for it to get away from them.
0:10:58 And you actually want guarantees and you’re willing to commit to pretty high spend in order to not have usage-based pricing.
0:11:00 I think it’s the model companies that want usage-based pricing.
0:11:08 I think with consumers, it’s frankly very hard to not have usage-based pricing just because the variability is so massive, right?
0:11:14 If it’s us coding versus somebody who does this as their full-time job, right, you just have a factor of 20 or so difference in usage.
0:11:16 That costs a lot of money, right?
0:11:21 I think for enterprises, we could see, like, more flat fee pricing because we can average it out more.
0:11:28 When you have a developer that’s using it all day, you kind of know, in a general sense, like, how many hours a day they’re programming and what that sort of looks like.
0:11:29 The vibe quotas are harder.
0:11:30 Yeah.
0:11:35 Before we leave OpenAI, I want to ask a broad question, which is if someone was sitting here and saying,
0:11:40 Hey, Dylan, I’ll listen to anything you tell me to do, any advice you have, as long as it makes OpenAI more valuable, what would you tell them?
0:11:55 I would say immediately launch a method for you to input your credit card into ChatGPT and agree that for anything it agentically does for you, it’ll take X cut, and then launch that product where it does shopping, right?
0:12:08 Because, like, everyone knows that, like, Anthropic and OpenAI and all the other labs are buying RL environments of Amazon and of Shopify and of Etsy and of all the different ways to shop on the internet, of airline websites, right?
0:12:10 Now, just like, hey, integrate my calendar.
0:12:14 I want to fly to there on Thursday, make sure I don’t miss a meeting.
0:12:15 Cool, book, right?
0:12:17 Do that integration, like, super well.
0:12:20 Know my preferences on whether I like aisle or window, all this stuff, right?
0:12:21 And just take a take rate.
0:12:24 I think this will make them so much money the moment they launch it.
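The take-rate idea sketches out simply: revenue scales as users × purchases × basket × cut. Every figure below is an assumption chosen for illustration, not a disclosed OpenAI number.

```python
# Illustrative agentic-commerce take-rate math.
# All inputs are assumptions; none are disclosed figures.

free_users = 500_000_000          # assumed free users who might transact
purchases_per_user_year = 1.0     # assumed completed agentic purchases per user/year
avg_basket = 80.0                 # assumed average order value, $
take_rate = 0.05                  # assumed cut per transaction

annual_revenue = free_users * purchases_per_user_year * avg_basket * take_rate
print(f"${annual_revenue / 1e9:.0f}B per year")   # $2B per year
```

Even at these deliberately modest assumptions the free tier stops being pure cost, which is the argument being made here.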
0:12:33 And I think they’re working on it already, but I’d like to hear how he thinks about it because he’s shifted his tone massively on, like, ads over the last six months, right?
0:12:34 He used to be like, no way.
0:12:36 And now he’s like, oh, maybe, you know.
0:12:38 There’s a way to do it without harming the user.
0:12:42 And I think this is how you monetize the free user, right?
0:12:47 So I think that’s probably what I’d tell him slash ask him about, like, a whole line of questions around this.
0:12:49 Well, he’s coming on the podcast in a few weeks, so we’ll ask him.
0:12:51 I want to shift to NVIDIA.
0:12:52 NVIDIA’s having a monster year.
0:12:54 They’re up almost 70%.
0:12:56 What are the possible paths from here?
0:12:57 How do you see it playing out?
0:13:01 Depends, like, how pilled you are on, like, the continued growth.
0:13:03 But I think you guys have a good vantage point.
0:13:10 We have a good vantage point of how fast revenue is growing for a lot of these companies, especially the code companies, but even many other applications.
0:13:14 I think we can clearly see the demand side is accelerating, right?
0:13:17 And then if you look at the training side, I think the race is on.
0:13:18 Meta’s upping hugely.
0:13:20 Google’s upping hugely.
0:13:34 If you just look at, again, just OpenAI and Anthropic and the compute that they have and are getting this year from Google and Amazon for Anthropic and from Microsoft, CoreWeave, Oracle for OpenAI, 30% of the chips are going to them.
0:13:36 Just those two companies.
0:13:40 But then it’s, like, okay, well, the other 70% of the chips, who’s taking those?
0:13:42 Well, one-third of it is, like, ads, right?
0:13:46 Whether it be ByteDance or Meta or many of the other people who are doing ads.
0:13:50 So then it’s still like, okay, well, where’s the last third of the chips going?
0:13:57 Well, they’re, like, mostly uneconomic providers who I don’t think it’s, like, an obvious bet that they’re going to, you know, keep raising bigger and bigger rounds.
0:14:10 So what happens there, I think, with the, you know, we talked about, like, coding, right, like, earlier, actually, the Qwen3 Coder model is actually super cheap if you’re running it on-prem or if you’re running it in the cloud with all these inference libraries.
0:14:11 And so, like, there’s stuff like that as well.
0:14:14 So I think the question is, like, how much does it keep growing?
0:14:20 Because clearly, I think the first third is definitely skyrocketing, right, the OpenAI and Anthropic lab spend.
0:14:22 The second third of, like, ads is going to grow.
0:14:27 It’s not going to grow, like, crazy, but I think there’s definitely an inflection point that could be hit with Gen.AI ads.
0:14:31 I know Meta’s been experimenting with it a lot, but I could totally be convinced that there’s going to be a huge inflection.
0:14:38 And the take rate there, right, where you start showing me personalized ads, like, every person in the ad, like, looks like me, and I’ll be like, okay, yes.
0:14:42 Except, like, slightly better, so I, like, feel better, right, and I’m, like, I want to buy it, yeah.
0:14:45 I have no idea how this is going to scale, right?
0:14:49 But if you ask the question, how much could it scale, right, like, how much value are we creating here?
0:14:52 Can we create enough value to actually keep growing for a long time?
0:14:55 If you just take AI as software development, right?
0:14:55 Yeah.
0:15:00 We know we can easily get about 15% more productivity out of a developer.
0:15:00 I don’t think that’s right.
0:15:01 I think it’s way higher.
0:15:08 No, no, but the straight, like, I talk to a lot of enterprises, like, a classical enterprise, straight up GitHub Copilot deployment, that gives you about 15%.
0:15:09 We can do much more than that.
0:15:12 But, bro, like, you know how bad GitHub Copilot is?
0:15:16 Like, how did they, how, look at the revenue ARR chart, it’s so funny.
0:15:28 It’s so funny if you look at the revenue ARR chart, it’s, like, Claude Code in three months has surpassed that, and Cursor, you know, easily surpassed that, and then, like, even, like, companies like Replit are, like, and Windsurf slash Cognition are, like, going to pass them.
0:15:30 Like, it’s, like, you’re preaching to the choir.
0:15:30 What’s going on?
0:15:32 So, look, let’s assume we can get this to 100%.
0:15:33 Yeah.
0:15:35 So, as we can double the productivity of a developer, right?
0:15:38 About 30 million developers worldwide, give or take.
0:15:38 Yeah.
0:15:39 Right?
0:15:41 Let’s say 100K value add per developer.
0:15:42 This might be a little high worldwide.
0:15:44 The US is low, but worldwide is high.
0:15:45 So, it’s $3 trillion.
0:15:46 Yeah, yeah.
0:15:46 Right?
0:15:51 So, we’re probably building technology here, which adds $3 trillion of GDP value.
0:15:54 In theory, we could put that into GPUs, because that’s the main cost factor here.
0:15:57 Just from a coding model, like, ignoring every other use case.
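Guido’s arithmetic above can be written out explicitly; the 30 million developers and ~$100K value-add per developer are the rough figures from the conversation, not precise data.

```python
# Worked version of the back-of-envelope from the conversation:
# doubling productivity ~ one extra developer's worth of value per developer.

developers = 30_000_000        # ~30M developers worldwide (rough figure cited)
value_per_dev = 100_000        # ~$100K assumed value added per developer per year
productivity_gain = 1.0        # "get this to 100%": double the productivity

total_value = developers * value_per_dev * productivity_gain
print(f"${total_value / 1e12:.0f} trillion per year")   # $3 trillion per year
```

The sensitivity is linear in all three inputs, so halving the value-add figure (closer to a worldwide average) still leaves a $1.5 trillion ceiling from coding alone.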
0:16:01 So, at least, in theory, the value generation is here to keep growing, right?
0:16:03 Now, how that translates to the industry is much more complicated.
0:16:07 I think we’ve already seen AI’s value creation exceed.
0:16:12 So, sort of, there’s, like, the whole, like, the famous, like, oh, 300 billion problem, or
0:16:13 200 billion problem now.
0:16:14 It’s the $600 billion problem now, I’m sure.
0:16:17 So, Sequoia is going to put out, like, the $1.8 trillion problem, right?
0:16:17 Soon enough.
0:16:21 But, like, there is some, like, reality in that, of course.
0:16:26 But, you know, it ignores that, like, infrastructure spend today is accounting for five years of revenue,
0:16:29 not, like, one, and the revenue looks like this, not, like, flat line.
0:16:37 But I think the main thing is that AI is already generating more value than the spend.
0:16:39 It’s that the value capture is broken, right?
0:16:44 Like, I legitimately believe OpenAI is not even capturing 10% of the value they’ve created in
0:16:47 the world already, just by usage of chat, right?
0:16:52 And I think the same applies to, you know, Anthropic and Cursor and whoever else you’re
0:16:52 looking at.
0:16:56 I think the value capture is really broken.
0:17:01 Even, like, internally, I think, like, what we’ve been able to do with, like, four devs
0:17:02 in terms of, like, automation.
0:17:07 Like, our spend on, like, Gemini API is absurdly low.
0:17:10 And yet we go through every single permit and regulatory filing around every single data
0:17:11 center with AI.
0:17:14 And it’s, like, and we take satellite photos of every data center.
0:17:19 And we, like, we’re able to label our data set and then recognize what generators people
0:17:22 are using, what, like, cooling towers and the construction progress and substations.
0:17:24 All this stuff is, like, automated.
0:17:26 And it’s only possible because of GenAI.
0:17:29 But, and we do it with, like, very few developers.
0:17:33 And then, like, the value capture that I’m able to generate by selling this data, by consulting
0:17:34 with it is so high.
0:17:38 But the companies making it, like, they get nothing out of it, right?
0:17:44 Like, I think this, like, there is a value capture challenge here: the creation far
0:17:46 exceeds the capture, right?
0:17:52 And as you get models like GPT-5 or open source models, like, continuing to drive it down,
0:17:56 it’s, like, the value capture is just harder and harder and harder for these companies
0:18:00 because they’re making, you know, 50% gross margin on inference if they’re, you know, or
0:18:01 less in many cases.
0:18:06 In so many words, you’re saying we’re getting commoditized and therefore you can’t capture
0:18:11 the value and thus you should temper your expectations of how much you can spend on GPUs.
0:18:16 Well, no, I think you can, I think there’s still ways to, like, inflect hugely on value capture,
0:18:16 right?
0:18:20 Like I mentioned, the ads are a huge value capture.
0:18:21 But that needs to happen before.
0:18:23 Before we see a massive increase.
0:18:27 Um, no, I think, I think the other thing is, like, there’s a lot of capital that’s not
0:18:27 been spent, right?
0:18:34 Like, the hyperscalers still can grow CapEx, uh, 20, 30% next year, right?
0:18:35 From what they’re doing this year.
0:18:39 In addition, companies like CoreWeave and Oracle, uh, because they’re tapping capital markets
0:18:42 can raise way more than 20 to 30% CapEx.
0:18:47 And then you go down the list further and it’s like, oh, the largest infrastructure funds in
0:18:49 the world, like Brookfield and Blackstone.
0:18:52 Well, actually, they’re, they’re turning all of their eyes to investing even more in the
0:18:54 infrastructure, AI infra.
0:18:59 Um, and then you’re like, the sovereign wealth funds of the world, like the G42s or, you know,
0:19:04 um, the Norway one, or GIC in, um, Singapore.
0:19:07 Like, these people have barely started touching AI.
0:19:11 And so I think there’s a whole lot more CapEx that can come without it being necessarily,
0:19:12 like, economically motivated day one.
0:19:18 Um, I’m also saying, like, economically motivated CapEx can only grow, like, so much.
0:19:23 But there’s so much other spend where it’s not clear, you know, if you
0:19:27 have a spreadsheet and you’re basing it on real business, that you should actually
0:19:27 spend this much.
0:19:31 But people will, because they believe, I believe, I think you believe,
0:19:35 that you’ll get profit out of the infra.
0:19:41 But there’s no, like, 100% certain, you know, way to argue it.
0:19:41 Yeah.
0:19:46 How threatened, uh, is NVIDIA by custom silicon?
0:19:53 I think that’s the biggest thing, right, is, um, when we look at orders from Google and
0:19:59 from, um, Amazon, right, and Meta, for their custom silicon, not
0:20:01 Microsoft, their custom silicon kind of sucks.
0:20:05 Uh, but the other three, they’re really upping their orders massively over the last year.
0:20:09 Um, you know, Amazon is making millions of Trainium chips.
0:20:11 Google’s making millions of TPUs.
0:20:14 Um, TPUs clearly are, like, 100% utilized, right?
0:20:15 Yeah.
0:20:20 Um, Trainium’s not there, but I think Amazon will figure out how to do that, um, and Anthropic
0:20:20 will.
0:20:26 Um, so, so I think, I think that’s the biggest threat to NVIDIA is that people figure out how
0:20:28 to use custom silicon more broadly.
0:20:34 Um, and this sort of becomes the sort of, like, if AI is concentrated, then custom silicon will
0:20:35 do better.
0:20:38 Um, and that’s not even talking about, like, OpenAI’s silicon team and stuff, right?
0:20:44 Like, if AI is really concentrated, uh, then, then they’ll do better, uh, custom silicon.
0:20:49 But if it gets dispersed broadly because there’s all these open source models from China, um,
0:20:54 and there’s all these, um, open source software libraries from, you know, NVIDIA and China,
0:20:58 and it makes the deployment costs, like, rock bottom, then potentially.
0:20:59 Hear me out here.
0:21:05 If, if Google’s TPU is, is able to compete with NVIDIA, in theory, it could do it on the,
0:21:06 on the open market.
0:21:08 NVIDIA is worth more than Google these days.
0:21:10 Shouldn’t Google start selling the chips to everyone?
0:21:12 I mean, in theory, they should be able to achieve a higher market cap.
0:21:15 I, I, um, I absolutely think so.
0:21:18 I think Google’s even discussing it, um, internally.
0:21:25 I think it would require a big reorg of culture, um, and a big reorg of, like, how Google Cloud
0:21:29 works, um, and how the TPU team works and how the JAX software team and XLA software teams
0:21:30 work.
0:21:32 Um, I totally think they could.
0:21:37 Um, it would just take them, like, shaking themselves pretty hard to be able to do it.
0:21:43 Um, yeah, but I, I, I totally think Google should sell TPUs externally.
0:21:45 Not just renting, but, like, physically.
0:21:51 It’s, it’s kind of funny if a side hobby, in theory, has a higher company value potential
0:21:52 than your main product.
0:21:55 Than your entire business, especially as you think about the degradation of search as a
0:21:55 core business.
0:21:56 I mean.
0:21:59 Yeah, I think, but I think, like, if you were to ask, like, Sergey, right?
0:22:04 Like, hey, do you think selling chips and, and racks is more valuable?
0:22:09 Or a cloud or, or Gemini, um, he’d be like, no, no, no, no, no, no.
0:22:11 Like, Gemini is going to be worth way, way, way more.
0:22:12 It’s just not yet today.
0:22:13 Right?
0:22:18 Um, and so I think, like, like, today you say NVIDIA is the most, again, it’s like a whole
0:22:19 concentration thing, right?
0:22:23 If the world is super concentrated in terms of customers, then NVIDIA will not be the most
0:22:24 valuable company in the world.
0:22:24 Right?
0:22:32 Um, but if it gets dispersed more and more, um, which arguably we’re starting to see with a
0:22:36 lot of these open source models getting better and better and better, um, and with ease of
0:22:41 deploying them getting better, then you would see, I think you could argue NVIDIA will remain
0:22:43 the most valuable company in the world for a long period of time.
0:22:49 Um, and historically, though, software has eaten the world in most markets, right?
0:22:54 I mean, like, if you look at, uh, early networking days, Cisco was the most valuable company on the
0:22:55 planet, right?
0:22:57 for a while, it’s no longer, right?
0:23:02 They’re the guys that build services on top, like, like Google or Amazon or, or Meta eventually
0:23:02 eclipsed them.
0:23:05 Well, which is why NVIDIA is like making all these software libraries, right?
0:23:08 Like that’s, that’s, and they’re trying to commoditize inference, right?
0:23:13 Like, um, I, you guys don’t, I think even have an inference API provider investment, do you?
0:23:17 Um, well, we have, we have all kinds of model providers.
0:23:20 Model providers, but I’m talking about a pure API provider investment, I think, right?
0:23:21 Is that correct?
0:23:27 I think I talked to, um, one of the team members, maybe, maybe Rajko or someone about
0:23:31 like why you guys don’t, didn’t invest in, like, you know, a Together or a Fireworks.
0:23:35 And sort of the argument was like, well, we think inference, just serving models, uh, alone
0:23:37 without making them will sort of be commoditized.
0:23:38 Yeah.
0:23:38 Right.
0:23:41 Um, we have some in the stable diffusion ecosystem, like with, like, Fal.
0:23:42 With Fal, yeah.
0:23:45 Yeah, it’s a little bit different, uh, dynamics there, I think.
0:23:49 They, they tend to make much more compound models than the, the LLM folks,
0:23:49 I think.
0:23:50 Yeah, yeah.
0:23:54 But, but like, you guys don’t have one of these, like, you know, Baseten or any of these
0:23:59 sort of API investments, because you think, um, this is from someone on the infra
0:24:03 team, that it’ll get commoditized because of the software NVIDIA’s making, and because of
0:24:07 vLLM and SGLang, which is, like, open source software coming out of Berkeley that now, you know, sort
0:24:09 of has its own ecosystem
0:24:15 and is supported by many. Like, this being commoditized means that API providers aren’t
0:24:18 necessarily worth a ton, right?
0:24:19 Is sort of your argument, maybe.
0:24:24 Um, I think that’s, that’s relevant to this whole thing, which is, you know, why, right?
0:24:24 Like, why would you do this?
0:24:27 Shifting gears, what about the, the Silicon startups?
0:24:29 Uh, what’s, what’s your take on those?
0:24:32 I mean, there’s, there’s a ton of capital flowing into that, right?
0:24:37 We’ve seen, I don’t have numbers, but probably billions, uh, being invested in
0:24:38 chip startups.
0:24:39 Yeah, for sure.
0:24:43 I mean, like whether you’re looking at, like, um, you know, companies like, I think it’s
0:24:50 like pretty impressive that a few companies, like, um, Etched and Rivos, um, and a number
0:24:54 of other companies, you know, MatX and others, like have gotten the amount of funding they’ve
0:24:56 had without even launching a chip.
0:24:57 Right.
0:25:02 Um, you know, in the past, like, yeah, Silicon companies would make money or raise money, but
0:25:05 they would at least launch a chip before they get a, you know, a big round.
0:25:09 But like, uh, Etched and Rivos, like, have raised, you know, a lot of money without ever
0:25:13 launching a chip publicly, which I think is, I mean, it speaks to, well, like, yes, silicon
0:25:16 is super, uh, capital intensive.
0:25:20 If you’re building a chip, especially an accelerator, which has so many moving pieces.
0:25:25 Um, and there’s, there’s like, there’s like 10 different AI accelerator companies out there,
0:25:25 right?
0:25:27 Like that are newish in the last few years.
0:25:28 I think there’s a lot more.
0:25:30 That are like, yeah, yeah, that’s fair.
0:25:35 Um, and then, and then, and then there’s the old guard, which continues to raise
0:25:35 money, right?
0:25:41 Like Groq and Cerebras and, and, and SambaNova and, and Tenstorrent and so on and so forth.
0:25:41 Right.
0:25:45 Like, um, or Graphcore getting bought out by SoftBank and SoftBank dumping money into this
0:25:46 effort as well.
0:25:46 Right.
0:25:52 There’s, there’s a lot of capital being invested to, uh, displace NVIDIA from its
0:25:53 top position.
0:25:57 Um, but it becomes challenging, right?
0:25:59 It’s like, how do you beat NVIDIA?
0:25:59 Right.
0:26:04 Like the hyperscalers, I think are like kind of lucky in that they can, they can do mostly
0:26:05 the same thing as NVIDIA.
0:26:05 Right.
0:26:07 They have a captive customer, which is themselves.
0:26:07 Right.
0:26:10 And it’s, they can, they can just win on supply chain, right?
0:26:11 Like I’m using cheaper providers.
0:26:13 It’s a margin compression exercise, essentially.
0:26:14 Yeah.
0:26:14 Yeah.
0:26:17 And maybe, maybe for certain workloads, like Meta for recommendation systems, they’ll have
0:26:19 a better, you know, they can specialize more.
0:26:24 Um, but for the most part, it’s like, no, we’re, we’re targeting the same workloads.
0:26:29 We can just simplify supply chain or, or in-house a lot of it and compress margin and it’ll be
0:26:29 fine.
0:26:34 Um, but in the case of, you know, these, these other companies, it’s like, well, they don’t
0:26:35 have a captive customer.
0:26:39 So now you have to contend with, well, I’m using the same ecosystem.
0:26:44 Um, and either I can use some custom Silicon provider who’s going to take a margin anyways
0:26:50 on top and that’s going to compress my, like what I can sell for, or I can try and in-house
0:26:50 everything.
0:26:52 But then it’s like, this is really hard, right?
0:26:54 Like I’m going to do all the software design.
0:26:55 I’m going to do all the Silicon design.
0:26:57 I’m going to build all this different IP.
0:27:01 I’m going to manage the supply chain on chips, on racks, on everything.
0:27:01 Right.
0:27:04 Ends up being a huge effort in terms of team size.
0:27:10 Um, and in the end, it’s like, hey, NVIDIA makes a 75% gross margin.
0:27:13 Um, AMD sells their GPUs for 50% gross margin.
0:27:18 Um, and they have a hard time out-engineering NVIDIA, and they’re great at engineering, right?
0:27:23 But yet they still take more silicon area, more memory, to achieve
0:27:25 the same performance, and they have to sell for less.
0:27:26 So their margin gets compressed.
0:27:27 That makes sense.
0:27:32 Look, the, I think historically, if you look at it, typically new entrants in markets didn’t
0:27:35 win by marginally improving on something existing.
0:27:39 That happens sometimes, but, but more likely they, they jumped on some, some kind of disruptive
0:27:40 technology leap, right?
0:27:42 Where it’s like, we have a different approach, we have different technology.
0:27:44 Is that possible here?
0:27:48 I mean, to, to, to some degree, um, maybe I’m oversimplifying a little bit, but I think
0:27:53 part of the reason why the transformer model won was because it runs so incredibly great on
0:27:54 GPUs, right?
0:27:58 Like a, like a recurrent neural network performs similarly, it looks like, but it, it runs terribly
0:27:59 on a GPU.
0:28:03 So, so did we sort of pick the model for an architecture and now it’s, it’s hard to come
0:28:07 up with an architecture that, uh, you know, really, well, it’s, it’s, it’s hardware software
0:28:08 co-design, right?
0:28:12 Like there’s like, there’s all this hype about, um, neuromorphic computing, right?
0:28:14 Like theoretically it’s amazing and super efficient.
0:28:15 It’s like, okay, great.
0:28:17 Like there’s no ecosystem of hardware.
0:28:18 There’s no ecosystem of software.
0:28:23 It would take like, you know, tens of thousands of people who are the best AI today focusing
0:28:26 on that to even prove out if it’s worthwhile or not, right?
0:28:29 Um, on a hardware side, on a software side, on a model side.
0:28:36 And so like, you look at, like, Groq, Cerebras, SambaNova, um, they all, like, sort of over-indexed
0:28:38 to the models that were leading at the time when they designed their chips.
0:28:40 And so they made certain trade-offs, right?
0:28:44 They put a lot more memory on chip and NVIDIA was like, well, we’re not going to do that.
0:28:45 A lot faster at least, right?
0:28:50 Well, more like, like if you compare the amount of memory of SRAM on NVIDIA’s chips, it’s
0:28:51 much, much lower.
0:28:52 Yes, correct.
0:28:53 They went SRAM instead of DRAM.
0:28:55 But then they usually have less DRAM.
0:28:56 So there’s a trade-off there as well.
0:28:56 Right.
0:28:57 There’s less DRAM.
0:28:58 There’s more SRAM.
0:29:01 And because there’s more SRAM on the chip, you have to have less compute on the chip.
0:29:03 And so they ended up losing, right?
0:29:06 Um, because the model sizes got too big and all this, right?
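The SRAM-versus-compute trade-off being described can be sketched as a toy die-area budget. All the area and performance numbers below are invented for illustration; they don't correspond to any real chip.

```python
# Toy die-area budget for the SRAM-vs-compute trade-off described above.
# Every number here is a made-up illustration, not a real chip spec.

DIE_AREA_MM2 = 800.0  # assume a reticle-limited die

def chip(sram_mb: float,
         sram_mm2_per_mb: float = 0.5,
         compute_tflops_per_mm2: float = 0.25) -> dict:
    """Spend die area on SRAM first; whatever area is left becomes compute."""
    sram_area = sram_mb * sram_mm2_per_mb
    compute_area = DIE_AREA_MM2 - sram_area
    return {"sram_mb": sram_mb,
            "compute_tflops": compute_area * compute_tflops_per_mm2}

# An SRAM-heavy bet (Groq/Cerebras-style) vs. a small-cache, HBM-fed design.
sram_heavy = chip(sram_mb=600)  # huge on-chip memory, much less compute left
hbm_heavy = chip(sram_mb=60)    # small caches, big compute, DRAM off-chip

# If the whole model fits in SRAM, the SRAM-heavy chip wins on bandwidth;
# once models outgrow on-chip memory, its compute deficit dominates.
```

In this toy budget the SRAM-heavy design ends up with 125 units of compute versus 192.5 for the HBM-fed one, which is the "more SRAM, so less compute on the chip, so they ended up losing once models got too big" dynamic.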
0:29:12 Um, and so you have this like super weird dynamic where, uh, they bet on something that
0:29:13 was actually better, right?
0:29:18 Like I have no doubt that Cerebras would run certain types of models better than NVIDIA, or
0:29:20 Groq would, or, hey, Dojo, right?
0:29:23 Tesla’s Dojo, you know,
0:29:26 would run certain types of models way better than NVIDIA’s chips, uh, because they’re
0:29:27 optimized for that.
0:29:31 But then it’s like, oh, well actually even in vision tasks, you use vision transformers
0:29:31 now.
0:29:32 So it’s like, okay, cool.
0:29:35 Um, as model sizes grew and all these things.
0:29:40 So it ends up being a, you know, catch 22 in that, like you optimize for something.
0:29:44 And so now like today you have this new age of AI accelerator companies that are like, okay,
0:29:45 we’re going to optimize for transformers.
0:29:49 But the time they started designing, they’re like, okay, transformers are dense models that
0:29:50 are this big.
0:29:53 What’s the best, you know, the hidden dimension is 8K and your batch sizes are this big and
0:29:55 your sequence lengths are this big.
0:29:57 So let’s just make a super large systolic array.
0:30:01 Um, so you can, you know, create the maximum efficiency.
0:30:04 And then it turns out, oh, look at DeepSeek or, you know, go look at what the labs are doing.
0:30:06 Actually, their, their shapes are much smaller.
0:30:10 Actually, you need to do a bunch of small matrix multiplies, not massive, massive, massive, you
0:30:12 know, singular matrix multiplies per layer.
0:30:16 And then it ends up, you know, oh, well, that chip you’re designing for that is actually
0:30:17 not super effective for that.
0:30:23 And so the software is evolving constantly because of what, um, works best on
0:30:24 NVIDIA.
0:30:28 And you see that with, you know, whether it be what DeepSeek’s doing or Alibaba’s doing or
0:30:29 what the labs are doing internally.
0:30:33 Um, and you even see this like for Google, right?
0:30:39 Like their open source Gemma models make different decisions because the shapes of a TPU are different
0:30:39 than a GPU.
0:30:45 Um, and those, the GPU and the TPU are actually not that far apart, right?
0:30:48 Like you would say, yes, they’re very different, but like Blackwell and TPUs are, are very, very,
0:30:50 they’re converging on similar designs actually.
0:30:56 Um, whereas, like, to beat NVIDIA, you can’t just win on supply chain, you know, right?
0:30:57 You don’t have this captive customer.
0:31:02 So now you need to do something, you know, that will give you 5X advantage, right?
0:31:04 In hardware efficiency for a certain type of workload.
0:31:07 And then pray the workload doesn’t shift, right?
0:31:10 Because NVIDIA is also optimizing their architecture generation.
0:31:16 They’ve added a lot of stuff to, um, make their chips way better for the existing
0:31:16 models.
0:31:20 But it’s like, they’re taking, you know, large steps every year, every two years towards
0:31:21 something.
0:31:25 Whereas you have to like go way over there in left field and hope that models stay over
0:31:25 there.
0:31:25 Right.
0:31:31 Um, because you have to win by 5X because NVIDIA is going to have supply chain efficiency
0:31:31 over you.
0:31:35 They’re going to have time to market over you in terms of like a new process node or new
0:31:39 memory or, you know, whatever, whatever technology, right?
0:31:40 Even AMD, right?
0:31:41 They got to two nanometer before NVIDIA.
0:31:44 Um, they had higher density, uh, HBM.
0:31:50 Um, they use 3D stacking, all these things on supply chain that should be better than NVIDIA.
0:31:51 And yet they still lose.
0:31:52 They’re still the software angle, right?
0:31:53 NVIDIA is fantastic.
0:31:53 Yeah.
0:31:55 And then there’s software as well, right?
0:31:57 But it’s like, NVIDIA is going to have better networking than you.
0:31:58 They’re going to have better, uh, HBM.
0:32:00 They’re going to have better process node.
0:32:01 They’re going to come to market faster.
0:32:02 They’re going to be able to ramp faster.
0:32:06 They’re going to have better negotiations with whether it’s TSMC or SK Hynix and the memory
0:32:10 and silicon side or all the rack people or like copper cables, everything.
0:32:12 They’re going to have better cost efficiency.
0:32:13 So you have to be like 5X better.
0:32:17 But, but to be fair, if, if somebody had a viable competitor, which would even be marginally
0:32:21 cost competitive, if my guess is many of the big consumers of GPUs would immediately shift
0:32:23 some revenue there just to have a number two, right?
0:32:24 Just to, just to.
0:32:26 Well, that’s, that’s AMD today, right?
0:32:27 And Microsoft stopped.
0:32:27 Somewhat, yeah.
0:32:28 I mean, like.
0:32:30 There’s still pretty limited traction though, right?
0:32:31 Sure, but.
0:32:31 Yeah.
0:32:35 Meta continues to buy from them and Microsoft did buy a bunch and then they stopped because
0:32:41 it’s like, well, yes, they’re, you know, AMD is giving you all these advantages, but
0:32:43 it ends up still not being better on a performance per watt basis.
0:32:47 And they have a way bigger software team that are somewhat competitive on like all these
0:32:48 dynamics that I mentioned, right?
0:32:50 So you can’t just, like, do the same thing as NVIDIA
0:32:52 and do it better, right?
0:32:54 Or try and execute better, like AMD.
0:32:57 Like, you have to really leap forward in some other way.
0:33:02 But that’s the, the design cycle takes so long that models will shift, right?
0:33:05 Because they’re like, oh, what’s the next generation of TPU and GPU look like?
0:33:06 Okay, let’s optimize for that.
0:33:10 And the research path is, you know, like great.
0:33:14 Like, yes, neuromorphic computing could be the most optimal thing for us to do, but no
0:33:17 one’s working on that because you have to advance in the tech tree you’ve chosen, right?
0:33:20 If you restart the tech tree, you’re going to be like, well, this sucks.
0:33:24 And so like, if it branches this way and you’re over here, you’re screwed.
0:33:26 Because you have to be 5X better.
0:33:27 There’s a moat.
0:33:31 Because the supply chain stuff means that 5X actually turns into a 2.5X.
0:33:34 And then NVIDIA can compress their margin a little bit if you’re actually competitive.
0:33:37 And then that 2.5X becomes like a 50% better.
0:33:38 And then, yeah.
0:33:42 So it’s like, it ends up being way too difficult to, and the software stuff, right?
0:33:45 Everything like takes your 5X and makes it like, oh, you’re actually only 50% better.
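The erosion arithmetic in this exchange can be sketched directly. The 0.5 supply-chain penalty and the 75%-to-60% margin trim are illustrative assumptions chosen to match the "5x becomes 2.5x becomes ~50% better" numbers quoted above.

```python
# How a 5x hardware edge erodes, per the argument above. All factors
# are illustrative assumptions.

raw_advantage = 5.0  # perf/$ edge on your target workload

# NVIDIA's supply-chain edge (better HBM/process pricing, faster ramp)
# is assumed here to halve your effective advantage.
after_supply = raw_advantage * 0.5  # 2.5x

# Margin compression: at 75% gross margin, price = cost / 0.25. If NVIDIA
# trims "a little bit" to 60% margin, price = cost / 0.40, so its price
# drops 1.6x and its perf/$ rises 1.6x against you.
price_ratio = (1 / 0.25) / (1 / 0.40)      # 1.6
after_margin = after_supply / price_ratio  # ~1.56x, i.e. roughly "50% better"
```

And that remaining ~50% is before the software-maturity gap takes its own bite, which is the closing point of the exchange.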
0:33:48 And defense supply chain, for sure.
0:33:49 Yeah, defense supply chain.
0:33:51 And then like, they get that, right?
0:33:56 Like, so it’s like, and Lutnick himself said we had to do this for rare earth minerals.
0:33:57 And it’s like, interesting.
0:34:04 China, there’s like provinces in China that have like rules that say the H20 is not efficient
0:34:09 enough to be deployed, which is like super bizarre, because it’s clearly the best AI chip China
0:34:10 has.
0:34:12 Huawei is still a little bit behind.
0:34:18 Well, what’s interesting is that, you know, efficiency is just not, is so much less of an
0:34:21 issue in China than here, because they just have the power infrastructure to be able to
0:34:21 support.
0:34:25 So even if they’re running less powerful chips, you know, you would imagine that it doesn’t
0:34:30 really matter because China has just such an infinite supply, infinite supply of power
0:34:32 that, you know, they’d sort of be okay with it.
0:34:35 So it’s interesting, which is, it’s a big challenge in America, right?
0:34:41 Like, there have been companies that, like, you
0:34:45 know, Jensen keeps saying he couldn’t give away H20s in America for free.
0:34:49 But I’ve literally, like, heard companies now say, like, yeah, no, I
0:34:52 wouldn’t, because, like, I only have this much power
0:34:57 in data centers ready to go over the next year. You know, if I bought an
0:35:00 H20, I’d literally have less compute capacity, and then I’d lose, right?
0:35:05 Whereas China doesn’t care, they can build these things.
0:35:06 They have the muscle.
0:35:11 I’m curious how this all shakes out.
0:35:12 You know, China’s posturing really hard.
0:35:15 They even like put out something that was like, we’re investigating to see if there’s
0:35:16 backdoors in the H20.
0:35:18 It’s like, there’s no backdoor in the H20, like chill.
0:35:24 You know, it’s like, you know, GPUs are usually, like, firewalled from the public internet
0:35:28 anyways; like, you step through stuff before you get to the GPU clusters.
0:35:31 So like a backdoor wouldn’t even matter.
0:35:33 I don’t know.
0:35:42 I think it’ll be interesting to see because China can definitely deploy way, way, way more
0:35:45 power to AI the moment they decide to.
0:35:49 But there’s these like, there’s like competing interests, right?
0:35:52 Like, because they want Huawei to be better than NVIDIA.
0:35:52 Yeah.
0:35:54 And then this is how NVIDIA argued to the administration.
0:35:59 They’re like, if we don’t do this... actually, I think it’s, like, a very powerful argument.
0:36:03 Like, for example, take Triton, which is a common ML library.
0:36:09 ByteDance has open sourced some stuff that plugs into this that is, like, super
0:36:09 awesome.
0:36:10 And there’s like all these other libraries.
0:36:12 It’s not just models that China open sources.
0:36:17 It’s like software for NVIDIA that Chinese companies open source.
0:36:23 In a sense, NVIDIA’s argument, again, was that by selling GPUs, they
0:36:28 were able to, you know, stop Huawei from building up a software ecosystem, and the Western ecosystem
0:36:28 is better.
0:36:32 But then on the flip side, it’s like, again, if you believe the models deliver more economic
0:36:37 value to society than the hardware, which, which I actually think they do, it’s just
0:36:42 there’s a value capture problem today, then you’re giving China way more by giving them
0:36:46 H20s and soon a version of Blackwell that’s cut down, like, like Trump said, right?
0:36:51 Versus, versus, you know, selling them the chips, right?
0:36:54 The economic value derived from selling them the chips is not as large as, you know, being
0:36:56 able to somehow sell them AI services.
0:37:00 So is, is China gatekeeping power for AI?
0:37:02 I don’t think so.
0:37:08 I think, I think again, like, there’s a lot of... what we see is that, like,
0:37:15 even with the H20 being sold into China, um, and, and future versions of
0:37:20 the chip, the H20E and other chips, um, we still see, like, Chinese companies like Alibaba,
0:37:26 uh, renting GPUs outside of China because the GPUs they can get outside of China are just so
0:37:28 much better on a dollar spend per performance basis.
0:37:32 Renting them or even like going through sort of like a Singaporean company that is effectively
0:37:36 a Chinese company and building data centers, um, and, and putting chips in them.
0:37:39 So it’s like, I don’t think China’s limiting the power per se.
0:37:43 It’s that, you know, Chinese companies are
0:37:47 growing their CapEx, uh, way more than U.S. companies on a percentage basis next year.
0:37:51 In absolute dollar numbers, you know, obviously the U.S. companies are spending more
0:37:51 still on AI.
0:37:57 Um, but on a percentage basis, Chinese companies are growing more next year, and you, you still
0:38:03 have the problem of, like, well, dollar spend to AI output, uh, in tokens or in whatever,
0:38:06 is going to be lower because these chips are worse.
0:38:09 So power is not the gating factor.
0:38:11 It’s always capital, right?
0:38:11 At least today.
0:38:11 Right.
0:38:15 Um, now China can spend a lot more capital if they wanted to.
0:38:20 Um, they’re subsidizing the semiconductor industry to the tune of like $150 to $200 billion
0:38:23 a year, uh, through SOEs, through CapEx that’s not generating revenue, et cetera.
0:38:26 So it’s not like they couldn’t do this to the AI ecosystem.
0:38:27 Right.
0:38:29 Given, you know, Meta’s CapEx is like 60 billion.
0:38:30 Right.
0:38:32 Um, and Google’s CapEx is like 80 billion.
0:38:32 Right.
0:38:36 Like they could totally spend way more than that on a single effort.
0:38:37 They just haven’t decided to.
0:38:41 Um, and I just think for the U S our build outs are constrained by power.
0:38:42 Right.
0:38:48 Like Google has a ton of TPUs sitting, waiting for data centers to be powered and ready
0:38:50 as does Meta with GPUs.
0:38:50 Right.
0:38:53 We’ve posted about how, uh, Meta is now building these like effectively tents.
0:38:58 Is this to some degree also coupled to their unwillingness to sell them to a broader ecosystem?
0:39:03 I mean, if they want to be confined to their own data centers and they, you know, didn’t
0:39:06 ramp data center, uh, build-out for their own
0:39:08 hyperscaler use cases quickly enough.
0:39:08 Right.
0:39:10 Then yes, that constrains them.
0:39:10 Right.
0:39:13 If, if they were on the open market, would we still be constrained?
0:39:13 Yeah.
0:39:14 Yeah, for sure.
0:39:19 Cause like companies like CoreWeave, you know, why is CoreWeave valuable is really because
0:39:21 they build infrastructure really fast.
0:39:21 Right.
0:39:25 Um, and, and their software is nice, I think, but, like, a lot of their customers are bare
0:39:25 metal.
0:39:26 Right.
0:39:28 Just, just replace the GPUs whenever they’re broken and network.
0:39:29 They do more aggressively.
0:39:31 And I think they’ll go anywhere.
0:39:32 Jensen likes them as well.
0:39:32 Yeah.
0:39:33 They’ll, they’ll go.
0:39:34 Yeah.
0:39:34 Yeah.
0:39:38 That’s very important as well, but, um, they’ll, like, go anywhere because it
0:39:41 unconcentrates the ecosystem, which is better for NVIDIA.
0:39:41 Right.
0:39:43 Having worked at Intel, I know exactly what was going through his mind.
0:39:44 Yeah.
0:39:48 So I think what’s really important is that like CoreWeave doesn’t care.
0:39:48 Right.
0:39:49 They’re like, Oh, crypto data center.
0:39:52 I will convert it to an AI data center.
0:39:52 Right.
0:39:56 They bought a company that’s doing crypto mining for, like, $10 billion, a company that was worth
0:39:58 like $2 billion a couple of years ago.
0:40:01 And it’s not because their Bitcoin mining business is growing.
0:40:03 It’s because they have power and data centers, right?
0:40:08 Like, anywhere and everywhere, people are trying to build power and data centers.
0:40:14 Um, and companies like, like CoreWeave and Oracle are moving to these. Actually, today, Google
0:40:18 just bought 8% of a crypto mining company, um, called TeraWulf.
0:40:18 Right.
0:40:20 Um, and-
0:40:21 Not because they’re getting into crypto mining.
0:40:23 No, because they need the data centers, right?
0:40:23 They want the power.
0:40:24 They need the power.
0:40:24 Right.
0:40:30 And it’s like, um, all the hyperscalers have, like, said screw off to their sustainability pledges
0:40:31 because they need power as fast as possible.
0:40:32 Yeah.
0:40:32 Right.
0:40:37 Um, they’re, you know, they’re doing things... it takes a little
0:40:43 bit longer to move the ship, but, like, even if you didn’t do it in your own self-built data
0:40:47 centers, there’s still a lot of challenges in the open market.
0:40:48 There’s a deficit, right?
0:40:54 Um, and that’s, that’s constraining American chip build-outs heavily.
0:41:00 Um, yes, others could maybe do it a little bit faster, like CoreWeave or others, right?
0:41:02 Um, Oracle’s got an open mind as well.
0:41:06 Um, but it’s still constraining U.S. build outs heavily.
0:41:09 Um, even, even though the capital has been spent, right?
0:41:13 The chips are, you know, 60 to 80% of the cost of the cluster, depending on what chips you’re
0:41:13 getting.
0:41:16 So it’s like, they’ve already bought the chips.
0:41:18 They just can’t put them anywhere because the data centers aren’t ready.
0:46:21 Applies to Google, applies to Microsoft, applies to Meta, applies to a lot of folks.
0:41:26 Um, I mean, it’s really hard to build infrastructure, power infrastructure in the U.S.
0:41:30 Power, grid interconnections, transmission, substations.
0:41:36 Uh, all of this stuff, like, like electrical contractors, electricians, um, in Texas, if
0:41:39 you’re willing to like be a travel electrician, it’s like, it’s like oil pay, right?
0:41:44 Like it used to be that, like, if you’re physically adept, you could go make, you know, a hundred
0:41:47 grand in West Texas, but like, who the fuck wants to do that?
0:41:52 Um, now it’s like, well, you could, you could go like 200 miles away from Dallas and what’s
0:41:57 still a reasonable town, um, and, and build a data center and work on the wiring within the
0:41:59 data center and all this other stuff, uh, the transmission stuff.
0:42:03 And your pay is up like two X now, uh, versus what it was just a few years ago.
0:42:05 This labor problem is a challenge too.
0:42:10 And it’s, it’s, yeah, I think in China, they don’t have any of these problems, but they
0:42:12 just haven’t spent the capital yet.
0:42:13 But capital is an issue as well.
0:42:16 Um, because of the scale of what’s being spent, right?
0:42:21 Like, like NVIDIA’s revenue this year is going to be like over $200 billion and next year
0:42:23 expects over 300 billion.
0:42:27 Plus Google is going to spend like $50 billion on TPU data centers, right?
0:42:31 And it’s like, and Amazon’s going to spend tons and tons on Tranium data centers.
0:42:37 It’s like the scale of dollars is, is quickly growing to, um, nation state level stuff.
0:42:43 Um, and, and what’s more important is being able to decide to spend the dollars and what’s
0:42:43 cost-effective.
0:42:49 And so to some extent, China’s still constrained by that, but they can smuggle chips in, they
0:42:50 can build data centers outside of China.
0:42:55 They can rent data centers outside of China, um, and be more and have the most cost-effective,
0:42:57 you know, Blackwell chips or whatever, right?
0:43:02 ByteDance is, you know, either the biggest or the second biggest customer of Google Cloud
0:43:03 for a reason, right?
0:43:08 And they’re getting, you know, they’re getting many, many Blackwells from them,
0:43:08 right?
0:43:12 And the same with Oracle and the same with Microsoft and all these other companies are renting tons
0:43:16 of chips to China anyways, because it’s more cost-effective to do that than build it yourself.
0:43:20 So it’s not like China has this mentality of, we only have to use our own chips. Well, the government does,
0:43:24 but the infrastructure companies don’t, um, like Alibaba, Tencent, ByteDance, et cetera.
0:43:27 What, what, what’s the end game for data centers?
0:43:30 I mean, like, we, we need more power, we need more cooling.
0:43:34 Will the end be every, all data centers would be next to a nuclear reactor or lots of solar,
0:43:40 you know, next to a deep level, uh, like, um, deep sea water that we use for cooling or something
0:43:40 like that?
0:43:47 Or what’s, um, I think that like cooling is, like the physical cooling of a data center are
0:43:50 like, you know, there’s this whole narrative about like, oh, AI uses so much power.
0:43:56 And it’s like not really, um, you know, farming alfalfa uses like a hundred X the water of,
0:44:00 of AI data centers, uh, even by the end of the decade, it’ll be the same.
0:44:01 And it’s like alfalfa is like worth very little.
0:44:06 So it’s like, there’s like, it’s like cooling is like, not that, you know, people have like
0:44:09 experimented with like, you know, undersea data centers to reduce the cooling cost.
0:44:11 But it’s like, it’s like five, 10% savings.
0:44:15 But then, like, if you want the water, you can get it out of the ocean rather than put the data center
0:44:15 into the ocean.
0:44:17 It’s like, if you want to service it, like you’re screwed, right?
0:44:22 So like the same with power, it’s like, we talk a lot about like, the power is not actually
0:44:22 that expensive.
0:44:24 It’s just hard to build, right?
0:44:25 And how to get to the right place.
0:44:30 And get it to the right place and convert it down to the voltages and all the stuff
0:44:31 that chips need.
0:44:35 So it’s less the magnitude of power and more where it is and how it moves.
0:44:36 Well, the magnitude too, right?
0:44:36 Like it’s going to be.
0:44:39 As a fraction of total worldwide energy consumption,
0:44:44 AI data centers are still a fraction of a percent.
0:44:45 Yeah, yeah, yeah.
0:44:51 Even by the end of the decade, you know, in the US, like 10% of our power will be AI
0:44:52 data centers, which is still, like...
0:44:53 Of electricity.
0:44:53 Of our electricity.
0:44:56 In terms of energy, that’s even a smaller fraction, right?
0:44:57 Oh, yeah, yeah.
0:44:57 Because you think about.
0:45:02 But by shifting to electric vehicles, we can probably make a bigger swing than, you know, with all
0:45:03 the AI data centers we can build.
0:45:06 And outside the US, like in Europe, like, that number is not moving up that fast.
0:45:11 And like all these other countries, I think we need to build a lot more power, but it’s
0:45:16 not like some, some crazy, crazy like amount.
0:45:19 It’s just like doing it properly is the hard thing.
0:45:20 Yeah, I agree.
0:45:25 And again, like, the cost of power, like, you go look at, like, these deals people are signing.
0:45:29 They’re still signing, like, even though the price has skyrocketed from, like, a few cents
0:45:32 a kilowatt hour for these massive, massive purchases to, like, 10 cents.
0:45:37 It’s still, you know, when you think about the full TCO of a cluster, you know, the GPU
0:45:40 cost, the networking, all of this stuff far outstrips the power.
0:45:40 Yeah.
0:45:41 And same with cooling.
0:45:48 What percentage is power from like if you do a four year amortized GPU data center, what
0:45:49 percentage will be power?
0:45:52 80% of the cost of a GPU data center if you’re building Blackwell is capital.
0:45:52 Yeah.
0:45:52 Right.
0:45:59 It’s the GPU purchases, it’s the networking, it’s the, it’s the physical data center conversion,
0:46:00 power conversion equipment.
0:46:03 All of this stuff is like 80% of the cost.
0:46:09 And then 20% is going to be your land and your power and your cooling and your cooling towers
0:46:11 and your backup power and your generators and all this stuff.
0:46:16 That’s like nothing, which is why it doesn’t matter if you spend, you know, 10% or 50%
0:46:16 more on that.
0:46:21 Because at the end of the day, the expensive thing, right?
0:46:24 Like this is why what Elon did would seem silly, right?
0:46:30 They spent a lot more money on, you know, generators outside the data center and these mobile chillers
0:46:34 to cool the water down for their liquid cooling instead of like the more cost effective option
0:46:36 because it got the data center up three months faster.
0:46:40 And so like that three months of additional training time is worth way, way, way more.
0:46:42 On a TCO basis, right?
0:46:46 The performance you got out of the chips and the time to market and all this is way, way faster.
0:46:51 And therefore it was the right decision, even though this part of the data center ballooned at cost.
0:46:55 Everything else is still there and you’re still paying for the chips.
0:46:57 And if they were sitting idle, it’s not worth it.
0:47:03 Just by like bypassing the grid, bypassing anything to do with interconnect, anything to do with public utilities.
0:47:03 Exactly.
0:47:04 Exactly.
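The 80/20 cost split and the time-to-market logic above can be put into one toy model. All dollar figures are illustrative assumptions, not real quotes.

```python
# Toy 4-year TCO for ~1 MW of Blackwell-class IT load, matching the rough
# 80/20 capital-vs-site split described above. Every figure is made up.

capital = 40_000_000  # GPUs, networking, power-conversion gear: the ~80% slice

site_4yr = (
    0.10 * 1_000 * 8760 * 4  # power: $0.10/kWh * 1,000 kW * 4 years ~= $3.5M
    + 1_500_000              # cooling towers, chillers, generators, backup
    + 5_000_000              # land, shell, everything else site-side
)

capital_share = capital / (capital + site_4yr)  # ~0.8

# The Colossus-style trade: doubling spend on generators and mobile chillers
# to come online 3 months sooner is cheap next to 3 months of the whole
# cluster's amortized cost sitting idle (to say nothing of the training lead).
rush_premium = 1.0 * 1_500_000                # +100% on the chiller/generator line
idle_3mo = (3 / 48) * (capital + site_4yr)    # ~$3.1M of amortized value
```

Under these assumptions the rush premium is well under the cost of three idle months, which is why overspending on the cheap 20% slice to unlock the expensive 80% slice is rational.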
0:47:05 What’s your take on Intel?
0:47:06 Where is Intel going?
0:47:09 I think the world, well, the U.S. needs Intel.
0:47:11 I think the world needs Intel.
0:47:17 I think the world needs Intel because, like, Samsung is doing worse than Intel on leading edge process development,
0:47:24 in my opinion, based even on various customers in the industry having done test chips at, like, Intel versus Samsung.
0:47:32 And I think the industry generally agrees that Intel is further along on, you know, sort of two nanometer class process technology than Samsung is.
0:47:34 But both are way behind TSMC.
0:47:38 And TSMC is a monopoly to some extent.
0:47:43 The number one question always people ask is like, why is TSMC not making more money?
0:47:48 Why are they only raising prices next year by, you know, three to 10%, depending on what it is?
0:47:50 It’s like TSMC is a monopoly.
0:47:56 Like they could raise a lot more, but they’re good Taiwanese people rather than like dirty American capitalists.
0:48:02 If TSMC was owned or was managed by Americans, I think most of the ownership is actually American in terms of the stock.
0:48:06 It’s on the New York Stock Exchange and all this like, you know, they would have raised prices a lot more.
0:48:17 And so there is this difficult, difficult thing to be done, that, hey, there’s one island that controls all leading edge semiconductors
0:48:21 and not just all leading edge, like the majority of trailing edge production as well.
0:48:23 Something needs to be done.
0:48:26 Intel is behind, but not like, not like absurdly so, right?
0:48:31 Like if something were to happen to Taiwan, Intel would have the most advanced technology in the world, right?
0:48:33 It’s just, it’s not economic.
0:48:36 Can you keep Intel as one company if you want them to be competitive?
0:48:44 I think the process of splitting it would take so much executive time and so much executive effort that you would have been bankrupt by then, right?
0:48:45 And that’s the big challenge.
0:48:47 Like I think Intel should be separate, right?
0:48:53 But to properly split the company and for all the management time that’s needed is like absurd.
0:48:58 And instead, what you need is, like, you need Lip-Bu Tan, who’s the CEO of Intel.
0:49:05 You know, there’s a lot of drama going around about him because he’s, he’s one of the greatest semiconductor investors ever, right?
0:49:07 He’s invested in so many different companies first.
0:49:20 You know, he was on the board of, like, SMIC, which is effectively China’s TSMC, which is a big drama, and he was the first investor in some of the biggest tool companies in China, because, you know, there’s a multipolar world there and he’s making good investments.
0:49:28 But, you know, now people are getting mad about that, and it’s like, no, he recognizes the companies, he understands the supply chain.
0:49:33 He needs to not spend his time on splitting the company because then he never actually fixes the company.
0:49:33 Right.
0:49:40 Intel’s problem is that like, it takes them five to six years to go from design to shipping the product.
0:49:44 Um, in some cases more, uh, and when they tape out a chip, right?
0:49:47 Like, you know, you send the design to the fab, the fab brings back the chip.
0:49:54 They go through 14 revisions in some cases, where the rest of the industry goes through, like, one to three revisions, if they’re good,
0:50:01 of send the design in, get the chip back, test it, send the design in again.
0:50:04 For a public launch, they’ll launch a chip in three years.
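As a rough sketch of why revision count matters: if each extra silicon spin adds a fab turn plus bring-up time, the spin count alone can explain the gap between a two-to-three-year and a five-to-six-year schedule. The cycle lengths below are assumptions for illustration, not figures from the episode.

```python
# Rough sketch of how tape-out revision count drives time-to-market.
# Cycle lengths are hypothetical illustrations, not figures from the episode.

DESIGN_PHASE_MONTHS = 18   # assumed initial design time before first tape-out
MONTHS_PER_REVISION = 3    # assumed fab turn plus bring-up/test per silicon spin

def time_to_market_years(revisions: int) -> float:
    """Total design-to-launch time for a given number of silicon spins."""
    return (DESIGN_PHASE_MONTHS + revisions * MONTHS_PER_REVISION) / 12

# A disciplined team needing ~2 spins vs. a team needing 14:
print(f"2 revisions:  {time_to_market_years(2):.1f} years")
print(f"14 revisions: {time_to_market_years(14):.1f} years")
```

Under these assumed cycle times, 2 spins lands at roughly two years and 14 spins at roughly five, consistent with the schedules Dylan contrasts.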
0:50:10 So, but if you look at Intel today, right, they still don’t have a competitive entry on the, uh, on the AI side.
0:50:11 And they won’t, uh, right.
0:50:15 Can you, so what, but what does that mean for their offering?
0:50:17 I mean, they’re still doing great on CPUs.
0:50:20 They don’t have a good AI chip product.
0:50:22 Is it long-term sustainable positioning, right?
0:50:24 I mean, as a standalone chip company?
0:50:27 I mean, IBM still makes more money every launch off of mainframes.
0:50:30 So it’s not like, it’s not like x86 is dead.
0:50:34 It’s like, you don’t get the growth rates, but like you could totally run this as a very profitable enterprise.
0:50:37 Um, and, and I think the same with PCs, right?
0:50:39 There’s, you know, there’s some turmoil.
0:50:40 There’s some ARM entries.
0:50:41 There’s some AMD competition.
0:50:49 Well, like, I think it can be a very profitable business if it had, like, one-third or half the people working on it.
0:50:59 And so, to fix Intel, Lip-Bu Tan needs to go into the design company and lay off a shitload of people, but, like, keep all the good people, and make sure that they’re designing fast,
0:51:04 and that from design conception to launch it’s two to three years, not five to six.
0:51:07 Um, and, and that’s on the design side and make that profitable.
0:51:09 And then on the fabs, he has to do the same thing.
0:51:14 There’s all these people. Like, one of the heads of fab automation at Intel:
0:51:22 I explicitly told Lip-Bu Tan about him, because, you know, we have a couple ex-Intel people in the company who are actually good, who worked on the fab side.
0:51:24 And we asked them, who are the worst people?
0:51:26 And they were like, oh, this guy sucks.
0:51:27 I explicitly told Lip-Bu Tan.
0:51:30 He had never talked to the guy because it was like four layers down.
0:51:32 The company has like absurd amounts of hierarchy.
0:51:33 It’s like four layers down.
0:51:35 He goes and talks to the guy and he’s out.
0:51:35 Right.
0:51:38 It’s like, like he figures out like who’s bad.
0:51:38 Right.
0:51:39 Um, and who’s good.
0:51:47 And he has to go in, and still, hey, the vast majority of the team at Intel is the one that led the world in production and process technology for 20 years.
0:51:48 Yeah.
0:51:48 Right.
0:51:50 But there’s a lot of like built up crap.
0:51:52 So he has to go figure this out.
0:51:53 Right.
0:51:57 He can’t waste his time on, like, oh, all this structuring to split the company.
0:52:00 Like, I think it would be better if the company split.
0:52:02 I just don’t think he can spend the time to do that.
0:52:07 Um, and the design side of the company, you know, you’re not really going to get into AI;
0:52:16 you have to make some money there. But the fabs, I think, could truly become a competitor, but they’re going to go bankrupt before anything can happen.
0:52:17 So they have to figure out how to get capital.
0:52:19 So he has to figure out how to get capital.
0:52:24 He has to figure out how to clean up all the crap, uh, make the, um, you know, yields go up.
0:52:24 Right.
0:52:26 Make the product ship way faster.
0:52:28 Like all of these things are basic problems.
0:52:29 I think the goals are completely correct.
0:52:32 I mean, I think, I think the big challenge is just reflecting back on my time there.
0:52:32 Right.
0:52:37 I think the big challenge is that right now, if you look at Intel, right, they have essentially software,
0:52:41 sort of the chip design part, and then there’s, you know, the core manufacturing part.
0:52:41 Right.
0:52:46 And they have three different, very different cultures and it’s very hard to get everything under one umbrella.
0:52:46 Right.
0:52:48 And so I think that is the big challenge.
0:52:50 I think you could, you should run the company separately.
0:52:50 Right.
0:52:56 Like, but like you can’t physically separate them entity wise because it’s going to take so long to sever all these things.
0:52:58 Um, because he doesn’t have time.
0:52:58 Right.
0:53:04 Like Intel is literally going to go bankrupt if they don’t have a big cash infusion or they lay off like half the company.
0:53:04 Right.
0:53:11 Uh, which some could argue you need to lay off like 30% of the company anyways, but, um, there’s a lot of bad things that happen if that happens.
0:53:12 Right.
0:53:18 Um, and they need to spend a lot more on building the next generation fab, even if they fix the fab and they don’t have money for that.
0:53:18 Right.
0:53:29 So there’s a lot of more important problems than, like, physically separating the company, even though I think long-term, yes, the fab has to be separate from the chip design and software part of the company, right?
0:53:36 That’s going to make each company much more accountable, be able to service their customers better, et cetera.
0:53:39 It’s just, that’s going to take too long and they’re going to go bankrupt by then.
0:53:40 Awesome.
0:53:44 But I think, I hope, I pray someone does something, right?
0:53:46 Like you get a big capital infusion.
0:53:46 I don’t know.
0:53:53 The big hyperscalers get muscled into, like, oh, okay, wait, if TSMC eventually grows their margin to 75% because of the monopoly,
0:54:01 plus they intake all this stuff like co-packaged optics and power delivery, all of a sudden the cost is going to spike.
0:54:04 So we should actually just throw $5 billion at Intel each, right?
0:54:05 Screw it.
0:54:09 And that could actually give Intel enough of a lifeline to potentially get to something and maybe be competitive.
0:54:11 Um, that’s, that’s the hope.
0:54:17 Can we, uh, can we finish by finishing this game that we started when we gave Sam Altman advice?
0:54:21 If, if Jensen was here, what, what advice would you have for him?
0:54:22 Hmm.
0:54:31 If Jensen was here... you know, I think he has a massive, massive balance sheet, right?
0:54:31 Jensen does.
0:54:35 His free cash flow is, like, you know, ridiculous.
0:54:46 The new Trump tax bill institutes something really incredible, which is that you can depreciate all of the GPU cluster costs in year one.
0:54:52 We put out a note about how the tax implications to, like, Meta are like $10 billion a year.
0:54:55 And across each of the major hyperscalers, it’s like massive.
0:55:03 It’s like, well, NVIDIA is going to spend tons and tons of cash, or they’re going to pay, like, you know, tens of billions of dollars in taxes.
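The value of year-one expensing can be sketched with a present-value comparison of the tax shield. The capex figure, tax rate, and discount rate below are illustrative assumptions, not numbers from the episode or any company’s filings.

```python
# Illustrative sketch of why year-one (100% bonus) depreciation matters.
# All rates and capex figures are hypothetical assumptions.

TAX_RATE = 0.21            # assumed corporate tax rate
CAPEX = 40_000_000_000     # hypothetical annual GPU-cluster spend
USEFUL_LIFE = 5            # straight-line alternative, in years
DISCOUNT = 0.08            # assumed cost of capital for present-value comparison

# Straight-line: deduct CAPEX / USEFUL_LIFE each year; discount each year's tax shield.
straight_line_pv = sum(
    (CAPEX / USEFUL_LIFE) * TAX_RATE / (1 + DISCOUNT) ** year
    for year in range(1, USEFUL_LIFE + 1)
)

# Year-one expensing: the entire tax shield lands immediately (discounted one year).
bonus_pv = CAPEX * TAX_RATE / (1 + DISCOUNT)

print(f"PV of tax shield, straight-line: ${straight_line_pv / 1e9:.2f}B")
print(f"PV of tax shield, year-one:      ${bonus_pv / 1e9:.2f}B")
print(f"Extra value from expensing:      ${(bonus_pv - straight_line_pv) / 1e9:.2f}B")
```

The deduction total is identical either way; the gain comes purely from pulling the tax shield forward in time, which is why heavy spenders like the hyperscalers benefit most.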
0:55:05 Why don’t you get into the infrastructure game somehow?
0:55:15 Um, now this is obviously going to be like crazy because like now they’re buying their own GPUs and putting them in data centers and doing stuff and they’re competing with their own customers.
0:55:17 But they’re already doing that anyways, because their customers are trying to make chips.
0:55:23 Um, but they should like accelerate the data center ecosystem with investments, right?
0:55:35 Uh, because really we, we think we can like have very high degree of like accuracy on what they’re going to do next year in terms of revenue, because it’s just the number of data center watts that are being built, right?
0:55:38 Like this is harder thing to shift up and down, right?
0:55:46 Now there’s a little bit of share difference between how much is TPU versus GPU, but like, it’s like you have to accelerate the infrastructure and you need to spend all of this capital that you’re building, right?
0:55:50 Like, okay, do you want to go the route of like doing buybacks and dividends?
0:55:51 Like, great.
0:55:53 Like you’re a loser if you do that, right?
0:55:57 Like you, you can make more money by reinvesting and building a bigger company.
0:56:05 That’s not just chips into the ecosystem or servers into the ecosystem, but actually like controlling the infrastructure end to end somehow.
0:56:10 Um, so I think there’s something he could do there, uh, with this massive war chest.
0:56:14 And there’s a reason like NVIDIA has done some buybacks and they’ve done some dividends and increasing,
0:56:21 but the cash on their balance sheet keeps growing and they’re going to have north of a hundred billion dollars of cash on their balance sheet by the end of this year, I think.
0:56:23 Um, so it’s like, what are you going to do with that?
0:56:33 Um, I think, I think there’s something moving into the infrastructure layer much more, um, that they, they could do if he really wants to be the king of the world, right?
0:56:34 Uh, which I think he does.
0:56:36 Uh, Sergey and Sundar?
0:56:38 Um, who?
0:56:44 Um, I think they should open up the kimono on TPUs, right?
0:56:51 Like, start selling them, open up the software, open-source a lot more of the XLA software, because there’s OpenXLA and there’s XLA, but the vast majority is closed source.
0:56:57 Really open up the kimono on that, and be a lot more aggressive, right?
0:57:00 They’ve, they’re still pretty not aggressive on data centers.
0:57:05 Um, they’re pretty not aggressive on a lot of elements of the company.
0:57:12 Um, the TPU team’s next-gen designs are pretty not aggressive, partially because a lot of the TPU team has left to go to OpenAI.
0:57:15 The best people that I knew, uh, it was actually really annoying.
0:57:22 I knew, like, four or five people, and they all went to OpenAI, and it’s like, fuck, now I don’t get as much. I met some other people, right?
0:57:27 But it’s like, you know, um, I think they could be a lot more aggressive in many ways across the company.
0:57:28 They don’t have to be, right?
0:57:43 But they could, um, because AI, you know, this ChatGPT takeoff, the shift of search queries, especially the monetizable ones, to purchasing agents, is gonna really screw Google long-term if they don’t, you know, get their act together.
0:57:45 I think they’ve gotten their act together on DeepMind.
0:57:49 There’s still some inefficiencies, but Sergey works on, works within DeepMind a lot, and they’re driving hard.
0:57:58 They’re still a little bit behind, but, like, I think, like, physical infrastructure, TPU, and how much money they could make, and how much they could take the wind out of everyone
0:58:15 else’s sails, if they start selling TPUs externally, and reorg around building data centers much faster, so that they do have the most compute in the world, because they did, but now there are certain companies that are gonna surpass them, potentially, over the next few years if they don’t really get their act together.
0:58:16 So I think that’s what I would say for them.
0:58:21 Um, yeah, and also, like, like, learn how to ship product.
0:58:23 Zuck.
0:58:28 Um, I think, I think Zuck, you know, it remains to be seen.
0:58:39 What goes on with superintelligence, but, like, they’re trying to move super fast with the data centers, uh, you know, like, screw it, we’ll build tents, uh, instead of, like, physical data centers, because we only need these for five years anyways.
0:58:53 Um, you know, the superintelligence moves, you could say whatever you want, but, like, trying to buy, like, Thinking Machines for 30 billion, or SSI for 30 billion, didn’t work out, so then they spent, you know, not even that much, not 30 billion, on hiring all these people.
0:58:58 Um, so I think that he recognizes the urgency with the models, with the infrastructure.
0:59:07 Um, so I really think he needs to, like, you know, if you read his website post about, like, AI, like, I think, you know, he sees the vision, right?
0:59:12 There’s the wearables, there’s integrating AI into that, there’s being your AI assistant to do all this purchasing and stuff.
0:59:26 I think he sees the vision, but I think he also needs to focus on, like, actually, like, releasing that faster, um, but also, like, the products that they do outside of their core IP every time they launch something is kind of mid, right?
0:59:39 Um, you know, Meta Reality Labs is doing well, but I think they should, like, go more explicit: have a ChatGPT competitor, have a Claude Code competitor, just start releasing way more products.
0:59:45 Um, because they’re really just focused on their individual gardens rather than, like, branching outside of it.
0:59:50 Do you think Apple should have that same sense of urgency, or if Tim Cook was here, what would you tell him?
0:59:54 The funny thing is, like, some of their best AI people are now, like, at superintelligence.
1:00:00 Uh, they’re building an AI accelerator, they have AI models, but they’re just, like, way slower.
1:00:10 Um, they did mention on the last earnings call they’re gonna allocate more capital to this, but it’s, like, guys, Apple, like, you guys are gonna lose the boat if you do not spend, like, 50, 100 billion dollars on infrastructure.
1:00:12 You don’t think their current strategy will cut it?
1:00:23 Um, I think, I think, like, more and more you’ll see people, like, you know, great, Apple has this walled garden, but, like, they can only do so much to protect it, right?
1:00:32 Like, IDFA: they shut down data sharing to Meta, but Meta made better models, and now they have way more data and way more power over the user than they ever did before.
1:00:38 In a way, it was good that Apple kicked the crutch out from under Meta, but the same applies to, like, AI.
1:00:48 Like, yes, they have access to the text, and they have access to this, but, like, I think other people are gonna be able to integrate user data, um, and agents will be able to integrate all this user data,
1:00:59 and they’ll start to lose control of what the user experience is, as more and more gets disintermediated by AI being the interface rather than touch, rather than, you know, touchpad and keyboard.
1:01:06 Um, and I don’t think they’ve truly realized what happens when the interface to computing is, is AI.
1:01:10 Like, they, they market it, but, like, that’s gonna shift computing really heavily.
1:01:18 They have great hardware, and their hardware teams are working on awesome stuff and form factors, but, like, I just don’t know if they truly get what is actually gonna happen
1:01:24 in the world in the next five years, and they’re not building fast enough for it.
1:01:25 What about Microsoft to that end?
1:01:29 Um, Microsoft has the same problem, I think.
1:01:35 Um, they were super aggressive in 23 and 24, um, and then they pulled back heavily, right?
1:01:40 Uh, now, like, OpenAI is slipping through their grasp, uh, there’s that whole thing there.
1:01:42 Uh, they cut back on data center investments heavily.
1:01:46 They were gonna be the largest infrastructure company in the world by, like, a factor of 2x.
1:01:47 Yeah.
1:01:51 Uh, which would have been, you know, you could argue maybe that was, like, too much, and maybe it wouldn’t have been economical.
1:01:54 Um, but, like, they’re losing grasp on OpenAI.
1:01:58 Their internal model efforts are failing, uh, spectacularly.
1:02:03 Like, they’re on LMArena right now, and they’re pretty decent there, but it’s, like, that’s just, like, a sycophantic model.
1:02:06 Like, um, it’s a codename, but, like, whatever.
1:02:08 Like, MAI is, like, failing.
1:02:15 Azure is, like, losing a lot of share to Oracle and CoreWeave and Google, um, and so on and so forth, right?
1:02:20 Their internal chip effort is by far the worst of any hyperscaler.
1:02:22 Like, uh, they’re just, like, misexecuting.
1:02:28 Like, GitHub, how is GitHub not the highest-ARR coding product?
1:02:33 I mean, they only had the best IDE, the best source code repository, the best enterprise sales force,
1:02:37 the best model company as a relationship, and they were the first to market, right?
1:02:38 It’s, like, they had everything going for them.
1:02:40 And, like, there’s just nothing, right?
1:02:43 It’s, like, like, GitHub Copilot is failing.
1:02:46 Microsoft Copilot is, like, still crap, right?
1:02:47 Like, uh…
1:02:48 Yeah, it’s unusable.
1:02:51 It’s, like, what is, like, you know, what is going on?
1:02:53 Like, you need to shake the crap out of the company.
1:02:58 Like, I think they win a lot because they have the best business-to-business relationship with so many enterprises.
1:03:00 The best sales force on the planet.
1:03:05 Yeah, but, like, they end up, like, not having the actual product to sell them, which is, like, really scary.
1:03:06 So they need to really work on product.
1:03:10 Satya has done great on sales and stuff, but, like, yeah.
1:03:13 If Elon was here, what advice would you give him?
1:03:17 A lot of people at xAI are mad about the porn models, like, the porn stuff.
1:03:19 It’s fine.
1:03:20 Like, you’re going to make a ton of money off of this.
1:03:22 This is how you accelerate the revenue of that company.
1:03:26 But, like, he’s losing a lot of talent and axing a lot of good projects.
1:03:31 But Elon is a magnet to amazing talent in building stuff, so I won’t bet against him.
1:03:34 But it seems like, since he left the administration, he’s focused on stuff again.
1:03:37 But I think, I don’t know, I think he’s focused on a lot of things.
1:03:40 And I think, like, RoboTaxi starting to look good, actually, again.
1:03:42 Like, I haven’t ridden one yet, but I have some friends who’ve ridden one.
1:03:43 It’s, like, looks pretty decent.
1:03:50 He could, like, not make these snap decisions, which often are the reason why he’s amazing.
1:03:52 But, like, some of these snap decisions are hurting him.
1:03:59 I’m not sure if I can give Elon that much great advice because I think maybe it’s just, like, focus on, like, the products again, right?
1:03:59 More.
1:04:02 But he is working on that stuff a lot.
1:04:02 Yeah.
1:04:05 I think that might be a good place to wrap.
1:04:05 Awesome.
1:04:06 It was a great discussion.
1:04:07 Dylan, thanks so much for joining us.
1:04:08 Thank you for having me.
1:04:08 Thank you.
1:04:12 Thanks for listening to the A16Z podcast.
1:04:18 If you enjoyed the episode, let us know by leaving a review at ratethispodcast.com slash A16Z.
1:04:21 We’ve got more great conversations coming your way.
1:04:22 See you next time.
1:04:26 As a reminder, the content here is for informational purposes only.
1:04:32 It should not be taken as legal, business, tax, or investment advice, or be used to evaluate any investment or security,
1:04:37 and is not directed at any investors or potential investors in any A16Z fund.
1:04:42 Please note that A16Z and its affiliates may also maintain investments in the companies discussed in this podcast.
1:04:49 For more details, including a link to our investments, please see A16Z.com forward slash disclosures.
The AI hardware race is heating up, and NVIDIA is still far ahead. What will it take to close the gap?
In this episode, Dylan Patel (Founder & CEO, SemiAnalysis) joins Erin Price-Wright (General Partner, a16z), Guido Appenzeller (Partner, a16z), and host Erik Torenberg to break down the state of AI chips, data centers, and infrastructure strategy.
We discuss:
- Why simply copying NVIDIA won’t work, and what it takes to beat them
- How custom silicon from Google, Amazon, and Meta could reshape the market
- The economics of AI model launches and the shift toward cost efficiency
- Infrastructure bottlenecks: power, cooling, and the global supply chain
- The rise of AI silicon startups and the challenges they face
- Export controls, China’s AI ambitions, and geopolitics in the chip race
- Big tech’s next moves: advice for leaders like Jensen Huang, Sundar Pichai, Mark Zuckerberg, and Elon Musk
Resources:
Find Dylan on X: https://x.com/dylan522p
Find Erin on X: https://x.com/espricewright
Find Guido on X: https://x.com/appenz
Learn more about SemiAnalysis: https://semianalysis.com/dylan-patel/
Stay Updated:
Let us know what you think: https://ratethispodcast.com/a16z
Find a16z on Twitter: https://twitter.com/a16z
Find a16z on LinkedIn: https://www.linkedin.com/company/a16z
Subscribe on your favorite podcast app: https://a16z.simplecast.com/
Follow our host: https://x.com/eriktorenberg
Please note that the content here is for informational purposes only; should NOT be taken as legal, business, tax, or investment advice or be used to evaluate any investment or security; and is not directed at any investors or potential investors in any a16z fund. a16z and its affiliates may maintain investments in the companies discussed. For more details please see a16z.com/disclosures.