The Race for AI—Search, National Infrastructure, & On-Device AI

AI transcript
0:00:06 In 2024 alone, I think there were more than 700 pieces of state-level legislation that were AI-specific.
0:00:10 The average query on Perplexity is 10 to 11 words.
0:00:14 The average search on Google is 2 to 3 keywords.
0:00:19 You’ve had an enormous amount of NVIDIA’s purchasing orders come from the balance sheet of governments.
0:00:22 You can actually reimagine the AR experience.
0:00:28 If you’re a founder like that who has the guts, then your impact on humanity ends up being quite generational.
0:00:33 Here we are again, inching even closer to the end of 2024.
0:00:38 And as we near 2025, here are a few dates to give you some perspective.
0:00:46 We’ve had 24 incredible years of Wikipedia, 18 years of the iPhone, and 16 years since the Bitcoin White Paper release.
0:00:54 So as we look to 2025 and the speed of innovation is only increasing, we continue our coverage of A16Z’s big ideas.
0:00:59 Together with the dozens of partners who are meeting daily with the people building our future.
0:01:01 Last year, we predicted.
0:01:03 A new age of maritime exploration.
0:01:06 Programming medicine’s final frontier.
0:01:08 AI-first games that never end.
0:01:10 Democratizing miracle drugs.
0:01:14 On deck this year, closing the hardware/software chasm.
0:01:16 Game tech powers tomorrow’s businesses.
0:01:18 Super staffing for healthcare.
0:01:22 And throughout our four-part series, you’ll hear from all over A16Z,
0:01:26 including American Dynamism, healthcare, fintech, games, and more.
0:01:30 However, if you’d like to see the full list of 50 big ideas,
0:01:34 head on over to a16z.com/bigideas.
0:01:40 And of course, if you missed it, check out part one, all about the intersection of hardware and software.
0:01:45 As a reminder, the content here is for informational purposes only,
0:01:48 should not be taken as legal, business, tax, or investment advice,
0:01:51 or be used to evaluate any investment or security,
0:01:55 and is not directed at any investors or potential investors in any A16Z fund.
0:02:01 Please note that A16Z and its affiliates may also maintain investments in the companies discussed in this podcast.
0:02:07 For more details, including a link to our investments, please see a16z.com/disclosures.
0:02:15 Today in part two, we’ll be talking about the topic of the day,
0:02:19 the month, and quite frankly, the year: artificial intelligence.
0:02:22 There’s certainly AI, AI, AI.
0:02:28 And the race is on, whether it’s across companies like Google and the disruptors chasing away 10 blue links,
0:02:32 or sovereign countries trying to capitalize on the next frontier,
0:02:37 or even device companies figuring out their role as AI meets the edge.
0:02:39 That is all on deck today.
0:02:43 There’s a bit of an innovator’s dilemma here, and I’m excited to watch it play out.
0:02:48 That was… Alex Immerman, and I’m a partner here on the growth fund.
0:02:50 Here’s his big idea.
0:02:53 The search monopoly ends in 2025.
0:02:59 Google controls 90% of U.S. search, but its grip is slipping.
0:03:05 Its recent U.S. antitrust ruling encourages Apple and other phone manufacturers
0:03:08 to empower alternative search providers.
0:03:13 More than just legal pressure, GenAI is coming for search.
0:03:18 ChatGPT has 250 million weekly active users.
0:03:25 Answer engine Perplexity is gaining share, growing 25% month on month,
0:03:28 and changing the form of search engagement.
0:03:33 Their queries average 10 words, three times longer than traditional search,
0:03:36 and nearly half lead to follow-up questions.
0:03:44 Claude, Grok, Meta AI, Poe, and other chatbots are also carving off portions of search.
0:03:52 60% of U.S. consumers used a chatbot to research or decide on a purchase in the last 30 days.
0:03:57 For deep work, professionals are leveraging domain-specific providers
0:04:01 like Causaly, Consensus, Harvey, and Hebbia.
0:04:05 Ads and links historically aligned with Google’s mission:
0:04:11 organize the world’s information and make it universally accessible and useful.
0:04:17 But Google has become so cluttered and gamed that users need to dig through the results.
0:04:20 Users want answers and depth.
0:04:26 Google itself can offer its own AI results, but at the cost of short-term profits.
0:04:32 Google as a verb is under siege. The race is on for its replacement.
0:04:37 Maybe start by setting the stage. So how big is the search market today?
0:04:42 The search market is enormous. Anyone listening to this uses search.
0:04:45 Virtually anyone with the internet uses search.
0:04:49 But to put some numbers around it, Google that I just mentioned, they’re the biggest game in town.
0:04:53 They’re approaching $200 billion of revenue annually.
0:04:56 They’re still growing double digits, highly profitable.
0:05:00 Microsoft Bing, which has been the number two player for a long time,
0:05:06 mid-single-digit market share, so pretty small, they have $12 billion of revenue.
0:05:10 This is a massive, massive market, one of the largest out there.
0:05:14 And it does feel like there are forces reshaping this industry, so tell me about those.
0:05:17 So there’s certainly AI, AI, AI.
0:05:23 But before we jump into that, we can level set with the legal pressure that’s mounting on Google.
0:05:26 So earlier this year, Google was declared a monopoly.
0:05:32 The court ruled that their paying billions of dollars to phone manufacturers,
0:05:37 in the case of Apple, tens of billions of dollars, is monopolistic.
0:05:42 It’s anti-competitive and it’s preventing their competitors from gaining share in the marketplace.
0:05:46 They are the default search engine on all these phone manufacturers.
0:05:50 And it basically makes it impossible for any of the others to gain share.
0:05:55 So the exciting technological change, of course, is around AI.
0:05:59 The Gen AI search providers are fantastic.
0:06:05 And the market has been dominated by Google for close to 25 years at this point.
0:06:08 And as a monopoly, they’re not really innovating.
0:06:11 They have no incentive to until now.
0:06:17 If you think about the Google experience as it is today, it’s just a long list of links.
0:06:21 And the first few are sponsored, they’re ads.
0:06:26 And I as a user, when I make a query on Google, I then need to make the decision.
0:06:31 I have to sift through the information on that site and find the answer that I’m looking for.
0:06:33 That’s actually a pretty long process.
0:06:37 Instead, with Gen AI, I can just get the answer.
0:06:44 So on ChatGPT, Perplexity, Claude, or when I’m chatting with my AI friends on Character.AI or on Poe,
0:06:46 I get an answer immediately.
0:06:50 And that’s just a much, much better experience.
0:06:54 And I hear tons of people saying they’re trying these new tools, but maybe you can give us a sense.
0:06:58 Is this shift really stark? Are people really moving over?
0:07:00 People are definitely moving over.
0:07:02 And I’d say there’s four main reasons.
0:07:07 One is the one we just talked about, the shift from links to answers.
0:07:16 A second one would be how they are personalized, how the answers feel interactive, how they are conversational.
0:07:21 The average query on one of these new services has a follow-up question.
0:07:27 So it’s not just about that initial engagement, it’s about the ongoing conversation.
0:07:33 The third difference is about how these new services engage with complex queries.
0:07:37 So the average query on Perplexity is 10 to 11 words.
0:07:41 The average search on Google is two to three keywords.
0:07:47 As you can imagine, their ability to leverage that information and get you what you want is much higher.
0:07:50 On Google, you’re going to get a list of links.
0:07:52 One of those links might have one side of the debate.
0:07:54 Another link may have the other side.
0:07:56 You may never make it to that second side.
0:08:00 On Perplexity, on ChatGPT, they’re going to synthesize both sides.
0:08:02 They’re going to give me both perspectives.
0:08:06 And then the fourth, they’re just not cluttered with ads.
0:08:09 That may change over time, but putting it all together,
0:08:14 it should be no surprise that these AI-native services are gaining share.
0:08:20 10% of consumers are now referring to ChatGPT as their search engine of choice.
0:08:25 Perplexity queries are growing 25% to 30% month over month.
0:08:33 And I saw a survey last week that 60% of consumers for purchase decisions in the last 30 days used a chatbot.
0:08:37 All of these services are growing massively.
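To put the growth figure quoted above in perspective, here is a rough sketch of what 25% to 30% month-over-month growth would compound to over a year. It assumes the monthly rate holds constant for twelve months, which the speaker does not claim; the function name `compound_growth` is purely illustrative.

```python
def compound_growth(monthly_rate: float, months: int) -> float:
    """Return the growth multiple after compounding a monthly rate.

    Assumption: the same rate applies every month, which is a
    simplification and not something stated in the conversation.
    """
    return (1.0 + monthly_rate) ** months


# The 25% to 30% range quoted for Perplexity query growth.
low = compound_growth(0.25, 12)
high = compound_growth(0.30, 12)

print(f"25% monthly for 12 months: {low:.1f}x")   # 14.6x
print(f"30% monthly for 12 months: {high:.1f}x")  # 23.3x
```

Under that assumption, a year of growth at the quoted rates would mean roughly 15 to 23 times as many queries as at the start.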
0:08:41 So if we look at some of the tools that exist today that are coming for that market share,
0:08:47 you take a Perplexity or you take a ChatGPT, those are also broad-based search engines or chatbots.
0:08:53 But there are these other players you mentioned, Consensus or Hebbia, for example, that are verticalized.
0:08:57 Does it surprise you that the last wave of search engines weren’t verticalized?
0:09:05 It does not surprise me that the last wave or the next wave will result in a winner-take-most dynamic.
0:09:08 It’s a pretty monopolistic market.
0:09:11 Search is very much a distribution game.
0:09:13 Google has an incredible brand.
0:09:15 They have incredible direct traffic.
0:09:18 They are the default search engine on most browsers.
0:09:22 They are the Kleenex, the Band-Aid of online search.
0:09:28 It’s important to note that search engines definitely benefit from network effects.
0:09:32 More users coming to Google provides more data around preferences.
0:09:37 For Google, that means the next search has more relevance and brings more users.
0:09:44 It also means more users means more advertisers, more profit dollars that you can invest back in the service
0:09:49 or into ensuring that Apple puts them as the default provider.
0:09:50 Absolutely.
0:09:56 And so as we think about this next AI wave, it sounds like you think maybe a similar dynamic might be at play.
0:10:01 Or how do you think about maybe these smaller players actually finding a wedge or differentiating?
0:10:08 One framework for thinking about how search could fragment or verticalize
0:10:13 is just looking at why does vertical software beat horizontal software in some cases.
0:10:21 And it’s typically the vertical requires a specific user interface, proprietary data,
0:10:24 features, workflows, compliance, etc.
0:10:31 And that’s why Veeva in life sciences and pharma can beat Salesforce in horizontal CRM.
0:10:36 For the average query, ChatGPT, Perplexity, or Google are pretty good.
0:10:40 You don’t need a vertical specific application.
0:10:47 But for deep domain research, I can imagine a world in which standalone apps thrive.
0:10:53 In the case of our portfolio company, Hebbia, they have a unique interface.
0:10:58 It looks like a spreadsheet which is native to financial services and their customers.
0:11:04 It brings in public filings, earnings transcripts, but also private data.
0:11:07 It can bring in research, it can bring in survey results.
0:11:14 You can query and then the output can also be specific to that industry.
0:11:20 Not just look like a spreadsheet, but populate a meeting agenda so you can go in immediately prepared.
0:11:25 And so when I think about vertical search or vertical apps, it’s not just about the search,
0:11:27 it’s about everything around the search.
0:11:28 That’s a good distinction.
0:11:33 And as we think about how this maybe continues, the progression of this industry,
0:11:38 if we go back to the last wave of search, we did see a bunch of search engines get traction to start,
0:11:40 maybe AltaVista or Ask Jeeves.
0:11:41 We all pick on Ask Jeeves.
0:11:45 You know, all the Gen Zers will not know what the hell any of that is,
0:11:49 but they had some traction and then Google exploded and took over.
0:11:53 Do you expect that same kind of consolidation or is this time different?
0:11:59 I think consolidation should be expected in general purpose search because of distribution,
0:12:01 because of network effects.
0:12:07 Google won the last time around in part because of their PageRank algorithm.
0:12:09 It produced superior results.
0:12:11 It had a minimalist UI.
0:12:12 It was really simple.
0:12:16 Ask Jeeves, AltaVista, they had tons of ads.
0:12:19 They had cluttered links on the homepage.
0:12:21 No one wanted to use it.
0:12:27 The irony is that today, what we’re all complaining about with Google is that their pages are cluttered with ads on the results,
0:12:31 and that creates the opportunity for these new search engines.
0:12:34 I would expect consolidation again.
0:12:40 And I think something else that’s interesting is what you’re pointing at with some of these verticalized solutions
0:12:46 is that it’s not just for the everyday consumer, but also for the lawyer or for the academic researcher.
0:12:49 Do you think that is also part of the fragmentation?
0:12:51 Do you expect that to continue?
0:12:56 Search historically has been thought of as a consumer product and for good reason.
0:13:00 I make a lot of searches that are related to my personal life,
0:13:04 but I use, as do all the professionals you named, search at work.
0:13:09 I probably make more Google searches at work than I do around personal matters.
0:13:14 But there is a category of enterprise search, a market that has existed,
0:13:20 and you can think of that as querying Box, Dropbox, Salesforce, all in one.
0:13:23 But I think those two worlds are going to blend together.
0:13:27 Consumer search should not just be limited to what’s on the web,
0:13:31 and enterprise search should not be limited to just proprietary data.
0:13:35 It should have both, and a lot of these AI-native services are working on that.
0:13:40 Yeah, that’s a great point. And as we think about those two models maybe blending together,
0:13:42 we think of consumer search for sure.
0:13:45 It’s always been an ad-based model, or at least today.
0:13:49 Google is this massive economic engine, but no one’s paying a subscription for that.
0:13:53 The new entrants do seem to have a more subscription-based model.
0:13:57 Is this a temporary thing, or how do you see those two dynamics playing?
0:13:59 So the subscriptions could stick around,
0:14:03 but I think this is going to continue to be a digital advertising-focused market.
0:14:07 And I think digital advertising is going to grow because of these AI services.
0:14:10 So these AI services today, they don’t have ads.
0:14:12 I imagine that’s going to change.
0:14:16 It’s been in the news that Perplexity is already talking to advertisers.
0:14:20 They have a high-income, highly educated user base.
0:14:23 That should be attractive to these advertisers.
0:14:28 But as we discussed earlier, the queries on these services, they’re longer.
0:14:32 They’re more complex. They’re more detailed. They’re more personalized.
0:14:37 And because of that, there’s greater intent, which should be more helpful to advertisers
0:14:41 in producing the best results on their end and creating an even larger market.
0:14:46 But in the meantime, these subscription business models, they make a lot of sense.
0:14:50 They bootstrap the business, they cover costs, and from a personal perspective,
0:14:55 I’m very happy to pay $20 a month for ChatGPT, Perplexity, and Poe.
0:14:57 I mean, they provide so much value.
0:14:59 A lot more than $20 a month.
0:15:03 Yes, yes. So obviously, this was your 2025 big idea,
0:15:07 and I know this is going to be a many-year, maybe even decade-long progression.
0:15:09 But what are you paying attention to in the next year?
0:15:11 What opportunities are still on deck?
0:15:15 As I look at 2025, Google may be in decline,
0:15:18 but I don’t expect them to go down without a fight.
0:15:21 Google and Meta, I think they can be big players.
0:15:26 Google AI Overviews already has a billion monthly active users.
0:15:30 Meta is not far behind. They have 500 million monthly active users.
0:15:34 Again, this shows the power of distribution for search.
0:15:39 And maybe to that end, Apple, if you could call them a dark horse, is a dark horse.
0:15:46 They are not investing the CapEx like Meta and Google and Microsoft and Amazon,
0:15:49 but they control a central node to the consumer.
0:15:52 And if they wanted to build a search application,
0:15:55 the next day, they could have a billion users.
0:16:00 We just heard how essential winning AI is to multi-trillion-dollar companies
0:16:02 like Google and Meta.
0:16:04 But what about nation-states?
0:16:09 In the race for AI dominance, compute has become critical national infrastructure,
0:16:13 but not every country is equipped to compete in that race.
0:16:14 That was…
0:16:18 Anjney Midha, I’m a general partner here at A16Z, where I focus on AI infrastructure.
0:16:20 And here’s his big idea.
0:16:23 My big idea is infrastructure independence.
0:16:29 The idea that a lot of countries and regions are starting to realize that modern AI,
0:16:32 deep learning-based AI, generative models,
0:16:36 are a form of what have been called general-purpose technologies.
0:16:43 In the history of humanity, we’ve only had maybe 20 or 22 or so general-purpose technologies.
0:16:48 And these are usually types of technologies like electricity, the printing press,
0:16:52 that have very broad-based applications in society.
0:16:57 They end up being largely horizontal economic multipliers and progress multipliers
0:17:01 across a whole set of pillars and domains in society.
0:17:04 There are usually two moments in the adoption of a general-purpose technology
0:17:07 where first countries, nation-states start asking,
0:17:12 “Are we going to welcome this technology or are we going to be hostile to its development?”
0:17:13 Okay.
0:17:17 And that’s the first step that becomes pretty important in a country’s progression
0:17:19 or in a nation-state’s progression or a region’s progression.
0:17:20 Do we even want to adopt it?
0:17:21 Do we allow this in, regardless of whether we own it?
0:17:22 Right.
0:17:23 Do we embrace it or not?
0:17:24 Yeah.
0:17:26 And then the second is, do we build or buy?
0:17:27 Right.
0:17:31 Which is, can we trust somebody else to provide it for us?
0:17:34 We’re well past the stage of, do we embrace it or not?
0:17:37 We’re already well into billions of people around the world now,
0:17:39 having already embraced it.
0:17:41 So the governments don’t really have a choice, so to speak.
0:17:44 So in a sense, AI has already percolated throughout society
0:17:48 at one of the fastest diffusion rates of any general-purpose technology,
0:17:51 because it’s piggybacked off of years and years of digital infrastructure.
0:17:54 And so now the question everybody’s asking is, do we build or buy?
0:17:55 Yeah.
0:17:58 It’s probably the single largest purchasing decision
0:18:04 that’s going to happen in the next 24 months: do nation-states start buying it?
0:18:05 Do they build or buy?
0:18:06 Do they build or buy?
0:18:07 Yeah.
0:18:10 I love the parallel of companies because there are many companies that do choose to build,
0:18:12 but also companies that choose to rent or buy.
0:18:13 Right.
0:18:17 So as you think about that, there’s large nations around the world
0:18:20 like the United States, which are clearly building.
0:18:21 Right.
0:18:26 But talk about the argument for smaller nations or the 190 plus
0:18:28 that should be thinking about buying.
0:18:32 The good news here is that we’ve got hundreds of years of human history
0:18:35 to look at for clues about what happens next.
0:18:39 If you’re a small country and you were in the early 1900s,
0:18:44 you were watching the modern electrification of the developed world,
0:18:46 the United States or Europe.
0:18:48 If you chart what happened with many of those countries,
0:18:52 many of them decided to actually enter into what were called joint venture agreements.
0:18:55 It starts with a joint venture with a country that’s at the frontier.
0:18:56 Right.
0:18:58 These countries at the frontier of AI are what I call hyper centers.
0:19:02 These are countries that have the ability to develop, train,
0:19:04 build and host their own frontier models.
0:19:08 I call them hyper centers mostly as an homage to the word hyperscaler.
0:19:09 Right.
0:19:11 Which is that there have been a handful of companies that have had the compute
0:19:13 and the talent to actually build frontier AI.
0:19:17 And now I think what we’re seeing is a shift from just those companies
0:19:20 driving a bunch of frontier AI to countries and regions driving it.
0:19:22 And so if you’re a small country and you’re going,
0:19:29 well, we certainly believe that it’s important to have our own AI infrastructure.
0:19:33 We want to be independent, but we don’t have all the compute
0:19:36 required to train these models or we don’t have all the talent locally.
0:19:39 So what you enter into is a joint venture with a country
0:19:43 or an overseas partner that matches your values.
0:19:46 And this is the really important thing about AI and AI models
0:19:49 and how they’re different from infrastructure like electricity
0:19:52 is there’s a fundamental encoding of human values in AI models
0:19:54 because they’re trained on data.
0:19:55 Yeah.
0:20:00 And the data has these local norms and cultural values embedded.
0:20:05 And so if you happen to train a model on a bunch of internet data collected in the US,
0:20:08 the models are just generally American.
0:20:10 They’re encoded with that.
0:20:14 And if you train the models on data in France,
0:20:19 they actually subtly have a bunch of different values encoded in the models
0:20:21 that reflect those cultural norms.
0:20:23 And so I think step number one, if you’re a small country,
0:20:27 is actually being a little bit crystal clear about which value systems
0:20:30 you align with most out of the hyper centers.
0:20:33 Now it’s not lost on people that the way the internet worked out
0:20:35 was there essentially ended up being two internets, right?
0:20:37 The Chinese internet and the rest of the world.
0:20:38 Yeah.
0:20:40 AI may not end up looking that different.
0:20:41 And if you’re a small country,
0:20:45 what you really have to figure out is whose values align more with yours.
0:20:50 A good historical precedent to look at here is the technology of money.
0:20:52 Money is a pretty general purpose technology.
0:20:57 And what happened in the early 1900s with the modernization of finance
0:21:00 is a number of countries started to ask the same question:
0:21:02 do we build or buy our own currency?
0:21:04 Do we rely on the dollar?
0:21:05 Right.
0:21:07 Or do we have our own currency?
0:21:09 And that led to the modern day currency regime
0:21:12 where the dollar is a single global reserve currency.
0:21:14 And that happened through a bunch of allied cooperation
0:21:17 where a number of countries realized they did not have
0:21:22 the local resources required to hold the peg of gold, right, to the dollar.
0:21:26 And so I think what we’re going to end up seeing is the emergence
0:21:29 of something very similar to what happened with currency flows,
0:21:32 where you have a couple of large countries
0:21:35 that control their own sovereign currencies, right?
0:21:38 You have the US, you have China, you have India.
0:21:41 And then you have a number of smaller countries that decided
0:21:43 they wanted to be flow points.
0:21:47 So you have Singapore and Ireland and you have Luxembourg and Zurich
0:21:51 that become massive global leaders in modern finance
0:21:56 because they decide they want to ally with one of those power centers.
0:21:58 So if you think about in the AI world,
0:22:00 let’s call regions at the frontier hyper-centers
0:22:02 and then we have compute deserts.
0:22:05 And these are places that have literally no install base
0:22:07 of compute capacity to even be relevant.
0:22:09 All the smaller folks have to figure out
0:22:11 which of the hyper-centers they want to align with
0:22:13 and how do you become a modern day Singapore,
0:22:17 Ireland, Luxembourg, etc. for the world of AI infrastructure.
0:22:19 And so it starts with deciding
0:22:21 whether you want to be a compute desert or not.
0:22:23 And if you’re not and you’re going to actually embrace
0:22:25 AI infrastructure as a government,
0:22:28 I think you’ve got to figure out which hyper-center you want to align with most.
0:22:31 And then it becomes actually quite easy to reason about
0:22:34 how to be a valuable ally.
0:22:35 That’s such a good parallel
0:22:38 because a lot of people think about resources in terms of the farmland that you have,
0:22:40 the people who are working in that economy.
0:22:43 But what you’re pointing out is that countries for a long time
0:22:46 have offered value or offered a resource in other ways.
0:22:48 And as we think about AI,
0:22:51 there’s a few things that you’ve pointed out that countries can invest in,
0:22:53 whether it’s the compute capacity that they have,
0:22:57 the energy resources to power AI and forward thinking policies.
0:22:59 So maybe we can break down each of those.
0:23:01 How do you think about each of those blocks
0:23:04 and how countries should be maybe maneuvering or investing in those things?
0:23:07 The good news is that there’s only three or four ingredients here that really matter.
0:23:09 The first is compute, which we’ve talked about.
0:23:12 The second is abundant and low-cost energy,
0:23:14 which powers the data centers.
0:23:17 The third is data, just the availability of really high quality
0:23:19 tokens for these models to learn on.
0:23:21 And the fourth is regulation.
0:23:22 So that’s the good news.
0:23:25 Now, the bad news is that the world is pretty unevenly split up.
0:23:28 Some countries just have dramatically more compute than others.
0:23:32 Others have dramatically more energy than others
0:23:34 because of their natural reserves.
0:23:36 So if you are in the Middle East,
0:23:39 you may not have massive data centers yet,
0:23:42 but what you do have is vast reserves of oil.
0:23:47 And how you translate that into becoming a hypercenter is quite simple.
0:23:49 It’s the law of comparative advantage.
0:23:50 You’ve got energy.
0:23:55 You should use that to attract the world’s best teams and companies
0:24:01 and foundation model labs and so on by trading what you have with what they have.
0:24:07 And so I’m quite bullish on allied ties between countries
0:24:10 that recognize what their strengths are
0:24:12 and then partner with other countries to fill that gap.
0:24:16 And by countries, I mean private companies too from other countries.
0:24:19 One of the things we may end up seeing in the coming years
0:24:21 is jointly trained models between countries.
0:24:24 Basically, I think for most countries, it’s impossible
0:24:27 to have total infrastructure independence at all parts of the stack.
0:24:31 What is much more feasible is to be great at one part of the stack
0:24:35 and then collaborate with another sovereign or another country or region
0:24:41 to achieve joint independence from a value system that you don’t subscribe to, like the CCP.
0:24:46 And so I actually think what’s more important is for countries, regions
0:24:50 and frankly, some of the world’s largest companies that operate at nation scale
0:24:54 to assess which parts of the stack are critical to them
0:24:56 that they must have independence from.
0:24:59 And the answer there is a function of what assets they already have, right?
0:25:00 It’s their strengths.
0:25:03 And then if they’ve got a critical gap to fill, it’s to go and buy that.
0:25:06 Now, in the long term, you might be able to build things out.
0:25:09 But with infrastructure, especially of this kind,
0:25:12 that can often take years, if not a decade-long timescale.
0:25:16 So as an example, lower down in the stack from the model layer, you have the chip layer.
0:25:19 And even below that, you have the lithography layer, right?
0:25:22 There’s a company in Holland called ASML
0:25:25 that builds literally the world’s most important machines.
0:25:26 How many machines do they make per year?
0:25:28 It’s some very small number.
0:25:30 Each machine costs about $200 million.
0:25:34 And I think they did 23 billion in revenue this year,
0:25:37 80% of which came, by the way, from China,
0:25:40 because China was stockpiling ASML machines
0:25:42 before a bunch of export restrictions kicked in.
0:25:44 And they’re the only company that can actually make these new lithography machines.
0:25:47 They’re the only company that can do EUV lithography of this precision.
0:25:51 Now, is it feasible for the US to say we’re going to build our own ASML like tomorrow?
0:25:53 No, I mean, it’s just going to take 10 plus years, right?
0:25:56 EUV lithography just takes a really long time.
0:25:59 On the other hand, is it feasible for a smaller country to say
0:26:02 we’re going to train our own local models at the frontier?
0:26:06 That’s a little bit easier to do over a timescale of quarters
0:26:09 if you’ve got a leading research team.
0:26:10 If.
0:26:11 If, and that’s a big if, right?
0:26:14 There’s only a handful really of research teams globally that are capable of this.
0:26:17 And so to answer your question, yes, I don’t think sovereign AI
0:26:21 or infrastructure independence means you have 100% ownership
0:26:24 over every part of the stack; that’s infeasible over the short term.
0:26:29 It means that you don’t rely on somebody for a critical part that you don’t trust.
0:26:30 Right.
0:26:32 Can we talk about private companies for a second?
0:26:33 Sure.
0:26:34 Because you’ve brought them up a few times.
0:26:38 How do you think about that dynamic where, as a nation state,
0:26:40 you’re saying, okay, you need this sovereignty.
0:26:41 Right.
0:26:44 But at the same time, can you rely on that sovereignty through the companies
0:26:47 that exist within your nation, just using America as an example?
0:26:48 Right.
0:26:50 Does the government really need to be involved
0:26:54 or can they just let Anthropic or OpenAI kind of command that part of the stack?
0:26:59 Or how do you think about the difference between government versus private enterprise?
0:27:04 The line is pretty stark in a few countries and is more blurry in others.
0:27:06 So in China, the line is very clear.
0:27:10 There’s a law called the PRC 2017 National Intelligence Law
0:27:15 that says Chinese individuals and entities are required
0:27:20 to support PRC national intelligence work by law,
0:27:26 which means if there’s any technology that a PRC company has access to,
0:27:29 they are automatically obliged to make that available to the government.
0:27:31 And that’s not the case in the United States.
0:27:32 Right.
0:27:36 There are some covered types of technology like dual use technology,
0:27:39 like classified defense technology, where if you are developing it,
0:27:42 particularly if you’re funded under a defense program,
0:27:44 then you’re required to make that available to government
0:27:46 because the government’s paying for the development of that technology.
0:27:47 Right.
0:27:49 But by and large, the private sector in the United States
0:27:52 and most other allied countries is by default protected
0:27:54 from having to make its technology available to the government.
0:27:56 It’s not the case in the CCP.
0:27:59 So I think that the question for most countries becomes:
0:28:01 where on that spectrum do you want to exist?
0:28:06 There’s a general framework through which most infrastructure is categorized.
0:28:09 Every country approaches it slightly differently,
0:28:13 but the G5, the Five Eyes, the US, Canada, UK, Australia, New Zealand,
0:28:18 we generally have a joint approach or framework to categorizing this infrastructure.
0:28:25 And by and large, AI models have not been categorized as being dual use
0:28:27 or protected under national security.
0:28:30 The short answer is the history of technology has largely shown
0:28:37 that if you’d like to win, then unlocking the best talents of a country
0:28:41 with as few bureaucratic slowdowns as possible usually ends up winning.
0:28:45 Well, if we think about wanting to keep America at the frontier
0:28:49 and we think about the different layers or ingredients that we talked about earlier,
0:28:54 are there any high-risk areas where we think, or where you think, we’re falling behind?
0:28:59 I think we go back to the four ingredients we talked about earlier of the frontier of AI,
0:29:03 which are compute, data, energy, and laws.
0:29:07 Now on the compute front, I think the private market in the United States is doing a pretty good job.
0:29:09 It’s pretty responsive to market demand.
0:29:13 And I think there’s no coincidence that the largest infrastructure businesses
0:29:16 in the United States are chip companies and computing companies,
0:29:21 because I think we’ve generally done a pretty good job of letting the market feed that demand.
0:29:25 I think on the data side, things are extraordinarily tough,
0:29:31 because one, the Biden executive order last year was a starting gun that said,
0:29:34 “Oh, AI is important. Please do something about it,”
0:29:36 and left it to the states to figure it out.
0:29:42 And the states have all taken a complete patchwork of approaches to data regulation.
0:29:48 In 2024 alone, I think there were more than 700 pieces of state-level legislation that were AI-specific,
0:29:53 and a bunch of those laws, if you look at it, are really well-intentioned,
0:29:57 but atrociously implemented ideas for data regulation.
0:29:59 Right, and impossible to adhere to.
0:30:00 Basically impossible to adhere to.
0:30:05 And so I think one area where we’re just handicapping ourselves is that there’s no unified framework
0:30:11 in the United States at the federal level yet for data, especially around training.
0:30:13 And I think we needed that yesterday.
0:30:19 Overseas in a number of countries where rule of law, especially on copyright and IP and so on,
0:30:24 is just less stringent, those labs are happy to just race ahead,
0:30:28 whereas our companies here are trying to figure out what they should even comply with,
0:30:31 and that greatly hurts you more than actually a laissez-faire approach.
0:30:33 I think our companies would be totally fine.
0:30:35 The best founders at the frontier of AI would be
0:30:37 fine in the United States being compliant.
0:30:39 They just want to be told what to comply with.
0:30:43 Not across 50 different states with different regulations that are changing and unclear,
0:30:44 and in some cases, impossible.
0:30:47 Right, and there’s also the fundamental scientific problem that
0:30:50 there are just very real data walls that these models run into.
0:30:54 And I do think one of the things that hurts frontier research in the United States
0:31:00 and allied countries is a lack of government support in collaborating across borders
0:31:03 to make more data available to allied regions.
0:31:04 So that’s number two.
0:31:08 On energy, I think we’ve obviously hamstrung ourselves in the United States with nuclear.
0:31:11 France’s embrace of nuclear 20 years ago, for example,
0:31:14 has positioned them to have extraordinarily efficient data centers today.
0:31:15 Whereas in the United States,
0:31:17 I think we’ve basically shot ourselves in the foot on that.
0:31:20 And then lastly, I think around inference regulation,
0:31:26 what we’re not doing enough of is making it clear who the liability rests on.
0:31:30 I’ve seen a number of proposals ahead of legislative sessions next year
0:31:36 that want to hold model developers liable for the outputs of the inference,
0:31:38 even if the misuse is being done by somebody else.
0:31:39 And what does that do?
0:31:42 That drives those very important developers elsewhere.
0:31:49 Essentially forces most startups to lose much needed ground to big tech companies
0:31:52 and that entrenches incumbents more.
0:31:55 So as we think about 2025, whether it’s in the US or elsewhere,
0:31:59 because I mean, this idea really is truly global.
0:32:00 What are you looking out for?
0:32:04 Or what should maybe let’s say a legislator or let’s say the head of a nation,
0:32:06 what should they be thinking about?
0:32:08 And what are you looking out for in some of those decisions?
0:32:13 Are you looking for countries that are buying GPUs or building out new energy centers?
0:32:14 What are you paying attention to?
0:32:16 The leading indicator is definitely compute.
0:32:21 If you think about the AI supply chain, the first mile starts at the data center.
0:32:24 That’s the new atomic unit of sovereignty, I would say, which is a new thing.
0:32:28 We’ve never actually had nation states think about atomic units of an AI data center
0:32:30 as a thing that countries should be purchasing.
0:32:36 And I think about 24 months ago, we started seeing nations reason about that first mile as being important.
0:32:42 So you’ve had an enormous amount of Nvidia’s purchasing orders come from the balance sheet of governments.
0:32:48 Just unprecedented demand they’ve been seeing from nation states realizing that they want to be hyper centers.
0:32:55 And that starts with them placing orders 12 to 36 months in advance to take delivery of GPUs
0:32:58 because if you don’t get in front of that line, it’s over.
0:33:00 You’re getting it after everybody else.
0:33:02 So that was step one.
0:33:05 The second thing I look for is founders who are deeply both technical,
0:33:11 who often come from deep research backgrounds and scientists who’ve led frontier model development already,
0:33:13 often inside of large hyperscaler labs.
0:33:16 So an example is Arthur Mensch, who started Mistral.
0:33:21 He worked at DeepMind. Or Guillaume Lample, who led the initial Llama family at Meta,
0:33:27 who are deeply mission led and believe that they can help solve a bunch of these infrastructure problems for the world’s largest governments.
0:33:33 So there’s a new class of founder who’s both primarily technical and has their training in academia,
0:33:38 but is motivated to solve all the really hard problems that come with having to deliver,
0:33:44 solve a bunch of these infrastructure problems for really large nation states and regions.
0:33:51 But I think if you’re a founder like that who has the guts, then your impact on humanity ends up being quite generational.
0:33:56 But now let us convince you that the future of AI may not be so straightforward.
0:34:02 Instead of models running in the cloud, perhaps you’re bound for a future where many more applications will run on device.
0:34:07 I expect smaller on-device AI models to dominate in terms of volume and usage.
0:34:13 This trend will be driven by use cases as well as economic, practical and privacy considerations.
0:34:17 That was Jennifer Li, a16z general partner on the infrastructure team.
0:34:19 Here’s her big idea.
0:34:25 My big idea is on-device and smaller generative AI models will become more popular in the next year.
0:34:30 If you’re a frequent user of Uber, Instacart, Lyft, Airbnb applications,
0:34:33 I’m sure there are many, many machine learning models already running on your device.
0:34:41 Very easily when you load up an Uber screen, it’s 100 models that are coordinating routes and giving you a real-time price.
0:34:45 What I’m more referring to is the generative models that are creating
0:34:52 image, voice, video. They will become more prevalent in the same way, running on device and within your applications,
0:34:55 similar to these other traditional machine learning models.
0:34:58 The models that we’ve seen in the last few years do take a lot of compute.
0:35:04 Can you square that with how much compute we can get from something like a smartphone and also these models,
0:35:07 whether they’re getting smaller or how this kind of comes together?
0:35:10 Yeah, first, never underestimate the compute power.
0:35:16 Our smartphone is probably as powerful as a computer 10 or 20 years ago, thanks to Moore’s law.
0:35:23 At the same time, take the models, especially smaller sizes such as two billion or eight billion parameter models.
0:35:29 That’s enough compute for them to run on device, and they can generate and create a very robust experience already,
0:35:31 be it text or image or audio.
0:35:37 And some of these models, if they’re diffusion models, can be intrinsically smaller than large text models and still be very capable.
0:35:44 And there’s another new set of tooling and also technology developed around distillation: if you have a very powerful large model,
0:35:51 it can be distilled to a smaller parameter size model and still maintain a lot of the capabilities the large model contains.
0:35:56 So both on the infrastructure side and also on the device compute power side,
0:36:00 it’s a perfect setup for the smaller models to be more popular.
0:36:02 Totally. So I’m hearing a few things.
0:36:05 I’m hearing that the smartphones are becoming more powerful.
0:36:10 Some of these models are becoming more efficient, but that kind of brings us to the question of why?
0:36:12 So why would we want to run these models on device?
0:36:15 What are the advantages of that and also the disadvantages?
0:36:22 As consumers and day-to-day users, we’re already spoiled by real-time and very performant applications.
0:36:25 If you’re talking to a chatbot, if you’re talking to a conversational AI,
0:36:30 if you’re adding filters to your video and images on Instagram or TikTok,
0:36:33 you don’t want to wait for multiple seconds to load a new filter.
0:36:37 You don’t want to wait for multiple seconds for the chatbot to respond to you.
0:36:42 Those are many real use cases that can really delight and improve user experience.
0:36:44 Also optimization for compute.
0:36:50 There are a lot of harder, more complex questions or video processing that requires going into the cloud.
0:36:58 But largely, if it’s again changing user experiences and improving the visual and sound effect of things,
0:37:02 it doesn’t have to route through multiple servers going through a network.
0:37:10 So both from a user experience and efficiency perspective, it’s a much better design to run some of the models on device.
0:37:13 And then the last part is just privacy.
0:37:16 Users do care. If my meeting notes are taken locally,
0:37:24 I probably would use this meeting note-taker much more often than if I knew some of the data was being sent to a server for processing.
0:37:30 That’s a lot of my private conversations. So it depends on the use case again for the application.
0:37:32 I think that also improves adoption.
0:37:37 Absolutely, and that has my wheels spinning for sure in terms of maybe this unlocks new applications.
0:37:41 So on that note, you mentioned a few already, but where might we see applications pop up
0:37:45 or where perhaps are we already seeing applications with these on-device models?
0:37:48 First to come to mind is real-time voice agents.
0:37:52 It’s a very popular topic and it’s something I’m very excited about.
0:37:55 We invested in and work very closely with this company called ElevenLabs,
0:38:02 and that’s one of the areas they’re also spending a lot of effort on: not just having the human-like synthetic voice
0:38:08 but being able to handle conversations fluently with end users and to get the latency down
0:38:14 and also to think about what type of real-time exchanges you want to have with your AI companion,
0:38:18 your support agents, or any sort of life coach.
0:38:26 I think we do need to think about the modality and the latency in a much more, I guess, improved fashion.
0:38:34 So I won’t be surprised if some of those inference workloads are running locally in the next 12 to 18 months.
0:38:39 Absolutely, and as we think about how maybe these different models also interact with other parts of a smartphone,
0:38:44 let’s say the camera, do you expect this to also maybe change user behavior and what we can do?
0:38:52 100%. You can actually reimagine the AR experience: if I point a camera at this room
0:39:00 and I want to see a new surface and wallpaper and furniture, the technology is already there.
0:39:07 We can actually leverage both generative AI and the camera and also prompting interaction
0:39:12 to create new experiences already of how we interact with real physical life.
0:39:17 And that’s where I also think a lot of on-device models will play a big role
0:39:21 of how to interact with the 3D world, how to interact with physical world,
0:39:25 and not just using the camera for capture but also using it as a projector.
0:39:32 Definitely, and let me ask you about economics then because a lot of the models that exist today do rely on inference
0:39:35 and sending that inference up to the cloud and that costs money.
0:39:42 Do the economics change if you all of a sudden have these models running on device on the smartphone compute that already exists?
0:39:47 Do the economics actually shift or can we come up with new ways of monetizing in this new world?
0:39:51 Yeah, it’s a great question and I honestly don’t really have the answer
0:39:56 because even for larger models, the inference price has been dropping really significantly.
0:40:00 There are further optimizations to be done. If it’s a very workload-intensive compute,
0:40:04 let’s say using your computer or phone, I think we’ll still have economic benefits
0:40:12 but I don’t think it’s a very direct answer of it’s going to substantially reduce infrastructure costs for some of these applications.
0:40:16 But architecting and structuring sort of the whole tool chain,
0:40:22 it does change sort of economics on the developer efficiency and sort of iteration speed.
0:40:28 There are pros and cons: when shipping in the cloud, you can launch more continuously. On device
0:40:34 has its own challenges because you’ll have to go with the updates with the application and with hardware.
0:40:42 So there is that side of economics that I think will have impact from how teams are being structured in launching models in a hybrid mode.
0:40:47 So I would encourage teams who are thinking of leveraging this technology to consider it more holistically.
0:40:52 Super interesting. And as we think about that world, are there any players that you think really succeed here?
0:40:59 Like in one sense I could see maybe the phone manufacturers, I could also see maybe the manufacturers of wearables
0:41:05 being able to introduce all kinds of new applications, think of wearing an Apple Watch, Fitbit, Whoop, things like that.
0:41:11 Is it Nvidia that benefits in some way? Who do you think actually benefits from this idea of the models becoming more efficient
0:41:15 and these on-device models becoming a thing?
0:41:24 Right now I’ve seen more interest and enthusiasm from the hardware development side, whether it’s chips, it’s the phone makers.
0:41:34 I do think there’s also a lot of interest from the model developers as well of just like proliferating the model adoption across different setups and devices.
0:41:38 But I think over the long run it’s probably going to impact the whole supply chain.
0:41:44 We talked about some of these macro trends throughout. How do you specifically see those trends shaping up in 2025?
0:41:47 And is there anything in particular that you’re putting your eye toward?
0:41:54 This will sound more like a consumer investor than like a hardcore infra investor, but I am very excited about the mixed reality
0:42:01 where generative models, 3D models, video models really again mix the reality of what we’re seeing today
0:42:04 through the camera lens, through the microphones,
0:42:10 into a much more creative world, even when sitting at home or when going on a ride.
0:42:13 That’s the type of experience I’m very much looking forward to.
0:42:18 I think the foundation model technology is pretty mature, the infrastructure is getting ready.
0:42:22 So I’m personally very excited about sort of the new consumer experience.
0:42:28 All right, I hope these big ideas got you geared up and ready for 2025.
0:42:30 Stay tuned for parts 3 and 4.
0:42:38 And again, if you’d like to see the full list of 50 big ideas, head on over to a16z.com/bigideas.
0:42:39 It’s time to build.
0:42:41 [MUSIC PLAYING]

The AI race is on, and 2025 could be its most transformative year yet.

In this episode, a16z General Partners Anjney Midha and Jennifer Li, and Partner Alex Immerman dive into the trends reshaping AI and its impact on search, infrastructure, and devices.

We explore:

  • How AI-native tools like ChatGPT and Perplexity are challenging Google’s search dominance.
  • The rise of infrastructure independence as governments prioritize compute, energy, and data.
  • The future of smaller, on-device AI models driving privacy, performance, and new consumer applications.

With insights from a16z’s Growth and Infrastructure teams, this episode unpacks the forces driving AI innovation—and the opportunities founders and nations could seize to lead in the next wave of technology.

Stay tuned for more in this four-part series, and explore the full 50 Big Ideas for 2025 at a16z.com/bigideas.

Resources: 

Find Alex on X: https://x.com/aleximm

Find Anjney on X: https://x.com/AnjneyMidha

Find Jennifer on X: https://x.com/JenniferHli

Stay Updated: 

Let us know what you think: https://ratethispodcast.com/a16z

Find a16z on Twitter: https://twitter.com/a16z

Find a16z on LinkedIn: https://www.linkedin.com/company/a16z

Subscribe on your favorite podcast app: https://a16z.simplecast.com/

Follow our host: https://twitter.com/stephsmithio

Please note that the content here is for informational purposes only; should NOT be taken as legal, business, tax, or investment advice or be used to evaluate any investment or security; and is not directed at any investors or potential investors in any a16z fund. a16z and its affiliates may maintain investments in the companies discussed. For more details please see a16z.com/disclosures.
