a16z Podcast
Inferact is a new AI infrastructure company founded by the creators and core maintainers of vLLM. Its mission is to build a universal, open-source inference layer that makes large AI models faster, cheaper, and more reliable to run across any hardware, model architecture, or deployment environment. In this episode, Matt Bornstein, Simon Mo, and Woosuk Kwon break down how modern AI models are actually run in production, why “inference” has quietly become one of the hardest problems in AI infrastructure, and how the open-source project vLLM emerged to solve it. The conversation also covers why the vLLM team started Inferact and their vision for a universal inference layer that can run any model, on any chip, efficiently.
Follow Matt Bornstein on X: https://twitter.com/BornsteinMatt
Follow Simon Mo on X: https://twitter.com/simon_mo_
Follow Woosuk Kwon on X: https://twitter.com/woosuk_k
Follow vLLM on X: https://twitter.com/vllm_project
Stay Updated:
Follow our host: https://twitter.com/eriktorenberg
Please note that the content here is for informational purposes only; should NOT be taken as legal, business, tax, or investment advice or be used to evaluate any investment or security; and is not directed at any investors or potential investors in any a16z fund. a16z and its affiliates may maintain investments in the companies discussed. For more details please see a16z.com/disclosures.
