As AI factories scale and token costs become a defining competitive variable, the way businesses measure infrastructure ROI needs to change. In this episode, Shruti Koparkar from NVIDIA’s Accelerated Computing team breaks down tokenomics—the four-pillar framework of token utility, supply, demand, and monetization—and reveals why NVIDIA Blackwell’s architecture delivers 50x more tokens per watt than NVIDIA Hopper, translating to a 35x reduction in token cost.
🔬 Topics covered:
The four pillars of tokenomics: utility, supply, demand, and monetization
Why cost per token beats FLOPS per dollar as an infrastructure metric
NVIDIA Blackwell vs. Hopper: 50x more tokens per watt, 35x lower token cost
How extreme co-design turns spec-sheet numbers into real-world output
Jevons paradox: why lower token cost always drives more GPU demand, not less
The four business models for turning tokens into revenue
Chapters:
00:00 – Introduction and the four pillars of tokenomics
02:09 – Token value: intelligence, interactivity, and use case mapping
06:32 – Estimating token demand: users, reasoning, and agentic multipliers
10:00 – Token supply and why cost per token is the right infrastructure metric
13:12 – NVIDIA Blackwell vs. Hopper: 50x more tokens per watt, 35x lower token cost
14:52 – Extreme co-design for lowest token cost and the NVIDIA Vera Rubin platform
21:10 – How software multiplies hardware performance (8x gains in six months)
23:56 – Token monetization: pricing and business models
26:52 – Jevons paradox and the future of GPU demand
