AI transcript
0:00:16 Hello, and welcome to the NVIDIA AI podcast. I’m your host, Noah Kravitz. Today we’re talking
0:00:23 about the future of collaboration in 3D. Universal Scene Description, OpenUSD, is revolutionizing
0:00:28 3D graphics and simulation, especially when you combine it with the latest in physical AI.
0:00:33 The technology is transforming industries from manufacturing to robotics. And here to explain
0:00:40 what OpenUSD is and why it works so well together with AI is NVIDIA’s Aaron Luk. Aaron is a
0:00:45 Director of Product Management for NVIDIA Simulation Technology, leading Universal Scene Description
0:00:51 ecosystem development. Aaron, welcome to the AI podcast. Hi, Noah. Good to be here. Great to have
0:00:58 you. Thanks for taking the time to join us. So let’s start kind of at the beginning and work our
0:01:03 way up, if you will. What is OpenUSD, and why does it matter so much? That’s right. So as you
0:01:08 mentioned, OpenUSD, the USD stands for Universal Scene Description. It’s a project that was open
0:01:15 sourced by Pixar Animation Studios in 2016. But it’s the result of decades of evolution in data
0:01:20 engineering at Pixar around, you know, basically 3D world building. Right. 3D world building among all
0:01:26 the sort of disciplines that it requires for filmmaking, but it generalizes quite beautifully to world building
0:01:31 in the industrial world and in the real world as well, too. So it’s an open source project that also
0:01:37 now is under the governance of the Alliance for Open Universal Scene Description, the AOUSD, in which we are
0:01:44 formalizing USD as industry standards with a lot of great partners. Fabulous. And so what are some of the
0:01:50 benefits? I mean, obviously having an open source standardized framework for describing and working
0:01:55 with 3D worlds is great in and of itself. But what are some of the particulars about OpenUSD that make it
0:02:01 really great to work with? So the really interesting thing about USD is that it’s designed to bring lots of
0:02:08 different types of data sources together. This capability is called composition within USD. And every document
0:02:13 in USD is called a layer. So when you bring all these things together, you have this network of layer stacks
0:02:19 within USD that presents itself as a holistic composed scene graph. And every object in that
0:02:24 scene graph is a 3D object that you can use for moviemaking, but also for industrial layout and
0:02:31 design. And the power of USD is that all of that is abstracted from the actual data source and the actual data
0:02:37 serialization, the actual formats. And this was a boon within Pixar, because like I said, every type of artist
0:02:43 within Pixar, whether they’re doing modeling, animation, effects, animation with physics and other
0:02:47 simulation, lighting, all that kind of stuff, they might have different tools, they might have different
0:02:52 ways of working with things. And what Pixar did was they kind of unified them all around these common
0:03:00 data models in USD. So they present themselves as schemas in USD, where every object in USD has a typed
0:03:06 schema, for example, a mesh for the shapes that you define in USD. And you can also add applied schemas
0:03:12 onto those geometries, like physics APIs to imbue them with collision properties and things like that.
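To make the schema idea concrete, here is a minimal sketch using OpenUSD’s Python bindings (pxr); the file name and prim path are illustrative:

```python
# Typed and applied schemas in OpenUSD. File and prim names are illustrative.
from pxr import Usd, UsdGeom, UsdPhysics

stage = Usd.Stage.CreateNew("factory.usda")

# Typed schema: every prim has a type, e.g. Cube for a simple shape.
crate = UsdGeom.Cube.Define(stage, "/World/Crate")
crate.GetSizeAttr().Set(2.0)

# Applied schemas: imbue the same geometry with physics properties.
UsdPhysics.CollisionAPI.Apply(crate.GetPrim())
UsdPhysics.RigidBodyAPI.Apply(crate.GetPrim())

stage.GetRootLayer().Save()
```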
0:03:18 Yeah, but it’s all abstracted in USD, and all this data can live in separate layers. And those
0:03:23 layers, they can live on disk as files, or they could be in the cloud as files, or they can be
0:03:28 populated from databases or even dynamically generated, right? So you can see where I’m going here:
0:03:35 this makes it a really nice fit for not just the sheer volume of data, but the variety of types
0:03:41 of data that are flowing into industrial digital twins. It’s really, really exciting to see. Certainly, even within
0:03:45 filmmaking, like you’re already bringing lots of different types of tools together. But in the industrial
0:03:53 world, that’s even more expansive between all the CAD tools that can feed USD. And then as you expand that
0:03:59 out into those kind of industrial workflows, like product lifecycle management, facility design and
0:04:04 planning, all the way up into like operational twins, right? In which you actually have a physical
0:04:10 facility and also the digital version of it that’s tracking all the things that are happening with the
0:04:13 equipment and the robots that are in a facility and so on.
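As a concrete illustration of the layering Aaron describes, here is a minimal sketch in the OpenUSD Python API; the layer names are hypothetical, and real layers could just as well resolve from disk, a cloud URI, or a database through an asset resolver:

```python
# Composition via sublayers; layer names are hypothetical.
from pxr import Usd, Sdf

root = Sdf.Layer.CreateNew("twin.usda")
# Earlier entries in subLayerPaths are stronger: overrides win over base.
root.subLayerPaths.append("robot_overrides.usda")
root.subLayerPaths.append("factory_base.usda")

# The composed stage presents one holistic scene graph, abstracted
# from where each opinion actually lives.
stage = Usd.Stage.Open(root)
```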
0:04:20 Yeah, the little bit of exposure I’ve had to OpenUSD has been through industrial projects, right? So that
0:04:25 whole world of just operating these, you know, physically accurate and pixel perfect simulations,
0:04:30 digital twins of a factory or industrial site is just amazing. One of the things to me that’s really
0:04:36 cool about OpenUSD, as I understand it, is that we can be working on different layers, collaborating,
0:04:40 but sort of working without getting in each other’s way. And it’s non-destructive, as I understand it.
0:04:45 Yeah, that’s right. So again, let’s take the filmmaking analogy, right? Where multiple artists might be
0:04:50 working on the same shot, maybe not at the same time, but certainly in different layers of that shot.
0:04:56 And that way, right, a layout artist can make like the basic layout of where the characters start in a
0:05:01 shot. And then a character animator can add, you know, all the expressiveness on top of that
0:05:06 layout and so on and so forth, right? And those same principles apply to industrial design and layout,
0:05:11 right? Where a layout person, a planner, might do some initial factory layout, but then someone who
0:05:16 is really specializing in a particular work cell within that factory might iterate on top of like,
0:05:21 you know, the base layout of the entire factory in there. And then the person who’s working on the
0:05:26 robot arm within that work cell might like, you know, add more details on top. And so what we mean by
0:05:30 non-destructive is that everyone who’s working on that, like their work is still preserved somewhere
0:05:35 in the layer stack, right? And so you’re overriding each other’s work, adding to it and
0:05:40 tweaking it accordingly. But everyone’s work is preserved. So you can always sort of look exactly
0:05:45 at what someone did and what kind of changes were made on top and that kind of thing. So like really
0:05:50 quite a boon to industrial workflows where, you know, everyone has a part to play and
0:05:57 everyone’s adding something, right?
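Here is a minimal sketch of that non-destructive pattern in the OpenUSD Python API, using the session layer as the stronger layer; prim names are illustrative:

```python
# Non-destructive override: the specialist's edit lands in a stronger
# layer (here the session layer); the planner's original opinion stays
# intact in the root layer. Names are illustrative.
from pxr import Usd, UsdGeom, Gf

stage = Usd.Stage.CreateInMemory()
cell = UsdGeom.Xform.Define(stage, "/Factory/WorkCell")
api = UsdGeom.XformCommonAPI(cell.GetPrim())
api.SetTranslate(Gf.Vec3d(0.0, 0.0, 0.0))   # planner's base layout

# Specialist edits in the stronger session layer.
stage.SetEditTarget(Usd.EditTarget(stage.GetSessionLayer()))
api.SetTranslate(Gf.Vec3d(5.0, 0.0, 0.0))   # composed value is now (5, 0, 0)
# Both opinions are preserved; the base value still lives in the root layer.
```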
0:06:03 Right. Absolutely. So how does OpenUSD work with AI and, in particular, how can it accelerate the development of physical AI? Yeah, I think OpenUSD is a really
0:06:09 good fit for physical AI because OpenUSD already has, you know, the native 3D paradigms, and
0:06:14 because of the flexibility of its data model and the composability of different data
0:06:20 sources, right, it’s a great way to describe, you know, the worlds in which physical
0:06:26 AIs are operating, in particular for training robots. So with USD, right, you could have your robot that’s
0:06:33 described in USD, and maybe it’s translated from URDF or MJCF, any number of robotic formats that can
0:06:39 be mapped to USD via schemas. But the world that the robot is in is also within USD. And then
0:06:45 that can come from CAD or it could be made in house and so on and so forth. What USD does is it gives you this
0:06:52 unified, holistic way of simulating the world for physical AI to create those environments for robots to learn
0:06:59 how to navigate, how to respond to different scenarios. And then it plays really nicely with technologies like
0:07:07 NVIDIA Cosmos as well, right? Where you have your baseline scenarios in USD and you can start some synthetic
0:07:12 data generation to vary different objects in the scene and vary different scenarios in them. But then you can
0:07:20 vary even more conditions in Cosmos, like time of day and weather conditions and things like that. So
0:07:29 before you know it, you basically have just a rich, comprehensive set of scenarios that the robot can
0:07:35 learn from, right? And this is all being rendered through something like Sensor RTX in Omniverse, right?
0:07:39 So that you get, you know, like you said, it’s pixel perfect, but it’s pixel perfect for sensors,
0:07:45 right? So it’s physically accurate in simulating what the physical sensor on the robot would see in the
0:07:51 real world. And that robot is effectively seeing vast amounts of scenarios to be able to learn from
0:07:54 before it’s ever deployed into the physical world.
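To ground this, here is a small sketch of how such a training world might be assembled with the OpenUSD Python API; the asset file names are hypothetical stand-ins for a CAD-derived environment and a converted robot:

```python
# Assembling a training world: environment and robot each come from
# their own USD assets (the robot perhaps converted from URDF or MJCF).
# Asset paths are hypothetical.
from pxr import Usd, UsdGeom, Gf

stage = Usd.Stage.CreateNew("training_scene.usda")

env = stage.DefinePrim("/World")
env.GetReferences().AddReference("warehouse_from_cad.usd")

robot = UsdGeom.Xform.Define(stage, "/World/Robot")
robot.GetPrim().GetReferences().AddReference("robot_from_urdf.usd")
UsdGeom.XformCommonAPI(robot.GetPrim()).SetTranslate(Gf.Vec3d(2.0, 0.0, 0.0))

stage.GetRootLayer().Save()
```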
0:08:01 So when we’re talking about robots, autonomous vehicles, things that we’re simulating to sort of prepare to
0:08:08 deploy in the real world, right? Can you explain what the sim-to-real gap is and how OpenUSD plays
0:08:11 into, you know, helping solve that for physical AI training?
0:08:17 Yeah, sure. So the sim-to-real gap is basically what I just mentioned, right, where what the robot sees in the
0:08:23 virtual world should match what it would see in the real world, right? And so the big aspects there are
0:08:30 the physics simulations. So you need really good solvers to run through all the rigid bodies and
0:08:34 all the sort of things that happen in the real world. And then you also need to visualize it
0:08:40 in the same way that a sensor would perceive it as well. So those are some of the key aspects to fill
0:08:45 in. Obviously, there’s always lots of great physics research that’s going on, even outside of the
0:08:49 computer graphics community, right? Physicists are always learning more about how the world works,
0:08:54 right? But the cool thing is, like, AI is also learning that too, right? And that’s another
0:08:59 place where Cosmos comes in, right? If what you’re trying to do is
0:09:04 simulate what the robot is seeing, right, AIs can sort of recognize the patterns and,
0:09:09 again, vary things accordingly, so that you don’t even have to simulate in 3D
0:09:15 anymore. When we’re talking about generating simulations like this, is capturing and replicating,
0:09:21 you know, the way physics works one of the trickier aspects, or the trickiest aspect
0:09:25 of it? Or what are some of the big hurdles that have to be cleared? Or perhaps that, you know,
0:09:34 the growth of OpenUSD has helped us clear recently. I think what OpenUSD is doing is more giving folks a
0:09:40 common place to unify their data models, right? So every physics solver is going to have different
0:09:45 behaviors, different trade-offs for the kind of performance characteristics that they’re trying to hit.
0:10:02 But what USD is helping with is, like, how can we standardize the inputs to those solvers, right? Such that you can run even multiple solvers for multiple aspects of your scene for co-simulation. That way, you can have
0:10:09 multiple physics solvers and engines all operating on different parts of your scene, just like in the real world, right?
0:10:36 You wouldn’t have the same physics solver for locomotion as you do for grasping things, necessarily. And certainly for all the things that robots are doing and what other machines are doing within the factory, right? There’s all sorts of different things. And what USD is doing is providing this framework around, like, okay, can we converge on the inputs that are common to these kinds of physical operations, like rigid body collisions, like soft body collisions, and that kind of stuff.
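A rough sketch of what standardized inputs look like in practice: any solver participating in co-simulation can traverse the stage and read the same UsdPhysics data. The scene path is a placeholder:

```python
# Reading standardized physics inputs from a stage: any solver can
# query the same UsdPhysics opinions. "scene.usda" is a placeholder.
from pxr import Usd, UsdPhysics

stage = Usd.Stage.Open("scene.usda")
for prim in stage.Traverse():
    if prim.HasAPI(UsdPhysics.RigidBodyAPI):
        body = UsdPhysics.RigidBodyAPI(prim)
        print(prim.GetPath(), "kinematic:",
              body.GetKinematicEnabledAttr().Get())
    if prim.HasAPI(UsdPhysics.CollisionAPI):
        print(prim.GetPath(), "participates in collisions")
```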
0:11:05 Right. We mentioned earlier, I mean, we’ve been mentioning throughout, but specifically talked a little bit about digital twins and industrial AI earlier. And I referenced, actually, we had Siemens on the podcast not too long ago talking about this. Obviously, lots of NVIDIA customers, Lowe’s is another one, are building digital twins, these replicas of factories or cities or perhaps even retail sites, using OpenUSD, using AI to, you know, as you’ve been talking about, simulate scenarios, but to optimize operations.
0:11:25 And, you know, do these other things that are driving innovation right across industrial use cases. You talked a little bit, and I think you referred to this, I was thinking about synthetic data when you were talking about simulations. But how does OpenUSD specifically play a role in creating digital twins and generating synthetic data for these industrial use cases?
0:11:47 Yeah. So because, again, USD already has all these 3D capabilities for describing virtual worlds, right, it’s a good fit to vary, you know, existing USD objects, right, to produce synthetic variations of, you know, baseline objects, like the things that you’re manufacturing in a factory, or the things that you’re selling in a retail space, and so on and so forth.
0:12:03 And because USD can kind of adapt to any data, or any data model can be adapted to USD, right, kind of anything that you want to vary, certainly if it manifests itself in physical appearance and shape and things like that, is a natural fit in USD.
0:12:25 But even beyond that, right, there might be other things that you want to vary, like I said, like weather conditions and things like that, right, where I’m sure that there are simulations that we probably haven’t even thought of yet. But I know that they’ll be expressible in USD because, you know, you’ll be able to define a schema around which those simulations can be described and the inputs to those simulations, that kind of thing.
0:12:34 So, again, it’s that kind of extensibility of USD that makes it a really nice fit for, you know, synthetic data, like variations that we haven’t thought of.
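As a toy illustration of that kind of variation, here is a sketch that jitters the placement of matching objects and exports the result as a new layer; in production this is the territory of tools like Omniverse Replicator, and all the names here are invented:

```python
# Toy variation pass: jitter every prim named "Crate" and export the
# result as a new layer. Variation is just authoring new opinions
# on USD attributes. Paths and names are invented.
import random
from pxr import Usd, UsdGeom, Gf

stage = Usd.Stage.Open("baseline_scene.usda")
for prim in stage.Traverse():
    if prim.GetName() == "Crate" and prim.IsA(UsdGeom.Xformable):
        jitter = Gf.Vec3d(random.uniform(-1, 1), 0.0, random.uniform(-1, 1))
        UsdGeom.XformCommonAPI(prim).SetTranslate(jitter)
stage.GetRootLayer().Export("variation_001.usda")
```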
0:12:43 Right. And along those lines, I would imagine it’s got to be beneficial when you’re building or even scaling out pipelines to deal with synthetic data.
0:12:51 And even in these situations, like you said, we haven’t imagined the thing that we want to simulate, you know, down the road, but you know it’ll be expressible.
0:13:04 Yeah. So, Aaron, you talked a little bit earlier about some standards emerging out of OpenUSD that are making it easier for people, you know, to work on different projects in different places; standards just make things easier for folks.
0:13:13 Are some of the standards emerging out of OpenUSD specific to working in physical AI, making things easier maybe across industries?
0:13:22 Yeah, I kind of think all of the standardization that’s happening in USD will eventually funnel towards some sort of physical AI use case.
0:13:31 And standards are particularly important because I think they’re the bridge to what I was mentioning earlier, where USD is great because it’s so flexible, right?
0:13:38 And so adaptable to lots of different domains, lots of different use cases, and that’s why we’re seeing such large adoption around it.
0:13:41 But the flip side of flexibility is ambiguity.
0:13:42 Right, right.
0:13:49 And what standards really do is empower you with the flexibility but sort of remove the ambiguity, such that we’re all sort of rolling in the same direction.
0:13:55 And so USD is very open in how you express things, and it’s great for that.
0:14:04 But here’s where standards come in: for example, USD allows you to express transforms in an arbitrary number of ways, which is very powerful.
0:14:17 But if you want to use it for physical AI, right, you might want to simplify the transform stack that your USD object has so that your physical AI kernels have less complexity to reason about and things like that.
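One concrete example of that simplification, sketched with the OpenUSD Python API: UsdGeom.XformCommonAPI constrains a prim to a single canonical translate/rotate/scale stack rather than an arbitrary chain of xformOps. Names are illustrative:

```python
# Simplifying the transform stack via UsdGeom.XformCommonAPI:
# one canonical translate/rotate/scale stack per prim.
from pxr import Usd, UsdGeom, Gf

stage = Usd.Stage.CreateInMemory()
prim = UsdGeom.Xform.Define(stage, "/World/Conveyor").GetPrim()

api = UsdGeom.XformCommonAPI(prim)
api.SetTranslate(Gf.Vec3d(1.0, 0.0, 0.0))
api.SetRotate(Gf.Vec3f(0.0, 90.0, 0.0))   # fixed XYZ rotation order
api.SetScale(Gf.Vec3f(1.0, 1.0, 1.0))
```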
0:14:22 So you can envision USD as itself a stack, a multi-part specification, right?
0:14:25 And at the core of it is the core specification.
0:14:29 So in the AOUSD, I’m serving as chair of the core specification working group.
0:14:38 And that is really where we are normatively specifying the most novel aspects of USD at its foundation: the ability to compose data together.
0:14:41 So, you know, what are the specific data models that feed the composition engine?
0:14:43 What’s the algorithm for composition?
0:14:54 And then how do you take that composed scene graph and issue predictable queries on it? Say, when you traverse that scene graph, you’re able to get predictable answers on what are all the objects and what are all the properties of each object.
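In code, such a predictable query might look like this minimal traversal sketch; the stage path is a placeholder:

```python
# A predictable query over the composed scene graph: traversal sees
# every object uniformly, whatever layer or source format it came from.
from pxr import Usd

stage = Usd.Stage.Open("twin.usda")   # placeholder path
for prim in stage.Traverse():
    print(prim.GetPath(), prim.GetTypeName())
    for prop in prim.GetProperties():
        print("   ", prop.GetName())
```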
0:14:58 Everything beyond that, you can kind of think of USD as a standard of standards.
0:15:07 So there’s already quite a lot of standards in the industrial space for CAD, for product lifecycle management, for geometry, and all those kinds of things.
0:15:11 In the operational space, there’s OPC UA and the Web of Things.
0:15:20 And what I described before as USD schemas, you can think of those as, like, mappings of those existing data models into USD.
0:15:20 Right, right.
0:15:34 And so as we build out, you know, this stack of standards, it’s about mapping other standards into USD, right, such that USD is speaking all these other data models that exist, but presenting them to you in this holistic way.
0:15:41 And that’s sort of what physical AI needs in particular, because you need to be able to describe everything that’s happening in the real world.
0:15:50 And the real world does have a lot of these standards that exist as well for physical objects, but in particular around equipment in your facilities.
0:15:54 And there are already specs for that equipment and that kind of stuff.
0:15:58 So a lot of the standards work is, like, mapping those existing standards into USD.
0:15:59 Into USD, right.
0:15:59 Yeah.
0:16:00 Makes sense.
0:16:02 You mentioned the working group that you’re a part of.
0:16:05 You were at Pixar previous to joining NVIDIA?
0:16:06 Yeah, that’s right.
0:16:09 And were you working on the development of USD back then?
0:16:10 Oh, yeah.
0:16:14 I was actually one of the original two developers on USD.
0:16:23 It started off as a pair programming project, like, taking some existing technologies at Pixar, particularly the composition engine from the animation package,
0:16:32 as well as the scene cache format that was being used to move data between departments, between tools at Pixar, and sort of marrying them into a single paradigm.
0:16:32 Very cool.
0:16:33 Yeah.
0:16:35 And how long ago did it get started?
0:16:37 When was OpenUSD first taking shape?
0:16:46 So the actual USD project, I think, started in, like, 2012-ish or so, yeah, around that time.
0:16:52 But the technologies underpinning it date back at Pixar to probably A Bug’s Life.
0:17:01 But, you know, pretty much right after they wrapped Toy Story 1, they were already thinking about, you know, how can we better organize this data across our departments?
0:17:02 And, yeah.
0:17:07 So the composition engine started, I think, probably around 2005 or so.
0:17:17 But even then, the concepts for the composition, certainly sort of the referencing and, like, the non-destructive workflows type stuff, have been around at Pixar for decades.
0:17:18 Yeah, yeah, neat.
0:17:24 How, I don’t know, looking back on, you know, almost 15 years now, I guess, of USD, how has it evolved?
0:17:25 How have you seen it change?
0:17:33 Are there things that you, you know, I don’t know, maybe you didn’t think of when it first got going that now you’re like, wow, I’m so glad that, you know, that that came to be?
0:17:34 Oh, yeah.
0:17:36 It’s been evolving quite quickly.
0:17:50 I certainly didn’t envision all of this industrial adoption at the time because what excited me at the time was more seeing how many of those concepts mapped to what other movie studios, both in visual effects and animation, were doing, right?
0:17:55 Certainly going to SIGGRAPH at the time, I would attend pipeline talks and hear about similar concepts, right?
0:18:09 So it’s great now to be on calls with ISVs and customers and hearing about their ways of working and really, like, showing them how it maps to this USD way of working, of, like, kind of having the data really travel all along your workflows.
0:18:21 And, you know, really rethinking what we mean when we say pipeline, right? I think in the industrial world, it’s a little bit more that the source data kind of gets hard-exported, you know, between disciplines and things like that.
0:18:27 And, you know, the original still kind of exists, but you’ve kind of lost that link to it over the course of how that data travels.
0:18:33 And USD kind of allows that data to travel, and you’re adding to it as you go along, just like you do in a real assembly line.
0:18:33 Right.
0:18:40 When I was working on USD, it didn’t have this notion of API schemas, which I think are super powerful.
0:18:45 So that’s how, like I said, you can add additional annotative properties to existing objects.
0:18:51 That’s how your shapes also become physically simulatable objects for rigid bodies and other simulations.
0:19:01 And this is how we have also added semantic labels onto objects as well, which is really key for machine learning and segmentation of the scene, those kind of things.
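Core USD doesn’t ship a semantics schema of its own; tools such as Isaac Sim apply one on top. As a stand-in for the pattern Aaron describes, here is a sketch that annotates an existing prim with a namespaced label attribute; the paths and attribute name are invented:

```python
# Annotating an existing prim with a semantic label, without touching
# its geometry. The attribute name here is a stand-in, not a core
# USD schema; tools like Isaac Sim layer a real semantics schema on top.
from pxr import Usd, Sdf

stage = Usd.Stage.Open("scene.usda")          # placeholder path
prim = stage.GetPrimAtPath("/World/Crate")    # placeholder prim
label = prim.CreateAttribute("semantic:class", Sdf.ValueTypeNames.String)
label.Set("crate")
```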
0:19:01 Yeah.
0:19:02 Very cool.
0:19:08 So we’ve talked about USD, OpenUSD, and, you know, I was going to say all of the upsides.
0:19:19 We’ve only hit some of the upsides, but it’s vast across all of these different industries and situations we’ve been talking about: the power of having digital twins and collaborative 3D simulations and such.
0:19:20 How do you get started?
0:19:22 I’m listening to the podcast.
0:19:24 I’m listening to you, Aaron, talk about this.
0:19:28 I’m like, wow, this sounds exactly like what we need, but how do we... I don’t know where to begin.
0:19:29 USD, what do I do?
0:19:33 What does it even mean to get started with USD and how would one go about that?
0:19:34 Yeah, sure.
0:19:42 So just like you can get started with AI on NVIDIA’s Deep Learning Institute, we also have Learn OpenUSD on the Deep Learning Institute.
0:19:54 So that’s a growing curriculum of, you know, hands-on, self-paced courses that start with, you know, really the basic foundational principles of OpenUSD, and we’re always adding more courses to it over time.
0:20:09 And that’s a really good way to get yourself grounded and really learn the skills that you need to contribute to USD and develop these pipelines that are so key to physical AI, to moving data around into the unified worlds that physical AI needs.
0:20:22 And that path, too, leads you to a new USD certification program, for which, you know, this DLI curriculum is designed to get folks certified, just like you can get certified as an AI developer as well.
0:20:23 Fantastic.
0:20:33 And that’s the way you can really, you know, distinguish yourself and get hands-on and learn USD for, you know, any number of domains and use cases.
0:20:39 And so somebody could go to NVIDIA DLI, Deep Learning Institute, and get started learning OpenUSD.
0:20:39 Yep.
0:20:40 Fantastic.
0:20:41 Yeah.
0:20:43 And then, of course, USD is open source as well.
0:20:47 So it’s got a GitHub repository, which has, you know, its own set of issues.
0:20:54 And lots of great folks in the community have been labeling issues, as they triage them, as good first issues.
0:21:04 And that’s a way, too, as like a new developer, to get hands-on with USD and contribute to it directly by fixing a bug or improving documentation and that kind of thing.
0:21:05 Fantastic.
0:21:14 And for someone who’s more versed in the 3D space, a designer, developer, but not necessarily a coder, do you need a coding background to get going with USD?
0:21:15 Not necessarily.
0:21:29 Especially now, too, where you can use copilots to issue prompts, to be like, please write me a Python script in USD to create a grid of, you know, nine boxes in a factory or something like that.
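For illustration, here is the kind of script such a prompt might yield; a copilot’s actual output will vary, and the spacing and names are arbitrary:

```python
# A 3x3 grid of boxes on a factory floor, as a copilot might sketch it.
from pxr import Usd, UsdGeom, Gf

stage = Usd.Stage.CreateNew("box_grid.usda")
UsdGeom.Xform.Define(stage, "/Factory")
for i in range(3):
    for j in range(3):
        box = UsdGeom.Cube.Define(stage, f"/Factory/Box_{i}_{j}")
        UsdGeom.XformCommonAPI(box.GetPrim()).SetTranslate(
            Gf.Vec3d(i * 3.0, 0.0, j * 3.0))
stage.GetRootLayer().Save()
```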
0:21:39 These are things that you can try, especially with Omniverse technologies, where I know some partners have integrated things like that into their experiences.
0:21:47 So, yeah, I think even without a coding background, just like the world of coding is evolving in general, so is the world of coding for USD, right?
0:21:51 And coding may mean refining prompts accordingly.
0:21:51 Right.
0:21:52 Everything’s changing.
0:21:55 Aaron Luk, this has been a fascinating conversation.
0:22:00 And for the little bit of exposure I mentioned I had to USD beforehand, I’ve certainly learned a ton.
0:22:13 And that idea of, you know, the standard with the standards inside of it, and it’s portable and you can annotate it, it all makes sense, and I can see why it’s so popular and so powerful.
0:22:16 Thanks for taking the time to join the podcast to talk about it.
0:22:19 We mentioned the certification program and DLI.
0:22:29 Anywhere else you would direct a listener who wants to learn more about USD, about the work NVIDIA is doing with it, the work that you and your teams are doing, anywhere else they might go online?
0:22:30 Yeah, sure.
0:22:34 aousd.org is the entry point for the AOUSD.
0:22:37 There’s also forums there, forums.aousd.org.
0:22:41 For NVIDIA, I highly recommend build.nvidia.com.
0:22:43 I know that’s come up on other podcasts as well.
0:22:43 Sure.
0:22:47 There’s blueprints there around digital twins in which USD is involved.
0:22:51 There’s always going to be new or expanded blueprints around that.
0:22:55 And certainly docs.omniverse.nvidia.com is a good place to go.
0:23:03 There are dedicated USD learning paths there that complement the Learn OpenUSD material as well.
0:23:12 Things like workflow guides on sort of using USD to assemble industrial scenes, that kind of thing.
0:23:13 Perfect.
0:23:14 We’ll leave it there.
0:23:19 Listeners have a whole bunch of places to go dig in, get hands-on with USD, OpenUSD.
0:23:26 And again, Aaron, thank you for taking the time, let alone all the contributions you’ve made to USD and the industry over the years.
0:23:28 It was a pleasure talking, and let’s do it again sometime.
0:23:29 All right.
0:23:29 Thank you, Noah.
Aaron Luk, NVIDIA’s Director of Product Management for Simulation Technology, dives into Universal Scene Description (OpenUSD) and how it integrates seamlessly with AI to simulate rich, physically accurate scenarios. Discover how OpenUSD unifies data, powers digital twins, and uses AI for realistic simulations in industries from filmmaking to robotics. Learn why standards matter, how non-coders can get started, and how OpenUSD is shaping the future of digital design and physical AI. Learn more at ai-podcast.nvidia.com.


