We’re excited to welcome Richard Kerris, Vice President of Developer Relations and GM of Media & Entertainment at NVIDIA, to the show today. Richard has had an extensive career working with creators and developers across film, music, gaming, and more. He offers valuable insights into how AI and machine learning are transforming creative tools and workflows.
Listen on Apple, Spotify, or YouTube.
In particular, Richard shares his perspective on how these advanced technologies are democratizing access to high-end capabilities, putting more power into the hands of a broader range of storytellers. He discusses the implications of this for the media industry—will it displace roles or expand opportunities? And we explore Richard's vision for what the future may look like in 5-10 years in terms of applications being auto-generated to meet specialized user needs.
We think you’ll find the wide-ranging conversation fascinating as we explore topics from AI-enhanced content creation to digital twins and AI assistants. Let’s dive into our discussion with Richard Kerris.
Key Points:
- Democratization of creative tools is enabling more people to produce high-quality media content. This expands opportunities, rather than displacing roles.
- Future apps may be auto-generated via AI to meet specialized user needs, rather than created as generic solutions.
- Industrial metaverse allows manufacturers to optimize workflows through virtual prototyping.
- Open USD provides a common language for 3D tools to communicate, improving collaboration.
- AI agents will become commonplace assistants customized to individual interests and needs.
- Computing power growth is enabling complex digital twins like the human body to improve health outcomes.
- Generative AI introduces new considerations around rights of trained content and output.
- Education benefits from AI’s ability to showcase different artistic styles and techniques.
Transcript (from Apple Podcasts):
Welcome to Artificiality, Where Minds Meet Machines.
We founded Artificiality to help people make sense of artificial intelligence.
Every week, we publish essays, podcasts, and research to help you be smarter about AI.
Please check out all of Artificiality at www.artificiality.world.
We're excited to welcome Richard Kerris, Vice President of Developer Relations and GM of Media and Entertainment at Nvidia to the show today.
Richard has an extensive career working with creators and developers across film, music, gaming, and more.
He offers valuable insights into how AI and machine learning are transforming creative tools and workflows.
In particular, Richard shares his perspective on how these advanced technologies are democratizing access to high-end capabilities, putting more power into the hands of a broader range of storytellers.
He discusses the implications of this for the media industry.
Will it displace roles or expand opportunities?
And we explore Richard's vision for what the future may look like in 5 to 10 years in terms of applications being auto-generated to meet specialized user needs.
We think you'll find the wide-ranging conversation fascinating as we explore topics from AI-enhanced content creation to digital twins and AI assistants.
Let's dive into our discussion with Richard Kerris.
Richard, thanks so much for joining us.
Excited to talk to you.
Great to be here.
You've had a long career in the creative space, dealing with everyone from filmmakers to musicians, 3D artists, animators, the whole long list.
When you think about the impact of AI and ML today on that world, what gets you most excited?
Well, the biggest thing that I see is the democratization of creative tools to a broader audience of storytellers and people that want to get something out.
I equate it a bit to when the audio world saw the transformation to sampling and synthesizers, and all of a sudden you had musicians making Grammy-award-winning records in their bedrooms on a laptop.
And the beautiful thing about that whole transfer of power to a broader audience is that it didn't displace the high-end studios.
They're still doing great work at Abbey Road and Electric Lady and all the other places; it just broadened the footprint of people able to do creative things.
When I look at all the AI tools coming out now, and we've got close to 2,000 startups in the creative space building AI-based tools for artists, what excites me so much is that a lot of that power is being given to people who have never had access to it.
They've never had access even to a powerful Nvidia GPU, or to a data center to process a lot of these things.
And now all of a sudden, you're learning that you can prompt and get imagery back.
You can modify that imagery.
You can modify the prompting of that process.
And so all of a sudden, you have a much wider group of people able to create very high-quality images, 3D, 2D, videos, et cetera, and they're doing it by accessing that power in the cloud.
And I think that's one of the biggest things we're seeing in the industry: the removal of the necessity to have all of that on premises, in your hands. By putting it in the cloud, you can access it as easily as buying a song on iTunes.
That, to me, is super exciting, because when I see the creative work being done out there, it's mind-blowing. It wasn't that long ago that it would take a week on a Pixar Image Computer to create a beautiful ray-traced 3D image or scene, and now people are doing it by prompting something on a mobile device and getting it back.
Totally exciting.
So let's get to one of the key questions, which is people worrying about what this is going to do to disrupt the lives and careers of creatives. Rewind the clock back to when you and I worked together all those many years ago.
I remember one of the flagship moments was J.J. Abrams talking about editing Alias in Final Cut on his laptop on a plane, right?
And that was really exciting, but then there was this whole question about what it would do to the careers of editors who had control over the Avid booths. Were they suddenly going to be out of jobs because of technology taking over?
That narrative is even stronger now about worrying that creatives are going to lose their careers.
What do you say to that?
Well, I think the point you made about the control they had over the booth captures the fear and uncertainty of what was about to take place in that transformation: bringing sophisticated editing tools to a laptop, to be used anywhere, even on an airplane.
It doesn't mean the storytelling changes, right?
A storyteller has something to tell, and a lot of them are out there that didn't have access to the booth.
No one gave them the keys.
Now we're taking those things and we're opening them up and saying everyone can have access.
And I think it actually should excite the community in much the same way it did, again, to the music analogy.
It didn't replace the musicians.
I remember when sampling came out and everybody said, you know, they're stealing John Bonham's bass drum for this rap song.
Oh my gosh.
And it's like they're going to get rid of musicians and stuff.
And actually it just became another mix, another part of the ingredients that they used in their creative process.
I think when we look at these AI tools, among the people that have been doing it traditionally, I love the ones that embrace it and say, wow, I'm going to be able to use this, get more done, and tell my story even better. Things that would typically have been long and tedious, like background imagery for a hero shot, I can now prompt and get done, and still focus more on that hero shot than I would have in the past.
So fear is just, you know, the fear that comes with the unknown.
And I think when it comes to creative tools, like we experienced with Final Cut and others, if you embrace it, you're going to see that it opens up a whole new world, right?
And what do you do with all that stuff?
You play, explore, you create, you make things.
And that's what's exciting to me, is that I just think we're going to see things that we've never dreamed of before.
Yeah, it's interesting to look back to before the, you know, democratization of these tools, a phrase I'm not sure whether I really like, but it's there.
It went from, you know, how much video could you actually watch?
There was film and TV, and that was it.
And now suddenly it's everywhere, and everybody's got little clips of things.
And I've already found the creativity we can apply in our business using image generation tools pretty intriguing, creating a sort of brand vibe in a way that we as a small business wouldn't otherwise be able to do.
So there is an expansion, and I wonder whether in the future we'll look back and see a grand expansion of creative output.
It will definitely change people's workflows, which definitely makes people scared.
And that's a good and honest reaction, but there's still also this potential opening up of the floodgates.
Yeah, but the positive side of that, and I do like the word democratization because in the past it was prohibitive to do a lot of creative work.
I think back, way, way back, when I started, I used to make rock videos, and I would spend $420 an hour to use a Quantel Paintbox in Boston to go in and just paint glints and gleams on these particular shots.
And you were amazed at the power.
You were able to use this stylus and create and paint on top of the video.
And it was like, that was a mind-blowing experience, but not everybody got to do that.
And if the band didn't have the budget, we'd just say, oh, we can't do that for you on this shot.
Fast forward to where that's just commonplace, even on a mobile device today, and it just broadens out how many more people can have access to that.
And that, to me, is a good thing.
And it's evidenced by seeing how much great creative stuff there is out there today.
I mean, whether it's long-form films, and we see spectacular work coming out of Hollywood and all that, all the way down to even short spots on TikTok and other short-form media, they're using very sophisticated technology to create their imagery and their story.
And to me, I think that's a great thing.
I don't see why people would be afraid of that.
You know, I think it's like embrace it and extend it.
Do you think that generative AI is a bit different?
I mean, the stories that you guys talk about and can talk about at length, which I've heard many times, they all involve this great expansion of what's possible for a human to do with other humans.
But generative AI introduces this proxy for other humans.
Do you see any difference there?
Oh, yeah.
Well, that proxy is an AI-driven agent that's been trained on your computer.
And instead of it being a human that you're conversing with, you're conversing with that agent in the computer.
Like, if I was to say to my production studio not too long ago, I need a background scene of a jungle with a road going through it, and I'm going to have the hero shot be a jeep driving as it's being chased by a dinosaur or something like that.
That team would go off and build that jungle, and it would be timely and costly and everything else.
I still want to tell that story of the jeep and that particular scene.
Fast forward to where that production agent is now a computer that's been trained to know what a jungle is, or what the different types of things are.
I'm still getting that result out of it, but now I don't have the burden of the cost and the time and all of that.
It really comes down to what you're doing with it.
As we saw with desktop publishing, when you give people these tools, all of a sudden you see people making slides with 30 different fonts and sizes, just because they can.
And then it kind of works itself out, because nobody wants to see that.
That's crazy.
They want to see what you want to get across with your point.
What are you trying to tell me?
And so it's refined itself down to, you know, yes, you have access to all of those fonts, you have access to all of those things, but really it's about what story you're telling.
And if we take that same analogy to these clips and content with generative AI, it's still about the story.
We see lots of wacky stuff coming out of it.
We see a lot of gen AI all the time: dogs on surfboards and cats flying in a spaceship.
And that's just people, I think, spewing out stuff because they can.
Like, oh, I prompted it and I got this crazy thing.
But take that and start looking at some of the other work. Runway has an AI film festival; I attended it last year, and I'm actually going to be a judge at the one coming up this year.
People are using generative AI to tell stories.
And it's from all parts of the globe.
And they're telling stories using beautiful imagery and content and stuff that normally they would have never been able to do.
For some, it would have been stick figures otherwise, or something like that.
So I look at it in that context.
But yes, we will always see people putting too many fonts on a slide or too many wacky images prompted with it and stuff like that.
But that's just kind of the nature of the beast, I think.
I think that just comes along with anything that you give access to.
Oh, can I ask what your judging criteria are going to be?
What is the story you're trying to tell?
How well are you telling that story?
If you're telling me a story by showing me a bunch of fancy visual effects, but I don't really understand what you're trying to get across, then that's just the too-many-fonts-on-a-screen kind of thing.
If you're telling me a story where everything just adds to what you're trying to convey, then that's much more meaningful to me.
So the context that I look at is, what is your goal?
What is your purpose for bringing this to others, to the audience, right?
I mean, again, when you look at music, you can make noise that's not in tune, with no harmonies, just noise, and nobody's really going to want to experience that.
Or you can use it and be a solo musician that's creating content that has all the different parts being played and you're conveying something that's beautiful and people really appreciate hearing it.
So it's what you do with those ingredients that matters to me.
Yeah, the exciting part for me, I guess, in this democratization idea is the ability for the talented people who can use these tools to produce much more, for many more people.
Today, the world of great special effects is essentially isolated to a relatively small community producing a relatively small number of outputs.
You get films and TV and some music videos and things like that, but it's still a pretty small number of productions that can have a jungle and somebody walking through it.
But think about the last few years: there's been a sort of standard visual language in Silicon Valley, the animated illustrations of somebody walking across to sit down at a desk and say something.
That was the classic marketing video on people's websites.
That could all be much richer, with a much more interesting way of telling a story: having the scene more complete, making it more visually appealing, helping people understand the setting or whatever the narrative is.
You could see how the people who can do this work with these tools can do it faster and cheaper and be able to go to a much larger audience than we've seen.
Absolutely agree.
It's like going from clip art to real art and allowing people to make it their own.
So it's not the same kind of, oh, I've seen that in 15 other presentations.
No, it's unique, it's compelling and it's sustainable.
It gives you something that you can really get engaged with.
So I think those are some of the most important things.
And one of the other things that I'll bring up, because I know we'll eventually want to hit on this, is: how do you know that the content the model was trained on is something I can use?
That's probably one of the biggest questions. When I was recently at CES, I was on a panel for Digital Hollywood, and that was the number one question from the audience: how can I be sure that what I get out of these generative AI models is okay to use?
Well, that's the whole trust-in, trust-out model, meaning that as long as the model has been trained on content where artists' rights are adhered to and everything's respected, then they can indemnify you on the output created with that content.
Companies like Getty, which we worked with, have millions of images that they trained using our platform Picasso, and now when their customers use that new business model, they can be assured that Getty stands behind whatever they're creating with it.
And companies like Adobe and others are doing this too.
I think this is going to be one of the most important things to come out of this whole emergence of new technologies: trust in, trust out.
It's the same kind of thing as when musicians were sampling different parts.
Look, The Police made a lot more money on the sampling of one of their songs by a rap artist than they ever made putting the song out on their singles and stuff.
And that's because attribution to the artists was paid attention to.
And I think that came out of, you know, at first when sampling started, you heard sampling of everything, and then all of a sudden it was like, no, no, no.
You've got to pay the rights, you've got to have the permission to use it, and all of those things.
And that's all worked itself out now.
And I think we're gonna see that same kind of wave come across all of these models that are being trained.
What you train with is as important as what you're creating with, so that you have the rights to do something with it.
And I think the creative community is the one that is most acutely aware and concerned about it, for good reason.
We write, we publish a lot.
You sit there and look at the LLMs and think, huh, how much of my content is getting consumed in there to be able to spit something back out again?
And so having companies like Getty and Adobe focused on making sure that it's trusted content coming in is incredibly important to the creative community, because otherwise you won't want to produce it and make it freely available.
Right.
And then get sued.
Exactly.
You don't want to get sued and you don't want to be contributing to something that's essentially replacing you in some way.
You want to actually feel like you're part of the cycle that actually can be positive for you as well.
Yeah, absolutely.
And like I said, it's going to become more and more important as we go forward, especially as the tools get more and more refined.
Right now, it's a little bit of the black box.
You prompt, you get an image.
You prompt, you get a video.
Well, when I get that result, I don't have much more control over it.
I don't get the 3D rig for the character I've just prompted and received.
But all of that's going to keep getting refined and changed, so that you'll be able to get a fully rigged character that you can manipulate and do all the things you would do as if somebody had modeled it and handed it to you.
And the more we can do those things, the more the professional studios will embrace these technologies.
AI is already being used across studios.
It's been done for a while.
Generative AI is starting to get used more and more.
But again, it really comes down to how much control you give that director, rather than just the black box of: I need a gloomy sky with a spaceship flying by.
Well, that's not the spaceship I want.
Keep prompting, keep prompting.
That's not a productive way to work.
But if you can go, let me tap that spaceship and go in there and modify it and make changes to it and things like that.
Now it's become part of my workflow.
That generative process is just another assistant that I'm working with.
What do you think it opens up for education?
So you've got design schools, film schools.
There's the easy answer, which is we'll teach them the tools.
But what beyond that?
What, beyond that, would help young people really embrace these tools and build good-paying careers, rather than just the starving-artist idea?
No, that's a great question.
I think it gives the students the ability to learn the styles, not just the tools.
So they can all learn a paintbrush, they can all learn the prompt.
And prompting will become more and more important, of course.
But there's the idea that you could learn a style.
I want to create in the style of Picasso, provided the model has been trained on that content and the estate has approved it, etc.
I can learn and appreciate that style.
I can learn and appreciate dozens of other kinds of things, so it will influence my own style.
And I think students, you know, look, everybody that's ever created anything has always been influenced by something that's come before them.
And so you see it, you hear it, all of that.
So the more enrichment you can provide for those students, giving them the ability to explore, because maybe they can't go to a museum in Spain in person, but they can still get there.
They can have trained models about it.
They can have virtual environments to explore it.
They can walk around and stuff.
You're going to be just constantly enriching their brain into what's possible for that.
So I think it really transfers from just learning the tools to learning the styles, the atmosphere, and the circumstances in which an artist created what they did, to hopefully influence the student to go and do brand new things.
I've got a sort of technical question, which is that the general public is mostly thinking about generative AI in terms of LLMs and image creation tools.
I was hoping you could talk a little bit about OpenUSD and using it for digital content creation, because I recognize that Nvidia has a whole workflow in the creative world that may not be familiar to people outside of it today.
So, yeah.
Nvidia is a platform and technology company.
Most people see the results that our developers build on top of those things.
One of our platforms, Omniverse, has been built on top of USD, or Universal Scene Description.
For those that don't know what that is, it originated at Pixar, and it is, as its name says, a universal scene description.
It's the ability to define and present 3D in an environment that may have come from multiple different tools, from multiple different vendors.
One of the biggest challenges we always had back when I was at studios is that your pipeline is usually composed of a lot of different tools.
Most of those tools don't talk well with one another.
And this was an inherent problem at Pixar in the creation of their films.
So they decided to embark on building an environment that gave all of these things a common denominator.
You can think of it as HTML for 3D.
So it unifies all of those different disparate tools into one common environment so that you can use and mix them together.
So that means even if your tool wasn't designed to work with a different tool, as long as both support USD, they can work together.
And so Omniverse is a platform that brings all of that together and then adds an amazing amount of technology developed by us, by other developers, et cetera.
It brings real-time photorealistic rendering.
It brings the ability to manage extremely large sets of data.
And it brings in AI and generative AI, so that you can do things even as a non-programmer.
You can create Python scripts and such by simply prompting it.
I need a script that's going to populate this empty room with tables and chairs.
And it will go and do that, things like that.
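For the curious, here's a minimal sketch of what such a generated script might look like, written against Pixar's open-source OpenUSD Python bindings (the pxr module); the prim names, counts, and placements are our own illustrative assumptions, not output from Omniverse.

```python
# A sketch of the kind of generated script Richard describes: populating an
# empty room with tables and chairs via Pixar's OpenUSD Python API (pxr).
# The prim paths, counts, sizes, and spacing are illustrative assumptions.
from pxr import Usd, UsdGeom, Gf

stage = Usd.Stage.CreateNew("room.usda")
UsdGeom.Xform.Define(stage, "/Room")  # root transform for the room

for i in range(3):  # three tables in a row
    table = UsdGeom.Cube.Define(stage, f"/Room/Table_{i}")
    UsdGeom.XformCommonAPI(table.GetPrim()).SetTranslate(Gf.Vec3d(i * 4.0, 0.0, 0.0))
    for j, z in enumerate((1.5, -1.5)):  # one chair on each side of the table
        chair = UsdGeom.Cube.Define(stage, f"/Room/Chair_{i}_{j}")
        chair.GetSizeAttr().Set(0.5)  # chairs smaller than the default cube
        UsdGeom.XformCommonAPI(chair.GetPrim()).SetTranslate(Gf.Vec3d(i * 4.0, 0.0, z))

stage.GetRootLayer().Save()  # room.usda is now readable by any USD-aware tool
```

Because the output is plain USD, the same file opens in Omniverse, Houdini, Blender, or any other tool that speaks USD, which is the interoperability point Richard makes next.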
So this whole platform, which we're now seeing adopted by developers in all different verticals, from manufacturing to media and entertainment to architecture and so on, is really where we've unified the complexity of these tools from the past into one common place, in much the same way HTML did for the web.
I mean, it's very analogous to that.
If you remember back in our days: oh, you've got to have this installed, you've got to have that installed, it doesn't work on that browser.
All of that went away with HTML finally getting ratified and everybody contributing to it.
And now we go to the websites and don't even think twice about it.
Same thing has to happen with 3D virtual worlds and 3D tools and things like that.
You know, the idea of the industrial metaverse, being able to go in and virtually experience something in a digital environment that's true to the physical reality, means that industrial production workflows can be much more optimized, saving a lot of money.
In the car industry, when they change models on a production line, it can take months to get everything fine-tuned: all the heavy equipment, the robots, the training, the materials, et cetera.
All of that can now be done in the digital world before they commit to it in the physical world.
So it's permeating all these different industries, and it's really the next OS, if you will: a common place where everything can work well together.
I love the idea of it.
I mean, the reality of it too, obviously.
But the idea of being able to say, I have all of these little objects, and I can create objects and move them around.
I mean, if you think about heavily digitized films with lots of CGI, you can say, oh, well, they made that tree, and I can tell they made lots of copies of it in that scene.
But opening that up and allowing lots of people to have access to it, again, is another one of these moments of, well, what are people going to do with that?
And I'm wondering that.
I'm not saying it's not going to happen.
You sort of stop and wonder what people will do with this ability to create these worlds.
And it's not just being in a world, like being in a headset in a Second Life kind of world.
I can use all of these in videos that I'm going to create.
I can use these in whatever I feel like.
I can collaborate with people across the globe, as if we're in the same room walking around doing things.
I mean, these VR and virtual worlds have been isolated because they don't communicate well outside of those bubbles.
That will change the practicality of using it.
Yes, not everyone's going to wear a headset 24/7 and do all that, but I can manipulate 3D with my mouse on a screen, or by using a tablet in space, and things like that.
There are all different ways to experience it, in the same way you experience the web, but you typically did that on a one-to-one basis with your device and what you're working with.
Now you can have multiple people be sharing that same virtual experience anywhere in the world because it's all running on the cloud, and you can then make your decisions and your creative decisions and production decisions, et cetera, together as if you're in the same place at the same time.
So you can see that affecting many different domains, right?
Not just in storytelling, movie making, and game making, but, like I said, on the industrial side and the architectural side of things; this next big wave is really about 3D and virtual worlds becoming as commonplace as the web.
And you can see so many obvious places where spatial reasoning can really be supported with this sort of technology: training medical staff, or anyone working in that sort of 3D environment. Not that long ago, the only way of accessing that information was, I think back to the anatomy textbooks, through layered pages: here are the bones, here are the muscles.
That is primitive, obviously, next to being able to literally be immersed in those worlds.
And the way you describe it, you can also see a different level of creator. Say, just hypothetically, I'm thinking aloud, you go and do a medical degree, but you decide you don't want to be a doctor, and you think, oh, what the hell am I going to do now?
Because that does happen to a lot of people.
You can go and create different stories around that knowledge that you have and create quite different virtual immersive worlds that could be for training and education.
Right, you could affect the tool makers whose tools will be used in those virtual worlds.
You could affect the teaching of those things.
You don't always have to be the one opening up the body, if that's not what you want to do. But you're right.
Yeah, and I love that these tools aren't just limited to the 3D space we're talking about; think about what this is going to do for the developer community. One of the most exciting aspects of my job working in developer relations is that you get to see what's coming.
You get to see where the trends are taking place and stuff.
And one of the things that they're all starting to realize is that the idea of a traditional developer is going to change dramatically.
You will be able to prompt the computer to create an application.
And it'll start small.
We're seeing it now with Python scripts and other things that you can have with copilots and stuff to help you along.
But not too far in the distant future, instead of going to an app store to download an app, and then having to learn how to use that app, you will prompt your device and say, build me an app that does exactly what I want and nothing else, and it will create it for you on the fly, customized to just what you need to do.
And so that whole premise of the app store, where you're a slave to how the app works, is going to flip around: it's going to be a development environment that you prompt, and the different components get compiled together to make an application for you on the fly.
And that's one of the things the developer community is actually embracing really well, because they see it as an opportunity to broaden the kinds of tools and things that will be created.
But their responsibility then rises up a level: how do you take what would traditionally be monolithic applications and break them into components and microservices that can live in the cloud and be accessed as an API call or a service call?
So, you know, there's a level of excitement coming, because we're all talking about where this is going to be in five years.
And I'd say it's the death of the app store and the birth of the development environment, and it's going to be something that people just get used to, rather than be a slave to.
And there's many examples I can give to where we see that taking place.
Well, in some ways, that can't come too soon, can it?
I mean, we had a live example recently: one of our kids has a significant data challenge that she has to overcome.
It's image-based.
And it's so close to generative AI, particularly in this case GPT-4 Vision, or one of the...
That's probably the best one.
But it's so close to that tool almost being able to help her, and just not quite there.
Yes, what she's got is, she's getting a graduate degree in glaciology.
And so she has pictures of glaciers, and she's trying to pull out features of the glaciers and be able to measure things and check change and so forth.
It's the kind of thing that you look at and you go, we're pretty close to being able to say, I want to do this to these sets of images, go write the scripts that are required to do that kind of image analysis and give me back the data.
And visualize it for me, do the statistical analysis and put it together in some slides for me.
And make it a 3D thing.
Yeah, exactly.
And add a soundtrack while you're at it.
Exactly.
Because you can see how, I mean, it's not really their specialty, but you can see how you could hand it to a studio in LA.
And they know how to take images like that and pull things apart and extract a 3D space out of it and so forth.
But that's a very high-end skill set with high-end tools.
Whereas you can see that this is a moment in time where, sometime in the future, she should be able to do it by just asking, by giving instructions to the computer, not actually sitting down and writing it.
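For a flavor of what "just asking" might produce, here's a minimal sketch of the kind of script being described, using OpenCV; it assumes grayscale images in which ice is brighter than the surrounding terrain, and the filename pattern and thresholding choice are our own illustrative assumptions.

```python
# A sketch of the kind of script being asked for: a rough measure of glacier
# extent across a set of images. Assumes grayscale images in which ice is
# brighter than the surrounding terrain; the filename pattern is hypothetical.
import glob
import cv2

for path in sorted(glob.glob("glacier_*.png")):
    img = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    # Otsu's method picks a brightness threshold that separates ice from rock.
    _, mask = cv2.threshold(img, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    ice_fraction = (mask > 0).mean()  # fraction of pixels classified as ice
    print(f"{path}: {ice_fraction:.1%} of the frame classified as ice")
```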
Yeah, it's like the days of waiting for the render.
You'd set up your wireframe, maybe with a shadow, a flat-shaded type thing, then hit render and go away for a while, hoping it comes out the way you want.
And then you'd look at it and go, oh, dang it, nope, tweak, tweak, tweak, back to it.
And that iterative process that was so time-consuming is interactive today.
You just grab it, move it, it's got real-time shadows, caustics, the whole nine yards.
And the first time we showed our creative team using the Omniverse platform that I talked about, which has a real-time photorealistic RTX renderer in it, they created a spot called Marbles at Night.
And it's on our website, and it's all done in real-time.
It's a traditional marbles game, and you can set things on fire, you can do all kinds of things.
And the very next day, one of the co-founders of Pixar reached out to me and said, you know, oh my God, you're doing something that 30 years ago we dreamed about.
And that gave birth to a session at our GTC conference with the pioneers of computer graphics, just talking about how far we've come in such a short period of time.
That was a 30-year window, from waiting days for one frame to render to now manipulating it in real time with all the attributes they had dreamed about.
And where are we going to be in the next 30?
And it was fascinating just to hear the war stories of what it used to take to do that.
They knew someday it would get there, right?
There was never a doubt in the minds of those pioneers that at some point you would just manipulate this like you do a word-processing document.
That vision is what carried forward the inspiration.
You know, our teams always say we stand on the shoulders of the giants that came before.
And so when you talk to the Omniverse team, the honor and respect that they have for those early days is apparent throughout everyone you talk to.
And so we did a mixer once and had a big call with a bunch of them.
And it's just great to listen to these stories back and forth.
But one thing that came out of it was, oh my gosh, where's this going to be in another 30 years?
You know, wow.
So the challenges being faced in, what was it, glacierology?
Glaciology, yes.
Glaciology.
I never knew that you could get a doctorate in that, but that's fascinating.
Those challenges are going to be commonplace, you know.
And there will be other challenges to replace them.
You talked about the change for developers.
My favorite part of the Google Gemini announcement was the dynamic design demo. It was a simplistic thing where they showed somebody coming to a website and interacting with Gemini, saying, I want to plan my kid's birthday party.
And it asks them questions, has a little bit of a conversation, and dynamically assembles an interface for them to walk through, look at things, choose different options, and try out different party designs.
Those kinds of things.
It was a classic demo of a relatively straightforward example.
But it did make me wonder how to think about the future of being a developer.
As you say, today somebody would say, I'm going to make a party-planning app, I'm going to put it in the app store, and I'm going to cross my fingers that enough people download and use it.
So when you talk to developers about their future, if it isn't in that mindset of creating, compiling, and then hopefully having some level of commerce, how do they think about that new world of having an AI dynamically create things?
Yeah, great question.
Because it's really about harnessing the power that this new world will give.
The next computer is the data center, right?
We will just have devices that speak to the data center in the cloud, whether we use a little bit or a lot of it.
But from a developer standpoint, it's really about understanding what makes an application, what makes something usable, and knowing that the power available to you is the palette with which you can go create.
And about making the components of the things you're building accessible, not only to what you're doing, but to others that come along as well.
I think the idea of the monolithic applications is going to fade away quickly.
A lot of the time you don't need 90% of the stuff you get in an application when you just want to do something specific; maybe it's only 20% or 30% that you want to use.
But there's all this other stuff that comes along with it.
That's time-consuming and wasteful.
It can be much more efficient to just say, I need tools that do this, this, this, and this, and it's not going to be just party planners.
It's going to be, I need tools to design a new automobile.
And I just really want to focus on the shape, the speed, the aerodynamics, et cetera.
And I want to trust that whatever I'm creating is actually possible in reality.
Like, don't allow me to create something that can't be built.
And give me the parameters or the guardrails of these things.
So developers are going to need to know how to construct and train these models with that.
It still requires a sophisticated mind to understand what that means.
But it's going to open up the floodgates for people that want to get an application specific to their needs.
That's been one of the biggest challenges for people using applications.
And I talk with them a lot because it's like, yeah, we wanted it to do this, and this is the only one that does it because it does this particular feature really well.
But then we had to use this other application because it does that feature really well, too.
Oh, well, thankfully they can work together now with USD.
But even still, it's like, why do I have to get all of those when I could just say, I want the best of that one, the best of that one, make me a new custom application on the fly just for what I need to get done?
Yeah, you'd never submit a feature request to the engineering team.
You'd just ask the application to adjust itself slightly to do the thing you really want it to do.
One of the things that we're really interested in, and I'm not sure how much it crosses over with your world, but I know it's something Nvidia's deep in, is agentic AI: the crossover between what's embodied, what's in a realistic domain space, and, like you said, not letting me design something that can't be built.
And agents that have scale and make their own decisions and help you through this process long term, so that you're not even using tools; you're using an agent that then starts to unlock other tools for you.
Do you have any perspective on, say, the next five years in terms of what's going to be possible in that world?
Yeah, well, I think to your point, having guardrails on things to know what's possible, what's not possible, what's permissible, what's not permissible, all of those things become very, very important.
And so in a lot of these models, especially the ones that we focus on, there are tools in there that do just that.
They put guardrails on.
They won't allow bad things to be said or bad imagery to be generated, et cetera.
And that will continue to go on.
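One concrete example of the kind of tooling Richard alludes to is NVIDIA's open-source NeMo Guardrails toolkit; here's a minimal sketch of a rail that keeps an assistant on topic, where the topic rule, wording, and model settings are our own illustrative assumptions.

```python
# A sketch using NVIDIA's open-source NeMo Guardrails toolkit, one example of
# the guardrail tooling discussed above. The topic rule, wording, and model
# settings are illustrative assumptions (an OpenAI key would be needed to run).
from nemoguardrails import LLMRails, RailsConfig

colang = """
define user ask off topic
  "tell me something unrelated to this design project"

define bot refuse off topic
  "I can only help with questions about this design project."

define flow off topic
  user ask off topic
  bot refuse off topic
"""

yaml = """
models:
  - type: main
    engine: openai
    model: gpt-3.5-turbo
"""

config = RailsConfig.from_content(colang_content=colang, yaml_content=yaml)
rails = LLMRails(config)
reply = rails.generate(messages=[{"role": "user", "content": "What's a good movie?"}])
print(reply["content"])  # the rail steers off-topic requests to the refusal
```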
But I think we will get more and more accustomed to having AI agents as our companions for specific things.
Maybe you're taking a walk in a new city you've never been to before, and you just say, hey, I want to look at art shops and coffee shops.
Just show me those types of things.
Don't give me a thing that has everything.
I just want to go to where I want to go and it will customize it for you on the fly.
But you might be doing something as an industrial designer and you might say, I want to work with this medium, but I'm not sure what it's capable of holding or what it's capable of doing.
Build me an agent that will assist me in my work and knowing that I have those parameters around it.
So I think you're going to see more and more of those types of things happen.
But to your point, the important thing there is those guardrails, the understanding of what can and can't, or should and shouldn't, be done, depending on the case, allowing you to work within that new medium and have it at your disposal without the fear of, oh my gosh, it just took me to a place I didn't want to go and now I'm lost.
So I think that's the kind of stuff we're going to see in the very near future.
And we're already starting to see some of that in the wearables that are coming out.
There's the Humane Pin, there's the Rabbit, there's a bunch of others emerging.
And the idea is that you would just have these as almost ambient technology, available when you need it and not when you don't.
I think that's going to pose some really interesting new prospects for us, especially using AI.
Say you're wearing some sort of headset and you're doing some sort of design.
You're creating...
It doesn't even need to be a headset.
You're creating something using generative AI.
It's a vision-based thing.
And what you actually want is for it to be almost a bit more like a language model, generating the next image for you before you even really know that's the image you want.
Sort of an agent that's like a little bit one step ahead of you.
Is that something that you could imagine people wanting to develop?
Sure.
If you're learning to sing, you want a vocal coach that can tell you how to sing properly.
If you're learning to draw or paint or model or any of these types of mediums, you always have an instructor who's ahead of you, bringing you along on that journey.
Not everyone has access to a great instructor in different things, but that gets to change.
You get to say to the computer: I want a model that knows how to sing, here's my voice, it knows what I can and can't do; now train me, help me become a better singer in that context, and it'll be customized just for you.
But it will be ahead of you because it'll be bringing you along.
So I think that'd be great.
I think that people will welcome that with open arms.
The reason I asked that question is that I've been digging into an Apple paper that was just put out along these lines.
And I've been trying to think through why is it that it's Apple looking at this as opposed to anyone else?
Was that just a lucky find?
Is it that Apple's a bit all over the place, in the sense that it's pretty haphazard what they actually publish?
You can't really glean much from that.
Well, no, look at the papers.
I mean, we actually had more papers at this past SIGGRAPH than in any prior year, more than anybody.
And there's stuff that our research team is doing all the time where, you know, it's like you've got to read it and read it again and then try to figure out, oh, I think I get where they're going with this.
What is it all going to mean?
Well, that's how generative AI came to be.
And that's how a lot of these other things came to be.
So any company that has a good pool of researchers and innovative thinkers and stuff is bound to be coming out with these incredible papers and prospects for ideas that they put out into the world, and they see what happens.
And I think that's what kind of motivates the companies to work with one another.
I think you can't be a walled garden.
Those days are really in the past.
Everything's open and much more broadly accepted as to what's being done.
And so, you know, it wasn't that long ago that companies wouldn't publish papers because it might reveal a secret.
Well, they can still keep the secrets, and they have ways of hiding things in the papers, and it's not hard to see through it, especially when the product release comes out.
That's what they were talking about.
But I think it inspires everyone, you know.
Look, where we're going to go in the next 30 years, as we were saying, who knows?
But it's going to be fun.
It's going to be a wild ride, and we're going to sit there and go, wow.
I thought the first 30 were wild with computer graphics and such, and now we're headed toward the holodeck and doing lots of other things.
So it's not that far off, and we get there by the researchers and artists and creative minds really pushing the envelope on what's possible, or what might be possible.
So on that line of what might be possible: one of the obsessions we look at in our research is what we call memory versus margins.
In order to be truly useful, you want your digital partner, your AI partner, to be able to remember everything that you've ever experienced.
That would be the ultimate.
That would be the most helpful.
But that takes an extraordinary amount of compute and would be extraordinarily expensive today.
What do you see in the future that helps break down some of those barriers, so that we're able to get to a much, much longer context window, so it can remember things in ways that make it so much more useful, without being cost-prohibitive?
Is that a software thing?
Is it a hardware thing?
Is it something else?
Oh, definitely. We're seeing it in the speed of hardware.
Our CEO says Moore's Law is dead, and I agree, in the sense that we've surpassed it quickly.
The amount of power you can tap into in our supercomputer data centers is daunting, and it's only going to get better, because we're going to be using those same tools to help design the next wave, and the next wave after that.
I'll give you an example of one of the things that I think is going to really benefit all of mankind with this amount of power.
You know, we talk about digital twins for factories.
We talk about digital twins for objects and cars and things like that.
But one of the most complex things to do a digital twin of is the human body.
It's far more complex than a factory or a car and things like that.
And so we will get to that point in the not too distant future where you will carry around a digital twin of your body and it will be able to understand your past.
It will understand your hereditary connections, your family and stuff.
And it will be able to look at different scenarios for the future for you or what you should and shouldn't do.
You should have this kind of a diet.
You should be doing these kinds of things.
And it will help you live a better life, because it will understand your entire essence as if it were a computer model, because it will be.
And it will be able to project different types of futures: if you continue smoking, here's where that's going to go bad.
Or it will tell you to do these different things, or to watch out for these different types of diseases that have run in your family, so we can get ahead of them and preempt them.
So in much the same way you carry an ID card today, you're going to be carrying your digital twin of yourself.
And I think that's going to be one of those things.
But it requires an enormous amount of power, an enormous amount of compute and stuff.
But as it gets more and more powerful out there and easier for anybody to access, just think what that is going to do for eradicating different types of diseases and helping people live a better, richer life, just because of the compute technology and the power they have access to.
That's just one example that I look at.
Wow, wouldn't we all want that?
I thought you were going to say digital twin of Earth and we'll solve climate change.
Oh, we're already doing that.
That's already underway.
I like to talk about what's coming that's not there yet, but we've already been well underway for a couple of years now on Earth-2, which is building a digital twin of the Earth's climate, to take information from the past and make understandable predictions about what could or couldn't happen, depending on what's going on on the Earth, in the future.
That's ongoing, and we'll have updates.
You'll hear updates at GTC and other events.
It's an ongoing project that's been taking place now for a couple of years.
And it's fascinating to think what that will mean if there's a factory that's built someplace, what the output of that factory could affect some other place in the world and things like that.
And I think that's something that from a global scale helps us all.
But I was taking it down to an even more complex thing, the human body, in the sense of all the different parts that we have.
And to Dave's point about remembering the past, or at least learning from the past what's possible for the future: think what that would do for you as an individual, let alone the globe.
Well, if you need someone to help with Earth-2 and understanding glaciers, we have someone for you.
So we can add that human intelligence into it.
Anyway, thank you so much.
And if you want to do developer relations, we are always hiring.
Well, thank you so much for joining us.
This has been great fun.
And hope to be able to do it again sometime.
That would be great.
Absolutely.
I look forward to it.
And let's not take as long to connect again as we did this time.
Good to see you.
All right.
Take care.
Bye.
If you enjoy our podcasts, please subscribe and leave a positive rating or comment. Sharing your positive feedback helps us reach more people and connect them with the world’s great minds.
Subscribe to get Artificiality delivered to your email
Learn about our book Make Better Decisions and buy it on Amazon
Thanks to Jonathan Coulton for our music