Author: The AI Podcast

  • Wayve CEO Alex Kendall on Making a Splash in Autonomous Vehicles – Ep. 209

    A new era of autonomous vehicle technology, known as AV 2.0, has emerged, marked by large, unified AI models that can control multiple parts of the vehicle stack, from perception and planning to control.

    Wayve, a London-based autonomous driving technology company, and a member of NVIDIA’s startup accelerator program, is leading the surf.

    In the latest episode of NVIDIA’s AI Podcast, host Katie Burke Washabaugh spoke with the company’s cofounder and CEO, Alex Kendall, about what AV 2.0 means for the future of self-driving cars.

    Unlike AV 1.0’s focus on perfecting a vehicle’s perception capabilities using multiple deep neural networks, AV 2.0 calls for comprehensive in-vehicle intelligence to drive decision-making in real-world, dynamic environments.

    Embodied AI — the concept of giving AI a physical interface to interact with the world — is the basis of this new AV wave.

    Kendall pointed out that it’s a “hardware/software problem — you need to consider these things separately,” even as they work together. For example, a vehicle can have the highest-quality sensors, but without the right software, the system can’t use them to execute the right decisions.

    Generative AI plays a key role, enabling synthetic data generation so AV makers can use a model’s previous experiences to create and simulate novel driving scenarios.

    It can “take crowds of pedestrians and snow and bring them together” to “create a snowy, crowded pedestrian scene” that the vehicle has never experienced before.

    According to Kendall, that will “play a huge role in both learning and validating the level of performance that we need to deploy these vehicles safely” — all while saving time and costs.

    In June, Wayve unveiled GAIA-1, a generative world model for developing autonomous vehicles.

    The company also recently announced LINGO-1, an AI model that allows passengers to use natural language to enhance the learning and explainability of AI driving models.

    Looking ahead, the company hopes to scale and further develop its solutions, improving the safety of AVs to deliver value, build public trust and meet customer expectations.

    Kendall views embodied AI as playing a definitive role in the future of the AI landscape, pushing pioneers to “build better” and “build further” to achieve the “next big breakthroughs.”

    For more on NVIDIA’s Inception startup accelerator program, visit https://www.nvidia.com/en-us/startups/

  • NVIDIA’s Annamalai Chockalingam on the Rise of LLMs – Ep. 206

    Generative AI and large language models (LLMs) are stirring change across industries — but according to NVIDIA Senior Product Manager of Developer Marketing Annamalai Chockalingam, “we’re still in the early innings.”

    In the latest episode of NVIDIA’s AI Podcast, host Noah Kravitz spoke with Chockalingam about LLMs: what they are, their current state and their future potential.

    LLMs are a “subset of the larger generative AI movement” that deals with language. They’re deep learning algorithms that can recognize, summarize, translate, predict and generate language.

    AI has been around for a while, but according to Chockalingam, three key factors enabled LLMs.

    One is the availability of large-scale data sets to train models with. As more people used the internet, more data became available for use. The second is the development of computer infrastructure, which has become advanced enough to handle “mountains of data” in a “reasonable timeframe.” And the third is advancements in AI algorithms, allowing for non-sequential or parallel processing of large data pools.

    LLMs can do five things with language: generate, summarize, translate, instruct or chat. With a combination of “these modalities and actions, you can build applications” to solve any problem, Chockalingam said.

    Enterprises are tapping LLMs to “drive innovation,” “develop new customer experiences,” and gain a “competitive advantage.” They’re also exploring what safe deployment of those models looks like, aiming to achieve responsible development, trustworthiness and repeatability.

    New techniques like retrieval augmented generation (RAG) could boost LLM development. RAG involves feeding models with up-to-date “data sources or third-party APIs” to achieve “more appropriate responses” — granting them current context so that they can “generate better” answers.
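
    As a rough illustration of the retrieve-then-prompt idea described above, here is a minimal Python sketch. It is a toy under stated assumptions, not how any particular product implements RAG: the in-memory `DOCUMENTS` list, the word-overlap scoring and the helper names `tokenize`, `retrieve` and `build_prompt` are all hypothetical, and a real system would query a search index, vector store or third-party API and then pass the prompt to an actual model.

```python
import re

# Hypothetical in-memory "knowledge base"; a real system would query a
# search index, vector store or third-party API instead.
DOCUMENTS = [
    "LLM Developer Day takes place on Nov. 17.",
    "Pretrained models are available in the NVIDIA NGC catalog.",
    "Retrieval augmented generation supplies a model with current context.",
]


def tokenize(text: str) -> set[str]:
    """Lowercase the text and split it into a set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query: str, documents: list[str], top_k: int = 1) -> list[str]:
    """Rank documents by how many words they share with the query."""
    query_words = tokenize(query)
    ranked = sorted(
        documents,
        key=lambda doc: len(query_words & tokenize(doc)),
        reverse=True,
    )
    return ranked[:top_k]


def build_prompt(query: str, documents: list[str]) -> str:
    """Prepend the retrieved context to the user's question."""
    context = "\n".join(retrieve(query, documents))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"


# The resulting prompt is what would be sent to the language model.
prompt = build_prompt("When is LLM Developer Day?", DOCUMENTS)
print(prompt)
```

    The point of the design is that the model's weights never change: only the prompt is enriched with current context at query time.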

    Chockalingam encourages those interested in LLMs to “get your hands dirty and get started” — whether that means using popular applications like ChatGPT or playing with pretrained models in the NVIDIA NGC catalog.

    NVIDIA offers a full-stack computing platform for developers and enterprises experimenting with LLMs, with an ecosystem of over 4 million developers and 1,600 generative AI organizations. To learn more, register for LLM Developer Day on Nov. 17 to hear from NVIDIA experts about how best to develop applications.

  • Afresh Co-Founder Nathan Fenner On How AI Can Help Grocers Manage Supply Chains – Ep. 208

    Talk about going after low-hanging fruit. Afresh is an AI startup that helps grocery stores and retailers reduce food waste by making supply chains more efficient.

    In the latest episode of NVIDIA’s AI Podcast, host Noah Kravitz spoke with the company’s cofounder and president, Nathan Fenner, about its mission, offerings and the greater challenge of eliminating food waste.

    Most supply chain and inventory management offerings targeting grocers and retailers are outdated. Fenner and his team noticed those solutions, built for the nonperishable side of the business, didn’t work as well on the fresh side — creating enormous amounts of food waste and causing billions in lost profits.

    The team first sought to solve the store-replenishment challenge by developing a platform to help grocers decide how much fresh produce to order to optimize costs while meeting demand.

    They created machine learning and AI models that could effectively use the data generated by fresh produce, which is messier than data generated by nonperishable goods because of factors like time to decay, greater demand fluctuation and unreliability caused by lack of barcodes, leading to incorrect scans at self-checkout registers.

    The result was a fully integrated, machine learning-based platform that helps grocers make informed decisions at each node of the operations process.

    The company also recently launched inventory management software that allows grocers to save time and increase data accuracy by intelligently tracking inventory. That information can be fed back into the platform’s ordering solution, further refining the accuracy of inventory data.

    It’s all part of Afresh’s greater mission to tackle climate change.

    “The most impactful thing we can do is reduce food waste to mitigate climate change,” Fenner said. “It’s really one of the key things that brought me into the business: I think I’ve always had a keen eye to work in the climate space. It’s really motivating for a lot of our team, and it’s a key part of our mission.”

  • Co-Founder of Annalise.ai Aengus Tran on Using AI as a Spell Check for Health Checks – Ep. 207

    Clinician-led healthcare AI company Harrison.ai has built an AI system that serves as a “spell checker” for radiologists, flagging critical findings to improve the speed and accuracy of radiology image analysis and reduce misdiagnoses.

    In the latest episode of NVIDIA’s AI Podcast, host Noah Kravitz spoke with Harrison.ai CEO and cofounder Aengus Tran about the company’s mission to scale global healthcare capacity with autonomous AI systems.

    Harrison.ai’s initial product, annalise.ai, is an AI tool that automates radiology image analysis to enable faster, more accurate diagnoses. It can produce 124-130 different possible diagnoses and flag key findings to aid radiologists in their final diagnosis. Currently, annalise.ai works for chest X-rays and brain CT scans.

    While an AI designed for categorizing traffic lights, for example, doesn’t need perfection, medical tools must be highly accurate — any oversight could be fatal. To overcome this challenge, annalise.ai was trained on millions of meticulously annotated images — some were annotated three to five times over before being used for training.

    Harrison.ai is also developing Franklin.ai, a sibling AI tool aimed at accelerating and improving the accuracy of histopathology diagnosis, in which a clinician performs a biopsy and inspects the tissue for the presence of cancerous cells. Like annalise.ai, Franklin.ai flags critical findings to help pathologists make faster, more accurate diagnoses.

    Ethical concerns about AI use are ever-rising, but for Tran, the question is less whether it’s ethical to use AI for medical diagnosis than “actually the converse: Is it ethical to not use AI for medical diagnosis,” especially if “humans using those AI systems simply pick up more misdiagnosis, pick up more cancer and conditions?”

    Tran also talked about the future of AI systems and suggested a twofold focus: first improve existing systems, then develop new cutting-edge solutions.

    And for those looking to break into careers in AI and healthcare, Tran says that the “first step is to decide upfront what problems you’re willing to spend a huge part of your time solving first, before the AI part,” emphasizing that the “first thing is actually to fall in love with some problem.”

  • Making Machines Mindful: NYU Professor Talks Responsible AI – Ep. 205

    Artificial intelligence is now a household term. Responsible AI is hot on its heels.

    Julia Stoyanovich, associate professor of computer science and engineering at NYU and director of the university’s Center for Responsible AI, wants to make the terms “AI” and “responsible AI” synonymous.

    In the latest episode of the NVIDIA AI Podcast, host Noah Kravitz spoke with Stoyanovich about responsible AI, her advocacy efforts and how people can help.

  • NVIDIA’s Jim Fan Delves Into Large Language Models and Their Industry Impact – Ep. 204

    For NVIDIA Senior AI Scientist Jim Fan, the video game Minecraft served as the “perfect primordial soup” for his research on open-ended AI agents.

    In the latest AI Podcast episode, host Noah Kravitz spoke with Fan about using large language models to create AI agents, specifically Voyager, an AI bot built with GPT-4 that can autonomously play Minecraft.

    AI agents are models that “can proactively take actions and then perceive the world, see the consequences of its actions, and then improve itself,” Fan said. Many current AI agents are programmed to achieve specific objectives, such as beating a game as quickly as possible or answering a question. They can work autonomously toward a particular output but lack a broader decision-making agency.

    Fan wondered if it was possible to have a “truly open-ended agent that can be prompted by arbitrary natural language to do open-ended, even creative things.”

    But he needed a flexible playground in which to test that possibility.

    “And that’s why we found Minecraft to be almost a perfect primordial soup for open-ended agents to emerge, because it sets up the environment so well,” he said. Minecraft at its core, after all, doesn’t set a specific key objective for players other than to survive and freely explore the open world.

    That became the springboard for Fan’s project, MineDojo, which eventually led to the creation of the AI bot Voyager.

    “Voyager leverages the power of Chat GPT-4 to write code in JavaScript to execute in the game,” Fan explained. “GPT-4 then looks at the output, and if there’s an error from JavaScript or some feedback from the environment, GPT-4 does a self-reflection and tries to debug the code.”

    The bot learns from its mistakes and stores the correctly implemented programs in a skill library for future use, allowing for “lifelong learning.”
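
    The write, run, debug and store loop Fan describes can be sketched in a few lines. This is only an illustrative outline, not Voyager's actual code: Voyager generates JavaScript with GPT-4 and runs it in Minecraft, whereas the sketch below uses Python throughout, replaces the model with a hypothetical `stub_writer` function and uses a plain dictionary as the skill library. The names `CodeWriter`, `stub_writer` and `learn_skill` are assumptions made for the example.

```python
from typing import Callable, Optional

# Stand-in for the language model: maps (task, last error) to source code.
CodeWriter = Callable[[str, Optional[str]], str]


def stub_writer(task: str, error: Optional[str]) -> str:
    """Hypothetical model stub: the first draft is buggy, the retry is fixed."""
    if error is None:
        return "result = wood_count + 1"  # bug: wood_count is undefined
    return "wood_count = 3\nresult = wood_count + 1"


def learn_skill(
    task: str,
    writer: CodeWriter,
    skills: dict[str, str],
    max_attempts: int = 3,
) -> bool:
    """Ask for code, run it, feed errors back, and store code that works."""
    error: Optional[str] = None
    for _ in range(max_attempts):
        code = writer(task, error)
        scope: dict = {}
        try:
            # Caution: exec runs arbitrary code; acceptable only in a toy demo.
            exec(code, scope)
        except Exception as exc:
            # The error text plays the role of feedback for self-reflection.
            error = f"{type(exc).__name__}: {exc}"
            continue
        skills[task] = code  # keep the working program for future reuse
        return True
    return False


skills: dict[str, str] = {}
learned = learn_skill("collect wood", stub_writer, skills)
print(learned, skills)
```

    Storing only programs that ran without error is what lets later tasks reuse earlier ones instead of starting from scratch.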

    In-game, Voyager can autonomously explore for hours, adapting its decisions based on its environment and developing skills to combat monsters and find food when needed.

    “We see all these behaviors come from the Voyager setup, the skill library and also the coding mechanism,” Fan explained. “We did not preprogram any of these behaviors.”

    He then spoke more generally about the rise and trajectory of LLMs. He foresees strong applications in software, gaming and robotics and increasingly pressing conversations surrounding AI safety.

    Fan encourages those looking to get involved and work with LLMs to “just do something,” whether that means using online resources or experimenting with beginner-friendly, CPU-based AI models.

  • Anima Anandkumar on Using Generative AI to Tackle Global Challenges – Ep. 203

    Generative AI-based models can not only learn and understand natural languages — they can learn the very language of nature itself, presenting new possibilities for scientific research.

    Anima Anandkumar, Bren Professor at Caltech and senior director of AI research at NVIDIA, was recently invited to speak at the President’s Council of Advisors on Science and Technology.

    Anandkumar says that, at the talk, generative AI was described as “an inflection point in our lives,” with discussions swirling around how to “harness it to benefit society and humanity through scientific applications.”

    On the latest episode of NVIDIA’s AI Podcast, host Noah Kravitz spoke with Anandkumar on generative AI’s potential to make splashes in the scientific community.

    It can, for example, be fed DNA, RNA, viral and bacterial data to craft a model that understands the language of genomes. That model can help predict dangerous coronavirus variants to accelerate drug and vaccine research.

    Generative AI can also predict extreme weather events like hurricanes or heat waves. Even with an AI boost, trying to predict natural events is challenging because of the sheer number of variables and unknowns.

    However, Anandkumar explains that it’s not just a matter of upsizing language models or adding compute power — it’s also about fine-tuning and setting the right parameters.

    “Those are the aspects we’re working on at NVIDIA and Caltech, in collaboration with many other organizations, to say, ‘How do we capture the multitude of scales present in the natural world?’” she said. “With the limited data we have, can we hope to extrapolate to finer scales? Can we hope to embed the right constraints and come up with physically valid predictions that make a big impact?”

    Anandkumar adds that to ensure AI models are responsibly and safely used, existing laws must be strengthened to prevent dangerous downstream applications.

    She also talks about the AI boom, which is transforming the role of humans across industries, and problems yet to be solved.

    “This is the research advice I give to everyone: the most important thing is the question, not the answer,” she said.

  • Deepdub’s Ofir Krakowski on Redefining Dubbing from Hollywood to Bollywood – Ep. 202

    In the global entertainment landscape, TV show and film production stretches far beyond Hollywood or Bollywood — it’s a worldwide phenomenon.

    However, while streaming platforms have broadened the reach of content, dubbing and translation technology still has plenty of room for growth.

    Deepdub acts as a digital bridge, providing access to content by using generative AI to break down language and cultural barriers.

    On the latest episode of NVIDIA’s AI Podcast, host Noah Kravitz spoke with the Israel-based startup’s co-founder and CEO, Ofir Krakowski. Deepdub uses AI-driven dubbing to help entertainment companies boost efficiency and cut costs while increasing accessibility.

    The company is a member of NVIDIA Inception, a free program that offers startups go-to-market support, expertise and technological assistance.

    Traditional dubbing is slow, costly and often misses the mark, Krakowski says. Current technology struggles with the subtleties of language, leaving jokes, idioms or jargon lost in translation.

    Deepdub offers a web-based platform that enables people to interact with sophisticated AI models to handle each part of the translation and dubbing process efficiently. It translates the text, generates a voice and mixes it into the original music and audio effects.

    But as Krakowski points out, even the best AI models make mistakes, so the platform involves a human touchpoint to verify translations and ensure that generated voices sound natural and capture the right emotion.

    Deepdub is also working on matching lip movements to dubbed voices.

    Ultimately, Krakowski hopes to free the world from the restrictions placed by language barriers.

    “I believe that the technology will enable people to enjoy the content that is created around the world,” he said. “It will globalize storytelling and knowledge, which are currently bound by language barriers.”

    https://blogs.nvidia.com/blog/2023/08/30/deepdub/

  • Replit CEO Amjad Masad on Empowering the Next Billion Software Creators – Ep. 201

    Replit aims to empower the next billion software creators.

    In this week’s episode of NVIDIA’s AI Podcast, host Noah Kravitz dives into a conversation with Replit CEO Amjad Masad. Masad says the San Francisco-based maker of a software development platform, which came up as a member of NVIDIA’s startup accelerator program, wants to bridge the gap between ideas and software, a task simplified by advances in generative AI.

    “Replit is fundamentally about reducing the friction between an idea and a software product,” Masad said.

    The company’s Ghostwriter coding AI has two main features: a code completion model and a chat model. These features not only make suggestions as users type their code, but also provide intelligent explanations of what a piece of code is doing, tracing dependencies and context. The model can even flag errors and offer solutions, like a full collaborator in a Google Docs for code.

    The company is also developing “make me an app” functionality. This tool allows users to provide high-level instructions to an Artificial Developer Intelligence, which then builds, tests and iterates the requested software.

    The aim is to make software creation accessible to all, even those with no coding experience. While this feature is still under development, Masad said the company plans to improve it over the next year, potentially having it ready for developers in the next 6 to 8 months.

    Going forward, Masad envisions a future where AI functions as a collaborator, able to conduct high-level tasks and even manage resources. “We’re entering a period where software is going to feel more alive,” Masad said. “And so I think computing is becoming more humane, more accessible, more exciting, more natural.”

    For more on NVIDIA’s startup accelerator program, visit https://www.nvidia.com/en-us/startups/

  • Codeium’s Varun Mohan and Jeff Wang on Unleashing the Power of AI in Software Development – Ep. 200

    The world increasingly runs on code.

    Accelerating the work of those who create that code will boost their productivity — and that’s just what AI startup Codeium, a member of NVIDIA’s Inception program for startups, aims to do.

    On the latest episode of NVIDIA’s AI Podcast, host Noah Kravitz interviewed Codeium founder and CEO Varun Mohan and Jeff Wang, the company’s head of business, about how AI is transforming software.

    Codeium’s AI-powered code acceleration toolkit boasts three core features: autocomplete, chat and search.

    Autocomplete intelligently suggests code segments, saving developers time by minimizing the need for writing boilerplate or unit tests.

    At the same time, the chat function empowers developers to rework or even create code with natural language queries, enhancing their coding efficiency while providing searchable context on the entire code base.

    Noah spoke with Mohan and Wang about the future of software development with AI, and the continued, essential role of humans in the process.