AI transcript
0:00:15 new models from Google. There was the official launch of Gemini 3, as well as the brand new
0:00:20 Nano Banana Pro model. And today I’m going to talk about both of them. And then I sit down with
0:00:25 someone who has been right in the center of all of this, Tulsee Doshi, the head of product for
0:00:31 Gemini. This episode is packed. We’re talking about what’s actually new, what surprised the Google
0:00:36 team as they were building these models, and how it changes the way normal people like creators,
0:00:41 entrepreneurs, and absolutely anyone using AI interact with these tools day to day.
0:00:47 And towards the end, Tulsee shares her own personal favorite ways to use the new models,
0:00:51 which genuinely opened my eyes to a few use cases I hadn’t even considered yet.
0:00:56 But before we jump into that interview, I want to quickly break down the biggest announcements,
0:01:00 show you some of the cool stuff I made with Gemini 3 and Nano Banana Pro,
0:01:05 and give you a sense of why these updates matter way beyond the tech headlines.
0:01:16 Being a know-it-all used to be considered a bad thing, but in business, it’s everything. Because
0:01:23 right now, most businesses only use 20% of their data, unless you have HubSpot, where data that’s
0:01:28 buried in emails, call logs, and meeting notes becomes insights that help you grow your business.
0:01:33 Because when you know more, you grow more. Visit HubSpot.com to learn more.
0:01:42 All right, let’s start with Gemini 3, because this is a pretty substantial leap from the previous
0:01:48 generation. Google is pushing this one as their most capable, most general-purpose model yet,
0:01:52 and they’re not just saying that. They’re actually shipping a bunch of meaningful upgrades.
0:01:57 It has big jumps in reasoning and instruction following. This is the first Gemini model where
0:02:02 you can feel the internal steps happening. It doesn’t just give answers, it shows its work in a way
0:02:07 that feels much more intentional and transparent. There’s a huge difference when you’re doing
0:02:12 planning, research, or anything multi-step. It’s got way stronger media understanding. Images,
0:02:17 documents, charts, screenshots. Everything processes way faster and way more accurately,
0:02:24 especially in mixed-media situations. It’s noticeably more stable and aware than earlier Gemini
0:02:30 versions. It’s better at coding, better at math, better at multi-step tasks. This is the biggest jump
0:02:36 in the general-purpose AI assistant category. It recovers from its mistakes, sticks to constraints,
0:02:42 and handles longer, more complicated prompts without drifting. And the whole thing is available
0:02:48 inside the Gemini app on day one. So you don’t need any sort of special access or external tools,
0:02:53 it’s just there. And I need to give a quick thank you to DeepMind for the early access on all of it as
0:02:59 well. So what I want to do real quick, instead of just talking about this new model, is to show you a
0:03:04 couple cool things that I actually did with Gemini 3. Because once you actually put it to work,
0:03:10 that’s where the fun begins. It was able to look at my calendar and my email and my filming schedule
0:03:17 and help me map out a filming, editing, and publishing schedule based on criteria that I gave
0:03:22 it for how I want to release stuff. I got it to solve some pretty complex math problems and actually
0:03:28 show its work. I don’t know why this impressed me so much, but the way it presented the math as it did
0:03:35 it just like really blew my mind kind of. I got it to create a Minecraft clone here using only HTML,
0:03:43 CSS, and JavaScript. Then I took away the HTML, CSS, and JavaScript constraints and actually got it to create
0:03:51 an even better Minecraft world with more colored blocks and an even closer version to the reality of
0:03:58 what Minecraft actually is. With just one prompt, I got it to make a recreation of Advance Wars that
0:04:06 actually works and you can see the AI actually thinks and makes moves as well. I got it to make
0:04:11 this happy birthday singing page with balloons and confetti that actually plays music.
0:04:28 You can’t say you’re not impressed. I made a game that’s a clone of Vampire Survivors that actually
0:04:35 works. And on top of Gemini 3, they also rolled out a new agent mode that has a cloud-based browser
0:04:41 that can actually take control of your screen and take actions on your behalf. Like this demo where I
0:04:47 got it to book dinner reservations for me. I’m just clicking through the screenshots, but it opened up this
0:04:54 browser here and step-by-step went and booked the restaurant for me. I think probably my favorite demo
0:05:00 was I gave it the PDF "Attention Is All You Need," which is a really complex technical paper about the
0:05:07 transformer architecture. And it put it into this visualizer that pretty accurately explains what the
0:05:14 paper is about. So yeah, Gemini 3 is pretty cool, pretty powerful. I actually made a full breakdown
0:05:19 video of it on my main channel if you want to see all of the various tests that I did with it.
0:05:25 All right, now let’s switch gears to the one with the fun name, Nano Banana Pro. So this one launched
0:05:32 on Thursday and it’s part of the new image generation system built on top of Gemini 3 Pro Image.
0:05:36 If you’ve used any of Google’s earlier imaging models, this is a pretty noticeable upgrade.
0:05:42 And the best way I can describe it is it’s fast, incredibly literal and shockingly good with fine
0:05:48 details. But here are the biggest updates: much better text rendering inside of images.
0:05:55 It can do menus, logos, posters, comic panels, all with accurate spelling. And it supports every
0:06:00 language the Gemini app supports. And that’s actually something that most image models are
0:06:05 still struggling with. It’s got more consistent characters and object control. If you need the
0:06:11 same character across multiple images or multiple angles, it sticks much better. This is huge for creators
0:06:16 making scenes, storyboards, thumbnails, or anything sequential. It’s got improved realism in materials.
0:06:23 So skin, fabric, glass, food textures, everything is cleaner, sharper, and less AI shiny than earlier
0:06:30 models. And it’s built directly into the Gemini app as Nano Banana Pro. So you can generate, edit,
0:06:35 and iterate without switching between a bunch of different tools. I was able to play with this one in early
0:06:42 access. And it made some genuinely wild images. So let me pull up a few of those and walk you through
0:06:49 them real quick. I was able to tell it to make this cafe menu that was bilingual with both English and
0:06:54 Japanese text. And then I also wanted it to make the images look like they were made out of clay. And it
0:07:00 nailed it. I mean, even the small text, usually when you have small text in these images, it becomes kind of
0:07:06 gibberish. Well, these are actually readable: "Single shot of rich, concentrated coffee." And these look
0:07:12 like clay images, all with just one prompt. And it did exactly what I wanted it to. I had it create
0:07:18 three ultra-clean, modern logos that each spell the word "wolf" in different languages. So we got Russian,
0:07:25 kanji, and Italian, and they’re all in different materials. We got metallic, we got fur, we got glass,
0:07:32 and it nailed it. Also, you can sketch on images. So I gave it this image of Made on YouTube. And then I
0:07:38 sketched a stick figure and I wrote "giant person" on it right here. And check this out. It made this same
0:07:44 image, but put a giant person in the background, and even matched the colors that I drew them with. I redesigned
0:07:51 the Apple Vision Pro by putting ears on it and green eyes. And well, it gave me a new mockup with ears and
0:07:57 green eyes. You can have it combine up to six images. I gave it a dog, a surfboard, a city skyline,
0:08:04 a good vibe sign, and an empty road. And it generated this image with all of those. Here’s another one
0:08:09 where I gave it six different articles of clothing and told it to combine them into a capsule wardrobe
0:08:15 style sheet showing mix and match looks. And well, it did what it was supposed to. I mean, they’re not
0:08:15 good looks, but it did what it was supposed to. I had it create a pencil sketch drawing of a futuristic
0:08:25 gadget. And then once it was done, I said, all right, now let’s turn it into a realistic mockup.
0:08:32 And it did both perfectly. I built a brand mockup for a company called Sky Sip with colors and bottle
0:08:39 mockups and even fonts. It can resize photos into different aspect ratios without, like, compressing or
0:08:45 screwing up the image. I gave it this image of me and Matthew Berman, which was a 9:16 vertical photo,
0:08:54 and told it to make it a 16:9 wide photo. And it did it. You can transfer styles. I gave it this image
0:09:03 of a watercolor along with this image of a group of us at GTC. And it took the watercolor and the image and
0:09:09 blended the watercolor style into the image. So it’s really, really capable, especially when it comes
0:09:14 to image editing. I’ve been pretty blown away by what it’s been able to do. This is another one
0:09:19 I’ve had early access to and was able to play with for a couple of weeks. So thanks again to DeepMind
0:09:24 for that. And it’s also another one that I did a deeper dive video on, on my main channel, if it’s
0:09:33 something that you want to dive deeper into. All right. So you just saw what Gemini 3 can actually
0:09:38 do. It’s legitimately powerful, but here’s the thing. Most people watch these demos and then
0:09:45 they don’t actually use it. That’s why we’re giving you the Google Gemini at work guide. It has the exact
0:09:52 prompts and workflows that marketers are using right now to speed up research, create better content and
0:09:58 actually drive results. Scan the QR code or click the link in the description. Now let’s get back to the
0:10:06 show. All right. Now that we’ve gone through everything new and I’ve shown you some of the stuff that I’ve been
0:10:12 experimenting with. Let’s get into the conversation that I had with Tulsee. Today’s guest is Tulsee Doshi
0:10:17 from Google, someone who has been deeply involved, not only in the launch of Gemini 3, but also in
0:10:23 shaping how these models evolve, how they’re used and how they show up inside the products you and I get
0:10:30 to use. We talk about what’s actually different about Gemini 3 behind the scenes, the unexpected
0:10:35 discoveries the team made while building and testing the model, her personal favorite use cases, which
0:10:41 kind of surprised me and how Google thinks about making these models more helpful, more intuitive and
0:10:47 more aligned with real human workflows. It’s a super fun conversation and gives you way more insight into
0:10:53 how these models are born and why they behave the way they do. So let’s jump over and chat with Tulsee.
0:11:00 Hey, Tulsee. How’s it going? Good. How are you? Good, good. I’m sure it’s been a busy couple of weeks for you
0:11:05 guys. Yeah, but I think, you know, what’s exciting right now is I feel like it’s just busy always.
0:11:11 Yeah, that is very true. It seems like there’s never a dull moment lately. So I’ve had access to Gemini
0:11:17 for four days now or so and had a chance to play with it. Nice. Honestly, I’m actually pretty blown
0:11:25 away. Like, I got it to one-shot, like, a mini Minecraft clone and, like,
0:11:31 all sorts of crazy stuff. And I asked it to generate the happy birthday song for me and it generated the
0:11:37 song, but it built like this whole user interface with like floating balloons and the music playing and
0:11:44 like a sing-along karaoke from a single prompt of "make the happy birthday song for me," which was
0:11:48 just, I don’t know. That was crazy to me. It’s been really fun just to see. Like, I think for me,
0:11:54 it’s just like the visual strength of the model is so good in terms of what you can do, like zero shot
0:11:59 that it’s like, yeah, it’s just awesome. Yeah. Yeah. Yeah. So I do have a handful of questions. I have
0:12:04 six questions total, so I won’t waste too much of your time. I’ll just kind of run through the
0:12:09 questions and we’ll take it from there. Yeah, let’s do it. Cool. So my first question is with
0:12:16 Gemini 3, was there something that changed with how the model was trained or like the training data on
0:12:22 the model? Like what made Gemini 3 such a big leap over previous models? I mean, I think honestly,
0:12:26 Matt, it’s a combination of multiple things. And I think that’s actually what makes it so exciting
0:12:32 is like, it really does feel like it’s a team effort here in terms of both architectural changes on the
0:12:38 pre-training side, but also translating those then downstream on the post-training side for what we
0:12:44 were pushing in terms of tool use improvements, in terms of multimodal improvements. I think the other
0:12:50 thing is with Gemini 3, it was really an exercise in us seeing and learning from the feedback from 2.5
0:12:55 in terms of what were people really loving about the model, but also where were people potentially
0:13:00 giving us feedback about where the model wasn’t as usable or wasn’t as exciting. And then how do we
0:13:04 actually take the combination of those two things and try to build something awesome? And I think it’s
0:13:06 been like the culmination of all of that work.
0:13:10 Gotcha. Gotcha. So was it like a fine-tuning thing? Like you were kind of constantly fine-tuning it
0:13:16 towards the feedback that was given? Yeah. So we were doing that as well as then building on kind
0:13:20 of the strengths of the model itself in terms of the work that our pre-training teams have done. So
0:13:25 it’s really been a combination of both of those things. Gotcha. Now I know you personally, you’ve had
0:13:31 a big focus on safety and that sort of thing, from my understanding. Has Google’s and your approach to
0:13:35 safety changed at all over the last sort of several Gemini iterations?
0:13:40 I mean, I think what’s been, for me at least, important is I think safety has been kind of
0:13:45 central to the model development process from day one. And that was true for Gemini 2, 2.5,
0:13:53 1.5, 3. So, in that sense, no, because it’s been a consistent focus and a consistent
0:14:00 prioritization across our models. I think with Gemini 3 and with any model that is the most capable model
0:14:06 we release or the next generation of models, we spend a lot of time on frontier safety.
0:14:11 And one of the things that we really invested in with this model is making sure to do kind of robust
0:14:16 external safety testing to really invest in what are our kind of understanding and protections of this
0:14:22 model. How do we get that right on day one? And so I think that’s been really important. One thing
0:14:27 also that’s related to safety that we spent a lot of time on with this model is also things like
0:14:33 reducing sycophancy and trying to make sure that the model has a more balanced approach to how it
0:14:37 responds. And I think that is also like, it’s not quite safety, but it’s also in the vein of
0:14:40 how do we build a really good experience for users?
0:14:47 Yeah, it totally makes sense. So I’m curious, is there anything that’s not in Gemini 3 that you
0:14:49 expect to be in future Gemini models?
0:14:56 Ooh, good question. I mean, Gemini 3 is a huge step up in a lot of ways. I think there’s two areas I’m
0:15:03 excited about continuing to push. One is, you know, right now, Gemini 3, one of the things you saw this
0:15:09 morning is we showed a lot of really rich visual outputs, really interactive outputs. And a lot of
0:15:14 those are very code driven, right? One of the things we want to continue to push on with Gemini is
0:15:20 pulling these modalities together. Right now, Gemini is multimodal in, but it still outputs text,
0:15:26 right? How do we also like really bring that rich output modality to that kind of mainline model?
0:15:31 So that’s one area that I’m like really excited to continue to see us push on and push forward.
0:15:37 Another area I’m really excited for us to like continue to push on is really how we continue to
0:15:44 iterate on multi-turn tool use, kind of long journeys. This model does an amazing job at a lot of those
0:15:50 things. And you can see that in some of the evals like Vending-Bench or SWE-bench or Tau-2. But I
0:15:55 think that’s an area where we know that’s where the world is going in terms of kind of complex actions
0:15:59 and investments. And I think that’s where we want to continue to hill climb, I think.
0:16:04 Yeah, yeah. One of the really cool things that I saw when I was playing with Gemini was I gave it like
0:16:09 seven tasks in a row and said, go through and do all seven of these tasks. And then the output was
0:16:13 like, all right, here you go. Here’s one done. Here’s number two done. Here’s number three done.
0:16:18 And it just did all of these tasks one after another from a single prompt, just one initial
0:16:23 prompt, which was really cool. The other thing I’m really excited about, but haven’t tested
0:16:29 with Gemini 3 is the longer context with video inputs. Like I think there was a very, very brief
0:16:34 mention of like, you can actually give it like up to like hour long videos now, and it will actually
0:16:39 understand the context, which is something I’m super excited to test as somebody who makes YouTube
0:16:43 videos, you know? Yeah, actually, like my husband also makes YouTube videos. And so this is something
0:16:48 that like I’ve been watching more closely. And actually there’s a demo we pulled together, since he’s a huge
0:16:54 pickleball player, where you can actually, like, have the model critique your form in many of these
0:16:58 kinds of cases. And I think that’s the kind of thing, like there’s both the sheer high quality of
0:17:04 video understanding at like, you know, very specific FPS. So you can actually like capture this kind of
0:17:09 nuance and then actually have the model reason over it, which is awesome. Yeah. Yeah, absolutely. I mean,
0:17:13 I used to actually take golf lessons and that was one of the things when I was taking golf lessons is
0:17:17 they’d take me into a room, show me a video of my swing. And then the instructor would be like,
0:17:21 see right here, the way you kind of move your arm there. If you just adjust your elbow up a little
0:17:27 bit, you know? So that’ll be really cool to have that sort of built into like, I don’t need the
0:17:32 trader anymore to show me how to do that. I can analyze my own stuff. That’s really cool. Was there
0:17:38 anything that Gemini three was capable of doing that you guys didn’t expect it would be doing like any
0:17:42 sort of like emergent capabilities? We’re like, Oh, wow, that’s cool. We didn’t see that one
0:17:47 coming. It’s interesting because I feel like we’ve been iterating on Gemini three for a while. So
0:17:51 in some ways we were able to take some of the early emergent capabilities where we were like, wow,
0:17:56 this model is actually like surprisingly good in this area and then build on top of it. Right.
0:18:02 Kind of take that further. I think we knew the model was going to be great in multimodal understanding,
0:18:07 but I think just how versatile it is across all of the different aspects of that, I think has been
0:18:12 kind of a pleasant surprise. I think for me, another one that I actually would say a lot of folks in the
0:18:17 building won’t necessarily tell you is a surprise, but one that I am like very excited about actually
0:18:23 is its multilingual performance. Like the model is very, very strong across languages. And then you
0:18:28 can think about these intersections, right? So like the intersection of multimodal with multilingual
0:18:33 means you could, for example, like there’s someone who created a demo of like taking handwritten
0:18:39 Korean and then translating that into a web app. Right. So there’s a lot you can do now across the kind
0:18:43 of rich combination of these things that I think is actually really cool. And then leads to a bunch
0:18:48 of emergent properties that is actually really awesome. Awesome. When it comes to benchmarks,
0:18:54 I’m curious if there’s like any specific benchmark you guys put a little bit more weight on or are trying
0:18:59 to, like, optimize for. Like, what is the benchmark that, inside of Google, people are like, let’s see
0:19:04 how we can push this one? Yeah. I mean, I think part of what we’re actually really proud of is I think
0:19:08 one of the things that we hold really steadfast as a goal is we really want a well-rounded model.
0:19:14 And I think that really means two things. One, it means not just focusing on a single benchmark,
0:19:19 but actually looking across a slate. And so when you see the benchmark table that we’ll share,
0:19:23 it basically like has a set of benchmarks that are more like academic reasoning benchmarks,
0:19:29 a set of benchmarks that are multimodal, a set of benchmarks around code and then tool use.
0:19:33 And so what we really do is try to say like, hey, there’s a set of capabilities we really want this
0:19:39 model to excel at. And what is a range of evaluations that will help us kind of do that?
0:19:43 And then the thing is, is it’s also important actually, not just to look at those benchmarks,
0:19:48 the kinds that you publish externally, but then internally, we also have evaluations for model
0:19:54 behavior, for style, for instruction, following these types of kind of core competencies of the
0:19:59 model that are actually just as important. Right. And then you also have, you know, sources like
0:20:04 LM Arena that allow you to get live feedback. And so I think the combination of all three of those
0:20:09 things allows us hopefully to have a model that doesn’t feel like it’s just really jacked up at
0:20:14 one thing, but that it actually like can kind of meet you where you are, which I think is super
0:20:19 important. Right. Right. Yeah. I’ve been personally trying to find like the best benchmark to hone in on
0:20:26 for like, is this model better for general consumers? Right. Like, so my audience on my channel
0:20:32 is mostly focused on general consumers who are just, they kind of want to stay looped in on it,
0:20:35 but they may not be coders or the most technical people in the world. They’re just kind of
0:20:41 wanting to be looped in. And for me, I mostly look at the LM Arena because I feel like this is what the
0:20:46 people are saying. Like anybody who types a prompt, this is what they’re saying they like the best,
0:20:51 but I’m still trying to find like that benchmark to hone in on and be like, this is what you guys should
0:20:56 be paying attention to. Yeah. I mean, I think one area that we benefit from when we think about
0:21:01 consumer value is also that we have these amazing consumer experiences. And so one of the things,
0:21:06 you know, we try to do in partnership with, for example, the Gemini app is run live experiments,
0:21:11 get user feedback, you know, on AI Studio, we again run experiments. We allow people to side
0:21:16 by side vote as they’re building in experiences in AI Studio or chatting with the model in AI Studio.
0:21:20 And so I think actually it’s, it’s interesting when you talk about like, what is just a good model
0:21:27 for consumers? The best signal in my view is actually to just get feedback from consumers in the experiences
0:21:33 that they’re trying the model in. And so for me, that’s the big thing we’re going to continue to invest
0:21:37 in is actually just trying to get as much feedback from our users as possible about what they love and
0:21:42 don’t love about the model. And then using that to improve the model, you know, rather than a benchmark,
0:21:43 it’s the most kind of direct flywheel.
0:21:51 Absolutely. So this is my last question. What is your personal, you, Tulsee, what’s your favorite
0:21:56 use case of Gemini 3 or use cases? Like what are some of the ways you’re using it right now that
0:21:58 you’ve found really impressive?
0:22:02 Yeah. So, I mean, I’ve been using Gemini 3 for the last few weeks. I’ve been like lucky to kind
0:22:07 of get to try a bunch of things. I think similar to you, the ones that really pop out as like wow
0:22:12 moments are when you’re trying to create some of these like just amazing web app experiences that
0:22:17 are like super fun. So my family speaks Gujarati, which is a language from the Western part of India.
0:22:24 And I’ve been actually really enjoying trying to take Gujarati phrases, poetry, stories, and then
0:22:29 actually like not only translate them, but have the model build on them. And so the models like
0:22:34 writing capabilities, multilingual capabilities, and like that’s something that I haven’t been able
0:22:39 to do as much because Gujarati is not a language that is super common on the web. And so it’s
0:22:44 actually one where it’s harder to capture that nuance. And I find that Gemini 3 has been able to
0:22:48 do that much better than I think I would have expected, which is pretty cool. I’ve also just been
0:22:53 honestly starting to use it in my daily life, which is pretty awesome for a wide range of things,
0:22:59 including creating like cool invitations that are more interactive for like events and just
0:23:00 it’s been really fun.
0:23:04 Awesome. Well, thank you so much. I added a few bonus questions in there because I had some
0:23:10 follow-ups, but that was my list of questions that I had ready for you. Is there anything else that,
0:23:12 you know, maybe we should touch on that we didn’t touch on?
0:23:18 We touched on a lot of things in that original brief. So I feel like we covered most of the ground.
0:23:22 One thing that you and I, I think, didn’t talk as much about, but we kind of touched on briefly is
0:23:27 this idea of like Gemini 3 should be able to bring anything to life and help you bring anything to life.
0:23:32 I think one of the things that’s really powerful is this kind of transitioning between formats,
0:23:37 right? So one of the things that I think Robbie talked about is this idea of generative layouts
0:23:42 and the model being able to help you kind of visualize concepts and actually like
0:23:47 help them be more interactive as you engage with them. I think one of the things we’re really excited
0:23:53 about is how we can actually leverage Gemini to build these like richer experiences for users and for
0:23:59 consumers. I think the other is that Gemini 3 is super widely available. And I think that’s
0:24:02 something we’re really excited about. Like we’re really excited about how do we bring the power of
0:24:07 Gemini 3 everywhere? So it’s not just a model that is great for developers. It’s also a model that brings
0:24:11 a ton of value to you in the app. It’s a model that brings a ton of value to you in search.
0:24:16 And I think the idea is that the model should meet you where you are and then help improve your
0:24:19 experience. I think that means, to me, hopefully, more than anything.
0:24:23 Totally. Yeah. You just reminded me of something that I tested that I thought was really cool.
0:24:28 I actually took the "Attention Is All You Need" paper and fed it in and said, "Generate an animated
0:24:34 visual that explains this paper." And it actually did make an animation that was really cool that
0:24:40 explained, like, you know, if a sentence has "it" in it, it looks back through the sentence to find out
0:24:45 what "it" means. And it created this whole, like, visual explainer of the transformer architecture.
0:24:49 And I was like, this is wild. So yeah. And that’s amazing. Right. Like just like,
0:24:54 and I think about it as like, if you go back to kind of Google’s mission of like make information
0:24:59 universally accessible and useful, like I think that kind of example where you can take information
0:25:04 that is super dense and super complex and then actually translate it to a format that is like
0:25:07 really rich and understandable. I mean, that’s awesome.
0:25:12 Yeah, absolutely. Absolutely. Well, Tulsee, thank you so much. This has been awesome. And I really,
0:25:16 really appreciate you taking the time out of your day. I know there’s a lot going on over there right
0:25:20 now. Of course. Thank you. And I’m excited to see the content. Please keep giving us feedback by the
0:25:23 way, as you’re playing with the models. That’s awesome. Absolutely. Yeah. Thanks again. Really
0:25:29 appreciate it. Thanks. Thank you so much for listening and a huge thank you to Tulsee Doshi for
0:25:35 joining us and giving such an inside look at what’s happening with Gemini 3. If you enjoyed this episode,
0:25:39 hit that like button, subscribe to the show and share it with someone who’s curious about where AI is
0:25:45 headed. It helps the podcast grow and lets us bring on more amazing guests. Thanks again. I really
0:25:48 appreciate you being here and hopefully we’ll see you in the next one.
Get the free Google Gemini guide: https://clickhubspot.com/ebf
Episode 86: How big of a leap is Google’s Gemini 3, and what does it mean for the future of AI-powered creativity and productivity? Matt Wolfe (https://x.com/mreflow) sits down with Tulsee Doshi (https://x.com/tulseedoshi), the Head of Product for Gemini at Google, to explore the week’s groundbreaking AI releases—Gemini 3 and the new Nano Banana Pro image model.
In this episode, Matt shares hands-on demos and experiments with Google’s newest models, including multimodal reasoning, advanced image generation, and agent automation. Tulsee Doshi dives deep into how Gemini 3 was built, unexpected discoveries during development, her favorite multilingual and creative use cases, and how Google is making AI more accessible across products for creators, entrepreneurs, and everyday users.
Check out The Next Wave YouTube Channel if you want to see Matt and Nathan on screen: https://lnk.to/thenextwavepd
—
Show Notes:
- (00:00) Gemini 3 Innovations Explored
- (03:25) AI Innovations: Games, Demos, Automation
- (09:25) Gemini 3 Insights with Tulsee
- (11:52) Gemini 3: Team Effort Evolution
- (16:38) Golf Lessons and AI Insights
- (18:32) Building a Well-Rounded Model
- (20:23) Improving Models Through User Feedback
- (23:50) Transformer Visualized Through AI
—
Mentions:
- Tulsee Doshi: https://www.linkedin.com/in/tulsee-doshi
- Gemini 3: https://deepmind.google/models/gemini/pro/
- Nano Banana Pro: https://blog.google/technology/ai/nano-banana-pro/
- AI Studio: http://ai.google.dev/aistudio
- LM Arena: https://lmarena.ai/
Get the guide to build your own Custom GPT: https://clickhubspot.com/tnw
—
Check Out Matt’s Stuff:
• Future Tools – https://futuretools.beehiiv.com/
• Blog – https://www.mattwolfe.com/
• YouTube – https://www.youtube.com/@mreflow
—
Check Out Nathan’s Stuff:
- Newsletter: https://news.lore.com/
- Blog: https://lore.com/
The Next Wave is a HubSpot Original Podcast // Brought to you by HubSpot Media // Production by Darren Clarke // Editing by Ezra Bakker Trupiano
