AI Voice Technology Just Got INSANE (ElevenLabs GenFM Demo + More)

AI transcript
0:00:07 >> [In German] Welcome to the Next Wave podcast with Matt Wolfe and Nathan Lands.
0:00:10 >> It feels like probably the biggest unlock.
0:00:14 You could produce content in one language and then make it available in 33 different languages.
0:00:18 >> Yes, to be able to just talk to AI now and to build so many different apps.
0:00:23 It’s like, this is a time I keep telling people where you can be the idea person who ships.
0:00:26 >> Hey, welcome to the Next Wave podcast.
0:00:29 I’m Matt Wolfe. I’m here with Nathan Lands.
0:00:32 And in this AI landscape, this AI world that we’re in,
0:00:35 there are so many different tools out there.
0:00:37 And most people are sitting around going,
0:00:39 “I don’t know how to use this in my business.
0:00:41 I don’t know what this is good for.”
0:00:42 Well, today, we’re going to talk about a tool
0:00:45 that people are actually implementing in their business
0:00:47 and people are actually generating revenue from
0:00:52 and using really successfully to generate side hustle income.
0:00:56 Today, we’re talking to Ammaar Reshi from ElevenLabs.
0:00:57 And if you’re not familiar with ElevenLabs,
0:01:00 it is a tool that allows you to train your own voice into it.
0:01:03 It creates podcasts. It creates sound effects.
0:01:07 It does all sorts of amazing things with audio.
0:01:11 And Ammaar is going to break down all of that for us on this show
0:01:14 and even show us inside the app how to use it,
0:01:16 how to get the most out of it.
0:01:18 And it’s just got a ton of cool features.
0:01:21 >> The big unlock here is the fact that with ElevenLabs,
0:01:23 you could produce content in one language
0:01:26 and then have it go out to 33 different languages.
0:01:28 So in business, that is such a huge unlock
0:01:30 that you can now reach all these new markets
0:01:31 that you couldn’t before.
0:01:34 >> Yeah, this tool is absolutely game changing.
0:01:35 We use it ourselves.
0:01:37 It is a super fun tool, too.
0:01:38 You’re going to love it.
0:01:41 And so let’s go ahead and just break it all down with Ammaar Reshi.
0:01:46 >> Look, if you’re curious about custom GPTs
0:01:49 or are a pro who’s looking to up your game, listen up.
0:01:53 I’ve actually built custom GPTs that help me do the research
0:01:56 and planning for my YouTube videos
0:01:58 and building my own custom GPTs
0:02:00 has truly given me a huge advantage.
0:02:02 Well, I want to do the same for you.
0:02:04 HubSpot has just dropped a full guide
0:02:06 on how you can create your own custom GPT
0:02:08 and they’ve taken the guesswork out of it.
0:02:11 We’ve included templates and a step-by-step guide
0:02:13 to design and implement custom models
0:02:16 so you can focus on the best part, actually building it.
0:02:19 If you want it, you can get it at the link in the description below.
0:02:20 Now, back to the show.
0:02:25 [MUSIC]
0:02:28 >> We’re here with Ammaar Reshi from ElevenLabs
0:02:31 and we’re going to talk about some of the really cool stuff
0:02:34 that ElevenLabs has been rolling out recently.
0:02:36 So Ammaar, let’s jump in real quick though
0:02:38 and give a little bit of background on you.
0:02:40 What were you doing before ElevenLabs
0:02:41 and how did you get involved with them?
0:02:42 What’s your role there?
0:02:44 Kind of give us the lay of the land a little bit.
0:02:46 >> Yeah, yeah, sounds good.
0:02:48 I caught the AI bug like everyone else
0:02:52 and around late 2022, I fell into it all.
0:02:55 Really on the deep end when I published a children’s book
0:02:57 that accidentally went viral.
0:02:59 You remember that, Matt, because that’s how we met.
0:03:00 >> Yeah, yeah.
0:03:02 >> And so after that episode,
0:03:06 I basically started to just share stuff with AI every month or so.
0:03:07 It was a new experiment,
0:03:09 new tools are coming out all the time, right?
0:03:12 We just got Runway, we just got Pika and all these things.
0:03:16 And so it was a new thing every month or so,
0:03:17 sharing each creation.
0:03:20 And ElevenLabs was a tool I stumbled across.
0:03:22 I started using it for a bunch of my experiments,
0:03:23 voice-overs in videos,
0:03:28 characters voiced in those AI-generated videos.
0:03:31 And yeah, I really, really enjoyed using the tool,
0:03:33 but I was burning credits and I was-
0:03:37 So you’re like, hire me, hire me, give me free credits.
0:03:41 Well, that’s hilarious.
0:03:43 It’s like the college student working at like Burger King
0:03:45 or something like getting free burgers, right?
0:03:46 It’s like-
0:03:49 But no, through that, I essentially was like,
0:03:51 how do I get in touch with ElevenLabs?
0:03:53 I wonder if they’d enjoy using this product.
0:03:56 I spoke to the founder, we had a great conversation,
0:03:59 and he was also looking to hire someone to lead the design team.
0:04:02 And that’s essentially how it started.
0:04:05 We hit it off and yeah, the rest is history.
0:04:09 And I’m head of design at ElevenLabs and having a blast.
0:04:09 >> Very cool.
0:04:13 Yeah, I remember we were doing some Twitter spaces for a while.
0:04:14 We were on a pretty consistent streak
0:04:16 of doing like these weekly Twitter spaces.
0:04:19 And I think that’s how we initially connected.
0:04:21 And you had the book that you put out.
0:04:24 And I think that book was actually one of the first times
0:04:25 that I started to realize,
0:04:28 oh, AI is actually kind of a controversial space.
0:04:30 Like up until that point, I was just like,
0:04:32 look at these cool tools, these are so much fun.
0:04:36 But I started to see some of the backlash bubble up
0:04:39 around that time, around AI.
0:04:42 And so I do remember that very well.
0:04:45 I remember you had kind of like a mixed experience
0:04:46 putting out that book, let’s say.
0:04:50 >> Yeah, mixed is probably the right word.
0:04:53 I had a bit of an Oppenheimer moment looking in the lake.
0:04:55 Like what have I done to art?
0:04:59 But no, it was a great learning experience
0:05:02 because I think it showed me, as someone in the tech bubble
0:05:03 in San Francisco,
0:05:04 and also as someone who always just sees
0:05:06 the optimistic side of all of this:
0:05:08 hey, actually there’s a whole set of people
0:05:10 who have a very visceral reaction to this.
0:05:12 And it was worth seeing their side of the story,
0:05:15 even if the way they reacted was maybe a little harsh.
0:05:18 >> Yeah, I think that’s the best perspective
0:05:21 in the world of AI is to try to see both sides of the coin.
0:05:24 Let’s talk about ElevenLabs a bit.
0:05:26 ElevenLabs, I think, first came onto the scene
0:05:28 as like a voice cloning tool.
0:05:30 It was the first one I ever remember coming.
0:05:32 There were voice cloning tools out there,
0:05:35 but they were very obviously AI, right?
0:05:38 Like they were like that Siri Alexa kind of voice
0:05:40 where you can tell it was AI.
0:05:42 ElevenLabs was the first one I remember popping up
0:05:45 where I was like, I could barely tell the difference
0:05:47 if this is a human or not anymore.
0:05:50 And then I was able to actually clone my own voice into it
0:05:53 and make like little recordings with my own voice.
0:05:57 But I mean, ElevenLabs has come so far since then.
0:06:01 Like, I guess what is ElevenLabs’ like big picture mission?
0:06:03 Like who are they trying to help?
0:06:05 And like who are these tools for?
0:06:07 Because it seems like there’s a pretty wide range
0:06:09 of abilities it has now.
0:06:10 Yeah, yeah, yeah, yeah.
0:06:12 So a funny story is the way it started
0:06:16 was the two co-founders in Poland were watching a show
0:06:17 on one of the streaming sites.
0:06:18 It was all dubbed in Polish.
0:06:22 But in Poland, that show was dubbed by one voice actor
0:06:24 who did the men and the women.
0:06:27 So you can imagine what an experience that was like.
0:06:29 And it kind of took them down the rabbit hole.
0:06:31 There’s got to be a better way, right?
0:06:34 And so to start with the problem we all had,
0:06:36 whenever you watch a dubbed show and it sucks
0:06:38 and you’re like, oh, I wish there was a better way.
0:06:40 I don’t want to read the subtitles.
0:06:42 And so that kind of took them down the dubbing rabbit hole.
0:06:45 But the mission there and the mission, I think,
0:06:48 still holds true, which is we say internally,
0:06:50 it’s like make content universally accessible,
0:06:51 any language, any voice.
0:06:54 And that’s where a lot of the dubbing stuff started.
0:06:57 But then also the huge voice library now
0:07:00 with 32 plus different languages that the models support.
0:07:01 And so it still is that.
0:07:03 It’s we really want to make content engaging
0:07:05 any language, any voice.
0:07:07 You should have the same great authentic experience
0:07:08 that you had in your language
0:07:10 in another language for other people.
0:07:12 But what’s been amazing is that platform that started
0:07:16 as this voice replication and dubbing thing
0:07:20 is now just a complete AI audio platform and suite
0:07:22 where you just generate sounds, soon music.
0:07:25 So there’s going to be all sorts of fun stuff coming there.
0:07:28 When it comes to celebrity voices and stuff like that,
0:07:33 what are ElevenLabs’ thoughts or approach on cloning voices
0:07:34 without permission?
0:07:37 Because I do know that’s a worry of a lot of voice actors
0:07:39 especially. The stance internally,
0:07:41 and at least the things that I can speak to are,
0:07:44 you know, we take the deepfake stuff like very seriously
0:07:48 and the safety bits are actually a feature built in
0:07:50 that we think through from the beginning.
0:07:52 So very much like, how do we recognize voice signatures
0:07:54 for these famous voices?
0:07:57 How are we seeing what’s generated and moderating that?
0:07:58 So that stuff is like all monitored.
0:08:02 And I think the team has kind of been working on that nonstop
0:08:05 since the beginning, which is why even this election cycle
0:08:07 was very smooth and everything went well.
0:08:10 And I think, yeah, but on the celebrity voices thing,
0:08:11 we do want to work with them, right?
0:08:14 I think, and we’ve already started
0:08:16 having a bunch on the platform.
0:08:18 So on the mobile app, we have Burt Reynolds
0:08:22 and Judy Garland and James Dean and Jerry Garcia
0:08:24 and Deepak Chopra actually just put his voice on too.
0:08:27 And you can like, you can listen to his meditations
0:08:27 in his voice.
0:08:31 So it’s been really fun to kind of start getting
0:08:33 more and more folks on the platform that way.
0:08:36 Well, I would love to sort of dive in and like,
0:08:39 maybe even do some screen sharing and maybe talk about
0:08:43 some of the features that you are, you think are the coolest.
0:08:46 Maybe anybody watching this on YouTube can get a peek.
0:08:49 If they haven’t tried ElevenLabs, they can see sort of some
0:08:51 of the stuff that it’s capable of.
0:08:54 And then maybe even take a look at the iPhone app
0:08:56 if we can figure out how to do it technically.
0:08:59 Take a look at the iPhone app because that ElevenLabs Reader
0:09:00 is really, really cool.
0:09:02 Like you can go and take any article you find online,
0:09:06 any PDF, any, pretty much any source of text,
0:09:09 throw it in there and have somebody read it to you.
0:09:11 And now you can even have it turned into a podcast for you.
0:09:13 So like, that’d be really cool to talk about.
0:09:16 But maybe let’s start with like the desktop app.
0:09:17 Let’s do it.
0:09:18 Let’s do it.
0:09:18 Cool.
0:09:20 Well, okay, we’re in the dashboard here.
0:09:23 And this is essentially where you can start creating
0:09:27 all sorts of stuff with our generative AI audio.
0:09:28 So let’s start with text-to-speech.
0:09:31 This is essentially what most people end up using
0:09:34 because it’s widely applicable for voiceovers in video games,
0:09:37 to YouTube videos, to any sort of way
0:09:40 you might want to use generated speech.
0:09:41 So here I’ve got maybe, you know,
0:09:45 a nice fun opener for a potential podcast.
0:09:47 And let’s just regenerate and hear it.
0:09:48 Hey, everyone.
0:09:50 Welcome back for another deep dive.
0:09:51 Cool.
0:09:53 So you got a sense of Brittany there
0:09:55 who’s in our voice library.
0:09:58 And you can see we’ve got tons of voices as well
0:10:00 for so many different use cases.
0:10:03 So that’s Brittany, our kind of social media style voice.
0:10:04 Hey, everyone.
0:10:06 Welcome back to the channel.
0:10:08 And then you have, you know, trailer voice.
0:10:10 Remember those trailers from the 90s?
0:10:14 In a world where AI voices sound like robot.
0:10:17 Can we do like one for the Next Wave like that?
0:10:21 Welcome to the Next Wave podcast.
0:10:23 And these voices that you’re sharing,
0:10:24 these are voices that are available
0:10:26 to anybody with an ElevenLabs subscription,
0:10:29 or are these like ones that you’ve personally trained
0:10:30 that are just in your account.
0:10:32 Yeah.
0:10:34 These are all available in the library.
0:10:36 And so the ones you’re seeing are ones
0:10:37 that I’ve just added from the library,
0:10:38 available to everyone.
0:10:39 Gotcha.
0:10:39 Cool.
0:10:41 Let’s hear the trailer voice.
0:10:43 Welcome to the Next Wave podcast
0:10:46 with Matt Wolfe and Nathan Lands.
0:10:49 Okay, that’s the new intro.
0:10:50 Yeah, yeah.
0:10:52 I don’t have to record one anymore.
0:10:52 We’ll just use that one.
0:10:54 Yeah, I’ll download this and send you the file.
0:10:54 Okay.
0:10:57 Amazing.
0:10:58 Yeah.
0:10:59 And so you’ve got the voices,
0:11:01 we’ve got the different models,
0:11:02 our turbo model, which is cheaper
0:11:04 and way faster.
0:11:05 And so that’s really great for people
0:11:06 who are building apps
0:11:09 and need really fast reaction times
0:11:11 for the generations.
0:11:14 And then yeah, some legacy models,
0:11:16 if people still prefer how some of the older ones sound,
0:11:17 but we kind of kept that in there
0:11:18 for our power users.
0:11:19 But yeah.
0:11:21 And then real quick, what are the little like,
0:11:22 you’re probably about to get into this
0:11:23 and I’m just sort of getting ahead of myself,
0:11:25 but what are the little like sliders
0:11:27 and can you sort of explain what those do
0:11:29 and how they affect the output?
0:11:30 Yeah, so a few of them.
0:11:32 So stability is essentially,
0:11:34 you can push it to be more variable
0:11:35 and you’ll get more intonation
0:11:36 and stuff like that.
0:11:39 But it’s a little, it gets a little unstable.
0:11:41 So it’s really, if you just want to experiment
0:11:43 with the way the output might play out,
0:11:45 similarity is really useful
0:11:47 if you have a replica of your voice
0:11:48 and you’re trying to decide
0:11:50 how similar you want it to sound
0:11:52 versus maybe loosen it a little bit
0:11:53 and try something else.
0:11:55 And then style exaggeration.
0:11:57 Yeah, this one is,
0:11:59 I view it as like a very experimental slider.
0:12:02 It’s like, you’re not sure what you’re going to get,
0:12:04 but you might get a range of emotions.
0:12:07 So if it’s feeling a little boring,
0:12:08 try the style exaggeration.
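The three sliders described here can be pictured as numeric settings sent along with the text. Below is a minimal Python sketch, assuming each slider maps to a value between 0 and 1. The field names (`stability`, `similarity`, `style_exaggeration`), the defaults, and the helper `build_tts_request` are illustrative labels taken from the conversation, not a confirmed ElevenLabs API, and nothing is sent over the network.

```python
from dataclasses import dataclass, asdict


def clamp01(value: float) -> float:
    """Keep a slider value inside the assumed 0.0-1.0 range."""
    return max(0.0, min(1.0, value))


@dataclass
class VoiceSliders:
    # Hypothetical field names mirroring the three sliders in the demo.
    stability: float = 0.5           # lower = more varied intonation, less stable
    similarity: float = 0.75         # how closely to stick to the source voice
    style_exaggeration: float = 0.0  # experimental: may widen the emotional range

    def __post_init__(self) -> None:
        # Out-of-range values are clamped rather than rejected.
        self.stability = clamp01(self.stability)
        self.similarity = clamp01(self.similarity)
        self.style_exaggeration = clamp01(self.style_exaggeration)


def build_tts_request(text: str, voice: str, sliders: VoiceSliders) -> dict:
    """Assemble an illustrative request payload (not sent anywhere)."""
    return {"text": text, "voice": voice, "settings": asdict(sliders)}


request = build_tts_request(
    "Hey, everyone. Welcome back for another deep dive.",
    "Brittany",
    VoiceSliders(stability=0.3, style_exaggeration=1.4),
)
print(request)
```

Regenerating with the same settings can still sound slightly different, as discussed in the episode; changing a slider is the way to push the output further.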
0:12:09 Yeah.
0:12:10 And one thing that’s really cool too
0:12:13 is that it’s like, you generate a sentence
0:12:14 and you hear it, you know, like,
0:12:17 I don’t really like the way that came out.
0:12:18 You tweak a slider.
0:12:20 I mean, it seems like you don’t even have to tweak a slider.
0:12:21 You can just generate it again.
0:12:22 It’ll sound a little bit different,
0:12:24 but tweaking the sliders will sort of
0:12:28 make a bigger impact on the regeneration of it.
0:12:29 Oh, 100%.
0:12:31 And the other thing that actually makes a difference
0:12:33 is how you’ve written the text out.
0:12:35 So for instance, if I had done all caps,
0:12:37 it’s actually going to be a bit louder
0:12:38 and more exaggerated.
0:12:41 The exclamation mark is adding more emphasis.
0:12:42 So it actually understands the context
0:12:44 of the text that it’s reading.
0:12:46 Yeah, there’s some handy tips.
0:12:48 Like, I’ve been playing with ElevenLabs for,
0:12:50 I don’t know how long ElevenLabs has been around,
0:12:52 but I feel like I’ve been playing with it
0:12:54 for at least 18, 19 months now.
0:12:57 I didn’t even know some of that kind of stuff.
0:12:59 So that’s really cool.
0:13:00 Amazing.
0:13:02 My favorite, though, is the voice changer.
0:13:04 And I’m not sure how I’m going to demo it here.
0:13:07 It’s a tricky one, but you can take your own voice
0:13:08 and transform it into another.
0:13:12 So, you know, maybe you want to say that trailer line
0:13:12 in a different way.
0:13:17 We can take that audio and then turn you into David,
0:13:20 the trailer voice, with exactly how you said it.
0:13:23 Yeah. So all the, like, specific inflections
0:13:26 that you might put into the sentence and things like that,
0:13:28 it’s going to follow that exactly, pretty much.
0:13:29 It’ll follow that exactly.
0:13:32 And so this is, honestly, if you really want to direct the voice,
0:13:33 this is the best way to do it.
0:13:36 You have to, you know, be a little bit of a voice actor yourself,
0:13:39 but you’ll get some fun output with this one.
0:13:39 Yeah, yeah.
0:13:43 I wonder if it would work, like, if it’ll take the audio
0:13:46 from the microphone, even though you’re on a podcast.
0:13:47 You really want me to voice that right now?
0:13:52 What about, like, a three-word thing?
0:13:55 Like, let’s freaking go or something like that.
0:13:57 All right, well, I’m going to try the mic I’ve got over here.
0:13:59 Okay. Okay.
0:14:01 Hopefully it doesn’t take over my mic over here,
0:14:02 but we’ll see.
0:14:04 Three, two, one, let’s go.
0:14:08 Three, two, one, let’s go.
0:14:12 Three, two, one, let’s go.
0:14:14 So perfect.
0:14:14 That’s so cool.
0:14:17 I love that.
0:14:21 So, like, if you’re not quite getting the exact output you want
0:14:22 by typing in the text,
0:14:25 this is sort of like the next thing you can go try,
0:14:29 just speak it out in the exact way you want it to be heard,
0:14:31 and then it’ll generate it that way.
0:14:34 And so, yeah, you saw, you know, the voices here,
0:14:38 but I do want to show you the library where all of the voices are.
0:14:40 And so this is where, you know, I think you can have the most fun,
0:14:43 just browsing what all the different voices on the platform are.
0:14:46 And we have voice actors as well who’ve now added their voice.
0:14:53 So one of my favorites is Carter who actually did a lot of the voices
0:14:55 for Mortal Kombat and Street Fighter.
0:14:58 So Baraka and like all these characters that you remember, yeah.
0:15:02 I’m an 80s kid, so yeah, definitely.
0:15:04 Yeah, I love Mortal Kombat.
0:15:07 So for me to see his voice on the platform is super cool.
0:15:08 You can hear it.
0:15:11 I’ve failed over and over and over.
0:15:12 So his voice Shao Kahn,
0:15:15 and this is so much like that Shao Kahn voice, right?
0:15:17 Oh my God, I got to use that for a video game.
0:15:21 That’s got to be like the voice in the game, like the narrator.
0:15:23 Yeah, 100%.
0:15:25 And we even have a singing voice.
0:15:30 So and this is kind of the cool thing about the replicas, right?
0:15:33 It’s like you can record any sort of recording
0:15:35 and it’ll take that into consideration.
0:15:40 Although I’m almost 30, I feel like a schoolgirl when I think of you.
0:15:46 Maybe that’s the one we should use for our Next Wave intro.
0:15:47 Yeah, definitely.
0:15:50 That’s super cool.
0:15:54 Now, I know there’s like a sort of like marketplace as well.
0:15:57 Like, do you know MattVidPro, the AI YouTuber?
0:16:01 He trained his voice into ElevenLabs
0:16:05 and he put it into some sort of like marketplace too.
0:16:09 And now he says that he keeps on seeing like TikToks
0:16:12 and like Instagram reels and stuff that are using his voice.
0:16:14 And it’s like, it’s been weirding him out.
0:16:16 But like, what’s the whole like marketplace thing?
0:16:18 Yeah, yeah.
0:16:20 So voice actors, when they come to the platform,
0:16:23 they essentially can start earning when people use their voice.
0:16:27 And so we have people earning in the thousands of dollars a month passively,
0:16:29 just not doing anything and other people are using their voice.
0:16:34 And so I think Matt Vid is probably a recipient/victim according to…
0:16:38 Of this popularity, yeah.
0:16:43 But yeah, no, he’ll be earning for all that virality, which is cool.
0:16:46 Are there any limitations on how his voice can be used?
0:16:50 Like, can it be used for selling hymns or something like this?
0:16:54 Yeah, there are people who have…
0:16:57 You know, they can keep their voice on the platform
0:16:58 for a specific period of time.
0:16:59 They can pull it back.
0:17:00 So he does have that control.
0:17:02 So that’s super cool.
0:17:05 Yeah, it’s like a, you know, if you want a little passive income stream,
0:17:09 and you have a decent sounding voice,
0:17:11 go generate or train your voice in there,
0:17:13 and you can sort of sell it on the platform.
0:17:14 That’s pretty cool.
0:17:17 Totally. And, you know, it’s not just one voice.
0:17:18 You might be able to do multiple voices.
0:17:22 So Carter here does his, like, you know, video game style character,
0:17:24 but he also has a casual conversational one.
0:17:29 This man was attacked by a shark, but incredibly, this attack saved his life.
0:17:30 And so you can see that he’s…
0:17:32 Wow, great range.
0:17:36 We all know that regular exercise is good for the body.
0:17:37 A modern man’s man, Val…
0:17:39 And so he’s got different voices.
0:17:40 They’re all earning different ways.
0:17:42 It’s, you can, yeah, if you’re a voice actor,
0:17:44 I’m sure you can do a range of voices.
0:17:45 So, yeah.
0:17:45 Yeah, for sure.
0:17:46 Yeah, no.
0:17:48 That’s super cool.
0:17:49 Now, what’s the main use case right now?
0:17:50 Do you, is it, is it people doing, like,
0:17:52 faceless YouTube channels and stuff like that?
0:17:54 Or, like, what’s, like, what’s the main way
0:17:56 people are actually using ElevenLabs?
0:17:57 It’s a range, honestly.
0:17:59 The faceless YouTube channels are huge.
0:18:02 I was just watching a cricket game yesterday,
0:18:06 and the ads in between, I could recognize the voice.
0:18:07 I was like, that’s Brian.
0:18:12 And so there’s advertisements that it’s being used for.
0:18:15 And then, yeah, I think a lot of these, like,
0:18:19 video game things are now, are using them a lot, too.
0:18:22 Yeah, I’ve actually started even using the sound effects
0:18:23 feature a little bit more, too.
0:18:26 Like, you can have, like, a, you know, a knock on the door,
0:18:30 or, like, a loud crash, or, like, a, you know, an explosion,
0:18:33 or things like that, and just generate real quick sound effects.
0:18:36 And you don’t even need to go hunt them down on, you know,
0:18:37 stock sound effects sites anymore.
0:18:39 Just go to ElevenLabs, tell it what you want,
0:18:40 and generate it.
0:18:43 And now I’m starting to sound like a pitchman for ElevenLabs,
0:18:45 but, like, I legitimately do use it, so.
0:18:49 Like, how many languages work, too?
0:18:51 Like, is it good, like, going from one language to another yet,
0:18:53 or, like, where is that, where is that at?
0:18:54 Yeah, yeah.
0:18:56 So it goes between 32 different languages.
0:18:59 And yeah, you can, you can generate that same thing,
0:19:02 that text we had here in German.
0:19:03 Let’s actually do that.
0:19:04 Why not?
0:19:06 Yeah, that feels like probably the biggest unlock
0:19:08 that people haven’t really wrapped their heads around yet,
0:19:11 is the fact that you could produce content in one language,
0:19:14 and then make it, like, available in 33 different languages.
0:19:16 So does it handle the translation as well?
0:19:18 Can you type out a sentence in English,
0:19:20 and it’ll, like, translate it to Japanese,
0:19:23 and then speak it out in your voice in Japanese as well?
0:19:26 So you would have to do the translation yourself.
0:19:26 Okay.
0:19:30 But yeah, you will, you will get the effect.
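Since the translation happens outside the text-to-speech step, the workflow is two stages: translate the script per language, then generate speech from each translated script. Here is a minimal Python sketch of the first stage. The function names and the `demo_translate` stub are hypothetical placeholders; a real workflow would plug in a human translator or a translation service of your choice.

```python
from typing import Callable


def localize_script(
    script: str,
    target_languages: list[str],
    translate: Callable[[str, str], str],
) -> dict[str, str]:
    """Return one translated script per language, ready for speech generation."""
    return {lang: translate(script, lang) for lang in target_languages}


def demo_translate(text: str, language: str) -> str:
    # Placeholder only: tags the text instead of translating it.
    return f"[{language}] {text}"


scripts = localize_script(
    "Welcome to the Next Wave podcast",
    ["de", "ja"],
    demo_translate,
)
for lang, text in scripts.items():
    print(lang, "->", text)
```

Each entry in `scripts` would then be pasted or sent to text-to-speech with the same voice selected.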
0:19:31 Oh, I’m so weak.
0:19:34 [In German] Welcome to the Next Wave podcast
0:19:38 with Matt Wolfe and Nathan Lands.
0:19:42 The same trailer voice, but now in German, yeah.
0:19:45 My name is cool in German.
0:19:50 I like Matt Wolfe.
0:19:56 I mean, my last name is German anyway, so.
0:19:58 And yeah, on the sound effects front,
0:20:00 let’s just, I think I was trying to do this
0:20:02 for a video game I was making.
0:20:04 And let’s see, let’s hear what sound it makes.
0:20:09 Oh, yeah, that’s more like a barrier.
0:20:10 Yeah.
0:20:11 What’s the prompt?
0:20:13 Just for anybody who might just be listening on audio?
0:20:16 Yeah, so it’s a retro video game click sound.
0:20:17 Okay, got this one.
0:20:22 Yeah, and it kind of reminds me of those classic 8-bit games.
0:20:24 Yeah, like in the menu or whatever,
0:20:25 you’re picking something in there.
0:20:27 Yeah, it was, like, moving through the menu.
0:20:29 Let’s hear a car whizzing by.
0:20:30 Let’s see what that sounds like.
0:20:37 Really fast car.
0:20:42 But yeah, the cool thing here, though,
0:20:44 is that exactly what you said,
0:20:46 it’s like instead of spending hours searching
0:20:49 for that sound effect, just describe what’s in your head
0:20:51 and then hopefully you get a sound close to it
0:20:53 and we send you multiple samples
0:20:55 so you can hopefully get there pretty quick.
0:20:56 Yeah, yeah, for sure.
0:20:59 Is there anything else in the desktop app
0:21:02 that we haven’t covered on this yet?
0:21:06 My favorite one, which is the latest thing we’ve got
0:21:08 that’s come out recently, is conversational AI.
0:21:12 So we’re literally letting anyone build conversational AI agents
0:21:15 that they can talk to and try out.
0:21:18 And that just, the API is super simple too,
0:21:20 so anyone can literally start building
0:21:23 their own conversational experiences.
0:21:26 One thing I made was a little assistant
0:21:27 for our Wiki internally.
0:21:29 We have all this stuff about our offices
0:21:30 and all that stuff.
0:21:33 And it’s like, ah, do I have to go read through this Wiki?
0:21:35 Why can’t I just ask this agent?
0:21:38 Hey, what’s the Wi-Fi password in the London office, right?
0:21:40 And it’ll tell me that.
0:21:43 And so it’s as easy as like you go in,
0:21:47 you describe what it is, you give it a system prompt,
0:21:49 and you have all these other settings
0:21:50 if you really want to play with them.
0:21:53 You can choose the LLM.
0:21:56 For instance, I can go to Gemini Flash or GPT-4o mini.
0:22:00 And then here you can see I’ve got all of the Wiki
0:22:01 as a knowledge base for it.
0:22:05 So it knows what it’s basing the answers off.
0:22:07 And then we can talk to it.
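The setup steps described here (a description, a system prompt, a choice of LLM, and a knowledge base) can be sketched as a small configuration object. This is a Python illustration only: `AgentConfig`, its fields, and the model label `"gemini-flash"` are assumptions made for the example, not the actual ElevenLabs conversational AI schema.

```python
from dataclasses import dataclass, field


@dataclass
class AgentConfig:
    # All field names are hypothetical and chosen to match the walkthrough.
    name: str
    system_prompt: str
    llm: str
    voice: str
    knowledge_base: list[str] = field(default_factory=list)

    def add_document(self, text: str) -> None:
        """Add a knowledge-base document, skipping empty input."""
        text = text.strip()
        if text:
            self.knowledge_base.append(text)

    def build_context(self) -> str:
        """Join the system prompt and knowledge base into one prompt string."""
        docs = "\n\n".join(self.knowledge_base)
        return f"{self.system_prompt}\n\nKnowledge base:\n{docs}"


agent = AgentConfig(
    name="Wiki assistant",
    system_prompt="Answer questions about our offices using only the knowledge base.",
    llm="gemini-flash",  # illustrative label, not a verified model identifier
    voice="sci-fi robot",
)
agent.add_document("Office wiki page text goes here.")
agent.add_document("   ")  # ignored: blank after stripping
print(agent.build_context())
```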
0:22:11 Yeah, it’s almost like building like a custom GPT,
0:22:14 like the actual process of building it.
0:22:15 And I actually played with that.
0:22:16 I actually forgot about that feature
0:22:17 until you just brought it up again.
0:22:20 But I played with it in like a video either last week
0:22:22 or the week before, whenever it came out.
0:22:24 It was really quite cool.
0:22:27 Now, when you do build one of these, two questions.
0:22:29 Is there a limit to how much like knowledge
0:22:31 you can put into it?
0:22:34 And two, can I like embed it on my website
0:22:35 or something like that so other people
0:22:37 can have conversations with it outside of ElevenLabs?
0:22:40 Yeah, so it’s working with the context window
0:22:41 of the LLM that you’re using.
0:22:44 So there’s a bit of that that limits that.
0:22:46 And then, yeah, you can totally embed it.
0:22:48 So we let you easily like copy out a widget.
0:22:51 You can customize it in the dashboard.
0:22:53 And then you literally just paste in this one line
0:22:55 and you’ll get your widget everywhere,
0:22:55 which is pretty cool.
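On the knowledge limit: because the knowledge base has to fit in the context window of the chosen LLM, a rough size check can be done before uploading documents. The Python sketch below assumes about four characters per token, which is only a rule of thumb, and the window sizes in the usage lines are made-up numbers rather than limits of any specific model.

```python
def estimate_tokens(text: str, chars_per_token: int = 4) -> int:
    """Very rough token estimate; real tokenizers vary by model and language."""
    return -(-len(text) // chars_per_token)  # ceiling division


def fits_in_context(documents: list[str], context_window: int, reserved: int = 1000) -> bool:
    """True if all documents fit once `reserved` tokens are held back
    for the system prompt, the conversation, and the answer."""
    needed = sum(estimate_tokens(doc) for doc in documents)
    return needed <= context_window - reserved


wiki_pages = ["a" * 8000, "b" * 4000]  # stand-ins for two wiki pages
print(fits_in_context(wiki_pages, context_window=8000))
print(fits_in_context(wiki_pages, context_window=3500))
```

If the check fails, the options are trimming the documents or choosing an LLM with a larger window.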
0:22:57 That’d be super cool to make it like something like a–
0:23:00 instead of having a frequently asked questions page
0:23:01 for a company or something, you’ve got like a little widget.
0:23:04 And it’s like, here’s a little magic genie
0:23:04 that you just talk to or whatever.
0:23:08 Magic assistant that you talk to
0:23:09 and it’ll answer whatever.
0:23:10 Right, that’s so cool.
0:23:11 100%.
0:23:12 And the other cool thing is you can actually
0:23:15 give it success criteria.
0:23:17 And so you’ll know later on in the logs,
0:23:20 like, hey, when someone spoke to this thing,
0:23:22 did they find the thing that they were looking for?
0:23:26 And, you know, defining it is as simple as a prompt.
0:23:28 It’s like, if the question of the user was answered successfully,
0:23:29 then you helped.
0:23:31 You know, if they were like, hey, cool, like,
0:23:33 I got what I needed, that counts as success.
0:23:37 But yeah, I can give you a quick demo of what it sounds like.
0:23:40 The voice I picked out for this assistant
0:23:42 was a more sci-fi sounding robot
0:23:44 because I wanted it to feel like a classic one.
0:23:45 So it sounds like this.
0:23:47 Hello, I’m Ava.
0:23:50 We got the last big robotic voice.
0:23:52 She needs the hologram now.
0:23:53 Yeah, exactly.
0:23:56 I was thinking of Cortana from Halo.
0:23:57 Yeah, that’s the vibe.
0:24:00 But let me talk to it.
0:24:01 Let’s see.
0:24:02 Hi, I’m Elle.
0:24:05 Hey, Elle, how’s it going?
0:24:07 Where is the London office?
0:24:11 The London office is located at Floor 5,
0:24:17 119 Wardour Street, London, W1F 0UW.
0:24:20 If you need any more information about the office
0:24:23 or anything else, feel free to ask.
0:24:23 That’s very cool.
0:24:26 Sweet, yeah.
0:24:30 And I will say, before you comment on the speed,
0:24:31 the speed is the LLM also.
0:24:33 Okay, yeah, yeah.
0:24:36 Yeah, I think, wasn’t Gemini Flash
0:24:38 one of the options? And Gemini Flash is pretty fast.
0:24:40 Yeah, it’s really fast.
0:24:42 That’s the one we kind of recommend because of the speed.
0:24:46 Though I find if you want a good balance of intelligence
0:24:50 plus the speed, then I think 4o is a good middle one.
0:24:51 Sweet.
0:24:52 No, that’s really cool.
0:24:53 I mean, hearing that voice,
0:24:56 having a conversation with that voice was pretty surreal though.
0:24:59 Yeah, yeah, yeah.
0:25:00 It’s really fun.
0:25:01 You can just kind of make your own characters
0:25:03 and embed them in different places.
0:25:07 And then yeah, the last thing I’ll show you guys is projects.
0:25:11 This is what we use for a lot of the folks who are publishing
0:25:14 and turning their books into audiobooks
0:25:16 or people who actually want their podcast transcript
0:25:19 maybe also regenerated and want to try something else.
0:25:22 So projects essentially lets you kind of go line by line
0:25:26 and even choose different voices for different segments.
0:25:28 So it’s a really powerful editor for that long form stuff
0:25:31 that people might want to do.
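The line-by-line editing with different voices per segment can be modeled as a list of text segments, each tagged with a voice. The Python sketch below is only a way to picture that structure; `Segment`, `assign_voices`, and the voice names are hypothetical and do not describe how projects stores its data.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    text: str
    voice: str


def assign_voices(lines: list[str], default_voice: str, overrides: dict[int, str]) -> list[Segment]:
    """Pair each line with a voice; `overrides` maps a line index to a voice."""
    return [
        Segment(text=line, voice=overrides.get(index, default_voice))
        for index, line in enumerate(lines)
    ]


chapter = [
    "It was a quiet night.",
    '"Who is there?" she asked.',
    "Nobody answered.",
]
segments = assign_voices(chapter, default_voice="Narrator", overrides={1: "Character A"})
for segment in segments:
    print(f"{segment.voice}: {segment.text}")
```

An author narrating their own audiobook would set the default to their cloned voice and override only the character lines.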
0:25:32 And you can do that in your own voice, right?
0:25:34 So you could train it on your own voice.
0:25:36 Because I guess one big gripe I have about audiobooks
0:25:38 is like when you’re the original author,
0:25:38 it’s so much better.
0:25:40 But when it’s somebody else, it’s like, whatever.
0:25:45 A lot of people are starting to use ElevenLabs for audiobooks,
0:25:45 it seems like.
0:25:50 I think like in fact, I think like Amazon might even be letting
0:25:54 people use ElevenLabs for like Audible books now.
0:25:55 I’m pretty sure.
0:25:58 So it’s like you are starting to see more and more authors
0:26:01 just use tools like this and plug in their entire book.
0:26:02 Yeah, exactly.
0:26:05 And that’s also a big part of why we were excited
0:26:06 about doing the mobile app.
0:26:10 It’s like we’re giving all these indie authors
0:26:12 a way to self publish their content.
0:26:14 And now they can use our whole suite.
0:26:15 So it’s like they use projects.
0:26:17 They publish directly to our mobile app.
0:26:19 And then they reach a whole new audience.
0:26:23 And so it’s kind of fun to see the tools finally coming
0:26:26 together and connecting across different mediums as well,
0:26:28 which has been a fun evolution.
0:26:30 Yeah, I do wonder long term how that’s going to work out.
0:26:33 It feels like all these are going to kind of start to combine.
0:26:36 Are you guys competing with Suno?
0:26:38 Or like there’s Runway in AI video,
0:26:40 but now they’ve got something with stills
0:26:41 where they’re showing images, right?
0:26:43 Like competing with Midjourney.
0:26:44 And Midjourney is working on AI video.
0:26:47 And at some point, they’re all going to want audio as well.
0:26:47 Right?
0:26:51 And so it’s like, how does this all play out long term?
0:26:51 It’s so true.
0:26:55 I think we’re seeing the convergence of all the different,
0:26:58 you know, even Luma’s recent thing,
0:27:00 where it’s like now it’s a canvas for creativity
0:27:02 and it has video and photos and stuff.
0:27:03 I think the consumer wins.
0:27:04 That’s for sure.
0:27:06 We’re having a great time with all these tools.
0:27:07 For sure.
0:27:08 Let’s talk about the app a little bit too,
0:27:12 because I know the app’s got the reader in it,
0:27:13 which we were talking about a little bit,
0:27:16 that’ll read PDFs to you or read articles
0:27:17 or things like that to you.
0:27:19 You can have Jerry Garcia read it to you.
0:27:22 Or was it Burt Reynolds? I think it was one of them, right?
0:27:24 Yeah, that was my favorite.
0:27:27 So you’ve got like all of these options
0:27:28 for who can read it to you.
0:27:32 And then now the newest feature is GenFM,
0:27:35 where, you know, similar to what they did with NotebookLM,
0:27:37 it’ll do podcasts, but you know,
0:27:38 it’s got like new voices.
0:27:42 And can you use the new GenFM
0:27:45 with like different voices or is it set voices right now?
0:27:49 Yeah, so right now we’ve kind of curated pairs.
0:27:52 And so we’ve kind of got some voices
0:27:53 that are really great for tech content.
0:27:55 Others are great for politics or studying or whatever.
0:27:57 And based off that content,
0:27:59 it’ll pick out your co-hosts.
0:28:02 But we will add more customization.
0:28:03 That’s a given.
0:28:05 That’s kind of what we’re really excited about.
0:28:07 We even played around with internally,
0:28:10 like what does it look like when, you know,
0:28:12 Deepak Chopra is trying to understand the recipe
0:28:13 for chicken marsala.
0:28:17 It was amazing.
0:28:21 Or do like a podcast of Deepak Chopra
0:28:24 trying to understand like Gen Alpha slang or something?
0:28:26 Sure, you know, exactly.
0:28:27 I want to hear that.
0:28:31 So yeah, we’re loving the direction it can take
0:28:32 with all the voices, for sure.
0:28:33 Very cool.
0:28:34 All right, cool.
0:28:37 So the mobile app, yep.
0:28:38 As you can see here on the homepage,
0:28:40 we’ve recently got GenFM,
0:28:42 and you’ll see it on your homepage right here
0:28:45 with the two co-hosts kind of flying around your screen.
0:28:49 But you go into it, you import an article.
0:28:52 So I’m just pasting in an article that I want to hear.
0:28:55 And I’m just going to hit generate.
0:29:02 And so what was fun about this
0:29:08 is we took this state that’s kind of boring, right?
0:29:10 Like you’re waiting for your podcast to load
0:29:13 and we turned it into a fun, interactive moment.
0:29:15 It’s like your co-hosts are showing up.
0:29:18 We’re getting them ready in the room.
0:29:22 You’re not just sitting there, yeah.
0:29:24 Yeah, yeah, it’s not just sitting there.
0:29:26 And then when your podcast is ready,
0:29:28 it kind of sounds something like this.
0:29:30 So let’s hear them.
0:29:32 Let’s hear them.
0:29:34 Counter-intuitive.
0:29:35 Underreacting.
0:29:37 A superpower for the modern age?
0:29:39 Well, not quite.
0:29:41 Today we’re discussing how learning to underreact
0:29:44 may actually be the key to making a real difference
0:29:46 in our increasingly chaotic world.
0:29:49 That’s an intriguing concept.
0:29:52 In a world that often feels like it’s spinning out of control,
0:29:55 the idea of underreacting might seem counter-intuitive.
0:29:57 Can you elaborate on what you mean by that?
0:29:59 Of course.
0:30:02 You see, I used to be the kind of person who would rush in.
0:30:05 And so you can see, you get a little bit of a discussion.
0:30:07 And you know, it’s been so fun
0:30:09 because I’ve tried it out on all sorts of content.
0:30:15 For instance, my dad had written this essay about parenthood
0:30:19 and I then played it back to him,
0:30:22 but with two AI co-hosts discussing his essay.
0:30:24 And it was like a completely different perspective.
0:30:27 And he was like, wow, it feels so…
0:30:28 Wow, this is genius.
0:30:34 He either felt like a genius or like he was being judged.
0:30:37 But it’s amazing, right?
0:30:41 It’s such a fun way to hear an entirely new perspective
0:30:42 on a piece of content.
0:30:46 And it can also combine a whole bunch of pieces of content.
0:30:49 So if I wanted to do a morning news brief,
0:30:53 I can go to all of the various websites that I get my AI news from,
0:30:54 toss them all into the Reader app,
0:30:57 and it’ll make a podcast that rounds up
0:30:59 all the news for that day for me, right?
0:31:00 Yeah, yeah, exactly.
0:31:01 You can get all sorts of…
0:31:03 You just give it a bunch of content.
0:31:07 We will then take out insights and other perspectives
0:31:08 you might not have gleaned from it.
0:31:12 And then to the extent that we can with the LLM’s knowledge,
0:31:15 of course, like how does it tie to other events
0:31:16 you might not have known about?
0:31:19 And then that brings out a cool insight as well.
0:31:21 So yeah, it’s been fun.
0:31:24 And I’ll tell you, the team really scrambled for this one.
0:31:26 We thought about the idea and we were like,
0:31:28 “Oh, we gotta make it happen.”
0:31:32 And then we just started going at it a week and a weekend,
0:31:33 and then it was real.
0:31:37 And then we just kept fine-tuning it till we got to this point.
0:31:41 Awesome. Yeah, it’s a super cool app.
0:31:42 I have one question about it.
0:31:42 Can you actually…
0:31:45 I actually have not played with GenFM.
0:31:47 I think the day we’re recording this is either
0:31:49 the day it came out or the day after it came out.
0:31:52 So it’s really fresh as of the day that we’re recording this.
0:31:55 But can you actually download the podcast episode
0:31:57 when you’re done, like get like an MP3 version of it
0:31:58 or something like that?
0:32:01 Yeah, so we don’t have full episode downloads yet,
0:32:03 but you can share them with your friends
0:32:04 and that will send them…
0:32:07 I think about, well, we love the number 11,
0:32:10 so it’s a one-minute, 11-second clip that you can send.
0:32:14 Obviously, it’s similar to NotebookLM, right?
0:32:16 So I’m curious, like right now,
0:32:17 what’s the main difference?
0:32:19 Is it like you guys have a lot more choices of voices,
0:32:20 I assume? Is that one…
0:32:25 Yeah, I think there are a few things that we think
0:32:27 make GenFM stand out.
0:32:29 And I think one is the voice library for sure.
0:32:31 It’s like really realistic voices
0:32:34 and you have a whole range of potential co-hosts.
0:32:36 The other thing is the languages piece.
0:32:36 You know, that’s…
0:32:40 We’re still staying true to that, which is 32 languages.
0:32:42 So anyone can have all sorts of different podcasts
0:32:44 in different languages, which we’re excited about.
0:32:46 And then I think a mobile-first experience,
0:32:49 because this is something where I don’t necessarily
0:32:51 want to pull up like a notebook style page
0:32:53 and like listen to a podcast there.
0:32:56 I want to hear it on my phone while I’m on the commute
0:32:56 or something like that.
0:33:00 And we think this is a more natural way to do it.
0:33:02 Yeah, I mean, we recently did a tier list for AI.
0:33:04 And we did put ElevenLabs in A,
0:33:07 and we did put NotebookLM after debating it in B.
0:33:10 And the big reason was, or was it B or C?
0:33:10 Maybe I think it was B.
0:33:14 And the reason was, you know, I’m highly skeptical
0:33:16 that Google can actually turn that into a product
0:33:18 that people actually use.
0:33:20 And I think ElevenLabs, honestly, has a better chance,
0:33:22 you know, with the whole suite of things
0:33:23 that you guys have people using,
0:33:24 it seems more natural that, yeah,
0:33:26 it goes into 33 different languages.
0:33:27 That’s great.
0:33:29 You turn a podcast into 33 different languages.
0:33:32 That’s an actual use case that people could do today
0:33:35 and make money from that, versus with the Google product,
0:33:37 with NotebookLM, I’m not sure right now
0:33:39 what you would actually use it for in terms of business.
0:33:41 Yeah, yeah, I think, to be clear,
0:33:43 I see the studying use case.
0:33:44 I think it’s great.
0:33:48 And I think it’s a fun way to explore the content.
0:33:51 And also, kudos to them for really showing people
0:33:53 this was a fun new way to look at it.
0:33:58 But yeah, we’re absolutely excited to make something great.
0:34:01 Well, I don’t know if you can answer my next question or not.
0:34:02 You may not be allowed to,
0:34:05 but is there anything exciting that’s upcoming
0:34:08 for ElevenLabs that maybe you could tease for us?
0:34:09 Or that you can break on the show.
0:34:15 Well, I think we teased Music way early on.
0:34:19 And we’re excited that that model is really coming together
0:34:20 and sounding great.
0:34:23 So I think that’ll be one of the next things
0:34:24 you guys will see very soon.
0:34:26 Interesting.
0:34:30 And then, of course, the other thing we’re always thinking about
0:34:35 is how do all these different parts of audio come together,
0:34:38 music, sound, speech, and so on.
0:34:40 So things are being explored.
0:34:42 You know, I’ll wait for the right moment.
0:34:44 I think the right teammates,
0:34:47 the research team who’s really cranking and doing all of this
0:34:48 and deserve all the credit, honestly.
0:34:53 Yeah, I’m waiting for them to feel it’s right to
0:34:54 reveal to the world what they’ve been cooking.
0:34:55 And it’s amazing.
0:34:56 Awesome.
0:34:58 Now, this has been really cool.
0:34:59 I appreciate you jumping in.
0:35:02 I know you’ve got a really great X account.
0:35:02 I follow you on X.
0:35:04 You share a lot of really cool stuff about ElevenLabs,
0:35:07 but also not always just ElevenLabs, right?
0:35:09 It’s not a pure ElevenLabs pitch account,
0:35:11 but you share some really cool stuff,
0:35:13 a lot of stuff you’re experimenting with.
0:35:14 You build a lot of apps.
0:35:16 You’ve been doing a lot of AI coding stuff.
0:35:18 In fact, we’re talking about doing a follow-up episode
0:35:20 where we break down some of the cool apps
0:35:22 that you’ve been working on and how you’ve built them.
0:35:25 So definitely everybody needs to follow Ammaar over on X.
0:35:28 Do you want to go ahead and shout out your X account?
0:35:31 I always feel awkward when I’m shouting out my own account,
0:35:34 but yeah, I’m @ammaar on X.
0:35:37 Yeah, A-M-M-A-A-R.
0:35:39 Because I know how badly the press has spelled my name.
0:35:43 The second A keeps throwing me off.
0:35:46 I’m like, “Get the real thing, get the real thing.”
0:35:49 I think to what you said,
0:35:54 as someone who doesn’t really code and finds it hard to code
0:35:56 and to be able to just talk to AI now
0:35:58 and to build so many different apps,
0:36:01 it’s like, this is the time I keep telling people
0:36:03 where you can be the idea person who ships.
0:36:04 I appreciate it.
0:36:05 Well, thanks so much, Ammaar.
0:36:07 This has been an awesome conversation.
0:36:09 And yeah, thanks again for hanging out with us.
0:36:12 And anybody tuning in,
0:36:14 make sure that you subscribe to this podcast
0:36:15 wherever you listen to podcasts.
0:36:17 If you’re watching on YouTube, subscribe to us there.
0:36:19 If you’re on Spotify, Apple Podcasts,
0:36:22 wherever you listen to podcasts, you can find this show.
0:36:23 Please subscribe to us.
0:36:24 Really appreciate you.
0:36:25 We’ll see you in the next one.
0:36:25 Bye-bye.
0:36:28 (upbeat music)

Episode 38: How revolutionary is the latest in AI voice technology? Matt Wolfe (https://x.com/mreflow) and Nathan Lands (https://x.com/NathanLands) dive deep into this topic with Ammaar Reshi (https://x.com/ammaar), head of design at ElevenLabs and AI enthusiast who has made waves with his innovative AI projects.

In this episode, Ammaar takes us through the cutting-edge features of ElevenLabs, a platform revolutionizing content creation with AI-driven voice technology. From monetizing pre-recorded voices to producing multilingual content, and even generating music, explore how ElevenLabs is transforming how we create and consume audio content. They also delve into Ammaar’s background, discussing his transition from viral AI art to leading design at ElevenLabs, and the exciting developments on the horizon for AI in audio.

Check out The Next Wave YouTube Channel if you want to see Matt and Nathan on screen: https://lnk.to/thenextwavepd

Show Notes:

  • (00:00) Discussing AI business tool with ElevenLabs.
  • (05:28) Co-founders initiated dubbing innovation for accessibility.
  • (07:52) Exploring ElevenLabs features, including iPhone app.
  • (10:47) Stability affects voice similarity and style.
  • (13:49) Browse library of diverse platform voice actors.
  • (17:37) Using ElevenLabs for quick sound effects.
  • (20:21) Anyone can build simple conversational AI agents.
  • (25:20) Mobile app empowers indie authors for self-publishing.
  • (31:40) GenFM: Realistic voices, 32 languages, mobile experience.
  • (34:25) Follow Ammaar on X for AI app insights.

Mentions:

Get the guide to build your own Custom GPT: https://clickhubspot.com/tnw

Check Out Matt’s Stuff:

• Future Tools – https://futuretools.beehiiv.com/

• Blog – https://www.mattwolfe.com/

• YouTube- https://www.youtube.com/@mreflow

Check Out Nathan’s Stuff:

The Next Wave is a HubSpot Original Podcast // Brought to you by The HubSpot Podcast Network // Production by Darren Clarke // Editing by Ezra Bakker Trupiano
