AI transcript
(upbeat music)
– We’ve existed for about three years
and we’ve passed everybody in revenue
in like literally a year and a half.
Usage is important, but that does not define
the long-term success of an actual customer.
– I think that daily active use
is a pretty terrible metric to uncover customer value.
– There have been companies built in the past
on just great design.
There’s no reason that they can’t be built on the AI side.
– In upgrading all of these multiple layers,
they’ll essentially end up building your core
defensibility in the market.
– Retention problems are just activation problems
in disguise.
– Between June 3rd and June 9th,
A16Z ran its second annual New York Tech Week.
Now this week had thousands of people attend
a record-breaking 700 plus events,
including one event run by our podcast team.
Now this A16Z live recording
is exactly what you’re about to hear,
but first let’s take a quick trip down memory lane.
When ChatGPT was launched in November, 2022,
it quickly became the fastest growing consumer application
in history, but text-based AI was just the beginning.
In the next 500 days,
a flurry of AI models launched that spanned new modalities,
from images to video to audio to 3D,
that all yielded an entire ecosystem of applications
that have upended, quite frankly,
the way we work, learn, create, and even play.
Now here in mid-2024, competition is fierce,
but I don’t think I have to convince you of that.
So for this live recording,
we brought in key leaders at three AI companies
to discuss how they’ve managed to stand out
amongst the noise,
because they have products that reach millions of users.
So in this conversation,
you’ll hear from Gaurav Misra,
co-founder and CEO of Captions,
Carles Reina,
chief revenue officer of ElevenLabs,
and Laura Burkhauser, VP of product at Descript.
Together, we explore what ladders up to AI products
that people actually use,
including what features really matter,
when AI is necessary or distracting,
whether you need to own your models,
designing for retention, international expansion,
and of course, where we all go from here.
I hope you enjoy this recording as much as I did.
As a reminder, the content here
is for informational purposes only,
should not be taken as legal, business, tax,
or investment advice,
or be used to evaluate any investment or security,
and is not directed at any investors
or potential investors in any A16Z fund.
Please note that A16Z and its affiliates
may also maintain investments
in the companies discussed in this podcast.
For more details,
including a link to our investments,
please see a16z.com/disclosures.
And so we’re actually less than two years since that,
but a lot of people are familiar with text-to-text,
but all three of the products here
go into several other modalities, right?
We’ve got audio, we’ve got video, imagery.
So I think that’s really exciting,
but maybe we could actually just start with the why now,
and specifically maybe the unlock that we’ve seen
with unstructured data, right?
Before, we used databases and everything needed
to be really structured
in order for us to make sense of it.
Today, that’s not quite the case.
So Gaurav, maybe we start with you,
and what do you see really today as the why now?
– Yeah, I mean, I think it’s a really exciting time,
generally, just because obviously there’s been
a couple of key breakthroughs,
just in terms of technology with transformers
and diffusion models and so on and so forth.
But I think the key here is we’re able to use a lot more data
to train these models now than ever before, right?
And there’s a bunch of things happening,
both on the hardware side, the software side, right?
And the data side to enable that to happen.
And that’s why we’re seeing amazing results, right?
If you look at a lot of what the key players
in this industry are doing,
they’re just training these models
with more and more and more data every iteration, right?
And that’s able to produce reliably better
and better and better results,
which is pretty amazing to see.
And there’s no end in sight so far.
– Carles, maybe we’ll go to you
before we talk about Descript in a second.
– I think it’s correct.
The key message for us in experimentation
at ElevenLabs has been like,
if you put garbage in, garbage out, right?
If the quality of the data that you put in
is not that great,
then essentially what you end up producing
is half-baked with lots of mistakes and things like that, right?
And we can see that with Whisper,
how many of you have tried Whisper
and it comes out that like,
subscribe, subscribe, subscribe,
and things like that, right?
All the time.
That’s true, we’ve seen it all the time.
But so I think like for us,
like there’s been a layer and initially we trained it
with a lot of data and then over time,
we ended up curating the data
to make sure that like it is very high quality.
Otherwise you’re not able to achieve the results
that you are expecting or that your consumers
or your businesses would need, right?
But that’s a fundamental change
that has happened in the market.
Amounts of data being used with transformers
and LLMs to generate this, like, human-like content,
like whether that’s speech or text or anything else, right?
– Yeah, 3D models, we’re seeing all types of stuff.
So the reason I wanted to wait to talk to you, Laura,
is because I don’t know how many of you have used Descript,
but any guesses on when Descript started?
We talked about ChatGPT, November 2022.
So Descript has been around since 2017.
The reason I wanted to frame that
is because obviously the last couple of years,
very exciting, but machine learning, AI,
in the ’50s is when this really got going.
And obviously there have been unlocks,
but I want to get your pulse, Laura,
on the importance of putting AI at the forefront.
A lot of AI is embedded in the applications
that probably people in the room are building as well,
but Descript long used machine learning
before really saying, “Hey,
you’re using machine learning, AI,” et cetera.
So what are your thoughts?
– That’s right.
So Descript is software that lets you edit video
just like a text document.
So if you can edit a Google document, congratulations.
You’re also a video editor.
You can just download Descript,
and now you can edit video.
And it turns out that the technology
that sort of undergirds that is in fact AI,
but we haven’t traditionally come forward
and said, “We’re an AI video editor.”
A, there wasn’t like this huge reward
in the hype cycle for saying that.
So we didn’t have marketers saying it,
but also what we found is that customers didn’t care, right?
They don’t care what is the technology
that is creating this value for me.
What they care about is there is value here.
This is helpful for me.
And so that was long our way of designing software,
and it probably would have continued that way forever,
except that actually when I think about the thing
that is making us change our minds,
in addition to some of these cool models that are coming out,
it is that the way that humans and computers
are interacting is totally different.
So you can talk to your computer now.
You can use human language to communicate
more subtle intentionalities that you have
for how you wanna edit your video or create your video.
So as this technology has gotten better, we thought,
well, gosh, do we actually wanna design AI
in the product differently?
And if so, how?
And so with our latest release,
we’re actually bringing all of the AI features
that we’ve long had in the product into the same space
and adding a ton of new ones.
And we had a big discussion with our design team
about how do we do this?
And one of the big discussions we had is,
is AI a magic wand or is it an entity?
And one of the big decisions you have to make there
is that traditional creators are much more used
to interacting with Pro Tools software
or creative software in a point and click way.
And so they want a magic wand.
But you have this whole new wave of people
that are now generating and editing video and audio
and they’re used to using kind of more
of this entity interaction.
They want an entity.
Then you start talking about an entity, right?
And you get into internal discussions like,
I don’t know if it’s an entity that might be a bad idea
because what about our robot overlords,
our inevitable robot overlords, right?
That’s kind of like one side of the debate.
And then hilariously, you have the other side
of the debate that I don’t want an entity
because actually it turns out this technology
is really stupid sometimes.
And if you make it an entity, you said like,
hey, welcome, this is like your co-editor.
And it turns out your co-editor is like a total moron
that makes horrible suggestions sometimes
because it’s hallucinating.
And so we’re like, okay, how do we deal with that?
So what we decided to do with this newest release
is we’re actually, we’re calling it Underlord.
And it’s a nod to the potentially apocalyptic future of AI,
while also admitting that right now,
this thing is kind of like a very eager,
like somewhat competent intern
that does a really great job at the first pass
of the worst parts of your workflow.
So that’s some of the story
about how we’ve thought about designing with AI over the years.
– I’d love to get both of your pulses.
Like, how do you think about that same question?
What part of AI do I put at the forefront?
Or do I just use this really powerful technology
and kind of give my users what they want
but not really sell this AI thing too much?
– So I’d say, at the end of the day,
you have to solve customer problems.
That’s what we’re trying to do, right?
I think the biggest mistake that can be made is to say,
hey, here’s the technology.
You can have technology, do whatever you want.
People can’t just take that and be like, okay,
I know what to do with this, right?
I think you have to mold it into a product
that solves a problem at the end of the day.
So I think that’s like traditional.
Nothing’s changed there, right?
It’s exactly the same as before.
And if you’re not doing that,
then essentially you’re gonna see retention problems.
Where you’re gonna see people coming in,
trying out the thing,
not knowing exactly what to do with it,
not working perfectly for their use case.
And then they’ll leave, right?
Kind of tourism is what we’re calling it, right?
But I think at the same time on the marketing side,
like stepping away from product for a second,
there is something to be said about sort of having AI
in your message on the marketing side.
Here’s why.
If I just say, I have a better product,
it’s so much better, you won’t believe it.
I’ll be saying the same thing
that people have been saying for literally 100 years
about every product, right?
Like, yeah, trust me, it’s better, right?
Trust me, come on and try it out.
This is every single product that exists, right?
But putting in that AI term in there,
just from the marketing side, this is just tactical,
actually lets people understand,
oh wait, this is gonna be a step change, right?
Of course, if you don’t meet that expectation
when they land in the product, you’re gonna have a problem.
But if you’re able to meet that expectation,
putting that in kind of does inform people about like,
okay, this is not gonna be sort of like the better product,
it’s gonna be a step change
compared to everything else we’ve seen.
So that’s the general guide.
I do feel like a lot of people
are just throwing in the AI term in the marketing side now
just to kind of get the eyeballs there.
And maybe that message will kind of get lost a little bit.
But so far, the innovation has just been so strong
that the message has kind of remained strong.
And if it continues this way,
the marketing side can continue as well.
But at some point, it might get muddled, we’ll see.
– Maybe just I can add on a modifier for you
because I think not only do you have to market the product,
but if you use this bucket term of AI, right?
That means many different things.
Do you own your models, build your own models?
Are you an API wrapper?
And so I would love to hear from you, Carles,
at ElevenLabs in particular,
like in building your own models as well,
like how does that play into it?
Is it a whole marketing packaging
thinking about what you share and what you don’t?
– Yeah, we need to be open.
Like we are an AI company, sorry guys.
And we say it all the time, like we say,
like we do AI voices, we do AI sound effects,
we’re gonna be doing AI music in many ways.
So for us, like it’s all about the audio sphere, right?
So it’s like that layer infrastructure
that allows you to create high quality engaging content,
whether that is like with voice, with like audio overall.
And the way we thought is, well, actually,
there wasn’t really a good quality text-to-speech available
before we invented our own site.
So we were fundraising initially.
It was difficult because the market is not there,
like how are you gonna be getting customers and so on.
So it was like, it was really tough in the early days.
But we thought, look, if you’re able to deliver quality
that voices that sound engaging,
the applications on top of it,
then you end up having market that is just fully on top, right?
So how do you do that? AI voices, simple and plain, right?
And that worked really well.
So we started with like the LLM pure like API play
with a very simple UI that was end of January last year
when we launched the product.
And we thought, well, actually,
there’s gonna be like some pieces of like some content creators
that might want to use the UI,
but we expect on the API side, it’s gonna be quite big purely
because like people might want to build their own applications
on top of it.
And it worked really well.
And since then, what we also realized, like,
well, you cannot expect all of the business
to have the capabilities, build their own application.
So what if we end up going full end to end
and we build our own applications
for areas where we really care about?
And that’s how we end up creating like projects
or audio native or like the dubbing product
and a bunch of other pieces, right?
So it’s been very interesting for us.
And of course, we always say that it’s AI driven
because at the end of the day, we’re a foundational model
that happens to also build applications on top of it.
But I think like the beauty of it
is that anyone can build anything they fancy on top of the API.
And today we power quite a lot of different companies,
more than 41% of Fortune 500 companies use ElevenLabs.
We power a lot of startups
and we are very proud to help all of these companies
like succeed as well, right?
So it’s been very interesting,
like having both sides, both motions,
like the pure API play and the application layer
on top of it, it’s challenging as well.
Because then you end up having two different profiles
in terms of like on the product side,
on the engineering side and everything, right?
So you always need to balance it.
– Absolutely, maybe we can actually jump straight
to that question of competition.
I feel like if there’s one question
that comes up on this podcast the most,
everyone’s excited about AI and they’re like,
okay, well, where does differentiation come up?
Where do moats arise?
I’d love to probe all three of you on that.
I know we’re early,
but where do you think you can stand out?
Do you really need to be building at the model layer?
You talked about the infrastructure layer,
or can you really just build a really great UI
and capture the app layer?
What do you think about that?
– Maybe I’ll start here by saying,
again, not much has changed in terms of like,
there have been companies built in the past
on just great design.
So I think there’s no reason
that they can’t be built on the AI side.
But at this point of the journey,
there’s so much to innovate on and so much to build on.
It does help to have models
that are foundational and built in-house
because it does give you that extra differentiation
and that extra step.
It is a competitive field and the deeper you can go
and the more you can build from the ground up really,
connecting these different layers together, right?
You can deliver super fast speeds on your models.
You can deliver the highest quality that anyone’s seen, right?
And you can deliver a great user experience
that solves a real problem.
Then you have an advantage there.
So I would say though for consumer companies,
which we’re a consumer company, right?
Like we’re used by literally millions and millions
of people around the world
and people make over a hundred thousand videos a day
published through our platform.
For a consumer company,
it does matter a lot to have that differentiation
at this stage.
I think in the longest term,
if you think about what differentiates a consumer company
in the longest of terms, it’s probably just brand, right?
And that’s kind of what you’re building over a period of time.
And the only way a brand dies is like with a generation.
It also takes a generation to build a brand too, right?
So I think that’s kind of the ultimate goal
of where you want to get to.
But I think in the meantime,
there’s many moats that last like different lengths of time,
whether that’s the data moat or a model or like,
whether it’s a UI, UX moat, whatever it might be.
– So at Descript, I would say that we are a horizontal editor
and we’re a very powerful human editor,
which is something that I think a lot of kind of newer
just started in the age of AI,
in the second chapter of AI companies can’t say
because it takes a long time
to build a really powerful,
horizontal human driven editor.
So you can do like really complex editing jobs with Descript.
If you already are like an expert who’s great at this work
and you can do it really quickly with low barriers to entry
if you’re new to it.
So for that reason, I think the application layer
is especially important to us.
And I almost see it as a mirror
to kind of what ElevenLabs was saying,
where I think like in general,
we have a may the best model win sort of mentality
when it comes to all of the different models
that we use in our application layer.
And that’s because we’re trying to do everything,
not just AI voices, but things like eye contact,
things like avatars, things like AI speech, transcription,
editing video with text.
If there’s like a cool thing happening in AI
in video generation, when Sora comes out,
that will be in Descript, we’re gonna have it.
And so I think like generally we have an attitude
that is may the best model win,
we wanna give our customers the absolute best experience.
If we don’t see interesting enough work happening
in a space that we wanna be in, we’ll build that model.
And I think there are real places for Descript to differentiate
because we own so much of the editing workflow
and have really great editing workflow data
that like that may be a place
where our models become differentiated.
But in general, if you’re trying to provide
a ton of different services to customers
across a ton of different workflows,
it can really make sense to not try to build
every single one of those in-house,
but instead to be like very thoughtful
about where it makes sense to own versus buy or borrow.
– I think like there’s an element here on,
if you think about purely about differentiation in these days,
like ’cause the market has bought a lot
from purely foundational picks and shovels.
And now the transition towards the app side,
what you end up thinking about
or how I think about defensibility is purely about your users,
your consumers or your businesses.
That’s essentially what will drive defensibility
over the long term.
And if you think about Instagram or Meta
or like a Facebook in the early days,
what was their defensibility?
There was literally nothing out there,
but they were able to fast grow,
outpace everyone in terms of growth, deliver value.
And then the UI was not even that great, right?
But it was actually like you were feeling
that you were part of the community
and it was like the experience that you were getting, right?
So defensibility was coming from the actual users
versus the product itself.
And I think like the transition that we’ve seen today
from the foundational models to, sort of, like, the app side,
it’s actually very interesting
because then you’re able to engage different type
of generations or different type of users
that like if you retain them
and you give them the best experience possible,
they will stay there for the coming year, right?
Whether that is because they’re building their own applications
on top of that because they’re essentially like,
“Well, I want to use your app overall.”
And the way we also think about this at ElevenLabs
is like layers, right?
So having the foundational layer,
which is like the research that we provide, right?
We do LLMs and essentially we provide the best text to speech
and AI voices in the market, fantastic.
What else do you have on top of it?
The data that we’ve acquired that we’ve licensed from partners,
the products end-to-end products that we’re building,
the partnerships that we have, the customers that we have.
So you end up creating all of these multiple layers
that essentially end up building
your core defensibility in the market
that hopefully will sustain us for the coming years, right?
As the market changes,
if one of the layers like ends up getting replaced,
absolutely fine because then essentially you have
all of the other ones that will back you
over the long term, right?
– Yeah, and something you spoke to here
is just like this new generation.
And I think we’re all kind of trying to figure out
what can now be done with AI
when you talked about UX even or designing a new UI.
Voice is now in the mix in ways that it wasn’t before,
but then you also have this question of,
“Do I want to completely reinvent the wheel?
“Show someone a very powerful UI
“that they’re maybe just not familiar with
“and that you don’t retain them.”
So Gaurav, I’d love to probe you on retention.
I mean, even just from the perspective of desktop
versus mobile, you do have a mobile app.
How do you think about designing for that?
Because we’ve seen over the last, let’s say two years,
there’s this extreme willingness to try,
but then I think someone internally
coined this, like, AI tourist phenomenon, right?
It’s people try and then a lot of them do leave.
So how do you think about that?
– Yeah, I mean, it’s something we think about a lot
because at the end of the day,
I think you can kind of go by metrics
and you can really worry about like,
“Oh, there’s retention number, it should be at that number.”
And you can kind of get caught up in that a little too much
when the reality is like those micro optimizations
are not going to solve whatever retention problem
or any other metric problem that you might have, right?
At the end of the day, it’s about the user experiences.
It’s about solving a real problem.
I think generally, if you want a complete hit end to end,
you need to have a breakthrough technology
that’s applied to solve a very specific problem
that a user actually has, right?
And then you need to have an engine
that can deliver that solution
to people who have that problem
as quickly as possible across the world, right?
If you have all those pieces,
then you won’t have a retention problem
or an acquisition problem
or any other problem basically, right?
Now, the cool thing about this time right now
is the technologies are being developed
and there’s actually a crazy number of technologies out there,
right?
I think it’s a very unique time from that perspective, right?
And for product people, the main problem is,
“Hey, like how do we actually solve problems, right?”
Actually solve real problems that people have, right?
And not just sell the technology as technology, right?
Like, “Hey, we have technology, just that, right?”
But actually convert it into a real value delivery
for users for a specific use case,
even a niche use case, right?
Whatever it might be, right?
And then I think for marketers,
the problem is how do we actually educate people
that there’s a new way to solve these problems, right?
Like people may not think the first thing,
“Oh, you know what?
“I’m gonna Google AI for this, right?”
That might not be the first thing that people think about, right?
They might be searching for just
whatever they were normally doing, right?
Which may be something that takes a long time.
And, or they might be like not aware
that there’s new solutions available
for these problems, right?
So I think that’s sort of the end-to-end.
I think if you focus on that at that level,
like all the other numbers sort of follow on their own,
and that’s kind of what we’ve seen,
both across our desktop app and our mobile apps as well.
And we’re in the consumer space,
so retention is definitely a very hard game to crack
compared to, say, B2B businesses.
But we’ve been able to do it really well.
And like, I think it’s because of that high-level focus
across technology, product, and marketing.
– Yeah, maybe Laura, you used to work at Twitter.
What are you learning in terms of products
that reach so many people?
We’re talking daily active users.
What have you learned from that space
that you can apply to AI
when you are trying to fix this retention problem?
– I will say that I am so glad to be out of the game
of trying to optimize for mDAU,
for monetizable daily active users.
I think that daily active use
is like a pretty terrible metric
to uncover customer value, right?
And so one of the things that I just love most
about working at Descript
is being able to identify alternative metrics
to think about how to do right by the customer.
Two that I really like to think about
that are a bit in tension with each other,
they act as guardrails:
time to expression and editing richness.
So I think if Descript is doing its job really well,
the amount of time it takes you from starting a project
to getting it into a shareable state,
whether you’re a marketer
who is like trying to repurpose a webinar into clips
or someone who is more of a creator,
trying to make your latest YouTube review
or you’re someone in learning and development,
trying to create a training.
I want the amount of time it takes you to create that
to go down and down.
And so you’re able to just create more and more of the content.
Is anyone here a creator in any way
have a YouTube channel or a marketer?
Do you know about just like the gaping maw
that can never be fully fed or sated for content
that I find so many of our customers
are just staring into with despair?
And so getting kind of their time to expression down
is really important.
But one of the ways you do that is just like
by creating worse and worse content
that’s just A-roll with an iPhone
and you slap some captions on it,
which is great for some use cases,
but for others just like a missed opportunity,
like you could have done so much more
to create really high quality video content.
And so if Descript is also winning on increasing
the editing richness,
the number of jobs that you’re able to do with us
and the number of things you’re able to do
to transform your media and make it really high quality,
the interaction of those two metrics
is such a great way to drive towards customer value.
I will say that like what Gaurav said
around just like good product fundamentals
with retention totally resonates with me.
My attitude towards the tourists
is you’ve got to triage the tourists.
Some component of them just don’t have a use case
for your software.
They want to create a voice clone.
They want to see it.
They’re like, oh, that looks cool,
but they don’t have anything to do with that voice clone.
And it’s like, great, let’s let them do that.
That’s awesome.
Maybe one day you’ll think about Descript
or ElevenLabs and come back.
But then who are these tourists
who actually have a legitimate use case
and they just don’t know it yet.
They could be using video
to communicate within their company.
They could be using text-based video editing
to create all of their marketing clips
and they don’t know that yet.
And how can I create software that activates really well,
that displays all of our use cases
and lets them have a good first time?
And I find that like often retention problems
are just activation problems in disguise, in a trench coat.
And so what I really try to focus on
to improve retention is just like the activation experience.
– Just having come from a social media background
as well at Snap, such a good point about just DAU
and like how that can be such a trap.
– I think social media companies obviously optimized DAU
for a reason because money’s coming from a different source.
And so actually it’s good to be out of that game.
And really interestingly with the generative AI space,
it seems like it’s kind of having the opposite effect
on what it’s trying to achieve.
Like social media on one end is using AI as well,
but really to consume time from people as much as possible.
Consume as much of your time and it’s succeeding.
And on the other hand, generative AI is actually kind of
giving back time to people so they can actually do more.
So pretty cool.
– Yeah, we talked about this on a recent episode,
how some tools, I’m sure people would resonate with this.
If you had one excellent session,
it could have saved you four hours of work in five minutes.
That’s actually more valuable
than spending 20 minutes every day in an app.
And you don’t see that in the same metrics, right?
So I love that you brought up different metrics
that you’re paying attention to, Laura.
Carles, is there anything that jumps to mind there for you
in terms of how you might rethink a business model
in terms of what metrics you’re paying attention to,
or the way that you’re monetizing a product
that might be different because the willingness to pay
we’ve also seen is there, even if it is just,
I’m using this once a month, once every two months even.
– Yeah, and I think it’s a really good point, right?
Some consumers actually feel that if they need to do something
twice, the product is not working well, right?
It’s that element that we’ve gone from one side
to the other side.
So probably like somewhere in the middle
is what fits well.
I was actually like in a meeting with a customer
and we presented at C-level last week.
And the question they came back with was like,
okay, so how much time am I gonna save?
And I was like, well, you’re gonna save anywhere
between 50 to 60 times the time.
Like it’s gonna be like 50 to 60,
like it’s slashed by 50 to 60.
And they were like, no, that’s not possible.
And I was like, let’s do the math right now.
And we did the math.
And it was very interesting.
So I think there is an emphasis on that side.
But I think like sometimes we try to overemphasize
the effects of like the efficiency
that you’re getting with generative AI
when in fact, generative AI is not perfect, right?
I think like that’s one of the main reasons
why the AI tourists are there and they’re very big.
It’s because everyone comes with like such a big expectation
that it’s gonna be solving all of my problems
and it’s gonna be cooking dinner for me tonight as well.
And unfortunately it’s not gonna cook dinner for you.
It’s just never gonna solve all of your problems.
But it’s gonna help you quite a lot
either because you can do a lot of more modernization
with your customers,
with like you can reach new markets
or you can actually do it much quicker, right?
But I think like framing it on actually
what is valuable for you as a business
or as an individual is much more important.
So like initially our metrics were like beautiful,
like usage, right?
And over the past month,
we’ve ended up like switching to like usage is important,
but that does not define like the long-term success
of an actual customer for us, right?
Our, like, yeah, activation side
is about actually what’s the use case that you have
and how do we measure that over the long term
and how do we understand, try to infer the use case
based on the way you’re using the product, right?
So that we can offer you the best tools
and the best tips and all that stuff.
For us, essentially, those are the key metrics
versus just, like, usage.
Usage is still super important,
but I don’t really mind if someone uses the product today
and then doesn’t do it for like a week or two weeks
because I know that like if we’ve nailed it,
they’re gonna come back two weeks later, right?
I think that’s how we are thinking about it.
– You don’t have those social notifications
that are like a friend of a friend
maybe posted something, please come to our app.
All right, well, so we’re gonna open up
to questions very soon.
So if you have any questions start thinking about them,
but I wanna do rapid fire one or two more.
So the importance of optimizing an application
for a specific role or someone’s use case,
who are you, what are you trying to do?
So each of you actually comes from different backgrounds,
right?
So Gaurav, you’ve done design and development,
you’ve been an engineer,
Laura, you’ve been immersed in product,
Carles, operations.
And so those are roles where there’s a gosh,
I don’t know how many other people who fit that subset.
So I’d just love to hear your perspective,
independent of your company,
how do you think of AI over, let’s say, the next five years?
What does an AI-powered engineer look like
in your case, Gaurav,
or like an AI-powered operations person?
What do you need, what’s missing?
Are there products out there
that actually fit that use case and are doing it well?
– Yeah, I mean, thinking about it
from an engineering perspective
or even from a design perspective,
I think maybe the closest on the engineering side
would be like a tech lead manager,
someone who’s actually setting up the overall architecture
of whatever’s being built, right?
But a lot of the work’s been done by AI
and they’re coming in, they’re making edits.
They’re like, maybe we need to change this,
reviewing stuff, right?
Same on design, right?
Like kind of giving high level instructions
and like, let’s have this,
let’s maybe use this style over here,
let’s change these components, right?
And getting that output back
and kind of reviewing it, leaving comments
the same way that a manager might, right?
And being able to produce hopefully a lot more value
and output.
So that means that companies can be getting
to much larger revenue scales with way fewer people,
which is gonna be interesting.
– Yeah, I think a lot about this.
What is the AI product manager?
The paradigm that I use is more like,
how do I wanna interact with AI to do my job better?
One of the use cases I’m excited about
is a rubber duck who talks back.
You guys hear about like rubber ducking
where you keep a rubber duck on your desk
and you talk through difficult problems with that rubber duck.
And I think like, I’m never going to cede control
of the creativity and the genius to like the entity.
Like clearly, have you met me?
I’m in charge of that.
But I think like it can be fun to toss the ball around
with someone and I think I’m excited to see
how AI continues to develop to be like a fun thing
to toss the ball around and then can take all of the stuff
that you’re just like spewing out all of the kind of word
garbage and turn it into something crisp and readable
and easy to understand.
So that’s a use case that I’m excited about.
– I think from an operations side,
it’s like even more complex, right?
Because like there’s so many like things that you need to do.
Like how do you automate
or how do you get someone to help you on that front, right?
So ideally you end up having a product
that helps you do twice as much in the same amount of time.
Not because I’m thinking about it
from an efficiency perspective,
but much more of how I can potentially generate
more revenue for the business, right?
I think that’s where potentially
and hopefully like the market is going to be going.
Like, on the sales side, it’s much easier
because you end up having AI SDRs these days.
We’ll end up having AI CSMs and all of those pieces.
Like that can be already there in many cases, right?
But purely on the operations side,
it’s a lot more complex.
ChatGPT is your friend for sure, right?
Or Anthropic, if you use it, or like any of those tools
that will help you generate quite a lot of different things
on a day-to-day basis.
Is that giving you a 2X?
Not yet, right?
So I’m not sure, like I still haven’t found the right product
that like would help anyone optimize
and become like 2X themselves.
– Maybe someone will build it in the room.
I guess final question.
Anyone, feel free to jump in.
All three of your products have a lot of customers.
People are using it.
Seems like maybe for the retention problem,
what challenges are you facing?
Whether it’s like regulation or not having the right models
or hoping that the open source models catch up
or just curious if anything jumps out
where just calling out a challenge
that you’d like to be solved in the next few years.
– Yeah, I’d say for us, it’s hiring actually.
It’s very traditional, right?
But I think hiring the right people
to solve the particular problems
that we’re having in our company.
And problems grow really quickly,
or the company’s growing really quickly, right?
And you have to kind of keep an eye on
all the different things that are happening,
where new needs might come up,
especially with a company like ours,
where we’ve existed for about three years
and there’s video companies that have been around
for a long time.
We’ve passed everybody in revenue
in like literally a year and a half.
And with growth at that scale,
you just have to constantly be thinking about
what are the new problems that are coming up
and who can we hire to solve those problems, right?
So I think that’s like a very traditional answer.
And maybe there’s some AI recruiters out there,
but we have a great team.
So I don’t think we need them, at least not yet.
– Maybe AI can help with that.
I think it’s just that we’re in the middle
of a paradigm shift, right?
Like we haven’t gotten to the end of it.
We’re in the middle now.
And what I can tell you is that the way
that we’re going to edit video and audio in a year
or in two years is going to look completely different
than how we’re doing it right now.
But we don’t know how yet.
And on one hand, like that’s why I’m here.
That’s like why I’m doing this job,
because this is a place where the next generation
of like product managers and designers
are going to reinvent the way
that humans and computers interact with each other.
Someone’s going to figure it out.
And God, I hope it’s like me
or that I’m part of it in some small way.
But that’s also just like a very fragile moment, right?
Like it’s both a challenge and an opportunity.
And I think it’s like the challenge
of our industry right now.
– I think for us, it’s like there’s two sides of it.
One is definitely hiring, I can relate a lot to that.
It’s difficult.
We’ve gone from like zero to like tens and tens of millions
in months, not even years, in months.
And it’s really difficult to find people
that have experienced that previously,
also because like the market has evolved
very quickly in such a timeframe.
So that’s one side.
So there’s a lot of commitment that like we expect
from people at the company
and we need to be able to actually keep growing at this stage.
And on the research side,
it’s extremely difficult to find the right researchers
on the engineering side, on the operation side,
on sales, even support like across the board, right?
But that’s one side of the equation.
The other side of the equation is preventing misuse, right?
And I think realistically,
that is something that we have
an entire team dedicated to,
day and night, 24/7.
But every time that we put something together,
there are different things
that like people make up to try to game it.
And it is similar to fraud,
where like you’re always like two steps behind
and it’s really difficult to catch up and keep fighting it.
So I think those two elements
are like the biggest challenges
that we’re constantly facing as a company.
Like we’re winning, but still it’s just a matter
of making sure that you’re constantly innovating
and having resources for something that is important.
Otherwise like regulators come
or like consumers blame you
and things like that and people complain, right?
– Yeah, you need unprecedented people
for an unprecedented pace.
Quick questions: Laura,
who’s our wonderful producer at the A16Z podcast,
is gonna go around.
So if anyone does have a question, just raise your hand
and she’ll come find you.
– I’m curious how are we thinking about internationalization
or serving users of like various levels of digital literacy?
– We’ve had an international audience from the beginning,
including every country and every region
you could possibly imagine.
So I think it’s been a high priority from the beginning,
right?
Because the interesting thing is a lot of the development
that AI is bringing is not just things that are usable
in like, oh, it’s just an English thing
or oh, it’s just like a US thing or something.
It actually brings change in workflows
across almost every country
and every culture you can imagine.
And it actually works, right?
Like I think we’ve gone and launched new markets
where we’ve had zero users
and overnight had an explosion of users in that market.
But then we learned something about that particular market
where, oh, they don’t like this particular thing
or if you think about, for example, the Middle East, right?
Text is written in the opposite way.
And so that changes a lot about the UI
and changes a lot about the user experience, right?
And we’ve done a lot of work to make that good
and make that as usable and as amazing of an experience
as it is in any other language.
So those are the types of efforts
we’ve made high priority from the beginning.
– Would you say that other countries or regions
are actually more readily adopting the products
because I’m just thinking through,
well, actually maybe they can’t hire the software engineer
or maybe they can’t pay for the traditional video editor
for those thousands of dollars.
So they’re actually more readily adopting these technologies
’cause they’re bringing the cost down.
– Absolutely.
I mean, I think around the world people are super open
to trying something new
to see if they can change their workflow, right?
I think as long as you can provide something
that is once you try it,
you can’t go back to what you were doing before.
That’s it.
That’s the difference, right?
If you can provide that experience in any language,
any culture, any country,
people will use the product.
– I mean, I think for us internationalization
has been like since day one there.
We have a fully international team,
everyone is fully remote.
So actually, there’s a very strong correlation,
funny enough, between the actual employee profile,
the fact that we are in multiple countries,
everyone can be based wherever they want
and traveling and all of that stuff,
and the actual user type that we’ve got, right?
So yes, in the initial days,
like a lot of our growth came from North America
and European markets.
But actually these days, when you look at the entire pie,
it’s like super spread out across the world.
I can relate to that purely on the fact
that people want the best tools
that will help them on a day-to-day basis, right?
And you don’t really need to spend these days,
like thousands of dollars or like hundreds of dollars
to actually produce a video or to produce a podcast
or produce something, right?
You could do it much cheaper using tools.
And that’s the beauty of it.
So by default, like anyone that truly wants to have
a cost efficient solution will end up like using
any of the tools, Descript, Captions, or ElevenLabs,
or anything else that you have out there.
So by default, you end up having the strategy
that is about international markets,
with doing well content,
like trying to engage your audiences like that’s where they are
and trying to personalize it to them anyway.
Otherwise, I think like you end up having a problem
of being very skewed towards one market.
Traditionally it’s always been that,
oh, you go to one market, you conquer it
and then you expand to another one.
And these days it’s just not,
it’s just that it worked quite well enough.
– Yep, it’s time for maybe one, maybe two more.
I see one at the back.
– I’m just wondering what barriers or stop gaps
you might be putting in place for people
who may be using your products for nefarious purposes
and thinking about trust and safety.
– I think from Eleven, we invest like millions
every single year on actually preventing misuse, right?
And we started by implementing
a fingerprinting system
for any content that gets generated.
So since we launched, the fingerprinting has been in place,
and we then opened up the API and the UI
to make sure that anyone can check
whether something was generated by us or not.
And since then we’ve also essentially engaged
in monitoring the content that our users generate.
So that essentially if someone is generating things
that they shouldn’t, then essentially we block them.
We’ve gone as far as to build No-Go Voices,
which is a model that will block anyone
that tries to clone a celebrity voice for instance, right?
We’re constantly adding all of these layers
to try to make sure that we stay ahead of the curve.
But as I was saying earlier,
like it’s an uphill battle overall, right?
There are always ways in which you can game it.
But at the same time, like you have open source tools, right?
So we can try to do our side of the equation,
but with anything that is open source,
to some extent you don’t really have that much
like control over those tools, right?
But I think it’s important as a company,
we will keep investing like millions every single year
and we can increase it as the market grows as well.
– I have to just quickly ask because it’s very timely
and I’m sure people in the audience are wondering
with some of the recent news around AI voices,
let’s just leave it at that and celebrities.
Are you finding there to be a bunch of false positives?
‘Cause I feel like that’s maybe something
that people wonder, you hear a celebrity’s voice,
but how unique can a voice be?
And so if you’re trying to filter out certain people’s voices,
are you finding that actually like our voices maybe aren’t
that unique?
– That’s a really good question, right?
The voices are not as unique as everyone thinks,
however, they are quite unique.
So you end up having like false positives for sure,
but we end up thinking like, if it’s a false positive,
if it tells you like, oh,
you don’t have permission for this voice,
automatically it tells you like, oh,
but you can still pass the voice captcha
and it would show you the voice captcha.
So if you pass it because it is your voice,
then you’re able to actually like use your own voice, right?
I have a twin brother for the ones that don’t know.
We do sound exactly the same.
And even my parents actually,
sometimes they make mistakes, right?
So truly like, I could be talking,
but you could be thinking that it’s my twin brother.
We have exactly the same voice.
And that is a challenge that as a company we have
and as a society we have, right?
But I think you end up building layers
from a product perspective
to help filter those false positives.
I think like people understand that like,
you’re trying to go from like everything is a free-for-all,
where you can misuse as much as you want,
to like, let’s put some controls,
and even if there’s some false positives,
people understand it online.
– Something about the product side of this too,
which I do think is super important
to sort of like build the safety features
from the product, from the ground up,
like in the product from the ground up.
And that’s kind of the difference
between offering a technology versus offering a product.
If you just say, hey, come to our website,
make deep fakes, right?
That’s offering a technology.
And some people might be out there doing that, right?
I don’t know, right?
But I think if you build that into a product,
like for example, we have the language translation feature,
right, which can translate whatever you’re speaking
to a different language, change your lip movements as well.
And yes, that’s using the same technology,
but in a very opinionated way
that you can’t change what was said,
but you can change what language it was said in, right?
And so that limits the scope of abuse
immediately, quite a bit, right?
And then all the traditional methods
can be used on top of that as well.
– Right, I mean, with Descript,
you can create a voice clone of yourself
and sort of like intermingle.
We have this thing called Overdub,
where if I say the wrong word,
I can go back in with the text,
say the word that I actually meant to say,
and then it will with my voice clone,
kind of create that.
But obviously there are a lot of misuses there.
And so whenever we launch a product,
we launch it with protections in place
and do a bunch of testing and hire outside people
to try to crack it and try to make sure
that we do our very best to make sure that it’s ungamable.
But like you said, if people are extremely determined
to crack through security,
like they will always find new ways to do it.
And this was the case when I was in social media too,
where like you do all kinds of things
to try to protect your platform.
And bad actors, they get up every morning
and grind just as hard as you do.
And so you’re just sort of in the eternal struggle.
And I think like every single tech product
should be thinking about like,
how are people going to misuse us
and making sure that they’re responsibly providing
a bunch of resources to stay in the fight.
– So as VP of Revenue at Eleven,
how do you view the role of open source?
Because as a developer myself,
I would rather use, for example, Falcon 70B,
which is a dollar in dollar out per million tokens,
as opposed to GPT-4, which is 30 in and 50 out.
So do you think that open source is a threat to your business,
especially as companies like Meta
are kind of taking a scorched earth approach
to releasing models?
– I mean, I think it’s complementary actually.
You always end up having like businesses
or like people that like can go and use open source
and they have the means and the tools
and the knowledge to make that work.
And then you have quite a lot of different people
that like don’t really have those means or knowledge, right?
So it just ends up becoming like different sides
of the business or different sides of the market, right?
However you want to segment it.
When I think about voices,
we’ve been talking to each other as humans
for the past 50,000 years, right?
And there wasn’t really a good technology
that was able to replicate how we talk as humans.
So the fact that like as a platform or like even open source,
you’re able to actually replicate people’s voices
with their permission, make it sound natural, engaging,
and then power a new type of communication
and like platform and experience.
The market is massive.
So by default, you need to have both sides
to be able to actually like counterbalance each other
and push each other.
But open source also comes at a cost,
which is like the number of features that you will have
is like more limited, right?
So you will end up also having like fewer voices.
So what’s your preference?
Like you don’t have the UI.
So what’s your preference as a business or as an individual?
Is it purely building on top of it?
Then maybe open source is a good way.
Like today, the quality is not there yet.
But I’m sure that within the next three years,
the quality is going to be like matching anything
that is like private, right?
So it’s going to be more about like the actual system
that you build around it to make sure that like people start
like using it in a much easier way and then embed it anyway.
But I actually think it’s like complementary.
Like without one, we cannot have the other one purely
because the market like needs both sides.
– So just a follow-up, would you say it’s important for,
I guess, picks-and-shovels companies
that are closed source to build an application layer on top
to stay competitive?
– I think anyone that has actually built a pure, like, LLM,
if they’re not able to build applications on top of it
to make life easier for consumers and businesses,
will end up struggling down the line.
Whether that is in six months’ time
or that is in 18 months’ time, you will struggle.
Because at the end of the day,
like I want to launch my own application,
like my product, and use the product immediately,
right?
And if I need to spend time coding
and building the UIs and everything,
I might give up and go somewhere else.
Even if it’s more expensive,
especially if I don’t even know
whether I have product-market fit.
And product market fit, like we always think about like
actual startups, but like big corporates
might not have even product market fit.
So if you want to iterate quickly
and then go to market as quickly as possible,
then you might want to have a stack
that is like truly readily available for you.
But once you’re ready and you’ve tested it
and the technology fits well enough with other LLMs
or like open source,
then you might end up looking to switch.
And we’ve seen that with OpenAI,
like the big migration from developers
that started using OpenAI,
such as the GPT APIs and GPT-3.5.
And now they’re migrating towards like Anthropic
and like Mistral or Llama.
That’s been happening for the past six months.
It will continue happening, right?
So you start to validate that everything goes well
and then you figure out whether there are alternatives,
whether that is like renegotiating pricing
or like open source.
(upbeat music)
If you liked this episode, if you made it this far,
help us grow the show,
share with a friend or if you’re feeling really ambitious,
you can leave us a review at ratethispodcast.com/a16z.
You know, candidly producing a podcast
can sometimes feel like you’re just talking into a void.
And so if you did like this episode,
if you liked any of our episodes, please let us know.
I’ll see you next time.
(upbeat music)

Less than two years since the breakthrough of text-based AI, we now see incredible developments in multimodal AI models and their impact on millions of users.

As part of New York Tech Week, we brought together a live audience and three leaders from standout companies delivering AI-driven products to millions. Gaurav Misra, Cofounder and CEO of Captions, Carles Reina, Chief Revenue Officer of ElevenLabs, and Laura Burkhauser, VP of Product at Descript discuss the challenges and opportunities of designing AI-driven products, solving real customer problems, and effective marketing.

From the critical need for preventing AI misuse to ensuring international accessibility, they cover essential insights for the future of AI technology.

 

Resources: 

Find Laura on Twitter: https://x.com/burkenstocks

Find Carles on Twitter: https://twitter.com/carles_reina

Find Gaurav on Twitter: https://twitter.com/gmharhar

 

Stay Updated: 

Let us know what you think: https://ratethispodcast.com/a16z

Find a16z on Twitter: https://twitter.com/a16z

Find a16z on LinkedIn: https://www.linkedin.com/company/a16z

Subscribe on your favorite podcast app: https://a16z.simplecast.com/

Follow our host: https://twitter.com/stephsmithio

Please note that the content here is for informational purposes only; should NOT be taken as legal, business, tax, or investment advice or be used to evaluate any investment or security; and is not directed at any investors or potential investors in any a16z fund. a16z and its affiliates may maintain investments in the companies discussed. For more details please see a16z.com/disclosures.

 
