AI transcript
At the end of last year, there were 120 tools
with which you can clone someone’s voice.
And by March of this year, it’s become 350.
Being able to identify what is real
is going to become really important,
especially because now,
you can do all of these things at scale.
– One of the reasons that spam works and deep fakes work
is the marginal cost of the next call is so low
that you can do these things en masse.
– It’s way cheaper to detect deep fakes.
We’ve had 10,000 years of evolution.
The way we produce speech involves vocal cords, the diaphragm, your lips and your mouth and your nasal cavity.
It’s really hard for these systems to replicate all of that.
– Deep fake, a portmanteau of deep learning and fake.
It started making its way into the public consciousness
in 2018, but is now fully in the zeitgeist.
– We are seeing an alarming rise of deep fakes.
– Deep fakes are becoming increasingly easy to make.
– Deep fake videos are everywhere now.
– Deep fake robo-caller
with someone using President Biden’s voice.
– Deep fake of President Zelensky.
– Deep fake.
– Deep fake.
– Deep fakes.
– Deep fakes.
– We’ve seen deep fakes across social media,
commerce, sports and of course, politics.
And at the rate that they’re appearing,
deep fakes might sound like an impossible problem to tackle.
But it turns out that despite the decreasing barrier
to creation, our defender tool chest is even more robust.
So in today’s episode, we’ll discuss that
with someone who’s been thinking about voice security
for much longer than the average Twitter user
or even high-ranking politician,
wondering where this all goes.
Today, Vijay Balasubramaniyan, co-founder and CEO of Pindrop, joins a16z general partner Martin Casado
to break down the technology, the policy
and the economy of deep fakes.
Together, they’ll discuss questions like,
just how easy is it to create a deep fake today?
Like, how many seconds of audio do you need
and how many tools are available?
But also, can we detect these things?
And if so, is the cost realistic?
Plus, what does good regulation look like here
in a space moving so quickly?
And have we lost a grip on the truth?
We’ll listen in to find out, but first,
let’s kick things off with how VJ got here.
As a reminder, the content here
is for informational purposes only.
Should not be taken as legal, business, tax
or investment advice,
or be used to evaluate any investment or security
and is not directed at any investors
or potential investors in any a16z fund.
Please note that a16z and its affiliates
may also maintain investments
in the companies discussed in this podcast.
For more details, including a link to our investments,
please see a16z.com/disclosures.
I’ve been playing in the voice space
for a really long time.
I’m gonna date myself, but I started working at Siemens.
And at Siemens, we were working in landline switches
and EWSD switches and things like that.
And so that’s where I started.
I also worked at Google and there I was working
on the scalability algorithms for video chat.
And so that’s where I got introduced
to a lot of the voice over IP side of things.
And then I came to do my PhD from Georgia Tech.
And so there, I naturally got super interested
in voice security.
And ultimately, Pindrop, which is the company
that I started, was my PhD thesis,
very similar to the way you started off your life
as well, but it turned out to be something pretty meaningful.
And ever since then, it’s been incredible
what’s happened in this space.
– This is why I’m so excited to have you on this podcast.
To many, deep fakes are this new emergent thing,
but you’ve actually been in the voice fraud detection space
for a very long time.
So it’s gonna be great to see your perspective
on how things are different now
and how things are more of the same.
And so maybe to provide a bit of context to get started: deep fakes have entered the zeitgeist, so maybe talk through what they are when we say deep fakes and why we're talking so much about them.
– We’ve been doing deep fake detection
for like now seven years.
And even before that, you have people manipulating audio
and manipulating video.
And you saw that with Nancy Pelosi slurring in a speech,
all they did was slow down the audio.
It wasn’t a deep fake, it was actually a cheap fake, right?
And so that is actually what’s existed for a really long time.
What changed is the ability to use
what are known as generative adversarial networks
to constantly improve things like voice cloning
or video cloning or essentially try to get the likeness
of a person really close.
So it’s essentially two systems competing against each other.
And the objective function is I’m gonna get really close
to Martin’s voice and Martin’s face,
and then the other system is trying to figure out,
okay, what are the anomalies?
Can I still detect that it’s a machine
as opposed to a human?
So it’s almost like a reverse Turing test.
And so what ended up happening is once you start creating these GANs, which are used in a lot of these spaces, and you run them across multiple iterations, the system becomes really, really good, because you're training a deep learning neural network, and that's where the "deep" in deep fake comes from.
And they became so good that lots of people
have extreme difficulty differentiating
between what is human and what is machine.
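To make the idea of two systems competing concrete, here is a minimal, generic GAN training loop in PyTorch. The feature dimension, network sizes, and hyperparameters are illustrative assumptions, not the architecture of any real voice-cloning or detection product.

```python
# Minimal GAN sketch: a generator tries to produce "voice-like" feature
# vectors, a discriminator tries to tell them apart from real ones.
# Shapes and hyperparameters are illustrative only.
import torch
import torch.nn as nn

FEAT_DIM, NOISE_DIM = 80, 16  # e.g., 80 mel bands per frame (assumption)

generator = nn.Sequential(
    nn.Linear(NOISE_DIM, 128), nn.ReLU(),
    nn.Linear(128, FEAT_DIM),
)
discriminator = nn.Sequential(
    nn.Linear(FEAT_DIM, 128), nn.ReLU(),
    nn.Linear(128, 1),  # logit: real vs. generated
)

g_opt = torch.optim.Adam(generator.parameters(), lr=2e-4)
d_opt = torch.optim.Adam(discriminator.parameters(), lr=2e-4)
bce = nn.BCEWithLogitsLoss()

def train_step(real_frames: torch.Tensor):
    batch = real_frames.size(0)
    fake_frames = generator(torch.randn(batch, NOISE_DIM))

    # 1) Discriminator: push real toward 1, fake toward 0.
    d_loss = bce(discriminator(real_frames), torch.ones(batch, 1)) + \
             bce(discriminator(fake_frames.detach()), torch.zeros(batch, 1))
    d_opt.zero_grad(); d_loss.backward(); d_opt.step()

    # 2) Generator: fool the discriminator (push fake toward 1).
    g_loss = bce(discriminator(fake_frames), torch.ones(batch, 1))
    g_opt.zero_grad(); g_loss.backward(); g_opt.step()
    return d_loss.item(), g_loss.item()

# Usage: real_frames would come from a target speaker's audio features.
real_frames = torch.randn(32, FEAT_DIM)  # placeholder data
print(train_step(real_frames))
```

Run across many iterations, the generator gets steadily better at fooling the discriminator, which is the dynamic described above.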
– So let’s break this down a little bit
because I think that deep fakes are more talked about
now than they were in the past, right?
And so clearly this seems to have coincided
with the generative AI wave.
And so do you think it’s fair to say
that there’s a new type of deep fake
that is drafting off the generative AI wave
and therefore we need to have a different posture
or is it just the same but brought to people’s attention
because of generative AI?
– Generative AI has allowed for combinations
of wonderful things.
But when we started, there was just one tool
that could clone your voice, right?
It was called Lyrebird, an incredible tool.
It was used for lots of great applications.
At the end of last year, there were 120 tools
with which you can clone someone’s voice.
And by March of this year, it’s become 350.
And there’s a lot of open source tools
that you can use to essentially mimic someone’s voice
or to mimic someone’s likeness.
And that’s the ease with which this has happened.
Essentially the cost of doing this has become close to zero
because all it takes for me to clone your voice, Martin, is about three to five seconds of your audio.
And if I want a really high quality deep fake,
it requires about 15 seconds of audio.
Compare this to before the generative AI boom
where John Legend wanted to become the voice of Google Home
and he spent close to 20 hours recording himself saying a whole bunch of things so that Google Home could say, in San Francisco, the weather is 37 degrees, or whatever.
So the fact is that he had to go into a studio,
spend 20 hours recording his voice
in order for you to do that compared to 15 seconds
and 300 different tools available to do it.
– It almost feels to me that we need like new terms
because this idea of cloning voices
has been around for a while.
I don’t know if you remember this, Vijay,
but this wasn’t too long ago when I was in Japan
and I got this call from my parents, which I never do.
And my mom’s like, where are you right now?
And I’m like, I’m in Japan.
And my mom’s like, no, you’re not.
And I’m like, yes, I am.
She says, hold on, let me get your father.
So my dad jumps on the line and he’s like,
where are you?
I’m in Japan.
He’s like, I just talked to you, you were in prison
and I’m leaving to go bring $10,000 of bail money to you.
I’m like, what are you talking about?
And he’s like, listen, someone called and said
that you had a car accident and you were a bit muffled
because you were hurt
and that I needed to bring cash to a certain area.
And like your mom just thought to call you
while I was heading out the door, right?
So of course we called the police after this
and they said, this is a well-known scam
that’s been going on for a very long time.
And it’s probably just someone that tried to sound like you
and muffling their voice, right?
And so it seems that calling somebody
and obfuscating the voice to trick people
has been around for a very long time.
So maybe just from your perspective,
do we need a new term for these generative AI fakes
because they’re somehow fundamentally different
or is this just more of the same?
And we shouldn’t really worry too much about it
because we’ve been dealing with it for a long time.
– Yeah, so it’s interesting it happened to you in Japan man
because the origin of that scam early on,
I went with the Andres and Horowitz contingency to Japan.
This was way back, this was like close to eight, nine years back
when I was talking about voice fraud,
the Japanese audience talked to me about Oriori Sagi
which is help me grandma.
So it’s exactly that, but at that point in time,
it had started costing Japan close to half a billion dollars
in people losing their life savings to the scams, right?
So in Japan, half a billion dollars close to eight, nine years back.
So the mode of operation is not different, right? You get vulnerable populations into an urgent situation, make them believe they have to act or it will be disastrous, and they will comply.
What’s changed is the scale
and the ability to actually mimic your voice.
The fact is that now you have so many tools
that anyone can do it super easily.
Two, before, if you had some sort of an accent or things like that, they couldn't quite mimic your real voice. But now, because it only takes 15 seconds, your grandson could have a 15-second TikTok video and that's all that's required. Not even 15 seconds; with five seconds, depending upon the demographic, you can get a pretty good clone.
So what’s changed is the ability to scale this
and then these fraudsters are combining
these text-to-speech systems with LLM models.
So now you have a system where you're saying,
okay, when the person says something,
respond back in a particular way crafted by the LLM.
And here is the crazy thing, right? In LLMs, hallucination is a problem. The fact that you're making shit up is a bad thing. But if you have to make shit up to convince someone, well, these models are really good at that.
And it’s crazy.
We see fraud where the LLM is coming up with crazy ways
to convince you that something bad is happening.
– Wow, wow, wow. What I want to get into next is, are we all doomed, or is it possible to detect these things? But before we do that, it'd be great, since you're probably the world's expert on voice fraud, you've probably seen more types of voice fraud than any single person on the planet. We know of the ore ore sagi, which is basically what I got hit with.
Can you maybe talk to some other uses of deepfakes
that are prevalent today?
– Yeah, so deepfakes have existed for a while, but right now you can see them in the political spectrum, right? The election misinformation during President Biden's campaign happened, and we were the ones who caught it and identified it.
– What were the specifics? Are you allowed to talk about it?
– Yeah, no, no, for sure.
What happened is early on this year,
and if you think about deepfakes,
they affect three big areas,
commerce, media and communication, right?
And so this is news media, social media.
So what happened is at the beginning of an election year,
you had the first case of election interference: everyone during the Republican primary in New Hampshire got a phone call that said, hey, you know what? Your vote doesn't count this Tuesday. Don't vote right now; come vote in November.
And this was made in the voice
of the president of the free world, right?
President Biden, right?
That’s the craziness.
They went for the highest profile target,
and you should listen to the audio.
It’s incredible.
It is like President Biden,
and they’ve interspersed it with things
that President Biden says,
like what a bunch of malarkey and things like that.
So that came out and people were like,
okay, is this really President Biden?
So not only did we come in and say,
this was a deep fake,
we have something called source tracing,
which tells us which AI application
was used to create this deep fake.
So we identified the deep fake,
and then we worked with that AI application.
They’re an incredible company.
We worked with them and they immediately found
the person who used that script and shut them down.
So they couldn’t create any other problem.
So this is a great example of different good companies
coming together to shut down a problem.
And so we worked with them.
They shut it down.
And then later on, regulation kicked in, and they fined the telco providers who distributed these calls. They fined the political analyst who intentionally created these deep fakes.
But that was the first case of political misinformation.
You see this a lot.
– Was that this year?
– Yeah, it was this year.
It was in January of this year.
– That’s amazing.
Okay, we’ve got politics.
We’ve got bilking old people.
Maybe one more good anecdote
before we get into whether we can detect these things.
– The one thing that's really close to home is in commerce, right?
Like financial institutions.
Even though generative AI came out in 2022, in 2023 we were seeing essentially one deep fake a month at some customer, right? So it was just one deep fake a month, and some customer would face it.
It wasn’t a widespread problem.
But this year, we’ve now seen one deep fake per customer
per day.
So it has rapidly exploded.
And we have certain customers like really big banks
who are getting a deep fake every three hours.
Like it’s insane the speed.
So there has been a 1400% increase
in the amount of deep fakes we’ve seen this year
in the first six months compared to all of last year.
And the year is not even over.
– Wow.
All right, so we have these deep fakes.
They are super prevalent.
They are impacting politics and e-commerce.
Can you talk to like whether these things
are detectable at all?
Is this the beginning of the end or where are we?
– Martin, you’ve lived through many such cycles
where initially it feels like the sky is falling.
Online fraud, emails, spam, there’s a whole bunch of them.
But the situation is the same.
They’re completely detectable.
Right now we’re detecting them with 99% detection rate
with a 1% false positive rate.
So extremely high accuracy on being able to detect them.
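For a sense of what a 99% detection rate and a 1% false positive rate mean together, here is a small back-of-the-envelope calculation. The call volume and the assumed share of calls that are deepfakes are made-up figures for illustration, not numbers from the conversation.

```python
# What a 99% detection rate with a 1% false positive rate implies at scale.
# Call volume and deepfake prevalence below are illustrative assumptions.
true_positive_rate = 0.99    # deepfakes correctly flagged
false_positive_rate = 0.01   # genuine calls incorrectly flagged

calls = 100_000
deepfake_prevalence = 0.001  # assume 1 in 1,000 calls is a deepfake

deepfakes = calls * deepfake_prevalence
genuine = calls - deepfakes

caught = deepfakes * true_positive_rate       # deepfakes that get flagged
false_alarms = genuine * false_positive_rate  # genuine calls that get flagged
precision = caught / (caught + false_alarms)  # share of flags that are real

print(f"flagged deepfakes: {caught:.0f}")
print(f"false alarms:      {false_alarms:.0f}")
print(f"precision:         {precision:.1%}")
```

The point of the exercise is simply that the rarer deepfakes are in the traffic, the more the false positive rate matters; the headline detection rate alone doesn't tell the whole story.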
– Just to put this in context,
what are numbers for identifying voice?
Not fraud just like whether it’s my voice.
– So it’s roughly about one in every 100,000
to one in every million, right?
That’s the ratio.
So it’s much higher precision for short
and much higher specificity.
But yeah, deep fakes you’re detecting
with a 99% accuracy.
And so these things you’re able to detect
very, very comfortably.
And the reason you’re able to detect it
is because when you think about even something like voice,
you have 8,000 samples of your voice every single second,
even in the lowest fidelity channel,
which is the contact center.
And so you can actually see how the voice changes over time,
8,000 times a second.
And what we find is these deepfake systems make mistakes, either in the frequency domain, spectrally, or in the time domain.
And they make a lot of mistakes.
And the reason they make mistakes that still sound fine to you is because, think about it, your human ear can't check for anomalies 8,000 times a second.
If it did, you’d go mad, right?
Like you’d have some serious problems.
So that’s the reason like it’s beautiful to your ear.
You think it’s Martin speaking on the other end,
but that’s where you can use good AI,
which can actually look at things 8,000 times a second.
Or like when we’re doing most online conferencing,
like this podcast, it’s usually 16,000.
So then you have 16,000 samples of your voice.
And if you’re doing music,
you have 44,000 samples of the musician’s voice
every single second.
So there’s so much data and so many anomalies
that you can actually detect these pretty comfortably.
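To illustrate the kind of per-frame analysis being described, thousands of measurements per second that no human ear tracks, here is a toy sketch that runs a short-time Fourier transform over 8 kHz telephony-rate audio and flags frames with unusual spectral change. The features and the anomaly threshold are arbitrary placeholders, not Pindrop's detector.

```python
# Toy illustration of frame-level spectral analysis at the 8 kHz telephony
# rate. Real deepfake detectors use learned features; this only shows the
# kind of per-frame statistics (energy, spectral flux) a machine can track.
import numpy as np
from scipy.signal import stft

SAMPLE_RATE = 8_000  # contact-center telephony, the lowest-fidelity channel

def frame_features(audio: np.ndarray, sr: int = SAMPLE_RATE):
    # Short-time Fourier transform: ~20 ms frames with a 10 ms hop.
    _, _, spec = stft(audio, fs=sr, nperseg=160, noverlap=80)
    mag = np.abs(spec)                                 # (freq_bins, frames)
    energy = mag.sum(axis=0)                           # per-frame energy
    flux = np.abs(np.diff(mag, axis=1)).sum(axis=0)    # frame-to-frame change
    return energy, flux

def flag_anomalous_frames(audio: np.ndarray, z_thresh: float = 4.0):
    # Flag frames whose spectral flux is far from the clip's own average.
    _, flux = frame_features(audio)
    z = (flux - flux.mean()) / (flux.std() + 1e-9)
    return np.where(np.abs(z) > z_thresh)[0]           # suspicious frame indices

# Usage with one second of placeholder audio (8,000 samples):
audio = np.random.randn(SAMPLE_RATE).astype(np.float32)
print(flag_anomalous_frames(audio))
```

At 16 kHz conferencing audio or 44 kHz music there is proportionally more data per second to inspect, which is the point being made above.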
– I see a lot of proposals, particularly from policy circles, of using things like watermarking or cryptography, which has always seemed like a strange idea to me, because you're asking criminals to comply with something.
So I don’t know,
how do you view more active measures to self-identify
either legit or illegitimate traffic?
– Yeah, see, this is why you're in security, Martin: almost immediately you realize that most attackers will not comply with you putting in a watermark. But even without anyone actively stripping the watermark, right? Even if you didn't have an active adversary. Take the President Biden robocall that I referenced before: when it finally showed up, the system that actually generated it had a watermark in it. But when they tested it against that watermark, they were only able to extract 2%.
– Oh, interesting. So you mean the original Biden call had a watermark?
– A watermark, because it was generated by an AI app that included a watermark. And then they copied–
(laughs)
And 90% of that watermark went away, largely because when you take that audio and play it across the air or across telephony channels, the bits and bytes get stripped away. And audio is a very sparse channel, so even if you add the watermark over and over again, it's not possible to keep it intact.
So these watermarking techniques, I mean, they're a great technique, and you always think about defense in depth. Where they're present, you will be able to identify a whole lot more genuine stuff as a result of these watermarks, but attackers are not going to comply with it.
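As a rough illustration of why watermark bits get stripped when audio is replayed over the air or squeezed through a telephony channel, here is a toy spread-spectrum watermark whose detection score drops after a simulated lossy channel. The embedding scheme, channel model, and amplitudes are simplifying assumptions; real watermarking schemes are more sophisticated, but they face the same physics.

```python
# Toy spread-spectrum audio watermark: embed a low-amplitude pseudorandom
# key, then simulate a lossy path (downsample to telephony rate, add noise)
# and watch the detection score degrade. All parameters are illustrative.
import numpy as np
from scipy.signal import resample

rng = np.random.default_rng(0)
SR = 16_000
audio = rng.standard_normal(SR)                      # 1 s of placeholder audio

# Embed: add a quiet pseudorandom sequence known only to the detector.
key = rng.choice([-1.0, 1.0], size=audio.size)
watermarked = audio + 0.05 * key

def watermark_score(signal: np.ndarray) -> float:
    # Normalized correlation with the secret key; near 0 means "no watermark".
    n = min(signal.size, key.size)
    return float(np.dot(signal[:n], key[:n]) / n)

# Simulated lossy path: downsample to 8 kHz telephony, add line noise,
# then upsample back before running detection.
telephony = resample(watermarked, SR // 2)
telephony = telephony + 0.05 * rng.standard_normal(telephony.size)
recovered = resample(telephony, SR)

print(f"score before channel: {watermark_score(watermarked):.4f}")
print(f"score after channel:  {watermark_score(recovered):.4f}")
```

In this toy setup the score typically drops by roughly half after one pass through the simulated channel; a real replay over speakers and phone lines is harsher still, which is the failure mode described above.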
When you get to videos, like we are now working with news media organizations, and 90% of the videos and audios they get from, for example, the Israel-Hamas war are fake.
– How many?
– 90% of them are fake.
– What?
– Yeah.
– I guess I shouldn’t be so surprised, but.
– Yeah.
They’re all made up.
They’re a different war.
Some of them are cheap fake.
Some of them are actually deep fake.
Some of them are clutched together.
And so being able to identify what is real
is going to become really important,
especially because now you can do
all of these things at scale.
– Can you draw out how the maturation
in AI technology impacts this?
Because clearly something happened in the last year
to make this economic for attackers,
which we’re seeing arise.
And clearly it’s going to keep getting better.
And so do you have a mental model
for why this doesn’t become a serious problem in the future
or does it become a serious problem in the future?
– So one of the things that we talk about is that any deepfake detection system should have strong resilience built into it. It should not just be good at detecting the deepfakes of right now; it should be able to detect what we call zero-day deepfakes.
A new system gets created.
How do you detect that deep fake?
And essentially the mental model is the following.
One, deep fake architectures
are not simple monolithic systems.
They have like several components within them.
And what ends up happening is each of these components
tend to leave behind artifacts.
We call this a fake print.
So they all leave behind things that they do poorly, right?
And so when you actually create a new system,
you often find they’ve pulled together pieces of other systems
and those leave behind their older fake prints.
And so you can actually detect newer systems, because they usually only improve on one component.
The second is we actually run GANs.
So you get these GANs to compete.
Like we create our own deep fake detection system.
Now we say, how do you beat that?
And we have multiple iterations of them running
and we’re constantly running them.
– Sorry, I just wanna make sure that I understand here.
So you’re creating your own deep fake system
using the approach you talked about before,
which is the generative adversarial network.
So then you can create a good deep fake
and then you can create a detection for that.
Is that right?
– Exactly.
And then you beat that detection system
and you run that iteration, iteration, iteration.
And then what you find
is actually something really interesting,
which is if a deep fake system has to serve two masters,
that is, one, I need to make the speech legible and sound as much like Martin as possible, and two, I need to deceive a deepfake detection system, then those two objective functions start to diverge.
So for example, I could start adding noise, and noise is a great way to keep you from noticing my limitations. But if I start adding too much noise, you can't hear the speech.
So for example, we were called into one of these deep fakes
where LeBron James apparently was saying bad things
about the coach during the Paris Olympics.
It wasn’t LeBron James, it was a deep fake.
We actually provided his management team with the necessary detail so that on X, it could be labeled as AI-generated content.
And so we did that.
But if you look at the audio,
there was a lot of noise introduced into it, right?
To try and avoid detection.
But lots of people couldn’t even hear the audio.
They were like, this is really,
and so that’s where you start seeing these systems diverge.
And this is where I have confidence
in our ability to detect it, right?
Which is you run these GANs,
you know the architectures that these deepfake generation systems are built from.
And ultimately you start seeing divergences
in one of the objective functions.
So either you as a human will be able
to detect some things off,
or we as a system will be able to detect some things off.
– Awesome.
One of the reasons that spam works and deepfakes work is the marginal cost of the next call is so low that you can do these things en masse, right? Like the marginal cost of the next spam email or whatever.
Do you have even just the vaguest sense of, if it takes me a dollar to generate a deepfake, how much does it cost to detect a deepfake?
Is it one to one?
Is it 10 to one?
Is it 100 to one?
– It’s way cheaper to detect deepfakes, right?
Because if you think about it,
like what we’ve seen is the closed example
is Apple released its model that could run on device.
And even that model, which is a small model
in order to do lots of things like voice to text
and things like that.
Our model is about 100 times smaller than that.
So it’s so much faster in detecting deepfakes.
So the ratio is about 100th right now.
And we’re constantly figuring out ways
to make it even cheaper, but it’s 100th that of generation.
– Wow, I see. So detection is two orders of magnitude cheaper than creation. Which means, listen, if there is no defense, there's no defense. But if there's a defense that requires the bad guys to have two orders of magnitude more resources, that's actually pretty dramatic, given that normally you just go for parity on these things, because there tend to be a lot more good people than bad people.
– And that’s the thing.
You have two orders of magnitude.
And then the fact is that once you know
what a deepfake looks like,
unless they re-architect the entire system.
And the only companies that re-architect full pipelines.
And the last time this was done is back in 2015
when Google released Tacotron,
where they re-architected several pieces of the pipeline.
It’s a very expensive proposition.
– Is the intuitive reason that the cost is so much cheaper to detect that you just have to do less? Like, the person generating the deepfake has to sound like a human, be passable to a human, and evade detection. And so that's just more things than detecting it, which can have a much narrower focus. So it'll always be cheaper to detect. And then you don't see a period in time where the AI is so good that no deepfake detection mechanism can catch it? You don't see that?
– We don’t see that because either you become so good
at avoiding detection that you actually start becoming worse
at producing human-generated speech
or you’re producing human-generated speech.
And unless you actually create a physical representation of a human, it's hard to get there, because we've had 10,000 years of evolution, and the way we produce speech involves vocal cords, the diaphragm, your lips and your mouth and your nasal cavity, all of those physical attributes. Think about the fact that your voice is resonating through the folds of your vocal cords. These are subtle things that have changed over time. It's all of what has taken you to become you. Somebody might have punched you in the throat at some point in time, and that created some kind of artifact. There's so much that happens. It's really hard for these systems to replicate all of that.
They have generic models and those generic models are good.
You can also think about the more we learn
about your voice, Martin, the better we can get
at knowing where your voice is deviating.
– And I have an incentive as a good guy
to work with you on that.
So you’ll have access to data where the bad people
may not have access to data and it totally makes sense.
It seems to me like the spam lessons learned apply here, which is that spam can be very effective for attackers, very effective, and defenses can also be incredibly effective; however, you have to put them in place.
And so it’s the same situation here,
which is be sure you have a strategy for deep fake detection.
But if you do, you’ll be okay.
– That’s exactly right.
And I think it has to be in each of the areas.
Like when you think about deepfakes,
you have incredible AI applications
that are doing wonderful things in each of these spaces.
You know, the voice cloning apps,
they’ve actually given voices to people
who have throat cancer and things like that.
Not just throat cancer, people who have been put behind bars
because of a bad political regime
are now getting to spread their message.
So they’re doing some incredible stuff
that you couldn’t do otherwise.
But in each of those situations,
it was with the consent of the user
who wanted their voice recreated, right?
And so that notion that the source AI applications
need to make sure that the people using their platform
actually are the people who want to use their platform.
That’s part A.
– And this is where the partnerships that you talked about with the actual generation companies come in, so that you can help them with the legitimate use cases as well as sniffing out the illegitimate ones. Is that right?
– Absolutely.
– And with ElevenLabs, it's incredible.
The amount of work they’re doing to create voices ethically
and safely and carefully is incredible.
They’re trying to get lots of great tools out there.
We’re partnering with them.
They’re making their data sets accessible to us.
There are companies like that, right?
Another company is called Respeecher. They did a lot of the Hollywood movies.
So all of these companies are starting to partner
in order to be able to do this in the right way.
And it’s similar to a lot of what happened
in the fraud situation back in the 2000s
or the email spam situation back in the 2000s.
– I want to shift over to policy.
I’ve had a lot of policy discussions lately
in California as well as at the federal level.
And here’s my summary of how our existing policymakers
think about AI.
A, they’re scared and they want to regulate it.
B, they don’t know why they’re scared.
And C, there's one exception, which is none of them want deep fakes of themselves.
So I’ve found a primary motivation around regulating AI
is just this fear of political deep fakes, honestly.
And these are in pretty legit face-to-face conversations.
And so have you given thought
to what guidance you would give to policymakers, many of whom listen to this podcast, and how they should think about any regulations or rules around this, and maybe how it intersects with things like innovation and free speech, et cetera?
I mean, it’s a complicated topic.
I think the simple one-liner answer is
they should make it really difficult for threat actors
and really flexible for creators, right?
That’s the ultimate difference.
And history is rife with a lot of great examples, right? Like, you lived through the email days, where the CAN-SPAM Act was a great step, but it came in combination with better ML technologies.
– And I’m of that generation too,
but maybe just walk through how CANSPAM works.
I think it’s a good analog.
– You probably know more about the CAN-SPAM Act, but the CAN-SPAM Act is one where anyone who's providing unsolicited marketing has to be clear in their headers, has to allow you to opt out, all of those things. And if you don't follow this very strict set of policies, you can be fined. And you also have great detection technologies that allow you to detect this spam, right? And now that everyone follows a particular standard, especially when you're doing unsolicited marketing or you're trying to do bad things like pornography, you have detection technologies, AI/ML technologies, that can detect you well.
The same thing happened when banks went online.
You had a lot of online fraud.
And if you remember, the Know Your Customer Act
and the Anti-Money Laundering Acts came in there.
So the onus was on you as an organization to know your customer.
That’s the guarantee.
And so you need technology.
After that, you can do what you want.
What was really good about both of those cases
is they got really specific on one,
what can the technology detect?
Because if the technology can’t detect it,
you can’t litigate, you can’t find the people
who are misusing it and so on.
So what can the technology detect?
And two, how do I make it really specific
on what you can and cannot do
in order to be able to do this?
And so I think those two were great examples of how we should think about legislation.
And with deep fakes, there is this very clear thing, right?
Like you have free speech,
but for the longest time,
anytime you used free speech for fraud,
or you were trying to incite violence,
or you were trying to do obscene things,
these are clear places
where the free speech guarantees go away.
So I think if you’re doing that, you should be fined.
And you should have laws that protect you against that.
And that’s the model I think of.
– Awesome.
So I’m gonna add just one thing from CANSPAN
that I think that you’ve touched on,
but I was actually working email security there.
So I think that this highlighted,
I wanna see if you agree with this kind of characterization.
So the first one is for illegal use,
policy doesn’t really help
because people aren’t gonna comply
and they’re gonna do whatever they want
and they’re doing something criminal anyways.
And so for that, we just rely on the best technical solution. You can make recommendations, but for strictly illegal uses, you have to rely on technology.
No policy is gonna keep you safe.
But then there’s this kind of gray area of unwanted stuff.
And the unwanted stuff, you didn’t ask for it.
It may not be illegal, but it’s super annoying
and it’s unwanted and it can fill your inbox.
And for those, you can put in rules, because if somebody crosses those rules, you can litigate against them, or you can opt out of it. And so regulation works for the unwanted stuff.
I could see that definitely happening here.
And then of course, there’s the wanted stuff
which doesn’t require any regulation.
Is that a fair characterization?
– That’s a really good characterization.
I think you’ve said it really, really well.
And the only other thing that I'll say is that right now, because we consume things through a lot of platforms, the platforms should be held accountable at some level for clearly demarcating what is real and what is not.
Because otherwise it’s going to be really hard
for the average consumer to know
that this is AI generated versus this is not.
So I think there’s a certain amount of accountability there.
– Because the technology is where it is, putting the onus on the platforms to follow best practices, just like we did for spam, right? Like, I rely on Microsoft and Google for spam detection; doing the same type of thing for the platforms sounds like a very sensible recommendation.
– Yeah.
– All right, great.
So let’s just go ahead and wrap this up.
So key point number one is deepfakes
have been around for a long time.
We probably need a new name for this new generation
and this isn’t just like some hypothetical thing
but you’re seeing a massive increase.
You said as much as one per customer per day, and the cost to generate has gone way down.
Good news is that these things are evidently detectable
and in your opinion will always be detectable
if you have a solution in place.
And then as a result, I think any policy should
provide the guidance and maybe accountability
for the platforms to detect it
because we can actually detect it.
And so listen, it’s something for people to know about
but it’s not the end of the world
and policy makers don’t have to regulate all of AI
for this one specific use case.
Is this a fair synopsis?
– This is a beautiful synopsis, Martin.
You’ve captured it really well.
– All right, that is all for today.
If you did make it this far, first of all, thank you.
We put a lot of thought into each of these episodes
whether it’s guests, the calendar touchers,
the cycles with our amazing editor Tommy
until the music is just right.
So if you’d like what we put together,
consider dropping us a line at ratethespodcast.com/a16z
and let us know what your favorite episode is.
It’ll make my day and I’m sure Tommy’s too.
We’ll catch you on the flip side.
(upbeat music)
Deepfakes—AI-generated fake videos and voices—have become a widespread concern across politics, social media, and more. As they become easier to create, the threat grows. But so do the tools to detect them.
In this episode, Vijay Balasubramaniyan, cofounder and CEO of Pindrop, joins a16z’s Martin Casado to discuss how deepfakes work, how easily they can be made, and what defenses we have. They’ll also explore the role of policy and regulation in this rapidly changing space.
Have we lost control of the truth? Listen to find out.
Resources:
Find Vijay on Twitter: https://x.com/vijay_voice
Find Martin on Twitter: https://x.com/martin_casado
Stay Updated:
Let us know what you think: https://ratethispodcast.com/a16z
Find a16z on Twitter: https://twitter.com/a16z
Find a16z on LinkedIn: https://www.linkedin.com/company/a16z
Subscribe on your favorite podcast app: https://a16z.simplecast.com/
Follow our host: https://twitter.com/stephsmithio
Please note that the content here is for informational purposes only; should NOT be taken as legal, business, tax, or investment advice or be used to evaluate any investment or security; and is not directed at any investors or potential investors in any a16z fund. a16z and its affiliates may maintain investments in the companies discussed. For more details please see a16z.com/disclosures.