Detecting Deepfakes With AI

AI transcript
0:00:16 “Wanna understand exactly how interest rate rises will impact your mortgage or how New
0:00:22 York City gets fresh produce or exactly what on earth was going on over at FTX before the
0:00:23 whole thing collapsed?”
0:00:27 Twice a week, we sit down with the perfect guest to answer these sort of questions and
0:00:32 understand what’s going on with the biggest stories in finance, economics, business and
0:00:33 markets.
0:00:34 I’m Tracy Alloway.
0:00:35 And I’m Joe Weisenthal.
0:00:38 And we are the hosts of Bloomberg’s Odd Lots podcast.
0:00:40 Look us up wherever you get your podcasts.
0:00:48 The Odd Lots podcast from Bloomberg.
0:00:54 Earlier this year, an employee working in Hong Kong for an international company got
0:00:56 a weird message from one of his colleagues.
0:01:01 He was supposed to make a secret transfer of millions of dollars.
0:01:02 It seemed sketchy.
0:01:04 It obviously seemed sketchy.
0:01:09 So he got on a video call with a bunch of people, including the company’s CFO, the
0:01:10 chief financial officer.
0:01:15 The CFO said the request was legit, so the employee did what he was told.
0:01:20 He transferred roughly $25 million to several bank accounts.
0:01:25 As it turned out, the CFO on the video call was not really the CFO.
0:01:27 It was a deep fake.
0:01:34 An AI-generated twin created from publicly available audio and video of the real CFO.
0:01:37 By the time the company figured out what was going on, it was too late.
0:01:45 The money was gone.
0:01:48 I’m Jacob Goldstein, and this is What’s Your Problem, the show where I talk to people who
0:01:51 are trying to make technological progress.
0:01:53 My guest today is Ali Shahriyari.
0:02:00 He’s the co-founder and chief technology officer at the audaciously named Reality Defender.
0:02:02 Ali’s problem is this.
0:02:07 How can you use AI to protect the world from AI?
0:02:13 More specifically, how do you build a set of models to spot the difference between reality
0:02:16 and AI-generated deep fakes?
0:02:19 How do you get into the defending reality business?
0:02:29 When I started, it was around actually generating videos and deep fakes.
0:02:32 You were attacking reality before you were defending it?
0:02:38 I wouldn’t say we were attacking anything, but we were definitely looking into this new
0:02:39 technology.
0:02:42 This is way back before all this stuff went crazy.
0:02:46 This is back in 2019 around that time.
0:02:51 We were building digital twins, and we were looking at how do you make it so that it looks
0:02:52 realistic?
0:02:53 Is it a cartoon looking thing?
0:02:57 Is it a Unity 3D thing?
0:03:02 That’s when we started to see these early research papers where they were taking someone’s
0:03:07 face and putting it on a video and blending it in, and it looked really good.
0:03:13 We were like, “Oh, maybe we can do the digital twins that way.”
0:03:19 While we were in that business, we realized that probably in a few years, someone could download an app
0:03:27 and just make anything very easily, and that’s the origin of how we started.
0:03:28 We’re very mission-driven.
0:03:35 What we’re trying to do here is really protect the world and people from the dangers of AI,
0:03:40 but in a way where we want people not to abuse this technology.
0:03:41 We love AI.
0:03:44 We just don’t want it to be abused.
0:03:57 Let’s talk about this deep fake detection, gen AI detection market more generally.
0:04:00 Who’s selling deep fake detection right now and who’s buying?
0:04:03 What’s the market landscape look like?
0:04:08 The type of clients that we have right now are banks.
0:04:13 For example, we are currently live with one of the largest banks in the world.
0:04:19 When you call that bank, the audio goes through our deep fake detection models, and we’re
0:04:24 able to tell the call center, “This person might be a deep fake,” and the point is
0:04:26 that’s actually happened.
0:04:34 Someone’s called the bank and they’ve transferred money out, and this goes back to 2019.
0:04:39 The first incident of deep fake fraud actually happened back in 2019.
0:04:41 That we’re aware of, right?
0:04:42 Right, exactly.
0:04:45 What happened in 2019?
0:04:49 This is back where this is early and nobody really knew about this.
0:04:58 There was a CEO of a parent company calling the child company’s CEO, one CEO calling
0:04:59 the other CEO.
0:05:04 He wanted to transfer some money out, and it sounded like him, and the guy transferred,
0:05:09 I think it was in the UK, about $200,000 to $300,000 out, and that was one of the first ones that
0:05:10 we know of.
0:05:12 They got away with it?
0:05:13 I believe so, yeah.
0:05:18 There was an instance earlier this year where I think it was in Hong Kong, and some employee
0:05:24 was on a Zoom call with the company’s CFO, and the CFO was like wire $25 million or something
0:05:28 to some bank account, and then the employee did it, and it turned out the CFO on the call
0:05:29 was a deep fake.
0:05:30 Right?
0:05:31 Yeah, so fast.
0:05:34 Were they your clients?
0:05:40 They were not our clients, unfortunately, but this shows how quickly the technology is
0:05:41 evolving.
0:05:45 2019 audio, fast forward a few years, now you got a Zoom call with a bunch of people
0:05:50 on it, and they all look like people you know, and nope, they’re all deep fakes.
0:05:53 So you were starting to mention banks are some of your main clients?
0:05:55 Who are some of your other main clients?
0:05:56 Other companies?
0:06:01 I think some of the big ones use our product this year, especially with the
0:06:02 election.
0:06:04 Back in 2020, we thought it would be a problem.
0:06:05 It wasn’t.
0:06:08 With this year, we think it’s a big problem, for sure.
0:06:14 I think we were early, but this is happening everywhere even this year.
0:06:17 This year is the largest election year in the world, and more than 50% of the people
0:06:24 are voting, and we already have documented cases of election issues with deep fakes.
0:06:29 Okay, media companies, banks, any other kind of big categories of clients?
0:06:38 Yeah, so other ones are governments, agencies, but in the end, we believe
0:06:41 everyone needs this product.
0:06:44 It shouldn’t be up to the people to decide or figure out if something’s a deep fake.
0:06:48 If you’re on the social media platform, you shouldn’t have to figure out, hey, is this
0:06:49 person real or not?
0:06:53 It should just be built in and anyone should be able to use it.
0:06:59 Well, are social media companies either buying or building deep fake detection tools, or
0:07:02 do they want to stay out of that business and be like, no, we don’t want to be in the
0:07:05 business of saying, yes, this is real, no, this isn’t real?
0:07:10 I can tell you we’ve been in contact and have talked to some social media platforms.
0:07:17 I think one issue is they don’t have to flag these things.
0:07:19 It’s up to them.
0:07:22 There’s not a lot of regulation.
0:07:23 I know they’re thinking about it.
0:07:26 We’ve chatted with some, but that’s the extent of it.
0:07:29 Okay, so let’s talk about how it works.
0:07:31 There’s two ways that I want to talk about how it works.
0:07:35 One is from the point of view of the user, whoever that may be.
0:07:38 Then the other is what’s going on under the hood.
0:07:40 Let’s start with the point of view of the user.
0:07:46 If I’m a whatever, a bank, a university, a media company who is paying for your service,
0:07:48 how does it work for me?
0:07:51 Those are exactly the user and the use case.
0:07:57 If let’s say it’s a media company, they’re looking at maybe filtering through a lot of
0:08:01 content, so content moderation, actually that would be like a social media company.
0:08:03 They’re looking at content moderation.
0:08:08 Maybe they’re looking at millions of assets and they want to quickly flag those things
0:08:10 if they were in that business.
0:08:16 The bank, for the example I gave, the issue is someone could call and biometrics fail,
0:08:17 by the way.
0:08:22 If they call a bank, some banks say, “Repeat after me, my voice is my password.”
0:08:25 That actually fails now, what do you think?
0:08:28 A bank wants to make sure the person calling in is actually that person.
0:08:33 This is more relevant to private banking, where there’s actually a one-on-one relationship
0:08:36 between the client and the bank.
0:08:38 In that case, let’s take that case.
0:08:42 In that case, someone calls in and talks to their banker.
0:08:46 They’re a rich person who has a private banker, basically is what you’re talking about.
0:08:51 This rich person calls in and talks to their private banker.
0:08:56 Is the system just always running in the background in that case?
0:08:59 How does it work from the point of view of the private banker?
0:09:00 Sure.
0:09:06 I have to be careful what I say here, but the high level is the models are listening,
0:09:11 and if they detect a potential deep fake, they will, the call center, that person will
0:09:15 get a notification, so it’s integrated into their existing workflow.
0:09:18 They’ll get a notification that says, “Hey, this person might…”
0:09:19 Get like a text or a Slack or something?
0:09:20 They’re using software.
0:09:23 By the way, you’re talking to a deep fake.
0:09:24 No, they’re using software.
0:09:28 For the bank, they’re still using software and there’s a dashboard.
0:09:31 In that scenario, they escalate.
0:09:35 They might say, “Let me ask you some more questions or let me call you back.”
0:09:38 Let me call you back is a super safe one, because if they have a relationship presumably,
0:09:41 they know the number, they just call them back.
0:09:42 Yeah, absolutely.
0:09:48 Then how does it work for like, when you say, I presume by the way that you can’t name
0:09:49 your clients.
0:09:53 You said a media company and a bank, it’s secret that they’re…
0:09:54 Yeah, we’re not allowed to.
0:09:56 Okay, so let’s say a media company.
0:09:59 How does it work for a media company?
0:10:03 Their use case is slightly different, especially right now as I mentioned around the election,
0:10:09 so there might be something that’s starting to go viral in the news and they want to check.
0:10:11 Hey, is this a real or not?
0:10:16 I would like to say something like this: usually when something goes viral, the damage
0:10:17 is already done.
0:10:18 Yes.
0:10:21 Although if you’re whatever, the New York Times or the Wall Street Journal, you don’t want
0:10:23 to repeat the viral lie.
0:10:29 Part of your business model is people are paying to subscribe to you because you are more reliable,
0:10:30 right?
0:10:31 Exactly.
0:10:32 So that’s why they come to us.
0:10:35 They upload the assets and our web app returns a result.
0:10:36 I see.
0:10:43 It’s like you just go to whatever, realitydefender.whatever and you upload the viral video and your machine
0:10:45 says it’s a fake.
0:10:49 Yeah, so we give results as probabilities, since we don’t have the ground truth.
0:10:51 So we give a probability.
0:10:54 There’s several different models running, so we use an ensemble of models.
0:11:00 We have different models looking at different things and we give an overall score averaging
0:11:01 those.
0:11:05 In the case of a video, we actually highlight the areas of the deep fake. If the person speaking
0:11:08 is fake, there’ll be a red box around them.
0:11:10 If they’re real, there’ll be a green box around them.
0:11:16 And well, that latter part sounds more binary as opposed to probabilistic.
0:11:17 We give both.
0:11:21 So yeah, there was a probability score and there’s just the visual.
0:11:26 And so the probabilistic score is basically, according to our model, there’s a 70% chance
0:11:28 that this is fake, something of that nature.
0:11:35 According to our ensemble of models, our model of models, our fund of funds of models.
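To make that concrete, here is a minimal sketch of what averaging an ensemble of detectors can look like; the model names, scores, and weights are invented for illustration and are not Reality Defender's actual models.

```python
def ensemble_score(scores: dict, weights: dict = None) -> float:
    """Combine per-model 'probability of fake' scores into one overall score.

    `scores` maps a hypothetical model name to that model's probability that
    the input is AI-generated. With no weights this is a plain average.
    """
    if weights is None:
        weights = {name: 1.0 for name in scores}
    total = sum(weights[name] for name in scores)
    return sum(scores[name] * weights[name] for name in scores) / total

# Hypothetical per-model outputs for one piece of media.
scores = {"face_blend": 0.82, "lip_sync": 0.71, "frequency": 0.64}
print(f"ensemble probability of fake: {ensemble_score(scores):.2f}")  # ~0.72
```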
0:11:41 So, okay, so you’re actually walking us toward what’s under the hood, right?
0:11:43 I’m interested in discussing this on a few levels, right?
0:11:49 There is this sort of broad beyond reality defender, what are the basic ways that the
0:11:50 technology works?
0:11:55 Like how does deep fake detection, gen AI detection work in a broad way?
0:11:56 Can you talk me through that?
0:11:57 Absolutely, yeah.
0:12:01 There’s currently two ways people are looking at this problem.
0:12:03 Number one is provenance.
0:12:07 For example, you watermark a media that you create.
0:12:10 Maybe you watermark it or you digitally sign it.
0:12:13 Maybe you put it on a blockchain somewhere or something like that, but basically there’s
0:12:14 a source of truth.
0:12:16 This video is real and there’s a watermark.
0:12:17 That’s number one.
0:12:22 But we’re concerned with instances where that is not the case, right?
0:12:28 Our world is full of videos today that are not clearly watermarked, blockchain, whatever,
0:12:29 for provenance.
0:12:30 So we have this problem.
0:12:32 What are the ways people are solving it?
0:12:33 Yeah.
0:12:37 The second way is how we’re solving it, which is basically we use AI to detect AI.
0:12:40 So we call it inference.
0:12:48 So we train AI models, as I mentioned, a bunch of them to look at various aspects of the
0:12:49 same video.
0:12:55 So is a sort of generative adversarial network the right term?
0:12:59 It seems like if I were making up how to do this, I’d be like, well, I’m going to have
0:13:03 one model that’s like cranking out really good deepfakes.
0:13:05 But I’ll know which ones are the deepfakes.
0:13:08 And then I’m going to feed the deepfakes and the real ones to my other model and I’ll
0:13:10 score it on how well it does.
0:13:12 And it’ll get really good at figuring out the difference.
0:13:13 Yeah.
0:13:15 That’s actually exactly how a lot of these work.
0:13:20 If you go to– there’s a website you can go where it just generates a person every time
0:13:21 you go to it, right?
0:13:24 And that’s actually using a GAN, again, to generate that person.
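A toy sketch of the adversarial setup Jacob describes, assuming PyTorch: one network produces fakes, another learns to separate fake from real. Only the detector side is trained here, the "media" are random vectors rather than real frames, and the architectures are made up purely for illustration.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
DIM = 64  # stand-in "media" are 64-dimensional vectors instead of images

generator = nn.Sequential(nn.Linear(16, 128), nn.ReLU(), nn.Linear(128, DIM))
detector = nn.Sequential(nn.Linear(DIM, 128), nn.ReLU(), nn.Linear(128, 1))

opt_d = torch.optim.Adam(detector.parameters(), lr=1e-3)
loss_fn = nn.BCEWithLogitsLoss()

for step in range(200):
    real = torch.randn(32, DIM) + 2.0               # pretend distribution of real media
    fake = generator(torch.randn(32, 16)).detach()  # generator output, frozen here
    inputs = torch.cat([real, fake])
    labels = torch.cat([torch.ones(32, 1), torch.zeros(32, 1)])  # 1 = real, 0 = fake

    opt_d.zero_grad()
    loss_fn(detector(inputs), labels).backward()
    opt_d.step()

# After training, the detector scores unseen samples.
with torch.no_grad():
    p_real = torch.sigmoid(detector(torch.randn(1, DIM) + 2.0)).item()
    print(f"probability sample is real: {p_real:.2f}")
```

In a full GAN the generator would be trained against the detector in alternating steps; freezing it keeps the sketch focused on the detection side.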
0:13:27 So the way we detect– and I can give a little more detail here.
0:13:33 So for example, one of our models which we actually removed was looking at blood flow.
0:13:35 So yeah.
0:13:40 So imagine, actually, in this video, if the lighting and conditions are right, we can
0:13:45 actually detect the heartbeat and the blood flow and the veins the way we’re looking at
0:13:46 each other.
0:13:47 Yes.
0:13:49 As I’m looking– weirdly today, maybe because it’s hot or because the lighting– I can actually
0:13:52 see a vein bulging on my forehead.
0:13:57 So you’re saying an AI could measure my pulse from that or something?
0:13:58 In the right conditions.
0:14:02 Now, that model has a lot of limitations.
0:14:08 And you need to have the right– it’s basically– it has a lot of bias, right?
0:14:09 So we tossed that.
0:14:10 Wait.
0:14:11 You’re saying it didn’t work?
0:14:12 You’re saying it didn’t work?
0:14:15 It worked in the right conditions and with the right skin tone.
0:14:16 So yeah.
0:14:17 So otherwise, it was biased.
0:14:21 So we– this was experimental and we tossed it.
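For the curious, here is a very rough sketch of the blood-flow idea (remote photoplethysmography): the average green intensity of facial skin flickers slightly with each heartbeat, and a Fourier transform of that signal peaks at the pulse rate. The per-frame signal below is simulated, and, as Ali says, the real technique is fragile and sensitive to lighting and skin tone, which is why it was dropped.

```python
import numpy as np

fps, seconds, pulse_hz = 30, 10, 1.2   # 1.2 Hz = 72 beats per minute
t = np.arange(fps * seconds) / fps

# Simulated per-frame mean green intensity of a face region: baseline + tiny pulse + noise.
rng = np.random.default_rng(0)
green = 0.6 + 0.005 * np.sin(2 * np.pi * pulse_hz * t) + rng.normal(0, 0.002, t.size)

spectrum = np.abs(np.fft.rfft(green - green.mean()))
freqs = np.fft.rfftfreq(t.size, d=1 / fps)

# Strongest frequency in a plausible heart-rate band (0.7 to 4 Hz).
band = (freqs > 0.7) & (freqs < 4.0)
print(f"estimated pulse: {freqs[band][np.argmax(spectrum[band])] * 60:.0f} bpm")  # ~72
```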
0:14:22 I like things that didn’t work.
0:14:23 So you tried it.
0:14:25 And in a broad way, it didn’t work.
0:14:27 It worked in narrow conditions, but you need things that work more broadly.
0:14:28 Yeah.
0:14:30 What’s another thing you tried that didn’t work?
0:14:35 Well, I can tell you, every month, we may be throwing away models.
0:14:38 Well, presumably, there’s things that work for a while and then they don’t, right?
0:14:42 It’s kind of like antibiotics versus bacteria, right?
0:14:46 Like your adversaries are getting better every day, basically, right?
0:14:50 Well, what we like to say is we’re like an antivirus company.
0:14:54 So every time– every month, there’s a new generative technique.
0:14:58 We should be able to detect it, but maybe it’s something we didn’t anticipate and we don’t detect.
0:15:01 And so we have to make sure we quickly update our model.
0:15:06 So– and then the model that worked last year, it’s completely irrelevant now.
0:15:07 So what else?
0:15:11 Like what else is happening technologically on the reality defense side, on the detection
0:15:12 side?
0:15:13 Okay.
0:15:15 So that way, we have a few different products.
0:15:20 One is, as I mentioned, real-time audio, like scanning and listening for telephone calls.
0:15:26 The other one is a place where a journalist or any user can go and upload not just videos,
0:15:28 but we also detect images.
0:15:30 We also detect audio.
0:15:32 We also detect text, like ChatGPT.
0:15:37 And these tools also explain to a user why something is a deep fake.
0:15:39 We don’t just give a score for an image.
0:15:43 We might put a heat map and say, “These are the areas that set the model off.”
0:15:49 For text, we might highlight areas and say, “These are the areas that appear to be generated.”
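One common way such a heat map can be produced is with input gradients (simple saliency), sketched below with a tiny placeholder network and a random image. Production systems typically use techniques like Grad-CAM on much larger models; this is a generic pattern, not Reality Defender's implementation.

```python
import torch
import torch.nn as nn

# Placeholder "deep fake score" model: conv features pooled into a single logit.
model = nn.Sequential(
    nn.Conv2d(3, 8, 3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(8, 1),
)

image = torch.rand(1, 3, 64, 64, requires_grad=True)  # stand-in face image
model(image).sum().backward()                         # backprop the score to the pixels

# Gradient magnitude per pixel: larger values mean that pixel moved the score more.
heatmap = image.grad.abs().max(dim=1).values.squeeze(0)  # shape (64, 64)
print(heatmap.shape, float(heatmap.max()))
```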
0:15:55 There’s a case study you have about a university that is a client of yours that, among other
0:16:03 things, uses your service to tell when students are turning in papers written by ChatGPT,
0:16:05 basically, as I read it.
0:16:09 I just assumed that everybody writes papers with ChatGPT now, and there’s nothing anybody
0:16:10 can do about it.
0:16:11 But is that not true?
0:16:18 If I have GPT write my paper and then I change a few words, does that sort of help– let me
0:16:21 sail past your defense?
0:16:22 It depends.
0:16:24 It depends how much you change.
0:16:27 If you change over 50%, maybe it would.
0:16:29 So it depends.
0:16:32 Over 50% is more than a few words.
0:16:33 Can you talk?
0:16:37 I know you can’t name the university, but in practice, you know how they use it.
0:16:43 Some professor runs the students’ papers through your software and says of one student, there’s
0:16:49 a 60% chance that this was created using a large language model.
0:16:53 Do you know in practice– obviously, the professor could do whatever they want or the university
0:16:58 could have whatever policy– but do you know in practice, what do they do with this information?
0:17:03 That’s in a way a harder one to figure out than the banker who’s like, “Oh, it might be
0:17:04 a deep fake on the phone.
0:17:06 I’ll call you right back for security.”
0:17:10 I don’t have a banker, but if I had a banker and they did that, I’d be like, “Oh, that’s
0:17:11 cool.
0:17:13 My bank is doing this thing.”
0:17:21 Whereas with the professor and the student, that’s a much more sort of fraught situation,
0:17:27 and harder to think of how to deal with, again, the probabilistic nature of the output of
0:17:28 the model.
0:17:29 Yes.
0:17:30 I’ve got a couple more things here.
0:17:34 First of all, I think even universities are trying to figure out this problem and how
0:17:36 do you solve it.
0:17:43 The second thing to note, most of our users are not interested in a text detector.
0:17:46 That seems to be a much smaller market.
0:17:48 The biggest one is actually audio.
0:17:52 It’s becoming– imagine you get a call from a loved one saying “send me money,” and
0:17:56 you send the money and figure out it wasn’t them– it was a deep fake, right?
0:18:00 That’s actually a much more widely used system.
0:18:02 That’s the big one in terms of the business.
0:18:03 It’s interesting.
0:18:08 I wonder if that’s partly– we think about the video more, but is it partly because
0:18:12 deep fake audio is now quite good and there are lots of instances where people will transfer
0:18:16 lots of money based solely on audio?
0:18:18 Deep fake audio is the best and it’s getting better, right?
0:18:19 Interesting.
0:18:22 It used to be that to make your voice, maybe I needed a minute of audio.
0:18:25 Now I need just a few seconds and I can make your voice.
0:18:28 It’s getting exponentially better, all of them are.
0:18:33 But audio is definitely top of the list right now.
0:18:34 How are you keeping up?
0:18:35 Yeah.
0:18:38 I mean, so when we detect audio, it’s tricky.
0:18:40 There’s a lot of factors to think about.
0:18:44 A person’s accent, right? Is the model biased?
0:18:49 Does it not understand, or is there an issue where it detects one person with a certain
0:18:51 type of accent always as a deep fake?
0:18:54 There’s also issues of like noise.
0:18:57 When there’s a lot of background noise, the model can be impacted.
0:19:02 When there’s crosstalk, multiple people speaking at the same time, that could impact the model.
0:19:04 So there’s a variety of factors.
0:19:08 And the other thing you think about is our models, they support multiple languages.
0:19:10 So we don’t just do English.
0:19:13 And so all of these kind of make it very complicated.
0:19:16 So before we detect something, there’s what’s called pre-processing.
0:19:21 There’s a whole bunch of steps applied to the audio before it actually goes to our AI models, where
0:19:26 we have to clean up the audio and do certain types of transformations before we push it
0:19:27 to the models.
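A hedged sketch of the kind of clean-up a clip might get before it reaches a detection model: resample to a fixed rate, normalize the level, trim near-silent edges. The specific steps and thresholds here are illustrative assumptions, not the company's pipeline.

```python
import numpy as np

def preprocess(audio: np.ndarray, sr: int, target_sr: int = 16000) -> np.ndarray:
    # 1. Resample to the rate the models expect (naive linear interpolation).
    if sr != target_sr:
        n_out = int(len(audio) * target_sr / sr)
        audio = np.interp(np.linspace(0, len(audio) - 1, n_out),
                          np.arange(len(audio)), audio)
    # 2. Peak-normalize so loudness differences don't dominate.
    peak = np.max(np.abs(audio)) or 1.0
    audio = audio / peak
    # 3. Trim near-silent edges with a simple energy threshold.
    voiced = np.where(np.abs(audio) > 0.01)[0]
    if len(voiced):
        audio = audio[voiced[0]:voiced[-1] + 1]
    return audio

# Example: one second of noise pretending to be an 8 kHz phone call.
clip = preprocess(np.random.randn(8000) * 0.3, sr=8000)
print(clip.shape)
```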
0:19:34 Is that happening in real time with these companies?
0:19:38 And what is the frontier of pre-processing?
0:19:41 Is it an efficiency and speed problem because you’re trying to do it in real time and so
0:19:47 you’re just trying to kind of make the sort of algorithmic part of it as fast and efficient
0:19:48 as possible?
0:19:49 Yeah.
0:19:51 I mean, this is a challenge.
0:19:52 There’s a lot to be done.
0:19:54 So that’s ongoing research.
0:19:59 How do we continue to speed up not just the pre-processing, but inference?
0:20:03 And there’s a variety of one thing that’s called a foundation model.
0:20:06 I’m not sure if you’ve heard what those are, but these are extremely large pre-trained
0:20:07 models.
0:20:08 GPT is a foundation model.
0:20:09 It’s a pre-trained model.
0:20:14 And so these models can be useful in some parts of the pre-processing, where they can
0:20:23 quickly extract certain features for us, and then we can use those further down the pipeline.
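Here is the pattern Ali is describing, sketched under the assumption that the Hugging Face transformers library and the public facebook/wav2vec2-base checkpoint are available: a large pre-trained model turns raw audio into features that a small downstream classifier could then consume.

```python
import numpy as np
import torch
from transformers import Wav2Vec2FeatureExtractor, Wav2Vec2Model

extractor = Wav2Vec2FeatureExtractor.from_pretrained("facebook/wav2vec2-base")
backbone = Wav2Vec2Model.from_pretrained("facebook/wav2vec2-base")

audio = np.random.randn(16000).astype(np.float32)  # placeholder 1-second clip at 16 kHz
inputs = extractor(audio, sampling_rate=16000, return_tensors="pt")

with torch.no_grad():
    # One embedding per clip: average the frame-level features from the backbone.
    features = backbone(inputs.input_values).last_hidden_state.mean(dim=1)

print(features.shape)  # [1, 768]; this vector would feed a small fake/real classifier
```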
0:20:38 Still to come on the show: the problems that Ali is trying to solve now.
0:20:42 Want to understand exactly how interest rate rises will impact your mortgage?
0:20:45 Or how New York City gets fresh produce?
0:20:50 Or exactly what on earth was going on over at FTX before the whole thing collapsed?
0:20:54 Twice a week, we sit down with the perfect guest to answer these sort of questions and understand
0:20:59 what’s going on with the biggest stories in finance, economics, business, and markets.
0:21:00 I’m Tracy Alloway.
0:21:02 And I’m Joe Weisenthal.
0:21:05 And we are the hosts of Bloomberg’s Odd Lots podcast.
0:21:06 Look us up wherever you get your podcasts.
0:21:16 The Odd Lots podcast from Bloomberg.
0:21:18 How good are you at detecting deep fakes?
0:21:20 Can you quantify how good you are?
0:21:24 So the way they usually do this is they look at benchmarks, right?
0:21:30 There’s public data sets, which we can take and run and we’re in the 90s.
0:21:32 And then, but you know, that’s not the real world scenario.
0:21:40 When you say you’re in the 90s, you mean you, in a binary sense, you guess correctly 90%
0:21:41 of the time.
0:21:42 Yeah.
0:21:44 So on a public benchmark, we’re in the 90s.
0:21:47 There’s accuracy, precision, and recall.
0:21:49 Accuracy is how accurate are we?
0:21:56 Let’s say the sample set is a hundred, maybe 50 are fake, 50 are real,
0:21:57 right?
0:21:58 The accuracy is, you take, okay,
0:22:00 how many of those did you get right?
0:22:02 How many of the reals and fakes you got right, divided by the total, right?
0:22:04 That’s the accuracy.
0:22:10 The problem with that is an unbalanced data set: maybe only two are fake.
0:22:13 And then the other 98 are real.
0:22:18 So in that case, if we had said that, okay, everything is real, the accuracy would
0:22:19 be 98%.
0:22:20 Uh-huh.
0:22:21 Right?
0:22:23 That’s not very useful because you missed the deep fakes.
0:22:24 Yeah.
0:22:26 So that’s why our precision and recall come in.
0:22:32 They look specifically at how did you do on that specific, like, the fakes or the reals?
0:22:34 So that’s, so there’s more than just accuracy.
0:22:37 There’s also other factors to look at.
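The imbalance point is easy to see in a few lines: a lazy detector that calls everything real scores 98% accuracy on a set with only two fakes, yet catches none of them. Precision and recall expose that.

```python
real_count, fake_count = 98, 2
labels = [0] * real_count + [1] * fake_count    # 1 = fake
predictions = [0] * (real_count + fake_count)   # lazy detector: everything "real"

tp = sum(p == 1 and y == 1 for p, y in zip(predictions, labels))
fp = sum(p == 1 and y == 0 for p, y in zip(predictions, labels))
fn = sum(p == 0 and y == 1 for p, y in zip(predictions, labels))

accuracy = sum(p == y for p, y in zip(predictions, labels)) / len(labels)
precision = tp / (tp + fp) if (tp + fp) else 0.0
recall = tp / (tp + fn) if (tp + fn) else 0.0
print(f"accuracy={accuracy:.2f} precision={precision:.2f} recall={recall:.2f}")
# accuracy=0.98 precision=0.00 recall=0.00: high accuracy, every fake missed
```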
0:22:44 So there’s, it’s kind of like the sort of false positive, false negative, um, challenge
0:22:46 with medical tests, right?
0:22:50 You want to test that both says you have the thing, says you have the disease when you
0:22:55 have the disease and also says you don’t have the disease when you don’t have the disease.
0:23:01 And that actually ends up being a really complicated problem, uh, given the, the nature of baselines,
0:23:02 right?
0:23:05 Like in your universe, certainly in the universe of people calling their banker, almost everybody
0:23:09 calling their banker is a real person, right?
0:23:14 But there are these very high stakes, presumably very rare cases where it is a deep fake.
0:23:17 And so that’s like a complicated problem.
0:23:18 It actually is.
0:23:19 It absolutely is.
0:23:24 And it’s something as we work with each customer, we have to tweak: somewhat higher false
0:23:27 positives, somewhat higher false negatives.
0:23:28 It depends on each use case.
0:23:33 In the case of a bank, they, they want to be a bit more cautious, but that also causes
0:23:35 a lot of, it could cause a lot of pain.
0:23:36 Kind of volume.
0:23:37 Right.
0:23:40 Cause if every client it’s like, oh, sorry, gotta call you back to make sure you’re not
0:23:41 a deep fake.
0:23:43 Like that’s not great.
0:23:44 Yeah.
0:23:49 And if you have thousands of calls a day and even 1% is a false positive, uh, or negative
0:23:53 that, that creates a lot of work because it adds up.
0:23:54 How do you solve that?
0:23:56 What do you do about that?
0:24:03 So the way it works is all about adjusting, you can think of thresholds, right?
0:24:09 We can adjust a fair variety of parameters on the output of a model, not just the model
0:24:17 itself, but, for example, in audio, as we speak, you know, we can look at, okay,
0:24:21 how long do you want to listen before you give an answer?
0:24:27 You know, the longer you listen, the more confident
0:24:28 the model is.
0:24:29 Oh, that’s smart.
0:24:30 That makes sense.
0:24:31 Right.
0:24:32 Because it’s essentially more data for the model.
0:24:33 Exactly.
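A toy illustration of the two knobs being discussed, the decision threshold and how long you listen before answering. The per-second scores are simulated with made-up numbers; the point is only that averaging more audio gives a steadier score, so the same threshold triggers fewer false alarms.

```python
import random

random.seed(1)
THRESHOLD = 0.7  # flag the call as a possible deep fake above this average score

def flag_call(is_fake: bool, seconds: int) -> bool:
    # One noisy per-second score from a hypothetical model; fakes score higher on average.
    mean = 0.8 if is_fake else 0.4
    scores = [min(1.0, max(0.0, random.gauss(mean, 0.25))) for _ in range(seconds)]
    return sum(scores) / len(scores) > THRESHOLD

for seconds in (2, 10):
    false_alarms = sum(flag_call(False, seconds) for _ in range(1000))
    print(f"listening {seconds:>2}s: {false_alarms} false alarms per 1000 genuine calls")
```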
0:24:36 So what are you trying to figure out now?
0:24:39 Like what is the frontier?
0:24:43 What’s really the latest now, and it’s just amazing how quickly it’s going, is videos.
0:24:48 So the videos that we detect are like a face swap, like you’re sitting there speaking and
0:24:50 another person’s face is on there.
0:24:51 That’s a face swap.
0:24:57 But now you can generate an entire video completely from scratch and you just type in
0:25:00 the description and the video comes out.
0:25:03 It can take some, it can, I can take your voice a few seconds of your voice.
0:25:09 I can then have you say anything I want, which you can clearly see the bad, bad person can
0:25:10 misuse these tools.
0:25:12 So the latest is these things are getting really good.
0:25:18 And over time, like with those videos is your, how is your reliability and accuracy changing?
0:25:23 You getting better or worse or staying the same as the technology to create the deep
0:25:24 fakes improves?
0:25:29 So what’s interesting is it has slowed down in terms of like the signatures, like we,
0:25:32 we don’t need as much data as we used to.
0:25:37 So of course there’s still a lot of work and we’re never going to stop, but it is stabilizing
0:25:39 a little bit.
0:25:43 When you say it, what is stabilizing a little bit?
0:25:48 So, like, the deep fake signatures are stabilizing, in a way.
0:25:52 Signatures meaning the giveaways, the things that I can’t see, but that your models can
0:25:53 see.
0:25:54 Yeah, exactly.
0:25:59 So our models, going back to give a little more detail, they’re looking at different
0:26:04 attributes of a piece of media and they pull out those attributes and then they send those
0:26:08 through our in-house neural networks that, that study those attributes.
0:26:13 Like one that you have mentioned that the company has mentioned publicly is the, the
0:26:15 sync of audio and video, right?
0:26:16 Yes.
0:26:19 Like maybe that’s one where it’s gotten better and it doesn’t matter anymore.
0:26:24 But like it, from what I understand, from what I’ve read, there was at least a time when
0:26:30 the sync of the audio and video tended to be off in deep fake videos, right?
0:26:34 Is that an example of a signature?
0:26:36 So the way that works is we train the model.
0:26:41 We say, Hey, here’s a bunch of people speaking, here’s what they look like.
0:26:42 Look at the sync.
0:26:46 Here’s a bunch of people like that are deep fakes and look at the sync and we tune the
0:26:48 model so we can tell the difference.
0:26:50 That’s also happening with video, by the way.
0:26:56 If you look at Sora and some of these new models where someone’s walking, for example,
0:27:02 their legs are not like, you know, the, they’re not really smooth or they don’t look right.
0:27:04 So you can look at that as well.
0:27:06 That’s the temporal dynamics we call that.
0:27:07 Uh-huh.
0:27:12 Like temporal dynamics is basically are things proceeding in time in a natural way?
0:27:13 Exactly.
0:27:16 How things change over time.
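A rough sketch of what a temporal-dynamics check can look like: real motion tends to change smoothly from frame to frame, while some generated video jitters or jumps. The trajectory below is simulated; a real system would track pose or facial landmarks across frames.

```python
import numpy as np

rng = np.random.default_rng(0)
frames = 120  # about 4 seconds at 30 fps

smooth_walk = np.cumsum(rng.normal(0, 0.5, frames))      # plausible, smooth motion
jittery_walk = smooth_walk + rng.normal(0, 3.0, frames)  # the same path with unnatural jitter

def jerkiness(track: np.ndarray) -> float:
    # Mean absolute second difference: how abruptly the velocity changes per frame.
    return float(np.abs(np.diff(track, n=2)).mean())

print(f"real-like motion: {jerkiness(smooth_walk):.2f}")
print(f"fake-like motion: {jerkiness(jittery_walk):.2f}")  # noticeably higher
```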
0:27:21 So yeah, all of these seem like things that are just going to be fleeting,
0:27:22 right?
0:27:24 Like my baseline assumption is it’ll all get solved.
0:27:31 Um, do you, how long do you think you’ll be able to defend reality for?
0:27:35 You know, this question comes up all the time where there is always a giveaway or there
0:27:38 is always a new way to look at the problem.
0:27:40 We’re not just looking always at the raw pixels, right?
0:27:43 We could look at different aspects.
0:27:44 We could look at the frequency.
0:27:48 For example, if you look at an image, you can actually break it down into frequencies.
0:27:52 When you say frequency, what do you mean when you say you can look at the frequency?
0:27:53 What does that mean?
0:27:55 So for example, okay, so let’s go with audio.
0:27:59 You know, you can use Fourier transforms to actually break up audio
0:28:03 into individual frequencies, sines and cosines.
0:28:07 You can do the same for an image, for example, you can break that up, like an
0:28:10 analogy of the wave form of audio.
0:28:11 Yeah.
0:28:13 It can, it can be translated into a bunch of waves.
0:28:17 So there’s multiple dimensions that we look at.
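A minimal sketch of that frequency view: a Fourier transform breaks a signal into sines and cosines, and generation pipelines can leave unusual energy in that spectrum. The high-frequency "artifact" below is invented purely to show where such a fingerprint would appear.

```python
import numpy as np

sr = 16000
t = np.arange(sr) / sr  # one second of audio
voice = np.sin(2 * np.pi * 220 * t) + 0.5 * np.sin(2 * np.pi * 440 * t)
artifact = 0.05 * np.sin(2 * np.pi * 7900 * t)  # hypothetical high-frequency residue

spectrum = np.abs(np.fft.rfft(voice + artifact))
freqs = np.fft.rfftfreq(len(t), d=1 / sr)

# The three strongest components: the two "voice" tones plus the artifact tone.
top = sorted(freqs[np.argsort(spectrum)[-3:]])
print([round(f, 1) for f in top])  # approximately [220.0, 440.0, 7900.0]
```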
0:28:22 And the AI, there’s always a giveaway.
0:28:24 And again, we’re also thinking outside the box, right?
0:28:27 Like the blood flow, for example, right?
0:28:29 But there’s other kind of similar things we could think about.
0:28:37 I mean, presumably, you know, Renaissance Technologies, Jim Simons’s firm, is one
0:28:45 of the first quant hedge funds, and they made tons of money for a long time, they wildly
0:28:46 outperformed the market.
0:28:48 Clearly, they had a technological advantage.
0:28:54 And the thing Simons, the founder, this math guy, said about that company, one of
0:28:59 the things he said was like, we actually don’t want to hire like finance people who have
0:29:02 some story about why a stock is going to outperform.
0:29:06 Because if there’s a story about it, that then, then somebody else is going to know it
0:29:07 already.
0:29:08 Right?
0:29:13 Their thing was just like, we just give the model all the data and let the model find
0:29:20 these weird ass patterns that no human even understands, but they work more often than
0:29:23 they don’t work, and we make tons of money.
0:29:27 And I would think that would be the case for you to some extent, that if you could think
0:29:32 of a thing like monitoring blood flow or whatever, then the bad guys or whatever, the people
0:29:36 who want to make realistic, uh, gen AI would also think of it.
0:29:43 And the real kind of secret sauce would be in weird correlations that the model finds,
0:29:46 that we wouldn’t even understand.
0:29:47 Exactly.
0:29:54 I mean, that is oftentimes what the model is training on, and the way it determines if
0:30:00 something’s a deep fake, looking at certain features, is something that we don’t even
0:30:01 tell it.
0:30:02 Right?
0:30:03 Yeah.
0:30:04 It determines on its own.
0:30:05 Right?
0:30:09 And it’s kind of this kind of new era of, of, of whatever, neural networks, machine learning,
0:30:10 right?
0:30:14 It’s just, you throw everything at it and let the machine figure it out.
0:30:16 We like to say we throw the kitchen sink at it sometimes.
0:30:17 Yes.
0:30:18 Yes.
0:30:22 I mean, and so when you were talking before about explainability, right, about sort of
0:30:28 saying in your output, here’s why we think it’s fake, I feel like that kind of throw everything
0:30:32 at it and let the machine figure it out makes it hard to, like sometimes you don’t know,
0:30:33 right?
0:30:39 The machine is very smart and it says this is probably fake, like is that, is that a tension?
0:30:40 That can happen.
0:30:44 So you look at it, we’ll show you an image and it’ll say the model was looking at certain
0:30:45 areas.
0:30:49 And by the way, this also helps us with debugging and bias, right?
0:30:55 Maybe it was like for some reason, looking at an area of the face that we wouldn’t tell
0:30:58 why would that set off the model.
0:31:03 And so in those scenarios, we also investigate, like why was this area flagged?
0:31:09 And it could be 100% correct, it’s just we do, we do have to examine it further.
0:31:13 Could you create a deep fake that would fool your deep fake detector?
0:31:14 Yes.
0:31:19 Well, if you could do it, somebody else could do it, don’t you think?
0:31:24 I could do it because I have access to a lot more knowledge, right?
0:31:28 Like, you know, I could, if I was running an anti-virus company, I could probably do
0:31:29 it.
0:31:32 I could probably write a virus if I knew exactly what it was looking for.
0:31:34 We’re constantly actually trying to do that by the way.
0:31:35 Yeah.
0:31:38 I mean, in a sense, that’s the whole adversarial network thing, right?
0:31:44 Like, I guess you have to do that for your detection models or your suite of models to
0:31:45 get better, right?
0:31:46 Yeah.
0:31:50 So we have what’s called red teaming, both black box and with full understanding of the code.
0:31:52 So we’re trying to break the models.
0:31:55 That’s part of what we do.
0:32:00 And so are there, like, evil geniuses at your company who can make killer deep fakes?
0:32:06 We definitely have geniuses, 100%, but we’re in the business of detection, right?
0:32:10 We don’t try to generate too much other than just for training the models.
0:32:11 Yes.
0:32:19 I mean, I have to think, like, there are many people in the world who want to make deep fakes
0:32:28 for many reasons, and they’re at different levels of technological sophistication, naively,
0:32:29 not knowing much about this.
0:32:33 I would think you can catch most of them, but if you have people who can beat your models,
0:32:39 I would imagine that, say, state actors, countries throwing billions of dollars at this, probably
0:32:42 also have people who could defeat your models.
0:32:43 Yeah.
0:32:47 I mean, that’s always the case with any cybersecurity company.
0:32:50 We are a cybersecurity company.
0:32:54 Every cybersecurity company does its best to defend, right?
0:32:57 But we do not promise 100%.
0:33:01 Our models are always a probability.
0:33:05 Who’s the best at making deep fakes that you’re aware of?
0:33:06 There’s a few, right?
0:33:10 There’s, like, Sora from OpenAI, there’s Runway, there’s Synthesia, there’s Claim.
0:33:12 Those you better be able to catch, right?
0:33:14 Anything I’ve heard of, you better be really good at detecting.
0:33:21 Presumably, it’s like some, like, Russian genius squad or, I don’t know, the North Koreans
0:33:22 or some things.
0:33:25 I would imagine it is some state-funded actor, but I don’t know.
0:33:31 I would actually say we’re in a place where this problem is getting bigger, but we’re
0:33:35 in a place where a lot of the deep fakes coming out are actually for entertainment and they’re
0:33:37 not used for evil.
0:33:42 You’ve seen the famous Tom Cruise one or other actors running around and doing things,
0:33:44 and those are deep fakes, right?
0:33:45 Those are actually pretty good.
0:33:47 We detect them, but they’re actually very good.
0:33:51 What are you thinking about in the context of the election in the US this year?
0:33:56 And do you have particular clients who are especially focused on election-related deep
0:33:57 fakes?
0:34:03 Yeah, the media companies are the main ones and we’re ready.
0:34:08 We detect the best deep fakes, right?
0:34:10 Everything that’s coming out, we detect.
0:34:18 So we’re ready and we want to make sure we’re there as one avenue of people verifying content.
0:34:24 I believe late last year there was an election in Slovakia where there was an audio of one
0:34:28 of the candidates saying he’s going to double the price of beer.
0:34:31 And that actually was a deep fake.
0:34:34 It was caught, but it kind of caused some damage.
0:34:35 So it’s starting to happen now.
0:34:37 It’s an awesomely stupid deep fake.
0:34:43 I mean, to me, the real risk of deep fakes is not people believing something that’s false.
0:34:47 It’s people ceasing to believe anything, right?
0:34:50 It’s just saying, oh, it’s probably just a deep fake, right?
0:34:55 Like that actually, to me, seems like the bigger risk is nothing is true anymore.
0:34:57 Nobody cares about the truth anymore.
0:35:01 That’s definitely a problem as well.
0:35:04 Now we’re seeing people saying, oh, this is a deep fake.
0:35:05 That’s actually happened.
0:35:10 I believe it was a Kate Middleton video, if I’m correct, that was earlier this year
0:35:13 where we all thought it was a deep fake and it wasn’t.
0:35:16 So this kind of problem is happening.
0:35:21 Because people want to believe things that are consistent with their prior beliefs and
0:35:25 they don’t want to believe things that call their prior beliefs into question, right?
0:35:30 And so deep fakes, in a way, are an easy out where if you see something you like, you assume
0:35:31 it’s true.
0:35:34 If you see something you don’t like, you assume it’s not true, or you assume everything’s
0:35:35 just kind of bullshit.
0:35:39 Like, that to me seems like a big kind of societal level risk of deep fakes.
0:35:41 We’ll never fix that.
0:35:43 That’s something that we’ll never solve.
0:35:45 People have their own beliefs.
0:35:49 You can show them anything, the facts, math.
0:35:50 That’s not going to fix the problem.
0:35:51 Yeah.
0:35:58 No, I guess that’s a human nature problem, not an AI problem.
0:36:13 We’ll be back in a minute with the Lightning Round.
0:36:17 Want to understand exactly how interest rate rises will impact your mortgage?
0:36:19 Or how New York City gets fresh produce?
0:36:24 Or exactly what on earth was going on over at FTX before the whole thing collapsed?
0:36:28 Twice a week, we sit down with the perfect guest to answer these sort of questions and
0:36:33 understand what’s going on with the biggest stories in finance, economics, business, and
0:36:34 markets.
0:36:35 I’m Tracy Alloway.
0:36:36 And I’m Joe Weisenthal.
0:36:39 And we are the hosts of Bloomberg’s Odd Lots podcast.
0:36:41 Look us up wherever you get your podcasts.
0:36:45 The Odd Lots podcast from Bloomberg.
0:36:53 Okay, let’s close with a Lightning Round.
0:36:54 Okay.
0:37:00 So, how often do people applying to work at Reality Defender use generative AI to write
0:37:01 cover letters?
0:37:03 Oh, that’s a good one.
0:37:04 Not a lot, but we’ve seen it for sure.
0:37:07 I would say maybe about 3%.
0:37:09 Okay.
0:37:14 If I want to use generative AI to write a cover letter to apply to work at Reality Defender,
0:37:18 but I don’t want to get caught, what should I do?
0:37:23 Change about 75% of the words.
0:37:26 Who is Gabe Regan?
0:37:32 Gabe was, I think he was our VP of public relations or something like that.
0:37:33 He’s a deep fake.
0:37:39 We created him as kind of a fun joke, but obviously we tell everyone he’s not real.
0:37:42 Tell me a little bit more about that.
0:37:49 If you go on certain websites where you put your photo and maybe your job experience,
0:37:54 there’s quite a large number of deep fake profiles on these websites.
0:37:55 Like LinkedIn?
0:37:57 Yes.
0:37:59 Huh.
0:38:00 Why?
0:38:01 Scammers.
0:38:02 Why would people be doing that?
0:38:03 Sorry?
0:38:04 Scammers.
0:38:07 I’m trying to think, how do you get money out of people by having a fake LinkedIn account?
0:38:08 Oh, I can tell you.
0:38:14 Let’s say you start, the most popular ones that I’m aware of is like cryptocurrency.
0:38:18 Maybe you create a coin and you’re like, here’s the CEO and here’s this person and they have
0:38:22 these great LinkedIn profiles, here’s their photo and they’re not real, but it sells
0:38:25 the story, right?
0:38:30 Is it right that you founded a clothing company?
0:38:32 I did, yes.
0:38:36 What’s one thing you learned about fashion from doing that?
0:38:39 It’s much different than software development.
0:38:40 Sure.
0:38:43 I don’t think you needed to start a company to learn that.
0:38:47 I mean, the marginal cost is not zero for one thing.
0:38:48 Yeah.
0:38:49 The software is easy.
0:38:50 You write some code.
0:38:55 It’s not easy at all, but what I mean is you’re writing some code and you ship it versus
0:38:56 in fashion.
0:38:59 You have to have, like, you got to source the fabric, you got to design it, you got to
0:39:07 make the patterns, you got to cut it, sew it, make sure it fits, it’s a lot more work.
0:39:10 What are the chances that we exist in a simulation?
0:39:13 You know, I used to think this was kind of a joke, but I don’t know.
0:39:20 I’m seeing every month it seems to get higher from my perspective.
0:39:22 Why do you say that?
0:39:27 I’m seeing what’s happening with tech and what we’re building, and you can see there
0:39:32 was one paper where they took a bunch of agents and they gave them all a job and they started
0:39:36 to do it and they just started to create their own workflows, right?
0:39:37 I don’t know.
0:39:39 Can we be getting there again?
0:39:44 So it’s like, well, if we can create a simulation that seems like reality, maybe someone created
0:39:46 a simulation that is our reality.
0:39:49 Exactly, yeah.
0:39:52 What do you wish more people understood about AI?
0:39:55 I mean, it’s a tool, and I don’t think people should be afraid of it.
0:40:01 They should embrace it, and, you know, there are people just running away from it.
0:40:02 It’s fantastic.
0:40:03 It’s great.
0:40:04 Embrace it.
0:40:05 Just be careful.
0:40:09 One thing I’d like to tell my friends and family, especially with the fake audio, have
0:40:10 a safe word.
0:40:16 If somebody calls you and you’re like, that’s weird, call them back or ask for a safe word.
0:40:21 What do you wish more people understood about reality?
0:40:24 About reality.
0:40:30 I would say just be aware that you exist and every day is a gift, so you should be excited
0:40:31 that you’re here.
0:40:36 And as for you existing, it’s like you’ve won the lottery a million times, so every
0:40:39 day is a gift.
0:40:46 Ali Shahriyari is the co-founder and CTO at Reality Defender.
0:40:49 Today’s show was produced by Gabriel Hunter-Chang.
0:40:54 It was edited by Lydia Jean-Cott and engineered by Sarah Bouguere.
0:40:58 You can email us at problem@pushkin.fm.
0:41:15 I’m Jacob Goldstein, and we’ll be back next week with another episode of What’s Your Problem.
0:41:19 Want to understand exactly how interest rate rises will impact your mortgage?
0:41:21 Or how New York City gets fresh produce?
0:41:26 Or exactly what on earth was going on over at FTX before the whole thing collapsed?
0:41:30 Twice a week, we sit down with the perfect guests to answer these sort of questions and
0:41:35 understand what’s going on with the biggest stories in finance, economics, business, and
0:41:36 markets.
0:41:37 I’m Tracy Alloway.
0:41:38 And I’m Joe Weisenthal.
0:41:41 And we are the hosts of Bloomberg’s Odd Lots podcast.
0:41:43 Look us up wherever you get your podcasts.
0:41:45 The Odd Lots podcast from Bloomberg.
0:41:47 (upbeat music)
0:41:50 (upbeat music)

As generative AI tools improve, it is becoming easier to digitally manipulate content and harder to tell when it has been tampered with. Today we are talking to someone on the front lines of this battle. Ali Shahriyari is the co-founder and CTO of Reality Defender. Ali’s problem is this: How do you build a set of models to distinguish between reality and AI-generated deepfakes?

See omnystudio.com/listener for privacy information.
