AI transcript
0:00:05 Hi, everyone. Welcome to the a16z podcast. I’m Sonal, and this is our sixth episode of 16
0:00:10 Minutes, our new news show, where we cover recent headlines of the week, the a16z way,
0:00:13 why they’re in the news, why they matter from our vantage point in tech,
0:00:17 and share our experts’ views on the trends involved as well. You can catch up on past
0:00:23 episodes at a16z.com/16minutes or subscribe to it as a separate feed in your favorite podcast
0:00:28 player app. This week, we have two episodes since we’ll be skipping next week. This episode covers
0:00:32 two other topics that came up recently. The Capital One breach: is it more of the same or
0:00:37 different? How does it fit into the seemingly endless string of corporate hacks? But first,
0:00:42 we dive deeper into recent news around healthcare claims data and insurance providers,
0:00:48 which sounds boring, but is apparently not. Okay, so the first segment is on recent news
0:00:53 that a number of big tech companies, including Amazon, Apple, Google, and Microsoft, are working
0:00:59 with insurance companies in order to help provide claims data to patients. This came about at a
0:01:04 developer conference hosted by a coalition called the CARIN Alliance, which is basically a bipartisan
0:01:10 multi-sector collaborative. That’s what they call themselves. It includes a lot of former
0:01:15 national health coordinators, the US’s first health information technology czar, the US’s first CTO,
0:01:19 the former secretary of health and human services, and a number of other people are working in this
0:01:23 alliance. But the key point of this, and just to summarize the news before I introduce our
0:01:29 a16z expert, is it’s the first time that healthcare providers are giving claims data to third-party
0:01:34 developers. So just to summarize also some of the stats around claims data. So apparently,
0:01:40 there are 4 billion prescription claims and 3 billion medical claims processed every year.
0:01:46 And then in terms of the cost to the healthcare industry, $315 billion is spent on healthcare
0:01:53 claims of which 35% is administrative waste or overhead. So now I’m going to introduce our
0:02:00 a16z expert, Julie Yoo, who is a former founder of a patient-provider matching startup and is a deal
0:02:05 partner on the a16z bio team who is focused on all things care delivery. Welcome, Julie.
0:02:08 Great to be here. I’m so excited to hear from you because honestly, claims data is the most
0:02:14 boring topic on the planet. Honestly. I think it’s very sexy. You do? Okay, tell me why. Tell
0:02:18 me why it matters. At a higher level, the notion of data liquidity in healthcare, it’s almost an
0:02:23 inevitability that it will eventually occur. But to see things like this is very exciting just
0:02:27 because it’s making it all real. When we think about the consumer angle, oftentimes what you hear
0:02:33 from consumers with regards to healthcare is it’s inconvenient. It’s slow and I never know what’s
0:02:38 going on and it’s opaque and I never understand what things cost and transparency. And I think
0:02:44 claims data specifically has a role to play in making all of that better. Why is that? What is
0:02:48 it about claims data specifically? Because when I think of claims data, I think I have so many
0:02:54 bills outstanding and late fees from labs because I go to the doctor’s office and the lab is a
0:02:57 separate office and they have two different billing systems and then I’m like, what the heck?
0:03:02 Yeah, exactly. So in very simple terms, claims are basically the invoices of healthcare. It’s
0:03:05 the equivalent of if you were to go to a mechanic and get something done to your car, you’d get a
0:03:09 list of the things that were done, how much they cost, et cetera. Now there are many different
0:03:15 types of claims data. You mentioned lab data. There’s medical claims data, so services rendered
0:03:20 by a physician or a nurse. Like in an office or hospital. Exactly, office visits, hospital visits.
0:03:25 There are prescription pharmacy claims data, so drugs that are prescribed to you. Those can also
0:03:31 be claims-based payments. And then there’s many different elements of claims data that we should
0:03:37 be aware of. One is that claims because they take so long to process can be in very different states
0:03:40 and depending on when you see a claim, you might see very, very different information.
0:03:45 What do you mean by that? Like specifically? Yeah. So the average length of time that it takes to
0:03:49 fully process a claim can be anywhere from two to three months, upwards of many, many months beyond
0:03:53 that. And the reason for that is that the process involves, first of all, a provider submitting
0:03:58 a claim to an insurance company saying here is what was done to this patient, Sonal, for these
0:04:03 specific line item services. And here’s what I think I should be paid. The insurance company then
0:04:07 receives it. And by the way, that process of just simply receiving it could take weeks because there’s
0:04:11 lots of middlemen, it could be a paper-based process, et cetera. And so then the insurance
0:04:15 company finally receives it. And then ultimately, they need to adjudicate it. They need to determine
0:04:20 whether or not this was a valid service per the contracts, per the benefit of the consumer,
0:04:24 and therefore how much am I actually going to pay and then how much am I going to leave to be paid
0:04:29 out of the pocket of the patient. And so that end-to-end process is very lengthy, involves many
0:04:35 multiple players, intermediaries. And depending on when you see that claim in that entire cycle,
0:04:39 as you can imagine, you might see very different information. So is it pre-adjudicated versus
0:04:45 post-adjudicated? Is it sort of coming from the provider side? Is it on the insurance side, et cetera?
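[Editor's sketch] The claim cycle described above can be illustrated with a small state model. This is a minimal sketch, not how any insurer actually stores claims: the class names, field names, and dollar amounts are all hypothetical, and the three states are a simplification of the submit, receive, and adjudicate steps mentioned in the conversation.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class ClaimState(Enum):
    SUBMITTED = "submitted"      # provider has sent the claim
    RECEIVED = "received"        # insurer has it, not yet adjudicated
    ADJUDICATED = "adjudicated"  # insurer has decided who pays what


@dataclass
class Claim:
    # Hypothetical fields, chosen only to mirror the conversation.
    billed_amount: float                  # what the provider thinks it should be paid
    state: ClaimState = ClaimState.SUBMITTED
    insurer_pays: Optional[float] = None  # unknown until adjudication
    patient_owes: Optional[float] = None  # unknown until adjudication

    def receive(self) -> None:
        self.state = ClaimState.RECEIVED

    def adjudicate(self, insurer_pays: float, patient_owes: float) -> None:
        if self.state is not ClaimState.RECEIVED:
            raise ValueError("claim must be received before adjudication")
        self.insurer_pays = insurer_pays
        self.patient_owes = patient_owes
        self.state = ClaimState.ADJUDICATED

    def view(self) -> dict:
        """What a reader of the claim can see at this point in the cycle."""
        return {
            "state": self.state.value,
            "billed_amount": self.billed_amount,
            "insurer_pays": self.insurer_pays,
            "patient_owes": self.patient_owes,
        }


# Made-up amounts, for illustration only.
claim = Claim(billed_amount=200.0)
early_view = claim.view()   # pre-adjudication: payment fields are still None
claim.receive()
claim.adjudicate(insurer_pays=120.0, patient_owes=30.0)
late_view = claim.view()    # post-adjudication: payment fields are filled in
```

The same claim yields two different views depending on when it is read, which is the point made in the conversation: an app pulling claims data has to know which stage of the cycle a record reflects.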
0:04:50 So for instance, one of the major things that insurance companies have been very reticent
0:04:56 around sharing is the full set of allowed amount data. And what that means is basically when a
0:05:02 physician submits a claim, they will put their sort of billed amount. But contractually, they
0:05:07 may have specific rates that are negotiated with the insurance companies. And releasing that into
0:05:11 the public would obviously negate any negotiating leverage that they might have with providers
0:05:16 in certain markets. That’s so helpful. I love the lay of the land, the context for how the claims
0:05:23 work, the process. So now back to the news. Why does it matter that developers can develop on
0:05:28 top of this? Like, it sounds like a very complicated system. People don’t have incentives to share,
0:05:32 to be transparent. Like, there’s already enough middlemen. What does it help to have third-party
0:05:36 developers in the middle here now? Yeah. Well, I think ultimately the beneficiary will be patients
0:05:40 because these companies will ultimately be developing apps that are consumer-facing.
0:05:45 And I think we can probably break down the benefits into two major buckets. One benefit is simply
0:05:50 financial transparency. You can imagine like a mint.com of healthcare finally being built. There’s
0:05:53 actually unfortunately sort of a graveyard of companies that have tried to do that in the past.
0:05:58 And some of the challenges that those companies have faced are around, you know, the lack of
0:06:02 liquidity of data. It’s funny that it’s a novel thing that, you know, all of a sudden the insurance
0:06:06 companies are going to sort of buy into making APIs available to consumers because that data
0:06:11 has existed in a B2B format for other companies. So the consumer thing is actually the real new
0:06:15 thing, because patients can benefit through those apps. There are actually multi-billion-dollar
0:06:21 businesses made on using claims data for other commercial purposes. One example that probably
0:06:25 not so many people outside of the industry know about is claims data are actually used very heavily
0:06:30 by pharmaceutical and life sciences companies. And the use case there is, you know, taking sort
0:06:35 of the claims exhaust from insurance. The data exhaust, right? Yep. And using that to inform
0:06:40 which providers see what types of patient populations. And that can be informative to
0:06:44 everything from clinical trial recruiting, who are the physicians who are most likely to have
0:06:49 patients where they can be recommended certain alternative therapies. And so that is a very
0:06:54 robust industry that’s existed for decades. And that is actually a very liquid use of claims data.
0:06:56 And what’s the other bucket? The other bucket is actually the health benefit.
0:07:00 From claims? Well, this is the Holy Grail. And I think one of the interesting things about this
0:07:04 announcement, it might be a nice-to-have, that information, you know, just an ability to see
0:07:09 what was done. But really, the Holy Grail is how can I move the needle on my health status?
0:07:15 A lot of companies have also tried to use claims to make health claims. And that pun is not intended
0:07:20 there. But there’s only so much you can see in a claim that can then imply what actually happened
0:07:24 from a health care perspective. So it’s the equivalent of trying to infer, you know,
0:07:28 how did the food taste based on seeing a restaurant receipt? So that’s really where
0:07:34 the need for medical record data becomes very necessary, right? The actual clinical context
0:07:38 around a given claim can tell you a lot more about the health status and therefore what you can do
0:07:42 to move the needle on that. If you compare claims to medical records, claims are sort of the broad
0:07:47 data that will give you a full, much more comprehensive set of information about what has
0:07:51 happened to you, whereas the medical record and the other types of data we described are sort of
0:07:55 the depth. So you get the breadth versus the depth as a baseline. You can provide sort of a
0:08:00 scaffold, let’s call it, of the journey that a patient has had around their health condition
0:08:04 at the starting point, which can then inform, okay, it looks like I should double click here
0:08:07 and understand what happened around this particular event. So bottom line it for me,
0:08:12 how should we think about this news in the broader context of health care and in particular where
0:08:16 claims really fit in? Yeah, I think we should be very optimistic that this is yet another sign that
0:08:21 data liquidity will be a basic piece of infrastructure within our industry. That said, I think this is
0:08:26 only step one of many steps to come around really getting a holistic understanding at the consumer
0:08:31 level about what’s going on with your health care. So if claims is like v1.0 of data liquidity,
0:08:36 you know, actual medical records can be v2.0, but really what I’m excited about and what I think
0:08:42 the whole industry is looking towards is the v3.0, which would be all of the sort of non-medical
0:08:47 but health related content that actually can matter to us as consumers that we can control
0:08:53 and use to drive our health outcomes. So it’s things like food. What is my social status?
0:08:58 We’re now recognizing more and more that things like social isolation are huge contributors to
0:09:05 negative effects on mental health and even things like GI issues have a huge mental health component.
0:09:08 And so literally understanding what my social interactions are could have a huge impact on
0:09:12 that. Well, thank you for joining the segment. Thank you so much. Okay. So the last segment
0:09:17 this week covers the Capital One data breach and how corporate hacks happen. So let me quickly
0:09:21 first summarize the news. First of all, Capital One is a financial services company and they
0:09:25 do lots of things, including provide credit cards. They’re considered one of the 10 largest banks in
0:09:30 terms of assets and they are the third largest credit card issuer and credit powers a lot of our
0:09:33 financial system today. So someone stealing that data is not so good. So what someone did is
0:09:38 hack into a server holding the personal records of over 100 million people. And here’s
0:09:44 what they stole. 140,000 social security numbers, 80,000 bank account numbers, 106 million credit
0:09:49 card applications from between 2015 to 2019, according to the company. It’s one of the largest
0:09:54 data thefts from a bank ever. The estimated cost right now is about $150 million. Just for
0:10:00 context, compare that to the Equifax credit bureau breach of 2017. That one exposed sensitive
0:10:05 information on over 147 million consumers, and it cost way more, about $650 million, and they just
0:10:10 settled claims for that about two weeks ago. So that’s a quick context for the news. Let me
0:10:15 introduce our a16z expert Joel de la Garza, who is our operating partner for security and was a former
0:10:20 CSO, chief security officer, at Box. He’s actually investigated a lot of breaches. He was responsible
0:10:25 for incident response for Deutsche Bank and Citigroup. And in his career has worked on
0:10:29 over 100,000 security incidents. Welcome Joel. Thank you. It’s good to be here.
0:10:33 So there’s a lot of data breaches. So I almost felt like, why are we even doing this as a news
0:10:38 item? It feels like it’s the same old story over and over again. What is different or unusual if
0:10:42 anything about this one? Well, the interesting thing about this one is that Capital One has
0:10:46 long been kind of the most, one of the most sophisticated, most secure adopters of cloud
0:10:51 technology. I think they were probably the first large financial service to actually move to using
0:10:55 cloud services. They really leaned into a lot of these technology trends and they’ve transformed
0:11:00 the way that they build and roll out their business. And so for someone who is so sophisticated
0:11:04 to have a breach of this magnitude happen to them on their new platform is actually quite
0:11:09 a shocker. And what actually happened just in the details is that apparently the hacker got in,
0:11:13 it was a 33-year-old hacker from Seattle, a software engineer. She got in through a firewall
0:11:17 misconfiguration, and they themselves, speaking of them being a leader in cloud services,
0:11:23 are on AWS, Amazon Web Services. But apparently the underlying cloud services were not compromised.
0:11:26 Can you give us some more details on how the hack happened? Well, I think that the indictments that
0:11:30 were released by the US government were fairly detailed, but they don’t provide all of the kind
0:11:35 of relevant points. And there’s been a lot of speculation about what the underlying causes
0:11:40 are. And folks are trying to make this sound similar to a lot of breaches and a lot of other
0:11:43 kind of scenarios that impacted other companies. And so I think people are filling in the blanks
0:11:48 and we still don’t really know the details. Right. But at a high level, it sounds like
0:11:52 there were some pretty sharp edges in the way that this cloud service provider’s configuration
0:11:57 for a product worked and the configuration was not set appropriately. And so that allowed for
0:12:02 an issue where internal services could be exploited, data could be exfiltrated.
0:12:05 It’s a fairly common occurrence that’s happened to a lot of companies this year.
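[Editor's sketch] To make the idea of a misconfiguration exposing internal services concrete, here is a small, hypothetical audit. The details of the actual breach were not public at the time of recording, so nothing below describes the real incident or any real vendor's product: the rule format, rule names, and the `exposed_internal_rules` helper are all invented for illustration. It only shows the general pattern of flagging a rule that lets any address reach something meant to be internal.

```python
import ipaddress

# Hypothetical rule format, invented for this illustration only.
rules = [
    {"name": "public-web", "source": "0.0.0.0/0", "port": 443, "internal": False},
    {"name": "admin-api", "source": "10.0.0.0/8", "port": 8443, "internal": True},
    {"name": "reporting-db", "source": "0.0.0.0/0", "port": 5432, "internal": True},
]


def exposed_internal_rules(rules):
    """Return names of rules that let any address reach an internal service."""
    flagged = []
    for rule in rules:
        network = ipaddress.ip_network(rule["source"])
        # A prefix length of 0 means the rule matches every address.
        open_to_everyone = network.prefixlen == 0
        if rule["internal"] and open_to_everyone:
            flagged.append(rule["name"])
    return flagged


flagged = exposed_internal_rules(rules)
```

Here only the hypothetical `reporting-db` rule is flagged: it is marked internal but accepts traffic from anywhere, while `public-web` is intentionally open and `admin-api` is restricted to a private range.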
0:12:09 You said that it’s actually sometimes hard to tell that the information hasn’t really come out.
0:12:12 And then one of the lines I used to love saying, and I continue to say when we talk about hacks,
0:12:16 is that attribution is hard. It’s hard to figure out who did it, who done it.
0:12:21 Well, it’s hard until it isn’t, right? I think with this situation, we’ve got a computer intruder
0:12:25 that was bragging about the activity that they had done. And they were engaged in several very
0:12:31 prominent hacker channels taking credit for their activity. And they actually had posted some of
0:12:36 Capital One’s data to a publicly available GitHub repository. Another security researcher was out
0:12:41 there looking through GitHub repos and found Capital One’s data and turned around and reported
0:12:45 it to Capital One. Well, the thing that’s funny to me, even though it shouldn’t be funny, the FBI
0:12:50 noticed her activity on Meetup and she posted comments on Twitter and Slack of all places.
0:12:53 How does this fit in the overall taxonomy of corporate breaches? We keep hearing about one
0:12:57 every year. Target, Equifax, the list goes on and on. So in the old days, when things were
0:13:01 predominantly on-prem, people were running their own servers. On-premises, right? On-premises.
0:13:05 As opposed to software as a service or cloud-based. As opposed to SaaS or cloud or whatever the case
0:13:10 may be. In those days, breaches typically happened because software patches weren’t applied
0:13:14 or someone gave away their username and password. And that was kind of how we got most of the large
0:13:19 breaches. Equifax was the result of a software patch that hadn’t been applied that allowed the
0:13:23 hackers to get into the network and exfiltrate the data. As we’ve moved to the cloud world,
0:13:27 we’ve gotten out of the need to patch a lot of this stuff, right? Cloud solves a lot of these
0:13:32 problems. The number one source of breaches now seems to be misconfiguration. That’s something
0:13:35 we’ve been noticing for the last couple of years. It’s actually one of the forecasts that we made
0:13:40 earlier this year looking at all the data that these kinds of configuration issues will be the
0:13:44 things that drive breaches into the future. And with this category of configuration issues,
0:13:47 what does that mean? Just in this case, it was supposedly firewall misconfiguration.
0:13:50 Why wouldn’t a cloud service provider just set it all universally for everyone?
0:13:56 So they are in the process. Amazon, Google, and Microsoft have made a
0:14:03 lot of attempts to make a lot of these tools easier, more intuitive, and just faster to deploy.
0:14:07 But one of the challenges is when you’re a large cloud service provider, you’re trying to hit the
0:14:12 right balance between safety and security, ease of use, and features, right? And I think what generally
0:14:16 happens with security in the cloud world, having come from Box and worked through a lot of these
0:14:20 problems, is that you have to find that balance between ease of use and security.
0:14:24 So a high-level understanding of how this breach occurred is that there was some kind
0:14:29 of a misconfiguration on a web application firewall provided by a cloud vendor. Now,
0:14:33 none of the products are specified. None of the vendors are specified. But there was basically
0:14:39 a misconfiguration on a web application firewall that somehow exposed internal resources, right?
0:14:44 Resources that were not supposed to be available to the public. And this person found those internal
0:14:49 resources, was able to access them from the outside, and then exploit them in such a way
0:14:53 that they were able to take more data from them. Right. Isn’t that pretty common, if I recall,
0:14:58 because I covered the AT&T breach back in 2012. I even edited an op-ed from weev, of all people,
0:15:03 but isn’t that what he did, kind of hacked into the AT&T thing? Yeah, this is fairly common. I mean,
0:15:08 the traditional way that you would build these sorts of applications and build this infrastructure
0:15:13 is that you have transitive trust relationships. So it’s the idea that I’ve got this hard perimeter,
0:15:18 this really solid firewall and perimeter that keeps people from getting in. And inside that
0:15:23 perimeter, it tends to get softer, right? So services are available. Things can talk to each
0:15:26 other that you wouldn’t want to have happening on the internet. Which enables collaboration and
0:15:29 people to work together. Which enables collaboration, rapid deployment, all sorts of really interesting
0:15:34 things. However, the moment you breach that perimeter, you expose these internal services
0:15:38 and your data can be exfiltrated, right? And so the move now, and generally in the industry,
0:15:42 people are moving towards what’s called a zero trust approach, which is to remove the concept
0:15:48 of having this perimeter, remove the concept of a firewall, just assume there’s no trust in any
0:15:51 environment and build your services according to that kind of a model. But won’t that be hard for
0:15:56 people wanting to collaborate and balance all the security usability convenience that you talked
0:16:00 about? I mean, you could argue that actually, once you start to build security in, you actually make
0:16:06 it a design requirement along with ease of use, along with interoperability, that you can actually
0:16:09 make sure that all those requirements are met. And if you look at really successful software
0:16:15 companies now, they start with security built in until security is baked in hard. What can companies
0:16:20 do to protect themselves? The first is obviously choosing quality vendors, making sure that you
0:16:24 appropriately vet your third parties. I think in this specific case, right? How do you prevent
0:16:29 these sorts of things from happening? The fact that so many people were able to infer exactly
0:16:33 what happened in this specific breach means that it’s been happening out in the community for
0:16:38 some time. And this just further supports the case that we need to have better collaboration
0:16:42 around information security, better collaboration around security breaches, that things like this,
0:16:47 that these attack vectors are shared throughout the community and that people can take the
0:16:51 appropriate steps to protect themselves. So bottom line it for me, Joel, how should we think about
0:16:54 this Capital One breach and the context of all the other breaches and the taxonomy of breaches?
0:17:00 Well, I think the unfortunate truth is that if something like this can happen to Capital One,
0:17:03 who’s probably one of the best in the business, it can happen to anyone. And so we’re going to
0:17:07 continue to see this kind of an activity. We’re going to continue to see these breaches. And we’re
0:17:11 going to have to really think hard about who we as a people want to protect our data and want
0:17:14 to think about data privacy. Fantastic. Well, thank you for joining this segment.
0:00:10 Minutes, our new news show, where we cover recent headlines of the week, the A6nZ way,
0:00:13 why they’re in the news, why they matter from our vantage point in tech,
0:00:17 and share our experts’ views on the trends involved as well. You can catch up on past
0:00:23 episodes at a6nz.com/16minutes or subscribe to it as a separate feed in your favorite podcast
0:00:28 player app. This week, we have two episodes since we’ll be skipping next week. This episode covers
0:00:32 two other topics that came up recently. The Capital One Preach, is it more of the same or
0:00:37 different? How does it fit into the seemingly endless string of corporate hacks? But first,
0:00:42 we dive deeper into recent news around healthcare claims data and insurance providers,
0:00:48 which sounds boring, but is apparently not. Okay, so the first segment is on recent news
0:00:53 that a number of big tech companies, including Amazon, Apple, Google, and Microsoft, are working
0:00:59 with insurance companies in order to help provide claims data to patients. This came about at a
0:01:04 developer conference hosted by a coalition called the Karin Alliance, which is basically a bipartisan
0:01:10 multi-sector collaborative. That’s what they call themselves. It includes a lot of former
0:01:15 national health coordinators, the US’s first health information technologies are, the US’s first CTO,
0:01:19 the former secretary of health and human services, and a number of other people are working in this
0:01:23 alliance. But the key point of this, and just to summarize the news before I introduce our
0:01:29 A66 expert, is it’s the first time that healthcare providers are giving claims data to third-party
0:01:34 developers. So just to summarize also some of the stats around claims data. So apparently,
0:01:40 there are 4 billion prescription claims and 3 billion medical claims processed every year.
0:01:46 And then in terms of the cost to the healthcare industry, $315 billion is spent on healthcare
0:01:53 claims of which 35% is administrative waste or overhead. So now I’m going to introduce our
0:02:00 A66 expert, Julie Yu, who is a former founder of a patient provider matching startup and is a deal
0:02:05 partner on the A66 bio team who is focused on all things care delivery. Welcome, Julie.
0:02:08 Great to be here. I’m so excited to hear from you because honestly, claims data is the most
0:02:14 boring ethnic on the planet. Honestly. I think it’s very sexy. You do because it was a why. Tell
0:02:18 me why it matters. At a higher level, the notion of data liquidity in healthcare, it’s almost an
0:02:23 inevitability that it will eventually occur. But to see things like this is very exciting just
0:02:27 because it’s making it all real. When we think about the consumer angle, oftentimes what you hear
0:02:33 from consumers with regards to healthcare is it’s inconvenient. It’s slow and I never know what’s
0:02:38 going on and it’s opaque and I never understand what things cost and transparency. And I think
0:02:44 claims data specifically has a role to play in making all of that better. Why is that? What is
0:02:48 it about claims data specifically? Because when I think of claims data, I think I have so many
0:02:54 bills outstanding and late fees from labs because I go to the doctor’s office and the lab is a
0:02:57 separate office and they have two different billing systems and then I’m like, what the heck?
0:03:02 Yeah, exactly. So in very simple terms, claims are basically the invoices of healthcare. It’s
0:03:05 the equivalent of if you were to go to a mechanic and get something done to your car, you’d get a
0:03:09 list of the things that were done, how much they cost, et cetera. Now there are many different
0:03:15 types of claims data. You mentioned lab data. There’s medical claims data, so services rendered
0:03:20 by a physician or a nurse. Like in an office or hospital. Exactly, office visits, hospital visits.
0:03:25 There are prescription pharmacy claims data, so drugs that are prescribed to you. Those can also
0:03:31 be claims-based payments. And then there’s many different elements of claims data that we should
0:03:37 be aware of. One is that claims because they take so long to process can be in very different states
0:03:40 and depending on when you see a claim, you might see very, very different information.
0:03:45 What do you mean by that? Like specifically? Yeah. So the average length of time that it takes to
0:03:49 fully process a claim can be anywhere from two to three months, upwards of many, many months beyond
0:03:53 that. And the reason for that is that the process involves, first of all, a provider submitting
0:03:58 a claim to an insurance company saying here is what was done to this patient’s sonal for these
0:04:03 specific line item services. And here’s what I think I should be paid. The insurance company then
0:04:07 receives it. And by the way, that process of just simply receiving it could take weeks because there’s
0:04:11 lots of middlemen, it could be a paper-based process, et cetera. And so then the insurance
0:04:15 company finally receives it. And then ultimately, they need to adjudicate it. They need to determine
0:04:20 whether or not this was a valid service per the contracts, per the benefit of the consumer,
0:04:24 and therefore how much am I actually going to pay and then how much am I going to leave to be paid
0:04:29 out of the pocket of the patient. And so that end-to-end process is very lengthy, involves many
0:04:35 multiple players, intermediaries. And depending on when you see that claim in that entire cycle,
0:04:39 as you can imagine, you might see very different information. So is it pre-adjudicated versus
0:04:45 post-adjudicated? Is it sort of coming from the provider side? Is it on the insurance side, et cetera?
0:04:50 So for instance, one of the major things that insurance companies have been very reticent
0:04:56 around sharing is the full set of allowed amount data. And what that means is basically when a
0:05:02 physician submits a claim, they will put their sort of billed amount. But contractually, they
0:05:07 may have specific rates that are negotiated with the insurance companies. And releasing that into
0:05:11 the public would obviously negate any negotiating leverage that they might have with providers
0:05:16 in certain markets. That’s so helpful. I love the lay of the land, the context for how the claims
0:05:23 work, the process. So now back to the news. Why does it matter that developers can develop on
0:05:28 top of this? Like, it sounds like a very complicated system. People don’t have incentives to share,
0:05:32 to be transparent. Like, there’s already enough middlemen. What does it help to have third-party
0:05:36 developers in the middle here now? Yeah. Well, I think ultimately the beneficiary will be patients
0:05:40 because these companies will ultimately be developing apps that are consumer-facing.
0:05:45 And I think we can probably break down the benefits into two major buckets. One benefit is simply
0:05:50 financial transparency. You can imagine like a mint.com of healthcare finally being built. There’s
0:05:53 actually unfortunately sort of a graveyard of companies that have tried to do that in the past.
0:05:58 And some of the challenges that those companies have faced are around, you know, the lack of
0:06:02 liquidity of data. It’s funny that it’s a novel thing that, you know, all of a sudden the insurance
0:06:06 companies are going to sort of buy into making APIs available to consumers because that data
0:06:11 has existed in a B2B format for other companies. So the consumer thing is actually the real new
0:06:15 thing because that patients can benefit through those apps. There are actually multi-billion-dollar
0:06:21 businesses made on using claims data for other commercial purposes. One example that probably
0:06:25 not so many people outside of the industry know about is claims data are actually used very heavily
0:06:30 by pharmaceutical and life sciences companies. And the use case there is, you know, taking sort
0:06:35 of the claims exhaust from insurance. The data exhaust, right? Yep. And using that to inform
0:06:40 which providers see what types of patient populations. And that can be informative to
0:06:44 everything from clinical trial recruiting, who are the physicians who are most likely to have
0:06:49 patients where they can be recommended certain alternative therapies. And so that is a very
0:06:54 robust industry that’s existed for decades. And that is actually a very liquid use of claims data.
0:06:56 And what’s the other bucket? The other bucket is actually the health benefit.
0:07:00 From claims? Well, this is the Holy Grail. And I think one of the interesting things about this
0:07:04 announcement, it might be a nice to have that information, you know, just an ability to see
0:07:09 what was done. But really, the Holy Grail is how can I move the needle on my health status?
0:07:15 A lot of companies have also tried to use claims to make health claims. And that is not intended
0:07:20 there. But there’s only so much you can see in a claim that can then imply what actually happened
0:07:24 from a health care perspective. So it’s the equivalent of trying to infer, you know,
0:07:28 how did the food taste based on seeing a restaurant receipt? So that’s really where
0:07:34 the need for medical record data becomes very necessary, right? The actual clinical context
0:07:38 around a given claim can tell you a lot more about the health status and therefore what you can do
0:07:42 to move the needle on that. If you compare claims to medical records, claims are sort of the breadth
0:07:47 data that will give you a full, much more comprehensive set of information about what has
0:07:51 happened to you, whereas the medical record and the other types of data we described are sort of
0:07:55 the depth. So you get the breadth versus the depth as a baseline. You can provide sort of a
0:08:00 scaffold, let’s call it, of the journey that a patient has had around their health condition
0:08:04 at the starting point, which can then inform, okay, it looks like I should double click here
0:08:07 and understand what happened around this particular event. So bottom line it for me,
0:08:12 how should we think about this news in the broader context of health care and in particular where
0:08:16 claims really fit in? Yeah, I think we should be very optimistic that this is yet another sign that
0:08:21 data liquidity will be a basic piece of infrastructure within our industry. That said, I think this is
0:08:26 only step one of many steps to come around really getting a holistic understanding at the consumer
0:08:31 level about what’s going on with your health care. So if claims is like v1.0 of data liquidity,
0:08:36 you know, actual medical records can be v2.0, but really what I’m excited about and what I think
0:08:42 the whole industry is looking towards is the v3.0, which would be all of the sort of non-medical
0:08:47 but health related content that actually can matter to us as consumers that we can control
0:08:53 and use to drive our health outcomes. So it’s things like food. What is my social status?
0:08:58 We’re now recognizing more and more that things like social isolation are huge contributors to
0:09:05 negative effects on mental health and even things like GI issues have a huge mental health component.
0:09:08 And so literally understanding what my social interactions are could have a huge impact on
0:09:12 that. Well, thank you for joining the segment. Thank you so much. Okay. So the last segment
0:09:17 this week covers the Capital One data breach and how corporate hacks happen. So let me quickly
0:09:21 first summarize the news. First of all, Capital One is a financial services company and they
0:09:25 do lots of things, including provide credit cards. They’re considered one of the 10 largest banks in
0:09:30 terms of assets and they are the third largest credit card issuer and credit powers a lot of our
0:09:33 financial system today. So someone stealing that data is not so good. So what someone did is
0:09:38 hack into a server holding the personal records of over 100 million people. And here’s
0:09:44 what they stole. 140,000 social security numbers, 80,000 bank account numbers, 106 million credit
0:09:49 card applications from 2015 to 2019, according to the company. It’s one of the largest
0:09:54 data thefts from a bank ever. Its estimated cost right now is about 150 million. Just for
0:10:00 context, compare that to the Equifax credit bureau breach of 2017. That one exposed sensitive
0:10:05 information on over 147 million consumers, and it cost way more, about 650 million, and they just
0:10:10 settled claims for that about two weeks ago. So that’s a quick context for the news. Let me
0:10:15 introduce our a16z expert Joel de la Garza, who is our operating partner for security and was formerly
0:10:20 CSO, Chief Security Officer, at Box. He’s actually investigated a lot of breaches. He was responsible
0:10:25 for incident response for Deutsche Bank and Citigroup. And in his career has worked on
0:10:29 over 100,000 security incidents. Welcome Joel. Thank you. It’s good to be here.
0:10:33 So there’s a lot of data breaches. So I almost felt like, why are we even doing this as a news
0:10:38 item? It feels like it’s the same old story over and over again. What is different or unusual if
0:10:42 anything about this one? Well, the interesting thing about this one is that Capital One has
0:10:46 long been kind of the most, one of the most sophisticated, most secure adopters of cloud
0:10:51 technology. I think they were probably the first large financial service to actually move to using
0:10:55 cloud services. They really leaned into a lot of these technology trends and they’ve transformed
0:11:00 the way that they build and roll out their business. And so for someone who is so sophisticated
0:11:04 to have a breach of this magnitude happen to them on their new platform is actually quite
0:11:09 a stunner. And what actually happened just in the details is that apparently the hacker got in,
0:11:13 it was a 33-year-old hacker from Seattle, a software engineer. She got in through a firewall
0:11:17 misconfiguration and they themselves are, speaking of them being a leader in cloud services,
0:11:23 on AWS, Amazon Web Services. But apparently the underlying cloud services were not compromised.
0:11:26 Can you give us some more details on how the hack happened? Well, I think that the indictments that
0:11:30 were released by the US government were fairly detailed, but they don’t provide all of the kind
0:11:35 of relevant points. And there’s been a lot of speculation about what the underlying causes
0:11:40 are. And folks are trying to make this sound similar to a lot of breaches and a lot of other
0:11:43 kind of scenarios that impacted other companies. And so I think people are filling in the blanks
0:11:48 and we still don’t really know the details. Right. But at a high level, it sounds like
0:11:52 there were some pretty sharp edges in the way that this cloud service provider’s configuration
0:11:57 for a product worked and the configuration was not set appropriately. And so that allowed for
0:12:02 an issue where internal services could be exploited, data could be exfiltrated.
0:12:05 It’s a fairly common occurrence that’s happened to a lot of companies this year.
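The kind of configuration issue described here, where a rule that was not set appropriately leaves internal services reachable from outside, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the `Rule` shape, the port numbers, and the `find_exposed_rules` helper are hypothetical and do not represent any cloud vendor's format or the configuration involved in this breach, whose details were not specified.

```python
import ipaddress
from dataclasses import dataclass


@dataclass
class Rule:
    """A hypothetical inbound firewall rule (not any vendor's real format)."""
    source_cidr: str   # who is allowed to connect
    port: int          # which port they may reach
    description: str


# Assumed list of ports that should only ever be reachable internally.
INTERNAL_PORTS = {5432, 6379, 8500}


def find_exposed_rules(rules, internal_ports=INTERNAL_PORTS):
    """Return rules that open an internal-only port to the whole internet."""
    exposed = []
    for rule in rules:
        network = ipaddress.ip_network(rule.source_cidr)
        # A prefix length of 0 (0.0.0.0/0 or ::/0) matches every address.
        open_to_world = network.prefixlen == 0
        if open_to_world and rule.port in internal_ports:
            exposed.append(rule)
    return exposed


rules = [
    Rule("0.0.0.0/0", 443, "public HTTPS"),
    Rule("10.0.0.0/8", 5432, "database, internal only"),
    Rule("0.0.0.0/0", 8500, "internal admin API, opened by mistake"),
]

flagged = find_exposed_rules(rules)
for rule in flagged:
    print(f"exposed: port {rule.port} ({rule.description})")
```

A check like this only catches the simplest case, a single rule that is open to everyone. Real misconfigurations often involve several settings that are each reasonable on their own.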
0:12:09 You said that it’s actually sometimes hard to tell that the information hasn’t really come out.
0:12:12 And then one of the lines I used to love saying, and I continue to say when we talk about hacks,
0:12:16 is that attribution is hard. It’s hard to figure out who did it, who done it.
0:12:21 Well, it’s hard until it isn’t, right? I think with this situation, we’ve got a computer intruder
0:12:25 that was bragging about the activity that they had done. And they were engaged in several very
0:12:31 prominent hacker channels taking credit for their activity. And they actually had posted some of
0:12:36 Capital One’s data to a publicly available GitHub repository. Another security researcher was out
0:12:36 there looking through GitHub repos and found Capital One’s data and turned around and reported
0:12:45 it to Capital One. Well, the thing that’s funny to me, even though it shouldn’t be funny, the FBI
0:12:50 noticed her activity on Meetup and she posted comments on Twitter and Slack of all places.
0:12:53 How does this fit in the overall taxonomy of corporate breaches? We keep hearing about one
0:12:57 every year. Target, Equifax, the list goes on and on. So in the old days, when things were
0:13:01 predominantly on-prem, people were running their own servers. On-premises, right? On-premises.
0:13:05 As opposed to software as a service or cloud-based. As opposed to SaaS or cloud or whatever the case
0:13:10 may be. In those days, breaches typically happened because software patches weren’t applied
0:13:14 or someone gave away their username and password. And that was kind of how we got most of the large
0:13:19 breaches. Equifax was the result of a software patch that hadn’t been applied that allowed the
0:13:23 hackers to get into the network and exfiltrate the data. As we’ve moved to the cloud world,
0:13:27 we’ve gotten out of the need to patch a lot of this stuff, right? Cloud solves a lot of these
0:13:32 problems. The number one source of breaches now seems to be misconfiguration. That’s something
0:13:35 we’ve been noticing for the last couple of years. It’s actually one of the forecasts that we made
0:13:40 earlier this year looking at all the data that these kinds of configuration issues will be the
0:13:44 things that drive breaches into the future. And with this category of configuration issues,
0:13:47 what does that mean? Just in this case, it was supposedly firewall misconfiguration.
0:13:50 Why wouldn’t a cloud service provider just set it all universally for everyone?
0:13:50 So they are in the process. Amazon, Google, and Microsoft have made a
0:13:56 lot of attempts to make a lot of these tools easier, more intuitive, and just more rapid to deploy.
0:14:07 But one of the challenges is when you’re a large cloud service provider, you’re trying to hit the
0:14:12 right balance between safety and security, ease of use, and features, right? And I think what generally
0:14:12 happens with security in the cloud world, having come from Box and worked through a lot of these
0:14:20 problems, is that you have to find that balance between ease of use and security.
0:14:24 So a high-level understanding of how this breach occurred is that there was some kind
0:14:29 of a misconfiguration on a web application firewall provided by a cloud vendor. Now,
0:14:33 none of the products are specified. None of the vendors are specified. But there was basically
0:14:39 a misconfiguration on a web application firewall that somehow exposed internal resources, right?
0:14:44 Resources that were not supposed to be available to the public. And this person found those internal
0:14:49 resources, was able to access them from the outside, and then exploit them in such a way
0:14:53 that they were able to take more data from them. Right. Isn’t that pretty common, if I recall,
0:14:53 because I covered the AT&T breach back in 2012. I even edited an op-ed from weev, of all people,
0:14:58 but isn’t that what he did, kind of hacked into the AT&T thing? Yeah, this is fairly common. I mean,
0:15:08 the traditional way that you would build these sorts of applications and build this infrastructure
0:15:13 is that you have transitive trust relationships. So it’s the idea that I’ve got this hard perimeter,
0:15:18 this really solid firewall and perimeter that keeps people from getting in. And inside that
0:15:23 perimeter, it tends to get softer, right? So services are available. Things can talk to each
0:15:26 other that you wouldn’t want to have happening on the internet. Which enables collaboration and
0:15:29 people to work together. Which enables collaboration, rapid deployment, all sorts of really interesting
0:15:34 things. However, the moment you breach that perimeter, you expose these internal services
0:15:34 and your data can be exfiltrated, right? And so the move now, and generally in the industry,
0:15:42 people are moving towards what’s called a zero trust approach, which is to remove the concept
0:15:48 of having this perimeter, remove the concept of a firewall, just assume there’s no trust in any
0:15:51 environment and build your services according to that kind of a model. But won’t that be hard for
0:15:56 people wanting to collaborate and balance all the security usability convenience that you talked
0:16:00 about? I mean, you could argue that actually, once you start to build security in, you actually make
0:16:06 it a design requirement along with ease of use, along with interoperability, that you can actually
0:16:09 make sure that all those requirements are met. And if you look at really successful software
0:16:15 companies now, they start with security built in until security is baked in hard. What can companies
0:16:20 do to protect themselves? The first is obviously choosing quality vendors, making sure that you
0:16:24 appropriately vet your third parties. I think in this specific case, right? How do you prevent
0:16:29 these sorts of things from happening? The fact that so many people were able to infer exactly
0:16:33 what happened in this specific breach means that it’s been happening out in the community for
0:16:33 some time. And this just further supports the case that we need to have better collaboration
0:16:38 around information security, better collaboration around security breaches, so that things like this,
0:16:42 these attack vectors, are shared throughout the community and that people can take the
0:16:51 appropriate steps to protect themselves. So bottom line it for me, Joel, how should we think about
0:16:51 this Capital One breach in the context of all the other breaches and the taxonomy of breaches?
0:17:00 Well, I think the unfortunate truth is that if something like this can happen to Capital One,
0:17:03 who’s probably one of the best in the business, it can happen to anyone. And so we’re going to
0:17:07 continue to see this kind of an activity. We’re going to continue to see these breaches. And we’re
0:17:11 going to have to really think hard about who we as a people want to protect our data and want
0:17:14 to think about data privacy. Fantastic. Well, thank you for joining this segment.
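The contrast drawn in this segment between a hard perimeter with a soft interior and a zero trust approach can be sketched as two access checks. This is a minimal illustration under stated assumptions, not a description of any company's architecture: the `Request` type, the `10.` internal address prefix, the token names, and the `TOKEN_GRANTS` table are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Request:
    source_ip: str          # where the call appears to come from
    token: Optional[str]    # credential presented by the caller, if any
    resource: str           # what the caller wants to read


# Assumption: addresses starting with "10." count as inside the perimeter.
INTERNAL_PREFIX = "10."

# Assumption: each token is granted access to an explicit set of resources.
TOKEN_GRANTS = {
    "token-underwriting": {"card-applications"},
    "token-reporting": {"monthly-reports"},
}


def perimeter_allows(request: Request) -> bool:
    """Perimeter model: anything already inside the network is trusted."""
    return request.source_ip.startswith(INTERNAL_PREFIX)


def zero_trust_allows(request: Request) -> bool:
    """Zero trust model: location is ignored; every call must prove itself."""
    if request.token is None:
        return False
    return request.resource in TOKEN_GRANTS.get(request.token, set())


# An intruder who got past the firewall and calls from an internal address.
intruder = Request("10.0.3.7", None, "card-applications")
# A legitimate service calling from outside with a valid credential.
service = Request("203.0.113.9", "token-underwriting", "card-applications")

print("perimeter lets intruder in:", perimeter_allows(intruder))
print("zero trust lets intruder in:", zero_trust_allows(intruder))
print("zero trust lets service in:", zero_trust_allows(service))
```

In the perimeter check, breaching the firewall is enough to reach the data. In the zero trust check, the same intruder is refused because being inside the network carries no privilege.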
with @julesyoo @smc90
This is episode #6 of our new show, 16 Minutes, where we quickly cover recent headlines of the week, the a16z way — why they’re in the news; why they matter from our vantage point in tech — and share our experts’ views on these trends as well.
This week we cover, with the following a16z experts:
- health claims, insurance & big tech, and healthcare data liquidity — with a16z bio partner Julie Yoo;
- Capital One data breach, cloud security, and corporate hacks — with a16z operating partner for security Joel de la Garza;
…hosted by Sonal Chokshi.