Episode 23

Insurance, the O.G. Data Business

From inferring the presence of a swimming pool based on seemingly unrelated factors, to the unexpected ways a Yelp review or LinkedIn account can have an effect on your policy decision, to predictions on the future of data and insuretech: this episode of the Orbit podcast pits Hg’s Max Dewez against Carpe Data CEO Max Drucker on the eve of 2022’s edition of InsureTech Connect in Las Vegas.

Episode Transcript

Max Dewez

Welcome to Orbit, the Hg Podcast series where we speak to leaders and innovators from across the software and tech ecosystem to discuss the key trends that change how we all do business.

My name is Max Dewez. I’m a Director at Hg where I spearhead our North American investing efforts in financial end markets, including insurance.

As such, I’m thrilled to be joined today by my fellow Max D: Max Drucker, the Founder and CEO of Carpe Data, a company that is at the forefront of using emerging data to improve outcomes across the insurance cycle. Max, it’s great to see you again.

Max Drucker

Good to see you as well.

Max Dewez

Thank you very much for doing this today, I know you’re a busy guy. There’s lots that we’d love to cover on all things data, insurance, and the applications thereof. But for our listeners, we’d love to just get the headline intro to your background and what you guys are working on at Carpe Data.

Max Drucker

Well, thank you. I’m looking forward to this conversation.

Carpe Data is an alternative data company specifically for insurers. We look at big problems across the insurance life cycle and spectrum, across various lines of business, and we look to identify new data sources and new data elements to improve insurance outcomes. Generally, what that means is that basically everything we do breaks down to one of two categories:

It’s either to enable/unlock automation – so that’s skipping manual steps in the process, from underwriting to the claims process – or, as we describe it, improving insurance outcomes, and that can be something such as predicting loss, but also things such as pricing, paying the right amount for a claim, reducing cycle time etc. So again, everything we do is about identifying those use cases to really move the needle for the carrier. There’s still so much tremendous opportunity with data to enable these automated decisions, to enable selection improvements, to effectively give carriers a different window, a different insight into risks than they’ve previously had with the traditional data sources.

Max Dewez

Got it. And how did you get into that? Have you always been an insurance guy?

Max Drucker

I consider myself to be something of an insuretech dinosaur. I mean, they can’t see me on here, but I have white hair. I began my career out of college working at Apple, so I came up from the technical side. That was part of the 1.0 era, and I was recruited in to be a founder of E-Coverage, which was the first online auto insurance carrier.

My claim to fame in my life, and my greatest accomplishment, is that I was part of the team that built the first web-based online insurance platform. We sold the first auto insurance policy on the Internet, and that was E-Coverage. So it just shows you how old I am!

That was a company that was backed by E-Trade and Softbank, and that was that era. There was no software, there was no web-based software whatsoever. I mean, it literally did not exist. There was no Guidewire. There was no Duck Creek or any of these other companies at the time, so we really had to build our own completely, because carriers were largely using mainframes, some client-server, but there wasn’t really anything out there.

We had to build the end-to-end policy processing software. We really learned a lot about insurance. That was really kind of the original boom of insuretech at the time.

Now, E-Coverage then blew up in brilliant glory, right? It’s why no one’s ever heard of E-Coverage. So we left, in those days, a very, very big crater in the ground. But out of that I had learned that I didn’t want to build an insurance company, because we clearly weren’t very good at that. But maybe making technology for insurance companies seemed like a decent idea. So I started a policy admin core systems company called Steel Card that would effectively give insurers the ability to sell insurance online.

We were really at the very early stages of being able to provide internet quoting, point of sale policy issuance, first notice of loss online… Then I ultimately sold that business to Insurity, which was owned by ChoicePoint at that time, and so very much part of the data world: ChoicePoint is now LexisNexis Insurance Solutions.

So I was very much a part of how data impacted or was critical towards how insurers do business. So that’s my background: it’s understanding and experiencing from quoting to underwriting, to issuance, to management and the incorporation of data and all that.

So, after a very, very short stint at Insurity, I took a little time, and then we got back at this. The insurance software side, implementations and all that stuff, is very important, but also very, very painful, very messy. Data just seemed to be the big opportunity, where there really hadn’t been a lot of real innovation or new players in the space for, literally, decades.

Max Dewez

Which is surprising, right? If you think about it, insurance is kind of the original big data industry. Back in the UK and Europe in the seventeenth century, they’re putting together mortality tables…

Max Drucker

You’re stealing my line, Max! That’s what I’ve been saying, that insurers are the O.G. data companies. They’ve been predicting outcomes for hundreds of years, right? They’ve been predicting death forever, that’s what they do. They take data, they predict outcomes, that’s what they do.

Max Dewez

Exactly, so again, being able to understand data, to analyze data. It’s at the heart of the business. To your point, you had seen a situation of not a lot of innovation for decades. So, what’s changed? It’s 2022, what is it about this moment that means that we’re ready for dramatic change and new applications of data?

Max Drucker

I think there’s a few components of it, so maybe I’ll break it down into three major trends that make this possible.

The first is the ubiquity of the Internet. There’s so much more data out there today than there was certainly two decades ago. Facebook didn’t even exist two decades ago. Yelp didn’t exist two decades ago, right? Not every business was online two decades ago.

Now we have so much new disparate data – that’s either publicly available or part of, you know, various networks – that’s predictive of insurance outcomes. I think most people might think that, say, a customer review, say a Yelp score, might be predictive of a potential loss for a given business. But, even today, being able to automate that data, being able to systematically use that data in a given process to predict outcomes – whether it’s for selection or pricing, or whatever that is – is still not something that insurers are fully taking advantage of, or are really able to do. It’s very ad hoc.

But so again, trend number one is this: new data exists.

Trend number two: obviously the introduction of machine learning – and AI much more broadly – becoming ubiquitous, and effectively becoming almost table stakes for any company that’s dealing in data and predicting outcomes, has enabled us to transform these new sources and really figure out exactly what to do with them, and to make that far more common.

And I think the third aspect of it is the ubiquity, ease and cost of real scale that has only really been made possible with cloud computing in a way that AWS has made possible.

We could not have done what we’re doing at Carpe Data ten years ago: being able to have massive amounts of data at our fingertips with very scalable costs. I come from a world of building data centers and having racks and racks and racks and managing all of that. We’re now in a world where that stuff is so commoditized and so easy, we can really focus on being able to harness a vast amount of data for what ultimately becomes something that’s very, very simple.

An illustration is something as mundane as a swimming pool. So, a swimming pool is something that an insurer wants to know about specifically. Say, if you’re insuring an apartment building, does your apartment building have a swimming pool or not. Of course I need to know that, right?

But, at this moment, there’s no data source for every apartment building that has swimming pools. Right? We have data sources for motor vehicle records, right? We have data sources for loss histories, right? We have data sources for other things, say, in other lines of business. But we don’t have a data source for every apartment building that has a swimming pool.

So, we’re now in an era in which you can look at, say, a structured data set like apartments.com. Apartments.com might have some coverage there, and might be able to tell you that, and the accuracy might be decent. But the coverage is not going to be really broad enough.

So that’s one approach. A second approach can be to look at, say, unstructured text that’s out there now. So now we can look at: maybe the building has a website, and on their ‘About Us’ it talks about the pool hours being from 8am-10pm or whatever that is.

But the third way of looking at it is actually building a model around it. Say we know that apartment buildings that have this average rent, that are on this street… Well, it’s likely to have a pool based on other apartment buildings that meet similar criteria. So we can build a model to predict that the likelihood is 95% that there’s a pool.

Max Dewez

More likely in Santa Monica than in Buffalo.

Max Drucker

Exactly right. So you’re using all kinds of intelligence, right? Based on a given area. Say, okay, here’s a comparable, here’s a score.

Or finally you can even look at an image. You can look at a Google image or an image on the website, or whatever somewhere, and apply some computer vision and say, look, there’s a body of water on this. It looks like it’s a pool. So now you have four separate kind of approaches to be able to determine this. Something again that’s simple but so important as, “is there a pool here or not”.

And so that’s why this is such an exciting time. Because, before being able to harness that data and automate around it, you’re looking at a manual process where an underwriter will call that apartment building. The underwriter might be Googling around. They may be taking the agent’s word for it and not validating that – which is obviously an area of exposure. There has been, traditionally, no good way of automating something like that, which is obviously, across the spectrum, a tremendous opportunity for insurers both to improve their accuracy and, again, to automate, to systematize and make this process far more streamlined, similar to where auto insurance is today.
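
As a rough illustration of the four approaches described in this answer (a structured listing, unstructured website text, a comparables model, and computer vision on an image), the sketch below combines whatever signals happen to be available for a building into one estimate. It is not Carpe Data's method: the class, the field names, the fixed confidence weights (0.9, 0.1, 0.85) and the plain averaging are all hypothetical choices made for illustration, and the model and image scores are assumed to come from systems that are not shown here.

```python
import re
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class PoolEvidence:
    """Hypothetical container for the four kinds of evidence about one building."""
    listing_has_pool: Optional[bool] = None     # structured listing; None means no coverage
    website_text: Optional[str] = None          # unstructured text, e.g. an 'About Us' page
    model_probability: Optional[float] = None   # output of a comparables model (assumed to exist)
    image_pool_score: Optional[float] = None    # output of a vision model (assumed to exist)


# Illustrative pattern only; real text mining would need far broader coverage.
POOL_PATTERN = re.compile(r"\b(swimming\s+pool|pool\s+hours)\b", re.IGNORECASE)


def pool_signals(evidence: PoolEvidence) -> List[float]:
    """Turn each available source into a probability-like score in [0, 1]."""
    signals: List[float] = []
    if evidence.listing_has_pool is not None:
        # Assumed weights: a listing is decent but not perfect evidence.
        signals.append(0.9 if evidence.listing_has_pool else 0.1)
    if evidence.website_text and POOL_PATTERN.search(evidence.website_text):
        # A mention of a pool is treated as positive evidence; silence is ignored.
        signals.append(0.85)
    if evidence.model_probability is not None:
        signals.append(evidence.model_probability)
    if evidence.image_pool_score is not None:
        signals.append(evidence.image_pool_score)
    return signals


def estimate_pool_probability(evidence: PoolEvidence) -> Optional[float]:
    """Naive combination: average the available signals, or None if there are none."""
    signals = pool_signals(evidence)
    if not signals:
        return None
    return sum(signals) / len(signals)
```

A building with no listing coverage but a website mentioning "pool hours" and a model score of 0.95 would come out at roughly 0.9 under these assumed weights; a building with no evidence at all returns None, which is the case that would still fall back to the manual process.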

Max Dewez

Yeah, okay. So the pool example makes total sense. I presume your premiums go up if you have a pool. You mentioned the Yelp score earlier, and would love to understand how a Yelp score can feed into an insurance decision.

Max Drucker

Well, so today not a ton of insurers are building their pricing models around this, not with any consistency. Because that data is somewhat inconsistent, and it can be very spotty, and there can certainly be questions around it, right? Yelp wasn’t designed to provide a score for every business that’s out there.

But, sort of intuitively, take a restaurant. What we have seen, and the data does back us up, backwards and forwards, every single way you can look at this, is that customer ratings are predictive of insurance outcomes. A restaurant with terrible reviews is indeed more likely to have a loss within twelve months than a restaurant with great reviews.

So you might ask the question: okay, that’s interesting. Is that causal or is it correlative? Why is that?

The answer kind of really breaks down to: the online presence of a business can be very telling as a proxy for how well that business is run.

A business owner that makes the effort to have high ratings, which will likely translate into more business, is also likely to run their business in a way that’s going to put them in less peril, and be less likely to have some kind of a crippling loss.

I often get the question when I give this example, “Ah, well, my brother-in-law opened a restaurant, he and all his friends write reviews – or he hired some firm – and they, you know, dummied up the reviews.” So how do you come out against that kind of fraud?

And I think it’s a great question, and the response is: what we’re trying to do, and what I think the data speaks to, is that we’re not trying to predict where you should have dinner. Your brother-in-law may not be very good, but your brother-in-law is demonstrating that he cares about this business, and he knows how to generate business, and he’s doing the right things in order to get that customer through the door or drive that business. So it can still be a predictive indicator of how well that business is run, and how likely it is to have a loss.

Max Dewez

Yeah, intuitively I would have guessed the low ratings could be a leading indicator for a restaurant that probably isn’t going to make it. What’s the next step? Well, insurance fraud.

Max Drucker

Well, I think that’s absolutely right. There are absolutely those issues. And that may be why we have seen in the data that things like reputation or customer ratings are predictive of bad outcomes.

But also this data enables us to look at the trend analysis. Where is this trending? Are the reviews getting better or are they getting worse? Is the velocity of reviews increasing or going down? It’s not necessarily just about predicting loss. It can also be about predicting – again, when we talk about predicted outcomes – retention.

So if a business is blowing up and they’re getting lots of new positive reviews, or the velocity is increasing, that might be an indication that they may be shopping. Because now they’re this or that, or they’re eligible for whatever this is. Or the converse: you’re seeing the reviews plummet. Well, this is probably a bad sign. Is it something that maybe I don’t want to renew? Or maybe this is something I want to look a little bit more into and maybe put through a different layer of underwriting, because maybe something else is going on in this business that I want to know about.
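
The trend questions raised in this answer (are reviews getting better or worse, and is their velocity rising or falling) can be sketched as a comparison between two adjacent time windows. This is only a minimal illustration: the function name, the 90-day window, and the input shape (a list of date and star-rating pairs) are assumptions for the example, not anything the speakers describe.

```python
from datetime import date, timedelta
from statistics import mean
from typing import Dict, List, Optional, Tuple


def review_trend(
    reviews: List[Tuple[date, float]],
    as_of: date,
    window_days: int = 90,  # assumed window length, purely illustrative
) -> Dict[str, Optional[float]]:
    """Compare the most recent window of reviews with the window before it."""
    recent_start = as_of - timedelta(days=window_days)
    prior_start = as_of - timedelta(days=2 * window_days)

    # Ratings falling in (recent_start, as_of] and (prior_start, recent_start].
    recent = [rating for day, rating in reviews if recent_start < day <= as_of]
    prior = [rating for day, rating in reviews if prior_start < day <= recent_start]

    return {
        "recent_count": len(recent),
        "prior_count": len(prior),
        # Positive means reviews are arriving faster than in the prior window.
        "velocity_change": len(recent) - len(prior),
        # Negative means average ratings are falling; None if either window is empty.
        "rating_change": (mean(recent) - mean(prior)) if recent and prior else None,
    }
```

Rising velocity with stable or improving ratings would correspond to the "blowing up" case mentioned above, while a negative rating change is the "reviews plummet" case that might be routed to a closer underwriting look.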

Max Dewez

Got it. So there’s the Yelp review indicator. Have you had any other really interesting or weird or unexpected insights that you or your customers have pulled out of the data in the last couple of years? Something where you would say, well, you would never have thought that that was a predictor of insurance.

Max Drucker

Oh, my! Where do you begin? I mean, the amount of things that our predictors see. We’re able to predict things like… the likelihood a person’s going to hire an attorney. That’s something that you wouldn’t necessarily think would be an easy thing to do. You know, we effectively are able to predict they’re going to hire an attorney before they know they’re going to hire an attorney.

They don’t even know. But we know, based on all the pieces that are coming together, that they’re going to see that ad for that personal injury attorney, or they’re going to have somebody refer them, or whatever that is, and they’re going to bring that person in.

That’s an interesting area that you would think that the online presence or the kind of data that is publicly available – it’s out there, you know – might be helpful in understanding things like that.

Max Dewez

That indicator’s important, right? Because I guess it makes it more likely that they’re going to challenge the claim or challenge the outcome, and that makes it more expensive for the carrier.

Max Drucker

It’s going to be a three or four times worse of an outcome for the carrier once they hire that attorney.

So there’s that but, on the other hand, what also is super interesting, and maybe it’s intuitive, but there hasn’t really been a great way of being able to harness this data: understanding online presence can be predictive of someone that’s very unlikely to commit fraud. So much in our business, the carriers are so focused on trying to find that 1% of really hard fraud, or looking for, you know, the soft fraud, the exaggeration and abuse.

But, on the flip side of that, there’s being able to predict the likelihood that this person is a good Samaritan: they’re not exploiting the insurance company, and it’s a claim that you can handle. You can either pay it immediately or handle it with much less intervention. Things such as: if a person’s got more than 200 LinkedIn connections, they maintain a private Twitter or Facebook profile – nothing’s publicly available there – and they have an email address that’s been active for more than five years. So just take those three data points. Well, that’s actually something you should pay much more quickly. You’re processing it right through. So that’s the kind of thing that you wouldn’t necessarily think is out there. But it’s pretty interesting, and you can see how this ultimately can be very powerful stuff.
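
The three data points just mentioned (more than 200 LinkedIn connections, private social profiles, and an email address active for more than five years) can be written down as a simple rule. The sketch below only restates that spoken example; the class and function names are hypothetical, and a production system would presumably weigh many more signals than a hard three-way AND.

```python
from dataclasses import dataclass


@dataclass
class ClaimantSignals:
    """Hypothetical record holding the three data points from the example."""
    linkedin_connections: int
    social_profiles_private: bool   # Twitter/Facebook profile not publicly visible
    email_age_years: float          # how long the email address has been active


def fast_track_eligible(signals: ClaimantSignals) -> bool:
    """Return True when all three example conditions hold.

    The thresholds (200 connections, 5 years) are the ones quoted in the
    conversation; treating them as a strict rule is an assumption.
    """
    return (
        signals.linkedin_connections > 200
        and signals.social_profiles_private
        and signals.email_age_years > 5
    )
```

For example, `fast_track_eligible(ClaimantSignals(350, True, 8.0))` is True, while a claimant with 350 connections but a two-year-old email address would stay in the normal handling queue under this rule.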

Max Dewez

So I’m going to LinkedIn immediately after this and just adding a bunch of people, to, yeah, streamline my next claim.

Max Drucker

If you’re going to commit fraud, that’s one way of telling them you’re not going to. So keep this between us, right?

Max Dewez

We’ll take this part out of the podcast. Maybe one question on data application before we get to how the insurance carriers and clients use it, which is the use of personal data or public data, or whatever it might be: the hot-button issue.

You see it in commercial relationships, right? So Apple stops tracking and suddenly the social media companies are in a bunch of trouble. You’ve got GDPR in Europe. How does that impact the way that you guys use it and the way you think about your business?

Max Drucker

So one of the basic components of our company from the very, very early days was that we really wanted to avoid any potentially controversial type of data.

We wanted to be the gold standard for alternative data and not wade into some of these areas of “there’s a picture of someone drinking from a red cup on Facebook, is that gonna impact something?”

This is an area, that consumer-ish personal social data, that we’ve avoided from the very beginning of the business. There’s so much other opportunity that avoiding the more controversial ones certainly seemed, initially, to be the right move. So that’s why we focused initially on the claims area, right? Because the claims area is really not controversial at all: identifying fraud, abuse, exaggeration, and identifying opportunities to pay more quickly, these kinds of areas.

The type of data that’s collected via tracking, or the very personal type of social data that has been impacted by various legislation, really doesn’t come into play here. Additionally, we’ve also focused on small commercial, and small commercial doesn’t really fall in these buckets either. Because, if you’re a business, it’s your job to put information out there so you can be found. It’s your job to put a website out there. It’s your job to make sure that you’re on Tripadvisor or whatever fits your given business segment. It’s out there. And this is obviously information that can be very powerful and predictive for insurance carriers. So the long and short of it, the answer is that what we’re working on isn’t really impacted.

There’s categories too where people can potentially opt in. Data that people can use as well. And that was certainly an industry trend several years ago. Where you could basically have an opportunity to almost monetize your own data. If you’re using data for Facebook or using whatever this is, you have an opportunity to share that data with whatever that given provider is, and then be marketed to, or however that goes. But I think those trends are kind of moving away. I think most people really aren’t so into that so much. And I think that those models really haven’t worked out so well.

Max Dewez

Ultimately, it’s the paradox of insurance, which is, if you had perfect information, then you can never price an insurance policy, because the person who needs it would never get it.

So maybe speaking of really applying that insurance lens. If I remember right, you put out a white paper a while ago, estimating that insurers are using less than 40% of the data that they actually have in their systems.

Forget about all the stuff that sits outside their four walls like, “does an apartment building have a pool”, I mean why is that?

Max Drucker

I’m very sympathetic to the challenges that insurers face. They’re working through so many of them, and again, the scale is so massive. Not only have the legacy systems been built from scratch over decades; many of these systems were initially developed in the 60s and 70s and are still the core systems. So you have that challenge of the stuff being just very old. Also, so many insurers have gotten to where they are by acquisition, and by going into different markets, and so twenty or thirty years later you look back and say, the business was built from these twelve companies and had these four core systems and had this many underwriting systems and rules systems and forms systems and billing systems and economic systems.

I mean, these are very hard problems to solve, and so clearly most carriers have strategies around getting through data lakes or being able to aggregate data and put it in one place. But I mean, we still encounter big carriers that tell us that they don’t know if they have a workers’ comp or a BOP policy for a given business, and they’re asking, “Hey, we’ll give you all the stuff. Can you figure out where there’s overlap, or where this stuff is?”

I think point one is that these are actually pretty hard problems to solve. Obviously, everybody wants to solve these problems, but it’s a matter of time.

Fortunately, with where technology is, again, with massive processing that can be done at scale, we can do so much more of this stuff and are able to harness it. For us at Carpe Data, we’re very much an off-the-shelf data company. Give us a given subject, whether it’s an entity or an individual, and then we provide you data back via an API request, or if they can’t do that, some sort of other process. However, for some of our enterprise or partner customers, we will also work with their data if their data is disparate. That can be using their data and incorporating it with ours to build some kind of a model to predict some kind of outcome, or just helping consolidate and clean this stuff up, because the capability varies so widely.

Max Dewez

I think you know, my colleague, Andrew, spoke about this on a prior podcast. This idea of the unbundling of the insurance carrier where, for a long time they tried to do everything. But frankly, there are specialists like you who kind of cut through some of that, and bring that expertise to bear.

Max Drucker

I look at our company, which started relatively recently and with very new technology. After a certain amount of time, even the things that we did when we first got started are practically obsolete, or difficult, or we realize we didn’t want to structure it this way, or think about it that way. And then you go back and redo it again. In my background, having done core systems implementations, I’m very sympathetic to how hard it is and why that is. Now we really try and think about that. I’m trying to find what’s the natural seam to integrate, right? If you say, here’s some API, go figure it out within your systems… Well, that’s a problem, right? That’s hard for someone to make a priority.

We try to sort of say, “Okay, let’s sit down. What’s your infrastructure? Okay, you use a document management system, and that’s been effective, and that’s a good workflow. All right, well, how do things get into that? That we can work on.”

Okay, maybe for one customer we’re working through a homegrown system; for another it can be a document management system. It’s really about having flexibility and understanding that doing anything is so hard because of the history, and because of these challenges. If you can be creative and try to be clever about how you can do these integrations, it can really go a long way.

Max Dewez

That makes sense. Reminds me of the cartoon I saw: the guy up on a podium who’s speaking to a crowd who says, “Who wants clean data?” and everyone puts their hand up. And the next page is, “Who wants to clean the data?” Crickets. So it’s not easy.

So you and I are both headed to Las Vegas next week for ITC, which I believe stands for Insuretech Coachella.

Max Drucker

Our event is called CarpeChella!

Max Dewez

Good. Looking forward to that and the whole thing. But having been a few years, it seems to me that every year there’s one idea or trend or piece of tech that has a big groundswell behind it. It was probably three or four years ago it felt like every other exhibitor had a drone company, and was doing stuff with aerial imagery. I’m sure people have done interesting things with it. But that was the year of the drone. What do you think? You know we’ll be there in a week. What do you expect to see really popping off, that people are getting excited about?

And you’re allowed to say data. But then I’m going to ask you for another one.

Max Drucker

So I’ll have a think and then I’ll add a self-serving answer. I think we’re going to hear a lot of talk about BI severity, a lot of talk about social inflation, right? And inflation, period, right? Carriers have had a very rough year in their BI severity, both on the property side and on the injury side. So carriers are really going to be looking for, “How can I reduce my exposure?” Medical costs have exploded out of control.

Then there’s social inflation, being the other side of this, in which attorneys have gotten much better, and people have become much more savvy. And so there’s a lot more, frankly, abuse of carriers on the other side of that. So they’re seeing this as the trend of the day: “How can I reduce my exposure?”

There’s no single solution to that. Certainly we argue it’s – and what we work with the carriers on is – being able to automate as much as you possibly can. So systematic, right?

First: identify all the good risks. Figure out who’s not going to be your problem. Deal with that.

Second: these are potential areas that you want to do. These are the given areas where your likely outcome is, and to pay it more quickly, but then...

Three – and certainly what Carpe Data does is we monitor injury claims at scale. It’s what we’ve done, it’s really what we launched on. So ultimately, that does reduce the individual severity of what that injury exposure is. So we’re able to identify: this person’s a lot healthier than they say they are. We can get them back to work faster, and that’s going to have a dramatic impact on that BI severity.

So that’s what I think we’re going to see again. That’s going to be a major talk track.

On the more self-serving side, what do I think we’re also going to see? This is maybe a consistent trend, but one I think we’re seeing more and more. It’s just, frankly, automation, right? So anything, whether that’s automating an underwriting process, whether it’s a claims process, whether it’s a customer service process, whether it’s an agent process, automation has become even more important in such a tight labour market.

So we’re seeing wages increase, right? But even worse than that, insurers are unable to hire people. With remote working, we certainly hear from our customers about a drop in productivity. So you have less productivity. You’re having a very, very hard time hiring people, and many, of course, are retiring as well, and you have an aging workforce. So you have these really massive trends that disrupt the labour that insurers have historically relied upon. Well, your only way out of that is through automation.

Automation comes both from a workflow perspective, but also, and certainly where we sit, from being able to automate those decisions to pass through. And we look at auto insurance as really the great success story, because auto insurance is effectively a fully automated product because of the data that made that possible. It’s the automated credit score, the automated loss history report, the automated motor vehicle report to validate on the way through that makes it possible.

That’s what insurers are really going to look for across whatever their other lines of business are. What are the areas where I can use data so I can automate decisions, so I can pass them through and not have to have a human being look at it?

I don’t know, Max. What do you think?

Max Dewez

What do I think we’re going to see? Look, I think the trend over the last kind of two, three years has definitely been probably the journey you went on ten, fifteen years ago. Which was, I don’t want to be an insurance carrier, right? There was a time when it was insurance plus tech and I’m gonna take the risk, or pretend I’m not taking the risk, but unload a lot of it out to reinsurers. But you know, that has been a tough row to hoe the last two, three years, and you see it obviously in the public market valuations.

I think it’s absolutely enablement. I think we’re seeing this across not just insurance, but some of the other financial end-markets we look at, like banks, right? You’ve got these legacy core systems that you talk about, which you’ve cobbled together by acquisition. Some of it’s written in COBOL – very hard to replace. You’ve got a policy that has another ten years to run. You can’t change the system. But it’s really about putting the layer on top of that, so that it’s scooping areas of functionality out of what used to be monolithic core systems. Some of it was designed for it, other parts weren’t designed for it, but you cobbled it together.

But we’re seeing more and more of these companies that are doing one, two, three things really well. They’re not trying to fight the core system, they’re not trying to do a five-year re-implementation. But they’re saying you can leave your general ledger, leave your database in place, because we’re not going to take that out. You’re not going to take that out. No one’s going to win any medals for taking it out. What matters to you? What matters, frankly, to the employees who you’re trying to retain and the customers you’re trying to impress is workflow, the automation that you talked about. I can create a good customer experience, a good employee experience, while not going through a huge tech headache.

So that can be in the data side, that can be in the customer acquisition side, it can be in the marketing side. I think really, that’s where the future is today.

Max Drucker

All right. We’re on the same page.

Max Dewez

Well, I certainly hope so, because you’re a lot smarter than I am. So, thank you! Thank you very much for doing this. You’re a busy guy. I really appreciate it. And looking forward to seeing you next week.

Max Drucker

Absolutely. It’s super fun. Great reconnecting, Max. And yeah, see you in Vegas!
