Podcast Feature: Insurance Fraud Detection

Data Skeptic Podcast

I was recently invited to speak on the Data Skeptic podcast about my research on fraud detection. It was a great experience, complete with a few pre-recording jitters, where I got to discuss our paper on simulating network-based insurance fraud data.

We developed a simulation engine that mimics the characteristics of real-world insurance fraud network data. It allows researchers to experiment with different fraud patterns, levels of class imbalance, and network structures, all without relying on sensitive or hard-to-access data. This enables more robust development and evaluation of fraud detection techniques.
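To give a rough feel for what "simulating network-based fraud data" can mean, here is a minimal toy sketch in Python. It is not the engine from our paper: the function names, the parameters (`fraud_rate`, `n_shady_parties`, `shady_share`), and the idea of concentrating fraud around a few "shady" parties are all assumptions made purely for illustration. It generates claims linked to parties (think garages or brokers), with a tunable share of fraudulent claims and a tunable degree of clustering in the network.

```python
import random


def simulate_claims(n_claims=1000, n_parties=50, fraud_rate=0.02,
                    n_shady_parties=3, shady_share=0.8, seed=0):
    """Toy claim-party network with rare, clustered fraud.

    Hypothetical illustration only, not the simulation engine from the paper.
    fraud_rate controls the class imbalance; shady_share controls how
    strongly fraudulent claims concentrate on a few parties.
    """
    rng = random.Random(seed)
    parties = list(range(n_parties))
    shady = parties[:n_shady_parties]  # assumed fraud-prone parties

    claims = []
    for claim_id in range(n_claims):
        is_fraud = rng.random() < fraud_rate
        if is_fraud and rng.random() < shady_share:
            # Most fraudulent claims are linked to a shady party.
            party = rng.choice(shady)
        else:
            # Everything else is linked to a party chosen uniformly.
            party = rng.choice(parties)
        claims.append({"claim_id": claim_id, "party": party,
                       "is_fraud": is_fraud})
    return claims, set(shady)


def fraud_share_by_party(claims):
    """Fraction of fraudulent claims among the claims linked to each party."""
    totals, frauds = {}, {}
    for claim in claims:
        party = claim["party"]
        totals[party] = totals.get(party, 0) + 1
        frauds[party] = frauds.get(party, 0) + int(claim["is_fraud"])
    return {party: frauds[party] / totals[party] for party in totals}


if __name__ == "__main__":
    claims, shady = simulate_claims()
    shares = fraud_share_by_party(claims)
    n_fraud = sum(c["is_fraud"] for c in claims)
    print(f"{n_fraud} fraudulent claims out of {len(claims)}")
    for party in sorted(shady):
        print(f"party {party}: fraud share {shares.get(party, 0.0):.2f}")
```

Lowering `fraud_rate` makes the classes more imbalanced, while changing `shady_share` shifts how much of the fraud signal sits in the network structure rather than in individual claims. Being able to vary such settings independently is the kind of experiment a simulator makes possible.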

The conversation covered a lot of ground, but two questions stuck with me. I wish I had answered them better. So here is a quick follow-up:

1. What is Actuarial Science and what career options are there?

Actuarial science applies mathematics, statistics, and financial theory to assess risk, mostly in insurance, pensions, and finance. Actuaries work on product design, pricing, and risk management, and increasingly also in data science. Career paths include roles in insurance, consulting, and tech-driven analytics.

2. What are neighborhood and fraud-score based features?

Also, during the episode I briefly touched on class imbalance and its implications for logistic regression. I wanted to share some excellent references I have come across:

Other Resources

🎧 Podcast Episode Link (Also on Apple, Google, and Spotify)

Grateful to Kyle Polich for the invite to the podcast!

EAJ Online Seminar Presentation

Shortly after the podcast release, I had the opportunity to present our work in a live online seminar hosted by the European Actuarial Journal (EAJ). In this session, I focused on a key aspect we did not fully explore in the podcast: the data-generating process underlying our simulation engine.

My talk begins at 17:50 in the recording and provides a deeper walkthrough of the engine's design and implementation, especially useful for those interested in synthetic data generation and fraud detection modeling.