In the world of insurance, claims are the lifeblood of the industry. Accurately predicting how losses will evolve over time is crucial, as it directly influences reserving, risk selection, underwriting, pricing, and claims management. Traditionally, actuaries have estimated reserve requirements using portfolio-level calculations, relying on methods like Chain-Ladder and Bornhuetter-Ferguson triangulation. These approaches provide an aggregate view of loss development across a portfolio, serving the purposes of financial and regulatory reporting. While these methods offer transparency and ease of implementation, they also come with certain risks and limitations.
Individual Loss Development (ILD) takes a different route: it aims to predict the development of each individual claim based on its unique characteristics. This approach is a game changer, providing a more individualized perspective on loss development.
Challenges of traditional methods
Traditional methods excel at providing an overall view of loss development but may overlook critical details. Estimating loss development solely through an aggregated lens can lead to inaccuracies in loss reserving for specific segments if essential characteristics are neglected. Additionally, these methods offer a static view, and analyzing reserve changes and underlying patterns can be time-consuming. Moreover, distinguishing and quantifying different, overlapping trends within aggregate methods can be challenging.
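To make the aggregate view concrete, here is a minimal Chain-Ladder sketch. The triangle values are hypothetical, and the volume-weighted factor calculation shown is the textbook form of the method, not necessarily the variant any particular insurer uses.

```python
# Cumulative incurred losses by accident year (rows) and development year (columns).
# Hypothetical figures; None marks cells that have not been observed yet.
triangle = [
    [100.0, 150.0, 165.0],
    [110.0, 170.0, None],
    [120.0, None, None],
]

def development_factors(tri):
    """Volume-weighted age-to-age factors, one per pair of adjacent development years."""
    factors = []
    for j in range(len(tri[0]) - 1):
        # Use only accident years observed in both development years j and j + 1.
        rows = [r for r in tri if r[j] is not None and r[j + 1] is not None]
        factors.append(sum(r[j + 1] for r in rows) / sum(r[j] for r in rows))
    return factors

def project_ultimates(tri, factors):
    """Roll each accident year forward from its latest observed value."""
    ultimates = []
    for row in tri:
        last = max(j for j, v in enumerate(row) if v is not None)
        value = row[last]
        for f in factors[last:]:
            value *= f
        ultimates.append(value)
    return ultimates

factors = development_factors(triangle)
ultimates = project_ultimates(triangle, factors)
```

Every claim in an accident year is developed with the same factors, whatever its own characteristics. That is the limitation described above.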
Introducing Individual Loss Development (ILD)
ILD is designed to predict loss development for each individual claim, taking into account its unique attributes. However, building ILD models manually can be a formidable task, particularly due to the temporal nature of claim development and the complexity of feature engineering and parameter estimation. Additionally, working with free text features, which can be highly predictive of future claim development, poses challenges due to the limited availability of natural language processing models.
Automated Machine Learning (AutoML) to the rescue
Automated Machine Learning (AutoML) emerges as a solution to the complexities associated with ILD modeling. AutoML streamlines feature engineering, variable selection, and model building, significantly reducing the time required for experimentation. AutoML also facilitates automated text mining, enabling the extraction of predictive insights from claim documents swiftly.
Predictive variables for loss development
To construct predictive variables for loss development, insurers typically consider five types of information:
1. Initial claim details
2. Claim development information
3. Policy details
4. Customer details
5. External factors
However, it’s crucial to avoid target leakage, which occurs when information that would not be available at the time of prediction is used in model training. To prevent this, variables are restricted to historical values for each development year, aligning with the prediction time.
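The leakage rule can be sketched as follows. The layout is an assumption made for illustration: one training row per claim and per pair of prediction year (`from_year`) and target year (`to_year`), with hypothetical field names. Each row's features use only values known at `from_year`.

```python
# Hypothetical claim histories: cumulative incurred loss at the end of each
# development year (index 0 is the first development year).
history = {
    "C1": [1000.0, 1400.0, 1500.0],
    "C2": [500.0, 500.0],
}

def build_training_rows(history):
    """Build one row per (claim, from_year, to_year) with from_year < to_year.

    Features are taken at from_year only; the target is the value at to_year,
    so nothing observed after the prediction time enters the features.
    """
    rows = []
    for claim_id, incurred in history.items():
        for from_year in range(len(incurred)):
            for to_year in range(from_year + 1, len(incurred)):
                rows.append({
                    "claim_id": claim_id,
                    "from_year": from_year,
                    "to_year": to_year,
                    "initial_loss": incurred[0],
                    "incurred_at_from_year": incurred[from_year],
                    "target_incurred": incurred[to_year],
                })
    return rows

rows = build_training_rows(history)
```

A claim with three observed development years yields three rows here, which also shows why datasets for long-tailed lines grow quickly.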
Feature sets may vary depending on the model’s purpose, with some focusing on early claim development estimation and others on long-tailed portfolios.
Data preparation challenges
ILD modeling can lead to large datasets, especially for long-tailed liabilities. Downsampling techniques can help manage dataset size, but they introduce bias toward short-distance values. The introduction of weights can help mitigate this bias, depending on the business objective.
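One possible weighting scheme is sketched below. It rests on two assumptions that the text does not spell out: that "distance" means the gap between the prediction year and the target year, and that the business objective is for every distance to carry the same total weight. The rows are hypothetical.

```python
import random
from collections import Counter

# Hypothetical rows: 100 claims for every (from_year, to_year) pair
# across 10 development years.
rows = [
    {"from_year": f, "to_year": t}
    for f in range(10)
    for t in range(f + 1, 10)
    for _ in range(100)
]

def distance(row):
    return row["to_year"] - row["from_year"]

# Short distances dominate the full dataset: there are nine pairs at
# distance 1 but only one pair at distance 9.
population_counts = Counter(distance(r) for r in rows)

# Uniform 10% downsample with a fixed seed for reproducibility.
rng = random.Random(0)
sample = [dict(r) for r in rng.sample(rows, k=len(rows) // 10)]

# Assumed weighting: give each distance the same total weight in the sample.
sample_counts = Counter(distance(r) for r in sample)
for r in sample:
    r["weight"] = 1.0 / sample_counts[distance(r)]
```

The `weight` column would then be passed to the training routine as a per-row sample weight. A different objective, such as emphasizing early development, would call for a different scheme.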
Setting up the model
Several factors must be considered when setting up an ILD model, including whether to include both open and closed claims, how to handle outliers, at which claim stages predictions are made, and what post-processing techniques are needed.
Modeling workflow
AutoML allows for quick experimentation with various model families and feature engineering approaches. It also makes it easy to try different target definitions, such as loss total or loss delta. The choice of optimization metric depends on the model’s intended purpose.
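The two target definitions might look like this. The interpretation is an assumption: "loss total" is read as the cumulative incurred loss at the target development year, and "loss delta" as the change from the prediction time to that year. Field names are hypothetical.

```python
def add_targets(row):
    """Attach both candidate targets to a training row.

    loss_total: cumulative incurred loss at the target development year.
    loss_delta: change in incurred loss between prediction time and target year.
    """
    total = row["target_incurred"]
    delta = total - row["incurred_at_from_year"]
    return {**row, "loss_total": total, "loss_delta": delta}

def total_from_delta(predicted_delta, incurred_at_from_year):
    """Convert a predicted delta back into a predicted total."""
    return incurred_at_from_year + predicted_delta

# Hypothetical row: 1,400 incurred at prediction time, 1,500 at the target year.
example = add_targets({"incurred_at_from_year": 1400.0, "target_incurred": 1500.0})
```

A delta target removes the part of the total that is already known at prediction time, so the model only has to learn the change. Which definition works better is an empirical question for each portfolio.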
Model interpretation
ILD models provide insights into predictive features. The three most predictive features often include initial loss, cumulative incurred loss, and development year. Monotonicity constraints are crucial, ensuring that the model aligns with business assumptions.
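Many gradient-boosting libraries accept monotonic constraints at training time. Independently of the library, a fitted model can be checked against a business assumption by sweeping one feature and holding the others fixed. The sketch below is such a check, not an enforcement mechanism. Both prediction functions are hypothetical stand-ins for a real model.

```python
def is_monotone_increasing(predict, base_row, feature, grid, tol=1e-9):
    """Return True if predictions never decrease as `feature` sweeps upward
    through `grid` while the other features in `base_row` stay fixed."""
    preds = [predict({**base_row, feature: v}) for v in sorted(grid)]
    return all(b >= a - tol for a, b in zip(preds, preds[1:]))

# Hypothetical stand-in for a model that respects the assumption
# "a higher initial loss never lowers the predicted loss".
def toy_predict(row):
    return 1.2 * row["initial_loss"] + 50.0 * row["development_year"]

# Hypothetical stand-in for a model that violates it.
def bad_predict(row):
    return abs(row["initial_loss"] - 2000.0)

base = {"initial_loss": 1000.0, "development_year": 2}
grid = [0.0, 1000.0, 2000.0, 3000.0]

respects = is_monotone_increasing(toy_predict, base, "initial_loss", grid)
violates = not is_monotone_increasing(bad_predict, base, "initial_loss", grid)
```

A check on a single base row is only a spot test. In practice it would be repeated across a sample of claims.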
Applying the ILD approach
A simulated aviation loss portfolio with 14,000 claims from different sub-lines of business was used to apply the ILD approach. The model demonstrated promising accuracy, and the most predictive features aligned with expectations. Additionally, ILD models offer individual explanations for each prediction, enabling insurers to understand why a claim might be considered risky.
Implications for insurers
The ILD approach offers a granular perspective on claim portfolio inflation, complementing traditional reserving methods. Machine learning, particularly AutoML, provides a solution to the challenges associated with building ILD models manually, saving time and offering actionable insights. While ILD is unlikely to replace triangle-based methods entirely, it can enhance various aspects of insurance, from pricing and claims investigation to fraud detection and portfolio evaluation. By embracing ILD and staying current with machine learning advances, insurers can gain a competitive edge in the industry.
In the ever-evolving landscape of insurance, the adoption of ILD and machine learning holds the promise of transforming the industry, enabling insurers to make more informed decisions and provide better services to their clients.