---
title: "Refinement building blocks"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Refinement building blocks}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```


## Introduction

In many pricing analyses, model estimation is followed by a translation step, in
which the fitted model effects are turned into a tariff structure.

A fitted GLM may capture the structure of the portfolio well, while some fitted
effects still need to be reviewed before they are used in a tariff.

Common reasons include:

-	irregular local variation
-	lack of monotonicity
-	externally imposed tariff structures
-	expert judgement not directly represented in the model
-	implementation constraints in policy administration systems

For this reason, actuarial pricing work often distinguishes between:

1.	model estimation
2.	tariff refinement
3.	final refit of the pricing structure

`insurancerating` provides a staged refinement interface:

1.	fit an unrestricted model
2.	initialise a refinement object with `prepare_refinement()`
3.	add one or more refinement steps
4.	inspect these steps before refit
5.	call `refit()` to obtain the final fitted model

This separation can make tariff adjustments easier to understand, reproduce, and audit.


## When refinement can help

Refinement can help when the estimated model output is useful, but the fitted
coefficient pattern needs additional structure before it is used in a tariff.

Typical use cases include:

-	smoothing a rating factor derived from a continuous variable
-	imposing monotonicity
-	restricting coefficients to a predefined relativity structure
-	introducing expert-based relativities within existing model levels
-	simplifying the final tariff for practical implementation

In many workflows, refinement is applied to the model that represents the final
pricing signal, such as a premium or pure-premium model. In other cases, it may
also be useful for selected frequency or severity effects. The relevant question
is whether the adjusted coefficient pattern is intended to support the tariff
structure that will be reviewed or implemented.


## Example setup

The example below starts from one common premium modelling setup:

-	analyse a continuous variable with a GAM
-	convert it to tariff segments
-	fit frequency and severity models
-	combine both into a premium proxy
-	fit an unrestricted premium model
	

```{r, message = FALSE, warning = FALSE}

library(insurancerating)
library(dplyr)

age_policyholder_frequency <- risk_factor_gam(
  data = MTPL,
  claim_count = "nclaims",
  risk_factor = "age_policyholder",
  exposure = "exposure"
)

age_segments_freq <- derive_tariff_segments(age_policyholder_frequency)

dat <- MTPL |>
  add_tariff_segments(age_segments_freq, name = "age_policyholder_freq_cat") |>
  mutate(across(where(is.character), as.factor)) |>
  mutate(across(where(is.factor), ~ set_reference_level(., exposure)))

freq <- glm(
  nclaims ~ bm + age_policyholder_freq_cat,
  offset = log(exposure),
  family = poisson(),
  data = dat
)

sev <- glm(
  amount ~ zip,
  weights = nclaims,
  family = Gamma(link = "log"),
  data = dat |> filter(amount > 0)
)

premium_df <- dat |>
  add_prediction(freq, sev) |>
  mutate(premium = pred_nclaims_freq * pred_amount_sev)

burn_unrestricted <- glm(
  premium ~ zip + bm + age_policyholder_freq_cat,
  weights = exposure,
  family = Gamma(link = "log"),
  data = premium_df
)

```

Before refinement, inspect the unrestricted coefficient structure:

```{r}

rating_table(burn_unrestricted)

rating_table(burn_unrestricted) |>
  autoplot()

```

At this stage, the coefficients reflect the unrestricted model fit. This output
is often informative by itself. If the pattern is too irregular, too granular or
difficult to explain, a refinement step can be added explicitly.


## The refinement object

Refinement begins with:

```{r}

ref <- prepare_refinement(burn_unrestricted)
ref

```

A `rating_refinement` object stores:

-	the fitted base model
-	the underlying model data
-	the refinement steps added through the refinement interface

At this point, the model itself has not been refitted. The refinement object
represents a proposed tariff adjustment structure, not yet the final fitted
result.

This distinction is useful because refinement steps can be inspected before they
are incorporated into the final model.


## Smoothing 

### Purpose

Smoothing can be used when a rating factor derived from a continuous variable
contains local variation that is hard to justify in a tariff.

For example, a coefficient pattern such as:

-	age 30–34 lower
-	age 34–38 higher
-	age 38–42 lower again

may be statistically possible, but difficult to explain or maintain. Smoothing
adds a more stable structure to the rating factor.
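
As a rough illustration of the idea, the sketch below smooths an irregular set of segment relativities with an exposure-weighted quadratic fit in base R. The data frame `toy_age` and all its values are made up, and the quadratic `lm()` fit is a stand-in chosen for simplicity. It is not the method used by `add_smoothing()`.

```{r}
# Hypothetical segment midpoints, fitted relativities, and exposure
toy_age <- data.frame(
  age_mid    = c(20, 25, 30, 35, 40, 45, 50),
  relativity = c(1.60, 1.30, 1.05, 1.12, 0.98, 1.00, 0.95),
  exposure   = c(200, 800, 1500, 1600, 1700, 1500, 1200)
)

# Exposure-weighted quadratic fit: segments with more exposure pull harder
toy_smooth_fit <- lm(
  relativity ~ poly(age_mid, 2),
  data = toy_age,
  weights = exposure
)

# Smoothed relativities next to the original ones
toy_age$smoothed <- as.numeric(fitted(toy_smooth_fit))
toy_age
```

Comparing the `relativity` and `smoothed` columns shows how local reversals are replaced by a single curve.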


### Adding smoothing

```{r}

ref <- ref |>
  add_smoothing(
    model_variable = "age_policyholder_freq_cat",
    source_variable = "age_policyholder",
    breaks = seq(18, 95, 5),
    weights = "exposure"
  )

```

The key arguments are:

-	`model_variable`: the grouped variable present in the GLM
-	`source_variable`: the original continuous portfolio variable
-	`breaks`: the preferred commercial cut points
-	`smoothing`: the smoothing specification (not set in the call above, so the default applies)
-	`weights`: optional weighting, typically exposure
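
To make the role of `breaks` more concrete, the base-R sketch below shows which cut points `seq(18, 95, 5)` produces and how a few ages would fall into the resulting bands when grouped with `cut()`. The ages in `toy_ages` are hypothetical, and `cut()` is used only for illustration. How `add_smoothing()` itself treats values outside the outer breaks is not covered here.

```{r}
# The sequence stops at the last value not exceeding 95
toy_breaks <- seq(18, 95, 5)
range(toy_breaks)

# Hypothetical ages, grouped with right-closed intervals
toy_ages <- c(18, 22, 23, 47, 93, 95)
toy_bands <- cut(toy_ages, breaks = toy_breaks, include.lowest = TRUE)

# Ages beyond the last break are not assigned to any band (NA)
data.frame(age = toy_ages, band = toy_bands)
```

Checking the outer breaks against the observed range of the source variable helps avoid unassigned values at the tails.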
	
### Inspecting smoothing before refit

```{r}

print(ref)
autoplot(ref, variable = "age_policyholder_freq_cat")

```

This plot belongs to the **pre-refit stage**. It shows:

-	the original fitted coefficients
-	the proposed smoothed structure

The purpose is to inspect the refinement step itself, before it is incorporated
into the final fitted model.

### Choosing a smoothing method

Typical smoothing choices are:

-	`"spline"`: polynomial-style smoothing
-	`"gam"`: flexible smooth curve
-	`"mpi"`: monotone increasing
-	`"mpd"`: monotone decreasing

The appropriate choice depends on the pricing context.

For example:

-	age may justify a flexible smooth
-	insured value or power may require a monotonic relationship
-	low-exposure tails may benefit from exposure weighting
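
To illustrate what a monotone constraint does to a coefficient pattern, the sketch below applies isotonic regression with `stats::isoreg()` to made-up relativities. This is a swapped-in and much simpler technique than the `"mpi"` option, shown only to convey the effect. The data frame `toy_power` and its values are hypothetical.

```{r}
# Hypothetical relativities by power band, with two local reversals
toy_power <- data.frame(
  power_band = 1:6,
  relativity = c(0.90, 1.00, 0.97, 1.10, 1.08, 1.25)
)

# Isotonic regression: closest non-decreasing sequence in least squares
toy_iso <- isoreg(toy_power$power_band, toy_power$relativity)
toy_power$monotone <- toy_iso$yf

toy_power
```

Adjacent bands that violate the ordering are pooled to a common value, so the result is a step pattern rather than a smooth curve.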


## Restrictions

### Purpose

Restrictions can be used when coefficients need to follow a predefined
structure.

Typical examples include:

-	bonus-malus systems
-	governance-approved relativities
-	externally mandated tariff structures
-	implementation constraints in policy systems

Restrictions differ from smoothing:

-	smoothing reshapes the fitted pattern
-	restriction imposes user-defined coefficients


### Adding restrictions


```{r}

zip_df <- data.frame(
  zip = c(0, 1, 2, 3),
  zip_adj = c(0.8, 0.9, 1.0, 1.2)
)

ref <- ref |>
  add_restriction(restrictions = zip_df)

```

The restriction table must contain exactly two columns:

-	the levels of the rating factor, in a column named after the model variable (here `zip`)
-	the adjusted coefficients for those levels (here `zip_adj`)
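
Before adding a restriction, it can be worth checking the table against the factor levels in the model data. The base-R sketch below shows a few such checks. They are not part of the package, the object names are hypothetical, and treating the adjusted values as positive multiplicative factors is an assumption based on the example values.

```{r}
# Hypothetical factor levels as they might appear in the model data
toy_zip_levels <- c("0", "1", "2", "3")

toy_restrictions <- data.frame(
  zip     = c(0, 1, 2, 3),
  zip_adj = c(0.8, 0.9, 1.0, 1.2)
)

# Two columns, every level covered, no duplicated levels, positive values
toy_checks <- c(
  two_columns    = ncol(toy_restrictions) == 2,
  levels_covered = setequal(toy_zip_levels, as.character(toy_restrictions$zip)),
  no_duplicates  = !anyDuplicated(toy_restrictions$zip),
  positive       = all(toy_restrictions$zip_adj > 0)
)
toy_checks
```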


### Inspecting restrictions before refit

```{r}

autoplot(ref, variable = "zip")

```

This shows the proposed restricted structure relative to the original fitted
model.


## Expert-based relativities

### Purpose

In some cases, the fitted model uses a broad factor level, while portfolio or
business knowledge suggests that more granular differentiation may be useful.

For example, a model may estimate one coefficient for "construction", while
pricing practice distinguishes between:

-	residential construction
-	commercial construction
-	civil engineering

This can be relevant when subgroup exposure is too limited to estimate stable
coefficients directly.

### Adding relativities

```{r, eval = FALSE}

relativities_activity <- relativities(
  split_level(
    "construction",
    c("residential_construction", "commercial_construction"),
    c(1.00, 1.15)
  )
)

ref <- ref |>
  add_relativities(
    model_variable = "business_activity",
    split_variable = "business_activity_split",
    relativities = relativities_activity,
    exposure = "exposure",
    normalize = TRUE
  )

```

If `normalize = TRUE`, the relativities are scaled so that their
exposure-weighted average remains equal to 1 within the original level.

This preserves the original model signal while introducing finer structure.
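
The arithmetic behind this normalisation can be sketched in base R. The exposure figures below are made up, and whether `add_relativities()` performs the calculation in exactly this way is an assumption. The sketch only shows what "exposure-weighted average equal to 1" means.

```{r}
# Expert relativities from the example above, with hypothetical exposure
toy_rel  <- c(residential_construction = 1.00, commercial_construction = 1.15)
toy_expo <- c(residential_construction = 600,  commercial_construction = 400)

# Exposure-weighted average before normalisation:
# (600 * 1.00 + 400 * 1.15) / 1000
toy_avg <- weighted.mean(toy_rel, toy_expo)

# Divide by that average so the weighted mean becomes 1
toy_rel_norm <- toy_rel / toy_avg

toy_rel_norm
weighted.mean(toy_rel_norm, toy_expo)
```

The ratio between the two subgroups stays at 1.15, while the level as a whole keeps the effect estimated by the model.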


## Refit

### Why refit is required

Refinement steps alter part of the model structure. Once these changes are
applied, the remaining coefficients may also adjust.

For that reason, the sequence does not end with `add_smoothing()` or
`add_restriction()`. The final step is:

```{r}

burn_refined <- refit(ref)

```

This refits the model while incorporating the documented refinement steps.
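
The effect on the remaining coefficients can be illustrated with a base-R sketch in which the relativities of one factor are fixed through an offset, a common way to impose coefficients in a log-link GLM. The simulated data and all object names are hypothetical, and whether `refit()` uses exactly this mechanism is not stated in this vignette.

```{r}
set.seed(1)
toy_n <- 2000

# Simulated portfolio in which region and vehicle size are correlated
toy_glm_data <- data.frame(
  region   = factor(sample(c("a", "b"), toy_n, replace = TRUE)),
  exposure = runif(toy_n, 0.5, 1)
)
toy_p_small <- ifelse(toy_glm_data$region == "a", 0.7, 0.3)
toy_glm_data$vehicle <- factor(
  ifelse(runif(toy_n) < toy_p_small, "small", "large")
)
toy_mu <- toy_glm_data$exposure * 0.1 *
  ifelse(toy_glm_data$region == "b", 1.5, 1) *
  ifelse(toy_glm_data$vehicle == "small", 0.8, 1)
toy_glm_data$claims <- rpois(toy_n, toy_mu)

# Unrestricted fit: both effects are estimated
toy_free <- glm(
  claims ~ region + vehicle,
  offset = log(exposure),
  family = poisson(),
  data = toy_glm_data
)

# Restricted fit: region relativities fixed at a = 1.0, b = 1.2 via the offset
toy_glm_data$region_fixed <- ifelse(toy_glm_data$region == "b", 1.2, 1.0)
toy_restricted <- glm(
  claims ~ vehicle,
  offset = log(exposure) + log(region_fixed),
  family = poisson(),
  data = toy_glm_data
)

# The vehicle coefficient shifts once the region effect is no longer free
coef(toy_free)
coef(toy_restricted)
```

Because region and vehicle size are correlated in this toy portfolio, fixing the region effect changes what the vehicle coefficient has to explain.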


### Inspecting the final fitted result

After refit, use `rating_table()`:

```{r}

rating_table(burn_refined)

```

At this point, the output no longer represents a proposed refinement plan. It
represents the fitted coefficient structure after refinement.

The distinction is:

-	before `refit()`: inspect the refinement plan
-	after `refit()`: inspect the fitted tariff structure

If smoothing, restrictions, and relativities have been applied, they are now
embedded in the fitted model output.


### Visualising the final structure

```{r}

rating_table(burn_refined) |>
  autoplot()

```


## Model data and rating grids 

After refit, model structure can be extracted with `extract_model_data()`:

```{r}

md <- extract_model_data(burn_refined)
head(md)

```

Observed model-point combinations can be obtained with `rating_grid()`:

```{r}

grid <- rating_grid(burn_refined)
head(grid)

```

A rating grid is typically used for:

-	tariff review
-	portfolio summaries
-	compact prediction input
-	implementation support
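
The notion of observed model-point combinations can be sketched in base R by collapsing a portfolio to its distinct factor combinations. The data frame `toy_portfolio` is made up, and the columns actually returned by `rating_grid()` may differ. The sketch only shows the underlying idea.

```{r}
# Hypothetical portfolio with two rating factors and exposure
toy_portfolio <- data.frame(
  zip      = c("0", "0", "1", "1", "1"),
  bm       = c(1, 1, 1, 2, 2),
  exposure = c(1, 0.5, 1, 1, 0.25)
)

# One row per observed combination, with exposure summed within it
toy_grid <- aggregate(exposure ~ zip + bm, data = toy_portfolio, FUN = sum)
toy_grid
```

Combinations that never occur in the portfolio do not appear, which keeps the grid compact compared with a full cross of all factor levels.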
	
	
## Complete example

One possible refinement sequence is:

```{r}

zip_df <- data.frame(
  zip = c(0, 1, 2, 3),
  zip_adj = c(0.8, 0.9, 1.0, 1.2)
)

burn_refined <- prepare_refinement(burn_unrestricted) |>
  add_smoothing(
    model_variable = "age_policyholder_freq_cat",
    source_variable = "age_policyholder",
    breaks = seq(18, 95, 5),
    weights = "exposure"
  ) |>
  add_restriction(zip_df) |>
  refit()

rating_table(burn_refined)

rating_table(burn_refined) |>
  autoplot()

```


## Legacy interface

Legacy entry points remain available:

```{r, eval = FALSE}

burn_refined_old <- burn_unrestricted |>
  smooth_coef(
    x_cut = "age_policyholder_freq_cat",
    x_org = "age_policyholder",
    breaks = seq(18, 95, 5)
  ) |>
  restrict_coef(zip_df) |>
  refit_glm()

```

These are primarily maintained for backward compatibility.

For new code, the recommended interface is:

```{r, eval = FALSE}

# Schematic only: add_*() stands for add_smoothing(), add_restriction(),
# or add_relativities()
prepare_refinement() |> add_*() |> refit()

```

This keeps the sequence of tariff adjustments explicit.


## Summary

The refinement interface helps separate:

-	model estimation
-	tariff adjustments
-	final fitted output

This makes it easier to document and inspect adjustments before the model is
refitted. In practice, this can support tariff structures that are:

-	statistically grounded
-	interpretable
-	commercially usable
-	easier to implement


## Next steps

For the underlying pricing concepts, see:

- [Pricing workflow building blocks](pricing-workflow-building-blocks.html)

For an example sequence from portfolio analysis to fitted tariff, see:

- [Getting started](getting-started.html)