Thanks to Marco De Virgilis.
Moved NEWS file into RMarkdown package vignette format.
Previously it was required that the row and column names of a triangle be convertible to numeric, although that “requirement” did not always cause a problem. For example, the following sets the rownames of GenIns to the beginning Date of the accident year.
## dev
## origin 1 2 3 4 5 6 7 8
## 2001-01-01 357848 1124788 1735330 2218270 2745596 3319994 3466336 3606286
## 2002-01-01 352118 1236139 2170033 3353322 3799067 4120063 4647867 4914039
## 2003-01-01 290507 1292306 2218525 3235179 3985995 4132918 4628910 4909315
## 2004-01-01 310608 1418858 2195047 3757447 4029929 4381982 4588268 NA
## 2005-01-01 443160 1136350 2128333 2897821 3402672 3873311 NA NA
## 2006-01-01 396132 1333217 2180715 2985752 3691712 NA NA NA
## 2007-01-01 440832 1288463 2419861 3483130 NA NA NA NA
## 2008-01-01 359480 1421128 2864498 NA NA NA NA NA
## 2009-01-01 376686 1363294 NA NA NA NA NA NA
## 2010-01-01 344014 NA NA NA NA NA NA NA
## dev
## origin 9 10
## 2001-01-01 3833515 3901463
## 2002-01-01 5339085 NA
## 2003-01-01 NA NA
## 2004-01-01 NA NA
## 2005-01-01 NA NA
## 2006-01-01 NA NA
## 2007-01-01 NA NA
## 2008-01-01 NA NA
## 2009-01-01 NA NA
## 2010-01-01 NA NA
A plot with the lattice=TRUE option, which previously failed, now displays with informative headings.
It can often be useful to have “origin” values that are not necessarily convertible to numeric. For example, suppose you have a table of claim detail at various evaluation dates. Invariably, such a table will have a Date field holding the date of loss. It would be nice to be able to summarize that data by accident year “cuts”. It turns out there’s a built-in function in R that will get you most of the way there. It’s called ‘cut’.
Here we take the GenIns data in long format and generate 50 claims per accident period. We assign each claim a random date within the year. The incurred (or paid) “value” given is a random perturbation of one-fiftieth of GenInsLong$value.
We accumulate the detail into an accident year triangle using ChainLadder’s as.triangle method. The summarized triangle displayed at the end is very similar to GenIns, and has informative row labels.
x <- GenInsLong
# start off y with x's headings
y <- x[0,]
names(y)[1] <- "lossdate"
set.seed(1234)
n <- 50 # number of simulated claims per accident period
for (i in 1:nrow(x)) {
  y <- rbind(y,
             data.frame(
               lossdate = as.Date(
                 as.numeric(as.Date(paste0(x[i, "accyear"] + 2000, "-01-01"))) +
                   round(runif(n, 0, 364), 0), origin = "1970-01-01"),
               devyear = x[i, "devyear"],
               incurred.claims = rnorm(n, mean = x[i, "incurred claims"] / n,
                                       sd = x[i, "incurred claims"] / (10 * n))
             ))
}
# here's the magic cut
y$ay <- cut(y$lossdate, breaks = "years")
# this summarized triangle is very similar to GenIns
as.triangle(y, origin = "ay", dev = "devyear", value = "incurred.claims")
## devyear
## ay 1 2 3 4 5 6 7 8
## 2001-01-01 349741.1 1109368 1737850 2265706 2749056 3318464 3469142 3549578
## 2002-01-01 352821.5 1245621 2132200 3377061 3820987 4148933 4610189 4891852
## 2003-01-01 296547.8 1275881 2198221 3235844 3944931 4113276 4623159 4900318
## 2004-01-01 313669.5 1392038 2171462 3774168 4035879 4461897 4661352 NA
## 2005-01-01 443940.5 1138787 2190873 2905444 3371444 3849587 NA NA
## 2006-01-01 391526.6 1324732 2230006 3000719 3742811 NA NA NA
## 2007-01-01 446941.9 1292116 2416001 3404734 NA NA NA NA
## 2008-01-01 349330.2 1425022 2844242 NA NA NA NA NA
## 2009-01-01 369893.1 1368242 NA NA NA NA NA NA
## 2010-01-01 346492.8 NA NA NA NA NA NA NA
## devyear
## ay 9 10
## 2001-01-01 3769684 3980606
## 2002-01-01 5311927 NA
## 2003-01-01 NA NA
## 2004-01-01 NA NA
## 2005-01-01 NA NA
## 2006-01-01 NA NA
## 2007-01-01 NA NA
## 2008-01-01 NA NA
## 2009-01-01 NA NA
## 2010-01-01 NA NA
The user is encouraged to experiment with other cuts; for example, breaks = "quarters" will generate accident quarter triangles.
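As a minimal base R sketch of what cut does with a different break, the following buckets a few made-up loss dates (hypothetical values, not taken from GenIns) into accident quarters. Each date is labeled with the first day of the quarter it falls in.

```r
# Hypothetical loss dates, for illustration only
lossdate <- as.Date(c("2001-02-15", "2001-05-03", "2001-11-30", "2002-01-10"))

# Label each date with the start of the quarter it falls in
aq <- cut(lossdate, breaks = "quarters")

# The result is a factor whose labels are quarter start dates
as.character(aq)

# Claim counts per accident quarter
table(aq)
```

The resulting factor can be passed as the origin column to as.triangle in the same way as the "ay" column above.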
A new function, as.LongTriangle, will convert a triangle from “wide” (matrix) format to “long” (data.frame) format. This differs from ChainLadder’s as.data.frame.triangle method in that the rownames and colnames of Triangle are stored as factors. This feature can be particularly important when plotting a triangle because factors preserve the order of the “origin” and “dev” values.
Additionally, the columns of the resulting data frame may be renamed from the default values (“origin”, “dev”, and “value”) using the “varnames” argument for “origin”/“dev” and the “value.name” argument for “value”.
In the following example, the GenIns triangle in ChainLadder is converted to a data.frame with non-default names:
GenLong <- as.LongTriangle(GenIns, varnames = c("accident year", "development age"),
value.name = "Incurred Loss")
head(GenLong)
## accident year development age Incurred Loss
## 1 1 1 357848
## 2 2 1 352118
## 3 3 1 290507
## 4 4 1 310608
## 5 5 1 443160
## 6 6 1 396132
In the following plot, the last accident year and the last development age are shown last, rather than second as they would have been if displayed alphabetically (ggplot’s default for character data):
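The ordering issue can be seen in base R alone. The sketch below uses a hypothetical vector of development ages; it illustrates why factors help and is not code from as.LongTriangle itself.

```r
# Development ages stored as character strings
dev_chr <- as.character(1:10)

# Alphabetical sorting places "10" directly after "1"
sort(dev_chr)

# A factor with explicit levels keeps the intended order
dev_fct <- factor(dev_chr, levels = dev_chr)
levels(dev_fct)
```

Plotting functions that respect factor levels will then draw age 10 last rather than second.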
Previously, when an “exposure” attribute was assigned to a triangle for use with glmReserve, it was assumed that the user would supply the values in the same order as the accident years. Then, behind the scenes, glmReserve would use an arithmetic formula to match the exposure with the appropriate accident year using the numeric “origin” values after the triangle had been converted to long format.
glmReserve now allows “exposure” to have “names” that coincide with the rownames of the triangle; these names are used to match exposure to origin in long format. Here is an example, now included in ?glmReserve.
GenIns2 <- GenIns
rownames(GenIns2) <- paste0(2001:2010, "-01-01")
expos <- (7 + 1:10 * 0.4) * 10
names(expos) <- rownames(GenIns2)
attr(GenIns2, "exposure") <- expos
glmReserve(GenIns2)
## Latest Dev.To.Date Ultimate IBNR S.E CV
## 2002-01-01 5339085 0.98258394 5433719 94634 110099.9 1.1634283
## 2003-01-01 4909315 0.91271125 5378826 469511 216043.4 0.4601455
## 2004-01-01 4588268 0.86605312 5297906 709638 260872.1 0.3676129
## 2005-01-01 3873311 0.79727286 4858200 984889 303550.0 0.3082073
## 2006-01-01 3691712 0.72228301 5111171 1419459 375013.9 0.2641949
## 2007-01-01 3483130 0.61531018 5660771 2177641 495378.0 0.2274838
## 2008-01-01 2864498 0.42219349 6784799 3920301 789961.1 0.2015052
## 2009-01-01 1363294 0.24162172 5642266 4278972 1046513.8 0.2445713
## 2010-01-01 344014 0.06922055 4969825 4625811 1980101.4 0.4280550
## total 30456627 0.61982473 49137483 18680856 2945660.9 0.1576834
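The idea of matching exposure by name rather than by position can be sketched in base R. This is only an illustration of the lookup, under the assumption that origin is stored as character; it is not glmReserve’s internal code, and the data frame `long` is hypothetical.

```r
# Exposure named by origin label, deliberately supplied out of order
expos <- c("2003-01-01" = 82, "2001-01-01" = 74, "2002-01-01" = 78)

# A few long-format triangle rows (origin stored as character, not factor)
long <- data.frame(
  origin = c("2001-01-01", "2001-01-01", "2002-01-01", "2003-01-01"),
  dev    = c(1, 2, 1, 1),
  stringsAsFactors = FALSE
)

# Look up exposure by name, so the supplied order does not matter
long$exposure <- unname(expos[long$origin])
long
```

Note that indexing with a factor would use its integer codes instead of its labels, which is why the sketch keeps origin as character.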
The glmReserve function now supports the negative binomial GLM, a more natural way to model over-dispersion in count data. The model is fitted through the glm.nb function from the MASS package.
To fit the negative binomial GLM to the loss triangle, simply set nb = TRUE in calling the glmReserve function:
## Latest Dev.To.Date Ultimate IBNR S.E CV
## 2 5339085 0.98282233 5432401 93316 37402.11 0.4008113
## 3 4909315 0.91663181 5355820 446505 132949.43 0.2977557
## 4 4588268 0.88245834 5199416 611148 147083.10 0.2406669
## 5 3873311 0.79610366 4865335 992024 210714.29 0.2124085
## 6 3691712 0.71756209 5144798 1453086 290921.41 0.2002094
## 7 3483130 0.61438536 5669292 2186162 435789.89 0.1993402
## 8 2864498 0.43869620 6529571 3665073 779454.57 0.2126710
## 9 1363294 0.24851792 5485697 4122403 973734.25 0.2362055
## 10 344014 0.07078345 4860091 4516077 1380681.59 0.3057259
## total 30456627 0.62742290 48542422 18085795 2237970.23 0.1237419
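For readers who want to see the underlying model outside glmReserve, here is a rough sketch that fits glm.nb directly to a simulated incremental triangle in long format. The simulation parameters and the names `obs`, `future`, and `reserve` are assumptions made for illustration; glmReserve’s own data preparation and error estimation are not reproduced here.

```r
library(MASS)  # provides glm.nb

set.seed(42)
n <- 10

# Observed (upper-left) cells of a simulated 10 x 10 incremental triangle
obs <- expand.grid(origin = 1:n, dev = 1:n)
obs <- obs[obs$origin + obs$dev <= n + 1, ]

# Simulate over-dispersed incremental claims (hypothetical parameters)
mu <- exp(6 + 0.05 * obs$origin - 0.3 * (obs$dev - 1))
obs$value <- rnbinom(nrow(obs), mu = mu, size = 20)

# Negative binomial GLM with origin and development period as factors
fit <- glm.nb(value ~ factor(origin) + factor(dev), data = obs)

# Predict the unobserved (lower-right) cells and sum them by origin period
future <- expand.grid(origin = 1:n, dev = 1:n)
future <- future[future$origin + future$dev > n + 1, ]
future$pred <- predict(fit, newdata = future, type = "response")
reserve <- tapply(future$pred, future$origin, sum)
reserve
```

The estimated dispersion parameter is available as fit$theta; smaller values indicate stronger over-dispersion relative to a Poisson model.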
New files in the /inst/unittests/ folder can be used for future enhancements.
Contributors to those R files are encouraged to use those runit scripts for testing and, of course, to add other runit scripts as warranted.
By default, R’s lm method generates a warning when it detects an “essentially perfect fit”. This can happen when one column of a triangle is a constant multiple of the previous column; i.e., when all link ratios in a column are the same. In the example below, the second column is a fixed constant, 1.05, times the first column. ChainLadder previously issued the lm warning below.
x <- matrix(byrow = TRUE, nrow = 4, ncol = 4,
dimnames = list(origin = LETTERS[1:4], dev = 1:4),
data = c(
100, 105, 106, 106.5,
200, 210, 211, NA,
300, 315, NA, NA,
400, NA, NA, NA)
)
mcl <- MackChainLadder(x, est.sigma = "Mack")
Warning messages:
1: In summary.lm(x) : essentially perfect fit: summary may be unreliable
2: In summary.lm(x) : essentially perfect fit: summary may be unreliable
3: In summary.lm(x) : essentially perfect fit: summary may be unreliable
which may have raised a concern with the user when none was warranted.
Now ChainLadder issues an “informational warning”:
## Warning in Mack.S.E(CL[["Models"]], FullTriangle, est.sigma = est.sigma, : Information: essentially no variation in development data for period(s):
## '1-2'
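The source of the original warning can be reproduced with lm alone. The sketch below regresses the second development column on the first for the three origin periods that have both values, using a no-intercept fit with weights 1/first as an assumed stand-in for the volume-weighted chain ladder regression; it is not the code inside MackChainLadder.

```r
# First two development columns from the example triangle above
first  <- c(100, 200, 300)
second <- c(105, 210, 315)   # exactly 1.05 times the first column

# No-intercept weighted regression: the fit has no residual variation
fit <- lm(second ~ first + 0, weights = 1 / first)
coef(fit)

# Capture the warning that summary() raises for such a fit
msg <- tryCatch(
  { summary(fit); "no warning" },
  warning = function(w) conditionMessage(w)
)
msg
```

Because the residual variance is effectively zero, summary reports an essentially perfect fit, which is what ChainLadder now translates into its informational message.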
Fixed tail extrapolation in Vignette. (Thanks to Mark Lee.)
Added back functionality to estimate the index parameter for the compound Poisson model in ‘glmReserve’ (now depends on package cplm). This works for both ‘formula’ and ‘bootstrap’.
Added methods ‘resid’ and plot for class ‘glmReserve’ (now depends on ggplot2)
New function PaidIncurredChain by Fabio Concina, based on the 2010 Merz & Wuthrich paper “Paid-incurred chain claims reserving method”.
plot.MackChainLadder and plot.BootChainLadder gained new argument ‘which’, allowing users to specify which sub-plot to display. Thanks to Christophe Dutang for this suggestion.
Updated NAMESPACE file to comply with new R CMD checks in R-3.3.0
Removed package dependencies on grDevices and Hmisc
Expanded package vignette with new paragraph on importing spreadsheet data, a new section “Paid-Incurred Chain Model” and an added example for a full claims development picture in the “One Year Claims Development Result” section.
New generic function CDR to estimate the one year claims development result. S3 methods for the Mack and bootstrap model have been added already:
New function tweedieReserve to estimate reserves in a GLM framework, including the one year claims development result.
Package vignette has new chapter ‘One Year Claims Development Result’.
New example data MW2008 and MW2014 from the Merz & Wuthrich (2008, 2014) papers
Source code development moved from Google Code to GitHub: https://github.com/mages/ChainLadder
as.data.frame.triangle now gives warning message when dev. period is a character
Alessandro Carrato, Giuseppe Crupi and Mario Wuthrich have been added as authors, thanks to their major contribution to code and documentation
Christophe Dutang, Arnaud Lacoume and Arthur Charpentier have been added as contributors, thanks to their feedback, guidance and code contribution
A new function, CLFMdelta, finds the value of delta such that the model coefficients resulting from the ‘chainladder’ function with that value for argument delta are consistent with an input vector of ‘selected’ age-to-age factors, subject to restrictions on the ‘selected’ factors relative to the input ‘Triangle’. See the paper “A Family of Chain-Ladder Factor Models for Selected Link Ratios” by Bardis, Majidi, Murphy: https://www.variancejournal.org/issues/?fa=article&abstrID=6943
A new ‘coef’ method returns the age-to-age factor coefficients of the regression models estimated by the ‘chainladder’ function.
Exports a function “LRfunction” that calculates a Triangle’s link ratio function and can be used to plot the space of “reasonable link ratio selections” per the CLFM paper.
ClarkLDF and ClarkCapeCod functions: additional functionality
A ‘vcov’ method now exists to produce the covariance matrix of the estimated parameters using the approach in Clark’s paper
Additional values (in lists) returned by Clark’s methods:
Fine-tuning of maximum likelihood numerical algorithm’s control parameters
If the solution is found at the boundary of the parameter region, it is conceivable that a “more optimal” solution might exist if the boundary constraints were not as conservative, so a warning is given
The parameters returned by the methods were the scaled versions; they are now returned at their original scales.
The loss development factor (LDF) being returned by ClarkCapeCod was not documented
New implementation of the methods in David Clark’s “LDF Curve Fitting” paper in the 2003 Forum by Daniel Murphy.
‘MackChainLadder’ has new argument ‘alpha’ as an additional weighting parameter. As a result, the argument ‘weights’ is now just that: weights, which should be between 0 and 1. The argument ‘alpha’ selects among different chain ladder age-to-age factors; the default for all development periods is alpha=1. See Mack’s 1999 paper: alpha=1 gives the historical chain ladder age-to-age factors, alpha=0 gives the straight average of the observed individual development factors, and alpha=2 is the result of an ordinary regression with intercept 0.
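The three cases of alpha can be checked numerically with a weighted regression through the origin, using weights 1/x^(2 - alpha). The sketch below uses the first two development columns of GenIns as shown earlier in this document; the helper `ata_factor` is a hypothetical name and is not part of ChainLadder.

```r
# First two development columns of GenIns (origin periods with both values)
x <- c(357848, 352118, 290507, 310608, 443160, 396132, 440832, 359480, 376686)
y <- c(1124788, 1236139, 1292306, 1418858, 1136350, 1333217, 1288463,
       1421128, 1363294)

# Age-to-age factor from a no-intercept regression weighted by 1 / x^(2 - alpha)
ata_factor <- function(x, y, alpha) {
  unname(coef(lm(y ~ x + 0, weights = 1 / x^(2 - alpha))))
}

# alpha = 1: volume-weighted average, sum(y) / sum(x)
all.equal(ata_factor(x, y, 1), sum(y) / sum(x))

# alpha = 0: straight average of the individual link ratios
all.equal(ata_factor(x, y, 0), mean(y / x))

# alpha = 2: ordinary least squares through the origin
all.equal(ata_factor(x, y, 2), sum(x * y) / sum(x^2))
```

Each comparison follows from the closed form of the weighted least squares slope, sum(w * x * y) / sum(w * x^2), with w = 1/x^(2 - alpha).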
Basic ‘chainladder’ function now available using linear models. See ?chainladder for more information.
More examples for ‘MackChainLadder’ demonstrate how to apply the MackChainLadder over several triangles in ‘one-line’.
‘as.data.frame.triangle’ has new argument ‘lob’ (e.g. line of business) which allows setting an additional label column in the data frame output.
‘MackChainLadder’: Latest position of incomplete triangles was in some cases not returned correctly. Thanks to Ben Escoto for reporting and providing a patch.
‘MackChainLadder’:
New triangle class with S3 methods for plot, print and conversion from triangles to data.frames and vice versa
New utility functions ‘incr2cum’ and ‘cum2incr’ to convert incremental triangles into cumulative triangles and vice versa. Thanks to Christophe Dutang.
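The conversion itself is a row-wise cumulative sum and its inverse. The base R sketch below shows the idea on a small cumulative triangle; the helpers `to_cum` and `to_incr` are hypothetical names written for this illustration and do not reproduce the package’s implementation of incr2cum and cum2incr.

```r
# Row-wise cumulative sum: incremental -> cumulative
to_cum <- function(incr) {
  out <- t(apply(incr, 1, cumsum))
  dimnames(out) <- dimnames(incr)
  out
}

# Row-wise differences: cumulative -> incremental
to_incr <- function(cum) {
  out <- cbind(cum[, 1], t(apply(cum, 1, diff)))
  dimnames(out) <- dimnames(cum)
  out
}

# Small cumulative triangle; NA marks cells not yet observed
cum <- matrix(byrow = TRUE, nrow = 4, ncol = 4,
              dimnames = list(origin = LETTERS[1:4], dev = 1:4),
              data = c(100, 105, 106, 106.5,
                       200, 210, 211, NA,
                       300, 315, NA,  NA,
                       400, NA,  NA,  NA))

incr <- to_incr(cum)
incr

# Converting back recovers the cumulative triangle
all.equal(to_cum(incr), cum)
```

Unobserved cells stay NA in both directions because cumsum and diff propagate missing values along each row.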
New logical argument lattice for plot.MackChainLadder (and plot.triangle), which allows plotting developments by origin period in separate panels.