data Car
This data set is taken from the dataCar data set of the insuranceData package and slightly adjusted (see the code in the examples for reproducing this data set).
The original data set is based on one-year vehicle insurance policies taken out in 2004 or 2005. There are 67566 policies, of which 4589 (6.8%) had at least one claim.
Usage
data(dataCar)
Format
A data frame with 67566 observations on the following 15 variables.
veh_value
vehicle value, in $10,000s
exposure
exposure, a number between 0 and 1
clm
occurrence of claim (0 = no, 1 = yes)
numclaims
number of claims
claimcst0
claim amount (0 if no claim)
veh_body
vehicle body, coded as
BUS
CONVT
COUPE
HBACK
HDTOP
MCARA
MIBUS
PANVN
RDSTR
SEDAN
STNWG
TRUCK
UTE
veh_age
1 (youngest), 2, 3, 4
gender
a factor with levels
F
M
area
a factor with levels
A
B
C
D
E
F
agecat
1 (youngest), 2, 3, 4, 5, 6
X_OBSTAT_
a factor with levels
01101 0 0 0
Y
the loss ratio, defined as the claim amount (claimcst0) divided by the exposure
w
the exposure, identical to exposure
VehicleType
type of vehicle, either Common vehicle or Uncommon vehicle
VehicleBody
vehicle body, identical to veh_body
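As a rough illustration of how the derived columns relate to the original ones, the sketch below recomputes Y, w, VehicleType and VehicleBody on a small toy data frame. It follows the construction code in the examples, where Y is computed as claimcst0 / exposure. The values in toy are hypothetical, and ifelse() is used here as a vectorised stand-in for the sapply() call in the examples.

```r
# Toy data frame with hypothetical values; only the column names follow dataCar
toy <- data.frame(
  claimcst0 = c(0, 500, 1200),
  exposure  = c(0.5, 0.25, 1),
  veh_body  = factor(c("SEDAN", "TRUCK", "UTE"))
)

# Loss ratio and weight, mirroring the construction code in the examples
toy$Y <- toy$claimcst0 / toy$exposure
toy$w <- toy$exposure

# Sedans, utes and hatchbacks are labelled as common, everything else as uncommon
toy$VehicleType <- ifelse(
  tolower(toy$veh_body) %in% c("sedan", "ute", "hback"),
  "Common vehicle",
  "Uncommon vehicle"
)

# VehicleBody is a plain copy of veh_body
toy$VehicleBody <- toy$veh_body
```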
Details
Adjusted version of the dataCar data set, where we removed observations with a loss ratio of 1 000 000 or more. In addition, we summed the exposure per vehicle body and removed those vehicle bodies for which the summed exposure was less than 100. This ensures that there is sufficient exposure for each vehicle body category.
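The two filtering steps can be sketched in base R on a toy data frame. This is only an illustration under stated assumptions: the values in toy are hypothetical, min_exposure is a made-up threshold chosen so that the small example has something to remove (the data set itself uses 100), and tapply() replaces the plyr call used in the examples.

```r
# Toy data with hypothetical values; column names follow dataCar
toy <- data.frame(
  veh_body  = factor(c("SEDAN", "SEDAN", "SEDAN", "BUS", "UTE", "UTE")),
  exposure  = c(0.9, 0.8, 0.0001, 0.3, 0.7, 0.6),
  claimcst0 = c(0, 300, 500, 0, 0, 1500)
)
toy$Y <- toy$claimcst0 / toy$exposure
toy$w <- toy$exposure

# Step 1: keep only rows with a loss ratio below 1e6
toy <- toy[which(toy$Y < 1e6), ]

# Step 2: sum the exposure per vehicle body and keep the bodies that reach
# the threshold (hypothetical value for this toy example; dataCar uses 100)
min_exposure <- 1
total_w <- tapply(toy$w, toy$veh_body, sum)
keep <- names(total_w)[total_w >= min_exposure]
toy <- toy[toy$veh_body %in% keep, ]

# Remove factor levels that no longer occur
toy$veh_body <- droplevels(toy$veh_body)
```

In this toy example the third SEDAN row is dropped in step 1 because its loss ratio is 500 / 0.0001 = 5e6, and BUS is dropped in step 2 because its total exposure is 0.3.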
References
De Jong P., Heller G.Z. (2008), Generalized linear models for insurance data, Cambridge University Press
Examples
# How to construct the data set using the original dataCar data set from the insuranceData package
library(plyr)
library(magrittr)
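data("dataCar", package = "insuranceData")
# Loss ratio (claim amount per unit of exposure) and weight
dataCar$Y = with(dataCar, claimcst0 / exposure)
dataCar$w = dataCar$exposure
# Keep only observations with a loss ratio below 1e6
dataCar = dataCar[which(dataCar$Y < 1e6), ]
# Per vehicle body: exposure-weighted mean loss ratio (V1) and total exposure (V2)
Yw = ddply(dataCar, .(veh_body), function(x) c(crossprod(x$Y, x$w) / sum(x$w), sum(x$w)))
# Drop vehicle bodies with a total exposure below 100, then remove the unused factor levels
dataCar = dataCar[!dataCar$veh_body %in% Yw[Yw$V2 < 1e2, "veh_body"], ]
dataCar$veh_body %<>% droplevels()
# Label sedans, utes and hatchbacks as common vehicles, all other bodies as uncommon
dataCar$VehicleType = sapply(tolower(dataCar$veh_body), function(x) {
  if (x %in% c("sedan", "ute", "hback"))
    "Common vehicle"
  else
    "Uncommon vehicle"
})
dataCar$VehicleBody = dataCar$veh_body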