Simulated data sets to illustrate the package functionality
poissontraindata and poissontestdata are synthetically generated data frames that illustrate the functionality of the package. poissontraindata has 5000 observations and poissontestdata has 1000 observations. Both are random samples drawn from the same simulated population of one million observations, so the same settings were used to generate both data sets.
Format
y: the Poisson-distributed outcome variable
x1: covariate 1
x2: covariate 2
x3: covariate 3
x4: covariate 4
x5: covariate 5
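For reference, the simulation code in the Examples section appears to correspond to the log-linear Poisson model below. It is read off from the coefficient vector B and the log link used there; the symbols $\mu_i$ and $x_{ij}$ are notation introduced here for illustration and do not appear in the package.

```latex
y_i \sim \mathrm{Poisson}(\mu_i), \qquad
\log \mu_i = -2.3 + 1.5\,x_{i1} + 2\,x_{i2} - x_{i3} - 2\,x_{i4} - 1.5\,x_{i5},
\qquad x_{ij} \in [-1, 1]
```

Each covariate is rescaled to the interval [-1, 1] over the full simulated population before the outcome is drawn.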
Examples
# The data sets were generated as follows
library(MASS)
library(magrittr)
# Linearly rescale x to the interval [xmin, xmax]
ScaleRange <- function(x, xmin = -1, xmax = 1) {
  xRange = range(x)
  (x - xRange[1]) / diff(xRange) * (xmax - xmin) + xmin
}
set.seed(144)
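p = 5       # number of covariates
N = 1e6     # size of the simulated population
n = 5e3     # size of the training sample
nOOS = 1e3  # size of the out-of-sample (test) sample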
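# Build a symmetric correlation matrix from the pairwise correlations in rho
S = matrix(NA, 5, 5)
rho = c(0.025, 0, 0, 0.05, 0.075, 0, 0, 0.025, 0, 0)
S[upper.tri(S)] = rho                 # fills the upper triangle column by column
S[lower.tri(S)] = t(S)[lower.tri(S)]  # mirror it into the lower triangle
diag(S) = 1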
Matrix::isSymmetric(S)
#> [1] TRUE
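# Draw correlated normal covariates, then rescale each column to [-1, 1]
X = mvrnorm(N, rep(0, p), Sigma = S, empirical = TRUE)
X = apply(X, 2, ScaleRange)
# Coefficients: intercept followed by the slopes for x1, ..., x5
B = c(-2.3, 1.5, 2, -1, -2, -1.5)
# Poisson mean through the inverse log link, i.e. exp(linear predictor)
mu = poisson()$linkinv(cbind(1, X) %*% B)
Y = rpois(N, mu)
Df = data.frame(Y, X)
# Lower-case the covariate names (X1, ..., X5 become x1, ..., x5)
colnames(Df)[-1] %<>% tolower()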
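set.seed(2)
# Draw the training and test samples from the population; each draw is without
# replacement, but the two draws are made independently of each other
DfS = Df[sample(1:nrow(Df), n, FALSE), ]
DfOOS = Df[sample(1:nrow(Df), nOOS, FALSE), ]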
poissontraindata = DfS
poissontestdata = DfOOS
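A possible way to use the two data sets together is sketched below: fit a Poisson GLM on a training frame and score it on a test frame. This is a minimal sketch and not the package's own workflow. The helper names fit_and_score and simulate_standin are hypothetical, the outcome column is assumed to be named y as listed under Format, and the stand-in data use uniform covariates on [-1, 1] rather than the range-scaled normals used above, so that the block runs on its own.

```r
# Hypothetical helper (not part of the package): fit a Poisson GLM on a
# training frame and return the mean Poisson deviance on a test frame.
fit_and_score <- function(train, test) {
  fit <- glm(y ~ x1 + x2 + x3 + x4 + x5, family = poisson(), data = train)
  mu_hat <- predict(fit, newdata = test, type = "response")
  y <- test$y
  # Unit Poisson deviance; the y * log(y / mu) term is taken as 0 when y == 0
  dev <- 2 * (ifelse(y > 0, y * log(y / mu_hat), 0) - (y - mu_hat))
  list(coefficients = coef(fit), mean_deviance = mean(dev))
}

# Hypothetical stand-in generator with the same column names and coefficients.
# Assumption: covariates are independent uniforms on [-1, 1], which is NOT how
# the packaged data were generated.
simulate_standin <- function(n, B = c(-2.3, 1.5, 2, -1, -2, -1.5)) {
  X <- matrix(runif(n * 5, -1, 1), n, 5,
              dimnames = list(NULL, paste0("x", 1:5)))
  data.frame(y = rpois(n, exp(drop(cbind(1, X) %*% B))), X)
}

set.seed(1)
res <- fit_and_score(simulate_standin(5000), simulate_standin(1000))
res$coefficients   # estimates should lie near the coefficients used to simulate
res$mean_deviance  # out-of-sample mean Poisson deviance
```

With the packaged data, the same helper could be called as fit_and_score(poissontraindata, poissontestdata), provided the outcome column is named y.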