Model Stability analysis

Backward selection can be used reliably to build a prediction model when the numbers of persons who are positive and negative on the outcome are large compared with the number of potential predictors, provided that it is combined with internal validation and model stability analysis.

You can find more about model stability analysis in the papers of Royston and Sauerbrei, Sauerbrei and Schumacher, Heymans et al. and Heinze et al.

Model stability analysis evaluates how consistently models and predictors are selected. Bootstrapping is used for this: the selection procedure is repeated in each resampled dataset, and the selected models and predictors are recorded. The psfmi_stab function performs this evaluation in multiply imputed datasets. For a normal (single) dataset, bootstrapping is applied to that dataset directly. For multilevel data, cluster bootstrapping is used (Field).
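
The idea behind bootstrap inclusion frequencies can be sketched in base R on a single, complete dataset. This is only an illustration under stated assumptions, not the implementation used by psfmi_stab: the data are simulated, the helper backward_select is hypothetical, and selection is based on Wald p-values from glm rather than on pooled tests over imputed datasets.

```r
# Hypothetical helper (not part of psfmi): backward elimination for a
# logistic model, dropping the predictor with the largest Wald p-value
# until all remaining p-values are <= p_crit.
backward_select <- function(data, outcome, predictors, p_crit = 0.05) {
  keep <- predictors
  while (length(keep) > 0) {
    fit <- glm(reformulate(keep, response = outcome),
               data = data, family = binomial())
    tab <- coef(summary(fit))
    pvals <- setNames(tab[keep, "Pr(>|z|)"], keep)
    worst <- which.max(pvals)
    if (pvals[worst] <= p_crit) break   # every remaining predictor is significant
    keep <- keep[-worst]                # remove the least significant predictor
  }
  keep
}

# Simulated example data (an assumption): only x1 is related to the outcome.
set.seed(123)
n <- 300
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
dat$y <- rbinom(n, 1, plogis(dat$x1))

predictors <- c("x1", "x2", "x3")
nboot <- 50

# One row per bootstrap sample: 1 if the predictor was selected, 0 otherwise.
bif <- t(sapply(seq_len(nboot), function(b) {
  boot <- dat[sample(nrow(dat), replace = TRUE), ]
  as.integer(predictors %in% backward_select(boot, "y", predictors))
}))
colnames(bif) <- predictors

# Percentage of bootstrap samples in which each predictor was selected.
bif_perc <- colMeans(bif) * 100
bif_perc
```

A predictor that is selected in nearly every bootstrap sample is a stable choice. One that is selected only occasionally depends heavily on the particular sample.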

Example of Model Stability analysis

First, run backward selection over 5 imputed datasets using a p-value of 0.05 and method D1. The line of code pool_lr$predictors_in gives information about the predictors that were selected in the model at each selection step.

library(psfmi)
## Registered S3 methods overwritten by 'car':
##   method                          from
##   influence.merMod                lme4
##   cooks.distance.influence.merMod lme4
##   dfbeta.influence.merMod         lme4
##   dfbetas.influence.merMod        lme4
pool_lr <- psfmi_lr(data=lbpmilr, nimp=5, impvar="Impnr", Outcome="Chronic",
    predictors=c("Gender", "Smoking", "Function", "SocialSupport"), 
    p.crit = 0.05, method="D1", direction="BW")
## Removed at Step 1 is - Smoking
## Removed at Step 2 is - SocialSupport
## Removed at Step 3 is - Gender
## 
## Selection correctly terminated, 
## No more variables removed from the model
pool_lr$predictors_in
## # A tibble: 1 x 1
##   value   
##   <chr>   
## 1 Function

Now apply model stability analysis. For this example 10 bootstrap samples are used, but this number can easily be increased to 1000. Note that this may take a while when the predictors are selected in several steps.

library(psfmi)
pool_lr <- psfmi_lr(data=lbpmilr, nimp=3, impvar="Impnr", Outcome="Chronic",
    predictors=c("Gender", "Smoking", "Function", "SocialSupport"), 
    p.crit = 0.05, method="D1", direction="BW")
## Removed at Step 1 is - Smoking
## Removed at Step 2 is - SocialSupport
## Removed at Step 3 is - Gender
## 
## Selection correctly terminated, 
## No more variables removed from the model
stab_res <- psfmi_stab(pool_lr, direction="BW", start_model = TRUE,
      boot_method = "single", nboot=10, p.crit=0.05)
## 
## Boot 1
## Removed at Step 1 is - Smoking
## Removed at Step 2 is - SocialSupport
## Removed at Step 3 is - Gender
## 
## Selection correctly terminated, 
## No more variables removed from the model
## 
## Boot 2
## Removed at Step 1 is - Gender
## Removed at Step 2 is - Smoking
## Removed at Step 3 is - SocialSupport
## Removed at Step 4 is - Function
## 
## Boot 3
## Removed at Step 1 is - SocialSupport
## Removed at Step 2 is - Gender
## Removed at Step 3 is - Smoking
## 
## Selection correctly terminated, 
## No more variables removed from the model
## 
## Boot 4
## Removed at Step 1 is - SocialSupport
## Removed at Step 2 is - Smoking
## Removed at Step 3 is - Gender
## 
## Selection correctly terminated, 
## No more variables removed from the model
## 
## Boot 5
## Removed at Step 1 is - Smoking
## Removed at Step 2 is - Gender
## Removed at Step 3 is - SocialSupport
## 
## Selection correctly terminated, 
## No more variables removed from the model
## 
## Boot 6
## Removed at Step 1 is - Smoking
## Removed at Step 2 is - Gender
## Removed at Step 3 is - SocialSupport
## 
## Selection correctly terminated, 
## No more variables removed from the model
## 
## Boot 7
## Removed at Step 1 is - Gender
## Removed at Step 2 is - Smoking
## Removed at Step 3 is - SocialSupport
## Removed at Step 4 is - Function
## 
## Boot 8
## Removed at Step 1 is - SocialSupport
## Removed at Step 2 is - Smoking
## Removed at Step 3 is - Gender
## 
## Selection correctly terminated, 
## No more variables removed from the model
## 
## Boot 9
## Removed at Step 1 is - Smoking
## Removed at Step 2 is - SocialSupport
## Removed at Step 3 is - Function
## 
## Selection correctly terminated, 
## No more variables removed from the model
## 
## Boot 10
## Removed at Step 1 is - Gender
## Removed at Step 2 is - Smoking
## Removed at Step 3 is - SocialSupport
## Removed at Step 4 is - Function
stab_res
## $bif
##         Gender Smoking Function SocialSupport
## boot 1       0       0        1             0
## boot 2       0       0        0             0
## boot 3       0       0        1             0
## boot 4       0       0        1             0
## boot 5       0       0        1             0
## boot 6       0       0        1             0
## boot 7       0       0        0             0
## boot 8       0       0        1             0
## boot 9       1       0        0             0
## boot 10      0       0        0             0
## 
## $bif_total
##        Gender       Smoking      Function SocialSupport 
##             1             0             6             0 
## 
## $bif_perc
##        Gender       Smoking      Function SocialSupport 
##            10             0            60             0 
## 
## $model_stab
##   Gender Smoking Function SocialSupport freq bif_pat_perc
## 1      0       0        1             0    6           60
## 2      0       0        0             0    3           30
## 3      1       0        0             0    1           10
## 
## $call
## psfmi_stab(pobj = pool_lr, boot_method = "single", nboot = 10, 
##     p.crit = 0.05, start_model = TRUE, direction = "BW")
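
The summary elements in the output above are related to each other in a simple way, which the following sketch makes explicit. It rebuilds the printed $bif matrix by hand (so that it runs without psfmi) and derives the totals, percentages and model pattern frequencies from it. The code shows how the printed numbers relate, under the assumption that they are plain column sums and pattern counts. It is not a description of the internals of psfmi_stab.

```r
# Selection indicators copied from the printed $bif output above
# (1 = predictor selected in that bootstrap sample, 0 = not selected).
bif <- rbind(
  c(0, 0, 1, 0), c(0, 0, 0, 0), c(0, 0, 1, 0), c(0, 0, 1, 0), c(0, 0, 1, 0),
  c(0, 0, 1, 0), c(0, 0, 0, 0), c(0, 0, 1, 0), c(1, 0, 0, 0), c(0, 0, 0, 0)
)
dimnames(bif) <- list(paste("boot", 1:10),
                      c("Gender", "Smoking", "Function", "SocialSupport"))

# Number and percentage of bootstrap samples in which each predictor was selected.
bif_total <- colSums(bif)
bif_perc  <- 100 * bif_total / nrow(bif)

# Frequency of each selected model, coded as a 0/1 pattern in the column
# order Gender, Smoking, Function, SocialSupport.
patterns   <- apply(bif, 1, paste, collapse = "")
model_freq <- sort(table(patterns), decreasing = TRUE)

bif_total
bif_perc
model_freq
```

In this run, the model containing only Function (pattern 0010) was selected in 6 of the 10 bootstrap samples, the empty model in 3, and a model containing only Gender in 1. These counts match $model_stab.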