The different methods are described in the papers by Marshall et al. and by Eekhout, van de Wiel and Heymans.

Be aware that backward selection may produce overfitted, optimistic prediction models (see the TRIPOD statement). Backward selection should therefore be followed by internal validation of the model.
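To make the idea of internal validation concrete, here is a minimal base R sketch of a bootstrap optimism correction for the AUC. It is an illustration under stated assumptions, not the psfmi workflow: the data are simulated, it uses a single complete dataset rather than multiply imputed data, and the names `auc`, `fit_model`, `apparent`, `optimism` and `corrected` are hypothetical helpers introduced here. In a real analysis, any variable selection would need to be repeated inside `fit_model` for every bootstrap sample.

```r
set.seed(1)

# Simulated data (assumption): x1 is predictive, x2 is noise
n <- 200
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
dat$y <- rbinom(n, 1, plogis(0.8 * dat$x1))

# AUC from the rank-sum (Mann-Whitney) formula
auc <- function(y, p) {
  r  <- rank(p)
  n1 <- sum(y == 1)
  n0 <- sum(y == 0)
  (sum(r[y == 1]) - n1 * (n1 + 1) / 2) / (n1 * n0)
}

# Model-building step; a selection procedure would go here as well
fit_model <- function(d) glm(y ~ x1 + x2, family = binomial, data = d)

# Apparent performance: model evaluated on the data it was fitted to
apparent <- auc(dat$y, predict(fit_model(dat), type = "response"))

# Optimism: performance in the bootstrap sample minus performance
# of the same bootstrap model in the original data
optimism <- replicate(100, {
  boot  <- dat[sample(n, replace = TRUE), ]
  fit_b <- fit_model(boot)
  auc(boot$y, predict(fit_b, type = "response")) -
    auc(dat$y, predict(fit_b, newdata = dat, type = "response"))
})

corrected <- apparent - mean(optimism)
c(apparent = apparent, corrected = corrected)
```

The corrected value estimates how the model would perform in new patients from the same population; a large gap between the two numbers is a sign of overfitting.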

Logistic Regression

Pooling without BW and method D1

library(psfmi)
pool_lr <- psfmi_lr(data=lbpmilr, nimp=5, impvar="Impnr", Outcome="Chronic",
         predictors=c("Gender", "Smoking", "Function", "JobControl",
                      "JobDemands", "SocialSupport"), method="D1")
pool_lr$RR_model
## $`Step 1 - no variables removed -`
##            term    estimate  std.error    statistic        df     p.value
## 1   (Intercept) -0.02145084 2.49485297 -0.008598036 104.09644 0.993156301
## 2        Gender -0.35445151 0.41807427 -0.847819477 141.28927 0.397972465
## 3       Smoking  0.07565036 0.34084592  0.221948835 147.74179 0.824660215
## 4      Function -0.14188458 0.04337897 -3.270815252 132.02927 0.001368147
## 5    JobControl  0.00690354 0.02053384  0.336203110  88.93815 0.737509628
## 6    JobDemands  0.00227508 0.03872846  0.058744401 103.72259 0.953268722
## 7 SocialSupport  0.04434046 0.05750883  0.771019941 126.70867 0.442130487
##          OR   lower.EXP   upper.EXP
## 1 0.9787776 0.006951596 137.8108760
## 2 0.7015581 0.306989710   1.6032584
## 3 1.0785854 0.549958398   2.1153353
## 4 0.8677214 0.796369271   0.9454664
## 5 1.0069274 0.966670925   1.0488604
## 6 1.0022777 0.928182101   1.0822882
## 7 1.0453382 0.932895897   1.1713332
pool_lr$multiparm
## $`Step 1 - no variables removed -`
##               p-values D1  F-statistic
## Gender        0.396581644  0.718797866
## Smoking       0.824355069  0.049261285
## Function      0.001093311 10.698232410
## JobControl    0.736975846  0.113032531
## JobDemands    0.953182373  0.003450905
## SocialSupport 0.440844794  0.594471750
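The columns of the pooled output are related in a simple way, which the following sketch checks for the `Function` row. The numbers are copied from the output above; the variable names (`est`, `se`, `df`, `stat`, `or`, `ci`) are introduced here for illustration. The calculation assumes a Wald statistic, a two-sided p-value and a 95% interval based on the t distribution with the reported degrees of freedom, which appears consistent with the printed values but is not taken from the package source.

```r
# Pooled values for Function, copied from pool_lr$RR_model above
est <- -0.14188458
se  <- 0.04337897
df  <- 132.02927

stat <- est / se                          # Wald statistic
p    <- 2 * pt(-abs(stat), df)            # two-sided p-value (t reference)
or   <- exp(est)                          # odds ratio
ci   <- exp(est + c(-1, 1) * qt(0.975, df) * se)  # 95% CI on the OR scale

round(c(statistic = stat, p.value = p, OR = or,
        lower.EXP = ci[1], upper.EXP = ci[2]), 6)

# For a term with a single parameter, the squared Wald statistic
# is close to the D1 F-statistic reported in pool_lr$multiparm
stat^2
```

For categorical predictors with more than two levels the D1 statistic tests several coefficients jointly, so this one-to-one correspondence with a single row of `RR_model` no longer holds.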


Pooling with BW and method D3

Pool logistic regression models over 5 imputed datasets with backward selection, using a p-value threshold of 0.05 and pooling method D3 (Meng and Rubin likelihood ratio statistics).

library(psfmi)
pool_lr <- psfmi_lr(data=lbpmilr, nimp=5, impvar="Impnr", Outcome="Chronic",
         predictors=c("Gender", "Smoking", "Function", "JobControl",
                      "JobDemands", "SocialSupport"), p.crit = 0.05, 
         method="D3", direction="BW")
## Warning: The `keep` argument of `group_split()` is deprecated as of dplyr 1.0.0.
## Please use the `.keep` argument instead.
## This warning is displayed once every 8 hours.
## Call `lifecycle::last_warnings()` to see where this warning was generated.
## Removed at Step 1 is - JobDemands
## Removed at Step 2 is - Smoking
## Removed at Step 3 is - JobControl
## Removed at Step 4 is - SocialSupport
## Removed at Step 5 is - Gender
## 
## Selection correctly terminated, 
## No more variables removed from the model
pool_lr$RR_model_final
## $`Step 6`
##          term   estimate  std.error statistic       df     p.value        OR
## 1 (Intercept)  1.2289920 0.46958952  2.617162 133.9118 0.009885997 3.4177826
## 2    Function -0.1398865 0.04195767 -3.333991 125.0844 0.001126680 0.8694569
##   lower.EXP upper.EXP
## 1 1.3501560  8.651769
## 2 0.8001745  0.944738
pool_lr$multiparm_final
## $`Step 6`
##           p-values D3 F-statistic
## Function 0.0004761907    12.29523
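The step-by-step removal reported above follows the usual backward elimination pattern: fit the model, find the predictor with the largest p-value, drop it if that p-value exceeds the threshold, and repeat. The sketch below shows that loop in base R. It is a simplification under stated assumptions: the data are simulated, there is only one complete dataset, and the likelihood ratio tests come from `drop1()` rather than from the pooled D3 statistic that `psfmi_lr` uses. The names `dat`, `p_crit` and `predictors` are hypothetical.

```r
set.seed(2)

# Simulated data (assumption): only x1 is related to the outcome
n <- 300
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
dat$y <- rbinom(n, 1, plogis(1.2 * dat$x1))

p_crit     <- 0.05
predictors <- c("x1", "x2", "x3")

while (length(predictors) > 0) {
  fit <- glm(reformulate(predictors, response = "y"),
             family = binomial, data = dat)

  # Likelihood ratio test for dropping each predictor in turn
  pvals <- drop1(fit, test = "Chisq")[predictors, "Pr(>Chi)"]

  # Stop when every remaining predictor meets the threshold
  if (max(pvals) <= p_crit) break

  removed <- predictors[which.max(pvals)]
  message("Removed: ", removed)
  predictors <- setdiff(predictors, removed)
}

predictors
```

With multiply imputed data the same loop applies, except that the p-value for each candidate removal is obtained by pooling across the imputed datasets instead of from a single fit.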
