Logistic Regression after Multiple Imputation - different selection criteria
The different methods are described in the papers by Marshall et al. and by Eekhout, van de Wiel and Heymans.
Be aware that backward selection may result in overfitted and optimistic prediction models (see the TRIPOD statement). Backward selection should therefore be followed by internal validation of the model.
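To make the point about optimism concrete, the following is a minimal base R sketch of bootstrap internal validation on a single complete dataset. It is not the pooled procedure used in this document: the data are simulated, the names `dat`, `auc`, and `fit_bw` are hypothetical, and AIC-based `step()` stands in for p-value based backward selection. The key idea is that the selection is repeated inside every bootstrap sample, so the optimism estimate reflects the whole modelling procedure.

```r
set.seed(1)

# Simulated data (assumption): only x1 is truly related to the outcome
n   <- 200
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n), x4 = rnorm(n))
dat$y <- rbinom(n, 1, plogis(-0.5 + 0.8 * dat$x1))

# Rank-based (Wilcoxon) estimate of the area under the ROC curve
auc <- function(y, p) {
  r  <- rank(p)
  n1 <- sum(y == 1)
  n0 <- sum(y == 0)
  (sum(r[y == 1]) - n1 * (n1 + 1) / 2) / (n1 * n0)
}

# Full model followed by backward selection (AIC-based stand-in)
fit_bw <- function(d) {
  full <- glm(y ~ x1 + x2 + x3 + x4, family = binomial, data = d)
  step(full, direction = "backward", trace = 0)
}

# Apparent performance: model developed and evaluated on the same data
apparent_fit <- fit_bw(dat)
apparent_auc <- auc(dat$y, predict(apparent_fit, type = "response"))

# Optimism: performance in the bootstrap sample minus performance
# of the same bootstrap model in the original data
optimism <- replicate(50, {
  boot <- dat[sample(n, replace = TRUE), ]
  fit  <- fit_bw(boot)
  auc(boot$y, predict(fit, type = "response")) -
    auc(dat$y, predict(fit, newdata = dat, type = "response"))
})

corrected_auc <- apparent_auc - mean(optimism)
c(apparent = apparent_auc, optimism = mean(optimism), corrected = corrected_auc)
```

With multiply imputed data the same logic applies, but the resampling has to be combined with the imputation step, which this sketch does not attempt.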
Logistic Regression
Pooling without BW and method D1
Pooling logistic regression models over 5 imputed datasets without backward selection, using method D1 (multivariate Wald test).
library(psfmi)
## Registered S3 methods overwritten by 'car':
## method from
## influence.merMod lme4
## cooks.distance.influence.merMod lme4
## dfbeta.influence.merMod lme4
## dfbetas.influence.merMod lme4
pool_lr <- psfmi_lr(data=lbpmilr, nimp=5, impvar="Impnr", Outcome="Chronic",
predictors=c("Gender", "Smoking", "Function", "JobControl",
"JobDemands", "SocialSupport"), method="D1")
pool_lr$RR_model
## $`Step 1 - no variables removed -`
## term estimate std.error statistic df p.value
## 1 (Intercept) -0.02145084 2.49485297 -0.008598036 104.09644 0.993156301
## 2 Gender -0.35445151 0.41807427 -0.847819477 141.28927 0.397972465
## 3 Smoking 0.07565036 0.34084592 0.221948835 147.74179 0.824660215
## 4 Function -0.14188458 0.04337897 -3.270815252 132.02927 0.001368147
## 5 JobControl 0.00690354 0.02053384 0.336203110 88.93815 0.737509628
## 6 JobDemands 0.00227508 0.03872846 0.058744401 103.72259 0.953268722
## 7 SocialSupport 0.04434046 0.05750883 0.771019941 126.70867 0.442130487
## OR lower.EXP upper.EXP
## 1 0.9787776 0.006951596 137.8108760
## 2 0.7015581 0.306989710 1.6032584
## 3 1.0785854 0.549958398 2.1153353
## 4 0.8677214 0.796369271 0.9454664
## 5 1.0069274 0.966670925 1.0488604
## 6 1.0022777 0.928182101 1.0822882
## 7 1.0453382 0.932895897 1.1713332
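The `OR`, `lower.EXP`, and `upper.EXP` columns can be reconstructed from the pooled estimate, standard error, and degrees of freedom. The sketch below does this for the `Function` row, assuming the limits are 95% limits based on the t distribution with the reported `df`; under that assumption the printed values are reproduced to the shown precision. The variable names are illustrative only.

```r
# Values copied from the Function row of the pooled output above
estimate  <- -0.14188458
std_error <- 0.04337897
df        <- 132.02927

# Wald statistic and two-sided p-value from the t distribution
statistic <- estimate / std_error
p_value   <- 2 * pt(-abs(statistic), df)

# Odds ratio and 95% confidence limits, exponentiated from the logit scale
t_crit     <- qt(0.975, df)
odds_ratio <- exp(estimate)
lower      <- exp(estimate - t_crit * std_error)
upper      <- exp(estimate + t_crit * std_error)

round(c(statistic = statistic, p.value = p_value, OR = odds_ratio,
        lower.EXP = lower, upper.EXP = upper), 4)
```

Because the interval is computed on the log odds scale and then exponentiated, it is not symmetric around the odds ratio.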
pool_lr$multiparm
## $`Step 1 - no variables removed -`
## p-values D1 F-statistic
## Gender 0.396581644 0.718797866
## Smoking 0.824355069 0.049261285
## Function 0.001093311 10.698232410
## JobControl 0.736975846 0.113032531
## JobDemands 0.953182373 0.003450905
## SocialSupport 0.440844794 0.594471750
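For predictors that enter the model with a single coefficient, the D1 F-statistic in this output equals the square of the pooled Wald statistic reported in `RR_model`. The short check below uses the two printed values for `Function`; the variable names are illustrative.

```r
# Printed values for Function, copied from the two outputs above
t_function <- -3.270815252   # statistic column in RR_model
f_d1       <- 10.698232410   # F-statistic column in multiparm

# The squared Wald statistic matches the D1 F-statistic
t_function^2
isTRUE(all.equal(t_function^2, f_d1, tolerance = 1e-6))
```

The two p-values for `Function` are close but not identical (0.0014 in `RR_model` versus 0.0011 in `multiparm`), which is presumably due to the different reference distributions and degrees of freedom used by the two pooling rules.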
Pooling with BW and method D3
Pooling logistic regression models over 5 imputed datasets with backward selection, using a p-value criterion of 0.05 and method D3 (Meng and Rubin likelihood ratio statistics).
library(psfmi)
pool_lr <- psfmi_lr(data=lbpmilr, nimp=5, impvar="Impnr", Outcome="Chronic",
predictors=c("Gender", "Smoking", "Function", "JobControl",
"JobDemands", "SocialSupport"), p.crit = 0.05,
method="D3", direction="BW")
## Warning: The `keep` argument of `group_split()` is deprecated as of dplyr 1.0.0.
## Please use the `.keep` argument instead.
## This warning is displayed once every 8 hours.
## Call `lifecycle::last_warnings()` to see where this warning was generated.
## Removed at Step 1 is - JobDemands
## Removed at Step 2 is - Smoking
## Removed at Step 3 is - JobControl
## Removed at Step 4 is - SocialSupport
## Removed at Step 5 is - Gender
##
## Selection correctly terminated,
## No more variables removed from the model
pool_lr$RR_model_final
## $`Step 6`
## term estimate std.error statistic df p.value OR
## 1 (Intercept) 1.2289920 0.46958952 2.617162 133.9118 0.009885997 3.4177826
## 2 Function -0.1398865 0.04195767 -3.333991 125.0844 0.001126680 0.8694569
## lower.EXP upper.EXP
## 1 1.3501560 8.651769
## 2 0.8001745 0.944738
pool_lr$multiparm_final
## $`Step 6`
## p-values D3 F-statistic
## Function 0.0004761907 12.29523
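The removal sequence printed above follows the usual backward elimination logic: at each step the predictor with the largest p-value is removed, until all remaining p-values are at or below the criterion. The sketch below illustrates that loop on a single simulated dataset, using ordinary likelihood ratio tests from `drop1()` rather than the pooled D3 statistic. The data and the names `dat`, `kept`, and `p_crit` are hypothetical, and this is not the package's implementation.

```r
set.seed(2)

# Simulated data (assumption): only predictor a is related to the outcome
n   <- 300
dat <- data.frame(a = rnorm(n), b = rnorm(n), c = rnorm(n))
dat$y <- rbinom(n, 1, plogis(0.3 + 1.0 * dat$a))

p_crit <- 0.05
kept   <- c("a", "b", "c")

repeat {
  if (length(kept) == 0) break
  fit <- glm(reformulate(kept, response = "y"), family = binomial, data = dat)

  # Likelihood ratio test for dropping each remaining predictor
  lrt <- drop1(fit, test = "Chisq")
  p   <- lrt[kept, "Pr(>Chi)"]

  # Stop when every remaining predictor meets the criterion
  if (max(p) <= p_crit) break

  removed <- kept[which.max(p)]
  message("Removed: ", removed)
  kept <- setdiff(kept, removed)
}

kept
```

In the multiply imputed setting the single-dataset likelihood ratio test is replaced by a pooled test, such as D3, computed over all imputed datasets at every step.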