R/extract_importance_SL_learner.R
extract_importance_SL_learner.Rd
Extract the individual-algorithm extrinsic importance from one fitted algorithm within the Super Learner, along with the importance rank.
extract_importance_SL_learner(fit = NULL, coef = 0, feature_names = "", ...)
fit: the specific learner (e.g., from the Super Learner's fitLibrary list).
coef: the Super Learner coefficient associated with the learner.
feature_names: the feature names.
...: other arguments to pass to algorithm-specific importance extractors.
a tibble, with columns algorithm (the fitted algorithm), feature (the feature), importance (the algorithm-specific extrinsic importance of the feature), rank (the feature importance rank, with 1 indicating the most important feature), and weight (the algorithm's weight in the Super Learner).
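To make the shape of the returned object concrete, the following is a minimal base R sketch of a table with the same five columns. The feature names, importance values, and weight are toy values invented for illustration, and ranking by decreasing importance with `rank()` is an assumption about how the rank column relates to the importance column; the package's internal extractors may compute it differently and return a tibble rather than a data frame.

```r
# Toy importance scores for three hypothetical features (not real package output)
importance <- c(feat_a = 0.4, feat_b = 2.1, feat_c = 1.3)

# Assumption: rank 1 goes to the largest importance value
ranks <- rank(-importance, ties.method = "first")

out <- data.frame(
  algorithm = "glm",
  feature = names(importance),
  importance = unname(importance),
  rank = as.integer(ranks),
  weight = 0.5,  # hypothetical Super Learner coefficient for this learner
  stringsAsFactors = FALSE
)

# Order rows from most to least important, as in the printed output
out <- out[order(out$rank), ]
rownames(out) <- NULL
out
```

Here `feat_b` has the largest toy importance, so it receives rank 1 and appears first.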
data("biomarkers")
# subset to complete cases for illustration
cc <- complete.cases(biomarkers)
dat_cc <- biomarkers[cc, ]
# use only the mucinous outcome, not the high-malignancy outcome
y <- dat_cc$mucinous
x <- dat_cc[, !(names(dat_cc) %in% c("mucinous", "high_malignancy"))]
feature_nms <- names(x)
# get the fit (using a simple library and 2 folds for illustration only)
library("SuperLearner")
set.seed(20231129)
fit <- SuperLearner::SuperLearner(Y = y, X = x, SL.library = c("SL.glm", "SL.mean"),
                                  cvControl = list(V = 2))
#> Warning: prediction from rank-deficient fit; attr(*, "non-estim") has doubtful cases
#> Warning: prediction from rank-deficient fit; attr(*, "non-estim") has doubtful cases
# extract importance
importance <- extract_importance_SL_learner(fit = fit$fitLibrary[[1]]$object,
                                            feature_names = feature_nms, coef = fit$coef[1])
importance
#> # A tibble: 22 × 5
#>    algorithm feature                         importance  rank weight
#>    <chr>     <chr>                                <dbl> <int>  <dbl>
#>  1 glm       lab2_fluorescence_mucinous_call      2.09      1      0
#>  2 glm       lab2_fluorescence_score              1.85      2      0
#>  3 glm       lab1_telomerase_neoplasia_call       1.42      3      0
#>  4 glm       lab1_actb                            1.19      4      0
#>  5 glm       lab4_glucose_score                   1.19      5      0
#>  6 glm       institution                          1.04      6      0
#>  7 glm       lab1_telomerase_score                0.972     7      0
#>  8 glm       lab3_muc3ac_score                    0.953     8      0
#>  9 glm       lab5_neoplasia_v1_call               0.879     9      0
#> 10 glm       lab1_molecules_neoplasia_call        0.856    10      0
#> # ℹ 12 more rows
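Because each call returns both a rank and the learner's Super Learner weight, results from several learners can be stacked and summarized. The sketch below shows one possible summary, a weight-weighted mean rank per feature, in base R. The two per-learner tables, the learner names, and all values are hypothetical, and this is not necessarily how the package itself aggregates importance across learners.

```r
# Hypothetical per-learner results (toy values, not package output)
imp_glm <- data.frame(algorithm = "glm", feature = c("x1", "x2", "x3"),
                      rank = c(1L, 2L, 3L), weight = 0.25)
imp_rf <- data.frame(algorithm = "ranger", feature = c("x1", "x2", "x3"),
                     rank = c(3L, 1L, 2L), weight = 0.75)
all_imp <- rbind(imp_glm, imp_rf)

# Weighted mean rank per feature: sum(rank * weight) / sum(weight)
num <- tapply(all_imp$rank * all_imp$weight, all_imp$feature, sum)
den <- tapply(all_imp$weight, all_imp$feature, sum)
weighted_rank <- setNames(as.numeric(num / den), names(num))

# Smaller values indicate features ranked as more important overall
sort(weighted_rank)
```

A learner with weight 0, such as the glm fit in the example above, would contribute nothing to a summary of this kind.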