A wrapper function for Super Learner-based extrinsic variable selection within
stability selection, using the stabs package.
SL_stabs_fitfun(x, y, q, ...)
x: the features.
y: the outcome of interest.
q: the number of features to select on average.
...: other arguments to pass to SuperLearner.
a named list, with elements: selected (a logical vector indicating whether
or not each variable was selected); and path (a logical matrix indicating
which variable was selected at each step).
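The return shape described above can be illustrated with a small base-R sketch. It is not the implementation of SL_stabs_fitfun: `toy_fitfun` is a hypothetical stand-in that ranks features by absolute correlation with the outcome (an assumption made purely for illustration) and returns a list with the same two elements, `selected` and `path`.

```r
# Hypothetical stand-in (NOT SL_stabs_fitfun): same (x, y, q, ...) signature,
# same list(selected, path) return shape, but a much simpler ranking rule.
toy_fitfun <- function(x, y, q, ...) {
  x <- as.matrix(x)
  # importance score per feature: absolute correlation with the outcome
  scores <- abs(stats::cor(x, y))[, 1]
  # feature indices ordered from most to least important
  ord <- order(scores, decreasing = TRUE)
  # selected: named logical vector flagging the q top-ranked features
  selected <- rep(FALSE, ncol(x))
  names(selected) <- colnames(x)
  selected[ord[seq_len(q)]] <- TRUE
  # path: logical matrix; column k flags the k top-ranked features
  path <- matrix(FALSE, nrow = ncol(x), ncol = q,
                 dimnames = list(colnames(x), NULL))
  for (k in seq_len(q)) path[ord[seq_len(k)], k] <- TRUE
  list(selected = selected, path = path)
}

# simulated data, for illustration only
set.seed(1)
x <- matrix(rnorm(200), nrow = 50, dimnames = list(NULL, paste0("v", 1:4)))
y <- x[, 1] + 0.5 * x[, 2] + rnorm(50, sd = 0.1)
fit <- toy_fitfun(x, y, q = 2)
fit$selected
fit$path
```

Here `selected` has one entry per column of `x`, and `path` has one row per feature and one column per selection step, which is the structure the value description refers to.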
See stabs::stabsel for general usage of stability selection.
# \donttest{
# subset to complete cases for illustration
cc <- complete.cases(biomarkers)
dat_cc <- biomarkers[cc, ]
# use only the mucinous outcome, not the high-malignancy outcome
y <- dat_cc$mucinous
x <- dat_cc[, !(names(dat_cc) %in% c("mucinous", "high_malignancy"))]
feature_nms <- names(x)
# use stability selection with SL (using small number of folds for CV,
# small SL library and small number of bootstrap replicates for illustration only)
sl_stabs <- stabs::stabsel(x = x, y = y,
                           fitfun = SL_stabs_fitfun,
                           args.fitfun = list(SL.library = "SL.glm", cvControl = list(V = 2)),
                           q = 2, B = 5, PFER = 5)
# print the stability selection summary
sl_stabs
#> Stability Selection with unimodality assumption
#> Selected variables:
#> institution   lab1_actb
#>           1           2
#> Selection probabilities:
#>            lab1_molecules_score           lab1_telomerase_score
#>                               0                               0
#>         lab2_fluorescence_score               lab3_muc3ac_score
#>                               0                               0
#>               lab3_muc5ac_score                 lab4_areg_score
#>                               0                               0
#>              lab4_glucose_score              lab5_mucinous_call
#>                               0                               0
#>          lab5_neoplasia_v1_call          lab5_neoplasia_v2_call
#>                               0                               0
#>                   lab6_ab_score                             cea
#>                               0                               0
#>   lab1_molecules_neoplasia_call  lab1_telomerase_neoplasia_call
#>                               0                               0
#> lab2_fluorescence_mucinous_call         lab4_areg_mucinous_call
#>                               0                               0
#>      lab4_glucose_mucinous_call     lab4_combined_mucinous_call
#>                               0                               0
#>          lab6_ab_neoplasia_call                        cea_call
#>                               0                               0
#>                     institution                       lab1_actb
#>                               1                               1
#> ---
#> Cutoff: 0.7; q: 2; PFER (*): 0.303
#> (*) or expected number of low selection probability variables
#> PFER (specified upper bound): 5
#> PFER corresponds to signif. level 0.0138 (without multiplicity adjustment)
# }
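The last lines of the printed summary can be cross-checked by hand. The reported significance level appears to equal the reported PFER divided by the number of candidate features; this reading is inferred from the printed numbers rather than taken from the stabs source, so treat it as an assumption. The variable names below are illustrative only.

```r
# Values copied from the printed summary above
pfer_reported <- 0.303  # the "PFER (*)" value
p <- 22                 # number of features listed under "Selection probabilities"

# assumed relationship: per-variable error rate = PFER / number of features
signif_level <- round(pfer_reported / p, 4)
signif_level
#> [1] 0.0138
```

Under this reading, the specified upper bound (PFER = 5) is the tolerance requested by the user, while 0.303 is the bound actually achieved at cutoff 0.7 with q = 2.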