R/get_base_set.R
get_base_set.Rd
Using the estimated intrinsic importance and a base method designed to control the family-wise error rate (e.g., Holm), obtain an initial selected set.
the test statistics (used with "maxT")
the p-values (used with "minP" or "Holm")
the alpha level
the method (one of "maxT", "minP", or "Holm")
the number of resamples (for minP or maxT)
the estimated covariance matrix for the test statistics
the false discovery rate (for method = "BY")
the initial selected set, a list of the following:
decision, a numeric vector with 1 indicating that the variable was selected and 0 otherwise
p_values, the p-values used to make the decision
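To make the shape of the return value concrete, here is a minimal sketch of a Holm-based selection built on base R's `stats::p.adjust`. The helper `holm_base_set_sketch` is hypothetical and is not the package's implementation; in particular, returning the Holm-adjusted p-values in `p_values` is an assumption made for illustration, and the input p-values are made up.

```r
# Hypothetical helper (not part of the package): Holm-based initial selection.
holm_base_set_sketch <- function(p_values, alpha = 0.05) {
  # Holm step-down adjustment, which controls the family-wise error rate
  adj_p <- stats::p.adjust(p_values, method = "holm")
  # 1 = selected (adjusted p-value at or below alpha), 0 = not selected
  decision <- as.numeric(adj_p <= alpha)
  # Assumption: the adjusted p-values are what gets reported back
  list(decision = decision, p_values = adj_p)
}

# Made-up p-values, for illustration only
p <- c(0.001, 0.04, 0.03, 0.5)
res <- holm_base_set_sketch(p, alpha = 0.05)
res$decision
res$p_values
```

With these inputs only the first variable survives the adjustment, because Holm multiplies the smallest p-value by the number of tests and enforces monotonicity on the rest.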
# \donttest{
data("biomarkers")
# subset to complete cases for illustration
cc <- complete.cases(biomarkers)
dat_cc <- biomarkers[cc, ]
# use only the mucinous outcome, not the high-malignancy outcome
y <- dat_cc$mucinous
x <- dat_cc[, !(names(dat_cc) %in% c("mucinous", "high_malignancy"))]
feature_nms <- names(x)
# estimate SPVIMs (using simple library and V = 2 for illustration only)
set.seed(20231129)
library("SuperLearner")
est <- vimp::sp_vim(Y = y, X = x, V = 2, type = "auc", SL.library = "SL.glm",
                    cvControl = list(V = 2))
#> Warning: prediction from rank-deficient fit; attr(*, "non-estim") has doubtful cases
#> Warning: prediction from rank-deficient fit; attr(*, "non-estim") has doubtful cases
#> Warning: prediction from rank-deficient fit; attr(*, "non-estim") has doubtful cases
#> Warning: prediction from rank-deficient fit; attr(*, "non-estim") has doubtful cases
#> Warning: One or more original estimates < 0; returning zero for these indices.
# get base set
base_set <- get_base_set(test_statistics = est$test_statistic, p_values = est$p_value,
                         alpha = 0.2, method = "Holm")
base_set$decision
#> [1] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
# }
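The example above stores the feature names in `feature_nms` but only prints the 0/1 decision vector. A short sketch of how the two could be combined to list the selected variables by name follows; the `_toy` objects are made-up stand-ins for `feature_nms` and `base_set$decision`, and the sketch assumes the decisions are in the same order as the columns of `x`.

```r
# Toy stand-ins for feature_nms and base_set$decision (illustration only)
feature_nms_toy <- c("x1", "x2", "x3")
decision_toy <- c(0, 1, 0)

# Keep the names whose decision is 1 (i.e., selected into the base set)
selected <- feature_nms_toy[decision_toy == 1]
selected
```

In the example run above, every decision is 0, so the analogous subset of `feature_nms` would be empty.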