Run a Super Learner for the provided subset of features

run_sl(
  Y = NULL,
  X = NULL,
  V = 5,
  SL.library = "SL.glm",
  univariate_SL.library = NULL,
  s = 1,
  cv_folds = NULL,
  sample_splitting = TRUE,
  ss_folds = NULL,
  split = 1,
  verbose = FALSE,
  progress_bar = NULL,
  indx = 1,
  weights = rep(1, nrow(X)),
  cross_fitted_se = TRUE,
  full = NULL,
  vector = TRUE,
  ...
)

Arguments

Y

the outcome

X

the covariates

V

the number of cross-fitting folds; defaults to 5

SL.library

the library of candidate learners

univariate_SL.library

the library of candidate learners for single-covariate regressions

s

the subset of features of interest, given as column indices of X; defaults to 1

cv_folds

the cross-validation folds

sample_splitting

logical; should we use sample-splitting for predictiveness estimation?

ss_folds

the sample-splitting folds; only used if sample_splitting = TRUE

split

the split to use for sample-splitting; only used if sample_splitting = TRUE

verbose

should we print progress? defaults to FALSE

progress_bar

the progress bar to print to (only if verbose = TRUE)

indx

the index to pass to progress bar (only if verbose = TRUE)

weights

weights to pass to the estimation procedure; defaults to a weight of 1 for every row of X

cross_fitted_se

if TRUE, uses a cross-fitted estimator of the standard error; if FALSE, estimates the standard error using the entire dataset

full

should this be considered a "full" or "reduced" regression? If NULL (the default), this is determined automatically: a regression is full when s equals the full covariate vector, and reduced otherwise. For SPVIMs, this can be set manually.

vector

should we return a vector (TRUE) or a list (FALSE)?

...

other arguments to pass to the Super Learner
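
The fold and subset arguments above can be illustrated with a short base R sketch. It assumes that cv_folds holds one fold label per observation, that ss_folds assigns each cross-fitting fold to one of two splits, and that s indexes columns of X; none of these layouts is confirmed by this page, so treat the names and shapes below as illustrative only.

```r
set.seed(4747)
n <- 100
X <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))

V <- 5
# cross-fitting folds (assumed layout): one label in 1..V per observation,
# balanced so that each fold holds n / V observations
cv_folds <- sample(rep(seq_len(V), length.out = n))

# sample-splitting folds (assumed layout): one label in 1..2 per
# cross-fitting fold, saying which split that fold belongs to
ss_folds <- sample(rep(1:2, length.out = V))

# the subset of interest: here, the first and third covariates
s <- c(1, 3)
X_s <- X[, s, drop = FALSE]

# a regression is "full" when s covers every column of X
full <- setequal(s, seq_len(ncol(X)))
```

Here X_s keeps only the columns x1 and x3, and full is FALSE because x2 is left out of s.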

Value

a list of length V, with the results of predicting on the hold-out data for each v in 1 through V
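
To make "predicting on the hold-out data for each v" concrete, here is a minimal sketch of cross-fitted prediction. It is not the implementation of run_sl: it swaps in a plain glm fit for the Super Learner, and the simulated data and the names dat and preds are hypothetical.

```r
set.seed(1234)
n <- 200
X <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
Y <- 1 + 2 * X$x1 - X$x2 + rnorm(n)

V <- 5
s <- 1  # regress on the first covariate only
cv_folds <- sample(rep(seq_len(V), length.out = n))
dat <- data.frame(Y = Y, X[, s, drop = FALSE])

# for each fold v: fit on the other V - 1 folds, then predict on fold v
preds <- lapply(seq_len(V), function(v) {
  train <- dat[cv_folds != v, , drop = FALSE]
  test <- dat[cv_folds == v, , drop = FALSE]
  fit <- glm(Y ~ ., data = train, family = gaussian())
  unname(predict(fit, newdata = test, type = "response"))
})

length(preds)   # one element per fold
lengths(preds)  # number of hold-out predictions in each fold
```

Each element of preds holds predictions only for observations that were excluded from the corresponding fit, which is what makes the predictions usable for cross-fitted estimation.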