NEWS.md
* Internal estimation of predictiveness measures now uses an S3 class, which makes internal code cleaner and facilitates simpler addition of new predictiveness measures.
* The output of `extract_sampled_split_predictions` is now a vector, not a list. This facilitates proper use in the new version of the package.
* Clarified the specification of `Z` in coarsened-data settings; allow case-insensitive specification of covariate names/positions when creating `Z` (see the sketch below).
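For illustration, a minimal sketch of the case-insensitive `Z` specification (hypothetical data; the coarsened-data arguments `C`, `Z`, and `ipc_weights` are used as documented in `?vim`):

```r
library(vimp)
library(SuperLearner)
set.seed(1234)
n <- 500
x <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
y <- rbinom(n, 1, stats::plogis(0.5 * x$x1 - 0.25 * x$x2))
cc <- rbinom(n, 1, 0.8)   # coarsening indicator: 1 = fully observed
x$x1[cc == 0] <- NA       # x1 is measured only in the observed subset
# "y" (lower-case) is now accepted where "Y" was previously required
est <- vim(Y = y, X = x, indx = 1, type = "auc",
           run_regression = TRUE, SL.library = c("SL.glm", "SL.mean"),
           C = cc, Z = c("y", "2"), ipc_weights = rep(1 / 0.8, n))
```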
* `V` defaults to 5 if no cross-fitting folds are specified externally.
* Updated the handling of `cross_fitted_f1` and `cross_fitted_f2` in `cv_vim` (see the sketch below).
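A sketch of supplying pre-computed cross-fitted estimates via `cross_fitted_f1` and `cross_fitted_f2` (hypothetical data; the per-fold list format and the `cross_fitting_folds` argument are our assumptions here, so check `?cv_vim` for the expected inputs):

```r
library(vimp)
set.seed(4747)
n <- 1000
x <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
y <- x$x1 + 0.5 * x$x2 + rnorm(n)
dat <- data.frame(x, y = y)
V <- 5  # the new default when no cross-fitting folds are given externally
folds <- sample(rep(seq_len(V), length.out = n))
fhat_full <- fhat_redu <- vector("list", V)
for (v in seq_len(V)) {
  # train on the out-of-fold data, predict on the held-out fold
  fit_full <- lm(y ~ x1 + x2, data = dat[folds != v, ])
  fit_redu <- lm(y ~ x2, data = dat[folds != v, ])
  fhat_full[[v]] <- predict(fit_full, newdata = dat[folds == v, ])
  fhat_redu[[v]] <- predict(fit_redu, newdata = dat[folds == v, ])
}
est <- cv_vim(Y = y, X = x, indx = 1, V = V, type = "r_squared",
              run_regression = FALSE,
              cross_fitted_f1 = fhat_full, cross_fitted_f2 = fhat_redu,
              cross_fitting_folds = folds, sample_splitting = FALSE)
```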
* Added argument `cross_fitted_se` to `cv_vim` and `sp_vim`; this logical option allows the standard error to be estimated using cross-fitting. This can improve performance in cases where flexible algorithms are used to estimate the full and reduced regressions.
* Added a new option to `vim` and `cv_vim`; currently, this option is only available for non-sample-split calls (i.e., with `sample_splitting = FALSE`).
* The variable importance point estimates returned by `vim` are based on the entire dataset, while the full and reduced predictiveness (`predictiveness_full` and `predictiveness_reduced`, along with the corresponding confidence intervals) is evaluated using separate portions of the data for the full and reduced regressions.
* Added argument `sample_splitting` to `vim`, `cv_vim` and `sp_vim`; if `FALSE`, sample-splitting is not used to estimate predictiveness. Note that we recommend using the default, `TRUE`, in all cases, since inference using `sample_splitting = FALSE` will be invalid for variables with truly null variable importance.
* Changed how sample-splitting and cross-fitting are combined when `sample_splitting = TRUE` to match more closely with theoretical results (and improve power!). In this case, we first split the data into 2K cross-fitting folds, and split these folds equally into two sample-splitting folds. For the nuisance regression using all covariates, for each k = 1, ..., K we set aside the data in sample-splitting fold 1 and cross-fitting fold k [this comprises 1/(2K) of the data]. We train using the remaining observations [comprising (2K - 1)/(2K) of the data] not in this testing fold, and we test on the originally withheld data. We repeat for the nuisance regression using the reduced set of covariates, but withhold data in sample-splitting fold 2. This update affects both `cv_vim` and `sp_vim`. If `sample_splitting = FALSE`, then we use standard cross-fitting (see the sketch below).
* Attempt to infer the `family` if it isn't specified; use `stats::binomial()` if there are only two unique outcome values, otherwise use `stats::gaussian()`.
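To make the fold arithmetic concrete, a small base-R illustration of the scheme described above (these folds are constructed internally by `cv_vim` and `sp_vim`; the explicit construction here is for exposition only):

```r
set.seed(20210301)
n <- 1000
K <- 5
cf_fold <- sample(rep(seq_len(2 * K), length.out = n))  # 2K cross-fitting folds
ss_fold <- ifelse(cf_fold <= K, 1, 2)  # folds split equally into two sample-splitting folds
k <- 1
# full-covariate regression, k-th fit: withhold sample-splitting fold 1, cross-fitting fold k
test_full <- (ss_fold == 1) & (cf_fold == k)
mean(test_full)   # roughly 1 / (2K) of the data is withheld for testing
mean(!test_full)  # the remaining (2K - 1) / (2K) is used for training
# reduced-covariate regression: same idea, but withhold sample-splitting fold 2
test_redu <- (ss_fold == 2) & (cf_fold == K + k)
# new family inference: binomial for binary outcomes, gaussian otherwise
y <- rbinom(n, 1, 0.5)
fam <- if (length(unique(y)) == 2) stats::binomial() else stats::gaussian()
```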
* Compute the cross-validated AUC using the `cvAUC` package (this adds a dependency on `cvAUC`).
* Added argument `ipc_est_type` (available in `vim`, `cv_vim`, and `sp_vim`; also in the corresponding wrapper functions for each VIM and the corresponding internal estimation functions); see the sketch below.
* Changed unit tests in `testthat/` to use `glm` rather than `xgboost` (increases speed).
* Changed the vignettes to use `glm` rather than `xgboost` or `ranger` (increases speed, even though the regression is now misspecified for the truth).
* Removed the dependency on `forcats` from the vignette.
* Renamed internal estimation functions to `measure_accuracy` and `measure_auc` for project-wide consistency.
* Changed unit tests in `testthat/` to not explicitly load `xgboost`.
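A sketch of the new `ipc_est_type` argument (hypothetical coarsened data; treating `"aipw"` as the default and `"ipw"` as the alternative is our reading of the options, so check `?vim`):

```r
library(vimp)
library(SuperLearner)
set.seed(1011)
n <- 500
x <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
y <- rbinom(n, 1, stats::plogis(x$x1))
cc <- rbinom(n, 1, 0.9)   # coarsening indicator: 1 = fully observed
x$x1[cc == 0] <- NA
est <- vim(Y = y, X = x, indx = 1, type = "accuracy",
           run_regression = TRUE, SL.library = c("SL.glm", "SL.mean"),
           C = cc, Z = c("Y", "2"), ipc_weights = rep(1 / 0.9, n),
           ipc_est_type = "aipw")
```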
* Use `stats::qlogis` and `stats::plogis` rather than bespoke functions.
* Updated the main vignette, "Introduction to `vimp`"; the examples now use `run_regression = TRUE` for simplicity, so that the user specifies a library of candidate learners and `vimp` will handle the rest.
* Added argument `verbose` to `sp_vim`; if `TRUE`, messages are printed throughout fitting that display progress, and `verbose` is passed to `SuperLearner` (see the sketch below).
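A minimal sketch of `sp_vim` with the new `verbose` option (hypothetical data):

```r
library(vimp)
library(SuperLearner)
set.seed(5678)
n <- 200
x <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
y <- x$x1 + 0.5 * x$x3 + rnorm(n)
# progress messages are printed during fitting, and verbose is also
# passed on to each internal SuperLearner call
est <- sp_vim(Y = y, X = x, V = 5, type = "r_squared",
              SL.library = c("SL.glm", "SL.mean"), verbose = TRUE)
```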
* Renamed `cv_predictiveness_point_est` and `predictiveness_point_est` to `est_predictiveness_cv` and `est_predictiveness`, respectively.
* Removed functions `cv_predictiveness_update`, `cv_vimp_point_est`, `cv_vimp_update`, `predictiveness_update`, `vimp_point_est`, and `vimp_update`; this functionality is now in `est_predictiveness_cv` and `est_predictiveness` (for the *update* functions) or directly in `vim` or `cv_vim` (for the *vimp* functions).
* Removed functions `predictiveness_se` and `predictiveness_ci` (functionality is now in `vimp_se` and `vimp_ci`, respectively).
* Renamed the `weights` argument to `ipc_weights`, clarifying that these weights are meant to be used as inverse probability of coarsening (e.g., censoring) weights.
* Added function `sp_vim` and helper functions `run_sl`, `sample_subsets`, `spvim_ics`, and `spvim_se`; these functions allow computation of the Shapley Population Variable Importance Measure (SPVIM).
* `cv_vim` and `vim` now use an outer layer of sample splitting for hypothesis testing.
* Added functions `vimp_auc`, `vimp_accuracy`, `vimp_deviance`, and `vimp_rsquared` (see the sketch below).
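A minimal sketch of the new measure-specific functions (hypothetical data):

```r
library(vimp)
library(SuperLearner)
set.seed(91011)
n <- 300
x <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
y <- rbinom(n, 1, stats::plogis(x$x1))
# each function estimates importance for the covariate(s) in indx
# using the named predictiveness measure
est_auc <- vimp_auc(Y = y, X = x, indx = 1, run_regression = TRUE,
                    SL.library = c("SL.glm", "SL.mean"))
est_acc <- vimp_accuracy(Y = y, X = x, indx = 1, run_regression = TRUE,
                         SL.library = c("SL.glm", "SL.mean"))
```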
* `vimp_regression` is now deprecated; use `vimp_anova` instead.
* Added new function `vim`; each variable importance function (e.g., `vimp_anova`) is now a wrapper function around `vim` with the `type` argument filled in (see the sketch below).
* `cv_vim_nodonsker` is now deprecated; use `cv_vim` instead.
* Allow the user to specify a `family` for the top-level `SuperLearner` if `run_regression = TRUE`; in all cases, the second-stage `SuperLearner` uses a gaussian family.
* If `SL.mean` is selected as the best-fitting algorithm, the second-stage regression is now run using the original outcome, rather than the first-stage fitted values.
* Added function `two_validation_set_cv`, which sets up folds for V-fold cross-validation with two validation sets per fold.
* Changed the functionality of `cv_vim`: now, the cross-validated naive estimator is computed on a first validation set, while the update for the corrected estimator is computed using the second validation set (both created from `two_validation_set_cv`); this allows for relaxation of the Donsker class conditions necessary for asymptotic convergence of the corrected estimator, while making sure that the initial CV naive estimator is not biased high (due to a higher R^2 on the training data).
* Changed the functionality of `cv_vim`: now, the cross-validated naive estimator is computed on the training data for each fold, while the update for the corrected cross-validated estimator is computed using the test data; this allows for relaxation of the Donsker class conditions necessary for asymptotic convergence of the corrected estimator.
* Removed function `vim`; replaced with individual-parameter functions.
* Renamed function `vimp_regression` to match the Python package.
* `cv_vim` can now compute regression estimators.
* Added functions `vimp_ci`, `vimp_se`, `vimp_update`, and `onestep_based_estimator`.
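A minimal sketch of the wrapper relationship described above (hypothetical data; `vimp_rsquared` from the entries above is used for concreteness):

```r
library(vimp)
library(SuperLearner)
set.seed(121314)
n <- 300
x <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
y <- x$x1 + rnorm(n)
# vimp_rsquared is vim with type = "r_squared" filled in
est_1 <- vim(Y = y, X = x, indx = 1, type = "r_squared",
             run_regression = TRUE, SL.library = c("SL.glm", "SL.mean"))
est_2 <- vimp_rsquared(Y = y, X = x, indx = 1, run_regression = TRUE,
                       SL.library = c("SL.glm", "SL.mean"))
```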