Compute nonparametric estimates of the chosen measure of predictiveness.

est_predictiveness(
  fitted_values,
  y,
  a = NULL,
  full_y = NULL,
  type = "r_squared",
  C = rep(1, length(y)),
  Z = NULL,
  ipc_weights = rep(1, length(C)),
  ipc_fit_type = "external",
  ipc_eif_preds = rep(1, length(C)),
  ipc_est_type = "aipw",
  scale = "identity",
  na.rm = FALSE,
  nuisance_estimators = NULL,
  ...
)

Arguments

fitted_values

fitted values from a regression function using the observed data.

y

the observed outcome.

a

the observed treatment assignment (may be within a specified fold, for cross-fitted estimates). Only used if type = "average_value".

full_y

the observed outcome (from the entire dataset, for cross-fitted estimates).

type

the parameter to estimate (defaults to "r_squared", for R-squared-based variable importance).
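To make the default measure concrete, the following is a minimal base-R sketch of an R-squared-type predictiveness: one minus the mean squared error divided by the outcome variance. The helper r_squared_sketch is hypothetical and is not part of the package; using full_y for the variance term is an assumption about how the full-data outcome enters in cross-fitted estimation.

```r
# Hypothetical helper (not from the package): R-squared predictiveness,
# computed as 1 - MSE / (variance of the outcome).
r_squared_sketch <- function(fitted_values, y, full_y = NULL) {
  # Assumption: when full_y is supplied (cross-fitting), the outcome
  # variance is taken over the entire dataset; otherwise over y itself.
  ref_y <- if (is.null(full_y)) y else full_y
  mse <- mean((y - fitted_values)^2)
  outcome_var <- mean((ref_y - mean(ref_y))^2)
  1 - mse / outcome_var
}

# Simulated data for illustration only
set.seed(1234)
x <- rnorm(200)
y <- 1 + 2 * x + rnorm(200)
fit <- lm(y ~ x)

r2 <- r_squared_sketch(fitted(fit), y)
r2
```

For an ordinary least squares fit with an intercept, evaluated on its own training data, this quantity coincides with the usual R-squared reported by summary(fit).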

C

the indicator of coarsening (1 denotes observed, 0 denotes unobserved).

Z

either NULL (if no coarsening) or a matrix-like object containing the fully observed data.

ipc_weights

inverse probability of coarsening (IPC) weights for weighted estimation (e.g., inverse sampling weights from a two-phase sample). Assumed to be already inverted (i.e., ipc_weights = 1 / [estimated probability weights]).
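As a rough illustration of "already inverted", the sketch below builds weights from a simulated two-phase design in base R. The sampling model and the object names samp_prob and ipw_mean are assumptions made for this example, and the sampling probabilities are treated as known rather than estimated.

```r
# Simulated two-phase sample for illustration only
set.seed(4747)
n <- 1000
y <- rnorm(n, mean = 2)

# Hypothetical phase-two sampling probabilities that depend on y
samp_prob <- plogis(0.5 * y)

# Coarsening indicator: 1 denotes observed, 0 denotes unobserved
C <- rbinom(n, size = 1, prob = samp_prob)

# Weights are passed already inverted: 1 / probability of being observed
ipc_weights <- 1 / samp_prob

# A classical IPW estimate of the mean of y that uses only rows with C == 1
ipw_mean <- mean(C * ipc_weights * y)
c(ipw = ipw_mean, full_data = mean(y))
```

Passing the raw probabilities (samp_prob) instead of their inverses would down-weight exactly the observations that should be up-weighted.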

ipc_fit_type

if "external", then use ipc_eif_preds; if "SL", fit a SuperLearner to determine the correction to the efficient influence function.

ipc_eif_preds

if ipc_fit_type = "external", the fitted values from a regression of the full-data EIF on the fully observed covariates/outcome; otherwise, not used.

ipc_est_type

IPC correction, either "ipw" (for classical inverse probability weighting) or "aipw" (for augmented inverse probability weighting; the default).

scale

if doing an IPC correction, then the scale that the correction should be computed on (e.g., "identity"; or "logit" to logit-transform, apply the correction, and back-transform).
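The effect of the scale can be sketched in base R with made-up numbers. The values of point_est and correction below are hypothetical, and rescaling the correction by the derivative of the logit (a first-order, delta-method step) is an assumption about the details rather than a statement of the package's exact computation.

```r
# Hypothetical numbers for illustration only
point_est <- 0.92    # an uncorrected estimate that lives in (0, 1)
correction <- 0.15   # a hypothetical additive correction

# Identity scale: add the correction directly; the result can leave (0, 1)
identity_est <- point_est + correction

# Logit scale: logit-transform, apply the correction, and back-transform.
# Assumption: the correction is rescaled by the derivative of the logit,
# 1 / (p * (1 - p)), before being added on the transformed scale.
logit_deriv <- 1 / (point_est * (1 - point_est))
logit_est <- plogis(qlogis(point_est) + logit_deriv * correction)

c(identity = identity_est, logit = logit_est)
```

With these inputs the identity-scale result exceeds 1, while the logit-scale result stays inside the unit interval, which is the usual reason to prefer "logit" for measures bounded between 0 and 1.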

na.rm

logical; should NAs be removed in computation? (defaults to FALSE)

nuisance_estimators

(only used if type = "average_value") a list of nuisance function estimators on the observed data (may be within a specified fold, for cross-fitted estimates). Specifically: an estimator of the optimal treatment rule; an estimator of the propensity score under the estimated optimal treatment rule; and an estimator of the outcome regression when treatment is assigned according to the estimated optimal rule.

...

other arguments to SuperLearner, if ipc_fit_type = "SL".

Value

A list, with: the estimated predictiveness; the estimated efficient influence function; and the predictions of the EIF based on inverse probability of coarsening.
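A minimal usage sketch with fully observed data follows, using only arguments from the signature above and leaving the rest at their defaults. The simulated data and the linear model are assumptions for illustration. The call is guarded so the snippet still runs when the vimp package is not installed, and the result is inspected with str() rather than by assuming element names.

```r
# Simulated data for illustration only
set.seed(5678)
n <- 500
x <- rnorm(n)
y <- 1 + 2 * x + rnorm(n)

# Fitted values from a simple regression on the observed data
fitted_values <- as.numeric(fitted(lm(y ~ x)))

if (requireNamespace("vimp", quietly = TRUE)) {
  est <- vimp::est_predictiveness(
    fitted_values = fitted_values,
    y = y,
    type = "r_squared"
  )
  # Inspect the structure of the returned list
  str(est)
}
```

Because C and ipc_weights keep their defaults (all ones), no coarsening correction should be involved in this call.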

Details

See the paper by Williamson, Gilbert, Simon, and Carone for more details on the mathematics behind this function and the definition of the parameter of interest.