Inference for Model-Agnostic Variable Importance


I discuss a framework for inference on general model-agnostic variable importance measures.


Traditional statistical inference and machine learning often appear to be at odds. While simple population models are useful in many contexts, there is increasing interest in using machine learning to uncover complex relationships. This dichotomy has been particularly stark when the scientific quantity of interest is the importance of variables in predicting the response. In this talk, I will discuss a model-agnostic notion of variable importance and general conditions under which valid inference on the true importance can be obtained, even when machine learning-based techniques are used as part of estimation. I will tie this example to general ideas from the literature on causal inference and targeted learning that provide tools for incorporating machine learning into inference on general target parameters.
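
As a rough illustration of the kind of estimate and interval at issue, the sketch below measures the importance of one variable as the increase in held-out mean squared error when that variable is removed from the predictor. This choice of importance measure is an assumption, since the abstract does not fix one. The function names `fit_ols` and `importance_with_ci` and the simulated data are hypothetical. Ordinary least squares stands in for whatever flexible learner one would actually use. The interval is a simple Wald interval built from the per-observation differences in squared error on a held-out half of the data. It is not presented as the procedure from the talk, and its normal approximation can be unreliable when the true importance is zero.

```python
import math
import random
import statistics


def fit_ols(rows, ys):
    """Least-squares fit with intercept; returns a prediction function.

    Hypothetical stand-in for any learner (e.g. a flexible ML method).
    """
    design = [[1.0] + list(r) for r in rows]
    p = len(design[0])
    # Augmented normal equations [X'X | X'y].
    a = [[sum(d[i] * d[j] for d in design) for j in range(p)]
         + [sum(d[i] * y for d, y in zip(design, ys))] for i in range(p)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(p):
            if r != col:
                factor = a[r][col] / a[col][col]
                a[r] = [v - factor * w for v, w in zip(a[r], a[col])]
    beta = [a[i][p] / a[i][i] for i in range(p)]
    return lambda r: beta[0] + sum(b * v for b, v in zip(beta[1:], r))


def importance_with_ci(rows, ys, drop, level=0.95, seed=0):
    """Importance of feature `drop`: held-out MSE without it minus MSE with it."""
    idx = list(range(len(ys)))
    random.Random(seed).shuffle(idx)
    half = len(idx) // 2
    train, test = idx[:half], idx[half:]  # fit on one half, evaluate on the other

    def reduce(r):
        return [v for j, v in enumerate(r) if j != drop]

    y_train = [ys[i] for i in train]
    full = fit_ols([rows[i] for i in train], y_train)
    reduced = fit_ols([reduce(rows[i]) for i in train], y_train)

    # Per-observation difference in squared error on the held-out half.
    diffs = [(ys[i] - reduced(reduce(rows[i]))) ** 2 - (ys[i] - full(rows[i])) ** 2
             for i in test]
    est = statistics.mean(diffs)
    se = statistics.stdev(diffs) / math.sqrt(len(diffs))
    z = statistics.NormalDist().inv_cdf(0.5 + level / 2)
    return est, (est - z * se, est + z * se)


# Simulated data (assumption): only feature 0 affects the response.
rng = random.Random(1)
rows = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(2000)]
ys = [2.0 * r[0] + rng.gauss(0, 1) for r in rows]

est, (lo, hi) = importance_with_ci(rows, ys, drop=0)
print(f"importance of feature 0: {est:.2f}, 95% CI ({lo:.2f}, {hi:.2f})")
```

In this simulation the population importance of feature 0 under the assumed measure is 4, the variance of the signal it contributes, so the estimate and interval can be checked against a known target. Replacing `fit_ols` with a flexible learner leaves the rest of the sketch unchanged, which is the sense in which the measure is model-agnostic.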