Efficient Nonparametric Statistical Inference on Population Feature Importance using Shapley Values

Date:

We discuss our paper to be published in the Proceedings of the Thirty-seventh International Conference on Machine Learning.

The true population-level importance of a variable in a prediction task provides useful knowledge about the underlying data-generating mechanism and can help in deciding which measurements to collect in subsequent experiments. Valid statistical inference on this importance is a key component in understanding the population of interest. We present a computationally efficient procedure for estimating the Shapley Population Variable Importance Measure (SPVIM) and for obtaining valid statistical inference on it.
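
As a rough illustration of the Shapley-value idea behind a measure like SPVIM, the sketch below computes exact Shapley values for a handful of features from a predictiveness function defined on feature subsets. The names `shapley_importance` and `toy_value`, and the numbers inside `toy_value`, are hypothetical and chosen only for illustration. This is not the estimator from the paper: it enumerates every subset, a cost that grows exponentially in the number of features and is the kind of burden a computationally efficient procedure has to avoid, and it performs no statistical inference.

```python
from itertools import combinations
from math import comb


def shapley_importance(p, value):
    """Exact Shapley values for features 0..p-1.

    `value` maps a frozenset of feature indices to a predictiveness
    score (for example, an R^2-like quantity). Hypothetical helper.
    """
    phi = []
    for j in range(p):
        others = [k for k in range(p) if k != j]
        total = 0.0
        for size in range(p):
            # Shapley weight for subsets of this size that exclude j.
            weight = 1.0 / (p * comb(p - 1, size))
            for subset in combinations(others, size):
                s = frozenset(subset)
                # Gain in predictiveness from adding feature j to s.
                total += weight * (value(s | {j}) - value(s))
        phi.append(total)
    return phi


def toy_value(s):
    """Made-up predictiveness function, for illustration only."""
    gains = {0: 0.5, 1: 0.3, 2: 0.0}   # assumed individual contributions
    v = sum(gains[k] for k in s)
    if 0 in s and 1 in s:
        v += 0.1                        # assumed interaction between 0 and 1
    return v


if __name__ == "__main__":
    for j, phi_j in enumerate(shapley_importance(3, toy_value)):
        print(f"feature {j}: {phi_j:.3f}")
```

In this toy setup the interaction gain is split evenly between features 0 and 1, feature 2 receives zero importance, and the values sum to the predictiveness of the full feature set minus that of the empty set.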