Efficient Nonparametric Statistical Inference on Population Feature Importance using Shapley Values
We discuss our paper to be published in the Proceedings of the Thirty-seventh International Conference on Machine Learning.
The true population-level importance of a variable in a prediction task provides useful knowledge about the underlying data-generating mechanism and can help in deciding which measurements to collect in subsequent experiments. Valid statistical inference on this importance is a key component in understanding the population of interest. We present a computationally efficient procedure for estimating and obtaining valid statistical inference on the Shapley Population Variable Importance Measure (SPVIM).
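To make the Shapley idea concrete, here is a minimal sketch of the classical Shapley value computed over a subset-valued "predictiveness" function. It rests on assumptions not stated above: `toy_value` is a hypothetical, hand-made score standing in for a population predictiveness measure (an R^2-like quantity, say), and `shapley_values` enumerates every subset exactly. That enumeration is only feasible for a handful of variables and is not the efficient estimation procedure the paper proposes; it only illustrates the quantity being targeted.

```python
from itertools import combinations
from math import comb


def shapley_values(p, value):
    """Exact Shapley values for p features, given value(frozenset) -> float.

    Illustrative only: the cost grows as 2^p, so this is not an efficient estimator.
    """
    features = range(p)
    phi = []
    for j in features:
        others = [k for k in features if k != j]
        total = 0.0
        for size in range(p):
            # Classical Shapley weight for subsets of this size that exclude j.
            weight = 1.0 / (p * comb(p - 1, size))
            for subset in combinations(others, size):
                s = frozenset(subset)
                # Marginal gain in predictiveness from adding feature j to s.
                total += weight * (value(s | {j}) - value(s))
        phi.append(total)
    return phi


def toy_value(s):
    """Hypothetical predictiveness of a feature subset (made-up numbers)."""
    base = {0: 0.5, 1: 0.2, 2: 0.0}
    score = sum(base[k] for k in s)
    if 0 in s and 1 in s:
        score += 0.1  # assumed interaction between features 0 and 1
    return score


spvim = shapley_values(3, toy_value)
for j, v in enumerate(spvim):
    print(f"feature {j}: {v:.3f}")
```

In this toy setup the interaction term is split equally between features 0 and 1, feature 2 receives zero importance, and the values sum to the predictiveness of the full feature set minus that of the empty set.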