Is Interpolation Benign for Random Forests?

Series

Econometrics Seminars and Workshop Series

Speaker(s)

Erwan Scornet (Ecole Polytechnique, France)

Field

Econometrics

Location

University of Amsterdam, Roeterseilandcampus, room E5.22
Amsterdam

Date and time

February 17, 2023
12:30 - 13:30

Abstract
Statistical wisdom suggests that very complex models that interpolate the training data will predict poorly on unseen examples. Yet this aphorism has recently been challenged by the identification of benign overfitting regimes, studied mainly for parametric models, in which generalization capabilities may be preserved despite high model complexity. While it is widely known that fully grown decision trees interpolate and, in turn, predict poorly, the same behavior is yet to be analyzed for random forests. In this talk, I will present how the trade-off between interpolation and consistency plays out for several types of random forest models. In particular, I will establish that interpolation and consistency cannot be achieved simultaneously for non-adaptive random forests. Since adaptivity seems to be the cornerstone for bringing together interpolation and consistency, we study the Median RF, which is shown to be consistent even in the interpolation setting. Regarding Breiman's forest, we theoretically control the size of the interpolation area, which converges to zero fast enough that exact interpolation and consistency can occur together. Joint work with Ludovic Arnould and Claire Boyer.