Seminar

Machine Learning for Static Panel Models with Fixed Effects


  • Speaker(s)
    Annalivia Polselli (University of Essex, United Kingdom and visiting-fellow Erasmus University Rotterdam)
  • Location
    Erasmus University Rotterdam, Campus Woudestein, Polak 3-09
    Rotterdam
  • Date and time

    February 28, 2024
    13:00 - 14:00

Abstract

Machine Learning (ML) algorithms provide powerful data-driven tools for approximating high-dimensional and/or non-linear nuisance functions of the confounders without making ex ante assumptions about the true functional form. In this paper, we develop estimators of causal parameters for panel data models that allow for non-linear effects of the confounding regressors, and we investigate the performance of these estimators using well-known ML algorithms (LASSO, classification and regression trees, gradient boosting, and random forests). We use Double Machine Learning (DML) by Chernozhukov et al. (2018) to estimate the homogeneous treatment effect in panel data models with unobserved individual heterogeneity (fixed effects) and no unobserved confounding, extending the partially linear regression model of Robinson (1988). We develop three alternative approaches for handling the fixed effects by adapting the within-group estimator, the first-difference estimator, and the correlated random effects estimator (Mundlak, 1978) to non-linear models. Using Monte Carlo simulations, we find that conventional least squares estimators can perform well even when the data generating process is non-linear, provided it is smooth, but that DML delivers substantial bias reductions when the true effect of the regressors is non-linear and discontinuous. For the same scenarios, however, we also find inference to be problematic for tree-based learners, despite extensive hyperparameter tuning, because these learners lead to highly non-normal distributions of the estimator and severely underestimated sampling variance. Finally, we provide an illustrative example of DML for observational panel data, showing the impact of the introduction of the national minimum wage on voting behaviour in the UK.
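
To make the first-difference idea concrete, the following is a minimal sketch and not the authors' implementation. It assumes a two-period panel with outcome Y_it = theta * D_it + g(X_it) + alpha_i + U_it and treatment D_it = m(X_it) + alpha_i + V_it, so that differencing removes the fixed effect alpha_i. The simulated design (a step function for g, a sine for m, theta = 0.5), the function names, and the k-nearest-neighbour learner are all hypothetical choices for illustration; the learner is a simple standard-library stand-in for the algorithms named in the abstract. Nuisance predictions are cross-fitted on both periods' confounders, and the effect is then estimated by regressing residual on residual.

```python
import math
import random

THETA = 0.5  # true treatment effect in this hypothetical simulated design


def simulate_first_differences(n_units=400, seed=0):
    """Simulate a two-period panel and return first-differenced data.

    Assumed design: Y_it = THETA*D_it + g(X_it) + alpha_i + U_it,
    D_it = m(X_it) + alpha_i + V_it, with g a step function and m = sin.
    Differencing over time removes the fixed effect alpha_i.
    """
    rng = random.Random(seed)
    feats, d_diff, y_diff = [], [], []
    for _ in range(n_units):
        alpha = rng.gauss(0.0, 1.0)  # unobserved individual heterogeneity
        x = [rng.uniform(-2.0, 2.0) for _ in range(2)]
        d = [math.sin(xt) + alpha + rng.gauss(0.0, 1.0) for xt in x]
        y = [THETA * dt + (2.0 if xt > 0 else 0.0) + alpha + rng.gauss(0.0, 1.0)
             for dt, xt in zip(d, x)]
        feats.append((x[0], x[1]))  # both periods' confounders
        d_diff.append(d[1] - d[0])
        y_diff.append(y[1] - y[0])
    return feats, d_diff, y_diff


def knn_predict(train_x, train_t, query, k=10):
    """Average the targets of the k training points nearest to the query."""
    nearest = sorted(range(len(train_x)),
                     key=lambda j: math.dist(train_x[j], query))[:k]
    return sum(train_t[j] for j in nearest) / k


def slope_through_origin(v, y):
    """Least squares slope of y on v without an intercept."""
    return sum(a * b for a, b in zip(v, y)) / sum(a * a for a in v)


def dml_first_difference(feats, d_diff, y_diff, n_folds=5, k=10):
    """Cross-fitted partialling-out estimate of theta on differenced data."""
    n = len(feats)
    d_res, y_res = [0.0] * n, [0.0] * n
    for fold in range(n_folds):
        test = [i for i in range(n) if i % n_folds == fold]
        train = [i for i in range(n) if i % n_folds != fold]
        tx = [feats[i] for i in train]
        td = [d_diff[i] for i in train]
        ty = [y_diff[i] for i in train]
        for i in test:
            # Residualise using nuisance fits from the other folds only
            d_res[i] = d_diff[i] - knn_predict(tx, td, feats[i], k)
            y_res[i] = y_diff[i] - knn_predict(tx, ty, feats[i], k)
    return slope_through_origin(d_res, y_res)


feats, d_diff, y_diff = simulate_first_differences()
theta_naive = slope_through_origin(d_diff, y_diff)  # ignores the confounder
theta_dml = dml_first_difference(feats, d_diff, y_diff)
print(f"naive first-difference estimate: {theta_naive:.3f}")
print(f"cross-fitted DML estimate:       {theta_dml:.3f}")
```

In this design the naive first-difference regression omits the discontinuous confounder effect and should therefore be biased, whereas the cross-fitted estimate should land closer to the true value. The sketch reports only a point estimate; it does not address the inference problems for tree-based learners that the abstract describes.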