Seminar

Robust Estimation and Inference for Categorical Data


  • Location
    Erasmus University Rotterdam, Campus Woudestein, ET-14
    Rotterdam
  • Date and time
    February 04, 2025
    11:30 - 12:30

Abstract:

While there is a rich literature on robust methodologies for contamination in continuously distributed data, contamination in categorical data has been largely overlooked. This gap in the statistics literature is unfortunate because many datasets contain categorical variables that can suffer from contamination just like continuous variables. Examples include inattentive responding and bot responses in questionnaires, data entry errors, or zero-inflated count data. We propose a novel class of contamination-robust estimators of models for categorical data, termed C-estimators ("C" for categorical). C-estimators generalize maximum likelihood estimation and are shown to be consistent, asymptotically Gaussian, and fully efficient in the absence of contamination; full efficiency contrasts with classic robustness theory for continuous data. In addition, we propose a general notion of outlyingness for categorical data and a measure thereof. We verify the attractive statistical properties of the proposed methodology in simulation studies. Furthermore, we demonstrate its practical usefulness in an empirical application on correlation estimation in questionnaire responses, where we find evidence for inattentive responding. Finally, we provide a free open-source R package implementing C-estimators and offering rich methods for printing and plotting.