News | June 10, 2022

Research by Yi He challenges ways of working and thinking within data science

In an interview with the Amsterdam School of Economics (ASE), Tinbergen Institute research fellow Yi He (University of Amsterdam) shares his fascination with the way data science applies mostly outdated mathematical models. He also elaborates on how his research will make clear exactly what kind of solution data scientists should use for a given problem.

Earlier studies by Yi focused predominantly on statistical issues. Nowadays, he is more concerned with data science, acknowledging that, at the end of the day, he is more of a data scientist than a mathematician. So what does he see as the main difference? 'I base my ideas and theories on real-world situations and events. The research done by mathematicians is at a much higher level of abstraction,' he explains.

'Many data scientists use very clear and understandable mathematical solutions to formulate answers to concrete financial questions. But when you look closely at these mathematical solutions, you notice that they’re actually rather simplistic ‒ to the point of not giving you a very reliable answer. So I cast doubt on such solutions,' Yi says. 'To me, data science is all about using data in the best possible way to answer questions. And I believe it means moving off the beaten track. It really feels great to be doing this kind of research.'

He acknowledges that all predictions can be expected to have margins of error. But in some cases, more complex methods work better while in others 'simplicity is king'. Which method should be applied to which data set? 'That’s the issue we’re now trying to resolve.' And Yi is doing so with a new mathematical theory he is currently developing. 'It’s not after all simply a question of how, but just as much one of why. We need to understand why complex data models give us the best predictions in some situations but not others.' He concludes: 'It will save us a lot of time ‒ and spare us many illusions.'

This is an excerpt from the interview by ASE. The complete interview is available on the ASE website.

The paper “Most powerful test against a sequence of high dimensional local alternatives”, authored by Yi He and co-authors Jiti Gao (Monash University, Australia) and Sombut Jaidee (Monash University, Australia), is forthcoming in the Journal of Econometrics: doi.org/10.1016/j.jeconom.2021.10.015.