Unsupervised Machine Learning


  • Teacher(s)
    Patrick Groenen, Pieter Schoonees
  • Research field
    Econometrics
  • Dates
    Period 3 - Jan 03, 2022 to Feb 25, 2022
  • Course type
    Core
  • Program year
    First
  • Credits
    4

Course description

Machine Learning II discusses supervised and unsupervised machine learning approaches that have become popular tools for solving practical problems.

The goal of the course is for students to obtain a thorough technical understanding of a selection of supervised and unsupervised machine learning techniques, to be able to implement these techniques in the high-level language R, and to write a report about an application of a technique. This course is a follow-up to Machine Learning I.


The first part of the course continues the discussion of supervised machine learning techniques from the Machine Learning I course. The second part focuses on unsupervised machine learning techniques for finding meaningful relations between all variables in a data set simultaneously. In contrast to supervised machine learning, in unsupervised techniques all variables play similar roles. Therefore, the relationships among all variables must be modelled, whereas in supervised learning only the relationships between the target variable and the features are of direct interest. An important application of unsupervised learning techniques in management is customer segmentation in targeted marketing.
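
To make the contrast concrete, the following is a minimal sketch of customer segmentation with k-means clustering, one common form of cluster analysis. It is written in Python purely for illustration (the course itself works in R). The data, the function name kmeans, and the two features (annual spend and visits per month) are invented assumptions, not course material. Note that no variable is singled out as a target: every feature enters the distance computation in the same way.

```python
import random


def kmeans(points, k, n_iter=50, seed=0):
    """Lloyd's algorithm on a list of numeric tuples; returns (centroids, labels)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k distinct observations
    labels = None
    for _ in range(n_iter):
        # Assignment step: each point joins its nearest centroid
        # (squared Euclidean distance over all features).
        new_labels = [
            min(range(k),
                key=lambda j: sum((x - c) ** 2 for x, c in zip(p, centroids[j])))
            for p in points
        ]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = tuple(sum(col) / len(members) for col in zip(*members))
    return centroids, labels


# Hypothetical customers: (annual spend, visits per month).
customers = [
    (200, 2), (220, 3), (180, 1), (210, 2),      # low-spend group
    (900, 10), (950, 12), (880, 11), (920, 9),   # high-spend group
]

centroids, labels = kmeans(customers, k=2)
print(labels)
print(centroids)
```

In this toy data the spend variable dominates the distances because of its larger scale; in practice the features would usually be standardised before clustering.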

The techniques and ideas to be treated include:

- support vector machines,

- gradient boosting machines,

- principal components analysis (PCA) and variants,

- multidimensional scaling (MDS),

- cluster analysis.
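
As an illustration of one item on this list, the sketch below computes the first principal component of a small data set by power iteration on the sample covariance matrix. This is only one of several ways to obtain principal components and is not claimed to be the method taught in the course. It is written in Python for illustration (the course uses R), and the function name first_principal_component and the data are hypothetical.

```python
import math


def first_principal_component(rows, n_iter=200):
    """Return (direction, share of total variance) of the first principal component."""
    n, p = len(rows), len(rows[0])
    # Centre each variable at its mean.
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    centered = [[r[j] - means[j] for j in range(p)] for r in rows]
    # Sample covariance matrix (p x p).
    cov = [[sum(c[a] * c[b] for c in centered) / (n - 1) for b in range(p)]
           for a in range(p)]
    # Power iteration: repeatedly multiply by cov and renormalise.
    v = [1.0] * p
    for _ in range(n_iter):
        w = [sum(cov[a][b] * v[b] for b in range(p)) for a in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Rayleigh quotient gives the variance along v (v has unit length).
    eigenvalue = sum(v[a] * sum(cov[a][b] * v[b] for b in range(p)) for a in range(p))
    total_variance = sum(cov[a][a] for a in range(p))
    return v, eigenvalue / total_variance


# Hypothetical data lying close to the line y = 2x.
data = [(1, 2.1), (2, 3.9), (3, 6.2), (4, 7.8), (5, 10.1)]
direction, explained = first_principal_component(data)
print(direction, explained)
```

Because the two variables are almost perfectly linearly related, a single component captures nearly all of the variance, and its direction is close to the slope of the underlying line.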

This course is a field course in the Tinbergen Institute program, worth 3 credits, in the econometrics major.

Prerequisites

Machine Learning I

Course literature

The course will cover material from the following list of readings, which are considered essential for your learning experience. These books and articles are also part of the examined material. Changes to the reading list will be communicated on Canvas.

Books:

  • Hastie, T., Tibshirani, R., & Friedman, J. (2009). The elements of statistical learning (2nd edition). Springer. Available at https://web.stanford.edu/~hastie/Papers/ESLII.pdf.
  • James, G., Witten, D., Hastie, T., & Tibshirani, R. (2013). An introduction to statistical learning. Springer. Available at https://www.statlearning.com/.

Selected papers, including:

  • Groenen, P. J. F., & van de Velden, M. (2016). Multidimensional scaling by majorization: A review. Journal of Statistical Software, 73(8), 1-26.
  • Groenen, P. J. F., Nalbantov, G., & Bioch, J. C. (2009). SVM-Maj: A majorization approach to linear support vector machines with different hinge errors. Advances in Data Analysis and Classification, 2(1), 17-43.
  • Witten, D. M., Tibshirani, R., & Hastie, T. (2009). A penalized matrix decomposition, with applications to sparse principal components and canonical correlation analysis. Biostatistics, 10(3), 515-534.