Unsupervised Machine Learning
Teacher(s): Patrick Groenen, Pieter Schoonees
Dates: Period 3 - Jan 03, 2022 to Feb 25, 2022
Machine Learning II discusses supervised and unsupervised machine learning approaches, which have become popular tools for solving practical problems. The goal is for students to obtain a thorough technical understanding of a selection of supervised and unsupervised machine learning techniques, to be able to implement these techniques in the high-level language R, and to be able to write a report about an application of a technique. This course is a follow-up to Machine Learning I.
The first part of the course continues the discussion of supervised machine learning techniques from the Machine Learning I course. The second part focuses on unsupervised machine learning techniques for finding meaningful relations between all variables in a data set simultaneously. In contrast to supervised machine learning, in unsupervised techniques all variables play similar roles. Therefore, the relationships among all variables must be modelled, whereas in supervised learning only the relationships between the target variable and the features are of direct interest. An important application of unsupervised learning techniques in management is customer segmentation in targeted marketing.
The techniques and ideas to be treated include:
- support vector machines,
- gradient boosting machines,
- principal components analysis (PCA) and variants,
- multidimensional scaling (MDS),
- cluster analysis.
This course is a field course in the Tinbergen Institute program, worth 3 credits in the econometrics major.
The course will cover material from the following list of readings, which are considered essential for your learning experience. These books and articles are also part of the examined material. Changes to the reading list will be communicated on Canvas.
- Hastie, T., Tibshirani, R., & Friedman, J. (2009). The elements of statistical learning (2nd edition). Springer. Available at https://web.stanford.edu/~hastie/Papers/ESLII.pdf.
- James, G., Witten, D., Hastie, T., & Tibshirani, R. (2013). An introduction to statistical learning. Springer. Available at https://www.statlearning.com/.
Selected papers, including:
- Groenen, P. J. F., & van de Velden, M. (2016). Multidimensional scaling by majorization: A review. Journal of Statistical Software, 73(8), 1-26.
- Groenen, P. J. F., Nalbantov, G., & Bioch, J. C. (2009). SVM-Maj: A majorization approach to linear support vector machines with different hinge errors. Advances in Data Analysis and Classification, 2(1), 17-43.
- Witten, D. M., Tibshirani, R., & Hastie, T. (2009). A penalized matrix decomposition, with applications to sparse principal components and canonical correlation analysis. Biostatistics, 10(3), 515-534.