This course focuses on developing a theoretical foundation for machine learning, a growing discipline that delivers a wide set of tools, models, and algorithms for solving real-world prediction and inference problems. The course uses mathematical analysis to formalize the setups encountered in common machine learning problems, and strives to provide a rigorous analysis of model behaviors and training algorithms. Throughout the semester, the emphasis will be on drawing on these modeling frameworks and analyses to understand emerging phenomena in modern machine learning, as well as the algorithmic and practical implications of this effort.
This course complements an introductory machine learning class such as CS 6140, DS 5220, or DS 4400, and provides a different perspective from what you would typically learn in an intro-level course.
The course content will involve a mix of materials from different subjects such as machine/deep learning theory, probability and statistics, neural networks and deep learning, information theory, reinforcement learning, and language modeling.
Prerequisites
Students are expected to be familiar with basic calculus and linear algebra, and be comfortable with reading and writing proofs.
Prior knowledge of probability and linear algebra is assumed.
Students should also have taken an introductory machine learning class.
See here for the syllabus.
Week 1, Jan 8: Overview
What is the course about
Basic setup of supervised prediction
Empirical risk minimization, and uniform convergence
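As a preview of the Week 1 topics, empirical risk minimization and uniform convergence can be stated as follows; the symbols below ($\mathcal{H}$ for a hypothesis class, $\ell$ for a loss function, $R$ and $\hat{R}$ for population and empirical risk) are generic placeholders rather than the course's own notation:

```latex
% Empirical risk over n i.i.d. samples (x_i, y_i):
\hat{R}(h) = \frac{1}{n} \sum_{i=1}^{n} \ell\big(h(x_i), y_i\big),
\qquad
\hat{h} \in \operatorname*{arg\,min}_{h \in \mathcal{H}} \hat{R}(h).

% Uniform convergence asks that, with probability at least 1 - \delta,
\sup_{h \in \mathcal{H}} \big| R(h) - \hat{R}(h) \big| \le \varepsilon,
% which in turn bounds the excess risk of the ERM solution by 2\varepsilon.
```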
Week 2, Jan 12: Neural networks and generative models; Jan 15: Transfer learning estimators
Basic setup of neural networks and language models
Transfer learning and minimax estimation
PAC-learning
Learning a finite, realizable hypothesis class
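A standard result previewing this topic: for a finite hypothesis class under the realizability assumption (some $h^* \in \mathcal{H}$ has zero error) with 0-1 loss, any ERM that is consistent with the training data satisfies the bound below. The notation is the common textbook form, not necessarily what the lectures will use:

```latex
% A hypothesis with true error > \varepsilon survives n i.i.d. samples
% with probability at most (1 - \varepsilon)^n \le e^{-\varepsilon n}, so
% by a union bound over \mathcal{H}:
\Pr\big[\, \mathrm{err}(\hat{h}) > \varepsilon \,\big]
\;\le\; |\mathcal{H}| \, e^{-\varepsilon n}.

% Hence a sample size of
n \;\ge\; \frac{1}{\varepsilon}\Big( \ln |\mathcal{H}| + \ln \tfrac{1}{\delta} \Big)
% suffices to guarantee \mathrm{err}(\hat{h}) \le \varepsilon
% with probability at least 1 - \delta.
```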
Week 3, Jan 22: Concentration estimates
Concentration estimates
Moment generating functions
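As a preview, the moment generating function is the engine behind Chernoff-style concentration bounds; the following is the standard argument (with Hoeffding's lemma for bounded variables), stated in generic notation:

```latex
% Moment generating function of a random variable X:
M_X(\lambda) = \mathbb{E}\big[ e^{\lambda X} \big].

% Chernoff bound: for any \lambda > 0, by Markov's inequality,
\Pr\big[ X - \mathbb{E}[X] \ge t \big]
\;\le\; e^{-\lambda t} \, \mathbb{E}\big[ e^{\lambda (X - \mathbb{E}[X])} \big],
% then optimize over \lambda.

% For X_i \in [a, b] i.i.d., Hoeffding's lemma gives
% \mathbb{E}[e^{\lambda (X_i - \mathbb{E}[X_i])}] \le e^{\lambda^2 (b-a)^2 / 8},
% which yields Hoeffding's inequality for the sample mean:
\Pr\Big[ \bar{X} - \mathbb{E}[\bar{X}] \ge t \Big]
\;\le\; \exp\!\Big( -\frac{2 n t^2}{(b-a)^2} \Big).
```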
There will be three homeworks, for a total of 40% of the overall grade. The homeworks should be completed and submitted individually.
The course project includes an in-class presentation (40% of the total grade) and a final project report (20% of the total grade). The goal is to prepare you for understanding, and to give you experience with, the latest frontier of machine learning. For this semester, we will hand out a list of papers for you to read, and you can choose the one that most interests you. You will have the opportunity to present the paper in class. Lastly, you will write a project report summarizing the findings from your reading.
There isn’t a single textbook that covers all of the lectures, though the following are good references for the course materials.
Understanding Machine Learning: From Theory to Algorithms, Shai Shalev-Shwartz (Hebrew University of Jerusalem) and Shai Ben-David (Waterloo)
Statistical learning theory lecture notes, Percy Liang (Stanford)
Mathematical analysis of machine learning algorithms, Tong Zhang (UIUC)
Learning theory from first principles, Francis Bach (INRIA)