Machine Learning I: Large-Scale Data Analysis and Decision Making
MATH 60629A

Winter 2026



Instructor: Laurent Charlin

Class Schedule:

Day/Time: Monday 3:30pm-6:30pm
Room: Decelles, Natashquan

Office hours: TBD (for now, you can email me)


Description:
In this course, we will study machine learning models, a type of statistical analysis that focuses on prediction, for analyzing very large datasets ("big data").
We will survey different machine learning techniques (supervised, unsupervised) as well as some applications (e.g., recommender systems) and ways to scale up computations (e.g., distributed frameworks).

**Course delivery:** This course will be given as a flipped classroom, an instructional strategy in which students learn the material before they come to class. The material will be a mix of readings and video capsules. Class time is reserved for more active work such as problem solving, demonstrations, and question answering. In addition, each class will include a short summary of the week's material.

Mathematical Note: Mathematical maturity will be assumed.

Programming Note: Python knowledge will be assumed. If you do not know Python, I have listed a few ways to learn the basics below. I recommend option 1 (DataCamp) or option 2:

  1. DataCamp. Complete Chapters 1, 2, 3 of the Introduction to Python course. I will provide you with a link to get access to Chapters 2 and 3.
  2. HEC CAM offers introductory Python courses in January.
  3. Here is the tutorial we used in 2018: Fall 2018 tutorial. While I think the first two options are superior, this will give you an idea of the level I am expecting.

In addition, a machine learning tutorial using Python will be provided in week 4.



Weekly Schedule

  1. 01/05. Class introduction and math review. [slides]
  2. 01/12. Machine learning fundamentals
  3. 01/19. Supervised learning algorithms
  4. 01/26. Python for scientific computations and machine learning [Practical Session]
    • The tutorial that you will follow is here (on Colab). Solutions.
    • I encourage you to start the tutorial ahead of time and to finish it during our 180 minutes together.
  5. 02/02. Neural networks and deep learning
  6. 02/09. Recurrent neural networks and convolutional neural networks
  7. 02/16. Unsupervised learning
  8. 02/23. Reading week (no class)
  9. 03/02. Project team meetings
  10. 03/09. Attention and the Transformer architecture
  11. 03/16. Transformers in practice
    • Will be given in class.
  12. 03/23. Recommender systems
  13. 03/30. Modern generative models (To be confirmed)
    • Will be given in class.
    • Slides
  14. 04/06. No class
  15. 04/13. Class project presentations


Evaluations

  1. Homework (20%)
    • Available early October.
    • Due on February 21.
  2. Project (30%)
  3. Project presentation (10%)
  4. Final Exam (30%)
    • Date: April 29 (Wednesday), Time: 6:30pm-9:30pm.
    • Documentation allowed: cheat sheet (standard size 8.5 x 11, double sided), calculator.
    • Material covered: Everything covered in class + required lectures.
    • Past exams: Fall 2018, Fall 2020 (Solutions)
  5. Capsule summaries (10%)
    • Provide a short summary (10 to 15 lines of text in the form) of 10 capsules throughout the semester.
    • The summary of a capsule must be provided before its class (e.g., a summary of the capsule on "Learning Problem" must be submitted by 01/12).
    • Post your summaries using this form.


References

  1. The Elements of Statistical Learning: Data Mining, Inference, and Prediction. Second Edition. Trevor Hastie, Robert Tibshirani, Jerome Friedman. 2009. [ESL]
  2. Deep Learning. Ian Goodfellow, Yoshua Bengio, and Aaron Courville. [DL]
  3. Reinforcement Learning: An Introduction. Richard S. Sutton, Andrew G. Barto. A Bradford Book. 2nd Edition. [RL-Sutton-Barto]
  4. Machine Learning. Kevin Murphy. MIT Press. 2012. [ML-Murphy]
  5. Recommender Systems Handbook. F. Ricci, L. Rokach, B. Shapira, P.B. Kantor. 2011. [RSH]
  6. Data Algorithms: Recipes for Scaling Up with Hadoop and Spark. 1st Edition. Mahmoud Parsian. O'Reilly. 2015. [DA]
  7. Python for Data Analysis: Data Wrangling with Pandas, NumPy, and IPython. Wes McKinney. O'Reilly. 2012. [PDA]
  8. Pattern Recognition and Machine Learning. Christopher Bishop. 2006. [PRML]
  9. Advanced Analytics with Spark. Second Edition. O'Reilly. 2017.