Machine Learning for Large-Scale Data Analysis and Decision Making
80-629-17A


This is the official website of the course. I will keep it up to date. In case of disagreement with Zone Cours, this website will prevail.



Instructor: Laurent Charlin

Class Schedule: Wednesday 8:30am--11:30am. CSC Saine Marketing.

Office hours: Wednesday 11:30am--12:30pm. CSC 4.817.

Description:
In this course, we will study machine learning models, a type of statistical analysis that focuses on prediction, for analyzing very large datasets ("big data"). In addition to standard models, we will study models for analyzing user behaviour and for decision making. Massive datasets are now common and require scalable analysis tools. Machine learning provides such tools and is widely used for modelling problems across many fields including artificial intelligence, bioinformatics, finance, marketing, education, transportation, and health.

**Note:** Mathematical maturity will be assumed. Programming will also be required, but Python tutorials will be provided in the first few weeks of the class. The plan is to survey different machine learning techniques (supervised, unsupervised, reinforcement learning) as well as some applications (e.g., recommender systems). We will also focus on large-scale machine learning and will discuss distributed computational frameworks (Hadoop and Spark).


Weekly Schedule

  1. 08/30 Class introduction and math review. [slides]
  2. 09/06 Programming with Python I [**In Lab -- Decelles, Laboratoire LACED**]
  3. 09/13 Machine learning fundamentals. [slides] [**C-Ste-Cath, Quebecor**]
    • Required reading: Chapter 5 of Deep Learning [DL].
  4. 09/20 Python for scientific computations and machine learning [**In Lab -- Decelles, Laboratoire LACED**]
  5. 09/27 Supervised learning algorithms [slides]
    • Required reading: None
  6. 10/04 Neural networks and deep learning [slides]
  7. 10/11 Parallel computational paradigms for large-scale data processing [slides]
  8. 10/25 Project team meetings
  9. 11/01 Unsupervised learning [slides]
  10. 11/08 Recommendation systems I [slides]
  11. 11/15 Sequential decision making I [slides]
  12. 11/22 Sequential decision making II [slides]
  13. 11/29 Class project presentations


Evaluations

  1. Homework (20%).
  2. Project (30%). Details to come.
  3. Project presentation (10%). Details to come.
  4. Final Exam (30%). 12/08 9am, room: TBD.
  5. Class participation (10%).


References

  • The Elements of Statistical Learning: Data Mining, Inference, and Prediction. Trevor Hastie, Robert Tibshirani, Jerome Friedman. 2nd edition. 2009.
  • Deep Learning. Ian Goodfellow, Yoshua Bengio, and Aaron Courville. [DL]
  • Reinforcement Learning: An Introduction. Richard S. Sutton, Andrew G. Barto. A Bradford Book. 2nd edition. [RL-Sutton-Barto]
  • Machine Learning: A Probabilistic Perspective. Kevin Murphy. MIT Press. 2012. [ML-Murphy]
  • Recommender Systems Handbook, Ricci, F., Rokach, L., Shapira, B., Kantor, P.B. 2011. [RSH]
  • Mining of Massive Datasets. Jure Leskovec, Anand Rajaraman, Jeff Ullman. Cambridge University Press. 2014. [MMDS]
  • Decision Theory. Halsted. 1986. [DT]
  • Data Algorithms: Recipes for Scaling Up with Hadoop and Spark. Mahmoud Parsian. O'Reilly. 1st edition. 2015. [DA]
  • Python for Data Analysis: Data Wrangling with Pandas, NumPy, and IPython. Wes McKinney. O'Reilly. 2012. [PDA]
  • Data Science from Scratch: First Principles with Python. Joel Grus. 2015. [DSS]