Data Science-related courses at UofT

Overview

Data Science-related courses are offered mostly by the Department of Computer Science and the Department of Statistical Sciences. You may also be interested in exploring the offerings of the Dept. of Electrical Engineering (in particular in machine learning and information theory) and the Rotman School of Management (in business analytics). While the focus here is on the undergraduate course offerings at UofT, strong students can sometimes petition to take graduate courses. Graduate course listings are available on the individual departments' websites.

Data Scientists need a solid background in Statistics and Computer Science, but domain expertise (specific knowledge about the field the data being analyzed came from) is often invaluable. During your time at UofT, it would be valuable to develop expertise in some application domain; examples include finance, life science, physical science, psychology, and marketing. Some ideas for course sequences in an application domain can be found in the description of the Applied Statistics Specialist.

Methods of Data Analysis and the Design of Scientific Studies

Statistical Theory

Machine Learning

Machine Learning is used to obtain insights from large-scale datasets. Machine Learning courses complement the Data Analysis courses.

Multiple machine learning courses are available. Their contents overlap to some extent.

Working with Non-Data Scientists

Working with non-Data Scientists is probably the most important skill of the Data Scientist. A number of courses throughout the university (most of which are limited-enrollment) offer an opportunity to work with people who are not Statisticians or Computer Scientists.

One way to get this experience as part of your coursework is to take courses such as the ones offered in DCSIL.

Data Wrangling

For many Data Scientists, facility with wrangling data is absolutely essential. You should take CSC343 and practice processing datasets as soon as possible. For people working with large datasets, familiarity with Linux-like systems is essential for processing data. CSC209 is a good place to start.

Facility with data wrangling comes with practice -- you’ll acquire it if you work with large datasets. It is a good idea to learn to do Data Wrangling in R as well.

Database Systems

If your interest is in building systems for storing large amounts of data, consider taking CSC369, followed by CSC443.