SML 480: Pedagogy of Data Science

Spring 2020

Course staff

Course description

We are excited to announce a new course: SML 480, Pedagogy of Data Science. SML 480 will run in parallel to SML 201, Introduction to Data Science. Students in SML 480 will work as paid Undergraduate Course Assistants (UCAs), teaching data science to beginners in SML 201. In SML 480, students will discuss approaches to teaching and to curriculum design in data science, as well as consolidate their knowledge of the foundations of data science by exploring topics such as functional programming for data science, the Grammar of Graphics approach to designing a programming language for data visualization, and simulation-based inference. Students in SML 480 will follow the progress of SML 201 students; findings about the progress of beginner data science students will potentially be published in an academic journal or conference.

SML 480 is an excellent opportunity both to contribute directly to others' learning and to think broadly about data science teaching and curriculum design. Our goal is to graduate leaders who will mentor their colleagues in data science and think deeply about data science education.

The course syllabus is available here.


Mon/Wed/Fri 1:30pm-2:20pm on Zoom (previously in CSML 103).

Problem sets

Problem set 1 (Rmd source): Cost functions and Maximum Likelihood. Due: Wednesday, March 4, 11:59 p.m.

Problem set 2 (Rmd source): Generalizability and Random Effects. Due: Friday, April 10, 11:59 p.m.

Problem set 3 (Rmd source): Generalizability and Random Effects II. Due: Friday, April 24, 11:59 p.m.


Data Analysis Using Regression and Multilevel/Hierarchical Models by Andrew Gelman and Jennifer Hill (free e-book from the PU library)
Advanced Data Analysis from an Elementary Point of View by Cosma Shalizi (free PDF online from the author)
How to Design Programs, 2nd ed. by Matthias Felleisen, Robert Bruce Findler, Matthew Flatt, and Shriram Krishnamurthi (free online copy from the authors at the link)
Weighing the Odds: A Course in Probability and Statistics by David Williams (on reserve at the library)
Advanced R by Hadley Wickham (free online copy from the author at the link)
Computational and Inferential Thinking by Ani Adhikari and John DeNero (free online copy from the authors at the link)
Statistical Thinking for the 21st Century by Russell A. Poldrack (free online copy from the author at the link)

An inclusive environment

We strive to build and maintain an inclusive environment in class, one that allows every student to reach their full potential. Please do not hesitate to contact me and/or your preceptor if you need special accommodations or have any concerns.

Design credit: CS229, Jan 2019.