Week 1 |
Welcome to SML201
Lecture 1 (Rmd source): evaluating R expressions, printing to the console, variables, conditionals, functions.
Lecture 2 (Rmd source): functions review, print vs. return, logical expressions, vectors, vectorized operators.
|
Reading: DataCamp's Intro to R, Ch. 1, 2, 5
Reading: Poldrack Ch. 3
Just for fun: Physician salary data
|
Week 2 |
Lecture 1 (Rmd source): indexing with logical vectors, parallel vectors, function composition and pipes, data frames, intro to tidyverse.
Lecture 2 (Rmd source): wrangling data with dplyr/tidyverse: filter, arrange, rename, select, summarize, mutate, and group_by. (lecture draft)
|
Reading (primary): R4DS Ch. 5
Reading (secondary): Poldrack Ch. 5.1-5.3.
Reading (exercises): DataCamp Introduction to the Tidyverse Ch. 1 and 3.
|
Week 3 |
Lecture 1 (Rmd source): problem solving with dplyr/tidyverse. Using sapply (lecture draft)
Lecture 2 (Rmd source): dplyr odds and ends. Named arguments. Review of sapply. A first look at DataViz with ggplot. (lecture draft)
|
Reading: R4DS Ch. 5 (continue reading)
Reading: DataCamp Introduction to the Tidyverse Ch. 1 and 3. (continue practicing)
Reading: Healy Ch. 3
|
Week 4 |
Lecture 1 (Rmd source): Introduction to DataViz with ggplot (cont'd); Intro to Predictive Modeling: Linear Regression (Rmd source). (lecture draft) Predictive modeling slides.
Lecture 2: Predictive modeling slides, cont'd. Logistic regression part 1 (Rmd source). Logistic regression part 2 (Rmd source). Bar charts with ggplot (Rmd source). Splitting datasets and cross-validation (Rmd source). (lecture draft)
|
Video: SML201: Why the categorical version of a variable works better on the training set
Reading (primary): Healy Ch. 3
Reading (exercises): DataCamp Introduction to the Tidyverse, Data visualization chapters.
|
Week 5 |
Bar charts 2 (Rmd source). Histograms (Rmd source). Measuring performance of classifiers (Rmd source). Dataset splits (Rmd source). SSE/MSE/RMSE (Rmd source). Bar charts: summary (Rmd source)
Interpreting regression coefficients
|
Reading (primary): Healy Ch. 4
|
Week 6 |
Variable selection and cross-validation (Rmd source)
Videos: Probability, odds, betting odds, odds on a bookmaker website, log odds, log odds 2, variable selection with cross-validation, replicate, sampling datasets
Fairness in Machine Learning. Video. Video: Calibration.
|
Reading: Poldrack Ch. 6
Reading: Angwin et al., Machine Bias: There’s software used across the country to predict future criminals. And it’s biased against blacks.
Reading: Corbett-Davies et al., A computer program used for bail and sentencing decisions was labeled biased against blacks. It’s actually not that clear.
Reading (advanced): Julia Dressel and Hany Farid, The accuracy, fairness, and limits of predicting recidivism. Sam Corbett-Davies and Sharad Goel, The Measure and Mismeasure of Fairness: A Critical Review of Fair Machine Learning (more technical). Margaret Mitchell et al., Model Cards for Model Reporting.
|
Week 7 |
Tidy data (Rmd source). Video
A look at DataViz
Morning lecture: Recording
Afternoon lecture: Recording (better quality)
Software: To knit to PDF, you need to install MiKTeX (Windows) or MacTeX (Mac). Alternatively, you can use RStudio Cloud.
|
Reading: Healy Ch. 1
Reading: Healy Ch. 2.1
Reading: Healy Ch. 3.2
Reading: Shalizi, Using R Markdown for class reports
Just for fun: Napoleon's march on Moscow
Just for fun: Cross-national differences in happiness: Cultural measurement bias or effect of culture?
Just for fun: LaTeX and Donald Knuth's email habits
The Challenger disaster. Richard Feynman demonstrates the effect of cold temperature on the O-rings
Music during the break: Brian Wilson by Barenaked Ladies
|
Week 8 |
Fairness recap
Probability distributions. Code, Rmd source.
Recordings: Tuesday morning lecture, Tuesday afternoon lecture
Recordings: Thursday morning lecture, Thursday afternoon lecture
Intro to Probability, pt. 2 (Rmd source)
|
Reading: OpenIntro Statistics (4th ed.) Ch. 3. (link to free pdf)
Just for fun: You can load a die but you can't weight a coin
Music during the break: Collect Call and Gimme Sympathy by Metric
Music during the break: Crabbuckit and Man I Used to Be by k-os. Crabbuckit (The Good Lovelies cover)
|
Week 9 |
Probability review (Rmd)
P-values (Rmd)
Tuesday lecture recording: morning lecture, afternoon lecture
Thursday lecture recording: morning lecture, afternoon lecture
|
Reading: Poldrack Ch. 7, Ch. 8.1-8.4, Ch. 9.1-9.3
Music during the break: Free Man in Paris and Both Sides, Now by Joni Mitchell
|
Week 10 |
P-values (Rmd) continued
The t-statistic (Rmd)
Hypothesis testing design recipe (Rmd)
Tuesday morning lecture, Tuesday afternoon lecture.
Thursday morning lecture, Thursday afternoon lecture.
|
Reading: Poldrack Ch. 15.1-15.3
Music during the break: Chris Hadfield, Ed Robertson, and the Wexford Gleeks, Is Somebody Singing. The Beatles and Paul McCartney's OPP badge, Sgt. Pepper's Lonely Hearts Club Band intro and With A Little Help From My Friends
Music during the break: The Right Honourable Stephen "Stingo" Harper and Yo-Yo Ma, With a Little Help from My Friends. And another cover with Yo-Yo Ma, with Rosa Passos: Chega de Saudade
|
Week 11 |
Hypothesis testing. Code (Rmd).
Hypothesis testing: summary
Confidence intervals
Tuesday lectures: Tuesday morning, Tuesday afternoon
Thursday lectures: Thursday morning, Thursday afternoon
|
Reading:
Music during the break: Spadina Bus by The Shuffle Demons. Some Chords by deadmau5.
|
Week 12 |
Inference with Linear Regression. Code (Rmd)
Comparing group means. Code (Rmd)
A brief intro to artificial neural networks. Andrej Karpathy's GoogLeNet labelling interface.
Lecture recordings: Tuesday 11am, Tuesday 3pm. Thursday 11am, Thursday 3pm.
Guest lecture: Ganes Kesari
|
Reading: Poldrack Ch. 14
Reading: Poldrack Ch. 15
Music during the break: Tom Sawyer by Rush.
|