---
title: "SML201 Precept 8 Solutions, Spring 2020"
output: html_document
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```

### Problem 1

Suppose that $X\sim\mathcal{N}(2, 10^2)$. We sample the variable $X$ once (i.e., we obtain a sample from the distribution $\mathcal{N}(2, 10^2)$).

### Problem 1(a)

Write R code to obtain $P(2.1 < X < 3.1)$. Use `pnorm`.

#### Solution

```{r}
pnorm(q = 3.1, mean = 2, sd = 10) - pnorm(q = 2.1, mean = 2, sd = 10)
```

*Learning goal*: compute probabilities of intervals.

### Problem 1(b)

Write R code to obtain $P(2.1 < X < 3.1)$. Use `pnorm(..., mean = 0, sd = 1)`.

#### Solution

The idea here is that we can "shift" and "shrink" $X$ using $(X-2)/10$, so that $(X-2)/10 \sim \mathcal{N}(0, 1)$. Applying the same transformation to both endpoints of the interval gives $P(2.1 < X < 3.1) = P\left(\frac{2.1-2}{10} < \frac{X-2}{10} < \frac{3.1-2}{10}\right)$.

```{r}
pnorm(q = (3.1-2)/10, mean = 0, sd = 1) - pnorm(q = (2.1-2)/10, mean = 0, sd = 1)
```

*Learning goal*: transform normal random variables to be $\mathcal{N}(0, 1)$.

### Problem 1(c)

Write R code to obtain $P(2.1 < X < 3.1)$. Use `rnorm` (and not `pnorm`).

#### Solution

```{r}
x <- rnorm(n = 100000, mean = 2, sd = 10)
mean((2.1 < x) & (x < 3.1))
```

*Learning goal*: compute probabilities via simulation. Understand the connection between samples from a distribution and the cumulative distribution function.

### Problem 1(d)

Write R code to obtain $P(2.1 < X < 3.1)$. Use `rnorm(..., mean = 0, sd = 1)`.

### Problem 2

Suppose 65% of Princeton students like Wawa better than World Coffee. We selected a random sample of 100 students and asked them which they prefer. What is the probability that more than 78 students said "Wawa"?

#### Problem 2(a)

Answer the question using `pbinom`.

*Learning goal*: map a word problem to a cumulative probability computation, use the normal approximation to the binomial distribution.
#### Solution

```{r}
1 - pbinom(q = 78, size = 100, prob = .65)
```

Another option is to use the `lower.tail` argument, but that is not preferred right now:

```{r}
pbinom(q = 78, size = 100, prob = .65, lower.tail = FALSE)
```

*Learning goal*: map a word problem to a cumulative probability computation.

#### Problem 2(b)

Answer the question using `pnorm`. Use the normal approximation to the binomial distribution (recall: the mean is $n\times prob$ and the variance is $n\times prob\times (1-prob)$).

#### Solution

```{r}
1 - pnorm(q = 78, mean = 65, sd = sqrt(.65*.35*100))
```

(Note: we are not requiring a continuity correction. To match the answer to 2(a), we'd need `q = 78.9`.)

Another option (dispreferred):

```{r}
pnorm(q = 78, mean = 65, sd = sqrt(.65*.35*100), lower.tail = FALSE)
```

*Learning goal*: use the normal approximation to the binomial. Recognize the consequences of not using a continuity correction.

### Problem 3

Suppose 100 Princeton students were asked whether Harvard or Stanford is the worse online institution of higher learning. 60 students said that Stanford is worse. Compute the p-value for the null hypothesis that Princeton students think that Harvard and Stanford are equally bad, on average. What can you conclude?

#### Solution

The null hypothesis here is that $P(\text{Stanford}) = 0.5$.

The p-value here is $P(\text{n.Stanford} \geq 60 \text{ or } \text{n.Stanford} \leq 40)$. We can compute that using

```{r}
pbinom(q = 40, size = 100, prob = 0.5) + (1 - pbinom(q = 59, size = 100, prob = 0.5))
```

If the null hypothesis were true, we would see a value at least as extreme as the one we observed about 5.7% of the time. This is above the conventional 5% threshold, so we do not reject the null hypothesis: the data we have are consistent with Princeton students thinking that Harvard and Stanford are equally bad online institutions of higher learning.
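
As a cross-check on the two-tailed sum above, the same p-value can be obtained from base R's exact binomial test, `binom.test`. This is a sketch that goes beyond what the problem asks for; the names `p_manual` and `p_builtin` are introduced here for illustration only. Because the null distribution is symmetric when `p = 0.5`, the two results should agree up to floating-point error.

```{r}
# Two-sided p-value computed by hand, as in the solution above
p_manual <- pbinom(q = 40, size = 100, prob = 0.5) +
  (1 - pbinom(q = 59, size = 100, prob = 0.5))

# Exact binomial test from base R (stats package); two-sided by default
p_builtin <- binom.test(x = 60, n = 100, p = 0.5)$p.value

# Show both values side by side and confirm that they agree
c(manual = p_manual, builtin = p_builtin)
all.equal(p_manual, p_builtin)
```

For a null proportion other than 0.5 the binomial distribution is not symmetric, so the hand-computed tails would need to be chosen differently and this simple agreement should not be assumed.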
### Problem 4

Answer Problem 2 using only `rnorm(..., mean = 0, sd = 1)`.

#### Solution

```{r}
x <- rnorm(n = 100000, mean = 0, sd = 1)
# Now, 65 + x*sqrt(.65*.35*100) ~ N(65, sqrt(.65*.35*100)^2)
y <- 65 + x*sqrt(.65*.35*100)
mean(y > 78)
```

*Learning goal*: compute probability via simulation; flexibly apply variable transformation.
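
Since the simulation above draws from the same normal distribution used in Problem 2(b), its output can be sanity-checked against the closed-form `pnorm` answer. The sketch below assumes a fixed seed (the value `201` is arbitrary) so that the result is reproducible; `sigma`, `p_sim`, and `p_exact` are illustrative names that do not appear elsewhere in this handout.

```{r}
set.seed(201)  # arbitrary seed, used only for reproducibility

sigma <- sqrt(.65*.35*100)                 # sd of the normal approximation
z <- rnorm(n = 100000, mean = 0, sd = 1)   # standard normal draws

p_sim <- mean(65 + z*sigma > 78)           # simulated tail probability
p_exact <- 1 - pnorm(q = 78, mean = 65, sd = sigma)  # closed form, as in 2(b)

c(simulated = p_sim, exact = p_exact)
```

The two values should be close. The remaining gap is Monte Carlo error, which shrinks as `n` grows.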