Problem 1

Suppose that \(X\sim\mathcal{N}(2, 10^2)\). We sample the variable \(X\) once (i.e., we obtain a sample from the distribution \(\mathcal{N}(2, 10^2)\)).

Problem 1(a)

Write R code to obtain \(P(2.1 < X < 3.1)\). Use pnorm.

Solution

pnorm(q = 3.1, mean = 2, sd = 10) - pnorm(q = 2.1, mean = 2, sd = 10)
## [1] 0.03980596

Learning goal: compute probabilities of intervals

Problem 1(b)

Write R code to obtain \(P(2.1 < X < 3.1)\). Use pnorm(..., mean = 0, sd = 1).

Solution

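The idea here is that we can “shift” and “shrink” \(X\): subtracting the mean 2 and dividing by the standard deviation 10 gives \((X-2)/10 \sim \mathcal{N}(0, 1)\). Applying the same transformation to the endpoints 2.1 and 3.1 leaves the probability unchanged.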
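Written out, the standardization step might look as follows. Here \(Z\) denotes a standard normal variable and \(\Phi\) its cumulative distribution function; both symbols are introduced for illustration and do not appear in the problem statement.

\[
P(2.1 < X < 3.1) = P\left(\frac{2.1-2}{10} < \frac{X-2}{10} < \frac{3.1-2}{10}\right) = P(0.01 < Z < 0.11) = \Phi(0.11) - \Phi(0.01)
\]

In R, \(\Phi(q)\) corresponds to pnorm(q, mean = 0, sd = 1).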

pnorm(q = (3.1-2)/10, mean = 0, sd = 1) - pnorm(q = (2.1-2)/10, mean = 0, sd = 1)
## [1] 0.03980596

Learning goal: transform normal random variables to be \(\mathcal{N}(0, 1)\)

Problem 1(c)

Write R code to obtain \(P(2.1 < X < 3.1)\). Use rnorm. (And not pnorm.)

Solution

x <- rnorm(n = 100000, mean = 2, sd = 10)
mean((2.1 < x) & (x < 3.1))
## [1] 0.03982

Learning goal: compute probabilities via simulation. Understand the connection between samples from a distribution and the cumulative distribution function.
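One way to make the connection between samples and the cumulative distribution function concrete is the empirical CDF. The following is a minimal sketch, not part of the original solution: the seed, the variable names (n_draws, Fn, p_sim, p_exact, se_sim), and the standard-error line are assumptions added for illustration. ecdf() is in base R's stats package.

```r
set.seed(1)  # arbitrary seed, chosen only so the sketch is reproducible
n_draws <- 100000
x <- rnorm(n = n_draws, mean = 2, sd = 10)

# Empirical CDF: Fn(q) is the fraction of draws that are <= q
Fn <- ecdf(x)

# Simulation-based estimate of P(2.1 < X < 3.1)
p_sim <- Fn(3.1) - Fn(2.1)

# Value from the true CDF, for comparison
p_exact <- pnorm(q = 3.1, mean = 2, sd = 10) - pnorm(q = 2.1, mean = 2, sd = 10)

# Approximate Monte Carlo standard error of the simulated proportion
se_sim <- sqrt(p_sim * (1 - p_sim) / n_draws)

print(c(simulated = p_sim, exact = p_exact, std_error = se_sim))
```

The simulated value should differ from the pnorm answer by an amount comparable to the standard error, and the gap shrinks as n_draws grows.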

Problem 1(d)

Write R code to obtain \(P(2.1 < X < 3.1)\). Use rnorm(..., mean = 0, sd = 1)
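No solution is given for 1(d) in this handout. One possible approach, sketched here by analogy with Problems 1(b) and 4 and not taken from the original, is to draw standard normal samples and undo the standardization: if \(Z \sim \mathcal{N}(0, 1)\), then \(2 + 10Z \sim \mathcal{N}(2, 10^2)\). The seed and the names z, x, and p_hat are assumptions.

```r
set.seed(1)  # arbitrary seed, chosen only so the sketch is reproducible

# Draw from N(0, 1) only, as the problem requires
z <- rnorm(n = 100000, mean = 0, sd = 1)

# Shift and scale: 2 + 10*z ~ N(2, 10^2)
x <- 2 + 10 * z

# Proportion of transformed draws that fall in (2.1, 3.1)
p_hat <- mean((2.1 < x) & (x < 3.1))
print(p_hat)
```

An equivalent variant keeps z as is and transforms the endpoints instead, using mean((0.01 < z) & (z < 0.11)).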

Problem 2

Suppose 65% of Princeton students like Wawa better than World Coffee. We selected a random sample of 100 students, and asked them which they prefer. What is the probability that more than 78 students said “Wawa”?

Problem 2(a)

Answer the question using pbinom.

Solution

1 - pbinom(q = 78, size = 100, prob = .65)
## [1] 0.001686446

Another option is to use the lower.tail argument, but that is not preferred right now:

pbinom(q = 78, size = 100, prob = .65, lower.tail = FALSE)
## [1] 0.001686446

Learning goal: map a word problem to a cumulative probability computation.

Problem 2(b)

Answer the question using pnorm. Use the normal approximation to the Binomial distribution (recall: the mean is \(n\times prob\) and the variance is \(n\times prob\times (1-prob)\)).

Solution

1 - pnorm(q = 78, mean = 65, sd = sqrt(.65*.35*100))
## [1] 0.003209814

(Note: a continuity correction is not required here. To roughly match the exact answer from 2(a), we’d need q = 78.9.)

Another option (dispreferred):

pnorm(q = 78, mean = 65, sd = sqrt(.65*.35*100), lower.tail = FALSE)
## [1] 0.003209814

Learning goal: use the normal approximation to the binomial. Recognize the consequences of not using continuity correction.
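To see the consequences of skipping the continuity correction, the three quantities can be computed side by side. This is a sketch under one assumption: it uses the standard textbook half-unit correction (q = 78.5, since \(P(X > 78) = P(X \geq 79)\) for an integer-valued count), which is not the value mentioned in the note above. The variable names are illustrative.

```r
n <- 100
prob <- 0.65
mu <- n * prob                        # mean of the binomial
sigma <- sqrt(n * prob * (1 - prob))  # standard deviation of the binomial

# Exact binomial tail probability, as in 2(a)
p_exact <- 1 - pbinom(q = 78, size = n, prob = prob)

# Normal approximation without a continuity correction, as in 2(b)
p_uncorrected <- 1 - pnorm(q = 78, mean = mu, sd = sigma)

# Normal approximation with the half-unit correction (assumption: q = 78.5)
p_corrected <- 1 - pnorm(q = 78.5, mean = mu, sd = sigma)

print(c(exact = p_exact, uncorrected = p_uncorrected, corrected = p_corrected))
```

The corrected value lands between the other two: closer to the exact binomial answer than the uncorrected one, but still above it, because this far into the tail the normal approximation itself is rough.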

Problem 3

Suppose 100 Princeton students were asked whether Harvard or Stanford is the worse online institution of higher learning. 60 students said that Stanford is worse. Compute the p-value for the null hypothesis that Princeton students think that Harvard and Stanford are equally bad, on average. What can you conclude?

Solution

The null hypothesis here is that \(P(Stanford) = 0.5\)

The p-value here is P(n.Stanford >= 60 or n.Stanford <= 40). We can compute that using

pbinom(q = 40, size = 100, prob = 0.5) + (1 - pbinom(q = 59, size = 100, prob = 0.5))
## [1] 0.05688793

Under the null hypothesis, we would see a value at least as extreme as the one we observed about 5.7% of the time. Since this is above the conventional 5% threshold, we do not reject the null hypothesis: the data we have are consistent with Princeton students thinking that Harvard and Stanford are equally bad online institutions of higher learning.
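As a cross-check, the same two-sided p-value can be obtained from base R's binom.test(), which performs an exact binomial test. This is a sketch for comparison only and not part of the original solution; the names p_manual, test_result, and p_builtin are illustrative.

```r
# Manual two-sided p-value: P(n.Stanford <= 40) + P(n.Stanford >= 60) under prob = 0.5
p_manual <- pbinom(q = 40, size = 100, prob = 0.5) +
  (1 - pbinom(q = 59, size = 100, prob = 0.5))

# Exact binomial test from the stats package (loaded by default in R)
test_result <- binom.test(x = 60, n = 100, p = 0.5, alternative = "two.sided")
p_builtin <- test_result$p.value

print(c(manual = p_manual, builtin = p_builtin))
```

The two values agree here because the null distribution with prob = 0.5 is symmetric around 50, so "at least as extreme" means the same thing in both tails.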

Problem 4

Answer Problem 2 using only rnorm(..., mean = 0, sd = 1)

Solution

x <- rnorm(n = 100000, mean = 0, sd = 1)
# Now, 65 + x*sqrt(.65*.35*100) ~ N(65, sqrt(.65*.35*100)^2)
y <- 65 + x*sqrt(.65*.35*100)
mean(y > 78)
## [1] 0.00317

Learning goal: compute probability via simulation; flexibly apply variable transformation