Suppose that \(X\sim\mathcal{N}(2, 10^2)\). We sample the variable \(X\) once (i.e., we obtain a sample from the distribution \(\mathcal{N}(2, 10^2)\)).
Write R code to obtain \(P(2.1 < X < 3.1)\). Use pnorm.
pnorm(q = 3.1, mean = 2, sd = 10) - pnorm(q = 2.1, mean = 2, sd = 10)
## [1] 0.03980596
Learning goal: compute probabilities of intervals
Write R code to obtain \(P(2.1 < X < 3.1)\). Use pnorm(..., mean = 0, sd = 1).
The idea here is that we can “shift” and “shrink” \(X\) using \((X-2)/10\), so that \((X-2)/10 \sim \mathcal{N}(0, 1)\); the same transformation is applied to the interval endpoints.
pnorm(q = (3.1-2)/10, mean = 0, sd = 1) - pnorm(q = (2.1-2)/10, mean = 0, sd = 1)
## [1] 0.03980596
Learning goal: transform normal random variables to be \(\mathcal{N}(0, 1)\)
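Written out, the standardization step is the chain of equalities below, where \(Z = (X-2)/10\) and \(\Phi\) denotes the standard normal CDF (the symbols \(Z\) and \(\Phi\) are notation introduced here for illustration; they do not appear in the problem statement):

\[
P(2.1 < X < 3.1) = P\left(\frac{2.1-2}{10} < Z < \frac{3.1-2}{10}\right) = P(0.01 < Z < 0.11) = \Phi(0.11) - \Phi(0.01).
\]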
Write R code to obtain \(P(2.1 < X < 3.1)\). Use rnorm. (And not pnorm.)
x <- rnorm(n = 100000, mean = 2, sd = 10)
mean((2.1 < x) & (x < 3.1))
## [1] 0.03982
Learning goal: compute probabilities via simulation. Understand the connection between samples from a distribution and the cumulative distribution function.
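One way to see the connection named in this learning goal is to compare the empirical CDF of the simulated sample with pnorm. The sketch below is illustrative only; the seed and the names Fhat, sim_estimate and exact are assumptions added here, not part of the original solution.

```r
set.seed(123)  # assumption: fixed seed so the run is reproducible
x <- rnorm(n = 100000, mean = 2, sd = 10)

# ecdf() returns a function: Fhat(q) is the fraction of samples <= q
Fhat <- ecdf(x)

# The difference of two empirical CDF values estimates P(2.1 < X <= 3.1)
sim_estimate <- Fhat(3.1) - Fhat(2.1)

# Exact value from the normal CDF, as in the pnorm solution
exact <- pnorm(q = 3.1, mean = 2, sd = 10) - pnorm(q = 2.1, mean = 2, sd = 10)

c(simulated = sim_estimate, exact = exact)
```

The two numbers should agree to roughly two or three decimal places, with the gap shrinking as n grows.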
Write R code to obtain \(P(2.1 < X < 3.1)\). Use rnorm(..., mean = 0, sd = 1)
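No solution for this part is given here. A minimal sketch, running the shift-and-scale idea in reverse: if \(Z \sim \mathcal{N}(0, 1)\), then \(2 + 10Z \sim \mathcal{N}(2, 10^2)\). The seed and the names z, x and est are illustrative choices.

```r
set.seed(1)  # assumption: fixed seed so the run is reproducible

# Draw standard normal samples only
z <- rnorm(n = 100000, mean = 0, sd = 1)

# Scale by the standard deviation and shift by the mean: x ~ N(2, 10^2)
x <- 2 + 10 * z

# Fraction of transformed samples that fall inside the interval
est <- mean((2.1 < x) & (x < 3.1))
est
```

The estimate should land close to the pnorm answer of about 0.0398, up to simulation noise.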
Suppose 65% of Princeton students like Wawa better than World Coffee. We selected a random sample of 100 students, and asked them which they prefer. What is the probability that more than 78 students said “Wawa”?
Answer the question using pbinom.
Learning goal: map a word problem to a cumulative probability computation, use the normal approximation to the binomial distribution.
1 - pbinom(q = 78, size = 100, prob = .65)
## [1] 0.001686446
Another option is to use the lower.tail argument, but that is not preferred right now:
pbinom(q = 78, size = 100, prob = .65, lower.tail = FALSE)
## [1] 0.001686446
Learning goal: map a word problem to a cumulative probability computation.
Answer the question using pnorm. Use the normal approximation to the Binomial distribution (recall: the mean is \(n\times prob\) and the variance is \(n\times prob\times (1-prob)\)).
1 - pnorm(q = 78, mean = 65, sd = sqrt(.65*.35*100))
## [1] 0.003209814
(Note: a continuity correction is not required here. To match the answer to 2(b), we’d need q = 78.9.)
Another option (dispreferred):
pnorm(q = 78, mean = 65, sd = sqrt(.65*.35*100), lower.tail = FALSE)
## [1] 0.003209814
Learning goal: use the normal approximation to the binomial. Recognize the consequences of not using continuity correction.
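To make the effect of the continuity correction concrete, the sketch below compares the exact binomial tail with the normal approximation, with and without the usual half-unit correction (q = 78.5, since \(P(X > 78) = P(X \ge 79)\) for an integer-valued count). The variable names are illustrative and not part of the original solution.

```r
n <- 100
p <- 0.65
mu <- n * p                     # mean of the binomial
sigma <- sqrt(n * p * (1 - p))  # standard deviation of the binomial

# Exact binomial tail probability P(X > 78)
exact <- 1 - pbinom(q = 78, size = n, prob = p)

# Normal approximation without a continuity correction
approx_plain <- 1 - pnorm(q = 78, mean = mu, sd = sigma)

# Normal approximation with the half-unit continuity correction
approx_cc <- 1 - pnorm(q = 78.5, mean = mu, sd = sigma)

c(exact = exact, plain = approx_plain, corrected = approx_cc)
```

The corrected approximation lands closer to the exact binomial value than the uncorrected one, though it still does not match it exactly.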
Suppose 100 Princeton students were asked whether Harvard or Stanford is the worse online institution of higher learning. 60 students said that Stanford is worse. Compute the p-value for the null hypothesis that Princeton students think that Harvard and Stanford are equally bad, on average. What can you conclude?
The null hypothesis here is that \(P(Stanford) = 0.5\).
The p-value is the probability, under the null, of a count at least as far from the expected 50 as the observed 60: P(n.Stanford >= 60 or n.Stanford <= 40). We can compute that using:
pbinom(q = 40, size = 100, prob = 0.5) + (1 - pbinom(q = 59, size = 100, prob = 0.5))
## [1] 0.05688793
If the null hypothesis were true, we would see a value at least as extreme as the one we observed about 5.7% of the time. Since this is above the conventional 5% threshold, we do not reject the null hypothesis: the data we have are consistent with Princeton students thinking that Harvard and Stanford are equally bad online institutions of higher learning.
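As a cross-check, base R's binom.test computes an exact two-sided binomial p-value; for a symmetric null (p = 0.5) it should agree with the manual sum of the two tails. A short sketch, with illustrative variable names that are not part of the original solution:

```r
# Manual two-sided p-value: lower tail P(X <= 40) plus upper tail P(X >= 60)
manual <- pbinom(q = 40, size = 100, prob = 0.5) +
  (1 - pbinom(q = 59, size = 100, prob = 0.5))

# Exact binomial test of H0: p = 0.5 with 60 successes out of 100
test <- binom.test(x = 60, n = 100, p = 0.5, alternative = "two.sided")

c(manual = manual, binom_test = test$p.value)
```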
Answer Problem 2 using only rnorm(..., mean = 0, sd = 1)
x <- rnorm(n = 100000, mean = 0, sd = 1)
# Now, 65 + x*sqrt(.65*.35*100) ~ N(65, sqrt(.65*.35*100)^2)
y <- 65 + x*sqrt(.65*.35*100)
mean(y > 78)
## [1] 0.00317
Learning goal: compute probability via simulation; flexibly apply variable transformation