---
title: "Precept 8 Problem Set"
output:
  html_document:
    df_print: paged
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```
### Problem 1
Suppose 65% of Princeton students like World Coffee better than Hoagie Haven. We selected a random sample of 100 students, and asked them which they prefer. What is the probability that more than 78 students said "World Coffee"?
Use the normal approximation to the Binomial distribution (recall: the mean is $n \times prob$ and the variance is $n \times prob \times (1 - prob)$). Verify that the exact answer from `pbinom` is close to the approximation from `pnorm`.
#### Solution
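```{r}
# Normal approximation: P(X > 78) = P(X >= 79), so use the continuity-corrected cutoff 78.5
1 - pnorm(q = 78.5, mean = 65, sd = sqrt(.65*.35*100))
# Exact Binomial tail: P(X > 78) = 1 - P(X <= 78)
1 - pbinom(q = 78, size = 100, prob = .65)
```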
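
As a cross-check on the solution, the sketch below computes the exact tail probability two ways (summing `dbinom` over 79 to 100, and using `pbinom`) and sets it beside a normal approximation evaluated at the conventional half-unit cutoff of 78.5. The variable names are illustrative and not part of the original solution.

```{r}
n <- 100
prob <- 0.65
# Exact tail two ways: summing the Binomial pmf over 79..100, and via the cdf
exact.sum <- sum(dbinom(x = 79:n, size = n, prob = prob))
exact.cdf <- 1 - pbinom(q = 78, size = n, prob = prob)
# Normal approximation with a half-unit continuity correction (cutoff 78.5)
norm.approx <- 1 - pnorm(q = 78.5, mean = n * prob, sd = sqrt(n * prob * (1 - prob)))
c(exact.sum = exact.sum, exact.cdf = exact.cdf, norm.approx = norm.approx)
```

The two exact values should agree up to floating-point error. The normal approximation is close but not identical, because the Binomial distribution with `prob = 0.65` is slightly skewed.
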
### Problem 2
You sample the following 10 measurements from $N(\mu, \sigma^2)$:
```{r}
set.seed(0)
rnorm(n = 10, mean = 0.1, sd = 2)
```
Evaluate the evidence against the null hypothesis that $\mu = 0$.
Now, do the same for `rnorm(n = 100000000, mean = 0.1, sd = 2)`.
What do you observe?
#### Solution
For $n = 10$:
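```{r}
set.seed(0)
my.sample <- rnorm(n = 10, mean = 0.1, sd = 2)
# t statistic for H0: mu = 0, using the estimated standard error of the mean
t.stat <- mean(my.sample) / (sd(my.sample) / sqrt(10))
# Two-sided p-value; abs() keeps this correct even when the sample mean is negative
2 * pt(-abs(t.stat), df = 9)
```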
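
For the large-sample case, the same calculation can be sketched as follows. As an assumption made only to keep memory use and run time modest, `n.large` is set to 1000000 rather than the 100000000 in the problem statement; the qualitative conclusion should not depend on that choice. The names `n.large`, `big.sample`, `t.big`, and `p.big` are illustrative.

```{r}
set.seed(0)
# Assumption: n.large is reduced from 100000000 to keep memory use modest
n.large <- 1000000
big.sample <- rnorm(n = n.large, mean = 0.1, sd = 2)
# Same t statistic as before, now with a much smaller standard error
t.big <- mean(big.sample) / (sd(big.sample) / sqrt(n.large))
p.big <- 2 * pt(-abs(t.big), df = n.large - 1)
c(estimate = mean(big.sample), t = t.big, p.value = p.big)
```

The true mean of 0.1 is small relative to the standard deviation of 2, so ten observations carry little information about it. The standard error, `sd/sqrt(n)`, shrinks as `n` grows, so with a very large sample even a small departure from $\mu = 0$ produces a very small p-value.
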
### Problem 3
Write a function that computes the p-value in a situation analogous to the finches example: we'd like to compare the means of two samples from normal distributions and compute the p-value for the null hypothesis that the two means are equal. You may only use `rnorm`. You cannot use `pt` or `t.test`.
You are encouraged to use the [lecture](http://guerzhoy.mycpanel.princeton.edu/201s19/lec/W08/lec1/pvalues2.html) as little as possible. Especially if you use the lecture, explain every line of code.
Compare the outputs of your function to the outputs of `t.test`.
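```{r}
t.pval <- function(sample1, sample2){
  # Under the null hypothesis both samples share one mean; estimate it from the pooled data
  s.mean <- mean(c(sample1, sample2))
  # Each sample keeps its own standard deviation and size
  sd1 <- sd(sample1)
  sd2 <- sd(sample2)
  n1 <- length(sample1)
  n2 <- length(sample2)
  # Observed difference between the two sample means
  act.diff <- mean(sample1) - mean(sample2)
  # Simulate the difference in means under the null 100000 times; the p-value is
  # the fraction of simulated differences at least as extreme as the observed one
  return(mean(
    replicate(100000,
              abs(mean(rnorm(n = n1, mean = s.mean, sd = sd1)) -
                  mean(rnorm(n = n2, mean = s.mean, sd = sd2))) >= abs(act.diff))))
}

library(Sleuth3)
library(tidyverse)
finches <- case0201
sample1 <- (finches %>% filter(Year == "1976"))$Depth
sample2 <- (finches %>% filter(Year == "1978"))$Depth
t.pval(sample1, sample2)
# Shifting sample1 changes the observed difference, and therefore the p-value
t.pval(sample1+0.3, sample2)
t.pval(sample1+0.4, sample2)
# Reference value from the built-in test for the last case
t.test(sample1+0.4, sample2)$p.value
```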
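
The comparison with `t.test` can also be tried without the `Sleuth3` data. The sketch below applies the same simulation idea to two synthetic samples; the samples, the function name `sim.pval`, and the `reps` argument are all hypothetical and chosen only for illustration.

```{r}
set.seed(1)
# Hypothetical synthetic samples (not the finch data)
x <- rnorm(n = 30, mean = 10.0, sd = 1)
y <- rnorm(n = 30, mean = 10.5, sd = 1)

# sim.pval is an illustrative name; rnorm is its only source of randomness
sim.pval <- function(a, b, reps = 20000){
  null.mean <- mean(c(a, b))            # common mean assumed under the null
  obs.diff <- abs(mean(a) - mean(b))    # observed absolute difference in means
  sim.diffs <- replicate(reps,
                         mean(rnorm(n = length(a), mean = null.mean, sd = sd(a))) -
                           mean(rnorm(n = length(b), mean = null.mean, sd = sd(b))))
  # Fraction of simulated differences at least as extreme as the observed one
  mean(abs(sim.diffs) >= obs.diff)
}

p.sim <- sim.pval(x, y)
p.welch <- t.test(x, y)$p.value
c(simulated = p.sim, welch = p.welch)
```

The two values should be close but not identical. The simulation treats the sample standard deviations as if they were the true ones, whereas `t.test` (Welch's test by default) accounts for the uncertainty in estimating them, and the simulated value also carries Monte Carlo error.
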
### Problem 4 (Take-home challenge)
The speed of light in vacuum is 299,792,458 m/s. The geographic coordinates of the Great Pyramid of Giza are 29.9792458N, 31.134658E. [How weird is that?](https://www.metabunk.org/debunked-the-great-pyramid-of-giza-and-the-speed-of-light.t2154/)
Estimate the probability of observing something as weird or weirder for a site of comparable importance to the Great Pyramid, assuming the null hypothesis that aliens did not undertake large-scale architectural projects on Earth.