The Hypothesis Testing Design Recipe

P-values via fake-data simulation

Identify the problem
- E.g., we are manufacturing spaghetti and want the average spaghetti stick to be 30 cm in length
- We have a vector x of length 20 that contains the measured lengths of 20 randomly chosen sticks.
Identify the data generating process.
- E.g., \(X_i\sim \mathcal{N}(\mu, \sigma^2)\)
- (Next week: check that the data conforms to the model)
Identify the Null Hypothesis
- The Null Hypothesis is the “default” belief you will hold unless there is evidence against it.
- E.g., \(\mu = 30\)
Make a function to generate data assuming the Null Hypothesis is true

FakeMsmt <- function(n.samples, mu, sigma){
  # Draw one fake dataset from the assumed model
  fake.x <- rnorm(n = n.samples, mean = mu, sd = sigma)
  # Return the summary statistic (here, the sample mean) rather than the raw data
  mean(fake.x)
}

Repeatedly generate data from the Null Hypothesis
- The number of samples in each fake experiment should be the same as in the original data. You may also need to estimate parameters such as \(\sigma\) from the data.
- fake.means <- replicate(n = 10000, FakeMsmt(n.samples = length(x), mu = 30, sigma = sd(x)))
Estimate how often the fake data that you generate is as extreme as or more extreme than what you actually observed
- p.val <- mean(abs(fake.means - 30) >= abs(mean(x) - 30))
A small p-value means that the data is not consistent with the Null Hypothesis: if the data were generated under the Null Hypothesis, you would rarely see observations as extreme as or more extreme than what you actually observed. In that case, you can reject the Null Hypothesis. A large p-value means that the data is consistent with the Null Hypothesis: what you actually observed is not extreme compared to what you would often see if the Null Hypothesis were true.
By convention, we reject the Null Hypothesis if the p-value is under 5%. If the p-value is over 5%, we say we do not have sufficient evidence against the Null Hypothesis. N.B., we never accept the Null Hypothesis.
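
The whole recipe can be sketched end to end as follows. This is only an illustration: the vector x here is made-up stand-in data (drawn with an assumed true mean of 30.3 and standard deviation of 0.5), not real measurements, and the seed is fixed only so that the sketch is reproducible.

```r
set.seed(1)  # fixed seed, only so the sketch is reproducible

# Hypothetical stand-in for the 20 measured stick lengths (cm);
# the mean 30.3 and sd 0.5 are assumed values for illustration
x <- rnorm(n = 20, mean = 30.3, sd = 0.5)

# Generate one fake experiment and return its sample mean
FakeMsmt <- function(n.samples, mu, sigma){
  fake.x <- rnorm(n = n.samples, mean = mu, sd = sigma)
  mean(fake.x)
}

# 10000 fake sample means generated under the Null Hypothesis mu = 30,
# with sigma estimated from the data
fake.means <- replicate(n = 10000,
                        FakeMsmt(n.samples = length(x), mu = 30, sigma = sd(x)))

# Fraction of fake means at least as far from 30 as the observed mean
p.val <- mean(abs(fake.means - 30) >= abs(mean(x) - 30))
p.val
```

Because p.val is itself estimated from random draws, it will change slightly from run to run unless the seed is fixed.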

P-values via sampling distributions of summary statistics

Identify the problem
- E.g., we are manufacturing spaghetti and want the average spaghetti stick to be 30 cm in length
- We have a vector x of length 20 that contains the measured lengths of 20 randomly chosen sticks.
Identify the data generating process.
- E.g., \(X_i\sim \mathcal{N}(\mu, \sigma^2)\)
- (Next week: check that the data conforms to the model)
Identify the Null Hypothesis
- The Null Hypothesis is the “default” belief you will hold unless there is evidence against it.
- E.g., \(\mu = 30\)
Identify a summary statistic of the data for which you know the sampling distribution
- E.g., \(\frac{\bar{X}-\mu}{s/\sqrt{n}}\sim t(n-1)\)
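
One way to convince yourself of this sampling distribution is to simulate it. The sketch below generates many fake datasets with \(\mu = 30\), computes the statistic for each, and compares the simulated quantiles with those of \(t(n-1)\). The helper name FakeTStat and the value sigma = 0.5 are assumptions made for illustration; the distribution of the statistic does not depend on \(\sigma\).

```r
set.seed(2)  # fixed seed, only so the sketch is reproducible
n <- 20

# Hypothetical helper: generate one fake dataset and return its t statistic
FakeTStat <- function(n.samples, mu, sigma){
  fake.x <- rnorm(n = n.samples, mean = mu, sd = sigma)
  (mean(fake.x) - mu)/(sd(fake.x)/sqrt(n.samples))
}

fake.t <- replicate(n = 10000, FakeTStat(n.samples = n, mu = 30, sigma = 0.5))

# Simulated quantiles next to the theoretical t(n - 1) quantiles
probs <- c(0.025, 0.25, 0.5, 0.75, 0.975)
rbind(simulated   = quantile(fake.t, probs),
      theoretical = qt(probs, df = n - 1))
```

The two rows should agree up to simulation noise.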
Compute the summary statistic
- x.bar <- mean(x); n <- length(x); mu <- 30
- t.stat <- (x.bar - mu)/(sd(x)/sqrt(n))
Compute the probability of observing a value of the summary statistic that’s as extreme as or more extreme than what you actually observed
- p.val <- 2*pt(-abs(t.stat), df = n - 1)
A small p-value means that the data is not consistent with the Null Hypothesis: if the data were generated under the Null Hypothesis, you would rarely see observations as extreme as or more extreme than what you actually observed. In that case, you can reject the Null Hypothesis. A large p-value means that the data is consistent with the Null Hypothesis: what you actually observed is not extreme compared to what you would often see if the Null Hypothesis were true.
By convention, we reject the Null Hypothesis if the p-value is under 5%. If the p-value is over 5%, we say we do not have sufficient evidence against the Null Hypothesis. N.B., we never accept the Null Hypothesis.
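
As a sketch of this second recipe end to end, the code below computes the p-value by hand and checks it against R's built-in t.test, which performs the same two-sided one-sample test. The vector x is again made-up stand-in data (assumed mean 30.3 and standard deviation 0.5), not real measurements.

```r
set.seed(1)  # fixed seed, only so the sketch is reproducible

# Hypothetical stand-in for the 20 measured stick lengths (cm)
x <- rnorm(n = 20, mean = 30.3, sd = 0.5)

n <- length(x)
mu <- 30                      # value of the mean under the Null Hypothesis
x.bar <- mean(x)

# t statistic and two-sided p-value from the t(n - 1) distribution
t.stat <- (x.bar - mu)/(sd(x)/sqrt(n))
p.val <- 2*pt(-abs(t.stat), df = n - 1)

# The built-in one-sample t-test should give the same p-value
p.builtin <- t.test(x, mu = mu)$p.value
c(by.hand = p.val, t.test = p.builtin)
```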