Problem 1

As we saw in lecture, when the null hypothesis is false, the probability of obtaining a low p-value depends on the size of the sample.

In this problem, we consider the probability of obtaining a low p-value when tossing a biased coin that comes up Heads 51% of the time.

Plot the probability of obtaining a p-value of 5% or less vs. the size of the sample for this situation. The null hypothesis should be that the coin is fair.

Use simulation to estimate this probability (as in lecture; you can reuse much of the lecture code together with sapply).

Hint

Recall this function from lecture:

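p.val.r <- function(n.tosses, prob){
  # Simulate one experiment: the number of Heads in n.tosses tosses
  s1 <- rbinom(n = 1, size = n.tosses, prob = prob)
  # Two-sided p-value under the fair-coin null: twice the lower-tail probability
  # of a count at least as far from n.tosses/2 as s1
  p.val <- 2 * pbinom(q = 0.5*n.tosses - abs(0.5*n.tosses - s1), size = n.tosses, prob = 0.5)
  return(p.val)
}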

You will want to call this function repeatedly and find the fraction of calls that return a value of 0.05 or less, for each number of tosses that you want to consider.
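
One possible way to organize the simulation is sketched below. It is a starting point under stated assumptions, not the lecture's solution: the helper name prob.low.p, the number of simulations (1000), the seed, and the grid of sample sizes are all illustrative choices. p.val.r is copied from the hint so that the block runs on its own.

```r
# Copied from the hint above so that this block is self-contained.
p.val.r <- function(n.tosses, prob){
  s1 <- rbinom(n = 1, size = n.tosses, prob = prob)
  p.val <- 2 * pbinom(q = 0.5*n.tosses - abs(0.5*n.tosses - s1), size = n.tosses, prob = 0.5)
  return(p.val)
}

# Hypothetical helper (not from lecture): the fraction of n.sims simulated
# experiments whose p-value is alpha or less.
prob.low.p <- function(n.tosses, prob, n.sims = 1000, alpha = 0.05){
  p.vals <- replicate(n.sims, p.val.r(n.tosses, prob))
  mean(p.vals <= alpha)
}

set.seed(1)  # arbitrary seed, only for reproducibility

# Assumed grid of sample sizes; widen or refine it as needed.
n.tosses.grid <- c(100, 1000, 5000, 10000, 20000, 50000)

# sapply passes each sample size as the first argument (n.tosses).
probs <- sapply(n.tosses.grid, prob.low.p, prob = 0.51)

plot(n.tosses.grid, probs, type = "b", log = "x", ylim = c(0, 1),
     xlab = "Number of tosses", ylab = "Estimated P(p-value <= 0.05)")
```

A logarithmic x-axis is used here only because the assumed grid spans several orders of magnitude. Increasing n.sims gives a smoother curve at the cost of run time.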

Problem 2

The probability of obtaining a low p-value also varies with how different the null hypothesis is from reality.

For 100 trials (i.e., the coin is tossed 100 times and we tally the number of Heads), plot the probability of obtaining a p-value of 5% or less vs. the probability of the coin’s coming up Heads. The null hypothesis should be that the coin is fair.
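
A minimal sketch of one approach follows. It uses the same two-sided p-value formula as p.val.r from Problem 1, but applies it to a whole vector of simulated Heads counts at once instead of calling the function repeatedly. The helper name prob.low.p.vec, the grid of Heads probabilities, the seed, and the number of simulations are assumptions made for illustration.

```r
set.seed(2)  # arbitrary seed, only for reproducibility

n.tosses <- 100
n.sims <- 1000                           # assumed number of simulated experiments
prob.grid <- seq(0.3, 0.7, by = 0.02)    # assumed grid of true Heads probabilities

# Hypothetical helper: estimate P(p-value <= alpha) for one true Heads probability.
prob.low.p.vec <- function(prob, alpha = 0.05){
  # n.sims independent experiments, each giving a count of Heads
  s <- rbinom(n = n.sims, size = n.tosses, prob = prob)
  # Same two-sided p-value as p.val.r, computed for all counts at once
  p.vals <- 2 * pbinom(q = 0.5*n.tosses - abs(0.5*n.tosses - s),
                       size = n.tosses, prob = 0.5)
  mean(p.vals <= alpha)
}

probs <- sapply(prob.grid, prob.low.p.vec)

plot(prob.grid, probs, type = "b", ylim = c(0, 1),
     xlab = "True probability of Heads", ylab = "Estimated P(p-value <= 0.05)")
```

The vectorized form works because rbinom and pbinom both accept vectors; looping with replicate over p.val.r would be an equally valid route.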

Problem 3

Now, consider comparing two samples from Gaussian distributions with unknown means and variances.

Plot the probability of obtaining a p-value of 5% or less vs. the difference between the means. You can set the standard deviations to 1.0, and assume the sample size is 100. The null hypothesis should be that there is no difference between the true means.

You can use much of the code from the Finches lecture.

Note

As mentioned in lecture, you can use t.test(s1, s2)$p.value to obtain the p-value in this situation.
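
The same simulation pattern can be sketched for the two-sample case. This is an outline under assumptions, not the code from the Finches lecture: the helper name prob.low.p.t, the grid of mean differences, the seed, the number of simulations, and the reading that each of the two samples has 100 observations are all illustrative choices. Note that t.test(s1, s2) with default arguments performs Welch's test, which does not assume equal variances.

```r
set.seed(3)  # arbitrary seed, only for reproducibility

n <- 100                             # assumed: 100 observations in each sample
n.sims <- 1000                       # assumed number of simulated experiments
diff.grid <- seq(0, 1, by = 0.1)     # assumed grid of true mean differences

# Hypothetical helper: estimate P(p-value <= alpha) for one true difference delta.
prob.low.p.t <- function(delta, alpha = 0.05){
  p.vals <- replicate(n.sims, {
    s1 <- rnorm(n, mean = 0, sd = 1)
    s2 <- rnorm(n, mean = delta, sd = 1)
    t.test(s1, s2)$p.value
  })
  mean(p.vals <= alpha)
}

probs <- sapply(diff.grid, prob.low.p.t)

plot(diff.grid, probs, type = "b", ylim = c(0, 1),
     xlab = "True difference between means", ylab = "Estimated P(p-value <= 0.05)")
```

At a difference of 0 the null hypothesis is true, so the estimate there should sit near 0.05, which makes a convenient sanity check on the simulation.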