Problem 1: Can Vox staffers identify the most expensive wine?

Wine is an alcoholic beverage made by fermenting grape juice that some of you might try once you reach the age of 21. The journalism outlet Vox asked 19 staffers to taste three different wines, identify the most expensive one, and rate each wine.

The video reports that “almost half” (say 9) of the staffers identified the most expensive wine correctly. If we want to know whether people can tell expensive wine from cheap wine, what is the null hypothesis?

Compute the p-value under the null hypothesis using pbinom, and draw a conclusion about whether we have evidence that Vox staffers can (sometimes) tell expensive wine from cheap wine.
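One possible way to set up the pbinom call is sketched below. It assumes that the null hypothesis is pure guessing among the three wines (so each staffer is correct with probability 1/3) and that a one-sided test is wanted; both choices are assumptions made for illustration, as are the variable names.

```r
# Sketch of a one-sided binomial test (assumption: under the null,
# each staffer picks the most expensive wine with probability 1/3).
n_tasters <- 19    # number of staffers, from the problem statement
n_correct <- 9     # "almost half" identified the wine correctly
p_guess   <- 1 / 3 # assumed chance of a correct guess under the null

# P(X >= 9) for X ~ Binomial(19, 1/3).
# pbinom(q, ..., lower.tail = FALSE) returns P(X > q), so pass q = 8.
p_value <- pbinom(n_correct - 1, size = n_tasters, prob = p_guess,
                  lower.tail = FALSE)
print(p_value)
```

Note the off-by-one: passing `n_correct` instead of `n_correct - 1` would compute P(X > 9) and leave out the observed outcome itself.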

Just for fun: the experiment here is reminiscent of the classical lady tasting tea experiment. (You do not need to use Fisher’s exact test, though.)

Problem 2: Different Ratings

The staffers also rated the wines on a 1-10 scale. The average ratings were:

Wine A ($8): 4.8/10
Wine B ($43): 4.8/10
Wine C ($14): 5.4/10

Suppose that your null hypothesis was that Wine B and Wine C would, on average, be rated the same. Test that hypothesis.

Note that you need to make an assumption about \(\sigma\), the standard deviation of an individual staffer's rating. Make a reasonable assumption, and explain why it is reasonable. Then explore how the p-value changes when you set \(\sigma\) to other reasonable values.

Hint: as a rule of thumb, about 95% of measurements fall within the interval \([\mu - 2\sigma, \mu + 2\sigma]\). That means that \(\sigma > 3\) is likely not reasonable: with average ratings near 5, it would imply a lot of ratings outside the 1-10 range.
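As a rough sketch of how the test and the sensitivity check could be organized, the code below computes a two-sided z-test p-value for several values of \(\sigma\). It assumes that all 19 staffers rated both wines, that the two sets of ratings can be treated as independent with the same known \(\sigma\), and that the normal approximation is adequate. The function name `p_value_for_sigma` and the candidate values of \(\sigma\) are illustrative choices, not part of the assignment.

```r
# Sketch of a two-sample z-test with an assumed, known sigma.
# Assumptions: 19 independent ratings per wine, same sigma for both wines.
n_tasters     <- 19
observed_diff <- 5.4 - 4.8  # Wine C average minus Wine B average

p_value_for_sigma <- function(sigma, n = n_tasters, mean_diff = observed_diff) {
  # Standard error of the difference between two independent sample means
  se <- sigma * sqrt(2 / n)
  z  <- mean_diff / se
  # Two-sided p-value from the standard normal distribution
  2 * pnorm(-abs(z))
}

# Candidate values of sigma to explore (illustrative only)
sigmas   <- c(1, 1.5, 2, 2.5)
p_values <- sapply(sigmas, p_value_for_sigma)
print(data.frame(sigma = sigmas, p_value = p_values))
```

Since the same staffers rated both wines, a paired analysis could also be argued for; the independent-samples form is used here only because the per-staffer ratings are not given.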

Problem 3: Are any of the ratings different?

Note that we must have a null hypothesis before conducting the experiment. It is inappropriate to first look at the data and then test whether Wine B is different from Wine C. That is because, just due to random chance, some pair of wines will have noticeably different ratings if you are rating a lot of wines.

The appropriate null hypothesis before conducting any tests is that all wines are the same.

Use rnorm to test this hypothesis. That is, find the probability that the range of the average ratings for the three wines (the largest average minus the smallest) is greater than or equal to \(|5.4-4.8| = 0.6\).
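A minimal simulation sketch of this idea follows. It assumes 19 ratings per wine, a common true mean of 5, and \(\sigma = 2\); all three values are assumptions to be varied, and the helper name `simulate_range` is hypothetical. The simulated ratings are not clipped to the 1-10 scale, which is a simplification.

```r
# Sketch: simulate the range of three average ratings when all wines
# are truly the same. mu, sigma and n_sims are assumed values.
set.seed(1)
n_tasters <- 19
mu        <- 5      # assumed common true mean rating
sigma     <- 2      # assumed standard deviation of a single rating
n_sims    <- 10000  # number of simulated experiments

simulate_range <- function() {
  # One column of simulated ratings per wine, one row per taster
  ratings <- matrix(rnorm(3 * n_tasters, mean = mu, sd = sigma),
                    nrow = n_tasters, ncol = 3)
  wine_means <- colMeans(ratings)
  max(wine_means) - min(wine_means)
}

sim_ranges <- replicate(n_sims, simulate_range())

# Estimated probability of a range at least as large as the observed 0.6
p_value <- mean(sim_ranges >= 0.6)
print(p_value)
```

The result does not depend on the value chosen for `mu`, because the range of the averages is unchanged when every rating is shifted by the same amount; it does depend on `sigma`.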

Problem 4 (challenge): Sample size needed to detect a difference of 0.6

Use rnorm to determine the number of tasters needed in order to (usually) find a significant difference between two of the wines, if the true difference between the ratings is 0.6.
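One way to approach this is to estimate, for several candidate numbers of tasters, how often a simulated experiment yields a significant result. The sketch below assumes \(\sigma = 2\), a significance level of 0.05, a two-sided z-test with known \(\sigma\), and independent ratings for the two wines. The candidate sample sizes and the helper name `estimate_power` are illustrative, and what counts as "usually" (for example, 80% of the time) is left as a choice.

```r
# Sketch: estimate how often a true difference of 0.6 is detected,
# as a function of the number of tasters. sigma and alpha are assumed.
set.seed(1)
sigma     <- 2     # assumed standard deviation of a single rating
true_diff <- 0.6   # true difference between the two wines' mean ratings
alpha     <- 0.05  # assumed significance level
n_sims    <- 2000  # simulated experiments per sample size

estimate_power <- function(n) {
  rejections <- replicate(n_sims, {
    wine_b <- rnorm(n, mean = 4.8, sd = sigma)
    wine_c <- rnorm(n, mean = 4.8 + true_diff, sd = sigma)
    # z statistic for the difference in sample means with known sigma
    z <- (mean(wine_c) - mean(wine_b)) / (sigma * sqrt(2 / n))
    2 * pnorm(-abs(z)) < alpha
  })
  # Fraction of simulated experiments that reject the null
  mean(rejections)
}

candidate_n <- c(19, 50, 100, 200, 400)
power_by_n  <- sapply(candidate_n, estimate_power)
print(data.frame(n = candidate_n, power = power_by_n))
```

Once the table shows roughly where the detection rate crosses the chosen threshold, the grid of candidate sizes can be refined around that point.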