Problem Set 2 - Post-mortem
===========================

Problem set 2 was marked out of 100. The 100 marks were allocated as follows:

1. 10 marks
2. 15 marks
3. 25 marks
4. 25 marks
5. 10 marks
6. 15 marks

In marking each question, marks were not deducted for specific steps; instead, the whole solution was judged for completeness, clarity and correctness. A brief outline of a "correct answer" for each question and a list of the most common mistakes are presented below.

In general, the solutions reported too many statistics that were calculated but never used. Unused results should not have been reported; however, no marks were deducted for that.

1. Answer:
   - Sort the data and normalize it by dividing by 100.
   - Calculate D-, D+ and D.
   - Get D(alpha) from tables and compare it with D.
   - Decide, based on the comparison, that H0 cannot be rejected.

   The answer had to clearly state the value assumed for alpha, the level of significance.

   Common mistake:
   - Assumption for alpha not clearly stated.

2.a. Answer:
   - Choose K based on the sample size (value between 5 and 10).
   - Calculate the end points of the intervals (the formula had to be there).
   - Sort the data and find the observed frequencies.
   - Calculate the expected frequencies.
   - Calculate Xo^2.
   - Calculate the degrees of freedom based on K and the number of estimated parameters (2 in this case).
   - Get the value of X^2 from tables.
   - Compare Xo^2 and X^2 and decide NOT TO REJECT H0.

   Common mistakes:
   - Formula for calculating the intervals missing.
   - Assumption for alpha not clearly stated.

2.b. Answer:
   - Calculate the NEW end points of the intervals (the formula had to be there as well).
   - Find the NEW observed frequencies.
   - Calculate the expected frequencies (if K has changed).
   - Calculate Xo^2.
   - Calculate the degrees of freedom based on K only (no estimated parameters).
   - Get the value of X^2 from tables.
   - Compare Xo^2 and X^2 and decide to REJECT H0.

   Common mistakes:
   - Formula for calculating the intervals missing.
   - Wrong value for the degrees of freedom.
   - Assumption for alpha not clearly stated.
   - H0 and H1 and/or final decision incorrectly stated for 2.b.

3. Answer:
   - Calculate the sample statistics (the formulas used had to be there).
   - Based on the calculated statistics, choose 2-4 different distributions to consider for testing, and comment on the choice and assumptions.
   - For each distribution, estimate the MLE parameters (the formulas had to be there even if a statistical package was used).
   - Draw histograms, Q-Q plots and box plots for each distribution.
   - Reduce the number of options (if possible) by comparing the previous graphs.
   - Test the remaining distributions using the K-S test and/or the X-square test.
   - Clearly state the decision made and comment on it.

   Common mistakes:
   - Unclear, insufficient or incorrect justification for the distributions considered for testing.
   - Q-Q plot missing.
   - Formulas missing for the statistics calculations.

4. Answer:
   - Calculate the sample statistics (the formulas used had to be there).
   - Based on the calculated statistics, choose different distributions to test, and comment on the choice and assumptions.
   - For each distribution, estimate the MLE parameters (the formulas had to be there even if a statistical package was used).
   - Draw histograms, Q-Q plots and box plots for each distribution. (A histogram alone is not enough.)
   - Reduce the number of options (if possible) by comparing the previous graphs.
   - Test the remaining distributions using the K-S test or the X-square test.
   - Clearly state the decision made and comment on it.

   Common mistakes:
   - Unclear, insufficient or incorrect justification for the distributions considered for testing.
   - Q-Q plot missing.
   - Formulas missing for the statistics calculations.

5.a. Answer:
   - Plot Planning time vs. Milling time.
   - Comment on the dependency between the two (clearly dependent).

5.b. Answer:
   - Calculate the sample correlation (the formulas used and the calculations had to be there).
   - Clearly state that the data are positively correlated.

   Common mistake:
   - Formulas missing.

6.a. Answer:
   - Calculate the sample mean and variance.
   - Draw a Q-Q plot for each of the normal and exponential distributions.
   - State clearly that the normal distribution represents a better fit, and why.

   Common mistakes:
   - Formulas missing.
   - Q-Q plot drawn for only one of the two distributions.
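
As an illustration of the steps outlined for question 1, here is a minimal Python sketch of the D+, D- and D calculation. It assumes, since the outline does not say so explicitly, that the hypothesized distribution is uniform on (0, 1) once the data have been divided by 100. The function names, the sample values and the way the table value is passed in are all hypothetical; D(alpha) still has to be read from a K-S table for the chosen alpha and sample size.

```python
def ks_uniform(values, scale=100.0):
    """Return (D_plus, D_minus, D) against Uniform(0, 1) (assumed H0)."""
    # Sort the data and normalize it by dividing by `scale`.
    r = sorted(v / scale for v in values)
    n = len(r)
    # D+ = max(i/n - R(i)),  D- = max(R(i) - (i-1)/n),  D = max(D+, D-)
    d_plus = max(i / n - x for i, x in enumerate(r, start=1))
    d_minus = max(x - (i - 1) / n for i, x in enumerate(r, start=1))
    return d_plus, d_minus, max(d_plus, d_minus)


def reject_h0(d, d_alpha):
    """H0 is rejected only when D exceeds the tabulated D(alpha)."""
    return d > d_alpha


if __name__ == "__main__":
    # Hypothetical data, not the problem-set sample.
    sample = [44, 81, 14, 5, 93]
    d_plus, d_minus, d = ks_uniform(sample)
    # Table value for n = 5 at alpha = 0.05; alpha must be stated explicitly.
    d_alpha = 0.565
    print(f"D+ = {d_plus:.2f}, D- = {d_minus:.2f}, D = {d:.2f}")
    print("reject H0" if reject_h0(d, d_alpha) else "cannot reject H0")
```

Keeping alpha and the table value as explicit inputs mirrors the marking point above: the assumed level of significance has to be visible in the answer.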
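
For question 2, the difference between parts a and b lies mainly in the degrees of freedom, which was a common source of error. The sketch below is one possible way to write the two calculations, assuming the usual rule that the degrees of freedom equal K minus the number of estimated parameters minus 1, and assuming equal-probability intervals so that every expected frequency is n/K. The function names and the observed counts are hypothetical, and the value of X^2 still has to come from tables.

```python
def chi_square_statistic(observed, expected):
    """Xo^2 = sum over intervals of (O_i - E_i)^2 / E_i."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def degrees_of_freedom(k, estimated_params):
    """K intervals, minus the estimated parameters, minus 1 (assumed rule)."""
    return k - estimated_params - 1


if __name__ == "__main__":
    # Hypothetical observed frequencies for K = 8 intervals.
    observed = [6, 9, 4, 8, 7, 5, 6, 5]
    n, k = sum(observed), len(observed)
    # Equal-probability intervals (an assumption): each E_i is n / K.
    expected = [n / k] * k
    x0_sq = chi_square_statistic(observed, expected)
    print(f"Xo^2 = {x0_sq:.3f}")
    # 2.a style: two parameters estimated from the data.
    print("df with 2 estimated parameters:", degrees_of_freedom(k, 2))
    # 2.b style: parameters given, none estimated.
    print("df with no estimated parameters:", degrees_of_freedom(k, 0))
```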
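
The comparison asked for in 6.a can also be checked numerically. The sketch below builds the Q-Q points for a fitted normal and a fitted exponential distribution and uses the correlation of each set of points as a rough measure of how straight the plot is. This is only an aid and does not replace the drawn plots. The plotting position (i - 0.5)/n, the function names and the sample values are assumptions made for illustration, and only the Python standard library is used.

```python
import math
from statistics import NormalDist, mean, stdev


def pearson_r(xs, ys):
    """Sample correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def qq_points(sample, inv_cdf):
    """Pair each ordered observation with the fitted quantile at (i - 0.5)/n."""
    ordered = sorted(sample)
    n = len(ordered)
    return [(inv_cdf((i - 0.5) / n), x) for i, x in enumerate(ordered, start=1)]


def compare_normal_exponential(sample):
    """Return (r_normal, r_exponential), one correlation per Q-Q plot."""
    m, s = mean(sample), stdev(sample)
    candidates = {
        "normal": NormalDist(m, s).inv_cdf,
        # Exponential with mean m: F^-1(p) = -m * ln(1 - p)
        "exponential": lambda p: -m * math.log(1.0 - p),
    }
    r = {}
    for name, inv_cdf in candidates.items():
        theoretical, observed = zip(*qq_points(sample, inv_cdf))
        r[name] = pearson_r(theoretical, observed)
    return r["normal"], r["exponential"]


if __name__ == "__main__":
    # Hypothetical data, not the problem-set sample.
    data = [7.5, 8.1, 9.4, 9.9, 10.2, 10.6, 11.0, 11.3, 12.0, 12.8]
    r_norm, r_exp = compare_normal_exponential(data)
    print(f"normal Q-Q r = {r_norm:.3f}, exponential Q-Q r = {r_exp:.3f}")
```

A value of r closer to 1 means the points lie closer to a straight line. The points from `qq_points` can be passed to any plotting tool to draw the two Q-Q plots themselves.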