TEST BANK
For
STATS: DATA AND MODELS
FIFTH EDITION
By Richard De Veaux, Paul Velleman, and David Bock (Global Edition)
Contents
Chapter 1 Stats Starts Here
Chapter 2 Displaying and Describing Data
Chapter 3 Relationships Between Categorical Variables—Contingency Tables
Chapter 4 Understanding and Comparing Distributions
Chapter 5 The Standard Deviation as Ruler and the Normal Model
Chapter 6 Scatterplots, Association, and Correlation
Chapter 7 Linear Regression
Chapter 8 Regression Wisdom
Chapter 9 Multiple Regression
Chapter 10 Sample Surveys
Chapter 11 Experiments and Observational Studies
Chapter 12 From Randomness to Probability
Chapter 13 Probability Rules!
Chapter 14 Random Variables
Chapter 15 Probability Models
Chapter 16 Sampling Distribution Models and Confidence Intervals for Proportions
Chapter 17 Confidence Intervals for Means
Chapter 18 Testing Hypotheses
Chapter 19 More About Tests and Intervals
Chapter 20 Comparing Groups
Chapter 21 Paired Samples and Blocks
Chapter 22 Comparing Counts
Chapter 23 Inferences for Regression
Chapter 24 Multiple Regression Wisdom
Chapter 25 Analysis of Variance
Chapter 26 Multifactor Analysis of Variance
Chapter 25 Analysis of Variance
What’s It About?
We looked at the equality of means for two groups in Chapter 20. Here, in Chapter 25, we extend our concern to more than two groups. Another way of looking at this question is to think of it as the association between a quantitative and a categorical variable. In Chapter 20, we saw that the ratio of the difference of two means to its standard error was the natural statistic for testing the null hypothesis, and that the t-model gave us its sampling distribution. Now we have the generalization of that ratio: the ratio of two mean squares, modeled by the F-distribution. We present step-by-step examples of hypothesis tests. If the null hypothesis is rejected, we may want to go further and examine confidence intervals for pairwise differences. We do this using the Bonferroni adjustment.
Comments
Chapter 22 introduced the Chi-square test for homogeneity, testing whether proportions are the same across different groups. It can be useful to point out that the ANOVA test generalizes the t-test for means in much the same way that the Chi-square test generalizes the z-test for proportions. The Chi-square statistic rejects the null hypothesis of equal proportions when the sum of squared differences between expected and observed counts becomes too large. Similarly, the F-statistic rejects the null when the variance of the group means is too large. The connection is that the numerator of that variance is the sum of squared differences of the group means from the grand average.
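For a concrete classroom demonstration of that parallel, here is a minimal Python sketch (the two samples are simulated purely for illustration): with exactly two groups, the one-way ANOVA F-statistic is exactly the square of the pooled two-sample t-statistic, and the two P-values agree.

```python
# Hypothetical two-group data, simulated only to illustrate that
# one-way ANOVA on two groups reproduces the pooled t-test.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
a = rng.normal(10, 2, size=25)
b = rng.normal(12, 2, size=25)

# Pooled two-sample t-test (equal variances, as ANOVA assumes).
t, p_t = stats.ttest_ind(a, b, equal_var=True)

# One-way ANOVA on the same two groups.
f, p_f = stats.f_oneway(a, b)

print(t**2, f)    # identical: F = t^2 when there are two groups
print(p_t, p_f)   # identical P-values
```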
ANOVA becomes difficult when too much time is spent on the formulas themselves. Students will rarely need to calculate these quantities by hand. It is more important to emphasize that the F-statistic measures the ratio of the variation among the group means to the (average) variation within each group. Students intuitively know that large differences lead to rejecting the null hypothesis, and that differences are easier to discern when the variation is smaller. Students also need to understand that the variation within the groups is pooled across all the groups, so that variation needs to be similar from group to group for the pooling to make sense. The equal variance assumption is easily checked with side-by-side boxplots or a plot of residuals against predicted values. They've used residual plots to check conditions for regression, so this check is natural.
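For instructors who do want students to see where the ratio comes from, without drilling the formulas, a short sketch with made-up numbers can make the among-versus-within idea concrete; the three groups below are hypothetical.

```python
# A sketch of the F-ratio computed from its two mean squares,
# using three illustrative (made-up) groups.
import numpy as np
from scipy import stats

groups = [np.array([23., 25., 27., 22.]),
          np.array([30., 28., 31., 29.]),
          np.array([26., 24., 25., 27.])]

k = len(groups)                          # number of groups
n = sum(len(g) for g in groups)          # total observations
grand = np.concatenate(groups).mean()    # grand mean

# Treatment mean square: variation AMONG the group means.
sst = sum(len(g) * (g.mean() - grand)**2 for g in groups)
mst = sst / (k - 1)

# Error mean square: (pooled) variation WITHIN the groups.
sse = sum(((g - g.mean())**2).sum() for g in groups)
mse = sse / (n - k)

f = mst / mse
p = stats.f.sf(f, k - 1, n - k)          # upper-tail F probability
print(f, p)

# The same answer from SciPy's one-way ANOVA:
print(stats.f_oneway(*groups))
```

Computing the two mean squares separately and then taking their ratio mirrors exactly what the ANOVA table reports.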
Looking Ahead
In Chapter 26, we'll extend the model to more than one categorical factor, making it possible to analyze the two-factor designs we discussed in Chapter 11 and continuing to increase our model complexity. Taken together with the multiple regression of Chapters 9 and 24, these methods give us models whose predictor variables can be all quantitative (multiple regression), all categorical (analysis of variance), or a mix of quantitative and categorical variables (multiple regression with indicator variables, or analysis of covariance, depending on your point of view).
Class Do’s
Start with a review of the t-test and when it rejects the null hypothesis of equal means. Use boxplots of various situations to ask students what their intuition says about equality of means with more than two groups. (Watch out for extremely large or small group sample sizes where the intuition may not match well.)
Start right in with data. Show the boxplots and ask students whether they think there is evidence of differences in group means. Ask whether the assumptions and conditions of ANOVA are satisfied:
• We believe that all the groups have the same standard deviation. This is the same "homoskedasticity" assumption as in regression. It is pretty easy to believe for randomized experiments, because when we randomly assign subjects to treatments, the resulting treatment groups start out with the same underlying standard deviation. So we only need to assume that (under the null hypothesis of no difference in treatment effect) treatments that didn't change the means also didn't change the standard deviations. The assumption is crucial here because the MSE needs to make sense. It is dangerous to average essentially different quantities (we learned that in the Simpson's paradox examples of Chapter 2). To check the corresponding condition, we can look at group boxplots and/or check the scatterplot of residuals vs. predicted values for uniform spread.
• We believe that the errors (the differences of the observations from their group means) are independent. (We need to think about how the data were collected. If the data come from an experiment, were the treatments appropriately randomized? If from an observational study, were the respondents chosen at random?)
• We also believe that the errors are normally distributed. (Check a normal probability plot of the residuals. A sketch of all three graphical checks in software follows this list.)
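Here is one way to produce all three graphical checks at once. This is a sketch assuming Python with NumPy, Matplotlib, and SciPy available; the three groups are the same made-up numbers as in the earlier example.

```python
# A sketch of the graphical condition checks for one-way ANOVA,
# assuming the groups are stored as a list of NumPy arrays.
import numpy as np
import matplotlib.pyplot as plt
from scipy import stats

groups = [np.array([23., 25., 27., 22.]),
          np.array([30., 28., 31., 29.]),
          np.array([26., 24., 25., 27.])]

fitted = np.concatenate([np.full(len(g), g.mean()) for g in groups])
resid = np.concatenate(groups) - fitted

fig, axes = plt.subplots(1, 3, figsize=(12, 4))

# Equal variance: side-by-side boxplots should show similar spreads.
axes[0].boxplot(groups)
axes[0].set_title("Group boxplots")

# Equal variance again: residuals vs. predicted should have uniform spread.
axes[1].scatter(fitted, resid)
axes[1].axhline(0)
axes[1].set_title("Residuals vs. predicted")

# Nearly Normal: the normal probability plot should be roughly straight.
stats.probplot(resid, plot=axes[2])
axes[2].set_title("Normal probability plot")

plt.tight_layout()
plt.show()
```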
Examine the ANOVA table. Without getting too caught up in the sums of squares and degrees of freedom, look at the two mean squares. The error mean square estimates the (average) error variance across the groups. So, its square root (sp) should give a reasonable estimate of the spread of the boxplots. Does it? Now, if the group means are equal, the treatment mean square should be about the same size as the error mean square. Is it? If it’s a lot larger, the F-test will reject the null hypothesis, and the larger the ratio, the more evidence there is against the null hypothesis. That’s the F-test in a nutshell.
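One way to put such a table on the screen is sketched below, assuming the pandas and statsmodels packages are available; the data frame is hypothetical (the same made-up numbers as above). The square root of the error mean square (sp) is the number to compare against the spread of the boxplots.

```python
# A sketch of reading an ANOVA table from software output,
# here via statsmodels; the data frame is hypothetical.
import numpy as np
import pandas as pd
import statsmodels.api as sm
from statsmodels.formula.api import ols

df = pd.DataFrame({
    "y":     [23, 25, 27, 22, 30, 28, 31, 29, 26, 24, 25, 27],
    "group": ["A"]*4 + ["B"]*4 + ["C"]*4,
})

model = ols("y ~ C(group)", data=df).fit()
table = sm.stats.anova_lm(model)     # df, SS, MS, F, and P-value columns
print(table)

mse = model.mse_resid                # the error mean square
sp = np.sqrt(mse)                    # should match the spread of the boxplots
print(sp)
```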
Be careful to say what rejecting the null hypothesis doesn’t say. We don’t know which means are different. For that we need to do multiple comparisons.
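A sketch of Bonferroni-adjusted confidence intervals for the pairwise differences, continuing the same made-up three-group example. The pooled sp and the N − k degrees of freedom follow the chapter's approach; the 0.05 family confidence level is an assumption chosen for illustration.

```python
# Bonferroni-adjusted confidence intervals for all pairwise differences
# of group means, using the pooled standard deviation from the ANOVA.
from itertools import combinations
import numpy as np
from scipy import stats

groups = {"A": np.array([23., 25., 27., 22.]),
          "B": np.array([30., 28., 31., 29.]),
          "C": np.array([26., 24., 25., 27.])}

k = len(groups)
n = sum(len(g) for g in groups.values())
mse = sum(((g - g.mean())**2).sum() for g in groups.values()) / (n - k)
sp = np.sqrt(mse)                    # pooled standard deviation

pairs = list(combinations(groups, 2))
alpha = 0.05 / len(pairs)            # Bonferroni: split alpha over the pairs
tstar = stats.t.ppf(1 - alpha / 2, n - k)

for name1, name2 in pairs:
    g1, g2 = groups[name1], groups[name2]
    diff = g1.mean() - g2.mean()
    me = tstar * sp * np.sqrt(1/len(g1) + 1/len(g2))
    print(f"{name1} - {name2}: {diff:.2f} +/- {me:.2f}")
```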
The Importance of What You Don’t Say
Don’t dwell on the sums of squares formulas for getting to the mean squares. Students need to understand the concepts and be able to interpret ANOVA output from software packages or a calculator. Don’t distract them from these central issues with scary formulas. If they can conduct a hypothesis test using software or a calculator to do the mechanics, they will be fine.
ANOVA on the TI
It's very easy to do analysis of variance with the TI calculator. Simply enter the data for each group in a separate list (say L1, L2, and L3), then choose ANOVA from the STAT TESTS menu. When you execute the command ANOVA(L1, L2, L3), the TI will report the F-statistic, the P-value, and information about the degrees of freedom, SS, and MS.