Boxplot - ANSWER A graph of the five-number summary: min, Q1, median, Q3, max
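The five numbers a boxplot draws can be computed directly. This is a minimal sketch using Python's standard library; the data are made up, the helper name `five_number_summary` is hypothetical, and the "inclusive" quartile method is only one of several conventions, so a textbook or calculator may give slightly different Q1 and Q3.

```python
import statistics

def five_number_summary(data):
    """Return (min, Q1, median, Q3, max) for a list of numbers."""
    ordered = sorted(data)
    # quantiles(n=4) returns the three cut points that split the data
    # into quarters: Q1, the median, and Q3.
    q1, median, q3 = statistics.quantiles(ordered, n=4, method="inclusive")
    return ordered[0], q1, median, q3, ordered[-1]

scores = [2, 4, 4, 5, 7, 8, 9, 10, 12]  # made-up data
print(five_number_summary(scores))
```

The box in a boxplot runs from Q1 to Q3, with a line at the median; the whiskers reach toward the min and max.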
Histogram - ANSWER A graph made of bars, where the height of each bar shows the frequency of the data within that interval
Skewed right - ANSWER Not a symmetric distribution; the tail is on the right, i.e. a few unusually large values stretch out the right side
Measures of center - ANSWER Median, the mean (and mode)
Measures of spread - ANSWER Range, IQR & standard deviation
Standard Deviation Rule - ANSWER For roughly bell-shaped (normal) data, about 68% of the data are within 1 standard deviation of the mean, 95% are within 2, and 99.7% are within 3.
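The percentages in this rule come from the normal curve, and they can be checked with a short sketch. The helper name `share_within` is hypothetical; `NormalDist` is part of Python's standard `statistics` module.

```python
from statistics import NormalDist

def share_within(k):
    """Fraction of a normal distribution within k standard deviations of the mean."""
    z = NormalDist()  # standard normal: mean 0, standard deviation 1
    return z.cdf(k) - z.cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} SD: {share_within(k):.1%}")
# within 1 SD: 68.3%
# within 2 SD: 95.4%
# within 3 SD: 99.7%
```

The rule rounds these to 68%, 95% and 99.7%. It only applies when the data are roughly bell-shaped.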
For skewed data, use these for center and spread - ANSWER In this situation, we use median (for center) & IQR (for spread)
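A small made-up example suggests why median and IQR are preferred for skewed data: a single very large value pulls the mean and standard deviation a long way, while the median and IQR barely notice it. The data and variable names below are assumptions for illustration only.

```python
import statistics

# Made-up right-skewed data: most values are small, one is very large.
incomes = [30, 32, 35, 36, 38, 40, 41, 45, 400]

mean = statistics.mean(incomes)
median = statistics.median(incomes)
sd = statistics.stdev(incomes)

# IQR = Q3 - Q1, using the "inclusive" quartile convention.
q1, _, q3 = statistics.quantiles(incomes, n=4, method="inclusive")
iqr = q3 - q1

print(f"mean = {mean:.1f}, median = {median}")
print(f"standard deviation = {sd:.1f}, IQR = {iqr}")
```

Here the mean is about 77.4, larger than all but one of the values, while the median is 38. The IQR is 6, but the standard deviation is far larger because of the single outlier.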
Explanatory variable - ANSWER In a study, what we think is the "cause"
Response variable - ANSWER In a study, what we think is the "effect"
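Scatter plot - ANSWER A graphical representation of Q -> Q (quantitative explanatory variable, quantitative response variable)
Two-way table - ANSWER A table that summarizes C -> C (categorical explanatory variable, categorical response variable)
Side-by-side boxplots - ANSWER A graphical representation of C -> Q (categorical explanatory variable, quantitative response variable)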
Linear relationship - ANSWER "Shaped like a line": the points on the scatter plot roughly follow a straight line
Correlation coefficient, r - ANSWER Between -1 and 1; measures how close the points are to the line and if the trend is uphill (positive) or downhill (negative).
r = -0.2, for example - ANSWER This is an example of a correlation coefficient that represents a weak negative correlation.
r = 0.9, for example - ANSWER This is an example of a correlation coefficient that represents a strong positive correlation.
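To see where a value of r comes from, here is a minimal sketch that computes the correlation coefficient from its usual formula. The function name `pearson_r` and both data sets are made up for illustration.

```python
import math

def pearson_r(xs, ys):
    """Correlation coefficient r for paired lists xs and ys."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # How x and y vary together, and how much each varies on its own.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

xs = [1, 2, 3, 4, 5]
uphill = [2, 4, 5, 4, 5]     # made-up data with a loose upward trend
downhill = [10, 8, 6, 4, 2]  # made-up data falling exactly on a line

print(round(pearson_r(xs, uphill), 2))
print(round(pearson_r(xs, downhill), 2))
```

The first r is about 0.77 (positive, fairly strong). The second is -1, because the points fall exactly on a downhill line.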
Linear regression line - ANSWER A line that fits the data as closely as possible, used to make predictions
Interpolation - ANSWER Making predictions *within* the range of your data. This is usually reliable.
Extrapolation - ANSWER Making predictions *outside* of the range of your data. This is generally a bad idea.
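The regression line and the difference between interpolation and extrapolation can be sketched in a few lines. This assumes the usual least-squares line; the function names `fit_line` and `predict` and the hours/scores data are made up for illustration.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for paired lists xs and ys."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def predict(x, slope, intercept, x_min, x_max):
    """Predict y at x and say whether x is inside the observed range."""
    kind = "interpolation" if x_min <= x <= x_max else "extrapolation"
    return slope * x + intercept, kind

hours = [1, 2, 3, 4, 5]        # made-up study hours
scores = [52, 55, 61, 64, 68]  # made-up exam scores out of 100

slope, intercept = fit_line(hours, scores)
for x in (3.5, 20):
    y, kind = predict(x, slope, intercept, min(hours), max(hours))
    print(f"x = {x}: predicted score {y:.2f} ({kind})")
```

At 3.5 hours, inside the data, the line predicts about 62, which is plausible. At 20 hours, far outside the data, it predicts about 130 on a 100-point exam, which shows why extrapolation is risky.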
Simpson's Paradox - ANSWER A pattern that shows up in each group when the data are split up can go away, or even reverse, when all the data are combined.
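Simpson's paradox is easiest to see with counts. In this sketch the numbers are illustrative, and the treatment and severity labels are assumptions chosen for the example: treatment A has the better success rate within each severity group, yet treatment B looks better once the groups are combined.

```python
# Illustrative counts: (successes, patients) for two treatments,
# split by a lurking variable (case severity).
counts = {
    "A": {"mild": (81, 87), "severe": (192, 263)},
    "B": {"mild": (234, 270), "severe": (55, 80)},
}

def rate(successes, total):
    """Success rate as a fraction."""
    return successes / total

def overall_rate(groups):
    """Success rate after pooling every group together."""
    successes = sum(s for s, _ in groups.values())
    total = sum(t for _, t in groups.values())
    return successes / total

for group in ("mild", "severe"):
    a = rate(*counts["A"][group])
    b = rate(*counts["B"][group])
    print(f"{group}: A = {a:.0%}, B = {b:.0%}")

print(f"combined: A = {overall_rate(counts['A']):.0%}, "
      f"B = {overall_rate(counts['B']):.0%}")
```

A wins in both groups (93% vs 87% for mild cases, 73% vs 69% for severe cases), but B wins overall (83% vs 78%). The reversal happens because A was used mostly on severe cases, which have lower success rates whichever treatment is used.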