C280 Probability and Stats WGU
Exploratory Data Analysis (EDA) - 4 steps
1. Producing data
2. Exploratory data analysis
3 & 4. Probability and Inference

Dataset: A set of data identified with particular circumstances. Datasets are typically displayed in tables, in which rows represent individuals and columns represent variables.
Quantitative Variables: Take numerical values and represent some kind of measurement.
Categorical Variables: Take category or label values and place an individual into one of several groups.
Examining Distributions: Exploring data obtained from one variable at a time.
Examining Relationships: Exploring data obtained from two variables at a time.
Spread (Variability): Approximate range covered by the data.
Unimodal Distribution: A distribution with one mode.
Bimodal Distribution: A distribution with two modes.
Multimodal Distribution: A distribution with three or more modes.
Median (M): The midpoint of the distribution.
Range: The exact distance between the smallest data point (Min) and the largest one (Max).
Interquartile Range (IQR): Measures the variability of a distribution by giving the range covered by the middle 50% of the data. IQR = Q3 - Q1
Five-Number Summary: The combination of all five numbers (Min, Q1, M, Q3, Max).
Boxplot: Graphically represents the distribution of a quantitative variable by visually displaying the five-number summary and any observation classified as a suspected outlier using the 1.5(IQR) criterion.
Standard Deviation: Quantifies the spread of a distribution in a different way than the range and IQR: by measuring how far the observations typically fall from their mean.
Explanatory Variable (Independent / X): The variable that claims to explain, predict, or affect the response.
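The spread measures on the cards above can be sketched in a few lines of Python. This is a minimal illustration, not the course's own procedure. It assumes the "median of each half" rule for Q1 and Q3 (other quartile conventions give slightly different values) and at least two observations. It flags suspected outliers as values below Q1 - 1.5(IQR) or above Q3 + 1.5(IQR). The function names and the sample data are hypothetical.

```python
from statistics import median, stdev


def five_number_summary(values):
    """Return (Min, Q1, M, Q3, Max), taking Q1/Q3 as medians of each half."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]            # observations below the median position
    upper = data[half + (n % 2):]  # observations above the median position
    return data[0], median(lower), median(data), median(upper), data[-1]


def suspected_outliers(values):
    """Return observations outside the 1.5(IQR) fences."""
    _, q1, _, q3, _ = five_number_summary(values)
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    return [x for x in values if x < low_fence or x > high_fence]


# Hypothetical sample, not taken from the flashcards.
sample = [2, 4, 4, 5, 6, 7, 8, 9, 30]

print(five_number_summary(sample))  # (2, 4.0, 6, 8.5, 30)
print(suspected_outliers(sample))   # [30]
print(stdev(sample))                # sample standard deviation (n - 1 divisor)
```

Here the IQR is 8.5 - 4.0 = 4.5, so the fences sit at -2.75 and 15.25, and only 30 would be drawn as a suspected outlier on a boxplot.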
Response Variable (Dependent / Y): The outcome of the study.
Correlation Coefficient (r): A numerical measure of the strength and direction of a linear relationship between two quantitative variables.
Simpson's Paradox: When averages are taken across different groups, they can appear to contradict the overall averages. The numbers don't seem to add up.
In a class of 25 students, the following scores were received on a quiz: 3, 3, 3, 3, 4, 4, 4, 4, 5, 5, 6, 6, 6, 7, 7, 8, 8, 8, 8, 8, 9, 9, 9, 9. For this distribution, what is the most appropriate measure of central tendency? Either the mean or the median.
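One way to check the answer to the quiz question is to compute both measures directly. The sketch below uses the scores exactly as listed on the card, which is 24 values even though the card mentions 25 students; the variable names are illustrative.

```python
from statistics import mean, median

# The 24 scores exactly as listed on the card (the card's text says 25 students).
scores = [3, 3, 3, 3, 4, 4, 4, 4, 5, 5, 6, 6, 6, 7, 7,
          8, 8, 8, 8, 8, 9, 9, 9, 9]

quiz_mean = mean(scores)      # 146 / 24
quiz_median = median(scores)  # average of the 12th and 13th sorted values

print(f"mean = {quiz_mean:.2f}, median = {quiz_median}")  # mean = 6.08, median = 6.0
```

The two values nearly coincide and the listed scores contain no extreme observations, which is consistent with the card's answer that either measure is reasonable here.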