- Which variables are ordinal? Which are nominal?
Nominal variables: ____ and ____
Ordinal Variables: ____ (although it is coded as continuous)
Answer Options:
1a. Nominal variables selections: Calories, Type, Protein or Potassium, MFG
1b. Ordinal Variables selections: Calories, Shelf, Protein
- Use Cols>Columns Viewer to obtain summary statistics. Which, if any, of the variables is missing values?
The variables ____, ____, and ____ have missing values. The data set has ____ missing values in total.
Answer Options:
2a. Calories, Carbo, Protein & Fat
2b. Fiber, Fat, Sodium & sugar
2c. Potassium, Vitamins, weight, Shelf
2d. 3,4,5
- Use Analyze > Distribution to plot a histogram for each of the continuous variables and create summary statistics. Based on the histograms and summary statistics, choose the correct answers:
____ and ____ have the largest standard deviations, and so are the most variable.
The variables ____, ____, ____, and ____ seem (right) skewed.
Answer Options:
3a. fat, sodium, calories
3b. fiber, rating, potassium
3c. shelf, fat, cups
3d. names, cups, fiber
3e. potassium, carbs, sugars
3f. protein, calories, rating
- Use the Graph Builder to plot a side-by-side box-plot comparing the calories in hot vs. cold cereals. What does this plot show us?
We see that in cold cereals, the different cereals vary in the amount of calories mainly between approximately ____, whereas all 3 of the hot cereals have 100 calories.
Answer Options:
100-140
90-120
50-100
The Correct Answer and Explanation is:
Question 1: Nominal and Ordinal Variables
Nominal Variables:
Nominal variables are categorical variables that do not have a specific order or ranking. They are just labels used for identification.
- Nominal Variables: Type, MFG
These variables represent categories (e.g., “Type” of cereal or “MFG” manufacturer), where no inherent order exists. Categories can be named but cannot be ranked in any meaningful way.
Ordinal Variables:
Ordinal variables are also categorical but have a defined order or ranking. They reflect some sort of hierarchy, though the distances between categories are not necessarily equal.
- Ordinal Variables: Shelf
“Shelf” represents the shelf position of an item in a store, so its values have a defined order even though the data set codes it as continuous. “Protein” is also listed among the answer options, but it is a measured quantity and therefore continuous rather than ordinal; the question itself leaves a single blank for the ordinal variable, described as “coded as continuous”, which matches Shelf.
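The difference between the two kinds of categorical variable can be sketched in a few lines of Python. This is only an illustration: the values below are invented toy data, not the cereals data set, and the variable names `mfg` and `shelf` simply mirror the column names used in the question.

```python
from collections import Counter
import statistics

# Hypothetical toy values for illustration; NOT taken from the cereals data set.
mfg = ["K", "G", "K", "P", "G", "K"]   # nominal: manufacturer labels, no order
shelf = [3, 1, 2, 3, 1, 3]             # ordinal: shelf 1 < shelf 2 < shelf 3

# Nominal data: counting categories and taking the mode are meaningful.
mfg_counts = Counter(mfg)
mfg_mode = mfg_counts.most_common(1)[0][0]

# Ordinal data: order-based summaries are meaningful as well.
# median_low always returns a value that actually occurs in the data.
shelf_sorted = sorted(shelf)
shelf_median = statistics.median_low(shelf)

print("MFG counts:", dict(mfg_counts), "mode:", mfg_mode)
print("Shelf sorted:", shelf_sorted, "median shelf:", shelf_median)
```

Sorting the manufacturer labels would only give alphabetical order, which carries no meaning; sorting the shelf numbers reflects a real ranking.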
Question 2: Missing Values
To answer this question, we need to know which variables have missing data. This can be read from the summary statistics produced by “Cols > Columns Viewer”, the tool named in the question.
Answer Options:
- 2a. Calories, Carbo, Protein & Fat
- 2b. Fiber, Fat, Sodium & Sugar
- 2c. Potassium, Vitamins, Weight, Shelf
- 2d. 3,4,5
Without access to the data set, the correct option cannot be determined here. It is whichever option lists the variables that show missing entries in the summary statistics, so checking that output identifies the answer.
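The same check can be sketched outside the menu-driven tool. The example below is a minimal illustration in plain Python, assuming missing entries are represented as `None`; the column names `x`, `y`, `z` and all values are invented and say nothing about which cereal variables are actually incomplete.

```python
# Hypothetical toy table; None marks a missing entry.
# Column names and values are invented, NOT the cereals data.
rows = [
    {"x": 110, "y": 2.0, "z": 1.5},
    {"x": 90, "y": None, "z": 3.0},
    {"x": 100, "y": 1.0, "z": None},
    {"x": 120, "y": 4.0, "z": 2.0},
]

# Count the missing entries in each column.
missing_per_column = {
    col: sum(1 for row in rows if row[col] is None) for col in rows[0]
}

# Columns with at least one missing entry, and the overall total.
columns_with_missing = [col for col, n in missing_per_column.items() if n > 0]
total_missing = sum(missing_per_column.values())

print("Missing per column:", missing_per_column)
print("Columns with missing values:", columns_with_missing)
print("Total missing values:", total_missing)
```

The two quantities the question asks for correspond to `columns_with_missing` and `total_missing`.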
Question 3: Variables with Largest Standard Deviations & Skewness
Largest Standard Deviations (Most Variable):
Variables with the largest standard deviations are those where the data points are spread out the most. This indicates higher variability in the data.
- Answer Options:
- 3a. Fat, Sodium, Calories
These variables might have the highest variation in values (e.g., calorie count can vary widely across different types of cereals).
Skewness:
Skewness refers to the asymmetry of the distribution of data. Right-skewed means that most of the data points are on the lower end, with a few very large values stretching the tail of the distribution to the right.
- Answer Options:
- 3a. Fat, Sodium, Calories
These variables could be right-skewed because many products might have lower fat, sodium, and calories, with only a few having very high values.
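Both summaries can be computed directly, which may help when reading the histograms. The sketch below uses Python's `statistics` module and a hand-written moment-based skewness (third central moment divided by the variance to the power 1.5); other software may use a slightly different skewness formula. The two lists are invented toy data, not cereal measurements.

```python
import statistics

def skewness(values):
    """Moment-based skewness: third central moment / variance ** 1.5."""
    n = len(values)
    mean = statistics.fmean(values)
    m2 = sum((v - mean) ** 2 for v in values) / n   # population variance
    m3 = sum((v - mean) ** 3 for v in values) / n   # third central moment
    return m3 / m2 ** 1.5

# Hypothetical toy samples for illustration; NOT the cereals data.
symmetric = [1, 2, 3, 4, 5]
right_skewed = [1, 1, 2, 2, 3, 10]   # one large value stretches the right tail

sd_symmetric = statistics.stdev(symmetric)
sd_right_skewed = statistics.stdev(right_skewed)

print("std dev:", sd_symmetric, sd_right_skewed)
print("skewness:", skewness(symmetric), skewness(right_skewed))
# In a right-skewed sample the mean is pulled above the median.
print("mean vs median:", statistics.fmean(right_skewed), statistics.median(right_skewed))
```

Note that a standard deviation is expressed in the units of its variable, so comparing variability across variables measured in different units should be done with some care.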
Question 4: Box Plot of Hot vs. Cold Cereals
Box Plot Insights:
In the side-by-side box plot comparing the calories in hot vs. cold cereals, the key observation is the distribution of calorie counts across the two categories.
- The box plot likely shows that cold cereals vary widely in their calorie content, mainly between about 100 and 140 calories. In contrast, the hot cereals are consistent: according to the question, all 3 of them have 100 calories.
Answer Option:
- 100-140
This is the correct range for the variation in calories for cold cereals, whereas hot cereals have more uniform calorie content.
[Figure: side-by-side box plot of calories in hot vs. cold cereals. Cold cereals show the wider range, typically 100-140 calories, while hot cereals sit at about 100 calories.]
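The numbers behind a side-by-side box plot can be sketched without any plotting library. The example below computes a five-number summary per group with Python's `statistics.quantiles`. The cold-cereal values are invented for illustration and are not the real data; the three hot-cereal values of 100 follow the question's own statement. Plotting tools often draw whiskers and outliers by their own rules, so an actual plot may look slightly different.

```python
import statistics

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max), the core values of a box plot."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return (min(values), q1, median, q3, max(values))

# Cold values are hypothetical toy data, NOT the cereals data set.
# Hot values follow the question: all 3 hot cereals have 100 calories.
calories = {
    "cold": [50, 90, 100, 110, 110, 120, 140, 160],
    "hot": [100, 100, 100],
}

summaries = {group: five_number_summary(vals) for group, vals in calories.items()}
cold_summary = summaries["cold"]
hot_summary = summaries["hot"]

for group, (lo, q1, med, q3, hi) in summaries.items():
    print(f"{group}: min={lo} Q1={q1} median={med} Q3={q3} max={hi} IQR={q3 - q1}")
```

A group whose values are all identical, like the hot cereals here, collapses to a single line in the plot because its interquartile range is zero.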
