In preparation for writing your report to senior management next week, conductthe following descriptive statistics analyses with Excel®. Answer the questions below in your Excel sheet or in a separate Word document:
· Insert a new column in the database that corresponds to “Annual Sales.” Annual Sales is the result of multiplying a restaurant’s “SqFt.” by “Sales/SqFt.”
· Calculate the mean, standard deviation, skew, 5-number summary, and interquartile range (IQR) for each of the variables.
· Create a box-plot for the “Annual Sales” variable. Does it look symmetric? Would you prefer the IQR instead of the standard deviation to describe this variable’s dispersion? Why?
· Create a histogram for the “Sales/SqFt” variable. Is the distribution symmetric? If not, what is the skew? Are there any outliers? If so, which one(s)? What is the “SqFt” area of the outlier(s)? Is the outlier(s) smaller or larger than the average restaurant in the database? What can you conclude from this observation?
· What measure of central tendency is more appropriate to describe “Sales/SqFt”? Why?
The Correct Answer and Explanation is :
To prepare your report for senior management, follow these steps in Excel to perform the required descriptive statistics analyses:
1. Insert a New Column for “Annual Sales”:
- In your dataset, create a new column labeled “Annual Sales.”
- In the first cell under this header, enter the formula:
= [@SqFt] * [@Sales/SqFt], replacing[@SqFt]and[@Sales/SqFt]with the actual cell references for square footage and sales per square foot, respectively. - Drag the fill handle down to apply this formula to all rows.
2. Calculate Descriptive Statistics:
- Mean: Use the
AVERAGEfunction. - For “Annual Sales”:
=AVERAGE([Annual Sales]) - For “Sales/SqFt”:
=AVERAGE([Sales/SqFt]) - Standard Deviation: Use the
STDEV.Sfunction for sample data. - For “Annual Sales”:
=STDEV.S([Annual Sales]) - For “Sales/SqFt”:
=STDEV.S([Sales/SqFt]) - Skewness: Use the
SKEWfunction. - For “Annual Sales”:
=SKEW([Annual Sales]) - For “Sales/SqFt”:
=SKEW([Sales/SqFt]) - Five-Number Summary: Use the
MIN,QUARTILE.EXC,MEDIAN, andMAXfunctions. - For “Annual Sales”:
- Minimum:
=MIN([Annual Sales]) - First Quartile (Q1):
=QUARTILE.EXC([Annual Sales], 1) - Median:
=MEDIAN([Annual Sales]) - Third Quartile (Q3):
=QUARTILE.EXC([Annual Sales], 3) - Maximum:
=MAX([Annual Sales])
- Minimum:
- For “Sales/SqFt”:
- Minimum:
=MIN([Sales/SqFt]) - First Quartile (Q1):
=QUARTILE.EXC([Sales/SqFt], 1) - Median:
=MEDIAN([Sales/SqFt]) - Third Quartile (Q3):
=QUARTILE.EXC([Sales/SqFt], 3) - Maximum:
=MAX([Sales/SqFt])
- Minimum:
- Interquartile Range (IQR): Subtract Q1 from Q3.
- For “Annual Sales”:
=Q3 - Q1 - For “Sales/SqFt”:
=Q3 - Q1
3. Create a Box Plot for “Annual Sales”:
- Select the “Annual Sales” data.
- Go to the “Insert” tab.
- Click on “Insert Statistic Chart” and choose “Box and Whisker.”
- Interpretation:
- If the box plot is symmetric, the data distribution is likely normal.
- If the box plot is skewed, the data distribution is not symmetric.
- The IQR is less sensitive to outliers than the standard deviation, making it preferable for skewed distributions.
4. Create a Histogram for “Sales/SqFt”:
- Select the “Sales/SqFt” data.
- Go to the “Insert” tab.
- Click on “Insert Statistic Chart” and choose “Histogram.”
- Interpretation:
- If the histogram is symmetric, the distribution is likely normal.
- If the histogram is skewed, the distribution is not symmetric.
- Outliers are data points that fall outside the whiskers in the box plot or beyond the bins in the histogram.
- To identify outliers, look for data points that fall outside the whiskers in the box plot or beyond the bins in the histogram.
- The “Sales/SqFt” value of the outlier can be found by examining the data point.
- Compare the outlier’s “Sales/SqFt” value to the mean or median to determine if it’s smaller or larger than the average.
- Outliers can significantly affect the mean and standard deviation, so it’s important to consider them when analyzing the data.
5. Appropriate Measure of Central Tendency for “Sales/SqFt”:
- If the distribution of “Sales/SqFt” is symmetric, the mean is appropriate.
- If the distribution is skewed, the median is more appropriate because it is less affected by outliers.
- The choice between mean and median depends on the symmetry of the data distribution.
For a visual demonstration of these steps, you can refer to the following video:
videoFive Number Summary and Boxplots in Excel 2016turn0search0