The following are the final scores in Probability and Statistics for 40 selected Year 1 Computer Science students during the academic year 2023:
30, 83, 90, 83, 75, 45, 90, 90, 68, 83, 58, 83, 73, 78, 90, 83, 53, 70, 55, 35,
31, 45, 64, 73, 65, 45, 80, 80, 68, 73, 48, 73, 73, 78, 80, 63, 43, 60, 45, 55.
(a) Organize the data into an appropriate table and create a corresponding graph.
(b) Identify the mode of the data set.
(c) Compute the following summary measures:
- Arithmetic mean
- Median
- Variance
- Standard deviation
The correct answer and explanation is:
(a) Frequency Distribution Table & Graph
| Score | Frequency |
|---|---|
| 30-39 | 3 |
| 40-49 | 6 |
| 50-59 | 4 |
| 60-69 | 6 |
| 70-79 | 9 |
| 80-89 | 8 |
| 90-99 | 4 |
Because the scores are grouped into class intervals of equal width, a histogram with one bar per interval is a suitable graph for this distribution.
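The tally can be checked against the raw data with a short script. This is a minimal sketch, assuming class intervals of width 10 starting at 30; the helper name `frequency_table` is illustrative, and the row of `#` characters is only a rough text stand-in for a histogram.

```python
from collections import Counter

scores = [
    30, 83, 90, 83, 75, 45, 90, 90, 68, 83, 58, 83, 73, 78, 90, 83, 53, 70, 55, 35,
    31, 45, 64, 73, 65, 45, 80, 80, 68, 73, 48, 73, 73, 78, 80, 63, 43, 60, 45, 55,
]


def frequency_table(values, width=10):
    """Count values per class interval [lower, lower + width - 1]."""
    # Map each score to the lower bound of its interval, e.g. 83 -> 80.
    counts = Counter((v // width) * width for v in values)
    lowest, highest = min(counts), max(counts)
    return {
        (lower, lower + width - 1): counts.get(lower, 0)
        for lower in range(lowest, highest + width, width)
    }


table = frequency_table(scores)
for (lower, upper), freq in table.items():
    # One '#' per student gives a rough text histogram.
    print(f"{lower}-{upper} | {freq:2d} | {'#' * freq}")
```

The frequencies should sum to 40, which is a quick sanity check that no score was missed or counted twice.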
(b) Mode
The mode is the most frequently occurring score. From the dataset, 83 and 73 appear the most (each appearing 5 times). Thus, the data set is bimodal, with modes 83 and 73.
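As a hedged check of the mode, the sketch below uses `statistics.multimode` from the Python standard library, which returns every value tied for the highest frequency.

```python
from statistics import multimode

scores = [
    30, 83, 90, 83, 75, 45, 90, 90, 68, 83, 58, 83, 73, 78, 90, 83, 53, 70, 55, 35,
    31, 45, 64, 73, 65, 45, 80, 80, 68, 73, 48, 73, 73, 78, 80, 63, 43, 60, 45, 55,
]

# multimode returns all values that share the highest count.
modes = sorted(multimode(scores))
for m in modes:
    print(f"score {m} appears {scores.count(m)} times")
# score 73 appears 5 times
# score 83 appears 5 times
```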
(c) Summary Measures
- Arithmetic Mean (Average)
\[
\bar{X} = \frac{\sum X}{N} = \frac{2657}{40} = 66.425
\]
- Median
Arranging the data in ascending order:
30, 31, 35, 43, 45, 45, 45, 45, 48, 53, 55, 55, 58, 60, 63, 64, 65, 68, 68, 70, 73, 73, 73, 73, 73, 75, 78, 78, 80, 80, 80, 83, 83, 83, 83, 83, 90, 90, 90, 90
With N = 40, the median is the average of the 20th and 21st values, 70 and 73, so Median = (70 + 73) / 2 = 71.5.
- Variance (\(\sigma^2\))
Using the population formula:
\[
\sigma^2 = \frac{\sum (X - \bar{X})^2}{N}
\]
The squared deviations from the mean sum to 11591.775, so Variance = 11591.775 / 40 ≈ 289.79.
- Standard Deviation (\(\sigma\))
\[
\sigma = \sqrt{289.79} \approx 17.02
\]
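The four summary measures can be recomputed directly from the raw scores. The sketch below uses Python's standard `statistics` module and, like the formula above, divides by N (population variance). The sample variance, which divides by N - 1, is not part of the original solution and is shown only for comparison.

```python
from statistics import mean, median, pstdev, pvariance, variance

scores = [
    30, 83, 90, 83, 75, 45, 90, 90, 68, 83, 58, 83, 73, 78, 90, 83, 53, 70, 55, 35,
    31, 45, 64, 73, 65, 45, 80, 80, 68, 73, 48, 73, 73, 78, 80, 63, 43, 60, 45, 55,
]

mean_score = mean(scores)      # sum of scores divided by N
median_score = median(scores)  # average of the two middle values when N is even
pop_var = pvariance(scores)    # population variance: divides by N
pop_sd = pstdev(scores)        # square root of the population variance
sample_var = variance(scores)  # sample variance: divides by N - 1 (comparison only)

print(f"N = {len(scores)}, sum = {sum(scores)}")   # N = 40, sum = 2657
print(f"mean = {mean_score:.3f}")                  # mean = 66.425
print(f"median = {median_score}")                  # median = 71.5
print(f"population variance = {pop_var:.2f}")      # population variance = 289.79
print(f"population std dev = {pop_sd:.2f}")        # population std dev = 17.02
print(f"sample variance = {sample_var:.3f}")       # sample variance = 297.225
```

Whether to divide by N or N - 1 depends on whether the 40 students are treated as the whole population of interest or as a sample from a larger group; the problem statement does not say, so the population form is kept here.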
Explanation
The dataset represents the final scores of 40 students in Probability and Statistics. We began by organizing the data into a frequency distribution table, which groups the scores into meaningful intervals. This makes it easier to interpret patterns in the data, such as clustering around specific score ranges. A histogram visually represents the spread and concentration of the scores.
The mode, which is the most frequently occurring value, was found to be both 73 and 83, making the dataset bimodal. This suggests that many students scored around these values.
The mean (66.425) provides a measure of central tendency, representing the average performance. However, because the data has significant variation, the mean alone does not fully describe the distribution. The median (71.5), which is the midpoint of the ordered scores, is higher than the mean, indicating a left-skewed (negatively skewed) distribution in which a tail of low scores pulls the average down.
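One rough way to quantify the gap between mean and median is Pearson's second skewness coefficient, 3 × (mean − median) / standard deviation. This coefficient is not part of the original solution; it is used here only as an illustrative heuristic, under the assumption that the population standard deviation is the appropriate scale.

```python
from statistics import mean, median, pstdev

scores = [
    30, 83, 90, 83, 75, 45, 90, 90, 68, 83, 58, 83, 73, 78, 90, 83, 53, 70, 55, 35,
    31, 45, 64, 73, 65, 45, 80, 80, 68, 73, 48, 73, 73, 78, 80, 63, 43, 60, 45, 55,
]

# Pearson's second skewness coefficient: 3 * (mean - median) / standard deviation.
# A negative value points to a longer tail toward the low scores.
skew = 3 * (mean(scores) - median(scores)) / pstdev(scores)
print(f"skewness coefficient = {skew:.2f}")  # skewness coefficient = -0.89
```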
To measure the dispersion, we calculated the variance (289.79) and standard deviation (17.02). A standard deviation of about 17 points on a mean of about 66 indicates that the scores are widely spread, with some students scoring well above or well below the mean. The presence of two modes may suggest groups of students performing at distinct levels, although both modes fall in the upper part of the range.
These statistical measures help educators understand overall performance trends, identify struggling students, and make informed decisions for curriculum improvement.