The Data File

Before you can analyze data, you need to create a file which holds them. To illustrate the way in which these files are produced, we will use an imaginary set of data from a questionnaire study which is referred to as the Job Survey. The data relating to this study derive from two sources: a questionnaire study of employees who answer questions about themselves and a questionnaire study of their supervisors who answer questions relating to each of the employees. The questions asked are shown in Appendix 2.1, while the coding of the information or data collected is presented in Table 2.1. The cases consist of people, traditionally called respondents by sociologists and subjects by psychologists whose preferred term now is participants. Although questionnaire data have been used as an example, it should be recognized that SPSS and the data analysis procedures may be used with other forms of quantitative data, such as official statistics or observational measures.

TABLE 2.1 The Job Survey data

At the top of each column is the SPSS name or label we have given to the variables in this survey.

SPSS names

Variable and file names in SPSS have to meet certain specifications. Unlike earlier versions of SPSS, they can now be longer than eight characters and can begin with a capital letter. However, it is most probably preferable to keep the names short. Names must begin with an alphabetic character (A–Z). The remaining characters can be any letter, number, period, @ (at), $ (dollar) or _ (underscore). Blank spaces are not allowed and they cannot end with a period and, preferably, not with an underscore. In addition, certain words, known as keywords, cannot be used because they can only be interpreted as commands by SPSS. They include words such as add, and, any, or, and to. The SPSS names given to the variables in the Job Survey are presented in Table 2.2.

TABLE 2.2 The SPSS names of the Job Survey variables

Variable name                  SPSS name
Identification number          id
Ethnic group                   ethnicgp
Gender                         gender
Gross annual income            income
Age                            age
Years worked                   years
Organizational commitment      commit
Job-satisfaction scale
  Item 1                       satis1
  Item 2                       satis2
  Item 3                       satis3
  Item 4                       satis4
Job-autonomy scale
  Item 1                       autonom1
  Item 2                       autonom2
  Item 3                       autonom3
  Item 4                       autonom4
Job-routine scale
  Item 1                       routine1
  Item 2                       routine2
  Item 3                       routine3
  Item 4                       routine4
Attendance at meeting          attend
Rated skill                    skill
Rated productivity             prody
Rated quality                  qual
Absenteeism                    absence

WEEK 5 - FORMATIVE ASSESSMENT

1. In the Job Survey data, is absence a nominal, an ordinal, an interval/ratio, or a dichotomous variable?

2. Is test-retest reliability a test of internal or external reliability?

3. What would be the R procedure for computing Cronbach’s alpha for autonomy?

4. Consider the following questions which might be used in a social survey about people’s drinking habits and decide whether the variable is nominal, ordinal, interval/ratio or dichotomous:

(a) Do you ever consume alcoholic drinks?
Yes____ No___

(b) If you have ticked Yes to the previous question, which of the following alcoholic drinks do you consume most frequently (tick one category only)?
Beer_____ Spirits____ Wine_____ Liquors____ Other_____

(c) How frequently do you consume alcoholic drinks? Tick the answer that comes closest to your current practice.
Daily____ Most days____ Once or twice a week____ Once or twice a month____ A few times a year_____ Once or twice a year____

(d) How many units of alcohol did you consume last week? [We can assume that the interviewer would help respondents to translate into units of alcohol.]
Number of units_____

The correct answers and explanations are:

Formative Assessment Answers with Explanations


1. In the Job Survey data, is absence a nominal, ordinal, interval/ratio, or a dichotomous variable?

Answer: Absence is an interval/ratio variable.

Explanation:
Absenteeism records how much time an employee was absent from work (for example, a count of days), which is a measurable quantity. Interval/ratio variables have numeric values with meaningful distances between them, so they can be ordered, added, and subtracted. For example, if one person was absent for 5 days and another for 10 days, the difference of 5 days is meaningful and measurable. Because zero means no absence at all, the scale also has a true zero. Absence is not nominal or ordinal, because its values are counts rather than categories or ranks, and it is not dichotomous, because it can take many different numeric values.


2. Is test-retest reliability a test of internal or external reliability?

Answer: Test-retest reliability is a test of external reliability.

Explanation:
Test-retest reliability measures the consistency of a test or instrument over time by administering the same test to the same participants on two different occasions. If the results are consistent across the two time points, the test is said to have good external reliability. External reliability focuses on the reproducibility of results across time, while internal reliability evaluates the consistency of items within the same test or measurement tool (e.g., Cronbach’s alpha).
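
As a rough illustration of the idea, test-retest reliability is often summarized by correlating the scores from the two occasions. The sketch below uses base R only; the scores in time1 and time2 are invented for illustration and are not part of the Job Survey data.

```r
# Hypothetical scores for the same six participants on two occasions
# (invented values, used only to illustrate the calculation)
time1 <- c(12, 15, 9, 18, 14, 11)
time2 <- c(13, 14, 10, 17, 15, 10)

# Pearson correlation between the first and second administration;
# values close to 1 suggest the measure is stable over time
retest_r <- cor(time1, time2)
print(round(retest_r, 2))
```

A high correlation here would be read as evidence of consistency over time; it says nothing about whether the items within the scale agree with one another.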


3. What would be the R procedure for computing Cronbach’s alpha for autonomy?

Answer:
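A typical R procedure for computing Cronbach’s alpha for the autonomy scale (items autonom1 to autonom4) uses the psych package:

# Install the psych package only if it is not already installed
if (!requireNamespace("psych", quietly = TRUE)) install.packages("psych")

# Load the psych package
library(psych)

# Create a data frame for the autonomy items
# (the c(...) placeholders must be replaced with the actual responses
# before this code will run)
autonomy <- data.frame(
  autonom1 = c(...),  # Replace with actual data
  autonom2 = c(...),
  autonom3 = c(...),
  autonom4 = c(...)
)

# Compute Cronbach's alpha; the psych:: prefix avoids a clash with
# other packages that also export a function called alpha()
cronbach_alpha <- psych::alpha(autonomy)
print(cronbach_alpha)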

Explanation:
Cronbach’s alpha measures the internal consistency of a scale, which evaluates whether the items in the scale (e.g., autonom1, autonom2, autonom3, autonom4) measure the same underlying construct (autonomy). The psych package in R provides a straightforward function alpha() to calculate this metric. The autonomy items must first be organized into a data frame before running the analysis.
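
To show what the statistic is actually computing, the following sketch works out raw Cronbach’s alpha by hand in base R, using the standard formula alpha = (k / (k - 1)) * (1 - sum of item variances / variance of the total score). The responses are invented for illustration, and cronbach_alpha_raw is a hypothetical helper name, not a function from any package.

```r
# Hypothetical responses of five people to the four autonomy items
# (invented values; replace with the real Job Survey data)
autonomy <- data.frame(
  autonom1 = c(4, 2, 5, 3, 1),
  autonom2 = c(4, 3, 5, 2, 1),
  autonom3 = c(5, 2, 4, 3, 2),
  autonom4 = c(3, 2, 5, 3, 1)
)

# Hypothetical helper: raw Cronbach's alpha from item and total variances
cronbach_alpha_raw <- function(items) {
  k <- ncol(items)                   # number of items in the scale
  item_vars <- sapply(items, var)    # variance of each item
  total_var <- var(rowSums(items))   # variance of the summed scale score
  (k / (k - 1)) * (1 - sum(item_vars) / total_var)
}

alpha_value <- cronbach_alpha_raw(autonomy)
print(alpha_value)
```

With complete data, this value should correspond to the raw alpha reported by psych::alpha(); the package additionally reports standardized alpha and item-level diagnostics.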


4. Classify the variables in the survey about drinking habits:

(a) Do you ever consume alcoholic drinks? Yes____ No___
Answer: Dichotomous variable.
Explanation: This question has two possible responses (Yes or No). Dichotomous variables are a special type of nominal variable with only two categories.

(b) Which of the following alcoholic drinks do you consume most frequently?
Answer: Nominal variable.
Explanation: The categories (Beer, Spirits, Wine, Liquors, Other) represent distinct and mutually exclusive groups without any inherent ranking or order, making this a nominal variable.

(c) How frequently do you consume alcoholic drinks?
Answer: Ordinal variable.
Explanation: The response options (Daily, Most days, etc.) indicate an order or ranking of frequency but do not have equal intervals between the options. Thus, this is an ordinal variable.

(d) How many units of alcohol did you consume last week?
Answer: Interval/ratio variable.
Explanation: The number of units consumed is a numeric variable where the differences between values are meaningful, making it an interval/ratio variable. Because zero units means that no alcohol was consumed, the scale has a true zero and is, more precisely, ratio data.
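
The four classifications above map fairly directly onto R data types. The sketch below is one possible encoding, assuming a small data frame called drinking with invented responses; the column names are hypothetical and chosen only for illustration.

```r
# Hypothetical responses from four people (invented for illustration)
drinking <- data.frame(
  # (a) dichotomous: an unordered factor with exactly two levels
  drinks = factor(c("Yes", "Yes", "No", "Yes"), levels = c("No", "Yes")),
  # (b) nominal: an unordered factor with several categories
  favourite = factor(
    c("Beer", "Wine", NA, "Spirits"),
    levels = c("Beer", "Spirits", "Wine", "Liquors", "Other")
  ),
  # (c) ordinal: an ordered factor, levels listed from least to most frequent
  frequency = factor(
    c("Daily", "Once or twice a month", NA, "Most days"),
    levels = c("Once or twice a year", "A few times a year",
               "Once or twice a month", "Once or twice a week",
               "Most days", "Daily"),
    ordered = TRUE
  ),
  # (d) interval/ratio: a plain numeric vector
  units = c(21, 4, 0, 12)
)

# Inspect how R has stored each variable
str(drinking)

# Ordered factors support rank comparisons; unordered factors do not
more_often <- drinking$frequency[1] > drinking$frequency[2]
print(more_often)
```

Storing each variable with the matching type helps R choose sensible defaults, for instance frequency tables for factors and means for numeric columns.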


Summary

In the Job Survey data, absence is an interval/ratio variable because it represents measurable numeric values. Test-retest reliability measures external reliability by testing consistency across time. Cronbach’s alpha, calculated using R’s psych package, evaluates the internal reliability of scale items like those measuring autonomy. In the drinking habits survey, the nature of variables varies: (a) is dichotomous due to its two categories, (b) is nominal as the options are categories without a rank, (c) is ordinal because the responses have a meaningful order, and (d) is interval/ratio due to its numeric nature. Understanding variable types is essential for applying the appropriate statistical analyses and interpreting results accurately in research.
