The Data File
Before you can analyze data, you need to create a file which holds them. To illustrate the way in which these files are produced, we will use an imaginary set of data from a questionnaire study which is referred to as the Job Survey. The data relating to this study derive from two sources: a questionnaire study of employees who answer questions about themselves and a questionnaire study of their supervisors who answer questions relating to each of the employees. The questions asked are shown in Appendix 2.1, while the coding of the information or data collected is presented in Table 2.1. The cases consist of people, traditionally called respondents by sociologists and subjects by psychologists, whose preferred term now is participants. Although questionnaire data have been used as an example, it should be recognized that SPSS and the data analysis procedures may be used with other forms of quantitative data, such as official statistics or observational measures.
TABLE 2.1 The Job Survey data
At the top of each column is the SPSS name or label we have given to the variables in this survey.
SPSS names
Variable and file names in SPSS have to meet certain specifications. Unlike in earlier versions of SPSS, names can now be longer than eight characters and can begin with a capital letter. However, it is probably preferable to keep the names short. Names must begin with an alphabetic character (A–Z). The remaining characters can be any letter, number, period, @ (at), $ (dollar) or _ (underscore). Blank spaces are not allowed, and names cannot end with a period and, preferably, should not end with an underscore. In addition, certain words, known as keywords, cannot be used because they can only be interpreted as commands by SPSS. They include words such as add, and, any, or, and to. The SPSS names given to the variables in the Job Survey are presented in Table 2.2.
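The naming rules described above can be sketched as a small check in R. This is only an illustration of the rules as stated in the text, not a reimplementation of SPSS's own validation: the function name is_valid_spss_name is hypothetical, and the keyword list contains only the examples the text mentions, so the full set of reserved words in SPSS may differ.

```r
# Keywords taken from the examples in the text (assumption: SPSS's full list may differ).
spss_keywords <- c("add", "and", "any", "or", "to")

# Hypothetical helper: TRUE when a name satisfies the rules stated in the text.
is_valid_spss_name <- function(name) {
  starts_with_letter <- grepl("^[A-Za-z]", name)           # must begin with a letter
  allowed_chars_only <- grepl("^[A-Za-z0-9.@$_]+$", name)  # no blanks or other symbols
  ends_with_period   <- grepl("\\.$", name)                # must not end with a period
  is_keyword         <- tolower(name) %in% spss_keywords   # must not be a keyword
  starts_with_letter & allowed_chars_only & !ends_with_period & !is_keyword
}

# Two Job Survey names followed by four names that each break one rule.
candidates <- c("ethnicgp", "satis1", "job survey", "1item", "to", "income.")
name_check <- is_valid_spss_name(candidates)
print(data.frame(name = candidates, valid = name_check))
```

Only the first two candidates pass; the others contain a blank, start with a digit, are a keyword, or end with a period.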
TABLE 2.2 The SPSS names of the Job Survey variables
Variable name                  SPSS name
Identification number          id
Ethnic group                   ethnicgp
Gender                         gender
Gross annual income            income
Age                            age
Years worked                   years
Organizational commitment      commit
Job-satisfaction scale
  Item 1                       satis1
  Item 2                       satis2
  Item 3                       satis3
  Item 4                       satis4
Job-autonomy scale
  Item 1                       autonom1
  Item 2                       autonom2
  Item 3                       autonom3
  Item 4                       autonom4
Job-routine scale
  Item 1                       routine1
  Item 2                       routine2
  Item 3                       routine3
  Item 4                       routine4
Attendance at meeting          attend
Rated skill                    skill
Rated productivity             prody
Rated quality                  qual
Absenteeism                    absence
WEEK 5 - FORMATIVE ASSESSMENT
1. In the Job Survey data, is absence a nominal, an ordinal, an interval/ratio, or a dichotomous variable?
2. Is test-retest reliability a test of internal or external reliability?
3. What would be the R procedure for computing Cronbach’s alpha for autonomy?
4. Consider the following questions which might be used in a social survey about people’s drinking habits and decide whether the variable is nominal, ordinal, interval/ratio or dichotomous:
(a) Do you ever consume alcoholic drinks? Yes____ No____
(b) If you have ticked Yes to the previous question, which of the following alcoholic drinks do you consume most frequently (tick one category only)? Beer____ Spirits____ Wine____ Liquors____ Other____
(c) How frequently do you consume alcoholic drinks? Tick the answer that comes closest to your current practice. Daily____ Most days____ Once or twice a week____ Once or twice a month____ A few times a year____ Once or twice a year____
(d) How many units of alcohol did you consume last week? [We can assume that the interviewer would help respondents to translate into units of alcohol.] Number of units____
Week 5 – Formative Assessment Answers
1. In the Job Survey data, is absence a nominal, an ordinal, an interval/ratio, or a dichotomous variable?
Answer: Absence is an interval/ratio variable.
Explanation:
Absence is a count, such as the number of days an employee was absent, so it is numerical and can be subjected to arithmetic operations. Unlike nominal or ordinal variables, interval/ratio variables have equal, meaningful distances between values. Absence also has a true zero point (zero days absent means no absence at all), which strictly makes it a ratio variable within the combined interval/ratio category.
2. Is test-retest reliability a test of internal or external reliability?
Answer: Test-retest reliability is a measure of external reliability.
Explanation:
Test-retest reliability assesses the consistency of a measure over time by administering the same test to the same participants on two different occasions. If the results are consistent, the measure is considered reliable. It is considered external because it evaluates the stability of a measurement across time, rather than the consistency of items within the same test (internal reliability).
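As a rough illustration of the idea, test-retest reliability is commonly summarized by correlating the scores from the two occasions. The sketch below uses made-up scores for six hypothetical participants; the variable names time1, time2 and retest_r are assumptions for this example and do not come from the Job Survey.

```r
# Hypothetical scores for the same six participants on two occasions (made-up numbers).
time1 <- c(12, 15, 9, 18, 14, 11)
time2 <- c(13, 14, 10, 17, 15, 10)

# Pearson correlation between the two administrations.
# A value close to 1 suggests the measure is stable over time.
retest_r <- cor(time1, time2)
print(round(retest_r, 2))
```

With real data, the two vectors would be the scale scores recorded for each participant at the first and second administration, matched by participant.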
3. What would be the R procedure for computing Cronbach’s alpha for autonomy?
Answer:
To compute Cronbach’s alpha for the autonomy scale, the procedure in R would look like this:
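# Assumes the data are stored in a data frame called job_survey
# Autonomy items: autonom1, autonom2, autonom3, autonom4
# Load the psych package (install it first with install.packages("psych") if needed)
library(psych)
# Select the four autonomy items
autonomy_data <- job_survey[, c("autonom1", "autonom2", "autonom3", "autonom4")]
# Compute Cronbach's alpha; the psych:: prefix avoids a clash with other packages that define alpha(), such as ggplot2
psych::alpha(autonomy_data)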
Explanation:
Cronbach’s alpha measures the internal consistency (internal reliability) of a multiple-item scale. The alpha() function from the psych package in R calculates this value. Here, the four items of the autonomy scale (autonom1, autonom2, autonom3, autonom4) are extracted from the dataset and passed to the function. By a common rule of thumb, an alpha of about 0.7 or higher is taken to indicate acceptable internal consistency, although some texts use a stricter threshold of 0.8.
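To make clear what the coefficient summarizes, alpha can also be computed by hand from the standard formula, alpha = (k / (k - 1)) * (1 - sum of item variances / variance of the total score), where k is the number of items. The sketch below does this in base R; the function name cronbach_alpha and the six rows of responses are made up for illustration and are not the Job Survey data.

```r
# Hypothetical helper: raw Cronbach's alpha from the standard variance formula.
cronbach_alpha <- function(items) {
  items <- na.omit(as.data.frame(items))  # drop respondents with missing items
  k <- ncol(items)                        # number of items in the scale
  item_vars <- sapply(items, var)         # variance of each item
  total_var <- var(rowSums(items))        # variance of the summed scale score
  (k / (k - 1)) * (1 - sum(item_vars) / total_var)
}

# Made-up responses from six respondents to the four autonomy items.
autonomy_items <- data.frame(
  autonom1 = c(4, 2, 5, 3, 1, 4),
  autonom2 = c(5, 2, 4, 3, 2, 4),
  autonom3 = c(4, 1, 5, 2, 1, 5),
  autonom4 = c(3, 2, 5, 3, 2, 4)
)

autonomy_alpha <- cronbach_alpha(autonomy_items)
print(round(autonomy_alpha, 2))
```

On the same complete data this should agree with the raw alpha reported by the psych package, which additionally reports a standardized alpha and item-level statistics.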
4. Consider the following drinking habits survey questions and classify the variables:
(a) Do you ever consume alcoholic drinks? Yes___ No___
Answer: Dichotomous
Explanation: The variable has only two possible categories (Yes or No), making it a dichotomous variable.
(b) Which alcoholic drink do you consume most frequently? Beer___ Spirits___ Wine___ Liquors___ Other___
Answer: Nominal
Explanation: The options represent distinct categories with no inherent order, making this a nominal variable.
(c) How frequently do you consume alcoholic drinks?
Daily___ Most days___ Once or twice a week___ Once or twice a month___ A few times a year___ Once or twice a year___
Answer: Ordinal
Explanation: The options represent a ranked order of frequency but do not have equal intervals between them, classifying this variable as ordinal.
(d) How many units of alcohol did you consume last week?
Answer: Interval/ratio
Explanation: This variable is numerical, with equal intervals between units and a meaningful zero (indicating no alcohol consumed). Thus, it is an interval/ratio variable.
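The four classifications above can be mirrored in how the variables would be stored in R. The following is a minimal sketch with made-up responses from five hypothetical respondents; all object names are assumptions for this example. Unordered factors stand in for dichotomous and nominal variables, an ordered factor for the ordinal variable, and a numeric vector for the interval/ratio variable.

```r
# (a) Dichotomous: a factor with exactly two categories.
drinks_ever <- factor(c("Yes", "Yes", "No", "Yes", "Yes"), levels = c("No", "Yes"))

# (b) Nominal: unordered categories (NA for the respondent who does not drink).
favourite <- factor(c("Beer", "Wine", NA, "Spirits", "Beer"),
                    levels = c("Beer", "Spirits", "Wine", "Liquors", "Other"))

# (c) Ordinal: ordered categories, listed from least to most frequent.
frequency_levels <- c("Once or twice a year", "A few times a year",
                      "Once or twice a month", "Once or twice a week",
                      "Most days", "Daily")
frequency <- factor(c("Daily", "Once or twice a month", NA,
                      "Most days", "Once or twice a week"),
                    levels = frequency_levels, ordered = TRUE)

# (d) Interval/ratio: a numeric count of units, on which arithmetic is meaningful.
units_last_week <- c(12, 4, NA, 20, 7)

print(table(favourite))                    # counts are meaningful for nominal data
print(frequency[1] > frequency[2])         # rank comparisons work for ordinal data
print(mean(units_last_week, na.rm = TRUE)) # means are meaningful for interval/ratio data
```

The level of measurement determines which summaries make sense: counts for nominal categories, rank comparisons for ordinal ones, and means only for interval/ratio values.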
Summary
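Understanding the types of variables and reliability is critical for proper data analysis. Classifying variables accurately ensures that appropriate statistical methods are applied, while evaluating reliability shows how consistent a measure is, both across time (external reliability) and across its items (internal reliability). Reliability is a precondition for validity but does not establish it on its own.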