The Data File

Before you can analyze data, you need to create a file which holds them. To illustrate the way in which these files are produced, we will use an imaginary set of data from a questionnaire study which is referred to as the Job Survey. The data relating to this study derive from two sources: a questionnaire study of employees who answer questions about themselves and a questionnaire study of their supervisors who answer questions relating to each of the employees. The questions asked are shown in Appendix 2.1, while the coding of the information or data collected is presented in Table 2.1. The cases consist of people, traditionally called respondents by sociologists and subjects by psychologists, whose preferred term is now participants. Although questionnaire data have been used as an example, it should be recognized that SPSS and the data analysis procedures may be used with other forms of quantitative data, such as official statistics or observational measures.

TABLE 2.1 The Job Survey data

At the top of each column is the SPSS name or label we have given to the variables in this survey.

SPSS names

Variable and file names in SPSS have to meet certain specifications. Unlike earlier versions of SPSS, they can now be longer than eight characters and can begin with a capital letter. However, it is probably preferable to keep the names short. Names must begin with an alphabetic character (A–Z). The remaining characters can be any letter, number, period, @ (at), $ (dollar) or _ (underscore). Blank spaces are not allowed, and names cannot end with a period and preferably should not end with an underscore. In addition, certain words, known as keywords, cannot be used because they can only be interpreted as commands by SPSS. They include words such as "add", "and", "any", "or" and "to". The SPSS names given to the variables in the Job Survey are presented in Table 2.2.
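The naming rules above can be expressed as a small check. The following is a minimal sketch in R, not part of SPSS itself: the function name is_valid_spss_name is hypothetical, and the keyword list holds only the five examples mentioned in the text, whereas SPSS reserves further words.

```r
# Keywords named in the text above; this list is illustrative, not complete.
spss_keywords <- c("add", "and", "any", "or", "to")

# Hypothetical helper: TRUE when a name satisfies the rules listed above.
is_valid_spss_name <- function(name) {
  # Must start with a letter; the rest may be letters, digits, . @ $ or _
  # (a blank space fails this pattern, so spaces are rejected here too).
  starts_and_chars_ok <- grepl("^[A-Za-z][A-Za-z0-9.@$_]*$", name)
  # Must not end with a period.
  ends_with_period <- grepl("\\.$", name)
  # Must not be one of the reserved keywords (compared case-insensitively).
  is_keyword <- tolower(name) %in% spss_keywords
  starts_and_chars_ok & !ends_with_period & !is_keyword
}

# Two names from Table 2.2 followed by four deliberately invalid candidates.
candidates <- c("ethnicgp", "satis1", "job sat", "1stitem", "income.", "to")
valid <- is_valid_spss_name(candidates)
print(data.frame(name = candidates, valid = valid))
```

Only the first two candidates pass: the others contain a blank, start with a digit, end with a period, or are a keyword. The softer recommendation to avoid a trailing underscore is a preference rather than a rule, so the sketch does not enforce it.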
TABLE 2.2 The SPSS names of the Job Survey variables

Variable name                 SPSS name
Identification number         id
Ethnic group                  ethnicgp
Gender                        gender
Gross annual income           income
Age                           age
Years worked                  years
Organizational commitment     commit
Job-satisfaction scale
  Item 1                      satis1
  Item 2                      satis2
  Item 3                      satis3
  Item 4                      satis4
Job-autonomy scale
  Item 1                      autonom1
  Item 2                      autonom2
  Item 3                      autonom3
  Item 4                      autonom4
Job-routine scale
  Item 1                      routine1
  Item 2                      routine2
  Item 3                      routine3
  Item 4                      routine4
Attendance at meeting         attend
Rated skill                   skill
Rated productivity            prody
Rated quality                 qual
Absenteeism                   absence

WEEK 5 - FORMATIVE ASSESSMENT

  1. In the Job Survey data, is absence a nominal, an ordinal, an interval/ratio, or a dichotomous variable?
  2. Is test-retest reliability a test of internal or external reliability?
  3. What would be the R procedure for computing Cronbach’s alpha for autonomy?
  4. Consider the following questions which might be used in a social survey about people’s drinking habits and decide whether the variable is nominal, ordinal, interval/ratio or dichotomous:
    (a) Do you ever consume alcoholic drinks? Yes____ No___
    (b) If you have ticked Yes to the previous question, which of the following alcoholic drinks do you consume most frequently (tick one category only)? Beer_____ Spirits____ Wine_____ Liquors____ Other_____
    (c) How frequently do you consume alcoholic drinks? Tick the answer that comes closest to your current practice. Daily____ Most days____ Once or twice a week____ Once or twice a month____ A few times a year_____ Once or twice a year____
    (d) How many units of alcohol did you consume last week? [We can assume that the interviewer would help respondents to translate into units of alcohol.] Number of units_____

The correct answers and explanations are:

  1. Absenteeism (absence) Variable Type: In the Job Survey data, absenteeism (absence) is best classified as an interval/ratio variable. Table 2.1 is not reproduced here, but in the Job Survey absence is recorded as the number of days each employee was absent, not as a simple yes/no flag. A count of days has equal units and a true zero (no days absent), so both differences and ratios between values are meaningful. The variable would be dichotomous only if it had been coded as absent versus not absent, with just two categories.
  2. Test-Retest Reliability: Test-retest reliability is a measure of external reliability. It assesses the consistency of results when the same test is administered to the same group of people at two different times. If the results are consistent across both administrations, the test is considered to have good external reliability. External reliability refers to the stability of a measure over time, which is what test-retest reliability examines. Internal reliability, by contrast, concerns whether the items that make up a scale are consistent with one another, which is what Cronbach’s alpha assesses.
  3. R Procedure for Computing Cronbach’s Alpha for Autonomy: To compute Cronbach’s alpha for the autonomy scale in R, you would typically use the psych package, which includes functions for reliability analysis. Assuming you have data for the four items on the autonomy scale (autonom1, autonom2, autonom3, autonom4), the R procedure would look like this:

```r
# Install the psych package (only needed once), then load it
install.packages("psych")
library(psych)

# Assuming your data is in a data frame called 'data'
autonomy_data <- data[, c("autonom1", "autonom2", "autonom3", "autonom4")]

# Compute Cronbach's alpha; the psych:: prefix avoids a clash with
# ggplot2::alpha() if ggplot2 is also loaded
psych::alpha(autonomy_data)
```

The alpha() function from the psych package computes Cronbach’s alpha, a measure of internal consistency for a set of items. Values closer to 1 indicate greater internal consistency.
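To see what the statistic measures, it can help to compute it by hand. The sketch below uses only base R and the standard formula, alpha = k/(k - 1) * (1 - sum of item variances / variance of the total score), where k is the number of items. The function name cronbach_alpha and the six rows of scores are invented for illustration and are not the Job Survey data. The sketch also assumes there are no missing values.

```r
# Hypothetical helper: Cronbach's alpha from a data frame or matrix of items
# (rows are respondents, columns are items; no missing values assumed).
cronbach_alpha <- function(items) {
  items <- as.matrix(items)
  k <- ncol(items)                   # number of items in the scale
  item_vars <- apply(items, 2, var)  # variance of each item
  total_var <- var(rowSums(items))   # variance of the summed scale score
  (k / (k - 1)) * (1 - sum(item_vars) / total_var)
}

# Made-up scores for six respondents on the four autonomy items.
toy <- data.frame(
  autonom1 = c(2, 3, 4, 5, 1, 3),
  autonom2 = c(2, 4, 4, 5, 2, 3),
  autonom3 = c(3, 3, 5, 4, 1, 2),
  autonom4 = c(2, 3, 4, 5, 1, 4)
)

alpha_value <- cronbach_alpha(toy)
print(alpha_value)
```

When the items rise and fall together, the variance of the total score is much larger than the sum of the item variances, and alpha approaches 1. On the same complete data this should agree with the raw alpha reported by the psych package.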
  4. Survey Questions Analysis:
    a. Do you ever consume alcoholic drinks? Yes____ No___
    • Dichotomous Variable: This question has two possible answers (Yes or No), making it a dichotomous variable.
    b. If you have ticked Yes to the previous question, which of the following alcoholic drinks do you consume most frequently (tick one category only)? Beer_____ Spirits____ Wine_____ Liquors____ Other_____
    • Nominal Variable: This is a nominal variable because it categorizes respondents based on their most frequently consumed alcoholic drink. The categories have no inherent order.
    c. How frequently do you consume alcoholic drinks? Tick the answer that comes closest to your current practice. Daily____ Most days____ Once or twice a week____ Once or twice a month____ A few times a year_____ Once or twice a year____
    • Ordinal Variable: This is an ordinal variable, as the categories represent an ordered progression from frequent to infrequent consumption. The answers have a meaningful order, though the intervals between them are not necessarily equal.
    d. How many units of alcohol did you consume last week? [We can assume that the interviewer would help respondents to translate into units of alcohol.] Number of units_____
    • Interval/Ratio Variable: This is an interval/ratio variable. The number of units is a numeric quantity with equal intervals and a true zero point (no units consumed), so both differences and ratios are meaningful: 10 units is twice as much as 5. That places it on a ratio scale.
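The four levels of measurement above map naturally onto R data types. The following is a minimal sketch, assuming a small data frame of invented responses: the column names drinker, favourite, frequency and units and the four rows are hypothetical, and only the category labels come from the questions.

```r
# Frequency categories from question (c), listed from least to most frequent.
freq_levels <- c("Once or twice a year", "A few times a year",
                 "Once or twice a month", "Once or twice a week",
                 "Most days", "Daily")

# Invented responses from four people; the third does not drink,
# so questions (b) and (c) are missing (NA) for that person.
drinks <- data.frame(
  # (a) dichotomous: a logical value
  drinker = c(TRUE, TRUE, FALSE, TRUE),
  # (b) nominal: an unordered factor
  favourite = factor(c("Beer", "Wine", NA, "Spirits"),
                     levels = c("Beer", "Spirits", "Wine", "Liquors", "Other")),
  # (c) ordinal: an ordered factor
  frequency = factor(c("Daily", "Once or twice a month", NA, "Most days"),
                     levels = freq_levels, ordered = TRUE),
  # (d) interval/ratio: a numeric count of units
  units = c(21, 4, 0, 12)
)

str(drinks)

# Ordered factors support comparisons; unordered factors do not.
more_often <- drinks$frequency[1] > drinks$frequency[2]
print(more_often)

# Arithmetic such as a mean is only meaningful for the interval/ratio variable.
print(mean(drinks$units))
```

Declaring the level of measurement in this way lets R reject operations that do not suit the variable, such as comparing the categories of an unordered factor.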