Amy L. Gill and Rafael A. Irizarry
Solution Manual for Introduction to Data Science
Part I
R
1
Chapter 2: R basics
1.1 Section 2.3 Exercises
1. What is the sum of the first 100 positive integers? The formula for the sum of integers 1 through n is n(n + 1)/2. Define n = 100 and then use R to compute the sum of 1 through 100 using the formula. What is the sum?
Answer:
n <- 100 # define n
n*(n+1)/2 # note use of multiplication symbol and parentheses
#> [1] 5050
2. Now use the same formula to compute the sum of the integers from 1 through 1,000.
Answer:
n <- 1000 # change definition of n
n*(n+1)/2 # same formula as Q1
#> [1] 5e+05
Note that 5e+05 is a rounded display in scientific notation; the exact value is 500500.
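To see every digit of the result rather than the 5e+05 display, one option is base R's format() with scientific = FALSE. The sketch below is illustrative only: the variable name total is not from the manual, and the guess that the short display comes from a reduced digits setting is an assumption.

```r
n <- 1000
total <- n * (n + 1) / 2   # same formula as in the answer; `total` is an illustrative name

# Request fixed (non-scientific) notation so no digits are hidden
format(total, scientific = FALSE)
#> [1] "500500"

# The stored value is exact, whatever the display looks like
total == 500500
#> [1] TRUE
```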
3. Look at the result of typing the following code into R:
n <- 1000
x <- seq(1, n)
sum(x)
Based on the result, what do you think the functions seq and sum do? You can use help.
A. sum creates a list of numbers and seq adds them up.
B. seq creates a list of numbers and sum adds them up.
C. seq creates a random list and sum computes the sum of 1 through 1,000.
D. sum always returns the same number.
Answer:
B. You can check the documentation using ?seq and ?sum.
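As a quick check of what the two functions do, the sketch below builds the sequence with seq, adds it up with sum, and compares the result with the closed-form formula from exercise 1. The name by_formula is illustrative and not from the manual.

```r
n <- 1000
x <- seq(1, n)        # the numbers 1, 2, ..., 1000
length(x)             # how many elements the sequence has
#> [1] 1000
head(x)               # first few elements
#> [1] 1 2 3 4 5 6

by_formula <- n * (n + 1) / 2   # closed-form sum from exercise 1
sum(x) == by_formula            # sum() adds up the elements of x
#> [1] TRUE
```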
4. In math and programming, we say that we evaluate a function when we replace the argument with a given number. So if we type sqrt(4), we evaluate the sqrt function. In R, you can evaluate a function inside another function. The evaluations happen from the inside out. Use one line of code to compute the log, in base 10, of the square root of 100.
Answer:
log10(sqrt(100)) # equivalent to: log(sqrt(100), base = 10)
#> [1] 1
5. Which of the following will always return the numeric value stored in x? You can try out examples and use the help system if you want.
A. log(10^x)
B. log10(x^10)
C. log(exp(x))
D. exp(log(x, base = 2))
Answer:
C. In R, log has a default base of e. Therefore, log and exp are inverse functions.
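The four options can also be compared numerically. The sketch below uses a test value of x = 3, which is arbitrary and not from the manual. A single test value can only rule options out, because some of the wrong options happen to match x at special values (for example, 10 * log10(x) equals x when x = 10).

```r
x <- 3  # arbitrary test value, chosen for illustration

log(10^x)               # natural log of 10^x, which is x * log(10)
log10(x^10)             # equals 10 * log10(x)
log(exp(x))             # log and exp undo each other
exp(log(x, base = 2))   # e raised to a base-2 logarithm

# Compare with x, allowing for floating-point rounding
isTRUE(all.equal(log(exp(x)), x))
#> [1] TRUE
isTRUE(all.equal(log(10^x), x))
#> [1] FALSE
```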
1.2 Section 2.5 Exercises
1. Load the US murders dataset.
library(dslabs)
data("murders")
Use the function str to examine the structure of the murders object. We can see that this object is a data frame with 51 rows and five columns. Which of the following best describes the variables represented in this data frame?
A. The 51 states
B. The murder rates for all 50 states and DC
C. The state name, the abbreviation of the state name, the state’s region, and the state’s population and total number of murders for 2010
D. str shows no relevant information
Answer:
C. Check the output of the code str(murders).
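For readers without dslabs at hand, the kind of information str reports can be seen on a small stand-in. The data frame toy below is hypothetical: its column names and values are placeholders made up for illustration and are not the murders data.

```r
# Hypothetical three-row data frame; values are placeholders, not real data
toy <- data.frame(
  name  = c("a", "b", "c"),
  count = c(10, 20, 30),
  stringsAsFactors = FALSE
)

str(toy)      # prints the object's class, its dimensions, and each column's type
dim(toy)      # number of rows, then number of columns
#> [1] 3 2
names(toy)    # column names, as used in the next exercise
```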
2. What are the column names used by the data frame for these five variables?
Answer:
names(murders) # find column names
#> [1] "state" "abb" "region" "population" "total"
3. Use the accessor $ to extract the state abbreviations and assign them to the object a. What is the class of this object?
Answer: