Introduction to R for Data Science
R is a powerful programming language and environment specifically designed for statistical computing and data analysis. It is widely used among statisticians and data miners for developing statistical software and performing data analysis. If you’re getting started with R for data science, here are some key points to understand:
- Installation: You can download and install R from the Comprehensive R Archive Network (CRAN) website (cran.r-project.org). Additionally, you might want to use RStudio, which is a popular integrated development environment (IDE) for R that makes it easier to work with.
- Basics of R: R is an interpreted language, which means you can run commands interactively at the console or write scripts and functions. It has a rich set of built-in functions and libraries (called packages) that extend its functionality.
- Data Structures: R supports several data structures, including vectors, matrices, data frames, and lists. Understanding these structures is crucial for manipulating and analyzing data.
- Data Manipulation: R provides powerful tools for data manipulation, transformation, and cleaning. Functions like subset(), merge(), aggregate(), and many others are commonly used.
- Statistical Analysis: R is particularly strong in statistical modeling and analysis. You can perform regression analysis, hypothesis testing, clustering, time-series analysis, and more using built-in functions such as lm(), t.test(), kmeans(), and arima().
- Data Visualization: R offers excellent tools for data visualization through packages like ggplot2, which allows you to create highly customizable plots and graphs for exploratory data analysis and presentation.
- Packages: R’s functionality can be extended by installing packages from CRAN or other repositories. These packages cover a wide range of topics including machine learning, text mining, spatial analysis, and more.
- Reproducibility: R promotes reproducible research through its ability to create scripts and documents (using tools like R Markdown) that combine code, results, and explanatory text in a single document.
- Community and Resources: R has a large and active community of users and developers. This means there are plenty of tutorials, forums, and online resources to help you learn and solve problems.
- Learning Resources: To get started, consider resources like the official R documentation (r-project.org/documentation.html), online courses (such as those on Coursera or DataCamp), books like “R for Data Science” by Hadley Wickham and Garrett Grolemund, and example code and scripts.
R’s flexibility and extensive libraries make it a preferred choice for many data scientists and statisticians. Whether you’re analyzing data, building models, or creating visualizations, R provides a comprehensive environment for all stages of the data science workflow.
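The functions named in the list above can be tried on a dataset that ships with R. The following is a minimal sketch, assuming a base R session; the choice of the built-in mtcars dataset and the variable names are illustrative and not part of the course material.

```r
# mtcars ships with base R, so nothing needs to be downloaded.
data(mtcars)

# subset(): keep the 4-cylinder cars and two columns of interest.
four_cyl <- subset(mtcars, cyl == 4, select = c(mpg, wt))

# aggregate(): mean miles-per-gallon for each cylinder count.
mpg_by_cyl <- aggregate(mpg ~ cyl, data = mtcars, FUN = mean)

# lm(): simple linear regression of mpg on weight.
fit <- lm(mpg ~ wt, data = mtcars)

print(head(four_cyl))
print(mpg_by_cyl)
print(coef(fit))
```

Each call returns an ordinary R object (two data frames and a fitted model), so the results can be inspected or passed on to further steps.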
R for Data Science Cognitive Class Certification Answers
Module 1 – R Basics Quiz Answers
Question 1: Vectors in R can be which of the following types?
- Logical
- Numeric
- Character
- All of the above
Question 2: What would be the output in R given: c(1,2) == 1 ?
- FALSE TRUE
- TRUE FALSE
- FALSE FALSE
- TRUE TRUE
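Comparison operators in R work element by element on vectors. A quick check of the expression from Question 2, as a sketch in a base R session (the variable names are illustrative):

```r
x <- c(1, 2)

# == compares every element of x with 1 and returns a logical vector.
result <- x == 1
print(result)  # TRUE FALSE
```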
Question 3: How would you retrieve the items larger than 5 (as in 15 and 10) from the following vector: costs <- c(3, 15, 3, 10)?
- costs[15,10]
- costs[c(15,10)]
- costs(costs > 5)
- costs[costs > 5]
- costs > 5
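Logical indexing, as asked about in Question 3, can be tried directly. A minimal sketch using the costs vector from the question (the names mask and expensive are illustrative):

```r
costs <- c(3, 15, 3, 10)

# The comparison alone gives a logical vector, not the values.
mask <- costs > 5
print(mask)       # FALSE TRUE FALSE TRUE

# Using that logical vector inside [ ] keeps only the TRUE positions.
expensive <- costs[mask]
print(expensive)  # 15 10
```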
Module 2 – Data Structures in R Quiz Answers
Question 1: Given a 5 x 5 matrix object, movies, how would you retrieve the bottom-left item?
- movies[1,5]
- movies(5,5)
- movies[5,1]
- movies[5,5]
- movies["bottom-left"]
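Matrix indexing uses the form [row, column]. The sketch below assumes a stand-in 5 x 5 matrix filled with the numbers 1 to 25, since the question does not give the contents of movies:

```r
# Stand-in data: matrix() fills column by column, so column 1 holds 1..5.
movies <- matrix(1:25, nrow = 5, ncol = 5)

bottom_left <- movies[5, 1]  # row 5, column 1
top_right <- movies[1, 5]    # row 1, column 5

print(bottom_left)  # 5
print(top_right)    # 21
```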
Question 2: Below we create a list for a student and his info. Select all the correct options we can use to retrieve his courses.
john <- list("studentid" = 9, "age" = 18, "courses" = c("Data Science 101", "Data Science Methodology"))
- john["courses"]
- john[3]
- john$courses
- All the above options are correct
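The three access forms in Question 2 all reach the courses, though they do not return the same kind of object. A minimal sketch, using the list from the question (the result variable names are illustrative):

```r
john <- list("studentid" = 9, "age" = 18,
             "courses" = c("Data Science 101", "Data Science Methodology"))

by_name <- john["courses"]   # single brackets: a list of length 1
by_position <- john[3]       # same sub-list, selected by position
by_dollar <- john$courses    # $ extracts the character vector itself

print(class(by_name))    # "list"
print(class(by_dollar))  # "character"
```

Double brackets, as in john[["courses"]], behave like $ and return the vector rather than a one-element list.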
Question 3: Select the correct code from the following options which produces the following result?
- data.frame("student" = c("john", "mary"), "id" = c(1, 2))
- array("student" = c("john", "mary"), "id" = c(1, 2))
- data.frame(c("john", "mary"), c(1, 2))
- data.frame(student = c(john, mary), id = c(1, 2))
- list("student" = c("john", "mary"), "id" = c(1, 2))
Module 3 – R Programming Fundamentals Quiz Answers
Question 1: What output will the following produce?
chance_precipitation <- 0.80
if (chance_precipitation > 0.5) {
  print("Bring an umbrella")
} else {
  print("Don't bring an umbrella")
}
- "Thunderstorm warning"
- "Don't bring an umbrella"
- "Bring an umbrella"
- Some sort of error
Question 2: Which of the following statements are true?
- Using return() when writing a function is optional when you just want the result of the last line in the function to be the output of the function.
- Using return() when writing a function is necessary even when you just want the result of the last line in the function to be the output of the function.
- Using return() is useful when you want to produce outputs based on different conditions.
- Using return() serves no purpose when you want to produce outputs based on different conditions.
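The two uses of return() described in Question 2 can be seen side by side. This is a sketch with two hypothetical functions, add_one and describe_sign, written only for illustration:

```r
# Without return(): the value of the last expression is the result.
add_one <- function(x) {
  x + 1
}

# With return(): exit early with a different output per condition.
describe_sign <- function(x) {
  if (x < 0) {
    return("negative")
  }
  if (x == 0) {
    return("zero")
  }
  "positive"  # last expression, returned implicitly
}

print(add_one(1))         # 2
print(describe_sign(-3))  # "negative"
print(describe_sign(0))   # "zero"
print(describe_sign(7))   # "positive"
```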
Question 3: Which of the following would you use to check the class of the object, myobject?
- class(myobject)
- type(myobject)
- class(object)
- class[myobject]
Module 4 – Working with Data in R Quiz Answers
Question 1: What does CSV stand for, when talking about tabular data files?
- Column-sorted values
- Comma-separated values
- Commonly-spaced values
- Column-separated values
- None of the above
Question 2: Which of the following are true?
- read.csv() can be used to read in CSV files
- You need to install libraries, such as the “readxl” library, to read Excel files into R
- You can load specified datasets or list the available datasets using data()
- You can write to a variety of filetypes, including .txt, .csv, .xls, .xlsx, and .Rdata.
Question 3: To get the number of characters in a character vector, char_vec, what function can you use?
- nchar(char_vec)
- numberOfCharacters[char_vec]
- char_vec.nchar()
- length(char_vec)
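The difference between nchar() and length() is easy to confirm. A minimal sketch, assuming an example vector whose contents are made up for illustration:

```r
# Example contents are illustrative; the question does not define char_vec.
char_vec <- c("apple", "fig")

chars_per_element <- nchar(char_vec)  # characters in each string
n_elements <- length(char_vec)        # number of strings in the vector

print(chars_per_element)  # 5 3
print(n_elements)         # 2
```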
Module 5 – Strings and Dates in R Quiz Answers
Question 1: How would you combine the individual words from the vector, hw, into a single string, “Hello World”?
hw <- c("Hello", "World")
- paste(hw, collapse = " ")
- paste("Hello", "World")
- tolower("Hello", "World")
- c(hw[1], hw[2])
- None of the above
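The collapse argument of paste() is what joins the elements of one vector into a single string. A short sketch using the hw vector from Question 1 (the result variable names are illustrative):

```r
hw <- c("Hello", "World")

# collapse joins the elements of a single vector with the given separator.
joined <- paste(hw, collapse = " ")
print(joined)  # "Hello World"

# c() only rebuilds a two-element vector; nothing is combined.
still_two <- c(hw[1], hw[2])
print(length(still_two))  # 2
```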
Question 2: How would you convert the character string “2020-01-01” into a Date object in R?
- as.Date("2020-01-01")
- convertToDate("2020-01-01")
- date("2020-01-01")
- Sys.Date()
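Once a string has been converted with as.Date(), it supports date arithmetic. A minimal sketch in base R (the variable names are illustrative):

```r
d <- as.Date("2020-01-01")  # ISO format (YYYY-MM-DD) is parsed by default

print(class(d))  # "Date"

# Adding a number to a Date moves it forward by that many days.
later <- d + 30
print(format(later))  # "2020-01-31"
```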
Question 3: What does the following regular expression pattern mean?
".*@.+"
- Find matches containing an @ symbol where there are one or more characters before the @ symbol, and zero or more characters after the @ symbol.
- Find matches containing an @ symbol where there are one or more characters before the @ symbol, and at least one character after the @ symbol.
- Find matches containing an @ symbol where there are zero or more characters before the @ symbol, and at least one character after the @ symbol.
- It’s actually a new emoticon.
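The meaning of the pattern can be checked with grepl(). In the sketch below the test strings are made up for illustration: `.*` allows zero or more characters before the @, and `.+` requires at least one character after it.

```r
pattern <- ".*@.+"

# Hypothetical test strings, chosen to exercise each part of the pattern.
candidates <- c("user@example.com", "@handle", "user@", "no-at-sign")

matches <- grepl(pattern, candidates)
print(matches)  # TRUE TRUE FALSE FALSE
```

"@handle" matches because nothing is required before the @, while "user@" fails because nothing follows it.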
R for Data Science Final Exam Answers
Question 1: Which of the following will return TRUE?
- 1 > 2
- "Apples" = "Bananas"
- TRUE = FALSE
- 2.1 in c(1.5, 3.14)
- None of the above
Question 2: Which of the following will print out the numbers 1, 2, and 3 only?
- for(num in c(1,2,3)) {print(num)}
- c(1,2,3)
- c(1:3)
- c(0,1,2,3,4,5)[2:4]
Question 3: How would you get the average of: ratings <- c(8.0, 8.5, 9.0)
- mean("ratings")
- AVER("rating")
- mean(ratings)
- average[ratings]
Question 4: How would you convert the following character vector into an integer vector?
my_vector <- c("1992", "2016", "2012", "2018")
- as.integer(my_vector)
- as.numeric(my_vector)
- tointeger(my_vector)
- converttointeger(my_vector)
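Both as.integer() and as.numeric() exist in base R and both convert the strings to numbers, but only one of them yields an integer vector. A quick sketch using the vector from Question 4 (the result variable names are illustrative):

```r
my_vector <- c("1992", "2016", "2012", "2018")

as_int <- as.integer(my_vector)
as_num <- as.numeric(my_vector)

print(typeof(as_int))  # "integer"
print(typeof(as_num))  # "double"
```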
Question 5: If you know an error might occur, what can you do?
- Think about why the error is happening and attempt to fix the code.
- Catch the error using the tryCatch() function.
- All of the above.
- Give up entirely.
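Catching an error with tryCatch() might look like the sketch below. The helper safe_log is hypothetical and written only for illustration; it returns NA instead of stopping when log() raises an error.

```r
# Hypothetical helper: returns NA_real_ if log() signals an error.
safe_log <- function(x) {
  tryCatch(
    log(x),
    error = function(e) {
      message("Could not compute log: ", conditionMessage(e))
      NA_real_
    }
  )
}

ok <- safe_log(exp(1))   # works normally
bad <- safe_log("ten")   # log() errors on a character argument

print(ok)   # 1
print(bad)  # NA
```

Only errors are handled here; warnings would need a separate warning handler.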
Question 6: You have a file, “november.csv”, in the directory, “/Documents/expenses/”. How do you read this file into R?
- readLines(/Documents/expenses/november.csv)
- read.csv("/Documents/expenses/november.csv")
- read.csv(/Documents/expenses/november.csv)
- read.csv("november.csv", folder = "Documents/expenses", type = "csv")
- None of the above
Question 7: You opened a dataset and noticed a row showing Leonardo DiCaprio’s birthday as 153360000. What does it mean?
- 153360000 is a UNIX timestamp; it is the number of seconds since 1970-01-01 00:00:00.
- Leonardo DiCaprio was born on March 15, 1936 at 00:00.
- The data is definitely corrupt.
- Leonardo DiCaprio will be born in the year 153360000.
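A UNIX timestamp can be converted to a readable date with as.POSIXct(). A minimal sketch, assuming the value is in seconds and interpreting it in UTC:

```r
# Seconds since 1970-01-01 00:00:00, interpreted in UTC.
birthday <- as.POSIXct(153360000, origin = "1970-01-01", tz = "UTC")

print(format(birthday, "%Y-%m-%d"))  # "1974-11-11"
```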
Question 8: The following code will produce which of the following outputs?
grep("milk.+", c("cow's milk", "milkshake", "milky", "cat", "milk1", "milk"), value = TRUE)
- "milkshake" "milky" "milk1"
- 2 3 5 6
- "milky" "milk1"
- "milkshake" "milky" "milk1" "milk"
- "cow's milk" "milkshake" "milky" "cat" "milk1" "milk"
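The call from Question 8 can be run as is. The sketch below also shows what grep() returns without the value argument; the variable names are illustrative.

```r
words <- c("cow's milk", "milkshake", "milky", "cat", "milk1", "milk")

# "milk.+" needs at least one character after "milk".
matched <- grep("milk.+", words, value = TRUE)
print(matched)  # "milkshake" "milky" "milk1"

# Without value = TRUE, grep() returns the matching positions instead.
positions <- grep("milk.+", words)
print(positions)  # 2 3 5
```

"cow's milk" and "milk" are excluded because nothing follows "milk" in either string.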
Question 9: You want to split a full name, “John Doe”, into a vector containing two elements: “John” and “Doe”. How would you do so?
fullname <- "John Doe"
- unlist(strsplit(fullname, " "))
- strsplit(fullname, " ")
- None of the above.
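strsplit() always returns a list, which is why the extra unlist() step matters when a plain vector is wanted. A short sketch using the string from Question 9 (the result variable names are illustrative):

```r
fullname <- "John Doe"

parts_list <- strsplit(fullname, " ")  # a list holding one character vector
parts <- unlist(parts_list)            # flattened to a plain character vector

print(class(parts_list))  # "list"
print(parts)              # "John" "Doe"
```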
Question 10: In R, x <- 1 is the same as x == 1
- True
- False
Question 11: Look at the code below. How many levels does the factor, drinks, have?
drinks <- factor(c("tea", "coffee", "soft drink", "tea", "hot chocolate", "hot chocolate", "coffee"))
- 1
- 3
- 5
- 7
- None of the above
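The number of levels in a factor is the number of distinct values, not the number of elements. A quick check with nlevels(), using the factor from Question 11:

```r
drinks <- factor(c("tea", "coffee", "soft drink", "tea",
                   "hot chocolate", "hot chocolate", "coffee"))

print(length(drinks))   # 7 elements
print(nlevels(drinks))  # 4 distinct levels
print(levels(drinks))
```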
Question 12: To remove an existing column, “firstname”, from a data frame named “people”, which of the following code should you use?
- firstname <- NA
- people$firstname <- FALSE
- people("firstname") <- NA
- people$firstname <- NULL
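Assigning NULL to a column removes it, while assigning NA or FALSE only overwrites its values. A minimal sketch; the contents of the people data frame are made up for illustration.

```r
# Illustrative data; the question does not give the contents of people.
people <- data.frame(firstname = c("Ann", "Bob"), age = c(30, 41))

people$firstname <- NULL  # drops the column entirely

print(names(people))  # "age"
```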
Question 13: To retrieve the third row of an array named “myarray”, which of the following code should you use?
- myarray[3]
- myarray[,3]
- myarray(row = "third")
- myarray[3,]
Question 14: How would you get the average of the third column of a data frame named “df”?
- mean(df[3,])
- mean(df[,3])
- mean(df[3])
- df[,3].mean()
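The difference between df[, 3] and df[3] explains why only one of them works with mean(). A sketch with a made-up data frame (the column names and values are illustrative):

```r
# Illustrative data; the question does not give the contents of df.
df <- data.frame(a = 1:3, b = 4:6, c = c(7, 8, 9))

third_col <- df[, 3]      # a numeric vector
third_as_df <- df[3]      # still a data frame with one column

col_mean <- mean(third_col)
print(col_mean)                    # 8
print(is.data.frame(third_as_df))  # TRUE
```

mean() expects a numeric vector, so the one-column data frame from df[3] is not suitable input.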
Question 15: What is the expected output of the following script?
myfunc <- function(x, y = 2) {
  x <- x + 10
  y <- y + 100
  return(y)
}
myfunc(3)
- 102
- 3
- 13
- 2
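Running the script shows how the default argument is used. The sketch below repeats the function from Question 15 and adds one extra call, with y supplied explicitly, for comparison; the result variable names are illustrative.

```r
myfunc <- function(x, y = 2) {
  x <- x + 10   # changes only the local copy of x
  y <- y + 100
  return(y)     # x is never returned
}

default_y <- myfunc(3)         # y falls back to its default of 2
explicit_y <- myfunc(3, y = 5) # y supplied by the caller

print(default_y)   # 102
print(explicit_y)  # 105
```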