
Week 4 Quiz: R Programming (Data Science Specialization): Answers 2025

Question 1
What is produced at the end of this snippet of R code?
set.seed(1)
rpois(5, 2)

✅ A vector with the numbers 1, 1, 2, 4, 1
❌ It is impossible to tell because the result is random
❌ A vector with the numbers 3.3, 2.5, 0.5, 1.1, 1.7
❌ A vector with the numbers 1, 4, 1, 1, 5

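Explanation: set.seed(1) fixes the state of R’s random number generator, so rpois(5, 2) returns the same five Poisson draws (rate 2) on every run: 1 1 2 4 1. Poisson draws are non-negative integers, which also rules out the option with fractional values.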
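
One way to check this behaviour is to reseed and redraw. The sketch below uses only base R; the variable names first and second are illustrative and not part of the quiz.

```r
# Draw five Poisson(lambda = 2) values twice, starting from the same seed.
set.seed(1)
first <- rpois(5, 2)

set.seed(1)
second <- rpois(5, 2)

identical(first, second)  # the two vectors match element by element
is.integer(first)         # Poisson draws are whole numbers
all(first >= 0)           # and they are never negative
```

Changing the seed, or omitting set.seed() entirely, will generally give a different vector.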


Question 2
What R function can be used to generate standard Normal random variables?

✅ rnorm
❌ dnorm
❌ qnorm
❌ pnorm

Explanation: rnorm() draws random samples from the Normal distribution (standard Normal by default, with mean 0 and sd 1). dnorm evaluates the density, pnorm the CDF, and qnorm the inverse CDF (quantile function).
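
The four prefixes can be compared side by side. This is a minimal sketch in base R; the variable names are illustrative.

```r
set.seed(42)
z <- rnorm(1000)               # r: 1000 random draws, mean 0 and sd 1 by default

density_at_0 <- dnorm(0)       # d: height of the density curve at 0
prob_below   <- pnorm(1.96)    # p: P(Z <= 1.96), the CDF
quantile_975 <- qnorm(0.975)   # q: value with 97.5% of the mass below it

# qnorm undoes pnorm, up to floating-point error.
round_trip <- qnorm(pnorm(1.3))
all.equal(round_trip, 1.3)
```

The same d/p/q/r naming applies to the other distributions in this quiz, such as pois and binom.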


Question 3
When simulating data, why is using the set.seed() function important? Select all that apply.

❌ It can be used to generate non-uniform random numbers.
❌ It ensures that the sequence of random numbers is truly random.
✅ It can be used to specify which random number generating algorithm R should use, ensuring consistency and reproducibility.
❌ It ensures that the random numbers generated are within specified boundaries.

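Explanation: set.seed() fixes the starting state of R’s pseudo-random number generator, so the same sequence is produced on every run. Its optional kind argument also selects the generating algorithm. It does not make the numbers truly random, change the distribution being sampled, or constrain values to a range.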
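
The kind argument mentioned in the correct option can be passed explicitly. The sketch below assumes the "Mersenne-Twister" generator, which is R's default; the seed value 123 is arbitrary.

```r
# Seed the generator and name the algorithm explicitly.
set.seed(123, kind = "Mersenne-Twister")
a <- rnorm(3)

# Same seed and same algorithm: the draws repeat.
set.seed(123, kind = "Mersenne-Twister")
b <- rnorm(3)

identical(a, b)   # the sequences match
RNGkind()[1]      # name of the generator currently in use
```

Recording both the seed and the generator kind is a reasonable habit when results need to be reproduced on another machine or R version.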


Question 4
Which function can be used to evaluate the inverse cumulative distribution function for the Poisson distribution?

✅ qpois
❌ dpois
❌ ppois
❌ rpois

Explanation: qpois evaluates quantiles (the inverse CDF). dpois is the probability mass function (R calls it the density), ppois is the CDF, and rpois generates random draws.
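
For a discrete distribution, the quantile is the smallest value whose cumulative probability reaches the requested level. A small sketch with rate 2 (an arbitrary choice for illustration):

```r
lambda <- 2

ppois(1, lambda)   # P(X <= 1) = 3 * exp(-2), roughly 0.41
ppois(2, lambda)   # P(X <= 2) = 5 * exp(-2), roughly 0.68

# The median is the smallest x with P(X <= x) >= 0.5.
med <- qpois(0.5, lambda)
med                # 2, because the CDF first reaches 0.5 at x = 2
```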


Question 5
What does the following code do?

set.seed(10)
x <- rep(0:1, each = 5)
e <- rnorm(10, 0, 20)
y <- 0.5 + 2 * x + e

✅ Generate data from a Normal linear model
❌ Generate random exponentially distributed data
❌ Generate uniformly distributed random data
❌ Generate data from a Poisson generalized linear model

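Explanation: y = 0.5 + 2*x + e, where e is Normal noise with mean 0 and standard deviation 20 (the third argument of rnorm is sd, not the variance), is a linear model with Normal errors. Here x is a binary predictor: five 0s followed by five 1s.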
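
To see the model structure, the simulated data can be fitted back with lm(). The first four lines are the quiz code; the fit is an added illustration.

```r
set.seed(10)
x <- rep(0:1, each = 5)     # binary predictor: 0 0 0 0 0 1 1 1 1 1
e <- rnorm(10, 0, 20)       # Normal noise, mean 0 and sd 20
y <- 0.5 + 2 * x + e        # true intercept 0.5, true slope 2

fit <- lm(y ~ x)            # ordinary least squares fit of the same model
coef(fit)                   # estimated intercept and slope
```

With only 10 observations and a noise sd of 20, the estimates can land far from the true values 0.5 and 2.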


Question 6
What R function can be used to generate Binomial random variables?

✅ rbinom
❌ dbinom
❌ qbinom
❌ pbinom

Explanation: rbinom() samples from the Binomial; the others are density, quantile, and CDF functions.
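
A short sketch of rbinom() usage; the sizes and probabilities are arbitrary values chosen for illustration.

```r
set.seed(1)

# Ten coin flips: each draw is a single trial (size = 1), so values are 0 or 1.
flips <- rbinom(10, size = 1, prob = 0.5)

# Five draws, each counting successes out of 10 trials with success probability 0.3.
counts <- rbinom(5, size = 10, prob = 0.3)

flips
counts
```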


Question 7
What aspect of the R runtime does the profiler keep track of when an R expression is evaluated?

✅ the function call stack
❌ the package search list
❌ the global environment
❌ the working directory

Explanation: The R profiler samples the call stack (which functions are running) to estimate where time is spent.


Question 8
Consider the code:

library(datasets)
Rprof()
fit <- lm(y ~ x1 + x2)
Rprof(NULL)

Without running the code, what percentage of the run time is spent in lm, based on by.total normalization?

❌ 23%
❌ 50%
❌ It is not possible to tell
✅ 100%

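Explanation: by.total divides the time spent in a function, including everything it calls, by the total run time. The only work done between Rprof() and Rprof(NULL) is the lm(...) call, so lm sits at the base of every sampled call stack and accounts for 100% of the time.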
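
The quiz code does not define y, x1, or x2, so the sketch below simulates them (an assumption made only for illustration). It uses a large n so that the sampling profiler has time to record the call, and writes the profile to a temporary file.

```r
set.seed(1)
n  <- 1e6                          # large enough for the sampler to catch lm()
x1 <- rnorm(n)
x2 <- rnorm(n)
y  <- 1 + 2 * x1 - x2 + rnorm(n)   # hypothetical data, not from the quiz

prof_file <- tempfile()
Rprof(prof_file)                   # start sampling the call stack
fit <- lm(y ~ x1 + x2)
Rprof(NULL)                        # stop profiling

# by.total: time in each function including its callees, as a share of the total.
head(summaryRprof(prof_file)$by.total)
```

In the by.total table, the row for lm should show a total.pct at or near 100, while by.self spreads the time across the lower-level functions that lm calls.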


Question 9
When using system.time(), what is the user time?

✅ It is the time spent by the CPU evaluating an expression
❌ It is a measure of network latency
❌ It is the time spent by the CPU waiting for other tasks to finish
❌ It is the “wall-clock” time it takes to evaluate an expression

Explanation: user is CPU time spent in user-level code; elapsed is wall-clock time.
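
The difference between the two is easy to see with an expression that waits rather than computes. A minimal sketch; the half-second sleep is an arbitrary choice.

```r
# Sys.sleep() keeps the clock running but leaves the CPU essentially idle.
timing <- system.time(Sys.sleep(0.5))

timing[["user.self"]]   # CPU time spent in user-level code: close to 0
timing[["elapsed"]]     # wall-clock time: about half a second
```

The opposite pattern, where user time exceeds elapsed time, can appear when the work is spread over several cores.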


Question 10
If a computer has multiple processors and R can use them, which is true when using system.time()?

✅ elapsed time may be smaller than user time
❌ user time is 0
❌ user time is always smaller than elapsed time
❌ elapsed time is 0

Explanation: user sums CPU time across cores. If work is parallelized, total CPU time (user) can exceed wall-clock (elapsed), so elapsed can be smaller than user.


🧾 Summary Table

Q# | ✅ Correct Answer | Key Concept
1 | 1, 1, 2, 4, 1 | R RNG with set.seed() produces deterministic Poisson draws
2 | rnorm | Sampling Normal random variables
3 | Specify RNG/ensure reproducibility | set.seed() gives reproducible random sequences
4 | qpois | Inverse CDF (quantile) for Poisson
5 | Generate data from a Normal linear model | Linear model with Gaussian noise
6 | rbinom | Sampling Binomial random variables
7 | Function call stack | Profiler samples the call stack to estimate time per function
8 | 100% | Profiling window covers lm() only
9 | CPU time evaluating expression (user) | user = CPU time (not wall-clock)
10 | elapsed may be smaller than user time | CPU time summed across cores can exceed wall-clock time