Week 4 Quiz: R Programming (Data Science Specialization): Answers 2025
Question 1
What is produced at the end of this snippet of R code?
set.seed(1)
rpois(5, 2)
✅ A vector with the numbers 1, 1, 2, 4, 1
❌ It is impossible to tell because the result is random
❌ A vector with the numbers 3.3, 2.5, 0.5, 1.1, 1.7
❌ A vector with the numbers 1, 4, 1, 1, 5
Explanation: set.seed(1) fixes the state of R's random number generator, so rpois(5, 2) (five Poisson draws with rate 2) returns the same vector on every run: 1 1 2 4 1.
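The reproducibility described above can be checked directly. This is a minimal sketch; the variable names first, second, and third are illustrative and not part of the quiz.

```r
# Re-seeding before each call reproduces the same Poisson draws.
set.seed(1)
first <- rpois(5, 2)

set.seed(1)
second <- rpois(5, 2)

# Without re-seeding, the generator continues from its current state,
# so this third vector comes from a later stretch of the same stream.
third <- rpois(5, 2)

print(first)
print(identical(first, second))  # TRUE: same seed, same draws
```

The random stream itself is not restarted unless set.seed() is called again, which is why only the first two vectors are guaranteed to match.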
Question 2
What R function can be used to generate standard Normal random variables?
✅ rnorm
❌ dnorm
❌ qnorm
❌ pnorm
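Explanation: rnorm() draws random samples from the Normal distribution; with its default arguments (mean = 0, sd = 1) the draws are standard Normal. dnorm is the density, pnorm the CDF, and qnorm the inverse CDF (quantile function).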
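As a quick illustration of how the four Normal functions relate, here is a small sketch. The seed, the sample size, and the evaluation points are arbitrary choices made for this example.

```r
set.seed(42)
z <- rnorm(1000)       # defaults: mean = 0, sd = 1 (standard Normal)

print(dnorm(0))        # density at 0, equal to 1 / sqrt(2 * pi)
print(pnorm(1.96))     # CDF: P(Z <= 1.96)
print(qnorm(0.975))    # inverse CDF: the 97.5% quantile

# pnorm and qnorm undo each other (up to floating-point error).
round_trip <- pnorm(qnorm(0.975))
print(round_trip)
```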
Question 3
When simulating data, why is using the set.seed() function important? Select all that apply.
❌ It can be used to generate non-uniform random numbers.
❌ It ensures that the sequence of random numbers is truly random.
✅ It can be used to specify which random number generating algorithm R should use, ensuring consistency and reproducibility.
❌ It ensures that the random numbers generated are within specified boundaries.
Explanation: set.seed() fixes the starting state of the random number generator, so the same sequence is produced on every run. Its optional kind argument also selects which generator algorithm R uses. It does not change the shape of a distribution, enforce bounds, or make the numbers truly random.
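The kind argument mentioned in the correct option can be sketched as follows. The seed value 2025 is arbitrary, and "Mersenne-Twister" is simply R's default generator named explicitly.

```r
# Name the generator explicitly while seeding it.
set.seed(2025, kind = "Mersenne-Twister")
a <- runif(3)

# Same seed and same generator give the same draws.
set.seed(2025, kind = "Mersenne-Twister")
b <- runif(3)

print(identical(a, b))   # TRUE
print(RNGkind()[1])      # name of the uniform generator currently in use
```

Note that seeding only fixes where the stream starts; runif() still returns values in (0, 1) because of the distribution, not because of the seed.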
Question 4
Which function can be used to evaluate the inverse cumulative distribution function for the Poisson distribution?
✅ qpois
❌ dpois
❌ ppois
❌ rpois
Explanation: qpois gives quantiles (the inverse CDF). dpois is the probability mass function, ppois the CDF, and rpois generates random draws.
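A small worked example may make the inverse-CDF relationship concrete. The rate lambda = 2 and the probability 0.5 are values chosen for illustration only.

```r
lambda <- 2

print(ppois(1, lambda))   # P(X <= 1), about 0.406
print(ppois(2, lambda))   # P(X <= 2), about 0.677

# qpois returns the smallest count whose CDF reaches the given probability.
# The CDF first reaches 0.5 at x = 2, so the median count is 2.
median_count <- qpois(0.5, lambda)
print(median_count)
```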
Question 5
What does the following code do?
✅ Generate data from a Normal linear model
❌ Generate random exponentially distributed data
❌ Generate uniformly distributed random data
❌ Generate data from a Poisson generalized linear model
Explanation: The response is y = 0.5 + 2*x + e with e ~ Normal(0, 20), that is, a linear function of x plus Normal noise, which is a Normal linear model.
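The quiz's own code is not reproduced in this post, so the following is only a sketch consistent with the explanation. The seed, the sample size n, the way x is generated, and the reading of 20 as a standard deviation are all assumptions made for illustration.

```r
set.seed(10)                        # arbitrary seed (assumption)
n <- 100                            # hypothetical sample size
x <- rnorm(n)                       # hypothetical predictor
e <- rnorm(n, mean = 0, sd = 20)    # Normal noise; sd = 20 is assumed
y <- 0.5 + 2 * x + e                # intercept 0.5, slope 2, plus noise

# Fitting a linear model recovers noisy estimates of the two coefficients.
fit <- lm(y ~ x)
print(coef(fit))
```

With noise this large relative to the slope, the estimated coefficients can be far from 0.5 and 2 in any single simulated data set.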
Question 6
What R function can be used to generate Binomial random variables?
✅ rbinom
❌ dbinom
❌ qbinom
❌ pbinom
Explanation: rbinom() samples from the Binomial distribution; dbinom is the probability mass function, qbinom the quantile function, and pbinom the CDF.
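For illustration, a short sketch of Binomial sampling. The seed, the number of trials, and the probabilities are arbitrary values chosen for this example.

```r
set.seed(7)

# Ten single-trial draws behave like coin flips (0 or 1).
flips <- rbinom(10, size = 1, prob = 0.5)

# Five draws, each counting successes out of 20 trials.
heads <- rbinom(5, size = 20, prob = 0.3)

# The probability mass function sums to 1 over all possible counts.
pmf_total <- sum(dbinom(0:20, size = 20, prob = 0.3))

print(flips)
print(heads)
print(pmf_total)
```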
Question 7
What aspect of the R runtime does the profiler keep track of when an R expression is evaluated?
✅ the function call stack
❌ the package search list
❌ the global environment
❌ the working directory
Explanation: The R profiler samples the call stack (which functions are running) to estimate where time is spent.
Question 8
Consider the code:
Without running, what % of the run time is spent in lm, based on by.total normalization?
❌ 23%
❌ 50%
❌ It is not possible to tell
✅ 100%
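Explanation: by.total attributes to each function the time spent in it and in everything it calls. The only call evaluated between Rprof() and Rprof(NULL) is lm(...), so lm sits at the base of the call stack in essentially every sample, and the answer is 100%.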
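The quiz's code is not shown in this post; the sketch below is a stand-in with the same Rprof() / lm() / Rprof(NULL) shape. The simulated data, the variable names, the temporary file, and the sampling interval are assumptions made for illustration.

```r
set.seed(1)
n <- 1000000                       # large enough for the profiler to take samples
x1 <- rnorm(n)
x2 <- rnorm(n)
y <- 1 + 2 * x1 - x2 + rnorm(n)    # hypothetical data

prof_file <- tempfile(fileext = ".out")
Rprof(prof_file, interval = 0.01)  # start sampling the call stack
fit <- lm(y ~ x1 + x2)             # the only call inside the profiling window
Rprof(NULL)                        # stop profiling

prof <- summaryRprof(prof_file)
print(head(prof$by.total))         # time in each function plus its callees
print(head(prof$by.self))          # time in each function alone
```

In by.total, lm should appear at or near 100% because every sampled stack passes through it, while by.self spreads the same time across the lower-level functions that lm calls.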
Question 9
When using system.time(), what is the user time?
✅ It is the time spent by the CPU evaluating an expression
❌ It is a measure of network latency
❌ It is the time spent by the CPU waiting for other tasks to finish
❌ It is the “wall-clock” time it takes to evaluate an expression
Explanation: user is CPU time spent in user-level code; elapsed is wall-clock time.
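The difference between user and elapsed time can be seen by timing one CPU-bound expression and one that merely waits. This is a minimal sketch; the workload and the 0.5-second sleep are arbitrary choices for this example.

```r
# CPU-bound work: user time and elapsed time are usually of similar size.
cpu_bound <- system.time({
  x <- rnorm(1e6)
  s <- sort(x)
})

# Waiting: the clock advances, but the CPU does almost no work.
waiting <- system.time(Sys.sleep(0.5))

print(cpu_bound)
print(waiting)
```

For the sleeping call, elapsed should be roughly half a second while user stays close to zero.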
Question 10
If a computer has multiple processors and R can use them, which is true when using system.time()?
✅ elapsed time may be smaller than user time
❌ user time is 0
❌ user time is always smaller than elapsed time
❌ elapsed time is 0
Explanation: user sums CPU time across cores. If work is parallelized, total CPU time (user) can exceed wall-clock (elapsed), so elapsed can be smaller than user.
🧾 Summary Table
| Q# | ✅ Correct Answer | Key Concept |
|---|---|---|
| 1 | 1, 1, 2, 4, 1 | R RNG with set.seed() produces deterministic Poisson draws |
| 2 | rnorm | Sampling Normal random variables |
| 3 | Specify RNG/ensure reproducibility | set.seed() gives reproducible random sequences |
| 4 | qpois | Inverse CDF (quantile) for Poisson |
| 5 | Generate data from a Normal linear model | Linear model with Gaussian noise |
| 6 | rbinom | Sampling Binomial random variables |
| 7 | Function call stack | Profiler measures call stack/time per function |
| 8 | 100% | Profiling window covers lm() only |
| 9 | CPU time evaluating expression (user) | user = CPU time (not wall-clock) |
| 10 | elapsed may be smaller than user time | Parallel CPU sum can exceed wall-clock |