Introduction

Pseudo-random number generators (PRNGs) power much of what goes on “behind the scenes” in statistical and cryptographic settings. R ships with several built-in PRNGs, which let users generate random variables from a variety of distributions. To check which PRNG your R session is currently using (the default, unless you have changed it), run the following:

# Using version 4.2.2 (at time of writing)
# Returns the methods used for
# 1. The "default" random number generation, 
# 2. Normal variable generation 
# 3. Discrete uniform variable generation

RNGkind()
#> [1] "Mersenne-Twister" "Inversion"        "Rejection"

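The PRNG that R uses can be changed by passing the kind argument (and, optionally, normal.kind and sample.kind) to RNGkind() or set.seed(), typically at the beginning of an R script. The generator's state is stored in the .Random.seed object in the global environment; it is an object rather than an argument, and it should be set through these functions rather than edited by hand.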
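As a small illustration of the paragraph above, the sketch below selects the generators and the seed with set.seed() and then switches the uniform generator with RNGkind(). It uses only base R; the variable names (old_kinds, first_run, and so on) are chosen for this example, and the sample.kind argument assumes R 3.6.0 or later.

```r
# Remember the kinds currently in use so they can be restored afterwards
old_kinds <- RNGkind()

# Select the generators and the seed in a single call
set.seed(1234, kind = "Mersenne-Twister", normal.kind = "Inversion",
         sample.kind = "Rejection")
first_run <- runif(3)

# The same kinds and seed reproduce the same draws
set.seed(1234, kind = "Mersenne-Twister", normal.kind = "Inversion",
         sample.kind = "Rejection")
second_run <- runif(3)
identical(first_run, second_run)

# A different uniform generator gives a different stream from the same seed
RNGkind(kind = "Wichmann-Hill")
set.seed(1234)
other_run <- runif(3)
identical(first_run, other_run)

# Restore whatever was in use before this example
RNGkind(kind = old_kinds[1], normal.kind = old_kinds[2],
        sample.kind = old_kinds[3])
```

Setting the kinds explicitly, rather than relying on the session defaults, makes a script less sensitive to the R version it runs on.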

While it is possible to plug in your own PRNG through R's user-supplied random number generation interface, such an approach is pedagogically complex. For users interested in working with and learning about random number generation, the tools currently available still leave a degree of complexity and mystery around how random numbers are generated from the seed(s) and parameters supplied to a given PRNG. The randngen package provides a suite of PRNGs that aim to be easy to use and flexible enough to expose the relevant maths and algorithms, by allowing users to specify all relevant parameters.
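To make the link between seed, parameters, and output concrete, here is a minimal linear congruential generator written in base R, using the recurrence x[k+1] = (a * x[k] + c) mod m. The function name lcg_sketch() and its default constants (the Park-Miller "minimal standard" values) are assumptions chosen for this illustration; this is not the implementation behind randngen::lcg(), whose parameters and defaults may differ.

```r
# Hypothetical helper for illustration only (not part of randngen)
lcg_sketch <- function(seed, n, a = 16807, c = 0, m = 2^31 - 1) {
  stopifnot(n >= 1, seed >= 0, seed < m)
  x <- numeric(n)
  state <- seed
  for (i in seq_len(n)) {
    # Next state depends only on the previous state and (a, c, m).
    # With these defaults a * state stays below 2^53, so the
    # double-precision arithmetic is exact.
    state <- (a * state + c) %% m
    x[i] <- state
  }
  x
}

raw <- lcg_sketch(seed = 1, n = 3)
raw
#> [1]      16807  282475249 1622650073

# Dividing by the modulus maps the integers into [0, 1)
raw / (2^31 - 1)
```

Every value is fully determined by the seed and the three parameters, which is the property a teaching-oriented PRNG package can expose directly.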

From PRNG to Uniform Random Variables (and beyond!)

library(randngen)

# Min-max scaling of a vector onto [0, 1]
# Function from here: https://stackoverflow.com/questions/5468280/scale-a-series-between-two-points
range01 <- function(x) {
  (x - min(x)) / (max(x) - min(x))
}


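# Uniform values from the package's LCG, rescaled onto [0, 1]
randngen::lcg(seed = 1234, n = 10000) |>
  range01() |>
  hist(main = "0-1 Scaled Uniform Random Variables with randngen::lcg()")

# Base R equivalent for comparison
set.seed(1234)
runif(10000) |>
  hist(main = "0-1 Uniform Random Variables with runif()")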

Applying the Inverse Probability Transform to Generate Random Variables

library(randngen)

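# range01() maps the smallest and largest draws to exactly 0 and 1,
# so qnorm() returns -Inf and Inf for them; hist() drops non-finite values
randngen::lcg(seed = 1234, n = 10000) |>
  range01() |>
  qnorm() |>
  hist(main = "Normal(0,1) Random Variables using\nscaled randngen::lcg() values and qnorm()")

# Base R equivalent for comparison
set.seed(1234)
runif(10000) |>
  qnorm() |>
  hist(main = "Normal(0,1) Random Variables using runif() and qnorm()")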
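The same idea works for any distribution whose quantile function (inverse CDF) is available. As a sketch under stated assumptions, the example below writes the inverse CDF of the exponential distribution by hand, F^{-1}(u) = -log(1 - u) / rate, and checks it against base R's qexp(). It uses runif() so that it runs without randngen; the rate of 2 and the variable names are arbitrary choices for illustration.

```r
set.seed(1234)
u <- runif(10000)            # uniform draws on (0, 1)
rate <- 2                    # illustrative rate parameter

# Inverse CDF of Exp(rate), written out by hand
x_manual <- -log(1 - u) / rate

# The built-in quantile function applies the same transform
x_qexp <- qexp(u, rate = rate)
all.equal(x_manual, x_qexp)

# The sample mean should be close to the theoretical mean 1 / rate
mean(x_manual)

hist(x_manual, main = "Exponential(2) Random Variables using runif() and the inverse CDF")
```

Swapping runif() for scaled output from another uniform generator leaves the rest of the code unchanged, which is what makes the transform a convenient bridge from a PRNG to other distributions.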

Other applications

  • Simulation studies: use the same PRNG and seed values for your research and not have to worry about

Appendix

R’s history with PRNGs

Disclaimer: This section was copied from ChatGPT output. I have not found this information collected in one place. However, I was told by ChatGPT that it “[…] is well-documented in R’s official NEWS files, which detail changes and new features introduced in each R version …”. If this information is misleading or false, please open an issue or submit a pull request with more accurate information.

Base R's random-variate functions have changed in their use of Pseudo-Random Number Generators (PRNGs) over time, mainly to improve statistical quality and correctness. The default generators have changed in some R releases, which can affect the reproducibility of code across versions. Here’s a summary of the key PRNGs used across different base R versions and the changes introduced over time.

1. Early Versions of R (before the 1.7.0 series)

  • In the earliest versions of R, pseudo-random number generation was already configurable but less standardized in terms of defaults.
  • Base R supported several uniform PRNG algorithms, including Wichmann–Hill, Marsaglia-Multicarry, and Super-Duper.
  • Normal random numbers were generated using the Kinderman–Ramage method, which was later found to have approximation issues. For reproducibility, this legacy behavior is preserved under the name "Buggy Kinderman-Ramage".

Rather than relying on a single primitive generator, early R emphasized flexibility, with the understanding that defaults and algorithms might evolve.

2. R 1.7.0 Series (documented in R 1.7.1 NEWS)

  • PRNG: Mersenne Twister (MT19937)
  • Details: In the R 1.7.0 series, R changed its default uniform PRNG to the widely used Mersenne Twister (MT19937), which has a period of $2^{19937} - 1$.
  • Additional: At the same time, the default method for generating normal random numbers was changed to Inversion.
  • Reproducibility: The function RNGversion() was introduced, allowing users to reproduce the exact RNG behavior of earlier R versions.

This release marks the point at which Mersenne Twister became the default generator in R.
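As a brief illustration of RNGversion(), the sketch below switches to the generators R used before version 3.6.0 and then back to those of the running version. It assumes R 3.6.0 or later; the variable names are chosen for this example, and suppressWarnings() is used because R warns when the older "Rounding" sampler is selected.

```r
# Remember the kinds currently in use so they can be restored afterwards
old_kinds <- RNGkind()

# Generators as they were in R 3.5.0 (discrete sampling by "Rounding")
suppressWarnings(RNGversion("3.5.0"))
legacy_kinds <- RNGkind()
set.seed(1234)
legacy_draw <- sample(10, 3)

# Generators of the R version that is running now
RNGversion(as.character(getRversion()))
current_kinds <- RNGkind()
set.seed(1234)
current_draw <- sample(10, 3)

legacy_kinds
current_kinds

# Restore whatever was in use before this example
suppressWarnings(
  RNGkind(kind = old_kinds[1], normal.kind = old_kinds[2],
          sample.kind = old_kinds[3])
)
```

With the same seed, legacy_draw and current_draw can differ because sample() uses a different algorithm under each setting.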

3. User-Selectable PRNGs in Base R

  • PRNG: Default remains Mersenne Twister
  • Details: Across subsequent R releases, base R continued to support multiple user-selectable uniform PRNGs via RNGkind(), including:
    • "Wichmann-Hill"
    • "Marsaglia-Multicarry"
    • "Super-Duper"
    • "Knuth-TAOCP"
    • "Knuth-TAOCP-2002"

These generators serve different purposes and allow users to choose algorithms tailored to specific statistical or computational needs.

4. Introduction of L’Ecuyer-CMRG

  • PRNG: L’Ecuyer-CMRG (MRG32k3a)
  • Details: R added “L’Ecuyer-CMRG” as an additional RNG kind to support multiple independent random-number streams.
  • Use case: This generator is particularly important for parallel simulations, where independent but reproducible streams are required.

L’Ecuyer-CMRG remains a core component of R’s RNG infrastructure.
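A minimal sketch of the multi-stream idea, using nextRNGStream() from the base parallel package: each stream is a .Random.seed value, and installing it before drawing gives a reproducible sequence that is separate from the other stream. The helper draw_from() and the variable names are hypothetical, written only for this illustration. Note that the kind string in code uses a plain ASCII apostrophe.

```r
library(parallel)

# Remember the kinds currently in use so they can be restored afterwards
old_kinds <- RNGkind()

RNGkind("L'Ecuyer-CMRG")
set.seed(1234)

# The current state is the first stream; nextRNGStream() derives the second
stream1 <- get(".Random.seed", envir = globalenv())
stream2 <- nextRNGStream(stream1)

# Hypothetical helper: install a stream's state, then draw from it
draw_from <- function(stream, n) {
  assign(".Random.seed", stream, envir = globalenv())
  runif(n)
}

worker1 <- draw_from(stream1, 3)
worker2 <- draw_from(stream2, 3)
worker1_again <- draw_from(stream1, 3)

identical(worker1, worker1_again)  # same stream, same draws
identical(worker1, worker2)        # different streams, different draws

# Restore whatever was in use before this example
RNGkind(kind = old_kinds[1], normal.kind = old_kinds[2],
        sample.kind = old_kinds[3])
```

In a real parallel job, each worker would be handed its own stream in this way, so results do not depend on how tasks are scheduled.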

5. R 4.x Series (Current Behavior)

  • PRNG: Default remains Mersenne Twister
  • Details: In modern R versions:
    • Mersenne Twister continues to be the default uniform PRNG.
    • L’Ecuyer-CMRG remains available for parallel and multi-stream use.
    • Historical RNG behavior can be reproduced using RNGversion().
  • Note: Base R does not include xoroshiro or xoshiro-family generators as options in RNGkind(). Such generators are available via external packages, but are not part of base R.

Summary Table

| R Version / Era | Default PRNG | Other Available PRNGs | Notes |
|---|---|---|---|
| Before R 1.7.0 | Version-dependent | Wichmann–Hill, Marsaglia-Multicarry, Super-Duper | Legacy normal RNG preserved as "Buggy Kinderman-Ramage" |
| R 1.7.0 series | Mersenne Twister | Same alternatives remain selectable | Default RNG changed; RNGversion() introduced |
| Later additions | Mersenne Twister | L’Ecuyer-CMRG | Supports multiple independent streams |
| R 3.6.0 | Mersenne Twister | (unchanged) | Corrected discrete-uniform sampling algorithm |
| R 4.x (current) | Mersenne Twister | Wichmann–Hill, Marsaglia-Multicarry, Super-Duper, Knuth variants, L’Ecuyer-CMRG | No xoroshiro/xoshiro in base R |

Reproducibility Across Versions

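R allows explicit control of the RNG type and seed via RNGkind() and set.seed(), respectively, so results can be reproduced even as PRNG options expand. When older results must be reproduced under a newer R version, RNGversion() restores the generators that an earlier version used.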

References

  1. Random function - RDocumentation. https://www.rdocumentation.org/packages/base/versions/3.6.2/topics/Random.

  2. Morris, T. P., White, I. R. & Crowther, M. J. Using simulation studies to evaluate statistical methods. Statistics in Medicine 38, 2074–2102 (2019).

  3. Generating distributions from random number generators. Cross Validated https://stats.stackexchange.com/questions/637706/generating-distributions-from-random-number-generators.

  4. L’Ecuyer, P. & Simard, R. TestU01. ACM Transactions on Mathematical Software 33, 1–40 (2007).

  5. R Core Team. R News Files. https://cran.r-project.org/doc/manuals/r-release/