Class RandomSampler

java.lang.Object
org.cicirello.math.rand.RandomSampler

public final class RandomSampler extends Object
RandomSampler is a class of utility methods related to efficiently sampling integers without replacement.
  • Method Summary

    Modifier and Type
    Method
    Description
    static int[]
    sample(int n, double p)
    Generates a random sample, without replacement, from the set of integers in the interval [0, n).
    static int[]
    sample(int n, double p, RandomGenerator r)
    Generates a random sample, without replacement, from the set of integers in the interval [0, n).
    static int[]
    sample(int n, int k, int[] result)
    Generates a random sample of k integers, without replacement, from the set of integers in the interval [0, n).
    static int[]
    sample(int n, int k, int[] result, RandomGenerator gen)
    Generates a random sample of k integers, without replacement, from the set of integers in the interval [0, n).
    static int[]
    sampleInsertion(int n, int k, int[] result)
    Generates a random sample of k integers, without replacement, from the set of integers in the interval [0, n).
    static int[]
    sampleInsertion(int n, int k, int[] result, RandomGenerator gen)
    Generates a random sample of k integers, without replacement, from the set of integers in the interval [0, n).
    static int[]
    samplePool(int n, int k, int[] result)
    Generates a random sample of k integers, without replacement, from the set of integers in the interval [0, n).
    static int[]
    samplePool(int n, int k, int[] result, RandomGenerator gen)
    Generates a random sample of k integers, without replacement, from the set of integers in the interval [0, n).
    static int[]
    sampleReservoir(int n, int k, int[] result)
    Generates a random sample of k integers, without replacement, from the set of integers in the interval [0, n).
    static int[]
    sampleReservoir(int n, int k, int[] result, RandomGenerator gen)
    Generates a random sample of k integers, without replacement, from the set of integers in the interval [0, n).

    Methods inherited from class java.lang.Object

    clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
  • Method Details

    • sampleReservoir

      public static int[] sampleReservoir(int n, int k, int[] result)
      Generates a random sample of k integers, without replacement, from the set of integers in the interval [0, n). All n choose k combinations are equally likely.

      Uses the reservoir sampling algorithm (Algorithm R) from J. Vitter's 1985 article "Random Sampling with a Reservoir" from ACM Transactions on Mathematical Software. The runtime is O(n) and it generates O(n-k) random numbers. Thus, it is an especially good choice as k approaches n. Only constant extra space required.

      This method uses ThreadLocalRandom as the pseudorandom number generator, and is thus safe, and efficient (i.e., non-blocking), for use with threads.

      Parameters:
      n - The number of integers to choose from.
      k - The size of the desired sample.
      result - An array to hold the sample that is generated. If result is null or if result.length is less than k, then this method will construct an array for the result.
      Returns:
      An array containing the sample of k randomly chosen integers from the interval [0, n).
      Throws:
      IllegalArgumentException - if k > n.
      NegativeArraySizeException - if k < 0.
    • sampleReservoir

      public static int[] sampleReservoir(int n, int k, int[] result, RandomGenerator gen)
      Generates a random sample of k integers, without replacement, from the set of integers in the interval [0, n). All n choose k combinations are equally likely.

      Uses the reservoir sampling algorithm (Algorithm R) from J. Vitter's 1985 article "Random Sampling with a Reservoir" from ACM Transactions on Mathematical Software. The runtime is O(n) and it generates O(n-k) random numbers. Thus, it is an especially good choice as k approaches n. Only constant extra space required.

      Parameters:
      n - The number of integers to choose from.
      k - The size of the desired sample.
      result - An array to hold the sample that is generated. If result is null or if result.length is less than k, then this method will construct an array for the result.
      gen - Source of randomness.
      Returns:
      An array containing the sample of k randomly chosen integers from the interval [0, n).
      Throws:
      IllegalArgumentException - if k > n.
      NegativeArraySizeException - if k < 0.
    • samplePool

      public static int[] samplePool(int n, int k, int[] result)
      Generates a random sample of k integers, without replacement, from the set of integers in the interval [0, n). All n choose k combinations are equally likely.

      This implements the algorithm SELECT of S. Goodman and S. Hedetniemi, as described in: J Ernvall, O Nevalainen, "An Algorithm for Unbiased Random Sampling," The Computer Journal, 25(1):45-47, 1982.

      The runtime is O(n) and it generates O(k) random numbers. Thus, it is a better choice than sampleReservoir when k < n-k. However, this uses O(n) extra space, whereas the reservoir algorithm uses no extra space.

      This method uses ThreadLocalRandom as the pseudorandom number generator, and is thus safe, and efficient (i.e., non-blocking), for use with threads.

      Parameters:
      n - The number of integers to choose from.
      k - The size of the desired sample.
      result - An array to hold the sample that is generated. If result is null or if result.length is less than k, then this method will construct an array for the result.
      Returns:
      An array containing the sample of k randomly chosen integers from the interval [0, n).
      Throws:
      IllegalArgumentException - if k > n.
      NegativeArraySizeException - if k < 0.
    • samplePool

      public static int[] samplePool(int n, int k, int[] result, RandomGenerator gen)
      Generates a random sample of k integers, without replacement, from the set of integers in the interval [0, n). All n choose k combinations are equally likely.

      This implements the algorithm SELECT of S. Goodman and S. Hedetniemi, as described in: J Ernvall, O Nevalainen, "An Algorithm for Unbiased Random Sampling," The Computer Journal, 25(1):45-47, 1982.

      The runtime is O(n) and it generates O(k) random numbers. Thus, it is a better choice than sampleReservoir when k < n-k. However, this uses O(n) extra space, whereas the reservoir algorithm uses no extra space.

      Parameters:
      n - The number of integers to choose from.
      k - The size of the desired sample.
      result - An array to hold the sample that is generated. If result is null or if result.length is less than k, then this method will construct an array for the result.
      gen - Source of randomness.
      Returns:
      An array containing the sample of k randomly chosen integers from the interval [0, n).
      Throws:
      IllegalArgumentException - if k > n.
      NegativeArraySizeException - if k < 0.
    • sampleInsertion

      public static int[] sampleInsertion(int n, int k, int[] result)
      Generates a random sample of k integers, without replacement, from the set of integers in the interval [0, n). All n choose k combinations are equally likely.

      This implements the insertion sampling algorithm described in:

      Vincent A. Cicirello. 2022. Cycle Mutation: Evolving Permutations via Cycle Induction, Applied Sciences, 12(11), Article 5506 (June 2022). doi:10.3390/app12115506

      The runtime is O(k2) and it generates O(k) random numbers. Thus, it is a better choice than both sampleReservoir and samplePool when k2 < n. Just like sampleReservoir, the sampleInsertion method only requires O(1) extra space, while samplePool requires O(n) extra space.

      This method uses ThreadLocalRandom as the pseudorandom number generator, and is thus safe, and efficient (i.e., non-blocking), for use with threads.

      Parameters:
      n - The number of integers to choose from.
      k - The size of the desired sample.
      result - An array to hold the sample that is generated. If result is null or if result.length is less than k, then this method will construct an array for the result.
      Returns:
      An array containing the sample of k randomly chosen integers from the interval [0, n).
      Throws:
      IllegalArgumentException - if k > n.
      NegativeArraySizeException - if k < 0.
    • sampleInsertion

      public static int[] sampleInsertion(int n, int k, int[] result, RandomGenerator gen)
      Generates a random sample of k integers, without replacement, from the set of integers in the interval [0, n). All n choose k combinations are equally likely.

      This implements the insertion sampling algorithm described in:

      Vincent A. Cicirello. 2022. Cycle Mutation: Evolving Permutations via Cycle Induction, Applied Sciences, 12(11), Article 5506 (June 2022). doi:10.3390/app12115506

      The runtime is O(k2) and it generates O(k) random numbers. Thus, it is a better choice than both sampleReservoir and samplePool when k2 < n. Just like sampleReservoir, the sampleInsertion method only requires O(1) extra space, while samplePool requires O(n) extra space.

      Parameters:
      n - The number of integers to choose from.
      k - The size of the desired sample.
      result - An array to hold the sample that is generated. If result is null or if result.length is less than k, then this method will construct an array for the result.
      gen - The source of randomness.
      Returns:
      An array containing the sample of k randomly chosen integers from the interval [0, n).
      Throws:
      IllegalArgumentException - if k > n.
      NegativeArraySizeException - if k < 0.
    • sample

      public static int[] sample(int n, double p)
      Generates a random sample, without replacement, from the set of integers in the interval [0, n). Each of the n integers has a probability p of inclusion in the sample.

      This method uses ThreadLocalRandom as the source of randomness, and is thus safe, and efficient (i.e., non-blocking), for use with threads.

      Parameters:
      n - The number of integers to choose from.
      p - The probability that each of the n integers is included in the sample.
      Returns:
      An array containing the sample.
    • sample

      public static int[] sample(int n, double p, RandomGenerator r)
      Generates a random sample, without replacement, from the set of integers in the interval [0, n). Each of the n integers has a probability p of inclusion in the sample.
      Parameters:
      n - The number of integers to choose from.
      p - The probability that each of the n integers is included in the sample.
      r - The source of randomness.
      Returns:
      An array containing the sample.
    • sample

      public static int[] sample(int n, int k, int[] result)
      Generates a random sample of k integers, without replacement, from the set of integers in the interval [0, n). All n choose k combinations are equally likely.

      This method chooses among the RandomSampler.samplePool, RandomSampler.sampleReservoir, and RandomSampler.sampleInsertion methods based on the values of n and k.

      This approach combining reservoir sampling, pool sampling, and insertion sampling was described in: Vincent A. Cicirello. 2022. Cycle Mutation: Evolving Permutations via Cycle Induction, Applied Sciences, 12(11), Article 5506 (June 2022). doi:10.3390/app12115506

      The runtime is O(min(n, k2)) and it generates O(min(k, n-k)) random numbers.

      This method uses ThreadLocalRandom as the pseudorandom number generator, and is thus safe, and efficient (i.e., non-blocking), for use with threads.

      Parameters:
      n - The number of integers to choose from.
      k - The size of the desired sample.
      result - An array to hold the sample that is generated. If result is null or if result.length is less than k, then this method will construct an array for the result.
      Returns:
      An array containing the sample of k randomly chosen integers from the interval [0, n).
      Throws:
      IllegalArgumentException - if k > n.
      NegativeArraySizeException - if k < 0.
    • sample

      public static int[] sample(int n, int k, int[] result, RandomGenerator gen)
      Generates a random sample of k integers, without replacement, from the set of integers in the interval [0, n). All n choose k combinations are equally likely.

      This method chooses among the RandomSampler.samplePool, RandomSampler.sampleReservoir, and RandomSampler.sampleInsertion methods based on the values of n and k.

      This approach combining reservoir sampling, pool sampling, and insertion sampling was described in: Vincent A. Cicirello. 2022. Cycle Mutation: Evolving Permutations via Cycle Induction, Applied Sciences, 12(11), Article 5506 (June 2022). doi:10.3390/app12115506

      The runtime is O(min(n, k2)) and it generates O(min(k, n-k)) random numbers.

      Parameters:
      n - The number of integers to choose from.
      k - The size of the desired sample.
      result - An array to hold the sample that is generated. If result is null or if result.length is less than k, then this method will construct an array for the result.
      gen - Source of randomness.
      Returns:
      An array containing the sample of k randomly chosen integers from the interval [0, n).
      Throws:
      IllegalArgumentException - if k > n.
      NegativeArraySizeException - if k < 0.