I need a function that generates a random integer in a given range (including the border values). I don't have unreasonable quality/randomness requirements; I have four requirements:

I need it to be fast. My project needs to generate millions (or sometimes even tens of millions) of random numbers and my current generator function has proven to be a bottleneck.

I need it to be reasonably uniform (use of rand() is perfectly fine).

the min-max ranges can be anything from <0, 1> to <-32727, 32727>.

it has to be seedable.

I currently have the following C++ code:

output = min + (rand() * (int)(max - min) / RAND_MAX)

The problem is that it is not really uniform: max is returned only when rand() = RAND_MAX (for Visual C++ it is 1/32727). This is a major issue for small ranges like <-1, 1>, where the last value is almost never returned.

So I grabbed a pen and paper and came up with the following formula (which builds on the (int)(n + 0.5) integer rounding trick):
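A rounding-based mapping of this kind generally has the following shape (a sketch of the idea only, using the same min, max and output as above):

output = min + (int)((double)rand() / RAND_MAX * (max - min) + 0.5)

Note that even with the rounding trick, the two end values map from half-sized intervals, so the result is still not perfectly uniform.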

@Bill MaGriff: yes. It has the same problem. A simplified version is: how can you divide 10 pieces of candy among 3 children evenly (without breaking any of the candies)? The answer is, you can't -- you have to give three to each child, and just not give the tenth one to anybody.
– Jerry Coffin, Feb 15 '11 at 20:04

If your compiler supports C++0x and using it is an option for you, then the new standard <random> header is likely to meet your needs. It has a high-quality uniform_int_distribution that accepts minimum and maximum bounds (inclusive, as you need), and you can choose among various random number generators to plug into that distribution.

Here is code that generates a million random ints uniformly distributed in [-57, 365]. I've used the new std <chrono> facilities to time it, since you mentioned that performance is a major concern for you.
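Something along these lines (std::mt19937 is chosen here as the engine; any of the standard engines will do):

#include <chrono>
#include <iostream>
#include <random>

int main()
{
    std::mt19937 eng;                                   // seedable: pass a seed to the constructor
    std::uniform_int_distribution<int> dist(-57, 365);  // inclusive bounds

    const int N = 1000000;
    long long sum = 0;                                  // checksum so the loop isn't optimized away

    auto t0 = std::chrono::high_resolution_clock::now();
    for (int i = 0; i < N; ++i)
        sum += dist(eng);                               // one uniform int in [-57, 365]
    auto t1 = std::chrono::high_resolution_clock::now();

    std::chrono::duration<double> sec = t1 - t0;
    std::cout << N << " ints in " << sec.count() << " s (checksum " << sum << ")\n";
}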

The first part is obviously the hardest. Let's assume that the return value of rand() is perfectly uniform. Using modulo will add bias
to the first (RAND_MAX + 1) % (max-min+1) numbers. So if we could magically change RAND_MAX to RAND_MAX - (RAND_MAX + 1) % (max-min+1), there would no longer be any bias.

It turns out that we can use this intuition if we are willing to allow pseudo-nondeterminism into the running time of our algorithm. Whenever rand() returns a number which is too large, we simply ask for another random number until we get one which is small enough.

The running time is now geometrically distributed, with expected value 1/p, where p is the probability of getting a small enough number on the first try. Since the number of values we have to reject, (RAND_MAX + 1) % (max-min+1), is always less than (RAND_MAX + 1) / 2, we know that p > 1/2, so the expected number of iterations will always be less than two for any range. It should be possible to generate tens of millions of random numbers in less than a second on a standard CPU with this technique.
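A minimal sketch of this rejection loop on top of rand() (assuming max - min + 1 <= RAND_MAX + 1; seed once with srand() as usual):

#include <cstdlib>

int uniform_in_range(int min, int max)
{
    const int range = max - min + 1;
    // Largest rand() value we accept; anything above it would introduce bias.
    const int limit = RAND_MAX - (int)(((long long)RAND_MAX + 1) % range);

    int r;
    do {
        r = rand();        // expected to loop fewer than two times on average
    } while (r > limit);

    return min + r % range;
}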

EDIT:

Although the above is technically correct, DSimon's answer is probably more useful in practice. You shouldn't implement this stuff yourself. I have seen a lot of implementations of rejection sampling and it is often very difficult to see if it's correct or not.

Fun fact: Joel Spolsky once mentioned a version of this question as an example of what StackOverflow was good at answering. I looked through the answers on the site involving rejection sampling at that time, and every single one was incorrect.
– Jørgen Fogh, Oct 29 '14 at 22:40

How about the Mersenne Twister? The boost implementation is rather easy to use and is well tested in many real-world applications. I've used it myself in several academic projects such as artificial intelligence and evolutionary algorithms.

Here's their example where they make a simple function to roll a six-sided die:
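It goes roughly like this (paraphrased from the Boost.Random documentation; treat the exact headers and type names as an approximation rather than a verbatim copy):

#include <boost/random/mersenne_twister.hpp>
#include <boost/random/uniform_int.hpp>
#include <boost/random/variate_generator.hpp>

boost::mt19937 gen;    // seedable, e.g. gen.seed(42);

int roll_die()
{
    boost::uniform_int<> dist(1, 6);   // inclusive range [1, 6]
    boost::variate_generator<boost::mt19937&, boost::uniform_int<> > die(gen, dist);
    return die();
}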

Oh, and here's some more pimping of this generator just in case you aren't convinced you should use it over the vastly inferior rand():

The Mersenne Twister is a "random
number" generator invented by Makoto
Matsumoto and Takuji Nishimura; their
website includes numerous
implementations of the algorithm.

Essentially, the Mersenne Twister is a
very large linear-feedback shift
register. The algorithm operates on a
19,937 bit seed, stored in an
624-element array of 32-bit unsigned
integers. The value 2^19937-1 is a
Mersenne prime; the technique for
manipulating the seed is based on an
older "twisting" algorithm -- hence
the name "Mersenne Twister".

An appealing aspect of the Mersenne
Twister is its use of binary
operations -- as opposed to
time-consuming multiplication -- for
generating numbers. The algorithm also
has a very long period, and good
granularity. It is both fast and
effective for non-cryptographic applications.

The Mersenne twister is a good generator, but the problem he's dealing with remains, regardless of the underlying generator itself.
– Jerry Coffin, Feb 15 '11 at 20:21

I don't want to use Boost just for the random generator, because (since my project is a library) it means introducing another dependency to the project. I will probably be forced to use it anyway in the future, and then I can switch to this generator.
– Matěj Zábský, Feb 15 '11 at 20:26

@Jerry Coffin Which problem? I offered it because it satisfied all of his requirements: it's fast, it's uniform (using the boost::uniform_int distribution), you can transform the min-max ranges into anything you like, and it's seedable.
– Aphex, Feb 15 '11 at 20:29

@mzabsky I probably wouldn't let that stop me; when I had to ship my projects to my professors for submission, I just included the relevant Boost header files I was using. You shouldn't have to package the entire 40 MB Boost library with your code. Of course, in your case this might not be feasible for other reasons, such as copyright...
– Aphex, Feb 15 '11 at 20:32

@Aphex My project is not really a scientific simulator or anything that needs a truly uniform distribution. I used the old generator for 1.5 years without any issue; I only noticed the biased distribution when I first needed it to generate numbers from a very small range (3 values in this case). The speed is still an argument for the Boost solution, though. I will look into its license to see whether I can just add the few needed files to my project - I like the "Checkout -> F5 -> ready to use" workflow as it is now.
– Matěj Zábský, Feb 15 '11 at 20:44

This is a mapping of 32768 integers to (nMax-nMin+1) integers. The mapping will be quite good if (nMax-nMin+1) is small (as in your requirement). Note, however, that if (nMax-nMin+1) is large, the mapping won't work (for example, you can't map 32768 values to 30000 values with equal probability). If such ranges are needed, you should use a 32-bit or 64-bit random source instead of the 15-bit rand(), or ignore rand() results that are out of range.
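A mapping of that general shape might look like this (an illustrative sketch of the idea, not the exact expression; nMin and nMax as above):

#include <cstdlib>

int map_to_range(int nMin, int nMax)
{
    // Scale rand()'s 15-bit output onto [nMin, nMax]. Reasonable for small
    // ranges, but necessarily biased once nMax - nMin + 1 gets close to 32768.
    return nMin + (int)((double)rand() / (RAND_MAX + 1.0) * (nMax - nMin + 1));
}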

Despite its unpopularity, this is also what I use for my non-scientific projects. Easy to understand (you don't need a math degree) and performs adequately (never had to profile any code using it). :) In case of large ranges, I guess we could string two rand() values together and get a 30-bit value to work with (assuming RAND_MAX = 0x7fff, i.e. 15 random bits)
– efotinis, May 21 '11 at 20:48

IMO, none of the solutions presented there is really much of an improvement. His loop-based solution works, but is likely to be quite inefficient, especially for a small range like the one the OP discusses. His uniform deviate solution doesn't actually produce uniform deviates at all; at most it kind of camouflages the lack of uniformity.
– Jerry Coffin, Feb 15 '11 at 20:15

I recommend the Boost.Random library; it's super detailed and well-documented, lets you explicitly specify what distribution you want, and in non-cryptographic scenarios can actually outperform a typical C library rand() implementation.

The whole problem was using C/C++'s rand(), which returns an integer in a range specified by the runtime. As demonstrated in this thread, mapping random integers from [0, RAND_MAX] to [MIN, MAX] isn't entirely straightforward if you want to avoid destroying their statistical properties or performance. If you have doubles in the range [0, 1], the mapping is easy.
– Matěj Zábský, Aug 6 '14 at 11:10