Date: Wed, 23 Feb 2000 07:08:14 -0800 (PST)
Message-Id: <200002231508.HAA07282@mail.eskimo.com>
Subject: Re: C FAQ error?
The water is surprisingly deep here. I was initially rather alarmed by your note, because after rummaging around for some of my old random number testbeds and dusting them off, I found that the expression
rand() / (RAND_MAX/N + 1)from the FAQ list doesn't behave as well as I expected it to. When I then attempted to rederive the appropriate expression, my first attempt led to
rand() / ((RAND_MAX + 1) / N)which is exactly what you suggested. But it turns out that the latter expression doesn't work in the general case. Suppose that RAND_MAX is 21 and N is 10. (As it happens, this was the first example I tried.) So ((RAND_MAX + 1) / N) is then 2, and when rand() returns 20 or 21, the value of the expression is 10, which is not in [0, N).
I want to study your... derivation some more, because I think it's basically sound, for the ``|RAND_MAX+1| is a precise multiple of |N|'' case. I suspect that the flaw, however, is in your assumption that if RAND_MAX+1 is not a multiple of N, the same logic applies. You have to make sure that the result falls the right way when the intermediate quotient discards a nonzero remainder, and for this expression, it doesn't.
Unfortunately, I don't immediately remember how I originally derived the expression
rand() / (RAND_MAX/N + 1)I share your unease about its dissimilarity to the (much cleaner) floating-point version
rand() / (RAND_MAX + 1.0) * NI do remember testing the expressions in the FAQ list quite thoroughly, and I've found an article I posted long ago which discusses them in some detail, and which suggests that I put quite a bit of time into that initial derivation. (It also suggests that I wasn't perfectly happy with the expression then, either.) That article, as I read it now, is nowhere as clear as I'd like, and I was about to be at a loss to explain how
rand() / (RAND_MAX/N + 1)works at all, until I discovered that the expression is essentially the same one I explain in the answer to exercise 8 at http://www.eskimo.com/~scs/cclass/asgn.beg/PS3a.html, which is worth reading. (I must confess, though, that I can't imagine what I was thinking when I started throwing that much detail around in the third week of an allegedly beginning programming class.) But going back to the RAND_MAX=21, N=10 case, it turns out that dividing rand() by 3, such that we never emit 8 or 9 at all, is about the best we can do; what we're seeing is one of the ways that any of these expressions begins to fail when RAND_MAX+1 is neither a multiple of nor much greater than N.
For my own reference, here's another way of deriving the expression rand() / ((RAND_MAX + 1) / N), even though that expression doesn't work. We want to divide rand's output by some integer y such that the answer is always in the range 0 <= rand()/y < N. Now when rand returns RAND_MAX, the result RAND_MAX/y should be just barely less than N; one way of achieving this would be if it were the case that (RAND_MAX+1)/y did equal N. So if we set (RAND_MAX+1)/y = N, we can easily solve for y = (RAND_MAX+1)/N.
But wait! If we want RAND_MAX/y to be just barely less than N, the other way to achieve it is to subtract 1 in the denominator. So if RAND_MAX/(y-1) = N, we have y = RAND_MAX/N + 1, and this must be where the expression in the FAQ list comes from, after all.