Could anyone point me to how to solve this (either a solution or preferably a po...

hhm · on Nov 24, 2008

Execute f 7 times, add all numbers, call that x. Then you do mod(x,7)+1, and you get a random number between 1 and 7. If the original function was unbiased, this one is going to be unbiased too.

deadsy · on Nov 24, 2008

This won't work. f gives 1 to 5, so the sum of 7 calls ranges from 7 to 35, but those sums are not equally likely. Think about it: there are more ways to get a 20 than there are to get a 7 or a 35. Same thing as rolling 2 dice.

nostrademons · on Nov 24, 2008

That was my second thought too (my first thought was to upvote the comment), but at least in the 2-dice case, the modulus takes care of it. If you work out the probabilities, the chance of getting 0 mod 2 = 1/36 (2) + 3/36 (4) + 5/36 (6) + 5/36 (8) + 3/36 (10) + 1/36 (12) = 18/36 = 1/2. The same goes for getting 1 mod 2. So you end up with a fair result even though the chances for each individual outcome are biased.

I didn't want to go through all 5^7 possibilities for the 7-die case, but I figured it's likely enough that he's right that I'd keep my mouth shut.
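The two-dice arithmetic above can be checked by brute force. This is a minimal sketch, assuming two fair six-sided dice; the name `parity_counts` is made up for illustration.

```python
from itertools import product

def parity_counts(sides=6, dice=2):
    """Count how many equally likely rolls give an even (0) or odd (1) sum."""
    counts = {0: 0, 1: 0}
    for roll in product(range(1, sides + 1), repeat=dice):
        counts[sum(roll) % 2] += 1
    return counts

print(parity_counts())  # 18 even sums and 18 odd sums out of 36
```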

llimllib · on Nov 24, 2008

experimental evidence says that it is in fact a uniform distribution:

  import random

  r = random.Random()

  def one_to_five():
    return r.randint(1, 5)

  def mod_seven():
    return (sum(one_to_five() for x in xrange(7)) % 7) + 1

  def test_dist(lst):
    return [(x, lst.count(x)) for x in [1,2,3,4,5,6,7]]

sdfx · on Nov 24, 2008

I've tried it out experimentally in Google Docs and it looks really good for big numbers. No real proof though, and it might be that the errors cancel each other out by chance:

http://spreadsheets.google.com/ccc?key=pq4tB7LQWN03gF7ImGhIP...

greendestiny · on Nov 25, 2008

Have a look at the distributions for each sum here: http://brianpmearns.com/ext/dice/dice.html

edit: Oops bad addition on my part, could be uniform, but you'd have to work out the actual number exactly.

sdfx · on Nov 25, 2008

here are the exact numbers:

1 -> 11177

2 -> 11172

3 -> 11158

4 -> 11144

5 -> 11144

6 -> 11158

7 -> 11172

...not quite
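These counts can be reproduced by enumerating all 5^7 = 78125 equally likely sequences of seven calls. The sketch below assumes each call returns 1 to 5 uniformly; `exact_counts` is a hypothetical helper name.

```python
from collections import Counter
from itertools import product

def exact_counts(calls=7, faces=5, modulus=7):
    """Tally (sum % modulus) + 1 over every equally likely sequence of calls."""
    tally = Counter()
    for seq in product(range(1, faces + 1), repeat=calls):
        tally[sum(seq) % modulus + 1] += 1
    return [tally[k] for k in range(1, modulus + 1)]

counts = exact_counts()
print(counts)       # one count per result 1..7
print(sum(counts))  # 78125 sequences in total
```

Since 78125 is not a multiple of 7, the seven counts cannot all be equal, whatever the individual values are.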

gaius · on Nov 25, 2008

I've done a lot of Python but never knew about xrange - thanks!

hhm · on Nov 25, 2008

OK, a less elegant one then, but one that works for a reasonably simple reason. f gives 1 to 5; if it gives 5, try again, and so on, until you have a number from 1 to 4. Then take that mod 2. You now have a random binary digit that's unbiased.

Now use that process to get 3 binary digits. You get a random number from 0 to 7. If the random number is 0, start again... eventually you'll get a number from 1 to 7, and all numbers have the same chances.
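The two steps above can be sketched as follows. This is only an illustration: `rand5` is a stand-in for f (assumed uniform on 1 to 5), and `random_bit` and `rand7` are made-up names.

```python
import random

def rand5():
    """Stand-in for the given f: uniform integer in 1..5 (an assumption)."""
    return random.randint(1, 5)

def random_bit():
    """Reject 5, then reduce 1..4 mod 2 to get an unbiased bit."""
    n = rand5()
    while n == 5:
        n = rand5()
    return n % 2

def rand7():
    """Build a 3-bit number in 0..7 and retry on 0, leaving 1..7."""
    while True:
        n = random_bit() * 4 + random_bit() * 2 + random_bit()
        if n != 0:
            return n
```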

hhm · on Nov 25, 2008

You're right in that my original reasoning was wrong. Thank you for your comment.

MaysonL · on Nov 24, 2008

Calling f twice gives you 25 boxes, numbered 11, 12, ..., 15, 21, ..., 25, ..., 55. Assign 21 of those boxes to 1..7, three boxes per number. If the pair doesn't fall into one of those 21 boxes, repeat until it does.

andreyf · on Nov 24, 2008

Runtime: O(infinity)?

paulgb · on Nov 25, 2008

Technically speaking, yes.

But, if you want to uniformly map a random number from set X to set Y where (IIRC) lcm(|X|, |Y|) != |X|, it seems you need an infinite worst-case running time.

Here's an informal proof that you can't have a finite upper bound to the number of iterations. After n iterations, you have |X|^n possible outcomes. But, since lcm(|X|, |Y|) != |X|, |X|^n cannot be divided evenly by |Y| (since its factors are the same). So some outcomes in Y must be more likely than others.

(This is not nearly complete, but hopefully it's enough to show how it might be right.)

Of course, in practice it is highly unlikely that you will need more than a couple of iterations before determining an outcome.

andreyf · on Nov 25, 2008

since lcm(|X|, |Y|) != |X|, |X|^n cannot be divided evenly by |Y| (since its factors are the same)

Can you go into more detail about this part?

paulgb · on Nov 25, 2008

Sure. Sorry, I didn't take the time to work this out on paper before posting or I would have realized that the condition itself is wrong. The condition lcm(|X|,|Y|) != |X| instead should be that |Y| has some prime factor that |X| does not.

Here is an explanation with the new condition:

Let p be any prime factor of |Y| that |X| does not have. It follows from Euclid's First Theorem[1] that p cannot divide |X|^n for any n [2]. Since every integer (> 1) has a unique prime factorization, it follows that |Y| can't divide |X|^n, because the prime p divides |Y| but not |X|^n.

[1] http://mathworld.wolfram.com/EuclidsTheorems.html

[2] We are given that p does not divide |X|^1. Suppose that p also does not divide |X|^(n-1) for some n > 1. |X|^n = |X|^(n-1) * |X|^1, so by Euclid's First Theorem, if p divides |X|^n it must divide either |X|^(n-1) or |X|^1. We know it divides neither, so p does not divide |X|^n. By induction, this is true for all n > 0.
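For the case in this thread, |X| = 5 and |Y| = 7, so p = 7. A quick numeric sanity check (not a proof, and only up to an arbitrary bound) might look like this; `divides_some_power` is a hypothetical helper.

```python
def divides_some_power(x_size, y_size, max_n=50):
    """Return True if y_size divides x_size**n for some n in 1..max_n."""
    return any(pow(x_size, n, y_size) == 0 for n in range(1, max_n + 1))

print(divides_some_power(5, 7))  # False: 7 never divides a power of 5
print(divides_some_power(6, 4))  # True: 4 divides 6**2 = 36
```

The second call shows why the condition has to be about prime factors: lcm(6, 4) = 12 != 6, yet 4 has no prime factor that 6 lacks, and 4 does divide 36.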

patio11 · on Nov 25, 2008

Observe the following lookup table, which is a 5x5 matrix.

  12345
  67123
  45671
  23456
  7RRRR

R means "reroll".

Constant-time execution in the average case. It is mathematically provable that it is as unbiased as your rand5() function. Technically it is not guaranteed to terminate, but if you dock me points for that, you're technically not guaranteed to survive to the end of the interview, are you?
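A minimal Python sketch of the lookup-table idea, assuming `rand5()` returns 1 to 5 uniformly. The stand-in `rand5`, the name `TABLE`, and the use of 0 for the reroll cells are illustrative choices, not part of the original comment.

```python
import random

def rand5():
    """Stand-in for the given generator: uniform integer in 1..5 (an assumption)."""
    return random.randint(1, 5)

# The 5x5 table from the comment; 0 marks a reroll cell.
TABLE = [
    [1, 2, 3, 4, 5],
    [6, 7, 1, 2, 3],
    [4, 5, 6, 7, 1],
    [2, 3, 4, 5, 6],
    [7, 0, 0, 0, 0],
]

def rand7():
    """Pick a cell with two rand5() calls; reroll on the four unused cells."""
    while True:
        value = TABLE[rand5() - 1][rand5() - 1]
        if value != 0:
            return value
```

Each of 1 to 7 occupies exactly three of the 25 equally likely cells, which is where the uniformity comes from.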

wildwood · on Nov 24, 2008

My team uses this as a screening question, so I won't give you the answer - but you forgot to mention that both the rand5 and rand7 should have a uniform distribution. :)

I'll give you a hint, though - the solution is not elegant at all. Which is part of the point.

llimllib · on Nov 24, 2008

1) Given a number 1 to 5, get a new one if it's 5, else throw away the MSB, you get 2 bits of randomness per number. Subtract one and call it a.

(Now you have two bits of randomness evenly distributed among the set {00, 01, 10, 11})

2) Get another number 1 to 5, toss it if it's 5, take the LSB. Call it b.

(Now you have another bit of randomness, evenly distributed among {0,1})

3) If b == 1 and a == 11 (binary), start over. Else return (4b)+a+1.

Python implementation (with the -1 then +1 factored out):

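  def one_to_seven(debug=False):
    # a: two random bits, kept as 1..4 (reject 5)
    a = one_to_five()
    while a == 5: a = one_to_five()
    # b: one random bit from the parity of a second draw, scaled to 0 or 4
    b = one_to_five()
    while b == 5: b = one_to_five()
    b = (b % 2) * 4
    # a+b covers 1..8 uniformly; 8 is the one outcome to reject
    if a+b == 8: return one_to_seven()
    return a+b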

llimllib · on Nov 25, 2008

patio11's solution is about twice as efficient as mine, and is a much simpler way to think about it.

litewulf · on Nov 24, 2008

Dumb/obvious idea: Generate 7 random numbers, pick the index that had the highest value. (If there's a tie, run it again!)

(Note: not guaranteed to terminate.)
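A rough sketch of this idea, reading "tie" as a tie for the highest value and assuming `rand5()` is uniform on 1 to 5. The names `rand5` and `rand7_by_max` are made up for illustration.

```python
import random

def rand5():
    """Stand-in for the given generator: uniform integer in 1..5 (an assumption)."""
    return random.randint(1, 5)

def rand7_by_max():
    """Draw 7 values; return the 1-based position of the unique maximum."""
    while True:
        draws = [rand5() for _ in range(7)]
        top = max(draws)
        if draws.count(top) == 1:  # retry whenever the maximum is tied
            return draws.index(top) + 1
```

By symmetry every position is equally likely to hold the unique maximum. It is expensive, though: a round of 7 calls succeeds with probability 7 * (1 + 64 + 729 + 4096) / 5^7, which is about 0.44.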

Hexstream · on Nov 24, 2008

Here's my shot at a solution:

;; * SPOILER? *

  (defun rand7 ()
    (loop (let ((result (+ (1- (rand5))
                           (1- (rand5)))))
            (if (< result 7)
                (return (1+ result))))))

;; * SPOILER *

edit: If it doesn't work I'd like to know why...
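One way to check is to enumerate all 25 equally likely pairs of draws and tally the accepted results. This sketch is in Python rather than Lisp and assumes `rand5` returns 1 to 5 uniformly.

```python
from collections import Counter
from itertools import product

# Tally the accepted results of (a - 1) + (b - 1) for a, b in 1..5,
# keeping only sums below 7, as the loop in rand7 does.
accepted = Counter()
for a, b in product(range(1, 6), repeat=2):
    result = (a - 1) + (b - 1)
    if result < 7:
        accepted[result + 1] += 1

print(sorted(accepted.items()))
# [(1, 1), (2, 2), (3, 3), (4, 4), (5, 5), (6, 4), (7, 3)]
```

The 22 accepted pairs are spread unevenly, so a 5 comes out five times as often as a 1. Adding two draws gives a triangular distribution, and rejection alone does not flatten it.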

ig1 · on Nov 24, 2008

Generate two random numbers with your rand5 function, let them be the two digits of your base 5 number. Each number from 0-24 has a 1/25 chance of being picked.

If the number is greater than 6 throw it away and repeat, otherwise you have your random number.

llimllib · on Nov 25, 2008

Works, but not as efficient as possible. Look at patio11's solution; you only need to throw it away and repeat if it's >= 21. Otherwise you can just take n mod 7.
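The efficiency difference can be worked out directly. Each try costs two rand5() calls and succeeds with probability accepted/25, so the expected number of calls is 2 divided by that probability. The helper name `expected_rand5_calls` is hypothetical.

```python
from fractions import Fraction

def expected_rand5_calls(accepted_cells, total_cells=25, calls_per_try=2):
    """Expected rand5() calls when each try accepts with probability accepted/total."""
    p_accept = Fraction(accepted_cells, total_cells)
    return calls_per_try / p_accept

print(expected_rand5_calls(7))   # 50/7, about 7.14 calls (keep only 0..6)
print(expected_rand5_calls(21))  # 50/21, about 2.38 calls (keep 0..20, take n mod 7)
```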

spydez · on Nov 24, 2008

If you want a spoiler:

http://ariya.blogspot.com/2007/11/random-number-15-to-17.htm...

He gives a few different ways to solve it, most with uniform distribution.

sfk · on Nov 24, 2008

The elegant solution escapes me for now, but you could treat the odds and evens from the 1-5 function as a biased bit stream, apply a von Neumann corrector and then use the resulting stream.
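A sketch of how that might look, assuming `rand5()` is uniform on 1 to 5. The final step (three fair bits, retry on 0) is an added assumption about how the corrected stream would be used; all names are illustrative.

```python
import random

def rand5():
    """Stand-in for the given generator: uniform integer in 1..5 (an assumption)."""
    return random.randint(1, 5)

def biased_bit():
    """Parity of rand5(): 1 with probability 3/5, 0 with probability 2/5."""
    return rand5() % 2

def fair_bit():
    """Von Neumann corrector: draw pairs, keep the first bit of an unequal pair."""
    while True:
        first, second = biased_bit(), biased_bit()
        if first != second:
            return first

def rand7():
    """Assemble three fair bits into 0..7 and retry on 0, leaving 1..7."""
    while True:
        n = fair_bit() * 4 + fair_bit() * 2 + fair_bit()
        if n != 0:
            return n
```

The pairs (0, 1) and (1, 0) both occur with probability 6/25, which is why the surviving bit is unbiased.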

jbm · on Nov 24, 2008

Now I remember why we computer scientists always looked up to the Math majors...

csl · on Nov 26, 2008

Can't you just do this?

    int rand7()
    {
        return (int)(7.0f * rand5() / 5.0f);
    }

csl · on Nov 26, 2008

No -- I get zero hits for 4 and 7 with this approach.

csl · on Nov 26, 2008

Ok, just to limit my embarrassment, I've implemented an alternative solution -- though not as good as the others.

First convert rand5() to rand2(). The LSB from rand5() has a uniform distribution for the integers 0--3:

    int rand2()
    {
        int n = rand5();
        return n!=4 ? n & 1 : rand2();
    }

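Now we build a three-bit number, retrying on 7 so that only 0--6 come out:

    int rand7()
    {
        /* Three fair bits give 0..7; reject 7 so that 0..6 stay equally likely. */
        int n;
        do {
            n = rand2();
            n |= rand2() << 1;
            n |= rand2() << 2;
        } while (n == 7);
        return n;
    }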

This gives the proper distribution as well, but it's not branch free, so really nothing new here.

maximilian · on Nov 24, 2008

I originally thought (f-1)*6/4+1 = (f-1)*3/2+1, but it's an integer... hmmm

gonna have to think about that some more

horia314 · on Nov 24, 2008

  int rand17() {
      int t = rand15();

      if (t > 3)
          return t;
      else
          return 2 + rand15();
  }

I dunno? will that work?
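The exact output probabilities can be worked out by enumeration. This sketch is in Python and assumes `rand15()` returns 1 to 5 uniformly; `rand17_distribution` is a hypothetical name.

```python
from fractions import Fraction

def rand17_distribution():
    """Exact output probabilities of rand17(), with rand15() uniform on 1..5."""
    fifth = Fraction(1, 5)
    dist = {k: Fraction(0) for k in range(1, 8)}
    for t in range(1, 6):
        if t > 3:
            dist[t] += fifth                  # 4 or 5 is returned directly
        else:
            for u in range(1, 6):
                dist[2 + u] += fifth * fifth  # second call, shifted up by 2
    return dist

for value, prob in rand17_distribution().items():
    print(value, prob)
```

Under that assumption 1 and 2 never come out, while 4 and 5 each appear with probability 8/25, so the result is not uniform.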