Hacker News new | past | comments | ask | show | jobs | submit login

Could anyone point me on how to solve this(either a solution or preferrably a pointer on how to get to it): "Q: Given a function which produces a random integer in the range 1 to 5, write a function which produces a random integer in the range 1 to 7."

First I thought it was simple but the I got stuck, maybe I'm just tired. No matter how I twsit and turn it I seem to get only an even distribution over 5 numbers.




Execute f 7 times, add all numbers, call that x. Then you do mod(x,7)+1, and you get a random number between 1 and 7. If the original function was unbiased, this one is going to be unbiased too.


This won't work. f gives 1 to 5. 7 * f gives 7 to 35. But 7 to 35 will not be evenly generated. Think about it: There are more ways to get a 20 than there are to get a 7 or a 35. Same thing with rolling 2 die.


That was my second thought too (my first thought was to upvote the comment), but at least in the 2-die case, the modulus takes care of it. If you work out the probabilities, the chance of getting 0mod2 = 1/36 (2) + 3/36 (4) + 5/36 (6) + 5/36 (8) + 3/36 (10) + 1/36 (12) = 18/36 = 1/2. Same goes for getting 1mod2. So you end up with a fair result even though the chances for each individual outcome are biased.

I didn't want to go through all 5^7 possibilities for the 7-die case, but I figured it's likely enough that he's right that I'd keep my mouth shut.


experimental evidence says that it is in fact a uniform distribution:

  r = random.Random()

  def one_to_five():
    return r.randint(1, 5)

  def mod_seven():
    return (sum(one_to_five() for x in xrange(7)) % 7) + 1

  def test_dist(lst):
    return [(x, lst.count(x)) for x in [1,2,3,4,5,6,7]]


i've tried it out experimentally in google docs and it looks really good for big numbers. No real proof tho and it might be that the errors counter each other by chance:

http://spreadsheets.google.com/ccc?key=pq4tB7LQWN03gF7ImGhIP...


Have a look at the distributions for each sum here: http://brianpmearns.com/ext/dice/dice.html

edit: Oops bad addition on my part, could be uniform, but you'd have to work out the actual number exactly.


here are the exact numbers:

1 -> 11177

2 -> 11172

3 -> 11158

4 -> 11144

5 -> 11144

6 -> 11158

7 -> 11172

...not quite


I've done a lot of Python but never knew about xrange - thanks!


Ok, a less elegant one then, but one that works for a reasonably simple reason. f gives 1 to 5, if it gives 5, try again, and so on, until you have a number from 1 to 4. Then, do that mod 2. You have a random binary digit, that's unbiased.

Now use that process to get 3 binary digits. You get a random number from 0 to 7. If the random number is 0, start again... eventually you'll get a number from 1 to 7, and all numbers have the same chances.


You're right in that my original reasoning was wrong. Thank you for your comment.


f * f gives you 25 boxes, number 11, 12, ... 15, 21,...25...55. Assign 3 each of 21 of those to 1..7. If f * f doesn't fall into one of those 21 boxes, repeat until it does.


Runtime: O(infinity)?


Technically speaking, yes.

But, if you want to uniformly map a random number from set X to set Y where (IIRC) lcm(|X|, |Y|) != |X|, it seems you need an infinite worst-case running time.

Here's an informal proof that you can't have a finite upper bound to the number of iterations. After n iterations, you have |X|^n possible outcomes. But, since lcm(|X|, |Y|) != |X|, |X|^n cannot be divided evenly by |Y| (since its factors are the same). So some outcomes in Y must be more likely than others.

(This is not nearly complete, but hopefully it's enough to show how it might be right.)

Of course, in practice it is highly unlikely that you will get past more than a couple iterations before determining an outcome.


since lcm(|X|, |Y|) != |X|, |X|^n cannot be divided evenly by |Y| (since its factors are the same)

Can you go into more detail about this part?


Sure. Sorry, I didn't take the time to work this out on paper before posting or I would have realized that the condition itself is wrong. The condition lcm(|X|,|Y|) != |X| instead should be that |Y| has some prime factor that |X| does not.

Here is an explanation with the new condition:

Let p be any prime factor of |Y| that |X| does not have. It follows from Euclid's First Theorem[1] that p cannot divide |X|^n for any n [2]. Since every integer (> 1) has a unique prime factorization, it follows that |X|^n can't divide |Y|, because the prime p divides |Y| but not |X|^n.

[1] http://mathworld.wolfram.com/EuclidsTheorems.html [2] We are given that p does not divide |X|^1. Suppose that p also does not divide |X|^(n-1) for some n > 1. |X|^n = |X|^(n-1) * |X|^1, so by Euclid's First Theorem, if p divides |X|^n it must divide either |X|^(n-1) or |X|^1. We know it divides neither, so p does not divide |X|^n. By induction, this is true for all n > 0.


Observe the following lookup table, which is a 5x5 matrix.

12345 67123 45671 23456 7RRRR

R means "reroll".

Constant time execution in average case. Mathematically provable that it is as unbiased as your rand5() function. Technically not guaranteed to terminate but if you dock me points for that you're technically not guaranteed to survive to the end of the interview, are you.


My team uses this as a screening question, so I won't give you the answer - but you forgot to mention that both the rand5 and rand7 should have a uniform distribution. :)

I'll give you a hint, though - the solution is not elegant at all. Which is part of the point.


1) Given a number 1 to 5, get a new one if it's 5, else throw away the MSB, you get 2 bits of randomness per number. Subtract one and call it a.

(Now you have two bits of randomness evenly distributed among the set {00, 01, 10, 11})

2) Get another number 1 to 5, toss it if it's 5, take the LSB. Call it b.

(Now you have another bit of randomness, evenly distributed among {0,1})

3) if b == 1 and a == 11, start over. else return (4b)+a+1

Python implementation (with the -1 then +1 factored out):

  def one_to_seven(debug=False):
    a = one_to_five()
    while a == 5: a = one_to_five()
    b = one_to_five()
    while b == 5: b = one_to_five()
    b = (b % 2) * 4
    if a+b == 8: return one_to_seven()
    return a+b


patio11's solution is about twice as efficient as mine, and is a much simpler way to think about it.


Dumb/obvious idea: Generate 7 random numbers, pick the index that had the highest value. (If theres a tie, run it again!)

(Note: not guaranteed to terminate.)


Here's my shot at a solution:

;; * SPOILER? *

  (defun rand7 ()
    (loop (let ((result (+ (1- (rand5))
	   		   (1- (rand5)))))
	    (if (< result 7)
	        (return (1+ result))))))
;; * SPOILER *

edit: If it doesn't work I'd like to know why...


Generate two random numbers with your rand5 function, let them be the two digits of your base 5 number. Each number from 0-24 has a 1/25 chance of being picked.

If the number is greater than 6 throw it away and repeat, otherwise you have your random number.


Works, but not as efficient as possible. Look at patio11's solution; you only need to throw it away and repeat if it's >= 21. Otherwise you can just take n mod 7.


If you want a spoiler:

http://ariya.blogspot.com/2007/11/random-number-15-to-17.htm...

He gives a few different ways to solve it, most with uniform distribution.


The elegant solution escapes me for now, but you could treat the odds and evens from the 1-5 function as a biased bit stream, apply a von Neumann corrector and then use the resulting stream.


Now I remember why we computer scientists always looked up to the Math majors...


Can't you just do this ?

    int rand7()
    {
        return (int)(7.0f * rand5() / 5.0f);
    }


No -- I get zero hits for 4 and 7 with this approach.


Ok, just to limit my embarrassment, I've implemented an alternative solution -- though not as good as the others.

First convert rand5() to rand2(). The LSB from rand5() has a uniform distribution for the integers 0--3:

    int rand2()
    {
        int n = rand5();
        return n!=4 ? n & 1 : rand2();
    }
Now we simply build a three bit number:

    int rand7()
    {
        int n = rand2();
        n |= rand2() << 1;
        n |= rand2() << 2;
        return n;
    }
This gives the proper distribution as well, but it's not branch free, so really nothing new here.


I originally thought (f-1)6/4+1 = (f-1)3/2+1, but it's an integer... hmmm

gonna have to think about that some more


int rand17() { int t = rand15();

  if(t > 3)
      return t;
  else
      return 2 + rand15();
}

I dunno? will that work?




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: