Hacker News new | past | comments | ask | show | jobs | submit login

Nicely said.

I was thinking of ways to solve the problem by eliminating the overhead of checking whether a random number was already drawn, when Vitter's solution was to make the random number generator produce random numbers in a sorted manner, thus eliminating that overhead entirely.

EDIT: Then again... if the algorithm is to be used to shuffle a list, wouldn't returning the selection in a sorted manner defeat the entire purpose? If k = n, output would be input...




It would. The original problem is to draw a subset of tweets.

If there's a chance to randomize the entire tweet archive, then it's a shuffling algorithm now.




Consider applying for YC's Spring batch! Applications are open till Feb 11.

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: