So, sampling points in a circle is pretty basic, but it does remind me of a nice 3Blue1Brown video (https://www.youtube.com/watch?v=mZBwsm6B280) about random chords of a circle. Some of these 'generate a random sample' situations can be quite non-obvious.
edit: The video is about 'Bertrand's Paradox' that another post references.
Related only through 3b1b, but someone did a deeper investigation into a few different ways to sample uniformly in a circle and (even though the point here is mathematical interest) benchmarked them! https://youtube.com/watch?v=4y_nmpv-9lI
This is an instance of the general problem of finding a function F such that F applied to the uniform distribution over [0,1]^n has the properties you desire (in this case, n=2 and F is the polar transform).
I have very often found myself wishing for an extensive reference table of such F functions when coding, as in:
- draw one or many samples from the language's PRNG
- apply F to it
- get a random number that "does what I want"
Unfortunately:
1. The math for F, while not hard in theory, rarely yields closed-form solutions (you've got to, IIRC, integrate the density to get the CDF and then invert that, and neither step is easy for some distros; see the sketch below for a case where it works out).
2. There are very interesting methods that produce similar results (such as the Box-Muller transform [1]), even if some of them involve rejection sampling.
What I've never seen, though, is an overall theoretical + practical treatment of the problem: how do I get the distribution I want by applying a function (or maybe an algorithm) to the output of a uniform PRNG?
If anyone is aware of such work, would love to learn about it.
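For what it's worth, here's a minimal sketch of the textbook recipe (inverse transform sampling) for a case where the closed form does exist, the exponential distribution; the function name and `lambda` parameter are just illustrative:

    // Inverse transform sampling: if U is uniform on [0,1) and F is the CDF
    // of the target distribution, then F^-1(U) has that distribution.
    // For Exponential(lambda), F(x) = 1 - exp(-lambda * x), so
    // F^-1(u) = -ln(1 - u) / lambda.
    function sampleExponential(lambda) {
      const u = Math.random(); // uniform in [0, 1), so 1 - u is never 0
      return -Math.log(1 - u) / lambda;
    }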
Keenan Crane's solution is really neat in that regard, btw.
Related to applying a non-linear function to randomly distributed points is the Unscented Transform[0]. It approximates how the function changes the distribution's mean and standard deviation by selecting several points, applying the function to these points and calculating the transformed mean and standard deviation from the transformed points.
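For the curious, here's a rough 1D sketch of that procedure, assuming the original Julier sigma-point formulation (the function name and the kappa default are illustrative conventions, not something from [0]):

    // 1D unscented transform: propagate the mean and stddev of N(mu, sigma^2)
    // through a nonlinear f using 3 weighted sigma points.
    function unscentedTransform1D(mu, sigma, f, kappa = 2) {
      const n = 1; // dimension
      const spread = Math.sqrt(n + kappa) * sigma;
      const points = [mu, mu + spread, mu - spread];
      const w = [kappa / (n + kappa), 0.5 / (n + kappa), 0.5 / (n + kappa)];
      const y = points.map(f); // push each sigma point through f
      const mean = y.reduce((acc, yi, i) => acc + w[i] * yi, 0);
      const variance = y.reduce((acc, yi, i) => acc + w[i] * (yi - mean) ** 2, 0);
      return { mean, stddev: Math.sqrt(variance) };
    }

As a sanity check, unscentedTransform1D(0, 1, x => x * x) returns mean 1 and stddev sqrt(2), which happen to be the exact moments of X^2 for a standard normal.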
Alternative: sample random points in the 2r*2r square that bounds the circle, and reject all the ones farther than the radius from the center. This can be generalized to any shape where we have an f(x,y) -> bool that tells us whether a point is inside or outside our shape.
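A hedged sketch of that alternative, using the squared-distance comparison mentioned downthread so no sqrt is needed (`insideShape` is the hypothetical f(x,y) -> bool):

    // Rejection sampling: draw from the bounding square until the predicate
    // accepts. Works for any shape given an inside test; runtime is random.
    function sampleByRejection(r, insideShape) {
      while (true) {
        const x = (2 * Math.random() - 1) * r;
        const y = (2 * Math.random() - 1) * r;
        if (insideShape(x, y)) return [x, y];
      }
    }

    // For the disk, compare squared distances so no sqrt is needed.
    const insideDisk = (r) => (x, y) => x * x + y * y <= r * r;
    // Usage: sampleByRejection(1, insideDisk(1));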
Graphics programming perspective: we call this "rejection sampling". It may work, and it is a more general method that can sample from more difficult probability distributions, but it has significant disadvantages. First, the runtime is the worst case across all threads invoking this in a GPU SIMD unit. Second, you have to evaluate the pdf itself, which is easy in this case and harder in general. The sqrt really doesn't cost enough to outweigh the cons in a shader. You can use the reciprocal-sqrt (rsqrt) intrinsic if precision is less important, or square the LHS if the ALU is actually a problem (or to conserve more energy).
For certain purposes (like graphics programming), this is not a viable option. Not only are you bounded by time (16 ms for a single frame means you have to limit the number of points you sample, which may mean that from one frame to the next you have way fewer samples), but checking the distance is way, way less efficient than a simple math expression that can be easily parallelized.
I think you can avoid taking the square root in the distance formula (by squaring the comparison term), so this could actually be the fastest way to achieve the effect, because among all the sampled points, the ones outside are a minority. In the method the tweets mention, by contrast, you do need to take the square root, and that is the slowest thing in the expression.
That is indeed the fastest way in my experience (at least on the CPU). It's very unlikely to have more than a few rejections, so any method that requires evaluating trigonometric functions cannot compete.
Another attempt at a sqrt-free "uniform point nearly in circle": point in a regular N-gon. It's easy to sample a point uniformly within a triangle (work not shown here, but see the sketch below).
Sample from the triangle that has (0,0), (1,0) and (cos 2pi/N, sin 2pi/N) as its vertices. Then uniformly pick an integer M from 0 to N-1 and (by table look-up of sin/cos) rotate this point by the angle M*2pi/N. Even for modest values like N=128 or 256 this is very close to a circle (around 0.1% error).
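A sketch of that recipe in JavaScript; the fold at the start is the standard sqrt-free, rejection-free way to sample a triangle uniformly (the "work not shown" above):

    const N = 256;
    const COS = new Float64Array(N), SIN = new Float64Array(N);
    for (let m = 0; m < N; m++) {
      COS[m] = Math.cos(2 * Math.PI * m / N);
      SIN[m] = Math.sin(2 * Math.PI * m / N);
    }

    function samplePointInNgon() {
      let u = Math.random(), v = Math.random();
      if (u + v > 1) { u = 1 - u; v = 1 - v; } // fold back into the triangle
      // Point in the triangle (0,0), (1,0), (cos 2pi/N, sin 2pi/N):
      const x = u + v * COS[1];
      const y = v * SIN[1];
      // Rotate by a random multiple of 2pi/N via table look-up.
      const m = Math.floor(Math.random() * N);
      return [x * COS[m] - y * SIN[m], x * SIN[m] + y * COS[m]];
    }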
However, I'm not familiar enough with GPU compute to know if this maps well onto the available operations. It seems like a table look-up is a lot like a texture look-up so I'm tempted to assert it should be fast.
Yes, though as a sibling comment points out, rejection can be bad for performance (hard to parallelize). I should have specified that the algorithm I suggested was sqrt-free and rejection-free (but it does have a problem with the distribution right near the edge of the circle).
I think it's interesting that the article mentions some mathematicians still consider the paradox unresolved. In my opinion, all this paradox is, is an under-specified problem. We don't, after all, consider a question like "what is the length of the third side of a triangle with two sides of length 5?" to be a paradox.
I remember some time ago I needed to sample points on the surface of a sphere, and I basically just generated cos and sin of random lat/long as the obvious answer. Then I got pointed at another solution (generate a vector of independent normal samples and normalize it) and realized just how dumb I was.
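For reference, a minimal sketch of that second solution, using Box-Muller (mentioned upthread) to get the normals:

    // Box-Muller: two uniforms in, one standard normal out.
    function standardNormal() {
      const u1 = Math.random() || Number.MIN_VALUE; // avoid log(0)
      const u2 = Math.random();
      return Math.sqrt(-2 * Math.log(u1)) * Math.cos(2 * Math.PI * u2);
    }

    // Normalizing a vector of independent standard normals lands uniformly
    // on the unit sphere (see the rotational-invariance argument downthread).
    function samplePointOnSphere() {
      const x = standardNormal(), y = standardNormal(), z = standardNormal();
      const len = Math.hypot(x, y, z);
      return [x / len, y / len, z / len];
    }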
Why would anyone think you could do that? If someone thought, "I want to uniformly sample a subset of x-y space, so I guess I'll instead sample in r-theta space and then use the nonlinear transformation x = r cos theta, y = r sin theta to sample uniformly,"
I would assume they didn't even know what uniform meant.
And graphics people are so insane about perf, they aren't going to just start doing sqrt over and over. They are going to rejection sample anyway, because it's just 2 FMAs.
First, the cost is 2 FMAs plus the cost of generating 2 random numbers, multiplied by the number of rejection-sampling iterations. On average, you reject (4-pi)/4 ~= 0.21 of the time. However, if you're running on a GPU in a warp (or the equivalent) of 32 threads, then you pay the cost of the maximum number of rejections over all the threads.
The bigger issue is that with a direct mapping from [0,1]^2 to the disk, as described in this tweet, well-distributed uniform samples in [0,1]^2 give you well-distributed samples on the disk. Thus, stratified or low-discrepancy samples in [0,1]^2 end up well distributed on the disk. In turn, this generally gives benefits in terms of error when Monte Carlo integrating functions over the disk.
It's not clear that rejection sampling would be faster.
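For reference, the direct mapping under discussion is tiny; the sqrt on the radius is what makes the result uniform by area rather than clustered at the center, and because it's a deterministic map from [0,1]^2, it preserves stratification:

    function samplePointInDisk(radius) {
      const r = radius * Math.sqrt(Math.random()); // sqrt corrects area density
      const theta = 2 * Math.PI * Math.random();
      return [r * Math.cos(theta), r * Math.sin(theta)];
    }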
Assuming this is a graphics operation running on a GPU: taking the sqrt is pretty optimized, and having a branch for every point to be sampled is not fun.
What costs more: generating a random number and maybe a second one, or a random number and a sin/cos? This sounds very platform-dependent.
Note that this gets worse when generating random points in the 4-ball (e.g. to derive random quaternions): the ball takes up a much smaller fraction of the hypercube than the disk does of the square (the acceptance rate drops from pi/4 ~= 0.79 to pi^2/32 ~= 0.31), so rejection sampling can take many iterations.
> What costs more: generating a random number and maybe a second one, or a random number and a sin/cos
You’re adding a very unpredictable loop branch and a bunch of data dependencies doing that. For GPUs (and maybe even CPUs), the sqrt way is almost certainly faster.
If you have ALUs to spare, you can generate multiple candidate points in parallel, say 4; it is likely that at least one is inside the disk, and you can select it branchlessly. Then you only need to loop in the very, very rare case that all of them miss.
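A rough CPU-flavored sketch of that idea (on a GPU the inner selection would be done with predicated/select instructions rather than real branches; all four candidates miss with probability ((4-pi)/4)^4, about 0.2%):

    function samplePointInDiskBatched(radius) {
      for (;;) {
        let best = null;
        for (let i = 0; i < 4; i++) {
          const x = (2 * Math.random() - 1) * radius;
          const y = (2 * Math.random() - 1) * radius;
          // Squared-distance test; keep the first candidate that lands inside.
          if (best === null && x * x + y * y <= radius * radius) best = [x, y];
        }
        if (best !== null) return best; // loop only in the very rare all-miss case
      }
    }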
Spot on. A lot of graphics people in the thread point this out. (I am not a graphics person; I occasionally tinker, but mostly just appreciate it from a distance.)
The logic is that independent standard normals on each axis make up the multivariate standard normal, and the multivariate standard normal is rotationally invariant, so it has the same density in every direction; normalizing such a vector therefore lands uniformly on the sphere.
Well, not sure whether you'd call it statistics. It's a mathematical fact from probability theory. And it doesn't matter how often you sample: the distributions themselves are the same.
> And graphics people are so insane about perf, they aren't going to just start doing sqrt over and over. They are going to rejection sample anyway, because it's just 2 FMAs
If you can re-jig the rest of your algorithm to consume the square of the radius (instead of the radius itself), you don't need to take a square-root here.
Here's a much simpler way to get the same distribution (in JavaScript syntax):
    // These two have the same distribution:
    var radius1 = Math.max(Math.random(), Math.random());
    var radius2 = Math.sqrt(Math.random());
Basically, taking the maximum of two uniformly distributed random variables is the same as taking the square root of one of them, in terms of their distributions: both have CDF P(X <= x) = x^2 on [0,1], since the max needs both uniforms below x, and sqrt(U) <= x iff U <= x^2.
Whether taking the max of two variables is cheaper than the square-root of one depends on implementation details, I guess?
You can also reuse the random variable that the max 'tossed away' by rescaling it.
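Putting both tricks together, a hedged sketch (the function name is illustrative): the leftover divided by the max is again uniform on [0,1] and independent of the max, so it can drive the angle.

    function samplePointInDiskMaxTrick(radius) {
      const u1 = Math.random(), u2 = Math.random();
      const hi = Math.max(u1, u2), lo = Math.min(u1, u2);
      if (hi === 0) return [0, 0]; // vanishingly rare; avoids 0/0 below
      const r = radius * hi;                 // same distribution as sqrt(random())
      const theta = 2 * Math.PI * (lo / hi); // reused uniform drives the angle
      return [r * Math.cos(theta), r * Math.sin(theta)];
    }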
And for those looking for a demonstration of an O(n) approach, Jason Davies' demo of Poisson disc sampling is a good start and links to Bridson's algorithm: https://www.jasondavies.com/poisson-disc/