So, sampling points in a circle is pretty basic, but it does remind me of a nice 3Blue1Brown video (https://www.youtube.com/watch?v=mZBwsm6B280) about random chords of a circle. Some of these 'generate a random sample' situations can be quite non-obvious.
edit: The video is about 'Bertrand's Paradox' that another post references.
Related only through 3b1b, but someone did a deeper investigation into a few different ways to sample uniformly in a circle and (even though the point here is mathematical interest) benchmarked them! https://youtube.com/watch?v=4y_nmpv-9lI
This is an instance of the general problem of finding a function F such that F applied to the uniform distribution over [0,1]^n has the properties you desire (in this case, n=2 and F is the polar transform).
I have very often found myself wishing for an extensive reference table of such F functions when coding, as in:
- draw one or many samples from the language's PRNG
- apply F to it
- get a random number that "does what I want"
Unfortunately:
1. The math for F, while not hard in theory, rarely yields closed-form solutions (you've got to, IIRC, integrate the density to get the CDF and then invert that, and neither step is easy for some distros; see the sketch below for a case where it works out).
2. There are very interesting methods that produce similar results (such as the Box-Muller transform [1]), even if some of them involve rejection sampling.
What I've never seen, though, is an overall theoretical + practical treatment of the problem: how do I get the distribution I want by applying a function (or maybe an algorithm) to the output of a uniform PRNG?
If anyone is aware of such work, would love to learn about it.
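For what it's worth, here's a minimal sketch of the textbook recipe (inverse transform sampling) for a case where the closed form does exist, the exponential distribution; the function name and `lambda` parameter are just illustrative:

    // Inverse transform sampling: if U is uniform on [0,1) and F is the CDF
    // of the target distribution, then F^-1(U) has that distribution.
    // For Exponential(lambda), F(x) = 1 - exp(-lambda * x), so
    // F^-1(u) = -ln(1 - u) / lambda.
    function sampleExponential(lambda) {
      const u = Math.random(); // uniform in [0, 1), so 1 - u is never 0
      return -Math.log(1 - u) / lambda;
    }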
Keenan Crane's solution is really neat in that regard, btw.
Related to applying a non-linear function to randomly distributed points is the Unscented Transform[0]. It approximates how the function changes the distribution's mean and standard deviation by selecting several points, applying the function to these points and calculating the transformed mean and standard deviation from the transformed points.
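For the curious, here's a rough 1D sketch of that procedure, assuming the original Julier sigma-point formulation (the function name and the kappa default are illustrative conventions, not something from [0]):

    // 1D unscented transform: propagate the mean and stddev of N(mu, sigma^2)
    // through a nonlinear f using 3 weighted sigma points.
    function unscentedTransform1D(mu, sigma, f, kappa = 2) {
      const n = 1; // dimension
      const spread = Math.sqrt(n + kappa) * sigma;
      const points = [mu, mu + spread, mu - spread];
      const w = [kappa / (n + kappa), 0.5 / (n + kappa), 0.5 / (n + kappa)];
      const y = points.map(f); // push each sigma point through f
      const mean = y.reduce((acc, yi, i) => acc + w[i] * yi, 0);
      const variance = y.reduce((acc, yi, i) => acc + w[i] * (yi - mean) ** 2, 0);
      return { mean, stddev: Math.sqrt(variance) };
    }

As a sanity check, unscentedTransform1D(0, 1, x => x * x) returns mean 1 and stddev sqrt(2), which happen to be the exact moments of X^2 for a standard normal.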
Alternative: sample random points in the 2r*2r square that bounds the circle, and reject all the ones farther than the radius from the center. This can be generalized to any shape where we have an f(x,y) -> bool that tells us whether a point is inside or outside our shape.
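A hedged sketch of that alternative, using the squared-distance comparison mentioned downthread so no sqrt is needed (`insideShape` is the hypothetical f(x,y) -> bool):

    // Rejection sampling: draw from the bounding square until the predicate
    // accepts. Works for any shape given an inside test; runtime is random.
    function sampleByRejection(r, insideShape) {
      while (true) {
        const x = (2 * Math.random() - 1) * r;
        const y = (2 * Math.random() - 1) * r;
        if (insideShape(x, y)) return [x, y];
      }
    }

    // For the disk, compare squared distances so no sqrt is needed.
    const insideDisk = (r) => (x, y) => x * x + y * y <= r * r;
    // Usage: sampleByRejection(1, insideDisk(1));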
Graphics programming perspective: we call this "rejection sampling". It may work, and it is a more general method that can sample from more difficult probability distributions, but it has significant disadvantages. First, the runtime is the worst case across all threads invoking this in a GPU SIMD unit. Second, you have to evaluate the pdf itself, which is easy in this case and harder in general. The sqrt really doesn't cost enough to outweigh the cons in a shader. You can use the reciprocal-sqrt (rsqrt) intrinsic if precision is less important, or square the LHS if the ALU is actually a problem (or to conserve more energy).
For certain purposes (like graphics programming), this is not a viable option. Not only are you bounded by time (16 ms for a single frame means you have to limit the number of points you sample, which may mean that from one frame to the next you have way fewer samples), but checking the distance is way, way less efficient than a simple math expression that can be easily parallelized.
I think you can avoid taking the square root in the distance formula (by squaring the comparison term), so this could actually be the fastest way to achieve the effect, because among all the sampled points, the ones outside are a minority. In the method the tweets mention, by contrast, you do need to take the square root, and that is the slowest thing in the expression.
That is indeed the fastest way in my experience (at least on the CPU). It's very unlikely to have more than a few rejections, so any method that requires evaluating trigonometric functions cannot compete.
Another attempt at a sqrt-free "uniform point nearly in circle": point in a regular N-gon. It's easy to sample a point uniformly within a triangle (work not shown here, but see the sketch below).
Sample from the triangle that has (0,0), (1,0) and (cos 2pi/N, sin 2pi/N) as its vertices. Then uniformly pick an integer M from 0 to N-1 and (by table look-up of sin/cos) rotate this point by the angle M*2pi/N. Even for modest values like N=128 or 256 this is very close to a circle (around 0.1% error).
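A sketch of that recipe in JavaScript; the fold at the start is the standard sqrt-free, rejection-free way to sample a triangle uniformly (the "work not shown" above):

    const N = 256;
    const COS = new Float64Array(N), SIN = new Float64Array(N);
    for (let m = 0; m < N; m++) {
      COS[m] = Math.cos(2 * Math.PI * m / N);
      SIN[m] = Math.sin(2 * Math.PI * m / N);
    }

    function samplePointInNgon() {
      let u = Math.random(), v = Math.random();
      if (u + v > 1) { u = 1 - u; v = 1 - v; } // fold back into the triangle
      // Point in the triangle (0,0), (1,0), (cos 2pi/N, sin 2pi/N):
      const x = u + v * COS[1];
      const y = v * SIN[1];
      // Rotate by a random multiple of 2pi/N via table look-up.
      const m = Math.floor(Math.random() * N);
      return [x * COS[m] - y * SIN[m], x * SIN[m] + y * COS[m]];
    }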
However, I'm not familiar enough with GPU compute to know if this maps well onto the available operations. It seems like a table look-up is a lot like a texture look-up so I'm tempted to assert it should be fast.
Yes, though as a sibling comment points out, rejection can be bad for performance (hard to parallelize). I should have specified that the algorithm I suggested was sqrt-free and rejection-free (but it does have a problem with the distribution right near the edge of the circle).
I think it's interesting that the article mentions some mathematicians still consider the paradox unresolved. In my opinion, all this paradox is, is an under-specified problem. We don't, after all, consider a question like "what is the length of the third side of a triangle with two sides of length 5?" to be a paradox.
I remember some time ago I needed to sample points on the surface of a sphere, and I basically just generated cos and sin of random lat/long as the obvious answer. Then I got pointed at another solution (generate a vector of independent normal samples and normalize it) and realized just how dumb I was.
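For reference, a minimal sketch of that second solution, using Box-Muller (mentioned upthread) to get the normals:

    // Box-Muller: two uniforms in, one standard normal out.
    function standardNormal() {
      const u1 = Math.random() || Number.MIN_VALUE; // avoid log(0)
      const u2 = Math.random();
      return Math.sqrt(-2 * Math.log(u1)) * Math.cos(2 * Math.PI * u2);
    }

    // Normalizing a vector of independent standard normals lands uniformly
    // on the unit sphere (see the rotational-invariance argument downthread).
    function samplePointOnSphere() {
      const x = standardNormal(), y = standardNormal(), z = standardNormal();
      const len = Math.hypot(x, y, z);
      return [x / len, y / len, z / len];
    }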
Why would anyone think you could do that? If someone thought, "I want to uniformly sample a subset of x-y space, so I guess I'll instead sample in r-theta space and then use the nonlinear transformation x = r cos theta, y = r sin theta to sample uniformly,"
I would assume they didn't even know what uniform meant.
And graphics people are so insane about perf, they aren't going to just start doing sqrt over and over. They are going to rejection sample anyway, because it's just 2 FMAs.
First, the cost is 2 FMAs plus the cost of generating 2 random numbers, multiplied by the number of rejection-sampling iterations. On average, you reject (4-pi)/4 ~= 0.21 of the time. However, if you're running on a GPU in a warp (or the equivalent) of 32 threads, then you pay the cost of the maximum number of rejections over all the threads.
The bigger issue is that with a direct mapping from [0,1]^2 to the disk, as described in this tweet, well-distributed uniform samples in [0,1]^2 give you well-distributed samples on the disk. Thus, stratified or low-discrepancy samples in [0,1]^2 end up well distributed on the disk. In turn, this generally gives benefits in terms of error when Monte Carlo integrating functions over the disk.
It's not clear that rejection sampling would be faster.
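For reference, the direct mapping under discussion is tiny; the sqrt on the radius is what makes the result uniform by area rather than clustered at the center, and because it's a deterministic map from [0,1]^2, it preserves stratification:

    function samplePointInDisk(radius) {
      const r = radius * Math.sqrt(Math.random()); // sqrt corrects area density
      const theta = 2 * Math.PI * Math.random();
      return [r * Math.cos(theta), r * Math.sin(theta)];
    }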
Assuming this is a graphics operation running on a GPU: taking the sqrt is pretty optimized, and having a branch for every point to be sampled is not fun.
What costs more: generating a random number and maybe a second one, or a random number and a sin/cos? This sounds very platform-dependent.
Note that this gets worse when generating random points in the 4-ball (e.g. to derive random quaternions): the ball takes up a much smaller fraction of the hypercube than the disk does of the square (the acceptance rate drops from pi/4 ~= 0.79 to pi^2/32 ~= 0.31), so rejection sampling can take many iterations.
> What costs more: generating a random number and maybe a second one, or a random number and a sin/cos
You’re adding a very unpredictable loop branch and a bunch of data dependencies doing that. For GPUs (and maybe even CPUs), the sqrt way is almost certainly faster.
If you have ALUs to spare, you can generate multiple candidate points in parallel, say 4; it is likely that at least one is inside the disk, and you can select it branchlessly. Then you only need to loop in the very, very rare case that all of them miss.
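A rough CPU-flavored sketch of that idea (on a GPU the inner selection would be done with predicated/select instructions rather than real branches; all four candidates miss with probability ((4-pi)/4)^4, about 0.2%):

    function samplePointInDiskBatched(radius) {
      for (;;) {
        let best = null;
        for (let i = 0; i < 4; i++) {
          const x = (2 * Math.random() - 1) * radius;
          const y = (2 * Math.random() - 1) * radius;
          // Squared-distance test; keep the first candidate that lands inside.
          if (best === null && x * x + y * y <= radius * radius) best = [x, y];
        }
        if (best !== null) return best; // loop only in the very rare all-miss case
      }
    }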
Spot on. A lot of graphics people in the thread point this out. (I am not a graphics person; I occasionally tinker, but mostly just appreciate it from a distance.)
The logic is that independent standard normals on each axis make up the multivariate standard normal, and the multivariate standard normal is rotationally invariant, so it has the same density in every direction; normalizing such a vector therefore lands uniformly on the sphere.
Well, not sure whether you'd call it statistics. It's a mathematical fact from probability theory. And it doesn't matter how often you sample: the distributions themselves are the same.
> And graphics people are so insane about perf, they aren't going to just start doing sqrt over and over. They are going to rejection sample anyway, because it's just 2 FMAs
If you can re-jig the rest of your algorithm to consume the square of the radius (instead of the radius itself), you don't need to take a square-root here.
Here's a much simpler way to get the same distribution (in JavaScript syntax):
    // These two have the same distribution:
    var radius1 = Math.max(Math.random(), Math.random());
    var radius2 = Math.sqrt(Math.random());
Basically, taking the maximum of two uniformly distributed random variables is the same as taking the square root of one of them, in terms of their distributions: both have CDF P(X <= x) = x^2 on [0,1], since the max needs both uniforms below x, and sqrt(U) <= x iff U <= x^2.
Whether taking the max of two variables is cheaper than the square-root of one depends on implementation details, I guess?
You can also reuse the random variable that the max 'tossed away' by rescaling it.
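Putting both tricks together, a hedged sketch (the function name is illustrative): the leftover divided by the max is again uniform on [0,1] and independent of the max, so it can drive the angle.

    function samplePointInDiskMaxTrick(radius) {
      const u1 = Math.random(), u2 = Math.random();
      const hi = Math.max(u1, u2), lo = Math.min(u1, u2);
      if (hi === 0) return [0, 0]; // vanishingly rare; avoids 0/0 below
      const r = radius * hi;                 // same distribution as sqrt(random())
      const theta = 2 * Math.PI * (lo / hi); // reused uniform drives the angle
      return [r * Math.cos(theta), r * Math.sin(theta)];
    }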
And for those looking for a demonstration of an O(n) approach, Jason Davies' demo of Poisson disc sampling is a good start and links to Bridson's algorithm: https://www.jasondavies.com/poisson-disc/