The easiest way to generate random points on a (hyper)sphere, at least with respect to most sampling APIs, is to sample from an isotropic normal distribution of the appropriate dimension (for a three-dimensional sphere, just call randn() three times), then normalize the result to be on the surface of the sphere.
This works because the normal distribution is rotationally symmetric, so each sample is essentially a random unit vector scaled by a random (normally distributed) radius. Once you remove the variation in radius you just get back the random unit vector, i.e. a point on the unit sphere.
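The normalize-a-Gaussian trick described above can be sketched in a few lines of numpy (the function name `random_on_sphere` is just for illustration):

```python
import numpy as np

def random_on_sphere(n_points, dim=3, rng=None):
    """Uniform points on the unit (dim-1)-sphere via normalized Gaussian samples."""
    rng = np.random.default_rng() if rng is None else rng
    v = rng.standard_normal((n_points, dim))          # isotropic normal samples
    return v / np.linalg.norm(v, axis=1, keepdims=True)  # project onto the sphere

pts = random_on_sphere(1000)
```

Because the Gaussian is isotropic, the direction of each row of `v` is uniform, and dividing by the norm discards the only non-uniform part (the radius).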
This is not quite correct. Generating a random vector using three calls to rand() is equivalent to picking a random point inside a cube. If you normalize the points, you're effectively squishing the corners of the cube in toward the center to make a sphere. Points near the corners of the containing cube will appear more frequently than points near the top, bottom, and 'sides'.
@ginko's comment is correct - you can fix the algorithm by throwing out any points that lie outside the sphere before normalizing.
As others have pointed out, the distinction between randn() and rand() is crucial here - the former gives you points from a (spherically symmetric) normal distribution, the latter gives you uniform points from the unit cube.
One of the advantages of the normal-sampling route over @ginko's rejection-based method is that in high dimensions almost all of the volume of the unit cube is situated outside of the unit sphere (the volume of the unit cube is always 1, whereas the volume of the unit sphere decreases exponentially with dimension). So the rejection method becomes exponentially slow in high dimensions, while the Gaussian method still works just fine.
In three dimensions (to keep the notation reasonably simple) you need a distribution with density f such that f(x) * f(y) * f(z) is a function of x^2 + y^2 + z^2 (i.e. of the squared distance from the origin). It looks like the Gaussian (up to scaling) is the only one.
Without loss of generality, take f(xi) = k1 * exp(-g(xi)) [1], for some function g. Then we need the joint pdf to satisfy f(x1,...,xn) = k2 * exp(-h(R^2)), where R = (sum xi^2)^(1/2) (writing the argument as R^2 with some function h(.) is w.l.o.g. too). In two dimensions this gives g(x1) + g(x2) = h(x1^2 + x2^2). Then, assuming g and h are analytic, we end up needing g(x) = k * x^2; otherwise we get cross terms in the Taylor expansion that can't be cancelled out for all xi. Sounds good?
[1] The function f trivially needs to be symmetric, justifying no loss of generality.
Hmm, I'm not sure I get your point. I'm trying to prove that the joint pdf of N i.i.d. RVs is isotropic if and only if the RVs are Gaussian. If I assume the pdfs are Gaussian in the first place, the proof isn't valid, is it?
I don't totally understand the math, but davmre is suggesting calling randn(), not rand() -- so sampling from a normal distribution instead of a uniform distribution.
The math is fairly simple. The probability density of a multivariate standard Gaussian has the simple form f = a * exp(-(x1^2 + ... + xN^2)) = a * exp(-R^2), where a is a normalizing constant and R is the norm of the position vector; that is, it is obviously rotationally symmetric [1].
But that pdf is also the joint pdf of N i.i.d. Gaussians, as is evident by decomposing f = a * exp(-x1^2) * ... * exp(-xN^2) [2], which is the joint pdf of x1, ..., xN with f_x1 = c * exp(-x1^2), ..., f_xN = c * exp(-xN^2).
[1] Since exp(-R^2) does not depend on direction but only on distance from the origin
[2] The fact that f(x1,...xN)=f1 * ... * fN if x1,...xN are independent follows directly from the fact that P(A & B) = P(A)*P(B) if A and B are independent events.
Great point, you are correct. It's not enough for the distribution of the random vectors to be rotationally symmetric. It must be isotropic, looking the same in every direction from the origin, which a cube does not. (Nor does any other method deriving from a polyhedral solid, as I was thinking might have been possible, through something like selecting a random face from an icosahedron then a random point on that face.)
Not necessarily the most efficient though: the Box-Muller transform that is probably used in the call to randn is related to the transform you'd need to start from uniformly distributed reals.
So your solution ends up pretty much using the same math (while probably calling cos a few more times) and throwing away at least one random number and possibly half of them (depending on the implementation of randn).
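To make the "same math" point concrete, here is a minimal sketch of the Box-Muller transform, which turns pairs of uniform reals into standard-normal samples (the function name `box_muller` is my own; real randn implementations may use a polar or ziggurat variant instead):

```python
import numpy as np

def box_muller(n, rng=None):
    """Generate n standard-normal samples from pairs of uniforms (Box-Muller)."""
    rng = np.random.default_rng() if rng is None else rng
    u1 = rng.random((n + 1) // 2)
    u2 = rng.random((n + 1) // 2)
    r = np.sqrt(-2.0 * np.log1p(-u1))    # log1p(-u1) = log(1 - u1), avoids log(0)
    z = np.concatenate([r * np.cos(2 * np.pi * u2),
                        r * np.sin(2 * np.pi * u2)])
    return z[:n]

z = box_muller(100_000)
```

Note the cos/sin calls: the transform is already doing trigonometry on a uniform angle, which is why calling randn and normalizing ends up closely related to sampling spherical coordinates directly.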
Yeah, the Gaussian trick is like programming in a high-level language: using existing abstractions lets you write clean and concise code, at the price of some efficiency versus the optimal engineered-from-scratch solution.
Here's a nice article with pictures on sphere point picking [0]. Here is another [1]. The surface of a sphere in 3-dimensional space has only two dimensions: you can specify any point on the sphere with two coordinates (spherical coordinates). The mathematical notation for this sphere is S^2. If you're using three coordinates to generate the point, you're using more than you need. It's straight algebra to convert from spherical to Cartesian coordinates if you need those for your work.
The easiest way I know how to do it: call rand() 3 times, which gives you a point in a cube -> calculate the length of the vector and reject the sample if it lies outside the sphere -> normalize. IIRC this is well distributed, yet not fast.
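A minimal sketch of this rejection approach (the function name `rejection_sphere_point` is just for illustration):

```python
import numpy as np

def rejection_sphere_point(rng=None):
    """One uniform point on the unit sphere: sample the cube, reject, normalize."""
    rng = np.random.default_rng() if rng is None else rng
    while True:
        v = rng.uniform(-1.0, 1.0, size=3)   # random point in the cube [-1, 1]^3
        r = np.linalg.norm(v)
        if 0.0 < r <= 1.0:                   # keep only points inside the ball
            return v / r

p = rejection_sphere_point()
```

The acceptance check is what restores uniformity: conditioned on landing inside the ball, the direction of `v` is uniform, so normalizing is then safe.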
In even dimensions, volume of a sphere of radius 1 is:
V_{2k} = pi^k/k!
For 10 dimensions, the hypersphere is of volume 2.55, but the enclosing box is of volume 2^10 = 1024. So roughly one in 400 trials gets you a point inside the sphere. Not good odds.
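The quoted odds are easy to check numerically, using V_{2k} = pi^k/k! for the unit ball and 2^(2k) for the enclosing box of side 2 (the helper names below are made up for this sketch):

```python
import math

def even_ball_volume(two_k):
    """Volume of the unit ball in 2k dimensions: pi^k / k!."""
    k = two_k // 2
    return math.pi ** k / math.factorial(k)

def acceptance_rate(two_k):
    """Probability that a uniform point in [-1, 1]^d lands inside the unit ball."""
    return even_ball_volume(two_k) / 2.0 ** two_k

rate10 = acceptance_rate(10)   # roughly 1 in 400
```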
The rejection method is never better, I think; work smarter, not harder. But it's an easy way to remember and implement. I don't recommend using it when you need any actual performance.
In terms of CPU power, calculating the length of the vector and doing a branch is, I think, going to be slower than the proper way to do it.
> Note that the points are no longer independent of each other, hence they are no longer uniformly distributed.
This is false. Random variables can be uniform even if they are not independent.
For example, let X be uniform(0, 1), and let Y = X + 0.1 mod 1. Clearly they are not independent. Clearly X is uniform. When you realize that you could redefine the variables as Y = uniform(0, 1) and X = Y - 0.1 mod 1, then it becomes obvious that Y is also uniform.
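This example can be checked with a quick simulation (a sketch; the bin-count check is just a coarse uniformity test, not a rigorous one):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.random(200_000)          # X ~ uniform(0, 1)
y = (x + 0.1) % 1.0              # Y is fully determined by X, yet still uniform

# Coarse uniformity check: each of 10 equal bins should hold ~10% of the samples.
hist_y, _ = np.histogram(y, bins=10, range=(0.0, 1.0))
fractions = hist_y / len(y)
```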
It is common to know a variable's distribution without knowing whether it is independent of other variables.
Good catch. Probably what the article meant to say is that the lack of independence means the points are not conditionally uniform. That is, although X and Y follow uniform distributions individually, the dependence between them means that once you observe X, your resulting belief about Y is no longer a uniform distribution. Compare this to the independent case where knowing X would tell you nothing about Y, so your belief about Y conditioned on X would still just be uniform.
Let X be uniform(0, 1) and let Y = X. Clearly they are not independent. Clearly X is uniform. And it's obvious that Y is also uniform.
The comment you quote means "the points are no longer independent of each other, hence each point is no longer uniformly distributed given the others". In your example, Y doesn't have a uniform distribution for X fixed.
>Although we’ve successfully generated uniformly distributed points on a sphere, it feels messy. Some points seem too close together, and some seem too far apart.
This is how randomness should look. If you randomly generate (using a high-quality random engine) uniformly distributed points on a square plane or even a line, you'll see the same apparent clustering.
Put another way, if you plot every point on a coordinate grid with integer coordinates, you'll have a uniform distribution, but it will hardly be "random."
> One solution is to pick λ ∈ [-180°, 180°) as before and then set φ = cos^(-1)(2x - 1), where x is uniformly distributed and x ∈ [0, 1).
There is a fascinating fact hidden in the above sentence: if a point (x,y,z) is uniformly distributed on the surface of a sphere -- say, of radius 1, centered at (0,0,0) -- then each of x, y, and z is uniformly distributed over the interval [-1, 1].
Are you sure? If z is distributed over the surface of the sphere, there is far less surface near the extremes -1 and 1. Doesn't z being uniform in the range (-1, 1) lead to points actually being more densely concentrated near the poles? Or did I misunderstand what you meant?
Near the poles the surface becomes very flat, and therefore there is as much surface area in a slice of constant height as anywhere else. Archimedes was a clever fella :-)
Yes, it is a really weird property of 2-spheres that that holds.
One consequence is that yet another way to pick a point uniformly on the sphere is to choose z ∈ [-1, 1] uniformly, then choose λ ∈ [-180°, 180°) uniformly and use the point (sqrt(1 - z^2) * cos λ, sqrt(1 - z^2) * sin λ, z).
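This z-uniform (Archimedes) construction can be sketched as follows (the function name `archimedes_sphere` is made up for this example; angles are in radians rather than degrees):

```python
import numpy as np

def archimedes_sphere(n, rng=None):
    """Uniform points on the unit sphere: z uniform in [-1, 1], longitude uniform."""
    rng = np.random.default_rng() if rng is None else rng
    z = rng.uniform(-1.0, 1.0, n)
    lam = rng.uniform(-np.pi, np.pi, n)   # longitude
    s = np.sqrt(1.0 - z * z)              # radius of the horizontal circle at height z
    return np.column_stack([s * np.cos(lam), s * np.sin(lam), z])

pts = archimedes_sphere(50_000)
```

A nice sanity check: each coordinate of a uniform point on the sphere should itself be uniform on [-1, 1], hence have standard deviation sqrt(1/3).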
I think that's a consequence of spherical symmetry. I'm not sure I have the entire logic worked out, but because it's isotropic, then the distributions need to be uniform I think.
Nor for 3-spheres (i.e. the 3-dimensional surface defined as all the points in 4-d space at distance 1 from the origin). Nor for higher dimensions than that.
It's peculiar to the 2-sphere ("ordinary" spheres).
I imagine there is a trend where, in higher dimensions, coordinates have a greater tendency to be near zero?
In a 0-sphere, x is either -1 or +1. In a 1-sphere, each of x and y is (informally speaking) more likely to be near +/- 1 than near 0. A 2-sphere gives us a uniform distribution for each coordinate. So I suppose that the coordinates of a 3-sphere are more likely to be near 0 than near +/- 1, and that this tendency gets more pronounced the higher the dimension gets. (?)
n-spheres are funny things. Intuition about them is often misleading. (See, for example, the comments expressing skepticism about my original uniform-distribution observation.)
----
EDIT. Ooo, some interesting questions here. What is the limiting behavior of the distribution of x on the n-sphere, as n -> +infinity? I imagine the graph looks like a narrower & narrower spike at x = 0.
Now, suppose we scale the graph of the distribution horizontally by some appropriate function of n. That is, let f_n be the probability distribution of x on the n-sphere. Is there some function g:Z -> R so that the function x -> f_n(x / g(n)) has a limit as n -> +infinity? Does it approach (wild guess) a normal distribution? If so, is this fact (even wilder guess) a special case of some kind of central-limit-theorem-ish statement that holds for geometrical objects?
Informally, the result of "Asymptotic distribution of coordinates on high-dimensional spheres" by M. C. Spruill (http://www.emis.ams.org/journals/EJP-ECP/_ejpecp/ECP/include...) is that as n gets large, the distribution of x approaches a normal distribution with mean 0 and standard deviation 1/sqrt(n). That paper contains what might be some interesting references to the literature.
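A quick numerical check of that asymptotic, using the normalize-a-Gaussian sampler discussed upthread (a sketch, not a proof; the 1/sqrt(n) value is exact for the standard deviation by symmetry, since the coordinate variances must sum to 1):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 400                                          # ambient dimension
pts = rng.standard_normal((20_000, n))
pts /= np.linalg.norm(pts, axis=1, keepdims=True)  # uniform on the (n-1)-sphere

x = pts[:, 0]                                    # one coordinate
empirical_sd = x.std()                           # should be close to 1/sqrt(400) = 0.05
```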
Interesting statement about geometrical objects. Note that the unit hypercube (surface) aligned with the axis will not converge to a normal (regardless of orientation?), it stays uniform instead.
In fact the simplex seems to be the worst case among convex objects, in terms of concentration of the distribution near the center, and the best case should be the sphere. Which plays out nicely, since the simplex seems to be the most "concavey" convex shape of a given 'diameter' while the sphere is the most "convexey" convex shape of a given 'diameter', no?
Ah, of course. To visualize it, the distance between the center and the surface of the shape would be proportional to the probability of that vector being selected after normalization. Thanks!
Since the box is anisotropic, this would not give the uniform distribution (it's more likely to draw a "diagonal" direction than a coordinate axis direction).
If someone is familiar with this kind of thing, I'd like to know a little more about the Jaynes solution described. It seems to me that if the circle were translated off to some distant portion of the plane, it would be very unlikely that any method of choosing chords would have the same distributional properties on the original circle and on the translated circle, because the chords are all originally restricted to intersect the first circle and, given two circles in a plane, there are many lines that intersect one but not both circles. Will Jaynes' method really shade the translated circle the same way it will shade the original? My intuition says the translated one should be shaded more heavily in a sort of "band" parallel to the line connecting the origins of the two circles, or possibly in a crescent pattern with more shading closer to the original.
It's easy to see that the second method can be applied so it works independently of the position and size of the circle. Let's start by reformulating it slightly: instead of a "random radius" we will use a "random diameter".
> Choose a diameter of the circle, choose a point on the diameter and construct the chord through this point and perpendicular to the diameter.
In the general case, choose a line going through the origin.
=> In the circle, there is a diameter parallel to this line.
Then choose a point on the line, and construct the perpendicular line at that point. (To be able to choose a point uniformly, you have to consider a finite segment but this is not a problem as long as the location/size of the circle is bounded).
=> This perpendicular line might not intersect the circle, but when it does it determines a chord that goes through a point chosen uniformly on the diameter.
This generalized method can be used to cover uniformly multiple circles at the same time, but obviously not every line will intersect every circle.
If you have a fixed number N of points and you want to distribute them evenly on a sphere, this method won't work. I needed this in two different side projects and used this [1], which worked well.
I stumbled upon an attractive and simple-to-implement non-random method that uses the golden angle (an angle commonly observed in plant phyllotaxy): drawing a Fibonacci spiral on the sphere while keeping the surface area per point constant. It is demonstrated here, on the OpenProcessing website.
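A minimal sketch of the Fibonacci-spiral construction (the function name `fibonacci_sphere` is my own; the idea is to slice z evenly so each point gets equal area, then advance the longitude by the golden angle):

```python
import numpy as np

def fibonacci_sphere(n):
    """n near-evenly spaced points on the unit sphere via the golden angle."""
    golden_angle = np.pi * (3.0 - np.sqrt(5.0))   # ~2.39996 radians
    i = np.arange(n)
    z = 1.0 - (2.0 * i + 1.0) / n                 # equal-area slices in z
    theta = golden_angle * i                      # spiral in longitude
    r = np.sqrt(1.0 - z * z)
    return np.column_stack([r * np.cos(theta), r * np.sin(theta), z])

pts = fibonacci_sphere(1000)
```

Unlike the random methods, this gives a deterministic, visually even layout, which is exactly why it looks "too regular" to be a uniform random sample.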
The third sphere is the "more æsthetically pleasing pattern" example, so you're right that the second one is more randomly distributed (in the sense that each point on the second sphere is equally likely to appear anywhere on the sphere, while points on the third sphere are positioned to try to keep them apart).
This post seems to barely scratch the surface of the problem. First you really need to specify by what metric you want the distribution to be uniform. Euclidean 3d distance? Arcs? The way I would do it would be to generate a lot of random points by some easy scheme, like long/lat, then feed them into a kdtree (k=3), find all neighbors for every point with an overestimating metric, then check them against the real metric and remove the too close ones.
It should be clear that the obvious notion of uniformity is the intended one. I.e., it should be uniform with respect to the usual Lebesgue measure on the sphere.
And your method is extremely complicated. You can solve this problem with 5 lines of code and no data structures.