The reason this works is often skipped in computationally oriented writeups:
Rotations of 3-dimensional real space form the topological group SO(3). Naive parameterizations of that group do not form a cover [1], but the group of norm-1 quaternions, Spin(3), does.
The failure of naive parameterizations, like the Euler angles, to be a cover manifests itself as gimbal lock.
Here's a similar argument from my naive, intuition-based perspective:
It's important to remember that the group of rotations in 3d space is represented by _unit_ quaternions. This is a subset of the full 4d quaternion space: a spherical shell around the origin, like the skin of a 3d ball but in 4d.
A 3d ball has a 2d, "flat" skin that is "spherically" symmetric.
A 4d ball has a 3d, "volumetric" skin that is, similarly, "spherically" symmetric.
If we are stuck to that 3d skin, it makes perfect sense that it manages to represent rotations in 3d, which have 3 degrees of freedom and spherical symmetry.
The space of rotations that is formed by this "skin" has two special points: the point where all the imaginary coordinates are zero and the real coordinate is one, and its dual, the negative one, precisely because we are talking about _unit_ quaternions. (Similarly, there are only two "purely x" points on a unit circle: (1, 0) and (-1, 0).)
These points represent the identity rotation, i.e. "don't rotate at all". This makes sense if you think about how complex numbers work: the effect of multiplying by "one" is that it keeps everything as-is, whereas every other "unit" complex number has a rotating effect.
Then, if you think of the three imaginary dimensions as degrees of freedom you can start traveling along from this neutral point, you get all kinds of rotations. The geometry of the "round", "skin-shaped" space ensures that the rotations wrap around the correct way, "spherically". In particular, after you have travelled "tau" (2 times pi) units, you are again in a purely "real", "not rotated" state. And the spherical symmetry means that this works the same way in _any_ direction you can travel in.
The only gotcha is that the unit quaternion space covers the space of 3d rotations twice, because of the negative number symmetry. There are some good arguments why this is especially beautiful and true, but they are a bit beyond me.
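To make the "doubly" part concrete, here is a minimal Python sketch (not from the thread) showing that a unit quaternion and its negative rotate a vector identically. The helper names (`quat_mul`, `rotate`, `axis_angle_quat`) and the (w, x, y, z) component order are assumptions made for illustration.

```python
import math

def quat_mul(a, b):
    """Hamilton product of quaternions given as (w, x, y, z) tuples."""
    aw, ax, ay, az = a
    bw, bx, by, bz = b
    return (
        aw * bw - ax * bx - ay * by - az * bz,
        aw * bx + ax * bw + ay * bz - az * by,
        aw * by - ax * bz + ay * bw + az * bx,
        aw * bz + ax * by - ay * bx + az * bw,
    )

def quat_conj(q):
    """Conjugate, which is also the inverse for a unit quaternion."""
    w, x, y, z = q
    return (w, -x, -y, -z)

def rotate(q, v):
    """Rotate the 3-vector v by the unit quaternion q via q * v * conj(q)."""
    _, x, y, z = quat_mul(quat_mul(q, (0.0, *v)), quat_conj(q))
    return (x, y, z)

def axis_angle_quat(axis, angle):
    """Unit quaternion for a rotation of `angle` radians about the unit vector `axis`."""
    s = math.sin(angle / 2)
    return (math.cos(angle / 2), axis[0] * s, axis[1] * s, axis[2] * s)

q = axis_angle_quat((0.0, 0.0, 1.0), math.pi / 2)  # quarter turn about z
neg_q = tuple(-c for c in q)                        # the antipodal point on the 3-sphere
v = (1.0, 0.0, 0.0)

print(rotate(q, v))      # image of v under q
print(rotate(neg_q, v))  # image of v under -q
```

The two signs cancel in the sandwich product, which is why both printed vectors agree up to rounding. A full turn of 2*pi about any axis lands on the quaternion -1 rather than +1, so it takes 4*pi of rotation angle to return to the starting quaternion.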
There is a nice experiment you can do, illustrating this: Hold a glass of water in the palm of your hand. Not too full, especially not until you get the hang of the following move: Assuming you use your right hand, rotate your hand counterclockwise. At first, your hand goes under your arm, until it has rotated about 270 degrees or so. You can now continue that rotation, but you have to lift your hand up, so it is now above the arm. Keep going, and you end up where you started, but the glass has done two full revolutions. Hopefully without spilling any water (takes practice). The fun thing is, after just one revolution, your arm is twisted into a really uncomfortable configuration.
The mathematical explanation is that SO(3) is doubly connected (its fundamental group has two elements), whereas the unit quaternions, like any sphere of dimension two or more, are simply connected. At any time, each part of your arm has undergone some rotation from the orientation at rest. Halfway through the move, as you travel from the shoulder down to the hand, this rotation starts at the identity, and changes continuously through one rotation, back to the identity once more. But in the unit quaternions, that path takes you from one pole to the opposite pole. This explains why you can’t untwist your arm.
Sorry if that made no sense.
Edit: Have a look at “Your palm is a spinor” [1], and then at this Philippine traditional dance [2] about 40 seconds in. I knew I had written about this before [3].
You're definitely right to bring up Gimbal lock [1]
One of the benefits of knowing about it is to know when it doesn't matter to you. In that case, you can just use Euler angles, as you say. If you just need to express a rotation in terms of three angles, or convert back from three angles (e.g. yaw/pitch/roll [2]) to a rotation matrix, then you don't need to know about quaternions at all.
Even in this case you need to take care of the action of rotations on points near the poles, don't you? (I don't remember, it's been a long time since I did such calculations).
If you are going to apply 3 angles directly, then you run into problems (specifically, if you pitch +/- 90 degrees, then roll and yaw become the same thing, aka gimbal lock). If you take those 3 angles and convert them to a matrix, and use that matrix to apply the angles, you're all good. You can then even take the new matrix and pull 3 angles out of that.
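A small Python sketch of what "roll and yaw become the same thing" looks like once the angles are turned into a matrix. The Z-Y-X (yaw, pitch, roll) composition order used here is an assumption; other conventions lock up in the same way, just with different signs.

```python
import math

def rot_x(t):
    c, s = math.cos(t), math.sin(t)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]

def rot_y(t):
    c, s = math.cos(t), math.sin(t)
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]

def rot_z(t):
    c, s = math.cos(t), math.sin(t)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def euler_to_matrix(yaw, pitch, roll):
    """Assumed convention: R = Rz(yaw) * Ry(pitch) * Rx(roll)."""
    return matmul(rot_z(yaw), matmul(rot_y(pitch), rot_x(roll)))

def max_abs_diff(a, b):
    return max(abs(a[i][j] - b[i][j]) for i in range(3) for j in range(3))

# At pitch = 90 degrees, only the difference (roll - yaw) matters in this convention.
locked_a = euler_to_matrix(0.3, math.pi / 2, 0.5)
locked_b = euler_to_matrix(0.1, math.pi / 2, 0.3)
print(max_abs_diff(locked_a, locked_b))  # tiny: two different angle triples, one rotation

# Away from the pole, the same two triples give clearly different rotations.
free_a = euler_to_matrix(0.3, 0.4, 0.5)
free_b = euler_to_matrix(0.1, 0.4, 0.3)
print(max_abs_diff(free_a, free_b))
```

The matrix itself is always a perfectly good rotation. What degenerates is the map from angle triples to rotations, which is why pulling three angles back out of a matrix near pitch = +/-90 degrees needs a special case.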
How far into algebra do you need to get to understand "Rotations of 3-dimensional real space form the topological group SO(3)"? I kinda understand that norm-1 quaternions map to rotations in 3D space somehow but I can't prove it myself. What kind of curriculum do I need to follow to really grasp this?
To understand the definitions and apply them in practice, a first course in group theory + a basic understanding of vector calculus suffices. To add the adjective "topological", the first parts of a general topology course are enough.
To truly appreciate groups like SO(3), a course in differential geometry and differential topology is useful.
Edit: This is all assuming you have no background in mathematics (or, alternatively, physics) at all. If you do, a targeted text can teach you these concepts in a few pages.
The thing is group theory is taught in an incredibly abstract manner, it's hard to find any motivating application for it, or any problems it helps us solve.
Also terminology/definitions are vague too, whether a vector has an endpoint or is something unanchored in space is itself not clear from many treatments.
> The thing is group theory is taught in an incredibly abstract manner, it's hard to find any motivating application for it, or any problems it helps us solve.
Mathematics is abstractly defined. But for basic group theory there's a plentitude of very concrete examples to rely on.
> Also terminology/definitions are vague too,
Absolutely not. There is no vagueness at all! Everything is completely well-defined in most introductory textbooks/courses (or you can even read the precise definitions on Wikipedia, which is often not the case).
> whether a vector has an endpoint or is something unanchored in space is itself not clear from many treatments.
Vectors do not have endpoints. Vectors are not anchored. Vectors are elements of vector spaces. Vector spaces are completely clearly defined.
> Vectors do not have endpoints. Vectors are not anchored. Vectors are elements of vector spaces. Vector spaces are completely clearly defined.
Ehh.... compare this text from the wikipedia article on affine spaces:
> In an affine space, there is no distinguished point that serves as an origin. Hence, no vector has a fixed origin and no vector can be uniquely associated to a point. In an affine space, there are instead displacement vectors, also called translation vectors or simply translations, between two points of the space.
If you're working with vectors, you're generally working with them as points, not as values that happen to obey the axioms defining a vector space. Those vectors are anchored, and the anchoring is so deeply embedded in the concept that there's a separate concept, affine spaces, specifically devoted to the question of "what if we had things that were like vectors, except without being anchored to a particular point in space?"
Generalizing this, in the theory of differentiable manifolds, a vector (well, a "tangent vector") can be thought of as being an arrow rooted at a specific point. In Euclidean space (affine spaces), you can always translate all vectors to an origin, so it's safe to ignore a vector's root, but in general there's not a canonical way to translate vectors. The issue is essentially curvature -- for example, if you took a tangent vector at the north pole of a sphere, dragged it down to the equator, dragged it along the equator for a bit, then dragged it back up to the north pole, the vector would have rotated. What this shows is that vectors at different points are not mutually comparable without more structure.
I swept a detail under the rug, which is that the vectors are infinitesimal arrows, so their tip is not actually another point of the manifold. In an affine space, vectors can be regarded as non-infinitesimal arrows -- the arrows that you always find textbook illustrations on vector arithmetic. The arrows with a common root form a vector space.
Right, but the individual vector spaces you speak of there are the tangent spaces of a manifold M. There's a vector space TpM attached to M at a point p. And the point you make is that there isn't necessarily a canonical way to relate vectors in TpM to vectors in TqM for p≠q. In this sense, we may well say that "a vector in TpM is anchored at p, and a vector in TqM is anchored at q". I agree with that, obviously. However, within a fixed vector space, of which such a TpM is merely one example, there is no notion of vectors being "anchored".
Given the definitions in a textbook on smooth manifolds, I agree with you -- the modern notion of a vector space has no notion of root or anchor since these words have no associated definitions.
It's interesting to think about how we came to the modern viewpoint. Some of the underlying context behind my comment is that long ago manifolds were concretely subsets of R^n that could be locally parameterized. The tangent space at a point was the affine subspace of R^n that was tangent to the manifold there, and you could regard a tangent vector as an arrow rooted at that point inside that subspace. Even though we have our clean modern notion of an abstract vector space, somehow these old intuitions linger on in other disciplines.
In math we have a precise definition for a vector (an element of a vector space, full stop), but I don't think experts have the authority to be prescriptive about jargon outside their discipline. Of course, a student should be willing to drop preconceptions and to absorb correct usage.
It's true that a vector space has a designated origin. Whether you choose to interpret that as "vectors being anchored" is, I guess, up to you - that's just an issue of language usage, not of mathematics.
I personally don't consider functions or polynomials to be "anchored", but yes, of course there is the zero polynomial etc.
Also, keep in mind that physicists, for example, often use a more restricted definition of "vector" than mathematicians. That Wikipedia definition you quoted doesn't strike me as very mathematical. A mathematician's definition of an affine space is much more abstract.
What I was trying to say was that even a fundamental concept such as a vector isn't clearly defined in undergraduate-level treatments. And consider that fields like Physics, EE, Aeronautics and Computer Graphics all use such concepts, often with different definitions of the same thing.
Sounds snarky but completely correct. Parent should look for a more diverse set of examples for vector spaces. In fact sounds like a good linear algebra course would be a priority over group theory
And that’s a shame, because a lot of group theory originated in concrete, tangible problems. Check out the book Visual Group Theory by Nathan Carter, discussed (a bit) here:
That's an odd complaint to hear on a technology centered forum.
The most obvious application for algebra should be thinking about data structures with their operations. If you can't be sure about a method on a class being valid in terms of the contracts for that class (IE being a closed operation) then the method can't be public (usually an idea taught in "intro to OOP" style classes although with other words unless you're reading something like SICP.)
Some of the motivating examples are hard to understand, unfortunately. You can't do quantum mechanics without group theory, and if you can't do quantum mechanics, it will be much harder to understand how half the instrumentation in your chemistry lab works.
It's actually more group representation theory you'll need: the rotation group has an infinite number of representations acting on different vector spaces. Rotation of an electron in 3d is SU(2), which maps to SO(3), the rotation of vectors.
I failed advanced math (UK A level) and dumped advanced physics before reaching the point of taking the exams. I've managed to implement a quaternion system in my canvas library[1] - with nagging doubts that it's not entirely right - mainly by staring at lots of (poorly explained) examples online and hoping that things would click 'by osmosis'. So, I reckon you can go a long way without understanding the concepts behind quaternions, but you'll need to do a lot of geometry and physics study if you ever want to feel comfortable with any quaternion code you write.
The only reason I went through all that pain was because I kept on coming across articles saying that "quaternions fix the gimbal lock issue you encounter with Euler angles" - now I see people saying in this thread that the assertion is false. I no longer know what to believe, but I do know I never want to go crawling down the Euler/quaternion rabbit hole again!
It's a good text, but the parent poster may wish to be made aware that it's very physics-centric. If they do not come at this from an interest in physics or a physics mindset, it may be counterproductive.
This area of math -- covering spaces, Lie groups, representations, etc, is often presented abstractly because there are some very powerful and beautiful theorems that, to a mathematician, really clarify what is happening. But to an engineer, it is a hard slog unless you have some firm examples in mind, and you don't really need the powerful results to work everything out concretely. It just saves you a lot of time to do that.
Nevertheless, I think it's still a good and important idea to work things out concretely a few times and for that all you really need is linear algebra.
That said, the concrete version of your statement is as follows:
SO(3) is best defined as the group of all rotations in 3 space. You then show that this is just all 3x3 matrices that are orthogonal (their transpose is the inverse) and have determinant 1.
You can do this by abstract linearity arguments (e.g. the rotation of a vector times a scalar is the scalar times the rotation of the vector) or by directly writing things out with linear algebra.
The first ingredient is to realize that the rotation in the plane by angle t is a linear map of the plane to itself, and can be represented by matrix multiplication and thus a square 2x2 matrix which sends
(1, 0) to (cos(t), sin(t))
and
(0, 1) to (-sin(t), cos(t)).
Thus the matrix is
[cos(t), -sin(t)]
[sin(t), cos(t)]
this matrix clearly has determinant = 1 and you can verify that the transpose is the inverse. But you could have derived this from general principles that rotations are volume and orientation preserving.
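If you'd rather check this numerically than by hand, a few lines of Python will do it. This is only a sanity-check sketch; the function names are made up for illustration.

```python
import math

def rot2(t):
    """2x2 matrix for a counterclockwise rotation of the plane by angle t."""
    c, s = math.cos(t), math.sin(t)
    return [[c, -s], [s, c]]

def det2(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def transpose2(m):
    return [[m[0][0], m[1][0]], [m[0][1], m[1][1]]]

def matmul2(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

t = 0.9
r = rot2(t)
print(det2(r))                    # cos^2 + sin^2, so 1 up to rounding
print(matmul2(transpose2(r), r))  # identity up to rounding, so transpose = inverse
```

The columns of `rot2(t)` are exactly the images of (1, 0) and (0, 1) listed above.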
Now a rotation in 3 space must fix some line and then is just a planar rotation for the plane perpendicular to the line. So you can pick a new basis in 3 space corresponding to the line, v, and then two orthonormal unit vectors so that the rotation is just the matrix
[1, 0, 0]
[0, cos(t), -sin(t)]
[0, sin(t), cos(t)]
for some choice of unit vector v and some angle t. Here you should realize that you need an orientation. E.g. the plane perpendicular to v is the same plane as is perpendicular to -v, but you need an orientation on the plane to figure out the direction of rotation.
Already this should tell you that SO(3) is three dimensional and you have a parametrization of (most of) SO(3) as a point on a sphere together with an angle, so it's kinda like S^2xS^1, except the parametrization breaks down when the angle is pi as you get the same rotation if you pick anti-podal directions and when the angle is zero all the points on the sphere map to the same (identity) rotation. So this parametrization is not a diffeomorphism, it's not even 1 to 1, but it is surjective, and knowing exactly how it fails to be 1 to 1 allows you to understand SO(3) completely because you can think of SO(3) as S^2xS^1 with some points identified.
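Here is a hedged Python sketch of that (axis, angle) parametrization and the two places where it fails to be 1 to 1. It uses Rodrigues' rotation formula, which is a standard way to write the matrix directly in the original basis rather than the change-of-basis argument above; `axis_angle_matrix` is a name chosen for illustration.

```python
import math

def axis_angle_matrix(v, t):
    """Rodrigues' formula: rotation by angle t about the unit vector v."""
    x, y, z = v
    c, s = math.cos(t), math.sin(t)
    k = 1.0 - c
    return [
        [c + k * x * x,     k * x * y - s * z, k * x * z + s * y],
        [k * y * x + s * z, c + k * y * y,     k * y * z - s * x],
        [k * z * x - s * y, k * z * y + s * x, c + k * z * z],
    ]

def max_abs_diff(a, b):
    return max(abs(a[i][j] - b[i][j]) for i in range(3) for j in range(3))

v = (1.0 / 3.0, 2.0 / 3.0, 2.0 / 3.0)   # a unit vector
neg_v = tuple(-c for c in v)
w = (0.0, 0.0, 1.0)                      # some other unit vector

# Angle pi: antipodal axes give the same rotation.
print(max_abs_diff(axis_angle_matrix(v, math.pi), axis_angle_matrix(neg_v, math.pi)))
# Angle 0: every axis gives the identity.
print(max_abs_diff(axis_angle_matrix(v, 0.0), axis_angle_matrix(w, 0.0)))
# A generic angle: antipodal axes give different rotations...
print(max_abs_diff(axis_angle_matrix(v, 1.0), axis_angle_matrix(neg_v, 1.0)))
# ...unless the angle is negated as well.
print(max_abs_diff(axis_angle_matrix(v, 1.0), axis_angle_matrix(neg_v, -1.0)))
```

With v = (1, 0, 0) the formula reduces to the block matrix written out above.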
All of the above relies solely on the basics of linear algebra such as what you usually get in a multi-variable calculus course. You don't even need stuff like Jordan decomposition or other more advanced linear algebra topics, just the definition of linear maps, the definition of a "rotation" in 3 space, ideas of orthogonality and the determinant being an oriented volume of a linear map. Most of these concepts are taught in multi-variable calculus as you need them to get volume forms as the result of a change of basis when you are doing integrals over surfaces and volumes.
In terms of "topological group", the set of matrices with determinant 1 that are orthogonal form a group, as is easily verified via the facts that det(A*B) = det(A)det(B), det(A^t) = det(A), and (A*B)^t = B^t*A^t. That is all you need to show that this is a group. It is a topological group in the sense that the multiplication operation is continuous in the inherited norm you expect to get on matrices. E.g. if you write out the multiplication of matrices with the entries being variables, the entries of the product are just polynomials in the entries of the two matrices, so multiplication is a continuous operation.
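As a concrete spot check of the closure property (a sketch, not a proof), the following Python multiplies two rotation matrices and confirms the product is still orthogonal with determinant 1. The helper `is_rotation` and its tolerance are assumptions for illustration.

```python
import math

def rot_x(t):
    c, s = math.cos(t), math.sin(t)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]

def rot_z(t):
    c, s = math.cos(t), math.sin(t)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(m):
    return [[m[j][i] for j in range(3)] for i in range(3)]

def det3(m):
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def is_rotation(m, tol=1e-12):
    """True if m is orthogonal (m^T m = I) with determinant +1, up to tol."""
    mtm = matmul(transpose(m), m)
    orthogonal = all(abs(mtm[i][j] - (1.0 if i == j else 0.0)) < tol
                     for i in range(3) for j in range(3))
    return orthogonal and abs(det3(m) - 1.0) < tol

a, b = rot_z(0.7), rot_x(1.1)
product = matmul(a, b)
print(is_rotation(a), is_rotation(b), is_rotation(product))
print(is_rotation(transpose(a)))  # the inverse of a rotation is also a rotation
```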
When you are working at the elementary level, you don't care too much about whether the matrices are topological groups because you are not going to be using the heavy duty Lie theory machinery, you can write everything out in terms of matrices and maps between them explicitly. It's really good to write things out explicitly a few times and then learn all the abstract stuff because it helps you understand what the general results are really saying. Do not be intimidated by people using terms like "universal cover", homotopy, classifying spaces, etc, as you don't need any of that to understand the basic properties of quaternions and the orthogonal groups, but these abstractions have shown to be a very useful way of looking at these spaces so they can help explain what is happening in a deeper way than relying on matrix algebra once you get to the point where you are searching for unifying ideas behind these results. The results themselves can always be proved with elementary techniques.
Technically the unit quaternions are not Spin(3), but only isomorphic to it; they are properly Sp(1), the norm-1 subgroup of GL(1, H). It's all fuzzy because these low-dimensional classical groups are all isomorphic to each other: SU(2) ~ Sp(1) ~ Spin(3).
This seems extraordinarily nitpicky. Like saying that unit complex numbers are technically not the group of plane rotations about a fixed point, but only isomorphic to it. Or for that matter like saying that the "real number line" is technically not a line, but only isomorphic to one.
Well yeah, it is. The point I wanted to make is that these isomorphisms are "exceptional"[1] and only hold for the lower dimensional groups. The general Spin group and quaternions are very different objects.
> Technically X is not Y, but only isomorphic to it
Not sure this is a useful argument. If two structures are isomorphic, there is no way to tell them apart. If you can't tell them apart - maybe they are the same thing.
When it comes to group theory, they are the same object. When we talk about things like rotation groups, we're not usually concerned with the way they are represented (much as we don't usually care how the real or the complex numbers are constructed, for both of which there exist multiple different constructions).
That really depends on how you define Spin(3), for example some would define it as the compact simply-connected Lie group of a certain type, at which point the unit quaternions model of Spin(3) is as good as any other.
This is really interesting. Why does the failure of naive parameterizations to form a cover imply a group with gimbal lock? I'm unclear on how a cover is linked to gimbal lock.
I've taken undergrad topology and algebra if you could explain in those terms (I understand what a covering is).
Do correct me if I'm wrong: I thought Spin(3) was a double cover, with the quaternions being one sheet of the covering, that of the connected component of the identity?!
This article incorrectly states that gimbal lock is a property of Euler angles, and that using quaternions prevents it.
This is a common misconception.
Euler angles can be used to rotate an object exactly the same way as quaternions do with no gimbal lock. Similarly, you can apply quaternions in such a way that gimbal lock will happen (if you wanted to represent a physical system of gimbals with quaternions, where that is a physical property).
Here's an abstract view regarding the inevitability of gimbal lock: The state of a single gimbal is described by an angle. Since angles are mod 360°, that is topologically a circle. The state of three gimbals is then given by three angles. One point on each of three circles is a point on a three-dimensional torus T^3. With no gimbal lock, you get a map from T^3 to SO(3), which is locally a diffeomorphism at every point. For topological reasons, no such map exists: Due to compactness, it would be a cover, but the only covers of SO(3) are SO(3) itself (a single cover) and the three-sphere (unit quaternions, a double cover). And T^3 is distinct from either of these two. Hence gimbal lock is unavoidable with three gimbals. (Four gimbals is a different story, but then you have a redundant dimension to play with.)
lol i'm a dummy for never realizing that SO(3) isn't homeomorphic (the diffeo part isn't necessary to prove this...) to T^3 (which i naively thought because s in SO(3) seemingly has 3 free parameters). surprise surprise SO(3) is actually homeomorphic to P^3 lol.
You can drop “seemingly”: SO(3) is indeed three-dimensional. And SO(3) a.k.a. P^3 has fundamental group Z_2 a.k.a. GF(2), whereas T^3 has fundamental group Z^3, so they are quite different beasts indeed.
Similarly, slerp is also not a property of quaternions [1], contrary to the claim in the article, and is usually not implemented inside quaternion libraries using quaternion exponentiation like the article does, but by computing angles explicitly [2], for robustness I believe, and also because slerp is designed for normalized quats, not for general quats with non-unit magnitude.
With these two things combined - the two most commonly cited reasons about why to use quaternions (using slerp and avoiding gimbal lock) - what are other reasons to use quaternions? I’m aware that there are moderate compute savings in some cases (a matrix obviously has more degrees of freedom than a rigid orientation). Are there other good reasons to deal in quaternions? There are some reasonable ideas about why not to use them. [3]
One reason to accumulate into quaternions vs a matrix is that compounded errors can add scaling and shear to a rotation matrix, whereas unit quaternions remain pure rotation (and non-unit quaternions can be trivially normalized).
That's reasonable, but compound matrix ops can also be re-normalized and re-orthogonalized as they go, right? It's easier with a quat, for sure, but does it make up for the general complications with using quats?
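For comparison, here is roughly what the two clean-up steps look like in Python. This is a sketch under assumptions: quaternions are (w, x, y, z) tuples, and the matrix fix-up shown is plain Gram-Schmidt on the rows, which is only one of several ways to re-orthogonalize (it trusts the first row most and throws the drifted third row away).

```python
import math

def normalize_quat(q):
    """Project a drifted quaternion back onto the unit 3-sphere."""
    n = math.sqrt(sum(c * c for c in q))
    return tuple(c / n for c in q)

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def unit(a):
    n = math.sqrt(dot(a, a))
    return tuple(x / n for x in a)

def orthonormalize(m):
    """Gram-Schmidt on the rows of a 3x3 matrix; the third row is rebuilt as a cross product."""
    r0 = unit(m[0])
    r1 = unit(tuple(x - dot(m[1], r0) * y for x, y in zip(m[1], r0)))
    r2 = cross(r0, r1)  # guarantees a right-handed frame, i.e. determinant +1
    return [r0, r1, r2]

drifted_q = (0.7072, 0.0, 0.7071, 0.001)   # made-up example of accumulated error
drifted_m = [(1.01, 0.02, 0.0),            # made-up matrix with slight scale and shear
             (-0.01, 0.99, 0.03),
             (0.0, -0.02, 1.0)]

print(normalize_quat(drifted_q))
print(orthonormalize(drifted_m))
```

The quaternion fix is a single division by the norm, while the matrix fix has to make a choice about which rows to trust, which is the "a bit harder" part.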
That's why I posted an article that describes what the complications are, reference number [3] above. It's mainly a game-centric and people-centric view of problems with quaternions, not a list of technical problems with the representation. In short, many programmers don't understand quaternions, so using them puts an education burden on the team, a burden that is most frequently un-met. Quats can be less efficient, if you're not paying attention to what you're doing.
The part that is less efficient is applying a rotation to a vector (you need to "sandwich multiply" by your quaternion which involves 3×4 + 4×4 = 28 multiplications, whereas with a matrix you only need 3x3 = 9 multiplications).
But composing, interpolating, exponentiating, etc. is a lot nicer with quaternions. (Easier to reason about, numerically better behaved, computationally cheaper.)
If you need to apply the same rotation to a large number of separate vectors, keep your rotation representation as a quaternion internally and convert to a matrix just before vector rotation.
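A minimal Python sketch of that pattern: convert once, then reuse the matrix for every vector. The (w, x, y, z) order and the function names are assumptions, and the conversion formula shown is the standard one for a unit quaternion.

```python
import math

def quat_to_matrix(q):
    """3x3 rotation matrix for a unit quaternion q = (w, x, y, z)."""
    w, x, y, z = q
    return [
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z),     2 * (x * z + w * y)],
        [2 * (x * y + w * z),     1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
        [2 * (x * z - w * y),     2 * (y * z + w * x),     1 - 2 * (x * x + y * y)],
    ]

def apply(m, v):
    """Matrix-vector product: 9 multiplications per vector."""
    return tuple(sum(m[i][k] * v[k] for k in range(3)) for i in range(3))

# Keep the orientation as a quaternion (here: a quarter turn about z)...
q = (math.cos(math.pi / 4), 0.0, 0.0, math.sin(math.pi / 4))

# ...and convert to a matrix once, just before rotating a batch of points.
m = quat_to_matrix(q)
points = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
rotated = [apply(m, p) for p in points]
print(rotated)
```

The conversion cost is paid once per rotation, so it is amortized over however many vectors share that rotation.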
Many programmers don't understand matrices either. If you're going to be working with 3D engines, some of this stuff you'll just have to sit down and learn.
It depends a bit on the engine too. A long time ago I designed & built a multi-platform 3D game engine that used quaternions internally for many things, but exposed Euler PYR angles for most "game-level" object controls. We (the engine team) asked around and ended up deciding that the minuscule performance gain and better mathematical behavior from using quats everywhere would be more than offset by some of the devs being unable to reason about how the enemy ended up pointing the wrong direction, when looking at values in the debugger.
That is true, but it's safe to say that the number of people who do understand matrices is far greater than the number of people who understand quaternions. It's also worth pointing out that game engines and graphics APIs all deal in matrices, but don't all deal in quaternions. Mats can do what quats can do, but it's not true the other way around, quats cannot do everything matrices can do. Quats don't help with perspective, non-uniform scaling, or shear, just to name a few.
I don't think the "far" part is safe to say, not in this day and age.
Also, it is not really interesting what matrices or quaternions "can do". What is interesting is what we can do with them. And quaternions make a lot of things possible that are intractable or at least a huge pain with matrices.
> compound matrix ops can also re-normalized and re-orthogonalized as they go
Certainly, but as you say it's a bit harder.
> does it make up for the general complications with using quats?
I'm sure that's context dependent, and don't have much of an opinion which way it's likely to break in general. I was just throwing something in the "pro" column that hadn't yet been discussed.
> ...not implemented inside quaternion libraries using quaternion exponentiation like the article does, but by computing angles explicitly...
How else would you compute quaternion exponentiation? I don't think there's really a dichotomy here. When you compute quaternion exponentiation, one natural way to do it (as with complex numbers) is to think of the quaternions as a real magnitude multiplied by a phase. For complex numbers, the magnitude grows according to exp(x) and the phase evolves according to cos(x)+isin(x). This just falls out of Euler's formula. If you know that the magnitude is 1, you take the exp(x) term out, and you end up with a point that moves in a circle.
The same thing applies to quaternions.
I'm aware that there are other ways to compute quaternion exponentiation, but this is just a natural way to do it, especially for people who aren't experts in numeric programming.
The article's quat_pow is implemented using 'quat_exp(quat_scale(quat_log(q), n));' where the code example I posted computes the half-angle of the rotation. If you stand back, I'd agree with you that there's an equivalence here, and it could be argued that Euclideanspace's code example is a kind of simplification and flattening of using an exponentiation function. Still, there are real differences between these two implementations, and the article's here looks conceptually simple, while the one people use in practice looks more complicated, and requires knowing how quaternions work. It's natural if you're fluent in quats, but not necessarily intuitive otherwise.
> If you stand back, I'd agree with you that there's an equivalence here, and it could be argued that Euclideanspace's code example is a kind of simplification and flattening of using an exponentiation function.
You don't have to stand back that far! They're really quite similar pieces of code.
quat_log() is basically a conversion to axis-angle. quat_exp() is basically a conversion from axis-angle back to quaternions. So, the quat_exp(t quat_log(x)) formula, with different names for the functions, is described as:
1. Figure out the angle between the starting and ending position, and the axis of rotation.
2. Vary the angle of rotation smoothly from t=0..1.
3. Convert back from axis-angle to an orientation.
The only funny thing here is that the axis-angle encoding of quaternions uses a magnitude which is a factor of two away from the angle in radians, so you'll see the sample code you linked to (with SLERP) use variables like "halfTheta" and "cosHalfTheta", where the quat_exp() and quat_log() formulas simply won't name them that way.
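To see the two formulations side by side, here is a Python sketch. The function names `quat_log`, `quat_exp` and `quat_scale` are borrowed from the thread, but the bodies below are my own guess at reasonable implementations for unit quaternions, not the article's code and not the linked sample. Neither version takes the shortest-path sign flip or handles near-degenerate inputs carefully.

```python
import math

def quat_mul(a, b):
    aw, ax, ay, az = a
    bw, bx, by, bz = b
    return (aw * bw - ax * bx - ay * by - az * bz,
            aw * bx + ax * bw + ay * bz - az * by,
            aw * by - ax * bz + ay * bw + az * bx,
            aw * bz + ax * by - ay * bx + az * bw)

def quat_conj(q):
    return (q[0], -q[1], -q[2], -q[3])

def quat_scale(q, k):
    return tuple(k * c for c in q)

def quat_log(q):
    """Log of a unit quaternion: the pure quaternion (0, axis * half_angle)."""
    w, x, y, z = q
    n = math.sqrt(x * x + y * y + z * z)
    if n < 1e-12:
        return (0.0, 0.0, 0.0, 0.0)
    a = math.atan2(n, w)
    return (0.0, x / n * a, y / n * a, z / n * a)

def quat_exp(p):
    """Exp of a pure quaternion (0, axis * half_angle): back to a unit quaternion."""
    _, x, y, z = p
    a = math.sqrt(x * x + y * y + z * z)
    if a < 1e-12:
        return (1.0, 0.0, 0.0, 0.0)
    s = math.sin(a) / a
    return (math.cos(a), x * s, y * s, z * s)

def slerp_pow(q0, q1, t):
    """Slerp as q0 * (conj(q0) * q1)^t, with the power computed via log and exp."""
    rel = quat_mul(quat_conj(q0), q1)
    return quat_mul(q0, quat_exp(quat_scale(quat_log(rel), t)))

def slerp_angles(q0, q1, t):
    """Slerp with the half-angle computed explicitly from the 4d dot product."""
    cos_half_theta = sum(a * b for a, b in zip(q0, q1))
    half_theta = math.acos(max(-1.0, min(1.0, cos_half_theta)))
    sin_half_theta = math.sin(half_theta)
    if sin_half_theta < 1e-9:
        return q0  # degenerate case, not handled carefully in this sketch
    ra = math.sin((1 - t) * half_theta) / sin_half_theta
    rb = math.sin(t * half_theta) / sin_half_theta
    return tuple(ra * a + rb * b for a, b in zip(q0, q1))

q0 = (1.0, 0.0, 0.0, 0.0)
q1 = (math.cos(math.pi / 4), 0.0, 0.0, math.sin(math.pi / 4))  # 90 degrees about z
print(slerp_pow(q0, q1, 0.5))
print(slerp_angles(q0, q1, 0.5))
```

Both calls should print the same quaternion up to rounding: the 45 degree rotation about z. The "angle" that `quat_log` extracts is the same quantity that the explicit version calls the half-angle.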
In the end I think the point of learning more math is so you can see past the differences in naming and recognize when two seemingly different approaches to the same problem are really just two different sets of terminology and names for the same approach.
I believe that even though slerp isn't a property of quaternions and can be done without them, having your orientation represented as a vector is a clean way for the developer to hold the state of an orientation and interpolate it. What's happening under the hood shouldn't matter much; the abstraction in the code using quaternions makes it easier for the developer to interpolate orientations without worrying about gimbal lock.
It is easier to deal with accumulating floating-point error when using quaternions than when using matrices. Various other operations are also quite easy to express in terms of quaternions, such as swing/twist decomposition.
Are you simply saying that quaternions can be used to perform the same rotation as the Euler method or are you saying that the rotation information along the Euler axes can also be lost even when using the quaternion method?
Said another way, are you conflating gimbal lock the physical property, with gimbal lock the common bug of creating irreversible rotations or is the bug still possible?
Thanks for the heads up, I'm going to rephrase the statement about gimbal lock and link your article.
And just to be 100% sure, is the approach I'm using the right one: storing a quaternion instead of 3 angles, multiplying, overwriting the rotation value?
And the idea is, _that's_ how you avoid gimbal lock. It's that implementation of multiplying, and overwriting, which can be done with any system that describes and applies rotation, including Euler angles, rotation matrices, etc.
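A hedged Python sketch of the "multiply and overwrite" approach, using quaternions as the stored representation. The helper `rotate_local` and the choice of right-multiplication for body-fixed axes are assumptions for illustration; the same accumulate-and-overwrite pattern works with a stored matrix.

```python
import math

def quat_mul(a, b):
    aw, ax, ay, az = a
    bw, bx, by, bz = b
    return (aw * bw - ax * bx - ay * by - az * bz,
            aw * bx + ax * bw + ay * bz - az * by,
            aw * by - ax * bz + ay * bw + az * bx,
            aw * bz + ax * by - ay * bx + az * bw)

def axis_angle_quat(axis, angle):
    s = math.sin(angle / 2)
    return (math.cos(angle / 2), axis[0] * s, axis[1] * s, axis[2] * s)

def normalize(q):
    n = math.sqrt(sum(c * c for c in q))
    return tuple(c / n for c in q)

def rotate_local(orientation, axis, angle):
    """Compose a small rotation about a body-fixed axis and return the new orientation."""
    return normalize(quat_mul(orientation, axis_angle_quat(axis, angle)))

# The stored state is one orientation, overwritten on every update.
orientation = (1.0, 0.0, 0.0, 0.0)
orientation = rotate_local(orientation, (0.0, 1.0, 0.0), math.pi / 2)  # pitch by 90 degrees

# From this pose, a turn about the local z axis and one about the local x axis
# are still two distinct motions; nothing has collapsed.
yawed = rotate_local(orientation, (0.0, 0.0, 1.0), 0.2)
rolled = rotate_local(orientation, (1.0, 0.0, 0.0), 0.2)
print(yawed)
print(rolled)
```

Contrast this with storing three angles and rebuilding the rotation from them each frame, where at 90 degrees of pitch two of the three stored numbers end up driving the same axis.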
Great article. Presumably the reason this confusion exists is that people don't want to store full rotation matrices, so their choices are Euler angles or quaternions, and Euler angles guarantee gimbal lock but quaternions allow you to avoid it. Is that right?
Basically, you store 3 quaternions, representing 3 angles, and combine them to get the final rotation.
You might say "This is just quaternions emulating Euler angles!" and my answer is, sure. You can say the same about rotation matrices. There's nothing inherent about rotation matrices that makes them susceptible to gimbal lock. You can implement them as representing 3 fixed angles, thus gimbal lock, or you can implement them as accumulating rotations, thus no gimbal lock. Same is true of quaternions.
The fact that you can get gimbal lock with quaternions is a feature, not a bug. Quaternions are just one way to describe rotations. Gimbal lock is a natural phenomenon of certain physical rotation systems, and can be described whether you use quaternions, or matrices etc.
If you want to represent different rotations as differential Euler angles, then I think you can say it's a property of Euler angles, or at least linked to them.
I don’t entirely agree with the article’s viewpoint that people do not perfectly understand quaternions and therefore they should not be used, as I get the feeling there are many parts of 3D graphics that are not perfectly understood by developers, and that’s okay.
Be aware, quaternions are not always the right solution. There's a reason Unity, Unreal, 3DSMax, Maya, Blender, etc. all support Euler interpolation in animation. A simple example: an artist might want to show a clock hand spinning fast to show the passage of time. To do that they set a start angle of 0 and an end angle of, say, 20000. Sure, there may be ways to represent that with specialized quaternions, but in general the 3D tools all seem to default to using Eulers.
This is an issue with the GLTF format. They chose quaternions to represent rotations in animation and as such can't easily represent what the artist's intent was.
You might claim you can sample the Euler animation and split it into multiple quaternion slerps but that brings up another issue which is you need support for discontinuous animations in order to handle other situations (another thing the GLTF format apparently didn't consider).
Quaternions usually match artist's intent, and Euler angles usually don't. glTF isn't alone in using quaternions. I did a bunch of FBX imports a while back, and all the orientation channels are just quaternions. It makes sense, because it's one natural way to interpolate rotation data, just based on the orientations of bones during the keyframes. The kind of stuff that you interpolate using Euler angles is going to be stuff that is naturally on gimbals, like cameras, tanks, robots, turrets, and stuff like that. You can do that easily enough by adding another node to your transform hierarchy with quaternions, but if you started off with Euler angles, you don't really have a way to back out.
Quaternions are not always right, but they are the right default. If you want Euler angles, you can always translate to-from quaternions. Quaternions are independent of the way you set up the coordinate system and each axis is equal.
Unity, for example, uses quaternions internally. It exposes getters and setters for Euler angles that do the conversion to/from quaternions as a convenience. The editor edits Euler angles but they disappear as soon as you are in-game, and if you open up your scene file in a text editor, you'll see m_LocalRotation with the x/y/z/w of a quaternion. I believe Unreal is the same way.
Honestly, that just makes too much sense. Trying to do a physics simulation with Euler angles is just adding extra steps, because Euler angles are not easily composable. If you want to compose two Euler angles to get a third, the easy way to do it is to convert to quaternions, multiply, and then convert back to euler angles. You can see Euler angles in the editor when you are animating a model, but most of the time you are just dragging stuff around on screen or matching mocap data and quaternions make 100x more sense than Euler angles for representing that stuff.
My sense is that any code which does a lot of trig, when there's an obvious way to write the code that does no trig, should probably be rewritten to eliminate the trig. A little bit of sin/cos/tan is fine but as soon as you are doing round trips with acos/asin/atan, you have to start considering where the branch cuts are.
Unity uses quaternions in the rotation but the actual animation curves are still interpolating Euler angles. Same with Unreal, Maya, 3DSMax, Blender etc... GLTF requires you to convert the animation curves to quaternions. That's a lossy operation.
Yeah, I think any time you need to interface with a human, Euler angles are better because they're more intuitive. There's a good reason aircraft instruments display things in Euler angles, for example.
> However writing a rotation directly in quaternion form isn't really intuitive, what we do instead is convert an Euler angle to a quaternion then use it for rotating.
For anyone who is interested in an accessible introduction to representing rotations, I highly recommend this site: https://rotations.berkeley.edu. One of my professors provided it for one of his courses, and it's been a really helpful reference several times since then.
I made this guide on how to implement quaternions yourself and use them to rotate objects in a 3D engine. The implementation is probably not the most efficient, but I tried to make it simple enough to understand how quaternions work.
Thank you for making this and getting straight to the useful code. Too often, guides on quaternions stray into proofs and waste time for a programmer who just wants to apply the quaternion concept.
Most of the time you don't want to use his SLERP function. You can even see what is wrong in his illustration video: the cube does 3/4s of a full rotation, while only 1/4 of a full rotation would have been sufficient. In other words, it's not always taking the shortest path between 2 rotations.
0) Very nice practical introduction to quaternions and their application to rotation.
1) Neat didactic "textbook" implementation, but note that it is not production quality (e.g. there is an avoidable potential overflow in the norm function). That was not the aim, either, but just something to bear in mind.
2) As a supplement, a useful practical reference for rotations in 3D (with good clarifications and basically all formulae you'll ever need, but no implementation) is
Representing Attitude: Euler Angles, Unit Quaternions, and Rotation Vectors by James Diebel
To compute a norm without overflow (unless it is totally unavoidable), let m be the maximum of the absolute values of the components. Divide each component by m, compute the square root of the sum of squares, and multiply by m. Only the last step might overflow, and if it does, it could not be avoided in any case. Incidentally, this normalization procedure also avoids underflow problems.
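The scaling recipe above can be sketched in a few lines of Python. `safe_norm` is a name invented for this example, and the input is assumed to be a non-empty sequence of floats without NaNs.

```python
import math

def safe_norm(components):
    """Euclidean norm that divides by the largest magnitude before squaring."""
    m = max(abs(c) for c in components)
    if m == 0.0 or math.isinf(m):
        # All zeros, or an infinite component: nothing to rescale.
        return m
    # Every ratio c / m lies in [-1, 1], so the squares cannot overflow.
    return m * math.sqrt(sum((c / m) ** 2 for c in components))

big = (1e200, 1e200, 1e200, 1e200)
naive = math.sqrt(sum(c * c for c in big))   # the squares overflow to inf
scaled = safe_norm(big)                      # about 2e200, still finite
```

The extra divisions cost time, which is why this tends to live in general-purpose library code rather than in a graphics inner loop.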
I see. I'd say it depends what the application is, then, because in graphics correctly handling (unlikely) extreme values would be quite secondary to performance, especially for an inner loop function like norm. See fastinvsqrt, e.g., which is extremely imprecise!
Absolutely, I should’ve been more precise: it’s perfectly fine for most applications, but not for a library, or, say, manned aviation. So, yeah, when optimising for accuracy, range, or speed you might implement it differently, respectively.
I mean, papers have been written about sqrt(a^2+b^2) alone… :-)
In many applications there is definitely no reason to worry about overflow when computing norms. That is more of an issue if you are writing library code for general use, which should be as robust as you can make it.
As a non-math person, I've thought a lot about quaternions and why they need 4 dimensions, and why there aren't 3d complex numbers. It's because if you think about it, on the complex plane, the imaginary number i just represents a rotation of 90 degrees. Now if you think about a 3d space, i represents a rotation across one dimension, and j represents a rotation across another dimension. But how do you rotate from i to j? You can't without another number k.
If you want to intuitively understand why this particular 4D construction is the right representation of a 3D rotation, then I highly recommend 3blue1brown's explorable interactive video series on the topic:
The interactive videos alone are quite the technical feat, but after going through it, it's honestly hard to imagine fully understanding this topic with less technology (or with a less incredible teacher!)
In particular, that toggle switch to show it in terms of an angle makes it super clear.
The 4 components can be re-written in terms of 3 variables for an orientation vector and 1 variable for rotation about that axis. Basically "point this way and rotate this much". The 4 variables are expressed as two complex numbers.
That helped me understand what the quaternions are actually describing. Incidentally, it also kind of explains why 3 variables isn't enough, and so the regular rotation thing must not be sufficient.
In "Further Reading" the article links to [Let's remove Quaternions from every 3D Engine](https://marctenbosch.com/quaternions/) which is about Geometric Algebra.
Many chefs are brilliant, but only Jeremiah Tower's book covers will tell you he's brilliant. Many branches of mathematics are of great utility. Geometric Algebra will breathlessly tell you this. I know few fields quite so evangelical.
If you don't know better, you should use quaternions rather than matrices. If you don't know better, stick with quaternions and avoid the generalization presented by Geometric Algebra until the benefit is clear.
This tension is probably why the field is so evangelical.
Quaternions are inevitable. In ten thousand runs of the simulation, sentient beings would come up with quaternions every time. Geometric Algebra is not so inevitable. An aesthetic awareness of the centrality of ideas guides some but not all mathematicians. Like that famous quote about taking an instant dislike to Ted Cruz, it saves time.
As a much more intuitive version of quaternions, there's Geometric Algebra (aka Clifford Algebra). In 4D, the calculations end up being the same, but there's much more intuition and generalizability behind the Geometric version.
This is good, thanks. But a much more interesting problem I haven't seen a good writeup for is how to interpolate smoothly between quaternions at different times. Quaternion slerp has jerks (C_0 but not C_1 or C_2) at the keyframes.
Ken Shoemake’s 1985 SIGGRAPH paper “Animating Rotation with Quaternion Curves”, that brought quaternions to computer graphics, covered this. The idea is to use quaternions as control points in a spline the same way you would use 3d points in a spline. You could have a series of quaternion orientations, and connect them with C_2 continuity by using a connected series of piecewise cubic Bezier splines.
The abstract mentions it: “This paper gives one answer by presenting a new kind of spline curve, created on a sphere, suitable for smoothly in-betweening (i.e. interpolating) sequences of arbitrary rotations.” And the final punch line is section 4.3, then you can work through the details in the earlier sections.
You can use bezier splines (https://ibiblio.org/e-notes/Splines/bezier.html). These just use linear interpolation, multiple times. In the case of quaternions replace the linear interpolations with quaternion slerps and you get quadratic bezier splines over orientations.
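As a rough sketch of that idea: a quadratic Bezier curve is built by repeated linear interpolation (the de Casteljau construction), so swapping each lerp for a slerp gives a curve over orientations. The helper names `slerp`, `bezier_quat` and `rot_z` are invented for this example, and quaternions are assumed to be unit length and stored as (w, x, y, z).

```python
import math

def slerp(q0, q1, t):
    """Shortest-arc slerp between unit quaternions stored as (w, x, y, z)."""
    dot = sum(a * b for a, b in zip(q0, q1))
    if dot < 0.0:                      # q and -q are the same rotation
        q1, dot = tuple(-c for c in q1), -dot
    if dot > 0.9995:                   # nearly parallel: plain lerp is stable
        out = tuple(a + t * (b - a) for a, b in zip(q0, q1))
    else:
        theta = math.acos(dot)
        s0 = math.sin((1.0 - t) * theta) / math.sin(theta)
        s1 = math.sin(t * theta) / math.sin(theta)
        out = tuple(s0 * a + s1 * b for a, b in zip(q0, q1))
    n = math.sqrt(sum(c * c for c in out))
    return tuple(c / n for c in out)

def bezier_quat(q0, q1, q2, t):
    """Quadratic Bezier over orientations: slerp twice, then slerp the results."""
    a = slerp(q0, q1, t)
    b = slerp(q1, q2, t)
    return slerp(a, b, t)

def rot_z(deg):
    """Unit quaternion for a rotation of `deg` degrees about the Z axis."""
    h = math.radians(deg) / 2.0
    return (math.cos(h), 0.0, 0.0, math.sin(h))

start, control, end = rot_z(0), rot_z(90), rot_z(180)
mid = bezier_quat(start, control, end, 0.5)   # a 90 degree turn about Z
```

The curve passes through `start` at t = 0 and `end` at t = 1, and is pulled toward `control` in between.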
I've used "A General Construction Scheme for Unit Quaternion Curves with Simple High Order Derivatives" in the past, and while not perfect it was generally good enough and fairly easy to implement.
Basically it extends Hermite splines to Quaternion splines using the Lie group operations.
Small trivia:
Existence of two unit quaternions corresponding to the same rotation is the same thing as the fact that an electron must be fully rotated twice before it has the same configuration as when it started.
Which is also the same as the fact that (very, very roughly) the virtual photons that make up an electron's electromagnetic field have continuous (in the calculus sense) polarization over time and space.
Quaternion is great for dealing with 3D rotation. Another great approach is using the rotor in geometric algebra. It's pretty simple and it works on rotation in dimensions higher than 3D as well.
These implementations of difference and slerp aren't accounting for geometric double-cover. You want to do a dot-product check first to make sure you're in the same "hemisphere"
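A sketch of what that check looks like, in Python. This is not the article's implementation; `slerp` here is a stand-alone example that assumes unit quaternions stored as (w, x, y, z).

```python
import math

def slerp(q0, q1, t):
    """Spherical interpolation that always follows the shorter arc."""
    dot = sum(a * b for a, b in zip(q0, q1))
    # q1 and -q1 encode the same rotation. If the dot product is negative the
    # two inputs sit in opposite hemispheres, so flip one to take the short way.
    if dot < 0.0:
        q1 = tuple(-c for c in q1)
        dot = -dot
    if dot > 0.9995:
        # Almost the same orientation: fall back to lerp to avoid dividing by ~0.
        out = tuple(a + t * (b - a) for a, b in zip(q0, q1))
    else:
        theta = math.acos(dot)
        s0 = math.sin((1.0 - t) * theta) / math.sin(theta)
        s1 = math.sin(t * theta) / math.sin(theta)
        out = tuple(s0 * a + s1 * b for a, b in zip(q0, q1))
    n = math.sqrt(sum(c * c for c in out))
    return tuple(c / n for c in out)

identity = (1.0, 0.0, 0.0, 0.0)
h = math.radians(90.0) / 2.0
# A 90 degree turn about Z, deliberately written with the "other" sign.
quarter_turn = (-math.cos(h), 0.0, 0.0, -math.sin(h))

halfway = slerp(identity, quarter_turn, 0.5)   # a 45 degree turn about Z
```

Without the sign flip, the same call would interpolate the long way around, which matches the 3/4-turn behaviour described in the comment about the illustration video.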
I love this. Quaternions were my nemesis while learning 3D math. I think it was the way I was taught it but quaternions always confused me as I mixed them up with euler angles. Having resources like this that explain them in detail really helps grok what quaternions are, can do, and how to incorporate them in your project. Great job! I’m over the hump now. I use dual quaternions for skinning and single q’s for rotation storage (why store 3x3m when a 4f quaternion will do?).
> A quaternion is basically a 4 dimensional vector, so it has a magnitude (or norm, or length)
Is it really a vector in the physical sense? People often say vector when they mean N-tuple -- for example we learned in high school that vectors are just N numbers taken together.
For physicists, a vector must satisfy certain transformation laws - it must transform in the correct way if a rotation is applied, and the scalar product must be invariant of the coordinate system, IIRC. I don't have enough intuition of quaternions to say how they behave under transformations, though. I would be surprised if you could have "proper" vectors with four components in three-dimensional space.
Well, quaternions form a vector space over quaternion addition. This part is not very interesting. Vector spaces do not describe multiplication of vectors by each other. So, quaternions are not (only) vectors "in the mathematical sense" when it comes to their more interesting properties.
tldr: Simply explained without demonstrations:
Quaternions are hypercomplex numbers of the form
w + xi + yj + zk
Where w, x, y, and z are real and i^2 = j^2 = k^2 = -1 and ij = k, ji = -k, jk = i, kj = -i, ki = j, ik = -j.
With u = (x, y, z) = xi + yj + zk a unit vector along the rotation axis, any vector q (written as a pure quaternion) can be rotated by an angle theta around u by computing:
pqp'
where p = cos(theta/2) + sin(theta/2)u and p' = cos(theta/2) - sin(theta/2)u .
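Here is the p q p' product written out as a small Python sketch. The function names are invented for the example; quaternions are (w, x, y, z) tuples and the vector being rotated is embedded as a quaternion with zero real part.

```python
import math

def quat_mul(a, b):
    """Hamilton product, following ij = k, jk = i, ki = j."""
    aw, ax, ay, az = a
    bw, bx, by, bz = b
    return (
        aw * bw - ax * bx - ay * by - az * bz,
        aw * bx + ax * bw + ay * bz - az * by,
        aw * by - ax * bz + ay * bw + az * bx,
        aw * bz + ax * by - ay * bx + az * bw,
    )

def rotate(v, u, theta):
    """Rotate 3-vector v by theta radians about the unit axis u via p q p'."""
    c, s = math.cos(theta / 2.0), math.sin(theta / 2.0)
    p = (c, s * u[0], s * u[1], s * u[2])
    p_conj = (c, -s * u[0], -s * u[1], -s * u[2])
    q = (0.0, v[0], v[1], v[2])          # the vector as a pure quaternion
    _, x, y, z = quat_mul(quat_mul(p, q), p_conj)
    return (x, y, z)

# A quarter turn about the Z axis takes the X axis onto the Y axis.
rotated = rotate((1.0, 0.0, 0.0), (0.0, 0.0, 1.0), math.pi / 2.0)
```

Note the half angles: p and -p give the same result, which is the two-to-one correspondence between unit quaternions and rotations.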
It is definitely UB in C++ and probably implementation defined in C (and this use is fine on all implementations, AFAIK). Some C++ implementations allow this as a conforming language extension.
Guys I work at a company that uses Quaternions for rotations of physical objects. PTUs we call them (Pan Tilt Units).
I am telling you Quaternions have HUGE issues. These issues become much more apparent when you deal with physical objects.
Here's the thing: Quaternions don't exist in reality. A Quaternion represents an orientation, but it completely masks the path taken to achieve that orientation.
For every gimbal in reality there is an actual YawPitchRoll (YPR) that was executed to achieve that orientation. As soon as you convert that real YPR into a Quaternion you lose the YPR that was needed to achieve that orientation.
So let's say I need to have one gimbal imitate the position of another gimbal. I take the YPR given to me by gimbal "A" convert the YPR to a Quat, send that Quat over the wire to Gimbal "B" and convert that Quat back to YPR to feed to the gimbal so it can rotate itself to imitate the orientation of gimbal A.
The quat is a higher entropy form of information. Now when converting back to YPR there are MULTIPLE YPRs that yield the same orientation. You can derive a YPR that is out of bounds of the physical gimbal.
Literally you can get a YPR that tells your gimbal to yaw 190, pitch all the way back past 90 to 160 degrees, and roll 180 degrees until it's right side up. This YPR is identical to a yaw of 10, a pitch of 20 and 0 roll. Quaternions hide the original YPR; you lose information, so when you receive a Quaternion it's hard to translate it into a physical realization of the orientation.
The company I work for doesn't realize this. They used Quaternions from day one and we have all kinds of headaches like this when we try to extract the YPR and use these orientations in the real world. Actually I should say only I have these headaches. A lot of people haven't figured out this problem yet.
The only time you should use Quats are if you need to transform an orientation or you're dealing with virtual objects that have no rotational limits. Everybody thinks quats are magic and better. They are not. They have huge downsides. Huge.
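The ambiguity described in this comment is easy to check numerically. The sketch below assumes a Z-Y-X convention (yaw about Z, then pitch about Y, then roll about X) with angles in degrees; the function names are invented for the example. It relies on the flip identity that (yaw, pitch, roll) and (yaw + 180, 180 - pitch, roll + 180) describe the same orientation, so (10, 20, 0) pairs with (190, 160, 180).

```python
import math

def quat_mul(a, b):
    """Hamilton product of quaternions stored as (w, x, y, z)."""
    aw, ax, ay, az = a
    bw, bx, by, bz = b
    return (
        aw * bw - ax * bx - ay * by - az * bz,
        aw * bx + ax * bw + ay * bz - az * by,
        aw * by - ax * bz + ay * bw + az * bx,
        aw * bz + ax * by - ay * bx + az * bw,
    )

def ypr_to_quat(yaw, pitch, roll):
    """Yaw/pitch/roll in degrees to a unit quaternion (assumed Z-Y-X order)."""
    y, p, r = (math.radians(a) / 2.0 for a in (yaw, pitch, roll))
    qz = (math.cos(y), 0.0, 0.0, math.sin(y))
    qy = (math.cos(p), 0.0, math.sin(p), 0.0)
    qx = (math.cos(r), math.sin(r), 0.0, 0.0)
    return quat_mul(quat_mul(qz, qy), qx)

def same_rotation(a, b, tol=1e-9):
    """q and -q are the same rotation, so compare |dot| against 1."""
    return abs(abs(sum(x * y for x, y in zip(a, b))) - 1.0) < tol

plain = ypr_to_quat(10.0, 20.0, 0.0)
flipped = ypr_to_quat(190.0, 160.0, 180.0)
same = same_rotation(plain, flipped)   # both triples give one orientation
```

Two different sets of motor angles, one quaternion: the quaternion alone cannot say which triple the gimbal actually executed.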
If you have actual, possibly motorized, physical rotation axes to keep track of, then of course you need to keep track of the actual angles of each separate rotation, somewhere.
Compressing it all into one orientation may work for some use cases, but generally one should not expect that. Similarly, one wouldn't try to represent all of the axes of a KUKA robot arm with just one orientation. You need info about the individual axes when you want to control it.
Your company can still use quaternions to represent the rotations of each axis, tho. Might help in convincing them going forward, as they don't have to let go of them (they are good fellas actually).
I meant when you want to compute the final orientation (let's ignore position): Each axis results in some orientation change which can be expressed as a quaternion. The final orientation can then be computed by combining the quaternions into one quaternion.
This is very similar to the computation of forward kinematics (https://en.wikipedia.org/wiki/Forward_kinematics) for robots, which just have additional translations for the transformations.
As said before you still want to retain the actual angles somewhere.
Depending on your physical configuration one of the Euler angle variants (Tait–Bryan angles is another term to search for) could perfectly describe your case and you could just use these to store the angles.
Euler angles can also be converted to quaternion. But you can't recover the original used angles from the quaternion alone because of the 2pi wrapping of angles.
For others still wondering why a quaternion or an orientation alone won't suffice, here is a different example: Imagine an axis which can rotate more than one revolution, i.e. can have angles like 4 pi, for example a motorized volume knob. An orientation alone can't represent that, as it wraps the angles to 2pi. The user manually turns the knob two full revolutions (4pi) and then increases the value by remote control, which causes the motorized knob to rotate. Should the knob turn back to nearly zero revolutions (rotate back 4pi) plus the increase? No, it should rotate to 4pi+increase.
By the way, when you have internally stored the "multi revolution" angle and have a sensor reading in range [0..2pi[ you can recover a multi-revolution angle equivalent to the sensor reading with wrapToPiSeq(angle in radians before, sensor reading).
from math import pi

# to equivalent angle in [0,2pi[ range
def wrapTo2Pi(rad):
    return rad % (2*pi)

# to equivalent angle in [-pi,pi[ range
def wrapToPi(rad):
    return wrapTo2Pi(rad + pi) - pi

# angle difference from rad0 to rad1, in range [-pi,pi[
def angleDiff(rad0, rad1):
    r0 = wrapToPi(rad0)
    r1 = wrapToPi(rad1)
    return wrapToPi(r1 - r0)

# rad1 as an equivalent angle, chosen so the jump from rad0 is at most pi in magnitude
def wrapToPiSeq(rad0, rad1):
    r0 = wrapToPi(rad0)
    r1 = wrapToPi(rad1)
    diff = wrapToPi(r1 - r0)
    return rad0 + diff
I get what you're saying here. So use 3 quaternions to represent the orientation rather than 1. Right?
But doesn't that magnify the problem by 3x? Extracting the angle from the quat again can yield multiple possibilities. If you have 3 quats you now have 3x more possibilities. This can only be done if you assume certain restrictions for each quat such that they yield a singular axis angle when you do a back conversion.
Additionally, doesn't that technique yield more possibility to encode axes that are incorrect? A quaternion representing the x axis can be accidentally encoded with rotations along other axes. It's better to keep the type of your domain restricted to be able to encode only possible answers.
Still, this can be done as a valid-ish workaround. I give you credit for that, I wouldn't have thought of this, so thanks for your explanation. Although I might use this idea, it is far from ideal imo because of the problems I mentioned above. That is, if I interpreted what you're saying correctly?
Also addressing your example, the bigger problem in my mind is that there are still issues that arise even if the physical gimbal is restricted on all axes to (0, 2pi). Any gimbal that can go 4pi likely can go infinite pi and that knob will likely be the same, so losing rotational information greater than 2pi is ok for most cases in my mind. (If the system yawed 790pi, users are usually only interested in some value under 2pi). The insidious thing imo is that information is lost even in (0, 2pi) and a lot of people don't realize this.
Represent your axes with the actual angles. These probably correspond to motor position or revolutions.
Use the quaternions only when you want an orientation for these angles.
Maybe you want to know in which direction the PTU is pointing for a particular set of motor positions, or axis angles. Compute the kinematic chain by multiplying the quaternions corresponding to each axis in the order they are physically applied, but multiply from right to left. Your final result is one quaternion representing the direction the PTU is pointing at. (You could also use 3x3 matrices or other representations) Depending on your physical configuration one of the Euler angle variants (Tait–Bryan angles is another term to search for) could perfectly describe your case and you could just use these to store the angles and compute the orientation.
If you don't actually need the final orientation, then you can omit the quaternions altogether.
If you have orientations as input and need to control the motors so the PTUs are pointing in the required direction:
I would compute the current orientation of the PTU. Then compute a trajectory of quaternions interpolating from the current orientation to the target orientation (use quaternion slerp for interpolation).
Then at each timestep you compute the required motor positions using inverse kinematics [1].
It is a common problem that multiple motor positions are possible and actually an unfinished research problem.
Now that I wrote this, I think this might be the problem you are encountering.
For a PTU I think it would be ok to try to recover the current target angles from the target quaternions of the trajectory using one of the Euler configurations. There are papers [2] listing them all together, so one can try which is correct.
Having chosen a configuration, there is still the problem of multiple solutions. In this case use the one closest to the actual current angles. For cases when all angles are possible for an axis, use the current angle as target (i.e. motor doesn't change). When you then have a target angle, use `wrapToPiSeq` to get an equivalent angle close to the current actual angle as input for the motor controller.
[1] https://en.wikipedia.org/wiki/Inverse_kinematics
In computer animation and robotics, inverse kinematics is the mathematical process of calculating the variable joint parameters needed to place the end of a kinematic chain, such as a robot manipulator or animation character's skeleton, in a given position and orientation relative to the start of the chain.
[2] In the past I used this (but be careful in which direction they apply the transformations, I stumbled over this):
Diebel, J. (2006). Representing attitude: Euler angles, unit quaternions, and rotation vectors. Matrix, 58(15-16), 1-35.
https://www.astro.rug.nl/software/kapteyn/_downloads/fa29752...
When switching target orientation while already approaching one I would use quadratic bezier splines to get a smooth switch (https://ibiblio.org/e-notes/Splines/bezier.html). These just use linear interpolation, multiple times.
In the case of quaternions replace the linear interpolations with quaternion slerps and you get quadratic bezier splines over orientations.
I'm not sure how we do it in this case. I know we already follow a trapezoidal velocity profile when approaching a target, but mid switch to another target I'm not sure what we're using. Thanks for this, I will look into it.
> In this case use the one closest to the actual current angles. For cases when all angles are possible for an axis, use the current angle as target (i.e. motor doesn't change). When you then have a target angle, use `wrapToPiSeq` to get an equivalent angle close to the current actual angle as input for the motor controller.
Yeah that's how I solved this issue. But still, if we avoided quaternions we wouldn't have this problem altogether, which is my point.
Specifically, what's going on is that we're sending quaternion values over the network and the person on the receiving end needs YPR, so we're basically like wtf: there are no transformations being performed on the quat, the source of the info is a YPR, and the output needed is the same exact YPR, so we're only converting to a company-wide Quat Protobuf type to send it over the wire. I submitted a request to make a new protobuf type that included YPR, but I was met with huge resistance from other engineers saying that a YPR was redundant to a Quat (it's not).
>In computer animation and robotics, inverse kinematics is the mathematical process of calculating the variable joint parameters needed to place the end of a kinematic chain, such as a robot manipulator or animation character's skeleton, in a given position and orientation relative to the start of the chain.
Interesting, but yeah the application in my company is just a single gimbal so there's no chain of "joints" here. I don't think this would apply to my specific case.
>> For every gimbal in reality there is an actual YawPitchRoll (YPR) that was executed to achieve that orientation. As soon as you convert that real YPR into a Quaternion you lose the YPR that was needed to achieve that orientation.
I would say you obscure it. You can certainly calculate it from the 4 quaternion parameters.
SolveSpace (Free CAD software) can be used to design assemblies and mechanisms from a set of parts with constraints. You can certainly build a gimbal with it by constraining the pieces. If you do it correctly, it will be possible to re-orient the final 3DoF part and the constraint solver will solve for the angles (assuming you built it that way).
Internally we treat all object orientations as quaternions, so this would just be using the algebraic constraint solver to find the angles. In practice there will be closed form solutions - with problems at gimbal lock.
Nope. The only thing you need to specify is the order the YPR angles are applied (that's more a convention than an assumption). In SolveSpace the assembly constraints would effectively encode the order of application.
>If you really need this problem solve I might know someone willing to do paid consulting on it.
If you really need education in math and how to properly do trigonometry I will help you solve this problem for free. Just ask me questions. I'm super nice and won't go around fraudulently espousing an expertise in math and demanding people pay me to solve trivial math problems.
Frankly you are not even qualified to solve the problem yourself or give me a recommendation.
Especially when this problem is basically impossible to solve. I'll give you 10 thousand dollars if you can give me a quaternion that doesn't have multiple yprs here on HN. Literally, give me your venmo.
>Nope. The only thing you need to specify is the order the YPR angles are applied (that's more a convention than an assumption). In SolveSpace the assembly constraints would effectively encode the order of application.
The term "yaw pitch roll" ALREADY has the order of the Euler angles applied. Let me tell you that order, it's: yaw, pitch and then roll.
Ypr is different from straight up Euler angles in that ypr has order fixed; hence the term "ypr"
With the order of angles fixed you still get multiple answers for a single quaternion. Why don't you try it in 3D space in your own head. Given a rotational orientation in 3D, find at least two yprs needed to arrive there.
Here maybe this will help you: a ypr of (0,0,0) is the same as a ypr of (180, 180, 180). These two yprs can only be represented by a single quat. Think about it. Given only a quat you cannot determine which ypr was used by the physical gimbal to realize this orientation. For solve space to know it must be making assumptions or holding onto information outside of the quat.
There, free education for you and I didn't even ask you for a dime.
No, no, no. Wrong again. Please study basic trigonometry and read your own sources. Your own link proves you wrong.
The formula for conversion from quat to ypr involves arctan and arcsin. These functions yield multiple answers.
Additionally your own Wikipedia link explicitly states the existence of multiple answers, and that traditionally atan and asin in programming languages yield only one answer. I quote:
"Note, however, that the arctan and arcsin functions implemented in computer languages only produce results between −π/2 and π/2, and for three rotations between −π/2 and π/2 one does not obtain all possible orientations. To generate all the orientations one needs to replace the arctan functions in computer code by atan2"
Either way you misunderstand the math behind quaternions and you lack a basic grasp of trigonometry. Assuming you read your own Wikipedia link, what I said is categorically true.
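For readers following the atan2/asin point, here is a rough Python sketch of the round trip. It assumes a Z-Y-X yaw/pitch/roll convention in degrees and uses the commonly quoted closed-form extraction; all function names are invented for the example and none of this is taken from any gimbal's firmware.

```python
import math

def quat_mul(a, b):
    """Hamilton product of quaternions stored as (w, x, y, z)."""
    aw, ax, ay, az = a
    bw, bx, by, bz = b
    return (
        aw * bw - ax * bx - ay * by - az * bz,
        aw * bx + ax * bw + ay * bz - az * by,
        aw * by - ax * bz + ay * bw + az * bx,
        aw * bz + ax * by - ay * bx + az * bw,
    )

def ypr_to_quat(yaw, pitch, roll):
    """Yaw/pitch/roll in degrees to a unit quaternion (assumed Z-Y-X order)."""
    y, p, r = (math.radians(a) / 2.0 for a in (yaw, pitch, roll))
    qz = (math.cos(y), 0.0, 0.0, math.sin(y))
    qy = (math.cos(p), 0.0, math.sin(p), 0.0)
    qx = (math.cos(r), math.sin(r), 0.0, 0.0)
    return quat_mul(quat_mul(qz, qy), qx)

def quat_to_ypr(q):
    """One yaw/pitch/roll triple (degrees) for q, with pitch in [-90, 90]."""
    w, x, y, z = q
    yaw = math.atan2(2.0 * (w * z + x * y), 1.0 - 2.0 * (y * y + z * z))
    # Clamp so rounding error cannot push the asin argument outside [-1, 1].
    sin_pitch = max(-1.0, min(1.0, 2.0 * (w * y - z * x)))
    pitch = math.asin(sin_pitch)
    roll = math.atan2(2.0 * (w * x + y * z), 1.0 - 2.0 * (x * x + y * y))
    return tuple(math.degrees(a) for a in (yaw, pitch, roll))

sent = (190.0, 160.0, 180.0)
received = quat_to_ypr(ypr_to_quat(*sent))   # comes back as roughly (10, 20, 0)
```

The extraction is deterministic, but only because asin commits to a pitch between -90 and 90 degrees. The triple that went in is not the triple that comes out, so the receiver needs an agreed convention, or the gimbal's current angles, to pick the intended one.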
Interesting to read about experience with a physical gimbal, thanks. In this case the "problems" of Euler angles are actually an accurate model of the problem space.
You can design a hardware ptu that when given a quat it figures out its own ypr to arrive at that orientation. That would make quats feasible for hardware.
However, how the Ypr was picked by the hardware must be explicitly encoded as an assumption that must be part of every conversion from quat to ypr that happens downstream.
I'm sure it does, I'll give you the benefit of the doubt even though the article makes no mention of Quaternions. My point is, using Quaternions for physical devices is using a hammer on a screw. Huge mistake, but it can be done by people who don't know any better. I'm guessing you worked on this and bought in to the whole Quaternion BS?
I'm in the defense industry as well and guess what? Basically most people don't know any better.
Rotations of 3-dimensional real space form the topological group SO(3). Naive parameterizations of that group do not form a cover [1], but the group of norm-1 quaternions, Spin(3), does.
The failure of naive parameterizations, like the Euler angles, to be a cover manifests itself as gimbal lock.
[1] https://en.wikipedia.org/wiki/Covering_space