Matrices from a geometric perspective (coranac.com)
206 points by niklasbuschmann on Dec 17, 2017 | 51 comments



3Blue1Brown has an outstanding series of short videos, "Essence of linear algebra"[1], which covers vectors, matrix transformations, and the related math. What I particularly like about these videos is that the concepts are introduced first without the numbers and calculations (that stuff is covered later). The concepts appear first as abstract animations that emphasize the "shape" of what the vector, matrix, etc. really represent.

[1] https://www.youtube.com/playlist?list=PLZHQObOWTQDPD3MizzM2x...


Amazing series!

Open question I still have - what is the "geometric interpretation" of the transpose operation? A^T?

Considering the fact that the transpose shows up all the time, I'm very surprised that I've never seen a good explanation of how I should be visualizing it.


Graph theory offers one nice illustration of the transpose (for square matrices). Here's a directed graph:

  ⬐---------¬
  n₁ → n₂ → n₃
It can be represented by this adjacency matrix:

  X = [ 0  1  0 ]
      [ 0  0  1 ]
      [ 1  0  0 ]
See how there is a 1 for each row i and column j where there is an edge from node i to node j?

Okay, now look at the transpose, Xᵀ:

  Xᵀ = [ 0  0  1 ]
       [ 1  0  0 ]
       [ 0  1  0 ]
If we interpret this matrix as an adjacency matrix, its graph looks like this:

  ⬐---------¬
  n₃ → n₂ → n₁
Taking the transpose of a directed graph's adjacency matrix just reverses the direction of the edges!

P.S.: some more cool things about adjacency matrices:

Imagine that a vector represents a node or set of nodes on a graph. Like this:

  n₁ = [ 1 ],  n₂ = [ 0 ],  n₃ = [ 0 ],
       [ 0 ]        [ 1 ]        [ 0 ]
       [ 0 ]        [ 0 ]        [ 1 ]
Watch what happens when we multiply Xᵀ with n₁:

  Xᵀ*n₁ = [ 0  0  1 ]   [ 1 ]   [ 0 ]
          [ 1  0  0 ] * [ 0 ] = [ 1 ] = n₂
          [ 0  1  0 ]   [ 0 ]   [ 0 ]
So the adjacency matrix is not only an index of edges: its transpose is also a little machine that can push nodes forward along the edges of the graph. You can even do it with two at a time:

  Xᵀ*(n₁ + n₂) = [ 0  0  1 ]   [ 1 ]   [ 0 ]
                 [ 1  0  0 ] * [ 1 ] = [ 1 ] = n₂ + n₃
                 [ 0  1  0 ]   [ 0 ]   [ 1 ]
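Here's a quick numpy sketch of the above, if you want to play with it (numpy purely for illustration):

  import numpy as np

  # Adjacency matrix of n1 -> n2 -> n3 -> n1 (X[i][j] = 1 iff edge i -> j).
  X = np.array([[0, 1, 0],
                [0, 0, 1],
                [1, 0, 0]])

  n1 = np.array([1, 0, 0])
  n2 = np.array([0, 1, 0])

  # X.T pushes indicator vectors forward along the edges.
  print(X.T @ n1)         # [0 1 0] = n2
  print(X.T @ (n1 + n2))  # [0 1 1] = n2 + n3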
If you're interested in diving further into algebraic representations of graphs, check out the Leavitt Path algebra: https://arxiv.org/abs/1410.1835


Notably adjacency matrices are orthogonal so A^T = A^-1.

This extends to probabilistic transitions as well.


Well, the adjacency matrix I used as an example is orthogonal, but they don't have to be. Any matrix with 1's and 0's can be interpreted as an unweighted adjacency matrix for an undirected graph (if the matrix is symmetric) or directed graph (if it's not symmetric). For example, here's an adjacency matrix that's not an orthogonal matrix:

  [ 1 0 ]
  [ 1 1 ]


It's a bit difficult to explain the visualization without pictures, but I'll give it a shot.

The transpose is really about converting the matrix to operate on a different vector space, namely, the dual space. In particular, the dual space of a vector space V is the vector space of "linear functionals", which are linear functions

\phi: V -> R

A linear functional on R^2 looks like a gradient (the "gradient fill" gradient, not a calculus gradient). These gradients are in one-to-one correspondence with vectors in R^2. In particular, given a vector w \in R^2, the direction of the gradient is along the direction of w, and the speed with which the gradient is changing corresponds to the magnitude of w.

The precise mathematical correspondence is that (i) given a vector w \in R^2, the function

f_w(v) = <w, v>

is a linear functional (here, <,> is the inner/dot product), and (ii) every linear functional has this form. Now, note that f_w is exactly multiplication by the transpose w^T of w! In particular,

f_w(v) = w^T v

More generally, for any linear map A : V -> W, the adjoint A* of A is defined to be the linear map from the dual space of W to the dual space of V that satisfies

<w, A v> = <A* w, v>

The transpose A^T is the adjoint of A when V and W are finite-dimensional real vector spaces with the standard dot product:

<A^T w, v> = (A^T w)^T v = w^T A v = <w, A v>

In summary, you can try to visualize A^T as a linear map "acting" on dual vectors. For example, let v \in R^2 and let w be a dual vector (i.e., a gradient), and suppose that A rotates v clockwise by 90 degrees. To preserve the inner product <w, A v>, A^T rotates w counter-clockwise by 90 degrees.
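
To make that concrete, here's a small numpy check of the identity (the clockwise-rotation A is the example above; w and v are arbitrary):

  import numpy as np

  A = np.array([[0.0, 1.0],    # rotates a vector clockwise by 90 degrees
                [-1.0, 0.0]])

  rng = np.random.default_rng(0)
  w = rng.standard_normal(2)
  v = rng.standard_normal(2)

  # <w, A v> == <A^T w, v>
  print(np.dot(w, A @ v), np.dot(A.T @ w, v))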


And a practical example of this is checking transformed points against a view frustum. Instead of transforming points into view space, a transposed matrix allows you to transform the frustum into object space and check untransformed points against it. This works only on non-perspective transforms, of course, but the view transform should not be perspective anyway.

To visualize this, take the simplest case of 2D space and non-homogeneous coordinates. A simple frustum would be an angle made by two rays from the origin. You can see that rotating this space is the same as rotating the frustum in the opposite direction (though in this case the transposed matrix is the same as the inverse), but stretching the space opens/closes the frustum depending on which direction it pulls the normals in.
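
The identity behind this is n·(Mp) = (Mᵀn)·p: instead of transforming every point p, you can transform each frustum plane normal n once by the transpose. A minimal 2D numpy sketch (the normal, matrix, and point are made-up values):

  import numpy as np

  n = np.array([1.0, -1.0])    # a frustum plane normal; inside: dot(n, p) <= 0
  M = np.array([[0.0, -1.0],   # a rotation of the space
                [1.0,  0.0]])
  p = np.array([0.3, 2.0])

  # Testing the transformed point against the plane equals testing the
  # original point against the transpose-transformed normal.
  print(np.dot(n, M @ p), np.dot(M.T @ n, p))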


> Open question I still have - what is the "geometric interpretation" of the transpose operation? A^T?

AFAIK, there is not a solid geometric interpretation. Part of the trouble is that the transpose can change the shape of the matrix.

For example, for a vector the transpose turns it into a matrix outputting a single number. These two objects seem to be rather incomparable geometrically.

The best intuition I have for transposition is that it represents a time reversal (but not necessarily an inverse). In the case of a vector, you have to think of it as a linear transformation that maps 1 to that vector. The transpose instead maps that vector to its length squared. I have a vaguer intuition for why it's the length squared and how it relates to projections, but it's hard to put into words.

With rotation matrices, this time reversal results in the inverse. So essentially rotation/skewing is reversed, but scaling is not.
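
A concrete toy illustration of both claims in numpy (my own example):

  import numpy as np

  v = np.array([[3.0], [4.0]])   # v as a map from R^1 to R^2: 1 |-> v
  print(v.T @ v)                 # [[25.]] = |v|^2

  c, s = np.cos(0.7), np.sin(0.7)
  R = np.array([[c, -s], [s, c]])            # a rotation matrix
  print(np.allclose(R.T, np.linalg.inv(R)))  # True: transpose is the inverse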


I don't have a complete answer, but a first step would be to think of the coordinate-free concept behind transposes, which is the adjoint: https://en.wikipedia.org/wiki/Transpose#Adjoint



I was hoping this was the top comment. Linear algebra would have been much more interesting if I had watched those videos first. 3B1B is such an amazing teacher!


It reminded me of 3B1B when I read this much earlier blog post. 3B1B is a really wonderful tutor; linear algebra is only one series among others.


Can't overstate how great his stuff is.


For people interested in the relationship between algebra and geometry, there is a quite rigorous, yet not too long discussion of these topics in the "Elements of Geometry for Computer Vision" [1].

It comes from the engineering end and finds a good balance between going into too much depth and introducing the necessary theory for bridging algebra and geometry. E.g. representing motion as a change of basis in linear space, the perspective camera as a mathematical model, etc.

[1] http://cmp.felk.cvut.cz/~pajdla/gvg/GVG-2016-Lecture.pdf


Did not expect to see this mentioned. I attended this exact course at university. This is my go-to material for computer vision geometry.


I don't want to doubt the potential usefulness of the linked article to certain audiences, but I wanted to point out the following:

It's important to acknowledge that intuitions from rotations etc. are crutches, and not substitutes for deeply understanding linear-algebraic constructions and their formal properties.

Linear algebra is not really about the arithmetic of multiplying arrays of numbers - it's about the nice algebraic things that happen when you're working with things that come with "linear" operations. The linear-algebraic things that permeate all of mathematics aren't "rotation matrices" and such, but rather "universal constructions" like kernels and cokernels, products and tensor products, etc. We even have abstractions that precisely formalize these nice properties of the category of modules/vector spaces, such as abelian categories and linear functors. Any introductory reference on these will provide you with an abundance of examples where things with these formal properties naturally arise.


A related remark: there is another (rather silly) way to "geometrize linear algebra". It goes something like whenever you see a ring R, think of it as the sheaf of functions (the structure sheaf) on some space. Whenever you see a module over R, think of it as the space of sections in some vector bundle over that space. Then anything you say about modules through this admits a sheaf-theoretic version, which is often easier to picture if you're geometrically inclined.


In the case where the base ring R is a field (all of typical linear algebra), then the space associated to it is a point. So a vector bundle over this point is just a vector space, so we didn't really gain any extra geometry here - if you want a way of picturing linear algebra geometrically, you still need some other idea.


Good point!


Would you suggest a set of intermediate-level books on the topic?

For example: what would be beneficial to read if I understand that matrices are systems of linear equations, but I do not clearly understand why a cross product exists only for 3 dimensions?


I finished Ph.D. courses without ever having to pay much attention to the geometric interpretation of matrices. The path I took was solely computational and algebraic. I often wondered if I'd missed anything. The 3Blue1Brown videos say I did.

I will admit I don't have a solid intuition for matrices used for rotations, flips, etc. (outside of basic 2x2 examples). This can be a disadvantage, even if one pursues a purely computational approach to linear algebra. For instance, it took me a little longer to understand numerical techniques like Householder transformations. https://en.wikipedia.org/wiki/Householder_transformation

That said, I want to also say something in defense of the purely algebraic approach. In higher dimensions, there is no geometric analogy and people who rely on geometry alone find themselves unable to think abstractly in higher dimensions. To me, matrices and vectors are the building blocks for multivariate generalizations of familiar relationships. The algebraic approach meant I was comfortable dealing with complicated high dimensional objects that weren't amenable to visualization.

I believe intuitions are good for laying a foundation, but abstractions are necessary for generalization and for building larger theoretical systems. Intuitive constructs can only get you so far. If I'm not mistaken, this was what the Bourbaki group strove for; that is, to move mathematics past the intuition-driven approaches of the day to a solid theoretical foundation built upon abstractions. There are pros and cons, but arguably without this foundation, mathematics might not have grown in the way it did.


> In higher dimensions, there is no geometric analogy and people who rely on geometry alone find themselves unable to think abstractly in higher dimensions.

Often features in higher dimensions can start to be seen in lower dimensions. A common example of the strangeness of higher-dimensional spaces: place hyperspheres of radius 1 at each corner of a hypercube with side length 2, and consider the central hypersphere touched by all of them. Its radius is sqrt(n) - 1, so in high enough dimensions it pokes outside the hypercube, and eventually its volume even exceeds the hypercube's.

This is not true in 3 or even 4 dimensional space. But if you use geometric intuition to see the trend from 1d to 2d to 3d, it seems entirely reasonable.

(Apologies if I didn't explain the problem well. I wanted to find a video on it, but couldn't find one)
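
Here's a tiny numeric sketch of that trend (my own numbers, using the setup above: radius-1 spheres at the corners (±1, ..., ±1) of the side-2 hypercube, central sphere of radius sqrt(n) - 1):

  import math

  def log_ball_volume(n, r):
      # log of the volume of an n-dimensional ball of radius r
      return (n / 2) * math.log(math.pi) + n * math.log(r) - math.lgamma(n / 2 + 1)

  for n in (2, 3, 4, 5, 10, 100):
      r = math.sqrt(n) - 1                              # central sphere radius
      pokes_out = r > 1                                 # beyond the cube [-1, 1]^n
      bigger = log_ball_volume(n, r) > n * math.log(2)  # volume > cube's 2^n
      print(n, round(r, 2), pokes_out, bigger)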

> I believe intuitions are good for laying a foundation, but abstractions are necessary for generalization and for building larger theoretical systems. Intuitive constructs can only get you so far.

I have a slightly different view on this. I feel that abstractions are fundamentally necessary for communication rather than conceptualization. For example, you may realize that the kind of theorems you have been proving about vectors apply equally to polynomials, but in order to convince others you would need to write down the properties they both share (resulting in an abstraction).

It's kind of a subtle point, because communication is key to mathematics. But normally, before I try to write a proof, I figure out the intuition for why it must be true in my head. Because this is a powerful tool, I try to learn a solid intuition for any abstraction I am introduced to.


> Intuitive constructs can only get you so far.

Yes, but intuitive constructs help the beginner. They make the material easier to absorb, and hopefully, by building this intuition up, it can lead to deeper learning later on (what you call abstraction).


This is obvious for anyone who does game development. Matrices are used to convert 3d game coords -> screen coords. Also, almost all movement/rotation/scaling of objects in a game is done via matrix multiplications.

I used matrices once to draw 3d spheres efficiently! It's very computationally expensive to calculate the points of many spheres of varying radii every frame. What I instead did was pre-compute a few small spheres located at the origin. I then matrix-multiplied the pre-computed vertices to position spheres at the desired location and size in the 3d world. This was so much faster my fps went from ~5 to a smooth 60!
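
For anyone curious, the idea in toy form (a numpy sketch; the crude lat/long mesh is just illustrative):

  import numpy as np

  # Precompute one unit sphere's vertices once (a crude lat/long mesh).
  theta, phi = np.meshgrid(np.linspace(0, np.pi, 8),
                           np.linspace(0, 2 * np.pi, 16))
  unit = np.stack([np.sin(theta) * np.cos(phi),
                   np.sin(theta) * np.sin(phi),
                   np.cos(theta),
                   np.ones_like(theta)]).reshape(4, -1)  # homogeneous coords

  def place_sphere(radius, center):
      # Per-instance scale-then-translate matrix; the mesh is never recomputed.
      m = np.eye(4)
      m[:3, :3] *= radius
      m[:3, 3] = center
      return m @ unit

  verts = place_sphere(2.5, [10.0, 0.0, -3.0])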

Very useful knowledge!


> Also almost all movements/rotations/scaling of objects in game is done via matrix multiplications.

Not necessarily true. Quaternions are used a lot because they have some desirable properties over matrices. As far as I remember, they are smaller and more suitable for interpolation.


It's not either-or; quaternions do not replace matrices. Matrices are (almost) always used, regardless of the use of quaternions (and regardless of the use of any other types of transforms, including geometric algebra). Quats are typically used for interpolating orientations. Once you have an interpolated orientation, you still have to turn it into a matrix to interact with other transforms, and to send to the GPU. The vast majority (if not all) of game engines, libraries, and GPU/console hardware operate on matrices at the lower levels.


I could have sworn that quaternion transforms were still 4x4 matrices, but I think that's just because quaternions themselves are representable as 4x4 real matrices, which are computationally convenient. https://en.wikipedia.org/wiki/Quaternion#Matrix_representati...


Quaternions are a 4-parameter representation of a rotation + scaling operator, the analog of using a complex number to represent rotation + scaling in the plane.

You can convert a (4 parameter) quaternion to a (9 parameter) 3x3 matrix in a similar way that you can convert a (2 parameter) complex number to a (4 parameter) 2x2 matrix. In either case you aren't using the full parameter space of the matrices.

In general in programs quaternions are used for storing and manipulating rotation operators (e.g. composing them together, etc.) because it saves work and improves numerical stability, but then final application of the rotation to a gigantic list of vectors is done using a matrix, because matrix-vector multiplication is very efficient (only need 9 multiplications and 6 additions to multiply a 3x3 matrix by a 3-element vector).
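
For reference, the standard unit-quaternion-to-rotation-matrix conversion looks like this (a sketch; sign and ordering conventions vary between libraries):

  import numpy as np

  def quat_to_matrix(w, x, y, z):
      # Assumes (w, x, y, z) is a unit quaternion.
      return np.array([
          [1 - 2*(y*y + z*z), 2*(x*y - w*z),     2*(x*z + w*y)],
          [2*(x*y + w*z),     1 - 2*(x*x + z*z), 2*(y*z - w*x)],
          [2*(x*z - w*y),     2*(y*z + w*x),     1 - 2*(x*x + y*y)],
      ])

  # 90-degree rotation about z: q = (cos 45deg, 0, 0, sin 45deg).
  R = quat_to_matrix(np.sqrt(0.5), 0.0, 0.0, np.sqrt(0.5))
  print(R @ np.array([1.0, 0.0, 0.0]))  # ~[0, 1, 0]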


Quaternions represent 3D rotations in a way which avoids the drawbacks of Euler angles. See "gimbal lock".


Yes, but he's comparing them against rotation matrices (which don't suffer from gimbal lock), not Euler angles:

http://work.thaslwanter.at/Kinematics/html/04_Quaternions.ht...

In practice, game engines use all three methods.


Euler angles (or an equivalent) are implicit in the way rotation matrices work.


This is not correct (or at any rate is oversimplified). A rotation matrix can represent any arbitrary path through the space of possible rotations (SO(3)) by smoothly varying parameters. There is no inherent “gimbal lock” or the like.

In general I would recommend avoiding Euler angles as a formal representation of rotations. There are several choices of better representations, some of which are described at https://en.wikipedia.org/wiki/Rotation_formalisms_in_three_d...


You're right, I should have said "usually inherent". But it's a major reason for using quaternions.

While you stress the case of gimbal lock occurring within a single rotation, which is typically due to the use of Euler angles, there is also the problem of gimbal lock emerging through the composition of two or more rotations in matrix form. I'm vague on this point, but AFAIK this problem can arise with rotation matrices whether or not Euler angles are used. Choose three successive unlucky rotation axes and you've got the problem.

This seems to get so complicated. See e.g. https://en.m.wikipedia.org/wiki/Davenport_chained_rotations


Can also be prevented by rotation composition order of Y,X,Z (where Y is up and Z is forward)


I thought there was no 3D parameterization of SO3 that was singularity-free, except maybe axis*angle.


For an article like this, I find that static visualizations aren't anywhere near as effective as the dynamic ones found in a similar explanation here (3Blue1Brown on linear algebra transformations):

https://www.youtube.com/watch?v=kjBOesZCoqc&list=PLZHQObOWTQ...


A notebook I made for exploring this relationship. If you have Jupyter notebooks set up, you can change the parameters and see how this works in a more interactive setting: https://github.com/Andrewnetwork/WorkshopScipy/blob/master/N...


Genuinely curious how anyone could know about matrices and _not_ know their relationship to geometry.


Why do you assume a geometric interpretation is a given? For graphics, matrices and geometry go hand in hand, but not for other fields. There isn't always a geometric relationship.

Matrices are a notation normally introduced as part of linear algebra, for solving a system of equations. The book I first learned linear algebra from is "Advanced Engineering Mathematics" which doesn't cover anything geometric in the entire first chapter on linear algebra (which usually takes 1 quarter in college to cover). It mentions linear transformations, but only algebraically. What it does talk about is Gauss-Jordan elimination, augmented matrices, homogeneity, inverses, determinants, eigenvalues, unitary and skew-Hermitian matrices.

Most of that stuff isn't geometric, and it seems like it wouldn't be hard for lots of people to have been exposed to the algebra without learning the geometric interpretations.

Everything going on in neural networks today is built on matrices and matrix operations that don't have a naturally geometric point of view.


>Why do you assume a geometric interpretation is a given?

Because that's how matrices are introduced first, in high school mathematics?

>There isn't always a geometric relationship.

Not sure about that to be honest. Seems like there is always a yin-yang between algebra and geometry -- two sides of the same coin.

>The book I first learned linear algebra from is "Advanced Engineering Mathematics"

I pulled my copy off the shelf in my office for the first time in 20+ years and took a look:

I see many references to geometry: Example 5. Linear Transformations ... the matrices ... represent a reflection in the line... Exercise 19 (Rotation) Show that the linear transformation ... with matrix... is a counterclockwise rotation ... and so on.

Actually, I realized I was reading the chapter "Linear Algebra Part II, Matrices...", while you cited "the first chapter on linear algebra", so I read that chapter (Vectors) and found on page 276: "Geometric Representation", with a nice figure titled "Geometrical Interpretation of a scalar triple product".

So hmm. Perhaps we're looking at different books? I found that the current (10th) edition is available in Safari Books. Interestingly, they have removed some of the geometric references that are present in my 1983 version. However, they also included this:

"(d) Computer graphics. To visualize a three-dimensional object with plane faces (e.g., a cube), we may store the position vectors of the vertices with respect to a suitable x1x2x3-coordinate system (and a list of the connecting edges) and then obtain a two-dimensional image on a video screen by projecting the object onto a coordinate plane, for instance, onto the x1x2-plane by setting x3 = 0. To change the appearance of the image, we can impose a linear transformation on the position vectors stored. Show that a diagonal matrix D with main diagonal entries 3, 1, image gives from an x = [xj] the new position vector y = Dx, where y1 = 3x1 (stretch in the x1-direction by a factor 3), y2 = x2 (unchanged), image (contraction in the x3-direction). What effect would a scalar matrix have?

(e) Rotations in space. Explain y = Ax geometrically when A is one o..."

So to be honest I'm not seeing evidence to support your assertion that the geometric interpretation of matrices is absent from that textbook.


You said you were genuinely curious how people could learn matrices without the geometry. I (and others here) gave multiple examples of how that could happen, but you're arguing. Are you really curious, or was your question rhetorical? Do you not believe it's possible that anyone could have learned math in a different order than you did?

Despite your objection, it's a fact that matrices for systems of equations and neural networks both do not have a naturally geometric interpretation. Nor do many of the applications used in my math book: electrical networks, traffic flow, models of commodity markets, etc. You are free to ignore that, but if you do, it might not help you understand why some people learned matrices without a geometric context.

You are looking at a different book than I am. I have 7th edition Kreyszig. I didn't say geometry was absent from the book, I said it was absent from the first quarter/semester of linear algebra in my book. There is geometry in the book, but many students that went through college the same time I did could easily have taken the minimum requirement, one quarter of math and stopped there.

I don't know about yours, but my high school math introduced solving systems of linear equations. It did not cover orthonormal bases or rotations or affine transforms or homogeneous coordinates.


Hehe, you'd be surprised. I took an honors linear algebra course in my first year of university, and abstraction was the word of the day. Everything we did was for abstract vector spaces over arbitrary fields. We didn't look at the determinant as the signed volume of an n-cube, for example, but rather as a homomorphism from the group of matrices to the base field under multiplication. The cross product was never mentioned because it didn't generalize to arbitrary dimensions. It was good preparation for further abstract algebra courses, but not so much for others, like differential geometry, which rely heavily on geometric intuition.


I first came across them in a book about discrete mathematics, which was more about the algebraic structures matrices can form than what matrices themselves can represent.


You could have learned them from a very algebraic source, I suppose.


Not sure how that could happen without going through high school math, but I suppose anything is possible.


Linear algebra is not offered in most US high schools.


Matrices and matrix arithmetic, including determinants, are covered in "Algebra 2" here in a small town in a flyover state.


How disgraceful, conflating matrices with linear and affine transformations.


Way too often, intuition (or trying to make things clear on an intuitive level) interferes with and/or severely impedes and slows down gaining the actually useful skills. Maybe not in applications to computer graphics, but in general there is so much to matrix analysis [1] that trying to translate everything into geometric images would make learning it at a practically useful level all but impossible.

Rote learning, like the mindless memorizing of the multiplication table, is an extremely important component of education that is to reach any degree of sophistication or usefulness.

Also, being not (too firmly) attached to a particular interpretation of a formal system leaves open the door to potentially discovering new applications.

[1] See, for example, a book by Horn and Johnson.


I just love how a few posters here massively upvote clearly simple comments such as "I used a matrix in 3D and then used them to rotate points in 3D space" against somebody who may not express it clearly, yet still makes the point that matrices are way more abstract than people expect them to be, and they downvote because it's harder to understand.

This is such a horrible form of childish anti-intellectualism.


It's almost as if there is an isomorphism between matrices and linear transformations.

Actually, not almost. The author clearly explains this isomorphism in his introduction to matrices.



