This book spends three chapters on the mechanics of matrix operations before defining a vector space, and then it jumps into determinants.
IMO this is the wrong way to understand linear algebra, and it's the typical path of a lot of introductory textbooks. A better text: Axler's "Linear Algebra Done Right"
As a math professor, I think you can get by without defining vector spaces -- I think it is a little bit difficult to come up with examples, other than R^n or C^n, which seem motivated and interesting to the beginner. The truly important example (IMHO) is R^n without a choice of basis -- but I think this can only be well motivated after you've seen a lot of linear algebra, not before.
The mechanics of matrix operations are not pleasant to teach; they make the subject seem like a bunch of contrived and confusing examples. Same for the "row echelon" stuff. Alright, you now have an algorithm for solving systems of equations... but presenting linear algebra as an algorithm sells it short.
What is really important in my view is the geometry of the subject, which is already very interesting in two dimensions. Problem: Here is a linear transformation, given as a 2x2 matrix. Draw a picture which illustrates what this does to the plane. If you ask me, this is much more important than most of the crap that gets taught and tested in most courses in linear algebra.
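For instance, here's a minimal sketch (my own, in numpy/matplotlib, with an arbitrary example matrix) of the kind of picture I mean: draw the unit square and where the matrix sends it, plus where the basis vectors land (the columns of the matrix).

    import numpy as np
    import matplotlib.pyplot as plt

    A = np.array([[2.0, 1.0],
                  [0.0, 1.0]])              # arbitrary example transformation

    # Corners of the unit square, as columns, traversed in order.
    square = np.array([[0, 1, 1, 0, 0],
                       [0, 0, 1, 1, 0]], dtype=float)
    image = A @ square                       # where the square ends up

    plt.plot(square[0], square[1], "b-", label="unit square")
    plt.plot(image[0], image[1], "r-", label="A @ square")
    # The standard basis vectors land on the columns of A.
    plt.quiver([0, 0], [0, 0], A[0], A[1],
               angles="xy", scale_units="xy", scale=1, color="r")
    plt.axis("equal")
    plt.legend()
    plt.show()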
This is how I learned linear algebra: as a series of algorithmic steps to perform to reach the solution. Only a handful of homework examples even hinted that matrices were powerful and useful. The semester before that class, I learned how matrices can be used to represent physical locations and how multiplying these matrices can move you along the coordinate frames for the joints in a robot arm all the way to the end effector. In fact, you can take the final transformation matrix you get after multiplying each joint together and use it to translate points in your native/world coordinate system into coordinates in your tool coordinate system. I later learned that matrices can be used for quantum computation/physics and all sorts of other things like weather prediction, physics, graph theory, Bayesian networks, and so on.
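To illustrate what I mean (a toy numpy sketch, not anything from the actual course; the link lengths and joint angles are made up): each joint contributes a homogeneous transformation matrix, multiplying them gives the tool frame in world coordinates, and the inverse of that product takes world points into tool coordinates.

    import numpy as np

    def joint_transform(theta, link_length):
        """Homogeneous 2D transform: rotate by theta, then move out along the link."""
        c, s = np.cos(theta), np.sin(theta)
        return np.array([[c, -s, link_length * c],
                         [s,  c, link_length * s],
                         [0,  0, 1.0]])

    # Made-up two-link planar arm: joint angles and link lengths.
    T1 = joint_transform(np.pi / 4, 1.0)
    T2 = joint_transform(-np.pi / 6, 0.5)
    T_world_tool = T1 @ T2                   # tool frame expressed in world coordinates

    p_tool = np.array([0.1, 0.0, 1.0])       # a point in the tool frame (homogeneous coords)
    p_world = T_world_tool @ p_tool          # the same point in world coordinates
    p_back = np.linalg.inv(T_world_tool) @ p_world   # world -> tool goes through the inverse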
Now that I'm out of school, I really want to learn how to properly use matrix algebra. I don't care about HOW to find a determinant or HOW LU decomposition works, as I've already done those, but rather why I would want to perform that operation on a matrix in the first place.
The ability of matrices to model real processes is something that really fascinates me.
Do you have any other good reading material for someone like myself?
> Do you have any other good reading material for someone like myself?
I'm afraid I don't personally (I'm into abstract and theoretical math, which I'm guessing is not your cup of tea). But I should dig something up before I teach the subject again, so I would be as interested as you to read any replies.
Gilbert Strang is an applied mathematician and also the author of a popular linear algebra book; I would guess that his books might interest you. But this is speculation, I haven't read any of them myself.
So what is a good Linear Algebra textbook from that perspective? Assume someone has completed a course in Axiomatic Set Theory and is conversant with proofs etc, but now wants to get into Linear Algebra. Any textbook suggestions?
I'm guessing "Linear Algebra Done Right" by Axler; I haven't read it yet but I heard good things about it and currently have it on order from Amazon. Hoffman + Kunze is the classic. For now, I don't know a book in the subject I really like, but I haven't looked very hard yet.
One way would be to take a course on a topic that uses linear algebra heavily, rather than taking a generic linear algebra course. Since all the ideas from linear algebra are then being used to actually do something, you'll get at least one example for why you would use whatever it is.
I would suggest trying these videos: http://www.stanford.edu/~boyd/ee263/videos.html. The prerequisites are very low and a main focus is on interpreting the abstract concepts in applications.
I'll second the Strang recommendation. I've recently read the chapter in Linear Algebra and Its Applications about how the FFT algorithm is in part a matrix calculation! Fascinating (and baffling) stuff.
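If anyone wants to see that concretely, here's a tiny numpy sketch (mine, not Strang's presentation): the DFT of a vector is literally a matrix-vector product with the Fourier matrix, and the FFT is just a fast way of evaluating that same product.

    import numpy as np

    n = 8
    x = np.random.default_rng(1).normal(size=n)

    # Fourier matrix: F[j, k] = exp(-2*pi*i*j*k / n)
    j, k = np.meshgrid(np.arange(n), np.arange(n), indexing="ij")
    F = np.exp(-2j * np.pi * j * k / n)

    print(np.allclose(F @ x, np.fft.fft(x)))   # True: the FFT computes F @ x, just faster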
I think it is a little bit difficult to come up with examples, other than R^n or C^n, which seem motivated and interesting to the beginner
My introductory linear algebra class also used polynomials over R, which led to examples like using projection to construct a polynomial approximation of a non-polynomial function.
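Something like this, I assume (a rough sketch of that kind of exercise; exp(x) on [0, 1] and degree-2 polynomials are arbitrary choices of mine): project the function onto the polynomial subspace by solving the normal equations under the L2 inner product.

    import numpy as np
    from scipy.integrate import quad

    # Best degree-2 polynomial approximation of exp(x) on [0, 1] in the L2 sense:
    # solve G c = b, where G[i][j] = <x^i, x^j> and b[i] = <x^i, exp(x)>.
    deg = 2
    G = np.array([[1.0 / (i + j + 1) for j in range(deg + 1)] for i in range(deg + 1)])
    b = np.array([quad(lambda x, i=i: x**i * np.exp(x), 0, 1)[0] for i in range(deg + 1)])
    c = np.linalg.solve(G, b)                  # coefficients of the projected polynomial

    xs = np.linspace(0, 1, 5)
    approx = sum(c[i] * xs**i for i in range(deg + 1))
    print(np.max(np.abs(approx - np.exp(xs))))  # small approximation error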
I completely agree. In college, my first set of Linear Algebra courses was taught by the Math department and was completely algebra-based.
Later on, the engineering LA courses, where they stressed the geometry of all the operations, really cemented the concepts and simplicity of LA in my mind.
A million times this. It's not trivial to geometrically understand linear algebra without getting into topology if you go "determinants first". An algebraic understanding of linear algebra is not the most useful one if you are learning it to go further into topics such as machine learning or signal processing.
PS: If you really do like or need the algebraic approach, Shilov's book is the way to go after this course... I think the first chapter is about determinants and he builds the rest of the book from there. It also includes a chapter on tensor algebra.
This is the best resource I've used for higher level math. I'm not sure why more people don't know about it because everything is VERY in depth with great examples and explanations.
I've used Paul's Online Notes for all my higher level math classes, starting from Calculus 2. It has been more useful to me than even Khan Academy. I like the little tricks and mnemonics that he teaches, especially in Linear Algebra. Example here: http://tutorial.math.lamar.edu/Classes/LinAlg/DeterminantRev.... Much easier way of calculating determinants.
Another resource I've used is Patrick JMT's excellent videos found at: http://patrickjmt.com. He really goes over problems slowly. Best math teacher I've ever had.
Oooh, very cool resource. I'd been looking for something like that for a while actually. Something that gives a little bit of structure in terms of what sequence to do things in... that's cool.
Linear Algebra is a basis for so much in Machine Learning, and this is a free, open textbook that takes the beginner through the basics. I've not read it myself - I'm not the intended audience - but I've seen several reviews from people whose opinions I trust, and it seems to be well respected.
Honestly, I felt like linear algebra was the low point in my math career -- even behind axiomatic set theory. I still have nightmares about determining whether a set of vectors is linearly independent.
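(These days I'd just phrase it as a rank question -- a numpy sketch with made-up vectors:)

    import numpy as np

    # Stack the vectors as columns; they are linearly independent exactly when
    # the rank of the matrix equals the number of vectors.
    vectors = np.array([[1.0, 0.0, 1.0],      # v1
                        [0.0, 1.0, 1.0],      # v2
                        [2.0, 3.0, 5.0]]).T   # v3 = 2*v1 + 3*v2, so they are dependent
    print(np.linalg.matrix_rank(vectors) == vectors.shape[1])   # False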
Linear algebra is frequently taught very poorly. I think most university courses in the subject are kind of watered down and reduced to algorithms you can memorize -- which, ironically, makes them tough going for the strong students.
What impresses me is the accessibility of this textbook. Not only is it available online (via XML and MathML) but you can download the TeX source and multiple PDFs optimized for printing or on-screen viewing [1].
Are there any linear algebra books which use machine learning as the motivating example? Or a book which teaches linear algebra and machine learning together?
Funny, I just spent last night implementing a "principal components analysis" using numeric.js, for a character recognition project. PCA / SVD seems like a great motivating topic.
BTW, I couldn't use singular value decomposition in numeric.js for the PCA because the method, numeric.svd, uses the "thin" algorithm and throws an error if there are more columns than rows. I calculate way more features (50-200+ columns) than I have training samples (rows; 10-30 written manually). Without SVD I had to use the "covariance method", which I guess can sometimes present approximation issues, but it seems to be working well for me.
The purpose of the PCA is dimensionality reduction (google "curse of dimensionality"). I had used Mahalanobis distance as a p-value score to detect outliers (p-val < 0.05), and it worked well when there were only 6 features. The curse of dimensionality makes MD useless when there are 50 or 100 features, and PCA reduces them to 3-10 features which carry the most information, with the rest approaching zero. So I do MD on the projected (reduced) features, and it's working great.
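For anyone curious, here's a rough numpy/scipy sketch of that pipeline (synthetic data and made-up sizes, not my actual project code): covariance-method PCA when there are more features than samples, then Mahalanobis distance with a chi-squared p-value on the reduced features.

    import numpy as np
    from scipy.stats import chi2

    rng = np.random.default_rng(0)
    X = rng.normal(size=(20, 100))           # 20 samples, 100 features (made-up sizes)

    # Covariance-method PCA: eigendecomposition of the feature covariance matrix.
    Xc = X - X.mean(axis=0)                  # center each feature
    C = np.cov(Xc, rowvar=False)             # 100 x 100 covariance matrix
    evals, evecs = np.linalg.eigh(C)         # symmetric matrix -> real eigenpairs (ascending)
    order = np.argsort(evals)[::-1]          # strongest components first
    evals, evecs = evals[order], evecs[:, order]

    k = 5                                    # keep the k components carrying the most variance
    Z = Xc @ evecs[:, :k]                    # samples projected into the reduced space

    # Mahalanobis distance in the reduced space: the components are uncorrelated,
    # so the covariance of Z is (approximately) diagonal with the kept eigenvalues.
    md2 = np.sum(Z**2 / evals[:k], axis=1)
    pvals = chi2.sf(md2, df=k)               # flag outliers at p < 0.05
    outliers = np.where(pvals < 0.05)[0]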
If I do it all over again I might try a "one-class SVM" (which, sadly, I had not heard of until only recently and very late in the project). SVMs are non-linear, like most machine learning algorithms, but linear PCA can still be used to complement other methods, e.g. to do pre-processing before feeding data to a neural network.[1]
For deeper linear methods, check out the PCA-related extensions like Fisher Linear Discriminants or Projection Pursuit.
1. "Many neural networks are susceptible to the Curse of Dimensionality though less so than the statistical techniques. The neural networks attempt to fit a surface over the data and there must be sufficient data density to discern the surface. Most neural networks automatically reduce the input features to focus on the key attributes. But nevertheless, they still benefit from feature selection or lower dimensionality data projections."
Just wanted to share something I ran into the other day in the course of research on PCA, a very cool motivating example, sports-related.[1] On page 13, they take some NBA player stats and graph the projection of the third PC (y-axis) against the second PC (x-axis). The third PC is a negative correlation of rebounds with assists and steals. Turns out to sort players roughly by height, placing Karl Malone (6'8") and Mugsy Bogues (5'3") at opposite extremes.
1. Faloutsos et al, Quantifiable Data Mining Using Principal Component Analysis, 1997
Anyone have any recommendations for linear algebra courses through Coursera or similar sites? It is a subject I've always been fascinated by but my self-directed learning endeavours have not yielded much usable knowledge.