Linear Algebra for AI (github.com/fastai)
318 points by stablemap on Sept 16, 2017 | 37 comments



"All the Linear Algebra You Need".

The title is misleading. In general, I like an applications-first approach, but I have a hard time liking a tutorial with a title like that which then proceeds to say things like

>Neural networks consist of linear layers alternating with non-linear layers.

without providing a proper definition of, or intuition for, what is linear and what is not. In the next step they construct NN layers out of ReLUs without saying what a rectified linear unit is, only barely hinting at what it is supposed to do ("non-linearity").
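
(For anyone reading along: a rectified linear unit is just the elementwise function max(0, x). A minimal numpy sketch of my own, not something from the tutorial:)

  import numpy as np

  def relu(x):
      # elementwise max(0, x): the kink at 0 is what makes it non-linear
      return np.maximum(0, x)

  print(relu(np.array([-2., 0., 3.])))   # [0. 0. 3.]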

The article is not worthless; it's a nice tutorial on building an NN classifier for MNIST, but don't expect full mastery of the mathematics relevant to understanding NNs after reading this tutorial.

Chapter 2 of the book by Goodfellow et al. is a serviceable, concise summary of the relevant concepts, but I don't think that chapter alone is good primary learning material if you are not already familiar with the subject. For that I'd recommend reading a proper undergrad-level textbook (and doing at least some of the problem sets [1]), one example: http://math.mit.edu/~gs/linearalgebra/ , and then continuing with e.g. the rest of Goodfellow's book.

[1] There's no royal road into learning mathematics. Or for that matter, learning anything.


These are the notes for a 40-minute talk Rachel will give at the O'Reilly AI conference. It was originally meant to be a longer tutorial, so the scope had to be cut down significantly whilst the title remained. There are lots of links in the notebook to additional resources with more background info.

Having said that, there really isn't much more linear algebra you need to implement neural networks from scratch. You'll need convolutions of course, although that's not too different from what's shown here.
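
(A rough illustration of that point, my own sketch rather than anything from the notebook: a 1-D "valid" convolution is itself a linear operation, so it can be written as an ordinary matrix multiply.)

  import numpy as np

  x = np.array([1., 2., 3., 4., 5.])
  k = np.array([1., 0., -1.])

  # build the banded matrix whose rows are shifted copies of the flipped kernel
  rows = len(x) - len(k) + 1
  C = np.zeros((rows, len(x)))
  for i in range(rows):
      C[i, i:i + len(k)] = k[::-1]          # np.convolve flips the kernel

  print(C @ x)                              # [2. 2. 2.]
  print(np.convolve(x, k, mode='valid'))    # same result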

For those interested in much more detail, Rachel has a full computational linear algebra course online http://www.fast.ai/2017/07/17/num-lin-alg/ . Most of that isn't needed for most deep learning, however.


Hrm. I find a lot of stuff on linear algebra not so intuitive or motivating. It's a bit harder to get through. A big motivator is the sheer scale of application, along with calculus. Yeah, it's a must for AI, but linear algebra underpins much, much more and gives you awesome problem-solving tools in general.

As for intuition, a better approach may be to get 3 good books and cycle through them. Where one doesn't feel so intuitive at a given point, the other 2 might. They will explain things in a different way with different examples. Connecting the ideas across 3 sources will help it sink in.

For a light intuitive introduction, give this a try:

https://betterexplained.com/articles/linear-algebra-guide/

Maybe it's a little too light, but the idea is important: voraciously grasp the ideas intuitively. If no intuition is forthcoming, try to build one.

Another linear algebra book, by Hefferon, was a 'third' book:

http://joshua.smcvt.edu/linearalgebra/

It's free too. I think someone else has mentioned No Bull Linear Algebra. That's a cool book too.

I had to put my studies aside a few months ago, but I made more progress this way than by just following one book.


More linear algebra stuff to read with your weekend morning coffee: https://minireference.com/static/tutorials/linear_algebra_in...

See also Section VI in this SymPy tutorial https://minireference.com/static/tutorials/sympy_tutorial.pd... or LA topics covered with numpy http://ml-cheatsheet.readthedocs.io/en/latest/linear_algebra...


Is there any book which actually explains where matrices and their rules come from, instead of throwing matrix multiplication rules at you dogmatically so you blindly follow them like a mindless robot?

I know there are lectures on YouTube from 3Blue1Brown:

https://www.youtube.com/watch?v=kjBOesZCoqc&list=PLZHQObOWTQ...

So I want a book which covers linear algebra in a similar manner.


Matrices are not fundamental or interesting by themselves as just Excel-like grids of numbers. The reason we care about them is because they are a convenient notation for a certain class of functions.

I have to catch a flight so I don't have time to explain this fully, but the key points are:

1/ A "linear function" is a function where each variable in the output is "linear" in all the variables of the input (i.e., a sum of constant multiples of the input variables). e.g. f(x,y) = (x + 3y, y - 2x) is a linear function, but g(x, y) = (x^2, sin(y)) is not.

2/ All linear functions can be represented by a matrix. The `f` I mentioned above corresponds to the matrix:

  [ 1 3
   -2 1 ]

3/ The rules of matrix multiplication are defined so that multiplying by the matrix of a linear function corresponds to applying that function.

For example, again using the definitions above:

  f(7, 8) = (31, -6)

And notice that we get the same thing when we do matrix multiplication:

  [ 1 3   * [ 7      = [ 31
   -2 1]      8 ]        -6 ]

4/ Matrix multiplication also corresponds to function composition. If `f` is as defined above, and `h` is defined by h(x, y) = (-3y, 4x + y), then the matrix for h is

  [ 0 -3
    4  1 ]

and the function `f ∘ h` you get by applying `h` and then `f` is (you can check this...)

  f ∘ h(x, y) = (12x, 4x + 7y)

The matrix for this function happens to be

  [ 12 0
     4 7 ]

But, lo and behold, this matches matrix multiplication:

  [ 1 3    * [ 0 -3    =  [ 12 0
   -2 1 ]      4  1 ]        4 7 ]

5/ Why do we care about linear functions? Well, linear functions are interesting for a lot of reasons, but one in particular is that (differential) calculus is all about approximating arbitrary differentiable functions by linear ones. So you might have some weird function but, if it's differentiable, you know that "locally" it is approximated by some (constant plus a) linear function.
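
(If it helps, here's a quick numpy sanity check of points 2-4 above; my own sketch, nothing fancy:)

  import numpy as np

  F = np.array([[1, 3],
                [-2, 1]])    # matrix of f(x, y) = (x + 3y, y - 2x)
  H = np.array([[0, -3],
                [4, 1]])     # matrix of h(x, y) = (-3y, 4x + y)

  print(F @ np.array([7, 8]))   # [31 -6], i.e. f(7, 8)
  print(F @ H)                  # [[12  0]
                                #  [ 4  7]], i.e. the matrix of f ∘ h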



"Linear Algebra Done Right" is a fine book but its enduring popularity leads people to recommend it as a universal default answer.

The parent asked if there was a LA book that covered the material in the same style as 3Blue1Brown's videos. If that's the criterion, Sheldon Axler's book isn't the best fit. One can compare a sample chapter to the YouTube videos and see that they use different pedagogy:

http://linear.axler.net/Eigenvalues.pdf


I second this. Linear Algebra Done Right is an awesome book. It also comes with a helpful selection of exercises after each chapter with detailed answers available on the website, which is great for self-learners. If you are a student and your Uni has a Springer subscription, you might be able to get the PDF for free.


For intuition about linearity, check out this intro jupyter notebook: https://github.com/minireference/noBSLAnotebooks/blob/master... and the associated video https://www.youtube.com/watch?v=WfrwVMTgrfc


The No Bullshit Guide to Linear Algebra https://gumroad.com/l/noBSLA#


I like the look of the presentation and the active teaching method adopted. I also like the way Savov gives out the definitions and the facts as a pdf but keeps the exercises, investigations and examples for the paid version - the exercises are the value added in Maths in my limited experience.

I just bought the paper version off Lulu (I like being able to read and scribble and then go on the computer for the computational exercises). And now to set up SymPy on Debian...


There was a web book I found a while ago that built up some sort of motivation for linear algebra. Unfortunately I don't remember what it was or the title.

Edit: found it, https://graphicallinearalgebra.net

Ymmv. Matrix multiplication is defined the way it is imo because it has interesting properties that way. Not very satisfying though.


You might find my Clojure Linear Algebra Refresher helpful.

http://dragan.rocks/articles/17/Clojure-Linear-Algebra-Refre...

This is the link for the first part. You'll find further articles there.

It walks you through the code, explains things briefly, and points you to the exact places in a good Linear Algebra with Applications textbook where everything is explained in detail.


My linear algebra teacher taught it in a visual and proof-focused way, and it was amazing. He also tied it upwards into abstract algebra w.r.t. vector spaces. He taught my abstract algebra course as well, where he tied things back into linear algebra. That was an amazing set of classes...


All the linear algebra you need for AI: http://cs229.stanford.edu/section/cs229-linalg.pdf


I found that chapter 2 of the Deep Learning book by Ian Goodfellow, Yoshua Bengio and Aaron Courville is quite good.

http://www.deeplearningbook.org/contents/linear_algebra.html


Thank you for this! I've been re-learning Linear Algebra since my question[1] was answered on an earlier HN thread[2] on fast.ai. This will definitely help.

[1] https://news.ycombinator.com/item?id=14878199

[2] https://news.ycombinator.com/item?id=14877920


This left me with a lot of questions; it could definitely use more explanation between the code examples. I had to do a lot of "reading between the lines".

I've been reading https://www.manning.com/books/grokking-deep-learning and have been liking it better than this tutorial.


All the linear algebra that is needed for AI is matrix multiplication and broadcasting?

Am I the only one who is baffled by this?


If AI == neural networks only, maybe matrix knowledge is enough, but when you start to include NLP, expert systems, search algorithms such as A*, minimax, etc. in the AI category, you'll have to know more mathematics.

AI is much more than neural networks!


Even for NNs alone, those few operations are not enough. Not only that, but even to understand what the operations I do need actually do, I need far more background knowledge of linear algebra.


Well, if you want to simplify even more, then all you really need is optimisation and a sufficiently broad class of functions.


I think they forgot to prepend "to convince the math-phobic you know what you're doing"



I wonder why there's so much more emphasis on Linear Algebra over Calculus (I've seen a number of courses teaching LA to complement DL courses), given that without Calculus, it's hard to understand optimization and backpropagation. Rote memorization might help for copying the same network over and over but isn't enough when you have to customize it.


How do you know a stationary point is a local optimum? Eigenvalues of the Hessian matrix! Even though vector calculus is taught without assuming linear algebra, a lot of the material is coupled. Further, many of the "first steps" machine learning ideas are essentially linear algebra problems (e.g., linear regression, PCA, etc).
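
(A tiny worked example of that first point, using a made-up function f(x, y) = x^2 + 3y^2 just to illustrate:)

  import numpy as np

  # Hessian of f(x, y) = x^2 + 3*y^2 at the stationary point (0, 0)
  H = np.array([[2., 0.],
                [0., 6.]])

  eigvals = np.linalg.eigvalsh(H)   # symmetric matrix, so use eigvalsh
  print(eigvals)                    # [2. 6.]
  print(np.all(eigvals > 0))        # True -> positive definite -> local minimum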


> The purpose of this notebook is to serve as an explanation of two crucial linear algebra operations used when coding neural networks: matrix multiplication and broadcasting.

Um... no. While I like Rachel, and she's really smart and all that, this is a vaaast oversimplification.
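
(For anyone unfamiliar with the second term, a minimal numpy sketch of what those two operations look like in a single linear layer; my own illustration, not taken from the notebook:)

  import numpy as np

  X = np.random.randn(4, 3)     # batch of 4 inputs, 3 features each
  W = np.random.randn(3, 2)     # weights of a linear layer
  b = np.array([0.1, -0.2])     # bias, shape (2,)

  out = X @ W + b               # matmul: (4, 3) @ (3, 2) -> (4, 2);
                                # broadcasting adds b to every row
  print(out.shape)              # (4, 2)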


I have yet to go through the tutorial. LA is one of the areas where I am trying to improve... The Coding the Matrix course/book is well written for those looking for a good, intuitive approach to learning.


Right now, sure. But give it five years and you'll no doubt also need at least a year's worth of undergraduate (pure) math.


Is there a website that could interpret this... IPython notebook, I think it is? I'd love to read it on mobile.



GitHub does it. Try requesting the desktop version.


The first line of fastai imports is a big "Fuck you" to usability. Don't import *.
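
(Generic illustration of the point, using numpy rather than the actual fastai imports:)

  from numpy import *       # which names did this just pull in? where does `array` come from?

  import numpy as np        # explicit import: the origin of every name stays obvious
  x = np.array([1, 2, 3])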


Thanks for sharing!


I may be biased because I have a background in mathematics, but after working with TensorFlow for about 6 months now, I don't think you really need to understand linear algebra or multivariate calculus to work with neural nets, unless you're trying to implement your own engine.

Edit: I don't mean to discourage anyone from learning, as an understanding of the relatively simple mathematics behind NNs may afford the user additional intuition about the behavior of neural nets, but it appears that one can treat neural nets almost like black boxes, given a suitable engine to work with like TF.

As an aside, TF is a pretty magnificent library. It pretty much works out of the box, and in addition to Python and C++ bindings, there appears to be an unofficial port to C#, although I haven't tried it yet. I strongly recommend the TensorFlow tutorials for anyone interested in experimenting.


Yeah, you don't need to be an automotive engineer to drive a car.

Most 'work' in AI seems to be using python to shovel data into libraries anyways.



