Beyond any sort of trivial example, I found I lacked the mathematical and statistical knowledge not only to interpret the results in a relatively unbiased and error-free way, but also to know "what to do next."
The popular MOOCs don't take you far enough to start doing serious machine learning, but you don't need a PhD to be ready to solve those problems.
It takes work. Lots of work. Re-learn linear algebra until you know why "eigenvectors" are so important. Know what the most important matrix factorizations (LU, QR, SVD, Eigen, Cholesky) do. Read the papers until the math becomes "no big deal". Pick up a probability textbook and read the whole thing; also, get a working knowledge of real analysis. It won't happen quickly.
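As a rough illustration of what two of those pieces "do", here is a minimal pure-Python sketch: a Cholesky factorization (which splits a symmetric positive-definite matrix into a lower-triangular factor times its transpose) and power iteration (which shows that repeatedly multiplying by a matrix pulls any starting vector toward the dominant eigenvector). The 2x2 matrix `A` and the helper names are made up for the example; in practice you'd reach for a numerical library rather than hand-rolled loops.

```python
import math

def cholesky(a):
    """Return lower-triangular L with L * L^T == a (a must be symmetric positive-definite)."""
    n = len(a)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            # Dot product of the already-computed parts of rows i and j.
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(a[i][i] - s)
            else:
                L[i][j] = (a[i][j] - s) / L[j][j]
    return L

def matvec(a, v):
    """Multiply matrix a (list of rows) by vector v."""
    return [sum(x * y for x, y in zip(row, v)) for row in a]

def power_iteration(a, steps=200):
    """Approximate the dominant eigenvalue and eigenvector by repeated multiplication."""
    v = [1.0] * len(a)
    for _ in range(steps):
        w = matvec(a, v)
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]  # renormalize so the vector doesn't blow up
    # Rayleigh quotient: v . (A v) for a unit vector v.
    eigenvalue = sum(x * y for x, y in zip(v, matvec(a, v)))
    return eigenvalue, v

# Hypothetical symmetric positive-definite example matrix.
A = [[4.0, 2.0], [2.0, 3.0]]
L = cholesky(A)
lam, v = power_iteration(A)
```

For this `A`, the characteristic polynomial is x^2 - 7x + 8, so the dominant eigenvalue is (7 + sqrt(17)) / 2, and `lam` should land very close to that; `matvec(A, v)` should be approximately `lam` times `v`, which is the whole point of an eigenvector.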
The PhD is some classes, plus 3-7 years of focused work. Some of that's compressible and unnecessary to becoming a data scientist. Some of it isn't. The Coursera courses are great for getting you started; they're entry-level college courses, and if you read the papers and the seminal textbooks (e.g. Elements of Statistical Learning by Hastie et al.) you can get into the intermediate territory in a couple of years or so. It's not easy, but it can definitely be done. Getting to the expert level, I think, just requires real-world experience on real-world problems... but, one hopes, you can start attacking such problems once you're at the intermediate level.
What you are describing as a background is all part of a "normal" math/CS undergrad education (at least in Germany where I studied).
In terms of mathematical difficulty, Elements etc. (but also current research papers) are all readable by anyone with a solid understanding of undergraduate mathematics (which is essentially a decent Linear Algebra course, multivariate analysis, a probability course, and a numerical computing course).
I think the reason why employers look for candidates with a PhD is that too many people "scrape by" when getting their CS degree -- i.e. they somehow fulfilled the required coursework, and somehow got their degree. The PhD requirement is essentially a bureaucratic substitute for answering the question "has this person understood math in sufficient depth to be able to do independent work with it?"
Thanks muraiki and michaelochurch. I faced a similar frustration. The MOOCs often seem to teach you just enough to do something like formula substitution; everything starts to crumble when you move to data that's significantly different. So I have begun from the bottom, starting with MIT's Linear Algebra and Harvard's Statistics 110. Your comments have validated my journey, though it is going to be a long one.
Thank you for your advice. In my case it's not "re-learn linear algebra" but "learn linear algebra... after first learning calculus and how to understand/write a proof." :) At 32 I'm not certain if this is a worthwhile way for me to go...
That being said, I haven't given up completely. I'm starting to read "The Haskell Road to Logic, Maths, and Programming" in the hopes of finally being able to grok proofs. At the very least, I feel that learning more math can only help me as a developer.