Hacker News
Life Lessons from Machine Learning (outlookzen.wordpress.com)
121 points by nkurz on March 16, 2015 | 33 comments



I'm worried about living in a society where we've become overly dependent on statistical methods for important decision making, especially when we aren't allowed to question their integrity, precision, and limitations. As a scientist I know the reality is that a lot of terrible data is being produced through various methods, and I've met a lot of PhD students in machine learning who argue against starting from scratch even when it's quite apparent their prior abstractions are incorrect. The reality is that ML researchers are making a lot of promises, and being dishonest about their competency and ability to achieve those results. This is absolutely not unique to ML and plagues a lot of other fields as well. We really need more ML researchers to be actively critical of research being performed in the field if ML is going to achieve its full potential.


>I'm worried about living in a society where we've become overly dependent upon statistical methods for important decision making

That's very unlikely as people are incredibly biased against algorithms: http://lesswrong.com/r/discussion/lw/lsc/link_algorithm_aver...

There are many areas where even really simple algorithms have been shown to outperform humans for decades: http://lesswrong.com/lw/3gv/statistical_prediction_rules_out...

Even when humans are given the predictions of the algorithm and allowed to take them into account, they still do worse than the algorithm on its own. Sure, the human might fix an obvious error of the algorithm in one case, but then makes ten other, worse errors elsewhere.

Despite that, organizations are very slow and hesitant to adopt them. There are massive regulatory and liability issues in many areas. And people are just generally biased against them and scared of them. Even in tech-friendly places like Hacker News, your comment is at the top of the thread. I remember a post a while ago about using machine learning to detect fraud in loans in the third world, and half the comments were about how evil, racist, and unfair such an algorithm would be, not realizing that humans are all of those things.

People are very overconfident in human ability despite overwhelming evidence we suck at predicting things and doing anything statistics related. Human error is just ignored or seen as an inevitable fact of life.


"People are very overconfident in human ability despite overwhelming evidence we suck at predicting things and doing anything statistics related. Human error is just ignored or seen as an inevitable fact of life."

Programs don't program themselves. Algorithmic biases often reflect human biases. If we want people to accept technology and give us opportunities to pursue our visions of what technology can offer society, we need to be cognizant of ethical and moral challenges, especially when there is so much at stake. Yes, there are fields with regulatory and liability issues, but I'm more worried about the fields where there isn't as much oversight and transparency.


> Algorithmic biases often reflect human biases

I've been doing this for a while and I've literally never met a human who told an algorithm to overweight x[23] ("good looking"), x[48] ("is white") and x[873] ("is wealthy"), for x a 1,100-dimensional feature vector.

Algorithms do have biases, but they are almost always orthogonal to the human ones. Witness, for example, all the recent "we can fool deep learning image recognition systems" papers.

http://arxiv.org/abs/1412.1897

http://arxiv.org/abs/1312.6199

At this point I'm 90% sure you are a layperson who's never actually programmed such a system.


Not to mention biases in data collection. "Garbage in, garbage out" certainly applies, and the situation probably worsens as datasets get bigger.


We should be equally worried about being overly dependent on clinical methods (human judgment) for important decision making. When looking at an algorithm, you can at least see how it works. Not so easy for "trust me, I'm a doctor, I know what I'm doing."

Check out classic papers like The Robust Beauty of Improper Linear Models or Paul Meehl's work on clinical vs. statistical expertise.


Have you heard of evidence-based medicine? It's actually a central component of clinical decision making.


I still don't trust them. Only 15% of doctors get this question right:

>1% of women at age forty who participate in routine screening have breast cancer. 80% of women with breast cancer will get positive mammographies. 9.6% of women without breast cancer will also get positive mammographies. A woman in this age group had a positive mammography in a routine screening. What is the probability that she actually has breast cancer?
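For anyone who wants to check the answer, the Bayes computation is short. A quick sketch, using only the numbers given in the question:

```python
# Bayes' theorem applied to the mammography question above.
p_cancer = 0.01              # prior: 1% of this group has breast cancer
p_pos_given_cancer = 0.80    # sensitivity: P(positive | cancer)
p_pos_given_healthy = 0.096  # false positive rate: P(positive | no cancer)

# Total probability of a positive mammography
p_pos = (p_cancer * p_pos_given_cancer
         + (1 - p_cancer) * p_pos_given_healthy)

# Posterior: P(cancer | positive)
p_cancer_given_pos = p_cancer * p_pos_given_cancer / p_pos
print(f"{p_cancer_given_pos:.1%}")  # about 7.8%
```

The common wrong answer is 80%, which confuses P(positive | cancer) with P(cancer | positive).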


Nice summary/intro to ML.

I challenge one point, near the end "Life is the greatest epistemological problem of all... We arrive into this world not knowing anything, and using only these mountains of data, we try to put them together in a way that makes this massive, immensely complex world, slightly more understandable and predictable."

Your point is of course correct, but perhaps not complete. Two observations: 1) The purpose of the nervous system is to allow more coordinated movement. 2) Rhythm is the fundamental computational element of life (including brains).

Conclusion: The only purpose of it all is to dance.


Rhythm, Resonance, and Recursion.


I'd be interested in a longer blog post on your observations, particularly the one about rhythm.


My statistics/ML life lesson: the harder a decision is, the less important it is to get the right answer.

If choice A is vastly better than choice B, you don't need much information to determine that. So if, after gathering lots of information, you still can't determine which is better, don't stress too much about choosing the best one.

A professor of mine pointed this out to me when I was stressing about which grad school to go to. He then pointed out that at my age, his life choices were CUNY (math dept) or West Point - that was a decision that would have had a far larger effect on his life than my choice of Rutgers vs Brown vs Austin.


Does this life-lesson hold in an adversarial world, where the counterparties actively try to deceive you, e.g. when making financial investments?


That's an interesting question. My entire context and thinking about this is probabilistic, not adversarial, so I have no idea.


Interesting idea, but I'm not sure if it applies to political concepts. Think of the issues the United States is split over (roughly 50/50) and how difficult it is to form an opinion on some of the topics. There are significant repercussions for choosing one side over another.


It applies to individual decisions aimed at maximizing a utility function. Groups don't have utility functions due to Arrow's Impossibility Theorem (except in special cases, e.g. cardinal preferences/prices for private goods).

But I imagine that if I did view democracy as a meaningful decision procedure, I suspect I would interpret a close vote as a belief that both outcomes are almost equally good. The two outcomes themselves are not similar, but the utility attached to each is.


Yeah, but for the first graph you don't know whether a straight line is underfitting or not.

Or whether it's really a line at all, rather than something proportional to, say, T^1.1 instead of T.

(Telling those apart is possible if you reduce the noise of the experiment, either through better procedure or more experiments.)
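A toy numerical sketch of that point, with made-up data: generate points from T^1.1, add modest noise, and a straight line still fits with R^2 near 1, so goodness of fit alone can't distinguish the two models.

```python
import numpy as np

rng = np.random.default_rng(0)
t = np.linspace(1, 10, 30)
y = t ** 1.1 + rng.normal(0, 0.3, t.size)  # true law is T^1.1, plus noise

# Fit a straight line anyway and measure how well it does
a, b = np.polyfit(t, y, 1)
residuals = y - (a * t + b)
r2 = 1 - residuals.var() / y.var()
print(f"R^2 of the linear fit: {r2:.3f}")  # very close to 1
```

Only with much less noise, or many more points, do the systematic residuals of the wrong model become visible.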

Oh and nature may have a different concept of "simple". Solving Maxwell's equations for a variety of cases is "simple" for nature. Folding a protein? Simple.

Machine learning is not actually about the scientific method (finding a model that gives a certain prediction, given as many samples as possible) but about finding an OK answer from a limited dataset.


Life lessons from machine learning? These are topics taught in an undergraduate Statistics degree.


Machine learning is mostly applied Statistics. It is a different view of the same thing, but now the machine is doing the work. This is why avoiding overfitting is important (as always).
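As a small illustration of why avoiding overfitting matters (a sketch with made-up data): a high-degree polynomial drives the training error down by memorizing noise, while its error against the underlying truth stays large.

```python
import numpy as np

rng = np.random.default_rng(1)
truth = lambda x: np.sin(2 * np.pi * x)

x_train = np.linspace(0, 1, 12)
y_train = truth(x_train) + rng.normal(0, 0.2, x_train.size)
x_test = np.linspace(0, 1, 200)
y_test = truth(x_test)  # noise-free truth for evaluation

def errors(degree):
    c = np.polyfit(x_train, y_train, degree)
    train = np.mean((np.polyval(c, x_train) - y_train) ** 2)
    test = np.mean((np.polyval(c, x_test) - y_test) ** 2)
    return train, test

tr3, te3 = errors(3)  # modest model
tr9, te9 = errors(9)  # flexible model: tiny train error, big gap to truth
```

The extra degrees of freedom chase the noise: the flexible model's training error shrinks while its error against the truth it is supposed to capture does not.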


> We were all born without any knowledge whatsoever of how the world works

Is that really true? A Google search for "babies are hardwired to" brings up an enormous number of results.


Not at all true. The simple fact that we're born with two legs is proof that massive amounts of information about our surrounding world are already encoded into us before birth. It took millions of years for life to develop this novel mechanism for traversing this planet.

This, of course, extends to our minds as well. Simply the structure of our brain has an immense impact. Even if you don't believe that our neurons are pre-trained to know some things, you'd have to admit that our brain is wired in a way that allows us to learn quite efficiently in this world. But psychology is quite complicated, so it's hard to say what exactly is nature and what nurture.

Technically, when first life began, it really was without knowledge. Over time, organisms learned and adapted more and more to their environment, both physically and mentally.


This really depends on where you draw the line between knowledge and non-knowledge. Our entire scientific body of knowledge is the best explanation we have for the phenomena we, as a social species, have observed. Much of what we call 'knowledge', like maths or physics, is actually human-produced culture (in that it is taught after it has been discovered). It is subject to being incorrect (although it would take a very powerful set of observations, deductions, definitions, and measurements to prove it incorrect).

I think honestly, the line between nature and nurture really depends on what you believe makes you human, conscious, and existing with awareness. You can technically define the entire existence of the universe as a process of changing states of information (some of which may not be measurable).


> Technically, when first life began, it really was without knowledge.

Which might be a good discussion subject: the first knowledge about the environment might have been how to get energy from somewhere.


OP's claim is quite naive. Even if you haven't read anything on this, when you have a child you'll immediately understand how much information they have out of the box. Some well-known examples:

* Babies will immediately seek out face-like patterns and turn their faces towards them (http://www.slate.com/blogs/how_babies_work/2013/04/03/babies...). Babies would turn their faces to animal faces, too, so it's broad face recognition.

* Although babies make smiling motions soon after birth, at about 6 weeks they start to make social smiles (http://www.webmd.com/parenting/baby/babys-first-social-smile). This surely is not learned behavior.

* Language learning. Chomsky's Universal Grammar (http://en.wikipedia.org/wiki/Universal_grammar) posits that we are born with a language faculty that has a lot of switches about language, we just set the switches by being exposed to a particular language.


A lot of philosophy also deals with this idea, but it's not like anyone takes the humanities seriously in the age of data.

At some point it certainly seems like we need to measure the pain in our foot and fit it to a curve to know we've shot ourselves in it.


Some people believe that young children know more about The Truth in a spiritual sense than adults. They just have it forced out of them by the time they grow up.


Some people believe all kinds of meaningless or unprovable nonsense.


:)


Similarly, studying reinforcement learning raises awareness of the tradeoffs you need to make between exploration and exploitation (studying to become a generalist vs. an expert, switching careers vs. continuing in your field, etc.).
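A minimal sketch of that tradeoff, with made-up payoffs: an epsilon-greedy bandit explores a random option with probability epsilon and otherwise exploits the best estimate so far.

```python
import random

def run_bandit(true_means, epsilon, steps, seed=0):
    """Epsilon-greedy: with probability epsilon try a random arm
    (explore), otherwise pull the arm with the best estimate (exploit)."""
    rng = random.Random(seed)
    counts = [0] * len(true_means)
    est = [0.0] * len(true_means)
    total = 0.0
    for _ in range(steps):
        if rng.random() < epsilon:
            arm = rng.randrange(len(true_means))  # explore
        else:
            arm = max(range(len(true_means)), key=lambda a: est[a])  # exploit
        reward = rng.gauss(true_means[arm], 1.0)
        counts[arm] += 1
        est[arm] += (reward - est[arm]) / counts[arm]  # running mean
        total += reward
    return est, total / steps

# three hypothetical "careers" with different long-run payoffs
est, avg = run_bandit([0.2, 0.5, 0.9], epsilon=0.1, steps=5000)
```

With a pinch of exploration the agent reliably discovers the best arm; with epsilon = 0 it can get stuck exploiting an early, mediocre choice forever.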


Really interesting post.

I think working on Machine Learning algorithms early in my career had a pretty significant effect on how I think about running my business.

The biggest factor by far in a model's quality is the amount and accuracy of the training data. In the same way, people seem to learn in direct proportion to the amount and clarity of the feedback they get.

I think groups of people (i.e. companies) improve in proportion to the amount of clear feedback people give each other and the amount and accuracy of the feedback they get from customers.


Two counterarguments:

- we are not a blank slate when we're born

- but more importantly, for some decisions the amount of data you would need will never come close to what you can actually obtain. That's why CS majors have problems with real life: they look for algorithms and data where only educated guesses are possible.


Dr. Peter H. Diamandis — Intelligent Self-directed Evolution

https://www.youtube.com/watch?v=1H68gX_uCj4


Further insights I got from machine learning:

1. There is a mathematically determinable maximum rate of learning. In the real world this often comes up at small scales: once you start seeing the world through a machine learning lens, you notice people routinely making enormous leaps off of data that not only doesn't support the leap, but is literally mathematically incapable of supporting it. As the amount of data scales up this starts mattering less, and the dominant factor becomes the fact that we humans aren't actually all that great at pulling learned truth from large amounts of data. Though some of this is also that the universe is pretty darned noisy, and if we were "good" at it, we'd learn an awful lot of untrue stuff. (If that sounds like a description of current reality... no, I mean orders of magnitude more so than today's real science problems.)
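One standard way to make that maximum rate concrete is Hoeffding's inequality (my gloss, not necessarily the exact bound meant here): estimating a bounded quantity to within plus or minus epsilon at confidence 1 - delta requires at least ln(2/delta) / (2 * epsilon^2) samples, regardless of method.

```python
import math

def min_samples(epsilon, delta):
    """Hoeffding bound: samples needed to estimate the mean of a
    [0, 1]-bounded variable to within +/- epsilon, confidence 1 - delta."""
    return math.ceil(math.log(2 / delta) / (2 * epsilon ** 2))

# A success rate to within 5 percentage points, 95% confidence:
print(min_samples(0.05, 0.05))  # 738
# To within 1 point at the same confidence:
print(min_samples(0.01, 0.05))  # 18445
```

So a handful of anecdotes is mathematically incapable of supporting a fine-grained conclusion, which is exactly the "enormous leaps" failure mode.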

2. The maximum rate of learning is heavily dependent on two things: The ability of the agent to control input into the environment, and the latency of the feedback. First, it has been well established in both theory (machine learning mathematics) and fact (various psychology experiments with cats in very strange visual environments) that active learning is radically faster than passive learning. Passively sitting in a room and listening to someone attempt to present facts is perhaps not the worst approach to teaching, but it's definitely radically suboptimal.

Second, latency is huge. Trying to learn from a latent signal is intrinsically harder than a less latent one, and it has absolutely nothing to do with willpower or moral fiber... it is intrinsically, mathematically, irreducibly more difficult. The maximum speed of learning continues to increase all the way down into the sub-second response times.

The amount of learning that can even in theory be done under the "listen to a lecture, two days later get quizzed on it, receive marked-up quizzes back a week later" model is shockingly low compared to what is theoretically on the table. If I were in the educational startup field, my near-number-one priority would be speed of feedback. I would literally be happy to hear from one of my programmers that the yes/no feedback on the arithmetic drilling went from .5 to .4 seconds. (If you can get your hands on it, try Big Brain Academy or any of its brethren that have the arithmetic drilling in it, and imagine just how much faster you could have learned basic arithmetic with this incredibly responsive tool around when you were a kid. Reminds me, my oldest is getting near where I ought to be pulling that back out....) Between what's on the table with lower latency and integrated spaced repetition, there ought to be almost unbelievable gains on the table for anyone who can put this all together into the right package. (The aforementioned "Big Brain Academy" line of games being the closest I've seen. Latency for putatively "educational" games is often just awful.)

On that note, if you are in the educational startup space, it behooves you to take a machine learning course. Whether or not you are able to apply the code techniques themselves, the insights about learning itself will more than pay back your time. And, frankly, I'm virtually desperate to see someone get this right soon.

(On a sidenote, I'm open to people's suggestions about such programs that do take latency into account for elementary school age children. Most real-world programs seem to go in exactly the wrong direction and assume that the children are not discriminating, so who cares anyhow, let's not spend money on quality software, and the result is slow, slow, slow... agonizing. My feeling is that if we learn from Big Brain Academy and keep things moving along, the mere act of learning itself can feel good enough to keep most users engaged just fine, and that if you make learning boring, no amount of graphical frippery, loud noises, or encouraging-sounding recorded sound clips of people saying "very good" can make up for it.)



