Requiring absolute certainty as a subjective being yields an incomprehensible, absurd universe. The scientific method is based on testing models, i.e. systems of measures, against each other.
I volunteer at two institutions: (1) as a reading tutor at a Harlem charter school and (2) as a mathematics tutor for disadvantaged youth in Lower Manhattan. At 1 the administrators regularly assess students' reading levels. The process is highly subjective but allows the effectiveness of individual students, tutors, and methods to be tracked. At 2 the administration is more ad hoc, giving each tutor more freedom but having no system for measuring progress. 1 has a good idea of what works and a programme with demonstrable effectiveness. 2 is a more challenging environment, as every problem calls for starting from square one. I set my own measurement criteria, but due to a small, variable sample they have low statistical power.
In both cases the measurement is subjective. Strictly codifying the measure and then enforcing it rigidly would be unproductive. But having a measure and optimising processes to the measure creates a feedback loop. Simultaneously remaining vigilant for exceptions, e.g. a child who could read complicated words so long as they didn't contain 'q', creates a meta-loop that optimises the measure-process system. Together one has a system that evolves, that learns.
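To get a rough sense of why a handful of students yields little statistical power, here is a minimal sketch in Python, assuming a two-sample t-test and a medium effect size; the group sizes and effect size are illustrative, not from my actual tutoring data:

    from statsmodels.stats.power import TTestIndPower

    # Power of a two-sample t-test at a medium effect size (d = 0.5):
    # with only a few students per group, a real improvement is very
    # unlikely to reach statistical significance.
    analysis = TTestIndPower()
    for n in (5, 10, 30, 100):
        power = analysis.power(effect_size=0.5, nobs1=n, alpha=0.05)
        print(f"n = {n:3d} per group -> power = {power:.2f}")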
"We see that cars are safer for men than women because the crash-test dummies are men."
Crash dummies have made cars safer for women. Perfect is the enemy of good.
> Crash dummies have made cars safer for women. Perfect is the enemy of good.
I know you just threw this in at the end, but "Perfect is the enemy of good" does not apply here. I'd argue the better quote would be "Good enough is the enemy of better." NHTSA has been around since 1970 but they didn't start testing with different-sized dummies until 2003. I don't think the author is suggesting that we needed 200 different-sized crash dummies in 1970, but why did it take 33 years to notice that our dummies didn't match most drivers? We could have tested smarter and better and we didn't.
I read the post as criticising Gates for (1) not being transparent about the models he's using, and (2) using models that aren't "objective", taken to mean not accurate, adopted arbitrarily on his authority. 1 is a valid point if true, but 2 occupies the bulk of the post. The standard for accuracy appears set too high, illustrated by calling the crash-dummy testing process un-objective for having missed a nuance. Worse, the proposed solution to a perceived lack of objectivity is rejecting the system. This is essentially nihilism: the conflict arises from rejecting any model that isn't perfectly "objective".
33 years is too long to overlook an exception covering 50% of the population. That, however, reduces neither the "objectivity" nor the utility of the model. It just means it could have been better. Similarly, valid criticisms are to be had of the Gates models. That does not mean we declare them un-objective and dismiss them as the manifestation of a rich man's ego.
You've missed her point entirely. She's not saying that only "objective" models should be used; she's saying that there's an irreducible subjectivity in the choice of model, and that this must not be forgotten. She's not saying that measurements should not be taken; she's saying that we need to discuss what to measure, and make sure that what we're measuring is what we actually want to see improved. Her point is not nihilist in the slightest.
Those three comments have got to be the shortest path I've seen from global philanthropy through applied math and a debate about modeling to the original problem of nihilism.
<turn to crowd>
Internet: please look above for how conversation is supposed to work.
What you're proposing is measuring the outcomes of different processes. The laws that Gates is trying to buy would force schools to measure the effectiveness of each teacher based on the standardized test scores (in only math and reading) of their students. In other words, Gates's 'reforms' actually have a lot more in common with school 2 than school 1, in that they try to ascribe large amounts of meaning to tiny samples.
If you were teaching via something like Open Systems Instruction then using value added metrics for teachers would make complete sense, but as it is Gates is just trying to buy laws that would reengineer the schools in minority communities in order to create a large pool of low-income wage slaves.
We can always get better. If your current system is not perfect, then it (by definition) has flaws. Everything has flaws. So let's try to identify the flaws. And let's try to get rid of the flaws.
Don't just disregard any potential flaw because of the political tribe of the person or movement pointing it out.
"Gates also brings up the campaign to eradicate polio and how measurement has helped so much there as well. Here he sidesteps an enormous amount of politics and debate about how that campaign has been fought and, more importantly, how many scarce resources have been put towards it. But he has framed this fight himself, and has collected the data and defined the success metric, so that’s what he’s focused on."
I was looking for some meat to back up the author's claims in regard to mosquito nets etc. Instead I read the above nonsense about polio and realized that the author was either fabricating claims from whole cloth or grinding an ax with Bill Gates.
The campaign against polio is in its 25th year. Gates's involvement was a challenge grant to Rotary International's Polio Plus campaign in 2009. He never set the metrics; they had been in place for twenty years. I was active in Rotary when it was announced. Several local members had been volunteers in far corners of the world on vaccination drives. Some more than once. One more than a few.
The grant was sized to allow the existing effort to finish its work. The Gates Foundation didn't set the agenda, let alone Bill Gates. My understanding is that the foundation employs experts.
Absolutely right. Furthermore, the claim that Gates has sidestepped the politics of polio eradication is also incorrect: Gates has been in the UK talking about this to the UK media this week.
"This plan says that if the world supplies the necessary funds, political commitment, and resolve, we will certify the eradication of polio by 2018."
He continued: "Funds, commitment, and resolve… These are the key variables.
The author seems to have constructed a strawman. And I think the point she misses most is transparency of data and the feedback loop. If you have these two things, it shines a light on the problem points she discusses.
That is, having a model is not sufficient. You need a model, a way to measure whether the model is a success, and a feedback loop. All of these things are open and can be updated. Things like seatbelts being better for men become more obvious, and the model can be improved to include gender weights, etc.
At the end of the day a model is a model. Almost always imperfect. But if the model is open, as is the metric of success, and how the feedback loop works, I think you'll tend toward improvement and better discussions.
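A minimal sketch of such a loop (all data and names below are invented for illustration, not taken from any real crash programme): an open model, an open success metric, and one turn of the feedback loop once the metric exposes a gap.

    # An open model, an open error metric, and one update step.
    crashes = [  # (occupant group, observed injury score) -- invented
        ("m", 2.1), ("m", 1.9), ("m", 2.0),
        ("f", 3.2), ("f", 2.9), ("f", 3.1),
    ]

    def error(predict):  # the published success metric: mean absolute error
        return sum(abs(predict(g) - y) for g, y in crashes) / len(crashes)

    # Model v1: one dummy size, one prediction for everyone.
    overall = sum(y for _, y in crashes) / len(crashes)
    print("v1 error:", round(error(lambda g: overall), 2))

    # The open metric exposes the gap, so model v2 adds a group weight.
    mean = {}
    for g in ("m", "f"):
        ys = [y for gg, y in crashes if gg == g]
        mean[g] = sum(ys) / len(ys)
    print("v2 error:", round(error(lambda g: mean[g]), 2))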
The title is wrong: data is objective, but the models are not, and which data is used is not (and she argues both those points).
She links to Gary Rubinstein's blog[1], which is mind-boggling in its analysis of the results of an actual model of teacher effectiveness and the resulting data.
1) The model that NYC used to evaluate a teacher's effectiveness is not available. It appears that the input data is not available either, only the output data of the model. That isn't transparency.
2) The output of the model is random. Having a feedback loop on random output data isn't going to improve the model.
Maybe Bill Gates' model and data will be better and will be transparent. I'm skeptical.
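To see why point 2 matters, here's a quick simulation (invented scores, not the NYC data): if the output is pure noise, last year's rating predicts nothing about this year's, so there is nothing for a feedback loop to latch onto.

    import numpy as np

    rng = np.random.default_rng(0)
    year1 = rng.normal(size=1000)  # 1000 teachers, year-1 "scores"
    year2 = rng.normal(size=1000)  # the same teachers a year later
    r = np.corrcoef(year1, year2)[0, 1]
    print(f"year-to-year correlation: {r:.3f}")  # close to 0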
I have no idea why you are being downmodded. Your point is absolutely correct.
You should always create an explicit, transparent model and utility function. Discussion will always be more clear. "Person I disagree with undervalues math relative to reading, just look at his weights" will get us far closer to the truth than vague subjective opinions and mood affiliation.
The only other time I've seen this particular blog is when it was posted[1] on HN and the author was fighting a strawman Nate Silver. It's unfortunate, because the blogger obviously has some math smarts but misinterprets the written word.
Your comment brings to mind the quote by George Box, "All models are wrong, but some models are useful."
I am currently enrolled in Jeff Leek's Data Analysis class on Coursera. Jeff stresses that a good analyst always challenges their results, gaining credibility with the reader. I would have liked to have seen more of this in Bill Gates's essay in the NYT (that Cathy used in her blog).
Two articles with link-baity titles attacking two people who are held in high regard. I mean, come on: "Bill Gates is naive", "Nate Silver confuses cause and effect". This person is obviously trying to grab page views with whatever they can come up with. These articles could have been worded in much less offensive and aggressive ways, but that wouldn't be conducive to the author's goal.
The author doesn't get what Bill Gates is saying. I doubt Bill is naive enough not to understand that the model of measurement is important and never 100% objective. Bill is saying (1) teachers' performance needs to be measured; and (2) college ranking is based on a wrong criterion.
And we must do something to fix this. Bill isn't saying he has the perfect model for measuring these two things. But he is right in saying both of these things.
Teacher performance in K12 needs to be measured. It's hard, maybe expensive, and certainly subjective and biased in certain ways. But it needs to be measured. In college, teachers' performance is constantly measured. No professor will tell you that the student evaluation is a perfect indication of his/her ability to teach. But most professors would agree that such a thing is necessary.
College professors tend to be recruited based on their ability to do research and get grants, so measuring them on teaching performance is non-threatening. However, the situation is different in K12.
Bill Gates has been able to apply his measurement philosophy at Microsoft, and the results have been disastrous by most accounts.
Money quote:
"Stack ranking provides questionable value as to insight into an individual’s actual job performance. Its use highly politicizes an organization. The rank number is most often based on unsubstantiated subjective judgment by an evaluator who may feel pressured to respond according to a narrow set of guidelines."
I love science. But measuring employee or teacher performance is not science. Managing an organization by pseudo-science should be called out for what it is: B.S.!
Teaching colleges of course take teaching seriously. Research universities place higher importance on research, but they also take teaching seriously, simply because undergraduate tuition contributes a big chunk of their revenue.
>But measuring employee or teacher performance is not science.
I certainly agree that learning - which I will define as the acquisition, mastery, and retention of prescribed skills - needs to be measured, modeled, and used to improve the performance and cost-effectiveness of the process.
Learning is a complicated process governed by MANY factors beyond teacher performance. These would include the student's interest, aptitude, and maturity/self-control, the parents' interest and engagement, and the administration's effectiveness at providing a safe environment conducive to learning. Clearly, these factors are interrelated. Any worthwhile study needs to collect sufficient data to reasonably represent these contributions to the outcome.
Let me cite a concrete example: My wife is a teacher in a small private school who teaches chemistry, physics, calculus, and statistics to students in grades 10-12. Some of these are actually college-level classes taught under the umbrella of a local community college, following their rubric. She spends many hours refining her classes and labs to make them more effective and engaging. As she looks at her grade distributions, they are ALWAYS bimodal. She can tell you by mid-semester where most students will end up. Many will not do homework or study for tests. She will put notes on homework asking students to see her about a problem and they will not follow up. She uses the school's academic warning system to notify parents, often with no response. Indeed, any parent can see any grade at any time by logging into the school's web site. The sad truth is many parents and students just don't give a rip. This is why many teachers roll their eyes when the "good idea fairy" suggests "let's do another study to evaluate teacher performance."
It is high time we evaluate the performance of students, parents, teachers, and administrators in this process. I am tired of seeing educational expenditures go up and performance go down. We need to pay for performance at all levels.
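For illustration, here's a sketch of the kind of multi-factor model such a study would need; every variable name and coefficient below is invented:

    import numpy as np
    import statsmodels.api as sm

    # Invented data: the outcome depends on teacher quality AND on factors
    # the teacher doesn't control. A regression that includes them can
    # separate the contributions; one that omits them cannot.
    rng = np.random.default_rng(0)
    n = 500
    teacher = rng.normal(size=n)    # stand-in teacher-quality measure
    parents = rng.normal(size=n)    # parental engagement
    aptitude = rng.normal(size=n)   # student aptitude/self-control
    outcome = 0.3 * teacher + 0.5 * parents + 0.8 * aptitude + rng.normal(size=n)

    X = sm.add_constant(np.column_stack([teacher, parents, aptitude]))
    fit = sm.OLS(outcome, X).fit()
    print(fit.params.round(2))  # recovers roughly [0, 0.3, 0.5, 0.8]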
I don't think that Bill Gates claims decent measurement is sufficient for the kind of progress he seeks but rather claims that (in most cases) it is necessary.
I suspect one answer to Mathbabe's question "what can a non-academic mathematician do that makes the world a better place?" might be to highlight and explain the kinds of logical dependencies that are so often obscured by polemics.
I cringe at the suggestion Bill Gates is naive. Anyone who has spent the time he has in corporate America will be well aware of the challenges and politicization of measurement.
I like how the title introduces a HUGE bias against the author.
I know it's all trendy to write whatever can bait some viewers to blog posts, but come on: he's one of the richest men in the world, one of the greatest entrepreneurs of recent history, and he's currently working on worldwide health issues. Let's have some decency.
Also, Bill might be rich, but under his command Microsoft faked evidence in federal court, etc., to get and keep that wealth. If the law worked the same for him as for everyone else he wouldn't have so much money to give away.
Data is objective. But a decision process incorporates both data and a utility function (a goal [1]), and the utility function/goal is based on values.
The successes that Bill Gates ascribes solely to data collection are actually due to a choice of utility function, together with using data to measure it and make decisions based on it.
However, the claim made in the title does not agree with the content of the article. Data is objective. Your choice of goals is the thing that is not.
Bill Gates chose as his goal (# of children who can read) - others prefer (# of teachers with lifetime job security) or (3 x # of children who can read + 2 x # of children who can do math). It is absolutely true that no amount of data will change your fundamental goals, and we should recognize this.
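To make that concrete, here's a toy comparison (schools and numbers invented): the data are fixed, but which option "wins" depends entirely on the utility function.

    # Same data, two utility functions; no amount of extra data settles
    # which function is the right one.
    schools = {"A": {"read": 80, "math": 40},
               "B": {"read": 60, "math": 75}}

    u_reading = lambda s: s["read"]                       # reading-only goal
    u_weighted = lambda s: 3 * s["read"] + 2 * s["math"]  # a different goal

    print("best by reading alone:", max(schools, key=lambda k: u_reading(schools[k])))
    print("best by 3*read + 2*math:", max(schools, key=lambda k: u_weighted(schools[k])))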
> Data is objective. Your choice of goals is the thing that is not.
These two are not independent. Your statement makes it sound like data grows on trees or falls from the sky. Of course that's not so. Someone chose to collect this data rather than some other data. That choice was not objectively given – it's very much affected by "choice of goals", as well as by pre-existing beliefs and no doubt countless other all-too-human factors.
Moreover, the data may be wrong, but lutusp has that one covered.
Almost never true. One of the responsibilities of science is to gather data in a way meant to minimize sources of bias in the data collection process. Any number of studies have come to a very predictable, and wrong, conclusion, based on biases and errors in data collection.
> It is absolutely true that no amount of data will change your fundamental goals, and we should recognize this.
Yes, and one point of science is to separate the process of data collection from any particular goal or outlook.
Bias and error are not the same as a lack of objectivity.
A loaded coin may come up heads 75% of the time. Using that coin to make a decision yields a biased procedure. It's also objective, since the bias is unrelated to the experimenter.
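A quick simulation of that hypothetical coin: the 75% bias is a measurable property of the coin itself, and anyone who tosses it enough times recovers roughly the same number, regardless of who they are.

    import random

    P_HEADS = 0.75  # the hypothetical loaded coin above
    n = 10_000
    heads = sum(random.random() < P_HEADS for _ in range(n))
    p_hat = heads / n
    se = (p_hat * (1 - p_hat) / n) ** 0.5  # normal-approximation std. error
    print(f"estimated P(heads) = {p_hat:.3f} +/- {1.96 * se:.3f}")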
> A loaded coin may come up heads 75% of the time. Using that coin to make a decision yields a biased procedure. It's also objective, since the bias is unrelated to the experimenter.
That's just not true. The experimenter chose to record the results of coin tosses, chose the coin, chose the tossing method, and chose what variables to record (heads/tails for each toss).
Before the coin is tossed the first time, human subjectivity and bias have already had ample opportunities to affect the outcome. What does it even mean to propose that the resulting data will be "objective?" The resulting bias, if any, is entirely the experimenter's responsibility.
This is what the essayist means when she says "the people who own the model have the power." Her point isn't that using measurements is bad or that feedback loops can't work, it's that merely establishing an ambitious shiny new model doesn't offer any guarantee that the powers and influences that corrupted the old model won't corrupt the new one too.
> Bias and error are not the same as a lack of objectivity.
Say what? Bias is a lack of objectivity; that's how the word is defined. And errors can be managed by a combination of procedural discipline and peer review. In other words, the essentials of science.
> A loaded coin may come up heads 75% of the time. Using that coin to make a decision yields a biased procedure. It's also objective, since the bias is unrelated to the experimenter.
No, never. The outcome of the study is not objective if the experimenter believes the coin to be fair. And if the experimenter knows the coin is unfair, then it's not objective for a different reason.
Objectivity is not a debating point as in post-modernism; it's a prerequisite for science, and with sufficient rigor it can be established. This is not to argue that it is always achieved, but it is always possible.
Quote: "Objectivity is a noun that means a lack of bias, judgment, or prejudice."
And your point (b) confuses inaccuracy and variance, which are distinct factors. All your points (a through c) result in an easily definable, scalar error factor; they are not at all orthogonal.
You realize that "lack of bias" comes from the thesaurus, not the definition of the word, right?
If you want to use an unusual definition of a word in order to make it apply to all errors (rather than only some), be my guest. There is no point disputing definitions. By your definition, you are correct, by the common definition, you are incorrect.
Quotation: "undistorted by emotion or personal bias".
> If you want to use an unusual definition of a word ...
I just proved that I am using the definition of the word. If you're not happy with what dictionaries have to say on this issue, then begin a campaign to change the meaning of "objective".
> There is no point disputing definitions.
So stop doing that. I'm not disputing the accepted definition, I'm simply posting it. Copy, paste.
All the definitions of objectivity describe a property of the experimenter, not the method.
A method can be biased too. Take, for example, the estimator S^2 = (1/n) sum (x[i]-mean(x))^2. This is a biased estimator of the variance, but it is nevertheless objective: it is not influenced by the state of the experimenter at all.
Personal bias contradicts objectivity, but not all bias is personal bias.
Feel free to conflate all errors under one label - those of us who care about getting our measurements right don't have that luxury.
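The bias of that 1/n estimator is easy to exhibit empirically; a quick simulation (sample size and trial count are illustrative):

    import numpy as np

    # Draw many samples of size n from a distribution with variance 1 and
    # average the 1/n estimator: it converges to (n-1)/n, not 1.
    rng = np.random.default_rng(0)
    n, trials = 5, 100_000
    samples = rng.normal(0.0, 1.0, size=(trials, n))
    dev = samples - samples.mean(axis=1, keepdims=True)
    s2_biased = (dev ** 2).mean(axis=1)  # the 1/n estimator
    print("mean of 1/n estimator:    ", round(s2_biased.mean(), 3))  # ~0.8
    print("mean of 1/(n-1) estimator:", round((s2_biased * n / (n - 1)).mean(), 3))  # ~1.0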
The author makes the point that data are not enough, but then doesn't drop the other shoe. The "other shoe" in this case is that data collection can only describe, but cannot explain.
If I collect data about all those points of light in the night sky, I can then say, "There are many points of light in the night sky". But that is only a description -- it lacks an explanation, for which the data collection effort can only be a preliminary step.
The social sciences are famous for describing things they cannot explain. When we make an effort to explain what has been described, we cross the line into science.
> ... behind every model and every data set is a political process that chose that data and built that model and defined success for that model.
Not in science. This is precisely what science is meant to avoid. All we need to do is practice science rather than perform naive data collection followed by shallow conclusions.
I suppose MathBabe provides a service by constantly reminding us that human motivation is impure and leads to all sorts of bias that taint the numbers and the models used to evaluate them, but it does get tiresome to read a blog supposedly about math which is, in fact, largely written to support her political agenda. (Even if you agree with her agenda.) She recently attacked Nate Silver for "defending corruption" because he didn't use his book to attack evil, greedy Wall Street, and now Bill Gates for being naive, because he didn't use his annual report to state the obvious, about which little can be done, but instead chose to promote measurement, where progress can be made. Perhaps she should change her moniker to PoliticalScienceBabe?
If you define the model arbitrarily, I agree, but if you build a model based on observed data, you are much better off.
Choosing thin crash test dummies is an arbitrary decision. Measuring sets of real people and building crash test dummies around them is not. Unfortunately, it would also cause the cost of testing to skyrocket, because you don't get two tries: testing a larger and a smaller person means crashing two cars instead of one.
Tackling education can follow the latter model, but just like any process, the real risk is that the end goal becomes following the process and not accomplishing what the process is trying to achieve. A sufficiently complicated model could, in theory, address that, but I think trying to turn educators into robots will have far more negative effects, because that will impact the best teachers the most.
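Back to the dummy-sizing point: here's a sketch of deriving dummy sizes from measured people rather than by fiat (the survey data below are invented; real programmes specify dummies at anthropometric percentiles in a similar spirit):

    import numpy as np

    # Invented survey: derive dummy sizes from the population you actually
    # want to protect, e.g. at the 5th, 50th, and 95th percentiles.
    rng = np.random.default_rng(0)
    masses_kg = rng.normal(75, 15, size=5000)  # stand-in for measured people
    for pct in (5, 50, 95):
        print(f"{pct}th-percentile dummy: {np.percentile(masses_kg, pct):.0f} kg")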
So what is the author proposing? Maintaining the status quo, because we cannot expect to measure anything accurately.
I do agree with the author that the current problem with education in the US will not be solved with measurements alone. The problem seems to be systemic. But it would be a start.