
this researcher seems to take it for granted that there is something fundamental/sacrosanct about the "scientific method", but there isn't; it is a learning algorithm whose role is being diminished by new, more powerful learning algorithms.



I'm not sure I understand you, and I'd appreciate a more complete analysis.

In relation to this article: it says that when you build a model that fits the data, it's not enough. It's a further requirement that the model make testable predictions that turn out to be correct.

Models that make accurate predictions are valuable. Models that aren't then tested by checking their predictions are, by definition, untested. That, to my mind, makes them potentially dangerous.

I'd be interested to know why you describe making falsifiable models as just "a learning algorithm."

Thanks.


> In relation to this article: it says that when you build a model that fits the data, it's not enough. It's a further requirement that the model make testable predictions that turn out to be correct. Models that make accurate predictions are valuable. Models that aren't then tested by checking their predictions are, by definition, untested.

the problem with the article is that it undeservedly lumps many such models in with wholly untested ones, saying: "And no, that does not mean simply training your model on half of your data set and showing that you can effectively explain the other half of your data." that is hypocritical, because this is more or less precisely what scientists, following the "scientific method", are doing, albeit many times slower than a computer with a parametric model.

> I'd be interested to know why you describe making falsifiable models as just "a learning algorithm."

what happens in some piece of the world over some duration of time can be represented as a function from that piece of the world's initial state to its final state. in the sciences, we try to estimate such functions. the scientific method is a very simple algorithm for doing so: generate a hypothesis, compute N "testable predictions" f(x_i) ≈ y_i, and then accept your hypothesis if they all turn out to be correct.
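a minimal sketch of that loop in python (the linear "nature" function, the noise level, and the acceptance threshold here are all invented for illustration):

    import random

    def nature(x):               # stand-in for the piece of the world being studied
        return 3.0 * x + random.gauss(0.0, 0.1)

    def hypothesis(x):           # a candidate estimate f of nature's function
        return 3.0 * x

    # compute N "testable predictions" f(x_i) ≈ y_i, accept if they all hold
    xs = [random.uniform(-1.0, 1.0) for _ in range(20)]
    accepted = all(abs(hypothesis(x) - nature(x)) < 0.5 for x in xs)
    print("hypothesis accepted:", accepted)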


There is something very fundamental about the scientific method. Science relies on experiments that ground it in physical reality, which does still exist.


That's actually not true. There is something fundamental about the scientific method. Namely, it's the learning algorithm; the only one that actually works (so far). Everything else is just variations on or approximations of this algorithm.

When you build a model and then do science (i.e. the Hypothesize/Test/Explain loop) on the model instead of doing science on Nature, you learn a lot about your model. But you must be honest and admit that you have learned nothing about Nature. What you might have done is come up with a hypothesis. But that's not enough. It's not science yet - it's just a complicated hypothesis. It doesn't count until you test it on Nature.

The point (and it's an obvious one, well-understood by most complexity researchers) is that your models need to enable you to falsify them by doing real, repeatable experiments on Nature, not on models. There's no point building a model of a complex system if the only experiment that can falsify it is so complex that it can't be done.

A lot of time is wasted this way - in many ways it's the nature of the beast (it's called "complexity" for a reason). Nonetheless, this is frustrating to scientists who think the resources spent on learning facts about models might be better spent on learning facts about Nature - the more radical among them likely being willing to trade all the modelling (and modelers?) for even a single properly verified fact.


when you train a computational model on half of your dataset and test it against the other half (which the author claims is Not Sufficiently Scientific), how is that qualitatively different from training it on all your data, and then "testing it on Nature" by gathering some more and checking your results? i am reminded of searle's chinese room argument.

> this is frustrating to scientists who think the resources spent on learning facts about models might be better spent on learning facts about Nature - the more radical likely being willing to trade all the modelling (and modelers?) for even a single properly verified fact.

there is no getting rid of models. neither science nor any other human activity i am aware of is capable of verifying facts about the physical world.


>when you train a computational model on half of your dataset and test it against the other half (which the author claims is Not Sufficiently Scientific), how is that qualitatively different from training it on all your data, and then "testing it on Nature"

it would be essentially the same thing if you took half the data, made a model, tested it on the other half, got good results, and stopped.

but that's not going to happen. your first model will stink. you'll refine it and try again. and you'll keep doing that. unfortunately, this means your model ends up being dependent on all the data.

the easiest way to prevent this is to build a model and then test it on real-world data that occurs after you build the model. if it doesn't work, tweak the model, and then find new data again.
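a rough sketch of the difference in python (the toy linear model, the candidate slopes, and all the numbers are invented; the point is only where the test data comes from):

    import random

    def sample(n):                       # fresh observations of "nature"
        return [(x, 2.0 * x + random.gauss(0.0, 0.2))
                for x in (random.uniform(0.0, 1.0) for _ in range(n))]

    def error(points, slope):
        return sum((y - slope * x) ** 2 for x, y in points) / len(points)

    data = sample(100)
    train, held_out = data[:50], data[50:]
    slopes = [1.0, 1.5, 2.0, 2.5, 3.0]

    # honest split: fit on train, score on held_out once, and stop
    fitted = min(slopes, key=lambda a: error(train, a))
    print("one-shot held-out error:", error(held_out, fitted))

    # leaky loop: "refine" by picking whatever scores best on held_out --
    # now the model depends on all the data
    tuned = min(slopes, key=lambda a: error(held_out, a))

    # the fix: freeze the model, then gather new data afterwards
    print("error on data gathered later:", error(sample(50), tuned))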


> your first model will stink. you'll refine it and try again. and you'll keep doing that. unfortunately, this means your model ends up being dependent on all the data.

this is exactly what the scientific community as a whole is always doing.


A learning algorithm creates a specialist filter, usually in classification problems, that can discriminate between the various classes.
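A minimal sketch of such a learned filter (a toy perceptron in Python; the points, labels, and numbers are invented for illustration):

    # learn a linear filter that separates two classes; no weight is set by hand
    points = [((2.0, 3.0), 1), ((3.0, 4.0), 1), ((1.0, 1.0), -1), ((0.0, 2.0), -1)]
    w, b = [0.0, 0.0], 0.0
    for _ in range(20):                                    # perceptron updates
        for (x1, x2), label in points:
            if label * (w[0] * x1 + w[1] * x2 + b) <= 0:   # misclassified
                w[0] += label * x1
                w[1] += label * x2
                b += label
    print("learned weights:", w, "bias:", b)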

The fact that you didn't manually tune each and every weight in the classifier doesn't mean that its behaviour is not solidly governed by scientific principles.

Science and the application of the scientific method are what got us there in the first place, and have given us the keys to understanding these systems. Genetic Algorithms are another area where you could easily be tempted to think that you are not doing science; the same thing happens there.

Science is the key to understanding. The scientific method is the ultimate arbiter between what we know to be true and what we cannot prove to be false, fantasy, or outright lies.

I feel that you think these computational models somehow supplant the scientific method, but you are forgetting one crucial bit here: those models themselves will need to be understood, and only the scientific method will allow you to do so.


> How is that qualitatively different from training it on all your data

It's different for the simple reason that to really verify a hypothesis, you don't just test it on the other half of the same data. That's just step 1.

What you need to do is work out the further implications of your hypothesis, and test those. When doing so, one of four things will happen:

1) You remain an astronaut. In other words, your work is so far out that none of its implications falsify or corroborate any existing theories, and all require their own experiments. This can be good (i.e. ground-breaking work), but for your hypothesis to become accepted as anything other than an interesting diversion, you will need to wait for the rest of science to catch up, or to keep building out implications until you can connect what you're doing to existing theories.

2) The implications of your hypothesis directly corroborate an existing theory. This is useful, but a bit boring. It means you've basically discovered yet another implication of an existing theory. Experiment some more, and then publish.

3) The implications of your hypothesis contradict existing hypotheses or theories. This is exciting! This is where the real science lies. Now you get to design a series of experiments to figure out who is right, who is wrong, and why. You debug your hypothesis and/or the existing science 'til it works, possibly overturning established wisdom along the way.

4) A combination of (2) & (3) occurs. Woah! This is really exciting. You've discovered that the implications of one existing theory falsify those of another existing theory, uncovering a fundamental flaw in our understanding so far. Well done! You've got a lifetime of debugging ahead of you, but your contribution is likely to be truly important.

> there is no getting rid of models.

True; and models are very useful. But there's a big difference between models used as an explanatory / predictive tool (e.g. Newtonian physics), and models used as a substitute for Nature when doing experiments (e.g. what's done in Computational Neuroscience, my field). The latter attempts to duck the issue of doing real experiments that are complicated / expensive by running the experiment on a computer.

Now, this is certainly useful - but to be science, it must be accompanied by a solemn understanding that "What you learn here don't mean jack." It's just a model, and probably a bad one. God doesn't use 64-bit floating point. The only possible use is to help you design a better experiment to try in the real world. But if you never get to doing that, then you've wasted everybody's time, because you got stuck on step 0 (Hypothesize).


i realized that i misread this original post. you are saying that what you do (study the brain) is more important than what some other scientists do (empirically study complex functions). i think this is kind of impolite for scientists to do in public, though of course, everyone makes jokes about it in their own company. anyway, after rereading, the only positive statement i take issue with is this:

> There is something fundamental about the scientific method. Namely, it's the learning algorithm; the only one that actually works (so far). Everything else is just variations on or approximations of this algorithm.

i believe that what i call the scientific method is merely a variation on/approximation of bayesian inference.
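for concreteness, here is that updating in a few lines of python (the coin example and its probabilities are invented; the point is that there is no hard accept/reject step, only belief shifting with evidence):

    # two hypotheses about a coin, updated by bayes' rule after each toss
    posterior = {"fair": 0.5, "biased": 0.5}        # priors
    p_heads = {"fair": 0.5, "biased": 0.8}          # P(heads | hypothesis)

    for toss in ["H", "H", "T", "H", "H"]:          # the "experiments"
        for h in posterior:
            posterior[h] *= p_heads[h] if toss == "H" else 1.0 - p_heads[h]
        total = sum(posterior.values())
        posterior = {h: p / total for h, p in posterior.items()}

    print(posterior)   # belief shifts gradually toward "biased"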


> new, more powerful learning algorithms.

Could you enumerate those ?


here is a course i have been working through that describes some learning algorithms: http://courses.csail.mit.edu/6.867/lectures.html

i also like this book: http://www.amazon.com/All-Statistics-Statistical-Inference-S...


Learning algorithms and statistical inference do not in any way supersede or invalidate the scientific method; they are products of the scientific method.

They are very interesting though, that's for sure.


the products of a thing often supersede it.

in this case, the development of inference is certainly a consequence of the development of the scientific method. but the validity of the latter is a mathematical consequence of the validity of the former. bayesian inference is more fundamental.

another good book that informs my personal views on this matter: http://www.amazon.com/Probability-Theory-Logic-Science-Vol/d...


@hc: This is so true. In my opinion "scientific method" is actually undefined. I just checked Wikipedia to confirm that "scientific method" itself has become a bloated academic field with countless versions; it is not used by practicing researchers.

http://en.wikipedia.org/wiki/Scientific_method


The scientific method is alive and well. According to that same article you linked above, these are the steps in 'doing science' following 'a scientific method' (of which there are many, but they all share the same basic characteristics):

   1. Define the question
   2. Gather information and resources (observe)
   3. Form hypothesis
   4. Perform experiment and collect data
   5. Analyze data
   6. Interpret data and draw conclusions that serve as a starting point for new hypotheses
   7. Publish results
   8. Retest (frequently done by other scientists)

So, even though, given the variety of fields and the scope of problems, we cannot use a single methodology to give us all our knowledge (that would be so convenient), we have come up with a series of steps that, if you adhere to them, will give you meaningful results.


Yes, this is my point. Is there a "The Hacking Method"? I believe not. There may be lists such as "7 good habits of the best hackers," but there is no Biblical commandment defining one method to be the true way of computer programming.

The idea of an absolute, sacrosanct Scientific Method is contrary to what scientific research is about. It defines science to be an academic method.

So, assume that I have my personal system of doing research. I don't define the problem. I don't form formal hypotheses. I don't perform experiments (formal academic exercises); instead I use trial and error. I take frequent strategic naps to develop my ideas. I analyze the idea any way I like. I don't publish my results; instead I note them meticulously in my lab books. I don't care about academic rules, and I ignore academics and their science fetish... instead I make discoveries. I ignore the paperwork and the career considerations... I make discoveries. You recognize Edison's method. Are you saying that what I am doing is not science because I did not follow your step-by-step "scientific method"?

The scientific method is an attempt to brand academic physics as science and the rest as stamp collecting. This marketing gimmick has actually been working. People associate academic physics with science.


If you just make discoveries, it's not science, no. If you communicate them widely in a way that makes your discoveries replicable by anyone of sufficient training and interest, then yes, it is science. It's just a word! Its meaning is a combination of discovery, openness, and total honesty. As long as you're more or less doing those things ... you're more or less doing science.



