Networks are Killing Science (scientificblogging.com)
47 points by Anon84 on Aug 23, 2009 | 33 comments



Linkbait is killing journalism.

I don't know whether an editor or the writer himself is to blame, but the headline pretty much contradicts the article's thesis: because network techniques are not tested in accordance with the normal scientific method, the claims of their proponents are unverifiable; ultimately such techniques will be limited by their inability to prove or explain the phenomena they seem to expose, and may even be dismissed as a high-level form of pareidolia.

Here's a prediction of my own, one that I'm willing to put to the test: if complex systems researchers don't get serious about the scientific method, their field is going to fizzle out, if not crash and burn.

Doesn't sound too worried about the future of 'proper' science, does he? I think he's right, insofar as non-scientists are deeply suspicious of any kind of modeling (see the climate change debate, for example), so using controls, double-blinds, and so on is the best way to advance this empirical technique.


There's a huge potential flaw in your prediction: you assume that the true efficacy of the techniques they use will determine the survival and popularity of the research techniques. In the cases indicated, the people deciding whether to continue funding the research are not scientists who understand causal relations worth a damn, and they're very susceptible to spin (in the political sense of the term "spin", of course).

The saddest thing about it is that I'm positive there's something to the theories involved, but they're being squandered by people more interested in the popular appearance of success than they are in true efficacy -- possibly because they ego-identify with the successes of their pet projects, and ensure that they don't pay enough attention to the right factors to recognize the difference between appearance of efficacy and true efficacy.


Perhaps I missed this in the article, but in what "cases" are the people deciding whether to continue funding complex systems research not scientists?

My guess is that a majority of the research into complex systems is funded by the NIH and the NSF, both of which make funding decisions largely based on the opinions of other scientists.


Um, it's not my prediction; I'm just quoting the article, though I agree with the basic thesis (as do you, apparently).


I agree. I think the underlying issue here is that computer models are given way too much credit. A computer model is only as good as the software it runs on and the assumptions about the real world that this software makes.

If, as a scientist, you want to test assumptions, it is better not to rely on complex computer models that hide all their assumptions under the hood, but to build theories that are as simple as possible and test assumptions one at a time. If you know all your assumptions are reliable, then you can cobble them all together in computer models. And of course, keep everything open so that others can check your work.

One of the most obvious and disastrous examples of the failure of computer models is the economic crash that just happened. A bunch of bankers and rating agencies had a lot of fancy, complex computer models that proved their mortgage-backed securities could not possibly lose value. So the securities were rated AAA, and everyone treated them as essentially risk-free. Well, we all know what happened next.


The economic crash is not a failure of computer models or even complex models. It's a failure based on a single simple assumption: house prices never go down.


The author seems to be confused. Scientists do try to validate their models (even the sniper author, via correlation). They do understand that their model must predict the future ... however, in some fields (like preventing sniper attacks, psychology) there are many external variables, so it is difficult (if not impossible) to prove that the model is exactly correct. However, that does not mean that the results are not on target or that we cannot learn anything from the study ...

If nothing else, no one will take a scientist's model seriously unless they use it to predict future events accurately -- this is a fundamental requirement of a model, after all.


"If anything else, no one will take a scientist's model seriously unless they use it to predict future events accurately -- this is a fundamental requirement of a model, after all."

Except when it comes to many many people trying very hard to spend trillions of dollars to fight something that is only predicted in models (failed models) - global warming.

Here is a link to Freeman Dyson discussing this subject:

http://www.edge.org/3rd_culture/dysonf07/dysonf07_index.html


When politics get involved it gets ugly. Special interest groups always try to confuse the public over results which would have otherwise been accepted by the scientific community. This causes the public to form opinions on things which are not based in science but in emotion. For example, global warming, evolution, ex-gay therapies, etc.

Global warming is definitely scientific fact, but the issue here is more about how much money we need to spend to mitigate the (still somewhat unknown) risk.


> Global warming is definitely scientific fact, but the issue here is more about how much money we need to spend to mitigate the (still somewhat unknown) risk.

I think there's something to the indicators that global temperatures have been rising, of course, but the "science" involved has been so damned sloppy that we can't even get credible estimates of past trends, let alone credible predictions of future trends. There's a lot more at issue than "how much money do we need to spend". Hell, there's even significant doubt about the question of whether what is currently occurring is inconsistent with natural cyclic trends, and many temperature sampling sites have been found to be compromised by virtue of placement in the middle of parking lots and the like.

Get me some figures that don't look like they were compiled by twelve year olds, then we'll talk.


I don't believe I supported my post with any information from special interest groups; I gave you a link to what Freeman Dyson writes.


This article is about the 'inductive fallacy,' more or less. Karl Popper wrote a long work (The Logic of Scientific Discovery) on the scientific method, putting forward the criterion of falsifiability as an alternative to the positivist approach of model-building.


I have gotten into doing this sort of work (the scientific computer modeling part, not the "cargo-cult" interpretations thereof). Popper was the first person I went back to reread when I got into it. This is very important to understand (and so simple).

However, I disagree with the author that this type of work is dispensable. You have to test your theories for them to be scientific, but to do that you first have to know exactly what experimental results your theories predict, and that is what the models are for. These complex systems are too important to stop studying, and they're too complex for "simpler" methods.


In my experience, the best down-to-earth definition of science, referring to the process the scientific community collectively performs (as measured by conference talks, proceedings, and journal articles), is this: science is anything which helps scientists think about their field in new or improved ways, with the caveat that the ultimate goal (in the natural sciences) is to arrive at a testable theory.

The intermediate steps very often do not contain anything testable or repeatable.


this researcher seems to take it for granted that there is something fundamental/sacrosanct about the "scientific method", but there isn't; it is a learning algorithm whose role is being diminished by new, more powerful learning algorithms.


I'm not sure I understand you, and I'd appreciate a more complete analysis.

In relation to this article, it says that when you build a model that fits the data, it's not enough. It's a further requirement that the model makes testable predictions that turn out to be correct.

Models that make accurate predictions are valuable. Models that aren't then tested by checking their predictions are, by definition, untested. That, to my mind, makes them potentially dangerous.

I'd be interested to know why you describe making falsifiable models as just "a learning algorithm."

Thanks.


> In relation to this article, it says that when you build a model that fits the data, it's not enough. It's a further requirement that the model makes testable predictions that turn out to be correct. Models that make accurate predictions are valuable. Models that aren't then tested by checking their predictions are, by definition, untested.

the problem with the article is that it undeservedly lumps many 'models' in with wholly untested ones, saying: "And no, that does not mean simply training your model on half of your data set and showing that you can effectively explain the other half of your data." That is hypocritical, because this is more or less precisely what scientists, following the "scientific method", are doing, albeit many times slower than a computer with a parametric model.

> I'd be interested to know why you describe making falsifiable models as just "a learning algorithm."

what happens in some piece of the world over some duration of time can be represented as a function from that piece of the world's initial state to its final state. in the sciences, we try to estimate such functions. the scientific method is a very simple algorithm for doing so: generate a hypothesis, compute N "testable predictions" f(x_i) ≈ y_i, and then accept your hypothesis if they all turn out to be correct.
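to make that concrete, here is a toy sketch of that loop in python (my own illustration; the hypotheses, observations, and tolerance are all made up):

    # sketch of "the scientific method as an algorithm": propose candidate
    # functions f from initial state x to final state y, derive testable
    # predictions, and accept a candidate only if every prediction matches.
    def accept(f, observations, tolerance=0.01):
        """observations: list of (x_i, y_i) pairs measured from the world."""
        return all(abs(f(x) - y) <= tolerance for x, y in observations)

    # two made-up hypotheses about free fall (distance fallen after t seconds)
    hypotheses = {
        "linear": lambda t: 9.81 * t,
        "quadratic": lambda t: 0.5 * 9.81 * t ** 2,
    }

    observations = [(1.0, 4.905), (2.0, 19.62), (3.0, 44.145)]

    for name, f in hypotheses.items():
        print(name, "accepted" if accept(f, observations) else "rejected")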


There is something very fundamental about the scientific method. Science relies on experiments that ground it in physical reality, which does still exist.


That's actually not true. There is something fundamental about the scientific method. Namely, it's the learning algorithm; the only one that actually works (so far). Everything else is just variations on or approximations of this algorithm.

When you build a model and then do science (i.e. Hypothesis/Test/Explain loop) on the model instead of doing science on Nature, you learn a lot about your model. But you must be honest and admit that you have learned nothing about Nature. What you might have done is come up with hypotheses. But it's not enough. It's not science yet - it's just a complicated hypothesis. It doesn't count until you test it on Nature.

The point (and it's an obvious one, well-understood by most complexity researchers) is that your models need to enable you to falsify them by doing real, repeatable experiments on Nature, not on models. There's no point building a model of a complex system if the only experiment that can falsify it is so complex that it can't be done.

A lot of time is wasted this way - in many ways it's the nature of the beast (it's called "complexity" for a reason). Nonetheless, this is frustrating to scientists who think the resources spent on learning facts about models might be better spent on learning facts about Nature - the more radical among them likely willing to trade all the modelling (and modelers?) for even a single properly verified fact.


when you train a computational model on half of your dataset and test it against the other half (which the author claims is Not Sufficiently Scientific), how is that qualitatively different from training it on all your data, and then "testing it on Nature" by gathering some more and checking your results? i am reminded of searle's chinese room argument.

> this is frustrating to scientists who think the resources spent on learning facts about models might be better spent on learning facts about Nature - the more radical likely being willing to trade all the modelling (and modelers?) for even a single properly verified fact.

there is no getting rid of models. neither science nor any other human activity i am aware of is capable of verifying facts about the physical world.


>when you train a computational model on half of your dataset and test it against the other half (which the author claims is Not Sufficiently Scientific), how is that qualitatively different from training it on all your data, and then "testing it on Nature"

it would be essentially the same thing if you took half the data, made a model, tested it on the other half, got good results, and stopped.

but that's not going to happen. your first model will stink. you'll refine it and try again. and you'll keep doing that. unfortunately, this means your model ends up being dependent on all the data.

the easiest way to prevent this is to build a model and then test it on real-world data that occurs after you build the model. if it doesn't work, tweak the model, and then find new data again.
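a rough sketch of the difference, in python (illustrative only; numpy's polyfit stands in for whatever model-fitting you actually do, and the data-gathering function is made up):

    import numpy as np

    rng = np.random.default_rng(0)

    def gather_data(n):
        """stand-in for an experiment: noisy observations of an unknown process."""
        x = rng.uniform(0, 10, n)
        return x, 3.0 * x + rng.normal(0, 1, n)

    # fit on half the data, score on the held-out half
    x, y = gather_data(200)
    coeffs = np.polyfit(x[:100], y[:100], deg=1)              # the "model"
    holdout_err = np.mean((np.polyval(coeffs, x[100:]) - y[100:]) ** 2)

    # the trap: every time you look at holdout_err and tweak the model,
    # the held-out half quietly becomes training data too.

    # the fix described above: freeze the model, then gather data that
    # did not exist when the model was built, and score against that.
    x_new, y_new = gather_data(100)
    new_err = np.mean((np.polyval(coeffs, x_new) - y_new) ** 2)
    print(holdout_err, new_err)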


> your first model will stink. you'll refine it and try again. and you'll keep doing that. unfortunately, this means your model ends up being dependent on all the data.

this is exactly what the scientific community as a whole is always doing.


A learning algorithm creates a specialist filter, usually in classification problems, that can discriminate between the various classes.

Just because you didn't manually tune each and every weight in the classifier doesn't mean that its behaviour is not solidly governed by scientific principles.
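As a toy illustration (my own sketch; this is just the textbook perceptron, nothing specific from this thread): the weights are never set by hand, yet every update follows a fixed, fully specified rule, so the resulting classifier is entirely reproducible and analysable.

    # Toy perceptron: the weights are learned rather than hand-tuned, but the
    # update rule that produces them is explicit and reproducible.
    def train_perceptron(samples, labels, epochs=20, lr=0.1):
        w = [0.0] * len(samples[0])
        b = 0.0
        for _ in range(epochs):
            for x, target in zip(samples, labels):
                pred = 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0
                err = target - pred                  # 0 when the prediction is right
                w = [wi + lr * err * xi for wi, xi in zip(w, x)]
                b += lr * err
        return w, b

    # made-up, linearly separable data: label is 1 when x0 + x1 > 1
    samples = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (2.0, 2.0)]
    labels = [0, 0, 0, 1, 1]
    print(train_perceptron(samples, labels))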

Science and the application of the scientific method are what got us there in the first place, and have given us the keys to understanding these systems. Genetic Algorithms are another area where you could easily be tempted to think that you are not doing science; the same thing happens there.

Science is the key to understanding. The scientific method is the ultimate arbiter between what we know to be true, and what we cannot prove to be either false, fantasy, or an outright lie.

I feel that you think these computational models somehow supplant the scientific method, but you are forgetting one crucial bit here: those models themselves will need to be understood, and only the scientific method will allow you to do so.


> How is that qualitatively different from training it on all your data

It's different for the simple reason that to really verify a hypothesis, you don't just test it on the other half of the same data. That's just step 1.

What you need to do is work out the further implications of your hypothesis, and test those. When doing so, one of four things will happen:

1) You remain an astronaut. In other words, your work is so far out that none of its implications falsify or corroborate any existing theories, and all require their own experiments. This can be good (i.e. ground-breaking work), but for your hypothesis to become accepted as anything other than an interesting diversion, you will need to wait for the rest of science to catch up, or to keep building out implications until you can connect what you're doing to existing theories.

2) The implications of your hypothesis directly corroborate an existing theory. This is useful, but a bit boring. It means you've basically discovered yet another implication of an existing theory. Experiment some more, and then publish.

3) The implications of your hypothesis contradict existing hypotheses or theories. This is exciting! This is where the real science lies. Now you get to design a series of experiments to figure out who is right, who is wrong, and why. You debug your hypothesis and/or the existing science 'til it works, possibly overturning established wisdom along the way.

4) A combination of (2) & (3) occur. Woah! This is really exciting. You've discovered that the implications of one existing theory falsify those of another existing theory, uncovering a fundamental flaw in our understanding so far. Well done! You've got a lifetime of debugging ahead of you, but your contribution is likely to be truly important.

> there is no getting rid of models.

True; and models are very useful. But there's a big difference between models used as an explanatory / predictive tool (e.g. Newtonian physics), and models used as a substitute for Nature when doing experiments (e.g. what's done in Computational Neuroscience, my field). The latter attempts to duck the issue of doing real experiments that are complicated / expensive by running the experiment on a computer.

Now, this is certainly useful - but to be science, it must be accompanied by a solemn understanding that "What you learn here don't mean jack." It's just a model, and probably a bad one. God doesn't use 64-bit floating point. The only possible use is to help you design a better experiment to try in the real world. But if you never get to doing that, then you've wasted everybody's time, because you got stuck on step 0 (Hypothesize).


i realized that i misread this original post. you are saying that what you do (study the brain) is more important than what some other scientists do (empirically study complex functions). i think this is kind of impolite for scientists to do in public, though of course, everyone makes jokes about it in their own company. anyway, after rereading, the only positive statement i take issue with is this:

> There is something fundamental about the scientific method. Namely, it's the learning algorithm; the only one that actually works (so far). Everything else is just variations on or approximations of this algorithm.

i believe that what i call the scientific method is merely a variation on/approximation of bayesian inference.
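to make that less abstract, here is a toy sketch of what i mean, in python (made-up numbers; two candidate hypotheses about a coin):

    # bayesian updating over two candidate hypotheses, as a stand-in for
    # "hypothesize, predict, test, accept": each observation reweights the
    # hypotheses, and "accepting" one is just the posterior piling up on it.
    priors = {"fair": 0.5, "biased": 0.5}        # made-up prior beliefs
    p_heads = {"fair": 0.5, "biased": 0.8}       # P(heads | hypothesis)

    observations = [1, 1, 0, 1, 1, 1, 0, 1]      # 1 = heads, 0 = tails

    posterior = dict(priors)
    for obs in observations:
        for h in posterior:
            posterior[h] *= p_heads[h] if obs == 1 else 1 - p_heads[h]
        total = sum(posterior.values())
        posterior = {h: v / total for h, v in posterior.items()}

    print(posterior)

on this view, the scientific method's "accept the hypothesis if all N predictions come out right" is just the special case where the data drive the posterior for the losing hypotheses toward zero.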


> new, more powerful learning algorithms.

Could you enumerate those?


here is a course i have been working through that describes some learning algorithms: http://courses.csail.mit.edu/6.867/lectures.html i also like this book: http://www.amazon.com/All-Statistics-Statistical-Inference-S...


Learning algorithms and statistical inference do not in any way supersede or invalidate the scientific method; they are products of the scientific method.

They are very interesting though, that's for sure.


the products of a thing often supersede it.

in this case, the development of inference is certainly a consequence of the development of the scientific method. but the validity of the latter is a mathematical consequence of the validity of the former. bayesian inference is more fundamental.

another good book that informs my personal views on this matter: http://www.amazon.com/Probability-Theory-Logic-Science-Vol/d...


@hc: This is so true. In my opinion "scientific method" is actually undefined. Just checked Wikipedia to confirm that "scientific method" itself has become a bloated academic field with countless versions; it is not used by practicing researchers.

http://en.wikipedia.org/wiki/Scientific_method


The scientific method is alive and well. According to that same article you linked above, these are the steps in 'doing science' following 'a scientific method' (of which there are many, but they all share the same basic characteristics):

   1. Define the question

   2. Gather information and resources (observe)

   3. Form hypothesis

   4. Perform experiment and collect data

   5. Analyze data

   6. Interpret data and draw conclusions that serve as a starting point for new hypothesis

   7. Publish results

   8. Retest (frequently done by other scientists)

So, even though the variety of fields and the scope of problems mean we can not use a single methodology to give us all our knowledge (that would be so convenient), we have come up with a series of steps that, if you adhere to them, will give you meaningful results.


Yes, this is my point. Is there a "The Hacking Method"? I believe not. There may be lists such as "7 good habits of the best hackers", but not a Biblical commandment defining one method to be the true computer programming.

The idea of an absolute sacrosanct Scientific Method is contrary to what scientific research is about. It defines science to be an academic method.

So, assume that I have my personal system of doing research. I don't define the problem. I don't form formal hypotheses; I don't perform experiments (formal academic exercises), instead I use trial and error; I take frequent strategic naps to develop my ideas; I analyze the idea any way I like; I don't publish my results, instead I note my results meticulously in my lab books. I don't care at all about academic rules and ignore academics and their science fetish... instead I make discoveries. I ignore the paperwork and career considerations... I make discoveries. You recognize Edison's method. Are you saying that what I am doing is not science because I did not follow your step-by-step "scientific method"?

The scientific method is an attempt to brand academic physics as science and the rest as stamp collecting. This marketing gimmick actually has been working. People associate academic physics with science.


If you just make discoveries, it's not science, no. If you communicate them widely in a way that makes your discoveries replicable by anyone of sufficient training and interest, then yes, it is science. It's just a word! Its meaning is a combination of discovery, openness, and total honesty. As long as you're more or less doing those things ... you're more or less doing science.



