Hacker News new | past | comments | ask | show | jobs | submit login
P vs. NP for Dummies (scottaaronson.com)
97 points by bdr on Aug 15, 2010 | hide | past | favorite | 36 comments



I really like the idea of introducing difficult problems like these to outsiders (I'm an outsider in this instance) for the sole reason that it greatly increases the odds that someone may come out of left-field and either solve it or at least provide an insight that helps solve it.

Speaking of breaking down complex ideas with analogies, I'm reminded of this Wired article [1] which examines the inhibition of breakthroughs caused by experts' cognitive biases:

Modern science is populated by expert insiders, schooled in narrow disciplines. Researchers have all studied the same thick textbooks, which make the world of fact seem settled. This led Kuhn, the philosopher of science, to argue that the only scientists capable of acknowledging the anomalies — and thus shifting paradigms and starting revolutions — are “either very young or very new to the field.” In other words, they are classic outsiders, naive and untenured. They aren’t inhibited from noticing the failures that point toward new possibilities.

Or a more specific observation from one of the cited studies in the article:

[A diverse lab] mulled the problem [of unexpected protein synthesis results] at a group meeting. None of the scientists were protein experts, so they began a wide-ranging discussion of possible solutions. At first, the conversation seemed rather useless. But then, as the chemists traded ideas with the biologists and the biologists bounced ideas off the med students, potential answers began to emerge. “After another 10 minutes of talking, the protein problem was solved,” Dunbar says. “They made it look easy.”

[1] http://www.wired.com/magazine/2009/12/fail_accept_defeat/all...


"I really like the idea of introducing difficult problems like these to outsiders (I'm an outsider in this instance) for the sole reason that it greatly increases the odds that someone may come out of left-field and either solve it or at least provide an insight that helps solve it."

Really? I kinda doubt this happens often (or even at all) with hard Mathematical problems like P vs NP. We're talking about problems that tend to have so much history and theorizing about them that it would take enormous effort just to understand what is known about what's hard.

Does anyone know of any statistics on Mathematical problems being solved by "outsiders"?


I guess I could have emphasized the "problems like this" and the "at least provide insight that helps solve" parts, as I wasn't suggesting that some amatuer will necessarily come along and completely solve this problem. And sure, I agree that in specific instances in which the problem is purely mathematical, outsiders probably rarely have much impact.

That being said, it was my understanding that even the most recent paper posted here causing all this stir over the P/NP problem, was using statistical physics, which does indeed make it a somewhat "out of left-field" approach not used by the domain experts that have been traditionally studying this problem.


The better justification is just: It's nice to tell people about hard problems. We do not have to expect them to solve them for us.


I think it is unfair to look at this in the context of the P=NP problem. I think there is a lot of good that can come from "outsiders" looking at problems from their view. If you leave P=NP for what it is (a REALLY hard mathematical proof) and take a look at some of the other unsolved mysteries, this outside-the-box thinking could be a huge asset.


Thank you for this. I've been trying to figure out the "big picture" of this, and it is helpful to have someone explain it who understands the minute details.


P vs NP is basically talking about problems where you can checking each possibility in a reasonable amount of time, but the number of possibility's keeps growing faster as the numbers get larger.

The traveling salesman is the easiest to understand, let's say you want to visit 11 city's which route is the fastest? It's easy to find out how long a route takes just add 10 numbers (the travel times between each city). However, there are basically 11! (11 * 10 * 9 * 8 ...) paths to look at. Add a new city, and there are suddenly 11 times as many options and each of them take 10% longer to compute (you add 11 numbers for each path vs 10). Going from 1001 to 1002 city's gives 1001 times as many routs, but calculating each route only takes 0.1% longer to compute (add 1000 vs 1001 numbers).

The only real hope if you want the perfect solution is for there be some secret that let's you eliminate many options without checking them. The simple truth is you can skip some steps aka, if your looking for the fastest trip though every major city in the US, you can ignore any path that starts with a longer trip than a complete path. AKA if you now of a route that visits every city in 21,357 miles you and you take longer than that to visit 80% of the city's you can ignore all paths that start that way. But, what are the limits on those improvements? What is the fastest possible approach?

PS: There are many options that can give a reasonably good solution, EX: if a computer picks the best random path from 1 million possible paths then it's probably better than 99.99% of the options. You can then take the best random paths and try to improve them by trying simple manipulations (swapping city's etc).


In retrospect, the traveling salesman problem was actually an impediment for me to understand the P vs. NP question. Reasons:

1) Formally, P vs. NP is presented as a binary YES/NO question rather than a "what is the smallest number such that <something>"? Only after you grasp the Y/N format of the question are you shown the trick where you can convert any problem in FNP into NP by converting it into a binary Y/N question with the help of a binary search. If you are presented initially with the TSP as the canonical P/NP problem, it is unclear for a while how it fits into the formal presentation

2) (If my understanding on this point is incorrect, I will be glad if someone disabuses me of my misconception): Sometimes, P vs. NP is presented as "problems to which it is harder to find the solution than to verify a solution once it's given". TSP is not actually one of those problems: if someone hands you a possible solution, the process for verifying that it is indeed correct is equivalent to finding it in the first place.


1) Agreed.

2) You're right; TSP is NP Hard, not NP Complete.

But if you're talking about the YES/NO version, then it is defined (ala Wikipedia) like this: "given the [graph and] costs and a number x, decide whether there is a round-trip route cheaper than x". This problem is NP Complete; it's easy to see that verifying in this case just means summing the weights and seeing if they are below `x`.


A version I was told to think about was competing travel agents. At each step they need to find a better solution or know when to stop looking. It's still not quite there "given the [graph and] costs and a number x, decide whether there is a round-trip route cheaper than x" but if you are handed a graph you can say it's shorter and then give it as the answer. The difference is there may be approaches that prove there is a shorter path, without finding it.

However, the mental model is vary close to the generic solution because someone can hand you the 2nd best solution which looks nothing like the best solution and you still need to find it. Where if you are handed the best solution they may be a shortcut to proving it's the best.

PS: The Clique problem always seemed like the version where shortcuts seemed most obvious. But, understanding why they fail is a larger mental jump IMO than NP Hard TSP to NP Complete TSP.


About 2): If somebody gives you the minimal solution and some additional information, you can verify it in P. The size of the additional information is no more than polynomial in the size of the problem.


fwiw, I found this http://claymath.msri.org/pversusnp.mov (warning: video/.mov) very lucid. She (Dr Ramachandran) starts from some basic problems and leads us to the problem P vs NP.

As a self professed 'dummy', I would recommend it anyone who wants to understand the gist of P vs NP



An even simpler explanation:

There are many problems of the form of several sets of options, where a possible solution to the problem represents a particular choice for each possible option. An example might be a diet where there is a list of acceptable foods (each with a particular calorie value) and a maximum calorie recommendation for each day, the problem being choosing which foods to eat in a particular day. A characteristic of these types of problems is that they are relatively "easy" to check the answers. It's easy to determine if a particular set of food choices adds up to under or over the maximum recommendation for the diet. If you were to tag each food option with a "deliciousness" value then you could also total up the deliciousness index of a daily set of food choices and judge them relative to each other.

Some of these problems are such that the total number of possible solutions grows extraordinarily fast relative to the "size" of the problem (e.g. number of possible food choices). So fast that for relatively "small" problem sizes (perhaps only 100 or so food choices) it becomes fundamentally impossible to compute every single possible solution to the problem (even if you had a computer that ran as fast as the laws of physics allowed and was the size of the entire known universe). More so, some of these problems are such that there is no way to come up with an exact solution that is faster than being forced to do a brute force check of all or many of the possible solutions.

Now, this doesn't mean that (depending on the problem) you can't generate quite good approximations to the most optimal solution to the problem by using shortcuts, but exact solutions are impractical.

As it turns out, many of these types of super-hard problems are closely related to each other. Such that you can translate one variety of problem (such as how to travel through several cities in the least amount of total distance or how to schedule a large number of meetings between a large number of people using a fixed number of meeting rooms) into another variety. Which would mean that if you had a shortcut to "easily" (i.e. practically) compute a solution to one you could come up with solutions to all of the other classes of problems as well, merely by couching those other problems within the frame of the problem you had "solved".

This is the essence of P vs. NP. P is the class of problems where the cost of computing an exact solution doesn't necessarily grow too fast relative to input sizes to be impractical with real-world computers. NP is the class of problems that are equivalent to P problems if you happen to have a magical computer which could evaluate and compare any number of possible solutions simultaneously. Naturally, NP will include all of the P problems, so NP-complete problems are taken to be the set of problem for which only the magical computer would be suitable.

The question is whether or not there is actually a difference between P and NP problems. As mentioned, all NP-complete problems are essentially equivalent, if you come up with a way to turn one of them into a P-problem then you can turn all NP-complete problems into P-problems. However, these problems are so difficult to grapple with that we don't know whether this is possible or impossible. Currently we believe that it is impossible, that there are no ways to come up with exact solutions to NP-complete problems except through trying many or all of the possible solutions (making exact solutions impractical or impossible, depending on the problem size).

If P was equal to NP then it's possible that there would be some consequences (some good and some bad). It might be possible to come up with exactly optimal network routing, for example. But it might also be possible to crack public key cryptography.

Proving that P was not equal to NP would have some consequences as well. It would mean that we couldn't ever hope for anything better than approximate solutions for some problems (e.g. routing and scheduling) and public key cryptography might be generally reliable (with a large enough key size).


Thanks to everyone who helped break it down. I had no idea my comment would actually get this level of a response.


Maybe these "dummies" would benefit more by learning about social media or python programming.

Why fixate on this particular problem as opposed to learning all sorts of other things that are more useful?


"Useful" is a particularly subjective word, and shouldn't necessarily be the metric by which we measure the opportunity cost of our time.

Many advances in the world that you and I take for granted are the direct result of pursuit of knowledge for its own sake. When Einstein developed relativity, he wasn't trying to figure out how to sync satellites to make GPS more accurate (and obviously satellites didn't exist at the time). But if that theory was never developed, we wouldn't have all the luxuries that GPS provides for us. That is just a trifling example of how the quest for deeper understanding of fundamental knowledge has made the world a better place.

Lots of people on hacker news love writing code. That's great, seriously. Looking down on others because their interests aren't as purportedly "useful" as yours is pretty arrogant and shortsighted.


When Einstein developed relativity, he wasn't trying to figure out how to sync satellites to make GPS more accurate (and obviously satellites didn't exist at the time).

These "dummies" have essentially zero chance of making any meaningful contribution to the P vs NP problem.


It is said that Wiles first came across the statement of Fermat's Last Theorem in a popular math book. Who knows who might read this article/item and possibly start working on it.

More, people might read and understand this, and then pass it on to students as one of the current great mysteries in math and science.

Who knows where it might go? To think otherwise is amazingly short-sighted.


So they shouldn't be interested in learning about it?


But why this as opposed to a whole host of other things that they might learn instead?

I think people should think more carefully about how they spend their time.


If you're being sincere, I suggest that this is the generally-held position on this thread of discussion:

A) A lot of people are immensely fascinated by questions in theoretical science and mathematics, including P=NP.

B) A lot of people enjoy learning about things regardless of obvious utility. Personally, my choice of topics to spend time studying has nothing to do with whether it will pay off for me at some point. It's a pleasant luxury of modern life that many people can afford to do that.

C) If you are in camps A and B, it seems extremely arrogant and insulting when you suggest "Why this?" and "I think people should think more carefully about how they spend their time." Since you seem to have no perception of why anyone is fascinated by this work, it appears completely nonsensical to suggest these things to communities populated with a bunch of people who are clearly interested in it, or to readers of a big blog post discussing it.

If you really have something to say about points A or B, or something else, I wish you would say it instead of wrapping it in these bizarre, judgemental insults.


Basically this interest in P!=NP is like yielding to authority.

I would like to decide for myself what's worth studying -- not have it dictated to me.

Having said this, of course anyone who has an interest in computing should know what the P vs NP problem is about, but learning enough to be able to follow proof strategy for P vs NP is another matter.


I think the interest in P=NP specifically is for two reasons besides authority:

- Anyone with a little bit of CS or programming background can understand the formulation of the problem, similar to how anyone can understand Fermat's Last Theorem.

- There are only so many open problems that anyone can understand. Open problems are interesting to casual observers since they represent a boundary between the known and unknown, and since they invite the imagination to think up ways to attack them.

Once you've invested yourself a little bit in the problem for these reasons, I think it's natural curiosity to want to spend twenty or thirty minutes, or more, learning about the outline of a novel proof strategy.

That said, this article was a general introduction to what the problem is about, not a description of the proof.


> I think people should think more carefully about how they spend their time.

I'm surprised that no one has yet commented on the irony of this statement. You are implying that you thought carefully about how to spend your time and decided that a good way to spend it would be telling other people they don't think carefully enough about how to spend their time. Indeed... much better to spend your time arguing on the Internet than learning something new.


Who are you to say how I should spend my time?


youre so full of it -- you say this at the same time you have been karma whoring by submitting comments from liptons blog posts on P vs NP to this website, so which one is it, is the problem interesting and newsworthy or not?


Do Not Feed the Troll


It was my comment that inspired this; and I wasn't trolling anyone.


"The short answer is: the biggest unsolved problem of theoretical computer science, and one of the deepest questions ever asked by human beings!"

presumably, 'reading' is one of those more useful things that are worth learning, as well.


But assuming P!=NP, it's resolution will probably have very little impact on the world.


That's (again, probably) not true, although none of the recent flurry of blog posts have explained why not (possibly because it's hard to make this convincing if you're not familiar with mathematical research).

The point is that solutions to such problems as this do not exist in a vacuum: even with the million-dollar incentive for solving the particular stated problems we do not expect that they will be solved without building a network of related sub-theories and sub-theorems, reusable tools which will also help with solving future problems which we don't yet even know enough to formulate.

Fermat's Last Theorem was a good example of this: nobody cares about sums of nth powers, but everybody cares about the Mordell conjecture, Taniyama–Shimura–Weil conjecture etc. which were proved along the way. FLT itself is just a line in the sand to illustrate how far we've come since 1637.


Is it likely that a proof of P!=NP will give insight into designing more efficient algorithms? Or mostly insight into proving other lower bounds?


Perhaps we could get new upper bounds out of it. Or---perhaps the eventual proof will establish a link between hitherto unrelated areas of mathematics. This link could be used to transform whole classes of theorems and proof structures / ideas from one field to another.


I think his argument is that such a proof would represent a significant advance in human knowledge. At its time of origin, the special theory of relativity didn't give anyone nicer cars, either.


I agree, except that it would free up brainpower to work on other problems, and that it would allow more confidence in cryptosystems that rely on an NP-hard problem being exponential time (but I don't think factoring a large product of two primes like could break RSA, is NP-hard).

A resolution P=NP would likely have a huge practical effect, but most people don't expect that to happen.




Consider applying for YC's Spring batch! Applications are open till Feb 11.

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: