Since I'm a CS theory person, I can offer some theoretical improvements on the running time and make the problem even more general...
Instead of minimizing the linear difference of the partition, we might want to minimize the standard deviation, or essentially any convex cost function, and still do it in the same time bound.
One can reduce this problem to finding a minimum-weight k-edge path on a complete DAG. The naive algorithm runs in O(kn^2), but we can improve the running time to O(kn) by realizing that the weights on this DAG have the Monge property. This is very practical to implement.
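To make the reduction concrete, here is a sketch of the naive O(kn^2) dynamic program over that complete DAG (the O(kn) Monge/SMAWK speedup replaces the inner minimization but keeps the same recurrence). The squared deviation from the ideal part sum is just an illustrative convex cost, my assumption, not the only choice:

```python
def partition_cost(items, k):
    """Min total cost of splitting items into k contiguous parts,
    i.e. a minimum-weight k-edge path on the DAG over positions 0..n."""
    n = len(items)
    prefix = [0] * (n + 1)
    for i, x in enumerate(items):
        prefix[i + 1] = prefix[i] + x
    ideal = prefix[n] / k  # target sum per part

    def w(i, j):
        # weight of DAG edge (i, j): cost of one part covering items[i:j];
        # any convex cost of the segment sum works here
        return (prefix[j] - prefix[i] - ideal) ** 2

    INF = float("inf")
    # dp[e][j] = min weight of an e-edge path from node 0 to node j
    dp = [[INF] * (n + 1) for _ in range(k + 1)]
    dp[0][0] = 0.0
    for e in range(1, k + 1):
        for j in range(e, n + 1):
            dp[e][j] = min(dp[e - 1][i] + w(i, j) for i in range(e - 1, j))
    return dp[k][n]
```

The Monge property says w(i, j) + w(i', j') <= w(i, j') + w(i', j) for i <= i' <= j <= j', which makes the row minima of each DP layer monotone and lets SMAWK find all of them in O(n) per layer.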
In this application k is very large, and n is just a constant multiple of k. There is a theoretically better algorithm that takes n·2^O(sqrt(log n loglog n)) time, which is n^(1+o(1)), almost linear. I doubt it will ever be implemented with speed comparable to the O(kn) solution. See http://www.cs.ust.hk/mjg_lib/bibs/DPSu/DPSu.Files/sdarticle_...
I shall post a solution tomorrow since I'm currently touring NYC with my gf...
What is a complete DAG? Can I create one by just taking a complete graph, choosing an arbitrary ordering of the vertices, and assigning each edge a direction according to that ordering?
Edit: after reading the problem statement, yes, I can.
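That construction is a one-liner; a sketch of it, with the arbitrary ordering taken to be the vertex indices themselves:

```python
def complete_dag(n):
    """A complete DAG on n vertices: pick an ordering of the vertices
    (here, 0..n-1) and direct every edge from lower index to higher."""
    return [(i, j) for i in range(n) for j in range(i + 1, n)]
```

Any topological ordering of the result recovers the ordering you started from, which is why the graph is guaranteed to be acyclic.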
Yes, that's what 90% of the comments on the article said, too. Welcome to the club. :P
He's talking about people's everyday understanding of data structures, including his own. That the data structure he described already existed when he banged it out is beside the point; most folks don't know about it, most CS programs don't teach it, and most software doesn't use it.
That's sort of like saying, "Well, all special relativity does is show Einstein's ignorance of Lorentz's work."
It's true in some sense, but ultimately an unproductive sentiment.
I think that knowing your audience goes a long way, particularly when using "You're doing it wrong" in the title.
The audience of this article included a lot of people who knew of the problem, and knew of the research into cache-oblivious algorithms. This goes well beyond not citing somebody; it's preaching to people as if they're ignorant when they're a fair ways ahead of you. Quite annoying to encounter.
I see his point; I'm not attacking that. What I'm not happy about is that he didn't do enough research on the topic to find more recent results, or talk with a real algorithms professor before writing up this entire article. One line could have changed those 90% of the comments, just one line like "This kind of problem is tackled by the field of cache-oblivious data structures."
I really wish the theory and applied side split off into separate departments or divisions just like how math and applied math go their separate ways.
Cache-conscious data structures are more efficient than cache-oblivious data structures. While the latter are interesting from a research perspective, in practice the cache sizes are known and can be optimized for.
Would you use a cache-oblivious data structure that's 10x slower than its regular counterpart? Cache sizes don't change every day; I don't see the point.
In general there are lots of data structures that are wonderful on paper but whose constant factors make them infeasible in practice.
In programming languages, different paradigms differ as much as different fields in math do. So a more concrete analogy would be jumping between paradigms: someone who has used an imperative language like Java their entire life having to learn to read a Haskell program.
You are restricted by the set of tools you already have, right?
I can't believe how many times a great theoretical advance in some other field comes from pure math (or from applying pure mathematical results). In economics, say, if Nash hadn't applied topology, how would anyone know a Nash equilibrium exists? Stable matching, compressed sensing, and others also come to mind.
I have a hypothesis that any field would benefit if we introduced some pure mathematicians to it.
To be completely honest, the more common case is that the mathematicians follow behind physicists and engineers to verify that their applications are theoretically sound. But yes, great breakthroughs do come from pure theory.
I don't know if there is a 'pitch' concerning pausing college for a semester. I think the same thing applies as for those people (like me) who quit their programming jobs to come to hacker school:
I've learned more in the last 2 months than I have in the last year at my job. Pair programming with people who are better than you in a specific skill is an incredible way to learn. If you love programming and want to become a better programmer by spending time doing what you love with other people who are driven by the same passions, come to hacker school. You have found your space.
I guess it depends on how one sees something as having value.
There are mathematicians who consider only research-level mathematics to have value. And there are subfields of mathematics, like algebraic geometry, that take a math grad student years even to learn the language, and even longer to produce a research paper of value.
Sometimes people have long lists of strings whose total length is more than 255 characters. frak would be a nice tool for compression.
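frak itself is a Clojure library, but the idea it implements, merging the strings into a trie and emitting a regex that shares common prefixes, can be sketched in a few lines. This is my own illustrative version, not frak's actual API or output format:

```python
import re

def trie_regex(words):
    """Build a prefix-sharing regex matching exactly the given words."""
    # Insert every word into a trie; "" marks end-of-word.
    trie = {}
    for w in words:
        node = trie
        for ch in w:
            node = node.setdefault(ch, {})
        node[""] = {}

    def emit(node):
        end = "" in node  # a word may terminate at this node
        branches = [re.escape(ch) + emit(child)
                    for ch, child in sorted(node.items()) if ch != ""]
        if not branches:
            return ""
        body = branches[0] if len(branches) == 1 \
            else "(?:" + "|".join(branches) + ")"
        # If a word ends here, everything deeper is optional.
        return "(?:" + body + ")?" if end else body

    return emit(trie)
```

For ["cat", "car", "card"] this produces a pattern like "ca(?:r(?:d)?|t)", sharing the "ca" prefix instead of repeating it three times, which is where the compression comes from.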