Mgccl's comments | Hacker News

Google Analytics has a filter system that allows one to use regular expressions of length at most 255.

Sometimes, people have long lists of strings whose total length exceeds 255 characters. frak would be a nice tool for compressing them.
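frak itself is a Clojure library, but the core idea, factoring shared prefixes through a trie, is easy to sketch. Here is a minimal Python illustration of that idea (the function name and output format are mine, not frak's):

```python
import re

def trie_regex(words):
    """Build a compact regex matching exactly the given words by
    sharing common prefixes via a trie (the idea behind frak)."""
    END = object()  # sentinel marking "a word ends here"
    trie = {}
    for word in words:
        node = trie
        for ch in word:
            node = node.setdefault(ch, {})
        node[END] = {}

    def emit(node):
        ends_here = END in node
        branches = [re.escape(ch) + emit(child)
                    for ch, child in node.items() if ch is not END]
        if not branches:
            return ""
        body = branches[0] if len(branches) == 1 else "(?:%s)" % "|".join(branches)
        # if a word can also end at this node, the remainder is optional
        return "(?:%s)?" % body if ends_here else body

    return emit(trie)
```

On an input like `["foo", "food", "foul"]` this produces `fo(?:o(?:d)?|ul)`, which is shorter than the plain alternation `foo|food|foul`, so more strings fit under a 255-character budget.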


Since I'm a CS theory person, I can offer some theoretical improvements on the running time and make the problem even more general...

Instead of minimizing the linear difference of the partition, we might want to minimize the standard deviation, or essentially any convex function, and still do it in the same time bound.

One can reduce this problem to finding a k-edge path of minimum weight in a complete DAG. The naive algorithm runs in O(kn^2) time, but we can improve the running time to O(kn) by realizing that the weights on this DAG have the Monge property. This is very practical to implement.

I posed it as a problem on daily haskell exercise http://dailyhaskellexercise.tumblr.com/post/58060450750/the-....

In this application, k is very large, and n is just a constant multiple of k. We can use a theoretically better algorithm that takes n·2^O(sqrt(log n log log n)) time (this is n^(1+o(1)), i.e. almost linear). I doubt it will ever be implemented with speed comparable to the O(kn) solution. See http://www.cs.ust.hk/mjg_lib/bibs/DPSu/DPSu.Files/sdarticle_...

I shall post a solution tomorrow since I'm currently touring NYC with my gf...
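In the meantime, here is a hedged Python sketch of the naive O(kn^2) DP described above, specialized to partitioning a sequence into k contiguous blocks under a convex cost (I picked squared deviation from the average block sum; all names and the particular cost are my own choices). The O(kn) version would instead exploit the Monge property to do each relaxation round in O(n):

```python
def block_cost(prefix, i, j, target):
    """Convex cost of a block covering items i..j-1: squared
    deviation of the block sum from the target block sum."""
    return (prefix[j] - prefix[i] - target) ** 2

def min_k_partition(xs, k):
    """Partition xs into k contiguous nonempty blocks minimizing total
    squared deviation from the average block sum. This is exactly the
    min-weight k-edge path on the complete DAG over cut points 0..n,
    computed with the naive O(k n^2) dynamic program."""
    n = len(xs)
    prefix = [0] * (n + 1)
    for i, x in enumerate(xs):
        prefix[i + 1] = prefix[i] + x
    target = prefix[n] / k
    INF = float("inf")
    # dp[j] = min cost of covering the first j items with the rounds used so far
    dp = [0.0] + [INF] * n
    for _ in range(k):  # one round per block (edge on the path)
        new = [INF] * (n + 1)
        for j in range(1, n + 1):
            new[j] = min(dp[i] + block_cost(prefix, i, j, target)
                         for i in range(j))
        dp = new
    return dp[n]
```

For example, `min_k_partition([1, 2, 3, 4], 2)` splits as `[1, 2, 3] | [4]`, giving cost (6-5)^2 + (4-5)^2 = 2.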


What is a complete DAG? Can I create one by just taking a complete graph, choosing an arbitrary ordering of the vertices, and assigning each edge a direction according to that ordering?

Edit: after reading the problem statement, yes, I can.
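For concreteness, a tiny sketch of that construction (assuming vertices are relabeled 0..n-1 according to the chosen ordering):

```python
def complete_dag(n):
    """Complete DAG on vertices 0..n-1: one directed edge i -> j for
    every i < j. Orienting every edge of a complete graph from earlier
    to later in an arbitrary vertex ordering yields exactly this graph
    (up to relabeling)."""
    return {i: list(range(i + 1, n)) for i in range(n)}
```

Every edge goes from a smaller index to a larger one, so the result is acyclic by construction and has all n(n-1)/2 possible edges.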


A solution would be awesome, especially in JavaScript. I actually found a case where their linear algorithm, https://github.com/crispymtn/linear-partition, failed.


See the updated content in the link and the code here: http://www.chaoxuprime.com/posts/2013-08-16-more-algorithms-... Both are in Haskell; you can see how to implement it from scratch.


I thought you would want the most negative cycle, which would be NP-hard to find.


Let's put some algebra/order into this... Bellman-Ford works over any totally ordered group (G,+) whose order is translation-invariant, i.e. a+b <= a+c whenever b <= c.

What the logarithm does is give an order-preserving homomorphism from (R_{>0}, *) to (R, +).
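To illustrate the homomorphism point with a sketch (in Python, names mine): a maximum-reliability path under multiplicative edge weights in (0, 1] becomes an ordinary shortest-path problem after taking -log (the negation reverses the order, turning max-product into min-sum), so plain Bellman-Ford applies:

```python
import math

def most_reliable_path_value(n, edges, s, t):
    """Maximum product of edge reliabilities over s -> t paths.
    edges: list of (u, v, p) with 0 < p <= 1.
    Mapping p to -log(p) sends (R_{>0}, *) to (R, +) and reverses the
    order, so the max-product path is the min-sum path in the
    transformed weights; run textbook Bellman-Ford on those."""
    INF = float("inf")
    dist = [INF] * n
    dist[s] = 0.0
    for _ in range(n - 1):  # standard Bellman-Ford relaxation rounds
        for u, v, p in edges:
            w = -math.log(p)  # additive weight, nonnegative since p <= 1
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
    # map the additive distance back through the homomorphism
    return math.exp(-dist[t]) if dist[t] < INF else 0.0
```

For example, with edges (0,1,0.5), (1,2,0.5), (0,2,0.2), the best 0 -> 2 path goes through 1 with reliability 0.25, beating the direct 0.2 edge.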


Good article, but it only demonstrates his ignorance of existing cache-oblivious data structures...


Yes, that's what 90% of the comments on the article said, too. Welcome to the club. :P

He's talking about people's everyday understanding of data structures, including his own. That the data structure he described already existed when he banged it out is beside the point; most folks don't know about it, most CS programs don't teach it, and most software doesn't use it.

That's sort of like saying, "Well, all special relativity does is show Einstein's ignorance of Lorentz's work."

It's true in some sense, but ultimately an unproductive sentiment.


I think that knowing your audience goes a long way, particularly when using "You're doing it wrong" in the title.

The audience of this article was largely people who knew of the problem, and knew of the research into cache-oblivious algorithms. This goes beyond failing to cite somebody; it's preaching to people as if they were ignorant when they're a fair way ahead of you. Quite annoying when encountered.


As he says in the comments, he didn't choose the title.


I see his point; I'm not attacking that. I'm not happy that he didn't do enough research on the topic to find more recent results, or talk with a real algorithms professor, before writing up this entire article. One line could have changed those 90% of the comments, just one line like: "This kind of problem is tackled by the field of cache-oblivious data structures."

I really wish the theory and applied side split off into separate departments or divisions just like how math and applied math go their separate ways.


Cache-conscious data structures are more efficient than cache-oblivious data structures. While the latter are interesting from a research perspective, in practice the cache sizes are known and can be optimized for.


True.

IIRC there is only a constant-factor difference in time between the two.


Would you use a cache-oblivious data structure that's 10x slower than its regular counterpart? Cache sizes don't change every day, so I don't see the point.

In general there are lots of data structures that are wonderful on paper but whose constant factors make them infeasible in practice.


I second this.

In programming languages, different paradigms differ as much as different fields in math do. So a more concrete analogy would be someone jumping between programming paradigms: say, someone who has used an imperative language like Java their entire life and must learn to read a Haskell program.


You are restricted by the set of tools you already have, right?

I can't believe how many times a great theoretical advance in a different field comes from pure math (or from applying pure mathematical results). Say, in economics: if Nash hadn't applied topology, how would anyone ever know a Nash equilibrium exists? Stable matching, compressed sensing, and others also come to mind.

I have a hypothesis that any field would benefit if we introduced some pure mathematicians to it.


To be completely honest, the more common case is that the mathematicians follow behind physicists and engineers to verify that their applications are theoretically sound. But yes, great breakthroughs do come from pure theory.


This is not available in New York because state law requires them to have a certain license.


I know who would be perfect for this, except she is still in college, so she would need to drop a semester to do this instead.

Do you have any pitch concerning this?

I assume the summer one doesn't conflict with college, but at those times people usually go for internships.


Disclaimer: I am in the current batch.

I don't know if there is a 'pitch' concerning pausing college for a semester. I think the same thing applies as for those people (like me) who quit their programming jobs to come to hacker school:

I've learned more in the last 2 months than I have in the last year at my job. Pair programming with people who are better than you in a specific skill is an incredible way to learn. If you love programming and want to become a better programmer by spending time doing what you love with other people who are driven by the same passions, come to hacker school. You have found your space.


I guess it depends on how one judges something to have value. There are mathematicians who consider only research-level mathematics to have value. And there are some subfields of mathematics, like algebraic geometry, where it takes years for a math grad student to even learn the language, and even longer to produce a research paper of value.

