
Although this is a valid point in many cases, I don't think this is one of those cases. It's in these "infrastructure" type projects like kernels, compilers, interpreters, and parsers where "micro" optimizations are actually really important.

> But then say you are writing the data to a disk. Well, maybe it doesn't really matter how fast you are decoding the protobuf if next you are sitting there for ages waiting for that data to be written out. That 30usec gain is nothing on top of the 10msec wait time that is coming next, so was that week a good investment if you did it purely for the speed improvement? (You might have done it as a learning exercise, but then speed doesn't really matter.)

haberman's parser (1460 MB/s) outperforms Google's C++ parser (260 MB/s) by more than 5x. Note that even in the disk example, a fast SSD has enough bandwidth that Google's parser, not the disk, becomes the bottleneck. On top of that, this is FOSS, which means his weeks of investment are multiplied every time someone downloads and uses his code.
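To put rough numbers on the bottleneck question: the parser throughputs are from the figures above, but the 3000 MB/s NVMe bandwidth is my assumption, and the helper below is just a sketch of the comparison.

```python
# Back-of-the-envelope: which stage limits a parse-then-write pipeline?
# Parser figures are from the comment; the SSD figure is an assumed
# number for a fast NVMe drive, not a measurement.
GOOGLE_PARSER_MBPS = 260
HABERMAN_PARSER_MBPS = 1460
NVME_SSD_MBPS = 3000  # assumed

def bottleneck(parser_mbps, io_mbps):
    """Whichever side has less throughput limits the pipeline."""
    return "parser" if parser_mbps < io_mbps else "I/O"

print(bottleneck(GOOGLE_PARSER_MBPS, NVME_SSD_MBPS))    # parser-bound
print(bottleneck(HABERMAN_PARSER_MBPS, NVME_SSD_MBPS))  # still parser-bound, by a far smaller margin
```

With these assumed numbers the slow parser leaves most of the SSD's bandwidth idle, which is the point: the "micro" optimization moves the bottleneck.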




> On top of that, this is FOSS, which means his weeks of investment are multiplied every time someone downloads and uses his code.

Excellent point.

Also, I didn't mean to talk specifically about his parser; it was just used as a general example.

It is just that in my experience, engineers (I am guilty too) have a tendency to spend time micro-optimizing without, in the end, making a difference in the overall user experience. For example, stuff like choosing to write a GUI app in C++ when it could have been whipped up in Python in a fraction of the time and lines of code. The menus will open in 10ms instead of 3ms, but maybe it doesn't really matter from the user's point of view.

The same holds for most data that ends up at IO choke-points. Even memory in today's SMP architectures is a choke-point. You can spend time hand-optimizing CPU-bound code only to find out that it ends up waiting on a lock, a disk, a network buffer, or some user input.
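A quick way to see this for yourself is to time the compute step and the IO step separately. The sketch below is illustrative, not a benchmark: the "decode" stand-in is just a buffer copy, and actual numbers depend entirely on your hardware.

```python
import os
import tempfile
import time

def time_us(fn):
    """Rough wall-clock timing in microseconds (illustrative only)."""
    t0 = time.perf_counter()
    fn()
    return (time.perf_counter() - t0) * 1e6

payload = b"x" * (1 << 20)  # 1 MiB of dummy "decoded" data

# Stand-in for the CPU-bound decode step:
decode_us = time_us(lambda: bytes(payload))

# Stand-in for the write the data hits next; fsync forces the
# wait-on-disk the comment describes instead of a buffered write.
fd, path = tempfile.mkstemp()
try:
    write_us = time_us(lambda: (os.write(fd, payload), os.fsync(fd)))
finally:
    os.close(fd)
    os.remove(path)

print(f"decode ~{decode_us:.0f} us, write+fsync ~{write_us:.0f} us")
```

On most machines the fsync'd write dwarfs the copy, which is the profile-before-optimizing argument in miniature.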

Also, micro-optimizations are often not future-proof. Many cache-friendly data structures and algorithms, for example, assume a particular cache line size, or particular characteristics of hardware that just happen to change. Even in the assembly case, today we have 32-bit x86, 64-bit x86, and ARM as common target architectures, each with various levels of SIMD extension support (SSE on x86, NEON on ARM) and other features, so one can spend a lot of time maintaining and tweaking all of them.



