Java IAQ (norvig.com)
75 points by Mitt on April 13, 2012 | hide | past | favorite | 31 comments



Goes directly to "What other operations are surprisingly slow?" and attempts to write a micro-benchmark reproducing the same results.

Fails....

Most of the general stuff is accurate and dandy, but I don't believe you should listen to many (most likely any) of the speed-related statements, as this article appears to have been written in 1998.

The title should be "Java IAQ circa 1998".


I have an IAQ of my own:

Why is it that Java VMs seem to have such different interactive latency compared to other memory-managed language VMs? Is the performance of, e.g., Eclipse or Android or the Java browser plugin something to do with the GC kicking in too aggressively?

Over the last decade I've heard different explanations re: the SWT/Swing/AWT toolkits, and server-side optimizations being the default in desktop VMs, but I'm fairly sure both issues should be resolved by now. Nevertheless, my general experience is that if it's Java-based, it handles touch/mouse events noticeably worse than otherwise.

Maybe .NET and Python and the rest are doing more of their UI work in C?


This has a lot more to do with the GUI toolkits than with the JVM. Python is most often used with bindings for GTK (C), Qt (C++), or Tk (C). I don't know a lot about .NET, but I believe it just links to the standard Windows GUI toolkit, probably also written in native code.

For Java the main issue is that Swing/SWT/AWT suck, and Java is mostly used as a server language, so there is little impetus to fix them.

Eclipse is just bloated; that is why it's often slow. Android has its own custom VM (Dalvik), but I imagine the majority of the UI rendering/handling code is native with Java bindings.

All that said, the full JVM (both OpenJDK and Sun's) has multiple garbage collectors and many tuning options. This creates flexibility to tune the GC for particular workloads, but it also makes them complicated to work with. It doesn't help that the default GC/memory settings in the JVM are a bit...off, and use the high-throughput GC rather than the low-pause collectors.


It's been a while since I've done work in Java, but I'm pretty sure the GC is not to blame for this. Java uses a concurrent mark and sweep GC. The concurrent part means it does not pause the world while it's collecting. So as mbell says, it has to do with the UI frameworks.

But I don't think it has to do with whether or not the UI code is written in C either. In the UI frameworks I've used for both Python and Ruby, you could write slow interpreted code that would run on the UI thread. The key thing was that you tended not to, because the frameworks didn't encourage it. The problem is that Java frameworks make it easy to, for example, do IO on the UI thread, whereas the other frameworks make it hard. Threading is relatively hard to get right in any of these languages, so the choices made by the framework authors really matter.
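A minimal sketch of the pattern being described, using only java.util.concurrent (in a real Swing app, SwingWorker plays this role; the class name and the simulated IO here are made up for illustration):

```java
import java.util.concurrent.CompletableFuture;

public class OffThreadDemo {
    public static void main(String[] args) throws Exception {
        // Do the slow "IO" on a background thread instead of the UI thread.
        // In Swing you would publish the result back via
        // SwingUtilities.invokeLater or use SwingWorker.
        String result = CompletableFuture.supplyAsync(() -> {
            try { Thread.sleep(100); } catch (InterruptedException e) { }
            return "loaded";
        }).get();  // blocking here is only for the demo; a UI would not block
        System.out.println(result);
    }
}
```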

Another thing contributing to Java being perceived as being slow is the long boot time of the JVM. (It used to be slow at least, don't know if it still is.)


> It's been a while since I've done work in Java, but I'm pretty sure the GC is not to blame for this. Java uses a concurrent mark and sweep GC.

Part of the problem is that in most installations the default GC is not the ConcMarkSweep collector but rather the parallel "stop the world" GC, which is optimized for throughput, not latency. For web applications (or anything latency-sensitive), step one is almost always to modify your JVM options to enable the CMS collector.

Unfortunately, even the ConcMarkSweep collector can have issues. It's not that hard to create a workload it can't handle, resulting in your app slowly eating through all available heap space because the collector can't keep up, ultimately forcing a full collection, which on large heaps can be devastating: think 60 to 120 second pauses.

The G1 collector, which is no longer considered experimental in Java 7, is 'better', and you can specify a maximum pause time, but it can still run into situations where it has to force a full GC, and there are situations where the CMS collector is still better.
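For reference, the collector-selection flags being discussed (these HotSpot flag names are real, but defaults vary by JVM version, and MyApp is a hypothetical program):

```shell
# Throughput-oriented parallel collector (the default being criticized):
java -XX:+UseParallelGC MyApp
# Low-pause concurrent mark-sweep collector:
java -XX:+UseConcMarkSweepGC MyApp
# G1, with a pause-time goal (non-experimental as of Java 7):
java -XX:+UseG1GC -XX:MaxGCPauseMillis=50 MyApp
```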


It's pretty simple... HotSpot collectors are optimized for throughput, not latency. That's where the money is for Sun/Oracle. It's a trade-off between one and the other in any garbage collector.

You can tune it to some extent.


It's the libraries. Write an app with OpenGL or SDL and it works fine. (Minecraft is a perfect example)


If you are interested in this kind of stuff, Effective Java is a must-read book. Long, good explanations of how to do stuff right.

Also, the 2nd edition covers up through Java 1.6.


I second this recommendation, even if you do not program in Java or on the JVM. Like no other book, it taught me a particular kind of foresight: to analyse the consequences of each small decision made while implementing something new, and to design things so they are least likely to cause problems now or in the future, to me or to other programmers working on the project. My code has been much more solid and less error-prone ever since, and it was definitely a big AHA moment for me. The author, Joshua Bloch, designed some of the standard Java APIs, and I can't think of a better way to learn these things than implementing an API used by other people, which explains why he has so many interesting things to say that are seldom found elsewhere. Here is also a very nice talk by the author:

http://www.infoq.com/presentations/effective-api-design


FWIW, last changed July 13, 1998 according to the source.


This appears to have been written before IDEs were invented. The author suggests that it's acceptable to extend a class just to gain unqualified access to its static methods to save a few characters of typing.


Did you read "Scheme In One Class" at the link he attached to the paragraph you're sniping at?


Yes, I have, and in it he reiterates that it's just to achieve unqualified access (i.e. save a few keystrokes, which is not an issue with any moderately good IDE):

> I came up with the idea of putting the utility methods in their own class, SchemeUtils, and then having the five other top-level classes extend SchemeUtils. That way, I can use the unqualified name, and I get the modularity I want.

It's not a huge deal in his project since he obviously wasn't using OOP anyhow, but it really is an abuse of inheritance with no benefit.


What IDE will hide the qualifying class names of method calls? I was searching for that yesterday, but the best I could do is syntax highlighting to make classes display in a very faint color, and Eclipse can't distinguish type declarations from method qualifiers from class declarations, nor will it fold away the class names I want to hide, so it's not ideal.

Coding is 10% writing and 90% reading, so an IDE that only helps with writing code is approximately worthless.


Java compilers are very poor at lifting constant expressions out of loops. The C/Java for loop is a bad abstraction, because it encourages re-computation of the end value in the most typical case. So for(int i=0; i<str.length(); i++) is three times slower than int len = str.length(); for(int i=0; i<len; i++)

Does anyone know if this is still valid advice? I recall reading somewhere that the JIT has optimized around this problem.
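One way to check is a crude timing comparison (JMH would be the proper tool; a single-shot loop like this is sensitive to warm-up, so treat the numbers as a sketch, not proof). On a modern JIT the two forms often come out nearly identical, because length() gets inlined:

```java
public class LoopHoist {
    public static void main(String[] args) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 1000; i++) sb.append('x');
        String str = sb.toString();
        long sum = 0;

        // Form 1: the "slow" version from the article, calling length()
        // in the loop condition every iteration.
        long t0 = System.nanoTime();
        for (int rep = 0; rep < 20_000; rep++)
            for (int i = 0; i < str.length(); i++) sum += str.charAt(i);
        long t1 = System.nanoTime();

        // Form 2: the hoisted version the article recommends.
        int len = str.length();
        for (int rep = 0; rep < 20_000; rep++)
            for (int i = 0; i < len; i++) sum += str.charAt(i);
        long t2 = System.nanoTime();

        System.out.printf("inline: %d ms, hoisted: %d ms, sum=%d%n",
                (t1 - t0) / 1_000_000, (t2 - t1) / 1_000_000, sum);
    }
}
```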


For this example, I would expect that any decent Java compiler optimizes this away nowadays. That does introduce a dependency between compiler and standard library, but the benefit is large enough, given that this is idiomatic code, and that Java has it easy here because Java strings are immutable.

For general collections, things are more difficult, but I still expect Java to optimize things in many cases. I haven't checked, though.

A C/C++ compiler has a tougher job. It can only move strlen calls out of a loop if it can determine that the strlen you call came from the correct <header> file and that there is no defined behavior under which either the pointer or its contents change inside the loop.

If it cannot do either, it must 'call' strlen each time through the loop (That 'call' can be inlined, especially if the compiler knows it is the standard library's strlen)


The thing with Java is that the compiler (that is, javac) does very little. It will replace constant variable accesses with the value (things that are declared static final), but that is about it. This is why it's very easy to decompile Java source, complete with method and variable names.

All the real optimization is done by the JIT, which usually doesn't 'kick in' unless a segment of code gets called many times; in client mode I think the default in the HotSpot VM is 1500, while server mode uses a different default that I can't remember. I generally set it manually to something around 200 for server apps.

As a result, the JIT may optimize this loop, or it may not, or it may optimize it in the middle of a test; it all depends on how often the code is called. You can easily get different benchmark results by changing the -XX:CompileThreshold JVM parameter, and you will see variability in performance unless you 'warm up' the JIT, forcing it to optimize.
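For example, an invocation along these lines lowers the threshold and shows what the JIT actually compiles (both flags are real HotSpot options; myapp.jar is a hypothetical application):

```shell
# Compile methods after ~200 invocations and log each compilation event:
java -XX:CompileThreshold=200 -XX:+PrintCompilation -jar myapp.jar
```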


The thing is, String.length() doesn't do any re-computation. All it does is return a field stored in the String object (and since strings are immutable, that field never changes).

See my other comment saying you should ignore any speed suggestions by the article.


That is correct from the String source I am looking at. It simply returns a field, so the JVM probably inlines it into the for loop condition, making this advice unnecessary.


I like this way to rewrite it when the end test is slow (I think this is correct syntax):

  for (int i = 0, len = str.length(); i < len; i++)

One evaluation and you don't have an extra len variable hanging around.


But note the warning from Sun: "So when should you use static import? Very sparingly!"

I'm so glad there is an authoritative answer on this subject. IMHO, I hate the use of static imports because it becomes difficult to know where a particular method comes from... whether it's an instance method from the class itself or one brought in through a static import.

And then there's the possibility of clobbering the namespace when a local name collides with a statically imported one.
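A small sketch of the ambiguity (class and method names made up for illustration): a member declared in the class shadows a same-named static import, so the reader can't tell at the call site which one runs.

```java
import static java.lang.Math.max;  // unqualified max(...) now means Math.max

public class StaticImportDemo {
    // Oops: a local method with the same name shadows the static import.
    static int max(int a, int b) { return a - b; }

    public static void main(String[] args) {
        // This resolves to the local max(), not Math.max(), because members
        // of the enclosing class take precedence over static imports.
        System.out.println(max(3, 7));
    }
}
```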


I don't like them in general either, but there's one place I use them extensively: JUnit tests (statically import Mockito, Hamcrest, Assert, etc).

It mostly has to do with how much the library tries to be a DSL, and how much readability will suffer if I don't use a static import. IMO, this

  assertThat(6, isGreaterThan(5));
reads a heckuva lot better than

  Assert.assertThat(6, Matchers.isGreaterThan(5));
Of course, the combination of poor Java type inference and generics-heavy libraries means I often have to do the latter anyway to make use of the angle brackets, but that's another story.


It really should be Assert.that(6, Is.greaterThan(7)), shouldn't it? Best of both worlds!


I don't understand this comment at all.

Do you not know that import statements are at the top of every class file, or did you mean that you hate wildcard imports? Do you also eschew local variables, because they might clobber fields and method parameters if you ignore the compiler warnings?


The page links also to a C IAQ: http://www.seebs.net/faqs/c-iaq.html

I've never seen that many wrong C statements. :-)


A little outdated; the one about HashMaps and equals is moot with generics.

Still quite interesting.


It's not. HashMap<K, V> works exactly like Hashtable with regards to equals() and hashCode().
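To illustrate the point (a made-up key class): generics constrain the key's compile-time type, but lookup still depends entirely on equals() and hashCode(), exactly as with Hashtable.

```java
import java.util.HashMap;

public class KeyDemo {
    static final class Key {
        final int id;
        Key(int id) { this.id = id; }
        @Override public boolean equals(Object o) {
            return o instanceof Key && ((Key) o).id == id;
        }
        // No matching hashCode() override: equal keys hash to different
        // buckets, so lookups with a fresh-but-equal key fail.
        // Uncommenting this fixes it:
        // @Override public int hashCode() { return id; }
    }

    public static void main(String[] args) {
        HashMap<Key, String> m = new HashMap<>();
        m.put(new Key(1), "a");
        System.out.println(m.get(new Key(1)));  // null without hashCode()
    }
}
```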


Ah, interesting.


It's not like Java is changing that much :)

Is this still true these days: "creating a new object is a fairly expensive operation" ?


Creating a new object is never free, but it used to be far more expensive. To my knowledge the original Java GCs weren't compacting and had to find the best place in the heap to put a new object, which required a linear search of the available free blocks. That is obviously much slower than incrementing a pointer at the edge of your used space. Unfortunately my search-fu has failed me and I can't find corroboration of my claim.

My impression of Java these days is to start with the "right" thing (allocate as needed, etc) and not to overly concern yourself with performance until it becomes an issue. There are certainly situations where allocating lots of objects (say billions or, maybe, millions) causes performance issues, but creating object pools to preemptively avoid that is a lot of extra work that may not be worth it.
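To give a sense of the "extra work" involved, here is a minimal object-pool sketch (the Pool class is made up for illustration; only worth doing when profiling actually shows allocation pressure):

```java
import java.util.ArrayDeque;
import java.util.function.Supplier;

// A deliberately tiny, non-thread-safe pool: reuse released objects
// instead of allocating fresh ones.
class Pool<T> {
    private final ArrayDeque<T> free = new ArrayDeque<>();
    private final Supplier<T> factory;
    Pool(Supplier<T> factory) { this.factory = factory; }
    T acquire() { T t = free.poll(); return t != null ? t : factory.get(); }
    void release(T t) { free.push(t); }
}

public class PoolDemo {
    public static void main(String[] args) {
        Pool<StringBuilder> pool = new Pool<>(StringBuilder::new);
        StringBuilder sb = pool.acquire();
        sb.append("hi");
        sb.setLength(0);          // caller must reset state before release
        pool.release(sb);
        System.out.println(pool.acquire() == sb);  // reused: true
    }
}
```

Note the extra burden on every caller (resetting state, remembering to release), which is exactly why pooling is usually not worth it until it is.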


Short answer: not necessarily.

A good JVM contains optimizations which make it cheap. For example, via escape analysis it will try to allocate the object on the stack instead of the heap. Apart from field initialization, this costs nothing, because the stack pointer probably had to be adjusted anyway.
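The kind of allocation that qualifies looks like this (class and method names made up for illustration): the object never escapes the method, so HotSpot's escape analysis, on by default in modern versions, can scalar-replace it and skip the heap entirely.

```java
public class EscapeDemo {
    static final class Point {
        final int x, y;
        Point(int x, int y) { this.x = x; this.y = y; }
    }

    // The Point never leaves this method, so the JIT may eliminate the
    // allocation and keep x and y in registers.
    static int dist2(int x, int y) {
        Point p = new Point(x, y);
        return p.x * p.x + p.y * p.y;
    }

    public static void main(String[] args) {
        System.out.println(dist2(3, 4));  // 25
    }
}
```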



