That is it! So the problem seems to be a complex one. Part of it, I think, is ordinary Java issues: what happens when there is not enough memory and the system starts to use swap? What happens when the connection rate is too high while the system is doing heavy I/O? What happens during recovery of network operation, or after a replication failure, and so on?
The second part is the complexity of the software itself. Not the complexity of the algorithms or tasks (it isn't rocket science), but artificial complexity added by all those CrappyFactoryManager().GetSpecialShitFactory().instantiateANewCrap() calls and so on. It seems like no one can comprehend the whole mess.
On the other hand, this failure will probably lead to some improvements in the Cassandra project, or at least more attention to it, and everyone who uses it will benefit.