
I can't speak for their implementation, but batching is not necessary. Stream processing complex JSON documents and storing the documents to disk at rates of 500k documents/second per server is demonstrably achievable on some scale-out systems.

The internal architectures make an enormous difference in throughput. A proper high-performance stream processing engine does not look anything like the "Hadoop in RAM" style model.




> Stream processing complex JSON documents and storing the documents to disk at rates of 500k documents/second per server is demonstrably achievable on some scale-out systems

So is it per server or scaled out? I thought SSDs were capped at around 100k discrete write operations per second (write IOPS).

Can you give an example? I've been unable to practically reach more than 10k documents/sec/server using a number of technologies and combinations to collect from a socket, parse JSON, and write to a socket. That's just my specific use case.
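
For anyone who wants to compare numbers on the parse step alone, here is a minimal sketch of that collect/parse/write loop with in-memory buffers standing in for the sockets. The names `pump` and `measure` and the sample document are made up for illustration; it only gives a rough single-core figure for Python's standard `json` module and says nothing about network or disk.

```python
import io
import json
import time


def pump(src, dst):
    """Read newline-delimited JSON from src, re-serialize each document to dst.

    src and dst are binary file-like objects (stand-ins for sockets here).
    Returns the number of documents processed.
    """
    count = 0
    for line in src:
        line = line.strip()
        if not line:
            continue  # skip blank lines between documents
        doc = json.loads(line)
        dst.write(json.dumps(doc, separators=(",", ":")).encode("utf-8"))
        dst.write(b"\n")
        count += 1
    return count


def measure(n_docs=100_000):
    """Return (documents processed, documents per second) for a synthetic input."""
    # Hypothetical sample document; real payloads will behave differently.
    sample = {"id": 1, "user": {"name": "a", "tags": ["x", "y"]}, "value": 3.14}
    payload = (json.dumps(sample) + "\n").encode("utf-8") * n_docs
    src, dst = io.BytesIO(payload), io.BytesIO()
    start = time.perf_counter()
    count = pump(src, dst)
    elapsed = time.perf_counter() - start
    return count, count / elapsed


if __name__ == "__main__":
    count, rate = measure()
    print(f"{count} documents at {rate:,.0f} documents/sec")
```

If the figure this prints is already near 10k to 100k documents/sec, the parser rather than the storage is probably the first thing to look at.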


Looking at the top end of Intel's SSD lineup, I see that they have a product that advertises up to 175k IOPS of random 4K writes. Is this what you are referring to?

The product is the 2TB P3700.
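
A quick back-of-the-envelope check, taking the advertised 175k IOPS of 4K writes at face value and assuming a single such drive per server (the thread does not say what hardware the 500k documents/second figure was measured on):

```python
# Figures quoted in the thread.
WRITE_IOPS = 175_000          # advertised random 4K write IOPS
WRITE_SIZE_BYTES = 4 * 1024   # one 4K write
TARGET_DOCS_PER_SEC = 500_000 # claimed per-server document rate


def write_bandwidth(iops, write_size):
    """Bytes per second if every write operation carries a full block."""
    return iops * write_size


def max_avg_doc_size(bandwidth, docs_per_sec):
    """Largest average document size (bytes) that fits in the given bandwidth."""
    return bandwidth / docs_per_sec


def docs_per_write(docs_per_sec, iops):
    """Average number of documents each write must carry to hit the target rate."""
    return docs_per_sec / iops


bandwidth = write_bandwidth(WRITE_IOPS, WRITE_SIZE_BYTES)
print(bandwidth)                                         # 716800000 bytes/s, about 717 MB/s
print(max_avg_doc_size(bandwidth, TARGET_DOCS_PER_SEC))  # 1433.6 bytes
print(docs_per_write(TARGET_DOCS_PER_SEC, WRITE_IOPS))   # about 2.86
```

So under those assumptions, one write per document would top out at 175k documents/second, and reaching 500k would need roughly three documents packed into each 4K write, with an average document size under about 1.4 KB.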



