Hacker News

Perl would do it quite fast, and it has the benefit of accessing POSIX primitives directly.



A naive Perl solution is really, really slow compared to even the reference Java implementation. (I know, I've tried.)


That's strange; you should be able to stream the file into a tiny Perl program at the speed of the bottlenecking hardware. The kernel will take care of all the logistics. You're probably trying to do too much explicitly. Just use a pipe. Perl should be done before the JIT completes.


Using cat to redirect the file to /dev/null takes 18s on my machine (a low-end NUC). Just running a no-op over the file in Perl (i.e. feeding it into a `while (<>)` loop but not acting on the contents) takes ~2 minutes.

1B lines is a lot, and Java ain't a slouch.
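For reference, the no-op loop described above might look like the following minimal sketch (`measurements.txt` is a stand-in filename; the point is that the per-line readline work alone dominates):

```shell
# No-op read loop: pull every line through Perl's readline machinery
# and count them, without touching the contents.
perl -e '
    my $n = 0;
    while (<>) { $n++ }
    print "$n\n";
' measurements.txt
```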


Why are you using cat at all? Use a pipe. This isn't hard stuff. Don't use `<>`; read the file into a scalar or array. It should only take a few seconds to process a billion lines.

https://www.perl.com/pub/2003/11/21/slurp.html/#:~:text=Anot....
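The block-read style that article describes can be sketched roughly as below. This is an illustrative sketch, not a full solution: it only counts newlines to show the I/O pattern, and the 1 MiB buffer size is an arbitrary choice.

```shell
# Read the file in large fixed-size blocks instead of line by line;
# count newlines per block to demonstrate the access pattern.
perl -e '
    open my $fh, "<", $ARGV[0] or die "open: $!";
    my ($buf, $lines) = ("", 0);
    while (read($fh, $buf, 1 << 20)) {   # 1 MiB per read
        $lines += ($buf =~ tr/\n//);
    }
    print "$lines\n";
' measurements.txt
```

Note that a real solution still has to find line boundaries and parse each record, so block reads alone don't remove the per-line work measured above.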


If it isn't hard, then perhaps you could demonstrate with a complete Perl program that you think should beat the Java implementation.


I profiled my attempt; reading each line is actually the bottleneck.


Perl is always going to be much faster than Java at tasks like this. Use stdin and chomp() instead of reading each line explicitly.

This is a small, trivial task for a Perl script. Even at a billion lines, it's nothing for a modern CPU and Perl.



Reddit?



