For this problem the multiple-process version would be quite simple in Python or almost any other language. It's a classic single program, multiple data (SPMD) task: you split the file into N chunks, then run N instances of the original program, one per chunk (a Map). You then need to collate the results, which requires a second program, but that step is similar to the sorting step in the original and so would be negligible wrt wall time (a quick Reduce).
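A minimal sketch of that split/map/reduce shape, assuming the original program is something like a word-frequency count (a hypothetical stand-in; swap in whatever the per-chunk work actually is):

```python
# Sketch only: assumes the per-chunk job is a word-frequency count.
from collections import Counter
from multiprocessing import Pool
import os

def chunk_offsets(path, n):
    """Split the file into n byte ranges, snapping each boundary
    forward to the next newline so no line is cut in half."""
    size = os.path.getsize(path)
    step = size // n
    offsets = [0]
    with open(path, "rb") as f:
        for i in range(1, n):
            f.seek(i * step)
            f.readline()              # advance to the next line break
            offsets.append(f.tell())
    offsets.append(size)
    return list(zip(offsets, offsets[1:]))

def count_chunk(args):
    """The Map step: run the 'original program' on one byte range."""
    path, start, end = args
    counts = Counter()
    with open(path, "rb") as f:
        f.seek(start)
        for line in f.read(end - start).splitlines():
            counts.update(line.split())
    return counts

def parallel_count(path, n=os.cpu_count()):
    with Pool(n) as pool:
        partials = pool.map(count_chunk,
                            [(path, s, e) for s, e in chunk_offsets(path, n)])
    # The Reduce step: merging N partial Counters is cheap next to
    # the per-chunk work, so it barely shows up in wall time.
    return sum(partials, Counter())
```

Each worker reads only its own byte range, so nothing ever holds the whole file in memory, and the final merge touches only N small partial results.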
For large files this should be almost embarrassingly parallel.