
Strings are just slices of bytes in Rust too. Both languages use UTF-8 as their internal representation. The difference is that Rust requires its strings to be valid UTF-8. Go does not. Either choice is reasonable. The downside of requiring valid UTF-8 is that in order to read bytes from a file, you have to do a validation check on them to ensure they are UTF-8 before returning a string back. This is an additional cost.

But this is not the only additional cost in the Rust program. I outlined a few others.
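As a rough illustration of the validation step described above (a minimal sketch, not code from the benchmark; the helper name `as_validated_str` is made up for this example), turning bytes into a `&str` means scanning all of them first, while keeping the raw byte slice skips that scan:

```rust
use std::str;

/// Hypothetical helper: returns the text only if `bytes` is valid UTF-8.
/// `str::from_utf8` has to scan the whole input to decide.
fn as_validated_str(bytes: &[u8]) -> Option<&str> {
    str::from_utf8(bytes).ok()
}

fn main() {
    let valid: &[u8] = b"hello world";
    // 0xff never appears in well-formed UTF-8, so this input is rejected.
    let invalid: &[u8] = &[0x68, 0x69, 0xff];

    assert_eq!(as_validated_str(valid), Some("hello world"));
    assert_eq!(as_validated_str(invalid), None);

    // Working on the raw bytes needs no check at all.
    assert_eq!(invalid.len(), 3);
}
```

In the file-reading case the same scan happens when the bytes are turned into a `String`; reading into a `Vec<u8>` instead leaves them unvalidated.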

> Isn't it possible to do the same with Rust for this benchmark?

This benchmark has two programs for each language: a "simple" variant and an "optimized" variant. The "simple" version is supposed to be written in an idiomatic style for that language. Taking extra steps to tweak and optimize the code is, I think, against the spirit of the challenge. Obviously, this is a very fuzzy concept, and everyone can make up their own mind about how useful this framing is. (I think it is useful, personally, especially when you also allow for a second submission that tries to make the program fast.)




To clarify my question, can Rust read files and split their strings without checking UTF-8 correctness? That could speed up the algorithm.

In Go I have used this [1] in the past when I had to validate UTF-8 encoding, but on hot paths where I'm sure the data is valid UTF-8 (coming from sanitized database data, for example) I skipped that part.

[1] https://golang.org/pkg/unicode/utf8/#Valid


> To clarify my question, can Rust read files and split their strings without checking UTF-8 correctness? That could speed up the algorithm.

That's what every single Rust program in this benchmark does, except for the simple variant.


Yes, you can operate on raw byte strings (you write them like b"this is ASCII"), and then it would look like Go or C code.
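Something like the following is roughly what that looks like (a minimal sketch, assuming splitting on ASCII whitespace is good enough; `count_words` is a name invented here, not taken from the benchmark programs). Nothing in it checks that the input is UTF-8:

```rust
use std::collections::HashMap;

/// Hypothetical helper: counts whitespace-separated words in raw bytes.
/// The keys borrow from `input`, and no UTF-8 validation takes place.
fn count_words(input: &[u8]) -> HashMap<&[u8], usize> {
    let mut counts = HashMap::new();
    for word in input
        .split(|b| b.is_ascii_whitespace())
        .filter(|w| !w.is_empty()) // consecutive separators yield empty slices
    {
        *counts.entry(word).or_insert(0) += 1;
    }
    counts
}

fn main() {
    // A byte string literal has type &[u8; N] and coerces to &[u8].
    let text = b"the quick fox jumps over the lazy dog";
    let counts = count_words(text);

    assert_eq!(counts.get(&b"the"[..]).copied(), Some(2));
    assert_eq!(counts.get(&b"fox"[..]).copied(), Some(1));
    assert_eq!(counts.len(), 7);
}
```

Bytes of multi-byte UTF-8 sequences are all 0x80 or above, so they never match the ASCII whitespace test, and invalid sequences simply pass through as part of a word.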



