Strings are just slices of bytes in Rust too. Both languages use UTF-8 as their ...

hu3 · on March 15, 2021

To clarify my question: can Rust read files and split their strings without checking UTF-8 correctness? That could speed up the algorithm.

In Go, I have used utf8.Valid [1] in the past when I had to validate UTF-8 encoding, but on hot paths where I'm sure the data is valid UTF-8 (coming from sanitized database data, for example), I skipped that part.

[1] https://golang.org/pkg/unicode/utf8/#Valid

burntsushi · on March 15, 2021

> To clarify my question: can Rust read files and split their strings without checking UTF-8 correctness? That could speed up the algorithm.

That's what every single Rust program in this benchmark does, except for the simple variant.

unionpivo · on March 15, 2021

Yes, you can operate on raw byte strings (you write them like b"this is ASCII"), and then it would look like Go or C code.
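
For illustration, here is a minimal sketch of what working on raw bytes might look like. It is not taken from the benchmark programs; the `count_words` helper and the sample input are hypothetical. It splits a `&[u8]` on ASCII whitespace and counts the pieces, and no UTF-8 check runs at any point, so an invalid byte such as `\xff` simply stays part of its word.

```rust
use std::collections::HashMap;

/// Hypothetical helper: count whitespace-separated words in raw bytes.
/// Nothing here validates UTF-8; the keys are plain byte slices.
fn count_words(data: &[u8]) -> HashMap<&[u8], usize> {
    let mut counts = HashMap::new();
    for word in data
        .split(|b| b.is_ascii_whitespace())
        .filter(|w| !w.is_empty()) // consecutive separators yield empty slices
    {
        *counts.entry(word).or_insert(0) += 1;
    }
    counts
}

fn main() {
    // In a real program the bytes would typically come from std::fs::read,
    // which returns a Vec<u8> without checking the encoding.
    // The \xff byte below is deliberately invalid UTF-8.
    let data: &[u8] = b"the quick\xff fox the";
    let counts = count_words(data);

    let the: &[u8] = b"the";
    assert_eq!(counts.get(&the), Some(&2));

    for (word, n) in &counts {
        // Lossy conversion is used only for display.
        println!("{} {}", String::from_utf8_lossy(word), n);
    }
}
```

By contrast, `std::fs::read_to_string` and `std::str::from_utf8` do validate, so staying on `&[u8]` is what skips the check.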