You should read it again > Newly defined header fields SHOULD limit their field ...

layer8 · on Aug 21, 2023

SHOULD is a recommendation, not a requirement, and it refers only to newly-defined header fields, not existing ones. The text implies that 8-bit characters in existing fields are to be interpreted as ISO-8859-1.
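To make the ISO-8859-1 reading concrete, here is a minimal sketch in Python. The function name `decode_field_value` and the `X-Author` header are made up for illustration and do not come from any RFC or library. It relies on the fact that ISO-8859-1 maps every byte value 0x00-0xFF directly to the code point of the same number, so decoding cannot fail.

```python
def decode_field_value(raw: bytes) -> str:
    """Interpret raw header bytes as ISO-8859-1 (hypothetical helper).

    Every byte maps to the Unicode code point with the same numeric
    value, so this decode never raises, whatever bytes arrive.
    """
    return raw.decode("latin-1")


# Hypothetical header line with an 8-bit character (0xE9 is "é" in ISO-8859-1).
line = decode_field_value(b"X-Author: Jos\xe9")
```

Whether a given server should accept such bytes at all is a separate policy question; the sketch only shows what the interpretation means.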

verst · on Aug 22, 2023

There is an RFC (2119) that specifies what SHOULD means in RFCs:

> SHOULD This word, or the adjective "RECOMMENDED", mean that there may exist valid reasons in particular circumstances to ignore a particular item, but the full implications must be understood and carefully weighed before choosing a different course.

https://datatracker.ietf.org/doc/html/rfc2119

jart · on Aug 21, 2023

Haven't you heard of Postel's Maxim?

Web servers need to be able to receive and decode latin1 into utf-8 regardless of what the RFC recommends people send. Because it's going to become rarer over time to have the 8th bit set in headers, you can write a simpler algorithm than what Lemire did, one that assumes an ASCII average case. https://github.com/jart/cosmopolitan/blob/755ae64e73ef5ef7d1... That goes 23 GB/s on my machine using just SSE2 (rather than AVX512). However, it goes much slower if the text is full of European diacritics. Lemire's algorithm is better at decoding those.
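The idea of an ASCII fast path can be sketched in plain Python. This is not the linked cosmopolitan code and not Lemire's algorithm; the function name `latin1_to_utf8` and the 16-byte chunk size (chosen only to echo the width of an SSE2 register) are assumptions for illustration. Chunks that are pure ASCII are copied unchanged, and only chunks containing a byte with the 8th bit set go through the byte-by-byte expansion.

```python
def latin1_to_utf8(data: bytes, chunk: int = 16) -> bytes:
    """Transcode Latin-1 bytes to UTF-8 with an ASCII fast path (sketch)."""
    out = bytearray()
    for i in range(0, len(data), chunk):
        block = data[i:i + chunk]
        if block.isascii():
            # Fast path: ASCII is encoded identically in Latin-1 and UTF-8,
            # so the whole chunk can be copied as-is.
            out += block
        else:
            # Slow path: every byte >= 0x80 expands to two UTF-8 bytes.
            for b in block:
                if b < 0x80:
                    out.append(b)
                else:
                    out.append(0xC0 | (b >> 6))    # lead byte: 0xC2 or 0xC3
                    out.append(0x80 | (b & 0x3F))  # continuation byte
    return bytes(out)
```

With mostly-ASCII input nearly every chunk takes the copy branch, which is where the speed comes from in the average case; text dense with diacritics falls into the per-byte branch far more often, matching the slowdown described above.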

HideousKojima · on Aug 21, 2023

> Haven't you heard of Postel's Maxim?

Otherwise known as "Making other people's incompetence and inability to implement a specification your problem." Just because it's a widely quoted maxim doesn't make it good advice.