
You should read it again

> Newly defined header fields SHOULD limit their field values to US-ASCII octets

ASCII octets! That means you SHOULD NOT send Latin-1 encoded headers, which is the opposite of what pzmarzly was saying. I don't dispute that Latin-1 is a superset of ASCII, or that the spec has backward compatibility in mind, but that's not relevant to my response.




SHOULD is a recommendation, not a requirement, and it refers only to newly-defined header fields, not existing ones. The text implies that 8-bit characters in existing fields are to be interpreted as ISO-8859-1.
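To make the ISO-8859-1 reading concrete, here is a minimal Python sketch. The function name `decode_field_value` is made up for illustration and is not taken from any HTTP library; it only shows that every byte value maps to the code point of the same number, so the decode step cannot fail on 8-bit input.

```python
def decode_field_value(raw: bytes) -> str:
    """Interpret raw header field bytes as ISO-8859-1 (hypothetical helper).

    ISO-8859-1 maps each byte 0x00-0xFF to the Unicode code point with the
    same value, so this never raises, whatever bytes the peer sent.
    """
    return raw.decode("latin-1")


raw = b"caf\xe9"                  # 0xE9 is e-acute in ISO-8859-1
text = decode_field_value(raw)    # a 4-character str
utf8 = text.encode("utf-8")       # re-encoded: 0xE9 becomes 0xC3 0xA9
```

A pure-ASCII value comes back unchanged, which is why a sender that follows the SHOULD is indistinguishable from one that sends Latin-1 without the 8th bit set.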


There is an RFC (2119) that specifies what SHOULD means in RFCs:

> SHOULD This word, or the adjective "RECOMMENDED", mean that there may exist valid reasons in particular circumstances to ignore a particular item, but the full implications must be understood and carefully weighed before choosing a different course.

https://datatracker.ietf.org/doc/html/rfc2119


Haven't you heard of Postel's Maxim?

Web servers need to be able to receive latin1 and transcode it to utf-8 regardless of what the RFC recommends people send. Since it's going to become rarer over time to have the 8th bit set in headers, you can write a simpler algorithm than what Lemire did, one that assumes an ASCII average case. https://github.com/jart/cosmopolitan/blob/755ae64e73ef5ef7d1... That goes 23 GB/s on my machine using just SSE2 (rather than AVX512). However, it goes much slower if the text is full of European diacritics. Lemire's algorithm is better at decoding those.
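For anyone who wants the shape of the "assume ASCII is the average case" idea without reading SIMD code, here is a rough scalar sketch in Python. It is not the linked cosmopolitan code and makes no claim to match it; the function name `latin1_to_utf8` and the 16-byte chunk size (chosen to mimic one SSE2 register) are assumptions for illustration only.

```python
CHUNK = 16  # assumed chunk size, mimicking one 128-bit SSE2 register


def latin1_to_utf8(src: bytes) -> bytes:
    """Transcode ISO-8859-1 bytes to UTF-8 with an ASCII fast path (sketch)."""
    out = bytearray()
    for i in range(0, len(src), CHUNK):
        chunk = src[i:i + CHUNK]
        if chunk.isascii():
            # Fast path: no byte has the 8th bit set, so the chunk is
            # already valid UTF-8 and can be copied as-is.
            out += chunk
            continue
        # Slow path: expand each byte >= 0x80 into a two-byte sequence.
        for b in chunk:
            if b < 0x80:
                out.append(b)
            else:
                out.append(0xC0 | (b >> 6))    # lead byte: 0xC2 or 0xC3
                out.append(0x80 | (b & 0x3F))  # continuation byte
    return bytes(out)
```

The slow path is where text full of diacritics ends up, which is the case the comment says this style of algorithm handles worse.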


>Haven't you heard of Postel's Maxim?

Otherwise known as "Making other people's incompetence and inability to implement a specification your problem." Just because it's a widely quoted maxim doesn't make it good advice.



