> there are quite easy ways to avoid homograph attacks, those attacks are a poor excuse to discriminate against non-anglosphere.
No. Attacks of this kind still happen every now and then to this day. Maybe if you're not following itsec they fly under your radar, but getting this right is exceptionally hard. And besides these kinds of attacks, every major OS has had multiple bugs just in the processing of Unicode that could at least be used for DoS attacks.
So saying it's easy to avoid any sort of abuse of Unicode seems quite ridiculous.
Go ahead and support it for messages, display names and whatnot, but for the love of god, limit the login name of users to ASCII. Don't assume that your Python/Go/JavaScript lib for Unicode handles sanitizing and canonicalization properly. It doesn't. And even if it has only a minor bug that doesn't lead to direct issues, the next update of the lib might fix the problem and now you have to deal with the fact that your db might contain data that was processed with the old faulty lib and now gets compared to the properly processed output of the new version. Just don't. Use it as opaque data for displaying, as GP said, but never as an identifier for anything.
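To make that concrete, here's a rough Python sketch of what I mean. The allowed character set and the 3-32 length limit are just assumptions I picked for illustration, and `is_valid_login` is a made-up helper name, not anything from a real framework. It only shows the idea: accept login names from a small explicit ASCII set, and notice how strings that look identical on screen are not equal as data.

```python
import re
import unicodedata

# Hypothetical policy (an assumption for this sketch): lowercase ASCII
# letters, digits, underscore and hyphen, 3 to 32 characters long.
# The character class is spelled out explicitly on purpose: in Python 3,
# shortcuts like \w and \d also match non-ASCII characters.
LOGIN_RE = re.compile(r"[a-z0-9_-]{3,32}")


def is_valid_login(name: str) -> bool:
    """Return True only if the whole name consists of the allowed ASCII characters."""
    return LOGIN_RE.fullmatch(name) is not None


# Two strings that render the same but are different code point sequences.
composed = "caf\u00e9"     # 'é' as one precomposed code point
decomposed = "cafe\u0301"  # 'e' followed by a combining acute accent

# A homograph: Cyrillic 'а' (U+0430) standing in for Latin 'a'.
latin = "paypal"
mixed = "p\u0430ypal"

print(composed == decomposed)                                # False
print(unicodedata.normalize("NFC", decomposed) == composed)  # True
print(unicodedata.normalize("NFC", mixed) == latin)          # False
print(is_valid_login(latin), is_valid_login(mixed))          # True False
```

Note that normalization makes the two "café" spellings compare equal, but it does nothing about the Cyrillic lookalike. That one only gets caught because the login check refuses anything outside the ASCII set in the first place.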