Hi! We've been working with and evaluating Gettext when we started Fluent. Our o...

ascar · on April 17, 2019

You make a lot of good points in the linked article, but you lose some credibility right from the start

> Secondly, it makes it impossible to introduce multiple messages with the same source string which should be translated differently.

This is false. The gettext message format uses msgctxt to deal with this. It's a fundamental part of the format. The unique identifier is the combination of msgctxt and the singular string. I wonder how you could miss that? We actually use an automatically generated msgctxt for some parts of our app to avoid accidentally translating the same source text incorrectly in different contexts.
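To make the identifier rule concrete, here is a minimal sketch of a lookup keyed on msgctxt plus the source string. The catalog contents, the context names, and the helper functions are hypothetical; the only detail taken from gettext itself is the EOT (`\x04`) byte that compiled .mo catalogs place between the context and the msgid.

```python
# Hypothetical in-memory catalog; real gettext loads this from .po/.mo files.
CONTEXT_SEP = "\x04"  # EOT byte that .mo catalogs put between msgctxt and msgid


def make_key(msgid, msgctxt=None):
    """Build the lookup key: msgid alone, or msgctxt + EOT + msgid."""
    return msgid if msgctxt is None else f"{msgctxt}{CONTEXT_SEP}{msgid}"


# The same English source string, translated differently per context (German).
CATALOG_DE = {
    make_key("Open", "menu action"): "Öffnen",
    make_key("Open", "ticket status"): "Offen",
}


def pgettext(msgctxt, msgid, catalog=CATALOG_DE):
    """Look up msgid within msgctxt, falling back to the source string."""
    return catalog.get(make_key(msgid, msgctxt), msgid)


print(pgettext("menu action", "Open"))
print(pgettext("ticket status", "Open"))
```

Because the context is part of the key, the two "Open" entries never collide, and an unknown context simply falls back to the source text.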

Also, I couldn't quite follow the point about interpolation in Fluent vs. gettext (probably because I don't know Fluent). Message interpolation in gettext works and can be absolutely readable, e.g. "You have {count} items". The big drawback is that you can't move this variable across strings. Can you do that with Fluent?

zbraniecki · on April 17, 2019

> but you lose some credibility right from the start

Thank you for the feedback! I updated the article to include the mention about `msgctxt`.

In my experience, many project environments end up with only partial support for this feature (for example, many React/Angular extractors don't support it), which leads to limited use and requires the localizer to ask the developer to add a context.

I did not include that since it's just my personal experience and I assume more mature projects tend to recognize the feature and use it, hopefully, extensively :)

> Message interpolation in gettext works and can be absolutely readable. E.g. "You have {count} items".

As far as I understand, this is not part of the system (gettext) itself but of its bindings, and as a result it is underspecified and differs between implementations. For example, [0] uses `%{ count }` while [1] uses `{{ count }}`. If I'm mistaken here, please point me to the spec :)
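A minimal sketch of why that divergence matters, assuming two hand-written regex interpolators modelled on the placeholder syntaxes mentioned above. These are illustrative stand-ins, not the actual implementations of either binding.

```python
import re

# Illustrative placeholder patterns, modelled on the two syntaxes cited above.
PERCENT_BRACE = re.compile(r"%\{\s*(\w+)\s*\}")    # matches "%{count}"
DOUBLE_BRACE = re.compile(r"\{\{\s*(\w+)\s*\}\}")  # matches "{{count}}"


def interpolate(pattern, template, **args):
    """Replace each placeholder matched by `pattern` with str(args[name])."""
    return pattern.sub(lambda m: str(args[m.group(1)]), template)


same_syntax = interpolate(PERCENT_BRACE, "You have %{count} items", count=5)
other_syntax = interpolate(DOUBLE_BRACE, "You have {{count}} items", count=5)
# A string written for one binding passes through the other one untouched.
mismatched = interpolate(DOUBLE_BRACE, "You have %{count} items", count=5)

print(same_syntax)
print(other_syntax)
print(mismatched)
```

The translated strings are therefore tied to whichever binding consumes them, since the placeholder syntax lives outside the gettext format.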

Since it is a higher-level replacement, this approach likely suffers from multiple limitations. First of all, I highly doubt that there is any BiDi isolation between interpolated arguments and the string, which leads to a common bug when RTL text (say, Arabic) contains an LTR variable (say, a person's name in Latin script). Fluent resolves it by wrapping all interpolated placeables in BiDi isolation marks.
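As a rough sketch of what such wrapping looks like, assuming the Unicode FSI/PDI pair (U+2068, U+2069) as the isolation marks. The `isolate` and `format_message` helpers are hypothetical and use Python's `str.format` placeholders, not Fluent syntax.

```python
FSI = "\u2068"  # FIRST STRONG ISOLATE: direction is taken from the content
PDI = "\u2069"  # POP DIRECTIONAL ISOLATE: closes the isolate


def isolate(value):
    """Wrap an interpolated value so its direction cannot leak into the message."""
    return f"{FSI}{value}{PDI}"


def format_message(template, **args):
    """Interpolate arguments into the template, isolating each one."""
    return template.format(**{name: isolate(value) for name, value in args.items()})


# RTL message ("Hello {name}") with an LTR argument.
message = format_message("مرحبا {name}", name="John")
print(repr(message))
```

The marks are invisible when rendered; they only tell the bidirectional algorithm to lay out the argument independently of the surrounding text.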

Secondly, I must assume that internationalization, such as number formatting and date formatting, is also not done from within the resolver in gettext. That, in turn, means that it may be tricky to verify that a number is formatted using Eastern Arabic numerals when used in an Arabic translation, and using Western Arabic numerals when used in an English translation. Fluent formats all placeables using Unicode-backed intl formatters (for example, in JS we use ECMA-402), allowing for consistency and high-quality translations where placeables get formatted together with the message.

For instance, in your example, will `You have { count } items` be translated to `لديك 5 عناصر` or `لديك ٥ عناصر`? And what will happen if, instead of `count`, you'd have `name: "John"`? Will it be RTL or LTR?
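To illustrate the first question, here is a minimal sketch of locale-aware digit selection. The digit table is hand-rolled for the example (a real formatter would rely on CLDR/ICU data), and treating every `ar` locale as using Eastern Arabic digits is a simplifying assumption; `format_number` and `you_have` are hypothetical helpers handling integers only.

```python
# Hand-rolled mapping from ASCII digits to Arabic-Indic digits (U+0660..U+0669).
# Assumption for illustration only; real formatters take this from locale data.
ARABIC_INDIC = {ord(str(d)): chr(0x0660 + d) for d in range(10)}


def format_number(value, locale):
    """Format an integer with the digits assumed for the locale's language."""
    digits = str(value)
    if locale.split("-")[0] == "ar":
        return digits.translate(ARABIC_INDIC)
    return digits


def you_have(locale, template, count):
    """Format the number for the locale before interpolating it."""
    return template.format(count=format_number(count, locale))


print(you_have("en", "You have {count} items", 5))
print(you_have("ar", "لديك {count} عناصر", 5))
```

When the resolver does not format the number itself, the caller has to remember to do this step for every placeable, which is where the inconsistency tends to creep in.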

[0] https://hexdocs.pm/gettext/Gettext.html#content

[1] https://angular-gettext.rocketeer.be/dev-guide/api/angular-g...

ascar · on April 20, 2019

Yes, I agree. It only solves a subset of the problems. Formatting and RTL/LTR handling are difficult to solve with gettext.

ngrilly · on April 17, 2019

> Our opinion is similar to Unicode's - Gettext is fundamentally flawed design for internationalization purposes.

Did the Unicode Consortium express criticism of gettext? Could you provide a reference for this?

zbraniecki · on April 17, 2019

I don't know if there's any public statement about this. I base my position on my experience at the Unicode Conference and my work on CLDR and ICU. I understand that this diminishes the value of my claim.

I can also point to ICU MessageFormat, which was designed long after Gettext and, I'd dare say on purpose, bears no resemblance to it.

ngrilly · on April 17, 2019

I agree that CLDR plural forms and ICU MessageFormat are something of an implicit critique of gettext's design :-)

zbraniecki · on April 17, 2019

Hahaha, thank you! I still feel ashamed of making a strong claim based on informal conversations, but I feel a bit vindicated by your agreement! :)