Actually, GPT-4 is at least marginally capable in almost all languages, and given how nascent LLMs are, I expect them to become proficient in more and more of them. The thing is that an LLM is fundamentally simpler to train than prior NLP translation models. For example, I can carry on a reasonably good conversation with ChatGPT in the ancient dead language of Pali, have it translated into cuneiform, then Chinese, then Esperanto, and back to English in a new session, and the translation is almost flawless compared to the original. (I have in fact done this.)
If it weren’t for this I would have agreed with you last year. But I see fairly clearly that the way to preserve native tongues is to take a base LLM like Llama 2 and fine-tune it with your native language. As people are invested in their native languages, this doesn’t seem unreasonable. As things develop and sharing LLM LoRA adapters becomes easier, I think we will find a universal translator for all spoken languages forthcoming.
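The reason sharing LoRA adapters is cheap, for what it's worth, is that a fine-tune is distributed as two small low-rank matrices rather than a full copy of the model's weights. A minimal pure-Python sketch of that idea (no ML framework assumed; the matrices and sizes here are made up for illustration):

```python
# LoRA in miniature: instead of shipping a full fine-tuned weight matrix
# W' (d x d), you ship two small factors B (d x r) and A (r x d) with
# rank r << d, and the recipient reconstructs W' = W + (alpha / r) * B @ A.

def matmul(X, Y):
    """Plain nested-list matrix multiply."""
    inner, cols = len(Y), len(Y[0])
    return [[sum(X[i][k] * Y[k][j] for k in range(inner)) for j in range(cols)]
            for i in range(len(X))]

def apply_lora(W, A, B, alpha):
    """Merge a low-rank adapter (A, B) into frozen base weights W."""
    r = len(A)                      # adapter rank
    scale = alpha / r
    delta = matmul(B, A)            # d x d update built from the small factors
    return [[W[i][j] + scale * delta[i][j] for j in range(len(W[0]))]
            for i in range(len(W))]

d, r = 4, 1
W = [[0.0] * d for _ in range(d)]   # frozen base weights (d x d)
B = [[1.0] for _ in range(d)]       # d x r factor
A = [[2.0] * d]                     # r x d factor
W_adapted = apply_lora(W, A, B, alpha=1.0)

# What gets shared: 2 * d * r adapter values vs. d * d full weights.
adapter_params = d * r + r * d
full_params = d * d
```

At these toy sizes the saving is trivial, but with d in the thousands and r around 8–64, an adapter is a tiny fraction of the model, which is what would make community-shared per-language fine-tunes practical.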
But I challenge the notion that not having a translation device assures native languages survive. The same pressures exist either way. You simply can’t tell the regionally powerful speakers to learn the minority languages and expect them to actually do it, and you can’t expect minority languages not to carry a stigma. Humans just don’t work that way, and never have. “When in Rome” applied in Roman times, applies in our time, and will apply in all times. The only way to eliminate that pressure is to make it not an issue.
> For example, I can carry on a reasonably good conversation with ChatGPT in the ancient dead language of Pali and have it translated into cuneiform
ChatGPT’s ability to translate into Pali or cuneiform is limited by the size of the corpus. As I said, most languages of the world do not have a sizeable electronic corpus, and what has appeared in writing is only a limited portion of those languages. ChatGPT cannot magically guess words or idioms that it was never trained on.
This is well known to anyone working in corpus linguistics. Do you have any formal background in the field?
> As people are invested in their native language this doesn’t seem unreasonable.
Outside a few relatively privileged languages, people are much less invested in their native language than you assume. Due to the pressures of poverty, political oppression, and social stigma, it can be difficult for linguists to even find speakers willing to answer some questions about their language, let alone help train a machine-translation system.
Then it’s up to those who care to preserve the languages they care about. I am certainly not arguing against people dedicating their lives to preserving languages. That’s a wonderful thing to do, and now we have a tool they can use to encode those languages in a way that is functionally accessible to everyone, for all time.
Reading the thread, I don’t see any alternative being proposed. Without one, I will hold onto the idea that we can improve things with the miracles we create in our technologies. While corporations might not see a profit potential here, the open-source world has surely shown we don’t need corporations to do amazing things.