One of the biggest things that I see with this release is trained vectors for Asian languages - Hindi, Kannada, Telugu, Urdu, etc.
This is huge, because most other releases have traditionally covered European languages. It is fairly rare to see a release for Asian languages.
One challenge is that typical linguistic use in Asia mixes the native language with English. For example, people in North India use "Hinglish". It is typically fairly hard to make sense of this.
I’m sure you know yourself that the number of speakers is rarely the metric used for choosing a target market; more commonly, products are launched based on the potential revenue to be made (which scales with GDP per capita).
Yet most projects target only the Anglosphere; usually not even Europe is included.
Problem is that "Europe" means so many different languages... which is also our biggest remaining obstacle in trying to launch even just web products here (as compared to launching "in the Anglosphere").
To be fair, the Wikipedia sites for these languages barely have any content. Of the articles that do exist, most are essentially 2-line blurbs. These languages are essentially dead for all practical purposes.
I'm not going to water down my analysis for some two-bit politically correct occidentals and "hyper-nationalist" orientals.
A large fraction of Indian children will be illiterate in these "thriving" languages in the coming decades, with "zero" monetary loss. Not living is death, and they haven't been alive for a long time - I can neither get any Govt. services in my mother tongues, nor get the laws made by the colonial state in Delhi in them.
A zombie is not alive IMO. If Indians wish to parlay their tongues for money, it's up to them. I for one will not deceive myself. We are going to be part of the Borg that is the Anglosphere in less than 3-4 generations.
gnipgnip is, first of all, projecting. According to his analysis, Indians will stop speaking local languages and adopt English, and in 80 years it's all over: India is an English-speaking country. The flaw in that logic is that Indians do entertain themselves in their native languages. Yes, many languages spoken by smaller groups of people are certainly under threat, but languages with currently massive footprints like Hindi, Telugu, Bengali, etc. aren't going anywhere 80 years from now.
A language that is spoken by 80 million people is not dead, and is not dying. The colonial two-step when it comes to languages has existed for a long time, and it's just a matter of time before people, with the help of machine learning, provide support in multiple languages.
For a large part of the Mughal period the court language was Farsi, but Hindi/Urdu survived. You are underestimating the ability of people to straddle multiple languages. For many Indians it's just a necessity.
Language shift occurs when mothers stop talking to their infants in the language. Not being able to access government services in the language may indicate a decline in the language's vitality, in its prestige and influence, but it is a long, long way from language "death".