
Why would it be incorrect? Sometimes two or three different words describe the same thing and that's ok. If you poll enough people you can get a rough idea of whether one version is more dominant than the other, whether there's an even split, or whether different regions in the same country prefer different versions. Similar to soda/pop/coke in the US.

You can design a study with a high level of data granularity. You could even track differences in pronunciation and grammar if you wish.




because 'we usually say coche but sometimes say auto' is almost the same as 'we usually say auto but sometimes say coche', but they differ from 'we always say carro'. if a study is saying spanish is radically different in montevideo and in buenos aires, it's just wrong. this may not be the particular design error that resulted in these incorrect results, but it seems like a promising candidate
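
A minimal sketch of that point, with made-up usage shares (the regions and numbers are hypothetical, not taken from any study): comparing only each region's most common word makes two mixed regions look opposite, while a distance between the full usage distributions shows them as close to each other and far from the carro-only region.

```python
# Hypothetical usage shares for the word for "car" in three regions.
# These numbers are invented for illustration only.
region_a = {"coche": 0.6, "auto": 0.4}
region_b = {"coche": 0.4, "auto": 0.6}
region_c = {"carro": 1.0}


def majority_word(shares):
    """Return the single most used word, discarding everything else."""
    return max(shares, key=shares.get)


def total_variation(p, q):
    """Total variation distance between two usage distributions (0 = same, 1 = disjoint)."""
    words = set(p) | set(q)
    return 0.5 * sum(abs(p.get(w, 0.0) - q.get(w, 0.0)) for w in words)


# Majority-only comparison: a and b look as different as a and c.
same_majority_ab = majority_word(region_a) == majority_word(region_b)
same_majority_ac = majority_word(region_a) == majority_word(region_c)

# Distribution comparison: a and b are close, a and c are disjoint.
distance_ab = total_variation(region_a, region_b)
distance_ac = total_variation(region_a, region_c)

print(same_majority_ab, same_majority_ac)
print(round(distance_ab, 3), round(distance_ac, 3))
```

A design that records only the top answer per region would report a/b and a/c as equally different, which is the kind of error described above.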


I think we're both in agreement. Perhaps my example of coche/carro was unfortunate and I didn't make my point clear enough.

A well-designed study, in my mind, would compare the usage of a varied bag of words: starting from articles, pronouns, numbers, and common verbs, then common objects, verb forms, and less common adjectives, and ending with uncommon objects and phrases. The compared words would be weighted by their frequency. If two dialects share the same articles, pronouns, numbers, etc. and differ only in some less frequent nouns, they would be similar rather than radically different, at least lexically. Things might look different if we look at pronunciation.
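
Roughly what I have in mind, as a sketch only: the concept list, the frequency counts, and the dialect labels below are invented for illustration, and a real study would need a proper corpus for the weights.

```python
# Hypothetical frequency counts per concept (invented, not from a corpus).
frequency = {"the": 1000, "I": 500, "to be": 300, "two": 100, "car": 5, "popcorn": 1}

# Hypothetical word choices for two dialects, keyed by concept.
dialect_x = {"the": "el", "I": "yo", "to be": "ser", "two": "dos",
             "car": "auto", "popcorn": "pochoclo"}
dialect_y = {"the": "el", "I": "yo", "to be": "ser", "two": "dos",
             "car": "carro", "popcorn": "palomitas"}


def lexical_similarity(a, b, weights):
    """Share of total weight carried by concepts where both dialects use the same word."""
    concepts = [c for c in weights if c in a and c in b]
    total = sum(weights[c] for c in concepts)
    if total == 0:
        return 0.0
    shared = sum(weights[c] for c in concepts if a[c] == b[c])
    return shared / total


# Frequency-weighted: the rare nouns barely move the score.
weighted = lexical_similarity(dialect_x, dialect_y, frequency)

# Unweighted: every concept counts the same, so two rare nouns dominate.
unweighted = lexical_similarity(dialect_x, dialect_y, {c: 1 for c in frequency})

print(f"weighted={weighted:.4f} unweighted={unweighted:.4f}")
```

With these made-up numbers the weighted score is very close to 1 while the unweighted one is 4 out of 6, which is the difference between "similar" and "radically different" coming purely from the choice of weighting.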

I don't know what list of words was compared in the study linked in this subthread, so it's hard for me to say anything about it.


i agree



