
The second category essentially concerns spelling variations. Years and years ago I built a tool for searching old documents, and to deal with all the spelling variants these contain I implemented something called the Gloria Guts algorithm, which ranks words based on how much their spelling differs. As I recall, it worked much better than it sounded for our data set.
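
To make "ranks words based on how much their spelling differs" concrete, here is a minimal sketch. It is not the algorithm named above, whose details I can't confirm; as a stand-in it uses plain Levenshtein edit distance. The names `edit_distance` and `rank_variants` and the sample vocabulary are hypothetical.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: fewest single-character insertions,
    deletions, or substitutions needed to turn a into b."""
    # prev[j] holds the distance between the prefix of a seen so far and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty string
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # substitute (or keep a match)
            ))
        prev = curr
    return prev[-1]


def rank_variants(query: str, vocabulary: list[str]) -> list[tuple[str, int]]:
    """Return (word, distance) pairs, closest spelling first.
    Comparison is case-insensitive; ties are broken alphabetically."""
    scored = [(w, edit_distance(query.lower(), w.lower())) for w in vocabulary]
    return sorted(scored, key=lambda pair: (pair[1], pair[0]))


if __name__ == "__main__":
    # Hypothetical vocabulary; a real index of old documents would be far larger.
    for word, dist in rank_variants("colour", ["color", "collar", "colour"]):
        print(word, dist)
```

For a large vocabulary, scoring every word like this gets slow, so a real search tool would likely narrow the candidates first (for example by length or shared prefixes) before ranking.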

