
Yep - it's actually been invaluable in getting this project to where it is now. :) It definitely has its own flaws, but those are easy to work around.

Edit - looks like they have a similar problem dealing with hand-typed data. From their roadmap:

Ambiguous token classification (coming soon): e.g. "dr" => "doctor" or "drive" for an English address depending on the context. Multiclass logistic regression trained on OSM addresses, where abbreviations are discouraged, giving us many examples of fully qualified addresses on which to train
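To make the roadmap item concrete, here is a rough toy sketch of the idea, not their implementation. Everything in it is my own assumption: the handful of made-up addresses standing in for OSM data, the `features`/`train`/`expand` helpers, and the choice of neighbouring tokens as the only features. It trains a small softmax (multiclass logistic) model on fully spelled-out addresses and then uses it to expand "dr" from context.

```python
import math
from collections import defaultdict

# Candidate expansions for the ambiguous token "dr".
LABELS = ["doctor", "drive"]

# Made-up, fully qualified addresses standing in for OSM training data.
ADDRESSES = [
    "123 oak drive",
    "800 maple drive",
    "45 sunset drive apt 2",
    "doctor martin luther king boulevard",
    "10 doctor jones street",
    "doctor smith avenue",
]

def features(tokens, i):
    """Context features for the token at position i: a bias plus its neighbours."""
    prev_tok = tokens[i - 1] if i > 0 else "<s>"
    next_tok = tokens[i + 1] if i + 1 < len(tokens) else "</s>"
    return ["bias", "prev=" + prev_tok, "next=" + next_tok]

def probabilities(weights, feats):
    """Softmax over the per-label sums of feature weights."""
    scores = [sum(weights[(label, f)] for f in feats) for label in LABELS]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def train(addresses, epochs=200, lr=0.5):
    """Fit the weights with plain stochastic gradient descent on log loss."""
    examples = []
    for address in addresses:
        tokens = address.lower().split()
        for i, tok in enumerate(tokens):
            if tok in LABELS:  # every spelled-out form is a labelled example
                examples.append((features(tokens, i), tok))
    weights = defaultdict(float)
    for _ in range(epochs):
        for feats, gold in examples:
            probs = probabilities(weights, feats)
            for label, p in zip(LABELS, probs):
                grad = p - (1.0 if label == gold else 0.0)
                for f in feats:
                    weights[(label, f)] -= lr * grad
    return weights

def expand(weights, address, abbreviation="dr"):
    """Replace each ambiguous abbreviation with its most probable expansion."""
    tokens = address.lower().split()
    out = []
    for i, tok in enumerate(tokens):
        if tok == abbreviation:
            probs = probabilities(weights, features(tokens, i))
            tok = LABELS[probs.index(max(probs))]
        out.append(tok)
    return " ".join(out)

weights = train(ADDRESSES)
print(expand(weights, "dr martin luther king dr"))
```

On this toy data the leading "dr" should come out as "doctor" and the trailing one as "drive", since the model only ever saw "doctor" at the start of a name and "drive" at the end of a street. A real version would need far more data and richer features, and would cover every ambiguous abbreviation rather than just this one.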



