
Incidentally, I noticed that if you run Tesseract on an image taken from a Google Books page, you get terrible OCR accuracy. Anyone know why that is?



I recall that some Google-scanned books carried metadata from ABBYY FineReader, so that may be why.

Also, Tesseract often needs to be configured.
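
To make "configured" a bit more concrete, here is a rough sketch of the kind of settings people usually adjust: the language, the page segmentation mode, and the DPI hint. The helper name `tesseract_command`, the file name `page.png`, and the particular values are my own placeholders, not something known to fix the Google Books case. It only builds the command line, so it runs without Tesseract installed.

```python
import shlex


def tesseract_command(image_path, lang="eng", psm=6, dpi=300):
    """Build a tesseract CLI invocation that prints recognized text to stdout.

    Hypothetical helper for illustration; the defaults are guesses,
    not recommended settings for any particular source.
    """
    return [
        "tesseract", image_path, "stdout",  # "stdout" as the output base prints the text
        "-l", lang,                         # language model to load
        "--psm", str(psm),                  # page segmentation mode (6 = single uniform block of text)
        "--dpi", str(dpi),                  # resolution hint for images lacking DPI metadata
    ]


cmd = tesseract_command("page.png")  # "page.png" is a placeholder path
print(shlex.join(cmd))
```

If Tesseract is on your PATH, the list can be passed to `subprocess.run(cmd, capture_output=True, text=True)`. Trying a few `--psm` values and upscaling a low-resolution screenshot first are the obvious things to experiment with.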



