Hacker News

Maybe you could also extract the text with a PDF text extraction tool and compare it against the OCR output. That might help fix the numbers that Tesseract sometimes gets wrong.
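
A rough sketch of the comparison, assuming you already have both strings in hand (for example the OCR output from Tesseract and the embedded text layer from whatever PDF extractor you use). The function name `fix_numbers` and the "numeric token" pattern are made up for illustration, and this only works when the PDF actually has a text layer that roughly matches the page:

```python
import difflib
import re

# Hypothetical notion of a "numeric" token: digits plus common
# punctuation such as separators, percent, currency and minus signs.
NUMERIC = re.compile(r"^[\d.,%$-]*\d[\d.,%$-]*$")


def fix_numbers(ocr_text: str, pdf_text: str) -> str:
    """Return the OCR text with numeric tokens taken from the PDF text
    wherever the two token streams disagree one-for-one."""
    ocr_tokens = ocr_text.split()
    pdf_tokens = pdf_text.split()

    # Align the two token lists so differing regions can be compared.
    matcher = difflib.SequenceMatcher(a=ocr_tokens, b=pdf_tokens, autojunk=False)

    result = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        ocr_span = ocr_tokens[i1:i2]
        pdf_span = pdf_tokens[j1:j2]
        if tag == "replace" and len(ocr_span) == len(pdf_span):
            # Same number of tokens on both sides: trust the PDF token
            # only when it looks like a number, otherwise keep the OCR one.
            for ocr_tok, pdf_tok in zip(ocr_span, pdf_span):
                result.append(pdf_tok if NUMERIC.match(pdf_tok) else ocr_tok)
        else:
            # Equal, deleted, inserted, or uneven regions: keep OCR as-is.
            result.extend(ocr_span)
    return " ".join(result)


if __name__ == "__main__":
    ocr = "Total due: 1,2O4.5O on invoice 8l7"
    pdf = "Total due: 1,204.50 on invoice 817"
    print(fix_numbers(ocr, pdf))
```

It only swaps in tokens that look numeric, so a bad PDF text layer can't clobber the rest of the OCR output. Note that it rejoins tokens with single spaces, so the original line breaks are lost; you would want to align per line if layout matters.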

