
Not really. An A4 page at 75 ppi (what used to be the standard "Web export" resolution back in the day) is 620x877, and 1,000 of those images costs about $2 at current gpt4o pricing. Assuming about 500 words per A4 page, and that each token is about 0.75 words, that's ~666k tokens of text for $2. Given that gpt4o charges $2.50 per million text tokens (about $1.67 for those 666k), using it for OCR is roughly break-even with Tesseract + LLM, and a lot more accurate, especially once tables or columns are involved.
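
For anyone who wants to check the arithmetic, here is a rough sketch in Python. It reads the 0.75 figure as words per token (which is what the ~666k total implies) and takes the $2 per 1,000 images and $2.50 per million text tokens straight from the comment rather than from any price list. The function names are made up for illustration.

```python
MM_PER_INCH = 25.4
A4_MM = (210, 297)  # A4 width and height in millimetres


def a4_pixels(ppi):
    """Pixel dimensions of an A4 page rendered at the given pixels per inch."""
    width_mm, height_mm = A4_MM
    return (round(width_mm / MM_PER_INCH * ppi),
            round(height_mm / MM_PER_INCH * ppi))


def text_tokens(pages, words_per_page=500, words_per_token=0.75):
    """Estimated token count for the text on `pages` pages (assumed ratios)."""
    return pages * words_per_page / words_per_token


def text_cost_usd(tokens, usd_per_million=2.50):
    """Cost of that many text tokens at the per-million price quoted above."""
    return tokens / 1_000_000 * usd_per_million


pages = 1000
image_cost_usd = 2.00  # quoted above for 1,000 page images, not computed here

width, height = a4_pixels(75)
tokens = text_tokens(pages)

print(f"A4 at 75 ppi: {width}x{height}")
print(f"Text tokens for {pages} pages: {tokens:,.0f}")
print(f"Same text as tokens: ${text_cost_usd(tokens):.2f} vs ${image_cost_usd:.2f} as images")
```

Changing words_per_page or ppi shifts the comparison, so the break-even only holds for pages that look roughly like the ones assumed here.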

It's honestly shocking how much gpt4o with vision has simplified things.


