
> I try a bunch of different OCR programs, but can't find any that can transcribe the document with 100% accuracy. They often confuse certain letters or numbers (like 0 and C, 9 and 4, 0 and D). Sometimes they omit characters, sometimes they introduce new ones. I try different font sizes and different fonts, but it doesn't matter.

I feel like this could be trivially solved by plugging an LLM into the OCR output, with the sole task of correcting spelling errors like that. That's pretty much one of the tasks LLMs should excel at.




It's hexadecimal. There is no spelling, so there's no way for an LLM to know if something is supposed to be a `D` or a `0` any more than traditional OCR software can.


yes i noticed that way too late, my bad


Denoising algorithms are always lossy. An LLM (or, y'know, Markov chain) could do this job by exploiting statistical regularities in the English language, but a hex dump isn't quite the English language, so it'd be completely useless. Even if this text were English, though, the LLM would make opinionated edits (e.g. twiddling the punctuation): you'd be unlikely to get a faithful reproduction out the other end.
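A toy sketch of that "exploit statistical regularities" idea, to make the point concrete: score candidate readings by letter-bigram frequency and keep the likeliest. The tiny frequency table is hand-picked for illustration; a real denoiser would estimate it from a corpus.

    # Toy statistical denoiser: pick the candidate reading whose letter
    # bigrams are most plausible. Frequencies below are made up for the demo.
    BIGRAM_FREQ = {("W", "O"): 2e-3, ("O", "R"): 3e-3, ("R", "D"): 2e-3}

    def score(word: str) -> float:
        p = 1.0
        for a, b in zip(word, word[1:]):
            # Unseen bigrams (like "W0") get a tiny smoothing probability.
            p *= BIGRAM_FREQ.get((a, b), 1e-9)
        return p

    # English bigram statistics are skewed, so "WORD" beats "W0RD".
    print(max(["WORD", "W0RD"], key=score))  # -> WORD

In a uniform hex dump every bigram is about equally likely, so this signal vanishes and the denoiser has nothing to work with, which is exactly the parent's point.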


Of course: use search-and-replace to change 0 to "zero", etc., before printing. The OCR will (should) work better on whole words than on lone glyphs.
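A minimal sketch of that substitution, assuming you control both the printing and the transcription side (the word list is my choice, not from the thread):

    # Spell each nibble as an unambiguous word before printing, invert after OCR.
    WORDS = ["zero", "one", "two", "three", "four", "five", "six", "seven",
             "eight", "nine", "alpha", "bravo", "charlie", "delta", "echo",
             "foxtrot"]
    ENC = dict(zip("0123456789abcdef", WORDS))
    DEC = {w: c for c, w in ENC.items()}

    def encode(dump: str) -> str:
        # Non-hex characters (spaces, newlines) are simply dropped here.
        return " ".join(ENC[ch] for ch in dump.lower() if ch in ENC)

    def decode(text: str) -> str:
        # Invert after transcription; a garbled word fails loudly (KeyError).
        return "".join(DEC[w] for w in text.lower().split())

    assert decode(encode("DEADBEEF")) == "deadbeef"

The cost is obvious: the printed dump gets roughly five times longer, which is the overhead the reply below is getting at.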


You might as well just use an error-correction code: same result, less overhead.
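Not a textbook ECC, but here is a cheap sketch in the same spirit: append a CRC32 per line, and on mismatch try each single-character substitution from the article's known confusion pairs until the checksum agrees. The confusion table is taken from the quoted article; everything else is my own illustration.

    import zlib

    # Confusion pairs from the quoted article: 0/C, 9/4, 0/D.
    CONFUSABLE = {"0": "CD", "C": "0", "D": "0", "9": "4", "4": "9"}

    def add_crc(line: str) -> str:
        # Append a CRC32 so a mis-read line is at least detectable.
        return f"{line} {zlib.crc32(line.encode()):08x}"

    def repair(line_with_crc: str) -> str:
        line, crc = line_with_crc.rsplit(" ", 1)
        want = int(crc, 16)
        if zlib.crc32(line.encode()) == want:
            return line
        # Try each single-character OCR confusion until the CRC agrees.
        for i, ch in enumerate(line):
            for alt in CONFUSABLE.get(ch, ""):
                cand = line[:i] + alt + line[i + 1:]
                if zlib.crc32(cand.encode()) == want:
                    return cand
        raise ValueError("uncorrectable line")

    assert repair(add_crc("DEADBEEF").replace("D", "0", 1)) == "DEADBEEF"

Caveat: the checksum itself also goes through OCR, so in real use it would need the same protection; a proper error-correction code handles that uniformly.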


> hex dump

ah, missed that, was just skimming through


Still would not solve the problem of copying data without changing it.



