
Author here. Seeing how difficult it was to get a reliable OCR transcription with commercial software from a pristine, computer-generated representation of the text, I suspect trying to OCR photos would be even less reliable :)

I simplified some things for brevity in the write-up. I did indeed try a bunch of fonts/font sizes (trying a single page at a time and manually inspecting the results) without much improvement.




There is nothing pristine about images transmitted over fax. It's such a grotty old technology with loads of aliasing issues. A modern cell phone picture of a Word screen full of hex would almost certainly be easier to OCR.


Did you try a search-and-replace in Word, changing the problem characters to something else?

e.g.

  0123456789ABCDEF
  012345M7XPAVKHEF
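
A minimal sketch of that substitution, assuming the two rows above are read as a one-to-one mapping from hex digits to replacement glyphs. The names `HEX_DIGITS`, `SUBSTITUTES`, `encode` and `decode` are made up for illustration, and this is not something the author reports having done:

```python
# Mapping copied from the two rows above: top row is the hex digit,
# bottom row is the glyph that would be printed in its place.
HEX_DIGITS = "0123456789ABCDEF"
SUBSTITUTES = "012345M7XPAVKHEF"

# str.maketrans builds a per-character translation table.
ENCODE = str.maketrans(HEX_DIGITS, SUBSTITUTES)
DECODE = str.maketrans(SUBSTITUTES, HEX_DIGITS)


def encode(hex_text: str) -> str:
    """Swap the confusable hex digits for their replacement glyphs."""
    return hex_text.upper().translate(ENCODE)


def decode(ocr_text: str) -> str:
    """Map the replacement glyphs read by OCR back to hex digits."""
    return ocr_text.translate(DECODE)


if __name__ == "__main__":
    sample = "DEADBEEF6789"
    printed = encode(sample)
    print(printed)
    print(decode(printed) == sample)
```

The round trip only works because every glyph in the bottom row is distinct; whether those glyphs actually survive a fax better is an open question.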


That's brilliant! You could even expand each input character to multiple characters to build an error correcting code.
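
One very simple way to do that is a repetition code: print each character several times and take the most common reading of each group. This is only a sketch under two assumptions that are not from the thread: a redundancy factor of 3 (`REPEAT`), and OCR that misreads characters but never inserts or drops them, so the groups stay aligned.

```python
from collections import Counter

REPEAT = 3  # assumed redundancy factor, chosen for illustration


def encode(hex_text: str, repeat: int = REPEAT) -> str:
    """Repeat every character `repeat` times, e.g. 'D0' -> 'DDD000'."""
    return "".join(ch * repeat for ch in hex_text)


def decode(ocr_text: str, repeat: int = REPEAT) -> str:
    """Split the text into groups of `repeat` and keep each group's majority."""
    decoded = []
    for start in range(0, len(ocr_text), repeat):
        group = ocr_text[start:start + repeat]
        # most_common(1) returns [(character, count)] for the top reading.
        decoded.append(Counter(group).most_common(1)[0][0])
    return "".join(decoded)


if __name__ == "__main__":
    print(encode("D0"))
    print(decode("DOD000"))  # one 'D' misread as 'O' is outvoted
```

With a factor of 3 this corrects one misread per group, at the cost of tripling the number of pages to fax.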


Nope! That's a good idea though.

The transcription errors I was getting were not consistent. Like, D would be O or 0 or D, with no apparent rhyme or reason to it. And the turnaround time on each fax attempt was long enough that I focused on doing the image recognition myself instead.


This was a phenomenal effort and such a joy to read. Based on how much work this was, these were probably some very important sound files that mean a lot to someone in your family, so thanks for your hard work getting them off the laptop.

My goofy idea was using the font OCR-A but you'd be very lucky if that Mac came with that.

https://en.wikipedia.org/wiki/OCR-A


Why not display the info as a series of QR images? There probably wasn’t a dev environment on the laptop though.

For the record, you’d have had no problem mounting an image of the HFS disk on any modern Linux or macOS system.


Well, mounting the disk itself. If it was simple to get an image of the disk, the author could have used the same method to just get the files they wanted.


There are many different SCSI-to-USB cables out there for exactly this purpose, even for the weird mini-SCSI interface used by Apple in the '90s.


Couldn't you just do a bunch of different faxes, perhaps in different fonts or different font sizes, which would lead to different randomly distributed errors? Then you can do OCR for all of them, and just take the median of the result, and get exponentially less error.
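
For characters, the closest thing to a median is a per-position majority vote. A rough sketch, assuming every OCR pass yields a transcription of the same length with characters in the same positions (the function name `vote` is hypothetical, and real OCR output would likely need aligning first):

```python
from collections import Counter


def vote(transcriptions: list[str]) -> str:
    """Return the most common character at each position across all passes."""
    lengths = {len(t) for t in transcriptions}
    if len(lengths) != 1:
        raise ValueError("transcriptions must all have the same length")
    return "".join(
        # zip(*...) walks the passes column by column.
        Counter(column).most_common(1)[0][0]
        for column in zip(*transcriptions)
    )


if __name__ == "__main__":
    passes = ["DEADBEEF", "OEADBEEF", "DEAD8EEF"]
    print(vote(passes))
```

This only helps if the errors really are independent between passes; if a given font makes every pass misread the same character the same way, the vote simply confirms the mistake.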


Did you try seeing how good ChatGPT is at OCRing the images? Though since it's hex characters it might not do so well. I've found it to be very reliable at OCRing, e.g., photos of receipts.



