Seems like a good use case for machine learning.



ML is used extensively for this. These folks are handling the low-confidence cases or, I'd imagine, building labeled training datasets.


Yann LeCun pioneered CNNs for this actually. OCR of zipcodes for USPS. Early 90's.


Always wondered why USPS didn't adopt a system for writing zipcodes similar to the one used in the USSR:

https://en.wikipedia.org/wiki/File:Stamp_Soviet_Union_1977_C...

Add: https://en.wikipedia.org/wiki/File:Russian_postal_codes.svg


Some might refuse to write 4s (vs U+1FBF4) and 8s like that as they require two strokes or retracing.


... what?

Can you explain how this is an issue at all and a point of concern for something like USPS?


The (arbitrary?) 9-segment digits in the linked Russian postal code examples, the ones the question about USPS adoption was about, are ambiguous depending on how one chooses to fill in the segments. For example, 4 may be read as 9. 8 is slightly less of an issue, but it is only 1 segment away from 0 and 9. If a machine, or perfectly rule-following humans, were writing and reading the zip using 10 different characters mapped to the Arabic numerals, it would be better to maximize the mutual distance between them, as in an anti-reflected binary code, which becomes an increasingly challenging exercise with more segments and could start at 7.
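
To make the "mutual distance" point concrete, here is a minimal sketch that counts how many segments differ between each pair of digits. It assumes the ordinary seven-segment display patterns (segments labeled a to g) as a stand-in, not the nine-segment Russian postal template, so the exact numbers would differ for the real thing; the names `SEGMENTS`, `hamming`, and `closest_pairs` are made up for illustration.

```python
from itertools import combinations

# Standard seven-segment display patterns (segments a-g). This is an assumed
# stand-in for illustration, NOT the nine-segment Russian postal template.
SEGMENTS = {
    0: "abcdef",
    1: "bc",
    2: "abdeg",
    3: "abcdg",
    4: "bcfg",
    5: "acdfg",
    6: "acdefg",
    7: "abc",
    8: "abcdefg",
    9: "abcdfg",
}


def hamming(x, y):
    """Number of segments that differ between digits x and y."""
    return len(set(SEGMENTS[x]) ^ set(SEGMENTS[y]))


def closest_pairs():
    """Return (minimum distance, sorted list of digit pairs at that distance)."""
    dists = {pair: hamming(*pair) for pair in combinations(range(10), 2)}
    d_min = min(dists.values())
    return d_min, sorted(p for p, d in dists.items() if d == d_min)


if __name__ == "__main__":
    d_min, pairs = closest_pairs()
    print("minimum distance:", d_min)
    print("pairs at that distance:", pairs)
    print("4 vs 9:", hamming(4, 9))
```

In this stand-in encoding, 8 sits one segment away from 0 and 9, so a single missed or extra stroke turns one digit into another. A code designed for error tolerance would try to push that minimum distance up.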


> For example 4 may be seen as 9

Except no, specifically because 9 has a slanted ending compared to 4. If some idiot writes the symbols however they want instead of following the reference, then it's the problem of the idiot.

And anyway, most people would write it properly, which means more mail would be processed efficiently and require less human intervention.


You think handwriting is easier to recognize than digits of a standard form and size?


No


There used to be many of these facilities. This is the last one. The demand has gone down as OCR has gotten better and hand addressed mail volume has decreased.


These facilities exist specifically for the edge cases that the ML models are failing on.


Indeed, and this is why they continue to downscale the number of people doing such tasks.


They need 250 more as per the article (scroll to end).


Just before the end is the likely reason for that: "70 hours a week".

This is the last such facility, after all the others have been closed, so needing more people is not incompatible with the fact that they've scaled down overall.



