Hacker News

"""try to read bad handwriting from records from nuclear weapons plants. It's a dull job a monkey could do but it pays the bills."""

+

"""For years I worked as a computer guy for an academic department and that was fun and I'm trying to get back into it"""

Sounds like you could enjoy automating your work. Learn a bit of Python and some machine learning/deep learning, digitize the handwriting, and build a little program that reads the stuff for you. The standard example dataset for reading handwritten digits is called MNIST, if you want to read further. I'd suggest fast.ai and watching the first couple of lessons. That should get you started playing around with this. You don't have to tell anyone that you're doing this (I probably wouldn't), but hey, at least it might be a nice way to do slightly less boring stuff and ease into a bit of programming/data science?




I would be very, very hesitant to do this.

First of all, you definitely don't want any chance of incorrect recognition of handwriting coming from a nuclear weapons plant.

Secondly, who knows what policies he might violate by using some unauthorized, untested software (whether OP is the author or not) with potentially sensitive information.


    First of all, you definitely don't want any chance of incorrect recognition of handwriting coming from a nuclear weapons plant.
Loving this, this sounds like the plot of a bad sci-fi movie.


It reminds me of the start of "Brazil", bugs and all.


Sounds like a plot for a really decent sci-fi book with an AI as the antagonist.


He can use it such that the model's predictions are still manually verified by him before actually submitting them or whatever. Seems pretty harmless to me.


In his current situation, if he zones out, he produces nothing. In the situation where he's aided by the potentially faulty described system, if he zones out, he produces erroneously transcribed data.


What? In his current situation, if he zones out, he can produce erroneously transcribed data too. What's your point?


What? In his current situation, even if he doesn't zone out, he can still produce erroneously transcribed data too. Why even go to work?


Manual verification is likely just as tedious as doing it by hand in the first place.


Probably. But maybe he'll get a kick out of modeling and automating it. Challenging his intellect seems to be the key point here.


I'd first go for a Pareto approach. There's probably some simple automation that can aid the manual work, is easy to implement, and provides visible performance gains. Stuff like pre-selecting interesting image parts (aim for zero false negatives and accept a high false positive rate) or even just an interface for speeding up dull manual work like pulling documents from the repository, associating relevant documents, and presenting them in a fast-acting UI.
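To make the "zero false negatives, many false positives" idea concrete, here is a minimal sketch. It assumes some detector (hypothetical, not specified anywhere in this thread) already gives each page a score, and that a small hand-labeled validation set exists. The function names and the sample numbers are made up for illustration.

```python
def pick_recall_first_threshold(scores, labels):
    """Return the highest cutoff that still keeps every known positive."""
    positive_scores = [s for s, is_pos in zip(scores, labels) if is_pos]
    if not positive_scores:
        raise ValueError("need at least one labeled positive")
    # Anything scoring at or above the weakest true positive gets reviewed.
    return min(positive_scores)


def preselect(page_scores, threshold):
    """Keep pages whose score reaches the threshold; the rest are skipped."""
    return [page for page, score in page_scores.items() if score >= threshold]


# Hypothetical validation data: detector scores and hand-checked labels.
val_scores = [0.05, 0.40, 0.10, 0.90, 0.35, 0.02]
val_labels = [False, True, False, True, False, False]
threshold = pick_recall_first_threshold(val_scores, val_labels)

# Hypothetical new pages to triage.
new_pages = {"p1": 0.01, "p2": 0.55, "p3": 0.38, "p4": 0.40}
to_review = preselect(new_pages, threshold)
print(threshold, to_review)
```

Zero misses on the validation set does not guarantee zero misses on new pages, so in practice you would probably set the cutoff somewhat lower than the computed one and spot-check the pages that were skipped.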


I've done something similar, automating a similar process. Essentially the screen was split in half. The left was the actual image, the right was the data of what the program thinks it is after OCR and some magic interpretation. The user can scroll around and both panes move together. The user clicks accept on each value. Otherwise the user can edit a value, or add a new value (if it was completely missed), and then accept it. Once everything is accepted they can save the data.

On paper it saved a lot of man hours but it was never fully rolled out. The project didn't have a long lifespan and we had enough people sitting around doing nothing to just manually do all the work. But I got a nice award for it LOL.
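The accept/edit/add workflow described above could be modeled roughly like this. It is only a sketch of the bookkeeping, not the original program: the class names, the `from_ocr` helper, and the sample values are all hypothetical, and the image pane and synced scrolling are left out.

```python
from dataclasses import dataclass, field


@dataclass
class ReviewValue:
    text: str               # what the OCR step (or the user) thinks it says
    accepted: bool = False  # set once the user confirms it


@dataclass
class ReviewSession:
    values: list = field(default_factory=list)

    @classmethod
    def from_ocr(cls, guesses):
        """Start a session from a list of OCR guesses, all unaccepted."""
        return cls([ReviewValue(g) for g in guesses])

    def accept(self, index):
        self.values[index].accepted = True

    def edit(self, index, new_text):
        # An edited value has to be accepted again.
        self.values[index].text = new_text
        self.values[index].accepted = False

    def add(self, text):
        """Add a value the OCR missed entirely; returns its index."""
        self.values.append(ReviewValue(text))
        return len(self.values) - 1

    def save(self):
        # Saving is only allowed once every value has been accepted.
        if not all(v.accepted for v in self.values):
            raise ValueError("every value must be accepted before saving")
        return [v.text for v in self.values]


session = ReviewSession.from_ocr(["U", "5r"])  # second guess is misread
session.accept(0)
session.edit(1, "Sr")
session.accept(1)
missed = session.add("U")
session.accept(missed)
saved = session.save()
print(saved)
```

Blocking the save until everything is accepted is the part that matters: nothing the program guessed reaches the output without a human having looked at it.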


One could make a similar argument that no one should have tried making a car that could drive better than a human.


This is more like some guy in the early 1900s having the job of hand-delivering important information on a horse.

Then deciding on his own on the advice of HN that he should try a car for that task instead, the car breaking down, and the guy getting fired because his employer never approved of this whole "car" thing and he shouldn't have introduced that into his job workflow without having talked to his employer about it.


I will take a look at it, thanks. I have tried some off-the-shelf OCR things and had little success. But what I do is really a needle-in-a-haystack kind of thing. I am looking for a "U" or an "Sr" in a particular place and am happy to find one in a thousand PDF pages.

I am looking into auto-scrolling PDFs, and into batch loading and sequencing them, to make things easier. But I will definitely check out fast.ai - thanks!
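For the batch loading and sequencing part, a small sketch using only the standard library: sort the file names in natural order (so page 2 comes before page 10) and hand them out in fixed-size batches. The file names and the batch size here are made up, and actually opening or scrolling the PDFs is left to whatever viewer is in use.

```python
import re
from itertools import islice


def natural_key(name):
    """Sort key that compares runs of digits as numbers, not as text."""
    return [int(part) if part.isdigit() else part.lower()
            for part in re.split(r"(\d+)", name)]


def batches(items, size):
    """Yield successive lists of at most `size` items."""
    it = iter(items)
    while chunk := list(islice(it, size)):
        yield chunk


# Hypothetical file names; in practice these would come from a directory listing.
names = ["scan10.pdf", "scan2.pdf", "scan1.pdf"]
ordered = sorted(names, key=natural_key)
queue = list(batches(ordered, 2))
print(queue)
```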


Only problem is I don't know how likely it is that the writing is neatly separated by character, which an MNIST-style approach needs, and handwriting recognition isn't accurate enough. Maybe some constraints on the inputs would fix that.


Regretfully pragmatic viewpoint here: The poster is currently relying on this menial job for income. Automating it before he has something else lined up could be a really bad idea, financially.


This is a great idea. If you manage to pull this off you'd gain extremely valuable experience that would allow you to easily transition later on.



