A while back I copied from somewhere this script that does the job nicely. #!/bi...

grimgrin · 2024-03-15T05:30:24.000000Z

In the spirit of sharing, cuz I think this is a great script (thank you), I prefer using maim over scrot simply because it has a --nodrag option. Personally feels better when making selections from a trackpad. Click once, move cursor, click again.

    maim -s --nodrag --quality=10 "$IMG.png"

maim's quality scale runs from 1 to 10, so 10 is scrot's 100.

raphman · 2024-03-15T08:17:37.000000Z

Yet another variation I have been using for ages, using ImageMagick's `import` tool (which probably only works on X11)

    import "$tempfile"
    TEXT=$(tesseract -l eng+deu "$tempfile" stdout)
    echo "$TEXT" | xsel -i -b

dsp_person · 2024-03-15T03:50:05.000000Z

I was using something like this for a while, but I found tesseract did poorly quite often. That resize trick didn't seem to affect much. I'm not sure what pre-processing would make it better.

I'd love to know if TextSnatcher does anything to improve on this. The GitHub page is opaque.

mappu · 2024-03-15T04:38:20.000000Z

The source is pretty straightforward - it's calling `scrot -s -o` to a temp file, and then `tesseract` with no further preprocessing.

https://github.com/RajSolai/TextSnatcher/blob/master/src/ser...

stevesimmons · 2024-03-15T14:59:32.000000Z

> I found tesseract did poorly quite often

The script calls Tesseract in default page segmentation mode (PSM 3). [1]

Depending on the input text, PSM 11 (sparse text, meant for text scattered around the image) would probably work much better. That uses the flag "--psm 11".

[1] From the original repo: string tess_command = "tesseract " + file_path + " " + out_path + @" -l $lang" ;
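
For anyone who wants to try it outside the app, here is a minimal sketch of the sparse-text call. It assumes a recent Tesseract (one that accepts the `--psm` spelling) and a screenshot that has already been saved. The function name `ocr_sparse` and the default language `eng` are my own placeholders, not anything from TextSnatcher.

```shell
#!/usr/bin/env bash
# Hypothetical helper: OCR an existing image in sparse-text mode (PSM 11).
# Usage: ocr_sparse IMAGE [LANG]
ocr_sparse() {
    # $1 = image path, $2 = language(s), defaulting to "eng" (an assumption).
    # Using "stdout" as the output base makes tesseract print the text
    # instead of writing IMAGE.txt.
    tesseract "$1" stdout -l "${2:-eng}" --psm 11
}
```

Something like `ocr_sparse shot.png eng+deu | xsel -i -b` would then put the result on the clipboard. Whether PSM 11 actually beats the default depends on the input, so it is worth comparing both on your own screenshots.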

aidenn0 · 2024-03-15T20:03:47.000000Z

Having used Tesseract for OCR for other things, getting the right PSM helps but it's still rather terrible, especially for sans-serif fonts, which are common in UIs.

Granted there's a lot of ambiguity in sans serif fonts, lower-case "L", vertical bar, and upper-case "i" can even be pixel-identical, but I've seen tesseract turn

  Chapter III

into

  Chapter |l1

which really surprises me. In fact, for books, I ran the output through sed to replace each vertical bar with an upper-case "i", and it significantly improved recognition.
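
The sed step is roughly the following. This is a sketch rather than my exact command, and the name `fix_bars` is made up for illustration. It only makes sense for text where a literal "|" is unlikely (prose, not code or tables), and it does nothing about "l" or "1" being misread.

```shell
#!/usr/bin/env bash
# Hypothetical post-processing filter: replace every vertical bar with a
# capital "I". Reads OCR text on stdin and writes the result to stdout.
fix_bars() {
    # In sed's basic regular expressions an unescaped | is a literal bar.
    sed 's/|/I/g'
}
```

It slots in after the OCR call, for example `tesseract page.png stdout | fix_bars`.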

hiAndrewQuinn · 2024-03-15T07:15:23.000000Z

I had a PowerShell script which did this as well, but alas, it was lost to time with the rest of my little scripts from my last job.

Apologies to all of my fellow Unix-Windows borderers.

Arch-TK · 2024-03-15T10:15:33.000000Z

  trap "rm $IMG*" EXIT

see https://www.shellcheck.net/wiki/SC2064

also, use mktemp -d and recursively delete the directory
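
A minimal sketch of what I mean, assuming bash and a `mktemp` that supports `-d`. The wrapper name `with_tempdir` is hypothetical; the point is the single-quoted trap and the directory cleanup.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: run a command with a private scratch directory
# appended as its last argument, and delete that directory afterwards.
# The ( ... ) body runs in a subshell, so the EXIT trap fires when it ends.
with_tempdir() (
    tmpdir=$(mktemp -d) || exit 1
    # Single quotes defer expansion of $tmpdir until the trap runs (SC2064),
    # and the inner double quotes keep paths with spaces intact.
    trap 'rm -rf -- "$tmpdir"' EXIT
    "$@" "$tmpdir"
    # Exit explicitly with the command's status so the trap runs here.
    exit "$?"
)
```

The screenshot script would then write its files inside the directory it is handed, and nothing is left behind even if the user cancels the selection.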

doix · 2024-03-15T04:56:56.000000Z

This is perfect for me! Having a window with a button that I need to click is much worse than just binding a script to a hotkey.