
I have seen excellent performance with Florence-2 for OCR. I wrote a post, https://blog.roboflow.com/florence-2-ocr/, that shows a few examples.

Florence-2 is < 2GB so it fits into RAM well, and it is MIT licensed!

On a T4 in Colab, you can run inference in < 1s per image.
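
For anyone who wants to try this, here is a minimal sketch of running and timing Florence-2 OCR with Hugging Face transformers. It is not taken from the blog post. The checkpoint name `microsoft/Florence-2-base`, the `timed` helper, and the command-line image path are my assumptions; the generation settings follow the usual model-card pattern, and the per-image time will vary with hardware and image size.

```python
import time

# Assumption: the base checkpoint on the Hugging Face Hub; swap in another Florence-2 variant if needed.
MODEL_ID = "microsoft/Florence-2-base"
# Florence-2 task prompt for plain-text OCR.
TASK = "<OCR>"


def timed(fn, *args, **kwargs):
    """Hypothetical helper: call fn and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


def load_florence2(model_id=MODEL_ID):
    """Load the processor and model, on GPU if one is available."""
    # Heavy imports are deferred so the helpers above work without them installed.
    import torch
    from transformers import AutoModelForCausalLM, AutoProcessor

    device = "cuda" if torch.cuda.is_available() else "cpu"
    processor = AutoProcessor.from_pretrained(model_id, trust_remote_code=True)
    model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)
    return processor, model.to(device).eval(), device


def run_ocr(image_path, processor, model, device):
    """Return the text Florence-2 reads from a single image."""
    import torch
    from PIL import Image

    image = Image.open(image_path).convert("RGB")
    inputs = processor(text=TASK, images=image, return_tensors="pt").to(device)
    with torch.no_grad():
        generated_ids = model.generate(
            input_ids=inputs["input_ids"],
            pixel_values=inputs["pixel_values"],
            max_new_tokens=1024,
            num_beams=3,
        )
    # Special tokens are kept because the post-processing step parses them.
    raw = processor.batch_decode(generated_ids, skip_special_tokens=False)[0]
    parsed = processor.post_process_generation(
        raw, task=TASK, image_size=(image.width, image.height)
    )
    return parsed[TASK]


if __name__ == "__main__":
    import sys

    # Only runs when an image path is passed, e.g. `python ocr.py page.png`.
    if len(sys.argv) > 1:
        processor, model, device = load_florence2()
        text, seconds = timed(run_ocr, sys.argv[1], processor, model, device)
        print(f"{seconds:.2f}s on {device}")
        print(text)
```

The first call includes CUDA warm-up, so time a second image if you want a number comparable to the one above.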




This looks good, I will investigate integrating it into my project. Thanks!


I couldn't find any comparisons with Microsoft's TrOCR model. I guess they are for different purposes. But since you used Florence-2 for OCR, did you compare the two?


This is pretty cool. When I checked how Microsoft's models (at the time) stacked up against Donut, I chose Donut. I didn't know they had published more models!



