Building an Internet Scale Meme Search Engine (findthatmeme.com)
785 points by whoisburbansky on Jan 11, 2023 | hide | past | favorite | 151 comments



This title really undersells the absolute insanity of the described solution. This is a beautiful example of "if it's stupid, but it works, it's not stupid." The justification is very convincing.

One thing I'm curious about: how did you build your corpus of meme images and videos?


It reminds me a little of https://www.beeper.com/. It allows you to read iMessages on Android and other platforms. To make this work, they will ship you some old iPhone to act as a server "bridge": https://twitter.com/ericmigi/status/1351934418961661959

If it works, it works. But it also speaks volumes about Apple's disregard for/inexperience with exposing their stuff via the web - https://www.icloud.com/ being the prime example: half the stuff the phone apps can do isn't available (you cannot create a reminder with a due date...), and the things that are there are slow and buggy.


I think I've seen a post from https://texts.com about them doing something similar. I don't think they ship you the iPhone though; they host it themselves.


I would imagine the only scalable way to run such a service is to run macOS virtual machines with multiple user accounts for each iMessage user.


(Not the author) Maybe they leveraged https://knowyourmeme.com/? But that can't possibly have all the random memes, could it?


Author here: KnowYourMeme is one of many sites that memes are continually ingested from (any site that has memes I try to ingest regularly) :)


Amazing work! Also, thank you for making that feed on the main page, been laughing for a while here :D


Also lost 20 minutes doom scrolling that feed. Add an upvote button and some ML and you could destroy some lives.


Thanks! Comment made my night.


Nice iPhone cluster.

Have you tried something based on deep learning that uses Transformers: https://github.com/roatienza/deep-text-recognition-benchmark (the available weights are for tasks that seem similar to OCR, so there's a good chance you can use it out of the box)? With a good GPU it should process hundreds to thousands of images per second, so you could likely build your index in less than a day. (Maybe you can even port it to your iPhone stack :) )

https://github.com/microsoft/GenerativeImage2Text (You'll probably have to fine-tune it on the custom dataset you've built.)

There are tons of other freely available solutions you can find by searching for keywords like "image to text ocr", "transformers", "visual transformers"...


You can do better than a general image-to-text model for reading memes, because they mostly use the same fonts - so you want something trained on synthetic data generated with those fonts.


Personally, I've been hunting for something that can extract both the text and the associated image. I've never seen anything that can do both.


All hail the memelord!


How do you ingest your social circle's in-group memes? Are they reliably posted to meme generator sites?


What about copyright?


OP’s meme site lists where each image comes from. Looking through it I mainly see ifunny and 9gag.


Do you crawl telegram channels?


Yeah, but lots of things that work are still stupid, because many other solutions work better. The greatness of this crazy solution is that it really does seem like the best one given the price constraints.


I feel like I’m taking crazy pills in this thread. Am I the only one who talks to Gen Z kids who explore around their iPhone apps? This definitely isn’t the best option given price requirements. It’s not even the most convenient option.

I’m around age 30, not 13, so similar to the article, my first instinct was also to create a database and OCR the image. But by total coincidence, yesterday I had a conversation with my 14 year old cousin on the topic of saving memes. Her response was along the lines of “yeah, everyone nowadays just saves the image to your iPhone photos, and then just search for it later from the photos app”.

Yeah. This whole article is literally already built into iOS UI, not just a hidden API. And kids all seem to know about this, apparently.

This article uses an example meme with the text “Sorry young man But the armband (red) stays on during solo raids”. I saved it in my iPhone photos app… and found it again through the search function in the photos app.

https://imgur.com/a/BPICjOz

https://imgur.com/a/55el9uQ

This is a solved problem already, by teenager standards.

I felt extremely old yesterday when I was talking to my cousin. And I felt extremely old today, reading this article. This is because looking back, the past few decades of CS cultural intuition have established that text is text, and images are images. Strings and bitmaps don’t mix.

This seems sort of obvious to anyone in tech, but I realized that from a clueless grandma's perspective, it was never obvious that you couldn't search for text in photos. Well, the roles are reversed now. Ordinary people now have access to software that treats text in photos as a first-class citizen by default.


The entire point is to find memes you don't already have.


No, it's not. The author even mentions private memes that are in-jokes. He built a service that can be used to explore memes, but people generally don't search for new memes in a search engine. They tend to use the search engine to find memes they already have.


knowing != having


How would you solve sourcing and distribution using just iOS though? Sure, it's built into iPhones, but if you want to create a comprehensive globally accessible meme search I don't think you can do that by saving memes to your iPhone.


A kid doesn’t care about that. They just save the memes they like, including the custom ones their friend made that don’t make any sense to anyone else. You don’t need a comprehensive global search engine if you have a tool that gives exact, personalized answers about images you’ve saved before. And kids these days save everything; it’s like how people use Gmail: no point in deleting if you can just archive.

If you’re talking about kids without iPhones, then I don’t know, I’m assuming there’s probably some competitor apps on android now.

But I think you’re thinking too narrowly. Don’t worry about a meme database. What about a searchable visual database of everything you’ve ever seen?


The use case for the meme database is slightly different: it's to find that meme you saw somewhere else. Local search isn't enough then.

Though I share your feelings somewhat - I was completely surprised by what you wrote in the comment higher up about iPhone gallery search. I didn't realize this is possible in a reliable fashion, much less off-line and deployed in a mass market device.


> The use case for the meme database is slightly different: it's to find that meme you saw somewhere else.

The whole idea is that whenever you find a meme you like, you save it, so all search is still local.


Just like we meticulously remember to catalog every song, movie, joke, or otherwise informative sensory input we’ve ever been exposed to?

I know the obvious response to the above will be “yes, really, teens do that. Pictures of everything”

But really? Everything?


Complete supposition: maybe teens' exposure to memes is through non-private messengers and apps, meaning all media is saved automatically on the phone and available through a search. I don't think they use the web much anymore.


Yes, and the database is useful for all the situations when you saw a meme on someone else's device, or embedded in some piece of content, so you had no way to save it.


Your cousin and her friends might not care about that, but I'm not sure it applies to all kids worldwide, honestly. I'm sure "meme collecting" is a common practice among many teens, but I don't think it means that every teen saves every meme/image they encounter.

You know that some teens don't even save images? They store them on specific Instagram accounts they make for a specific category. My cousin had an Instagram account for close friends (5 people) where she only shared bad things that happened to her during the day. Another one for nice things, etc. All those memories were recorded in-app and never saved on the device, only stored as stories on the account. Guess what? She was sad a while ago because she somehow lost access to one of these accounts, and with it all those pictures/videos.

Also, the fact that some teens save a lot of pics on their devices doesn't mean much in the grand scheme of things. In 2008 I had folders upon folders of images downloaded from the internet. Now I'm not even sure where they are, probably on some hard drive in some closet. You can be sure that if I remember any of that content I won't dig up my hard drive, but Google it (use a comprehensive global search engine). We have no idea where all this data collected by teens will be in 15 years; it's not unlikely that it will be lost, or archived in hard-to-reach places and forgotten. I've sometimes stumbled into "meme dumps" where they upload all their memes to a service to free up space on their device/iCloud.

For sure teens use technology in ways that might be unexpected and counter-intuitive to us, but I don't think that invalidates in the slightest the need for a global meme search engine. It's a good idea if I want to find a meme that I saw, a need that I don't think will disappear anytime soon, and it's not a millennial-and-up-only problem either.


I only realised quite recently that I could now select text in images on my iPhone the same way I could if I was looking at a web page.


You've been able to do this since 2021!

https://www.macrumors.com/how-to/copy-paste-text-from-photos...

I really want to emphasize how insane this situation is, because I think most tech people won’t realize what’s happening unless it’s pointed out.

If you’re a typical tech person, you probably look at this, go “oh, iOS Photos now OCRs every photo. Cool, that’s 2000 or 2010 era fancy tech, boring these days. And then a search engine on those strings, yeah cool, nothing too mind blowing”. The sheer boring-ness of this by tech people standards meant that this iOS UI change went under the radar.

That’s not true for non tech people.

The people who discovered this, put this to use immediately. You can search up anything from an image now. Old memes? Sure. Forgot the name of a restaurant you went to, but remember that you took a picture of the menu and the beef dish was amazing? Search up the word “beef” and it’s probably in there. Took a screenshot of an article, remember 1 or 2 words from it, but can’t find it on Google? Search for those 1-2 words you remember to find the screenshot, then use the phrases in the screenshot to find the article on google. Trying to find a picture of a cat you saved? Type in ”cat” and search for it. Yes, the photos app can do that too.

Screenshots are cheap and instant. Kids never delete them. It’s like how Gmail “archive” feature in 2005 revolutionized email because you never had to delete an email. Well, iCloud Photos “optimize storage” means that you can effectively store infinite screenshots.

There’s another UX revolution happening in terms of saving information. Photos quietly became instantly searchable, and nobody seems to have really noticed the implications this has for storing memories and boosting recollection. This could be the equivalent of “you’ll always have a calculator in your pocket”, but for memory techniques like spaced repetition.


> I really want to emphasize how insane this situation is, because I think most tech people won’t realize what’s happening unless it’s pointed out.

Count me in. If you asked me about OCR itself, I'd probably say "yeah, it's mostly been solved for a good decade for print books and articles, but it's still unreliable enough to be annoying". I somehow never considered that OCR might have gotten better - possibly because my main exposure was through badly OCRed book scans and a built-in OCR in some PDF reader I used at one point.

It definitely didn't occur to me that OCR works well enough on arbitrary images, and it's cheap enough compute-wise that you could do it locally in a casual fashion.

Nice thing you have there in the Apple garden. Over here in Android land, I have the opposite problem. You say:

> This could be the equivalent of “you’ll always have a calculator in your pocket”, but for memory techniques like spaced repetition.

and all I can think of is how I recently became convinced that my Samsung flagship is losing my photos. There have been a couple of cases over the past few months when I felt really damn sure I took a set of photos of something (e.g. a remodeled kitchen), but when I checked on the phone, it turned out those photos don't exist, or there is maybe just one where I expected 5-10. They aren't in the gallery. They aren't in the filesystem. Poof, gone.

So either I'm getting senile in my 30s, or something is off with the way my phone stores photos. I did a web search for this the other day; there are relatively recent reports online complaining about the same thing, but no one has any evidence. I'm thinking about doing an experiment now (basically, take extra photos every day and document them in a paper notebook, then check after half a year whether the photos match the notes) - but the point of me sharing this is: I no longer trust new tech, smartphones in particular, to handle the basics correctly. Much less do something advanced like reliable text search on images.


There is a chance that your photos are being backed up by some cloud service and being removed from your gallery. The most likely suspect is Google Photos.

Note that Google Photos not only OCRs, but also does visual search of objects, faces, scenery, etc., and is extremely powerful.


> There is a chance that your photos are being backed up by some cloud service and being removed from your gallery. The most likely suspect is Google Photos.

I have Google Photos upload and backup both disabled.

But then, I'm pretty sure either the Google or the Samsung SMS app had a "feature" to automatically delete old messages (for a definition of "old" that was neither specified nor configurable), and it defaulted to ON on my current phone, likely costing me a significant chunk of my message archive (that I dutifully transferred over from the previous phone) before I accidentally found and disabled the switch.

So yeah, could be Google Photos deleting it. Or someone else. I don't trust Android as a platform anymore.

BTW, about this "delete old messages" "feature" - most likely it was implemented for performance reasons. But the thing is, you're unlikely to send or receive enough SMS in your whole life for them to take a noticeable amount of space. The irony here is, I do remember a case where the messaging app would become slow and laggy if you had enough texts stored on the phone - but that was solely because someone implemented the message list as a linked list, thus adding an O(N) multiplier to many GUI operations.


Well, maybe this isn't for teenagers with iPhones.


Nice project. I wanted to build a meme search engine myself one day, but figured Tesseract would fail on most memes because of how stylized they've become. So I narrowed my meme source down to only /r/bertstrips, as those contain sane-looking text, and it's working quite alright - the project has no frontend yet; I search from the CLI and click links.

> Initial testing with the Postgres Full Text Search indexing functionality proved unusably slow at the scale of anything over a million images, even when allocated the appropriate hardware resources.

I can guarantee you that correctly set up PostgreSQL text search will be faster than ES with much, much less hardware needed; it's just a matter of creating a tsvector column and a GIN index on it (and of course writing queries so the index is actually used). I can help you out setting up the Postgres schema and debugging queries if you're interested, for testing purposes at least.
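For anyone curious, the setup being described looks roughly like this (table and column names are hypothetical, and generated columns need Postgres 12+):

```sql
-- Hypothetical table of OCR'd meme text: memes(id, ocr_text, ...)
ALTER TABLE memes ADD COLUMN ocr_tsv tsvector
    GENERATED ALWAYS AS (to_tsvector('english', coalesce(ocr_text, ''))) STORED;

CREATE INDEX memes_ocr_tsv_idx ON memes USING GIN (ocr_tsv);

-- Filter with @@ against the tsvector column so the GIN index is used,
-- then rank only the rows that matched
SELECT id, ts_rank(ocr_tsv, q) AS rank
FROM memes, websearch_to_tsquery('english', 'spongebob') q
WHERE ocr_tsv @@ q
ORDER BY rank DESC
LIMIT 50;
```

Filtering with `@@` before ranking is also what keeps `ts_rank` from forcing a full table scan.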


I recently worked on a project using lnx.rs. Simple to set up and use, and fast at the scale I was using it. Built on Tantivy with a custom fast fuzzy search feature.

If you want to go beyond meme sites and possibly detect memes in the wild, common crawl might be something to start with.


One issue I've had with Postgres full text search is that when you want to rank using ts_rank, you can end up with a full table scan.


This is really brilliant to see, and I've been surprised for quite a long time that nothing similar exists. I think it's a real shame that few people with interest in memes have interest in building solutions like this that help us engage with them.

People in the 21st century know a lot about the mistakes of the past century that led to much popular culture of the time being lost (especially terminally online people who've watched lots of Youtube documentaries about lost Dr. Who episodes and so on), so it surprises me how little we try and avoid those same mistakes with today's ephemeral pop culture in the form of memes. People like yourself who want to help make the internet's huge corpus of memes tractable are part of the solution in terms of meme archival and cultural memory.

(There's a good meme metadiscussion group on Discord, "The Philosopher's Meme," which you might be interested in joining. People there would be very keen to discuss what you've made.)


I've always been surprised that Reddit hasn't built meme search into its site search.

Memes are a core part of the Reddit experience, yet it's difficult to find something I know I saw before.


Not familiar with Discord, do you have a link to that group?



Love the hackiness of this - however, the Vision framework is available on desktop Macs as well - https://developer.apple.com/documentation/vision

and specifically:

https://developer.apple.com/documentation/vision/vnrecognize...
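For reference, calling that API is only a few lines of Swift. A minimal sketch per Apple's docs (error handling, batching, and the HTTP wrapper the author presumably built are all omitted):

```swift
import CoreGraphics
import Vision

// Minimal sketch: OCR a single image with the Vision framework.
func recognizeText(in cgImage: CGImage) throws -> [String] {
    let request = VNRecognizeTextRequest()
    request.recognitionLevel = .accurate   // trade speed for accuracy
    request.usesLanguageCorrection = true

    let handler = VNImageRequestHandler(cgImage: cgImage, options: [:])
    try handler.perform([request])

    // One string per detected text region, taking the top candidate for each
    let observations = (request.results as? [VNRecognizedTextObservation]) ?? []
    return observations.compactMap { $0.topCandidates(1).first?.string }
}
```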


> My preliminary speed tests were fairly slow on my Macbook. However, once I deployed the app to an actual iPhone the speed of OCR was extremely promising (possibly due to the Vision framework using the GPU). I was then able to perform extremely accurate OCR on thousands of images in no time at all, even on the budget iPhone models like the 2nd gen SE.

He does mention running it on a MacBook.


I would guess that "tests" in this sentence refers to tests of the iOS app in the simulator - which would be slow, as expected.


I would think it would run well on a M1/M2 Mac as a native app though, right?


That was my question as well. I'm wondering how much of a performance benefit the neural engine has on M-series chips when compared to Intel chips.


I would assume he's using an Intel MacBook and wouldn't have the GPU acceleration (and subsequent Vision framework integration) of the M1.


There's ocrit, a CLI utility using Apple's Vision framework for OCR: https://github.com/insidegui/ocrit


What's the cost of building and running a cluster of iPhones vs Mac Minis?


In the article, $40 second-hand iPhones with banned IMEIs and broken screens are being used, so...


There's a ton of compute power available in the form of unused phones.


See the photo under "Upgrading the iPhone OCR Service Into An OCR Cluster". In the future, data centres are going to host racks of iPhones.


It's already here, to be honest. I know BrowserStack and other mobile testing platforms (at Facebook and Amazon) host real devices, both Android and iPhone, in server farms like this. Meta wrote a blog post about it: https://engineering.fb.com/2016/07/13/android/the-mobile-dev...

At one of my previous workplaces, we discussed running the Z3 theorem prover on an iPhone cluster, because it runs so much faster on A-series processors than on a desktop Intel machine.


Reminds me of imgix, who built their product on Apple libraries and so ended up with racks of Macs running their service.


Modern app click farms already have walls of iPhones.


To be fair...insane performance in a tiny cool package...iPhones would make great servers if you could order them without the screen/camera etc. :-)


There's your startup right there - washing-line racks of discarded iPhones, near bleeding edge, busted screens, still functional otherwise.

Low entry cost, recycling, eco-friendly, . . . a deck that writes itself.


I had a friend in med school who wrote a very early note-taking app for the iPad. Turns out that there was no way to render PowerPoint files when the iPad first came out. He realized that the iOS/Mac OS "quick preview" function could be used to take screenshots of each PowerPoint slide. For a brief time, his was the only app that could display PowerPoints (albeit, they were just screenshots!). There's a lot of hidden utility in Apple libraries.


Love the inventiveness.

My question is about the image distribution costs. All the memes on the site seem to be coming straight off an object storage, all that bandwidth consumption has got to add up(?). Some sort of a CDN might help depending on the search patterns.


Although not as elegant a solution as this, I've also tried my hand at indexing and categorizing memes. I wanted to save a very specific type of meme, though, since there are, in my mind, two main categories of memes.

The first category is what I call "story" memes; they are standalone and typically what you see being shared on Facebook. They usually have text, are able to tell a story on their own with no additional context, and can be presented as a single post, story, etc. (think 4-panel comics).

The second type is reaction memes. These are used to respond to people and usually convey a feeling towards a post or tweet. They can also be standalone, so they should probably be considered a subset of the "story" memes.

I've gravitated towards the reaction memes, as I see more utility in them and they can be used in a more universal way. My site, if anyone is interested (it's still a work in progress):

https://www.memeatlas.com


These different approaches really complement each other - most of the memes you've categorised are used in a variety of situations and therefore not suited to text searching. Meanwhile, if you're looking for a specific meme that you've seen, text search is the way to go.

Ideally there would be a best of both worlds where you could search memes by "characters" or "formats" in addition to text.

As feedback, it would be nice to search all memes from the homepage. The search on https://www.memeatlas.com/meme-templates.html also seems to be broken.


What are you using as the back end to host your website?


If you don't need advanced search features, you can use Sonic (https://github.com/valeriansaliou/sonic). It's blazing fast and you can save a lot of money on servers.


Someone mentioned lnx.rs in this thread, how does it compare? And how about all the other new Rust solutions like MeiliSearch, Toshi, Quickwit?


I don't know, I am not an expert on this topic. But this could be a nice topic for a blog article that will hit the frontpage on HN :)


I sat down literally last night and started sketching out the scratch-my-own-itch solution to more or less exactly this problem, because I too have meme-aphasia where I know there exists a meme that fits perfectly in a conversation, but I have about 5 seconds to find it before the moment passes.

I'm so, so glad to see that I'm not the only person in the world with the same "problem". Well done, mandatory.

edit: holy crap you even index videos, nice


I wonder how the performance of Vision.framework on desktop Mac hardware compares to a cluster of phones. (The author mentions that it was "fairly slow", but it sounds like they were running an iOS app in the simulator and not a macOS app.)


I would expect an M1 native Mac app to work similarly well. Though the iPhone solution may win on price.

The framework is supported on macOS (even tvOS apparently) https://developer.apple.com/documentation/vision


Same thoughts. You can make that fly on a Mac Mini (provided it can be made to work close to the metal and not in an emulator)


Does the Vision API call back to Apple servers in any fashion? Like how Google's on-device voice recognition APIs will call back to Google when you are online (unless you explicitly pass flags to force offline mode).

If so, is there any risk in getting your account suspended or ip range banned somehow because of this, for example?


Nope, you can use it totally offline. No way of getting banned as far as I'm aware.


Absolutely amazing on the tech side!!!

Now, after reading the article, I gave your search engine a try. I was looking for that Futurama "it's a trap" meme (it pretty much pops up on any image search: https://www.google.com/search?q=futurama+its+a+trap).

The problem is, the search engine you built is very text-heavy, and the text is often only loosely connected to the actual meme. So searching for "its a trap" did not yield the results I was hoping for, though that made total sense given how the search is implemented.

Are you planning to implement some sort of actual content tagging? Maybe clustering of similar objects (like how the iPhone clusters similar people's faces in the gallery), and then tagging those clusters with keywords somehow?


Yes, I definitely want to make the search better. It is currently very text-heavy, and I (only recently) got image similarity indexing working. Hoping to leverage this to do something like you mentioned!

I'd also like to figure out how to turn an image into a description of what's in it. My ML/TensorFlow knowledge is very weak though, so I still have a lot to learn here.


Do you use mongodb to make it web scale? You turn it on and it scales right up.



This is great, I particularly like the part about using compute from old unwanted iPhones. Quite an inventive way to reuse/recycle otherwise obsolete hardware!


I have absolutely no experience in this area and I'm curious:

is there really no open-source text recognition software that's on par with or close to Apple's (presumably proprietary) implementation? The article mentions Tesseract. Is that the current best open-source option?



This is remarkable. I'd love to see that combined with some kind of sentiment analysis like Microsoft offers, just to see if something useful comes out of it.

Sometimes, I don’t know the exact words when looking for a meme, but once I see it, I know that’s the one.


Semantic search using CLIP embeddings / other LLM embeddings could be an amazing addition too.


Unfortunately semantic analysis barely works at the best of times, but it especially doesn't work here. Computers… they're just not good at irony.

IME CLIP embedding search can work strangely on memes as well, because it gets confused when images have words in them. Basically the same problem reported in the original CLIP paper where it thinks an apple and a piece of paper with "apple" written on it are the same thing.
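To illustrate the general idea (this is not the author's implementation): once each meme has an embedding from some encoder such as CLIP, search reduces to embedding the query and ranking memes by cosine similarity. A dependency-free sketch, with made-up ids and toy vectors:

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def rank_memes(query_emb, meme_embs):
    """Return meme ids sorted by similarity to the query embedding.

    meme_embs: dict of meme_id -> embedding (e.g. from a CLIP image encoder).
    """
    return sorted(meme_embs,
                  key=lambda mid: cosine(query_emb, meme_embs[mid]),
                  reverse=True)
```

In practice you'd use an approximate nearest-neighbour index rather than a linear scan, but the ranking criterion is the same.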


> My preliminary speed tests were fairly slow on my Macbook. However, once I deployed the app to an actual iPhone the speed of OCR was extremely promising (possibly due to the Vision framework using the GPU). I was then able to perform extremely accurate OCR on thousands of images in no time at all, even on the budget iPhone models like the 2nd gen SE.

I suppose that’s an old Intel MacBook? I’d be very surprised if the Vision framework performs better on a 2nd gen iPhone SE than even the first M1 MacBook Air.


I think he was running it in the simulator - which won't perform anywhere near as fast on this type of thing.


The simulator runs a native build though. Might not be optimized, not sure.


Would love to see that load balancer implementation, as I'm a scrub and this project fascinates me.


nginx makes it really simple to set up a load balancer. It defaults to distributing requests equally between all upstream hosts, but you can always assign weights to each of your servers. https://docs.nginx.com/nginx/admin-guide/load-balancer/http-...
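A minimal sketch of such a config (the IPs, ports, and weights are made up, not from the article):

```nginx
# Hypothetical pool of iPhone OCR workers
upstream ocr_phones {
    server 192.168.1.101:8080 weight=2;  # newer phone, takes more traffic
    server 192.168.1.102:8080;
    server 192.168.1.103:8080;
}

server {
    listen 80;
    location / {
        proxy_pass http://ocr_phones;
    }
}
```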


Yep, this is exactly what I'm running on the raspberry pi LB. Nginx makes it super easy!


I have a "hackish but works for me" meme database: I use my Telegram "self chat" to send memes I like to myself, and I tag them with the kind of words I'm likely to search for when looking for them later.

Works great for me.

It's kind of like trying to come up with a good Google search phrase, based on how other people must have phrased something, but relying on knowledge of how you phrase things instead.


I do this for some, but most of the time my use case is to look for something very obscure from a long time ago that I didn't regard as interesting when I first saw it (so didn't make the effort to manually categorize it).


Wait, what? This is absolutely brilliant. It's actually insane that it works so well using a stack of iPhones as an OCR server. My deepest respect.


IaaS - iPhone as a Service, coming soon to AWS.


Now this is the sort of disgusting pile of jank I love to see.


Very inventive. Admittedly when I read the first few paragraphs, I was thinking “he’s got to have $40K of iPhones doing image processing” but you made a good point about being able to use iPhones with screen and other damage.

What was your average price per iPhone, if you don’t mind disclosing?


Last time I looked into OCR stuff I came to a similar conclusion (though I didn't implement anything back then). It would be really nice to have "open source" models with similar performance, without having to deal with the iPhone cluster hackery.


If only there was a way to filter out ifunny results, I absolutely detest that watermark.


> Better yet, I don’t even want to use them as phones, so even iPhones that are IMEI banned or are locked to unpopular networks are perfectly fine for my use.

Fences worldwide will be overjoyed to hear of this novel application.


I have a question: what do you guys think is the best back end for a video search engine app?


Outrageous effort! So far Japanese and Mandarin are returning results as well.

Do you have a list of sources the memes are ingested from?

Would be nice to have some option to explore memes by category.


Very cool project! I'll try to remember it the next time I'm looking for a specific image. I noticed that repeated appearances of the search term are ranked higher, which isn't necessarily productive. Also, some kind of duplicate detection would be nice: searching for "SpongeBob" yields many copies of the same image that mentions "SpongeBob" several times.



I was hoping this would help me find the Database Iceberg meme that shows different levels of database insanity. It didn’t. Anyone have a link?



Yes, that’s it, thank you. I'd been searching a few days for it.


You're welcome. FWIW I literally put your wording into google image search to find it:

> Database Iceberg meme that shows different levels of database insanity

But possibly my google search results are tailored differently from yours.


Are you going to open source the "app" part of it?

I would love to replicate this setup for my own project....

I am thinking: load-balanced, multi-location redundant "iOS machines" with 3-4 phones each, with power backup and an internet dongle.

We could use something like ZeroTier/Tailscale to reach them from outside the local network.


This is amazing. Out of curiosity, why not try deep learning OCR software instead of Tesseract? PaddleOCR is popular.


>PaddleOCR

I once tried installing it on a recent Ubuntu. After messing with dependency hell and pip downloading half of the internet, when I finally invoked the CLI, it complained _again_ about a missing runtime dependency.

I called it quits. DL people are simply not interested in bundled static binaries.


That's a fun way to do OCR. Next up: Classifying memes by subjects and themes to build something like KnowYourMeme's gallery, but for every meme.

Bonus: Index from a lot of sources to help track a meme's origin.

This type of thing is on my long list of "can somebody else please do this already".


Pretty insane. If you don’t want to use iPhones, I made macOCR a while back. It uses the same vision APIs, with a very simple CLI interface. See: https://github.com/schappim/macOCR


You can do it on macOS as well, it has the same API for fast high quality OCR. I used it to create an OCR system to detect secrets or credentials in screencasts: https://github.com/peterc/videocr


That's genius. I realized the same cost advantage running text-to-speech on an old Android versus Google Cloud.


Don't you have to re-sign and re-deploy your iOS app every 7 days to keep it running on the iPhones?


Is there a pgsync equivalent for Oracle? Spent some time building replication from a source-of-truth to a search engine at a previous job.

Wish we could have used postgres but the tools were dictated rather than letting the requirements drive the tooling.
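Not Oracle-specific, but for anyone curious, here's a minimal sketch of the pattern tools like pgsync automate and which works against any source database: poll the source-of-truth table for rows changed since the last sync and push them into the search index. sqlite3 stands in for Oracle here, and index_row() stands in for the search-engine client — both are illustrative assumptions:

```python
import sqlite3

# Source of truth: a table with a monotonically increasing version column.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE memes (id INTEGER PRIMARY KEY, text TEXT, version INTEGER)")
db.executemany("INSERT INTO memes VALUES (?, ?, ?)",
               [(1, "one does not simply", 1), (2, "distracted boyfriend", 2)])

synced = []
def index_row(row):
    # In real life: POST the row to Elasticsearch/Solr/etc.
    synced.append(row)

def sync(last_version):
    """Push every row changed since last_version; return the new cursor."""
    rows = db.execute(
        "SELECT id, text, version FROM memes WHERE version > ? ORDER BY version",
        (last_version,)).fetchall()
    for row in rows:
        index_row(row)
    return rows[-1][2] if rows else last_version

cursor = sync(last_version=1)  # picks up only the row with version 2
```

Run it on a schedule (or hook it to triggers/CDC) and the cursor gives you incremental, restartable replication.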


I'm curious how well the iPhone OCR actually works. How do you deal with errors? Is the error rate low enough that you can accept the output from the iPhone OCR as is or do you also run it through a cleaning process (e.g. spell check)?


For a single data point, it works exceptionally well for me. I routinely copy-paste from images or screenshots, and it rarely fails (mostly for handwriting or obscure fonts).


I am not sure if the Photos app search also uses the same OCR. But sometimes I search for a word and iOS will find a photo that has that text in a 50x20 cluster of pixels somewhere far in the background...it's remarkably good.


This is absolutely brilliant.

I believe I will actually use it a lot if you keep the site up.

Minor feedback for the blog post: It deserves a better meta description (for link previews). The first paragraph doesn't advertise how good the article is going to be.


I tried a few memes. The results were quite poor, and vastly inferior to just using Google. In the case of text searches I had to scroll through dozens of results before finding the original meme images.


Out of curiosity, how does your image similarity search work? Are you also using some feature of Apple's Vision framework, or running some ML model on your Linode instance?


The image similarity search is probably a blog post of its own.

Short TL;DR: It runs off my home server running a large vector database (opendistro): https://opendistro.github.io/for-elasticsearch-docs/docs/knn...
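For illustration, an OpenDistro-style k-NN setup looks roughly like this: an index mapping with a knn_vector field, plus a nearest-neighbor query over image embeddings. The field name and dimension below are assumptions for the sketch, not the site's actual schema:

```python
# Index mapping: enable the k-NN plugin and declare a vector field.
index_mapping = {
    "settings": {"index": {"knn": True}},
    "mappings": {
        "properties": {
            "embedding": {"type": "knn_vector", "dimension": 512}
        }
    },
}

def knn_query(vector, k=10):
    """Build a query for the k nearest stored embeddings to `vector`."""
    return {
        "size": k,
        "query": {"knn": {"embedding": {"vector": vector, "k": k}}},
    }

# A similarity lookup for one (dummy) 512-dim image embedding:
q = knn_query([0.1] * 512)
```

Both dicts would be sent as JSON bodies to the cluster (PUT to create the index, POST to _search).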


Brilliant! :-)

Maybe a dumb question, but could you use your data to train a new OCR model so you wouldn't have to rely on iOS?

I don't know much about ML/AI so maybe not feasible.


"It looked like it was time to bite the bullet and write an OCR iOS server in Swift."

Quite a large bullet required, one with plenty of chewing left in it.


Heh, this finds a bunch of copies of a video I made. If you are going to cache them and repost them, you probably need to have a DMCA process.


You are a genius. Also, the search engine works so well.


Thank you for building this! It's so much fun. I looked for memes that I saw years ago and found them in seconds. Excellent work!!


Looks like you've solved the OCR problem, now to solve the duplication problem and use it as a ranking hint. :-)
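A cheap way to get that duplicate signal is a perceptual hash. Here's a minimal pure-Python difference-hash (dHash) sketch; it assumes images have already been shrunk to 9x8 grayscale grids (a real pipeline would do the resize with PIL or similar):

```python
def dhash(gray9x8):
    """64-bit difference hash: one bit per adjacent-pixel comparison
    across 8 rows of 9 pixels."""
    bits = 0
    for row in gray9x8:
        for left, right in zip(row, row[1:]):
            bits = (bits << 1) | (1 if left > right else 0)
    return bits

def hamming(a, b):
    """Number of differing bits; small distance => likely duplicate."""
    return bin(a ^ b).count("1")

# Tiny synthetic "image" plus a near-identical copy with one pixel nudged.
img = [[(r * 9 + c) % 17 for c in range(9)] for r in range(8)]
near = [row[:] for row in img]
near[0][0] += 1

exact = hamming(dhash(img), dhash(img))   # identical images
close = hamming(dhash(img), dhash(near))  # near-duplicate, few bits differ
```

Cluster anything within a few bits of Hamming distance, then show one representative per cluster (or down-rank the rest).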


Could you not extract the model and run it on a server? It's probably not as easy, but I know it has been done with NeuralHash.


Is this the person on /r/hardwareswap who's been looking for semi-functional, even IMEI banned, iPhones?


Nice hack. But doesn't Google/(any public search engine) image search do this for us already?


Could you expose your iPhone cluster as an OCR API? Seems like it would be competitive with the GCP API.
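The front of such an API could be as simple as a round-robin dispatcher cycling jobs across the phones (each running the author's Swift OCR server). The addresses and the POST endpoint here are made-up assumptions:

```python
from itertools import cycle

# Hypothetical phone addresses on the local network.
PHONES = ["10.0.0.11:8080", "10.0.0.12:8080", "10.0.0.13:8080"]
next_phone = cycle(PHONES).__next__

def ocr(image_bytes):
    """Pick the next phone in rotation; a real version would POST
    image_bytes to http://{host}/ocr and return the recognized text."""
    host = next_phone()
    return host

# Four requests wrap around the three-phone cluster:
order = [ocr(b"meme") for _ in range(4)]
```

A production version would also want health checks and retry-on-another-phone, since individual devices can wedge.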


I wonder if there might be something in the Apple TOS prohibiting this.


I think there is, right? At least assuming it's the same as Macs, where the minimum billing period is one day because you're technically renting a physical Mac (and one day is the minimum that qualifies as adhering to that).


I was more referring to bundling up an OS framework library feature and exposing it as an Internet API.


Would be great if the images had a unique name so I could save them without having to rename them.


Great project! With your DIY attitude, if you need to build your own infrastructure for web scraping, here is a tutorial for mobile proxy setup which might be helpful: https://scrapingfish.com/blog/byo-mobile-proxy-for-web-scrap...


This is the single bestest thing I have read in a long while. Absolute madness. Pure bliss.


This is utterly fantastic and you are to be commended for your Crazy Mad Scientist genius!


This makes me wonder what other cool things I can do with an old iPhone.


I wonder how it fares against deep-fried "E" or Opossum memes?


Where did the original set of meme images, gifs, and videos come from?


I love the iPhone cluster so much!


This is very clever. I wonder what other use cases could leverage this approach.


This is mad and I love it.


Really amazing! I love the solution and the project in general.


My #1 recommendation for anyone thinking about the convoluted OCR solution: use a cheap OCR API and save yourself months of time, hassle, and upkeep. Google's OCR API is a good place to start, but AWS has one too, and there are dozens of others out there.
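For reference, the shape of a Google Cloud Vision OCR call: build an images:annotate request with a base64-encoded image and a TEXT_DETECTION feature, then POST it to the API. The sketch below only builds the request body; the image bytes are a stand-in and you'd supply your own API key:

```python
import base64
import json

# Stand-in for real image bytes (just the PNG signature).
fake_png = b"\x89PNG\r\n\x1a\n"

def build_ocr_request(image_bytes):
    """Request body for POST https://vision.googleapis.com/v1/images:annotate"""
    return {
        "requests": [
            {
                "image": {"content": base64.b64encode(image_bytes).decode("ascii")},
                "features": [{"type": "TEXT_DETECTION"}],
            }
        ]
    }

payload = build_ocr_request(fake_png)
body = json.dumps(payload)  # what you'd actually POST
```

The response comes back with a textAnnotations list per image; batching multiple images per request helps keep per-call overhead down.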


Without this "convoluted OCR solution" it never would have been built. Mandatory would easily have had to spend hundreds of thousands of dollars to OCR his meme collection alone, even without scraping other meme sites.


On the contrary, this sort of creative thinking is what's needed instead of automatically reaching for that shiny Cloud Toy. It's easy to get a proof-of-concept working, sure, but at scale, you start torching through cash.

Many places keep adding cloud services to their stacks until one day someone in the C-suite notices the AWS bill.


The author calculated that one iPhone SE cost sub-$50, which is equivalent to about 27k Google OCR images.

The cloud API only makes sense at small scale.

It's unfortunate that there's no decent OCR library to self-host; it would be cheaper than cloud costs.


If you have to process tens of millions of photos, though, the cost gradually becomes prohibitive.



