Where are you seeing that this paper was accepted to a peer-reviewed journal or conference? As far as I can tell, it's posted on arXiv (a preprint archive) and is therefore a pre-publication draft. arXiv doesn't really review these papers beyond checking categorization and topical relevance. Papers are typically posted to arXiv for comment, to establish priority, to avoid getting scooped, or just to share (potentially early) findings in a fast-paced field like ML...
Give the authors constructive feedback and they can update the paper.
Some significant portion of users with Chrome installed on iOS are using it only to access the Google/Chrome password manager (synced with their Google account) from other apps like Safari, without ever using it as a browser on their phones… When Chrome is installed, iOS will suggest passwords from it as well as from the built-in iCloud password manager.
Canny edge detection is optimal in some settings. Often, though, edge detection is in service of another goal, such as segmentation. Nowadays, computers are big/fast enough (GPUs) that building models that learn edge detection implicitly and do segmentation or the end task directly is feasible.
Still fun to learn about Hough transforms, canny edge detection, and all that 1990s computer vision though!
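For concreteness, the "skip explicit edge detection and segment directly" path looks roughly like the sketch below, using a pretrained torchvision DeepLabV3 with its generic weights (the file name and model choice are just placeholders, not a recommendation):

    import torch
    from PIL import Image
    from torchvision import transforms
    from torchvision.models.segmentation import deeplabv3_resnet50

    # Pretrained segmentation model; a real pipeline would fine-tune it on task-specific labels.
    model = deeplabv3_resnet50(weights="DEFAULT").eval()

    preprocess = transforms.Compose([
        transforms.ToTensor(),
        transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
    ])

    img = Image.open("document.jpg").convert("RGB")  # placeholder input
    with torch.no_grad():
        logits = model(preprocess(img).unsqueeze(0))["out"]  # (1, num_classes, H, W)
    mask = logits.argmax(dim=1)  # per-pixel classes, with no explicit edge-detection step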
> Nowadays, computers are big/fast enough (GPUs) that building models that learn edge detection implicitly and do segmentation or the end task directly is feasible.
Wouldn't reusing Meta's SAM (Segment Anything Model) be sufficient here? (It's freely available and, I think, free to use?) Or do you think you need to build your own model specifically for detecting bills/sheets of paper?
Yeah, where licensing allows it, reusing existing model weights (possibly continuing to train on your specific task) is reasonable. I was just pointing out that these methods aren’t SOTA anymore.
To be fair, I have no idea how I could use modern techniques instead of the Hough Transform. One use case is recognizing a car's speed and RPM dials. Using the Hough Transform, it's trivial to reliably detect the needle's slope with a high degree of accuracy.
Have vision models started offering better alternatives for such use cases? It's a genuine question; it's been a while since I last looked.
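For context, the Hough-based version I have in mind is roughly the sketch below (OpenCV + NumPy; the file name and thresholds are placeholders, and mapping the angle to a speed/RPM value is a fixed per-dial calibration):

    import math

    import cv2
    import numpy as np

    dial = cv2.imread("speedometer.png", cv2.IMREAD_GRAYSCALE)  # placeholder image
    edges = cv2.Canny(dial, 50, 150)                            # classic edge map
    # Probabilistic Hough transform: recover straight line segments from the edges.
    lines = cv2.HoughLinesP(edges, rho=1, theta=np.pi / 180, threshold=60,
                            minLineLength=40, maxLineGap=5)

    if lines is not None:
        # Take the longest detected segment as the needle and read its slope/angle.
        x1, y1, x2, y2 = max(lines[:, 0],
                             key=lambda l: np.hypot(l[2] - l[0], l[3] - l[1]))
        angle_deg = math.degrees(math.atan2(y2 - y1, x2 - x1))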
You can pretty much solve this using modern DL models. There are options depending on how accurate you want your model and how much compute you have.
There is an entire spectrum of models, from something like Mask R-CNN or the U-Net family up to something like Meta's SAM, which you can use without even training.
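For example, using SAM without any training looks roughly like this (based on Meta's segment-anything repo; the checkpoint path, image, and click point are placeholders):

    import cv2
    import numpy as np
    from segment_anything import SamPredictor, sam_model_registry

    # Load a downloaded SAM checkpoint (here the ViT-B variant) and wrap it in a predictor.
    sam = sam_model_registry["vit_b"](checkpoint="sam_vit_b_01ec64.pth")
    predictor = SamPredictor(sam)

    image = cv2.cvtColor(cv2.imread("dial.jpg"), cv2.COLOR_BGR2RGB)
    predictor.set_image(image)

    # One foreground click roughly on the object of interest; SAM proposes candidate masks.
    masks, scores, _ = predictor.predict(
        point_coords=np.array([[320, 240]]),
        point_labels=np.array([1]),
        multimask_output=True,
    )
    best_mask = masks[np.argmax(scores)]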
In cases like this I'd probably use the Hough-based algo as ground truth to see if you can indeed fine-tune a DNN on that regression task. If it learns the task with reasonable accuracy, then you have a baseline that could be improved in multiple ways to surpass the original.
That said, there are not that many shapes of speedometers and wheels, and the viewpoint is likely controlled, so your old-school method is probably the better way ;)
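To sketch the fine-tuning idea (pretrained backbone, single-scalar regression head, labels coming from the Hough pipeline; the batch below is just a stand-in):

    import torch
    import torch.nn as nn
    from torchvision.models import resnet18

    # Pretrained backbone with its classifier swapped for a single-output regression head.
    model = resnet18(weights="DEFAULT")
    model.fc = nn.Linear(model.fc.in_features, 1)  # one scalar: needle angle (or speed)

    optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
    loss_fn = nn.MSELoss()

    # Stand-in batch; in practice: dial crops paired with angles read by the Hough method.
    images = torch.randn(8, 3, 224, 224)
    hough_angles = torch.rand(8) * 270.0  # degrees, as produced by the classical algo

    preds = model(images).squeeze(1)
    loss = loss_fn(preds, hough_angles)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()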
For the purpose of learning, would you recommend some tutorials, articles or videos that help achieve that? Accuracy aside, this would make a great learning experience!
Is it better to look in the PyTorch community, or is that where some TensorFlow approaches shine? (CUDA is ok)
PyTorch is much nicer to play with in my opinion. Maybe start with their official tutorial; I've also heard good things about Karpathy's YouTube channel from beginners.
What if these filters are explicitly used as preprocessing steps when training a segmentation model? Would that at least save some epochs, if not increase accuracy?
I suspect it could be similar to learned vs. predefined positional embeddings in GPTs. That is, the learned version is a "warped and distorted" version of the exact predefined pattern, and yet somehow it performs a bit better, and no one knows exactly why.
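Concretely, I'm imagining something like feeding the Canny map to the network as an extra input channel (just a sketch with OpenCV/PyTorch; whether it actually saves epochs is exactly the open question):

    import cv2
    import numpy as np
    import torch

    bgr = cv2.imread("sample.jpg")  # placeholder image
    gray = cv2.cvtColor(bgr, cv2.COLOR_BGR2GRAY)
    edges = cv2.Canny(gray, 50, 150).astype(np.float32) / 255.0

    rgb = cv2.cvtColor(bgr, cv2.COLOR_BGR2RGB).astype(np.float32) / 255.0
    x = np.concatenate([rgb, edges[..., None]], axis=-1)   # H x W x 4: RGB + edge channel
    x = torch.from_numpy(x).permute(2, 0, 1).unsqueeze(0)  # 1 x 4 x H x W

    # The segmentation net's first conv then takes 4 input channels instead of 3, e.g.:
    first_conv = torch.nn.Conv2d(4, 64, kernel_size=7, stride=2, padding=3, bias=False)
    features = first_conv(x)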
> Still fun to learn about Hough transforms, canny edge detection, and all that 1990s computer vision though!
In my domain of metrology, this part isn't acceptable:
> you will get a model that can predict the exact edges of every input image.
Prediction is a nice way to limit the space I have to consider for proper edge detection, but usually it's better to just go straight to something more deterministic.
The accuracy is probably quite bad when you look at individual manipulations, but the overall result is jaw-dropping, even if it's only a demonstration.
Would you not just be able to do the CTE as you would a correlated subquery? Something like:
    WITH batch AS (
        SELECT id
        FROM user_profile
        WHERE followers_count IS NULL
        LIMIT 10000
    )
    UPDATE user_profile
    SET followers_count = 0
    FROM batch
    WHERE user_profile.id = batch.id;
But with the difference that, if you didn't want to round-trip to the application for each batch, you could now make this a recursive CTE?
There are also many Kubernetes-based options out there. For the specific use case you described, you might even consider a plain old Makefile plus incrond, if you expect these all to run on a single host and be triggered by a new file showing up in a directory…
I like Airflow because you can give operators access to the web UI and they can kick off, run, or stop tasks or graphs of tasks. Both Airflow and Luigi expect you to express your workflow as a DAG in Python code.
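For reference, a minimal Airflow DAG looks something like the sketch below (task names and commands are made up; recent Airflow 2.x API assumed):

    from datetime import datetime

    from airflow import DAG
    from airflow.operators.bash import BashOperator

    # Three trivial tasks chained into a graph; each shows up in the web UI and can be
    # triggered, retried, or stopped there by an operator.
    with DAG(
        dag_id="ingest_new_files",
        start_date=datetime(2024, 1, 1),
        schedule=None,  # run on manual trigger (or wire it to a sensor/external event)
        catchup=False,
    ) as dag:
        detect = BashOperator(task_id="detect", bash_command="echo 'new file arrived'")
        process = BashOperator(task_id="process", bash_command="echo 'processing'")
        publish = BashOperator(task_id="publish", bash_command="echo 'publishing'")

        detect >> process >> publish  # the DAG's edges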
Web browsers are mostly free and don't try to upsell you to a paid Pro version. The MacWhisper author deserves to be compensated for their work, so I'm not objecting to the existence of a paid version. Still, this feels to me like yet another relatively low-value freemium/upsell wrapper in the Mac shareware ecosystem.
I'm probably wrong and there's a real population that benefits from this work; clearly some folks perceive it as useful enough to pay for, and I'm just not in that audience to see it.
I think part of what rubs me the wrong way about this is that it feels like commercial freeloading, given how thin the commercial wrapper around the free/open core (the Whisper model + code) is in this case. It feels ethically questionable unless the author contributes some portion of the proceeds back to the research in some way, and I didn't see evidence of that. I'm probably being naive here; happy to have a less snarky discussion about it, though.