
I wonder if anyone has tried training an LLM on known spam and measured its performance? Such an LLM would ideally be run locally on the mail server for maximum privacy.



I don’t know why that would be necessary. The vast majority of the spam I get is obviously spam from the subject line alone.


Ignoring e-mail content and throwing naive Bayes at the headers alone is pretty much how we got amazing spam filters about 15 years ago. All, of course, using a millionth or less of the resources a large language model would use.
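
For anyone who hasn't seen it, here is roughly what that looks like: a minimal sketch of a naive Bayes filter that only ever sees the subject line. The class name, the tokenizer, and the four training subjects are all made up for illustration, and real filters train on far more mail and more header fields than this.

```python
import math
import re
from collections import Counter


def tokenize(subject):
    """Lowercase a subject line and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", subject.lower())


class NaiveBayesSubjectFilter:
    """Toy multinomial naive Bayes over subject-line tokens (hypothetical)."""

    def __init__(self):
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.doc_counts = {"spam": 0, "ham": 0}

    def train(self, subject, label):
        self.word_counts[label].update(tokenize(subject))
        self.doc_counts[label] += 1

    def _log_score(self, tokens, label):
        # Assumes at least one training example exists for each label.
        vocab = set(self.word_counts["spam"]) | set(self.word_counts["ham"])
        total_words = sum(self.word_counts[label].values())
        total_docs = sum(self.doc_counts.values())
        score = math.log(self.doc_counts[label] / total_docs)
        for tok in tokens:
            # Add-one (Laplace) smoothing so unseen words do not zero the score.
            count = self.word_counts[label][tok]
            score += math.log((count + 1) / (total_words + len(vocab)))
        return score

    def is_spam(self, subject):
        tokens = tokenize(subject)
        return self._log_score(tokens, "spam") > self._log_score(tokens, "ham")


# Made-up training data, just enough to exercise the code.
filt = NaiveBayesSubjectFilter()
filt.train("Cheap pills, buy now!!!", "spam")
filt.train("You won a free prize, claim now", "spam")
filt.train("Meeting notes for Tuesday", "ham")
filt.train("Re: patch review for the parser", "ham")

print(filt.is_spam("Claim your free pills now"))
print(filt.is_spam("Notes for the patch review"))
```

The whole model is two word-count tables and a handful of logarithms per message, which is where the resource gap relative to an LLM comes from.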


Sir! The willies you just gave me have no compare.

What if said AI gains sentience, but trained on that data?!


rspamd has had an option for it for a while, but the older Markov-chain-based filters tend to work well enough.



