I wonder if anyone has tried training an LLM on known spam and measured its performance? Ideally such an LLM would run locally on the mail server for maximum privacy.
Ignoring the e-mail content and throwing Naive Bayes at the headers alone is pretty much how we got amazing spam filters about 15 years ago. All, of course, using a millionth or less of the resources a large language model would use.
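
For anyone who hasn't seen it, here is a rough sketch of what header-only Naive Bayes can look like. It is not the code of any particular filter: the names `header_tokens` and `HeaderNaiveBayes`, the tokenizer regex, and the add-one smoothing are all my own assumptions, and a real filter would train on thousands of messages rather than two.

```python
import math
import re
from collections import Counter


def header_tokens(raw_headers):
    # Hypothetical tokenizer: lowercase word-like tokens from the header
    # block only. The message body is never looked at.
    return re.findall(r"[a-z0-9.@-]+", raw_headers.lower())


class HeaderNaiveBayes:
    """Toy multinomial Naive Bayes over header tokens (illustrative only)."""

    def __init__(self):
        self.token_counts = {"spam": Counter(), "ham": Counter()}
        self.message_counts = Counter()

    def train(self, raw_headers, label):
        # label must be "spam" or "ham"
        self.token_counts[label].update(header_tokens(raw_headers))
        self.message_counts[label] += 1

    def spam_log_odds(self, raw_headers):
        # Positive result: leans spam. Negative result: leans ham.
        # Assumes at least one message of each class has been trained.
        spam, ham = self.token_counts["spam"], self.token_counts["ham"]
        vocab = set(spam) | set(ham)
        spam_total = sum(spam.values())
        ham_total = sum(ham.values())
        score = math.log(self.message_counts["spam"] / self.message_counts["ham"])
        for token in header_tokens(raw_headers):
            if token not in vocab:
                continue  # tokens never seen in training carry no evidence
            # Add-one (Laplace) smoothing so a zero count never yields log(0).
            p_spam = (spam[token] + 1) / (spam_total + len(vocab))
            p_ham = (ham[token] + 1) / (ham_total + len(vocab))
            score += math.log(p_spam / p_ham)
        return score


if __name__ == "__main__":
    nb = HeaderNaiveBayes()
    nb.train("From: promo@cheap-pills.example\nSubject: WIN cash now\nX-Mailer: BulkSender", "spam")
    nb.train("From: alice@corp.example\nSubject: meeting notes\nX-Mailer: Thunderbird", "ham")
    print(nb.spam_log_odds("From: bob@x.example\nSubject: win cash\nX-Mailer: BulkSender"))
```

The whole model is two dictionaries of counts, and scoring a message is a handful of lookups and logarithms, which is where the resource gap with an LLM comes from.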