Inside the seedy underworld of spammers and phishers (mailgun.com)
108 points by twakefield on May 17, 2013 | 49 comments



Spammer transcripts are priceless.

Another way to fight spammers would be to quietly shut off sending for their account while still providing simulated email data to their dashboard, reporting successful sends, opens, etc. That way, they would think they are still sending out spam, and it would take them a while to realize that they had been cut off, slowing the cycle of them doubling their efforts.


I think it's a great idea and we've been evaluating it.

However, to make it happen our system needs to have 0 false positives, and we are not there yet. If your system makes a mistake and you just disable the account, an angry customer will appear on the support chat within 5 minutes and the problem is solved. On the other hand, if you pretend to be sending, you can lose all of the customer's emails, and that would be a nightmare.


Well, if you have an accurate enough system, you can send the clear, five-sigma cases to this queue and ban the more borderline customers.
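
A minimal sketch of that routing idea, assuming a hypothetical spam score between 0 and 1. The thresholds and names below are made up for illustration and are not anything Mailgun has described:

```python
from enum import Enum


class Action(Enum):
    DELIVER = "deliver"
    BAN = "ban"      # visible suspension; a wrongly flagged customer will complain quickly
    GHOST = "ghost"  # silently stop sending while reporting simulated success


# Hypothetical thresholds, chosen only for illustration.
GHOST_THRESHOLD = 0.999
BAN_THRESHOLD = 0.9


def route_account(spam_score: float) -> Action:
    """Map a spam score in [0, 1] to an action.

    Only near-certain cases are ghosted, because a false positive there
    silently loses a real customer's mail. Borderline cases get a visible ban.
    """
    if not 0.0 <= spam_score <= 1.0:
        raise ValueError("spam_score must be between 0 and 1")
    if spam_score >= GHOST_THRESHOLD:
        return Action.GHOST
    if spam_score >= BAN_THRESHOLD:
        return Action.BAN
    return Action.DELIVER
```

The asymmetry is deliberate: the visible ban is the recoverable mistake, so it gets the wider band.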

And this is obviously something you can't discuss, but I wonder if you guys "seed" various sites with false emails as "markers". It would be a very cheap way to detect indiscriminate scrapers and bulk spammers.


That's basically what Craigslist does when it "ghosts" content. It tells you it published it, and you can even see it when you follow the direct link, but it doesn't appear in the listings. Eventually CL spammers started checking the actual listings.

I assume the same thing would happen with email; spammers would use a few of their own email addresses on the list to make sure the emails are actually being delivered. It would be a pain for a while for sure.


HackerNews does the same when it hellbans people. Let's just say it works for some kinds of spammers (the less professional ones).


Personally I think sending some email through at varying and random amounts per day (for example) would be more effective, the rationale being to reduce suspicion of being blocked.

The legal and ethical issues surrounding this strategy are something to consider, though.


Hellban for spammers! Only on HN :)


This is how you treat credit card fraudsters too: the ones who use your site to run a bunch of stolen cards, trying to find the active/good ones. You start telling them at random that a charge succeeded or failed, in what looks like a normal process.


This seems like trouble if they're still paying for it though.


Since when is a botnet a collection of free email accounts?

Since when has a spammer's return on investment been low?

Since when have spammers only used hijacked "legitimate" business domains instead of just using some wildcard email domain setup?

It's not enough that he posts his strategies online to make it easier for his adversaries to learn from, but this guy doesn't even sound like he grasps the fundamentals of what is supposed to be his profession?


Hey, "he" is here :-)

To be on the same page in this conversation: we are a programmatic email service for developers, not for end users. Customers can create their own virtual email server on our site and start sending in a couple of seconds. The concept is pretty similar to cloud servers.

> Since when is a botnet a collection of free email accounts?

In our terms, a botnet can be a mix of free and paid Mailgun accounts. Botnets can include anything from 2-3 to dozens or hundreds of accounts created at different times and on different billing plans.

> Since when has a spammer's return on investment been low?

We are talking about Mailgun service - the time they need to invest in building some solution on top of Mailgun that pays back is just not worth it. Actually, I'm surprised they even bother sending this type of spam through Mailgun. Let's say they were able to send 100K emails via us (which is pretty hard nowadays, btw); in the best case their click rates would be floating around some fraction of a percent.

http://www.sitepoint.com/spam-roi-profit-on-1-in-125m-respon...

So they won't get even a couple of clicks from that.

On the other hand, phishing attacks are very dangerous and this is our biggest threat - we've noticed that they get very high quality lists with 0 bounces, so it might be real bank users, and they build pages for every attack.

> Since when have spammers only used hijacked "legitimate" business domains instead of just using some wildcard email domain setup?

Wildcard MX records are about receiving. I'm talking about subdomains on free web hosting services (bulk subdomain creation), which is a serious threat.

> It's not enough that he posts his strategies online to make it easier for his adversaries to learn from, but this guy doesn't even sound like he grasps the fundamentals of what is supposed to be his profession?

Botnets and targeted phishing attacks are nothing new - they're common practice, so it's not as if I'm uncovering some unknown secret here.


> In our terms, a botnet can be a mix of free and paid Mailgun accounts. Botnets can include anything from 2-3 to dozens or hundreds of accounts created at different times and on different billing plans.

My criticism is I've never heard of a botnet referred to as a group of accounts. To me, a botnet is a group of host computers that run some sort of proxy server (tens of thousands of hosts). I've never given thought to what someone would call a group of email accounts aimed at exploiting a service, but to me botnet seems like it would be specific to a network of computers, sometimes compromised, sometimes not, running a type of proxy or automated software.

> We are talking about Mailgun service - the time they need to invest in building some solution on top of Mailgun that pays back is just not worth it.

The problem is when it comes to a service like yours, if it really is that hard to bulk mail, then the guys using your service aren't the guys getting a 1 in 125m response rate.

Anyone sending that kind of volume would assume their messages were going to a spam folder, and would send larger volumes to compensate for it.

Anyone going through the hoops you've set in place is doing it because your service reaches the inbox. This means they can send bulk email to higher quality lists, and their response rate will be significantly higher than 1 in 125m, more like 1% to 2%.

In this case, the people actually sending mail through your service probably don't even bother making those accounts themselves. They likely find people who specialize in circumventing your security measures, and pay a premium of $x to $xx per 1,000 accounts.

> On the other hand, phishing attacks are very dangerous and this is our biggest threat - we've noticed that they get very high quality lists with 0 bounces, so it might be real bank users, and they build pages for every attack.

This too, but don't forget about simply cracking passwords for the accounts. Simple math. Take the top 100 most used passwords, assume your users are just as naive as most of the internet, and you have x% users you can assume will be compromised at some point in the future.


> The problem is when it comes to a service like yours, if it really is that hard to bulk mail, then the guys using your service aren't the guys getting a 1 in 125m response rate.

So for old school spammers, even if they got lucky and got a 1% click rate, they'd get 1K clicks on their best day on our service. So I mostly consider them to be people looking for potential holes in the service.
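
As a rough back-of-the-envelope check of those numbers, here is a small sketch. The function name is made up; the 100K volume, the 1% click rate and the "1 in 125m" response rate are the figures quoted in this thread:

```python
def expected_clicks(emails_sent: int, click_rate: float) -> float:
    """Expected number of clicks (or responses) for a campaign of a given size."""
    if emails_sent < 0 or not 0.0 <= click_rate <= 1.0:
        raise ValueError("emails_sent must be >= 0 and click_rate within [0, 1]")
    return emails_sent * click_rate


# 100K emails at an optimistic 1% click rate.
optimistic = expected_clicks(100_000, 0.01)

# The same volume at the "1 in 125m" response rate quoted for untargeted spam.
untargeted = expected_clicks(100_000, 1 / 125_000_000)
```

At 1% that is on the order of a thousand clicks, while at the untargeted rate it is well under a single response.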

The people coming with stolen credit cards who want to steal more are the biggest threat, as they are the most harmful - they are after our IP and domain reputation, so they take their time and try pretty hard to break through our filters.

> This too, but don't forget about simply cracking passwords for the accounts. Simple math. Take the top 100 most used passwords, assume your users are just as naive as most of the internet, and you have x% users you can assume will be compromised at some point in the future.

Yep, and we watch every account in the system for changes in behavior, but that's happened only once or twice in the last 2 months - so it's not our biggest problem right now.


Unrelated to the content, but why do sites like this go out of their way to disable pinch zoom for mobile devices?


I had disabled it on one of my sites for a couple days because it was messing with rotation on ipad. If you set a min-scale of 1 (necessary to make anything responsive work properly) and then the user rotates their device, it's going to move to a different zoom level. Setting min scale 1 and max scale 1 keeps the zoom level consistent through rotation.

Turns out nobody other than me cares about the zoom changing on rotate, and everybody wants pinch to zoom.


IMO it's best to not optimize for mobile as it just makes it worse.


On android at least, you can tell your device to override it under accessibility settings for the stock browser.


"It’s a game of cat-and-mouse that we didn’t ask to play, but we can hunt when we need to."

As an ESP, isn't that pretty much the game you chose to play, both as the cat and the mouse?


Thought the same thing myself when I read that line. I was also a little disappointed by the lack of content. I understand not wanting to leak information, but I didn't get a lot out of it other than "we have an algorithm, it's great!"

I would be more curious to know how effective they have observed this to be, or maybe more about what they learned along the way. Profiles are always interesting: how many false positives, customer complaints (and support time), etc. Maybe a future post?


Hey, Mailgunner here

Yep, we were trying not to disclose too much information on how we catch them, however I agree that how we fight them deserves a separate post.

Some things to share:

* Naive approaches (hey, just plug in a spam filter) don't work in most cases, as spammers tune and create new content specifically for our service

* Feedback (complaints) from customers is a great signal, but at the point you start receiving the complaints it may be too late.

* Bounce-based metrics (invalid addresses) are a great signal.

* There's no silver bullet as we've found, you have to collect as many signals as you can

* Rule-based systems don't work as the rules change every day; you have to put some learning in place.

* Domain blacklists are also not very effective - as they use hijacked domains, or services providing free sub-domains to avoid blacklists.

* IP blacklists are not very effective either, as a lot of people now use cloud services that share the same NATed IP.

* A lot of customers don't really realize they are spammers - "Hey, we've paid money for this mailing list, it's all fair"
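
To illustrate how a couple of the signals above could be combined, here is a minimal sketch. Everything in it is an assumption for illustration: the field names, the weights and the cutoff are hypothetical, and the fixed weights only stand in for whatever a learned model would produce (the list above says hand-written rules alone don't hold up):

```python
from dataclasses import dataclass


@dataclass
class AccountStats:
    sent: int
    bounced: int     # hard bounces (invalid addresses)
    complaints: int  # spam complaints from recipients


# Hypothetical weights and cutoff, for illustration only.
BOUNCE_WEIGHT = 0.6
COMPLAINT_WEIGHT = 0.4
REVIEW_CUTOFF = 0.05


def risk_score(stats: AccountStats) -> float:
    """Weighted sum of bounce rate and complaint rate; 0.0 if nothing was sent."""
    if stats.sent == 0:
        return 0.0
    bounce_rate = stats.bounced / stats.sent
    complaint_rate = stats.complaints / stats.sent
    return BOUNCE_WEIGHT * bounce_rate + COMPLAINT_WEIGHT * complaint_rate


def needs_review(stats: AccountStats) -> bool:
    """Flag the account for closer inspection rather than blocking it outright."""
    return risk_score(stats) >= REVIEW_CUTOFF
```

Note that a phishing list with 0 bounces would pass this particular check, which is one reason to collect more signals than these two.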


Awesome, thanks for the reply.

It may not have been something which you wanted to do, but I think it is a really interesting problem, and I bet it has been rewarding for both business, and in a pure engineering sense.

In some ways, I think about what it must have been like to create a fake identity in a less connected age, and I wonder at how it will continue to evolve.

I recall some Doctorow novel in which spam and its increasing sophistication was almost an escalating arms race between our ability to distinguish authentic interactions versus those that were staged or generated / general sock puppetry.

I am curious about additional signals and information. I would presume that, in addition to fingerprinting and collecting as much information as possible about each of their implicit touch points, you found yourselves increasingly relying on more traditional manifestations of identity/reputation, etc.?

edit: Or I wonder about a discount for new sign ups with a one time facebook scan & score type mechanism :D

Thanks again for sharing more information, good food for thought!


Agreed, I'd rather spend time building some cool features for developers everyone can use instead of confronting scammers trying to steal someone else's credit cards, so it's definitely the fight we chose.

Talking about traditional ip, domain and complaints reputation - it helps a lot to identify and block ignorant senders using some questionable techniques for getting their recipient lists, but it's pretty useless for fighting phishers - you need to act immediately and automatically, and reputation takes time to aggregate.



Great, thanks for the link!


Really helpful information, thanks for including this.

I've been dealing with some non-email spam recently, and after reading this I count myself lucky -- most of the stuff I see is SEO related and they tend to come from distinct IP ranges and can be surfaced with some simple rules. I'm sure as time goes on, they will become more wily.


Yes, we should put some hard numbers out there in a subsequent post.


We've always wanted to build a programmatic email server for developers. To be fair, we were a bit naive when we started and had no idea what was ahead of us. We knew that sending was hard, but we had no idea how hard and messy it actually is.


First of all, you have a cool service and I don't really think any less of you guys for one line on a blog post.

That said, in my view, that's the entire point of your service. Every language has an SMTP library; the hard part - the thing I'm paying you for - is the constant cat and mouse.


I like how the spammers always explode in anger at the end.


I have to wonder how accurately you could tag spammers using sentence structure and spelling analysis.

Probably not well enough to use in an automated system, but perhaps well enough to be used merely to flag things for careful/further inspection.


They are also very relentless, trying again and again for several months


That's one of the most perplexing aspects for me; how much effort some people put into getting "a quick buck." If only they'd put that much energy into something productive, they'd be way ahead.


I was first struck with this thought in highschool; the time and thought and preparation applied to cheating at exams. Half that effort applied to just reading the cartoonish textbooks would give a pass.


This effect explains most of the vitriolic anti-google commenters on HN.


Personally I've seen more vitriol and misplaced anger from pro-Google commenters.


I take it the in-house 'Razor' mentioned in the article has nothing to do with Vipul's Razor (http://razor.sourceforge.net/).

It'd be great if mailgun did a blog post dedicated to what their Razor does and how they built it.


I'm surprised the customer support even had discussions with these people when they were so clearly violating the terms. I guess these are picked out of lots of similar discussions with well meaning, grammar and spelling deficient customers :)


We don't know ahead of time whom we are talking to when the chat window opens.

Once it's clear, we try not to offend anyone in case of some terrible mistake, and instead ask a couple of simple questions to quickly verify the identity or business. We've found that this is the only way to avoid confusion.


Yep, makes sense, also interesting that you confront them with the questionable details (bad domains, bad IP match) -- does this help bring things to a conclusion faster?


If they signed up and used a Mailgun subdomain, we just ask them to create a domain and set up DNS records; this usually ends the conversation.

However, in some cases it gets very tricky - they can control the domain, or can build a website with some non-working signup forms and services. We then proceed with an investigation, up to signing up for their service - usually the sign-up does not work :-)


Is Mailgun's data about spammers made available publicly, for example to Spamhaus?


AFAIK Spamhaus does not accept spam reports directly:

http://www.spamhaus.org/faq/section/Generic%20Questions#103

However I think it's generally a great idea, I'd also love to collaborate with anyone to fight spam.


I would be interested to collaborate, but I couldn't figure out how to get in touch with you via your profile. I tried your HN username @mailgun.net but that address bounced. What's the best way to reach you?


Great, let's chat! My email is sasha.klizhentas@mailgunhq.com


Why are they always foreign? y cant they type normally lol


Heh, downvoted. It is a legitimate question. It's rare to see well spoken spammers with proper grammar when interacting with them like this.


Travis from Mailgun Support here... There are several spammers who are located in the US and speak excellent English. Spelling and grammar aren't the best indicator. A spammer in the US will typically have a good understanding of American communication style, so they are good at the social engineering aspect. The problem for spammers in the US is that it's pretty difficult to fully conceal your identity. There is always something in common or amiss with the account details. For example, your billing address is based in the US, but you're connecting to infrastructure with IPs outside the US. Or better yet, you're chatting with me via an IP in the Netherlands.

I think the best example of a US spammer was this guy attempting to promote his blog. (Or trying to warm up his account before sending spam) Each post was a rip off of an article from About.com. All the post dates were adjusted to appear that the site was online for months. His spelling and grammar were excellent.

He had several flaws... The domain was just registered a couple days ago. The "Corporate HQ" address on the site was a post office in New York. The billing address was a UPS store in Nevada. He refused to talk to me on the phone or provide a physical address that I could send a t-shirt to. :-P
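
The kind of consistency checks described above could be sketched roughly as follows. This is only an illustration: the function, its parameters and the 30-day threshold are hypothetical, and how the countries and the registration date are looked up (GeoIP, WHOIS) is left out:

```python
from datetime import date
from typing import List


def account_red_flags(
    billing_country: str,
    ip_country: str,
    domain_registered: date,
    today: date,
    min_domain_age_days: int = 30,  # hypothetical threshold
) -> List[str]:
    """Return human-readable reasons to take a closer look at an account."""
    flags = []
    # Billing address in one country, connections coming from another.
    if billing_country.strip().upper() != ip_country.strip().upper():
        flags.append("billing/IP country mismatch")
    # Domain registered only days ago, regardless of what the post dates claim.
    if (today - domain_registered).days < min_domain_age_days:
        flags.append("recently registered domain")
    return flags
```

For example, a US billing address combined with a Netherlands IP and a two-day-old domain would raise both flags. Neither flag is proof on its own, so the result is best treated as a reason to ask questions, not as a verdict.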


Well said; thank you. I suppose it's a confirmation bias: the most immediately noticeable spam is that with poor grammar, thus it's most easily identified with spam.

I've been exposed to what bank fraud and phishing scams can be like, and the craft is really amazing.


Interesting



