Looking at the Siberian husky site... stdLauncher.js is part of Verint ForeSee, one of those "would you like to take a survey about our website" solutions. The AAM analytics code right above the survey and urchin code lists as domain an IP associated with Sungard AS, an outfit that holds a number of federal contracts for IT services. This IP, 209.235.0.153, hosted the FBI website at some point in time. It's oddly easy to figure this out, even without something like a DomainTools subscription, because there are a lot of people scraping and archiving the FBI most wanted pages due to their cultural significance.
Some searching on code samples shows that the AAM section of analytics code is an exact match for analytics code served up by an older version of the FBI's most wanted website. Likely that it was also used on older versions of other FBI websites as well.
In the end I find it unlikely that this website has anything to do with the FBI, and more likely that the website owner copy-pastad a large section of source code and accidentally ended up with this result.
One bit of commonality I've noticed is that a lot of the websites with the FBI tracking code were built with FrontPage. I'm not sure if this is causal or coincidental, but one possible contributor is that FrontPage lets you open a webpage that you saved from IE and edit it... which might lead to some websites being complete duplicates of FBI websites, except for visible content, simply because websites like the FBI most wanted were relatively prominent parts of the early internet.
Edit: I spent a little time riding the WayBackMachine to some of the other webpages when they were apparently using FBI analytics code. The results are odd but they're so inconsistent that it's hard to think it was at all intentional. One interesting finding is that both ohthx.com and ppc-guy.com, at the time they supposedly had the FBI analytics ID, were apparently hosting an analytics package called Prosper202 that redirected the WayBackMachine crawler from the login page to fbi.gov. I have a suspicion that this was a partially-joking way to deter crawling of the admin interface of the software. The record that they used the FBI analytics code is presumably just an artifact of the crawler following the redirect. It seems that this exact Prosper202 behavior results in the majority of the old hits.
This technique was recently used by some redditors to uncover that the multi-state COVID reopen protest is being pushed by some guy who uses an antique shop in FL as a front for his shell LLCs.
They are the websites that are being used on the Facebook pages that are primarily pushing 'reopen' content, and the GA accounts on those pages link them to a bunch of pro-firearm shell corps as well.
Here's the thread. It got deleted since it was deemed as doxxing (a reddit no-no) even though Whois data is public:
Look at the updates on that post, nothing is so clear cut. The problem with internet sleuthing is that everyone gets very excited and innocent people can be injured in the unnecessary witch-hunt.
>Update, April 21, 6:40 a.m. ET: Mother Jones has published a compelling interview with Mr. Murphy, who says he registered thousands of dollars worth of “reopen” and “liberate” domains to keep them out of the hands of people trying to organize protests. KrebsOnSecurity has not been able to validate this report, but it’s a fascinating twist to this tale: How an ‘Old Hippie’ Got Accused of Astroturfing the Right-Wing Campaign to Reopen the Economy
Update, April 22, 1:52 p.m. ET: Mr. Murphy told Jacksonville.com he did not register reopenmn.com or reopenpa.com, contrary to data in the spreadsheet linked above. I looked up each of the records in that spreadsheet manually, but did have some help from another source in compiling and sorting the information. It is possible the registration data for those domains got transposed with reopenmd.com and reopenva.com, which included Mr. Murphy’s information prior to being redacted by the domain registrar.
Right, and this is exactly why reddit bans doxxing. The original reddit poster was correct that there was a single individual buying most of these domains; however, other than purchasing the domains, there was no evidence that the individual was using those domains to promote protests. Let's not forget reddit's Boston bombing debacle[1].
Worth noting that, since the Analytics ID is publicly visible, anyone can load Google Analytics on their own site using that ID. No FBI connection required.
This is called Analytics hijacking and it was once (still is) a common spam technique: create a site buy-my-stuff.net, load a bunch of hijacked analytics scripts there, and then the owners of those accounts will see “buy-my-stuff.net” in their analytics reports.
Edit: As commenter lmgk reminded me, you don’t even need to make a site, just use the API to make pageview calls.
You don't need to host a site. The data format to send data into Google Analytics is an open API (called the Measurement Protocol). You can just ping Google's servers directly with the appropriate payload, which includes crafted URL parameters.
In his book "Permanent Record" Edward Snowden[1] describes fake websites used by government agencies to disguise internet traffic that is actually used for spycraft.
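To make that concrete, here is a minimal sketch of what such a payload looks like. It assumes the legacy Universal Analytics Measurement Protocol (the `/collect` endpoint and the `v`, `tid`, `cid`, `t`, `dh`, `dp` parameters, as I remember them); the tracking ID, hostname and path are made-up placeholders, and the snippet only builds the request body rather than sending anything.

```python
from urllib.parse import urlencode, parse_qs
import uuid

# Legacy Universal Analytics collection endpoint (assumed; not called here).
COLLECT_URL = "https://www.google-analytics.com/collect"


def build_pageview_hit(tracking_id, hostname, path, client_id=None):
    """Return the URL-encoded body of a single pageview hit."""
    params = {
        "v": "1",                                # protocol version
        "tid": tracking_id,                      # tracking ID copied from any page source
        "cid": client_id or str(uuid.uuid4()),   # arbitrary anonymous client ID
        "t": "pageview",                         # hit type
        "dh": hostname,                          # hostname the account owner will see
        "dp": path,                              # page path the account owner will see
    }
    return urlencode(params)


# Placeholder values: no real account is referenced.
body = build_pageview_hit("UA-0000000-1", "buy-my-stuff.net", "/landing", client_id="555")
print("POST", COLLECT_URL)
print(body)
```

The point is that nothing in the payload proves where it came from: the hostname is just another parameter the sender fills in.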
eg: maybe a website about siberian huskies actually has a hidden login or hosts another service when contacted on port 80/443 in just the right way?
Now, that would make more sense for the CIA than the FBI, but I think it illustrates another avenue of interpretation
That doesn't make sense, why would they even let people know that there's a connection? The hidden login part may be true, but just not on sites that are related so obviously. It could be a smokescreen of some kind though.
In this example it's quite possible the FBI set up their traps to get a better understanding of which third parties are involved: who is visiting the site, and probably some admin management page behind it. Sort of like getting the contacts of a criminal and going from there.
Interesting. I've got his book on my reading list, but haven't gotten to it yet.
I just made a tenuous mental connection between this concept and a Reddit phenomenon called "Lake City quiet pills". I heard about it on the podcast "Stuff They Don't Want You To Know" and it held my interest for a few hours' worth of investigation.
The short version is that a Redditor died. He was a stereotypical grumpy old dude, and someone hopped on Reddit and posted that he'd passed. Someone got interested and tied that poster to some websites, one of which had a bunch of stuff hidden in the public source. It definitely seems like a clandestine group of some kind communicating, but who it was and to what end isn't clear. The Reddit conspiracist belief seems to be that it was a group of assassins-for-hire.
Google Analytics used to be called Urchin (they bought Urchin and made it Analytics). So all the urchin.js code is probably just really old Google Analytics tracking code.
It had a hybrid log/JS approach around the time Google acquired it, one of the first, I believe. It was the best product around. For a shared web hosting provider in the early/mid-2000s, offering it was becoming more than a competitive advantage.
The Google Analytics 'trick' (to identify all the sites someone owns) has been around for quite a while. All you have to do is use a code search engine like publicwww to search for the snippet of code or the analytics ID.
It's not just the Google Analytics ID or GTM Id, you can also use the Adsense pub-id or just about anything else that you might think sites have in common. When you start to also look at backlinks and IP neighborhoods, things can get interesting, as well.
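A rough sketch of the extraction step, assuming you already have the page source as a string. The regexes are my own approximations of the usual ID shapes (`UA-…-…`, `GTM-…`, `pub-…`), not official formats, and the sample HTML and IDs are invented.

```python
import re

# Approximate patterns for common shared identifiers (assumed shapes).
PATTERNS = {
    "ga": re.compile(r"\bUA-\d{4,10}-\d{1,4}\b"),      # Urchin / Universal Analytics ID
    "gtm": re.compile(r"\bGTM-[A-Z0-9]{4,8}\b"),       # Google Tag Manager container
    "adsense": re.compile(r"\bpub-\d{10,20}\b"),       # AdSense publisher ID
}


def extract_ids(html):
    """Return the unique identifiers of each kind found in a page's source."""
    return {name: sorted(set(p.findall(html))) for name, p in PATTERNS.items()}


# Invented sample page source.
sample = '''
<script>_uacct = "UA-1234567-1"; urchinTracker();</script>
<script>google_ad_client = "ca-pub-1234567890123456";</script>
'''
print(extract_ids(sample))
```

Run this over a set of pages and group the sites by shared ID, and you have the same trick a code search engine gives you, just on your own crawl.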
On a related note, I wonder if there are/were common patterns in the sting sites set up by Dept. of Homeland Security, such as U of Northern New Jersey [0] and U of Farmington [1]. Both of those were initiated during the Obama administration and featured fairly nice modern designs, similar in aesthetic to much of the Obama-era digital overhauls (though a quick skim shows that they don't share similar CSS naming semantics).
On brief review, one (UNNJ) is running WordPress while the other (Farmington) doesn't show any evidence of a dynamic CMS. That suggests to me totally separate provenance. My guess would be that two different contracts were awarded to two different companies to build the websites, which would both be consistent with common federal contracting behavior and a good idea from an OPSEC perspective since it would minimize any similarity in these "sting" websites.
A coworker sysadmin once told me that when he was inspecting the web server access logs (for an unrelated reason) he noticed that many requests to a resource on our website had a strange referer URL that was never present in requests to pages. He inspected that site and found that they were using our resource. We didn't really care about it, but that was really interesting.
The article says all three fbi.js files were on waybackmachine. I was only able to download urchin and the other ones are not there. Anyone have a mirror? Besides the author? pastebin or mega
All three are from commodity commercial software, finding other websites of the same period that used Urchin/GA and ForeSee should get you more or less the same files.
In the Wayback Machine archived version of triggerParams.js there is an OMB parameter of "1505-0186" if the client is section 508 compliant (US accessibility guidelines). A search of that OMB number turns up a Customer Satisfaction Measure of Government Websites survey from 2008/2009 (which makes sense if the archived js is from the FBI site). What isn’t clear is if the same version was used on all of the sites (some of the parameters are hard-coded) and how it got copied across to a mixture of hobbyist sites, plumbers, Most-Wanted pages etc. A quick peek at the page source of a random sampling of the sites in the Wayback Machine shows very little similarity between them (e.g. style of code, page layout etc.), which strongly suggests that it wasn’t people just ripping off the FBI page and wrangling it with a text editor. It is curious.
The big takeaway from this article for me is that I should probably look for or write a browser extension that tracks changes to analytics tools and IDs on sites. If a site is silently taken over, the state actor would either need to separately gain access to the analytics tool accounts, or would need to modify the IDs to connect to a new account. I'd love to see how often tracking IDs change on high-profile sites.
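The core of such an extension would just be a diff between two snapshots of the IDs seen on a site. A minimal sketch, with hypothetical snapshot lists and invented IDs:

```python
def diff_tracking_ids(previous, current):
    """Compare two snapshots of the tracking IDs observed on one site."""
    prev, curr = set(previous), set(current)
    return {
        "added": sorted(curr - prev),       # IDs that appeared since last visit
        "removed": sorted(prev - curr),     # IDs that disappeared
        "unchanged": sorted(prev & curr),   # IDs present in both snapshots
    }


# Hypothetical snapshots of the same site a week apart.
last_week = ["UA-1111111-1", "GTM-ABC123"]
today = ["UA-2222222-1", "GTM-ABC123"]

report = diff_tracking_ids(last_week, today)
if report["added"] or report["removed"]:
    print("tracking IDs changed:", report)
```

A changed ID is only a prompt to look closer; sites swap analytics accounts for plenty of boring reasons too.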
Google Analytics IDs are tied to the account that created them.
Presumably the FBI doesn't all share just one massive "fbi@gmail.com" email address.
Even if a bunch of FBI employees decided foolishly to use google analytics on their honeypot sites, one would expect them to all separately sign up using different google accounts - either using their real email addresses, or hopefully throwaway ones.
I think you're confusing Google accounts (email addresses) with Google Analytics accounts (tracking ID prefixes). A single user can create dozens of GA accounts.
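For illustration, a small sketch of that split, assuming the classic `UA-<account>-<property>` layout where the middle number identifies the GA account and the last number the property within it. The IDs are invented.

```python
from collections import defaultdict


def group_by_account(tracking_ids):
    """Group classic UA-<account>-<property> IDs by their account number."""
    groups = defaultdict(list)
    for tid in tracking_ids:
        parts = tid.split("-")
        if len(parts) != 3 or parts[0] != "UA":
            continue  # skip anything that is not a classic UA-style ID
        groups[parts[1]].append(tid)
    return dict(groups)


# Invented IDs: two properties under one GA account, one under another.
ids = ["UA-1234567-1", "UA-1234567-2", "UA-7654321-1", "GTM-ABC123"]
print(group_by_account(ids))
```

Sites sharing the middle number sit under the same GA account, which says nothing about how many Google logins can manage it.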
Sure, this isn't a comprehensive strategy, but you'd be amazed at how far behind some of those agencies are in terms of day-to-day operations for investigations.
A relative of mine works at FBI and several years back he told me a story about how an investigation into an organized crime syndicate was blown up because an agent on the case was dumb enough to check out the target's LinkedIn profile while he was logged into his own real account. So the target got a notification that Joe Blow from the FBI had just viewed his profile. Over a year of work down the drain with a single GET request, crazy.
My issue is the confidence with which the author presupposes that the existence of this code on sites indicates seizure or utilization in an investigation. It is a lazy position that leaves others (i.e. HN readers in this thread) with a little more intellectual horsepower to evaluate the other - and frankly more realistic - alternatives.