The Mystery of the Encrypted Gauss Payload

jgrahamc · on Aug 14, 2012

The core of this is 'find X' such that

  md5(md5(...10,000 times...(md5(X + salt)...)) = hash

where salt and hash are known. X is derived from the names of programs existing on a Windows machine with a particular format.
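
A minimal Python sketch of that check, under assumptions the thread does not settle: X is encoded as UTF-16LE, the salt is simply appended, and each of the 10,000 rounds re-hashes only the previous digest. The names `iterated_md5` and `matches` are made up for illustration.

```python
import hashlib

def iterated_md5(x: bytes, salt: bytes, rounds: int = 10_000) -> bytes:
    """Apply md5 `rounds` more times on top of the initial md5(x + salt)."""
    digest = hashlib.md5(x + salt).digest()
    for _ in range(rounds):
        digest = hashlib.md5(digest).digest()
    return digest

def matches(candidate: str, salt: bytes, target: bytes) -> bool:
    # Assumption: names are hashed as UTF-16LE bytes.
    return iterated_md5(candidate.encode("utf-16-le"), salt) == target
```

Checking one candidate is cheap; the difficulty is entirely in how many candidates for X there are.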

Or, find a way to calculate

  md5(md5(...10,000 times...(md5(X + salt')...))

given that hash and salt' are known but X is not.

Or alternatively, attempt a known-plaintext attack against RC4. Given that a certain amount of plain text is known (4 bytes) at the start of the RC4 payload, it's likely that the first few bytes of the keystream are known, and an attack could be mounted via weaknesses in the RC4 key schedule.

raverbashing · on Aug 14, 2012

Well, wasn't MD5 broken?

It should be possible to do a brute force search using a couple of days of EC2 (or insert your favorite cloud provider here). And by brute force you can try a text search, or just go for the raw bytes. Not sure a collision can work in this case as well.

jgrahamc · on Aug 14, 2012

To recover X + salt you'd be looking at a preimage attack of MD5. I am only aware of one preimage attack against MD5 and it's only theoretical.

The input to the RC4 key generator is an MD5 hash, which means you'd be looking at a brute force attack against a 128-bit input, i.e. 2^128 possible values. Even assuming you find the answer after 2^127 tries on average, you are looking at an enormous search space.

According to a recent article EC2 has about 500,000 machines. Now assume that I buy them all and I am able on each machine to check 1,000,000,000 values as inputs to RC4 per second then I should have the answer in 800,000 times the age of the universe. But I think my credit card will have been cancelled first.
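
For anyone who wants to check that arithmetic, here is a rough sketch. The machine count and per-machine rate are the figures from the comment above; the age of the universe (about 13.8 billion years) is the commonly cited estimate and is my assumption, as is the helper name `universe_ages`.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600
AGE_OF_UNIVERSE_YEARS = 13.8e9  # commonly cited estimate (assumption)

def universe_ages(expected_trials: int, machines: int, rate_per_machine: float) -> float:
    """Expected search time, expressed in multiples of the age of the universe."""
    seconds = expected_trials / (machines * rate_per_machine)
    return seconds / SECONDS_PER_YEAR / AGE_OF_UNIVERSE_YEARS

# 2^127 expected trials, 500,000 machines, 1e9 RC4 inputs per second each.
ages = universe_ages(2**127, 500_000, 1e9)
print(f"{ages:,.0f} ages of the universe")
```

With these inputs the result lands a little under 800,000, so the figure above holds up as a round number.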

mtkd · on Aug 14, 2012

What about sourcing all known \Program Files\ paths which meet the criteria out of search engine indexes, and brute forcing over those?

Microsoft exception reporting must have a list of all apps ever seen too?

raverbashing · on Aug 14, 2012

There is one aspect there that may or may not be important

I don't know if the "Program files" path (or the full path) is added to the hash calculation, but at least in Windows XP this is localized

Or who knows, the secret is that it only works in systems where Program files is in D:/

raverbashing · on Aug 14, 2012

I'd try to bruteforce X (to match the hash), not RC4 at first (though it may be easier)

PBKDF2 is SHA-1 and 4096 rounds, this shouldn't be impossible

Bonus points if you use FPGAs to calculate MD5s

jgrahamc · on Aug 14, 2012

The question is how large that search space is. If you can get a reliable list of directory names and file names then it might be small, but if you are left iterating characters in filenames (and this appears to be Unicode) then I'd imagine you'd run into the same situation.

I'd be much more tempted to look at the fact that the first four bytes of the RC4 key stream appear to be recoverable and look at key recovery from that.
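
Recovering those first keystream bytes is just an XOR of the ciphertext with the known plain text. The sketch below uses a textbook RC4 with a made-up key and message; the 4-byte prefix is a placeholder, not the real start of the Gauss payload.

```python
def rc4_keystream(key: bytes, n: int) -> bytes:
    """Textbook RC4: key schedule (KSA) followed by n bytes of output (PRGA)."""
    s = list(range(256))
    j = 0
    for i in range(256):
        j = (j + s[i] + key[i % len(key)]) % 256
        s[i], s[j] = s[j], s[i]
    i = j = 0
    out = bytearray()
    for _ in range(n):
        i = (i + 1) % 256
        j = (j + s[i]) % 256
        s[i], s[j] = s[j], s[i]
        out.append(s[(s[i] + s[j]) % 256])
    return bytes(out)

def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

key = b"Key"            # made-up key, for illustration only
known_prefix = b"Plai"  # placeholder for the 4 known plaintext bytes
plaintext = known_prefix + b"ntext"
ciphertext = xor_bytes(plaintext, rc4_keystream(key, len(plaintext)))

# The attacker sees only the ciphertext and knows the 4-byte prefix.
recovered = xor_bytes(ciphertext[:4], known_prefix)
print(recovered.hex())
```

Four keystream bytes do not hand you the key. They give you a fast way to reject wrong keys and some material for attacks on the key schedule.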

tedunangst · on Aug 14, 2012

PBKDF2 uses a hash function which need not be SHA-1 and applies it a variable number of rounds with a recommended minimum of 1000.
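
To illustrate, Python's `hashlib.pbkdf2_hmac` takes both the hash name and the round count as arguments. The password and salt below are placeholders.

```python
import hashlib

password, salt = b"password", b"salt"  # placeholder inputs

# Same construction, different hash function and round count.
k_sha1 = hashlib.pbkdf2_hmac("sha1", password, salt, 4096)
k_sha256 = hashlib.pbkdf2_hmac("sha256", password, salt, 1000)

# Default output length is the digest size of the chosen hash.
print(len(k_sha1), len(k_sha256))  # 20 32
```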

praptak · on Aug 14, 2012

Even with a reverse-md5^10000 oracle, you'd only get some bits that hash to the same hash as the mysterious pair of strings. Unfortunately the decryption key is derived from the pair of strings themselves, not from their hash. Reverting md5 is not enough to retrieve the decryption key.

ceautery · on Aug 14, 2012

The article mentions "~" as a possible starting point, but "{" is also greater than 7A, which would match all the "InstallShield Installation Information" subfolders.
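
A quick sketch of that filter, assuming the test is simply "first character's code point is greater than 0x7A" (the helper name and the sample folder names are mine):

```python
THRESHOLD = 0x7A  # code point of 'z'

def is_candidate(name: str) -> bool:
    """True if the name's first character has a code point above 0x7A."""
    return bool(name) and ord(name[0]) > THRESHOLD

names = ["Adobe", "zlib", "~tmp", "{931373E2-3DA4-4631-930C-F59510630DA3}"]
print([n for n in names if is_candidate(n)])
```

Only the "~" and "{" entries pass: "z" is exactly 0x7A, so it is excluded, while "{" (0x7B) and "~" (0x7E) sit just above it.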

eli · on Aug 14, 2012

Great point... are those uniquely named based on the application installed? That might be a nice, oblique way of checking if a particular program is installed.

lt · on Aug 14, 2012

Yes, these are GUIDs in the following format:

{931373E2-3DA4-4631-930C-F59510630DA3}

It seems to me that's a good theory of what it might be looking for, as GUIDs should make good triggers. I wonder if this reduces the search space enough to make brute force feasible now.

Scaevolus · on Aug 14, 2012

128 bit GUIDs give pairs of 256 bits -- too large to mount an efficient brute force.
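
The arithmetic, as a rough sketch using the example GUID from the parent comment. It treats every bit of a GUID as free, which is an upper bound, since real GUIDs carry some fixed version bits.

```python
import uuid

guid = uuid.UUID("{931373E2-3DA4-4631-930C-F59510630DA3}")
bits_per_guid = len(guid.bytes) * 8  # 16 bytes -> 128 bits
pair_bits = 2 * bits_per_guid        # a pair of GUIDs -> 256 bits

print(pair_bits, f"{2**pair_bits:.2e}")
```

That is on the order of 10^77 pairs, far beyond even the 2^127 discussed elsewhere in the thread.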

sp332 · on Aug 15, 2012

But checking all known GUIDs might be more feasible.
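
Something like the following dictionary search, as a sketch only. How the pair is joined, the UTF-16LE encoding, and where the salt goes are all assumptions, and `search_pairs` is a made-up name.

```python
import hashlib
from itertools import product
from typing import Iterable, Optional, Tuple

def iterated_md5(data: bytes, rounds: int = 10_000) -> bytes:
    """md5 of the data, then `rounds` further md5 applications."""
    digest = hashlib.md5(data).digest()
    for _ in range(rounds):
        digest = hashlib.md5(digest).digest()
    return digest

def search_pairs(firsts: Iterable[str], seconds: Iterable[str],
                 salt: bytes, target: bytes) -> Optional[Tuple[str, str]]:
    """Try every (first, second) pair from the known lists against the target hash."""
    for a, b in product(firsts, seconds):
        # Assumption: the pair is concatenated, UTF-16LE encoded, then salted.
        data = (a + b).encode("utf-16-le") + salt
        if iterated_md5(data) == target:
            return a, b
    return None
```

The cost is the product of the two list sizes times 10,001 MD5 calls, so it only works if both lists are short and actually contain the right entries.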

eli · on Aug 15, 2012

Well, yeah, but where do you get a list of known GUIDs for InstallShield? Might as well just gather a list of all known Program Files directories.

Zarathust · on Aug 14, 2012

To bruteforce like this, wouldn't you need every possible application installed on the computer?

If the payload is a zero day for an obscure Iranian made piece of software, no one will ever get that

orenmazor · on Aug 14, 2012

I just imagined the author reading that post and smirking to themselves.

tjic · on Aug 14, 2012

My thought exactly - how weird it must be to be inside looking out.

Public key crypto was discovered on the inside long before it was rediscovered on the outside, and I figure that the insiders must have been amused by Diffie, Hellman, R, S, and A.

--------

http://en.wikipedia.org/wiki/Public-key_cryptography#History

In 1997, it was publicly disclosed that asymmetric key algorithms were developed by James H. Ellis, Clifford Cocks, and Malcolm Williamson at the Government Communications Headquarters (GCHQ) in the UK in 1973.[4] These researchers independently developed Diffie–Hellman key exchange, and a special case of RSA. The GCHQ cryptographers referred to the technique as "non-secret encryption". This work was named an IEEE Milestone in 2010.[5]

mkup · on Aug 14, 2012

It looks like it would be easier to brute force a fixed RC4 key than 10000 iterations of MD5, especially given the known weaknesses of the RC4 key schedule.

Name of target software in Program Files is interesting nevertheless. Probably it's mentioned in the encrypted code/data.

mjs · on Aug 14, 2012

So whilst the malware will infect machines more or less indiscriminately, the payload itself can only be successfully decrypted (and therefore activated and executed), on machines that have a specific set of programs installed?

dkokelley · on Aug 14, 2012

I think it's actually just one specific program, whose name starts with a special character or high Unicode character.

kayge · on Aug 14, 2012

"the attackers are looking for a very specific program with the name written in an extended character set, such as Arabic or Hebrew, or one that starts with a special symbol such as “~”."

I suppose µTorrent is too obvious... Anyway, these kinds of mysteries help re-ignite my interest in Cryptography. I'd love to hear feedback from a fellow HNer about the course from Udacity (perhaps via email since it will probably be considered off-topic here).

Achshar · on Aug 14, 2012

I thought about uTorrent too, but Mu has a hex of 0x03BC. Plus it is popular software, and in Windows its folder in Program Files uses 'u' instead of Mu.

http://www.fileformat.info/info/unicode/char/3bc/index.htm

fryguy · on Aug 14, 2012

So does this mean that if you're a high profile target, you should immediately add a random folder to all of your computers in the program files directory?

ragmondo · on Aug 14, 2012

No... it means if you are running a specific program which unlocks the code you are going to have a bad time (I suppose you could rename all of your program directories though... would that defeat this?)

sounds · on Aug 14, 2012

The full implications of this code are that the attacker already has another channel to access your machine.

It's not much consolation that you now know that you're being targeted by the Program Files entries (they're a major pain to rename). It's likely there are one or more plants inside your operation and they have physical access to the machine, which is considered game over.

lifeisstillgood · on Aug 14, 2012

oh, this is the modern version of a microdot

Release Gauss into the wild, have your agent in the Fordow nuclear plant be sure he has Gauss on his machine, and then just get him to name the jpgs or text files he wants sent back to the CIA as 'special.jpg' - Gauss nabs them, sends them back through the network of Gauss-infected machines, and hey presto - deniable, encrypted, distributed Dead Drops.

Wow. Clever. Thank you

pruman · on Aug 14, 2012

Clever. The font makes it possible for the agent to verify he is on a Gauss machine by visiting seemingly innocuous websites which have code to detect whether the font exists, and then inform him by outputting special text only he knows about. He could receive messages that way too. Once he knows it's a Gauss machine, he can drop his specially named files and they are delivered.

kuriraisu · on Aug 14, 2012

Is the idea that gauss would act like a secret file katamari, rolling around collecting data while it spreads, and being harvested when it "infects" a creator controlled machine? It would seem like any direct data transmission would be detectable and investigated with extreme prejudice.

lifeisstillgood · on Aug 14, 2012

I am only speculating but we know a few things

1. It's part of a wider ecosystem of collecting / infecting / attacking "framework". It seems that attacking uranium enrichment was just a "plug-in".

2. They have designed for multiple infection vectors. Now if it can get in it can also get out. I would not be surprised if the family of malware here is also able to hook into outlook.exe, and even piggyback on IE connections. There is no particular reason why a payload cannot be steganographically put into every photo uploaded to Iran's Facebook. Which may not be entirely secure of course :-)

The possibilities when you have the money and time are incredible.

So, no, something as silly as transmitting over UDP from the agent's laptop back to www.cia.gov is unlikely, but these things will just keep pushing data around and around till it gets either home, or to a target.

Sadly, much of the code is out in the open. And it is surely being pulled apart by other nation-states and the mafia.

Fun times ahead

danielweber · on Aug 14, 2012

Getting a certain filename onto your computer doesn't sound like a hard problem. Just send them a mail with an attachment of "398rgf90rej243rf.htm" that their email client helpfully extracts for them, or have a file with that name in their web cache when they browse the internet.

eli · on Aug 14, 2012

Why would you need to trick someone into saving a file with a particular name? You already have malware running on their machine!

Seems much more likely that the check is there to confirm that the payload only runs on specific targets. And, perhaps more importantly, to make recovery and dissection of the payload very difficult for someone without access to the target(s).

rdtsc · on Aug 15, 2012

If you are a virus and you are too obvious, you are quickly found and eliminated by the "immune" system. So it is important to stay low on hosts where there is no benefit in attacking, using them only as vectors of infection, and to go into full-blown activation mode only when some specific trigger is found.

danielweber · on Aug 14, 2012

I was thinking that this program is the bomb, but it's waiting for a trigger. Having a file with a certain name appear on the machine would be that trigger.

sirclueless · on Aug 14, 2012

Well, Gauss requires a file in %PROGRAMFILES% which is considerably more difficult to plant.

rdtsc · on Aug 15, 2012

I would guess it isn't meant to be planted; it is expecting it to be there already. Chances are that it is an Arabic name for some program from Siemens or something like that. Or the name of a rich bank client used to connect to a Swiss bank or something of that sort.

The key, of course, is to lie low and undetected until that trigger fires; otherwise, anti-virus companies will blow the whistle.

dkokelley · on Aug 14, 2012

From a commenter:

"2. Append the pair with the second hard-coded 16-byte salt and bytes 0x15, 0x00 " and assuming point 2 of my message above:

This gives a finger print of all actual used programs. This finger print should be specific in the range of 1 to 10^(-7).

If so specific, it limits the scope to preconfigured systems, which are NOT run under user control.

Might it be, that those targets are embedded systems like ATM, Mobile base stations and again SCADA-systems?

Supposedly an administrator could send an update that inserts random files in program files to foil the system identification method, but given that the attacker has such detailed information about the target systems, this seems like a temporary measure at best.

Edit: It looks like the code is only looking for a specific filename. In that case, the only way to thwart this is to rename that file (and fix any issues that this would cause).

meatsock · on Aug 14, 2012

a good point raised in the comments is that the "arabic or hebrew" part really meant to say a "non-letter us-ascii value including curly brackets, tilde, and pipe". not sure why anyone would want to jump the gun on narrowing down geography in this way.

tjic · on Aug 14, 2012

> not sure why anyone would want to jump the gun on narrowing down geography in this way.

Because it's a tool designed to take out Iranian uranium refining tools.

lifeisstillgood · on Aug 14, 2012

I think it is fair to say it is a general purpose tool, that was configured at least once, to take out Uranium refining tools.

So, I would be surprised to find this was not also using, say the parts of Unicode with Chinese characters as well.

trebor · on Aug 14, 2012

I'd like to see Kaspersky bust one of the Russian government-made viruses. I see them protecting customers from everyone else...

DividesByZero · on Aug 14, 2012

Do you have any examples of such a virus? So far only the US and Israel have been implicated in the creation of 'weaponised' malware.

_3u10 · on Aug 15, 2012

Any chance that the Palida Narrow font is designed to infect printers?

sunyc · on Aug 14, 2012

checking the program directory might mean they are looking for a specific machine with a specific system image