I probably deserve to get downvoted to oblivion for this but... I've deployed JSFuck in production!
We wanted to obfuscate this bit of code, to make life just a little bit harder for reverse engineers. We made this huge function where we pretty much passed in all our application state, and it would run this JSFuck code, and spit out a token. We even made a few tweaks to the code so that you couldn't just reverse it back into JS with something like https://enkhee-osiris.github.io/Decoder-JSFuck/.
Performance was surprisingly alright, and it has never hit an environment where it couldn't execute. All in all, a fun few hours setting it up, and I haven't had to touch it since!
makes me sad to hear that :( I will say, as a reverse engineer, that javascript minifiers like closure compiler will optimize almost all obfuscation out, and the rest you can usually translate to a form which it can understand and then it will do the rest.
The effect of obfuscation is not what you expect. It seems like it moves the whole difficulty up, but it only moves up the floor. By doing so it tends to remove all the signals that would warn non-experts of security issues. Remove the obfuscation and then you have all the catastrophic security issues that have accumulated in there like treasures in an egyptian king's tomb https://zemnmez.medium.com/how-to-hack-the-uk-tax-system-i-g...
I'm curious: can you undo the obfuscation of JScrambler and Obfuscator.io easily? Some time ago I tried to run both through Closure Compiler, but it was way harder than I thought it would be.
JScrambler I actually did de-obfuscate to bypass some very significant bot detection a few years ago, but it took a bit more doing -- it uses ES6 features IIRC so I had to transpile it down to ES5 via babel first, but it worked OK after that.
To clarify, that's what happens once you pass it through the obfuscation and THEN closure compiler. If you click the link you will see the obfuscation is much more complex
While I think it's bad to use obfuscation to hide security holes, I do think obfuscation has its uses. If there are reasons to make data private, then there are reasons to make execution private as well. Not to mention, it was recently proved that indistinguishability obfuscation is possible, so I'm not sure how useful de-obfuscation tools will be in the future.
Really enjoyed that blog post, thanks for linking it! How long would you say actually finding those two issues took? The attack surface must've been fairly large, but I guess intuition helps.
I've been using the Google Closure Compiler in advanced mode for many years, and every time I look at the code output I'm like "there's no way in 100 years I would be able to de-obfuscate this back to my own code to any great extent". But I don't specialize in reverse engineering, so I might be missing something big.
personally, I find the code it outputs to be easier to understand often than the source because it simplifies a lot of stuff into the most abstract logic. I think it takes some getting used to for your brain to connect those pieces, though
I guess ultra-shortening the names into a,b,c,d,e, etc makes it superficially hard to understand what's going on but I agree on your point about abstract logic.
I understand the frustration, but as someone having to debug browser bugs with JavaScript edge cases, minified (and in this case ultra-obfuscated) JS is hell on earth to untangle and I wish people wouldn't.
IME, code on prod rarely ships with a sourcemap. and as far as i can tell, GP is talking about other people's minified code, not something where they control the build process
I think I might've actually analysed the code you're describing.
We even made a few tweaks to the code so that you couldn't just reverse it back into JS
This is why a lot of us keep our tools private... an old tradition of the cracking scene going back decades to the 80s. Think of things like IDA/Hexrays and Ghidra, then realise the most prolific crackers had similar private tools they had written many years before those appeared.
Not really ironic or related. The privacy here is, well, keeping the tools entirely private -- no distribution or highly limited distribution, not obscuring their function.
Irony would be applying obfuscation or other DRM protection techniques to tools intended to de-obfuscate/reverse engineer which are then distributed/sold. This is fairly common in commercial reverse engineering solutions, although I don't think the irony of the cat and mouse game is lost on the authors in that case...
why not - after all, none of the open source licenses say that you must actually distribute the program, they only say that you should also include the source code if doing so.
Different motives. Think of it like encryption: when someone finds a weakness in a cypher people will generally move onto a new cypher that’s harder to crack. The difference here is the purpose of writing those tools isn’t to improve security, it’s to break code open. So keeping these tools hidden keeps them effective.
> You shouldn't have access to it -- it's on my private servers doing stuff for me. How did you get it?
Seems like the logical solution would be to restrict access to it then, rather than obfuscate it making it harder for yourself to maintain. But hey, you do you.
> You shouldn't have access to it -- it's on my private servers doing stuff for me. How did you get it?
Well thanks for chiming in about a completely unrelated topic from website javascript. My comment wasn't getting downvoted until your slapfight with oauea blew up.
That wasn't a top level question though, it was a question to a commenter saying "We wanted to obfuscate this bit of code, to make life just a little bit harder for reverse engineers." That you obfuscate code nobody other than you can even access is completely irrelevant as a reply to that question.
Dude, I write financial applications in Node. Node IS javascript.
These applications run on the back end. Some of the API facing VMs have been attacked and so to be honest I've configured them so that if someone did get access to them, they wouldn't find much. Maybe just some API keys I can invalidate. Although I probably won't do it -- an obfuscator like this could be very handy here.
Have you given Deno a try? It is a JavaScript runtime, but it can produce a single binary. That could make sense in your case because the result may be harder to inspect.
I think you ultimately need to do something like Function(code)() in JSFuck, so it is always possible to remove the final function call and get the `code` directly.
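To make that concrete, here is a minimal sketch of catching the string on its way into the Function constructor. The `hiddenSource` payload and the `captureFunctionSource` helper are made up for illustration, and this assumes the obfuscated code has no anti-tamper checks (the "few tweaks" mentioned upthread could well defeat something this naive).

```javascript
// Hypothetical stand-in for an obfuscated payload. A JSFuck-style program
// eventually hands a plain source string to the Function constructor.
const hiddenSource = "return 6 * 7";

// Hypothetical helper: runs a callback while recording every source string
// that reaches Function.prototype.constructor, then restores the original.
function captureFunctionSource(runObfuscated) {
  const captured = [];
  const RealFunction = Function;
  const originalConstructor = Function.prototype.constructor;

  // []["flat"]["constructor"] is looked up on Function.prototype,
  // so swapping this one property is enough to see the code string.
  Function.prototype.constructor = function (...args) {
    captured.push(args[args.length - 1]); // last argument is the function body
    return RealFunction(...args);
  };

  try {
    runObfuscated();
  } finally {
    Function.prototype.constructor = originalConstructor;
  }
  return captured;
}

const sources = captureFunctionSource(() => {
  []["flat"]["constructor"](hiddenSource)();
});

console.log(sources);
```

The hook only sees calls that go through the inherited `constructor` property, so code that grabs `Function` some other way would need a different approach.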
Interesting. I can definitely relate. As an author of a proprietary application written in web technologies, it's easy to be envious of compiled languages.
This is my recipe for minification:
1. Apply a convention where all class properties and methods have to end with a trailing underscore (it's trivial to make an Eslint rule to enforce it)
2. Use 2 minification tools in following order: Closure Compiler (simple optimizations mode), then Terser for best output.
3. Configure Terser mangle.properties.regex: /_$/
4. If it's a desktop app, use Bytenode for Node.js processes (cannot be used for the renderer)
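For step 3, a rough sketch of what the options object might look like. Only the `mangle.properties.regex` setting comes from the recipe above; the `terserOptions` name and the `wouldBeMangled` helper are just for illustration.

```javascript
// Sketch of a Terser options object. Only mangle.properties.regex is taken
// from the recipe; everything else here is illustrative.
const terserOptions = {
  mangle: {
    properties: {
      // Only property names ending in an underscore are eligible for mangling.
      regex: /_$/,
    },
  },
};

// Hypothetical helper: shows which names the regex would select.
const wouldBeMangled = (name) => terserOptions.mangle.properties.regex.test(name);

console.log(wouldBeMangled("cache_"));   // matches the trailing underscore
console.log(wouldBeMangled("toString")); // no trailing underscore, left alone
```

The point of the underscore convention is that built-in and third-party property names never match, so mangling cannot break calls into code you don't control.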
I think someone once asked me the simplified variant of that question.
What is the result of:
[]+[]
And I didn’t know the answer to that (I mean, who does that kind of fuckery in Javascript, you can’t sum arrays). I would have no chance with these Google level questions.
That's an idiotic interview question. I've experienced this at young companies where both the company and its engineers are too immature to understand basic etiquette in the industry. Btw I doubt Google would ask a stupid "gotcha" question like that. They tend to ask hard algorithmic questions.
FWIW I’ve had Google recruiters ask me stuff where they don’t understand the answer and the questions are things like “what is the protocol number for ICMP” and “what port does NTP use”. If one doesn’t work with these things very often, they’re very forgettable as they’re so easy to google and find the correct answer to in seconds, and therefore not worth memorizing. No idea if they’re still doing this, but it was like a crappy version of Hacker Jeopardy (#DFIU!). After about 6 months of back and forth with the recruiter and a few interviewers missing interviews, I just gave up and told them I was no longer interested.
I went through the swe generalist interview pipeline 3 times at Google, so that's about 15 interviews. All the questions have been standard, except this one dumb question, which was along the lines of "explain how multithreading works". I suppose he wanted me to follow up with some questions to narrow down the topic, but I was stressed out enough that I just started with the basics, like cores, processes, OS threads, and shared memory, trying to explain it in simple terms. He interrupted me about 5 minutes in and moved on.
Having interviewed people myself, I think that's an overly broad question to ask in an interview and if the candidate starts answering it as asked, then you're not going to get a lot of signal from it.
Otherwise I thought the problems asked weren't that difficult, one was even directly taken from Cracking the Coding Interview - if only I had read that chapter ahead of the interview...
Heh, what's this? You can't answer this question off the top of your head after I spent 2 hours purposefully researching obscure edge cases to craft it? I guess you're not a real engineer. Did you even pass the 101 courses? No, no, no... You're no fit for us here. You see, some of us have to actually work for a living. Try pulling yourself up by the bootstraps next time, kiddo!
Only in Amazon interviews I've had actual trivia questions, like having been asked questions that'd require you to know all POSIX signals by memory (that could be fair if you squint a bit) or, more absurdly, solve a problem tailored for a particular flag of the find utility (which I realized when I looked at the manual for the thing again), as any other solution was too complicated to be satisfactory.
This is in my opinion actually a decent interview question. If you know JS in detail, if you know how the "+" operator works, it's a super easy question. If you don't know JS in detail, if you don't know how the "+" operator works, it's pretty tough. If you want an engineer who knows JS in detail, asking this question can be valuable.
"+" isn't some singular operator. They are different operators for different types using the same character. You'd have to memorize a table of all JS types and what "+" does for each, which is silly considering that the only place it should really be used in modern JS is arithmetic and the occasional string concatenation.
afaik these two places ( arithmetic and string concatenation ) are the only ones where + operation is defined. JS picks one of these two operations and casts the operands accordingly.
Edit: Wait, there is also the unary + which takes only one argument, but it also could be considered arithmetic.
The + operator doesn’t work on arrays, but tries to cast the sides of it to a string. Arrays implement the toString function, which shows the toString of the individual values comma separated. An empty array returns an empty string. Which means that this is equivalent to empty string + empty string.
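A quick sketch of that explanation, easy to paste into a console. The variable names are just for illustration.

```javascript
// An empty array converts to the empty string.
const emptyArrayAsString = String([]);

// So [] + [] is "" + "", which is a zero-length string, not an array.
const sum = [] + [];

// Non-empty arrays are joined with commas first, then concatenated as strings.
const joined = [1, 2] + [3, 4];

console.log(typeof sum, sum.length); // string 0
console.log(joined);                 // 1,23,4
```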
This kind of thing does happen in the wild, through no fault of anyone in arms' reach, in particular when service APIs change in subtle and unexpected ways.
You might have a service that yesterday returned a JSON payload:
{ tasks: 5, ... }
but today returns:
{ tasks: [{...}, ...], ...}
If you had an array of these objects and were trying to sum the count of tasks, now your code returns weird results. Knowing how the JS engine's type coercion works is then invaluable for figuring out the problem quickly.
Yes, it's shitty code, and a change of API like that is shitty. But we don't always get to choose what code we're interfacing with, particularly when it comes to web services.
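Here's a small sketch of the failure mode described above. The payload shapes follow the example; the function names and sample data are made up.

```javascript
// Naive sum that assumes payload.tasks is always a number.
function totalTasks(payloads) {
  return payloads.reduce((total, payload) => total + payload.tasks, 0);
}

// Defensive variant (illustrative): fail loudly instead of coercing.
function totalTasksChecked(payloads) {
  return payloads.reduce((total, payload) => {
    if (typeof payload.tasks !== "number") {
      throw new TypeError("expected tasks to be a number");
    }
    return total + payload.tasks;
  }, 0);
}

const yesterday = [{ tasks: 5 }, { tasks: 3 }];
const today = [{ tasks: [{ id: 1 }] }, { tasks: [] }];

console.log(totalTasks(yesterday)); // 8
console.log(totalTasks(today));     // 0[object Object]
```

No exception is thrown in the naive version, which is why the first symptom tends to be a strange string showing up somewhere downstream.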
Sure, but in the wild, you would actually run the thing in the repl to see what it does. So either you know the answer or you don't and it doesn't affect whether or not you can do the job well. In my opinion binary knowledge-based questions are seldom useful unless you specifically are checking for that exact knowledge. You can have someone who is totally terrible in general but just knows that one thing or the answer to that specific trick question and you can have people who would be fantastic for the role but either don't know the answer or can't think of it because of a mental block in the moment. ie the false negative and false positive rate are too high and your question is just a random classifier.
Questions that allow a person to show their skill and their thought process are much more revealing. They're much harder to fake and bluff your way through for poor candidates and they allow good candidates to show their skill. They require a bit more effort from the interviewer though because there isn't just one right answer and you have to sacrifice the idea that you get an ego boost by dunking on some poor candidate with your trick question.
An example of a question I used to ask when interviewing candidates who said they were good at unix: "Say I have a directory full of files called a, b, c, etc and I want to rename them all to a.bar, b.bar, c.bar etc how do I do that"? Then I would follow up with "Ok now I have a directory full of files called a.bar, b.bar, c.bar etc and I want to rename them back to a, b, c, how do I do that?".
Now if you know unix you know there are a bunch of ways to do this, there are a bunch of traps you can fall into, what if there are too many files for shell expansion, what if the file names contain spaces etc. It's not hard for someone skillful to come up with a few ways that deal with these issues and if they can talk through how they did it and why then any of them are fine.
Edit to add: There is a kernel of value here though, which is say you were debugging a thing where the result was an empty string and you thought "wow, that makes no sense", maybe you get to understanding what went wrong when you see that two arrays are getting type coerced or something.
My reply was less about whether or not it's a good interview question and more about the comments people are making in this thread that the question is completely invalid because only a "fool" would write code like that.
In practice, unless you have some extremely pedantic, end to end testing, your first hint that something is going wrong is probably going to be malformed output. TypeScript won't catch this error because it doesn't do runtime checking, it trusts that type annotations are correct. It's very possible for code like this to pass through without causing any exceptions to throw. Indeed, JavaScript was originally designed to "keep on trucking" in this way, and that's why we have this degree of type coercion. So you're going to be working backwards from a result rather than forward from a source. If you already know how your code works and you see a weird result like that, knowing the type coercion gotchas can help you work backwards to understanding where the change occurred.
But in the former issue, I don't think the question on its own is bad for job interviews. How the question is used is what determines if it's good or bad. For example, it is a little unreasonable to expect people to know every single, little gotcha in JavaScript and the exact way in which they misbehave. But I do think it's reasonable to expect people to know that there are gotchas and that this is a common one. I'd accept as a valid answer, "I don't know the exact results, but I do know that this doesn't concatenate arrays and instead does something unexpected".
> My reply was less about whether or not it's a good interview question and more about the comments people are making in this thread that the question is completely invalid because only a "fool" would write code like that.
If you ask about how to uppercase a string, certain constructs like []+[], or whether camelCase or PascalCase is superior you will get the people who know the answers to those questions.
I remember encountering this as one of the layers of a heavily obfuscated script (with DRM-related purposes, not surprisingly...) many years ago --- fortunately, there's a corresponding unobfuscator for it:
This is a feature not everyone stumbles upon, but if you follow the 2nd link called “past” on the page, the one under the headline, it performs a search for you for previous discussions.
Dang has his own tool, but it’s handy when he hasn’t done the search yet.
(The 1st such link, at the very top, emulates previous days’ front page.)
I actually had a chance to meet Martin once during JSConf in Singapore back in November 2014. Very cool guy with a hysterically fucked up humor with JS.
JSFuck was included in his presentation back then and boy did it inspire me to do more JS. Here we are, a little under 7 years later, still in the JS world.
Very cool. Reminds me of a few years back when I was writing apps for Facebook, along with their fan pages. They had their own markup language, 'fbml', and 'fbjs'. The app was executed in a sandbox inside an iframe, which you could add as a tab on a fan page as well.
A few times I broke through their sandbox, allowing me to run any XSS on page load, even on the fan page... it grabbed their token and added friends, invited a random number of friends to a fan page, liked fan pages, then posted a status update... all random, and it would base it on how many friends the user had. Anyway, a big problem was other developers stealing my code thanks to it being JS... So I ended up using every bug in JS like this, to confuse. I made a function that would pull element names/type/src etc., then used that as an alphanumeric definition. So my source had no spelled-out names... on top of using JS hacks... then finally obfuscating.
I remember the last time I did this and released it into the wild... it was patched up by FB the morning after it was sent to a security researcher, who posted it on his popular site for his audience to reverse engineer, which they did in a few hours... everything but the few lines that were passed to FB's sandbox that returned the broken code which enabled me to run the XSS....
Gooooood times...javascript is fun#!
I don't understand how this works. I've seen brainfuck before, but not truly understood it. How is it possible to open up a new tab, paste the encoded form, and have working JavaScript?
It's because it _is_ working javascript. It's just obfuscated, taking advantage of the quirks of javascript's implicit type casting to represent expressions in odd ways. Look at the "basics" section of the page to get an idea of how it works on a small scale.
Brainfuck is an unrelated, separate programming language.
2. []["flat"] => The built in flat function on the Array object
3. []["flat"]["constructor"] => The constructor of any JS function is the Function constructor, which can take JavaScript, as a string, as its sole parameter. Thus, to create a function that executes any CODE_AS_STRING:
4. []["flat"]["constructor"](CODE_AS_STRING) => gives you the Function object, and finally:
5. []["flat"]["constructor"](CODE_AS_STRING)() => executes it
Thus, all that's left is to encode the strings "flat", "constructor" and CODE_AS_STRING using the patterns described early in that file.
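The steps above can be tried out directly. This is only a sketch: the code string stays in plain text, and just the four letters of "flat" are built from punctuation to show the idea. Variable names are illustrative.

```javascript
// Build the letters of "flat" out of coerced booleans.
const f = (![] + [])[+[]];           // "false"[0]
const l = (![] + [])[!+[] + !+[]];   // "false"[2], since true + true is 2
const a = (![] + [])[+!+[]];         // "false"[1]
const t = (!![] + [])[+[]];          // "true"[0]
const flatName = f + l + a + t;

// Step 2: look up the built-in flat function on an array.
const flat = [][flatName];

// Step 3: the constructor of any function is the Function constructor.
const FunctionConstructor = flat["constructor"];

// Steps 4 and 5: build a function from a string, then call it.
const result = FunctionConstructor("return 1 + 2")();

console.log(flatName, FunctionConstructor === Function, result); // flat true 3
```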
This isn't related to BF (despite looking like it). At the bottom of the page, it explains how the conversions work. Basically, it takes advantage of JavaScript's loose typing and implicit conversions.
brainfuck isn't an obfuscator, but a language with a very small and simple instruction set. Technically it falls somewhere between c and assembly, since it doesn't have a 1:1 relationship with machine code, but in practice it is less powerful than assembly because you have to work within the confines of its restrictive conventions to recreate something as simple as y = x
There are some tasks which lend themselves to the way the language is constructed, and can be done with few instructions.
Reversing a string, for example, is just ,[[<]<+>>[>],]<[.<] which is quite short compared to most "practical" programs in this language.
And a destructive y = y + x is just [->+<] (destructive because this also sets x to zero)
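To check the addition snippet, here is a minimal sketch of an interpreter covering just the six commands it needs. The I/O commands `,` and `.` are left out, so it can't run the string reversal, and moving the pointer below zero is not handled. The `runBrainfuck` name and the preset tape are assumptions for the demo.

```javascript
// Minimal interpreter for the + - < > [ ] subset, starting from a preset tape.
function runBrainfuck(program, initialTape) {
  const tape = [...initialTape];
  let pointer = 0;

  // Pre-compute matching bracket positions.
  const jump = new Map();
  const stack = [];
  for (let i = 0; i < program.length; i++) {
    if (program[i] === "[") {
      stack.push(i);
    } else if (program[i] === "]") {
      const open = stack.pop();
      if (open === undefined) throw new SyntaxError("unmatched ]");
      jump.set(open, i);
      jump.set(i, open);
    }
  }
  if (stack.length > 0) throw new SyntaxError("unmatched [");

  for (let pc = 0; pc < program.length; pc++) {
    switch (program[pc]) {
      case "+": tape[pointer] = ((tape[pointer] ?? 0) + 1) & 255; break;
      case "-": tape[pointer] = ((tape[pointer] ?? 0) - 1) & 255; break;
      case ">": pointer++; break;
      case "<": pointer--; break;
      case "[": if (!tape[pointer]) pc = jump.get(pc); break; // skip the loop
      case "]": if (tape[pointer]) pc = jump.get(pc); break;  // repeat the loop
    }
  }
  return tape;
}

// Cell 0 holds x = 3, cell 1 holds y = 4.
const finalTape = runBrainfuck("[->+<]", [3, 4]);
console.log(finalTape); // [ 0, 7 ]
```

After the loop, cell 0 is zero and cell 1 holds the sum, which is the "destructive" part.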
This reminds me of a blog post about using pure lambda calculus for whiteboard problems [1].
I'm always fascinated by these "minimum viable language projects", since I think they sort of get into a fundamental concept: what is computing exactly?
If you want a sincere answer: it provides a huge state space of how unexpected input can produce much more unexpected results. If nothing else it’s educational about how you really shouldn’t trust input, and REALLY should scrutinize how it’s used.
A lot of other languages allow + to concatenate lists/arrays/sequences, even merge dicts/objects/maps. Languages that don’t have those semantics often have much safer and more idiomatic ways to join data structures (see lisps of all sorts) in ways you wouldn’t immediately expect if you’re coming from this perspective.
Adding two arrays together isn’t the problem, it’s a perfectly obvious thing to do. Adding two arrays together and getting a new language out of it is bonkers.
Edit to add: I work in JS/TS almost exclusively. This isn’t a chip on my shoulder criticizing a language I’m not invested in.