
I agree with you that no interpreter should be running there. It’s bad design.

But how many C/C++ engineers would think to design a system that runs interpreted code, vs. JS ones? The take isn't as bad as you think.




> But how many C/C++ engineers would think to design a system that runs interpreted code

Multiple people, in this very thread, including you[0]. And apparently at least one Avast engineer and their upper management.

I'll requote/paraphrase another commenter[1] down-thread: it wasn't JS devs who wrote a custom interpreter inside a privileged C/C++ program. It was a C/C++ developer who thought, "I can handle this."

It's very important when calling out security failings to point out the real failing. If people are reading this and trying to take away security advice, I don't want their takeaway to be, "so my custom Lua interpreter is fine."

[0]: https://news.ycombinator.com/item?id=22545385

[1]: https://news.ycombinator.com/item?id=22545945


Re-read the comment. GP is not saying to make an interpreter for C++, they are saying that there should be no interpreter. If the language is compiled there's no need for one. C++ can obviously be insecure, but the scale of a JS interpreter + the fact that it's meant for executing arbitrary code leads to a huge security flaw that isn't present in just a normal C++ app.


Then just say that there should be no interpreter. Don't confuse the issue by talking about memory safety or by acting like there are better/worse ways to do this.

> but the scale of a JS interpreter + the fact that it's meant for executing arbitrary code

"the fact that it's meant for executing arbitrary code" is the only part of that statement you need. Avoid executing arbitrary code in unsandboxed/unisolated environments, even in a normal C++ app, even if the code is compiled. The scale doesn't matter.


The scale is important. A JS interpreter is tens of thousands of lines of code at a minimum. Even if sandboxed, the likelihood that there's a way out of the sandbox increases massively as more code is added. The possibility of bugs in the interpreter is another issue: 10k lines of C or C++ code means 10k extra lines where there could be a buffer overflow, segfault, or memory leak. Then multiply that by the number of times those lines are executed with unknown input (i.e., every line of JS code).


The scale is only important if you trust yourself to build a secure interpreter in the first place. Caring about the complexity of the interpreter means you are relying on the interpreter to keep you safe. Do not do that thing! The interpreter is not your sandbox.

If you're following best practices and isolating the process, then the number of lines of code shouldn't matter for the actual security. You should assume that a custom-built interpreter designed to run malicious code always has bugs -- whether it's running JS, Lua, whatever -- so you should run that code in a separate, sandboxed process that doesn't have system access.

It's not that the scale doesn't make a difference in complexity, it's that (for the most part) if you find yourself at the point where you're asking questions about the scale, you have already seriously messed up, and you need to go back and rethink your design.
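
To make "isolating the process" concrete, here's a minimal Linux sketch, assuming a hypothetical standalone analyzer binary that does the actual scanning: fork a child, cap its resources, drop its privileges, then exec. A real sandbox would add seccomp filters, namespaces, and more; this only shows the basic shape.

    #include <sys/resource.h>
    #include <sys/types.h>
    #include <unistd.h>

    // Run the untrusted-input analyzer in a separate, de-privileged process,
    // so a hostile sample that escapes the interpreter lands in a dead end.
    pid_t spawnSandboxedAnalyzer(const char* analyzerPath, const char* samplePath) {
        pid_t pid = fork();
        if (pid != 0) return pid;  // parent: caller waits on this pid

        // Child: cap address space and CPU time before touching untrusted input.
        struct rlimit mem = {256u * 1024 * 1024, 256u * 1024 * 1024};
        setrlimit(RLIMIT_AS, &mem);
        struct rlimit cpu = {10, 10};  // seconds of CPU time
        setrlimit(RLIMIT_CPU, &cpu);

        // Drop root privileges; uid/gid 65534 is conventionally "nobody".
        if (setgid(65534) != 0 || setuid(65534) != 0) _exit(1);

        execl(analyzerPath, analyzerPath, samplePath, (char*)nullptr);
        _exit(1);  // exec failed
    }

The parent can then waitpid() on the child and treat a crash or timeout as "the sample is suspicious" rather than "the host is compromised."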

----

The business problem Avast was trying to solve was, "how do we tell whether or not a random JavaScript file contains malware?" The answer they came up with was, "we'll run the file in a process with system access and see what happens."

I'll ask the same question I asked the original commenter: what is a safe way to solve that business problem without process isolation? And if you are correctly isolating the untrusted code, then why does the complexity of the JS interpreter matter?


Compiling untrusted code isn’t going to save you.


No, it won't. But there's no need to compile untrusted code in this scenario. All the code that is run should be packaged in.


I think you've possibly misunderstood the actual situation. This is a virus scanner. Analyzing untrusted code is the point.

The JS interpreter isn't there to lay out an interface, it's there to help them understand untrusted code that they find on the filesystem.


Where did I talk about having interpreted code?


> Contrast this with tailor-made, slim and well tested C++ code.

Avast is (was) running untrusted code in their system, by design. If your intention is to say that they shouldn't run untrusted code, then just say that. Don't waste time talking about whether memory safety matters for untrusted code -- just avoid untrusted code.

A lean C++ solution to the design problem of "how do we run untrusted code" is no better than an interpreted solution. You're focusing on the implementation, not the core design, and it is the core design that's flawed.

The only result of bringing JS up in a conversation like this is going to be to make people doing equally unsafe things in/with other languages feel better about themselves.


Again, what are you talking about?

No interpreter code should be running there, C, C++ or JS. What I am saying is that this design is made by or for people who are seeking to run JS as part of the business logic, which is ridiculous. I haven't brought up "memory safety" at all. Perhaps read the thread before commenting?


> who are seeking to run JS as part of the business logic

You can, of course, safely use JS for business logic without executing code from unsafe sources, in the same way that you can safely use C#, Lua, Lisp, or any other language. But that aside, your criticism is completely irrelevant to the actual security flaw.

I don't see any indication Avast was trying to use JS for business logic in the first place. I'm sure they over-embed crap for other parts of their interface, but that's not what's happening here. Avast was not loading a custom interpreter to execute their own business logic, they were loading a custom interpreter to analyze user files as part of their virus scan.

> That service loads the low level antivirus engine, and analyzes untrusted data received from sources like the filesystem minifilter or intercepted network traffic.

> Despite being highly privileged and processing untrusted input by design, it is unsandboxed and has poor mitigation coverage.

It wasn't Avast's reliance on JS as a development tool that caused them to say, "maybe we should parse and run arbitrary files on the filesystem in a process with elevated permissions."

It's their business logic that's the problem. Avast's business logic is, "we want to execute untrusted source code to see if it contains viruses." It wasn't a JS engineer saying, "I can't do my job in C++". It was a C/C++ engineer saying, "I know how I can tell if this file is dangerous -- I'll run it in the main process to see what it does."

If you can tell me a safe way to accomplish that business logic in C/C++ or in any other language without process isolation or sandboxing, then I'll concede the point. But I'm pretty sure you can't.

As an aside, an interesting follow-up to this disclosure would be if someone tested whether or not Avast is also interpreting other languages, like Lua, that they regard as potentially dangerous. I wouldn't necessarily take it as a given that JS was the only language they were doing this with.


> What I am saying is that this design is made by or for people who are seeking to run JS as part of the business logic

Nope, you've fundamentally misunderstood the issue at hand. The JS is not Avast's business logic. Rather, the program is attempting to analyze and detect malware in JS that the user encounters, similar to how an AV engine might inspect Microsoft Office files for malware. "Low-effort JS developers" had nothing to do with this. "C/C++ engineers" (and/or those above them) made this decision.


> If your intention is to say that they shouldn't run untrusted code, then just say that.

Or maybe stop trying to "gotcha" other commenters by making assumptions about what they were trying to say.


I was testing a pre-release version of one of our products at work, and it was causing (inadvertently) massive slowdowns periodically due to a naive approach to scanning a system for installed applications.

So, I did what any sane devops engineer would do: I throttled the CPU use limit for its cgroup in the systemd service file. Now no more scans.

Except now the UI wouldn't load. Couldn't figure out why. Just an empty white window. Turns out, it's running a node.js server and the whole UI is rendered in HTML/CSS/JS, but because of that it was so non-performant that the UI would effectively not render at all if it couldn't slam your CPU.

I can't think of any native-code, native-widget, control-panel-type UI that would completely fail to render the entire window at all if limited to 10% CPU time, but hey, here we are.
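
(For reference, the throttle itself was just a CPU cap on the service's cgroup; a systemd drop-in of roughly this shape, with the path and unit name as illustrative guesses rather than the actual product config:)

    # /etc/systemd/system/scanner.service.d/throttle.conf  (hypothetical)
    [Service]
    CPUQuota=10%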


> it was causing (inadvertently) massive slowdowns periodically due to a naive approach to scanning a system for installed applications.

> Turns out, it's running a node.js server and the whole UI is rendered in HTML/CSS/JS, but because of that it was so non-performant that the UI would effectively not render at all if it couldn't slam your CPU.

So was it the naive approach to system scanning, or the web-based UI that was the problem? Because there are some performant desktop applications using web rendering, like VS Code. Though I'm not sure how VS Code would behave if limited to 10% of CPU, because that's kind of a weird scenario.


> But how many C/C++ engineers would think to design a system that runs interpreted code,

This is essentially how antivirus software works. Every one of them packages an emulator to execute malicious binaries.

I'd say the number one thing stopping C++ devs from running eval'd C++ code is the lack of a std eval, and that's probably it.


Antivirus software normally matches code against a database of well-known patterns; it does not investigate the code on the client machine. AV software houses run their own labs, where emulation is used to inspect suspected malicious code.


To my knowledge every single major AV packages a local emulator. We have long, long moved beyond a world where AV does basic pattern matching.

Frankly, I am far less concerned with the JS interpreter than I am with the rest of the codebase.

http://computervirus.uw.hu/ch11lev1sec4.html

https://www.blackhat.com/presentations/bh-europe-08/Feng-Xue...

http://joxeankoret.com/download/breaking_av_software_44con.p...


Given the prevalence of undefined behaviour, a C/C++ system should be assumed to contain arbitrary code execution vulnerabilities until proven otherwise. So in practice most C/C++ programs that process data can interpret code, even if they weren't intended to.


I don't know why people freak out so much about undefined behavior. Yes, it's not defined in the language standard, and that's quite unfortunate, but it becomes defined as soon as you choose a compiler. And careful work (and avoiding really hacky things) lets you write a C++ program that dodges undefined behavior even if you're uncertain how stable your build chain is.

To be honest though, in the modern world, picking a stable compiler like GCC is a good enough choice for life - this isn't the 90s, where you might have to dumpster dive to find copies of that specific Borland compiler your company decided to tailor their code to.

(edit: All the above holds until you start making assumptions about uninitialized memory; at that point you're really in trouble, and honestly, C++ really should be better about preventing you from using dirty memory)


The behaviour of GCC is by no means clearly defined. Even taking it as given that any memory-handling error will result in arbitrary code execution (accessing uninitialised memory as you say, but also e.g. a double free), there are other cases. GCC has been known to compile the addition of two integers into arbitrary code execution. It has been known to compile code like:

    void doDangerousStuffIfAuthorized(AUTHORIZATION* authorizationPtr){
      // The dereference below happens before the null check; that is
      // undefined behaviour for a null pointer, so the compiler may assume
      // authorizationPtr is non-null and delete the check entirely.
      AUTHORIZATION authorization = *authorizationPtr;
      if(authorizationPtr == nullptr || !isValid(authorization)) return;
      doDangerousStuff();
    }
into something that executes doDangerousStuff() when passed a null pointer. When users complain about such things, the answer is that the code was causing undefined behaviour, and so what GCC is doing is correct according to the standard.
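
The integer-addition case follows the same logic. Here's a minimal sketch (exact behaviour depends on compiler version and flags, and this tamer example shows a deleted check rather than full code execution):

    // Intended as an overflow guard, but signed overflow is undefined
    // behaviour, so GCC at -O2 may assume x + 1 > x always holds and
    // compile the whole function down to `return false;`.
    bool willOverflow(int x) {
        return x + 1 < x;
    }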



