NScript is the component of mpengine that evaluates any filesystem or network activity that looks like JavaScript. To be clear, this is an unsandboxed and highly privileged JavaScript interpreter that is used to evaluate untrusted code, by default on all modern Windows systems. This is as surprising as it sounds.
Double You Tee Eff.
Why would mpengine ever want to evaluate javascript code coming over the network or file system? Even in a sandboxed environment?
What could they protect against by evaluating the code instead of just trying to lexically scan/parse it?
(I'm sure they had a reason - wondering what it is)
I've known for over a decade that antivirus products are strangely destabilizing and stayed away from them myself. Now that Tavis has spent the last year or so investigating them, we know why. They're just really dumb. Because the popular commercial computer security product market is a "market for lemons": https://www.schneier.com/blog/archives/2007/04/a_security_ma...
Why should companies compete on quality or price when they can put all that effort into advertising and marketing that scares or confuses most people into buying their product?
I don't get this Market for Lemons thing. Am I missing any axioms that make this logic correct? My quibbles with the theory are:
1) Why are there only two (binary) qualities of car, good or bad? And why are they static? What market is this trying to model?
2) Wouldn't it be sufficient for peaches (as opposed to lemons) to be lowered in price to the market price, i.e. the average? The model seems to assume peaches are of immutable cost/quality with zero seller margin.
Aside:
1) Also, why is the state of the cars considered to be unknowable, i.e. even within a given model of car, the units are not of consistent quality? Is this a production quality issue (i.e. inconsistent quality of units in a given model)?
2) Why is this state of unknowable or perhaps inconsistent quality an equilibrium? Meaning, wouldn't a middleman who tests cars try to separate the lot and charge for his services?
The market situation being modeled seems contrived, at least to the extent that I understand it from the Market for Lemons Wikipedia article.
The binary part is to make it easier to understand.
This is not about new cars, this is about second hand cars. And even if it was about new cars, you wouldn't know for another 5 years if the new car you bought was bad quality or not. (assuming you can't rely on the reputation of the car maker)
The problem with second hand cars is that I sell my car every 5 years, so I basically don't have any reputation. And you don't know how well I maintained my car. All you can see is that it looks shiny.
You can't rely on a test I arrange, because you don't know if I got the certificate from a genuine unbiased party or from my brother in law. So if you want to test the quality of the car, it's going to cost you. And I'm not even sure it can be tested cheaply: I can paint over rust, roll back the odometer and do some quick fixes to make the car run smoothly, for now. Are you going to have the car taken apart in the test?
The problem now becomes that I know the quality of the car and you don't. The uncertainty is going to drive the price down. Sellers of good cars will hesitate to compete, because it's impossible for them to ask for a premium price. I'll probably keep the car longer (it's still good, why not) or sell it to a friend, who can rely on my reputation. Now the average quality of cars on the open market goes down, which drives the price down even further, which makes it even less likely a good car will end up on the open market, which drives the price even further down, etc.
It may also be good to look at 'The market for silver bullets' (2008) [1]. It discusses security products as products where neither the seller nor the buyer knows enough to judge their efficacy, and where the cost of a breach may be far lower than the PR costs and fallout, for example.
I may be misunderstanding, but I would expect the "market for lemons" to only be valid for a physical market, no?
The principle being that high quality cars are sold faster and leave the market, so the average quality decreases. But for antivirus, selling a copy does not take it off the market (the inverse is pretty much true, since it makes it more famous), hence the average quality does not decrease.
You misunderstand the physical car market metaphor - the high quality cars are not sold faster. They are taken off the market and not sold at all, because the market price is too low. The market price is too low because the buyers can't tell good from bad, so they just guess they will get average quality, and just pay what the average is worth. They don't pay for the high-quality used cars - even if they did, it would usually be for the wrong cars.
The problem is the average user doesn't know what makes a quality antivirus product and they also don't want to pay a high price for one. As a result antivirus manufacturers are incentivised to produce a cheap product which only needs to tick all the same boxes of their competitors in product comparison tables, even if features don't actually improve security. Producing a high quality product is far more expensive and may result in fewer customers due to the previous point.
You mean "why can't peach sellers make it up with volume?", marginal cost of copying software is zero, right? The problem is : so can lemon sellers. The price already accounts for this and it is too low to fund a peach seller (research, software engineering, customer service...).
In this case it's the customer that is removed from the market. Nobody (deliberately) buys two antivirus products. The effect is the same, nobody's going to develop and market a 'peach' antivirus product since the users don't understand it and are very price sensitive.
I think the "market for lemons" analogy is needlessly complicating this because that's an incorrect analogy IMHO.
If it were the case that peaches can't be produced at prices a customer is willing to pay, then the market would split into high-end and low-end, with two types of customers with different ability to pay, like in high-end cars (think Mercs and Rolls-Royces).
A customer might have information asymmetry but isn't stupid (is this overtly harsh?).
To quote my own comment in a parallel thread here: "wouldn't a middle man who tests cars try and separate the lot and charge for his services?". The market for lemons seems to model a static market, which is not realistic IMHO.
I don't understand how this "market for lemons" is relevant here. I'm not sure what you're aiming to communicate. I did take the time to read the linked blog post, but not the paper linked from there.
There is no binary split into good and bad antivirus products, but a spectrum, and they all have different prices. They've been put through real-world (viruses and malware in the wild) and synthetic tests of their efficacy by AV-TEST and others. So they work, maybe not 100% but more than 99% on many benchmarks.
Comparing them to an allegedly failed device like the Secustick, claiming (I'm not sure where the evidence for this is) that there are no good security products, and then saying it's "because the popular commercial computer security product market is a 'market for lemons'" seems like begging the question, or confirmation bias, or both.
I'd like to hear your thoughts. I hope I'm not trying to nitpick :)
I'm trying to be charitable, because I don't quite understand the Windows security model, but how is the reasoning to this not just:
"We don't trust this code from the internet, let's run it as SYSTEM!"
Because that seems insane. It's escalating the runtime permissions of code you don't trust -- moving it from user permission to SYSTEM permission just because it hits internet cache.
I keep thinking there must be some part of this I'm missing, because if that's really how it works... "Double You Tee Eff" indeed.
SYSTEM is even more privileged than Administrator. This is the real "root" user. One thing that sometimes annoys me greatly is that on Windows, Administrator is not the most privileged user and SYSTEM is hard to actually get to when you need it.
On a tangential point, one common "hack" when XP was popular was to replace Winlogon.exe with Minlogon, which basically made it an ultra-fast-booting "single user mode" (and coincidentally, also bypassed WPA) and that single user was... SYSTEM. That along with EWF and some creative RAMdisk usage made for some interesting liveCD-like Windows environments.
When I was little I booted into a Linux LiveCD and replaced magnify.exe with cmd.exe on a family computer with login time restrictions. Enabling Windows Magnifier from the accessibility menu on the login screen caused magnify.exe--now Command Prompt--to be launched as SYSTEM. From there you could start up explorer.exe to get a SYSTEM user desktop.
I took it for granted that all SYSTEM superpowers can be given to Administrator via Computer Configuration/Windows Settings/Security Settings/Local Policies/User Rights Assignment in gpedit.msc
Do you have examples for what SYSTEM can do, but Administrator cannot?
SYSTEM ignores access controls on the filesystem, for example. With NTFS, you can disallow Administrator access to certain files and explorer will require the administrator to explicitly take ownership of the files before any other action is allowed. SYSTEM does not need to do that.
I just checked, and on my system without having changed anything, administrators (i.e. users in the Administrators group, including Administrator) are granted SeBackupPrivilege and SeRestorePrivilege, which are the "ignore access control" privileges.
Anyway, even then administrators have by default a SERVICE_ALL_ACCESS grant to the Service Control Manager, which allows you to create and start a service running as SYSTEM.
Also administrators have SeDebugPrivilege, so they can ignore ACLs on process objects. This means you can just execute arbitrary code in one of the system processes - or you can duplicate the token of one of those and impersonate it.
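To make the service route concrete, here's a minimal sketch (error handling mostly omitted; the service name and binary path are made-up placeholders) of what an elevated administrator can do with those default SCM rights:

    #include <windows.h>

    /* Hypothetical demo: create and start a service that runs as LocalSystem.
     * Requires an elevated administrator token; "DemoSvc" and the binary path
     * are placeholders, and the binary must be a real service executable. */
    int main(void)
    {
        SC_HANDLE scm = OpenSCManagerW(NULL, NULL, SC_MANAGER_CREATE_SERVICE);
        if (!scm) return 1;

        SC_HANDLE svc = CreateServiceW(
            scm, L"DemoSvc", L"Demo Service",
            SERVICE_ALL_ACCESS,
            SERVICE_WIN32_OWN_PROCESS,
            SERVICE_DEMAND_START,
            SERVICE_ERROR_NORMAL,
            L"C:\\demo\\payload.exe",
            NULL, NULL, NULL,
            NULL,                       /* NULL account name = LocalSystem */
            NULL);
        if (!svc) { CloseServiceHandle(scm); return 1; }

        StartServiceW(svc, 0, NULL);    /* the binary now runs as SYSTEM */

        CloseServiceHandle(svc);
        CloseServiceHandle(scm);
        return 0;
    }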
So the anti-virus runtime engine has to run as something akin to SYSTEM since it has oversight over the entire system, but once a file type is detected it could hand the actual analysis off to a sandboxed, low-privilege child process and wait for a result.
The only explanation I can imagine is performance? But child processes could be pre-spawned or it could use shared memory regions for isolation. There's a number of things you can do here to have your cake and eat it too.
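A rough POSIX-flavoured sketch of that broker/worker split (on Windows the child would get a restricted token or an AppContainer rather than a setuid; the 'X' check and uid 65534 are just placeholders for real scanning work and a real service account):

    #include <sys/wait.h>
    #include <unistd.h>
    #include <string.h>
    #include <stdio.h>

    /* Privileged broker hands a suspicious blob to an unprivileged child and
     * only reads back a one-byte verdict; a bug in the parsing code then lands
     * in a throwaway low-privilege process, not in the privileged engine. */
    static char scan_blob_unprivileged(const char *blob, size_t len)
    {
        int fds[2];
        if (pipe(fds) != 0)
            return '?';

        pid_t pid = fork();
        if (pid == 0) {                              /* child */
            close(fds[0]);
            if (geteuid() == 0)                      /* drop root if we have it */
                if (setgid(65534) != 0 || setuid(65534) != 0)
                    _exit(1);
            char verdict = memchr(blob, 'X', len) ? 'M' : 'C';  /* toy check */
            write(fds[1], &verdict, 1);
            _exit(0);
        }

        close(fds[1]);                               /* parent: wait for verdict */
        char verdict = '?';
        read(fds[0], &verdict, 1);
        close(fds[0]);
        waitpid(pid, NULL, 0);
        return verdict;                              /* 'M' malicious, 'C' clean, '?' error */
    }

    int main(void)
    {
        const char sample[] = "harmless data";
        printf("verdict: %c\n", scan_blob_unprivileged(sample, sizeof sample - 1));
        return 0;
    }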
I'm legitimately surprised by this from Microsoft, they take security very seriously these days, and I cannot see how this would pass their normal review process? Even just reading the description makes you facepalm.
To be fair, 3rd party antivirus software often has kernel modules analyzing files. And that recent TTF font vulnerability stemmed from the font rendering engine interpreting Turing-complete font descriptions in kernel mode.
And fonts come from untrusted sources of course.
So there are precedents for direct paths from untrusted sources to SYSTEM or kernel space execution environments.
The sad part here is that Microsoft knows much better about this stuff. So it's not like they don't have the know-how; it just looks like the PMs didn't care. I can only assume that no part of the MS SDL was applied to this system.
I expected better from MS, I really did. This is security engineering on the level of the other AVs, Linux Desktops and similarly insecure software, i.e. no security engineering AT ALL.
I'm not aware of any Linux desktop with this degree of insecurity. I know they tend to be a bit naïve with the assumption that root access is the only goal for malware, but that's about it.
Because otherwise bad actors could encode, encrypt, or otherwise obfuscate malware inside of a JavaScript file. This type of shielding can be dynamic and near impossible to detect with simple parsing.
Executing the JavaScript runs it in a similar way to how the client would, and may reveal the original malware, which you can then detect.
Malware struggles to stay stealthy because it needs to access specific features or run specific commands to elevate or take control. That isn't true of obfuscation techniques, which can look and act like "legitimate" JavaScript.
As mpengine will unpack arbitrarily deeply nested archives and supports many obscure and esoteric archive formats (such as Amiga ZOO and MagicISO UIF), there is no practical way to identify an exploit at the network level, and administrators should patch as soon as is practically possible.
WTF
Archive bomb anyone? This isn't a new concept, right?
Doing IO-intensive work, e.g. building a large code base, I always have to deactivate the real-time monitor - it slows down the whole process; or, worded differently: turning off mpengine almost halves the build time. Reading more and more horror stories about its inner workings and design, it's time to get rid of it for good.
Idle speculation -- perhaps they run it in the sandbox to see if it appears to be malicious? I mean, it's kind of like doing crash-tests with live passengers instead of dummies, but I could maybe see somebody deciding this was a good idea.
It's like doing crash tests with live passengers, but it's OK because you have a machine that can duplicate people and you just kill the duplicates. Except one day you get it mixed up and kill the originals, whoopsie.
Yeah - that kind of makes sense - maybe using mock objects for filesystem access and seeing what it was doing - but that feels to me a bit like trying to solve the halting problem - inserting yourself into the process doesn't really seem like it will let you figure out what it actually does.
Maybe I'm just wrong about that and this is a standard technique for evaluating security.
Running the program to see if it ever stops is an easy way to solve the halting problem. The trick, of course, is that you need infinite time to be sure of a "never halts" answer.
Turing's proof essentially says that there is no way to be sure of knowing what an arbitrary program will do except for running it and seeing what it does.
It's an imperfect technique for real-world programs, but so is everything. Think of it as being like static code analysis versus runtime "sanitizer" tools. Both are useful, and both can detect problems the other can't.
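Concretely, "run it and see" inside an AV engine means emulating with a finite budget; a hypothetical sketch (vm_step(), looks_malicious() and struct vm are invented names for illustration, not anything from mpengine):

    /* Hypothetical emulator driver: give the sample a fixed number of steps.
     * CLEAN/SUSPICIOUS are only meaningful within the budget; GAVE_UP is the
     * honest answer the halting problem forces on us. */
    enum verdict { VERDICT_CLEAN, VERDICT_SUSPICIOUS, VERDICT_GAVE_UP };

    struct vm;                               /* opaque, invented emulator state */
    int vm_step(struct vm *vm);              /* returns 0 once the program halts */
    int looks_malicious(const struct vm *vm);

    enum verdict emulate(struct vm *vm, long budget)
    {
        for (long steps = 0; steps < budget; steps++) {
            if (looks_malicious(vm))
                return VERDICT_SUSPICIOUS;
            if (!vm_step(vm))                /* halted on its own within budget */
                return VERDICT_CLEAN;
        }
        return VERDICT_GAVE_UP;              /* budget exhausted: we simply don't know */
    }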
> It's an imperfect technique for real-world programs, but so is everything.
No, it's mostly a bullshit technique.
The thing is that the malware author can see "how much" of the halting problem AV software "solves", and then deliberately hide their bad/suspicious actions just out of reach of that limit.
Simple everyday heuristics are good for a lot of things, but the one thing that they are mostly ineffective against is intentional deception.
I'd be surprised if it wasn't just the daunting compatibility work: it's a complex legacy codebase, and if you make a mistake you impact security or performance on every Windows box in the world.
They are evaluating the code in a sandbox to see what it is going to do - for example, is it going to download and run some payload or maybe add itself to autorun registry section. Many antiviruses do the same to discover obfuscated malicious programs.
Last time there were discussions about AV around here, people were adamant that Defender was simply above this type of thing - mostly based on the word of a browser vendor.
Turns out that a homogeneous AV environment is not a good idea. Who would have thought?
Project Zero. Obviously not all the vendors, but a wide selection of the largest ones. These are the ones that employ lots of people and should know better than to do some of the ridiculous stuff they are/were doing.
From this it's safe to draw a broad generalisation about the state of security in the AV market.
I got distracted by the non-sequitur. Defender was touted as the AV above all of this and, yet, it turns out that it isn't. Whether or not other AVs also have vulnerabilities (which all complex software has) is ancillary.
It depends how you argue that. I'm forced to use Defender at work and the worker process frequently pegs multiple cores at 80% - this costs time and patience. I don't have similar issues with NOD32 at home (which is precisely why I hand money over to ESET).
SourceTree is pretty much unusable on my laptop, because every time it does anything the antimalware service springs into life and uses up anything from 20%-80% of the CPU power available. I've had it take 30 seconds to revert 1 line. It's stupid.
I was very much prepared to blame Atlassian for this, but maybe I need to start thinking about blaming Microsoft instead, because it sounds like they've made a few bad decisions here.
(Still, if my options are this, or POSIX, I'll take this, thanks. Dear Antimalware Service Executable, please, take all of my CPUs; whatever SourceTree is doing, I can surely wait. Also, please feel free to continue to run fucking Javascript as administrator... I don't mind. It's a small price to pay if it means I don't have to think about EINTR or CLOEXEC.)
I'm curious to hear your rationale for preferring Windows over POSIX. It's interesting to draw that comparison and conclude that Windows is better (most arguments favor the UI/UX, the large body of software, or the hardware support - not APIs/standards).
1. Signals are bad. All the bad bits of IRQs, and you're not even working in assembly language, so they're twice as hard. Just say no.
2. The forking model is seductive, but wrong-headed. It's hard to make the file descriptor inheritance behave correctly, and it means the memory requirements are unpredictable (due to the copy-on-write pages)
3. Readiness I/O is not really the right way to do things, because there are obvious race conditions, and the OS can't guarantee one thread woken per pending operation (because it has no idea which threads will do what). Also, the process owns the buffer, which really limits what the OS can do... it needs to be able to own the buffer for the duration of the entire operation for best results, so it can fill it on any ready thread, map the buffer into the driver's address space, etc.
4. Poor multithreading primitives. This really annoyed me... like, you've got a bunch of pthreads stuff, but it doesn't interact with select & co. The Linux people aren't dumb, so they give you eventfd - but there's no promise of single wakeup! NT wakes up one thread per increment, and the wakeup atomically decrements the semaphore; Linux wakes up every thread waiting on the semaphore, and they all fight over it, because there's no other option.
(I'm just ignoring POSIX semaphores entirely, because they don't let you wait on them with select/poll/etc. in the first place.)
(Perhaps this and #3 ought to be the same item, because they are related. And the end result is that you need to use non-blocking IO for everything... but the non-blocking IO is crippled, because it still has to copy out into the caller's buffer. It's just a more inconvenient programming model, for no real benefit.)
I guess it just boils down to what you want: a beautifully polished turd, or a carefully engineered system assembled from turds.
Using signals in the ye olde UNIX fashion is bad. However, that was entirely fixed with the advent of POSIX threads: block all signals in all threads with sigprocmask(), and have a dedicated signal-handling thread that loops around on sigwaitinfo(). That thread then handles signals entirely synchronously, notifying other parts of the program using usual inter-thread communication.
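A minimal sketch of that pattern (in a threaded program pthread_sigmask() is the strictly portable call for setting the mask, but the idea is the same):

    #include <pthread.h>
    #include <signal.h>
    #include <stdio.h>

    static sigset_t blocked;

    /* Dedicated signal thread: signals are consumed synchronously here and
     * can be forwarded to the rest of the program over a pipe, queue, etc. */
    static void *signal_thread(void *arg)
    {
        (void)arg;
        for (;;) {
            siginfo_t info;
            int sig = sigwaitinfo(&blocked, &info);
            if (sig < 0)
                continue;                    /* EINTR and friends */
            printf("signal %d from pid %ld\n", sig, (long)info.si_pid);
            if (sig == SIGTERM)
                break;
        }
        return NULL;
    }

    int main(void)
    {
        sigemptyset(&blocked);
        sigaddset(&blocked, SIGINT);
        sigaddset(&blocked, SIGTERM);

        /* Block before creating any threads; the mask is inherited, so no
         * thread will ever run an async signal handler for these. */
        pthread_sigmask(SIG_BLOCK, &blocked, NULL);

        pthread_t tid;
        pthread_create(&tid, NULL, signal_thread, NULL);

        /* ... the rest of the program does its work undisturbed ... */
        pthread_join(tid, NULL);
        return 0;
    }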
That is quite the rant; I'm impressed with how deeply you must know both systems to crank out a list like that. Did you just write all this for this post, or has it been kicking around your hard drive for a while?
> The Linux people aren't dumb, so they give you eventfd - but there's no promise of single wakeup! NT wakes up one thread per increment, and the wakeup atomically decrements the semaphore; Linux wakes up every thread waiting on the semaphore, and they all fight over it, because there's no other option.
You can get Linux to only wake up one thread using epoll and either EPOLLEXCLUSIVE or EPOLLONESHOT.
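Something like this, if I understand the EPOLLEXCLUSIVE route correctly (kernel >= 4.5; each worker attaches its own epoll instance to a shared EFD_SEMAPHORE eventfd, and the guarantee is "at least one" waiter is woken rather than all of them):

    #include <sys/epoll.h>
    #include <sys/eventfd.h>
    #include <stdint.h>
    #include <unistd.h>

    int make_event_counter(void)
    {
        /* EFD_SEMAPHORE: each successful read() consumes exactly one unit.
         * EFD_NONBLOCK: a spuriously woken worker gets EAGAIN, not a hang. */
        return eventfd(0, EFD_SEMAPHORE | EFD_NONBLOCK);
    }

    int make_worker_waiter(int efd)
    {
        int ep = epoll_create1(0);
        struct epoll_event ev;
        ev.events = EPOLLIN | EPOLLEXCLUSIVE;   /* don't wake every waiter */
        ev.data.fd = efd;
        epoll_ctl(ep, EPOLL_CTL_ADD, efd, &ev);
        return ep;
    }

    /* Called by a worker thread: returns 1 if it claimed one unit of work. */
    int wait_for_one_unit(int ep, int efd)
    {
        struct epoll_event ev;
        if (epoll_wait(ep, &ev, 1, -1) != 1)
            return 0;
        uint64_t one;
        return read(efd, &one, sizeof one) == (ssize_t)sizeof one;
    }

    /* Producer side: eventfd_write(efd, 1) posts one unit of work. */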
Though I don't really understand why you'd want to wait on a semaphore and other waitables at the same time… probably this is just my Unix bias/lack of Windows experience showing.
> Though I don't really understand why you'd want to wait on a semaphore and other waitables at the same time…
This. I probably also lack some specific kind of experience because I never understood why you would lock a lot of threads on a semaphore, and only want one of them to execute after a signal.
I just never saw the use case for that. Yet people complain about it a lot.
> it needs to be able to own the buffer for the duration of the entire operation for best results, so it can fill it on any ready thread, map the buffer into the driver's address space, etc.
Can it actually do that? For network write operations the OS has to split the data into MSS/PMTU-sized packets and add headers to it. For network read operations the OS has to reassemble the packets back into a stream or datagram, and it doesn't even know which process a packet is for until after the packet is read into memory.
You're making the copy regardless. Meanwhile IOCP requires O(n) read buffers for n sockets, instead of O(1) when the OS notifies you that it has enough packets to reassemble something.
If the HW has good vectored I/O support then it might be possible to send out data directly from user buffers. This is by composing a packet using two buffer descriptors, one for the header, which points to the next descriptor for the data. But there are complications:
- Almost all hardware would require the buffers to be aligned. Though I think unaligned buffers could probably be handled with a hack by copying some bytes from the user buffer to the "header" buffer.
- The user buffer would need to remain available not only until the packets are transmitted but until they are acknowledged (assuming TCP). Therefore if you want to avoid copying data to kernel buffers for the potential retransmission, the application needs to track which buffers are pending (and is notified when they can be released).
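For what it's worth, Linux has an interface for exactly this send-side pattern, MSG_ZEROCOPY (kernel 4.14+): the pages are pinned rather than copied, and the application learns via the socket error queue when the buffer may be reused. A rough sketch; treat the details as approximate:

    #include <sys/socket.h>
    #include <linux/errqueue.h>   /* struct sock_extended_err, SO_EE_ORIGIN_ZEROCOPY */
    #include <string.h>

    #ifndef SO_ZEROCOPY
    #define SO_ZEROCOPY 60        /* values from the kernel headers, for older libcs */
    #endif
    #ifndef MSG_ZEROCOPY
    #define MSG_ZEROCOPY 0x4000000
    #endif

    /* Assumes fd is a connected TCP socket. The buffer passed to send() must
     * stay untouched until the completion for that send shows up. */
    void zerocopy_send(int fd, const void *buf, size_t len)
    {
        int one = 1;
        setsockopt(fd, SOL_SOCKET, SO_ZEROCOPY, &one, sizeof one);

        send(fd, buf, len, MSG_ZEROCOPY);   /* pages are pinned, not copied */

        /* Completion: the kernel queues a notification on the error queue
         * saying which zerocopy sends (a counter range) have finished. */
        char control[256];
        struct msghdr msg = { .msg_control = control,
                              .msg_controllen = sizeof control };
        if (recvmsg(fd, &msg, MSG_ERRQUEUE) < 0)
            return;                          /* nothing completed yet */

        for (struct cmsghdr *cm = CMSG_FIRSTHDR(&msg); cm;
             cm = CMSG_NXTHDR(&msg, cm)) {
            struct sock_extended_err *serr = (void *)CMSG_DATA(cm);
            if (serr->ee_origin == SO_EE_ORIGIN_ZEROCOPY) {
                /* sends serr->ee_info .. serr->ee_data are done:
                 * those buffers may now be reused or freed */
            }
        }
    }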
Also, I was reading that zero-copy receive is also possible in some scenarios by changing virtual memory mappings. I'm sure lots of info can be found by googling "zero-copy TCP". FreeBSD supposedly has support for zero-copy TCP.
> Therefore if you want to avoid copying data to kernel buffers for the potential retransmission, the application needs to track which buffers are pending (and is notified when they can be released).
Not necessarily. The kernel could map the page into kernel space and mark it copy-on-write so the application can't modify the kernel page. Then the application doesn't have to care when the kernel is finished with it.
> Also, I was reading that zero-copy receive is also possible in some scenarios by changing virtual memory mappings.
Not necessarily. The kernel could map the page into kernel space and mark it copy-on-write so the application can't modify the kernel page. Then the application doesn't have to care when the kernel is finished with it.
In practice, it's faster to copy the page up front than mess about with the page tables and TLB shootdowns, doubly so if you end up copying the page to break the COW anyway!
> In practice, it's faster to copy the page up front than mess about with the page tables and TLB shootdowns, doubly so if you end up copying the page to break the COW anyway!
It could be worth it when the data fills an entire page or more. And it's common that after a write, the thread will either sleep waiting on events and not get one during the milliseconds it takes for the kernel to be finished with the data, or the buffer is immediately used for read()/recv() which allows the kernel to remap the page without copying it.
But yes, that seems to be the problem in general -- we're trying to optimize something which isn't actually that slow. A memcpy() on a <500 byte packet is only tens of cycles. Even a full page is hundreds of cycles, which is on the same order as the cost of the syscall to have the OS notify the application it has finished with a buffer. None of this complexity can justify its overhead unless you're sending thousands of contiguous bytes, and at that point you're in sendfile() territory anyway.
> the memory requirements are unpredictable (due to the copy-on-write pages)
On modern Linux the OOM killer has become rather good at finding the real memory offenders. Plus this unpredictability allows, for example, transparently enabling memory compression in Linux, eliminating the need for swap in many configurations.
Can Linux do that? I remember when OS X introduced it, performance improved notably because the system did much less paging. Would be nice to have that on Linux.
(Rephrase my question: If Linux can do that, do I have to take any special steps to enable it?)
EDIT: Nevermind, I should have thought of googling for it first!
I'm curious, given all these limitations in POSIX, what are some examples of software that do better on Windows because of these more advanced OS features?
yes, the Linux userspace<->kernel API is far better documented. Windows has literally hundreds if not thousands of completely undocumented (publicly) system calls, whereas each Linux system call has a man page available on basically every Linux system, no web browser required. even the "internal" system calls like mmap2 have man pages. find me "documentation" for NtSuspendProcess.
and even the kernel API documentation, while it has its issues, is I would argue still far better than MSDN, which IME is mostly pages and pages of function prototypes with a one-line restatement of the name of the function.
oh, and CLOEXEC seems very clear and explicit to me. OTOH, we have Windows where instead of open(path, O_CLOEXEC), we must use SetHandleInformation(handle, HANDLE_FLAG_INHERIT, 0). now let us compare the documentation for these two options:
MSDN:
"If this flag is set, a child process created with the bInheritHandles parameter of CreateProcess set to TRUE will inherit the object handle."
Linux:
"Enable the close-on-exec flag for the new file descriptor. Specifying this flag permits a program to avoid additional fcntl(2) F_SETFD operations to set the FD_CLOEXEC flag.
Note that the use of this flag is essential in some multithreaded programs, because using a separate fcntl(2) F_SETFD operation to set the FD_CLOEXEC flag does not suffice to avoid race conditions where one thread opens a file descriptor and attempts to set its close-on-exec flag using fcntl(2) at the same time as another thread does a fork(2) plus execve(2). Depending on the order of execution, the race may lead to the file descriptor returned by open() being unintentionally leaked to the program executed by the child process created by fork(2). (This kind of race is in principle possible for any system call that creates a file descriptor whose close-on-exec flag should be set, and various other Linux system calls provide an equivalent of the O_CLOEXEC flag to deal with this problem.)"
What for? Use of NtSuspendProcess/NtResumeProcess is usually a smell of trying to do *nix style multiprocessing in Windows. For which the answer usually is: Don't.
Yeah I know, that's a perfect point to start yet another flamewar, and as such I want to add the disclaimer that I'm not making a judgment with this statement. :)
It's just that this is by design: You're not supposed to use the kernel API directly in Windows, you are supposed to code against Win32/WinRT/UWP.
(Hmm...I'd hazard a guess there is documentation for these calls, but it's simply not public.)
I'm not exactly a hardcore fan of Win32 userspace documentation but POSIX's drives me far more insane. Can you name a few examples that aren't from shell32/shlwapi?
General principles will be more interesting than examples: Win32 does not list the errors that can happen, rendering the greater error code space it has half-useless. Even without considering errors, the description of what functions do is often unclear or written in imprecise terms. You often need to exercise the functions in tiny test programs to actually understand all their details before you can use them properly, and given that they often have a high number of parameters, this makes matters even worse.
Something as simple as CreateProcess and its family of functions is a complete mess (both in the docs and in the detailed way they work...)
POSIX actually doubles as a reference documentation. I don't think such a thing really exists for Win32 -- MSDN is way too vague to achieve that purpose. Maybe people from Wine / ReactOS maintain a better doc, I don't know.
Firstly, the first example I thought of (CreateFile) clearly stated that you could get ERROR_FILE_NOT_FOUND, ERROR_ALREADY_EXISTS, ERROR_FILE_EXISTS, ERROR_PIPE_BUSY, ERROR_SHARING_VIOLATION, ERROR_ACCESS_DENIED.
Secondly, drivers can sometimes return error codes that the OS might not expect to be necessary, and the OS can't just coerce the error codes into something else; it'd lose information. [1] For example, if you CreateFile, a driver that mandatorily logs data might decide to return an error saying it's low on disk space even though the file already exists, which you wouldn't expect. If your code isn't specially written to handle it, you should treat it as the general failure condition it is.
Heck, you can even think of Windows as returning "TRUE" (for no error) or "FALSE" (for error), and GetLastError as just providing details you can ignore. Simply the fact that you know the possibilities are TRUE and FALSE doesn't satisfy you though, does it? So isn't the problem that Linux is just suppressing information it could actually pass through?
If anything, Windows is doing this right and Linux is the one being overly restrictive with the returned information... but in any case, at best you can suggest they're both "different", and maybe on a good day people would entertain the possibility. Certainly returning MORE information than you need is not something you can call out as a flaw...
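In code, "handle the codes you care about, treat the rest as the generic failure they are" looks something like this (retry_later() is a made-up application helper, not a real API):

    #include <windows.h>
    #include <stdio.h>
    #include <stdbool.h>

    bool retry_later(const wchar_t *path);   /* hypothetical helper */

    HANDLE open_shared_log(const wchar_t *path)
    {
        HANDLE h = CreateFileW(path, GENERIC_READ, FILE_SHARE_READ, NULL,
                               OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
        if (h != INVALID_HANDLE_VALUE)
            return h;

        DWORD err = GetLastError();
        switch (err) {
        case ERROR_FILE_NOT_FOUND:           /* the cases we react to specifically */
        case ERROR_PATH_NOT_FOUND:
            fwprintf(stderr, L"%ls does not exist\n", path);
            break;
        case ERROR_SHARING_VIOLATION:
            retry_later(path);
            break;
        default:
            /* Open error set: whatever a filter driver below us decided to
             * return is reported, but handled as a generic failure. */
            fwprintf(stderr, L"CreateFile(%ls) failed: %lu\n", path, err);
            break;
        }
        return INVALID_HANDLE_VALUE;
    }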
You cherry-picked one function for which some errors were specified (and even then, only a few of them), yet you ignore the general case, where MSDN seems to tell the programmer looking for the possible error cases to go fuck themselves. Those cases are important to know if you want to react programmatically to some of them and differently to others.
Given how poor error reporting is under Windows (so poor that useless, component-dependent 32-bit hex error codes are pretty much the only thing users ever see), the theory that "more of those" is better is also utterly ridiculous. I've never debugged anything by looking at WinNT/Win32 error codes, or even the event log, etc., while I've on plenty of occasions debugged things with Unix errno, strace (which gives errno for failed syscalls), logs in /var/log (often dumping Unix errno plus some context), or even just reading stderr.
If you want to understand how much of the GetLastError spec is missing in Win32, just take a look at the Wine unit tests in various areas. For each of those cases (maybe 99% of which are undocumented in MSDN), programmers may need to do their own qualification testing before using the affected functions, or worse, implicitly make some (sometimes false) hypotheses. And the overwhelming majority of them are not related in any way to drivers and whatnot; it is just that not even the upper-layer errors are properly specified/documented. This is the kind of info which should appear somewhere in MS docs, and which does not. It's insanely ridiculous, because it's obvious MS has it in one form or another, given it is crucial for backward compatibility (or compatibility in general, if you take the Wine case -- but I guess when you remain within MS the main issue is backward compat).
> I've never debugged anything looking at WinNT/Win32 error codes
I think your lack of experience explains the issue. I do it pretty damn frequently.
> so poor that useless 32 bits hex error code, component dependant, are pretty much the only (useless) thing that users are seeing
All you need to do is look up the error code in either the headers or in MSDN [1] and then read the error message. It explains what's going on, and does it far better than the minimal set of errnos ever could.
And no, I didn't cherry-pick anything. Like I said, I picked the first function I thought of. If you really cared enough to provide an example, you could/would have, but you didn't. And this is ignoring the fact that I then told you why an exhaustive listing of error codes isn't possible in general.
And it's not like in Linux you don't need to write and run and re-write and re-run code fifty times before you know all the edge cases. I remember having one hell of a time trying to figure out how to use epoll, for example. The documentation could easily be 2-3x as long.
Regarding WINE: WINE is not software that uses the Windows API. It replicates the Windows API. Nobody ever claimed Windows's error reporting behavior is easy to replicate. We're talking about APIs, i.e. interfaces for programmers who program TO the system. An open set of error codes makes a ton of sense for all the reasons I listed. Of course an open set means you'll have a much harder time replicating the behaviors. Windows was not designed to be easy to copy; it was designed to be easy to use. I would've thought that would be obvious.
As for the openness because of installable FS; WTF? Other systems have (more) installable FS.
As for my lack of experience: WTF bis? I once got an NT error code through a Win32 call. That is not even supposed to happen. It was (obviously) not useful for debugging. I had to put a workaround in my code.
I know what Wine is. I once read its code to understand some of the Windows ACL API, and what some Win functions are supposed to return. You would be surprised what is returned and in which cases. MSDN seems to be so poor in this area that I found examples indicating that not even some people working at MS know how some functions in this area are supposed to be used...
As someone who strives to write great reference manuals and thinks pretty highly of the prose in the POSIX spec, would you point me at some examples of good Win32 documentation? I'm always looking for inspiration.
I don't really have a list handy, so I might have to keep looking to find something that meets the "great" bar (again -- I'm not a hardcore fan, I merely think it's reasonably decent), but for example check out the page on I/O Concepts [1] and its sub-pages (such as "Synchronous and Asynchronous I/O").
Does the POSIX documentation have anything like it? I don't recall so remind me if it does.
Parent said "hardware support"; "POSIX hardware support" would not make all that much sense since it's just an interface specification. The point was to find a reason for preferring Windows which I gave.
If real-time monitoring is running, try adding exceptions for your source and build outputs. It prevents antimalware from watching changes in those folders which can really slow down a build or sync.
I've always been a bit wary of doing that, as I've seen infected EXEs committed to version control a number of times because of it :(
What I have done - well, what I think I have done! - the UI is appalling - is added SourceTree.exe and git.exe as exclusions. My intention here is that the antimalware service won't then monitor their activity while they're doing stuff, but will still check working copies when it does a routine scan... but sadly this didn't seem to make any difference at all, literally none whatsoever, and SourceTree continued to be what I can only describe as "slow as fuck" - if you'll forgive the technical jargon.
The odd thing is, though, which just occurred to me like literally right now, it might actually be safer to add the folder exclusion after all! Because then there's zero risk of the Javascript interpreter springing into action :-|
What a time to be alive!
(But I still stand by everything I said about POSIX.)
The malware protection system uses a filesystem minifilter, so it may not matter which executable is doing the writing, just that files are changing on the NTFS filesystem. So the grandparent post has great advice. It is definitely worth trying, even if you may not want it as the long-term solution.
I tried that but still saw significantly reduced build times when working on some projects. I think it has something to do with the temp file directory (which I don't want to whitelist for obvious reasons), so I just deal with disabling it when I build.
At this point I've memorized the keys to hit to quickly disable and re-enable defender's real time protection (winkey > "defen" > enter > tab > (down to disable, up to enable) > alt + F4)
Contents of the PoC are a ".zip" file that is actually plain-text (the engine ignores extension/mime types) and contains just this line of JS and 90kb of nonsense JS for entropy.
(new Error()).toString.call({message: 0x41414141 >> 1})
It's hard to imagine MS doesn't receive tons of watson crash reports of MsMpEng from trying to run bits of random JS. If they haven't looked at them, they probably should start now.
I think this sentence sums up the severity pretty well:
The attached proof of concept demonstrates this, but please be aware that downloading it will immediately crash MsMpEng in its default configuration and possibly destabilize your system. Extra care should be taken sharing this report with other Windows users via Exchange, or web services based on IIS, and so on.
And I think the intended formulation was "care should be taken sharing this report with other Windows users or via Exchange, or web services based on IIS..." (because they're afraid it could crash the servers even if sharing between non-Windows users!)
I'm really surprised here. When I worked on Windows we used the STRIDE model and had to do formal threat model analysis for every component. The reviews I was in for Win 8 took a full day. A TMA should have immediately shown that a security boundary was needed.
Large company with multiple teams and varying levels of people?
I met someone who worked on Office. He told me there were parts of Excel that used a shared code base for Mac/Win/Web, parts that were just Mac/Web with different Win components, and for the Excel graphs, the labels for the graphs were unified for all three platforms but drawing the graph was individual to each platform. Mac Office was a fork from years ago they're finally trying to rebase and bring back into mainline.
Some large companies aren't really unified entities with universal standards. You can be using a standard unified approach for the company and still be seated next to an intern not forced to use version control or a brand new Uni hire that's on a totally different process due to the team they get hired into.
Simply lists: "Verify that the update is installed
Customers should verify that the latest version of the Microsoft Malware Protection Engine and definition updates are being actively downloaded and installed for their Microsoft antimalware products.
For more information on how to verify the version number for the Microsoft Malware Protection Engine that your software is currently using, see the section, "Verifying Update Installation", in Microsoft Knowledge Base Article 2510781.
For affected software, verify that the Microsoft Malware Protection Engine version is 1.1.10701.0 or later."
according to MS one should be patched-up and good to go? (The command should print nothing on vulnerable systems).
However, a Hyper-V VM last patched before Christmas (it's not networked) lists its version as 1.1.12805.0 -- which certainly seems to be a higher version than 1.1.10701.0?
I'll also note that using "[version]x.y.z.a" apparently does not force some kind of magic "version compare"-predicate, based on some simple tests.
Any powershell gurus that'd care to share a one-liner to check if one has the relevant patches installed?
That makes more sense, thanks. Still can't get PowerShell to compare in a way that makes 1.1.12805.0 be less than 1.1.13701.0, but at least manual inspection makes sense now:
# On old hyper-v vm:
Get-MpComputerStatus \
|where -Property AMEngineVersion \
-gt [version]1.1.13701.0 \
|select AMEngineVersion
AMEngineVersion
---------------
1.1.12805.0
Quick question on the timings of this. The report says that "This bug is subject to a 90 day disclosure deadline." - does that mean it was discovered 90 days ago and has been published now, or it was discovered on May 6 (as dates on the comments seem to suggest) and Microsoft has responded very quickly? In either case it seems strange not to have waited a couple more days because (for my system, anyway) I was still running the vulnerable version even after the report was made public.
It was discovered May 6th. MS has responded and fixed very quickly. I'm not overly familiar with Windows but I've seen multiple people saying that Defender patched itself without requiring any action on their part.
Not a fan either, but it would be silly to think that equally harmful vulnerabilities don't exist in similar software for other platforms. They just remain to be discovered.
To be fair, I suspect that Windows simply makes for an attractive target, both for security researchers and malware authors, because about 90% of desktop computers are running Windows.
If GNU/Linux had a market share of >50%, I would expect similar news to pile up at a comparable rate.
For the moment, GNU/Linux enjoys kind of a reverse herd immunity, but I do not believe it is inherently more secure than Windows.
Does MsMpEng actually do file analysis itself, unpacking, unarchiving, &c? That's the kind of stuff that should usually be sandboxed. If its zip/rar/7zip/cab/whatever support hasn't been formally verified and those components run as SYSTEM, es no bueño.
Hmm...I guess I was assuming they had fixed all the potential vulnerabilities allowed by running a virus scanner as root, not just the specific vulnerability described in the example exploit.
If that broke it, I imagine it would've been discovered by now. A simple mitigation would be to check if the newly-unpacked file is the same.
However... suppose someone crafted an alternating version -- where A.zip contains B.zip contains A.zip etc. -- I bet there are systems out there which only do one tier of checking.
There's a much simpler way; it's how C++ compilers deal with recursion in template instantiation, for example:
for level = 1 to arbitrary_depth:
do_some_work_which_may_produce_more_work()
You just have to select arbitrary_depth large enough that nobody notices and there you have it - arbitrarily deep recursion without infinite loop ;)
Totally wouldn't be surprised if MS did it this way too. Otherwise you are right - they would need to remember at least hashes of all "outer" archives unpacked so far.
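A sketch of both defences together, with a toy FNV-1a standing in for whatever digest a real engine would use (the member enumeration is left as a comment since it depends on the archive format):

    #include <stdint.h>
    #include <stddef.h>
    #include <stdbool.h>

    #define MAX_DEPTH 16      /* the "arbitrary_depth" from the pseudocode above */
    #define MAX_SEEN  1024

    /* Toy 64-bit FNV-1a; a real engine would use a cryptographic hash. */
    static uint64_t digest(const unsigned char *p, size_t n)
    {
        uint64_t h = 1469598103934665603ULL;
        for (size_t i = 0; i < n; i++) {
            h ^= p[i];
            h *= 1099511628211ULL;
        }
        return h;
    }

    static uint64_t seen[MAX_SEEN];
    static size_t   nseen;

    static bool already_unpacked(uint64_t d)
    {
        for (size_t i = 0; i < nseen; i++)
            if (seen[i] == d)
                return true;
        if (nseen < MAX_SEEN)
            seen[nseen++] = d;
        return false;
    }

    /* Recurse into nested archives, but stop on a depth cap *and* on any
     * payload we have already unpacked -- the latter defeats the
     * A.zip <-> B.zip alternation mentioned above. */
    void scan_archive(const unsigned char *data, size_t len, int depth)
    {
        if (depth >= MAX_DEPTH || already_unpacked(digest(data, len)))
            return;

        /* for each member m of the archive:
         *     if m is itself an archive:
         *         scan_archive(m.data, m.len, depth + 1);
         *     else:
         *         scan the payload (signatures, emulation, ...)
         */
    }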
Haha, that's how I like to write all my code now. Sometimes I'll write a function that generates a random id, and checks the database to make sure that it doesn't already exist. I always like to add a counter and throw an error (or just return some default value) if it gets up to 100 or so. It might not ever happen in this universe, but I like to imagine there's a parallel universe out there where I saved some server from going into an infinite loop.
I actually did this recently for a random phone number generator, using the phony Ruby library to validate numbers. It was just for some test fixtures, but it's nice to know that it will always fallback to a default test number in case something goes wrong and it runs out of attempts. Or I'm in some universe where my random number generator suddenly starts producing an endless stream of zero bits.
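That habit, in sketch form (id_exists() is a made-up stand-in for the database check):

    #include <stdlib.h>
    #include <stdbool.h>

    #define MAX_ATTEMPTS 100
    #define DEFAULT_ID   0L       /* agreed-upon fallback value */

    bool id_exists(long id);      /* hypothetical database lookup */

    /* Draw random candidates, but give up after a bounded number of tries
     * instead of looping forever in some unlucky parallel universe. */
    long make_unique_id(void)
    {
        for (int attempt = 0; attempt < MAX_ATTEMPTS; attempt++) {
            long id = rand();
            if (!id_exists(id))
                return id;
        }
        return DEFAULT_ID;
    }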
This is just a little 'tick'. I also find myself using 12px and 14px a lot more frequently than 13px.
I wonder whether the zip-quine could be modified to insert additional garbage data with each iteration, to defeat any system that stops when it reaches a known hash. Similarly, could a zip-quine that loses data with each iteration be created? What about one that, after it has lost N iterations of junk data, the data loss mutates part of the quine mechanism, turning what was previously a block of junk data into a valid file? Could you devise a zip template that has space to insert random noise to give each copy a unique set of hashes, and has space for a payload that is revealed after a configurable number of iterations? What about a variant where each iteration contains two copies with different junk data changes?
The idea of a zip-quine and how it interacts with poorly-designed malware detection offers so many interesting hypothetical variations.
I am not happy that Google published a full exploit well before it was possible for anyone to actually deploy the patch, and within just 3 days of notifying the vendor.
It seems that Google is eager for someone to use this exploit to attack as many systems as possible before they can be patched against it.
I know, but the exploit, a template virus, was published before I could actually install the fix, and this applies to everyone. I actually think this behaviour is appalling.
This kind of exploit could propagate very quickly.
The fix is deployed quickly via the automatic daily update of the Windows Defender engine (the same thing that fetches new virus definitions), if I understand correctly. My system already had the patch when I checked.
Remember the RPC vulnerabilities from ~2003? Back then, a machine connected to the Internet without some firewall in front of it would get infected before the installation routine was finished.
This one is really bad, of course, but historically, it's not the worst.
Good that it was fixed. But now bad actors will be looking very hard for other bugs in the unsandboxed javascript interpreter. Tempting to just disable windows defender completely.
Exploitability Assessment for Latest Software Release: 2 - Exploitation Less Likely
Exploitability Assessment for Older Software Release: 2 - Exploitation Less Likely
Anyone with ideas on how they came to this conclusion? Yes, I read the linked document but felt that the index assessment didn't really reflect that google (Natalie?) seems to have found this "in the wild".
Not sure but the last comment on the bug report says: "RCE risk should be lowered due to CFG (on platforms where CFG is in effect)."
>the index assessment didn't really reflect that google (Natalie?) seems to have found this "in the wild".
Did she? How do you know this?
edit: MS seems to explicitly say they are not aware of it being used. "Microsoft had not received any information to indicate that this vulnerability had been publicly used to attack customers when this security advisory was originally issued."
A few days back there was a HN thread here talking about this tweet from Tavis[1] which involved Natashenka as well. I think it's related to this very issue.
in this context, it's not vague nor likely to be uncommon.
for the purposes of mitigating the vulnerability in msmpengine, what matters is that msmpengine has cfg enabled. whether other arbitrary programs on your computer have it too is not relevant to this story.
cfg works like this: when a process has cfg enabled, the windows kernel will maintain a list of all valid indirect call targets for it, and then validates all indirect calls against this list. if validation fails (because the process attempted a call to an invalid target), the kernel terminates the process.
obviously, this requires kernel cooperation. this was introduced in windows 8.1, which means windows defender on windows 7 can't use cfg there.
cfg raises the bar a little in making it harder to obtain remote code execution from simple memory corruption, but doesn't make it impossible.
The function count should be non-zero, the Control Flow Guard dll characteristics flag should be set, then the CF Instrumented and FID Table Present guard flags should be set. Be aware that this filters out a lot of other details and misrepresents the full list of flags.
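If I'm reading that right, those are the DllCharacteristics and load-config GuardFlags bits from winnt.h; a rough sketch that checks them for a module loaded into the current process (field and constant names require a reasonably recent SDK, and a real tool would parse the file on disk instead, as dumpbin /headers /loadconfig does):

    #include <windows.h>
    #include <stddef.h>
    #include <stdio.h>

    static void check_cfg(HMODULE mod)
    {
        const BYTE *base = (const BYTE *)mod;
        const IMAGE_DOS_HEADER *dos = (const IMAGE_DOS_HEADER *)base;
        const IMAGE_NT_HEADERS *nt  = (const IMAGE_NT_HEADERS *)(base + dos->e_lfanew);

        printf("GUARD_CF DllCharacteristic : %d\n",
               (nt->OptionalHeader.DllCharacteristics &
                IMAGE_DLLCHARACTERISTICS_GUARD_CF) != 0);

        const IMAGE_DATA_DIRECTORY *dir =
            &nt->OptionalHeader.DataDirectory[IMAGE_DIRECTORY_ENTRY_LOAD_CONFIG];
        if (dir->VirtualAddress == 0)
            return;

        const IMAGE_LOAD_CONFIG_DIRECTORY *cfg =
            (const IMAGE_LOAD_CONFIG_DIRECTORY *)(base + dir->VirtualAddress);

        /* Older binaries ship a shorter load-config struct without the Guard fields. */
        if (cfg->Size < offsetof(IMAGE_LOAD_CONFIG_DIRECTORY, GuardFlags) + sizeof(DWORD))
            return;

        printf("GuardCFFunctionCount       : %llu\n",
               (unsigned long long)cfg->GuardCFFunctionCount);
        printf("CF_INSTRUMENTED            : %d\n",
               (cfg->GuardFlags & IMAGE_GUARD_CF_INSTRUMENTED) != 0);
        printf("FUNCTION_TABLE_PRESENT     : %d\n",
               (cfg->GuardFlags & IMAGE_GUARD_CF_FUNCTION_TABLE_PRESENT) != 0);
    }

    int main(void)
    {
        check_cfg(GetModuleHandleW(NULL));   /* check this executable itself */
        return 0;
    }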
We're only talking about one exploit in one program. CFG doesn't have to be "common" to be relevant, it just has to have been used in that program, which it must be or it wouldn't have been mentioned.
At least the good guys found this one first, and it is in Windows Defender, and the definitions should automatically update in 24hrs or less silently without a reboot.
if you read this, could you tell Microsoft to fix the issue with definition updates that don't get removed after an update? The definitions keep growing and waste space. (The problem resolves itself if the computer is rebooted.)
$ /cygdrive/c/Program\ Files/Windows\ Defender/MpCmdRun.exe /h
Windows Defender Command Line Utility (c)2006-2008 Microsoft Corp
Use this tool to automate and troubleshoot Windows Defender