One important point that the StackOverflow answers miss is that the Windows checkbox for buffer flushing has a different meaning for SATA SSDs than for NVMe SSDs. If you want a reasonably fair synthetic benchmark to compare NVMe SSDs against SATA SSDs while bypassing any caching the OS is doing with main system RAM, you pretty much have to toggle that switch for at least NVMe drives. Or use Linux, which has actual documentation for its NVMe driver, and the source code is available where that falls short.
>One important point that the StackOverflow answers miss is that the Windows checkbox for buffer flushing has a different meaning for SATA SSDs than for NVMe SSDs.
Can you elaborate on this? What's the exact difference? Do SATA drives get more aggressive buffer flushing than NVMe drives?
The exact difference, as far as I'm aware, has never been publicly documented by Microsoft, and I have not attempted to properly and thoroughly reverse engineer that particular mess.
But the end result is that in the default state (buffer flushing enabled), most or all disk benchmarking tools show drastically lower write performance on consumer NVMe drives than their theoretical advantage over SATA SSDs would lead you to expect.
I'm not intimately familiar with the semantics of how flush/sync commands get generated and passed through the Windows IO stack. Empirically, when an application tries to issue a write that is not to be buffered by the OS, it appears that Windows translates that into NVMe commands that constrain or entirely prohibit the SSD from doing its own write buffering, but for SATA SSDs writes are still by default issued in a manner that permits the drive to buffer freely. Figuring out exactly what's happening without Microsoft's help might require bus analyzers I don't have. In effect, it seems that Windows is less willing to trust a NVMe SSD than a SATA SSD, and that doesn't strike me as justified.
I've also noticed that recent builds of Windows 10 have changed their behavior when running one of our old benchmarking tools that plays back IO traces. On current Windows 10, the OS outright rejects any attempt by the trace playback application to issue a flush command to a NVMe SSD, whether or not buffer flushing is enabled at the driver level. This is definitely a change from earlier builds of Windows 10, and I think it is a change that only affects NVMe drives, not SATA drives. (I'm not testing very many SATA drives these days, and have to design my testing procedures entirely around the needs of testing NVMe drives.)
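To make that concrete, here's roughly what the application side looks like. This is only a minimal sketch (file name, sizes, and error handling are placeholder choices), and how the storage stack translates these flags and the flush into NVMe or ATA commands is exactly the undocumented part:

    /* Sketch of an "unbuffered" write on Windows, the kind of I/O path most
     * synthetic benchmarks use. FILE_FLAG_NO_BUFFERING bypasses the OS page
     * cache; FILE_FLAG_WRITE_THROUGH asks that the write not be reported
     * complete until it has reached the device. */
    #include <windows.h>
    #include <stdio.h>

    int main(void)
    {
        HANDLE h = CreateFileA("test.bin", GENERIC_WRITE, 0, NULL, CREATE_ALWAYS,
                               FILE_FLAG_NO_BUFFERING | FILE_FLAG_WRITE_THROUGH, NULL);
        if (h == INVALID_HANDLE_VALUE) {
            fprintf(stderr, "CreateFile failed: %lu\n", GetLastError());
            return 1;
        }

        /* NO_BUFFERING requires sector-aligned buffers and transfer sizes;
         * VirtualAlloc returns page-aligned memory, which satisfies that. */
        void *buf = VirtualAlloc(NULL, 4096, MEM_COMMIT | MEM_RESERVE, PAGE_READWRITE);
        if (buf == NULL) { CloseHandle(h); return 1; }

        DWORD written = 0;
        if (!WriteFile(h, buf, 4096, &written, NULL))
            fprintf(stderr, "WriteFile failed: %lu\n", GetLastError());

        /* An explicit cache flush; this is the kind of call I've seen newer
         * Windows 10 builds reject for NVMe drives. */
        if (!FlushFileBuffers(h))
            fprintf(stderr, "FlushFileBuffers failed: %lu\n", GetLastError());

        CloseHandle(h);
        return 0;
    }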
>But the end result is that in the default state (buffer flushing enabled), most or all disk benchmarking tools show drastically lower write performance on consumer NVMe drives than should be expected compared to their theoretical advantage over SATA SSDs.
This... doesn't seem to be the case? If you look at this chart[1] from AnandTech, you can see that the NVMe drives (Crucial P1, Intel 660p) are significantly ahead of the SATA drives (860 EVO, 860 QVO, Crucial MX500). The sustained write results[2] are worse for the NVMe drives, but they're still better than comparable SATA drives if we control for flash type (QLC vs TLC) and capacity.
Also, comparing NVMe drives to SATA drives is hard to begin with, since they're usually in different price segments. SATA drives tend to be cheaper than NVMe drives (maybe the controller is cheaper?), so if you were comparing a SATA drive and an NVMe drive at the same price, you might end up with a "worse" NVMe drive because of the NVMe premium.
Whenever possible, I run synthetic tests on Linux to avoid this and many other frustrations. When running application tests on Windows, I leave the setting at its default: disk cache enabled, and buffer flushing allowed. When running synthetic benchmarks on Windows, I turn buffer flushing off to get an accurate measurement of the drive's capabilities, relatively unhindered by OS shenanigans.
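For reference, the Linux-side equivalent of bypassing the OS cache is O_DIRECT (plus fsync if you also want the drive's own cache flushed). In practice I mostly just point fio at the target with direct=1, but the raw version looks something like this sketch (file name and sizes are arbitrary):

    /* Direct-I/O write on Linux, bypassing the page cache. O_DIRECT needs the
     * buffer, offset and size aligned (512 B or 4 KiB depending on the device);
     * fsync() additionally asks for the drive's write cache to be flushed. */
    #define _GNU_SOURCE
    #include <fcntl.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <unistd.h>

    int main(void)
    {
        int fd = open("test.bin", O_WRONLY | O_CREAT | O_TRUNC | O_DIRECT, 0644);
        if (fd < 0) { perror("open"); return 1; }

        void *buf;
        if (posix_memalign(&buf, 4096, 4096) != 0) {
            fprintf(stderr, "posix_memalign failed\n");
            return 1;
        }
        memset(buf, 0xAB, 4096);

        if (write(fd, buf, 4096) != 4096)
            perror("write");

        /* Also flush the device write cache (an NVMe Flush / ATA FLUSH CACHE). */
        if (fsync(fd) != 0)
            perror("fsync");

        close(fd);
        free(buf);
        return 0;
    }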
It's a tangent, but I only recently realized that UPS batteries had come along for the ride with lithium-ion.
I bought one that should power my internet+wifi for about an hour for $45.
I guess even a small one like that would handle power blips for a home server, but in an outage the battery would quickly run out and the setting would matter.
Most smart UPS can notify equipment (via LAN, USB and some even offer IPMI) to turn off in the event of an outage, or as soon as the battery reaches a certain threshold. You just need to install the UPS's server software on the equipment you want gracefully shut down.
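For example, if the box runs Linux and you use NUT (Network UPS Tools) — just one of several such packages, and the names and password below are placeholders — the shutdown side is a couple of lines in upsmon.conf:

    # /etc/nut/upsmon.conf (sketch; "myups", the user and the password are placeholders)
    MONITOR myups@localhost 1 upsmon_user secretpass master
    # How many power supplies must be receiving power for this host to stay up
    MINSUPPLIES 1
    # Command run when the UPS is on battery and reports that its charge is low
    SHUTDOWNCMD "/sbin/shutdown -h +0"

The vendor tools (APC PowerChute, Eaton IPM, etc.) expose the same idea through a GUI.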
That model uses a sealed lead acid battery. There’s a picture of it in one of the reviews (the product description doesn’t mention battery technology).
As far as I'm aware, server SSDs without PLP are a relatively recent trend, and most of the examples I'm aware of are entry-level SATA drives. A few companies are also marketing their client/consumer M.2 NVMe drives for use as server boot drives, but at least one competitor in that segment has made a purpose-built low capacity NVMe drive with PLP.
The biggest players in the game do not use SSDs for persistent storage longer than x weeks. It is offloaded to tape at some point and restored periodically.
There's simply no good reason to periodically write your data from flash to $other every now and then, and then back to flash. There are heaps of other ways to prevent data loss or performance degradation in flash storage.
There are methods like tiered storage, where only the most active data is kept in flash storage, much like write-back caching.
>It is offloaded to tape at some point and restored periodically.
This makes no sense. The main problem with disabling write flushing is what happens if the OS crashes or the power goes out. Doing periodic backups/restores provides a "last known good state" to revert to, but you'll still lose all the writes between the last backup and when you crashed. Depending on your workload this might be fine, but I doubt the COO is going to be pleased when he finds out you lost all customer orders from today.
The bit about "restoring periodically" also doesn't make any sense because such events are easily detectable. There's no need to constantly take the system offline to do a restore. Disabling write flushing isn't going to cause increased bitrot.
Explicitly rewriting data periodically might make sense if you're really paranoid about flash fading over time. But bouncing the data through tape (tape!!) still makes no sense; the sensible way of managing this would be to mirror the data directly to another SSD.
We are arguing about nothing. Live data (being accessed many times per day) stays on SSDs. The middle tier of data is a gray area and can go on SSDs or disk.
But if you think that your FB or Youtube or Fidelity or Geico or SSN or IRS or Apple or Google data isn't sitting on a tape somewhere, you are extremely mistaken, because I was the one who wrote the tape control software they purchased from us.
Nobody's denying that the data makes it to tape eventually. What everyone finds hard to believe is your assertion that data is ever moved from tape back into warm storage during the ordinary course of business. Enterprises use tape for backup. They only read data off tapes after something goes very wrong, or periodically to verify that their backup process is actually working. But except after a catastrophe, data read from tape is not routinely used to re-populate warm storage.
The Library of Congress does exactly what you've described every day.
A researcher desires to download a 50 minute archived video. A request is sent from their laptop to a lookup table server. The lookup table contains the metadata for where that file is stored.
Then the request is passed to a media server which has an SSD and RAM large enough for receiving the actual media file. It opens a connection with the tape library. The tape library handles loading the media from tape to a staging server, which also has an SSD.
So, all told, the media file goes from tape to the staging SSD over a direct connection, then to the main server's SSD over the network, and finally to the client's device over the network.
I'm also curious as to whether this is true. From what I understand and can gather from tech talks from large firms, it is common (up to and beyond petabyte scale) to use ceph clusters purely consisting of SSDs and HDDs (no tape) to store mission critical data, with cache tiering used to balance hot and cold data. AFAIK, because of their self-healing nature thanks to replication/erasure coding algorithms, all that needs to be done is replace corrupted drives periodically, thus making tape drives non-essential unless periodic full backups are required for off-site storage. Please feel free to correct me if I'm wrong in any of this. Thanks.
Exactly! Use Ceph so you don't have to care about burning out an SSD/HDD/server/switch/rack/DC; you get your object storage, filesystem, and block devices from one service, and that's it.
They use tape because they have insanely large amounts of data they want to store and no need for fast access to it, not because tape is a good replacement for SSDs.
Maybe I haven't been paying attention, but I've never had a site attach extra text to copied text like this site does. I wanted to see what the test machine's case looks like, so I copied the make and model and pasted it into a search bar, and got "Read more:" and a link to the page along with the text I wanted to copy.
No thanks. I'll be skipping that site in the future, unless that javascript comes from a place I can easily block with pi-hole.
You can probably just block JavaScript entirely for a site like this. I doubt it depends on it. But I agree that adding text to copied text is really irritating.
Those speeds look nice: 7100 MB/s and 5250 MB/s.
I ended up getting a Samsung 970 EVO Plus (1TB) a while ago because it was on sale for $85. Almost half the speed of the ones in the benchmark. Maybe I should have waited.
And 1/3 the price. If you bought it because it was on sale, doesn't that mean you're at least somewhat price sensitive? I don't think you should feel bad about your purchase.
I've heard that SATA and NVMe SSDs have similar performance for use cases such as lightweight app usage and loading games. Does anyone know what the typical use cases are for these high-performance drives?
The use case is basically benchmarks, and maybe faster cold boots. About the only place the typical heavy computer user is likely to notice a major difference between SATA and NVMe is probably running something like "find /", or some other full-disk search scenario.
Even in the latter case, full-text indexing, which every platform has had for years now, makes it much less likely that the full directory tree will even get walked, and the differences even less likely to be noticed. As a side note, every desktop platform's full-text search seems to suffer software performance problems that are largely independent of the underlying disk.
Even in the full-tree enumeration case, since Spectre/Meltdown mitigations landed, system call overhead is so high now that even with a lightning fast disk, a large chunk of total time taken to walk the directory tree is lost basically twiddling the CPU mode securely. You can definitely still see the difference between SATA and NVMe, but you can also definitely measure the amount of time during the NVMe run that is spent in software -- incrementally faster NVMe will have quickly diminishing returns.
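If you want to see that split for yourself, a crude walk-the-tree timer is enough; everything here (the path, the fd limit) is an arbitrary choice, and on a warm cache most of what it measures is syscall and filesystem overhead rather than the drive:

    /* Time a full directory enumeration, roughly what "find /usr > /dev/null"
     * does. Run it twice: the first (cold-cache) pass is where the drive
     * matters, the second is nearly all software overhead. */
    #define _XOPEN_SOURCE 500
    #include <ftw.h>
    #include <stdio.h>
    #include <time.h>

    static long entries = 0;

    static int count(const char *path, const struct stat *st, int flag, struct FTW *ftw)
    {
        (void)path; (void)st; (void)flag; (void)ftw;
        entries++;
        return 0;  /* keep walking */
    }

    int main(void)
    {
        struct timespec t0, t1;
        clock_gettime(CLOCK_MONOTONIC, &t0);
        nftw("/usr", count, 64, FTW_PHYS);   /* FTW_PHYS: don't follow symlinks */
        clock_gettime(CLOCK_MONOTONIC, &t1);

        double secs = (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9;
        printf("%ld entries in %.2f s\n", entries, secs);
        return 0;
    }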
"What about databases!" This was my original interest in SSDs to begin with. It turns out, despite being a data monkey who loves large databases, since 2013 any time I've worked with a giant dataset like this, it is always in the form of large scans (usually from something like a CSV or XML file), where SSDs don't really have a mind-blowing advantage over magnetic (but of course they are still 5-10x faster a seq io, its just that data parsing and processing is typically the bottleneck now).
I find the worst-case scenarios come up pretty frequently. Even novice users can have a ton of photos, browser tabs, thumbnails, music, email, icons, full-text search, etc. Once you get used to a system that has great I/O, it's hard to go back. Sure, large games often have difficult I/O patterns, but it's far from the only use case.
The reason is that it takes time to reach those top speeds for each individual file, so if you're dealing with many small files it's difficult to reach and sustain the max speed. If you're working with big files, like editing 4K or 8K video footage, that's when you really get the full benefit.
It really depends on your filesystem for this. However, the per-file latency should be comparable across technologies, and it proportionally takes a bigger chunk of the transfer time for small files.
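A toy experiment makes the point: write the same total amount of data once as a single file and once as thousands of small files, and compare the MB/s. Everything below (sizes, names) is arbitrary, and it will litter the working directory with ~16k files, so run it somewhere disposable:

    /* Compare sequential throughput of one big file vs. many 4 KiB files.
     * Per-file overhead (open/close, metadata) keeps the second case far
     * below the drive's headline sequential number. */
    #define _POSIX_C_SOURCE 200809L
    #include <fcntl.h>
    #include <stdio.h>
    #include <string.h>
    #include <time.h>
    #include <unistd.h>

    #define TOTAL (64UL << 20)   /* 64 MiB total in each case */
    #define SMALL 4096UL         /* 4 KiB per small file */

    static double now(void)
    {
        struct timespec t;
        clock_gettime(CLOCK_MONOTONIC, &t);
        return t.tv_sec + t.tv_nsec / 1e9;
    }

    int main(void)
    {
        static char buf[1 << 20];            /* 1 MiB chunks for the big file */
        memset(buf, 0x55, sizeof buf);

        double t = now();
        int fd = open("big.bin", O_WRONLY | O_CREAT | O_TRUNC, 0644);
        if (fd < 0) { perror("open"); return 1; }
        for (unsigned long done = 0; done < TOTAL; done += sizeof buf)
            write(fd, buf, sizeof buf);
        fsync(fd);
        close(fd);
        printf("one big file:     %6.1f MB/s\n", TOTAL / 1e6 / (now() - t));

        t = now();
        char name[64];
        for (unsigned long i = 0; i < TOTAL / SMALL; i++) {
            snprintf(name, sizeof name, "small-%05lu.bin", i);
            fd = open(name, O_WRONLY | O_CREAT | O_TRUNC, 0644);
            write(fd, buf, SMALL);
            close(fd);
        }
        printf("many small files: %6.1f MB/s\n", TOTAL / 1e6 / (now() - t));
        return 0;
    }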
Unfortunately, SLC caching means the math isn't quite that simple anymore. I would guess that this flash is probably rated for something in the 1500-3000 cycle range, and the difference between that and the apparent 500 cycle drive endurance rating comes down to a combination of a conservative warranty period, and estimated write amplification during typical usage.
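To make the arithmetic explicit — all three numbers below are guesses for illustration, not spec-sheet values:

    /* Back-of-the-envelope endurance estimate: rated P/E cycles divided by an
     * assumed write-amplification factor, scaled by capacity, gives TBW. */
    #include <stdio.h>

    int main(void)
    {
        double capacity_tb = 1.0;    /* 1 TB drive, assumed */
        double pe_cycles   = 1500;   /* guessed rating for this class of flash */
        double waf         = 3.0;    /* assumed write amplification in typical use */

        double tbw = capacity_tb * pe_cycles / waf;
        printf("estimated endurance: %.0f TBW (~%.0f full drive writes)\n",
               tbw, tbw / capacity_tb);
        return 0;
    }

With those guesses you land right around the ~500 drive-writes figure that the warranty sheet implies.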
This is complete overkill for anything resembling consumer NAS duty. It's far faster than any consumer-oriented networking equipment, and far more costly per byte than the cheapest consumer SSDs that still outperform your network. A drive like this is best suited for a workstation where large quantities of data are generated or processed locally.
What kind of video files do you have on Plex that need an NVMe drive? 8K videos?
The only time I max out my sata scratch drive is when I’m unpacking a video. Playing a 4K and a 1080p video at the same time barely spikes my ZFS pool on my NAS.
Is this actually good advice, especially without mentioning the durability risks?