One important point that the StackOverflow answers miss is that the Windows checkbox for buffer flushing has a different meaning for SATA SSDs than for NVMe SSDs. If you want a reasonably fair synthetic benchmark to compare NVMe SSDs against SATA SSDs while bypassing any caching the OS is doing with main system RAM, you pretty much have to toggle that switch for at least NVMe drives. Or use Linux, which has actual documentation for its NVMe driver, and the source code is available where that falls short.
>One important point that the StackOverflow answers miss is that the Windows checkbox for buffer flushing has a different meaning for SATA SSDs than for NVMe SSDs.
Can you elaborate on this? What's the exact difference? Do SATA drives get more aggressive buffer flushing than NVMe drives?
The exact difference, as far as I'm aware, has never been publicly documented by Microsoft, and I have not attempted to properly and thoroughly reverse engineer that particular mess.
But the end result is that in the default state (buffer flushing enabled), most or all disk benchmarking tools show drastically lower write performance on consumer NVMe drives than their theoretical advantage over SATA SSDs would lead you to expect.
I'm not intimately familiar with the semantics of how flush/sync commands get generated and passed through the Windows IO stack. Empirically, when an application tries to issue a write that is not to be buffered by the OS, it appears that Windows translates that into NVMe commands that constrain or entirely prohibit the SSD from doing its own write buffering, but for SATA SSDs writes are still by default issued in a manner that permits the drive to buffer freely. Figuring out exactly what's happening without Microsoft's help might require bus analyzers I don't have. In effect, it seems that Windows is less willing to trust a NVMe SSD than a SATA SSD, and that doesn't strike me as justified.
I've also noticed that recent builds of Windows 10 have changed their behavior when running one of our old benchmarking tools that plays back IO traces. On current Windows 10, the OS outright rejects any attempt by the trace playback application to issue a flush command to a NVMe SSD, whether or not buffer flushing is enabled at the driver level. This is definitely a change from earlier builds of Windows 10, and I think it is a change that only affects NVMe drives, not SATA drives. (I'm not testing very many SATA drives these days, and have to design my testing procedures entirely around the needs of testing NVMe drives.)
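To make that concrete, here's roughly what the application side looks like. This is only a minimal sketch (file name, sizes, and error handling are placeholder choices), and how the storage stack translates these flags and the flush into NVMe or ATA commands is exactly the undocumented part:

    /* Sketch of an "unbuffered" write on Windows, the kind of I/O path most
     * synthetic benchmarks use. FILE_FLAG_NO_BUFFERING bypasses the OS page
     * cache; FILE_FLAG_WRITE_THROUGH asks that the write not be reported
     * complete until it has reached the device. */
    #include <windows.h>
    #include <stdio.h>

    int main(void)
    {
        HANDLE h = CreateFileA("test.bin", GENERIC_WRITE, 0, NULL, CREATE_ALWAYS,
                               FILE_FLAG_NO_BUFFERING | FILE_FLAG_WRITE_THROUGH, NULL);
        if (h == INVALID_HANDLE_VALUE) {
            fprintf(stderr, "CreateFile failed: %lu\n", GetLastError());
            return 1;
        }

        /* NO_BUFFERING requires sector-aligned buffers and transfer sizes;
         * VirtualAlloc returns page-aligned memory, which satisfies that. */
        void *buf = VirtualAlloc(NULL, 4096, MEM_COMMIT | MEM_RESERVE, PAGE_READWRITE);
        if (buf == NULL) { CloseHandle(h); return 1; }

        DWORD written = 0;
        if (!WriteFile(h, buf, 4096, &written, NULL))
            fprintf(stderr, "WriteFile failed: %lu\n", GetLastError());

        /* An explicit cache flush; this is the kind of call I've seen newer
         * Windows 10 builds reject for NVMe drives. */
        if (!FlushFileBuffers(h))
            fprintf(stderr, "FlushFileBuffers failed: %lu\n", GetLastError());

        CloseHandle(h);
        return 0;
    }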
>But the end result is that in the default state (buffer flushing enabled), most or all disk benchmarking tools show drastically lower write performance on consumer NVMe drives than should be expected compared to their theoretical advantage over SATA SSDs.
This... doesn't seem to be the case? If you look at this chart[1] from AnandTech, you can see that the NVMe drives (Crucial P1, Intel 660p) are significantly ahead of the SATA drives (860 EVO, 860 QVO, Crucial MX500). The sustained write results[2] are worse for the NVMe drives, but they're still better than comparable SATA drives if we control for flash type (QLC vs TLC) and capacity.
Also, comparing NVMe drives to SATA drives is hard to begin with, since they're usually in different price segments. SATA drives tend to be cheaper than NVMe drives (maybe the controller is cheaper?), so if you were comparing a SATA drive and an NVMe drive at the same price, you might end up with a "worse" NVMe drive because of the NVMe premium.
Whenever possible, I run synthetic tests on Linux to avoid this and many other frustrations. When running application tests on Windows, I leave the setting at its default: disk cache enabled, and buffer flushing allowed. When running synthetic benchmarks on Windows, I turn buffer flushing off to get an accurate measurement of the drive's capabilities, relatively unhindered by OS shenanigans.
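For reference, the Linux-side equivalent of bypassing the OS cache is O_DIRECT (plus fsync if you also want the drive's own cache flushed). In practice I mostly just point fio at the target with direct=1, but the raw version looks something like this sketch (file name and sizes are arbitrary):

    /* Direct-I/O write on Linux, bypassing the page cache. O_DIRECT needs the
     * buffer, offset and size aligned (512 B or 4 KiB depending on the device);
     * fsync() additionally asks for the drive's write cache to be flushed. */
    #define _GNU_SOURCE
    #include <fcntl.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <unistd.h>

    int main(void)
    {
        int fd = open("test.bin", O_WRONLY | O_CREAT | O_TRUNC | O_DIRECT, 0644);
        if (fd < 0) { perror("open"); return 1; }

        void *buf;
        if (posix_memalign(&buf, 4096, 4096) != 0) {
            fprintf(stderr, "posix_memalign failed\n");
            return 1;
        }
        memset(buf, 0xAB, 4096);

        if (write(fd, buf, 4096) != 4096)
            perror("write");

        /* Also flush the device write cache (an NVMe Flush / ATA FLUSH CACHE). */
        if (fsync(fd) != 0)
            perror("fsync");

        close(fd);
        free(buf);
        return 0;
    }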
It's a tangent, but I only recently realized that UPS batteries had come along for the ride with lithium-ion.
I bought one that should power my internet+wifi for about an hour for $45.
I guess even a small one like that would handle power blips for a home server, but in an outage the battery would quickly run out and the setting would matter.
Most smart UPS can notify equipment (via LAN, USB and some even offer IPMI) to turn off in the event of an outage, or as soon as the battery reaches a certain threshold. You just need to install the UPS's server software on the equipment you want gracefully shut down.
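For example, if the box runs Linux and you use NUT (Network UPS Tools) — just one of several such packages, and the names and password below are placeholders — the shutdown side is a couple of lines in upsmon.conf:

    # /etc/nut/upsmon.conf (sketch; "myups", the user and the password are placeholders)
    MONITOR myups@localhost 1 upsmon_user secretpass master
    # How many power supplies must be receiving power for this host to stay up
    MINSUPPLIES 1
    # Command run when the UPS is on battery and reports that its charge is low
    SHUTDOWNCMD "/sbin/shutdown -h +0"

The vendor tools (APC PowerChute, Eaton IPM, etc.) expose the same idea through a GUI.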
That model uses a sealed lead acid battery. There’s a picture of it in one of the reviews (the product description doesn’t mention battery technology).
As far as I'm aware, server SSDs without PLP are a relatively recent trend, and most of the examples I'm aware of are entry-level SATA drives. A few companies are also marketing their client/consumer M.2 NVMe drives for use as server boot drives, but at least one competitor in that segment has made a purpose-built low capacity NVMe drive with PLP.
The biggest players in the game do not use SSDs for persistent storage longer than x weeks. It is offloaded to tape at some point and restored periodically.
There's simply no good reason to periodically write your data from flash to $other every now and then, and then back to flash. There are heaps of other ways to prevent data loss or performance degradation in flash storage.
There are methods like tiered storage, where only the most active data is kept in flash storage, much like write-back caching.
>It is offloaded to tape at some point and restored periodically.
This makes no sense. The main problem with disabling write flushing is what happens if the OS crashes or the power goes out. Doing periodic backups/restores provides a "last known good state" to revert to, but you'll still lose all the writes between the last backup and when you crashed. Depending on your workload this might be fine, but I doubt the COO is going to be pleased when he finds out you lost all customer orders from today.
The bit about "restoring periodically" also doesn't make any sense because such events are easily detectable. There's no need to constantly take the system offline to do a restore. Disabling write flushing isn't going to cause increased bitrot.
Explicitly rewriting data periodically might make sense if you're really paranoid about flash fading over time. But bouncing the data through tape (tape!!) still makes no sense; the sensible way of managing this would be to mirror the data directly to another SSD.
We are arguing about nothing. Live data (being accessed many times per day) stays on SSDs. The middle tier of data is a gray area and can go on SSDs or disk.
But if you think that your FB or Youtube or Fidelity or Geico or SSN or IRS or Apple or Google data isn't sitting on a tape somewhere, you are extremely mistaken, because I was the one who wrote the tape control software they purchased from us.
Nobody's denying that the data makes it to tape eventually. What everyone finds hard to believe is your assertion that data is ever moved from tape back into warm storage during the ordinary course of business. Enterprises use tape for backup. They only read data off tapes after something goes very wrong, or periodically to verify that their backup process is actually working. But except after a catastrophe, data read from tape is not routinely used to re-populate warm storage.
The Library of Congress does exactly what you've described every day.
A researcher desires to download a 50 minute archived video. A request is sent from their laptop to a lookup table server. The lookup table contains the metadata for where that file is stored.
Then the request is passed to a media server which has an SSD and RAM large enough for receiving the actual media file. It opens a connection with the tape library. The tape library handles loading the media from tape to a staging server, which also has an SSD.
So, all told, the media file goes from tape to the staging SSD over a direct connection, then to the main server's SSD over the network, and finally to the client's device over the network.
I'm also curious as to whether this is true. From what I understand and can gather from tech talks from large firms, it is common (up to and beyond petabyte scale) to use ceph clusters purely consisting of SSDs and HDDs (no tape) to store mission critical data, with cache tiering used to balance hot and cold data. AFAIK, because of their self-healing nature thanks to replication/erasure coding algorithms, all that needs to be done is replace corrupted drives periodically, thus making tape drives non-essential unless periodic full backups are required for off-site storage. Please feel free to correct me if I'm wrong in any of this. Thanks.
Exactly! Use Ceph so you don't have to care about burning out an SSD/HDD/server/switch/rack/DC; you get your object storage, filesystem, and block devices from one service, and that's it.
They use tape because they have insanely large amounts of data they want to store and no need for fast access to it, not because tape is a good replacement for SSDs.
Maybe I haven't been paying attention, but I've never had a site attach extra text to copied text like this site does. I wanted to see what the test machine's case looks like, so I copied the make and model and pasted it into a search bar, and got "Read more:" and a link to the page along with the text I wanted to copy.
No thanks. I'll be skipping that site in the future, unless that javascript comes from a place I can easily block with pi-hole.
You can probably just block JavaScript entirely for a site like this. I doubt it depends on it. But I agree that adding text to copied text is really irritating.
Those speeds look nice: 7100 MB/s and 5250 MB/s.
I ended up getting a Samsung 970 EVO Plus (1TB) a while ago because it was on sale for $85. Almost half the speed of the ones in the benchmark. Maybe I should have waited.
And 1/3 the price. If you bought it because it was on sale, doesn't that mean you're at least somewhat price sensitive? I don't think you should feel bad about your purchase.
I've heard that SATA and NVMe SSDs have similar performance for use cases such as lightweight app usage and loading games. Does anyone know what the typical use cases are for these high-performance drives?
The use case is basically benchmarks, and maybe faster cold boots. About the only place the typical heavy computer user is likely to notice a major difference between SATA and NVMe is probably running something like "find /", or some other full-disk search scenario.
Even in the latter case, full-text indexing, which every platform has had for years now, makes it much less likely that the full directory tree will even get walked, and the differences even less likely to be noticed. As a side note, every desktop platform's full-text search seems to suffer software performance problems that are largely independent of the underlying disk.
Even in the full-tree enumeration case, since Spectre/Meltdown mitigations landed, system call overhead is so high now that even with a lightning fast disk, a large chunk of total time taken to walk the directory tree is lost basically twiddling the CPU mode securely. You can definitely still see the difference between SATA and NVMe, but you can also definitely measure the amount of time during the NVMe run that is spent in software -- incrementally faster NVMe will have quickly diminishing returns.
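If you want to see that split for yourself, a crude walk-the-tree timer is enough; everything here (the path, the fd limit) is an arbitrary choice, and on a warm cache most of what it measures is syscall and filesystem overhead rather than the drive:

    /* Time a full directory enumeration, roughly what "find /usr > /dev/null"
     * does. Run it twice: the first (cold-cache) pass is where the drive
     * matters, the second is nearly all software overhead. */
    #define _XOPEN_SOURCE 500
    #include <ftw.h>
    #include <stdio.h>
    #include <time.h>

    static long entries = 0;

    static int count(const char *path, const struct stat *st, int flag, struct FTW *ftw)
    {
        (void)path; (void)st; (void)flag; (void)ftw;
        entries++;
        return 0;  /* keep walking */
    }

    int main(void)
    {
        struct timespec t0, t1;
        clock_gettime(CLOCK_MONOTONIC, &t0);
        nftw("/usr", count, 64, FTW_PHYS);   /* FTW_PHYS: don't follow symlinks */
        clock_gettime(CLOCK_MONOTONIC, &t1);

        double secs = (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9;
        printf("%ld entries in %.2f s\n", entries, secs);
        return 0;
    }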
"What about databases!" This was my original interest in SSDs to begin with. It turns out, despite being a data monkey who loves large databases, since 2013 any time I've worked with a giant dataset like this, it is always in the form of large scans (usually from something like a CSV or XML file), where SSDs don't really have a mind-blowing advantage over magnetic (but of course they are still 5-10x faster a seq io, its just that data parsing and processing is typically the bottleneck now).
I find the worst-case scenarios come up pretty frequently. Even novice users can have a ton of photos, browser tabs, thumbnails, music, email, icons, full-text search, etc. Once you get used to a system that has great I/O, it's hard to go back. Sure, large games often have difficult I/O patterns, but it's far from the only use case.
The reason is that it takes time to reach those top speeds for each individual file, so if you're dealing with many small files it's difficult to reach and sustain the max speed. If you're working with big files, like editing 4K or 8K video footage, that's when you really get the full benefit.
It really depends on your filesystem for this. However, the per-file latency should be comparable across technologies, and it proportionally takes a bigger chunk of the transfer time for small files.
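A toy experiment makes the point: write the same total amount of data once as a single file and once as thousands of small files, and compare the MB/s. Everything below (sizes, names) is arbitrary, and it will litter the working directory with ~16k files, so run it somewhere disposable:

    /* Compare sequential throughput of one big file vs. many 4 KiB files.
     * Per-file overhead (open/close, metadata) keeps the second case far
     * below the drive's headline sequential number. */
    #define _POSIX_C_SOURCE 200809L
    #include <fcntl.h>
    #include <stdio.h>
    #include <string.h>
    #include <time.h>
    #include <unistd.h>

    #define TOTAL (64UL << 20)   /* 64 MiB total in each case */
    #define SMALL 4096UL         /* 4 KiB per small file */

    static double now(void)
    {
        struct timespec t;
        clock_gettime(CLOCK_MONOTONIC, &t);
        return t.tv_sec + t.tv_nsec / 1e9;
    }

    int main(void)
    {
        static char buf[1 << 20];            /* 1 MiB chunks for the big file */
        memset(buf, 0x55, sizeof buf);

        double t = now();
        int fd = open("big.bin", O_WRONLY | O_CREAT | O_TRUNC, 0644);
        if (fd < 0) { perror("open"); return 1; }
        for (unsigned long done = 0; done < TOTAL; done += sizeof buf)
            write(fd, buf, sizeof buf);
        fsync(fd);
        close(fd);
        printf("one big file:     %6.1f MB/s\n", TOTAL / 1e6 / (now() - t));

        t = now();
        char name[64];
        for (unsigned long i = 0; i < TOTAL / SMALL; i++) {
            snprintf(name, sizeof name, "small-%05lu.bin", i);
            fd = open(name, O_WRONLY | O_CREAT | O_TRUNC, 0644);
            write(fd, buf, SMALL);
            close(fd);
        }
        printf("many small files: %6.1f MB/s\n", TOTAL / 1e6 / (now() - t));
        return 0;
    }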
Unfortunately, SLC caching means the math isn't quite that simple anymore. I would guess that this flash is probably rated for something in the 1500-3000 cycle range, and the difference between that and the apparent 500 cycle drive endurance rating comes down to a combination of a conservative warranty period, and estimated write amplification during typical usage.
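To make the arithmetic explicit — all three numbers below are guesses for illustration, not spec-sheet values:

    /* Back-of-the-envelope endurance estimate: rated P/E cycles divided by an
     * assumed write-amplification factor, scaled by capacity, gives TBW. */
    #include <stdio.h>

    int main(void)
    {
        double capacity_tb = 1.0;    /* 1 TB drive, assumed */
        double pe_cycles   = 1500;   /* guessed rating for this class of flash */
        double waf         = 3.0;    /* assumed write amplification in typical use */

        double tbw = capacity_tb * pe_cycles / waf;
        printf("estimated endurance: %.0f TBW (~%.0f full drive writes)\n",
               tbw, tbw / capacity_tb);
        return 0;
    }

With those guesses you land right around the ~500 drive-writes figure that the warranty sheet implies.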
This is complete overkill for anything resembling consumer NAS duty. It's far faster than any consumer-oriented networking equipment, and far more costly per byte than the cheapest consumer SSDs that still outperform your network. A drive like this is best suited for a workstation where large quantities of data are generated or processed locally.
What kind of video files do you have on Plex that need an NVMe drive? 8K videos?
The only time I max out my sata scratch drive is when I’m unpacking a video. Playing a 4K and a 1080p video at the same time barely spikes my ZFS pool on my NAS.
Is this actually good advice, especially without mentioning the durability risks?