AMD Ryzen Threadripper 1920X and 1950X CPUs Announced (anandtech.com)
377 points by zdw on July 13, 2017 | 318 comments



I really hope the ECC carries through. It irritates me to have to buy a "server" CPU if I want ECC on my desktop (which I do), and it isn't that many gates! It's not like folks are tight on transistors or anything. And on my 48GB desktop (currently using a Xeon CPU) I'll see anywhere from 1 to 4 corrected single-bit errors a month.

For things like large CAD drawings, which are essentially one giant data structure, silently flipping a bit somewhere in the middle can leave the file unable to be opened. So I certainly prefer not to have those bits flip.


I have a Ryzen 5 1600 with a pair of Crucial ECC UDIMMs. It works perfectly on the ASUS Prime B350M-A/CSM uATX board. Stress tested. Errors are logged and halt behavior is confirmed (Ubuntu 17.04).


Interesting, could you possibly explain how "halt behavior" indicates that ECC is fully functioning? I've just read a lot about ECC being compatible but not tests explicitly showing it works.


On my system if an uncorrectable error were to happen the system goes into a 'machine check' state (a halt). In theory, the BIOS supports 'chip kill' which is that it reboots with a bit set to not use that specific chip in the DIMM. I've never seen that happen on my desktop but in the data center when it happened a system that should have 128GB of memory would reboot and come back with less than the full amount (as expected by the DIMMs plugged in) reported by the BIOS.


Interesting, depending on the amount of RAM killed that seems like it could be a bit overkill for what could just be a cosmic ray flip.

Though perhaps the rare frequency of cosmic ray flips makes that acceptable.


Chipkill is an IBM name for the feature that corrects multiple bit errors in DRAM. Are you sure about the reboot thing, or are there two similarly named ECC techs that do different things?


Intel discussion of Chipkill (https://www.intel.com/content/dam/www/public/us/en/documents...). The servers in question were Supermicro "Jupiter" chassis with Westmere processors; Supermicro discusses their implementation here (https://www.supermicro.com/support/faqs/faq.cfm?faq=2642).

When systems rebooted with less memory than the system configuration database said they should have, most of the time there would be a multi-bit error detection, a machine check, and a 'memory update' in the IPMI buffer.


Why kill the DIMM? Cosmic ray bitflips don't harm the hardware, AFAIK.


It's usually not cosmic ray.


The system must be configured to halt on detection of uncorrectable multibit errors -- which shows that the detection of uncorrectable errors works.


The specs for the board indicate no ECC support. How did you test it?

BTW, I'm hoping that ECC is there.


Currently on mobile, but the hardware blog Hardware Canucks has a fantastic post covering how "true" ECC support / functionality is on Ryzen. They also investigate this on both Windows and Linux.


Thanks - just read it. I get it now. The chipset supports it and Linux turns it on. Looks like the halt functionality is not there.


Interesting, is there a list of boards and supported modules that work with ECC?


Sorry for the uninformed question but:

With RAM sizes having ballooned to very large sizes (16-32GB is now fairly common for a workstation), why is non-ECC memory even considered? Other methods used to safeguard large volumes of 0s and 1s, like hard drives, SSDs, and modern filesystems (ZFS, Btrfs), have built-in error-correcting mechanisms. Why is getting hit with a cosmic ray and having a bit flipped in your CSV file any more acceptable than the same thing occurring on a "server" with ECC memory?


Because ECC essentially takes 9 bits per byte for storage - the typical implementation is 9 chips per stick, not 8. You also need an additional controller to actually do the error checking.

So, even if it weren't for the typical "enterprise/industrial" markup, you'd be looking at a minimum of a 12.5% parts cost increase.

Multiply that out across billions of RAM sticks and you're talking real money for something of dubious relevance to most users.


The controller is already built into the CPU, and cost increases aren't linear like that at all when scale is factored in, especially considering packaging and assembly overhead.


Skylake-X without ECC: $999

Skylake-SP with ECC: $3,000

There's your answer.


Current Ryzen CPUs support ECC, so presumably so will the Threadripper variant. See Lisa Su's comment(!) on Reddit repro'd on this page: http://www.hardwarecanucks.com/forum/hardware-canucks-review...


That's not an answer. That's just Intel being dicks.


I would say it's more that Intel's business model depends on price discrimination and they don't know how to survive without it.


Silly example. All the below are 4 cores/8 threads.

  Xeon E3-1230 v6 kaby lake 3.5 GHz - 3.9 GHz $250
  Xeon E3-1240 v6 kaby lake 3.7 GHz - 4.1 GHz $272
  Core i7-7700    kaby lake 3.6 GHz - 4.2 GHz $303
  Xeon E3-1270 v6 kaby lake 3.8 GHz - 4.2 GHz $328
  Core i7-7700k   kaby lake 4.2 GHz - 4.5 GHz $339
What's the ECC premium again? Clearly they are on pretty similar price/performance curves.


The premium is in workstation -> server CPUs, not desktop. The E3 line are Xeons in name only.


I agree with this, Intel uses it as a margin enhancer. My CPU was $150 more than the equivalent desktop processor.


Threadripper 1950X with ECC, 16 instead of 10 cores, 64 instead of 44 PCIe lanes, and without a RAID dongle that locks away certain RAID modes unless you pay for them (as Intel does): $999.

I love competition. Though I would prefer if there were a third competitor in the x86 CPU space and the GPU space.


And you haven't replaced it? The only time I had ECC memory report errors, I was also experiencing undetected errors. The system was not stable. Pulled it out; happy ever since. I've always thought of ECC as a warning system (despite the correction ability). Like a spare tire.


Memory errors happen all the time, at least they did several years back when the Google server farm saw "an average of one single-bit-error every 14 to 40 hours per Gigabit of DRAM": http://www.intelligentmemory.com/support/faq/ecc-dram/how-of...

This translates to "a mean of 3,751 correctable errors per DIMM per year": http://www.zdnet.com/article/dram-error-rates-nightmare-on-d...

I'm not sure how things pan out these days with newer memory types. ECC checks and fixes these errors so they're not an issue.


Citing the results of a paper in which they studied the errors observed mostly on DDR1 memory?

A far more recent study by CMU based on the entire fleet of Fb servers shows that correctable error rates dropped dramatically in the past decade.

http://repository.cmu.edu/cgi/viewcontent.cgi?article=1345&c...


[Unless it's obvious to others:] the paper doesn't conclude that the DDR1-to-DDR3 discrepancy correlates with error rate. It only concludes from empirical measurements that error rate scales with memory density (2).



I don't consider it broken, so I haven't replaced them; the system is rock-solid stable.

Your statement was interesting, though: how do you have ECC memory report errors and also have 'undetected errors'? At least from a memory perspective, with ECC an 'undetected' error is a multi-bit error that both flips bits and leaves the ECC bits in a legal configuration. That seems like it would be pretty rare.

That said, I've seen motherboards (in our data center) where the memory slots themselves were unreliable (probably bad or weak solder joints on the DIMM sockets or missing terminator resistors). They appeared as a machine with a lot of ECC errors but the same DIMM in another motherboard gave no errors.


The patient first presented with classic memory corruption symptoms, like random segfaults. Ran memtest. No errors reported, but a flurry of corrections logged in the BIOS. Pulled a pair of dimms, everything cleared up. Swapped slots too, so I feel confident blaming the RAM.

It did fall outside my expectation of how ECC works. One bit errors and three bit errors, but not two? Some access pattern that memtest strides don't hit? I didn't really need the extra RAM, so I just moved on without it.


SECDED - single error correction, double error detection. Your duff memory module was flipping more than two bits at a time.


The typical PC-building online communities have been mindlessly drumming the line "Only servers need ECC" into each other's heads for so long that they'll probably end up refusing it, even when it's offered to them on a plate.


Well, when people are shocked that someone is experiencing 1-4 single bit errors a month across an expanse of 48GB, it's not hard to see why people think it's not something they need.


A single bit error is enough to create a misspelling in a txt file. There's a low probability of it happening, but if it does, it's unnerving to think that it will not be detected.


There's a very low probability!

- In consumer equipment, much of your RAM is often unused at any given time

- Most lines in a cache are eventually thrown away, never used

- Did you even save your text file, or just open it to read it?

- What about all the space occupied by read-only information, like executables, media files, game files, and libraries? You might crash, but nothing will be written to disk

You might get an error a month, but the odds you'll get an error that matters on your average consumer machine with average workloads is much much lower.


The typical PC builder (or user) isn't a professional, so they rarely work with data where corruption matters. I edit photos and play games. The photos have backups so I wouldn't sweat a corrupt bit here and there. The largest structure I use is my file system, but I try never to get too attached to data on a machine (machines should be possible to reinstall within an hour - data such as photos should be elsewhere).

People working with sensitive datasets or fragile data structures (large CAD files were mentioned) can certainly use ECC with good reason.

But for most home machines? Sure, if it doesn't cost 10 or even 3% extra then I'd recommend it. So it would be great if AMD could pressure Intel towards bringing ECC to consumer chips.

Otherwise for a normal builder just put that $100 towards a better graphics card (if gaming) or a better monitor or whatever, and the lack of ECC will make your game crash once in 3 years (it crashes 99 more times due to bugs and bad drivers...)


"Not minding" data corruption is a perverted mindset that comes from internalising Intel's market segmentation. It's not a law of home computing that we don't deserve the good technology.


Absolutely, but my point was that a lot of the time people don't even have important "data" on home-built PCs, so the question is how much added cost is actually acceptable for ECC. For me, on a gaming machine, the acceptable added cost would be in the very low single digits, so while I welcome it if AMD goes "ECC for all", I was just trying to rationalize not recommending it to home PC builders at current pricing.


> "Not minding" data corruption is a perverted mindset that comes from internalising Intel's market segmentation.

Disagree. The mindset of "assume my metal box can catch fire at any time" is absolutely the right one to adopt, and the more valuable your data is the more right the mindset becomes.

Intel's nasty market segmentation strategy doesn't make that mindset wrong.


Are you going to get the word out to every owner of Xeon(s) in the world that they're doing it wrong and should be using consumer gear if they truly value their data?

Or, is ECC in fact a good thing that's worth having?


The key here is "truly value" and how much. Intel pricing forces you to think about whether you value what's in your ram.

If you are only gaming on a Xeon you should have put that money into the GPU. If you are doing databases, CAD, etc. without ECC then the converse is true: you should have put more money towards ECC. I can't see the controversy in recommending non-ECC for Intel buyers based on the workload in question.


OEMs will probably still ship systems without ECC too. That doesn't mean it shouldn't be an option. As a side note, I wonder about Bristol Ridge, for example: for 8GB of RAM, would you rather take dual channel or ECC? (The fun thing is that 4Gbit DDR4 is again cheaper per bit on DRAMeXchange spot prices than 8Gbit DDR4, but...)


Are there any Ryzen motherboards that support ECC currently? ("support" defined as "corrects errors", not "can plug in ECC ram and it functions as normal ram without error correction")

As far as I'm aware, there are not.


There is a thread on overclock.net which collects the AM4 motherboards with ECC support [1].

You don't need complete support from the motherboard though. As long as the motherboard and BIOS/UEFI don't sabotage ECC you can at least use it from the OS, even if the motherboard doesn't explicitly support it [2].

[1] http://www.overclock.net/t/1629642/ryzen-ecc-motherboards

[2] http://www.hardwarecanucks.com/forum/hardware-canucks-review...
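A quick, hedged way to check from the OS side whether ECC is actually live on Linux (the exact output strings vary by board and BIOS):

  # the SMBIOS tables report the error correction type of the memory array
  sudo dmidecode -t memory | grep -i 'error correction'
  # and the kernel's EDAC driver will say whether it found ECC enabled
  dmesg | grep -i edac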


1-4 corrected errors per month?! This sounds quite bad - which manufacturer?


Crucial (six 8GB sticks). I don't put it into the 'bad' category. My experience overseeing 1,600 machines in a data center with a collective 307TB of RAM suggests it's actually on the 'good' side. Bad is multiple errors per day. Really bad is hundreds or thousands per day.

Because they are corrected, nothing actually happens on the system (other than whatever accessed them seeing on the order of 700ns to access RAM rather than on the order of 100ns). If it gets a double-bit error it will machine check, so I'll know that my memory has failed me :-).


All RAM should be ECC (over a selective number of bits); then the processes that fab the RAM can be made cheaper and sloppier. Zero bit-error-rate non-ECC RAM would actually be more expensive than ECC RAM. A chip kill could be done at a much smaller granularity. I suspect these things haven't been done because the RAM cabal likes their high prices.


Isn't that more or less in the range expected from background radiation bitflips when having 48 GiB?


I manage many servers with >=384GB RAM. They maybe detect an error once a year, with several never reporting an error in their lifetime. If I noticed a server reporting errors more than twice a year I'd probably swap its memory, and if that didn't fix the problem I'd probably replace the server.


Depends on what you mean by "reporting an error". The default configuration for most machines I've used is to silently correct single-bit failures unless they hit some threshold. So you actively have to run an EDAC (or similar) driver and poll the error counters. It's only with uncorrectable errors that it generally rises to the level of a full MCE.
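If you want to poll the counters yourself, a rough sketch (assuming a Linux box with the appropriate EDAC driver loaded; paths can differ by kernel version):

  # corrected (ce) and uncorrectable (ue) error counts per memory controller
  grep . /sys/devices/system/edac/mc/mc*/ce_count \
         /sys/devices/system/edac/mc/mc*/ue_count
  # or, if the edac-utils package is installed:
  edac-util -v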


Many server-grade motherboards will report ECC memory faults in the event log through IPMI.


Error rate would probably depend on the mass and quality of shielding above/around the servers.


How do these manifest themselves and how do you detect them?


The error events are reported through IPMI in a system event log.


Threadripper will have ECC support [0] just like Ryzen but validation depends on the motherboard manufacturer.

[0] https://www.reddit.com/r/Amd/comments/6icdyo/amd_threadrippe...


Xeon E3 pricing is very similar to the equiv i5 and i7, and supports ECC. Plenty of fairly cheap motherboards ($50-$100 more than the i5/i7 boards).


Some i3 models also support ECC when paired with the right chipset. My NAS runs on an i3-2100/Supermicro/ECC DDR3 setup. The i3 was a $100 Microcenter special.

If I ever have to upgrade the compute portion I will definitely consider Ryzen.


Check out the asus prime x370-pro. :)


After reading it had both AM4 support and ECC I started planning a build based on it. Assuming the ECC implementation works as I need it will probably be my next desktop.


The best thing about this is that competition is back (in the high-end x86 market) and the winner is the consumer; the CPU market has been stale for a while.


What's even better (for AMD) is that they can now design a single processor, and scale it across their entire lineup from R3(?) to EPYC. This suggests that they could incrementally improve their entire lineup annually, or deliver huge boosts every couple years.


I think I saw an AMD future product roadmap (leaked?) showing 6 cores per CCX for the next iteration. If they managed to do that, suddenly 8C/16T Ryzen becomes 12C/24T, 16C/32T Threadripper becomes 24C/48T, 32C/64T Epyc becomes 48C/96T. That'd be amazing.


With all the lanes, and the current CPUs not being able to saturate them, there is headroom to swap the CPUs in the future and retain the same IO peripherals, just pushing them harder.


AMD has said they'll keep AM4 around for a while.


So I currently have a 160-thread machine under my command, and it's exactly as awesome as it sounds! The idea of being able to get huge thread counts in a desktop x86-64 processor gets me very excited.


As long as AMD cannot match Intel on IPC, such a one-design-fits-all approach will always attract attacks from Intel.

see the latest example here: http://www.techradar.com/news/intel-disses-amds-new-processo...


Competitors will always find ways of attacking each other.

In reality in any kind of engineering in the real world there's always trade offs of different approaches. These choices are often made with cost being one of the factors.

AMD's trade-off is a one-size-fits-all die that can be used in a lot of products quite effectively. And on a relative basis, IPC-wise it's really not that far behind. As a result it can offer this at an attractive price.

Intel, on the other hand, is making a lot of different dies, which comes with lower cross-core communication latencies. Also, their more expensive / mature process allows for higher clock speeds. This comes with a higher premium MSRP for products in the same class.

People can figure out what is more important to their use cases and we can all argue about the benefits/downsides of each approach.

I would say the market segmentation with AVX-512 (different chips supporting different features) is somewhat maddening. Plus, only a limited set of HPC software makes effective use of it. Most does not.


I kinda missed the boat on this story. I was of the mindset that the CPU game was over, that Intel had an insurmountable advantage in 14-nanometer fabrication, and that basically their scale meant AMD could never compete again. How has AMD managed a comeback like this?

Edit: reading Wikipedia, it seems it was the fact that AMD could outsource 14nm FinFET fabrication to GlobalFoundries that made this possible, since they didn't have to build their own fabrication facility. Or could they have done their own fab?


Because AMD doesn't handle its fabrication any longer. It's all done through third parties (GloFo and Samsung, particularly) who are competitive against Intel. Intel still has the advantage in that they are vertically integrated; however, Samsung is pushing out far more chips per day than Intel and is the lead fab for cutting-edge ARM processors. This makes it much more capable of multiparadigm fabbing.


AMD founder Jerry Sanders used to say "Real men have fabs" (i.e. ownership of the whole production chain), and is probably sad that had to be given up for financial reasons.

Looking up that phrase gets a history of upsides and downsides, e.g. https://webcache.googleusercontent.com/search?q=cache:Oxfqym...


Well, GloFo is AMD's fab; they spun it out a few years back.


Yes, Intel was supposed to be on 10 nm by now but they have fallen behind their roadmap which allowed AMD/GloFo to catch up somewhat.


Part of me dreads the "Good Ol' Days" when you'd buy a brand new top-of-the-line CPU or graphics card only for it to be obsolete in a month.

A different part of me can't wait for that to happen.


How do people with many CPU cores find it helps their day to day, excluding people who run VMs, or do highly parallelisable things as their 80% core job loop (i.e. you run some form of data.parallelMap(awesomeness) all day)?

Does it help with general responsiveness? Do many apps / processes parallelise nicely? Or is it more "Everything is 99% idle until I need to run that Photoshop filter, and then it does it really fast"?


I have the 4930K (a 6-core, 12-thread processor from 2013-ish).

It's painful trying to do the same workflow I do on my desktop on a 2-core laptop. Because I have the power, I've started really using the ability to multitask to its fullest: having multiple VMs running, being able to run the full suite of unit tests on every save, running linters/static analysis as often as possible, etc...

Builds don't slow me down, npm-install goes much faster and doesn't really cause any choppiness, I basically don't need to worry about CPU efficiency in any of my tools.

I feel it's more than paid for itself over the past 4 years (it was only like a few hundred more than the 4 core machine) in time saved alone, ignoring the reduction in frustration or distraction from having to wait on things to finish or wait for the stutters and stalls to stop.


Same with the Ryzen I have at work. My old i5 desktop now feels different: micropauses I wasn't even aware of are an issue, and my laptop is even worse.

It's totally spoiled me.

8C/16HT or nothing for me now ;).


It seems like the CPU could only get you so far in the setup you describe. I assume you have no spinning platters in your desktop, and are using 1 or more SSDs? IO-bound workloads are very common in what you're describing.


Yeah, I've got a PCIe SSD that I don't remember the exact details of right now, and I set up a ramdisk to build from a while ago but never really benchmarked it to see if it was actually making a difference.

Still, you'd be surprised how much having the extra CPU headroom helps there. A lot is IO bound, but at least with the extra CPU cores the PC is still responsive and able to do other things while waiting (for the most part). When you get CPU bound, it starts to hurt.

Back when I built it I looked at the total cost and did some math on how long I thought it would last, and how much time it might save me and figured out that even if I really went all out, if it made me even a few percent more productive it'd be more than worth it, so I kind of went nuts on it, and I'm very happy with my decision.


Build Time!

I will pay a great deal of money to reduce my build times. Building is readily parallelizable, and with good enough build scripts and system design there are several smaller linking steps instead of one huge serialized one at the end. On my current system, building the code I care about only takes about 2 minutes for roughly 5 MLOC of C++.

Anything more than 10 seconds is enough to get stuck in the time waste that is Reddit... or HN... I am getting back to work now, my build is likely done.
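(For what it's worth, "readily parallelizable" here just means something like the following, assuming your makefiles are parallel-safe:)

  # one compile job per hardware thread; -l backs off if the load
  # average climbs above that, so the machine stays responsive
  make -j"$(nproc)" -l"$(nproc)"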


I now know why they sell so many of those "fidget spinners" these days – it's all the core-starved developers like myself not wanting to get stuck on HN while waiting for their builds to finish ;D


The problem is that builds are not just a concatenation of CPU-bound things. There's more to a build than that.

An interesting item came up recently. On Windows 10 there's a problem with highly parallel builds that is down to the fact that process exit cleanup is serialized inside the GDI on a lock that also needs to be taken during message input. Combine that with a build that splits things out into lots of compilation units that are all built in parallel using a process each, and fun ensues.

Ironically, it means that during the build you are not able to get stuck into anything else with a GUI. (-:

* https://randomascii.wordpress.com/2017/07/09/24-core-cpu-and... (https://news.ycombinator.com/item?id=14733829)


Oof, I'm limited to building at -j4 due to memory constraints, not CPU constraints.


How much memory do you have, and what are you building? The last time I was memory constrained I was trying to build Clang on a Raspberry Pi.


16 gigs working on internal code, but apparently the culprit is boost::units....


> and system design there are several smaller linking steps instead of one huge serialized one at the end.

Have you tried the GNU Gold linker? It makes those serialized linking steps very fast and even supports concurrent linking :)
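If you want to try it without changing your build system, a minimal sketch (assumes gold is installed as ld.gold and, for --threads, that your gold build has thread support; the output name and object files are placeholders):

  # tell gcc/clang to invoke gold instead of the default BFD ld
  g++ -fuse-ld=gold -Wl,--threads -o myapp *.o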


Yes, and it took about 5% off my link step. It was nice, but not a fundamental game changer. I suspect someone who didn't already have a bunch of smaller link steps would benefit more.


Gold also supports incremental linking [0], where it basically modifies the existing binary by only replacing the changed parts of it. I've never used it myself so I don't know how well it works, but you might want to look into it if you haven't already.

[0] https://gcc.gnu.org/wiki/GoldIncrementalLinking


Well that is something I hadn't heard of. Now I am curious how hard it would be to get CMake to do this?


As a programmer, being able to run non-trivial parallel builds (e.g. compiling large Scala projects) whilst keeping the machine quite usable for other things is pretty awesome.

5 years ago I'd have struggled to get that out of a desktop. Today, my laptop handles that, barely breaking a sweat.


A reasonably large proportion of workstation-type tasks do parallelise very well. The example I'm most familiar with is professional audio. A typical audio project will include hundreds of plugin instances, each with their own thread. Performance for this workload scales more-or-less linearly with core count. As I understand it, many VFX and video editing workloads are also highly parallelisable.

>Or is it more "Everything is 99% idle until I need to run that Photoshop filter, and then it does it really fast"?

Using all your cores all the time doesn't really matter. A Digital Audio Workstation is very much either/or in terms of CPU use - as soon as you stop playback, your CPU usage drops to idle. If you run out of CPU overhead during playback then everything grinds to a halt, which can be absolutely disastrous when you've only got 3 milliseconds of buffered audio. We want enough FLOPS to cope with our normal workloads, plus a substantial overhead to cope with spikes.


I have a 2 sockets x 8 cores (16 real cores) Xeon desktop machine from last year. It's very fast.

But to make it useful for compilation, I had to make a lot of alterations to our software and build systems to ensure we were maximally parallelizing builds. This included splitting up C programs into separate C files almost comically fine-grained. I was literally using ‘wc -l *.c’ and trying to even up the size of the files.

Eventually you hit Amdahl's law: Some parts of the build (I'm looking at you, autoconf configure scripts) simply cannot be parallelized and they slow the whole system down.

The other thing it does well is virtualization, but that tends to be limited by RAM and disk space. The machine has 64 GB of RAM so it can run plenty of VMs, but I had to install more disks so I could store those VMs. It now has 4 SSDs and 3 hard drives, and managing the space across those is a bit painful (with LVM).


> Eventually you hit Amdahl's law: Some parts of the build (I'm looking at you, autoconf configure scripts) simply cannot be parallelized and they slow the whole system down.

Linking a large C/C++ project can take some (a lot of) time as well. Linking is of course in general a rather difficult to parallelize problem, except if there is no linking (e.g. because the runtime links everything at every application start cough).


Absolutely yes, linking was another unsolvable problem when I was parallelizing the builds. As was odd stuff like building documentation, building Perl bindings and a few other things that were inherently serial.


Does partial linking (ld -r) allow one to parallelize linking? The Linux kernel does that on its build system, though I don't know if it's for performance reasons.
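(For concreteness, the kind of thing I mean - the object and output names below are just made up:)

  # merge groups of objects into relocatable intermediates in parallel,
  # then run one final, smaller link over the merged results
  ld -r -o group_fs.o      fs1.o fs2.o fs3.o &
  ld -r -o group_drivers.o drv1.o drv2.o drv3.o &
  wait
  cc -o app group_fs.o group_drivers.o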


Have you tried GNU Gold? It's A LOT faster and it should even support concurrent linking.


lld (llvm's linker) is also supposed to be quite quick, if it supports all the features you need.


I'm using GCC (it compiles my C++ code faster than Clang) which can't use lld :/


> This included splitting up C programs into separate C files almost comically fine-grained. I was literally using ‘wc -l *.c’ and trying to even up the size of the files.

This probably increased CPU usage, but did it really reduce compilation times? I would imagine that any gains from parallelizing function compile times would be eliminated by having to process the header files over and over again. Not to mention the time cost of doing the splitting.


I'm pretty sure it increased build times on less parallel machines, yes. And as you say the total work done (because of duplicate header file compilation) was greater.


I guess I'm not the only one having a problem with development on our low/slow core portable computers (MBP in my case).

Just the other day a friend of mine voiced his dismay over the fact that "there was a point in time - a sweet spot seemingly - not too long ago, where we could work effortlessly on an (underpowered) Macbook Air / Macbook".

I'd have to echo that as the compounding tax on single / multi core CPU resources has grown substantially over the last couple of years - both through a "renaissance" of more compilation heavy toolchains on one side as well as the (dare I say it?) "microservices as a monolith" effect through explosive adoption of container based "architectures" on the other (docker-compose anyone?).

After more than a dozen very happy years on almost as many Macs I'll seriously get back to developing on a powerful Linux workstation – looking forward to less computational overhead from Docker on both the wetware and hardware fronts. I'm really just waiting for Threadripper to make that happen.

My new iPad Pro will be there for administrivia / communication / ideas (very much looking forward to the highly pencil informed iOS 11) – of course I'll also keep my MBP to get it out of the drawer for the occasional asset manipulation with Affinity Designer (the 2 core Broadwell with 16GB RAM should last for many years for those kinds of tasks).


That sweet spot has nothing to do with compilation or Docker, and everything to do with the rise of Electron. Before, an app was built with native UI libraries, running on Python with core elements in C++ for speed. After, a simple app (like Slack or WhatsApp desktop) will eat 300-500MB, use 1-10% CPU, and chew through your battery alive.


Yeah it's pretty shocking how much CPU and memory Electron apps use.

Used to be Java chat apps and IDEs (Eclipse, IntelliJ) were considered slow and bloated.

They're practically fleet footed compared to the new breed of web apps.

As much as I love Slack, Visual Studio Code and Atom my maxed out MacBook Pro is at 50-100% CPU and paging into swap with all three open.

If I send a few GIFs in Slack - temps hit 99C (never 100 for some reason) and my laptop sounds like a mini hovercraft.

It's a little disheartening and really has me missing those big old Mac Pros which were silent no matter the task.

JavaScript and the resulting apps are certainly enjoyable to code and use but damn does something need to change in terms of resource utilization.


Hey there - sorry, but even without using any Electron apps I still have performance issues, which in my case have 99% to do with building applications within the bounds of "microservice" architectures. That said, Electron apps sure eat up lots of (mostly single-core) resources too, which ultimately amounts to a bad UX.


Something you might want to try is hosting a Kubernetes server on some more powerful machine (like a VPS or something at home) to run some of the Docker services you won't actively be developing against.


Thanks for the idea :) I actually did something along those lines - completely switched to emacs within a tmux session on some large EC2 instances but meanwhile optimized my workflow (introduced more isolated TDD) so that performance is bearable enough on my local machine..


For me, it's both the "really fast Photoshop filter" (or, in my case, static analyzer run), and responsiveness when I do e.g. a build that I've restricted to use all but one core. Otherwise yes, the cores just sit there idle.


I do a lot of analysis that often boils down to "do X a hundred million times". Getting a 6 core machine setup earlier this year didn't make a difference for most things but when I want to run something like that (most days) it makes a huge difference. Bringing an almost all day task to an hour hugely changes the workflow.


Not a lot. Most software isn't very parallel. I think 2 fast cores would be the sweet spot for most people, for the foreseeable future.

I think a question like this will get very biased answers, since most people aren't very inclined to post "paid a lot, don't really use it" and rather stay mum.

BTW taken out of context, "people with many CPU cores" would mostly consist of 8-core Android phone owners. They also do little with all those cores.


I regularly see 8 cores over 50% utilization. So sure 90% of the time you don't really need it, but a lot more things are parallel now than you might think.

Consider: if your PC is going to last 3+ years and you average ~40 hours a week on it, then the most demanding 1% is still 48+ hours.


> I regularly see 8 cores over 50% utilization.

What happens if you close background browser tabs? :b


Standard office computer here is a quad core (AMD APU), and it actually makes quite a bunch of stuff faster (e.g. conversion of scanned documents) compared to the previous-generation dual cores.


It was shown that for most games nowadays 2 cores will result in micro-stutter. For example the Pentium G4560 (2x 3.50GHz) has pretty good average FPS, but if you look at the 99th percentile, it's worse than similar priced quad cores.


Being able to do big jobs in the background and still use your computer effectively. Run 3 VMs for different platforms and still have a functioning computer.

There is a lot of task level parallelization that many cores helps with regardless if any one is optimized.

That said, a lot of software is multithreaded these days.


Running 4 OSes at once (host + 3 VMs) I would hope you have a very quick SSD and lots of RAM, otherwise that CPU isn't going to help much.


64GB of RAM is surprisingly affordable assuming you want to run a few VMs on one box. You can also give each VM its own SSD without spending crazy money. AKA it's far less than the cost of 3 mid-range PCs.


64 instead of 24 pcie lanes is huge too


Has helped a lot with my photo tools. Probably any time you're chewing a lot of data, of any kind. But Word, Chrome, and PuTTy don't seem to care :)


Building stuff from source takes less time.


> Photoshop filter

Funnily enough (someone correct me if I'm wrong), these are almost entirely ancient single-core code.


No, they are very easily parallelized and have even used the available GPU for the last dozen years.


Sean Parent does love squeezing all the performance out of even the tiniest devices.

There is a video of him loading the largest JPG he knew of at the time on an iPad 1. The image is several GB and it works pretty well on a machine with much less RAM.

I think this is the talk: https://www.youtube.com/watch?v=giNtMitSdfQ

If not, I am watching and will try to post it later.

EDIT - I don't think that is the right talk, but it's still good.


A CPU that large reminds me of the famous remark made by Grace Hopper about how light can only move about 30cm in one nanosecond; I guess that theoretically means a CPU could have some kind of maximum size.

Of course, since current CPUs contain separate cores, it doesn't apply.


Clock distribution is already quite complex in big circuits. It should arrive at the same time everywhere but it does not. You need to add extra components to de-skew them.


Instead of placing the dies next to each other on a flat surface, the dies could be above each other, or the CPU made spherical, for optimally close distances. And RAM of the motherboard could also be in a circle or sphere around it.

Is there any other concern than cooling with this?


Cooling is pretty much the one and only concern; there's nothing stopping AMD from doing this with their Zeppelin dies except for the simple fact that the chips on the bottom of the stack would get roasted at these TDPs.


They could go back to the SECC days but then put heatsinks on both sides.


RAM of the motherboard could also be in a circle or sphere

Cray did this back in the day https://en.m.wikipedia.org/wiki/Cray-1


I'd imagine that a circle/sphere would add all sorts of manufacturing, engineering, and build (ie actual installation in the box) concerns.


IIRC, cooling is already the primary limiting factor on current designs. People have achieved some remarkable stable overclocks by using LN2 cooling.


Silicon microchannels can cool pretty much anything but they're expensive.


Well for one thing, light doesn't bend. There's something to be said for the flat rectangular shape.


There is no light used in a CPU (or any other kind of chip found on a PC motherboard). Grace Hopper was talking about the speed of light, which is roughly how fast a change in voltage will propagate. That can and does curve all around your CPU because the wires are not always routed in straight lines.


Really? I'm pretty sure Einstein would disagree.


If your CPU is dense enough to gravitationally bend light by any significant amount, then you might have other problems.

But we bounce light off of electric fields all the time, so a flat chip is not really required (even if you made an all-optical chip). As others have said, it's all about heat distribution.


[Brief network outage provoked this far-too-long answer ;) ]

I'm pretty sure anyone who's dealt with fibre optics would disagree.

> Einstein

This likely refers to the behaviour of a classical beam of light in General Relativity passing through vacuum near a massive object sourcing an exterior Schwarzschild metric (e.g. a non-rotating but otherwise typical star).

There are redundant degrees of freedom in such a system. Normally one would set a coordinate condition or fix a gauge such that the beam of light is moving across star-centred/star-fixed coordinates [1], and one would say that the path it takes is curved compared to a beam that started parallel with the "curved" beam at a great distance from the star, and which never gets very close to the star.[2] Thus, the star's mass curves the beam of light, or equivalently, the beam(s) of light pass through curved spacetime (more technically they each follow a null geodesic of the Schwarzschild solution), with the near-to-star region of spacetime being more curved than the far-from-star region of spacetime.

In Special Relativity, spacetime is flat, so in vacuum beams of light that start parallel will always remain parallel.

Of course, when one introduces non-vacuum media (including hollow optical fibres with vacuum cores) or even a mirror, then a classical beam of light can be made to follow very different paths than it would in free space.

I think in this context that to the extent that Einstein would focus more on the non-vacuum behaviour of light and -- if we are talking about modern optical computing -- not consider the behaviour of a classical beam of light but rather a set of photons interacting with matter in a way in which quantum mechanical behaviour is not just observable but outright relevant. This would be directly analogous to considering the behaviour of fundamental charges in semiconductors in modern electronic computing, as quantum mechanical behaviour is often un-ignorable.

- --

[1] One could choose any arbitrary coordinate condition. For instance (this is something we can do in GR that is very non-Special Relativity-like) one could fix a set of coordinates on the notional leading edge of a (non-eternal) beam of light and treat the curvature of spacetime as a force acting on the beam, increasing in magnitude near the star, and pointing towards the star. While this is wholly legitimate -- the covariant formulation of the physics is identical -- one would generally take the position that the gauge discussed above is preferable as the behaviour of the light is simpler to describe (there's no need to consider the restoration of the behaviour of the beam of light as it leaves the region near the star or to consider any (pseudo-) force fields).

[2] Strictly speaking this is the path through spacetime, however, one could certainly consider deviation in the spacelike non-g_tt directions, and quantify this with the geodesic deviation equation applied between the two originally-parallel beams. In any event, a curved path through spacetime is shorter than a non-curved path, which is markedly different from Euclidean geometry in which a curved path through space is longer than a non-curved path.[3]

[3] Compare a path of a pulse of light through an optical microchip in a lab here on Earth, propagating from one edge of the chip to another. In the lab one will measure the non-straight path through curved waveguides or reflecting off mirrors as longer -- and taking longer -- than a straight path. But that's because the spacetime in the lab is so very nearly flat that it can't presently be distinguished from flat on the length scale of a (parallel-to-the-floor) microchip.


With all the cores it would still apply to inter-core communication and memory sharing.


The more I read about computer internals, the more I am reminded of the hermetic saying "as above, so below".

The layout of a CPU, the layout of a computer internally, and the layout of a network of computers seems to be more and more the same.


It goes even deeper than that. The layout of the circuits at the transistor/component level has to deal with this too. Take a look at a motherboard between the CPU and the memory slots: you'll see traces that make weird turns and then double back on themselves, all to make sure the signals end up arriving at the same time (well, as close as is manageable).


Yes, but you can buffer inter core communication, cores don't need to be in sync. I think. I'm not a CPU designer.


I'm still waiting for this bug to be fixed: https://community.amd.com/message/2796982

Note: this isn't a bug in gcc, but looks like hardware bug related to hyperthreading.


AMD fanboy checking in :)

One of the good OCaml folks over at Jane Street just recently found a bug that crashes systems with Hyperthreading enabled on both Intel Skylake and Kaby Lake as well – just saying:

https://news.ycombinator.com/item?id=14630183

Those hardware bugs are all very disheartening from a software folks' perspective, no matter which company they pertain to... "microcode update" rollseyes

Edit: dear AMD haters, keep the downvotes coming B-) it has only been since May that the first fixes have been rolled out for Skylake / Kaby Lake – and yes, naturally AMD should "fix" (rollseyesagain) their respective issues ASAP but just saying...

https://arstechnica.com/information-technology/2017/06/skyla...


I bought my Ryzen CPU already, and going to build a new computer shortly. So I'm literally waiting for that fix.


If you are a Linux user it's fine (had two months on a 1700 with Fedora and zero issues) unless you hit the particular edge case.


Sounds like heavy parallel compiling hits it pretty often. Or you mean even that doesn't happen with every chip? Is it some kind of random hardware defect that's not always present?


I hammer the work machine but not at parallel GCC compiles so whatever the underlying cause it's simply not something I've hit.


Disable hyperthreading, it's overrated anyways


Is it? I know there are workloads when it actually can hurt performance, but are they more common than ones when it helps?


> I know there are workloads when it actually can hurt performance, but are they more common than ones when it helps?

Hyperthreading makes better use of execution units but causes twice as many threads to share memory bandwidth and caches. It's often a good trade-off.

But it's also less valuable for high core-count processors, because workloads that aren't highly parallel will have idle real cores to run other threads on, and highly parallel workloads are more likely to be bottlenecked by memory and caches when there are more cores.

The real question is whether it's faster for your workload. And your hardware. The answer could be different on a system with 8 cores and 2 memory channels than a system with 4 cores and 4 memory channels.

Turn it on and then off and see which is faster for you.
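On Linux you can even do that comparison without a reboot; a rough sketch (assumes the usual sysfs CPU topology layout, run as root, and re-enable by writing 1 back to the same files or just rebooting):

  # take every hyperthread sibling after the first offline, one thread per core
  for sib in /sys/devices/system/cpu/cpu*/topology/thread_siblings_list; do
    [ -r "$sib" ] || continue              # already-offlined threads vanish
    set -- $(tr ',-' ' ' < "$sib")         # list looks like "0,8" or "0-1"
    shift                                  # keep the first sibling online
    for id in "$@"; do
      echo 0 > "/sys/devices/system/cpu/cpu$id/online"
    done
  done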


Here are Anandtech's recent SPEC benchmark deltas for hyperthreading.

http://www.anandtech.com/show/11544/intel-skylake-ep-vs-amd-...


I see performance increases in every row there, so it must be beneficial in most cases.


Comparing single-threaded performance vs. two-threads-on-one-core will do that. Here's a bunch of real-world tests (mostly gaming) showing HT hurting performance: https://www.techpowerup.com/forums/threads/gaming-benchmarks...


> I know there are workloads when it actually can hurt performance

From what I've seen, performance is only hurt in contrived benchmarks intended to cause slowdowns when HT is in use.


It's not net helpful if it triggers a CPU bug.


Sure, but if it's a feature that increases performance when there's no bug, then the bug is something that should be fixed; otherwise you are paying for broken hardware.


That sucks indeed :(

I just wanted to say that many recent CPUs seem to be rushed efforts - be it Intel or AMD.


They're not rushed; you just cannot knock out every bug in a new piece of software or hardware in one go. It is not possible (without some incredibly slow and expensive processes).


AMD has pulled up to match or beat Intel, so both companies were absolutely rushing things out. It will settle down and these bugs will get fixed by microcode updates.


Compare what you can get today: an Intel CPU without a hyperthreading problem (fixed in a microcode update months back) or an AMD CPU with a hyperthreading problem (being investigated by AMD for 3 months).

Edit: thanks, s/unacknowledged/being investigated/


If I remember the story right it took Intel ~8months to accept that it was an issue and fix it.


Would you buy a pile of Ryzen CPUs under the assumption that AMD will be able to fix it properly with an update (without e.g. disabling HT or making performance worse as they did for their TLB fix in 2007)? If they take another 2 months, you're out on your return window.


To be fair, that Intel hyperthreading issue took Intel 5 months to investigate and fix.


That bug was acknowledged a while ago.


Just a question (this is how out of touch I am): I thought most Linux compilation had moved over to an LLVM toolchain?


In my experience, most Linux compilation still uses gcc. For instance, Fedora often recompiles all of its packages with a newer gcc for new Fedora releases; see https://fedoraproject.org/wiki/Changes/GCC7 and https://fedoraproject.org/wiki/Changes/GCC6 and so on.

And the Linux kernel, due to its heavy use of gcc-specific extensions, also has some problems compiling with LLVM; see https://bugs.llvm.org/show_bug.cgi?id=4068 for a list.


Actually, these days, it compiles just about fine. (though admittedly it depends on the platform)


Not even remotely close. No major distribution uses the LLVM toolchain.


If you are talking about the kernel itself, I think Debian did some work to support building Linux with clang. But last time I checked, it wasn't ready.

See also: https://wiki.debian.org/llvm-clang


Distributions compile more than just a kernel. That's just one of several thousand packages we build. Usually the problem (even with something as simple as PIC/ASLR) is that old projects might have problems compiling.

We just switched openSUSE Tumbleweed to GCC7 a few months ago, and it was a very large amount of work. I can't imagine how much work it would take to switch to LLVM.


Counterpoint: we compile more than one entire Linux distribution (one is Gentoo-based, others vary) with LLVM with essentially no issues (the number of local patches is incredibly small, and most are actually just fixes for buggy code that clang happens to detect, which we are waiting for upstream to accept).

Is it work? Sure. But it's not like "years of real work". It's about 5-10 people for a quarter or two.


It is definitely possible, but I was saying that switching to LLVM would take a while, and that even GCC updates are painful. 5-10 people for 6 months is a fairly serious time investment for a community project (especially since we have a lot of other things on our plate).


There are efforts for the kernel itself too.

http://llvm.linuxfoundation.org/index.php/Main_Page


To add to what other people have said, you can't even compile the kernel unpatched with llvm yet. GCC is still very much a thing.


What would be the benefit to llvm?


It is not GPL3 licensed. Most of the projects that have moved from gcc to llvm did it because of the license not for technical reasons.

That isn't to say there are not technical reasons to say LLVM is better - there are good ones. However, they are not compelling like the license is. LLVM has a better internal design, which should be better long term, but so far gcc generally compiles faster code, which is for most people more important than internal design.

Note that linux distributions are mostly built by the types of people who like the GPL3 license. Using gcc is the correct decision for most linux distributions.


Am I understanding it correctly that you're saying those projects switch because of the license of their compiler?

As far as I know the GPL doesn't "infect" the program compiled with a GPL-licensed compiler, I can have my proprietary binary-only software compiled with the GPLed gcc just fine...


> Am I understanding it correctly that you're saying those projects switch because of the license of their compiler?

Indeed they do.

  $ cc --version
  gcc (GCC) 4.2.1 20070719
This system is stuck with a rather old branch of GCC. It won't ever be upgraded, because of the license. LLVM seems to be on its way though.


It's a factor if you're writing a compiler for a new programming language. LLVM has been a pretty common target for that (see also: Julia and Rust) both for technical reasons and because it means the language itself can be released under a non-GPL license.

It's also a factor if you're distributing a compiler with your operating system. This has already driven FreeBSD (IIRC) to LLVM, and OpenBSD is starting to adopt it for a few platforms (GCC is still required due to certain OpenBSD targets that aren't supported by LLVM, but if that's ever fixed - whether because OpenBSD drops those targets or because LLVM eventually supports them - then I'd be very surprised if a switch doesn't happen for all platforms).


> It's a factor if you're writing a compiler for a new programming language.

Let them release their compiler for that language under GPLv3. What is there to be scared of?


The license for GPLv3 is something to be scared of.


Yes. Apple (NeXT at the time) had to make their Objective-C frontend for gcc GPL, and they did not like that. That is why Apple spent a lot of money making clang good: it allows them to have patches they don't want to make available while still getting a good compiler. (I don't know if those patches exist, but they have that ability if they want it.)

*BSD has in general hated having anything GPL in their base system. OpenBSD played with creating their own C compiler for a while (the goal was just a compiler with only minimal optimizations - they expected everyone would just use that to build gcc and then build everything with gcc).


It's widely viewed as better designed (though that's obviously somewhat subjective); it's more modular, and a lot of innovative work on optimization now happens there rather than in GCC.


The "licensing" and "better design" arguments are not really compelling. As for "licensing", basically no commercial operating system distribution ships the compiler in the base system. As for "better design", LLVM still doesn't generate code as well as GCC, especially for the kernel; but over the years as the codegen has become better, the performance has come down to match GCC. LLVM has some pinhole optimizations which don't exist on GCC, and some slightly different behaviours around exotic non-standard types, but they are largely becoming the same compiler.

Turns out that GCC is actually really fast for the quality of the code it generates, and Clang/LLVM take almost exactly the same amount of time to generate similar code.


Licensing was THE reason Apple didn't update their GCC version in Xcode and switched over to Clang/LLVM. IANAL and my history is a bit hazy but the things that Apple wanted to support and the way they wanted to go about doing it was not compatible with the GPL.


Apple first used LLVM for their graphics stack, and it became more attractive to them because of the JIT stuff. Bringing Clang along is more or less an artifact of that. Maybe they weren't interested in GPLv3 since the tivoization clause prohibits them from distributing the compiler on devices which can't be flashed, but I don't think it's exactly rational. They also don't update the version of Emacs that they ship, and there you could make the argument for it since it ships with the operating system.

They wanted the freedom to make the base system read only, and I get that, but if LLVM weren't there they would be shipping GPLv3 GCC and Emacs 25.


> tivoization clause prohibits them from distributing the compiler on devices which can't be flashed

The tivoization clause has an explicit exception that allows for devices which can't be flashed. What the clause prohibits is the case when devices can only be flashed if you got the right developer password, and the publisher deny the owner of the device to have that password.


rms used to tell a story about NeXT approaching him to come to a deal to throw money at the FSF in order to avoid having to open source the Objective C gcc toolchain. The FSF (predictably) refused.

I wouldn't be surprised if the decision to go all-in on Clang was a reaction to a decade old grudge.


While I'm sure there would be benefits to LLVM from having such a well-funded project depend on it, the kernel has driven a lot of patches for GCC, and I imagine the kernel would benefit more. Here is a list of advantages copied and pasted from llvm.linuxfoundation.org:

Why use Clang/LLVM with the Linux Kernel?

Fast Compiles (making you able to work faster)

LLVM/Clang is a fast moving project with many things fixed quickly and features added.

LLVM is a family of tools used in many problem domains leading to one code base being able to build tools to work on just about anything you need: one place to add features, or fix bugs.

BSD License (some people prefer this license to the GPL)

Built in static analyzer

Great error reporting and Fix-it hints

LLVM technology can be embedded into many tools (even yours!)

Already in wide use in OSS projects and in industry


Not having the kernel locked into a compiler monoculture for one.


Why does that matter?


Monocultures of any kind in software aren't good.

It leads to lock-ins, like the current kernel being locked into GCC. Some of the GCC extensions used are effectively proprietary because "the code is the documentation", so nobody can replicate them for their own compiler.

I do not believe that is healthy.


That's definitely a huge disadvantage!

But only supporting one compiler has the advantage that you aren't limited to the lowest common denominator of compiler features.

You'll notice this especially with C++ codebases which try to support GCC, Clang, Intel and Visual C++. You'll need a lot of macros and workarounds for each compiler's quirks.


When I'm doing a new C project I usually go out of my way to make sure it compiles cleanly with GCC and clang. Each compiler has its strengths and weaknesses: GCC tends to produce faster code in some situations, whereas clang has great static code analysis tools, and ASan is amazing. I've heard GCC has made major improvements in build speed since the last time I started a new C project from scratch, but that used to be on my list of reasons too.

C can be a hard to wrangle beast and the more checks I have on my code quality the better I sleep at night.
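Concretely, my smoke test looks roughly like this (the file names are placeholders and the warning set is a matter of taste):

  # build and run the tests under both compilers with ASan + UBSan enabled
  for CC in gcc clang; do
    "$CC" -std=c11 -Wall -Wextra -g -O1 \
          -fsanitize=address,undefined \
          -o "tests_$CC" tests.c mylib.c
    "./tests_$CC"
  done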


At least from the outside, GCC asan seems comparable to clang's right now. I haven't done back-to-back checking to see if clang finds things it doesn't. I haven't really used the undefined and thread sanitizers though.
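
For anyone who hasn't tried them: a minimal bug of the kind both sanitizers catch, built with "cc -g -fsanitize=address" using either gcc or clang:

  /* Minimal heap-buffer-overflow; both GCC's and Clang's AddressSanitizer
   * flag the out-of-bounds write and print a report at runtime. */
  #include <stdlib.h>

  int main(void)
  {
      int *buf = malloc(8 * sizeof(int));
      buf[8] = 42;    /* one element past the end */
      free(buf);
      return 0;
  }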


Oh wow. That's exciting news. My current job doesn't have me doing any C so I haven't been following the releases as closely as I used to.


LLVM is an extremely open backend. So it can be used for many projects a short list of its usages:

* Compiling SPIR-V shaders to GPU executables

* Compiling C

* Compiling C++

* Compiling Rust

* Compiling Haskell

* Compiling Swift

* JIT compilers for databases

* JVM JIT

Effectively, once you translate to LLVM IR you are basically home free; LLVM can handle the rest. Which means you have a broad range of people providing optimization passes, and catching bugs.

---

With GCC, the IR is tightly coupled to the C front end. While there has been some decoupling, this is generally considered a feature, as it doesn't allow a corporation to _run away_ with part of the project.

---

LLVM is more modular: you can pick and take what you need. You can embed it, or call it externally. It is under a permissive BSD/MIT/X11-style license, so it is open source, and embedding it in your project doesn't mean your project becomes GPL'd.
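
As a concrete (and hedged - this is just a minimal sketch, not anyone's production code) example of that embeddability, the C API lets you build IR for a tiny function and JIT it in a few dozen lines:

  /* Build roughly with:
   *   cc jit.c $(llvm-config --cflags --ldflags --libs --system-libs) -lstdc++
   */
  #include <stdio.h>
  #include <stdint.h>
  #include <llvm-c/Core.h>
  #include <llvm-c/ExecutionEngine.h>
  #include <llvm-c/Target.h>

  int main(void)
  {
      LLVMLinkInMCJIT();
      LLVMInitializeNativeTarget();
      LLVMInitializeNativeAsmPrinter();

      /* Build IR for: int sum(int a, int b) { return a + b; } */
      LLVMModuleRef mod = LLVMModuleCreateWithName("demo");
      LLVMTypeRef param_types[] = { LLVMInt32Type(), LLVMInt32Type() };
      LLVMTypeRef fn_type = LLVMFunctionType(LLVMInt32Type(), param_types, 2, 0);
      LLVMValueRef sum = LLVMAddFunction(mod, "sum", fn_type);

      LLVMBasicBlockRef entry = LLVMAppendBasicBlock(sum, "entry");
      LLVMBuilderRef builder = LLVMCreateBuilder();
      LLVMPositionBuilderAtEnd(builder, entry);
      LLVMValueRef tmp = LLVMBuildAdd(builder, LLVMGetParam(sum, 0),
                                      LLVMGetParam(sum, 1), "tmp");
      LLVMBuildRet(builder, tmp);

      /* From here LLVM takes over: codegen and JIT. */
      char *error = NULL;
      LLVMExecutionEngineRef engine;
      if (LLVMCreateExecutionEngineForModule(&engine, mod, &error)) {
          fprintf(stderr, "JIT error: %s\n", error);
          return 1;
      }

      int (*fn)(int, int) =
          (int (*)(int, int))(uintptr_t)LLVMGetFunctionAddress(engine, "sum");
      printf("sum(2, 3) = %d\n", fn(2, 3));

      LLVMDisposeBuilder(builder);
      LLVMDisposeExecutionEngine(engine);
      return 0;
  }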



More portability, and some of the sanitizers that clang has (ubsan, asan). They have gcc equivalents or ports, but being able to use more of these kinds of tools is a good thing in the end, since it'll allow more bugs to be found without an exploit.


Not sure what you mean by 'more portability'. gcc compiles to a strict subset of the architectures that llvm supports (at least, last time I checked).


It means that you won't end up relying on a specific compiler's quirks or implementation of the language. You can expect different compilers to handle undefined behavior in different ways which will help you catch where you're doing things you shouldn't be. And since you're no longer relying on a single compiler you are more likely to be able to support other platforms and compilers that don't have a gcc port (they do exist, even if they aren't common).


You have to rely on compiler's implementation -- you can't implement a kernel in pure C unfortunately, you need to do some things "outside" of C.

In practice, most of the support for the kernel has come about by clang implementing things the way gcc does them anyway.


You mean superset, right?


Yes, and it's too late to change now, thanks.


> and some of the sanitizers that clang has (ubsan, asan)

GCC has these.


Do they end up finding the same issues/bugs? I thought they didn't always find the same things last I tried using them, so I'd build in both compilers to end up running tests/checks.


Why would you want to switch to llvm from gcc for a GNU/Linux distribution? (I mean, beyond proving that the code is portable and do regression testing)?

I do remember talk on and off about shifting to the Intel C compiler for increased performance - apparently Intel now has their own distro:

https://clearlinux.org/


FreeBSD has moved to llvm.


NetBSD also supports clang/llvm.

https://wiki.netbsd.org/tutorials/pkgsrc/clang/


For the base system or for everything? I doubt they moved the ports to compiling with LLVM.


It's the default for everything. Many ports override that and specify gcc (but that was already the case in a lot of ports that needed a newer version than what shipped in the base system).


Starting FreeBSD 10 (released in 2014) the default for everything including ports was to build with clang. There was a large effort leading up to that to get as many ports as possible building with clang. Many bugs in clang were fixed, and patches were submitted upstream to make others work.

At this time that effort is dead: the few ports that don't build with clang are accepted as doing something that will never work.


As the compiler for base and most of ports, yes. The linker is still an ancient GPL2 GNU binutils one.


A lot of work happened on linking base with LLVM's LLD, I think it might be already used in -CURRENT for amd64 and arm64.


It's only used for arm64, not amd64, because arm64 doesn't exist in GPL2 binutils (from ~2008).


Not really, but it's generally up to individual projects. Many support both gcc and clang.


Turns out that after Clang got all the features GCC already had, compile times were even worse. At least for my C++ programs, compiling with g++ is faster than with clang++.


I am still waiting for the official errata for Family 17h


What I am going to be interested in is this versus EPYC parts. I think the higher clocks are mainly to achieve some of the more insane (and useless) FPS counts for games. If you are willing to ramp down the FPS to a number that your monitor can actually display, it may be better to find a general purpose EPYC MB and chipset, and use that. Especially if homelab / big data / compiling Linux / occasional gaming is your cup of tea.


For regular gamers I totally agree with your comment. But since there are Twitch streamers out there, Threadripper might be a good fit if you want to stream, since encoding must be done while preserving a high FPS (60 FPS or more). Also, it might future-proof you for VR streaming or 4K-and-higher streaming.


Ryzen 7 and 8 targeted the streamer market. That platform has reproducibly breezed through game streaming workloads.

Threadripper is for something else. "Professional work" most likely: developer workstations, 3D modeling workstations, etc. Given that all Ryzen SKUs support ECC [1], this is probably a good platform for individuals who have historically sought out server platforms.

[1] So far as I understand it, they haven't disabled ECC in hardware but don't promise that it works.


Ryzen 7 already struggles with two streams (2 separate x264 encoders with different settings for different streaming services, and I think that was tested at 1080p). Really serious streamers might pick up Threadripper :)


IME, Ryzen's very happy with a 1080p60 stream with fast or even medium quality, leaving headroom for CPU spikes because of rough encoding patterns. Really serious streamers are generally more advised to just split the existing stream at that point; 3.5K/medium (Twitch) and 5.5K/medium (YouTube) are pretty similar in motion.


For ECC to work on non-Pro Ryzen chips the motherboard needs to support it; pretty sure there are 2-3 boards that are confirmed to support it.


There's still plenty of room to grow into VR and/or 120Hz/240Hz monitors at 4k/5k.


Sure, and PCIe lanes will be key to making that work, not CPU. EPYC has insane capabilities there.


Resolution has nothing to do with it. To the contrary: you need less CPU power, since you will hit your FPS limit on the GPU, not the CPU. Similar for VR: current CPUs are strong enough to reach the 90 FPS current VR headsets want. 240Hz maybe, but those processors won't be faster in games than an Intel Core i7-7700K, as games don't use that many cores.


So then what is the original comment referring to?


I think it is comparing Threadripper to other workstation CPUs - or rather, it explicitly does compare to EPYC. Since Threadripper has a high clock, it might work better in games as well. Parent is right in saying that, once you adapt expectations, that additional gaming performance is unlikely to be necessary.

Reminds me of using one of the Xeon E5-2670 for gaming, as in https://www.techspot.com/review/1155-affordable-dual-xeon-pc....


Posting from my dual Xeon E5-2670 here, it is a great machine for gaming, look at those average fps benchmarks in your link.


Interesting argument, do you know of any CPU that, combined with an FE VEGA or GTX 1080Ti, reaches > 144fps in modern games at max settings and 4K resolution?

Because that’s what your monitor can display.


I'd be impressed if you're using a monitor that's capable of 4k at 144Hz.

While 4K and 144Hz monitors are each fairly common and inexpensive these days, monitors that do both are still very rare and expensive.


I actually am waiting for a 4K HDR 144Hz screen (which will arrive soon, actually).

Although I’ll only game on it in 1080p, I want the 144Hz and 4K mostly for easier reading.


I wish 4K (3840x2160) monitors would show a 1080p signal at a completely sharp 1-input-pixel-becomes-4-output-pixels.

They all seem to get some blurry scaler chip involved even when the numbers divide cleanly.


You can actually choose that as an option in the AMD or Nvidia control panel.

You want to select the scaling mode Nearest Neighbor instead of Bilinear.


I think you may be imagining that, at least for AMD. Where is the option located?


You are indeed right, this was once in the Catalyst center, but the radeon controls never got the feature: https://community.amd.com/thread/195561


The GTX 1080 Ti does not reach that many FPS on 4K, it mostly strives to get 60. It does not matter which cpu you run. When you are on 4K, you could be happy with a small Intel i5.


My only concern with 4 core/4 thread CPUs like the majority of i5 models is that the video game industry is already showing some signs of optimizing for having 8 threads on the PS4 and Xbox.

Watch Dogs 2 on PC, as an example, has widely documented CPU bottlenecks on 4 thread CPUs even at 1080p. I just replaced an overclocked i5 4690k last week with an i7 to solve this issue, and it is the first game I've played that was meaningfully CPU bound with my 1080 Ti on 1080p/1440p displays. I think the days of an overclocked 4 core i5 being a great value choice in high(ish) end PC gaming are probably coming to an end soon. I'd certainly think twice on a new build.

http://www.gamersnexus.net/game-bench/2808-watch-dogs-2-cpu-...


Watch Dogs 2 also runs surprisingly well on FX CPUs, as seen in that benchmark. But one has to be careful with console ports anyway; Batman: Arkham City, for example, would paint a very strange picture of PC performance (the port was and is a disaster).

But there are other valid examples, like Battlefield 1 in multiplayer. The 4 core era will end, and consoles might make that happen faster, but especially for 4K gaming I would not worry yet. Until GPUs are fast enough to make the CPU the bottleneck at those resolutions, there will be a few more CPU generations to come. At least based on current performance and how GPUs normally develop. We only just reached 60 FPS on high settings there, and that with the most expensive consumer GPU available.

But on 1080p and 1440p, that's a different story. Being more future proof for that development is one of the appeals of the Ryzen 5 1600 (6c/12t).


It's already ended. I have an 8-"core" AMD FX CPU, and I decided to underclock it to 1GHz to see how games reacted. You'd be surprised how playable many games are. Mostly, I guess, because the thread communicating with the GPU does not need all that much power, so frame rates stay semi-playable.

Anyway, with an underclocked 1GHz FX 8350, The Witcher 3 saturates about 5 cores in cities and nibbles on the sixth core. Dragon Age: Inquisition uses up to 8 cores. RiME uses 4 cores. TrackMania Nations uses 1-2 cores.

Generally, AAA games have been post-quad-core for years now.


I'd say it's already all but at an end. The Witcher 3 is probably the most intensive game I currently own. On my CPU (i7-4790k, 4 core/8 thread), standing still in a crowded city environment shows about 50% load across all 8 HT cores. Once you start running through the city, usage across all cores rises to a steady 60-70% with regular (every 2-3 seconds) spikes to 90+% on all cores. That's a two year old game. I have a feeling that when the next big wave of major AAA titles starts to hit, many of them will be severely limited by 4 core/4 thread.


If your 1080 Ti "mostly strives to get 60", your game is rendering way too many heavy effects :) If you actually tweak game settings (mostly turn off Screen Space Reflections, Ambient Occlusion, crap like that), you can play many games at 4K60 with much cheaper hardware.


Well, OP talked specifically about max settings. But you are not wrong; that's a problem in general currently. Ultra/max settings are incredibly heavy but don't change the visuals that much, especially not at a resolution that high.

Not too sure how stable your FPS will be on cheaper hardware, but not running max has to help.


Well, my overclocked RX 480 does around 80-90 fps in Overwatch with tweaked settings at 4K.

The problem is that presets tend to change everything. But at high res you need maximum texture size / mesh detail AND minimum effects.


It's been a while since I've built a computer with my own two hands, but either that man's hands are really small or, hot damn, AMD Ryzen CPUs are huge.


For what it's worth, I interpreted that image to be a still from the announcement video, which, per the article:

> Today’s announcements, accompanied by a video from the CEO of AMD Dr. Lisa Su

So, given that she's an Asian woman, it might be a smaller hand than you were thinking. Nevertheless, it does seem to be a very large component (hence the joke image at the bottom of the article).


Of course it's her, no one else publicly showed a Threadripper yet :)

It is large — 4094 pin socket!


This announcement is essentially two Zen CPUs mounted on the same microchip module (MCM) so yes, the package is "huge" relative to previous CPUs.


MCM is multi chip module


This is a HEDT processor that comes in the package of their server processors (up to four dice and eight memory channels, over 4000 pins total). These are large, because there's a lot in them ;)

The desktop series has a more usual size.


Threadripper and Epyc are huge. The desktop Ryzen is small :)


I'm still waiting for a more diverse set of synthetic and real-world benchmarks. It'll be interesting to see how IPC performance holds up with Threadripper; however, I think the most interesting debate will be whether the 1920X or the lowest-end EPYC CPU is the better buy.

Unfortunately, even as an enthusiast, $799 is more than I'm willing to spend on a CPU. I'm also still hard pressed to build a Ryzen 1700 system since I can purchase an i7 7700 from MicroCenter for about $10 less than the Ryzen part (and have equal or better general performance with notably better IPC).


The i7-7700K will give you slightly better performance in games, but mostly only in games. In multi-threaded workloads - and isn't a CPU in this league mostly relevant for those? - the Ryzen 7 is a bit faster than the i7-7700, and very close to the 7700K [0]. This is before overclocking, which one would definitely do with the smaller R7.

Also, the IPC of the i7 is not that much better, as evidenced by the good gaming performance of the Ryzen line. The 7700K, however, with its higher clock arrives at a higher single threaded performance, making it reach higher FPS.

[0]: https://www.computerbase.de/2017-03/amd-ryzen-1800x-1700x-17... - application benchmark; the article is in German, but the chart is readable without speaking the language.


You can have the same or better performance when running a single app, for sure.

Lots of people who've moved to Ryzen are saying how it's changed the way they use their computers, not having to worry about multiple CPU hungry apps running at the same time.


Cinebench scores are fantastic for speculating on how these CPUs will work in the context of 3D rendering/simulations. That personally has me very excited.


Agreed - 3062 for the TR-1950X and 2431 for the TR-1920X are colossal results. 2167 for the i9-7900X is still huge!

At first I thought that (given it's AMD running the test) they'd hamstring the Intel CPU with an underpowered cooler, but it looks like they gave it a big Corsair H100i 240mm water cooler. If that's not enough for a 140W CPU, I don't know what is - it was probably on turbo the entire time. They gave their own CPUs an "EK Prototype SP3" cooler, though - is this it? https://www.ekwb.com/shop/ek-kit-s360

Regardless, the message seems to be that if you want top performance, get a big cooler.


The "EK Prototype SP3" cooler was probably a prototype of a water block designed for the X399 platform. The chip is massive, so the existing solutions probably don't fit it.


My only qualm is that an i7 7700 system costs almost the same as a Ryzen 1700 system (since I'm buying my CPU from MicroCenter).

The use of my desktop is maybe 5% gaming, the rest is software dev and browsing, isn't there still a reason to prefer an IPC advantage for compilers and other system tasks?

I mentioned this, because I don't think a 12 core CPU would really give me that much more productivity or value gain than an i7 7700 or Ryzen 1700.

I only really need a single linux compatible nVidia GPU, enough cores and ram to comfortably handle 20 browser tabs, an IDE and a relatively hefty docker dev flow.


Unfortunately you don't see a whole lot of benchmarks of dev stuff on CPU review sites. Phoronix at least does Linux compile time. Not sure how representative that is of your "relatively hefty docker dev flow" but it shows the 1700 as 10% faster than the 7700K, which itself has a 17% higher base clock and 7% higher boost clock than the 7700.

http://www.phoronix.com/scan.php?page=article&item=amd-ryzen...


I generally have workflows that heavily utilize Docker (i.e., use at least 2GB); docker-compose on my MBP can take up to 6-8 minutes.

For me it's still unclear how much of a real advantage IPC provides for the common tasks or non-parallel processes a developer might be running.


Higher single thread performance is higher single thread performance, no matter if achieved via higher clock, doing more per clock ("IPC"), or both.

If you run one non-parallel task, the 7700K will be faster. If you run many non-parallel tasks in… parallel (make -j16), the 1700 will be seriously faster.


A simple formula: The 1920X beats the 7920X by a few hundred in Cinebench and a couple of hundred in the pocket.

I wonder if the 'Number Copy War' (started with the X299 vs. X399 Chipset) will continue throughout the year.


Since the article does refer to these as desktop CPUs, I'm curious what kind of desktop workloads people are running that could benefit from / justify them?


Gamedev is a good candidate. Compiling in parallel, running different processes that transform resources, running a not yet optimized game engine in debug build, etc.


It's fun being able to start a bunch of brute force number crunching jobs that peg cores for hours and still be able to use the computer as if nothing's going on. Browse, code, listen to music, watch video, everything runs smoothly.


Speaking of brute force, fuzzing!


Intel likes to market these CPUs to people who 'mega-task'.

Video Editing while Gaming while Streaming while Hosting a server while Dealing with encrypted data while Multi-monitor while Contributing to science while while while

etc

AMD hasn't really declared which market they are after yet. It's likely going to be a big part of the launch event.


Mega task - as in have a hundred tabs open, a dozen Excel spreadsheets plus Outlook. Maybe there will be lots of cores since there is no more Moore's law: 32 core computing will be the new Core i5, with 64 core computing being the new i7. Some software might run, albeit slowly, on older computers that only have 16 cores.

In this future the choice of 64 core computing is one of those things like car choice. Obviously we need cars electronically restricted to 155; we would not consider a car that only did 120 or so. There is a theoretical scenario where that vital power is needed, but it is not a rational decision based on cost benefit analysis.

Therefore you can expect a change over time to where core total becomes marketing.


Heh, IIRC the "mega-task" thing is very very recent, likely in direct response to AMD upping the core counts with Ryzen. (well, technically FX was 8 core but let's not go there)


I've been debating moving to one for a Gentoo build box

Beyond that, I could see highly scaled workloads like video editing that could take advantage of that much raw CPU power


BTW I had to move to tmpfs to keep this CPU busy when building Gentoo stuff.


I used to dabble in real-time ray tracing. My box is from 2005 though so it was never really great. These chips should easily do moderate complexity scenes at 720p and 60fps, and that's just an 8 core. Figure 1080p 60fps with 16 cores. But then we'll go 4K and turn on a lot more features and slow it down to a crawl again...


Wow, real-time raytracing at 60fps? Define "moderate complexity" :)

I've tried the Real Time Path-traced Quake 2 that's implemented with GPU shaders https://amietia.com/q2pt.html and it's… not fast, and it's grainy.


Path tracing uses a lot of rays per pixel. I do a single eye-ray and a few to the light sources for the surface intersected, and of course reflections. I do have a scheme for handling large numbers of light sources to some extent. I would say moderate complexity definitely includes a near-full-screen object of arbitrary polygon count. It may also include environments like anything from the quake series - including light sources. Of course I need to buy one of these things and try out the old code to see if we're really there yet. Until then I'm still speculating.
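
Not the parent's code, but a toy sketch of that scheme - one eye ray per pixel plus a shadow ray toward a single light - rendering a shaded sphere as ASCII. Each pixel is independent, which is why this kind of renderer scales almost linearly with core count:

  #include <math.h>
  #include <stdio.h>

  typedef struct { double x, y, z; } vec;

  static vec sub(vec a, vec b) { return (vec){ a.x - b.x, a.y - b.y, a.z - b.z }; }
  static double dot(vec a, vec b) { return a.x * b.x + a.y * b.y + a.z * b.z; }
  static vec norm(vec a) { double l = sqrt(dot(a, a)); return (vec){ a.x / l, a.y / l, a.z / l }; }

  /* Ray/sphere intersection: nearest positive t, or -1 on a miss
   * (dir must be unit length, so the quadratic's 'a' term is 1). */
  static double hit_sphere(vec orig, vec dir, vec center, double r)
  {
      vec oc = sub(orig, center);
      double b = 2.0 * dot(oc, dir);
      double c = dot(oc, oc) - r * r;
      double disc = b * b - 4.0 * c;
      if (disc < 0.0) return -1.0;
      double t1 = (-b - sqrt(disc)) / 2.0;
      double t2 = (-b + sqrt(disc)) / 2.0;
      if (t1 > 1e-4) return t1;
      if (t2 > 1e-4) return t2;
      return -1.0;
  }

  int main(void)
  {
      const int W = 64, H = 32;                 /* tiny ASCII "framebuffer" */
      vec eye = { 0, 0, 0 }, center = { 0, 0, -3 }, light = { 2, 2, 0 };

      for (int y = 0; y < H; y++) {             /* each row could be its own thread */
          for (int x = 0; x < W; x++) {
              /* Eye ray through this pixel. */
              vec dir = norm((vec){ (x - W / 2) / (double)H,
                                    -(y - H / 2) / (double)H, -1.0 });
              double t = hit_sphere(eye, dir, center, 1.0);
              if (t < 0.0) { putchar(' '); continue; }

              /* Shadow ray from the hit point toward the light; with one
               * sphere, only self-shadowing can block it. */
              vec p = { eye.x + t * dir.x, eye.y + t * dir.y, eye.z + t * dir.z };
              vec n = norm(sub(p, center));
              vec l = norm(sub(light, p));
              double diffuse = hit_sphere(p, l, center, 1.0) > 0.0 ? 0.0 : dot(n, l);
              putchar(diffuse > 0.5 ? '#' : diffuse > 0.0 ? '+' : '.');
          }
          putchar('\n');
      }
      return 0;
  }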


I've got a lot of slow build in my life. Large compiled projects are CPU heavy, easily parallelized workloads that offer a great ROI on speeding up. The cost of a new CPU is very little compared to wasted engineer time and broken concentration.


If they iron out the AVX issues, then ffmpeg is what I am most looking forward to switching to Ryzen for.


This could make a good build machine.


I do a lot of video editing, color grading, and video encoding.


make -j32 :)


It is great that they're announced for an August release, but when can I actually BUY one?

Given that Naples (aka Epyc) was "released" in June, I went looking to actually buy one, and I could not find a single place selling them. Not Newegg, nothing local, nothing in Google shopping, etc.


AMD stated that the full EPYC stack will be available for top-tier OEMs by the end of July. You're unlikely to see them being sold at retail directly until the tier 1/2 OEMs have had the opportunity to take their fill for their customers. That will take a few quarters.

Compare it to Intel - their top-tier partners had the hardware last November, but Intel announced 8 months after that. AMD launched its parts at the start of that 8 months, rather than the end. That's where some of the confusion about EPYC's availability lies. No-one complained they couldn't get SKL-SP back in November because no-one said anything. I'm guessing that AMD had to make some announcement, given the delays and the projects, and to drive up OEM cooperation/customer anticipation (and show something to investors).


Apparently, it's only available via one OEM: http://www.amd.com/en/where-to-buy/epyc-platforms


Who isn't even selling the single-socket config that I want.


On July 27, you'll be able to buy an Alienware Area-51 with Threadripper.


You'll be able to pre-order. Still no official word on when it'll ship.


With the recent flood of CPU SKUs, I suggest waiting until at least the end of this year for all of them to be widely available.


All three R7 SKUs were readily available at launch, and so were the R5s at their launch. Now mobos on the other hand...


I just googled that. They apparently have an exclusive among major brands. Yuck. To be honest, the last time I bought a pre-made tower was 1994 or so. I'm really waiting for boards and CPUs to show up on Newegg/Amazon, at Fry's, etc.


$999 list price translates to a $1100-$1150 retail price in countries with a GST-style tax; then you factor in an expensive motherboard, heatsink, and 64GB of RAM, and the upgrade is like $2k.

The problem is that with this confirmed return of competition between Intel and AMD, I am no longer sure whether it is a good idea to upgrade now, as this is basically the first iteration between those two. Are they going to release something even better in 6-12 months' time?


Well, the new iMac Pro (8/10/18 core) will start at $5000. I also feel my development experience getting ever slower on my MBP, and it would be pretty much a solved problem on a 16 core Threadripper workstation.

I feel like I can't wait for another iteration (actually don't need to) or the planned December release date for the new iMac Pro...

https://www.macrumors.com/roundup/imac-pro/


Wow, that 18 core iMac Pro with a 4TB SSD and 128GB of ECC RAM could shoot up to $15k. I can buy an army of Ryzens at that price.


> Are they going to release something even better in 6-12 months time?

In this segment? Almost certainly not. Intel doesn't even pretend to have a competitive product in this category ready for launch in the next 12 months. AMD is releasing one now, so its next one will be about that far out.


Yes, I would expect new CPUs every year, specifically Cascade Lake-X and Threadripper 2 in 2018 then Cannonlake-X and maybe Threadripper 3 in 2019.


AMD needs to come out with a few AVX-1024 instructions for vector ops. Essentially make one core into a GPU that doesn't suck at branching.


There's a reason that GPUs suck at branching.

Hint: It's because they're good at the vector stuff.


A "GPU that doesn't suck at branching" is basically what the Xeon Phi is intended to be, with 72 AVX512-enabled Atom cores per chip. However it costs over $6000.


and it is export controlled - the Chinese are not allowed to buy it.


And it is not competitive with GPUs for truly parallel workloads


The reason you think GPUs suck at branching is that they cannot branch on different lanes of SIMD data (the 'SI' part guarantees that). GPUs branch just fine between different instructions. There is nothing inherent to GPUs that makes SIMD instructions behave this way, so SIMD instructions in any other PU will do the same.
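
For the CPU-SIMD flavor of the same idea, a small hedged sketch: per-lane "branching" is expressed as a compare that produces a mask plus a blend, rather than an actual branch (AVX intrinsics, compile with -mavx; GPUs do the moral equivalent with execution masks):

  #include <immintrin.h>
  #include <stdio.h>

  int main(void)
  {
      __m256 x = _mm256_setr_ps(-2, -1, 0, 1, 2, 3, 4, 5);
      __m256 zero = _mm256_setzero_ps();

      /* Per-lane "if (x > 0) y = 2*x; else y = -x;" without branching. */
      __m256 mask = _mm256_cmp_ps(x, zero, _CMP_GT_OQ);
      __m256 if_true = _mm256_mul_ps(x, _mm256_set1_ps(2.0f));
      __m256 if_false = _mm256_sub_ps(zero, x);
      __m256 y = _mm256_blendv_ps(if_false, if_true, mask);  /* picks if_true where mask set */

      float out[8];
      _mm256_storeu_ps(out, y);
      for (int i = 0; i < 8; i++)
          printf("%g ", out[i]);
      printf("\n");
      return 0;
  }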


I think there are some really hard limits on this now. The better a chip is at branching, the worse it will be at SIMD, and vice versa.

Barring some crazy breakthrough I don't see what you want coming soon. Which is sad, because I want it too.


The comparisons in this article are mostly against the high-end Intel core line, but these CPUs support server / enterprise type features like ECC memory, lots of PCI-E lanes, and virtualization features (I think?).

Shouldn't Threadripper be compared to Xeons?

EDIT: Or rather, what I'm really wondering is what these CPUs lack that AMD's server line (EPYC) have.


EPYC has double the PCIe lanes, double the DRAM channels, and will have enterprise level support. Threadripper is classified by AMD as a Ryzen family product, and is consumer focused (or super high-end desktop focused) rather than enterprise focused. TR will be on shelves, EPYC will not.

AMD's 16-core EPYC part (the 1P 7351P) is around $750, but supports 2TB/socket and 128 PCIe lanes in exchange for a good chunk of frequency (2.4G base, 2.9G Turbo). Threadripper is also single socket only - most of EPYC is 2P.

Though given Intel's pricing, if AMD has the ecosystem, then the mid-range of the Xeon line might migrate to TR/EPYC.


I'm waiting for these to launch so I can build a great multi-threaded computer. My Elixir apps are waiting for all these threads! :)

Does anyone know if Plex is going to see much benefit transcoding video files on the fly?


> Does anyone know if Plex is going to see much benefit transcoding video files on the fly?

Does the underlying avcodec / ffmpeg support huge thread counts?


It's not ffmpeg that matters here but libx264, which does support multithreading. I ran a quick test on my PC (workstation class Xeon with 6c/12t) and found that with 1/2/4/8/12t it took (wall time) 28.5/14.7/9.7/7.2/6 seconds to encode 10s of video at 1080p. So it is able to take advantage of multiple cores but there are diminishing returns.

For Plex I'd really focus on using hardware encoders more than anything else. NVENC and QuickSync are both widely available and able to do decent H.264/HEVC encodes far faster than any CPU, and on a home network you're not really bandwidth constrained, so the degree of compression doesn't matter too much. On a mobile network I'd go for Plex's offline optimization.
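
If anyone wants to reproduce that kind of scaling test, here's a rough sketch that shells out to ffmpeg with different -threads values and times each run ("clip.mp4" is a placeholder input file):

  #include <stdio.h>
  #include <stdlib.h>
  #include <time.h>

  int main(void)
  {
      const int threads[] = { 1, 2, 4, 8, 12 };
      char cmd[256];

      for (size_t i = 0; i < sizeof threads / sizeof threads[0]; i++) {
          /* Encode to nowhere (-f null) so only libx264 speed is measured. */
          snprintf(cmd, sizeof cmd,
                   "ffmpeg -loglevel error -y -i clip.mp4 -an "
                   "-c:v libx264 -preset medium -threads %d -f null -",
                   threads[i]);

          struct timespec t0, t1;
          clock_gettime(CLOCK_MONOTONIC, &t0);
          if (system(cmd) != 0) { fprintf(stderr, "encode failed\n"); return 1; }
          clock_gettime(CLOCK_MONOTONIC, &t1);

          double secs = (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9;
          printf("%2d threads: %.1f s\n", threads[i], secs);
      }
      return 0;
  }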


As soon as I have some funds I will be getting one, but only if ECC is supported -- what would be even better is if one could do a mild OC on the part but also have ECC.


Quad channel, so you have to install RAM as 4 matched sticks at a time?


If you don't want to run in Dual or Single channel mode - yes.


Kinda.

Modern (newer than ~10 years) memory controllers are quite flexible, so you can pretty much populate any number of DIMMs in any slots (apart from wrong ordering, though this is not an IMC restriction but rather a firmware one, for stability).

However. Disregarding the recommended populations usually means that the MC can't interleave the channels properly and also must choose other settings in a lowest-common denominator way. This severely reduces the performance of the memory subsystem, which includes cache coherency and all core communication in Zen.


Interesting, thanks :)


I did a lot of work with artificial life and evolutionary computation in the early 2000s. Wish we had these chips back then.


How long does it take to rip a thread?


Opteron feels o//


I recently tried to go the AMD/Ryzen route. I like an underdog comeback story as much as the next guy.

But be warned: motherboards that "support" Ryzen do not in fact support Ryzen out of the box. You have to update the BIOS to support Ryzen. How do you POST without a supported CPU, you ask? Who knows? Magic, possibly.

I still don't understand how AMD expects their customers to have more than one CPU (and possibly DDR4-2133 sticks) to be able to POST and update the BIOS.

I returned everything AMD and went back to safe, good ole Intel. Worked on first try. I'm never getting sucked into AMD hype again.

Also, when I went back to return the AMD components to Fry's, the manager said they were aware/used to getting Ryzen returns because of this.


What?!

That sounds totally bogus. You don't have to update the BIOS to support Ryzen. Ryzen is the first CPU on AM4.

You don't need "DDR4-2133 sticks". Ever. Any DDR4 sticks can run at 2133; that's literally the DDR4 standard, and everything above is overclocking. 2133-rated sticks are the cheapest (and worst).

I got my R7 1700 and mainboard yesterday. Everything worked on first try. Speaking of DDR4, my 2400 rated (Hynix) sticks overclocked to 3200, with decent timings, even :) (Well, decent for Hynix.) My previous system (overclocked non-K Skylake) couldn't run these sticks above 2450.


Why... why would I lie? What could I possibly have to gain from lying about this? I only have HN karma to lose, which I don't have much to begin with.

You can, before simply defending AMD, research and see for yourself that required BIOS updates are indeed an issue.

I tried it with an MSI Tomahawk and an Asus B350 Prime. I had a Ryzen 5 1600 and a Ryzen 7 1700, as well as a pair each of DDR4-2133 and DDR4-3000 modules, tried every combination, and was never able to POST.

Switched to an Intel CPU and an Asus Intel-supporting MB with the same DDR4-3000 RAM and it POSTed on the first try. I'm saying that to 'prove' that the other components were fine.

I'm glad that it worked out for you, and I wish it had gone smoothly for me too.

http://www.tomshardware.com/answers/id-3387164/am4-motherboa...

https://www.reddit.com/r/Amd/comments/66b7vu/to_all_ryzen_5_...

... so many of these.


You messed up something. My 1700 and Asus B350 Prime were able to POST without issues and without any BIOS updates, and I got both a couple of weeks after release. Most BIOS updates have only been required for overclocking improvements.



