Nvidia makes G-Sync, so the GPU can control and adjust the display refresh rate on the fly: proprietary, closed source, requires a private Nvidia license. AMD makes FreeSync: open sourced, and added to the HDMI and DisplayPort standards.
Now we see Nvidia make NVLink: proprietary, very fast, requires a private Nvidia license. AMD partners with IBM, Google, etc. to make OpenCAPI, an open standard that can revise/replace PCIe 3.0.
Why does this feel like Microsoft vs. Linux, but with hardware standards?
Was "anyone" as "annoyed" as I was about the "scare quotes" littered "throughout" the article? It was "hard" to read the "story".
I've been guilty of using too many quotes in the recent past. Someone on HN called me on it, and I've since toned it down. Now it's something that sticks out at me.
What's wrong with PCIe? The devices are ubiquitous. It's point-to-point, allowing for device-to-device connectivity. We have external PCIe enclosures and cables. We have a healthy set of PCIe switches.
This is similar. It's also point-to-point, and it has an external story. Skimming the spec, they've even thought about higher-latency (200ns) links, and optical. Even with these advantages, I'm unsure how it'll fare given the lack of available IP compared to PCIe.
Interesting, though; it will be fun to see it play out. It bums me out a little that POWER isn't easily available yet.
PCIe is still rather slow. A single-packet transaction on PCIe costs 120ns.
The lack of IP is the reason why P8/CAPI stuck to PCIe. In the original design, CAPI was to simply reuse the PCI link layer, not the transaction layer.
With a cache-coherent system based on PCIe, especially when the coherency layer is set at L3 like on POWER8, you are looking at ~500ns latency for a single cache line. This kind of latency is just too much for many applications.
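To make that concrete, here's a rough back-of-envelope sketch in C: with strictly dependent accesses (each cache-line fill must complete before the next can be issued), effective bandwidth is just line size divided by latency. The 500ns figure is from above; the 100ns local-DRAM latency and 128-byte POWER8 line size are illustrative assumptions, not numbers from any spec.

```c
#include <stdio.h>

/* Back-of-envelope only: effective bandwidth when every access is a
 * dependent cache-line miss. Latencies are illustrative. */
int main(void) {
    const double line_bytes   = 128.0;  /* POWER8 line size; x86 uses 64 */
    const double latency_ns[] = { 100.0, 500.0 };
    const char  *label[]      = { "local DRAM (~100 ns)",
                                  "coherent over PCIe (~500 ns)" };

    for (int i = 0; i < 2; i++) {
        double mb_per_s = line_bytes / (latency_ns[i] * 1e-9) / 1e6;
        printf("%-29s %7.1f MB/s effective\n", label[i], mb_per_s);
    }
    return 0;
}
```

At 500ns per dependent line fill you get roughly 256 MB/s, versus about 1.3 GB/s at local-DRAM latencies: a fivefold loss to the interconnect before the application does any work.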
It depends on how low a latency they mean by 'low latency'. If it can't drive fat gaming GPUs to full utilization, PCIe will be around for a while still. Also, Intel isn't joining up, so PCIe is absolutely sticking around.
Xilinx offers 25 Gbps single lanes that can bond up to 4x to get IEEE 802.3-2012 spec compliance for free* with their suite. Sure, you're going to need to control those trace impedances and your board won't be something coming out of OSH Park, but those are definitely attainable speeds for the consumer (e.g., in the single thousands of dollars; not 800k Cisco VXR tier-1 infrastructure).
You can configure it in CAUI-10 (10 lanes x 10.3125G) or CAUI-4 (4 lanes x 25.78125G); either way, it's been production-ready for quite some time now. (The docs have numbers, but trust me, you can get full throughput within that 200 ns; see the quick arithmetic after this comment.)
There's even production Agilent off-the-shelf test equipment out there that can fully sample at those speeds (none of that over-sampling tomfoolery, we're talking live, Bill O'Reilly style).
In 1989, Sun's SPARC machines had similar facilities (SBus) to push 100 Mbit between other Sun machines, so, I mean, not too insane comparatively.
* "Free" with the required purchase of a Virtex® UltraScale™ or Kintex® UltraScale™ FPGA, haha.
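To sanity-check those numbers against the 200ns figure discussed above, here's some rough arithmetic (assuming 64b/66b line coding as in 100GbE; framing overhead ignored, so this is an upper bound):

```c
#include <stdio.h>

/* Rough arithmetic: raw vs. usable rate for the CAUI configs above,
 * and how many bytes fit in a 200 ns window. Upper bound only. */
int main(void) {
    const double caui4     = 4.0  * 25.78125;     /* Gbps raw, CAUI-4  */
    const double caui10    = 10.0 * 10.3125;      /* Gbps raw, CAUI-10 */
    const double usable    = caui4 * 64.0 / 66.0; /* after 64b/66b     */
    const double window_ns = 200.0;

    double bytes = usable * 1e9 * window_ns * 1e-9 / 8.0;

    printf("CAUI-4 raw: %.3f Gbps, CAUI-10 raw: %.3f Gbps\n", caui4, caui10);
    printf("usable after 64b/66b: %.1f Gbps\n", usable);
    printf("payload in %.0f ns: %.0f bytes (~%.0f 64-byte cache lines)\n",
           window_ns, bytes, bytes / 64.0);
    return 0;
}
```

Both configurations land at 103.125 Gbps raw, about 100 Gbps after coding, i.e. roughly 2.5 KB per 200ns window, so the link rate itself isn't the bottleneck.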
I'd like to look up the term and understand how they can achieve 200ns latency. Am I the only one who reads '200ns latency' as completion latency?
> "NVIDIA is a member of the OpenCAPI consortium, at the "contributor level", which is the same level Xilinx has. The same is true for HPE (HP Enterprise)"
Hmmm, interesting. I wonder what this means for the new Intel Xeon Phi Knights Landing? I liked the approach of many cores on one bootable chip, all having a reasonable amount of local memory and high-bandwidth interconnects: no need to offload data to a peripheral (GPU) device. However, with this standard, the currently limited bandwidth between peripherals and the main CPU will improve a lot.
To me it is obvious why Intel is not joining this party.
Well, I think it's a bit more about market segments and interoperability than anything else. Currently, overall systems from IBM, AMD, and Intel are fundamentally incompatible. The PCIe bus that Knights Landing hangs off of is a qualitatively different thing from the kinds of memory-coherent inter-CPU buses being addressed by the CAPI proposal. Intel has its own proprietary QPI. AMD has the quasi-open HyperTransport (still?).
If this is done properly, it means you could make generic motherboards, generic memory controllers, and all sorts of different accelerators, and mix and match them from various vendors. So it's no surprise that the smaller players in the market are trying to gang together while the larger player is trying to keep lock-in.
Knights Landing as it stands would already integrate into systems better if it were on an inter-CPU bus rather than a peripheral bus, as would GPUs, FPGAs, and certainly RDMA/memory-window systems like Mellanox's.
Inherent distrust of standards aside, this could be a great win for people putting together bespoke systems in interesting configurations (e.g., Google). I don't think there is any downside in theory for Intel except more competition.
I haven't caught up with the latest Phi releases, but I'm really interested in doing so. I haven't been able to find any discussion of a coherent QPI on KNL, but I have found references to OmniPath, which looks like a non-coherent large-scale memory network. Is that what you were thinking of, or could you post a reference?
You are right, I was misremembering. The socketed KNL does not have QPI at all, so no multi-socket boards are possible. Still, the socketed KNL doesn't hang off the PCIe bus, as it is its own host processor and has a dedicated link to memory.
Omnipath is used to drive both the Ethernet and PCIe.
It is precisely because Intel has developed its own next-generation buses, and is licensing them very restrictively, that CAPI exists. Intel has QPI (QuickPath) for internal use and OmniPath for external.
Nvidia would have loved to have a GPU with a QPI interconnect, but Intel wouldn't let them because Intel has its own GPU ambitions in Xeon Phi. So Nvidia came up with NVLink, which is kind of like PCIe but faster. They don't have any switch ASICs as yet, so they are limited to fully connected topologies (see the link-count sketch below). Details on NVLink (without NDA) are scant, but I don't think it has the ability to be multi-node (it really is a bus).
Intel is now making Xeon Phi with on-board OmniPath (more like InfiniBand), which is curious.
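For context on why the lack of switch ASICs limits NVLink's scale: in a fully connected topology every device pair needs its own dedicated link, so the link count grows quadratically with device count. A trivial illustration:

```c
#include <stdio.h>

/* All-to-all: each of the n devices links directly to the other n-1,
 * giving n*(n-1)/2 distinct point-to-point links. */
int main(void) {
    for (int n = 2; n <= 8; n++)
        printf("%d devices -> %2d links\n", n, n * (n - 1) / 2);
    return 0;
}
```

With a fixed number of link ports per GPU, that's why switchless designs top out at a handful of devices per node.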
This is meant as a competitor, at least in the high-performance computing market, to Intel's Xeon Phi (aka Knights Landing) based systems, which will start including Intel's OmniPath network fabric on-die (based on QLogic's InfiniBand tech).
If those gain widespread adoption, that leaves little room for Mellanox's IB platform, hurts NVIDIA's sales of accelerator cards, and cuts AMD's and IBM's processors out of the picture entirely.
Do they need the market leader to take on the market?
In the end, that's also what is at stake, and the reason Intel might not be that interested. They may eventually see the light and realize they would have to steer the collaboration in a more beneficial direction (for them), but at first they will probably wait for it to gain momentum before deciding that they have to invest money to push the effort in the right direction (for them).
They have to be a threat for the established company to make a move. What nickpsecurity is saying, by pointing out that they don't need Intel, is that they are not yet a sufficient threat but are capable of becoming one.
Which they will be if they have a competitive technology with a migration path for both the x86 ISA and just about everything else dominating the acceleration space. Intel might respond by supporting it or by deploying their own thing. They're sure to lose market share, though, if the coalition's standardization commoditizes accelerators more than Intel's side does. They could lose some profit margin in the niche along with the market share.
Yes, but the commoditization of UNIX-like OSes and processor-agnostic runtimes in the data center means that processor architectures are no longer a way for manufacturers to keep developers captive.
So any HPC application that makes use of abstraction libraries for SIMD and GPGPU code can be easily moved to non-Intel processors.
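As a sketch of what that portability looks like in practice, here's a SAXPY kernel written against GCC/Clang vector extensions (a stand-in for the SIMD abstraction libraries mentioned above, not any specific one); the same source vectorizes to SSE/AVX on x86, VSX on POWER, or NEON on ARM:

```c
#include <stddef.h>
#include <string.h>

/* Generic 4-wide float vector; the compiler maps it to whatever
 * SIMD unit the target has (SSE/AVX, VSX, NEON, ...). */
typedef float v4f __attribute__((vector_size(16)));

/* y[i] += a * x[i], written once with no vendor-specific intrinsics. */
void saxpy(float a, const float *restrict x, float *restrict y, size_t n) {
    v4f va = { a, a, a, a };
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        v4f vx, vy;
        memcpy(&vx, &x[i], sizeof vx); /* unaligned-safe vector loads */
        memcpy(&vy, &y[i], sizeof vy);
        vy += va * vx;
        memcpy(&y[i], &vy, sizeof vy);
    }
    for (; i < n; i++) /* scalar tail for leftover elements */
        y[i] = a * x[i] + y[i];
}
```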
The people involved or wanting to use this buy so many chips from Intel that if Intel doesn't get on board, it's likely going to turn out badly for Intel.
I note that Facebook and Intel are missing, which makes me wonder if they are off in a corner somewhere doing their own thing.
"especially when you need to make sure that 30+ year-old code still work like it's the case in HPC"
Most code that works on Intel works on AMD. It's rare that it doesn't. The HPC vendor will have low risk on migration + get a bunch of competing accelerators at various price points for their problem. This is quite an incentive to move even if there would be stragglers.
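A minimal sketch of why the migration risk is low: x86 feature detection happens per-CPU at run time via CPUID, which Intel and AMD both implement, so a single binary picks the right code path on either vendor's parts. (This uses GCC's __builtin_cpu_supports; the two kernels are hypothetical placeholders.)

```c
#include <stdio.h>

/* Placeholder kernels standing in for real AVX2 and scalar paths. */
static void kernel_avx2(void)   { puts("dispatched to AVX2 path"); }
static void kernel_scalar(void) { puts("dispatched to scalar path"); }

int main(void) {
    __builtin_cpu_init(); /* populate CPU model/feature data */
    /* CPUID-based check: vendor-neutral, works on Intel and AMD alike */
    if (__builtin_cpu_supports("avx2"))
        kernel_avx2();
    else
        kernel_scalar();
    return 0;
}
```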
The question is: are the DOD/DOE (or the Chinese/European equivalents) going to risk millions on a new architecture for their next supercomputers, or prefer good old Intel?
Especially when we know that the jump to exascale computation will only be possible if we get rid of the buses and have everything on the same chip (which is one direction taken by Intel).
My point is that the conclusion of the article, "Intel has to come up with an answer...", is just plain wrong. This new architecture is an answer to Intel's new developments. We will have to wait and see what sticks to the wall.
They're the people who bought all the POWERs (SP2 onward), Alphas, Itaniums (e.g., SGI), Cells, and so on. They'll take risks, especially if they think the firm will be around to supply the upgrades.
Err, again, the people involved, on their own, buy enough chips to have a completely self-supporting ecosystem.
Even if it only ever stayed the current set of participants, it'd be enough money to be worth it.
Plus, the suppliers are already making enough to justify building these things. It's existing products being extended with a new interface to support new products. Whatever Intel does probably won't kill the existing products. With that in mind, I think the move can only be beneficial for Intel's competitors, given they're already in an uphill battle and need differentiators. What do you think of that angle?
I would argue that in some respects they would have *less* of a strategic advantage, but that consumers would see the most benefit from everyone getting along and competing on merit on an open playing field. Even in the latter case, AMD would still have *an* advantage, as they would at least still be in the primary game and would also be restricting Intel to targeting that same hardware interface.
There was at one point coherent HyperTransport (cHT) IP available for Xilinx parts. But that is what I am kinda getting at: did cHT fail for reasons of pure timing? Too far ahead of the market?