Apple did a lot right with this change to make memory fast. I can see AMD and Intel adopting a similar strategy and putting something like 16 GB of DRAM on the chip. Need more than that? Then add “L2 DRAM” on an external DIMM. 16 GB will cover most people’s use cases, and with the ability to add L2 DRAM the high-memory cases are covered too. (I remember when you could buy cards with L2 cache on them back in the day; the 486 had them, I think. This is just taking that to the next level.)
Intel already launched a processor with 16GB of on-package MCDRAM back in 2016 (the Knights Landing Xeon Phi), and today you can buy an Intel Xeon with 64GB of on-package HBM2. Nvidia has likewise been packaging HBM with its server GPUs.
Embedded DRAM (eDRAM) has been used for a long time, e.g. in game consoles (the Nintendo GameCube), IBM's POWER7, and Intel's Haswell products. However, using a logic process node to make DRAM cells is wasteful. Packaging technologies have advanced enough that you now routinely see standard DRAM dies (LPDDR, HBM) being put on-package.
But all of that is packaging and manufacturing technology. We're still talking to DRAM over a memory bus as if we were living in the '80s. The real innovation I'm looking out for is a company sticking its neck out and using a different communication standard to talk to the DRAM modules, something like CXL.mem, which is used in the server space to talk to memory expansion modules.
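For what it's worth, a CXL.mem expander typically surfaces to the OS as a memory-only NUMA node, so software already has a crude view of "near" vs "far" memory. Here's a rough Python sketch of peeking at that, assuming a Linux sysfs layout; treating "CPU-less node" as "probably an expander" is just a heuristic for illustration:

```python
# List NUMA nodes and flag memory-only (CPU-less) ones, which is how
# CXL.mem expanders and similar far-memory devices usually show up on Linux.
import glob, os

for node in sorted(glob.glob("/sys/devices/system/node/node[0-9]*")):
    with open(os.path.join(node, "cpulist")) as f:
        cpus = f.read().strip()                  # empty string => no CPUs on this node
    mem_kb = 0
    with open(os.path.join(node, "meminfo")) as f:
        for line in f:
            if "MemTotal" in line:
                mem_kb = int(line.split()[-2])   # "... MemTotal: <value> kB"
    kind = "CPU-less (likely expander) node" if not cpus else "regular node"
    print(f"{os.path.basename(node)}: {mem_kb // 1024} MiB, cpus=[{cpus or '-'}] -> {kind}")
```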
The parent isn't talking about cache (L1/L2/L3/...) but about main memory (RAM): 16 GB of it would be permanently integrated into the CPU (the "L1 RAM"), and the rest, the "L2 RAM", would sit outside the CPU.
I’m not exactly sure why they mentioned it, but in their defense—it is all fundamentally volatile storage, just used differently. Memory of course has a special magical meaning to operating systems, but hypothetically it might make sense to mark L3 cache as memory, and maybe… treat DRAM as swap? Hypothetically!
It wouldn’t be a cache anymore in that case. The hardware is named after the job we typically use it for, so if we start playing with how it's used, the names might not line up perfectly anymore.
It's why I'm thankful it's both open source and highly scrutinized by the community: volunteers, independent security researchers, and big companies like Google that deploy billions of instances of Linux (servers, Google Cloud, Android, ChromeOS, etc.).
And we know about it. The backdoor methods have been generalized, and now researchers can check for them too.
For example, Bitcoin's elliptic curve secp256k1 was chosen because its constants were picked in a predictable way, which reduces the possibility of a hidden backdoor.
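To make that concrete, here's a quick Python check of secp256k1's published parameters: the field prime is the simple expression 2^256 - 2^32 - 977, the curve is just y^2 = x^3 + 7, and the standard generator point really does lie on it.

```python
# secp256k1 domain parameters as published in SEC 2.
# The point: the constants are short and simple, leaving little room to hide anything.
p = 2**256 - 2**32 - 977           # field prime, a prime of a very simple special form
a, b = 0, 7                        # curve: y^2 = x^3 + a*x + b
Gx = 0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798
Gy = 0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8

# Verify the standard generator satisfies the curve equation mod p.
assert (Gy * Gy - (Gx**3 + a * Gx + b)) % p == 0
print("generator lies on y^2 = x^3 + 7 (mod p)")
```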
Dual_EC_DRBG is the known backdoored construction (an elliptic-curve-based random number generator rather than a curve itself). The links to the high-level story are in a sibling comment.
I would also like to add, however, that the possibility of a backdoor was patented (by Scott Vanstone, I think) and was raised during the NIST standardization process (I suspect it was standardized under pressure from the NSA more than anything). Other objections raised at the time include the fact that it performs very poorly compared to just about any other RNG. So the process isn't as bad as it looks.
Dual EC was a backdoor, but not a very good one. People noticed the possibility, and it performs terribly compared to literally anything else. The only people who appear to have used it were customers of RSA, Inc. (whose BSAFE library shipped it as the default RNG).
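For anyone curious why a pair of fixed curve points can even function as a backdoor, here's a heavily simplified Python sketch. It is not the real Dual_EC_DRBG: the real thing uses NIST P-256 and truncates 16 bits of each output, while this toy borrows secp256k1 (only because its constants are easy to write down), skips the truncation, and uses made-up values for the secret d and the seed. The algebra is the point: whoever knows d with P = d*Q can turn a single output block back into the generator's internal state and predict everything after it.

```python
# Toy sketch of the Dual EC state-recovery idea (Shumow/Ferguson-style), not the real DRBG.
p  = 2**256 - 2**32 - 977          # secp256k1 field prime (curve: y^2 = x^3 + 7)
Gx = 0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798
Gy = 0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8
G  = (Gx, Gy)

def ec_add(A, B):
    """Affine point addition; None represents the point at infinity."""
    if A is None: return B
    if B is None: return A
    (x1, y1), (x2, y2) = A, B
    if x1 == x2 and (y1 + y2) % p == 0:
        return None
    if A == B:
        lam = 3 * x1 * x1 * pow(2 * y1, -1, p) % p
    else:
        lam = (y2 - y1) * pow(x2 - x1, -1, p) % p
    x3 = (lam * lam - x1 - x2) % p
    return (x3, (lam * (x1 - x3) - y1) % p)

def ec_mul(k, A):
    """Double-and-add scalar multiplication."""
    R = None
    while k:
        if k & 1:
            R = ec_add(R, A)
        A = ec_add(A, A)
        k >>= 1
    return R

def lift_x(x):
    """Recover a curve point with the given x-coordinate (works since p = 3 mod 4)."""
    y = pow((x**3 + 7) % p, (p + 1) // 4, p)
    return (x, y)

# The "designer" publishes Q and P, secretly knowing d with P = d*Q.
Q = G
d = 0xC0FFEE                       # hypothetical escrow secret
P = ec_mul(d, Q)

def dualec_round(s):
    """Simplified round: s <- x(s*P); output x(s*Q). The real DRBG truncates the output."""
    s = ec_mul(s, P)[0]
    return s, ec_mul(s, Q)[0]

# Victim produces two output blocks from a secret seed.
seed = 0xDEADBEEFCAFEBABE          # hypothetical secret state
s, out1 = dualec_round(seed)
s, out2 = dualec_round(s)

# Attacker sees only out1 and knows d.  Since d*(s*Q) = s*(d*Q) = s*P,
# the x-coordinate of d*R is the generator's next state, so out2 is predictable.
R = lift_x(out1)                   # a point with x = out1 (the sign of y doesn't matter)
s_recovered = ec_mul(d, R)[0]
print(ec_mul(s_recovered, Q)[0] == out2)   # True
```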
I would also like to add that Elliptic (not Elliptical, these are not the equations of ellipses) Curves, even the NIST ones, are not known to be backdoored, and there is no evidence at present that they contain any weaknesses. There are plenty of non-American cryptographers who would be unlikely to keep such an analysis secret if they found one, and I'd say quite a few American ones would publish it too.