I wonder whether eDRAM (https://en.wikipedia.org/wiki/EDRAM), which is essentially DRAM embedded in a chip made on a logic process, would be a good idea here.
eDRAM is essentially a middle ground between SRAM and DRAM: it offers much greater density than SRAM at the cost of somewhat worse throughput and latency.
There were a couple of POWER CPUs that used eDRAM as L3 cache, but it seems to have fallen out of favor.