A ways back, I wrote a sort of database that was memory-mapped-file backed (a mistake, but I didn’t know that at the time), and I would have paid top dollar for even a few GB of NVDIMMs that could be put in an ordinary server and could be somewhat straightforwardly mounted as a DAX filesystem. I even tried to do some of the kernel work. But the hardware and firmware was such a mess that it was basically a lost cause. And none of the tech ever seemed to turn into an actual purchasable product. I’m a bit suspicious that Intel never found product-market fit in part because they never had a credible product on the NVDIMM side.
Somewhere I still have some actual battery-backed DIMMs (DRAM plus FPGA interposer plus awkward little supercapacitor bundle) in a drawer. They were not made by Intel, but Intel was clearly using them as a stepping stone toward the broader NVDIMM ecosystem. They worked on exactly one SuperMicro board, kind of, and not at all if you booted using UEFI. Rebooting without doing the magic handshake over SMBUS [0] first took something like 15 minutes, which was not good for those nines of availability.
[0] You can find my SMBUS host driver for exactly this purpose on the LKML archives. It was never merged, in part, because no one could ever get all the teams involved in the Xeon memory controller to reach any sort of agreement as to who owned the bus or how the OS was supposed to communicate without, say, defeating platform thermal management or causing the refresh interval to get out of sync with the DIMM temperature, thus causing corruption.
I’m suspicious that everything involved in Optane development was like this.
> There are very few applications that benefit from such low latency
Basically any RDBMS? MySQL and Postgres both benefit from high performance storage, but too many customers have moved into the cloud where you can’t get NVMe-like performance for durable storage for anything remotely close to a worthwhile price.
I feel like this is proving my point. You can’t read “Optane” and have any real idea of what you’re buying.
Also… were those weird hybrid SSDs even implemented by actual hardware, or were they part of the giant series of massive kludges in the “Rapid Storage” family where some secret sauce in the PCIe host lied to the OS about what was actually connected so an Intel driver could replace the OS’s native storage driver (NVMe, AHCI, or perhaps something worse depending on generation) to implement all the actual logic in software?
It didn’t help Intel that some major storage companies started selling very, very nice flash SSDs in the mean time.
> Optane memory
Which “Optane memory”? The NVMe product always worked on non-Intel. The NVDIMM products that I played with only ever worked on a very small set of rather specialized Intel platforms. I bet AMD could have supported them about as easily as Intel, and Intel barely ever managed to support them.
Intel did a spectacularly poor job with the ecosystem around the memory cells. They made two plays, and both were flops.
1. “Optane” in DIMM form factor. This targeted (I think) two markets. First, use as slower but cheaper and higher density volatile RAM. There was actual demand — various caching workloads, for example, wanted hundreds of GB or even multiple TB in one server, and Optane was a route to get there. But the machines and DIMMs never really became available. Then there was the idea of using Optane DIMMs as persistent storage. This was always tricky because the DDR interface wasn’t meant for this, and Intel also seems to have a lot of legacy tech in the way (their caching system and memory controller) and, for whatever reason, they seem to be barely capable of improving their own technology. They had multiple serious false starts in the space (a power-supply-early-warning scheme using NMI or MCE to idle the system, a horrible platform-specific register to poke to ask the memory controller to kindly flush itself, and the stillborn PCOMMIT instruction).
2. Very nice NVMe devices. I think this was more of a failure of marketing. If they had marketed a line of SSDs that, coupled with an appropriate filesystem, could give 99% fsync latency of 5 microseconds and they had marketed this, I bet people would have paid. But they did nothing of the sort — instead they just threw around the term “Optane” inconsistently.
These days one could build a PCM-backed CXL-connected memory mapped drive, and the performance might be awesome. Heck, I bet it wouldn’t be too hard to get a GPU to stream weights directly off such a device at NVLink-like speeds. Maybe Intel should try it.