I skimmed a couple of the referenced papers to get an idea of what optimizations LMCache is doing:
* KV cache compression - compressing the bytes of the KV cache by exploiting patterns in its contents, with dynamically adjustable levels of compression
* KV cache blending - concatenating the KV caches of multiple reused prompts with minimal KV cache recomputation. This targets use cases like RAG, where it's more performant than the standard lossless KV cache prefix optimization and gives better results than naively concatenating the KV caches of the reused prompts
These optimizations are pretty cool and different from the standard KV cache optimizations. Calling it lossless in the title seems misleading though, since neither of these looks strictly lossless.
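
To make the "lossless" point concrete, here's a rough sketch of what any quantization-style compression does to cached values. This is not LMCache's actual codec (I only skimmed the papers); the uniform quantizer, the function names, and the random "KV slice" are all my own stand-ins, and I'm assuming the compression levels map to something like bit widths. The point is just that the round trip doesn't give back the original values, and the error grows as the compression level goes up.

```python
import random


def quantize(values, bits):
    """Uniformly quantize floats to `bits`-bit integer codes (lossy).

    Hypothetical stand-in for a KV cache codec, not LMCache's real scheme.
    """
    lo, hi = min(values), max(values)
    levels = (1 << bits) - 1
    # Fall back to 1.0 if all values are identical, to avoid dividing by zero.
    scale = (hi - lo) / levels or 1.0
    codes = [round((v - lo) / scale) for v in values]
    return codes, lo, scale


def dequantize(codes, lo, scale):
    """Map integer codes back to approximate float values."""
    return [lo + c * scale for c in codes]


def max_error(values, bits):
    """Largest absolute round-trip error at the given bit width."""
    codes, lo, scale = quantize(values, bits)
    restored = dequantize(codes, lo, scale)
    return max(abs(a - b) for a, b in zip(values, restored))


# Made-up data standing in for one slice of a KV cache tensor.
random.seed(0)
kv_slice = [random.gauss(0.0, 1.0) for _ in range(1024)]

# Fewer bits means smaller storage but a larger reconstruction error.
for bits in (8, 4, 2):
    print(f"{bits}-bit: max round-trip error = {max_error(kv_slice, bits):.4f}")
```

A truly lossless scheme (e.g. entropy coding alone over the raw bytes) would show zero round-trip error here, which is the distinction the title glosses over.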