Everyone who's used Opus knows it's better than the others in a way that isn't captured by the benchmarks. I would describe it as taste.
Lots of models get really close on benchmarks, but benchmarks only tell us how good they are at solving a defined problem. Opus is far better at solving ill-defined ones.
> Can't we have a system that is optimized for the notes that are actually played in a song rather than the hypothetical set? And what if the optimization is done per note rather than over an entire song?
You can. It’s called adaptive tuning, or dynamic just intonation, and it happens naturally for singers with no accompanying instruments.
It’s impractical on a real instrument, but there’s a commercial synthesiser implementation called Hermode Tuning.
You’re trading one problem for another, though. No matter how you do this, you will either have occasional mis-tuning or else your notes will drift.
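To make the drift concrete, here is a minimal sketch of the classic "comma pump". The progression is my own choice of illustration, not something from the question: go up four pure fifths (C, G, D, A, E), come back down by a pure major third, then drop two octaves. If every interval is tuned pure, you don't land on the C you started from.

```python
from fractions import Fraction
import math

# Pure (just) interval ratios.
FIFTH = Fraction(3, 2)
MAJOR_THIRD = Fraction(5, 4)
OCTAVE = Fraction(2, 1)


def cents(ratio):
    """Size of a frequency ratio in cents (1200 cents per octave)."""
    return 1200 * math.log2(float(ratio))


def comma_pump_drift():
    """Return the ratio between the final and the starting pitch.

    Path (illustrative assumption): up four pure fifths, down one pure
    major third, down two octaves. In equal temperament this returns
    exactly to the starting note.
    """
    pitch = Fraction(1)
    for _ in range(4):
        pitch *= FIFTH        # C -> G -> D -> A -> E
    pitch /= MAJOR_THIRD      # E -> C, tuned as a pure third
    pitch /= OCTAVE ** 2      # bring it back to the starting register
    return pitch


drift = comma_pump_drift()
print(drift, round(cents(drift), 2))  # 81/80 21.51
```

That leftover 81/80 is the syntonic comma, about a fifth of a semitone. An adaptive scheme either lets the pitch wander by that much each time around, or it nudges some interval away from pure to absorb it, which is the occasional mis-tuning.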