Nothing you've said about reasoning here is exclusive to LLMs. Human reasoning is also never guaranteed to be deterministic, excluding most correct solutions. As OP says, they may not be reasoning under the hood, but if the effect is the same when you use them as a tool, does it matter?
I'm not sure I'm up to date on the latest diffusion work, but I'm genuinely curious how you see diffusion models potentially making LLMs more deterministic. These models usually work by sampling too, and it seems like the transformer architecture is better suited to long-context problems than diffusion.
Switzerland's draw is the money. It's true that a significant proportion of the population is foreign-born, but the whole country is smaller than some tier 2 cities in China, and many foreigners do not stay long term. If China paid Swiss-level salaries, more people would go for sure, but the country is so big that, in relative terms, I'm not sure the proportion would change significantly.