Seems like GPT non-determinism is mostly from sparse MoE.
Imagine if you use a DeepMind approach of tree traversal for the most promising paths, it can generate deterministic LLM responses.