Setting up a deployment is one thing. But once things are live, I still struggle with the basics, like:
Zonal redundancy
DR setups
Observability across clusters

Curious how folks here think about Day 2. What's worked for you?