At Adobe Audience Manager, we run Cassandra at scale: over 500 nodes in tens of clusters that serve over 150B requests every day. To upgrade and test such infrastructure, the team developed an active/passive procedure that allows safe testing in production. So we upgraded everything at once: Cassandra, hardware, OS, JVM, VNodes, and a few more things. What could go wrong, after all? Well... ALL of them.