I'm testing out Cassandra with a 4-node cluster spread across 2 DCs. On my stress keyspace, Keyspace1, the RF is DC1:1,DC2:1.
As part of my testing I had a VM, ip.83, wiped after several stress write runs to see how to handle a node failure. Everything worked as expected, and since I wasn't going to bring the node back for a few days, I figured I'd remove its token from the ring with nodetool removetoken. After the streaming across the remaining replica brought the other node in the same DC, ip.82, up to date with its new ownership load, OpsCenter reported that ip.82 was no longer connected, even though nodetool ring showed it as up. I took a look in the logs and noticed that, starting at the same timestamp that confirmed the nodetool removetoken, I was also receiving errors like this one:
2012-03-05 08:09:55-0500  ERROR: Corrupt endpoint range data for node 10.198.30.82 on keyspace Keyspace1: [TokenRange(end_token='1', start_token='0', endpoints=['10.198.30.82', '10.198.30.81']), TokenRange(end_token='85070591730234615865843651857942052864', start_token='1', endpoints=['10.198.30.82', '10.198.30.81']), TokenRange(end_token='0', start_token='85070591730234615865843651857942052864', endpoints=['10.198.30.82', '10.198.30.80'])]
It makes sense for ip.82 to now own the entire token range, as it was the only node operating in its DC. But is there some reason I'm just not thinking of for OpsCenter to treat 0-1, 1-85070591730234615865843651857942052864, and 85070591730234615865843651857942052864-0 as 3 separate key ranges, rather than 0-0? (Or in this case 1-1 I suppose, since that is its token.) The problem seemed to be confined to OpsCenter, as the cluster was still working, albeit slower, as expected with a 25% reduction in nodes.
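To check my reading of that log line, here is a minimal Python sketch. The ranges are copied from the error above; the helper names covers_full_ring and coalesce are my own and are not anything from OpsCenter or Cassandra. It only tests whether the three ranges chain end to start around the ring, and what is left if adjacent ranges with identical endpoint lists are merged.

```python
# Ranges copied from the OpsCenter error above: (start_token, end_token, endpoints).
ranges = [
    ("0", "1", ["10.198.30.82", "10.198.30.81"]),
    ("1", "85070591730234615865843651857942052864",
     ["10.198.30.82", "10.198.30.81"]),
    ("85070591730234615865843651857942052864", "0",
     ["10.198.30.82", "10.198.30.80"]),
]


def covers_full_ring(ranges):
    """True if each range ends where the next one starts, wrapping around."""
    if not ranges:
        return False
    following = ranges[1:] + ranges[:1]  # pair the last range with the first
    return all(end == nxt[0] for (_, end, _), nxt in zip(ranges, following))


def coalesce(ranges):
    """Merge adjacent ranges only when their endpoint lists are identical."""
    merged = []
    for start, end, endpoints in ranges:
        if merged and merged[-1][1] == start and merged[-1][2] == endpoints:
            # Extend the previous range instead of adding a new one.
            merged[-1] = (merged[-1][0], end, endpoints)
        else:
            merged.append((start, end, endpoints))
    return merged


print(covers_full_ring(ranges))
for start, end, endpoints in coalesce(ranges):
    print(start, "->", end, endpoints)
```

Under this naive merge the three ranges do chain into a full ring, but only the first two collapse into one. The third lists 10.198.30.80 rather than 10.198.30.81 as its second endpoint, so by this reading the data is at best two distinct ranges rather than a single 0-0. Whether that is what OpsCenter objects to, I can't tell from the log alone.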
I have since brought the downed node back up and put it in the ring with its previous token, and everything is now working again. I was just hoping for some clarification as to why this occurred. Is it simply that removetoken shouldn't really be used except when replacing dead nodes?