Re: Data storage space unbalance issue

I think you answered your own question, sort of.

When you expand a cluster, it copies the appropriate rows to the new node(s) but doesn't automatically remove them from the old nodes. When you ran cleanup on datacenter1, it cleared out those old extra copies. I would suggest running a repair first for safety on datacenter2, then a "nodetool cleanup" on those hosts.
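As a rough sketch of that sequence (repair, then cleanup), something like the following could be run once per datacenter2 node. The host and keyspace names are placeholders rather than anything from your cluster, and the `run` and `reclaim_node` helpers are only illustrative. With DRY_RUN=1 it prints the commands without touching the node.

```shell
#!/bin/sh
# Sketch only: host and keyspace values are hypothetical placeholders.
# With DRY_RUN=1 the commands are printed rather than executed.

run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Repair first for safety, then drop the rows this node no longer owns.
# Cleanup is skipped if the repair step fails.
reclaim_node() {
    host="$1"
    keyspace="$2"
    run nodetool -h "$host" repair "$keyspace" &&
        run nodetool -h "$host" cleanup "$keyspace"
}
```

For example, with DRY_RUN=1 set, `reclaim_node dc2-node1 my_keyspace` only echoes the two nodetool commands. Cleanup rewrites SSTables, so it is probably best done one node at a time.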

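Also run "nodetool listsnapshots" to make sure you don't have any old snapshots sitting around taking up space ("nodetool snapshot" would create a new one). If you find any you no longer need, "nodetool clearsnapshot" removes them.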
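To see how much disk the snapshots actually occupy, one option is to measure the snapshots directories directly, as a cross-check on what "nodetool listsnapshots" reports. The sketch below assumes the default data directory /var/lib/cassandra/data, so check data_file_directories in cassandra.yaml for your real path. The `snapshot_usage` helper is just an illustrative name.

```shell
# Print the size (in KB) of every "snapshots" directory under a Cassandra
# data directory. The default path is an assumption, not taken from the
# thread; pass the real data directory as the first argument.
snapshot_usage() {
    data_dir="${1:-/var/lib/cassandra/data}"
    # -prune stops find from descending into each snapshots directory,
    # so every table contributes a single summary line.
    find "$data_dir" -type d -name snapshots -prune -exec du -sk {} +
}
```

Running `snapshot_usage` on a node from each datacenter and comparing the totals should show whether snapshots account for the difference.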

Most of my data has a short TTL (14 days). The gc_grace_seconds value for all tables is also 600 seconds.

I expect the two data centers to use the same amount of disk space, but datacenter2 is using more. It seems that the data in datacenter2 is rarely deleted. While the disk usage for datacenter1 remains constant, the disk usage for datacenter2 continues to grow.