ZhongYu
added a comment - 28/Sep/14 13:57

Why not implement this feature?
We are having trouble deleting columns whose names are timestamps. There are too many columns to load into the client, and it is really slow to delete data by reading it first!
It took us 10 days to delete 1,000,000,000 timestamp-named columns across about 1,000 column families, each CF averaging 10,000 rows.
If we could delete columns by range, I think the operation above could finish in several minutes.
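For context, here is a sketch of what such a range delete might look like through the Thrift API, assuming Deletion's SlicePredicate honored a slice_range (currently the server only acts on explicit column_names in a Deletion, which is exactly what this ticket asks to change). The method name deleteRange and the QUORUM consistency level are illustrative:

{code:java}
import java.nio.ByteBuffer;
import java.util.Collections;
import java.util.List;
import java.util.Map;

import org.apache.cassandra.thrift.Cassandra;
import org.apache.cassandra.thrift.ConsistencyLevel;
import org.apache.cassandra.thrift.Deletion;
import org.apache.cassandra.thrift.Mutation;
import org.apache.cassandra.thrift.SlicePredicate;
import org.apache.cassandra.thrift.SliceRange;

public class RangeDeleteSketch {
    // Hypothetical: delete every column whose (timestamp-encoded) name falls
    // in [start, finish] for one row, without reading the columns first.
    static void deleteRange(Cassandra.Client client, ByteBuffer rowKey,
                            String cf, ByteBuffer start, ByteBuffer finish,
                            long timestamp) throws Exception {
        SliceRange range = new SliceRange(start, finish, /*reversed*/ false,
                                          Integer.MAX_VALUE);
        SlicePredicate predicate = new SlicePredicate().setSlice_range(range);

        // ASSUMPTION: the server would honor slice_range here; today it
        // only honors column_names inside a Deletion.
        Deletion deletion = new Deletion()
                .setTimestamp(timestamp)
                .setPredicate(predicate);

        Mutation mutation = new Mutation().setDeletion(deletion);
        Map<ByteBuffer, Map<String, List<Mutation>>> mutationMap =
                Collections.singletonMap(rowKey,
                        Collections.singletonMap(cf,
                                Collections.singletonList(mutation)));

        client.batch_mutate(mutationMap, ConsistencyLevel.QUORUM);
    }
}
{code}

One such call per row would replace the read-then-delete round trips entirely, which is why the speedup above seems plausible.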

Gary Dusbabek
added a comment - 10/May/10 16:41

On second thought, I think the best way to implement this will be to create mutate(key, mutation, cl).
One problem is that this [cluster-level] operation does not translate into a [node-level] RowMutation that can be forwarded to other nodes to be applied. (A slice materialized on a node that is out of sync may not include the same columns as a fully synced node.)
The Right Way would be to change everything so that when a coordinating node receives a message, it goes down a local path, where a RM is created and applied, and then a cluster path, where the original message is forwarded to other nodes.
Am I over-complicating this, or is materializing the RM into a list of column ops and sending it Good Enough?
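To make the trade-off concrete, here is a minimal sketch of the "materialize the RM into a list of column ops" option: the coordinator resolves the slice to concrete column names once, then forwards the same explicit tombstones to every replica. Store, Replica, and this RowMutation are illustrative stand-ins, not the real org.apache.cassandra.db classes:

{code:java}
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

public class MaterializedRangeDelete {

    /** Stand-in for a node's local read path. */
    interface Store {
        // Column names currently present in [start, finish] for this row.
        List<ByteBuffer> sliceColumnNames(ByteBuffer key, String cf,
                                          ByteBuffer start, ByteBuffer finish);
    }

    /** Stand-in for a replica we can forward a mutation to. */
    interface Replica {
        void apply(RowMutation rm);
    }

    /** Stand-in for the node-level RowMutation: a list of concrete column ops. */
    static class RowMutation {
        final ByteBuffer key;
        final String cf;
        final long timestamp;
        final List<ByteBuffer> deletedColumns = new ArrayList<>();

        RowMutation(ByteBuffer key, String cf, long timestamp) {
            this.key = key;
            this.cf = cf;
            this.timestamp = timestamp;
        }

        void deleteColumn(ByteBuffer name) {
            deletedColumns.add(name); // tombstone for this exact column
        }
    }

    // Coordinator path: materialize the slice once, then fan out the same RM,
    // so every replica deletes the identical set of columns.
    static void deleteSlice(Store local, List<Replica> replicas,
                            ByteBuffer key, String cf,
                            ByteBuffer start, ByteBuffer finish,
                            long timestamp) {
        RowMutation rm = new RowMutation(key, cf, timestamp);
        for (ByteBuffer name : local.sliceColumnNames(key, cf, start, finish)) {
            rm.deleteColumn(name); // concrete column op, no range left
        }
        for (Replica replica : replicas) {
            replica.apply(rm);
        }
    }
}
{code}

The weakness is the one noted above: if the coordinator's local view is stale, the materialized list can miss columns that a fully synced replica holds. Forwarding the original range message instead (the local-path/cluster-path split) avoids that, at the cost of restructuring how coordinators handle messages.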