Database Administrators Stack Exchange is a question and answer site for database professionals who wish to improve their database skills and learn from others in the community. It's 100% free, no registration required.

In my MySQL cluster, the data usage of 4 data nodes is approximately 77%. I added a new data node online, and attempted to reorganize partition to redistribute data among all data nodes.

But before the query alter table customer_tbl reorganize partition; is successfully executed, the data usage of existing data nodes started increasing, and reached up to 95% before ERROR customer_tbl is FULL is encountered.

How can I bring a data node online and reorganize table successfully? My table has ~100 million rows.

Why the data usage of existing data nodes increased in the first place and what is the internal working of mysql cluster when it reorganizes partitions?

1 Answer
1

You're correct that the on-line repartitioning of data temporarily uses extra memory on the data nodes. In most cases the application data is split over many tables and so relatively little memory is needed when repartitioning each table. If the data is in one big table and you don't have the available RAM then there are a number of options...

Perform the data migration yourself in an iterative fashion. e.g.

create cdr2 (....) partition by ...;

loop
{
insert into cdr2 select * from cdr order by .... LIMIT 1000;
delete from cdr order by .... LIMIT 1000; } until cdr empty.

Some extra memory will be used, but shouldn't be double. Best effect would be if the order by .... could follow insert-order in some way.

Backup the data and restore it on another Cluster (with more memory), repartition and then revers the process.

Use mysqldump, empty the table, repartition it and then load the data back in

Convert the table to be disk-based (on-line operation), repartition and then convert it back into an in-memory table

(Temporarily) add extra RAM to the data nodes (this is an on-line operation).

Please tell me why it uses so much extra memory on existing data nodes ... is it due to the fact that it creates some kind of temporary tables when we reorganize partition and that temporary table make the data usage to become almost double??? .... this feature of reorganize partition in mysql cluster is of no use if we need to have so much ram as it makes data usage almost double and we add new node only when we exhaust memory in existing data nodes.
–
AnshulApr 3 '13 at 14:21