Tweaking MySQL to Improve ntopng Flows Storage Space Usage

Edit: MySQL tables engine has been migrated to MyISAM in ntopng 2.4 so this post only applies for versions <= 2.3.

This is the first post that tries to give hints on how to tweak MySQL settings to better accomodate flows exported by ntopng. In particular, in this post it is discussed how to improve disk space usage. Hopefully, a series of posts with tips and tricks on how to improve responsiveness and reduce query time will be published in the future.

ntopng MySQL flow export can be enabled using the -F command line option. Once enabled, it is possible to chose, from the web UI preferences panel, the number of days exported flows will be retained in MySQL. By default this value is set to 30 days. Users may chose to adjust this setting on the basis of their disk space availability and quantity of exported flows.

However, if MySQL is not configured properly, disk space usage may grow indefinitely even if old flows are constantly deleted. Indeed, flow deletion does not yield automatic disk space release, neither it ensures that newly arriving flows will take the place of the older, deleted flows.

innodb_file_per_table

To make sure disk space can be reclaimed and that new flows will take the place of deleted flows, innodb_file_per_table must be enabled.

To check whether innodb_file_per_table is enabled one can run the following command in a mysql shell

Please note that enabling innodb_file_per_table will not have effect on the already existing tables. Existing tables will need to be re-created using ALTER TABLE <table_name> ENGINE=InnoDB.

innodb_file_per_table is enabled by default on MySQL server >= 5.6.

Reclaiming Disk Space

Disk space can be reclaimed — provided that innodb_file_per_table is enabled — by running OPTIMIZE TABLE on the tables used by ntopng, namely, flowsv4 and flowsv6. OPTIMIZE TABLE will create, for each table it is run on, a new identical empty table. Then it will copy, row by row, data from the old table to the new one. In this process a new .ibd tablespace is created and the space is reclaimed. Optimizing the table is costly both in terms of time (a new table is created out of the old one) and in term of space usage (the new table needs to be fully created before the old one can be deleted). Therefore, optimizing a table is something that should be planned by — and agreed with — the DBA.

Reusing Disk Space

Disk space occupied by deleted flows is re-used automatically by newely arriving flows if using innodb_file_per_table. However, it should be noted that this may lead to fragmentation issues. Running an OPTIMIZE TABLE periodically will re-pack the flows in the most efficient way.

Closing Remarks

To improve ntopng flows MySQL space usage it is strongly recommended to enable innodb_file_per_table. However, even though new records will take the place of the deleted ones, it should be noted that disk space is not reclaimed automatically. This means that, for example, in an environment with an average of 1 million flows dumped to MySQL every week, setting a 7-days retention period will yield an approximately constant MySQL disk space usage — that is, the space required to accomodate 1 million flows. If, however, 2 million flows will be generated in a particular week, then the space will grow and will not be reclaimed automatically. From that point on, disk space usage will be large enough to accomodate 2 million flows even if an average of 1 million flows will be received in future weeks. Reclaiming disk space requires an OPTIMIZE TABLE on tables flowsv4 and flowvs6.