Although TokuDB offers great compression, from what I can see it wastes a lot of space by allocating much more than it actually needs.
I'm using information_schema to get the sizes, as follows:
SELECT
    table_name AS `table`,
    table_dictionary_name AS dictionary_name,
    ROUND(bt_size_allocated/1024/1024, 2) AS size_allocated,
    ROUND(bt_size_in_use/1024/1024, 2) AS size_in_use
FROM
    information_schema.`TokuDB_fractal_tree_info`;
Here is the result:
+----------------+---------------------------+-----------------+-------------+
| table          | dictionary_name           | size_allocated  | size_in_use |
+----------------+---------------------------+-----------------+-------------+
| c_tokudb_table | key-XID_id                |          875.43 |      411.59 |
| c_tokudb_table | key-operationId_timeStamp |          913.38 |      459.06 |
| c_tokudb_table | key-time_stamp            |          737.36 |      338.67 |
| c_tokudb_table | main                      |         3217.93 |     1505.58 |
| c_tokudb_table | status                    |            0.04 |        0.00 |
+----------------+---------------------------+-----------------+-------------+
As you can see, size_allocated is basically double size_in_use. The files on disk are only a bit larger than size_allocated, so the information_schema reporting looks accurate.
I tried running OPTIMIZE TABLE a few times, but it doesn't help much; sometimes size_allocated even increases. The only thing that seems to work is ALTER TABLE tbl ENGINE=TokuDB, but that takes a lot of time because it completely rebuilds the table.
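Concretely, the statements in question look roughly like this (table name taken from the output above):

-- Tried several times; size_allocated barely shrinks and sometimes grows.
OPTIMIZE TABLE c_tokudb_table;

-- This does reclaim the space, but it rebuilds the whole table and takes a long time.
ALTER TABLE c_tokudb_table ENGINE=TokuDB;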
Does anyone know how to recover the unused space?
(Running TokuDB on Percona Server 5.6.27-76.0.)
The OPTIMIZE TABLE command should help flush out deleted rows and thus free up space, but this may also take a lot of time. You can also try adjusting the tokudb_cleaner_period and tokudb_cleaner_iterations settings to enable more frequent background cleaning of data.
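A rough sketch of checking and adjusting those settings follows; the values shown are examples only, not recommendations, and should be tuned for your workload:

-- Show the current background-cleaner settings.
SHOW GLOBAL VARIABLES LIKE 'tokudb_cleaner%';

-- Example values only: wake the cleaner thread frequently and let it
-- process more nodes per pass.
SET GLOBAL tokudb_cleaner_period = 1;
SET GLOBAL tokudb_cleaner_iterations = 10;

-- Then give OPTIMIZE another try and re-check TokuDB_fractal_tree_info.
OPTIMIZE TABLE c_tokudb_table;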