I am using mariaDB(10.5.15) , I have a table with structure like
CREATE TABLE Persons (
id bigint AUTO_INCREMENT,
entityId varchar(50),
encrypted_file mediumtext,
created_at datetime,
updated_at datetime
);
Initially , I was encrypting and storing the xml file in encrypted_file column , on an average the file size was around 120KB , so reduce the file size I started to compress it and store , the average size of compressed file was around 10 KB. To store in the DB I created a new column encrypted_file_v2 and start storing it in the new column and for older column encrypted_file I started setting it as null, new schema looks like this
CREATE TABLE Persons (
id bigint AUTO_INCREMENT,
entityId varchar(50),
encrypted_file mediumtext,
created_at datetime,
updated_at datetime,
encrypted_file_v2 mediumtext
);
so it is storing all most same number of records per day before and after the change, but after making this changes what I observed that the disk storage utilisation has increased than before on the daily basis, I verified the size of the file being stored , it has reduced around 80-90%, but the overall disk utilisation has increased sql engine is innodb
not able to figure out why is this happening , any idea about this behaviour ?
I tried this in local but not able to figure out exact behaviour of the problem.
Actually, there are issues here.
First. do not store compressed or encrypted data in a
MEDIUMTEXT, you will lose data. Instead, useMEDIUMBLOB."Large"
TEXTsandBLOBsare stored "off-record -- in 16KB blocks separate from the rest of the columns, which live in 16KB blocks. A 20-byte pointer is used to access the text/blob from the main blocks.Freeing up space in InnoDB is a tricky matter. Mostly, any freed space is held onto until some other row (or off-record stuff) reuses the space. There is some attempt to coalesce 'small' blocks. The table size will never decrease, but may, after some time, be more compact.
Check the actual number of rows via
SELECT COUNT(*) FROM tbl. Get the data's BTree size fromSHOW TABLE STATUS LIKE 'tbl';. (Do not trust avg_row_size or num_rows.) Divide.After you have rebuilt the column, use
OPTIMIZE TABLEto clean up the horrible fragmentation you see. (Caution: this command will take a long time -- ot copies the table over.)Yes, your design is good. Compressing (and optionally encrypting) text and storing in some size
BLOBhelps shrink disk space and probably helps with performance.What kind of text do you have? Most text I have seen shrinks only to 1/3 of original size. Encryption is mostly 1:1.
Better yet, use column compression available since 10.3+. https://mariadb.com/kb/en/storage-engine-independent-column-compression/