May 24, 2025 by Thibault Debatty
https://cylab.be/blog/420/laravel-model-property-compression
For one of our projects we built an analysis system that produces a lot of HTML reports. These reports are stored in a MySQL database so we can later review them in the web interface. The whole thing is implemented using Laravel. Thanks to Laravel queues and jobs the system runs smoothly, but the reports take up a non-negligible amount of DB space. Moreover, saving new reports is quite slow, probably because of the number and volume of reports to save, and probably also because the physical storage is an old/slow SSD. So I wanted to test whether compression might help…
First I wanted to extract some numbers from this table.
The average row size can be read from the output of:
show table status like 'reports';
The average size (in Bytes) of the html column is given by:
SELECT AVG(LENGTH(html)) AS average_length FROM reports;
So my reports are 1261 Bytes on average, of which 675 Bytes are consumed by the HTML column.
Also, the average report generation time (including INSERT into MySQL DB) is 915 ms (yes, almost 1 second).
MySQL has a feature called table compression which stores data in compressed form and automatically uncompresses it on reading.
https://dev.mysql.com/doc/refman/8.4/en/innodb-compression-background.html
However, this feature is specific to MySQL, which means our system could not be transferred to another database system.
Moreover, this feature is known to have some caveats, like the need to choose an appropriate KEY_BLOCK_SIZE parameter. Hence I chose to implement compression at the application level, in my Laravel code.
So, to reduce size on disk, the idea will be to compress the html field using the PHP function gzdeflate($data) before the model is saved to the database.
The Deflate algorithm is a simple data compression algorithm that combines the LZ77 algorithm with Huffman coding. LZ77 finds repeated strings in the data and replaces them with references to previous occurrences, while Huffman coding assigns shorter codes to more frequently occurring symbols to reduce the overall size of the data.
On common text, Deflate achieves a compression ratio of 4 to 6, meaning that the compressed data is 4 to 6 times shorter than the original text. However, this only holds for sufficiently long text.
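As a quick illustration (this snippet is mine, not taken from the measured system), you can check the ratio achieved by gzdeflate() on a piece of sample HTML:

// Illustrative sketch: compress a sample HTML fragment with Deflate
// and print the resulting compression ratio. The ratio depends heavily
// on how repetitive the input is.
$html = str_repeat('<tr><td>192.168.1.10</td><td>443/tcp</td><td>open</td></tr>', 20);

$compressed = gzdeflate($html);

printf(
    "original: %d bytes, compressed: %d bytes, ratio: %.1f\n",
    strlen($html),
    strlen($compressed),
    strlen($html) / strlen($compressed)
);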
Moreover, as I wanted to keep the migration simple, I did not want to change the type of the MySQL column (from TEXT to BLOB). So I will also re-encode the compressed data using base64_encode().
Base64 is a binary-to-text encoding scheme that transforms binary data into a sequence of printable ASCII characters.
Base64 encoding causes an overhead of 33–37% relative to the size of the original binary data.
So after compression and encoding, I should expect the average length of the html column to decrease to roughly:
675 Bytes / 4 × 1.33 ≈ 224 Bytes
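To get a feeling for the combined effect, here is a small sketch of mine (with a made-up sample string) that measures the base64 overhead on top of the compressed data and checks that the round trip is lossless:

// Illustrative sketch: compress, then base64-encode, and compare sizes.
$html = str_repeat('<p>The quick brown fox jumps over the lazy dog.</p>', 10);

$compressed = gzdeflate($html);          // binary, compressed
$encoded = base64_encode($compressed);   // text-safe, roughly 33% larger

printf(
    "compressed: %d bytes, base64 encoded: %d bytes (+%.0f%%)\n",
    strlen($compressed),
    strlen($encoded),
    100 * (strlen($encoded) / strlen($compressed) - 1)
);

// the transformation is lossless: decoding and inflating gives back the original
var_dump(gzinflate(base64_decode($encoded)) === $html); // bool(true)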
The first step was to create a migration to convert existing data:
use Illuminate\Support\Collection;
use Illuminate\Support\Facades\DB;

public function up()
{
    // process reports in chunks of 10, using chunkById so rows can be
    // updated safely while iterating
    // https://laravel.com/docs/12.x/queries#chunking-results
    DB::table('reports')->chunkById(10, function (Collection $reports) {
        foreach ($reports as $report) {
            $html = $report->html;
            // compress, then base64-encode so the value remains valid text
            $compressed = base64_encode(gzdeflate($html));
            DB::table('reports')
                ->where('id', $report->id)
                ->update(['html' => $compressed]);
        }
        // print a dot per chunk as a crude progress indicator
        echo ".";
    });
}
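This migration only converts in one direction. If you want to be able to roll it back, a down() method could reverse the conversion; this is a sketch I added, not part of the original migration:

public function down()
{
    // reverse the conversion: decode and decompress each report
    DB::table('reports')->chunkById(10, function (Collection $reports) {
        foreach ($reports as $report) {
            $html = gzinflate(base64_decode($report->html));
            DB::table('reports')
                ->where('id', $report->id)
                ->update(['html' => $html]);
        }
    });
}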
Then in the Laravel model, I added an accessor and a mutator that transparently convert the data when it is saved to the database and when it is loaded from the database:
public function getHtmlAttribute(string $value): string
{
    // accessor: decode and decompress when the attribute is read
    return gzinflate(base64_decode($value));
}

public function setHtmlAttribute(string $value): void
{
    // mutator: compress and encode before the attribute is stored
    $this->attributes['html'] = base64_encode(gzdeflate($value));
}
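With the accessor and mutator in place, compression is completely transparent to the rest of the application. A usage sketch, assuming the model is called Report (the actual model name is not shown in this post):

$report = new Report();
$report->html = '<h1>Scan results</h1><p>3 open ports found.</p>';
$report->save(); // the mutator stores the compressed, base64-encoded value

// the accessor transparently decodes and decompresses on read
echo $report->html;

// the raw column value is the compressed, base64-encoded string
echo $report->getRawOriginal('html');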
And that’s all…
After compression, the average html length decreased to 209 Bytes, which is slightly better than what I expected.
Taking into account the number of rows currently stored in the database, this saves roughly 500 MB of database storage.
Moreover, the average report generation time is now 864 ms, which shows that this data reduction also slightly reduces insertion time.
In this experiment, we explored the potential benefits of compressing HTML reports stored in a MySQL database using Laravel. By applying the Deflate algorithm and base64 encoding to the HTML data, we were able to reduce the average size of the reports from 675 Bytes to 209 Bytes, achieving a compression ratio of approximately 3.2.
Moreover, the compression and encoding process did have a limited but positive impact on the average report generation time, which decreased from 915 ms to 864 ms.
While the experiment showed promising results, it is essential to consider the trade-offs of compression and encoding: every read and write now costs some extra CPU time, and the compressed html column can no longer be searched or indexed directly in SQL.
Overall, this experiment demonstrates the potential of compression and encoding techniques to improve the efficiency and scalability of data storage systems, particularly in applications with large volumes of text-based data.
This blog post is licensed under CC BY-SA 4.0.