Preprocessing text to make it more compressible

The Endeavour 2024-10-15

Summary:

Repetitive text compresses efficiently. Text like the lyrics to Jingle Bells ought to compress more efficiently than ordinary prose, assuming the compression algorithm can exploit the repetition. The idea of the Burrows-Wheeler transform is to permute text before compressing it. The hope is that the permutation will make the repetition in the text easier […]

The post Preprocessing text to make it more compressible first appeared on John D. Cook.
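
To make the permutation mentioned in the summary concrete, here is a minimal sketch of a Burrows-Wheeler transform in Python. It is not taken from the linked post: it uses the naive textbook construction (sort all rotations of the text, keep the last column), and the function names `bwt` and `inverse_bwt` and the choice of an end-of-text sentinel character are assumptions made for illustration. Practical implementations use suffix arrays rather than materializing every rotation.

```python
def bwt(text: str, sentinel: str = "\0") -> str:
    """Naive Burrows-Wheeler transform: last column of the sorted rotations.

    The sentinel (an assumption of this sketch) marks the end of the text
    so that the transform can be inverted without storing an index.
    """
    if sentinel in text:
        raise ValueError("sentinel must not occur in the input text")
    s = text + sentinel
    # Build every rotation of s and sort them lexicographically.
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    # The transform is the last character of each sorted rotation.
    return "".join(rotation[-1] for rotation in rotations)


def inverse_bwt(transformed: str, sentinel: str = "\0") -> str:
    """Invert bwt() by repeatedly prepending the transformed column and sorting."""
    table = [""] * len(transformed)
    for _ in range(len(transformed)):
        # Prepending the last column and re-sorting rebuilds the sorted
        # rotations one column at a time.
        table = sorted(c + row for c, row in zip(transformed, table))
    # The original text is the rotation that ends with the sentinel.
    original = next(row for row in table if row.endswith(sentinel))
    return original[:-1]


if __name__ == "__main__":
    # "$" is used as a visible sentinel here; it sorts before the letters.
    print(bwt("banana", "$"))  # annb$aa
    line = "jingle bells, jingle bells, jingle all the way"
    assert inverse_bwt(bwt(line)) == line
```

The transform only rearranges characters, so it does not shrink the text by itself. Any gain comes from a later compression stage that benefits from identical characters being grouped together.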

Link:

https://www.johndcook.com/blog/2024/10/15/burrows-wheeler/

From feeds:

Statistics and Visualization » The Endeavour

Tags:

computing

Authors:

John

Date tagged:

10/15/2024, 11:39

Date published:

10/15/2024, 09:44