Prague Stringology Conference 2017

Markus Mauer, Timo Beller and Enno Ohlebusch

A Lempel-Ziv-style Compression Method for Repetitive Texts

Abstract:
In this paper, we present a compression algorithm based on finding repetitions in the file to be compressed. Our approach is a variant of longest-first-substitution compression that uses the suffix array and the LCP-array to find and encode long recurring substrings. We show that our algorithm achieves very good compression ratios for repetitive texts.
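
To illustrate how a suffix array and an LCP-array together expose long recurring substrings, here is a minimal Python sketch. It is not the algorithm of the paper: the function names (`build_suffix_array`, `build_lcp_array`, `longest_repeat`) are hypothetical, the suffix array is built by naive sorting for brevity, and only the "find the longest repeat" step is shown, with no substitution or encoding.

```python
def build_suffix_array(text):
    # Naive construction (assumption: fine for short inputs only):
    # sort the suffix start positions by the suffixes themselves.
    return sorted(range(len(text)), key=lambda i: text[i:])


def build_lcp_array(text, sa):
    # Kasai-style computation: lcp[r] is the length of the longest common
    # prefix of the suffixes at ranks r-1 and r; lcp[0] is 0 by convention.
    n = len(text)
    rank = [0] * n
    for r, start in enumerate(sa):
        rank[start] = r
    lcp = [0] * n
    h = 0
    for i in range(n):
        if rank[i] > 0:
            j = sa[rank[i] - 1]
            while i + h < n and j + h < n and text[i + h] == text[j + h]:
                h += 1
            lcp[rank[i]] = h
            if h > 0:
                h -= 1
        else:
            h = 0
    return lcp


def longest_repeat(text):
    # Returns (substring, sorted start positions) for one longest substring
    # that occurs at least twice; occurrences may overlap.
    if len(text) < 2:
        return "", []
    sa = build_suffix_array(text)
    lcp = build_lcp_array(text, sa)
    best = max(lcp)
    if best == 0:
        return "", []
    r = lcp.index(best)
    positions = [sa[r - 1], sa[r]]
    # Adjacent ranks that share the same maximal LCP are further occurrences.
    r += 1
    while r < len(text) and lcp[r] == best:
        positions.append(sa[r])
        r += 1
    start = positions[0]
    return text[start:start + best], sorted(positions)


print(longest_repeat("banana"))  # ('ana', [1, 3])
```

The maximum entry of the LCP-array gives the length of the longest repeat, and the suffix-array entries at the neighbouring ranks give where it occurs. A longest-first scheme would then have to deal with overlapping occurrences (as in the `banana` example) before replacing them, which this sketch deliberately leaves out.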
