Man page - bzz(1)

Packages contains this manual

Manual

BZZ

NAME
SYNOPSIS
Encoding:
Decoding:
DESCRIPTION
OPTIONS
ALGORITHMS
PERFORMANCE
CREDITS
SEE ALSO

NAME

bzz - DjVu general purpose compression utility.

SYNOPSIS

Encoding:

bzz -e [blocksize] inputfile outputfile

Decoding:

bzz -d inputfile outputfile

DESCRIPTION

The first form of the command line (option -e ) compresses the data from file inputfile and writes the compressed data into outputfile . The second form of the command line (option -d ) decompressed file inputfile and writes the output to outputfile .

OPTIONS

-d

Decoding mode.

-e [blocksize]

Encoding mode. The optional argument blocksize specifies the size of the input file blocks processed by the Burrows-Wheeler transform expressed in kilobytes. The default block sizes is 2048 KB. The maximal block size is 4096 KB. Specifying a larger block size usually produces higher compression ratios and increases the memory requirements of both the encoder and decoder. It is useless to specify a block size that is larger than the input file.

ALGORITHMS

The Burrows-Wheeler transform is performed using a combination of the Karp-Miller-Rosenberg and the Bentley-Sedgewick algorithms. This is comparable to (Sadakane, DCC 98) with a slightly more flexible ranking scheme. Symbols are then ordered according to a running estimate of their occurrence frequencies. The symbol ranks are then coded using a simple fixed tree and the ZP binary adaptive coder (Bottou, DCC 98).

The Burrows-Wheeler transform is also used in the well known compressor bzip2 . The originality of bzz is the use of the ZP adaptive coder. The adaptation noise can cost up to 5 percent in file size, but this penalty is usually offset by the benefits of adaptation.

PERFORMANCE

The following table shows comparative results (in bits per character) on the Canterbury Corpus ( http://corpus.canterbury.ac.nz ). The very good bzz performance on the spreadsheet file excl puts the weighted average ahead of much more sophisticated compressors such as fsmx .

Image grohtml-754087-1.png

Note that DjVu contributors have several entries in this table. Program compress was written some time ago by Joe Orost. Program ppmd is an improvement of the PPM-C method invented by Paul Howard.

CREDITS

Program bzz was written by LΓ©on Bottou <leonb@users.sourceforge.net> and was then improved by Andrei Erofeev <andrew_erofeev@yahoo.com>, Bill Riemers <docbill@sourceforge.net> and many others.

SEE ALSO

djvu (1), compress (1), gzip (1), bzip2 (1)