PIGZ(1)    General Commands Manual    PIGZ(1)
pigz, unpigz - compress or expand files
pigz [ -cdfhikKlLmMnNqrRtz0..9,11 ] [ -b blocksize ] [ -p threads ] [ -S suffix ] [ name ... ]
unpigz [ -cfhikKlLmMnNqrRtz ] [ -b blocksize ] [ -p threads ] [ -S suffix ] [ name ... ]
Pigz compresses using threads to make use of multiple processors and cores. The input is broken up into 128 KB chunks with each compressed in parallel. The individual check value for each chunk is also calculated in parallel. The compressed data is written in order to the output, and a combined check value is calculated from the individual check values.
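To illustrate how per-chunk check values can be merged into one, the following Python sketch combines Adler-32 values, the check used by the zlib format. The gzip format uses CRC-32, whose combination is more involved and is not shown. The helper names adler32_combine and combined_check are chosen for this illustration only, and the code is not taken from pigz.

```python
import zlib

BASE = 65521  # largest prime below 2**16, the Adler-32 modulus


def adler32_combine(adler1, adler2, len2):
    """Return the Adler-32 of A+B given adler32(A), adler32(B), and len(B)."""
    a1, b1 = adler1 & 0xFFFF, adler1 >> 16
    a2, b2 = adler2 & 0xFFFF, adler2 >> 16
    # The low half is 1 plus the byte sum, so drop the duplicated initial 1.
    a = (a1 + a2 - 1) % BASE
    # Each of the len2 running sums in the second part starts (a1 - 1) higher.
    b = (b1 + b2 + len2 * (a1 - 1)) % BASE
    return (b << 16) | a


def combined_check(data, chunk_size=128 * 1024):
    """Check each chunk on its own, then fold the values together in order."""
    total = zlib.adler32(b"")  # Adler-32 of empty input
    for start in range(0, len(data), chunk_size):
        chunk = data[start:start + chunk_size]
        total = adler32_combine(total, zlib.adler32(chunk), len(chunk))
    return total
```

Because each call to zlib.adler32(chunk) depends only on its own chunk, those calls could run in any order or in parallel. Only the cheap combining step has to follow the chunk order.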
The compressed data format generated is in the gzip, zlib, or single-entry zip format using the deflate compression method. The compression produces partial raw deflate streams which are concatenated by a single write thread and wrapped with the appropriate header and trailer, where the trailer contains the combined check value.
Each partial raw deflate stream is terminated by an empty stored block (using the Z_SYNC_FLUSH option of zlib), in order to end that partial bit stream at a byte boundary. That allows the partial streams to be concatenated simply as sequences of bytes. This adds a small overhead of four to five bytes to the output for each input chunk.
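The concatenation described above can be tried out with Python's zlib module. This is a minimal sketch, not pigz's implementation: the function names are hypothetical, each chunk is compressed without a preset dictionary, and the CRC-32 is computed over the whole input in one pass instead of being combined from per-chunk values.

```python
import gzip
import struct
import zlib


def deflate_chunk(chunk, last, level=6):
    """Compress one chunk as a raw deflate piece ending on a byte boundary."""
    comp = zlib.compressobj(level, zlib.DEFLATED, -15)  # -15 selects raw deflate
    body = comp.compress(chunk)
    # Z_SYNC_FLUSH appends the empty stored block. Only the last piece is
    # finished, so that it alone carries the final-block marker.
    return body + comp.flush(zlib.Z_FINISH if last else zlib.Z_SYNC_FLUSH)


def gzip_from_chunks(data, chunk_size=128 * 1024):
    """Concatenate per-chunk deflate pieces and add a gzip header and trailer."""
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)] or [b""]
    pieces = [deflate_chunk(c, i == len(chunks) - 1) for i, c in enumerate(chunks)]
    # Minimal 10-byte header: magic, deflate method, no flags, zero mtime,
    # no extra flags, unknown OS.
    header = b"\x1f\x8b\x08\x00" + b"\x00\x00\x00\x00" + b"\x00\xff"
    # Trailer: CRC-32 and input length modulo 2**32, both little-endian.
    trailer = struct.pack("<II", zlib.crc32(data), len(data) & 0xFFFFFFFF)
    return header + b"".join(pieces) + trailer


# Round trip through the standard gzip module as a sanity check.
sample = b"pigz " * 100000
assert gzip.decompress(gzip_from_chunks(sample)) == sample
```

Every piece except the last ends with the bytes 00 00 FF FF, which is the empty stored block that realigns the bit stream to a byte boundary.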
The default input block size is 128K, but can be changed with the -b option. The number of compress threads is set by default to the number of online processors, which can be changed using the -p option. Specifying -p 1 avoids the use of threads entirely.
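As a rough model of how the block size and thread count interact, the Python sketch below compresses blocks in a thread pool and joins the results in input order. The names parallel_deflate and _deflate are hypothetical, os.cpu_count() is only an approximation of "online processors", and the sketch omits the header, trailer, and preset dictionaries.

```python
import os
import zlib
from concurrent.futures import ThreadPoolExecutor


def _deflate(job):
    """Raw-deflate one block; only the last block is finished."""
    chunk, last = job
    comp = zlib.compressobj(6, zlib.DEFLATED, -15)
    return comp.compress(chunk) + comp.flush(zlib.Z_FINISH if last else zlib.Z_SYNC_FLUSH)


def parallel_deflate(data, block_size=128 * 1024, threads=None):
    """Compress blocks concurrently and return one raw deflate stream."""
    threads = threads or os.cpu_count() or 1
    chunks = [data[i:i + block_size] for i in range(0, len(data), block_size)] or [b""]
    jobs = [(c, i == len(chunks) - 1) for i, c in enumerate(chunks)]
    if threads == 1:
        # Mirror the single-thread case: no worker threads are created.
        pieces = [_deflate(job) for job in jobs]
    else:
        with ThreadPoolExecutor(max_workers=threads) as pool:
            # map() yields results in input order, whatever order they finish in.
            pieces = list(pool.map(_deflate, jobs))
    return b"".join(pieces)
```

In this sketch the output is the same for any thread count, since each block is compressed independently and the pieces are joined in input order.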
The input blocks, while compressed independently, have the last 32K of the previous block loaded as a preset dictionary to preserve the compression effectiveness of deflating in a single thread. This can be turned off using the -i or --independent option, so that the blocks can be decompressed independently for partial error recovery or for random access. This also inserts an extra empty block to flag independent blocks by prefacing each with the nine-byte sequence (in hex): 00 00 FF FF 00 00 00 FF FF.
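The effect of the preset dictionary can be seen in a small Python experiment. This is a sketch under stated assumptions: deflate_blocks is a hypothetical helper, the block size is shrunk to 16K so that a short test input spans several blocks, and the independent mode here simply omits the dictionary without reproducing the nine-byte marker.

```python
import random
import zlib


def deflate_blocks(data, block_size, independent):
    """Raw-deflate non-empty data block by block, optionally primed with history."""
    out = []
    for start in range(0, len(data), block_size):
        last = start + block_size >= len(data)
        history = data[max(0, start - 32768):start]  # up to 32K before this block
        if independent or not history:
            comp = zlib.compressobj(6, zlib.DEFLATED, -15)
        else:
            # Preset dictionary: the input that immediately precedes this block.
            comp = zlib.compressobj(6, zlib.DEFLATED, -15, zdict=history)
        out.append(comp.compress(data[start:start + block_size]))
        out.append(comp.flush(zlib.Z_FINISH if last else zlib.Z_SYNC_FLUSH))
    return b"".join(out)


# Test input: one 16K pseudo-random unit repeated, so that all of the
# redundancy lies across block boundaries and none lies within a block.
rng = random.Random(0)
unit = bytes(rng.getrandbits(8) for _ in range(16384))
data = unit * 8

linked = deflate_blocks(data, 16384, independent=False)
separate = deflate_blocks(data, 16384, independent=True)
print(len(linked), len(separate))
```

Both outputs decompress to the original data with a single inflate pass. On this input the linked form is much smaller, because each block can refer back to the matching data that precedes it.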
Decompression can't be parallelized, at least not without specially prepared deflate streams for that purpose. As a result, pigz uses a single thread (the main thread) for decompression, but will create three other threads for reading, writing, and check calculation, which can speed up decompression under some circumstances. These additional threads can be turned off by specifying a single thread ( -dp 1 or -tp 1 ).
All options on the command line are processed before any names are processed. If no names are provided on the command line, or if "-" is given as a name (but not after "--"), then the input is taken from stdin.
Compressed files can be restored to their original form using pigz -d or unpigz.
This software is provided 'as-is', without any express or implied warranty. In no event will the author be held liable for any damages arising from the use of this software.
Copyright (C) 2007-2021 Mark Adler <madler@alumni.caltech.edu>
February 6, 2021