7z
From Wikipedia, the free encyclopedia
Filename extension | .7z |
---|---|
Internet media type |
|
Developed by | Igor Pavlov |
Type of format | Data compression |
7z is a compressed archive file format that supports several different data compression, encryption and pre-processing filters. The 7z format initially appeared as implemented by the 7-Zip archiver. The 7-Zip program is publicly available under the terms of the GNU Lesser General Public License. The LZMA SDK 4.62 was placed in the public domain in December 2008. The latest version of 7-Zip and LZMA SDK is version 4.65.
The MIME type of 7z is application/x-7z-compressed
.
Contents |
[edit] Features and enhancements
The 7z format provides the following main features:
- Open, modular architecture which allows any compression, conversion, or encryption method to be stacked.
- High compression ratios (depending on the compression method used)
- Strong Rijndael/AES-256 encryption.
- Large file support (up to approximately 16 exabytes).
- Unicode file names
- Support for solid compression, where multiple files of like type are compressed within a single stream, in order to exploit the combined redundancy inherent in similar files.
- Compression of archive headers.
The format's open architecture allows additional future compression methods to be added to the standard.
[edit] Compression method filters
The following compression methods are currently defined:
- LZMA – A variation of the LZ77 algorithm, using a sliding dictionary up to 1 GB in length for duplicate string elimination. The LZ stage is followed by entropy coding using a Markov chain based range coder and Patricia trees.
- Bzip2 – The standard Burrows-Wheeler transform algorithm. Bzip2 uses two reversible transformations; BWT, then Move to front with Huffman coding for symbol reduction (the actual compression element).
- PPMD – Dmitry Shkarin's 2002 PPMdH (PPMII/cPPMII) with small changes: PPMII is an improved version of the 1984 PPM compression algorithm (prediction by partial matching).
- DEFLATE – Standard algorithm based on 32 kB LZ77 (LZSS actually) and Huffman coding. Deflate is found in several file formats including ZIP, gzip, PNG and PDF. 7-Zip contains a from-scratch DEFLATE encoder that frequently beats the de facto standard zlib version in compression size, but at the expense of CPU usage.
A suite of recompression tools called AdvanceCOMP contains a copy of the DEFLATE encoder from the 7-Zip implementation; these utilities can often be used to further compress the size of existing gzip, ZIP, PNG, or MNG files.
[edit] Pre-processing filters (for executable files)
The LZMA SDK comes with the BCJ / BCJ2 preprocessor included, so that later stages are able to achieve greater compression: For x86, ARM, PowerPC (PPC), IA64 and ARM Thumb processors, jump targets are normalized before compression by changing relative position into absolute values. For x86, this means that near jumps, calls and conditional jumps (but not short jumps and conditional jumps) are converted from the machine language "jump 1655 bytes backwards" style notation to normalized "jump to address 5554" style notation.
- BCJ - Converter for 32-bit x86 executables. Normalise target addresses of near jumps and calls from relative distances to absolute destinations.
- BCJ2 - Pre-processor for 32-bit x86 executables. BCJ2 is an improvement on BCJ, adding additional x86 jump/call instruction processing. Near jump, near call, conditional near jump targets are split out and compressed separately in another stream.
Similar executable pre-processing technology is included in other software; the RAR compressor features displacement compression for 32-bit x86 executables and IA64 Itanium executables, and the UPX runtime executable file compressor includes support for working with 16 bit values within DOS binary files.
[edit] Encryption
The 7z format supports encryption with the AES algorithm with a 256-bit key. The key is generated from a user-supplied passphrase using an algorithm based on the SHA-256 hash algorithm. The SHA-256 is executed 218 (256K) times[1] which causes a significant delay on slow PCs before compression or extraction starts. This technique is called key strengthening and is used to make a brute-force search for the passphrase more difficult. The 7z format provides the option to encrypt the filenames of a 7z archive.
[edit] Limitations
The 7z format does not store UNIX owner/group permissions, and hence can be inappropriate for backup/archival purposes. A workaround is to convert data to a tar bitstream before compressing with 7z.
Unlike WinRAR, 7z cannot extract "broken files" - that is (for example) if one has the first segment of a series of 7z files, 7z cannot give the start of the files within the archive - it must wait until all segments are downloaded. WinZip has the same limitation.
[edit] References
This article needs references that appear in reliable third-party publications. Primary sources or sources affiliated with the subject are generally not sufficient for a Wikipedia article. Please add more appropriate citations from reliable sources. (April 2007) |
[edit] See also
[edit] External links
- 7z Format — General description about the 7z archive format.
|