7Z versus RAR: 2 compression formats
The occasion
I use the archiver 7-Zip from the context menu to compress/uncompress files or email files as attachments, and from its file manager, to browse and edit archives. Also, through the Windows task scheduler and a couple of batch files, I use it for scheduled backups of my documents, my mail boxes, my Program Files, and my Documents and Settings. For the backups I use the native format of 7-Zip, 7Z.
I ran some tests looking for the optimal balance between compression level and compression speed. Then, out of curiosity, I downloaded the trial version of WinRAR (www.rarlab.com), the undisputed king of strong compression among everyday archivers until 7-Zip appeared, and compressed the same sets using the RAR format. The results are presented below.
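A scheduled backup of this kind can be sketched in a few lines. The snippet below is only an illustration, not the batch files actually used: it assumes that the 7-Zip command-line executable is reachable as `7z`, that the GUI level names correspond to the `-mx` values shown (an assumption for the version tested here), and the function names are hypothetical. The switches `a`, `-t7z`, `-mx` and `-ms` are standard 7-Zip command-line options.

```python
import subprocess

# Assumed mapping from the GUI level names to 7-Zip's -mx values.
LEVELS = {"fast": 3, "normal": 5, "maximum": 7, "ultra": 9}


def build_backup_command(archive, sources, level="normal", solid=False, exe="7z"):
    """Return the argument list for a 7-Zip 'add to archive' run."""
    return [
        exe,
        "a",                                # command: add files to an archive
        "-t7z",                             # archive type: 7Z
        f"-mx={LEVELS[level]}",             # compression level
        f"-ms={'on' if solid else 'off'}",  # solid mode on or off
        archive,
        *sources,
    ]


def run_backup(archive, sources, **options):
    """Run the backup; raises CalledProcessError if 7-Zip reports a failure."""
    subprocess.run(build_backup_command(archive, sources, **options), check=True)


# Show the command without running it (the paths are made-up examples).
print(" ".join(build_backup_command(r"D:\Backup\docs.7z", [r"C:\Docs"])))
```

Turning solid mode off by default reflects the point made in the legend: archives that are updated often, such as backups, gain little from solid compression.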
Test files, test computer, and provisos
- Twenty MS Word documents (text, footnotes, endnotes)
- The directory Program Files from my computer
- The directory of GIMP 2.2.7: get it here, or get the current version of GIMP for Windows
- the ZIP download package of VLC Media Player 0.8.2
- the 5 Canterbury Corpora (The Canterbury Corpus, The Artificial Corpus, The Large Corpus, The Miscellaneous Corpus, The Calgary Corpus)
The first two comparisons were done for private use and cannot be replicated in a strict sense, since the sets contain non-distributable files and private data. The GIMP directory was chosen because it is used for the comparison on the home page of 7-Zip (UPDATE 2005-11: this comparison has been updated, with different test files), and the VLC package because it is easy to test without installing anything. The Canterbury Corpora are standard test files, most of which contain simple text. The most commonly used is The Canterbury Corpus. Here all five were bundled together.
The tests were run in Windows XP SP2 on an AMD Athlon 64 3000+ with 512MB of RAM. Deviations from default settings are noted under Level. For all tests the graphical interfaces of the two programs were used. Processor and memory loads were roughly the same throughout, but not exactly. Hence, compression times and speeds are useful only as a very rough indication. Compression ratio is not affected by any of this.
One last proviso: the performance of archivers depends on the type(s) of files. A Program Files directory gives a somewhat balanced mix in that respect, since it contains almost every type of file to be found on a Windows computer. However, no mixture can be representative of every situation.
The comparisons
UPDATE 2005-11-18: 7-Zip 4.30 is significantly faster than the version used in the comparisons, 4.26. The current version of WinRAR is 3.51, which fixes minor bugs.
20 WORD FILES
-------------
65,768,448 bytes (62.7 MBs)
00:02 copy time in minutes and seconds
----------------------------------------------------------------------------
Frm  Level*   Prog*   Time   Speed*  Bytes          MBs     Ratio*  Base*
----------------------------------------------------------------------------
7Z*  Fast     7-Zip   00:16  n/a     16.490.386     15.7    0.251   1.60
7Z   Fast s*  7-Zip   00:15  n/a     16.484.840     15.7    0.251   1.60
7Z   Nrml     7-Zip   01:13  n/a     11.411.397     10.8    0.174   1.11
7Z   Nrml s   7-Zip   01:21  n/a     11.370.195     10.8    0.173   1.11
7Z   Ultr     7-Zip   02:05  n/a     11.202.547     10.6    0.170   1.09
7Z   Ultr s   7-Zip   02:38  n/a     10.922.771     10.4    0.166   1.06
----------------------------------------------------------------------------
RAR  Fast     WinRAR  00:20  n/a     13.770.366     13.1    0.209   1.34
RAR  Fast s   WinRAR  00:24  n/a     13.684.559     13.1    0.208   1.33
RAR  Nrml     WinRAR  00:36  n/a     12.935.456     12.3    0.197   1.26
RAR  Nrml s   WinRAR  00:40  n/a     12.926.858     12.3    0.197   1.26
RAR  Best     WinRAR  00:33  n/a     10.279.705     9.8     0.156   1.00 sic
RAR  Best s   WinRAR  00:36  n/a     10.403.345     9.9     0.158   1.01 sic
----------------------------------------------------------------------------
ZIP* Nrml     7-Zip   00:31  n/a     16.262.500     15.5    0.247   1.58
----------------------------------------------------------------------------

PROGRAM FILES
-------------
185 directories
1.699 directories and subdirectories
20.392 files
1.156.573.239 bytes (1.08 GBs) (1.103.0 MBs)
04:10 copy time in minutes and seconds
----------------------------------------------------------------------------
Frm  Level*   Prog*   Time   Speed*  Bytes          MBs     Ratio*  Base*
----------------------------------------------------------------------------
7Z*  Fast     7-Zip   07:17  0.572   501.807.419    478.6   0.434   1.30
7Z   Fast s*  7-Zip   08:13  0.507   492.256.908    469.5   0.426   1.27
7Z   Nrml     7-Zip   21:01  0.198   459.255.702    438.0   0.397   1.19
7Z   Nrml s   7-Zip   21:35  0.193   413.624.371    394.5   0.358   1.07
7Z   Ultr     7-Zip   40:47  0.102   452.190.182    431.2   0.391   1.17
7Z   Ultr s   7-Zip   37:20  0.112   386.853.358    368.9   0.334   1.00
----------------------------------------------------------------------------
RAR  Fast     WinRAR  07:35  0.594   511.109.667    487.4   0.442   1.32
RAR  Fast s   WinRAR  10:31  0.396   457.234.970    436.1   0.395   1.18
RAR  Nrml     WinRAR  10:50  0.385   482.509.502    460.2   0.417   1.25
RAR  Nrml s   WinRAR  13:09  0.317   432.044.938    412.0   0.374   1.12
RAR  Best     WinRAR  11:48  0.353   475.307.597    453.2   0.411   1.23
RAR  Best s   WinRAR  17:44  0.235   427.653.700    407.8   0.370   1.11
----------------------------------------------------------------------------
ZIP* Nrml     7-Zip   08:58  0.465   540.226.985    515.2   0.467   1.40
----------------------------------------------------------------------------

PROGRAM FILES Twice*
-------------
370 directories
3.398 directories and subdirectories
40.784 files
2.313.146.478 bytes (2.15 GBs) (2.206.0 MBs)
08:20 copy time in minutes and seconds
----------------------------------------------------------------------------
Frm  Level*   Prog*   Time   Speed*  Bytes          MBs     Ratio*  Base*
----------------------------------------------------------------------------
7Z   Ultr     7-Zip   82:22  0.101   904.155.684    862.3   0.391   2.22
7Z   Ultr s   7-Zip   69:05  0.121   406.769.031    387.9   0.179   1.00
----------------------------------------------------------------------------
RAR  Best     WinRAR  26:40  0.313   950.791.899    906.7   0.411   2.34
RAR  Best s   WinRAR  32:14  0.259   553.315.531    527.7   0.239   1.36
----------------------------------------------------------------------------
ZIP* Nrml     7-Zip   15:21  0.543   1.080.807.404  1030.7  0.467   2.66
----------------------------------------------------------------------------

GIMP 2.2.7
-------------
134 directories and subdirectories
1.249 files
31.450.853 bytes (29.9 MBs)
00:06 copy time in minutes and seconds
----------------------------------------------------------------------------
Frm  Level*   Prog*   Time   Speed*  Bytes          MBs     Ratio*  Base*
----------------------------------------------------------------------------
7Z*  Fast     7-Zip   00:09  0.667   11.128.367     10.6    0.354   1.53
7Z   Fast s*  7-Zip   00:13  0.416   10.421.201     9.9     0.331   1.43
7Z   Nrml     7-Zip   00:31  0.194   10.074.002     9.6     0.320   1.38
7Z   Nrml s   7-Zip   00:31  0.194   7.632.331      7.3     0.243   1.05
7Z   Ultr     7-Zip   01:40  0.060   9.968.643      9.5     0.317   1.37
7Z   Ultr s   7-Zip   00:54  0.111   7.287.683      6.9     0.232   1.00
----------------------------------------------------------------------------
RAR  Fast     WinRAR  00:10  0.600   11.901.587     11.3    0.378   1.63
RAR  Fast s   WinRAR  00:11  0.545   9.354.289      8.9     0.297   1.28
RAR  Nrml     WinRAR  00:14  0.429   10.905.047     10.3    0.347   1.50
RAR  Nrml s   WinRAR  00:17  0.353   8.255.484      7.9     0.262   1.13
RAR  Best     WinRAR  00:18  0.333   10.526.165     10.0    0.335   1.44
RAR  Best s   WinRAR  00:22  0.273   8.105.728      7.7     0.258   1.11
----------------------------------------------------------------------------
ZIP* Nrml     7-Zip   00:11  0.545   12.460.383     11.8    0.396   1.71
----------------------------------------------------------------------------

VLC 0.8.2
-------------
35 directories and subdirectories
233 files
32.443.046 bytes (30.9 MBs)
00:02 copy time in minutes and seconds
----------------------------------------------------------------------------
Frm  Level*   Prog*   Time   Speed*  Bytes          MBs     Ratio*  Base*
----------------------------------------------------------------------------
7Z   Fast     7-Zip   00:10  n/a     12.158.880     11.5    0.375   2.03
7Z   Fast s*  7-Zip   00:10  n/a     11.881.356     11.3    0.366   1.99
7Z   Nrml     7-Zip   00:31  n/a     10.929.763     10.4    0.337   1.83
7Z   Nrml s   7-Zip   00:32  n/a     10.176.448     9.7     0.317   1.70
7Z   Ultr     7-Zip   00:53  n/a     10.602.434     10.1    0.327   1.77
7Z   Ultr s   7-Zip   00:48  n/a     5.980.455      5.7     0.184   1.00
----------------------------------------------------------------------------
RAR  Fast     WinRAR  00:12  n/a     12.965.235     12.3    0.400   2.17
RAR  Fast s   WinRAR  00:13  n/a     11.998.676     11.4    0.370   2.01
RAR  Nrml     WinRAR  00:18  n/a     11.732.822     11.1    0.362   1.96
RAR  Nrml s   WinRAR  00:21  n/a     10.780.672     10.2    0.332   1.80
RAR  Best     WinRAR  00:18  n/a     11.624.763     11.0    0.358   1.94
RAR  Best s   WinRAR  00:28  n/a     10.707.375     10.2    0.330   1.79
----------------------------------------------------------------------------
ZIP* Nrml     7-Zip   00:11  n/a     13.793.441     13.1    0.425   2.31
----------------------------------------------------------------------------

CANTERBURY
-------------
37 files
18.521.760 bytes (17.6 MBs)
00:00 copy time in minutes and seconds
----------------------------------------------------------------------------
Frm  Level*   Prog*   Time   Speed*  Bytes          MBs     Ratio*  Base*
----------------------------------------------------------------------------
7Z   Fast     7-Zip   00:04  n/a     5.535.269      5.3     0.299   1.38
7Z   Fast s*  7-Zip   00:04  n/a     5.526.030      5.3     0.298   1.37
7Z   Nrml     7-Zip   00:27  n/a     4.478.925      4.3     0.242   1.11
7Z   Nrml s   7-Zip   00:28  n/a     4.384.949      4.2     0.237   1.09
7Z   Ultr     7-Zip   00:39  n/a     4.453.814      4.2     0.240   1.11
7Z   Ultr s   7-Zip   00:41  n/a     4.337.743      4.1     0.234   1.08
----------------------------------------------------------------------------
RAR  Fast     WinRAR  00:06  n/a     4.997.549      4.8     0.270   1.24
RAR  Fast s   WinRAR  00:06  n/a     4.907.344      4.7     0.265   1.22
RAR  Nrml     WinRAR  00:09  n/a     4.854.052      4.6     0.262   1.21
RAR  Nrml s   WinRAR  00:09  n/a     4.762.674      4.5     0.257   1.18
RAR  Best     WinRAR  00:09  n/a     4.024.156      3.8     0.217   1.00
RAR  Best s   WinRAR  00:09  n/a     4.022.342      3.8     0.217   1.00
----------------------------------------------------------------------------
ZIP* Nrml     7-Zip   00:09  n/a     5.292.317      5.0     0.286   1.32
----------------------------------------------------------------------------

CANTERBURY Twice*
-------------
74 files
37.043.520 bytes (35.3 MBs)
00:01 copy time in minutes and seconds
----------------------------------------------------------------------------
Frm  Level*   Prog*   Time   Speed*  Bytes          MBs     Ratio*  Base*
----------------------------------------------------------------------------
7Z   Ultr     7-Zip   01:20  n/a     8.906.896      8.5     0.240   2.05
7Z   Ultr s   7-Zip   01:12  n/a     4.348.873      4.1     0.117   1.00
----------------------------------------------------------------------------
RAR  Best     WinRAR  00:18  n/a     8.048.581      7.7     0.217   1.85
RAR  Best s   WinRAR  00:18  n/a     7.137.553      6.8     0.193   1.64
----------------------------------------------------------------------------
ZIP* Nrml     7-Zip   00:17  n/a     10.585.204     10.0    0.286   2.43
----------------------------------------------------------------------------
Legend
- 7Z
- The default algorithm was used, LZMA. (PPMd was faster but less strong.)
- NOTE. If you enable the Multi-threading option, 7-Zip will use two threads for compressing. This can speed up compressing up to 70–80% on Multi-Processor systems or up to 25% on a Pentium 4 with Hyper-Threading.
- Base
- Compressed size divided by the best compressed size in the set.
- Level
- 7-Zip has 4 levels for 7Z: Fast, Normal (default), Maximum, and Ultra. Maximum was omitted. Ultra needs a huge amount of memory (369MB), and is not very practical without much RAM. Probably it would be faster in a computer with more than 512MB of RAM. Maximum is very close to Ultra in ratio and uses 84MB.
- WinRAR has 5 levels for RAR: Fastest, Fast, Normal (default), Good, Best. Fastest and Good were omitted. Best uses 64MB of RAM.
- Prog
- 7-Zip 4.26 was used for 7Z, and WinRAR 3.50 for RAR. 7-Zip was also used for ZIP, since it performs better than other archivers I tried: IZArc, TUGZip, and WiZ (frontend for the open-source Info-Zip). WinRAR is very fast at making ZIP archives, but produces larger sizes.
- Ratio
- The ratio of compressed to uncompressed size. The lower the better!
- s
- Solid compression, treats all files to be archived as one continuous stream. This usually results in a much smaller size. It is great for some purposes, e.g. to upload files for downloading or to send files as attachments, but not for others, e.g. archives that are accessed often, such as backups.
- At present, 7Z does not support in-place editing of solid archives: to delete a file in a solid archive or change something in it, you have to unpack the whole archive and then re-archive the contents.
- Files in RAR solid archives can be edited in place, but, in order to allow this, WinRAR analyzes the whole archive up to the location of the file, and then, to save changes, repacks automatically all files, taking as much time as the first time it made the archive.
- In WinRAR solid compression is off by default. In 7-Zip it is on. You can untick it, and the program will remember your preference. Strangely, solid compression is the default even for the CLI command u (Update). What is the purpose of solid compression for archives that are meant to be updated regularly and often?
- Speed
- The time to copy the set of files to the same partition divided by the time to compress it. The higher the better! I thought this would be more useful than an absolute value (KB/s), which depends on equipment. No speeds are given for sets where the copying time was close to zero for all practical purposes.
- NOTE. Please note that this is only to give a rough idea and can vary from computer to computer or from time to time on the same computer. Also, as already noted, the available RAM was hardly sufficient for the voracious 7Z Ultra.
- Twice
- All files/directories of the set were duplicated. This situation, a set of exactly duplicate files, is unlikely in real life. However, it is not uncommon to archive sets with identical files in different (sub)directories, or with almost identical files in the same (sub)directories. Theoretically, the advantages of solid compression should be more pronounced in duplicate sets.
- ZIP
- The simple Deflate method was used, which is universally supported. Other methods give better results, but are not widely supported (e.g., neither WinZip before 8.1 nor Zip Folders in WinME and WinXP support Deflate64). If universal support is not an issue, there is no reason to use ZIP. The Ultra level (tested with 7-Zip) was slightly better than Normal, but often took more than twice as long to complete.
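The three derived columns (Ratio, Base, Speed) can be recomputed from the raw figures in the tables. The following is a minimal sketch using the Program Files set as input. The helper names are made up for illustration, and the parsing assumes the notation used in the tables: dots as thousands separators, and times as minutes:seconds.

```python
def parse_mmss(text):
    """Convert a 'mm:ss' time from the tables into seconds."""
    minutes, seconds = text.split(":")
    return int(minutes) * 60 + int(seconds)


def parse_bytes(text):
    """Convert a byte count such as '1.156.573.239' into an integer."""
    return int(text.replace(".", "").replace(",", ""))


def ratio(compressed, original):
    """Compressed size divided by uncompressed size (lower is better)."""
    return compressed / original


def base(compressed, best):
    """Compressed size divided by the best compressed size in the set."""
    return compressed / best


def speed(copy_time, compress_time):
    """Copy time divided by compression time (higher is better)."""
    return parse_mmss(copy_time) / parse_mmss(compress_time)


original = parse_bytes("1.156.573.239")   # Program Files, uncompressed
best = parse_bytes("386.853.358")         # 7Z Ultra solid, best in the set
rar_best_s = parse_bytes("427.653.700")   # RAR Best solid

print(f"{ratio(best, original):.3f}")     # 0.334
print(f"{base(rar_best_s, best):.2f}")    # 1.11
print(f"{speed('04:10', '37:20'):.3f}")   # 0.112
```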
Comments on results
Some patterns seem to be clear:
- In the three sets of program files, 7Z was always stronger and RAR faster at level Normal and at the highest level (Ultra for 7Z, Best for RAR). At level Fast, 7Z was stronger AND faster more often than not. In the VLC files, the result at the highest level with solid compression is shocking, and certainly not representative. The cause is probably the same one that produced the huge difference in the duplicate sets. Note also that the composition of the VLC set is unusual: about 2/3 of its size is DLL files (multimedia codecs).
- In the two sets of text files (both the binary doc files of MS Word, and the five Canterbury Corpora, which consist mostly of simple-text files), RAR was both stronger and faster at the highest level. At level Fast, 7Z was faster and RAR stronger. At level Normal, RAR was faster and 7Z stronger.
- In the scenario of duplicate sets, solid RAR compression seemed to be less smart than 7Z solid. 7Z Solid shone here: it produced archives very close to the solid archives of the single sets. The practical importance of this is probably small.
- 7Z Fast and RAR Fast, in both non-solid and solid mode, are usually faster AND stronger than ZIP Normal Deflate.
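Why solid compression pays off so much on duplicate sets can be illustrated on a small scale. The sketch below does not use 7-Zip or WinRAR: it relies on Python's standard `lzma` module (LZMA2 in an XZ container) as a stand-in, and on randomly generated data as stand-in files, so the numbers only show the principle. Compressing each file separately plays the role of non-solid mode, and compressing the concatenation of all files plays the role of solid mode.

```python
import lzma
import random


def make_files(count=4, size=20_000, seed=0):
    """Generate reproducible, practically incompressible stand-in files."""
    rng = random.Random(seed)
    return [bytes(rng.getrandbits(8) for _ in range(size)) for _ in range(count)]


def non_solid_size(files):
    """Total size when every file is compressed on its own."""
    return sum(len(lzma.compress(data)) for data in files)


def solid_size(files):
    """Size when all files are compressed as one continuous stream."""
    return len(lzma.compress(b"".join(files)))


single = make_files()
twice = single + single  # the same set, duplicated

print("single, non-solid:", non_solid_size(single))
print("single, solid:    ", solid_size(single))
print("twice,  non-solid:", non_solid_size(twice))
print("twice,  solid:    ", solid_size(twice))
```

In non-solid mode the duplicated set costs about twice as much as the single set, because each copy is compressed without knowledge of the other. In solid mode the second copy falls inside the compressor's dictionary and is encoded as references to the first, so the result stays close to the size of the single set, much like the 7Z Ultra solid rows in the duplicate-set tables.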
Other considerations
Usually, compression speed and compression ratio are not the only factors that matter in the choice of a format (and the importance of speed certainly diminishes as the files to be compressed get smaller). Other relevant factors may be the type of license, the choice of software, and, of course, the range of supported features.
7Z
The format is open and is supported by several archivers:
- 7-Zip (OSS, read/write)
- IZarc (freeware, read/write)
- PowerArchiver (commercial, read/write)
- QuickZip (freeware, read/write)
- Squeez (commercial, read/write)
- TUGZip (freeware, read/write)
- WinRAR (commercial, read-only at present)
- ZipGenius (freeware, read/write)
It is also used by the installers Inno Setup and NSIS (both OSS).
Although relatively young, 7Z offers all features most users would need, like split volumes, strong encryption, and the ability to make self-extracting archives. Its main drawback, the more relevant the larger the size of the files to handle, is the relatively slow compression. As the default algorithm, LZMA, is in active development, this is expected to improve.
RAR
The format is closed. Most good general-purpose archivers can unrar (from the above list, all can), but the choice of programs to rar is scanty:
- Squeez (commercial)
- WinRAR (commercial)
WinRAR is a rock-solid program with a well-designed interface, in active development for over a decade (it is older than 7-Zip, despite the lower version number), and has two important features that the 7Z format does not support yet (they are planned for the next major version): recovery of corrupt archives and in-place editing of solid archives. The first is the main reason why the RAR format is practically a standard for file-sharing. WinRAR is localized in 40 languages and offers a lifetime licence.
Anecdotal
Igor Pavlov, the author of 7-Zip, and Eugene Roshal, the parent of WinRAR, are both Russian.
Links
- 7-Zip
- WinRAR
- Archivers Comparison, by Michal Molhanec
- The Canterbury Corpus, by Matt Powell
- Compression Links Info, by the Compression-Links.info Team
- Maximum Compression, by Werner Bergmans
- Ultimate Command Line Compressors, by Johan de Bock
Demetris
2005-09-12: added links, VLC files, and Canterbury Corpora
2005-09-07