The Art of Lossless Data Compression vol. 20b

Here are the results of tests performed in September 2000 to compare lossless compression of "binary" files by all known programs good enough for the purpose, including RK, DC, YBS, Bzip2, IMP, RAR and 7-zip. See the Archive Comparison Test by J.Gilchrist for more details: http://act.by.net

If anybody wants to start or continue such tests, can suggest other sets of files or other compression programs (executable programs only, not sources or algorithm descriptions), or knows that we have missed something important (some fantastic new technology, an algorithm, or even a program capable of lossless compression of up to 1000:1, etc.), please let us know immediately: artest@hotmail.ru

Thank you!

[[1]] COMPRESSION QUALITY

(see also [[2]] Speed [[3]] Details [[4]] Comments)

The last (eleventh) line of each table shows results for the sum of all 5029 files in the ten sets. In each row the best result is shown as 100%.

   Original        777      ACE32        BEE        BIX        BOA      BZip2         DC        ERI        IMP
     length   -mu32-mg  -m5-d4096     -m3-d3    -m9-mdg       -m15      -k -9     (none)     (none)   -1-m3-mm
     240.36     102.18     105.11     110.61     107.68     107.89     113.02     106.49     109.43     110.73
     253.84       100%     107.96     115.15     108.76     114.07     121.75     117.29     120.38     109.99
     156.21       100%     103.11     106.63     103.54     104.95     108.42     106.67     107.61     103.74
     307.02     107.41     112.65     114.57     114.22     116.18     122.72     108.57     113.77     109.12
     170.69     109.38     101.79     116.93     117.09     113.29     114.41     104.91       100%     113.81
     382.13     106.09     116.68     121.78     117.98     118.59     127.96     118.46     123.08     123.07
     188.88     100.79     104.80     107.38     104.00     107.05     111.64     107.06     109.91     105.80
     271.78     103.61     107.77     118.83     113.94     114.86     121.25     112.38     113.37     119.90
     135.34       100%     102.11     105.76     102.30     103.55     105.79     105.22     105.26     101.73
     352.04     105.47     113.19     116.41     115.32     117.44     128.67     120.72     126.38     119.69
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
     228.67     101.35     104.96     110.82     107.80     109.14     114.48     108.42     110.29     109.02

   ArHanGel   PPMonstr        SBC        RAR         RK       SZip        UHA      7-zip        YBS       ZZip
    -1-2-mm    -o8-m56       -b19 -m5-mm-mde       -mx2   -o0 -b41        -m3        -mx   -m16mu-r    -b20-mx
     109.52     105.02     111.15     109.61       100%     110.04     107.58     114.95     109.93     109.80
     124.47     111.26     119.97     113.30     102.96     118.77     107.19     128.13     114.63     116.93
     105.95     104.00     108.22     105.84     100.63     107.04     102.94     107.19     105.42     106.82
     114.02     112.61     120.67     110.42     105.48     118.99       100%     135.13     113.52     117.50
     107.13     111.91     115.71     103.86     106.18     113.73     102.74     120.28     113.16     111.97
     123.40     115.65     125.82     127.73       100%     121.56     114.55     126.54     122.03     121.75
     108.73     104.85     110.18     108.45       100%     109.24     103.79     112.48     105.54     107.17
     119.85     112.02     121.54     118.45       100%     117.94     112.15     121.25     117.62     117.89
     104.54     102.97     105.13     102.13     101.51     105.34     102.00     104.58     105.17     104.88
     128.50     111.71     126.23     128.44       100%     123.91     111.30     128.79     124.06     121.87
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
     111.80     106.77     113.48     109.83       100%     111.85     104.40     116.64     110.25     110.88
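Each row of the tables above appears to be normalized so that the best (smallest) result reads 100% and everything else is expressed relative to it. Assuming that reading is right, the arithmetic can be sketched as follows. The function name and the byte counts are hypothetical, chosen only to make the example easy to follow; they are not taken from BINARIES.DAT.

```python
def relative_to_best(sizes):
    """Map each program name to its size as a percentage of the smallest size."""
    best = min(sizes.values())
    return {name: 100.0 * size / best for name, size in sizes.items()}


# Hypothetical compressed sizes in bytes for one set of files.
sizes = {"prog_a": 1000, "prog_b": 1050, "prog_c": 1234}

for name, pct in sorted(relative_to_best(sizes).items(), key=lambda kv: kv[1]):
    print(f"{name:<8}{pct:8.2f}")
```

With these made-up sizes, prog_a is the best result and the other two come out at 105.00 and 123.40.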

[[2]] Speed

The 7th set, all from ftp://ftp.simtel.net/pub/simtelnet/win95/proj/tm2k.zip (17Mb), was used for this test, on an AMD-K6-400 machine with 64M RAM and Windows98.

Programs, options     Overall      Average      Compress  Extract  Compressed
                      score,       Users'       time,     time,    size,
                      seconds   %  score,       seconds   seconds  bytes
                                   seconds   %

777 a -mu32 -s        10718  235%   7228  164%    3879    2503   15613186
777 a -mg -s          10718  235%   7227  164%    3880    2502   15613186
777 a -mg             10176  223%   7127  162%    3388    2514   15388558
7zip a                 4873  107%   4799  109%      83      11   17208118
7zip a -mx             4930  108%   4797  109%     149      12   17171238
acb B                  6565  144%   5630  128%    1040    1065   16058275
acb b                  7279  160%   6003  136%    1418    1440   15916219
acb u                  8463  186%   6641  151%    2025    2042   15827870
ace32 a                4564  100%   4428  100%     151      15   15832593
ace32 a -d4096         4545  100%   4388  100%     175      15   15678361
ace32 a -d4096 -s-     4652  102%   4481  102%     191      18   15998753
ace32 a -d4096 -m1     4546  100%   4406  100%     156      17   15745557
ace32 a -d4096 -m5     4550  100%   4388  100%     180      16   15675025
arhangel a             6286  138%   5521  125%     850     826   16594841
arhangel a -2 -mm      7249  159%   5796  132%    1615    1002   16679006
arhangel a -1 -mm      4866  107%   4752  108%     128      20   16988655
arhangel a -2 -1       7228  159%   5594  127%    1816     806   16583314
ba -k                  7568  166%   5045  114%    2804     102   16786575
ba -k -g               6635  145%   4967  113%    1854     103   16843269
ba -k -j               7610  167%   5049  115%    2846     102   16786624
ba -k -1               4979  109%   4853  110%     140      95   17078365
ba -k -10              4945  108%   4793  109%     169     100   16835584
ba -k -50              4955  109%   4782  108%     194     109   16752058
bee a -m1              5500  121%   5031  114%     522     475   16214602
bee a -m2              5731  126%   5135  117%     662     635   15962121
bee a -m3              5955  131%   5228  119%     809     726   15916558
bee a -m1 -d3          5482  120%   5011  114%     524     468   16164897
bee a -m2 -d3          5713  125%   5113  116%     667     628   15905824
bee a -m3 -d3          5934  130%   5193  118%     824     706   15856297
bix a                  4772  104%   4586  104%     207       9   16402319
bix a -mdg             4783  105%   4581  104%     224      12   16370798
bix a -m9              4633  101%   4452  101%     202      10   15917029
bix a -mdg -m9         4650  102%   4446  101%     227      13   15877736
bix a -mdg -s          4790  105%   4510  102%     311      14   16075174
boa -m1                6636  146%   5574  127%    1181     796   16776121
boa -a                 8281  182%   6231  142%    2279    1452   16384208
boa -m15               8786  193%   6424  146%    2625    1633   16303478
boa -s                 8318  183%   6238  142%    2311    1456   16383035
boa -m15 -s            8832  194%   6430  146%    2670    1639   16285022
bzip2 -k               4924  108%   4792  109%     147      35   17073484
bzip2 -k -s            4944  108%   4843  110%     113      33   17276170
bzip2 -k -1            4990  109%   4884  111%     118      31   17430352
bzip2 -k -5            4917  108%   4810  109%     119      34   17150806
bzip2 -k -9            4923  108%   4792  109%     146      35   17073484
dc e                   4761  104%   4653  106%     120      63   16482103
dc e -fx -fb           4827  106%   4705  107%     136      58   16682706
dc e -b16300           6332  139%   5619  128%     793     842   16913763
dc e -b16300 -mb5      6341  139%   5627  128%     794     849   16913763
dc e -b12000           5176  113%   4779  108%     442      79   16764158
eri a -m1              5819  128%   4907  111%    1013      83   17001695
eri a -m2              5809  127%   4876  111%    1038      85   16871942
eri a -m3              5840  128%   4863  110%    1086      92   16784470
eri a                  5890  129%   4866  110%    1138      91   16780779
eri a -m5              6002  132%   4878  111%    1250      91   16780764
imp98 a                4586  100%   4518  102%      76      10   16202966
imp98 a -mm            4558  100%   4487  102%      79      10   16092171
imp98 a -mm -m3        4577  100%   4467  101%     123      10   16000717
imp98 a -mm -2         4740  104%   4603  104%     152      20   16444461
imp98 a -mm -s4        4564  100%   4492  102%      81      10   16104472
imp98 a -2 -s4         4743  104%   4605  104%     154      18   16455088
imp_d a -2 -s4         4745  104%   4606  104%     155      20   16455034
pkzip -es              5193  114%   5176  117%      19       7   18601569
pkzip -a               4886  107%   4851  110%      39       7   17426825
pkzip -exx             4896  107%   4847  110%      55       6   17407993
ppmd e -o5             5006  110%   4817  109%     211     223   16460344
ppmd e -o7             4997  109%   4806  109%     213     226   16410576
ppmd e -o9             4999  109%   4807  109%     214     227   16410355
ppmd e -o5 -m56        5041  110%   4815  109%     251     261   16305961
ppmd e -o6 -m56        5021  110%   4795  109%     251     260   16235977
ppmd e -o7 -m56        5037  110%   4793  109%     272     269   16188852
ppmd e -o8 -m56        5011  110%   4783  109%     254     276   16136880
ppmd e -o9 -m56        4991  109%   4758  108%     260     262   16091951
ppmonstr e -o5         5292  116%   4964  113%     365     392   16328461
ppmonstr e -o7         5288  116%   4958  112%     367     394   16296439
ppmonstr e -o9         5300  116%   4967  113%     371     400   16305613
ppmonstr e -o5 -m56    5439  119%   5025  114%     460     490   16163504
ppmonstr e -o7 -m56    5430  119%   5007  114%     471     500   16055439
ppmonstr e -o9 -m56    5410  119%   4984  113%     474     498   15979365
rar a                  4805  105%   4653  106%     170      12   16645524
rar a -mm              4787  105%   4631  105%     175      12   16564039
rar a -mm -m1          4804  105%   4669  106%     151      12   16711090
rar a -mm -m5          4804  105%   4630  105%     193      12   16556608
rar a -mm -mdc         4784  105%   4660  106%     138      12   16683996
rar a -mm -mda         4821  106%   4712  107%     122      12   16875793
rar a -mm -s           4763  104%   4577  104%     207      11   16362865
rar32 a -mm -s         4765  104%   4579  104%     208      13   16362865
rk -mf1                4869  107%   4739  107%     144     127   16551656
rk -mf2                5043  110%   4777  108%     296     278   16089728
rk -mf3                5171  113%   4784  109%     431     297   15996404
rk -mx1                7842  172%   6329  144%    1681    1931   15227508
rk -mx2                8625  189%   6743  153%    2092    2337   15108076
rk -mx2 -ft+ -fe+      8636  190%   6767  154%    2077    2362   15109732
sbc c                  4897  107%   4770  108%     142      65   16886710
sbc c -b9              4963  109%   4789  109%     194      77   16896566
sbc c -b19             5004  110%   4803  109%     225      93   16874886
sbc c -b19 -e          4871  107%   4731  107%     156      60   16760182
szip -v0               4903  107%   4758  108%     162      71   16812371
szip -o4               4847  106%   4767  108%      89      65   16895668
szip -o8               4944  108%   4759  108%     205      77   16782635
szip -o0               6267  137%   4865  110%    1558      57   16749321
szip -v0 -b41          4923  108%   4773  108%     168      72   16862500
szip -o4 -b41          4863  106%   4782  108%      91      68   16938561
szip -o8 -b41          4970  109%   4776  108%     216      79   16833992
szip -o0 -b41          6162  135%   4869  110%    1438      58   16800606
ufa a                  5099  112%   4602  104%     553      86   16057686
ufa a -mg -mu32        5042  110%   4481  102%     624      76   15632188
ufa a -s -mu32         5273  116%   4578  104%     772      93   15868931
ufa a -mg -s           5273  116%   4578  104%     772      93   15868931
uharc a                4920  108%   4491  102%     478      77   15717022
uharc a -m1            4804  105%   4504  102%     334      72   15835979
uharc a -m3            5098  112%   4471  101%     697      78   15564907
uharc a -m3 -mm        5092  112%   4470  101%     692      77   15564907
uharc a -m3 -md64      5010  110%   4622  105%     432      79   16199790
uharc a -m3 -md2048    5096  112%   4471  101%     695      77   15564907
ybs_d -y               4885  107%   4708  107%     197      53   16687784
ybs_d -m2mu            4885  107%   4708  107%     198      53   16687784
ybs_d -m16mu           5050  111%   4854  110%     218      49   17220161
ybs_d -m16mu -r        5054  111%   4854  110%     223      47   17224114
zzip a                 5635  123%   4814  109%     912      56   16800833
zzip a -mm             5596  123%   4812  109%     872      56   16807633
zzip a -mm -b20        5590  122%   4795  109%     884      54   16749025
zzip a -mm -mx         5499  120%   4705  107%     883      56   16418590

Overall score is calculated by adding compression time, extraction time, and the time it would take to transfer the compressed file over a 28,800bps network: (compressed_size)/3600, because 28800 bits_per_second is 3600 bytes_per_second.

Average Users' score is calculated by adding (compress_time/10) + extract_time + the time it would take to transfer the compressed file over a 28,800bps network. Compression time is divided by 10 here, because more than 90% of people would never compress anything during their life (with compression programs), but they use compressed data almost _every_ time they use computers and/or the Internet. That's why compression time is not so important for them.
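The two scores described above are simple sums, and a minimal sketch may make them easier to check against the table. The function names are hypothetical (they do not come from BINS.BAT or any other ARTest file); the input figures are the "ace32 a" row of the speed table.

```python
BYTES_PER_SECOND = 28800 / 8  # a 28,800 bps link carries 3600 bytes per second


def transfer_seconds(compressed_size):
    """Time in seconds to send compressed_size bytes over the 28,800 bps link."""
    return compressed_size / BYTES_PER_SECOND


def overall_score(compress_s, extract_s, compressed_size):
    """Compression time + extraction time + transfer time, all in seconds."""
    return compress_s + extract_s + transfer_seconds(compressed_size)


def average_users_score(compress_s, extract_s, compressed_size):
    """Same as overall_score, but compression time counts for one tenth."""
    return compress_s / 10 + extract_s + transfer_seconds(compressed_size)


# Row "ace32 a": 151 s to compress, 15 s to extract, 15832593 bytes.
print(round(overall_score(151, 15, 15832593)))        # the table lists 4564
print(round(average_users_score(151, 15, 15832593)))  # the table lists 4428
```

Some rows differ from the printed score by about a second (for "7zip a" the formula gives 4874 against the printed 4873), possibly because the timings were measured with more precision than the table shows.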

[[3]] Details

Details are no longer included in this main text (5176 lines reporting 221760 results on 5029 files in 10 sets), but can be found in the FULL version, with BINARIES.DAT and *.BAT, at
http://geocities.com/SiliconValley/Bay/1995/artest20.zip
or
http://artest1.tripod.com/artest20.zip

[[4]] Comments

Links to download programs:

7-Zip 2.22                :W  http://www.7-zip.com/dl/7zip222.exe  513K
ACE32 2.0b4               :W  ftp://ftp.forlangs.net/pub/windows/winace/ace20b4.exe  576K
ERI32 4.9fre              :e  http://geocities.com/eri32/eri49fre.zip  91K
PkzipC 4.00               :W  ftp://ftp.pkware.com/pkzc400s.exe  3470K
RK-dos 1.04.1             :e  http://rksoft.virtualave.net/downloads/rk104a1d.exe  461K
RK 1.04.1                 :W  http://rksoft.virtualave.net/downloads/rk104a1w.exe  380K
RAR32 2.80b3              :e  ftp://ftp.netlab.sk/public/rarsoft/rar/rarx28b3.exe  269K
WinRAR 2.80b3             :W  ftp://ftp.netlab.sk/public/rarsoft/rar/wrar28b3.exe  620K
BA 1.01b5                 :e  http://hem.spray.se/mikael.lundqvist/ba101br5.zip  61K
SBC 0.500b                :e  http://geocities.com/sbcarchiver/sbc0500b.zip  187K
ZZip 0.36b                :W  http://www.zzip.f2s.com/zzip-win32.zip  34K
PPMD var.G, PPmonstr v.G  :W  ftp://ftp.simtel.net/pub/simtelnet/win95/compress/ppmdg.zip  72K
BIX 1.00b7                :W  http://www.7-zip.com/dl/ufa/bix100b7.zip  89K
777 0.04b1                :W  http://www.7-zip.com/dl/ufa/777004b1.zip  72K
UFA 0.04b1                :W  http://www.7-zip.com/dl/ufa/ufa004b1.zip  64K
ArHanGeL 1.40             :a  http://geocities.com/SiliconValley/Lab/6606/arh140.zip  50K
Imp 1.1                   :e  http://www.winimp.com/imp110d.zip  266K
Imp-win 1.12              :W  http://www.winimp.com/imp112.exe  122K
PkZip 2.50                :a  ftp://ftp.simtel.net/pub/simtelnet/msdos/arcers/pk250dos.exe  202K
ACB 2.00c                 :e  ftp://ftp.simtel.net/pub/simtelnet/msdos/compress/acb_200c.zip  42K
BOA 0.58b                 :e  ftp://ftp.cdrom.com/.3/sac/pack/boa058.zip  74K
DC 0.98b                  :W  ftp://ftp.cdrom.com/.3/sac/pack/dc124.zip  55K
Bzip2 1.0.1               :W  ftp://sourceware.cygnus.com/pub/bzip2/v100/bzip2-100-x86-win32.exe  68K
SZip 1.12a                :W  http://www.compressconsult.com/szip/szip_112a_win32.zip  71K
UHArc 0.2b                :e  ftp://ftp.cdrom.com/.3/sac/pack/uharc02.zip  101K
YBS 0.03e                 :e  http://members.nbci.com/vycct/ybs003ed.zip  55K
YBS 0.03e                 :W  http://members.nbci.com/vycct/ybs003ew.zip  43K
BEE 0.4.8                 :   mailto:Andrew.Filinsky@p11.f4.n452.z2.fidonet.org

:a - any DOS  - DOS programs, will run under pure DOS or in a DOS box
:e - extender - DOS programs using DOS extenders like DOS/4GW or CWSDPMI
:W - windows  - Windows95/98/NT/etc programs

If a direct link doesn't work, most probably a newer version of the program has appeared at the same site: visit the web page, or read the whole directory from the ftp server (i.e. try the same URL, but without the filename).

Homepages:

Arhangel          : http://geocities.com/SiliconValley/Lab/6606
BA                : http://hem.spray.se/mikael.lundqvist
Eri32             : http://geocities.com/eri32
           mirror : http://artest1.tripod.com
RK                : http://rksoft.virtualave.net
Imp,WinImp        : http://www.technelysium.com.au
           mirror : http://www.winimp.com
ACE,WinACE        : http://www.winace.com
PkZip             : http://www.pkware.com
RAR,WinRAR        : http://www.rarsoft.com
BZip2             : http://sources.redhat.com/bzip2
SZip              : http://www.compressconsult.com/szip
ZZip              : http://www.zzip.f2s.com
YBS               : http://members.nbci.com/vycct
SBC               : http://geocities.com/sbcarchiver
Ufa,777,BIX,7-Zip : http://www.7-zip.com

PPMD, PPMonstr, ACB, Bee, BOA, DC, UHArc - no homepage.

What's new:

Newer versions of the following programs are ready and will be tested next time: RK, SBC, ZZip, ACE, UFA, 7-zip, PkzipC, RAR32, WinRAR, ERI32, BA, PPMD, PPMonstr. Results of older versions are given. The next release will come soon.

The latest beta versions of BEE, DC, PPMonstr and UFA are available from the authors by e-mail request:

BEE      : Andrew.Filinsky@p11.f4.n452.z2.fidonet.org
DC       : EdgarBinder@t-online.de
PPMonstr : shkarin@arstel.ru
UFA      : support@7-zip.com

ACB, BA and PKzip are no longer tested on all 5029 files; their results can be found in the previous version, ARTest vol.18b, and in the full version artest20.zip (685K). Results of PPMD (an open source version of PPMonstr) and UFA 0.04b1 are in the full version only, in the BINARIES.DAT file.

Results of old programs (not updated for more than 3 years, and with no homepage) and of programs with a low overall score will not be included in the latest versions of ARTest, nor will results of programs that have had known bugs (in compression/decompression functions) for more than half a year.

The FULL version contains all the *.BAT and *.DAT files you'll need to build the 10 directories with 5029 files (artest20.zip\MAKE_BIN\*.*) and to repeat all our tests (BINS.BAT, BINARIES.DAT).

WARNINGS:

ACB refuses to take files shorter than 257 bytes. About 400 such files were processed with "rar a -m0 name.acb name.bin" (see do_acb_u.bat and do_acb_r.bat).

BA 1.00beta can't losslessly decompress any file compressed with -f, nor 49 files compressed with any option (astronmy\CRLFTX~1.bin, GFEMER~1.bin, MIRAIN~1.bin, NIL_~1.bin, NIL~1.bin, README~7.bin, SAOSOU~2.bin, STARVI~6.bin, UNIV00~1.bin, ZEROME~1.bin ; chem\ARCHIV~3.bin, ARCHIV~4.bin, DISK1_~1.bin, DISPOS~3.bin, KHEMCF~1.bin etc.). It reports nothing like "CRC fails".

BEE can't decompress some files compressed in "solid" mode.

DC 0.99.158b fails to decompress HLPCOPY..bin, but only if you compress with the "-mb5" switch and an [<output>] filename like "any.dc5". It reports nothing like "CRC fails".

RK 1.03b1 was unable to correctly decompress 55 files compressed with "-mf2" or "-mf3", reporting "ERROR 303: CRC check failed". All are .htm files, like font\README~3.BIN; only one byte differs after extraction, and there are no problems in any of the 55 cases when compressing with "-mx2". Unfortunately, 64Mb RAM is not enough to run RK 1.03 with "-mx3": it swaps endlessly (virtual memory to hard disk) when compressing some binary files (but does not swap on text files).

Bugs were found in the tested versions of SBC and ZZip, but they have been fixed in the latest versions, ZZip 0.36b and SBC 0.500b.

No problems were found in any of the other compressors.

The LATEST RELEASE, and all previous versions of these tests, can be found at
http://geocities.com/SiliconValley/Bay/1995/
and
http://artest1.tripod.com/
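Several of the failures listed under WARNINGS were silent (no "CRC fails" message), so the only reliable check is to extract every file and compare it byte for byte with the original. The sketch below shows that kind of round-trip check. It is only an illustration: Python's standard bz2 and zlib modules stand in for the tested archivers, the helper names are hypothetical, and this is not the procedure used in BINS.BAT.

```python
import bz2
import zlib


def round_trip_ok(data, compress, decompress):
    """Return True if decompress(compress(data)) reproduces data byte for byte."""
    return decompress(compress(data)) == data


def first_difference(a, b):
    """Return the offset of the first differing byte, or None if a equals b."""
    for offset, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return offset
    # No mismatch in the common prefix: the inputs differ only if one is longer.
    return None if len(a) == len(b) else min(len(a), len(b))


# Made-up sample data; a real test would read each .bin file from disk.
sample = b"<html><body>README</body></html>\r\n" * 100

print(round_trip_ok(sample, bz2.compress, bz2.decompress))
print(round_trip_ok(sample, zlib.compress, zlib.decompress))
```

A helper like first_difference is useful for reports such as the RK one above, where the extracted file differs from the original in a single byte.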

The FINAL PART

The section "[[5]] PLEASE read THIS before replying to this article" was removed from this text, but can easily be found at
http://geocities.com/SiliconValley/Bay/1995/artest10.html
http://artest1.tripod.com/artest10.html

Send your suggestions and comments to artest@hotmail.ru

With best kind regards,
A.Ratushnyak