The Art
of Lossless
Data Compression
vol. 22t
Here are the results of tests performed in May 2001 to compare
lossless compression of English texts by all known good-enough programs
developed for this purpose, including RK, DC, YBS, Bzip2, IMP, RAR and 7-zip.
See Archive Comparison Test by J.Gilchrist for more details: http://act.by.net
If anybody wants to start or continue such tests,
or can suggest other sets of texts or other compression programs
(executable programs only, not sources or algorithm descriptions),
or knows that we have missed something important
(some fantastic new technology, an algorithm, or even a program capable
of lossless compression of up to 1000:1, etc.),
please let us know immediately: artest@inbox.ru Thank you!
[[1]] COMPRESSION QUALITY
(see also
[[2]] Speed
[[3]] Details
[[4]] Comments)
The fifth line shows results for the sum of four Canterbury Corpus Large Set files;
the eleventh line shows results for the sum of all 1231 files in six sets.
Original PPMonstr    PPMD      RK      DC     BOA     SBC     BEE     YBS   UHArc
585.61%      100%  105.23  100.70  105.54  107.16  105.98  109.66  106.08  105.36
414.65%    101.41  105.75  102.94  102.19  101.35  103.73  105.87  102.81    100%
591.98%      100%  104.97  101.61  104.18  108.09  104.72  107.91  104.47  103.34
675.45%      100%  108.79  102.80  114.03  115.60  112.90  115.35  112.80  116.17
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
529.35%      100%  104.77  101.06  103.95  105.26  104.63  107.42  104.23  103.13
492.94%      100%  104.25  101.90  103.73  106.62  105.57  107.26  106.54  105.45
398.92%      100%  102.76  101.46  101.83  103.66  103.64  105.35  104.36  104.05
436.99%      100%  102.63  101.61  102.66  104.30  104.13  105.02  105.19  105.50
733.54%      100%  101.39  101.42  111.46  112.25  108.82  113.68  110.65  113.37
341.03%      100%  102.39  106.22  107.84  103.75  106.50  104.23  105.70  110.54
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
427.88%      100%  102.48  102.51  104.14  104.43  104.77  105.30  105.33  106.68
BA        ZZip     ACB     777    SZip     ERI   BZip2     ACE     RAR   7-zip
110.28  110.48  108.75  115.54  111.99  113.05  122.35  139.58  139.64  160.82
104.68  104.01  103.67  101.29  104.65  107.02  111.82  113.43  113.35  111.96
108.65  108.78  108.23  113.96  113.02  111.34  122.46  142.25  143.32  163.83
113.47  113.41  114.26  130.89  118.43  115.63  133.69  143.59  145.24  190.08
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
107.10  106.92  106.35  110.59  109.26  109.58  118.69  129.78  130.25  145.56
108.52  109.13  109.26  114.41  112.96  112.49  118.83  137.33  137.55  155.12
106.41  107.07  107.85  108.93  110.25  110.17  114.01  131.81  135.76  143.27
107.35  108.10  108.40  110.07  110.61  111.58  116.94  135.22  136.58  148.86
119.88  114.63  117.48  117.65  118.78  119.80  137.36  150.03  155.95  180.86
108.68  109.75  109.43  111.02  111.11  111.93  112.85  130.72  130.22  138.02
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
108.05  108.52  108.85  110.52  110.98  111.69  116.54  134.15  135.61  147.05
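The percentages in the table above appear to be normalized per line, so that the
best (smallest) result is 100% and everything else, including the original size,
is expressed relative to it. That reading is inferred from the table, it is not
stated in the text. A minimal Python sketch under that assumption; the function
name and the sample byte counts are made up for illustration and are not test data:

```python
def relative_percent(sizes):
    """Express each size as a percentage of the smallest one (best = 100%)."""
    best = min(sizes.values())
    return {name: round(100.0 * size / best, 2) for name, size in sizes.items()}


# Hypothetical byte counts, chosen only to show the arithmetic.
sample = {"Original": 4000, "A": 1000, "B": 1050}
print(relative_percent(sample))
```

With this reading, an "Original" value of 585.61% would mean the uncompressed
text is about 5.86 times larger than the best compressed result on that line.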
[[2]] Speed
The Canterbury Corpus Large Set (http://corpus.canterbury.ac.nz/ftp/large.zip)
was used for this test, on an AMD-K6-400 machine with 192 MB of RAM running Windows98.
Columns, left to right: programs and options; Overall score (seconds, then %);
Average Users' score (seconds, then %); compress time, seconds;
extract time, seconds; compressed size, bytes.
NO COMPRESSION 4446 538% 4446 562% 0 0 16005619
7z a -tufa1 1324 160% 1057 133% 296 8 3672086
7z a -tufa1 -mx 1322 160% 1057 133% 294 8 3672086
7z a -tzip 1283 155% 1231 155% 58 5 4393637
7z a -tzip -mx 1325 160% 1237 156% 97 6 4401174
7zip a 1278 154% 1229 155% 55 3 4393637
7zip a -mx 1325 160% 1237 156% 98 5 4401174
777 a -mg 1372 166% 1159 146% 237 151 3544038
acb B 3236 392% 2202 278% 1148 1156 3352388
acb b 3934 476% 2585 327% 1499 1527 3272388
acb u 5115 619% 3243 410% 2080 2139 3225662
ace32 a 1212 146% 1124 142% 98 5 3992645
ace32 a -d4096 1168 141% 1072 135% 106 6 3801917
ace32 a -d4096 -s- 1208 146% 1116 141% 103 5 3962381
ace32 a -d4096 -m1 1160 140% 1112 140% 53 6 3965841
ace32 a -d4096 -m5 1353 164% 1076 136% 309 5 3746553
arh a 1133 137% 1078 136% 61 59 3647067
arh a -2 -mm 1132 137% 1078 136% 60 59 3647067
arh a -1 -mm 1438 174% 1302 164% 152 8 4605607
arh a -2 -1 1283 155% 1093 138% 212 59 3647067
ba -k 1024 124% 977 123% 52 23 3421195
ba -k -1 1148 139% 1113 140% 39 22 3914655
ba -k -50 1006 121% 945 119% 68 22 3298943
bee a -m1 -d3 1407 170% 1217 153% 211 198 3593467
bee a -m2 -d3 1460 177% 1229 155% 256 238 3479698
bee a -m3 -d3 1700 206% 1347 170% 392 355 3432029
bix a -mdg -s 1141 138% 995 125% 163 3 3514944
boa -m15 1344 162% 1142 144% 225 236 3182739
boa -m15 -s 1321 160% 1122 141% 221 230 3132810
boa -m7 1316 159% 1130 142% 207 216 3217354
boa -m1 1400 169% 1260 159% 155 166 3886863
bzip2 -k 1060 128% 1021 129% 43 13 3616113
bzip2 -k -1 1185 143% 1155 146% 33 12 4106479
bzip2 -k -5 1077 130% 1044 132% 37 13 3700097
bzip2 -k -9 1057 128% 1021 129% 40 13 3616113
dc e 927 112% 902 114% 27 17 3179173
dc e -ft 933 113% 906 114% 29 17 3192832
dc e -b16300 826 100% 791 100% 39 17 2773427
dc e -b16300 -mt5 825 100% 790 100% 38 17 2773427
eri a -m1 1057 128% 970 122% 96 22 3378440
eri a -m2 1054 127% 962 121% 102 23 3346586
eri a -m3 1060 128% 958 121% 113 25 3318853
eri a 1070 129% 958 121% 124 26 3313568
imp98 a -mm -m3 1218 147% 1140 144% 87 4 4059874
imp98 a -mm -2 1024 124% 995 125% 33 10 3533763
imp98 a -2 -s4 1025 124% 995 125% 33 10 3533695
imp a -2 -s4 1021 123% 992 125% 32 9 3530158
pkzip -es 1657 200% 1653 209% 4 2 5945622
pkzip -a 1317 159% 1305 165% 14 1 4691491
pkzip -exx 1390 168% 1291 163% 110 1 4605942
ppmd e -o3 -m184 1093 132% 1083 137% 11 13 3849571
ppmd e -o4 -m184 985 119% 973 123% 13 15 3447452
ppmd e -o5 -m184 938 113% 925 116% 15 17 3263988
ppmd e -o6 -m184 912 110% 897 113% 17 19 3155348
ppmd e -o7 -m184 898 108% 880 111% 20 22 3084162
ppmd e -o8 -m184 890 107% 869 110% 23 25 3032824
ppmd e -o9 -m184 891 108% 867 109% 27 29 3007612
ppmd e -o10 -m184 901 109% 865 109% 40 41 2953155
ppmd e -o11 -m184 915 110% 864 109% 56 56 2891692
ppmd e -o12 -m184 1029 124% 937 118% 102 112 2935640
ppmonstr e -o3 -m184 1136 137% 1107 140% 32 35 3850354
ppmonstr e -o4 -m184 1031 125% 998 126% 37 40 3437676
ppmonstr e -o5 -m184 986 119% 949 120% 42 44 3243866
ppmonstr e -o6 -m184 966 117% 924 116% 47 49 3132547
ppmonstr e -o7 -m184 952 115% 905 114% 52 55 3040773
ppmonstr e -o8 -m184 942 114% 888 112% 60 63 2949530
ppmonstr e -o9 -m184 942 114% 880 111% 68 72 2888367
ppmonstr e -o10 -m184 959 116% 882 111% 85 87 2834425
ppmonstr e -o11 -m184 987 119% 892 112% 106 108 2785525
ppmonstr e -o12 -m184 1100 133% 959 121% 157 158 2831191
rar a 1191 144% 1130 143% 67 5 4029084
rar a -mm -m1 1233 149% 1204 152% 33 5 4304860
rar a -mm -m5 1438 174% 1132 143% 340 5 3938355
rar a -mm -s 1193 144% 1129 142% 71 5 4023405
rk -mf1 1147 139% 1127 142% 23 20 3978408
rk -mf2 1186 143% 1095 138% 101 48 3735704
rk -mf3 1280 155% 1096 138% 204 50 3693704
rk -mx1 1385 167% 1163 147% 248 279 3093640
rk -mx2 1454 176% 1200 151% 282 315 3086308
rk -mx3 1461 177% 1204 152% 285 319 3087044
sbc c -on -b59 947 114% 897 113% 56 18 3146705
sbc c -oa -b59 966 117% 912 115% 59 19 3195883
sbc c -of -b59 959 116% 907 114% 58 19 3176987
sbc c -os -b59 871 105% 813 102% 65 20 2832457
szip -o6 1021 123% 994 125% 29 27 3475264
szip -o8 1020 123% 985 124% 39 29 3430586
szip -o8 -b41 1001 121% 964 121% 41 30 3348344
ufa a 1387 168% 1137 143% 277 28 3895425
ufa a -mg -mu32 1335 161% 1159 146% 196 211 3344003
uharc a -m1 -md8192 1262 153% 1184 149% 87 25 4141149
uharc a -m2 -md8192 1275 154% 1139 144% 152 25 3955624
uharc a -m3 -md8192 1434 173% 1108 140% 363 25 3768111
uharc a -mz -md8192 1134 137% 1111 140% 26 30 3884071
uharc a -mx -md8192 1019 123% 932 117% 97 83 3023083
ybs -m16mu 914 110% 820 103% 104 17 2857446
ybs -m16mu -r 933 113% 827 104% 118 16 2878433
ybs -m8m 935 113% 888 112% 51 16 3123345
zzip a 1017 123% 972 122% 51 23 3400243
zzip a -mm -mx 1015 123% 969 122% 52 24 3383215
zzip a -mm -a 1017 123% 970 122% 52 25 3383215
Overall score is calculated by adding compression time, extraction time, and
the time it would take to transfer the compressed file over a 28,800 bps link:
(compressed_size)/3600, because 28800 bits per second is 3600 bytes per second.
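The Overall score formula can be sketched in a few lines of Python. The function
and constant names are ours, not part of the test scripts, and truncating the
result to whole seconds is an assumption that happens to match the table:

```python
LINK_BYTES_PER_SECOND = 28800 // 8  # 28,800 bps = 3600 bytes per second


def overall_score(compress_s, extract_s, compressed_bytes):
    """Compression time + extraction time + transfer time, in seconds."""
    return compress_s + extract_s + compressed_bytes / LINK_BYTES_PER_SECOND


# Line "dc e -b16300" from the table: 39 s, 17 s, 2773427 bytes.
print(int(overall_score(39, 17, 2773427)))
```

The times in the table are shown as whole seconds, so a score recomputed from
them can differ from the published one by a second.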
Average Users' score is calculated as (compress_time/10) + extract_time +
the time it would take to transfer the compressed file over a 28,800 bps link.
Compression time is divided by 10 here because more than 90% of people will
never compress anything in their lives (with compression programs), yet they
use compressed data almost _every_ time they use computers and/or the Internet.
That's why compression time matters less to them.
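The Average Users' score differs only in the weight given to compression time.
A minimal Python sketch; the function name and the bytes_per_second parameter
are illustrative, and truncation to whole seconds is again an assumption:

```python
def average_users_score(compress_s, extract_s, compressed_bytes,
                        bytes_per_second=3600):
    """compress_time/10 + extract_time + transfer time, in seconds."""
    return compress_s / 10 + extract_s + compressed_bytes / bytes_per_second


# Line "7z a -tufa1" from the table: 296 s, 8 s, 3672086 bytes.
print(int(average_users_score(296, 8, 3672086)))
```

Because transfer time dominates at 3600 bytes per second, a slow but strong
compressor loses far less under this score than under the Overall score.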
[[3]] Details
are no longer included in this main text
(1490 lines reporting 65614 results on 1231 files in 6 sets),
but can be found in the FULL version (with TEXTS.DAT and *.BAT)
at http://geocities.com/SiliconValley/Bay/1995/artest22.zip
or http://artest1.tripod.com/artest22.zip
[[4]] Comments
Links to download programs:
7-Zip 2.24 :W http://www.7-zip.com/dl/7zip224.exe 463K
ACE32 2.02 :W ftp://ftp.forlangs.net/pub/windows/winace/ace202.exe 587K
ERI32 4.16fre :e http://geocities.com/eri32/eri416fr.zip 94K
PkzipC 4.00 :W ftp://ftp.pkware.com/pkzc400s.exe 3470K
RK-dos 1.04.1 :e http://rksoft.virtualave.net/downloads/rk104a1d.exe 461K
RK 1.04.1 :W http://rksoft.virtualave.net/downloads/rk104a1w.exe 380K
RAR32 2.80 :e ftp://ftp.netlab.sk/public/rarsoft/rar/rarx280.exe 269K
WinRAR 2.80 :W ftp://ftp.netlab.sk/public/rarsoft/rar/wrar280.exe 621K
BA 1.01b5 :e http://hem.spray.se/mikael.lundqvist/ba101br5.zip 61K
SBC 0.860b :e http://geocities.com/sbcarchiver/sbc0860b.zip 208K
ZZip 0.36c :W http://www.via.ecp.fr/~damien/downloads/zzip-win32.zip 35K
PPMD var.H,
PPmonstr v.H :W ftp://ftp.cdrom.com/.2/sac/pack/ppmdh.rar 57K
BIX 1.00b7 :W http://www.7-zip.com/dl/ufa/bix100b7.zip 89K
777 0.04b1 :W http://www.7-zip.com/dl/ufa/777004b1.zip 72K
UFA 0.04b1 :W http://www.7-zip.com/dl/ufa/ufa004b1.zip 64K
ArHanGeL 1.40 :a http://geocities.com/SiliconValley/Lab/6606/arh140.zip 50K
Imp 1.1 :e http://www.winimp.com/imp110d.zip 266K
Imp-win 1.12 :W http://www.winimp.com/imp112.exe 122K
PkZip 2.50 :a ftp://ftp.simtel.net/pub/simtelnet/msdos/arcers/pk250dos.exe 202K
ACB 2.00c :e ftp://ftp.simtel.net/pub/simtelnet/msdos/compress/acb_200c.zip 42K
BOA 0.58b :e ftp://ftp.cdrom.com/.2/sac/pack/boa058.zip 74K
DC 0.98b :W ftp://ftp.cdrom.com/.2/sac/pack/dc124.zip 55K
Bzip2 1.0.1 :W ftp://sourceware.cygnus.com/pub/bzip2/v100/bzip2-100-x86-win32.exe 68K
SZip 1.12a :W http://www.compressconsult.com/szip/szip_112a_win32.zip 71K
UHArc 0.2b :e ftp://ftp.cdrom.com/.2/sac/pack/uharc02.zip 101K
YBS 0.03e :e http://members.nbci.com/vycct/ybs003ed.zip 55K
YBS 0.03e :W http://members.nbci.com/vycct/ybs003ew.zip 43K
BEE 0.4.8 :W Andrew.Filinsky@p11.f4.n452.z2.fidonet.org
:a - any DOS - DOS programs, will run under pure DOS or in a DOS box
:e - extender - DOS programs using DOS extenders like DOS/4GW or CWSDPMI
:W - windows - Windows95/98/NT/etc programs
If a direct link doesn't work, most probably a newer version of the program has
appeared at the same site: visit the web page, or read the whole directory from
the ftp server (i.e. try the same URL, but without the filename).
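The "same URL without filename" advice amounts to cutting everything after the
last slash. A small Python sketch, assuming the link ends in a filename; the
function name is ours:

```python
def directory_url(file_url):
    """Drop the filename, keeping everything up to and including the last '/'."""
    return file_url.rsplit("/", 1)[0] + "/"


print(directory_url("ftp://ftp.cdrom.com/.2/sac/pack/boa058.zip"))
```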
Homepages:
Arhangel : http://geocities.com/SiliconValley/Lab/6606
BA : http://hem.spray.se/mikael.lundqvist
Eri32 : http://geocities.com/eri32
mirror : http://artest1.tripod.com
RK : http://rksoft.virtualave.net
Imp,WinImp : http://www.technelysium.com.au
mirror : http://www.winimp.com
ACE,WinACE : http://www.winace.com
PkZip : http://www.pkware.com
RAR,WinRAR : http://www.rarsoft.com
BZip2 : http://sources.redhat.com/bzip2
SZip : http://www.compressconsult.com/szip
ZZip : http://www.zzip.f2s.com
YBS : http://members.nbci.com/vycct
SBC : http://geocities.com/sbcarchiver
Ufa,777,
BIX,7-Zip: http://www.7-zip.com
PPMD, PPMonstr, ACB, Bee, BOA, DC, UHArc - no homepage.
What's new:
12 new programs tested: RK, SBC, ZZip, ACE, 7-zip, RAR32, WinRAR,
ERI32, BA, PPMD, PPMonstr, UHARC.
Test data was updated, a set of Russian texts was added.
The latest beta versions of BEE, DC, UFA and UHArc are available
from the authors by e-mail request:
BEE: Andrew.Filinsky@p11.f4.n452.z2.fidonet.org
DC: EdgarBinder@t-online.de
UFA: support@7-zip.com
UHARC: Uwe.Herklotz@gmx.de
Results of ArHanGeL, IMP, BICOM, BIX and Pkzip
are in the full version only, in the TEXTS.DAT file.
WARNINGS:
BA 1.00beta5 can't correctly decompress shaks12.txt.
DC 0.99.158b failed to decompress 1DFRE10.dc, ANDES10.dc, and BTI0110.dc,
saying "Corrupted block" (while the t(est) command reports "Test successful").
ERI32 4.8fre can't compress files larger than (free DPMI memory)/6, i.e.
about 10Mb on a PC with 64Mb RAM. The largest 44Mb file was split into 5 chunks
of 9000000 bytes each (the last chunk was 8894190 bytes).
No problems were found in any of the other compressors.
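The chunking used for ERI32 above can be reproduced with simple arithmetic.
In this Python sketch the total size of 44894190 bytes is inferred from the text
(four full chunks plus the shorter last one), it is not quoted from the test
data, and the function name is ours:

```python
def chunk_sizes(total_bytes, chunk_bytes):
    """Sizes of consecutive chunks: full chunks, then the remainder if any."""
    full, rest = divmod(total_bytes, chunk_bytes)
    return [chunk_bytes] * full + ([rest] if rest else [])


# Total inferred from the text: 4 * 9000000 + 8894190 = 44894190 bytes.
print(chunk_sizes(44894190, 9000000))
```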
The LATEST RELEASE and all previous versions of these tests can be found
at http://geocities.com/SiliconValley/Bay/1995/ and http://artest1.tripod.com/
Send your suggestions and comments to artest@hotmail.ru
With best kind regards,
A.Ratushnyak,
RAO Inc.