Map of Groups of Compression Methods:

                              Statistical                    Transforming
                          stream       block             stream       block
for words, i.e.           DMC,         *pre-conditioned  all LZ       ST,
"Markov source" model     all PPM      PPM                            incl. BWT
for bytes, i.e. either    adaptive     static            SEM, VQ,     DCT, DWT,
"Bernoulli source" or     Huffman      Huffman           MTF, DC      FT, SC,
"Analog signal" model                                                 Fractal
for bytes or bits         adaptive     static            RLE, LPC,    PBS,
                          Arithmetic   Arithmetic        incl. Delta  ENUC

*Block-based PPM is a practically unexplored field:
no(?) other algorithms exist except the pre-conditioned PPM of C. Bloom.

All definitions are in the article "A Practical Introduction to Data Compression".
A BLOCK is a finite piece of digital information; a STREAM is a portion with
unknown borders: data arrives byte after byte, not block after block.
An N-bit BYTE is a sequence of N bits; a WORD is a finite sequence of bytes.
Abbreviations are explained at the bottom of this page.
Every group (branch, family) contains many methods.
I think any one-step compression method belongs to one of these families.
Some descriptions are in the "Compression of Multimedia Information" article.
All stream methods can be applied to blocks, but the inverse is not true.
Block methods can't be applied to streams, because they can't start working
before the length of the data buffer is known.
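
The difference between stream and block methods can be sketched in a few lines of Python. This is only an illustration, not code from the ARTest pages: the function names are hypothetical, delta coding stands in for a stream method, and a naive BWT (sorting all rotations) stands in for a block method.

```python
from typing import Iterable, Iterator, Tuple


def delta_encode_stream(data: Iterable[int]) -> Iterator[int]:
    """Stream method: one output byte per input byte, emitted as it arrives."""
    prev = 0
    for byte in data:
        yield (byte - prev) % 256
        prev = byte


def delta_decode_stream(data: Iterable[int]) -> Iterator[int]:
    """Inverse of delta_encode_stream, also byte after byte."""
    prev = 0
    for diff in data:
        prev = (prev + diff) % 256
        yield prev


def bwt_encode_block(block: bytes) -> Tuple[bytes, int]:
    """Block method: the whole block must be known, since all rotations are sorted."""
    n = len(block)
    if n == 0:
        return b"", 0
    # Naive O(n^2 log n) construction, fine for a sketch only.
    order = sorted(range(n), key=lambda i: block[i:] + block[:i])
    last_column = bytes(block[(i - 1) % n] for i in order)
    return last_column, order.index(0)  # index of the original block among rotations
```

The delta coder accepts any iterable, including one of unknown length, while bwt_encode_block needs len(block) before it can produce its first output byte.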
Not all methods for N-bit bytes can be applied to bits (1-bit bytes).
It's not good to apply methods for bytes to words or bits, or
methods for words to bytes or bits (BWT output, for example).
Although this Map is very useful, I haven't seen any analogues before.
I'm very interested in your comments! Send them to artest@inbox.ru, please!
This definition can probably be used:
Non-physical information is information that can be obtained at any
point of space-time as a result of research on the properties of space-time.
Today (4th of June 2001) a web search returns about 50 results for
"non-physical information", but no definition of it was found.
Why is it important? See answer 12 of "A Practical Introduction":
Only non-physical information is always accessible, it doesn't
depend on material objects. The size of it is infinite.
And when we study how it can be applied (wavelets, fractals,
for example...) - we actually study how we depend on it.
If we only assume that all references used in a (compressing) description will
be accessible, but do not know it for certain, we perform "potentially lossy"
compression.
Just a short example of this unwanted effect:
take the file ftp://ftp.simtel.net/pub/simtelnet/msdos/astronmy/skyplot.zip (153K)
and try to decompress it with the latest version of INFO-ZIP (www.info-zip.org).
You get a nice message:
skipping: OBJECTS.DAT `shrink' method not supported
The UNZIP provided by Simtel.Net to unpack all .ZIPs from ftp.simtel.net,
ftp://ftp.simtel.net/pub/simtelnet/msdos/UNZIP.EXE (49K),
gives the same result: it is UnZip 5.40 of 21 November 1998, by Info-ZIP.
Can you immediately say which version of which program can unpack this file,
and where it can be downloaded from?
Do you still think that the Definition and "potentially lossy compression"
are merely theoretical or even philosophical matters?
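
The same effect can be reproduced on a small scale. The sketch below is an illustration under assumptions, not the skyplot.zip case itself: it uses the preset-dictionary feature of Python's standard zlib module as a stand-in for "a reference used in the description", and REFERENCE, pack, unpack and can_unpack are hypothetical names.

```python
import zlib

# Hypothetical external reference: both sides are assumed to keep a copy of it.
REFERENCE = b"skipping: method not supported " * 4


def pack(data: bytes, reference: bytes) -> bytes:
    """Compress data so that the result refers to the external reference."""
    comp = zlib.compressobj(zdict=reference)
    return comp.compress(data) + comp.flush()


def unpack(packed: bytes, reference: bytes = b"") -> bytes:
    """Decompress; without the right reference zlib raises zlib.error."""
    if reference:
        decomp = zlib.decompressobj(zdict=reference)
    else:
        decomp = zlib.decompressobj()
    return decomp.decompress(packed) + decomp.flush()


def can_unpack(packed: bytes, reference: bytes = b"") -> bool:
    """Report whether the data is still recoverable with what we have."""
    try:
        unpack(packed, reference)
        return True
    except zlib.error:
        return False
```

With the reference at hand, unpack(pack(data, REFERENCE), REFERENCE) returns the original data; once the reference is gone, can_unpack(packed) is False although not a single bit of the packed data has changed.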
Abbreviations:
DMC    Dynamic Markov Coding
PPM    Prediction by Partial Matching
LZ     Lempel-Ziv methods, incl. LZ77 (zip, rar etc.), LZ78, LZW (gif, v.42bis)
ST     Sort Transform, including
  BWT  Burrows-Wheeler Transform
SEM    Separate Exponents and Mantissas
MTF    Move To Front
DC     Distance Coding
FT     Fourier Transform, including
  DCT  Discrete Cosine Transform (jpeg, mp3)
  DWT  Discrete Wavelet Transform (jpeg-2000)
SC     Subband Coding
VQ     Vector Quantization
RLE    Run Length Encoding
LPC    Linear Predictive Coding, including delta coding, ADPCM, CELP and MELP
PBS    Parallel Blocks Sorting
ENUC   Enumerative Coding
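
Two of the simplest entries in this list, MTF and RLE, are short enough to sketch. The code below is a minimal illustration with hypothetical function names, not a reference implementation; the sample input b"nnbaaa" is the BWT of "banana".

```python
from typing import List, Tuple


def mtf_encode(data: bytes) -> List[int]:
    """Move To Front: replace each byte by its position in a recency list."""
    table = list(range(256))
    out = []
    for byte in data:
        index = table.index(byte)
        out.append(index)
        table.insert(0, table.pop(index))  # move the byte to the front
    return out


def mtf_decode(indices: List[int]) -> bytes:
    """Inverse of mtf_encode: rebuild the bytes from recency-list positions."""
    table = list(range(256))
    out = bytearray()
    for index in indices:
        byte = table.pop(index)
        out.append(byte)
        table.insert(0, byte)
    return bytes(out)


def rle_encode(symbols: List[int]) -> List[Tuple[int, int]]:
    """Run Length Encoding: collapse repeats into (symbol, run length) pairs."""
    runs: List[Tuple[int, int]] = []
    for symbol in symbols:
        if runs and runs[-1][0] == symbol:
            runs[-1] = (symbol, runs[-1][1] + 1)
        else:
            runs.append((symbol, 1))
    return runs
```

Both functions consume their input one symbol at a time, which is why the Map lists MTF and RLE among the stream methods.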