PAQ
7 languages
Tools
From Wikipedia, the free encyclopedia
For other uses, see
.
For the unrelated
PAQ.EXE
compressor for DOS, see
.
This article includes a list of
, but
it lacks sufficient
corresponding
.
Please help to
this article by
more precise citations.
(
March 2011
)
(
)
A sample session of PAQ8O
PAQ
is a series of
archivers that have gone through collaborative
development to top rankings on several benchmarks
measuring compression ratio (although at the expense
of speed and memory usage).
Specialized versions of PAQ have won the
and the
.
PAQ is
distributed under the
.
Algorithm
[
]
PAQ uses a
algorithm. Context mixing is related to
(PPM) in that the
compressor is divided into a predictor and an
, but differs in that the next-symbol prediction is
computed using a weighted combination of probability estimates from a large number of models conditioned on different
contexts.
Unlike PPM, a context doesn't need to be contiguous.
Most PAQ versions collect next-symbol statistics for
the following contexts:
; the context is the last
n
bytes before the predicted symbol (as in PPM);
whole-word
n
-grams, ignoring case and nonalphabetic characters (useful in text files);
"sparse" contexts, for example, the second and fourth bytes preceding the predicted symbol (useful in some binary
formats);
"analog" contexts, consisting of the high-order bits of previous 8- or 16-bit words (useful for multimedia files);
two-dimensional contexts (useful for images, tables, and spreadsheets); the row length is determined by finding the
stride length of repeating byte patterns;
specialized models, such as
executables,
,
, or
images; these models are active only when the
particular file type is detected.
All PAQ versions predict and compress one bit at a time, but differ in the details of the models and how the
predictions are combined and postprocessed.
Once the next-bit probability is determined, it is encoded by
.
There are three methods for combining predictions, depending on the version:
In PAQ1 through PAQ3, each prediction is represented as a pair of bit counts
.
These counts are combined by
weighted summation, with greater weights given to longer contexts.
In PAQ4 through PAQ6, the predictions are combined as before, but the weights assigned to each model are adjusted
to favor the more accurate models.
In PAQ7 and later, each model outputs a probability rather than a pair of counts.
The probabilities are combined
using an
.
PAQ1SSE and later versions postprocess the prediction using secondary symbol estimation (SSE).
The combined prediction
and a small context are used to look up a new prediction in a table.
After the bit is encoded, the table entry is
adjusted to reduce the prediction error.
SSE stages can be pipelined with different contexts or computed in parallel
with the outputs averaged.
Arithmetic coding
[
]
A string
s
is compressed to the shortest byte string representing a base-256
number
x
in the range [0, 1]
such that P(
r
<
s
) ≤
x
< P(
r
≤
s
), where P(
r
<
s
) is the probability that a random string
r
with the same length as
s
will be
less than
s
.
It is always possible to find an
x
such that the length of
x
is at most one
byte longer than the
, −log
2
P(
r
=
s
) bits.
The length of
s
is stored in the archive header.
The
in PAQ is implemented by maintaining for each prediction a lower and upper bound on
x
, initially
[0, 1].
After each prediction, the current range is split into two parts in proportion to P(0) and P(1), the
probability that the next bit of
s
will be a 0 or 1 respectively, given the previous bits of
s
.
The next bit is then
encoded by selecting the corresponding subrange to be the new range.
The number
x
is decompressed back to string
s
by making an identical series of bit predictions (since the previous
bits of
s
are known).
The range is split as with compression.
The portion containing
x
becomes the new range, and the
corresponding bit is appended to
s
.
In PAQ, the lower and upper bounds of the range are represented in three parts.
The most significant base-256 digits
are identical, so they can be written as the leading bytes of
x
.
The next 4 bytes are kept in memory, such that the
leading byte is different.
The trailing bits are assumed to be all zeros for the lower bound and all ones for the upper
bound.
Compression is terminated by writing one more byte from the lower bound.
Adaptive model weighting
[
]
In PAQ versions through PAQ6, each model maps a set of distinct contexts to a pair of counts,
, a count of zero
bits, and
, a count of 1 bits.
In order to favor recent history, half of the count over 2 is discarded when the
opposite bit is observed.
For example, if the current state associated with a context is
and a 1
is observed, then the counts are updated to (7, 4).
A bit is arithmetically coded with space proportional to its probability, either P(1) or P(0) = 1 − P(1).
The
probabilities are computed by weighted addition of the 0 and 1 counts:
S
0
= Σ
i
w
i
n
0
i
,
S
1
= Σ
i
w
i
n
1
i
,
S
=
S
0
+
S
1
,
P(0) =
S
0
/
S
,
P(1) =
S
1
/
S
,
where
w
i
is the weight of the
i
-th model.
Through PAQ3, the weights were fixed and set in an ad-hoc manner.
(Order-
n
contexts had a weight of
n
2
.)
Beginning with PAQ4, the weights were adjusted adaptively in the direction that would
reduce future errors in the same context set.
If the bit to be coded is
y
, then the weight adjustment is:
n
i
=
n
0
i
+
n
1
i
,
error =
y
– P(1),
w
i
←
w
i
+ [(
S
n
1
i
−
S
1
n
i
) / (
S
0
S
1
)] error.
Neural-network mixing
[
]
Beginning with PAQ7, each model outputs a prediction (instead of a pair of counts).
These predictions are averaged in
the logistic domain:
x
i
= stretch(P
i
(1)),
P(1) = squash(Σ
i
w
i
x
i
),
where P(1) is the probability that the next bit will be a 1, P
i
(1) is the probability estimated by the
i
-th model, and
stretch(
x
) = ln(
x
/ (1 −
x
)),
squash(
x
) = 1 / (1 +
e
−
x
) (inverse of stretch).
After each prediction, the model is updated by adjusting the weights to minimize coding cost:
w
i
←
w
i
+ η
x
i
(
y
− P(1)),
where η is the
(typically 0.002 to 0.01),
y
is the predicted bit, and (
y
− P(1)) is the prediction
error.
The weight update algorithm differs from
in that the terms P(1)P(0) are dropped.
This is
because the goal of the neural network is to minimize coding cost, not
error.
Most versions of PAQ use a small context to select among sets of weights for the neural network.
Some versions use
multiple networks whose outputs are combined with one more network prior to the SSE stages.
Furthermore, for each
input prediction there may be several inputs which are
functions of P
i
(1) in addition to stretch(P(1)).
Context modeling
[
]
Each model partitions the known bits of
s
into a set of contexts and maps each context to a bit history represented by
an 8-bit state.
In versions through PAQ6, the state represents a pair of counters (
n
0
,
n
1
).
In PAQ7 and later versions
under certain conditions, the state also represents the value of the last bit or the entire sequence.
The states are
mapped to probabilities using a 256-entry table for each model.
After a prediction by the model, the table entry is
adjusted slightly (typically by 0.4%) to reduce the prediction error.
In all PAQ8 versions, the representable states are as follows:
The exact bit sequence for up to 4 bits.
A pair of counts and an indicator of the most recent bit for sequences of 5 to 15 bits.
A pair of counts for sequences of 16 to 41 bits.
To keep the number of states to 256, the following limits are placed on the representable counts: (41, 0), (40, 1),
(12, 2), (5, 3), (4, 4), (3, 5), (2, 12), (1, 40), (0, 41).
If a count exceeds this limit, then the next state is one
chosen to have a similar ratio of
n
0
to
n
1
.
Thus, if the current state is (
n
0
= 4,
n
1
= 4, last bit = 0) and a 1 is
observed, then the new state is not (
n
0
= 4,
n
1
= 5, last bit = 1).
Rather, it is (
n
0
= 3, n
1
= 4, last bit = 1).
Most context models are implemented as
.
Some small contexts are implemented as direct
.
Text preprocessing
[
]
Some versions of PAQ, in particular PAsQDa, PAQAR (both PAQ6 derivatives), and PAQ8HP1 through PAQ8HP8 (PAQ8
derivatives and
recipients) preprocess text files by looking up words in an external dictionary and
replacing them with 1- to 3-byte codes.
In addition, uppercase letters are encoded with a special character followed
by the lowercase letter.
In the PAQ8HP series, the dictionary is organized by grouping syntactically and semantically
related words together.
This allows models to use just the most significant bits of the dictionary codes as context.
Comparison
[
]
The following table is a sample from the
by Matt Mahoney that consists of a file
consisting of 10
9
bytes (1
, or 0.931
) of
text.
Program
Compressed size (bytes)
% of original size
Compression time (
/
)
Memory (MiB)
nncp v3.2
107,261,318
10.73
241,871
7600
cmix v20
110,119,440
11.01
621,780
31650
PAQ8PX_v206fix1
125,099,359
12.51
291,916
28151
183,976,014
18.4
880
256
254,007,875
25.4
379
8
322,649,703
32.26
104
0.1
See
for a list of file compression benchmarks.
History
[
]
The following lists the major enhancements to the PAQ algorithm.
In addition, there have been a large number of
incremental improvements, which are omitted.
PAQ1
was released on January 6, 2002, by Matt Mahoney. It used fixed weights and did not include an analog or
sparse model.
PAQ1SSE/PAQ2
was released on May 11, 2003, by Serge Osnach. It significantly improved compression by adding a
Secondary Symbol Estimation (SSE) stage between the predictor and encoder. SSE inputs a short context and the
current prediction and outputs a new prediction from a table. The table entry is then adjusted to reflect the
actual bit value.
PAQ3N
, released October 9, 2003, added a sparse model.
PAQ4
, released November 15, 2003, by Matt Mahoney used adaptive weighting.
PAQ5
(December 18, 2003) and
PAQ6
(December 30, 2003) were minor improvements, including a new analog model.
At this point, PAQ was competitive with
the best PPM compressors and attracted the attention of the data compression community, which resulted in a large
number of incremental improvements through April 2004.
Berto Destasio tuned the models and adjusted the bit count
discounting schedule.
Johan de Bock made improvements to the user interface.
David A. Scott made improvements to
the arithmetic coder.
Fabio Buffoni made speed improvements.
During the period May 20, 2004, through July 27, 2004, Alexander Ratushnyak released seven versions of
PAQAR
, which
made significant compression improvements by adding many new models, multiple mixers with weights selected by
context, adding an SSE stage to each mixer output, and adding a preprocessor to improve the compression of Intel
executable files.
PAQAR stood as the top-ranked compressor through the end of 2004 but was significantly slower
than prior PAQ versions.
During the period January 18, 2005, through February 7, 2005, Przemyslaw Skibinski released four versions of
PASqDa
, based on PAQ6 and PAQAR with the addition of an English dictionary preprocessor.
It achieved the top
ranking on the Calgary corpus but not on most other benchmarks.
A modified version of
PAQ6
won the
on January 10, 2004, by Matt Mahoney.
This was bettered by ten
subsequent versions of PAQAR by Alexander Ratushnyak.
The most recent was submitted on June 5, 2006, consisting of
compressed data and program source code totaling 589,862 bytes.
PAQ7
was released December 2005 by Matt Mahoney. PAQ7 is a complete rewrite of PAQ6 and variants (PAQAR, PAsQDa).
Compression ratio was similar to PAQAR but 3 times faster. However it lacked x86 and a dictionary, so it did not
compress Windows executables and English text files as well as PAsQDa. It does include models for color BMP, TIFF
and JPEG files, so compresses these files better. The primary difference from PAQ6 is it uses a neural network to
combine models rather than a gradient descent mixer. Another feature is PAQ7's ability to compress embedded jpeg
and bitmap images in Excel-, Word- and pdf-files.
PAQ8A
was released on January 27, 2006,
PAQ8C
on February 13, 2006. These were experimental pre-release of
anticipated PAQ8. It fixed several issues in PAQ7 (poor compression in some cases). PAQ8A also included model for
compressing (x86) executables.
PAQ8F
was released on February 28, 2006. PAQ8F had three improvements over PAQ8A: a more memory efficient context
model, a new indirect context model to improve compression, and a new user interface to support drag and drop in
Windows. It does not use an English dictionary like the PAQ8B/C/D/E variants.
PAQ8G
was released March 3, 2006, by Przemyslaw Skibinski.
PAQ8G is PAQ8F with dictionaries added and some other
improvements as a redesigned TextFilter (which does not decrease compression performance on non-textual files)
PAQ8H
was released on March 22, 2006, by Alexander Ratushnyak and updated on March 24, 2006. PAQ8H is based on
PAQ8G with some improvements to the model.
PAQ8I
was released on August 18, 2006, by Pavel L. Holoborodko, with bug fixes on August 24, September 4, and
September 13.
It added a
model for
files.
PAQ8J
was released on November 13, 2006, by Bill Pettis.
It was based on
PAQ8F
with some text model improvements
taken from PAQ8HP5.
Thus, it did not include the text dictionaries from
PAQ8G
or PGM model from
PAQ8I
.
Serge Osnach released a series of modeling improvements:
PAQ8JA
on November 16, 2006,
PAQ8JB
on November 21, and
PAQ8JC
on November 28.
PAQ8JD
was released on December 30, 2006, by Bill Pettis.
This version has since been ported to 32 bit
for
several processors, and 32 and 64 bit
.
PAQ8K
was released on February 13, 2007, by Bill Pettis.
It includes additional models for binary files.
PAQ8L
was released on March 8, 2007 by Matt Mahoney.
It is based on PAQ8JD and adds a
model.
PAQ8O
was released on August 24, 2007 by Andreas Morphis. Contains improved
and
models over PAQ8L. Can be
optionally compiled with
support and for 64-bit Linux. The algorithm has notable performance benefits under
64-bit OS.
PAQ8P
was released on August 25, 2008, by Andreas Morphis. Contains improved BMP model and adds a
model.
PAQ8PX
was released on April 25, 2009, by Jan Ondrus. Unlike most PAQ variants, it has continued to receive updates
from multiple contributors and has evolved through many versions.
PAQ8KX
was released on July 15, 2009, by Jan Ondrus. It is a combination of PAQ8K with PAQ8PX.
PAQ8PF
was released on September 9, 2009, by LovePimple without source code (which the
license requires). It
compresses 7% worse, but is 7 times faster compared to PAQ8PX v66 (measured with 1 MB English text)
PAQ9A
was released on December 31, 2007, by Matt Mahoney. A new experimental version. It does not include models
for specific file types, has an LZP preprocessor and supports files over 2 GB.
was released on March 12, 2009, by Matt Mahoney. It uses a new archive format designed so that the current
ZPAQ program will be able to decompress archives created by future ZPAQ versions
(the various PAQ variants listed
above are not forward compatible in this fashion). It achieves this by specifying the decompression algorithm in a
bytecode program that is stored in each created archive file.
Hutter Prizes
[
]
The series
PAQ8HP1
through
PAQ8HP8
were released by Alexander Ratushnyak from August 21, 2006, through January 18,
2007, as
submissions.
The Hutter Prize is a text compression contest using a 100 MB English and XML data
set derived from Wikipedia's source.
The PAQ8HP series was forked from PAQ8H.
The programs include text preprocessing
dictionaries and models tuned specifically to the benchmark.
All non-text models were removed.
The dictionaries were
organized to group syntactically and semantically related words and to group words by common suffix.
The former
strategy improves compression because related words (which are likely to appear in similar context) can be modeled on
the high order bits of their dictionary codes.
The latter strategy makes the dictionary easier to compress.
The size
of the decompression program and compressed dictionary is included in the contest ranking.
On October 27, 2006, it was announced
that
PAQ8HP5
won a
of
3,416.
On June 30, 2007, Ratushnyak's
PAQ8HP12
was awarded a second Hutter prize of €1732,
improving upon his previous
record by 3.46%.
PAQ derivations
[
]
Being
, PAQ can be modified and redistributed by anyone who has a copy. This has allowed other authors to
the PAQ compression engine and add new features such as a
or better speed (at the
expense of compression ratio). Notable PAQ derivatives include:
WinUDA 0.291
, based on PAQ6 but faster
UDA 0.301
, based on PAQ8I algorithm
, based on PAQ6
(beta version is based on PAQ7).
Emilcont
based on PAQ6
GUI frontend (for Windows and Linux) for
LPAQ
,
ZPAQ and various PAQ8* algorithms
PWCM (PAQ weighted context mixing) is an independently developed closed source implementation of the PAQ algorithm
used in WinRK.
PAQCompress
is a graphical user interface for several newer PAQ versions, including the latest releases of PAQ8PX,
PAQ8PXD and PAQ8PXV. It is updated whenever a new version is released. The software intelligently appends an
extension to the filename which it can use to decompress the file using the correct PAQ Version. The software is
open source.
PerfectCompress
Is a compression software which features UCA (ULTRA Compressed Archive). A compression format
that featured PAQ8PX v42 to v65 and that now can use PAQ8PF, PAQ8KX, or PAQ8PXPRE as the default UCA Compressor. In
addition, PerfectCompress can compress files to PAQ8PX v42 to v67, and ZPAQ, and as of version 6.0, can compress
files to LPAQ and PAQ8PF beta 1 to beta 3. PerfectCompress v6.10 introduced support compression for the recently
released PAQ8PXPRE. PerfectCompress 6.12 introduces support for the PAQ8KX series.
FrontPAQ
, small gui for PAQ. Latest version is FrontPAQ v8 supporting PAQ8PX, PAQ8PF, and FP8. The software is no
longer updated and users are encouraged to use PAQCompress, which implements the latest PAQ releases.
See also
[
]
References
[
]
. Mailcom.com
. Retrieved
2010-05-19
.
. Retrieved
2007-07-10
.
You may download, use, copy, modify, and distribute these programs
under the terms of the GNU general public license
. Retrieved
2026-03-05
.
.
manpages.ubuntu.com
.
(PDF)
. Retrieved
2010-09-03
.
James Bowery.
. Published October 27, 2006. Retrieved October 30, 2006.
. Archived from
on 2007-07-08.
^
February 24, 2007, at the
. Kgbarchiver.net. Archived from
on 2009-01-05
. Retrieved
2010-05-19
.
. Freewebs.com. Archived from
on 2010-09-10
. Retrieved
2010-05-19
.
Matt Mahoney (2007).
. Retrieved
2013-12-29
.
. PeaZip
. Retrieved
2013-10-06
.
. Maximumcompression.com. 2007-04-14. Archived from
on 2009-04-17
. Retrieved
2010-05-19
.
Cardona, Moises (2022-09-10).
.
moisespr123/PAQCompress
. Retrieved
2025-03-28
.
. Moises-studios.110mb.com. 2010-04-03. Archived from
on 2013-08-24
.
Retrieved
2010-05-19
.
. Facebook.com. Archived from
on 2016-01-09
. Retrieved
2010-05-19
.
.
encode.su
. Retrieved
2019-07-26
.
Further reading
[
]
David Salomon, Giovanni Motta, (with contributions by David Bryant),
Handbook of Data Compression
, 5th edition,
Springer, 2009,
, 5.15 PAQ, pp. 314–319
Byron Knoll, Nando de Freitas,
, University of
British Columbia, Vancouver, Canada, August 17, 2011
External links
[
]
- Linux command-line executables download.
software
with
compression
(
)
(decompression only)
(decompression only)
(decompression
only)
Non-archiving
compressors
Generic
(
)
(
)
Blu-code
Others
See also:
and
Archiving only
Compressing only
Archiving
and compressing
DGCA
SQX
Document packaging
and distributing
methods
type
Other
Hybrid
LZ77 + Huffman
LZ77 + ANS
LZ77 + Huffman + ANS
LZ77 + Huffman + context
LZSS + Huffman
LZ77 + Range
LZHAM
RLE
+ BWT + MTF + Huffman
type
Predictive
Motion
Concepts
parts
Concepts
Methods
Concepts
parts
Motion
Community
People
:
This page was last edited on 17 April 2026, at 01:06
(UTC)
.
Text is available under the
;
additional terms may apply. By using this site, you agree to the
and
. Wikipedia® is a registered trademark of the
, a non-profit organization.