An elementary Python data compression question
Hi all,
I don't work in data compression, but for learning purposes
I ran a program very similar to the one given in the version-specific
tutorial from www.python.org (ch. 11, tut.pdf). The two programs are:
PROGRAM 1
import zlib
s = "witch which has which witches wrist watch"
print "Original String is ::%s:: and length of the string is ::%d::" %(s,len(s))
t = zlib.compress(s)
print "Compressed String is ::%s:: and length of the string is ::%d::" %(t,len(t))
print "Decompressed String is ::%s:: and length of the decompressed string is ::%d::" %(zlib.decompress(t),len(zlib.decompress(t)))
PROGRAM 2
import zlib
s = "witch which has which witches wrist watch"
print "Original String is ::%s:: and length of the string is ::%d::" %(s,len(s))
t = zlib.compress(s)
print "Compressed String is ::%s:: and length of the string is ::%d::" %(t,len(t))
print "Decompressed String is ::%s:: and length of the decompressed string is ::%d::" %(zlib.decompress(t),len(zlib.decompress(t)))
Now the problem is this:
in program 2, 'len(t)', i.e. the length of
the compressed string, is greater than the length of the original string!
In program 1, this 'apparent anomaly' doesn't occur, and that program
uses the string from the python.org tutorial.
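For anyone who wants to reproduce the comparison, here is a minimal sketch that prints the original and compressed lengths side by side for a few inputs. It is written in Python 3 syntax (unlike the programs above), the helper name length_report is made up for illustration, and the sample strings other than the tutorial one are my own assumptions, not the exact inputs from my runs.

```python
import zlib


def length_report(text):
    """Return (original length, compressed length) in bytes for text."""
    data = text.encode("ascii")      # zlib.compress needs bytes in Python 3
    packed = zlib.compress(data)
    # Sanity check: decompressing must give back exactly what went in.
    assert zlib.decompress(packed) == data
    return len(data), len(packed)


# Hypothetical sample inputs: a very short string, the tutorial string,
# and the tutorial string repeated many times.
tutorial = "witch which has which witches wrist watch"
samples = ["abc", tutorial, (tutorial + " ") * 20]

for sample in samples:
    original, compressed = length_report(sample)
    print("original %4d bytes -> compressed %4d bytes" % (original, compressed))
```

Running it on inputs of different lengths and with different amounts of repetition should show when the compressed form comes out longer than the original.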
Can anybody provide an
explanation? At this point, I don't want to read a few hundred pages on
data compression to find out the answer.
Thanks in advance,
Abhishek Pathak.