Mailing List Archive

Bug in module gzip
In past Python versions I have had various problems with the gzip
module, but several people told me that they were fixed in 1.5.2. So I
looked at it again, only to get an immediate crash when reading a
gzipped file:

  File "/usr/lib/python1.5/gzip.py", line 270, in readline
    c = self.read(readsize)
  File "/usr/lib/python1.5/gzip.py", line 157, in read
    self._read(readsize)
  File "/usr/lib/python1.5/gzip.py", line 210, in _read
    if self.decompress.unused_data != "":
AttributeError: unused_data

The attribute unused_data is in fact never set. It should thus be
impossible to read any file at all, which makes me wonder if nobody
noticed this before... and perhaps has a fix!
--
-------------------------------------------------------------------------------
Konrad Hinsen | E-Mail: hinsen@cnrs-orleans.fr
Centre de Biophysique Moleculaire (CNRS) | Tel.: +33-2.38.25.55.69
Rue Charles Sadron | Fax: +33-2.38.63.15.17
45071 Orleans Cedex 2 | Deutsch/Esperanto/English/
France | Nederlands/Francais
-------------------------------------------------------------------------------
Bug in module gzip [ In reply to ]
Konrad Hinsen writes:
>The attribute unused_data is in fact never set. It should thus be
>impossible to read any file at all, which makes me wonder if nobody
>noticed this before... and perhaps has a fix!

unused_data is an attribute added to the zlib module in 1.5.2,
so Python code can get access to data after the end of a compressed
stream. Therefore you'll also need to use the zlib module from 1.5.2.
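
To illustrate what unused_data provides, here is a minimal sketch. It
assumes a modern Python 3 zlib module rather than the 1.5.2 one discussed
in this thread, and the helper name split_stream is hypothetical:

```python
import zlib

def split_stream(data):
    """Decompress the first zlib stream in data.

    Returns (payload, leftover), where leftover is whatever followed
    the end of the compressed stream.
    """
    d = zlib.decompressobj()
    payload = d.decompress(data) + d.flush()
    # Bytes past the end of the compressed stream are not decompressed;
    # the decompression object exposes them as unused_data.
    return payload, d.unused_data

compressed = zlib.compress(b"hello ")
trailer = b"TRAILING BYTES"
payload, leftover = split_stream(compressed + trailer)
```

Here payload holds the decompressed text and leftover holds the bytes
that came after the stream. With a zlib module that lacks the attribute,
the access to d.unused_data is what would raise the AttributeError
reported above.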

--
A.M. Kuchling http://starship.python.net/crew/amk/
Despair says little, and is patient.
-- From SANDMAN: "Season of Mists", episode 0
Bug in module gzip [ In reply to ]
I would guess that you're using the new gzip.py but not the new zlib
module. The unused_data attribute was added to fix a bug in gzip.py,
which prevented it from properly handling gzip-created files that
contained multiple, independent compressed chunks. (Did I say that at
all clearly?)

With the gzip utility you can do the following:

    cat part1.txt >> input
    cat part2.txt >> input
    gzip part1.txt
    gzip part2.txt
    cat part2.txt.gz >> part1.txt.gz
    gzip -d --stdout part1.txt.gz > output
    diff input output
    [reports no difference]
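
The same experiment can be sketched in Python. This assumes a modern
Python 3 gzip module, whose gzip.compress and gzip.decompress functions
did not exist in 1.5.2; the variable names are illustrative only:

```python
import gzip

part1 = b"first part\n"
part2 = b"second part\n"

# Equivalent of `cat part2.txt.gz >> part1.txt.gz`: two independently
# compressed gzip members placed back to back.
combined = gzip.compress(part1) + gzip.compress(part2)

# Equivalent of `gzip -d --stdout`: decompress every member in turn.
output = gzip.decompress(combined)

# Equivalent of `diff input output` reporting no difference.
assert output == part1 + part2
```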

Getting this right involved changing the interface to zlib.
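
As a rough sketch of why the zlib interface matters here: a reader can
decompress one chunk, pick up whatever is left in unused_data, and start
a fresh decompressor on it. This is not the code from gzip.py. It assumes
a modern Python 3 zlib, where a wbits value of MAX_WBITS | 16 makes zlib
parse the gzip header and trailer itself, and the function name
read_members is hypothetical:

```python
import gzip
import zlib

def read_members(data):
    """Return the decompressed contents of each gzip member in data."""
    members = []
    while data:
        # A decompression object handles exactly one compressed stream,
        # so a new one is created for every member.
        d = zlib.decompressobj(zlib.MAX_WBITS | 16)
        members.append(d.decompress(data) + d.flush())
        # Everything after the end of this member is the next member.
        data = d.unused_data
    return members

members = read_members(gzip.compress(b"part one") + gzip.compress(b"part two"))
```

Without unused_data, the bytes following the first chunk would not be
available to the Python code doing the reading.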

Jeremy