gzip compression, using class GzipFile from gzip.py, by default
inserts a timestamp into the compressed stream. If the optional
argument `mtime` is absent or None, the current time is used [1].
This makes outputs non-deterministic, which can badly confuse
unsuspecting users: if you run "diff" over two outputs to check
that they are unaffected by changes in your application, you
would not expect the *.gz binaries to differ just because they
were created at different times.
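
A minimal sketch of the effect, for illustration only. The helper names
`gzip_bytes` and `header_mtime` are made up for this example; the only
library feature relied on is the existing `mtime` argument of GzipFile,
and the header offset is taken from RFC 1952 (MTIME is a little-endian
32-bit field at bytes 4 to 7).

```python
import gzip
import io
import struct


def gzip_bytes(data, mtime=None):
    """Compress data in memory; mtime=None means 'use the current time'."""
    buf = io.BytesIO()
    with gzip.GzipFile(fileobj=buf, mode="wb", mtime=mtime) as f:
        f.write(data)
    return buf.getvalue()


def header_mtime(blob):
    """Read the MTIME field (bytes 4..7, little-endian) of a gzip header."""
    return struct.unpack("<I", blob[4:8])[0]


payload = b"hello world\n" * 100

stamped = gzip_bytes(payload)            # header carries the creation time
fixed_a = gzip_bytes(payload, mtime=0)   # header carries a fixed value
fixed_b = gzip_bytes(payload, mtime=0)

print(header_mtime(stamped))   # seconds since the epoch at creation time
print(header_mtime(fixed_a))   # the fixed value passed as mtime
print(fixed_a == fixed_b)      # identical bytes for identical input
```

With the default, two runs more than a second apart produce different
bytes for the same input; with a fixed `mtime` they do not.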
I'd propose to introduce a new constant `NO_TIMESTAMP` as a
possible value of `mtime`.
Furthermore, if the policy on API changes allows, I'd suggest
that `NO_TIMESTAMP` become the new default value for `mtime`.
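
To make the proposal concrete, here is a rough sketch of how it might
look from user code. `NO_TIMESTAMP` does not exist in the gzip module
today, and `compress_reproducible` is a hypothetical wrapper. Choosing 0
as the value is an assumption, based on RFC 1952 defining MTIME = 0 as
"no time stamp is available".

```python
import gzip

# Hypothetical constant, not part of the gzip module.
# Assumption: 0, which RFC 1952 reserves for "no time stamp is available".
NO_TIMESTAMP = 0


def compress_reproducible(data, compresslevel=9, mtime=NO_TIMESTAMP):
    """Like gzip.compress, but without a timestamp unless one is requested."""
    return gzip.compress(data, compresslevel=compresslevel, mtime=mtime)
```

Callers who want the current behaviour could still pass an explicit
timestamp, e.g. `mtime=time.time()`.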
How to proceed from here? Is this the kind of proposal that
has to go through a PEP?
- Joachim
[1]
https://github.com/python/cpython/blob/6f1e8ccffa5b1272a36a35405d3c4e4bbba0c082/Lib/gzip.py#L163