Mailing List Archive

inquiry on method to compress PNG imagefiles stored by WikiMedia
i've never contributed to any mailing lists, or partaken much, so i don't know how to start this off...

hi? i've realized i can compress a dataset of lossless .png files down to between a third and a fourth of the initial size on disk.

that said: in compressing my own backups i find that lossless .png files converted to .xpm (not .ppm, not .bmp) before being compressed end up 3 to 4 times smaller on disk than the initial .png with its own compression, regardless of any 'PNG optimization' done beforehand: the resulting .xpm files remain identical in size and compress to precisely the same size.

the following commands¹ can surely be tinkered with to greater effect:
---start of shell commands---
mkdir -p xpm
magick convert image.png xpm/image.xpm
mkdwarfs -i xpm/ -o compressed.dfs -l9

---end of shell commands---
now you have a compressed image, three to four times smaller in size on disk, to inspect.
here i openly wonder how a comparison - to assert whether the resulting .xpm file and the lossless .png are indeed the same picture still - would be carried out.
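
one way such a check could be sketched, assuming both files have already been decoded into equally sized lists of RGB tuples by some image library (the decoding step is not shown, and `compare_pixels` and the sample pixel values are made up for illustration): count the pixels that differ and the distinct colors on each side. fewer colors after conversion would suggest the .xpm is no longer the same picture. ImageMagick's own `magick compare -metric AE` is meant to report a differing-pixel count as well.

```python
def compare_pixels(original, converted):
    """Summarize how two decoded images with the same dimensions differ.

    Both arguments are lists of (r, g, b) tuples in the same pixel order.
    """
    if len(original) != len(converted):
        raise ValueError("images have different pixel counts")
    differing = sum(1 for a, b in zip(original, converted) if a != b)
    return {
        "differing_pixels": differing,
        "colors_in_original": len(set(original)),
        "colors_in_converted": len(set(converted)),
    }


# hypothetical 2x2 images: the converted one has lost one shade of red
png_pixels = [(255, 0, 0), (254, 0, 0), (0, 0, 255), (0, 0, 255)]
xpm_pixels = [(255, 0, 0), (255, 0, 0), (0, 0, 255), (0, 0, 255)]

report = compare_pixels(png_pixels, xpm_pixels)
print(report)
```

a report with zero differing pixels would mean the conversion kept every pixel; anything else means some color information changed along the way.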

likewise i see this quality of compression extends to the .xgm .xbm formats.

in sharing i wish to bring up the above observation to my best ability. insights welcome as to why this happens and whether the image resulting from a lossless .png to .xpm conversion is indeed the same - whether this compressed bitmap outperforms the PNG compression significantly without compromising image integrity. (do reply saying if this is irrelevant information and/or presented inadequately in any way: i don't wish to bring red herrings to this mailing list.)

ultimately this is about whether a large portion of WikiMedia imagedata can indeed be compressed further by this process - in a 'Pareto improvement' kind of way.
-Ivy, 25

for interest: sources to programs referenced and my brief notes on these.
[1]
magick: https://imagemagick.org/index.php

mkdwarfs, part of the inappropriately named dwarfs toolset:
https://github.com/mhx/dwarfs/blob/main/doc/mkdwarfs.md

note that the -l flag is given the option 9 in the command - this means LZMA is used in this program.
also note that while i use 'mkdwarfs' with LZMA here, i see the same result with any other program using LZMA or XZ - like the following 'dar', which is better suited to extracting single files from an archive and analyzing individual file compression values en masse.
dar: http://dar.linux.free.fr/doc/man/dar.html
unfortunately the project's website doesn't use HTTPS, so a wayback machine link with HTTPS:
https://web.archive.org/web/20240423233825/http://dar.linux.free.fr/doc/man/dar.html

---start of referenced dar command---
dar -c output -zxz9 -R input/
---end of referenced dar command---

lastly two clarifications:
'a lossless .png file' here means an imagefile which has never been converted in a lossy way. a .jpg file converted to a .png and used in this procedure produces an output taking more space on disk.
'the procedure' here refers to conversion of a lossless .png to .xpm imagefile format then compressed with either LZMA or XZ in any program.

thank you for reading.
Re: inquiry on method to compress PNG imagefiles stored by WikiMedia
Generally disk space is cheap enough that doing these sorts of things
isn't worth the extra complexity, especially when we just have to convert
it back to png in order to serve it to the user at the end of the day.

LZMA is a more advanced compression algorithm than DEFLATE (which is what
png uses), so it's not entirely surprising it might give better results for
some images.
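
One concrete difference is the match window: DEFLATE can only refer back
32 KiB, while LZMA dictionaries are typically several megabytes. A minimal
sketch using Python's standard zlib and lzma modules, with synthetic data
standing in for image bytes (an assumption for illustration, not a
measurement of real PNG or XPM files):

```python
import lzma
import random
import zlib

# Synthetic stand-in for image data: one 64 KiB block of noise repeated four
# times, so each repeat sits 64 KiB behind its copy, beyond DEFLATE's reach.
rng = random.Random(0)
block = bytes(rng.getrandbits(8) for _ in range(64 * 1024))
data = block * 4

deflate_size = len(zlib.compress(data, 9))  # DEFLATE, 32 KiB window
lzma_size = len(lzma.compress(data))        # XZ container, LZMA2, default preset

print(f"raw:     {len(data)} bytes")
print(f"DEFLATE: {deflate_size} bytes")
print(f"LZMA:    {lzma_size} bytes")
```

On this input DEFLATE cannot see the repeats and LZMA can, so the gap is
much larger than it would be on typical image data.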

--
Bawolff

On Tuesday 23 April 2024, TealSynapse--- via Wikitech-l <
wikitech-l@lists.wikimedia.org> wrote:

> [...]
Re: inquiry on method to compress PNG imagefiles stored by WikiMedia
It is true that the PNG format doesn't offer the absolute best compression
- no one would claim it does. What it does offer is ultimate compatibility
-- it's viewable on nearly every browser and device ever made, which is
more important than disk usage.

When ImageMagick converts from PNG to XPM, the image remains "the same",
but it changes the image from an RGB encoding to a palette-based bitmap
with a predefined set of colors. Combine this with modern compression, and
you should indeed see a better compression ratio.
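
To illustrate what "palette-based" means here, the following sketch reads a
tiny hand-written XPM3 file: a header line, a color table mapping character
codes to colors, then one string of codes per pixel row. The parse_xpm
helper and the sample image are hypothetical and simplified (only the "c"
color key, no extensions); it is not how ImageMagick reads the format.

```python
import re

# A hand-written 4x2 XPM3 image with two colors, one character per pixel.
XPM_TEXT = '''/* XPM */
static char *tiny[] = {
"4 2 2 1",
"a c #FF0000",
"b c #0000FF",
"aabb",
"bbaa"
};
'''


def parse_xpm(text):
    """Return (width, height, palette, rows) for a simple XPM3 image."""
    strings = re.findall(r'"([^"]*)"', text)
    # Header: width, height, number of colors, characters per pixel.
    width, height, ncolors, cpp = (int(n) for n in strings[0].split()[:4])
    palette = {}
    for line in strings[1:1 + ncolors]:
        code, spec = line[:cpp], line[cpp:].split()
        # spec looks like ['c', '#FF0000']; take the value after the 'c' key.
        palette[code] = spec[spec.index("c") + 1]
    rows = []
    for line in strings[1 + ncolors:1 + ncolors + height]:
        rows.append([palette[line[i:i + cpp]] for i in range(0, width * cpp, cpp)])
    return width, height, palette, rows


width, height, palette, rows = parse_xpm(XPM_TEXT)
print(len(palette), "palette entries")
print(rows[0])
```

Every pixel is one of a few repeated character codes, so the text is highly
repetitive, which is the kind of input a dictionary compressor handles well.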


On Tue, Apr 23, 2024 at 8:30 PM TealSynapse--- via Wikitech-l <
wikitech-l@lists.wikimedia.org> wrote:

> [...]



--
Dmitry Brant
Lead Software Engineer (Android)
Wikimedia Foundation
https://www.mediawiki.org/wiki/Wikimedia_mobile_engineering
Re: inquiry on method to compress PNG imagefiles stored by WikiMedia
And there is all the metadata that tends to get
jumbled/warped/damaged/lost. That is another risk that has to be taken into
account when processing originals to be stored in another format.

DJ

On Wed, Apr 24, 2024 at 3:28 AM Dmitry Brant <dbrant@wikimedia.org> wrote:

> [...]