Mailing List Archive

How do you fight image-spam?
Hi, everyone!

I'm trying to fight image spam: a message whose image is a Microsoft phishing attempt. I installed FuzzyOcr; I know this plugin is very old.

The installation went fine, but I don't think the plugin is loading correctly: I trained on a spam message, and the logs show nothing about it being imported into the plugin database:

spamassassin --debug FuzzyOcr < Vista\ Previa\ -\ Confirme\ inicio\ de\ sesion.eml > /dev/null
ene 17 09:02:52.231 [31789] dbg: FuzzyOcr: focr_bin_helper: 'pnmnorm,pnminvert,pamthreshold,ppmtopgm,pamtopnm'
ene 17 09:02:52.231 [31789] info: FuzzyOcr: Adding <5> new helper apps
ene 17 09:02:52.231 [31789] dbg: FuzzyOcr: focr_bin_helper: 'tesseract'
ene 17 09:02:52.231 [31789] info: FuzzyOcr: Adding <1> new helper apps
ene 17 09:02:52.231 [31789] dbg: FuzzyOcr: focr_bin_helper: 'pnmnorm,pnminvert,convert,ppmtopgm,tesseract'
ene 17 09:02:52.231 [31789] warn: FuzzyOcr: pnmnorm is already defined, skipping...
ene 17 09:02:52.231 [31789] warn: FuzzyOcr: pnminvert is already defined, skipping...
ene 17 09:02:52.231 [31789] warn: FuzzyOcr: ppmtopgm is already defined, skipping...
ene 17 09:02:52.231 [31789] warn: FuzzyOcr: tesseract is already defined, skipping...
ene 17 09:02:52.231 [31789] info: FuzzyOcr: Adding <1> new helper apps
ene 17 09:02:52.232 [31789] info: FuzzyOcr: Starting preprocessor parser for file "/etc/mail/spamassassin/FuzzyOcr.preps"...
ene 17 09:02:52.232 [31789] dbg: FuzzyOcr: line: preprocessor normalize {
ene 17 09:02:52.232 [31789] dbg: FuzzyOcr: line: command = pnmnorm
ene 17 09:02:52.232 [31789] dbg: FuzzyOcr: line: }
ene 17 09:02:52.232 [31789] dbg: FuzzyOcr: line: preprocessor invert {
ene 17 09:02:52.232 [31789] dbg: FuzzyOcr: line: command = pnminvert
ene 17 09:02:52.232 [31789] dbg: FuzzyOcr: line: }
ene 17 09:02:52.232 [31789] dbg: FuzzyOcr: line: preprocessor ppmtopgm {
ene 17 09:02:52.232 [31789] dbg: FuzzyOcr: line: command = ppmtopgm
ene 17 09:02:52.232 [31789] dbg: FuzzyOcr: line: }
ene 17 09:02:52.232 [31789] dbg: FuzzyOcr: line: preprocessor pamtopnm {
ene 17 09:02:52.232 [31789] dbg: FuzzyOcr: line: command = pamtopnm
ene 17 09:02:52.232 [31789] dbg: FuzzyOcr: line: }
ene 17 09:02:52.232 [31789] dbg: FuzzyOcr: line: preprocessor pamthreshold {
ene 17 09:02:52.232 [31789] dbg: FuzzyOcr: line: command = pamthreshold
ene 17 09:02:52.232 [31789] dbg: FuzzyOcr: line: args = -simple -threshold 0.5
ene 17 09:02:52.232 [31789] dbg: FuzzyOcr: line: }
ene 17 09:02:52.232 [31789] dbg: FuzzyOcr: line: preprocessor maketiff {
ene 17 09:02:52.232 [31789] dbg: FuzzyOcr: line: command = pnmtotiff
ene 17 09:02:52.232 [31789] dbg: FuzzyOcr: line: args = -color -truecolor
ene 17 09:02:52.232 [31789] dbg: FuzzyOcr: line: }
ene 17 09:02:52.232 [31789] info: FuzzyOcr: Starting scanset parser for file "/etc/mail/spamassassin/FuzzyOcr.scansets"...
ene 17 09:02:52.232 [31789] dbg: FuzzyOcr: line scanset ocrad {
ene 17 09:02:52.232 [31789] dbg: FuzzyOcr: line command = $ocrad
ene 17 09:02:52.232 [31789] dbg: FuzzyOcr: line args = -s5 $input
ene 17 09:02:52.232 [31789] dbg: FuzzyOcr: line }
ene 17 09:02:52.232 [31789] dbg: FuzzyOcr: line scanset ocrad-invert {
ene 17 09:02:52.232 [31789] dbg: FuzzyOcr: line command = $ocrad
ene 17 09:02:52.232 [31789] dbg: FuzzyOcr: line args = -s5 -i $input
ene 17 09:02:52.232 [31789] dbg: FuzzyOcr: line }
ene 17 09:02:52.232 [31789] dbg: FuzzyOcr: line scanset ocrad-decolorize-invert {
ene 17 09:02:52.232 [31789] dbg: FuzzyOcr: line preprocessors = ppmtopgm, pamthreshold, pamtopnm
ene 17 09:02:52.233 [31789] dbg: FuzzyOcr: line command = $ocrad
ene 17 09:02:52.233 [31789] dbg: FuzzyOcr: line args = -s5 -i $input
ene 17 09:02:52.233 [31789] dbg: FuzzyOcr: line }
ene 17 09:02:52.233 [31789] dbg: FuzzyOcr: line scanset ocrad-decolorize {
ene 17 09:02:52.233 [31789] dbg: FuzzyOcr: line preprocessors = ppmtopgm, pamthreshold, pamtopnm
ene 17 09:02:52.233 [31789] dbg: FuzzyOcr: line command = $ocrad
ene 17 09:02:52.233 [31789] dbg: FuzzyOcr: line args = -s5 $input
ene 17 09:02:52.233 [31789] dbg: FuzzyOcr: line }
ene 17 09:02:52.233 [31789] dbg: FuzzyOcr: line scanset gocr {
ene 17 09:02:52.233 [31789] dbg: FuzzyOcr: line command = $gocr
ene 17 09:02:52.233 [31789] dbg: FuzzyOcr: line args = -i $input
ene 17 09:02:52.233 [31789] dbg: FuzzyOcr: line }
ene 17 09:02:52.233 [31789] dbg: FuzzyOcr: line scanset gocr-180 {
ene 17 09:02:52.233 [31789] dbg: FuzzyOcr: line command = $gocr
ene 17 09:02:52.233 [31789] dbg: FuzzyOcr: line args = -l 180 -d 2 -i $input
ene 17 09:02:52.233 [31789] dbg: FuzzyOcr: line }
ene 17 09:02:52.567 [31789] info: FuzzyOcr: Searching in: /usr/local/netpbm/bin
ene 17 09:02:52.567 [31789] info: FuzzyOcr: Searching in: /usr/local/bin
ene 17 09:02:52.567 [31789] info: FuzzyOcr: Searching in: /usr/bin
ene 17 09:02:52.567 [31789] info: FuzzyOcr: Using gifsicle => /usr/bin/gifsicle
ene 17 09:02:52.567 [31789] info: FuzzyOcr: Using giffix => /usr/bin/giffix
ene 17 09:02:52.567 [31789] info: FuzzyOcr: Using giftext => /usr/bin/giftext
ene 17 09:02:52.567 [31789] info: FuzzyOcr: Using gifinter => /usr/bin/gifinter
ene 17 09:02:52.567 [31789] info: FuzzyOcr: Using giftopnm => /usr/bin/giftopnm
ene 17 09:02:52.567 [31789] info: FuzzyOcr: Using jpegtopnm => /usr/bin/jpegtopnm
ene 17 09:02:52.567 [31789] info: FuzzyOcr: Using pngtopnm => /usr/bin/pngtopnm
ene 17 09:02:52.567 [31789] info: FuzzyOcr: Using bmptopnm => /usr/bin/bmptopnm
ene 17 09:02:52.567 [31789] info: FuzzyOcr: Using tifftopnm => /usr/bin/tifftopnm
ene 17 09:02:52.567 [31789] info: FuzzyOcr: Using ppmhist => /usr/bin/ppmhist
ene 17 09:02:52.567 [31789] info: FuzzyOcr: Using pamfile => /usr/bin/pamfile
ene 17 09:02:52.567 [31789] info: FuzzyOcr: Using ocrad => /usr/bin/ocrad
ene 17 09:02:52.567 [31789] info: FuzzyOcr: Using gocr => /usr/bin/gocr
ene 17 09:02:52.567 [31789] info: FuzzyOcr: Using pnmnorm => /usr/bin/pnmnorm
ene 17 09:02:52.567 [31789] info: FuzzyOcr: Using pnminvert => /usr/bin/pnminvert
ene 17 09:02:52.567 [31789] info: FuzzyOcr: Using pamthreshold => /usr/bin/pamthreshold
ene 17 09:02:52.567 [31789] info: FuzzyOcr: Using ppmtopgm => /usr/bin/ppmtopgm
ene 17 09:02:52.567 [31789] info: FuzzyOcr: Using pamtopnm => /usr/bin/pamtopnm
ene 17 09:02:52.567 [31789] info: FuzzyOcr: Using tesseract => /usr/bin/tesseract
ene 17 09:02:52.567 [31789] info: FuzzyOcr: Using convert => /usr/bin/convert
ene 17 09:02:52.567 [31789] dbg: FuzzyOcr: Threshold[max_hash] => 5
ene 17 09:02:52.567 [31789] dbg: FuzzyOcr: Threshold[c] => 5
ene 17 09:02:52.567 [31789] dbg: FuzzyOcr: Threshold[s] => 0.01
ene 17 09:02:52.567 [31789] dbg: FuzzyOcr: Threshold[w] => 0.01
ene 17 09:02:52.567 [31789] dbg: FuzzyOcr: Threshold[cn] => 0.01
ene 17 09:02:52.567 [31789] dbg: FuzzyOcr: Threshold[h] => 0.01
ene 17 09:02:52.567 [31789] dbg: FuzzyOcr: focr_add_score => 1
ene 17 09:02:52.567 [31789] dbg: FuzzyOcr: focr_autodisable_negative_score => -5
ene 17 09:02:52.567 [31789] dbg: FuzzyOcr: focr_autodisable_score => 1000
ene 17 09:02:52.568 [31789] dbg: FuzzyOcr: focr_autosort_buffer => 10
ene 17 09:02:52.568 [31789] dbg: FuzzyOcr: focr_autosort_scanset => 1
ene 17 09:02:52.568 [31789] dbg: FuzzyOcr: focr_base_score => 5
ene 17 09:02:52.568 [31789] dbg: FuzzyOcr: focr_corrupt_score => 2.5
ene 17 09:02:52.568 [31789] dbg: FuzzyOcr: focr_corrupt_unfixable_score => 5
ene 17 09:02:52.568 [31789] dbg: FuzzyOcr: focr_counts_required => 2
ene 17 09:02:52.568 [31789] dbg: FuzzyOcr: focr_db_hash => /etc/mail/spamassassin/FuzzyOcr.db
ene 17 09:02:52.568 [31789] dbg: FuzzyOcr: focr_db_max_days => 35
ene 17 09:02:52.568 [31789] dbg: FuzzyOcr: focr_db_safe => /etc/mail/spamassassin/FuzzyOcr.safe.db
ene 17 09:02:52.568 [31789] dbg: FuzzyOcr: focr_digest_db => /etc/mail/spamassassin/FuzzyOcr.hashdb
ene 17 09:02:52.568 [31789] dbg: FuzzyOcr: focr_enable_image_hashing => 2
ene 17 09:02:52.568 [31789] dbg: FuzzyOcr: focr_global_timeout => 0
ene 17 09:02:52.568 [31789] dbg: FuzzyOcr: focr_global_wordlist => /etc/mail/spamassassin/FuzzyOcr.words
ene 17 09:02:52.568 [31789] dbg: FuzzyOcr: focr_hashing_learn_scanned => 1
ene 17 09:02:52.568 [31789] dbg: FuzzyOcr: focr_keep_bad_images => 0
ene 17 09:02:52.568 [31789] dbg: FuzzyOcr: focr_log_pmsinfo => 1
ene 17 09:02:52.568 [31789] dbg: FuzzyOcr: focr_log_stderr => 1
ene 17 09:02:52.568 [31789] dbg: FuzzyOcr: focr_logfile => /var/log/FuzzyOcr.log
ene 17 09:02:52.568 [31789] dbg: FuzzyOcr: focr_max_height => 800
ene 17 09:02:52.568 [31789] dbg: FuzzyOcr: focr_max_width => 800
ene 17 09:02:52.568 [31789] dbg: FuzzyOcr: focr_min_height => 4
ene 17 09:02:52.568 [31789] dbg: FuzzyOcr: focr_min_width => 4
ene 17 09:02:52.568 [31789] dbg: FuzzyOcr: focr_minimal_scanset => 1
ene 17 09:02:52.568 [31789] dbg: FuzzyOcr: focr_mysql_db => FuzzyOcr
ene 17 09:02:52.568 [31789] dbg: FuzzyOcr: focr_mysql_hash => Hash
ene 17 09:02:52.568 [31789] dbg: FuzzyOcr: focr_mysql_host => localhost
ene 17 09:02:52.568 [31789] dbg: FuzzyOcr: focr_mysql_port => 3306
ene 17 09:02:52.568 [31789] dbg: FuzzyOcr: focr_mysql_safe => Safe
ene 17 09:02:52.568 [31789] dbg: FuzzyOcr: focr_mysql_update_hash => 0
ene 17 09:02:52.568 [31789] dbg: FuzzyOcr: focr_mysql_user => fuzzyocr
ene 17 09:02:52.568 [31789] dbg: FuzzyOcr: focr_no_homedirs => 0
ene 17 09:02:52.568 [31789] dbg: FuzzyOcr: focr_path_bin => /usr/local/netpbm/bin:/usr/local/bin:/usr/bin
ene 17 09:02:52.568 [31789] dbg: FuzzyOcr: focr_pdf_maxpages => 1
ene 17 09:02:52.568 [31789] dbg: FuzzyOcr: focr_personal_wordlist => __userstate__/FuzzyOcr.words
ene 17 09:02:52.568 [31789] dbg: FuzzyOcr: focr_preprocessor_file => /etc/mail/spamassassin/FuzzyOcr.preps
ene 17 09:02:52.568 [31789] dbg: FuzzyOcr: focr_scan_pdfs => 0
ene 17 09:02:52.568 [31789] dbg: FuzzyOcr: focr_scanset_file => /etc/mail/spamassassin/FuzzyOcr.scansets
ene 17 09:02:52.568 [31789] dbg: FuzzyOcr: focr_score_ham => 0
ene 17 09:02:52.568 [31789] dbg: FuzzyOcr: focr_skip_bmp => 0
ene 17 09:02:52.568 [31789] dbg: FuzzyOcr: focr_skip_gif => 0
ene 17 09:02:52.568 [31789] dbg: FuzzyOcr: focr_skip_jpeg => 0
ene 17 09:02:52.568 [31789] dbg: FuzzyOcr: focr_skip_png => 0
ene 17 09:02:52.568 [31789] dbg: FuzzyOcr: focr_skip_tiff => 0
ene 17 09:02:52.568 [31789] dbg: FuzzyOcr: focr_skip_updates => 0
ene 17 09:02:52.568 [31789] dbg: FuzzyOcr: focr_strip_numbers => 1
ene 17 09:02:52.568 [31789] dbg: FuzzyOcr: focr_threshold => 0.25
ene 17 09:02:52.568 [31789] dbg: FuzzyOcr: focr_timeout => 10
ene 17 09:02:52.568 [31789] dbg: FuzzyOcr: focr_twopass_scoring_factor => 1.5
ene 17 09:02:52.568 [31789] dbg: FuzzyOcr: focr_unique_matches => 0
ene 17 09:02:52.568 [31789] dbg: FuzzyOcr: focr_verbose => 1
ene 17 09:02:52.568 [31789] dbg: FuzzyOcr: focr_wrongctype_score => 1.5
ene 17 09:02:52.568 [31789] dbg: FuzzyOcr: focr_wrongext_score => 1.5
ene 17 09:02:52.568 [31789] info: FuzzyOcr: Loaded preprocessor normalize: /usr/bin/pnmnorm
ene 17 09:02:52.568 [31789] info: FuzzyOcr: Loaded preprocessor invert: /usr/bin/pnminvert
ene 17 09:02:52.569 [31789] info: FuzzyOcr: Loaded preprocessor ppmtopgm: /usr/bin/ppmtopgm
ene 17 09:02:52.569 [31789] info: FuzzyOcr: Loaded preprocessor pamtopnm: /usr/bin/pamtopnm
ene 17 09:02:52.569 [31789] info: FuzzyOcr: Loaded preprocessor pamthreshold: /usr/bin/pamthreshold -simple -threshold 0.5
ene 17 09:02:52.569 [31789] info: FuzzyOcr: Loaded preprocessor maketiff: pnmtotiff -color -truecolor
ene 17 09:02:52.569 [31789] info: FuzzyOcr: Using scan ocrad: /usr/bin/ocrad -s5 $input
ene 17 09:02:52.569 [31789] info: FuzzyOcr: Using scan ocrad-invert: /usr/bin/ocrad -s5 -i $input
ene 17 09:02:52.569 [31789] info: FuzzyOcr: Using scan ocrad-decolorize-invert: /usr/bin/ocrad -s5 -i $input
ene 17 09:02:52.569 [31789] info: FuzzyOcr: Using scan ocrad-decolorize: /usr/bin/ocrad -s5 $input
ene 17 09:02:52.569 [31789] info: FuzzyOcr: Using scan gocr: /usr/bin/gocr -i $input
ene 17 09:02:52.569 [31789] info: FuzzyOcr: Using scan gocr-180: /usr/bin/gocr -l 180 -d 2 -i $input
ene 17 09:02:52.569 [31789] info: FuzzyOcr: Added <45> words from "/etc/mail/spamassassin/FuzzyOcr.words"

Does this plugin work properly on CentOS 7?

Any ideas?

Regards,

System information:

CentOS Linux release 7.6.1810 (Core)

x86_64

SpamAssassin version 3.4.3
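
In the log above, every preprocessor except maketiff is reported with an absolute path, and pnmtotiff does not appear in the "Using ... =>" list. Whether that matters for this message is not established here. A minimal sketch for spotting such entries, assuming the debug output has been saved to a file; the function name unresolved_preprocessors and the file name focr-debug.log are made up for illustration:

```shell
#!/bin/sh
# List preprocessors whose logged command does not start with '/',
# i.e. helpers that were reported without an absolute path.
# Prints "name: command" for each such entry, nothing otherwise.
unresolved_preprocessors() {
    sed -n 's|^.*FuzzyOcr: Loaded preprocessor \([^:]*\): \([^/].*\)$|\1: \2|p' "$1"
}

# Usage (focr-debug.log is a hypothetical file holding the debug output):
#   unresolved_preprocessors focr-debug.log
```

Run against the log lines shown above, this would be expected to report only the maketiff entry.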
Re: How do you fight image-spam?
On 17.01.20 13:46, Emanuel Gonzalez wrote:
>I'm trying to fight image spam: a message whose image is a Microsoft phishing attempt. I installed FuzzyOcr; I know this plugin is very old.
>
>The installation went fine, but I don't think the plugin is loading correctly: I trained on a spam message, and the logs show nothing about it being imported into the plugin database:
>
>spamassassin --debug FuzzyOcr < Vista\ Previa\ -\ Confirme\ inicio\ de\ sesion.eml > /dev/null
>ene 17 09:02:52.231 [31789] dbg: FuzzyOcr: focr_bin_helper: 'pnmnorm,pnminvert,pamthreshold,ppmtopgm,pamtopnm'
[...]
>ene 17 09:02:52.568 [31789] info: FuzzyOcr: Loaded preprocessor normalize: /usr/bin/pnmnorm
>ene 17 09:02:52.568 [31789] info: FuzzyOcr: Loaded preprocessor invert: /usr/bin/pnminvert
>ene 17 09:02:52.569 [31789] info: FuzzyOcr: Loaded preprocessor ppmtopgm: /usr/bin/ppmtopgm
>ene 17 09:02:52.569 [31789] info: FuzzyOcr: Loaded preprocessor pamtopnm: /usr/bin/pamtopnm
>ene 17 09:02:52.569 [31789] info: FuzzyOcr: Loaded preprocessor pamthreshold: /usr/bin/pamthreshold -simple -threshold 0.5
>ene 17 09:02:52.569 [31789] info: FuzzyOcr: Loaded preprocessor maketiff: pnmtotiff -color -truecolor
>ene 17 09:02:52.569 [31789] info: FuzzyOcr: Using scan ocrad: /usr/bin/ocrad -s5 $input
>ene 17 09:02:52.569 [31789] info: FuzzyOcr: Using scan ocrad-invert: /usr/bin/ocrad -s5 -i $input
>ene 17 09:02:52.569 [31789] info: FuzzyOcr: Using scan ocrad-decolorize-invert: /usr/bin/ocrad -s5 -i $input
>ene 17 09:02:52.569 [31789] info: FuzzyOcr: Using scan ocrad-decolorize: /usr/bin/ocrad -s5 $input
>ene 17 09:02:52.569 [31789] info: FuzzyOcr: Using scan gocr: /usr/bin/gocr -i $input
>ene 17 09:02:52.569 [31789] info: FuzzyOcr: Using scan gocr-180: /usr/bin/gocr -l 180 -d 2 -i $input
>ene 17 09:02:52.569 [31789] info: FuzzyOcr: Added <45> words from "/etc/mail/spamassassin/FuzzyOcr.words"

I would expect some more lines here. Did you interrupt the run?
Note that the FuzzyOcr plugin can run for a long time.
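
One way to check whether the output really stops at the wordlist line is to save the debug output and print whatever FuzzyOcr lines follow it. A minimal sketch, assuming the debug output (which spamassassin writes to stderr) is captured to a file; the function name lines_after_wordlist and the file name focr-debug.log are made up for illustration:

```shell
#!/bin/sh
# Print every FuzzyOcr line that appears after the
# 'Added <N> words from ...' line. Empty output means the
# saved log ends at the wordlist line.
lines_after_wordlist() {
    awk '
        seen && /FuzzyOcr:/ { print }
        /FuzzyOcr: Added <[0-9]+> words from/ { seen = 1 }
    ' "$1"
}

# Usage: the original command, with stderr saved to a hypothetical file,
# left to run until it exits on its own:
#   spamassassin --debug FuzzyOcr < message.eml > /dev/null 2> focr-debug.log
#   lines_after_wordlist focr-debug.log
```

If this prints nothing even after the command has been left to finish, the plugin produced no further debug lines for that message.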

>Does this plugin work properly on CentOS 7?

--
Matus UHLAR - fantomas, uhlar@fantomas.sk ; http://www.fantomas.sk/
Warning: I wish NOT to receive e-mail advertising to this address.
Varovanie: na tuto adresu chcem NEDOSTAVAT akukolvek reklamnu postu.
Save the whales. Collect the whole set.
RE: How do you fight image-spam?
I haven't touched the plugin at all. I downloaded the RPM file from http://repo.iotti.biz/CentOS/7/noarch/spamassassin-FuzzyOcr-3.6.0-12.el7.lux.1.noarch.rpm and followed the installation steps.

Any ideas?

Regards,
________________________________
From: Matus UHLAR - fantomas <uhlar@fantomas.sk>
Sent: Friday, 17 January 2020 10:55
To: users@spamassassin.apache.org <users@spamassassin.apache.org>
Subject: Re: How do you fight image-spam?
