Mailing List Archive

txrep - why it inserts some strange values to db?
Hi!
I use SpamAssassin 3.4.4 and I am trying TxRep. When I run sa-learn --spam
<msg>, it puts six tuples into the database (Postgres):
# select * from txrep ;
 username | email                                                 | ip        | msgcount | totscore | signedby   | last_hit
----------+-------------------------------------------------------+-----------+----------+----------+------------+----------------------------
 nobody   | yes@multifinansowanie.com.pl                          | 5.199.143 |        1 |       20 |            | 2020-12-10 17:34:25.830758
 nobody   | multifinansowanie.com.pl                              | 5.199.143 |        1 |       20 |            | 2020-12-10 17:34:25.83376
 nobody   | slot10.multifinansowanie.com.pl                       | none      |        1 |       20 | helo       | 2020-12-10 17:34:25.836672
 nobody   | yes@multifinansowanie.com.pl                          | none      |        1 |       20 |            | 2020-12-10 17:34:25.840392
 nobody   | 5.199.143.45                                          | none      |        1 |       20 |            | 2020-12-10 17:34:25.843831
 nobody   | f2c484bc9daccb07db6497f57fd18a7f0a1e29fa@sa_generated | none      |        2 |       40 | 1606432596 | 2020-12-10 17:34:25.850803
(6 rows)

This doesn't look correct; none of these tuples is 100% correct.
Original message's headers, with redacted destination address:
https://pastebin.com/JxfNm7LE

Regards,
Marcin
Re: txrep - why it inserts some strange values to db?
On Thu, 10 Dec 2020 17:40:42 +0100
Marcin Mirosław wrote:

> Hi!
> I use SpamAssassin 3.4.4 and I am trying TxRep. When I run sa-learn --spam
> <msg>, it puts six tuples into the database (Postgres):
> # select * from txrep ;
...
> This doesn't look correct; none of these tuples is 100% correct.

Hopefully this version isn't wrapped:

 username | email                                                 | ip        | msgcount | totscore | signedby   | last_hit
----------+-------------------------------------------------------+-----------+----------+----------+------------+----------------------------
 nobody   | yes@multifinansowanie.com.pl                          | 5.199.143 |        1 |       20 |            | 2020-12-10 17:34:25.830758
 nobody   | multifinansowanie.com.pl                              | 5.199.143 |        1 |       20 |            | 2020-12-10 17:34:25.83376
 nobody   | slot10.multifinansowanie.com.pl                       | none      |        1 |       20 | helo       | 2020-12-10 17:34:25.836672
 nobody   | yes@multifinansowanie.com.pl                          | none      |        1 |       20 |            | 2020-12-10 17:34:25.840392
 nobody   | 5.199.143.45                                          | none      |        1 |       20 |            | 2020-12-10 17:34:25.843831
 nobody   | f2c484bc9daccb07db6497f57fd18a7f0a1e29fa@sa_generated | none      |        2 |       40 | 1606432596 | 2020-12-10 17:34:25.850803


I don't use TxRep and I've not looked at a TxRep database of either version.
However my concerns would be:

1. msgcount=2 in the last line (assuming the database was previously empty)

2. The absence of either DKIM or SPF entries.
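
The first concern can be checked mechanically. The following is only a minimal Python sketch: the rows are copied by hand from the table above, and the helper name suspicious_rows and the idea that a single sa-learn run should leave msgcount at 1 are assumptions made for illustration, not documented TxRep behaviour.

```python
# Rows copied from the table above: (email, ip, msgcount, totscore, signedby).
rows = [
    ("yes@multifinansowanie.com.pl", "5.199.143", 1, 20, ""),
    ("multifinansowanie.com.pl", "5.199.143", 1, 20, ""),
    ("slot10.multifinansowanie.com.pl", "none", 1, 20, "helo"),
    ("yes@multifinansowanie.com.pl", "none", 1, 20, ""),
    ("5.199.143.45", "none", 1, 20, ""),
    ("f2c484bc9daccb07db6497f57fd18a7f0a1e29fa@sa_generated", "none", 2, 40, "1606432596"),
]


def suspicious_rows(rows, learn_runs=1):
    """Return rows whose msgcount is higher than the number of learn runs.

    Assumption: on a previously empty database, each learn run should
    count a message at most once per row.
    """
    return [row for row in rows if row[2] > learn_runs]


for email, ip, msgcount, totscore, signedby in suspicious_rows(rows):
    # totscore / msgcount is the mean score recorded for this row.
    print(f"{email}: msgcount={msgcount}, totscore={totscore}, "
          f"mean={totscore / msgcount}")
```

With the rows shown, only the @sa_generated entry is reported; its mean score is still 20, so the row looks like it was counted twice rather than scored differently.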

The truncated IP address "5.199.143" is correct if you have
"txrep_ipv4_mask_len 24". The helo and epoch time in the
signedby column look OK.
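
To illustrate why "5.199.143" is consistent with a 24-bit mask, here is a rough Python sketch. It assumes the stored key is the masked network address with trailing zero octets dropped; that is inferred from the value in the table, not taken from the TxRep source, and masked_ipv4_key is a hypothetical helper name.

```python
import ipaddress


def masked_ipv4_key(ip: str, mask_len: int = 24) -> str:
    """Mask an IPv4 address and drop trailing zero octets.

    Assumption: this mimics how the value in the "ip" column is derived;
    it is not a copy of the plugin's code.
    """
    # strict=False lets us pass a host address and get its network back.
    network = ipaddress.ip_network(f"{ip}/{mask_len}", strict=False)
    octets = str(network.network_address).split(".")
    while octets and octets[-1] == "0":
        octets.pop()
    return ".".join(octets)


print(masked_ipv4_key("5.199.143.45"))      # 5.199.143
print(masked_ipv4_key("5.199.143.45", 32))  # 5.199.143.45
```

Under the same assumption, a mask length of 16 would give "5.199".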

The use of the username nobody suggests you are running sa-learn as root.

Part of the reason I don't use TxRep is that I have no confidence
in its correctness.
Re: txrep - why it inserts some strange values to db?
On 2020-12-12 at 17:56, RW wrote:
> On Thu, 10 Dec 2020 17:40:42 +0100
> Marcin Mirosław wrote:
>
>> Hi!
>> I use SpamAssassin 3.4.4 and I am trying TxRep. When I run sa-learn --spam
>> <msg>, it puts six tuples into the database (Postgres):
>> # select * from txrep ;
> ...
>> This doesn't look correct; none of these tuples is 100% correct.
>
> I don't use TxRep and I've not looked at a TxRep database of either version.
> However my concerns would be:
>
> 1. msgcount=2 in the last line (assuming the database was previously empty)

The database was empty and I ran `sa-learn --spam <msg>` only once.

> 2. The absence of either DKIM or SPF entries.
>
> The truncated IP address "5.199.143" is correct if you have
> "txrep_ipv4_mask_len 24". The helo and epoch time in the
> signedby column look OK.

IMHO, the "signedby" column should contain the domain (d=) or the
selector (s=) from DKIM.
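
For reference, the numeric value in the signedby column of the @sa_generated row can be decoded as a Unix timestamp, which is how the quoted reply reads it. This is just a small Python sketch; decode_epoch is a hypothetical helper, and what the timestamp actually represents is not established in this thread.

```python
from datetime import datetime, timezone


def decode_epoch(value: str):
    """Return a UTC datetime if value looks like epoch seconds, else None."""
    if not value.isdigit():
        return None  # e.g. "helo" or an empty string
    return datetime.fromtimestamp(int(value), tz=timezone.utc)


print(decode_epoch("1606432596"))  # 2020-11-26 23:16:36+00:00
print(decode_epoch("helo"))        # None
```

The decoded time is roughly two weeks earlier than the last_hit values in the table.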


> The use of the username nobody suggests you are running sa-learn as root.

The username is forced in the configuration to match the one the MTA sends.

> Part of the reason I don't use TxRep is that I have no confidence
> in its correctness.

In my opinion this looks completely useless. How does it look in other databases?
Marcin
Re: txrep - why it inserts some strange values to db?
TxRep could have some bugs but I've not found it to cause a FP in years
so it's generally safe for production use. Do you have an email that might
replicate the issue?
--
Kevin A. McGrail
Member, Apache Software Foundation
Chair Emeritus Apache SpamAssassin Project
https://www.linkedin.com/in/kmcgrail - 703.798.0171

