Mailing List Archive

Counting number of instances of a particular header
I'm trying to create a rule to count the number of instances of a particular
header.
I.e., an email message could contain zero or more instances of a particular
header, and I want to know how many there are so I can use that count in a meta
rule to detect a spam sign.

I first crafted a rule:
header L_MY_HEADER X-My-Header !~ /^UNSET$/ [if-unset: UNSET]
describe L_MY_HEADER has X-My-Header
score L_MY_HEADER 0.1

Which did correctly detect the existence of 'X-My-Header'. Then to count the
number of them I added a 'tflags':
tflags L_MY_HEADER multiple maxhits=10

But that would always fire 10 times if there were any instances of 'X-My-Header'
(even if there was only one).

So I modified the pattern match part of the rule:
header L_MY_HEADER X-My-Header =~ /./

Which had the same effect as the first form (i.e., either zero or 10 firings).

As the header would have at least 6 characters but less than 150 I then tried:
header L_MY_HEADER X-My-Header =~ /^.{5,200}/

Which would fire only once, even if there were 5 or more instances of the
header.

What am I doing wrong? How should I craft a rule to count the number of
instances of that header?

Thanks,
Dave

--
Dave Funk University of Iowa
<dbfunk (at) engineering.uiowa.edu> College of Engineering
319/335-5751 FAX: 319/384-0549 1256 Seamans Center, 103 S Capitol St.
Sys_admin/Postmaster/cell_admin Iowa City, IA 52242-1527
#include <std_disclaimer.h>
Better is not better, 'standard' is better. B{
Re: Counting number of instances of a particular header
https://cwiki.apache.org/confluence/display/SPAMASSASSIN/WritingRulesAdvanced

You need the /m modifier. The matched string is all of the header values joined
by newlines, so you want to match at the start of every line.

header L_MY_HEADER X-My-Header =~ /^/m
tflags L_MY_HEADER multiple
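
As a rough illustration of what that rule ends up counting, here is a minimal
Python sketch. It assumes, per the description above, that the repeated header
values are joined with newlines into one string (with no trailing newline).
Python's re.MULTILINE stands in for Perl's /m, and count_header_instances is a
made-up helper name, not anything from SpamAssassin. Edge cases such as
trailing newlines may behave differently between the two regex engines.

```python
import re


def count_header_instances(values):
    """Count line starts in the joined header values.

    Assumption: values are joined with a single newline and no trailing
    newline, which is how the thread describes the matched string.
    """
    if not values:
        return 0  # header absent: there is nothing to match against
    joined = "\n".join(values)
    # re.MULTILINE plays the role of the /m modifier: ^ matches at the
    # start of the string and right after every newline.
    return len(re.findall(r"^", joined, flags=re.MULTILINE))


print(count_header_instances(["one"]))                  # 1
print(count_header_instances(["one", "two", "three"]))  # 3
```

Each zero-width match of ^ corresponds to one line, so the number of matches
is the number of header instances under that assumption.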
Re: Counting number of instances of a particular header
On Mon, 3 May 2021 10:18:51 -0500 (CDT)
Dave Funk wrote:

> I'm trying to create a rule to count the number of instances of a
> particular header.
...
> What am I doing wrong? How should I craft a rule to count the number
> of instances of that header?

It's important to understand that when headers are repeated, the match
runs against a single string with multiple lines, not multiple strings.

So

header L_MY_HEADER X-My-Header =~ /^.{5,200}/

can only match once because, by default, ^ matches the beginning of
the string.
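
A small Python sketch of that behaviour, for anyone who wants to try it
outside SpamAssassin. The three header values are invented for the example,
the newline join is an assumption based on the description above, and
Python's re module is only standing in for Perl's regex engine here.

```python
import re

# Hypothetical values of three X-My-Header instances, joined by newlines
# (an assumption about how repeated headers are presented to the rule).
joined = "\n".join(["first value", "second value", "third value"])

pattern = r"^.{5,200}"

# Default mode: ^ anchors only at the very start of the whole string,
# and . does not cross a newline, so at most one match is possible.
default_hits = re.findall(pattern, joined)

# MULTILINE (Perl's /m): ^ also anchors right after each newline.
multiline_hits = re.findall(pattern, joined, flags=re.MULTILINE)

print(len(default_hits))    # 1
print(len(multiline_hits))  # 3
```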

For header tests involving multiple lines, the /m and /s modifiers
can be useful, but are often not essential as you can test for newline
characters instead.

You only need tflags multiple if you need a numerical value for meta rules.
If you just want to test for N or more headers, a single header rule is
enough.

e.g. to test for two or more headers (if empty headers aren't a
concern) you can simply use:

header L_MULTIPLE_MY_HEADER X-My-Header =~ /\n./
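
To show what that pattern does and does not catch, here is a hedged Python
sketch. It again assumes the header values are joined by newlines, and
has_two_or_more is a hypothetical helper name used only for illustration.

```python
import re


def has_two_or_more(values):
    """True if a newline is followed by at least one more character.

    Assumption: the values are joined with single newlines, so a newline
    followed by a character means a second, non-empty value exists.
    """
    joined = "\n".join(values)
    return re.search(r"\n.", joined) is not None


print(has_two_or_more(["only one"]))         # False
print(has_two_or_more(["first", "second"]))  # True
print(has_two_or_more(["first", ""]))        # False: second value is empty
```

The last case is the empty-header caveat mentioned above: an empty second
value leaves nothing after the newline for the dot to match.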
Re: Counting number of instances of a particular header
On 3 May 2021, at 11:18, Dave Funk wrote:

> I'm trying to create a rule to count the number of instances of a
> particular header.
> IE in email messages there could be zero or more instances of a
> particular header and I want to know how many there are so I can use
> that info in a meta to detect a spam sign.
>
> I first crafted a rule:
> header L_MY_HEADER X-My-Header !~ /^UNSET$/ [if-unset: UNSET]

????
That's a deeply weird rule.

Try just this:

header L_MY_HEADER X-My-Header =~ /^./m
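
For comparison with the plain /./ attempt, here is a toy Python model of the
hit counting. It is only a sketch: it assumes 'tflags multiple' counts
successive non-overlapping matches up to maxhits, and capped_hits is a
made-up helper, not SpamAssassin's actual code.

```python
import re

MAXHITS = 10  # mirrors 'maxhits=10' from the tflags line in the question


def capped_hits(pattern, text, flags=0):
    """Count non-overlapping matches, capped at MAXHITS (toy model)."""
    return min(len(re.findall(pattern, text, flags=flags)), MAXHITS)


single = "a single header value"  # hypothetical value of one header

# /./ matches once per character, so one long-enough header hits the cap.
print(capped_hits(r".", single))                 # 10

# /^./m matches once per non-empty line, so one header gives one hit.
print(capped_hits(r"^.", single, re.MULTILINE))  # 1
```

Under that model, /./ hits the cap on any header with ten or more characters,
which would be consistent with the "either zero or 10 firings" result
reported in the question.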


> describe L_MY_HEADER has X-My-Header
> score L_MY_HEADER 0.1
>
> Which did correctly detect the existence of 'X-My-Header'. Then to
> count the number of them I added a 'tflags':
> tflags L_MY_HEADER multiple maxhits=10
>
> But that would always fire 10 times if there were any instances of
> 'X-My-Header' (even if there was only one).

I guess that's an artifact of combining the 'if-unset' functionality
with 'tflags multiple' or possibly the negative match test or both. I'm
not sure that it is exactly a bug, because I can't say how SA "should"
deal with that combination of syntax.



>
> So I modified the pattern match part of the rule:
> header L_MY_HEADER X-My-Header =~ /./
>
> Which had the same effect as the first form (IE either zero or 10
> firings).
>
> As the header would have at least 6 characters but less than 150 I
> then tried:
> header L_MY_HEADER X-My-Header =~ /^.{5,200}/
>
> Which would fire only once, even if there were 5 or more instances of
> the header.
>
> What am I doing wrong? How should I craft a rule to count the number
> of instances of that header?
>
> Thanks,
> Dave


--
Bill Cole
bill@scconsult.com or billcole@apache.org
(AKA @grumpybozo and many *@billmail.scconsult.com addresses)
Not Currently Available For Hire
Re: Counting number of instances of a particular header
On Mon, 03 May 2021 13:17:59 -0400
Bill Cole wrote:

> On 3 May 2021, at 11:18, Dave Funk wrote:

> >
> > I first crafted a rule:
> > header L_MY_HEADER X-My-Header !~ /^UNSET$/ [if-unset: UNSET]
>
>
> > But that would always fire 10 times if there were any instances of
> > 'X-My-Header' (even if there was only one).
>
> I guess that's an artifact of combining the 'if-unset' functionality
> with 'tflags multiple' or possibly the negative match test or both.


Probably the combination of !~ with tflags multiple.


> I'm not sure that it is exactly a bug, because I can't say how SA
> "should" deal with that combination of syntax.

It could be seen as a bug since without 'maxhits' it would
presumably have looped until timeout.