Mailing List Archive

[no subject]
Hello,

is it possible to match message headers in rfc822 attachments?

From what I know, "header" rules only apply to the mail's own headers and
"mimeheader" rules only apply to MIME part headers.

body and rawbody afaik only search in bodies of messages or included
messages.

I asked about this some time ago, without success:

https://marc.info/?l=spamassassin-users&m=132282473328809&w=2

Is this possible now, or do we need a solution outside of SA for this?

--
Matus UHLAR - fantomas, uhlar@fantomas.sk ; http://www.fantomas.sk/
Warning: I wish NOT to receive e-mail advertising to this address.
Varovanie: na tuto adresu chcem NEDOSTAVAT akukolvek reklamnu postu.
Fighting for peace is like fucking for virginity...
Re: your mail
On Tue, Apr 26, 2022 at 02:35:25PM +0200, Matus UHLAR - fantomas wrote:
> Hello,
>
> is it possible to match message headers in rfc822 atttachments?
>
> from what I know, "header" rules only apply to mail headers and mimeheader
> only apply to mime headers.
>
> body and rawbody afaik only search in bodies of messages or included
> messages.
>
> I have asked some time ago but no success:
>
> https://marc.info/?l=spamassassin-users&m=132282473328809&w=2
>
> is this possible now or do we need out-of SA solution for this?

full FOO /\nContent-Type: message\/rfc822.*?\nReceived:(?:[^\n]+\n\s+){0,3}[^\n]*\b1.2.3.4\b/s
Re: your mail
On Tue, Apr 26, 2022 at 04:04:13PM +0300, Henrik K wrote:
> On Tue, Apr 26, 2022 at 02:35:25PM +0200, Matus UHLAR - fantomas wrote:
> > Hello,
> >
> > is it possible to match message headers in rfc822 atttachments?
> >
> > from what I know, "header" rules only apply to mail headers and mimeheader
> > only apply to mime headers.
> >
> > body and rawbody afaik only search in bodies of messages or included
> > messages.
> >
> > I have asked some time ago but no success:
> >
> > https://marc.info/?l=spamassassin-users&m=132282473328809&w=2
> >
> > is this possible now or do we need out-of SA solution for this?
>
> full FOO /\nContent-Type: message\/rfc822.*?\nReceived:(?:[^\n]+\n\s+){0,3}[^\n]*\b1.2.3.4\b/s

Maybe a slightly safer version that doesn't log huge strings or run wild:

full FOO /^(?=.*?\nContent-Type: message\/rfc822.{0,1024}?\nReceived:(?:[^\n]{1,100}\n\s{1,100}){0,3}[^\n]{0,100}\b1\.2\.3\.4\b)/s
Re: your mail
>> On Tue, Apr 26, 2022 at 02:35:25PM +0200, Matus UHLAR - fantomas wrote:
>> > is it possible to match message headers in rfc822 atttachments?
>> >
>> > from what I know, "header" rules only apply to mail headers and mimeheader
>> > only apply to mime headers.
>> >
>> > body and rawbody afaik only search in bodies of messages or included
>> > messages.
>> >
>> > I have asked some time ago but no success:
>> >
>> > https://marc.info/?l=spamassassin-users&m=132282473328809&w=2
>> >
>> > is this possible now or do we need out-of SA solution for this?

>On Tue, Apr 26, 2022 at 04:04:13PM +0300, Henrik K wrote:
>> full FOO /\nContent-Type: message\/rfc822.*?\nReceived:(?:[^\n]+\n\s+){0,3}[^\n]*\b1.2.3.4\b/s

On 26.04.22 16:11, Henrik K wrote:
>Maybe a bit safer version that doesn't log huge strings and run wild
>
>full FOO /^(?=.*?\nContent-Type: message\/rfc822.{0,1024}?\nReceived:(?:[^\n]{1,100}\n\s{1,100}){0,3}[^\n]{0,100}\b1\.2\.3\.4\b)/s

Doesn't this require the MIME headers to be in a specific order, which may
not always be the case?

--
Matus UHLAR - fantomas, uhlar@fantomas.sk ; http://www.fantomas.sk/
Warning: I wish NOT to receive e-mail advertising to this address.
Varovanie: na tuto adresu chcem NEDOSTAVAT akukolvek reklamnu postu.
Honk if you love peace and quiet.
Re: your mail
On Tue, Apr 26, 2022 at 03:59:36PM +0200, Matus UHLAR - fantomas wrote:
> On 26.04.22 16:11, Henrik K wrote:
> > Maybe a bit safer version that doesn't log huge strings and run wild
> >
> > full FOO /^(?=.*?\nContent-Type: message\/rfc822.{0,1024}?\nReceived:(?:[^\n]{1,100}\n\s{1,100}){0,3}[^\n]{0,100}\b1\.2\.3\.4\b)/s
>
> Doesn't this requires mime headers in specific order that may not be
> fullfilled?

Well, if you want to match rfc822 contents, its Received: headers can only
come after an rfc822 declaration.

Other than that, it's up to you to figure out, since there are no samples. Of
course this doesn't replace a full parser, but as long as the stuff you
receive doesn't vary much, there's no reason for it not to work.
Re: your mail
On Tue, Apr 26, 2022 at 05:11:47PM +0300, Henrik K wrote:
> On Tue, Apr 26, 2022 at 03:59:36PM +0200, Matus UHLAR - fantomas wrote:
> > On 26.04.22 16:11, Henrik K wrote:
> > > Maybe a bit safer version that doesn't log huge strings and run wild
> > >
> > > full FOO /^(?=.*?\nContent-Type: message\/rfc822.{0,1024}?\nReceived:(?:[^\n]{1,100}\n\s{1,100}){0,3}[^\n]{0,100}\b1\.2\.3\.4\b)/s
> >
> > Doesn't this requires mime headers in specific order that may not be
> > fullfilled?
>
> Well if you want to match rfc822 contents, it's Received: headers can only
> be after a rfc822 declaration.
>
> Other than that it's up to you to figure out, since there's no samples. Of
> course this doesn't replace a full parser, but as long as the stuff you
> receive doesn't vary much, there's no reason for it not to work.

... as long as the whole rfc822 content isn't base64 encoded. That's probably not common.
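
To illustrate the base64 caveat: a "full" rule is matched against the raw, undecoded message, so header text inside a base64-encoded part never appears literally. The following is only a minimal Python sketch with a made-up attached message (the hostnames, the IP and the mime_part helper are assumptions for illustration, not anything from SA itself):

```python
import base64

# Hypothetical attached message; hostnames and the IP are placeholders.
inner = (
    b"Received: from mail.example.net ([1.2.3.4]) by mx.example.org\n"
    b"Subject: hello\n"
    b"\n"
    b"body\n"
)


def mime_part(payload: bytes, use_base64: bool) -> bytes:
    """Wrap payload as a message/rfc822 MIME part, optionally base64-encoded."""
    if use_base64:
        headers = (
            b"Content-Type: message/rfc822\n"
            b"Content-Transfer-Encoding: base64\n\n"
        )
        return headers + base64.encodebytes(payload)
    headers = (
        b"Content-Type: message/rfc822\n"
        b"Content-Transfer-Encoding: 7bit\n\n"
    )
    return headers + payload


plain = mime_part(inner, use_base64=False)
encoded = mime_part(inner, use_base64=True)

# The literal header name is only visible in the unencoded variant.
print(b"Received:" in plain)    # True
print(b"Received:" in encoded)  # False
```

The base64 alphabet contains neither ":" nor ".", so neither the header name nor a dotted IP can show up in the encoded part by accident.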
Re: your mail
Matus UHLAR - fantomas wrote:
>>> On Tue, Apr 26, 2022 at 02:35:25PM +0200, Matus UHLAR - fantomas wrote:
>>> > is it possible to match message headers in rfc822 atttachments?
>>> >
>>> > from what I know, "header" rules only apply to mail headers and mimeheader
>>> > only apply to mime headers.
>>> >
>>> > body and rawbody afaik only search in bodies of messages or included
>>> > messages.

> On 26.04.22 16:11, Henrik K wrote:
>> Maybe a bit safer version that doesn't log huge strings and run wild
>>
>> full FOO /^(?=.*?\nContent-Type: message\/rfc822.{0,1024}?\nReceived:(?:[^\n]{1,100}\n\s{1,100}){0,3}[^\n]{0,100}\b1\.2\.3\.4\b)/s
>>
>
> Doesn't this requires mime headers in specific order that may not be
> fullfilled?

If your attached message has headers that are mixed in with the MIME
headers then it's badly (arguably maliciously) structured and probably
not sanely parseable.

Pulling a quick sample from the spam reporting account here:

====

------=_NextPart_000_0011_01D858C3.7B7DAB10
Content-Type: message/rfc822
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment

Received: from vmsa103.odn.ne.jp by cmsa103.odn.ne.jp with ESMTP id
<20220425222911188.OMLG.51715.cmsa103.odn.ne.jp@msa103.odn.ne.jp> for
[...and the rest of the attached message...]

====

However, Henrik's {0,1024} safety barrier is unfortunately likely to
skip intended matches because of huge/multiple DKIM signatures and other
odds and ends that certain mail clients or platforms take delight in
stuffing into email headers. I've lost count of the ones I've seen with
~40-50k+ characters just in the message headers, never mind all the
Stupid found in the message body. (I think the record has to be
something like 200k+ for a one-line message with no embedded images.
Yay progress?)
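
A rough way to see the effect of the {0,1024} limit is to run the same pattern through Python's re module against a synthetic message. This is only a sketch: the sample message, the X-Big-Signature header and the build_message helper are invented for illustration, and Python's regex engine is standing in for Perl's here (the two agree on the constructs this pattern uses).

```python
import re

# Henrik's bounded pattern from the thread. The /s modifier becomes
# re.DOTALL, and "/" needs no escaping outside of /.../ delimiters.
PATTERN = re.compile(
    r"^(?=.*?\nContent-Type: message/rfc822.{0,1024}?\nReceived:"
    r"(?:[^\n]{1,100}\n\s{1,100}){0,3}[^\n]{0,100}\b1\.2\.3\.4\b)",
    re.DOTALL,
)


def build_message(folded_lines: int) -> str:
    """Build a made-up raw message with one message/rfc822 attachment.

    folded_lines sets how many 70-character continuation lines a
    hypothetical X-Big-Signature header has; that header sits above the
    Received: header of the attached message.
    """
    big_header = ""
    if folded_lines > 0:
        big_header = "X-Big-Signature: v=1;\n" + "".join(
            "\t" + "a" * 70 + "\n" for _ in range(folded_lines)
        )
    return (
        "From: reporter@example.com\n"
        "Subject: spam report\n"
        "MIME-Version: 1.0\n"
        'Content-Type: multipart/mixed; boundary="b1"\n'
        "\n"
        "--b1\n"
        "Content-Type: message/rfc822\n"
        "Content-Disposition: attachment\n"
        "\n"
        + big_header
        + "Received: from mail.example.net (mail.example.net [1.2.3.4])\n"
        "\tby mx.example.org with ESMTP; Tue, 26 Apr 2022 12:00:00 +0000\n"
        "From: sender@example.net\n"
        "Subject: hello\n"
        "\n"
        "body\n"
        "--b1--\n"
    )


def matches(raw: str) -> bool:
    """Return True if the bounded pattern hits anywhere in raw."""
    return PATTERN.search(raw) is not None


print(matches(build_message(0)))   # True: Received: is close to the declaration
print(matches(build_message(20)))  # False: about 1.4k of header text in between
```

Raising the bound makes the second case match again, at the cost of letting the regex scan further per message, which is the trade-off the limit was meant to control.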

Can you expand some more on your use case? You may be better off
splitting the attached message off outside of SA (which is relatively
simple[0]) and processing it directly. If there are attributes from the
parent message needed when processing the child, your splitter could add
them as pseudoheaders on the child message passed to SA. Looking back
at your previous post this seems likely to be easier than trying to
wedge things fully inside SA.

-kgd
[0] I'm slightly terrified by how many abuse departments at companies
that should really know better, and be able to afford more and better
talent than me at this kind of mail-mangling, do not seem to know what
to do with an RFC822 attachment. It took me less than a week to
implement a fairly solid on-delivery splitter like this for FN and FP
reporting, and I've since extended it to handle several mangled
variations to the tune of maybe 5 hours or so each.
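
For readers wondering what such a splitter might look like, here is a minimal sketch using Python's standard email package. It is not the implementation described above: the function name, the X-Parent-* pseudoheader names and the choice of copied headers are all assumptions, and it only handles cleanly structured, unencoded message/rfc822 parts, none of the mangled variations.

```python
from email import message_from_bytes
from email.generator import BytesGenerator
from io import BytesIO


def split_attached_messages(raw: bytes, copy_headers=("From", "Subject")) -> list:
    """Return each attached message/rfc822 as raw bytes.

    Selected parent headers are copied onto each child as hypothetical
    X-Parent-* pseudoheaders so a later scan can still see them.
    """
    # The default (compat32) policy keeps the original header text as-is.
    parent = message_from_bytes(raw)
    children = []
    for part in parent.walk():
        if part.get_content_type() != "message/rfc822":
            continue
        payload = part.get_payload()
        # A parsed message/rfc822 part holds a one-element list containing
        # the attached message; skip anything that did not parse that way.
        if not isinstance(payload, list) or not payload:
            continue
        child = payload[0]
        for name in copy_headers:
            value = parent.get(name)
            if value is not None:
                # Collapse folding whitespace so the value stays on one line.
                child["X-Parent-" + name] = " ".join(str(value).split())
        buf = BytesIO()
        # mangle_from_=False avoids rewriting body lines that start with "From ".
        BytesGenerator(buf, mangle_from_=False).flatten(child)
        children.append(buf.getvalue())
    return children


SAMPLE = (
    b"From: reporter@example.com\n"
    b"Subject: spam report\n"
    b"MIME-Version: 1.0\n"
    b'Content-Type: multipart/mixed; boundary="b1"\n'
    b"\n"
    b"--b1\n"
    b"Content-Type: text/plain\n"
    b"\n"
    b"see attached\n"
    b"--b1\n"
    b"Content-Type: message/rfc822\n"
    b"Content-Disposition: attachment\n"
    b"\n"
    b"Received: from mail.example.net ([1.2.3.4]) by mx.example.org\n"
    b"From: sender@example.net\n"
    b"Subject: hello\n"
    b"\n"
    b"body\n"
    b"--b1--\n"
)

for child_bytes in split_attached_messages(SAMPLE):
    print(child_bytes.decode("ascii", errors="replace"))
```

Each returned child could then be handed to SA as an ordinary message, so plain "header" rules apply to what used to be the attachment's headers.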