Mailing List Archive

Problem with filter on Base64-encoded subject field
Hi,
I have to filter some tags out of the Subject field of outgoing email. I use the following filter:
 
    if $h_subject: matches \N\[\NTAG1|TAG2|TAG3\N\]\N
    then
        headers add "tmpSubject: ${sg{$bh_subject:}{\\\\[(TAG1|TAG2|TAG3)\\\\]\\\\ }{}}"
        headers remove "Subject"
        headers add "Subject: $h_tmpSubject"
        headers remove "tmpSubject"
    endif
 
and it works fine as long as users send email from mail clients and the subject is plain ISO-8859-1 text. But some users send from Google Mail's web interface, and then the outgoing subject is UTF-8 encoded with Base64. Based on the «String expansion» section of the documentation, I tried adding a second stage to the filter with the same content, but prefixed with
 
    headers charset "UTF-8"
 
- no luck.
Then I tried to narrow the problem down. Skipping some intermediate steps, I ended up with the following in the filter:
 
    headers charset "UTF-8" 
    headers add "tmpB64decodedSubject: $bh_subject" 
    headers add "hSubject: $h_subject"
 
As I understand «String expansion», both of these should produce additional headers decoded from Base64. However, I am getting three identical headers:
 
    Subject: =?UTF-8?B?UmU6IFtFWFRFUk5BTF0gUmU6IFJlOiBSZTog0JzQsNGA0YjRgNGD0YLQuNC30LDRhtC4?= я поÑ?Ñ?Ñ?
    tmpB64decodedSubject: =?UTF-8?B?UmU6IFtFWFRFUk5BTF0gUmU6IFJlOiBSZTog0JzQsNGA0YjRgNGD0YLQuNC30LDRhtC4?= я поÑ?Ñ?Ñ?
    hSubject: =?UTF-8?B?UmU6IFtFWFRFUk5BTF0gUmU6IFJlOiBSZTog0JzQsNGA0YjRgNGD0YLQuNC30LDRhtC4?= я поÑ?Ñ?Ñ?   
 
Again, as I understand «String expansion», this means that Exim cannot decode the original header for some reason. I tried decoding «UmU6IFtFWFRFUk5BTF0gUmU6IFJlOiBSZTog0JzQsNGA0YjRgNGD0YLQuNC30LDRhtC4» manually on the same server with
 
    echo   UmU6IFtFWFRFUk5BTF0gUmU6IFJlOiBSZTog0JzQsNGA0YjRgNGD0YLQuNC30LDRhtC4 | base64 -d
 
and it works fine:
 
    Re: [EXTERNAL] Re: Re: Re: ????????????
 
(The [EXTERNAL] tag is one of the actual tags in use.)
What am I doing wrong in the filter? Or how can I diagnose further why Exim cannot decode the headers? Does a multi-line Subject header influence the result?
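
The manual check above can also be reproduced with a short script. This is a minimal sketch in Python, not part of the original filter; the helper name `decode_b_encoded_word` is made up for illustration. It decodes a single RFC 2047 Base64 encoded-word using only the standard library.

```python
import base64

# The encoded-word exactly as it appears in the Subject header quoted above.
ENCODED_WORD = (
    "=?UTF-8?B?UmU6IFtFWFRFUk5BTF0gUmU6IFJlOiBSZTog"
    "0JzQsNGA0YjRgNGD0YLQuNC30LDRhtC4?="
)


def decode_b_encoded_word(word: str) -> str:
    """Decode one RFC 2047 encoded-word of the form =?charset?B?payload?=."""
    if not (word.startswith("=?") and word.endswith("?=")):
        raise ValueError("not an encoded-word")
    # Strip the =? and ?= delimiters, then split into charset, encoding, payload.
    charset, encoding, payload = word[2:-2].split("?", 2)
    if encoding.upper() != "B":
        raise ValueError("only Base64 ('B') encoded-words are handled here")
    return base64.b64decode(payload).decode(charset)


decoded = decode_b_encoded_word(ENCODED_WORD)
print(decoded)
```

The payload decodes to valid UTF-8: the Latin-script prefix with the [EXTERNAL] tag, followed by Cyrillic text. That suggests the encoded-word itself is well-formed, and that the `?` characters in the terminal output are more likely a display issue than a decoding failure.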
 
Best regards,
Kirill Sluchanko
--
## List details at https://lists.exim.org/mailman/listinfo/exim-users
## Exim details at http://www.exim.org/
## Please use the Wiki with this list - http://wiki.exim.org/
Re: Problem with filter on Base64-encoded subject field
On 2020-12-15 Kirill Sluchanko via Exim-users <exim-users@exim.org> wrote:
[...]  
>     Subject: =?UTF-8?B?UmU6IFtFWFRFUk5BTF0gUmU6IFJlOiBSZTog0JzQsNGA0YjRgNGD0YLQuNC30LDRhtC4?= я поÑ?Ñ?Ñ?
[...]  
>     echo   UmU6IFtFWFRFUk5BTF0gUmU6IFJlOiBSZTog0JzQsNGA0YjRgNGD0YLQuNC30LDRhtC4 | base64 -d
>  
> and it works fine:
>  
>     Re: [EXTERNAL] Re: Re: Re: ????????????

Your decoding test was only applied to the =?UTF-8?...?= part, but the
rest of the header contains non-ASCII characters which are not even
valid UTF-8. It could be that Exim's match function never matches on
invalid strings.
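
One way to test this hypothesis is to inspect the raw header bytes that follow the encoded-word. The sketch below is Python with hypothetical sample bytes, since the real bytes from the wire are not shown in this thread. It reports whether a byte string contains raw 8-bit data and whether it is valid UTF-8.

```python
def has_raw_8bit(raw: bytes) -> bool:
    """Return True if any byte is outside the 7-bit ASCII range."""
    return any(byte > 0x7F for byte in raw)


def is_valid_utf8(raw: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


# Hypothetical samples, NOT the actual bytes of the header in question.
complete_tail = "я по".encode("utf-8")
# A lead byte of a two-byte sequence with its continuation byte missing.
truncated_tail = complete_tail + b"\xd1"

for label, sample in (("complete", complete_tail), ("truncated", truncated_tail)):
    print(label, has_raw_8bit(sample), is_valid_utf8(sample))
```

Running the same two checks on the header exactly as received would show whether the trailing part is merely unencoded 8-bit text or actually broken UTF-8.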

cu Andreas

Re: Problem with filter on Base64-encoded subject field
On 15/12/2020 18:13, Andreas Metzler via Exim-users wrote:
> On 2020-12-15 Kirill Sluchanko via Exim-users <exim-users@exim.org> wrote:
> [...]
>>     Subject: =?UTF-8?B?UmU6IFtFWFRFUk5BTF0gUmU6IFJlOiBSZTog0JzQsNGA0YjRgNGD0YLQuNC30LDRhtC4?= я поÑ?Ñ?Ñ?
> [...]
>>     echo   UmU6IFtFWFRFUk5BTF0gUmU6IFJlOiBSZTog0JzQsNGA0YjRgNGD0YLQuNC30LDRhtC4 | base64 -d
>>
>> and it works fine:
>>
>>     Re: [EXTERNAL] Re: Re: Re: ????????????
>
> Your decoding test was only applied to the =?UTF-8?...?= part but the
> rest of the header contains non-ASCII characters which are not even
> valid UTF8. It could be that exims match function never matches on invalid
> strings.

I'm paranoid enough to want to start checking exactly what was received
on the wire for that header, in case there's a buffer malfunction in exim's
code...
--
Cheers,
Jeremy

Re: Problem with filter on Base64-encoded subject field
Tuesday, 15 December 2020, 21:21 +03:00 from Andreas Metzler via Exim-users <exim-users@exim.org>:
 
> Your decoding test was only applied to the =?UTF-8?...?= part but the
> rest of the header contains non-ASCII characters which are not even
> valid UTF8. It could be that exims match function never matches on invalid
> strings.
 
Yes, I see. But my earlier searches made me think that, without special instructions, the filter matches only the part of the header before the first newline, which comes right after "?=". Is that wrong for Base64 decoding and true only for regex matching, so that the part after the newline breaks the Base64 decoding? If so, is there a way to work around this?
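
For reference, RFC 2047 says that linear whitespace between two adjacent encoded-words, including a folded line break, is ignored when the header is displayed. The sketch below uses Python's standard `email.header.decode_header` to illustrate this on a hypothetical two-line Subject. The second encoded-word is invented for the example, and the sketch says nothing about how Exim itself treats folded headers.

```python
import base64
import re
from email.header import decode_header

# First encoded-word: taken from the Subject quoted in this thread.
FIRST_WORD = (
    "=?UTF-8?B?UmU6IFtFWFRFUk5BTF0gUmU6IFJlOiBSZTog"
    "0JzQsNGA0YjRgNGD0YLQuNC30LDRhtC4?="
)
# Second encoded-word: hypothetical continuation, built here for illustration.
SECOND_WORD = "=?UTF-8?B?" + base64.b64encode("я".encode("utf-8")).decode("ascii") + "?="

# A folded header value: CRLF followed by a space continues the same header.
FOLDED_SUBJECT = FIRST_WORD + "\r\n " + SECOND_WORD


def unfold(value: str) -> str:
    """Remove line breaks that are followed by whitespace (RFC 5322 unfolding)."""
    return re.sub(r"\r?\n(?=[ \t])", "", value)


def decode_subject(value: str) -> str:
    """Decode all encoded-words in a header value and join the pieces."""
    pieces = []
    for chunk, charset in decode_header(value):
        if isinstance(chunk, bytes):
            chunk = chunk.decode(charset or "ascii", errors="replace")
        pieces.append(chunk)
    return "".join(pieces)


decoded_folded = decode_subject(FOLDED_SUBJECT)
decoded_unfolded = decode_subject(unfold(FOLDED_SUBJECT))
print(decoded_folded == decoded_unfolded)
```

In this sketch the folded and unfolded forms decode to the same text, so the line break alone does not have to break Base64 decoding, at least as far as the RFC is concerned.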
 
Best regards,
Kirill Sluchanko
ksluchanko@mail.ru
 
Re: Problem with filter on Base64-encoded subject field
 
Wednesday, 16 December 2020, 3:03 +03:00 from Jeremy Harris via Exim-users <exim-users@exim.org>:
 
> I'm paranoid enough to want to start checking exactly what was received
> on the wire for that header, in case there's a buffer malfunction in exim's code...
 
I'm not sure that is the case. On the receiving side the subject is exactly what it should be, just not filtered.
 
Best regards,
Kirill Sluchanko
Re: Problem with filter on Base64-encoded subject field
> this means that Exim cannot decode the original header for some reason.

Insert at the beginning of the Exim config:
check_rfc2047_length = false
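
For context, RFC 2047 limits an encoded-word to 75 characters including its delimiters, and `check_rfc2047_length` controls whether Exim insists on that limit. The sketch below (Python, standard library only; the function name is illustrative) measures the encoded-word from the Subject quoted earlier in this thread against that limit.

```python
# Maximum length of one encoded-word, delimiters included, per RFC 2047.
RFC2047_MAX_ENCODED_WORD = 75

# The encoded-word exactly as quoted in the original question.
ENCODED_WORD = (
    "=?UTF-8?B?UmU6IFtFWFRFUk5BTF0gUmU6IFJlOiBSZTog"
    "0JzQsNGA0YjRgNGD0YLQuNC30LDRhtC4?="
)


def exceeds_rfc2047_limit(word: str) -> bool:
    """Return True if the encoded-word is longer than RFC 2047 allows."""
    return len(word) > RFC2047_MAX_ENCODED_WORD


too_long = exceeds_rfc2047_limit(ENCODED_WORD)
print(len(ENCODED_WORD), too_long)
```

The encoded-word quoted in this thread is longer than 75 characters, which is the kind of overlong word this option is meant to tolerate.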

Also, you should re-encode the Subject afterwards.
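
As an illustration of what re-encoding involves, here is a minimal Python sketch. It is not an Exim filter, and the function names and the tag list are assumptions based on the tags mentioned in this thread. It strips a bracketed tag from an already decoded subject and wraps the result in a Base64 encoded-word again. It ignores the 75-character limit on encoded-words, which a real implementation would need to respect by splitting long subjects.

```python
import base64
import re

# Tags to remove; TAG1..TAG3 are the placeholders from the original filter,
# EXTERNAL is the real tag mentioned in the thread.
TAGS = ("TAG1", "TAG2", "TAG3", "EXTERNAL")
# Matches "[TAG] " including the trailing space, like the sg pattern in the filter.
TAG_RE = re.compile(r"\[(?:%s)\] " % "|".join(re.escape(tag) for tag in TAGS))


def strip_tags(subject: str) -> str:
    """Remove every bracketed tag (and the space after it) from a decoded subject."""
    return TAG_RE.sub("", subject)


def encode_b_word(text: str, charset: str = "UTF-8") -> str:
    """Wrap text in a single RFC 2047 Base64 encoded-word (no length splitting)."""
    payload = base64.b64encode(text.encode(charset)).decode("ascii")
    return "=?%s?B?%s?=" % (charset, payload)


cleaned = strip_tags("Re: [EXTERNAL] Re: тест")
reencoded = encode_b_word(cleaned)
print(cleaned)
print(reencoded)
```

Without the re-encoding step, a subject that contains non-ASCII characters would be written back into the header as raw 8-bit text.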


Re: Problem with filter on Base64-encoded subject field
 
Hi,
 
> Insert into the beginning of Exim config:
> check_rfc2047_length = false
 
No luck. Subject is still not filtered.

> Also, you should encode Subject back.
 
I skipped this for now, since any change (even a corrupted encoding) would show that the filter works, but there is none.
 
Best regards,
Kirill Sluchanko
 