Mailing List Archive

From header with encoding not parsed?
Hi, I have a variable set like this to extract the email address from
the From header:

${lc:${address:$h_From:}}

But it comes out blank (empty) given a From header like this one:

From: =?utf-8?Q?My=20Bizness=2C=20Inc.?= <charles@*munged*.org>

I think that's a valid header? Did I do something wrong? Thanks!

--
## List details at https://lists.exim.org/mailman/listinfo/exim-users
## Exim details at http://www.exim.org/
## Please use the Wiki with this list - http://wiki.exim.org/
Re: From header with encoding not parsed? [ In reply to ]
On 12 April 2023 16:50:29 UTC, MRob via Exim-users <exim-users@exim.org> wrote:
>Hi, I have a variable to extract the email address in from header set like this:
>
>${lc:${address:$h_From:}}

The header is valid, but after decoding it contains a comma without
quotes. The comma is the address separator, so the result is a
list of two "addresses", the first without a valid address, thus empty...

Use the raw header for address extraction -- $rh_From: -- which works
for both quoted and encoded content...
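
The ordering problem described above can be reproduced outside Exim. The following is a minimal Python sketch, assuming the standard-library email package as a stand-in parser; it is not Exim's implementation, and example.org stands in for the munged domain. It decodes the header first (roughly what $h_From: hands to ${address:}) and compares that with parsing the still-encoded text (what $rh_From: provides).

```python
from email.header import decode_header, make_header
from email.utils import parseaddr

# Raw header value as it appears on the wire (comma encoded as =2C).
# example.org is a placeholder for the munged domain in the report.
RAW_FROM = "=?utf-8?Q?My=20Bizness=2C=20Inc.?= <charles@example.org>"


def decode_then_parse(raw: str) -> str:
    """Decode RFC 2047 words first, then parse: the order that fails."""
    decoded = str(make_header(decode_header(raw)))
    # The decoded text now holds a bare comma, which an address parser
    # treats as a separator between two addresses.
    return parseaddr(decoded)[1]


def parse_raw(raw: str) -> str:
    """Parse the still-encoded text, where the comma is hidden as =2C."""
    return parseaddr(raw)[1]


if __name__ == "__main__":
    print("decoded first:", repr(decode_then_parse(RAW_FROM)))
    print("raw:          ", repr(parse_raw(RAW_FROM)))
```

On the sample header only the second function returns charles@example.org; what the first one returns instead depends on the Python version.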

regards


--
Slavko
https://www.slavino.sk/

Re: From header with encoding not parsed? [ In reply to ]
Slavko via Exim-users wrote on 12.04.2023 20:42:
> On 12 April 2023 16:50:29 UTC, MRob via Exim-users <exim-users@exim.org> wrote:
>> Hi, I have a variable to extract the email address in from header set like this:
>>
>> ${lc:${address:$h_From:}}
>
> Header is valid, but after decoding it contains comma without
> quotes, the comma is address separator and thus results in
> list of two "addresses", first without valid address, thus empty...
>
> Use raw header for address extracting -- $rh_From: that works
> for both, quoted and encoded content...


What about the colon without encoding?

From: =?utf-8?Q?My=20Bizness:=20Inc.?= <charles@example.org>


--
Best wishes Victor Ustugov
mailto:victor@corvax.kiev.ua
public GnuPG/PGP key: https://victor.corvax.kiev.ua/corvax.asc

Re: From header with encoding not parsed? [ In reply to ]
On 2023-04-12 17:42, Slavko via Exim-users wrote:
> On 12 April 2023 16:50:29 UTC, MRob via Exim-users
> <exim-users@exim.org> wrote:
>> Hi, I have a variable to extract the email address in from header set
>> like this:
>>
>> ${lc:${address:$h_From:}}
>
> Header is valid, but after decoding it contains comma without
> quotes, the comma is address separator and thus results in
> list of two "addresses", first without valid address, thus empty...
>
> Use raw header for address extracting -- $rh_From: that works
> for both, quoted and encoded content...

Thank you, Slavko!

If I use $rh_From:, is there a risk of being tricked by a header like:

From: "spammer_address@example.bad" <compromised_account@example.com>

The ${address:} expansion follows RFC 2822... so maybe it's OK, and the
important point is that $h_ should never be used with ${address:} because
that address expansion will decode it anyway??

Also a question about $h_ decoding: I don't remember whether quoting is
required if the name is encoded like in my example. Is the example an
invalid header because it needs quoting? Or is the problem that I'm using
two unrelated steps for full parsing ($h_, then ${address:})?

Re: From header with encoding not parsed? [ In reply to ]
On 12 April 2023 18:43:09 UTC, Victor Ustugov via Exim-users <exim-users@exim.org> wrote:
>Slavko via Exim-users wrote on 12.04.2023 20:42:
>> On 12 April 2023 16:50:29 UTC, MRob via Exim-users <exim-users@exim.org> wrote:
>>> Hi, I have a variable to extract the email address in from header set like this:
>>>
>>> ${lc:${address:$h_From:}}
>>
>> Header is valid, but after decoding it contains comma without
>> quotes, the comma is address separator and thus results in
>> list of two "addresses", first without valid address, thus empty...
>>
>> Use raw header for address extracting -- $rh_From: that works
>> for both, quoted and encoded content...
>
>
>What about the colon without encoding?
>
>From: =?utf-8?Q?My=20Bizness:=20Inc.?= <charles@example.org>

AFAIK the colon has to be encoded. Quoting RFC 2047, section 5
(which covers From: and similar headers):

characters that may be used in a "Q"-encoded 'encoded-word' is
restricted to: <upper and lower case ASCII letters, decimal digits,
"!", "*", "+", "-", "/", "=", and "_">
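
The quoted restriction can be turned into a quick check. This is a minimal Python sketch, assuming the character set exactly as quoted above; the function name disallowed_chars and both regular expressions are illustrative, not part of any library. Read strictly, the literal "." in "Inc." falls outside the set as well as the ":" does.

```python
import re

# One character that RFC 2047 section 5 allows inside a Q-encoded word
# used in a 'phrase' (such as a From: display name), per the quote above.
ALLOWED_CHAR = re.compile(r"[A-Za-z0-9!*+\-/=_]")

# Shape of a Q-encoded word: =?charset?Q?encoded-text?=
Q_WORD = re.compile(r"=\?([^?\s]+)\?[Qq]\?([^?\s]*)\?=")


def disallowed_chars(encoded_word: str) -> list:
    """Return, sorted, the characters of the encoded text that break the rule."""
    match = Q_WORD.fullmatch(encoded_word)
    if match is None:
        raise ValueError("not a Q-encoded word")
    text = match.group(2)
    return sorted({ch for ch in text if not ALLOWED_CHAR.fullmatch(ch)})


if __name__ == "__main__":
    for word in ("=?utf-8?Q?My=20Bizness:=20Inc.?=",
                 "=?utf-8?Q?My=20Bizness=2C=20Inc=2E?="):
        print(word, disallowed_chars(word))
```

A sender that encodes every character outside the allowed set (for example "." as =2E) produces a word for which the list is empty.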

regards


--
Slavko
https://www.slavino.sk/

Re: From header with encoding not parsed? [ In reply to ]
On 12 April 2023 19:15:19 UTC, MRob via Exim-users <exim-users@exim.org> wrote:
>On 2023-04-12 17:42, Slavko via Exim-users wrote:

>> Use raw header for address extracting -- $rh_From: that works
>> for both, quoted and encoded content...
>
>If using rh_From: is there risk to get tricked with header like:
>
>From: "spammer_address@example.bad" <compromised_account@example.com>

Simply put that line in a file and try it yourself with -bem, e.g.:

exim -bem /file/with/that_header '${address:$rh_From:}'
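
For repeated experiments, the same test can be scripted. This is a minimal Python sketch that writes the header to a temporary file and builds the -bem command shown above. The helper names are hypothetical, and the binary name "exim" is an assumption (some systems install it under another name). It runs Exim only if the binary is found on the PATH.

```python
import shutil
import subprocess
import tempfile

HEADER = 'From: "spammer_address@example.bad" <compromised_account@example.com>'
EXPANSION = "${address:$rh_From:}"


def write_message(header: str) -> str:
    """Write a one-header message to a temporary file and return its path."""
    with tempfile.NamedTemporaryFile("w", suffix=".eml", delete=False) as fh:
        # Header, a blank line, then a token body.
        fh.write(header + "\n\ntest body\n")
        return fh.name


def build_command(path: str, expansion: str, binary: str = "exim") -> list:
    """Build the argument list for the -bem test shown above."""
    return [binary, "-bem", path, expansion]


if __name__ == "__main__":
    cmd = build_command(write_message(HEADER), EXPANSION)
    if shutil.which(cmd[0]) is None:
        print("exim not found; would run:", cmd)
    else:
        done = subprocess.run(cmd, capture_output=True, text=True)
        print(done.stdout.strip() or done.stderr.strip())
```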

>${address:} expansion is following RFC 2822... so maybe its ok and the importance is $h_ should never be used with ${address:} because that address expansion will decode it anyway??

Hard to say, headers can be broken (by mistake or on purpose)
in many ways. One usually does not need to look into From: headers
from foreign sources, but will want e.g. to extract the domain from them
for the DKIM signature (with DMARC in mind) on one's own messages, and so
ensure a valid From: header on the MSA with in-depth inspection.

I delegate in-depth message inspection to rspamd, with
some exceptions -- mostly Subject: and attachments (e.g. for
DMARC report extraction/routing).

>Also question about $h_ decoding, I dont remember if quoting is required if it is encoded like my exmaple. Is the example a invalid header because it needs quoting? Or is the problem that i'm using two unrelated steps for full parsing? ($h_ then ${address:})

The RFC defines when quotes are required; the "@" is one of those
cases. Exim properly checks that syntax with the control=verifyXY
ACL condition (sorry, I forgot the exact name).

AFAIK, the name part is either quoted (for ASCII only) or
encoded (for non-ASCII). But I often see ASCII-only characters
encoded (rspamd detects that), often in legitimate
messages...

BTW, I am always surprised how problematic non-ASCII
things are. My first bigger computer project was to teach a computer
to print the characters nowadays known as Latin2 & Cyrillic (in 1984 :-) ).
Nowadays it is no problem to print/show them, but...

regards


--
Slavko
https://www.slavino.sk/

Re: From header with encoding not parsed? [ In reply to ]
On 12/04/2023 17:50, MRob via Exim-users wrote:
> Hi, I have a variable to extract the email address in from header set like this:
>
> ${lc:${address:$h_From:}}
>
> But it comes out blank(empty) given a "from" header like this one:
>
> From: =?utf-8?Q?My=20Bizness=2C=20Inc.?= <charles@*munged*.org>
>
> I think thats a valid header? Did i do somethings wrong please? Thanks!

You didn't say where you are trying to do that expansion.
If it's before the data phase, the headers have not yet been received.

--
Cheers,
Jeremy


Re: From header with encoding not parsed? [ In reply to ]
@Jeremy I think the location is not the problem, since the address is
successfully extracted in most cases. It fails only in this one case,
because of the encoded comma.

> Simple put that line in some file and try itself by -bem, eg:

Thank you, Slavko, so I will not bother the list with that kind of question!

>> ${address:} expansion is following RFC 2822... so maybe its ok and the
>> importance is $h_ should never be used with ${address:} because that
>> address expansion will decode it anyway??
>
> Hard to say, headers can be broken (by mistake or by purpose)
> in many ways

>> Also question about $h_ decoding, I dont remember if quoting is required
>> if it is encoded like my exmaple. Is the example a invalid header because
>> it needs quoting? Or is the problem that i'm using two unrelated steps for
>> full parsing? ($h_ then ${address:})

It looks like RFC 2822 requires quotes when there is a comma in the
display-name, but it doesn't say what happens when the display-name is
encoded, so I still don't know if it's a valid header. My guess is that
one is required to decode first and then parse the result as a normal
non-encoded RFC 2822 header, and thus this header is not valid?

If that's right, then using $h_ to decode and then ${address:} to parse
and extract the address is OK even though it's two separate operations.
Otherwise, Exim would need an operation that does both at once
(does ${address:} do decoding or only parsing?)

Re: From header with encoding not parsed? [ In reply to ]
On 2023-04-12, Victor Ustugov via Exim-users <exim-users@exim.org> wrote:
> Slavko via Exim-users wrote on 12.04.2023 20:42:
>> On 12 April 2023 16:50:29 UTC, MRob via Exim-users <exim-users@exim.org> wrote:
>>> Hi, I have a variable to extract the email address in from header set like this:
>>>
>>> ${lc:${address:$h_From:}}
>>
>> Header is valid, but after decoding it contains comma without
>> quotes, the comma is address separator and thus results in
>> list of two "addresses", first without valid address, thus empty...
>>
>> Use raw header for address extracting -- $rh_From: that works
>> for both, quoted and encoded content...
>
>
> What about the colon without encoding?
>
> From: =?utf-8?Q?My=20Bizness:=20Inc.?= <charles@example.org>

Yes, the colon breaks it. It's not a valid From header.

RFC 5322 is a bit of a rabbit hole to dive into.

But the short story is that none of these should be used in "bare" names:

specials = "(" / ")" /   ; Special characters that do
           "<" / ">" /   ; not appear in atext
           "[" / "]" /
           ":" / ";" /
           "@" / "\" /
           "," / "." /
           DQUOTE

except where specific permission is given.


Easiest fix for the sender is to use quotes.

From: "=?utf-8?Q?My=20Bizness:=20Inc.?=" <charles@example.org>
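
The quoting rule can be sketched for a sender that builds its own From: value. This is a minimal Python sketch under stated assumptions: needs_quoting and format_from are illustrative names, and only plain ASCII display names are handled, with non-ASCII names left to RFC 2047 encoding. Note that RFC 2047 section 5 also states that an encoded-word must not appear inside a quoted-string, so the sketch quotes the plain-text name rather than wrapping an encoded word in quotes.

```python
# The RFC 5322 "specials" listed above; any of them forces quoting.
SPECIALS = set('()<>[]:;@\\,."')


def needs_quoting(display_name: str) -> bool:
    """True if a plain ASCII display name cannot be written bare."""
    return any(ch in SPECIALS for ch in display_name)


def format_from(display_name: str, addr_spec: str) -> str:
    """Build a From: value, quoting the display name only when needed."""
    if needs_quoting(display_name):
        # Inside a quoted-string, backslash and double quote are escaped.
        escaped = display_name.replace("\\", "\\\\").replace('"', '\\"')
        display_name = '"' + escaped + '"'
    return f"{display_name} <{addr_spec}>"


if __name__ == "__main__":
    print(format_from("My Bizness: Inc.", "charles@example.org"))
    print(format_from("My Bizness", "charles@example.org"))
```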

--
Jasen.

Re: From header with encoding not parsed? [ In reply to ]
Slavko via Exim-users wrote on 12.04.2023 22:38:

>>>> Hi, I have a variable to extract the email address in from header set like this:
>>>>
>>>> ${lc:${address:$h_From:}}
>>>
>>> Header is valid, but after decoding it contains comma without
>>> quotes, the comma is address separator and thus results in
>>> list of two "addresses", first without valid address, thus empty...
>>>
>>> Use raw header for address extracting -- $rh_From: that works
>>> for both, quoted and encoded content...
>>
>>
>> What about the colon without encoding?
>>
>> From: =?utf-8?Q?My=20Bizness:=20Inc.?= <charles@example.org>
>
> AFAIK colon have to be encoded, quote from by RFC 2047, section
> 5 (the From: and similar):
>
> characters that may be used in a "Q"-encoded 'encoded-word' is
> restricted to: <upper and lower case ASCII letters, decimal digits,
> "!", "*", "+", "-", "/", "=", and "_">

I'm not talking about what should be encoded, but about what can be
received in a real email from a spammer, some kind of script or
something like that.


--
Best wishes Victor Ustugov
mailto:victor@corvax.kiev.ua
public GnuPG/PGP key: https://victor.corvax.kiev.ua/corvax.asc

Re: From header with encoding not parsed? [ In reply to ]
Jasen Betts via Exim-users wrote on 13.04.2023 10:07:
> On 2023-04-12, Victor Ustugov via Exim-users <exim-users@exim.org> wrote:
>> Slavko via Exim-users wrote on 12.04.2023 20:42:
>>> On 12 April 2023 16:50:29 UTC, MRob via Exim-users <exim-users@exim.org> wrote:
>>>> Hi, I have a variable to extract the email address in from header set like this:
>>>>
>>>> ${lc:${address:$h_From:}}
>>>
>>> Header is valid, but after decoding it contains comma without
>>> quotes, the comma is address separator and thus results in
>>> list of two "addresses", first without valid address, thus empty...
>>>
>>> Use raw header for address extracting -- $rh_From: that works
>>> for both, quoted and encoded content...
>>
>>
>> What about the colon without encoding?
>>
>> From: =?utf-8?Q?My=20Bizness:=20Inc.?= <charles@example.org>
>
> yes, the colon breaks it. it's not a valid from header.

I know. But email clients correctly display the From header shown above.
And it is quite possible to get such a header in an incoming email.

> RFC5322 is a bit of a rabbit hole to dive into.
>
> but the short story is none of these should be used in "bare" names
>
> specials = "(" / ")" /   ; Special characters that do
>            "<" / ">" /   ; not appear in atext
>            "[" / "]" /
>            ":" / ";" /
>            "@" / "\" /
>            "," / "." /
>            DQUOTE
>
> except where there is specific permission given
>
>
> Easiest fix for the sender is to use quotes.
>
> From: "=?utf-8?Q?My=20Bizness:=20Inc.?=" <charles@example.org>

In order to insert double quotes, I need to separate the From header
into the address and the part of the header that comes before it. Why do
I need to add quotes if I have already determined which part of the
header is the address?


--
Best wishes Victor Ustugov
mailto:victor@corvax.kiev.ua
public GnuPG/PGP key: https://victor.corvax.kiev.ua/corvax.asc

Re: From header with encoding not parsed? [ In reply to ]
On 13/04/2023 09:54, Victor Ustugov via Exim-users wrote:
> I'm not talking about what should be encoded, but about what can be
> received in a real email from a spammer, some kind of script or
> something like that.

A mail sender could send you *anything*.
--
Cheers,
Jeremy


Re: From header with encoding not parsed? [ In reply to ]
On Thu, 13 Apr 2023 at 19:36, Slavko <linux@slavino.sk> wrote in
exim-users@exim.org:

> On 12 April 2023 16:50:29 UTC, MRob via Exim-users
> <exim-users@exim.org> wrote:
> > Hi, I have a variable to extract the email address in from header set
> > like this:
> >
> > ${lc:${address:$h_From:}}
>
> Header is valid, but after decoding it contains comma without
> quotes, the comma is address separator and thus results in
> list of two "addresses", first without valid address, thus empty...
>

My take on this is that Exim is wrong there.

Anywhere else, splitting addresses on commas happens before decoding, and
this should be no different.

One way to do that would be to treat encoded characters as if they were
quoted.
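
The order proposed here (split on structural commas first, decode afterwards) can be sketched in Python. This is a minimal sketch that assumes the standard-library email package as a stand-in and a made-up two-address header; it illustrates the idea only and says nothing about how Exim behaves.

```python
from email.header import decode_header, make_header
from email.utils import getaddresses

# Hypothetical two-address list; the first display name hides a comma as =2C.
RAW = ("=?utf-8?Q?My=20Bizness=2C=20Inc.?= <charles@example.org>, "
       "Other Person <other@example.org>")


def split_then_decode(raw: str) -> list:
    """Split the raw list on structural commas, then decode each name."""
    result = []
    for name, addr in getaddresses([raw]):
        # Decoding happens last, so a decoded comma can no longer be
        # mistaken for an address separator.
        result.append((str(make_header(decode_header(name))), addr))
    return result


if __name__ == "__main__":
    for name, addr in split_then_decode(RAW):
        print(repr(name), addr)
```

Because the only bare comma in the raw text is the real separator, both addresses survive and the first display name still decodes with its comma intact.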

-Martin
Re: From header with encoding not parsed? [ In reply to ]
On 13/04/2023 23:24, Martin D Kealey via Exim-users wrote:
> On Thu, 13 Apr 2023 at 19:36, Slavko <linux@slavino.sk> wrote in
> exim-users@exim.org:
>
>> On 12 April 2023 16:50:29 UTC, MRob via Exim-users
>> <exim-users@exim.org> wrote:
>>> Hi, I have a variable to extract the email address in from header set
>>> like this:
>>>
>>> ${lc:${address:$h_From:}}
>>
>> Header is valid, but after decoding it contains comma without
>> quotes, the comma is address separator and thus results in
>> list of two "addresses", first without valid address, thus empty...
>>
>
> My take on this is that Exim is wrong there.
>
> Anywhere else, splitting addresses on commas happens before decoding, and
> this should be no different.

Uh, it's only a list if and when you use that string (the result of that expansion)
where a list is expected. And the list separator is also defined
by the context.

I don't agree with "Exim is wrong there".

--
Cheers,
Jeremy

