Mailing List Archive

Variable names
I'm writing a CLIENTID extension for exim which will
add some variables to be used in the exim config.

One of them, call it "token", is unsafe and cannot be safely untainted
(it is a string of "between 1 and 128 printable characters") so I am
thinking of exposing a second variable which is the string hex-encoded.

What should I call these:
"token" and "token_hex",
"token_raw" and "token",
"tainted_token" and "token",
"token_tainted" and "token_hex"
or are there better suggestions ?

It would be nice if error messages about misusing the tainted version
suggested using the safe version. Is that possible ?

Thanks,

--
Andrew C. Aitchison Kendal, UK
andrew@aitchison.me.uk

--
## subscription configuration (requires account):
## https://lists.exim.org/mailman3/postorius/lists/exim-dev.lists.exim.org/
## unsubscribe (doesn't require an account):
## exim-dev-unsubscribe@lists.exim.org
## Exim details at http://www.exim.org/
## Please use the Wiki with this list - http://wiki.exim.org/
Re: Variable names
On 06/07/2023 17:21, Andrew C Aitchison via Exim-dev wrote:
> I'm writing a CLIENTID extension for exim which will
> add some variables to be used in the exim config.
>
> One of them, call it "token", is unsafe and cannot be safely untainted
> (it is a string of "between 1 and 128 printable characters") so I am
> thinking of exposing a second variable which is the string hex-encoded.

That second one should also be tainted, in that case, so I don't see
it buys you anything. But - why does it matter if the value is
tainted? How is it expected to be used?
--
Cheers,
Jeremy


Re: Variable names
On Thu, 6 Jul 2023, Jeremy Harris via Exim-dev wrote:

> On 06/07/2023 17:21, Andrew C Aitchison via Exim-dev wrote:
>> I'm writing a CLIENTID extension for exim which will
>> add some variables to be used in the exim config.
>>
>> One of them, call it "token", is unsafe and cannot be safely untainted
>> (it is a string of "between 1 and 128 printable characters") so I am
>> thinking of exposing a second variable which is the string hex-encoded.
>
> That second one should also be tainted, in that case,

Why should the hex-encoded version be tainted ?
As an implementation detail it is currently untainted.
If someone decoded the hex it would pose a risk,
but I am prepared to assume that they will understand that they
do that at their own risk.

Exim is able to guarantee to the exim config that the hex-token is a
string of (not more than 256) hex digits, so I don't see a need for it
to be tainted. Am I missing some other risk ?
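
As a rough illustration of that guarantee (a Python sketch, not Exim code): hex-encoding a token of 1 to 128 printable characters always gives a string of hex digits no longer than 256 characters. The function name `hex_token` is hypothetical, and "printable" is assumed here to mean ASCII 0x20 to 0x7E. The sketch says nothing about whether the result ought to be tainted.

```python
def hex_token(token: str) -> str:
    """Hex-encode a token of 1..128 printable ASCII characters (illustrative only)."""
    if not 1 <= len(token) <= 128:
        raise ValueError("token must be 1..128 characters")
    # Assumption: "printable" means ASCII 0x20..0x7E inclusive.
    if any(not (0x20 <= ord(c) <= 0x7E) for c in token):
        raise ValueError("token must be printable ASCII")
    # Each input byte becomes exactly two hex digits.
    return token.encode("ascii").hex()


print(hex_token("a/b"))            # 612f62
print(len(hex_token("x" * 128)))   # 256
```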

There is actually another variable which is "between 1 and 16
characters comprised of only alphanumeric and dash characters" (ASCII
by RFC5324, I think). I *was* assuming that this would also be an
untainted variable.

> so I don't see it buys you anything. But - why does it matter if
> the value is tainted? How is it expected to be used?

This token is somewhere between a username and a password.
It is shared with the associated imap service.
It is likely to appear in logfiles and be used as a rate-limit
key and in block lists.

I don't have much or recent experience with user databases, and they
vary a lot, so I expect at least initially, that people will have to
write their own configurations to integrate it with their user database
and to communicate with the imap service (so that blocking or
unblocking in one does the same in the other).

My observation from the exim lists is that people struggle to write
configs that work with tainted variables, so it seems important to
make safe variables available to the exim config.

The overall point of the exercise is to be able to, say, disable a
user's laptop but still allow them to read and send mail from their
phone or web-mail, should exim or the imap service believe that the
laptop is misusing the mail service.

--
Andrew C. Aitchison Kendal, UK
andrew@aitchison.me.uk

Re: Variable names
On 06/07/2023 22:30, Andrew C Aitchison wrote:
> On Thu, 6 Jul 2023, Jeremy Harris via Exim-dev wrote:
>
>> On 06/07/2023 17:21, Andrew C Aitchison via Exim-dev wrote:
>>> I'm writing a CLIENTID extension for exim which will
>>> add some variables to be used in the exim config.
>>>
>>> One of them, call it "token", is unsafe and cannot be safely untainted
>>> (it is a string of "between 1 and 128 printable characters") so I am
>>> thinking of exposing a second variable which is the string hex-encoded.
>>
>> That second one should also be tainted, in that case,
>
> Why should the hex-encoded version be tainted ?

Because it derives from attacker-source data, not verified
against local data.

> As an implementation detail it is currently untainted.

Effectively, you've introduced an untainting backdoor.

> If someone decoded the hex it would pose a risk,
> but I am prepared to assume that they will understand that they
> do that at their own risk.
>
> Exim is able to guarantee to the exim config that the hex-token is a
> string of (not more than 256) hex digits, so I don't see a need for it
> to be tainted. Am I missing some other risk ?

It's more a matter of principle, allowing simple reasoning about
the provenance of the data.
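
One way to picture the provenance argument is the toy Python model below. It is not how Exim implements taint tracking; `Tainted`, `hex_encode` and `open_path` are hypothetical names. Taint follows where a value came from, so a re-encoded copy of tainted input is still tainted.

```python
class Tainted(str):
    """Marker type for strings that arrived from the network (hypothetical)."""


def hex_encode(value: str) -> str:
    encoded = value.encode("utf-8").hex()
    # Provenance rule: anything derived from tainted input stays tainted,
    # however harmless the encoded form looks.
    return Tainted(encoded) if isinstance(value, Tainted) else encoded


def open_path(path: str) -> str:
    # A sensitive operation refuses tainted values outright.
    if isinstance(path, Tainted):
        raise PermissionError("tainted value used as a file path")
    return f"would open {path}"


token = Tainted("../../etc/passwd")
print(isinstance(hex_encode(token), Tainted))   # True
```

In this model, anyone reading a config only has to ask whether a value came off the wire, not whether each transformation along the way happened to be safe.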

> There is actually another variable which is "between 1 and 16
> characters comprised of only alphanumeric and dash characters" (ASCII
> by RFC5324, I think). I *was* assuming that this would also be an
> untainted variable.

Again, if this comes off the wire (or is derived from something that
does), don't do it.

>> so I don't see it buys you anything.  But - why does it matter if
>> the value is tainted?  How is it expected to be used?
>
> This token is somewhere between a username and a password.
> It is shared with the associated imap service.
> It is likely to appear in logfiles and be used as a rate-limit
> key and in block lists.
>
> I don't have much or recent experience with user databases, and they
> vary a lot, so I expect at least initially, that people will have to
> write their own configurations to integrate it with their user database
> and to communicate with the imap service (so that blocking or
> unblocking in one does the same in the other).

So long as you're only going to feed the datum to a DB, or external program
as an argument, there's no problem with it being tainted.
(Mostly because we can't control what these non-exim components do,
so have to give up on protection at that boundary).

> My observation from the exim lists is that people struggle to write
> configs that work with tainted variables, so it seems important to
> make safe variables available to the exim config.

"safe" by just abandoning the taint-tracking isn't something I really
want to end up with. The whole effort was put in place to keep exim
away from a log4j-style debacle (yes, we got burned by something similar).
>
> The overall point of the exercise is to be able to, say, disable a
> user's laptop but still allow them to read and send mail from their
> phone or web-mail, should exim or the imap service believe that the
> laptop is misusing the mail service.

Sounds like stashing (tainted) values in a DB, and comparing against a DB
suffices?

--
Cheers,
Jeremy


Re: Variable names
Now that I have figured out that I was wrong to follow the advice of
Randall Munroe and Mrs. Roberts https://xkcd.com/327/
"I hope you've learned to sanitize your database inputs."
Instead we should *let the database routines* sanitise our data.
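
A minimal sketch of what "let the database routines sanitise" can look like, using Python's sqlite3 module with bound parameters. The table name `blocked` and the helper functions are made up for illustration; an Exim config would use its own lookup and quoting facilities instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE blocked (token TEXT PRIMARY KEY)")


def block(token: str) -> None:
    # The "?" placeholder passes the token as data; it is never parsed as SQL.
    conn.execute("INSERT OR IGNORE INTO blocked (token) VALUES (?)", (token,))


def is_blocked(token: str) -> bool:
    row = conn.execute(
        "SELECT 1 FROM blocked WHERE token = ?", (token,)
    ).fetchone()
    return row is not None


hostile = "x'); DROP TABLE blocked; --"
block(hostile)
print(is_blocked(hostile))   # True
print(is_blocked("other"))   # False
```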

I agree with Jeremy's points below and have removed the detainting.


> Sounds like stashing (tainted) values in a DB, and comparing against a DB
> suffices?

I think so.

Though the impression I get is that many people get caught with tainted
data when they use database lookups or run external commands.

On Thu, 6 Jul 2023, Jeremy Harris via Exim-dev wrote:

> On 06/07/2023 22:30, Andrew C Aitchison wrote:
>> On Thu, 6 Jul 2023, Jeremy Harris via Exim-dev wrote:
>>
>>> On 06/07/2023 17:21, Andrew C Aitchison via Exim-dev wrote:
>>>> I'm writing a CLIENTID extension for exim which will
>>>> add some variables to be used in the exim config.
>>>>
>>>> One of them, call it "token", is unsafe and cannot be safely untainted
>>>> (it is a string of "between 1 and 128 printable characters") so I am
>>>> thinking of exposing a second variable which is the string hex-encoded.
>>>
>>> That second one should also be tainted, in that case,
>>
>> Why should the hex-encoded version be tainted ?
>
> Because it derives from attacker-source data, not verified
> against local data.
>
>> As an implementation detail it is currently untainted.
>
> Effectively, you've introduced an untainting backdoor.
>
>> If someone decoded the hex it would pose a risk,
>> but I am prepared to assume that they will understand that they
>> do that at their own risk.
>>
>> Exim is able to guarantee to the exim config that the hex-token is a
>> string of (not more than 256) hex digits, so I don't see a need for it
>> to be tainted. Am I missing some other risk ?
>
> It's more a matter of principle, allowing simple reasoning about
> the provenance of the data.
>
>> There is actually another variable which is "between 1 and 16
>> characters comprised of only alphanumeric and dash characters" (ASCII
>> by RFC5324, I think). I *was* assuming that this would also be an
>> untainted variable.
>
> Again, if this comes off the wire (or is derived from something that
> does), don't do it.
>
>>> so I don't see it buys you anything. But - why does it matter if
>>> the value is tainted? How is it expected to be used?
>>
>> This token is somewhere between a username and a password.
>> It is shared with the associated imap service.
>> It is likely to appear in logfiles and be used as a rate-limit
>> key and in block lists.
>>
>> I don't have much or recent experience with user databases, and they
>> vary a lot, so I expect at least initially, that people will have to
>> write their own configurations to integrate it with their user database
>> and to communicate with the imap service (so that blocking or
>> unblocking in one does the same in the other).
>
> So long as you're only going to feed the datum to a DB, or external program
> as an argument, there's no problem with it being tainted.
> (Mostly because we can't control what these non-exim components do,
> so have to give up on protection at that boundary).
>
>> My observation from the exim lists is that people struggle to write
>> configs that work with tainted variables, so it seems important to
>> make safe variables available to the exim config.
>
> "safe" by just abandoning the taint-tracking isn't something I really
> want to end up with. The whole effort was put in place to keep exim
> away from a log4j-style debacle (yes, we got burned by something similar).
>>
>> The overall point of the exercise is to be able to, say, disable a
>> user's laptop but still allow them to read and send mail from their
>> phone or web-mail, should exim or the imap service believe that the
>> laptop is misusing the mail service.
>
> Sounds like stashing (tainted) values in a DB, and comparing against a DB
> suffices?
>
> --
> Cheers,
> Jeremy

--
Andrew C. Aitchison Kendal, UK
andrew@aitchison.me.uk

Re: Variable names
On 2023-07-06, Andrew C Aitchison via Exim-dev <exim-dev@lists.exim.org> wrote:
> On Thu, 6 Jul 2023, Jeremy Harris via Exim-dev wrote:
>
>> On 06/07/2023 17:21, Andrew C Aitchison via Exim-dev wrote:
>>> I'm writing a CLIENTID extension for exim which will
>>> add some variables to be used in the exim config.
>>>
>>> One of them, call it "token", is unsafe and cannot be safely untainted
>>> (it is a string of "between 1 and 128 printable characters") so I am
>>> thinking of exposing a second variable which is the string hex-encoded.
>>
>> That second one should also be tainted, in that case,
>
> Why should the hex-encoded version be tainted ?

Why should it not be tainted? why should it exist?

> This token is somewhere between a username and a password.
> It is shared with the associated imap service.
> It is likely to appear in logfiles and be used as a rate-limit
> key and in block lists.

There's no need to untaint it for those uses; it will also work in all kinds of
database lookups.

> I don't have much or recent experience with user databases, and they
> vary a lot, so I expect at least initially, that people will have to
> write their own configurations to integrate it with their user database
> and to communicate with the imap service (so that blocking or
> unblocking in one does the same in the other).
>
> My observation from the exim lists is that people struggle to write
> configs that work with tainted variables, so it seems important to
> make safe variables available to the exim config.

Tainting is a new feature so it hits established users by surprise
when deployed, and then they post here. Usually only a little re-thinking
is needed to get an untainted value when needed.

--
Jasen.

Re: Variable names
I found this in my drafts folder, and pondered whether I should still send
it.

However I'm *still* seeing occasional "T=address_pipe defer (0): Tainted"
in my logs, albeit on a new host that's not yet in production. Clearly I
have work to do before then.

The remainder of this message is as I wrote it six months ago.

On 06/07/2023 17:21, Andrew C Aitchison via Exim-dev wrote:
> One of them, call it "token", is unsafe and cannot be safely untainted (it
> is a string of "between 1 and 128 printable characters") so I am thinking
> of exposing a second variable which is the string hex-encoded.

On Thu, 6 Jul 2023, Jeremy Harris via Exim-dev wrote:
> That second one should also be tainted, in that case,

On 2023-07-06, Andrew C Aitchison via Exim-dev <exim-dev@lists.exim.org> wrote:
> Why should the hex-encoded version be tainted ?

On Tue, 11 Jul 2023 at 22:00, Jasen Betts <jasen.betts@gmail.com> wrote:
> Why should it not be tainted? why should it exist?

The point of tainting is to prevent the inadvertent use of data in ways
that may be unsafe if it contains something unexpected, in particular
characters that become metacharacters in the surrounding context where the
value is used.

Coming from a trusted source isn't the only way to prove that it's safe,
nor should it be.

One way to prove that a datum is safe is to match it against a pattern as a
precondition to taking action.
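
A small Python sketch of that kind of precondition check, using the "between 1 and 16 characters comprised of only alphanumeric and dash characters" rule quoted earlier in the thread. The names `SAFE_ID` and `validated` are hypothetical, and this is only an analogy for a pattern test in a config.

```python
import re
from typing import Optional

# 1 to 16 characters, each an ASCII letter, digit or dash.
SAFE_ID = re.compile(r"[A-Za-z0-9-]{1,16}")


def validated(value: str) -> Optional[str]:
    """Return the value only if the whole string matches the pattern."""
    # fullmatch() anchors at both ends, so trailing junk cannot slip through.
    return value if SAFE_ID.fullmatch(value) else None


print(validated("laptop-01"))    # laptop-01
print(validated("../etc"))       # None
```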

Another way, just as valid, is to ensure that it's encoded safely. In
particular various kinds of ASCII-armouring (hex encode, base-64 encode,
base-94 encode, etc) don't require tainting because they implicitly
guarantee that it will match a predictable pattern.

> There's no need to untaint it for those uses, it will also work in all
> kinds of database lookups.

Until it doesn't. Try this:

SENDER_INFO=DIR/$sender_address_domain/sender_info/$sender_address_localpart

Nope, those are all tainted. Let's try:

SENDER_INFO=${lookup {$sender_address_domain/sender_info/$sender_address_localpart} dsearch{DIR} {DIR/$value} fail}

Nope again, the lookup key for dsearch isn't allowed to have "/" in it.

> Tainting is a new feature so it hits established users by surprise when
> deployed, and then they post here.


Or some of us actually read the documentation and try to figure it out for
ourselves. (In hindsight that was obviously the wrong course of action; I
should have just come here and asked someone else to solve it for me.
Grrrr.)

> Usually only a little re-thinking is needed to get an untainted value when
> needed.

"Usually" but not always. We have configurations with more than 80 routers
(because they're combining multiple legacy systems), and that "little
re-thinking" became an enormous headache.

I was coming at this cold, having rarely needed to read the documentation
as I was moderately familiar with what we needed. A routine OS upgrade
pushed Exim to version 4.96, but fortunately we caught this just before we
obliterated our last few servers running Exim 4.90. That left us with
an unstable mail platform with inadequate fail-over capability, while I
desperately read manuals and conducted experiments to find out exactly what
was broken, what could be used to replace those parts, and what would need
to be rewritten.

The data-flow of tainted data can still surprise even an experienced Exim
config writer, until they've learned all the new nuances.

For example, even though we had already sanitized the path (by ensuring
each component did not contain "." or ".." or "/"), a simple "fetch file
contents" now takes exponentially many nested dsearch lookups: one for the
leaf filename, then 2, 4, 8 etc for each tainted directory name in the
path. One is forced to invent a bunch of new macros just to make the whole
thing even *vaguely* manageable.

My *SENDER_INFO* macro (above) seemed to need to be rewritten something
like this:

SENDER_INFO=${lookup {$sender_address_localpart} \
    dsearch{${lookup {$sender_address_domain} dsearch{DIR} {DIR/$value/sender_info} fail}} \
    {${lookup {$sender_address_domain} dsearch{DIR} {DIR/$value/sender_info} fail}/$value} \
    fail}

Or a little less obnoxiously like:

SENDER_INFO_DIR=${lookup {$sender_address_domain} dsearch{DIR} {DIR/$value/sender_info} fail}
SENDER_INFO=${lookup {$sender_address_localpart} dsearch{SENDER_INFO_DIR} {SENDER_INFO_DIR/$value} fail}

(Maybe there's a less cumbersome way involving stashing sanitized
components in per-recipient variables, but finding out if that would even
be possible would have taken too long.)
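
For readers unfamiliar with why the chained lookups help, here is a loose Python analogue of the two-step macro above. It is a sketch under assumptions, not Exim behaviour: `dsearch` and `sender_info_path` are hypothetical helpers. Each path component is looked up in a directory listing, and the name returned comes from the filesystem rather than from the untrusted input.

```python
import os
import tempfile
from typing import Optional


def dsearch(directory: str, key: str) -> Optional[str]:
    """Find `key` among the entries of `directory`; return the listed name."""
    if "/" in key:            # the key may not contain "/", as noted above
        return None
    for name in os.listdir(directory):
        if name == key:
            return name       # taken from the listing, not from the caller
    return None


def sender_info_path(base: str, domain: str, localpart: str) -> Optional[str]:
    """Validate one component per lookup, reusing the first result."""
    dom = dsearch(base, domain)
    if dom is None:
        return None
    info_dir = os.path.join(base, dom, "sender_info")
    if not os.path.isdir(info_dir):
        return None
    leaf = dsearch(info_dir, localpart)
    return None if leaf is None else os.path.join(info_dir, leaf)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as base:
        os.makedirs(os.path.join(base, "example.org", "sender_info"))
        open(os.path.join(base, "example.org", "sender_info", "alice"), "w").close()
        print(sender_info_path(base, "example.org", "alice") is not None)
        print(sender_info_path(base, "../etc", "passwd"))
```

Computing the intermediate directory once and reusing it mirrors the SENDER_INFO_DIR macro, which avoids repeating the inner lookup at every level.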

So it took us *many* days to create a new config file, repeatedly comparing
its behaviour under Exim 4.96 with the behaviour of the old config under
Exim 4.90. Heck, it took a number of hours just to set up a test framework
so that this could be done.

This has been by far the most disruptive change since moving from Exim3 to
Exim4.

-Martin
