Mailing List Archive

A study of failing tls certs, with valid certificate files
Hi all,

Please take this text for what it is: a study of a failure you can
avoid. No finger-pointing, no flaming, only suggestions on what to look
for or change in your toolchains.

In early December 2022 the server in question switched its OS release
and was restarted (Exim included). In this upgrade, the following
switch was made:

FROM:

2022-11-28T20:46:24+0100 SUBDEBUG Upgraded: exim-4.96-5.fc35.x86_64
2022-11-28T20:46:32+0100 SUBDEBUG Upgraded: openssl-1:1.1.1q-1.fc35.x86_64

TO:

2022-11-28T20:41:00+0100 SUBDEBUG Upgrade: openssl-1:3.0.5-2.fc36.x86_64
2022-11-28T20:42:54+0100 SUBDEBUG Upgrade: exim-4.96-5.fc36.x86_64

Later there was an update to 4.96-6:

2022-12-01T08:01:27+0100 SUBDEBUG Upgrade: exim-4.96-6.fc36.x86_64
2022-12-01T08:01:45+0100 SUBDEBUG Upgraded: exim-4.96-5.fc36.x86_64

Certs are renewed by a cron job that runs every 5 days (so as not to
hit LE too much), which restarts Apache but not Exim.

At that time the Let's Encrypt certificate for exim and all other
services had these dates:

            Not Before: Oct 10 21:07:39 2022 GMT
            Not After : Jan  8 21:07:38 2023 GMT

On 11 December 2022 at 00:08 CET it was auto-renewed and switched to
these dates:

            Not Before: Dec 10 22:08:37 2022 GMT
            Not After : Mar 10 22:08:36 2023 GMT

-rw-r----- 1 root exim 1834 11 Dec 00:08 cert-1670713689.csr
-rw-r----- 1 root exim 2366 11 Dec 00:08 cert-1670713689.pem

Yesterday evening (2023-01-08) at around 22:25 CET (GMT+1), OpenSSL
(via Exim) started to spit out these messages on incoming connections:

2023-01-08 22:25:18 TLS error on connection from
vmi395689.contaboserver.net [5.189.157.109] (SSL_accept):
error:0A000415:SSL routines::sslv3 alert certificate expired

This was caused by the expiry of the cert that Exim had loaded at its
last update (2022-12-01), as Exim had not been restarted since.
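
As a worked check of the timeline (an editorial sketch in Python, not
part of the original report), the dates quoted above can be compared
directly. The only assumption is that the Exim log timestamp is in CET
(UTC+1), as the report states; the constant names are made up for
illustration.

```python
import ssl
from datetime import datetime, timedelta, timezone

# Values copied from the report above.
OLD_NOT_AFTER = "Jan  8 21:07:38 2023 GMT"   # cert Exim still held in memory
NEW_NOT_AFTER = "Mar 10 22:08:36 2023 GMT"   # renewed cert already on disk
ERROR_LOGGED_LOCAL = "2023-01-08 22:25:18"   # timestamp of the quoted log line

CET = timezone(timedelta(hours=1))           # assumption: server logs in CET


def not_after_utc(text: str) -> datetime:
    """Parse an X.509 'Not After' string into an aware UTC datetime."""
    return datetime.fromtimestamp(ssl.cert_time_to_seconds(text), tz=timezone.utc)


error_time = datetime.strptime(
    ERROR_LOGGED_LOCAL, "%Y-%m-%d %H:%M:%S"
).replace(tzinfo=CET)

# How long the in-memory cert had been expired when the error was logged.
old_expired_for = error_time - not_after_utc(OLD_NOT_AFTER)
# How long the cert on disk would still have been valid at that moment.
new_valid_for = not_after_utc(NEW_NOT_AFTER) - error_time

print("old cert expired for:    ", old_expired_for)
print("new cert still valid for:", new_valid_for)
```

With these inputs the quoted error was logged 17 minutes and 40 seconds
after the old certificate's Not After, while the renewed certificate on
disk still had 61 days left.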

This happened for the first time since Let's Encrypt was launched (we
have used it since then), so after years of use.

At the moment these versions are in use:

Name        : exim
Version     : 4.96
Release     : 6.fc36
Architecture: x86_64
Install Date: Thu 01 Dec 2022 08:01:27 CET
Build Date  : Tue 22 Nov 2022 15:25:30 CET

Name        : openssl
Version     : 3.0.5
Release     : 2.fc36
Architecture: x86_64
Install Date: Mon 28 Nov 2022 20:41:00 CET
Build Date  : Tue 01 Nov 2022 17:26:57 CET

The original cert setup looks like this:

lrwxrwxrwx 1 root root 59 17 Sep 2018 /etc/pki/tls/certs/exim.pem -> /etc/httpd/letsencrypt/certs/server.de/fullchain.pem
0 lrwxrwxrwx 1 root root 24 11 Dec 00:08 fullchain.pem -> fullchain-1670713689.pem
8 -rw-r----- 1 root exim 6117 11 Dec 00:08 fullchain-1670713689.pem

/etc/pki/tls/certs/exim.pem is the default location for Fedora's Exim
package.

O== Are there more systems?

Yes, there are; this is just the one where we detected it first. So
it's not a one-off glitch.

O== Conclusion:

I can't remember any downstream patches to Exim in Fedora's build, so
something changed in how Exim or OpenSSL 3 handles detection of the
underlying certificate switch. As Exim had only a tiny minor version
change, OpenSSL 3 is my personal candidate for this.

O== Suggestions:

In this combination Exim needs to be restarted whenever the server cert
is renewed, as the auto-detection no longer works reliably.

It may be a good idea to look into a new solution inside Exim, such as
automatically reloading the cert in use every 24 hours while the server
is running, if OpenSSL 3 is what causes this "detection" bug.
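
One way to act on the restart suggestion without restarting blindly is
to compare the certificate on disk with the one the running daemon
actually presents. The following Python sketch is an editorial
illustration, not something from the original post: the function names
are hypothetical, it assumes the server offers STARTTLS on port 25, and
it assumes the leaf certificate comes first in the fullchain file.

```python
import hashlib
import smtplib
import ssl

PEM_BEGIN = "-----BEGIN CERTIFICATE-----"
PEM_END = "-----END CERTIFICATE-----"


def leaf_fingerprint_from_pem(pem_text: str) -> str:
    """SHA-256 fingerprint of the first certificate in a PEM bundle."""
    start = pem_text.index(PEM_BEGIN)
    end = pem_text.index(PEM_END, start) + len(PEM_END)
    der = ssl.PEM_cert_to_DER_cert(pem_text[start:end])
    return hashlib.sha256(der).hexdigest()


def served_fingerprint(host: str, port: int = 25, timeout: float = 10.0) -> str:
    """SHA-256 fingerprint of the cert a server presents after STARTTLS."""
    ctx = ssl.create_default_context()
    # Verification is switched off on purpose: the goal is to inspect the
    # certificate, even one that has already expired, not to trust it.
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE
    with smtplib.SMTP(host, port, timeout=timeout) as smtp:
        smtp.starttls(context=ctx)
        der = smtp.sock.getpeercert(binary_form=True)
    return hashlib.sha256(der).hexdigest()


def restart_needed(pem_path: str, host: str) -> bool:
    """True if the daemon serves a different cert than the one on disk."""
    with open(pem_path, encoding="ascii") as fh:
        on_disk = leaf_fingerprint_from_pem(fh.read())
    return on_disk != served_fingerprint(host)
```

A post-renewal hook could call
`restart_needed("/etc/pki/tls/certs/exim.pem", "mail.example.org")` and
restart Exim only when it returns True; the host name is a placeholder.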


best regards,
Marius
Re: A study of failing tls certs, with valid certificate files
On 09/01/2023 11:30, Cyborg via Exim-users wrote:
> It may be a good idea to look into a new solution inside Exim, such as automatically reloading the cert in use every 24 hours while the server is running, if OpenSSL 3 is what causes this "detection" bug.

It wouldn't be an OpenSSL change. Exim (since 4.95) on both Linux
and FreeBSD platforms[*] sets a watch on the relevant directories and files,
and (supposedly) reloads certs when they change. Best guess is that
this mechanism failed for some reason.

[*] For any platform not noted in the build config as supporting
either "inotify" or "kevent", TLS credentials are not cached
but re-read from file on every connection.
--
Cheers,
Jeremy


--
## List details at https://lists.exim.org/mailman/listinfo/exim-users
## Exim details at http://www.exim.org/
## Please use the Wiki with this list - http://wiki.exim.org/
Re: A study of failing tls certs, with valid certificate files
On 9 Jan 2023, at 12:05, Jeremy Harris via Exim-users <exim-users@exim.org> wrote:
> It wouldn't be an OpenSSL change. Exim (since 4.95) on both Linux
> and FreeBSD platforms[*] sets a watch on the relevant directories and files,
> and (supposedly) reloads certs when they change. Best guess is that
> this mechanism failed for some reason.

Could it be that the path - a symlink to a symlink to a file - wasn't fully dereferenced, so from Exim's perspective the file hadn't changed? ISTR that inotify used to (many years ago), but that was changed somewhere in the kernel 2.x days.

[searches...]

Perhaps. I did find a bug (2909), though, and the commit to fix it (a1ec98d). If I'm reading the Fedora changelog properly, that commit is not in the RPM the OP is running because it post-dates the 4.96 release. Although it's unclear if it'll fix the issue cleanly, because there are two symlinks before the actual file!
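
To illustrate the "not fully dereferenced" hypothesis, here is a small
editorial Python sketch that rebuilds the two-symlink layout from the
original post in a temporary directory. It is only a model: the file
names are shortened, and it assumes the renewal tool swaps the inner
symlink to a newly written file, which the post does not state
explicitly. It says nothing about what Exim's watcher actually does.

```python
import os
import tempfile


def snapshot(path: str) -> dict:
    """Identity of `path` without and with symlink following."""
    link = os.lstat(path)   # the first link itself
    real = os.stat(path)    # whatever the whole chain ends at
    return {"link": (link.st_ino, link.st_mtime_ns), "real": real.st_ino}


def simulate_renewal(workdir: str) -> tuple:
    old = os.path.join(workdir, "fullchain-1.pem")
    new = os.path.join(workdir, "fullchain-2.pem")
    inner = os.path.join(workdir, "fullchain.pem")
    outer = os.path.join(workdir, "exim.pem")
    for name, body in ((old, "old cert\n"), (new, "new cert\n")):
        with open(name, "w") as fh:
            fh.write(body)
    os.symlink("fullchain-1.pem", inner)   # inner link -> dated file
    os.symlink(inner, outer)               # outer link -> inner link
    before = snapshot(outer)
    # Assumed renewal step: atomically repoint the inner link.
    tmp = inner + ".tmp"
    os.symlink("fullchain-2.pem", tmp)
    os.replace(tmp, inner)
    after = snapshot(outer)
    return before, after


with tempfile.TemporaryDirectory() as workdir:
    before, after = simulate_renewal(workdir)
    print("outer link changed:   ", before["link"] != after["link"])
    print("resolved file changed:", before["real"] != after["real"])
```

In this model the outer link is untouched by the renewal, and only a
lookup that follows the whole chain ends up at a different file. An
observer that looks at the first link alone would see no change.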

Graeme
Re: A study of failing tls certs, with valid certificate files
On 2023-01-09 Cyborg via Exim-users <exim-users@exim.org> wrote:
[...]
> I can't remember any downstream patches to Exim in Fedora's build, so
> something changed in how Exim or OpenSSL 3 handles detection of the
> underlying certificate switch. As Exim had only a tiny minor version
> change, OpenSSL 3 is my personal candidate for this.
[...]

The major change in recentish time was in 4.95
11. Faster TLS startup. When various configuration options contain no
expandable elements, the information can be preloaded and cached rather
than the previous behaviour of always loading at startup time for every
connection. This helps particularly for the CA bundle.

I also switched, at some point, from HUP-ing to restarting my exim after
cert updates, because the old cert still showed up.

cu Andreas

--
`What a good friend you are to him, Dr. Maturin. His other friends are
so grateful to you.'
`I sew his ears on from time to time, sure'

Re: A study of failing tls certs, with valid certificate files
On 09/01/2023 17:39, Andreas Metzler via Exim-users wrote:
> [...]
>
> The major change in recentish time was in 4.95
> 11. Faster TLS startup. When various configuration options contain no
> expandable elements, the information can be preloaded and cached rather
> than the previous behaviour of always loading at startup time for every
> connection. This helps particularly for the CA bundle.
>
> I also switched, at some point, from HUP-ing to restarting my exim after
> cert updates, because the old cert still showed up.

Interesting. Is/are your cert(s) behind a symlink, from the place
baked into the TLS library (which is what Exim monitors)?

If so, you should pick up commits ef57b25bfa76, a1ec98dd9637
"Symlink following for TLS creds files"
These are post-4.96 so have not hit a release yet.
--
Cheers,
Jeremy


Re: A study of failing tls certs, with valid certificate files
On 09/01/2023 12:38, Graeme Fowler via Exim-users wrote:
> Although it's unclear if it'll fix the issue cleanly, because there are two symlinks before the actual file!

Theory goes that it walks to the end of a symlink-chain
(max 20 deep) and watches the real file.
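
The walk described here can be sketched in a few lines of Python. This
is an editorial illustration of the idea only, not Exim's actual C
code; the function name is hypothetical and the depth limit of 20 is
taken from the message above.

```python
import os

MAX_SYMLINK_DEPTH = 20  # the limit mentioned above


def resolve_chain(path: str, max_depth: int = MAX_SYMLINK_DEPTH) -> list:
    """Return every path visited while following symlinks from `path`.

    The last element is the real file. Raises OSError if more than
    `max_depth` links have to be followed (which also catches loops).
    """
    chain = [path]
    for _ in range(max_depth):
        if not os.path.islink(path):
            return chain
        target = os.readlink(path)
        # A relative target is relative to the directory holding the link;
        # os.path.join returns `target` unchanged if it is absolute.
        path = os.path.join(os.path.dirname(path), target)
        chain.append(path)
    if os.path.islink(path):
        raise OSError(f"more than {max_depth} symlinks from {chain[0]}")
    return chain
```

For the layout in the original post, such a walk would visit exim.pem,
then fullchain.pem, then fullchain-1670713689.pem. A watcher built on
it would have to re-run the walk whenever any link in the chain is
replaced, since the real file at the end then becomes a different one.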
--
Cheers,
Jeremy


Re: A study of failing tls certs, with valid certificate files
On 09.01.23 at 19:12, Jeremy Harris via Exim-users wrote:
>
> If so, you should pick up commits ef57b25bfa76, a1ec98dd9637
> "Symlink following for TLS creds files"
> These are post-4.96 so have not hit a release yet.

I will see whether the maintainer can help Fedora users here.

I have switched to restarting Exim after each renewal, so this is solved
for me.

Best regards,
Marius
Re: A study of failing tls certs, with valid certificate files
On 2023-01-09 Jeremy Harris via Exim-users <exim-users@exim.org> wrote:
> On 09/01/2023 17:39, Andreas Metzler via Exim-users wrote:
[...]
>>> something changed in how Exim or OpenSSL 3 handles detection of the
>>> underlying certificate switch. As Exim had only a tiny minor version
>>> change, OpenSSL 3 is my personal candidate for this.
>> [...]

>> The major change in recentish time was in 4.95
>> 11. Faster TLS startup. When various configuration options contain no
>> expandable elements, the information can be preloaded and cached rather
>> than the previous behaviour of always loading at startup time for every
>> connection. This helps particularly for the CA bundle.
>>
>> I also switched, at some point, from HUP-ing to restarting my exim after
>> cert updates, because the old cert still showed up.

> Interesting. Is/are your cert(s) behind a symlink, from the place
> baked into the TLS library (which is what Exim monitors)?

> If so, you should pick up commits ef57b25bfa76, a1ec98dd9637
> "Symlink following for TLS creds files"
> These are post-4.96 so have not hit a release yet.

Hello Jeremy,

I have had this on my TODO, waiting for the next letsencrypt cert
update. I dropped the
"service exim4 stop ; sleep .2 ; service exim4 start"
from my post update script and checked whether exim now automatically
saw the new certs. It did. :-)

I am not symlinking my certs, and since this was on Debian's
4.96-14~bpo11+1, neither of the two symlink-cert fixes is included. (I
will consider cherry-picking them anyway.) So it looks like something
else was broken at some point in time and is fixed again.

cu Andreas
Re: A study of failing tls certs, with valid certificate files
On 25/02/2023 14:45, Andreas Metzler via Exim-users wrote:
> So it looks like something else was broken
> at some point in time and is fixed again.

Good to hear. Thanks for the follow-up.
--
Cheers,
Jeremy

