Mailing List Archive

2 hours delay (gnutls_handshake): timed out: delivering unencrypted to
Hi all,

got a sending e-mail relay here, Ubuntu 18.04 LTS. About 22k e-mails sending
volume per day. There are two receiving e-mail servers
that are experiencing delays (~2 hours) when receiving e-mails from us. This
does not happen for every e-mail we're sending tot them, but for some.

Log if delay occurs:

2022-03-21 08:00:57.876 [7442] 1nWC1t-0001kn-G2 Spool file is locked
(another process is handling this message)
2022-03-21 08:05:58.110 [10388] 1nWC1t-0001kn-G2 Spool file is locked
(another process is handling this message)
2022-03-21 08:10:59.560 [11462] 1nWC1t-0001kn-G2 Spool file is locked
(another process is handling this message)
2022-03-21 08:20:18.369 [13434] 1nWC1t-0001kn-G2 Spool file is locked
(another process is handling this message)
2022-03-21 08:20:57.808 [13476] 1nWC1t-0001kn-G2 Spool file is locked
(another process is handling this message)
2022-03-21 08:30:21.004 [14859] 1nWC1t-0001kn-G2 Spool file is locked
(another process is handling this message)
2022-03-21 08:30:57.813 [14899] 1nWC1t-0001kn-G2 Spool file is locked
(another process is handling this message)
2022-03-21 08:40:18.196 [16573] 1nWC1t-0001kn-G2 Spool file is locked
(another process is handling this message)
2022-03-21 08:40:57.817 [16679] 1nWC1t-0001kn-G2 Spool file is locked
(another process is handling this message)
2022-03-21 08:50:18.556 [20921] 1nWC1t-0001kn-G2 Spool file is locked
(another process is handling this message)
2022-03-21 08:50:58.437 [20990] 1nWC1t-0001kn-G2 Spool file is locked
(another process is handling this message)
2022-03-21 09:00:19.873 [27095] 1nWC1t-0001kn-G2 Spool file is locked
(another process is handling this message)
2022-03-21 09:00:58.842 [27188] 1nWC1t-0001kn-G2 Spool file is locked
(another process is handling this message)
2022-03-21 09:10:18.848 [32597] 1nWC1t-0001kn-G2 Spool file is locked
(another process is handling this message)
2022-03-21 09:10:57.828 [316] 1nWC1t-0001kn-G2 Spool file is locked (another
process is handling this message)
2022-03-21 09:20:58.127 [4368] 1nWC1t-0001kn-G2 Spool file is locked
(another process is handling this message)
2022-03-21 09:22:29.533 [5172] 1nWC1t-0001kn-G2 Spool file is locked
(another process is handling this message)
2022-03-21 09:30:18.840 [6937] 1nWC1t-0001kn-G2 Spool file is locked
(another process is handling this message)
2022-03-21 09:30:58.353 [6984] 1nWC1t-0001kn-G2 Spool file is locked
(another process is handling this message)
2022-03-21 09:40:18.589 [10653] 1nWC1t-0001kn-G2 Spool file is locked
(another process is handling this message)
2022-03-21 09:43:08.577 [12567] 1nWC1t-0001kn-G2 Spool file is locked
(another process is handling this message)
2022-03-21 09:50:19.260 [16095] 1nWC1t-0001kn-G2 Spool file is locked
(another process is handling this message)
2022-03-21 09:53:08.644 [16636] 1nWC1t-0001kn-G2 Spool file is locked
(another process is handling this message)
2022-03-21 09:58:07.653 [17655] 1nWC1t-0001kn-G2 Spool file is locked
(another process is handling this message)
2022-03-21 10:00:57.843 [17993] 1nWC1t-0001kn-G2 Spool file is locked
(another process is handling this message)
2022-03-21 10:08:09.749 [20352] 1nWC1t-0001kn-G2 Spool file is locked
(another process is handling this message)
2022-03-21 10:10:57.832 [20889] 1nWC1t-0001kn-G2 Spool file is locked
(another process is handling this message)
2022-03-21 10:13:04.657 [6774] 1nWC1t-0001kn-G2 TLS session:
(gnutls_handshake): timed out: delivering unencrypted to H=..(not in
hosts_require_tls)

The offending hosts are not controlled by us, but i am in contact with the
admin (Symantec Messaging Gateway in use). They are only having this kind of
trouble when they receive from us, an we are only having trouble when
sending to them. They already tried different versions of their Symantec
Gateway.

exim -bP smtp_receive_timeout is not set, so should be 5 minutes.

There was one occurence to another domain, but there it only took 5 minutes
for the timeout to fire:

2022-01-12 18:09:23.266 [22623] 1n7h7j-0005st-8A <= P=esmtps
X=TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256 CV=no S=13624
2022-01-12 18:10:50.064 [22805] 1n7h7j-0005st-8A Spool file is locked
(another process is handling this message)
2022-01-12 18:14:23.442 [22647] 1n7h7j-0005st-8A TLS session:
(gnutls_handshake): timed out: delivering unencrypted to H= (not in
hosts_require_tls)


Exim version 4.90_1 #4 built 30-Apr-2021 14:15:04
Copyright (c) University of Cambridge, 1995 - 2017
(c) The Exim Maintainers and contributors in ACKNOWLEDGMENTS file, 2007 -
2017
Berkeley DB: Berkeley DB 5.3.28: (September 9, 2013)
Support for: crypteq iconv() IPv6 GnuTLS move_frozen_messages DKIM DNSSEC
Event OCSP PRDR SOCKS TCP_Fast_Open
Lookups (built-in): lsearch wildlsearch nwildlsearch iplsearch cdb dbm dbmjz
dbmnz dnsdb dsearch nis nis0 passwd
Authenticators: cram_md5 plaintext
Routers: accept dnslookup ipliteral manualroute queryprogram redirect
Transports: appendfile/maildir/mailstore autoreply lmtp pipe smtp
Fixed never_users: 0
Configure owner: 0:0
Size of off_t: 8
Compiler: GCC [7.5.0]
Library version: Glibc: Compile: 2.27
Runtime: 2.27
Library version: GnuTLS: Compile: 3.5.18
Runtime: 3.5.18
Library version: PCRE: Compile: 8.39
Runtime: 8.39 2016-06-14
WHITELIST_D_MACROS: "OUTGOING"
TRUSTED_CONFIG_LIST: "/etc/exim4/trusted_configs"
Exim version 4.90_1 uid=0 gid=0 pid=18641 D=8000000
tls_validate_require_cipher child 18642 ended: status=0x0
configuration file is /etc/exim4/exim4.conf
log selectors = ffffffff ffffffff
LOG: MAIN
cwd=/root 3 args: exim -d-all+tls -bV
trusted user
admin user
Configuration file is /etc/exim4/exim4.conf


Would be nice to know where to look for a solution.

Cheers,

Marc


--
## List details at https://lists.exim.org/mailman/listinfo/exim-users
## Exim details at http://www.exim.org/
## Please use the Wiki with this list - http://wiki.exim.org/
Re: 2 hours delay (gnutls_handshake): timed out: delivering unencrypted to [ In reply to ]
On 22/03/2022 11:00, tt-admin via Exim-users wrote:
> Exim version 4.90_1 #4 built 30-Apr-2021 14:15:04

Mmmm, an ancient Exim release number but a build "only"
11 months old. That's what you get for running an LTS
distro, I suppose.

Difficult to guess exactly what fixes from the original
4.90 have been picked up. There was one, involving Exim's
notion of time slipping, and screwing up
delays, which comes to mind given your two hours...

It bit most spectacularly after a suspend/resume. I'm not
sure if a system time adjustment operation would have dome
the same, but I'd not be surprised.

A brief search though git history for "time" throws up Bug 2615.
(oh yes... source now comments "the Linux non-Posix behaviour").

--
Cheers,
Jeremy

--
## List details at https://lists.exim.org/mailman/listinfo/exim-users
## Exim details at http://www.exim.org/
## Please use the Wiki with this list - http://wiki.exim.org/
Re: 2 hours delay (gnutls_handshake): timed out: delivering unencrypted to [ In reply to ]
So your suggestion would be to try another exim version or to file a bug
with the Ubuntu 18.04 package?

The machine was not suspended. According to the logfiles the problem exists
since 2020-03-04 (last exim package update via apt before was in 2019-12,
and then again in 2020-06).
I found another post from this list (from July 2017) where you talked to
somebody called Christian Balzer who had the same problem, but there seems
to be no solution as of the last posting.
--
Cheers

Marc

-----Urspr?ngliche Nachricht-----
Von: Exim-users [mailto:exim-users-bounces+tt-admin=intranett.de@exim.org]
Im Auftrag von Jeremy Harris via Exim-users
Gesendet: Donnerstag, 24. M?rz 2022 01:40
An: exim-users@exim.org
Betreff: Re: [exim] 2 hours delay (gnutls_handshake): timed out: delivering
unencrypted to

On 22/03/2022 11:00, tt-admin via Exim-users wrote:
> Exim version 4.90_1 #4 built 30-Apr-2021 14:15:04

Mmmm, an ancient Exim release number but a build "only"
11 months old. That's what you get for running an LTS
distro, I suppose.

Difficult to guess exactly what fixes from the original
4.90 have been picked up. There was one, involving Exim's
notion of time slipping, and screwing up
delays, which comes to mind given your two hours...

It bit most spectacularly after a suspend/resume. I'm not
sure if a system time adjustment operation would have dome
the same, but I'd not be surprised.

A brief search though git history for "time" throws up Bug 2615.
(oh yes... source now comments "the Linux non-Posix behaviour").

--
Cheers,
Jeremy

--
## List details at https://lists.exim.org/mailman/listinfo/exim-users
## Exim details at http://www.exim.org/
## Please use the Wiki with this list - http://wiki.exim.org/


--
## List details at https://lists.exim.org/mailman/listinfo/exim-users
## Exim details at http://www.exim.org/
## Please use the Wiki with this list - http://wiki.exim.org/
Re: 2 hours delay (gnutls_handshake): timed out: delivering unencrypted to [ In reply to ]
On 25/03/2022 07:22, tt-admin via Exim-users wrote:
> So your suggestion would be to try another exim version or to file a bug
> with the Ubuntu 18.04 package?

Yes.

> The machine was not suspended.

So, it might not be that one then. Were there any other
operations that might have stepped the system's notion
of time?

That one would also go away on a daemon-restart, of course.

> I found another post from this list (from July 2017) where you talked to
> somebody called Christian Balzer who had the same problem, but there seems
> to be no solution as of the last posting.

https://bugs.exim.org/show_bug.cgi?id=2615 shows it fixed.
--
Cheers,
Jeremy

--
## List details at https://lists.exim.org/mailman/listinfo/exim-users
## Exim details at http://www.exim.org/
## Please use the Wiki with this list - http://wiki.exim.org/
Re: 2 hours delay (gnutls_handshake): timed out: delivering unencrypted to [ In reply to ]
>So, it might not be that one then. Were there any other
>operations that might have stepped the system's notion
>of time?

No - and the uptime of the server is 325 days, but the problem occurs daily
since 2020-03-04. I just restarted the exim daemon, for what it's worth.

--

Cheers,

Marc


--
## List details at https://lists.exim.org/mailman/listinfo/exim-users
## Exim details at http://www.exim.org/
## Please use the Wiki with this list - http://wiki.exim.org/
Re: 2 hours delay (gnutls_handshake): timed out: delivering unencrypted to [ In reply to ]
On Tue, 22 Mar 2022, tt-admin via Exim-users wrote:

> Hi all,
>
> got a sending e-mail relay here, Ubuntu 18.04 LTS. About 22k e-mails sending
> volume per day. There are two receiving e-mail servers
> that are experiencing delays (~2 hours) when receiving e-mails from us. This
> does not happen for every e-mail we're sending tot them, but for some.
>
> Log if delay occurs:
>
> 2022-03-21 08:00:57.876 [7442] 1nWC1t-0001kn-G2 Spool file is locked
> (another process is handling this message)
> 2022-03-21 08:05:58.110 [10388] 1nWC1t-0001kn-G2 Spool file is locked
> (another process is handling this message)
> 2022-03-21 08:10:59.560 [11462] 1nWC1t-0001kn-G2 Spool file is locked
> (another process is handling this message)
> 2022-03-21 08:20:18.369 [13434] 1nWC1t-0001kn-G2 Spool file is locked
> (another process is handling this message)

> 2022-03-21 10:08:09.749 [20352] 1nWC1t-0001kn-G2 Spool file is locked
> (another process is handling this message)
> 2022-03-21 10:10:57.832 [20889] 1nWC1t-0001kn-G2 Spool file is locked
> (another process is handling this message)
> 2022-03-21 10:13:04.657 [6774] 1nWC1t-0001kn-G2 TLS session:
> (gnutls_handshake): timed out: delivering unencrypted to H=..(not in
> hosts_require_tls)

Does
exigrep 1nWC1t-0001kn-G2 /var/log/exim4/mainlog-202203*
(or where ever your exim logs are) show anything for the "other process" ?


--
Andrew C. Aitchison Kendal, UK
andrew@aitchison.me.uk

--
## List details at https://lists.exim.org/mailman/listinfo/exim-users
## Exim details at http://www.exim.org/
## Please use the Wiki with this list - http://wiki.exim.org/
Re: 2 hours delay (gnutls_handshake): timed out: delivering unencrypted to [ In reply to ]
>Does
> exigrep 1nWC1t-0001kn-G2 /var/log/exim4/mainlog-202203*
>(or where ever your exim logs are) show anything for the "other process" ?

I'm afraid not, here you see two logs from yesterday (complete exigrep
output without sensitive information):

2022-03-29 10:02:40.626 [1758] cwd=/var/spool/exim4 3 args: /usr/sbin/exim4
-Mc 1nZ6oK-0000S3-GM
2022-03-29 10:02:40.614 [1739] 1nZ6oK-0000S3-GM <= ... H=(...) [...]:56786
I=[...]:25 P=esmtps X=TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256 CV=no
S=3421694 M8S=0 id=02276F4E_71BD948B_foo@bar from <..> for ...
2022-03-29 10:05:58.606 [4928] 1nZ6oK-0000S3-GM Spool file is locked
(another process is handling this message)
2022-03-29 10:10:58.591 [5636] 1nZ6oK-0000S3-GM Spool file is locked
(another process is handling this message)
2022-03-29 10:15:58.606 [6321] 1nZ6oK-0000S3-GM Spool file is locked
(another process is handling this message)
2022-03-29 10:20:58.594 [7183] 1nZ6oK-0000S3-GM Spool file is locked
(another process is handling this message)
2022-03-29 10:25:58.602 [7933] 1nZ6oK-0000S3-GM Spool file is locked
(another process is handling this message)
2022-03-29 10:30:58.600 [8457] 1nZ6oK-0000S3-GM Spool file is locked
(another process is handling this message)
2022-03-29 10:35:58.738 [9096] 1nZ6oK-0000S3-GM Spool file is locked
(another process is handling this message)
2022-03-29 10:40:58.613 [9701] 1nZ6oK-0000S3-GM Spool file is locked
(another process is handling this message)
2022-03-29 10:45:58.603 [10239] 1nZ6oK-0000S3-GM Spool file is locked
(another process is handling this message)
2022-03-29 10:50:58.728 [11294] 1nZ6oK-0000S3-GM Spool file is locked
(another process is handling this message)
2022-03-29 10:56:01.073 [12271] 1nZ6oK-0000S3-GM Spool file is locked
(another process is handling this message)
2022-03-29 11:00:58.601 [12901] 1nZ6oK-0000S3-GM Spool file is locked
(another process is handling this message)
2022-03-29 11:05:58.601 [14012] 1nZ6oK-0000S3-GM Spool file is locked
(another process is handling this message)
2022-03-29 11:10:58.593 [14823] 1nZ6oK-0000S3-GM Spool file is locked
(another process is handling this message)
2022-03-29 11:15:58.597 [15630] 1nZ6oK-0000S3-GM Spool file is locked
(another process is handling this message)
2022-03-29 11:20:58.604 [16520] 1nZ6oK-0000S3-GM Spool file is locked
(another process is handling this message)
2022-03-29 11:25:58.787 [17004] 1nZ6oK-0000S3-GM Spool file is locked
(another process is handling this message)
2022-03-29 11:30:58.610 [18508] 1nZ6oK-0000S3-GM Spool file is locked
(another process is handling this message)
2022-03-29 11:35:58.610 [20784] 1nZ6oK-0000S3-GM Spool file is locked
(another process is handling this message)
2022-03-29 11:40:58.616 [22363] 1nZ6oK-0000S3-GM Spool file is locked
(another process is handling this message)
2022-03-29 11:45:58.665 [23905] 1nZ6oK-0000S3-GM Spool file is locked
(another process is handling this message)
2022-03-29 11:50:58.728 [24668] 1nZ6oK-0000S3-GM Spool file is locked
(another process is handling this message)
2022-03-29 11:55:58.595 [25150] 1nZ6oK-0000S3-GM Spool file is locked
(another process is handling this message)
2022-03-29 12:00:58.617 [25571] 1nZ6oK-0000S3-GM Spool file is locked
(another process is handling this message)
2022-03-29 12:04:06.427 [1762] 1nZ6oK-0000S3-GM TLS session:
(gnutls_handshake): timed out: delivering unencrypted to H=... [...] (not in
hosts_require_tls)
2022-03-29 12:04:08.110 [1758] 1nZ6oK-0000S3-GM => ... F=<...> P=<...>
R=dnslookup T=remote_smtp S=3466181 H=... [...]:25 PRX=[]:0 I=[...]:59614
C="250 2.0.0 OK 59/09-19783-CC9D2426" QT=2h1m27s DT=1.678s
2022-03-29 12:04:08.110 [1758] 1nZ6oK-0000S3-GM Completed QT=2h1m27s


Same sender, same recipient, same recipient host:

2022-03-29 12:33:19.531 [29600] cwd=/var/spool/exim4 3 args: /usr/sbin/exim4
-Mc 1nZ9A7-0007hO-Fh

2022-03-29 12:33:19.506 [29598] 1nZ9A7-0007hO-Fh <= ... H=(...) [...]:40002
I=[...]:25 P=esmtps X=TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256 CV=no S=610337
M8S=0 id=02B18AED_0F70847B_foo@bar from <...> for ...
2022-03-29 12:33:20.105 [29600] 1nZ9A7-0007hO-Fh => ... F=<...> P=<...>
R=dnslookup T=remote_smtp S=618306 H=... [...]:25 PRX=[]:0 I=[...]:60340
X=TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256 CV=yes DN="CN=*.foo.bar" C="250
2.0.0 OK DE/29-19783-5A0E2426" QT=0.618s DT=0.546s
2022-03-29 12:33:20.106 [29600] 1nZ9A7-0007hO-Fh Completed QT=0.619s

Restarting the daemon did not help (was not expecting, but tried anyway).


Cheers
--
Marc



--
Andrew C. Aitchison Kendal, UK
andrew@aitchison.me.uk


--
## List details at https://lists.exim.org/mailman/listinfo/exim-users
## Exim details at http://www.exim.org/
## Please use the Wiki with this list - http://wiki.exim.org/
Re: 2 hours delay (gnutls_handshake): timed out: delivering unencrypted to [ In reply to ]
On Wed, 30 Mar 2022, tt-admin via Exim-users wrote:

>> Does
>> exigrep 1nWC1t-0001kn-G2 /var/log/exim4/mainlog-202203*
>> (or where ever your exim logs are) show anything for the "other process" ?
>
> I'm afraid not, here you see two logs from yesterday (complete exigrep
> output without sensitive information):
>
> 2022-03-29 10:02:40.626 [1758] cwd=/var/spool/exim4 3 args: /usr/sbin/exim4
> -Mc 1nZ6oK-0000S3-GM
> 2022-03-29 10:02:40.614 [1739] 1nZ6oK-0000S3-GM <= ... H=(...) [...]:56786
> I=[...]:25 P=esmtps X=TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256 CV=no
> S=3421694 M8S=0 id=02276F4E_71BD948B_foo@bar from <..> for ...
> 2022-03-29 10:05:58.606 [4928] 1nZ6oK-0000S3-GM Spool file is locked
> (another process is handling this message)
... ...
> 2022-03-29 12:00:58.617 [25571] 1nZ6oK-0000S3-GM Spool file is locked
> (another process is handling this message)
> 2022-03-29 12:04:06.427 [1762] 1nZ6oK-0000S3-GM TLS session:
> (gnutls_handshake): timed out: delivering unencrypted to H=... [...] (not in
> hosts_require_tls)
> 2022-03-29 12:04:08.110 [1758] 1nZ6oK-0000S3-GM => ... F=<...> P=<...>
> R=dnslookup T=remote_smtp S=3466181 H=... [...]:25 PRX=[]:0 I=[...]:59614
> C="250 2.0.0 OK 59/09-19783-CC9D2426" QT=2h1m27s DT=1.678s
> 2022-03-29 12:04:08.110 [1758] 1nZ6oK-0000S3-GM Completed QT=2h1m27s

I wonder what 1758 and 1762 were doing all that time -
I presume there is no other mention of 1762 in the log ?

If you see any more of these stuck in the queue, can you get an lsof
and a truss/strace/ltrace of the processes ?

> Same sender, same recipient, same recipient host:
>
> 2022-03-29 12:33:19.531 [29600] cwd=/var/spool/exim4 3 args: /usr/sbin/exim4
> -Mc 1nZ9A7-0007hO-Fh
>
> 2022-03-29 12:33:19.506 [29598] 1nZ9A7-0007hO-Fh <= ... H=(...) [...]:40002
> I=[...]:25 P=esmtps X=TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256 CV=no S=610337
> M8S=0 id=02B18AED_0F70847B_foo@bar from <...> for ...
> 2022-03-29 12:33:20.105 [29600] 1nZ9A7-0007hO-Fh => ... F=<...> P=<...>
> R=dnslookup T=remote_smtp S=618306 H=... [...]:25 PRX=[]:0 I=[...]:60340
> X=TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256 CV=yes DN="CN=*.foo.bar" C="250
> 2.0.0 OK DE/29-19783-5A0E2426" QT=0.618s DT=0.546s
> 2022-03-29 12:33:20.106 [29600] 1nZ9A7-0007hO-Fh Completed QT=0.619s
>
> Restarting the daemon did not help (was not expecting, but tried anyway).
>
>
> Cheers
> --
> Marc
>
>
>
> --
> Andrew C. Aitchison Kendal, UK
> andrew@aitchison.me.uk
>
>
> --
> ## List details at https://lists.exim.org/mailman/listinfo/exim-users
> ## Exim details at http://www.exim.org/
> ## Please use the Wiki with this list - http://wiki.exim.org/
>

--
Andrew C. Aitchison Kendal, UK
andrew@aitchison.me.uk

--
## List details at https://lists.exim.org/mailman/listinfo/exim-users
## Exim details at http://www.exim.org/
## Please use the Wiki with this list - http://wiki.exim.org/
Re: 2 hours delay (gnutls_handshake): timed out: delivering unencrypted to [ In reply to ]
>I wonder what 1758 and 1762 were doing all that time -
>I presume there is no other mention of 1762 in the log ?

No, they are not mentioned besides the exigrep output.

>If you see any more of these stuck in the queue, can you get an lsof
>and a truss/strace/ltrace of the processes ?

Appreciate your help, i'm not familiar with debugging software (yet ;-) ),
but here goes my try:

root@mx:~# lsof /var/spool/exim4/input/1nZTd8-0001j3-Pe-*
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
exim4 6649 root 3uw REG 8,2 111099 3280684
/var/spool/exim4/input/1nZTd8-0001j3-Pe-D
exim4 6649 root 6w REG 8,2 0 3280743
/var/spool/exim4/input/1nZTd8-0001j3-Pe-J
exim4 6671 Debian-exim 3u REG 8,2 111099 3280684
/var/spool/exim4/input/1nZTd8-0001j3-Pe-D
exim4 6671 Debian-exim 6w REG 8,2 0 3280743
/var/spool/exim4/input/1nZTd8-0001j3-Pe-J

root@mx:~# lsof -p 6649
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
exim4 6649 root cwd DIR 8,2 4096 3147755 /var/spool/exim4
exim4 6649 root rtd DIR 8,2 4096 2 /
exim4 6649 root txt REG 8,2 1152456 1705265 /usr/sbin/exim4
exim4 6649 root mem REG 8,2 47568 3938632
/lib/x86_64-linux-gnu/libnss_files-2.27.so
exim4 6649 root mem REG 8,2 47576 3938636
/lib/x86_64-linux-gnu/libnss_nis-2.27.so
exim4 6649 root mem REG 8,2 39744 3938628
/lib/x86_64-linux-gnu/libnss_compat-2.27.so
exim4 6649 root mem REG 8,2 31032 1710336
/usr/lib/x86_64-linux-gnu/libffi.so.6.0.4
exim4 6649 root mem REG 8,2 526688 1710347
/usr/lib/x86_64-linux-gnu/libgmp.so.10.3.2
exim4 6649 root mem REG 8,2 211704 1704461
/usr/lib/x86_64-linux-gnu/libhogweed.so.4.4
exim4 6649 root mem REG 8,2 219304 1704459
/usr/lib/x86_64-linux-gnu/libnettle.so.6.4
exim4 6649 root mem REG 8,2 75776 1710420
/usr/lib/x86_64-linux-gnu/libtasn1.so.6.5.5
exim4 6649 root mem REG 8,2 1562664 1704474
/usr/lib/x86_64-linux-gnu/libunistring.so.2.1.0
exim4 6649 root mem REG 8,2 116656 1704048
/usr/lib/x86_64-linux-gnu/libidn2.so.0.3.3
exim4 6649 root mem REG 8,2 1237640 1704463
/usr/lib/x86_64-linux-gnu/libp11-kit.so.0.3.0
exim4 6649 root mem REG 8,2 116960 3937054
/lib/x86_64-linux-gnu/libz.so.1.2.11
exim4 6649 root mem REG 8,2 144976 3938642
/lib/x86_64-linux-gnu/libpthread-2.27.so
exim4 6649 root mem REG 8,2 2030928 3938612
/lib/x86_64-linux-gnu/libc-2.27.so
exim4 6649 root mem REG 8,2 464824 3937028
/lib/x86_64-linux-gnu/libpcre.so.3.13.3
exim4 6649 root mem REG 8,2 1461856 1704582
/usr/lib/x86_64-linux-gnu/libgnutls.so.30.14.10
exim4 6649 root mem REG 8,2 1740240 1704575
/usr/lib/x86_64-linux-gnu/libdb-5.3.so
exim4 6649 root mem REG 8,2 14560 3938618
/lib/x86_64-linux-gnu/libdl-2.27.so
exim4 6649 root mem REG 8,2 1700792 3938620
/lib/x86_64-linux-gnu/libm-2.27.so
exim4 6649 root mem REG 8,2 39208 3938616
/lib/x86_64-linux-gnu/libcrypt-2.27.so
exim4 6649 root mem REG 8,2 97176 3938626
/lib/x86_64-linux-gnu/libnsl-2.27.so
exim4 6649 root mem REG 8,2 97072 3938644
/lib/x86_64-linux-gnu/libresolv-2.27.so
exim4 6649 root mem REG 8,2 179152 3938604
/lib/x86_64-linux-gnu/ld-2.27.so
exim4 6649 root 0u CHR 1,3 0t0 6 /dev/null
exim4 6649 root 1u CHR 1,3 0t0 6 /dev/null
exim4 6649 root 2u CHR 1,3 0t0 6 /dev/null
exim4 6649 root 3uw REG 8,2 111099 3280684
/var/spool/exim4/input/1nZTd8-0001j3-Pe-D
exim4 6649 root 4w REG 8,2 326 3408083
/var/spool/exim4/msglog/1nZTd8-0001j3-Pe
exim4 6649 root 5w REG 8,2 11903758 3017759
/var/log/exim4/mainlog
exim4 6649 root 6w REG 8,2 0 3280743
/var/spool/exim4/input/1nZTd8-0001j3-Pe-J
exim4 6649 root 7r FIFO 0,12 0t0 273058944 pipe


root@mx:~# lsof -p 6671
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
exim4 6671 Debian-exim cwd DIR 8,2 4096 3147755
/var/spool/exim4
exim4 6671 Debian-exim rtd DIR 8,2 4096 2 /
exim4 6671 Debian-exim txt REG 8,2 1152456 1705265
/usr/sbin/exim4
exim4 6671 Debian-exim mem REG 8,2 47568 3938632
/lib/x86_64-linux-gnu/libnss_files-2.27.so
exim4 6671 Debian-exim mem REG 8,2 47576 3938636
/lib/x86_64-linux-gnu/libnss_nis-2.27.so
exim4 6671 Debian-exim mem REG 8,2 39744 3938628
/lib/x86_64-linux-gnu/libnss_compat-2.27.so
exim4 6671 Debian-exim mem REG 8,2 31032 1710336
/usr/lib/x86_64-linux-gnu/libffi.so.6.0.4
exim4 6671 Debian-exim mem REG 8,2 526688 1710347
/usr/lib/x86_64-linux-gnu/libgmp.so.10.3.2
exim4 6671 Debian-exim mem REG 8,2 211704 1704461
/usr/lib/x86_64-linux-gnu/libhogweed.so.4.4
exim4 6671 Debian-exim mem REG 8,2 219304 1704459
/usr/lib/x86_64-linux-gnu/libnettle.so.6.4
exim4 6671 Debian-exim mem REG 8,2 75776 1710420
/usr/lib/x86_64-linux-gnu/libtasn1.so.6.5.5
exim4 6671 Debian-exim mem REG 8,2 1562664 1704474
/usr/lib/x86_64-linux-gnu/libunistring.so.2.1.0
exim4 6671 Debian-exim mem REG 8,2 116656 1704048
/usr/lib/x86_64-linux-gnu/libidn2.so.0.3.3
exim4 6671 Debian-exim mem REG 8,2 1237640 1704463
/usr/lib/x86_64-linux-gnu/libp11-kit.so.0.3.0
exim4 6671 Debian-exim mem REG 8,2 116960 3937054
/lib/x86_64-linux-gnu/libz.so.1.2.11
exim4 6671 Debian-exim mem REG 8,2 144976 3938642
/lib/x86_64-linux-gnu/libpthread-2.27.so
exim4 6671 Debian-exim mem REG 8,2 2030928 3938612
/lib/x86_64-linux-gnu/libc-2.27.so
exim4 6671 Debian-exim mem REG 8,2 464824 3937028
/lib/x86_64-linux-gnu/libpcre.so.3.13.3
exim4 6671 Debian-exim mem REG 8,2 1461856 1704582
/usr/lib/x86_64-linux-gnu/libgnutls.so.30.14.10
exim4 6671 Debian-exim mem REG 8,2 1740240 1704575
/usr/lib/x86_64-linux-gnu/libdb-5.3.so
exim4 6671 Debian-exim mem REG 8,2 14560 3938618
/lib/x86_64-linux-gnu/libdl-2.27.so
exim4 6671 Debian-exim mem REG 8,2 1700792 3938620
/lib/x86_64-linux-gnu/libm-2.27.so
exim4 6671 Debian-exim mem REG 8,2 39208 3938616
/lib/x86_64-linux-gnu/libcrypt-2.27.so
exim4 6671 Debian-exim mem REG 8,2 97176 3938626
/lib/x86_64-linux-gnu/libnsl-2.27.so
exim4 6671 Debian-exim mem REG 8,2 97072 3938644
/lib/x86_64-linux-gnu/libresolv-2.27.so
exim4 6671 Debian-exim mem REG 8,2 179152 3938604
/lib/x86_64-linux-gnu/ld-2.27.so
exim4 6671 Debian-exim 0u CHR 1,3 0t0 6 /dev/null
exim4 6671 Debian-exim 1u CHR 1,3 0t0 6 /dev/null
exim4 6671 Debian-exim 2u CHR 1,3 0t0 6 /dev/null
exim4 6671 Debian-exim 3u REG 8,2 111099 3280684
/var/spool/exim4/input/1nZTd8-0001j3-Pe-D
exim4 6671 Debian-exim 4w REG 8,2 326 3408083
/var/spool/exim4/msglog/1nZTd8-0001j3-Pe
exim4 6671 Debian-exim 5w REG 8,2 12079149 3017759
/var/log/exim4/mainlog
exim4 6671 Debian-exim 6w REG 8,2 0 3280743
/var/spool/exim4/input/1nZTd8-0001j3-Pe-J
exim4 6671 Debian-exim 7u IPv4 273058115 0t0 TCP
mx:60066->IP_THAT_SHOWS_THE_PROBLEM:smtp (ESTABLISHED)
exim4 6671 Debian-exim 8w FIFO 0,12 0t0 273058944 pipe

strace oft he two PIDs (still running):

strace -p 6671,6649 -o trace_output.txt

root@mx:~# less trace_output.txt

6671 recvfrom(7, <unfinished ...>
6649 select(8, [7], NULL, NULL, {tv_sec=0, tv_usec=307060}) = 0 (Timeout)
6649 wait4(-1, 0x7fff3cb7f78c, WNOHANG, NULL) = 0
6649 select(8, [7], NULL, NULL, {tv_sec=60, tv_usec=0}) = 0 (Timeout)
6649 wait4(-1, 0x7fff3cb7f78c, WNOHANG, NULL) = 0
6649 select(8, [7], NULL, NULL, {tv_sec=60, tv_usec=0}) = 0 (Timeout)
6649 wait4(-1, 0x7fff3cb7f78c, WNOHANG, NULL) = 0
6649 select(8, [7], NULL, NULL, {tv_sec=60, tv_usec=0}) = 0 (Timeout)
6649 wait4(-1, 0x7fff3cb7f78c, WNOHANG, NULL) = 0
6649 select(8, [7], NULL, NULL, {tv_sec=60, tv_usec=0}) = 0 (Timeout)
6649 wait4(-1, 0x7fff3cb7f78c, WNOHANG, NULL) = 0
6649 select(8, [7], NULL, NULL, {tv_sec=60, tv_usec=0}) = 0 (Timeout)
6649 wait4(-1, 0x7fff3cb7f78c, WNOHANG, NULL) = 0
6649 select(8, [7], NULL, NULL, {tv_sec=60, tv_usec=0}) = 0 (Timeout)
6649 wait4(-1, 0x7fff3cb7f78c, WNOHANG, NULL) = 0
6649 select(8, [7], NULL, NULL, {tv_sec=60, tv_usec=0}

--
Cheers,
Marc


--
## List details at https://lists.exim.org/mailman/listinfo/exim-users
## Exim details at http://www.exim.org/
## Please use the Wiki with this list - http://wiki.exim.org/
Re: 2 hours delay (gnutls_handshake): timed out: delivering unencrypted to [ In reply to ]
Thinking a little tangentially...

Could this be an entropy problem? Do you have the 'rng-tools' (might be 'rng-utils', I'm not a debian/ubuntu expert) package installed?

I have a vague recollection of blocked reads to /dev/random, which isn't filling quickly enough and stalls processes. /dev/urandom is non-blocking and fills via different means.

Additionally - the issue could in fact not be with your system at all, but with the receiver. If it's only a single host that you have the issue with, I'd be looking at their end.

Graeme
--
## List details at https://lists.exim.org/mailman/listinfo/exim-users
## Exim details at http://www.exim.org/
## Please use the Wiki with this list - http://wiki.exim.org/
Re: 2 hours delay (gnutls_handshake): timed out: delivering unencrypted to [ In reply to ]
On 30/03/2022 09:53, tt-admin via Exim-users wrote:
>> I wonder what 1758 and 1762 were doing all that time -
>> I presume there is no other mention of 1762 in the log ?
> No, they are not mentioned besides the exigrep output.

Additional useful info might be given by "exiwhat".
Run that; anything for the two PIDs involved?
--
Cheers,
Jeremy

--
## List details at https://lists.exim.org/mailman/listinfo/exim-users
## Exim details at http://www.exim.org/
## Please use the Wiki with this list - http://wiki.exim.org/
Re: 2 hours delay (gnutls_handshake): timed out: delivering unencrypted to [ In reply to ]
Continuation of the strace;

6649 select(8, [7], NULL, NULL, {tv_sec=60, tv_usec=0} <unfinished ...>
6671 <... recvfrom resumed> 0x56352d0bd71b, 324, 0, NULL, NULL) = -1
ECONNRESET (Connection reset by peer)
6671 alarm(0) = 0
6671 sendmsg(7, {msg_name=NULL, msg_namelen=0,
msg_iov=[{iov_base="\25\3\3\0\2\2Z", iov_len=7}], msg_iovlen=1,
msg_controllen=0, msg_flags=0}, 0) = -1 EPIPE (Broken pipe)
6671 --- SIGPIPE {si_signo=SIGPIPE, si_code=SI_USER, si_pid=6671,
si_uid=111} ---
6671 close(7) = 0
6671 stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=2335, ...}) = 0
6671 getpid() = 6671
6671 stat("", 0x7fff3cb7f5c0) = -1 ENOENT (No such file or
directory)
6671 close(5) = 0
6671 geteuid() = 111
6671 geteuid() = 111
6671 openat(AT_FDCWD, "/var/log/exim4/mainlog",
O_WRONLY|O_CREAT|O_APPEND|O_NONBLOCK, 0640) = 5
6671 fcntl(5, F_GETFD) = 0
6671 fcntl(5, F_SETFD, FD_CLOEXEC) = 0
6671 fcntl(5, F_GETFL) = 0x8c01 (flags
O_WRONLY|O_APPEND|O_NONBLOCK|O_LARGEFILE)
6671 fcntl(5, F_SETFL, O_WRONLY|O_APPEND|O_LARGEFILE) = 0
6671 fstat(5, {st_mode=S_IFREG|0640, st_size=15402412, ...}) = 0
6671 write(5, "2022-03-30 12:25:33.594 [6671] 1"..., 187) = 187
6671 socket(AF_INET, SOCK_STREAM, IPPROTO_IP) = 7
6671 setsockopt(7, SOL_TCP, TCP_NODELAY, [1], 4) = 0
6671 alarm(300) = 0

Then the unencrypted smtp connections starts and the mesage is delivered
unencrypted.

Did not have the chance to call exiwhat, will try next time.

--
Cheers,

Marc

-----Urspr?ngliche Nachricht-----
Von: Exim-users [mailto:exim-users-bounces+tt-admin=intranett.de@exim.org]
Im Auftrag von Andrew C Aitchison via Exim-users
Gesendet: Mittwoch, 30. M?rz 2022 09:19
An: tt-admin
Cc: exim-users@exim.org
Betreff: Re: [exim] 2 hours delay (gnutls_handshake): timed out: delivering
unencrypted to

On Wed, 30 Mar 2022, tt-admin via Exim-users wrote:

>> Does
>> exigrep 1nWC1t-0001kn-G2 /var/log/exim4/mainlog-202203*
>> (or where ever your exim logs are) show anything for the "other process"
?
>
> I'm afraid not, here you see two logs from yesterday (complete exigrep
> output without sensitive information):
>
> 2022-03-29 10:02:40.626 [1758] cwd=/var/spool/exim4 3 args:
/usr/sbin/exim4
> -Mc 1nZ6oK-0000S3-GM
> 2022-03-29 10:02:40.614 [1739] 1nZ6oK-0000S3-GM <= ... H=(...) [...]:56786
> I=[...]:25 P=esmtps X=TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256 CV=no
> S=3421694 M8S=0 id=02276F4E_71BD948B_foo@bar from <..> for ...
> 2022-03-29 10:05:58.606 [4928] 1nZ6oK-0000S3-GM Spool file is locked
> (another process is handling this message)
... ...
> 2022-03-29 12:00:58.617 [25571] 1nZ6oK-0000S3-GM Spool file is locked
> (another process is handling this message)
> 2022-03-29 12:04:06.427 [1762] 1nZ6oK-0000S3-GM TLS session:
> (gnutls_handshake): timed out: delivering unencrypted to H=... [...] (not
in
> hosts_require_tls)
> 2022-03-29 12:04:08.110 [1758] 1nZ6oK-0000S3-GM => ... F=<...> P=<...>
> R=dnslookup T=remote_smtp S=3466181 H=... [...]:25 PRX=[]:0 I=[...]:59614
> C="250 2.0.0 OK 59/09-19783-CC9D2426" QT=2h1m27s DT=1.678s
> 2022-03-29 12:04:08.110 [1758] 1nZ6oK-0000S3-GM Completed QT=2h1m27s

I wonder what 1758 and 1762 were doing all that time -
I presume there is no other mention of 1762 in the log ?

If you see any more of these stuck in the queue, can you get an lsof
and a truss/strace/ltrace of the processes ?

> Same sender, same recipient, same recipient host:
>
> 2022-03-29 12:33:19.531 [29600] cwd=/var/spool/exim4 3 args:
/usr/sbin/exim4
> -Mc 1nZ9A7-0007hO-Fh
>
> 2022-03-29 12:33:19.506 [29598] 1nZ9A7-0007hO-Fh <= ... H=(...)
[...]:40002
> I=[...]:25 P=esmtps X=TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256 CV=no
S=610337
> M8S=0 id=02B18AED_0F70847B_foo@bar from <...> for ...
> 2022-03-29 12:33:20.105 [29600] 1nZ9A7-0007hO-Fh => ... F=<...> P=<...>
> R=dnslookup T=remote_smtp S=618306 H=... [...]:25 PRX=[]:0 I=[...]:60340
> X=TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256 CV=yes DN="CN=*.foo.bar" C="250
> 2.0.0 OK DE/29-19783-5A0E2426" QT=0.618s DT=0.546s
> 2022-03-29 12:33:20.106 [29600] 1nZ9A7-0007hO-Fh Completed QT=0.619s
>
> Restarting the daemon did not help (was not expecting, but tried anyway).
>
>
> Cheers
> --
> Marc
>
>
>
> --
> Andrew C. Aitchison Kendal, UK
> andrew@aitchison.me.uk
>
>
> --
> ## List details at https://lists.exim.org/mailman/listinfo/exim-users
> ## Exim details at http://www.exim.org/
> ## Please use the Wiki with this list - http://wiki.exim.org/
>

--
Andrew C. Aitchison Kendal, UK
andrew@aitchison.me.uk

--
## List details at https://lists.exim.org/mailman/listinfo/exim-users
## Exim details at http://www.exim.org/
## Please use the Wiki with this list - http://wiki.exim.org/


--
## List details at https://lists.exim.org/mailman/listinfo/exim-users
## Exim details at http://www.exim.org/
## Please use the Wiki with this list - http://wiki.exim.org/
Re: 2 hours delay (gnutls_handshake): timed out: delivering unencrypted to [ In reply to ]
On Wed, Mar 30, 2022 at 02:53:45PM +0200, tt-admin via Exim-users wrote:
> Continuation of the strace;
>
> 6649 select(8, [7], NULL, NULL, {tv_sec=60, tv_usec=0} <unfinished ...>
> 6671 <... recvfrom resumed> 0x56352d0bd71b, 324, 0, NULL, NULL) = -1
> ECONNRESET (Connection reset by peer)
> 6671 alarm(0) = 0
> 6671 sendmsg(7, {msg_name=NULL, msg_namelen=0,
> msg_iov=[{iov_base="\25\3\3\0\2\2Z", iov_len=7}], msg_iovlen=1,
> msg_controllen=0, msg_flags=0}, 0) = -1 EPIPE (Broken pipe)
> 6671 --- SIGPIPE {si_signo=SIGPIPE, si_code=SI_USER, si_pid=6671,
> si_uid=111} ---

Diring TLS exchange server sent RST, client got ECONNRESET and tried to
send TLS closing packet (it's my guess what "\25\3\3\0\2\2Z" means),
so client got EPIPE because connection has been deleted by kernel on RST.

The question is why server has dropped connection. Probably this connection
was timed out -- on the server's side, not on the client's one.
Traffic dump from both ends could make this situation clean.

> 6671 openat(AT_FDCWD, "/var/log/exim4/mainlog",
> O_WRONLY|O_CREAT|O_APPEND|O_NONBLOCK, 0640) = 5
> 6671 fcntl(5, F_GETFD) = 0
> 6671 fcntl(5, F_SETFD, FD_CLOEXEC) = 0
> 6671 fcntl(5, F_GETFL) = 0x8c01 (flags
> O_WRONLY|O_APPEND|O_NONBLOCK|O_LARGEFILE)
> 6671 fcntl(5, F_SETFL, O_WRONLY|O_APPEND|O_LARGEFILE) = 0
> 6671 fstat(5, {st_mode=S_IFREG|0640, st_size=15402412, ...}) = 0
> 6671 write(5, "2022-03-30 12:25:33.594 [6671] 1"..., 187) = 187

You have better to increase strace display length with -s.., or post
full log line here.

> 6671 socket(AF_INET, SOCK_STREAM, IPPROTO_IP) = 7
> 6671 setsockopt(7, SOL_TCP, TCP_NODELAY, [1], 4) = 0
> 6671 alarm(300) = 0
>
> Then the unencrypted smtp connections starts and the mesage is delivered
> unencrypted.

I could imagine one reason for such server's behaviour: presence of some
statefull firewall behind server, which scans SMTP packets and drops
packets with "suspicious" patterns. If it drop some TLS packet from client,
it would also drop all its copies on retransmissions, so TLS session
become disrupted and tcp connection dies on timeout.

However, in this scenario messages should not wait in queue for hours,
they should be delivered in plain SMTP just after first TLS timeout.
--
Eugene Berdnikov

--
## List details at https://lists.exim.org/mailman/listinfo/exim-users
## Exim details at http://www.exim.org/
## Please use the Wiki with this list - http://wiki.exim.org/
Re: 2 hours delay (gnutls_handshake): timed out: delivering unencrypted to [ In reply to ]
exiwhat says: waiting for a remote delivery subprocess to finish

started a new strace with longer lines (-s 256) and date column.

I will try to contact the remote administrator, but my hopes for a trace
from their side are low (Symantec Messaging Gateway, i suppose without root
access).

--
Marc

-----Urspr?ngliche Nachricht-----
Von: Exim-users [mailto:exim-users-bounces+tt-admin=intranett.de@exim.org]
Im Auftrag von Evgeniy Berdnikov via Exim-users
Gesendet: Mittwoch, 30. M?rz 2022 19:56
An: exim-users@exim.org
Betreff: Re: [exim] 2 hours delay (gnutls_handshake): timed out: delivering
unencrypted to

On Wed, Mar 30, 2022 at 02:53:45PM +0200, tt-admin via Exim-users wrote:
> Continuation of the strace;
>
> 6649 select(8, [7], NULL, NULL, {tv_sec=60, tv_usec=0} <unfinished ...>
> 6671 <... recvfrom resumed> 0x56352d0bd71b, 324, 0, NULL, NULL) = -1
> ECONNRESET (Connection reset by peer)
> 6671 alarm(0) = 0
> 6671 sendmsg(7, {msg_name=NULL, msg_namelen=0,
> msg_iov=[{iov_base="\25\3\3\0\2\2Z", iov_len=7}], msg_iovlen=1,
> msg_controllen=0, msg_flags=0}, 0) = -1 EPIPE (Broken pipe)
> 6671 --- SIGPIPE {si_signo=SIGPIPE, si_code=SI_USER, si_pid=6671,
> si_uid=111} ---

Diring TLS exchange server sent RST, client got ECONNRESET and tried to
send TLS closing packet (it's my guess what "\25\3\3\0\2\2Z" means),
so client got EPIPE because connection has been deleted by kernel on RST.

The question is why server has dropped connection. Probably this connection
was timed out -- on the server's side, not on the client's one.
Traffic dump from both ends could make this situation clean.

> 6671 openat(AT_FDCWD, "/var/log/exim4/mainlog",
> O_WRONLY|O_CREAT|O_APPEND|O_NONBLOCK, 0640) = 5
> 6671 fcntl(5, F_GETFD) = 0
> 6671 fcntl(5, F_SETFD, FD_CLOEXEC) = 0
> 6671 fcntl(5, F_GETFL) = 0x8c01 (flags
> O_WRONLY|O_APPEND|O_NONBLOCK|O_LARGEFILE)
> 6671 fcntl(5, F_SETFL, O_WRONLY|O_APPEND|O_LARGEFILE) = 0
> 6671 fstat(5, {st_mode=S_IFREG|0640, st_size=15402412, ...}) = 0
> 6671 write(5, "2022-03-30 12:25:33.594 [6671] 1"..., 187) = 187

You have better to increase strace display length with -s.., or post
full log line here.

> 6671 socket(AF_INET, SOCK_STREAM, IPPROTO_IP) = 7
> 6671 setsockopt(7, SOL_TCP, TCP_NODELAY, [1], 4) = 0
> 6671 alarm(300) = 0
>
> Then the unencrypted smtp connections starts and the mesage is delivered
> unencrypted.

I could imagine one reason for such server's behaviour: presence of some
statefull firewall behind server, which scans SMTP packets and drops
packets with "suspicious" patterns. If it drop some TLS packet from client,
it would also drop all its copies on retransmissions, so TLS session
become disrupted and tcp connection dies on timeout.

However, in this scenario messages should not wait in queue for hours,
they should be delivered in plain SMTP just after first TLS timeout.
--
Eugene Berdnikov

--
## List details at https://lists.exim.org/mailman/listinfo/exim-users
## Exim details at http://www.exim.org/
## Please use the Wiki with this list - http://wiki.exim.org/


--
## List details at https://lists.exim.org/mailman/listinfo/exim-users
## Exim details at http://www.exim.org/
## Please use the Wiki with this list - http://wiki.exim.org/
Re: 2 hours delay (gnutls_handshake): timed out: delivering unencrypted to [ In reply to ]
On Thu, 31 Mar 2022, tt-admin via Exim-users wrote:

> exiwhat says: waiting for a remote delivery subprocess to finish
>
> started a new strace with longer lines (-s 256) and date column.
>
> I will try to contact the remote administrator, but my hopes for a trace
> from their side are low (Symantec Messaging Gateway, i suppose without root
> access).

Ah. If it is a mail gateway, the question might be whether their gateway
starts to send the message on to their mail server (cut-through delivery)
or not.

Does the Symantec Messaging Gateway advertise PIPELINING and do you use it ?

> --
> Marc
>
> -----Urspr?ngliche Nachricht-----
> Von: Exim-users [mailto:exim-users-bounces+tt-admin=intranett.de@exim.org]
> Im Auftrag von Evgeniy Berdnikov via Exim-users
> Gesendet: Mittwoch, 30. M?rz 2022 19:56
> An: exim-users@exim.org
> Betreff: Re: [exim] 2 hours delay (gnutls_handshake): timed out: delivering
> unencrypted to
>
> On Wed, Mar 30, 2022 at 02:53:45PM +0200, tt-admin via Exim-users wrote:
>> Continuation of the strace;
>>
>> 6649 select(8, [7], NULL, NULL, {tv_sec=60, tv_usec=0} <unfinished ...>
>> 6671 <... recvfrom resumed> 0x56352d0bd71b, 324, 0, NULL, NULL) = -1
>> ECONNRESET (Connection reset by peer)
>> 6671 alarm(0) = 0
>> 6671 sendmsg(7, {msg_name=NULL, msg_namelen=0,
>> msg_iov=[{iov_base="\25\3\3\0\2\2Z", iov_len=7}], msg_iovlen=1,
>> msg_controllen=0, msg_flags=0}, 0) = -1 EPIPE (Broken pipe)
>> 6671 --- SIGPIPE {si_signo=SIGPIPE, si_code=SI_USER, si_pid=6671,
>> si_uid=111} ---
>
> Diring TLS exchange server sent RST, client got ECONNRESET and tried to
> send TLS closing packet (it's my guess what "\25\3\3\0\2\2Z" means),
> so client got EPIPE because connection has been deleted by kernel on RST.
>
> The question is why server has dropped connection. Probably this connection
> was timed out -- on the server's side, not on the client's one.
> Traffic dump from both ends could make this situation clean.
>
>> 6671 openat(AT_FDCWD, "/var/log/exim4/mainlog",
>> O_WRONLY|O_CREAT|O_APPEND|O_NONBLOCK, 0640) = 5
>> 6671 fcntl(5, F_GETFD) = 0
>> 6671 fcntl(5, F_SETFD, FD_CLOEXEC) = 0
>> 6671 fcntl(5, F_GETFL) = 0x8c01 (flags
>> O_WRONLY|O_APPEND|O_NONBLOCK|O_LARGEFILE)
>> 6671 fcntl(5, F_SETFL, O_WRONLY|O_APPEND|O_LARGEFILE) = 0
>> 6671 fstat(5, {st_mode=S_IFREG|0640, st_size=15402412, ...}) = 0
>> 6671 write(5, "2022-03-30 12:25:33.594 [6671] 1"..., 187) = 187
>
> You have better to increase strace display length with -s.., or post
> full log line here.
>
>> 6671 socket(AF_INET, SOCK_STREAM, IPPROTO_IP) = 7
>> 6671 setsockopt(7, SOL_TCP, TCP_NODELAY, [1], 4) = 0
>> 6671 alarm(300) = 0
>>
>> Then the unencrypted smtp connections starts and the mesage is delivered
>> unencrypted.
>
> I could imagine one reason for such server's behaviour: presence of some
> statefull firewall behind server, which scans SMTP packets and drops
> packets with "suspicious" patterns. If it drop some TLS packet from client,
> it would also drop all its copies on retransmissions, so TLS session
> become disrupted and tcp connection dies on timeout.
>
> However, in this scenario messages should not wait in queue for hours,
> they should be delivered in plain SMTP just after first TLS timeout.
> --
> Eugene Berdnikov
>
> --
> ## List details at https://lists.exim.org/mailman/listinfo/exim-users
> ## Exim details at http://www.exim.org/
> ## Please use the Wiki with this list - http://wiki.exim.org/
>
>
> --
> ## List details at https://lists.exim.org/mailman/listinfo/exim-users
> ## Exim details at http://www.exim.org/
> ## Please use the Wiki with this list - http://wiki.exim.org/
>

--
Andrew C. Aitchison Kendal, UK
andrew@aitchison.me.uk
--
## List details at https://lists.exim.org/mailman/listinfo/exim-users
## Exim details at http://www.exim.org/
## Please use the Wiki with this list - http://wiki.exim.org/
Re: 2 hours delay (gnutls_handshake): timed out: delivering unencrypted to [ In reply to ]
>Ah. If it is a mail gateway, the question might be whether their gateway
>starts to send the message on to their mail server (cut-through delivery)
>or not.

>Does the Symantec Messaging Gateway advertise PIPELINING and do you use it
?


Yes, the remote gateway does advertise pipelining. As far is i remember,
exim does youse pipelining as a default when sending e-mails. I need to ask
if they forward their mail internally.

Here ist he complete strace of the hanging process:

https://pastebin.com/wPPGab1K

--

Marc

-----Urspr?ngliche Nachricht-----
Von: Exim-users [mailto:exim-users-bounces+tt-admin=intranett.de@exim.org]
Im Auftrag von Andrew C Aitchison via Exim-users
Gesendet: Donnerstag, 31. M?rz 2022 12:35
An: tt-admin
Cc: exim-users@exim.org
Betreff: Re: [exim] 2 hours delay (gnutls_handshake): timed out: delivering
unencrypted to

On Thu, 31 Mar 2022, tt-admin via Exim-users wrote:

> exiwhat says: waiting for a remote delivery subprocess to finish
>
> started a new strace with longer lines (-s 256) and date column.
>
> I will try to contact the remote administrator, but my hopes for a trace
> from their side are low (Symantec Messaging Gateway, i suppose without
root
> access).

Ah. If it is a mail gateway, the question might be whether their gateway
starts to send the message on to their mail server (cut-through delivery)
or not.

Does the Symantec Messaging Gateway advertise PIPELINING and do you use it ?

> --
> Marc
>
> -----Urspr?ngliche Nachricht-----
> Von: Exim-users [mailto:exim-users-bounces+tt-admin=intranett.de@exim.org]
> Im Auftrag von Evgeniy Berdnikov via Exim-users
> Gesendet: Mittwoch, 30. M?rz 2022 19:56
> An: exim-users@exim.org
> Betreff: Re: [exim] 2 hours delay (gnutls_handshake): timed out:
delivering
> unencrypted to
>
> On Wed, Mar 30, 2022 at 02:53:45PM +0200, tt-admin via Exim-users wrote:
>> Continuation of the strace;
>>
>> 6649 select(8, [7], NULL, NULL, {tv_sec=60, tv_usec=0} <unfinished ...>
>> 6671 <... recvfrom resumed> 0x56352d0bd71b, 324, 0, NULL, NULL) = -1
>> ECONNRESET (Connection reset by peer)
>> 6671 alarm(0) = 0
>> 6671 sendmsg(7, {msg_name=NULL, msg_namelen=0,
>> msg_iov=[{iov_base="\25\3\3\0\2\2Z", iov_len=7}], msg_iovlen=1,
>> msg_controllen=0, msg_flags=0}, 0) = -1 EPIPE (Broken pipe)
>> 6671 --- SIGPIPE {si_signo=SIGPIPE, si_code=SI_USER, si_pid=6671,
>> si_uid=111} ---
>
> Diring TLS exchange server sent RST, client got ECONNRESET and tried to
> send TLS closing packet (it's my guess what "\25\3\3\0\2\2Z" means),
> so client got EPIPE because connection has been deleted by kernel on RST.
>
> The question is why server has dropped connection. Probably this
connection
> was timed out -- on the server's side, not on the client's one.
> Traffic dump from both ends could make this situation clean.
>
>> 6671 openat(AT_FDCWD, "/var/log/exim4/mainlog",
>> O_WRONLY|O_CREAT|O_APPEND|O_NONBLOCK, 0640) = 5
>> 6671 fcntl(5, F_GETFD) = 0
>> 6671 fcntl(5, F_SETFD, FD_CLOEXEC) = 0
>> 6671 fcntl(5, F_GETFL) = 0x8c01 (flags
>> O_WRONLY|O_APPEND|O_NONBLOCK|O_LARGEFILE)
>> 6671 fcntl(5, F_SETFL, O_WRONLY|O_APPEND|O_LARGEFILE) = 0
>> 6671 fstat(5, {st_mode=S_IFREG|0640, st_size=15402412, ...}) = 0
>> 6671 write(5, "2022-03-30 12:25:33.594 [6671] 1"..., 187) = 187
>
> You have better to increase strace display length with -s.., or post
> full log line here.
>
>> 6671 socket(AF_INET, SOCK_STREAM, IPPROTO_IP) = 7
>> 6671 setsockopt(7, SOL_TCP, TCP_NODELAY, [1], 4) = 0
>> 6671 alarm(300) = 0
>>
>> Then the unencrypted smtp connections starts and the mesage is delivered
>> unencrypted.
>
> I could imagine one reason for such server's behaviour: presence of some
> statefull firewall behind server, which scans SMTP packets and drops
> packets with "suspicious" patterns. If it drop some TLS packet from
client,
> it would also drop all its copies on retransmissions, so TLS session
> become disrupted and tcp connection dies on timeout.
>
> However, in this scenario messages should not wait in queue for hours,
> they should be delivered in plain SMTP just after first TLS timeout.
> --
> Eugene Berdnikov
>
> --
> ## List details at https://lists.exim.org/mailman/listinfo/exim-users
> ## Exim details at http://www.exim.org/
> ## Please use the Wiki with this list - http://wiki.exim.org/
>
>
> --
> ## List details at https://lists.exim.org/mailman/listinfo/exim-users
> ## Exim details at http://www.exim.org/
> ## Please use the Wiki with this list - http://wiki.exim.org/
>

--
Andrew C. Aitchison Kendal, UK
andrew@aitchison.me.uk
--
## List details at https://lists.exim.org/mailman/listinfo/exim-users
## Exim details at http://www.exim.org/
## Please use the Wiki with this list - http://wiki.exim.org/


--
## List details at https://lists.exim.org/mailman/listinfo/exim-users
## Exim details at http://www.exim.org/
## Please use the Wiki with this list - http://wiki.exim.org/
Re: 2 hours delay (gnutls_handshake): timed out: delivering unencrypted to [ In reply to ]
On Thu, 7 Apr 2022, tt-admin via Exim-users wrote:

>
>> Ah. If it is a mail gateway, the question might be whether their gateway
>> starts to send the message on to their mail server (cut-through delivery)
>> or not.
>
>> Does the Symantec Messaging Gateway advertise PIPELINING and do you use it
> ?
>
>
> Yes, the remote gateway does advertise pipelining. As far is i remember,
> exim does youse pipelining as a default when sending e-mails. I need to ask
> if they forward their mail internally.

You can disable pipelining when sending to specific hosts, with
hosts_avoid_pipelining


>> -----Urspr?ngliche Nachricht-----
>> Von: Exim-users [mailto:exim-users-bounces+tt-admin=intranett.de@exim.org]
>> Im Auftrag von Evgeniy Berdnikov via Exim-users
>> Gesendet: Mittwoch, 30. M?rz 2022 19:56
... ...
>> The question is why server has dropped connection. Probably this
>> connection was timed out -- on the server's side, not on the client's one.
>> Traffic dump from both ends could make this situation clean.

I think that this is the next step ...
"pcap" is the string to search documentation for.


--
Andrew C. Aitchison Kendal, UK
andrew@aitchison.me.uk
--
## List details at https://lists.exim.org/mailman/listinfo/exim-users
## Exim details at http://www.exim.org/
## Please use the Wiki with this list - http://wiki.exim.org/
Re: 2 hours delay (gnutls_handshake): timed out: delivering unencrypted to [ In reply to ]
On 07/04/2022 15:16, tt-admin via Exim-users wrote:
> Here ist he complete strace of the hanging process:
>
> https://pastebin.com/wPPGab1K

31032 10:47:07 wait4(-1, 0x7fff70a35a0c, WNOHANG, NULL) = 0
31032 10:47:07 select(8, [7], NULL, NULL, {tv_sec=60, tv_usec=0}) = 0 (Timeout)

This looks like 31032 is your daemon, running with a queue_interval of 60 seconds
(and with select rather than poll, you are running with likely known
bugs active on FreeBSD).


31037 10:47:07 <... recvfrom resumed> 0x55f93be5671b, 324, 0, NULL, NULL) = ? ERESTARTSYS (To be restarted if SA_RESTART is set)
31032 10:47:07 --- SIGUSR1 {si_signo=SIGUSR1, si_code=SI_USER, si_pid=32665, si_uid=0} ---
31037 10:47:07 --- SIGUSR1 {si_signo=SIGUSR1, si_code=SI_USER, si_pid=32665, si_uid=0} ---

Presumably 31037 is the process of interest. SIGUSR1 was possibly the result of
running exiwhat.

31037 10:46:58 write(9, "31037 delivering 1nZqMd-00084U-Qc to foo.bar [x.x.x.x] (foog@bar)\n", 105 <unfinished ...>
31037 10:46:58 rt_sigreturn({mask=[]} <unfinished ...>

consistent with exiwhat

31037 10:46:58 recvfrom(7, <unfinished ...>

and back to waiting in read-from-network...
We can at least discount a lack-of-entropy issue.



The call from the Exim transport to the GnuTLS library gnutls_handshake()
routine is wrapped in a an alarm() call, set by the transport option "command_timeout".
The default for that is 5 minutes (but check your config... a setting of
two hours would probably be unwise. You also mentioned that you've seen
5 minutes on other connections, presumably the same transport).

Seeing (via strace) the syscall *setting* that alarm might be interesting
(though I fear we'll see it being 300s and be no closer to a fix).
--
Cheers,
Jeremy

--
## List details at https://lists.exim.org/mailman/listinfo/exim-users
## Exim details at http://www.exim.org/
## Please use the Wiki with this list - http://wiki.exim.org/
Re: 2 hours delay (gnutls_handshake): timed out: delivering unencrypted to [ In reply to ]
Just finished testing - the problem still occurs when we use Ubuntu 20.04
with Exim version 4.93 #5 (built 28-Apr-2021 13:19:17) (our older machine
where the problem was first monitored is v4.90 and Ubuntu 18.04).

-----Urspr?ngliche Nachricht-----
Von: Exim-users [mailto:exim-users-bounces+tt-admin=intranett.de@exim.org]
Im Auftrag von Andrew C Aitchison via Exim-users
Gesendet: Donnerstag, 31. M?rz 2022 12:35
An: tt-admin
Cc: exim-users@exim.org
Betreff: Re: [exim] 2 hours delay (gnutls_handshake): timed out: delivering
unencrypted to

On Thu, 31 Mar 2022, tt-admin via Exim-users wrote:

> exiwhat says: waiting for a remote delivery subprocess to finish
>
> started a new strace with longer lines (-s 256) and date column.
>
> I will try to contact the remote administrator, but my hopes for a trace
> from their side are low (Symantec Messaging Gateway, i suppose without
root
> access).

Ah. If it is a mail gateway, the question might be whether their gateway
starts to send the message on to their mail server (cut-through delivery)
or not.

Does the Symantec Messaging Gateway advertise PIPELINING and do you use it ?

> --
> Marc
>
> -----Urspr?ngliche Nachricht-----
> Von: Exim-users [mailto:exim-users-bounces+tt-admin=intranett.de@exim.org]
> Im Auftrag von Evgeniy Berdnikov via Exim-users
> Gesendet: Mittwoch, 30. M?rz 2022 19:56
> An: exim-users@exim.org
> Betreff: Re: [exim] 2 hours delay (gnutls_handshake): timed out:
delivering
> unencrypted to
>
> On Wed, Mar 30, 2022 at 02:53:45PM +0200, tt-admin via Exim-users wrote:
>> Continuation of the strace;
>>
>> 6649 select(8, [7], NULL, NULL, {tv_sec=60, tv_usec=0} <unfinished ...>
>> 6671 <... recvfrom resumed> 0x56352d0bd71b, 324, 0, NULL, NULL) = -1
>> ECONNRESET (Connection reset by peer)
>> 6671 alarm(0) = 0
>> 6671 sendmsg(7, {msg_name=NULL, msg_namelen=0,
>> msg_iov=[{iov_base="\25\3\3\0\2\2Z", iov_len=7}], msg_iovlen=1,
>> msg_controllen=0, msg_flags=0}, 0) = -1 EPIPE (Broken pipe)
>> 6671 --- SIGPIPE {si_signo=SIGPIPE, si_code=SI_USER, si_pid=6671,
>> si_uid=111} ---
>
> Diring TLS exchange server sent RST, client got ECONNRESET and tried to
> send TLS closing packet (it's my guess what "\25\3\3\0\2\2Z" means),
> so client got EPIPE because connection has been deleted by kernel on RST.
>
> The question is why server has dropped connection. Probably this
connection
> was timed out -- on the server's side, not on the client's one.
> Traffic dump from both ends could make this situation clean.
>
>> 6671 openat(AT_FDCWD, "/var/log/exim4/mainlog",
>> O_WRONLY|O_CREAT|O_APPEND|O_NONBLOCK, 0640) = 5
>> 6671 fcntl(5, F_GETFD) = 0
>> 6671 fcntl(5, F_SETFD, FD_CLOEXEC) = 0
>> 6671 fcntl(5, F_GETFL) = 0x8c01 (flags
>> O_WRONLY|O_APPEND|O_NONBLOCK|O_LARGEFILE)
>> 6671 fcntl(5, F_SETFL, O_WRONLY|O_APPEND|O_LARGEFILE) = 0
>> 6671 fstat(5, {st_mode=S_IFREG|0640, st_size=15402412, ...}) = 0
>> 6671 write(5, "2022-03-30 12:25:33.594 [6671] 1"..., 187) = 187
>
> You have better to increase strace display length with -s.., or post
> full log line here.
>
>> 6671 socket(AF_INET, SOCK_STREAM, IPPROTO_IP) = 7
>> 6671 setsockopt(7, SOL_TCP, TCP_NODELAY, [1], 4) = 0
>> 6671 alarm(300) = 0
>>
>> Then the unencrypted smtp connections starts and the mesage is delivered
>> unencrypted.
>
> I could imagine one reason for such server's behaviour: presence of some
> statefull firewall behind server, which scans SMTP packets and drops
> packets with "suspicious" patterns. If it drop some TLS packet from
client,
> it would also drop all its copies on retransmissions, so TLS session
> become disrupted and tcp connection dies on timeout.
>
> However, in this scenario messages should not wait in queue for hours,
> they should be delivered in plain SMTP just after first TLS timeout.
> --
> Eugene Berdnikov
>
> --
> ## List details at https://lists.exim.org/mailman/listinfo/exim-users
> ## Exim details at http://www.exim.org/
> ## Please use the Wiki with this list - http://wiki.exim.org/
>
>
> --
> ## List details at https://lists.exim.org/mailman/listinfo/exim-users
> ## Exim details at http://www.exim.org/
> ## Please use the Wiki with this list - http://wiki.exim.org/
>

--
Andrew C. Aitchison Kendal, UK
andrew@aitchison.me.uk
--
## List details at https://lists.exim.org/mailman/listinfo/exim-users
## Exim details at http://www.exim.org/
## Please use the Wiki with this list - http://wiki.exim.org/


--
## List details at https://lists.exim.org/mailman/listinfo/exim-users
## Exim details at http://www.exim.org/
## Please use the Wiki with this list - http://wiki.exim.org/
Re: 2 hours delay (gnutls_handshake): timed out: delivering unencrypted to [ In reply to ]
On Tue, 19 Apr 2022, tt-admin via Exim-users wrote:

> Just finished testing - the problem still occurs when we use Ubuntu 20.04
> with Exim version 4.93 #5 (built 28-Apr-2021 13:19:17) (our older machine
> where the problem was first monitored is v4.90 and Ubuntu 18.04).

Exim v4.93 was released in 2019, so still not very new.
Ubuntu 21.10 has exim 4.94.2 and 22.04 (released this week) has exim 4.95.

Any chance of seeing whether a newer exim, perhaps locally built, helps ?

> -----Urspr?ngliche Nachricht-----
> Von: Exim-users [mailto:exim-users-bounces+tt-admin=intranett.de@exim.org]
> Im Auftrag von Andrew C Aitchison via Exim-users
> Gesendet: Donnerstag, 31. M?rz 2022 12:35
> An: tt-admin
> Cc: exim-users@exim.org
> Betreff: Re: [exim] 2 hours delay (gnutls_handshake): timed out: delivering
> unencrypted to
>
> On Thu, 31 Mar 2022, tt-admin via Exim-users wrote:
>
>> exiwhat says: waiting for a remote delivery subprocess to finish
>>
>> started a new strace with longer lines (-s 256) and date column.
>>
>> I will try to contact the remote administrator, but my hopes for a trace
>> from their side are low (Symantec Messaging Gateway, i suppose without
> root
>> access).
>
> Ah. If it is a mail gateway, the question might be whether their gateway
> starts to send the message on to their mail server (cut-through delivery)
> or not.
>
> Does the Symantec Messaging Gateway advertise PIPELINING and do you use it ?
>
>> --
>> Marc
>>
>> -----Urspr?ngliche Nachricht-----
>> Von: Exim-users [mailto:exim-users-bounces+tt-admin=intranett.de@exim.org]
>> Im Auftrag von Evgeniy Berdnikov via Exim-users
>> Gesendet: Mittwoch, 30. M?rz 2022 19:56
>> An: exim-users@exim.org
>> Betreff: Re: [exim] 2 hours delay (gnutls_handshake): timed out:
> delivering
>> unencrypted to
>>
>> On Wed, Mar 30, 2022 at 02:53:45PM +0200, tt-admin via Exim-users wrote:
>>> Continuation of the strace;
>>>
>>> 6649 select(8, [7], NULL, NULL, {tv_sec=60, tv_usec=0} <unfinished ...>
>>> 6671 <... recvfrom resumed> 0x56352d0bd71b, 324, 0, NULL, NULL) = -1
>>> ECONNRESET (Connection reset by peer)
>>> 6671 alarm(0) = 0
>>> 6671 sendmsg(7, {msg_name=NULL, msg_namelen=0,
>>> msg_iov=[{iov_base="\25\3\3\0\2\2Z", iov_len=7}], msg_iovlen=1,
>>> msg_controllen=0, msg_flags=0}, 0) = -1 EPIPE (Broken pipe)
>>> 6671 --- SIGPIPE {si_signo=SIGPIPE, si_code=SI_USER, si_pid=6671,
>>> si_uid=111} ---
>>
>> Diring TLS exchange server sent RST, client got ECONNRESET and tried to
>> send TLS closing packet (it's my guess what "\25\3\3\0\2\2Z" means),
>> so client got EPIPE because connection has been deleted by kernel on RST.
>>
>> The question is why server has dropped connection. Probably this
> connection
>> was timed out -- on the server's side, not on the client's one.
>> Traffic dump from both ends could make this situation clean.
>>
>>> 6671 openat(AT_FDCWD, "/var/log/exim4/mainlog",
>>> O_WRONLY|O_CREAT|O_APPEND|O_NONBLOCK, 0640) = 5
>>> 6671 fcntl(5, F_GETFD) = 0
>>> 6671 fcntl(5, F_SETFD, FD_CLOEXEC) = 0
>>> 6671 fcntl(5, F_GETFL) = 0x8c01 (flags
>>> O_WRONLY|O_APPEND|O_NONBLOCK|O_LARGEFILE)
>>> 6671 fcntl(5, F_SETFL, O_WRONLY|O_APPEND|O_LARGEFILE) = 0
>>> 6671 fstat(5, {st_mode=S_IFREG|0640, st_size=15402412, ...}) = 0
>>> 6671 write(5, "2022-03-30 12:25:33.594 [6671] 1"..., 187) = 187
>>
>> You have better to increase strace display length with -s.., or post
>> full log line here.
>>
>>> 6671 socket(AF_INET, SOCK_STREAM, IPPROTO_IP) = 7
>>> 6671 setsockopt(7, SOL_TCP, TCP_NODELAY, [1], 4) = 0
>>> 6671 alarm(300) = 0
>>>
>>> Then the unencrypted smtp connections starts and the mesage is delivered
>>> unencrypted.
>>
>> I could imagine one reason for such server's behaviour: presence of some
>> statefull firewall behind server, which scans SMTP packets and drops
>> packets with "suspicious" patterns. If it drop some TLS packet from
> client,
>> it would also drop all its copies on retransmissions, so TLS session
>> become disrupted and tcp connection dies on timeout.
>>
>> However, in this scenario messages should not wait in queue for hours,
>> they should be delivered in plain SMTP just after first TLS timeout.
>> --
>> Eugene Berdnikov
>>
>> --
>> ## List details at https://lists.exim.org/mailman/listinfo/exim-users
>> ## Exim details at http://www.exim.org/
>> ## Please use the Wiki with this list - http://wiki.exim.org/
>>
>>
>> --
>> ## List details at https://lists.exim.org/mailman/listinfo/exim-users
>> ## Exim details at http://www.exim.org/
>> ## Please use the Wiki with this list - http://wiki.exim.org/
>>
>
> --
> Andrew C. Aitchison Kendal, UK
> andrew@aitchison.me.uk
> --
> ## List details at https://lists.exim.org/mailman/listinfo/exim-users
> ## Exim details at http://www.exim.org/
> ## Please use the Wiki with this list - http://wiki.exim.org/
>
>
> --
> ## List details at https://lists.exim.org/mailman/listinfo/exim-users
> ## Exim details at http://www.exim.org/
> ## Please use the Wiki with this list - http://wiki.exim.org/
>

--
Andrew C. Aitchison Kendal, UK
andrew@aitchison.me.uk
--
## List details at https://lists.exim.org/mailman/listinfo/exim-users
## Exim details at http://www.exim.org/
## Please use the Wiki with this list - http://wiki.exim.org/
Re: 2 hours delay (gnutls_handshake): timed out: delivering unencrypted to [ In reply to ]
Well, seems like the remote site managed to install an update for their
Symantec E-Mail Gateways and since then - the error disappeared. We still
run on Exim v4.90.


-----Urspr?ngliche Nachricht-----
Von: Exim-users [mailto:exim-users-bounces+tt-admin=intranett.de@exim.org]
Im Auftrag von Andrew C Aitchison via Exim-users
Gesendet: Sonntag, 24. April 2022 11:37
An: tt-admin
Cc: exim-users@exim.org
Betreff: Re: [exim] 2 hours delay (gnutls_handshake): timed out: delivering
unencrypted to

On Tue, 19 Apr 2022, tt-admin via Exim-users wrote:

> Just finished testing - the problem still occurs when we use Ubuntu 20.04
> with Exim version 4.93 #5 (built 28-Apr-2021 13:19:17) (our older machine
> where the problem was first monitored is v4.90 and Ubuntu 18.04).

Exim v4.93 was released in 2019, so still not very new.
Ubuntu 21.10 has exim 4.94.2 and 22.04 (released this week) has exim 4.95.

Any chance of seeing whether a newer exim, perhaps locally built, helps ?

> -----Urspr?ngliche Nachricht-----
> Von: Exim-users [mailto:exim-users-bounces+tt-admin=intranett.de@exim.org]
> Im Auftrag von Andrew C Aitchison via Exim-users
> Gesendet: Donnerstag, 31. M?rz 2022 12:35
> An: tt-admin
> Cc: exim-users@exim.org
> Betreff: Re: [exim] 2 hours delay (gnutls_handshake): timed out:
delivering
> unencrypted to
>
> On Thu, 31 Mar 2022, tt-admin via Exim-users wrote:
>
>> exiwhat says: waiting for a remote delivery subprocess to finish
>>
>> started a new strace with longer lines (-s 256) and date column.
>>
>> I will try to contact the remote administrator, but my hopes for a trace
>> from their side are low (Symantec Messaging Gateway, i suppose without
> root
>> access).
>
> Ah. If it is a mail gateway, the question might be whether their gateway
> starts to send the message on to their mail server (cut-through delivery)
> or not.
>
> Does the Symantec Messaging Gateway advertise PIPELINING and do you use it
?
>
>> --
>> Marc
>>
>> -----Urspr?ngliche Nachricht-----
>> Von: Exim-users
[mailto:exim-users-bounces+tt-admin=intranett.de@exim.org]
>> Im Auftrag von Evgeniy Berdnikov via Exim-users
>> Gesendet: Mittwoch, 30. M?rz 2022 19:56
>> An: exim-users@exim.org
>> Betreff: Re: [exim] 2 hours delay (gnutls_handshake): timed out:
> delivering
>> unencrypted to
>>
>> On Wed, Mar 30, 2022 at 02:53:45PM +0200, tt-admin via Exim-users wrote:
>>> Continuation of the strace;
>>>
>>> 6649 select(8, [7], NULL, NULL, {tv_sec=60, tv_usec=0} <unfinished ...>
>>> 6671 <... recvfrom resumed> 0x56352d0bd71b, 324, 0, NULL, NULL) = -1
>>> ECONNRESET (Connection reset by peer)
>>> 6671 alarm(0) = 0
>>> 6671 sendmsg(7, {msg_name=NULL, msg_namelen=0,
>>> msg_iov=[{iov_base="\25\3\3\0\2\2Z", iov_len=7}], msg_iovlen=1,
>>> msg_controllen=0, msg_flags=0}, 0) = -1 EPIPE (Broken pipe)
>>> 6671 --- SIGPIPE {si_signo=SIGPIPE, si_code=SI_USER, si_pid=6671,
>>> si_uid=111} ---
>>
>> Diring TLS exchange server sent RST, client got ECONNRESET and tried to
>> send TLS closing packet (it's my guess what "\25\3\3\0\2\2Z" means),
>> so client got EPIPE because connection has been deleted by kernel on RST.
>>
>> The question is why server has dropped connection. Probably this
> connection
>> was timed out -- on the server's side, not on the client's one.
>> Traffic dump from both ends could make this situation clean.
>>
>>> 6671 openat(AT_FDCWD, "/var/log/exim4/mainlog",
>>> O_WRONLY|O_CREAT|O_APPEND|O_NONBLOCK, 0640) = 5
>>> 6671 fcntl(5, F_GETFD) = 0
>>> 6671 fcntl(5, F_SETFD, FD_CLOEXEC) = 0
>>> 6671 fcntl(5, F_GETFL) = 0x8c01 (flags
>>> O_WRONLY|O_APPEND|O_NONBLOCK|O_LARGEFILE)
>>> 6671 fcntl(5, F_SETFL, O_WRONLY|O_APPEND|O_LARGEFILE) = 0
>>> 6671 fstat(5, {st_mode=S_IFREG|0640, st_size=15402412, ...}) = 0
>>> 6671 write(5, "2022-03-30 12:25:33.594 [6671] 1"..., 187) = 187
>>
>> You have better to increase strace display length with -s.., or post
>> full log line here.
>>
>>> 6671 socket(AF_INET, SOCK_STREAM, IPPROTO_IP) = 7
>>> 6671 setsockopt(7, SOL_TCP, TCP_NODELAY, [1], 4) = 0
>>> 6671 alarm(300) = 0
>>>
>>> Then the unencrypted smtp connections starts and the mesage is delivered
>>> unencrypted.
>>
>> I could imagine one reason for such server's behaviour: presence of some
>> statefull firewall behind server, which scans SMTP packets and drops
>> packets with "suspicious" patterns. If it drop some TLS packet from
> client,
>> it would also drop all its copies on retransmissions, so TLS session
>> become disrupted and tcp connection dies on timeout.
>>
>> However, in this scenario messages should not wait in queue for hours,
>> they should be delivered in plain SMTP just after first TLS timeout.
>> --
>> Eugene Berdnikov
>>
>> --
>> ## List details at https://lists.exim.org/mailman/listinfo/exim-users
>> ## Exim details at http://www.exim.org/
>> ## Please use the Wiki with this list - http://wiki.exim.org/
>>
>>
>> --
>> ## List details at https://lists.exim.org/mailman/listinfo/exim-users
>> ## Exim details at http://www.exim.org/
>> ## Please use the Wiki with this list - http://wiki.exim.org/
>>
>
> --
> Andrew C. Aitchison Kendal, UK
> andrew@aitchison.me.uk
> --
> ## List details at https://lists.exim.org/mailman/listinfo/exim-users
> ## Exim details at http://www.exim.org/
> ## Please use the Wiki with this list - http://wiki.exim.org/
>
>
> --
> ## List details at https://lists.exim.org/mailman/listinfo/exim-users
> ## Exim details at http://www.exim.org/
> ## Please use the Wiki with this list - http://wiki.exim.org/
>

--
Andrew C. Aitchison Kendal, UK
andrew@aitchison.me.uk
--
## List details at https://lists.exim.org/mailman/listinfo/exim-users
## Exim details at http://www.exim.org/
## Please use the Wiki with this list - http://wiki.exim.org/


--
## List details at https://lists.exim.org/mailman/listinfo/exim-users
## Exim details at http://www.exim.org/
## Please use the Wiki with this list - http://wiki.exim.org/
Re: 2 hours delay (gnutls_handshake): timed out: delivering unencrypted to [ In reply to ]
On Wed, Apr 27, 2022 at 01:31:27PM +0200, tt-admin via Exim-users wrote:
> Well, seems like the remote site managed to install an update for their
> Symantec E-Mail Gateways and since then - the error disappeared. We still
> run on Exim v4.90.

You were told (3 weeks ago) that it looks like erroneous packet filtering
by statefull firewall. The only news is that its name is Symantec.

Waiting for 2-5 years is a cool method to "resolve" every problem with
minimum efforts. :)


> -----Urspr?ngliche Nachricht-----
> Von: Exim-users [mailto:exim-users-bounces+tt-admin=intranett.de@exim.org]
> Im Auftrag von Andrew C Aitchison via Exim-users
> Gesendet: Sonntag, 24. April 2022 11:37
> An: tt-admin
> Cc: exim-users@exim.org
> Betreff: Re: [exim] 2 hours delay (gnutls_handshake): timed out: delivering
> unencrypted to
>
> On Tue, 19 Apr 2022, tt-admin via Exim-users wrote:
>
> > Just finished testing - the problem still occurs when we use Ubuntu 20.04
> > with Exim version 4.93 #5 (built 28-Apr-2021 13:19:17) (our older machine
> > where the problem was first monitored is v4.90 and Ubuntu 18.04).
>
> Exim v4.93 was released in 2019, so still not very new.
> Ubuntu 21.10 has exim 4.94.2 and 22.04 (released this week) has exim 4.95.
>
> Any chance of seeing whether a newer exim, perhaps locally built, helps ?
>
> > -----Urspr?ngliche Nachricht-----
> > Von: Exim-users [mailto:exim-users-bounces+tt-admin=intranett.de@exim.org]
> > Im Auftrag von Andrew C Aitchison via Exim-users
> > Gesendet: Donnerstag, 31. M?rz 2022 12:35
> > An: tt-admin
> > Cc: exim-users@exim.org
> > Betreff: Re: [exim] 2 hours delay (gnutls_handshake): timed out:
> delivering
> > unencrypted to
> >
> > On Thu, 31 Mar 2022, tt-admin via Exim-users wrote:
> >
> >> exiwhat says: waiting for a remote delivery subprocess to finish
> >>
> >> started a new strace with longer lines (-s 256) and date column.
> >>
> >> I will try to contact the remote administrator, but my hopes for a trace
> >> from their side are low (Symantec Messaging Gateway, i suppose without
> > root
> >> access).
> >
> > Ah. If it is a mail gateway, the question might be whether their gateway
> > starts to send the message on to their mail server (cut-through delivery)
> > or not.
> >
> > Does the Symantec Messaging Gateway advertise PIPELINING and do you use it
> ?
> >
> >> --
> >> Marc
> >>
> >> -----Urspr?ngliche Nachricht-----
> >> Von: Exim-users
> [mailto:exim-users-bounces+tt-admin=intranett.de@exim.org]
> >> Im Auftrag von Evgeniy Berdnikov via Exim-users
> >> Gesendet: Mittwoch, 30. M?rz 2022 19:56
> >> An: exim-users@exim.org
> >> Betreff: Re: [exim] 2 hours delay (gnutls_handshake): timed out:
> > delivering
> >> unencrypted to
> >>
> >> On Wed, Mar 30, 2022 at 02:53:45PM +0200, tt-admin via Exim-users wrote:
> >>> Continuation of the strace;
> >>>
> >>> 6649 select(8, [7], NULL, NULL, {tv_sec=60, tv_usec=0} <unfinished ...>
> >>> 6671 <... recvfrom resumed> 0x56352d0bd71b, 324, 0, NULL, NULL) = -1
> >>> ECONNRESET (Connection reset by peer)
> >>> 6671 alarm(0) = 0
> >>> 6671 sendmsg(7, {msg_name=NULL, msg_namelen=0,
> >>> msg_iov=[{iov_base="\25\3\3\0\2\2Z", iov_len=7}], msg_iovlen=1,
> >>> msg_controllen=0, msg_flags=0}, 0) = -1 EPIPE (Broken pipe)
> >>> 6671 --- SIGPIPE {si_signo=SIGPIPE, si_code=SI_USER, si_pid=6671,
> >>> si_uid=111} ---
> >>
> >> Diring TLS exchange server sent RST, client got ECONNRESET and tried to
> >> send TLS closing packet (it's my guess what "\25\3\3\0\2\2Z" means),
> >> so client got EPIPE because connection has been deleted by kernel on RST.
> >>
> >> The question is why server has dropped connection. Probably this
> > connection
> >> was timed out -- on the server's side, not on the client's one.
> >> Traffic dump from both ends could make this situation clean.
> >>
> >>> 6671 openat(AT_FDCWD, "/var/log/exim4/mainlog",
> >>> O_WRONLY|O_CREAT|O_APPEND|O_NONBLOCK, 0640) = 5
> >>> 6671 fcntl(5, F_GETFD) = 0
> >>> 6671 fcntl(5, F_SETFD, FD_CLOEXEC) = 0
> >>> 6671 fcntl(5, F_GETFL) = 0x8c01 (flags
> >>> O_WRONLY|O_APPEND|O_NONBLOCK|O_LARGEFILE)
> >>> 6671 fcntl(5, F_SETFL, O_WRONLY|O_APPEND|O_LARGEFILE) = 0
> >>> 6671 fstat(5, {st_mode=S_IFREG|0640, st_size=15402412, ...}) = 0
> >>> 6671 write(5, "2022-03-30 12:25:33.594 [6671] 1"..., 187) = 187
> >>
> >> You have better to increase strace display length with -s.., or post
> >> full log line here.
> >>
> >>> 6671 socket(AF_INET, SOCK_STREAM, IPPROTO_IP) = 7
> >>> 6671 setsockopt(7, SOL_TCP, TCP_NODELAY, [1], 4) = 0
> >>> 6671 alarm(300) = 0
> >>>
> >>> Then the unencrypted smtp connections starts and the mesage is delivered
> >>> unencrypted.
> >>
> >> I could imagine one reason for such server's behaviour: presence of some
> >> statefull firewall behind server, which scans SMTP packets and drops
> >> packets with "suspicious" patterns. If it drop some TLS packet from
> > client,
> >> it would also drop all its copies on retransmissions, so TLS session
> >> become disrupted and tcp connection dies on timeout.
> >>
> >> However, in this scenario messages should not wait in queue for hours,
> >> they should be delivered in plain SMTP just after first TLS timeout.
> >> --
> >> Eugene Berdnikov
> >>
> >> --
> >> ## List details at https://lists.exim.org/mailman/listinfo/exim-users
> >> ## Exim details at http://www.exim.org/
> >> ## Please use the Wiki with this list - http://wiki.exim.org/
> >>
> >>
> >> --
> >> ## List details at https://lists.exim.org/mailman/listinfo/exim-users
> >> ## Exim details at http://www.exim.org/
> >> ## Please use the Wiki with this list - http://wiki.exim.org/
> >>
> >
> > --
> > Andrew C. Aitchison Kendal, UK
> > andrew@aitchison.me.uk
> > --
> > ## List details at https://lists.exim.org/mailman/listinfo/exim-users
> > ## Exim details at http://www.exim.org/
> > ## Please use the Wiki with this list - http://wiki.exim.org/
> >
> >
> > --
> > ## List details at https://lists.exim.org/mailman/listinfo/exim-users
> > ## Exim details at http://www.exim.org/
> > ## Please use the Wiki with this list - http://wiki.exim.org/
> >
>
> --
> Andrew C. Aitchison Kendal, UK
> andrew@aitchison.me.uk
> --
> ## List details at https://lists.exim.org/mailman/listinfo/exim-users
> ## Exim details at http://www.exim.org/
> ## Please use the Wiki with this list - http://wiki.exim.org/
>
>
> --
> ## List details at https://lists.exim.org/mailman/listinfo/exim-users
> ## Exim details at http://www.exim.org/
> ## Please use the Wiki with this list - http://wiki.exim.org/
>
>
>

--
Eugene Berdnikov

--
## List details at https://lists.exim.org/mailman/listinfo/exim-users
## Exim details at http://www.exim.org/
## Please use the Wiki with this list - http://wiki.exim.org/