Mailing List Archive

[DBMail 0001081]: DBmail ABEND'ing upon LDAP access error.
A NOTE has been added to this issue.
======================================================================
http://dbmail.org/mantis/view.php?id=1081
======================================================================
Reported By: PeterS
Assigned To:
======================================================================
Project: DBMail
Issue ID: 1081
Category: Authentication layer
Reproducibility: always
Severity: major
Priority: normal
Status: new
target:
======================================================================
Date Submitted: 10-Aug-16 23:27 CEST
Last Modified: 07-Sep-16 23:37 CEST
======================================================================
Summary: DBmail ABEND'ing upon LDAP access error.
Description:
I've been trying to find a version of DBmail 3.X that does not ABEND while
it is accessing LDAP. I have not been able to determine the sequence of
events that causes the ABEND, however it happens to me many many times a
day. I've written a script around the execution of DBmail to ensure that
it restarts when it ABENDs.
======================================================================

----------------------------------------------------------------------
(0003740) PeterS (reporter) - 11-Aug-16 19:57
http://dbmail.org/mantis/view.php?id=1081#c3740
----------------------------------------------------------------------
Persuant to the "Bug posting etiquette", yesterday I pulled a clean,
newest, version of DBmail from GIT, compiled it, installed it, cleared the
dbmail.err log, and started it. Since then I have received 14 ABENDs.

[compiler@swlx143 dbmail-git]$ date ; git status -u no
Thu Aug 11 13:02:11 CDT 2016
# On branch master
nothing to commit (working directory clean)
[compiler@swlx143 dbmail-git]$

----------------------------------------------------------------------
(0003745) PeterS (reporter) - 07-Sep-16 23:37
http://dbmail.org/mantis/view.php?id=1081#c3745
----------------------------------------------------------------------
Today I migrated my database over to new hardware and am receiving the same
error.

dbmail-imapd: sasl.c:75: ldap_sasl_bind: Assertion `(
(ld)->ldc->ldc_options.ldo_valid == 0x2 )' failed.

Red Hat RHEL7
dbmail-auth-ldap-3.2.3-1.el7.x86_64
dbmail-3.2.3-1.el7.x86_64
postgresql-9.2.15-1.el7_2.x86_64
postgresql-server-9.2.15-1.el7_2.x86_64

Issue History
Date Modified Username Field Change
======================================================================
10-Aug-16 23:27 PeterS New Issue
10-Aug-16 23:27 PeterS File Added: dbmail.err
11-Aug-16 19:57 PeterS Note Added: 0003740
07-Sep-16 23:37 PeterS Note Added: 0003745
======================================================================

_______________________________________________
Dbmail-dev mailing list
Dbmail-dev@dbmail.org
http://mailman.fastxs.nl/cgi-bin/mailman/listinfo/dbmail-dev
[DBMail 0001081]: DBmail ABEND'ing upon LDAP access error. [ In reply to ]
A NOTE has been added to this issue.
======================================================================
http://dbmail.org/mantis/view.php?id=1081
======================================================================
Reported By: PeterS
Assigned To:
======================================================================
Project: DBMail
Issue ID: 1081
Category: Authentication layer
Reproducibility: always
Severity: major
Priority: normal
Status: new
target:
======================================================================
Date Submitted: 10-Aug-16 23:27 CEST
Last Modified: 08-Sep-16 19:01 CEST
======================================================================
Summary: DBmail ABEND'ing upon LDAP access error.
Description:
I've been trying to find a version of DBmail 3.X that does not ABEND while
it is accessing LDAP. I have not been able to determine the sequence of
events that causes the ABEND, however it happens to me many many times a
day. I've written a script around the execution of DBmail to ensure that
it restarts when it ABENDs.
======================================================================

----------------------------------------------------------------------
(0003740) PeterS (reporter) - 11-Aug-16 19:57
http://dbmail.org/mantis/view.php?id=1081#c3740
----------------------------------------------------------------------
Persuant to the "Bug posting etiquette", yesterday I pulled a clean,
newest, version of DBmail from GIT, compiled it, installed it, cleared the
dbmail.err log, and started it. Since then I have received 14 ABENDs.

[compiler@swlx143 dbmail-git]$ date ; git status -u no
Thu Aug 11 13:02:11 CDT 2016
# On branch master
nothing to commit (working directory clean)
[compiler@swlx143 dbmail-git]$

----------------------------------------------------------------------
(0003745) PeterS (reporter) - 07-Sep-16 23:37
http://dbmail.org/mantis/view.php?id=1081#c3745
----------------------------------------------------------------------
Today I migrated my database over to new hardware and am receiving the same
error.

dbmail-imapd: sasl.c:75: ldap_sasl_bind: Assertion `(
(ld)->ldc->ldc_options.ldo_valid == 0x2 )' failed.

Red Hat RHEL7
dbmail-auth-ldap-3.2.3-1.el7.x86_64
dbmail-3.2.3-1.el7.x86_64
postgresql-9.2.15-1.el7_2.x86_64
postgresql-server-9.2.15-1.el7_2.x86_64

----------------------------------------------------------------------
(0003746) alan (reporter) - 08-Sep-16 19:01
http://dbmail.org/mantis/view.php?id=1081#c3746
----------------------------------------------------------------------
This error appears to be ldap connection related. Error reports elsewhere
suggest either something in /etc/openldap/ldap.conf or library
compatibility following updates to openssl. Are there any certificates
missing?
Are you able to compile dbmail locally, this may address library issues.

Issue History
Date Modified Username Field Change
======================================================================
10-Aug-16 23:27 PeterS New Issue
10-Aug-16 23:27 PeterS File Added: dbmail.err
11-Aug-16 19:57 PeterS Note Added: 0003740
07-Sep-16 23:37 PeterS Note Added: 0003745
08-Sep-16 19:01 alan Note Added: 0003746
======================================================================

_______________________________________________
Dbmail-dev mailing list
Dbmail-dev@dbmail.org
http://mailman.fastxs.nl/cgi-bin/mailman/listinfo/dbmail-dev
[DBMail 0001081]: DBmail ABEND'ing upon LDAP access error. [ In reply to ]
A NOTE has been added to this issue.
======================================================================
http://www.dbmail.org/mantis/view.php?id=1081
======================================================================
Reported By: PeterS
Assigned To:
======================================================================
Project: DBMail
Issue ID: 1081
Category: Authentication layer
Reproducibility: always
Severity: major
Priority: normal
Status: new
target:
======================================================================
Date Submitted: 10-Aug-16 23:27 CEST
Last Modified: 15-Sep-16 17:31 CEST
======================================================================
Summary: DBmail ABEND'ing upon LDAP access error.
Description:
I've been trying to find a version of DBmail 3.X that does not ABEND while
it is accessing LDAP. I have not been able to determine the sequence of
events that causes the ABEND, however it happens to me many many times a
day. I've written a script around the execution of DBmail to ensure that
it restarts when it ABENDs.
======================================================================

----------------------------------------------------------------------
(0003740) PeterS (reporter) - 11-Aug-16 19:57
http://www.dbmail.org/mantis/view.php?id=1081#c3740
----------------------------------------------------------------------
Persuant to the "Bug posting etiquette", yesterday I pulled a clean,
newest, version of DBmail from GIT, compiled it, installed it, cleared the
dbmail.err log, and started it. Since then I have received 14 ABENDs.

[compiler@swlx143 dbmail-git]$ date ; git status -u no
Thu Aug 11 13:02:11 CDT 2016
# On branch master
nothing to commit (working directory clean)
[compiler@swlx143 dbmail-git]$

----------------------------------------------------------------------
(0003745) PeterS (reporter) - 07-Sep-16 23:37
http://www.dbmail.org/mantis/view.php?id=1081#c3745
----------------------------------------------------------------------
Today I migrated my database over to new hardware and am receiving the same
error.

dbmail-imapd: sasl.c:75: ldap_sasl_bind: Assertion `(
(ld)->ldc->ldc_options.ldo_valid == 0x2 )' failed.

Red Hat RHEL7
dbmail-auth-ldap-3.2.3-1.el7.x86_64
dbmail-3.2.3-1.el7.x86_64
postgresql-9.2.15-1.el7_2.x86_64
postgresql-server-9.2.15-1.el7_2.x86_64

----------------------------------------------------------------------
(0003746) alan (reporter) - 08-Sep-16 19:01
http://www.dbmail.org/mantis/view.php?id=1081#c3746
----------------------------------------------------------------------
This error appears to be ldap connection related. Error reports elsewhere
suggest either something in /etc/openldap/ldap.conf or library
compatibility following updates to openssl. Are there any certificates
missing?
Are you able to compile dbmail locally, this may address library issues.

----------------------------------------------------------------------
(0003749) PeterS (reporter) - 15-Sep-16 17:31
http://www.dbmail.org/mantis/view.php?id=1081#c3749
----------------------------------------------------------------------
Alan, I have already tried the latest build(s) of DBmail to no avail. I've
disabled all certificate handling in OPENLDAP using both:

TLS_REQCERT never

and also:

TLS_REQCERT allow

with no change in results. We have multiple Microsoft AD LDAP servers
here, with varying certificates that aren't really verifiable, so I have
always disabled tracking or requiring valid certificates. We have other
Linux systems using LDAP/OPENLDAP without any issues using this same
config.

Issue History
Date Modified Username Field Change
======================================================================
10-Aug-16 23:27 PeterS New Issue
10-Aug-16 23:27 PeterS File Added: dbmail.err
11-Aug-16 19:57 PeterS Note Added: 0003740
07-Sep-16 23:37 PeterS Note Added: 0003745
08-Sep-16 19:01 alan Note Added: 0003746
15-Sep-16 17:31 PeterS Note Added: 0003749
======================================================================

_______________________________________________
Dbmail-dev mailing list
Dbmail-dev@dbmail.org
http://mailman.fastxs.nl/cgi-bin/mailman/listinfo/dbmail-dev
[DBMail 0001081]: DBmail ABEND'ing upon LDAP access error. [ In reply to ]
A NOTE has been added to this issue.
======================================================================
http://www.dbmail.org/mantis/view.php?id=1081
======================================================================
Reported By: PeterS
Assigned To:
======================================================================
Project: DBMail
Issue ID: 1081
Category: Authentication layer
Reproducibility: always
Severity: major
Priority: normal
Status: new
target:
======================================================================
Date Submitted: 10-Aug-16 23:27 CEST
Last Modified: 28-Sep-16 20:24 CEST
======================================================================
Summary: DBmail ABEND'ing upon LDAP access error.
Description:
I've been trying to find a version of DBmail 3.X that does not ABEND while
it is accessing LDAP. I have not been able to determine the sequence of
events that causes the ABEND, however it happens to me many many times a
day. I've written a script around the execution of DBmail to ensure that
it restarts when it ABENDs.
======================================================================

----------------------------------------------------------------------
(0003740) PeterS (reporter) - 11-Aug-16 19:57
http://www.dbmail.org/mantis/view.php?id=1081#c3740
----------------------------------------------------------------------
Persuant to the "Bug posting etiquette", yesterday I pulled a clean,
newest, version of DBmail from GIT, compiled it, installed it, cleared the
dbmail.err log, and started it. Since then I have received 14 ABENDs.

[compiler@swlx143 dbmail-git]$ date ; git status -u no
Thu Aug 11 13:02:11 CDT 2016
# On branch master
nothing to commit (working directory clean)
[compiler@swlx143 dbmail-git]$

----------------------------------------------------------------------
(0003745) PeterS (reporter) - 07-Sep-16 23:37
http://www.dbmail.org/mantis/view.php?id=1081#c3745
----------------------------------------------------------------------
Today I migrated my database over to new hardware and am receiving the same
error.

dbmail-imapd: sasl.c:75: ldap_sasl_bind: Assertion `(
(ld)->ldc->ldc_options.ldo_valid == 0x2 )' failed.

Red Hat RHEL7
dbmail-auth-ldap-3.2.3-1.el7.x86_64
dbmail-3.2.3-1.el7.x86_64
postgresql-9.2.15-1.el7_2.x86_64
postgresql-server-9.2.15-1.el7_2.x86_64

----------------------------------------------------------------------
(0003746) alan (reporter) - 08-Sep-16 19:01
http://www.dbmail.org/mantis/view.php?id=1081#c3746
----------------------------------------------------------------------
This error appears to be ldap connection related. Error reports elsewhere
suggest either something in /etc/openldap/ldap.conf or library
compatibility following updates to openssl. Are there any certificates
missing?
Are you able to compile dbmail locally, this may address library issues.

----------------------------------------------------------------------
(0003749) PeterS (reporter) - 15-Sep-16 17:31
http://www.dbmail.org/mantis/view.php?id=1081#c3749
----------------------------------------------------------------------
Alan, I have already tried the latest build(s) of DBmail to no avail. I've
disabled all certificate handling in OPENLDAP using both:

TLS_REQCERT never

and also:

TLS_REQCERT allow

with no change in results. We have multiple Microsoft AD LDAP servers
here, with varying certificates that aren't really verifiable, so I have
always disabled tracking or requiring valid certificates. We have other
Linux systems using LDAP/OPENLDAP without any issues using this same
config.

----------------------------------------------------------------------
(0003751) alan (reporter) - 28-Sep-16 20:24
http://www.dbmail.org/mantis/view.php?id=1081#c3751
----------------------------------------------------------------------
Confirm I'm able to reproduce the issue, think I've tracked down the parts
of the code causing it. Will test asap.

Issue History
Date Modified Username Field Change
======================================================================
10-Aug-16 23:27 PeterS New Issue
10-Aug-16 23:27 PeterS File Added: dbmail.err
11-Aug-16 19:57 PeterS Note Added: 0003740
07-Sep-16 23:37 PeterS Note Added: 0003745
08-Sep-16 19:01 alan Note Added: 0003746
15-Sep-16 17:31 PeterS Note Added: 0003749
28-Sep-16 20:24 alan Note Added: 0003751
======================================================================

_______________________________________________
Dbmail-dev mailing list
Dbmail-dev@dbmail.org
http://mailman.fastxs.nl/cgi-bin/mailman/listinfo/dbmail-dev
[DBMail 0001081]: DBmail ABEND'ing upon LDAP access error. [ In reply to ]
A NOTE has been added to this issue.
======================================================================
http://dbmail.org/mantis/view.php?id=1081
======================================================================
Reported By: PeterS
Assigned To:
======================================================================
Project: DBMail
Issue ID: 1081
Category: Authentication layer
Reproducibility: always
Severity: major
Priority: normal
Status: new
target:
======================================================================
Date Submitted: 10-Aug-16 23:27 CEST
Last Modified: 30-Sep-16 16:58 CEST
======================================================================
Summary: DBmail ABEND'ing upon LDAP access error.
Description:
I've been trying to find a version of DBmail 3.X that does not ABEND while
it is accessing LDAP. I have not been able to determine the sequence of
events that causes the ABEND, however it happens to me many many times a
day. I've written a script around the execution of DBmail to ensure that
it restarts when it ABENDs.
======================================================================

----------------------------------------------------------------------
(0003740) PeterS (reporter) - 11-Aug-16 19:57
http://dbmail.org/mantis/view.php?id=1081#c3740
----------------------------------------------------------------------
Persuant to the "Bug posting etiquette", yesterday I pulled a clean,
newest, version of DBmail from GIT, compiled it, installed it, cleared the
dbmail.err log, and started it. Since then I have received 14 ABENDs.

[compiler@swlx143 dbmail-git]$ date ; git status -u no
Thu Aug 11 13:02:11 CDT 2016
# On branch master
nothing to commit (working directory clean)
[compiler@swlx143 dbmail-git]$

----------------------------------------------------------------------
(0003745) PeterS (reporter) - 07-Sep-16 23:37
http://dbmail.org/mantis/view.php?id=1081#c3745
----------------------------------------------------------------------
Today I migrated my database over to new hardware and am receiving the same
error.

dbmail-imapd: sasl.c:75: ldap_sasl_bind: Assertion `(
(ld)->ldc->ldc_options.ldo_valid == 0x2 )' failed.

Red Hat RHEL7
dbmail-auth-ldap-3.2.3-1.el7.x86_64
dbmail-3.2.3-1.el7.x86_64
postgresql-9.2.15-1.el7_2.x86_64
postgresql-server-9.2.15-1.el7_2.x86_64

----------------------------------------------------------------------
(0003746) alan (reporter) - 08-Sep-16 19:01
http://dbmail.org/mantis/view.php?id=1081#c3746
----------------------------------------------------------------------
This error appears to be ldap connection related. Error reports elsewhere
suggest either something in /etc/openldap/ldap.conf or library
compatibility following updates to openssl. Are there any certificates
missing?
Are you able to compile dbmail locally, this may address library issues.

----------------------------------------------------------------------
(0003749) PeterS (reporter) - 15-Sep-16 17:31
http://dbmail.org/mantis/view.php?id=1081#c3749
----------------------------------------------------------------------
Alan, I have already tried the latest build(s) of DBmail to no avail. I've
disabled all certificate handling in OPENLDAP using both:

TLS_REQCERT never

and also:

TLS_REQCERT allow

with no change in results. We have multiple Microsoft AD LDAP servers
here, with varying certificates that aren't really verifiable, so I have
always disabled tracking or requiring valid certificates. We have other
Linux systems using LDAP/OPENLDAP without any issues using this same
config.

----------------------------------------------------------------------
(0003751) alan (reporter) - 28-Sep-16 20:24
http://dbmail.org/mantis/view.php?id=1081#c3751
----------------------------------------------------------------------
Confirm I'm able to reproduce the issue, think I've tracked down the parts
of the code causing it. Will test asap.

----------------------------------------------------------------------
(0003752) alan (reporter) - 30-Sep-16 16:58
http://dbmail.org/mantis/view.php?id=1081#c3752
----------------------------------------------------------------------
This issues appears to be fixed using the attached patch, please test.

Issue History
Date Modified Username Field Change
======================================================================
10-Aug-16 23:27 PeterS New Issue
10-Aug-16 23:27 PeterS File Added: dbmail.err
11-Aug-16 19:57 PeterS Note Added: 0003740
07-Sep-16 23:37 PeterS Note Added: 0003745
08-Sep-16 19:01 alan Note Added: 0003746
15-Sep-16 17:31 PeterS Note Added: 0003749
28-Sep-16 20:24 alan Note Added: 0003751
30-Sep-16 16:58 alan Note Added: 0003752
======================================================================

_______________________________________________
Dbmail-dev mailing list
Dbmail-dev@dbmail.org
http://mailman.fastxs.nl/cgi-bin/mailman/listinfo/dbmail-dev
[DBMail 0001081]: DBmail ABEND'ing upon LDAP access error. [ In reply to ]
A NOTE has been added to this issue.
======================================================================
http://dbmail.org/mantis/view.php?id=1081
======================================================================
Reported By: PeterS
Assigned To:
======================================================================
Project: DBMail
Issue ID: 1081
Category: Authentication layer
Reproducibility: always
Severity: major
Priority: normal
Status: new
target:
======================================================================
Date Submitted: 10-Aug-16 23:27 CEST
Last Modified: 30-Sep-16 23:37 CEST
======================================================================
Summary: DBmail ABEND'ing upon LDAP access error.
Description:
I've been trying to find a version of DBmail 3.X that does not ABEND while
it is accessing LDAP. I have not been able to determine the sequence of
events that causes the ABEND, however it happens to me many many times a
day. I've written a script around the execution of DBmail to ensure that
it restarts when it ABENDs.
======================================================================

----------------------------------------------------------------------
(0003740) PeterS (reporter) - 11-Aug-16 19:57
http://dbmail.org/mantis/view.php?id=1081#c3740
----------------------------------------------------------------------
Persuant to the "Bug posting etiquette", yesterday I pulled a clean,
newest, version of DBmail from GIT, compiled it, installed it, cleared the
dbmail.err log, and started it. Since then I have received 14 ABENDs.

[compiler@swlx143 dbmail-git]$ date ; git status -u no
Thu Aug 11 13:02:11 CDT 2016
# On branch master
nothing to commit (working directory clean)
[compiler@swlx143 dbmail-git]$

----------------------------------------------------------------------
(0003745) PeterS (reporter) - 07-Sep-16 23:37
http://dbmail.org/mantis/view.php?id=1081#c3745
----------------------------------------------------------------------
Today I migrated my database over to new hardware and am receiving the same
error.

dbmail-imapd: sasl.c:75: ldap_sasl_bind: Assertion `(
(ld)->ldc->ldc_options.ldo_valid == 0x2 )' failed.

Red Hat RHEL7
dbmail-auth-ldap-3.2.3-1.el7.x86_64
dbmail-3.2.3-1.el7.x86_64
postgresql-9.2.15-1.el7_2.x86_64
postgresql-server-9.2.15-1.el7_2.x86_64

----------------------------------------------------------------------
(0003746) alan (reporter) - 08-Sep-16 19:01
http://dbmail.org/mantis/view.php?id=1081#c3746
----------------------------------------------------------------------
This error appears to be ldap connection related. Error reports elsewhere
suggest either something in /etc/openldap/ldap.conf or library
compatibility following updates to openssl. Are there any certificates
missing?
Are you able to compile dbmail locally, this may address library issues.

----------------------------------------------------------------------
(0003749) PeterS (reporter) - 15-Sep-16 17:31
http://dbmail.org/mantis/view.php?id=1081#c3749
----------------------------------------------------------------------
Alan, I have already tried the latest build(s) of DBmail to no avail. I've
disabled all certificate handling in OPENLDAP using both:

TLS_REQCERT never

and also:

TLS_REQCERT allow

with no change in results. We have multiple Microsoft AD LDAP servers
here, with varying certificates that aren't really verifiable, so I have
always disabled tracking or requiring valid certificates. We have other
Linux systems using LDAP/OPENLDAP without any issues using this same
config.

----------------------------------------------------------------------
(0003751) alan (reporter) - 28-Sep-16 20:24
http://dbmail.org/mantis/view.php?id=1081#c3751
----------------------------------------------------------------------
Confirm I'm able to reproduce the issue, think I've tracked down the parts
of the code causing it. Will test asap.

----------------------------------------------------------------------
(0003752) alan (reporter) - 30-Sep-16 16:58
http://dbmail.org/mantis/view.php?id=1081#c3752
----------------------------------------------------------------------
This issues appears to be fixed using the attached patch, please test.

----------------------------------------------------------------------
(0003753) PeterS (reporter) - 30-Sep-16 23:37
http://dbmail.org/mantis/view.php?id=1081#c3753
----------------------------------------------------------------------
I've patched the Red Hat RHEL 7 (EPEL 7) version of DBmail
(dbmail-3.2.3-1.el7.x86_64, dbmail-auth-ldap-3.2.3-1.el7.x86_64, and
dbmail-debuginfo-3.2.3-1.el7.x86_64) with
dbmail_src_modules_authldap.c-20160930.diff and receive the attached error,
dbmail.20160930-1642.err . The service still ABEND's.

Issue History
Date Modified Username Field Change
======================================================================
10-Aug-16 23:27 PeterS New Issue
10-Aug-16 23:27 PeterS File Added: dbmail.err
11-Aug-16 19:57 PeterS Note Added: 0003740
07-Sep-16 23:37 PeterS Note Added: 0003745
08-Sep-16 19:01 alan Note Added: 0003746
15-Sep-16 17:31 PeterS Note Added: 0003749
28-Sep-16 20:24 alan Note Added: 0003751
30-Sep-16 16:58 alan Note Added: 0003752
30-Sep-16 17:01 alan File Added: dbmail_src_modules_authldap.c.diff

30-Sep-16 18:15 alan Issue Monitored: alan
30-Sep-16 18:17 alan File Added:
dbmail_src_modules_authldap.c-20160930.diff
30-Sep-16 23:37 PeterS Note Added: 0003753
======================================================================

_______________________________________________
Dbmail-dev mailing list
Dbmail-dev@dbmail.org
http://mailman.fastxs.nl/cgi-bin/mailman/listinfo/dbmail-dev
[DBMail 0001081]: DBmail ABEND'ing upon LDAP access error. [ In reply to ]
A NOTE has been added to this issue.
======================================================================
http://dbmail.org/mantis/view.php?id=1081
======================================================================
Reported By: PeterS
Assigned To:
======================================================================
Project: DBMail
Issue ID: 1081
Category: Authentication layer
Reproducibility: always
Severity: major
Priority: normal
Status: new
target:
======================================================================
Date Submitted: 10-Aug-16 23:27 CEST
Last Modified: 01-Oct-16 00:18 CEST
======================================================================
Summary: DBmail ABEND'ing upon LDAP access error.
Description:
I've been trying to find a version of DBmail 3.X that does not ABEND while
it is accessing LDAP. I have not been able to determine the sequence of
events that causes the ABEND, however it happens to me many many times a
day. I've written a script around the execution of DBmail to ensure that
it restarts when it ABENDs.
======================================================================

----------------------------------------------------------------------
(0003740) PeterS (reporter) - 11-Aug-16 19:57
http://dbmail.org/mantis/view.php?id=1081#c3740
----------------------------------------------------------------------
Persuant to the "Bug posting etiquette", yesterday I pulled a clean,
newest, version of DBmail from GIT, compiled it, installed it, cleared the
dbmail.err log, and started it. Since then I have received 14 ABENDs.

[compiler@swlx143 dbmail-git]$ date ; git status -u no
Thu Aug 11 13:02:11 CDT 2016
# On branch master
nothing to commit (working directory clean)
[compiler@swlx143 dbmail-git]$

----------------------------------------------------------------------
(0003745) PeterS (reporter) - 07-Sep-16 23:37
http://dbmail.org/mantis/view.php?id=1081#c3745
----------------------------------------------------------------------
Today I migrated my database over to new hardware and am receiving the same
error.

dbmail-imapd: sasl.c:75: ldap_sasl_bind: Assertion `(
(ld)->ldc->ldc_options.ldo_valid == 0x2 )' failed.

Red Hat RHEL7
dbmail-auth-ldap-3.2.3-1.el7.x86_64
dbmail-3.2.3-1.el7.x86_64
postgresql-9.2.15-1.el7_2.x86_64
postgresql-server-9.2.15-1.el7_2.x86_64

----------------------------------------------------------------------
(0003746) alan (reporter) - 08-Sep-16 19:01
http://dbmail.org/mantis/view.php?id=1081#c3746
----------------------------------------------------------------------
This error appears to be ldap connection related. Error reports elsewhere
suggest either something in /etc/openldap/ldap.conf or library
compatibility following updates to openssl. Are there any certificates
missing?
Are you able to compile dbmail locally, this may address library issues.

----------------------------------------------------------------------
(0003749) PeterS (reporter) - 15-Sep-16 17:31
http://dbmail.org/mantis/view.php?id=1081#c3749
----------------------------------------------------------------------
Alan, I have already tried the latest build(s) of DBmail to no avail. I've
disabled all certificate handling in OPENLDAP using both:

TLS_REQCERT never

and also:

TLS_REQCERT allow

with no change in results. We have multiple Microsoft AD LDAP servers
here, with varying certificates that aren't really verifiable, so I have
always disabled tracking or requiring valid certificates. We have other
Linux systems using LDAP/OPENLDAP without any issues using this same
config.

----------------------------------------------------------------------
(0003751) alan (reporter) - 28-Sep-16 20:24
http://dbmail.org/mantis/view.php?id=1081#c3751
----------------------------------------------------------------------
Confirm I'm able to reproduce the issue, think I've tracked down the parts
of the code causing it. Will test asap.

----------------------------------------------------------------------
(0003752) alan (reporter) - 30-Sep-16 16:58
http://dbmail.org/mantis/view.php?id=1081#c3752
----------------------------------------------------------------------
This issues appears to be fixed using the attached patch, please test.

----------------------------------------------------------------------
(0003753) PeterS (reporter) - 30-Sep-16 23:37
http://dbmail.org/mantis/view.php?id=1081#c3753
----------------------------------------------------------------------
I've patched the Red Hat RHEL 7 (EPEL 7) version of DBmail
(dbmail-3.2.3-1.el7.x86_64, dbmail-auth-ldap-3.2.3-1.el7.x86_64, and
dbmail-debuginfo-3.2.3-1.el7.x86_64) with
dbmail_src_modules_authldap.c-20160930.diff and receive the attached error,
dbmail.20160930-1642.err . The service still ABEND's.

----------------------------------------------------------------------
(0003754) PeterS (reporter) - 01-Oct-16 00:18
http://dbmail.org/mantis/view.php?id=1081#c3754
----------------------------------------------------------------------
I've patched a GIT clone (git clone git://git.dbmail.eu/paul/dbmail) from
today with dbmail_src_modules_authldap.c-20160930.diff and receive the
attached error, dbmail.20160930-1721.err . The service still ABEND's.

Issue History
Date Modified Username Field Change
======================================================================
10-Aug-16 23:27 PeterS New Issue
10-Aug-16 23:27 PeterS File Added: dbmail.err
11-Aug-16 19:57 PeterS Note Added: 0003740
07-Sep-16 23:37 PeterS Note Added: 0003745
08-Sep-16 19:01 alan Note Added: 0003746
15-Sep-16 17:31 PeterS Note Added: 0003749
28-Sep-16 20:24 alan Note Added: 0003751
30-Sep-16 16:58 alan Note Added: 0003752
30-Sep-16 17:01 alan File Added: dbmail_src_modules_authldap.c.diff

30-Sep-16 18:15 alan Issue Monitored: alan
30-Sep-16 18:17 alan File Added:
dbmail_src_modules_authldap.c-20160930.diff
30-Sep-16 23:37 PeterS Note Added: 0003753
30-Sep-16 23:37 PeterS File Added: dbmail.20160930-1642.err

01-Oct-16 00:18 PeterS Note Added: 0003754
======================================================================

_______________________________________________
Dbmail-dev mailing list
Dbmail-dev@dbmail.org
http://mailman.fastxs.nl/cgi-bin/mailman/listinfo/dbmail-dev
[DBMail 0001081]: DBmail ABEND'ing upon LDAP access error. [ In reply to ]
A NOTE has been added to this issue.
======================================================================
http://dbmail.org/mantis/view.php?id=1081
======================================================================
Reported By: PeterS
Assigned To:
======================================================================
Project: DBMail
Issue ID: 1081
Category: Authentication layer
Reproducibility: always
Severity: major
Priority: normal
Status: new
target:
======================================================================
Date Submitted: 10-Aug-16 23:27 CEST
Last Modified: 01-Oct-16 15:36 CEST
======================================================================
Summary: DBmail ABEND'ing upon LDAP access error.
Description:
I've been trying to find a version of DBmail 3.X that does not ABEND while
it is accessing LDAP. I have not been able to determine the sequence of
events that causes the ABEND, however it happens to me many many times a
day. I've written a script around the execution of DBmail to ensure that
it restarts when it ABENDs.
======================================================================

----------------------------------------------------------------------
(0003740) PeterS (reporter) - 11-Aug-16 19:57
http://dbmail.org/mantis/view.php?id=1081#c3740
----------------------------------------------------------------------
Persuant to the "Bug posting etiquette", yesterday I pulled a clean,
newest, version of DBmail from GIT, compiled it, installed it, cleared the
dbmail.err log, and started it. Since then I have received 14 ABENDs.

[compiler@swlx143 dbmail-git]$ date ; git status -u no
Thu Aug 11 13:02:11 CDT 2016
# On branch master
nothing to commit (working directory clean)
[compiler@swlx143 dbmail-git]$

----------------------------------------------------------------------
(0003745) PeterS (reporter) - 07-Sep-16 23:37
http://dbmail.org/mantis/view.php?id=1081#c3745
----------------------------------------------------------------------
Today I migrated my database over to new hardware and am receiving the same
error.

dbmail-imapd: sasl.c:75: ldap_sasl_bind: Assertion `(
(ld)->ldc->ldc_options.ldo_valid == 0x2 )' failed.

Red Hat RHEL7
dbmail-auth-ldap-3.2.3-1.el7.x86_64
dbmail-3.2.3-1.el7.x86_64
postgresql-9.2.15-1.el7_2.x86_64
postgresql-server-9.2.15-1.el7_2.x86_64

----------------------------------------------------------------------
(0003746) alan (reporter) - 08-Sep-16 19:01
http://dbmail.org/mantis/view.php?id=1081#c3746
----------------------------------------------------------------------
This error appears to be ldap connection related. Error reports elsewhere
suggest either something in /etc/openldap/ldap.conf or library
compatibility following updates to openssl. Are there any certificates
missing?
Are you able to compile dbmail locally, this may address library issues.

----------------------------------------------------------------------
(0003749) PeterS (reporter) - 15-Sep-16 17:31
http://dbmail.org/mantis/view.php?id=1081#c3749
----------------------------------------------------------------------
Alan, I have already tried the latest build(s) of DBmail to no avail. I've
disabled all certificate handling in OPENLDAP using both:

TLS_REQCERT never

and also:

TLS_REQCERT allow

with no change in results. We have multiple Microsoft AD LDAP servers
here, with varying certificates that aren't really verifiable, so I have
always disabled tracking or requiring valid certificates. We have other
Linux systems using LDAP/OPENLDAP without any issues using this same
config.

----------------------------------------------------------------------
(0003751) alan (reporter) - 28-Sep-16 20:24
http://dbmail.org/mantis/view.php?id=1081#c3751
----------------------------------------------------------------------
Confirm I'm able to reproduce the issue, think I've tracked down the parts
of the code causing it. Will test asap.

----------------------------------------------------------------------
(0003752) alan (reporter) - 30-Sep-16 16:58
http://dbmail.org/mantis/view.php?id=1081#c3752
----------------------------------------------------------------------
This issues appears to be fixed using the attached patch, please test.

----------------------------------------------------------------------
(0003753) PeterS (reporter) - 30-Sep-16 23:37
http://dbmail.org/mantis/view.php?id=1081#c3753
----------------------------------------------------------------------
I've patched the Red Hat RHEL 7 (EPEL 7) version of DBmail
(dbmail-3.2.3-1.el7.x86_64, dbmail-auth-ldap-3.2.3-1.el7.x86_64, and
dbmail-debuginfo-3.2.3-1.el7.x86_64) with
dbmail_src_modules_authldap.c-20160930.diff and receive the attached error,
dbmail.20160930-1642.err . The service still ABEND's.

----------------------------------------------------------------------
(0003754) PeterS (reporter) - 01-Oct-16 00:18
http://dbmail.org/mantis/view.php?id=1081#c3754
----------------------------------------------------------------------
I've patched a GIT clone (git clone git://git.dbmail.eu/paul/dbmail) from
today with dbmail_src_modules_authldap.c-20160930.diff and receive the
attached error, dbmail.20160930-1721.err . The service still ABEND's.

----------------------------------------------------------------------
(0003755) alan (reporter) - 01-Oct-16 15:36
http://dbmail.org/mantis/view.php?id=1081#c3755
----------------------------------------------------------------------
Ok, latest patch src_modules_authldap.c-20161001.diff increases the time to
check for server gone away to 2 minutes.

There is an edge case where after binding, the connection instantly goes
away. Also the daemon exits if the ldap server doesn't reappear after two
minutes, similar to not starting if not available.

There should be warnings such as the following, then recovers when the
server returns.
Oct 01 14:06:38 lully.p-o.co.uk dbmail-imapd[96422]: [0x805c0a800]
Warning:[auth] ldap_con_get(+142): LDAP gone away: Can't contact LDAP
server. Trying to reconnect(1/5).
Oct 01 14:06:39 lully.p-o.co.uk dbmail-imapd[96422]: [0x805c0a800]
Warning:[auth] ldap_con_get(+142): LDAP gone away: Can't contact LDAP
server. Trying to reconnect(2/5).
Oct 01 14:06:49 lully.p-o.co.uk dbmail-imapd[96422]: [0x805c0a800]
Warning:[auth] ldap_con_get(+142): LDAP gone away: Can't contact LDAP
server. Trying to reconnect(119/5).

Does this get closer to addressing your issue?

Issue History
Date Modified Username Field Change
======================================================================
10-Aug-16 23:27 PeterS New Issue
10-Aug-16 23:27 PeterS File Added: dbmail.err
11-Aug-16 19:57 PeterS Note Added: 0003740
07-Sep-16 23:37 PeterS Note Added: 0003745
08-Sep-16 19:01 alan Note Added: 0003746
15-Sep-16 17:31 PeterS Note Added: 0003749
28-Sep-16 20:24 alan Note Added: 0003751
30-Sep-16 16:58 alan Note Added: 0003752
30-Sep-16 17:01 alan File Added: dbmail_src_modules_authldap.c.diff

30-Sep-16 18:15 alan Issue Monitored: alan
30-Sep-16 18:17 alan File Added:
dbmail_src_modules_authldap.c-20160930.diff
30-Sep-16 23:37 PeterS Note Added: 0003753
30-Sep-16 23:37 PeterS File Added: dbmail.20160930-1642.err

01-Oct-16 00:18 PeterS Note Added: 0003754
01-Oct-16 00:18 PeterS File Added: dbmail.20160930-1721.err

01-Oct-16 15:29 alan File Added: src_modules_authldap.c-20161001.diff

01-Oct-16 15:36 alan Note Added: 0003755
======================================================================

_______________________________________________
Dbmail-dev mailing list
Dbmail-dev@dbmail.org
http://mailman.fastxs.nl/cgi-bin/mailman/listinfo/dbmail-dev
[DBMail 0001081]: DBmail ABEND'ing upon LDAP access error. [ In reply to ]
A NOTE has been added to this issue.
======================================================================
http://dbmail.org/mantis/view.php?id=1081
======================================================================
Reported By: PeterS
Assigned To:
======================================================================
Project: DBMail
Issue ID: 1081
Category: Authentication layer
Reproducibility: always
Severity: major
Priority: normal
Status: new
target:
======================================================================
Date Submitted: 10-Aug-16 23:27 CEST
Last Modified: 03-Oct-16 16:54 CEST
======================================================================
Summary: DBmail ABEND'ing upon LDAP access error.
Description:
I've been trying to find a version of DBmail 3.X that does not ABEND while
it is accessing LDAP. I have not been able to determine the sequence of
events that causes the ABEND, however it happens to me many many times a
day. I've written a script around the execution of DBmail to ensure that
it restarts when it ABENDs.
======================================================================

----------------------------------------------------------------------
(0003740) PeterS (reporter) - 11-Aug-16 19:57
http://dbmail.org/mantis/view.php?id=1081#c3740
----------------------------------------------------------------------
Persuant to the "Bug posting etiquette", yesterday I pulled a clean,
newest, version of DBmail from GIT, compiled it, installed it, cleared the
dbmail.err log, and started it. Since then I have received 14 ABENDs.

[compiler@swlx143 dbmail-git]$ date ; git status -u no
Thu Aug 11 13:02:11 CDT 2016
# On branch master
nothing to commit (working directory clean)
[compiler@swlx143 dbmail-git]$

----------------------------------------------------------------------
(0003745) PeterS (reporter) - 07-Sep-16 23:37
http://dbmail.org/mantis/view.php?id=1081#c3745
----------------------------------------------------------------------
Today I migrated my database over to new hardware and am receiving the same
error.

dbmail-imapd: sasl.c:75: ldap_sasl_bind: Assertion `(
(ld)->ldc->ldc_options.ldo_valid == 0x2 )' failed.

Red Hat RHEL7
dbmail-auth-ldap-3.2.3-1.el7.x86_64
dbmail-3.2.3-1.el7.x86_64
postgresql-9.2.15-1.el7_2.x86_64
postgresql-server-9.2.15-1.el7_2.x86_64

----------------------------------------------------------------------
(0003746) alan (reporter) - 08-Sep-16 19:01
http://dbmail.org/mantis/view.php?id=1081#c3746
----------------------------------------------------------------------
This error appears to be ldap connection related. Error reports elsewhere
suggest either something in /etc/openldap/ldap.conf or library
compatibility following updates to openssl. Are there any certificates
missing?
Are you able to compile dbmail locally, this may address library issues.

----------------------------------------------------------------------
(0003749) PeterS (reporter) - 15-Sep-16 17:31
http://dbmail.org/mantis/view.php?id=1081#c3749
----------------------------------------------------------------------
Alan, I have already tried the latest build(s) of DBmail to no avail. I've
disabled all certificate handling in OPENLDAP using both:

TLS_REQCERT never

and also:

TLS_REQCERT allow

with no change in results. We have multiple Microsoft AD LDAP servers
here, with varying certificates that aren't really verifiable, so I have
always disabled tracking or requiring valid certificates. We have other
Linux systems using LDAP/OPENLDAP without any issues using this same
config.

----------------------------------------------------------------------
(0003751) alan (reporter) - 28-Sep-16 20:24
http://dbmail.org/mantis/view.php?id=1081#c3751
----------------------------------------------------------------------
Confirm I'm able to reproduce the issue, think I've tracked down the parts
of the code causing it. Will test asap.

----------------------------------------------------------------------
(0003752) alan (reporter) - 30-Sep-16 16:58
http://dbmail.org/mantis/view.php?id=1081#c3752
----------------------------------------------------------------------
This issues appears to be fixed using the attached patch, please test.

----------------------------------------------------------------------
(0003753) PeterS (reporter) - 30-Sep-16 23:37
http://dbmail.org/mantis/view.php?id=1081#c3753
----------------------------------------------------------------------
I've patched the Red Hat RHEL 7 (EPEL 7) version of DBmail
(dbmail-3.2.3-1.el7.x86_64, dbmail-auth-ldap-3.2.3-1.el7.x86_64, and
dbmail-debuginfo-3.2.3-1.el7.x86_64) with
dbmail_src_modules_authldap.c-20160930.diff and receive the attached error,
dbmail.20160930-1642.err . The service still ABEND's.

----------------------------------------------------------------------
(0003754) PeterS (reporter) - 01-Oct-16 00:18
http://dbmail.org/mantis/view.php?id=1081#c3754
----------------------------------------------------------------------
I've patched a GIT clone (git clone git://git.dbmail.eu/paul/dbmail) from
today with dbmail_src_modules_authldap.c-20160930.diff and receive the
attached error, dbmail.20160930-1721.err . The service still ABEND's.

----------------------------------------------------------------------
(0003755) alan (reporter) - 01-Oct-16 15:36
http://dbmail.org/mantis/view.php?id=1081#c3755
----------------------------------------------------------------------
Ok, latest patch src_modules_authldap.c-20161001.diff increases the time to
check for server gone away to 2 minutes.

There is an edge case where after binding, the connection instantly goes
away. Also the daemon exits if the ldap server doesn't reappear after two
minutes, similar to not starting if not available.

There should be warnings such as the following, then recovers when the
server returns.
Oct 01 14:06:38 lully.p-o.co.uk dbmail-imapd[96422]: [0x805c0a800]
Warning:[auth] ldap_con_get(+142): LDAP gone away: Can't contact LDAP
server. Trying to reconnect(1/5).
Oct 01 14:06:39 lully.p-o.co.uk dbmail-imapd[96422]: [0x805c0a800]
Warning:[auth] ldap_con_get(+142): LDAP gone away: Can't contact LDAP
server. Trying to reconnect(2/5).
Oct 01 14:06:49 lully.p-o.co.uk dbmail-imapd[96422]: [0x805c0a800]
Warning:[auth] ldap_con_get(+142): LDAP gone away: Can't contact LDAP
server. Trying to reconnect(119/5).

Does this get closer to addressing your issue?

----------------------------------------------------------------------
(0003756) PeterS (reporter) - 03-Oct-16 16:54
http://dbmail.org/mantis/view.php?id=1081#c3756
----------------------------------------------------------------------
I've patched the Red Hat RHEL 7 (EPEL 7) version of DBmail
(dbmail-3.2.3-1.el7.x86_64, dbmail-auth-ldap-3.2.3-1.el7.x86_64, and
dbmail-debuginfo-3.2.3-1.el7.x86_64) with
src_modules_authldap.c-20161001.diff and receive the attached error,
dbmail.20161003-0956.err . This time I've cut the last parts of the full
dbmail.err file which consists of the last threads 0x7f86e84e8000 and
0x7f86e81d1400 . The service still ABEND's.

I'll now try with today's GIT version and the patch.

Issue History
Date Modified Username Field Change
======================================================================
10-Aug-16 23:27 PeterS New Issue
10-Aug-16 23:27 PeterS File Added: dbmail.err
11-Aug-16 19:57 PeterS Note Added: 0003740
07-Sep-16 23:37 PeterS Note Added: 0003745
08-Sep-16 19:01 alan Note Added: 0003746
15-Sep-16 17:31 PeterS Note Added: 0003749
28-Sep-16 20:24 alan Note Added: 0003751
30-Sep-16 16:58 alan Note Added: 0003752
30-Sep-16 17:01 alan File Added: dbmail_src_modules_authldap.c.diff

30-Sep-16 18:15 alan Issue Monitored: alan
30-Sep-16 18:17 alan File Added:
dbmail_src_modules_authldap.c-20160930.diff
30-Sep-16 23:37 PeterS Note Added: 0003753
30-Sep-16 23:37 PeterS File Added: dbmail.20160930-1642.err

01-Oct-16 00:18 PeterS Note Added: 0003754
01-Oct-16 00:18 PeterS File Added: dbmail.20160930-1721.err

01-Oct-16 15:29 alan File Added: src_modules_authldap.c-20161001.diff

01-Oct-16 15:36 alan Note Added: 0003755
03-Oct-16 16:54 PeterS Note Added: 0003756
======================================================================

_______________________________________________
Dbmail-dev mailing list
Dbmail-dev@dbmail.org
http://mailman.fastxs.nl/cgi-bin/mailman/listinfo/dbmail-dev
[DBMail 0001081]: DBmail ABEND'ing upon LDAP access error. [ In reply to ]
A NOTE has been added to this issue.
======================================================================
http://dbmail.org/mantis/view.php?id=1081
======================================================================
Reported By: PeterS
Assigned To:
======================================================================
Project: DBMail
Issue ID: 1081
Category: Authentication layer
Reproducibility: always
Severity: major
Priority: normal
Status: new
target:
======================================================================
Date Submitted: 10-Aug-16 23:27 CEST
Last Modified: 03-Oct-16 17:52 CEST
======================================================================
Summary: DBmail ABEND'ing upon LDAP access error.
Description:
I've been trying to find a version of DBmail 3.X that does not ABEND while
it is accessing LDAP. I have not been able to determine the sequence of
events that causes the ABEND, however it happens to me many many times a
day. I've written a script around the execution of DBmail to ensure that
it restarts when it ABENDs.
======================================================================

----------------------------------------------------------------------
(0003740) PeterS (reporter) - 11-Aug-16 19:57
http://dbmail.org/mantis/view.php?id=1081#c3740
----------------------------------------------------------------------
Persuant to the "Bug posting etiquette", yesterday I pulled a clean,
newest, version of DBmail from GIT, compiled it, installed it, cleared the
dbmail.err log, and started it. Since then I have received 14 ABENDs.

[compiler@swlx143 dbmail-git]$ date ; git status -u no
Thu Aug 11 13:02:11 CDT 2016
# On branch master
nothing to commit (working directory clean)
[compiler@swlx143 dbmail-git]$

----------------------------------------------------------------------
(0003745) PeterS (reporter) - 07-Sep-16 23:37
http://dbmail.org/mantis/view.php?id=1081#c3745
----------------------------------------------------------------------
Today I migrated my database over to new hardware and am receiving the same
error.

dbmail-imapd: sasl.c:75: ldap_sasl_bind: Assertion `(
(ld)->ldc->ldc_options.ldo_valid == 0x2 )' failed.

Red Hat RHEL7
dbmail-auth-ldap-3.2.3-1.el7.x86_64
dbmail-3.2.3-1.el7.x86_64
postgresql-9.2.15-1.el7_2.x86_64
postgresql-server-9.2.15-1.el7_2.x86_64

----------------------------------------------------------------------
(0003746) alan (reporter) - 08-Sep-16 19:01
http://dbmail.org/mantis/view.php?id=1081#c3746
----------------------------------------------------------------------
This error appears to be ldap connection related. Error reports elsewhere
suggest either something in /etc/openldap/ldap.conf or library
compatibility following updates to openssl. Are there any certificates
missing?
Are you able to compile dbmail locally, this may address library issues.

----------------------------------------------------------------------
(0003749) PeterS (reporter) - 15-Sep-16 17:31
http://dbmail.org/mantis/view.php?id=1081#c3749
----------------------------------------------------------------------
Alan, I have already tried the latest build(s) of DBmail to no avail. I've
disabled all certificate handling in OPENLDAP using both:

TLS_REQCERT never

and also:

TLS_REQCERT allow

with no change in results. We have multiple Microsoft AD LDAP servers
here, with varying certificates that aren't really verifiable, so I have
always disabled tracking or requiring valid certificates. We have other
Linux systems using LDAP/OPENLDAP without any issues using this same
config.

----------------------------------------------------------------------
(0003751) alan (reporter) - 28-Sep-16 20:24
http://dbmail.org/mantis/view.php?id=1081#c3751
----------------------------------------------------------------------
Confirm I'm able to reproduce the issue, think I've tracked down the parts
of the code causing it. Will test asap.

----------------------------------------------------------------------
(0003752) alan (reporter) - 30-Sep-16 16:58
http://dbmail.org/mantis/view.php?id=1081#c3752
----------------------------------------------------------------------
This issues appears to be fixed using the attached patch, please test.

----------------------------------------------------------------------
(0003753) PeterS (reporter) - 30-Sep-16 23:37
http://dbmail.org/mantis/view.php?id=1081#c3753
----------------------------------------------------------------------
I've patched the Red Hat RHEL 7 (EPEL 7) version of DBmail
(dbmail-3.2.3-1.el7.x86_64, dbmail-auth-ldap-3.2.3-1.el7.x86_64, and
dbmail-debuginfo-3.2.3-1.el7.x86_64) with
dbmail_src_modules_authldap.c-20160930.diff and receive the attached error,
dbmail.20160930-1642.err . The service still ABEND's.

----------------------------------------------------------------------
(0003754) PeterS (reporter) - 01-Oct-16 00:18
http://dbmail.org/mantis/view.php?id=1081#c3754
----------------------------------------------------------------------
I've patched a GIT clone (git clone git://git.dbmail.eu/paul/dbmail) from
today with dbmail_src_modules_authldap.c-20160930.diff and receive the
attached error, dbmail.20160930-1721.err . The service still ABEND's.

----------------------------------------------------------------------
(0003755) alan (reporter) - 01-Oct-16 15:36
http://dbmail.org/mantis/view.php?id=1081#c3755
----------------------------------------------------------------------
Ok, latest patch src_modules_authldap.c-20161001.diff increases the time to
check for server gone away to 2 minutes.

There is an edge case where after binding, the connection instantly goes
away. Also the daemon exits if the ldap server doesn't reappear after two
minutes, similar to not starting if not available.

There should be warnings such as the following, then recovers when the
server returns.
Oct 01 14:06:38 lully.p-o.co.uk dbmail-imapd[96422]: [0x805c0a800]
Warning:[auth] ldap_con_get(+142): LDAP gone away: Can't contact LDAP
server. Trying to reconnect(1/5).
Oct 01 14:06:39 lully.p-o.co.uk dbmail-imapd[96422]: [0x805c0a800]
Warning:[auth] ldap_con_get(+142): LDAP gone away: Can't contact LDAP
server. Trying to reconnect(2/5).
Oct 01 14:06:49 lully.p-o.co.uk dbmail-imapd[96422]: [0x805c0a800]
Warning:[auth] ldap_con_get(+142): LDAP gone away: Can't contact LDAP
server. Trying to reconnect(119/5).

Does this get closer to addressing your issue?

----------------------------------------------------------------------
(0003756) PeterS (reporter) - 03-Oct-16 16:54
http://dbmail.org/mantis/view.php?id=1081#c3756
----------------------------------------------------------------------
I've patched the Red Hat RHEL 7 (EPEL 7) version of DBmail
(dbmail-3.2.3-1.el7.x86_64, dbmail-auth-ldap-3.2.3-1.el7.x86_64, and
dbmail-debuginfo-3.2.3-1.el7.x86_64) with
src_modules_authldap.c-20161001.diff and receive the attached error,
dbmail.20161003-0956.err . This time I've cut the last parts of the full
dbmail.err file which consists of the last threads 0x7f86e84e8000 and
0x7f86e81d1400 . The service still ABEND's.

I'll now try with today's GIT version and the patch.

----------------------------------------------------------------------
(0003757) PeterS (reporter) - 03-Oct-16 17:52
http://dbmail.org/mantis/view.php?id=1081#c3757
----------------------------------------------------------------------
I've patched a GIT clone (git clone git://git.dbmail.eu/paul/dbmail) [^]
from today with src_modules_authldap.c-20161001.diff and receive the
attached error, dbmail.20161003-1055.err (dbmail.20161003-1055.err.xz) .
This time I've cut the last parts of the full dbmail.err file which
consists of the last threads 0x1466450 and 0xde7400. The service still
ABEND's.

Issue History
Date Modified Username Field Change
======================================================================
10-Aug-16 23:27 PeterS New Issue
10-Aug-16 23:27 PeterS File Added: dbmail.err
11-Aug-16 19:57 PeterS Note Added: 0003740
07-Sep-16 23:37 PeterS Note Added: 0003745
08-Sep-16 19:01 alan Note Added: 0003746
15-Sep-16 17:31 PeterS Note Added: 0003749
28-Sep-16 20:24 alan Note Added: 0003751
30-Sep-16 16:58 alan Note Added: 0003752
30-Sep-16 17:01 alan File Added: dbmail_src_modules_authldap.c.diff

30-Sep-16 18:15 alan Issue Monitored: alan
30-Sep-16 18:17 alan File Added:
dbmail_src_modules_authldap.c-20160930.diff
30-Sep-16 23:37 PeterS Note Added: 0003753
30-Sep-16 23:37 PeterS File Added: dbmail.20160930-1642.err

01-Oct-16 00:18 PeterS Note Added: 0003754
01-Oct-16 00:18 PeterS File Added: dbmail.20160930-1721.err

01-Oct-16 15:29 alan File Added: src_modules_authldap.c-20161001.diff

01-Oct-16 15:36 alan Note Added: 0003755
03-Oct-16 16:54 PeterS Note Added: 0003756
03-Oct-16 16:56 PeterS File Added: dbmail.20161003-0956.err.xz

03-Oct-16 17:52 PeterS Note Added: 0003757
======================================================================

_______________________________________________
Dbmail-dev mailing list
Dbmail-dev@dbmail.org
http://mailman.fastxs.nl/cgi-bin/mailman/listinfo/dbmail-dev
[DBMail 0001081]: DBmail ABEND'ing upon LDAP access error. [ In reply to ]
A NOTE has been added to this issue.
======================================================================
http://dbmail.org/mantis/view.php?id=1081
======================================================================
Reported By: PeterS
Assigned To:
======================================================================
Project: DBMail
Issue ID: 1081
Category: Authentication layer
Reproducibility: always
Severity: major
Priority: normal
Status: new
target:
======================================================================
Date Submitted: 10-Aug-16 23:27 CEST
Last Modified: 04-Oct-16 17:54 CEST
======================================================================
Summary: DBmail ABEND'ing upon LDAP access error.
Description:
I've been trying to find a version of DBmail 3.X that does not ABEND while
it is accessing LDAP. I have not been able to determine the sequence of
events that causes the ABEND, however it happens to me many many times a
day. I've written a script around the execution of DBmail to ensure that
it restarts when it ABENDs.
======================================================================

----------------------------------------------------------------------
(0003740) PeterS (reporter) - 11-Aug-16 19:57
http://dbmail.org/mantis/view.php?id=1081#c3740
----------------------------------------------------------------------
Persuant to the "Bug posting etiquette", yesterday I pulled a clean,
newest, version of DBmail from GIT, compiled it, installed it, cleared the
dbmail.err log, and started it. Since then I have received 14 ABENDs.

[compiler@swlx143 dbmail-git]$ date ; git status -u no
Thu Aug 11 13:02:11 CDT 2016
# On branch master
nothing to commit (working directory clean)
[compiler@swlx143 dbmail-git]$

----------------------------------------------------------------------
(0003745) PeterS (reporter) - 07-Sep-16 23:37
http://dbmail.org/mantis/view.php?id=1081#c3745
----------------------------------------------------------------------
Today I migrated my database over to new hardware and am receiving the same
error.

dbmail-imapd: sasl.c:75: ldap_sasl_bind: Assertion `(
(ld)->ldc->ldc_options.ldo_valid == 0x2 )' failed.

Red Hat RHEL7
dbmail-auth-ldap-3.2.3-1.el7.x86_64
dbmail-3.2.3-1.el7.x86_64
postgresql-9.2.15-1.el7_2.x86_64
postgresql-server-9.2.15-1.el7_2.x86_64

----------------------------------------------------------------------
(0003746) alan (reporter) - 08-Sep-16 19:01
http://dbmail.org/mantis/view.php?id=1081#c3746
----------------------------------------------------------------------
This error appears to be ldap connection related. Error reports elsewhere
suggest either something in /etc/openldap/ldap.conf or library
compatibility following updates to openssl. Are there any certificates
missing?
Are you able to compile dbmail locally, this may address library issues.

----------------------------------------------------------------------
(0003749) PeterS (reporter) - 15-Sep-16 17:31
http://dbmail.org/mantis/view.php?id=1081#c3749
----------------------------------------------------------------------
Alan, I have already tried the latest build(s) of DBmail to no avail. I've
disabled all certificate handling in OPENLDAP using both:

TLS_REQCERT never

and also:

TLS_REQCERT allow

with no change in results. We have multiple Microsoft AD LDAP servers
here, with varying certificates that aren't really verifiable, so I have
always disabled tracking or requiring valid certificates. We have other
Linux systems using LDAP/OPENLDAP without any issues using this same
config.

----------------------------------------------------------------------
(0003751) alan (reporter) - 28-Sep-16 20:24
http://dbmail.org/mantis/view.php?id=1081#c3751
----------------------------------------------------------------------
Confirm I'm able to reproduce the issue, think I've tracked down the parts
of the code causing it. Will test asap.

----------------------------------------------------------------------
(0003752) alan (reporter) - 30-Sep-16 16:58
http://dbmail.org/mantis/view.php?id=1081#c3752
----------------------------------------------------------------------
This issues appears to be fixed using the attached patch, please test.

----------------------------------------------------------------------
(0003753) PeterS (reporter) - 30-Sep-16 23:37
http://dbmail.org/mantis/view.php?id=1081#c3753
----------------------------------------------------------------------
I've patched the Red Hat RHEL 7 (EPEL 7) version of DBmail
(dbmail-3.2.3-1.el7.x86_64, dbmail-auth-ldap-3.2.3-1.el7.x86_64, and
dbmail-debuginfo-3.2.3-1.el7.x86_64) with
dbmail_src_modules_authldap.c-20160930.diff and receive the attached error,
dbmail.20160930-1642.err . The service still ABEND's.

----------------------------------------------------------------------
(0003754) PeterS (reporter) - 01-Oct-16 00:18
http://dbmail.org/mantis/view.php?id=1081#c3754
----------------------------------------------------------------------
I've patched a GIT clone (git clone git://git.dbmail.eu/paul/dbmail) from
today with dbmail_src_modules_authldap.c-20160930.diff and receive the
attached error, dbmail.20160930-1721.err . The service still ABEND's.

----------------------------------------------------------------------
(0003755) alan (reporter) - 01-Oct-16 15:36
http://dbmail.org/mantis/view.php?id=1081#c3755
----------------------------------------------------------------------
Ok, latest patch src_modules_authldap.c-20161001.diff increases the time to
check for server gone away to 2 minutes.

There is an edge case where after binding, the connection instantly goes
away. Also the daemon exits if the ldap server doesn't reappear after two
minutes, similar to not starting if not available.

There should be warnings such as the following, then recovers when the
server returns.
Oct 01 14:06:38 lully.p-o.co.uk dbmail-imapd[96422]: [0x805c0a800]
Warning:[auth] ldap_con_get(+142): LDAP gone away: Can't contact LDAP
server. Trying to reconnect(1/5).
Oct 01 14:06:39 lully.p-o.co.uk dbmail-imapd[96422]: [0x805c0a800]
Warning:[auth] ldap_con_get(+142): LDAP gone away: Can't contact LDAP
server. Trying to reconnect(2/5).
Oct 01 14:06:49 lully.p-o.co.uk dbmail-imapd[96422]: [0x805c0a800]
Warning:[auth] ldap_con_get(+142): LDAP gone away: Can't contact LDAP
server. Trying to reconnect(119/5).

Does this get closer to addressing your issue?

----------------------------------------------------------------------
(0003756) PeterS (reporter) - 03-Oct-16 16:54
http://dbmail.org/mantis/view.php?id=1081#c3756
----------------------------------------------------------------------
I've patched the Red Hat RHEL 7 (EPEL 7) version of DBmail
(dbmail-3.2.3-1.el7.x86_64, dbmail-auth-ldap-3.2.3-1.el7.x86_64, and
dbmail-debuginfo-3.2.3-1.el7.x86_64) with
src_modules_authldap.c-20161001.diff and receive the attached error,
dbmail.20161003-0956.err . This time I've cut the last parts of the full
dbmail.err file which consists of the last threads 0x7f86e84e8000 and
0x7f86e81d1400 . The service still ABEND's.

I'll now try with today's GIT version and the patch.

----------------------------------------------------------------------
(0003757) PeterS (reporter) - 03-Oct-16 19:57
http://dbmail.org/mantis/view.php?id=1081#c3757
----------------------------------------------------------------------
I've patched a GIT clone (git clone git://git.dbmail.eu/paul/dbmail) from
today with src_modules_authldap.c-20161001.diff and receive the attached
error, dbmail.20161003-1055.err (dbmail.20161003-1055.err.xz) . This time
I've cut the last parts of the full dbmail.err file which consists of the
last threads 0x1466450 and 0xde7400. The service still ABEND's.



----------------------------------------------------------------------
(0003758) alan (reporter) - 04-Oct-16 17:54
http://dbmail.org/mantis/view.php?id=1081#c3758
----------------------------------------------------------------------
Patch src_modules_authldap.c-20161004.diff adds slightly more debugging
information, plus binding and search retries.

The error appears to be transient and localised in authldap_search, this
patch should address the issue.

fyi there is a copy at https://github.com/alan-hicks/dbmail

Issue History
Date Modified Username Field Change
======================================================================
10-Aug-16 23:27 PeterS New Issue
10-Aug-16 23:27 PeterS File Added: dbmail.err
11-Aug-16 19:57 PeterS Note Added: 0003740
07-Sep-16 23:37 PeterS Note Added: 0003745
08-Sep-16 19:01 alan Note Added: 0003746
15-Sep-16 17:31 PeterS Note Added: 0003749
28-Sep-16 20:24 alan Note Added: 0003751
30-Sep-16 16:58 alan Note Added: 0003752
30-Sep-16 17:01 alan File Added: dbmail_src_modules_authldap.c.diff

30-Sep-16 18:15 alan Issue Monitored: alan
30-Sep-16 18:17 alan File Added:
dbmail_src_modules_authldap.c-20160930.diff
30-Sep-16 23:37 PeterS Note Added: 0003753
30-Sep-16 23:37 PeterS File Added: dbmail.20160930-1642.err

01-Oct-16 00:18 PeterS Note Added: 0003754
01-Oct-16 00:18 PeterS File Added: dbmail.20160930-1721.err

01-Oct-16 15:29 alan File Added: src_modules_authldap.c-20161001.diff

01-Oct-16 15:36 alan Note Added: 0003755
03-Oct-16 16:54 PeterS Note Added: 0003756
03-Oct-16 16:56 PeterS File Added: dbmail.20161003-0956.err.xz

03-Oct-16 17:52 PeterS Note Added: 0003757
03-Oct-16 17:52 PeterS File Added: dbmail.20161003-1055.err.xz

03-Oct-16 19:57 PeterS Note Edited: 0003757
04-Oct-16 17:54 alan Note Added: 0003758
======================================================================

_______________________________________________
Dbmail-dev mailing list
Dbmail-dev@dbmail.org
http://mailman.fastxs.nl/cgi-bin/mailman/listinfo/dbmail-dev
[DBMail 0001081]: DBmail ABEND'ing upon LDAP access error. [ In reply to ]
A NOTE has been added to this issue.
======================================================================
http://dbmail.org/mantis/view.php?id=1081
======================================================================
Reported By: PeterS
Assigned To:
======================================================================
Project: DBMail
Issue ID: 1081
Category: Authentication layer
Reproducibility: always
Severity: major
Priority: normal
Status: new
target:
======================================================================
Date Submitted: 10-Aug-16 23:27 CEST
Last Modified: 04-Oct-16 20:47 CEST
======================================================================
Summary: DBmail ABEND'ing upon LDAP access error.
Description:
I've been trying to find a version of DBmail 3.X that does not ABEND while
it is accessing LDAP. I have not been able to determine the sequence of
events that causes the ABEND, however it happens to me many many times a
day. I've written a script around the execution of DBmail to ensure that
it restarts when it ABENDs.
======================================================================

----------------------------------------------------------------------
(0003740) PeterS (reporter) - 11-Aug-16 19:57
http://dbmail.org/mantis/view.php?id=1081#c3740
----------------------------------------------------------------------
Persuant to the "Bug posting etiquette", yesterday I pulled a clean,
newest, version of DBmail from GIT, compiled it, installed it, cleared the
dbmail.err log, and started it. Since then I have received 14 ABENDs.

[compiler@swlx143 dbmail-git]$ date ; git status -u no
Thu Aug 11 13:02:11 CDT 2016
# On branch master
nothing to commit (working directory clean)
[compiler@swlx143 dbmail-git]$

----------------------------------------------------------------------
(0003745) PeterS (reporter) - 07-Sep-16 23:37
http://dbmail.org/mantis/view.php?id=1081#c3745
----------------------------------------------------------------------
Today I migrated my database over to new hardware and am receiving the same
error.

dbmail-imapd: sasl.c:75: ldap_sasl_bind: Assertion `(
(ld)->ldc->ldc_options.ldo_valid == 0x2 )' failed.

Red Hat RHEL7
dbmail-auth-ldap-3.2.3-1.el7.x86_64
dbmail-3.2.3-1.el7.x86_64
postgresql-9.2.15-1.el7_2.x86_64
postgresql-server-9.2.15-1.el7_2.x86_64

----------------------------------------------------------------------
(0003746) alan (reporter) - 08-Sep-16 19:01
http://dbmail.org/mantis/view.php?id=1081#c3746
----------------------------------------------------------------------
This error appears to be ldap connection related. Error reports elsewhere
suggest either something in /etc/openldap/ldap.conf or library
compatibility following updates to openssl. Are there any certificates
missing?
Are you able to compile dbmail locally, this may address library issues.

----------------------------------------------------------------------
(0003749) PeterS (reporter) - 15-Sep-16 17:31
http://dbmail.org/mantis/view.php?id=1081#c3749
----------------------------------------------------------------------
Alan, I have already tried the latest build(s) of DBmail to no avail. I've
disabled all certificate handling in OPENLDAP using both:

TLS_REQCERT never

and also:

TLS_REQCERT allow

with no change in results. We have multiple Microsoft AD LDAP servers
here, with varying certificates that aren't really verifiable, so I have
always disabled tracking or requiring valid certificates. We have other
Linux systems using LDAP/OPENLDAP without any issues using this same
config.

----------------------------------------------------------------------
(0003751) alan (reporter) - 28-Sep-16 20:24
http://dbmail.org/mantis/view.php?id=1081#c3751
----------------------------------------------------------------------
Confirm I'm able to reproduce the issue, think I've tracked down the parts
of the code causing it. Will test asap.

----------------------------------------------------------------------
(0003752) alan (reporter) - 30-Sep-16 16:58
http://dbmail.org/mantis/view.php?id=1081#c3752
----------------------------------------------------------------------
This issues appears to be fixed using the attached patch, please test.

----------------------------------------------------------------------
(0003753) PeterS (reporter) - 30-Sep-16 23:37
http://dbmail.org/mantis/view.php?id=1081#c3753
----------------------------------------------------------------------
I've patched the Red Hat RHEL 7 (EPEL 7) version of DBmail
(dbmail-3.2.3-1.el7.x86_64, dbmail-auth-ldap-3.2.3-1.el7.x86_64, and
dbmail-debuginfo-3.2.3-1.el7.x86_64) with
dbmail_src_modules_authldap.c-20160930.diff and receive the attached error,
dbmail.20160930-1642.err . The service still ABEND's.

----------------------------------------------------------------------
(0003754) PeterS (reporter) - 01-Oct-16 00:18
http://dbmail.org/mantis/view.php?id=1081#c3754
----------------------------------------------------------------------
I've patched a GIT clone (git clone git://git.dbmail.eu/paul/dbmail) from
today with dbmail_src_modules_authldap.c-20160930.diff and receive the
attached error, dbmail.20160930-1721.err . The service still ABEND's.

----------------------------------------------------------------------
(0003755) alan (reporter) - 01-Oct-16 15:36
http://dbmail.org/mantis/view.php?id=1081#c3755
----------------------------------------------------------------------
Ok, latest patch src_modules_authldap.c-20161001.diff increases the time to
check for server gone away to 2 minutes.

There is an edge case where after binding, the connection instantly goes
away. Also the daemon exits if the ldap server doesn't reappear after two
minutes, similar to not starting if not available.

There should be warnings such as the following, then recovers when the
server returns.
Oct 01 14:06:38 lully.p-o.co.uk dbmail-imapd[96422]: [0x805c0a800]
Warning:[auth] ldap_con_get(+142): LDAP gone away: Can't contact LDAP
server. Trying to reconnect(1/5).
Oct 01 14:06:39 lully.p-o.co.uk dbmail-imapd[96422]: [0x805c0a800]
Warning:[auth] ldap_con_get(+142): LDAP gone away: Can't contact LDAP
server. Trying to reconnect(2/5).
Oct 01 14:06:49 lully.p-o.co.uk dbmail-imapd[96422]: [0x805c0a800]
Warning:[auth] ldap_con_get(+142): LDAP gone away: Can't contact LDAP
server. Trying to reconnect(119/5).

Does this get closer to addressing your issue?

----------------------------------------------------------------------
(0003756) PeterS (reporter) - 03-Oct-16 16:54
http://dbmail.org/mantis/view.php?id=1081#c3756
----------------------------------------------------------------------
I've patched the Red Hat RHEL 7 (EPEL 7) version of DBmail
(dbmail-3.2.3-1.el7.x86_64, dbmail-auth-ldap-3.2.3-1.el7.x86_64, and
dbmail-debuginfo-3.2.3-1.el7.x86_64) with
src_modules_authldap.c-20161001.diff and receive the attached error,
dbmail.20161003-0956.err . This time I've cut the last parts of the full
dbmail.err file which consists of the last threads 0x7f86e84e8000 and
0x7f86e81d1400 . The service still ABEND's.

I'll now try with today's GIT version and the patch.

----------------------------------------------------------------------
(0003757) PeterS (reporter) - 03-Oct-16 19:57
http://dbmail.org/mantis/view.php?id=1081#c3757
----------------------------------------------------------------------
I've patched a GIT clone (git clone git://git.dbmail.eu/paul/dbmail) from
today with src_modules_authldap.c-20161001.diff and receive the attached
error, dbmail.20161003-1055.err (dbmail.20161003-1055.err.xz) . This time
I've cut the last parts of the full dbmail.err file which consists of the
last threads 0x1466450 and 0xde7400. The service still ABEND's.



----------------------------------------------------------------------
(0003758) alan (reporter) - 04-Oct-16 17:54
http://dbmail.org/mantis/view.php?id=1081#c3758
----------------------------------------------------------------------
Patch src_modules_authldap.c-20161004.diff adds slightly more debugging
information, plus binding and search retries.

The error appears to be transient and localised in authldap_search, this
patch should address the issue.

fyi there is a copy at https://github.com/alan-hicks/dbmail

----------------------------------------------------------------------
(0003759) PeterS (reporter) - 04-Oct-16 20:47
http://dbmail.org/mantis/view.php?id=1081#c3759
----------------------------------------------------------------------
I pulled down a clone of your https://github.com/alan-hicks/dbmail and had
it running. This version does not ABEND but rather stops communicating
(which caused my client to Segfault in the middle of doing IMAPS things).

Please see the attached dbmail.20161004-1348.err
(dbmail.20161004-1348.err.xz) for the full error listing.

Issue History
Date Modified Username Field Change
======================================================================
10-Aug-16 23:27 PeterS New Issue
10-Aug-16 23:27 PeterS File Added: dbmail.err
11-Aug-16 19:57 PeterS Note Added: 0003740
07-Sep-16 23:37 PeterS Note Added: 0003745
08-Sep-16 19:01 alan Note Added: 0003746
15-Sep-16 17:31 PeterS Note Added: 0003749
28-Sep-16 20:24 alan Note Added: 0003751
30-Sep-16 16:58 alan Note Added: 0003752
30-Sep-16 17:01 alan File Added: dbmail_src_modules_authldap.c.diff

30-Sep-16 18:15 alan Issue Monitored: alan
30-Sep-16 18:17 alan File Added:
dbmail_src_modules_authldap.c-20160930.diff
30-Sep-16 23:37 PeterS Note Added: 0003753
30-Sep-16 23:37 PeterS File Added: dbmail.20160930-1642.err

01-Oct-16 00:18 PeterS Note Added: 0003754
01-Oct-16 00:18 PeterS File Added: dbmail.20160930-1721.err

01-Oct-16 15:29 alan File Added: src_modules_authldap.c-20161001.diff

01-Oct-16 15:36 alan Note Added: 0003755
03-Oct-16 16:54 PeterS Note Added: 0003756
03-Oct-16 16:56 PeterS File Added: dbmail.20161003-0956.err.xz

03-Oct-16 17:52 PeterS Note Added: 0003757
03-Oct-16 17:52 PeterS File Added: dbmail.20161003-1055.err.xz

03-Oct-16 19:57 PeterS Note Edited: 0003757
04-Oct-16 17:54 alan Note Added: 0003758
04-Oct-16 17:54 alan File Added: src_modules_authldap.c-20161004.diff

04-Oct-16 20:47 PeterS Note Added: 0003759
======================================================================

_______________________________________________
Dbmail-dev mailing list
Dbmail-dev@dbmail.org
http://mailman.fastxs.nl/cgi-bin/mailman/listinfo/dbmail-dev
[DBMail 0001081]: DBmail ABEND'ing upon LDAP access error. [ In reply to ]
A NOTE has been added to this issue.
======================================================================
http://dbmail.org/mantis/view.php?id=1081
======================================================================
Reported By: PeterS
Assigned To:
======================================================================
Project: DBMail
Issue ID: 1081
Category: Authentication layer
Reproducibility: always
Severity: major
Priority: normal
Status: new
target:
======================================================================
Date Submitted: 10-Aug-16 23:27 CEST
Last Modified: 06-Oct-16 15:57 CEST
======================================================================
Summary: DBmail ABEND'ing upon LDAP access error.
Description:
I've been trying to find a version of DBmail 3.X that does not ABEND while
it is accessing LDAP. I have not been able to determine the sequence of
events that causes the ABEND, however it happens to me many many times a
day. I've written a script around the execution of DBmail to ensure that
it restarts when it ABENDs.
======================================================================

----------------------------------------------------------------------
(0003740) PeterS (reporter) - 11-Aug-16 19:57
http://dbmail.org/mantis/view.php?id=1081#c3740
----------------------------------------------------------------------
Persuant to the "Bug posting etiquette", yesterday I pulled a clean,
newest, version of DBmail from GIT, compiled it, installed it, cleared the
dbmail.err log, and started it. Since then I have received 14 ABENDs.

[compiler@swlx143 dbmail-git]$ date ; git status -u no
Thu Aug 11 13:02:11 CDT 2016
# On branch master
nothing to commit (working directory clean)
[compiler@swlx143 dbmail-git]$

----------------------------------------------------------------------
(0003745) PeterS (reporter) - 07-Sep-16 23:37
http://dbmail.org/mantis/view.php?id=1081#c3745
----------------------------------------------------------------------
Today I migrated my database over to new hardware and am receiving the same
error.

dbmail-imapd: sasl.c:75: ldap_sasl_bind: Assertion `(
(ld)->ldc->ldc_options.ldo_valid == 0x2 )' failed.

Red Hat RHEL7
dbmail-auth-ldap-3.2.3-1.el7.x86_64
dbmail-3.2.3-1.el7.x86_64
postgresql-9.2.15-1.el7_2.x86_64
postgresql-server-9.2.15-1.el7_2.x86_64

----------------------------------------------------------------------
(0003746) alan (reporter) - 08-Sep-16 19:01
http://dbmail.org/mantis/view.php?id=1081#c3746
----------------------------------------------------------------------
This error appears to be ldap connection related. Error reports elsewhere
suggest either something in /etc/openldap/ldap.conf or library
compatibility following updates to openssl. Are there any certificates
missing?
Are you able to compile dbmail locally, this may address library issues.

----------------------------------------------------------------------
(0003749) PeterS (reporter) - 15-Sep-16 17:31
http://dbmail.org/mantis/view.php?id=1081#c3749
----------------------------------------------------------------------
Alan, I have already tried the latest build(s) of DBmail to no avail. I've
disabled all certificate handling in OPENLDAP using both:

TLS_REQCERT never

and also:

TLS_REQCERT allow

with no change in results. We have multiple Microsoft AD LDAP servers
here, with varying certificates that aren't really verifiable, so I have
always disabled tracking or requiring valid certificates. We have other
Linux systems using LDAP/OPENLDAP without any issues using this same
config.

----------------------------------------------------------------------
(0003751) alan (reporter) - 28-Sep-16 20:24
http://dbmail.org/mantis/view.php?id=1081#c3751
----------------------------------------------------------------------
Confirm I'm able to reproduce the issue, think I've tracked down the parts
of the code causing it. Will test asap.

----------------------------------------------------------------------
(0003752) alan (reporter) - 30-Sep-16 16:58
http://dbmail.org/mantis/view.php?id=1081#c3752
----------------------------------------------------------------------
This issues appears to be fixed using the attached patch, please test.

----------------------------------------------------------------------
(0003753) PeterS (reporter) - 30-Sep-16 23:37
http://dbmail.org/mantis/view.php?id=1081#c3753
----------------------------------------------------------------------
I've patched the Red Hat RHEL 7 (EPEL 7) version of DBmail
(dbmail-3.2.3-1.el7.x86_64, dbmail-auth-ldap-3.2.3-1.el7.x86_64, and
dbmail-debuginfo-3.2.3-1.el7.x86_64) with
dbmail_src_modules_authldap.c-20160930.diff and receive the attached error,
dbmail.20160930-1642.err . The service still ABEND's.

----------------------------------------------------------------------
(0003754) PeterS (reporter) - 01-Oct-16 00:18
http://dbmail.org/mantis/view.php?id=1081#c3754
----------------------------------------------------------------------
I've patched a GIT clone (git clone git://git.dbmail.eu/paul/dbmail) from
today with dbmail_src_modules_authldap.c-20160930.diff and receive the
attached error, dbmail.20160930-1721.err . The service still ABEND's.

----------------------------------------------------------------------
(0003755) alan (reporter) - 01-Oct-16 15:36
http://dbmail.org/mantis/view.php?id=1081#c3755
----------------------------------------------------------------------
Ok, latest patch src_modules_authldap.c-20161001.diff increases the time to
check for server gone away to 2 minutes.

There is an edge case where after binding, the connection instantly goes
away. Also the daemon exits if the ldap server doesn't reappear after two
minutes, similar to not starting if not available.

There should be warnings such as the following, then recovers when the
server returns.
Oct 01 14:06:38 lully.p-o.co.uk dbmail-imapd[96422]: [0x805c0a800]
Warning:[auth] ldap_con_get(+142): LDAP gone away: Can't contact LDAP
server. Trying to reconnect(1/5).
Oct 01 14:06:39 lully.p-o.co.uk dbmail-imapd[96422]: [0x805c0a800]
Warning:[auth] ldap_con_get(+142): LDAP gone away: Can't contact LDAP
server. Trying to reconnect(2/5).
Oct 01 14:06:49 lully.p-o.co.uk dbmail-imapd[96422]: [0x805c0a800]
Warning:[auth] ldap_con_get(+142): LDAP gone away: Can't contact LDAP
server. Trying to reconnect(119/5).

Does this get closer to addressing your issue?

----------------------------------------------------------------------
(0003756) PeterS (reporter) - 03-Oct-16 16:54
http://dbmail.org/mantis/view.php?id=1081#c3756
----------------------------------------------------------------------
I've patched the Red Hat RHEL 7 (EPEL 7) version of DBmail
(dbmail-3.2.3-1.el7.x86_64, dbmail-auth-ldap-3.2.3-1.el7.x86_64, and
dbmail-debuginfo-3.2.3-1.el7.x86_64) with
src_modules_authldap.c-20161001.diff and receive the attached error,
dbmail.20161003-0956.err . This time I've cut the last parts of the full
dbmail.err file which consists of the last threads 0x7f86e84e8000 and
0x7f86e81d1400 . The service still ABEND's.

I'll now try with today's GIT version and the patch.

----------------------------------------------------------------------
(0003757) PeterS (reporter) - 03-Oct-16 19:57
http://dbmail.org/mantis/view.php?id=1081#c3757
----------------------------------------------------------------------
I've patched a GIT clone (git clone git://git.dbmail.eu/paul/dbmail) from
today with src_modules_authldap.c-20161001.diff and receive the attached
error, dbmail.20161003-1055.err (dbmail.20161003-1055.err.xz) . This time
I've cut the last parts of the full dbmail.err file which consists of the
last threads 0x1466450 and 0xde7400. The service still ABEND's.



----------------------------------------------------------------------
(0003758) alan (reporter) - 04-Oct-16 17:54
http://dbmail.org/mantis/view.php?id=1081#c3758
----------------------------------------------------------------------
Patch src_modules_authldap.c-20161004.diff adds slightly more debugging
information, plus binding and search retries.

The error appears to be transient and localised in authldap_search, this
patch should address the issue.

fyi there is a copy at https://github.com/alan-hicks/dbmail

----------------------------------------------------------------------
(0003759) PeterS (reporter) - 04-Oct-16 21:47
http://dbmail.org/mantis/view.php?id=1081#c3759
----------------------------------------------------------------------
I pulled down a clone of your https://github.com/alan-hicks/dbmail and had
it running. This version does not ABEND but rather stops communicating
(which caused my client to Segfault in the middle of doing IMAPS things).

Please see the attached dbmail.20161004-1348.err
(dbmail.20161004-1348.err.xz) for the full error listing.

Correction: it did ABEND when I restarted my IMAPS client and attempted to
connect to the DBmail instance. The same "dbmail-imapd: dm_config.c:134:
config_get_value_once: Assertion `config_dict' failed." error as before is
the last thing logged.



----------------------------------------------------------------------
(0003760) alan (reporter) - 06-Oct-16 15:57
http://dbmail.org/mantis/view.php?id=1081#c3760
----------------------------------------------------------------------
I'm struggling to find what causes dbmail to be unable to contact the
server.
Patch src_modules_authldap.c-20161006.diff bumps the search timeout to 60
seconds.
The error is that the ldap server can't be contacted. Although there is
only a single connection, nothing suggests it's blocked, out of memory or
any of the normal errors. As there are relatively few ldap server queries
compared to sql, pooling probably wouldn't help. I'm assuming the ldap
server/active directory is fine and that it's just dbmail that's unable to
connect.
Not sure what would have caused your client to segfault, the retries would
have paused sending anything back to the client and at most would have
dropped the connection.
Unable to see any "dm_config.c:134: config_get_value_once: Assertion"
errors in the log, can you upload the part of the log leading up to that
assertion please as I'm not sure where getting config values would be
called after startup.

Issue History
Date Modified Username Field Change
======================================================================
10-Aug-16 23:27 PeterS New Issue
10-Aug-16 23:27 PeterS File Added: dbmail.err
11-Aug-16 19:57 PeterS Note Added: 0003740
07-Sep-16 23:37 PeterS Note Added: 0003745
08-Sep-16 19:01 alan Note Added: 0003746
15-Sep-16 17:31 PeterS Note Added: 0003749
28-Sep-16 20:24 alan Note Added: 0003751
30-Sep-16 16:58 alan Note Added: 0003752
30-Sep-16 17:01 alan File Added: dbmail_src_modules_authldap.c.diff

30-Sep-16 18:15 alan Issue Monitored: alan
30-Sep-16 18:17 alan File Added:
dbmail_src_modules_authldap.c-20160930.diff
30-Sep-16 23:37 PeterS Note Added: 0003753
30-Sep-16 23:37 PeterS File Added: dbmail.20160930-1642.err

01-Oct-16 00:18 PeterS Note Added: 0003754
01-Oct-16 00:18 PeterS File Added: dbmail.20160930-1721.err

01-Oct-16 15:29 alan File Added: src_modules_authldap.c-20161001.diff

01-Oct-16 15:36 alan Note Added: 0003755
03-Oct-16 16:54 PeterS Note Added: 0003756
03-Oct-16 16:56 PeterS File Added: dbmail.20161003-0956.err.xz

03-Oct-16 17:52 PeterS Note Added: 0003757
03-Oct-16 17:52 PeterS File Added: dbmail.20161003-1055.err.xz

03-Oct-16 19:57 PeterS Note Edited: 0003757
04-Oct-16 17:54 alan Note Added: 0003758
04-Oct-16 17:54 alan File Added: src_modules_authldap.c-20161004.diff

04-Oct-16 20:47 PeterS Note Added: 0003759
04-Oct-16 20:48 PeterS File Added: dbmail.20161004-1348.err.xz

04-Oct-16 21:47 PeterS Note Edited: 0003759
06-Oct-16 15:57 alan Note Added: 0003760
======================================================================

_______________________________________________
Dbmail-dev mailing list
Dbmail-dev@dbmail.org
http://mailman.fastxs.nl/cgi-bin/mailman/listinfo/dbmail-dev
[DBMail 0001081]: DBmail ABEND'ing upon LDAP access error. [ In reply to ]
A NOTE has been added to this issue.
======================================================================
http://dbmail.org/mantis/view.php?id=1081
======================================================================
Reported By: PeterS
Assigned To:
======================================================================
Project: DBMail
Issue ID: 1081
Category: Authentication layer
Reproducibility: always
Severity: major
Priority: normal
Status: new
target:
======================================================================
Date Submitted: 10-Aug-16 23:27 CEST
Last Modified: 06-Oct-16 19:45 CEST
======================================================================
Summary: DBmail ABEND'ing upon LDAP access error.
Description:
I've been trying to find a version of DBmail 3.X that does not ABEND while
it is accessing LDAP. I have not been able to determine the sequence of
events that causes the ABEND, however it happens to me many many times a
day. I've written a script around the execution of DBmail to ensure that
it restarts when it ABENDs.
======================================================================

----------------------------------------------------------------------
(0003740) PeterS (reporter) - 11-Aug-16 19:57
http://dbmail.org/mantis/view.php?id=1081#c3740
----------------------------------------------------------------------
Persuant to the "Bug posting etiquette", yesterday I pulled a clean,
newest, version of DBmail from GIT, compiled it, installed it, cleared the
dbmail.err log, and started it. Since then I have received 14 ABENDs.

[compiler@swlx143 dbmail-git]$ date ; git status -u no
Thu Aug 11 13:02:11 CDT 2016
# On branch master
nothing to commit (working directory clean)
[compiler@swlx143 dbmail-git]$

----------------------------------------------------------------------
(0003745) PeterS (reporter) - 07-Sep-16 23:37
http://dbmail.org/mantis/view.php?id=1081#c3745
----------------------------------------------------------------------
Today I migrated my database over to new hardware and am receiving the same
error.

dbmail-imapd: sasl.c:75: ldap_sasl_bind: Assertion `(
(ld)->ldc->ldc_options.ldo_valid == 0x2 )' failed.

Red Hat RHEL7
dbmail-auth-ldap-3.2.3-1.el7.x86_64
dbmail-3.2.3-1.el7.x86_64
postgresql-9.2.15-1.el7_2.x86_64
postgresql-server-9.2.15-1.el7_2.x86_64

----------------------------------------------------------------------
(0003746) alan (reporter) - 08-Sep-16 19:01
http://dbmail.org/mantis/view.php?id=1081#c3746
----------------------------------------------------------------------
This error appears to be ldap connection related. Error reports elsewhere
suggest either something in /etc/openldap/ldap.conf or library
compatibility following updates to openssl. Are there any certificates
missing?
Are you able to compile dbmail locally, this may address library issues.

----------------------------------------------------------------------
(0003749) PeterS (reporter) - 15-Sep-16 17:31
http://dbmail.org/mantis/view.php?id=1081#c3749
----------------------------------------------------------------------
Alan, I have already tried the latest build(s) of DBmail to no avail. I've
disabled all certificate handling in OPENLDAP using both:

TLS_REQCERT never

and also:

TLS_REQCERT allow

with no change in results. We have multiple Microsoft AD LDAP servers
here, with varying certificates that aren't really verifiable, so I have
always disabled tracking or requiring valid certificates. We have other
Linux systems using LDAP/OPENLDAP without any issues using this same
config.

----------------------------------------------------------------------
(0003751) alan (reporter) - 28-Sep-16 20:24
http://dbmail.org/mantis/view.php?id=1081#c3751
----------------------------------------------------------------------
Confirm I'm able to reproduce the issue, think I've tracked down the parts
of the code causing it. Will test asap.

----------------------------------------------------------------------
(0003752) alan (reporter) - 30-Sep-16 16:58
http://dbmail.org/mantis/view.php?id=1081#c3752
----------------------------------------------------------------------
This issues appears to be fixed using the attached patch, please test.

----------------------------------------------------------------------
(0003753) PeterS (reporter) - 30-Sep-16 23:37
http://dbmail.org/mantis/view.php?id=1081#c3753
----------------------------------------------------------------------
I've patched the Red Hat RHEL 7 (EPEL 7) version of DBmail
(dbmail-3.2.3-1.el7.x86_64, dbmail-auth-ldap-3.2.3-1.el7.x86_64, and
dbmail-debuginfo-3.2.3-1.el7.x86_64) with
dbmail_src_modules_authldap.c-20160930.diff and receive the attached error,
dbmail.20160930-1642.err . The service still ABEND's.

----------------------------------------------------------------------
(0003754) PeterS (reporter) - 01-Oct-16 00:18
http://dbmail.org/mantis/view.php?id=1081#c3754
----------------------------------------------------------------------
I've patched a GIT clone (git clone git://git.dbmail.eu/paul/dbmail) from
today with dbmail_src_modules_authldap.c-20160930.diff and receive the
attached error, dbmail.20160930-1721.err . The service still ABEND's.

----------------------------------------------------------------------
(0003755) alan (reporter) - 01-Oct-16 15:36
http://dbmail.org/mantis/view.php?id=1081#c3755
----------------------------------------------------------------------
Ok, latest patch src_modules_authldap.c-20161001.diff increases the time to
check for server gone away to 2 minutes.

There is an edge case where after binding, the connection instantly goes
away. Also the daemon exits if the ldap server doesn't reappear after two
minutes, similar to not starting if not available.

There should be warnings such as the following, then recovers when the
server returns.
Oct 01 14:06:38 lully.p-o.co.uk dbmail-imapd[96422]: [0x805c0a800]
Warning:[auth] ldap_con_get(+142): LDAP gone away: Can't contact LDAP
server. Trying to reconnect(1/5).
Oct 01 14:06:39 lully.p-o.co.uk dbmail-imapd[96422]: [0x805c0a800]
Warning:[auth] ldap_con_get(+142): LDAP gone away: Can't contact LDAP
server. Trying to reconnect(2/5).
Oct 01 14:06:49 lully.p-o.co.uk dbmail-imapd[96422]: [0x805c0a800]
Warning:[auth] ldap_con_get(+142): LDAP gone away: Can't contact LDAP
server. Trying to reconnect(119/5).

Does this get closer to addressing your issue?

----------------------------------------------------------------------
(0003756) PeterS (reporter) - 03-Oct-16 16:54
http://dbmail.org/mantis/view.php?id=1081#c3756
----------------------------------------------------------------------
I've patched the Red Hat RHEL 7 (EPEL 7) version of DBmail
(dbmail-3.2.3-1.el7.x86_64, dbmail-auth-ldap-3.2.3-1.el7.x86_64, and
dbmail-debuginfo-3.2.3-1.el7.x86_64) with
src_modules_authldap.c-20161001.diff and receive the attached error,
dbmail.20161003-0956.err . This time I've cut the last parts of the full
dbmail.err file which consists of the last threads 0x7f86e84e8000 and
0x7f86e81d1400 . The service still ABEND's.

I'll now try with today's GIT version and the patch.

----------------------------------------------------------------------
(0003757) PeterS (reporter) - 03-Oct-16 19:57
http://dbmail.org/mantis/view.php?id=1081#c3757
----------------------------------------------------------------------
I've patched a GIT clone (git clone git://git.dbmail.eu/paul/dbmail) from
today with src_modules_authldap.c-20161001.diff and receive the attached
error, dbmail.20161003-1055.err (dbmail.20161003-1055.err.xz) . This time
I've cut the last parts of the full dbmail.err file which consists of the
last threads 0x1466450 and 0xde7400. The service still ABEND's.



----------------------------------------------------------------------
(0003758) alan (reporter) - 04-Oct-16 17:54
http://dbmail.org/mantis/view.php?id=1081#c3758
----------------------------------------------------------------------
Patch src_modules_authldap.c-20161004.diff adds slightly more debugging
information, plus binding and search retries.

The error appears to be transient and localised in authldap_search, this
patch should address the issue.

fyi there is a copy at https://github.com/alan-hicks/dbmail

----------------------------------------------------------------------
(0003759) PeterS (reporter) - 04-Oct-16 21:47
http://dbmail.org/mantis/view.php?id=1081#c3759
----------------------------------------------------------------------
I pulled down a clone of your https://github.com/alan-hicks/dbmail and had
it running. This version does not ABEND but rather stops communicating
(which caused my client to Segfault in the middle of doing IMAPS things).

Please see the attached dbmail.20161004-1348.err
(dbmail.20161004-1348.err.xz) for the full error listing.

Correction: it did ABEND when I restarted my IMAPS client and attempted to
connect to the DBmail instance. The same "dbmail-imapd: dm_config.c:134:
config_get_value_once: Assertion `config_dict' failed." error as before is
the last thing logged.



----------------------------------------------------------------------
(0003760) alan (reporter) - 06-Oct-16 15:57
http://dbmail.org/mantis/view.php?id=1081#c3760
----------------------------------------------------------------------
I'm struggling to find what causes dbmail to be unable to contact the
server.
Patch src_modules_authldap.c-20161006.diff bumps the search timeout to 60
seconds.
The error is that the ldap server can't be contacted. Although there is
only a single connection, nothing suggests it's blocked, out of memory or
any of the normal errors. As there are relatively few ldap server queries
compared to sql, pooling probably wouldn't help. I'm assuming the ldap
server/active directory is fine and that it's just dbmail that's unable to
connect.
Not sure what would have caused your client to segfault, the retries would
have paused sending anything back to the client and at most would have
dropped the connection.
Unable to see any "dm_config.c:134: config_get_value_once: Assertion"
errors in the log, can you upload the part of the log leading up to that
assertion please as I'm not sure where getting config values would be
called after startup.

----------------------------------------------------------------------
(0003761) PeterS (reporter) - 06-Oct-16 19:45
http://dbmail.org/mantis/view.php?id=1081#c3761
----------------------------------------------------------------------
Alan, thank you for continuing to assist with this. At some point I may
write a script that will allow me to sanitize my logs so that I can just
post the entire thing to you.

I am attaching even more information for you.

It looks like my dbmail is not disconnecting sessions that it makes with
our LDAP server, and apparently finally ABEND's after it attempts to re-use
a connection to LDAP that is over an hour old.

Please see the attached logs -- dbmail.20161006-1236.err
(dbmail.20161006-1236.err.xz), dbmail.20161006-1236.ldaps_tcpdump.txt
(dbmail.20161006-1236.ldaps_tcpdump.txt.xz), and
dbmail.20161006-1236.extra.txt (dbmail.20161006-1236.extra.txt.xz) .

dbmail.20161006-1236.err contains the very beginning of the log and a large
portion of the end of the log. The "dm_config.c:134:" error is there.
dbmail.20161006-1236.ldaps_tcpdump.txt is the text framing output of the
*full* LDAPS conversation with my LDAP server (I captured all traffic to
and from its IP address.) dbmail.20161006-1236.extra.txt contains all of
the "ldap_con_get"-s that exist in the full dbmail.err log. There is a
drop-off from 11:14:53 to 12:15:05 at which point DBmail ABENDs.

Issue History
Date Modified Username Field Change
======================================================================
10-Aug-16 23:27 PeterS New Issue
10-Aug-16 23:27 PeterS File Added: dbmail.err
11-Aug-16 19:57 PeterS Note Added: 0003740
07-Sep-16 23:37 PeterS Note Added: 0003745
08-Sep-16 19:01 alan Note Added: 0003746
15-Sep-16 17:31 PeterS Note Added: 0003749
28-Sep-16 20:24 alan Note Added: 0003751
30-Sep-16 16:58 alan Note Added: 0003752
30-Sep-16 17:01 alan File Added: dbmail_src_modules_authldap.c.diff

30-Sep-16 18:15 alan Issue Monitored: alan
30-Sep-16 18:17 alan File Added:
dbmail_src_modules_authldap.c-20160930.diff
30-Sep-16 23:37 PeterS Note Added: 0003753
30-Sep-16 23:37 PeterS File Added: dbmail.20160930-1642.err

01-Oct-16 00:18 PeterS Note Added: 0003754
01-Oct-16 00:18 PeterS File Added: dbmail.20160930-1721.err

01-Oct-16 15:29 alan File Added: src_modules_authldap.c-20161001.diff

01-Oct-16 15:36 alan Note Added: 0003755
03-Oct-16 16:54 PeterS Note Added: 0003756
03-Oct-16 16:56 PeterS File Added: dbmail.20161003-0956.err.xz

03-Oct-16 17:52 PeterS Note Added: 0003757
03-Oct-16 17:52 PeterS File Added: dbmail.20161003-1055.err.xz

03-Oct-16 19:57 PeterS Note Edited: 0003757
04-Oct-16 17:54 alan Note Added: 0003758
04-Oct-16 17:54 alan File Added: src_modules_authldap.c-20161004.diff

04-Oct-16 20:47 PeterS Note Added: 0003759
04-Oct-16 20:48 PeterS File Added: dbmail.20161004-1348.err.xz

04-Oct-16 21:47 PeterS Note Edited: 0003759
06-Oct-16 15:57 alan Note Added: 0003760
06-Oct-16 15:57 alan File Added: src_modules_authldap.c-20161006.diff

06-Oct-16 19:45 PeterS Note Added: 0003761
======================================================================

_______________________________________________
Dbmail-dev mailing list
Dbmail-dev@dbmail.org
http://mailman.fastxs.nl/cgi-bin/mailman/listinfo/dbmail-dev
[DBMail 0001081]: DBmail ABEND'ing upon LDAP access error. [ In reply to ]
A NOTE has been added to this issue.
======================================================================
http://dbmail.org/mantis/view.php?id=1081
======================================================================
Reported By: PeterS
Assigned To:
======================================================================
Project: DBMail
Issue ID: 1081
Category: Authentication layer
Reproducibility: always
Severity: major
Priority: normal
Status: new
target:
======================================================================
Date Submitted: 10-Aug-16 23:27 CEST
Last Modified: 12-Oct-16 20:13 CEST
======================================================================
Summary: DBmail ABEND'ing upon LDAP access error.
Description:
I've been trying to find a version of DBmail 3.X that does not ABEND while
it is accessing LDAP. I have not been able to determine the sequence of
events that causes the ABEND, however it happens to me many many times a
day. I've written a script around the execution of DBmail to ensure that
it restarts when it ABENDs.
======================================================================

----------------------------------------------------------------------
(0003740) PeterS (reporter) - 11-Aug-16 19:57
http://dbmail.org/mantis/view.php?id=1081#c3740
----------------------------------------------------------------------
Persuant to the "Bug posting etiquette", yesterday I pulled a clean,
newest, version of DBmail from GIT, compiled it, installed it, cleared the
dbmail.err log, and started it. Since then I have received 14 ABENDs.

[compiler@swlx143 dbmail-git]$ date ; git status -u no
Thu Aug 11 13:02:11 CDT 2016
# On branch master
nothing to commit (working directory clean)
[compiler@swlx143 dbmail-git]$

----------------------------------------------------------------------
(0003745) PeterS (reporter) - 07-Sep-16 23:37
http://dbmail.org/mantis/view.php?id=1081#c3745
----------------------------------------------------------------------
Today I migrated my database over to new hardware and am receiving the same
error.

dbmail-imapd: sasl.c:75: ldap_sasl_bind: Assertion `(
(ld)->ldc->ldc_options.ldo_valid == 0x2 )' failed.

Red Hat RHEL7
dbmail-auth-ldap-3.2.3-1.el7.x86_64
dbmail-3.2.3-1.el7.x86_64
postgresql-9.2.15-1.el7_2.x86_64
postgresql-server-9.2.15-1.el7_2.x86_64

----------------------------------------------------------------------
(0003746) alan (reporter) - 08-Sep-16 19:01
http://dbmail.org/mantis/view.php?id=1081#c3746
----------------------------------------------------------------------
This error appears to be ldap connection related. Error reports elsewhere
suggest either something in /etc/openldap/ldap.conf or library
compatibility following updates to openssl. Are there any certificates
missing?
Are you able to compile dbmail locally, this may address library issues.

----------------------------------------------------------------------
(0003749) PeterS (reporter) - 15-Sep-16 17:31
http://dbmail.org/mantis/view.php?id=1081#c3749
----------------------------------------------------------------------
Alan, I have already tried the latest build(s) of DBmail to no avail. I've
disabled all certificate handling in OPENLDAP using both:

TLS_REQCERT never

and also:

TLS_REQCERT allow

with no change in results. We have multiple Microsoft AD LDAP servers
here, with varying certificates that aren't really verifiable, so I have
always disabled tracking or requiring valid certificates. We have other
Linux systems using LDAP/OPENLDAP without any issues using this same
config.

----------------------------------------------------------------------
(0003751) alan (reporter) - 28-Sep-16 20:24
http://dbmail.org/mantis/view.php?id=1081#c3751
----------------------------------------------------------------------
Confirm I'm able to reproduce the issue, think I've tracked down the parts
of the code causing it. Will test asap.

----------------------------------------------------------------------
(0003752) alan (reporter) - 30-Sep-16 16:58
http://dbmail.org/mantis/view.php?id=1081#c3752
----------------------------------------------------------------------
This issues appears to be fixed using the attached patch, please test.

----------------------------------------------------------------------
(0003753) PeterS (reporter) - 30-Sep-16 23:37
http://dbmail.org/mantis/view.php?id=1081#c3753
----------------------------------------------------------------------
I've patched the Red Hat RHEL 7 (EPEL 7) version of DBmail
(dbmail-3.2.3-1.el7.x86_64, dbmail-auth-ldap-3.2.3-1.el7.x86_64, and
dbmail-debuginfo-3.2.3-1.el7.x86_64) with
dbmail_src_modules_authldap.c-20160930.diff and receive the attached error,
dbmail.20160930-1642.err . The service still ABEND's.

----------------------------------------------------------------------
(0003754) PeterS (reporter) - 01-Oct-16 00:18
http://dbmail.org/mantis/view.php?id=1081#c3754
----------------------------------------------------------------------
I've patched a GIT clone (git clone git://git.dbmail.eu/paul/dbmail) from
today with dbmail_src_modules_authldap.c-20160930.diff and receive the
attached error, dbmail.20160930-1721.err . The service still ABEND's.

----------------------------------------------------------------------
(0003755) alan (reporter) - 01-Oct-16 15:36
http://dbmail.org/mantis/view.php?id=1081#c3755
----------------------------------------------------------------------
Ok, latest patch src_modules_authldap.c-20161001.diff increases the time to
check for server gone away to 2 minutes.

There is an edge case where after binding, the connection instantly goes
away. Also the daemon exits if the ldap server doesn't reappear after two
minutes, similar to not starting if not available.

There should be warnings such as the following, then recovers when the
server returns.
Oct 01 14:06:38 lully.p-o.co.uk dbmail-imapd[96422]: [0x805c0a800]
Warning:[auth] ldap_con_get(+142): LDAP gone away: Can't contact LDAP
server. Trying to reconnect(1/5).
Oct 01 14:06:39 lully.p-o.co.uk dbmail-imapd[96422]: [0x805c0a800]
Warning:[auth] ldap_con_get(+142): LDAP gone away: Can't contact LDAP
server. Trying to reconnect(2/5).
Oct 01 14:06:49 lully.p-o.co.uk dbmail-imapd[96422]: [0x805c0a800]
Warning:[auth] ldap_con_get(+142): LDAP gone away: Can't contact LDAP
server. Trying to reconnect(119/5).

Does this get closer to addressing your issue?

----------------------------------------------------------------------
(0003756) PeterS (reporter) - 03-Oct-16 16:54
http://dbmail.org/mantis/view.php?id=1081#c3756
----------------------------------------------------------------------
I've patched the Red Hat RHEL 7 (EPEL 7) version of DBmail
(dbmail-3.2.3-1.el7.x86_64, dbmail-auth-ldap-3.2.3-1.el7.x86_64, and
dbmail-debuginfo-3.2.3-1.el7.x86_64) with
src_modules_authldap.c-20161001.diff and receive the attached error,
dbmail.20161003-0956.err . This time I've cut the last parts of the full
dbmail.err file which consists of the last threads 0x7f86e84e8000 and
0x7f86e81d1400 . The service still ABEND's.

I'll now try with today's GIT version and the patch.

----------------------------------------------------------------------
(0003757) PeterS (reporter) - 03-Oct-16 19:57
http://dbmail.org/mantis/view.php?id=1081#c3757
----------------------------------------------------------------------
I've patched a GIT clone (git clone git://git.dbmail.eu/paul/dbmail) from
today with src_modules_authldap.c-20161001.diff and receive the attached
error, dbmail.20161003-1055.err (dbmail.20161003-1055.err.xz) . This time
I've cut the last parts of the full dbmail.err file which consists of the
last threads 0x1466450 and 0xde7400. The service still ABEND's.



----------------------------------------------------------------------
(0003758) alan (reporter) - 04-Oct-16 17:54
http://dbmail.org/mantis/view.php?id=1081#c3758
----------------------------------------------------------------------
Patch src_modules_authldap.c-20161004.diff adds slightly more debugging
information, plus binding and search retries.

The error appears to be transient and localised in authldap_search, this
patch should address the issue.

fyi there is a copy at https://github.com/alan-hicks/dbmail

----------------------------------------------------------------------
(0003759) PeterS (reporter) - 04-Oct-16 21:47
http://dbmail.org/mantis/view.php?id=1081#c3759
----------------------------------------------------------------------
I pulled down a clone of your https://github.com/alan-hicks/dbmail and had
it running. This version does not ABEND but rather stops communicating
(which caused my client to Segfault in the middle of doing IMAPS things).

Please see the attached dbmail.20161004-1348.err
(dbmail.20161004-1348.err.xz) for the full error listing.

Correction: it did ABEND when I restarted my IMAPS client and attempted to
connect to the DBmail instance. The same "dbmail-imapd: dm_config.c:134:
config_get_value_once: Assertion `config_dict' failed." error as before is
the last thing logged.



----------------------------------------------------------------------
(0003760) alan (reporter) - 06-Oct-16 15:57
http://dbmail.org/mantis/view.php?id=1081#c3760
----------------------------------------------------------------------
I'm struggling to find what causes dbmail to be unable to contact the
server.
Patch src_modules_authldap.c-20161006.diff bumps the search timeout to 60
seconds.
The error is that the ldap server can't be contacted. Although there is
only a single connection, nothing suggests it's blocked, out of memory or
any of the normal errors. As there are relatively few ldap server queries
compared to sql, pooling probably wouldn't help. I'm assuming the ldap
server/active directory is fine and that it's just dbmail that's unable to
connect.
Not sure what would have caused your client to segfault, the retries would
have paused sending anything back to the client and at most would have
dropped the connection.
Unable to see any "dm_config.c:134: config_get_value_once: Assertion"
errors in the log, can you upload the part of the log leading up to that
assertion please as I'm not sure where getting config values would be
called after startup.

----------------------------------------------------------------------
(0003761) PeterS (reporter) - 06-Oct-16 19:45
http://dbmail.org/mantis/view.php?id=1081#c3761
----------------------------------------------------------------------
Alan, thank you for continuing to assist with this. At some point I may
write a script that will allow me to sanitize my logs so that I can just
post the entire thing to you.

I am attaching even more information for you.

It looks like my dbmail is not disconnecting sessions that it makes with
our LDAP server, and apparently finally ABEND's after it attempts to re-use
a connection to LDAP that is over an hour old.

Please see the attached logs -- dbmail.20161006-1236.err
(dbmail.20161006-1236.err.xz), dbmail.20161006-1236.ldaps_tcpdump.txt
(dbmail.20161006-1236.ldaps_tcpdump.txt.xz), and
dbmail.20161006-1236.extra.txt (dbmail.20161006-1236.extra.txt.xz) .

dbmail.20161006-1236.err contains the very beginning of the log and a large
portion of the end of the log. The "dm_config.c:134:" error is there.
dbmail.20161006-1236.ldaps_tcpdump.txt is the text framing output of the
*full* LDAPS conversation with my LDAP server (I captured all traffic to
and from its IP address.) dbmail.20161006-1236.extra.txt contains all of
the "ldap_con_get"-s that exist in the full dbmail.err log. There is a
drop-off from 11:14:53 to 12:15:05 at which point DBmail ABENDs.

----------------------------------------------------------------------
(0003762) alan (reporter) - 12-Oct-16 20:13
http://dbmail.org/mantis/view.php?id=1081#c3762
----------------------------------------------------------------------
I'm happy to continue supporting dbmail, it's used here so makes sense to
share and keep it up to date.

It's normal to keep ldap connections open if the threads are still extant,
it's similar for the database.

As there are no obvious reasons why the search suddenly stops without an
error, it may be due to a long running thread hogging a resource, I'm aware
that occasionally a long folder listing/update appears to suspend some
clients.

Patch src_modules_authldap.c-20161012.diff updates some deprecated
functions and takes the timeout value from query_timeout in dbmail.conf
which from memory defaults to 300 (5 minutes).

If this fixes the issue great, otherwise updating the code to be more
resource friendly looks as though it will take much longer. As
dbmail.20161006-1236.err.xz timed out after 60 seconds it may not achieve
much, though you might try updating query_timeout to 3600 (an hour) to test
if whatever is causing the block is released. Hopefully whatever precedes
any resumption will help pinpoint the blocking.

Issue History
Date Modified Username Field Change
======================================================================
10-Aug-16 23:27 PeterS New Issue
10-Aug-16 23:27 PeterS File Added: dbmail.err
11-Aug-16 19:57 PeterS Note Added: 0003740
07-Sep-16 23:37 PeterS Note Added: 0003745
08-Sep-16 19:01 alan Note Added: 0003746
15-Sep-16 17:31 PeterS Note Added: 0003749
28-Sep-16 20:24 alan Note Added: 0003751
30-Sep-16 16:58 alan Note Added: 0003752
30-Sep-16 17:01 alan File Added: dbmail_src_modules_authldap.c.diff

30-Sep-16 18:15 alan Issue Monitored: alan
30-Sep-16 18:17 alan File Added:
dbmail_src_modules_authldap.c-20160930.diff
30-Sep-16 23:37 PeterS Note Added: 0003753
30-Sep-16 23:37 PeterS File Added: dbmail.20160930-1642.err

01-Oct-16 00:18 PeterS Note Added: 0003754
01-Oct-16 00:18 PeterS File Added: dbmail.20160930-1721.err

01-Oct-16 15:29 alan File Added: src_modules_authldap.c-20161001.diff

01-Oct-16 15:36 alan Note Added: 0003755
03-Oct-16 16:54 PeterS Note Added: 0003756
03-Oct-16 16:56 PeterS File Added: dbmail.20161003-0956.err.xz

03-Oct-16 17:52 PeterS Note Added: 0003757
03-Oct-16 17:52 PeterS File Added: dbmail.20161003-1055.err.xz

03-Oct-16 19:57 PeterS Note Edited: 0003757
04-Oct-16 17:54 alan Note Added: 0003758
04-Oct-16 17:54 alan File Added: src_modules_authldap.c-20161004.diff

04-Oct-16 20:47 PeterS Note Added: 0003759
04-Oct-16 20:48 PeterS File Added: dbmail.20161004-1348.err.xz

04-Oct-16 21:47 PeterS Note Edited: 0003759
06-Oct-16 15:57 alan Note Added: 0003760
06-Oct-16 15:57 alan File Added: src_modules_authldap.c-20161006.diff

06-Oct-16 19:45 PeterS Note Added: 0003761
06-Oct-16 19:45 PeterS File Added: dbmail.20161006-1236.err.xz

06-Oct-16 19:46 PeterS File Added:
dbmail.20161006-1236.ldaps_tcpdump.txt.xz
06-Oct-16 19:46 PeterS File Added: dbmail.20161006-1236.extra.txt.xz

12-Oct-16 19:20 alan File Added: src_modules_authldap.c-20161012.diff

12-Oct-16 20:13 alan Note Added: 0003762
======================================================================

_______________________________________________
Dbmail-dev mailing list
Dbmail-dev@dbmail.org
http://mailman.fastxs.nl/cgi-bin/mailman/listinfo/dbmail-dev
[DBMail 0001081]: DBmail ABEND'ing upon LDAP access error. [ In reply to ]
A NOTE has been added to this issue.
======================================================================
http://dbmail.org/mantis/view.php?id=1081
======================================================================
Reported By: PeterS
Assigned To:
======================================================================
Project: DBMail
Issue ID: 1081
Category: Authentication layer
Reproducibility: always
Severity: major
Priority: normal
Status: new
target:
======================================================================
Date Submitted: 10-Aug-16 23:27 CEST
Last Modified: 13-Oct-16 17:16 CEST
======================================================================
Summary: DBmail ABEND'ing upon LDAP access error.
Description:
I've been trying to find a version of DBmail 3.X that does not ABEND while
it is accessing LDAP. I have not been able to determine the sequence of
events that causes the ABEND, however it happens to me many many times a
day. I've written a script around the execution of DBmail to ensure that
it restarts when it ABENDs.
======================================================================

----------------------------------------------------------------------
(0003740) PeterS (reporter) - 11-Aug-16 19:57
http://dbmail.org/mantis/view.php?id=1081#c3740
----------------------------------------------------------------------
Persuant to the "Bug posting etiquette", yesterday I pulled a clean,
newest, version of DBmail from GIT, compiled it, installed it, cleared the
dbmail.err log, and started it. Since then I have received 14 ABENDs.

[compiler@swlx143 dbmail-git]$ date ; git status -u no
Thu Aug 11 13:02:11 CDT 2016
# On branch master
nothing to commit (working directory clean)
[compiler@swlx143 dbmail-git]$

----------------------------------------------------------------------
(0003745) PeterS (reporter) - 07-Sep-16 23:37
http://dbmail.org/mantis/view.php?id=1081#c3745
----------------------------------------------------------------------
Today I migrated my database over to new hardware and am receiving the same
error.

dbmail-imapd: sasl.c:75: ldap_sasl_bind: Assertion `(
(ld)->ldc->ldc_options.ldo_valid == 0x2 )' failed.

Red Hat RHEL7
dbmail-auth-ldap-3.2.3-1.el7.x86_64
dbmail-3.2.3-1.el7.x86_64
postgresql-9.2.15-1.el7_2.x86_64
postgresql-server-9.2.15-1.el7_2.x86_64

----------------------------------------------------------------------
(0003746) alan (reporter) - 08-Sep-16 19:01
http://dbmail.org/mantis/view.php?id=1081#c3746
----------------------------------------------------------------------
This error appears to be ldap connection related. Error reports elsewhere
suggest either something in /etc/openldap/ldap.conf or library
compatibility following updates to openssl. Are there any certificates
missing?
Are you able to compile dbmail locally, this may address library issues.

----------------------------------------------------------------------
(0003749) PeterS (reporter) - 15-Sep-16 17:31
http://dbmail.org/mantis/view.php?id=1081#c3749
----------------------------------------------------------------------
Alan, I have already tried the latest build(s) of DBmail to no avail. I've
disabled all certificate handling in OPENLDAP using both:

TLS_REQCERT never

and also:

TLS_REQCERT allow

with no change in results. We have multiple Microsoft AD LDAP servers
here, with varying certificates that aren't really verifiable, so I have
always disabled tracking or requiring valid certificates. We have other
Linux systems using LDAP/OPENLDAP without any issues using this same
config.

----------------------------------------------------------------------
(0003751) alan (reporter) - 28-Sep-16 20:24
http://dbmail.org/mantis/view.php?id=1081#c3751
----------------------------------------------------------------------
Confirm I'm able to reproduce the issue, think I've tracked down the parts
of the code causing it. Will test asap.

----------------------------------------------------------------------
(0003752) alan (reporter) - 30-Sep-16 16:58
http://dbmail.org/mantis/view.php?id=1081#c3752
----------------------------------------------------------------------
This issues appears to be fixed using the attached patch, please test.

----------------------------------------------------------------------
(0003753) PeterS (reporter) - 30-Sep-16 23:37
http://dbmail.org/mantis/view.php?id=1081#c3753
----------------------------------------------------------------------
I've patched the Red Hat RHEL 7 (EPEL 7) version of DBmail
(dbmail-3.2.3-1.el7.x86_64, dbmail-auth-ldap-3.2.3-1.el7.x86_64, and
dbmail-debuginfo-3.2.3-1.el7.x86_64) with
dbmail_src_modules_authldap.c-20160930.diff and receive the attached error,
dbmail.20160930-1642.err . The service still ABEND's.

----------------------------------------------------------------------
(0003754) PeterS (reporter) - 01-Oct-16 00:18
http://dbmail.org/mantis/view.php?id=1081#c3754
----------------------------------------------------------------------
I've patched a GIT clone (git clone git://git.dbmail.eu/paul/dbmail) from
today with dbmail_src_modules_authldap.c-20160930.diff and receive the
attached error, dbmail.20160930-1721.err . The service still ABEND's.

----------------------------------------------------------------------
(0003755) alan (reporter) - 01-Oct-16 15:36
http://dbmail.org/mantis/view.php?id=1081#c3755
----------------------------------------------------------------------
Ok, latest patch src_modules_authldap.c-20161001.diff increases the time to
check for server gone away to 2 minutes.

There is an edge case where after binding, the connection instantly goes
away. Also the daemon exits if the ldap server doesn't reappear after two
minutes, similar to not starting if not available.

There should be warnings such as the following, then recovers when the
server returns.
Oct 01 14:06:38 lully.p-o.co.uk dbmail-imapd[96422]: [0x805c0a800]
Warning:[auth] ldap_con_get(+142): LDAP gone away: Can't contact LDAP
server. Trying to reconnect(1/5).
Oct 01 14:06:39 lully.p-o.co.uk dbmail-imapd[96422]: [0x805c0a800]
Warning:[auth] ldap_con_get(+142): LDAP gone away: Can't contact LDAP
server. Trying to reconnect(2/5).
Oct 01 14:06:49 lully.p-o.co.uk dbmail-imapd[96422]: [0x805c0a800]
Warning:[auth] ldap_con_get(+142): LDAP gone away: Can't contact LDAP
server. Trying to reconnect(119/5).

Does this get closer to addressing your issue?

----------------------------------------------------------------------
(0003756) PeterS (reporter) - 03-Oct-16 16:54
http://dbmail.org/mantis/view.php?id=1081#c3756
----------------------------------------------------------------------
I've patched the Red Hat RHEL 7 (EPEL 7) version of DBmail
(dbmail-3.2.3-1.el7.x86_64, dbmail-auth-ldap-3.2.3-1.el7.x86_64, and
dbmail-debuginfo-3.2.3-1.el7.x86_64) with
src_modules_authldap.c-20161001.diff and receive the attached error,
dbmail.20161003-0956.err . This time I've cut the last parts of the full
dbmail.err file which consists of the last threads 0x7f86e84e8000 and
0x7f86e81d1400 . The service still ABEND's.

I'll now try with today's GIT version and the patch.

----------------------------------------------------------------------
(0003757) PeterS (reporter) - 03-Oct-16 19:57
http://dbmail.org/mantis/view.php?id=1081#c3757
----------------------------------------------------------------------
I've patched a GIT clone (git clone git://git.dbmail.eu/paul/dbmail) from
today with src_modules_authldap.c-20161001.diff and receive the attached
error, dbmail.20161003-1055.err (dbmail.20161003-1055.err.xz) . This time
I've cut the last parts of the full dbmail.err file which consists of the
last threads 0x1466450 and 0xde7400. The service still ABEND's.



----------------------------------------------------------------------
(0003758) alan (reporter) - 04-Oct-16 17:54
http://dbmail.org/mantis/view.php?id=1081#c3758
----------------------------------------------------------------------
Patch src_modules_authldap.c-20161004.diff adds slightly more debugging
information, plus binding and search retries.

The error appears to be transient and localised in authldap_search, this
patch should address the issue.

fyi there is a copy at https://github.com/alan-hicks/dbmail

----------------------------------------------------------------------
(0003759) PeterS (reporter) - 04-Oct-16 21:47
http://dbmail.org/mantis/view.php?id=1081#c3759
----------------------------------------------------------------------
I pulled down a clone of your https://github.com/alan-hicks/dbmail and had
it running. This version does not ABEND but rather stops communicating
(which caused my client to Segfault in the middle of doing IMAPS things).

Please see the attached dbmail.20161004-1348.err
(dbmail.20161004-1348.err.xz) for the full error listing.

Correction: it did ABEND when I restarted my IMAPS client and attempted to
connect to the DBmail instance. The same "dbmail-imapd: dm_config.c:134:
config_get_value_once: Assertion `config_dict' failed." error as before is
the last thing logged.



----------------------------------------------------------------------
(0003760) alan (reporter) - 06-Oct-16 15:57
http://dbmail.org/mantis/view.php?id=1081#c3760
----------------------------------------------------------------------
I'm struggling to find what causes dbmail to be unable to contact the
server.
Patch src_modules_authldap.c-20161006.diff bumps the search timeout to 60
seconds.
The error is that the ldap server can't be contacted. Although there is
only a single connection, nothing suggests it's blocked, out of memory or
any of the normal errors. As there are relatively few ldap server queries
compared to sql, pooling probably wouldn't help. I'm assuming the ldap
server/active directory is fine and that it's just dbmail that's unable to
connect.
Not sure what would have caused your client to segfault, the retries would
have paused sending anything back to the client and at most would have
dropped the connection.
Unable to see any "dm_config.c:134: config_get_value_once: Assertion"
errors in the log, can you upload the part of the log leading up to that
assertion please as I'm not sure where getting config values would be
called after startup.

----------------------------------------------------------------------
(0003761) PeterS (reporter) - 06-Oct-16 19:45
http://dbmail.org/mantis/view.php?id=1081#c3761
----------------------------------------------------------------------
Alan, thank you for continuing to assist with this. At some point I may
write a script that will allow me to sanitize my logs so that I can just
post the entire thing to you.

I am attaching even more information for you.

It looks like my dbmail is not disconnecting sessions that it makes with
our LDAP server, and apparently finally ABEND's after it attempts to re-use
a connection to LDAP that is over an hour old.

Please see the attached logs -- dbmail.20161006-1236.err
(dbmail.20161006-1236.err.xz), dbmail.20161006-1236.ldaps_tcpdump.txt
(dbmail.20161006-1236.ldaps_tcpdump.txt.xz), and
dbmail.20161006-1236.extra.txt (dbmail.20161006-1236.extra.txt.xz) .

dbmail.20161006-1236.err contains the very beginning of the log and a large
portion of the end of the log. The "dm_config.c:134:" error is there.
dbmail.20161006-1236.ldaps_tcpdump.txt is the text framing output of the
*full* LDAPS conversation with my LDAP server (I captured all traffic to
and from its IP address.) dbmail.20161006-1236.extra.txt contains all of
the "ldap_con_get"-s that exist in the full dbmail.err log. There is a
drop-off from 11:14:53 to 12:15:05 at which point DBmail ABENDs.

----------------------------------------------------------------------
(0003762) alan (reporter) - 12-Oct-16 20:13
http://dbmail.org/mantis/view.php?id=1081#c3762
----------------------------------------------------------------------
I'm happy to continue supporting dbmail, it's used here so makes sense to
share and keep it up to date.

It's normal to keep ldap connections open if the threads are still extant,
it's similar for the database.

As there are no obvious reasons why the search suddenly stops without an
error, it may be due to a long running thread hogging a resource, I'm aware
that occasionally a long folder listing/update appears to suspend some
clients.

Patch src_modules_authldap.c-20161012.diff updates some deprecated
functions and takes the timeout value from query_timeout in dbmail.conf
which from memory defaults to 300 (5 minutes).

If this fixes the issue great, otherwise updating the code to be more
resource friendly looks as though it will take much longer. As
dbmail.20161006-1236.err.xz timed out after 60 seconds it may not achieve
much, though you might try updating query_timeout to 3600 (an hour) to test
if whatever is causing the block is released. Hopefully whatever precedes
any resumption will help pinpoint the blocking.

----------------------------------------------------------------------
(0003763) alan (reporter) - 13-Oct-16 17:16
http://dbmail.org/mantis/view.php?id=1081#c3763
----------------------------------------------------------------------
patch src_server.c-20161013.diff removes some redundant event and profiling
code.

Issue History
Date Modified Username Field Change
======================================================================
10-Aug-16 23:27 PeterS New Issue
10-Aug-16 23:27 PeterS File Added: dbmail.err
11-Aug-16 19:57 PeterS Note Added: 0003740
07-Sep-16 23:37 PeterS Note Added: 0003745
08-Sep-16 19:01 alan Note Added: 0003746
15-Sep-16 17:31 PeterS Note Added: 0003749
28-Sep-16 20:24 alan Note Added: 0003751
30-Sep-16 16:58 alan Note Added: 0003752
30-Sep-16 17:01 alan File Added: dbmail_src_modules_authldap.c.diff

30-Sep-16 18:15 alan Issue Monitored: alan
30-Sep-16 18:17 alan File Added:
dbmail_src_modules_authldap.c-20160930.diff
30-Sep-16 23:37 PeterS Note Added: 0003753
30-Sep-16 23:37 PeterS File Added: dbmail.20160930-1642.err

01-Oct-16 00:18 PeterS Note Added: 0003754
01-Oct-16 00:18 PeterS File Added: dbmail.20160930-1721.err

01-Oct-16 15:29 alan File Added: src_modules_authldap.c-20161001.diff

01-Oct-16 15:36 alan Note Added: 0003755
03-Oct-16 16:54 PeterS Note Added: 0003756
03-Oct-16 16:56 PeterS File Added: dbmail.20161003-0956.err.xz

03-Oct-16 17:52 PeterS Note Added: 0003757
03-Oct-16 17:52 PeterS File Added: dbmail.20161003-1055.err.xz

03-Oct-16 19:57 PeterS Note Edited: 0003757
04-Oct-16 17:54 alan Note Added: 0003758
04-Oct-16 17:54 alan File Added: src_modules_authldap.c-20161004.diff

04-Oct-16 20:47 PeterS Note Added: 0003759
04-Oct-16 20:48 PeterS File Added: dbmail.20161004-1348.err.xz

04-Oct-16 21:47 PeterS Note Edited: 0003759
06-Oct-16 15:57 alan Note Added: 0003760
06-Oct-16 15:57 alan File Added: src_modules_authldap.c-20161006.diff

06-Oct-16 19:45 PeterS Note Added: 0003761
06-Oct-16 19:45 PeterS File Added: dbmail.20161006-1236.err.xz

06-Oct-16 19:46 PeterS File Added:
dbmail.20161006-1236.ldaps_tcpdump.txt.xz
06-Oct-16 19:46 PeterS File Added: dbmail.20161006-1236.extra.txt.xz

12-Oct-16 19:20 alan File Added: src_modules_authldap.c-20161012.diff

12-Oct-16 20:13 alan Note Added: 0003762
13-Oct-16 17:14 alan File Added: src_server.c-20161013.diff

13-Oct-16 17:16 alan Note Added: 0003763
======================================================================

_______________________________________________
Dbmail-dev mailing list
Dbmail-dev@dbmail.org
http://mailman.fastxs.nl/cgi-bin/mailman/listinfo/dbmail-dev
[DBMail 0001081]: DBmail ABEND'ing upon LDAP access error. [ In reply to ]
A NOTE has been added to this issue.
======================================================================
http://dbmail.org/mantis/view.php?id=1081
======================================================================
Reported By: PeterS
Assigned To:
======================================================================
Project: DBMail
Issue ID: 1081
Category: Authentication layer
Reproducibility: always
Severity: major
Priority: normal
Status: new
target:
======================================================================
Date Submitted: 10-Aug-16 23:27 CEST
Last Modified: 14-Oct-16 19:01 CEST
======================================================================
Summary: DBmail ABEND'ing upon LDAP access error.
Description:
I've been trying to find a version of DBmail 3.X that does not ABEND while
it is accessing LDAP. I have not been able to determine the sequence of
events that causes the ABEND, however it happens to me many many times a
day. I've written a script around the execution of DBmail to ensure that
it restarts when it ABENDs.
======================================================================

----------------------------------------------------------------------
(0003740) PeterS (reporter) - 11-Aug-16 19:57
http://dbmail.org/mantis/view.php?id=1081#c3740
----------------------------------------------------------------------
Persuant to the "Bug posting etiquette", yesterday I pulled a clean,
newest, version of DBmail from GIT, compiled it, installed it, cleared the
dbmail.err log, and started it. Since then I have received 14 ABENDs.

[compiler@swlx143 dbmail-git]$ date ; git status -u no
Thu Aug 11 13:02:11 CDT 2016
# On branch master
nothing to commit (working directory clean)
[compiler@swlx143 dbmail-git]$

----------------------------------------------------------------------
(0003745) PeterS (reporter) - 07-Sep-16 23:37
http://dbmail.org/mantis/view.php?id=1081#c3745
----------------------------------------------------------------------
Today I migrated my database over to new hardware and am receiving the same
error.

dbmail-imapd: sasl.c:75: ldap_sasl_bind: Assertion `(
(ld)->ldc->ldc_options.ldo_valid == 0x2 )' failed.

Red Hat RHEL7
dbmail-auth-ldap-3.2.3-1.el7.x86_64
dbmail-3.2.3-1.el7.x86_64
postgresql-9.2.15-1.el7_2.x86_64
postgresql-server-9.2.15-1.el7_2.x86_64

----------------------------------------------------------------------
(0003746) alan (reporter) - 08-Sep-16 19:01
http://dbmail.org/mantis/view.php?id=1081#c3746
----------------------------------------------------------------------
This error appears to be ldap connection related. Error reports elsewhere
suggest either something in /etc/openldap/ldap.conf or library
compatibility following updates to openssl. Are there any certificates
missing?
Are you able to compile dbmail locally, this may address library issues.

----------------------------------------------------------------------
(0003749) PeterS (reporter) - 15-Sep-16 17:31
http://dbmail.org/mantis/view.php?id=1081#c3749
----------------------------------------------------------------------
Alan, I have already tried the latest build(s) of DBmail to no avail. I've
disabled all certificate handling in OPENLDAP using both:

TLS_REQCERT never

and also:

TLS_REQCERT allow

with no change in results. We have multiple Microsoft AD LDAP servers
here, with varying certificates that aren't really verifiable, so I have
always disabled tracking or requiring valid certificates. We have other
Linux systems using LDAP/OPENLDAP without any issues using this same
config.

----------------------------------------------------------------------
(0003751) alan (reporter) - 28-Sep-16 20:24
http://dbmail.org/mantis/view.php?id=1081#c3751
----------------------------------------------------------------------
Confirm I'm able to reproduce the issue, think I've tracked down the parts
of the code causing it. Will test asap.

----------------------------------------------------------------------
(0003752) alan (reporter) - 30-Sep-16 16:58
http://dbmail.org/mantis/view.php?id=1081#c3752
----------------------------------------------------------------------
This issues appears to be fixed using the attached patch, please test.

----------------------------------------------------------------------
(0003753) PeterS (reporter) - 30-Sep-16 23:37
http://dbmail.org/mantis/view.php?id=1081#c3753
----------------------------------------------------------------------
I've patched the Red Hat RHEL 7 (EPEL 7) version of DBmail
(dbmail-3.2.3-1.el7.x86_64, dbmail-auth-ldap-3.2.3-1.el7.x86_64, and
dbmail-debuginfo-3.2.3-1.el7.x86_64) with
dbmail_src_modules_authldap.c-20160930.diff and receive the attached error,
dbmail.20160930-1642.err . The service still ABEND's.

----------------------------------------------------------------------
(0003754) PeterS (reporter) - 01-Oct-16 00:18
http://dbmail.org/mantis/view.php?id=1081#c3754
----------------------------------------------------------------------
I've patched a GIT clone (git clone git://git.dbmail.eu/paul/dbmail) from
today with dbmail_src_modules_authldap.c-20160930.diff and receive the
attached error, dbmail.20160930-1721.err . The service still ABEND's.

----------------------------------------------------------------------
(0003755) alan (reporter) - 01-Oct-16 15:36
http://dbmail.org/mantis/view.php?id=1081#c3755
----------------------------------------------------------------------
Ok, latest patch src_modules_authldap.c-20161001.diff increases the time to
check for server gone away to 2 minutes.

There is an edge case where after binding, the connection instantly goes
away. Also the daemon exits if the ldap server doesn't reappear after two
minutes, similar to not starting if not available.

There should be warnings such as the following, then recovers when the
server returns.
Oct 01 14:06:38 lully.p-o.co.uk dbmail-imapd[96422]: [0x805c0a800]
Warning:[auth] ldap_con_get(+142): LDAP gone away: Can't contact LDAP
server. Trying to reconnect(1/5).
Oct 01 14:06:39 lully.p-o.co.uk dbmail-imapd[96422]: [0x805c0a800]
Warning:[auth] ldap_con_get(+142): LDAP gone away: Can't contact LDAP
server. Trying to reconnect(2/5).
Oct 01 14:06:49 lully.p-o.co.uk dbmail-imapd[96422]: [0x805c0a800]
Warning:[auth] ldap_con_get(+142): LDAP gone away: Can't contact LDAP
server. Trying to reconnect(119/5).

Does this get closer to addressing your issue?

----------------------------------------------------------------------
(0003756) PeterS (reporter) - 03-Oct-16 16:54
http://dbmail.org/mantis/view.php?id=1081#c3756
----------------------------------------------------------------------
I've patched the Red Hat RHEL 7 (EPEL 7) version of DBmail
(dbmail-3.2.3-1.el7.x86_64, dbmail-auth-ldap-3.2.3-1.el7.x86_64, and
dbmail-debuginfo-3.2.3-1.el7.x86_64) with
src_modules_authldap.c-20161001.diff and receive the attached error,
dbmail.20161003-0956.err . This time I've cut the last parts of the full
dbmail.err file which consists of the last threads 0x7f86e84e8000 and
0x7f86e81d1400 . The service still ABEND's.

I'll now try with today's GIT version and the patch.

----------------------------------------------------------------------
(0003757) PeterS (reporter) - 03-Oct-16 19:57
http://dbmail.org/mantis/view.php?id=1081#c3757
----------------------------------------------------------------------
I've patched a GIT clone (git clone git://git.dbmail.eu/paul/dbmail) from
today with src_modules_authldap.c-20161001.diff and receive the attached
error, dbmail.20161003-1055.err (dbmail.20161003-1055.err.xz) . This time
I've cut the last parts of the full dbmail.err file which consists of the
last threads 0x1466450 and 0xde7400. The service still ABEND's.



----------------------------------------------------------------------
(0003758) alan (reporter) - 04-Oct-16 17:54
http://dbmail.org/mantis/view.php?id=1081#c3758
----------------------------------------------------------------------
Patch src_modules_authldap.c-20161004.diff adds slightly more debugging
information, plus binding and search retries.

The error appears to be transient and localised in authldap_search, this
patch should address the issue.

fyi there is a copy at https://github.com/alan-hicks/dbmail

----------------------------------------------------------------------
(0003759) PeterS (reporter) - 04-Oct-16 21:47
http://dbmail.org/mantis/view.php?id=1081#c3759
----------------------------------------------------------------------
I pulled down a clone of your https://github.com/alan-hicks/dbmail and had
it running. This version does not ABEND but rather stops communicating
(which caused my client to Segfault in the middle of doing IMAPS things).

Please see the attached dbmail.20161004-1348.err
(dbmail.20161004-1348.err.xz) for the full error listing.

Correction: it did ABEND when I restarted my IMAPS client and attempted to
connect to the DBmail instance. The same "dbmail-imapd: dm_config.c:134:
config_get_value_once: Assertion `config_dict' failed." error as before is
the last thing logged.



----------------------------------------------------------------------
(0003760) alan (reporter) - 06-Oct-16 15:57
http://dbmail.org/mantis/view.php?id=1081#c3760
----------------------------------------------------------------------
I'm struggling to find what causes dbmail to be unable to contact the
server.
Patch src_modules_authldap.c-20161006.diff bumps the search timeout to 60
seconds.
The error is that the ldap server can't be contacted. Although there is
only a single connection, nothing suggests it's blocked, out of memory or
any of the normal errors. As there are relatively few ldap server queries
compared to sql, pooling probably wouldn't help. I'm assuming the ldap
server/active directory is fine and that it's just dbmail that's unable to
connect.
Not sure what would have caused your client to segfault, the retries would
have paused sending anything back to the client and at most would have
dropped the connection.
Unable to see any "dm_config.c:134: config_get_value_once: Assertion"
errors in the log, can you upload the part of the log leading up to that
assertion please as I'm not sure where getting config values would be
called after startup.

----------------------------------------------------------------------
(0003761) PeterS (reporter) - 06-Oct-16 19:45
http://dbmail.org/mantis/view.php?id=1081#c3761
----------------------------------------------------------------------
Alan, thank you for continuing to assist with this. At some point I may
write a script that will allow me to sanitize my logs so that I can just
post the entire thing to you.

I am attaching even more information for you.

It looks like my dbmail is not disconnecting sessions that it makes with
our LDAP server, and apparently finally ABEND's after it attempts to re-use
a connection to LDAP that is over an hour old.

Please see the attached logs -- dbmail.20161006-1236.err
(dbmail.20161006-1236.err.xz), dbmail.20161006-1236.ldaps_tcpdump.txt
(dbmail.20161006-1236.ldaps_tcpdump.txt.xz), and
dbmail.20161006-1236.extra.txt (dbmail.20161006-1236.extra.txt.xz) .

dbmail.20161006-1236.err contains the very beginning of the log and a large
portion of the end of the log. The "dm_config.c:134:" error is there.
dbmail.20161006-1236.ldaps_tcpdump.txt is the text framing output of the
*full* LDAPS conversation with my LDAP server (I captured all traffic to
and from its IP address.) dbmail.20161006-1236.extra.txt contains all of
the "ldap_con_get"-s that exist in the full dbmail.err log. There is a
drop-off from 11:14:53 to 12:15:05 at which point DBmail ABENDs.

----------------------------------------------------------------------
(0003762) alan (reporter) - 12-Oct-16 20:13
http://dbmail.org/mantis/view.php?id=1081#c3762
----------------------------------------------------------------------
I'm happy to continue supporting dbmail, it's used here so makes sense to
share and keep it up to date.

It's normal to keep ldap connections open if the threads are still extant,
it's similar for the database.

As there are no obvious reasons why the search suddenly stops without an
error, it may be due to a long running thread hogging a resource, I'm aware
that occasionally a long folder listing/update appears to suspend some
clients.

Patch src_modules_authldap.c-20161012.diff updates some deprecated
functions and takes the timeout value from query_timeout in dbmail.conf
which from memory defaults to 300 (5 minutes).

If this fixes the issue great, otherwise updating the code to be more
resource friendly looks as though it will take much longer. As
dbmail.20161006-1236.err.xz timed out after 60 seconds it may not achieve
much, though you might try updating query_timeout to 3600 (an hour) to test
if whatever is causing the block is released. Hopefully whatever precedes
any resumption will help pinpoint the blocking.

----------------------------------------------------------------------
(0003763) alan (reporter) - 13-Oct-16 17:16
http://dbmail.org/mantis/view.php?id=1081#c3763
----------------------------------------------------------------------
patch src_server.c-20161013.diff removes some redundant event and profiling
code.

----------------------------------------------------------------------
(0003764) alan (reporter) - 14-Oct-16 19:01
http://dbmail.org/mantis/view.php?id=1081#c3764
----------------------------------------------------------------------
Apologies for previous patch removing required event_assign. I tested the
wrong binary.
All appears working with the code at https://github.com/alan-hicks/dbmail
Still unable to reproduce LDAP gone away.

Issue History
Date Modified Username Field Change
======================================================================
10-Aug-16 23:27 PeterS New Issue
10-Aug-16 23:27 PeterS File Added: dbmail.err
11-Aug-16 19:57 PeterS Note Added: 0003740
07-Sep-16 23:37 PeterS Note Added: 0003745
08-Sep-16 19:01 alan Note Added: 0003746
15-Sep-16 17:31 PeterS Note Added: 0003749
28-Sep-16 20:24 alan Note Added: 0003751
30-Sep-16 16:58 alan Note Added: 0003752
30-Sep-16 17:01 alan File Added: dbmail_src_modules_authldap.c.diff

30-Sep-16 18:15 alan Issue Monitored: alan
30-Sep-16 18:17 alan File Added:
dbmail_src_modules_authldap.c-20160930.diff
30-Sep-16 23:37 PeterS Note Added: 0003753
30-Sep-16 23:37 PeterS File Added: dbmail.20160930-1642.err

01-Oct-16 00:18 PeterS Note Added: 0003754
01-Oct-16 00:18 PeterS File Added: dbmail.20160930-1721.err

01-Oct-16 15:29 alan File Added: src_modules_authldap.c-20161001.diff

01-Oct-16 15:36 alan Note Added: 0003755
03-Oct-16 16:54 PeterS Note Added: 0003756
03-Oct-16 16:56 PeterS File Added: dbmail.20161003-0956.err.xz

03-Oct-16 17:52 PeterS Note Added: 0003757
03-Oct-16 17:52 PeterS File Added: dbmail.20161003-1055.err.xz

03-Oct-16 19:57 PeterS Note Edited: 0003757
04-Oct-16 17:54 alan Note Added: 0003758
04-Oct-16 17:54 alan File Added: src_modules_authldap.c-20161004.diff

04-Oct-16 20:47 PeterS Note Added: 0003759
04-Oct-16 20:48 PeterS File Added: dbmail.20161004-1348.err.xz

04-Oct-16 21:47 PeterS Note Edited: 0003759
06-Oct-16 15:57 alan Note Added: 0003760
06-Oct-16 15:57 alan File Added: src_modules_authldap.c-20161006.diff

06-Oct-16 19:45 PeterS Note Added: 0003761
06-Oct-16 19:45 PeterS File Added: dbmail.20161006-1236.err.xz

06-Oct-16 19:46 PeterS File Added:
dbmail.20161006-1236.ldaps_tcpdump.txt.xz
06-Oct-16 19:46 PeterS File Added: dbmail.20161006-1236.extra.txt.xz

12-Oct-16 19:20 alan File Added: src_modules_authldap.c-20161012.diff

12-Oct-16 20:13 alan Note Added: 0003762
13-Oct-16 17:14 alan File Added: src_server.c-20161013.diff

13-Oct-16 17:16 alan Note Added: 0003763
14-Oct-16 19:01 alan Note Added: 0003764
======================================================================

_______________________________________________
Dbmail-dev mailing list
Dbmail-dev@dbmail.org
http://mailman.fastxs.nl/cgi-bin/mailman/listinfo/dbmail-dev
[DBMail 0001081]: DBmail ABEND'ing upon LDAP access error. [ In reply to ]
A NOTE has been added to this issue.
======================================================================
http://dbmail.org/mantis/view.php?id=1081
======================================================================
Reported By: PeterS
Assigned To:
======================================================================
Project: DBMail
Issue ID: 1081
Category: Authentication layer
Reproducibility: always
Severity: major
Priority: normal
Status: new
target:
======================================================================
Date Submitted: 10-Aug-16 23:27 CEST
Last Modified: 14-Oct-16 19:33 CEST
======================================================================
Summary: DBmail ABEND'ing upon LDAP access error.
Description:
I've been trying to find a version of DBmail 3.X that does not ABEND while
it is accessing LDAP. I have not been able to determine the sequence of
events that causes the ABEND, however it happens to me many many times a
day. I've written a script around the execution of DBmail to ensure that
it restarts when it ABENDs.
======================================================================

----------------------------------------------------------------------
(0003740) PeterS (reporter) - 11-Aug-16 19:57
http://dbmail.org/mantis/view.php?id=1081#c3740
----------------------------------------------------------------------
Persuant to the "Bug posting etiquette", yesterday I pulled a clean,
newest, version of DBmail from GIT, compiled it, installed it, cleared the
dbmail.err log, and started it. Since then I have received 14 ABENDs.

[compiler@swlx143 dbmail-git]$ date ; git status -u no
Thu Aug 11 13:02:11 CDT 2016
# On branch master
nothing to commit (working directory clean)
[compiler@swlx143 dbmail-git]$

----------------------------------------------------------------------
(0003745) PeterS (reporter) - 07-Sep-16 23:37
http://dbmail.org/mantis/view.php?id=1081#c3745
----------------------------------------------------------------------
Today I migrated my database over to new hardware and am receiving the same
error.

dbmail-imapd: sasl.c:75: ldap_sasl_bind: Assertion `(
(ld)->ldc->ldc_options.ldo_valid == 0x2 )' failed.

Red Hat RHEL7
dbmail-auth-ldap-3.2.3-1.el7.x86_64
dbmail-3.2.3-1.el7.x86_64
postgresql-9.2.15-1.el7_2.x86_64
postgresql-server-9.2.15-1.el7_2.x86_64

----------------------------------------------------------------------
(0003746) alan (reporter) - 08-Sep-16 19:01
http://dbmail.org/mantis/view.php?id=1081#c3746
----------------------------------------------------------------------
This error appears to be ldap connection related. Error reports elsewhere
suggest either something in /etc/openldap/ldap.conf or library
compatibility following updates to openssl. Are there any certificates
missing?
Are you able to compile dbmail locally, this may address library issues.

----------------------------------------------------------------------
(0003749) PeterS (reporter) - 15-Sep-16 17:31
http://dbmail.org/mantis/view.php?id=1081#c3749
----------------------------------------------------------------------
Alan, I have already tried the latest build(s) of DBmail to no avail. I've
disabled all certificate handling in OPENLDAP using both:

TLS_REQCERT never

and also:

TLS_REQCERT allow

with no change in results. We have multiple Microsoft AD LDAP servers
here, with varying certificates that aren't really verifiable, so I have
always disabled tracking or requiring valid certificates. We have other
Linux systems using LDAP/OPENLDAP without any issues using this same
config.

----------------------------------------------------------------------
(0003751) alan (reporter) - 28-Sep-16 20:24
http://dbmail.org/mantis/view.php?id=1081#c3751
----------------------------------------------------------------------
Confirm I'm able to reproduce the issue, think I've tracked down the parts
of the code causing it. Will test asap.

----------------------------------------------------------------------
(0003752) alan (reporter) - 30-Sep-16 16:58
http://dbmail.org/mantis/view.php?id=1081#c3752
----------------------------------------------------------------------
This issues appears to be fixed using the attached patch, please test.

----------------------------------------------------------------------
(0003753) PeterS (reporter) - 30-Sep-16 23:37
http://dbmail.org/mantis/view.php?id=1081#c3753
----------------------------------------------------------------------
I've patched the Red Hat RHEL 7 (EPEL 7) version of DBmail
(dbmail-3.2.3-1.el7.x86_64, dbmail-auth-ldap-3.2.3-1.el7.x86_64, and
dbmail-debuginfo-3.2.3-1.el7.x86_64) with
dbmail_src_modules_authldap.c-20160930.diff and receive the attached error,
dbmail.20160930-1642.err . The service still ABEND's.

----------------------------------------------------------------------
(0003754) PeterS (reporter) - 01-Oct-16 00:18
http://dbmail.org/mantis/view.php?id=1081#c3754
----------------------------------------------------------------------
I've patched a GIT clone (git clone git://git.dbmail.eu/paul/dbmail) from
today with dbmail_src_modules_authldap.c-20160930.diff and receive the
attached error, dbmail.20160930-1721.err . The service still ABEND's.

----------------------------------------------------------------------
(0003755) alan (reporter) - 01-Oct-16 15:36
http://dbmail.org/mantis/view.php?id=1081#c3755
----------------------------------------------------------------------
Ok, latest patch src_modules_authldap.c-20161001.diff increases the time to
check for server gone away to 2 minutes.

There is an edge case where after binding, the connection instantly goes
away. Also the daemon exits if the ldap server doesn't reappear after two
minutes, similar to not starting if not available.

There should be warnings such as the following, then recovers when the
server returns.
Oct 01 14:06:38 lully.p-o.co.uk dbmail-imapd[96422]: [0x805c0a800]
Warning:[auth] ldap_con_get(+142): LDAP gone away: Can't contact LDAP
server. Trying to reconnect(1/5).
Oct 01 14:06:39 lully.p-o.co.uk dbmail-imapd[96422]: [0x805c0a800]
Warning:[auth] ldap_con_get(+142): LDAP gone away: Can't contact LDAP
server. Trying to reconnect(2/5).
Oct 01 14:06:49 lully.p-o.co.uk dbmail-imapd[96422]: [0x805c0a800]
Warning:[auth] ldap_con_get(+142): LDAP gone away: Can't contact LDAP
server. Trying to reconnect(119/5).

Does this get closer to addressing your issue?

----------------------------------------------------------------------
(0003756) PeterS (reporter) - 03-Oct-16 16:54
http://dbmail.org/mantis/view.php?id=1081#c3756
----------------------------------------------------------------------
I've patched the Red Hat RHEL 7 (EPEL 7) version of DBmail
(dbmail-3.2.3-1.el7.x86_64, dbmail-auth-ldap-3.2.3-1.el7.x86_64, and
dbmail-debuginfo-3.2.3-1.el7.x86_64) with
src_modules_authldap.c-20161001.diff and receive the attached error,
dbmail.20161003-0956.err . This time I've cut the last parts of the full
dbmail.err file which consists of the last threads 0x7f86e84e8000 and
0x7f86e81d1400 . The service still ABEND's.

I'll now try with today's GIT version and the patch.

----------------------------------------------------------------------
(0003757) PeterS (reporter) - 03-Oct-16 19:57
http://dbmail.org/mantis/view.php?id=1081#c3757
----------------------------------------------------------------------
I've patched a GIT clone (git clone git://git.dbmail.eu/paul/dbmail) from
today with src_modules_authldap.c-20161001.diff and receive the attached
error, dbmail.20161003-1055.err (dbmail.20161003-1055.err.xz) . This time
I've cut the last parts of the full dbmail.err file which consists of the
last threads 0x1466450 and 0xde7400. The service still ABEND's.



----------------------------------------------------------------------
(0003758) alan (reporter) - 04-Oct-16 17:54
http://dbmail.org/mantis/view.php?id=1081#c3758
----------------------------------------------------------------------
Patch src_modules_authldap.c-20161004.diff adds slightly more debugging
information, plus binding and search retries.

The error appears to be transient and localised in authldap_search, this
patch should address the issue.

fyi there is a copy at https://github.com/alan-hicks/dbmail

----------------------------------------------------------------------
(0003759) PeterS (reporter) - 04-Oct-16 21:47
http://dbmail.org/mantis/view.php?id=1081#c3759
----------------------------------------------------------------------
I pulled down a clone of your https://github.com/alan-hicks/dbmail and had
it running. This version does not ABEND but rather stops communicating
(which caused my client to Segfault in the middle of doing IMAPS things).

Please see the attached dbmail.20161004-1348.err
(dbmail.20161004-1348.err.xz) for the full error listing.

Correction: it did ABEND when I restarted my IMAPS client and attempted to
connect to the DBmail instance. The same "dbmail-imapd: dm_config.c:134:
config_get_value_once: Assertion `config_dict' failed." error as before is
the last thing logged.



----------------------------------------------------------------------
(0003760) alan (reporter) - 06-Oct-16 15:57
http://dbmail.org/mantis/view.php?id=1081#c3760
----------------------------------------------------------------------
I'm struggling to find what causes dbmail to be unable to contact the
server.
Patch src_modules_authldap.c-20161006.diff bumps the search timeout to 60
seconds.
The error is that the ldap server can't be contacted. Although there is
only a single connection, nothing suggests it's blocked, out of memory or
any of the normal errors. As there are relatively few ldap server queries
compared to sql, pooling probably wouldn't help. I'm assuming the ldap
server/active directory is fine and that it's just dbmail that's unable to
connect.
Not sure what would have caused your client to segfault, the retries would
have paused sending anything back to the client and at most would have
dropped the connection.
Unable to see any "dm_config.c:134: config_get_value_once: Assertion"
errors in the log, can you upload the part of the log leading up to that
assertion please as I'm not sure where getting config values would be
called after startup.

----------------------------------------------------------------------
(0003761) PeterS (reporter) - 06-Oct-16 19:45
http://dbmail.org/mantis/view.php?id=1081#c3761
----------------------------------------------------------------------
Alan, thank you for continuing to assist with this. At some point I may
write a script that will allow me to sanitize my logs so that I can just
post the entire thing to you.

I am attaching even more information for you.

It looks like my dbmail is not disconnecting sessions that it makes with
our LDAP server, and apparently finally ABEND's after it attempts to re-use
a connection to LDAP that is over an hour old.

Please see the attached logs -- dbmail.20161006-1236.err
(dbmail.20161006-1236.err.xz), dbmail.20161006-1236.ldaps_tcpdump.txt
(dbmail.20161006-1236.ldaps_tcpdump.txt.xz), and
dbmail.20161006-1236.extra.txt (dbmail.20161006-1236.extra.txt.xz) .

dbmail.20161006-1236.err contains the very beginning of the log and a large
portion of the end of the log. The "dm_config.c:134:" error is there.
dbmail.20161006-1236.ldaps_tcpdump.txt is the text framing output of the
*full* LDAPS conversation with my LDAP server (I captured all traffic to
and from its IP address.) dbmail.20161006-1236.extra.txt contains all of
the "ldap_con_get"-s that exist in the full dbmail.err log. There is a
drop-off from 11:14:53 to 12:15:05 at which point DBmail ABENDs.

----------------------------------------------------------------------
(0003762) alan (reporter) - 12-Oct-16 20:13
http://dbmail.org/mantis/view.php?id=1081#c3762
----------------------------------------------------------------------
I'm happy to continue supporting dbmail, it's used here so makes sense to
share and keep it up to date.

It's normal to keep ldap connections open if the threads are still extant,
it's similar for the database.

As there are no obvious reasons why the search suddenly stops without an
error, it may be due to a long running thread hogging a resource, I'm aware
that occasionally a long folder listing/update appears to suspend some
clients.

Patch src_modules_authldap.c-20161012.diff updates some deprecated
functions and takes the timeout value from query_timeout in dbmail.conf
which from memory defaults to 300 (5 minutes).

If this fixes the issue great, otherwise updating the code to be more
resource friendly looks as though it will take much longer. As
dbmail.20161006-1236.err.xz timed out after 60 seconds it may not achieve
much, though you might try updating query_timeout to 3600 (an hour) to test
if whatever is causing the block is released. Hopefully whatever precedes
any resumption will help pinpoint the blocking.

----------------------------------------------------------------------
(0003763) alan (reporter) - 13-Oct-16 17:16
http://dbmail.org/mantis/view.php?id=1081#c3763
----------------------------------------------------------------------
patch src_server.c-20161013.diff removes some redundant event and profiling
code.

----------------------------------------------------------------------
(0003764) alan (reporter) - 14-Oct-16 19:01
http://dbmail.org/mantis/view.php?id=1081#c3764
----------------------------------------------------------------------
Apologies for previous patch removing required event_assign. I tested the
wrong binary.
All appears working with the code at https://github.com/alan-hicks/dbmail
Still unable to reproduce LDAP gone away.

----------------------------------------------------------------------
(0003765) PeterS (reporter) - 14-Oct-16 19:33
http://dbmail.org/mantis/view.php?id=1081#c3765
----------------------------------------------------------------------
Ok; yes, I did experience some issues with that latest patch, so I will
give this newer one a try. Thanks.

Issue History
Date Modified Username Field Change
======================================================================
10-Aug-16 23:27 PeterS New Issue
10-Aug-16 23:27 PeterS File Added: dbmail.err
11-Aug-16 19:57 PeterS Note Added: 0003740
07-Sep-16 23:37 PeterS Note Added: 0003745
08-Sep-16 19:01 alan Note Added: 0003746
15-Sep-16 17:31 PeterS Note Added: 0003749
28-Sep-16 20:24 alan Note Added: 0003751
30-Sep-16 16:58 alan Note Added: 0003752
30-Sep-16 17:01 alan File Added: dbmail_src_modules_authldap.c.diff

30-Sep-16 18:15 alan Issue Monitored: alan
30-Sep-16 18:17 alan File Added:
dbmail_src_modules_authldap.c-20160930.diff
30-Sep-16 23:37 PeterS Note Added: 0003753
30-Sep-16 23:37 PeterS File Added: dbmail.20160930-1642.err

01-Oct-16 00:18 PeterS Note Added: 0003754
01-Oct-16 00:18 PeterS File Added: dbmail.20160930-1721.err

01-Oct-16 15:29 alan File Added: src_modules_authldap.c-20161001.diff

01-Oct-16 15:36 alan Note Added: 0003755
03-Oct-16 16:54 PeterS Note Added: 0003756
03-Oct-16 16:56 PeterS File Added: dbmail.20161003-0956.err.xz

03-Oct-16 17:52 PeterS Note Added: 0003757
03-Oct-16 17:52 PeterS File Added: dbmail.20161003-1055.err.xz

03-Oct-16 19:57 PeterS Note Edited: 0003757
04-Oct-16 17:54 alan Note Added: 0003758
04-Oct-16 17:54 alan File Added: src_modules_authldap.c-20161004.diff

04-Oct-16 20:47 PeterS Note Added: 0003759
04-Oct-16 20:48 PeterS File Added: dbmail.20161004-1348.err.xz

04-Oct-16 21:47 PeterS Note Edited: 0003759
06-Oct-16 15:57 alan Note Added: 0003760
06-Oct-16 15:57 alan File Added: src_modules_authldap.c-20161006.diff

06-Oct-16 19:45 PeterS Note Added: 0003761
06-Oct-16 19:45 PeterS File Added: dbmail.20161006-1236.err.xz

06-Oct-16 19:46 PeterS File Added:
dbmail.20161006-1236.ldaps_tcpdump.txt.xz
06-Oct-16 19:46 PeterS File Added: dbmail.20161006-1236.extra.txt.xz

12-Oct-16 19:20 alan File Added: src_modules_authldap.c-20161012.diff

12-Oct-16 20:13 alan Note Added: 0003762
13-Oct-16 17:14 alan File Added: src_server.c-20161013.diff

13-Oct-16 17:16 alan Note Added: 0003763
14-Oct-16 19:01 alan Note Added: 0003764
14-Oct-16 19:33 PeterS Note Added: 0003765
======================================================================

_______________________________________________
Dbmail-dev mailing list
Dbmail-dev@dbmail.org
http://mailman.fastxs.nl/cgi-bin/mailman/listinfo/dbmail-dev
[DBMail 0001081]: DBmail ABEND'ing upon LDAP access error. [ In reply to ]
A NOTE has been added to this issue.
======================================================================
http://dbmail.org/mantis/view.php?id=1081
======================================================================
Reported By: PeterS
Assigned To:
======================================================================
Project: DBMail
Issue ID: 1081
Category: Authentication layer
Reproducibility: always
Severity: major
Priority: normal
Status: new
target:
======================================================================
Date Submitted: 10-Aug-16 23:27 CEST
Last Modified: 14-Oct-16 20:50 CEST
======================================================================
Summary: DBmail ABEND'ing upon LDAP access error.
Description:
I've been trying to find a version of DBmail 3.X that does not ABEND while
it is accessing LDAP. I have not been able to determine the sequence of
events that causes the ABEND, however it happens to me many many times a
day. I've written a script around the execution of DBmail to ensure that
it restarts when it ABENDs.
======================================================================

----------------------------------------------------------------------
(0003740) PeterS (reporter) - 11-Aug-16 19:57
http://dbmail.org/mantis/view.php?id=1081#c3740
----------------------------------------------------------------------
Persuant to the "Bug posting etiquette", yesterday I pulled a clean,
newest, version of DBmail from GIT, compiled it, installed it, cleared the
dbmail.err log, and started it. Since then I have received 14 ABENDs.

[compiler@swlx143 dbmail-git]$ date ; git status -u no
Thu Aug 11 13:02:11 CDT 2016
# On branch master
nothing to commit (working directory clean)
[compiler@swlx143 dbmail-git]$

----------------------------------------------------------------------
(0003745) PeterS (reporter) - 07-Sep-16 23:37
http://dbmail.org/mantis/view.php?id=1081#c3745
----------------------------------------------------------------------
Today I migrated my database over to new hardware and am receiving the same
error.

dbmail-imapd: sasl.c:75: ldap_sasl_bind: Assertion `(
(ld)->ldc->ldc_options.ldo_valid == 0x2 )' failed.

Red Hat RHEL7
dbmail-auth-ldap-3.2.3-1.el7.x86_64
dbmail-3.2.3-1.el7.x86_64
postgresql-9.2.15-1.el7_2.x86_64
postgresql-server-9.2.15-1.el7_2.x86_64

----------------------------------------------------------------------
(0003746) alan (reporter) - 08-Sep-16 19:01
http://dbmail.org/mantis/view.php?id=1081#c3746
----------------------------------------------------------------------
This error appears to be ldap connection related. Error reports elsewhere
suggest either something in /etc/openldap/ldap.conf or library
compatibility following updates to openssl. Are there any certificates
missing?
Are you able to compile dbmail locally, this may address library issues.

----------------------------------------------------------------------
(0003749) PeterS (reporter) - 15-Sep-16 17:31
http://dbmail.org/mantis/view.php?id=1081#c3749
----------------------------------------------------------------------
Alan, I have already tried the latest build(s) of DBmail to no avail. I've
disabled all certificate handling in OPENLDAP using both:

TLS_REQCERT never

and also:

TLS_REQCERT allow

with no change in results. We have multiple Microsoft AD LDAP servers
here, with varying certificates that aren't really verifiable, so I have
always disabled tracking or requiring valid certificates. We have other
Linux systems using LDAP/OPENLDAP without any issues using this same
config.

----------------------------------------------------------------------
(0003751) alan (reporter) - 28-Sep-16 20:24
http://dbmail.org/mantis/view.php?id=1081#c3751
----------------------------------------------------------------------
Confirm I'm able to reproduce the issue, think I've tracked down the parts
of the code causing it. Will test asap.

----------------------------------------------------------------------
(0003752) alan (reporter) - 30-Sep-16 16:58
http://dbmail.org/mantis/view.php?id=1081#c3752
----------------------------------------------------------------------
This issues appears to be fixed using the attached patch, please test.

----------------------------------------------------------------------
(0003753) PeterS (reporter) - 30-Sep-16 23:37
http://dbmail.org/mantis/view.php?id=1081#c3753
----------------------------------------------------------------------
I've patched the Red Hat RHEL 7 (EPEL 7) version of DBmail
(dbmail-3.2.3-1.el7.x86_64, dbmail-auth-ldap-3.2.3-1.el7.x86_64, and
dbmail-debuginfo-3.2.3-1.el7.x86_64) with
dbmail_src_modules_authldap.c-20160930.diff and receive the attached error,
dbmail.20160930-1642.err . The service still ABEND's.

----------------------------------------------------------------------
(0003754) PeterS (reporter) - 01-Oct-16 00:18
http://dbmail.org/mantis/view.php?id=1081#c3754
----------------------------------------------------------------------
I've patched a GIT clone (git clone git://git.dbmail.eu/paul/dbmail) from
today with dbmail_src_modules_authldap.c-20160930.diff and receive the
attached error, dbmail.20160930-1721.err . The service still ABEND's.

----------------------------------------------------------------------
(0003755) alan (reporter) - 01-Oct-16 15:36
http://dbmail.org/mantis/view.php?id=1081#c3755
----------------------------------------------------------------------
Ok, latest patch src_modules_authldap.c-20161001.diff increases the time to
check for server gone away to 2 minutes.

There is an edge case where after binding, the connection instantly goes
away. Also the daemon exits if the ldap server doesn't reappear after two
minutes, similar to not starting if not available.

There should be warnings such as the following, then recovers when the
server returns.
Oct 01 14:06:38 lully.p-o.co.uk dbmail-imapd[96422]: [0x805c0a800]
Warning:[auth] ldap_con_get(+142): LDAP gone away: Can't contact LDAP
server. Trying to reconnect(1/5).
Oct 01 14:06:39 lully.p-o.co.uk dbmail-imapd[96422]: [0x805c0a800]
Warning:[auth] ldap_con_get(+142): LDAP gone away: Can't contact LDAP
server. Trying to reconnect(2/5).
Oct 01 14:06:49 lully.p-o.co.uk dbmail-imapd[96422]: [0x805c0a800]
Warning:[auth] ldap_con_get(+142): LDAP gone away: Can't contact LDAP
server. Trying to reconnect(119/5).

Does this get closer to addressing your issue?

----------------------------------------------------------------------
(0003756) PeterS (reporter) - 03-Oct-16 16:54
http://dbmail.org/mantis/view.php?id=1081#c3756
----------------------------------------------------------------------
I've patched the Red Hat RHEL 7 (EPEL 7) version of DBmail
(dbmail-3.2.3-1.el7.x86_64, dbmail-auth-ldap-3.2.3-1.el7.x86_64, and
dbmail-debuginfo-3.2.3-1.el7.x86_64) with
src_modules_authldap.c-20161001.diff and receive the attached error,
dbmail.20161003-0956.err . This time I've cut the last parts of the full
dbmail.err file which consists of the last threads 0x7f86e84e8000 and
0x7f86e81d1400 . The service still ABEND's.

I'll now try with today's GIT version and the patch.

----------------------------------------------------------------------
(0003757) PeterS (reporter) - 03-Oct-16 19:57
http://dbmail.org/mantis/view.php?id=1081#c3757
----------------------------------------------------------------------
I've patched a GIT clone (git clone git://git.dbmail.eu/paul/dbmail) from
today with src_modules_authldap.c-20161001.diff and receive the attached
error, dbmail.20161003-1055.err (dbmail.20161003-1055.err.xz) . This time
I've cut the last parts of the full dbmail.err file which consists of the
last threads 0x1466450 and 0xde7400. The service still ABEND's.



----------------------------------------------------------------------
(0003758) alan (reporter) - 04-Oct-16 17:54
http://dbmail.org/mantis/view.php?id=1081#c3758
----------------------------------------------------------------------
Patch src_modules_authldap.c-20161004.diff adds slightly more debugging
information, plus binding and search retries.

The error appears to be transient and localised in authldap_search, this
patch should address the issue.

fyi there is a copy at https://github.com/alan-hicks/dbmail

----------------------------------------------------------------------
(0003759) PeterS (reporter) - 04-Oct-16 21:47
http://dbmail.org/mantis/view.php?id=1081#c3759
----------------------------------------------------------------------
I pulled down a clone of your https://github.com/alan-hicks/dbmail and had
it running. This version does not ABEND but rather stops communicating
(which caused my client to Segfault in the middle of doing IMAPS things).

Please see the attached dbmail.20161004-1348.err
(dbmail.20161004-1348.err.xz) for the full error listing.

Correction: it did ABEND when I restarted my IMAPS client and attempted to
connect to the DBmail instance. The same "dbmail-imapd: dm_config.c:134:
config_get_value_once: Assertion `config_dict' failed." error as before is
the last thing logged.



----------------------------------------------------------------------
(0003760) alan (reporter) - 06-Oct-16 15:57
http://dbmail.org/mantis/view.php?id=1081#c3760
----------------------------------------------------------------------
I'm struggling to find what causes dbmail to be unable to contact the
server.
Patch src_modules_authldap.c-20161006.diff bumps the search timeout to 60
seconds.
The error is that the ldap server can't be contacted. Although there is
only a single connection, nothing suggests it's blocked, out of memory or
any of the normal errors. As there are relatively few ldap server queries
compared to sql, pooling probably wouldn't help. I'm assuming the ldap
server/active directory is fine and that it's just dbmail that's unable to
connect.
Not sure what would have caused your client to segfault, the retries would
have paused sending anything back to the client and at most would have
dropped the connection.
Unable to see any "dm_config.c:134: config_get_value_once: Assertion"
errors in the log, can you upload the part of the log leading up to that
assertion please as I'm not sure where getting config values would be
called after startup.

----------------------------------------------------------------------
(0003761) PeterS (reporter) - 06-Oct-16 19:45
http://dbmail.org/mantis/view.php?id=1081#c3761
----------------------------------------------------------------------
Alan, thank you for continuing to assist with this. At some point I may
write a script that will allow me to sanitize my logs so that I can just
post the entire thing to you.

I am attaching even more information for you.

It looks like my dbmail is not disconnecting sessions that it makes with
our LDAP server, and apparently finally ABEND's after it attempts to re-use
a connection to LDAP that is over an hour old.

Please see the attached logs -- dbmail.20161006-1236.err
(dbmail.20161006-1236.err.xz), dbmail.20161006-1236.ldaps_tcpdump.txt
(dbmail.20161006-1236.ldaps_tcpdump.txt.xz), and
dbmail.20161006-1236.extra.txt (dbmail.20161006-1236.extra.txt.xz) .

dbmail.20161006-1236.err contains the very beginning of the log and a large
portion of the end of the log. The "dm_config.c:134:" error is there.
dbmail.20161006-1236.ldaps_tcpdump.txt is the text framing output of the
*full* LDAPS conversation with my LDAP server (I captured all traffic to
and from its IP address.) dbmail.20161006-1236.extra.txt contains all of
the "ldap_con_get"-s that exist in the full dbmail.err log. There is a
drop-off from 11:14:53 to 12:15:05 at which point DBmail ABENDs.

----------------------------------------------------------------------
(0003762) alan (reporter) - 12-Oct-16 20:13
http://dbmail.org/mantis/view.php?id=1081#c3762
----------------------------------------------------------------------
I'm happy to continue supporting dbmail, it's used here so makes sense to
share and keep it up to date.

It's normal to keep ldap connections open if the threads are still extant,
it's similar for the database.

As there are no obvious reasons why the search suddenly stops without an
error, it may be due to a long running thread hogging a resource, I'm aware
that occasionally a long folder listing/update appears to suspend some
clients.

Patch src_modules_authldap.c-20161012.diff updates some deprecated
functions and takes the timeout value from query_timeout in dbmail.conf
which from memory defaults to 300 (5 minutes).

If this fixes the issue great, otherwise updating the code to be more
resource friendly looks as though it will take much longer. As
dbmail.20161006-1236.err.xz timed out after 60 seconds it may not achieve
much, though you might try updating query_timeout to 3600 (an hour) to test
if whatever is causing the block is released. Hopefully whatever precedes
any resumption will help pinpoint the blocking.

----------------------------------------------------------------------
(0003763) alan (reporter) - 13-Oct-16 17:16
http://dbmail.org/mantis/view.php?id=1081#c3763
----------------------------------------------------------------------
patch src_server.c-20161013.diff removes some redundant event and profiling
code.

----------------------------------------------------------------------
(0003764) alan (reporter) - 14-Oct-16 19:01
http://dbmail.org/mantis/view.php?id=1081#c3764
----------------------------------------------------------------------
Apologies for previous patch removing required event_assign. I tested the
wrong binary.
All appears working with the code at https://github.com/alan-hicks/dbmail
Still unable to reproduce LDAP gone away.

----------------------------------------------------------------------
(0003765) PeterS (reporter) - 14-Oct-16 19:33
http://dbmail.org/mantis/view.php?id=1081#c3765
----------------------------------------------------------------------
Ok; yes, I did experience some issues with that latest patch, so I will
give this newer one a try. Thanks.

----------------------------------------------------------------------
(0003766) PeterS (reporter) - 14-Oct-16 20:50
http://dbmail.org/mantis/view.php?id=1081#c3766
----------------------------------------------------------------------
With your patch from today, it looks as though at some point it gives up on
the LDAPS server that is configured and while it claims to be
"reconnecting", it in fact is not sending any requests to the LDAPS
server...

It is logging entries like the following:

Oct 14 13:53:37 swlx143.swmed.e dbmail-imapd[1552]: [0x25a0630]
Warning:[auth] authldap_search(+293): LDAP gone away: Can't contact LDAP
server. Trying again(519/3600).
Oct 14 13:53:37 swlx143.swmed.e dbmail-imapd[1552]: [0x25a04a0]
Warning:[auth] authldap_search(+293): LDAP gone away: Can't contact LDAP
server. Trying again(429/3600).
Oct 14 13:53:38 swlx143.swmed.e dbmail-imapd[1552]: [0x25a0630]
Warning:[auth] authldap_search(+293): LDAP gone away: Can't contact LDAP
server. Trying again(520/3600).
Oct 14 13:53:38 swlx143.swmed.e dbmail-imapd[1552]: [0x25a04a0]
Warning:[auth] authldap_search(+293): LDAP gone away: Can't contact LDAP
server. Trying again(430/3600).

A TCPDUMP of all traffic for the destination of the LDAPS server that is
configured reveals NO TRAFFIC being generated. Am I misunderstanding what
"Trying again" means perhaps?

Issue History
Date Modified Username Field Change
======================================================================
10-Aug-16 23:27 PeterS New Issue
10-Aug-16 23:27 PeterS File Added: dbmail.err
11-Aug-16 19:57 PeterS Note Added: 0003740
07-Sep-16 23:37 PeterS Note Added: 0003745
08-Sep-16 19:01 alan Note Added: 0003746
15-Sep-16 17:31 PeterS Note Added: 0003749
28-Sep-16 20:24 alan Note Added: 0003751
30-Sep-16 16:58 alan Note Added: 0003752
30-Sep-16 17:01 alan File Added: dbmail_src_modules_authldap.c.diff

30-Sep-16 18:15 alan Issue Monitored: alan
30-Sep-16 18:17 alan File Added:
dbmail_src_modules_authldap.c-20160930.diff
30-Sep-16 23:37 PeterS Note Added: 0003753
30-Sep-16 23:37 PeterS File Added: dbmail.20160930-1642.err

01-Oct-16 00:18 PeterS Note Added: 0003754
01-Oct-16 00:18 PeterS File Added: dbmail.20160930-1721.err

01-Oct-16 15:29 alan File Added: src_modules_authldap.c-20161001.diff

01-Oct-16 15:36 alan Note Added: 0003755
03-Oct-16 16:54 PeterS Note Added: 0003756
03-Oct-16 16:56 PeterS File Added: dbmail.20161003-0956.err.xz

03-Oct-16 17:52 PeterS Note Added: 0003757
03-Oct-16 17:52 PeterS File Added: dbmail.20161003-1055.err.xz

03-Oct-16 19:57 PeterS Note Edited: 0003757
04-Oct-16 17:54 alan Note Added: 0003758
04-Oct-16 17:54 alan File Added: src_modules_authldap.c-20161004.diff

04-Oct-16 20:47 PeterS Note Added: 0003759
04-Oct-16 20:48 PeterS File Added: dbmail.20161004-1348.err.xz

04-Oct-16 21:47 PeterS Note Edited: 0003759
06-Oct-16 15:57 alan Note Added: 0003760
06-Oct-16 15:57 alan File Added: src_modules_authldap.c-20161006.diff

06-Oct-16 19:45 PeterS Note Added: 0003761
06-Oct-16 19:45 PeterS File Added: dbmail.20161006-1236.err.xz

06-Oct-16 19:46 PeterS File Added:
dbmail.20161006-1236.ldaps_tcpdump.txt.xz
06-Oct-16 19:46 PeterS File Added: dbmail.20161006-1236.extra.txt.xz

12-Oct-16 19:20 alan File Added: src_modules_authldap.c-20161012.diff

12-Oct-16 20:13 alan Note Added: 0003762
13-Oct-16 17:14 alan File Added: src_server.c-20161013.diff

13-Oct-16 17:16 alan Note Added: 0003763
14-Oct-16 19:01 alan Note Added: 0003764
14-Oct-16 19:33 PeterS Note Added: 0003765
14-Oct-16 20:50 PeterS Note Added: 0003766
======================================================================

_______________________________________________
Dbmail-dev mailing list
Dbmail-dev@dbmail.org
http://mailman.fastxs.nl/cgi-bin/mailman/listinfo/dbmail-dev
[DBMail 0001081]: DBmail ABEND'ing upon LDAP access error. [ In reply to ]
A NOTE has been added to this issue.
======================================================================
http://dbmail.org/mantis/view.php?id=1081
======================================================================
Reported By: PeterS
Assigned To:
======================================================================
Project: DBMail
Issue ID: 1081
Category: Authentication layer
Reproducibility: always
Severity: major
Priority: normal
Status: new
target:
======================================================================
Date Submitted: 10-Aug-16 23:27 CEST
Last Modified: 14-Oct-16 22:40 CEST
======================================================================
Summary: DBmail ABEND'ing upon LDAP access error.
Description:
I've been trying to find a version of DBmail 3.X that does not ABEND while
it is accessing LDAP. I have not been able to determine the sequence of
events that causes the ABEND, however it happens to me many many times a
day. I've written a script around the execution of DBmail to ensure that
it restarts when it ABENDs.
======================================================================

----------------------------------------------------------------------
(0003740) PeterS (reporter) - 11-Aug-16 19:57
http://dbmail.org/mantis/view.php?id=1081#c3740
----------------------------------------------------------------------
Persuant to the "Bug posting etiquette", yesterday I pulled a clean,
newest, version of DBmail from GIT, compiled it, installed it, cleared the
dbmail.err log, and started it. Since then I have received 14 ABENDs.

[compiler@swlx143 dbmail-git]$ date ; git status -u no
Thu Aug 11 13:02:11 CDT 2016
# On branch master
nothing to commit (working directory clean)
[compiler@swlx143 dbmail-git]$

----------------------------------------------------------------------
(0003745) PeterS (reporter) - 07-Sep-16 23:37
http://dbmail.org/mantis/view.php?id=1081#c3745
----------------------------------------------------------------------
Today I migrated my database over to new hardware and am receiving the same
error.

dbmail-imapd: sasl.c:75: ldap_sasl_bind: Assertion `(
(ld)->ldc->ldc_options.ldo_valid == 0x2 )' failed.

Red Hat RHEL7
dbmail-auth-ldap-3.2.3-1.el7.x86_64
dbmail-3.2.3-1.el7.x86_64
postgresql-9.2.15-1.el7_2.x86_64
postgresql-server-9.2.15-1.el7_2.x86_64

----------------------------------------------------------------------
(0003746) alan (reporter) - 08-Sep-16 19:01
http://dbmail.org/mantis/view.php?id=1081#c3746
----------------------------------------------------------------------
This error appears to be ldap connection related. Error reports elsewhere
suggest either something in /etc/openldap/ldap.conf or library
compatibility following updates to openssl. Are there any certificates
missing?
Are you able to compile dbmail locally, this may address library issues.

----------------------------------------------------------------------
(0003749) PeterS (reporter) - 15-Sep-16 17:31
http://dbmail.org/mantis/view.php?id=1081#c3749
----------------------------------------------------------------------
Alan, I have already tried the latest build(s) of DBmail to no avail. I've
disabled all certificate handling in OPENLDAP using both:

TLS_REQCERT never

and also:

TLS_REQCERT allow

with no change in results. We have multiple Microsoft AD LDAP servers
here, with varying certificates that aren't really verifiable, so I have
always disabled tracking or requiring valid certificates. We have other
Linux systems using LDAP/OPENLDAP without any issues using this same
config.

----------------------------------------------------------------------
(0003751) alan (reporter) - 28-Sep-16 20:24
http://dbmail.org/mantis/view.php?id=1081#c3751
----------------------------------------------------------------------
Confirm I'm able to reproduce the issue, think I've tracked down the parts
of the code causing it. Will test asap.

----------------------------------------------------------------------
(0003752) alan (reporter) - 30-Sep-16 16:58
http://dbmail.org/mantis/view.php?id=1081#c3752
----------------------------------------------------------------------
This issues appears to be fixed using the attached patch, please test.

----------------------------------------------------------------------
(0003753) PeterS (reporter) - 30-Sep-16 23:37
http://dbmail.org/mantis/view.php?id=1081#c3753
----------------------------------------------------------------------
I've patched the Red Hat RHEL 7 (EPEL 7) version of DBmail
(dbmail-3.2.3-1.el7.x86_64, dbmail-auth-ldap-3.2.3-1.el7.x86_64, and
dbmail-debuginfo-3.2.3-1.el7.x86_64) with
dbmail_src_modules_authldap.c-20160930.diff and receive the attached error,
dbmail.20160930-1642.err . The service still ABEND's.

----------------------------------------------------------------------
(0003754) PeterS (reporter) - 01-Oct-16 00:18
http://dbmail.org/mantis/view.php?id=1081#c3754
----------------------------------------------------------------------
I've patched a GIT clone (git clone git://git.dbmail.eu/paul/dbmail) from
today with dbmail_src_modules_authldap.c-20160930.diff and receive the
attached error, dbmail.20160930-1721.err . The service still ABEND's.

----------------------------------------------------------------------
(0003755) alan (reporter) - 01-Oct-16 15:36
http://dbmail.org/mantis/view.php?id=1081#c3755
----------------------------------------------------------------------
Ok, latest patch src_modules_authldap.c-20161001.diff increases the time to
check for server gone away to 2 minutes.

There is an edge case where after binding, the connection instantly goes
away. Also the daemon exits if the ldap server doesn't reappear after two
minutes, similar to not starting if not available.

There should be warnings such as the following, then recovers when the
server returns.
Oct 01 14:06:38 lully.p-o.co.uk dbmail-imapd[96422]: [0x805c0a800]
Warning:[auth] ldap_con_get(+142): LDAP gone away: Can't contact LDAP
server. Trying to reconnect(1/5).
Oct 01 14:06:39 lully.p-o.co.uk dbmail-imapd[96422]: [0x805c0a800]
Warning:[auth] ldap_con_get(+142): LDAP gone away: Can't contact LDAP
server. Trying to reconnect(2/5).
Oct 01 14:06:49 lully.p-o.co.uk dbmail-imapd[96422]: [0x805c0a800]
Warning:[auth] ldap_con_get(+142): LDAP gone away: Can't contact LDAP
server. Trying to reconnect(119/5).

Does this get closer to addressing your issue?

----------------------------------------------------------------------
(0003756) PeterS (reporter) - 03-Oct-16 16:54
http://dbmail.org/mantis/view.php?id=1081#c3756
----------------------------------------------------------------------
I've patched the Red Hat RHEL 7 (EPEL 7) version of DBmail
(dbmail-3.2.3-1.el7.x86_64, dbmail-auth-ldap-3.2.3-1.el7.x86_64, and
dbmail-debuginfo-3.2.3-1.el7.x86_64) with
src_modules_authldap.c-20161001.diff and receive the attached error,
dbmail.20161003-0956.err . This time I've cut the last parts of the full
dbmail.err file which consists of the last threads 0x7f86e84e8000 and
0x7f86e81d1400 . The service still ABEND's.

I'll now try with today's GIT version and the patch.

----------------------------------------------------------------------
(0003757) PeterS (reporter) - 03-Oct-16 19:57
http://dbmail.org/mantis/view.php?id=1081#c3757
----------------------------------------------------------------------
I've patched a GIT clone (git clone git://git.dbmail.eu/paul/dbmail) from
today with src_modules_authldap.c-20161001.diff and receive the attached
error, dbmail.20161003-1055.err (dbmail.20161003-1055.err.xz) . This time
I've cut the last parts of the full dbmail.err file which consists of the
last threads 0x1466450 and 0xde7400. The service still ABEND's.



----------------------------------------------------------------------
(0003758) alan (reporter) - 04-Oct-16 17:54
http://dbmail.org/mantis/view.php?id=1081#c3758
----------------------------------------------------------------------
Patch src_modules_authldap.c-20161004.diff adds slightly more debugging
information, plus binding and search retries.

The error appears to be transient and localised in authldap_search, this
patch should address the issue.

fyi there is a copy at https://github.com/alan-hicks/dbmail

----------------------------------------------------------------------
(0003759) PeterS (reporter) - 04-Oct-16 21:47
http://dbmail.org/mantis/view.php?id=1081#c3759
----------------------------------------------------------------------
I pulled down a clone of your https://github.com/alan-hicks/dbmail and had
it running. This version does not ABEND but rather stops communicating
(which caused my client to Segfault in the middle of doing IMAPS things).

Please see the attached dbmail.20161004-1348.err
(dbmail.20161004-1348.err.xz) for the full error listing.

Correction: it did ABEND when I restarted my IMAPS client and attempted to
connect to the DBmail instance. The same "dbmail-imapd: dm_config.c:134:
config_get_value_once: Assertion `config_dict' failed." error as before is
the last thing logged.



----------------------------------------------------------------------
(0003760) alan (reporter) - 06-Oct-16 15:57
http://dbmail.org/mantis/view.php?id=1081#c3760
----------------------------------------------------------------------
I'm struggling to find what causes dbmail to be unable to contact the
server.
Patch src_modules_authldap.c-20161006.diff bumps the search timeout to 60
seconds.
The error is that the ldap server can't be contacted. Although there is
only a single connection, nothing suggests it's blocked, out of memory or
any of the normal errors. As there are relatively few ldap server queries
compared to sql, pooling probably wouldn't help. I'm assuming the ldap
server/active directory is fine and that it's just dbmail that's unable to
connect.
Not sure what would have caused your client to segfault, the retries would
have paused sending anything back to the client and at most would have
dropped the connection.
Unable to see any "dm_config.c:134: config_get_value_once: Assertion"
errors in the log, can you upload the part of the log leading up to that
assertion please as I'm not sure where getting config values would be
called after startup.

----------------------------------------------------------------------
(0003761) PeterS (reporter) - 06-Oct-16 19:45
http://dbmail.org/mantis/view.php?id=1081#c3761
----------------------------------------------------------------------
Alan, thank you for continuing to assist with this. At some point I may
write a script that will allow me to sanitize my logs so that I can just
post the entire thing to you.

I am attaching even more information for you.

It looks like my dbmail is not disconnecting sessions that it makes with
our LDAP server, and apparently finally ABEND's after it attempts to re-use
a connection to LDAP that is over an hour old.

Please see the attached logs -- dbmail.20161006-1236.err
(dbmail.20161006-1236.err.xz), dbmail.20161006-1236.ldaps_tcpdump.txt
(dbmail.20161006-1236.ldaps_tcpdump.txt.xz), and
dbmail.20161006-1236.extra.txt (dbmail.20161006-1236.extra.txt.xz) .

dbmail.20161006-1236.err contains the very beginning of the log and a large
portion of the end of the log. The "dm_config.c:134:" error is there.
dbmail.20161006-1236.ldaps_tcpdump.txt is the text framing output of the
*full* LDAPS conversation with my LDAP server (I captured all traffic to
and from its IP address.) dbmail.20161006-1236.extra.txt contains all of
the "ldap_con_get"-s that exist in the full dbmail.err log. There is a
drop-off from 11:14:53 to 12:15:05 at which point DBmail ABENDs.

----------------------------------------------------------------------
(0003762) alan (reporter) - 12-Oct-16 20:13
http://dbmail.org/mantis/view.php?id=1081#c3762
----------------------------------------------------------------------
I'm happy to continue supporting dbmail, it's used here so makes sense to
share and keep it up to date.

It's normal to keep ldap connections open if the threads are still extant,
it's similar for the database.

As there are no obvious reasons why the search suddenly stops without an
error, it may be due to a long running thread hogging a resource, I'm aware
that occasionally a long folder listing/update appears to suspend some
clients.

Patch src_modules_authldap.c-20161012.diff updates some deprecated
functions and takes the timeout value from query_timeout in dbmail.conf
which from memory defaults to 300 (5 minutes).

If this fixes the issue great, otherwise updating the code to be more
resource friendly looks as though it will take much longer. As
dbmail.20161006-1236.err.xz timed out after 60 seconds it may not achieve
much, though you might try updating query_timeout to 3600 (an hour) to test
if whatever is causing the block is released. Hopefully whatever precedes
any resumption will help pinpoint the blocking.

----------------------------------------------------------------------
(0003763) alan (reporter) - 13-Oct-16 17:16
http://dbmail.org/mantis/view.php?id=1081#c3763
----------------------------------------------------------------------
patch src_server.c-20161013.diff removes some redundant event and profiling
code.

----------------------------------------------------------------------
(0003764) alan (reporter) - 14-Oct-16 19:01
http://dbmail.org/mantis/view.php?id=1081#c3764
----------------------------------------------------------------------
Apologies for previous patch removing required event_assign. I tested the
wrong binary.
All appears working with the code at https://github.com/alan-hicks/dbmail
Still unable to reproduce LDAP gone away.

----------------------------------------------------------------------
(0003765) PeterS (reporter) - 14-Oct-16 19:33
http://dbmail.org/mantis/view.php?id=1081#c3765
----------------------------------------------------------------------
Ok; yes, I did experience some issues with that latest patch, so I will
give this newer one a try. Thanks.

----------------------------------------------------------------------
(0003766) PeterS (reporter) - 14-Oct-16 20:50
http://dbmail.org/mantis/view.php?id=1081#c3766
----------------------------------------------------------------------
With your patch from today, it looks as though at some point it gives up on
the LDAPS server that is configured and while it claims to be
"reconnecting", it in fact is not sending any requests to the LDAPS
server...

It is logging entries like the following:

Oct 14 13:53:37 swlx143.swmed.e dbmail-imapd[1552]: [0x25a0630]
Warning:[auth] authldap_search(+293): LDAP gone away: Can't contact LDAP
server. Trying again(519/3600).
Oct 14 13:53:37 swlx143.swmed.e dbmail-imapd[1552]: [0x25a04a0]
Warning:[auth] authldap_search(+293): LDAP gone away: Can't contact LDAP
server. Trying again(429/3600).
Oct 14 13:53:38 swlx143.swmed.e dbmail-imapd[1552]: [0x25a0630]
Warning:[auth] authldap_search(+293): LDAP gone away: Can't contact LDAP
server. Trying again(520/3600).
Oct 14 13:53:38 swlx143.swmed.e dbmail-imapd[1552]: [0x25a04a0]
Warning:[auth] authldap_search(+293): LDAP gone away: Can't contact LDAP
server. Trying again(430/3600).

A TCPDUMP of all traffic for the destination of the LDAPS server that is
configured reveals NO TRAFFIC being generated. Am I misunderstanding what
"Trying again" means perhaps?

----------------------------------------------------------------------
(0003767) alan (reporter) - 14-Oct-16 22:40
http://dbmail.org/mantis/view.php?id=1081#c3767
----------------------------------------------------------------------
That part of the code is trying to send a request, something in dbmail is
blocking it. Are existing sessions still OK, is it possible to start a new
session? Any clues to what else might be waiting or stalled?

Issue History
Date Modified Username Field Change
======================================================================
10-Aug-16 23:27 PeterS New Issue
10-Aug-16 23:27 PeterS File Added: dbmail.err
11-Aug-16 19:57 PeterS Note Added: 0003740
07-Sep-16 23:37 PeterS Note Added: 0003745
08-Sep-16 19:01 alan Note Added: 0003746
15-Sep-16 17:31 PeterS Note Added: 0003749
28-Sep-16 20:24 alan Note Added: 0003751
30-Sep-16 16:58 alan Note Added: 0003752
30-Sep-16 17:01 alan File Added: dbmail_src_modules_authldap.c.diff

30-Sep-16 18:15 alan Issue Monitored: alan
30-Sep-16 18:17 alan File Added:
dbmail_src_modules_authldap.c-20160930.diff
30-Sep-16 23:37 PeterS Note Added: 0003753
30-Sep-16 23:37 PeterS File Added: dbmail.20160930-1642.err

01-Oct-16 00:18 PeterS Note Added: 0003754
01-Oct-16 00:18 PeterS File Added: dbmail.20160930-1721.err

01-Oct-16 15:29 alan File Added: src_modules_authldap.c-20161001.diff

01-Oct-16 15:36 alan Note Added: 0003755
03-Oct-16 16:54 PeterS Note Added: 0003756
03-Oct-16 16:56 PeterS File Added: dbmail.20161003-0956.err.xz

03-Oct-16 17:52 PeterS Note Added: 0003757
03-Oct-16 17:52 PeterS File Added: dbmail.20161003-1055.err.xz

03-Oct-16 19:57 PeterS Note Edited: 0003757
04-Oct-16 17:54 alan Note Added: 0003758
04-Oct-16 17:54 alan File Added: src_modules_authldap.c-20161004.diff

04-Oct-16 20:47 PeterS Note Added: 0003759
04-Oct-16 20:48 PeterS File Added: dbmail.20161004-1348.err.xz

04-Oct-16 21:47 PeterS Note Edited: 0003759
06-Oct-16 15:57 alan Note Added: 0003760
06-Oct-16 15:57 alan File Added: src_modules_authldap.c-20161006.diff

06-Oct-16 19:45 PeterS Note Added: 0003761
06-Oct-16 19:45 PeterS File Added: dbmail.20161006-1236.err.xz

06-Oct-16 19:46 PeterS File Added:
dbmail.20161006-1236.ldaps_tcpdump.txt.xz
06-Oct-16 19:46 PeterS File Added: dbmail.20161006-1236.extra.txt.xz

12-Oct-16 19:20 alan File Added: src_modules_authldap.c-20161012.diff

12-Oct-16 20:13 alan Note Added: 0003762
13-Oct-16 17:14 alan File Added: src_server.c-20161013.diff

13-Oct-16 17:16 alan Note Added: 0003763
14-Oct-16 19:01 alan Note Added: 0003764
14-Oct-16 19:33 PeterS Note Added: 0003765
14-Oct-16 20:50 PeterS Note Added: 0003766
14-Oct-16 22:40 alan Note Added: 0003767
======================================================================

_______________________________________________
Dbmail-dev mailing list
Dbmail-dev@dbmail.org
http://mailman.fastxs.nl/cgi-bin/mailman/listinfo/dbmail-dev
[DBMail 0001081]: DBmail ABEND'ing upon LDAP access error. [ In reply to ]
A NOTE has been added to this issue.
======================================================================
http://dbmail.org/mantis/view.php?id=1081
======================================================================
Reported By: PeterS
Assigned To:
======================================================================
Project: DBMail
Issue ID: 1081
Category: Authentication layer
Reproducibility: always
Severity: major
Priority: normal
Status: new
target:
======================================================================
Date Submitted: 10-Aug-16 23:27 CEST
Last Modified: 17-Oct-16 10:34 CEST
======================================================================
Summary: DBmail ABEND'ing upon LDAP access error.
Description:
I've been trying to find a version of DBmail 3.X that does not ABEND while
it is accessing LDAP. I have not been able to determine the sequence of
events that causes the ABEND, however it happens to me many many times a
day. I've written a script around the execution of DBmail to ensure that
it restarts when it ABENDs.
======================================================================

----------------------------------------------------------------------
(0003740) PeterS (reporter) - 11-Aug-16 19:57
http://dbmail.org/mantis/view.php?id=1081#c3740
----------------------------------------------------------------------
Persuant to the "Bug posting etiquette", yesterday I pulled a clean,
newest, version of DBmail from GIT, compiled it, installed it, cleared the
dbmail.err log, and started it. Since then I have received 14 ABENDs.

[compiler@swlx143 dbmail-git]$ date ; git status -u no
Thu Aug 11 13:02:11 CDT 2016
# On branch master
nothing to commit (working directory clean)
[compiler@swlx143 dbmail-git]$

----------------------------------------------------------------------
(0003745) PeterS (reporter) - 07-Sep-16 23:37
http://dbmail.org/mantis/view.php?id=1081#c3745
----------------------------------------------------------------------
Today I migrated my database over to new hardware and am receiving the same
error.

dbmail-imapd: sasl.c:75: ldap_sasl_bind: Assertion `(
(ld)->ldc->ldc_options.ldo_valid == 0x2 )' failed.

Red Hat RHEL7
dbmail-auth-ldap-3.2.3-1.el7.x86_64
dbmail-3.2.3-1.el7.x86_64
postgresql-9.2.15-1.el7_2.x86_64
postgresql-server-9.2.15-1.el7_2.x86_64

----------------------------------------------------------------------
(0003746) alan (reporter) - 08-Sep-16 19:01
http://dbmail.org/mantis/view.php?id=1081#c3746
----------------------------------------------------------------------
This error appears to be ldap connection related. Error reports elsewhere
suggest either something in /etc/openldap/ldap.conf or library
compatibility following updates to openssl. Are there any certificates
missing?
Are you able to compile dbmail locally, this may address library issues.

----------------------------------------------------------------------
(0003749) PeterS (reporter) - 15-Sep-16 17:31
http://dbmail.org/mantis/view.php?id=1081#c3749
----------------------------------------------------------------------
Alan, I have already tried the latest build(s) of DBmail to no avail. I've
disabled all certificate handling in OPENLDAP using both:

TLS_REQCERT never

and also:

TLS_REQCERT allow

with no change in results. We have multiple Microsoft AD LDAP servers
here, with varying certificates that aren't really verifiable, so I have
always disabled tracking or requiring valid certificates. We have other
Linux systems using LDAP/OPENLDAP without any issues using this same
config.

----------------------------------------------------------------------
(0003751) alan (reporter) - 28-Sep-16 20:24
http://dbmail.org/mantis/view.php?id=1081#c3751
----------------------------------------------------------------------
Confirm I'm able to reproduce the issue, think I've tracked down the parts
of the code causing it. Will test asap.

----------------------------------------------------------------------
(0003752) alan (reporter) - 30-Sep-16 16:58
http://dbmail.org/mantis/view.php?id=1081#c3752
----------------------------------------------------------------------
This issues appears to be fixed using the attached patch, please test.

----------------------------------------------------------------------
(0003753) PeterS (reporter) - 30-Sep-16 23:37
http://dbmail.org/mantis/view.php?id=1081#c3753
----------------------------------------------------------------------
I've patched the Red Hat RHEL 7 (EPEL 7) version of DBmail
(dbmail-3.2.3-1.el7.x86_64, dbmail-auth-ldap-3.2.3-1.el7.x86_64, and
dbmail-debuginfo-3.2.3-1.el7.x86_64) with
dbmail_src_modules_authldap.c-20160930.diff and receive the attached error,
dbmail.20160930-1642.err . The service still ABEND's.

----------------------------------------------------------------------
(0003754) PeterS (reporter) - 01-Oct-16 00:18
http://dbmail.org/mantis/view.php?id=1081#c3754
----------------------------------------------------------------------
I've patched a GIT clone (git clone git://git.dbmail.eu/paul/dbmail) from
today with dbmail_src_modules_authldap.c-20160930.diff and receive the
attached error, dbmail.20160930-1721.err . The service still ABEND's.

----------------------------------------------------------------------
(0003755) alan (reporter) - 01-Oct-16 15:36
http://dbmail.org/mantis/view.php?id=1081#c3755
----------------------------------------------------------------------
Ok, latest patch src_modules_authldap.c-20161001.diff increases the time to
check for server gone away to 2 minutes.

There is an edge case where after binding, the connection instantly goes
away. Also the daemon exits if the ldap server doesn't reappear after two
minutes, similar to not starting if not available.

There should be warnings such as the following, then recovers when the
server returns.
Oct 01 14:06:38 lully.p-o.co.uk dbmail-imapd[96422]: [0x805c0a800]
Warning:[auth] ldap_con_get(+142): LDAP gone away: Can't contact LDAP
server. Trying to reconnect(1/5).
Oct 01 14:06:39 lully.p-o.co.uk dbmail-imapd[96422]: [0x805c0a800]
Warning:[auth] ldap_con_get(+142): LDAP gone away: Can't contact LDAP
server. Trying to reconnect(2/5).
Oct 01 14:06:49 lully.p-o.co.uk dbmail-imapd[96422]: [0x805c0a800]
Warning:[auth] ldap_con_get(+142): LDAP gone away: Can't contact LDAP
server. Trying to reconnect(119/5).

Does this get closer to addressing your issue?

----------------------------------------------------------------------
(0003756) PeterS (reporter) - 03-Oct-16 16:54
http://dbmail.org/mantis/view.php?id=1081#c3756
----------------------------------------------------------------------
I've patched the Red Hat RHEL 7 (EPEL 7) version of DBmail
(dbmail-3.2.3-1.el7.x86_64, dbmail-auth-ldap-3.2.3-1.el7.x86_64, and
dbmail-debuginfo-3.2.3-1.el7.x86_64) with
src_modules_authldap.c-20161001.diff and receive the attached error,
dbmail.20161003-0956.err . This time I've cut the last parts of the full
dbmail.err file which consists of the last threads 0x7f86e84e8000 and
0x7f86e81d1400 . The service still ABEND's.

I'll now try with today's GIT version and the patch.

----------------------------------------------------------------------
(0003757) PeterS (reporter) - 03-Oct-16 19:57
http://dbmail.org/mantis/view.php?id=1081#c3757
----------------------------------------------------------------------
I've patched a GIT clone (git clone git://git.dbmail.eu/paul/dbmail) from
today with src_modules_authldap.c-20161001.diff and receive the attached
error, dbmail.20161003-1055.err (dbmail.20161003-1055.err.xz) . This time
I've cut the last parts of the full dbmail.err file which consists of the
last threads 0x1466450 and 0xde7400. The service still ABEND's.



----------------------------------------------------------------------
(0003758) alan (reporter) - 04-Oct-16 17:54
http://dbmail.org/mantis/view.php?id=1081#c3758
----------------------------------------------------------------------
Patch src_modules_authldap.c-20161004.diff adds slightly more debugging
information, plus binding and search retries.

The error appears to be transient and localised in authldap_search, this
patch should address the issue.

fyi there is a copy at https://github.com/alan-hicks/dbmail

----------------------------------------------------------------------
(0003759) PeterS (reporter) - 04-Oct-16 21:47
http://dbmail.org/mantis/view.php?id=1081#c3759
----------------------------------------------------------------------
I pulled down a clone of your https://github.com/alan-hicks/dbmail and had
it running. This version does not ABEND but rather stops communicating
(which caused my client to Segfault in the middle of doing IMAPS things).

Please see the attached dbmail.20161004-1348.err
(dbmail.20161004-1348.err.xz) for the full error listing.

Correction: it did ABEND when I restarted my IMAPS client and attempted to
connect to the DBmail instance. The same "dbmail-imapd: dm_config.c:134:
config_get_value_once: Assertion `config_dict' failed." error as before is
the last thing logged.



----------------------------------------------------------------------
(0003760) alan (reporter) - 06-Oct-16 15:57
http://dbmail.org/mantis/view.php?id=1081#c3760
----------------------------------------------------------------------
I'm struggling to find what causes dbmail to be unable to contact the
server.
Patch src_modules_authldap.c-20161006.diff bumps the search timeout to 60
seconds.
The error is that the ldap server can't be contacted. Although there is
only a single connection, nothing suggests it's blocked, out of memory or
any of the normal errors. As there are relatively few ldap server queries
compared to sql, pooling probably wouldn't help. I'm assuming the ldap
server/active directory is fine and that it's just dbmail that's unable to
connect.
Not sure what would have caused your client to segfault, the retries would
have paused sending anything back to the client and at most would have
dropped the connection.
Unable to see any "dm_config.c:134: config_get_value_once: Assertion"
errors in the log, can you upload the part of the log leading up to that
assertion please as I'm not sure where getting config values would be
called after startup.

----------------------------------------------------------------------
(0003761) PeterS (reporter) - 06-Oct-16 19:45
http://dbmail.org/mantis/view.php?id=1081#c3761
----------------------------------------------------------------------
Alan, thank you for continuing to assist with this. At some point I may
write a script that will allow me to sanitize my logs so that I can just
post the entire thing to you.

I am attaching even more information for you.

It looks like my dbmail is not disconnecting sessions that it makes with
our LDAP server, and apparently finally ABEND's after it attempts to re-use
a connection to LDAP that is over an hour old.

Please see the attached logs -- dbmail.20161006-1236.err
(dbmail.20161006-1236.err.xz), dbmail.20161006-1236.ldaps_tcpdump.txt
(dbmail.20161006-1236.ldaps_tcpdump.txt.xz), and
dbmail.20161006-1236.extra.txt (dbmail.20161006-1236.extra.txt.xz) .

dbmail.20161006-1236.err contains the very beginning of the log and a large
portion of the end of the log. The "dm_config.c:134:" error is there.
dbmail.20161006-1236.ldaps_tcpdump.txt is the text framing output of the
*full* LDAPS conversation with my LDAP server (I captured all traffic to
and from its IP address.) dbmail.20161006-1236.extra.txt contains all of
the "ldap_con_get"-s that exist in the full dbmail.err log. There is a
drop-off from 11:14:53 to 12:15:05 at which point DBmail ABENDs.

----------------------------------------------------------------------
(0003762) alan (reporter) - 12-Oct-16 20:13
http://dbmail.org/mantis/view.php?id=1081#c3762
----------------------------------------------------------------------
I'm happy to continue supporting dbmail, it's used here so makes sense to
share and keep it up to date.

It's normal to keep ldap connections open if the threads are still extant,
it's similar for the database.

As there are no obvious reasons why the search suddenly stops without an
error, it may be due to a long running thread hogging a resource, I'm aware
that occasionally a long folder listing/update appears to suspend some
clients.

Patch src_modules_authldap.c-20161012.diff updates some deprecated
functions and takes the timeout value from query_timeout in dbmail.conf
which from memory defaults to 300 (5 minutes).

If this fixes the issue great, otherwise updating the code to be more
resource friendly looks as though it will take much longer. As
dbmail.20161006-1236.err.xz timed out after 60 seconds it may not achieve
much, though you might try updating query_timeout to 3600 (an hour) to test
if whatever is causing the block is released. Hopefully whatever precedes
any resumption will help pinpoint the blocking.

----------------------------------------------------------------------
(0003763) alan (reporter) - 13-Oct-16 17:16
http://dbmail.org/mantis/view.php?id=1081#c3763
----------------------------------------------------------------------
patch src_server.c-20161013.diff removes some redundant event and profiling
code.

----------------------------------------------------------------------
(0003764) alan (reporter) - 14-Oct-16 19:01
http://dbmail.org/mantis/view.php?id=1081#c3764
----------------------------------------------------------------------
Apologies for previous patch removing required event_assign. I tested the
wrong binary.
All appears working with the code at https://github.com/alan-hicks/dbmail
Still unable to reproduce LDAP gone away.

----------------------------------------------------------------------
(0003765) PeterS (reporter) - 14-Oct-16 19:33
http://dbmail.org/mantis/view.php?id=1081#c3765
----------------------------------------------------------------------
Ok; yes, I did experience some issues with that latest patch, so I will
give this newer one a try. Thanks.

----------------------------------------------------------------------
(0003766) PeterS (reporter) - 14-Oct-16 20:50
http://dbmail.org/mantis/view.php?id=1081#c3766
----------------------------------------------------------------------
With your patch from today, it looks as though at some point it gives up on
the LDAPS server that is configured and while it claims to be
"reconnecting", it in fact is not sending any requests to the LDAPS
server...

It is logging entries like the following:

Oct 14 13:53:37 swlx143.swmed.e dbmail-imapd[1552]: [0x25a0630]
Warning:[auth] authldap_search(+293): LDAP gone away: Can't contact LDAP
server. Trying again(519/3600).
Oct 14 13:53:37 swlx143.swmed.e dbmail-imapd[1552]: [0x25a04a0]
Warning:[auth] authldap_search(+293): LDAP gone away: Can't contact LDAP
server. Trying again(429/3600).
Oct 14 13:53:38 swlx143.swmed.e dbmail-imapd[1552]: [0x25a0630]
Warning:[auth] authldap_search(+293): LDAP gone away: Can't contact LDAP
server. Trying again(520/3600).
Oct 14 13:53:38 swlx143.swmed.e dbmail-imapd[1552]: [0x25a04a0]
Warning:[auth] authldap_search(+293): LDAP gone away: Can't contact LDAP
server. Trying again(430/3600).

A TCPDUMP of all traffic for the destination of the LDAPS server that is
configured reveals NO TRAFFIC being generated. Am I misunderstanding what
"Trying again" means perhaps?

----------------------------------------------------------------------
(0003767) alan (reporter) - 14-Oct-16 22:40
http://dbmail.org/mantis/view.php?id=1081#c3767
----------------------------------------------------------------------
That part of the code is trying to send a request, something in dbmail is
blocking it. Are existing sessions still OK, is it possible to start a new
session? Any clues to what else might be waiting or stalled?

----------------------------------------------------------------------
(0003768) alan (reporter) - 17-Oct-16 10:34
http://dbmail.org/mantis/view.php?id=1081#c3768
----------------------------------------------------------------------
Default timeout of 4000 appears excessive, I'm successfully running
production with 300 (5 mins), perhaps 60 might be useful for testing and
may illuminate a stuck thread.

Issue History
Date Modified Username Field Change
======================================================================
10-Aug-16 23:27 PeterS New Issue
10-Aug-16 23:27 PeterS File Added: dbmail.err
11-Aug-16 19:57 PeterS Note Added: 0003740
07-Sep-16 23:37 PeterS Note Added: 0003745
08-Sep-16 19:01 alan Note Added: 0003746
15-Sep-16 17:31 PeterS Note Added: 0003749
28-Sep-16 20:24 alan Note Added: 0003751
30-Sep-16 16:58 alan Note Added: 0003752
30-Sep-16 17:01 alan File Added: dbmail_src_modules_authldap.c.diff

30-Sep-16 18:15 alan Issue Monitored: alan
30-Sep-16 18:17 alan File Added:
dbmail_src_modules_authldap.c-20160930.diff
30-Sep-16 23:37 PeterS Note Added: 0003753
30-Sep-16 23:37 PeterS File Added: dbmail.20160930-1642.err

01-Oct-16 00:18 PeterS Note Added: 0003754
01-Oct-16 00:18 PeterS File Added: dbmail.20160930-1721.err

01-Oct-16 15:29 alan File Added: src_modules_authldap.c-20161001.diff

01-Oct-16 15:36 alan Note Added: 0003755
03-Oct-16 16:54 PeterS Note Added: 0003756
03-Oct-16 16:56 PeterS File Added: dbmail.20161003-0956.err.xz

03-Oct-16 17:52 PeterS Note Added: 0003757
03-Oct-16 17:52 PeterS File Added: dbmail.20161003-1055.err.xz

03-Oct-16 19:57 PeterS Note Edited: 0003757
04-Oct-16 17:54 alan Note Added: 0003758
04-Oct-16 17:54 alan File Added: src_modules_authldap.c-20161004.diff

04-Oct-16 20:47 PeterS Note Added: 0003759
04-Oct-16 20:48 PeterS File Added: dbmail.20161004-1348.err.xz

04-Oct-16 21:47 PeterS Note Edited: 0003759
06-Oct-16 15:57 alan Note Added: 0003760
06-Oct-16 15:57 alan File Added: src_modules_authldap.c-20161006.diff

06-Oct-16 19:45 PeterS Note Added: 0003761
06-Oct-16 19:45 PeterS File Added: dbmail.20161006-1236.err.xz

06-Oct-16 19:46 PeterS File Added:
dbmail.20161006-1236.ldaps_tcpdump.txt.xz
06-Oct-16 19:46 PeterS File Added: dbmail.20161006-1236.extra.txt.xz

12-Oct-16 19:20 alan File Added: src_modules_authldap.c-20161012.diff

12-Oct-16 20:13 alan Note Added: 0003762
13-Oct-16 17:14 alan File Added: src_server.c-20161013.diff

13-Oct-16 17:16 alan Note Added: 0003763
14-Oct-16 19:01 alan Note Added: 0003764
14-Oct-16 19:33 PeterS Note Added: 0003765
14-Oct-16 20:50 PeterS Note Added: 0003766
14-Oct-16 22:40 alan Note Added: 0003767
17-Oct-16 10:34 alan Note Added: 0003768
======================================================================

_______________________________________________
Dbmail-dev mailing list
Dbmail-dev@dbmail.org
http://mailman.fastxs.nl/cgi-bin/mailman/listinfo/dbmail-dev
[DBMail 0001081]: DBmail ABEND'ing upon LDAP access error. [ In reply to ]
A NOTE has been added to this issue.
======================================================================
http://dbmail.org/mantis/view.php?id=1081
======================================================================
Reported By: PeterS
Assigned To:
======================================================================
Project: DBMail
Issue ID: 1081
Category: Authentication layer
Reproducibility: always
Severity: major
Priority: normal
Status: new
target:
======================================================================
Date Submitted: 10-Aug-16 23:27 CEST
Last Modified: 17-Oct-16 21:07 CEST
======================================================================
Summary: DBmail ABEND'ing upon LDAP access error.
Description:
I've been trying to find a version of DBmail 3.X that does not ABEND while
it is accessing LDAP. I have not been able to determine the sequence of
events that causes the ABEND, however it happens to me many many times a
day. I've written a script around the execution of DBmail to ensure that
it restarts when it ABENDs.
======================================================================

----------------------------------------------------------------------
(0003740) PeterS (reporter) - 11-Aug-16 19:57
http://dbmail.org/mantis/view.php?id=1081#c3740
----------------------------------------------------------------------
Persuant to the "Bug posting etiquette", yesterday I pulled a clean,
newest, version of DBmail from GIT, compiled it, installed it, cleared the
dbmail.err log, and started it. Since then I have received 14 ABENDs.

[compiler@swlx143 dbmail-git]$ date ; git status -u no
Thu Aug 11 13:02:11 CDT 2016
# On branch master
nothing to commit (working directory clean)
[compiler@swlx143 dbmail-git]$

----------------------------------------------------------------------
(0003745) PeterS (reporter) - 07-Sep-16 23:37
http://dbmail.org/mantis/view.php?id=1081#c3745
----------------------------------------------------------------------
Today I migrated my database over to new hardware and am receiving the same
error.

dbmail-imapd: sasl.c:75: ldap_sasl_bind: Assertion `(
(ld)->ldc->ldc_options.ldo_valid == 0x2 )' failed.

Red Hat RHEL7
dbmail-auth-ldap-3.2.3-1.el7.x86_64
dbmail-3.2.3-1.el7.x86_64
postgresql-9.2.15-1.el7_2.x86_64
postgresql-server-9.2.15-1.el7_2.x86_64

----------------------------------------------------------------------
(0003746) alan (reporter) - 08-Sep-16 19:01
http://dbmail.org/mantis/view.php?id=1081#c3746
----------------------------------------------------------------------
This error appears to be ldap connection related. Error reports elsewhere
suggest either something in /etc/openldap/ldap.conf or library
compatibility following updates to openssl. Are there any certificates
missing?
Are you able to compile dbmail locally, this may address library issues.

----------------------------------------------------------------------
(0003749) PeterS (reporter) - 15-Sep-16 17:31
http://dbmail.org/mantis/view.php?id=1081#c3749
----------------------------------------------------------------------
Alan, I have already tried the latest build(s) of DBmail to no avail. I've
disabled all certificate handling in OPENLDAP using both:

TLS_REQCERT never

and also:

TLS_REQCERT allow

with no change in results. We have multiple Microsoft AD LDAP servers
here, with varying certificates that aren't really verifiable, so I have
always disabled tracking or requiring valid certificates. We have other
Linux systems using LDAP/OPENLDAP without any issues using this same
config.

----------------------------------------------------------------------
(0003751) alan (reporter) - 28-Sep-16 20:24
http://dbmail.org/mantis/view.php?id=1081#c3751
----------------------------------------------------------------------
Confirm I'm able to reproduce the issue, think I've tracked down the parts
of the code causing it. Will test asap.

----------------------------------------------------------------------
(0003752) alan (reporter) - 30-Sep-16 16:58
http://dbmail.org/mantis/view.php?id=1081#c3752
----------------------------------------------------------------------
This issues appears to be fixed using the attached patch, please test.

----------------------------------------------------------------------
(0003753) PeterS (reporter) - 30-Sep-16 23:37
http://dbmail.org/mantis/view.php?id=1081#c3753
----------------------------------------------------------------------
I've patched the Red Hat RHEL 7 (EPEL 7) version of DBmail
(dbmail-3.2.3-1.el7.x86_64, dbmail-auth-ldap-3.2.3-1.el7.x86_64, and
dbmail-debuginfo-3.2.3-1.el7.x86_64) with
dbmail_src_modules_authldap.c-20160930.diff and receive the attached error,
dbmail.20160930-1642.err . The service still ABEND's.

----------------------------------------------------------------------
(0003754) PeterS (reporter) - 01-Oct-16 00:18
http://dbmail.org/mantis/view.php?id=1081#c3754
----------------------------------------------------------------------
I've patched a GIT clone (git clone git://git.dbmail.eu/paul/dbmail) from
today with dbmail_src_modules_authldap.c-20160930.diff and receive the
attached error, dbmail.20160930-1721.err . The service still ABEND's.

----------------------------------------------------------------------
(0003755) alan (reporter) - 01-Oct-16 15:36
http://dbmail.org/mantis/view.php?id=1081#c3755
----------------------------------------------------------------------
Ok, latest patch src_modules_authldap.c-20161001.diff increases the time to
check for server gone away to 2 minutes.

There is an edge case where after binding, the connection instantly goes
away. Also the daemon exits if the ldap server doesn't reappear after two
minutes, similar to not starting if not available.

There should be warnings such as the following, then recovers when the
server returns.
Oct 01 14:06:38 lully.p-o.co.uk dbmail-imapd[96422]: [0x805c0a800]
Warning:[auth] ldap_con_get(+142): LDAP gone away: Can't contact LDAP
server. Trying to reconnect(1/5).
Oct 01 14:06:39 lully.p-o.co.uk dbmail-imapd[96422]: [0x805c0a800]
Warning:[auth] ldap_con_get(+142): LDAP gone away: Can't contact LDAP
server. Trying to reconnect(2/5).
Oct 01 14:06:49 lully.p-o.co.uk dbmail-imapd[96422]: [0x805c0a800]
Warning:[auth] ldap_con_get(+142): LDAP gone away: Can't contact LDAP
server. Trying to reconnect(119/5).

Does this get closer to addressing your issue?

----------------------------------------------------------------------
(0003756) PeterS (reporter) - 03-Oct-16 16:54
http://dbmail.org/mantis/view.php?id=1081#c3756
----------------------------------------------------------------------
I've patched the Red Hat RHEL 7 (EPEL 7) version of DBmail
(dbmail-3.2.3-1.el7.x86_64, dbmail-auth-ldap-3.2.3-1.el7.x86_64, and
dbmail-debuginfo-3.2.3-1.el7.x86_64) with
src_modules_authldap.c-20161001.diff and receive the attached error,
dbmail.20161003-0956.err . This time I've cut the last parts of the full
dbmail.err file which consists of the last threads 0x7f86e84e8000 and
0x7f86e81d1400 . The service still ABEND's.

I'll now try with today's GIT version and the patch.

----------------------------------------------------------------------
(0003757) PeterS (reporter) - 03-Oct-16 19:57
http://dbmail.org/mantis/view.php?id=1081#c3757
----------------------------------------------------------------------
I've patched a GIT clone (git clone git://git.dbmail.eu/paul/dbmail) from
today with src_modules_authldap.c-20161001.diff and receive the attached
error, dbmail.20161003-1055.err (dbmail.20161003-1055.err.xz) . This time
I've cut the last parts of the full dbmail.err file which consists of the
last threads 0x1466450 and 0xde7400. The service still ABEND's.



----------------------------------------------------------------------
(0003758) alan (reporter) - 04-Oct-16 17:54
http://dbmail.org/mantis/view.php?id=1081#c3758
----------------------------------------------------------------------
Patch src_modules_authldap.c-20161004.diff adds slightly more debugging
information, plus binding and search retries.

The error appears to be transient and localised in authldap_search, this
patch should address the issue.

fyi there is a copy at https://github.com/alan-hicks/dbmail

----------------------------------------------------------------------
(0003759) PeterS (reporter) - 04-Oct-16 21:47
http://dbmail.org/mantis/view.php?id=1081#c3759
----------------------------------------------------------------------
I pulled down a clone of your https://github.com/alan-hicks/dbmail and had
it running. This version does not ABEND but rather stops communicating
(which caused my client to Segfault in the middle of doing IMAPS things).

Please see the attached dbmail.20161004-1348.err
(dbmail.20161004-1348.err.xz) for the full error listing.

Correction: it did ABEND when I restarted my IMAPS client and attempted to
connect to the DBmail instance. The same "dbmail-imapd: dm_config.c:134:
config_get_value_once: Assertion `config_dict' failed." error as before is
the last thing logged.



----------------------------------------------------------------------
(0003760) alan (reporter) - 06-Oct-16 15:57
http://dbmail.org/mantis/view.php?id=1081#c3760
----------------------------------------------------------------------
I'm struggling to find what causes dbmail to be unable to contact the
server.
Patch src_modules_authldap.c-20161006.diff bumps the search timeout to 60
seconds.
The error is that the ldap server can't be contacted. Although there is
only a single connection, nothing suggests it's blocked, out of memory or
any of the normal errors. As there are relatively few ldap server queries
compared to sql, pooling probably wouldn't help. I'm assuming the ldap
server/active directory is fine and that it's just dbmail that's unable to
connect.
Not sure what would have caused your client to segfault, the retries would
have paused sending anything back to the client and at most would have
dropped the connection.
Unable to see any "dm_config.c:134: config_get_value_once: Assertion"
errors in the log, can you upload the part of the log leading up to that
assertion please as I'm not sure where getting config values would be
called after startup.

----------------------------------------------------------------------
(0003761) PeterS (reporter) - 06-Oct-16 19:45
http://dbmail.org/mantis/view.php?id=1081#c3761
----------------------------------------------------------------------
Alan, thank you for continuing to assist with this. At some point I may
write a script that will allow me to sanitize my logs so that I can just
post the entire thing to you.

I am attaching even more information for you.

It looks like my dbmail is not disconnecting sessions that it makes with
our LDAP server, and apparently finally ABEND's after it attempts to re-use
a connection to LDAP that is over an hour old.

Please see the attached logs -- dbmail.20161006-1236.err
(dbmail.20161006-1236.err.xz), dbmail.20161006-1236.ldaps_tcpdump.txt
(dbmail.20161006-1236.ldaps_tcpdump.txt.xz), and
dbmail.20161006-1236.extra.txt (dbmail.20161006-1236.extra.txt.xz) .

dbmail.20161006-1236.err contains the very beginning of the log and a large
portion of the end of the log. The "dm_config.c:134:" error is there.
dbmail.20161006-1236.ldaps_tcpdump.txt is the text framing output of the
*full* LDAPS conversation with my LDAP server (I captured all traffic to
and from its IP address.) dbmail.20161006-1236.extra.txt contains all of
the "ldap_con_get"-s that exist in the full dbmail.err log. There is a
drop-off from 11:14:53 to 12:15:05 at which point DBmail ABENDs.

----------------------------------------------------------------------
(0003762) alan (reporter) - 12-Oct-16 20:13
http://dbmail.org/mantis/view.php?id=1081#c3762
----------------------------------------------------------------------
I'm happy to continue supporting dbmail, it's used here so makes sense to
share and keep it up to date.

It's normal to keep ldap connections open if the threads are still extant,
it's similar for the database.

As there are no obvious reasons why the search suddenly stops without an
error, it may be due to a long running thread hogging a resource, I'm aware
that occasionally a long folder listing/update appears to suspend some
clients.

Patch src_modules_authldap.c-20161012.diff updates some deprecated
functions and takes the timeout value from query_timeout in dbmail.conf
which from memory defaults to 300 (5 minutes).

If this fixes the issue great, otherwise updating the code to be more
resource friendly looks as though it will take much longer. As
dbmail.20161006-1236.err.xz timed out after 60 seconds it may not achieve
much, though you might try updating query_timeout to 3600 (an hour) to test
if whatever is causing the block is released. Hopefully whatever precedes
any resumption will help pinpoint the blocking.

----------------------------------------------------------------------
(0003763) alan (reporter) - 13-Oct-16 17:16
http://dbmail.org/mantis/view.php?id=1081#c3763
----------------------------------------------------------------------
patch src_server.c-20161013.diff removes some redundant event and profiling
code.

----------------------------------------------------------------------
(0003764) alan (reporter) - 14-Oct-16 19:01
http://dbmail.org/mantis/view.php?id=1081#c3764
----------------------------------------------------------------------
Apologies for previous patch removing required event_assign. I tested the
wrong binary.
All appears working with the code at https://github.com/alan-hicks/dbmail
Still unable to reproduce LDAP gone away.

----------------------------------------------------------------------
(0003765) PeterS (reporter) - 14-Oct-16 19:33
http://dbmail.org/mantis/view.php?id=1081#c3765
----------------------------------------------------------------------
Ok; yes, I did experience some issues with that latest patch, so I will
give this newer one a try. Thanks.

----------------------------------------------------------------------
(0003766) PeterS (reporter) - 14-Oct-16 20:50
http://dbmail.org/mantis/view.php?id=1081#c3766
----------------------------------------------------------------------
With your patch from today, it looks as though at some point it gives up on
the LDAPS server that is configured and while it claims to be
"reconnecting", it in fact is not sending any requests to the LDAPS
server...

It is logging entries like the following:

Oct 14 13:53:37 swlx143.swmed.e dbmail-imapd[1552]: [0x25a0630]
Warning:[auth] authldap_search(+293): LDAP gone away: Can't contact LDAP
server. Trying again(519/3600).
Oct 14 13:53:37 swlx143.swmed.e dbmail-imapd[1552]: [0x25a04a0]
Warning:[auth] authldap_search(+293): LDAP gone away: Can't contact LDAP
server. Trying again(429/3600).
Oct 14 13:53:38 swlx143.swmed.e dbmail-imapd[1552]: [0x25a0630]
Warning:[auth] authldap_search(+293): LDAP gone away: Can't contact LDAP
server. Trying again(520/3600).
Oct 14 13:53:38 swlx143.swmed.e dbmail-imapd[1552]: [0x25a04a0]
Warning:[auth] authldap_search(+293): LDAP gone away: Can't contact LDAP
server. Trying again(430/3600).

A TCPDUMP of all traffic for the destination of the LDAPS server that is
configured reveals NO TRAFFIC being generated. Am I misunderstanding what
"Trying again" means perhaps?

----------------------------------------------------------------------
(0003767) alan (reporter) - 14-Oct-16 22:40
http://dbmail.org/mantis/view.php?id=1081#c3767
----------------------------------------------------------------------
That part of the code is trying to send a request, something in dbmail is
blocking it. Are existing sessions still OK, is it possible to start a new
session? Any clues to what else might be waiting or stalled?

----------------------------------------------------------------------
(0003768) alan (reporter) - 17-Oct-16 10:34
http://dbmail.org/mantis/view.php?id=1081#c3768
----------------------------------------------------------------------
Default timeout of 4000 appears excessive, I'm successfully running
production with 300 (5 mins), perhaps 60 might be useful for testing and
may illuminate a stuck thread.

----------------------------------------------------------------------
(0003769) PeterS (reporter) - 17-Oct-16 21:07
http://dbmail.org/mantis/view.php?id=1081#c3769
----------------------------------------------------------------------
Here is the latest.

dbmail.20161017-1031.err.xz (dbmail.20161017-1031.err) and
dbmail.20161017-1031.extra.txt (dbmail.20161017-1031.extra.txt.xz)

Updated configuration for your suggested timeout, "query_timeout = 60".
Still ABENDs, with the last line being:

"dbmail-imapd: dm_config.c:134: config_get_value_once: Assertion
`config_dict' failed."

Issue History
Date Modified Username Field Change
======================================================================
10-Aug-16 23:27 PeterS New Issue
10-Aug-16 23:27 PeterS File Added: dbmail.err
11-Aug-16 19:57 PeterS Note Added: 0003740
07-Sep-16 23:37 PeterS Note Added: 0003745
08-Sep-16 19:01 alan Note Added: 0003746
15-Sep-16 17:31 PeterS Note Added: 0003749
28-Sep-16 20:24 alan Note Added: 0003751
30-Sep-16 16:58 alan Note Added: 0003752
30-Sep-16 17:01 alan File Added: dbmail_src_modules_authldap.c.diff

30-Sep-16 18:15 alan Issue Monitored: alan
30-Sep-16 18:17 alan File Added:
dbmail_src_modules_authldap.c-20160930.diff
30-Sep-16 23:37 PeterS Note Added: 0003753
30-Sep-16 23:37 PeterS File Added: dbmail.20160930-1642.err

01-Oct-16 00:18 PeterS Note Added: 0003754
01-Oct-16 00:18 PeterS File Added: dbmail.20160930-1721.err

01-Oct-16 15:29 alan File Added: src_modules_authldap.c-20161001.diff

01-Oct-16 15:36 alan Note Added: 0003755
03-Oct-16 16:54 PeterS Note Added: 0003756
03-Oct-16 16:56 PeterS File Added: dbmail.20161003-0956.err.xz

03-Oct-16 17:52 PeterS Note Added: 0003757
03-Oct-16 17:52 PeterS File Added: dbmail.20161003-1055.err.xz

03-Oct-16 19:57 PeterS Note Edited: 0003757
04-Oct-16 17:54 alan Note Added: 0003758
04-Oct-16 17:54 alan File Added: src_modules_authldap.c-20161004.diff

04-Oct-16 20:47 PeterS Note Added: 0003759
04-Oct-16 20:48 PeterS File Added: dbmail.20161004-1348.err.xz

04-Oct-16 21:47 PeterS Note Edited: 0003759
06-Oct-16 15:57 alan Note Added: 0003760
06-Oct-16 15:57 alan File Added: src_modules_authldap.c-20161006.diff

06-Oct-16 19:45 PeterS Note Added: 0003761
06-Oct-16 19:45 PeterS File Added: dbmail.20161006-1236.err.xz

06-Oct-16 19:46 PeterS File Added:
dbmail.20161006-1236.ldaps_tcpdump.txt.xz
06-Oct-16 19:46 PeterS File Added: dbmail.20161006-1236.extra.txt.xz

12-Oct-16 19:20 alan File Added: src_modules_authldap.c-20161012.diff

12-Oct-16 20:13 alan Note Added: 0003762
13-Oct-16 17:14 alan File Added: src_server.c-20161013.diff

13-Oct-16 17:16 alan Note Added: 0003763
14-Oct-16 19:01 alan Note Added: 0003764
14-Oct-16 19:33 PeterS Note Added: 0003765
14-Oct-16 20:50 PeterS Note Added: 0003766
14-Oct-16 22:40 alan Note Added: 0003767
17-Oct-16 10:34 alan Note Added: 0003768
17-Oct-16 21:07 PeterS Note Added: 0003769
======================================================================

_______________________________________________
Dbmail-dev mailing list
Dbmail-dev@dbmail.org
http://mailman.fastxs.nl/cgi-bin/mailman/listinfo/dbmail-dev
[DBMail 0001081]: DBmail ABEND'ing upon LDAP access error. [ In reply to ]
A NOTE has been added to this issue.
======================================================================
http://dbmail.org/mantis/view.php?id=1081
======================================================================
Reported By: PeterS
Assigned To:
======================================================================
Project: DBMail
Issue ID: 1081
Category: Authentication layer
Reproducibility: always
Severity: major
Priority: normal
Status: new
target:
======================================================================
Date Submitted: 10-Aug-16 23:27 CEST
Last Modified: 18-Oct-16 15:05 CEST
======================================================================
Summary: DBmail ABEND'ing upon LDAP access error.
Description:
I've been trying to find a version of DBmail 3.X that does not ABEND while
it is accessing LDAP. I have not been able to determine the sequence of
events that causes the ABEND, however it happens to me many many times a
day. I've written a script around the execution of DBmail to ensure that
it restarts when it ABENDs.
======================================================================

----------------------------------------------------------------------
(0003740) PeterS (reporter) - 11-Aug-16 19:57
http://dbmail.org/mantis/view.php?id=1081#c3740
----------------------------------------------------------------------
Persuant to the "Bug posting etiquette", yesterday I pulled a clean,
newest, version of DBmail from GIT, compiled it, installed it, cleared the
dbmail.err log, and started it. Since then I have received 14 ABENDs.

[compiler@swlx143 dbmail-git]$ date ; git status -u no
Thu Aug 11 13:02:11 CDT 2016
# On branch master
nothing to commit (working directory clean)
[compiler@swlx143 dbmail-git]$

----------------------------------------------------------------------
(0003745) PeterS (reporter) - 07-Sep-16 23:37
http://dbmail.org/mantis/view.php?id=1081#c3745
----------------------------------------------------------------------
Today I migrated my database over to new hardware and am receiving the same
error.

dbmail-imapd: sasl.c:75: ldap_sasl_bind: Assertion `(
(ld)->ldc->ldc_options.ldo_valid == 0x2 )' failed.

Red Hat RHEL7
dbmail-auth-ldap-3.2.3-1.el7.x86_64
dbmail-3.2.3-1.el7.x86_64
postgresql-9.2.15-1.el7_2.x86_64
postgresql-server-9.2.15-1.el7_2.x86_64

----------------------------------------------------------------------
(0003746) alan (reporter) - 08-Sep-16 19:01
http://dbmail.org/mantis/view.php?id=1081#c3746
----------------------------------------------------------------------
This error appears to be ldap connection related. Error reports elsewhere
suggest either something in /etc/openldap/ldap.conf or library
compatibility following updates to openssl. Are there any certificates
missing?
Are you able to compile dbmail locally, this may address library issues.

----------------------------------------------------------------------
(0003749) PeterS (reporter) - 15-Sep-16 17:31
http://dbmail.org/mantis/view.php?id=1081#c3749
----------------------------------------------------------------------
Alan, I have already tried the latest build(s) of DBmail to no avail. I've
disabled all certificate handling in OPENLDAP using both:

TLS_REQCERT never

and also:

TLS_REQCERT allow

with no change in results. We have multiple Microsoft AD LDAP servers
here, with varying certificates that aren't really verifiable, so I have
always disabled tracking or requiring valid certificates. We have other
Linux systems using LDAP/OPENLDAP without any issues using this same
config.

----------------------------------------------------------------------
(0003751) alan (reporter) - 28-Sep-16 20:24
http://dbmail.org/mantis/view.php?id=1081#c3751
----------------------------------------------------------------------
Confirm I'm able to reproduce the issue, think I've tracked down the parts
of the code causing it. Will test asap.

----------------------------------------------------------------------
(0003752) alan (reporter) - 30-Sep-16 16:58
http://dbmail.org/mantis/view.php?id=1081#c3752
----------------------------------------------------------------------
This issues appears to be fixed using the attached patch, please test.

----------------------------------------------------------------------
(0003753) PeterS (reporter) - 30-Sep-16 23:37
http://dbmail.org/mantis/view.php?id=1081#c3753
----------------------------------------------------------------------
I've patched the Red Hat RHEL 7 (EPEL 7) version of DBmail
(dbmail-3.2.3-1.el7.x86_64, dbmail-auth-ldap-3.2.3-1.el7.x86_64, and
dbmail-debuginfo-3.2.3-1.el7.x86_64) with
dbmail_src_modules_authldap.c-20160930.diff and receive the attached error,
dbmail.20160930-1642.err . The service still ABEND's.

----------------------------------------------------------------------
(0003754) PeterS (reporter) - 01-Oct-16 00:18
http://dbmail.org/mantis/view.php?id=1081#c3754
----------------------------------------------------------------------
I've patched a GIT clone (git clone git://git.dbmail.eu/paul/dbmail) from
today with dbmail_src_modules_authldap.c-20160930.diff and receive the
attached error, dbmail.20160930-1721.err . The service still ABEND's.

----------------------------------------------------------------------
(0003755) alan (reporter) - 01-Oct-16 15:36
http://dbmail.org/mantis/view.php?id=1081#c3755
----------------------------------------------------------------------
Ok, latest patch src_modules_authldap.c-20161001.diff increases the time to
check for server gone away to 2 minutes.

There is an edge case where after binding, the connection instantly goes
away. Also the daemon exits if the ldap server doesn't reappear after two
minutes, similar to not starting if not available.

There should be warnings such as the following, then recovers when the
server returns.
Oct 01 14:06:38 lully.p-o.co.uk dbmail-imapd[96422]: [0x805c0a800]
Warning:[auth] ldap_con_get(+142): LDAP gone away: Can't contact LDAP
server. Trying to reconnect(1/5).
Oct 01 14:06:39 lully.p-o.co.uk dbmail-imapd[96422]: [0x805c0a800]
Warning:[auth] ldap_con_get(+142): LDAP gone away: Can't contact LDAP
server. Trying to reconnect(2/5).
Oct 01 14:06:49 lully.p-o.co.uk dbmail-imapd[96422]: [0x805c0a800]
Warning:[auth] ldap_con_get(+142): LDAP gone away: Can't contact LDAP
server. Trying to reconnect(119/5).

Does this get closer to addressing your issue?

----------------------------------------------------------------------
(0003756) PeterS (reporter) - 03-Oct-16 16:54
http://dbmail.org/mantis/view.php?id=1081#c3756
----------------------------------------------------------------------
I've patched the Red Hat RHEL 7 (EPEL 7) version of DBmail
(dbmail-3.2.3-1.el7.x86_64, dbmail-auth-ldap-3.2.3-1.el7.x86_64, and
dbmail-debuginfo-3.2.3-1.el7.x86_64) with
src_modules_authldap.c-20161001.diff and receive the attached error,
dbmail.20161003-0956.err . This time I've cut the last parts of the full
dbmail.err file which consists of the last threads 0x7f86e84e8000 and
0x7f86e81d1400 . The service still ABEND's.

I'll now try with today's GIT version and the patch.

----------------------------------------------------------------------
(0003757) PeterS (reporter) - 03-Oct-16 19:57
http://dbmail.org/mantis/view.php?id=1081#c3757
----------------------------------------------------------------------
I've patched a GIT clone (git clone git://git.dbmail.eu/paul/dbmail) from
today with src_modules_authldap.c-20161001.diff and receive the attached
error, dbmail.20161003-1055.err (dbmail.20161003-1055.err.xz) . This time
I've cut the last parts of the full dbmail.err file which consists of the
last threads 0x1466450 and 0xde7400. The service still ABEND's.



----------------------------------------------------------------------
(0003758) alan (reporter) - 04-Oct-16 17:54
http://dbmail.org/mantis/view.php?id=1081#c3758
----------------------------------------------------------------------
Patch src_modules_authldap.c-20161004.diff adds slightly more debugging
information, plus binding and search retries.

The error appears to be transient and localised in authldap_search, this
patch should address the issue.

fyi there is a copy at https://github.com/alan-hicks/dbmail

----------------------------------------------------------------------
(0003759) PeterS (reporter) - 04-Oct-16 21:47
http://dbmail.org/mantis/view.php?id=1081#c3759
----------------------------------------------------------------------
I pulled down a clone of your https://github.com/alan-hicks/dbmail and had
it running. This version does not ABEND but rather stops communicating
(which caused my client to Segfault in the middle of doing IMAPS things).

Please see the attached dbmail.20161004-1348.err
(dbmail.20161004-1348.err.xz) for the full error listing.

Correction: it did ABEND when I restarted my IMAPS client and attempted to
connect to the DBmail instance. The same "dbmail-imapd: dm_config.c:134:
config_get_value_once: Assertion `config_dict' failed." error as before is
the last thing logged.



----------------------------------------------------------------------
(0003760) alan (reporter) - 06-Oct-16 15:57
http://dbmail.org/mantis/view.php?id=1081#c3760
----------------------------------------------------------------------
I'm struggling to find what causes dbmail to be unable to contact the
server.
Patch src_modules_authldap.c-20161006.diff bumps the search timeout to 60
seconds.
The error is that the ldap server can't be contacted. Although there is
only a single connection, nothing suggests it's blocked, out of memory or
any of the normal errors. As there are relatively few ldap server queries
compared to sql, pooling probably wouldn't help. I'm assuming the ldap
server/active directory is fine and that it's just dbmail that's unable to
connect.
Not sure what would have caused your client to segfault, the retries would
have paused sending anything back to the client and at most would have
dropped the connection.
Unable to see any "dm_config.c:134: config_get_value_once: Assertion"
errors in the log, can you upload the part of the log leading up to that
assertion please as I'm not sure where getting config values would be
called after startup.

----------------------------------------------------------------------
(0003761) PeterS (reporter) - 06-Oct-16 19:45
http://dbmail.org/mantis/view.php?id=1081#c3761
----------------------------------------------------------------------
Alan, thank you for continuing to assist with this. At some point I may
write a script that will allow me to sanitize my logs so that I can just
post the entire thing to you.

I am attaching even more information for you.

It looks like my dbmail is not disconnecting sessions that it makes with
our LDAP server, and apparently finally ABEND's after it attempts to re-use
a connection to LDAP that is over an hour old.

Please see the attached logs -- dbmail.20161006-1236.err
(dbmail.20161006-1236.err.xz), dbmail.20161006-1236.ldaps_tcpdump.txt
(dbmail.20161006-1236.ldaps_tcpdump.txt.xz), and
dbmail.20161006-1236.extra.txt (dbmail.20161006-1236.extra.txt.xz) .

dbmail.20161006-1236.err contains the very beginning of the log and a large
portion of the end of the log. The "dm_config.c:134:" error is there.
dbmail.20161006-1236.ldaps_tcpdump.txt is the text framing output of the
*full* LDAPS conversation with my LDAP server (I captured all traffic to
and from its IP address.) dbmail.20161006-1236.extra.txt contains all of
the "ldap_con_get"-s that exist in the full dbmail.err log. There is a
drop-off from 11:14:53 to 12:15:05 at which point DBmail ABENDs.

----------------------------------------------------------------------
(0003762) alan (reporter) - 12-Oct-16 20:13
http://dbmail.org/mantis/view.php?id=1081#c3762
----------------------------------------------------------------------
I'm happy to continue supporting dbmail, it's used here so makes sense to
share and keep it up to date.

It's normal to keep ldap connections open if the threads are still extant,
it's similar for the database.

As there are no obvious reasons why the search suddenly stops without an
error, it may be due to a long running thread hogging a resource, I'm aware
that occasionally a long folder listing/update appears to suspend some
clients.

Patch src_modules_authldap.c-20161012.diff updates some deprecated
functions and takes the timeout value from query_timeout in dbmail.conf
which from memory defaults to 300 (5 minutes).

If this fixes the issue great, otherwise updating the code to be more
resource friendly looks as though it will take much longer. As
dbmail.20161006-1236.err.xz timed out after 60 seconds it may not achieve
much, though you might try updating query_timeout to 3600 (an hour) to test
if whatever is causing the block is released. Hopefully whatever precedes
any resumption will help pinpoint the blocking.

----------------------------------------------------------------------
(0003763) alan (reporter) - 13-Oct-16 17:16
http://dbmail.org/mantis/view.php?id=1081#c3763
----------------------------------------------------------------------
patch src_server.c-20161013.diff removes some redundant event and profiling
code.

----------------------------------------------------------------------
(0003764) alan (reporter) - 14-Oct-16 19:01
http://dbmail.org/mantis/view.php?id=1081#c3764
----------------------------------------------------------------------
Apologies for previous patch removing required event_assign. I tested the
wrong binary.
All appears working with the code at https://github.com/alan-hicks/dbmail
Still unable to reproduce LDAP gone away.

----------------------------------------------------------------------
(0003765) PeterS (reporter) - 14-Oct-16 19:33
http://dbmail.org/mantis/view.php?id=1081#c3765
----------------------------------------------------------------------
Ok; yes, I did experience some issues with that latest patch, so I will
give this newer one a try. Thanks.

----------------------------------------------------------------------
(0003766) PeterS (reporter) - 14-Oct-16 20:50
http://dbmail.org/mantis/view.php?id=1081#c3766
----------------------------------------------------------------------
With your patch from today, it looks as though at some point it gives up on
the LDAPS server that is configured and while it claims to be
"reconnecting", it in fact is not sending any requests to the LDAPS
server...

It is logging entries like the following:

Oct 14 13:53:37 swlx143.swmed.e dbmail-imapd[1552]: [0x25a0630]
Warning:[auth] authldap_search(+293): LDAP gone away: Can't contact LDAP
server. Trying again(519/3600).
Oct 14 13:53:37 swlx143.swmed.e dbmail-imapd[1552]: [0x25a04a0]
Warning:[auth] authldap_search(+293): LDAP gone away: Can't contact LDAP
server. Trying again(429/3600).
Oct 14 13:53:38 swlx143.swmed.e dbmail-imapd[1552]: [0x25a0630]
Warning:[auth] authldap_search(+293): LDAP gone away: Can't contact LDAP
server. Trying again(520/3600).
Oct 14 13:53:38 swlx143.swmed.e dbmail-imapd[1552]: [0x25a04a0]
Warning:[auth] authldap_search(+293): LDAP gone away: Can't contact LDAP
server. Trying again(430/3600).

A TCPDUMP of all traffic for the destination of the LDAPS server that is
configured reveals NO TRAFFIC being generated. Am I misunderstanding what
"Trying again" means perhaps?

----------------------------------------------------------------------
(0003767) alan (reporter) - 14-Oct-16 22:40
http://dbmail.org/mantis/view.php?id=1081#c3767
----------------------------------------------------------------------
That part of the code is trying to send a request, something in dbmail is
blocking it. Are existing sessions still OK, is it possible to start a new
session? Any clues to what else might be waiting or stalled?

----------------------------------------------------------------------
(0003768) alan (reporter) - 17-Oct-16 10:34
http://dbmail.org/mantis/view.php?id=1081#c3768
----------------------------------------------------------------------
Default timeout of 4000 appears excessive, I'm successfully running
production with 300 (5 mins), perhaps 60 might be useful for testing and
may illuminate a stuck thread.

----------------------------------------------------------------------
(0003769) PeterS (reporter) - 17-Oct-16 21:07
http://dbmail.org/mantis/view.php?id=1081#c3769
----------------------------------------------------------------------
Here is the latest.

dbmail.20161017-1031.err.xz (dbmail.20161017-1031.err) and
dbmail.20161017-1031.extra.txt (dbmail.20161017-1031.extra.txt.xz)

Updated configuration for your suggested timeout, "query_timeout = 60".
Still ABENDs, with the last line being:

"dbmail-imapd: dm_config.c:134: config_get_value_once: Assertion
`config_dict' failed."

----------------------------------------------------------------------
(0003770) alan (reporter) - 18-Oct-16 15:05
http://dbmail.org/mantis/view.php?id=1081#c3770
----------------------------------------------------------------------
Just pushed a commit so the connection should show an error but not ABEND.
The imap connection should fail and require re-authentication, depending on
how the client connects it may need to be restarted. All should be ok if
the other threads continue and either the affected thread dies with the
connection or fully recovers after 'unrecoverable error while talking to
ldap server'.

Issue History
Date Modified Username Field Change
======================================================================
10-Aug-16 23:27 PeterS New Issue
10-Aug-16 23:27 PeterS File Added: dbmail.err
11-Aug-16 19:57 PeterS Note Added: 0003740
07-Sep-16 23:37 PeterS Note Added: 0003745
08-Sep-16 19:01 alan Note Added: 0003746
15-Sep-16 17:31 PeterS Note Added: 0003749
28-Sep-16 20:24 alan Note Added: 0003751
30-Sep-16 16:58 alan Note Added: 0003752
30-Sep-16 17:01 alan File Added: dbmail_src_modules_authldap.c.diff

30-Sep-16 18:15 alan Issue Monitored: alan
30-Sep-16 18:17 alan File Added:
dbmail_src_modules_authldap.c-20160930.diff
30-Sep-16 23:37 PeterS Note Added: 0003753
30-Sep-16 23:37 PeterS File Added: dbmail.20160930-1642.err

01-Oct-16 00:18 PeterS Note Added: 0003754
01-Oct-16 00:18 PeterS File Added: dbmail.20160930-1721.err

01-Oct-16 15:29 alan File Added: src_modules_authldap.c-20161001.diff

01-Oct-16 15:36 alan Note Added: 0003755
03-Oct-16 16:54 PeterS Note Added: 0003756
03-Oct-16 16:56 PeterS File Added: dbmail.20161003-0956.err.xz

03-Oct-16 17:52 PeterS Note Added: 0003757
03-Oct-16 17:52 PeterS File Added: dbmail.20161003-1055.err.xz

03-Oct-16 19:57 PeterS Note Edited: 0003757
04-Oct-16 17:54 alan Note Added: 0003758
04-Oct-16 17:54 alan File Added: src_modules_authldap.c-20161004.diff

04-Oct-16 20:47 PeterS Note Added: 0003759
04-Oct-16 20:48 PeterS File Added: dbmail.20161004-1348.err.xz

04-Oct-16 21:47 PeterS Note Edited: 0003759
06-Oct-16 15:57 alan Note Added: 0003760
06-Oct-16 15:57 alan File Added: src_modules_authldap.c-20161006.diff

06-Oct-16 19:45 PeterS Note Added: 0003761
06-Oct-16 19:45 PeterS File Added: dbmail.20161006-1236.err.xz

06-Oct-16 19:46 PeterS File Added:
dbmail.20161006-1236.ldaps_tcpdump.txt.xz
06-Oct-16 19:46 PeterS File Added: dbmail.20161006-1236.extra.txt.xz

12-Oct-16 19:20 alan File Added: src_modules_authldap.c-20161012.diff

12-Oct-16 20:13 alan Note Added: 0003762
13-Oct-16 17:14 alan File Added: src_server.c-20161013.diff

13-Oct-16 17:16 alan Note Added: 0003763
14-Oct-16 19:01 alan Note Added: 0003764
14-Oct-16 19:33 PeterS Note Added: 0003765
14-Oct-16 20:50 PeterS Note Added: 0003766
14-Oct-16 22:40 alan Note Added: 0003767
17-Oct-16 10:34 alan Note Added: 0003768
17-Oct-16 21:07 PeterS Note Added: 0003769
17-Oct-16 21:08 PeterS File Added: dbmail.20161017-1031.err.xz

17-Oct-16 21:08 PeterS File Added: dbmail.20161017-1031.extra.txt.xz

18-Oct-16 15:05 alan Note Added: 0003770
======================================================================

_______________________________________________
Dbmail-dev mailing list
Dbmail-dev@dbmail.org
http://mailman.fastxs.nl/cgi-bin/mailman/listinfo/dbmail-dev
[DBMail 0001081]: DBmail ABEND'ing upon LDAP access error. [ In reply to ]
A NOTE has been added to this issue.
======================================================================
http://dbmail.org/mantis/view.php?id=1081
======================================================================
Reported By: PeterS
Assigned To:
======================================================================
Project: DBMail
Issue ID: 1081
Category: Authentication layer
Reproducibility: always
Severity: major
Priority: normal
Status: new
target:
======================================================================
Date Submitted: 10-Aug-16 23:27 CEST
Last Modified: 18-Oct-16 19:52 CEST
======================================================================
Summary: DBmail ABEND'ing upon LDAP access error.
Description:
I've been trying to find a version of DBmail 3.X that does not ABEND while
it is accessing LDAP. I have not been able to determine the sequence of
events that causes the ABEND, however it happens to me many many times a
day. I've written a script around the execution of DBmail to ensure that
it restarts when it ABENDs.
======================================================================

----------------------------------------------------------------------
(0003740) PeterS (reporter) - 11-Aug-16 19:57
http://dbmail.org/mantis/view.php?id=1081#c3740
----------------------------------------------------------------------
Persuant to the "Bug posting etiquette", yesterday I pulled a clean,
newest, version of DBmail from GIT, compiled it, installed it, cleared the
dbmail.err log, and started it. Since then I have received 14 ABENDs.

[compiler@swlx143 dbmail-git]$ date ; git status -u no
Thu Aug 11 13:02:11 CDT 2016
# On branch master
nothing to commit (working directory clean)
[compiler@swlx143 dbmail-git]$

----------------------------------------------------------------------
(0003745) PeterS (reporter) - 07-Sep-16 23:37
http://dbmail.org/mantis/view.php?id=1081#c3745
----------------------------------------------------------------------
Today I migrated my database over to new hardware and am receiving the same
error.

dbmail-imapd: sasl.c:75: ldap_sasl_bind: Assertion `(
(ld)->ldc->ldc_options.ldo_valid == 0x2 )' failed.

Red Hat RHEL7
dbmail-auth-ldap-3.2.3-1.el7.x86_64
dbmail-3.2.3-1.el7.x86_64
postgresql-9.2.15-1.el7_2.x86_64
postgresql-server-9.2.15-1.el7_2.x86_64

----------------------------------------------------------------------
(0003746) alan (reporter) - 08-Sep-16 19:01
http://dbmail.org/mantis/view.php?id=1081#c3746
----------------------------------------------------------------------
This error appears to be ldap connection related. Error reports elsewhere
suggest either something in /etc/openldap/ldap.conf or library
compatibility following updates to openssl. Are there any certificates
missing?
Are you able to compile dbmail locally, this may address library issues.

----------------------------------------------------------------------
(0003749) PeterS (reporter) - 15-Sep-16 17:31
http://dbmail.org/mantis/view.php?id=1081#c3749
----------------------------------------------------------------------
Alan, I have already tried the latest build(s) of DBmail to no avail. I've
disabled all certificate handling in OPENLDAP using both:

TLS_REQCERT never

and also:

TLS_REQCERT allow

with no change in results. We have multiple Microsoft AD LDAP servers
here, with varying certificates that aren't really verifiable, so I have
always disabled tracking or requiring valid certificates. We have other
Linux systems using LDAP/OPENLDAP without any issues using this same
config.

----------------------------------------------------------------------
(0003751) alan (reporter) - 28-Sep-16 20:24
http://dbmail.org/mantis/view.php?id=1081#c3751
----------------------------------------------------------------------
Confirm I'm able to reproduce the issue, think I've tracked down the parts
of the code causing it. Will test asap.

----------------------------------------------------------------------
(0003752) alan (reporter) - 30-Sep-16 16:58
http://dbmail.org/mantis/view.php?id=1081#c3752
----------------------------------------------------------------------
This issues appears to be fixed using the attached patch, please test.

----------------------------------------------------------------------
(0003753) PeterS (reporter) - 30-Sep-16 23:37
http://dbmail.org/mantis/view.php?id=1081#c3753
----------------------------------------------------------------------
I've patched the Red Hat RHEL 7 (EPEL 7) version of DBmail
(dbmail-3.2.3-1.el7.x86_64, dbmail-auth-ldap-3.2.3-1.el7.x86_64, and
dbmail-debuginfo-3.2.3-1.el7.x86_64) with
dbmail_src_modules_authldap.c-20160930.diff and receive the attached error,
dbmail.20160930-1642.err . The service still ABEND's.

----------------------------------------------------------------------
(0003754) PeterS (reporter) - 01-Oct-16 00:18
http://dbmail.org/mantis/view.php?id=1081#c3754
----------------------------------------------------------------------
I've patched a GIT clone (git clone git://git.dbmail.eu/paul/dbmail) from
today with dbmail_src_modules_authldap.c-20160930.diff and receive the
attached error, dbmail.20160930-1721.err . The service still ABEND's.

----------------------------------------------------------------------
(0003755) alan (reporter) - 01-Oct-16 15:36
http://dbmail.org/mantis/view.php?id=1081#c3755
----------------------------------------------------------------------
Ok, latest patch src_modules_authldap.c-20161001.diff increases the time to
check for server gone away to 2 minutes.

There is an edge case where after binding, the connection instantly goes
away. Also the daemon exits if the ldap server doesn't reappear after two
minutes, similar to not starting if not available.

There should be warnings such as the following, then recovers when the
server returns.
Oct 01 14:06:38 lully.p-o.co.uk dbmail-imapd[96422]: [0x805c0a800]
Warning:[auth] ldap_con_get(+142): LDAP gone away: Can't contact LDAP
server. Trying to reconnect(1/5).
Oct 01 14:06:39 lully.p-o.co.uk dbmail-imapd[96422]: [0x805c0a800]
Warning:[auth] ldap_con_get(+142): LDAP gone away: Can't contact LDAP
server. Trying to reconnect(2/5).
Oct 01 14:06:49 lully.p-o.co.uk dbmail-imapd[96422]: [0x805c0a800]
Warning:[auth] ldap_con_get(+142): LDAP gone away: Can't contact LDAP
server. Trying to reconnect(119/5).

Does this get closer to addressing your issue?

----------------------------------------------------------------------
(0003756) PeterS (reporter) - 03-Oct-16 16:54
http://dbmail.org/mantis/view.php?id=1081#c3756
----------------------------------------------------------------------
I've patched the Red Hat RHEL 7 (EPEL 7) version of DBmail
(dbmail-3.2.3-1.el7.x86_64, dbmail-auth-ldap-3.2.3-1.el7.x86_64, and
dbmail-debuginfo-3.2.3-1.el7.x86_64) with
src_modules_authldap.c-20161001.diff and receive the attached error,
dbmail.20161003-0956.err . This time I've cut the last parts of the full
dbmail.err file which consists of the last threads 0x7f86e84e8000 and
0x7f86e81d1400 . The service still ABEND's.

I'll now try with today's GIT version and the patch.

----------------------------------------------------------------------
(0003757) PeterS (reporter) - 03-Oct-16 19:57
http://dbmail.org/mantis/view.php?id=1081#c3757
----------------------------------------------------------------------
I've patched a GIT clone (git clone git://git.dbmail.eu/paul/dbmail) from
today with src_modules_authldap.c-20161001.diff and receive the attached
error, dbmail.20161003-1055.err (dbmail.20161003-1055.err.xz) . This time
I've cut the last parts of the full dbmail.err file which consists of the
last threads 0x1466450 and 0xde7400. The service still ABEND's.



----------------------------------------------------------------------
(0003758) alan (reporter) - 04-Oct-16 17:54
http://dbmail.org/mantis/view.php?id=1081#c3758
----------------------------------------------------------------------
Patch src_modules_authldap.c-20161004.diff adds slightly more debugging
information, plus binding and search retries.

The error appears to be transient and localised in authldap_search, this
patch should address the issue.

fyi there is a copy at https://github.com/alan-hicks/dbmail

----------------------------------------------------------------------
(0003759) PeterS (reporter) - 04-Oct-16 21:47
http://dbmail.org/mantis/view.php?id=1081#c3759
----------------------------------------------------------------------
I pulled down a clone of your https://github.com/alan-hicks/dbmail and had
it running. This version does not ABEND but rather stops communicating
(which caused my client to Segfault in the middle of doing IMAPS things).

Please see the attached dbmail.20161004-1348.err
(dbmail.20161004-1348.err.xz) for the full error listing.

Correction: it did ABEND when I restarted my IMAPS client and attempted to
connect to the DBmail instance. The same "dbmail-imapd: dm_config.c:134:
config_get_value_once: Assertion `config_dict' failed." error as before is
the last thing logged.



----------------------------------------------------------------------
(0003760) alan (reporter) - 06-Oct-16 15:57
http://dbmail.org/mantis/view.php?id=1081#c3760
----------------------------------------------------------------------
I'm struggling to find what causes dbmail to be unable to contact the
server.
Patch src_modules_authldap.c-20161006.diff bumps the search timeout to 60
seconds.
The error is that the ldap server can't be contacted. Although there is
only a single connection, nothing suggests it's blocked, out of memory or
any of the normal errors. As there are relatively few ldap server queries
compared to sql, pooling probably wouldn't help. I'm assuming the ldap
server/active directory is fine and that it's just dbmail that's unable to
connect.
Not sure what would have caused your client to segfault, the retries would
have paused sending anything back to the client and at most would have
dropped the connection.
Unable to see any "dm_config.c:134: config_get_value_once: Assertion"
errors in the log, can you upload the part of the log leading up to that
assertion please as I'm not sure where getting config values would be
called after startup.

----------------------------------------------------------------------
(0003761) PeterS (reporter) - 06-Oct-16 19:45
http://dbmail.org/mantis/view.php?id=1081#c3761
----------------------------------------------------------------------
Alan, thank you for continuing to assist with this. At some point I may
write a script that will allow me to sanitize my logs so that I can just
post the entire thing to you.

I am attaching even more information for you.

It looks like my dbmail is not disconnecting sessions that it makes with
our LDAP server, and apparently finally ABEND's after it attempts to re-use
a connection to LDAP that is over an hour old.

Please see the attached logs -- dbmail.20161006-1236.err
(dbmail.20161006-1236.err.xz), dbmail.20161006-1236.ldaps_tcpdump.txt
(dbmail.20161006-1236.ldaps_tcpdump.txt.xz), and
dbmail.20161006-1236.extra.txt (dbmail.20161006-1236.extra.txt.xz) .

dbmail.20161006-1236.err contains the very beginning of the log and a large
portion of the end of the log. The "dm_config.c:134:" error is there.
dbmail.20161006-1236.ldaps_tcpdump.txt is the text framing output of the
*full* LDAPS conversation with my LDAP server (I captured all traffic to
and from its IP address.) dbmail.20161006-1236.extra.txt contains all of
the "ldap_con_get"-s that exist in the full dbmail.err log. There is a
drop-off from 11:14:53 to 12:15:05 at which point DBmail ABENDs.

----------------------------------------------------------------------
(0003762) alan (reporter) - 12-Oct-16 20:13
http://dbmail.org/mantis/view.php?id=1081#c3762
----------------------------------------------------------------------
I'm happy to continue supporting dbmail, it's used here so makes sense to
share and keep it up to date.

It's normal to keep ldap connections open if the threads are still extant,
it's similar for the database.

As there are no obvious reasons why the search suddenly stops without an
error, it may be due to a long running thread hogging a resource, I'm aware
that occasionally a long folder listing/update appears to suspend some
clients.

Patch src_modules_authldap.c-20161012.diff updates some deprecated
functions and takes the timeout value from query_timeout in dbmail.conf
which from memory defaults to 300 (5 minutes).

If this fixes the issue great, otherwise updating the code to be more
resource friendly looks as though it will take much longer. As
dbmail.20161006-1236.err.xz timed out after 60 seconds it may not achieve
much, though you might try updating query_timeout to 3600 (an hour) to test
if whatever is causing the block is released. Hopefully whatever precedes
any resumption will help pinpoint the blocking.

----------------------------------------------------------------------
(0003763) alan (reporter) - 13-Oct-16 17:16
http://dbmail.org/mantis/view.php?id=1081#c3763
----------------------------------------------------------------------
patch src_server.c-20161013.diff removes some redundant event and profiling
code.

----------------------------------------------------------------------
(0003764) alan (reporter) - 14-Oct-16 19:01
http://dbmail.org/mantis/view.php?id=1081#c3764
----------------------------------------------------------------------
Apologies for previous patch removing required event_assign. I tested the
wrong binary.
All appears working with the code at https://github.com/alan-hicks/dbmail
Still unable to reproduce LDAP gone away.

----------------------------------------------------------------------
(0003765) PeterS (reporter) - 14-Oct-16 19:33
http://dbmail.org/mantis/view.php?id=1081#c3765
----------------------------------------------------------------------
Ok; yes, I did experience some issues with that latest patch, so I will
give this newer one a try. Thanks.

----------------------------------------------------------------------
(0003766) PeterS (reporter) - 14-Oct-16 20:50
http://dbmail.org/mantis/view.php?id=1081#c3766
----------------------------------------------------------------------
With your patch from today, it looks as though at some point it gives up on
the LDAPS server that is configured and while it claims to be
"reconnecting", it in fact is not sending any requests to the LDAPS
server...

It is logging entries like the following:

Oct 14 13:53:37 swlx143.swmed.e dbmail-imapd[1552]: [0x25a0630]
Warning:[auth] authldap_search(+293): LDAP gone away: Can't contact LDAP
server. Trying again(519/3600).
Oct 14 13:53:37 swlx143.swmed.e dbmail-imapd[1552]: [0x25a04a0]
Warning:[auth] authldap_search(+293): LDAP gone away: Can't contact LDAP
server. Trying again(429/3600).
Oct 14 13:53:38 swlx143.swmed.e dbmail-imapd[1552]: [0x25a0630]
Warning:[auth] authldap_search(+293): LDAP gone away: Can't contact LDAP
server. Trying again(520/3600).
Oct 14 13:53:38 swlx143.swmed.e dbmail-imapd[1552]: [0x25a04a0]
Warning:[auth] authldap_search(+293): LDAP gone away: Can't contact LDAP
server. Trying again(430/3600).

A TCPDUMP of all traffic for the destination of the LDAPS server that is
configured reveals NO TRAFFIC being generated. Am I misunderstanding what
"Trying again" means perhaps?

----------------------------------------------------------------------
(0003767) alan (reporter) - 14-Oct-16 22:40
http://dbmail.org/mantis/view.php?id=1081#c3767
----------------------------------------------------------------------
That part of the code is trying to send a request, something in dbmail is
blocking it. Are existing sessions still OK, is it possible to start a new
session? Any clues to what else might be waiting or stalled?

----------------------------------------------------------------------
(0003768) alan (reporter) - 17-Oct-16 10:34
http://dbmail.org/mantis/view.php?id=1081#c3768
----------------------------------------------------------------------
Default timeout of 4000 appears excessive, I'm successfully running
production with 300 (5 mins), perhaps 60 might be useful for testing and
may illuminate a stuck thread.

----------------------------------------------------------------------
(0003769) PeterS (reporter) - 17-Oct-16 21:07
http://dbmail.org/mantis/view.php?id=1081#c3769
----------------------------------------------------------------------
Here is the latest.

dbmail.20161017-1031.err.xz (dbmail.20161017-1031.err) and
dbmail.20161017-1031.extra.txt (dbmail.20161017-1031.extra.txt.xz)

Updated configuration for your suggested timeout, "query_timeout = 60".
Still ABENDs, with the last line being:

"dbmail-imapd: dm_config.c:134: config_get_value_once: Assertion
`config_dict' failed."

----------------------------------------------------------------------
(0003770) alan (reporter) - 18-Oct-16 15:05
http://dbmail.org/mantis/view.php?id=1081#c3770
----------------------------------------------------------------------
Just pushed a commit so the connection should show an error but not ABEND.
The imap connection should fail and require re-authentication, depending on
how the client connects it may need to be restarted. All should be ok if
the other threads continue and either the affected thread dies with the
connection or fully recovers after 'unrecoverable error while talking to
ldap server'.

----------------------------------------------------------------------
(0003771) PeterS (reporter) - 18-Oct-16 19:52
http://dbmail.org/mantis/view.php?id=1081#c3771
----------------------------------------------------------------------
The update is a bit better as it does not ABEND any longer. However once
it loses its connection to LDAP it can not recover and infinitely attempts
to connect.

See sanitized log, dbmail.20161018-1122.err (dbmail.20161018-1122.err.xz).

I have watched for any traffic outbound from the system to LDAP or LDAPS
and it is not actually attempting or sending anything, so the connection
attempts are bogus.

Issue History
Date Modified Username Field Change
======================================================================
10-Aug-16 23:27 PeterS New Issue
10-Aug-16 23:27 PeterS File Added: dbmail.err
11-Aug-16 19:57 PeterS Note Added: 0003740
07-Sep-16 23:37 PeterS Note Added: 0003745
08-Sep-16 19:01 alan Note Added: 0003746
15-Sep-16 17:31 PeterS Note Added: 0003749
28-Sep-16 20:24 alan Note Added: 0003751
30-Sep-16 16:58 alan Note Added: 0003752
30-Sep-16 17:01 alan File Added: dbmail_src_modules_authldap.c.diff

30-Sep-16 18:15 alan Issue Monitored: alan
30-Sep-16 18:17 alan File Added:
dbmail_src_modules_authldap.c-20160930.diff
30-Sep-16 23:37 PeterS Note Added: 0003753
30-Sep-16 23:37 PeterS File Added: dbmail.20160930-1642.err

01-Oct-16 00:18 PeterS Note Added: 0003754
01-Oct-16 00:18 PeterS File Added: dbmail.20160930-1721.err

01-Oct-16 15:29 alan File Added: src_modules_authldap.c-20161001.diff

01-Oct-16 15:36 alan Note Added: 0003755
03-Oct-16 16:54 PeterS Note Added: 0003756
03-Oct-16 16:56 PeterS File Added: dbmail.20161003-0956.err.xz

03-Oct-16 17:52 PeterS Note Added: 0003757
03-Oct-16 17:52 PeterS File Added: dbmail.20161003-1055.err.xz

03-Oct-16 19:57 PeterS Note Edited: 0003757
04-Oct-16 17:54 alan Note Added: 0003758
04-Oct-16 17:54 alan File Added: src_modules_authldap.c-20161004.diff

04-Oct-16 20:47 PeterS Note Added: 0003759
04-Oct-16 20:48 PeterS File Added: dbmail.20161004-1348.err.xz

04-Oct-16 21:47 PeterS Note Edited: 0003759
06-Oct-16 15:57 alan Note Added: 0003760
06-Oct-16 15:57 alan File Added: src_modules_authldap.c-20161006.diff

06-Oct-16 19:45 PeterS Note Added: 0003761
06-Oct-16 19:45 PeterS File Added: dbmail.20161006-1236.err.xz

06-Oct-16 19:46 PeterS File Added:
dbmail.20161006-1236.ldaps_tcpdump.txt.xz
06-Oct-16 19:46 PeterS File Added: dbmail.20161006-1236.extra.txt.xz

12-Oct-16 19:20 alan File Added: src_modules_authldap.c-20161012.diff

12-Oct-16 20:13 alan Note Added: 0003762
13-Oct-16 17:14 alan File Added: src_server.c-20161013.diff

13-Oct-16 17:16 alan Note Added: 0003763
14-Oct-16 19:01 alan Note Added: 0003764
14-Oct-16 19:33 PeterS Note Added: 0003765
14-Oct-16 20:50 PeterS Note Added: 0003766
14-Oct-16 22:40 alan Note Added: 0003767
17-Oct-16 10:34 alan Note Added: 0003768
17-Oct-16 21:07 PeterS Note Added: 0003769
17-Oct-16 21:08 PeterS File Added: dbmail.20161017-1031.err.xz

17-Oct-16 21:08 PeterS File Added: dbmail.20161017-1031.extra.txt.xz

18-Oct-16 15:05 alan Note Added: 0003770
18-Oct-16 19:52 PeterS Note Added: 0003771
======================================================================

_______________________________________________
Dbmail-dev mailing list
Dbmail-dev@dbmail.org
http://mailman.fastxs.nl/cgi-bin/mailman/listinfo/dbmail-dev
[DBMail 0001081]: DBmail ABEND'ing upon LDAP access error. [ In reply to ]
A NOTE has been added to this issue.
======================================================================
http://dbmail.org/mantis/view.php?id=1081
======================================================================
Reported By: PeterS
Assigned To:
======================================================================
Project: DBMail
Issue ID: 1081
Category: Authentication layer
Reproducibility: always
Severity: major
Priority: normal
Status: new
target:
======================================================================
Date Submitted: 10-Aug-16 23:27 CEST
Last Modified: 19-Oct-16 18:58 CEST
======================================================================
Summary: DBmail ABEND'ing upon LDAP access error.
Description:
I've been trying to find a version of DBmail 3.X that does not ABEND while
it is accessing LDAP. I have not been able to determine the sequence of
events that causes the ABEND, however it happens to me many many times a
day. I've written a script around the execution of DBmail to ensure that
it restarts when it ABENDs.
======================================================================

----------------------------------------------------------------------
(0003740) PeterS (reporter) - 11-Aug-16 19:57
http://dbmail.org/mantis/view.php?id=1081#c3740
----------------------------------------------------------------------
Persuant to the "Bug posting etiquette", yesterday I pulled a clean,
newest, version of DBmail from GIT, compiled it, installed it, cleared the
dbmail.err log, and started it. Since then I have received 14 ABENDs.

[compiler@swlx143 dbmail-git]$ date ; git status -u no
Thu Aug 11 13:02:11 CDT 2016
# On branch master
nothing to commit (working directory clean)
[compiler@swlx143 dbmail-git]$

----------------------------------------------------------------------
(0003745) PeterS (reporter) - 07-Sep-16 23:37
http://dbmail.org/mantis/view.php?id=1081#c3745
----------------------------------------------------------------------
Today I migrated my database over to new hardware and am receiving the same
error.

dbmail-imapd: sasl.c:75: ldap_sasl_bind: Assertion `(
(ld)->ldc->ldc_options.ldo_valid == 0x2 )' failed.

Red Hat RHEL7
dbmail-auth-ldap-3.2.3-1.el7.x86_64
dbmail-3.2.3-1.el7.x86_64
postgresql-9.2.15-1.el7_2.x86_64
postgresql-server-9.2.15-1.el7_2.x86_64

----------------------------------------------------------------------
(0003746) alan (reporter) - 08-Sep-16 19:01
http://dbmail.org/mantis/view.php?id=1081#c3746
----------------------------------------------------------------------
This error appears to be ldap connection related. Error reports elsewhere
suggest either something in /etc/openldap/ldap.conf or library
compatibility following updates to openssl. Are there any certificates
missing?
Are you able to compile dbmail locally, this may address library issues.

----------------------------------------------------------------------
(0003749) PeterS (reporter) - 15-Sep-16 17:31
http://dbmail.org/mantis/view.php?id=1081#c3749
----------------------------------------------------------------------
Alan, I have already tried the latest build(s) of DBmail to no avail. I've
disabled all certificate handling in OPENLDAP using both:

TLS_REQCERT never

and also:

TLS_REQCERT allow

with no change in results. We have multiple Microsoft AD LDAP servers
here, with varying certificates that aren't really verifiable, so I have
always disabled tracking or requiring valid certificates. We have other
Linux systems using LDAP/OPENLDAP without any issues using this same
config.

----------------------------------------------------------------------
(0003751) alan (reporter) - 28-Sep-16 20:24
http://dbmail.org/mantis/view.php?id=1081#c3751
----------------------------------------------------------------------
Confirm I'm able to reproduce the issue, think I've tracked down the parts
of the code causing it. Will test asap.

----------------------------------------------------------------------
(0003752) alan (reporter) - 30-Sep-16 16:58
http://dbmail.org/mantis/view.php?id=1081#c3752
----------------------------------------------------------------------
This issues appears to be fixed using the attached patch, please test.

----------------------------------------------------------------------
(0003753) PeterS (reporter) - 30-Sep-16 23:37
http://dbmail.org/mantis/view.php?id=1081#c3753
----------------------------------------------------------------------
I've patched the Red Hat RHEL 7 (EPEL 7) version of DBmail
(dbmail-3.2.3-1.el7.x86_64, dbmail-auth-ldap-3.2.3-1.el7.x86_64, and
dbmail-debuginfo-3.2.3-1.el7.x86_64) with
dbmail_src_modules_authldap.c-20160930.diff and receive the attached error,
dbmail.20160930-1642.err . The service still ABEND's.

----------------------------------------------------------------------
(0003754) PeterS (reporter) - 01-Oct-16 00:18
http://dbmail.org/mantis/view.php?id=1081#c3754
----------------------------------------------------------------------
I've patched a GIT clone (git clone git://git.dbmail.eu/paul/dbmail) from
today with dbmail_src_modules_authldap.c-20160930.diff and receive the
attached error, dbmail.20160930-1721.err . The service still ABEND's.

----------------------------------------------------------------------
(0003755) alan (reporter) - 01-Oct-16 15:36
http://dbmail.org/mantis/view.php?id=1081#c3755
----------------------------------------------------------------------
Ok, latest patch src_modules_authldap.c-20161001.diff increases the time to
check for server gone away to 2 minutes.

There is an edge case where after binding, the connection instantly goes
away. Also the daemon exits if the ldap server doesn't reappear after two
minutes, similar to not starting if not available.

There should be warnings such as the following, then recovers when the
server returns.
Oct 01 14:06:38 lully.p-o.co.uk dbmail-imapd[96422]: [0x805c0a800]
Warning:[auth] ldap_con_get(+142): LDAP gone away: Can't contact LDAP
server. Trying to reconnect(1/5).
Oct 01 14:06:39 lully.p-o.co.uk dbmail-imapd[96422]: [0x805c0a800]
Warning:[auth] ldap_con_get(+142): LDAP gone away: Can't contact LDAP
server. Trying to reconnect(2/5).
Oct 01 14:06:49 lully.p-o.co.uk dbmail-imapd[96422]: [0x805c0a800]
Warning:[auth] ldap_con_get(+142): LDAP gone away: Can't contact LDAP
server. Trying to reconnect(119/5).

Does this get closer to addressing your issue?

----------------------------------------------------------------------
(0003756) PeterS (reporter) - 03-Oct-16 16:54
http://dbmail.org/mantis/view.php?id=1081#c3756
----------------------------------------------------------------------
I've patched the Red Hat RHEL 7 (EPEL 7) version of DBmail
(dbmail-3.2.3-1.el7.x86_64, dbmail-auth-ldap-3.2.3-1.el7.x86_64, and
dbmail-debuginfo-3.2.3-1.el7.x86_64) with
src_modules_authldap.c-20161001.diff and receive the attached error,
dbmail.20161003-0956.err . This time I've cut the last parts of the full
dbmail.err file which consists of the last threads 0x7f86e84e8000 and
0x7f86e81d1400 . The service still ABEND's.

I'll now try with today's GIT version and the patch.

----------------------------------------------------------------------
(0003757) PeterS (reporter) - 03-Oct-16 19:57
http://dbmail.org/mantis/view.php?id=1081#c3757
----------------------------------------------------------------------
I've patched a GIT clone (git clone git://git.dbmail.eu/paul/dbmail) from
today with src_modules_authldap.c-20161001.diff and receive the attached
error, dbmail.20161003-1055.err (dbmail.20161003-1055.err.xz) . This time
I've cut the last parts of the full dbmail.err file which consists of the
last threads 0x1466450 and 0xde7400. The service still ABEND's.



----------------------------------------------------------------------
(0003758) alan (reporter) - 04-Oct-16 17:54
http://dbmail.org/mantis/view.php?id=1081#c3758
----------------------------------------------------------------------
Patch src_modules_authldap.c-20161004.diff adds slightly more debugging
information, plus binding and search retries.

The error appears to be transient and localised in authldap_search, this
patch should address the issue.

fyi there is a copy at https://github.com/alan-hicks/dbmail

----------------------------------------------------------------------
(0003759) PeterS (reporter) - 04-Oct-16 21:47
http://dbmail.org/mantis/view.php?id=1081#c3759
----------------------------------------------------------------------
I pulled down a clone of your https://github.com/alan-hicks/dbmail and had
it running. This version does not ABEND but rather stops communicating
(which caused my client to Segfault in the middle of doing IMAPS things).

Please see the attached dbmail.20161004-1348.err
(dbmail.20161004-1348.err.xz) for the full error listing.

Correction: it did ABEND when I restarted my IMAPS client and attempted to
connect to the DBmail instance. The same "dbmail-imapd: dm_config.c:134:
config_get_value_once: Assertion `config_dict' failed." error as before is
the last thing logged.



----------------------------------------------------------------------
(0003760) alan (reporter) - 06-Oct-16 15:57
http://dbmail.org/mantis/view.php?id=1081#c3760
----------------------------------------------------------------------
I'm struggling to find what causes dbmail to be unable to contact the
server.
Patch src_modules_authldap.c-20161006.diff bumps the search timeout to 60
seconds.
The error is that the ldap server can't be contacted. Although there is
only a single connection, nothing suggests it's blocked, out of memory or
any of the normal errors. As there are relatively few ldap server queries
compared to sql, pooling probably wouldn't help. I'm assuming the ldap
server/active directory is fine and that it's just dbmail that's unable to
connect.
Not sure what would have caused your client to segfault, the retries would
have paused sending anything back to the client and at most would have
dropped the connection.
Unable to see any "dm_config.c:134: config_get_value_once: Assertion"
errors in the log, can you upload the part of the log leading up to that
assertion please as I'm not sure where getting config values would be
called after startup.

----------------------------------------------------------------------
(0003761) PeterS (reporter) - 06-Oct-16 19:45
http://dbmail.org/mantis/view.php?id=1081#c3761
----------------------------------------------------------------------
Alan, thank you for continuing to assist with this. At some point I may
write a script that will allow me to sanitize my logs so that I can just
post the entire thing to you.

I am attaching even more information for you.

It looks like my dbmail is not disconnecting sessions that it makes with
our LDAP server, and apparently finally ABEND's after it attempts to re-use
a connection to LDAP that is over an hour old.

Please see the attached logs -- dbmail.20161006-1236.err
(dbmail.20161006-1236.err.xz), dbmail.20161006-1236.ldaps_tcpdump.txt
(dbmail.20161006-1236.ldaps_tcpdump.txt.xz), and
dbmail.20161006-1236.extra.txt (dbmail.20161006-1236.extra.txt.xz) .

dbmail.20161006-1236.err contains the very beginning of the log and a large
portion of the end of the log. The "dm_config.c:134:" error is there.
dbmail.20161006-1236.ldaps_tcpdump.txt is the text framing output of the
*full* LDAPS conversation with my LDAP server (I captured all traffic to
and from its IP address.) dbmail.20161006-1236.extra.txt contains all of
the "ldap_con_get"-s that exist in the full dbmail.err log. There is a
drop-off from 11:14:53 to 12:15:05 at which point DBmail ABENDs.

----------------------------------------------------------------------
(0003762) alan (reporter) - 12-Oct-16 20:13
http://dbmail.org/mantis/view.php?id=1081#c3762
----------------------------------------------------------------------
I'm happy to continue supporting dbmail, it's used here so makes sense to
share and keep it up to date.

It's normal to keep ldap connections open if the threads are still extant,
it's similar for the database.

As there are no obvious reasons why the search suddenly stops without an
error, it may be due to a long running thread hogging a resource, I'm aware
that occasionally a long folder listing/update appears to suspend some
clients.

Patch src_modules_authldap.c-20161012.diff updates some deprecated
functions and takes the timeout value from query_timeout in dbmail.conf
which from memory defaults to 300 (5 minutes).

If this fixes the issue great, otherwise updating the code to be more
resource friendly looks as though it will take much longer. As
dbmail.20161006-1236.err.xz timed out after 60 seconds it may not achieve
much, though you might try updating query_timeout to 3600 (an hour) to test
if whatever is causing the block is released. Hopefully whatever precedes
any resumption will help pinpoint the blocking.

----------------------------------------------------------------------
(0003763) alan (reporter) - 13-Oct-16 17:16
http://dbmail.org/mantis/view.php?id=1081#c3763
----------------------------------------------------------------------
patch src_server.c-20161013.diff removes some redundant event and profiling
code.

----------------------------------------------------------------------
(0003764) alan (reporter) - 14-Oct-16 19:01
http://dbmail.org/mantis/view.php?id=1081#c3764
----------------------------------------------------------------------
Apologies for previous patch removing required event_assign. I tested the
wrong binary.
All appears working with the code at https://github.com/alan-hicks/dbmail
Still unable to reproduce LDAP gone away.

----------------------------------------------------------------------
(0003765) PeterS (reporter) - 14-Oct-16 19:33
http://dbmail.org/mantis/view.php?id=1081#c3765
----------------------------------------------------------------------
Ok; yes, I did experience some issues with that latest patch, so I will
give this newer one a try. Thanks.

----------------------------------------------------------------------
(0003766) PeterS (reporter) - 14-Oct-16 20:50
http://dbmail.org/mantis/view.php?id=1081#c3766
----------------------------------------------------------------------
With your patch from today, it looks as though at some point it gives up on
the LDAPS server that is configured and while it claims to be
"reconnecting", it in fact is not sending any requests to the LDAPS
server...

It is logging entries like the following:

Oct 14 13:53:37 swlx143.swmed.e dbmail-imapd[1552]: [0x25a0630]
Warning:[auth] authldap_search(+293): LDAP gone away: Can't contact LDAP
server. Trying again(519/3600).
Oct 14 13:53:37 swlx143.swmed.e dbmail-imapd[1552]: [0x25a04a0]
Warning:[auth] authldap_search(+293): LDAP gone away: Can't contact LDAP
server. Trying again(429/3600).
Oct 14 13:53:38 swlx143.swmed.e dbmail-imapd[1552]: [0x25a0630]
Warning:[auth] authldap_search(+293): LDAP gone away: Can't contact LDAP
server. Trying again(520/3600).
Oct 14 13:53:38 swlx143.swmed.e dbmail-imapd[1552]: [0x25a04a0]
Warning:[auth] authldap_search(+293): LDAP gone away: Can't contact LDAP
server. Trying again(430/3600).

A TCPDUMP of all traffic for the destination of the LDAPS server that is
configured reveals NO TRAFFIC being generated. Am I misunderstanding what
"Trying again" means perhaps?

----------------------------------------------------------------------
(0003767) alan (reporter) - 14-Oct-16 22:40
http://dbmail.org/mantis/view.php?id=1081#c3767
----------------------------------------------------------------------
That part of the code is trying to send a request, something in dbmail is
blocking it. Are existing sessions still OK, is it possible to start a new
session? Any clues to what else might be waiting or stalled?

----------------------------------------------------------------------
(0003768) alan (reporter) - 17-Oct-16 10:34
http://dbmail.org/mantis/view.php?id=1081#c3768
----------------------------------------------------------------------
Default timeout of 4000 appears excessive, I'm successfully running
production with 300 (5 mins), perhaps 60 might be useful for testing and
may illuminate a stuck thread.

----------------------------------------------------------------------
(0003769) PeterS (reporter) - 17-Oct-16 21:07
http://dbmail.org/mantis/view.php?id=1081#c3769
----------------------------------------------------------------------
Here is the latest.

dbmail.20161017-1031.err.xz (dbmail.20161017-1031.err) and
dbmail.20161017-1031.extra.txt (dbmail.20161017-1031.extra.txt.xz)

Updated configuration for your suggested timeout, "query_timeout = 60".
Still ABENDs, with the last line being:

"dbmail-imapd: dm_config.c:134: config_get_value_once: Assertion
`config_dict' failed."

----------------------------------------------------------------------
(0003770) alan (reporter) - 18-Oct-16 15:05
http://dbmail.org/mantis/view.php?id=1081#c3770
----------------------------------------------------------------------
Just pushed a commit so the connection should show an error but not ABEND.
The imap connection should fail and require re-authentication, depending on
how the client connects it may need to be restarted. All should be ok if
the other threads continue and either the affected thread dies with the
connection or fully recovers after 'unrecoverable error while talking to
ldap server'.

----------------------------------------------------------------------
(0003771) PeterS (reporter) - 18-Oct-16 19:52
http://dbmail.org/mantis/view.php?id=1081#c3771
----------------------------------------------------------------------
The update is a bit better as it does not ABEND any longer. However once
it loses its connection to LDAP it can not recover and infinitely attempts
to connect.

See sanitized log, dbmail.20161018-1122.err (dbmail.20161018-1122.err.xz).

I have watched for any traffic outbound from the system to LDAP or LDAPS
and it is not actually attempting or sending anything, so the connection
attempts are bogus.

----------------------------------------------------------------------
(0003772) alan (reporter) - 19-Oct-16 18:58
http://dbmail.org/mantis/view.php?id=1081#c3772
----------------------------------------------------------------------
LDAP call updated to remove a deprecated function.

It appears timeout's don't work for synchronous calls:
http://www.openldap.org/lists/openldap-technical/201311/msg00266.html

Issue History
Date Modified Username Field Change
======================================================================
10-Aug-16 23:27 PeterS New Issue
10-Aug-16 23:27 PeterS File Added: dbmail.err
11-Aug-16 19:57 PeterS Note Added: 0003740
07-Sep-16 23:37 PeterS Note Added: 0003745
08-Sep-16 19:01 alan Note Added: 0003746
15-Sep-16 17:31 PeterS Note Added: 0003749
28-Sep-16 20:24 alan Note Added: 0003751
30-Sep-16 16:58 alan Note Added: 0003752
30-Sep-16 17:01 alan File Added: dbmail_src_modules_authldap.c.diff

30-Sep-16 18:15 alan Issue Monitored: alan
30-Sep-16 18:17 alan File Added:
dbmail_src_modules_authldap.c-20160930.diff
30-Sep-16 23:37 PeterS Note Added: 0003753
30-Sep-16 23:37 PeterS File Added: dbmail.20160930-1642.err

01-Oct-16 00:18 PeterS Note Added: 0003754
01-Oct-16 00:18 PeterS File Added: dbmail.20160930-1721.err

01-Oct-16 15:29 alan File Added: src_modules_authldap.c-20161001.diff

01-Oct-16 15:36 alan Note Added: 0003755
03-Oct-16 16:54 PeterS Note Added: 0003756
03-Oct-16 16:56 PeterS File Added: dbmail.20161003-0956.err.xz

03-Oct-16 17:52 PeterS Note Added: 0003757
03-Oct-16 17:52 PeterS File Added: dbmail.20161003-1055.err.xz

03-Oct-16 19:57 PeterS Note Edited: 0003757
04-Oct-16 17:54 alan Note Added: 0003758
04-Oct-16 17:54 alan File Added: src_modules_authldap.c-20161004.diff

04-Oct-16 20:47 PeterS Note Added: 0003759
04-Oct-16 20:48 PeterS File Added: dbmail.20161004-1348.err.xz

04-Oct-16 21:47 PeterS Note Edited: 0003759
06-Oct-16 15:57 alan Note Added: 0003760
06-Oct-16 15:57 alan File Added: src_modules_authldap.c-20161006.diff

06-Oct-16 19:45 PeterS Note Added: 0003761
06-Oct-16 19:45 PeterS File Added: dbmail.20161006-1236.err.xz

06-Oct-16 19:46 PeterS File Added:
dbmail.20161006-1236.ldaps_tcpdump.txt.xz
06-Oct-16 19:46 PeterS File Added: dbmail.20161006-1236.extra.txt.xz

12-Oct-16 19:20 alan File Added: src_modules_authldap.c-20161012.diff

12-Oct-16 20:13 alan Note Added: 0003762
13-Oct-16 17:14 alan File Added: src_server.c-20161013.diff

13-Oct-16 17:16 alan Note Added: 0003763
14-Oct-16 19:01 alan Note Added: 0003764
14-Oct-16 19:33 PeterS Note Added: 0003765
14-Oct-16 20:50 PeterS Note Added: 0003766
14-Oct-16 22:40 alan Note Added: 0003767
17-Oct-16 10:34 alan Note Added: 0003768
17-Oct-16 21:07 PeterS Note Added: 0003769
17-Oct-16 21:08 PeterS File Added: dbmail.20161017-1031.err.xz

17-Oct-16 21:08 PeterS File Added: dbmail.20161017-1031.extra.txt.xz

18-Oct-16 15:05 alan Note Added: 0003770
18-Oct-16 19:52 PeterS Note Added: 0003771
18-Oct-16 19:56 PeterS File Added: dbmail.20161018-1122.err.xz

19-Oct-16 18:58 alan Note Added: 0003772
======================================================================

_______________________________________________
Dbmail-dev mailing list
Dbmail-dev@dbmail.org
http://mailman.fastxs.nl/cgi-bin/mailman/listinfo/dbmail-dev

1 2  View All