Mailing List Archive

Exim 3.34 assertion failure--LDAP related [followup]
Hi,

I'm following up on that problem. It doesn't seem to have the same
undetermined cause (duh), but still, the error message is the same.

I am able to reproduce the same problem with the same config on Exim 3.34,
3.35 and 3.36.

It seems to happen after directing, and before or during transport. Here's
the "exim -q -v" output:

== something@remote.com T=remote_smtp defer (-44): retry time not
reached for any host
delivering message 17Aynk-0007py-00 (queue run pid 1114 fd 6)
LOG: 0 MAIN
=> runtime <runtime@cam.org> D=ldapuser T=quota_delivery
exim: error.c:221: ldap_parse_result: Assertion `r != ((void *)0)' failed.
LOG: 0 MAIN PANIC
queue run: process 1127 crashed with signal 6 while delivering
17Aynk-0007py-00
delivering message 17AvDk-000822-00 (queue run pid 1114 fd 6)

The quota_delivery transport is configured as:
ldap_default_servers = server1:server2
...
quota_delivery:
driver = appendfile
maildir_format = true
directory = ${lookup ldap {user="binduser" pass="bindpass"
ldap:///basedn?mailMessageStore?sub?(uid=${quote_ldap:$local_part})}{$value}
fail}
delivery_date_add
envelope_to_add
maildir_tag = ,S=$message_size
quota_size_regex = S=(\d+)$
user = ${lookup ldap {user="binduser" pass="bindpass"
ldap:///basedn?uidNumber?sub?(uid=${quote_ldap:$local_part})}{$value}fail}
group = users
mode = 0600
quota = 20M
quota_warn_threshold = 16M

There are two LDAP lookups: one for the maildir and one for the UID (nssldap
isn't installed). Every logged account that triggered the error has both
fields defined and valid in the LDAP DB. Moreover, the error seems to be
quite random but happens very often. Issuing
"exim -qf -Rruntime@cam.org -v" ends up working after a couple of tries.

The director does an LDAP lookup as well (in the "search_type/query"
form), but it never fails.

Below is the end of the original thread about a similar problem.

Wed, 13 Mar 2002 11:07:32 +0000 (GMT)
JWB>> Let's assume that
JWB>> Some exim process (perhaps 13321, perhaps not) made the LDAP query which
JWB>> caused the LDAP library to emit the assertion. That leaves Exim merely as
JWB>> the trigger mechanism, and not the problem.

PH> Indeed, but of course something wrong in Exim could be making it pull
PH> the trigger. However, unless it happens again, I guess it has to go into
PH> the "unresolved mysteries" pile.


Any idea?

Regards,
__
Tommy Lacroix ( runtime@cam.org )
Unix/Linux System Administrator, CAM Internet
Web: http://www.cam.org
Tel: +1 514 529-3000 ext 247
Fax: +1 514 529-3300
Re: Exim 3.34 assertion failure--LDAP related [followup]
Hi again,

I found the cause of my LDAP assertion problem...

If the ldap_search function fails (i.e. when the server is reported as being
down), result is never assigned a value (and so is NULL). If ldap_result2error
is called with a NULL "result", it ends up triggering the assertion failure.

This little patch calls ldap_result2error only if result is non-NULL. This
routine, according to the LDAP documentation, gets "The ld_errno field in ld
[...] set and returned." If ldap_result2error isn't called, the
ld->ld_errno field contains the value set by ldap_result.

ldap_get_option is then called to retrieve ld->ld_errno, and the value is
passed to ldap_err2string.


--- exim-3.34/src/lookups/ldap.c Wed Dec 19 06:50:29 2001
+++ exim-3.34-orig/src/lookups/ldap.c Fri May 24 14:30:08 2002
@@ -464,11 +464,16 @@

if (rc == -1)
{
+ int err;
DEBUG(9) debug_printf("ldap_result failed\n");

#if defined LDAP_LIB_SOLARIS || defined LDAP_LIB_OPENLDAP2
+ if (result != NULL) ldap_result2error(lcp->ld, result, 0);
+
+ ldap_get_option(lcp->ld, LDAP_OPT_ERROR_NUMBER, &err);
+
*errmsg = string_sprintf("ldap_result failed: %s",
- ldap_err2string(ldap_result2error(lcp->ld, result, 0)));
+ ldap_err2string(err));

#elif defined LDAP_LIB_NETSCAPE
{
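
The guard in the patch above can be tried out in isolation. The following is only a minimal sketch under stated assumptions: MockLdap, MockMessage and every mock_* function are hypothetical stand-ins invented for illustration, not the real LDAP library API, and the error codes are arbitrary values. It shows the same order of operations as the patch: convert the result only when it is non-NULL, then read the error number stored on the handle.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the LDAP handle and result message. */
typedef struct { int ld_errno; } MockLdap;
typedef struct { int msg_errno; } MockMessage;

/* Arbitrary illustrative codes, not the library's real values. */
enum { MOCK_SUCCESS = 0, MOCK_SERVER_DOWN = 1 };

/* Plays the role of ldap_result2error: it asserts on a NULL result,
   which is the failure mode reported in this thread. */
static int mock_result2error(MockLdap *ld, MockMessage *res)
{
  assert(res != NULL);
  ld->ld_errno = res->msg_errno;
  return ld->ld_errno;
}

/* Plays the role of ldap_get_option(ld, LDAP_OPT_ERROR_NUMBER, &err). */
static void mock_get_errno(const MockLdap *ld, int *err)
{
  *err = ld->ld_errno;
}

/* Plays the role of ldap_err2string. */
static const char *mock_err2string(int err)
{
  switch (err)
    {
    case MOCK_SUCCESS:     return "Success";
    case MOCK_SERVER_DOWN: return "Can't contact LDAP server";
    default:               return "Unknown error";
    }
}

/* Guarded error path: never hand a NULL result to the converter,
   but always report whatever error number the handle holds. */
static const char *describe_failure(MockLdap *ld, MockMessage *result)
{
  int err;
  if (result != NULL) mock_result2error(ld, result);
  mock_get_errno(ld, &err);
  return mock_err2string(err);
}
```

With a NULL result, describe_failure skips the conversion and reports the error already stored on the handle, so the assertion is never reached.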



Best regards,

__
Tommy Lacroix ( runtime@cam.org )
Re: Exim 3.34 assertion failure--LDAP related [followup]
On Thu, 23 May 2002, Tommy Lacroix wrote:

> I'm following up on that problem. It doesn't seem to have the same
> undetermined cause (duh), but still, the error message is the same.

I'm afraid I don't remember if anything was sorted out for this - I
rather suspect not. But so much has been going on lately...

> == something@remote.com T=remote_smtp defer (-44): retry time not
> reached for any host
> delivering message 17Aynk-0007py-00 (queue run pid 1114 fd 6)
> LOG: 0 MAIN
> => runtime <runtime@cam.org> D=ldapuser T=quota_delivery
> exim: error.c:221: ldap_parse_result: Assertion `r != ((void *)0)' failed.

error.c must be part of LDAP; it isn't part of Exim. I guess somebody
needs to read the LDAP code to see what function it is in. Exim doesn't
call ldap_parse_result itself.

I'm afraid I'm on vacation for a week from now, so this will have to
wait for my attention. If you can find out any more while I'm away, that
will be helpful.

It is odd that it happens only sometimes.


--
Philip Hazel University of Cambridge Computing Service,
ph10@cus.cam.ac.uk Cambridge, England. Phone: +44 1223 334714.
Re: Exim 3.34 assertion failure--LDAP related [followup]
> > exim: error.c:221: ldap_parse_result: Assertion `r != ((void *)0)' failed.
> error.c must be part of LDAP; it isn't part of Exim. I guess somebody
> needs to read the LDAP code to see what function it is in. Exim doesn't
> call ldap_parse_result itself.
>
> I'm afraid I'm on vacation for a week from now, so this will have to
> wait for my attention. If you can find out any more while I'm away, that
> will be helpful.
>
> It is odd that it happens only sometimes.

I found out more. ldap_search is called on a cached connection that is down.
ldap_result reports error *without* a result (thus the result var is NULL).
ldap_result2error is called, which calls ldap_parse_result, which causes an
assertion failure because result is NULL.

The included patch (against 3.34, but it should work on 3.35/36 since the
changelog reports no changes to the LDAP lookups) is a new version of the
first one I sent, which didn't completely solve my problem: three
consecutive LDAP lookups on the same connection just fail for some reason.
Upon an LDAP_SERVER_DOWN failure on a cached connection, tidying the LDAP
connections and calling perform_ldap_search again works OK, although I'm not
quite sure it's the correct way of fixing this.
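
The retry described above can be sketched roughly as follows. This is not the attached patch (which the archive did not keep) and not Exim's code: MockState, mock_search, mock_tidy and search_with_retry are hypothetical names invented for illustration, standing in for perform_ldap_search and the connection tidy-up. The sketch assumes the retry is attempted only once, and only when the failing connection came from the cache.

```c
#include <assert.h>
#include <stdbool.h>

/* Arbitrary illustrative return codes. */
enum { SEARCH_OK = 0, SEARCH_SERVER_DOWN = 1 };

/* Hypothetical model of the connection cache. */
typedef struct {
  bool have_cached;     /* a connection is held in the cache */
  bool cached_is_down;  /* ...but its server end has gone away */
  int  opens;           /* fresh connections opened so far */
  int  attempts;        /* search attempts made so far */
} MockState;

/* One search attempt: reuse the cached connection if there is one,
   otherwise open a fresh (working) connection and cache it. */
static int mock_search(MockState *st, bool *was_cached)
{
  st->attempts++;
  *was_cached = st->have_cached;
  if (!st->have_cached)
    {
    st->opens++;
    st->have_cached = true;
    st->cached_is_down = false;
    }
  return st->cached_is_down ? SEARCH_SERVER_DOWN : SEARCH_OK;
}

/* Drop every cached connection so the next search reconnects. */
static void mock_tidy(MockState *st)
{
  st->have_cached = false;
  st->cached_is_down = false;
}

/* Retry exactly once, and only if the failure was on a cached connection. */
static int search_with_retry(MockState *st)
{
  bool was_cached;
  int rc = mock_search(st, &was_cached);
  if (rc == SEARCH_SERVER_DOWN && was_cached)
    {
    mock_tidy(st);
    rc = mock_search(st, &was_cached);
    }
  return rc;
}
```

Limiting the retry to a single attempt on a cached connection keeps a server that really is down from causing a loop: a freshly opened connection that fails is reported as an error straight away.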

I'll wait for some news. Have a good vacation.

Regards,

__
Tommy Lacroix ( runtime@cam.org )
--
[ exim-ldap.patch of type application/octet-stream deleted ]
--
Re: Exim 3.34 assertion failure--LDAP related [followup]
On Fri, 24 May 2002, Tommy Lacroix wrote:

> This little patch calls ldap_result2error if result is non-NULL.

A similar change has already been made in Exim 4. However, yours looks
to be more sophisticated in that it manages to extract an error message
from LDAP. I didn't know how to do that. So I'll consider modifying the
code in Exim 4. (I'm trying hard not to touch Exim 3 any more.)

Philip

--
Philip Hazel University of Cambridge Computing Service,
ph10@cus.cam.ac.uk Cambridge, England. Phone: +44 1223 334714.