Mailing List Archive

[PATCH] mod_ldap: fix apr_ldap_rebind_remove() use
In uldap_connection_unbind, apr_ldap_rebind_remove() is always passed
NULL since ldc->ldap is either NULL on entry or is set to NULL. This
looks safe, but seems like an expensive noop since
apr_ldap_rebind_remove() acquires a mutex and iterates a linked list
trying to find a rebind reference matching NULL (I assume always
failing).

http://svn.apache.org/viewvc/httpd/httpd/trunk/modules/ldap/util_ldap.c?revision=1876599&view=markup#l222

Can this be removed or should it be moved up so it's not a noop? I've
not tried to reproduce problems in an LDAP config with referrals.

(Credit to Abhishek and Angel who saw apr_ldap_rebind_remove() spinning
and debugged this)

Index: modules/ldap/util_ldap.c
===================================================================
--- modules/ldap/util_ldap.c (revision 1877157)
+++ modules/ldap/util_ldap.c (working copy)
@@ -225,6 +225,12 @@

if (ldc) {
if (ldc->ldap) {
+ /* forget the rebind info for this conn */
+ if (ldc->ChaseReferrals == AP_LDAP_CHASEREFERRALS_ON) {
+ apr_ldap_rebind_remove(ldc->ldap);
+ apr_pool_clear(ldc->rebind_pool);
+ }
+
if (ldc->r) {
ap_log_rerror(APLOG_MARK, APLOG_TRACE5, 0, ldc->r, "LDC %pp unbind", ldc);
}
@@ -233,11 +239,6 @@
}
ldc->bound = 0;

- /* forget the rebind info for this conn */
- if (ldc->ChaseReferrals == AP_LDAP_CHASEREFERRALS_ON) {
- apr_ldap_rebind_remove(ldc->ldap);
- apr_pool_clear(ldc->rebind_pool);
- }
}

return APR_SUCCESS;
Re: [PATCH] mod_ldap: fix apr_ldap_rebind_remove() use [ In reply to ]
On Wed, Apr 29, 2020 at 9:29 AM Joe Orton <jorton@redhat.com> wrote:
>
> In uldap_connection_unbind, apr_ldap_rebind_remove() is always passed
> NULL since ldc->ldap is either NULL on entry or is set to NULL. This
> looks safe, but seems like an expensive noop since
> apr_ldap_rebind_remove() acquires a mutex and iterates a linked list
> trying to find a rebind reference matching NULL (I assume always
> failing).
>
> http://svn.apache.org/viewvc/httpd/httpd/trunk/modules/ldap/util_ldap.c?revision=1876599&view=markup#l222
>
> Can this be removed or should it be moved up so it's not a noop? I've
> not tried to reproduce problems in an LDAP config with referrals.
>
> (Credit to Abhishek and Angel who saw apr_ldap_rebind_remove() spinning
> and debugged this)
>
> Index: modules/ldap/util_ldap.c
> ===================================================================
> --- modules/ldap/util_ldap.c (revision 1877157)
> +++ modules/ldap/util_ldap.c (working copy)
> @@ -225,6 +225,12 @@
>
> if (ldc) {
> if (ldc->ldap) {
> + /* forget the rebind info for this conn */
> + if (ldc->ChaseReferrals == AP_LDAP_CHASEREFERRALS_ON) {
> + apr_ldap_rebind_remove(ldc->ldap);
> + apr_pool_clear(ldc->rebind_pool);
> + }
> +
> if (ldc->r) {
> ap_log_rerror(APLOG_MARK, APLOG_TRACE5, 0, ldc->r, "LDC %pp unbind", ldc);
> }
> @@ -233,11 +239,6 @@
> }
> ldc->bound = 0;
>
> - /* forget the rebind info for this conn */
> - if (ldc->ChaseReferrals == AP_LDAP_CHASEREFERRALS_ON) {
> - apr_ldap_rebind_remove(ldc->ldap);
> - apr_pool_clear(ldc->rebind_pool);
> - }
> }

+1 to moving up
Re: [PATCH] mod_ldap: fix apr_ldap_rebind_remove() use [ In reply to ]
On 4/29/20 5:02 PM, Eric Covener wrote:
> On Wed, Apr 29, 2020 at 9:29 AM Joe Orton <jorton@redhat.com> wrote:
>>
>> In uldap_connection_unbind, apr_ldap_rebind_remove() is always passed
>> NULL since ldc->ldap is either NULL on entry or is set to NULL. This
>> looks safe, but seems like an expensive noop since
>> apr_ldap_rebind_remove() acquires a mutex and iterates a linked list
>> trying to find a rebind reference matching NULL (I assume always
>> failing).
>>
>> http://svn.apache.org/viewvc/httpd/httpd/trunk/modules/ldap/util_ldap.c?revision=1876599&view=markup#l222
>>
>> Can this be removed or should it be moved up so it's not a noop? I've
>> not tried to reproduce problems in an LDAP config with referrals.
>>
>> (Credit to Abhishek and Angel who saw apr_ldap_rebind_remove() spinning
>> and debugged this)
>>
>> Index: modules/ldap/util_ldap.c
>> ===================================================================
>> --- modules/ldap/util_ldap.c (revision 1877157)
>> +++ modules/ldap/util_ldap.c (working copy)
>> @@ -225,6 +225,12 @@
>>
>> if (ldc) {
>> if (ldc->ldap) {
>> + /* forget the rebind info for this conn */
>> + if (ldc->ChaseReferrals == AP_LDAP_CHASEREFERRALS_ON) {
>> + apr_ldap_rebind_remove(ldc->ldap);
>> + apr_pool_clear(ldc->rebind_pool);
>> + }
>> +
>> if (ldc->r) {
>> ap_log_rerror(APLOG_MARK, APLOG_TRACE5, 0, ldc->r, "LDC %pp unbind", ldc);
>> }
>> @@ -233,11 +239,6 @@
>> }
>> ldc->bound = 0;
>>
>> - /* forget the rebind info for this conn */
>> - if (ldc->ChaseReferrals == AP_LDAP_CHASEREFERRALS_ON) {
>> - apr_ldap_rebind_remove(ldc->ldap);
>> - apr_pool_clear(ldc->rebind_pool);
>> - }
>> }
>
> +1 to moving up

Hm. Don't we need to do the apr_pool_clear(ldc->rebind_pool); in any case if ldc->ChaseReferrals == AP_LDAP_CHASEREFERRALS_ON or
does this only makes sense if ldc->ldap != NULL?
And can we still do an ldap_unbind_s(ldc->ldap); if we did an apr_ldap_rebind_remove(ldc->ldap); before?

Regards

RĂ¼diger

>
Re: [PATCH] mod_ldap: fix apr_ldap_rebind_remove() use [ In reply to ]
On Wed, Apr 29, 2020 at 2:43 PM Ruediger Pluem <rpluem@apache.org> wrote:
>
>
>
> On 4/29/20 5:02 PM, Eric Covener wrote:
> > On Wed, Apr 29, 2020 at 9:29 AM Joe Orton <jorton@redhat.com> wrote:
> >>
> >> In uldap_connection_unbind, apr_ldap_rebind_remove() is always passed
> >> NULL since ldc->ldap is either NULL on entry or is set to NULL. This
> >> looks safe, but seems like an expensive noop since
> >> apr_ldap_rebind_remove() acquires a mutex and iterates a linked list
> >> trying to find a rebind reference matching NULL (I assume always
> >> failing).
> >>
> >> http://svn.apache.org/viewvc/httpd/httpd/trunk/modules/ldap/util_ldap.c?revision=1876599&view=markup#l222
> >>
> >> Can this be removed or should it be moved up so it's not a noop? I've
> >> not tried to reproduce problems in an LDAP config with referrals.
> >>
> >> (Credit to Abhishek and Angel who saw apr_ldap_rebind_remove() spinning
> >> and debugged this)
> >>
> >> Index: modules/ldap/util_ldap.c
> >> ===================================================================
> >> --- modules/ldap/util_ldap.c (revision 1877157)
> >> +++ modules/ldap/util_ldap.c (working copy)
> >> @@ -225,6 +225,12 @@
> >>
> >> if (ldc) {
> >> if (ldc->ldap) {
> >> + /* forget the rebind info for this conn */
> >> + if (ldc->ChaseReferrals == AP_LDAP_CHASEREFERRALS_ON) {
> >> + apr_ldap_rebind_remove(ldc->ldap);
> >> + apr_pool_clear(ldc->rebind_pool);
> >> + }
> >> +
> >> if (ldc->r) {
> >> ap_log_rerror(APLOG_MARK, APLOG_TRACE5, 0, ldc->r, "LDC %pp unbind", ldc);
> >> }
> >> @@ -233,11 +239,6 @@
> >> }
> >> ldc->bound = 0;
> >>
> >> - /* forget the rebind info for this conn */
> >> - if (ldc->ChaseReferrals == AP_LDAP_CHASEREFERRALS_ON) {
> >> - apr_ldap_rebind_remove(ldc->ldap);
> >> - apr_pool_clear(ldc->rebind_pool);
> >> - }
> >> }
> >
> > +1 to moving up
>
> Hm. Don't we need to do the apr_pool_clear(ldc->rebind_pool); in any case if ldc->ChaseReferrals == AP_LDAP_CHASEREFERRALS_ON or
> does this only makes sense if ldc->ldap != NULL?
> And can we still do an ldap_unbind_s(ldc->ldap); if we did an apr_ldap_rebind_remove(ldc->ldap); before?

I see what you mean. If we lost the ldc->ldap some other way, anything
allocated from ldc->rebind_pool is leaked because the LDAP key can't
be looked up in the xref linked list.
We would likely have linked list growth in that case too.
Re: [PATCH] mod_ldap: fix apr_ldap_rebind_remove() use [ In reply to ]
On Wed, Apr 29, 2020 at 03:09:28PM -0400, Eric Covener wrote:
> On Wed, Apr 29, 2020 at 2:43 PM Ruediger Pluem <rpluem@apache.org> wrote:
> > Hm. Don't we need to do the apr_pool_clear(ldc->rebind_pool); in any case if ldc->ChaseReferrals == AP_LDAP_CHASEREFERRALS_ON or
> > does this only makes sense if ldc->ldap != NULL?
> > And can we still do an ldap_unbind_s(ldc->ldap); if we did an apr_ldap_rebind_remove(ldc->ldap); before?
>
> I see what you mean. If we lost the ldc->ldap some other way, anything
> allocated from ldc->rebind_pool is leaked because the LDAP key can't
> be looked up in the xref linked list.
> We would likely have linked list growth in that case too.

Shouldn't clearing the pool here be sufficient anyway? Since there is a
cleanup registered by apr_ldap_rebind_add() which calls
apr_ldap_rebind_remove() with the original LDAP * pointer value it looks
safe, and it avoids entering _rebind_remove twice.

Doing it before the ldap_unbind() call so that LDAP * pointer is not a
dangling pointer seems better.

The underlying problem here is a performance issue which looked like
linked list growth so actually with:

a) threaded MPM and large number of threads,
b) each thread may have its own LDAP * and an associated rebind entry (is that really right?!)
c) on unbind each thread walks the linked list w/mutex held

which looks quite painful regardless, and doing (c) twice is clearly
worse. So as well as changing this the indexed linked list should
really be a hash table?

Regards, Joe

--
Joe Orton // Red Hat Core Services
Re: [PATCH] mod_ldap: fix apr_ldap_rebind_remove() use [ In reply to ]
On 4/30/20 11:34 AM, Joe Orton wrote:
> On Wed, Apr 29, 2020 at 03:09:28PM -0400, Eric Covener wrote:
>> On Wed, Apr 29, 2020 at 2:43 PM Ruediger Pluem <rpluem@apache.org> wrote:
>>> Hm. Don't we need to do the apr_pool_clear(ldc->rebind_pool); in any case if ldc->ChaseReferrals == AP_LDAP_CHASEREFERRALS_ON or
>>> does this only makes sense if ldc->ldap != NULL?
>>> And can we still do an ldap_unbind_s(ldc->ldap); if we did an apr_ldap_rebind_remove(ldc->ldap); before?
>>
>> I see what you mean. If we lost the ldc->ldap some other way, anything
>> allocated from ldc->rebind_pool is leaked because the LDAP key can't
>> be looked up in the xref linked list.
>> We would likely have linked list growth in that case too.
>
> Shouldn't clearing the pool here be sufficient anyway? Since there is a
> cleanup registered by apr_ldap_rebind_add() which calls
> apr_ldap_rebind_remove() with the original LDAP * pointer value it looks
> safe, and it avoids entering _rebind_remove twice.
>
> Doing it before the ldap_unbind() call so that LDAP * pointer is not a
> dangling pointer seems better.
>
> The underlying problem here is a performance issue which looked like
> linked list growth so actually with:
>
> a) threaded MPM and large number of threads,
> b) each thread may have its own LDAP * and an associated rebind entry (is that really right?!)
> c) on unbind each thread walks the linked list w/mutex held
>
> which looks quite painful regardless, and doing (c) twice is clearly
> worse. So as well as changing this the indexed linked list should
> really be a hash table?

Not looked at the problem further, but there is now a PR for this:

https://bz.apache.org/bugzilla/show_bug.cgi?id=64414

Maybe this revives the discussion :-)

Regards

RĂ¼diger
Re: [PATCH] mod_ldap: fix apr_ldap_rebind_remove() use [ In reply to ]
On Thu, May 7, 2020 at 2:04 AM Ruediger Pluem <rpluem@apache.org> wrote:
>
>
>
> On 4/30/20 11:34 AM, Joe Orton wrote:
> > On Wed, Apr 29, 2020 at 03:09:28PM -0400, Eric Covener wrote:
> >> On Wed, Apr 29, 2020 at 2:43 PM Ruediger Pluem <rpluem@apache.org> wrote:
> >>> Hm. Don't we need to do the apr_pool_clear(ldc->rebind_pool); in any case if ldc->ChaseReferrals == AP_LDAP_CHASEREFERRALS_ON or
> >>> does this only makes sense if ldc->ldap != NULL?
> >>> And can we still do an ldap_unbind_s(ldc->ldap); if we did an apr_ldap_rebind_remove(ldc->ldap); before?
> >>
> >> I see what you mean. If we lost the ldc->ldap some other way, anything
> >> allocated from ldc->rebind_pool is leaked because the LDAP key can't
> >> be looked up in the xref linked list.
> >> We would likely have linked list growth in that case too.
> >
> > Shouldn't clearing the pool here be sufficient anyway? Since there is a
> > cleanup registered by apr_ldap_rebind_add() which calls
> > apr_ldap_rebind_remove() with the original LDAP * pointer value it looks
> > safe, and it avoids entering _rebind_remove twice.
> >
> > Doing it before the ldap_unbind() call so that LDAP * pointer is not a
> > dangling pointer seems better.
> >
> > The underlying problem here is a performance issue which looked like
> > linked list growth so actually with:
> >
> > a) threaded MPM and large number of threads,
> > b) each thread may have its own LDAP * and an associated rebind entry (is that really right?!)

I think it is right, the LDAP* is the only userdata we portably get on
the callback from the SDK and we have to return credentials.

There are also pooled LDAP* sitting idle who have entries floating
around. Probably not too many extras unless multiple
LDAP servers are used.

> > c) on unbind each thread walks the linked list w/mutex held
> >
> > which looks quite painful regardless, and doing (c) twice is clearly
> > worse. So as well as changing this the indexed linked list should
> > really be a hash table?

My hope is that the httpd-only fix will eliminate the leaks and the
locking will not be too much relative to ldap costs, especially w/ the
caching in mod_ldap.

> Not looked at the problem further, but there is now a PR for this:
>
> https://bz.apache.org/bugzilla/show_bug.cgi?id=64414
>
> Maybe this revives the discussion :-)

Nothing stood out in the source, is it OK to hash a pointer just by
passing APR_SIZEOF_VOIDP as the length?
Re: [PATCH] mod_ldap: fix apr_ldap_rebind_remove() use [ In reply to ]
> My hope is that the httpd-only fix will eliminate the leaks and the
> locking will not be too much relative to ldap costs, especially w/ the
> caching in mod_ldap.

Above is wrong since there is really no leak currently as Joe pointed
out, the apr_pool_clear is going to remove the entry with its stashed
away copy of the pointer.
Re: [PATCH] mod_ldap: fix apr_ldap_rebind_remove() use [ In reply to ]
On Wed, Apr 29, 2020 at 2:43 PM Ruediger Pluem <rpluem@apache.org> wrote:
>
>
>
> On 4/29/20 5:02 PM, Eric Covener wrote:
> > On Wed, Apr 29, 2020 at 9:29 AM Joe Orton <jorton@redhat.com> wrote:
> >>
> >> In uldap_connection_unbind, apr_ldap_rebind_remove() is always passed
> >> NULL since ldc->ldap is either NULL on entry or is set to NULL. This
> >> looks safe, but seems like an expensive noop since
> >> apr_ldap_rebind_remove() acquires a mutex and iterates a linked list
> >> trying to find a rebind reference matching NULL (I assume always
> >> failing).
> >>
> >> http://svn.apache.org/viewvc/httpd/httpd/trunk/modules/ldap/util_ldap.c?revision=1876599&view=markup#l222
> >>
> >> Can this be removed or should it be moved up so it's not a noop? I've
> >> not tried to reproduce problems in an LDAP config with referrals.
> >>
> >> (Credit to Abhishek and Angel who saw apr_ldap_rebind_remove() spinning
> >> and debugged this)
> >>
> >> Index: modules/ldap/util_ldap.c
> >> ===================================================================
> >> --- modules/ldap/util_ldap.c (revision 1877157)
> >> +++ modules/ldap/util_ldap.c (working copy)
> >> @@ -225,6 +225,12 @@
> >>
> >> if (ldc) {
> >> if (ldc->ldap) {
> >> + /* forget the rebind info for this conn */
> >> + if (ldc->ChaseReferrals == AP_LDAP_CHASEREFERRALS_ON) {
> >> + apr_ldap_rebind_remove(ldc->ldap);
> >> + apr_pool_clear(ldc->rebind_pool);
> >> + }
> >> +
> >> if (ldc->r) {
> >> ap_log_rerror(APLOG_MARK, APLOG_TRACE5, 0, ldc->r, "LDC %pp unbind", ldc);
> >> }
> >> @@ -233,11 +239,6 @@
> >> }
> >> ldc->bound = 0;
> >>
> >> - /* forget the rebind info for this conn */
> >> - if (ldc->ChaseReferrals == AP_LDAP_CHASEREFERRALS_ON) {
> >> - apr_ldap_rebind_remove(ldc->ldap);
> >> - apr_pool_clear(ldc->rebind_pool);
> >> - }
> >> }
> >
> > +1 to moving up
>
> Hm. Don't we need to do the apr_pool_clear(ldc->rebind_pool); in any case if ldc->ChaseReferrals == AP_LDAP_CHASEREFERRALS_ON or
> does this only makes sense if ldc->ldap != NULL?

We need the ldc->LDAP to add the rebind as it is the key, and we do
seem to uniformly only clear ldc->LDAP in the unbind method. So I
think it only makes sense
if ldc->LDAP != NULL.


> And can we still do an ldap_unbind_s(ldc->ldap); if we did an apr_ldap_rebind_remove(ldc->ldap); before?

This is a good point. For the IBM/Tivoli SDK the callback can be
called a 2nd time as a cleanup (in the IBM SDK terms, not APR). This
will trigger a full/failing search of the linked list under the mutex.
I don't know when the "cleanup" style of callback is actually called
by the SDK but it is probably best to avoid that ordering.
Re: [PATCH] mod_ldap: fix apr_ldap_rebind_remove() use [ In reply to ]
> > And can we still do an ldap_unbind_s(ldc->ldap); if we did an apr_ldap_rebind_remove(ldc->ldap); before?
>
> This is a good point. For the IBM/Tivoli SDK the callback can be
> called a 2nd time as a cleanup (in the IBM SDK terms, not APR). This
> will trigger a full/failing search of the linked list under the mutex.
> I don't know when the "cleanup" style of callback is actually called
> by the SDK but it is probably best to avoid that ordering.

Looks like we don't hit the linked list in the free/cleanup case on the IBM SDK.
But for openldap we do give the SDK pointers to pool allocated memory
(not that they should be needed for unbind, but could be traced)
So probably best to continue unbinding first.

--
Eric Covener
covener@gmail.com
Re: [PATCH] mod_ldap: fix apr_ldap_rebind_remove() use [ In reply to ]
On Thu, May 07, 2020 at 08:06:00AM -0400, Eric Covener wrote:
> On Thu, May 7, 2020 at 2:04 AM Ruediger Pluem <rpluem@apache.org> wrote:
> > Not looked at the problem further, but there is now a PR for this:
> >
> > https://bz.apache.org/bugzilla/show_bug.cgi?id=64414
> >
> > Maybe this revives the discussion :-)
>
> Nothing stood out in the source, is it OK to hash a pointer just by
> passing APR_SIZEOF_VOIDP as the length?

I don't see why not.

Better question... or stupider question? For "modern" OpenLDAP where
ldap_set_rebind_proc takes a void *, this linked list cache is
completely redundant and you can "simply" pass the (bindDN, bindPW)
through to the rebind callback via the void *, and that will work
correctly?
Re: [PATCH] mod_ldap: fix apr_ldap_rebind_remove() use [ In reply to ]
On Thu, May 7, 2020 at 9:39 AM Joe Orton <jorton@redhat.com> wrote:
>
> On Thu, May 07, 2020 at 08:06:00AM -0400, Eric Covener wrote:
> > On Thu, May 7, 2020 at 2:04 AM Ruediger Pluem <rpluem@apache.org> wrote:
> > > Not looked at the problem further, but there is now a PR for this:
> > >
> > > https://bz.apache.org/bugzilla/show_bug.cgi?id=64414
> > >
> > > Maybe this revives the discussion :-)
> >
> > Nothing stood out in the source, is it OK to hash a pointer just by
> > passing APR_SIZEOF_VOIDP as the length?
>
> I don't see why not.
>
> Better question... or stupider question? For "modern" OpenLDAP where
> ldap_set_rebind_proc takes a void *, this linked list cache is
> completely redundant and you can "simply" pass the (bindDN, bindPW)
> through to the rebind callback via the void *, and that will work
> correctly?

That makes sense, I would think so. The manual on my Mac has the
userdata parm too.

--
Eric Covener
covener@gmail.com
Re: [PATCH] mod_ldap: fix apr_ldap_rebind_remove() use [ In reply to ]
On Thu, May 07, 2020 at 09:50:26AM -0400, Eric Covener wrote:
> On Thu, May 7, 2020 at 9:39 AM Joe Orton <jorton@redhat.com> wrote:
> > Better question... or stupider question? For "modern" OpenLDAP where
> > ldap_set_rebind_proc takes a void *, this linked list cache is
> > completely redundant and you can "simply" pass the (bindDN, bindPW)
> > through to the rebind callback via the void *, and that will work
> > correctly?
>
> That makes sense, I would think so. The manual on my Mac has the
> userdata parm too.

I have merged that change in r1878890 - thanks a lot for reviewing &
walking me through this far, apologies for moving slowing but I wanted
to get a working test case before proceeding.

Reviewing this thread again I am not sure if the change to
remove/reorder the removal of the rebind is safe for non-OpenLDAP
toolkits. I can revert that bit if required?

https://github.com/apache/httpd/commit/130eac3ae6e8f7a8465c7792660ce8b747642b23#diff-f0cf47f5d582509d36e95f170231bc21L226

Regards, Joe
Re: [PATCH] mod_ldap: fix apr_ldap_rebind_remove() use [ In reply to ]
On Tue, Jun 16, 2020 at 10:40 AM Joe Orton <jorton@redhat.com> wrote:
>
> On Thu, May 07, 2020 at 09:50:26AM -0400, Eric Covener wrote:
> > On Thu, May 7, 2020 at 9:39 AM Joe Orton <jorton@redhat.com> wrote:
> > > Better question... or stupider question? For "modern" OpenLDAP where
> > > ldap_set_rebind_proc takes a void *, this linked list cache is
> > > completely redundant and you can "simply" pass the (bindDN, bindPW)
> > > through to the rebind callback via the void *, and that will work
> > > correctly?
> >
> > That makes sense, I would think so. The manual on my Mac has the
> > userdata parm too.
>
> I have merged that change in r1878890 - thanks a lot for reviewing &
> walking me through this far, apologies for moving slowing but I wanted
> to get a working test case before proceeding.
>
> Reviewing this thread again I am not sure if the change to
> remove/reorder the removal of the rebind is safe for non-OpenLDAP
> toolkits. I can revert that bit if required?
>
> https://github.com/apache/httpd/commit/130eac3ae6e8f7a8465c7792660ce8b747642b23#diff-f0cf47f5d582509d36e95f170231bc21L226

I think it should be left in. I found I had made a very similar
change in the distribution I look after while adding Darwin sandbox
support.