Mailing List Archive

Exim issues with shadow_transport
[ bcc'd the -dev list, to let discussion fork ]

On 2008-09-25 at 16:53 -0400, Chris Zimmerman wrote:
> rim_bis_notifier_virtual_user:
> driver = pipe
> headers_only
> command = /usr/local/cpanel/bin/rim_bis_notifier "${local_part}@${domain}"
> user = "${lookup{$domain}lsearch* {/etc/userdomains}{$value}}"
> group = ${extract{3}{:}{${lookup{${lookup{$domain}lsearch* {/etc/userdomains}{$value}}}lsearch{/etc/passwd}{$value}}}}
> log_output = true
> current_directory = "/tmp"
> return_fail_output = true
> return_path_add = false

FWIW, and this is likely *not* the cause of your ALRM problems, it
appears that there's an undocumented (but intuitively sensible, if you
think about it) constraint that you can't use
return_output/return_fail_output on shadow transports.

If someone can find the docs on it, please point them out to me; I'm
rather tired and might just be searching badly. Otherwise, that's a
documentation bug. It should also probably be sanity-checked in the
config *_init() functions to log a config error message; probably best
to not make it a LOG_PANIC_DIE since it's not documented (yet) as a
misconfiguration.
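
A rough sketch of what such a non-fatal sanity check could look like. This is not Exim code: the `transport_sketch` struct, the `check_shadow_transports` function and the transport names are all hypothetical, and a real check would have to work on Exim's own transport list and logging calls. It only illustrates "warn, don't die" when a transport used as a shadow sets return_output or return_fail_output.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical, pared-down transport record; not Exim's real structure. */
struct transport_sketch {
    const char *name;
    const char *shadow;        /* name of the shadow transport, or NULL */
    int return_output;
    int return_fail_output;
};

/* Look for transports that are named as a shadow by another transport
   while asking for command output to be returned. Prints one warning per
   primary/shadow pair and returns the number of warnings. It never
   aborts, in line with the non-fatal handling suggested above. */
static int check_shadow_transports(const struct transport_sketch *t, size_t n)
{
    int offenders = 0;

    for (size_t i = 0; i < n; i++) {
        if (t[i].shadow == NULL)
            continue;
        for (size_t j = 0; j < n; j++) {
            if (strcmp(t[j].name, t[i].shadow) != 0)
                continue;
            if (t[j].return_output || t[j].return_fail_output) {
                fprintf(stderr,
                        "config warning: transport %s is the shadow of %s "
                        "but sets return_output/return_fail_output\n",
                        t[j].name, t[i].name);
                offenders++;
            }
        }
    }
    return offenders;
}
```

Called once after the whole configuration has been read, a check along these lines would flag the transport quoted above without stopping the daemon.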

Chris, you'll want to remove that setting.

I'm heading to get some sleep. Someone in another timezone might have
more luck investigating this. Otherwise, I'll pick up tomorrow evening.

-Phil

--
## List details at http://lists.exim.org/mailman/listinfo/exim-dev Exim details at http://www.exim.org/ ##
Re: Exim issues with shadow_transport
On 2008-09-26 at 03:12 -0700, Phil Pennock wrote:
> FWIW, and this is likely *not* the cause of your ALRM problems, it
> appears that there's an undocumented (but intuitively sensible, if you
> think about it) constraint that you can't use
> return_output/return_fail_output on shadow transports.

In transport.c:transport_write_block(), shouldn't there be a
"sigalrm_seen = FALSE" before the "alarm(local_timeout);" on line 234
(per rev 1.21) so that if a previous alarm's SIGALRM was issued, we
don't fail out immediately?
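
To make the question concrete, here is a minimal standalone sketch of the pattern, assuming a flag set by a SIGALRM handler and checked after write(). It is not the code from transport.c: `write_block_sketch`, `install_sigalrm_handler` and the `reset_first` parameter are made up for illustration, and only the flag name is borrowed from Exim.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <errno.h>
#include <signal.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical stand-in for Exim's global flag of the same name. */
static volatile sig_atomic_t sigalrm_seen = 0;

static void sigalrm_handler(int sig)
{
    (void)sig;
    sigalrm_seen = 1;
}

/* Install the handler without SA_RESTART so that a blocked write()
   is interrupted when the alarm fires. Returns 0 on success. */
static int install_sigalrm_handler(void)
{
    struct sigaction sa;
    sa.sa_handler = sigalrm_handler;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    return sigaction(SIGALRM, &sa, NULL);
}

/* Write one block with a timeout. Returns 1 on success, 0 on failure.
   reset_first selects whether the flag is cleared before the alarm is
   armed, which is the change being asked about. */
static int write_block_sketch(int fd, const void *buf, size_t len,
                              unsigned timeout, int reset_first)
{
    ssize_t rc;

    if (reset_first)
        sigalrm_seen = 0;

    alarm(timeout);
    rc = write(fd, buf, len);
    alarm(0);

    if (sigalrm_seen) {        /* a flag left over from earlier also lands here */
        errno = ETIMEDOUT;
        return 0;
    }
    return rc == (ssize_t)len;
}
```

With the flag already set on entry and `reset_first` equal to 0, the function reports a timeout even though the write completed; with `reset_first` equal to 1 the same call succeeds.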

-Phil

Re: Exim issues with shadow_transport
Phil Pennock wrote:
> On 2008-09-26 at 03:12 -0700, Phil Pennock wrote:
>> FWIW, and this is likely *not* the cause of your ALRM problems, it
>> appears that there's an undocumented (but intuitively sensible, if you
>> think about it) constraint that you can't use
>> return_output/return_fail_output on shadow transports.
>
> In transport.c:transport_write_block(), shouldn't there be a
> "sigalrm_seen = FALSE" before the "alarm(local_timeout);" on line 234
> (per rev 1.21) so that if a previous alarm's SIGALRM was issued, we
> don't fail out immediately?
>
> -Phil
>

It seems to be a global var and so is set to FALSE before the function
is called (line 765 of transports/pipe.c).
I don't see why it can't be set to FALSE before the call, but it might
not make any difference to the problem. When the error occurs and
changes the value of sigalrm_seen, the entire function returns FALSE and
the value of the variable is never checked again.

--
The Exim Manual
http://www.exim.org/docs.html
http://docs.exim.org/current/

Re: Exim issues with shadow_transport
On 2008-09-26 at 21:19 +1000, Ted Cooper wrote:
> Phil Pennock wrote:
> > On 2008-09-26 at 03:12 -0700, Phil Pennock wrote:
> >> FWIW, and this is likely *not* the cause of your ALRM problems, it
> >> appears that there's an undocumented (but intuitively sensible, if you
> >> think about it) constraint that you can't use
> >> return_output/return_fail_output on shadow transports.
> >
> > In transport.c:transport_write_block(), shouldn't there be a
> > "sigalrm_seen = FALSE" before the "alarm(local_timeout);" on line 234
> > (per rev 1.21) so that if a previous alarm's SIGALRM was issued, we
> > don't fail out immediately?
>
> It seems to be a global var and so is set to FALSE before the function
> is called (line 765 of transports/pipe.c).
> I don't see why it can't be set to FALSE before the call, but it might
> not make any difference to the problem. When the error occurs and
> changes the value of sigalrm_seen, the entire function returns FALSE and
> the value of the variable is never checked again.

Right, and when the function is called again?

Eg, when sending a bounce message, the return value of
transport_write_message() is not checked and if multiple recipients
failed, the loop is called multiple times, so you can go into this with
sigalrm_seen set TRUE.

This won't create spurious SIGALRM so isn't the problem source, but will
have Exim think it hasn't returned a message when it has.

For this, you'd have to have reached the timeout on a connection
already, so the odds are that things are dead anyway. So it's unlikely
to be making things worse. Especially since you'd have to be accepting
and then generating bounces.

I haven't tracked down all the callers of transport_write_block,
transport_write_string and transport_write_message, so I haven't seen
whether there are other instances. It's probably best just to reset to
clean state.
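
The repeated-call scenario above can be simulated without any real signals. In this sketch `fake_write_block` and `run_bounce_loop` are invented names, and the `times_out` array simply states which writes would have hit their timeout. It is only meant to show how a flag that is never cleared makes later writes look failed, not how Exim's bounce code is actually structured.

```c
#include <stddef.h>

/* Hypothetical stand-in for the global flag discussed in this thread. */
static int sigalrm_seen = 0;

/* Simulated block write. times_out says whether this particular write
   hits its timeout. Returns 1 on success, 0 on reported failure. */
static int fake_write_block(int times_out, int reset_first)
{
    if (reset_first)
        sigalrm_seen = 0;          /* start from clean state */
    if (times_out)
        sigalrm_seen = 1;          /* what the SIGALRM handler would do */
    return sigalrm_seen ? 0 : 1;
}

/* Caller loop: one write per failed recipient. The loop carries on
   whatever each write returns; results are recorded in `results` only so
   the effect can be inspected. Returns the number of reported failures. */
static int run_bounce_loop(const int *times_out, int *results, size_t n,
                           int reset_first)
{
    int failures = 0;

    for (size_t i = 0; i < n; i++) {
        results[i] = fake_write_block(times_out[i], reset_first);
        if (!results[i])
            failures++;
    }
    return failures;
}
```

With three recipients where only the first write times out, the loop without the reset reports all three as failed, while the loop with the reset reports just the one.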

Hideous packet loss to colo box, ending mail terse and stopping looking
at Exim invocation paths.

-Phil
