Mailing List Archive

Old topic again: Option to avoid fsync()?
Hello,

I asked here a while ago, but got no replies.

If you have a previously overloaded system that built up a large queue,
disabling fsync() is like an overboost switch: Not for regular operation,
but it solves the problem and brings you back to regular operation,
allowing you to take care of the original problem.

Apart from that, I run a spam honeypot where losing mail is no problem.
Avoiding fsync() for regular operation lets me run it on a way cheaper
system.

A command line option does not help, because it would not be passed
to queue runners and their children. Either you compile Exim without
fsync() or introduce a new configuration file option. Having an extra
executable finally annoys me enough to bring this topic up again.

Philip does not like an option like that, because dumb admins may use
it without being aware of the risks. That's a valid point.

Me, I think the flexibility of Exim already allows you to screw up so
many things that dumb admins have probably screwed up already and have
nothing to lose. I wouldn't mind if the daemon logged a message about unsafe
operation to mainlog when starting up. I am trying to reduce the number
of private Exim patches, and getting this into the main distribution would
help me a lot, plus it may help others who know what they are doing.

Remember Ghostbusters: There will come a day when you have to cross
the streams. ;-)

Any comments are appreciated!

Michael

--
## List details at http://www.exim.org/mailman/listinfo/exim-dev Exim details at http://www.exim.org/ ##
Re: Old topic again: Option to avoid fsync()?
At 15:21 +0100 Michael Haardt wrote:

> If you have a previously overloaded system that built up a large queue,
> disabling fsync() is like an overboost switch: Not for regular operation,
> but it solves the problem and brings you back to regular operation,
> allowing you to take care of the original problem.

I try to avoid r/w contention by using full data journalling with external
journals, which are on physically different (and hopefully faster) HDDs. I
believe this is good practice for synchronous I/O like mail and NFS.

For real psychopathic I-don't-care-about-my-data cases you can always use
tmpfs..

Re: Old topic again: Option to avoid fsync()?
> I try to avoid r/w contention by using full data journalling with external
> journals, which are on physically different (and hopefully faster) HDDs. I
> believe this is good practice for synchronous I/O like mail and NFS.

There are a bunch of things you can do for _regular operation_. But disks can
fail by getting ridiculously slow, networks can lose or destroy packets,
name servers can fail, then of course there are attacks and whatever
else... and you end up with a queue.

Disable fsync() for a couple of minutes and people have their mail. Enable
it again and everything is fine.

> For real psychopathic I-don't-care-about-my-data cases you can always use
> tmpfs..

Right, but then either it may cause swapping or I have a very limited queue.
Running the system on real disks, but without fsync(), means data will
either be written lazily, or not at all. It's nicely in between tmpfs
and fsync().
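
To get a feeling for what each fsync() costs on a given disk, a small timing experiment can help. The following is only a rough sketch: the function name write_files, the file names, the payload size and the file count are made up for illustration, and it does not reproduce how Exim writes its spool.

```python
import os
import tempfile
import time


def write_files(directory, count, payload, use_fsync):
    """Write `count` small files and return the elapsed wall-clock seconds."""
    start = time.perf_counter()
    for i in range(count):
        path = os.path.join(directory, f"msg-{i}")
        with open(path, "wb") as f:
            f.write(payload)
            f.flush()  # hand Python's buffer over to the kernel
            if use_fsync:
                # Block until the kernel reports the file data as written out.
                os.fsync(f.fileno())
    return time.perf_counter() - start


if __name__ == "__main__":
    payload = b"x" * 4096  # arbitrary message size for the experiment
    with tempfile.TemporaryDirectory() as spool:
        synced = write_files(spool, 100, payload, use_fsync=True)
        lazy = write_files(spool, 100, payload, use_fsync=False)
    print(f"with fsync: {synced:.3f}s, without: {lazy:.3f}s")
```

The numbers depend heavily on the file system, the drive and its write cache, so they are only meaningful when measured on the machine in question.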

Do you vote for or against having an option to disable fsync()?

Michael

Re: Old topic again: Option to avoid fsync()?
On Jan 9 Michael Haardt wrote:

> Do you vote for or against having an option to disable fsync()?

Against; I don't want Exim authors blamed for irresponsible behaviour.

Another option available to you is to LD_PRELOAD a no-op for fsync(), e.g.
<http://ftp.die.net/pub/qmail-tools/libnosync.c>.

But please try the external journal trick first, and set a commit interval
as large as you like--I use a minute or two. Your I/O will scale since
then the main volume is largely only reading and the journal volume will only
write, if you have enough RAM.

Re: Old topic again: Option to avoid fsync()?
* Matt Bernstein:

> I try to avoid r/w contention by using full data journalling with external
> journals, which are on physically different (and hopefully faster) HDDs. I
> believe this is good practice for synchronous I/O like mail and NFS.

If you want to throw money at the problem, a RAID controller with a
battery-backed cache is a good option as well.

On the other hand, with a lot of drives in their default configuration,
fsync() can't reliably do what it claims to anyway. 8-/

Re: Old topic again: Option to avoid fsync()?
On Wed, Jan 10, 2007 at 04:26:29PM +0100, Florian Weimer wrote:
> If you want to throw money at the problem, a RAID controller with a
> battery-backed cache is a good option as well.

You completely miss the point, so let me rephrase it: I am _not_
talking about regular operation. I am talking about cleaning up a mess,
e.g. after an attack or double/triple fault that managed to kill all
redundancy. Additionally, exotic applications benefit from disabling
fsync().

It's not economical to run systems at 10% of their maximum performance
just to have enough if shit happens, unless of course you just run a
small site, where the economic disadvantage of doing so can be tolerated.

> On the other hand, with a lot of drives in their default configuration,
> fsync() can't reliably do what it claims to anyway. 8-/

Actually, if you use maildir, there is no fsync() to synchronise the
directory, just one for the tmp file. But a code audit must be harder
than insisting that fsync be left in place under all conditions, because
it is the most you can do and still sometimes not enough.

The only valid point so far was:

>> Do you vote for or against having an option to disable fsync()?

> Against; I don't want Exim authors blamed for irresponsible behaviour.

What's irresponsible most of the time may exceptionally be sane.
LD_PRELOAD is an idea that probably works fine, although I don't like
the implications of a setuid library being usable for other setuid
applications, too. That's why I still prefer an exim-only configuration
file option.

Any other opinions than "enforce fsync, because it works for me"?

Michael

Re: Old topic again: Option to avoid fsync()?
Michael Haardt wrote:
> On Wed, Jan 10, 2007 at 04:26:29PM +0100, Florian Weimer wrote:
>
>> If you want to throw money at the problem, a RAID controller with a
>> battery-backed cache is a good option as well.
>>
>
> You completely miss the point, so let me rephrase it: I am _not_
> talking about regular operation. I am talking about cleaning up a mess,
> e.g. after an attack or double/triple fault that managed to kill all
> redundancy. Additionally, exotic applications benefit from disabling
> fsync().
>
> It's not economical to run systems at 10% of their maximum performance
> just to have enough if shit happens, unless of course you just run a
> small site, where the economic disadvantage of doing so can be tolerated.
>

Errrrrr. I am somewhat concerned about your last statement. I run the
mail system for the University here, which isn't really a big site, but
we see over a million attempts to deliver mail a day which translates
into about 46,000 real mail messages after greylisting.

We have internal mail servers which accept email from local users and
handle all internal communications and we have a pair of external mail
servers which talk to the outside world. Our mail servers are running
at a fraction of their capacity just because bad things happen too often.

All it takes is some annoying spammer out on the internet to use one of
our users as a fake "From" address and we will see hundreds of thousands
of error messages heading our way.

We've also seen cases of a trojan getting on a local user's PC which has
then sent hundreds of thousands of email messages off site.

We've also had cases where our ISP, or the firewalls, or some other
system-admin-type mistake has taken us down for a weekend, which means we
get three days of email on Monday.

So we do always plan for the unexpected and even though a mess happens
several times a year I don't need to do anything to fix it.

I have tried to run a mail system in the way that you are trying to and
I'm very happy that we have the resources here to run ours with lots of
spare capacity because it makes my life simpler.

Having said that here's what I used to do.

1. Find a way to stop whatever was generating the mess.
2. Move the input queue out of the way and restart exim.

That at least gets you to the point where current email is flowing.

3. Move the valid email back into the real queue

Easier said than done, but judicious use of "grep" on the header files
usually results in a short list of real email and then it's just a case
of moving the header and data files back into the normal queue space.

4. Delete the old queue that is now full of junk.
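
The "grep and move back" part of the steps above can be sketched roughly as follows. This is a minimal sketch under assumptions, not the procedure actually used here: it assumes an unsplit spool in which each message is a pair of files named <id>-H and <id>-D in one directory, and the function name restore_wanted, its arguments and the example pattern are made up for illustration.

```python
import os
import re
import shutil


def restore_wanted(parked_dir, live_dir, header_pattern):
    """Move messages whose -H file matches header_pattern back to live_dir.

    Returns the list of restored message IDs.
    """
    wanted = re.compile(header_pattern)
    restored = []
    for name in sorted(os.listdir(parked_dir)):
        if not name.endswith("-H"):
            continue
        msg_id = name[:-2]
        header_path = os.path.join(parked_dir, name)
        data_path = os.path.join(parked_dir, msg_id + "-D")
        if not os.path.exists(data_path):
            continue  # incomplete message, leave it parked
        with open(header_path, "r", errors="replace") as f:
            if not wanted.search(f.read()):
                continue  # looks like junk, leave it parked
        # Move the data file first, so that the live queue never holds
        # a header file without its matching data file.
        shutil.move(data_path, os.path.join(live_dir, msg_id + "-D"))
        shutil.move(header_path, os.path.join(live_dir, name))
        restored.append(msg_id)
    return restored
```

A call such as restore_wanted("/parked/input", "/live/input", r"@example\.org") would then bring back only the messages whose header file mentions that domain; both paths and the pattern are placeholders. Anything left in the parked directory afterwards is what step 4 deletes.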


If you have more than one mail server then you could take the queue onto
another system and run it there rather than slowing down the main
server. In theory you could move it onto a tmpfs filesystem and perform
an exim queue run specifically on that input queue to avoid the fsync()
delays.


Jon.


Re: Old topic again: Option to avoid fsync()?
Michael Haardt wrote:
> Any other opinions than "enforce fsync, because it works for me"?

If this can be done without impacting those that don't want to use the
feature, I don't think there's much of an argument against it. As has
been pointed out before, Exim already gives you an almost infinite
number of ways to shoot yourself (or others) in the foot. Given that
there are legitimate use-cases for this functionality, I'd vote to
include it.

Personally I may find it useful for my spamtrap MX, which is handling
close to 200k messages/week.


Bob

Re: Old topic again: Option to avoid fsync()?
> Errrrrr. I am somewhat concerned about your last statement. I run the
> mail system for the University here, which isn't really a big site, but
> we see over a million attempts to deliver mail a day which translates
> into about 46,000 real mail messages after greylisting.

Are those the two servers multiplexing your traffic between inside and
out? 46,000 messages/day is around 30 messages/minute on average and
probably 60-90 messages/minute peak, and that's two machines in total? I
see. I run as low as 400 messages/minute, peak being 1500/minute - on a
single node. I know I can reach 2000-2200, if needed. The systems for
internal delivery run at 100-200 messages/minute when operating regularly.
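
The average quoted above is simple arithmetic and can be checked as follows. The helper name per_minute is made up for illustration; the peak figures are estimates from the text and are not derived here.

```python
MINUTES_PER_DAY = 24 * 60


def per_minute(messages_per_day):
    """Convert a daily message count into an average per-minute rate."""
    return messages_per_day / MINUTES_PER_DAY


average = per_minute(46_000)
print(f"{average:.1f} messages/minute on average")  # prints 31.9
```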

> I have tried to run a mail system in the way that you are trying to and
> I'm very happy that we have the resources here to run ours with lots of
> spare capacity because it makes my life simpler.

I sure wouldn't mind a few hundred systems more to make my life simple. ;-)
But as I said: Only small sites can afford that.

Michael

Re: Old topic again: Option to avoid fsync()?
On Wed, 10 Jan 2007, B. Johannessen wrote:

> Michael Haardt wrote:
> > Any other opinions than "enforce fsync, because it works for me"?
>
> If this can be done without impacting those that don't want to use the
> feature, I don't think there's much of an argument against it.

It is clear that this is a controversial issue. Perhaps the resolution
is to add the option, but require a compile time configuration to
include the feature. Then it would certainly have zero impact on anybody
who chose not to include it in the binary. If you have it in the binary
but do not turn it on, the impact is a flag test every time Exim might
do an fsync(). I suspect this is a very small cost compared with
everything else that's going on.
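
The "flag test" idea can be illustrated with a small wrapper. This is only a sketch in Python of the general pattern, not Exim's C code: the name maybe_fsync and its disable_fsync parameter are hypothetical.

```python
import os


def maybe_fsync(fd, disable_fsync=False):
    """Flush fd to stable storage unless the flag says not to.

    Returns True if fsync() was actually called.
    """
    if disable_fsync:  # the single flag test paid on every call
        return False
    os.fsync(fd)
    return True
```

With the flag left at its default, the only extra work compared with calling fsync() directly is one boolean test, which is negligible next to the disk write itself.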

--
Philip Hazel University of Cambridge Computing Service
Get the Exim 4 book: http://www.uit.co.uk/exim-book

Re: Old topic again: Option to avoid fsync()?
> It is clear that this is a controversial issue. Perhaps the resolution
> is to add the option, but require a compile time configuration to
> include the feature. Then it would certainly have zero impact on anybody
> who chose not to include it in the binary. If you have it in the binary
> but do not turn it on, the impact is a flag test every time Exim might
> do an fsync(). I suspect this is a very small cost compared with
> everything else that's going on.

Thanks, that sounds like a perfect solution to me. :)

Michael

Re: Old topic again: Option to avoid fsync()?
Michael Haardt wrote:
>> Perhaps the resolution is to add the option, but require a compile
>> time configuration to include the feature.

Maybe the no-fsync stuff should be limited to non-daemon mode operation?
I think "exim -qff" would do the trick for Michael, (and for me)
wouldn't it? Michael?
That would at least prevent people from running "exim -bd" or "-q5m"
ordinarily. We could just ignore the no-fsync option or abort during
startup.

> Thanks, that sounds like a perfect solution to me. :)

If the Debian people will activate this switch at least in their -heavy
package, I'd second that. Andreas, do you think you can bear the risk?
At least with the above modification? I'm not sure whether I would.

lg,
daniel

Re: Old topic again: Option to avoid fsync()?
> Maybe the no-fsync stuff should be limited to non-daemon mode operation?

I don't think the delivery process knows much about running under a
queue runner spawned by a daemon or by a manually started queue runner
or as part of direct manual delivery.

> I think "exim -qff" would do the trick for Michael, (and for me)
> wouldn't it? Michael?

I don't use Exim queue runners for larger systems, because they do not
scale with a growing queue.

> That would at least prevent people from running "exim -bd" or "-q5m"
> ordinarily. We could just ignore the no-fsync option or abort during
> startup.

Unfortunately, the frequent fsync() calls still impose a large penalty
for queue runners, even if those omit them. Try running one queue runner
with fsync and the rest without, and you won't see much improvement.

> > Thanks, that sounds like a perfect solution to me. :)
>
> If the Debian people will activate this switch at least in their -heavy
> package, I'd second that. Andreas, do you think you can bear the risk?
> At least with the above modification? I'm not sure whether I would.

Ah, the joy of "distributions". There ought to be a large banner on that
compile-time switch, saying: You SHOULD (capital letters and reference
to RFC 2119) not enable this option:

3. SHOULD This word, or the adjective "RECOMMENDED", mean that there
may exist valid reasons in particular circumstances to ignore a
particular item, but the full implications must be understood and
carefully weighed before choosing a different course.

And that describes the situation. Enable it, blow up the house, and it's
YOUR FAULT. It's not like compiling in all lookups and authenticators,
which is nice for those who like to use them and not dangerous to play
with. I don't know if Debian wanted to take responsibility for enabling
it, but if so, it would make sense if they already shipped Exim with a
suitable patch by now.

Whoever wonders what Exim 5 could contain to justify a new major version:
A queue storage API like INN has for articles would be my ultimate
favourite and definitely THE feature to start closing the performance
gap to some commercial MTAs. Well, one can dream of having the best of
both worlds.

Michael

Re: Old topic again: Option to avoid fsync()?
On 2007-01-11 Daniel Tiefnig <exim@inode.at> wrote:
[...]
> If the Debian people will activate this switch at least in their -heavy
> package, I'd second that. Andreas, do you think you can bear the risk?
> At least with the above modification? I'm not sure whether I would.

I don't think we would enable this (unless it is enabled by default
upstream), since it is a controversial feature with Phil's opinion in
a rather definite direction. (I am giving Phil's opinion big weight
for the simple reason that he knows a lot more about the issue.)

However, the ultimate decision would be Marc's, not mine, since he is
doing almost all of the work for exim packaging nowadays.
cu andreas
--
The 'Galactic Cleaning' policy undertaken by Emperor Zhark is a personal
vision of the emperor's, and its inclusion in this work does not constitute
tacit approval by the author or the publisher for any such projects,
howsoever undertaken. (c) Jasper Fforde

Re: Old topic again: Option to avoid fsync()?
Michael Haardt wrote:
>> Maybe the no-fsync stuff should be limited to non-daemon mode
>> operation?
>
> I don't think the delivery process knows much about running under a
> queue runner spawned by a daemon or by a manually started queue
> runner or as part of direct manual delivery.

Ah, no. I just meant to include a check into exim's options parsing that
will abort on "exim -bd --no-fsync". (however --no-fsync will be called)

>> I think "exim -qff" would do the trick for Michael, (and for me)
>> wouldn't it? Michael?
>
> I don't use Exim queue runners for larger systems, because they do
> not scale with a growing queue.

Hmm, so what are we talking about then? :o)

> Unfortunately, the frequent fsync() calls still impose a large
> penalty for queue runners, even if those omit them. Try running one
> queue runner with fsync and the rest without, and you won't see much
> improvement.

Well, you can of course disable regular queue runs while messing around.
The listening daemon may cause some problems, but you can (re)start it
with "-odq" at least.

> Ah, the joy of "distributions".

I just thought it won't hurt to ask...

lg,
daniel

Re: Old topic again: Option to avoid fsync()?
> > I don't use Exim queue runners for larger systems, because they do
> > not scale with a growing queue.
>
> Hmm, so what are we talking about then? :o)

Exim queue runners don't deliver mails on their own, but spawn children
doing that. Your suggestion is to use a new flag that queue runners
would have to pass to those children, and of course Exim would have to
check whether that flag had been passed by a non-admin user. That works,
but it is more work to be sure you get it all right.

That's why I suggested a configuration file option. Only admins can
change it and any exim process has the same, consistent view of the
configuration.

I don't use n queue runners that scan the queue in an uncoordinated
manner, thus frequently colliding with each other, but one script that
enumerates the queue once and keeps n parallel deliveries running.
In fact, n plus a few more (one delivery may trigger further deliveries).
The actual delivery process wouldn't know the difference. You can
do nice things that way.
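
The general shape of such a script might look like the sketch below. It is not the script described above, only an illustration under assumptions: the names deliver_with_exim and drain are made up, the list of message IDs is expected to come from a single queue listing taken beforehand, and the default delivery command assumes that an exim binary on the PATH accepts -Mc followed by a message ID to run one delivery attempt.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def deliver_with_exim(msg_id, exim="exim"):
    """Ask Exim for one delivery attempt; returns the process exit status."""
    return subprocess.run([exim, "-Mc", msg_id]).returncode


def drain(message_ids, workers=4, deliver=deliver_with_exim):
    """Keep up to `workers` deliveries running over a queue listing taken once.

    Returns a dict mapping each message ID to the result of its delivery call.
    """
    ids = list(message_ids)  # enumerate the queue once, up front
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Each worker thread blocks on one delivery process at a time,
        # so at most `workers` deliveries started here run in parallel.
        results = list(pool.map(deliver, ids))
    return dict(zip(ids, results))
```

Because every message ID is handed to exactly one worker, the deliveries started by this loop do not compete with each other for the same message, which is the point of enumerating the queue only once.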

> Well, you can of course disable regular queue runs while messing around.
> The listening daemon may cause some problems, but you can (re)start it
> with "-odq" at least.

If there is any way to still accept new messages, I do that, because
otherwise I hurt whoever wants to send them.

> > Ah, the joy of "distributions".
>
> I just thought it won't hurt to ask...

It's a valid question, and I am pleasantly surprised to hear that Debian
is likely to follow Philip in not compiling that extension by default.

Michael

Re: Old topic again: Option to avoid fsync()?
Michael Haardt wrote:
> Your suggestion is to use a new flag that queue runners would have to
> pass to those children,

Not necessarily, but that's what I thought would be most useful.

> and of course Exim would have to check whether that flag had been
> passed by a non-admin user. That works, but it is more work to be sure
> you get it all right.

I now get your point.

> I don't use n queue runners that scan the queue in an uncoordinated
> manner, thus frequently colliding with each other, but one script
> that enumerates the queue once and keeps n parallel deliveries
> running.

Sounds reasonable, maybe I should try that too on our queue server. I
didn't mind so far, as it is running fine as long as the queue stays
below, say, 100k messages.

lg,
daniel

Re: Old topic again: Option to avoid fsync()?
On Thu, 11 Jan 2007, Michael Haardt wrote:

> Whoever wonders what Exim 5 could contain to justify a new major version:
> A queue storage API like INN has for articles would be my ultimate
> favourite and definitely THE feature to start closing the performance
> gap to some commercial MTAs. Well, one can dream of having the best of
> both worlds.

There are several things that Exim 5 could usefully contain, but I might
as well make it clear that it won't be my responsibility as I will be
retired. :-) At some stage making a list might be useful.

--
Philip Hazel University of Cambridge Computing Service
Get the Exim 4 book: http://www.uit.co.uk/exim-book

Re: Old topic again: Option to avoid fsync()?
On Thu, 11 Jan 2007, Philip Hazel wrote:

> It is clear that this is a controversial issue. Perhaps the resolution
> is to add the option, but require a compile time configuration to
> include the feature. Then it would certainly have zero impact on anybody
> who chose not to include it in the binary. If you have it in the binary
> but do not turn it on, the impact is a flag test every time Exim might
> do an fsync(). I suspect this is a very small cost compared with
> everything else that's going on.

OK, I've tried to make everyone happy. I have added a compile-time
option called ENABLE_DISABLE_FSYNC, and put a lot of warnings about it
in EDITME. In particular, I've said it should never be used when
compiling binaries for distribution.

If ENABLE_DISABLE_FSYNC is set, a runtime option called disable_fsync is
compiled. If the compile time option is not set, an attempt to use the
runtime option gets "unknown option".

This code is committed to CVS and so will be in tonight's snapshot.

Philip

--
Philip Hazel, University of Cambridge Computing Service.

Re: Old topic again: Option to avoid fsync()?
> OK, I've tried to make everyone happy. I have added a compile-time
> option called ENABLE_DISABLE_FSYNC, and put a lot of warnings about it
> in EDITME. In particular, I've said it should never be used when
> compiling binaries for distribution.
>
> If ENABLE_DISABLE_FSYNC is set, a runtime option called disable_fsync is
> compiled. If the compile time option is not set, an attempt to use the
> runtime option gets "unknown option".

Perfect! I just tried it and it works great, reducing I/O from 400
down to about 20 operations per second. It's back at 400 now, because
that's how things are meant to work, but it is good to know there is an
emergency exit in reach.

Thanks a lot for your work,

Michael
