On 2 April 2017 at 20:47, Charles Cazabon
<search-web-for-address@pyropus.ca> wrote:
> Manvendra Bhangui <mbhangui@gmail.com> wrote:
>> On 1 April 2017 at 22:36, Charles Cazabon wrote:
>>
>> > reasonable operating systems and even vaguely-modern hardware. Do you
>> > actually have data showing that this minimal overhead is a bottleneck in
>> > your situation? If not, why are you trying to eliminate it?
>>
>> I do have a situation where I open around 23 control files in the
>> qmail-smtpd process. I want to move this to the main() of the
>> tcpserver process and avoid loading it again and again.
> [...]
>> No I do not have any data. It is definitely not a bottleneck. It's
>> just that the idea of loading the control files just once does look
>> good.
>
> I'm afraid you're fooling yourself. Any control files that are opened and
> read regularly - i.e. if you're getting at least a few SMTP connections every
> minute - will be satisfied from the cache. And qmail's reading of those files
> doesn't use a lot of CPU time or other resources.
>
> So you aren't really saving CPU, or anything else for that matter. You've
> just introduced complexity, and probably security problems, for no real
> benefit.
>
> I *strongly* urge you to profile the performance of vanilla qmail/tcpserver
> before you decide to "fix" something and try to make it "look good". I would
> bet dollars to doughnuts that you will find you are trying to optimize
> something which is only responsible for 0.01% of the program's work in the
> first place.
>
> As djb himself says, measure, don't speculate.

So I finally measured, and now I have data. It is so consistent that
even I am surprised. Using dlopen() to load the control files in
tcpserver gives a huge gain when the control files are large. I
finally managed to do some extreme testing. When the load was light,
the dlopen() method gave only a very minor performance improvement
over the case where tcpserver uses the pathexec function to exec
qmail-smtpd. In the extreme case, where I loaded a 25 MiB badmailfrom,
there is a huge difference. In the case of indimail-mta, I loaded
50 MiB of control files. With the traditional fork()/exec() method,
every invocation of qmail-smtpd loads the control file, and my laptop
(a Sony Vaio with a hybrid drive) becomes unresponsive for almost an
hour. netqmail gives the same pathetic response. The testing was done
by firing 2000 instances of swaks (a Perl program) in parallel to send
emails through SMTP. I used indimail-mta's qmail-nullqueue to throw
the emails into the void. For netqmail too, I set QMAILQUEUE to
indimail-mta's qmail-nullqueue to keep the testing parameters equal.

In all the tests, dlopen() gave a far better response than the
existing tcpserver fork() --> exec() mechanism, and the load never
went above 170. With tcpserver doing fork() and then exec of
qmail-smtpd, and each invocation of qmail-smtpd loading the 25 MiB
control file, the load shot up to > 1000, after which the laptop
remained frozen until the qmail-smtpd processes completed. I used a
2012-model Sony Vaio laptop with 8 GB RAM and a 1 TB SSHD.
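
For anyone who wants to repeat the comparison, the peak load during a
run could be captured with something like the sketch below. This is
only an illustration, not the method used for the figures above. The
function name peak_load is hypothetical, and the default path
/proc/loadavg assumes Linux.

```shell
#!/bin/sh
# Hypothetical helper: print the highest 1-minute load average seen.
# Usage: peak_load SAMPLES INTERVAL [FILE]
peak_load() {
    samples=$1
    interval=$2
    file=${3:-/proc/loadavg}   # assumption: Linux-style load average file
    peak=0
    n=0
    while [ "$n" -lt "$samples" ]; do
        # The first field of the file is the 1-minute load average.
        read -r cur rest < "$file"
        # sh has no floating-point comparison, so let awk pick the larger value.
        peak=$(awk -v a="$peak" -v b="$cur" 'BEGIN { print ((b + 0 > a + 0) ? b : a) }')
        n=$((n + 1))
        # Sleep between samples, but not after the last one.
        if [ "$n" -lt "$samples" ]; then
            sleep "$interval"
        fi
    done
    echo "$peak"
}
```

Running `peak_load 600 1` in a second terminal while the test is in
progress would sample the load once a second for ten minutes and print
the highest value at the end.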

With dlopen(), I never saw qmail-smtpd among the top processes, as the
SMTP function is now performed by tcpserver itself. However, I did not
notice the top screen filling up with tcpserver processes. With the
fork()/exec() mechanism, top showed only qmail-smtpd processes.

The results of these tests are tabulated in the online Google Sheet below.
https://docs.google.com/spreadsheets/d/1lw0V2NRkUHxBVJuqiigVm-ggYiTiPJ9AoE8dMwviyew/edit?usp=sharing

I used this script for testing.

#!/bin/sh
count=0
while true
do
    swaks --from testuser01@example.com --to testuser02@example.com > /dev/null 2>&1 &
    swaks --from testuser01@example.com --to testuser02@example.com > /dev/null 2>&1 &
    swaks --from testuser01@example.com --to testuser02@example.com > /dev/null 2>&1 &
    swaks --from testuser01@example.com --to testuser02@example.com > /dev/null 2>&1 &
    swaks --from testuser01@example.com --to testuser02@example.com > /dev/null 2>&1 &
    swaks --from testuser01@example.com --to testuser02@example.com > /dev/null 2>&1 &
    swaks --from testuser01@example.com --to testuser02@example.com > /dev/null 2>&1 &
    swaks --from testuser01@example.com --to testuser02@example.com > /dev/null 2>&1 &
    swaks --from testuser01@example.com --to testuser02@example.com > /dev/null 2>&1 &
    swaks --from testuser01@example.com --to testuser02@example.com > /dev/null 2>&1 &
    count=`expr $count + 1`
    if [ $count -eq 200 ] ; then
        break
    fi
done
wait
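
The same load pattern (200 batches of 10 background clients, 2000 in
total) can also be written with nested loops, which makes it easier to
vary the concurrency between runs. This is only a sketch, and the
function name run_parallel is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: start BATCHES x PER_BATCH copies of a command in the
# background, discard their output, then wait for all of them to finish.
# Usage: run_parallel BATCHES PER_BATCH COMMAND [ARGS...]
run_parallel() {
    batches=$1
    per_batch=$2
    shift 2
    i=0
    while [ "$i" -lt "$batches" ]; do
        j=0
        while [ "$j" -lt "$per_batch" ]; do
            # "$@" is the command to run, e.g. swaks with its options.
            "$@" > /dev/null 2>&1 &
            j=$((j + 1))
        done
        i=$((i + 1))
    done
    # Block until every background job has exited.
    wait
}
```

For example, `run_parallel 200 10 swaks --from testuser01@example.com
--to testuser02@example.com` would start the same 2000 swaks clients
as the script above.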

The control files used for testing were deliberately huge (about 25 MiB).

$ ls -l /etc/indimail/control/blackholedsenders /etc/indimail/control/badmailfrom /var/qmail/control/badmailfrom
-rw-rw-r--. 1 mbhangui mbhangui 26889998 Apr 4 11:00 /etc/indimail/control/badmailfrom
lrwxrwxrwx. 1 root root 11 Apr 4 18:58 /etc/indimail/control/blackholedsenders -> badmailfrom
-rw-r--r--. 1 root root 26889998 Apr 4 17:03 /var/qmail/control/badmailfrom

Now the bad news :( I tried dlmopen() and it is not working. strace
gives the error "process tcpserver runs in x32 mode". There are no
32-bit libraries loaded. I just don't know what is happening, so I
have gone back to using dlopen(), which loads into the same namespace
as tcpserver itself.