Mailing List Archive

need advice for high traffic network
Hi,

I have a network (LAN) consisting of (mostly) gigabit ethernet on a few
switches. Most of the traffic is taken up by small HTTP requests. All
computers are running Fedora (Core 4 through 7).

I've been having some problems with servers not being accessible and
just last night noticed that the problems disappear when I turn off the
firewall.
What happens is that there are lots of small HTTP requests and
apparently at some point the firewall starts dropping or disallowing new
connections. This has been verified with both ab (apache benchmark) and
plain SSH - a lot of times the connections time out or take a long time
to get established.
There are ~25 rules total (as listed by 'iptables -L')

As a temporary measure, I've turned off firewalls on more of the servers
until I can figure out a better solution - I'd like to have a firewall
on each server, but performance is more important.

I'm looking at nf-HiPAC right now - will probably try it some time soon.
Beyond that, I'm out of ideas for the moment.

Is there anything else I can do?
Any other firewalls? Tricks with rearranging the rules?
etc...


Thanks!



Notes:
* Problems do not seem to be limited to any specific Fedora version or
hardware.
* external firewalls are out of the question, unless they're really
small & cheap: there are >40 servers in the internal network and the
number is growing
Re: need advice for high traffic network [ In reply to ]
I'll bet you are hitting your max connections

check the value of net.ipv4.netfilter.ip_conntrack_max

David Lang

On Thu, 19 Jul 2007, Konstantin Svist wrote:

> Date: Thu, 19 Jul 2007 15:17:00 -0700
> From: Konstantin Svist <kostya@relevad.com>
> To: netfilter@lists.netfilter.org
> Subject: need advice for high traffic network
Re: need advice for high traffic network [ In reply to ]
# cat /proc/sys/net/netfilter/nf_conntrack_max
65536

somehow I doubt I have THAT many connections :)

highest load right now is around 600 requests per second, and ~60%
complete within 10ms - the rest complete within 200ms (unless the
firewall is turned on - then some start timing out 3s and up)



David Lang wrote:
> I'll bet you are hitting your max connections
>
> check the value of net.ipv4.netfilter.ip_conntrack_max
>
> David Lang
Re: need advice for high traffic network [ In reply to ]
> I'm looking at nf-HiPAC right now - will probably try it some time soon.
> Beyond that, I'm out of ideas for the moment.

nf-HiPAC won't help there if you just have 25 rules
( => http://people.netfilter.org/kadlec/nftest.pdf ). The problem is
very likely down to you using the default parameters for the conntrack hash table,
just like the other reply indicated.
Re: need advice for high traffic network [ In reply to ]
as I said, the current (and default) value is 65536
what would you suggest changing it to?

Re: need advice for high traffic network [ In reply to ]
On Thu, Jul 19, 2007 at 03:40:27PM -0700, Konstantin Svist wrote:
> # cat /proc/sys/net/netfilter/nf_conntrack_max
> 65536
>
> somehow I doubt I have THAT many connections :)
>
> highest load right now is around 600 requests per second, and ~60%
> complete within 10ms - the rest complete within 200ms (unless the
> firewall is turned on - then some start timing out 3s and up)

600/s * 120s ip_conntrack_tcp_timeout_time_wait = 72000 entries

( => http://www.isi.edu/touch/pubs/infocomm99/infocomm99-web/ )

You might want to try to reduce those timers or just push
up your hash bucket = max entry values to maybe twice that.
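
The estimate above can be reproduced with a few lines of Python. This is
only a rough steady-state sketch: it assumes every request opens a new TCP
connection and that each tracked entry lingers for the full TIME_WAIT
timeout. The function name and the 2x headroom factor are illustrative
choices, not anything taken from netfilter itself.

```python
def steady_state_entries(new_conns_per_sec: float, timeout_secs: float) -> int:
    """Rough conntrack entry count: arrival rate times entry lifetime."""
    return int(new_conns_per_sec * timeout_secs)

rate = 600             # requests per second reported in this thread
time_wait = 120        # ip_conntrack_tcp_timeout_time_wait, in seconds
conntrack_max = 65536  # current limit reported in this thread

entries = steady_state_entries(rate, time_wait)
print(entries, "entries expected vs. limit", conntrack_max)
print("table overflows:", entries > conntrack_max)
print("limit with 2x headroom (assumed factor):", 2 * entries)
```

Lowering the timeout or raising the limit both move the estimate back
under the table size, which is why either change should help here.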
Re: need advice for high traffic network [ In reply to ]
On Thu, 19 Jul 2007, Konstantin Svist wrote:

> as I said, the current (and default) value is 65536
> what would you suggest changing it to?

I have it set to 256000 on my low traffic boxes and 1024000 on my high traffic
boxes.

David Lang

Re: need advice for high traffic network [ In reply to ]
How do I reduce those timers?


Thomas Jacob wrote:
> On Thu, Jul 19, 2007 at 03:40:27PM -0700, Konstantin Svist wrote:
>
>> # cat /proc/sys/net/netfilter/nf_conntrack_max
>> 65536
>>
>> somehow I doubt I have THAT many connections :)
>>
>> highest load right now is around 600 requests per second, and ~60%
>> complete within 10ms - the rest complete within 200ms (unless the
>> firewall is turned on - then some start timing out 3s and up)
>>
>
> 600/s * 120s ip_conntrack_tcp_timeout_time_wait = 72000 entries
>
> ( => http://www.isi.edu/touch/pubs/infocomm99/infocomm99-web/ )
>
> You might want to try to reduce those timers or just push
> up your hash bucket = max entry values to maybe twice that.
Re: need advice for high traffic network [ In reply to ]
On Thu, Jul 19, 2007 at 04:17:20PM -0700, Konstantin Svist wrote:
> How do I reduce those timers?

echo <VALUE> > /proc/sys/net/ipv4/netfilter/<SETTING>
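
The <SETTING> placeholder is a file name under /proc/sys, and a dotted
sysctl name maps onto that tree by replacing the dots with slashes. A small
Python sketch of that mapping follows; the helper name is made up for
illustration, and it only builds the path (actually writing to the file
needs root).

```python
def sysctl_to_proc_path(name: str) -> str:
    """Map a dotted sysctl name to its file under /proc/sys."""
    return "/proc/sys/" + name.replace(".", "/")

# The TIME_WAIT timer named earlier in this thread.
path = sysctl_to_proc_path(
    "net.ipv4.netfilter.ip_conntrack_tcp_timeout_time_wait"
)
print(path)
```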
Re: need advice for high traffic network [ In reply to ]
Sorry, I meant:
Which parameters are those and what values would you recommend?

Thanks!


Re: need advice for high traffic network [ In reply to ]
Hmm, not sure really, but lower TIME_WAIT settings should keep
your conntrack table afloat at least ;-)

I'd rather increase ip_conntrack_max and ip_conntrack_buckets
to the values suggested by David; see:


http://www.netfilter.org/documentation/FAQ/netfilter-faq-3.html#ss3.7

Re: need advice for high traffic network [ In reply to ]
alright, so far I have:

net.ipv4.tcp_window_scaling = 1
net.ipv4.tcp_syncookies = 1
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
net.ipv4.tcp_rmem = 4096 87380 16777216
net.ipv4.tcp_wmem = 4096 65536 16777216
net.ipv4.tcp_no_metrics_save = 1
net.ipv4.netfilter.ip_conntrack_max = 1024000


what would you recommend for the buckets? is default (8192) reasonable?




Re: need advice for high traffic network [ In reply to ]
On Thu, Jul 19, 2007 at 05:18:19PM -0700, Konstantin Svist wrote:
> alright, so far I have:
>
> net.ipv4.tcp_window_scaling = 1
> net.ipv4.tcp_syncookies = 1
> net.core.rmem_max = 16777216
> net.core.wmem_max = 16777216
> net.ipv4.tcp_rmem = 4096 87380 16777216
> net.ipv4.tcp_wmem = 4096 65536 16777216
> net.ipv4.tcp_no_metrics_save = 1

AFAIK those values do not influence netfilter performance,
just local tcp socket performance.

> net.ipv4.netfilter.ip_conntrack_max = 1024000
>
>
> what would you recommend for the buckets? is default (8192) reasonable?

At the moment I am always setting this to the value of ip_conntrack_max
(on the theory that this should result in constant lookup times), as I
can spare the memory. But I haven't run any real performance tests with
lower hash bucket counts....

The FAQ says though, that one should use odd hash bucket counts, so you
might want to decrease that by one.
Re: need advice for high traffic network [ In reply to ]
You are running firewalls on the servers AND the routers?

Why?

-gc


Re: need advice for high traffic network [ In reply to ]
On Fri, 20 Jul 2007, Thomas Jacob wrote:

> On Thu, Jul 19, 2007 at 05:18:19PM -0700, Konstantin Svist wrote:
>> alright, so far I have:
>>
>> net.ipv4.tcp_window_scaling = 1
>> net.ipv4.tcp_syncookies = 1
>> net.core.rmem_max = 16777216
>> net.core.wmem_max = 16777216
>> net.ipv4.tcp_rmem = 4096 87380 16777216
>> net.ipv4.tcp_wmem = 4096 65536 16777216
>> net.ipv4.tcp_no_metrics_save = 1
>
> AFAIK those values do not influence netfilter performance,
> just local tcp socket performance.
>
>> net.ipv4.netfilter.ip_conntrack_max = 1024000
>>
>>
>> what would you recommend for the buckets? is default (8192) reasonable?
>
> At the moment I am always setting this to the value of ip_conntrack_max
> (on the theory that this should result in constant lookup times), as I
> can spare the memory. But I haven't run any real performance tests with
> lower hash bucket counts....

you should run the tests. doing a hash across too many buckets ends up costing
performance as well.

you want the list per bucket to not be too long, but you also don't want to
spend more effort and ram on empty buckets.

setting conntrack_max equal to the number of buckets doesn't mean that you will
have one entry in each bucket, it means that you will have a lot of empty
buckets and other buckets with several items in them.

hash algorithms have collisions (cases where different input generates the same
output). cryptographically strong hashes mean that it's really hard to create a
second input that results in the same output as some other existing input, but
collisions do happen there.

> The FAQ says though, that one should use odd hash bucket counts, so you
> might want to decrease that by one.

it's not unusual for simple (i.e. cheap to use) hash algorithms to have
pathological results for specific sizes. ideally you want the bucket count to be
a prime number; if it's not (for example an even power of 2) you can get
situations where it only puts things in a very small number of buckets.

David Lang
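
The point about empty buckets can be checked with a quick simulation. This
is a sketch under a strong assumption: the hash is modelled as an ideal
uniform random choice of bucket, which is not the kernel's actual hash
function. With as many entries as buckets, roughly 1/e (about 37%) of the
buckets should stay empty while others hold several entries.

```python
import random
from collections import Counter

def bucket_occupancy(entries: int, buckets: int, seed: int = 1) -> Counter:
    """Place entries into uniformly random buckets; count entries per bucket."""
    rng = random.Random(seed)
    return Counter(rng.randrange(buckets) for _ in range(entries))

n = 65536  # entries == buckets, as in the setup discussed above
occupancy = bucket_occupancy(n, n)

# Buckets that never received an entry do not appear in the Counter.
empty_fraction = 1 - len(occupancy) / n
longest_chain = max(occupancy.values())
print(f"empty buckets: {empty_fraction:.1%}, longest chain: {longest_chain}")
```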
Re: need advice for high traffic network [ In reply to ]
> you should run the tests. doing a hash across too many buckets ends up
> costing performance as well.

Yes, I should :=)

> you want the list per bucket to not be too long, but you also don't want to
> spend more effort and ram on empty buckets.

What's the extra effort when you have the RAM to spare? At worst
you might slightly reduce the cache hit rate.

> setting conntrack_max equal to the number of buckets doesn't mean that you
> will have one entry in each bucket, it means that you will have a lot of
> empty buckets and other buckets with several items in them.

Right, but it's more likely to have short bucket lists if you have
more hash buckets, given the same number of connections, isn't it?
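
A back-of-the-envelope version of that argument, again assuming an ideal
uniform hash rather than the real kernel one: the expected number of
entries sharing a bucket with any given entry (itself included) is
1 + (n - 1) / m for n entries and m buckets. The function name is
illustrative; the entry and bucket counts are the ones mentioned in this
thread.

```python
def expected_chain_length(entries: int, buckets: int) -> float:
    """Expected entries in the bucket of a given entry, itself included,
    under ideal uniform hashing."""
    return 1 + (entries - 1) / buckets

entries = 72000  # estimate from earlier in the thread
for buckets in (8192, 65536, 1024000):
    print(buckets, round(expected_chain_length(entries, buckets), 2))
```

The chain length shrinks toward 1 as buckets are added, so the gain per
extra bucket gets smaller and smaller once the bucket count passes the
number of entries.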

> >The FAQ says though, that one should use odd hash bucket counts, so you
> >might want to decrease that by one.
>
> it's not unusual for simple (i.e. cheap to use) hash algorithms to have
> pathological results for specific sizes. ideally you want the bucket count
> to be a prime number; if it's not (for example an even power of 2) you can
> get situations where it only puts things in a very small number of buckets.

As far as I understand, the Jenkins hash used internally in netfilter
and other parts of the Linux kernel isn't just your average
textbook hash, but something with quite a lot of thought and analysis
behind it:

=> http://www.burtleburtle.net/bob/hash/doobs.html
Re: need advice for high traffic network [ In reply to ]
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1





ROFL! Why read docs and get the info for yourself when others can spoon
feed ya eh?


Thanks,

Ron DuFresne

On Thu, 19 Jul 2007, Konstantin Svist wrote:

> Sorry, I meant:
> Which parameters are those and what values would you recommend?
>
> Thanks!
>
>
> Thomas Jacob wrote:
>> On Thu, Jul 19, 2007 at 04:17:20PM -0700, Konstantin Svist wrote:
>>
>>> How do I reduce those timers?
>>>
>>
>> echo <VALUE> > /proc/sys/net/ipv4/netfilter/<SETTING>
>>
>>

- --
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
admin & senior security consultant: sysinfo.com
http://sysinfo.com
Key fingerprint = 9401 4B13 B918 164C 647A E838 B2DF AFCC 94B0 6629

...We waste time looking for the perfect lover
instead of creating the perfect love.

-Tom Robbins <Still Life With Woodpecker>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.5 (GNU/Linux)

iD8DBQFGt21bst+vzJSwZikRAvqAAJ9+jrvhFa8BrM8oh4X/tWYuZee4FACeK5vF
Zcf8EsQBMhHxGJ8io6Awt4U=
=TUol
-----END PGP SIGNATURE-----