Mailing List Archive

Re: [nsp] MSFC2 128,000 route limitation
On Wed May 15 2002 - 07:53:32 EDT, Ian Cox wrote:

> The TCAM that holds the FIB table is capable of holding 256,000 entries.
> Without unicast RPF checking turned on the maximum number of unicast
> entries that can be held in the hardware FIB table is 244,000. The
> remaining 12,000 entries are reserved for multicast routes. If unicast RPF
> checking is enabled then the number of routes that are held in the TCAM is
> halved.

> You can exceed the capacity of the hardware forwarding table, and the
> consequences are that the routes that are not programmed into the TCAM that
> holds the FIB table will be switched in software by the MSFC2 / RP.

I have apparently run into this limitation, with much worse consequences
(running Sup2/MSFC2 hybrid). The supervisor CPU shot up to 100%, and all
updates from the MSFC to the supervisor/PFC stopped. This happened in both
of a pair of redundant 6500s, bringing both down and leaving me unable to
bring them back up with a full routing table.

Cisco TAC found bug CSCdw89942, and said the internal notes recommend using
the "set mls cef per-prefix-stats disable" to reduce the number of entries.

It appears that at this point the limitation is not something to take
lightly. Reaching it (at least under hybrid) apparently brings everything
down. There is no software yet available that fixes this, and the only
workaround is to take measures to reduce your CEF table size (such as
turning off per-prefix-stats).

For perspective, the routers that failed each see two BGP feeds of full
Internet routes, as well as about 12 OSPF routes (each of which has 2 or 3
paths). This doesn't seem like a particularly large number of routes to me;
however, it certainly exceeds the limit listed in the bug of 50,000 routes
with dual paths.

Is there anywhere I can get a count of the actual number of entries in use
and/or free? Or is the only way to tell to show the CEF table size, multiply
it if you have unicast RPF on, and then make sure that is less than 244,000?
I want to go through all my 6500s and make sure I'm not about to hit the
limit on any of them (some are hybrid and some are native). The thought of
all my 6500s falling over at once and staying down because I reached the
maximum limit on routes scares me greatly.
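
For what it's worth, the manual check described here is simple enough to
script. A minimal Python sketch, assuming the figures from Ian's post
(244,000 usable unicast entries, halved when unicast RPF checking is on);
the route count itself still has to be read off each box:

# Rough FIB TCAM headroom check. The 244,000 figure and the uRPF
# halving are taken from Ian's post above; the example route count
# is illustrative.
def fib_headroom(unicast_routes, urpf_enabled=False):
    capacity = 244000 // 2 if urpf_enabled else 244000
    return capacity - unicast_routes

print(fib_headroom(113000))        # 131000 entries to spare
print(fib_headroom(113000, True))  # 9000 - uncomfortably close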
Re: [nsp] MSFC2 128,000 route limitation [ In reply to ]
Hi Matt,

I am wondering how much memory you have installed on the SUP2. Could
you share this information with us?

-simagami
Re: [nsp] MSFC2 128,000 route limitation [ In reply to ]
The bug you cite has nothing to do with filling up the FIB TCAM; it has to
do with filling up the memory that contains the adjacency rewrite
information. Filling up either of these two resources will cause problems.
If the problem you are running into is the DDTS you refer to, then an
accurate description of the problem is:

[snip]

only with a large network configuration in which many prefixes have
multiple paths to them, resulting in an adjacency exception condition
(a condition where the NMP runs out of adjacency table entries) on the NMP.
This happens because we do not share adjacencies among prefixes even if
they have the same multiple paths ... Because of constant network updates,
some of the adjacencies get deleted, and when the NMP comes out of adj
exception, we issue a reload of the FIB/ADJ table. This cycle goes on and
on, resulting in the high CPU on the NMP.

[end snip]

If you only have 50k routes, then they only consume 50k TCAM entries. The
structure of the forwarding information looks like this:

FIB TCAM                 Adjacency Table
+-----------+            +------------------+
| 1.0.0.0/8 | --1:N--<   | path1 rewrite 1  |
| 2.1.0.0/16|            | path2 rewrite 2  |
|           |            | path3 rewrite 3  |
+-----------+            +------------------+
 256k entries             256k entries

If you have two parallel paths and 50k prefixes, with CatOS you would
consume 50k x 2 entries in the adjacency table. In IOS for the Catalyst
6000 this is done differently, and you would consume just 2 entries, since
the prefixes can share adjacency table entries.
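
To make that difference concrete, here is a minimal Python model of the two
allocation strategies (a sketch only; the actual NMP/RP programming logic
is more involved):

# CatOS: one adjacency entry per (prefix, path) pair - no sharing.
# Native IOS: one entry per path in each *distinct* path set, shared by
# every prefix that points at that same set of paths.
def adj_used_catos(num_prefixes, paths_per_prefix):
    return num_prefixes * paths_per_prefix

def adj_used_native(path_sets):
    return sum(len(s) for s in set(path_sets))

print(adj_used_catos(50000, 2))                      # 100000 entries
print(adj_used_native([("path1", "path2")] * 50000)) # 2 entries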


I have not dealt with CatOS on the platform for over 18 months, only IOS
for the Catalyst 6000. IOS for the Catalyst 6000 does not have this
problem; it handles the adjacency table programming by allowing prefixes to
share adjacency entries. The way to show how much FIB TCAM has been
consumed on IOS for the Catalyst 6000 is:

tromso#sh mls cef summary

tromso-sp#
Total CEF switched packets: 0000019424896725
Total CEF switched bytes: 0001857776306791
Total routes: 110568
IP unicast routes: 110555
IPX routes: 0
IP multicast routes: 13
tromso#


Looking at the manual, the command for CatOS is "show mls cef" to get the
equivalent information.

To get the number of entries used in the adjacency table for IOS for
Catalyst 6000 use:

tromso#sh mls cef adjacency count

tromso-sp#
Total adjacencies: 24
tromso#


The only similar count for adjacency usage I can see for a system running
CatOS is under "sh polaris fibmgr usage"


Ian
Re: [nsp] MSFC2 128,000 route limitation [ In reply to ]
Excellent. Thank you. This type of detailed technical explanation is
exactly what I needed. TAC always gave me descriptions that weren't
detailed enough for me to figure out exactly what was happening. It was
clear that something related to too many routes was filling up, but that
was as detailed as it got.

Is there any way to see how much memory is used (or free) for the adjacency
table? As I mentioned before, these boxes didn't really have anything but
dual bgp feeds (of full Internet routes) and OSPF providing two equal cost
paths to each BGP next-hop. This doesn't seem like a particularly large or
unusual situation to me. If anything I see this as more of a starting point
configuration for initial deployment - that will only get larger with time
and additional customers. I understand that the command "set mls cef
per-prefix-stats disable" has reduced this (by half?) but I'd sleep better
if I knew how close I was to hitting this limitation again.

These specific switches are running hybrid. This decision was made because
this situation calls for mostly switching, with very few router interfaces.
Basically there are just two upstream VLAN interfaces (two backbones), then a
downstream customer interface. Other than those interfaces, it's all
switching. CatOS's "hitless" upgrades plus the ability to upgrade the
routers without affecting switching seemed a big plus. This allows router
upgrades without affecting even customers single-homed directly to these
switches. Would you say the architectural improvements (such as the change
in adjacency table handling you mentioned) are such that you recommend
native IOS in all situations?
Re: [nsp] MSFC2 128,000 route limitation [ In reply to ]
At 04:36 PM 8/23/2002 -0400, Matt Buford wrote:
>Excellent. Thank you. This type of detailed technical explanation is
>exactly what I needed. TAC always gave me descriptions that weren't
>detailed enough for me to figure out exactly what was happening. It was
>clear that something was filling up related to too many routes, but that was
>as detailed as it got.
>
>Is there any way to see how much memory is used (or free) for the adjacency
>table? As I mentioned before, these boxes didn't really have anything but
>dual bgp feeds (of full Internet routes) and OSPF providing two equal cost
>paths to each BGP next-hop. This doesn't seem like a particularly large or
>unusual situation to me. If anything I see this as more of a starting point
>configuration for initial deployment - that will only get larger with time
>and additional customers. I understand that the command "set mls cef
>per-prefix-stats disable" has reduced this (by half?) but I'd sleep better
>if I knew how close I was to hitting this limitation again.

It was at the end of my last email: "sh polaris fibmgr usage" is the CatOS
command that has this information.

[snip]

...
Total FIB entries: 262144
Allocated FIB entries: 107840
Free FIB entries: 154304
FIB entries used for IP ucast: 107839

...

Total adjacencies: 262144
Allocated adjacencies: 216332
Free adjacencies: 45812

...

[end snip]
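
If you want to watch this over time, the counters are easy to scrape from a
saved capture of that command. A throwaway Python sketch (the field names
are taken from the snippet above; nothing else is assumed about the full
command output):

import re

def usage_pct(capture, kind):
    # kind is "FIB entries" or "adjacencies"
    total = int(re.search(r"Total %s:\s*(\d+)" % kind, capture).group(1))
    used = int(re.search(r"Allocated %s:\s*(\d+)" % kind, capture).group(1))
    return 100.0 * used / total

capture = """Total FIB entries: 262144
Allocated FIB entries: 107840
Total adjacencies: 262144
Allocated adjacencies: 216332"""

print("FIB: %.1f%% used" % usage_pct(capture, "FIB entries"))  # 41.1%
print("adj: %.1f%% used" % usage_pct(capture, "adjacencies"))  # 82.5%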



>These specific switches are running hybrid. This decision was made because
>this situation calls for mostly switching - with very few router interfaces.
>Basically there's just two upstream vlan interfaces (two backbones) then a
>downstream customer interface. Other than those interfaces, it's all
>switching. CatOS's "hitless" upgrades plus the ability to upgrade the
>routers without affecting switching seemed a big plus. This allows router
>upgrades without affecting even customers single-homed directly to these
>switches. Would you say the architectural improvements (such as the change
>in adjacency table handling you mentioned) are such that you recommend
>native IOS in all situations?

IOS for Catalyst 6000 handles large routing tables with parallel paths much
more efficiently than CatOS does today. HA is much better in CatOS, but over
the next year IOS will end up at the same level. Asking me which one to use
is not really fair because I'm totally biased towards IOS for Catalyst 6000 :)


Ian
Re: [nsp] MSFC2 128,000 route limitation [ In reply to ]
Some weeks ago we wanted to have a full BGP table in a 6500 (we wanted to
test if it would cope with it after a 128-to-512 memory upgrade on the
MSFC).

The machine runs native IOS and has more than 30 BGP peerings with a total
of around 7000 routes.

First we hit it with around 60000-70000 routes from a border router. Things
weren't fine, but it was almost OK, meaning that we were running really low
on free memory on the SUP2, which had 128M. At the end of this process we
had around 4-5 Mbytes free on the SUP2.

My questions are:

- are the prefix-stats using memory from the SUP or from the MSFC?
- is there any part number for upgrading the memory on the SUP2?
- has anyone tried to install a 512M SODIMM (like the one on the MSFC2,
  I think they are identical) in the SUP2?
- from the memory consumption point of view, which is using more memory -
  the hybrid configuration or the native IOS configuration?
- in these circumstances (with 128M on SUP2 and 512M on MSFC2), what should
  I use if I really want to have the full BGP table?

Thanks for your time.

--

George Boulescu
Senior Network Engineer
RoEduNet Bucharest NOC
CCAI, CCNA
Re: [nsp] MSFC2 128,000 route limitation [ In reply to ]
> - has anyone tried to install a 512M SODIMM (like the one on the MSFC2,
> I think they are identical) in the SUP2?

We did something similar - after upgrading the MSFC2 on one box from 256
to 512 MB, we used the 256 MB SODIMM to upgrade the Sup2 on another box
from 128 MB to 256 MB. Worked like a charm.

> - in these circumstances (with 128M on SUP2 and 512M on MSFC2), what should
> I use if I really want to have the full BGP table?

You really want 256 MB on the Sup2. This is the memory consumption on a
6509 running native IOS with 256 MB on the Sup2 and 512 MB on the MSFC2,
3 full views and a total of 47 BGP peers, 12.1(11b)E3:

Sup2:
Head Total(b) Used(b) Free(b) Lowest(b) Largest(b)
Processor 431BD600 180627968 57950684 122677284 121552512 122204372
I/O DE00000 35651584 9157436 26494148 25563024 23759292

MSFC2:
Head Total(b) Used(b) Free(b) Lowest(b) Largest(b)
Processor 41CCD7A0 471017568 157273820 313743748 287910784 186499888
I/O DE00000 35651584 8884708 26766876 24654184 26729532

This is from a 6506 running native IOS with 256 MB on the Sup2 and 512 MB
on the MSFC2, 2 full views and a total of 12 BGP peers, 12.1(11b)E4:

Sup2:
Head Total(b) Used(b) Free(b) Lowest(b) Largest(b)
Processor 431C08A0 180615008 55297044 125317964 125229812 125214812
I/O DE00000 35651584 9158404 26493180 25726048 18532092

MSFC2:
Head Total(b) Used(b) Free(b) Lowest(b) Largest(b)
Processor 41CD5980 470984320 142732016 328252304 319569860 193600640
I/O DE00000 35651584 8754888 26896696 26800856 26848956

Steinar Haug, Nethelp consulting, sthaug@nethelp.no
Re: [nsp] MSFC2 128,000 route limitation [ In reply to ]
>Cisco TAC found bug CSCdw89942, and said the internal notes recommend using
>the "set mls cef per-prefix-stats disable" to reduce the number of entries.

Never seen this command. What CatOS release is this in? Can't find it in the
docs.

--
marc
Re: [nsp] MSFC2 128,000 route limitation [ In reply to ]
Something is clicking here.

We have a 6509, Sup2, MSFC2, in hybrid, known as switch8.oct and msfc1.oct
(same physical box):

switch8.oct> sho ver
[ ... ]
DRAM FLASH NVRAM
Module Total Used Free Total Used Free Total Used Free
------ ------- ------- ------- ------- ------- ------- ----- ----- -----
1 130944K 91643K 39301K 32768K 23031K 9737K 512K 377K 135K


msfc1.oct#sho mem
Head Total(b) Used(b) Free(b) Lowest(b) Largest(b)
Processor 41C77780 473467008 124689452 348777556 212393264 140669952
I/O E000000 33554432 5466480 28087952 18835872 21579388


msfc1.oct#sh ip bgp sum
BGP router identifier 209.123.12.59, local AS number 8001
BGP table version is 19384378, main routing table version 19384378
112889 network entries and 241992 paths using 19887723 bytes of memory
63868 BGP path attribute entries using 3832440 bytes of memory
2 BGP rrinfo entries using 48 bytes of memory
36775 BGP AS-PATH entries using 938972 bytes of memory
160 BGP community entries using 4898 bytes of memory
134038 BGP route-map cache entries using 2144608 bytes of memory
0 BGP filter-list cache entries using 0 bytes of memory
Dampening enabled. 48 history paths, 11 dampened paths
3 received paths for inbound soft reconfiguration
BGP activity 317162/22307871 prefixes, 11082573/10840581 paths, scan interval 15 secs

There are about 20 peers on this, three of which are full views.


Separately, we have a 6509, Sup1, MSFC2, in integrated:

msfc1.nyc>sho mem
Head Total(b) Used(b) Free(b) Lowest(b) Largest(b)
Processor 42017080 467570560 144717408 322853152 293755012 192841908
I/O DE00000 35651584 8754888 26896696 24740584 26866364

(how do you check free SUP ram on an integrated box?)

msfc1.nyc#sho ip bgp sum
BGP router identifier 209.123.12.4, local AS number 8001
BGP table version is 17174893, main routing table version 17174893
112903 network entries and 323852 paths using 22836069 bytes of memory
105107 BGP path attribute entries using 6307800 bytes of memory
3 BGP rrinfo entries using 72 bytes of memory
49711 BGP AS-PATH entries using 1407988 bytes of memory
243 BGP community entries using 7002 bytes of memory
179720 BGP route-map cache entries using 2875520 bytes of memory
0 BGP filter-list cache entries using 0 bytes of memory
Dampening enabled. 95 history paths, 45 dampened paths
11 received paths for inbound soft reconfiguration
BGP activity 300078/110778093 prefixes, 14305643/13981791 paths, scan interval 15 secs

This box has about 50 to 70 peers, two are full views, seven or so are
internal mesh, and the rest range from 0 to several thousand.

If you want, http://eng.nac.net/lookingglass/ has ways for you to access
both of these boxes if you please.

Hope this helps.

-- Alex Rubenstein, AR97, K2AHR, alex@nac.net, latency, Al Reuben --
-- Net Access Corporation, 800-NET-ME-36, http://www.nac.net --
Re: [nsp] MSFC2 128,000 route limitation [ In reply to ]
> (how do you check free SUP ram on an integrated box?)

#remote command module 1 show memory

if your Sup2/MSFC2 is module 1.

Steinar Haug, Nethelp consulting, sthaug@nethelp.no
Re: [nsp] MSFC2 128,000 route limitation [ In reply to ]
At 08:41 AM 8/24/2002 +0300, George Boulescu wrote:

>My questions are:
>
>- are the prefix-stats using memory from the SUP or from the MSFC?

Prefix stats actually use memory in multiple places.

>- is there any part number for upgrading the memory on the SUP2?

Yes, see
http://www.cisco.com/univercd/cc/td/doc/product/lan/cat6000/cfgnotes/78_12693.htm

>- has anyone tried to install a 512M SODIMM (like the one on the MSFC2,
>I think they are identical) in the SUP2?

Yes, they are identical.

>- from the memory consumption point of view, which is using more memory -
>the hybrid configuration or the native IOS configuration?

IOS for the Catalyst 6000 uses more memory on the Supervisor because it is
treated just like a distributed forwarding card, and has the CEF table
downloaded to it. In CatOS the MSFC2 does not download a copy of the CEF
table to the Supervisor.

>- in these circumstances (with 128M on SUP2 and 512M on MSFC2), what should
>I use if I really want to have the full BGP table?

If you want to use IOS for the Catalyst 6000 then you need to upgrade the
memory on the Supervisor. If you do not want to do this, then 128Mbytes is
enough memory to run CatOS with full Internet routes.


Ian
Re: [nsp] MSFC2 128,000 route limitation [ In reply to ]
Thanks again. Your detailed technical explanations have been invaluable,
and are exactly what I have been trying to get out of TAC. If only
cisco.com had this detail on the web... :)

I now have a much better handle on what happened. As I brought up the new
network, there were 3 equal OSPF paths to the BGP next hop. Then, load in
the BGP routes and you have ~110000 routes * 3 paths = 330000 adjacencies.
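
That arithmetic can be checked against the adjacency table size from Ian's
diagram (a trivial Python check, but it makes the failure mode obvious):

routes, paths = 110000, 3
needed = routes * paths   # 330000 adjacency entries with no sharing (CatOS)
table = 262144            # adjacency table size from Ian's diagram
print(needed - table)     # 67856 entries over - the table overflows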

So, between the large routing efficiency gain and the CEF consistency
problems I've been having, I've decided to convert these switches to native
this weekend. I do have a couple of major native-IOS bug cases open as well,
but it sounds to me like the native architecture is better suited to a full
BGP view with redundant paths. Hopefully this will give us reasonably
reliable routing, and we'll just live without the advanced HA features for
now.

For those that asked, the supervisors in question here have 512 megs of RAM,
are Sup2/PFC2/MSFC2, and are running CatOS 7.3(1) and IOS 12.1(11b)E4.

Here are some current stats (all taken recently - after running the "set mls
cef per-prefix-stats disable" command):

Total FIB entries: 262144
Allocated FIB entries: 112600
Free FIB entries: 149544
FIB entries used for IP ucast: 112599
FIB entries used for IPX : 1
FIB entries used for IP mcast: 0

Total adjacencies: 262144
Allocated adjacencies: 1140
Free adjacencies: 261004

Looks like I'm nowhere near filling the adjacency or FIB tables at this
point. Oh well, I'll convert anyway.