Mailing List Archive

trouble when a lot of users try and log on
Hi

Whenever our L2TP provider has a problem and drops our link, along with
the 1500 or so L2TP / ADSL connections, we have trouble when they all
try to log on again. So far the only way we have managed to get
through this is to restart the radius daemon on rad 1 after 200 logins
or so.

We are running a 7206VXR (G1) with 1 GB of memory, pre-clone is set for
1500 sessions, and we get the error below in the radius logs on rad 2:

Error: Dropping duplicate authentication packet from client Cisco-LNS

We are currently running an old version of ICRADIUS (on both), but we
are in the process of migrating to FreeRADIUS. Both radius servers
use a MySQL backend. We don't see any load on the SQL DB or the radius
servers, but the CPU is high on the router. Would this be a radius
problem or an LNS problem?

The setup looks like this:

Provider ------> Rad1 -----------> Provider --------> LNS ---------> Rad2

Rad 1 allows all users and only sends back the Tunnel Server endpoint IP.
Rad 2 does the final auth and handles any other attributes, like static IP, and accounting.


Thanks in advance for any help or pointers in debugging this.

Wayne
_______________________________________________
cisco-bba mailing list
cisco-bba@puck.nether.net
https://puck.nether.net/mailman/listinfo/cisco-bba
Re: trouble when a lot of users try and log on
RADIUS has only a single-byte unique identifier within the protocol. If your
RADIUS client (i.e. the Cisco NAS) uses the same source port for all requests,
then

Source IP / Port + Dest IP / Port + Unique ID will only give you 256 unique
requests.

It's entirely possible that with high (or even "some") churn you may have > 256
outstanding requests and therefore "duplicates", especially if you're
authenticating the domain before the full user.

Try "radius-server source-ports extended"
http://www.cisco.com/en/US/docs/ios/12_3/security/command/reference/sec_p1g.html#wp1107199
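
The arithmetic behind the 256 limit can be sketched as below. This is only an
illustrative calculation: the function names are made up for the example, and
the number of source ports the router actually uses is not stated in this
thread, so it is left as a parameter.

```python
import math

# The RADIUS Identifier field is one octet, so there are 256 possible values
# per (source IP, source port, destination IP, destination port) combination.
IDENTIFIER_VALUES = 2 ** 8


def max_in_flight(source_ports: int) -> int:
    """Upper bound on requests a server can tell apart at the same time.

    source_ports is a hypothetical input: how many UDP source ports the
    client spreads its requests over towards one server.
    """
    if source_ports < 1:
        raise ValueError("need at least one source port")
    return IDENTIFIER_VALUES * source_ports


def min_source_ports(pending_requests: int) -> int:
    """Smallest number of source ports that avoids identifier reuse."""
    return max(1, math.ceil(pending_requests / IDENTIFIER_VALUES))


if __name__ == "__main__":
    print(max_in_flight(1))        # 256
    print(min_source_ports(1500))  # 6
```

With one source port, 1500 subscribers retrying at once is well beyond 256
distinguishable requests, which is why spreading requests over more source
ports is worth checking first.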

Dean
Re: trouble when a lot of users try and log on
On Mon, Oct 06, 2008 at 01:07:56PM +0100, Wayne Lee wrote:
> Whenever our L2TP provider has any problems and they drop our link and
> the 1500 or so L2TP / ADSL connections we have trouble when they all
> try and log on again, so far the only way we have managed to get
> through this is to restart the radius daemon on rad 1 after 200 logins
> or so.

Perhaps the restart on Rad 1 just stops new sessions being presented to
the LNS for long enough for it to deal with the ones it's already
got outstanding.

> We are running a 7206vxr (g1) with 1gig of mem, pre-clone is set for
> 1500 sessions and we get the below error in the radius logs on rad 2

Pre-clone? Are you using a config / IOS that prevents you from using
subinterface VAIs instead of the full VAIs that pre-cloning gives you?
(I did think that pre-cloning subinterface VAIs would still be
an optimisation, but since it doesn't do it, I guess Cisco found not!)

> Error: Dropping duplicate authentication packet from client Cisco-LNS

My guess is that the LNS is just too busy and is dropping / missing the
responses, so it retransmits.
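
How a retransmission turns into that log line can be sketched as follows. This
is a simplified model, not ICRADIUS or FreeRADIUS code: the class and method
names are hypothetical, and real servers compare more than these three fields.

```python
class DuplicateTracker:
    """Remembers requests the server is still working on (hypothetical helper)."""

    def __init__(self):
        # Keys are (client IP, client source port, RADIUS identifier).
        self._in_progress = set()

    def receive(self, client_ip: str, src_port: int, identifier: int) -> str:
        key = (client_ip, src_port, identifier)
        if key in self._in_progress:
            # Same key seen again before a reply went out: the client has
            # retransmitted (or reused the identifier), so the packet is dropped.
            return "duplicate"
        self._in_progress.add(key)
        return "accepted"

    def reply_sent(self, client_ip: str, src_port: int, identifier: int) -> None:
        # Once the reply is sent, the identifier may legitimately be reused.
        self._in_progress.discard((client_ip, src_port, identifier))


if __name__ == "__main__":
    tracker = DuplicateTracker()
    print(tracker.receive("192.0.2.1", 1645, 7))  # accepted
    print(tracker.receive("192.0.2.1", 1645, 7))  # duplicate
```

In this model the message is a symptom of the client asking again, not proof
that the server itself is overloaded.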

> Thanks in advance for any help or pointers in debugging this.

There are some tuning knobs available to limit the number of
sessions the LNS will deal with at the same time. Without them it is
possible for a mass disconnection / mass reconnection to make the
LNS so busy trying to deal with ALL new sessions that it successfully
deals with NONE of them.
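
The "ALL versus NONE" effect can be shown with a toy model. Everything here is
an assumption made for illustration: the capacity, work and timeout numbers are
invented, and the LNS is modelled as sharing its CPU equally between all
sessions in setup, which is a simplification.

```python
def simulate(total, capacity, work_per_session, timeout_ticks,
             admit_limit=None, max_ticks=200):
    """Return how many sessions finish setup within max_ticks.

    total            -- subscribers trying to reconnect
    capacity         -- work units the box can do per tick (hypothetical)
    work_per_session -- work units one session setup needs (hypothetical)
    timeout_ticks    -- a session not up after this many ticks starts over
    admit_limit      -- max sessions in setup at once, or None for no limit
    """
    waiting = total
    in_setup = []  # each entry is [work_done, age_in_ticks]
    completed = 0
    for _ in range(max_ticks):
        # Admit everyone who is waiting, or only as many as the limit allows.
        room = waiting if admit_limit is None else admit_limit - len(in_setup)
        admit = min(waiting, max(room, 0))
        in_setup.extend([0.0, 0] for _ in range(admit))
        waiting -= admit
        if not in_setup:
            break
        share = capacity / len(in_setup)  # equal CPU share per session
        survivors = []
        for work, age in in_setup:
            work += share
            age += 1
            if work >= work_per_session:
                completed += 1            # session is up
            elif age >= timeout_ticks:
                waiting += 1              # timed out, all work wasted, retry
            else:
                survivors.append([work, age])
        in_setup = survivors
    return completed


if __name__ == "__main__":
    common = dict(total=1500, capacity=100, work_per_session=10, timeout_ticks=5)
    print(simulate(**common))                  # no limit
    print(simulate(**common, admit_limit=40))  # at most 40 in setup at once
```

With these invented numbers the unlimited run never completes a single
session, because every session times out before it has received enough CPU,
while the limited run brings all 1500 up. The real thresholds depend on the
platform and have to be found by experiment.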

Google for "site:cisco.com Session scalability" and/or
"site:cisco.com Broadband scalability"

There are also some optimisations that help keep the CPU down a bit
in general for L2TP, or especially on session setup.

e.g.

vpdn ip udp ignore checksum
no virtual-template snmp

These should be mentioned in the BB Scalability docs, but from a quick google
I can't find the exact doc I'm thinking of.

--
Euan Galloway
Re: trouble when a lot of users try and log on
On Mon, Oct 6, 2008 at 2:17 PM, Dean Smith <dean@eatworms.org.uk> wrote:
> Try "radius-server source-ports extended"
Dean,

Yes, we already have radius-server source-ports extended in the config.
Re: trouble when a lot of users try and log on
> Pre-clone? Are you using config / IOS that prevents you using
> subinterface VAIs instead of the Full VAIs that pre-cloning gives you.
> (I did think that pre-cloning subinterface VAIs would still be
> an optimisation, but since it doesn't do it, I guess Cisco found not!).

We are using pre-clone because I thought it might help take some of the pressure off.


> There are some tuning knobs available to limit the number of
> sessions the LNS will deal with at the same time. Without them it is
> possible for a mass disconnection / mass reconnections to make the
> LNS busy enough trying to deal with ALL new sessions to successfully
> deal with NONE of them.
>
> Google for "site:cisco.com Session scalability" and/or
> "site:cisco.com Broadband scalability"

I'll do some more hunting.

Thanks
Re: trouble when a lot of users try and log on
Wayne,

We use CAC for incoming vpdn sessions (it works for PPPoX too), which limits the number of
vpdn sessions being established simultaneously, based on either CPU or session charges.

call admission limit 320
call admission vpdn 10 1

The above numbers work OK for us, bearing in mind that they are from a 10k platform
(we don't use CAC on our 7200s, because they have very few sessions), that the LAC uses
its own CAC method too, and that our radius servers cannot handle too many requests at
the same time. You will probably have to experiment and find your own values.

You can find more info below:
http://www.cisco.com/en/US/docs/routers/10000/10008/feature/guides/122_31sb13/cac-enha.html
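
The idea of charge-based admission can be sketched as below. This is a toy
model and not the IOS implementation. It assumes that the two numbers after
"vpdn" are a per-session charge and a lifetime in seconds, and that a session
is refused when adding its charge would exceed the limit; both are assumptions
based on the CAC guide linked above, not something stated in this thread. The
class and method names are hypothetical.

```python
from collections import deque


class ChargeAdmission:
    """Toy charge-based admission control (hypothetical, not IOS code)."""

    def __init__(self, limit: int):
        self.limit = limit
        self._active = deque()  # (expiry_time, charge), oldest first
        self._total = 0

    def try_admit(self, now: float, charge: int, lifetime: float) -> bool:
        # Release charges whose lifetime has run out. The deque stays ordered
        # because this sketch assumes one lifetime and non-decreasing "now".
        while self._active and self._active[0][0] <= now:
            _, expired_charge = self._active.popleft()
            self._total -= expired_charge
        if self._total + charge > self.limit:
            return False  # admitting this session would exceed the limit
        self._active.append((now + lifetime, charge))
        self._total += charge
        return True


if __name__ == "__main__":
    cac = ChargeAdmission(limit=320)
    admitted = sum(cac.try_admit(0.0, charge=10, lifetime=1) for _ in range(50))
    print(admitted)  # 32
```

Under these assumptions a limit of 320 with a charge of 10 lets 32 sessions
through per one-second window, and the rest have to retry.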

We also use the following under the radius groups in order to split the load on our radius
servers according to the number of auth/acct requests waiting in line:

aaa group server radius XXX
load-balance method least-outstanding

More info can be found below:
http://www.cisco.com/en/US/docs/ios/12_2sb/feature/guide/sbrdldbl.html
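
The selection rule behind "least-outstanding" can be sketched as follows. It is
a minimal illustration of the principle only; it does not attempt to reproduce
the IOS feature, and the class, method and server names are hypothetical.

```python
class LeastOutstandingGroup:
    """Sends each new request to the server with the fewest unanswered ones."""

    def __init__(self, servers):
        # Count of requests sent to each server that have not been answered.
        self.outstanding = {name: 0 for name in servers}

    def pick(self) -> str:
        # min() keeps the first minimum, so ties go to the first-listed server.
        name = min(self.outstanding, key=self.outstanding.get)
        self.outstanding[name] += 1
        return name

    def done(self, name: str) -> None:
        # Call when a reply (or a final timeout) arrives for that server.
        if self.outstanding[name] > 0:
            self.outstanding[name] -= 1


if __name__ == "__main__":
    group = LeastOutstandingGroup(["rad-a", "rad-b"])
    print([group.pick() for _ in range(4)])  # ['rad-a', 'rad-b', 'rad-a', 'rad-b']
    group.done("rad-a")
    print(group.pick())                      # rad-a
```

A server that answers slowly accumulates outstanding requests and so
automatically receives a smaller share of new ones.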

Regarding the precloning thing, according to our experience with the 12.2(31)SB series,
precloning doesn't help much and we prefer using va subinterfaces (with all their
advantages/disadvantages). Here is the relevant output:


7200#sh vtemplate
Virtual access subinterface creation is globally enabled

         Active        Active   Subint  Pre-clone  Pre-clone
      Interface  Subinterface  Capable  Available      Limit
      ---------  ------------  -------  ---------  ---------
Vt1           0          1370      Yes
Vt2           0           235      Yes


--
Tassos
Re: trouble when a lot of users try and log on
We had a somewhat similar problem a few months ago with 12.2(26) where, when
an OC-3 dropped, only 75% of our 2400 connections came back in.

I wrote it up here:
http://www.gossamer-threads.com/lists/cisco/bba/91052

Frank
Re: trouble when a lot of users try and log on
Wayne,

You might want to check out the AAA throttling feature if it's available on
your IOS. This and the load-balance method least-outstanding feature
should work well together. You may want to start by throttling accounting
records and then auth requests.
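
The principle of throttling can be sketched as below: cap the number of
unanswered requests and hold the rest back until a reply frees a slot. This is
a generic illustration under that assumption, not the IOS feature; the class
and method names and the threshold value are hypothetical.

```python
from collections import deque


class Throttle:
    """Caps the number of requests outstanding towards a radius server."""

    def __init__(self, max_outstanding: int):
        self.max_outstanding = max_outstanding
        self.outstanding = 0
        self.queue = deque()  # requests held back, oldest first

    def submit(self, request) -> bool:
        """Return True if the request is sent now, False if it is queued."""
        if self.outstanding < self.max_outstanding:
            self.outstanding += 1
            return True
        self.queue.append(request)
        return False

    def response_received(self):
        """A reply frees a slot; return the queued request sent next, if any."""
        if self.outstanding > 0:
            self.outstanding -= 1
        if self.queue:
            self.outstanding += 1
            return self.queue.popleft()
        return None


if __name__ == "__main__":
    throttle = Throttle(max_outstanding=2)
    print([throttle.submit(n) for n in range(4)])  # [True, True, False, False]
    print(throttle.response_received())            # 2
```

Using one such limiter for accounting and a separate one for access requests
would match the suggestion to throttle accounting first.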

Dan
Re: trouble when a lot of users try and log on
We are not running the SB IOS, and so far almost all of the suggested
commands need that version.

I'll see if we can upgrade.