Mailing List Archive

[patch] patch for correct ospfapi ready callbacks
hi all,

this patch tries to correct the incorrect behavior of Quagga's ospfd in
the special situation that a node's opaque capability has changed as "ON
-> OFF -> ON"

without the patch, ospfd did call ospfapi ready callbacks only for those
opaque types which did NOT have injected lsa's before the ON->OFF->ON
sequence. when SRRD (an ospfapi client application) injected only
type-11 lsa's this resulted in ready-9, ready-10 and NO ready-11 after
the ON->OFF->ON sequence.

now, with the patch, ospfd correctly generates the 3 ready callbacks.
the patch simply removes the test for emptiness of the list oipt->id_list.

i'm not sure if it's correct. as far as i can see everything still works
after applying the patch.

BUT ... the list oipt->id_list IS not empty and this means that the
handler is already active... this also means that with the test for
oipt->id_list's emptiness ospf_opaque_lsa_reoriginate_schedule() never
gets called, resulting in NO ready callback for every type which existed
before the ON->OFF->ON sequence.
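
The failure mode can be illustrated with a toy model. This is only a sketch: `OpaqueType`, `reoriginate_ready` and the `skip_if_active` flag are made-up names that mirror the emptiness test on oipt->id_list; none of this is the real ospfd code.

```python
from dataclasses import dataclass, field


@dataclass
class OpaqueType:
    lsa_type: int                                   # 9, 10 or 11
    id_list: list = field(default_factory=list)     # stand-in for oipt->id_list


def reoriginate_ready(types, skip_if_active):
    """Return the LSA types that would get a ready callback after OFF -> ON."""
    ready = []
    for t in types:
        # Unpatched behaviour: a non-empty id_list is read as
        # "handler is already active", so the type is skipped.
        if skip_if_active and t.id_list:
            continue
        ready.append(t.lsa_type)
    return ready


# Only type-11 LSAs were injected before the ON -> OFF -> ON sequence.
types = [OpaqueType(9), OpaqueType(10), OpaqueType(11, ["srrd-lsa"])]
unpatched = reoriginate_ready(types, skip_if_active=True)
patched = reoriginate_ready(types, skip_if_active=False)
```

With the test in place the model reports ready for types 9 and 10 only; without it, all three types are reported.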

perhaps someone with more insight into ospfd's data structures could
correct me?

cheers
- amir
--
Amir Guindehi, nospam.amir@datacore.ch
DataCore GmbH, Witikonerstrasse 289, 8053 Zurich, Switzerland
Re: [patch] patch for correct ospfapi ready callbacks
i've attached the missing patch.

sorry
- amir

--
Amir Guindehi, nospam.amir@datacore.ch
DataCore GmbH, Witikonerstrasse 289, 8053 Zurich, Switzerland
Re: [patch] patch for correct ospfapi ready callbacks
hi,

i'm sorry again, but the last patch did not apply cleanly on a fresh
Quagga source. i re-did the patch and i've attached the new version of
the patch. it applies cleanly to a quagga-0.94.4 source.

- amir
--
Amir Guindehi, nospam.amir@datacore.ch
DataCore GmbH, Witikonerstrasse 289, 8053 Zurich, Switzerland
Re: [patch] patch for correct ospfapi ready callbacks
+ /*
+ * removed the test for
+ * (! list_isempty (oipt->id_list)) * Handler is already active. *
+ * because opaque capabilities ON -> OFF -> ON result in oipt->id_list
+ * not being empty.
+ */

So why is this the right fix, rather than the list getting cleared
when opaque capabilities go off? Should this result in a loss of
client api sessions with an error?

--
Greg Troxel <gdt@ir.bbn.com>
Re: [patch] patch for correct ospfapi ready callbacks
Hi Greg,

> + /*
> + * removed the test for
> + * (! list_isempty (oipt->id_list)) * Handler is already active. *
> + * because opaque capabilities ON -> OFF -> ON result in oipt->id_list
> + * not being empty.
> + */
>
> So why is this the right fix, rather than the list getting cleared
> when opaque capabilities go off? Should this result in a loss of
> client api sessions with an error?

I never said this fixes the error correctly. But I found out that it
fixes the missing ready-11. I'm not sure what really happens and if
oipt->id_list /should/ be empty, or not!

Before going not-ready there were injected type-11 lsa's in the lsadb!
Exactly this is the problem. If no lsa's of a type are injected (in my
example type 9 and 10), then the first block of if's will be taken, and
there is no problem, and ready-9 and ready-10 get received.

So the problem only shows up if a type has injected lsa's. Having
injected lsa's of a type means (in my eyes) that oipt->id_list is
non-empty... Am I wrong? Is it only non-empty if there are no /pending/
injects?

I've not enough knowledge of the internal opaque data structures to
answer that, and was hoping to receive some comments on how to fix the
stuff /correctly/!

Greg, if you could give us a hint, how to clear the list when opaque
capabilities go off?

Thanks in advance for your help!

Regards,
- Amir

--
Amir Guindehi, nospam.amir@datacore.ch
DataCore GmbH, Witikonerstrasse 289, 8053 Zurich, Switzerland
Re: [patch] patch for correct ospfapi ready callbacks
I really don't understand the details of the ospfapi.

> Greg, if you could give us a hint, how to clear the list when opaque
> capabilities go off?

I'm not sure this is right. My point is really that what happens when
the router is commanded to turn opaque capabilities off and there are
clients connected is not clear, and that needs to be thought through
and documented first.
Re: [patch] patch for correct ospfapi ready callbacks
Re Greg,

> I really don't understand the details of the ospfapi.
>
> Greg, if you could give us a hint, how to clear the list when opaque
> capabilities go off?
>
> I'm not sure this is right. My point is really that what happens when
> the router is commanded to turn opaque capabilities off and there are
> clients connected is not clear, and that needs to be thought through
> and documented first.

Oh, that's clear and documented. An inject will return the error code
-7, which means "not ready". OSPFAPI will call the ready-callback with
an argument telling which type got ready as soon as the count of opaque
enabled routers becomes > 1. This /does/ happen when no opaque lsa's of a
specific type are currently injected, but this /does not/ happen for a
specific type if that type has injected lsa's at the moment of going
non-ready. This is incorrect (I discussed this with Ralph, the
OSPFAPI author, today) and OSPFAPI should call the ready-callback.
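
From the client's side, the inject/ready interplay described above could be handled roughly as follows. This is a hedged sketch: only the value -7 for "not ready" comes from this thread; `Client`, `FakeDaemon`, the `OK` value and the method names are hypothetical and do not reproduce the real OSPFAPI client library.

```python
NOT_READY = -7   # error code named in this thread for "not ready"
OK = 0           # assumed success code


class Client:
    def __init__(self, inject):
        self.inject = inject   # callable(lsa_type, data) -> return code
        self.pending = {}      # lsa_type -> data still waiting for "ready"

    def originate(self, lsa_type, data):
        rc = self.inject(lsa_type, data)
        if rc == NOT_READY:
            # Remember the LSA and retry once the type becomes ready.
            self.pending.setdefault(lsa_type, []).append(data)
        return rc

    def on_ready(self, lsa_type):
        """Ready callback: retry everything queued for this type."""
        for data in self.pending.pop(lsa_type, []):
            self.originate(lsa_type, data)


class FakeDaemon:
    """Stand-in for ospfd, just enough to exercise the client."""

    def __init__(self):
        self.ready = set()     # opaque types currently ready
        self.lsdb = []         # accepted (lsa_type, data) pairs

    def inject(self, lsa_type, data):
        if lsa_type not in self.ready:
            return NOT_READY
        self.lsdb.append((lsa_type, data))
        return OK
```

The point of the sketch is that a client which never receives ready-11 keeps its type-11 data queued forever, which is exactly the symptom reported above.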

Now, the problem is simply to find out if we need to clean up ->id_list
or if we can leave it full. If we need to clean it up, then we should
fix the "going non-ready" path and we should clean up ->id_list when
going non-ready, meaning when an inject fails with -7. If it's ok to have
->id_list filled, then my fix is correct.

We would need to know more about ->id_list. What I can say after some
testing, is that I did not yet see anything bad happening using my patch
in the last 48h.

Regards
- Amir
--
Amir Guindehi, nospam.amir@datacore.ch
DataCore GmbH, Witikonerstrasse 289, 8053 Zurich, Switzerland
Re: [patch] patch for correct ospfapi ready callbacks
But what about

a client connects
gets 'ready'
injects a few LSAs
opaque disabled
presumably client gets a 'not ready'
opaque enabled
client gets 'ready'

Now, are the injected LSAs still injected, or was the not ready
supposed to notify the client they had been lost?
Re: [patch] patch for correct ospfapi ready callbacks
Re,

> But what about
>
> a client connects
> gets 'ready'
> injects a few LSAs
> opaque disabled
> presumably client gets a 'not ready'
> opaque enabled
> client gets 'ready'
>
> Now, are the injected LSAs still injected, or was the not ready
> supposed to notify the client they had been lost?

The injected LSAs stay in the global lsa database until they get deleted.
A not-ready callback tells an ospfapi client that the ospf daemon is
currently not able to flood a new LSA.

This does not mean that the already flooded LSAs (which may later
be updated by this new inject) are lost!

As soon as an LSA's type gets ready again, the LSA will be flooded
containing the old data, until a new inject updates that data or the LSA
gets deleted by the client.

- Amir
--
Amir Guindehi, nospam.amir@datacore.ch
DataCore GmbH, Witikonerstrasse 289, 8053 Zurich, Switzerland
Re: [patch] patch for correct ospfapi ready callbacks
> The injected LSA stay in the global lsa database until they get deleted.
> A not-ready callback tells a ospfapi client that the ospf daemon is
> currently not able to flood a new LSA.
>
> This does not mean that the already flooded LSA (which eventually will
> be updated by this new inject) are lost!

But maybe it should. If one disables opaque lsa on a router, I would
expect that to mean that all currently published opaque LSAs are
deleted (max-aged) from the global database, and all connected clients
are informed that their LSAs have been unpublished.

I must say that I find the whole 'not ready' concept to be odd. I
realize that the router can't flood without a partner, but I would
expect this to be handled by ospfd rather than pushed onto the user
process.

But changing this wouldn't resolve the central question, which is
about administratively disabling opaque lsas.
Re: [patch] patch for correct ospfapi ready callbacks
Hi Greg,

> The injected LSA stay in the global lsa database until they get deleted.
> A not-ready callback tells a ospfapi client that the ospf daemon is
> currently not able to flood a new LSA.
>
> This does not mean that the already flooded LSA (which eventually will
> be updated by this new inject) are lost!
>
> But maybe it should. If one disables opaque lsa on a router, I would
> expect that to mean that all currently published opaque LSAs are
> deleted (max-aged) from the global database, and all connected clients
> are informed that their LSAs have been unpublished.

I agree. These are possible and usable semantics too. But sadly these are
not the semantics implemented by OSPFAPI.

> I must say that I find the whole 'not ready' concept to be odd. I
> realize that the router can't flood without a partner, but I would
> expect this to be handled by ospfd rather than pushed onto the user
> process.

Look what I've found on Ralph's OSPFAPI homepage:

Important note:

* In order to originate an opaque LSA, there must be at least
one active opaque-capable neighbor. Thus, you cannot originate opaque
LSAs if no neighbors are present. If you try to originate even though no
neighbor is ready, you will receive a not ready error message. The
reason for this restriction is that it might be possible that some
routers have an identical opaque LSA from a previous origination in
their LSDB that unfortunately could not be flushed due to a crash, and
now if the router comes up again and starts originating a new opaque
LSA, the new opaque LSA is considered older since it has a lower
sequence number and is ignored by other routers (that consider the
stalled opaque LSA as more recent). However, if the originating router
first synchronizes the database before originating opaque LSAs, it will
detect the older opaque LSA and can flush it first.
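
The sequence-number argument in the note above can be checked with a small sketch. It is simplified on purpose: RFC 2328 treats LS sequence numbers as signed 32-bit integers starting at 0x80000001, and the real comparison also looks at checksum and age when the sequence numbers are equal. The helper names below are made up for illustration.

```python
INITIAL_SEQ = 0x80000001   # InitialSequenceNumber in RFC 2328


def to_signed32(value):
    """Interpret a 32-bit LS sequence number as a signed integer."""
    return value - (1 << 32) if value & 0x80000000 else value


def is_more_recent(seq_a, seq_b):
    """True if seq_a is more recent than seq_b (sequence number only)."""
    return to_signed32(seq_a) > to_signed32(seq_b)


stale_copy = 0x80000005    # left in a neighbor's LSDB from before the crash
fresh_copy = INITIAL_SEQ   # first origination after the restart
```

Here `is_more_recent(stale_copy, fresh_copy)` holds, so other routers would keep the stalled LSA and ignore the freshly originated one.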

SRRD (https://open.datacore.ch/wikipage/SRRD) extends OSPFAPI's
semantics by intercepting the nsm(down) callback and automatically
deleting, from its local SRRD lsa database (which is a copy of the ospf
lsadb), all lsa's whose advertising router is the one mentioned in the
nsm(down) callback.

It seems that the above is exactly what you are proposing to do at the
OSPFAPI layer. Now, as far as I see this inherently brings the problem
that the OSPF RFC states how OSPF should react. OSPF is built in a way
that, for example, the router lsa of a router going down stays in the lsadb
(the idea probably was that the node topology did not change, only the
link topology which can result in a node not being reachable anymore)
until its age reaches MaxAge (3600 s, i.e. 1 h), then it gets removed.

For SRRD this was not acceptable, since SRRD implements a cluster server
which should serve services redundantly and it should do this with a
fail-over time as short as possible. Waiting for 1h until a service LSA
times out was not acceptable, resulting in the above implementation with
a local SRRD based TTL countdown and removal of all lsa's advertised
by the router whose TTL goes <0.
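
A rough sketch of such a TTL countdown follows. SRRD's actual data structures are not shown in this thread, so `tick`, its arguments and the `initial_ttl` default are assumptions chosen only to illustrate the idea.

```python
def tick(ttl_by_router, lsdb, heard_from, initial_ttl=3):
    """Run one countdown step over a local copy of the LSDB.

    ttl_by_router: router id -> remaining TTL
    lsdb:          list of (advertising_router, lsa_id) tuples
    heard_from:    router ids refreshed during this interval
    """
    for router in list(ttl_by_router):
        if router in heard_from:
            ttl_by_router[router] = initial_ttl
        else:
            ttl_by_router[router] -= 1
        if ttl_by_router[router] < 0:
            # Drop every LSA advertised by the router that timed out.
            lsdb[:] = [entry for entry in lsdb if entry[0] != router]
            del ttl_by_router[router]
```

With a short interval and a small initial TTL, a dead router's LSAs disappear from the local copy within seconds rather than after MaxAge.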

> But changing this wouldn't resolve the central question, which is
> about administratively disabling opaque lsas.

What's the problem with that?

Administratively disabling opaque lsa support on a router is the same as
if the router itself went down, as far as LSA type 9,10 and 11 are
concerned, isn't it?

Could you please elaborate on the meaning of that "central question"?

Regards,
- Amir
--
Amir Guindehi, nospam.amir@datacore.ch
DataCore GmbH, Witikonerstrasse 289, 8053 Zurich, Switzerland
Re: [patch] patch for correct ospfapi ready callbacks
ospfapi declines to let a user inject LSAs until there is a peer.
But this really doesn't solve the problem it purports to solve, since
there could be a peer present, but a disconnected group with the old
LSA that later joins. So for correctness the router has to watch for
old LSAs with higher sequence numbers that don't match the local
'publish this' db, max-age any it sees, and then republish the desired
ones.

This problem doesn't have to be solved like this - the api could
accept LSAs and then do what it has to in order to delete old LSAs; it
sounds like SRRD does that. But I'm not trying to change the API.

> Administratively disabling opaque lsa support on a router is the same as
> if the router itself went down, as far as LSA type 9,10 and 11 are
> concerned, isn't it?

No, disabling Opaque LSAs on a router is entirely different from the
router crashing. Does the API document discuss this case? I would
expect that on a 'disable' event, the router would unpublish (max-age)
all the currently-originated opaque lsas, and inform the clients that
their lsas have been discarded. Here the admin has said 'we aren't
going to do this any more', which is a graceful shutdown, not a crash.

So is the intended behavior in the API that the LSAs are retained by
the router, a down callback is issued so that no new LSAs can be
injected, and that when/if OLSAs are re-enabled and there is a peer,
those old LSAs will be advertised again? Further, are the existing
LSAs during a down event still published in the system while the
router has them disabled? This all seems kind of broken to me - if
the admin says no opaque lsas, the router should IMHO withdraw what
it has published.
Re: [patch] patch for correct ospfapi ready callbacks
I don't mean to argue that your proposed changes are wrong. But I
would like to see them commented well enough to explain this whole
issue, and refer to the ospfapi documentation (specifically admin
enable/disable as a separate case from crash/restart).

I read the web page and glanced at the source tree and was unable to
find any descriptions of the semantics for administratively disabling
opaque LSAs. I also couldn't find anything in the info pages about
opaque LSAs at all.


This also raises the question of why we have commands to turn opaque
lsas on/off. Presumably off means don't originate, don't accept them
from neighbors, don't reflood neighbor values, and delete any in the
local database, from us or from neighbors. It would be good to have a
use case for this feature in order to decide what the behavior ought
to be.

Ralph: Any guidance here?
Re: [patch] patch for correct ospfapi ready callbacks
Re Greg,

> ospfapi declines to let a user inject LSAs until there is a peer.
> But this really doesn't solve the problem it purports to solve, since
> there could be a peer present, but a disconnected group with the old
> LSA that later joins. So for correctness the router has to watch for
> old LSAs with higher sequence numbers that don't match the local
> 'publish this' db, max-age any it sees, and then republish the desired
> ones.

That's correct. SRRD implements exactly this behavior. It max-ages every
lsa it gets from an lsadb sync if that lsa was not injected by the
current SRRD instance.
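
The reconciliation step could look roughly like this. It is a sketch under assumptions: it only touches LSAs carrying our own advertising router id (a restriction added here, since a router cannot flush other routers' LSAs), and `reconcile` and the dict layout are hypothetical, not SRRD code.

```python
MAX_AGE = 3600   # OSPF MaxAge in seconds


def reconcile(synced, injected_ids, my_router_id):
    """Max-age self-advertised LSAs that the current instance did not inject.

    synced:       list of dicts with 'adv_router', 'id' and 'age'
    injected_ids: ids injected by the running instance
    Returns the ids of the LSAs that were flushed.
    """
    flushed = []
    for lsa in synced:
        mine = lsa["adv_router"] == my_router_id
        if mine and lsa["id"] not in injected_ids:
            lsa["age"] = MAX_AGE   # premature aging flushes it from the domain
            flushed.append(lsa["id"])
    return flushed
```

After the flush the application can republish the LSAs it actually wants, without stale copies winning the sequence-number comparison.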

> This problem doesn't have to be solved like this - the api could
> accept LSAs and then do what it has to in order to delete old LSAs; it
> sounds like SRRD does that. But I'm not trying to change the API.

I think you have a point. I hope Ralph, the author of OSPFAPI, will
contribute to this discussion as soon as he finds time!

I agree, OSPFAPI could (and probably should) handle this case by itself
and it should not bother the application level with these issues. This
would require an API change.

Not long ago, I proposed another API change to Ralph, concerning a
not-ready-callback alerting the application level of the loss of the
last adjacent opaque lsa enabled router. This would allow the
application to act immediately on that event and the application would
not have to wait for the next failing inject to notice the state change.

> Administratively disabling opaque lsa support on a router is the same as
> if the router itself went down, as far as LSA type 9,10 and 11 are
> concerned, isn't it?
>
> No, disabling Opaque LSAs on a router is entirely different from the
> router crashing. Does the API document discuss this case? I would
> expect that on a 'disable' event, the router would unpublish (max-age)
> all the currently-originated opaque lsas, and inform the clients that
> their lsas have been discarded. Here the admin has said 'we aren't
> going to do this any more', which is a graceful shutdown, not a crash.

ic.

I meant that the /current/ implementation treats an administrative
disable of the opaque lsa support the same as if the router itself went
down.

Ralph, could you enlighten us on this topic?

> So is the intended behavior in the API that the LSAs are retained by
> the router, a down callback is issued so that no new LSAs can be
> injected, and that when/if OLSAs are re-enabled and there is a peer,
> those old LSAs will be advertised again?

Yes, I think so! The current OSPFAPI implements /only/ the
ready-callback. No not-ready-callback is supported at the moment!
OSPFAPI client applications take note of the not-readyness of the OSPF
router when they try to inject a new LSA.

> Further, are the existing
> LSAs during a down event still published in the system while the
> router has them disabled?

Yes. SRRD tries to correct this by adding an internal TTL to each
srrd-router lsa (type 11!) and counting it down. Should the TTL run
out, it removes all LSAs advertised by that srrd router.

> This all seems kind of broken to me - if
> the admin says no opaque lsas, the router should IMHO withdraw what
> it has published.

;-))

Ralph, do you hear this? *grin*

Greg, we will have to wait for Ralph's comments! As far as I have
grasped the conceptual design, this /is/ the intended behavior. I'm
unsure if this behavior is optimal. I think, at the very least, a
not-ready-callback should be implemented...

Regards,
- Amir
--
Amir Guindehi, nospam.amir@datacore.ch
DataCore GmbH, Witikonerstrasse 289, 8053 Zurich, Switzerland
Re: [patch] patch for correct ospfapi ready callbacks
Re Greg,

> I don't mean to argue that your proposed changes are wrong. But I
> would like to see them commented well enough to explain this whole
> issue, and refer to the ospfapi documentation (specifically admin
> enable/disable as a separate case from crash/restart).

I agree. I think you will find some parts of that documentation in the
references further down.

> I read the web page and glanced at the source tree and was unable to
> find any descriptions of the semantics for administratively disabling
> opaque LSAs. I also couldn't find anything in the info pages about
> opaque LSAs at all.

Did you read:

Dissemination of Application-Specific Information using the OSPF Routing
Protocol. Ralph Keller. Technical Report Nr. 181, TIK, Swiss Federal
Institute of Technology Zurich, Switzerland,
November 2003.

http://www.tik.ee.ethz.ch/~keller/publications/tik181.pdf

It explains OSPFAPI in detail.

And be sure to check the "Related Papers" and "Related RFCs" sections on
https://open.datacore.ch/wikipage/SRRD

Take a look at RFC 2370 - The OSPF Opaque LSA Option

> This also raises the question of why we have commands to turn opaque
> lsas on/off. Presumably off means don't originate, don't accept them
> from neighbors, don't reflood neighbor values, and delete any in the
> local database, from us or from neighbors.

Exactly!

> It would be good to have a
> use case for this feature in order to decide what the behavior ought
> to be.
>
> Ralph: Any guidance here?

I have trouble finding a use case for on/off. I used the on/off/on cycle
to test and debug SRRD's behavior on a remote OSPFD crash. The OSPFAPI
events generated on an srrd node look exactly the same in those two cases.

Regards,
- Amir
--
Amir Guindehi, nospam.amir@datacore.ch
DataCore GmbH, Witikonerstrasse 289, 8053 Zurich, Switzerland
Re: [patch] patch for correct ospfapi ready callbacks
Greg Troxel wrote:

> I don't mean to argue that your proposed changes are wrong. But I
> would like to see them commented well enough to explain this whole
> issue, and refer to the ospfapi documentation (specifically admin
> enable/disable as a separate case from crash/restart).
>
> I read the web page and glanced at the source tree and was unable to
> find any descriptions of the semantics for administratively disabling
> opaque LSAs. I also couldn't find anything in the info pages about
> opaque LSAs at all.
>

I think there are two issues that came up in the discussion, namely the
(a) enabling/disabling opaque LSA capabilities in the OSPF daemon, and
(b) ready-concept before originating opaque LSAs.

> This also raises the question of why we have commands to turn opaque
> lsas on/off. Presumably off means don't originate, don't accept them
> from neighbors, don't reflood neighbor values, and delete any in the
> local database, from us or from neighbors. It would be good to have a
> use case for this feature in order to decide what the behavior ought
> to be.

(a): Enabling/disabling opaque LSAs is a special CLI command that you
use to tell your OSPF daemon that it should be able to process opaque
LSAs. Usually, it is set once in the ospf.conf file that you want your
router to accept and disseminate opaque LSAs. See the following config file:

router ospf
 router-id 10.0.0.1
 network 10.0.0.1/24 area 1
 neighbor 10.0.0.2
 network 10.0.1.2/24 area 1
 neighbor 10.0.1.1
 ospf opaque-lsa <============ add this statement!

However, I don't see a reason why one should want to disable opaque LSAs
once your ospfd has started. Also note that disabling opaque LSA
capability is not a function that is provided by the OSPFAPI. Our
API-enhanced router assumes that opaque LSAs are enabled, otherwise it
couldn't do its job.

(b): The current concept delays originating opaque LSAs until the OSPF
router has synchronized its database with at least one opaque-capable
neighbor. Otherwise, as described earlier, it would be possible that
stalled opaque LSAs are still around and the newly originated LSAs would
simply be ignored (since they have smaller seq num). So, the application
has to wait until the underlying OSPF router is in the right state.

Sure, we could implement a different semantic where the application
would be allowed to originate all the time (even when no neighbor is
active), and once neighbors get active, distribute the opaque LSAs. This
would require to cache depending opaque LSAs. However, the application
wouldn't be aware that opaque LSAs don't get distributed if no neighbors
are around.

--

- Ralph

--
keller@tik.ee.ethz.ch
Computer Engineering and Networks Lab (TIK)
ETZ G61.3, ETHZ CH-8092 Zurich
Phone +41 1 632 7015 Fax +41 1 632 1035
Re: [patch] patch for correct ospfapi ready callbacks
Re all,

> (a): Enabling/disabling opaque LSAs is a special CLI command that you
> use to tell your OSPF daemon that it should be able to process opaque
> LSAs. Usually, it is set once in the ospf.conf file that you want your
> router to accept and disseminate opaque LSAs. See the following config
> file:
>
> router ospf
> router-id 10.0.0.1
> network 10.0.0.1/24 area 1
> neighbor 10.0.0.2
> network 10.0.1.2/24 area 1
> neighbor 10.0.1.1
> ospf opaque-lsa <============ add this statement!

That's true. But you can also connect to ospfd's vty (port 2604), login,
enable, configure terminal, router ospf, no ospf opaque-lsa and you
have suddenly changed this from ON to OFF.

This can happen at any time!

> However, I don't see a reason why one should want to disable opaque LSAs
> once your ospfd has started.

Neither do I.

But since it's supported, we have to support it too, don't we?

I used the "no ospf opaque-lsa" to switch my testbed from 2 flooding
ospf routers and neighbors to 1 non-flooding, not-ready ospf router
(without stopping the second router) while testing SRRD.

> (b): The current concept delays originating opaque LSAs until the OSPF
> router has synchronized its database with at least one opaque-capable
> neighbor. Otherwise, as described earlier, it would be possible that
> stalled opaque LSAs are still around and the newly originated LSAs would
> simply be ignored (since they have smaller seq num). So, the application
> has to wait until the underlying OSPF router is in the right state.

I agree.

> Sure, we could implement a different semantic where the application
> would be allowed to originate all the time (even when no neighbor is
> active), and once neighbors get active, distribute the opaque LSAs. This
> would require to cache depending opaque LSAs. However, the application
> wouldn't be aware that opaque LSAs don't get distributed if no neighbors
> are around.

I, for one, think the ready/not-ready callbacks are needed, and I don't
think their semantics are wrong. I'm missing the not-ready callback though.

Regards,
- Amir

--
Amir Guindehi, nospam.amir@datacore.ch
DataCore GmbH, Witikonerstrasse 289, 8053 Zurich, Switzerland
Re: [patch] patch for correct ospfapi ready callbacks
> But since it's supported, we have to support it too, don't we?

We could make turning off opaque lsas be an error for now (with no
effect), and file a bug in bugzilla.

> I, for one, think the ready/not-ready callbacks are needed, and I don't
> think their semantics are wrong. I'm missing the not-ready callback though.

I think (eventually) there should be 3 states, though:

ready
not-ready
disabled

and on a transition to disabled, all locally originated OLSAs should
be flushed from the system and discarded. Yes, this is an API
change.
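
The three-state proposal might be sketched as follows. Nothing here exists in OSPFAPI today; `OpaqueState` and `transition` are hypothetical names, and the flush is modelled as simply emptying a list of locally originated opaque LSAs.

```python
from enum import Enum


class OpaqueState(Enum):
    READY = "ready"
    NOT_READY = "not-ready"
    DISABLED = "disabled"


def transition(old, new, local_lsas):
    """Apply a state change and return the LSAs flushed by it.

    Only a transition into DISABLED flushes (max-ages) the locally
    originated opaque LSAs; going not-ready keeps them.
    """
    flushed = []
    if new is OpaqueState.DISABLED and old is not OpaqueState.DISABLED:
        flushed = list(local_lsas)
        local_lsas.clear()
    return flushed
```

The design choice is that losing the last neighbor is recoverable and keeps state, while an administrative disable is a graceful shutdown that withdraws what was published.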

From: Ralph Keller <keller@tik.ee.ethz.ch>

> I think there are two issues that came up in the discussion, namely the
> (a) enabling/disabling opaque LSA capabilities in the OSPF daemon, and
> (b) ready-concept before originating opaque LSAs.

absolutely - these are separate.

> Our API-enhanced router assumes that opaque LSAs are enabled,
> otherwise it couldn't do its job.

Sure, but the API should cope (gracefully) with the feature being
disabled.

> (b): The current concept delays originating opaque LSAs until the OSPF
> router has synchronized its database with at least one opaque-capable
> neighbor. Otherwise, as described earlier, it would be possible that
> stalled opaque LSAs are still around and the newly originated LSAs would
> simply be ignored (since they have smaller seq num).

But what if there is the node that crashed and a new node (that
doesn't have the old LSA) talking, and these two get connected to
a third node that has the old LSA? I suspect that the right thing
doesn't happen here.

> Sure, we could implement a different semantic where the application
> would be allowed to originate all the time (even when no neighbor is
> active), and once neighbors get active, distribute the opaque LSAs. This
> would require to cache depending opaque LSAs. However, the application
> wouldn't be aware that opaque LSAs don't get distributed if no neighbors
> are around.

I think the cache is needed anyway so that the router can unpublish
the LSAs on graceful shutdown of OSPF or opaque lsas being turned off.
And, I think the application not being aware of the ready/not-ready
state would be a big win - publishing means that the set of connected
routers gets the opaque LSA. If the set of connected (other) routers
is zero, so be it - that still meets the semantics. If the app needs
to know how many other routers are connected, that's a different
question and it should be asked explicitly.

But I don't mean to complain too much - the current situation is
pretty nice, and we just have to prohibit turning off opaque lsas to
get to a state where things remain consistent. Until a larger effort
is made to deal with this in a more complex way, I think that's the
best plan.
Re: [patch] patch for correct ospfapi ready callbacks
Hi Greg,

> But I don't mean to complain too much - the current situation is
> pretty nice, and we just have to prohibit turning off opaque lsas to
> get to a state where things remain consistent. Until a larger effort
> is made to deal with this in a more complex way, I think that's the
> best plan.

I don't think so!

If you have two ospf routers (both opaque enabled) and the second one of
them goes down, the first goes ON->OFF (it becomes not-ready).

If now the second ospf router comes up again (still with opaque enabled)
the first goes OFF->ON (it becomes ready)!
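
This can be modelled without any admin command at all. In the sketch below readiness depends only on the number of opaque-capable neighbors; `ready_events` is a made-up helper, and counting neighbors (rather than all opaque enabled routers including ourselves) is an assumption of the model.

```python
def ready_events(neighbor_counts):
    """Return 'ready'/'not-ready' for each flip in readiness.

    neighbor_counts: successive counts of opaque-capable neighbors.
    """
    events = []
    was_ready = False
    for count in neighbor_counts:
        is_ready = count > 0
        if is_ready != was_ready:
            events.append("ready" if is_ready else "not-ready")
        was_ready = is_ready
    return events
```

A neighbor that goes down and comes back, i.e. the counts 1, 0, 1, produces the same ready, not-ready, ready sequence as toggling "ospf opaque-lsa" by hand.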

So even if we disable the "no ospf opaque" statement we still would have
to fix the issue. Furthermore I would like to say that SRRD /needs/
this feature to work correctly. This happens in normal SRRD operation as
soon as an SRRD node gets isolated from the rest of the SRRD nodes and OSPF
opaque lsa enabled routers. It's not a rare case...

I would propose that we fix it in a way allowing the full semantics...
... or we leave it as it is.

Regards
- Amir
--
Amir Guindehi, nospam.amir@datacore.ch
DataCore GmbH, Witikonerstrasse 289, 8053 Zurich, Switzerland
Re: [patch] patch for correct ospfapi ready callbacks
Hi Greg,

Greg Troxel wrote:
> We could make turning off opaque lsas be an error for now (with no
> effect), and file a bug in bugzilla.

I think, if opaque LSAs are disabled, then originating should simply
fail by returning an error code. The user application then knows that
there is a misconfiguration and the user has to manually fix it by
turning opaque LSAs on.

Note that the OSPF API is also useful when no opaque capable routers are
present, in case an application wants a synchronized copy of the LSDB.

> I, for one, think the ready/not-ready callbacks are needed, and I don't
> think their semantics are wrong. I'm missing the not-ready callback though.
>
> I think (eventually) there should be 3 states, though:
>
> ready
> not-ready
> disabled

I agree, there should be something like a not-ready callback, informing
the application that it should suspend originating opaque LSAs.

If a user disables opaque capabilities, then we could return "not-ready"
so that the application stops originating opaque LSAs. Subsequently,
registering new opaque types will fail if opaque LSAs are disabled.

--

- Ralph

--
keller@tik.ee.ethz.ch
Computer Engineering and Networks Lab (TIK)
ETZ G61.3, ETHZ CH-8092 Zurich
Phone +41 1 632 7015 Fax +41 1 632 1035
Re: [srrd-dev 19] Re: [patch] patch for correct ospfapi ready callbacks
Hi all,

>> We could make turning off opaque lsas be an error for now (with no
>> effect), and file a bug in bugzilla.
>
> I think, if opaque LSAs are disabled, then originating should simply
> fail by returning an error code. The user application then knows that
> there is a misconfiguration and the user has to manually fix it by
> turning opaque LSAs on.

Manually or programmatically. SRRD reconfigures ospfd over the ospfd vty
(and automatically enables opaque lsa in ospfd on startup).

> Note that the OSPF API is also useful when no opaque capable routers are
> present in case an application whats a synchronized copy of the LSDB
>
>> I, for one, think the ready/not-ready callbacks are needed, and I
>> don't think their semantics are wrong. I'm missing the not-ready
>> callback though.
>>
>> I think (eventually) there should be 3 states, though:
>>
>> ready
>> not-ready
>> disabled
>
> I agree, there should be something like a not-ready callback, informing
> the application that it should suspend originating opaque LSAs.

Yep. That would be great.

> If a user disables opaque capabilities, then we could return "not-ready"
> so that the application stops originating opaque LSAs. Subsequently,
> registering new opaque types will fail if opaque LSAs are disabled.

Then how do we differentiate between:
- "not-ready" because we are opaque enabled but we are the _only_
opaque enabled router
- "not-ready" because we are not opaque enabled?

I propose to return something like the above "disabled" state on inject.
We could let inject return ok, not-ready _and_ disabled, allowing the
application to see those 3 different states.

Furthermore we should add a not-ready callback allowing the application
to notice this immediately.

Possibly we could add an argument to the not-ready callback saying whether
this happens because we lost the second-to-last opaque enabled router (->
"not-ready") or because we got disabled (-> "disabled").

One could argue that "not-ready" and "disabled" are the same from the
application's point of view, since in both states injecting lsa's is not
possible...
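
The proposed three-way inject result could be sketched like this. It is purely illustrative: `InjectResult`, `inject_result` and `should_suspend` are hypothetical names for an API change that has only been discussed in this thread.

```python
from enum import Enum


class InjectResult(Enum):
    OK = "ok"
    NOT_READY = "not-ready"   # opaque enabled, but no opaque-capable neighbor
    DISABLED = "disabled"     # opaque support switched off on this router


def inject_result(opaque_enabled, opaque_neighbors):
    """Decide what an inject would return in the proposed scheme."""
    if not opaque_enabled:
        return InjectResult.DISABLED
    if opaque_neighbors == 0:
        return InjectResult.NOT_READY
    return InjectResult.OK


def should_suspend(result):
    """Both failure states mean the application must stop injecting."""
    return result is not InjectResult.OK
```

An application that only cares whether it may inject can use `should_suspend`, while one that wants to repair a misconfiguration can look for `DISABLED` specifically.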

Regards
- Amir
--
Amir Guindehi, nospam.amir@datacore.ch
DataCore GmbH, Witikonerstrasse 289, 8053 Zurich, Switzerland