Mailing List Archive: Scalable Agent Communication

Scalable Agent Communication

May 16, 2012, 12:37 AM

Post #1 of 7 (1967 views)

Hi,
First and foremost sorry for being a bit unclear last night. I am not my
best at 1:55am. Would it be possible to move the meeting a few hours
forwards or backwards?

Update:
I have a POC running where instead of the agents polling the plugin
database a request is sent from the agent to the plugin to retrieve all
of the relevant network details (this is only done when a tap device has
been added or deleted (same for the gw)). I am still addressing the
agent network configuration (was distracted by a bug and issues with
devstack along the way). The current implementation has the agent
polling the hosts network devices to detect if there is a change. I
started with this following Sumits comments about decoupling the agent
and the VIF driver.

I hope to have a detailed design document ready in a few days -
reviewing will give a better idea and it will also help for
documentation in the future.

Problem that I tried to mention last night:
The VIF driver creates the tap device for the attachment. This is done
by taking the attachment UUID and using the first 11 bytes, which is the
name of the tap device. The agent detects the new tap device and
notifies the plugin which in turn sends the network details back to the
plugin.
The problem that I have with this (and it is a bug in the existing
Quantum code) is that if there is more than one attachment with the same
prefix then the networking will be incorrect (same for the network ID's).
For example:
Network A - 30e46c6c-fd53-4c32-86bb-628423c3083f
Attachment X on A - 04ea2bb8-d2fb-4517-97fc-046fe8eb04c5
On the host there will be - gw-30e46c6c-fd and tap04ea2bb8-d2 created.

Network B - *30e46c6c-fd*53-0000-2222-628423c3083f => problems with the
gateway
Attachment Y on B - *04ea2bb8-d2*fb-0000-1111-046fe8eb04c5
*The host will have to create gw-30e46c6c-fd and tap04ea2bb8-d2 created. *

First and foremost this will fail on the VIF driver with the creation.
On the agent side which is requesting the information - it may receive
the wrong network or attachment information. Should the plugin ensure
that the prefixes of the UUID's are unique until linux enables device
names whose length can solve the problem.

This is why I think that it is important that the VIF driver notifies
the agent with the actual ID's.

In addition to this I think that there are a number of additional issues
that we need to address:
1. Inclusion of openstack common - on the IRC last night it was
mentioned to have a blueprint for the config (I feel this only addresses
a small part of the problem). I think that we should do this for the
openstack common project. Thgis will be healthier in the short and long run.
2. Python 2.4. I have yet to understand how to identify which modules
are from later versions. If this is a MUST for the agents then we can
leave the agents as they are and introduce new agents that support RPC.
Is this a viable solution?
3. I am in favour of the drivers notifying the agents. Yes, this has a
bit of coupling ans syncing but it is a healthier solution.

Thanks
Gary