PROPOSAL: Resignature Duplicate SRs
Hello Everyone,
I am in the early stages of planning the development of the following
functionality. If you have any comments or suggestions I would appreciate
your feedback.

I am looking to develop a supplemental pack for XenServer to allow an
existing SR to be reintroduced to the XenServer and have the new SR
resignatured (including all its VDIs) so both the original and new SR can
exist on the system at the same time.

*Context:*
We currently use XenServer with Apache CloudStack (ACS) and SolidFire for
storage. We have a configuration where we have an SR including a single
VDI that maps to a LUN in the SolidFire. This configuration allows us to
provide QoS on each VDI in our system. The SolidFire offers us a lot of
interesting features and is well supported by ACS.

*The Problem:*
We are trying to solve a problem that occurs when we take a snapshot of the
LUN on the SolidFire and then try to introduce that snapshot into
XenServer. The situation is as follows:
- XenServer has an SR with one or more VDIs in it, and that SR is backed by
a single LUN (L0) on the SolidFire (for QoS of that SR).
- The SolidFire can do a snapshot of this LUN (L0) to create a second
logically identical LUN (L1).
- The attempt to introduce the new LUN (L1) into XenServer fails with the
following error:

Attaching SR
Internal error:
Db_exn.Uniqueness_constraint_violation("SR", "uuid", "1c2...hash...71a")
Check your settings and try again.

This is the expected behavior because the current implementation of
XenServer does not know how to deal with an attempt to attach a second SR
with the same UUID as an SR that already exists in the system.

*Observations:*
- The newly created LUN (L1) on the SolidFire has SR metadata stored inside
it which includes the SR UUID.
- The newly created LUN (L1) on the SolidFire has VDI metadata stored
inside it which includes the VDI UUIDs.
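
For an LVM-based SR, the same UUIDs also appear to be embedded in the
volume group and logical volume names. The following is a minimal sketch,
assuming the 'VG_XenStorage-<SR UUID>' and 'VHD-<VDI UUID>' / 'LV-<VDI UUID>'
naming seen in the sm LVM drivers; the prefixes and helper names here are
assumptions to be verified against the driver code, not a confirmed API.

```python
import re

# Assumed prefixes, based on a reading of the sm LVM drivers; verify first.
VG_PREFIX = "VG_XenStorage-"
LV_PREFIXES = ("VHD-", "LV-")

# Canonical lowercase 8-4-4-4-12 UUID form.
UUID_RE = re.compile(r"^[0-9a-f]{8}(-[0-9a-f]{4}){3}-[0-9a-f]{12}$")


def sr_uuid_from_vg(vg_name):
    """Return the SR UUID embedded in a volume group name, or None."""
    if not vg_name.startswith(VG_PREFIX):
        return None
    candidate = vg_name[len(VG_PREFIX):]
    return candidate if UUID_RE.match(candidate) else None


def vdi_uuids_from_lvs(lv_names):
    """Return the VDI UUIDs embedded in logical volume names.

    Names without a known prefix or a valid UUID are skipped.
    """
    uuids = []
    for name in lv_names:
        for prefix in LV_PREFIXES:
            if name.startswith(prefix):
                candidate = name[len(prefix):]
                if UUID_RE.match(candidate):
                    uuids.append(candidate)
                break
    return uuids
```

If that naming holds, resignaturing would have to touch these names as well
as any metadata stored inside the volumes.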

*Challenges:*
- Since the new LUN (L1) has the SR UUID saved as metadata inside the LUN
(L1), we need to be able to modify that metadata in order to resignature
it. However, this means that we have to attach the SR in order to modify
the existing metadata configured on the LUN (L1).
- Since we cannot attach the LUN as an SR because of the uniqueness
constraint, we cannot modify the SR to resignature it.

*Possible Solution(s):*
(I am still validating how I will develop this, so please correct me if I
am off base on anything...)
- Introduce a new XenServer config option that would be something like
'resignature_duplicate_srs' with the following options:
-- 'False' (default) : This is the current behavior and would not let the
operation happen
-- 'True' : This option would catch if a duplicate SR is being added and
would resignature the newly added SR
- If the configuration is set to 'True' (resignature the SR) when
introducing a duplicate SR, the XenServer would attempt to do the following:
-- Generate a new SR UUID for the duplicate SR.
-- Attach the SR (temporarily? unofficially according to XenServer?) using
the new UUID and the iSCSI IQN (which would be the LUN (L1) on the
SolidFire in this case).
-- Update the SR metadata on the LUN (L1) to reflect the new SR UUID (I am
not sure what else would have to change, but this step is probably more
involved than a single UUID update).
-- Loop through all the VDIs on this SR and resignature each of the VDIs.
This may be reasonably complex since this is set up through LVM and I will
have to update the Volume Group (VG) and the Logical Volumes (LV). I have
not spent much time looking at this piece yet.
-- I should not have to worry about the VBDs at this point because none
should be attached to the VDIs since these will all be new references. The
VBDs should be automatically handled when the new VDIs are attached to VMs.
- Assuming this all goes as planned, I will probably have to detach this
new temporary SR now that it has been resignatured and kick off the normal
XenServer flow to attach this new SR. This is important because we need
the XenServer to update all of its local caching and such and load the SR
correctly from scratch.
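
The flow above can be sketched roughly as follows. This is an in-memory
illustration only, not XenServer or sm code: the 'Lun' class, the
'DuplicateSRError' exception, and the 'introduce_sr' and 'resignature'
functions are hypothetical names, and the real work of rewriting on-disk
metadata and LVM names is reduced here to swapping UUID strings.

```python
import uuid


class DuplicateSRError(Exception):
    """Stands in for the uniqueness violation XenServer reports today."""


class Lun:
    """Hypothetical in-memory model of the UUID metadata stored on a LUN."""

    def __init__(self, sr_uuid, vdi_uuids):
        self.sr_uuid = sr_uuid
        self.vdi_uuids = list(vdi_uuids)


def resignature(lun):
    """Give the SR and every VDI on `lun` a fresh UUID.

    Returns a dict mapping each old UUID to its replacement.
    """
    old_sr = lun.sr_uuid
    mapping = {old_sr: str(uuid.uuid4())}
    for old_vdi in lun.vdi_uuids:
        mapping[old_vdi] = str(uuid.uuid4())
    lun.sr_uuid = mapping[old_sr]
    lun.vdi_uuids = [mapping[v] for v in lun.vdi_uuids]
    return mapping


def introduce_sr(known_sr_uuids, lun, resignature_duplicate_srs=False):
    """Introduce the SR on `lun`, resignaturing it first if it is a duplicate.

    With the option off (the default), a duplicate is rejected as it is today.
    Returns the old-to-new UUID mapping (empty if nothing was resignatured).
    """
    mapping = {}
    if lun.sr_uuid in known_sr_uuids:
        if not resignature_duplicate_srs:
            raise DuplicateSRError(lun.sr_uuid)
        mapping = resignature(lun)
    # Stands in for the normal attach flow, run only once the UUID is unique.
    known_sr_uuids.add(lun.sr_uuid)
    return mapping
```

Returning the old-to-new mapping is a design choice for the sketch: a caller
such as ACS would likely need it to track which new VDI corresponds to which
original.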

*Next Steps:*
- Figure out how the supplemental packs work with the DDK:
<http://support.citrix.com/servlet/KbServlet/download/38324-102-714674/XenServer-6.5.0_Supplemental%20Packs%20and%20the%20DDK%20Guide.pdf>
- Set up a dev environment to start working on the DDK.
- Figure out how to unofficially attach an SR using a newly generated UUID
so I can get access to the LUN (L1) metadata and update it.

This is my plan right now. If you have concerns with this approach, please
speak up because I want to reduce the number of failed attempts at
implementing this. I am still learning how everything works and how it all
fits together, so your comments are very helpful for me as most of you
understand the inner workings of this better than I do at this point.

I have been spending a lot of time getting my head around how the Storage
Manager works by looking through the code here:
https://github.com/xapi-project/sm/tree/master/drivers

Thanks for your time.

Cheers,

*Will STEVENS*
Lead Developer

*CloudOps* *| *Cloud Solutions Experts
420 rue Guy *|* Montreal *|* Quebec *|* H3J 1S6
w cloudops.com *|* tw @CloudOps_