Mailing List Archive

[PATCH] Safely finish closing protocol when guest fails in blkfront
If a guest hits an error and aborts the connection of a block device,
its online state, set at device creation, will stop it from being
properly cleaned up.

A fix follows.

--
Glauber de Oliveira Costa
Red Hat Inc.
"Free as in Freedom"
Re: [PATCH] Safely finish closing protocol when guest fails in blkfront
On 4/12/06 9:40 pm, "Glauber de Oliveira Costa" <gcosta@redhat.com> wrote:

> If a guest hits an error and aborts the connection of a block device,
> its online state, set at device creation, will stop it from being
> properly cleaned up.
>
> A fix follows.

Assignment and unassignment of physical resources is really a tools issue.
Tools should really be integrated with device-hotplug success/failure anyway
-- for example, it is likely the initiator would like confirmation of
success/failure in most cases.

-- Keir


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
Re: [PATCH] Safely finish closing protocol when guest fails in blkfront
On Tue, Dec 05, 2006 at 07:39:11AM +0000, Keir Fraser wrote:
> On 4/12/06 9:40 pm, "Glauber de Oliveira Costa" <gcosta@redhat.com> wrote:
>
> > If a guest hits an error and aborts the connection of a block device,
> > its online state, set at device creation, will stop it from being
> > properly cleaned up.
> >
> > A fix follows.
>
> Assignment and unassignment of physical resources is really a tools issue.
> Tools should really be integrated with device-hotplug success/failure anyway
> -- for example, it is likely the initiator would like confirmation of
> success/failure in most cases.
>
Agreed. But what if, after a proper initiation, the frontend finds an error
and starts the Closing protocol? What will happen is that the test

if (xenbus_dev_is_online(dev))

will cause the device not to be unregistered. At this point, the backend
does not see frontend changes. (Putting the backend in Closing leads the
frontend to Closing and then Closed, but the backend never sees the frontend
closing, and so never goes to Closed.)

Given that, what can the tools do? For now, this is what leads me to
believe that arbitrary frontend-failure cases should be handled in the frontend.


--
Glauber de Oliveira Costa
Red Hat Inc.
"Free as in Freedom"

Re: [PATCH] Safely finish closing protocol when guest fails in blkfront
> > Assignment and unassignment of physical resources is really a tools issue.
> > Tools should really be integrated with device-hotplug success/failure anyway
> > -- for example, it is likely the initiator would like confirmation of
> > success/failure in most cases.
> >
> Agreed. But what if, after a proper initiation, the frontend finds an error
> and starts the Closing protocol? What will happen is that the test
>
> if (xenbus_dev_is_online(dev))
>
> will cause the device not to be unregistered. At this point, the backend
> does not see frontend changes. (Putting the backend in Closing leads the
> frontend to Closing and then Closed, but the backend never sees the frontend
> closing, and so never goes to Closed.)
>
> Given that, what can the tools do? For now, this is what leads me to
> believe that arbitrary frontend-failure cases should be handled in the frontend.
>
Keir,

Let me try to clarify this (I just realised that, even if this is the
right path, there is a piece missing).

Right now, I think that handling failures in the frontend code is the
correct choice, because failures can happen at pretty much any time.
According to the diagram at
http://wiki.xensource.com/xenwiki/XenSplitDrivers, a closedown
initiated by the frontend should end with the device being unregistered,
and I don't think the tools will _ever_ be able to do that. The best they
can do is wait to see whether the device connects properly, but what if
the error happens after that? If this is indeed the real scenario, the
missing piece would be to delete the error message, to avoid
unregistering devices that should not be unregistered.

If you can assure me that, now and in the future, errors on the frontend
side will _always_ be confined to the pre-Connect steps, then my proposal
is to set the online flag just after the device is connected. That would
ensure that the device is properly unregistered, and the tools would have a
way to know whether the process was successful (online = 1). Any comments
on that?

I admit that I don't understand exactly the purpose of online. At
first I thought it was related to save & restore, but I'm currently able
to save & restore with online always being 0. Can you shed some light
on it?

As soon as you answer those, I'll proceed with the right approach to fix this.


--
Glauber de Oliveira Costa.
"Free as in Freedom"

Add your comments to GPLv3 at:
http://gplv3.fsf.org/comments/gplv3-draft-2.html
