Mailing List Archive

Fwd: VirtualDomain broken for live migration.
Dear all,

I'm in the process of setting up my first four-node cluster. I'm
using CentOS7 with PCS/Pacemaker/Corosync.

I've got everything set up with shared storage using GlusterFS. The
cluster is running and I'm in the process of adding resources. My
intention for the cluster is to use it to host virtual machines. I
want the cluster to be able to live-migrate VMs between hosts. I'm
not interested in monitoring resources inside the guests, just knowing
that the guest is running or not is fine.

I've got all the virtualization working with libvirt using KVM. Live
migration works fine. Now I'm trying to make it work through the
cluster.

I am using the VirtualDomain resource in heartbeat. I can add and
remove VMs. It works. But the live migration feature is broken.
Looking at the source, the fault is on this line:

virsh ${VIRSH_OPTIONS} migrate --live $DOMAIN_NAME ${remoteuri} ${migrateuri}

I guess virsh must have changed at some point, because the "--live"
flag does not exist any more. I can make it work with the following
change

virsh ${VIRSH_OPTIONS} migrate --p2p --tunnelled $DOMAIN_NAME ${remoteuri} ${migrateuri}

This works, at least for my case where I'm tunnelling the migration
over SSH. But it's not a real bug fix because it's going to need
extra logic somewhere to determine whether it needs to add the
"--tunnelled" flag or not, and whatever other flags are required.

I see that the VirtualDomain resource hasn't been worked on in over
four years. Similarly the Wiki page has had no updates in this time.

http://www.linux-ha.org/wiki/VirtualDomain_%28resource_agent%29

Is this project still in active development? Is anyone actually
working on this? While I could do the work to fix the VirtualDomain
resource to work with the latest version of virsh, I don't see the
point if the project is dead. I gather Heartbeat became what is now
Pacemaker, but there doesn't seem to be a new up-to-date version of
VirtualDomain included with Pacemaker.

Indeed even the Pacemaker documentation seems completely out of date.
I spent hours working with ClusterMon and these pages

http://clusterlabs.org/doc/en-US/Pacemaker/1.1/html/Pacemaker_Explained/ch07.html
https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/7/html/High_Availability_Add-On_Reference/s1-eventnotification-HAAR.html

just trying to get my cluster to send notification emails. It was
only when I looked at the ClusterMon source and the man page for
crm_mon that I realised the documentation is completely wrong and
ClusterMon has no ability at all to send emails. The "extra_options"
field lists options that crm_mon doesn't even show as supported!

What does everybody else use for managing virtual machines on a
Pacemaker cluster? If heartbeat VirtualDomain is no longer supported,
can anyone point me in the direction of something that is still in
development?

Thanks for any help and advice anyone can offer.

Steve.
_______________________________________________
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems

Re: Fwd: VirtualDomain broken for live migration.
On Mon, 18 Aug 2014 20:29:19 +0000
Steven Hale <email@stevenhale.co.uk> wrote:

> What does everybody else use for managing virtual machines on a
> Pacemaker cluster? If heartbeat VirtualDomain is no longer supported,
> can anyone point me in the direction of something that is still in
> development?

Hi,

I suspect you may be looking in the wrong place. Development for
resource-agents has moved to Github, and the VirtualDomain agent was
last updated 17 days ago:

https://github.com/ClusterLabs/resource-agents/blob/master/heartbeat/VirtualDomain

--
// Kristoffer Grönlund
// kgronlund@suse.com

Re: Fwd: VirtualDomain broken for live migration.
Hi,

On Mon, Aug 18, 2014 at 08:29:19PM +0000, Steven Hale wrote:
> Dear all,
>
> I'm in the process of setting up my first four-node cluster. I'm
> using CentOS7 with PCS/Pacemaker/Corosync.
>
> I've got everything set up with shared storage using GlusterFS. The
> cluster is running and I'm in the process of adding resources. My
> intention for the cluster is to use it to host virtual machines. I
> want the cluster to be able to live-migrate VMs between hosts. I'm
> not interested in monitoring resources inside the guests, just knowing
> that the guest is running or not is fine.
>
> I've got all the virtualization working with libvirt using KVM. Live
> migration works fine. Now I'm trying to make it work through the
> cluster.
>
> I am using the VirtualDomain resource in heartbeat. I can add and
> remove VMs. It works. But the live migration feature is broken.
> Looking at the source, the fault is on this line:
>
> virsh ${VIRSH_OPTIONS} migrate --live $DOMAIN_NAME ${remoteuri} ${migrateuri}

Please open an issue at http://github.com/ClusterLabs/resource-agents

> I guess virsh must have changed at some point, because the "--live"
> flag does not exist any more.

I was not able to find any information on this. In libvirt
v1.2.5 the --live option still exists for the migrate command.

> I can make it work with the following
> change
>
> virsh ${VIRSH_OPTIONS} migrate --p2p --tunnelled $DOMAIN_NAME ${remoteuri} ${migrateuri}
>
> This works, at least for my case where I'm tunnelling the migration
> over SSH. But it's not a real bug fix because it's going to need
> extra logic somewhere to determine whether it needs to add the
> "--tunnelled" flag or not, and whatever other flags are required.
>
> I see that the VirtualDomain resource hasn't been worked on in over
> four years. Similarly the Wiki page has had no updates in this time.
>
> http://www.linux-ha.org/wiki/VirtualDomain_%28resource_agent%29
>
> Is this project still in active development?

Yes, it is. Isn't there some information in the resource-agents
package about where the software lives and where to get support?

Thanks,

Dejan



Re: Fwd: VirtualDomain broken for live migration.
Thanks Kristoffer, Cédric, and Dejan for your help.

On 19 August 2014 08:00, Dejan Muhamedagic <dejanmm@fastmail.fm> wrote:
> I was not able to find any information on this. In libvirt
> v1.2.5 the --live option still exists for the migrate command.

I'm using libvirt 1.1.1 which is the latest release in the CentOS7 yum repo.

There is no mention at all of a "--live" option on the libvirt Wiki

http://libvirt.org/migration.html

and it's not mentioned in "virsh help". Although I see it is
mentioned in the man page, so it is still there but it doesn't help
with the problem.

VirtualDomain appears to be forming a virsh command that looks like this

# virsh --connect=qemu:///system --quiet migrate --live VMname qemu+ssh://newhost/system

which gives me the following error

error: unable to connect to server at 'newhost:49152': No route to host

because it's not using SSH: it's trying to do a native migration over
a TCP connection, which is blocked by the firewall.

In order to tell virsh to use SSH I need to pass the "--tunnelled"
parameter, which in turn then requires the "--p2p" parameter.

# virsh --connect=qemu:///system --quiet migrate --live --tunnelled --p2p VMname qemu+ssh://newhost/system

Which correctly migrates the guest. How do I get VirtualDomain to
pass the "--tunnelled --p2p" options to make it actually use ssh as
required by the remote URI?
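
The flag selection being asked about here could be sketched roughly as
follows. This is only an illustration under stated assumptions, not code
from the VirtualDomain agent: the function name build_migrate_flags and
its "tunnelled" argument are hypothetical, and the sketch assumes the
administrator states the transport explicitly rather than having it
detected from the URI.

```shell
#!/bin/sh
# Hypothetical helper, not part of the VirtualDomain agent.
# Prints the virsh migrate flags for the requested transport.
#   $1 = "tunnelled" to carry the migration data over the libvirt
#        connection (for example qemu+ssh); anything else means a
#        native migration.
build_migrate_flags() {
    flags="--live"
    if [ "$1" = "tunnelled" ]; then
        # As noted above, --tunnelled in turn requires --p2p.
        flags="$flags --p2p --tunnelled"
    fi
    printf '%s\n' "$flags"
}

# Dry run: print the command instead of executing it.
# VMname and newhost are the placeholder names used in this message.
echo virsh --connect=qemu:///system migrate $(build_migrate_flags tunnelled) VMname qemu+ssh://newhost/system
```

Printing the command first makes it easy to compare against the manual
invocation that is known to work before letting anything run it.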

When I do a manual migration it does leave behind the source config,
and doesn't create a config on the new host. The guest goes from
"persistent" to "transient" as described in the table on the libvirt
Wiki. I've tried to pass the --undefine-source and --persist options
to make it transfer the config and go from "persistent" to
"persistent", but that doesn't work either. I get either

error: unsupported option '--undefine-source'. See --help.

or

error: command 'migrate' doesn't support option --undefine-source

depending on whether I try to put the option before or after the
"migrate" command. Looking at the help, as it suggests, doesn't help either.

# virsh --help | grep source
find-storage-pool-sources-as find potential storage pool sources
find-storage-pool-sources discover potential storage pool sources

which tells me nothing about the --undefine-source option.

Of course, this is nothing to do with Pacemaker. I only bring it up
as an example of where the libvirt documentation appears to be
completely wrong, hence my thinking that --live didn't appear to be
doing anything either.

> Yes, it is. Isn't there some information in the resource-agents
> package about where the software lives and where to get support?

Yes, I see now the link to the Github repo is provided in the package
information. Thanks. I hadn't seen that. I'd only been looking at
the linux-ha Wiki which doesn't seem to have been updated since 2010.

Thanks again for all the help.

Steve.

Re: Fwd: VirtualDomain broken for live migration.
Steven Hale <email@stevenhale.co.uk> writes:

> There is no mention at all of a "--live" option on the libvirt Wiki
>
> http://libvirt.org/migration.html

Live or offline migration is irrelevant to the topic of that page. The
documentation of this option should be (but is not) present on
http://libvirt.org/sources/virshcmdref/html/sect-migrate.html instead.

> and it's not mentioned in "virsh help"

Try virsh help migrate.

> In order to tell virsh to use SSH I need to pass the "--tunnelled"
> parameter, which in turn then requires the "--p2p" parameter.

I doubt SSH tunnels are the best way to configure cluster migration;
credential management won't be easy. I use qemu:// desturis with TLS
x509 authentication and tcp:// migrateuris for performance on a separate
network.
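
As a rough sketch of what such an invocation might look like: the host
names below are hypothetical placeholders ("newhost" for the
destination's libvirt address, "newhost-mig" for its address on the
separate migration network), and the TLS certificates are assumed to be
already deployed on both hosts.

```shell
#!/bin/sh
# Hypothetical names, for illustration only.
DOMAIN_NAME="VMname"
DEST_URI="qemu://newhost/system"   # destination libvirt URI (TLS, per the text above)
MIGRATE_URI="tcp://newhost-mig"    # where the migration data stream is sent

# Dry run: build the command and print it rather than executing it.
migrate_cmd="virsh --connect=qemu:///system migrate --live $DOMAIN_NAME $DEST_URI $MIGRATE_URI"
echo "$migrate_cmd"
```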

> When I do a manual migration it does leave behind the source config,
> and doesn't create a config on the new host. The guest goes from
> "persistent" to "transient" as described in the table on the libvirt
> Wiki. I've tried to pass the --undefine-source and --persist options
> to make it transfer the config and go from "persistent" to
> "persistent", but that doesn't work either.

Try --undefinesource and --persistent.
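
Combined with the tunnelled command quoted earlier in the thread, that
would look something like the sketch below. VMname and newhost are the
placeholder names from the earlier message, and whether these options
behave as hoped on libvirt 1.1.1 is an assumption that has not been
checked here.

```shell
#!/bin/sh
# Placeholder names taken from the earlier message in this thread.
DOMAIN_NAME="VMname"
REMOTE_URI="qemu+ssh://newhost/system"

# Dry run: build the command and print it instead of executing it.
# --persistent defines the guest on the destination;
# --undefinesource removes the definition from the source.
migrate_cmd="virsh --connect=qemu:///system migrate --live --p2p --tunnelled --persistent --undefinesource $DOMAIN_NAME $REMOTE_URI"
echo "$migrate_cmd"
```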

Btw. I decided to keep all my VMs transient and centrally manage their
configuration in the CIB. This required creating a new resource agent,
of course, which I can share if you are interested. It surely lacks
some features of VirtualDomain, but works pretty well for us.
--
Regards,
Feri.