Hi,
"xl vcpu-set" is commonly used to hotplug/unhotplug vCPUs of an SMP VM.
However, the current implementation of this command makes the driver
domain vulnerable to denial-of-service attack: in certain cases, this command
consumes too many CPU cycles in dom0, adversely affecting dom0's other
tasks (e.g., IO processing, monitoring, etc.)
[An illustrative example]
Say, with a Linux PV guest called "vm01", when vm01 just boots or reboots
(e.g., in its grub period), if dom0 issues "xl vcpu-set vm01 xxx" at this
moment, the following will happen:
(1) "xl vcpu-set" hangs, until vm01 has loaded its kernel successfully.
(2) in dom0, "oxenstored" consumes 100% of a single core.
So, if a malicious guest purposely stays in its grub period (or other
purposely designed phases, as long as it does not load its kernel),
"xl vcpu-set" will hang _forever_, and dom0 has to sacrifice one core.
[Affected versions]
This problem has been there since libxl was introduced in Xen 4.1 release.
[Implementation problem]
"xl vcpu-set" involves a "loop" as follows (retry_transaction):
--------------
static int libxl__set_vcpuonline_xenstore(...)
{
... ...
retry_transaction:
t = xs_transaction_start(CTX->xsh);
for (i = 0; i <= info.vcpu_max_id; i++)
libxl__xs_write(gc, t,
libxl__sprintf(gc, "%s/cpu/%u/availability", dompath, i),
"%s", libxl_bitmap_test(cpumap, i) ? "online" : "offline");
if (!xs_transaction_end(CTX->xsh, t, 0)) {
if (errno == EAGAIN)
goto retry_transaction;
}
... ...
}
--------------
[Possible solution]
In principle, the correctness of "xl vcpu-set" should _not_ depend on any
guest's behaviors.
One possible fix might be like this: if xs_transaction_end does not succeed,
either ignore it or retry for a pre-defined timeout period (rather
than forever).
[Suggestions]
I find that the loop method like "retry_transaction" is used at several places
in libxl. You probably want to revisit the potential negative effect
it brings.
Please take a look and help confirm my reported bug. Thanks.
(Cc'd to two authors listed in libxl.c)
Luwei Cheng
Department of Computer Science
The University of Hong Kong
_______________________________________________
Xen-api mailing list
Xen-api@lists.xen.org
http://lists.xen.org/cgi-bin/mailman/listinfo/xen-api
"xl vcpu-set" is commonly used to hotplug/unhotplug vCPUs of an SMP VM.
However, the current implementation of this command makes the driver
domain vulnerable to denial-of-service attack: in certain cases, this command
consumes too many CPU cycles in dom0, adversely affecting dom0's other
tasks (e.g., IO processing, monitoring, etc.)
[An illustrative example]
Say, with a Linux PV guest called "vm01", when vm01 just boots or reboots
(e.g., in its grub period), if dom0 issues "xl vcpu-set vm01 xxx" at this
moment, the following will happen:
(1) "xl vcpu-set" hangs, until vm01 has loaded its kernel successfully.
(2) in dom0, "oxenstored" consumes 100% of a single core.
So, if a malicious guest purposely stays in its grub period (or other
purposely designed phases, as long as it does not load its kernel),
"xl vcpu-set" will hang _forever_, and dom0 has to sacrifice one core.
[Affected versions]
This problem has been there since libxl was introduced in Xen 4.1 release.
[Implementation problem]
"xl vcpu-set" involves a "loop" as follows (retry_transaction):
--------------
static int libxl__set_vcpuonline_xenstore(...)
{
... ...
retry_transaction:
t = xs_transaction_start(CTX->xsh);
for (i = 0; i <= info.vcpu_max_id; i++)
libxl__xs_write(gc, t,
libxl__sprintf(gc, "%s/cpu/%u/availability", dompath, i),
"%s", libxl_bitmap_test(cpumap, i) ? "online" : "offline");
if (!xs_transaction_end(CTX->xsh, t, 0)) {
if (errno == EAGAIN)
goto retry_transaction;
}
... ...
}
--------------
[Possible solution]
In principle, the correctness of "xl vcpu-set" should _not_ depend on any
guest's behaviors.
One possible fix might be like this: if xs_transaction_end does not succeed,
either ignore it or retry for a pre-defined timeout period (rather
than forever).
[Suggestions]
I find that the loop method like "retry_transaction" is used at several places
in libxl. You probably want to revisit the potential negative effect
it brings.
Please take a look and help confirm my reported bug. Thanks.
(Cc'd to two authors listed in libxl.c)
Luwei Cheng
Department of Computer Science
The University of Hong Kong
_______________________________________________
Xen-api mailing list
Xen-api@lists.xen.org
http://lists.xen.org/cgi-bin/mailman/listinfo/xen-api