Mailing List Archive

[PATCH RFC V4 5/5] Documentation/kvm : Add documentation on Hypercalls and features used for PV spinlock
Add documentation on CPUID, KVM_CAP_PVLOCK_KICK, and the hypercalls supported.

The KVM_HC_KICK_CPU hypercall is added to wake up a halted vcpu in a
paravirtual-spinlock-enabled guest.

KVM_FEATURE_PVLOCK_KICK lets the guest check whether pv spinlocks can
be enabled in the guest. Support in the host is queried via
ioctl(KVM_CHECK_EXTENSION, KVM_CAP_PVLOCK_KICK).
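
For illustration only (not part of the patch), a minimal userspace sketch of
the capability probe; it assumes KVM_CAP_PVLOCK_KICK is defined by the uapi
headers added in this series:

#include <fcntl.h>
#include <stdio.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

int main(void)
{
        /* System-wide KVM fd; capabilities are queried on it directly. */
        int kvm = open("/dev/kvm", O_RDWR);

        if (kvm < 0)
                return 1;

        /* A positive return value means the host supports the capability. */
        if (ioctl(kvm, KVM_CHECK_EXTENSION, KVM_CAP_PVLOCK_KICK) > 0)
                printf("KVM_CAP_PVLOCK_KICK supported\n");
        return 0;
}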

Minimal documentation and a template for hypercalls are added.

Signed-off-by: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com>
Signed-off-by: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com>
---
diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt
index e2a4b52..1583bc7 100644
--- a/Documentation/virtual/kvm/api.txt
+++ b/Documentation/virtual/kvm/api.txt
@@ -1109,6 +1109,13 @@ support. Instead it is reported via
if that returns true and you use KVM_CREATE_IRQCHIP, or if you emulate the
feature in userspace, then you can enable the feature for KVM_SET_CPUID2.

+Paravirtualized ticket spinlocks can be enabled in the guest by checking
+whether support exists in the host via
+
+ ioctl(KVM_CHECK_EXTENSION, KVM_CAP_PVLOCK_KICK)
+
+If this call returns true, the guest can use the feature.
+
4.47 KVM_PPC_GET_PVINFO

Capability: KVM_CAP_PPC_GET_PVINFO
diff --git a/Documentation/virtual/kvm/cpuid.txt b/Documentation/virtual/kvm/cpuid.txt
index 8820685..c7fc0da 100644
--- a/Documentation/virtual/kvm/cpuid.txt
+++ b/Documentation/virtual/kvm/cpuid.txt
@@ -39,6 +39,10 @@ KVM_FEATURE_CLOCKSOURCE2 || 3 || kvmclock available at msrs
KVM_FEATURE_ASYNC_PF || 4 || async pf can be enabled by
|| || writing to msr 0x4b564d02
------------------------------------------------------------------------------
+KVM_FEATURE_PVLOCK_KICK || 6 || guest checks this feature bit
+ || || before enabling paravirtualized
+ || || spinlock support.
+------------------------------------------------------------------------------
KVM_FEATURE_CLOCKSOURCE_STABLE_BIT || 24 || host will warn if no guest-side
|| || per-cpu warps are expected in
|| || kvmclock.
diff --git a/Documentation/virtual/kvm/hypercalls.txt b/Documentation/virtual/kvm/hypercalls.txt
new file mode 100644
index 0000000..7872da5
--- /dev/null
+++ b/Documentation/virtual/kvm/hypercalls.txt
@@ -0,0 +1,54 @@
+KVM Hypercalls Documentation
+============================
+Template for documentation:
+The documentation for each hypercall should include:
+1. Hypercall name, value.
+2. Architecture(s)
+3. Purpose
+
+
+1. KVM_HC_VAPIC_POLL_IRQ
+------------------------
+value: 1
+Architecture: x86
+Purpose:
+
+2. KVM_HC_MMU_OP
+------------------------
+value: 2
+Architecture: x86
+Purpose: Support MMU operations such as writing to a PTE,
+flushing the TLB, and releasing a PT.
+
+3. KVM_HC_FEATURES
+------------------------
+value: 3
+Architecture: PPC
+Purpose:
+
+4. KVM_HC_PPC_MAP_MAGIC_PAGE
+------------------------
+value: 4
+Architecture: PPC
+Purpose: To enable communication between the hypervisor and the guest, there
+is a new shared page that contains parts of supervisor-visible register state.
+The guest can map this shared page using this hypercall.
+
+5. KVM_HC_KICK_CPU
+------------------------
+value: 5
+Architecture: x86
+Purpose: Hypercall used to wake up a vcpu from HLT state
+
+Usage example: A vcpu of a paravirtualized guest that is busy-waiting in guest
+kernel mode for an event to occur (e.g. a spinlock to become available) can
+execute the HLT instruction once it has busy-waited for more than a threshold
+time interval. Execution of the HLT instruction would cause the hypervisor to
+put the vcpu to sleep (unless yield_on_hlt=0) until occurrence of an
+appropriate event. Another vcpu of the same guest can wake up the sleeping
+vcpu by issuing the KVM_HC_KICK_CPU hypercall, specifying the APIC ID of the
+vcpu to be woken up.
+
+TODO:
+1. more information on input and output needed?
+2. Add more detail to purpose of hypercalls.


Re: [PATCH RFC V4 5/5] Documentation/kvm : Add documentation on Hypercalls and features used for PV spinlock
On 14.01.2012, at 19:27, Raghavendra K T wrote:

> Add Documentation on CPUID, KVM_CAP_PVLOCK_KICK, and Hypercalls supported.
>
> KVM_HC_KICK_CPU hypercall added to wakeup halted vcpu in
> paravirtual spinlock enabled guest.
>
> KVM_FEATURE_PVLOCK_KICK enables guest to check whether pv spinlock can
> be enabled in guest. support in host is queried via
> ioctl(KVM_CHECK_EXTENSION, KVM_CAP_PVLOCK_KICK)
>
> A minimal Documentation and template is added for hypercalls.
>
> Signed-off-by: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com>
> Signed-off-by: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com>
> ---
> diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt
> index e2a4b52..1583bc7 100644
> --- a/Documentation/virtual/kvm/api.txt
> +++ b/Documentation/virtual/kvm/api.txt
> @@ -1109,6 +1109,13 @@ support. Instead it is reported via
> if that returns true and you use KVM_CREATE_IRQCHIP, or if you emulate the
> feature in userspace, then you can enable the feature for KVM_SET_CPUID2.
>
> +Paravirtualized ticket spinlocks can be enabled in guest by checking whether
> +support exists in host via,
> +
> + ioctl(KVM_CHECK_EXTENSION, KVM_CAP_PVLOCK_KICK)
> +
> +if this call return true, guest can use the feature.
> +
> 4.47 KVM_PPC_GET_PVINFO
>
> Capability: KVM_CAP_PPC_GET_PVINFO
> diff --git a/Documentation/virtual/kvm/cpuid.txt b/Documentation/virtual/kvm/cpuid.txt
> index 8820685..c7fc0da 100644
> --- a/Documentation/virtual/kvm/cpuid.txt
> +++ b/Documentation/virtual/kvm/cpuid.txt
> @@ -39,6 +39,10 @@ KVM_FEATURE_CLOCKSOURCE2 || 3 || kvmclock available at msrs
> KVM_FEATURE_ASYNC_PF || 4 || async pf can be enabled by
> || || writing to msr 0x4b564d02
> ------------------------------------------------------------------------------
> +KVM_FEATURE_PVLOCK_KICK || 6 || guest checks this feature bit
> + || || before enabling paravirtualized
> + || || spinlock support.
> +------------------------------------------------------------------------------
> KVM_FEATURE_CLOCKSOURCE_STABLE_BIT || 24 || host will warn if no guest-side
> || || per-cpu warps are expected in
> || || kvmclock.
> diff --git a/Documentation/virtual/kvm/hypercalls.txt b/Documentation/virtual/kvm/hypercalls.txt
> new file mode 100644
> index 0000000..7872da5
> --- /dev/null
> +++ b/Documentation/virtual/kvm/hypercalls.txt
> @@ -0,0 +1,54 @@
> +KVM Hypercalls Documentation
> +===========================
> +Template for documentation is
> +The documenenation for hypercalls should inlcude
> +1. Hypercall name, value.
> +2. Architecture(s)
> +3. Purpose
> +
> +
> +1. KVM_HC_VAPIC_POLL_IRQ
> +------------------------
> +value: 1
> +Architecture: x86
> +Purpose:
> +
> +2. KVM_HC_MMU_OP
> +------------------------
> +value: 2
> +Architecture: x86
> +Purpose: Support MMU operations such as writing to PTE,
> +flushing TLB, release PT.

This one is deprecated, no? Should probably be mentioned here.

> +
> +3. KVM_HC_FEATURES
> +------------------------
> +value: 3
> +Architecture: PPC
> +Purpose:

Expose hypercall availability to the guest. On x86 you use cpuid to enumerate which hypercalls are available. The natural fit on ppc would be device tree based lookup (which is also what EPAPR dictates), but we also have a second enumeration mechanism that's KVM specific - which is this hypercall.
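
(For the x86 side mentioned above, a rough sketch of the cpuid-based
enumeration, not taken from the patches: KVM publishes its feature bits in
EAX of leaf 0x40000001, and the bit number below is the KVM_FEATURE_PVLOCK_KICK
value proposed by this series; in-kernel guest code would normally just use
kvm_para_has_feature().)

#include <cpuid.h>
#include <stdbool.h>

#define KVM_CPUID_FEATURES      0x40000001
#define KVM_FEATURE_PVLOCK_KICK 6       /* bit number proposed by this series */

static bool guest_has_pvlock_kick(void)
{
        unsigned int eax, ebx, ecx, edx;

        /* Assumes the KVM hypervisor leaves were already detected via 0x40000000. */
        __cpuid(KVM_CPUID_FEATURES, eax, ebx, ecx, edx);
        return eax & (1u << KVM_FEATURE_PVLOCK_KICK);
}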

> +
> +4. KVM_HC_PPC_MAP_MAGIC_PAGE
> +------------------------
> +value: 4
> +Architecture: PPC
> +Purpose: To enable communication between the hypervisor and guest there is a
> +new

It's not new anymore :)

> shared page that contains parts of supervisor visible register state.
> +The guest can map this shared page using this hypercall.

... to access its supervisor register through memory.

> +
> +5. KVM_HC_KICK_CPU
> +------------------------
> +value: 5
> +Architecture: x86
> +Purpose: Hypercall used to wakeup a vcpu from HLT state
> +
> +Usage example : A vcpu of a paravirtualized guest that is busywaiting in guest
> +kernel mode for an event to occur (ex: a spinlock to become available)
> +can execute HLT instruction once it has busy-waited for more than a
> +threshold time-interval. Execution of HLT instruction would cause
> +the hypervisor to put the vcpu to sleep (unless yield_on_hlt=0) until occurence
> +of an appropriate event. Another vcpu of the same guest can wakeup the sleeping
> +vcpu by issuing KVM_HC_KICK_CPU hypercall, specifying APIC ID of the vcpu to be
> +wokenup.

The description is way too specific. The hypercall basically gives the guest the ability to yield() its current vcpu to another chosen vcpu. The APIC piece is an implementation detail for x86. On PPC we could just use the PIR register contents (processor identifier).

Maybe I didn't fully understand what this really is about though :)


Alex


Re: [PATCH RFC V4 5/5] Documentation/kvm : Add documentation on Hypercalls and features used for PV spinlock
* Alexander Graf <agraf@suse.de> [2012-01-16 04:23:24]:

> > +5. KVM_HC_KICK_CPU
> > +------------------------
> > +value: 5
> > +Architecture: x86
> > +Purpose: Hypercall used to wakeup a vcpu from HLT state
> > +
> > +Usage example : A vcpu of a paravirtualized guest that is busywaiting in guest
> > +kernel mode for an event to occur (ex: a spinlock to become available)
> > +can execute HLT instruction once it has busy-waited for more than a
> > +threshold time-interval. Execution of HLT instruction would cause
> > +the hypervisor to put the vcpu to sleep (unless yield_on_hlt=0) until occurence
> > +of an appropriate event. Another vcpu of the same guest can wakeup the sleeping
> > +vcpu by issuing KVM_HC_KICK_CPU hypercall, specifying APIC ID of the vcpu to be
> > +wokenup.
>
> The description is way too specific. The hypercall basically gives the guest the ability to yield() its current vcpu to another chosen vcpu.

Hmm ..the hypercall does not allow a vcpu to yield. It just allows some
target vcpu to be prodded/woken up, after which the vcpu continues execution.

Note that semantics of this hypercall is different from the hypercall on which
PPC pv-spinlock (__spin_yield()) is currently dependent. This is mainly because
of ticketlocks on x86 (which does not allow us to easily store owning cpu
details in lock word itself).

> The APIC piece is an implementation detail for x86. On PPC we could just use the PIR register contents (processor identifier).

- vatsa


Re: [PATCH RFC V4 5/5] Documentation/kvm : Add documentation on Hypercalls and features used for PV spinlock
On 16.01.2012, at 04:51, Srivatsa Vaddagiri wrote:

> * Alexander Graf <agraf@suse.de> [2012-01-16 04:23:24]:
>
>>> +5. KVM_HC_KICK_CPU
>>> +------------------------
>>> +value: 5
>>> +Architecture: x86
>>> +Purpose: Hypercall used to wakeup a vcpu from HLT state
>>> +
>>> +Usage example : A vcpu of a paravirtualized guest that is busywaiting in guest
>>> +kernel mode for an event to occur (ex: a spinlock to become available)
>>> +can execute HLT instruction once it has busy-waited for more than a
>>> +threshold time-interval. Execution of HLT instruction would cause
>>> +the hypervisor to put the vcpu to sleep (unless yield_on_hlt=0) until occurence
>>> +of an appropriate event. Another vcpu of the same guest can wakeup the sleeping
>>> +vcpu by issuing KVM_HC_KICK_CPU hypercall, specifying APIC ID of the vcpu to be
>>> +wokenup.
>>
>> The description is way too specific. The hypercall basically gives the guest the ability to yield() its current vcpu to another chosen vcpu.
>
> Hmm ..the hypercall does not allow a vcpu to yield. It just allows some
> target vcpu to be prodded/wokenup, after which vcpu continues execution.
>
> Note that semantics of this hypercall is different from the hypercall on which
> PPC pv-spinlock (__spin_yield()) is currently dependent. This is mainly because
> of ticketlocks on x86 (which does not allow us to easily store owning cpu
> details in lock word itself).

Yes, sorry for not being more exact in my wording. It is a directed yield(). Not like the normal old style thing that just says "I'm done, get some work to someone else" but more something like "I'm done, get some work to this specific guy over there" :).


Alex


Re: [PATCH RFC V4 5/5] Documentation/kvm : Add documentation on Hypercalls and features used for PV spinlock
On 01/16/2012 06:00 AM, Alexander Graf wrote:
> On 16.01.2012, at 04:51, Srivatsa Vaddagiri wrote:
>
> > * Alexander Graf <agraf@suse.de> [2012-01-16 04:23:24]:
> >
> >>> +5. KVM_HC_KICK_CPU
> >>> +------------------------
> >>> +value: 5
> >>> +Architecture: x86
> >>> +Purpose: Hypercall used to wakeup a vcpu from HLT state
> >>> +
> >>> +Usage example : A vcpu of a paravirtualized guest that is busywaiting in guest
> >>> +kernel mode for an event to occur (ex: a spinlock to become available)
> >>> +can execute HLT instruction once it has busy-waited for more than a
> >>> +threshold time-interval. Execution of HLT instruction would cause
> >>> +the hypervisor to put the vcpu to sleep (unless yield_on_hlt=0) until occurence
> >>> +of an appropriate event. Another vcpu of the same guest can wakeup the sleeping
> >>> +vcpu by issuing KVM_HC_KICK_CPU hypercall, specifying APIC ID of the vcpu to be
> >>> +wokenup.
> >>
> >> The description is way too specific. The hypercall basically gives the guest the ability to yield() its current vcpu to another chosen vcpu.
> >
> > Hmm ..the hypercall does not allow a vcpu to yield. It just allows some
> > target vcpu to be prodded/wokenup, after which vcpu continues execution.
> >
> > Note that semantics of this hypercall is different from the hypercall on which
> > PPC pv-spinlock (__spin_yield()) is currently dependent. This is mainly because
> > of ticketlocks on x86 (which does not allow us to easily store owning cpu
> > details in lock word itself).
>
> Yes, sorry for not being more exact in my wording. It is a directed yield(). Not like the normal old style thing that just says "I'm done, get some work to someone else" but more something like "I'm done, get some work to this specific guy over there" :).
>

It's not a yield. It unhalts a vcpu. Kind of like an IPI, but without
actually issuing an interrupt on the target, and disregarding the
interrupt flag. It says nothing about the source.
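
(A sketch of what the guest-side kick could look like, assuming the
single-argument form of the hypercall taking the target APIC ID as described
in this RFC; kvm_hypercall1() comes from asm/kvm_para.h and the helper name
here is made up:)

#include <asm/kvm_para.h>
#include <asm/smp.h>            /* x86_cpu_to_apicid */

/* Unhalt the vcpu backing 'cpu', e.g. the next-in-line ticket holder. */
static void kvm_unhalt_cpu(int cpu)
{
        int apicid = per_cpu(x86_cpu_to_apicid, cpu);

        kvm_hypercall1(KVM_HC_KICK_CPU, apicid);
}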

--
error compiling committee.c: too many arguments to function


Re: [PATCH RFC V4 5/5] Documentation/kvm : Add documentation on Hypercalls and features used for PV spinlock
On 01/14/2012 08:27 PM, Raghavendra K T wrote:
> +
> +5. KVM_HC_KICK_CPU
> +------------------------
> +value: 5
> +Architecture: x86
> +Purpose: Hypercall used to wakeup a vcpu from HLT state
> +
> +Usage example : A vcpu of a paravirtualized guest that is busywaiting in guest
> +kernel mode for an event to occur (ex: a spinlock to become available)
> +can execute HLT instruction once it has busy-waited for more than a
> +threshold time-interval. Execution of HLT instruction would cause
> +the hypervisor to put the vcpu to sleep (unless yield_on_hlt=0) until occurence
> +of an appropriate event. Another vcpu of the same guest can wakeup the sleeping
> +vcpu by issuing KVM_HC_KICK_CPU hypercall, specifying APIC ID of the vcpu to be
> +wokenup.

Wait, what happens with yield_on_hlt=0? Will the hypercall work as
advertised?

> +
> +TODO:
> +1. more information on input and output needed?
> +2. Add more detail to purpose of hypercalls.
>


--
error compiling committee.c: too many arguments to function


Re: [PATCH RFC V4 5/5] Documentation/kvm : Add documentation on Hypercalls and features used for PV spinlock
* Avi Kivity <avi@redhat.com> [2012-01-16 11:00:41]:

> Wait, what happens with yield_on_hlt=0? Will the hypercall work as
> advertised?

Hmm ..I don't think it will work when yield_on_hlt=0.

One option is to make the kick hypercall available only when
yield_on_hlt=1?

- vatsa


Re: [PATCH RFC V4 5/5] Documentation/kvm : Add documentation on Hypercalls and features used for PV spinlock
On 01/16/2012 11:40 AM, Srivatsa Vaddagiri wrote:
> * Avi Kivity <avi@redhat.com> [2012-01-16 11:00:41]:
>
> > Wait, what happens with yield_on_hlt=0? Will the hypercall work as
> > advertised?
>
> Hmm ..I don't think it will work when yield_on_hlt=0.
>
> One option is to make the kick hypercall available only when
> yield_on_hlt=1?

It's not a good idea to tie various options together. Features should
be orthogonal.

Can't we make it work? Just have different handling for
KVM_REQ_PVLOCK_KICK (let's rename it, and the hypercall, PV_UNHALT,
since we can use it for non-locks too).

--
error compiling committee.c: too many arguments to function


Re: [PATCH RFC V4 5/5] Documentation/kvm : Add documentation on Hypercalls and features used for PV spinlock
* Avi Kivity <avi@redhat.com> [2012-01-16 12:14:27]:

> > One option is to make the kick hypercall available only when
> > yield_on_hlt=1?
>
> It's not a good idea to tie various options together. Features should
> be orthogonal.
>
> Can't we make it work? Just have different handling for
> KVM_REQ_PVLOCK_KICK (let 's rename it, and the hypercall, PV_UNHALT,
> since we can use it for non-locks too).

The problem case I was thinking of was when guest VCPU would have issued
HLT with interrupts disabled. I guess one option is to inject an NMI,
and have the guest kernel NMI handler recognize this and make
adjustments such that the vcpu avoids going back to HLT instruction.

Having another hypercall to do yield/sleep (rather than effecting that
via HLT) seems like an alternate clean solution here ..

- vatsa


Re: [PATCH RFC V4 5/5] Documentation/kvm : Add documentation on Hypercalls and features used for PV spinlock
On Mon, Jan 16, 2012 at 07:41:17PM +0530, Srivatsa Vaddagiri wrote:
> * Avi Kivity <avi@redhat.com> [2012-01-16 12:14:27]:
>
> > > One option is to make the kick hypercall available only when
> > > yield_on_hlt=1?
> >
> > It's not a good idea to tie various options together. Features should
> > be orthogonal.
> >
> > Can't we make it work? Just have different handling for
> > KVM_REQ_PVLOCK_KICK (let 's rename it, and the hypercall, PV_UNHALT,
> > since we can use it for non-locks too).
>
> The problem case I was thinking of was when guest VCPU would have issued
> HLT with interrupts disabled. I guess one option is to inject an NMI,
> and have the guest kernel NMI handler recognize this and make
> adjustments such that the vcpu avoids going back to HLT instruction.
>
Just kick vcpu out of a guest mode and adjust rip to point after HLT on
next re-entry. Don't forget to call vmx_clear_hlt().
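
(A very rough, hypothetical sketch of that suggestion; it is not from the
series, assumes it would live in vmx.c where vmcs_write32() and the
activity-state constants are visible, and simply restates the two steps named
above:)

/* Hypothetical: run on the kicked vcpu before the next VM entry. */
static void vmx_unhalt_after_kick(struct kvm_vcpu *vcpu)
{
        /* Per the suggestion, advance RIP past the one-byte HLT opcode. */
        kvm_rip_write(vcpu, kvm_rip_read(vcpu) + 1);

        /* What vmx_clear_hlt() does: leave the HLT activity state. */
        vmcs_write32(GUEST_ACTIVITY_STATE, GUEST_ACTIVITY_ACTIVE);
}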

> Having another hypercall to do yield/sleep (rather than effecting that
> via HLT) seems like an alternate clean solution here ..
>
> - vatsa

--
Gleb.

Re: [PATCH RFC V4 5/5] Documentation/kvm : Add documentation on Hypercalls and features used for PV spinlock
* Gleb Natapov <gleb@redhat.com> [2012-01-17 11:14:13]:

> > The problem case I was thinking of was when guest VCPU would have issued
> > HLT with interrupts disabled. I guess one option is to inject an NMI,
> > and have the guest kernel NMI handler recognize this and make
> > adjustments such that the vcpu avoids going back to HLT instruction.
> >
> Just kick vcpu out of a guest mode and adjust rip to point after HLT on
> next re-entry. Don't forget to call vmx_clear_hlt().

Looks bit hackish to me compared to having another hypercall to yield!

> > Having another hypercall to do yield/sleep (rather than effecting that
> > via HLT) seems like an alternate clean solution here ..

- vatsa


Re: [PATCH RFC V4 5/5] Documentation/kvm : Add documentation on Hypercalls and features used for PV spinlock
On Tue, Jan 17, 2012 at 05:56:50PM +0530, Srivatsa Vaddagiri wrote:
> * Gleb Natapov <gleb@redhat.com> [2012-01-17 11:14:13]:
>
> > > The problem case I was thinking of was when guest VCPU would have issued
> > > HLT with interrupts disabled. I guess one option is to inject an NMI,
> > > and have the guest kernel NMI handler recognize this and make
> > > adjustments such that the vcpu avoids going back to HLT instruction.
> > >
> > Just kick vcpu out of a guest mode and adjust rip to point after HLT on
> > next re-entry. Don't forget to call vmx_clear_hlt().
>
> Looks bit hackish to me compared to having another hypercall to yield!
>
Do not see anything hackish about it. But what you described above (the
part I replied to) is not another hypercall, but yet another NMI source
and special handling in a guest. So what hypercall do you mean?

> > > Having another hypercall to do yield/sleep (rather than effecting that
> > > via HLT) seems like an alternate clean solution here ..
>
> - vatsa

--
Gleb.

Re: [PATCH RFC V4 5/5] Documentation/kvm : Add documentation on Hypercalls and features used for PV spinlock
* Gleb Natapov <gleb@redhat.com> [2012-01-17 14:51:26]:

> On Tue, Jan 17, 2012 at 05:56:50PM +0530, Srivatsa Vaddagiri wrote:
> > * Gleb Natapov <gleb@redhat.com> [2012-01-17 11:14:13]:
> >
> > > > The problem case I was thinking of was when guest VCPU would have issued
> > > > HLT with interrupts disabled. I guess one option is to inject an NMI,
> > > > and have the guest kernel NMI handler recognize this and make
> > > > adjustments such that the vcpu avoids going back to HLT instruction.
> > > >
> > > Just kick vcpu out of a guest mode and adjust rip to point after HLT on
> > > next re-entry. Don't forget to call vmx_clear_hlt().
> >
> > Looks bit hackish to me compared to having another hypercall to yield!
> >
> Do not see anything hackish about it. But what you described above (the
> part I replied to) is not another hypercall, but yet another NMI source
> and special handling in a guest.

True, which I didn't exactly like and hence was suggesting we use
another hypercall to let spinning vcpu sleep.

> So what hypercall do you mean?

The hypercall is described below:

> > > > Having another hypercall to do yield/sleep (rather than effecting that
> > > > via HLT) seems like an alternate clean solution here ..

and was implemented in an earlier version of this patch (v2) as
KVM_HC_WAIT_FOR_KICK hypercall:

https://lkml.org/lkml/2011/10/23/211

Having the hypercall makes the intent of the vcpu (to sleep on a kick) clear
to the hypervisor, vs. assuming it from a trapped HLT instruction (which
anyway won't work when yield_on_hlt=0).

- vatsa


Re: [PATCH RFC V4 5/5] Documentation/kvm : Add documentation on Hypercalls and features used for PV spinlock
On 01/17/2012 06:21 PM, Gleb Natapov wrote:
> On Tue, Jan 17, 2012 at 05:56:50PM +0530, Srivatsa Vaddagiri wrote:
>> * Gleb Natapov<gleb@redhat.com> [2012-01-17 11:14:13]:
>>
>>>> The problem case I was thinking of was when guest VCPU would have issued
>>>> HLT with interrupts disabled. I guess one option is to inject an NMI,
>>>> and have the guest kernel NMI handler recognize this and make
>>>> adjustments such that the vcpu avoids going back to HLT instruction.
>>>>
>>> Just kick vcpu out of a guest mode and adjust rip to point after HLT on
>>> next re-entry. Don't forget to call vmx_clear_hlt().
>>
>> Looks bit hackish to me compared to having another hypercall to yield!
>>
> Do not see anything hackish about it. But what you described above (the
> part I replied to) is not another hypercall, but yet another NMI source
> and special handling in a guest. So what hypercall do you mean?
>

An earlier version had a hypercall to sleep instead of the current halt()
approach. This was taken out to avoid an extra hypercall.

So here is the hypercall hunk referred to:

+/*
+ * kvm_pv_wait_for_kick_op : Block until kicked by either a KVM_HC_KICK_CPU
+ * hypercall or an event like an interrupt.
+ *
+ * @vcpu : vcpu which is blocking.
+ */
+static void kvm_pv_wait_for_kick_op(struct kvm_vcpu *vcpu)
+{
+        DEFINE_WAIT(wait);
+
+        /*
+         * Blocking on vcpu->wq allows us to wake up sooner if required to
+         * service pending events (like interrupts).
+         *
+         * Also set state to TASK_INTERRUPTIBLE before checking vcpu->kicked
+         * to avoid racing with kvm_pv_kick_cpu_op().
+         */
+        prepare_to_wait(&vcpu->wq, &wait, TASK_INTERRUPTIBLE);
+
+        /*
+         * Somebody has already tried kicking us. Acknowledge that
+         * and terminate the wait.
+         */
+        if (vcpu->kicked) {
+                vcpu->kicked = 0;
+                goto end_wait;
+        }
+
+        /*
+         * Let's wait for either KVM_HC_KICK_CPU or some other event
+         * to wake us up.
+         */
+
+        srcu_read_unlock(&vcpu->kvm->srcu, vcpu->srcu_idx);
+        schedule();
+        vcpu->srcu_idx = srcu_read_lock(&vcpu->kvm->srcu);
+
+end_wait:
+        finish_wait(&vcpu->wq, &wait);
+}

>>>> Having another hypercall to do yield/sleep (rather than effecting that
>>>> via HLT) seems like an alternate clean solution here ..


Re: [PATCH RFC V4 5/5] Documentation/kvm : Add documentation on Hypercalls and features used for PV spinlock
On Tue, Jan 17, 2012 at 06:41:03PM +0530, Srivatsa Vaddagiri wrote:
> * Gleb Natapov <gleb@redhat.com> [2012-01-17 14:51:26]:
>
> > On Tue, Jan 17, 2012 at 05:56:50PM +0530, Srivatsa Vaddagiri wrote:
> > > * Gleb Natapov <gleb@redhat.com> [2012-01-17 11:14:13]:
> > >
> > > > > The problem case I was thinking of was when guest VCPU would have issued
> > > > > HLT with interrupts disabled. I guess one option is to inject an NMI,
> > > > > and have the guest kernel NMI handler recognize this and make
> > > > > adjustments such that the vcpu avoids going back to HLT instruction.
> > > > >
> > > > Just kick vcpu out of a guest mode and adjust rip to point after HLT on
> > > > next re-entry. Don't forget to call vmx_clear_hlt().
> > >
> > > Looks bit hackish to me compared to having another hypercall to yield!
> > >
> > Do not see anything hackish about it. But what you described above (the
> > part I replied to) is not another hypercall, but yet another NMI source
> > and special handling in a guest.
>
> True, which I didn't exactly like and hence was suggesting we use
> another hypercall to let spinning vcpu sleep.
>
Ah, sorry. Missed that.

> > So what hypercall do you mean?
>
> The hypercall is described below:
>
> > > > > Having another hypercall to do yield/sleep (rather than effecting that
> > > > > via HLT) seems like an alternate clean solution here ..
>
> and was implemented in an earlier version of this patch (v2) as
> KVM_HC_WAIT_FOR_KICK hypercall:
>
> https://lkml.org/lkml/2011/10/23/211
>
> Having the hypercall makes the intent of vcpu (to sleep on a kick) clear to
> hypervisor vs assuming that because of a trapped HLT instruction (which
> will anyway won't work when yield_on_hlt=0).
>
The purpose of yield_on_hlt=0 is to allow a VCPU to occupy the CPU for the
entire time slice no matter what. I do not think disabling yield on HLT
even makes sense in a CPU oversubscription scenario. Now if you call
KVM_HC_WAIT_FOR_KICK instead of HLT you will effectively ignore the
yield_on_hlt=0 setting. This is like having a PV HLT that does not obey
the VMX exit control setting.

--
Gleb.

Re: [PATCH RFC V4 5/5] Documentation/kvm : Add documentation on Hypercalls and features used for PV spinlock
* Gleb Natapov <gleb@redhat.com> [2012-01-17 15:20:51]:

> > Having the hypercall makes the intent of vcpu (to sleep on a kick) clear to
> > hypervisor vs assuming that because of a trapped HLT instruction (which
> > will anyway won't work when yield_on_hlt=0).
> >
> The purpose of yield_on_hlt=0 is to allow VCPU to occupy CPU for the
> entire time slice no mater what. I do not think disabling yield on HLT
> is even make sense in CPU oversubscribe scenario.

Yes, so is there any real use for yield_on_hlt=0? I believe Anthony
initially added it as a way to implement CPU bandwidth capping for VMs,
which would ensure that busy VMs don't eat into cycles meant for an idle
VM. Now that we have proper support in the scheduler for CPU bandwidth
capping, is there any real-world use for yield_on_hlt=0? If not, deprecate it?

> Now if you'll call
> KVM_HC_WAIT_FOR_KICK instead of HLT you will effectively ignore
> yield_on_hlt=0 setting.

I guess that depends on what we do in KVM_HC_WAIT_FOR_KICK. If we do
yield_to() rather than sleep, it should minimize how many cycles the vcpu
gives away to a competing VM (which seems to be the main reason why one may
want to set yield_on_hlt=0).

> This is like having PV HLT that does not obey
> VMX exit control setting.

- vatsa


Re: [PATCH RFC V4 5/5] Documentation/kvm : Add documentation on Hypercalls and features used for PV spinlock
On Tue, Jan 17, 2012 at 07:58:18PM +0530, Srivatsa Vaddagiri wrote:
> * Gleb Natapov <gleb@redhat.com> [2012-01-17 15:20:51]:
>
> > > Having the hypercall makes the intent of vcpu (to sleep on a kick) clear to
> > > hypervisor vs assuming that because of a trapped HLT instruction (which
> > > will anyway won't work when yield_on_hlt=0).
> > >
> > The purpose of yield_on_hlt=0 is to allow VCPU to occupy CPU for the
> > entire time slice no mater what. I do not think disabling yield on HLT
> > is even make sense in CPU oversubscribe scenario.
>
> Yes, so is there any real use for yield_on_hlt=0? I believe Anthony
> initially added it as a way to implement CPU bandwidth capping for VMs,
> which would ensure that busy VMs don't eat into cycles meant for a idle
> VM. Now that we have proper support in scheduler for CPU bandwidth capping, is
> there any real world use for yield_on_hlt=0? If not, deprecate it?
>
I was against adding it in the first place, so if IBM no longer needs it
I am for removing it ASAP.

> > Now if you'll call
> > KVM_HC_WAIT_FOR_KICK instead of HLT you will effectively ignore
> > yield_on_hlt=0 setting.
>
> I guess that depends on what we do in KVM_HC_WAIT_FOR_KICK. If we do
> yield_to() rather than sleep, it should minimize how much cycles vcpu gives away
> to a competing VM (which seems to be the biggest purpose why one may
> want to set yield_on_hlt=0).
>
> > This is like having PV HLT that does not obey
> > VMX exit control setting.
>
> - vatsa

--
Gleb.

Re: [PATCH RFC V4 5/5] Documentation/kvm : Add documentation on Hypercalls and features used for PV spinlock
On Tue, Jan 17, 2012 at 05:32:33PM +0200, Gleb Natapov wrote:
> On Tue, Jan 17, 2012 at 07:58:18PM +0530, Srivatsa Vaddagiri wrote:
> > * Gleb Natapov <gleb@redhat.com> [2012-01-17 15:20:51]:
> >
> > > > Having the hypercall makes the intent of vcpu (to sleep on a kick) clear to
> > > > hypervisor vs assuming that because of a trapped HLT instruction (which
> > > > will anyway won't work when yield_on_hlt=0).
> > > >
> > > The purpose of yield_on_hlt=0 is to allow VCPU to occupy CPU for the
> > > entire time slice no mater what. I do not think disabling yield on HLT
> > > is even make sense in CPU oversubscribe scenario.
> >
> > Yes, so is there any real use for yield_on_hlt=0? I believe Anthony
> > initially added it as a way to implement CPU bandwidth capping for VMs,
> > which would ensure that busy VMs don't eat into cycles meant for a idle
> > VM. Now that we have proper support in scheduler for CPU bandwidth capping, is
> > there any real world use for yield_on_hlt=0? If not, deprecate it?
> >
> I was against adding it in the first place, so if IBM no longer needs it
> I am for removing it ASAP.

+1.

Anthony?


Re: [PATCH RFC V4 5/5] Documentation/kvm : Add documentation on Hypercalls and features used for PV spinlock
* Marcelo Tosatti <mtosatti@redhat.com> [2012-01-17 13:53:03]:

> On Tue, Jan 17, 2012 at 05:32:33PM +0200, Gleb Natapov wrote:
> > On Tue, Jan 17, 2012 at 07:58:18PM +0530, Srivatsa Vaddagiri wrote:
> > > * Gleb Natapov <gleb@redhat.com> [2012-01-17 15:20:51]:
> > >
> > > > > Having the hypercall makes the intent of vcpu (to sleep on a kick) clear to
> > > > > hypervisor vs assuming that because of a trapped HLT instruction (which
> > > > > will anyway won't work when yield_on_hlt=0).
> > > > >
> > > > The purpose of yield_on_hlt=0 is to allow VCPU to occupy CPU for the
> > > > entire time slice no mater what. I do not think disabling yield on HLT
> > > > is even make sense in CPU oversubscribe scenario.
> > >
> > > Yes, so is there any real use for yield_on_hlt=0? I believe Anthony
> > > initially added it as a way to implement CPU bandwidth capping for VMs,
> > > which would ensure that busy VMs don't eat into cycles meant for a idle
> > > VM. Now that we have proper support in scheduler for CPU bandwidth capping, is
> > > there any real world use for yield_on_hlt=0? If not, deprecate it?
> > >
> > I was against adding it in the first place, so if IBM no longer needs it
> > I am for removing it ASAP.
>
> +1.
>
> Anthony?

CCing Anthony.

Anthony, could you ACK removal of yield_on_hlt (as keeping it around will
require unnecessary complications in pv-spinlock patches)?

- vatsa

