Xen Clocksource, the VDSO)... not how, but why... and some micro-benchmarks.
Hi,

The hypervisor command line and the domX kernel command line (using Linux
here as an example) both allow settings related to the clocksource.

https://xenbits.xen.org/docs/unstable/misc/xen-command-line.html#clocksource-x86

clocksource (x86) = pit | hpet | acpi | tsc

Or, for linux:

https://raw.githubusercontent.com/torvalds/linux/master/Documentation/admin-guide/kernel-parameters.txt

clocksource [X86-64] hpet,tsc

The question I want to raise in this post is not really what can be set,
or how, since that's clear. The question is: why should I change
defaults to something else (something better?).

By default, dom0 and domU use clocksource 'xen' (which can be seen at
/sys/devices/system/clocksource/clocksource0/current_clocksource).
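
As a minimal sketch of how to check this from a program, the following C
helpers read the sysfs file mentioned above, plus its sibling
available_clocksource in the same directory. The function names
read_first_line and print_clocksources are made up for this illustration
and are not part of any Xen or Linux API.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Read the first line of a text file into buf, with the newline stripped.
 * Returns 0 on success, -1 if the file cannot be opened or is empty. */
int read_first_line(const char *path, char *buf, size_t len)
{
    FILE *f;

    assert(path != NULL && buf != NULL && len > 0);

    f = fopen(path, "r");
    if (f == NULL)
        return -1;
    if (fgets(buf, (int)len, f) == NULL) {
        fclose(f);
        return -1;
    }
    fclose(f);
    buf[strcspn(buf, "\n")] = '\0';
    return 0;
}

/* Print the active and the available clocksources of the running kernel.
 * Files that cannot be read are silently skipped. */
void print_clocksources(void)
{
    static const char dir[] = "/sys/devices/system/clocksource/clocksource0";
    char path[128], line[256];

    snprintf(path, sizeof(path), "%s/current_clocksource", dir);
    if (read_first_line(path, line, sizeof(line)) == 0)
        printf("current:   %s\n", line);

    snprintf(path, sizeof(path), "%s/available_clocksource", dir);
    if (read_first_line(path, line, sizeof(line)) == 0)
        printf("available: %s\n", line);
}
```

Calling print_clocksources() from main() in dom0 or a domU shows which
clocksource is active and which ones the kernel would accept as
alternatives.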

A well-known 'issue' with the 'xen' clocksource is that it does not have
VDSO support to accelerate a certain group of syscalls. A rather
well-known blog post about this is...


https://blog.packagecloud.io/eng/2017/03/08/system-calls-are-much-slower-on-ec2/

So, I'm looking at the possibility to set...

clocksource=tsc tsc=stable:socket

...for the hypervisor, since I think my hardware can do this and...

clocksource=tsc

...for the domU kernel command line.

I still haven't found the exact reason why we should add clocksource=tsc
tsc=stable:socket to the hypervisor command line. It's not needed to
make all tsc vdso trickery in the domU work. Moreover, it does not even
seem needed at all to set clocksource=tsc in xen to be able to use it in
the domU?

The only place where I can find tsc=stable:socket being mentioned is
https://lore.kernel.org/patchwork/cover/849340/ which should be a
changeset to enable vdso calls for the xen clocksource (in linux 4.15).
But that never got merged.

==== Some benchmarks ====

Anyway, I did some micro-benchmarks today to see what different
combinations of settings do.

The victim hardware is a HP dl360 gen8 with Intel(R) Xeon(R) CPU E5-2650
v2 @ 2.60GHz cpus. cpuid says: Intel Xeon E5-1600/E5-2600 v2 (Ivy
Bridge-EP C1/M1/S1), 22nm. It's running Xen 4.11 (from commit
87f51bf366), and I have a test domU running Debian Stretch with 4.19.20
kernel. Dom0 is also Debian Stretch with 4.19.20 kernel.

What I'm testing is a totally not-real-life scenario of just calling
gettimeofday 5 million times:

-$ cat test2.c
#include <stdio.h>
#include <stdlib.h>
#include <sys/time.h>

int
main(int argc, char *argv[])
{
    struct timeval tv;
    int i = 0;
    for (; i<5000000; i++) {
        gettimeofday(&tv,NULL);
    }

    return 0;
}
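
A variant of this loop that times itself can make the vDSO effect visible
without an external stopwatch. The sketch below is not the program used
for the attached timings; it assumes x86-64 Linux (where SYS_gettimeofday
exists) and uses the hypothetical helper names elapsed_ns and
avg_gettimeofday_ns. It compares the libc gettimeofday() wrapper, which
goes through the vDSO when the clocksource supports it, with a forced
kernel entry via syscall(2).

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdio.h>
#include <sys/syscall.h>
#include <sys/time.h>
#include <time.h>
#include <unistd.h>

/* Nanoseconds elapsed between two CLOCK_MONOTONIC readings a and b. */
long long elapsed_ns(const struct timespec *a, const struct timespec *b)
{
    return (long long)(b->tv_sec - a->tv_sec) * 1000000000LL
           + (long long)(b->tv_nsec - a->tv_nsec);
}

/* Average cost in nanoseconds of one gettimeofday call.
 * use_raw_syscall == 0: libc wrapper (vDSO fast path when available).
 * use_raw_syscall != 0: always enter the kernel through syscall(2). */
double avg_gettimeofday_ns(long iterations, int use_raw_syscall)
{
    struct timespec start, end;
    struct timeval tv;
    long i;

    assert(iterations > 0);

    clock_gettime(CLOCK_MONOTONIC, &start);
    for (i = 0; i < iterations; i++) {
        if (use_raw_syscall)
            syscall(SYS_gettimeofday, &tv, NULL);
        else
            gettimeofday(&tv, NULL);
    }
    clock_gettime(CLOCK_MONOTONIC, &end);

    return (double)elapsed_ns(&start, &end) / (double)iterations;
}
```

Printing avg_gettimeofday_ns(5000000, 0) next to
avg_gettimeofday_ns(5000000, 1) gives two numbers per configuration. If
both are roughly equal, the libc call is most likely falling back to a
real syscall, which is the behaviour described for the 'xen' clocksource.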

The hypervisor command line has "clocksource=tsc tsc=stable:socket" in
all cases.

The results of running this with different combinations of PV, PVH, xen
and tsc clocksource, and tsc clocksource after live migration are
attached in xen-tsc-timings.txt. The results are... rather interesting.

==== TSC and no migrate ====

I have read the information at:


https://xenbits.xen.org/docs/unstable/man/xen-tscmode.7.html#TSC-INVARIANT-BIT-and-NO_MIGRATE

The information about only being able to use tsc when no migrate is set
seems no longer true, since I can set clocksource=tsc and then live
migrate. When doing so, the following things change in cpuid output:

--- cpuid 2019-03-07 15:57:39.045024075 +0100
+++ cpuid-after-migrate 2019-03-07 15:59:27.456474458 +0100
@@ -272,14 +272,14 @@
MSR base address = 0x40000000
MMU_PT_UPDATE_PRESERVE_AD supported = false
hypervisor time features (0x40000003/00):
- vtsc = false
+ vtsc = true
host tsc is safe = true
boot cpu has RDTSCP = true
tsc mode = 0x0 (0)
tsc frequency (kHz) = 2593772
- incarnation = 0x1 (1)
- cpu frequency (kHZ) = 747622702
- 0x40000004 0x00: eax=0x0000001f ebx=0x00000001 ecx=0x00000005 edx=0x00000000
+ incarnation = 0x2 (2)
+ cpu frequency (kHZ) = 0
+ 0x40000004 0x00: eax=0x0000001f ebx=0x00000000 ecx=0x0000000d edx=0x00000000
0x40000005 0x00: eax=0x00000000 ebx=0x00000000 ecx=0x00000000 edx=0x00000000
extended feature flags (0x80000001/edx):
SYSCALL and SYSRET instructions = true
@@ -341,7 +341,7 @@
software thermal control (STC) = false
100 MHz multiplier control = false
hardware P-State control = false
- TscInvariant = false
+ TscInvariant = true
Physical Address and Linear Address Size (0x80000008/eax):
maximum physical address bits = 0x2e (46)
maximum linear (virtual) address bits = 0x30 (48)

So it seems I'm running a virtualized tsc then.
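
The "hypervisor time features" fields in the cpuid output above come from
Xen's CPUID leaf 0x40000003, sub-leaf 0. The following sketch decodes the
four registers of that leaf. The bit layout used here (EAX bit 0 = vtsc,
bit 1 = host tsc is safe, bit 2 = RDTSCP, EBX = tsc mode, ECX = tsc
frequency in kHz, EDX = incarnation) is an assumption based on the order
in which the cpuid tool prints the fields and should be checked against
Xen's public cpuid.h header. The struct and function names are
hypothetical, and how the register values are obtained inside a guest is
left out.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Decoded view of the Xen time leaf (assumed layout, see text above). */
struct xen_time_leaf {
    int vtsc;              /* EAX bit 0: TSC reads are emulated by Xen */
    int host_tsc_safe;     /* EAX bit 1: host TSC considered reliable */
    int has_rdtscp;        /* EAX bit 2: RDTSCP available */
    uint32_t tsc_mode;     /* EBX: configured tsc_mode */
    uint32_t tsc_khz;      /* ECX: guest TSC frequency in kHz */
    uint32_t incarnation;  /* EDX: changes on save/restore or migration */
};

/* Split the raw register values into named fields. */
struct xen_time_leaf decode_xen_time_leaf(uint32_t eax, uint32_t ebx,
                                          uint32_t ecx, uint32_t edx)
{
    struct xen_time_leaf t;

    t.vtsc = (eax >> 0) & 1;
    t.host_tsc_safe = (eax >> 1) & 1;
    t.has_rdtscp = (eax >> 2) & 1;
    t.tsc_mode = ebx;
    t.tsc_khz = ecx;
    t.incarnation = edx;
    return t;
}

/* Print the fields using the same labels as the cpuid tool output. */
void print_xen_time_leaf(const struct xen_time_leaf *t)
{
    assert(t != NULL);

    printf("vtsc = %s\n", t->vtsc ? "true" : "false");
    printf("host tsc is safe = %s\n", t->host_tsc_safe ? "true" : "false");
    printf("boot cpu has RDTSCP = %s\n", t->has_rdtscp ? "true" : "false");
    printf("tsc mode = %u\n", (unsigned)t->tsc_mode);
    printf("tsc frequency (kHz) = %u\n", (unsigned)t->tsc_khz);
    printf("incarnation = %u\n", (unsigned)t->incarnation);
}
```

With this reading, the change seen after live migration is a vtsc bit
going from 0 to 1 together with an incarnation going from 1 to 2.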

==== H'okay, so... ====

So the remaining questions are:

* Why should I set clocksource=tsc on the hypervisor line at all? I can
see it makes the clocksource in dom0 change to tsc.
* What's this tsc=stable:socket about? What difference does it make for
Xen? Do I want this if my hardware can do it?
* What other things am I missing?
* Should we write a HOWTO wiki page about this, as addition to the
reference documentation?
* Any other feedback?

Thanks,

Hans
Re: Xen Clocksource, the VDSO)... not how, but why... and some micro-benchmarks.
On 3/7/19 5:39 PM, Hans van Kranenburg wrote:
> [...]
>
> I still haven't found the exact reason why we should add clocksource=tsc
> tsc=stable:socket to the hypervisor command line. It's not needed to
> make all tsc vdso trickery in the domU work. Moreover, it does not even
> seem needed at all to set clocksource=tsc in xen to be able to use it in
> the domU?
>
> The only place where I can find tsc=stable:socket being mentioned is
> https://lore.kernel.org/patchwork/cover/849340/ which should be a
> changeset to enable vdso calls for the xen clocksource (in linux 4.15).
> But that never got merged.

Well, I went on a journey to find out what tsc=stable:socket is actually
doing, and that journey ended rather soon. :D

The tsc=stable:socket was introduced in...

commit bc900cbc8f37b93cc6c9f6370beb14e6430b334d
Author: Joao Martins <joao.m.martins@oracle.com>
Date: Fri Sep 23 18:26:19 2016 +0200

x86/time: extend "tsc" param with "stable:socket"

...and nothing at all seems to be using TSC_RELIABLE_SOCKET or the
tsc_flags introduced there in the code further on.

+/* TSC is reliable across sockets */
+#define TSC_RELIABLE_SOCKET (1 << 0)

So, that command line part is effectively a noop. Maybe it was added
because that linux patchset for xen clocksource wanted to use it? But,
those patches also don't explain what would be special about this extra
option and what it would be used for.

--
Hans van Kranenburg
_______________________________________________
Xen-users mailing list
Xen-users@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-users
Re: Xen Clocksource, the VDSO)... not how, but why... and some micro-benchmarks.
On 11/03/2019 01:05, Hans van Kranenburg wrote:
> [...]
>
> So, that command line part is effectively a noop. Maybe it was added
> because that linux patchset for xen clocksource wanted to use it? But,
> those patches also don't explain what would be special about this extra
> option and what it would be used for.
>

What about its usage in init_tsc()? It will clearly have an effect on
systems with multiple sockets.


Juergen

Re: Xen Clocksource, the VDSO)... not how, but why... and some micro-benchmarks.
On 3/14/19 9:32 AM, Juergen Gross wrote:
> [...]
>
> What about its usage in init_tsc()? It will clearly have an effect on
> systems with multiple sockets.

Right, how did I miss that. Maybe some accidental filter on a file or
directory while showing the commit contents. :|

So, in that case, let's continue the investigation...

BIOS hyperthreading disabled
Xen smt=off clocksource=tsc tsc=stable:socket loglvl=all
----
(XEN) Brought up 12 CPUs
(XEN) TSC: CPU Hotplug intended <- logged on WARNING level
(XEN) TSC: Not setting it as clocksource <- logged on DEBUG level
----
nr_cpus : 12
max_cpu_id : 31
nr_nodes : 2
cores_per_socket : 6
threads_per_core : 1

When looking at the code:

    if ( nr_cpu_ids != num_present_cpus() )
    {
        printk(XENLOG_WARNING "TSC: CPU Hotplug intended\n");
        ret = 0;
    }

Hm... nr_cpu_ids? In my case nr_cpu_ids seems to have value 32, because
the hardware reports there can be 32 cpu cores in total.

That number 32 is interesting, since I guess it means that if I replaced
the 2x 6-core cpus in this pizzabox with 2x 8-core cpus and enabled
hyperthreading, I would reach 32 (2 sockets x 8 cores x 2 threads).

(XEN) SMP: Allowing 32 CPUs (20 hotplug CPUs)

However, I can't hotswap these physical cpus when the machine is
running, and I also can't change the BIOS hyperthreading setting while
it's running, so I don't really get what kind of CPU Hotplug I would be
intending here.

Dum dum dum... This means I can't ever get TSC.

Also... turning off hyperthreading is encouraged for security reasons
nowadays, so even if I bought some 8-core cpus, I would have to enable
hyperthreading in both the BIOS and Xen to get TSC.

Other combinations of settings:

BIOS hyperthreading enabled
Xen smt=off clocksource=tsc tsc=stable:socket loglvl=all
----
(XEN) Brought up 12 CPUs
(XEN) Parked 12 CPUs
(XEN) TSC: CPU Hotplug intended
(XEN) TSC: Not setting it as clocksource
----
nr_cpus : 12
max_cpu_id : 31
nr_nodes : 2
cores_per_socket : 6
threads_per_core : 1

and

BIOS hyperthreading enabled
Xen smt=on clocksource=tsc tsc=stable:socket loglvl=all
----
(XEN) Brought up 24 CPUs
(XEN) TSC: CPU Hotplug intended
(XEN) TSC: Not setting it as clocksource
----
nr_cpus : 24
max_cpu_id : 31
nr_nodes : 2
cores_per_socket : 6
threads_per_core : 2

    if ( nr_sockets > 1 && !(tsc_flags & TSC_RELIABLE_SOCKET) )
    {
        printk(XENLOG_WARNING "TSC: Not invariant across sockets\n");
        ret = 0;
    }

It's actually not easy to find which hardware can be used with
tsc=stable:socket.

This test box has Xeon 5600 series cpus. I can find something related in
the datasheet:

https://www.intel.com/content/www/us/en/processors/xeon/xeon-5600-vol-1-datasheet.html
On page 63 "Note: In order to ensure Timestamp Counter (TSC)
synchronization across sockets in multi-socket systems, the RESET#
deassertion edge should [...]"

But the hardware around the cpus is an HP DL360 G7, which would have to
comply with these rules from the datasheet. I haven't been able to find
any info yet on whether it does.

---- >8 ----

    if ( !ret )
        printk(XENLOG_DEBUG "TSC: Not setting it as clocksource\n");

    return ret;

I didn't see the "TSC: Not setting it as clocksource" at first, because
it's logged at DEBUG level. It would make sense to have that on WARNING
level, since as a user I'm explicitly telling it to use TSC and it's
refusing and doing something else, so I'd like to know.
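
To make the combined effect of the quoted checks easier to follow, here
is a small stand-alone model of them in plain C. It only mirrors the
three snippets quoted in this message and is not the actual init_tsc()
from Xen, which may contain further conditions. The function name
tsc_checks_pass and its parameters are made up for this illustration, and
printf() stands in for printk().

```c
#include <assert.h>
#include <stdio.h>

#define TSC_RELIABLE_SOCKET (1 << 0)  /* value as quoted from the commit */

/* Model of the quoted checks. Returns 1 if none of them rejects TSC,
 * 0 otherwise. present_cpus stands in for num_present_cpus(). */
int tsc_checks_pass(unsigned int nr_cpu_ids, unsigned int present_cpus,
                    unsigned int nr_sockets, unsigned int tsc_flags)
{
    int ret = 1;

    assert(present_cpus <= nr_cpu_ids);

    /* More possible CPU ids than present CPUs: treated as hotplug. */
    if (nr_cpu_ids != present_cpus) {
        printf("TSC: CPU Hotplug intended\n");
        ret = 0;
    }

    /* Multiple sockets need tsc=stable:socket to be accepted. */
    if (nr_sockets > 1 && !(tsc_flags & TSC_RELIABLE_SOCKET)) {
        printf("TSC: Not invariant across sockets\n");
        ret = 0;
    }

    if (!ret)
        printf("TSC: Not setting it as clocksource\n");

    return ret;
}
```

Feeding in the values from the logs above (32 possible CPU ids, 12
present CPUs, 2 sockets, stable:socket set) trips only the first check,
which matches the two messages seen in xl dmesg. With 12 possible ids
and 12 present CPUs, the result would depend on the stable:socket flag
alone.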

---- >8 ----

I also couldn't really find out yet which clocksource it's actually
using now, there is no info readily available in e.g. xl info output.

In xl dmesg, I earlier see...
(XEN) Platform timer is 14.318MHz HPET
...so maybe that's it?
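
For scripting this check, a small C helper can pull that line out of
captured log text. This is only a sketch: it assumes the output of
xl dmesg has already been read into a string, the function name
find_platform_timer is hypothetical, and whether the reported platform
timer really is the clocksource Xen ends up using is exactly the open
question here.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Find the first "Platform timer is " line in log and copy the rest of
 * that line into out (truncated to len - 1 characters).
 * Returns 0 if the line was found, -1 otherwise. */
int find_platform_timer(const char *log, char *out, size_t len)
{
    static const char key[] = "Platform timer is ";
    const char *p;
    size_t n;

    assert(log != NULL && out != NULL && len > 0);

    p = strstr(log, key);
    if (p == NULL)
        return -1;

    p += sizeof(key) - 1;       /* skip past the key itself */
    n = strcspn(p, "\n");       /* length up to end of line */
    if (n >= len)
        n = len - 1;
    memcpy(out, p, n);
    out[n] = '\0';
    return 0;
}
```

Given the log line shown above, the helper would hand back the part after
"Platform timer is", so a wrapper script could compare it against the
clocksource= value that was requested on the hypervisor command line.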

---- >8 ----

The dom0 Linux kernel seems happier about it. Even without clocksource=
being specified, it chooses tsc.

-# dmesg |grep -i tsc
[ 0.034139] tsc: Fast TSC calibration using PIT
[ 0.034141] tsc: Detected 2666.872 MHz processor
[ 0.034142] tsc: Detected 2666.784 MHz TSC
[ 1.042078] clocksource: tsc-early: mask: 0xffffffffffffffff max_cycles: 0x2670acaa8d1, max_idle_ns: 440795300001 ns
[ 1.165111] clocksource: Switched to clocksource tsc-early
[ 2.642276] clocksource: tsc: mask: 0xffffffffffffffff max_cycles: 0x2670acaa8d1, max_idle_ns: 440795300001 ns
[ 2.642462] clocksource: Switched to clocksource tsc

---- >8 ----

Now, this is all very interesting of course, but taking one step back
again: Should I care? Does the result of choices in the hypervisor have
effect on what I can do in dom0 and domU?

--
Hans van Kranenburg