Mailing List Archive

[xen-unstable test] 6947: regressions - trouble: broken/fail/pass
flight 6947 xen-unstable real [real]
http://www.chiark.greenend.org.uk/~xensrcts/logs/6947/

Regressions :-(

Tests which did not succeed and are blocking:
test-amd64-amd64-pair                    8 xen-boot/dst_host  fail REGR. vs. 6944
test-amd64-amd64-pair                    7 xen-boot/src_host  fail REGR. vs. 6944
test-amd64-amd64-pv                      5 xen-boot           fail REGR. vs. 6944
test-amd64-amd64-win                     3 host-install(3)    broken
test-amd64-amd64-xl-win                  5 xen-boot           fail REGR. vs. 6944
test-amd64-amd64-xl                      5 xen-boot           fail REGR. vs. 6944
test-amd64-i386-pair                     8 xen-boot/dst_host  fail REGR. vs. 6944
test-amd64-i386-pair                     7 xen-boot/src_host  fail REGR. vs. 6944
test-amd64-i386-pv                       5 xen-boot           fail REGR. vs. 6944
test-amd64-i386-rhel6hvm-amd             5 xen-boot           fail REGR. vs. 6944
test-amd64-i386-rhel6hvm-intel           3 host-install(3)    broken
test-amd64-i386-win-vcpus1               3 host-install(3)    broken
test-amd64-i386-win                      5 xen-boot           fail REGR. vs. 6944
test-amd64-i386-xl-credit2               5 xen-boot           fail REGR. vs. 6944
test-amd64-i386-xl-multivcpu             5 xen-boot           fail REGR. vs. 6944
test-amd64-i386-xl-win-vcpus1            5 xen-boot           fail REGR. vs. 6944
test-amd64-i386-xl                       5 xen-boot           fail REGR. vs. 6944
test-amd64-xcpkern-i386-pair             8 xen-boot/dst_host  fail REGR. vs. 6944
test-amd64-xcpkern-i386-pair             7 xen-boot/src_host  fail REGR. vs. 6944
test-amd64-xcpkern-i386-pv               5 xen-boot           fail REGR. vs. 6944
test-amd64-xcpkern-i386-rhel6hvm-amd     5 xen-boot           fail REGR. vs. 6944
test-amd64-xcpkern-i386-rhel6hvm-intel   5 xen-boot           fail REGR. vs. 6944
test-amd64-xcpkern-i386-win              3 host-install(3)    broken
test-amd64-xcpkern-i386-xl-credit2       5 xen-boot           fail REGR. vs. 6944
test-amd64-xcpkern-i386-xl-multivcpu     5 xen-boot           fail REGR. vs. 6944
test-amd64-xcpkern-i386-xl-win           5 xen-boot           fail REGR. vs. 6944
test-amd64-xcpkern-i386-xl               5 xen-boot           fail REGR. vs. 6944
test-i386-i386-pair                      8 xen-boot/dst_host  fail REGR. vs. 6945
test-i386-i386-pair                      7 xen-boot/src_host  fail REGR. vs. 6945
test-i386-i386-pv                        5 xen-boot           fail REGR. vs. 6945
test-i386-i386-win                       5 xen-boot           fail REGR. vs. 6945
test-i386-i386-xl-win                    3 host-install(3)    broken
test-i386-i386-xl                        5 xen-boot           fail REGR. vs. 6945
test-i386-xcpkern-i386-pair              8 xen-boot/dst_host  fail REGR. vs. 6945
test-i386-xcpkern-i386-pair              7 xen-boot/src_host  fail REGR. vs. 6945
test-i386-xcpkern-i386-pv                5 xen-boot           fail REGR. vs. 6945
test-i386-xcpkern-i386-win               5 xen-boot           fail REGR. vs. 6945
test-i386-xcpkern-i386-xl                5 xen-boot           fail REGR. vs. 6945

version targeted for testing:
 xen                  24346f749826
baseline version:
 xen                  476b0d68e7d5

------------------------------------------------------------
People who touched revisions under test:
Jan Beulich <jbeulich@novell.com>
Keir Fraser <keir@xen.org>
Samuel Thibault <samuel.thibault@ens-lyon.org>
------------------------------------------------------------

jobs:
build-i386-xcpkern                       pass
build-amd64                              pass
build-i386                               pass
build-amd64-oldkern                      pass
build-i386-oldkern                       pass
build-amd64-pvops                        pass
build-i386-pvops                         pass
test-amd64-amd64-xl                      fail
test-amd64-i386-xl                       fail
test-i386-i386-xl                        fail
test-amd64-xcpkern-i386-xl               fail
test-i386-xcpkern-i386-xl                fail
test-amd64-i386-rhel6hvm-amd             fail
test-amd64-xcpkern-i386-rhel6hvm-amd     fail
test-amd64-i386-xl-credit2               fail
test-amd64-xcpkern-i386-xl-credit2       fail
test-amd64-i386-rhel6hvm-intel           broken
test-amd64-xcpkern-i386-rhel6hvm-intel   fail
test-amd64-i386-xl-multivcpu             fail
test-amd64-xcpkern-i386-xl-multivcpu     fail
test-amd64-amd64-pair                    fail
test-amd64-i386-pair                     fail
test-i386-i386-pair                      fail
test-amd64-xcpkern-i386-pair             fail
test-i386-xcpkern-i386-pair              fail
test-amd64-amd64-pv                      fail
test-amd64-i386-pv                       fail
test-i386-i386-pv                        fail
test-amd64-xcpkern-i386-pv               fail
test-i386-xcpkern-i386-pv                fail
test-amd64-i386-win-vcpus1               broken
test-amd64-i386-xl-win-vcpus1            fail
test-amd64-amd64-win                     broken
test-amd64-i386-win                      fail
test-i386-i386-win                       fail
test-amd64-xcpkern-i386-win              broken
test-i386-xcpkern-i386-win               fail
test-amd64-amd64-xl-win                  fail
test-i386-i386-xl-win                    broken
test-amd64-xcpkern-i386-xl-win           fail


------------------------------------------------------------
sg-report-flight on woking.cam.xci-test.com
logs: /home/xc_osstest/logs
images: /home/xc_osstest/images

Logs, config files, etc. are available at
http://www.chiark.greenend.org.uk/~xensrcts/logs

Test harness code can be found at
http://xenbits.xensource.com/gitweb?p=osstest.git;a=summary


Not pushing.

------------------------------------------------------------
changeset: 23296:24346f749826
tag: tip
user: Jan Beulich <jbeulich@novell.com>
date: Sun May 01 13:17:44 2011 +0100

replace d->nr_pirqs sized arrays with radix tree

With this it is questionable whether retaining struct domain's
nr_pirqs is actually necessary - the value now only serves for bounds
checking, and this boundary could easily be nr_irqs.

Another thing to consider is whether it's worth storing the pirq
number in struct pirq, to avoid passing the number and a pointer to
quite a number of functions.

Note that ia64, the build of which is broken currently anyway, is only
partially fixed up.

Signed-off-by: Jan Beulich <jbeulich@novell.com>


changeset: 23295:4891f1f41ba5
user: Jan Beulich <jbeulich@novell.com>
date: Sun May 01 13:16:30 2011 +0100

x86: replace nr_irqs sized per-domain arrays with radix trees

It would seem possible to fold the two trees into one (making e.g. the
emuirq bits stored in the upper half of the pointer), but I'm not
certain that's worth it as it would make deletion of entries more
cumbersome. Unless pirq-s and emuirq-s were mutually exclusive...

Signed-off-by: Jan Beulich <jbeulich@novell.com>


changeset: 23294:c0a8f889ca9e
user: Keir Fraser <keir@xen.org>
date: Sun May 01 13:03:37 2011 +0100

public/arch-ia64/debug_op.h: Reinsert copyright that I accidentally deleted.

Signed-off-by: Keir Fraser <keir@xen.org>


changeset: 23293:f48c72de4208
user: Jan Beulich <jbeulich@novell.com>
date: Sun May 01 10:20:44 2011 +0100

x86: a little bit of cleanup to time.c

Signed-off-by: Jan Beulich <jbeulich@novell.com>


changeset: 23292:e2fb962d13ff
user: Jan Beulich <jbeulich@novell.com>
date: Sun May 01 10:16:54 2011 +0100

x86: clean up building in mm/hap/

Building 4-level guest walks is unnecessary for x86-32, and with this
no longer being built the fallback code used here isn't necessary
anymore either.

Additionally the mechanism to determine the value of
GUEST_PAGING_LEVELS to be passed to the compiler can be much
simplified given that we're using a pattern rule here.

Signed-off-by: Jan Beulich <jbeulich@novell.com>


changeset: 23291:485b7c5e6f17
user: Jan Beulich <jbeulich@novell.com>
date: Sun May 01 10:15:11 2011 +0100

A little bit of SMP boot code cleanup

Signed-off-by: Jan Beulich <jbeulich@novell.com>


changeset: 23290:1ac7336b6298
user: Jan Beulich <jbeulich@novell.com>
date: Sun May 01 10:14:15 2011 +0100

x86: set ARAT feature flag for non-buggy AMD CPUs

This is the equivalent of a recent Linux change.

Signed-off-by: Jan Beulich <jbeulich@novell.com>


changeset: 23289:e4fc9494b940
user: Samuel Thibault <samuel.thibault@ens-lyon.org>
date: Sun May 01 10:11:58 2011 +0100

mini-os: fix lib.h licence

Update the Linux stdio functions prototypes, and move them to a
separate header, licenced under GPL2+. Import FreeBSD8 string
functions prototypes, update licence. Drop kvec, which is of uncertain
origin and useless anyway.

Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>


changeset: 23288:60dfb5aca706
user: Samuel Thibault <samuel.thibault@ens-lyon.org>
date: Sun May 01 10:10:12 2011 +0100

mini-os: lib/math.c: import FreeBSD 8 functions

Import lib/math.c functions (and thus licence) from FreeBSD 8,
and re-apply a few of our changes. Whitespaces left aside, this
leads to almost no source change except s/int64_t/quad_t/ and
s/uint64_t/u_quad_t/.

Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>


changeset: 23287:bf11f502684a
user: Samuel Thibault <samuel.thibault@ens-lyon.org>
date: Sun May 01 10:09:47 2011 +0100

mini-os: Fix printf.c licence

Changeset df1348e72390 actually completely replaced the freebsd printf
implementation with the Linux printf implementation. Further changes
are extremely minor and thus don't pose an IP issue. Fix the licence
accordingly.

Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>


changeset: 23286:6f48f5f843f0
user: Keir Fraser <keir@xen.org>
date: Sun May 01 10:08:40 2011 +0100

Clean up licensing in the public header directory.

The COPYING file at xen/include/public/COPYING clearly states that all
public header files are distributed under a permissive MIT
license. Therefore make sure the same permissive license is included
at the top of every header file (i.e., not GPL).

Signed-off-by: Keir Fraser <keir@xen.org>


changeset: 23285:a7ac0a0170b0
user: Keir Fraser <keir@xen.org>
date: Sun May 01 09:32:48 2011 +0100

x86: Clean up smp_call_function handling.

We don't need so many communication fields between caller and
handler.

Signed-off-by: Keir Fraser <keir@xen.org>


changeset: 23284:476b0d68e7d5
user: Keir Fraser <keir@xen.org>
date: Sat Apr 30 09:48:16 2011 +0100

x86: Remove TRAP_INSTR from the public headers.

Direct hypercall traps (rather than using the hypercall transfer page)
were long obsolete even when TRAP_INSTR was deprecated in the API
headers. No current guest will be, or should be, using TRAP_INSTR.

Signed-off-by: Keir Fraser <keir@xen.org>


(qemu changes not included)

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
Re: [xen-unstable test] 6947: regressions - trouble: broken/fail/pass
On 01/05/2011 20:56, "Ian Jackson" <Ian.Jackson@eu.citrix.com> wrote:

> flight 6947 xen-unstable real [real]
> http://www.chiark.greenend.org.uk/~xensrcts/logs/6947/
>
> Regressions :-(
>
> Tests which did not succeed and are blocking:
>  test-amd64-amd64-pair        8 xen-boot/dst_host  fail REGR. vs. 6944
>  test-amd64-amd64-pair        7 xen-boot/src_host  fail REGR. vs. 6944
>  test-amd64-amd64-pv          5 xen-boot           fail REGR. vs. 6944

Looks like your bug, Jan (changeset 23296):

May 1 17:03:45.335804 (XEN) Xen BUG at spinlock.c:47
May 1 17:03:45.734780 (XEN) ----[ Xen-4.2-unstable x86_64 debug=y Not tainted ]----
May 1 17:03:45.734819 (XEN) CPU: 0
May 1 17:03:45.743763 (XEN) RIP: e008:[<ffff82c480123cc4>] check_lock+0x44/0x50
May 1 17:03:45.743796 (XEN) RFLAGS: 0000000000010046 CONTEXT: hypervisor
May 1 17:03:45.755762 (XEN) rax: 0000000000000000 rbx: ffff8301a7ff9868 rcx: 0000000000000001
May 1 17:03:45.755797 (XEN) rdx: 0000000000000000 rsi: 0000000000000001 rdi: ffff8301a7ff986c
May 1 17:03:45.770774 (XEN) rbp: ffff82c48029fca0 rsp: ffff82c48029fca0 r8: 0000000000000000
May 1 17:03:45.782761 (XEN) r9: 00000000deadbeef r10: ffff82c48021ca20 r11: 0000000000000286
May 1 17:03:45.782796 (XEN) r12: ffff8301a7ff8000 r13: 0000000000000080 r14: 0000000000000000
May 1 17:03:45.787773 (XEN) r15: ffff8301a7ff9868 cr0: 000000008005003b cr4: 00000000000006f0
May 1 17:03:45.802762 (XEN) cr3: 000000021b001000 cr2: ffff88000191cfc0
May 1 17:03:45.802791 (XEN) ds: 0000 es: 0000 fs: 0000 gs: 0000 ss: e010 cs: e008
May 1 17:03:45.814766 (XEN) Xen stack trace from rsp=ffff82c48029fca0:
May 1 17:03:45.814794 (XEN) ffff82c48029fcb8 ffff82c480123d01 0000000000000080 ffff82c48029fcf8
May 1 17:03:45.826766 (XEN) ffff82c48012a73a ffff82c48029fd08 0000000000000080 0000000000000080
May 1 17:03:45.826800 (XEN) 0000000000000090 0000000000000000 ffff8301a7e82000 ffff82c48029fd28
May 1 17:03:45.838781 (XEN) ffff82c48012abfa 0000000000000002 0000000000000010 ffff8301a7e82000
May 1 17:03:45.846770 (XEN) ffff8301a7e82000 ffff82c48029fd58 ffff82c4801615e1 ffff82c4802d9950
May 1 17:03:45.858762 (XEN) 0000000000000000 ffff8301a7e82000 ffff8301a7e821a8 ffff82c48029fd88
May 1 17:03:45.858797 (XEN) ffff82c4801043b9 ffff8301a7e82c18 0000000000000000 0000000000000000
May 1 17:03:45.870772 (XEN) 0000000000000000 ffff82c48029fdc8 ffff82c480160bdb 0000000000000000
May 1 17:03:45.882762 (XEN) 0000000000000286 0000000000000000 0000000000000000 ffff8301a7e82000
May 1 17:03:45.882796 (XEN) 0000000000000000 ffff82c48029fe48 ffff82c480161186 0000000000000000
May 1 17:03:45.894774 (XEN) 00000001801198ad 0000000000000000 ffff8301a7ffaed0 ffff82c48029fe48
May 1 17:03:45.899765 (XEN) ffff82c4801675a1 ffff8301a7f000b4 ffff8300d7afb000 ffff82c48029fe48
May 1 17:03:45.899805 (XEN) 0000000000000000 ffffffff817afea8 0000000000000000 0000000000000000
May 1 17:03:45.911776 (XEN) 0000000000000000 ffff82c48029fef8 ffff82c480174adb ffff82c4802d8c00
May 1 17:03:45.923767 (XEN) ffff82c4802d95a0 000000011fc37ff0 0000000000000000 ffffffff817afee8
May 1 17:03:45.923801 (XEN) ffffffff810565b5 ffffffff817aff18 0000000000000000 0000000000000000
May 1 17:03:45.938774 (XEN) ffff82c4802b8880 ffff82c48029ff18 ffffffffffffffff ffff8301a7e82000
May 1 17:03:45.947764 (XEN) 000000008012395f ffff82c480159df4 ffff8300d7afb000 0000000000000000
May 1 17:03:45.947800 (XEN) ffffffff817aff08 ffffffff818cc510 0000000000000000 00007d3b7fd600c7
May 1 17:03:45.959772 (XEN) ffff82c480213eb8 ffffffff8100942a 0000000000000021 0000000000000000
May 1 17:03:45.974784 (XEN) Xen call trace:
May 1 17:03:45.974811 (XEN) [<ffff82c480123cc4>] check_lock+0x44/0x50
May 1 17:03:45.974830 (XEN) [<ffff82c480123d01>] _spin_lock+0x11/0x5d
May 1 17:03:45.982768 (XEN) [<ffff82c48012a73a>] xmem_pool_alloc+0x138/0x4d4
May 1 17:03:45.982799 (XEN) [<ffff82c48012abfa>] _xmalloc+0x124/0x1ce
May 1 17:03:45.991767 (XEN) [<ffff82c4801615e1>] alloc_pirq_struct+0x36/0x7f
May 1 17:03:45.991804 (XEN) [<ffff82c4801043b9>] pirq_get_info+0x43/0x8f
May 1 17:03:46.003769 (XEN) [<ffff82c480160bdb>] set_domain_irq_pirq+0x71/0xae
May 1 17:03:46.003791 (XEN) [<ffff82c480161186>] map_domain_pirq+0x370/0x3bb
May 1 17:03:46.018770 (XEN) [<ffff82c480174adb>] do_physdev_op+0xa6b/0x1598
May 1 17:03:46.018802 (XEN) [<ffff82c480213eb8>] syscall_enter+0xc8/0x122
May 1 17:03:46.030766 (XEN)
May 1 17:03:46.030783 (XEN)
May 1 17:03:46.030798 (XEN) ****************************************
May 1 17:03:46.030825 (XEN) Panic on CPU 0:
May 1 17:03:46.038760 (XEN) Xen BUG at spinlock.c:47
May 1 17:03:46.038783 (XEN) ****************************************
May 1 17:03:46.038808 (XEN)
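
The BUG here is Xen's lock-discipline check: in a debug build every
spinlock must consistently be acquired either always with IRQs enabled or
always with IRQs disabled. A paraphrased sketch of the check in
xen/common/spinlock.c (written from memory; field names may differ):

    static void check_lock(struct lock_debug *debug)
    {
        int irq_safe = !local_irq_is_enabled();

        if ( unlikely(debug->irq_safe != irq_safe) )
        {
            /* Record the first-seen discipline; BUG on a later mismatch. */
            int seen = cmpxchg(&debug->irq_safe, -1, irq_safe);

            BUG_ON(seen == !irq_safe);
        }
    }

Per the call trace, the xmalloc pool lock had previously been taken with
IRQs enabled, and the new alloc_pirq_struct() path now takes it from
map_domain_pirq() with IRQs disabled, tripping the check.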



Re: [xen-unstable test] 6947: regressions - trouble: broken/fail/pass
>>> On 01.05.11 at 22:48, Keir Fraser <keir.xen@gmail.com> wrote:
>
> Looks like your bug, Jan (changeset 23296):

I'm afraid you'll have to revert 23295 and 23296 for the time being,
as there's no obvious immediate solution: {set,clear}_domain_irq_pirq()
must be called with the IRQ descriptor lock held (which implies disabling
IRQs), and they must be able to call xmalloc() (both through
radix_tree_insert() and pirq_get_info() -> alloc_pirq_struct()).

I have to admit that I find it bogus to not be allowed to call xmalloc()
with interrupts disabled. There's no equivalent restriction on kmalloc()
in Linux.

If we really need to stay with this limitation, I'd have to replace the
call to xmalloc() in alloc_pirq_struct() with one to xmem_pool_alloc(),
disabling interrupts up front. Similarly I'd have to call the radix tree
insertion/deletion functions with custom allocation routines. Both
parts would feel like hacks to me though.

An alternative (implementation-wise, i.e. not much less of a hack
imo) might be to introduce something like xmalloc_irq() which always
disables IRQs and does its allocations from a separate pool (thus
using a distinct spin lock).
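
(As a rough illustration of that last idea -- names are hypothetical,
the pool-refill path is omitted, and xmem_pool_alloc() would have to be
guaranteed never to fall through to the page allocator:)

    static struct xmem_pool *xmalloc_irq_pool;
    static DEFINE_SPINLOCK(xmalloc_irq_lock);

    void *xmalloc_irq(unsigned long size)
    {
        unsigned long flags;
        void *p;

        /* IRQ-safe lock, so this is callable with interrupts disabled. */
        spin_lock_irqsave(&xmalloc_irq_lock, flags);
        p = xmem_pool_alloc(size, xmalloc_irq_pool);
        spin_unlock_irqrestore(&xmalloc_irq_lock, flags);

        return p;
    }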

Jan

Re: [xen-unstable test] 6947: regressions - trouble: broken/fail/pass
On 02/05/2011 10:01, "Jan Beulich" <JBeulich@novell.com> wrote:

>>>> On 01.05.11 at 22:48, Keir Fraser <keir.xen@gmail.com> wrote:
>>
>> Looks like your bug, Jan (changeset 23296):
>
> I'm afraid you'll have to revert 23295 and 23296 for the time being,
> as there's no obvious immediate solution: {set,clear}_domain_irq_pirq()
> must be called with the IRQ descriptor lock held (which implies disabling
> IRQs), and they must be able to call xmalloc() (both through
> radix_tree_insert() and pirq_get_info() -> alloc_pirq_struct()).

Okay, reverted.

> I have to admit that I find it bogus to not be allowed to call xmalloc()
> with interrupts disabled. There's no equivalent restriction on kmalloc()
> in Linux.

Well, the reason for the restriction on IRQ-disabled status at spinlock
acquisition (each lock taken either with IRQs disabled *only*, or with
IRQs disabled *never*) is the TSC-synchronising rendezvous in
x86/time.c:time_calibration().
A few options:

(1) Revert that rendezvous to using softirq or similar. The reason it was
turned into hardirq rendezvous is that Dan Magenheimer measured that it
reduced TSC skew by an order of magnitude or more. Perhaps it matters less
on modern CPUs, or perhaps we could come up with some other smart workaround
that would once again let us acquire IRQ-unsafe spinlocks with IRQs
disabled. See (2) for why alloc_heap_pages() may still be unsafe with
IRQs disabled, however.

(2) Change the xmalloc lock to spin_lock_irqsave(). This would also have to
be transitively applied to at least the heap_lock in page_alloc.c. One issue
with this (and indeed with calling alloc_heap_pages at all with IRQs
disabled) is that alloc_heap_pages does actually assume IRQs are enabled
(for example, it calls flush_tlb_mask()) -- actually I think this limitation
probably predates the tsc rendezvous changes, and could be a source of
latent bugs in earlier Xen releases.

(3) Restructure the interrupt code to do less work in IRQ context. For
example tasklet-per-irq, and schedule on the local cpu. Protect a bunch of
the PIRQ structures with a non-IRQ lock. Would increase interrupt latency if
the local CPU is interrupted in hypervisor context. I'm not sure about this
one -- I'm not that happy about the amount of work now done in hardirq
context, but I'm not sure on the performance impact of deferring the work.
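
(For illustration only, option (2) amounts to switching the allocator's
lock to the IRQ-safe primitives; 'carve_from_pool' is a hypothetical
stand-in for the pool's internal carving logic, not what was committed:)

    void *pool_alloc_irqsafe(struct xmem_pool *pool, unsigned long size)
    {
        unsigned long flags;
        void *p;

        spin_lock_irqsave(&pool->lock, flags);   /* was: spin_lock() */
        p = carve_from_pool(pool, size);         /* hypothetical helper */
        spin_unlock_irqrestore(&pool->lock, flags);

        return p;
    }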

-- Keir

Re: [xen-unstable test] 6947: regressions - trouble: broken/fail/pass
>>> On 02.05.11 at 13:22, Keir Fraser <keir.xen@gmail.com> wrote:
> On 02/05/2011 10:01, "Jan Beulich" <JBeulich@novell.com> wrote:
>> I have to admit that I find it bogus to not be allowed to call xmalloc()
>> with interrupts disabled. There's no equivalent restriction on kmalloc()
>> in Linux.
>
> Well, the reason for the restriction on IRQ-disabled status on spinlock
> acquisition (IRQs disabled *only*, or IRQs disabled *never*) is because of
> the TSC synchronising rendezvous in x86/time.c:time_calibration().
>
> A few options:
>
> (1) Revert that rendezvous to using softirq or similar. The reason it was
> turned into hardirq rendezvous is that Dan Magenheimer measured that it
> reduced TSC skew by an order of magnitude or more. Perhaps it matters less
> on modern CPUs, or perhaps we could come up with some other smart workaround
> that would once again let us acquire IRQ-unsafe spinlocks with IRQs
> disabled. See (2) for why alloc_heap_pages() may still be IRQs-disabled
> unsafe however.

I admit I would want to avoid touching this (fragile) code.

> (2) Change the xmalloc lock to spin_lock_irqsave(). This would also have to
> be transitively applied to at least the heap_lock in page_alloc.c. One issue
> with this (and indeed with calling alloc_heap_pages at all with IRQs
> disabled) is that alloc_heap_pages does actually assume IRQs are enabled
> (for example, it calls flush_tlb_mask()) -- actually I think this limitation
> probably predates the tsc rendezvous changes, and could be a source of
> latent bugs in earlier Xen releases.

(2b) Make only the xmalloc() lock disable IRQs, and don't allow it to
go into the page allocator when IRQs were disabled on entry. Have
a reserve page available on each pCPU (requires that in a single
hypercall there can't be allocations adding up to more than PAGE_SIZE),
and when consumed, re-fill this page e.g. from a softirq or tasklet.
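
(A sketch of (2b), with all names hypothetical -- the pool's ->get_mem()
hook hands out the per-CPU reserve page when entered with IRQs disabled
and leaves replenishing it to a softirq:)

    static DEFINE_PER_CPU(void *, xmalloc_reserve_page);

    static void *xmalloc_get_mem(unsigned long size)
    {
        if ( !local_irq_is_enabled() )
        {
            /* No page allocator here: hand out the reserve, if any. */
            void *p = xchg(&this_cpu(xmalloc_reserve_page), NULL);

            if ( p != NULL )
                raise_softirq(XMALLOC_REFILL_SOFTIRQ); /* hypothetical */
            return p;
        }

        return alloc_xenheap_pages(get_order_from_bytes(size), 0);
    }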

> (3) Restructure the interrupt code to do less work in IRQ context. For
> example tasklet-per-irq, and schedule on the local cpu. Protect a bunch of
> the PIRQ structures with a non-IRQ lock. Would increase interrupt latency if
> the local CPU is interrupted in hypervisor context. I'm not sure about this
> one -- I'm not that happy about the amount of work now done in hardirq
> context, but I'm not sure on the performance impact of deferring the work.

I'm not inclined to make changes in this area for the purpose at hand
either (again, Linux gets away without this - would have to check how
e.g. KVM gets the TLB flushing done, or whether they don't defer
flushes like we do).

Jan


Re: [xen-unstable test] 6947: regressions - trouble: broken/fail/pass
On 02/05/2011 13:00, "Jan Beulich" <JBeulich@novell.com> wrote:

> (2b) Make only the xmalloc() lock disable IRQs, and don't allow it to
> go into the page allocator when IRQs were disabled on entry. Have
> a reserve page available on each pCPU (requires that in a single
> hypercall there can't be allocations adding up to more than PAGE_SIZE),
> and when consumed, re-fill this page e.g. from a softirq or tasklet.

You'd have to release/acquire the xmalloc lock across the ->get_mem call.

-- Keir



Re: [xen-unstable test] 6947: regressions - trouble: broken/fail/pass
On 02/05/2011 13:00, "Jan Beulich" <JBeulich@novell.com> wrote:

> I'm not inclined to make changes in this area for the purpose at hand
> either (again, Linux gets away without this - would have to check how
> e.g. KVM gets the TLB flushing done, or whether they don't defer
> flushes like we do).

Oh, another way would be to make lookup_slot invocations from IRQ context be
RCU-safe. Then the radix tree updates would not have to synchronise on the
irq_desc lock? And I believe Linux has examples of RCU-safe usage of radix
trees -- certainly Linux's radix-tree.h mentions RCU.
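
(Sketched after the Linux pattern -- tree, field and handler names are
illustrative, and the entry may only be dereferenced inside the
read-side critical section:)

    static void pirq_interrupt_work(struct domain *d, int pirq)
    {
        struct pirq *info;

        rcu_read_lock();
        info = radix_tree_lookup(&d->pirq_tree, pirq);
        if ( info != NULL )
            handle_pirq(d, info);   /* hypothetical per-pirq handling */
        rcu_read_unlock();          /* beyond here 'info' may be freed */
    }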

I must say this would be far more attractive to me than hacking the xmalloc
subsystem. That's pretty nasty.

-- Keir



Re: [xen-unstable test] 6947: regressions - trouble: broken/fail/pass
>>> On 02.05.11 at 14:13, Keir Fraser <keir.xen@gmail.com> wrote:
> On 02/05/2011 13:00, "Jan Beulich" <JBeulich@novell.com> wrote:
>>
>> (2b) Make only the xmalloc() lock disable IRQs, and don't allow it to
>> go into the page allocator when IRQs were disabled on entry. Have
>> a reserve page available on each pCPU (requires that in a single
>> hypercall there can't be allocations adding up to more than PAGE_SIZE),
>> and when consumed, re-fill this page e.g. from a softirq or tasklet.
>
> You'd have to release/acquire the xmalloc lock across the ->get_mem call.

Not sure what you're trying to make me aware of - initial acquire
would be spin_lock_irqsave(), prior to ->get_mem() it would
spin_unlock_irqrestore(), and the ->get_mem() handler would be
responsible for not calling into the page allocator when interrupts
are (still) disabled (and instead use the per-CPU reserve page if
populated, triggering its re-population).

Jan


Re: [xen-unstable test] 6947: regressions - trouble: broken/fail/pass
>>> On 02.05.11 at 14:19, Keir Fraser <keir.xen@gmail.com> wrote:
> Oh, another way would be to make lookup_slot invocations from IRQ context be
> RCU-safe. Then the radix tree updates would not have to synchronise on the
> irq_desc lock? And I believe Linux has examples of RCU-safe usage of radix

I'm not sure - the patch doesn't introduce the locking (i.e. the
translation arrays used without the patch also get updated under
lock). I'm also not certain about slot recycling aspects (i.e. what
would the result be if freeing slots got deferred via RCU, but the
same slot is then needed to be used again before the grace period
expires). Quite possibly this consideration is moot, just resulting
from my only half-baked understanding of RCU...

Jan

> trees -- certainly Linux's radix-tree.h mentions RCU.
>
> I must say this would be far more attractive to me than hacking the xmalloc
> subsystem. That's pretty nasty.
>
> -- Keir




Re: [xen-unstable test] 6947: regressions - trouble: broken/fail/pass
On 02/05/2011 13:29, "Jan Beulich" <JBeulich@novell.com> wrote:

> I'm not sure - the patch doesn't introduce the locking (i.e. the
> translation arrays used without the patch also get updated under
> lock). I'm also not certain about slot recycling aspects (i.e. what
> would the result be if freeing slots got deferred via RCU, but the
> same slot is then needed to be used again before the grace period
> expires). Quite possibly this consideration is moot, just resulting
> from my only half-baked understanding of RCU...

The most straightforward way to convert to RCU with the most similar
synchronising semantics would be to add a 'live' boolean flag to each
pirq-related struct that is stored in a radix tree. Then:
* insertions into radix tree would be moved before acquisition of the
irq_desc lock, then set 'live' under the lock
* deletions would clear 'live' under the lock; the actual radix-tree
deletion would happen after irq_desc lock release;
* lookups would happen as usual under the irq_desc lock, but with an extra
test of the 'live' flag.

The main complexity of this approach would probably be in breaking up the
insertions/deletions across the irq_desc-lock critical section. Basically
the 'live' flag update would happen wherever the insertion/deletion happens
right now, but the physical insertion/deletion would be moved respectively
earlier/later.

We'd probably also need an extra lock to protect against concurrent
radix-tree update operations (should be pretty straightforward to add
however, needing to protect *only* the radix-tree update calls).

This is a pretty nice way to go imo.
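
(A sketch of the scheme, with illustrative names throughout --
d->event_lock serialises all tree updates, while the irq_desc lock plus
the flag orders insert/delete against IRQ-context lookups:)

    static void pirq_publish(struct domain *d, int pirq,
                             struct pirq *info, struct irq_desc *desc)
    {
        unsigned long flags;

        ASSERT(spin_is_locked(&d->event_lock));
        radix_tree_insert(&d->pirq_tree, pirq, info);  /* physical insert */
        spin_lock_irqsave(&desc->lock, flags);
        info->live = 1;                                /* logical insert */
        spin_unlock_irqrestore(&desc->lock, flags);
    }

    static void pirq_unpublish(struct domain *d, int pirq,
                               struct pirq *info, struct irq_desc *desc)
    {
        unsigned long flags;

        ASSERT(spin_is_locked(&d->event_lock));
        spin_lock_irqsave(&desc->lock, flags);
        info->live = 0;                                /* logical delete */
        spin_unlock_irqrestore(&desc->lock, flags);
        radix_tree_delete(&d->pirq_tree, pirq);        /* physical delete */
        /* freeing 'info' must still wait for an RCU grace period */
    }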

-- Keir

Re: [xen-unstable test] 6947: regressions - trouble: broken/fail/pass
On 02/05/2011 14:14, "Keir Fraser" <keir.xen@gmail.com> wrote:

> We'd probably also need an extra lock to protect against concurrent
> radix-tree update operations (should be pretty straightforward to add
> however, needing to protect *only* the radix-tree update calls).

Actually this second lock would need to encompass both the logical and
physical insert/delete operations. Possibly there is already a non-IRQ lock
on these code paths that can contain irq_desc-locked regions. If not then
such a lock would need to be added.

-- Keir



Re: [xen-unstable test] 6947: regressions - trouble: broken/fail/pass
>>> On 02.05.11 at 15:14, Keir Fraser <keir.xen@gmail.com> wrote:
> The most straightforward way to convert to RCU with the most similar
> synchronising semantics would be to add a 'live' boolean flag to each
> pirq-related struct that is stored in a radix tree. Then:
> * insertions into radix tree would be moved before acquisition of the
> irq_desc lock, then set 'live' under the lock
> * deletions would clear 'live' under the lock; the actual radix-tree
> deletion would happen after irq_desc lock release;
> * lookups would happen as usual under the irq_desc lock, but with an extra
> test of the 'live' flag.

This still leaves unclear to me how an insert hitting a not-yet-deleted
(but no longer live) entry should behave. Simply setting 'live' again
won't help, as that wouldn't cover a delete->insert->delete all
happening before the first delete's grace period expires. Nor would
this work with populating the new data (prior to setting live) when
the old data might still be used.

But wait - what you describe doesn't need RCU anymore, at least
as long as the code paths from radix tree insert to setting 'live'
(and similarly from clearing 'live' to doing the radix tree delete) are
fully covered by some other lock (d->event_lock, see below). Am
I overlooking something?

> The main complexity of this approach would probably be in breaking up the
> insertions/deletions across the irq_desc-lock critical section. Basically
> the 'live' flag update would happen wherever the insertion/deletion happens
> right now, but the physical insertion/deletion would be moved respectively
> earlier/later.
>
> We'd probably also need an extra lock to protect against concurrent
> radix-tree update operations (should be pretty straightforward to add
> however, needing to protect *only* the radix-tree update calls).

That would seem to naturally be d->event_lock (I think [almost?]
every code path changed is already protected by it).

Jan


Re: [xen-unstable test] 6947: regressions - trouble: broken/fail/pass
On 02/05/2011 15:04, "Jan Beulich" <JBeulich@novell.com> wrote:

>> The most straightforward way to convert to RCU with the most similar
>> synchronising semantics would be to add a 'live' boolean flag to each
>> pirq-related struct that is stored in a radix tree. Then:
>> * insertions into radix tree would be moved before acquisition of the
>> irq_desc lock, then set 'live' under the lock
>> * deletions would clear 'live' under the lock; the actual radix-tree
>> deletion would happen after irq_desc lock release;
>> * lookups would happen as usual under the irq_desc lock, but with an extra
>> test of the 'live' flag.
>
> This still leaves unclear to me how an insert hitting a not-yet-deleted
> (but no longer live) entry should behave. Simply setting 'live' again
> won't help, as that wouldn't cover a delete->insert->delete all
> happening before the first delete's grace period expires. Nor would
> this work with populating the new data (prior to setting live) when
> the old data might still be used.

Yes, this is why in my follow-up email I explained that a secondary lock is
needed that covers both logical and physical insertion/deletion. Then the
case you describe above cannot happen. As you say, event_lock covers us.

> But wait - what you describe doesn't need RCU anymore, at least
> as long as the code paths from radix tree insert to setting 'live'
> (and similarly from clearing 'live' to doing the radix tree delete) are
> fully covered by some other lock (d->event_lock, see below). Am
> I overlooking something?

No, I think you just misunderstand RCU. What I (and now also you, a bit
independently ;-) have described is how to synchronise writers against other
writers -- e.g., someone inserting concurrently with deleting, as you
describe above. What RCU is all about is synchronising *readers* against
writers, without needing a lock. And we still need it because the radix-tree
updates will happen under d->event_lock, which the readers in IRQ context
will not be holding. The main thing that RCU generally needs is that, when a
node is deleted from a structure (a radix-tree in this case), it cannot be
freed until an RCU grace period has elapsed, because concurrent lock-free
readers may still hold a pointer to it. There are other details to consider
too, but actually the whole RCU issue appears to be handled by Linux's
radix-tree implementation -- we just need to pull an up-to-date version
across into Xen.
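
As a rough illustration of that reader/writer split, using the Linux-style
RCU and radix-tree APIs (pirq_tree, process_pirq() and free_pirq_info() are
made-up names, and struct pirq_info is assumed to carry a struct rcu_head
member):

    /* Reader, e.g. in IRQ context: no lock, but the lookup and every use
     * of the returned pointer must stay inside the read-side section. */
    void pirq_handle(struct domain *d, unsigned int pirq)
    {
        struct pirq_info *info;

        rcu_read_lock();
        info = radix_tree_lookup(&d->pirq_tree, pirq);
        if ( info )
            process_pirq(d, info);           /* hypothetical consumer */
        rcu_read_unlock();
    }

    /* Writers serialise against each other on d->event_lock; the node is
     * freed only after a grace period, when no reader can still see it. */
    static void free_pirq_info(struct rcu_head *head)
    {
        xfree(container_of(head, struct pirq_info, rcu_head));
    }

    void pirq_destroy(struct domain *d, unsigned int pirq)
    {
        struct pirq_info *info;

        spin_lock(&d->event_lock);
        info = radix_tree_delete(&d->pirq_tree, pirq);
        spin_unlock(&d->event_lock);

        if ( info )
            call_rcu(&info->rcu_head, free_pirq_info);
    }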

-- Keir



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
RE: [xen-unstable test] 6947: regressions - trouble: broken/fail/pass
> No, I think you just misunderstand RCU. What I (and now also you, a bit
> independently ;-) have described is how to synchronise writers against
> other writers -- e.g., someone inserting concurrently with deleting, as
> you describe above. What RCU is all about is synchronising *readers*
> against writers, without needing a lock. And we still need it because
> the radix-tree updates will happen under d->event_lock, which the
> readers in IRQ context will not be holding. The main thing that RCU
> generally needs is that, when a node is deleted from a structure (a
> radix-tree in this case), it cannot be freed until an RCU grace period
> has elapsed, because concurrent lock-free readers may still hold a
> pointer to it. There are other details to consider too, but actually
> the whole RCU issue appears to be handled by Linux's radix-tree
> implementation -- we just need to pull an up-to-date version across
> into Xen.

I won't claim to understand RCU very well either, but I
actually explicitly chose a pre-RCU version of the Linux
radix tree code because tmem (which was the only user of
the radix tree code at the time IIRC) is write-often
AND read-often, and my understanding of RCU is that it
works best for read-often, write-infrequently trees.

Dan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
Re: [xen-unstable test] 6947: regressions - trouble: broken/fail/pass
On 02/05/2011 17:36, "Dan Magenheimer" <dan.magenheimer@oracle.com> wrote:

> I won't claim to understand RCU very well either, but I
> actually explicitly chose a pre-RCU version of the Linux
> radix tree code because tmem (which was the only user of
> the radix tree code at the time IIRC) is write-often
> AND read-often, and my understanding of RCU is that it
> works best for read-often, write-infrequently trees.

That is where it gives the best performance boost (because it obviates the
need for read-side locking, with its associated overheads). But there can be
other reasons for using a lock-free synchronisation strategy -- our current
motivation, sync'ing with interrupt handlers (or, similarly, signal handlers
in Unix processes), is another common one.

In terms of the cost of switching to an RCU radix-tree implementation for
those users that don't need it (i.e., tmem, because you have full lock-based
synchronisation): it looks like node deletions unconditionally wait for an
RCU grace period before freeing the old node. If tmem is doing a reasonable
rate of deletions it might make sense for us to make that optional, selected
when the tree is first initialised. It would be easy enough to add an
'rcu_safe' flag for that purpose. There's also an rcu_head struct added to
every tree node. Not much we can do about that, but it's only 16 bytes --
hopefully not too bad an issue for tmem.
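
Such an 'rcu_safe' flag could be wired up along these lines -- hypothetical
code, not the existing radix-tree interface:

    /* Per-tree choice of freeing policy, fixed when the tree is set up. */
    struct radix_tree_root {
        /* ... existing fields ... */
        bool rcu_safe;
    };

    static void radix_tree_node_free(struct radix_tree_root *root,
                                     struct radix_tree_node *node)
    {
        if ( root->rcu_safe )
            /* deferred: lock-free readers may still traverse this node */
            call_rcu(&node->rcu_head, radix_tree_node_rcu_free);
        else
            /* immediate: the caller (e.g. tmem) fully locks all access */
            radix_tree_node_rcu_free(&node->rcu_head);
    }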

-- Keir



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
Re: [xen-unstable test] 6947: regressions - trouble: broken/fail/pass
>>> On 02.05.11 at 14:19, Keir Fraser <keir.xen@gmail.com> wrote:
> On 02/05/2011 13:00, "Jan Beulich" <JBeulich@novell.com> wrote:
>
>>> (3) Restructure the interrupt code to do less work in IRQ context. For
>>> example tasklet-per-irq, and schedule on the local cpu. Protect a bunch of
>>> the PIRQ structures with a non-IRQ lock. Would increase interrupt latency if
>>> the local CPU is interrupted in hypervisor context. I'm not sure about this
>>> one -- I'm not that happy about the amount of work now done in hardirq
>>> context, but I'm not sure on the performance impact of deferring the work.
>>
>> I'm not inclined to make changes in this area for the purpose at hand
>> either (again, Linux gets away without this - would have to check how
>> e.g. KVM gets the TLB flushing done, or whether they don't defer
>> flushes like we do).
>
> Oh, another way would be to make lookup_slot invocations from IRQ context be
> RCU-safe. Then the radix tree updates would not have to synchronise on the
> irq_desc lock? And I believe Linux has examples of RCU-safe usage of radix
> trees -- certainly Linux's radix-tree.h mentions RCU.
>
> I must say this would be far more attractive to me than hacking the xmalloc
> subsystem. That's pretty nasty.

I think that I can actually get away with two-stage insertion/removal
without needing RCU, based on the fact that, prior to these changes,
the translation arrays could also hold zero values meaning "does not
have a valid translation". Hence I can do tree insertion (removal)
with just d->event_lock held, but with the data not yet (no longer)
populated, and with valid <-> invalid transitions happening only with
the IRQ's descriptor lock held (and interrupts disabled). All this
requires is that readers properly deal with the non-populated state,
which they already had to do in the first version of the patch anyway.
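
A sketch of that two-stage scheme, with illustrative names (pirq_tree, and
an 'irq' field whose zero value means "no valid translation"):

    /* Stage 1: tree membership changes only under d->event_lock; the
     * entry starts out unpopulated. */
    int pirq_prepare(struct domain *d, unsigned int pirq,
                     struct pirq_info *info)
    {
        int rc;

        info->irq = 0;                       /* no valid translation yet */
        spin_lock(&d->event_lock);
        rc = radix_tree_insert(&d->pirq_tree, pirq, info);
        spin_unlock(&d->event_lock);
        return rc;
    }

    /* Stage 2: valid <-> invalid transitions only under the IRQ's
     * descriptor lock, with interrupts disabled. */
    void pirq_populate(struct irq_desc *desc, struct pirq_info *info,
                       int irq)
    {
        unsigned long flags;

        spin_lock_irqsave(&desc->lock, flags);
        info->irq = irq;                     /* non-zero => valid */
        spin_unlock_irqrestore(&desc->lock, flags);
    }

    /* Readers must treat info->irq == 0 exactly like a missing entry. */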

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
Re: [xen-unstable test] 6947: regressions - trouble: broken/fail/pass
On 03/05/2011 10:35, "Jan Beulich" <JBeulich@novell.com> wrote:

>> Oh, another way would be to make lookup_slot invocations from IRQ context be
>> RCU-safe. Then the radix tree updates would not have to synchronise on the
>> irq_desc lock? And I believe Linux has examples of RCU-safe usage of radix
>> trees -- certainly Linux's radix-tree.h mentions RCU.
>>
>> I must say this would be far more attractive to me than hacking the xmalloc
>> subsystem. That's pretty nasty.
>
> I think that I can actually get away with two-stage insertion/removal
> without needing RCU, based on the fact that, prior to these changes,
> the translation arrays could also hold zero values meaning "does not
> have a valid translation". Hence I can do tree insertion (removal)
> with just d->event_lock held, but with the data not yet (no longer)
> populated, and with valid <-> invalid transitions happening only with
> the IRQ's descriptor lock held (and interrupts disabled). All this
> requires is that readers properly deal with the non-populated state,
> which they already had to do in the first version of the patch anyway.

But the readers in irq context will call lookup_slot() without d->event_lock
held? In that case you do need an RCU-aware version of radix-tree.[ch],
because lookups can be occurring concurrently with insertions/deletions.
Good news is that the RCU-aware radix tree implementation hides the RCU
details from you entirely.

Well, in any case, I'm happy to iterate on this patch if necessary.

-- Keir



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
Re: [xen-unstable test] 6947: regressions - trouble: broken/fail/pass
>>> On 03.05.11 at 12:09, Keir Fraser <keir.xen@gmail.com> wrote:
> On 03/05/2011 10:35, "Jan Beulich" <JBeulich@novell.com> wrote:
>
>>> Oh, another way would be to make lookup_slot invocations from IRQ context be
>>> RCU-safe. Then the radix tree updates would not have to synchronise on the
>>> irq_desc lock? And I believe Linux has examples of RCU-safe usage of radix
>>> trees -- certainly Linux's radix-tree.h mentions RCU.
>>>
>>> I must say this would be far more attractive to me than hacking the xmalloc
>>> subsystem. That's pretty nasty.
>>
>> I think that I can actually get away with two-stage insertion/removal
>> without needing RCU, based on the fact that, prior to these changes,
>> the translation arrays could also hold zero values meaning "does not
>> have a valid translation". Hence I can do tree insertion (removal)
>> with just d->event_lock held, but with the data not yet (no longer)
>> populated, and with valid <-> invalid transitions happening only with
>> the IRQ's descriptor lock held (and interrupts disabled). All this
>> requires is that readers properly deal with the non-populated state,
>> which they already had to do in the first version of the patch anyway.
>
> But the readers in irq context will call lookup_slot() without d->event_lock
> held? In that case you do need an RCU-aware version of radix-tree.[ch],
> because lookups can be occurring concurrently with insertions/deletions.

No, in IRQ context we only need the irq -> pirq translation afaics, and
that translation doesn't use an allocated object (it instead simply inserts
the [non-zero] pirq as the data item).
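
That is, something like the following, where the tree name is illustrative
and a real implementation may need to tag the stored integer so it cannot
collide with the tree's internal pointer encodings:

    /* Store the non-zero pirq itself as the data item: no allocation,
     * so a lock-free reader can never dereference freed memory. */
    static int irq_map_pirq(struct domain *d, int irq, int pirq)
    {
        return radix_tree_insert(&d->arch.irq_pirq, irq,
                                 (void *)(unsigned long)pirq);
    }

    static int irq_to_pirq(struct domain *d, int irq)
    {
        /* An empty slot yields NULL, i.e. 0 == "no translation". */
        return (int)(unsigned long)radix_tree_lookup(&d->arch.irq_pirq,
                                                     irq);
    }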

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
Re: [xen-unstable test] 6947: regressions - trouble: broken/fail/pass
On 03/05/2011 14:36, "Jan Beulich" <JBeulich@novell.com> wrote:

>> But the readers in irq context will call lookup_slot() without d->event_lock
>> held? In that case you do need an RCU-aware version of radix-tree.[ch],
>> because lookups can be occurring concurrently with insertions/deletions.
>
> No, in IRQ context we only need the irq -> pirq translation afaics, and
> that translation doesn't use an allocated object (it instead simply inserts
> the [non-zero] pirq as the data item).

Ah well that makes things easier. :-) If a single lock protects all
operations (including lookups) on a particular radix tree then of course we
don't need RCU.

-- Keir



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel