Mailing List Archive

Driver domain - NEW issue: IRQ handling error
Okay, I'm using a bk snapshot of testing as of 20:40 (-4:00) yesterday
so I'm pretty sure this hasn't been addressed already.

Symptom:

While running a "ping -f <some local host outside the box>" from dom0
where the physical nic is in a driver dom (bridged), after about 1
minute the connection dies and won't restart. (even with a reboot of
the driver domain).

ex. Dom0 vif1.0=10.1.1.1/24
outside host=10.1.1.2/24

e1000 driver dom = bridge containing physical e1000(eth0) and virtual
nic (eth1)

dmesg on dom0 gives:
irq 18: nobody cared!
[<c012a4b7>]
[<c012a547>]
[<c0129f6c>]
[<c010cd1b>]
[<c0105c13>]
[<c0108aa3>]
[<c0106c05>]
[<c0106c39>]
[<c02e2621>]
handlers:
[<c020cdb6>]
[<cc94b867>]
Disabling IRQ #18


and a dmesg of the driver domain shows that the nic hooked IRQ 18:

Intel(R) PRO/1000 Network Driver - version 5.5.4-k2-NAPI
Copyright (c) 1999-2004 Intel Corporation.
PCI: Obtained IRQ 18 for device 0000:01:01.0
PCI: Setting latency timer of device 0000:01:01.0 to 64
e1000: eth0: e1000_probe: Intel(R) PRO/1000 Network Connection

Am I correct in that the interrupt that should have been sent to the
driver domain was instead sent to dom0? or what happened? If I don't
have the driver dom setup correctly, would someone please explain what
I'm doing wrong?

Thanks,
B.


-------------------------------------------------------
This SF.Net email is sponsored by: IntelliVIEW -- Interactive Reporting
Tool for open source databases. Create drag-&-drop reports. Save time
by over 75%! Publish reports on the web. Export to DOC, XLS, RTF, etc.
Download a FREE copy at http://www.intelliview.com/go/osdn_nl
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/xen-devel
Re: Driver domain - NEW issue: IRQ handling error
OK, I haven't heard of this issue. Could you post your grub.conf for dom0 and
your domain config file for the backend?

I'm not entirely clear on your configuration - how does your networking setup
work? What *does* work?

Cheers,
Mark



Re: Driver domain - NEW issue: IRQ handling error
Sure, here they are:
GRUB:
default 0
timeout 5
fallback 1

splashimage=(hd0,0)/grub/splash.xpm.gz

title=Production (xen)
root (hd0,0)
kernel /xen.gz dom0_mem=65535 console=vga physdev_dom0_hide=(01:01.0)(02:00.0) com1=115200,8n1
module /vmlinuz-2.6.10-xen0 root=/dev/md1 ro console=tty0 video=intelfb:1280x1024-16@74,accel=1
#module /vmlinuz-2.6.10-xen0 root=/dev/md1 ro console=tty0

title=Recovery (2.6.10)
root (hd0,0)
kernel /vmlinuz root=/dev/md1 video=intelfb:1280x1024-16@74,accel=1 devfs=nomount 3

title=Single user mode (2.6.10)
root (hd0,0)
kernel /vmlinuz-2.6.10-gentoo-r6 root=/dev/md1 devfs=nomount 1 video=intelfb:1024x768-16@85,accel=1

e1000 config (xm create -nf e1000):
Using config file "/etc/xen.old/e1000".
(vm
(name e1000)
(memory 128)
(restart onreboot)
(image
(linux
(kernel /boot/vmlinuz-2.6.10-xen-be)
(root '/dev/hda1 ro')
(args 'panic=1')
)
)
(device (vbd (uname phy:raid1/e1000.1) (dev /dev/hda1) (mode w)))
(device (vbd (uname phy:raid1/portage) (dev /dev/hda2) (mode r)))
(device (vbd (uname phy:raid0/swap_e1000) (dev /dev/hda3) (mode w)))
(device (pci (bus 0x1) (dev 0x1) (func 0x0)))
(device (vif (mac aa:00:01:fa:00:02) (bridge e1000)))
)


Ideally, I'd like to get front end domains hooking directly into backend
domains, however I do not seem to be able to get the vifX.X created in
any domain other than dom0. xen-be is a DOM0 build with the nic drivers
included. I'm running Gentoo-dev-sources (2.6.10-r6). Last night I
uninstalled xen, manually checked for any extraneous xen packages/code,
grabbed a fresh clone of testing, and reinstalled xen + recompiled my
kernels.

Sample of what I've tried:
Using config file "fwmgmt".
(vm
(name fwmgmt)
(memory 128)
(restart onreboot)
(image
(linux
(kernel /boot/vmlinuz-2.6.10-xen-fe)
(root '/dev/hda1 ro')
(args 'panic=1')
)
)
(device (vbd (uname phy:raid1/fwmgmt) (dev /dev/hda1) (mode w)))
(device (vbd (uname phy:raid1/portage) (dev /dev/hda2) (mode r)))
(device (vbd (uname phy:raid0/swap_fwmgmt) (dev /dev/hda3) (mode w)))
(device (vif (mac aa:00:01:fa:00:04) (bridge e1000) (backend e1000)))
(device (vif (mac aa:00:02:fa:00:04) (bridge 3c59x) (backend 3c59x)))
(device (vif (mac aa:00:03:fa:00:04) (bridge vsw0) (backend vsw0)))
(device (vif (mac aa:00:04:fa:00:04) (bridge mgmt)))
)


I am, however, able to bridge eth0(real) and eth1(virtual) in the e1000
driver domain and get the vifX.X in dom0. If I assign a local (to the
physical nic) ip to that vif, I am able to see the rest of my network.



On Tue, 2005-02-01 at 09:10, Mark Williamson wrote:
> OK, I haven't heard of this issue. Could you post your grub.conf for dom0 and
> your domain config file for the backend?
>
> I'm not entirely clear on your configuration - how does your networking setup
> work? What *does* work?
>
> Cheers,
> Mark


Re: Driver domain - NEW issue: IRQ handling error
Here are the dom config files:

B.


Re: Driver domain - NEW issue: IRQ handling error
Most disconcerting is that if I perform the same ping flood from the
non-xen'd local box back to either dom0 or an ip assigned to the bridge
in the driver domain, eventually (about 100,000 packets) the same result
will occur. The nic dies. I can still ping between Dom0 and the driver
domain, but there is no outside traffic. When I run this against a
stock linux kernel, there is no issue (1,000,000+ packets).

B.



Re: Driver domain - NEW issue: IRQ handling error
I also get this, but it looks like the USB controller rather than the
network card.

Dom0 dmesg:

PCI: Obtained IRQ 23 for device 0000:00:1d.7
ehci_hcd 0000:00:1d.7: EHCI Host Controller
PCI: Setting latency timer of device 0000:00:1d.7 to 64
ehci_hcd 0000:00:1d.7: irq 23, pci mem 0xfe700800
ehci_hcd 0000:00:1d.7: new USB bus registered, assigned bus number 1
PCI: cache line size of 128 is not supported by device 0000:00:1d.7
ehci_hcd 0000:00:1d.7: USB 2.0 initialized, EHCI 1.00, driver 26 Oct 2004
hub 1-0:1.0: USB hub found

...


irq 23: nobody cared!
[<c013590a>] __report_bad_irq+0x2a/0xa0
[<c01350f0>] handle_IRQ_event+0x40/0x90
[<c0135a10>] note_interrupt+0x70/0xb0
[<c0135268>] __do_IRQ+0x128/0x140
[<c010eed9>] do_IRQ+0x19/0x30
[<c0105f0f>] evtchn_do_upcall+0xaf/0x110
[<c0109927>] hypervisor_callback+0x37/0x40
[<c02a996f>] e1000_intr+0x1f/0x90
[<c0105e5a>] force_evtchn_callback+0xa/0x10
[<c01350f0>] handle_IRQ_event+0x40/0x90
[<c0135211>] __do_IRQ+0xd1/0x140
[<c010eed9>] do_IRQ+0x19/0x30
[<c0105f0f>] evtchn_do_upcall+0xaf/0x110
[<c0109927>] hypervisor_callback+0x37/0x40
[<c010738e>] xen_idle+0x8e/0x150
[<c0463209>] preempt_schedule+0x29/0x50
[<c0107479>] cpu_idle+0x29/0x50
[<c05767c8>] start_kernel+0x178/0x1c0
[<c0576350>] unknown_bootoption+0x0/0x1e0
handlers:
[<c0390de0>] (usb_hcd_irq+0x0/0x70)
Disabling IRQ #23



This is a xen-2.0.3, 2.6.10 kernel that I built myself.
Almost everything built-in.





Re: Driver domain - NEW issue: IRQ handling error
Okay, after a little RTFM, I have the frontend domain putting its
vif in the correct backend domain (not dom0) and the backend domain
properly configured to be a backend domain (backend(netif)) in the sxp
config. Don't I feel silly. However, running the ping test still kills
the nic. If I run it from the front end domain, I get the disabling IRQ
18 error message in that dom. If I run it external to the physical box,
the NIC still dies; however, if I run the NICs from dom0, everything is
fine. For testing purposes, I'm using the SAME xen0 kernel for booting
both xen0 and the backend domain.

Regards,
B.




Re: Driver domain - NEW issue: IRQ handling error
On Tue, 01 Feb 2005 09:11:34 -0400, B.G. Bruce <bgb@nt-nv.com> wrote:
> dmesg on dom0 gives:
> irq 18: nobody cared!
> Disabling IRQ #18
>
> and a dmesg of the driver domain shows that the nic hooked IRQ 18:
>
> Intel(R) PRO/1000 Network Driver - version 5.5.4-k2-NAPI
> Copyright (c) 1999-2004 Intel Corporation.
> PCI: Obtained IRQ 18 for device 0000:01:01.0
> PCI: Setting latency timer of device 0000:01:01.0 to 64
> e1000: eth0: e1000_probe: Intel(R) PRO/1000 Network Connection

Do you have any other devices which get assigned IRQ 18? Xen prints
information about the interrupt routing when it starts; you can read
this information with xm dmesg. Also, lspci -v should show, for each
device not hidden from dom0, which interrupt is used by the device.
FWIW, I've seen "irq nobody cared" on the IRQ assigned to the USB
controller with a kernel without USB support.

> Am I correct in that the interrupt that should have been sent to the
> driver domain was instead sent to dom0? or what happened? If I don't
> have the driver dom setup correctly, would someone please explain what
> I'm doing wrong?

Yes, it should have been sent to the driver domain. If there's a 2nd
device on IRQ 18 and this device is not hidden from dom0 and the 2nd
device gets an interrupt, it will go to dom0. This should be harmless
but apparently, it's not.

christian
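
A minimal sketch of the check suggested above: group devices by the IRQ
reported in lspci -v style output and list the IRQs that more than one
device uses. The sample text, the device descriptions, and the helper names
(devices_by_irq, shared_irqs) are hypothetical and not taken from the
poster's machine; real lspci output varies between versions, so treat the
regular expressions as an assumption.

```python
import re
from collections import defaultdict

# Hypothetical excerpt in the style of `lspci -v`; the descriptions are
# illustrative only and do not come from the machine in this thread.
SAMPLE_LSPCI = """\
00:1f.1 IDE interface: Example PATA controller
    Flags: bus master, medium devsel, latency 0, IRQ 18

00:1f.2 IDE interface: Example SATA controller
    Flags: bus master, 66MHz, medium devsel, latency 0, IRQ 18

00:1d.7 USB Controller: Example EHCI controller
    Flags: bus master, medium devsel, latency 0, IRQ 23
"""

# A device header starts in column 0 with bus:dev.func, optionally
# preceded by a four-digit PCI domain.
DEVICE_RE = re.compile(r"^((?:[0-9a-f]{4}:)?[0-9a-f]{2}:[0-9a-f]{2}\.[0-7])\s")
IRQ_RE = re.compile(r"\bIRQ (\d+)\b")


def devices_by_irq(lspci_text):
    """Map each IRQ number to the list of device addresses that report it."""
    by_irq = defaultdict(list)
    current = None
    for line in lspci_text.splitlines():
        header = DEVICE_RE.match(line)
        if header:
            current = header.group(1)
            continue
        if current and line.strip().startswith("Flags:"):
            irq = IRQ_RE.search(line)
            if irq:
                by_irq[int(irq.group(1))].append(current)
    return dict(by_irq)


def shared_irqs(lspci_text):
    """Return only the IRQs used by more than one listed device."""
    return {irq: devs for irq, devs in devices_by_irq(lspci_text).items()
            if len(devs) > 1}


if __name__ == "__main__":
    for irq, devs in sorted(shared_irqs(SAMPLE_LSPCI).items()):
        print(f"IRQ {irq} is shared by: {', '.join(devs)}")
```

A device hidden from dom0 does not appear in dom0's lspci listing, so its
IRQ has to be taken from the driver domain's dmesg instead (here, "PCI:
Obtained IRQ 18 for device 0000:01:01.0") and compared by hand.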


Re: Driver domain - NEW issue: IRQ handling error
Yeah, the SATA and PATA controllers are grabbing 18 as well. This box
doesn't have any SATA drives, so I can exclude 00:1f.2 (won't disable in
the bios), but what do I do about the PATA/EIDE (00:1f.1)? I need it!

B.





-------------------------------------------------------
This SF.Net email is sponsored by: IntelliVIEW -- Interactive Reporting
Tool for open source databases. Create drag-&-drop reports. Save time
by over 75%! Publish reports on the web. Export to DOC, XLS, RTF, etc.
Download a FREE copy at http://www.intelliview.com/go/osdn_nl
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/xen-devel
Re: Driver domain - NEW issue: IRQ handling error
On Tue, Feb 01, 2005 at 10:09:45PM -0400, B.G. Bruce wrote:
> Yeah, the SATA and PATA controllers are grabbing 18 as well. This box
> doesn't have any SATA drives, so I can exclude 00:1f.2 (won't disable in
> the bios), but what do I do about the PATA/EIDE (00:1f.1)? I need it!

Have you tried with SATA excluded? The PATA controller could be ok
because it will have a driver handling the interrupts. Otherwise you
could try moving the card around (to a different slot) and see if it
gets a different IRQ.

christian



Re: Driver domain - NEW issue: IRQ handling error
Yes, I've tried with the SATA excluded - no change.

B.



Re: Driver domain - NEW issue: IRQ handling error
What I've actually found is that if I disable the disabling of the
interrupt in kernel/irq/spurious.c, everything works fine. I still get
the error messages (I didn't comment them out) about every
50,000-100,000 packets, but I don't drop a packet and everything works as
it should. Now obviously I don't want to keep the IRQ disabling
disabled, but I'm at a loss for how to fix this otherwise.

B.


Re: Driver domain - NEW issue: IRQ handling error
On Wed, Feb 02, 2005 at 02:00:18PM -0400, B.G. Bruce wrote:
> What I've actually found is that if I disable the disabling of the
> interrupt in kernel/irq/spurious.c, everything works fine. I still get
> the error messages (I didn't comment them out) about every
> 50,000-100,000 packets, but I don't drop a packet and everything works as
> it should. Now obviously I don't want to keep the IRQ disabling
> disabled, but I'm at a loss for how to fix this otherwise.

I think I understand now what's happening:
- since you have devices on IRQ18 in both dom0 and another domain,
  all IRQ18 interrupts get delivered to both (for loop in
  __do_IRQ_guest in xen/arch/x86/irq.c).
- the ide driver in dom0 will only handle IRQs for the ide controller.
- all e1000 interrupts will be counted as spurious/unhandled in dom0.
- if there are hardly any ide interrupts, you can hit the case where
  of 100000 interrupts, 99900 were unhandled and this will cause the
  interrupt to get disabled.

We seem to hit the case mentioned in () above __report_bad_irq. Not
disabling the interrupt in that case is the correct thing to do, but
the sharing does certainly have a significant performance impact.
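
To make the accounting above concrete, here is a rough sketch in C. It
is a simplified model and not the kernel source: the names irq_model,
note_interrupt_model and simulate_dom0 are invented for this example,
the window of 100000 and the limit of 99900 are the figures quoted
above, and the strict "greater than" comparison is an assumption.

```c
#include <stdbool.h>

/* Hypothetical, simplified model of the per-IRQ spurious accounting. */
struct irq_model {
    unsigned int irq_count;      /* interrupts seen in the current window */
    unsigned int irqs_unhandled; /* of those, how many no handler claimed */
    bool disabled;               /* set once the line has been given up on */
};

enum { WINDOW = 100000, UNHANDLED_LIMIT = 99900 };

/* Record one interrupt; 'handled' says whether a handler in this
 * domain claimed it. */
static void note_interrupt_model(struct irq_model *m, bool handled)
{
    if (m->disabled)
        return;
    if (!handled)
        m->irqs_unhandled++;
    m->irq_count++;
    if (m->irq_count < WINDOW)
        return;
    /* End of a window: give up on the line if nearly everything was
     * unhandled (assumed to be a strict comparison). */
    if (m->irqs_unhandled > UNHANDLED_LIMIT)
        m->disabled = true;
    m->irq_count = 0;
    m->irqs_unhandled = 0;
}

/* Model dom0's view of the shared line. 'total' interrupts arrive;
 * one in every 'ide_every' belongs to the IDE controller (0 = none).
 * Everything else is e1000 traffic that dom0 cannot handle.
 * Returns true if dom0 ends up disabling the IRQ. */
static bool simulate_dom0(unsigned int total, unsigned int ide_every)
{
    struct irq_model dom0 = {0};

    for (unsigned int i = 1; i <= total; i++) {
        bool from_ide = ide_every != 0 && i % ide_every == 0;
        note_interrupt_model(&dom0, from_ide);
    }
    return dom0.disabled;
}
```

In this model, simulate_dom0(100000, 5000) leaves only 20 handled
interrupts in the window, so the line gets disabled, while
simulate_dom0(100000, 500) leaves 200 handled interrupts and the line
stays enabled. That matches the symptom: a flood ping with an almost
idle IDE controller is what trips the check.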

christian



Re: Driver domain - NEW issue: IRQ handling error [ In reply to ]
Christian,

Thank you for your time and patience in helping me to understand what is
going on. It is greatly appreciated.

B.


On Wed, 2005-02-02 at 16:04, Christian Limpach wrote:
> On Wed, Feb 02, 2005 at 02:00:18PM -0400, B.G. Bruce wrote:
> > What I've actually found is that if I disable the disabling of the
> > interrupt in kernel/irq/spurious.c, everything works fine. I still get
> > the error messages (I didn't comment them out) about every
> > 50.000-100.000 packets but I don't drop a packet and everything works as
> > it should. Now obviously I don't want to keep the disabling irq
> > disabled, but I'm at a loss for how to fix this otherwise.
>
> I think I understand now what's happening:
> - since you have devices on IRQ18 in both dom0 and another domain,
>   all IRQ18 interrupts get delivered to both (for loop in
>   __do_IRQ_guest in xen/arch/x86/irq.c).
> - the ide driver in dom0 will only handle IRQs for the ide controller.
> - all e1000 interrupts will be counted as spurious/unhandled.
> - if there's hardly any ide interrupts, you can hit the case where
>   of 100000 interrupts, 99900 were unhandled and this will cause the
>   interrupt to get disabled.
>
> We seem to hit the case mentioned in () above __report_bad_irq. Not
> disabling the interrupt in that case is the correct thing to do, but
> the sharing does certainly have a significant performance impact.
>
> christian
>
>


RE: Driver domain - NEW issue: IRQ handling error [ In reply to ]
> > What I've actually found is that if I disable the disabling of the
> > interrupt in kernel/irq/spurious.c, everything works fine. I still get
> > the error messages (I didn't comment them out) about every
> > 50.000-100.000 packets but I don't drop a packet and everything works as
> > it should. Now obviously I don't want to keep the disabling irq
> > disabled, but I'm at a loss for how to fix this otherwise.
>
> I think I understand now what's happening:
> - since you have devices on IRQ18 in both dom0 and another domain,
>   all IRQ18 interrupts get delivered to both (for loop in
>   __do_IRQ_guest in xen/arch/x86/irq.c).
> - the ide driver in dom0 will only handle IRQs for the ide controller.
> - all e1000 interrupts will be counted as spurious/unhandled.
> - if there's hardly any ide interrupts, you can hit the case where
>   of 100000 interrupts, 99900 were unhandled and this will cause the
>   interrupt to get disabled.
>
> We seem to hit the case mentioned in () above __report_bad_irq. Not
> disabling the interrupt in that case is the correct thing to do, but
> the sharing does certainly have a significant performance impact.

Nicely figured -- I was putting this down to ioapic folklore and just
hoping it was going to go away with new ioapic code...

Shared interrupts across multiple domains are heinous.

Ian




