Mailing List Archive

[PATCH] C6 state with EOI issue fix for some Intel processors
There is an erratum in some Intel processors.

AAJ72. EOI Transaction May Not be Sent if Software Enters Core C6 During
an Interrupt Service Routine

If core C6 is entered after the start of an interrupt service routine but before
a write to the APIC EOI register, the core may not send an EOI transaction (if
needed) and further interrupts from the same priority level or lower may be
blocked.

This patch fixes this issue by checking whether an ISR is pending before
entering a deep Cx state. If so, it uses power->safe_state instead of the deep
Cx state to prevent the above issue from happening.
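
The check described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from the patch: `struct cx_state`, its `depth` field, `isr_in_service()` and `pick_cstate()` are hypothetical names, and the eight-word array stands in for the local APIC in-service registers (256 bits, one per vector) that real code would read from the APIC.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define ISR_WORDS 8  /* local APIC ISR: 256 bits held in eight 32-bit registers */

/* Hypothetical stand-in for a per-CPU C-state entry (not the Xen structure). */
struct cx_state {
    int depth;         /* hardware C-state number, e.g. 1 for C1, 6 for C6 */
    const char *name;
};

/* Return true if any in-service bit is set, i.e. an interrupt has been
 * accepted but the EOI register has not been written for it yet. */
static bool isr_in_service(const uint32_t isr[ISR_WORDS])
{
    for (int i = 0; i < ISR_WORDS; i++)
        if (isr[i] != 0)
            return true;
    return false;
}

/* Choose the state to enter: fall back to the safe state when the target
 * is C6 or deeper and an interrupt is still in service. */
static const struct cx_state *pick_cstate(const struct cx_state *target,
                                          const struct cx_state *safe_state,
                                          const uint32_t isr[ISR_WORDS])
{
    assert(target != NULL && safe_state != NULL);

    if (target->depth >= 6 && isr_in_service(isr))
        return safe_state;
    return target;
}
```

With an all-zero ISR array the requested state is returned unchanged; with any bit set, a request for C6 is redirected to the safe state, which corresponds to using power->safe_state in the description above.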
Re: [PATCH] C6 state with EOI issue fix for some Intel processors
On Wednesday 15 September 2010 15:10:43 Sheng Yang wrote:
> There is an erratum in some Intel processors.
>
> AAJ72. EOI Transaction May Not be Sent if Software Enters Core C6 During
> an Interrupt Service Routine
>
> If core C6 is entered after the start of an interrupt service routine but
> before a write to the APIC EOI register, the core may not send an EOI
> transaction (if needed) and further interrupts from the same priority
> level or lower may be blocked.
>
> This patch fixes this issue by checking whether an ISR is pending before
> entering a deep Cx state. If so, it uses power->safe_state instead of the
> deep Cx state to prevent the above issue from happening.

Signed-off-by: Sheng Yang <sheng@linux.intel.com>

--
regards
Yang, Sheng

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
Re: [PATCH] C6 state with EOI issue fix for some Intel processors
Aieee! :-)

K.

On 15/09/2010 08:10, "Sheng Yang" <sheng@linux.intel.com> wrote:

> There is an erratum in some Intel processors.
>
> AAJ72. EOI Transaction May Not be Sent if Software Enters Core C6 During
> an Interrupt Service Routine
>
> If core C6 is entered after the start of an interrupt service routine but
> before a write to the APIC EOI register, the core may not send an EOI
> transaction (if needed) and further interrupts from the same priority
> level or lower may be blocked.
>
> This patch fixes this issue by checking whether an ISR is pending before
> entering a deep Cx state. If so, it uses power->safe_state instead of the
> deep Cx state to prevent the above issue from happening.



Re: [PATCH] C6 state with EOI issue fix for some Intel processors
On 15/09/2010 08:10, "Sheng Yang" <sheng@linux.intel.com> wrote:

> This patch fixes this issue by checking whether an ISR is pending before
> entering a deep Cx state. If so, it uses power->safe_state instead of the
> deep Cx state to prevent the above issue from happening.

Thanks. I reworked this patch substantially and applied as
xen-unstable:22160 and xen-4.0-testing:21348.

-- Keir



Re: Re: [PATCH] C6 state with EOI issue fix for some Intel processors
On 15.09.2010 10:03, Keir Fraser wrote:
>> This patch fixes this issue by checking whether an ISR is pending before
>> entering a deep Cx state. If so, it uses power->safe_state instead of the
>> deep Cx state to prevent the above issue from happening.
> Thanks. I reworked this patch substantially and applied as
> xen-unstable:22160 and xen-4.0-testing:21348.

I tested the patch on vanilla 4.0.1 and it does help a bit. Uptime was
now over 100 minutes instead of under 3 minutes. But problems still
occurred (aacraid reset, eth reset).

With my patch from
(http://lists.xensource.com/archives/html/xen-devel/2010-09/msg00556.html)
the machine uptime was over 10 days when I stopped the test.

Regards Andreas

Sep 15 14:55:19 virt kernel: ------------[ cut here ]------------
Sep 15 14:55:19 virt kernel: WARNING: at net/sched/sch_generic.c:261 dev_watchdog+0x220/0x230()
Sep 15 14:55:19 virt kernel: Hardware name: X8SIL
Sep 15 14:55:19 virt kernel: NETDEV WATCHDOG: peth0 (e1000e): transmit queue 0 timed out
Sep 15 14:55:19 virt kernel: Modules linked in: bridge stp llc iptable_filter xt_MARK xt_mark xt_iprange xt_conntrack nf_conntrack ip_tables x_tables tun loop e1000e
Sep 15 14:55:19 virt kernel: Pid: 4088, comm: blkback.1.hdc Not tainted 2.6.32.18-pvops0-ak3 #1
Sep 15 14:55:19 virt kernel: Call Trace:
Sep 15 14:55:19 virt kernel: <IRQ> [<ffffffff810458f6>] warn_slowpath_common+0x76/0xb0
Sep 15 14:55:19 virt kernel: [<ffffffff8104598c>] warn_slowpath_fmt+0x3c/0x40
Sep 15 14:55:19 virt kernel: [<ffffffff812f1b70>] dev_watchdog+0x220/0x230
Sep 15 14:55:19 virt kernel: [<ffffffff81050820>] ? mod_timer+0x110/0x180
Sep 15 14:55:19 virt kernel: [<ffffffff81091c40>] ? sync_supers_timer_fn+0x0/0x20
Sep 15 14:55:19 virt kernel: [<ffffffff812f1950>] ? dev_watchdog+0x0/0x230
Sep 15 14:55:19 virt kernel: [<ffffffff810502fc>] run_timer_softirq+0x14c/0x230
Sep 15 14:55:19 virt kernel: [<ffffffff8104b72f>] __do_softirq+0xaf/0x140
Sep 15 14:55:19 virt kernel: [<ffffffff811c5c09>] ? __xen_evtchn_do_upcall+0x219/0x230
Sep 15 14:55:19 virt kernel: [<ffffffff8101357c>] call_softirq+0x1c/0x30
Sep 15 14:55:19 virt kernel: [<ffffffff81015675>] do_softirq+0x65/0xa0
Sep 15 14:55:19 virt kernel: [<ffffffff8104b3fd>] irq_exit+0x8d/0x90
Sep 15 14:55:19 virt kernel: [<ffffffff811c5cdd>] xen_evtchn_do_upcall+0x3d/0x60
Sep 15 14:55:19 virt kernel: [<ffffffff810135ce>] xen_do_hypervisor_callback+0x1e/0x30
Sep 15 14:55:19 virt kernel: <EOI> [<ffffffff8100922a>] ? hypercall_page+0x22a/0x1010
Sep 15 14:55:19 virt kernel: [<ffffffff8100922a>] ? hypercall_page+0x22a/0x1010
Sep 15 14:55:19 virt kernel: [<ffffffff8100ed7d>] ? xen_force_evtchn_callback+0xd/0x10
Sep 15 14:55:19 virt kernel: [<ffffffff8100f712>] ? check_events+0x12/0x20
Sep 15 14:55:19 virt kernel: [<ffffffff8100f6b9>] ? xen_irq_enable_direct_end+0x0/0x7
Sep 15 14:55:19 virt kernel: [<ffffffff8135e9dd>] ? _spin_unlock_irq+0xd/0x40
Sep 15 14:55:19 virt kernel: [<ffffffff81151955>] ? generic_unplug_device+0x35/0x40
Sep 15 14:55:19 virt kernel: [<ffffffff811cf456>] ? unplug_queue+0x26/0x50
Sep 15 14:55:19 virt kernel: [<ffffffff811d001e>] ? blkif_schedule+0xde/0x320
Sep 15 14:55:19 virt kernel: [<ffffffff8105c530>] ? autoremove_wake_function+0x0/0x40
Sep 15 14:55:19 virt kernel: [<ffffffff8135ea42>] ? _spin_unlock_irqrestore+0x32/0x40
Sep 15 14:55:19 virt kernel: [<ffffffff811cff40>] ? blkif_schedule+0x0/0x320
Sep 15 14:55:19 virt kernel: [<ffffffff8105c24e>] ? kthread+0x8e/0xa0
Sep 15 14:55:19 virt kernel: [<ffffffff8101347a>] ? child_rip+0xa/0x20
Sep 15 14:55:19 virt kernel: [<ffffffff81012626>] ? int_ret_from_sys_call+0x7/0x1b
Sep 15 14:55:19 virt kernel: [<ffffffff81012de1>] ? retint_restore_args+0x5/0x6
Sep 15 14:55:19 virt kernel: [<ffffffff81013470>] ? child_rip+0x0/0x20
Sep 15 14:55:19 virt kernel: ---[ end trace 6548e737c4c22ec9 ]---
Sep 15 14:55:19 virt kernel: e1000e 0000:04:00.0: peth0: Reset adapter
Sep 15 14:55:19 virt kernel: eth0: port 1(peth0) entering disabled state
Sep 15 14:55:19 virt kernel: e1000e 0000:04:00.0: peth0: Reset adapter
Sep 15 14:55:22 virt kernel: e1000e: peth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
Sep 15 14:55:22 virt kernel: eth0: port 1(peth0) entering forwarding state
Sep 15 15:16:29 virt kernel: hrtimer: interrupt took 10082426 ns
Sep 15 15:24:06 virt kernel: aacraid: Host adapter abort request (0,0,1,0)
Sep 15 15:24:06 virt kernel: aacraid: Host adapter abort request (0,0,1,0)
Sep 15 15:24:06 virt kernel: aacraid: Host adapter reset request. SCSI hang ?
Sep 15 15:24:06 virt kernel: e1000e 0000:04:00.0: peth0: Reset adapter
Sep 15 15:24:06 virt kernel: eth0: port 1(peth0) entering disabled state
Sep 15 15:24:06 virt kernel: e1000e 0000:04:00.0: peth0: Reset adapter
Sep 15 15:24:09 virt kernel: e1000e: peth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None

Re: Re: [PATCH] C6 state with EOI issue fix for some Intel processors
On Wednesday 15 September 2010 21:42:53 Andreas Kinzler wrote:
> On 15.09.2010 10:03, Keir Fraser wrote:
> >> This patch fixes this issue by checking whether an ISR is pending
> >> before entering a deep Cx state. If so, it uses power->safe_state
> >> instead of the deep Cx state to prevent the above issue from happening.
> >
> > Thanks. I reworked this patch substantially and applied as
> > xen-unstable:22160 and xen-4.0-testing:21348.
>
> I tested the patch on vanilla 4.0.1 and it does help a bit. Uptime was
> now over 100 minutes instead of under 3 minutes. But problems still
> occurred (aacraid reset, eth reset).

To determine whether the issue is caused by this erratum, you can try disabling
the C6 state in the BIOS. The erratum only occurs when the C6 state is involved.

--
regards
Yang, Sheng

> With my patch from
> (http://lists.xensource.com/archives/html/xen-devel/2010-09/msg00556.html)
> the machine uptime was over 10 days when I stopped the test.
>
> Regards Andreas
