Mailing List Archive

[sched next] overflowed cpu time for kernel threads in /proc/PID/stat
Hello,
3.11.0-rc6-next-20130820

at some point an overflowed cpu time is reported in /proc/PID/stat

cat /proc/10/stat
10 (rcu_preempt) S 2 0 0 0 -1 2129984 0 0 0 0 0 406 0 0 20 0 1 0 10 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 0 0 0 0 0 0 0 0 0 0 0 0 0 0
cat /proc/10/stat
10 (rcu_preempt) S 2 0 0 0 -1 2129984 0 0 0 0 0 406 0 0 20 0 1 0 10 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 0 0 0 0 0 0 0 0 0 0 0 0 0 0
cat /proc/10/stat
10 (rcu_preempt) S 2 0 0 0 -1 2129984 0 0 0 0 0 407 0 0 20 0 1 0 10 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 0 0 0 0 0 0 0 0 0 0 0 0 0 0
cat /proc/10/stat
10 (rcu_preempt) S 2 0 0 0 -1 2129984 0 0 0 0 0 407 0 0 20 0 1 0 10 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 0 0 0 0 0 0 0 0 0 0 0 0 0 0
cat /proc/10/stat
10 (rcu_preempt) S 2 0 0 0 -1 2129984 0 0 0 0 1844674407370 477 0 0 20 0 1 0 10 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 0 0 0 0 0 0 0 0 0 0 0 0 0 0
^^^^^^^^^
cat /proc/10/stat
10 (rcu_preempt) S 2 0 0 0 -1 2129984 0 0 0 0 1844674407370 597 0 0 20 0 1 0 10 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 0 0 0 0 0 0 0 0 0 0 0 0 0 0

(seems there are several more -1UL in output)


oddly enough, I see this happening only for RCU

ps aux | grep rcu
root 8 0.0 0.0 0 0 ? S 12:42 0:00 [rcuc/0]
root 9 0.0 0.0 0 0 ? S 12:42 0:00 [rcub/0]
root 10 62422329 0.0 0 0 ? R 12:42 21114581:37 [rcu_preempt]
root 11 0.1 0.0 0 0 ? S 12:42 0:02 [rcuop/0]
root 12 62422329 0.0 0 0 ? S 12:42 21114581:35 [rcuop/1]

-ss
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: [sched next] overflowed cpu time for kernel threads in /proc/PID/stat
On Tue, Aug 20, 2013 at 02:14:26PM +0300, Sergey Senozhatsky wrote:
> Hello,
> 3.11.0-rc6-next-20130820
>
> at some point an overflowed cpu time is reported in /proc/PID/stat
>
> cat /proc/10/stat
> 10 (rcu_preempt) S 2 0 0 0 -1 2129984 0 0 0 0 0 406 0 0 20 0 1 0 10 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 0 0 0 0 0 0 0 0 0 0 0 0 0 0
> cat /proc/10/stat
> 10 (rcu_preempt) S 2 0 0 0 -1 2129984 0 0 0 0 0 406 0 0 20 0 1 0 10 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 0 0 0 0 0 0 0 0 0 0 0 0 0 0
> cat /proc/10/stat
> 10 (rcu_preempt) S 2 0 0 0 -1 2129984 0 0 0 0 0 407 0 0 20 0 1 0 10 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 0 0 0 0 0 0 0 0 0 0 0 0 0 0
> cat /proc/10/stat
> 10 (rcu_preempt) S 2 0 0 0 -1 2129984 0 0 0 0 0 407 0 0 20 0 1 0 10 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 0 0 0 0 0 0 0 0 0 0 0 0 0 0
> cat /proc/10/stat
> 10 (rcu_preempt) S 2 0 0 0 -1 2129984 0 0 0 0 1844674407370 477 0 0 20 0 1 0 10 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 0 0 0 0 0 0 0 0 0 0 0 0 0 0
> ^^^^^^^^^
> cat /proc/10/stat
> 10 (rcu_preempt) S 2 0 0 0 -1 2129984 0 0 0 0 1844674407370 597 0 0 20 0 1 0 10 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 0 0 0 0 0 0 0 0 0 0 0 0 0 0
>
> (seems there are several more -1UL in output)
>
>
> oddly enough, I see this happening only for RCU
>
> ps aux | grep rcu
> root 8 0.0 0.0 0 0 ? S 12:42 0:00 [rcuc/0]
> root 9 0.0 0.0 0 0 ? S 12:42 0:00 [rcub/0]
> root 10 62422329 0.0 0 0 ? R 12:42 21114581:37 [rcu_preempt]
> root 11 0.1 0.0 0 0 ? S 12:42 0:02 [rcuop/0]
> root 12 62422329 0.0 0 0 ? S 12:42 21114581:35 [rcuop/1]
>
> -ss


Hello,

Please upload your config.

I'm adding Stanislaw in Cc in case it relates to cputime scaling.
Re: [sched next] overflowed cpu time for kernel threads in /proc/PID/stat
On (08/20/13 17:15), Frederic Weisbecker wrote:
> On Tue, Aug 20, 2013 at 02:14:26PM +0300, Sergey Senozhatsky wrote:
> > Hello,
> > 3.11.0-rc6-next-20130820
> >
> > at some point an overflowed cpu time is reported in /proc/PID/stat
> >
> > cat /proc/10/stat
> > 10 (rcu_preempt) S 2 0 0 0 -1 2129984 0 0 0 0 0 406 0 0 20 0 1 0 10 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 0 0 0 0 0 0 0 0 0 0 0 0 0 0
> > cat /proc/10/stat
> > 10 (rcu_preempt) S 2 0 0 0 -1 2129984 0 0 0 0 0 406 0 0 20 0 1 0 10 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 0 0 0 0 0 0 0 0 0 0 0 0 0 0
> > cat /proc/10/stat
> > 10 (rcu_preempt) S 2 0 0 0 -1 2129984 0 0 0 0 0 407 0 0 20 0 1 0 10 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 0 0 0 0 0 0 0 0 0 0 0 0 0 0
> > cat /proc/10/stat
> > 10 (rcu_preempt) S 2 0 0 0 -1 2129984 0 0 0 0 0 407 0 0 20 0 1 0 10 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 0 0 0 0 0 0 0 0 0 0 0 0 0 0
> > cat /proc/10/stat
> > 10 (rcu_preempt) S 2 0 0 0 -1 2129984 0 0 0 0 1844674407370 477 0 0 20 0 1 0 10 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 0 0 0 0 0 0 0 0 0 0 0 0 0 0
> > ^^^^^^^^^
> > cat /proc/10/stat
> > 10 (rcu_preempt) S 2 0 0 0 -1 2129984 0 0 0 0 1844674407370 597 0 0 20 0 1 0 10 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 0 0 0 0 0 0 0 0 0 0 0 0 0 0
> >
> > (seems there are several more -1UL in output)
> >
> >
> > oddly enough, I see this happening only for RCU
> >
> > ps aux | grep rcu
> > root 8 0.0 0.0 0 0 ? S 12:42 0:00 [rcuc/0]
> > root 9 0.0 0.0 0 0 ? S 12:42 0:00 [rcub/0]
> > root 10 62422329 0.0 0 0 ? R 12:42 21114581:37 [rcu_preempt]
> > root 11 0.1 0.0 0 0 ? S 12:42 0:02 [rcuop/0]
> > root 12 62422329 0.0 0 0 ? S 12:42 21114581:35 [rcuop/1]
> >
> > -ss
>
>
> Hello,
>
> Please upload your config.
>
> I'm adding Stanislaw in Cc in case it relates to cputime scaling.
>

please find attached.

-ss
Re: [sched next] overflowed cpu time for kernel threads in /proc/PID/stat
On Tue, Aug 20, 2013 at 06:35:50PM +0300, Sergey Senozhatsky wrote:
> On (08/20/13 17:15), Frederic Weisbecker wrote:
> > On Tue, Aug 20, 2013 at 02:14:26PM +0300, Sergey Senozhatsky wrote:
> > Hello,
> >
> > Please upload your config.
> >
> > I'm adding Stanislaw in Cc in case it relates to cputime scaling.
> >
>
> please find attached.
>
> -ss

> #
> # CPU/Task time and stats accounting
> #
> CONFIG_VIRT_CPU_ACCOUNTING=y
> CONFIG_VIRT_CPU_ACCOUNTING_GEN=y

So it happens with full dynticks cputime accounting.
How reproducible is this? Does it always happen?
Does it happen with CONFIG_TICK_CPU_ACCOUNTING as well? (you'll
need to deactivate full dynticks.)

Thanks.
Re: [sched next] overflowed cpu time for kernel threads in /proc/PID/stat
On (08/20/13 17:42), Frederic Weisbecker wrote:
> On Tue, Aug 20, 2013 at 06:35:50PM +0300, Sergey Senozhatsky wrote:
> > On (08/20/13 17:15), Frederic Weisbecker wrote:
> > > On Tue, Aug 20, 2013 at 02:14:26PM +0300, Sergey Senozhatsky wrote:
> > > Hello,
> > >
> > > Please upload your config.
> > >
> > > I'm adding Stanislaw in Cc in case it relates to cputime scaling.
> > >
> >
> > please find attached.
> >
> > -ss
>
> > #
> > # CPU/Task time and stats accounting
> > #
> > CONFIG_VIRT_CPU_ACCOUNTING=y
> > CONFIG_VIRT_CPU_ACCOUNTING_GEN=y
>
> So it happens with full dynticks cputime accounting.

yes.

> How reproducible is this?

compiling something big enough (e.g. kernel) `works' for me.

> Does it always happen?

I'd say yes.

> Does it happen with CONFIG_TICK_CPU_ACCOUNTING as well? (you'll
> need to deactivate full dynticks.)

will test.

-ss

> Thanks.
>
Re: [sched next] overflowed cpu time for kernel threads in /proc/PID/stat
On (08/20/13 17:42), Frederic Weisbecker wrote:
> On Tue, Aug 20, 2013 at 06:35:50PM +0300, Sergey Senozhatsky wrote:
> > On (08/20/13 17:15), Frederic Weisbecker wrote:
> > > On Tue, Aug 20, 2013 at 02:14:26PM +0300, Sergey Senozhatsky wrote:
> > > Hello,
> > >
> > > Please upload your config.
> > >
> > > I'm adding Stanislaw in Cc in case it relates to cputime scaling.
> > >
> >
> > please find attached.
> >
> > -ss
>
> > #
> > # CPU/Task time and stats accounting
> > #
> > CONFIG_VIRT_CPU_ACCOUNTING=y
> > CONFIG_VIRT_CPU_ACCOUNTING_GEN=y
>
> So it happens with full dynticks cputime accounting.
> How reproducible is this? Does it always happen?
> Does it happen with CONFIG_TICK_CPU_ACCOUNTING as well? (you'll
> need to deactivate full dynticks.)
>
> Thanks.
>

with full dynticks disabled:

#
# Timers subsystem
#
CONFIG_TICK_ONESHOT=y
CONFIG_HZ_PERIODIC=y
# CONFIG_NO_HZ_IDLE is not set
# CONFIG_NO_HZ_FULL is not set
# CONFIG_NO_HZ is not set
CONFIG_HIGH_RES_TIMERS=y

#
# CPU/Task time and stats accounting
#
CONFIG_TICK_CPU_ACCOUNTING=y
# CONFIG_VIRT_CPU_ACCOUNTING_GEN is not set
# CONFIG_IRQ_TIME_ACCOUNTING is not set
CONFIG_BSD_PROCESS_ACCT=y
CONFIG_BSD_PROCESS_ACCT_V3=y
CONFIG_TASKSTATS=y
CONFIG_TASK_DELAY_ACCT=y
CONFIG_TASK_XACCT=y
CONFIG_TASK_IO_ACCOUNTING=y


everything looks fine (no overflow).

-ss

Re: [sched next] overflowed cpu time for kernel threads in /proc/PID/stat
On (08/20/13 17:42), Frederic Weisbecker wrote:
> On Tue, Aug 20, 2013 at 06:35:50PM +0300, Sergey Senozhatsky wrote:
> > On (08/20/13 17:15), Frederic Weisbecker wrote:
> > > On Tue, Aug 20, 2013 at 02:14:26PM +0300, Sergey Senozhatsky wrote:
> > > Hello,
> > >
> > > Please upload your config.
> > >
> > > I'm adding Stanislaw in Cc in case it relates to cputime scaling.
> > >
> >
> > please find attached.
> >
> > -ss
>
> > #
> > # CPU/Task time and stats accounting
> > #
> > CONFIG_VIRT_CPU_ACCOUNTING=y
> > CONFIG_VIRT_CPU_ACCOUNTING_GEN=y
>
> So it happens with full dynticks cputime accounting.
> How reproducible is this? Does it always happen?
> Does it happen with CONFIG_TICK_CPU_ACCOUNTING as well? (you'll
> need to deactivate full dynticks.)

in cputime_adjust(), `stime' is greater than `rtime', so `utime = rtime - stime'
underflows and sets a bogus (wrapped) value.

-ss
Re: [sched next] overflowed cpu time for kernel threads in /proc/PID/stat
On Wed, Aug 21, 2013 at 06:39:57PM +0300, Sergey Senozhatsky wrote:
> On (08/20/13 17:42), Frederic Weisbecker wrote:
> > On Tue, Aug 20, 2013 at 06:35:50PM +0300, Sergey Senozhatsky wrote:
> > > On (08/20/13 17:15), Frederic Weisbecker wrote:
> > > > On Tue, Aug 20, 2013 at 02:14:26PM +0300, Sergey Senozhatsky wrote:
> > > > Hello,
> > > >
> > > > Please upload your config.
> > > >
> > > > I'm adding Stanislaw in Cc in case it relates to cputime scaling.
> > > >
> > >
> > > please find attached.
> > >
> > > -ss
> >
> > > #
> > > # CPU/Task time and stats accounting
> > > #
> > > CONFIG_VIRT_CPU_ACCOUNTING=y
> > > CONFIG_VIRT_CPU_ACCOUNTING_GEN=y
> >
> > So it happens with full dynticks cputime accounting.
> > How reproducible is this? Does it always happen?
> > Does it happen with CONFIG_TICK_CPU_ACCOUNTING as well? (you'll
> > need to deactivate full dynticks.)
>
> in cputime_adjust() `stime' is greater than `rtime', so `utime = rtime - stime'
> sets wrong value.

But stime should always be below rtime due to the calculation done by scale_stime()
which roughly sums up to:

stime = (stime / (stime + utime)) * rtime

So this shouldn't happen.

I couldn't manage to reproduce it though. Can you still trigger it with latest -next?

Thanks.
Re: [sched next] overflowed cpu time for kernel threads in /proc/PID/stat
On (08/31/13 01:04), Frederic Weisbecker wrote:
> But stime should always be below rtime due to the calculation done by scale_stime()
> which roughly sums up to:
>
> stime = (stime / (stime + utime)) * rtime
>
> So this shouldn't happen.
>
> I couldn't manage to reproduce it though. Can you still trigger it with latest -next?
>
> Thanks.

Hello,
I can't pull current -next (away from good network connection for a
couple of days). What I have is 3.11.0-rc7-next-20130829:

ps aux | grep rcu
root 8 0.0 0.0 0 0 ? S Aug30 0:00 [rcuc/0]
root 9 0.0 0.0 0 0 ? S Aug30 0:00 [rcub/0]
root 10 21376376 0.0 0 0 ? S Aug30 21114581:36 [rcu_preempt]
root 11 21376376 0.0 0 0 ? S Aug30 21114581:35 [rcuop/0]
root 12 21376376 0.0 0 0 ? S Aug30 21114581:35 [rcuop/1]

cat /proc/10/stat
10 (rcu_preempt) S 2 0 0 0 -1 2129984 0 0 0 0 1844674407370 715 0 0 20 0 1 0 10 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 0 0 0 0 0 0 0 0 0 0 0 0 0 0

-ss
Re: [sched next] overflowed cpu time for kernel threads in /proc/PID/stat
On (08/31/13 01:04), Frederic Weisbecker wrote:
> > in cputime_adjust() `stime' is greater than `rtime', so `utime = rtime - stime'
> > sets wrong value.
>
> But stime should always be below rtime due to the calculation done by scale_stime()
> which roughly sums up to:
>
> > stime = (stime / (stime + utime)) * rtime
>
> So this shouldn't happen.
>
> I couldn't manage to reproduce it though. Can you still trigger it with latest -next?
>

Hope this may help.
I've added a silly check to make sure that `stime < rtime'

@@ -579,6 +582,10 @@ static void cputime_adjust(struct task_cputime *curr,
 	if (total) {
 		stime = scale_stime((__force u64)stime,
 				    (__force u64)rtime, (__force u64)total);
+		if (stime > rtime) {
+			printk(KERN_ERR "Ooops: stime:%llu rtime:%llu\n", stime, rtime);
+			WARN_ON(1);
+		}
 		utime = rtime - stime;
 	} else {
 		stime = rtime;


and got the following results:


[ 1295.311056] Ooops: stime:4622750929 rtime:4622750928
[ 1295.311063] ------------[ cut here ]------------
[ 1295.311072] WARNING: CPU: 1 PID: 2167 at kernel/sched/cputime.c:584 cputime_adjust+0x132/0x140()
[ 1295.311123] CPU: 1 PID: 2167 Comm: top Tainted: G C 3.11.0-rc7-next-20130902-dbg-dirty #2
[ 1295.311126] Hardware name: Acer Aspire 5741G /Aspire 5741G , BIOS V1.20 02/08/2011
[ 1295.311128] 0000000000000009 ffff880099937c50 ffffffff8162fc85 0000000000000000
[ 1295.311134] ffff880099937c88 ffffffff8104ee2d ffff8801530bb158 ffff880099937df0
[ 1295.311139] 00000001138994d0 00000001138994d1 ffff880153118ff0 ffff880099937c98
[ 1295.311145] Call Trace:
[ 1295.311155] [<ffffffff8162fc85>] dump_stack+0x4e/0x82
[ 1295.311162] [<ffffffff8104ee2d>] warn_slowpath_common+0x7d/0xa0
[ 1295.311166] [<ffffffff8104ef0a>] warn_slowpath_null+0x1a/0x20
[ 1295.311170] [<ffffffff8108db72>] cputime_adjust+0x132/0x140
[ 1295.311175] [<ffffffff8108e941>] thread_group_cputime_adjusted+0x41/0x50
[ 1295.311182] [<ffffffff811f3ac4>] do_task_stat+0x9c4/0xb50
[ 1295.311188] [<ffffffff811f4904>] proc_tgid_stat+0x14/0x20
[ 1295.311192] [<ffffffff811edac4>] proc_single_show+0x54/0xa0
[ 1295.311198] [<ffffffff811a6d24>] seq_read+0x164/0x3d0
[ 1295.311204] [<ffffffff811805e1>] vfs_read+0xa1/0x180
[ 1295.311208] [<ffffffff8118119c>] SyS_read+0x4c/0xa0
[ 1295.311213] [<ffffffff8163ad06>] tracesys+0xd4/0xd9
[ 1295.311216] ---[ end trace bab1e899ff45eea2 ]---

[ 1301.384396] Ooops: stime:4654072951 rtime:4654072950
[ 1301.384401] ------------[ cut here ]------------
[ 1301.384409] WARNING: CPU: 0 PID: 2167 at kernel/sched/cputime.c:584 cputime_adjust+0x132/0x140()
[ 1301.384450] CPU: 0 PID: 2167 Comm: top Tainted: G WC 3.11.0-rc7-next-20130902-dbg-dirty #2
[ 1301.384452] Hardware name: Acer Aspire 5741G /Aspire 5741G , BIOS V1.20 02/08/2011
[ 1301.384454] 0000000000000009 ffff880099937c50 ffffffff8162fc85 0000000000000000
[ 1301.384458] ffff880099937c88 ffffffff8104ee2d ffff8801530bb158 ffff880099937df0
[ 1301.384462] 0000000115678476 0000000115678477 ffff880153118ff0 ffff880099937c98
[ 1301.384466] Call Trace:
[ 1301.384475] [<ffffffff8162fc85>] dump_stack+0x4e/0x82
[ 1301.384481] [<ffffffff8104ee2d>] warn_slowpath_common+0x7d/0xa0
[ 1301.384484] [<ffffffff8104ef0a>] warn_slowpath_null+0x1a/0x20
[ 1301.384487] [<ffffffff8108db72>] cputime_adjust+0x132/0x140
[ 1301.384491] [<ffffffff8108e941>] thread_group_cputime_adjusted+0x41/0x50
[ 1301.384497] [<ffffffff811f3ac4>] do_task_stat+0x9c4/0xb50
[ 1301.384501] [<ffffffff811f4904>] proc_tgid_stat+0x14/0x20
[ 1301.384504] [<ffffffff811edac4>] proc_single_show+0x54/0xa0
[ 1301.384509] [<ffffffff811a6d24>] seq_read+0x164/0x3d0
[ 1301.384514] [<ffffffff811805e1>] vfs_read+0xa1/0x180
[ 1301.384517] [<ffffffff8118119c>] SyS_read+0x4c/0xa0
[ 1301.384521] [<ffffffff8163ad06>] tracesys+0xd4/0xd9
[ 1301.384523] ---[ end trace bab1e899ff45eea3 ]---

and so on.


$ dmesg | grep Ooops
[ 1295.311056] Ooops: stime:4622750929 rtime:4622750928
[ 1301.384396] Ooops: stime:4654072951 rtime:4654072950
[ 1307.438935] Ooops: stime:4687858107 rtime:4687858106
[ 1313.493462] Ooops: stime:4724166945 rtime:4724166944
[ 1316.521740] Ooops: stime:4741142233 rtime:4741142232
[ 1325.605625] Ooops: stime:4793906690 rtime:4793906688
[ 1331.648611] Ooops: stime:4308413629 rtime:4308413628
[ 1337.708468] Ooops: stime:4510156747 rtime:4510156746
[ 1340.744586] Ooops: stime:4880584390 rtime:4880584389
[ 1343.773111] Ooops: stime:4442005879 rtime:4442005878
[ 1343.773362] Ooops: stime:4558310943 rtime:4558310942
[ 1349.815797] Ooops: stime:4927379646 rtime:4927379645
[ 1352.836936] Ooops: stime:4942782836 rtime:4942782834
[ 1358.885232] Ooops: stime:4979031940 rtime:4979031938
[ 1364.977902] Ooops: stime:5019633914 rtime:5019633912
[ 1364.978283] Ooops: stime:4601969101 rtime:4601969100
[ 1364.978623] Ooops: stime:4826393815 rtime:4826393814
[ 1371.034265] Ooops: stime:4894272175 rtime:4894272174
[ 1374.077150] Ooops: stime:4688831829 rtime:4688831828
[ 1374.077519] Ooops: stime:4916319315 rtime:4916319314
[ 1377.110416] Ooops: stime:5084174048 rtime:5084174047
[ 1377.111043] Ooops: stime:4949250198 rtime:4949250196
[ 1380.137382] Ooops: stime:5096963048 rtime:5096963046
[ 1380.137739] Ooops: stime:4975014880 rtime:4975014879
[ 1383.161441] Ooops: stime:5002466396 rtime:5002466394
[ 1389.212166] Ooops: stime:5140400038 rtime:5140400036
[ 1392.239303] Ooops: stime:5082051028 rtime:5082051027
[ 1395.270391] Ooops: stime:5170118636 rtime:5170118634
[ 1398.303593] Ooops: stime:5184355318 rtime:5184355316


-ss
Re: [sched next] overflowed cpu time for kernel threads in /proc/PID/stat
On Mon, Sep 02, 2013 at 03:28:45PM +0300, Sergey Senozhatsky wrote:
> On (08/31/13 01:04), Frederic Weisbecker wrote:
> > > in cputime_adjust() `stime' is greater than `rtime', so `utime = rtime - stime'
> > > sets wrong value.
> >
> > But stime should always be below rtime due to the calculation done by scale_stime()
> > which roughly sums up to:
> >
> > stime = (stime / (stime + utime)) * rtime
> >
> > So this shouldn't happen.
> >
> > I couldn't manage to reproduce it though. Can you still trigger it with latest -next?
> >
>
> Hope this may help.
> I've added a silly check to make sure that `stime < rtime'
>
> @@ -579,6 +582,10 @@ static void cputime_adjust(struct task_cputime *curr,
> if (total) {
> stime = scale_stime((__force u64)stime,
> (__force u64)rtime, (__force u64)total);
> + if (stime > rtime) {
> + printk(KERN_ERR "Ooops: stime:%llu rtime:%llu\n", stime, rtime);
> + WARN_ON(1);
> + }
> utime = rtime - stime;
> } else {
> stime = rtime;
>
>
> and got the following results:
>
>
> [ 1295.311056] Ooops: stime:4622750929 rtime:4622750928
> [ 1295.311063] ------------[ cut here ]------------
> [ 1295.311072] WARNING: CPU: 1 PID: 2167 at kernel/sched/cputime.c:584 cputime_adjust+0x132/0x140()
> [ 1295.311123] CPU: 1 PID: 2167 Comm: top Tainted: G C 3.11.0-rc7-next-20130902-dbg-dirty #2
> [ 1295.311126] Hardware name: Acer Aspire 5741G /Aspire 5741G , BIOS V1.20 02/08/2011
> [ 1295.311128] 0000000000000009 ffff880099937c50 ffffffff8162fc85 0000000000000000
> [ 1295.311134] ffff880099937c88 ffffffff8104ee2d ffff8801530bb158 ffff880099937df0
> [ 1295.311139] 00000001138994d0 00000001138994d1 ffff880153118ff0 ffff880099937c98
> [ 1295.311145] Call Trace:
> [ 1295.311155] [<ffffffff8162fc85>] dump_stack+0x4e/0x82
> [ 1295.311162] [<ffffffff8104ee2d>] warn_slowpath_common+0x7d/0xa0
> [ 1295.311166] [<ffffffff8104ef0a>] warn_slowpath_null+0x1a/0x20
> [ 1295.311170] [<ffffffff8108db72>] cputime_adjust+0x132/0x140
> [ 1295.311175] [<ffffffff8108e941>] thread_group_cputime_adjusted+0x41/0x50
> [ 1295.311182] [<ffffffff811f3ac4>] do_task_stat+0x9c4/0xb50
> [ 1295.311188] [<ffffffff811f4904>] proc_tgid_stat+0x14/0x20
> [ 1295.311192] [<ffffffff811edac4>] proc_single_show+0x54/0xa0
> [ 1295.311198] [<ffffffff811a6d24>] seq_read+0x164/0x3d0
> [ 1295.311204] [<ffffffff811805e1>] vfs_read+0xa1/0x180
> [ 1295.311208] [<ffffffff8118119c>] SyS_read+0x4c/0xa0
> [ 1295.311213] [<ffffffff8163ad06>] tracesys+0xd4/0xd9
> [ 1295.311216] ---[ end trace bab1e899ff45eea2 ]---
>
> [ 1301.384396] Ooops: stime:4654072951 rtime:4654072950
> [ 1301.384401] ------------[ cut here ]------------
> [ 1301.384409] WARNING: CPU: 0 PID: 2167 at kernel/sched/cputime.c:584 cputime_adjust+0x132/0x140()
> [ 1301.384450] CPU: 0 PID: 2167 Comm: top Tainted: G WC 3.11.0-rc7-next-20130902-dbg-dirty #2
> [ 1301.384452] Hardware name: Acer Aspire 5741G /Aspire 5741G , BIOS V1.20 02/08/2011
> [ 1301.384454] 0000000000000009 ffff880099937c50 ffffffff8162fc85 0000000000000000
> [ 1301.384458] ffff880099937c88 ffffffff8104ee2d ffff8801530bb158 ffff880099937df0
> [ 1301.384462] 0000000115678476 0000000115678477 ffff880153118ff0 ffff880099937c98
> [ 1301.384466] Call Trace:
> [ 1301.384475] [<ffffffff8162fc85>] dump_stack+0x4e/0x82
> [ 1301.384481] [<ffffffff8104ee2d>] warn_slowpath_common+0x7d/0xa0
> [ 1301.384484] [<ffffffff8104ef0a>] warn_slowpath_null+0x1a/0x20
> [ 1301.384487] [<ffffffff8108db72>] cputime_adjust+0x132/0x140
> [ 1301.384491] [<ffffffff8108e941>] thread_group_cputime_adjusted+0x41/0x50
> [ 1301.384497] [<ffffffff811f3ac4>] do_task_stat+0x9c4/0xb50
> [ 1301.384501] [<ffffffff811f4904>] proc_tgid_stat+0x14/0x20
> [ 1301.384504] [<ffffffff811edac4>] proc_single_show+0x54/0xa0
> [ 1301.384509] [<ffffffff811a6d24>] seq_read+0x164/0x3d0
> [ 1301.384514] [<ffffffff811805e1>] vfs_read+0xa1/0x180
> [ 1301.384517] [<ffffffff8118119c>] SyS_read+0x4c/0xa0
> [ 1301.384521] [<ffffffff8163ad06>] tracesys+0xd4/0xd9
> [ 1301.384523] ---[ end trace bab1e899ff45eea3 ]---
>
> and so on.
>
>
> $ dmesg | grep Ooops
> [ 1295.311056] Ooops: stime:4622750929 rtime:4622750928
> [ 1301.384396] Ooops: stime:4654072951 rtime:4654072950
> [ 1307.438935] Ooops: stime:4687858107 rtime:4687858106
> [ 1313.493462] Ooops: stime:4724166945 rtime:4724166944
> [ 1316.521740] Ooops: stime:4741142233 rtime:4741142232
> [ 1325.605625] Ooops: stime:4793906690 rtime:4793906688
> [ 1331.648611] Ooops: stime:4308413629 rtime:4308413628
> [ 1337.708468] Ooops: stime:4510156747 rtime:4510156746
> [ 1340.744586] Ooops: stime:4880584390 rtime:4880584389
> [ 1343.773111] Ooops: stime:4442005879 rtime:4442005878
> [ 1343.773362] Ooops: stime:4558310943 rtime:4558310942
> [ 1349.815797] Ooops: stime:4927379646 rtime:4927379645
> [ 1352.836936] Ooops: stime:4942782836 rtime:4942782834
> [ 1358.885232] Ooops: stime:4979031940 rtime:4979031938
> [ 1364.977902] Ooops: stime:5019633914 rtime:5019633912
> [ 1364.978283] Ooops: stime:4601969101 rtime:4601969100
> [ 1364.978623] Ooops: stime:4826393815 rtime:4826393814
> [ 1371.034265] Ooops: stime:4894272175 rtime:4894272174
> [ 1374.077150] Ooops: stime:4688831829 rtime:4688831828
> [ 1374.077519] Ooops: stime:4916319315 rtime:4916319314
> [ 1377.110416] Ooops: stime:5084174048 rtime:5084174047
> [ 1377.111043] Ooops: stime:4949250198 rtime:4949250196
> [ 1380.137382] Ooops: stime:5096963048 rtime:5096963046
> [ 1380.137739] Ooops: stime:4975014880 rtime:4975014879
> [ 1383.161441] Ooops: stime:5002466396 rtime:5002466394
> [ 1389.212166] Ooops: stime:5140400038 rtime:5140400036
> [ 1392.239303] Ooops: stime:5082051028 rtime:5082051027
> [ 1395.270391] Ooops: stime:5170118636 rtime:5170118634
> [ 1398.303593] Ooops: stime:5184355318 rtime:5184355316


Thanks a lot Sergey for testing this further!

Interesting results: the scaled stime always ends up one or two units above rtime.
Stanislaw wrote the scaling code with Linus and has a better idea of the math guts
here.

Stanislaw, any idea?

>
>
> -ss
Re: [sched next] overflowed cpu time for kernel threads in /proc/PID/stat
On (09/02/13 15:07), Frederic Weisbecker wrote:
> > $ dmesg | grep Ooops
> > [ 1295.311056] Ooops: stime:4622750929 rtime:4622750928
> > [ 1301.384396] Ooops: stime:4654072951 rtime:4654072950
> > [ 1307.438935] Ooops: stime:4687858107 rtime:4687858106
> > [ 1313.493462] Ooops: stime:4724166945 rtime:4724166944
> > [ 1316.521740] Ooops: stime:4741142233 rtime:4741142232
> > [ 1325.605625] Ooops: stime:4793906690 rtime:4793906688
> > [ 1331.648611] Ooops: stime:4308413629 rtime:4308413628
> > [ 1337.708468] Ooops: stime:4510156747 rtime:4510156746
> > [ 1340.744586] Ooops: stime:4880584390 rtime:4880584389
> > [ 1343.773111] Ooops: stime:4442005879 rtime:4442005878
> > [ 1343.773362] Ooops: stime:4558310943 rtime:4558310942
> > [ 1349.815797] Ooops: stime:4927379646 rtime:4927379645
> > [ 1352.836936] Ooops: stime:4942782836 rtime:4942782834
> > [ 1358.885232] Ooops: stime:4979031940 rtime:4979031938
> > [ 1364.977902] Ooops: stime:5019633914 rtime:5019633912
> > [ 1364.978283] Ooops: stime:4601969101 rtime:4601969100
> > [ 1364.978623] Ooops: stime:4826393815 rtime:4826393814
> > [ 1371.034265] Ooops: stime:4894272175 rtime:4894272174
> > [ 1374.077150] Ooops: stime:4688831829 rtime:4688831828
> > [ 1374.077519] Ooops: stime:4916319315 rtime:4916319314
> > [ 1377.110416] Ooops: stime:5084174048 rtime:5084174047
> > [ 1377.111043] Ooops: stime:4949250198 rtime:4949250196
> > [ 1380.137382] Ooops: stime:5096963048 rtime:5096963046
> > [ 1380.137739] Ooops: stime:4975014880 rtime:4975014879
> > [ 1383.161441] Ooops: stime:5002466396 rtime:5002466394
> > [ 1389.212166] Ooops: stime:5140400038 rtime:5140400036
> > [ 1392.239303] Ooops: stime:5082051028 rtime:5082051027
> > [ 1395.270391] Ooops: stime:5170118636 rtime:5170118634
> > [ 1398.303593] Ooops: stime:5184355318 rtime:5184355316
>
>
> Thanks a lot Sergey for testing this further!
>
> Interesting results: the scaled stime always ends up one or two units above rtime.
> Stanislaw wrote the scaling code with Linus and has a better idea of the math guts
> here.
>
> Stanislaw, any idea?

Sure. Sorry it took so long to get back to this.

Actual scale_stime() inputs that trigger the overflow:

[ 1291.409566] stime:3790580815 rtime:4344293130 total:3790580815
[ 1300.484675] stime:3966526699 rtime:4316110636 total:3966526699
[ 1309.548850] stime:4016453845 rtime:4369061182 total:4016453845
[ 1315.597880] stime:4055256777 rtime:4409603756 total:4055256777
[ 1315.598340] stime:4004230541 rtime:4571167362 total:4004230541
[ 1318.623774] stime:4072651687 rtime:4427641260 total:4072651687
[ 1318.624194] stime:3307672697 rtime:4359433252 total:3307672697
[ 1321.661950] stime:4073588267 rtime:4644946914 total:4073588267
[ 1324.691457] stime:4105876887 rtime:4462631018 total:4105876887
[ 1327.722074] stime:3375231967 rtime:4439900096 total:3375231967
[ 1333.757482] stime:4156087279 rtime:4514990216 total:4156087279
[ 1333.757755] stime:3427423145 rtime:4504337498 total:3427423145
[ 1333.758080] stime:4180115893 rtime:4758073724 total:4180115893
[ 1339.813117] stime:3465843945 rtime:4554325330 total:3465843945
[ 1342.845746] stime:4204665773 rtime:4565346324 total:4204665773
[ 1345.888570] stime:3497346343 rtime:4592210094 total:3497346343
[ 1348.922371] stime:4348957782 rtime:4935439460 total:4348957782
[ 1351.950096] stime:4362056506 rtime:4949617939 total:4362056506
[ 1361.021453] stime:4295785738 rtime:4661661137 total:4295785738
[ 1361.022000] stime:4458897246 rtime:5051395981 total:4458897246
[ 1364.050371] stime:4311972683 rtime:4678581654 total:4311972683
[ 1364.050765] stime:4479087426 rtime:5072949454 total:4479087426
[ 1367.076973] stime:4499465526 rtime:5094687861 total:4499465526
[ 1370.099987] stime:4343775870 rtime:4712036053 total:4343775870
[ 1373.125650] stime:4359154163 rtime:4727095545 total:4359154163
[ 1373.126009] stime:4552630150 rtime:5150626456 total:4552630150
[ 1376.148541] stime:4374640011 rtime:4743265121 total:4374640011
[ 1379.177031] stime:3732027459 rtime:4887067184 total:3732027459
[ 1382.220666] stime:4404735122 rtime:4774829507 total:4404735122
[ 1385.246696] stime:4420289930 rtime:4790957716 total:4420289930
[ 1385.247197] stime:4649961982 rtime:5253432805 total:4649961982
[ 1388.269661] stime:4433706951 rtime:4804365472 total:4433706951
[ 1388.270078] stime:3783514895 rtime:4952742424 total:3783514895
[ 1391.299533] stime:4449952651 rtime:4821791998 total:4449952651
[ 1394.329016] stime:4463714342 rtime:4836891922 total:4463714342


-ss
Re: [sched next] overflowed cpu time for kernel threads in /proc/PID/stat
On Mon, Sep 02, 2013 at 03:07:45PM +0200, Frederic Weisbecker wrote:
> > Hope this may help.
> > I've added a silly check to make sure that `stime < rtime'
> >
> > @@ -579,6 +582,10 @@ static void cputime_adjust(struct task_cputime *curr,
> > if (total) {
> > stime = scale_stime((__force u64)stime,
> > (__force u64)rtime, (__force u64)total);
> > + if (stime > rtime) {
> > + printk(KERN_ERR "Ooops: stime:%llu rtime:%llu\n", stime, rtime);
> > + WARN_ON(1);
> > + }
> > utime = rtime - stime;
> > } else {
> > stime = rtime;
[snip]

> Thanks a lot Sergey for testing this further!
>
> Interesting results, so rtime is always one or two units off stime after scaling.
> Stanislaw made the scaling code with Linus and he has a better idea on the math guts
> here.

I don't think this is a scaling issue, but rather that at scale_stime()
input, stime is already bigger than rtime. Sergey, could you verify that
by adding a check before scale_stime()?

Stanislaw
Re: [sched next] overflowed cpu time for kernel threads in /proc/PID/stat
On (09/02/13 15:50), Stanislaw Gruszka wrote:
>
> On Mon, Sep 02, 2013 at 03:07:45PM +0200, Frederic Weisbecker wrote:
> > > Hope this may help.
> > > I've added a silly check to make sure that `stime < rtime'
> > >
> > > @@ -579,6 +582,10 @@ static void cputime_adjust(struct task_cputime *curr,
> > >  	if (total) {
> > >  		stime = scale_stime((__force u64)stime,
> > >  				    (__force u64)rtime, (__force u64)total);
> > > +		if (stime > rtime) {
> > > +			printk(KERN_ERR "Ooops: stime:%llu rtime:%llu\n", stime, rtime);
> > > +			WARN_ON(1);
> > > +		}
> > >  		utime = rtime - stime;
> > >  	} else {
> > >  		stime = rtime;
> [snip]
>
> > Thanks a lot Sergey for testing this further!
> >
> > Interesting results, so rtime is always one or two units off stime after scaling.
> > Stanislaw made the scaling code with Linus and he has a better idea on the math guts
> > here.
>
> I don't think this is a scaling issue, but rather that at scale_stime()
> input, stime is already bigger than rtime. Sergey, could you verify that
> by adding a check before scale_stime()?
>

Usually stime < rtime.
This is what scale_stime() gets as input:

[ 1291.409566] stime:3790580815 rtime:4344293130 total:3790580815
[ 1300.484675] stime:3966526699 rtime:4316110636 total:3966526699
[ 1309.548850] stime:4016453845 rtime:4369061182 total:4016453845
[ 1315.597880] stime:4055256777 rtime:4409603756 total:4055256777
[ 1315.598340] stime:4004230541 rtime:4571167362 total:4004230541
[ 1318.623774] stime:4072651687 rtime:4427641260 total:4072651687
[ 1318.624194] stime:3307672697 rtime:4359433252 total:3307672697
[ 1321.661950] stime:4073588267 rtime:4644946914 total:4073588267
[ 1324.691457] stime:4105876887 rtime:4462631018 total:4105876887
[ 1327.722074] stime:3375231967 rtime:4439900096 total:3375231967
[ 1333.757482] stime:4156087279 rtime:4514990216 total:4156087279
[ 1333.757755] stime:3427423145 rtime:4504337498 total:3427423145
[ 1333.758080] stime:4180115893 rtime:4758073724 total:4180115893
[ 1339.813117] stime:3465843945 rtime:4554325330 total:3465843945
[ 1342.845746] stime:4204665773 rtime:4565346324 total:4204665773
[ 1345.888570] stime:3497346343 rtime:4592210094 total:3497346343
[ 1348.922371] stime:4348957782 rtime:4935439460 total:4348957782
[ 1351.950096] stime:4362056506 rtime:4949617939 total:4362056506
[ 1361.021453] stime:4295785738 rtime:4661661137 total:4295785738
[ 1361.022000] stime:4458897246 rtime:5051395981 total:4458897246
[ 1364.050371] stime:4311972683 rtime:4678581654 total:4311972683
[ 1364.050765] stime:4479087426 rtime:5072949454 total:4479087426
[ 1367.076973] stime:4499465526 rtime:5094687861 total:4499465526
[ 1370.099987] stime:4343775870 rtime:4712036053 total:4343775870
[ 1373.125650] stime:4359154163 rtime:4727095545 total:4359154163
[ 1373.126009] stime:4552630150 rtime:5150626456 total:4552630150
[ 1376.148541] stime:4374640011 rtime:4743265121 total:4374640011
[ 1379.177031] stime:3732027459 rtime:4887067184 total:3732027459
[ 1382.220666] stime:4404735122 rtime:4774829507 total:4404735122
[ 1385.246696] stime:4420289930 rtime:4790957716 total:4420289930
[ 1385.247197] stime:4649961982 rtime:5253432805 total:4649961982
[ 1388.269661] stime:4433706951 rtime:4804365472 total:4433706951
[ 1388.270078] stime:3783514895 rtime:4952742424 total:3783514895
[ 1391.299533] stime:4449952651 rtime:4821791998 total:4449952651
[ 1394.329016] stime:4463714342 rtime:4836891922 total:4463714342

-ss
Re: [sched next] overflowed cpu time for kernel threads in /proc/PID/stat
On Mon, Sep 02, 2013 at 03:50:34PM +0200, Stanislaw Gruszka wrote:
> On Mon, Sep 02, 2013 at 03:07:45PM +0200, Frederic Weisbecker wrote:
> > > Hope this may help.
> > > I've added a silly check to make sure that `stime < rtime'
> > >
> > > @@ -579,6 +582,10 @@ static void cputime_adjust(struct task_cputime *curr,
> > >  	if (total) {
> > >  		stime = scale_stime((__force u64)stime,
> > >  				    (__force u64)rtime, (__force u64)total);
> > > +		if (stime > rtime) {
> > > +			printk(KERN_ERR "Ooops: stime:%llu rtime:%llu\n", stime, rtime);
> > > +			WARN_ON(1);
> > > +		}
> > >  		utime = rtime - stime;
> > >  	} else {
> > >  		stime = rtime;
> [snip]
>
> > Thanks a lot Sergey for testing this further!
> >
> > Interesting results, so rtime is always one or two units off stime after scaling.
> > Stanislaw made the scaling code with Linus and he has a better idea on the math guts
> > here.
>
> I don't think this is a scaling issue, but rather that at scale_stime()
> input, stime is already bigger than rtime. Sergey, could you verify that
> by adding a check before scale_stime()?

Note that having stime > rtime should be fine to handle. This can happen for
example if the task runs on tiny timeslices but is unlucky enough that all these
timeslices are interrupted by the tick.

>
> Stanislaw
Re: [sched next] overflowed cpu time for kernel threads in /proc/PID/stat
On (09/02/13 16:13), Frederic Weisbecker wrote:
> > On Mon, Sep 02, 2013 at 03:07:45PM +0200, Frederic Weisbecker wrote:
> > > > Hope this may help.
> > > > I've added a silly check to make sure that `stime < rtime'
> > > >
> > > > @@ -579,6 +582,10 @@ static void cputime_adjust(struct task_cputime *curr,
> > > >  	if (total) {
> > > >  		stime = scale_stime((__force u64)stime,
> > > >  				    (__force u64)rtime, (__force u64)total);
> > > > +		if (stime > rtime) {
> > > > +			printk(KERN_ERR "Ooops: stime:%llu rtime:%llu\n", stime, rtime);
> > > > +			WARN_ON(1);
> > > > +		}
> > > >  		utime = rtime - stime;
> > > >  	} else {
> > > >  		stime = rtime;
> > [snip]
> >
> > > Thanks a lot Sergey for testing this further!
> > >
> > > Interesting results, so rtime is always one or two units off stime after scaling.
> > > Stanislaw made the scaling code with Linus and he has a better idea on the math guts
> > > here.
> >
> > I don't think this is a scaling issue, but rather that at scale_stime()
> > input, stime is already bigger than rtime. Sergey, could you verify that
> > by adding a check before scale_stime()?
>
> Note that having stime > rtime should be fine to handle. This can happen for
> example if the task runs on tiny timeslices but is unlucky enough that all these
> timeslices are interrupted by the tick.
>

Even if stime > rtime, scale_stime() fixes it:

	if (stime > rtime)
		swap(rtime, stime);

-ss
Re: [sched next] overflowed cpu time for kernel threads in /proc/PID/stat
On Mon, Sep 02, 2013 at 05:00:15PM +0300, Sergey Senozhatsky wrote:
> > > Thanks a lot Sergey for testing this further!
> > >
> > > Interesting results, so rtime is always one or two units off stime after scaling.
> > > Stanislaw made the scaling code with Linus and he has a better idea on the math guts
> > > here.
> >
> > I don't think this is a scaling issue, but rather that at scale_stime()
> > input, stime is already bigger than rtime. Sergey, could you verify that
> > by adding a check before scale_stime()?
> >
>
> usually stime < rtime.
> this is what scale_stime() gets as input:
>
> [ 1291.409566] stime:3790580815 rtime:4344293130 total:3790580815

Ok, I see now: utime is 0. This seems to be a problem with dynamic ticks:
as you said, your workload is kernel compilation, so we should be utilizing
a lot of CPU time in user space.

Anyway, we should handle the utime == 0 situation in the scaling code. We
work well when rtime and stime are not big (variables and results fit in
32 bits); otherwise we hit that stime-bigger-than-rtime problem. Let's try
to handle the problem with the patch below. Sergey, does it work for you?

diff --git a/kernel/sched/cputime.c b/kernel/sched/cputime.c
index a7959e0..25cc35d 100644
--- a/kernel/sched/cputime.c
+++ b/kernel/sched/cputime.c
@@ -557,7 +557,7 @@ static void cputime_adjust(struct task_cputime *curr,
 			   struct cputime *prev,
 			   cputime_t *ut, cputime_t *st)
 {
-	cputime_t rtime, stime, utime, total;
+	cputime_t rtime, stime, utime;
 
 	if (vtime_accounting_enabled()) {
 		*ut = curr->utime;
@@ -565,9 +565,6 @@ static void cputime_adjust(struct task_cputime *curr,
 		return;
 	}
 
-	stime = curr->stime;
-	total = stime + curr->utime;
-
 	/*
 	 * Tick based cputime accounting depend on random scheduling
 	 * timeslices of a task to be interrupted or not by the timer.
@@ -588,13 +585,19 @@ static void cputime_adjust(struct task_cputime *curr,
 	if (prev->stime + prev->utime >= rtime)
 		goto out;
 
-	if (total) {
+	stime = curr->stime;
+	utime = curr->utime;
+
+	if (utime == 0) {
+		stime = rtime;
+	} else if (stime == 0) {
+		utime = rtime;
+	} else {
+		cputime_t total = stime + utime;
+
 		stime = scale_stime((__force u64)stime,
 				    (__force u64)rtime, (__force u64)total);
 		utime = rtime - stime;
-	} else {
-		stime = rtime;
-		utime = 0;
 	}
 
 	/*
Re: [sched next] overflowed cpu time for kernel threads in /proc/PID/stat
On (09/03/13 10:43), Stanislaw Gruszka wrote:
> >
> > usually stime < rtime.
> > this is what scale_stime() gets as input:
> >
> > [ 1291.409566] stime:3790580815 rtime:4344293130 total:3790580815
>
> Ok, I see now, utime is 0 . This seems to be problem with dynamic ticks
> as you told that your application is kernel compilation, so we utilize
> lot of cpu time in user-space.
>
> Anyway we should handle utime == 0 situation on scaling code. We work
> well when rtime & stime are not big (variables and results fit in
> 32 bit), otherwise we have that stime bigger than rtime problem. Let's
> try to handle the problem by below patch. Sergey, does it work for you ?
>

Sure. As far as I understand, this one is against -current, not -next.
I'll test both (though I will start with -next).

-ss

> diff --git a/kernel/sched/cputime.c b/kernel/sched/cputime.c
> index a7959e0..25cc35d 100644
> --- a/kernel/sched/cputime.c
> +++ b/kernel/sched/cputime.c
> @@ -557,7 +557,7 @@ static void cputime_adjust(struct task_cputime *curr,
>  			   struct cputime *prev,
>  			   cputime_t *ut, cputime_t *st)
>  {
> -	cputime_t rtime, stime, utime, total;
> +	cputime_t rtime, stime, utime;
>  
>  	if (vtime_accounting_enabled()) {
>  		*ut = curr->utime;
> @@ -565,9 +565,6 @@ static void cputime_adjust(struct task_cputime *curr,
>  		return;
>  	}
>  
> -	stime = curr->stime;
> -	total = stime + curr->utime;
> -
>  	/*
>  	 * Tick based cputime accounting depend on random scheduling
>  	 * timeslices of a task to be interrupted or not by the timer.
> @@ -588,13 +585,19 @@ static void cputime_adjust(struct task_cputime *curr,
>  	if (prev->stime + prev->utime >= rtime)
>  		goto out;
>  
> -	if (total) {
> +	stime = curr->stime;
> +	utime = curr->utime;
> +
> +	if (utime == 0) {
> +		stime = rtime;
> +	} else if (stime == 0) {
> +		utime = rtime;
> +	} else {
> +		cputime_t total = stime + utime;
> +
>  		stime = scale_stime((__force u64)stime,
>  				    (__force u64)rtime, (__force u64)total);
>  		utime = rtime - stime;
> -	} else {
> -		stime = rtime;
> -		utime = 0;
>  	}
>  
>  	/*
>
Re: [sched next] overflowed cpu time for kernel threads in /proc/PID/stat
On (09/03/13 10:43), Stanislaw Gruszka wrote:
> > > > Thanks a lot Sergey for testing this further!
> > > >
> > > > Interesting results, so rtime is always one or two units off stime after scaling.
> > > > Stanislaw made the scaling code with Linus and he has a better idea on the math guts
> > > > here.
> > >
> > > I don't think this is a scaling issue, but rather that at scale_stime()
> > > input, stime is already bigger than rtime. Sergey, could you verify that
> > > by adding a check before scale_stime()?
> > >
> >
> > usually stime < rtime.
> > this is what scale_stime() gets as input:
> >
> > [ 1291.409566] stime:3790580815 rtime:4344293130 total:3790580815
>
> Ok, I see now: utime is 0. This seems to be a problem with dynamic ticks:
> as you said, your workload is kernel compilation, so we should be utilizing
> a lot of CPU time in user space.
>
> Anyway, we should handle the utime == 0 situation in the scaling code. We
> work well when rtime and stime are not big (variables and results fit in
> 32 bits); otherwise we hit that stime-bigger-than-rtime problem. Let's try
> to handle the problem with the patch below. Sergey, does it work for you?

works fine on -next.

-ss

> diff --git a/kernel/sched/cputime.c b/kernel/sched/cputime.c
> index a7959e0..25cc35d 100644
> --- a/kernel/sched/cputime.c
> +++ b/kernel/sched/cputime.c
> @@ -557,7 +557,7 @@ static void cputime_adjust(struct task_cputime *curr,
>  			   struct cputime *prev,
>  			   cputime_t *ut, cputime_t *st)
>  {
> -	cputime_t rtime, stime, utime, total;
> +	cputime_t rtime, stime, utime;
>  
>  	if (vtime_accounting_enabled()) {
>  		*ut = curr->utime;
> @@ -565,9 +565,6 @@ static void cputime_adjust(struct task_cputime *curr,
>  		return;
>  	}
>  
> -	stime = curr->stime;
> -	total = stime + curr->utime;
> -
>  	/*
>  	 * Tick based cputime accounting depend on random scheduling
>  	 * timeslices of a task to be interrupted or not by the timer.
> @@ -588,13 +585,19 @@ static void cputime_adjust(struct task_cputime *curr,
>  	if (prev->stime + prev->utime >= rtime)
>  		goto out;
>  
> -	if (total) {
> +	stime = curr->stime;
> +	utime = curr->utime;
> +
> +	if (utime == 0) {
> +		stime = rtime;
> +	} else if (stime == 0) {
> +		utime = rtime;
> +	} else {
> +		cputime_t total = stime + utime;
> +
>  		stime = scale_stime((__force u64)stime,
>  				    (__force u64)rtime, (__force u64)total);
>  		utime = rtime - stime;
> -	} else {
> -		stime = rtime;
> -		utime = 0;
>  	}
>  
>  	/*
>
Re: [sched next] overflowed cpu time for kernel threads in /proc/PID/stat
On (09/03/13 10:43), Stanislaw Gruszka wrote:
> Ok, I see now: utime is 0. This seems to be a problem with dynamic ticks:
> as you said, your workload is kernel compilation, so we should be utilizing
> a lot of CPU time in user space.
>
> Anyway, we should handle the utime == 0 situation in the scaling code. We
> work well when rtime and stime are not big (variables and results fit in
> 32 bits); otherwise we hit that stime-bigger-than-rtime problem. Let's try
> to handle the problem with the patch below. Sergey, does it work for you?

checked on -current and -next -- works fine, good job.

here are my:
Reported-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Tested-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>

-ss

> diff --git a/kernel/sched/cputime.c b/kernel/sched/cputime.c
> index a7959e0..25cc35d 100644
> --- a/kernel/sched/cputime.c
> +++ b/kernel/sched/cputime.c
> @@ -557,7 +557,7 @@ static void cputime_adjust(struct task_cputime *curr,
>  			   struct cputime *prev,
>  			   cputime_t *ut, cputime_t *st)
>  {
> -	cputime_t rtime, stime, utime, total;
> +	cputime_t rtime, stime, utime;
>  
>  	if (vtime_accounting_enabled()) {
>  		*ut = curr->utime;
> @@ -565,9 +565,6 @@ static void cputime_adjust(struct task_cputime *curr,
>  		return;
>  	}
>  
> -	stime = curr->stime;
> -	total = stime + curr->utime;
> -
>  	/*
>  	 * Tick based cputime accounting depend on random scheduling
>  	 * timeslices of a task to be interrupted or not by the timer.
> @@ -588,13 +585,19 @@ static void cputime_adjust(struct task_cputime *curr,
>  	if (prev->stime + prev->utime >= rtime)
>  		goto out;
>  
> -	if (total) {
> +	stime = curr->stime;
> +	utime = curr->utime;
> +
> +	if (utime == 0) {
> +		stime = rtime;
> +	} else if (stime == 0) {
> +		utime = rtime;
> +	} else {
> +		cputime_t total = stime + utime;
> +
>  		stime = scale_stime((__force u64)stime,
>  				    (__force u64)rtime, (__force u64)total);
>  		utime = rtime - stime;
> -	} else {
> -		stime = rtime;
> -		utime = 0;
>  	}
>  
>  	/*
>
Re: [sched next] overflowed cpu time for kernel threads in /proc/PID/stat
* Sergey Senozhatsky <sergey.senozhatsky@gmail.com> wrote:

> On (09/03/13 10:43), Stanislaw Gruszka wrote:
> > Ok, I see now, utime is 0 . This seems to be problem with dynamic ticks
> > as you told that your application is kernel compilation, so we utilize
> > lot of cpu time in user-space.
> >
> > Anyway we should handle utime == 0 situation on scaling code. We work
> > well when rtime & stime are not big (variables and results fit in
> > 32 bit), otherwise we have that stime bigger than rtime problem. Let's
> > try to handle the problem by below patch. Sergey, does it work for you ?
>
> checked on -current and -next -- works fine, good job.
>
> here are my:
> Reported-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
> Tested-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>

Cool and thanks for the patient reporting and testing!

Stanislaw, mind sending a changelogged, signed off patch with Sergey's
tags included? It also warrants a -stable backport tag I think.

Thanks,

Ingo
Re: [sched next] overflowed cpu time for kernel threads in /proc/PID/stat
On Sat, Aug 31, 2013 at 10:01:17PM +0300, Sergey Senozhatsky wrote:
> On (08/31/13 01:04), Frederic Weisbecker wrote:
> > But stime should always be below rtime due to the calculation done by scale_stime()
> > which roughly sums up to:
> >
> > stime = (stime / (stime + utime)) * rtime
> >
> > So this shouldn't happen.
> >
> > I couldn't manage to reproduce it though. Can you still trigger it with latest -next?
> >
> > Thanks.
>
> Hello,
> I can't pull current -next (away from good network connection for a
> couple of days). What I have is 3.11.0-rc7-next-20130829:
>
> ps aux | grep rcu
> root 8 0.0 0.0 0 0 ? S Aug30 0:00 [rcuc/0]
> root 9 0.0 0.0 0 0 ? S Aug30 0:00 [rcub/0]
> root 10 21376376 0.0 0 0 ? S Aug30 21114581:36 [rcu_preempt]
> root 11 21376376 0.0 0 0 ? S Aug30 21114581:35 [rcuop/0]
> root 12 21376376 0.0 0 0 ? S Aug30 21114581:35 [rcuop/1]
>
> cat /proc/10/stat
> 10 (rcu_preempt) S 2 0 0 0 -1 2129984 0 0 0 0 1844674407370 715 0 0 20 0 1 0 10 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 0 0 0 0 0 0 0 0 0 0 0 0 0 0
>
> -ss

Do you have this patch in?

http://git.kernel.org/cgit/linux/kernel/git/tip/tip.git/commit/?h=timers-urgent-for-linus&id=5a8e01f8fa51f5cbce8f37acc050eb2319d12956
Re: [sched next] overflowed cpu time for kernel threads in /proc/PID/stat
On Sat, Aug 31, 2013 at 10:01:17PM +0300, Sergey Senozhatsky wrote:
> On (08/31/13 01:04), Frederic Weisbecker wrote:
> > But stime should always be below rtime due to the calculation done by scale_stime()
> > which roughly sums up to:
> >
> > stime = (stime / (stime + utime)) * rtime
> >
> > So this shouldn't happen.
> >
> > I couldn't manage to reproduce it though. Can you still trigger it with latest -next?
> >
> > Thanks.
>
> Hello,
> I can't pull current -next (away from good network connection for a
> couple of days). What I have is 3.11.0-rc7-next-20130829:
>
> ps aux | grep rcu
> root 8 0.0 0.0 0 0 ? S Aug30 0:00 [rcuc/0]
> root 9 0.0 0.0 0 0 ? S Aug30 0:00 [rcub/0]
> root 10 21376376 0.0 0 0 ? S Aug30 21114581:36 [rcu_preempt]
> root 11 21376376 0.0 0 0 ? S Aug30 21114581:35 [rcuop/0]
> root 12 21376376 0.0 0 0 ? S Aug30 21114581:35 [rcuop/1]
>
> cat /proc/10/stat
> 10 (rcu_preempt) S 2 0 0 0 -1 2129984 0 0 0 0 1844674407370 715 0 0 20 0 1 0 10 0 0 18446744073709551615 0 0 0 0 0 0 0 2147483647 0 18446744073709551615 0 0 17 0 0 0 0 0 0 0 0 0 0 0 0 0 0
>
> -ss

Sorry, I did not realize this message was posted before you tested the patch.