Mailing List Archive

Re: 2.6.16-rc5-mm1
Andrew wrote:
> If you have (even more) time you could test
> http://www.zip.com.au/~akpm/linux/patches/stuff/2.6.16-rc5-mm2-pre1.gz.
> That's the latest of everything with the problematic sysfs patches reverted
> and Eric's recent /proc fixes.

I tested both the fuser command that had been crashing the earlier
kernels with Eric's /proc patches, and my "SGI internal software
management" application, on which I first saw these kernels fail.

With this 2.6.16-rc5-mm2-pre1 kernel, they both work - no more crash.

Good.

--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <pj@sgi.com> 1.925.600.0401
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: 2.6.16-rc5-mm1
"J.A. Magallon" <jamagallon@able.es> wrote:
>
> On Wed, 1 Mar 2006 02:32:35 -0800, Andrew Morton <akpm@osdl.org> wrote:
>
> >
> > Useful, thanks. So the second batch of /proc patches are indeed the problem.
> >
> > If you have (even more) time you could test
> > http://www.zip.com.au/~akpm/linux/patches/stuff/2.6.16-rc5-mm2-pre1.gz.
> > That's the latest of everything with the problematic sysfs patches reverted
> > and Eric's recent /proc fixes.
> >
>
> I just tried rc5-mm1 and this. With this I can run java apps/applets again
> without locking my system.
>
> I also applied the patch you posted for inotify, but now I get this new one:
>
> Mar 1 15:11:04 werewolf kernel: [ 1424.891482] BUG: warning at fs/inotify.c:410/set_dentry_child_flags()

Which patch was that? The first one was doubly broken.

This is closer.

diff -puN fs/dcache.c~inotify-lock-avoidance-with-parent-watch-status-in-dentry-fix fs/dcache.c
--- devel/fs/dcache.c~inotify-lock-avoidance-with-parent-watch-status-in-dentry-fix 2006-03-01 12:16:22.000000000 -0800
+++ devel-akpm/fs/dcache.c 2006-03-01 12:18:34.000000000 -0800
@@ -100,6 +100,7 @@ static void dentry_iput(struct dentry *
 	if (inode) {
 		dentry->d_inode = NULL;
 		list_del_init(&dentry->d_alias);
+		dentry->d_flags &= ~DCACHE_INOTIFY_PARENT_WATCHED;
 		spin_unlock(&dentry->d_lock);
 		spin_unlock(&dcache_lock);
 		if (!inode->i_nlink)
_


Re: 2.6.16-rc5-mm1
Andrew Morton wrote:

> Maybe you're not running applications which install inotify watches. This
> is apparently triggerable by doing `touch foo;rm foo;touch foo' in a watched
> directory.
>
> Nick, isn't it simply a matter of..
>

Sorry, I forgot about that. With that patch, d_instantiate can still get some
warnings. The following one should fix that and also makes the watch
creation/deletion loop over all dentrys ignore negative dentrys too.

I'm not sure what the cleanest way to do this is. I'm fairly sure the vfs
guys do not want a DCACHE_INOTIFY_ flag in fs/dcache.c, so I've added a
comment there.

I also don't know whether or not the inotify guys want parent events on
negative entries. They don't appear to now, so I'll take that as a no.

After the DCACHE_INOTIFY_PARENT_WATCHED debugging stuff is taken out, that
flag will basically be undefined for negative dentrys rather than always
clear (after this patch). I'm not sure whether the vfs people consider that
to be unclean.

--
SUSE Labs, Novell Inc.
Re: 2.6.16-rc5-mm1 -- strange load balancing problems
I'm seeing some strange load balancing problems with this kernel. I
don't think that they're due to the smpnice patches as I've applied them
on a standard 2.6.15-rc5 kernel and the problem doesn't happen there.

The problem is (as I say) quite strange and (for me) very reproducible.
I have two programs (aspin and gsmiley) which I use to produce CPU
hard spinners for testing purposes. What I'm finding is that when I
start several copies of aspin load balancing goes as expected but when I
launch several copies of gsmiley they all go to the one CPU and stick
there like glue. (The most obvious difference between the two programs
is that aspin is just a command line tool while gsmiley is an X windows
program that spins a smiley face and reports its own assessment of the
percentage of CPU it's getting.) The machine on which I've seen this
problem is a hyper threading Pentium 4 and I suspect that it may be due
to the SCHED_MC changes which overlap SCHED_SMT a bit.

I'm trying to test this on a non hyper threading machine but the machine
has crashed (different kernel) while doing the build. I'll resume this
effort tomorrow but I thought that I should report the problem so that
others could comment.

Peter
PS SCHED_MC was configured in but I'll try it without tomorrow and
report the results.
--
Peter Williams pwil3058@bigpond.net.au

"Learning, n. The kind of ignorance distinguishing the studious."
-- Ambrose Bierce
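
For readers who want to try something similar, here is a rough sketch of a
CPU hard spinner that also reports which CPU it last ran on. It is not
Peter's aspin or gsmiley, whose source is not part of this thread; the
function names (last_cpu, spin_and_report) are hypothetical, and it assumes
a Linux /proc where field 39 of /proc/self/stat is the CPU number last
executed on.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <time.h>

/*
 * Return the CPU this process last ran on, taken from field 39
 * ("processor") of /proc/self/stat, or -1 if it cannot be read.
 */
int last_cpu(void)
{
	char buf[1024];
	char *p;
	size_t n;
	int field, cpu = -1;
	FILE *f = fopen("/proc/self/stat", "r");

	if (!f)
		return -1;
	n = fread(buf, 1, sizeof(buf) - 1, f);
	fclose(f);
	buf[n] = '\0';

	/* Field 2 (comm) may contain spaces, so restart after the last ')'. */
	p = strrchr(buf, ')');
	if (!p)
		return -1;
	p++;

	/* Each pass skips one space; after the loop p is at field 39. */
	for (field = 3; field <= 39; field++) {
		p = strchr(p, ' ');
		if (!p)
			return -1;
		p++;
	}
	if (sscanf(p, "%d", &cpu) != 1)
		return -1;
	return cpu;
}

/*
 * Burn CPU for roughly `seconds` of process CPU time, then report
 * the CPU the process was last scheduled on.
 */
int spin_and_report(double seconds)
{
	volatile unsigned long counter = 0;
	clock_t end;

	assert(seconds >= 0.0);
	end = clock() + (clock_t)(seconds * CLOCKS_PER_SEC);
	while (clock() < end)
		counter++;	/* busy work; volatile keeps the loop alive */
	return last_cpu();
}
```

Starting several copies from a small main() that prints spin_and_report()
in a loop would show whether they spread across CPUs or all report the same
one, which is the symptom described above.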
Re: 2.6.16-rc5-mm1
On 3/2/06, Jesper Juhl <jesper.juhl@gmail.com> wrote:
> On 3/1/06, Andrew Morton <akpm@osdl.org> wrote:
> >
> > Could people please test a couple more patchsets, see if we can isolate it?
> >
> > http://www.zip.com.au/~akpm/linux/patches/stuff/2.6.16-rc5-mm1.1.gz
> >
>
> Haven't had time to test this one yet, and won't have time until tomorrow :(
>

I just tested this kernel and it builds and runs just fine. Can't
crash it with my eclipse test case.

>
> > and http://www.zip.com.au/~akpm/linux/patches/stuff/2.6.16-rc5-mm1.2.gz is
>
> I've tested this one and I can't crash it with eclipse like I could
> plain old 2.6.16-rc5-mm1
>

With all the recent patches and proposed patches and discussions about
various approaches to fix this I've lost track.

What kernel with what patches applied and/or reverted would it make
the most sense for me to test now, in order to provide the most useful
testing?


--
Jesper Juhl <jesper.juhl@gmail.com>
Don't top-post http://www.catb.org/~esr/jargon/html/T/top-post.html
Plain text mails only, please http://www.expita.com/nomime.html
Re: 2.6.16-rc5-mm1
On Wed, 1 Mar 2006 20:51:38 -0800, Andrew Morton <akpm@osdl.org> wrote:

> "J.A. Magallon" <jamagallon@able.es> wrote:
> >
> > On Wed, 1 Mar 2006 02:32:35 -0800, Andrew Morton <akpm@osdl.org> wrote:
> >
> > >
> > > Useful, thanks. So the second batch of /proc patches are indeed the problem.
> > >
> > > If you have (even more) time you could test
> > > http://www.zip.com.au/~akpm/linux/patches/stuff/2.6.16-rc5-mm2-pre1.gz.
> > > That's the latest of everything with the problematic sysfs patches reverted
> > > and Eric's recent /proc fixes.
> > >
> >
> > I just tried rc5-mm1 and this. With this I can run java apps/applets again
> > without locking my system.
> >
> > I also applied the patch you posted for inotify, but now I get this new one:
> >
> > Mar 1 15:11:04 werewolf kernel: [ 1424.891482] BUG: warning at fs/inotify.c:410/set_dentry_child_flags()
>
> Which patch was that? The first one was doubly broken.
>
> This is closer.
>
> diff -puN fs/dcache.c~inotify-lock-avoidance-with-parent-watch-status-in-dentry-fix fs/dcache.c
> --- devel/fs/dcache.c~inotify-lock-avoidance-with-parent-watch-status-in-dentry-fix 2006-03-01 12:16:22.000000000 -0800
> +++ devel-akpm/fs/dcache.c 2006-03-01 12:18:34.000000000 -0800
> @@ -100,6 +100,7 @@ static void dentry_iput(struct dentry *
> if (inode) {
> dentry->d_inode = NULL;
> list_del_init(&dentry->d_alias);
> + dentry->d_flags &= ~DCACHE_INOTIFY_PARENT_WATCHED;
> spin_unlock(&dentry->d_lock);
> spin_unlock(&dcache_lock);
> if (!inode->i_nlink)
> _
>
>

What I have collected till now is below (against -mm2-pre1). What is
(not) needed from this? Thanks...

--- devel/fs/inotify.c~a 2006-03-01 02:47:01.000000000 -0800
+++ devel-akpm/fs/inotify.c 2006-03-01 02:47:06.000000000 -0800
@@ -390,6 +390,7 @@ static inline int inotify_inode_watched(
 
 /*
  * Get child dentry flag into synch with parent inode.
+ * Flag should always be clear for negative dentrys.
  */
 static void set_dentry_child_flags(struct inode *inode, int watched)
 {
@@ -400,14 +401,14 @@ static void set_dentry_child_flags(struc
 		struct dentry *child;
 
 		list_for_each_entry(child, &alias->d_subdirs, d_u.d_child) {
+			if (!child->d_inode) {
+				WARN_ON(child->d_flags & DCACHE_INOTIFY_PARENT_WATCHED);
+				continue;
+			}
 			spin_lock(&child->d_lock);
 			if (watched) {
-				WARN_ON(child->d_flags &
-						DCACHE_INOTIFY_PARENT_WATCHED);
 				child->d_flags |= DCACHE_INOTIFY_PARENT_WATCHED;
 			} else {
-				WARN_ON(!(child->d_flags &
-					DCACHE_INOTIFY_PARENT_WATCHED));
 				child->d_flags&=~DCACHE_INOTIFY_PARENT_WATCHED;
 			}
 			spin_unlock(&child->d_lock);
@@ -530,7 +530,6 @@ void inotify_d_instantiate(struct dentry
 	if (!inode)
 		return;
 
-	WARN_ON(entry->d_flags & DCACHE_INOTIFY_PARENT_WATCHED);
 	spin_lock(&entry->d_lock);
 	parent = entry->d_parent;
 	if (inotify_inode_watched(parent->d_inode))
--- linux-2.6.orig/fs/dcache.c
+++ linux-2.6/fs/dcache.c
@@ -100,6 +100,7 @@ static void dentry_iput(struct dentry *
 	if (inode) {
 		dentry->d_inode = NULL;
 		list_del_init(&dentry->d_alias);
+		dentry->d_flags &= ~DCACHE_INOTIFY_PARENT_WATCHED;
 		spin_unlock(&dentry->d_lock);
 		spin_unlock(&dcache_lock);
 		if (!inode->i_nlink)
@@ -1203,6 +1204,9 @@ void d_delete(struct dentry * dentry)
 	spin_lock(&dentry->d_lock);
 	isdir = S_ISDIR(dentry->d_inode->i_mode);
 	if (atomic_read(&dentry->d_count) == 1) {
+		/* remove this and other inotify debug checks after 2.6.18 */
+		dentry->d_flags &= ~DCACHE_INOTIFY_PARENT_WATCHED;
+
 		dentry_iput(dentry);
 		fsnotify_nameremove(dentry, isdir);
 		return;



--
J.A. Magallon <jamagallon()able!es> \ Software is like sex:
werewolf!able!es \ It's better when it's free
Mandriva Linux release 2006.1 (Cooker) for i586
Linux 2.6.15-jam14 (gcc 4.0.3 (4.0.3-0.20060215.2mdk for Mandriva Linux release 2006.1))
Re: 2.6.16-rc5-mm1 -- strange load balancing problems
Peter Williams wrote:
> I'm seeing some strange load balancing problems with this kernel. I
> don't think that they're due to the smpnice patches as I've applied them
> on a standard 2.6.15-rc5 kernel and the problem doesn't happen there.
>
> The problem is (as I say) quite strange and (for me) very reproducible.
> I have two programs (aspin and gsmiley) which I use to produce CPU hard
> spinners for testing purposes. What I'm finding is that when I start
> several copies of aspin load balancing goes as expected but when I
> launch several copies of gsmiley they all go to the one CPU and stick
> there like glue. (The most obvious difference between the two programs
> is that aspin is just a command line tool while gsmiley is an X windows
> program that spins a smiley face and reports its own assessment of the
> percentage of CPU it's getting.) The machine on which I've seen this
> problem is a hyper threading Pentium 4 and I suspect that it may be due
> to the SCHED_MC changes which overlap SCHED_SMT a bit.
>
> I'm trying to test this on a non hyper threading machine but the machine
> has crashed (different kernel) while doing the build. I'll resume this
> effort tomorrow but I thought that I should report the problem so that
> others could comment.
>
> Peter
> PS SCHED_MC was configured in but I'll try it without tomorrow and
> report the results.

Configuring SCHED_MC to "no" causes this problem to go away.

Peter
--
Peter Williams pwil3058@bigpond.net.au

"Learning, n. The kind of ignorance distinguishing the studious."
-- Ambrose Bierce
Re: 2.6.16-rc5-mm1
"J.A. Magallon" <jamagallon@able.es> wrote:
>
> What I have collected till now is below (against -mm2-pre1). What is
> (not) needed from this? Thanks...

You should need only the below:


--- devel/fs/dcache.c~inotify-lock-avoidance-with-parent-watch-status-in-dentry-fix-2 2006-03-01 21:41:16.000000000 -0800
+++ devel-akpm/fs/dcache.c 2006-03-01 21:41:16.000000000 -0800
@@ -1177,6 +1177,9 @@ void d_delete(struct dentry * dentry)
 	spin_lock(&dentry->d_lock);
 	isdir = S_ISDIR(dentry->d_inode->i_mode);
 	if (atomic_read(&dentry->d_count) == 1) {
+		/* remove this and other inotify debug checks after 2.6.18 */
+		dentry->d_flags &= ~DCACHE_INOTIFY_PARENT_WATCHED;
+
 		dentry_iput(dentry);
 		fsnotify_nameremove(dentry, isdir);
 		return;
diff -puN fs/inotify.c~inotify-lock-avoidance-with-parent-watch-status-in-dentry-fix-2 fs/inotify.c
--- devel/fs/inotify.c~inotify-lock-avoidance-with-parent-watch-status-in-dentry-fix-2 2006-03-01 21:41:16.000000000 -0800
+++ devel-akpm/fs/inotify.c 2006-03-01 21:41:16.000000000 -0800
@@ -390,6 +390,7 @@ static inline int inotify_inode_watched(
 
 /*
  * Get child dentry flag into synch with parent inode.
+ * Flag should always be clear for negative dentrys.
  */
 static void set_dentry_child_flags(struct inode *inode, int watched)
 {
@@ -400,6 +401,10 @@ static void set_dentry_child_flags(struc
 		struct dentry *child;
 
 		list_for_each_entry(child, &alias->d_subdirs, d_u.d_child) {
+			if (!child->d_inode) {
+				WARN_ON(child->d_flags & DCACHE_INOTIFY_PARENT_WATCHED);
+				continue;
+			}
 			spin_lock(&child->d_lock);
 			if (watched) {
 				WARN_ON(child->d_flags &
_


Anyway, I think things are pretty much sorted out now so I'll try to do mm2
today (approx 12 hours hence).
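
The rule this patch enforces ("Flag should always be clear for negative
dentrys") can be illustrated with a small userspace model. This is a sketch
only: struct toy_dentry, TOY_PARENT_WATCHED and the toy_* functions are
hypothetical stand-ins, not kernel code, and the model leaves out the
dcache_lock and d_lock locking that the real functions rely on.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for DCACHE_INOTIFY_PARENT_WATCHED. */
#define TOY_PARENT_WATCHED 0x1u

/* Hypothetical stand-in for struct dentry; not the real layout. */
struct toy_dentry {
	int has_inode;		/* 0 models a negative dentry (d_inode == NULL) */
	unsigned int flags;
};

/* Models the d_delete() hunk: clear the flag as the dentry goes negative. */
void toy_make_negative(struct toy_dentry *d)
{
	d->flags &= ~TOY_PARENT_WATCHED;
	d->has_inode = 0;
}

/* Models d_instantiate: a dentry gaining an inode picks up the parent's state. */
void toy_instantiate(struct toy_dentry *d, int parent_watched)
{
	d->has_inode = 1;
	if (parent_watched)
		d->flags |= TOY_PARENT_WATCHED;
}

/*
 * Models the patched set_dentry_child_flags(): negative children are
 * skipped (and must already have the flag clear), positive children are
 * brought in sync with the parent. Returns how many children were updated.
 */
size_t toy_set_child_flags(struct toy_dentry *children, size_t n, int watched)
{
	size_t i, updated = 0;

	for (i = 0; i < n; i++) {
		struct toy_dentry *child = &children[i];

		if (!child->has_inode) {
			/* Plays the role of the WARN_ON() in the patch. */
			assert(!(child->flags & TOY_PARENT_WATCHED));
			continue;
		}
		if (watched)
			child->flags |= TOY_PARENT_WATCHED;
		else
			child->flags &= ~TOY_PARENT_WATCHED;
		updated++;
	}
	return updated;
}
```

In this model, a child that is deleted and recreated in a watched directory
goes through toy_make_negative() and then toy_instantiate(), so the loop
never meets a negative child with a stale flag.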
Re: 2.6.16-rc5-mm1
"Jesper Juhl" <jesper.juhl@gmail.com> writes:

> On 3/2/06, Jesper Juhl <jesper.juhl@gmail.com> wrote:
>> On 3/1/06, Andrew Morton <akpm@osdl.org> wrote:
>> >
>> > Could people please test a couple more patchsets, see if we can isolate it?
>> >
>> > http://www.zip.com.au/~akpm/linux/patches/stuff/2.6.16-rc5-mm1.1.gz
>> >
>>
>> Haven't had time to test this one yet, and won't have time until tomorrow :(
>>
>
> I just tested this kernel and it builds and runs just fine. Can't
> crash it with my eclipse test case.
>
>>
>> > and http://www.zip.com.au/~akpm/linux/patches/stuff/2.6.16-rc5-mm1.2.gz is
>>
>> I've tested this one and I can't crash it with eclipse like I could
>> plain old 2.6.16-rc5-mm1
>>
>
> With all the recent patches and proposed patches and discussions about
> various approaches to fix this I've lost track.
>
> What kernel with what patches applied and/or reverted would it make
> the most sense for me to test now, in order to provide the most useful
> testing?

So it looks like we have this tracked and fixed. Andrew included my
fix in:
http://www.zip.com.au/~akpm/linux/patches/stuff/2.6.16-rc5-mm2-pre1.gz

So just confirming that the fix actually worked would probably be
the biggest help.

The problem should be fixed unless there is something else that
triggers the horrible and mysterious kernel death. So you are
getting the same results as everyone else.


Eric
Re: 2.6.16-rc5-mm1
On 3/2/06, Eric W. Biederman <ebiederm@xmission.com> wrote:
> "Jesper Juhl" <jesper.juhl@gmail.com> writes:
>
> > On 3/2/06, Jesper Juhl <jesper.juhl@gmail.com> wrote:
> >> On 3/1/06, Andrew Morton <akpm@osdl.org> wrote:
> >> >
> >> > Could people please test a couple more patchsets, see if we can isolate it?
> >> >
> >> > http://www.zip.com.au/~akpm/linux/patches/stuff/2.6.16-rc5-mm1.1.gz
> >> >
> >>
> >> Haven't had time to test this one yet, and won't have time until tomorrow :(
> >>
> >
> > I just tested this kernel and it builds and runs just fine. Can't
> > crash it with my eclipse test case.
> >
> >>
> >> > and http://www.zip.com.au/~akpm/linux/patches/stuff/2.6.16-rc5-mm1.2.gz is
> >>
> >> I've tested this one and I can't crash it with eclipse like I could
> >> plain old 2.6.16-rc5-mm1
> >>
> >
> > With all the recent patches and proposed patches and discussions about
> > various approaches to fix this I've lost track.
> >
> > What kernel with what patches applied and/or reverted would it make
> > the most sense for me to test now, in order to provide the most useful
> > testing?
>
> So it looks like we have this tracked and fixed. Andrew included my
> fix in:
> http://www.zip.com.au/~akpm/linux/patches/stuff/2.6.16-rc5-mm2-pre1.gz
>
> So just confirming that the fix actually worked would probably be
> the biggest help.
>
> The problem should be fixed unless there is something else that
> triggers the horrible and mysterious kernel death. So you are
> getting the same results as everyone else.
>

2.6.16-rc5-mm2 works fine for me. Can't crash it with eclipse.

--
Jesper Juhl <jesper.juhl@gmail.com>
Don't top-post http://www.catb.org/~esr/jargon/html/T/top-post.html
Plain text mails only, please http://www.expita.com/nomime.html
Re: 2.6.16-rc5-mm1 -- strange load balancing problems
Peter Williams wrote:
> Peter Williams wrote:
>
>> I'm seeing some strange load balancing problems with this kernel. I
>> don't think that they're due to the smpnice patches as I've applied
>> them on a standard 2.6.15-rc5 kernel and the problem doesn't happen
>> there.
>>
>> The problem is (as I say) quite strange and (for me) very
>> reproducible. I have two programs (aspin and gsmiley) which I use to
>> produce CPU hard spinners for testing purposes. What I'm finding is
>> that when I start several copies of aspin load balancing goes as
>> expected but when I launch several copies of gsmiley they all go to
>> the one CPU and stick there like glue. (The most obvious difference
>> between the two programs is that aspin is just a command line tool
>> while gsmiley is an X windows program that spins a smiley face and
>> reports its own assessment of the percentage of CPU it's getting.)
>> The machine on which I've seen this problem is a hyper threading Pentium 4
>> and I suspect that it may be due to the SCHED_MC changes which overlap
>> SCHED_SMT a bit.
>>
>> I'm trying to test this on a non hyper threading machine but the
>> machine has crashed (different kernel) while doing the build. I'll
>> resume this effort tomorrow but I thought that I should report the
>> problem so that others could comment.
>>
>> Peter
>> PS SCHED_MC was configured in but I'll try it without tomorrow and
>> report the results.
>
>
> Configuring SCHED_MC to "no" causes this problem to go away.

This problem does not appear to be present in 2.6.16-rc6-mm1.

Peter
--
Peter Williams pwil3058@bigpond.net.au

"Learning, n. The kind of ignorance distinguishing the studious."
-- Ambrose Bierce