Mailing List Archive

NFS run amok
Greetings,

I have an F720 running Data ONTAP 6.1.1R2 that is displaying an extremely
weird situation. 'sysstat 2' reports:


 CPU   NFS  CIFS  HTTP    Net kB/s   Disk kB/s   Tape kB/s Cache
                           in  out  read write  read write   age
 16%     0     0     0  12300    1     0     0     0     0   >60
 16%     0     0     0  12301    1     0     0     0     0   >60
 18%     0     0     0  12302    1    88   110     0     0   >60
 16%     0     0     0  12301    0     0     0     0     0   >60
 16%     0     0     0  12301    0     0     0     0     0   >60
 16%     0     0     0  12300    0     0     0     0     0   >60
 16%     0     0     0  12305    0     0     0     0     0   >60
 18%     0     0     0  12300    0    82   104     0     0   >60
 17%     0     0     0  12300    0     0     0     0     0   >60
 17%     0     0     0  12299    0     0     0     0     0   >60
 17%     2     0     0  12301    1     0     0     0     0   >60
 16%     0     0     0  12301    0     0     0     0     0   >60
 16%     0     0     0  12300    0     0     0     0     0   >60
 18%     0     0     0  12300    1    88   110     0     0   >60
 16%     0     0     0  12301    0     0     0     0     0   >60
 16%     0     0     0  12301    0     0     0     0     0   >60
 16%     0     0     0  12300    0     0     0     0     0   >60
 17%     0     0     0  12301    0     0     0     0     0   >60
 18%     0     0     0  12301    0    82   104     0     0   >60

Running pktt on the NetApp reveals LOTS of UDP NFS traffic to the filer
but no responses from the filer. The client is a Red Hat Linux box.
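
The pattern in the sysstat output (roughly 12 MB/s arriving on the network while almost no NFS operations are served) can be spotted mechanically. Below is a minimal Python sketch, assuming the 11-column layout shown in this post. The function names and both thresholds are illustrative assumptions, not part of any NetApp tool.

```python
FIELDS = ["cpu", "nfs", "cifs", "http", "net_in", "net_out",
          "disk_read", "disk_write", "tape_read", "tape_write", "cache_age"]


def parse_sysstat(text):
    """Parse 11-column sysstat data rows into dicts; header lines are skipped."""
    rows = []
    for line in text.splitlines():
        parts = line.split()
        # Data rows start with a CPU percentage such as "16%".
        if len(parts) != len(FIELDS) or not parts[0].endswith("%"):
            continue
        row = dict(zip(FIELDS, parts))
        # Convert the numeric counters; "cpu" and "cache_age" stay as strings.
        for key in FIELDS[1:10]:
            row[key] = int(row[key])
        rows.append(row)
    return rows


def suspicious(rows, min_net_in=10000, max_ops=5):
    """Return rows with heavy inbound traffic but almost no NFS ops served."""
    return [r for r in rows if r["net_in"] >= min_net_in and r["nfs"] <= max_ops]


# Three rows copied from the output in this post.
SAMPLE = """\
16% 0 0 0 12300 1 0 0 0 0 >60
18% 0 0 0 12302 1 88 110 0 0 >60
17% 2 0 0 12301 1 0 0 0 0 >60
"""

rows = parse_sysstat(SAMPLE)
flagged = suspicious(rows)
print(f"{len(flagged)} of {len(rows)} intervals look suspicious")
```

Every interval in the post would be flagged by these thresholds, which matches the pktt observation of inbound traffic with no replies.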

We are stumped on this at this point. Any advice?

--[Lance]

--
Lance A. Brown
SysAdmin Task
LMIT ITSS Contract for
National Institute of Environmental Health Sciences
919.361.5444x420
Re: NFS run amok
RH 7.3 will exhibit this behavior (and lock all clients out of the Netapp)
until you update the kernel. I recommend you run "up2date" to use RedHat
Network and update the 7.3 kernel.

/Brian/

--
Brian Long                      |         |           |
Americas IT Hosting Sys Admin   |       .|||.       .|||.
Phone: (919) 392-7363           |   ..:|||||||:...:|||||||:..
Pager: (888) 651-2015           |   C i s c o   S y s t e m s
Re: NFS run amok
I'm guessing it's RedHat 7.3 w/ stock kernel. If so, you need to upgrade
the kernel RPM. See http://rhn.redhat.com/errata/RHBA-2002-110.html.

mjc

On Mon, 2002-09-16 at 13:52, Lance A. Brown wrote:
> Greetings,
>
> I have an F720 with 6.1.1R2 running on it displaying an extremely weird
> situation. 'sysstat 2' reports:
>
> [...]
>
> Running pktt on the netapp reveals LOTS of UDP nfs traffic to the filer
> but no responses from the filer. The client is a Red Hat linux box.
>
> We are stumped on this at this point. Any advice?
>
> --[Lance]
>
> --
> Lance A. Brown
> SysAdmin Task
> LMIT ITSS Contract for
> National Institute of Environmental Health Sciences
> 919.361.5444x420
>
--
Michael J. Carter              | Confess your sins to the Lord and you
IT Team Leader                 | will be forgiven; confess them to man and
Space Data Systems (NIS-3)     | you will be laughed at. -- Josh Billings
Los Alamos National Laboratory |
Re: NFS run amok
On 16 Sep 2002, Lance A. Brown wrote:

> I have an F720 with 6.1.1R2 running on it displaying an extremely weird
> situation. 'sysstat 2' reports:
>
>
>  CPU   NFS  CIFS  HTTP    Net kB/s   Disk kB/s   Tape kB/s Cache
>                            in  out  read write  read write   age
>  16%     0     0     0  12300    1     0     0     0     0   >60
>  16%     0     0     0  12301    1     0     0     0     0   >60

...

> Running pktt on the netapp reveals LOTS of UDP nfs traffic to the
> filer but no responses from the filer. The client is a Red Hat
> linux box.
>
> We are stumped on this at this point. Any advice?

I'll wager the client is running the 2.4.18-3 kernel that shipped with
Red Hat 7.3. You can either upgrade to 2.4.18-10 (or even -5) or
manually set the rsize and wsize mount settings to something
reasonable like 8192.
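
To find clients that still need the rsize/wsize workaround, one could scan a mount table for NFS mounts that are not using TCP and do not cap the transfer size. This is a rough Python sketch, assuming input in the usual "device mountpoint fstype options dump pass" layout of /etc/fstab or /proc/mounts. The function name, the host name "filer" and the paths are hypothetical.

```python
def risky_nfs_mounts(mounts_text, max_size=8192):
    """Return (mount_point, reason) pairs for non-TCP NFS mounts with large or unset sizes."""
    findings = []
    for line in mounts_text.splitlines():
        parts = line.split()
        if len(parts) < 4 or parts[2] != "nfs":
            continue
        # Split "rw,hard,rsize=8192" into {"rw": "", "hard": "", "rsize": "8192"}.
        opts = {}
        for opt in parts[3].split(","):
            key, _, value = opt.partition("=")
            opts[key] = value
        if "tcp" in opts or opts.get("proto") == "tcp":
            continue  # the thread reports TCP mounts as unaffected
        for name in ("rsize", "wsize"):
            value = opts.get(name)
            if not value:
                findings.append((parts[1], f"{name} not set explicitly"))
            elif int(value) > max_size:
                findings.append((parts[1], f"{name}={value} exceeds {max_size}"))
    return findings


# Hypothetical mount table: one large-size UDP mount, one capped, one over TCP.
SAMPLE = """\
filer:/vol/vol0/home /home nfs rw,hard,intr,udp,rsize=32768,wsize=32768 0 0
filer:/vol/vol1/data /data nfs rw,hard,intr,udp,rsize=8192,wsize=8192 0 0
filer:/vol/vol2/src /src nfs rw,hard,intr,tcp,bg 0 0
"""

for mount_point, reason in risky_nfs_mounts(SAMPLE):
    print(mount_point, reason)
```

Only /home is reported here; the 8192-byte mount and the TCP mount pass the check.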

+----------------------------------------------+----------------------+
| Paul Heinlein | heinlein@cse.ogi.edu |
| Research Systems Engineer | +1 503 748-1472 |
| Department of Computer Science & Engineering | 20000 NW Walker Road |
| OGI School of Science & Engineering | Beaverton, OR 97006 |
| Oregon Health & Science University | USA |
+----------------------------------------------+----------------------+
Re: NFS run amok
On Mon, Sep 16, 2002 at 04:30:22PM -0400, Brian Long wrote:
> RH 7.3 will exhibit this behavior (and lock all clients out of the Netapp)
> until you update the kernel. I recommend you run "up2date" to use RedHat
> Network and update the 7.3 kernel.

Ain't gonna help - we had the same problem, and the latest kernel
still exhibits the same NFS bug. The workaround is to set rsize and
wsize on the client side to something smaller - 8192 does the job.

Igor
Re: NFS run amok
> On Mon, Sep 16, 2002 at 04:30:22PM -0400, Brian Long wrote:
> > RH 7.3 will exhibit this behavior (and lock all clients out of the Netapp)
> > until you update the kernel. I recommend you run "up2date" to use RedHat
> > Network and update the 7.3 kernel.
>
> Ain't gonna help - we had the same problem, and the latest kernel
> still exhibits the same NFS bug. The workaround is to set rsize and
> wsize on the client side to something smaller - 8192 does the job.
>
> Igor

I'm running Red Hat Linux 7.3 on my desktop and mount with TCP, and that
solves the problem, too. Just add the "tcp" flag to the mount options. In
my /etc/fstab file I'm using:

rw,hard,intr,tcp,bg
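
For context, those options occupy the fourth field of an /etc/fstab line. The small Python sketch below composes a complete entry around them. The server name "filer", the export path and the mount point are hypothetical placeholders, not values from this thread.

```python
def fstab_entry(server, export, mount_point, options):
    """Compose one /etc/fstab line for an NFS mount (whitespace-separated fields)."""
    return f"{server}:{export} {mount_point} nfs {','.join(options)} 0 0"


# Hypothetical server and paths; the option list is the one quoted above.
entry = fstab_entry("filer", "/vol/vol0/home", "/home",
                    ["rw", "hard", "intr", "tcp", "bg"])
print(entry)
```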





Steve Losen scl@virginia.edu phone: 434-924-0640

University of Virginia ITC Unix Support
Re: NFS run amok
On Mon, 16 Sep 2002, Igor Schein wrote:

> On Mon, Sep 16, 2002 at 04:30:22PM -0400, Brian Long wrote:
> > RH 7.3 will exhibit this behavior (and lock all clients out of the
> > Netapp) until you update the kernel. I recommend you run
> > "up2date" to use RedHat Network and update the 7.3 kernel.
>
> Ain't gonna help - we had the same problem, and the latest kernel
> still exhibits the same NFS bug. The workaround is to set rsize and
> wsize on the client side to something smaller - 8192 does the job.

Odd. The updates did the trick for us (F820, 6.1.3).

--Paul Heinlein <heinlein@cse.ogi.edu>
Re: NFS run amok
I'll add to Paul's comments: Updating the client did fix the problem
here, but it's not really a fix (in my opinion). The NetApp is still
vulnerable to a stream of badly-formed IP fragments, which exhausts
the filer's available reassembly space. Here's my earlier discussion:

http://teaparty.mathworks.com:1999/toasters/11666.html

I see that they now have an actual bug report on NOW regarding this
issue (yay!). It's #77650.
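
To illustrate why transfer size matters for reassembly: over UDP, each NFS read reply or write request travels as a single datagram, which IP splits into MTU-sized fragments that the receiver must hold until all have arrived. The Python sketch below is a back-of-the-envelope estimate only. The 128-byte RPC/NFS header allowance and the 32768-byte comparison size are assumptions for illustration; the thread does not state which size the affected clients were using.

```python
import math

IP_HEADER = 20      # bytes, IPv4 header without options
UDP_HEADER = 8      # bytes
RPC_OVERHEAD = 128  # bytes, rough allowance for RPC/NFS headers (assumption)


def fragments_per_request(transfer_size, mtu=1500):
    """Estimate the IP fragments needed to carry one UDP NFS read or write."""
    # Each fragment's payload must be a multiple of 8 bytes.
    per_fragment = (mtu - IP_HEADER) // 8 * 8
    datagram = UDP_HEADER + RPC_OVERHEAD + transfer_size
    return math.ceil(datagram / per_fragment)


for size in (8192, 32768):
    print(f"rsize/wsize {size}: about {fragments_per_request(size)} fragments")
```

Under these assumptions an 8192-byte transfer needs about 6 fragments on a 1500-byte MTU and a 32768-byte transfer about 23, so smaller sizes leave far fewer fragments waiting for reassembly.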

Regards,

--
Marion Hakanson <hakanson@cse.ogi.edu>
CSE Computing Facilities



Re: NFS run amok
Thank you all for the wonderful input. We've got the situation under
control again thanks to all your good advice!

--[Lance]

--
Lance A. Brown
SysAdmin Task
LMIT ITSS Contract for
National Institute of Environmental Health Sciences
919.361.5444x420