Mailing List Archive: NwkThd_00:warning: NFS response to client was slow

NwkThd_00:warning: NFS response to client was slow

Feb 15, 2016, 3:44 AM

Post #1 of 8 (3529 views)

NwkThd_00:warning]: NFS response to client xx.xx.xx.xx for volume 0x1234567
was slow, op was v3 write, 65 > 60 (in seconds)

I have a filer head, on which I'm hosting ESX datastores.
I've had a couple of instances now of this error (or one rather similar).

It correlates with VMware getting upset and VMs going read only. But it
doesn't actually give me any insight into what is going on.

Has anyone run into this who can offer some insight into what might be
causing it and where I can look?

Re: NwkThd_00:warning: NFS response to client was slow

basilberntsen at gmail

Feb 15, 2016, 4:42 AM

Post #2 of 8 (3509 views)

That's a very generic warning; I'd open a case.

On Mon, Feb 15, 2016 at 6:44 AM, Edward Rolison <ed.rolison@gmail.com>
wrote:

> NwkThd_00:warning]: NFS response to client xx.xx.xx.xx for volume
> 0x1234567 was slow, op was v3 write, 65 > 60 (in seconds)
>
> I have a filer head, on which I'm hosting ESX datastores.
> I've had a couple of instances now of this error (or one rather similar).
>
> It correlates with VMware getting upset and VMs going read only. But it
> doesn't actually give me any insight into what is going on.
>
> Has anyone run into this, and can give some further insight as to what
> might be causing and where I can look?
>
> _______________________________________________
> Toasters mailing list
> Toasters@teaparty.net
> http://www.teaparty.net/mailman/listinfo/toasters
>
>

Re: NwkThd_00:warning: NFS response to client was slow

tmacmd at gmail

Feb 15, 2016, 5:32 AM

Post #3 of 8 (3512 views)

Are you hosting your datastores on SATA drives? I have seen this before
(many times) when customers use SATA, try to host too many virtual
machines, and do not turn on Storage I/O Control (a premium feature!).

Turning on Storage I/O Control does not always solve this either.

--tmac

*Tim McCarthy*
*Principal Consultant*


Re: NwkThd_00:warning: NFS response to client was slow

brad.thompson877 at gmail

Feb 15, 2016, 6:28 AM

Post #4 of 8 (3512 views)

What type and speed of disks are you using for the VMware environment? If
you're using 15k disks there are a couple of things I would start with.
Check the output of the following:

priv set diag
statit -b
(wait 10 seconds)
statit -e

You'll get some good info about the disks in this output. What are the
xfers (iops)?

Also check:
sysstat -c 10 -x 10

What do you see in the "CP ty" column? If you see an upper or lower case
"b", then you're hitting back to back CPs. Basically, this means that the
controller has a problem keeping up with the incoming writes.
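
As a rough illustration of that check, here is a minimal Python sketch. It assumes you have already copied the "CP ty" values out of the sysstat output by hand, one string per sample; the sample values and the function name are made up for the example. It follows the rule of thumb above and treats any value containing an upper or lower case "b" as a back-to-back CP.

```python
def back_to_back_samples(cp_types):
    """Return the indexes of samples whose CP type contains 'b' or 'B'."""
    return [i for i, cp in enumerate(cp_types) if "b" in cp.lower()]


# Hypothetical values copied from the "CP ty" column, one per sample.
samples = ["T", "B", "b", "-", "B"]
hits = back_to_back_samples(samples)
print(f"{len(hits)} of {len(samples)} samples show back-to-back CPs")
```

If most of the samples taken around the time of the warning are flagged, that points at the controller struggling with incoming writes rather than at the network.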

If any of the above comes back abnormal I would open a case for deeper
investigation.

Also, that message you're seeing is giving you the volume fsid, 0x1234567.
You can use the "vol read_fsid <vol-name>" command (priv set diag) to see
a volume's fsid. You'd have to run it on each volume until you find the
one that matches what you see in the message regarding the slow response.
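
If you end up doing that matching more than once, a small script can help. The sketch below is only an illustration: the regular expression is written against the message text quoted in this thread (other ONTAP releases may word it differently), and the volume names and fsid values in the dictionary are hypothetical stand-ins for whatever you collect by hand from "vol read_fsid".

```python
import re

# Pattern based on the warning text quoted in this thread.
SLOW_RE = re.compile(
    r"NFS response to client (?P<client>\S+) for volume (?P<fsid>0x[0-9a-fA-F]+)\s+"
    r"was slow, op was (?P<op>.+?), (?P<elapsed>\d+) > (?P<limit>\d+) \(in seconds\)"
)


def parse_slow_warning(line):
    """Extract client, fsid, op and timings from the warning, or return None."""
    match = SLOW_RE.search(line)
    if match is None:
        return None
    info = match.groupdict()
    info["fsid"] = int(info["fsid"], 16)
    info["elapsed"] = int(info["elapsed"])
    info["limit"] = int(info["limit"])
    return info


def volume_for_fsid(fsid, fsid_by_volume):
    """Return the volume name whose fsid matches, or None if there is none."""
    for name, value in fsid_by_volume.items():
        if value == fsid:
            return name
    return None


# Hypothetical mapping, filled in by hand from "vol read_fsid" on each volume.
fsid_by_volume = {"vol_vmware01": 0x1234567, "vol_home": 0x7654321}

line = ("NwkThd_00:warning]: NFS response to client xx.xx.xx.xx for volume "
        "0x1234567 was slow, op was v3 write, 65 > 60 (in seconds)")
info = parse_slow_warning(line)
print(volume_for_fsid(info["fsid"], fsid_by_volume), info["op"], info["elapsed"])
```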

Re: NwkThd_00:warning: NFS response to client was slow

ed.rolison at gmail

Feb 16, 2016, 8:04 AM

Post #5 of 8 (3484 views)

The long and short seems to be: I'm getting low_mbuf CPs on the filer
head, and at the time the error message occurs I'm also getting
back-to-back CPs. So I have a reboot pending, and probably a code update
in the near future.

Re: NwkThd_00:warning: NFS response to client was slow

tbar at BERKCOM

Feb 16, 2016, 8:12 AM

Post #6 of 8 (3483 views)

Edward -

Good luck with your corrective plan, but if you're getting back-to-back CPs, simply rebooting or patching (unless there's a specific patch recommended for this behavior) isn't going to do much to solve your problem. These errors are almost always caused by having too few physical disks in the underlying aggregate, or a workload that is too aggressive for the aggregate hosting it, so you're best off relocating the workload (Storage vMotion, volume move, SnapMirror/NDMP dump) or expanding the aggregate. Another thing you might want to check is the free space on the aggregate: above roughly 90-95% utilization, housekeeping tasks such as changed block reclamation may not have adequate free space to run, which can also cause problems.
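
For the free-space check, the arithmetic is simple enough to script. This is a minimal sketch, assuming you have the aggregate's used and total sizes from whatever tool you normally use; the function names and the sample numbers are hypothetical, and the 90% default is just the lower end of the range mentioned above.

```python
def aggregate_utilization(used, total):
    """Return utilization as a percentage of the aggregate's total size."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * used / total


def needs_attention(used, total, threshold_pct=90.0):
    """True if the aggregate is fuller than the chosen threshold."""
    return aggregate_utilization(used, total) > threshold_pct


# Hypothetical aggregate: 18.6 TB used out of 20 TB.
print(needs_attention(18.6, 20.0))
```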

Anyway, to reiterate -- good luck, but keep the above in mind when considering your options.

Thanks!

RE: NwkThd_00:warning: NFS response to client was slow

Justin.Parisi at netapp

Feb 16, 2016, 8:16 AM

Post #7 of 8 (3486 views)

Well, if it's caused by low_mbuf, those are memory buffers that can be cleared with a reboot.

However, the issue will likely resurface unless there is a fix for the memory buffer issue.

Re: NwkThd_00:warning: NFS response to client was slow

ed.rolison at gmail

Feb 16, 2016, 9:07 AM

Post #8 of 8 (3481 views)

Yes - my first port of call is to see if I can make low_mbufs go away. I
haven't often seen that, although I have seen something similar when I had a
memory leak that was causing the same sort of problems. So an update to the
latest revision of ONTAP is probably no bad thing before trying to
troubleshoot further (or otherwise suggesting to my customers that we need
more resources to do what they want).
This is a 2240, so it's not the beefiest of filers in the first place :).
