Mailing List Archive

IOPs limit?
Hi there

We have an A300 connected to two FC fabrics with 2 x 8G FC per controller node.
We are doing tests on an “older” HP C7000 with VirtualConnect modules and with blades that have 2 x 8G FC mezzanine.
The VCs are connected into the fabric with two 8G FC per fabric.
So the throughput to a given LUN would be 16Gbps, as the other two links are standby.
The blade is running ESXi 7.0 and on top of this we are running a virtual Windows 2016 server from where the tests are done.
There is no one else on the A300. We have created a standard LUN in a volume, presented it to ESXi, created a VMFS on it, and the VM is then presented with a VMDK-based disk.

It seems that no matter what we adjust we have a limit at about 70.000 IOPs on the disks.
We are able to run other workloads while we are testing with 70.000 IOPs, and we are able to load the system even more, so the NetApp does not seem to be the limiting factor… we also cannot see that much load on the system from the NetApp Harvest/Grafana we have running…
We of course make sure that the tests we do are not served from cache on the host…
The specific test is with 64k block size.

Questions:

* Can there be a limit of IOPs in the C7000 setup? PCI-bridge stuff?
* Does it make sense to add more paths, in either the NetApp and/or the C7000? (each blade can only have two ports)
* Would it make sense to upgrade the FC links to 16G? (see the rough link-rate comparison after this list)
* We have adjusted the queue depth as per NetApp's recommendations, as far as I know there is no way to adjust QD on the ONTAP node itself, is this correct?
* The shelf attached is IOM6 based, would it make sense to upgrade this to IOM12?
* Would it make sense to cable the IOM12 in a Quad-Path setup?
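
For context, a rough comparison of the nominal per-link data rates involved (a minimal Python sketch; the usable rates are approximate and ignore protocol overhead):

# Approximate usable data rate per link/port, one direction, in MB/s.
links_mb_s = {
    "8G FC link":     800,       # 8b/10b encoding, ~800 MB/s usable
    "16G FC link":    1600,      # 64b/66b encoding, ~1600 MB/s usable
    "IOM6 SAS port":  4 * 600,   # 4 lanes of 6 Gb/s SAS
    "IOM12 SAS port": 4 * 1200,  # 4 lanes of 12 Gb/s SAS
}
for name, mb_s in links_mb_s.items():
    print(f"{name:15} ~{mb_s} MB/s")

If these figures are roughly right, the two 8G host links (~1,600 MB/s combined) fill up long before a single IOM6 SAS port does.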

Any suggestions are welcome… also if you have any insight into how many IOPs one should expect from an A300 with 24 x 3.7T SSDs.

/Heino
--
Re: IOPs limit? [ In reply to ]
Heino,

It might be your 8Gb FC links that are the bottleneck. You'd have to check
the switch/host stats to verify. By my math, 70,000 IOPS at a 64kB IO size
is about 4,480,000 kB/s of throughput. Converted to Gbit/s, that's roughly
36 Gbit/s, far more than your 2x 8Gb links at the host can carry. You did
not mention the read/write ratio of your tests. Is it 50/50? If you had
35k read and 35k write IOPS, that's more like 18 Gb/s in each direction,
which is still over the max of your 2x 8Gb links. Only if the IO sizes
were mostly smaller than 64kB would this fit within the available bandwidth.
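
In Python, a minimal sketch of that arithmetic (assuming 64 KiB IOs and
ignoring FC encoding overhead, so the numbers are rough):

# Rough throughput implied by the reported IOPS ceiling at 64 KiB IO size.
iops = 70_000
io_size_bytes = 64 * 1024                              # 64 KiB per IO
throughput_gbit_s = iops * io_size_bytes * 8 / 1e9
print(f"required : {throughput_gbit_s:.1f} Gbit/s")    # ~36.7 Gbit/s

# Nominal host-side bandwidth: 2 active 8G FC links, overhead ignored.
available_gbit_s = 2 * 8
print(f"available: {available_gbit_s} Gbit/s")         # 16 Gbit/s -> the links saturate first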

Check the performance on the switches and host ports and see how saturated
they are during the test. You might also be able to add more 8Gb paths, if
that's all you have: increase the number of connections to 4 on the host
and to 8 on the filer.


Regards,

Brian Jones



SV: IOPs limit? [ In reply to ]
Hi Brian

Even with 4k blocks 100% read, we only get the 70.000 IOPs…
As this blade is the only one using the NetApp, and it only has 2 x 8G FC HBAs (which cannot be expanded), it might not make much sense adding more or faster paths to the NetApp…
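
At 4k the links are nowhere near full, so the ceiling looks more like an outstanding-IO limit than a bandwidth limit. A rough Little's Law sketch in Python (the queue depth and latency figures are assumptions for illustration, not measured values):

# Little's Law: achievable IOPS ~= outstanding IOs / average latency.
# 4 KiB reads at 70k IOPS only need ~2.3 Gbit/s, so bandwidth is not the limit there.
print(f"4k bandwidth: {70_000 * 4 * 1024 * 8 / 1e9:.1f} Gbit/s")

paths = 2                   # active paths from the blade (assumed)
queue_depth_per_path = 64   # e.g. a typical HBA/LUN queue depth (assumed)
latency_s = 0.0018          # ~1.8 ms average service time (assumed)

outstanding = paths * queue_depth_per_path
print(f"max IOPS ~ {outstanding / latency_s:,.0f}")    # ~71,000 with these assumptions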

/Heino

AW: IOPs limit? [ In reply to ]
Where do you measure the 70k IOPS? Inside the virtual machine? Directly on the storage?

Alexander Griesser
Head of Systems Operations

ANEXIA Internetdienstleistungs GmbH

E-Mail: AGriesser@anexia-it.com
Web: http://www.anexia-it.com

Headquarters address, Klagenfurt: Feldkirchnerstraße 140, 9020 Klagenfurt
Managing Director: Alexander Windbichler
Commercial register: FN 289918a | Place of jurisdiction: Klagenfurt | VAT ID: AT U63216601

SV: IOPs limit? [ In reply to ]
We see it in “lun stat show”… on the NetApp.

/Heino

SV: IOPs limit? [ In reply to ]
Of course it is “statistics lun show”…


