Mailing List Archive

Re: Network throughput from main Gentoo rig to NAS box.
On Sat, Sep 23, 2023 at 5:05 AM Dale <rdalek1967@gmail.com> wrote:
<SNIP>
> If you need more info, let me know. If you know the command, that might
> help too. Just in case it is a command I'm not familiar with.
>
> Thanks.
>
> Dale
>
> :-) :-)

You can use the iperf command to do simple raw speed testing.

For instance, on your server open a terminal through ssh and run

iperf -s

It should tell you the server is listening.

On your desktop machine run

iperf -c 192.168.86.119

(replace with the IP of your server)

It runs for 5-10 seconds and then reports what it sees
as throughput.

Remember to Ctrl-C the server side when you're done.
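
A minimal sketch of the same test with the newer iperf3, assuming it is
installed on both ends (note: iperf3 listens on TCP port 5201 by default,
not 5001):

# on the server/NAS end
iperf3 -s

# on the desktop end; replace the address with your server's IP
iperf3 -c 192.168.86.119 -t 10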

HTH,
Mark
Re: Network throughput from main Gentoo rig to NAS box.
Mark Knecht wrote:
>
> <SNIP>


I had to install those.  On Gentoo it's called iperf3 but it works. 
Anyway, this is what I get from running the command on the NAS box to my
main rig. 


root@nas:~# iperf -c 10.0.0.4
tcp connect failed: Connection refused
------------------------------------------------------------
Client connecting to 10.0.0.4, TCP port 5001
TCP window size: -1.00 Byte (default)
------------------------------------------------------------
[  1] local 0.0.0.0 port 0 connected with 10.0.0.4 port 5001
root@nas:~#


This is when I try to run from my main rig to the NAS box. 


root@fireball / # iperf3 -c 10.0.0.7
iperf3: error - unable to connect to server - server may have stopped
running or use a different port, firewall issue, etc.: Connection refused
root@fireball / #


I took what you said to mean to run from the NAS box.  I tried both just
in case I misunderstood your meaning by server.  ;-)

Ideas?

Dale

:-)  :-) 

That pepper sauce is getting loud.  '_O
Re: Network throughput from main Gentoo rig to NAS box.
On Sat, Sep 23, 2023 at 6:41 AM Dale <rdalek1967@gmail.com> wrote:
>
> Mark Knecht wrote:
> <SNIP>
>
> I had to install those. On Gentoo it's called iperf3 but it works.
> Anyway, this is what I get from running the command on the NAS box to my
> main rig.
>
>
> root@nas:~# iperf -c 10.0.0.4
> tcp connect failed: Connection refused
> ------------------------------------------------------------
> Client connecting to 10.0.0.4, TCP port 5001
> TCP window size: -1.00 Byte (default)
> ------------------------------------------------------------
> [ 1] local 0.0.0.0 port 0 connected with 10.0.0.4 port 5001
> root@nas:~#
>
>
> This is when I try to run from my main rig to the NAS box.
>
>
> root@fireball / # iperf3 -c 10.0.0.7
> iperf3: error - unable to connect to server - server may have stopped
> running or use a different port, firewall issue, etc.: Connection refused
> root@fireball / #
>
>
> I took what you said to mean to run from the NAS box. I tried both just
> in case I misunderstood your meaning by server. ;-)
>
> Ideas?
>
> Dale

I thought the instructions were clear but let's try again.

When using iperf YOU have to set up BOTH ends of the path, so:

1) On one end - let's say it's your NAS server - open a terminal. In that
terminal type

mark@plex:~$ iperf -s
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 128 KByte (default)
------------------------------------------------------------

2) Then, on your desktop machine that wants to talk to the NAS server type
this command,
replacing my server IP with your NAS server IP

mark@science2:~$ iperf -c 192.168.86.119
------------------------------------------------------------
Client connecting to 192.168.86.119, TCP port 5001
TCP window size: 85.0 KByte (default)
------------------------------------------------------------
[  1] local 192.168.86.43 port 40320 connected with 192.168.86.119 port 5001
[ ID] Interval       Transfer     Bandwidth
[  1] 0.0000-10.0808 sec   426 MBytes   354 Mbits/sec
mark@science2:~$

In this case, over my wireless network, I'm getting about 354 Mb/s.  The
last time I checked, with a cable hooked between the two rooms, I got
about 900 Mb/s.
Re: Network throughput from main Gentoo rig to NAS box.
Mark Knecht wrote:
>
> <SNIP>
>
> I thought the instructions were clear but let's try again.
>
> When using iperf YOU have to set up BOTH ends of the path, so:
>
> 1) On one end - let's say it's your NAS server - open a terminal. In
> that terminal type
>
> mark@plex:~$ iperf -s
> ------------------------------------------------------------
> Server listening on TCP port 5001
> TCP window size:  128 KByte (default)
> ------------------------------------------------------------
>
> 2) Then, on your desktop machine that wants to talk to the NAS server
> type this command,
> replacing my server IP with your NAS server IP
>
> mark@science2:~$ iperf -c 192.168.86.119    
> ------------------------------------------------------------
> Client connecting to 192.168.86.119, TCP port 5001
> TCP window size: 85.0 KByte (default)
> ------------------------------------------------------------
> [  1] local 192.168.86.43 port 40320 connected with 192.168.86.119 port 5001
> [ ID] Interval       Transfer     Bandwidth
> [  1] 0.0000-10.0808 sec   426 MBytes   354 Mbits/sec
> mark@science2:~$ 
>
> In this case, over my wireless network, I'm getting about 354 Mb/s.  The
> last time I checked, with a cable hooked between the two rooms, I got
> about 900 Mb/s.


Oh.  My pepper sauce was getting loud and my eyes were watery.  Now that
I got that done, I can see better after opening the doors for a few
minutes.  This is what I get now.  My NAS box, running it first:

root@nas:~# iperf -s
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size:  128 KByte (default)
------------------------------------------------------------


From my main rig, with the NAS box command run first (it appeared to be waiting):


root@fireball / # iperf3 -c 10.0.0.7
iperf3: error - unable to connect to server - server may have stopped
running or use a different port, firewall issue, etc.: Connection refused
root@fireball / #


So, it appears to be waiting but my main rig isn't getting it.  Then it
occurred to me that my VPN might be affecting this somehow.  I stopped it
just in case.  OK, same thing.  I did run the one on the NAS box first,
since I assume it needs to be listening when I run the command on my main
rig.  After stopping the VPN, I ran both again.
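
A plausible explanation, going only by the outputs above: the NAS is
running iperf version 2, which listens on TCP port 5001, while the iperf3
client on the main rig tries TCP port 5201 by default, and the two
versions speak incompatible protocols anyway. A sketch of how to confirm
and work around it, assuming iperf3 can also be installed on the NAS:

# on the NAS: see which port is actually listening
ss -tlnp | grep -E ':(5001|5201)'

# then run matching versions on both ends, e.g. iperf3 everywhere
iperf3 -s            # on the NAS
iperf3 -c 10.0.0.7   # on the main rig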

Just so you know the machine is reachable, I am ssh'd into the NAS box
and I also have it mounted and copying files over with rsync.  Could my
router be blocking this connection?  I kinda leave it at the default
settings.  Read somewhere those are fairly secure. 

I'm working in the garden a bit so I may come and go at times.  I'm sure
you're doing other things too.  :-D

Dale

:-)  :-)
Re: Network throughput from main Gentoo rig to NAS box.
There are quite a few things to tweak that can lead to much smoother transfers, so I'll make an unordered list to help.

mount -o nocto,nolock,async,nconnect=4,rsize=1048576,wsize=1048576
rsize and wsize are very important for max bandwidth; worth checking with mount once it's linked up
nocto helps a bit, the man page has more info
nconnect helps reach higher throughput by using more connections on the pipe
async might actually be your main issue. NFS does a lot of sync writes, so that would explain the gaps in your chart: data needs to be written to physical media before the server replies that it's been committed, so more data can be sent.
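
As a sketch, those client-side options could go in /etc/fstab; the host,
export path and mountpoint below match the ones shown later in the thread
but are otherwise placeholders, and async is left out since (as corrected
further down) it belongs in the server's /etc/exports, not here:

# /etc/fstab - hypothetical NFS entry
10.0.0.7:/mnt/backup  /mnt/TV_Backup  nfs4  nocto,nolock,nconnect=4,rsize=1048576,wsize=1048576  0 0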

sysctl.conf mods
net.ipv4.tcp_mtu_probing = 2
net.ipv4.tcp_base_mss = 1024

if you use jumbo frames, that'll allow it to find the higher packet sizes.

fs.nfs.nfs_congestion_kb = 524288

that controls how much data can be in flight waiting for responses; if it's too small, that'll also lead to the gaps you see.
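
These can be tried live before committing them to /etc/sysctl.conf; a sketch:

sysctl -w net.ipv4.tcp_mtu_probing=2
sysctl -w net.ipv4.tcp_base_mss=1024
sysctl -w fs.nfs.nfs_congestion_kb=524288
sysctl -p    # later, to load whatever you persist in /etc/sysctl.conf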

subjective part incoming lol

net.core.rmem_default = 1048576
net.core.rmem_max = 16777216
net.core.wmem_default = 1048576
net.core.wmem_max = 16777216

net.ipv4.tcp_mem = 4096 131072 262144
net.ipv4.tcp_rmem = 4096 1048576 16777216
net.ipv4.tcp_wmem = 4096 1048576 16777216

net.core.netdev_max_backlog = 10000
net.ipv4.tcp_fin_timeout = 15
net.ipv4.tcp_limit_output_bytes = 262144
net.ipv4.tcp_max_tw_buckets = 262144

You can find your own numbers based on RAM size. Basically those control how much data can be buffered PER socket. Big buffers improve bandwidth usage up to a point; after that point they can add latency. If most of your communication is with that NAS, you basically ping the NAS to get the average latency, then multiply your wire speed by it to see how much data it would take to max it out. Also, being per socket means you can have lower numbers than I use for sure; I do a lot of single file copies, so my workload isn't the normal usage.
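
A worked example of that sizing rule, with made-up numbers: the in-flight
data is the bandwidth-delay product, wire speed multiplied by round-trip
time.

# 1 Gb/s wire speed = 125,000,000 bytes/s; 0.5 ms RTT measured with ping
echo $(( 125000000 / 2000 ))   # 62500 bytes, about 61 KiB in flight per socket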
Re: Network throughput from main Gentoo rig to NAS box.
<SNIP>
>
> <SNIP>
>
> So, it appears to be waiting but my main rig isn't getting it.  Then it
> occurred to me that my VPN might be affecting this somehow.  I stopped it
> just in case.  OK, same thing.  I did run the one on the NAS box first,
> since I assume it needs to be listening when I run the command on my main
> rig.  After stopping the VPN, I ran both again.
>
> Just so you know the machine is reachable, I am ssh'd into the NAS box
> and I also have it mounted and copying files over with rsync.  Could my
> router be blocking this connection?  I kinda leave it at the default
> settings.  Read somewhere those are fairly secure.
>
> I'm working in the garden a bit so I may come and go at times.  I'm sure
> you're doing other things too.  :-D
>
> Dale
>
> :-) :-)

If you're running a VPN then you'll need someone at a higher pay grade than
me, but encapsulating TCP packets into and out of the VPN tunnel is certainly
going to use CPU cycles and slow things down, at least a little. No idea if
that's what caused your gkrellm pictures. Also, any network-heavy apps,
like playing video from the NAS or from the Internet, are also going to slow
file transfers down.

Your error message is telling you that something is in the way.

Can you ping the NAS box?

ping 10.0.0.7

Can you tracepath the NAS box?

tracepath 10.0.0.7

Are you sure that 10.0.0.7 is the address of the NAS box?

Do you have a /etc/hosts file to keep the names straight?
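
A sketch of such a hosts file, using the host names that show up in the
prompts elsewhere in the thread:

# /etc/hosts
10.0.0.4   fireball
10.0.0.7   nas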

HTH,
Mark
Re: Network throughput from main Gentoo rig to NAS box.
On 9/23/23 14:04, Dale wrote:
> Howdy,
>
> As most everyone knows, I redone my NAS box.  Before I had Truenas on it
> but switched to Ubuntu server thingy called Jimmy.  Kinda like the
> name.  lol  Anyway, Ubuntu has the same odd transfer pattern as the
> Truenas box had.  I'm not sure if the problem is on the Gentoo end or
> the Ubuntu end or something else.  I'm attaching a picture of Gkrellm so
> you can see what I'm talking about.  It transfers a bit, then seems to
> stop for some reason, then start up again and this repeats over and
> over.  I'm expecting more of a consistent throughput instead of all the
> idle time.  The final throughput is only around 29.32MB/s according to
> info from rsync.  If it was not stopping all the time and passing data
> through all the time, I think that would improve.  Might even double.
>
> ...
> Has anyone ever seen something like this and know why it is idle for so
> much of the time?  Anyone know if this can be fixed so that it is more
> consistent, and hopefully faster?
>
I found a similar pattern when I checked some time ago, while
transferring big (several Gb) files from one desktop to the other. I
concluded the cause of the gaps was the destination PC's SATA spinning
disk that needed to empty its cache before accepting more data. In
theory the network is 1Gb/s (measured with iperf, it is really close to
that) and the SATA is 6Gb/s so it should not be the limit, but I have
strong doubts as to how this speed is measured by the manufacturer. As
indirect confirmation, when I switched the destination disk to a flash
one the overall network transfer speed improved a lot (a lot lot!).

raffaele
Re: Network throughput from main Gentoo rig to NAS box.
On 9/23/23 08:04, Dale wrote:
> Howdy,
>
> As most everyone knows, I redone my NAS box.  Before I had Truenas on it
> but switched to Ubuntu server thingy called Jimmy.  Kinda like the
> name.  lol  Anyway, Ubuntu has the same odd transfer pattern as the
> Truenas box had.  I'm not sure if the problem is on the Gentoo end or
> the Ubuntu end or something else.  I'm attaching a picture of Gkrellm so
> you can see what I'm talking about.  It transfers a bit, then seems to
> stop for some reason, then start up again and this repeats over and
> over.  I'm expecting more of a consistent throughput instead of all the
> idle time.  The final throughput is only around 29.32MB/s according to
> info from rsync.  If it was not stopping all the time and passing data
> through all the time, I think that would improve.  Might even double.
>
> A little more info.  The set of drives this is being copied from use LVM
> and are encrypted with dm-crypt.  They are also set up the same way on
> the NAS box.  I also notice that on the NAS box, using htop, the CPUs
> sit at idle for a bit then show heavy use, on Ubuntu about 60 or 70%,
> then go back to idle.  This seems to be the same thing I'm seeing on
> htop with the data throughput.  One may have something to do with the
> other but I don't know what.  I got so much stuff running on my main rig
> that I can't really tell if that CPU has the same changes or not.  By
> the way, it showed the same when Truenas was on there.  These things are
> mounted using nfs.  I don't know if that matters or not.  In case this
> is a routing issue, I have a Netgear router with 1GB ports.  This is the
> first part of the mount command:
>
> mount -t nfs -o nolock
>
> Has anyone ever seen something like this and know why it is idle for so
> much of the time?  Anyone know if this can be fixed so that it is more
> consistent, and hopefully faster?
>
> If you need more info, let me know.  If you know the command, that might
> help too.  Just in case it is a command I'm not familiar with.
I can't add to what others have suggested to improve the throughput, but
the gkrellm pic you posted tells me something is handling the data in
batches.  enp3s0 (your network interface) gets a peak of activity then
stops while crypt (the disk) has a peak of activity. Rinse and repeat. 
I don't know if this is caused by the program invoked by the command you
issued or by some interaction of different pieces that get called to do
the work.  My guess is that it is reading until it fills some buffer,
then writes it out.  (Note, it doesn't matter which device is reading and
which is writing; the two just don't overlap.)  If encryption is
involved, it might be that there is actually an encrypt/decrypt step which
takes place in between the disk and network flows.  I don't know of any
way to change this, but it might explain why the network transfer rate
is as fast as it can get, but the overall throughput is lower.
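
One way to test that theory is to take the network out of the picture and
write straight to the encrypted array on the NAS, bypassing the page cache;
a sketch, with a hypothetical target path and assuming a couple of GiB free:

# raw write speed of disk + dm-crypt, no caching, no network
dd if=/dev/zero of=/mnt/backup/ddtest bs=1M count=2048 oflag=direct status=progress
rm /mnt/backup/ddtest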
Re: Network throughput from main Gentoo rig to NAS box.
On Sat, Sep 23, 2023 at 8:04 AM Dale <rdalek1967@gmail.com> wrote:
>
> I'm expecting more of a consistent throughput instead of all the
> idle time. The final throughput is only around 29.32MB/s according to
> info from rsync. If it was not stopping all the time and passing data
> through all the time, I think that would improve. Might even double.

Is anything else reading data from the NAS at the same time? The
performance is going to depend on a lot of details you haven't
provided, but anything that reads from a hard disk is going to
significantly drop throughput - probably to levels around what you're
seeing.

That seems like the most likely explanation, assuming you don't have
some older CPU or a 100Mbps network port, or something else like WiFi
in the mix. The bursty behavior is likely due to caching.

--
Rich
Re: Network throughput from main Gentoo rig to NAS box.
On Sat, Sep 23, 2023 at 9:22 AM Rich Freeman <rich0@gentoo.org> wrote:
>
> On Sat, Sep 23, 2023 at 8:04 AM Dale <rdalek1967@gmail.com> wrote:
> >
> > I'm expecting more of a consistent throughput instead of all the
> > idle time. The final throughput is only around 29.32MB/s according to
> > info from rsync. If it was not stopping all the time and passing data
> > through all the time, I think that would improve. Might even double.
>
> Is anything else reading data from the NAS at the same time? The
> performance is going to depend on a lot of details you haven't
> provided, but anything that reads from a hard disk is going to
> significantly drop throughput - probably to levels around what you're
> seeing.
>
> That seems like the most likely explanation, assuming you don't have
> some older CPU or a 100Mbps network port, or something else like WiFi
> in the mix. The bursty behavior is likely due to caching.
>
> --
> Rich

Let's not forget that Dale also likes to put layers of things on his
drives, LVM & encryption at a minimum. We also don't know anything
about his block sizes at either end of this pipe.

I would think maybe running iotop AND btop on both ends would give some
clues on timing. Is the time when gkrellm is idle due to the host disk not
responding or the target getting flooded with too much data?
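
For instance, something like this in a couple of terminals on each machine,
assuming both tools are installed (-o limits iotop to processes actually
doing I/O):

iotop -o -d 2    # refresh every 2 seconds, only active I/O
btop             # overall CPU / disk / network view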

- Mark
Re: Network throughput from main Gentoo rig to NAS box.
Mark Knecht wrote:
> <SNIP>
>
> If you're running a VPN then you'll need someone at a higher pay grade
> than me, but encapsulating TCP packets into and out of the VPN tunnel is
> certainly going to use CPU cycles and slow things down, at least a
> little. No idea if that's what caused your gkrellm pictures. Also, any
> network-heavy apps, like playing video from the NAS or from the
> Internet, are also going to slow file transfers down.
>
> Your error message is telling you that something is in the way.
>
> Can you ping the NAS box?
>
> ping 10.0.0.7
>
> Can you tracepath the NAS box?
>
> tracepath 10.0.0.7
>
> Are you sure that 10.0.0.7 is the address of the NAS box? 
>
> Do you have a /etc/hosts file to keep the names straight?
>
> HTH,
> Mark
>
>


Sorry so long.  I was working in the garden some when my neighbor called
and wanted to move his mailbox.  He got one of those pre-made things a
long time ago and it is really too short.  I dug a new hole, put in a
piece of tubing that the mailbox post would slip over and then poured
concrete around it.  Should hold up now.  Can't move it and the concrete
isn't even dry yet.  I really like an auger that fits on the back of a
tractor, especially when the road used to be a gravel road.  Post hole
diggers are a TON of hard work.  The auger took less than a minute, with
the tractor at idle.  LOL  Anyway, the mailbox is up and I got some sore
joints.  I sense meds coming pretty soon.  :/

I can ping it.  I always have the VPN running and believe it or not, I
can connect to the NAS box, ping, transfer files and everything with it
running.  I just thought the VPN might affect that one thing for some
reason.  Anytime something odd is going on network wise, I stop the VPN
first and test again.  I rule that out first thing. 

I read the other replies and I think it is caching the data; the drive
writes and catches up, and then it asks for more data again.  I can't
imagine it being anything to do with the NAS box, given it did the exact
same thing with Truenas.  If anything, I'd expect it to be on my main rig,
but the main rig has a lot more power and memory than the NAS box does.  I
suspect between the cache thing and encryption, that is the bottleneck.
I also replaced the network card a week or so ago.  Turned out it was a
setting I needed to change.  So, not a bad card either.

I got to cool off and rest a bit.  I may read some more replies later or
again, after I get my wits back.  :/ 

Dale

:-)  :-) 
Re: Network throughput from main Gentoo rig to NAS box.
On Sat, Sep 23, 2023 at 02:30:32PM -0500, Dale wrote:

> I read the other replies and I think it is caching the data, the drives
> writes and catches up and then it asks for more data again.

Tool tip: dstat

It puts out one line of values every x seconds (x == 1 by default).
With arguments you can tell it what to show. To see disks in action, I like
to run the following during upgrades that involve voluminous packages:

dstat --time --cpu --disk -D <comma-list of disks you want to monitor> --net --mem-adv --swap

The cpu column includes IO wait.

The disk columns show read and write volume. If you omit the -D option, you
will only see a total over all disks, which might still be enough for your
use case.

The mem-adv shows how much data is in the file system write cache (the --mem
option does not).
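
A concrete invocation, with hypothetical disk names and a 2-second interval:

dstat --time --cpu --disk -D sda,sdb --net --mem-adv --swap 2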

--
Grüße | Greetings | Salut | Qapla’
Please do not share anything from, with or about me on any social network.

Unburden your teacher; skip classes once in a while!
Re: Network throughput from main Gentoo rig to NAS box.
On Sat, Sep 23, 2023 at 05:54:21PM +0200, ralfconn wrote:

> <SNIP>
>
> I found a similar pattern when I checked some time ago, while transferring
> big (several Gb) files from one desktop to the other. I concluded the cause
> of the gaps was the destination PC's SATA spinning disk that needed to empty
> its cache before accepting more data. In theory the network is 1Gb/s
> (measured with iperf, it is really close to that) and the SATA is 6Gb/s so
> it should not be the limit, but I have strong doubts as to how this speed is
> measured by the manufacturer.

Please be aware there is a difference between Gb and GB: one is gigabit, the
other gigabyte. 1 Gb/s is theoretically 125 MB/s, and after deducting
network overhead you get around 117 MB/s net bandwidth. Modern 3.5" HDDs
read more than 200 MB/s in their fastest areas, 2.5" not so much. In their
slowest region, that can go down to 50-70 MB/s.

--
Grüße | Greetings | Salut | Qapla’
Please do not share anything from, with or about me on any social network.

A peach is like an apple covered by a carpet.
Re: Network throughput from main Gentoo rig to NAS box.
Mark Knecht wrote:
>
> <SNIP>
>
> Let's not forget that Dale also likes to put layers of things on his
> drives, LVM & encryption at a minimum. We also don't know anything
> about his block sizes at either end of this pipe.
>
> I would think maybe running iotop AND btop on both ends would give some 
> clues on timing. Is the time when gkrellm is idle due to the host disk
> not 
> responding or the target getting flooded with too much data?
>
> - Mark


This is true.  I have LVM on the bottom layer, dm-crypt (cryptsetup)
above that, and then the file system.  It's the only way I can have more
than one drive and encrypt it.  Well, there's ZFS but I've already been
down that path.  ;-)

I posted in another reply a picture that shows this same thing happening
even when the copy process is on the same rig and even on the same LV. 
That should rule out network issues.  I think as was pointed out, it is
transferring the data until the cache fills up and then it has to wait
for it to catch up then repeat.  It could be encryption slows that
process down a good bit, it could be LVM, or both.  I do know when I did
a test copy to the NAS box when the drives on the NAS box were not
encrypted, it was a good bit faster, about double or more.  Once I got
LVM and the encryption set up, it was slow again.  Also just like when
using Truenas. 

Anyway, I think the network is ruled out at least.  I'm not sure what
can be done at this point.  If it is a cache or drive can't keep up
issue, we can't fix that.  At least it does work, even if it takes a
while to get there.  :-D  Only took almost two weeks to copy over to my
backups.  ROFL 

Thanks to all for the ideas, help, suggestions etc etc.

Dale

:-)  :-) 
Re: Network throughput from main Gentoo rig to NAS box.
Tsukasa Mcp_Reznor wrote:
> There are quite a few things to tweak that can lead to much smoother transfers, so I'll make an unordered list to help.
>
> mount -o nocto,nolock,async,nconnect=4,rsize=1048576,wsize=1048576
> <SNIP>
>


I finished my OS updates and started my weekly backup updates.  I
mounted using your options and this is a decent improvement.  I'm not
sure which option makes it faster but it is faster, almost double.  A
few examples using fairly large file sizes for good results. 


3,519,790,127 100%   51.46MB/s    0:01:05
3,519,632,300 100%   51.97MB/s    0:01:04
3,518,456,042 100%   51.20MB/s    0:01:05


It may not look like much, still slower than just a straight copy with
no encryption, but given previous speeds, this is a nice improvement.  I
think I was getting about 25 to 30MB/s before.  These are the
settings shown by the mount command now, which should be what it is using.


root@fireball / # mount | grep TV
10.0.0.7:/mnt/backup on /mnt/TV_Backup type nfs4
(rw,relatime,vers=4.0,rsize=1048576,wsize=1048576,namlen=255,hard,nocto,proto=tcp,nconnect=4,timeo=600,retrans=2,sec=sys,clientaddr=10.0.0.4,local_lock=none,addr=10.0.0.7)
root@fireball / #


I think it took all your options and is using them.  If you have ideas
that would speed things up more, I'm open to it, but this is a nice
improvement.  I still think the encryption slows things down some,
especially on the NAS end, which is a much older machine where encryption
is likely fairly CPU intensive.  A newer CPU with the same clock speed and
number of cores would likely do much better, with newer instruction support
and all.  I think I read somewhere that newer CPUs have extra stuff to
speed encryption up.  I might be wrong on that.
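
That extra stuff is the AES instruction set (AES-NI); a quick sketch to
check for it and see what the CPU can push through dm-crypt, assuming
cryptsetup is installed on both rigs:

grep -m1 -ow aes /proc/cpuinfo && echo "AES-NI present" || echo "no AES acceleration"
cryptsetup benchmark    # prints per-cipher throughput, e.g. aes-xts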

Thanks much.  Any additional ideas are welcome, from anyone who has
them.  If it matters, both rigs are on UPSs.

Dale

:-)  :-) 

P. S.  Those who know I garden, my turnip and mustard greens are popping
up.  My kale and collards are not up yet.  I watered them again to help
them pop up.  Kinda dry here and no rain until the end of the next week,
they think.  They never really know what the weather is going to do
anyway. 
Re: Network throughput from main Gentoo rig to NAS box.
________________________________________
From: Dale <rdalek1967@gmail.com>
Sent: Sunday, October 1, 2023 1:29 PM
To: gentoo-user@lists.gentoo.org
Subject: Re: [gentoo-user] Network throughput from main Gentoo rig to NAS box.

Tsukasa Mcp_Reznor wrote:
<SNIP>


----------------------------------------------------------------------------

Glad you got some decent benefits. I just now realized that the "async" setting is supposed to be done on the server in /etc/exports, not under the client mount. Please give that a shot; I think it'll make a big difference for you.
Re: Network throughput from main Gentoo rig to NAS box.
Tsukasa Mcp_Reznor wrote:
>
> Glad you got some decent benefits. I just now realized that the "async" setting is supposed to be done on the server in /etc/exports, not under the client mount. Please give that a shot; I think it'll make a big difference for you.
>
>


My hard drives are still locked in the safe but I thought I'd boot the
old rig and see what the setting is.  These are my current settings.


/mnt/backup     10.0.0.4/(rw,sync,no_subtree_check)


I'll change it to async and see what it does next weekend.  As it is, I
don't have anything new to back up yet.
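
A sketch of the changed export plus the reload, with the usual caveat that
async means the server acknowledges writes before they reach the disk, so
a server crash can lose data that was still in flight:

# /etc/exports on the NAS
/mnt/backup     10.0.0.4(rw,async,no_subtree_check)

# apply the change without restarting the NFS server
exportfs -ra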

Thanks.

Dale

:-) :-)