Mailing List Archive

Difference in buffer behaviour between 8.89 and 8.9?
Hey all,

I've run in to a strange issue and I'm wondering if anyone has any
insight on this. So I have a modified version of OpenSSH called hpnssh
that uses larger receive buffer sizes to improve throughput. When I
ported that patch set to 8.9 I found out after an initial burst of data
the throughput would drop to zero after an initial burst and just sort
of stay there. This happens regardless of the RTT; same issues at
sub-millisecond RTTs and 150ms.

I've been able to replicate this behaviour in OpenSSH by increasing the
size of CHAN_SES_DEFAULT_WINDOW from 64*CHAN_SES_PACKET_DEFAULT to
higher values. You kind of see it in the debug log at 96*CSPD and you
can start to see it impact throughput at 128*CSDP. With my tests I have
it at 512*CSPD and the pause will last upwards of 5 seconds. When it's
at 1024*CSPD (corresponding to a 32MB receive buffer) the pause will
last 18 seconds or more.

I dropped in a debug statement in to sshbuf.c in sshbuf_allocate to get
a better view of whats happening. The following are the results of that
in both 8.9p1 and 8.8p1


Version 8.9p1

Start of buffer growth

debug1: adjusted rlen: 33024, len: 32788 for 0x55d54820bd60
debug1: adjusted rlen: 33024, len: 32788 for 0x55d54820bd60
debug1: adjusted rlen: 33024, len: 32788 for 0x55d54820bd60
debug1: adjusted rlen: 33024, len: 32788 for 0x55d54820bd60
debug1: adjusted rlen: 33024, len: 32788 for 0x55d54820bd60
debug1: adjusted rlen: 98304, len: 32768 for 0x55d548211550
debug1: adjusted rlen: 33024, len: 32788 for 0x55d54820bd60
debug1: adjusted rlen: 131072, len: 32768 for 0x55d548211550
debug1: adjusted rlen: 33024, len: 32788 for 0x55d54820bd60
debug1: adjusted rlen: 33024, len: 32788 for 0x55d54820bd60
debug1: adjusted rlen: 33024, len: 32788 for 0x55d54820bd60
debug1: adjusted rlen: 163840, len: 32768 for 0x55d548211550
debug1: adjusted rlen: 33024, len: 32788 for 0x55d54820bd60
debug1: adjusted rlen: 196608, len: 32768 for 0x55d548211550
debug1: adjusted rlen: 33024, len: 32788 for 0x55d54820bd60
debug1: adjusted rlen: 229376, len: 32768 for 0x55d548211550

debug1: adjusted rlen: 22003712, len: 32768 for 0x55d548211550
debug1: adjusted rlen: 33024, len: 32788 for 0x55d54820bd60
debug1: adjusted rlen: 22036480, len: 32768 for 0x55d548211550
debug1: adjusted rlen: 33024, len: 32788 for 0x55d54820bd60
debug1: adjusted rlen: 22069248, len: 32768 for 0x55d548211550
debug1: adjusted rlen: 16640, len: 16404 for 0x55d54820bd60
debug1: adjusted rlen: 22085632, len: 16384 for 0x55d548211550
debug1: adjusted rlen: 16640, len: 16404 for 0x55d54820bd60
debug1: adjusted rlen: 22102016, len: 16384 for 0x55d548211550
debug1: adjusted rlen: 33024, len: 32788 for 0x55d54820bd60
debug1: adjusted rlen: 22134784, len: 32768 for 0x55d548211550
debug1: adjusted rlen: 33024, len: 32788 for 0x55d54820bd60
debug1: adjusted rlen: 22167552, len: 32768 for 0x55d548211550
debug1: adjusted rlen: 33024, len: 32788 for 0x55d54820bd60
debug1: adjusted rlen: 22200320, len: 32768 for 0x55d548211550
debug1: adjusted rlen: 16640, len: 16404 for 0x55d54820bd60
debug1: adjusted rlen: 22216704, len: 16384 for 0x55d548211550
debug1: adjusted rlen: 16640, len: 16404 for 0x55d54820bd60
debug1: adjusted rlen: 22233088, len: 16384 for 0x55d548211550
debug1: adjusted rlen: 33024, len: 32788 for 0x55d54820bd60
debug1: adjusted rlen: 22265856, len: 32768 for 0x55d548211550
debug1: adjusted rlen: 33024, len: 32788 for 0x55d54820bd60
debug1: adjusted rlen: 22298624, len: 32768 for 0x55d548211550
debug1: adjusted rlen: 33024, len: 32788 for 0x55d54820bd60
debug1: adjusted rlen: 22331392, len: 32768 for 0x55d548211550
debug1: adjusted rlen: 16640, len: 16404 for 0x55d54820bd60
debug1: adjusted rlen: 22347776, len: 16384 for 0x55d548211550
debug1: adjusted rlen: 16640, len: 16404 for 0x55d54820bd60
debug1: adjusted rlen: 33024, len: 32788 for 0x55d54820bd60
debug1: adjusted rlen: 33024, len: 32788 for 0x55d54820bd60
debug1: adjusted rlen: 33024, len: 32788 for 0x55d54820bd60
— end of buffer growth

So there is a buffer that kicks in after an short initial period of good
throughput that has then has to grow larger that the size of
CHAN_SES_WINDOW_DEFAULT (128*CSPD in this case) before throughput can
start to pick up again. During this growth period CPU usage is pegged at
100% even though throughput is minimal.


Version 8.8p1

Start buffer adjustment

debug1: adjusted rlen: 33024, len: 32788 for 0x5570cf28d410
debug1: adjusted rlen: 33024, len: 32788 for 0x5570cf28d410
debug1: adjusted rlen: 49152, len: 32768 for 0x5570cf28a3e0
debug1: adjusted rlen: 33024, len: 32788 for 0x5570cf28d410
debug1: adjusted rlen: 81920, len: 32768 for 0x5570cf28a3e0
debug1: adjusted rlen: 33024, len: 32788 for 0x5570cf28d410
debug1: adjusted rlen: 114688, len: 32768 for 0x5570cf28a3e0
debug1: adjusted rlen: 33024, len: 32788 for 0x5570cf28d410
debug1: adjusted rlen: 147456, len: 32768 for 0x5570cf28a3e0
debug1: adjusted rlen: 33024, len: 32788 for 0x5570cf28d410
debug1: adjusted rlen: 180224, len: 32768 for 0x5570cf28a3e0
debug1: adjusted rlen: 33024, len: 32788 for 0x5570cf28d410
debug1: adjusted rlen: 212992, len: 32768 for 0x5570cf28a3e0
debug1: adjusted rlen: 33024, len: 32788 for 0x5570cf28d410
debug1: adjusted rlen: 245760, len: 32768 for 0x5570cf28a3e0
debug1: adjusted rlen: 33024, len: 32788 for 0x5570cf28d410
debug1: adjusted rlen: 278528, len: 32768 for 0x5570cf28a3e0
debug1: adjusted rlen: 33024, len: 32788 for 0x5570cf28d410
debug1: adjusted rlen: 311296, len: 32768 for 0x5570cf28a3e0
debug1: adjusted rlen: 33024, len: 32788 for 0x5570cf28d410
debug1: adjusted rlen: 344064, len: 32768 for 0x5570cf28a3e0
debug1: adjusted rlen: 33024, len: 32788 for 0x5570cf28d410
debug1: adjusted rlen: 376832, len: 32768 for 0x5570cf28a3e0
debug1: adjusted rlen: 33024, len: 32788 for 0x5570cf28d410
debug1: adjusted rlen: 409600, len: 32768 for 0x5570cf28a3e0
debug1: adjusted rlen: 33024, len: 32788 for 0x5570cf28d410
debug1: adjusted rlen: 442368, len: 32768 for 0x5570cf28a3e0
debug1: adjusted rlen: 33024, len: 32788 for 0x5570cf28d410
debug1: adjusted rlen: 475136, len: 32768 for 0x5570cf28a3e0
debug1: adjusted rlen: 33024, len: 32788 for 0x5570cf28d410
debug1: adjusted rlen: 507904, len: 32768 for 0x5570cf28a3e0
debug1: adjusted rlen: 33024, len: 32788 for 0x5570cf28d410
debug1: adjusted rlen: 540672, len: 32768 for 0x5570cf28a3e0
debug1: adjusted rlen: 33024, len: 32788 for 0x5570cf28d410
debug1: adjusted rlen: 573440, len: 32768 for 0x5570cf28a3e0
debug1: adjusted rlen: 33024, len: 32788 for 0x5570cf28d410
debug1: adjusted rlen: 606208, len: 32768 for 0x5570cf28a3e0
debug1: adjusted rlen: 33024, len: 32788 for 0x5570cf28d410
debug1: adjusted rlen: 638976, len: 32768 for 0x5570cf28a3e0
debug1: adjusted rlen: 33024, len: 32788 for 0x5570cf28d410
debug1: adjusted rlen: 671744, len: 32768 for 0x5570cf28a3e0
debug1: adjusted rlen: 33024, len: 32788 for 0x5570cf28d410
debug1: adjusted rlen: 704512, len: 32768 for 0x5570cf28a3e0
debug1: adjusted rlen: 33024, len: 32788 for 0x5570cf28d410
debug1: adjusted rlen: 737280, len: 32768 for 0x5570cf28a3e0
debug1: adjusted rlen: 33024, len: 32788 for 0x5570cf28d410
debug1: adjusted rlen: 770048, len: 32768 for 0x5570cf28a3e0
debug1: adjusted rlen: 33024, len: 32788 for 0x5570cf28d410
debug1: adjusted rlen: 802816, len: 32768 for 0x5570cf28a3e0
debug1: adjusted rlen: 33024, len: 32788 for 0x5570cf28d410
debug1: adjusted rlen: 835584, len: 32768 for 0x5570cf28a3e0
debug1: adjusted rlen: 33024, len: 32788 for 0x5570cf28d410
debug1: adjusted rlen: 868352, len: 32768 for 0x5570cf28a3e0
debug1: adjusted rlen: 33024, len: 32788 for 0x5570cf28d410
debug1: adjusted rlen: 901120, len: 32768 for 0x5570cf28a3e0
debug1: adjusted rlen: 33024, len: 32788 for 0x5570cf28d410
debug1: adjusted rlen: 933888, len: 32768 for 0x5570cf28a3e0
debug1: adjusted rlen: 33024, len: 32788 for 0x5570cf28d410
debug1: adjusted rlen: 33024, len: 32788 for 0x5570cf28d410
debug1: adjusted rlen: 33024, len: 32788 for 0x5570cf28d410
debug1: adjusted rlen: 33024, len: 32788 for 0x5570cf28d410

End buffer adjustment

In version 8.8 I also see a buffer growing during the course of the
transfer but it looks like it doesn’t need to grow to the same size as
in 8.9p1 (in this case WINDOW_DEFAULT was set to 4096*CSPD). This is on
the same test bed, same set up, and the tests are being run right after
each other.

The behaviour in 8.9 is very consistent. After a certain number of
adjusted rlens for, what I am assuming is the packet buffer, it goes
into this growth phase for another buffer. In my test that been about
700 of these adjusts which is unrelated to the size of
CHAN_SES_WINDOW_DEFAULT

So I’m wondering what changed between 8.8 and 8.9 that might account for
this and if this is expected/desired behaviour. I'm still working
through the diff from 8.8 to 8.9 but nothing has leaped out at me yet.

Thanks,
Chris
_______________________________________________
openssh-unix-dev mailing list
openssh-unix-dev@mindrot.org
https://lists.mindrot.org/mailman/listinfo/openssh-unix-dev
Re: Difference in buffer behaviour between 8.89 and 8.9? [ In reply to ]
Of course, 90 minutes later I find that my issues starts with commit
b30d32159d

"upstream: add a ssh_packet_process_read() function that reads from

a fd directly into the transport input buffer.

Use this in the client and server mainloops to avoid unnecessary
copying. It also lets us use a more greedy read size without penalty.

Yields a 2-3% performance gain on cipher-speed.sh (in a fairly
unscientific test tbf)"

I still don't know why it's causing this behaviour so if anyone has a
clue please share?

Thanks
Chris



On 5/18/22 2:02 PM, rapier wrote:
> Hey all,
>
> I've run in to a strange issue and I'm wondering if anyone has any
> insight on this. So I have a modified version of OpenSSH called hpnssh
> that uses larger receive buffer sizes to improve throughput. When I
> ported that patch set to 8.9 I found out after an initial burst of data
> the throughput would drop to zero after an initial burst and just sort
> of stay there. This happens regardless of the RTT; same issues at
> sub-millisecond RTTs and 150ms.
>
> I've been able to replicate this behaviour in OpenSSH by increasing the
> size of CHAN_SES_DEFAULT_WINDOW from 64*CHAN_SES_PACKET_DEFAULT to
> higher values. You kind of see it in the debug log at 96*CSPD and you
> can start to see it impact throughput at 128*CSDP. With my tests I have
> it at 512*CSPD and the pause will last upwards of 5 seconds. When it's
> at 1024*CSPD (corresponding to a 32MB receive buffer) the pause will
> last 18 seconds or more.
>
> I dropped in a debug statement in to sshbuf.c in sshbuf_allocate to get
> a better view of whats happening. The following are the results of that
> in both 8.9p1 and 8.8p1
>
>
> Version 8.9p1
>
> Start of buffer growth
>
> debug1: adjusted rlen: 33024, len: 32788 for 0x55d54820bd60
> debug1: adjusted rlen: 33024, len: 32788 for 0x55d54820bd60
> debug1: adjusted rlen: 33024, len: 32788 for 0x55d54820bd60
> debug1: adjusted rlen: 33024, len: 32788 for 0x55d54820bd60
> debug1: adjusted rlen: 33024, len: 32788 for 0x55d54820bd60
> debug1: adjusted rlen: 98304, len: 32768 for 0x55d548211550
> debug1: adjusted rlen: 33024, len: 32788 for 0x55d54820bd60
> debug1: adjusted rlen: 131072, len: 32768 for 0x55d548211550
> debug1: adjusted rlen: 33024, len: 32788 for 0x55d54820bd60
> debug1: adjusted rlen: 33024, len: 32788 for 0x55d54820bd60
> debug1: adjusted rlen: 33024, len: 32788 for 0x55d54820bd60
> debug1: adjusted rlen: 163840, len: 32768 for 0x55d548211550
> debug1: adjusted rlen: 33024, len: 32788 for 0x55d54820bd60
> debug1: adjusted rlen: 196608, len: 32768 for 0x55d548211550
> debug1: adjusted rlen: 33024, len: 32788 for 0x55d54820bd60
> debug1: adjusted rlen: 229376, len: 32768 for 0x55d548211550
> …
> debug1: adjusted rlen: 22003712, len: 32768 for 0x55d548211550
> debug1: adjusted rlen: 33024, len: 32788 for 0x55d54820bd60
> debug1: adjusted rlen: 22036480, len: 32768 for 0x55d548211550
> debug1: adjusted rlen: 33024, len: 32788 for 0x55d54820bd60
> debug1: adjusted rlen: 22069248, len: 32768 for 0x55d548211550
> debug1: adjusted rlen: 16640, len: 16404 for 0x55d54820bd60
> debug1: adjusted rlen: 22085632, len: 16384 for 0x55d548211550
> debug1: adjusted rlen: 16640, len: 16404 for 0x55d54820bd60
> debug1: adjusted rlen: 22102016, len: 16384 for 0x55d548211550
> debug1: adjusted rlen: 33024, len: 32788 for 0x55d54820bd60
> debug1: adjusted rlen: 22134784, len: 32768 for 0x55d548211550
> debug1: adjusted rlen: 33024, len: 32788 for 0x55d54820bd60
> debug1: adjusted rlen: 22167552, len: 32768 for 0x55d548211550
> debug1: adjusted rlen: 33024, len: 32788 for 0x55d54820bd60
> debug1: adjusted rlen: 22200320, len: 32768 for 0x55d548211550
> debug1: adjusted rlen: 16640, len: 16404 for 0x55d54820bd60
> debug1: adjusted rlen: 22216704, len: 16384 for 0x55d548211550
> debug1: adjusted rlen: 16640, len: 16404 for 0x55d54820bd60
> debug1: adjusted rlen: 22233088, len: 16384 for 0x55d548211550
> debug1: adjusted rlen: 33024, len: 32788 for 0x55d54820bd60
> debug1: adjusted rlen: 22265856, len: 32768 for 0x55d548211550
> debug1: adjusted rlen: 33024, len: 32788 for 0x55d54820bd60
> debug1: adjusted rlen: 22298624, len: 32768 for 0x55d548211550
> debug1: adjusted rlen: 33024, len: 32788 for 0x55d54820bd60
> debug1: adjusted rlen: 22331392, len: 32768 for 0x55d548211550
> debug1: adjusted rlen: 16640, len: 16404 for 0x55d54820bd60
> debug1: adjusted rlen: 22347776, len: 16384 for 0x55d548211550
> debug1: adjusted rlen: 16640, len: 16404 for 0x55d54820bd60
> debug1: adjusted rlen: 33024, len: 32788 for 0x55d54820bd60
> debug1: adjusted rlen: 33024, len: 32788 for 0x55d54820bd60
> debug1: adjusted rlen: 33024, len: 32788 for 0x55d54820bd60
> — end of buffer growth
>
> So there is a buffer that kicks in after an short initial period of good
> throughput that has then has to grow larger that the size of
> CHAN_SES_WINDOW_DEFAULT (128*CSPD in this case) before throughput can
> start to pick up again. During this growth period CPU usage is pegged at
> 100% even though throughput is minimal.
>
>
> Version 8.8p1
>
> Start buffer adjustment
>
> debug1: adjusted rlen: 33024, len: 32788 for 0x5570cf28d410
> debug1: adjusted rlen: 33024, len: 32788 for 0x5570cf28d410
> debug1: adjusted rlen: 49152, len: 32768 for 0x5570cf28a3e0
> debug1: adjusted rlen: 33024, len: 32788 for 0x5570cf28d410
> debug1: adjusted rlen: 81920, len: 32768 for 0x5570cf28a3e0
> debug1: adjusted rlen: 33024, len: 32788 for 0x5570cf28d410
> debug1: adjusted rlen: 114688, len: 32768 for 0x5570cf28a3e0
> debug1: adjusted rlen: 33024, len: 32788 for 0x5570cf28d410
> debug1: adjusted rlen: 147456, len: 32768 for 0x5570cf28a3e0
> debug1: adjusted rlen: 33024, len: 32788 for 0x5570cf28d410
> debug1: adjusted rlen: 180224, len: 32768 for 0x5570cf28a3e0
> debug1: adjusted rlen: 33024, len: 32788 for 0x5570cf28d410
> debug1: adjusted rlen: 212992, len: 32768 for 0x5570cf28a3e0
> debug1: adjusted rlen: 33024, len: 32788 for 0x5570cf28d410
> debug1: adjusted rlen: 245760, len: 32768 for 0x5570cf28a3e0
> debug1: adjusted rlen: 33024, len: 32788 for 0x5570cf28d410
> debug1: adjusted rlen: 278528, len: 32768 for 0x5570cf28a3e0
> debug1: adjusted rlen: 33024, len: 32788 for 0x5570cf28d410
> debug1: adjusted rlen: 311296, len: 32768 for 0x5570cf28a3e0
> debug1: adjusted rlen: 33024, len: 32788 for 0x5570cf28d410
> debug1: adjusted rlen: 344064, len: 32768 for 0x5570cf28a3e0
> debug1: adjusted rlen: 33024, len: 32788 for 0x5570cf28d410
> debug1: adjusted rlen: 376832, len: 32768 for 0x5570cf28a3e0
> debug1: adjusted rlen: 33024, len: 32788 for 0x5570cf28d410
> debug1: adjusted rlen: 409600, len: 32768 for 0x5570cf28a3e0
> debug1: adjusted rlen: 33024, len: 32788 for 0x5570cf28d410
> debug1: adjusted rlen: 442368, len: 32768 for 0x5570cf28a3e0
> debug1: adjusted rlen: 33024, len: 32788 for 0x5570cf28d410
> debug1: adjusted rlen: 475136, len: 32768 for 0x5570cf28a3e0
> debug1: adjusted rlen: 33024, len: 32788 for 0x5570cf28d410
> debug1: adjusted rlen: 507904, len: 32768 for 0x5570cf28a3e0
> debug1: adjusted rlen: 33024, len: 32788 for 0x5570cf28d410
> debug1: adjusted rlen: 540672, len: 32768 for 0x5570cf28a3e0
> debug1: adjusted rlen: 33024, len: 32788 for 0x5570cf28d410
> debug1: adjusted rlen: 573440, len: 32768 for 0x5570cf28a3e0
> debug1: adjusted rlen: 33024, len: 32788 for 0x5570cf28d410
> debug1: adjusted rlen: 606208, len: 32768 for 0x5570cf28a3e0
> debug1: adjusted rlen: 33024, len: 32788 for 0x5570cf28d410
> debug1: adjusted rlen: 638976, len: 32768 for 0x5570cf28a3e0
> debug1: adjusted rlen: 33024, len: 32788 for 0x5570cf28d410
> debug1: adjusted rlen: 671744, len: 32768 for 0x5570cf28a3e0
> debug1: adjusted rlen: 33024, len: 32788 for 0x5570cf28d410
> debug1: adjusted rlen: 704512, len: 32768 for 0x5570cf28a3e0
> debug1: adjusted rlen: 33024, len: 32788 for 0x5570cf28d410
> debug1: adjusted rlen: 737280, len: 32768 for 0x5570cf28a3e0
> debug1: adjusted rlen: 33024, len: 32788 for 0x5570cf28d410
> debug1: adjusted rlen: 770048, len: 32768 for 0x5570cf28a3e0
> debug1: adjusted rlen: 33024, len: 32788 for 0x5570cf28d410
> debug1: adjusted rlen: 802816, len: 32768 for 0x5570cf28a3e0
> debug1: adjusted rlen: 33024, len: 32788 for 0x5570cf28d410
> debug1: adjusted rlen: 835584, len: 32768 for 0x5570cf28a3e0
> debug1: adjusted rlen: 33024, len: 32788 for 0x5570cf28d410
> debug1: adjusted rlen: 868352, len: 32768 for 0x5570cf28a3e0
> debug1: adjusted rlen: 33024, len: 32788 for 0x5570cf28d410
> debug1: adjusted rlen: 901120, len: 32768 for 0x5570cf28a3e0
> debug1: adjusted rlen: 33024, len: 32788 for 0x5570cf28d410
> debug1: adjusted rlen: 933888, len: 32768 for 0x5570cf28a3e0
> debug1: adjusted rlen: 33024, len: 32788 for 0x5570cf28d410
> debug1: adjusted rlen: 33024, len: 32788 for 0x5570cf28d410
> debug1: adjusted rlen: 33024, len: 32788 for 0x5570cf28d410
> debug1: adjusted rlen: 33024, len: 32788 for 0x5570cf28d410
>
> End buffer adjustment
>
> In version 8.8 I also see a buffer growing during the course of the
> transfer but it looks like it doesn’t need to grow to the same size as
> in 8.9p1 (in this case WINDOW_DEFAULT was set to 4096*CSPD). This is on
> the same test bed, same set up, and the tests are being run right after
> each other.
>
> The behaviour in 8.9 is very consistent. After a certain number of
> adjusted rlens for, what I am assuming is the packet buffer, it goes
> into this growth phase for another buffer. In my test that been about
> 700 of these adjusts which is unrelated to the size of
> CHAN_SES_WINDOW_DEFAULT
>
> So I’m wondering what changed between 8.8 and 8.9 that might account for
> this and if this is expected/desired behaviour. I'm still working
> through the diff from 8.8 to 8.9 but nothing has leaped out at me yet.
>
> Thanks,
> Chris
_______________________________________________
openssh-unix-dev mailing list
openssh-unix-dev@mindrot.org
https://lists.mindrot.org/mailman/listinfo/openssh-unix-dev
Re: Difference in buffer behaviour between 8.89 and 8.9? [ In reply to ]
On Wed, 18 May 2022, rapier wrote:

> Hey all,
>
> I've run in to a strange issue and I'm wondering if anyone has any
> insight on this. So I have a modified version of OpenSSH called hpnssh
> that uses larger receive buffer sizes to improve throughput. When I
> ported that patch set to 8.9 I found out after an initial burst of data
> the throughput would drop to zero after an initial burst and just sort
> of stay there. This happens regardless of the RTT; same issues at
> sub-millisecond RTTs and 150ms.
>
> I've been able to replicate this behaviour in OpenSSH by increasing the
> size of CHAN_SES_DEFAULT_WINDOW from 64*CHAN_SES_PACKET_DEFAULT to
> higher values. You kind of see it in the debug log at 96*CSPD and you
> can start to see it impact throughput at 128*CSDP. With my tests I have
> it at 512*CSPD and the pause will last upwards of 5 seconds. When it's
> at 1024*CSPD (corresponding to a 32MB receive buffer) the pause will
> last 18 seconds or more.
>
> I dropped in a debug statement in to sshbuf.c in sshbuf_allocate to get
> a better view of whats happening. The following are the results of that
> in both 8.9p1 and 8.8p1

[snip]

> In version 8.8 I also see a buffer growing during the course of the
> transfer but it looks like it doesn’t need to grow to the same size as
> in 8.9p1 (in this case WINDOW_DEFAULT was set to 4096*CSPD). This is on
> the same test bed, same set up, and the tests are being run right after
> each other.
>
> The behaviour in 8.9 is very consistent. After a certain number of
> adjusted rlens for, what I am assuming is the packet buffer, it goes
> into this growth phase for another buffer. In my test that been about
> 700 of these adjusts which is unrelated to the size of
> CHAN_SES_WINDOW_DEFAULT
>
> So I’m wondering what changed between 8.8 and 8.9 that might account for
> this and if this is expected/desired behaviour. I'm still working
> through the diff from 8.8 to 8.9 but nothing has leaped out at me yet.

8.9 switch from select() to poll() and included a couple of bugs that
could cause weird problems. IMO you should try to port to what's on
the top of the V_9_0 git branch, which is 9.0 + one more poll()-
related fix.

-d
_______________________________________________
openssh-unix-dev mailing list
openssh-unix-dev@mindrot.org
https://lists.mindrot.org/mailman/listinfo/openssh-unix-dev
Re: Difference in buffer behaviour between 8.89 and 8.9? [ In reply to ]
Damien

> 8.9 switch from select() to poll() and included a couple of bugs that
> could cause weird problems. IMO you should try to port to what's on
> the top of the V_9_0 git branch, which is 9.0 + one more poll()-
> related fix.

I just tried it with the head (commit bedb93415b) of the master branch.
Unfortunately, I'm still seeing the issue.

debug1: adjusted rlen: 33024, len: 32788 for 0x5609805a3eb0
debug1: adjusted rlen: 36438016, len: 32768 for 0x5609805b0ed0
debug1: adjusted rlen: 33024, len: 32788 for 0x5609805a3eb0
debug1: adjusted rlen: 36470784, len: 32768 for 0x5609805b0ed0
debug1: adjusted rlen: 33024, len: 32788 for 0x5609805a3eb0
debug1: adjusted rlen: 36503552, len: 32768 for 0x5609805b0ed0

Chris
_______________________________________________
openssh-unix-dev mailing list
openssh-unix-dev@mindrot.org
https://lists.mindrot.org/mailman/listinfo/openssh-unix-dev
RE: Difference in buffer behaviour between 8.89 and 8.9? [ In reply to ]
On May 19, 2022 11:23 AM Chris wrote:
>Damien
>
>> 8.9 switch from select() to poll() and included a couple of bugs that
>> could cause weird problems. IMO you should try to port to what's on
>> the top of the V_9_0 git branch, which is 9.0 + one more poll()-
>> related fix.
>
>I just tried it with the head (commit bedb93415b) of the master branch.
>Unfortunately, I'm still seeing the issue.
>
>debug1: adjusted rlen: 33024, len: 32788 for 0x5609805a3eb0
>debug1: adjusted rlen: 36438016, len: 32768 for 0x5609805b0ed0
>debug1: adjusted rlen: 33024, len: 32788 for 0x5609805a3eb0
>debug1: adjusted rlen: 36470784, len: 32768 for 0x5609805b0ed0
>debug1: adjusted rlen: 33024, len: 32788 for 0x5609805a3eb0
>debug1: adjusted rlen: 36503552, len: 32768 for 0x5609805b0ed0

I am watching this situation also. We are unable to get OpenSSH to work on
NonStop since select was replaced with poll - mostly because poll is no
longer on the platform. It is emulated only, so we are getting an "Illegal
Function Argument" reported to the remote, which reports it back. Hopefully
your resolution will also work for us.
--Randall

_______________________________________________
openssh-unix-dev mailing list
openssh-unix-dev@mindrot.org
https://lists.mindrot.org/mailman/listinfo/openssh-unix-dev
Re: Difference in buffer behaviour between 8.89 and 8.9? [ In reply to ]
I think I have a fix, or at least a better band aid, for people in my
situation. If, at some point, you decide to increase the size of
CHAN_SES_WINDOW_DEFAULT this may be helpful.

Basically, the changes to channels.h increases the size of SSH's receive
buffer, which triggers this behaviour. The changes in sshbuf.c tests to
see if the buffer is larger than ssh buffer that seems to be handling
the incoming packets. The assumption is that if it's larger then it's
the receive buffer. I then aggressively grow that buffer (4MB at a time)
but no more buf->max_size. The debug statement is so you can see what's
going on but that should be removed for production.

If you comment out the section that grows the window aggressively you
can see the stall as the buffer grows 32k at a time. I have some
concerns about growing the buffer it such large chunks. Also, I'd
ideally make it a function of the size of c->local_window_max but I
can't seem to get an extern working.

Chris

This patch is against the head of the master branch of openssh-portable.

diff --git a/channels.h b/channels.h
index 828c1b61b..ae897680c 100644
--- a/channels.h
+++ b/channels.h
@@ -210,7 +210,7 @@ struct Channel {

/* default window/packet sizes for tcp/x11-fwd-channel */
#define CHAN_SES_PACKET_DEFAULT (32*1024)
-#define CHAN_SES_WINDOW_DEFAULT (64*CHAN_SES_PACKET_DEFAULT)
+#define CHAN_SES_WINDOW_DEFAULT (4096*CHAN_SES_PACKET_DEFAULT)
#define CHAN_TCP_PACKET_DEFAULT (32*1024)
#define CHAN_TCP_WINDOW_DEFAULT (64*CHAN_TCP_PACKET_DEFAULT)
#define CHAN_X11_PACKET_DEFAULT (16*1024)
diff --git a/sshbuf.c b/sshbuf.c
index 840b899b1..b45720e1f 100644
--- a/sshbuf.c
+++ b/sshbuf.c
@@ -27,6 +27,7 @@
#include "ssherr.h"
#include "sshbuf.h"
#include "misc.h"
+#include "log.h"

static inline int
sshbuf_check_sanity(const struct sshbuf *buf)
@@ -327,9 +328,27 @@ sshbuf_allocate(struct sshbuf *buf, size_t len)
*/
need = len + buf->size - buf->alloc;
rlen = ROUNDUP(buf->alloc + need, SSHBUF_SIZE_INC);
+ /* I think the receive buffer is the one that is
+ * growing slowly. This buffer quickly ends up being larger
+ * than the typical packet buffer (usually 32.25KB) so if
+ * we explicitly test for growth needs larger than that we
+ * can accelerate the growth of this buffer and reduce load
+ * the CPU and improve throughput. Ideally we would use
+ * (local_window_max - rlen) as need but we don't have access
+ * to that here */
+ if (rlen > 34*1024) {
+ need = 4 * 1024 * 1024;
+ rlen = ROUNDUP(buf->alloc + need, SSHBUF_SIZE_INC);
+ if (rlen > buf->max_size)
+ rlen = buf->max_size;
+ }
SSHBUF_DBG(("need %zu initial rlen %zu", need, rlen));
if (rlen > buf->max_size)
rlen = buf->alloc + need;
+ /* be sure to include log.h if you uncomment the debug
+ * this debug helped identify the buffer growth issue in v8.9
+ * see the git log about it. search for sshbuf_read */
+ debug("adjusted rlen: %zu, len: %lu for %p", rlen, len, buf);
SSHBUF_DBG(("adjusted rlen %zu", rlen));
if ((dp = recallocarray(buf->d, buf->alloc, rlen, 1)) == NULL) {
SSHBUF_DBG(("realloc fail"));
@@ -401,4 +420,3 @@ sshbuf_consume_end(struct sshbuf *buf, size_t len)
SSHBUF_TELL("done");
return 0;
}


On 5/19/22 11:23 AM, rapier wrote:
> Damien
>
>> 8.9 switch from select() to poll() and included a couple of bugs that
>> could cause weird problems. IMO you should try to port to what's on
>> the top of the V_9_0 git branch, which is 9.0 + one more poll()-
>> related fix.
>
> I just tried it with the head (commit bedb93415b) of the master branch.
> Unfortunately, I'm still seeing the issue.
>
> debug1: adjusted rlen: 33024, len: 32788 for 0x5609805a3eb0
> debug1: adjusted rlen: 36438016, len: 32768 for 0x5609805b0ed0
> debug1: adjusted rlen: 33024, len: 32788 for 0x5609805a3eb0
> debug1: adjusted rlen: 36470784, len: 32768 for 0x5609805b0ed0
> debug1: adjusted rlen: 33024, len: 32788 for 0x5609805a3eb0
> debug1: adjusted rlen: 36503552, len: 32768 for 0x5609805b0ed0
>
> Chris
_______________________________________________
openssh-unix-dev mailing list
openssh-unix-dev@mindrot.org
https://lists.mindrot.org/mailman/listinfo/openssh-unix-dev
RE: Difference in buffer behaviour between 8.89 and 8.9? [ In reply to ]
This may be part of it for me also, but in the other direction. I cannot
exceed 56000 bytes for any I/O unless through a buffered stream.

>-----Original Message-----
>From: openssh-unix-dev <openssh-unix-dev-
>bounces+rsbecker=nexbridge.com@mindrot.org> On Behalf Of rapier
>Sent: May 19, 2022 2:22 PM
>To: Damien Miller <djm@mindrot.org>
>Cc: openssh-unix-dev@mindrot.org
>Subject: Re: Difference in buffer behaviour between 8.89 and 8.9?
>
>I think I have a fix, or at least a better band aid, for people in my
situation. If, at
>some point, you decide to increase the size of CHAN_SES_WINDOW_DEFAULT this
>may be helpful.
>
>Basically, the changes to channels.h increases the size of SSH's receive
buffer,
>which triggers this behaviour. The changes in sshbuf.c tests to see if the
buffer is
>larger than ssh buffer that seems to be handling the incoming packets. The
>assumption is that if it's larger then it's the receive buffer. I then
aggressively grow
>that buffer (4MB at a time) but no more buf->max_size. The debug statement
is
>so you can see what's going on but that should be removed for production.
>
>If you comment out the section that grows the window aggressively you can
see
>the stall as the buffer grows 32k at a time. I have some concerns about
growing
>the buffer it such large chunks. Also, I'd ideally make it a function of
the size of c-
>>local_window_max but I can't seem to get an extern working.
>
>Chris
>
>This patch is against the head of the master branch of openssh-portable.
>
>diff --git a/channels.h b/channels.h
>index 828c1b61b..ae897680c 100644
>--- a/channels.h
>+++ b/channels.h
>@@ -210,7 +210,7 @@ struct Channel {
>
> /* default window/packet sizes for tcp/x11-fwd-channel */
> #define CHAN_SES_PACKET_DEFAULT (32*1024)
>-#define CHAN_SES_WINDOW_DEFAULT (64*CHAN_SES_PACKET_DEFAULT)
>+#define CHAN_SES_WINDOW_DEFAULT
>(4096*CHAN_SES_PACKET_DEFAULT)
> #define CHAN_TCP_PACKET_DEFAULT (32*1024)
> #define CHAN_TCP_WINDOW_DEFAULT (64*CHAN_TCP_PACKET_DEFAULT)
> #define CHAN_X11_PACKET_DEFAULT (16*1024)
>diff --git a/sshbuf.c b/sshbuf.c
>index 840b899b1..b45720e1f 100644
>--- a/sshbuf.c
>+++ b/sshbuf.c
>@@ -27,6 +27,7 @@
> #include "ssherr.h"
> #include "sshbuf.h"
> #include "misc.h"
>+#include "log.h"
>
> static inline int
> sshbuf_check_sanity(const struct sshbuf *buf) @@ -327,9 +328,27 @@
>sshbuf_allocate(struct sshbuf *buf, size_t len)
> */
> need = len + buf->size - buf->alloc;
> rlen = ROUNDUP(buf->alloc + need, SSHBUF_SIZE_INC);
>+ /* I think the receive buffer is the one that is
>+ * growing slowly. This buffer quickly ends up being larger
>+ * than the typical packet buffer (usually 32.25KB) so if
>+ * we explicitly test for growth needs larger than that we
>+ * can accelerate the growth of this buffer and reduce load
>+ * the CPU and improve throughput. Ideally we would use
>+ * (local_window_max - rlen) as need but we don't have access
>+ * to that here */
>+ if (rlen > 34*1024) {
>+ need = 4 * 1024 * 1024;
>+ rlen = ROUNDUP(buf->alloc + need, SSHBUF_SIZE_INC);
>+ if (rlen > buf->max_size)
>+ rlen = buf->max_size;
>+ }
> SSHBUF_DBG(("need %zu initial rlen %zu", need, rlen));
> if (rlen > buf->max_size)
> rlen = buf->alloc + need;
>+ /* be sure to include log.h if you uncomment the debug
>+ * this debug helped identify the buffer growth issue in v8.9
>+ * see the git log about it. search for sshbuf_read */
>+ debug("adjusted rlen: %zu, len: %lu for %p", rlen, len, buf);
> SSHBUF_DBG(("adjusted rlen %zu", rlen));
> if ((dp = recallocarray(buf->d, buf->alloc, rlen, 1)) == NULL) {
> SSHBUF_DBG(("realloc fail")); @@ -401,4 +420,3 @@
>sshbuf_consume_end(struct sshbuf *buf, size_t len)
> SSHBUF_TELL("done");
> return 0;
> }
>
>
>On 5/19/22 11:23 AM, rapier wrote:
>> Damien
>>
>>> 8.9 switch from select() to poll() and included a couple of bugs that
>>> could cause weird problems. IMO you should try to port to what's on
>>> the top of the V_9_0 git branch, which is 9.0 + one more poll()-
>>> related fix.
>>
>> I just tried it with the head (commit bedb93415b) of the master branch.
>> Unfortunately, I'm still seeing the issue.
>>
>> debug1: adjusted rlen: 33024, len: 32788 for 0x5609805a3eb0
>> debug1: adjusted rlen: 36438016, len: 32768 for 0x5609805b0ed0
>> debug1: adjusted rlen: 33024, len: 32788 for 0x5609805a3eb0
>> debug1: adjusted rlen: 36470784, len: 32768 for 0x5609805b0ed0
>> debug1: adjusted rlen: 33024, len: 32788 for 0x5609805a3eb0
>> debug1: adjusted rlen: 36503552, len: 32768 for 0x5609805b0ed0
>>
>> Chris
>_______________________________________________
>openssh-unix-dev mailing list
>openssh-unix-dev@mindrot.org
>https://lists.mindrot.org/mailman/listinfo/openssh-unix-dev

_______________________________________________
openssh-unix-dev mailing list
openssh-unix-dev@mindrot.org
https://lists.mindrot.org/mailman/listinfo/openssh-unix-dev
Re: Difference in buffer behaviour between 8.89 and 8.9? [ In reply to ]
On Thu, 19 May 2022, rapier wrote:

> then aggressively grow that buffer (4MB at a time)

Does this impact low-memory systems like my SPARCstation
and that old 80486 laptop where growing this much may fail?

bye,
//mirabilos
--
>> Why don't you use JavaScript? I also don't like enabling JavaScript in
> Because I use lynx as browser.
+1
-- Octavio Alvarez, me and ????? (Mario Lang) on debian-devel
_______________________________________________
openssh-unix-dev mailing list
openssh-unix-dev@mindrot.org
https://lists.mindrot.org/mailman/listinfo/openssh-unix-dev
Re: Difference in buffer behaviour between 8.89 and 8.9? [ In reply to ]
Honestly, I don't know. It shouldn't grow any larger than buf->max_size
which seems to have a hard limit of 134,217,728 bytes. That said, in
memory constrained systems the TCP receive buffer probably isn't going
to be large enough for this really to make a difference.

Ideally I'd like to set the growth of the buffer to be a function of the
tcp receive buffer. I am getting that information in hpnssh but getting
the extern from channels.c to sshbuf.c is causing a weird link error in
sshkey.c. I haven't nailed that down yet.

Note: resetting the CHAN_SES_DEFAULT_WINDOW to the size in the patch is
just to demonstrate the problem. If you aren't making that buffer bigger
than it's current size (of 2MB) this should never be an issue.

It is an issue for my work in hpnssh because we dynamically adjust SSH's
receive buffer to match auto-tuning TCP receive buffers in order to
maximize throughput.

The main thing is that there was a change in how the buffers are handled
between 8.8 and 8.9 that could, in some cases, end up impacting
performance.

On 5/19/22 2:37 PM, Thorsten Glaser wrote:
> On Thu, 19 May 2022, rapier wrote:
>
>> then aggressively grow that buffer (4MB at a time)
>
> Does this impact low-memory systems like my SPARCstation
> and that old 80486 laptop where growing this much may fail?
>
> bye,
> //mirabilos
_______________________________________________
openssh-unix-dev mailing list
openssh-unix-dev@mindrot.org
https://lists.mindrot.org/mailman/listinfo/openssh-unix-dev
RE: Difference in buffer behaviour between 8.89 and 8.9? [ In reply to ]
Randall,

Isn't that a consequence of the limits on the nonstop READX call?

-----Original Message-----
From: openssh-unix-dev <openssh-unix-dev-bounces+scott_n=xypro.com@mindrot.org> On Behalf Of rsbecker@nexbridge.com
Sent: Thursday, May 19, 2022 11:33 AM
To: 'rapier' <rapier@psc.edu>; 'Damien Miller' <djm@mindrot.org>
Cc: openssh-unix-dev@mindrot.org
Subject: RE: Difference in buffer behaviour between 8.89 and 8.9?

This may be part of it for me also, but in the other direction. I cannot exceed 56000 bytes for any I/O unless through a buffered stream.

>-----Original Message-----
>From: openssh-unix-dev <openssh-unix-dev-
>bounces+rsbecker=nexbridge.com@mindrot.org> On Behalf Of rapier
>Sent: May 19, 2022 2:22 PM
>To: Damien Miller <djm@mindrot.org>
>Cc: openssh-unix-dev@mindrot.org
>Subject: Re: Difference in buffer behaviour between 8.89 and 8.9?
>
>I think I have a fix, or at least a better band aid, for people in my
situation. If, at
>some point, you decide to increase the size of CHAN_SES_WINDOW_DEFAULT
>this may be helpful.
>
>Basically, the changes to channels.h increases the size of SSH's
>receive
buffer,
>which triggers this behaviour. The changes in sshbuf.c tests to see if
>the
buffer is
>larger than ssh buffer that seems to be handling the incoming packets.
>The assumption is that if it's larger then it's the receive buffer. I
>then
aggressively grow
>that buffer (4MB at a time) but no more buf->max_size. The debug
>statement
is
>so you can see what's going on but that should be removed for production.
>
>If you comment out the section that grows the window aggressively you
>can
see
>the stall as the buffer grows 32k at a time. I have some concerns about
growing
>the buffer it such large chunks. Also, I'd ideally make it a function
>of
the size of c-
>>local_window_max but I can't seem to get an extern working.
>
>Chris
>
>This patch is against the head of the master branch of openssh-portable.
>
>diff --git a/channels.h b/channels.h
>index 828c1b61b..ae897680c 100644
>--- a/channels.h
>+++ b/channels.h
>@@ -210,7 +210,7 @@ struct Channel {
>
> /* default window/packet sizes for tcp/x11-fwd-channel */
> #define CHAN_SES_PACKET_DEFAULT (32*1024)
>-#define CHAN_SES_WINDOW_DEFAULT (64*CHAN_SES_PACKET_DEFAULT)
>+#define CHAN_SES_WINDOW_DEFAULT
>(4096*CHAN_SES_PACKET_DEFAULT)
> #define CHAN_TCP_PACKET_DEFAULT (32*1024)
> #define CHAN_TCP_WINDOW_DEFAULT (64*CHAN_TCP_PACKET_DEFAULT)
> #define CHAN_X11_PACKET_DEFAULT (16*1024)
>diff --git a/sshbuf.c b/sshbuf.c
>index 840b899b1..b45720e1f 100644
>--- a/sshbuf.c
>+++ b/sshbuf.c
>@@ -27,6 +27,7 @@
> #include "ssherr.h"
> #include "sshbuf.h"
> #include "misc.h"
>+#include "log.h"
>
> static inline int
> sshbuf_check_sanity(const struct sshbuf *buf) @@ -327,9 +328,27 @@
>sshbuf_allocate(struct sshbuf *buf, size_t len)
> */
> need = len + buf->size - buf->alloc;
> rlen = ROUNDUP(buf->alloc + need, SSHBUF_SIZE_INC);
>+ /* I think the receive buffer is the one that is
>+ * growing slowly. This buffer quickly ends up being larger
>+ * than the typical packet buffer (usually 32.25KB) so if
>+ * we explicitly test for growth needs larger than that we
>+ * can accelerate the growth of this buffer and reduce load
>+ * the CPU and improve throughput. Ideally we would use
>+ * (local_window_max - rlen) as need but we don't have access
>+ * to that here */
>+ if (rlen > 34*1024) {
>+ need = 4 * 1024 * 1024;
>+ rlen = ROUNDUP(buf->alloc + need, SSHBUF_SIZE_INC);
>+ if (rlen > buf->max_size)
>+ rlen = buf->max_size;
>+ }
> SSHBUF_DBG(("need %zu initial rlen %zu", need, rlen));
> if (rlen > buf->max_size)
> rlen = buf->alloc + need;
>+ /* be sure to include log.h if you uncomment the debug
>+ * this debug helped identify the buffer growth issue in v8.9
>+ * see the git log about it. search for sshbuf_read */
>+ debug("adjusted rlen: %zu, len: %lu for %p", rlen, len, buf);
> SSHBUF_DBG(("adjusted rlen %zu", rlen));
> if ((dp = recallocarray(buf->d, buf->alloc, rlen, 1)) == NULL) {
> SSHBUF_DBG(("realloc fail")); @@ -401,4 +420,3 @@
>sshbuf_consume_end(struct sshbuf *buf, size_t len)
> SSHBUF_TELL("done");
> return 0;
> }
>
>
>On 5/19/22 11:23 AM, rapier wrote:
>> Damien
>>
>>> 8.9 switch from select() to poll() and included a couple of bugs
>>> that could cause weird problems. IMO you should try to port to
>>> what's on the top of the V_9_0 git branch, which is 9.0 + one more
>>> poll()- related fix.
>>
>> I just tried it with the head (commit bedb93415b) of the master branch.
>> Unfortunately, I'm still seeing the issue.
>>
>> debug1: adjusted rlen: 33024, len: 32788 for 0x5609805a3eb0
>> debug1: adjusted rlen: 36438016, len: 32768 for 0x5609805b0ed0
>> debug1: adjusted rlen: 33024, len: 32788 for 0x5609805a3eb0
>> debug1: adjusted rlen: 36470784, len: 32768 for 0x5609805b0ed0
>> debug1: adjusted rlen: 33024, len: 32788 for 0x5609805a3eb0
>> debug1: adjusted rlen: 36503552, len: 32768 for 0x5609805b0ed0
>>
>> Chris
>_______________________________________________
>openssh-unix-dev mailing list
>openssh-unix-dev@mindrot.org
>https://linkprotect.cudasvc.com/url?a=https%3a%2f%2flists.mindrot.org%2
>fmailman%2flistinfo%2fopenssh-unix-dev&c=E,1,50VJZTQXRUdUXTEwgfs20i0gpr
>m86flr2uBmX13I7_aUSLTmp7fTXXTn5CytR1N1UDctCiUGQW-3t_4-SaIEjdYe1kTTsnel3
>UmfvFwkHrU4Sg,,&typo=1

_______________________________________________
openssh-unix-dev mailing list
openssh-unix-dev@mindrot.org
https://linkprotect.cudasvc.com/url?a=https%3a%2f%2flists.mindrot.org%2fmailman%2flistinfo%2fopenssh-unix-dev&c=E,1,SmySto8ksZw6pYMAttooee6uwhmQWL12sdiMti7O5hQO3xmxl-jKai0Ov1ZnkmR_0YbGT2tpOz9XQZRL2do2o_weY7FdhnvvghG7YR7U&typo=1
_______________________________________________
openssh-unix-dev mailing list
openssh-unix-dev@mindrot.org
https://lists.mindrot.org/mailman/listinfo/openssh-unix-dev
RE: Difference in buffer behaviour between 8.89 and 8.9? [ In reply to ]
WRITEX, yes. And write() and fwrite(). It is a file system limit, but there
are replacement procedures available now, just not for sockets.

>-----Original Message-----
>From: openssh-unix-dev <openssh-unix-dev-
>bounces+rsbecker=nexbridge.com@mindrot.org> On Behalf Of Scott Neugroschl
>Sent: May 19, 2022 5:21 PM
>To: rsbecker@nexbridge.com; 'rapier' <rapier@psc.edu>; 'Damien Miller'
><djm@mindrot.org>
>Cc: openssh-unix-dev@mindrot.org
>Subject: RE: Difference in buffer behaviour between 8.89 and 8.9?
>
>Randall,
>
>Isn't that a consequence of the limits on the nonstop READX call?
>
>-----Original Message-----
>From: openssh-unix-dev <openssh-unix-dev-
>bounces+scott_n=xypro.com@mindrot.org> On Behalf Of
>rsbecker@nexbridge.com
>Sent: Thursday, May 19, 2022 11:33 AM
>To: 'rapier' <rapier@psc.edu>; 'Damien Miller' <djm@mindrot.org>
>Cc: openssh-unix-dev@mindrot.org
>Subject: RE: Difference in buffer behaviour between 8.89 and 8.9?
>
>This may be part of it for me also, but in the other direction. I cannot
exceed 56000
>bytes for any I/O unless through a buffered stream.
>
>>-----Original Message-----
>>From: openssh-unix-dev <openssh-unix-dev-
>>bounces+rsbecker=nexbridge.com@mindrot.org> On Behalf Of rapier
>>Sent: May 19, 2022 2:22 PM
>>To: Damien Miller <djm@mindrot.org>
>>Cc: openssh-unix-dev@mindrot.org
>>Subject: Re: Difference in buffer behaviour between 8.89 and 8.9?
>>
>>I think I have a fix, or at least a better band aid, for people in my
>situation. If, at
>>some point, you decide to increase the size of CHAN_SES_WINDOW_DEFAULT
>>this may be helpful.
>>
>>Basically, the changes to channels.h increases the size of SSH's
>>receive
>buffer,
>>which triggers this behaviour. The changes in sshbuf.c tests to see if
>>the
>buffer is
>>larger than ssh buffer that seems to be handling the incoming packets.
>>The assumption is that if it's larger then it's the receive buffer. I
>>then
>aggressively grow
>>that buffer (4MB at a time) but no more buf->max_size. The debug
>>statement
>is
>>so you can see what's going on but that should be removed for production.
>>
>>If you comment out the section that grows the window aggressively you
>>can
>see
>>the stall as the buffer grows 32k at a time. I have some concerns about
>growing
>>the buffer it such large chunks. Also, I'd ideally make it a function
>>of
>the size of c-
>>>local_window_max but I can't seem to get an extern working.
>>
>>Chris
>>
>>This patch is against the head of the master branch of openssh-portable.
>>
>>diff --git a/channels.h b/channels.h
>>index 828c1b61b..ae897680c 100644
>>--- a/channels.h
>>+++ b/channels.h
>>@@ -210,7 +210,7 @@ struct Channel {
>>
>> /* default window/packet sizes for tcp/x11-fwd-channel */
>> #define CHAN_SES_PACKET_DEFAULT (32*1024)
>>-#define CHAN_SES_WINDOW_DEFAULT (64*CHAN_SES_PACKET_DEFAULT)
>>+#define CHAN_SES_WINDOW_DEFAULT
>>(4096*CHAN_SES_PACKET_DEFAULT)
>> #define CHAN_TCP_PACKET_DEFAULT (32*1024)
>> #define CHAN_TCP_WINDOW_DEFAULT
>(64*CHAN_TCP_PACKET_DEFAULT)
>> #define CHAN_X11_PACKET_DEFAULT (16*1024)
>>diff --git a/sshbuf.c b/sshbuf.c
>>index 840b899b1..b45720e1f 100644
>>--- a/sshbuf.c
>>+++ b/sshbuf.c
>>@@ -27,6 +27,7 @@
>> #include "ssherr.h"
>> #include "sshbuf.h"
>> #include "misc.h"
>>+#include "log.h"
>>
>> static inline int
>> sshbuf_check_sanity(const struct sshbuf *buf) @@ -327,9 +328,27 @@
>>sshbuf_allocate(struct sshbuf *buf, size_t len)
>> */
>> need = len + buf->size - buf->alloc;
>> rlen = ROUNDUP(buf->alloc + need, SSHBUF_SIZE_INC);
>>+ /* I think the receive buffer is the one that is
>>+ * growing slowly. This buffer quickly ends up being larger
>>+ * than the typical packet buffer (usually 32.25KB) so if
>>+ * we explicitly test for growth needs larger than that we
>>+ * can accelerate the growth of this buffer and reduce load
>>+ * the CPU and improve throughput. Ideally we would use
>>+ * (local_window_max - rlen) as need but we don't have access
>>+ * to that here */
>>+ if (rlen > 34*1024) {
>>+ need = 4 * 1024 * 1024;
>>+ rlen = ROUNDUP(buf->alloc + need, SSHBUF_SIZE_INC);
>>+ if (rlen > buf->max_size)
>>+ rlen = buf->max_size;
>>+ }
>> SSHBUF_DBG(("need %zu initial rlen %zu", need, rlen));
>> if (rlen > buf->max_size)
>> rlen = buf->alloc + need;
>>+ /* be sure to include log.h if you uncomment the debug
>>+ * this debug helped identify the buffer growth issue in v8.9
>>+ * see the git log about it. search for sshbuf_read */
>>+ debug("adjusted rlen: %zu, len: %lu for %p", rlen, len, buf);
>> SSHBUF_DBG(("adjusted rlen %zu", rlen));
>> if ((dp = recallocarray(buf->d, buf->alloc, rlen, 1)) == NULL) {
>> SSHBUF_DBG(("realloc fail")); @@ -401,4 +420,3 @@
>>sshbuf_consume_end(struct sshbuf *buf, size_t len)
>> SSHBUF_TELL("done");
>> return 0;
>> }
>>
>>
>>On 5/19/22 11:23 AM, rapier wrote:
>>> Damien
>>>
>>>> 8.9 switch from select() to poll() and included a couple of bugs
>>>> that could cause weird problems. IMO you should try to port to
>>>> what's on the top of the V_9_0 git branch, which is 9.0 + one more
>>>> poll()- related fix.
>>>
>>> I just tried it with the head (commit bedb93415b) of the master branch.
>>> Unfortunately, I'm still seeing the issue.
>>>
>>> debug1: adjusted rlen: 33024, len: 32788 for 0x5609805a3eb0
>>> debug1: adjusted rlen: 36438016, len: 32768 for 0x5609805b0ed0
>>> debug1: adjusted rlen: 33024, len: 32788 for 0x5609805a3eb0
>>> debug1: adjusted rlen: 36470784, len: 32768 for 0x5609805b0ed0
>>> debug1: adjusted rlen: 33024, len: 32788 for 0x5609805a3eb0
>>> debug1: adjusted rlen: 36503552, len: 32768 for 0x5609805b0ed0
>>>
>>> Chris
>>_______________________________________________
>>openssh-unix-dev mailing list
>>openssh-unix-dev@mindrot.org
>>https://linkprotect.cudasvc.com/url?a=https%3a%2f%2flists.mindrot.org%2
>>fmailman%2flistinfo%2fopenssh-unix-dev&c=E,1,50VJZTQXRUdUXTEwgfs20i0gpr
>>m86flr2uBmX13I7_aUSLTmp7fTXXTn5CytR1N1UDctCiUGQW-3t_4-
>SaIEjdYe1kTTsnel3
>>UmfvFwkHrU4Sg,,&typo=1
>
>_______________________________________________
>openssh-unix-dev mailing list
>openssh-unix-dev@mindrot.org
>https://linkprotect.cudasvc.com/url?a=https%3a%2f%2flists.mindrot.org%2fmai
l
>man%2flistinfo%2fopenssh-unix-
>dev&c=E,1,SmySto8ksZw6pYMAttooee6uwhmQWL12sdiMti7O5hQO3xmxl-
>jKai0Ov1ZnkmR_0YbGT2tpOz9XQZRL2do2o_weY7FdhnvvghG7YR7U&typo=1
>_______________________________________________
>openssh-unix-dev mailing list
>openssh-unix-dev@mindrot.org
>https://lists.mindrot.org/mailman/listinfo/openssh-unix-dev

_______________________________________________
openssh-unix-dev mailing list
openssh-unix-dev@mindrot.org
https://lists.mindrot.org/mailman/listinfo/openssh-unix-dev
Re: Difference in buffer behaviour between 8.89 and 8.9? [ In reply to ]
On Thu, 19 May 2022, rapier wrote:

> I think I have a fix, or at least a better band aid, for people in my
> situation. If, at some point, you decide to increase the size of
> CHAN_SES_WINDOW_DEFAULT this may be helpful.
>
> Basically, the changes to channels.h increases the size of SSH's receive
> buffer, which triggers this behaviour. The changes in sshbuf.c tests to see if
> the buffer is larger than ssh buffer that seems to be handling the incoming
> packets. The assumption is that if it's larger then it's the receive buffer. I
> then aggressively grow that buffer (4MB at a time) but no more buf->max_size.
> The debug statement is so you can see what's going on but that should be
> removed for production.
>
> If you comment out the section that grows the window aggressively you can see
> the stall as the buffer grows 32k at a time. I have some concerns about
> growing the buffer it such large chunks. Also, I'd ideally make it a function
> of the size of c->local_window_max but I can't seem to get an extern working.

I don't understand the problem and I don't understand the fix, sorry.

The sshbuf_read() in packet.c will always reserve PACKET_MAX_SIZE (256k)
in addition to whatever is in the buffer currently and will greedily try to
fill it (up to max_size).

Is there some other buffer that is growing more slowly?

If so, I don't think the answer is to change the buffer code but instead
to sshbuf_reserve() an appropriate size before filling data into it.

-d
_______________________________________________
openssh-unix-dev mailing list
openssh-unix-dev@mindrot.org
https://lists.mindrot.org/mailman/listinfo/openssh-unix-dev
Re: Difference in buffer behaviour between 8.89 and 8.9? [ In reply to ]
On 5/19/22 7:29 PM, Damien Miller wrote:
> On Thu, 19 May 2022, rapier wrote:
>
>> I think I have a fix, or at least a better band aid, for people in my
>> situation. If, at some point, you decide to increase the size of
>> CHAN_SES_WINDOW_DEFAULT this may be helpful.
>>
>> Basically, the changes to channels.h increases the size of SSH's receive
>> buffer, which triggers this behaviour. The changes in sshbuf.c tests to see if
>> the buffer is larger than ssh buffer that seems to be handling the incoming
>> packets. The assumption is that if it's larger then it's the receive buffer. I
>> then aggressively grow that buffer (4MB at a time) but no more buf->max_size.
>> The debug statement is so you can see what's going on but that should be
>> removed for production.
>>
>> If you comment out the section that grows the window aggressively you can see
>> the stall as the buffer grows 32k at a time. I have some concerns about
>> growing the buffer it such large chunks. Also, I'd ideally make it a function
>> of the size of c->local_window_max but I can't seem to get an extern working.
>
> I don't understand the problem and I don't understand the fix, sorry.

I don't really understand the problem either whcih means the fix really
is a band aid at this point.

> The sshbuf_read() in packet.c will always reserve PACKET_MAX_SIZE (256k)
> in addition to whatever is in the buffer currently and will greedily try to
> fill it (up to max_size).

The problem is that I'm not seeing that behaviour. It's pretty easy to
see though - just up the default window and drop that debug into
sshbuf.c. It's easier to see if you ramp up the delay with netem but
even with no delay it's there. Either way, I'm only seeing it ramp up by
32K at a time.

> Is there some other buffer that is growing more slowly?

I honestly don't know. No small part of this is guess work on my part
but there is some buffer that's growing pretty slowly in this situation.

> If so, I don't think the answer is to change the buffer code but instead
> to sshbuf_reserve() an appropriate size before filling data into it.

I'll look into that but I'm in a place where figuring out where to start
can be tricky (my knowledge of the sshbuf code is, to be sure, weak) so
I'm mostly working backwards. I got to where I am by profiling via
vtune. sshbuf_reserve() is in the call stack so I'll start going
backwards again.


Chris
_______________________________________________
openssh-unix-dev mailing list
openssh-unix-dev@mindrot.org
https://lists.mindrot.org/mailman/listinfo/openssh-unix-dev
Re: Difference in buffer behaviour between 8.89 and 8.9? [ In reply to ]
On Fri, 20 May 2022, rapier wrote:

> > I don't understand the problem and I don't understand the fix, sorry.
>
> I don't really understand the problem either whcih means the fix really is a
> band aid at this point.
>
> > The sshbuf_read() in packet.c will always reserve PACKET_MAX_SIZE (256k)
> > in addition to whatever is in the buffer currently and will greedily try to
> > fill it (up to max_size).
>
> The problem is that I'm not seeing that behaviour. It's pretty easy to see
> though - just up the default window and drop that debug into sshbuf.c. It's
> easier to see if you ramp up the delay with netem but even with no delay it's
> there. Either way, I'm only seeing it ramp up by 32K at a time.

Are you sure that it's the packet receive buffer in this case?

I'd recommend instrumenting sshbuf_read() first and seeing what what it's
doing, e.g. how much data the read() is returning.

You can set SSHBUF_DEBUG at compile time to get a lot more info.
A strace or similar might be helpful too.

-d
_______________________________________________
openssh-unix-dev mailing list
openssh-unix-dev@mindrot.org
https://lists.mindrot.org/mailman/listinfo/openssh-unix-dev
Re: Difference in buffer behaviour between 8.89 and 8.9? [ In reply to ]
On 5/20/22 9:08 PM, Damien Miller wrote:

> Are you sure that it's the packet receive buffer in this case?
>
> I'd recommend instrumenting sshbuf_read() first and seeing what what it's
> doing, e.g. how much data the read() is returning.
>
> You can set SSHBUF_DEBUG at compile time to get a lot more info.
> A strace or similar might be helpful too.

So I tried this and I'm getting the following from the client

rapier@iztli:~$ ssh -p2289 192.168.1.102
kex_exchange_identification: No SSH version received in first 1024 lines
from server


Turns out the banner is getting filled with the SSHBUF_DBG lines:
debug1: kex_exchange_identification: banner line 0: sshbuf.c:34
sshbuf_check_sanity: sanity size 0 alloc 256 off 0 max 134217728
debug1: kex_exchange_identification: banner line 1: sshbuf.c:354
sshbuf_reserve: reserve buf = 0x556d1a798160 len = 3125
debug1: kex_exchange_identification: banner line 2: sshbuf.c:309
sshbuf_allocate: allocate buf = 0x556d1a798160 len = 3125


Am I missing a step? This is with 9.0p1 with the default sshd_config.
I've also tried it with 8.8 and 8.2 with the same results so I feel like
I am missing something. Client is 8.2p1.

Thanks,

Chris
_______________________________________________
openssh-unix-dev mailing list
openssh-unix-dev@mindrot.org
https://lists.mindrot.org/mailman/listinfo/openssh-unix-dev
Re: Difference in buffer behaviour between 8.89 and 8.9? [ In reply to ]
Wait, I figured it out. This isn't designed to work from the server
side. Sorry about the extra mail.

On 5/24/22 1:38 PM, rapier wrote:
>
>
> On 5/20/22 9:08 PM, Damien Miller wrote:
>
>> Are you sure that it's the packet receive buffer in this case?
>>
>> I'd recommend instrumenting sshbuf_read() first and seeing what what it's
>> doing, e.g. how much data the read() is returning.
>>
>> You can set SSHBUF_DEBUG at compile time to get a lot more info.
>> A strace or similar might be helpful too.
>
> So I tried this and I'm getting the following from the client
>
> rapier@iztli:~$ ssh -p2289 192.168.1.102
> kex_exchange_identification: No SSH version received in first 1024 lines
> from server
>
>
> Turns out the banner is getting filled with the SSHBUF_DBG lines:
> debug1: kex_exchange_identification: banner line 0: sshbuf.c:34
> sshbuf_check_sanity: sanity size 0 alloc 256 off 0 max 134217728
> debug1: kex_exchange_identification: banner line 1: sshbuf.c:354
> sshbuf_reserve: reserve buf = 0x556d1a798160 len = 3125
> debug1: kex_exchange_identification: banner line 2: sshbuf.c:309
> sshbuf_allocate: allocate buf = 0x556d1a798160 len = 3125
>
>
> Am I missing a step? This is with 9.0p1 with the default sshd_config.
> I've also tried it with 8.8 and 8.2 with the same results so I feel like
> I am missing something. Client is 8.2p1.
>
> Thanks,
>
> Chris
_______________________________________________
openssh-unix-dev mailing list
openssh-unix-dev@mindrot.org
https://lists.mindrot.org/mailman/listinfo/openssh-unix-dev
Re: Difference in buffer behaviour between 8.89 and 8.9? [ In reply to ]
On Tue, 24 May 2022, rapier wrote:

> On 5/20/22 9:08 PM, Damien Miller wrote:
>
> > Are you sure that it's the packet receive buffer in this case?
> >
> > I'd recommend instrumenting sshbuf_read() first and seeing what what it's
> > doing, e.g. how much data the read() is returning.
> >
> > You can set SSHBUF_DEBUG at compile time to get a lot more info.
> > A strace or similar might be helpful too.
>
> So I tried this and I'm getting the following from the client
>
> rapier@iztli:~$ ssh -p2289 192.168.1.102
> kex_exchange_identification: No SSH version received in first 1024 lines from
> server
>
>
> Turns out the banner is getting filled with the SSHBUF_DBG lines:
> debug1: kex_exchange_identification: banner line 0: sshbuf.c:34
> sshbuf_check_sanity: sanity size 0 alloc 256 off 0 max 134217728
> debug1: kex_exchange_identification: banner line 1: sshbuf.c:354
> sshbuf_reserve: reserve buf = 0x556d1a798160 len = 3125
> debug1: kex_exchange_identification: banner line 2: sshbuf.c:309
> sshbuf_allocate: allocate buf = 0x556d1a798160 len = 3125
>
>
> Am I missing a step? This is with 9.0p1 with the default sshd_config. I've
> also tried it with 8.8 and 8.2 with the same results so I feel like I am
> missing something. Client is 8.2p1.

I forgot that it writes to stderr. Try this:

diff --git a/sshbuf-getput-basic.c b/sshbuf-getput-basic.c
index 15a56cb..f845ac6 100644
--- a/sshbuf-getput-basic.c
+++ b/sshbuf-getput-basic.c
@@ -194,7 +194,7 @@ sshbuf_get_string(struct sshbuf *buf, u_char **valp, size_t *lenp)
return r;
if (valp != NULL) {
if ((*valp = malloc(len + 1)) == NULL) {
- SSHBUF_DBG(("SSH_ERR_ALLOC_FAIL"));
+ SSHBUF_DBG("SSH_ERR_ALLOC_FAIL");
return SSH_ERR_ALLOC_FAIL;
}
if (len != 0)
@@ -225,7 +225,7 @@ sshbuf_get_string_direct(struct sshbuf *buf, const u_char **valp, size_t *lenp)
*lenp = len;
if (sshbuf_consume(buf, len + 4) != 0) {
/* Shouldn't happen */
- SSHBUF_DBG(("SSH_ERR_INTERNAL_ERROR"));
+ SSHBUF_DBG("SSH_ERR_INTERNAL_ERROR");
SSHBUF_ABORT();
return SSH_ERR_INTERNAL_ERROR;
}
@@ -244,16 +244,16 @@ sshbuf_peek_string_direct(const struct sshbuf *buf, const u_char **valp,
if (lenp != NULL)
*lenp = 0;
if (sshbuf_len(buf) < 4) {
- SSHBUF_DBG(("SSH_ERR_MESSAGE_INCOMPLETE"));
+ SSHBUF_DBG("SSH_ERR_MESSAGE_INCOMPLETE");
return SSH_ERR_MESSAGE_INCOMPLETE;
}
len = PEEK_U32(p);
if (len > SSHBUF_SIZE_MAX - 4) {
- SSHBUF_DBG(("SSH_ERR_STRING_TOO_LARGE"));
+ SSHBUF_DBG("SSH_ERR_STRING_TOO_LARGE");
return SSH_ERR_STRING_TOO_LARGE;
}
if (sshbuf_len(buf) - 4 < len) {
- SSHBUF_DBG(("SSH_ERR_MESSAGE_INCOMPLETE"));
+ SSHBUF_DBG("SSH_ERR_MESSAGE_INCOMPLETE");
return SSH_ERR_MESSAGE_INCOMPLETE;
}
if (valp != NULL)
@@ -279,14 +279,14 @@ sshbuf_get_cstring(struct sshbuf *buf, char **valp, size_t *lenp)
/* Allow a \0 only at the end of the string */
if (len > 0 &&
(z = memchr(p , '\0', len)) != NULL && z < p + len - 1) {
- SSHBUF_DBG(("SSH_ERR_INVALID_FORMAT"));
+ SSHBUF_DBG("SSH_ERR_INVALID_FORMAT");
return SSH_ERR_INVALID_FORMAT;
}
if ((r = sshbuf_skip_string(buf)) != 0)
return -1;
if (valp != NULL) {
if ((*valp = malloc(len + 1)) == NULL) {
- SSHBUF_DBG(("SSH_ERR_ALLOC_FAIL"));
+ SSHBUF_DBG("SSH_ERR_ALLOC_FAIL");
return SSH_ERR_ALLOC_FAIL;
}
if (len != 0)
@@ -513,7 +513,7 @@ sshbuf_put_string(struct sshbuf *buf, const void *v, size_t len)
int r;

if (len > SSHBUF_SIZE_MAX - 4) {
- SSHBUF_DBG(("SSH_ERR_NO_BUFFER_SPACE"));
+ SSHBUF_DBG("SSH_ERR_NO_BUFFER_SPACE");
return SSH_ERR_NO_BUFFER_SPACE;
}
if ((r = sshbuf_reserve(buf, len + 4, &d)) < 0)
@@ -571,7 +571,7 @@ sshbuf_put_bignum2_bytes(struct sshbuf *buf, const void *v, size_t len)
int r, prepend;

if (len > SSHBUF_SIZE_MAX - 5) {
- SSHBUF_DBG(("SSH_ERR_NO_BUFFER_SPACE"));
+ SSHBUF_DBG("SSH_ERR_NO_BUFFER_SPACE");
return SSH_ERR_NO_BUFFER_SPACE;
}
/* Skip leading zero bytes */
@@ -621,7 +621,7 @@ sshbuf_get_bignum2_bytes_direct(struct sshbuf *buf,
*lenp = len;
if (sshbuf_consume(buf, olen + 4) != 0) {
/* Shouldn't happen */
- SSHBUF_DBG(("SSH_ERR_INTERNAL_ERROR"));
+ SSHBUF_DBG("SSH_ERR_INTERNAL_ERROR");
SSHBUF_ABORT();
return SSH_ERR_INTERNAL_ERROR;
}
diff --git a/sshbuf-getput-crypto.c b/sshbuf-getput-crypto.c
index e3e71ba..b22c1f3 100644
--- a/sshbuf-getput-crypto.c
+++ b/sshbuf-getput-crypto.c
@@ -78,7 +78,7 @@ sshbuf_get_ec(struct sshbuf *buf, EC_POINT *v, const EC_GROUP *g)
/* Skip string */
if (sshbuf_get_string_direct(buf, NULL, NULL) != 0) {
/* Shouldn't happen */
- SSHBUF_DBG(("SSH_ERR_INTERNAL_ERROR"));
+ SSHBUF_DBG("SSH_ERR_INTERNAL_ERROR");
SSHBUF_ABORT();
return SSH_ERR_INTERNAL_ERROR;
}
@@ -94,7 +94,7 @@ sshbuf_get_eckey(struct sshbuf *buf, EC_KEY *v)
size_t len;

if (pt == NULL) {
- SSHBUF_DBG(("SSH_ERR_ALLOC_FAIL"));
+ SSHBUF_DBG("SSH_ERR_ALLOC_FAIL");
return SSH_ERR_ALLOC_FAIL;
}
if ((r = sshbuf_peek_string_direct(buf, &d, &len)) < 0) {
@@ -113,7 +113,7 @@ sshbuf_get_eckey(struct sshbuf *buf, EC_KEY *v)
/* Skip string */
if (sshbuf_get_string_direct(buf, NULL, NULL) != 0) {
/* Shouldn't happen */
- SSHBUF_DBG(("SSH_ERR_INTERNAL_ERROR"));
+ SSHBUF_DBG("SSH_ERR_INTERNAL_ERROR");
SSHBUF_ABORT();
return SSH_ERR_INTERNAL_ERROR;
}
diff --git a/sshbuf.c b/sshbuf.c
index 8db22a0..685dbb8 100644
--- a/sshbuf.c
+++ b/sshbuf.c
@@ -39,7 +39,7 @@ sshbuf_check_sanity(const struct sshbuf *buf)
buf->size > buf->alloc ||
buf->off > buf->size)) {
/* Do not try to recover from corrupted buffer internals */
- SSHBUF_DBG(("SSH_ERR_INTERNAL_ERROR"));
+ SSHBUF_DBG("SSH_ERR_INTERNAL_ERROR");
ssh_signal(SIGSEGV, SIG_DFL);
raise(SIGSEGV);
return SSH_ERR_INTERNAL_ERROR;
@@ -50,7 +50,7 @@ sshbuf_check_sanity(const struct sshbuf *buf)
static void
sshbuf_maybe_pack(struct sshbuf *buf, int force)
{
- SSHBUF_DBG(("force %d", force));
+ SSHBUF_DBG("force %d", force);
SSHBUF_TELL("pre-pack");
if (buf->off == 0 || buf->readonly || buf->refcount > 1)
return;
@@ -221,7 +221,7 @@ sshbuf_set_max_size(struct sshbuf *buf, size_t max_size)
u_char *dp;
int r;

- SSHBUF_DBG(("set max buf = %p len = %zu", buf, max_size));
+ SSHBUF_DBG("set max buf = %p len = %zu", buf, max_size);
if ((r = sshbuf_check_sanity(buf)) != 0)
return r;
if (max_size == buf->max_size)
@@ -239,7 +239,7 @@ sshbuf_set_max_size(struct sshbuf *buf, size_t max_size)
rlen = ROUNDUP(buf->size, SSHBUF_SIZE_INC);
if (rlen > max_size)
rlen = max_size;
- SSHBUF_DBG(("new alloc = %zu", rlen));
+ SSHBUF_DBG("new alloc = %zu", rlen);
if ((dp = recallocarray(buf->d, buf->alloc, rlen, 1)) == NULL)
return SSH_ERR_ALLOC_FAIL;
buf->cd = buf->d = dp;
@@ -307,7 +307,7 @@ sshbuf_allocate(struct sshbuf *buf, size_t len)
u_char *dp;
int r;

- SSHBUF_DBG(("allocate buf = %p len = %zu", buf, len));
+ SSHBUF_DBG("allocate buf = %p len = %zu", buf, len);
if ((r = sshbuf_check_reserve(buf, len)) != 0)
return r;
/*
@@ -325,12 +325,12 @@ sshbuf_allocate(struct sshbuf *buf, size_t len)
*/
need = len + buf->size - buf->alloc;
rlen = ROUNDUP(buf->alloc + need, SSHBUF_SIZE_INC);
- SSHBUF_DBG(("need %zu initial rlen %zu", need, rlen));
+ SSHBUF_DBG("need %zu initial rlen %zu", need, rlen);
if (rlen > buf->max_size)
rlen = buf->alloc + need;
- SSHBUF_DBG(("adjusted rlen %zu", rlen));
+ SSHBUF_DBG("adjusted rlen %zu", rlen);
if ((dp = recallocarray(buf->d, buf->alloc, rlen, 1)) == NULL) {
- SSHBUF_DBG(("realloc fail"));
+ SSHBUF_DBG("realloc fail");
return SSH_ERR_ALLOC_FAIL;
}
buf->alloc = rlen;
@@ -352,7 +352,7 @@ sshbuf_reserve(struct sshbuf *buf, size_t len, u_char **dpp)
if (dpp != NULL)
*dpp = NULL;

- SSHBUF_DBG(("reserve buf = %p len = %zu", buf, len));
+ SSHBUF_DBG("reserve buf = %p len = %zu", buf, len);
if ((r = sshbuf_allocate(buf, len)) != 0)
return r;

@@ -368,7 +368,7 @@ sshbuf_consume(struct sshbuf *buf, size_t len)
{
int r;

- SSHBUF_DBG(("len = %zu", len));
+ SSHBUF_DBG("len = %zu", len);
if ((r = sshbuf_check_sanity(buf)) != 0)
return r;
if (len == 0)
@@ -388,7 +388,7 @@ sshbuf_consume_end(struct sshbuf *buf, size_t len)
{
int r;

- SSHBUF_DBG(("len = %zu", len));
+ SSHBUF_DBG("len = %zu", len);
if ((r = sshbuf_check_sanity(buf)) != 0)
return r;
if (len == 0)
diff --git a/sshbuf.h b/sshbuf.h
index 9624c43..cc3c34a 100644
--- a/sshbuf.h
+++ b/sshbuf.h
@@ -396,16 +396,17 @@ u_int sshbuf_refcount(const struct sshbuf *buf);

# ifdef SSHBUF_DEBUG
# define SSHBUF_TELL(what) do { \
- printf("%s:%d %s: %s size %zu alloc %zu off %zu max %zu\n", \
+ fprintf(stderr, \
+ "%s:%d %s: %s size %zu alloc %zu off %zu max %zu\n", \
__FILE__, __LINE__, __func__, what, \
buf->size, buf->alloc, buf->off, buf->max_size); \
- fflush(stdout); \
+ fflush(stderr); \
} while (0)
-# define SSHBUF_DBG(x) do { \
- printf("%s:%d %s: ", __FILE__, __LINE__, __func__); \
- printf x; \
- printf("\n"); \
- fflush(stdout); \
+# define SSHBUF_DBG(...) do { \
+ fprintf(stderr, "%s:%d %s: ", __FILE__, __LINE__, __func__); \
+ fprintf(stderr, __VA_ARGS__); \
+ fprintf(stderr, "\n"); \
+ fflush(stderr); \
} while (0)
# else
# define SSHBUF_TELL(what)
_______________________________________________
openssh-unix-dev mailing list
openssh-unix-dev@mindrot.org
https://lists.mindrot.org/mailman/listinfo/openssh-unix-dev
Re: Difference in buffer behaviour between 8.89 and 8.9? [ In reply to ]
Thanks! This worked well.

I'm still trying to figure out exactly which buffer is causing the
problem but I can describe the behaviour a little better. There is some
buffer that is a function of CHAN_SES_WINDOW_DEFAULT. After digging
through the code I'm pretty sure it's c->output.

After an initial burst of transfer this buffer starts to grow up to some
maximum value. Each round of data it gets causes sshbuf_reserve to call
sshbuf_alloc which then reallocs the buffer and fires off memcpys. Until
the buffer reaches that maximum none/little of it ends up getting
consumed. I believe, and this mostly a guess, that it not getting
consumed clamps the advertised receive window. Once it reaches that
maximum consumption starts and throughput picks up again.

When CHAN_SES_WINDOW_DEFAULT is small (64*CSPD) this delay isn't
noticible outside of really looking for it in the sshbuf logs. When CSWD
is large the delay becomes noticable, especially at higher RTTs when it
needs to handle more outstanding data.

What I'm trying to figure out is how to tell sshbuf_reserve to
selectively accelerate the growth of that buffer but I'm at a loss. The
right place seems to be in channel_input_data but mucking about with
sshbuf_put is failing. If you have any ideas please let me know.

Chris

On 5/24/22 7:27 PM, Damien Miller wrote:
> diff --git a/sshbuf-getput-basic.c b/sshbuf-getput-basic.c
> index 15a56cb..f845ac6 100644
> --- a/sshbuf-getput-basic.c
> +++ b/sshbuf-getput-basic.c
> @@ -194,7 +194,7 @@ sshbuf_get_string(struct sshbuf *buf, u_char **valp, size_t *lenp)
> return r;
> if (valp != NULL) {
> if ((*valp = malloc(len + 1)) == NULL) {
> - SSHBUF_DBG(("SSH_ERR_ALLOC_FAIL"));
> + SSHBUF_DBG("SSH_ERR_ALLOC_FAIL");
> return SSH_ERR_ALLOC_FAIL;
> }
> if (len != 0)
> @@ -225,7 +225,7 @@ sshbuf_get_string_direct(struct sshbuf *buf, const u_char **valp, size_t *lenp)
> *lenp = len;
> if (sshbuf_consume(buf, len + 4) != 0) {
> /* Shouldn't happen */
> - SSHBUF_DBG(("SSH_ERR_INTERNAL_ERROR"));
> + SSHBUF_DBG("SSH_ERR_INTERNAL_ERROR");
> SSHBUF_ABORT();
> return SSH_ERR_INTERNAL_ERROR;
> }
> @@ -244,16 +244,16 @@ sshbuf_peek_string_direct(const struct sshbuf *buf, const u_char **valp,
> if (lenp != NULL)
> *lenp = 0;
> if (sshbuf_len(buf) < 4) {
> - SSHBUF_DBG(("SSH_ERR_MESSAGE_INCOMPLETE"));
> + SSHBUF_DBG("SSH_ERR_MESSAGE_INCOMPLETE");
> return SSH_ERR_MESSAGE_INCOMPLETE;
> }
> len = PEEK_U32(p);
> if (len > SSHBUF_SIZE_MAX - 4) {
> - SSHBUF_DBG(("SSH_ERR_STRING_TOO_LARGE"));
> + SSHBUF_DBG("SSH_ERR_STRING_TOO_LARGE");
> return SSH_ERR_STRING_TOO_LARGE;
> }
> if (sshbuf_len(buf) - 4 < len) {
> - SSHBUF_DBG(("SSH_ERR_MESSAGE_INCOMPLETE"));
> + SSHBUF_DBG("SSH_ERR_MESSAGE_INCOMPLETE");
> return SSH_ERR_MESSAGE_INCOMPLETE;
> }
> if (valp != NULL)
> @@ -279,14 +279,14 @@ sshbuf_get_cstring(struct sshbuf *buf, char **valp, size_t *lenp)
> /* Allow a \0 only at the end of the string */
> if (len > 0 &&
> (z = memchr(p , '\0', len)) != NULL && z < p + len - 1) {
> - SSHBUF_DBG(("SSH_ERR_INVALID_FORMAT"));
> + SSHBUF_DBG("SSH_ERR_INVALID_FORMAT");
> return SSH_ERR_INVALID_FORMAT;
> }
> if ((r = sshbuf_skip_string(buf)) != 0)
> return -1;
> if (valp != NULL) {
> if ((*valp = malloc(len + 1)) == NULL) {
> - SSHBUF_DBG(("SSH_ERR_ALLOC_FAIL"));
> + SSHBUF_DBG("SSH_ERR_ALLOC_FAIL");
> return SSH_ERR_ALLOC_FAIL;
> }
> if (len != 0)
> @@ -513,7 +513,7 @@ sshbuf_put_string(struct sshbuf *buf, const void *v, size_t len)
> int r;
>
> if (len > SSHBUF_SIZE_MAX - 4) {
> - SSHBUF_DBG(("SSH_ERR_NO_BUFFER_SPACE"));
> + SSHBUF_DBG("SSH_ERR_NO_BUFFER_SPACE");
> return SSH_ERR_NO_BUFFER_SPACE;
> }
> if ((r = sshbuf_reserve(buf, len + 4, &d)) < 0)
> @@ -571,7 +571,7 @@ sshbuf_put_bignum2_bytes(struct sshbuf *buf, const void *v, size_t len)
> int r, prepend;
>
> if (len > SSHBUF_SIZE_MAX - 5) {
> - SSHBUF_DBG(("SSH_ERR_NO_BUFFER_SPACE"));
> + SSHBUF_DBG("SSH_ERR_NO_BUFFER_SPACE");
> return SSH_ERR_NO_BUFFER_SPACE;
> }
> /* Skip leading zero bytes */
> @@ -621,7 +621,7 @@ sshbuf_get_bignum2_bytes_direct(struct sshbuf *buf,
> *lenp = len;
> if (sshbuf_consume(buf, olen + 4) != 0) {
> /* Shouldn't happen */
> - SSHBUF_DBG(("SSH_ERR_INTERNAL_ERROR"));
> + SSHBUF_DBG("SSH_ERR_INTERNAL_ERROR");
> SSHBUF_ABORT();
> return SSH_ERR_INTERNAL_ERROR;
> }
> diff --git a/sshbuf-getput-crypto.c b/sshbuf-getput-crypto.c
> index e3e71ba..b22c1f3 100644
> --- a/sshbuf-getput-crypto.c
> +++ b/sshbuf-getput-crypto.c
> @@ -78,7 +78,7 @@ sshbuf_get_ec(struct sshbuf *buf, EC_POINT *v, const EC_GROUP *g)
> /* Skip string */
> if (sshbuf_get_string_direct(buf, NULL, NULL) != 0) {
> /* Shouldn't happen */
> - SSHBUF_DBG(("SSH_ERR_INTERNAL_ERROR"));
> + SSHBUF_DBG("SSH_ERR_INTERNAL_ERROR");
> SSHBUF_ABORT();
> return SSH_ERR_INTERNAL_ERROR;
> }
> @@ -94,7 +94,7 @@ sshbuf_get_eckey(struct sshbuf *buf, EC_KEY *v)
> size_t len;
>
> if (pt == NULL) {
> - SSHBUF_DBG(("SSH_ERR_ALLOC_FAIL"));
> + SSHBUF_DBG("SSH_ERR_ALLOC_FAIL");
> return SSH_ERR_ALLOC_FAIL;
> }
> if ((r = sshbuf_peek_string_direct(buf, &d, &len)) < 0) {
> @@ -113,7 +113,7 @@ sshbuf_get_eckey(struct sshbuf *buf, EC_KEY *v)
> /* Skip string */
> if (sshbuf_get_string_direct(buf, NULL, NULL) != 0) {
> /* Shouldn't happen */
> - SSHBUF_DBG(("SSH_ERR_INTERNAL_ERROR"));
> + SSHBUF_DBG("SSH_ERR_INTERNAL_ERROR");
> SSHBUF_ABORT();
> return SSH_ERR_INTERNAL_ERROR;
> }
> diff --git a/sshbuf.c b/sshbuf.c
> index 8db22a0..685dbb8 100644
> --- a/sshbuf.c
> +++ b/sshbuf.c
> @@ -39,7 +39,7 @@ sshbuf_check_sanity(const struct sshbuf *buf)
> buf->size > buf->alloc ||
> buf->off > buf->size)) {
> /* Do not try to recover from corrupted buffer internals */
> - SSHBUF_DBG(("SSH_ERR_INTERNAL_ERROR"));
> + SSHBUF_DBG("SSH_ERR_INTERNAL_ERROR");
> ssh_signal(SIGSEGV, SIG_DFL);
> raise(SIGSEGV);
> return SSH_ERR_INTERNAL_ERROR;
> @@ -50,7 +50,7 @@ sshbuf_check_sanity(const struct sshbuf *buf)
> static void
> sshbuf_maybe_pack(struct sshbuf *buf, int force)
> {
> - SSHBUF_DBG(("force %d", force));
> + SSHBUF_DBG("force %d", force);
> SSHBUF_TELL("pre-pack");
> if (buf->off == 0 || buf->readonly || buf->refcount > 1)
> return;
> @@ -221,7 +221,7 @@ sshbuf_set_max_size(struct sshbuf *buf, size_t max_size)
> u_char *dp;
> int r;
>
> - SSHBUF_DBG(("set max buf = %p len = %zu", buf, max_size));
> + SSHBUF_DBG("set max buf = %p len = %zu", buf, max_size);
> if ((r = sshbuf_check_sanity(buf)) != 0)
> return r;
> if (max_size == buf->max_size)
> @@ -239,7 +239,7 @@ sshbuf_set_max_size(struct sshbuf *buf, size_t max_size)
> rlen = ROUNDUP(buf->size, SSHBUF_SIZE_INC);
> if (rlen > max_size)
> rlen = max_size;
> - SSHBUF_DBG(("new alloc = %zu", rlen));
> + SSHBUF_DBG("new alloc = %zu", rlen);
> if ((dp = recallocarray(buf->d, buf->alloc, rlen, 1)) == NULL)
> return SSH_ERR_ALLOC_FAIL;
> buf->cd = buf->d = dp;
> @@ -307,7 +307,7 @@ sshbuf_allocate(struct sshbuf *buf, size_t len)
> u_char *dp;
> int r;
>
> - SSHBUF_DBG(("allocate buf = %p len = %zu", buf, len));
> + SSHBUF_DBG("allocate buf = %p len = %zu", buf, len);
> if ((r = sshbuf_check_reserve(buf, len)) != 0)
> return r;
> /*
> @@ -325,12 +325,12 @@ sshbuf_allocate(struct sshbuf *buf, size_t len)
> */
> need = len + buf->size - buf->alloc;
> rlen = ROUNDUP(buf->alloc + need, SSHBUF_SIZE_INC);
> - SSHBUF_DBG(("need %zu initial rlen %zu", need, rlen));
> + SSHBUF_DBG("need %zu initial rlen %zu", need, rlen);
> if (rlen > buf->max_size)
> rlen = buf->alloc + need;
> - SSHBUF_DBG(("adjusted rlen %zu", rlen));
> + SSHBUF_DBG("adjusted rlen %zu", rlen);
> if ((dp = recallocarray(buf->d, buf->alloc, rlen, 1)) == NULL) {
> - SSHBUF_DBG(("realloc fail"));
> + SSHBUF_DBG("realloc fail");
> return SSH_ERR_ALLOC_FAIL;
> }
> buf->alloc = rlen;
> @@ -352,7 +352,7 @@ sshbuf_reserve(struct sshbuf *buf, size_t len, u_char **dpp)
> if (dpp != NULL)
> *dpp = NULL;
>
> - SSHBUF_DBG(("reserve buf = %p len = %zu", buf, len));
> + SSHBUF_DBG("reserve buf = %p len = %zu", buf, len);
> if ((r = sshbuf_allocate(buf, len)) != 0)
> return r;
>
> @@ -368,7 +368,7 @@ sshbuf_consume(struct sshbuf *buf, size_t len)
> {
> int r;
>
> - SSHBUF_DBG(("len = %zu", len));
> + SSHBUF_DBG("len = %zu", len);
> if ((r = sshbuf_check_sanity(buf)) != 0)
> return r;
> if (len == 0)
> @@ -388,7 +388,7 @@ sshbuf_consume_end(struct sshbuf *buf, size_t len)
> {
> int r;
>
> - SSHBUF_DBG(("len = %zu", len));
> + SSHBUF_DBG("len = %zu", len);
> if ((r = sshbuf_check_sanity(buf)) != 0)
> return r;
> if (len == 0)
> diff --git a/sshbuf.h b/sshbuf.h
> index 9624c43..cc3c34a 100644
> --- a/sshbuf.h
> +++ b/sshbuf.h
> @@ -396,16 +396,17 @@ u_int sshbuf_refcount(const struct sshbuf *buf);
>
> # ifdef SSHBUF_DEBUG
> # define SSHBUF_TELL(what) do { \
> - printf("%s:%d %s: %s size %zu alloc %zu off %zu max %zu\n", \
> + fprintf(stderr, \
> + "%s:%d %s: %s size %zu alloc %zu off %zu max %zu\n", \
> __FILE__, __LINE__, __func__, what, \
> buf->size, buf->alloc, buf->off, buf->max_size); \
> - fflush(stdout); \
> + fflush(stderr); \
> } while (0)
> -# define SSHBUF_DBG(x) do { \
> - printf("%s:%d %s: ", __FILE__, __LINE__, __func__); \
> - printf x; \
> - printf("\n"); \
> - fflush(stdout); \
> +# define SSHBUF_DBG(...) do { \
> + fprintf(stderr, "%s:%d %s: ", __FILE__, __LINE__, __func__); \
> + fprintf(stderr, __VA_ARGS__); \
> + fprintf(stderr, "\n"); \
> + fflush(stderr); \
> } while (0)
> # else
> # define SSHBUF_TELL(what)
_______________________________________________
openssh-unix-dev mailing list
openssh-unix-dev@mindrot.org
https://lists.mindrot.org/mailman/listinfo/openssh-unix-dev