Mailing List Archive

9.3p1 Daemon Rejects Client Connections on armv7l-dey-linux-gnueabihf w/ GCC 10/11/12
I have an NXP i.MX6-based armv7l-dey-linux-gnueabihf system in which I
am seeing some as-yet-unaccountable behavior in sshd when compiled with
Arm/GCC 10/11/12. That is, when attempting to scp/slogin/ssh to
'root@<host>', where <host> is either a name or IPv4 or IPv6 address,
the connection is quickly closed by the server without prompting for a
password.

The one variable that consistently determines whether things work or do
not work is the toolchain. Under the
arm-dey-linux-gnueabi-gcc 8.2.0 from Digi Embedded Yocto (DEY),
scp/slogin/ssh works. Under arm-none-linux-gnueabihf-gcc 10/11/12 they
do not, failing consistently and with the same failure across all three.
Specifically, these are the toolchains from:

https://developer.arm.com/-/media/Files/downloads/gnu-a/10.3-2021.07/binrel/gcc-arm-10.3-2021.07-x86_64-arm-none-linux-gnueabihf.tar.xz
https://developer.arm.com/-/media/Files/downloads/gnu/11.3.rel1/binrel/arm-gnu-toolchain-11.3.rel1-x86_64-arm-none-linux-gnueabihf.tar.xz
https://developer.arm.com/-/media/Files/downloads/gnu/12.3.rel1/binrel/arm-gnu-toolchain-12.3.rel1-x86_64-arm-none-linux-gnueabihf.tar.xz

The original version of openssh under which this was observed was 9.3p1,
configured as follows:

${BuildRoot}/third_party/openssh/openssh-9.3p1/configure -C \
AR="${AR}" CPP="${CPP}" CC="${CC}" CXX="${CXX}" RANLIB="${RANLIB}" STRIP="${STRIP}" \
CPPFLAGS="--sysroot=${SYSROOT} -mcpu=cortex-a8 -mfloat-abi=hard -mfpu=neon -isystem ${SYSROOT}/usr/include -I${BuildRoot}/results/${PRODUCT}/arm/gnu-toolchain/12.3.1/release/third_party/ncurses/usr/include -I${BuildRoot}/results/${PRODUCT}/arm/gnu-toolchain/12.3.1/release/third_party/openssl/usr/include" \
CFLAGS="--sysroot=${SYSROOT} -mcpu=cortex-a8 -mfloat-abi=hard -mfpu=neon -fno-omit-frame-pointer -fno-strict-aliasing" \
LDFLAGS="--sysroot=${SYSROOT} -L${BuildRoot}/results/${PRODUCT}/arm/gnu-toolchain/12.3.1/release/third_party/ncurses/usr/lib/ -L${BuildRoot}/results/${PRODUCT}/arm/gnu-toolchain/12.3.1/release/third_party/libedit/usr/lib/ -Wl,-rpath-link -Wl,${BuildRoot}/results/${PRODUCT}/arm/gnu-toolchain/12.3.1/release/third_party/ncurses/usr/lib -Wl,-rpath-link -Wl,${BuildRoot}/results/${PRODUCT}/arm/gnu-toolchain/12.3.1/release/third_party/zlib/usr/lib" \
--build=x86_64-pc-linux-gnu \
--host=arm-dey-linux-gnueabi \
--target=arm-dey-linux-gnueabi \
--disable-strip \
--with-hardening \
--with-libedit="${BuildRoot}/results/${PRODUCT}/arm/gnu-toolchain/12.3.1/release/third_party/libedit/usr" \
--with-mantype=cat \
--with-openssl \
--with-pid-dir=/var/run \
--with-privsep-path=/var/run/sshd \
--with-ssl-dir="${BuildRoot}/results/${PRODUCT}/arm/gnu-toolchain/12.3.1/release/third_party/openssl/usr" \
--with-stackprotect \
--with-zlib-version-check \
--with-zlib="${BuildRoot}/results/${PRODUCT}/arm/gnu-toolchain/12.3.1/release/third_party/zlib/usr" \
--without-kerberos5 \
--without-ldns \
--without-maildir \
--without-pam \
--without-rpath \
--without-selinux \
--without-xauth \
--prefix=/usr \
--sysconfdir=/etc/ssh \
--localstatedir=/var

Were it just one version, I’d have expected a potential code generation
bug with the compiler; however, across three different versions from
three different GCC eras, I’m inclined to believe this isn’t a code-
generation issue.

In all failures, the ssh client fails with:

debug1: expecting SSH2_MSG_KEX_ECDH_REPLY

followed by:

Connection closed by <IP address of server> port 22

In all failures, the ssh daemon fails with:

debug1: expecting SSH2_MSG_KEX_ECDH_INIT [preauth]
debug3: receive packet: type 30 [preauth]
debug3: mm_sshkey_sign entering [preauth]
debug3: mm_request_send entering: type 6 [preauth]
debug3: mm_sshkey_sign: waiting for MONITOR_ANS_SIGN [preauth]
debug3: mm_request_receive_expect entering: type 7 [preauth]
debug3: mm_request_receive entering [preauth]
debug3: mm_request_receive entering
debug3: monitor_read: checking request 6
debug3: mm_answer_sign
debug3: mm_answer_sign: hostkey proof signature 0x1164880(100)
debug3: mm_request_send entering: type 7
debug2: monitor_read: 6 used once, disabling now
debug3: send packet: type 31 [preauth]
debug3: send packet: type 21 [preauth]
debug2: set_newkeys: mode 1 [preauth]
debug1: rekey after 134217728 blocks [preauth]
debug1: monitor_read_log: child log fd closed
debug3: mm_request_receive entering
debug1: do_cleanup
debug1: Killing privsep child 2544

My first inclination was that this was a SHA-1 key algorithm deprecation
issue; however, I verified that was not the case. And, again, the fact
that the compiler is the only variable indicated it likely was not.

My second inclination was that this was perhaps an optimization issue
with the later versions of GCC, so I compiled OpenSSH with -O0. No
change. Digi DEY 8.2.0 works; Arm GNU Toolchain 10/11/12 did not.

My next inclination was to try a different ssh client. I’d been using
8.2p1 (Ubuntu 20.04); however, 8.1p1 and 8.6p1 (macOS) as well as a
locally-built 9.5p1 yielded the same results: Digi DEY 8.2.0 works;
Arm GNU Toolchain 10/11/12 did not.

My next inclination was to iterate through sshd_config configuration.
I commented out the 10 lines one-by-one and retested, which yielded the
same results: Digi DEY 8.2.0 works; Arm GNU Toolchain 10/11/12 did not.

My next inclination was that perhaps OpenSSL was creating an issue. I
tried 1.1.1w (up from my 1.1.1s) and 3.1.4 which yielded the same
results: Digi DEY 8.2.0 works; Arm GNU Toolchain 10/11/12 did not.

My next inclination was that perhaps it was OpenSSH version-specific. I
tried up-revving to 9.5p1 and then down-revving to 7.9p1, which yielded
the same results: Digi DEY 8.2.0 works; Arm GNU Toolchain 10/11/12 did
not.

My last inclination was to do a side-by-side comparison of the
configuration and compilation output between Digi DEY 8.2.0 and Arm GNU
Toolchain 12. The key differences were in these configure checks:

if ${CC} supports compile flag -fzero-call-used-regs=all
if ${CC} supports compile flag -ftrivial-auto-var-init=zero
for sys/sysctl.h
for library containing login
for closefrom
for close_range
for library containing dlopen
for arc4random
for arc4random_buf
for arc4random_uniform
if libc defines sys_errlist
if libc defines sys_nerr
for library containing res_query
for library containing dn_expand
if res_query will link
for _getshort
for _getlong

While most of these configuration differences seem trivial and innocuous,
the -fzero-call-used-regs=all and -ftrivial-auto-var-init=zero compiler
language / code generation options seemed the most likely among those
differences to impact the point at which the client/daemon interaction
seemed to be failing. So, I forcibly disabled both which yielded the same
results: Digi DEY 8.2.0 works; Arm GNU Toolchain 10/11/12 did not.

Does anyone recognize this as a familiar failure mode? Beyond that, any
thoughts or recommendations on zeroing in further on the potential root
cause?

Best,

Grant
_______________________________________________
openssh-unix-dev mailing list
openssh-unix-dev@mindrot.org
https://lists.mindrot.org/mailman/listinfo/openssh-unix-dev
Re: 9.3p1 Daemon Rejects Client Connections on armv7l-dey-linux-gnueabihf w/ GCC 10/11/12
On Mon, 30 Oct 2023, Grant Erickson wrote:

> I have an NXP i.MX6-based armv7l-dey-linux-gnueabihf system in which I
> am seeing some as-yet-unaccountable behavior in sshd when compiled with
> Arm/GCC 10/11/12. That is, when attempting to scp/slogin/ssh to
> 'root@<host>', where <host> is either a name or IPv4 or IPv6 address,
> the connection is quickly closed by the server without prompting for a
> password.
>
> The variable I can consistently change across all others to impact
> whether things work or do not work is the toolchain. Under the
> arm-dey-linux-gnueabi-gcc 8.2.0 from Digi Embedded Yocto (DEY),
> scp/slogin/ssh works. Under arm-none-linux-gnueabihf-gcc 10/11/12
> (specifically those from https://developer.arm.com/-/media/Files/downloads/gnu-a/10.3-2021.07/binrel/gcc-arm-10.3-2021.07-x86_64-arm-none-linux-gnueabihf.tar.xz, https://developer.arm.com/-/media/Files/downloads/gnu/11.3.rel1/binrel/arm-gnu-toolchain-11.3.rel1-x86_64-arm-none-linux-gnueabihf.tar.xz, and https://developer.arm.com/-/media/Files/downloads/gnu/12.3.rel1/binrel/arm-gnu-toolchain-12.3.rel1-x86_64-arm-none-linux-gnueabihf.tar.xz) they do not, failing consistently and with the same failure across the three of them.

This might be a syscall sandbox violation. Try building with
SANDBOX_SECCOMP_FILTER_DEBUG defined and see if you get any more information.

Re: 9.3p1 Daemon Rejects Client Connections on armv7l-dey-linux-gnueabihf w/ GCC 10/11/12
On Nov 2, 2023, at 4:39 PM, Grant Erickson <gerickson@nuovations.com> wrote:
> On Nov 2, 2023, at 4:32 PM, Damien Miller <djm@mindrot.org> wrote:
>> This might be a syscall sandbox violation. Try building with
>> SANDBOX_SECCOMP_FILTER_DEBUG defined and see if you get any more information.
>
> Thanks for the reply. I’ll give that a try and report back.

Damien,

Thank you; that was an absolutely golden recommendation. Turning on SANDBOX_SECCOMP_FILTER_DEBUG did, in fact, uncover an unexpected system call violation:


debug3: monitor_read: checking request 6
debug3: mm_answer_sign: entering
debug3: mm_answer_sign: ecdsa-sha2-nistp256 KEX signature len=101
debug3: mm_request_send: entering, type 7
debug2: monitor_read: 6 used once, disabling now
debug3: send packet: type 31 [preauth]
debug3: send packet: type 21 [preauth]
debug2: ssh_set_newkeys: mode 1 [preauth]
debug1: rekey out after 134217728 blocks [preauth]
ssh_sandbox_violation: unexpected system call (arch:0x40000028,syscall:403 @ 0x76ccaa66) [preauth]
debug1: monitor_read_log: child log fd closed
debug3: mm_request_receive: entering
debug1: do_cleanup
debug1: Killing privsep child 528
...

The last defined system call in <asm/unistd-common.h> is __NR_io_pgetevents, 399.

According to this URL, https://gpages.juszkiewicz.com.pl/syscalls-table/syscalls.html, system call 403 is clock_gettime64 in Arm32.
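
The two numbers in the violation line can be pulled apart mechanically. What follows is only a minimal sketch, not anything from OpenSSH: the function name decode_violation is hypothetical, and it assumes the exact log format shown above. It relies on the low 16 bits of the audit arch value holding the ELF machine number, so 0x40000028 decodes to machine 40 (EM_ARM) with the little-endian flag 0x40000000 set.

```shell
#!/bin/sh
# Hypothetical helper (not part of OpenSSH): extract the arch value and the
# syscall number from an ssh_sandbox_violation log line, as printed when
# sshd is built with SANDBOX_SECCOMP_FILTER_DEBUG.
decode_violation() {
    line=$1
    arch=$(printf '%s\n' "$line" | sed -n 's/.*arch:\(0x[0-9a-fA-F]*\).*/\1/p')
    nr=$(printf '%s\n' "$line" | sed -n 's/.*syscall:\([0-9]*\).*/\1/p')
    # The low 16 bits of the audit arch value are the ELF machine number.
    machine=$(( $arch & 0xffff ))
    printf 'arch=%s machine=%d syscall=%d\n' "$arch" "$machine" "$nr"
}

decode_violation 'ssh_sandbox_violation: unexpected system call (arch:0x40000028,syscall:403 @ 0x76ccaa66) [preauth]'
```

The syscall number it prints is the one to look up in a per-architecture syscall table such as the one linked above.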

clock_gettime64 is not defined for the older Digi DEY 8.2.0 toolchain that does work with OpenSSH:

% grep -r clock_gettime64 /opt/sysroots/cortexa9t2hf-neon-dey-linux-gnueabi/ /opt/sysroots/x86_64-deysdk-linux/

but is defined for Arm GNU Toolchain 12.3.1 (and 11.3.1 and 10.3-2021.07):

% grep -r clock_gettime64 /opt/sysroots/arm-gnu-toolchain-12.3.rel1-x86_64-arm-none-linux-gnueabihf/
/opt/sysroots/arm-gnu-toolchain-12.3.rel1-x86_64-arm-none-linux-gnueabihf/arm-none-linux-gnueabihf/libc/usr/include/bits/syscall.h:#ifdef __NR_clock_gettime64
/opt/sysroots/arm-gnu-toolchain-12.3.rel1-x86_64-arm-none-linux-gnueabihf/arm-none-linux-gnueabihf/libc/usr/include/bits/syscall.h:# define SYS_clock_gettime64 __NR_clock_gettime64
/opt/sysroots/arm-gnu-toolchain-12.3.rel1-x86_64-arm-none-linux-gnueabihf/arm-none-linux-gnueabihf/libc/usr/include/time.h: timespec *__tp), __clock_gettime64)
/opt/sysroots/arm-gnu-toolchain-12.3.rel1-x86_64-arm-none-linux-gnueabihf/arm-none-linux-gnueabihf/libc/usr/include/time.h:# define clock_gettime __clock_gettime64

The <time.h> header has this block:

#ifdef __USE_POSIX199309
# ifndef __USE_TIME_BITS64
/* Pause execution for a number of nanoseconds.

This function is a cancellation point and therefore not marked with
__THROW. */
extern int nanosleep (const struct timespec *__requested_time,
struct timespec *__remaining);

/* Get resolution of clock CLOCK_ID. */
extern int clock_getres (clockid_t __clock_id, struct timespec *__res) __THROW;

/* Get current value of clock CLOCK_ID and store it in TP. */
extern int clock_gettime (clockid_t __clock_id, struct timespec *__tp)
__THROW __nonnull((2));

/* Set clock CLOCK_ID to value TP. */
extern int clock_settime (clockid_t __clock_id, const struct timespec *__tp)
__THROW __nonnull((2));
# else
# ifdef __REDIRECT
extern int __REDIRECT (nanosleep, (const struct timespec *__requested_time,
struct timespec *__remaining),
__nanosleep64);
extern int __REDIRECT_NTH (clock_getres, (clockid_t __clock_id,
struct timespec *__res),
__clock_getres64);
extern int __REDIRECT_NTH (clock_gettime, (clockid_t __clock_id, struct
timespec *__tp), __clock_gettime64)
__nonnull((2));
extern int __REDIRECT_NTH (clock_settime, (clockid_t __clock_id, const struct
timespec *__tp), __clock_settime64)
__nonnull((2));
# else
# define nanosleep __nanosleep64
# define clock_getres __clock_getres64
# define clock_gettime __clock_gettime64
# define clock_settime __clock_settime64
# endif
# endif

and <bits/syscall.h> this block:

#ifdef __NR_clock_gettime64
# define SYS_clock_gettime64 __NR_clock_gettime64
#endif

However, it looks like the Digi DEY Linux 4.9.212 kernel is too old and does not define clock_gettime64 or the corresponding system call in the Arm architecture-specific headers:

% grep -r __NR_clock_gettime64 $BuildRoot/results/arm/gnu-toolchain/12.3.1/release/third_party/linux/linux-dey/include/

or in the kernel source at all, for that matter:

% git grep clock_gettime64 $BuildRoot/third_party/linux/linux-dey/repo/
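
The greps above amount to checking for one condition: the libc headers reference a syscall macro that the kernel headers never define. That check could be wrapped up roughly as follows. This is a sketch under assumptions: both function names are hypothetical, and the two header trees are assumed to be unpacked locally at whatever paths are passed in.

```shell
#!/bin/sh
# Hypothetical helper: succeed if any file under directory $1 mentions the
# macro named in $2 as a whole word (e.g. __NR_clock_gettime64).
tree_mentions() {
    grep -rqw -- "$2" "$1" 2>/dev/null
}

# Hypothetical helper: report whether the libc headers reference a syscall
# macro that the kernel headers do not define.
# Usage: check_syscall_mismatch <libc-include-dir> <kernel-include-dir> <macro>
check_syscall_mismatch() {
    libc_headers=$1
    kernel_headers=$2
    macro=$3
    if tree_mentions "$libc_headers" "$macro" &&
       ! tree_mentions "$kernel_headers" "$macro"; then
        echo "mismatch: libc headers reference $macro, kernel headers do not"
        return 1
    fi
    echo "ok: no mismatch found for $macro"
}
```

It would be called with the toolchain's libc include directory, the kernel include directory, and __NR_clock_gettime64 as the three arguments.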

It looks like the 64-bit clock interfaces were introduced in linux-5.1 and glibc-2.31.
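
Taking the linux-5.1 figure above at face value, a target's kernel release can be compared against it before choosing a toolchain. The following is a rough sketch only: the function name kernel_has_time64 is hypothetical, and it looks at nothing but the major.minor part of the release string, so a vendor kernel with backports could be misjudged.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the given kernel release string is at
# least 5.1, the release cited above for the 64-bit clock interfaces.
kernel_has_time64() {
    release=$1
    major=${release%%.*}       # text before the first dot
    rest=${release#*.}         # text after the first dot
    minor=${rest%%[!0-9]*}     # leading digits of the remainder
    [ "$major" -gt 5 ] || { [ "$major" -eq 5 ] && [ "${minor:-0}" -ge 1 ]; }
}

for release in 4.9.212 5.1.0; do
    if kernel_has_time64 "$release"; then
        echo "$release: 64-bit time system calls should be present"
    else
        echo "$release: kernel predates the 64-bit time system calls"
    fi
done
```

On the target itself, the output of uname -r could be passed in place of the fixed release strings.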

Thank you again for the suggestion; very helpful!

Best,

Grant
--
Principal
Nuovations

gerickson@nuovations.com
http://www.nuovations.com/
