Mailing List Archive

[PATCH v14 1/4] asm-generic,arm64: create task variant of access_ok
On arm64, access_ok makes adjustments to pointers based on whether
memory tagging is enabled for a task (ARM MTE). When leveraging ptrace,
it's possible for a task to enable/disable various kernel features (such
as syscall user dispatch) which require user points as arguments.

To enable Task A to set these features via ptrace with Task B's
pointers, a task variant of access_ok is required for architectures with
features such as memory tagging.

If the architecture does not implement task_access_ok, the operation
reduces to access_ok and the task argument is discarded.

Signed-off-by: Gregory Price <gregory.price@memverge.com>
---
arch/arm64/include/asm/uaccess.h | 13 +++++++++++--
include/asm-generic/access_ok.h | 10 ++++++++++
2 files changed, 21 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/include/asm/uaccess.h b/arch/arm64/include/asm/uaccess.h
index 5c7b2f9d5913..1a51a54f264f 100644
--- a/arch/arm64/include/asm/uaccess.h
+++ b/arch/arm64/include/asm/uaccess.h
@@ -35,7 +35,9 @@ static inline int __access_ok(const void __user *ptr, unsigned long size);
* This is equivalent to the following test:
* (u65)addr + (u65)size <= (u65)TASK_SIZE_MAX
*/
-static inline int access_ok(const void __user *addr, unsigned long size)
+static inline int task_access_ok(struct task_struct *task,
+ const void __user *addr,
+ unsigned long size)
{
/*
* Asynchronous I/O running in a kernel thread does not have the
@@ -43,11 +45,18 @@ static inline int access_ok(const void __user *addr, unsigned long size)
* the user address before checking.
*/
if (IS_ENABLED(CONFIG_ARM64_TAGGED_ADDR_ABI) &&
- (current->flags & PF_KTHREAD || test_thread_flag(TIF_TAGGED_ADDR)))
+ (task->flags & PF_KTHREAD || test_ti_thread_flag(task, TIF_TAGGED_ADDR)))
addr = untagged_addr(addr);

return likely(__access_ok(addr, size));
}
+
+static inline int access_ok(const void __user *addr, unsigned long size)
+{
+ return task_access_ok(current, addr, size);
+}
+
+#define task_access_ok task_access_ok
#define access_ok access_ok

#include <asm-generic/access_ok.h>
diff --git a/include/asm-generic/access_ok.h b/include/asm-generic/access_ok.h
index 2866ae61b1cd..31465773c40a 100644
--- a/include/asm-generic/access_ok.h
+++ b/include/asm-generic/access_ok.h
@@ -45,4 +45,14 @@ static inline int __access_ok(const void __user *ptr, unsigned long size)
#define access_ok(addr, size) likely(__access_ok(addr, size))
#endif

+/*
+ * Some architectures may have special features (such as ARM MTE)
+ * that require handling if access_ok is called on a pointer from one
+ * task in the context of another. On most architectures this operation
+ * is equivalent to simply __access_ok.
+ */
+#ifndef task_access_ok
+#define task_access_ok(task, addr, size) likely(__access_ok(addr, size))
+#endif
+
#endif
--
2.39.1
Re: [PATCH v14 1/4] asm-generic,arm64: create task variant of access_ok [ In reply to ]
On Tue, Mar 28, 2023, at 18:48, Gregory Price wrote:
> On arm64, access_ok makes adjustments to pointers based on whether
> memory tagging is enabled for a task (ARM MTE). When leveraging ptrace,
> it's possible for a task to enable/disable various kernel features (such
> as syscall user dispatch) which require user points as arguments.
>
> To enable Task A to set these features via ptrace with Task B's
> pointers, a task variant of access_ok is required for architectures with
> features such as memory tagging.
>
> If the architecture does not implement task_access_ok, the operation
> reduces to access_ok and the task argument is discarded.
>
> Signed-off-by: Gregory Price <gregory.price@memverge.com>

For asm-generic:

Acked-by: Arnd Bergmann <arnd@arndb.de>
Re: [PATCH v14 1/4] asm-generic,arm64: create task variant of access_ok [ In reply to ]
Hmm. I am not comfortable with this change...

I won't really argue because I don't have a better solution and because
I think we don't really care as long as task_set_syscall_user_dispatch()
is the only user of task_access_ok(), but still...

OK, so this version changes set_syscall_user_dispatch() to use
task_access_ok() instead of access_ok() because task != current.

On 03/28, Gregory Price wrote:
>
> If the architecture does not implement task_access_ok, the operation
> reduces to access_ok and the task argument is discarded.

No, with this patch it reduces to __access_ok(). And this already doesn't
look very good to me, but this is minor.

> --- a/include/asm-generic/access_ok.h
> +++ b/include/asm-generic/access_ok.h
> @@ -45,4 +45,14 @@ static inline int __access_ok(const void __user *ptr, unsigned long size)
> #define access_ok(addr, size) likely(__access_ok(addr, size))
> #endif
>
> +/*
> + * Some architectures may have special features (such as ARM MTE)
> + * that require handling if access_ok is called on a pointer from one
> + * task in the context of another. On most architectures this operation
> + * is equivalent to simply __access_ok.
> + */
> +#ifndef task_access_ok
> +#define task_access_ok(task, addr, size) likely(__access_ok(addr, size))
> +#endif

Lets ignore arm64.

This look as if access_ok() or __access_ok() doesn't depend on task, but
this is not true in general. Say, TASK_SIZE_MAX can check is_32bit_task()
test_thread_flag(TIF_32BIT...) and this uses "current".

Again, we probably do not care, but I don't like the fact task_access_ok()
looks as if task_access_ok(task) returns the same result as "task" calling
access_ok().

Oleg.
Re: [PATCH v14 1/4] asm-generic,arm64: create task variant of access_ok [ In reply to ]
On Wed, Mar 29, 2023, at 17:15, Oleg Nesterov wrote:
>
> This look as if access_ok() or __access_ok() doesn't depend on task, but
> this is not true in general. Say, TASK_SIZE_MAX can check is_32bit_task()
> test_thread_flag(TIF_32BIT...) and this uses "current".
>
> Again, we probably do not care, but I don't like the fact task_access_ok()
> looks as if task_access_ok(task) returns the same result as "task" calling
> access_ok().

I think the idea of TASK_SIZE_MAX is that it is a compile-time constant and in fact independent of current, while TASK_SIZE
takes TIF_32BIT into account.

Arnd
Re: [PATCH v14 1/4] asm-generic,arm64: create task variant of access_ok [ In reply to ]
On 03/29, Arnd Bergmann wrote:
>
> On Wed, Mar 29, 2023, at 17:15, Oleg Nesterov wrote:
> >
> > This look as if access_ok() or __access_ok() doesn't depend on task, but
> > this is not true in general. Say, TASK_SIZE_MAX can check is_32bit_task()
> > test_thread_flag(TIF_32BIT...) and this uses "current".
> >
> > Again, we probably do not care, but I don't like the fact task_access_ok()
> > looks as if task_access_ok(task) returns the same result as "task" calling
> > access_ok().
>
> I think the idea of TASK_SIZE_MAX is that it is a compile-time constant and in fact independent of current, while TASK_SIZE
> takes TIF_32BIT into account.

Say, arch/loongarch defines TASK_SIZE which depends on test_thread_flag(TIF_32BIT_ADDR)
but it doesn't define TASK_SIZE_MAX, so __access_ok() will use TASK_SIZE.

Oleg.
Re: [PATCH v14 1/4] asm-generic,arm64: create task variant of access_ok [ In reply to ]
On Wed, Mar 29, 2023 at 06:03:23PM +0200, Oleg Nesterov wrote:
> On 03/29, Arnd Bergmann wrote:
> >
> > On Wed, Mar 29, 2023, at 17:15, Oleg Nesterov wrote:
> > >
> > > This look as if access_ok() or __access_ok() doesn't depend on task, but
> > > this is not true in general. Say, TASK_SIZE_MAX can check is_32bit_task()
> > > test_thread_flag(TIF_32BIT...) and this uses "current".
> > >
> > > Again, we probably do not care, but I don't like the fact task_access_ok()
> > > looks as if task_access_ok(task) returns the same result as "task" calling
> > > access_ok().
> >
> > I think the idea of TASK_SIZE_MAX is that it is a compile-time constant and in fact independent of current, while TASK_SIZE
> > takes TIF_32BIT into account.
>
> Say, arch/loongarch defines TASK_SIZE which depends on test_thread_flag(TIF_32BIT_ADDR)
> but it doesn't define TASK_SIZE_MAX, so __access_ok() will use TASK_SIZE.
>
> Oleg.
>

I did not notice this at first. Thinking of solutions, I'd originally
considered writing a similar change in asm-generic that I made in arm64,
but that would have ultimately resulted in "(void) task;" because task
appears unused.

Now it seems like TASK_SIZE/_MAX seems like a dangerous define
combination that hides relevant functionality. Fixing this seeems to
naturally want a "TASK_TASK_SIZE(task)" which is... uh... annoying.

Not sure how I should proceed here, but this makes me wonder if there
are oversights like this elsewhere, as this seems like a pretty easy
thing to overlook.

~Gregory
Re: [PATCH v14 1/4] asm-generic,arm64: create task variant of access_ok [ In reply to ]
On Tue, Mar 28, 2023 at 12:48:08PM -0400, Gregory Price wrote:
> diff --git a/arch/arm64/include/asm/uaccess.h b/arch/arm64/include/asm/uaccess.h
> index 5c7b2f9d5913..1a51a54f264f 100644
> --- a/arch/arm64/include/asm/uaccess.h
> +++ b/arch/arm64/include/asm/uaccess.h
> @@ -35,7 +35,9 @@ static inline int __access_ok(const void __user *ptr, unsigned long size);
> * This is equivalent to the following test:
> * (u65)addr + (u65)size <= (u65)TASK_SIZE_MAX
> */
> -static inline int access_ok(const void __user *addr, unsigned long size)
> +static inline int task_access_ok(struct task_struct *task,
> + const void __user *addr,
> + unsigned long size)
> {
> /*
> * Asynchronous I/O running in a kernel thread does not have the
> @@ -43,11 +45,18 @@ static inline int access_ok(const void __user *addr, unsigned long size)
> * the user address before checking.
> */
> if (IS_ENABLED(CONFIG_ARM64_TAGGED_ADDR_ABI) &&
> - (current->flags & PF_KTHREAD || test_thread_flag(TIF_TAGGED_ADDR)))
> + (task->flags & PF_KTHREAD || test_ti_thread_flag(task, TIF_TAGGED_ADDR)))
> addr = untagged_addr(addr);
>
> return likely(__access_ok(addr, size));
> }
> +
> +static inline int access_ok(const void __user *addr, unsigned long size)
> +{
> + return task_access_ok(current, addr, size);
> +}
> +
> +#define task_access_ok task_access_ok

I'd not bother with this at all. In the generic code you can either do
an __access_ok() check directly or just
access_ok(untagged_addr(selector), ...) with a comment that address
tagging of the ptraced task may not be enabled.

--
Catalin
Re: [PATCH v14 1/4] asm-generic,arm64: create task variant of access_ok [ In reply to ]
On Wed, Mar 29, 2023 at 05:22:49PM +0100, Catalin Marinas wrote:
> On Tue, Mar 28, 2023 at 12:48:08PM -0400, Gregory Price wrote:
> > diff --git a/arch/arm64/include/asm/uaccess.h b/arch/arm64/include/asm/uaccess.h
> > index 5c7b2f9d5913..1a51a54f264f 100644
> > --- a/arch/arm64/include/asm/uaccess.h
> > +++ b/arch/arm64/include/asm/uaccess.h
> > @@ -35,7 +35,9 @@ static inline int __access_ok(const void __user *ptr, unsigned long size);
> > * This is equivalent to the following test:
> > * (u65)addr + (u65)size <= (u65)TASK_SIZE_MAX
> > */
> > -static inline int access_ok(const void __user *addr, unsigned long size)
> > +static inline int task_access_ok(struct task_struct *task,
> > + const void __user *addr,
> > + unsigned long size)
> > {
> > /*
> > * Asynchronous I/O running in a kernel thread does not have the
> > @@ -43,11 +45,18 @@ static inline int access_ok(const void __user *addr, unsigned long size)
> > * the user address before checking.
> > */
> > if (IS_ENABLED(CONFIG_ARM64_TAGGED_ADDR_ABI) &&
> > - (current->flags & PF_KTHREAD || test_thread_flag(TIF_TAGGED_ADDR)))
> > + (task->flags & PF_KTHREAD || test_ti_thread_flag(task, TIF_TAGGED_ADDR)))
> > addr = untagged_addr(addr);
> >
> > return likely(__access_ok(addr, size));
> > }
> > +
> > +static inline int access_ok(const void __user *addr, unsigned long size)
> > +{
> > + return task_access_ok(current, addr, size);
> > +}
> > +
> > +#define task_access_ok task_access_ok
>
> I'd not bother with this at all. In the generic code you can either do
> an __access_ok() check directly or just
> access_ok(untagged_addr(selector), ...) with a comment that address
> tagging of the ptraced task may not be enabled.
>
> --
> Catalin

This was my original proposal, but the comment that lead to this patch
was the following:

"""
If this would be correct, then access_ok() on arm64 would
unconditionally untag the checked address, but it does not. Simply
because untagging is only valid if the task enabled pointer tagging. If
it didn't a tagged pointer is obviously invalid.

Why would ptrace make this suddenly valid?
"""

https://lore.kernel.org/all/87a605anvx.ffs@tglx/

I did not have a sufficient answer for this so I went down this path.

It does seem simpler to simply untag the address, however it didn't seem
like a good solution to simply leave an identified bad edge case.

with access_ok(untagged_addr(addr), ...) it breaks down like this:

(tracer,tracee) : result

tag,tag : untagged - (correct)
tag,untag : untagged - incorrect as this would have been an impossible
state to reach through the standard prctl interface. Will
lead to a SIGSEGV in the tracee upon next syscall
untag,tag : untagged - (correct)
untag,untag : no-op - (correct), tagged address will fail to set

Basically if the tracer is a tagged process while the tracee is not, it
would become possible to set the tracee's selector to a tagged pointer.

It's beyond me to say whether or not this situation is "ok" and "the
user's fault", but it does feel like an addressable problem.

Thoughts?
~Gregory
Re: [PATCH v14 1/4] asm-generic,arm64: create task variant of access_ok [ In reply to ]
On 03/28, Gregory Price wrote:
>
> Not sure how I should proceed here,

Can't we just kill this access_ok() in set_syscall_user_dispatch() ?
I don't think it buys too much.

Oleg.

diff --git a/kernel/entry/syscall_user_dispatch.c b/kernel/entry/syscall_user_dispatch.c
index 0b6379adff6b..d2e516ece52b 100644
--- a/kernel/entry/syscall_user_dispatch.c
+++ b/kernel/entry/syscall_user_dispatch.c
@@ -43,11 +43,7 @@ bool syscall_user_dispatch(struct pt_regs *regs)
return false;

if (likely(sd->selector)) {
- /*
- * access_ok() is performed once, at prctl time, when
- * the selector is loaded by userspace.
- */
- if (unlikely(__get_user(state, sd->selector))) {
+ if (unlikely(get_user(state, sd->selector))) {
force_exit_sig(SIGSEGV);
return true;
}
@@ -86,9 +82,6 @@ int set_syscall_user_dispatch(unsigned long mode, unsigned long offset,
if (offset && offset + len <= offset)
return -EINVAL;

- if (selector && !access_ok(selector, sizeof(*selector)))
- return -EFAULT;
-
break;
default:
return -EINVAL;
Re: [PATCH v14 1/4] asm-generic,arm64: create task variant of access_ok [ In reply to ]
On Wed, Mar 29, 2023 at 07:13:22PM +0200, Oleg Nesterov wrote:
> On 03/28, Gregory Price wrote:
> >
> > Not sure how I should proceed here,
>
> Can't we just kill this access_ok() in set_syscall_user_dispatch() ?
> I don't think it buys too much.
>
> Oleg.
>
> diff --git a/kernel/entry/syscall_user_dispatch.c b/kernel/entry/syscall_user_dispatch.c
> index 0b6379adff6b..d2e516ece52b 100644
> --- a/kernel/entry/syscall_user_dispatch.c
> +++ b/kernel/entry/syscall_user_dispatch.c
> @@ -43,11 +43,7 @@ bool syscall_user_dispatch(struct pt_regs *regs)
> return false;
>
> if (likely(sd->selector)) {
> - /*
> - * access_ok() is performed once, at prctl time, when
> - * the selector is loaded by userspace.
> - */
> - if (unlikely(__get_user(state, sd->selector))) {
> + if (unlikely(get_user(state, sd->selector))) {
> force_exit_sig(SIGSEGV);
> return true;
> }
> @@ -86,9 +82,6 @@ int set_syscall_user_dispatch(unsigned long mode, unsigned long offset,
> if (offset && offset + len <= offset)
> return -EINVAL;
>
> - if (selector && !access_ok(selector, sizeof(*selector)))
> - return -EFAULT;
> -
> break;
> default:
> return -EINVAL;
>

The result of this would be either a task calling via prctl or a tracer
calling via ptrace would be capable of setting selector to a bad pointer
and producing a SIGSEGV on the next system call.

It's a pretty small footgun, but maybe that's reasonable?

From a user perspective, debugging this behavior would be nightmarish.
Your call to prctl/ptrace would succeed and the process would continue
to execute until the next syscall - at which point you incur a SIGSEGV,
and depending on the syscall that could mean anything?

Everything feels bad here lol.

~Gregory
Re: [PATCH v14 1/4] asm-generic,arm64: create task variant of access_ok [ In reply to ]
On 03/29, Gregory Price wrote:
>
> On Wed, Mar 29, 2023 at 07:13:22PM +0200, Oleg Nesterov wrote:
> >
> > - if (selector && !access_ok(selector, sizeof(*selector)))
> > - return -EFAULT;
> > -
> > break;
> > default:
> > return -EINVAL;
> >
>
> The result of this would be either a task calling via prctl or a tracer
> calling via ptrace would be capable of setting selector to a bad pointer
> and producing a SIGSEGV on the next system call.

Yes,

> It's a pretty small footgun, but maybe that's reasonable?

I hope this is reasonable,

> From a user perspective, debugging this behavior would be nightmarish.
> Your call to prctl/ptrace would succeed and the process would continue
> to execute until the next syscall - at which point you incur a SIGSEGV,

Yes. But how does this differ from the case when, for example, user
does prtcl(PR_SET_SYSCALL_USER_DISPATCH, selector = 1) ? Or another
bad address < TASK_SIZE?

access_ok() will happily succeed, then later syscall_user_dispatch()
will equally trigger SIGSEGV.

Oleg.
Re: [PATCH v14 1/4] asm-generic,arm64: create task variant of access_ok [ In reply to ]
On Wed, Mar 29, 2023 at 07:58:51PM +0200, Oleg Nesterov wrote:
> On 03/29, Gregory Price wrote:
> >
> > On Wed, Mar 29, 2023 at 07:13:22PM +0200, Oleg Nesterov wrote:
> > >
> > > - if (selector && !access_ok(selector, sizeof(*selector)))
> > > - return -EFAULT;
> > > -
> > > break;
> > > default:
> > > return -EINVAL;
> > >
> >
> > The result of this would be either a task calling via prctl or a tracer
> > calling via ptrace would be capable of setting selector to a bad pointer
> > and producing a SIGSEGV on the next system call.
>
> Yes,
>
> > It's a pretty small footgun, but maybe that's reasonable?
>
> I hope this is reasonable,
>
> > From a user perspective, debugging this behavior would be nightmarish.
> > Your call to prctl/ptrace would succeed and the process would continue
> > to execute until the next syscall - at which point you incur a SIGSEGV,
>
> Yes. But how does this differ from the case when, for example, user
> does prtcl(PR_SET_SYSCALL_USER_DISPATCH, selector = 1) ? Or another
> bad address < TASK_SIZE?
>
> access_ok() will happily succeed, then later syscall_user_dispatch()
> will equally trigger SIGSEGV.
>
> Oleg.
>

I'm convinced now, this feels like the correct solution. I will pull
your suggested patch ahead and drop the task variant of access_ok.

Am I ok to add your signed-off-by to the suggested patch, and i'll add
it to the series? Not quite sure what the correct set of tags is,
since i don't have any suggested changes to your patch.

~Gregory
Re: [PATCH v14 1/4] asm-generic,arm64: create task variant of access_ok [ In reply to ]
On 03/29, Gregory Price wrote:
>
> Am I ok to add your signed-off-by to the suggested patch, and i'll add
> it to the series? Not quite sure what the correct set of tags is,
> since i don't have any suggested changes to your patch.

Feel free to add my SOB, but probably Suggested-by: make more sense.

Oleg.
Re: [PATCH v14 1/4] asm-generic,arm64: create task variant of access_ok [ In reply to ]
On Wed, Mar 29, 2023, at 18:03, Oleg Nesterov wrote:
> On 03/29, Arnd Bergmann wrote:
>>
>> I think the idea of TASK_SIZE_MAX is that it is a compile-time constant and in fact independent of current, while TASK_SIZE
>> takes TIF_32BIT into account.
>
> Say, arch/loongarch defines TASK_SIZE which depends on
> test_thread_flag(TIF_32BIT_ADDR)
> but it doesn't define TASK_SIZE_MAX, so __access_ok() will use TASK_SIZE.

I'd consider that a bug in loongarch, though it's
as harmless as it gets: The only downside is that
it's missing an optimization from constant-folding
the value, and since there is no CONFIG_COMPAT on
loongarch yet, it doesn't even have a different
value.

TASK_SIZE_MAX become mandatory here when I worked
on the optimized access_ok() across architectures,
and the reason it's safe to use is that access_ok()
has to only guarantee that a task cannot access
data that it can't already access, i.e. kernel
data. Passing a pointer between TASK_SIZE and
TASK_SIZE_MAX will still cause a -EFAULT error
because of the trap.

Arnd
Re: [PATCH v14 1/4] asm-generic,arm64: create task variant of access_ok [ In reply to ]
On Wed, Mar 29, 2023 at 07:58:51PM +0200, Oleg Nesterov wrote:
> On 03/29, Gregory Price wrote:
> >
> > On Wed, Mar 29, 2023 at 07:13:22PM +0200, Oleg Nesterov wrote:
> > >
> > > - if (selector && !access_ok(selector, sizeof(*selector)))
> > > - return -EFAULT;
> > > -
> > > break;
> > > default:
> > > return -EINVAL;
> > >
> >
> > The result of this would be either a task calling via prctl or a tracer
> > calling via ptrace would be capable of setting selector to a bad pointer
> > and producing a SIGSEGV on the next system call.
>
> Yes,
>
> > It's a pretty small footgun, but maybe that's reasonable?
>
> I hope this is reasonable,
>
> > From a user perspective, debugging this behavior would be nightmarish.
> > Your call to prctl/ptrace would succeed and the process would continue
> > to execute until the next syscall - at which point you incur a SIGSEGV,
>
> Yes. But how does this differ from the case when, for example, user
> does prtcl(PR_SET_SYSCALL_USER_DISPATCH, selector = 1) ? Or another
> bad address < TASK_SIZE?
>
> access_ok() will happily succeed, then later syscall_user_dispatch()
> will equally trigger SIGSEGV.
>
> Oleg.
>

Last note on this before I push up another patch set.

The change from __get_user to get_user also introduces a call to
might_fault() which adds a larger callstack for every syscall /
dispatch. This turns into a might_sleep and might_reschedule, which
represent a very different pattern of execution from before.

At the very least, syscall-user-dispatch will be less performant as the
selector is validated on every syscall. I have to assume that is why
they chose to validate it upon activating SUD - to avoid the overhead.

The current cost of a dispatch is about 3-5us (2 context switches + the
signal system). This could be a small amount of overhead comparatively.
However, this additional overhead would apply to ALL system calls,
regardless of whether they dispatch or not. That seems concerning for
syscall hotpath code.


So given this, the three options I presently see available are:

1) drop access_ok on SUD setup, validate the pointer on every syscall
with get_user instead of __get user, or

2) create task_access_ok and deal with the TASK_SIZE implications (or
not? there seems to be some argument for and against)

3) indescriminately untag all pointers and allow tracers to set selector
to what otherwise would be a bad value in a particuarly degenerate
case. (There seems to be some argument for/against this?)

Will leave this for comment for a day or so before I push another set.

Personally I fall on the side of untagging the pointer for the access_ok
check, as its only real effect is the situation where a tracer has tagging
and enabled, and is tracing an untagged task. That seems extremely narrow,
and not particularly realistic, and the result is the tracee firing a
SIGSEGV - which is equivalent to allowing the pointer being invalid in
the first place without the additional overhead.

~Gregory
Re: [PATCH v14 1/4] asm-generic,arm64: create task variant of access_ok [ In reply to ]
On 03/29, Gregory Price wrote:
>
> Last note on this before I push up another patch set.
>
> The change from __get_user to get_user also introduces a call to
> might_fault() which adds a larger callstack for every syscall /
> dispatch. This turns into a might_sleep and might_reschedule, which
> represent a very different pattern of execution from before.

might_fault() is nop unless CONFIG_PROVE_LOCKING || DEBUG_ATOMIC_SLEEP.

Again, I won't really argue with task_access_ok(). Just I am not sure
2/4 gives enough justification for this new helper with unclear semantics
(until we ensure that access_ok() doesn't depend on current).

Oleg.
Re: [PATCH v14 1/4] asm-generic,arm64: create task variant of access_ok [ In reply to ]
On Wed, Mar 29, 2023 at 12:34:04AM -0400, Gregory Price wrote:
> On Wed, Mar 29, 2023 at 05:22:49PM +0100, Catalin Marinas wrote:
> > On Tue, Mar 28, 2023 at 12:48:08PM -0400, Gregory Price wrote:
> > > diff --git a/arch/arm64/include/asm/uaccess.h b/arch/arm64/include/asm/uaccess.h
> > > index 5c7b2f9d5913..1a51a54f264f 100644
> > > --- a/arch/arm64/include/asm/uaccess.h
> > > +++ b/arch/arm64/include/asm/uaccess.h
> > > @@ -35,7 +35,9 @@ static inline int __access_ok(const void __user *ptr, unsigned long size);
> > > * This is equivalent to the following test:
> > > * (u65)addr + (u65)size <= (u65)TASK_SIZE_MAX
> > > */
> > > -static inline int access_ok(const void __user *addr, unsigned long size)
> > > +static inline int task_access_ok(struct task_struct *task,
> > > + const void __user *addr,
> > > + unsigned long size)
> > > {
> > > /*
> > > * Asynchronous I/O running in a kernel thread does not have the
> > > @@ -43,11 +45,18 @@ static inline int access_ok(const void __user *addr, unsigned long size)
> > > * the user address before checking.
> > > */
> > > if (IS_ENABLED(CONFIG_ARM64_TAGGED_ADDR_ABI) &&
> > > - (current->flags & PF_KTHREAD || test_thread_flag(TIF_TAGGED_ADDR)))
> > > + (task->flags & PF_KTHREAD || test_ti_thread_flag(task, TIF_TAGGED_ADDR)))
> > > addr = untagged_addr(addr);
> > >
> > > return likely(__access_ok(addr, size));
> > > }
> > > +
> > > +static inline int access_ok(const void __user *addr, unsigned long size)
> > > +{
> > > + return task_access_ok(current, addr, size);
> > > +}
> > > +
> > > +#define task_access_ok task_access_ok
> >
> > I'd not bother with this at all. In the generic code you can either do
> > an __access_ok() check directly or just
> > access_ok(untagged_addr(selector), ...) with a comment that address
> > tagging of the ptraced task may not be enabled.
>
> This was my original proposal, but the comment that lead to this patch
> was the following:
>
> """
> If this would be correct, then access_ok() on arm64 would
> unconditionally untag the checked address, but it does not. Simply
> because untagging is only valid if the task enabled pointer tagging. If
> it didn't a tagged pointer is obviously invalid.
>
> Why would ptrace make this suddenly valid?
> """
>
> https://lore.kernel.org/all/87a605anvx.ffs@tglx/

Ah, thanks for the pointer.

For ptrace(), we live with this relaxation as there's no easy way to
check. Take __access_remote_vm() for example, it ends up in
get_user_pages_remote() -> ... -> __get_user_pages() which just untags
the address. Even if it would want to do this conditionally, the tag
pointer is enabled per thread (and inherited) but the GUP API only takes
the mm.

While we could improve it as ptrace() can tell which thread it is
tracing, I don't think it's worth the effort. On arm64, top-byte-ignore
was enabled from the start for in-user accesses but not at the syscall
level. We wanted to avoid breaking some use-cases with untagging all
user pointers, hence the explicit opt-in to catch some issues (glibc did
have a problem with brk() ignoring the top byte -
https://bugzilla.redhat.com/show_bug.cgi?id=1797052).

So yeah, this access_ok() in a few places is a best effort to catch some
potential ABI regressions like the one above and also as a way to force
the old ABI (mostly) via sysctl. But we do have places like GUP where we
don't have the thread information (only the mm), so we just
indiscriminately untag the pointer.

Note that there is no security risk for the access itself. The Arm
architecture selects the user vs kernel address spaces based on bit 55
rather than 63. Untagging a pointer sign-extends bit 55.

> I did not have a sufficient answer for this so I went down this path.
>
> It does seem simpler to simply untag the address, however it didn't seem
> like a good solution to simply leave an identified bad edge case.
>
> with access_ok(untagged_addr(addr), ...) it breaks down like this:
>
> (tracer,tracee) : result
>
> tag,tag : untagged - (correct)
> tag,untag : untagged - incorrect as this would have been an impossible
> state to reach through the standard prctl interface. Will
> lead to a SIGSEGV in the tracee upon next syscall

Well, even without untagging the pointer, the tracer can set a random
address that passes access_ok() but still faults in the tracee. It's the
tracer that should ensure the pointer is valid in the context of the
tracee.

Now, even if the selector pointer is tagged, the accesses still work
fine (top-byte-ignore) unless MTE is enabled in the tracee and the tag
should match the region's colour. But, again, that's no different from a
debugger changing pointer variables in the debugged process, they should
be valid and it's not for the kernel to sanitise them.

> untag,tag : untagged - (correct)
> untag,untag : no-op - (correct), tagged address will fail to set
>
> Basically if the tracer is a tagged process while the tracee is not, it
> would become possible to set the tracee's selector to a tagged pointer.

Yes, but does it matter? You'd trust the tracer to work correctly. There
are multiple ways it can break the tracee here even if access_ok()
worked as intended.

> It's beyond me to say whether or not this situation is "ok" and "the
> user's fault", but it does feel like an addressable problem.

To me, the situation looks fine. While it's addressable, we have other
places where the tag is ignored on the ptrace() path, so I don't think
it's worth the effort.

--
Catalin
Re: [PATCH v14 1/4] asm-generic,arm64: create task variant of access_ok [ In reply to ]
On Thu, Mar 30, 2023 at 03:05:07PM +0100, Catalin Marinas wrote:
>
> Ah, thanks for the pointer.
>
> For ptrace(), we live with this relaxation as there's no easy way to
> check. Take __access_remote_vm() for example, it ends up in
> get_user_pages_remote() -> ... -> __get_user_pages() which just untags
> the address. Even if it would want to do this conditionally, the tag
> pointer is enabled per thread (and inherited) but the GUP API only takes
> the mm.
>
> While we could improve it as ptrace() can tell which thread it is
> tracing, I don't think it's worth the effort. On arm64, top-byte-ignore
> was enabled from the start for in-user accesses but not at the syscall
> level. We wanted to avoid breaking some use-cases with untagging all
> user pointers, hence the explicit opt-in to catch some issues (glibc did
> have a problem with brk() ignoring the top byte -
> https://bugzilla.redhat.com/show_bug.cgi?id=1797052).
>
> So yeah, this access_ok() in a few places is a best effort to catch some
> potential ABI regressions like the one above and also as a way to force
> the old ABI (mostly) via sysctl. But we do have places like GUP where we
> don't have the thread information (only the mm), so we just
> indiscriminately untag the pointer.
>
> Note that there is no security risk for the access itself. The Arm
> architecture selects the user vs kernel address spaces based on bit 55
> rather than 63. Untagging a pointer sign-extends bit 55.
>
> > I did not have a sufficient answer for this so I went down this path.
> >
> > It does seem simpler to simply untag the address, however it didn't seem
> > like a good solution to simply leave an identified bad edge case.
> >
> > with access_ok(untagged_addr(addr), ...) it breaks down like this:
> >
> > (tracer,tracee) : result
> >
> > tag,tag : untagged - (correct)
> > tag,untag : untagged - incorrect as this would have been an impossible
> > state to reach through the standard prctl interface. Will
> > lead to a SIGSEGV in the tracee upon next syscall
>
> Well, even without untagging the pointer, the tracer can set a random
> address that passes access_ok() but still faults in the tracee. It's the
> tracer that should ensure the pointer is valid in the context of the
> tracee.
>
> Now, even if the selector pointer is tagged, the accesses still work
> fine (top-byte-ignore) unless MTE is enabled in the tracee and the tag
> should match the region's colour. But, again, that's no different from a
> debugger changing pointer variables in the debugged process, they should
> be valid and it's not for the kernel to sanitise them.
>
> > untag,tag : untagged - (correct)
> > untag,untag : no-op - (correct), tagged address will fail to set
> >
> > Basically if the tracer is a tagged process while the tracee is not, it
> > would become possible to set the tracee's selector to a tagged pointer.
>
> Yes, but does it matter? You'd trust the tracer to work correctly. There
> are multiple ways it can break the tracee here even if access_ok()
> worked as intended.
>
> > It's beyond me to say whether or not this situation is "ok" and "the
> > user's fault", but it does feel like an addressable problem.
>
> To me, the situation looks fine. While it's addressable, we have other
> places where the tag is ignored on the ptrace() path, so I don't think
> it's worth the effort.
>
> --
> Catalin

Thank you for the extensive breakdown. Given this, it seems like I
should just revert to untagging the pointer and drop the access_ok
extensions.

I'll add a comment at the untag site that discusses this interaction.

Thanks!
Gregory