
[patch 62/63] locking/rtmutex: Add adaptive spinwait mechanism
From: Steven Rostedt <rostedt@goodmis.org>

Going to sleep when a spinlock or rwlock is contended can be quite
inefficient when the contention time is short and the lock owner is running
on a different CPU. The MCS mechanism is not applicable to rtmutex based
locks, so provide a simple adaptive spinwait mechanism for the RT specific
spin/rwlock implementations.

[ tglx: Provide a contemporary changelog ]

Originally-by: Gregory Haskins <ghaskins@novell.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
kernel/locking/rtmutex.c | 50 ++++++++++++++++++++++++++++++++++++++++++++++-
1 file changed, 49 insertions(+), 1 deletion(-)
---
--- a/kernel/locking/rtmutex.c
+++ b/kernel/locking/rtmutex.c
@@ -8,6 +8,11 @@
  * Copyright (C) 2005-2006 Timesys Corp., Thomas Gleixner <tglx@timesys.com>
  * Copyright (C) 2005 Kihon Technologies Inc., Steven Rostedt
  * Copyright (C) 2006 Esben Nielsen
+ * Adaptive Spinlocks:
+ *  Copyright (C) 2008 Novell, Inc., Gregory Haskins, Sven Dietrich,
+ *                                   and Peter Morreale,
+ * Adaptive Spinlocks simplification:
+ *  Copyright (C) 2008 Red Hat, Inc., Steven Rostedt <srostedt@redhat.com>
  *
  * See Documentation/locking/rt-mutex-design.rst for details.
  */
@@ -1529,6 +1534,43 @@ static __always_inline int __rt_mutex_lo
  * Functions required for spin/rw_lock substitution on RT kernels
  */

+#ifdef CONFIG_SMP
+/*
+ * Note that owner is a speculative pointer and dereferencing relies
+ * on rcu_read_lock() and the check against the lock owner.
+ */
+static bool rtlock_adaptive_spinwait(struct rt_mutex_base *lock,
+                                     struct task_struct *owner)
+{
+        bool res = true;
+
+        rcu_read_lock();
+        for (;;) {
+                /* Owner changed. Trylock again */
+                if (owner != rt_mutex_owner(lock))
+                        break;
+                /*
+                 * Ensure that owner->on_cpu is dereferenced _after_
+                 * checking the above to be valid.
+                 */
+                barrier();
+                if (!owner->on_cpu) {
+                        res = false;
+                        break;
+                }
+                cpu_relax();
+        }
+        rcu_read_unlock();
+        return res;
+}
+#else
+static bool rtlock_adaptive_spinwait(struct rt_mutex_base *lock,
+                                     struct task_struct *owner)
+{
+        return false;
+}
+#endif
+
 /**
  * rtlock_slowlock_locked - Slow path lock acquisition for RT locks
  * @lock: The underlying rt mutex
@@ -1536,6 +1578,7 @@ static __always_inline int __rt_mutex_lo
 static void __sched rtlock_slowlock_locked(struct rt_mutex_base *lock)
 {
         struct rt_mutex_waiter waiter;
+        struct task_struct *owner;
 
         lockdep_assert_held(&lock->wait_lock);

@@ -1554,9 +1597,14 @@ static void __sched rtlock_slowlock_lock
                 if (try_to_take_rt_mutex(lock, current, &waiter))
                         break;
 
+                if (&waiter == rt_mutex_top_waiter(lock))
+                        owner = rt_mutex_owner(lock);
+                else
+                        owner = NULL;
                 raw_spin_unlock_irq(&lock->wait_lock);
 
-                schedule_rtlock();
+                if (!owner || !rtlock_adaptive_spinwait(lock, owner))
+                        schedule_rtlock();
 
                 raw_spin_lock_irq(&lock->wait_lock);
                 set_current_state(TASK_RTLOCK_WAIT);
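
For context, lock->wait_lock is taken by the slow-path entry point; a
minimal sketch of that caller, assuming the shape the rest of this RT
series gives it (rtlock_slowlock() itself is not part of this patch):

/*
 * Sketch only (assumed from the surrounding series): the entry point
 * that holds wait_lock around rtlock_slowlock_locked(), satisfying
 * the lockdep assertion above.
 */
static __always_inline void __sched rtlock_slowlock(struct rt_mutex_base *lock)
{
        unsigned long flags;

        raw_spin_lock_irqsave(&lock->wait_lock, flags);
        rtlock_slowlock_locked(lock);
        raw_spin_unlock_irqrestore(&lock->wait_lock, flags);
}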

Re: [patch 62/63] locking/rtmutex: Add adaptive spinwait mechanism
On Fri, Jul 30, 2021 at 03:51:09PM +0200, Thomas Gleixner wrote:
> From: Steven Rostedt <rostedt@goodmis.org>
>
> Going to sleep when a spinlock or rwlock is contended can be quite
> inefficient when the contention time is short and the lock owner is running
> on a different CPU. The MCS mechanism is not applicable to rtmutex based
> locks, so provide a simple adaptive spinwait mechanism for the RT specific
> spin/rwlock implementations.

A better Changelog would explain *why* OSQ does not apply. I'm thinking
this is because the (spin) waiters can be of different priorities and we
need to ensure the highest prio waiter gets to win?

AFAICT that isn't true even without OSQ; you just get a thundering herd,
and the higher prio waiter has a better chance of winning the race, but
all bets are off either way around.

> [ tglx: Provide a contemporary changelog ]

It might be best to squash this and the next patch, this back and forth
doesn't make much sense at this point.

> +#ifdef CONFIG_SMP

Existing convention would make that:

#ifdef CONFIG_RTMUTEX_SPIN_ON_OWNER

But I suppose that's indeed not required if we don't use OSQ.

> +/*
> + * Note that owner is a speculative pointer and dereferencing relies
> + * on rcu_read_lock() and the check against the lock owner.
> + */
> +static bool rtlock_adaptive_spinwait(struct rt_mutex_base *lock,
> +                                     struct task_struct *owner)

similarly, this would be:

rt_mutex_spin_on_owner()

> +{
> +        bool res = true;
> +
> +        rcu_read_lock();
> +        for (;;) {
> +                /* Owner changed. Trylock again */
> +                if (owner != rt_mutex_owner(lock))
> +                        break;
> +                /*
> +                 * Ensure that owner->on_cpu is dereferenced _after_
> +                 * checking the above to be valid.
> +                 */
> +                barrier();
> +                if (!owner->on_cpu) {

Esp. when this will be on rtmutex unconditionally, you want to mirror
the full set of conditions we also have on mutex_spin_on_owner():

|| need_resched() || vcpu_is_preempted(task_cpu(owner))

> +                        res = false;
> +                        break;
> +                }
> +                cpu_relax();
> +        }
> +        rcu_read_unlock();
> +        return res;
> +}
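
Concretely, folding those two checks into the spin loop would look
something like this (a sketch only; need_resched() and
vcpu_is_preempted() are the same helpers mutex_spin_on_owner() uses):

                barrier();
                /*
                 * Sketch: also stop spinning when this task should
                 * reschedule, or when the owner's vCPU was preempted.
                 */
                if (!owner->on_cpu || need_resched() ||
                    vcpu_is_preempted(task_cpu(owner))) {
                        res = false;
                        break;
                }
                cpu_relax();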

Additionally, we could consider adding something that would compare the
current prio to the top_waiter prio and terminate the loop if we're
found to be of lower prio, but lifetime issues are probably going to
make that 'interesting'.

Re: [patch 62/63] locking/rtmutex: Add adaptive spinwait mechanism
On Wed, Aug 04 2021 at 14:30, Peter Zijlstra wrote:
> On Fri, Jul 30, 2021 at 03:51:09PM +0200, Thomas Gleixner wrote:
>> From: Steven Rostedt <rostedt@goodmis.org>
>>
>> Going to sleep when a spinlock or rwlock is contended can be quite
>> inefficient when the contention time is short and the lock owner is running
>> on a different CPU. The MCS mechanism is not applicable to rtmutex based
>> locks, so provide a simple adaptive spinwait mechanism for the RT specific
>> spin/rwlock implementations.
>
> A better Changelog would explain *why* OSQ does not apply. I'm thinking
> this is because the (spin) waiters can be of different priorities and we
> need to ensure the highest prio waiter gets to win?
>
> AFAICT that isn't true even without OSQ; you just get a thundering herd,
> and the higher prio waiter has a better chance of winning the race, but
> all bets are off either way around.

Will do.

>> +#ifdef CONFIG_SMP
>
> Existing convention would make that:
>
> #ifdef CONFIG_RTMUTEX_SPIN_ON_OWNER
>
> But I suppose that's indeed not required if we don't use OSQ.

Right.

>> +/*
>> + * Note that owner is a speculative pointer and dereferencing relies
>> + * on rcu_read_lock() and the check against the lock owner.
>> + */
>> +static bool rtlock_adaptive_spinwait(struct rt_mutex_base *lock,
>> +                                     struct task_struct *owner)
>
> similarly, this would be:
>
> rt_mutex_spin_on_owner()

Duh.
>
> Esp. when this will be on rtmutex unconditionally, you want to mirror
> the full set of conditions we also have on mutex_spin_on_owner():
>
> || need_resched() || vcpu_is_preempted(task_cpu(owner))

Sure.

>> +                        res = false;
>> +                        break;
>> +                }
>> +                cpu_relax();
>> +        }
>> +        rcu_read_unlock();
>> +        return res;
>> +}
>
> Additionally, we could consider adding something that would compare the
> current prio to the top_waiter prio and terminate the loop if we're
> found to be of lower prio, but lifetime issues are probably going to
> make that 'interesting'.

It's only the top priority waiter which can spin. If all of them start
spinning then everything goes down the drain.

Thanks,

tglx