Mailing List Archive

[PATCH] mm/memcg: Disable task obj_stock for PREEMPT_RT
For PREEMPT_RT kernel, preempt_disable() and local_irq_save()
are typically converted to local_lock() and local_lock_irqsave()
respectively. These two variants of local_lock() are essentially
the same. Thus, there is no performance advantage in choosing one
over the other.

As there is no point in maintaining two different sets of obj_stock,
it is simpler and more efficient to just disable task_obj and use
only irq_obj for PREEMPT_RT. However, task_obj will still be there
in the memcg_stock_pcp structure even though it is not used in this
configuration.

Signed-off-by: Waiman Long <longman@redhat.com>
---
mm/memcontrol.c | 18 ++++++++++++++----
1 file changed, 14 insertions(+), 4 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 87c883227f90..4f80770cb97b 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -2120,12 +2120,22 @@ static bool obj_stock_flush_required(struct memcg_stock_pcp *stock,
  * which is cheap in non-preempt kernel. The interrupt context object stock
  * can only be accessed after disabling interrupt. User context code can
  * access interrupt object stock, but not vice versa.
+ *
+ * For PREEMPT_RT kernel, preempt_disable() and local_irq_save() may have
+ * to be changed to variants of local_lock(). This eliminates the
+ * performance advantage of using preempt_disable(). Fall back to always
+ * use local_irq_save() and use only irq_obj for simplicity.
  */
+static inline bool use_task_obj_stock(void)
+{
+	return !IS_ENABLED(CONFIG_PREEMPT_RT) && likely(in_task());
+}
+
 static inline struct obj_stock *get_obj_stock(unsigned long *pflags)
 {
 	struct memcg_stock_pcp *stock;

-	if (likely(in_task())) {
+	if (use_task_obj_stock()) {
 		*pflags = 0UL;
 		preempt_disable();
 		stock = this_cpu_ptr(&memcg_stock);
@@ -2139,7 +2149,7 @@ static inline struct obj_stock *get_obj_stock(unsigned long *pflags)

 static inline void put_obj_stock(unsigned long flags)
 {
-	if (likely(in_task()))
+	if (use_task_obj_stock())
 		preempt_enable();
 	else
 		local_irq_restore(flags);
@@ -2212,7 +2222,7 @@ static void drain_local_stock(struct work_struct *dummy)

 	stock = this_cpu_ptr(&memcg_stock);
 	drain_obj_stock(&stock->irq_obj);
-	if (in_task())
+	if (use_task_obj_stock())
 		drain_obj_stock(&stock->task_obj);
 	drain_stock(stock);
 	clear_bit(FLUSHING_CACHED_CHARGE, &stock->flags);
@@ -3217,7 +3227,7 @@ static bool obj_stock_flush_required(struct memcg_stock_pcp *stock,
 {
 	struct mem_cgroup *memcg;

-	if (in_task() && stock->task_obj.cached_objcg) {
+	if (use_task_obj_stock() && stock->task_obj.cached_objcg) {
 		memcg = obj_cgroup_memcg(stock->task_obj.cached_objcg);
 		if (memcg && mem_cgroup_is_descendant(memcg, root_memcg))
 			return true;
--
2.18.1
Re: [PATCH] mm/memcg: Disable task obj_stock for PREEMPT_RT
Waiman,

On Tue, Aug 03 2021 at 13:55, Waiman Long wrote:

please Cc RT people on RT related patches.

> For PREEMPT_RT kernel, preempt_disable() and local_irq_save()
> are typically converted to local_lock() and local_lock_irqsave()
> respectively.

That's just wrong. local_lock has a clear value even on !RT kernels. See

https://www.kernel.org/doc/html/latest/locking/locktypes.html#local-lock

> These two variants of local_lock() are essentially
> the same.

Only on RT kernels.

> + * For PREEMPT_RT kernel, preempt_disable() and local_irq_save() may have
> + * to be changed to variants of local_lock(). This eliminates the
> + * performance advantage of using preempt_disable(). Fall back to always
> + * use local_irq_save() and use only irq_obj for simplicity.

Instead of adding that comment you could have just done the full
conversion, but see below.

> */
> +static inline bool use_task_obj_stock(void)
> +{
> + return !IS_ENABLED(CONFIG_PREEMPT_RT) && likely(in_task());
> +}
> +
> static inline struct obj_stock *get_obj_stock(unsigned long *pflags)
> {
> struct memcg_stock_pcp *stock;
>
> - if (likely(in_task())) {
> + if (use_task_obj_stock()) {
> *pflags = 0UL;
> preempt_disable();
> stock = this_cpu_ptr(&memcg_stock);

This is clearly the kind of conditional locking which is frowned upon
rightfully.

So if we go to reenable memcg for RT we end up with:

if (use_task_obj_stock()) {
preempt_disable();
} else {
local_lock_irqsave(memcg_stock_lock, flags);
}

and further down we end up with:

> @@ -2212,7 +2222,7 @@ static void drain_local_stock(struct work_struct *dummy)
>
> stock = this_cpu_ptr(&memcg_stock);
> drain_obj_stock(&stock->irq_obj);
> - if (in_task())
> + if (use_task_obj_stock())
> drain_obj_stock(&stock->task_obj);
> drain_stock(stock);
> clear_bit(FLUSHING_CACHED_CHARGE, &stock->flags);

/*
* The only protection from memory hotplug vs. drain_stock races is
* that we always operate on local CPU stock here with IRQ disabled
*/
- local_irq_save(flags);
+ local_lock_irqsave(memcg_stock_lock, flags);
...
if (use_task_obj_stock())
drain_obj_stock(&stock->task_obj);

which is incomprehensible garbage.

The comment above the existing local_irq_save() is garbage w/o any local
lock conversion already today (and even before the commit which
introduced stock::task_obj) simply because that comment does not explain
the why.

I can just assume that for stock->task_obj the IRQ protection is
completely irrelevant. If not and _all_ members of stock have to be
protected against memory hotplug by disabling interrupts then any other
function which just disables preemption is broken.

To complete the analysis of drain_local_stock(). AFAICT that function
can only be called from task context. So what is the purpose of this
in_task() conditional there?

if (in_task())
drain_obj_stock(&stock->task_obj);

I assume it's mechanical conversion of:

- drain_obj_stock(stock);
+ drain_obj_stock(&stock->irq_obj);
+ if (in_task())
+ drain_obj_stock(&stock->task_obj);

all over the place without actually looking at the surrounding code,
comments and call sites.

This patch is certainly in line with that approach, but it's just adding
more confusion.

Thanks,

tglx
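
The conditional locking criticised above can be modelled in userspace. This is only a sketch under stated assumptions: struct cpu_model and the *_model() functions are hypothetical, a counter and a flag stand in for preempt_disable() and local_irq_save(), and the reading that the trouble lies in both sides having to agree on a runtime predicate is an interpretation, not something the mail spells out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model of one CPU's protection state. */
struct cpu_model {
	int preempt_count;	/* > 0: preemption disabled */
	bool irqs_off;		/* true: interrupts disabled */
};

/* Same predicate as in the patch, with config and context passed in. */
static bool use_task_obj_stock_model(bool preempt_rt, bool in_task)
{
	return !preempt_rt && in_task;
}

/* Acquire side: which protection is taken depends on a runtime predicate. */
static void get_obj_stock_model(struct cpu_model *cpu, bool preempt_rt,
				bool in_task, bool *saved_irqs_off)
{
	if (use_task_obj_stock_model(preempt_rt, in_task)) {
		*saved_irqs_off = false;	/* unused on this path */
		cpu->preempt_count++;		/* kernel: preempt_disable() */
	} else {
		*saved_irqs_off = cpu->irqs_off;
		cpu->irqs_off = true;		/* kernel: local_irq_save() */
	}
}

/* Release side: has to re-evaluate the same predicate to stay balanced. */
static void put_obj_stock_model(struct cpu_model *cpu, bool preempt_rt,
				bool in_task, bool saved_irqs_off)
{
	if (use_task_obj_stock_model(preempt_rt, in_task)) {
		assert(cpu->preempt_count > 0);
		cpu->preempt_count--;		/* kernel: preempt_enable() */
	} else {
		cpu->irqs_off = saved_irqs_off;	/* kernel: local_irq_restore() */
	}
}
```

Nothing at a call site says which of the two protections is held. A reader has to evaluate the predicate to find out, and the acquire and release sides only pair up if they evaluate it identically.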
Re: [PATCH] mm/memcg: Disable task obj_stock for PREEMPT_RT
On 8/3/21 7:21 PM, Thomas Gleixner wrote:
> Waiman,
>
> On Tue, Aug 03 2021 at 13:55, Waiman Long wrote:
>
> please Cc RT people on RT related patches.
>
>> For PREEMPT_RT kernel, preempt_disable() and local_irq_save()
>> are typically converted to local_lock() and local_lock_irqsave()
>> respectively.
> That's just wrong. local_lock has a clear value even on !RT kernels. See
>
> https://www.kernel.org/doc/html/latest/locking/locktypes.html#local-lock
>
I understand what local_lock is for. On a !RT kernel, local_lock() still
requires the use of a pseudo lock, and adding one is not the goal of this
patch.
>> These two variants of local_lock() are essentially
>> the same.
> Only on RT kernels.
That is right. So this is a change aimed at easier integration with the RT
kernel.
>
>> + * For PREEMPT_RT kernel, preempt_disable() and local_irq_save() may have
>> + * to be changed to variants of local_lock(). This eliminates the
>> + * performance advantage of using preempt_disable(). Fall back to always
>> + * use local_irq_save() and use only irq_obj for simplicity.
> Instead of adding that comment you could have just done the full
> conversion, but see below.
Well, I can do that if you want me to.
>
>> */
>> +static inline bool use_task_obj_stock(void)
>> +{
>> + return !IS_ENABLED(CONFIG_PREEMPT_RT) && likely(in_task());
>> +}
>> +
>> static inline struct obj_stock *get_obj_stock(unsigned long *pflags)
>> {
>> struct memcg_stock_pcp *stock;
>>
>> - if (likely(in_task())) {
>> + if (use_task_obj_stock()) {
>> *pflags = 0UL;
>> preempt_disable();
>> stock = this_cpu_ptr(&memcg_stock);
> This is clearly the kind of conditional locking which is frowned upon
> rightfully.
>
> So if we go to reenable memcg for RT we end up with:
>
> if (use_task_obj_stock()) {
> preempt_disable();
> } else {
> local_lock_irqsave(memcg_stock_lock, flags);
> }
>
> and further down we end up with:
The purpose of this series is to improve kmem_cache allocation and free
performance for non-RT kernels. So not disabling/enabling interrupts helps
a bit in this regard.
>
>> @@ -2212,7 +2222,7 @@ static void drain_local_stock(struct work_struct *dummy)
>>
>> stock = this_cpu_ptr(&memcg_stock);
>> drain_obj_stock(&stock->irq_obj);
>> - if (in_task())
>> + if (use_task_obj_stock())
>> drain_obj_stock(&stock->task_obj);
>> drain_stock(stock);
>> clear_bit(FLUSHING_CACHED_CHARGE, &stock->flags);
>
> /*
> * The only protection from memory hotplug vs. drain_stock races is
> * that we always operate on local CPU stock here with IRQ disabled
> */
> - local_irq_save(flags);
> + local_lock_irqsave(memcg_stock_lock, flags);
> ...
> if (use_task_obj_stock())
> drain_obj_stock(&stock->task_obj);
>
> which is incomprehensible garbage.
>
> The comment above the existing local_irq_save() is garbage w/o any local
> lock conversion already today (and even before the commit which
> introduced stock::task_obj) simply because that comment does not explain
> the why.
That comment was added by commit 72f0184c8a00 ("mm, memcg: remove
hotplug locking from try_charge"). It was there before my commits.

>
> I can just assume that for stock->task_obj the IRQ protection is
> completely irrelevant. If not and _all_ members of stock have to be
> protected against memory hotplug by disabling interrupts then any other
> function which just disables preemption is broken.
That is correct specifically for task_obj, but not for other data.
>
> To complete the analysis of drain_local_stock(). AFAICT that function
> can only be called from task context. So what is the purpose of this
> in_task() conditional there?
>
> if (in_task())
> drain_obj_stock(&stock->task_obj);
I haven't done a full analysis to see if it can be called from task
context only. Maybe in_task() check isn't needed, but having it there
provides the safety that it will still work in case it can be called
from interrupt context.
>
> I assume it's mechanical conversion of:
>
> - drain_obj_stock(stock);
> + drain_obj_stock(&stock->irq_obj);
> + if (in_task())
> + drain_obj_stock(&stock->task_obj);
>
> all over the place without actually looking at the surrounding code,
> comments and call sites.
>
> This patch is certainly in line with that approach, but it's just adding
> more confusion.

What is your suggestion for improving this patch?

Cheers,
Longman
Re: [PATCH] mm/memcg: Disable task obj_stock for PREEMPT_RT
On 8/4/21 1:21 AM, Thomas Gleixner wrote:
> /*
> * The only protection from memory hotplug vs. drain_stock races is
> * that we always operate on local CPU stock here with IRQ disabled
> */
> - local_irq_save(flags);
> + local_lock_irqsave(memcg_stock_lock, flags);
> ...
> if (use_task_obj_stock())
> drain_obj_stock(&stock->task_obj);
>
> which is incomprehensible garbage.
>
> The comment above the existing local_irq_save() is garbage w/o any local
> lock conversion already today (and even before the commit which
> introduced stock::task_obj) simply because that comment does not explain
> the why.

Michal, this seems to be your comment from commit 72f0184c8a00 ("mm, memcg:
remove hotplug locking from try_charge"). Was "memory hotplug" a mistake,
because the rest of the commit is about cpu hotplug, and I don't really see a
memory hotplug connection there?
Re: [PATCH] mm/memcg: Disable task obj_stock for PREEMPT_RT
On Wed 04-08-21 09:39:23, Vlastimil Babka wrote:
> On 8/4/21 1:21 AM, Thomas Gleixner wrote:
> > /*
> > * The only protection from memory hotplug vs. drain_stock races is
> > * that we always operate on local CPU stock here with IRQ disabled
> > */
> > - local_irq_save(flags);
> > + local_lock_irqsave(memcg_stock_lock, flags);
> > ...
> > if (use_task_obj_stock())
> > drain_obj_stock(&stock->task_obj);
> >
> > which is incomprehensible garbage.
> >
> > The comment above the existing local_irq_save() is garbage w/o any local
> > lock conversion already today (and even before the commit which
> > introduced stock::task_obj) simply because that comment does not explain
> > the why.
>
> Michal, this seems to be your comment from commit 72f0184c8a00 ("mm, memcg:
> remove hotplug locking from try_charge"). Was "memory hotplug" a mistake,
> because the rest of the commit is about cpu hotplug, and I don't really see a
> memory hotplug connection there?

This part of the changelog tried to explain that part IIRC
"
We can get rid of {get,put}_online_cpus, fortunately. We do not have to
be worried about races with memory hotplug because drain_local_stock,
which is called from both the WQ draining and the memory hotplug
contexts, is always operating on the local cpu stock with IRQs disabled.
"

Now I have to admit I do not remember all the details and from a quick
look the memory hotplug doesn't seem to be draining memcg pcp stock.
Maybe this has been removed since then. The only stock draining outside
of the memcg code seems to be memcg_hotplug_cpu_dead callback. That
would indicate that I really meant the cpu hotplug here indeed.

--
Michal Hocko
SUSE Labs
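
The two draining paths described above can be sketched in userspace. This is a model under stated assumptions, not the kernel implementation: the per-CPU data is a plain array, the *_model names are hypothetical, and the assumption that the cpu hotplug callback drains the stock of the CPU that has already gone offline is inferred from its name and from the mail, not from code shown in this thread.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS_MODEL 4

/* Hypothetical reduced per-CPU stock: just a count of cached pages. */
struct stock_model {
	unsigned int nr_pages;
};

static struct stock_model stocks[NR_CPUS_MODEL];
static bool cpu_online_model[NR_CPUS_MODEL] = { true, true, true, true };
static bool irqs_off_model[NR_CPUS_MODEL];

/* Empties one stock and reports how many pages it held. */
static unsigned int drain_stock_model(struct stock_model *s)
{
	unsigned int n = s->nr_pages;

	s->nr_pages = 0;
	return n;
}

/* Runs on 'cpu' itself and only ever touches that CPU's own stock. */
static unsigned int drain_local_stock_model(int cpu)
{
	unsigned int n;

	assert(cpu_online_model[cpu]);
	irqs_off_model[cpu] = true;	/* kernel: local_irq_save(flags) */
	n = drain_stock_model(&stocks[cpu]);
	irqs_off_model[cpu] = false;	/* kernel: local_irq_restore(flags) */
	return n;
}

/* Runs on a surviving CPU once 'dead_cpu' is offline (assumed). */
static unsigned int hotplug_cpu_dead_model(int dead_cpu)
{
	assert(!cpu_online_model[dead_cpu]);
	return drain_stock_model(&stocks[dead_cpu]);
}
```

In this model the two paths cannot touch the same stock at the same time: the local drain requires its CPU to be online, and the hotplug drain requires its target to be offline.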
Re: [PATCH] mm/memcg: Disable task obj_stock for PREEMPT_RT
On Tue 03-08-21 21:40:35, Waiman Long wrote:
[...]
> The purpose of this series is to improve kmem_cache allocation and free
> performance for non-RT kernels. So not disabling/enabling interrupts helps a
> bit in this regard.

Johannes has explained the irq disabling role in the stock draining just
yesterday. Have a look at http://lkml.kernel.org/r/YQlPiLY0ieRb704V@cmpxchg.org
--
Michal Hocko
SUSE Labs
Re: [PATCH] mm/memcg: Disable task obj_stock for PREEMPT_RT
On 8/3/21 9:40 PM, Waiman Long wrote:
> On 8/3/21 7:21 PM, Thomas Gleixner wrote:
>> To complete the analysis of drain_local_stock(). AFAICT that function
>> can only be called from task context. So what is the purpose of this
>> in_task() conditional there?
>>
>>     if (in_task())
>>            drain_obj_stock(&stock->task_obj);
> I haven't done a full analysis to see if it can be called from task
> context only. Maybe in_task() check isn't needed, but having it there
> provides the safety that it will still work in case it can be called
> from interrupt context.

I looked at the possible call chains that can lead to
drain_local_stock(). One of them comes from the allocation of slab
objects, which I had previously determined to be callable from interrupt
context. So it is prudent to keep an in_task() check here.

Cheers,
Longman
Re: [PATCH] mm/memcg: Disable task obj_stock for PREEMPT_RT
On Wed 04-08-21 10:33:41, Michal Hocko wrote:
> On Wed 04-08-21 09:39:23, Vlastimil Babka wrote:
> > On 8/4/21 1:21 AM, Thomas Gleixner wrote:
> > > /*
> > > * The only protection from memory hotplug vs. drain_stock races is
> > > * that we always operate on local CPU stock here with IRQ disabled
> > > */
> > > - local_irq_save(flags);
> > > + local_lock_irqsave(memcg_stock_lock, flags);
> > > ...
> > > if (use_task_obj_stock())
> > > drain_obj_stock(&stock->task_obj);
> > >
> > > which is incomprehensible garbage.
> > >
> > > The comment above the existing local_irq_save() is garbage w/o any local
> > > lock conversion already today (and even before the commit which
> > > introduced stock::task_obj) simply because that comment does not explain
> > > the why.
> >
> > Michal, this seems to be your comment from commit 72f0184c8a00 ("mm, memcg:
> > remove hotplug locking from try_charge"). Was "memory hotplug" a mistake,
> > because the rest of the commit is about cpu hotplug, and I don't really see a
> > memory hotplug connection there?
>
> This part of the changelog tried to explain that part IIRC
> "
> We can get rid of {get,put}_online_cpus, fortunately. We do not have to
> be worried about races with memory hotplug because drain_local_stock,
> which is called from both the WQ draining and the memory hotplug
> contexts, is always operating on the local cpu stock with IRQs disabled.
> "
>
> Now I have to admit I do not remember all the details and from a quick
> look the memory hotplug doesn't seem to be draining memcg pcp stock.
> Maybe this has been removed since then. The only stock draining outside
> of the memcg code seems to be memcg_hotplug_cpu_dead callback. That
> would indicate that I really meant the cpu hotplug here indeed.

Does this look better?
---

From 5aa1c8ce0d88b8c6d59ba95c7e36ca07dc2b2161 Mon Sep 17 00:00:00 2001
From: Michal Hocko <mhocko@suse.com>
Date: Mon, 9 Aug 2021 10:59:04 +0200
Subject: [PATCH] memcg: fix up drain_local_stock comment

Thomas and Vlastimil have noticed that the comment in drain_local_stock
doesn't quite make sense. It talks about a synchronization with the
memory hotplug but there is no actual memory hotplug involvement here.
I meant to talk about cpu hotplug here. Fix that up and hopefully make
the comment more helpful by referencing the cpu hotplug callback as
well.

Signed-off-by: Michal Hocko <mhocko@suse.com>
---
mm/memcontrol.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index eb8e87c4833f..f7be7b01395e 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -2205,8 +2205,9 @@ static void drain_local_stock(struct work_struct *dummy)
 	unsigned long flags;

 	/*
-	 * The only protection from memory hotplug vs. drain_stock races is
-	 * that we always operate on local CPU stock here with IRQ disabled
+	 * The only protection from cpu hotplug (memcg_hotplug_cpu_dead) vs.
+	 * drain_stock races is that we always operate on local CPU stock
+	 * here with IRQ disabled
 	 */
 	local_irq_save(flags);

--
2.30.1

--
Michal Hocko
SUSE Labs
Re: [PATCH] mm/memcg: Disable task obj_stock for PREEMPT_RT
On 8/9/21 11:07 AM, Michal Hocko wrote:
> On Wed 04-08-21 10:33:41, Michal Hocko wrote:
>> On Wed 04-08-21 09:39:23, Vlastimil Babka wrote:
>> > On 8/4/21 1:21 AM, Thomas Gleixner wrote:
>> > > /*
>> > > * The only protection from memory hotplug vs. drain_stock races is
>> > > * that we always operate on local CPU stock here with IRQ disabled
>> > > */
>> > > - local_irq_save(flags);
>> > > + local_lock_irqsave(memcg_stock_lock, flags);
>> > > ...
>> > > if (use_task_obj_stock())
>> > > drain_obj_stock(&stock->task_obj);
>> > >
>> > > which is incomprehensible garbage.
>> > >
>> > > The comment above the existing local_irq_save() is garbage w/o any local
>> > > lock conversion already today (and even before the commit which
>> > > introduced stock::task_obj) simply because that comment does not explain
>> > > the why.
>> >
>> > Michal, this seems to be your comment from commit 72f0184c8a00 ("mm, memcg:
>> > remove hotplug locking from try_charge"). Was "memory hotplug" a mistake,
>> > because the rest of the commit is about cpu hotplug, and I don't really see a
>> > memory hotplug connection there?
>>
>> This part of the changelog tried to explain that part IIRC
>> "
>> We can get rid of {get,put}_online_cpus, fortunately. We do not have to
>> be worried about races with memory hotplug because drain_local_stock,
>> which is called from both the WQ draining and the memory hotplug
>> contexts, is always operating on the local cpu stock with IRQs disabled.
>> "
>>
>> Now I have to admit I do not remember all the details and from a quick
>> look the memory hotplug doesn't seem to be draining memcg pcp stock.
>> Maybe this has been removed since then. The only stock draining outside
>> of the memcg code seems to be memcg_hotplug_cpu_dead callback. That
>> would indicate that I really meant the cpu hotplug here indeed.
>
> Does this look better?

Yes, thanks.

> ---
>
> From 5aa1c8ce0d88b8c6d59ba95c7e36ca07dc2b2161 Mon Sep 17 00:00:00 2001
> From: Michal Hocko <mhocko@suse.com>
> Date: Mon, 9 Aug 2021 10:59:04 +0200
> Subject: [PATCH] memcg: fix up drain_local_stock comment
>
> Thomas and Vlastimil have noticed that the comment in drain_local_stock
> doesn't quite make sense. It talks about a synchronization with the
> memory hotplug but there is no actual memory hotplug involvement here.
> I meant to talk about cpu hotplug here. Fix that up and hopefully make
> the comment more helpful by referencing the cpu hotplug callback as
> well.
>
> Signed-off-by: Michal Hocko <mhocko@suse.com>

Acked-by: Vlastimil Babka <vbabka@suse.cz>

> ---
> mm/memcontrol.c | 5 +++--
> 1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index eb8e87c4833f..f7be7b01395e 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -2205,8 +2205,9 @@ static void drain_local_stock(struct work_struct *dummy)
> unsigned long flags;
>
> /*
> - * The only protection from memory hotplug vs. drain_stock races is
> - * that we always operate on local CPU stock here with IRQ disabled
> + * The only protection from cpu hotplug (memcg_hotplug_cpu_dead) vs.
> + * drain_stock races is that we always operate on local CPU stock
> + * here with IRQ disabled
> */
> local_irq_save(flags);
>
>