Mailing List Archive

1 2 3 4  View All
Re: AUTOSEL process [ In reply to ]
On Wed, Mar 01, 2023 at 07:06:26AM +0100, Greg KH wrote:
> On Tue, Feb 28, 2023 at 09:05:16PM -0500, Slade Watkins wrote:
> > On 2/28/23 06:28, Greg KH wrote:
> > >> But just so you know, as a maintainer, you have the option to request that
> > >> patches to your subsystem will not be selected by AUTOSEL and run your
> > >> own process to select, test and submit fixes to stable trees.
> > >
> > > Yes, and simply put, that's the answer for any subsystem or maintainer
> > > that does not want their patches picked using the AUTOSEL tool.
> > >
> > > The problem that the AUTOSEL tool is solving is real, we have whole
> > > major subsystems where no patches are ever marked as "for stable" and so
> > > real bugfixes are never backported properly.
> >
> > Yeah, I agree.
> >
> > And I'm throwing this out here [.after having time to think about it due to an
> > internet outage], but, would Cc'ing the patch's relevant subsystems on AUTOSEL
> > emails help? This was sort of mentioned in this email[1] from Eric, and I
> > think it _could_ help? I don't know, just something that crossed my mind earlier.
>
> I don't know, maybe? Note that determining a patch's "subsystem" at
> many times is difficult in an automated fashion, have any idea how to do
> that reliably that doesn't just hit lkml all the time?

As I said, it seems Sasha already does this for AUTOSEL (but not other stable
emails). I assume he uses either get_maintainer.pl, or the lists the original
patch is sent to (retrievable from lore). This is *not* a hard problem.

> But again, how is that going to help much, the people who should be
> saying "no" are the ones on the signed-off-by and cc: lines in the patch
> itself.

So that if one person does not respond, other people can help.

You're basically arguing that mailing lists shouldn't exist at all...

- Eric
Re: AUTOSEL process [ In reply to ]
On Wed, Mar 01, 2023 at 07:09:49AM +0100, Greg KH wrote:
>
> Why would the FAILED emails want to go to a mailing list? If the people
> that were part of making the patch don't want to respond to a FAILED
> email, why would anyone on the mailing list?

The same reason we use mailing lists for kernel development at all instead of
just sending patches directly to the MAINTAINERS.

>
> But hey, I'll be glad to take a change to my script to add that
> functionality if you want to make it, it's here:
> https://git.kernel.org/pub/scm/linux/kernel/git/stable/stable-queue.git/tree/scripts/bad_stable
>

Ah, the classic "patches welcome".

Unfortunately, I don't think that can work very well for scripts that are only
ever used by at most two people -- you and Sasha. Many even just by you,
apparently, as I see your name, email address, home directory, etc. hardcoded in
the scripts. (BTW, where are the scripts Sasha uses for AUTOSEL?)

- Eric
Re: AUTOSEL process [ In reply to ]
On Tue, Feb 28, 2023 at 11:22:53PM -0800, Eric Biggers wrote:
> On Wed, Mar 01, 2023 at 07:09:49AM +0100, Greg KH wrote:
> >
> > Why would the FAILED emails want to go to a mailing list? If the people
> > that were part of making the patch don't want to respond to a FAILED
> > email, why would anyone on the mailing list?
>
> The same reason we use mailing lists for kernel development at all instead of
> just sending patches directly to the MAINTAINERS.

I'm personally seeing a difference between reviewing patches on a mailing
list to help someone polish it, and commenting on its suitability for
stable once it's got merged, from someone not having participated to its
inclusion. All such patches are already sent to stable@ which many of
those of us interested in having a look are already subscribed to, and
which most often just triggers a quick glance depending on areas of
interest. I hardly see anyone just curious about a patch ask "are you
sure ?".

I could possibly understand the value of sending the failed ones to a
list, in case someone with more time available than the author wants to
give it a try. But again they're already sent to stable@. It's just that
it might be possible that more interested parties are on other lists.
But again I'm not fully convinced.

> > But hey, I'll be glad to take a change to my script to add that
> > functionality if you want to make it, it's here:
> > https://git.kernel.org/pub/scm/linux/kernel/git/stable/stable-queue.git/tree/scripts/bad_stable
> >
>
> Ah, the classic "patches welcome".
>
> Unfortunately, I don't think that can work very well for scripts that are only
> ever used by at most two people -- you and Sasha. Many even just by you,
> apparently, as I see your name, email address, home directory, etc. hardcoded in
> the scripts.

But it's going into a dead end. You are the one saying that changes
are easy, suggesting to use get_maintainers.pl, so easy that you can't
try to adapt them in existing stuff. Even without modifying existing
scripts, if you are really interested by such features, why not at least
try to run your idea over a whole series, figure how long it takes, how
accurate it seems to be, adjust the output to remove unwanted noise and
propose for review a few lines that seem to do the job for you ?

> (BTW, where are the scripts Sasha uses for AUTOSEL?)

Maybe they're even uglier and he doesn't want to show them. The ones
I was using to sort out 3.10 patches were totally ugly and full of
heuristics, with lines that I would comment/uncomment along operations
to refine the selection and I definitely wasn't going to post that
anywhere. They were looking a bit like a commented extract of my
.history. I'm sure over time Sasha has a much cleaner and repeatable
process but maybe it involves parts that are not stabilized, that
he's not proud of, or even tools contributed by coworkers that we
don't necessarily need to know about and where he hardly sees how
anyone could bring any help.

Willy
Re: AUTOSEL process [ In reply to ]
On Wed, Mar 01, 2023 at 08:40:03AM +0100, Willy Tarreau wrote:
> But it's going into a dead end. You are the one saying that changes
> are easy, suggesting to use get_maintainers.pl, so easy that you can't
> try to adapt them in existing stuff. Even without modifying existing
> scripts, if you are really interested by such features, why not at least
> try to run your idea over a whole series, figure how long it takes, how
> accurate it seems to be, adjust the output to remove unwanted noise and
> propose for review a few lines that seem to do the job for you ?
>

As I said, Sasha *already does this for AUTOSEL*. So it seems this problem has
already been solved, but Sasha and Greg are not coordinating with each other.

Anyway, it's hard to have much motivation to try to contribute to scripts that
not only can I not use or test myself, but before even getting to that point,
pretty much any ideas for improvements are strongly pushed back on. Earlier I
was actually going to go into more detail about some ideas for how to flag and
review potential problems with backporting a commit, but I figured why bother
given how this thread has been going.

(Also I presume anything that would add *any* human time on the stable
maintainers' end per patch, on average, would be strongly pushed back on too?
But I have no visibility into what would be acceptable. I don't do their job.)

- Eric
Re: AUTOSEL process [ In reply to ]
On Wed, Mar 01, 2023 at 12:31:22AM -0800, Eric Biggers wrote:
> On Wed, Mar 01, 2023 at 08:40:03AM +0100, Willy Tarreau wrote:
> > But it's going into a dead end. You are the one saying that changes
> > are easy, suggesting to use get_maintainers.pl, so easy that you can't
> > try to adapt them in existing stuff. Even without modifying existing
> > scripts, if you are really interested by such features, why not at least
> > try to run your idea over a whole series, figure how long it takes, how
> > accurate it seems to be, adjust the output to remove unwanted noise and
> > propose for review a few lines that seem to do the job for you ?
> >
>
> As I said, Sasha *already does this for AUTOSEL*. So it seems this problem has
> already been solved, but Sasha and Greg are not coordinating with each other.

We do not share the same scripts for these tasks as we have different
roles here. That's all, nothing malicious.

thanks,

greg k-h
Re: AUTOSEL process [ In reply to ]
On 28.02.23 12:28, Greg KH wrote:
> On Tue, Feb 28, 2023 at 12:41:07PM +0200, Amir Goldstein wrote:
>>>> I'm not sure how feedback in the form of "this sucks but I'm sure it
>>>> could be much better" is useful.
>>> I've already given you some specific suggestions.
>>> I can't force you to listen to them, of course.
>>
>> As you probably know, this is not the first time that the subject of the
>> AUTOSEL process has been discussed.
>> Here is one example from fsdevel with a few other suggestions [1].
>>
>> But just so you know, as a maintainer, you have the option to request that
>> patches to your subsystem will not be selected by AUTOSEL and run your
>> own process to select, test and submit fixes to stable trees.
> [...]
> In an ideal world, all maintainers would properly mark their patches for
> stable backporting (as documented for the past 15+ years, with a cc:
> stable tag, NOT a Fixes: tag), but we do not live in that world, and
> hence, the need for the AUTOSEL work.

Well, we could do something to get a bit closer to the ideal world:
teach checkpatch.pl to help developers do the right thing in the first
place. That's what I'm trying to do right now to make them add Link:
tags more often (https://git.kernel.org/torvalds/c/d7f1d71e5ef6 ), as my
regression tracking efforts heavily rely on them. Shouldn't be too hard
to add a simple check along the lines of "this change has a Fixes: tag;
either CC stable or do <foo> to suppress this warning" (<foo> could be a
"nostable" tag or something else that we'd need to agree on first).

In an ideal we'd maybe even have a "checkpatch bot" that looks at all
patches posted and sends feedback to the list if it finds something to
improve. Sure, some (a lot?) of what AUTOSEL does relies on data that is
only available after a change was merged, but maybe some is available
earlier already.

Ciao, Thorsten
Re: AUTOSEL process [ In reply to ]
On Tue, Feb 28, 2023 at 12:28:23PM +0100, Greg KH wrote:

> In an ideal world, all maintainers would properly mark their patches for
> stable backporting (as documented for the past 15+ years, with a cc:
> stable tag, NOT a Fixes: tag), but we do not live in that world, and
> hence, the need for the AUTOSEL work.

Just as a datapoint here: I stopped making any effort to copy stable on
things due to AUTOSEL, it's pulling back so much more than I would
consider for stable that it no longer seems worthwhile to try to make
decisions about what might fit.
Re: AUTOSEL process [ In reply to ]
Hi!

> > So to summarize, that buggy commit was backported even though:
> >
> > * There were no indications that it was a bug fix (and thus potentially
> > suitable for stable) in the first place.
> > * On the AUTOSEL thread, someone told you the commit is broken.
> > * There was already a thread that reported a regression caused by the commit.
> > Easily findable via lore search.
> > * There was also already a pending patch that Fixes the commit. Again easily
> > findable via lore search.
> >
> > So it seems a *lot* of things went wrong, no? Why? If so many things can go
> > wrong, it's not just a "mistake" but rather the process is the problem...
>
> BTW, another cause of this is that the commit (66f99628eb24) was AUTOSEL'd after
> only being in mainline for 4 days, and *released* in all LTS kernels after only
> being in mainline for 12 days. Surely that's a timeline befitting a critical
> security vulnerability, not some random neural-network-selected commit that
> wasn't even fixing anything?

I see this problem, too, "-stable" is more experimental than Linus's
releases.

I believe that -stable would be more useful without AUTOSEL process.

Best regards,
Pavel
--
People of Russia, stop Putin before his war on Ukraine escalates.
Re: AUTOSEL process [ In reply to ]
On Tue, Mar 07, 2023 at 10:18:35PM +0100, Pavel Machek wrote:
> Hi!
>
> > > So to summarize, that buggy commit was backported even though:
> > >
> > > * There were no indications that it was a bug fix (and thus potentially
> > > suitable for stable) in the first place.
> > > * On the AUTOSEL thread, someone told you the commit is broken.
> > > * There was already a thread that reported a regression caused by the commit.
> > > Easily findable via lore search.
> > > * There was also already a pending patch that Fixes the commit. Again easily
> > > findable via lore search.
> > >
> > > So it seems a *lot* of things went wrong, no? Why? If so many things can go
> > > wrong, it's not just a "mistake" but rather the process is the problem...
> >
> > BTW, another cause of this is that the commit (66f99628eb24) was AUTOSEL'd after
> > only being in mainline for 4 days, and *released* in all LTS kernels after only
> > being in mainline for 12 days. Surely that's a timeline befitting a critical
> > security vulnerability, not some random neural-network-selected commit that
> > wasn't even fixing anything?
>
> I see this problem, too, "-stable" is more experimental than Linus's
> releases.
>
> I believe that -stable would be more useful without AUTOSEL process.
>

There has to be a way to ensure that security fixes that weren't properly tagged
make it to stable anyway. So, AUTOSEL is necessary, at least in some form. I
think that debating *whether it should exist* is a distraction from what's
actually important, which is that the current AUTOSEL process has some specific
problems, and these specific problems need to be fixed...

- Eric
Re: AUTOSEL process [ In reply to ]
On Mon, Feb 27, 2023 at 07:41:31PM -0800, Eric Biggers wrote:
>
> Well, probably more common is that prerequisites are in the same patchset, and
> the prerequisites are tagged for stable too. Whereas AUTOSEL often just picks
> patch X of N. Also, developers and maintainers who tag patches for stable are
> probably more likely to help with the stable process in general and make sure
> patches are backported correctly...
>
> Anyway, the point is, AUTOSEL needs to be fixed to stop inappropriately
> cherry-picking patch X of N so often.
>

... and AUTOSEL strikes again, with the 6.1 and 6.2 kernels currently crashing
whenever a block device is removed, due to patches 1 and 3 of a 3-patch series
being AUTOSEL'ed (on the same day I started this discussion, no less):

https://lore.kernel.org/linux-block/CAOCAAm4reGhz400DSVrh0BetYD3Ljr2CZen7_3D4gXYYdB4SKQ@mail.gmail.com/T/#u

Oh sorry, ignore this, it's just an anecdotal example.

- Eric
Re: AUTOSEL process [ In reply to ]
On Tue, Mar 07, 2023 at 09:45:24PM +0000, Eric Biggers wrote:
> On Tue, Mar 07, 2023 at 10:18:35PM +0100, Pavel Machek wrote:
> > I believe that -stable would be more useful without AUTOSEL process.
>
> There has to be a way to ensure that security fixes that weren't properly tagged
> make it to stable anyway. So, AUTOSEL is necessary, at least in some form. I
> think that debating *whether it should exist* is a distraction from what's
> actually important, which is that the current AUTOSEL process has some specific
> problems, and these specific problems need to be fixed...

I agree with you, that we need autosel and we also need autosel to
be better. I actually see Pavel's mail as a datapoint (or "anecdote",
if you will) in support of that; the autosel process currently works
so badly that a long-time contributor thinks it's worse than nothing.

Sasha, what do you need to help you make this better?
Re: AUTOSEL process [ In reply to ]
On Sat, Mar 11, 2023 at 06:25:59AM +0000, Matthew Wilcox wrote:
> On Tue, Mar 07, 2023 at 09:45:24PM +0000, Eric Biggers wrote:
> > On Tue, Mar 07, 2023 at 10:18:35PM +0100, Pavel Machek wrote:
> > > I believe that -stable would be more useful without AUTOSEL process.
> >
> > There has to be a way to ensure that security fixes that weren't properly tagged
> > make it to stable anyway. So, AUTOSEL is necessary, at least in some form. I
> > think that debating *whether it should exist* is a distraction from what's
> > actually important, which is that the current AUTOSEL process has some specific
> > problems, and these specific problems need to be fixed...
>
> I agree with you, that we need autosel and we also need autosel to
> be better. I actually see Pavel's mail as a datapoint (or "anecdote",
> if you will) in support of that; the autosel process currently works
> so badly that a long-time contributor thinks it's worse than nothing.
>
> Sasha, what do you need to help you make this better?

One would probably need to define "better" and "so badly". As a user
of -stable kernels, I consider that they've got much better over the
last years. A lot of processes have improved everywhere even before
the release, but I do think that autosel is part of what generally
gives a chance to some useful and desired fixed (e.g. in drivers) to
be backported and save some users unneeded headaches.

In fact I think that the reason for the negative perception is that
patches that it picks are visible, and it's easy to think "WTF" when
seeing one of them. Previously, these patches were not proposed, so
nobody knew they were missing. It happened to plenty of us to spend
some time trying to spot why a stable kernel would occasionally fail
on a machine, and discovering in the process that mainline did work
because it contained a fix that was never backported. This is
frustrating but there's noone to blame for failing to pick that patch
(and the patch's author should not be blamed either since for small
compatibility stuff it's probably common to see first-timers who are
not yet at ease with the process).

Here the patches are CCed to their authors before being merged. They
get a chance to be reviewed and rejected. Granted, maybe sometimes they
could be subject to a longer delay or be sent to certain lists. Maybe.
But I do think that the complaints in fact reflect a process that's not
as broken as some think, precisely because it allows people to complain
when something is going wrong. The previous process didn't permit that.
For this alone it's a progress.

Willy
Re: AUTOSEL process [ In reply to ]
Hi!

> > > > I believe that -stable would be more useful without AUTOSEL process.
> > >
> > > There has to be a way to ensure that security fixes that weren't properly tagged
> > > make it to stable anyway. So, AUTOSEL is necessary, at least in some form. I
> > > think that debating *whether it should exist* is a distraction from what's
> > > actually important, which is that the current AUTOSEL process has some specific
> > > problems, and these specific problems need to be fixed...
> >
> > I agree with you, that we need autosel and we also need autosel to
> > be better. I actually see Pavel's mail as a datapoint (or "anecdote",
> > if you will) in support of that; the autosel process currently works
> > so badly that a long-time contributor thinks it's worse than nothing.
> >
> > Sasha, what do you need to help you make this better?
>
> One would probably need to define "better" and "so badly". As a user
> of -stable kernels, I consider that they've got much better over the

Well, we have Documentation/process/stable-kernel-rules.rst . If we
wanted to define "better", we should start documenting what the real
rules are for the patches in the stable tree.

I agree that -stable works quite well, but the real rules are far away
from what is documented.

I don't think AUTOSEL works well. I believe we should require positive
reply from patch author on relevant maintainer before merging such
patch to -stable.

Best regards,
Pavel
--
People of Russia, stop Putin before his war on Ukraine escalates.
Re: AUTOSEL process [ In reply to ]
On Sat, Mar 11, 2023 at 12:45:20PM +0100, Pavel Machek wrote:
> Hi!
>
> > > > > I believe that -stable would be more useful without AUTOSEL process.
> > > >
> > > > There has to be a way to ensure that security fixes that weren't properly tagged
> > > > make it to stable anyway. So, AUTOSEL is necessary, at least in some form. I
> > > > think that debating *whether it should exist* is a distraction from what's
> > > > actually important, which is that the current AUTOSEL process has some specific
> > > > problems, and these specific problems need to be fixed...
> > >
> > > I agree with you, that we need autosel and we also need autosel to
> > > be better. I actually see Pavel's mail as a datapoint (or "anecdote",
> > > if you will) in support of that; the autosel process currently works
> > > so badly that a long-time contributor thinks it's worse than nothing.
> > >
> > > Sasha, what do you need to help you make this better?
> >
> > One would probably need to define "better" and "so badly". As a user
> > of -stable kernels, I consider that they've got much better over the
>
> Well, we have Documentation/process/stable-kernel-rules.rst . If we
> wanted to define "better", we should start documenting what the real
> rules are for the patches in the stable tree.
>
> I agree that -stable works quite well, but the real rules are far away
> from what is documented.
>
> I don't think AUTOSEL works well. I believe we should require positive
> reply from patch author on relevant maintainer before merging such
> patch to -stable.

Again, for the people in the back so that everyone can hear it, that
does not work as some subsystems refuse to tag ANY patches for stable
commits, nor do they want to have anything to do with stable kernels at
all. And that's fine, that's their option, but because of that, we have
to have a way to actually get the real fixes in those subsystems to the
users who use these stable kernels. Hence, the AUTOSEL work.

So no, forcing a maintainer/author to ack a patch to get it into stable
is not going to work UNLESS a maintainer/author explicitly asks for
that, which some have, and that's wonderful.

thanks,

greg k-h
Re: AUTOSEL process [ In reply to ]
On Fri, Mar 10, 2023 at 03:07:04PM -0800, Eric Biggers wrote:
>On Mon, Feb 27, 2023 at 07:41:31PM -0800, Eric Biggers wrote:
>>
>> Well, probably more common is that prerequisites are in the same patchset, and
>> the prerequisites are tagged for stable too. Whereas AUTOSEL often just picks
>> patch X of N. Also, developers and maintainers who tag patches for stable are
>> probably more likely to help with the stable process in general and make sure
>> patches are backported correctly...
>>
>> Anyway, the point is, AUTOSEL needs to be fixed to stop inappropriately
>> cherry-picking patch X of N so often.
>>
>
>... and AUTOSEL strikes again, with the 6.1 and 6.2 kernels currently crashing
>whenever a block device is removed, due to patches 1 and 3 of a 3-patch series
>being AUTOSEL'ed (on the same day I started this discussion, no less):
>
>https://lore.kernel.org/linux-block/CAOCAAm4reGhz400DSVrh0BetYD3Ljr2CZen7_3D4gXYYdB4SKQ@mail.gmail.com/T/#u
>
>Oh sorry, ignore this, it's just an anecdotal example.

Yes, clearly a problem with AUTOSEL and not with how sad the testing
story is for stable releases.

--
Thanks,
Sasha
Re: AUTOSEL process [ In reply to ]
On Sat, Mar 11, 2023 at 06:25:59AM +0000, Matthew Wilcox wrote:
>On Tue, Mar 07, 2023 at 09:45:24PM +0000, Eric Biggers wrote:
>> On Tue, Mar 07, 2023 at 10:18:35PM +0100, Pavel Machek wrote:
>> > I believe that -stable would be more useful without AUTOSEL process.
>>
>> There has to be a way to ensure that security fixes that weren't properly tagged
>> make it to stable anyway. So, AUTOSEL is necessary, at least in some form. I
>> think that debating *whether it should exist* is a distraction from what's
>> actually important, which is that the current AUTOSEL process has some specific
>> problems, and these specific problems need to be fixed...
>
>I agree with you, that we need autosel and we also need autosel to
>be better. I actually see Pavel's mail as a datapoint (or "anecdote",
>if you will) in support of that; the autosel process currently works
>so badly that a long-time contributor thinks it's worse than nothing.
>
>Sasha, what do you need to help you make this better?

What could I do to avoid this?

I suppose that if I had a way to know if a certain a commit is part of a
series, I could either take all of it or none of it, but I don't think I
have a way of doing that by looking at a commit in Linus' tree
(suggestions welcome, I'm happy to implement them).

Other than that, the commit at hand:

1. Describes a real problem that needs to be fixed, so while it was
reverted for a quick fix, we'll need to go back and bring it in along
with it's dependency.

2. Soaked for over two weeks between the AUTOSEL mails and the release,
gone through multiple rounds of reviews.

3. Went through all the tests provided by all the individuals, bots,
companies, etc who test the tree through multiple rounds of testing (we
had to do a -rc2 for that releases).

4. Went through whatever tests distros run on the kernel before they
package and release it.

--
Thanks,
Sasha
Re: AUTOSEL process [ In reply to ]
On Sat, 2023-03-11 at 08:41 -0500, Sasha Levin wrote:
> On Fri, Mar 10, 2023 at 03:07:04PM -0800, Eric Biggers wrote:
> > On Mon, Feb 27, 2023 at 07:41:31PM -0800, Eric Biggers wrote:
> > >
> > > Well, probably more common is that prerequisites are in the same
> > > patchset, and the prerequisites are tagged for stable too. 
> > > Whereas AUTOSEL often just picks patch X of N.  Also, developers
> > > and maintainers who tag patches for stable are probably more
> > > likely to help with the stable process in general and make sure
> > > patches are backported correctly...
> > >
> > > Anyway, the point is, AUTOSEL needs to be fixed to stop
> > > inappropriately cherry-picking patch X of N so often.
> > >
> >
> > ... and AUTOSEL strikes again, with the 6.1 and 6.2 kernels
> > currently crashing whenever a block device is removed, due to
> > patches 1 and 3 of a 3-patch series being AUTOSEL'ed (on the same
> > day I started this discussion, no less):
> >
> > https://lore.kernel.org/linux-block/CAOCAAm4reGhz400DSVrh0BetYD3Ljr2CZen7_3D4gXYYdB4SKQ@mail.gmail.com/T/#u
> >
> > Oh sorry, ignore this, it's just an anecdotal example.
>
> Yes, clearly a problem with AUTOSEL and not with how sad the testing
> story is for stable releases.

Hey, that's a completely circular argument: if we had perfect testing
then, of course, it would pick up every bad patch before anything got
released; but we don't, and everyone knows it. Therefore, we have to
be discriminating about what patches we put in. And we have to
acknowledge that zero bugs in patches isn't feasible in spite of all
the checking we do do. I also think we have to acknowledge that users
play a role in the testing process because some bugs simply aren't
picked up until they try out a release. So discouraging users from
running mainline -rc's means we do get bugs in the released kernel that
we might not have had if they did. Likewise if everyone only runs
stable kernels, the bugs in the released kernel don't get found until
stable. So this blame game really isn't helping.

I think the one thing everyone on this thread might agree on is that
this bug wouldn't have happened if AUTOSEL could detect and backport
series instead of individual patches. Sasha says that can't be done
based on in information in Linus' tree[1] which is true but not a
correct statement of the problem. The correct question is given all
the information available, including lore, could we assist AUTOSEL in
better detecting series and possibly making better decisions generally?

I think that's the challenge for anyone who actually wants to help
rather than complain. At least the series detection bit sounds like it
could be a reasonable summer of code project.

Regards,

James

[1] https://lore.kernel.org/linux-fsdevel/ZAyK0KM6JmVOvQWy@sashalap/
Re: AUTOSEL process [ In reply to ]
On Sat, Mar 11, 2023 at 09:06:08AM -0500, Sasha Levin wrote:
>
> I suppose that if I had a way to know if a certain a commit is part of a
> series, I could either take all of it or none of it, but I don't think I
> have a way of doing that by looking at a commit in Linus' tree
> (suggestions welcome, I'm happy to implement them).

Well, this is why I think it is a good idea to have a link to the
patch series in lore. I know Linus doesn't like it, claiming it
doesn't add any value, but I have to disagree. It adds two bits of
value.

First, if there is any discussion on the review of the patch before it
goes in, the lore link gives you access to that --- and if people have
a back-lick in the cover letter of version N to the cover letter of
version N-1, it allows someone doing code archeology to find all of
the discussions around the patch series in the lore archives.

Secondly, the lore link will allow you to figure out whether or not
the patch is part of a series; b4 can figure this out by looking at
the in-reply-to headers, and lore will chain the patch series
together, so if the commit contains a lore link to the patch, the
AUTOSEL script could use that to find out whether the patch is part of
the series.

And this is really easy to do. All you need is the following in
.git/hooks/applypatch-msg:

#!/bin/sh
# For .git/hooks/applypatch-msg
#
. git-sh-setup
perl -pi -e 's|^Message-Id:\s*<?([^>]+)>?$|Link: https://lore.kernel.org/r/$1|g;' "$1"
test -x "$GIT_DIR/hooks/commit-msg" &&
exec "$GIT_DIR/hooks/commit-msg" ${1+"$@"}
:

Cheers,

- Ted

P.S. There was a recent patch series where I noticed that I would be
screwed if AUTOSEL would only take patch 2/2 and not patch 1/2. I
dealt with that though by adding an explicit "Cc: stable@kernel.org".
So that's the other way to avoid breakage; if people were universally
careful about adding "Cc: stable@kernel.org" tags, then we wouldn't
need AUTOSEL at all.

And this is another place where I break with commonly received wisdom
about "Thou Shalt Never, Never Rewind The Git Branch". Personally, if
I find that I missed a Cc: stable tag, rewinding the branch to add
edit the trailers is *far* better a tradeoff than adhering to some
religious rule about never rewinding git branches. Of course, I can
get away with that since I don't have people basing their branches on
my branch. But I've seen people who will self-righteously proclaim
non-rewinding git branches as the One True Way to do git, and I
profoundly disagree with that point of view.
Re: AUTOSEL process [ In reply to ]
On Sat, Mar 11, 2023 at 11:16:44AM -0500, Theodore Ts'o wrote:
> On Sat, Mar 11, 2023 at 09:06:08AM -0500, Sasha Levin wrote:
> >
> > I suppose that if I had a way to know if a certain a commit is part of a
> > series, I could either take all of it or none of it, but I don't think I
> > have a way of doing that by looking at a commit in Linus' tree
> > (suggestions welcome, I'm happy to implement them).
>
> Well, this is why I think it is a good idea to have a link to the
> patch series in lore. I know Linus doesn't like it, claiming it
> doesn't add any value, but I have to disagree. It adds two bits of
> value.
>

So, earlier I was going to go into more detail about some of my ideas, before
Sasha and Greg started stonewalling with "patches welcome" (i.e. "I'm refusing
to do my job") and various silly arguments about why nothing should be changed.
But I suppose the worst thing that can happen is that that just continues, so
here it goes:

One of the first things I would do if I was maintaining the stable kernels is to
set up a way to automatically run searches on the mailing lists, and then take
advantage of that in the stable process in various ways. Not having that is the
root cause of a lot of the issues with the current process, IMO.

Now that lore exists, this might be trivial: it could be done just by hammering
lore.kernel.org with queries https://lore.kernel.org/linux-fsdevel/?q=query from
a Python script.

Of course, there's a chance that won't scale to multiple queries for each one of
thousands of stable commits, or at least won't be friendly to the kernel.org
admins. In that case, what can be done is to download down all emails from all
lists, using lore's git mirrors or Atom feeds, and index them locally. (Note:
if the complete history is inconveniently large, then just indexing the last
year or so would work nearly as well.)

Then once that is in place, that could be used in various ways. For example,
given a git commit, it's possible to search by email subject to get to the
original patch, *even if the git commit does not have a Link tag*. And it can
be automatically checked whether it's part of a patch series, and if so, whether
all the patches in the series are being backported or just some.

This could also be used to check for mentions of a commit on the mailing list
that potentially indicate a regression report, which is one of the issues we
discussed earlier. I'm not sure what the optimal search criteria would be, but
one option would be something like "messages that contain the commit title or
commit ID and are dated to after the commit being committed". There might need
to be some exclusions added to that.

This could also be used to automatically find the AUTOSEL email, if one exists,
and check whether it's been replied to or not.

The purpose of all these mailing list searches would be to generate a list of
potential issues with backporting each commit, which would then undergo brief
human review. Once issues are reviewed, that state would be persisted, so that
if the script gets run again, it would only show *new* information based on new
mailing list emails that have not already been reviewed. That's needed because
these issues need to be checked for when the patch is initially proposed for
stable as well as slightly later, before the actual release happens.

If the stable maintainers have no time for doing *any* human review themselves
(again, I do not know what their requirements are on how much time they can
spend per patch), then instead an email with the list of potential issues could
be generated and sent to stable@vger.kernel.org for review by others.

Anyway, that's my idea. I know the response will be either "that won't work" or
"patches welcome", or a mix of both, but that's it.

- Eric
Re: AUTOSEL process [ In reply to ]
On Sat, Mar 11, 2023 at 10:54:36AM -0500, James Bottomley wrote:
>On Sat, 2023-03-11 at 08:41 -0500, Sasha Levin wrote:
>> On Fri, Mar 10, 2023 at 03:07:04PM -0800, Eric Biggers wrote:
>> > On Mon, Feb 27, 2023 at 07:41:31PM -0800, Eric Biggers wrote:
>> > >
>> > > Well, probably more common is that prerequisites are in the same
>> > > patchset, and the prerequisites are tagged for stable too.?
>> > > Whereas AUTOSEL often just picks patch X of N.? Also, developers
>> > > and maintainers who tag patches for stable are probably more
>> > > likely to help with the stable process in general and make sure
>> > > patches are backported correctly...
>> > >
>> > > Anyway, the point is, AUTOSEL needs to be fixed to stop
>> > > inappropriately cherry-picking patch X of N so often.
>> > >
>> >
>> > ... and AUTOSEL strikes again, with the 6.1 and 6.2 kernels
>> > currently crashing whenever a block device is removed, due to
>> > patches 1 and 3 of a 3-patch series being AUTOSEL'ed (on the same
>> > day I started this discussion, no less):
>> >
>> > https://lore.kernel.org/linux-block/CAOCAAm4reGhz400DSVrh0BetYD3Ljr2CZen7_3D4gXYYdB4SKQ@mail.gmail.com/T/#u
>> >
>> > Oh sorry, ignore this, it's just an anecdotal example.
>>
>> Yes, clearly a problem with AUTOSEL and not with how sad the testing
>> story is for stable releases.
>
>Hey, that's a completely circular argument: if we had perfect testing
>then, of course, it would pick up every bad patch before anything got
>released; but we don't, and everyone knows it. Therefore, we have to
>be discriminating about what patches we put in. And we have to
>acknowledge that zero bugs in patches isn't feasible in spite of all
>the checking we do do. I also think we have to acknowledge that users
>play a role in the testing process because some bugs simply aren't
>picked up until they try out a release. So discouraging users from
>running mainline -rc's means we do get bugs in the released kernel that
>we might not have had if they did. Likewise if everyone only runs
>stable kernels, the bugs in the released kernel don't get found until
>stable. So this blame game really isn't helping.
>
>I think the one thing everyone on this thread might agree on is that
>this bug wouldn't have happened if AUTOSEL could detect and backport
>series instead of individual patches. Sasha says that can't be done
>based on in information in Linus' tree[1] which is true but not a
>correct statement of the problem. The correct question is given all
>the information available, including lore, could we assist AUTOSEL in
>better detecting series and possibly making better decisions generally?

My argument was that this type of issue is no AUTOSEL specific, and we
saw it happening multiple times with stable tagged patches as well.

It's something that needs to get solved, and I suspect that both Greg
and myself will end up using it when it's there.

>I think that's the challenge for anyone who actually wants to help
>rather than complain. At least the series detection bit sounds like it
>could be a reasonable summer of code project.

Right - I was trying to reply directly to Willy's question: this is
something very useful, somewhat hard, and I don't think I could do in
the near future - so help is welcome here.

--
Thanks,
Sasha
Re: AUTOSEL process [ In reply to ]
On Sat, Mar 11, 2023 at 09:48:13AM -0800, Eric Biggers wrote:
> The purpose of all these mailing list searches would be to generate a list of
> potential issues with backporting each commit, which would then undergo brief
> human review.

This is one big part that I suspect is underestimated. I'll speak from my
past experience maintaining extended LTS for 3.10. I couldn't produce as
many releases as I would have liked to because despite the scripts that
helped me figure some series, some dependencies, origin branches etc, the
whole process of reviewing ~600 patches to end up with ~200 at the end
(and adapting some of them to fit) required ~16 hours a day for a full
week-end, and I didn't always have that amount of time available. Any my
choices were far from being perfect, as during the reviews I got a number
of "please don't backport this there" and "if you take this one you also
need these ones". Also I used to intentionally drop what had nothing to
do on old LTS stuff so even from that perspective my work could have been
perceived as insufficient.

The reviewing process is overwhelming, really. There is a point where you
start to fail and make choices that are not better than a machine's. But
is a mistake once in a while dramatic if on the other hand it fixes 200
other issues ? I think not as long as it's transparent and accepted by
the users, because for one user that could experience a regression (one
that escaped all the testing in place), thousands get fixes for existing
problems. I'm not saying that regressions are good, I hate them, but as
James said, we have to accept that user are part of the quality process.

My approach on another project I maintain is to announce upfront my own
level of trust in my backport work, saying "I had a difficult week fixing
that problem, do not rush on it or be extra careful", or "nothing urgent,
no need to upgrade if you have no problem" or also "just upgrade, it's
almost riskless". Users love that, because they know they're part of the
quality assurance process, and they will either take small risks when
they can, or wait for others to take risks.

But thinking that having one person review patches affecting many
subsystem after pre-selection and extra info regarding discussions on
each individual patch could result in more reliable stable releases is
just an illusion IMHO, because the root of problem is that there are not
enough humans to fix all the problems that humans introduce in the first
place, and despite this we need to fix them. Just like automated scripts
scraping lore, AUTOSEL does bring some value if it offloads some work
from the available humans, even in its current state. And I hope that
more of the selection and review work in the future will be automated
and even less dependent on humans, because it does have a chance to be
more reliable in front of that vast amount of work.

And in any case I've seen you use the word "trivial" several times in
this thread, and for having been through a little bit of this process
in the past, I wouldn't use that word anywhere in a description of what
my experience had been. You really seem to underestimate the difficulty
here.

Regards,
Willy
Re: AUTOSEL process [ In reply to ]
On Sat, Mar 11, 2023 at 09:48:13AM -0800, Eric Biggers wrote:
>On Sat, Mar 11, 2023 at 11:16:44AM -0500, Theodore Ts'o wrote:
>> On Sat, Mar 11, 2023 at 09:06:08AM -0500, Sasha Levin wrote:
>> >
>> > I suppose that if I had a way to know if a certain a commit is part of a
>> > series, I could either take all of it or none of it, but I don't think I
>> > have a way of doing that by looking at a commit in Linus' tree
>> > (suggestions welcome, I'm happy to implement them).
>>
>> Well, this is why I think it is a good idea to have a link to the
>> patch series in lore. I know Linus doesn't like it, claiming it
>> doesn't add any value, but I have to disagree. It adds two bits of
>> value.
>>
>
>So, earlier I was going to go into more detail about some of my ideas, before
>Sasha and Greg started stonewalling with "patches welcome" (i.e. "I'm refusing
>to do my job") and various silly arguments about why nothing should be changed.
>But I suppose the worst thing that can happen is that that just continues, so
>here it goes:

"job"? do you think I'm paid to do this work? Why would I stonewall
improvements to the process?

I'm getting a bunch of suggestions and complaints that I'm not implementing
those suggestions fast enough on my spare time.

>One of the first things I would do if I was maintaining the stable kernels is to
>set up a way to automatically run searches on the mailing lists, and then take
>advantage of that in the stable process in various ways. Not having that is the
>root cause of a lot of the issues with the current process, IMO.

"if I was maintaining the stable kernels" - why is this rellevant? give
us the tool you've proposed below and we'll be happy to use it. Heck,
don't give it to us, use it to review the patches we're sending out for
review and let us know if we've missed anything.

>Now that lore exists, this might be trivial: it could be done just by hammering
>lore.kernel.org with queries https://lore.kernel.org/linux-fsdevel/?q=query from
>a Python script.
>
>Of course, there's a chance that won't scale to multiple queries for each one of
>thousands of stable commits, or at least won't be friendly to the kernel.org
>admins. In that case, what can be done is to download down all emails from all
>lists, using lore's git mirrors or Atom feeds, and index them locally. (Note:
>if the complete history is inconveniently large, then just indexing the last
>year or so would work nearly as well.)
>
>Then once that is in place, that could be used in various ways. For example,
>given a git commit, it's possible to search by email subject to get to the
>original patch, *even if the git commit does not have a Link tag*. And it can
>be automatically checked whether it's part of a patch series, and if so, whether
>all the patches in the series are being backported or just some.
>
>This could also be used to check for mentions of a commit on the mailing list
>that potentially indicate a regression report, which is one of the issues we
>discussed earlier. I'm not sure what the optimal search criteria would be, but
>one option would be something like "messages that contain the commit title or
>commit ID and are dated to after the commit being committed". There might need
>to be some exclusions added to that.
>
>This could also be used to automatically find the AUTOSEL email, if one exists,
>and check whether it's been replied to or not.
>
>The purpose of all these mailing list searches would be to generate a list of
>potential issues with backporting each commit, which would then undergo brief
>human review. Once issues are reviewed, that state would be persisted, so that
>if the script gets run again, it would only show *new* information based on new
>mailing list emails that have not already been reviewed. That's needed because
>these issues need to be checked for when the patch is initially proposed for
>stable as well as slightly later, before the actual release happens.
>
>If the stable maintainers have no time for doing *any* human review themselves
>(again, I do not know what their requirements are on how much time they can
>spend per patch), then instead an email with the list of potential issues could
>be generated and sent to stable@vger.kernel.org for review by others.
>
>Anyway, that's my idea. I know the response will be either "that won't work" or
>"patches welcome", or a mix of both, but that's it.

I've been playing with this in the past - I had a bot that looks at the
mailing lists for patches that are tagged for stable, and attempts to
apply/build then on the multiple trees to verify that it works and send
a reply back if something goes wrong, asking for a backport.

It gets a bit tricky as there's no way to go back from a commit to the
initial submission, you start hitting issues like:

- Patches get re-sent multiple times (think stuff like tip trees,
reviews from other maintainers, etc).
- Different versions of patches - for example, v1 was a single patch
and in v2 it became multiple patches.

I'm not arguing against your idea, I'm just saying that it's not
trivial. An incomplete work here simply won't scale to the thousands of
patches that flow in the trees, and won't be as useful. I don't think
that this is trivial as you suggest.

If you disagree, and really think it's trivial, take 5 minutes to write
something up? please?

--
Thanks,
Sasha
Re: AUTOSEL process [ In reply to ]
On Sat, Mar 11, 2023 at 01:26:57PM -0500, Sasha Levin wrote:
>
> "job"? do you think I'm paid to do this work?

> Why would I stonewall improvements to the process?
>
> I'm getting a bunch of suggestions and complaints that I'm not implementing
> those suggestions fast enough on my spare time.
>
> > One of the first things I would do if I was maintaining the stable kernels is to
> > set up a way to automatically run searches on the mailing lists, and then take
> > advantage of that in the stable process in various ways. Not having that is the
> > root cause of a lot of the issues with the current process, IMO.
>
> "if I was maintaining the stable kernels" - why is this rellevant? give
> us the tool you've proposed below and we'll be happy to use it. Heck,
> don't give it to us, use it to review the patches we're sending out for
> review and let us know if we've missed anything.

It's kind of a stretch to claim that maintaining the stable kernels is not part
of your and Greg's jobs. But anyway, the real problem is that it's currently
very hard for others to contribute, given the unique role the stable maintainers
have and the lack of documentation about it. Each of the two maintainers has
their own scripts, and it is not clear how they use them and what processes they
follow. (Even just stable-kernel-rules.rst is totally incorrect these days.)
Actually I still don't even know where your scripts are! They are not in
stable-queue/scripts, it seems those are only Greg's scripts? And if I built
something, how do I know you would even use it? You likely have all sorts of
requirements that I don't even know about.

>
> I've been playing with this in the past - I had a bot that looks at the
> mailing lists for patches that are tagged for stable, and attempts to
> apply/build then on the multiple trees to verify that it works and send
> a reply back if something goes wrong, asking for a backport.
>
> It gets a bit tricky as there's no way to go back from a commit to the
> initial submission, you start hitting issues like:
>
> - Patches get re-sent multiple times (think stuff like tip trees,
> reviews from other maintainers, etc).
> - Different versions of patches - for example, v1 was a single patch
> and in v2 it became multiple patches.
>
> I'm not arguing against your idea, I'm just saying that it's not
> trivial. An incomplete work here simply won't scale to the thousands of
> patches that flow in the trees, and won't be as useful. I don't think
> that this is trivial as you suggest.

There are obviously going to be edge cases; another one is commits that show up
in git without ever having been sent to the mailing list. I don't think they
actually matter very much, though. Worst case, we miss some things, but still
find everything else.

>
> If you disagree, and really think it's trivial, take 5 minutes to write
> something up? please?

I never said that it's "trivial" or that it would take only 5 minutes; that's
just silly. Just that this is possible and it's what needs to be done.

If you don't have time, you should instead be helping ensure that the work gets
done by someone else (internship, GSoC project, etc.).

And yes, I am interested in contributing, but as I mentioned I think you need to
first acknowledge that there is a problem, fix your attitude of immediately
pushing back on everything, and make it easier for people to contribute.

- Eric
Re: AUTOSEL process [ In reply to ]
On Sat, Mar 11, 2023 at 10:55:01AM -0800, Eric Biggers wrote:
> >
> > If you disagree, and really think it's trivial, take 5 minutes to write
> > something up? please?
>
> I never said that it's "trivial" or that it would take only 5 minutes; that's
> just silly. Just that this is possible and it's what needs to be done.

(I did say that it would be trivial to query lore.kernel.org from a Python
script, which is absolutely correct. I did *not* say that the whole thing,
including the local index if querying lore.kernel.org directly is not scalable
enough, would be trivial. Just that it's *possible*, and there do not seem to
be any technical blockers...)

- Eric
Re: AUTOSEL process [ In reply to ]
On Sat, Mar 11, 2023 at 07:33:58PM +0100, Willy Tarreau wrote:
> On Sat, Mar 11, 2023 at 09:48:13AM -0800, Eric Biggers wrote:
> > The purpose of all these mailing list searches would be to generate a list of
> > potential issues with backporting each commit, which would then undergo brief
> > human review.
>
> This is one big part that I suspect is underestimated. I'll speak from my
> past experience maintaining extended LTS for 3.10. I couldn't produce as
> many releases as I would have liked to because despite the scripts that
> helped me figure some series, some dependencies, origin branches etc, the
> whole process of reviewing ~600 patches to end up with ~200 at the end
> (and adapting some of them to fit) required ~16 hours a day for a full
> week-end, and I didn't always have that amount of time available. Any my
> choices were far from being perfect, as during the reviews I got a number
> of "please don't backport this there" and "if you take this one you also
> need these ones". Also I used to intentionally drop what had nothing to
> do on old LTS stuff so even from that perspective my work could have been
> perceived as insufficient.
>
> The reviewing process is overwhelming, really. There is a point where you
> start to fail and make choices that are not better than a machine's. But
> is a mistake once in a while dramatic if on the other hand it fixes 200
> other issues ? I think not as long as it's transparent and accepted by
> the users, because for one user that could experience a regression (one
> that escaped all the testing in place), thousands get fixes for existing
> problems. I'm not saying that regressions are good, I hate them, but as
> James said, we have to accept that user are part of the quality process.
>
> My approach on another project I maintain is to announce upfront my own
> level of trust in my backport work, saying "I had a difficult week fixing
> that problem, do not rush on it or be extra careful", or "nothing urgent,
> no need to upgrade if you have no problem" or also "just upgrade, it's
> almost riskless". Users love that, because they know they're part of the
> quality assurance process, and they will either take small risks when
> they can, or wait for others to take risks.
>
> But thinking that having one person review patches affecting many
> subsystem after pre-selection and extra info regarding discussions on
> each individual patch could result in more reliable stable releases is
> just an illusion IMHO, because the root of problem is that there are not
> enough humans to fix all the problems that humans introduce in the first
> place, and despite this we need to fix them. Just like automated scripts
> scraping lore, AUTOSEL does bring some value if it offloads some work
> from the available humans, even in its current state. And I hope that
> more of the selection and review work in the future will be automated
> and even less dependent on humans, because it does have a chance to be
> more reliable in front of that vast amount of work.

As I said in a part of my email which you did not quote, the fallback option is
to send the list of issues to the mailing list for others to review.

If even that fails, then it could be cut down to the *just the most useful*
heuristics and decisions made automatically based on those... "Don't AUTOSEL
patch N of a series without 1...N-1" might be a good one.

But again, this comes back to one of the core issues here which is how does one
even build something for the stable maintainers if their requirements are
unknown to others?

> And in any case I've seen you use the word "trivial" several times in
> this thread, and for having been through a little bit of this process
> in the past, I wouldn't use that word anywhere in a description of what
> my experience had been. You really seem to underestimate the difficulty
> here.

I checked the entire email thread
(https://lore.kernel.org/stable/?q=f%3Aebiggers+trivial). The only place I used
the word "trivial" was mentioning that querying lore.kernel.org from a Python
script might be trivial, which is true. And also in my response to Sasha's
similar false claim that I was saying everything would be trivial.

I'm not sure why you're literally just making things up; it's not a very good
way to have a productive discussion...

- Eric

1 2 3 4  View All