Mailing List Archive

[PATCH] Add initcall_blacklist kernel parameter
When a module is built into the kernel, the modules's module_init()
function becomes an initcall. Debugging built in kernel modules is
typically done by changing the .config, recompiling, and booting the new
kernel in an effort to determine exactly which module caused a problem.
This is a wasteful time consuming process as it may be repeated several times
before determining precisely what caused the problem.

This patch has been useful in identifying problems with built-in modules and
initcalls. It allows a user to skip an initcall to see if the kernel
would continue to boot properly without requiring recompiles.

Usage: initcall_blacklist=<initcall function>

ex) added "initcall_blacklist=sgi_uv_sysfs_init" as a kernel parameter and
the log contains:

blacklisted initcall sgi_uv_sysfs_init
...
...
function sgi_uv_sysfs_init returning without executing

Cc: Rob Landley <rob@landley.net>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: linux-doc@vger.kernel.org
Signed-off-by: Prarit Bhargava <prarit@redhat.com>
---
Documentation/kernel-parameters.txt | 4 ++++
init/main.c | 16 ++++++++++++++++
2 files changed, 20 insertions(+)

diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
index 7116fda..dadc43b 100644
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -1268,6 +1268,10 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
for working out where the kernel is dying during
startup.

+ initcall_blacklist= [KNL] Stop executing the specified initcall
+ function. Useful for debugging built-in modules and
+ initcalls.
+
initrd= [BOOT] Specify the location of the initial ramdisk

inport.irq= [HW] Inport (ATI XL and Microsoft) busmouse driver
diff --git a/init/main.c b/init/main.c
index 9c7fd4c..a34677c 100644
--- a/init/main.c
+++ b/init/main.c
@@ -666,6 +666,15 @@ static void __init do_ctors(void)
bool initcall_debug;
core_param(initcall_debug, initcall_debug, bool, 0644);

+static char blacklist_buf[128] = "\0";
+static int initcall_blacklist(char *str)
+{
+ snprintf(blacklist_buf, 127, "%s", str);
+ pr_debug("blacklisted initcall %s\n", blacklist_buf);
+ return 0;
+}
+__setup("initcall_blacklist=", initcall_blacklist);
+
static int __init_or_module do_one_initcall_debug(initcall_t fn)
{
ktime_t calltime, delta, rettime;
@@ -689,6 +698,13 @@ int __init_or_module do_one_initcall(initcall_t fn)
int count = preempt_count();
int ret;
char msgbuf[64];
+ char fn_name[128] = "\0";
+
+ snprintf(fn_name, 127, "%pf", fn);
+ if (!strcmp(fn_name, blacklist_buf)) {
+ pr_debug("function %pf returning without executing\n", fn);
+ return -EPERM;
+ }

if (initcall_debug)
ret = do_one_initcall_debug(fn);
--
1.7.9.3

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] Add initcall_blacklist kernel parameter [ In reply to ]
On Wed, Mar 26, 2014 at 10:00 AM, Prarit Bhargava <prarit@redhat.com> wrote:
> When a module is built into the kernel, the modules's module_init()
> function becomes an initcall. Debugging built in kernel modules is
> typically done by changing the .config, recompiling, and booting the new
> kernel in an effort to determine exactly which module caused a problem.
> This is a wasteful time consuming process as it may be repeated several times
> before determining precisely what caused the problem.
>
> This patch has been useful in identifying problems with built-in modules and
> initcalls. It allows a user to skip an initcall to see if the kernel
> would continue to boot properly without requiring recompiles.
>
> Usage: initcall_blacklist=<initcall function>
>
> ex) added "initcall_blacklist=sgi_uv_sysfs_init" as a kernel parameter and
> the log contains:
>
> blacklisted initcall sgi_uv_sysfs_init
> ...
> ...
> function sgi_uv_sysfs_init returning without executing
>

Could you elaborate on cases where this would be more useful than
initcall_debug? It seems odd to have one option to debug initicalls
already saying it's useful for working out where the kernel is dying
during startup, and then adding another option with the same goal.

> Cc: Rob Landley <rob@landley.net>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Steven Rostedt <rostedt@goodmis.org>
> Cc: Ingo Molnar <mingo@kernel.org>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Frederic Weisbecker <fweisbec@gmail.com>
> Cc: linux-doc@vger.kernel.org
> Signed-off-by: Prarit Bhargava <prarit@redhat.com>
> ---
> Documentation/kernel-parameters.txt | 4 ++++
> init/main.c | 16 ++++++++++++++++
> 2 files changed, 20 insertions(+)
>
> diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
> index 7116fda..dadc43b 100644
> --- a/Documentation/kernel-parameters.txt
> +++ b/Documentation/kernel-parameters.txt
> @@ -1268,6 +1268,10 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
> for working out where the kernel is dying during
> startup.
>
> + initcall_blacklist= [KNL] Stop executing the specified initcall
> + function. Useful for debugging built-in modules and
> + initcalls.
> +
> initrd= [BOOT] Specify the location of the initial ramdisk
>
> inport.irq= [HW] Inport (ATI XL and Microsoft) busmouse driver
> diff --git a/init/main.c b/init/main.c
> index 9c7fd4c..a34677c 100644
> --- a/init/main.c
> +++ b/init/main.c
> @@ -666,6 +666,15 @@ static void __init do_ctors(void)
> bool initcall_debug;
> core_param(initcall_debug, initcall_debug, bool, 0644);
>
> +static char blacklist_buf[128] = "\0";
> +static int initcall_blacklist(char *str)
> +{
> + snprintf(blacklist_buf, 127, "%s", str);
> + pr_debug("blacklisted initcall %s\n", blacklist_buf);
> + return 0;
> +}
> +__setup("initcall_blacklist=", initcall_blacklist);
> +

I'm not trying to bikeshed, but this isn't so much a list as it is a
single initcall to skip. Maybe call it initcall_skip ? I can see
people doing something like
initcall_blacklist=sgi_uv_sysfs_init,foo_bar_init,baz_debug_init and
being confused when nothing is skipped because no single initcall
matches that literal string.

josh
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] Add initcall_blacklist kernel parameter [ In reply to ]
On 03/26/2014 12:34 PM, Josh Boyer wrote:
> On Wed, Mar 26, 2014 at 10:00 AM, Prarit Bhargava <prarit@redhat.com> wrote:
>> When a module is built into the kernel, the modules's module_init()
>> function becomes an initcall. Debugging built in kernel modules is
>> typically done by changing the .config, recompiling, and booting the new
>> kernel in an effort to determine exactly which module caused a problem.
>> This is a wasteful time consuming process as it may be repeated several times
>> before determining precisely what caused the problem.
>>
>> This patch has been useful in identifying problems with built-in modules and
>> initcalls. It allows a user to skip an initcall to see if the kernel
>> would continue to boot properly without requiring recompiles.
>>
>> Usage: initcall_blacklist=<initcall function>
>>
>> ex) added "initcall_blacklist=sgi_uv_sysfs_init" as a kernel parameter and
>> the log contains:
>>
>> blacklisted initcall sgi_uv_sysfs_init
>> ...
>> ...
>> function sgi_uv_sysfs_init returning without executing
>>
>
> Could you elaborate on cases where this would be more useful than
> initcall_debug? It seems odd to have one option to debug initicalls
> already saying it's useful for working out where the kernel is dying
> during startup, and then adding another option with the same goal.

Sure, here's an example of where this would have been useful. The last thing
seen on the console was:

Non-volatile memory driver v1.2
Linux agpgart interface v0.103

Seems easy enough to debug, right? Must be the agp driver. So I added
"initcall_debug" and got ...

Non-volatile memory driver v1.2
initcall nvram_init+0x0/0x6f
calling agp_init+0x0/0x28
Linux agpgart interface v0.103

So it is obvious that the agp code is to blame. I removed the agp_intel_init()
(I was on an Intel based system) & driver from the kernel config and then ended
up with the same failure:

Non-volatile memory driver v1.2
Linux agpgart interface v0.103

I was then scratching my head and started pulling config options to get deeper
into the boot. I eventually removed the TPM's init and the system booted -- the
issue was a peculiar HW problem with TPM which hung the system so bad that the
console didn't have time to flush. The issue was resolved in the BIOS FWIW, but
that's not really the point; the problem is that I wasted effort debugging this
when 20-30 lines of code would have made it so much easier to do so.

I no longer have the original call sequence but here's one from current
top-of-tree so you can see the number of builds that were required to figure out
exactly what was broken:

[ 13.804729] initcall agp_init+0x0/0x29 returned 0 after 4466 usecs
[ 13.811628] calling agp_amd64_mod_init+0x0/0x21 @ 1
[ 13.817794] initcall agp_amd64_mod_init+0x0/0x21 returned -19 after 607 usecs
[ 13.825762] calling agp_intel_init+0x0/0x2a @ 1
[ 13.831093] initcall agp_intel_init+0x0/0x2a returned 0 after 172 usecs
[ 13.838479] calling agp_sis_init+0x0/0x2a @ 1
[ 13.843600] initcall agp_sis_init+0x0/0x2a returned 0 after 155 usecs
[ 13.850792] calling agp_via_init+0x0/0x2a @ 1
[ 13.855909] initcall agp_via_init+0x0/0x2a returned 0 after 152 usecs
[ 13.863102] calling init_tis+0x0/0xa6 @ 1
[ 13.868269] tpm_tis 00:0a: 1.2 TPM (device-id 0x0, rev-id 78)
[ 13.910570] initcall init_tis+0x0/0xa6 returned 0 after 41906 usecs

and I had to annoyingly bisect through to figure out exactly which built in
module caused the problem. It really was a waste of time IMO when it would have
been so much easier to do have just bisected init calls and ended up with
"initcall_blacklist=init_tis".

Even with the above output as straight forward as it is I feel I got lucky with
the debug. I know I've debugged situations (plural emphasized) where a function
seconds earlier in the initcalls that caused a problem later on (due to an
expired timer or some such thing). That's not to say the case above also could
have been something much later on ... then I would really have been burning time
doing compiles trying to see what went wrong.

There is admittedly another use for this code and that would be to allow someone
with a broken system to skip over a built-in module's init call. The end user
could have continued with "initcall_blacklist=init_tis" while we continued to
debug. While I wouldn't recommend running this way in production with something
like TPM disabled, I could see myself saying this is okay to do with a driver
that wasn't critical for a system (floppy on a modern server).


>
>> inport.irq= [HW] Inport (ATI XL and Microsoft) busmouse driver
>> diff --git a/init/main.c b/init/main.c
>> index 9c7fd4c..a34677c 100644
>> --- a/init/main.c
>> +++ b/init/main.c
>> @@ -666,6 +666,15 @@ static void __init do_ctors(void)
>> bool initcall_debug;
>> core_param(initcall_debug, initcall_debug, bool, 0644);
>>
>> +static char blacklist_buf[128] = "\0";
>> +static int initcall_blacklist(char *str)
>> +{
>> + snprintf(blacklist_buf, 127, "%s", str);
>> + pr_debug("blacklisted initcall %s\n", blacklist_buf);
>> + return 0;
>> +}
>> +__setup("initcall_blacklist=", initcall_blacklist);
>> +
>
> I'm not trying to bikeshed, but this isn't so much a list as it is a
> single initcall to skip. Maybe call it initcall_skip ? I can see
> people doing something like
> initcall_blacklist=sgi_uv_sysfs_init,foo_bar_init,baz_debug_init and
> being confused when nothing is skipped because no single initcall
> matches that literal string.

Or maybe that implies that I should allow for a list of blacklist'd calls? It
was something that a fellow engineer at Red Hat suggested as well, along with
some cleanups of the 128/127 char writing. He pointed out that removing one
initcall could have an impact on another. For example, removing the init for
netdev would have an impact on loading the igb driver, and then you're booting
into another panic/oops situation.

Is dynamically allocating memory in __setup() calls acceptable? I'm not sure if
there are any restrictions on doing so. Ingo or Peter? Do you know?

P.

>
> josh
>
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] Add initcall_blacklist kernel parameter [ In reply to ]
On Wed, Mar 26, 2014 at 1:47 PM, Prarit Bhargava <prarit@redhat.com> wrote:
>
>
> On 03/26/2014 12:34 PM, Josh Boyer wrote:
>> On Wed, Mar 26, 2014 at 10:00 AM, Prarit Bhargava <prarit@redhat.com> wrote:
>>> When a module is built into the kernel, the modules's module_init()
>>> function becomes an initcall. Debugging built in kernel modules is
>>> typically done by changing the .config, recompiling, and booting the new
>>> kernel in an effort to determine exactly which module caused a problem.
>>> This is a wasteful time consuming process as it may be repeated several times
>>> before determining precisely what caused the problem.
>>>
>>> This patch has been useful in identifying problems with built-in modules and
>>> initcalls. It allows a user to skip an initcall to see if the kernel
>>> would continue to boot properly without requiring recompiles.
>>>
>>> Usage: initcall_blacklist=<initcall function>
>>>
>>> ex) added "initcall_blacklist=sgi_uv_sysfs_init" as a kernel parameter and
>>> the log contains:
>>>
>>> blacklisted initcall sgi_uv_sysfs_init
>>> ...
>>> ...
>>> function sgi_uv_sysfs_init returning without executing
>>>
>>
>> Could you elaborate on cases where this would be more useful than
>> initcall_debug? It seems odd to have one option to debug initicalls
>> already saying it's useful for working out where the kernel is dying
>> during startup, and then adding another option with the same goal.
>
> Sure, here's an example of where this would have been useful. The last thing
> seen on the console was:

Thanks for the quick reply!

<snip very well detailed example>

OK, so maybe elude to the reasons you are introducing this in both the
commit log and the kernel-parameters description. Something along the
lines of:

"This can be useful stand-alone or combined with initcall_debug.
There are cases where some initcalls can hang the machine before the
console can be flushed, which can make initcall_debug output
inaccurate. Having the ability to skip initcalls can help further
debugging of these scenarios."

> There is admittedly another use for this code and that would be to allow someone
> with a broken system to skip over a built-in module's init call. The end user
> could have continued with "initcall_blacklist=init_tis" while we continued to
> debug. While I wouldn't recommend running this way in production with something
> like TPM disabled, I could see myself saying this is okay to do with a driver
> that wasn't critical for a system (floppy on a modern server).

Yeah, I don't think that's unreasonable for a per-instance debugging
situation, but I wouldn't really recommend it in the help text or
anything ;).

>>> inport.irq= [HW] Inport (ATI XL and Microsoft) busmouse driver
>>> diff --git a/init/main.c b/init/main.c
>>> index 9c7fd4c..a34677c 100644
>>> --- a/init/main.c
>>> +++ b/init/main.c
>>> @@ -666,6 +666,15 @@ static void __init do_ctors(void)
>>> bool initcall_debug;
>>> core_param(initcall_debug, initcall_debug, bool, 0644);
>>>
>>> +static char blacklist_buf[128] = "\0";
>>> +static int initcall_blacklist(char *str)
>>> +{
>>> + snprintf(blacklist_buf, 127, "%s", str);
>>> + pr_debug("blacklisted initcall %s\n", blacklist_buf);
>>> + return 0;
>>> +}
>>> +__setup("initcall_blacklist=", initcall_blacklist);
>>> +
>>
>> I'm not trying to bikeshed, but this isn't so much a list as it is a
>> single initcall to skip. Maybe call it initcall_skip ? I can see
>> people doing something like
>> initcall_blacklist=sgi_uv_sysfs_init,foo_bar_init,baz_debug_init and
>> being confused when nothing is skipped because no single initcall
>> matches that literal string.
>
> Or maybe that implies that I should allow for a list of blacklist'd calls? It
> was something that a fellow engineer at Red Hat suggested as well, along with
> some cleanups of the 128/127 char writing. He pointed out that removing one
> initcall could have an impact on another. For example, removing the init for
> netdev would have an impact on loading the igb driver, and then you're booting
> into another panic/oops situation.

Sure, fixing it to be a list would work too.

> Is dynamically allocating memory in __setup() calls acceptable? I'm not sure if
> there are any restrictions on doing so. Ingo or Peter? Do you know?

That I'm unsure of myself.

josh
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] Add initcall_blacklist kernel parameter [ In reply to ]
Prarit Bhargava <prarit@redhat.com> writes:
> core_param(initcall_debug, initcall_debug, bool, 0644);
>
> +static char blacklist_buf[128] = "\0";

__initdata


> +static int initcall_blacklist(char *str)

__init

Rest looks good. Should be quite useful to have this option

-Andi

--
ak@linux.intel.com -- Speaking for myself only
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/