Mailing List Archive

[RFC] KEXEC: Clean up logic to choose a range for the crash kernel
--
Andrew Cooper - Dom0 Kernel Engineer, Citrix XenServer
T: +44 (0)1223 225 900, http://www.citrix.com
Re: [RFC] KEXEC: Clean up logic to choose a range for the crash kernel [ In reply to ]
>>> On 25.11.11 at 16:12, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
>Before this patch, using "classic" crashdump= syntax (<size>@<start>)
>causes an exact size and location to be reserved in the e820 map
>before Xen and modules are relocated. However, using "new" syntax
>(<start>-[<end>]:<size>) encounteres a myriad of problems, most
>notibly not considering the range when allocating its region.

I'm sure I tested this (with some basic values), so the "myriad of
problems" make me curious as to what they are.

What does "range" refer to here, and what is "its region"?

You also seem to overlook that even the old syntax allowed for only
specifying a size.

>This patch tries to unify both codepaths, and fix the broken logic
>when using "newer" syntax.
>
>Changes addressed:
>
>1) Remove use of kexec_crash_area from parse_crashkernel().
> Use ranges[0] in preference, as there is no support for multiple
> ranges using classic syntax. This prevents some wierd logic in
> __setup_xen() depending on whether kexec_crash_area.size is set
> or not.

What weird about that?

>2) Change set_kexec_crash_area_size() to choose_kexec_range()
> The new method of choosing a range provides an entire rather than

And entire what?

> just setting kexec_crash_area.size. Also, fix the very broken
> logic which will only consider a range if its .end is greater
> than system_ram.

Where's that happening? Are you perhaps mis-interpreting the
purpose of these ranges?

> Additionally, prefer range with the largest
> size if we have a choice of valid ones, and prefer ranges with
> more flexibility in their limits.
>
>3) Move kexec_reserve_area() from setup.c to kexec.c
> It makes more sense here, Also, change its functionality a little
> to always work from the kexec range provided by choose_kexec_range()
>
>4) Remove the kexec reservation code when considering modules
> This code only had any effect for people using the "newer" syntax
> on the command line, as people using the "classic" syntax would
> reserve a memory region before considering modules.

Are you sure? How do you prevent the kexec area from overlapping
any of the modules (in particular the initrd, which can get passed in
place to Dom0)?

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
Re: [RFC] KEXEC: Clean up logic to choose a range for the crash kernel [ In reply to ]
On 25/11/11 15:36, Jan Beulich wrote:
>>>> On 25.11.11 at 16:12, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
>> Before this patch, using "classic" crashdump= syntax (<size>@<start>)
>> causes an exact size and location to be reserved in the e820 map
>> before Xen and modules are relocated. However, using "new" syntax
>> (<start>-[<end>]:<size>) encounteres a myriad of problems, most
>> notibly not considering the range when allocating its region.
> I'm sure I tested this (with some basic values), so the "myriad of
> problems" make me curious as to what they are.

The comment was a bit old from before I twigged what some of the code
was doing. The two key problems were only considering the size, and a
bug in set_kexec_crash_area_size() which will only consider a range as
being valid if its end is greater than system_ram. This bug prevents
specifying a range as "128M anywhere in 1G-2G" actually working on a box
with more than 2G of ram.

> What does "range" refer to here, and what is "its region"?

I will need to clean up my use of the term range and region, although it
doesn't help that there is ambiguity in the source code as well.
Loosely, I have been using "range" to mean a possible range as given on
the command line, and "region" to be the actual area in memory where the
kdump kernel is going to be put.

> You also seem to overlook that even the old syntax allowed for only
> specifying a size.

No. In the case of using the old syntax, both
kexec_crash_area.{start,size} are set, which caused the earlier call to
kexec_reserve_region() in __setup_xen() to attempt to reserve an exact
region at an exact address. Using the newer syntax,
kexec_crash_area.size did not get set in the parsing function, and got
considered in set_kexec_crash_area_size(), which only sets the size.
The subsiquent call to kexec_reverse_area() on the next line fails
silently because kexec_crash_area.start is 0. Later, when considering
the modules, kexec_crash_area.start gets set to e-size, at which point
the call to kexec_reserve_area() actually does attempt to reserve a
region, based on the limits of an arbitrary e820 entry, not on the
limits provided on the command line.

>> This patch tries to unify both codepaths, and fix the broken logic
>> when using "newer" syntax.
>>
>> Changes addressed:
>>
>> 1) Remove use of kexec_crash_area from parse_crashkernel().
>> Use ranges[0] in preference, as there is no support for multiple
>> ranges using classic syntax. This prevents some wierd logic in
>> __setup_xen() depending on whether kexec_crash_area.size is set
>> or not.
> What weird about that?

As described above.

>> 2) Change set_kexec_crash_area_size() to choose_kexec_range()
>> The new method of choosing a range provides an entire rather than
> And entire what?

Oops. An entire "kexec_range" entry giving a start, end and size value,
rather than the current case which takes either an exact size/start pair
from the 'classic' case, or a size anywhere in memory for the 'newer' case.

>> just setting kexec_crash_area.size. Also, fix the very broken
>> logic which will only consider a range if its .end is greater
>> than system_ram.
> Where's that happening? Are you perhaps mis-interpreting the
> purpose of these ranges?

I might possibly be misinterpreting these ranges as the only
documentation is the code, but I believe I have got it correct based on
the upstream Linux code. What are your interpretations of the ranges?

>> Additionally, prefer range with the largest
>> size if we have a choice of valid ones, and prefer ranges with
>> more flexibility in their limits.
>>
>> 3) Move kexec_reserve_area() from setup.c to kexec.c
>> It makes more sense here, Also, change its functionality a little
>> to always work from the kexec range provided by choose_kexec_range()
>>
>> 4) Remove the kexec reservation code when considering modules
>> This code only had any effect for people using the "newer" syntax
>> on the command line, as people using the "classic" syntax would
>> reserve a memory region before considering modules.
> Are you sure? How do you prevent the kexec area from overlapping
> any of the modules (in particular the initrd, which can get passed in
> place to Dom0)?

Good point, which is why this is an RFC. Note that the "classic"
syntax, which everyone refers to in documents on the subject, will never
consider any of the modules. I am not really sure which is best, but I
would consider positioning the kdump kernel more important than the
location of initrd. The user is likely to know more about why their
kdump kernel needs to be located where specified than Xen does. Perhaps
after deciding where the kdump kernel should go, we should move any
overlapping modules?

> Jan
>

--
Andrew Cooper - Dom0 Kernel Engineer, Citrix XenServer
T: +44 (0)1223 225 900, http://www.citrix.com


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
Re: [RFC] KEXEC: Clean up logic to choose a range for the crash kernel [ In reply to ]
On 25/11/11 16:03, Andrew Cooper wrote:
> On 25/11/11 15:36, Jan Beulich wrote:
>>>>> On 25.11.11 at 16:12, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
>>> Before this patch, using "classic" crashdump= syntax (<size>@<start>)
>>> causes an exact size and location to be reserved in the e820 map
>>> before Xen and modules are relocated. However, using "new" syntax
>>> (<start>-[<end>]:<size>) encounteres a myriad of problems, most
>>> notibly not considering the range when allocating its region.
>> I'm sure I tested this (with some basic values), so the "myriad of
>> problems" make me curious as to what they are.
> The comment was a bit old from before I twigged what some of the code
> was doing. The two key problems were only considering the size, and a
> bug in set_kexec_crash_area_size() which will only consider a range as
> being valid if its end is greater than system_ram. This bug prevents
> specifying a range as "128M anywhere in 1G-2G" actually working on a box
> with more than 2G of ram.
>
>> What does "range" refer to here, and what is "its region"?
> I will need to clean up my use of the term range and region, although it
> doesn't help that there is ambiguity in the source code as well.
> Loosely, I have been using "range" to mean a possible range as given on
> the command line, and "region" to be the actual area in memory where the
> kdump kernel is going to be put.
>
>> You also seem to overlook that even the old syntax allowed for only
>> specifying a size.
> No. In the case of using the old syntax, both
> kexec_crash_area.{start,size} are set, which caused the earlier call to
> kexec_reserve_region() in __setup_xen() to attempt to reserve an exact
> region at an exact address. Using the newer syntax,
> kexec_crash_area.size did not get set in the parsing function, and got
> considered in set_kexec_crash_area_size(), which only sets the size.
> The subsiquent call to kexec_reverse_area() on the next line fails
> silently because kexec_crash_area.start is 0. Later, when considering
> the modules, kexec_crash_area.start gets set to e-size, at which point
> the call to kexec_reserve_area() actually does attempt to reserve a
> region, based on the limits of an arbitrary e820 entry, not on the
> limits provided on the command line.
>
>>> This patch tries to unify both codepaths, and fix the broken logic
>>> when using "newer" syntax.
>>>
>>> Changes addressed:
>>>
>>> 1) Remove use of kexec_crash_area from parse_crashkernel().
>>> Use ranges[0] in preference, as there is no support for multiple
>>> ranges using classic syntax. This prevents some wierd logic in
>>> __setup_xen() depending on whether kexec_crash_area.size is set
>>> or not.
>> What weird about that?
> As described above.
>
>>> 2) Change set_kexec_crash_area_size() to choose_kexec_range()
>>> The new method of choosing a range provides an entire rather than
>> And entire what?
> Oops. An entire "kexec_range" entry giving a start, end and size value,
> rather than the current case which takes either an exact size/start pair
> from the 'classic' case, or a size anywhere in memory for the 'newer' case.
>
>>> just setting kexec_crash_area.size. Also, fix the very broken
>>> logic which will only consider a range if its .end is greater
>>> than system_ram.
>> Where's that happening? Are you perhaps mis-interpreting the
>> purpose of these ranges?
> I might possibly be misinterpreting these ranges as the only
> documentation is the code, but I believe I have got it correct based on
> the upstream Linux code. What are your interpretations of the ranges?
>
>>> Additionally, prefer range with the largest
>>> size if we have a choice of valid ones, and prefer ranges with
>>> more flexibility in their limits.
>>>
>>> 3) Move kexec_reserve_area() from setup.c to kexec.c
>>> It makes more sense here, Also, change its functionality a little
>>> to always work from the kexec range provided by choose_kexec_range()
>>>
>>> 4) Remove the kexec reservation code when considering modules
>>> This code only had any effect for people using the "newer" syntax
>>> on the command line, as people using the "classic" syntax would
>>> reserve a memory region before considering modules.
>> Are you sure? How do you prevent the kexec area from overlapping
>> any of the modules (in particular the initrd, which can get passed in
>> place to Dom0)?
> Good point, which is why this is an RFC. Note that the "classic"
> syntax, which everyone refers to in documents on the subject, will never
> consider any of the modules. I am not really sure which is best, but I
> would consider positioning the kdump kernel more important than the
> location of initrd. The user is likely to know more about why their
> kdump kernel needs to be located where specified than Xen does. Perhaps
> after deciding where the kdump kernel should go, we should move any
> overlapping modules?

On second thoughts, how about this?

If a user provides an exact position and size for the kdump kernel,
place the kdump kernel there and move any overlapping modules.

If the user provides a range, attempt to fit the kdump kernel around
other modules, and in the case where we cant, move the offending
module(s) elsewhere.

>> Jan
>>

--
Andrew Cooper - Dom0 Kernel Engineer, Citrix XenServer
T: +44 (0)1223 225 900, http://www.citrix.com


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
Re: [RFC] KEXEC: Clean up logic to choose a range for the crash kernel [ In reply to ]
>>> On 25.11.11 at 17:03, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
> On 25/11/11 15:36, Jan Beulich wrote:
>>>>> On 25.11.11 at 16:12, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
>> Where's that happening? Are you perhaps mis-interpreting the
>> purpose of these ranges?
>
> I might possibly be misinterpreting these ranges as the only
> documentation is the code, but I believe I have got it correct based on
> the upstream Linux code. What are your interpretations of the ranges?

The ranges are to say "if the system has between <start> and <end>
amount of memory, then allocate a <size> block (possibly located at
<offset>).

>>> 4) Remove the kexec reservation code when considering modules
>>> This code only had any effect for people using the "newer" syntax
>>> on the command line, as people using the "classic" syntax would
>>> reserve a memory region before considering modules.
>> Are you sure? How do you prevent the kexec area from overlapping
>> any of the modules (in particular the initrd, which can get passed in
>> place to Dom0)?
>
> Good point, which is why this is an RFC. Note that the "classic"
> syntax, which everyone refers to in documents on the subject, will never
> consider any of the modules.

It won't in either case when addresses are specified. It will in
both cases when only sizes are provided.

> I am not really sure which is best, but I
> would consider positioning the kdump kernel more important than the
> location of initrd. The user is likely to know more about why their
> kdump kernel needs to be located where specified than Xen does. Perhaps
> after deciding where the kdump kernel should go, we should move any
> overlapping modules?

And that's already happening (or is at least supposed to be).

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
Re: [RFC] KEXEC: Clean up logic to choose a range for the crash kernel [ In reply to ]
On 25/11/11 16:49, Jan Beulich wrote:
>>>> On 25.11.11 at 17:03, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
>> On 25/11/11 15:36, Jan Beulich wrote:
>>>>>> On 25.11.11 at 16:12, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
>>> Where's that happening? Are you perhaps mis-interpreting the
>>> purpose of these ranges?
>> I might possibly be misinterpreting these ranges as the only
>> documentation is the code, but I believe I have got it correct based on
>> the upstream Linux code. What are your interpretations of the ranges?
> The ranges are to say "if the system has between <start> and <end>
> amount of memory, then allocate a <size> block (possibly located at
> <offset>).

Ah ok. That actually makes more logical sense. Is there some
documentation as to exactly what is permitted to specify? I have not
been able to find any.

>>>> 4) Remove the kexec reservation code when considering modules
>>>> This code only had any effect for people using the "newer" syntax
>>>> on the command line, as people using the "classic" syntax would
>>>> reserve a memory region before considering modules.
>>> Are you sure? How do you prevent the kexec area from overlapping
>>> any of the modules (in particular the initrd, which can get passed in
>>> place to Dom0)?
>> Good point, which is why this is an RFC. Note that the "classic"
>> syntax, which everyone refers to in documents on the subject, will never
>> consider any of the modules.
> It won't in either case when addresses are specified. It will in
> both cases when only sizes are provided.

How can you specify an offset while using the range syntax? This leaves
the classic syntax as the only way to specific an exact location in memory.

>> I am not really sure which is best, but I
>> would consider positioning the kdump kernel more important than the
>> location of initrd. The user is likely to know more about why their
>> kdump kernel needs to be located where specified than Xen does. Perhaps
>> after deciding where the kdump kernel should go, we should move any
>> overlapping modules?
> And that's already happening (or is at least supposed to be).
>
> Jan
>

--
Andrew Cooper - Dom0 Kernel Engineer, Citrix XenServer
T: +44 (0)1223 225 900, http://www.citrix.com


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
Re: [RFC] KEXEC: Clean up logic to choose a range for the crash kernel [ In reply to ]
On 25/11/11 17:03, Andrew Cooper wrote:
>
> On 25/11/11 16:49, Jan Beulich wrote:
>>>>> On 25.11.11 at 17:03, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
>>> On 25/11/11 15:36, Jan Beulich wrote:
>>>>>>> On 25.11.11 at 16:12, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
>>>> Where's that happening? Are you perhaps mis-interpreting the
>>>> purpose of these ranges?
>>> I might possibly be misinterpreting these ranges as the only
>>> documentation is the code, but I believe I have got it correct based on
>>> the upstream Linux code. What are your interpretations of the ranges?
>> The ranges are to say "if the system has between <start> and <end>
>> amount of memory, then allocate a <size> block (possibly located at
>> <offset>).
> Ah ok. That actually makes more logical sense. Is there some
> documentation as to exactly what is permitted to specify? I have not
> been able to find any.
>
>>>>> 4) Remove the kexec reservation code when considering modules
>>>>> This code only had any effect for people using the "newer" syntax
>>>>> on the command line, as people using the "classic" syntax would
>>>>> reserve a memory region before considering modules.
>>>> Are you sure? How do you prevent the kexec area from overlapping
>>>> any of the modules (in particular the initrd, which can get passed in
>>>> place to Dom0)?
>>> Good point, which is why this is an RFC. Note that the "classic"
>>> syntax, which everyone refers to in documents on the subject, will never
>>> consider any of the modules.
>> It won't in either case when addresses are specified. It will in
>> both cases when only sizes are provided.
> How can you specify an offset while using the range syntax? This leaves
> the classic syntax as the only way to specific an exact location in memory.
>
>>> I am not really sure which is best, but I
>>> would consider positioning the kdump kernel more important than the
>>> location of initrd. The user is likely to know more about why their
>>> kdump kernel needs to be located where specified than Xen does. Perhaps
>>> after deciding where the kdump kernel should go, we should move any
>>> overlapping modules?
>> And that's already happening (or is at least supposed to be).
>>
>> Jan

After comparing the Xen parsing code with the Linux parsing code, with
the correct idea of what we are trying to do, Xen still lacks the
ability to specify an exact offset using the newer style. I will
implement this.

--
Andrew Cooper - Dom0 Kernel Engineer, Citrix XenServer
T: +44 (0)1223 225 900, http://www.citrix.com


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
Re: [RFC] KEXEC: Clean up logic to choose a range for the crash kernel [ In reply to ]
>>> On 25.11.11 at 18:49, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
> After comparing the Xen parsing code with the Linux parsing code, with
> the correct idea of what we are trying to do, Xen still lacks the
> ability to specify an exact offset using the newer style. I will
> implement this.

What am I overlooking? set_kexec_crash_area_size() takes the new
syntax into consideration, and kexec_reserve_area() gets called only
afterwards.

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel