Mailing List Archive

Portage load control
Hello list,

I still don't know how this works. I ran a test over the last two days, and
the result does not accord with 'man make.conf' nor 'man 1 make'.

First, 'man make.conf' does not state that --load-average, if set, will
override --jobs, as it clearly does.

Second, the two pages contribute actively to the confusion between the emerge
jobs submitted in parallel by portage and the concurrent tasks that may be
launched by each of those.

The test:

I ran 'emerge -e @world' with EMERGE_DEFAULT_OPTS="--jobs=10
--load-average=40 ...". It took 350m46s.

Then I ran the same -e with --load-average=40, but no --jobs and no -j. That
took 351m21s - 35 seconds longer! What's worse, the load average was
controlled at about 72, not 40. I watched it for some time, and even though
all three load averages were at 72-75, portage kept on starting more packages.

As far as I could see, swap was not touched.

The machine has 24 threads and 64GB RAM (not to mention plenty of swap), so
how was the 72 figure arrived at?

I still don't know how to control the number of simultaneous compilations,
short of limiting them to one.

--
Regards,
Peter.
Re: Portage load control
On 2023.05.06 07:50, Peter Humphrey wrote:
> Hello list,
>
> I still don't know how this works. I ran a test over the last two
> days, and
> the result does not accord with 'man make.conf' nor 'man 1 make'.
>
> First, 'man make.conf' does not state that --load-average, if set,
> will
> override --jobs, as it clearly does.
>
> Second, the two pages contribute actively to the confusion between
> the emerge
> jobs submitted in parallel by portage and the concurrent tasks that
> may be
> launched by each of those.
>
> The test:
>
> I ran 'emerge -e @world' with EMERGE_DEFAULT_OPTS="--jobs=10
> --load-average=40 ...". It took 350m46s.
>
> Then I ran the same -e with --load-average=40, but no --jobs and no
> -j. That
> took 351m21s - 35 seconds longer! What's worse, the load average was
> controlled at about 72, not 40. I watched it for some time, and even
> though
> all three load averages were at 72-75, portage kept on starting more
> packages.
>
> As far as I could see, swap was not touched.
>
> The machine has 24 threads and 64GB RAM (not to mention plenty of
> swap), so
> how was the 72 figure arrived at?
>
> I still don't know how to control the number of simultaneous
> compilations,
> short of limiting them to one.
>
> --
> Regards,
> Peter.
Minor point - are you sure ccache isn't affecting your results?

I hope I'm not preaching to the choir, and I have NOT reread the
various man pages, but the different options you mention (and some you
don't) apply to different parts of the process. Some tell emerge
whether or not to start working on another package, but once it starts
the process, it has no control over how busy the machine can get. Then
there are those that get passed to make. I wouldn't think so, but are
you possibly confusing the two? Lastly, I don't see that those that
apply to make would have any effect on packages that use ninja instead,
so that might also contribute to the issue.

Separate question, only vaguely related: is there any easy way to tell
what build tools (ninja vs make, gcc vs clang vs ?) were used for an
installed package without actually looking into the ebuild? It's
probably not relevant to your question about controlling the load, but
I used to rebuild everything after installing a new version of gcc, and
have since realized there are many packages that make no use of gcc, so
the rebuild serves no point, and I miss rebuilding packages that use
clang after an upgrade of it and related tools.

Jack
Re: Portage load control
On Saturday, 6 May 2023 19:18:25 BST Jack wrote:

> Minor point - are you sure ccache isn't affecting your results?

Pretty sure - it isn't installed here. :)

> I hope I'm not preaching to the choir, and I have NOT reread the
> various man pages, but the different options you mention (and some you
> don't) apply to different parts of the process. Some tell emerge
> whether or not to start working on another package, but once it starts
> the process, it has no control over how busy the machine can get. Then
> there are those that get passed to make. I wouldn't think so, but are
> you possibly confusing the two? Lastly, I don't see that those that
> apply to make would have any effect on packages that use ninja instead,
> so that might also contribute to the issue.

Yes, I understand all that; it all points to the confusion encouraged by the
man pages.

> Separate question, only vaguely related: is there any easy way to tell
> what build tools (ninja vs make, gcc vs clang vs ?) were used for an
> installed package without actually looking into the ebuild? It's
> probably not relevant to your question about controlling the load, but
> I used to rebuild everything after installing a new version of gcc, and
> have since realized there are many packages that make no use of gcc, so
> the rebuild serves no point, and I miss rebuilding packages that use
> clang after an upgrade of it and related tools.

Good questions, but I don't think they're related to mine. I just want to know
how to control the number of simultaneous tasks during a big emerge - or any
emerge, really.

--
Regards,
Peter.
Re: Portage load control
On Sunday, 7 May 2023 11:27:14 BST Peter Humphrey wrote:
> On Saturday, 6 May 2023 19:18:25 BST Jack wrote:

> > I hope I'm not preaching to the choir, and I have NOT reread the
> > various man pages, but the different options you mention (and some you
> > don't) apply to different parts of the process. Some tell emerge
> > whether or not to start working on another package, but once it starts
> > the process, it has no control over how busy the machine can get. Then
> > there are those that get passed to make. I wouldn't think so, but are
> > you possibly confusing the two? Lastly, I don't see that those that
> > apply to make would have any effect on packages that use ninja instead,
> > so that might also contribute to the issue.
>
> Yes, I understand all that; it all points to the confusion encouraged by the
> man pages.

As I understand it and have so far confirmed on my systems, the --jobs
directive explained on the emerge man page places a limit on how many
different non-dependent packages will be emerged in parallel at any time, by
any single emerge invocation. Where there are inter-dependencies between
packages, they will be built sequentially and in an appropriate order as per
the dependency graph emerge will determine and therefore the number of emerge
jobs could be lower than specified by the user.

If no jobs are specified on the command line, the EMERGE_DEFAULT_OPTS variable
in make.conf will be sourced instead. If --jobs is given but left empty, then
the number of parallel emerges will be unlimited and will swamp the CPU - see
--load-average next.

The --load-average directive in emerge specifies the average number of
packages emerge will try to build at any time. This number determines if a
new package build will start by emerge at any point in time. I don't know
over what period of time such a load average is calculated. It is recommended
to set the load-average at the number of CPU cores x 0.9 to maintain some
system responsiveness.

In addition to the above, we can specify --jobs and --load-average in MAKEOPTS
within make.conf. These directives will determine how many 'make' commands
will be allowed to run concurrently when emerge sources the MAKEOPTS variable.

So, if you have set MAKEOPTS="-j 10" and then run 'emerge --jobs=10' you will
see up to 10x10=100 parallel make tasks in your top output, while the load
average on e.g. a 100-core CPU will show 1.00.
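
The worst case described above is simply a product. A minimal sketch of that arithmetic, assuming every one of the K parallel package builds really does run make with the full -jN. The function names are invented for illustration, and the 24-thread figure is the machine from the original post:

```python
def max_make_tasks(emerge_jobs: int, make_jobs: int) -> int:
    """Upper bound on simultaneous make tasks: K package builds x N tasks each."""
    if emerge_jobs < 1 or make_jobs < 1:
        raise ValueError("both job counts must be at least 1")
    return emerge_jobs * make_jobs


def oversubscription(emerge_jobs: int, make_jobs: int, threads: int) -> float:
    """Runnable tasks per hardware thread implied by the worst case."""
    return max_make_tasks(emerge_jobs, make_jobs) / threads


if __name__ == "__main__":
    # MAKEOPTS="-j 10" combined with 'emerge --jobs=10'
    print(max_make_tasks(10, 10))
    # The same settings on a 24-thread machine
    print(round(oversubscription(10, 10, 24), 2))
```

This is only an upper bound: dependencies between packages and single-threaded build phases will usually keep the real number lower.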

I understand --jobs is used to provide a hard limit, i.e. an instruction to
NOT run any more than the specified parallel package builds, multiplied by the
specified make commands.

The --load-average is an instruction to keep starting more builds and/or run
more make commands, to keep the system busy up to the specified average load.

Enough about the theory, what about its application? I cannot answer why in
your experiment the no-jobs plus 40 load average ended up being 35 seconds
longer. The no jobs implies emerge would continue to increase package builds
as long as inter-dependencies between them and available resources allowed.
Assuming no other processes were using up resources on both runs, the no jobs
experiment should have completed the work sooner. Since you observed no usage
of swap took place, clearly resources were not exhausted. Could it be the
calculation of the average load introduced some loopback and hysteresis
inefficiencies as the average floating number was building up-overshooting-
cutting down, compared with the previous hard limit for number of jobs? I
don't know.

The way I set my systems, admittedly with only a fraction of your resources,
is by setting only the MAKEOPTS variable. I set the jobs number at the number
of CPU threads + 1, or 2xCPU threads +1 and load average at 0.9 of the jobs
number. Smaller packages do not exhaust my RAM, but monsters like Chromium/
qtwebengine at more than 2G per job easily do. For these I set specific
limits in the number of jobs in /etc/portage/env/ and keep an eye on how much
swap may increase as gcc versions evolve. If I notice swap is thrashing the
disk I dial down the jobs number accordingly.
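
The rule of thumb in the paragraph above can be written down as a small helper. This is only a sketch of that personal convention, not anything portage provides. The function name and the rounding to one decimal place are assumptions made for the example:

```python
def makeopts_rule_of_thumb(cpu_threads: int, doubled: bool = False) -> str:
    """Build a MAKEOPTS string: jobs = threads + 1 (or 2 x threads + 1),
    load average = 0.9 x jobs, as described in the message above."""
    if cpu_threads < 1:
        raise ValueError("cpu_threads must be at least 1")
    jobs = (2 * cpu_threads if doubled else cpu_threads) + 1
    load = round(0.9 * jobs, 1)
    return f"-j{jobs} -l{load}"


if __name__ == "__main__":
    print(makeopts_rule_of_thumb(8))
    print(makeopts_rule_of_thumb(8, doubled=True))
```

Memory-hungry packages would still need a lower per-package job count than this produces.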
Re: Portage load control
On Sunday, 7 May 2023 15:52:08 BST Michael wrote:

> As I understand it and have so far confirmed on my systems, the --jobs
> directive explained on the emerge man page places a limit on how many
> different non-dependent packages will be emerged in parallel at any time, by
> any single emerge invocation. Where there are inter-dependencies between
> packages, they will be built sequentially and in an appropriate order as
> per the dependency graph emerge will determine and therefore the number of
> emerge jobs could be lower than specified by the user.
>
> If no jobs are specified on the command line, the EMERGE_DEFAULT_OPTS
> variable in make.conf will be sourced instead. If --jobs is given but left
> empty, then the number of parallel emerges will be unlimited and will swamp
> the CPU - see --load-average next.
>
> The --load-average directive in emerge specifies the average number of
> packages emerge will try to build at any time. This number determines if a
> new package build will start by emerge at any point in time. I don't know
> over what period of time such a load average is calculated. It is
> recommended to set the load-average at the number of CPU cores x 0.9 to
> maintain some system responsiveness.
>
> In addition to the above, we can specify --jobs and --load-average in
> MAKEOPTS within make.conf. These directives will determine how many
> 'make' commands will be allowed to run concurrently when emerge sources
> the MAKEOPTS variable.
>
> So, if you have set MAKEOPTS="-j 10" and then run 'emerge --jobs=10' you
> will see up to 10x10=100 parallel make tasks in your top output, while the
> load average on e.g. a 100-core CPU will show 1.00.
>
> I understand --jobs is used to provide a hard limit, i.e. an instruction to
> NOT run any more than the specified parallel package builds, multiplied by
> the specified make commands.
>
> The --load-average is an instruction to keep starting more builds and/or run
> more make commands, to keep the system busy up to the specified average
> load.

Everybody keeps explaining how the system is supposed to work. I know all
that, as I said last time.

The problem is that PORTAGE IS NOT DOING WHAT IT'S SUPPOSED TO.

I don't like to shout, but I don't know how else to get the point across.

--
Regards,
Peter.
Re: Portage load control
On Sat, May 06, 2023 at 12:50:08PM +0100, Peter Humphrey wrote

> Second, the two pages contribute actively to the confusion between the emerge
> jobs submitted in parallel by portage and the concurrent tasks that may be
> launched by each of those.
>
> The test:
>
> I ran 'emerge -e @world' with EMERGE_DEFAULT_OPTS="--jobs=10
> --load-average=40 ...". It took 350m46s.
>
> Then I ran the same -e with --load-average=40, but no --jobs and no -j. That
> took 351m21s - 35 seconds longer! What's worse, the load average was
> controlled at about 72, not 40. I watched it for some time, and even though
> all three load averages were at 72-75, portage kept on starting more packages.
>
> As far as I could see, swap was not touched.
>
> The machine has 24 threads and 64GB RAM (not to mention plenty of swap), so
> how was the 72 figure arrived at?

I think you're over-complicating things here. Forget --jobs and
--load-average and their interactions. First of all the Gentoo install
handbook has a dire warning that each *THREAD* requires *AT LEAST* 2 GiB
of ram. https://wiki.gentoo.org/wiki/Handbook:AMD64/Full/Installation#MAKEOPTS

You've got 24 threads and 64 GB RAM, so try...

MAKEOPTS="-j20"

...to leave a few threads for your system if you're playing freecell or
whatever. How fast do the builds go? Don't sweat +/- 30 seconds.
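
As a rough sketch of that sizing, assuming the 2 GiB-per-job figure from the handbook warning and a hypothetical number of threads held back for interactive use (the function and parameter names are invented for the example):

```python
def suggested_make_jobs(threads: int, ram_gib: float,
                        gib_per_job: float = 2.0,
                        reserved_threads: int = 0) -> int:
    """Pick -jN as the smaller of the RAM-limited and CPU-limited job counts."""
    by_ram = int(ram_gib // gib_per_job)      # jobs the memory can carry
    by_cpu = threads - reserved_threads       # jobs the CPU should carry
    return max(1, min(by_ram, by_cpu))


if __name__ == "__main__":
    # 24 threads, 64 GB RAM, four threads kept free for the desktop
    print(f'MAKEOPTS="-j{suggested_make_jobs(24, 64, reserved_threads=4)}"')
```

On this machine the CPU count, not the RAM, is the limiting term.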

--
I've seen things, you people wouldn't believe; Gopher, Netscape with
frames, the first Browser Wars. Searching for pages with AltaVista,
pop-up windows self-replicating, trying to uninstall RealPlayer. All
those moments, will be lost in time like tears in rain... time to die.
Re: Portage load control
On Sun, 07 May 2023 17:00:16 +0100, Peter Humphrey wrote:

> Everybody keeps explaining how the system is supposed to work. I know
> all that, as I said last time.
>
> The problem is that PORTAGE IS NOT DOING WHAT IT'S SUPPOSED TO.

I see the same at times. I realise that portage can only look at the load
at the time it starts another package, it has no control over what
happens after that, but the way your system seems to stick to a limit,
just not the one you set, seems strange.

> I don't like to shout, but I don't know how else to get the point
> across.

Maybe you should take this to bgo where it can be flagged for the portage
devs to look at, just keep us posted on the outcome.


--
Neil Bothwick

I'd tell you a UDP joke, but you may not get it.
Re: Portage load control
On Monday, 8 May 2023 11:20:45 BST Neil Bothwick wrote:

> Maybe you should take this to bgo where it can be flagged for the portage
> devs to look at, just keep us posted on the outcome.

https://bugs.gentoo.org/905933

--
Regards,
Peter.
Re: Portage load control
On Monday, 8 May 2023 11:20:45 BST Neil Bothwick wrote:

> Maybe you should take this to bgo where it can be flagged for the portage
> devs to look at, just keep us posted on the outcome.

So far, I've just been asked whether I expected something different, to which I
replied "Why is --load-average=40 being ignored?"

Perhaps we don't all understand the same things about how this is supposed to
work.

--
Regards,
Peter.
Re: Portage load control
On Thu, May 11, 2023 at 6:34 AM Peter Humphrey <peter@prh.myzen.co.uk> wrote:
>
> On Monday, 8 May 2023 11:20:45 BST Neil Bothwick wrote:
>
> > Maybe you should take this to bgo where it can be flagged for the portage
> > devs to look at, just keep us posted on the outcome.
>
> So far, I've just been asked whether I expected something different, to
> which I replied "Why is --load-average=40 being ignored?"
>
> Perhaps we don't all understand the same things about how this is supposed
> to work.
>
> --
> Regards,
> Peter.
>

OK, this is a bit of a weird thing for me to ask you to try but this page
on emerge:

https://wiki.gentoo.org/wiki/EMERGE_DEFAULT_OPTS

says pretty clearly that "--load-average X.Y" should be a floating point
number so try it with "--load-average 40.0", and further with and without
the --jobs option.

Note 2 things - this page doesn't say to use an "=" AND it was last edited
on my birthday. It wasn't a good year for me. Possibly it wasn't a good year
for this man page... ;-)
Re: Portage load control
Going further, this page states:

"The load average value is the same as displayed by top or uptime, and for
an N-core system, a load average of N.0 would be a 100% load. Another rule
of thumb here is to set X.Y=N*0.9 which will limit the load to 90%, thus
maintaining system responsiveness."

So, how many cores does your system have? For a 16 core system, if you want
40% load, you only want to spawn 16 * 0.4 jobs so you'd set that value to 6.4.
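
The conversion can be sketched in both directions, assuming the wiki's convention quoted above that a load average of N.0 on an N-core system counts as 100% load. The function names are made up for the example:

```python
def load_average_for(cores: int, target_percent: float) -> float:
    """Load-average value that corresponds to target_percent of `cores`."""
    return round(cores * target_percent / 100.0, 2)


def percent_of_capacity(cores: int, load_average: float) -> float:
    """Inverse: the percentage of `cores` that a given load average represents."""
    return round(load_average / cores * 100.0, 1)


if __name__ == "__main__":
    print(load_average_for(16, 40))      # the 16-core, 40% example above
    print(load_average_for(24, 90))      # the N*0.9 rule of thumb on 24 cores
    print(percent_of_capacity(24, 40))   # what --load-average=40 means on 24 cores
```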

On Thu, May 11, 2023 at 6:45 AM Mark Knecht <markknecht@gmail.com> wrote:
>
> On Thu, May 11, 2023 at 6:34 AM Peter Humphrey <peter@prh.myzen.co.uk> wrote:
> >
> > On Monday, 8 May 2023 11:20:45 BST Neil Bothwick wrote:
> >
> > > Maybe you should take this to bgo where it can be flagged for the portage
> > > devs to look at, just keep us posted on the outcome.
> >
> > So far, I've just been asked whether I expected something different, to
> > which I replied "Why is --load-average=40 being ignored?"
> >
> > Perhaps we don't all understand the same things about how this is supposed
> > to work.
> >
> > --
> > Regards,
> > Peter.
> >
>
> OK, this is a bit of a weird thing for me to ask you to try but this page
> on emerge:
>
> https://wiki.gentoo.org/wiki/EMERGE_DEFAULT_OPTS
>
> says pretty clearly that "--load-average X.Y" should be a floating point
> number so try it with "--load-average 40.0", and further with and without
> the --jobs option.
>
> Note 2 things - this page doesn't say to use an "=" AND it was last edited
> on my birthday. It wasn't a good year for me. Possibly it wasn't a good year
> for this man page... ;-)
Re: Portage load control
On Thursday, 11 May 2023 14:45:26 BST Mark Knecht wrote:

> OK, this is a bit of a weird thing for me to ask you to try but this page
> on emerge:
>
> https://wiki.gentoo.org/wiki/EMERGE_DEFAULT_OPTS
>
> says pretty clearly that "--load-average X.Y" should be a floating point
> number so try it with "--load-average 40.0", and further with and without
> the --jobs option.
>
> Note 2 things - this page doesn't say to use an "=" AND it was last edited
> on my birthday. It wasn't a good year for me. Possibly it wasn't a good year
> for this man page... ;-)

Yet more confusion, with man page and wiki article contradicting each other.
:(

I'll try your suggestions - thanks!

--
Regards,
Peter.
Re: Portage load control
On Thursday, 11 May 2023 15:58:20 BST Mark Knecht wrote:
> Going further, this page states:
>
> "The load average value is the same as displayed by top or uptime, and for
> an N-core system, a load average of N.0 would be a 100% load. Another rule
> of thumb here is to set X.Y=N*0.9 which will limit the load to 90%, thus
> maintaining system responsiveness."

That's the first reference I've seen to percentage load. Interesting. Perhaps
changes are afoot already.

> So, how many cores does your system have? For a 16 core system, if you want
> 40% load, you only want to spawn 16 * 0.4 jobs so you'd set that value to
> 6.4

24 cores, but portage is ignoring my load-average anyway, so I'm interested to
see what the bug report elicits.

--
Regards,
Peter.
Re: Portage load control
On Thu, May 11, 2023 at 9:03 AM Peter Humphrey <peter@prh.myzen.co.uk> wrote:
>
> On Thursday, 11 May 2023 15:58:20 BST Mark Knecht wrote:
> > Going further, this page states:
> >
> > "The load average value is the same as displayed by top or uptime, and for
> > an N-core system, a load average of N.0 would be a 100% load. Another rule
> > of thumb here is to set X.Y=N*0.9 which will limit the load to 90%, thus
> > maintaining system responsiveness."
>
> That's the first reference I've seen to percentage load. Interesting. Perhaps
> changes are afoot already.
>
> > So, how many cores does your system have? For a 16 core system, if you want
> > 40% load, you only want to spawn 16 * 0.4 jobs so you'd set that value to
> > 6.4
>
> 24 cores, but portage is ignoring my load-average anyway, so I'm interested to
> see what the bug report elicits.
>
> --
> Regards,
> Peter.

I'm sure you get this but I'm pointing toward the EMERGE_DEFAULT_OPTS
portage variable which, according to its page, "defines entries to be
appended to the emerge command line." I suspect they are appended, but
that doesn't guarantee that they override other entries that you are adding
by hand or have somewhere else. It seems reasonable to me that you
might just use this setting with nothing else and see if you can get it
under control.

Note the blue section on the page:

Note
When MAKEOPTS="-jN" is used with
EMERGE_DEFAULT_OPTS="--jobs K --load-average X.Y" the number
of possible tasks created would be up to N*K. Therefore, both variables
need to be set with each other in mind as they create up to K jobs each
with up to N tasks.

The 'problem' is this can easily hit 100% of the cores you have in the
machine if not sensibly set. (You choose what's 'sensible')

HTH,
Mark
Re: Portage load control
On Thursday, 11 May 2023 17:18:17 BST Mark Knecht wrote:

> I'm sure you get this but I'm pointing toward the EMERGE_DEFAULT_OPTS
> portage variable which, according to its page, "defines entries to be
> appended to the emerge command line." I suspect they are appended, but
> that doesn't guarantee that they override other entries that you are adding
> by hand or have somewhere else. It seems reasonable to me that you
> might just use this setting with nothing else and see if you can get it
> under control.

I think that's worth a shot. And no, I don't have any other entries.

> Note the blue section on the page:
>
> Note
> When MAKEOPTS="-jN" is used with
> EMERGE_DEFAULT_OPTS="--jobs K --load-average X.Y" the number
> of possible tasks created would be up to N*K. Therefore, both variables
> need to be set with each other in mind as they create up to K jobs each
> with up to N tasks.
>
> The 'problem' is this can easily hit 100% of the cores you have in the
> machine if not sensibly set. (You choose what's 'sensible')

Once again, --load-average is being ignored. Why is it there? Surely, it must
be to mitigate the worst effects of that N*K, but it isn't doing so.

--
Regards,
Peter.
Re: Portage load control
On Thu, May 11, 2023 at 3:07 PM Peter Humphrey <peter@prh.myzen.co.uk> wrote:
>
> On Thursday, 11 May 2023 17:18:17 BST Mark Knecht wrote:
<SNIP>
> > The 'problem' is this can easily hit 100% of the cores you have in the
> > machine if not sensibly set. (You choose what's 'sensible')
>
> Once again, --load-average is being ignored. Why is it there? Surely, it must
> be to mitigate the worst effects of that N*K, but it isn't doing so.
>
>

From your description, yeah, it's weird, but possibly it's managing it over
much longer time frames (for instance) or something like that.

Or possibly it just doesn't work.

Or possibly whoever wrote the man page misunderstood.

Poking around a bit this morning I took the path at the bottom of the
link I gave you to the Portage niceness page. It says scheduling policy
control started with portage-3.0.35 which on paper sounds sort of recent.
Possibly a bug crept in, but I was curious as to what you have for
PORTAGE_SCHEDULING_POLICY, if any, and whether you need to enable some
sort of scheduling to get this under control?

https://wiki.gentoo.org/wiki/Portage_niceness

I find this page a bit troubling as it isn't clear to a dummy like
me what happens if nothing is set. If I still ran Gentoo, or if it
was easier to set up a VM, I'd try it myself but alas it ain't
to be.

Anyway, I feel for ya.

Good luck,
Mark
Re: Portage load control
On 5/11/23 18:07, Peter Humphrey wrote:
> On Thursday, 11 May 2023 17:18:17 BST Mark Knecht wrote:
>
>> I'm sure you get this but I'm pointing toward the EMERGE_DEFAULT_OPTS
>> portage variable which, according to its page, "defines entries to be
>> appended to the emerge command line." I suspect they are appended, but
>> that doesn't guarantee that they override other entries that you are adding
>> by hand or have somewhere else. It seems reasonable to me that you
>> might just use this setting with nothing else and see if you can get it
>> under control.
> I think that's worth a shot. And no, I don't have any other entries.
>
>> Note the blue section on the page:
>>
>> Note
>> When MAKEOPTS="-jN" is used with
>> EMERGE_DEFAULT_OPTS="--jobs K --load-average X.Y" the number
>> of possible tasks created would be up to N*K. Therefore, both variables
>> need to be set with each other in mind as they create up to K jobs each
>> with up to N tasks.
>>
>> The 'problem' is this can easily hit 100% of the cores you have in the
>> machine if not sensibly set. (You choose what's 'sensible')
> Once again, --load-average is being ignored. Why is it there? Surely, it must
> be to mitigate the worst effects of that N*K, but it isn't doing so.

Sorry if I'm repeating myself, but as I see it, there are two different
--load-average settings to consider.  I'd have to go back to the
beginning of the thread to confirm you are setting both of them.

The --load-average to emerge itself just tells it not to start a new job
if the load is above the setting.  If there are several large jobs, but
all start with single  threaded configuration activity such as
./configure or cmake, multiple jobs can clearly get started before the
load average starts climbing.
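
The effect described in the paragraph above can be seen in a toy model. This is a sketch only, not portage's scheduler: it uses an instantaneous task count instead of the kernel's time-averaged load, builds never finish, and all names and numbers are invented for illustration.

```python
def peak_load(load_limit: int, make_jobs: int, configure_ticks: int,
              total_builds: int, ticks: int) -> int:
    """Toy model of a load gate that is only checked when a build starts.

    Each build adds 1 to the load while younger than configure_ticks
    (single-threaded configure phase) and make_jobs afterwards. At most one
    new build starts per tick, and only if the load seen at that moment is
    below load_limit.
    """
    ages = []   # ticks elapsed since each build was started
    peak = 0
    for _ in range(ticks):
        load = sum(1 if age < configure_ticks else make_jobs for age in ages)
        peak = max(peak, load)
        if len(ages) < total_builds and load < load_limit:
            ages.append(0)              # gate passed: start another build
        ages = [age + 1 for age in ages]
    return peak


if __name__ == "__main__":
    # Limit of 4, each build later fans out to 4 make tasks
    print(peak_load(load_limit=4, make_jobs=4, configure_ticks=5,
                    total_builds=10, ticks=12))
```

With a limit of 4, four builds slip through while they are still in their single-threaded phase, and the load later climbs to 16 without the gate ever being consulted again.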

The --load-average in MAKEOPTS gets passed to make, and controls how
many processes make starts.  If that is set, and the load is still too
high, the problem is in make not in emerge.  Also, that setting will
have no effect if the package uses ninja or something else instead of
make.  Ninja does have a -l setting for load average, but I don't know
if emerge passes any MAKEOPTS to ninja. That might be an interesting
enhancement request.
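
One way to check which of make's two limits a given MAKEOPTS value actually sets is to parse it. A hedged sketch: -j/--jobs and -l/--load-average are GNU make's own options, but the helper name is invented here, and it assumes both options are always written with a value (a bare -j or -l is not handled).

```python
import argparse
import shlex


def make_limits(makeopts: str):
    """Return (jobs, load_average) found in a MAKEOPTS-style string.

    A value of None means the option was not given at all.
    """
    parser = argparse.ArgumentParser(add_help=False)
    parser.add_argument("-j", "--jobs", type=int)
    parser.add_argument("-l", "--load-average", type=float)
    # Unknown options are ignored rather than treated as errors
    known, _unknown = parser.parse_known_args(shlex.split(makeopts))
    return known.jobs, known.load_average


if __name__ == "__main__":
    print(make_limits("-j12 -l12"))
    print(make_limits("-j 10"))
```

If the second element comes back as None, make has no load limit of its own and only emerge's --load-average is in play.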

Jack
Re: Portage load control
On Thu, May 11, 2023 at 11:07:04PM +0100, Peter Humphrey wrote:
> Once again, --load-average is being ignored. Why is it there? Surely, it must
> be to mitigate the worst effects of that N*K, but it isn't doing so.
>

Take all of the following with a grain of salt and YMMV. Any gentoo
pros, please correct my ideas here:

I have also been experimenting along the lines of making emerge nicer
via a few quick strategies, and while it doesn't address your issue
directly, I'll tell you some of the things that have made it more
enjoyable to use a machine while it is building packages:

1) Niceness: I set the following in my make.conf, since I value
responsiveness of the machine over the speed of the build:
```
PORTAGE_IONICE_COMMAND="ionice -c 3 -p \${PID}"
PORTAGE_NICENESS=19
```

2) Load average: I will trade some build time for responsiveness, so I
go ahead and use `taskset -c 2-100 emerge ...` on linux to prevent emerge
from using the first two cores/threads. I don't have a machine with a hundred
cores, but if I did, it would make a little heat....

3) I set --load-average on the command line and in the make opts. No need
to run too wild. I make sure this is less than the number of cores I
allocated with taskset, or I think I may not hit the designated load
average for limiting.

4) ONLY if I have the RAM, mount /var/tmp or /var/tmp/portage as a
tmpfs. If I am merging very large projects (firefox and llvm for
example) concurrently, I may need more than several tens of GB of RAM
for this. If I don't usually run the machine with substantial swap space
enabled, this might be a good time.

I think with those strategies, it is ok to just run emerge with `-j`
with no arguments. The cores prohibited from participating in emerge
will be available for interactive tasks, and the load average will limit
to some degree the number of processes.
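
Point 3 depends on knowing how many CPUs the taskset call actually left available. A small sketch, assuming Linux (os.sched_getaffinity is not available on every platform) and a made-up helper name. The margin of one is an arbitrary choice for the example:

```python
import os


def load_limit_below(allowed_cpus: int, margin: int = 1) -> int:
    """Pick a --load-average value strictly below the number of usable CPUs."""
    return max(1, allowed_cpus - margin)


if __name__ == "__main__" and hasattr(os, "sched_getaffinity"):
    # CPUs this process may run on; reflects a surrounding taskset call
    allowed = len(os.sched_getaffinity(0))
    print(f"--load-average={load_limit_below(allowed)}")
```

Run under the same taskset restriction as emerge, this prints a value based on the restricted CPU set rather than the whole machine.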

If you give this a try, let me know what you think!

Eldon
Re: Portage load control
On 5/11/23 23:23, Eldon wrote:
> On Thu, May 11, 2023 at 11:07:04PM +0100, Peter Humphrey wrote:
>> Once again, --load-average is being ignored. Why is it there? Surely, it must
>> be to mitigate the worst effects of that N*K, but it isn't doing so.
>>
> Take all of the following with a grain of salt and YMMV. Any gentoo
> pros, please correct my ideas here:
>
> I have also been experimenting along the lines of making emerge nicer
> via a few quick strategies, and while it doesn't address your issue
> directly, I'll tell you some of the things that have made it more
> enjoyable to use a machine while it is building packages:
>
> 1) Niceness: I set the following in my make.conf, since I value
> responsiveness of the machine over the speed of the build:
> ```
> PORTAGE_IONICE_COMMAND="ionice -c 3 -p \${PID}"
> PORTAGE_NICENESS=19
> ```
>
> 2) Load average: I will trade some build time for responsiveness, so I
> go ahead and use `taskset -c 2-100 emerge ...` on linux to prevent emerge
> from using the first two cores/threads. I don't have a machine with a hundred
> cores, but if I did, it would make a little heat....
>
> 3) I set --load-average on the command line and in the make opts. No need
> to run too wild. I make sure this is less than the number of cores I
> allocated with taskset, or I think I may not hit the designated load
> average for limiting.
>
> 4) ONLY if I have the RAM, mount /var/tmp or /var/tmp/portage as a
> tmpfs. If I am merging very large projects (firefox and llvm for
> example) concurrently, I may need more than several tens of GB of RAM
> for this. If I don't usually run the machine with substantial swap space
> enabled, this might be a good time.
>
> I think with those strategies, it is ok to just run emerge with `-j`
> with no arguments. The cores prohibited from participating in emerge
> will be available for interactive tasks, and the load average will limit
> to some degree the number of processes.
>
> If you give this a try, let me know what you think!
>
> Eldon
>
Don't set PORTAGE_IONICE_COMMAND or PORTAGE_NICENESS. I just set
PORTAGE_SCHEDULING_POLICY="idle". My MAKEOPTS are "-j12 -l12" and I have
"--jobs=12" in EMERGE_DEFAULT_OPTS (6c12t CPU). For anything that uses rust
or webkit you might want to lower the MAKEOPTS jobs via env overrides (cargo
seems not to support a load limit, so it runs into problems, but setting the
scheduling policy should help).
Re: Portage load control
On Friday, 12 May 2023 01:38:52 BST Jack wrote:

> Sorry if I'm repeating myself, but as I see it, there are two different
> --load-average settings to consider. I'd have to go back to the
> beginning of the thread to confirm you are setting both of them.

I am also going to repeat myself.

> The --load-average to emerge itself just tells it not to start a new job
> if the load is above the setting. If there are several large jobs, but
> all start with single threaded configuration activity such as
> ./configure or cmake, multiple jobs can clearly get started before the
> load average starts climbing.

I have said several times that portage is ignoring that setting. I have it at
40, yet portage kicks off more packages at 72, and continues doing so for
extended periods - at least 15 minutes.

> The --load-average in MAKEOPTS gets passed to make, and controls how
> many processes make starts. If that is set, and the load is still too
> high, the problem is in make not in emerge. Also, that setting will
> have no effect if the package uses ninja or something else instead of
> make. Ninja does have a -l setting for load average, but I don't know
> if emerge passes any MAKEOPTS to ninja. That might be an interesting
> enhancement request.

I didn't know about that setting, and I can't see it in the man pages, so you
can be sure it isn't set here. You don't mean --jobs, do you?

--
Regards,
Peter.
Re: Portage load control
On Friday, 12 May 2023 09:34:27 BST I wrote:

> > The --load-average in MAKEOPTS gets passed to make, and controls how
> > many processes make starts. If that is set, and the load is still too
> > high, the problem is in make not in emerge. Also, that setting will
> > have no effect if the package uses ninja or something else instead of
> > make. Ninja does have a -l setting for load average, but I don't know
> > if emerge passes any MAKEOPTS to ninja. That might be an interesting
> > enhancement request.
>
> I didn't know about that setting, and I can't see it in the man pages, so
> you can be sure it isn't set here. You don't mean --jobs, do you?

Please ignore that. I must go for some coffee.

--
Regards,
Peter.
Re: Portage load control [ In reply to ]
On Fri, 12 May 2023 at 10:34, Peter Humphrey <peter@prh.myzen.co.uk> wrote:
> On Friday, 12 May 2023 01:38:52 BST Jack wrote:
> > The --load-average to emerge itself just tells it not to start a new job
> > if the load is above the setting. If there are several large jobs, but
> > all start with single threaded configuration activity such as
> > ./configure or cmake, multiple jobs can clearly get started before the
> > load average starts climbing.
>
> I have said several times that portage is ignoring that setting. I have it at
> 40, yet portage kicks off more packages at 72, and continues doing so for
> extended periods - at least 15 minutes.

But are you sure that it is actually ignored? It was said in an
earlier message from Mark that the value was related to number of
cores, where your 24 cores at 100% average load would translate to a
value of --load-average 24.0. That would put your value of 40 at 166%
average load? What load are you actually trying to limit it to? If you
want 40% load, that should apparently be --load-average 9.6.
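
A small sketch of that arithmetic, assuming (as above) that a load equal to the number of threads counts as 100%. The function names are made up for the example.

```shell
#!/bin/sh
# load_for_percent CORES PERCENT: the --load-average value that corresponds
# to PERCENT of CORES, assuming a load of CORES means 100%.
load_for_percent() {
    awk -v c="$1" -v p="$2" 'BEGIN { printf "%.1f\n", c * p / 100 }'
}

# percent_for_load CORES LOAD: the reverse conversion.
percent_for_load() {
    awk -v c="$1" -v l="$2" 'BEGIN { printf "%.1f\n", l / c * 100 }'
}

load_for_percent 24 40    # 40% of 24 threads  -> 9.6
percent_for_load 24 40    # load 40, 24 threads -> 166.7
```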

Regards,
Arve
Re: Portage load control [ In reply to ]
On Friday, 12 May 2023 11:09:37 BST Arve Barsnes wrote:
> On Fri, 12 May 2023 at 10:34, Peter Humphrey <peter@prh.myzen.co.uk> wrote:

> > I have said several times that portage is ignoring that setting. I have it
> > at 40, yet portage kicks off more packages at 72, and continues doing so
> > for extended periods - at least 15 minutes.
>
> But are you sure that it is actually ignored? It was said in an
> earlier message from Mark that the value was related to number of
> cores, where your 24 cores at 100% average load would translate to a
> value of --load-average 24.0. That would put your value of 40 at 166%
> average load? What load are you actually trying to limit it to? If you
> want 40% load, that should apparently be --load-average 9.6.

I'm reading man make.conf, which makes quite clear that --load-average limits
the number of portage packages to be emerged, so as to avoid excess load.
Simple.

Either --load-average is designed to do as the man page says, but it doesn't
work and should be fixed, or it should be removed, being useless and
misleading. We can't have an option that limits load, but can be ignored at
portage's whim.

I haven't had a reply to my question in the bug report yesterday: "Why is
--load-average=40 being ignored?" but it is perhaps early days yet.

A possibility has just occurred to me: it seems to me that the use of a
floating-point number for load average is a recent development, at least I've
never seen it before and I've always used a plain integer. Could portage be
skipping over it when it doesn't find a decimal point? That would be easier to
fix than a wholesale failure of function.

--
Regards,
Peter.
Re: Portage load control [ In reply to ]
On 5/12/23 09:16, Peter Humphrey wrote:
> On Friday, 12 May 2023 11:09:37 BST Arve Barsnes wrote:
>> On Fri, 12 May 2023 at 10:34, Peter Humphrey <peter@prh.myzen.co.uk> wrote:
>
>>> I have said several times that portage is ignoring that setting. I have it
>>> at 40, yet portage kicks off more packages at 72, and continues doing so
>>> for extended periods - at least 15 minutes.
>> But are you sure that it is actually ignored? It was said in an
>> earlier message from Mark that the value was related to number of
>> cores, where your 24 cores at 100% average load would translate to a
>> value of --load-average 24.0. That would put your value of 40 at 166%
>> average load? What load are you actually trying to limit it to? If you
>> want 40% load, that should apparently be --load-average 9.6.
> I'm reading man make.conf, which makes quite clear that --load-average limits
> the number of portage packages to be emerged, so as to avoid excess load.
> Simple.
>
> Either --load-average is designed to do as the man page says, but it doesn't
> work and should be fixed, or it should be removed, being useless and
> misleading. We can't have an option that limits load, but can be ignored at
> portage's whim.
>
> I haven't had a reply to my question in the bug report yesterday: "Why is
> --load-average=40 being ignored?" but it is perhaps early days yet.
>
> A possibility has just occurred to me: it seems to me that the use of a
> floating-point number for load average is a recent development, at least I've
> never seen it before and I've always used a plain integer. Could portage be
> skipping over it when it doesn't find a decimal point? That would be easier to
> fix than a wholesale failure of function.

I still see two separate issues. First, you are saying that emerge
still launches new jobs when the load is over what is set with
--load-average. A possible way to test this directly is to run or
create some job that pushes the load average over some number, say
5. (It doesn't have to be high, just predictable, although a higher
load would make a more obvious result.) Then start an emerge of two or
more packages with --load-average=3. It should start the first job, and
should then not start another until the load is below 3 or the first job
has finished. You can try with both 3 and 3.0. If the second job does
get started, this is an easy-to-run, concrete test you can post to the bug.
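
One way to produce that predictable load is a handful of busy loops. This is a rough sketch only: the helper names and the package atoms in the usage comment are placeholders, and the emerge line is left as a comment so nothing gets built by accident.

```shell
#!/bin/sh
# PIDs of the background busy loops started so far.
BUSY_PIDS=""

# start_busy_loops N: start N CPU-bound loops in the background.
start_busy_loops() {
    i=0
    while [ "$i" -lt "$1" ]; do
        ( while :; do :; done ) &
        BUSY_PIDS="$BUSY_PIDS $!"
        i=$((i + 1))
    done
}

# stop_busy_loops: kill every loop started above.
stop_busy_loops() {
    for pid in $BUSY_PIDS; do
        kill "$pid" 2>/dev/null || true
        wait "$pid" 2>/dev/null || true
    done
    BUSY_PIDS=""
}

# Intended use (as root; the atoms are placeholders):
#   start_busy_loops 5
#   sleep 120    # let the 1-minute load average climb past 3
#   emerge --oneshot --jobs=4 --load-average=3 cat/pkg-a cat/pkg-b cat/pkg-c
#   stop_busy_loops
```

If emerge reports more than one job running while its own load display stays above 3, that is the concrete evidence to attach to the bug.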

The second issue is whether MAKEOPTS --load-average is actually getting
passed to each job and whether make is then observing that limit.
Whether this is the case or not is independent of the first issue. I
suppose this could be tested without even involving emerge. Given you
observed an actual load of 72 (do I remember correctly?) with both
--load-averages set significantly below this, you could test, as long as
you have a single compile which is busy enough.

Another possible test would be (for example) to set emerge's
--load-average to 2 or 2.0 and MAKEOPTS' --load-average to 10.
Re: Portage load control [ In reply to ]
On Friday, 12 May 2023 00:08:03 BST Mark Knecht wrote:
> On Thu, May 11, 2023 at 3:07 PM Peter Humphrey <peter@prh.myzen.co.uk>
> wrote:
> > On Thursday, 11 May 2023 17:18:17 BST Mark Knecht wrote:
> <SNIP>
>
> > > The 'problem' is this can easily hit 100% of the cores you have in the
> > > machine if not sensibly set. (You choose what's 'sensible')
> >
> > Once again, --load-average is being ignored. Why is it there? Surely, it
> > must be to mitigate the worst effects of that N*K, but it isn't doing so.
>
> From your description, yeah, it's weird, but possibly it's managing it over
> (for instance) much longer time frames or something like that.
>
> Or possibly it just doesn't work.

That's it, I'm sure.

> Or possibly whoever wrote the man page misunderstood.

Load-average has been around for a long time.

> Poking around a bit this morning I took the path at the bottom of the
> link I gave you to the Portage niceness page. It says scheduling policy
> control started with portage-3.0.35 which on paper sounds sort of recent.
> Possibly a bug crept in, but I was curious as to what you have for
> PORTAGE_SCHEDULING_POLICY, if any, and whether you need to enable some
> sort of scheduling to get this under control?
>
> https://wiki.gentoo.org/wiki/Portage_niceness

I have no PORTAGE_SCHEDULING_POLICY, or not that I can find. It seems to me
that such a policy is to do with the running of portage in the OS, rather than
how it launches its own emerge jobs. Is that right?

> Anyway, I feel for ya.

:)

--
Regards,
Peter.
Re: Portage load control [ In reply to ]
On 5/12/23 09:46, Peter Humphrey wrote:
> On Friday, 12 May 2023 00:08:03 BST Mark Knecht wrote:
>> On Thu, May 11, 2023 at 3:07 PM Peter Humphrey <peter@prh.myzen.co.uk>
>> wrote:
>>> On Thursday, 11 May 2023 17:18:17 BST Mark Knecht wrote:
>> <SNIP>
>>
>>>> The 'problem' is this can easily hit 100% of the cores you have in the
>>>> machine if not sensibly set. (You choose what's 'sensible')
>>> Once again, --load-average is being ignored. Why is it there? Surely, it
>>> must be to mitigate the worst effects of that N*K, but it isn't doing so.
>> From your description, yeah, it's weird, but possibly it's managing it over
>> (for instance) much longer time frames or something like that.
>>
>> Or possibly it just doesn't work.
> That's it, I'm sure.
>
>> Or possibly whoever wrote the man page misunderstood.
> Load-average has been around for a long time.
>
>> Poking around a bit this morning I took the path at the bottom of the
>> link I gave you to the Portage niceness page. It says scheduling policy
>> control started with portage-3.0.35 which on paper sounds sort of recent.
>> Possibly a bug crept in, but I was curious as to what you have for
>> PORTAGE_SCHEDULING_POLICY, if any, and whether you need to enable some
>> sort of scheduling to get this under control?
>>
>> https://wiki.gentoo.org/wiki/Portage_niceness
> I have no PORTAGE_SCHEDULING_POLICY, or not that I can find. It seems to me
> that such a policy is to do with the running of portage in the OS, rather than
> how it launches its own emerge jobs. Is that right?
>
>> Anyway, I feel for ya.
> :)
>
You can read /usr/share/portage/config/make.conf.example for an
explanation. All child processes will use that policy. I can run portage
and play games on the same system with my settings.
Re: Portage load control [ In reply to ]
On Friday, 12 May 2023 14:37:13 BST Jack wrote:

> I still see two separate issues. First, you are saying that emerge
> still launches new jobs when the load is over what is set with
> --load-average. A possible way to test this directly is to run or
> create some job that pushed the load average to over some number, say
> 5.

I have tested it, directly, with emerge. I reported what happened at the start
of this thread. To recap:

I was running 'emerge -e @world'; no extras, no ifs, no buts. Make.conf had
EMERGE_DEFAULT_OPTS="--jobs --load-average=40 ..."; MAKEOPTS was not
specified.

As the six-hour job proceeded and portage was working with larger packages in
the plasma group, the load average rose to "72 75 75", clearly much higher
than the 40 I'd specified and continuing over at least 15 minutes. Yet portage
was still starting more emerge jobs to keep the load that high.

--->8

> The second issue is whether MAKEOPTS --load-average is actually getting
> passed to each job and whether make is then observing that limit.
> Whether this is the case or not is independent of the first issue. I
> suppose this could be tested without even involving emerge. Given you
> observed an actual load of 72 (do I remember correctly?) with both
> --load-averages set significantly below this, you could test, as long as
> you have a single compile which is busy enough.

I had no MAKEOPTS, so I assume -j took the default value of num_cores=24. I
don't know what the -l default is.

Here's the rub, though: whatever values were taken by -j, -l or --jobs, the
value of --load-average should not have been exceeded so grossly and
persistently.

I'd like to thank everyone who's offered ideas and suggestions, but I'm just
going to have to wait for the outcome of the bug I reported.

--
Regards,
Peter.
Re: Portage load control [ In reply to ]
On Fri, May 12, 2023 at 6:46 AM Peter Humphrey <peter@prh.myzen.co.uk> wrote:
>
> On Friday, 12 May 2023 00:08:03 BST Mark Knecht wrote:
> > On Thu, May 11, 2023 at 3:07 PM Peter Humphrey <peter@prh.myzen.co.uk>
> > wrote:
> > > On Thursday, 11 May 2023 17:18:17 BST Mark Knecht wrote:
> > <SNIP>
> >
> > > > The 'problem' is this can easily hit 100% of the cores you have in
> > > > the machine if not sensibly set. (You choose what's 'sensible')
> > >
> > > Once again, --load-average is being ignored. Why is it there? Surely,
> > > it must be to mitigate the worst effects of that N*K, but it isn't
> > > doing so.
> >
> > From your description, yeah, it's weird, but possibly it's managing it
> > over (for instance) much longer time frames or something like that.
> >
> > Or possibly it just doesn't work.
>
> That's it, I'm sure.
>
> > Or possibly whoever wrote the man page misunderstood.
>
> Load-average has been around for a long time.
>
> > Poking around a bit this morning I took the path at the bottom of the
> > link I gave you to the Portage niceness page. It says scheduling policy
> > control started with portage-3.0.35 which on paper sounds sort of
> > recent.
> > Possibly a bug crept in, but I was curious as to what you have for
> > PORTAGE_SCHEDULING_POLICY, if any, and whether you need to enable some
> > sort of scheduling to get this under control?
> >
> > https://wiki.gentoo.org/wiki/Portage_niceness
>
> I have no PORTAGE_SCHEDULING_POLICY, or not that I can find. It seems to
> me that such a policy is to do with the running of portage in the OS,
> rather than how it launches its own emerge jobs. Is that right?
>
> > Anyway, I feel for ya.

Peter,
My point about PORTAGE_SCHEDULING_POLICY is that it *might* have
an effect on what you're seeing, not that it *would* have an effect.

If there's one thing I most distrust about the Open Source world, it's
documentation...

WRT the floating point issue, the Gentoo Catalyst page

https://wiki.gentoo.org/wiki/Catalyst

in the "Jobs and load average" section states:

<QUOTE>
FILE /etc/catalyst/catalyst.conf

# Integral value passed to emerge as the parameter to --jobs and is used to
# define MAKEOPTS during the target build.
jobs = 4

# Floating-point value passed to emerge as the parameter to --load-average
# and is used to define MAKEOPTS during the target build.
# load-average = 4.0

</QUOTE>

so once again, the use of floating point is documented as (you choose)
either important or required.

My opinion: load-average probably works, but we are misunderstanding
the documentation. Note that the example using 4.0 is a pretty low number.

Good luck,
Mark
Re: Portage load control [ In reply to ]
On Friday, 12 May 2023 15:06:21 BST Michael Cook wrote:

> You can read /usr/share/portage/config/make.conf.example for an
> explanation. All children processes will use that. I can run portage and
> play games on the same system with my settings.

That example says nothing about any of the emerge default options. I have no
problem with system responsiveness (except, recently, running BOINC).

--
Regards,
Peter.
Re: Portage load control [ In reply to ]
On Friday, 12 May 2023 15:13:08 BST Mark Knecht wrote:

> My opinion: load-average probably works, but we are misunderstanding
> the documentation.

That's what bothers me the most - that I have a mental block somewhere. :(

--
Regards,
Peter.
Re: Portage load control [ In reply to ]
On Fri, May 12, 2023 at 7:27 AM Peter Humphrey <peter@prh.myzen.co.uk> wrote:
>
> On Friday, 12 May 2023 15:13:08 BST Mark Knecht wrote:
>
> > My opinion: load-average probably works, but we are misunderstanding
> > the documentation.
>
> That's what bothers me the most - that I have a mental block somewhere. :(
>
> --
> Regards,
> Peter.

Just for clarity, how are you measuring 'load average'? Just looking at what
is reported in top or something else that takes stats?

So if it's either a documentation issue, or an issue with understanding
the documentation, possibly set up a 'design of experiments' set of
tests? For instance:

1) Pick 1 semi-large package that spawns a few extra jobs to get built
2) Remove the binaries from your system
3) Ensure all the source is prefetched

4) Build the package with no options measuring load-average

Repeat 2 - 4 using a few different options:

-j 1
-j1 --load-average=40
-j1 --load-average=40.0
-j1 --load-average=4.0
-j1 --load-average=0.4
-j10 --load-average=0.4

etc., and see what happens?
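
As a sketch, the run list could be generated rather than typed by hand. This reads -j above as emerge's own --jobs option (an assumption), uses a placeholder package atom, and only prints the commands so nothing is built until you choose to run them.

```shell
#!/bin/sh
# Placeholder atom; substitute a real semi-large package.
PKG="cat/some-package"

# print_test_matrix: echo one emerge command per option set, without running it.
print_test_matrix() {
    for opts in \
        "--jobs=1" \
        "--jobs=1 --load-average=40" \
        "--jobs=1 --load-average=40.0" \
        "--jobs=1 --load-average=4.0" \
        "--jobs=1 --load-average=0.4" \
        "--jobs=10 --load-average=0.4"
    do
        echo "time emerge --oneshot $opts $PKG"
    done
}

print_test_matrix
```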
Re: Portage load control [ In reply to ]
On 2023.05.12 11:27, Mark Knecht wrote:
> On Fri, May 12, 2023 at 7:27 AM Peter Humphrey <peter@prh.myzen.co.uk>
> wrote:
> >
> > On Friday, 12 May 2023 15:13:08 BST Mark Knecht wrote:
> >
> > > My opinion: load-average probably works, but we are
> > > misunderstanding the documentation.
> >
> > That's what bothers me the most - that I have a mental block
> > somewhere. :(
> >
> > --
> > Regards,
> > Peter.
>
> Just for clarity, how are you measuring 'load average'? Just looking
> at what is reported in top or something else that takes stats?
>
> So if it's either a documentation issue, or an understanding the
> documentation issue, possibly set up a 'design of experiments' set of
> tests? For instance:
>
> 1) Pick 1 semi-large package that spawns a few extra jobs to get built
> 2) Remove the binaries from your system
> 3) Ensure all the source is prefetched
>
> 4) Build the package with no options measuring load-average
>
> Repeat 2 - 4 using a few different options:
>
> -j 1
> -j1 --load-average=40
> -j1 --load-average=40.0
> -j1 --load-average=4.0
> -j1 --load-average=0.4
> -j10 --load-average=0.4
>
> etc., and see what happens?
--load-average controls whether or not emerge starts another
job/package, so testing by emerging a single package will not actually
test this. That's why I suggested running some application to get the
load up to 10 (arbitrary number) and then emerging a larger number of
small packages. If --load-average is set to anything less than the
actual load, it should only launch one package at a time. Having that
simple example to add to the bug would give the developers an easy way
to test.

I think the fact that Peter's actual load went over 70 is because each
individual job/package had no limit on the number of parallel compiles
make could kick off. There is likely no bug there. The real problem
(as Peter keeps pointing out) is that with the load that high, emerge
still starts additional jobs.
Re: Portage load control [ In reply to ]
On Fri, May 12, 2023 at 9:08 AM Jack <ostroffjh@users.sourceforge.net> wrote:
>
> > -j 1
> > -j1 --load-average=40
> > -j1 --load-average=40.0
> > -j1 --load-average=4.0
> > -j1 --load-average=0.4
> > -j10 --load-average=0.4
> >
> > etc., and see what happens?
> --load-average controls whether or not emerge starts another
> job/package, so testing by emerging a single package will not actually
> test this. That's why I suggested running some application to get the
> load up to 10 (arbitrary number) and then emerging a larger number of
> small packages. If --load-average is set to anything less than the
> actual load, it should only launch one package at a time. Having that
> simple example to add to the bug would give the developers an easy way
> to test.
>
> I think the fact that Peter's actual load went over 70 is because each
> individual job/package had no limit on the number of parallel compiles
> make could kick off. There is likely no bug there. The real problem
> (as Peter keeps pointing out) is that with the load that high, emerge
> still starts additional jobs.

Jack,
I totally agree, as long as nothing is broken, but yeah, the list I
provided was more meant to engender ideas for Peter.

One interesting point is that the first Gentoo page I found to
look at the emerge man page shows LOAD as the value provided
to the --load-average option, but nowhere does it specify anything
other than it's a floating point value:

https://dev.gentoo.org/~zmedico/portage/doc/man/emerge.1.html

For clarification reading other sites, my understanding is that a
load average value of 1 in the top application is meant to
represent 1 CPU core operating at 100%. Assuming that's
true, then on Peter's 24 core machine, with LOAD=40, he's
telling emerge it's ok to use more cores than his machine has.

Is that consistent with your (or others') understanding?

I think the mistake is one of those easy to make ones where
the human thinks 40% (hence 40) and the machine thinks
40% (hence 0.4)

Cheers,
Mark
Re: Portage load control [ In reply to ]
On 2023.05.12 12:23, Mark Knecht wrote:
[snip .....]
> One interesting point is that the first Gentoo page I found to
> look at the emerge man page shows LOAD as the value provided
> to the --load-average option, but nowhere does it specify anything
> other than it's a floating point value:
I suspect the specification of floating point implies that it CAN take
digits after the decimal point, but not that they are required,
although that should be easy enough to test.
>
> https://dev.gentoo.org/~zmedico/portage/doc/man/emerge.1.html
>
> For clarification reading other sites, my understanding is that a
> load average value of 1 in the top application is meant to
> represent 1 CPU core operating at 100%. Assuming that's
> true, then on Peter's 24 core machine, with LOAD=40, he's
> telling emerge it's ok to use more cores than his machine has.
>
> Is that consistent with your (or others) understanding?
Close, but not quite. (See
https://en.wikipedia.org/wiki/Load_(computing) for more details.) I
think your understanding will match any observations, but I see the
definition as different. I understand the load (instantaneous, not
average) is the number of processes in the "r" state, i.e., running or
waiting for a CPU slice. That excludes any process explicitly sleeping
or waiting for IO. Since it can change so quickly, the point load is
not very useful, so it is more commonly presented as a value averaged
over a period of time. Top shows 1, 5, and 15 minute averages.
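
For anyone who wants the raw numbers rather than top's header, here is a small sketch that labels the fields of /proc/loadavg on Linux. The function name is made up for the example. Note that on Linux the averaged figures also count tasks in uninterruptible sleep (typically disk wait), so the numbers can sit above the count of purely runnable processes.

```shell
#!/bin/sh
# parse_loadavg LINE: label the fields of a /proc/loadavg-style line.
# The fourth field is "currently runnable / total" scheduling entities.
parse_loadavg() {
    set -- $1
    echo "1min=$1 5min=$2 15min=$3 runnable/total=$4"
}

# On Linux, report the current values if the file is readable.
if [ -r /proc/loadavg ]; then
    parse_loadavg "$(cat /proc/loadavg)"
fi
```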

Again, --load-average tells emerge whether it can start a new
job/package, but has no control over how high the load will get based
on the already started jobs. If emerge starts new jobs when the load
is over that specified by --load-average, that does smell like a bug in
emerge.
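
The rule as described can be sketched in a few lines. This is an illustration of the expected behaviour only, not portage's actual code; the function name is hypothetical, and whether the real comparison is "below" or "at or below" the limit is an assumption here.

```shell
#!/bin/sh
# job_decision CURRENT_LOAD LIMIT: print "start" while the load is below
# the limit, otherwise "hold". awk does the floating-point comparison.
job_decision() {
    if awk -v cur="$1" -v max="$2" 'BEGIN { exit !(cur + 0 < max + 0) }'; then
        echo "start"
    else
        echo "hold"
    fi
}

job_decision 72 40      # the load Peter observed against his limit
job_decision 2.5 3      # limit given as an integer
job_decision 2.5 3.0    # the same limit given as a float
```

Under this rule a limit of 3 and a limit of 3.0 behave identically, and a load of 72 against a limit of 40 should always hold back new jobs.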

>
> I think the mistake is one of those easy to make ones where
> the human thinks 40% (hence 40) and the machine thinks
> 40% (hence 0.4)
>
> Cheers,
> Mark
>
Re: Portage load control [ In reply to ]
On Friday, 12 May 2023 17:58:46 BST Jack wrote:

> Again, --load-average tells emerge whether it can start a new
> job/package, but has no control over how high the load will get based
> on the already started jobs. If emerge starts new jobs when the load
> is over that specified by --load-average, that does smell like a bug in
> emerge.

Hooray!

:)

--
Regards,
Peter.
Re: Portage load control [ In reply to ]
On Fri, May 12, 2023 at 9:59 AM Jack <ostroffjh@users.sourceforge.net> wrote:
>
> On 2023.05.12 12:23, Mark Knecht wrote:
> [snip .....]
> > One interesting point is that the first Gentoo page I found to
> > look at the emerge man page shows LOAD as the value provided
> > to the --load-average option, but nowhere does it specify anything
> > other than it's a floating point value:
> I suspect the specification of floating point implies that it CAN take
> digits after the decimal point, but not that they are required,
> although that should be easy enough to test.
> >
> > https://dev.gentoo.org/~zmedico/portage/doc/man/emerge.1.html
> >
> > For clarification reading other sites, my understanding is that a
> > load average value of 1 in the top application is meant to
> > represent 1 CPU core operating at 100%. Assuming that's
> > true, then on Peter's 24 core machine, with LOAD=40, he's
> > telling emerge it's ok to use more cores than his machine has.
> >
> > Is that consistent with your (or others) understanding?
> Close, but not quite. (See
> https://en.wikipedia.org/wiki/Load_(computing) for more details.) I
> think your understanding will match any observations, but I see the
> definition as different. I understand the load (instantaneous, not
> average) is the number of processes in the "r" state, i.e., running or
> waiting for a CPU slice. That excludes any process explicitly sleeping
> or waiting for IO. Since it can change so quickly, the point load is
> not very useful, so it is more commonly presented as a value averaged
> over a period of time. Top shows 1, 5, and 15 minute averages.
>
> Again, --load-average tells emerge whether it can start a new
> job/package, but has no control over how high the load will get based
> on the already started jobs. If emerge starts new jobs when the load
> is over that specified by --load-average, that does smell like a bug in
> emerge.
>
> >
> > I think the mistake is one of those easy to make ones where
> > the human thinks 40% (hence 40) and the machine thinks
> > 40% (hence 0.4)
> >
> > Cheers,
> > Mark
> >

OK, I find that all reasonable. One point about the Wikipedia
description for anyone following who may not actually read it
is that the average is accomplished with an exponential moving
average and therefore is not, by definition, linear over time.
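
To put rough numbers on that, here is a sketch of the 1-minute average under a simplified model: the kernel samples about every 5 seconds and decays the old value by exp(-5/60). The function name and the start-from-idle assumption are only for the example.

```shell
#!/bin/sh
# ema_load N SECONDS: 1-minute load average after SECONDS with N tasks
# constantly runnable, starting from an idle system (load 0).
# Model: every 5 s, load = load * e + N * (1 - e), with e = exp(-5/60).
ema_load() {
    awk -v n="$1" -v s="$2" 'BEGIN {
        e = exp(-5 / 60); load = 0
        for (t = 0; t < s; t += 5) load = load * e + n * (1 - e)
        printf "%.2f\n", load
    }'
}

ema_load 1 60     # one busy task, after one minute   -> 0.63
ema_load 1 300    # one busy task, after five minutes -> 0.99
```

So one fully busy core reads about 0.63 after a minute and only creeps towards 1.0 after that, and a reading taken shortly after a burst of new jobs understates the load those jobs will eventually produce.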

As a little experiment that anyone can run I'll include a little
AI generated batch file people can use to actually
see more of what's going on in top, htop and btop. Note
on my system the CPU affinity didn't work and I don't care
to debug it. However this loops continuously until you hit ctrl-C.

If you watch CPU load you'll
see it climb quickly at first and then more slowly until
you get up to 1.0. It will go a little higher (1.03 in my case)
which is likely the CPU load from the programs monitoring
the system and other background junk.

Nonetheless, 1 core running continuously generates
a load of 1 after some period of time.

As with all code on the Internet I take no responsibility
for any damage caused by this code and neither
does Google's Bard.

#!/bin/bash

# This script busy-loops on one core until you hit Ctrl-C.

# Try to pin this shell to CPU 0 (taskset is part of util-linux).
# Pinning is best-effort; the loop runs either way. No cleanup is
# needed afterwards because the affinity setting ends with the process.
taskset -cp 0 "$$" >/dev/null 2>&1 ||
    echo "Could not pin to CPU 0; running unpinned." >&2

while true; do
    # Do nothing.
    :
done
Re: Portage load control [ In reply to ]
On Fri, May 12, 2023 at 10:42 AM Peter Humphrey <peter@prh.myzen.co.uk> wrote:
>
> On Friday, 12 May 2023 17:58:46 BST Jack wrote:
>
> > Again, --load-average tells emerge whether it can start a new
> > job/package, but has no control over how high the load will get based
> > on the already started jobs. If emerge starts new jobs when the load
> > is over that specified by --load-average, that does smell like a bug in
> > emerge.
>
> Hooray!
>

Peter,
I agree with Jack's response, but the keyword & potential issue is
all based around that one word - "If". The way I see this is unless
you have tracked down realtime what processes are running and
where the CPU usage is going, and can further be sure that it's a
process emerge itself started, then we don't really know what is
causing the problem. My concern is what happens if emerge is
honoring --load-average but you're seeing system usage created
by some tool emerge called that doesn't understand --jobs and
emerge doesn't know about at that level? Think some Rust code
getting built by a rust compiler, or some deep make system.

Anyway, I had a couple of thoughts:

1) If it's really a bug then as others have said report it up the
chain and hope for a fix.

2) If I wanted to solve the problem today(ish) then I'd build
a Gentoo VM in Virtualbox, dedicate some number of cores
to it, build everything with binary packages and probably
run an NFS server in the VM which I mount in the host
machine. I then update the host machine from the binary
packages and Virtualbox manages to never use more cores
than I give it. That fix is more or less guaranteed to work.

3) As a question for the far more knowledgeable system
folks I'd ask "Can this problem be solved by cgroups?" If
I have a cgroup with 10 processors in it, can I start emerge
in the host environment and then just transfer the emerge
process ID to a cgroup that I've set up for this purpose?
Isn't that what cgroups is supposed to be used for?
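
For what it's worth, a rough sketch of what the cgroup steps might look like. It assumes cgroup v2 mounted at /sys/fs/cgroup with the cpuset controller available; the group name and PID are placeholders, and the function only prints the commands rather than running them, since they need root and depend on how the init system manages cgroups.

```shell
#!/bin/sh
# cgroup_plan NAME CPUS PID: print (not run) the cgroup v2 steps that would
# confine PID to the CPU list CPUS. Assumes /sys/fs/cgroup is a cgroup v2
# mount and that the printed commands are later run as root.
cgroup_plan() {
    root=/sys/fs/cgroup
    echo "echo +cpuset > $root/cgroup.subtree_control"
    echo "mkdir -p $root/$1"
    echo "echo $2 > $root/$1/cpuset.cpus"
    echo "echo $3 > $root/$1/cgroup.procs"
}

cgroup_plan emerge-build 0-9 12345
```

Two caveats: moving a PID only affects that process and the children it starts afterwards, so it is simpler to start emerge from a shell that is already in the group. Also, a cpuset caps which CPUs get used, not the load average figure, which can still read high while tasks queue for the ten cores.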

Anyway, just thoughts.

Good luck,
Mark
Re: Portage load control [ In reply to ]
On Saturday, 13 May 2023 00:53:49 BST Mark Knecht wrote:

> Anyway, I had a couple of thoughts:
>
> 1) If it's really a bug then as others have said report it up the
> chain and hope for a fix.

https://bugs.gentoo.org/905933

> 2) If I wanted to solve the problem today(ish) then I'd build
> a Gentoo VM in Virtualbox, dedicate some number of cores
> to it, build everything with binary packages and probably
> run an NFS server in the VM which I mount in the host
> machine. I then update the host machine from the binary
> packages and Virtualbox manages to never use more cores
> than I give it. That fix is more or less guaranteed to work.

Sounds like a lot of work. :(

> 3) As a question for the far more knowledgeable system
> folks I'd ask "Can this problem be solved by cgroups?" If
> I have a cgroup with 10 processors in it, can I start emerge
> in the host environment and then just transfer the emerge
> process ID to a cgroup that I've set up for this purpose?
> Isn't that what cgroups is supposed to be used for?

Interesting idea, that.

> Anyway, just thoughts.

All grist to the mill...

--
Regards,
Peter.
Re: Portage load control [ In reply to ]
On 5/12/23 20:08, Peter Humphrey wrote:
> On Saturday, 13 May 2023 00:53:49 BST Mark Knecht wrote:
>
>> Anyway, I had a couple of thoughts:
>>
>> 1) If it's really a bug then as others have said report it up the
>> chain and hope for a fix.
> https://bugs.gentoo.org/905933
>
>> 2) If I wanted to solve the problem today(ish) then I'd build
>> a Gentoo VM in Virtualbox, dedicate some number of cores
>> to it, build everything with binary packages and probably
>> run an NFS server in the VM which I mount in the host
>> machine. I then update the host machine from the binary
>> packages and Virtualbox manages to never use more cores
>> than I give it. That fix is more or less guaranteed to work.
> Sounds like a lot of work. :(
A new thought on an easier test. With -j any higher than 1, doesn't
emerge put out a fairly constant stream of how many out of how many jobs
are complete, how many are currently running, and the load average? If
it launches new jobs when its own display of load average is above what
you set, that should be pretty compelling to the developers.
>> 3) As a question for the far more knowledgeable system
>> folks I'd ask "Can this problem be solved by cgroups?" If
>> I have a cgroup with 10 processors in it, can I start emerge
>> in the host environment and then just transfer the emerge
>> process ID to a cgroup that I've set up for this purpose?
>> Isn't that what cgroups is supposed to be used for?
> Interesting idea, that.
>
>> Anyway, just thoughts.
> All grist to the mill...
>