Mailing List Archive

Jobs and load-average
Hello list,

Not long ago I read that we should allow 2GB RAM for every emerge job - that
is, we should divide our RAM size by 2 to get the maximum number of
simultaneous jobs. I'm trying to get that right, but I'm not there yet.

I have these entries in make.conf:
EMERGE_DEFAULT_OPTS="--jobs=16 --load-average=32 --autounmask=n --quiet-
unmerge-warn --ke>
MAKEOPTS="-j16"

Today, though, I saw load averages going up to 72. Can anyone suggest better
values to suit my 24 threads and 64GB RAM?

--
Regards,
Peter.
Re: Jobs and load-average
On Wednesday, 15 February 2023, 10:56:22 CET, Peter Humphrey wrote:
> Hello list,
>
> Not long ago I read that we should allow 2GB RAM for every emerge job - that
> is, we should divide our RAM size by 2 to get the maximum number of
> simultaneous jobs. I'm trying to get that right, but I'm not there yet.
>
> I have these entries in make.conf:
> EMERGE_DEFAULT_OPTS="--jobs=16 --load-average=32 --autounmask=n --quiet-
> unmerge-warn --ke>
> MAKEOPTS="-j16"
>
> Today, though, I saw load averages going up to 72. Can anyone suggest better
> values to suit my 24 threads and 64GB RAM?

Maybe you are interested in this wiki article:

https://wiki.gentoo.org/wiki/User:Pietinger/Tutorials/Optimize_compile_times

Regards,
Peter
Re: Jobs and load-average
On Wednesday, 15 February 2023 09:56:22 GMT Peter Humphrey wrote:

> EMERGE_DEFAULT_OPTS="--jobs=16 --load-average=32 --autounmask=n --quiet-
> unmerge-warn --ke>

That should have been:
EMERGE_DEFAULT_OPTS="--jobs=16 --load-average=32 --autounmask=n --quiet-
unmerge-warn --keep-going --nospinner"

--
Regards,
Peter.
Re: Jobs and load-average
On Wednesday, 15 February 2023 11:31:49 GMT Peter Böhm wrote:
> On Wednesday, 15 February 2023, 10:56:22 CET, Peter Humphrey wrote:
> > Hello list,
> >
> > Not long ago I read that we should allow 2GB RAM for every emerge job -
> > that is, we should divide our RAM size by 2 to get the maximum number of
> > simultaneous jobs. I'm trying to get that right, but I'm not there yet.
> >
> > I have these entries in make.conf:
> > EMERGE_DEFAULT_OPTS="--jobs=16 --load-average=32 --autounmask=n --quiet-
> > unmerge-warn --ke>

The above determines how many ebuilds will be emerged in parallel. If you are
rebuilding your whole system, with hundreds of packages stacking up to be
emerged, then having as many as 16 packages emerging in parallel could be
advantageous.


> > MAKEOPTS="-j16"

This determines how many MAKE jobs will run in parallel in any one emerge.
Large packages like chromium will benefit from maximising the number of jobs
here, as long as you have enough RAM.

Given you have 24 threads and 64GB of RAM, you should be able to ratchet this
up to -j24, but not if you specify a high --jobs value in EMERGE_DEFAULT_OPTS
at the same time, or if the compiler is eating up more than 2G per process.


> > Today, though, I saw load averages going up to 72. Can anyone suggest
> > better values to suit my 24 threads and 64GB RAM?

Since you have specified up to 16 parallel emerges and each one could run up
to 16 MAKE jobs, you can understand why you would soon find loads escalating
and your machine becoming unresponsive.

You should consider what is more important for you, emerging as many packages
in parallel as possible, or emerging any one large package as fast as
possible.

Two extreme examples would be setting EMERGE_DEFAULT_OPTS at "--jobs 24", with
MAKEOPTS at "-j1", or conversely setting EMERGE_DEFAULT_OPTS at "--jobs 1",
with MAKEOPTS at "-j24".
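
Spelled out in make.conf, those two extremes would look like this (keeping
whatever other options you already carry in EMERGE_DEFAULT_OPTS):

# all the parallelism at the emerge level
EMERGE_DEFAULT_OPTS="--jobs=24"
MAKEOPTS="-j1"

# or: all the parallelism inside each single build
EMERGE_DEFAULT_OPTS="--jobs=1"
MAKEOPTS="-j24"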

On my old and slow laptop with only 4 threads and 16G of RAM my priority is to
finish large packages faster. I leave EMERGE_DEFAULT_OPTS unset, while
specifying MAKEOPTS="-j5 -l4.8". This uses a job number I determined by trial
and error, building ffmpeg repeatedly while progressively increasing -j from
1 to 12. The two fastest times were achieved with -j5 and -j10, which aligns
with the old myth of using CPUs+1.

Regarding RAM usage being ~2G per MAKE job: this fluctuates with successive
compiler versions. I have seen up to 3.4G of RAM per process while emerging
chromium. For such huge packages, which cause excessive swapping,
unresponsiveness and disk thrashing, I limit MAKEOPTS to -j3 in package.env.
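
In outline, that package.env mechanism looks like this (the file name is
just an example):

# /etc/portage/env/jobs-3.conf
MAKEOPTS="-j3"

# /etc/portage/package.env
www-client/chromium jobs-3.conf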


> Maybe you are interested in this wiki article:
>
> https://wiki.gentoo.org/wiki/User:Pietinger/Tutorials/Optimize_compile_times
>
> Regards,
> Peter

I'd start by reading this article, which is a good introduction to the
concepts involved:

https://wiki.gentoo.org/wiki/MAKEOPTS
Re: Jobs and load-average
On Wed, Feb 15, 2023 at 4:56 AM Peter Humphrey <peter@prh.myzen.co.uk> wrote:
>
> Not long ago I read that we should allow 2GB RAM for every emerge job - that
> is, we should divide our RAM size by 2 to get the maximum number of
> simultaneous jobs. I'm trying to get that right, but I'm not there yet.
>
> I have these entries in make.conf:
> EMERGE_DEFAULT_OPTS="--jobs=16 --load-average=32 --autounmask=n --quiet-
> unmerge-warn --ke>
> MAKEOPTS="-j16"
>
> Today, though, I saw load averages going up to 72. Can anyone suggest better
> values to suit my 24 threads and 64GB RAM?

First, keep in mind that --jobs=16 + -j16 can result in up to 256
(16*16) tasks running at once. Of course, that is worst case and most
of the time you'll have way less than that.

Keep in mind that you need to consider available RAM and not just
total RAM. Run free under the conditions where you typically run
emerge and see how much available memory it displays. Depending on
what you have running it could be much lower than 64GB.
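
For example:

$ free -g

Look at the "available" column rather than "free"; it estimates what can be
handed out to new processes without pushing the system into swap.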

Beyond that, unfortunately this is hard to deal with beyond just
figuring out what needs more RAM and making exceptions in package.env.

Also, RAM pressure could also come from the build directory if it is
on tmpfs, which of course many of us use.

Some packages that I build with either a greatly reduced -j setting or
a non-tmpfs build directory are:
sys-cluster/ceph
dev-python/scipy
dev-python/pandas
app-office/calligra
net-libs/nodejs
dev-qt/qtwebengine
dev-qt/qtwebkit
dev-lang/spidermonkey
www-client/chromium
app-office/libreoffice
sys-devel/llvm
dev-lang/rust (I use the rust binary these days as this has gotten
really out of hand)
x11-libs/gtk+

These are just packages I've had issues with at some point, and it is
possible that some of these packages no longer use as much memory
today.
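
For the non-tmpfs case the same package.env mechanism works; a sketch, with
a made-up directory name (it has to exist and sit on real disk):

# /etc/portage/env/notmpfs.conf
PORTAGE_TMPDIR="/var/tmp/notmpfs"

# /etc/portage/package.env
dev-qt/qtwebengine notmpfs.conf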

--
Rich
Re: Jobs and load-average
On Wednesday, 15 February 2023 13:18:24 GMT Rich Freeman wrote:
> On Wed, Feb 15, 2023 at 4:56 AM Peter Humphrey <peter@prh.myzen.co.uk>
wrote:
> > Not long ago I read that we should allow 2GB RAM for every emerge job -
> > that is, we should divide our RAM size by 2 to get the maximum number of
> > simultaneous jobs. I'm trying to get that right, but I'm not there yet.
> >
> > I have these entries in make.conf:
> > EMERGE_DEFAULT_OPTS="--jobs=16 --load-average=32 --autounmask=n --quiet-
> > unmerge-warn --ke>
> > MAKEOPTS="-j16"
> >
> > Today, though, I saw load averages going up to 72. Can anyone suggest
> > better values to suit my 24 threads and 64GB RAM?
>
> First, keep in mind that --jobs=16 + -j16 can result in up to 256
> (16*16) tasks running at once. Of course, that is worst case and most
> of the time you'll have way less than that.
>
> Keep in mind that you need to consider available RAM and not just
> total RAM. Run free under the conditions where you typically run
> emerge and see how much available memory it displays. Depending on
> what you have running it could be much lower than 64GB.
>
> Beyond that, unfortunately this is hard to deal with beyond just
> figuring out what needs more RAM and making exceptions in package.env.
>
> Also, RAM pressure could also come from the build directory if it is
> on tmpfs, which of course many of us use.
>
> Some packages that I build with either a greatly reduced -j setting or
> a non-tmpfs build directory are:
> sys-cluster/ceph
> dev-python/scipy
> dev-python/pandas
> app-office/calligra
> net-libs/nodejs
> dev-qt/qtwebengine
> dev-qt/qtwebkit
> dev-lang/spidermonkey
> www-client/chromium
> app-office/libreoffice
> sys-devel/llvm
> dev-lang/rust (I use the rust binary these days as this has gotten
> really out of hand)
> x11-libs/gtk+
>
> These are just packages I've had issues with at some point, and it is
> possible that some of these packages no longer use as much memory
> today.

Thank you all. I can see what I'm doing better now. (Politicians aren't the
only ones who can be ambiguous!)

I'll start by picking up the point I'd missed - putting MAKEOPTS in
package.env.

--
Regards,
Peter.
Re: Jobs and load-average
On Wednesday, 15 February 2023 14:31:25 GMT Peter Humphrey wrote:
> On Wednesday, 15 February 2023 13:18:24 GMT Rich Freeman wrote:
> > On Wed, Feb 15, 2023 at 4:56 AM Peter Humphrey <peter@prh.myzen.co.uk>
>
> wrote:
> > > Not long ago I read that we should allow 2GB RAM for every emerge job -
> > > that is, we should divide our RAM size by 2 to get the maximum number of
> > > simultaneous jobs. I'm trying to get that right, but I'm not there yet.
> > >
> > > I have these entries in make.conf:
> > > EMERGE_DEFAULT_OPTS="--jobs=16 --load-average=32 --autounmask=n --quiet-
> > > unmerge-warn --ke>
> > > MAKEOPTS="-j16"
> > >
> > > Today, though, I saw load averages going up to 72. Can anyone suggest
> > > better values to suit my 24 threads and 64GB RAM?
> >
> > First, keep in mind that --jobs=16 + -j16 can result in up to 256
> > (16*16) tasks running at once. Of course, that is worst case and most
> > of the time you'll have way less than that.
> >
> > Keep in mind that you need to consider available RAM and not just
> > total RAM. Run free under the conditions where you typically run
> > emerge and see how much available memory it displays. Depending on
> > what you have running it could be much lower than 64GB.
> >
> > Beyond that, unfortunately this is hard to deal with beyond just
> > figuring out what needs more RAM and making exceptions in package.env.
> >
> > Also, RAM pressure could also come from the build directory if it is
> > on tmpfs, which of course many of us use.
> >
> > Some packages that I build with either a greatly reduced -j setting or
> > a non-tmpfs build directory are:
> > sys-cluster/ceph
> > dev-python/scipy
> > dev-python/pandas
> > app-office/calligra
> > net-libs/nodejs
> > dev-qt/qtwebengine
> > dev-qt/qtwebkit
> > dev-lang/spidermonkey
> > www-client/chromium
> > app-office/libreoffice
> > sys-devel/llvm
> > dev-lang/rust (I use the rust binary these days as this has gotten
> > really out of hand)
> > x11-libs/gtk+
> >
> > These are just packages I've had issues with at some point, and it is
> > possible that some of these packages no longer use as much memory
> > today.
>
> Thank you all. I can see what I'm doing better now. (Politicians aren't the
> only ones who can be ambiguous!)
>
> I'll start by picking up the point I'd missed - putting MAKEOPTS in
> package.env.

You can have both: a generic MAKEOPTS in make.conf, which suits your base
case of emerge operations and will not cause your PC to explode when combined
with EMERGE_DEFAULT_OPTS, and package-specific MAKEOPTS entries in
package.env to fine-tune individual package requirements.
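
For instance (the numbers and file name here are only placeholders to show
the shape, not a recommendation):

# make.conf -- generic defaults for everyday emerges
EMERGE_DEFAULT_OPTS="--jobs=4 --load-average=24"
MAKEOPTS="-j24 -l24"

# /etc/portage/env/heavy.conf -- reduced parallelism for the big packages
MAKEOPTS="-j4"

# /etc/portage/package.env
www-client/chromium heavy.conf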
Re: Jobs and load-average
On Wednesday, 15 February 2023 15:12:28 GMT Michael wrote:

> You can have both: a generic MAKEOPTS in make.conf, which suits your base
> case of emerge operations and will not cause your PC to explode when
> combined with EMERGE_DEFAULT_OPTS, and package-specific MAKEOPTS entries in
> package.env to fine-tune individual package requirements.

Yes, I assumed so, and I've now set it up that way.

--
Regards,
Peter.
Re: Jobs and load-average
On Wednesday, February 15, 2023 10:56:22 AM CET Peter Humphrey wrote:
> Hello list,
>
> Not long ago I read that we should allow 2GB RAM for every emerge job - that
> is, we should divide our RAM size by 2 to get the maximum number of
> simultaneous jobs. I'm trying to get that right, but I'm not there yet.
>
> I have these entries in make.conf:
> EMERGE_DEFAULT_OPTS="--jobs=16 --load-average=32 --autounmask=n --quiet-
> unmerge-warn --ke>
> MAKEOPTS="-j16"
>
> Today, though, I saw load averages going up to 72. Can anyone suggest better
> values to suit my 24 threads and 64GB RAM?

One other item I haven't seen mentioned in the replies yet:
"--load-average" is also a valid option for make.

If you want to keep the load down, I would suggest adding this to MAKEOPTS as
well:

MAKEOPTS="--jobs=16 --load-average=32"

I write the options out in full because I had some weird errors in the past
when "-j" wasn't handled correctly at some point.

--
Joost
Re: Jobs and load-average
On Wednesday, 15 February 2023 13:18:24 GMT Rich Freeman wrote:

> First, keep in mind that --jobs=16 + -j16 can result in up to 256
> (16*16) tasks running at once. Of course, that is worst case and most
> of the time you'll have way less than that.

Yes, I was aware of that, but why didn't --load-average=32 take precedence?

--->8

> Some packages that I build with either a greatly reduced -j setting or
> a non-tmpfs build directory are:
> sys-cluster/ceph
> dev-python/scipy
> dev-python/pandas
> app-office/calligra
> net-libs/nodejs
> dev-qt/qtwebengine
> dev-qt/qtwebkit
> dev-lang/spidermonkey
> www-client/chromium
> app-office/libreoffice
> sys-devel/llvm
> dev-lang/rust (I use the rust binary these days as this has gotten
> really out of hand)
> x11-libs/gtk+

Thanks for the list, Rich.

--
Regards,
Peter.
Re: Jobs and load-average
On Thu, 16 Feb 2023 09:53:30 +0000
Peter Humphrey <peter@prh.myzen.co.uk> wrote:

> On Wednesday, 15 February 2023 13:18:24 GMT Rich Freeman wrote:
>
> > First, keep in mind that --jobs=16 + -j16 can result in up to 256
> > (16*16) tasks running at once. Of course, that is worst case and most
> > of the time you'll have way less than that.
>
> Yes, I was aware of that, but why didn't --load-average=32 take precedence?
This only means that emerge will not schedule an additional package job
(where a package job means something like `emerge gcc`) when the load
average is > 32; however, once a job is scheduled it keeps running,
independently of the current load.
If you put it in MAKEOPTS instead, it is handled by the make system, which
schedules the individual build jobs and stops scheduling additional jobs
when the load is too high.

Extreme case:
emerge chromium firefox qtwebengine
--> assuming your load when you start this is pretty much close to 0, all 3
packages will be merged simultaneously and each will be built with -j16.
That means that for a long time you will have about 3*16 = 48 individual
build jobs running in parallel, i.e. you should see the load heading towards
48 when you do not have a --load-average limit in your MAKEOPTS.

Cheers
Andreas
Re: Jobs and load-average
On Thu, Feb 16, 2023 at 5:32 AM Andreas Fink <finkandreas@web.de> wrote:
>
> On Thu, 16 Feb 2023 09:53:30 +0000
> Peter Humphrey <peter@prh.myzen.co.uk> wrote:
>
> > Yes, I was aware of that, but why didn't --load-average=32 take precedence?
> This only means that emerge will not schedule an additional package job
> (where a package job means something like `emerge gcc`) when the load
> average is > 32; however, once a job is scheduled it keeps running,
> independently of the current load.
> If you put it in MAKEOPTS instead, it is handled by the make system, which
> schedules the individual build jobs and stops scheduling additional jobs
> when the load is too high.
>
> Extreme case:
> emerge chromium firefox qtwebengine
> --> assuming your load when you start this is pretty much close to 0, all
> 3 packages will be merged simultaneously and each will be built with -j16.
> That means that for a long time you will have about 3*16 = 48 individual
> build jobs running in parallel, i.e. you should see the load heading
> towards 48 when you do not have a --load-average limit in your MAKEOPTS.

TL;DR - the load-average option results in underdamping, because of the lag
in the load-average measurement.

Keep in mind that load averages are averages and have a time lag, and
compilers that are swapping like crazy can run for a fairly long time.
So you will probably have fairly severe oscillation in the load if
swapping is happening. If your load is under 32, your 16 parallel makes,
even if running with the limit in MAKEOPTS, will between them feel free to
launch up to another 256 jobs, because it will take seconds for the
1-minute load average to creep above 32. At that point you have WAY
more than 32 tasks running and if they're swapping then half of the
processes on your system are probably going to start blocking. So now
make (if configured in MAKEOPTS) will hold off on launching anything,
but it could take minutes for those swapping compiler jobs to complete
the amount of work that would normally take a few seconds. Then as
those processes eventually start terminating (assuming you don't get
OOM killing or PANICs) your load will start dropping, until eventually
it gets back below 32, at which point all those make processes that
are just sitting around will wake up and fire off another 50 gcc
instances or whatever they get up to before the brakes come back on.

The load average setting is definitely useful and I would definitely
set it, but when the issue is swapping it doesn't go far enough. Make
has no idea how much memory a gcc process will require. Since that is
the resource likely causing problems it is hard to efficiently max out
your cores without actually accounting for memory use. The best I've
been able to do is just set things conservatively so it never gets out
of control, and underutilizes CPU in the process. Often it is only
parts of a build that even have issues - something big like chromium
might have 10,000 tasks that would run fine with -j16 or whatever, but
then there is this one part where the jobs all want a ton of RAM and
you need to run just that one part at a lower setting.

--
Rich
Re: Jobs and load-average
On Thursday, 16 February 2023 12:23:52 GMT Rich Freeman wrote:

--->8 Much useful detail.

That all makes perfect sense, and is what I'd assumed, but it's good to have
it confirmed.

> The load average setting is definitely useful and I would definitely
> set it, but when the issue is swapping it doesn't go far enough. Make
> has no idea how much memory a gcc process will require. Since that is
> the resource likely causing problems it is hard to efficiently max out
> your cores without actually accounting for memory use. The best I've
> been able to do is just set things conservatively so it never gets out
> of control, and underutilizes CPU in the process. Often it is only
> parts of a build that even have issues - something big like chromium
> might have 10,000 tasks that would run fine with -j16 or whatever, but
> then there is this one part where the jobs all want a ton of RAM and
> you need to run just that one part at a lower setting.

I've just looked at 'man make', from which it's clear that -j = --jobs, and
that both those and --load-average are passed to /usr/bin/make, presumably
untouched unless portage itself has identically named variables. So I wonder
how feasible it might be for make to incorporate its own checks to ensure that
the load average is not exceeded. I am not a programmer (not for at least 35
years, anyway), so I have to leave any such suggestion to the experts.

--
Regards,
Peter.
Re: Jobs and load-average
On Thu, Feb 16, 2023 at 8:39 AM Peter Humphrey <peter@prh.myzen.co.uk> wrote:
>
> I've just looked at 'man make', from which it's clear that -j = --jobs, and
> that both those and --load-average are passed to /usr/bin/make, presumably
> untouched unless portage itself has identically named variables. So I wonder
> how feasible it might be for make to incorporate its own checks to ensure that
> the load average is not exceeded. I am not a programmer (not for at least 35
> years, anyway), so I have to leave any such suggestion to the experts.
>

Well, if we just want to have a fun discussion here are my thoughts.
However, the complexity vs usefulness outside of Gentoo is such that I
don't see it happening.

For the most typical use case - a developer building the same thing
over and over (which isn't Gentoo), then make could cache info on
resources consumed, and use that to make more educated decisions about
how many tasks to launch. That wouldn't help us at all, but it would
help the typical make user. However, the typical make user can just
tune things in other ways.

It isn't going to be possible for make to estimate build complexity in
any practical way. Halting problem aside maybe you could build in
some smarts looking at the program being executed and its arguments,
but it would be a big mess.

Something make could do is tune the damping a bit. It could gradually
increase the number of jobs it runs and watch the load average, and
gradually scale it up appropriately, and gradually scale down if CPU
is the issue, or rapidly scale down if swap is the issue. If swapping
is detected it could even suspend most of the tasks it has spawned and
then gradually continue them as other tasks finish to recover from
this condition. However, this isn't going to work as well if portage
is itself spawning parallel instances of make - they'd have to talk to
each other or portage would somehow need to supervise things.

A way of thinking about it is that when you have portage spawning
multiple instances of make, that is a bit like adding gain to the
--load-average MAKEOPTS. So each instance of make independently looks
at load average and takes action. So you have an output (compilers
that create load), then you sample that load with a time-weighted
average, and then you apply gain to this average, and then use that as
feedback. That's basically a recipe for out of control oscillation.
You need to add damping and get rid of the gain.

Disclaimer: I'm not an engineer and I suspect a real engineer would be
able to add a bit more insight.

Really though the issue is that this is the sort of thing that only
impacts Gentoo and so nobody else is likely to solve this problem for
us.

--
Rich
Re: Jobs and load-average
On Thu, 16 Feb 2023 09:24:08 -0500
Rich Freeman <rich0@gentoo.org> wrote:

> On Thu, Feb 16, 2023 at 8:39 AM Peter Humphrey <peter@prh.myzen.co.uk> wrote:
> >
> > I've just looked at 'man make', from which it's clear that -j = --jobs, and
> > that both those and --load-average are passed to /usr/bin/make, presumably
> > untouched unless portage itself has identically named variables. So I wonder
> > how feasible it might be for make to incorporate its own checks to ensure that
> > the load average is not exceeded. I am not a programmer (not for at least 35
> > years, anyway), so I have to leave any such suggestion to the experts.
> >
>
> Well, if we just want to have a fun discussion here are my thoughts.
> However, the complexity vs usefulness outside of Gentoo is such that I
> don't see it happening.
>
> For the most typical use case - a developer building the same thing
> over and over (which isn't Gentoo), then make could cache info on
> resources consumed, and use that to make more educated decisions about
> how many tasks to launch. That wouldn't help us at all, but it would
> help the typical make user. However, the typical make user can just
> tune things in other ways.
>
> It isn't going to be possible for make to estimate build complexity in
> any practical way. Halting problem aside maybe you could build in
> some smarts looking at the program being executed and its arguments,
> but it would be a big mess.
>
> Something make could do is tune the damping a bit. It could gradually
> increase the number of jobs it runs and watch the load average, and
> gradually scale it up appropriately, and gradually scale down if CPU
> is the issue, or rapidly scale down if swap is the issue. If swapping
> is detected it could even suspend most of the tasks it has spawned and
> then gradually continue them as other tasks finish to recover from
> this condition. However, this isn't going to work as well if portage
> is itself spawning parallel instances of make - they'd have to talk to
> each other or portage would somehow need to supervise things.
>
> A way of thinking about it is that when you have portage spawning
> multiple instances of make, that is a bit like adding gain to the
> --load-average MAKEOPTS. So each instance of make independently looks
> at load average and takes action. So you have an output (compilers
> that create load), then you sample that load with a time-weighted
> average, and then you apply gain to this average, and then use that as
> feedback. That's basically a recipe for out of control oscillation.
> You need to add damping and get rid of the gain.
>
> Disclaimer: I'm not an engineer and I suspect a real engineer would be
> able to add a bit more insight.
>
> Really though the issue is that this is the sort of thing that only
> impacts Gentoo and so nobody else is likely to solve this problem for
> us.
>

Given all your explanation, and my annoyance a couple of years ago, I
hacked up a little helper that sits between make and the spawned build jobs.
Basically what annoyed me is the fact that chromium would compile for
hours and then fail because it needed more memory than was available,
and that would fail the whole build.
One possible solution is to reduce the number of build jobs to e.g. -j1
for chromium, but this is stupid because 99% of the time -j16 would
work just fine.

So I hacked around a bit and came up with a little helper & watcher. The
helper limits the spawning of new jobs to SOME_LIMIT, and also holds back
new jobs when the load is too high (e.g. when I am doing other work on the
PC that's not under emerge's control). The watcher kills memory-hungry build
jobs once memory usage goes above 90%, tells the helper to stop spawning new
jobs, waits until the helper reports that no more build jobs are running,
and then respawns the memory-hungry build job (i.e. the memory-hungry build
job runs essentially as if -j1 had been specified).

This way I can mix emerge --jobs=HIGH_NUMBER and make -jOTHER_HIGH_NUMBER
without it overwhelming the system, because the total number of actual build
jobs is controlled by the helper and never goes beyond SOME_LIMIT, even if
HIGH_NUMBER*OTHER_HIGH_NUMBER > SOME_LIMIT.

I never published this anywhere; if there's interest in it I can probably
upload it somewhere, but I had the feeling that it's quite hacky and not
worth publishing. Also, I was never sure whether I was breaking emerge in
some way, because it's very low-level, but it has now been running for more
than a year without any emerge failure due to this hijacking.
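
To give a rough idea of just the "pause the hungry job" part of the concept
(this is not the actual tool, only a much simplified sketch; the threshold,
poll interval and compiler process names are placeholders):

#!/bin/bash
# Simplified illustration: suspend the biggest compiler process while
# available memory is low, resume it once memory has recovered.
# The 4 GiB threshold and the process names are arbitrary placeholders.

THRESHOLD_KB=$((4 * 1024 * 1024))

mem_available() {
    awk '/^MemAvailable:/ {print $2}' /proc/meminfo
}

while sleep 5; do
    [ "$(mem_available)" -ge "$THRESHOLD_KB" ] && continue
    # pick the compiler process with the largest resident set
    pid=$(ps -C cc1plus,cc1,rustc -o pid= --sort=-rss | awk 'NR==1 {print $1}')
    [ -n "$pid" ] || continue
    kill -STOP "$pid"
    # wait for memory pressure to ease, then let the job carry on
    while [ "$(mem_available)" -lt "$THRESHOLD_KB" ]; do
        sleep 5
    done
    kill -CONT "$pid"
done

The real helper/watcher does more than this (it kills and respawns the job
and coordinates with the job-spawning side), but the shape is similar.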
RE: Jobs and load-average
> -----Original Message-----
> From: Rich Freeman <rich0@gentoo.org>
> Sent: Thursday, February 16, 2023 6:24 AM
> To: gentoo-user@lists.gentoo.org
> Subject: Re: [gentoo-user] Jobs and load-average
>
> On Thu, Feb 16, 2023 at 8:39 AM Peter Humphrey <peter@prh.myzen.co.uk> wrote:
> >
> > I've just looked at 'man make', from which it's clear that -j =
> > --jobs, and that both those and --load-average are passed to
> > /usr/bin/make, presumably untouched unless portage itself has
> > identically named variables. So I wonder how feasible it might be for
> > make to incorporate its own checks to ensure that the load average is
> > not exceeded. I am not a programmer (not for at least 35 years, anyway), so I have to leave any such suggestion to the experts.
> >
>
> Well, if we just want to have a fun discussion here are my thoughts.
> However, the complexity vs usefulness outside of Gentoo is such that I don't see it happening.
>
> For the most typical use case - a developer building the same thing over and over (which isn't Gentoo), then make could cache info on resources consumed, and use that to make more educated decisions about how many tasks to launch. That wouldn't help us at all, but it would help the typical make user. However, the typical make user can just tune things in other ways.
>
> It isn't going to be possible for make to estimate build complexity in any practical way. Halting problem aside maybe you could build in some smarts looking at the program being executed and its arguments, but it would be a big mess.
>
> Something make could do is tune the damping a bit. It could gradually increase the number of jobs it runs and watch the load average, and gradually scale it up appropriately, and gradually scale down if CPU is the issue, or rapidly scale down if swap is the issue. If swapping is detected it could even suspend most of the tasks it has spawned and then gradually continue them as other tasks finish to recover from this condition. However, this isn't going to work as well if portage is itself spawning parallel instances of make - they'd have to talk to each other or portage would somehow need to supervise things.
>
> A way of thinking about it is that when you have portage spawning multiple instances of make, that is a bit like adding gain to the --load-average MAKEOPTS. So each instance of make independently looks at load average and takes action. So you have an output (compilers that create load), then you sample that load with a time-weighted average, and then you apply gain to this average, and then use that as feedback. That's basically a recipe for out of control oscillation.
> You need to add damping and get rid of the gain.
>
> Disclaimer: I'm not an engineer and I suspect a real engineer would be able to add a bit more insight.
>
> Really though the issue is that this is the sort of thing that only impacts Gentoo and so nobody else is likely to solve this problem for us.
>
> --
> Rich

Expanding the capabilities of app-admin/cpulimit to tally up memory usage and suspend new compile jobs temporarily when it gets too high probably wouldn't be too horribly difficult...

LMP
Re: Jobs and load-average
On Wednesday, 15 February 2023 09:56:22 BST Peter Humphrey wrote:
> Hello list,
>
> Not long ago I read that we should allow 2GB RAM for every emerge job - that
> is, we should divide our RAM size by 2 to get the maximum number of
> simultaneous jobs. I'm trying to get that right, but I'm not there yet.
>
> I have these entries in make.conf:
> EMERGE_DEFAULT_OPTS="--jobs=16 --load-average=32 --autounmask=n --quiet-
> unmerge-warn --ke>
> MAKEOPTS="-j16"
>
> Today, though, I saw load averages going up to 72. Can anyone suggest better
> values to suit my 24 threads and 64GB RAM?

Thanks all for your contributions.

I've settled on the following, after some experimenting:

EMERGE_DEFAULT_OPTS="--jobs --autounmask=n --quiet-unmerge-warn --keep-going
--nospinner"
MAKEOPTS="-j24"

I've stopped using disk space for /var/tmp/portage, even for the biggest
packages, because (a) it causes a huge increase in compilation time, even on a
SATA SSD, and (b) I've never seen an OOM anyway.

So what if the system load goes high? It's only the number of processes ready
for execution (or stuck in uninterruptible I/O waits) at any instant. I
imagine the kernel is effective in guarding its own memory spaces.

--
Regards,
Peter.