Mailing List Archive

Emerge load again
Hello list,

I still can't see how portage limits the load. Today I'm emerging libreoffice,
and it's spending almost the whole time working with 4 CPU threads. But:

$ grep -e '\-j' -e distcc /etc/portage/make.conf
EMERGE_DEFAULT_OPTS="--jobs=18 --load-average=30 --backtrack=200 --autounmask=n --keep-going --nospinner"
FEATURES="distcc userfetch buildpkg network-sandbox parallel-install sandbox userpriv usersandbox"
MAKEOPTS="-j18"

I found a suggestion to use distcc in the installation handbook, which I
hadn't seen there before, so I went searching for it and found how to do it.
It usually works well, in this case starting 18 packages before starting LO
itself. grep -rw doesn't find '4' anywhere relevant under /etc/portage/ . Other
times it just doesn't help at all.

What am I missing?

--
Regards,
Peter.
Re: Emerge load again
On Monday, 27 November 2023 15:39:33 GMT Peter Humphreey wrote:
> Hello list,
>
> I still can't see how portage limits the load. Today I'm emerging
> libreoffice, and it's spending almost the whole time working with 4 CPU
> threads. But:
>
> $ grep -e '\-j' -e distcc /etc/portage/make.conf
> EMERGE_DEFAULT_OPTS="--jobs=18 --load-average=30 --backtrack=200 --autounmask=n --keep-going --nospinner"
> FEATURES="distcc userfetch buildpkg network-sandbox parallel-install sandbox userpriv usersandbox"
> MAKEOPTS="-j18"
>
> I found a suggestion to use distcc in the installation handbook, which I
> hadn't seen there before, so I went searching for it and found how to do it.
> It usually works well, in this case starting 18 packages before starting LO
> itself. grep -rw doesn't find '4' anywhere relevant under /etc/portage/ .
> Other times it just doesn't help at all.
>
> What am I missing?

In absence of other contributions I'll offer a theoretical explanation, based
on random observations on my systems.

You have specified as many as 18 packages to be emerged in parallel x up to 18
make jobs each. The result of [18 x 18 = 324] is to be limited by a total
load average of 30.

If there were more than 18 packages listed to be emerged and there were no
dependencies between them to restrict how many could start emerging in
parallel, you would observe at most 18 packages being emerged in parallel. This
alone will not breach the load limit of 30.

Let's assume all 18 packages had a large codebase to need at least 18 make
jobs each. Sooner or later you'd have 18 parallel emerges all trying to run
18 make jobs.

Were this to occur the load limit restriction would kick in and you would see
only up to 30 jobs listed in top, with individual package processes
alternating in the top list of make threads.
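The arithmetic above can be checked with a short shell sketch. The function name is made up for illustration and the figures are the ones quoted from make.conf. It only computes the theoretical ceiling; it says nothing about what portage or make actually schedule.

```shell
# Hypothetical helper: worst-case number of simultaneous make jobs when
# emerge builds several packages in parallel, each with its own make -jN.
worst_case_jobs() {
    local emerge_jobs="$1"   # value given to emerge --jobs
    local make_jobs="$2"     # value given to -j in MAKEOPTS
    echo $(( emerge_jobs * make_jobs ))
}

# Figures from the make.conf quoted in this thread.
echo "ceiling: $(worst_case_jobs 18 18) jobs, load target: 30"
```

With the quoted settings the ceiling is more than ten times the load target, so some limit has to do the real work.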

Here's my hypothesis explaining your own observation with libreoffice. As a
package or more finished emerging, libreoffice's turn comes up. Soon
libreoffice starts to execute make jobs, but any of the following may apply:

1. There are only 4 out of 30 jobs available, because other packages are
already using 26, throughout your window of observation.
2. Libreoffice sequencing of make jobs is mostly linear with succeeding make
jobs waiting on output from their predecessors.
3. Libreoffice source code is not optimised for high parallelism - I recall
when it was hardcoded at -j1 just a few years ago. Before this restriction
was added, any bug reporters were advised to try again after limiting make to
-j1.

Next time I'm building libreoffice on a beefier system I'll keep an eye out
for the number of jobs to see what it gets up to.
Re: Emerge load again
On Wednesday, 29 November 2023 10:26:36 GMT Michael wrote:

> Here's my hypothesis explaining your own observation with libreoffice. As a
> package or more finished emerging, libreoffice's turn comes up. Soon
> libreoffice starts to execute make jobs, but any of the following may
> apply:
>
> 1. There are only 4 out of 30 jobs available, because other packages are
> already using 26, throughout your window of observation.

Nope. Nothing else in progress.

> 2. Libreoffice sequencing of make jobs is mostly linear with succeeding make
> jobs waiting on output from their predecessors.

That's possible, but it doesn't seem likely with such a huge code base. And
why four processes, specifically and consistently?

> 3. Libreoffice source code is not optimised for high parallelism - I recall
> when it was hardcoded at -j1 just a few years ago. Before this restriction
> was added, any bug reporters were advised to try again after limiting make
> to -j1.

Yes, that was common to many packages for a long time because of incomplete
optimisation.

> Next time I'm building libreoffice on a beefier system I'll keep an eye out
> for the number of jobs to see what it gets up to.

That would help, yes.

The contribution of distcc isn't clear to me yet, as I said before. Sometimes
it's the bee's knees; other times it might just as well not be there. I don't
like mysteries... :)

--
Regards,
Peter.
Re: Emerge load again
On Mon, Nov 27, 2023 at 10:39 AM Peter Humphreey <peter@prh.myzen.co.uk>
wrote:

>
> What am I missing?


I have much less powerful hardware than you but libreoffice (as a
stand-alone build) generates many more threads than 4 on my “cluster”. I’m
also using distcc.

On the main box, I set
MAKEOPTS="-j17 -l6"
On the other two less powerful ones -l is 5 and 3, but -j is the same.

On the main box, /etc/distcc/hosts contains
localhost/11 sophie/5,lzo tobey/3,lzo --localslots=11 --localslots_cpp=11

On sophie and tobey (my less powerful boxes) the hosts file contains
something similar but specific to those boxes. The localslots and
localslots_cpp numbers are 3 on tobey and 5 on sophie, and the order in
which the machines are mentioned changes (local machine first, then remote
machines in order of power).
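To compare the slots offered by a hosts line with the -j value, the /LIMIT figures can be added up. This is only a sketch: total_slots is a hypothetical helper, not part of distcc, and it assumes the simple host/LIMIT[,option] form used above.

```shell
# Hypothetical helper: add up the /LIMIT values in a distcc hosts line.
# Options starting with "--" and entries without a /LIMIT are skipped.
total_slots() {
    local entry limit total=0
    for entry in $1; do
        case $entry in
            --*) continue ;;            # e.g. --localslots=11
            */*) limit=${entry#*/}      # drop "host/"
                 limit=${limit%%,*}     # drop ",lzo" and similar
                 total=$(( total + limit )) ;;
        esac
    done
    echo "$total"
}

hosts_line='localhost/11 sophie/5,lzo tobey/3,lzo --localslots=11 --localslots_cpp=11'
echo "distcc slots: $(total_slots "$hosts_line"), MAKEOPTS -j: 17"
```

For the main box this gives 19 slots against -j17, so make, not the hosts file, is the tighter of the two limits there.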

This configuration is the result of a lot of experimentation rather than
just a theoretical calculation. The various guides that discuss how to tune
these numbers for best performance were modestly helpful in explaining what
the tuning parameters mean, but experimenting and watching the resulting
performance was the best teacher.

Hope this helps.

John Blinka
Re: Emerge load again
On Wednesday, 29 November 2023 14:12:39 GMT John Blinka wrote:
> On Mon, Nov 27, 2023 at 10:39 AM Peter Humphreey <peter@prh.myzen.co.uk>
> wrote:
>
> > What am I missing?
>
> I have much less powerful hardware than you but libreoffice (as a
> stand-alone build) generates many more threads than 4 on my “cluster”. I’m
> also using distcc.
>
> On the main box, I set
> MAKEOPTS="-j17 -l6"
> On the other two less powerful ones -l is 5 and 3, but -j is the same.
>
> On the main box, /etc/distcc/hosts contains
> localhost/11 sophie/5,lzo tobey/3,lzo --localslots=11 --localslots_cpp=11
>
> On sophie and tobey (my less powerful boxes) the hosts file contains
> something similar but specific to those boxes. The localslots and
> localslots_cpp numbers are 3 on tobey and 5 on sophie, and the order in
> which the machines are mentioned changes (local machine first, then remote
> machines in order of power).
>
> This configuration is the result of a lot of experimentation rather than
> just a theoretical calculation. The various guides that discuss how to tune
> these numbers for best performance were modestly helpful in explaining what
> the tuning parameters mean, but experimenting and watching the resulting
> performance was the best teacher.
>
> Hope this helps.
>
> John Blinka

I don't use distcc, so I can't add anything useful to its application on
Peter's requirements, but a quick test by Peter would be to start a single
emerge of libreoffice on its own and observe if it is still limited to 4
threads with and without distcc.
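A rough way to run that test is sketched here. It assumes the build uses gcc, whose compiler processes show up as cc1 (C) and cc1plus (C++). count_compilers is a made-up helper name, and the emerge invocations are shown as comments because they need root and a long build.

```shell
# Count compiler back-end processes in a list of command names read from
# stdin, one name per line. grep -c exits non-zero when the count is 0,
# hence the "|| true".
count_compilers() {
    grep -c -E '^(cc1|cc1plus)$' || true
}

# While a build runs, from a second terminal:
#   ps -eo comm= | count_compilers
#
# The two builds to compare, one after the other (as root):
#   emerge --oneshot --jobs=1 app-office/libreoffice
#   FEATURES="-distcc" emerge --oneshot --jobs=1 app-office/libreoffice
```

Sampling the count every few seconds during each build should show whether the ceiling of 4 follows distcc or stays with the package.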
Re: Emerge load again
On 2023-11-29, Michael wrote:

> On Monday, 27 November 2023 15:39:33 GMT Peter Humphreey wrote:
>> Hello list,
>>
>> I still can't see how portage limits the load. Today I'm emerging
>> libreoffice, and it's spending almost the whole time working with 4 CPU
>> threads. But:
>>
>> $ grep -e '\-j' -e distcc /etc/portage/make.conf
>> EMERGE_DEFAULT_OPTS="--jobs=18 --load-average=30 --backtrack=200 --autounmask=n --keep-going --nospinner"
>> FEATURES="distcc userfetch buildpkg network-sandbox parallel-install sandbox userpriv usersandbox"
>> MAKEOPTS="-j18"
>>
>> I found a suggestion to use distcc in the installation handbook, which I
>> hadn't seen there before, so I went searching for it and found how to do it.
>> It usually works well, in this case starting 18 packages before starting LO
>> itself. grep -rw doesn't find '4' anywhere relevant under /etc/portage/ .
>> Other times it just doesn't help at all.
>>
>> What am I missing?
>
> In absence of other contributions I'll offer a theoretical explanation, based
> on random observations on my systems.

I can't explain the 4, but one thing about this configuration (although
it's possible this has been already discussed before, apologies if
that's the case):

> You have specified as many as 18 packages to be emerged in parallel x up to 18
> make jobs each. The result of [18 x 18 = 324] is to be limited by a total
> load average of 30.
[...]
> Were this to occur the load limit restriction would kick in and you would see
> only up to 30 jobs listed in top, with individual package processes
> alternating in the top list of make threads.

The load limit is being set only for emerge, not make, so it would only
affect the decision to start building more packages in parallel. The
already started ongoing builds could still take the load beyond 30, with
more than 30 processes - there is nothing set to prevent that, or is
there?
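Whichever reading of --load-average is right, one way to make the limit explicit at both levels is to give make its own load limit; GNU make accepts -l (--load-average) for this. A sketch of the make.conf lines, reusing the figures from the original post. The value 30 for make is an assumption for illustration, not a recommendation.

```shell
# /etc/portage/make.conf (fragment)
# emerge level: how many packages to build at once, and the load average
# above which emerge holds back further packages.
EMERGE_DEFAULT_OPTS="--jobs=18 --load-average=30"
# make level: up to 18 jobs per package, but no additional job is started
# while the load average is at or above 30.
MAKEOPTS="-j18 -l30"
```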

--
Nuno Silva
Re: Re: Emerge load again
On Thursday, 30 November 2023 10:16:25 GMT Nuno Silva wrote:
> On 2023-11-29, Michael wrote:
> > On Monday, 27 November 2023 15:39:33 GMT Peter Humphreey wrote:
> >> Hello list,
> >>
> >> I still can't see how portage limits the load. Today I'm emerging
> >> libreoffice, and it's spending almost the whole time working with 4 CPU
> >> threads. But:
> >>
> >> $ grep -e '\-j' -e distcc /etc/portage/make.conf
> >> EMERGE_DEFAULT_OPTS="--jobs=18 --load-average=30 --backtrack=200 --autounmask=n --keep-going --nospinner"
> >> FEATURES="distcc userfetch buildpkg network-sandbox parallel-install sandbox userpriv usersandbox"
> >> MAKEOPTS="-j18"
> >>
> >> I found a suggestion to use distcc in the installation handbook, which I
> >> hadn't seen there before, so I went searching for it and found how to do
> >> it. It usually works well, in this case starting 18 packages before
> >> starting LO itself. grep -rw doesn't find '4' anywhere relevant under
> >> /etc/portage/ . Other times it just doesn't help at all.
> >>
> >> What am I missing?
> >
> > In absence of other contributions I'll offer a theoretical explanation,
> > based on random observations on my systems.
>
> I can't explain the 4, but one thing about this configuration (although
> it's possible this has been already discussed before, apologies if
> that's the case):
>
> > You have specified as many as 18 packages to be emerged in parallel x up
> > to 18 make jobs each. The result of [18 x 18 = 324] is to be limited by
> > a total load average of 30.
>
> [...]
>
> > Were this to occur the load limit restriction would kick in and you would
> > see only up to 30 jobs listed in top, with individual package processes
> > alternating in the top list of make threads.
>
> The load limit is being set only for emerge, not make, so it would only
> affect the decision to start building more packages in parallel. The
> already started ongoing builds could still take the load beyond 30, with
> more than 30 processes - there is nothing set to prevent that, or is
> there?

As I understand it any tasks the emerge command is spawning, including make
jobs, will be respectful of the '--load-average 30.0'. When only MAKEOPTS is
specified, then a '-l 30.0' would be needed there to apply the same load limit
average.
Re: Re: Emerge load again
On Thursday, 30 November 2023 10:16:25 GMT Nuno Silva wrote:

> The load limit is being set only for emerge, not make, so it would only
> affect the decision to start building more packages in parallel. The
> already started ongoing builds could still take the load beyond 30, with
> more than 30 processes - there is nothing set to prevent that, or is
> there?

Yes, according to that web site I found, distcc will limit the number, and it
does seem to do so. What puzzles me is that I can't get LO to start any other
number of make jobs than 4.

--
Regards,
Peter.
Re: Emerge load again
On Wednesday, 29 November 2023 12:06:15 GMT Peter Humphreey wrote:
> On Wednesday, 29 November 2023 10:26:36 GMT Michael wrote:
> > Here's my hypothesis explaining your own observation with libreoffice. As
> > a package or more finished emerging, libreoffice's turn comes up. Soon
> > libreoffice starts to execute make jobs, but any of the following may
> > apply:
> >
> > 1. There are only 4 out of 30 jobs available, because other packages are
> > already using 26, throughout your window of observation.
>
> Nope. Nothing else in progress.
>
> > 2. Libreoffice sequencing of make jobs is mostly linear with succeeding
> > make jobs waiting on output from their predecessors.
>
> That's possible, but it doesn't seem likely with such a huge code base. And
> why four processes, specifically and consistently?
>
> > 3. Libreoffice source code is not optimised for high parallelism - I
> > recall when it was hardcoded at -j1 just a few years ago. Before this
> > restriction was added, any bug reporters were advised to try again after
> > limiting make to -j1.
>
> Yes, that was common to many packages for a long time because of incomplete
> optimisation.
>
> > Next time I'm building libreoffice on a beefier system I'll keep an eye
> > out for the number of jobs to see what it gets up to.
>
> That would help, yes.

OK, I eventually got around to it. I am observing right now LO is building
with as many as 24 jobs:

top - 11:14:59 up 2:19, 2 users, load average: 24.46, 23.15, 9.51
Tasks: 474 total, 25 running, 449 sleeping, 0 stopped, 0 zombie
%Cpu(s): 0.2 us, 5.6 sy, 94.0 ni, 0.2 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
MiB Mem : 64217.1 total, 50028.6 free, 6233.7 used, 7954.9 buff/cache
MiB Swap: 0.0 total, 0.0 free, 0.0 used. 54333.4 avail Mem

I don't use distcc. The make -j25 -l24.8 I have specified is respected.
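The figures in that top output fit how make's -l option is documented to behave: no further job is started while the load average is at or above the limit. A simplified model of that check, with a made-up function name; real make samples the load itself and always keeps at least one job going.

```shell
# Simplified model of make's -l check: another job may start only while
# the current load average is below the limit. awk does the comparison
# because the shell has no floating-point arithmetic.
may_start_job() {
    local load="$1" limit="$2"
    awk -v load="$load" -v limit="$limit" 'BEGIN { exit !(load < limit) }'
}

# 1-minute load from the top output above, limit from "make -j25 -l24.8".
if may_start_job 24.46 24.8; then
    echo "below the limit: make may start another job"
else
    echo "at or above the limit: make holds back"
fi
```

A load hovering just under 24.8 with about 25 running tasks is what one would expect when the -l value, rather than -j, is the active constraint.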

> The contribution of distcc isn't clear to me yet, as I said before.
> Sometimes it's the bee's knees; other times it might just as well not be
> there. I don't like mysteries... :)
Re: Emerge load again
On Saturday, 6 January 2024 11:44:20 GMT Michael wrote:
> On Wednesday, 29 November 2023 12:06:15 GMT Peter Humphreey wrote:
> > On Wednesday, 29 November 2023 10:26:36 GMT Michael wrote:
> > > Here's my hypothesis explaining your own observation with libreoffice.
> > > As a package or more finished emerging, libreoffice's turn comes up.
> > > Soon libreoffice starts to execute make jobs, but any of the following
> > > may apply:
> > >
> > > 1. There are only 4 out of 30 jobs available, because other packages are
> > > already using 26, throughout your window of observation.
> >
> > Nope. Nothing else in progress.
> >
> > > 2. Libreoffice sequencing of make jobs is mostly linear with succeeding
> > > make jobs waiting on output from their predecessors.
> >
> > That's possible, but it doesn't seem likely with such a huge code base.
> > And why four processes, specifically and consistently?
> >
> > > 3. Libreoffice source code is not optimised for high parallelism - I
> > > recall when it was hardcoded at -j1 just a few years ago. Before this
> > > restriction was added, any bug reporters were advised to try again
> > > after limiting make to -j1.
> >
> > Yes, that was common to many packages for a long time because of
> > incomplete optimisation.
> >
> > > Next time I'm building libreoffice on a beefier system I'll keep an eye
> > > out for the number of jobs to see what it gets up to.
> >
> > That would help, yes.
>
> OK, I eventually got around to it. I am observing right now LO is building
> with as many as 24 jobs:
>
> top - 11:14:59 up 2:19, 2 users, load average: 24.46, 23.15, 9.51
> Tasks: 474 total, 25 running, 449 sleeping, 0 stopped, 0 zombie
> %Cpu(s): 0.2 us, 5.6 sy, 94.0 ni, 0.2 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
> MiB Mem : 64217.1 total, 50028.6 free, 6233.7 used, 7954.9 buff/cache
> MiB Swap: 0.0 total, 0.0 free, 0.0 used. 54333.4 avail Mem
>
> I don't use distcc. The make -j25 -l24.8 I have specified is respected.

Interesting. Thanks.

> > The contribution of distcc isn't clear to me yet, as I said before.
> > Sometimes it's the bee's knees; other times it might just as well not be
> > there. I don't like mysteries... :)

I've decided to ditch distcc altogether. During the very first build, what it
grants with one hand it takes away double with the other - lots of tiny jobs
all started together, but then gcc is compiled with just two threads. That
just-two happens on at least two different machines (not just separate;
different).

The position is no better in regular maintenance: no matter how many /make/
tasks are needed, I get just two threads compiling at a time. (I'm referring
to the single-host arrangement I mentioned at the start.)

I'm baffled, and I don't like it; I much prefer understanding to mystery.

--
Regards,
Peter.
Re: Emerge load again
On 29/11/2023 12:06, Peter Humphreey wrote:
> The contribution of distcc isn't clear to me yet, as I said before. Sometimes
> it's the bee's knees; other times it might just as well not be there. I don't
> like mysteries... :)

As far as I'm aware, there's no mystery. On a single machine you get the
exact same thing ... it's all down to parallelism.

Make asks itself "how many separate tasks can I do at the same time,
which won't interfere with each other". In gcc's case, the answer
appears to be two. It doesn't matter how much resource is available,
make can only make use of two cores.

In other cases, there may be a hundred separate tasks, make fires off a
hundred tasks shared amongst all the resource it can find, and sits back
and waits.

Think of a hundred compile jobs all running at the same time, but then
the linker is invoked, and you can only have the one linker running,
after all the compile jobs have finished.

And this is a HARD problem, I haven't seen it recently, but there used
to be plenty of threads about hard-to-debug compile failures that went
away with -j1. The obvious cause was two compile jobs being set off in
parallel, when in reality one depended on the other, and things messed up.
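The compile-then-link shape described above can be mimicked with a toy shell model. This is only an illustration of the dependency barrier, not how make is implemented; build is a made-up function and the sleep stands in for real compilation.

```shell
# Toy model of a build: n independent "compile" steps run in parallel,
# then a single "link" step that cannot start until all of them finish.
build() {
    local n="$1" i=1
    while [ "$i" -le "$n" ]; do
        ( sleep 0.1; echo "compiled unit $i" ) &   # parallel phase
        i=$(( i + 1 ))
    done
    wait                                           # barrier before linking
    echo "linked $n objects"                       # serial phase, one process
}

build 4
```

However many cores are free, the last line is always produced by one process on its own, which is the kind of stretch where a process monitor shows only a handful of busy threads.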

Cheers,
Wol
Re: Emerge load again
On Saturday, 6 January 2024 15:28:53 GMT Wols Lists wrote:

> As far as I'm aware, there's no mystery. On a single machine you get the
> exact same thing ... it's all down to parallelism.
>
> Make asks itself "how many separate tasks can I do at the same time,
> which won't interfere with each other". In gcc's case, the answer
> appears to be two. It doesn't matter how much resource is available,
> make can only make use of two cores.

Yet, if I set -distcc and -j12 -l12, I get 12 threads in parallel. That's the
mystery.

> In other cases, there may be a hundred separate tasks, make fires off a
> hundred tasks shared amongst all the resource it can find, and sits back
> and waits.

And that's how the very first installation goes, with single-host distcc. Then,
when it gets to gcc, it collapses to 2 threads and everything gained so far is
lost many-fold. (I set USE=-fortran to avoid pointless recompilation, since
nothing needs it here.)

> Think of a hundred compile jobs all running at the same time, but then
> the linker is invoked, and you can only have the one linker running,
> after all the compile jobs have finished.

I hadn't thought of that - another thing to consider.

> And this is a HARD problem, I haven't seen it recently, but there used
> to be plenty of threads about hard-to-debug compile failures that went
> away with -j1. The obvious cause was two compile jobs being set off in
> parallel, when in reality one depended on the other, and things messed up.

I haven't either - seen it recently.

--
Regards,
Peter.
Re: Emerge load again
On 06/01/2024 17:52, Peter Humphrey wrote:
>> In other cases, there may be a hundred separate tasks, make fires off a
>> hundred tasks shared amongst all the resource it can find, and sits back
>> and waits.

> And that's how the very first installation goes, with single-host distcc. Then,
> when it gets to gcc, it collapses to 2 threads and everything gained so far is
> lost many-fold. (I set USE=-fortran to avoid pointless recompilation, since
> nothing needs it here.)

So if it's consistently gcc that collapses to two threads, then
something (maybe explicit settings, maybe dependencies, maybe yadda
yadda) is telling make that only two jobs can run at the same time else
they'll trip over each other.

Could be a dev has hard-coded the "two jobs" rule to make those random
crashes go away :-) Or maybe they found the problem, and that's why only
two jobs can run in parallel.

Cheers,
Wol
Re: Emerge load again
On Saturday, 6 January 2024 19:31:59 GMT Wols Lists wrote:
> On 06/01/2024 17:52, Peter Humphrey wrote:
> >> In other cases, there may be a hundred separate tasks, make fires off a
> >> hundred tasks shared amongst all the resource it can find, and sits back
> >> and waits.
> >
> > And that's how the very first installation goes, with single-host distcc.
> > Then, when it gets to gcc, it collapses to 2 threads and everything
> > gained so far is lost many-fold. (I set USE=-fortran to avoid pointless
> > recompilation, since nothing needs it here.)
>
> So if it's consistently gcc that collapses to two threads, then
> something (maybe explicit settings, maybe dependencies, maybe yadda
> yadda) is telling make that only two jobs can run at the same time else
> they'll trip over each other.
>
> Could be a dev has hard-coded the "two jobs" rule to make those random
> crashes go away :-) Or maybe they found the problem, and that's why only
> two jobs can run in parallel.

Not so. As I said last time: 'if I set -distcc and -j12 -l12, I get 12 threads
in parallel'.

--
Regards,
Peter.
Re: Emerge load again
>
> > So if it's consistently gcc that collapses to two threads, then
> > something (maybe explicit settings, maybe dependencies, maybe yadda
> > yadda) is telling make that only two jobs can run at the same time else
> > they'll trip over each other.
> >
> > Could be a dev has hard-coded the "two jobs" rule to make those random
> > crashes go away :-) Or maybe they found the problem, and that's why only
> > two jobs can run in parallel.
>
> Not so. As I said last time: 'if I set -distcc and -j12 -l12, I get 12
> threads in parallel'.
>

Have you checked you're not limiting jobs in /etc/distcc/hosts? ie no '/2'
after the IP address?
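A quick way to run that check is to print every explicit limit in the hosts file. list_limits is a hypothetical helper and assumes the plain host/LIMIT[,option] entry form; the path is the one mentioned above.

```shell
# Print "host limit" for every entry in a distcc hosts file that carries
# an explicit /LIMIT. Comments and "--" options are ignored.
list_limits() {
    local entry limit
    for entry in $(sed 's/#.*//' "$1"); do
        case $entry in
            --*) continue ;;
            */*) limit=${entry#*/}                     # drop "host/"
                 echo "${entry%%/*} ${limit%%,*}" ;;   # drop ",lzo" etc.
        esac
    done
}

# Usage:
#   list_limits /etc/distcc/hosts
```

Any line of output ending in a small number such as 2 would be a candidate explanation for a hard ceiling on parallel compiles.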
Re: Emerge load again
On Sunday, 7 January 2024 00:54:12 GMT Adam Carter wrote:
> > > So if it's consistently gcc that collapses to two threads, then
> > > something (maybe explicit settings, maybe dependencies, maybe yadda
> > > yadda) is telling make that only two jobs can run at the same time else
> > > they'll trip over each other.
> > >
> > > Could be a dev has hard-coded the "two jobs" rule to make those random
> > > crashes go away :-) Or maybe they found the problem, and that's why
> > > only two jobs can run in parallel.
> >
> > Not so. As I said last time: 'if I set -distcc and -j12 -l12, I get 12
> > threads in parallel'.
>
> Have you checked you're not limiting jobs in /etc/distcc/hosts? ie no '/2'
> after the IP address?

$ cat /etc/distcc/hosts
localhost/12

--
Regards,
Peter.