Mailing List Archive

Re: what is normal conserver hang during reconfig
Running v 8.1.18. Rereading the SIGHUP section of the man page I'm
still thinking I've configured something wrong. SIGHUP says conserver
rereads the config files and then adds/deletes consoles as needed and
only touches running consoles if they have changed. If that's true I
wouldn't expect a 30s buffer of input/output on a console that hasn't
changed, should I?
I also don't see anything in CHANGES that sounds like this is a bug
that has been fixed.

-denis

On Fri, Oct 14, 2016 at 12:05:44PM -0400, Denis Hainsworth wrote:
> I love conserver. I have a minor issue and I was curious what options
> there might be.
>
> So I have a conserver setup running against 262 servers (mostly digis or
> ser2net machines). It works great. However when we need to update due
> to a config change we run "kill -HUP" against the parent. With the
> number of consoles (I think) this causes about a 30s "hang" when
> interacting with any console which corresponds to the reconfig time.
>
> Does this make sense and is per the current design? Any chance there is
> a clever way to make it block for less time? Barring that I intend to
> spin up a new server to share the load of my current server and reduce
> the reconfig time.
>
> I was mostly curious if there was a config issue or if this description
> doesn't make any sense to folks and it means I have something else going
> on like too many down consoles or something.
> -denis
>
> --
> __________________________
> Denis Alan Hainsworth
> denis.hainsworth@gmail.com

--
__________________________
Denis Alan Hainsworth
denis.hainsworth@gmail.com
_______________________________________________
users mailing list
users@conserver.com
https://www.conserver.com/mailman/listinfo/users
Re: what is normal conserver hang during reconfig
Off the top of my head, I agree that there shouldn't be anything fixed in the newer code to address this. The code does block all activity when it processes a HUP signal, but that's supposed to be "quick". :-|

Each process (the main and children) rereads the config file and figures out if there's anything to do. The main process is in charge of spawning new consoles (or reconfigured), and the children are responsible for letting go of old ones (or reconfigured).

With that in mind, how many consoles are each child managing? The compile time default can be seen with a "conserver -V", but it can be overridden with -m. I'm honestly not sure if having more or less would be better or even change things (more processes would use more cores, but also "slam" the system with that many things reading and processing the config).
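
For a rough sense of scale, the number of processes rereading the config can be estimated from the console count and the -m value. This is only a minimal sketch, assuming children are filled up to the -m limit (the real grouping may differ); the function name and the value 16 are placeholders, so substitute whatever "conserver -V" reports.

```python
import math


def estimate_processes(num_consoles: int, max_per_child: int) -> int:
    """Estimate how many processes reread the config on a HUP.

    Assumes each child is filled to max_per_child consoles, plus one
    main process. The real grouping may differ.
    """
    if num_consoles < 0 or max_per_child <= 0:
        raise ValueError("num_consoles must be >= 0 and max_per_child > 0")
    children = math.ceil(num_consoles / max_per_child)
    return children + 1  # +1 for the main process


if __name__ == "__main__":
    # 262 consoles is the figure from this thread; 16 is a placeholder.
    print(estimate_processes(262, 16))
```

With 262 consoles and a hypothetical limit of 16, that is 17 children plus the main process, so 18 processes each reading and parsing the config on a HUP.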

Conserver tries very hard to multiplex across all the consoles, even when bringing up and tearing down things. The reread of the config puts all that on hold, so it probably has to do with that.

One issue I've seen before is the magnitude of DNS lookups done when a config is loaded. It all depends on the config, of course, but you could end up generating a lot of requests. Maybe it doesn't apply in your environment, but it can be an unexpected source of trouble.
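
One way to check whether name resolution is slow is to time the lookups outside of conserver. The sketch below is not how conserver resolves names; it just times one lookup per host using Python's socket.gethostbyname. The function name time_lookups and its resolve parameter are made up for illustration.

```python
import socket
import time
from typing import Callable, Iterable, List, Tuple


def time_lookups(
    hosts: Iterable[str],
    resolve: Callable[[str], object] = socket.gethostbyname,
) -> List[Tuple[str, float, bool]]:
    """Resolve each host once; return (host, seconds, succeeded)."""
    results = []
    for host in hosts:
        start = time.perf_counter()
        try:
            resolve(host)
            ok = True
        except OSError:  # socket.gaierror and socket.herror derive from OSError
            ok = False
        results.append((host, time.perf_counter() - start, ok))
    return results
```

Passing in the host names from the config and sorting the result by the seconds column should show which names are slow or failing, if any.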

Aside from that, another server will certainly share the load (and, set up right, the end users won't even notice). It would be interesting to look at an strace (assuming linux) of a process when it gets a HUP (even without any changes to configs). Just send one of the children a HUP so it minimizes the impact. With timestamps, it might highlight what is causing the issue (like the DNS query case, but could be anything).
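
Once a timestamped trace exists, the long pauses can be picked out mechanically. A minimal sketch, assuming the trace was written with strace -t or -tt so each line starts with HH:MM:SS (optionally preceded by a PID column); largest_gaps is a hypothetical helper, not part of strace or conserver.

```python
import re
from datetime import datetime, timedelta
from typing import Iterable, List, Tuple

# Optional PID column, then HH:MM:SS with an optional fractional part.
_LINE = re.compile(r"^(?:\d+\s+)?(\d{2}:\d{2}:\d{2}(?:\.\d+)?)\s+(.*)$")


def _parse_time(stamp: str) -> datetime:
    fmt = "%H:%M:%S.%f" if "." in stamp else "%H:%M:%S"
    return datetime.strptime(stamp, fmt)


def largest_gaps(lines: Iterable[str], top: int = 5) -> List[Tuple[float, str]]:
    """Return the `top` longest pauses as (seconds, call preceding the pause)."""
    gaps = []
    prev_time = prev_call = None
    for line in lines:
        match = _LINE.match(line)
        if not match:
            continue  # skip lines without a leading timestamp
        now = _parse_time(match.group(1))
        if prev_time is not None:
            delta = now - prev_time
            if delta < timedelta(0):  # the trace crossed midnight
                delta += timedelta(days=1)
            gaps.append((delta.total_seconds(), prev_call))
        prev_time, prev_call = now, match.group(2)
    return sorted(gaps, key=lambda g: g[0], reverse=True)[:top]
```

Feeding it the lines of the trace file, e.g. largest_gaps(open("strace.out")) with whatever file name was given to strace -o, lists the calls that were followed by the longest pauses.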

Bryan

> On Oct 18, 2016, at 6:04 PM, Denis Hainsworth via users <users@conserver.com> wrote:
>
> Running v 8.1.18. Rereading the SIGHUP section of the man page I'm
> still thinking I've configured something wrong. SIGHUP says conserver
> rereads the config files and then adds/deletes consoles as needed and
> only touches running consoles if they have changed. If that's true I
> wouldn't expect a 30s buffer of input/output on a console that hasn't
> changed, should I?
> I also don't see anything in CHANGES that sounds like this is a bug
> that has been fixed.
>
> -denis
Re: what is normal conserver hang during reconfig
> From: "Bryan Stansell via users" <users@conserver.com>
> To: users@conserver.com
> Sent: Tuesday, October 18, 2016 11:39:19 PM
> Subject: Re: what is normal conserver hang during reconfig

> With that in mind, how many consoles are each child managing? The compile time
> default can be seen with a "conserver -V", but it can be overridden with -m.
> I'm honestly not sure if having more or less would be better or even change
> things (more processes would use more cores, but also "slam" the system with
> that many things reading and processing the config).

> Conserver tries very hard to multiplex across all the consoles, even when
> bringing up and tearing down things. The reread of the config puts all that on
> hold, so it probably has to do with that.

> One issue I've seen before is the magnitude of DNS lookups done when a config is
> loaded. It all depends on the config, of course, but you could end up
> generating a lot of requests. Maybe it doesn't apply in your environment, but
> it can be an unexpected source of trouble.

Does a HUP close and open consoles? Does a HUP open consoles that are down? If it is going after consoles that are down and blocking, that could be what is going on.

On a local device I can manage 100 consoles with ease. Just a couple are serial; the rest are programs or log file tails.

Chris

Re: what is normal conserver hang during reconfig
> On Oct 18, 2016, at 9:16 PM, Chris Fowler via users <users@conserver.com> wrote:
>
> Does a HUP close and open consoles? Does a HUP open consoles that are down? If it is going after consoles that are down and blocking, that could be what is going on.
>

A HUP doesn't close and open consoles (it will reopen log files though). It will try and open anything down (socket connections are set up to be non-blocking).

Bryan


Re: what is normal conserver hang during reconfig
> From: "Bryan Stansell via users" <users@conserver.com>
> To: users@conserver.com
> Sent: Wednesday, October 19, 2016 1:10:43 AM
> Subject: Re: what is normal conserver hang during reconfig

>> On Oct 18, 2016, at 9:16 PM, Chris Fowler via users <users@conserver.com> wrote:

>> Does a HUP close and open consoles? Does a HUP open consoles that are down? If
>> it is going after consoles that are down and blocking, that could be what is
>> going on.


> A HUP doesn't close and open consoles (it will reopen log files though). It will
> try and open anything down (socket connections are set up to be non-blocking).

A slowdown with DNS lookups could be the culprit. At this point I'd strace it to see where it is spending time.
Re: what is normal conserver hang during reconfig
Thanks for the ideas guys, I'll see what I can dig up. I only realized
last night my first email was sent before I updated my subscription
address so the list just quietly ignored it :)
-denis
Re: what is normal conserver hang during reconfig
Finally got time to look at things. strace is perfect, thanks for
suggesting that.

So running something like
strace -t -o strace.out.2 -p 3198
and sending a SIGHUP to the parent process showed the issue.

So the way we've always set things up was to automatically generate one
config file per console server from our equipment database. This means
there are 264 files that are #included into the main config file.
The first 30s of "hang" is each process opening each file, reading it in,
and closing it. I'm wondering if we need to block I/O during this, or
perhaps that could be done before we start blocking?
Once that is done there is another 10s of hang while we do the DNS
lookup for each console host, as you thought (open /etc/hosts, make a DNS
query, resolve it).
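
To separate raw file I/O from everything else that happens during a reread, the same files can be read outside of conserver and timed. A minimal sketch; time_reads is a hypothetical helper, and the paths would be whatever files the main config includes. A second run will likely be served from the page cache, so the first run is the more telling one.

```python
import time
from pathlib import Path
from typing import Iterable, Tuple


def time_reads(paths: Iterable[str]) -> Tuple[int, int, float]:
    """Open, read and close each file; return (files, bytes, seconds)."""
    count = total_bytes = 0
    start = time.perf_counter()
    for path in paths:
        total_bytes += len(Path(path).read_bytes())
        count += 1
    return count, total_bytes, time.perf_counter() - start
```

If a few hundred small files read in milliseconds this way, the time seen in the trace is probably being spent somewhere other than the disk.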

I tried putting all the configs into one file but that didn't change
anything. So then I started wondering. Our IT had long ago made the
console servers VMs. It's never seemed like an issue, but I compared
some basic dd commands and found my problem server has terrible I/O
throughput ... sigh. To compare, one of my good servers has about
80Mbp/s read/write and the bad one has around 15Mbp/s read/write.

So I'm going to look into moving the VM or getting the disk perf up, which
should solve most of my issues, but I also wonder if the conserver code
could be re-organized without too much trouble to avoid issues of
blocking when there is a slow disk? It's possible what I'm asking is dumb,
just throwing it out there.

-denis

Re: what is normal conserver hang during reconfig
I'm glad you were able to find the source of "most" of your troubles. I quote that because, yes, theoretically the code could be a lot nicer and not block while reconfiguring. The code that does that never got folded into the loop that handles I/O, but could...and really should. No one has ever called it out as a serious enough problem before. :-)

I'll certainly put it on the list to look at...but it's not a "simple" change, that's for sure.

Bryan

> On Oct 22, 2016, at 11:42 PM, Denis Hainsworth via users <users@conserver.com> wrote:
>
> Finally got time to look at things. strace is perfect, thanks for
> suggesting that.
>
> So running something like
> strace -t -o strace.out.2 -p 3198
> and sending a SIGHUP to the parent process showed the issue.
>
> So the way we've always set things up was to automatically generate one
> config file per console server from our equipment database. This means
> there are 264 files that are #included into the main config file.
> The first 30s of "hang" is each process opening each file, reading it in,
> and closing it. I'm wondering if we need to block I/O during this, or
> perhaps that could be done before we start blocking?
> Once that is done there is another 10s of hang while we do the DNS
> lookup for each console host, as you thought (open /etc/hosts, make a DNS
> query, resolve it).
>
> I tried putting all the configs into one file but that didn't change
> anything. So then I started wondering. Our IT had long ago made the
> console servers VMs. It's never seemed like an issue, but I compared
> some basic dd commands and found my problem server has terrible I/O
> throughput ... sigh. To compare, one of my good servers has about
> 80Mbp/s read/write and the bad one has around 15Mbp/s read/write.
>
> So I'm going to look into moving the VM or getting the disk perf up, which
> should solve most of my issues, but I also wonder if the conserver code
> could be re-organized without too much trouble to avoid issues of
> blocking when there is a slow disk? It's possible what I'm asking is dumb,
> just throwing it out there.
>
> -denis


Re: what is normal conserver hang during reconfig
Dang it, my theory didn't pan out. While the slower of the two did in
fact have slower disks my IT was able to move the VM to some ultra fast
storage and my reconfig loop wasn't any faster. :( And it was such a
lovely theory too.

So I'm still digging to see if I can come up with a second clever idea,
but I have a feeling that to reduce the reconfig time I'll just have to spread
the load over more systems.

-denis

On Sun, Oct 23, 2016 at 10:34:36AM -0700, Bryan Stansell via users wrote:
> I'm glad you were able to find the source of "most" of your troubles. I quote that because, yes, theoretically the code could be a lot nicer and not block while reconfiguring. The code that does that never got folded into the loop that handles I/O, but could...and really should. No one has ever called it out as a serious enough problem before. :-)
>
> I'll certainly put it on the list to look at...but it's not a "simple" change, that's for sure.
>
> Bryan
Re: what is normal conserver hang during reconfig
So I think I have a solution which avoids the issue rather than fixing
anything :)

I had to try to recall why we have things the way we do since it's going
on like 10 years or more (have I said I love conserver?)

So back in the day we set up the single conserver instance which was
obviously going to be the master. We later started populating a couple
other servers at a few different sites. To keep things simple and
robust every server got the same config set and could be the master.
However in reality I'm pretty sure no one uses anything but the default
master ever. So if I reduce the configs on all the slave servers,
especially the ones that reconfig the most and are causing the most
grief to the users when it takes 40s, to only the configs they actually
own then my times drop down to 3s and 13s respectively.
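
The trimming described above could be generated from the same equipment data that produces the per-console-server files. A minimal sketch, assuming a mapping from each generated config file to the conserver host that owns it; the mapping layout, file names, host names and the includes_for helper are all hypothetical, and the default master would presumably keep the full set.

```python
from typing import Dict, List


def includes_for(host: str, owners: Dict[str, str]) -> List[str]:
    """Return '#include' lines for the config files owned by `host`.

    `owners` maps a per-console-server config file to the conserver
    host that actually manages it (hypothetical data layout).
    """
    return [
        f"#include {path}"
        for path, owner in sorted(owners.items())
        if owner == host
    ]


if __name__ == "__main__":
    # Made-up file and host names, for illustration only.
    owners = {
        "/etc/conserver/ts-a1.cf": "cons-site-a",
        "/etc/conserver/ts-a2.cf": "cons-site-a",
        "/etc/conserver/ts-b1.cf": "cons-site-b",
    }
    print("\n".join(includes_for("cons-site-a", owners)))
```

Each server's main config then pulls in only the files it owns, so a HUP has far fewer files to read and host names to resolve.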

Not a perfect solution but it should require a minimum of changes to
everyone involved. Hopefully no one will remind me tomorrow of
something I forgot :)

-denis (purveyor of good enough solutions)

On Sun, Oct 23, 2016 at 01:57:35PM -0400, Denis Hainsworth wrote:
> Dang it, my theory didn't pan out. While the slower of the two did in
> fact have slower disks my IT was able to move the VM to some ultra fast
> storage and my reconfig loop wasn't any faster. :( And it was such a
> lovely theory too.
>
> So I'm still digging to see if I can come up with a second clever idea,
> but I have a feeling that to reduce the reconfig time I'll just have to spread
> the load over more systems.
>
> -denis
Re: what is normal conserver hang during reconfig
To Bryan's point about DNS lookups, I expect that my main conserver will
be the first thing online (and then the network comes up...) and it will
be the last thing down. As a result, I have all of my console servers
listed in the /etc/hosts file, and I look at the file first. I have 67
conserver child processes with 16 ports under each, and my HUP is just a
few seconds.
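
A quick way to see which console server names would still fall through to DNS is to compare them against the hosts file. A minimal sketch, assuming hosts(5)-style text (address, then names, with # comments); both function names are made up for illustration. Whether the file is actually consulted before DNS depends on the resolver configuration, e.g. the hosts line in /etc/nsswitch.conf on Linux.

```python
from typing import Iterable, List, Set


def names_in_hosts(hosts_text: str) -> Set[str]:
    """Collect every name and alias from hosts(5)-style text."""
    names = set()
    for line in hosts_text.splitlines():
        fields = line.split("#", 1)[0].split()  # drop comments, then tokenize
        names.update(name.lower() for name in fields[1:])  # skip the address
    return names


def missing_from_hosts(required: Iterable[str], hosts_text: str) -> List[str]:
    """Return the required names that have no entry, in input order."""
    known = names_in_hosts(hosts_text)
    return [name for name in required if name.lower() not in known]
```

Calling missing_from_hosts with the host names from the config and the contents of /etc/hosts returns the names that would still need a DNS query.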

Best regards,

-Z-


--
Train of Lights reminder email list signup - http://tinyurl.com/ncry-announce
Re: what is normal conserver hang during reconfig
Also, my main conserver is a dedicated host, on its own UPS, which is
powered by the data center UPS... maybe it's overkill for some, but I can
tell you if there were any problems shutting down the rest of the world, or
bringing it back online again. (This was a really important feature after
our campus lost PG&E Mains power TWICE in 14 hours last Monday. :-)

All the log files go into /var/consoles/current, and we rotate timestamped
files into /var/consoles/archive.

Best regards,

-Z-


On Tue, Oct 25, 2016 at 8:00 AM, Zonker <consoleteam@gmail.com> wrote:

> To Bryan's point about DNS lookups, I expect that my main conserver will
> be the first thing online (and then the network comes up...) and it will
> be the last thing down. As a result, I have all of my console servers
> listed in the /etc/hosts file, and I look at the file first. I have 67
> conserver child processes with 16 ports under each, and my HUP is just a
> few seconds.
>
> Best regards,
>
> -Z-
>
>
> On Tue, Oct 18, 2016 at 8:39 PM, Bryan Stansell via users <
> users@conserver.com> wrote:
>
>> Off the top of my head, I agree that there shouldn't be anything fixed in
>> the newer code to address this. The code does block all activity when it
>> processes a HUP signal, but that's supposed to be "quick". :-|
>>
>> Each process (the main and children) rereads the config file and figures
>> out if there's anything to do. The main process is in charge of spawning
>> new consoles (or reconfigured), and the children are responsible for
>> letting go of old ones (or reconfigured).
>>
>> With that in mind, how many consoles are each child managing? The
>> compile time default can be seen with a "conserver -V", but it can be
>> overridden with -m. I'm honestly not sure if having more or less would be
>> better or even change things (more processes would use more cores, but also
>> "slam" the system with that many things reading and processing the config).
>>
>> Conserver tries very hard to be multiplex across all the consoles, even
>> when bringing up and tearing down things. The reread of the config puts
>> all that on hold, so it probably has to do with that.
>>
>> One issue I've seen before is the magnitude of DNS lookups done when a
>> config is loaded. It all depends on the config, of course, but you could
>> end up generating a lot of requests. Maybe it doesn't apply in your
>> environment, but it can be an unexpected source of trouble.
>>
>> Aside from that, another server will certainly share the load (and, set
>> up right, the end users won't even notice). It would be interesting to
>> look at an strace (assuming linux) of a process when it gets a HUP (even
>> without any changes to configs). Just send one of the children a HUP so it
>> minimizes the impact. With timestamps, it might highlight what is causing
>> the issue (like the DNS query case, but could be anything).
>>
>> Bryan
>>

--
Train of Lights reminder email list signup - http://tinyurl.com/ncry-announce
Re: what is normal conserver hang during reconfig [ In reply to ]
> From: "Zonker via users" <users@conserver.com>
> To: "Bryan Stansell" <bryan@conserver.com>
> Cc: users@conserver.com
> Sent: Tuesday, October 25, 2016 11:06:48 AM
> Subject: Re: what is normal conserver hang during reconfig

> Also, my main conserver is a dedicated host, on its own UPS, which is powered
> by the data center UPS... maybe it's overkill for some, but I can tell you if
> there were any problems shutting down the rest of the world, or bringing it
> back online again. (This was a really important feature after our campus lost
> PG&E Mains power TWICE in 14 hours last Monday. :-)

I'm different.

All mine are independent. They have their own configs. Console output is stored on their local storage.

The main location runs a program I wrote that pulls the list of what to connect to from a database. It stores output on its local disk. To connect from the main location I have two programs: one uses the console protocol; the other uses SSH to reach the target host and then runs console there.

Chris
Re: what is normal conserver hang during reconfig [ In reply to ]
On Tue, Oct 25, 2016 at 08:00:16AM -0700, Zonker via users wrote:
> To Bryan's point about DNS lookups, I expect that my main conserver will
> be the first thing online (and then the network comes up...) and it will
> be the last thing down. As a result, I have all of my console servers
> listed in the /etc/hosts file, and I look at the file first. I have 67
> conserver child processes with 16 ports under each, and my hup is just a
> few seconds.

Yeah, DNS is a hit, but it's dwarfed by what I was seeing. Understand my
"small" site has 72 children and HUPs in 3s now that I'm reading in
only the configs it manages (aka it's no longer a master).
My large site has 126 children :) even with paring it
down to only the stuff it manages. That's now HUPing in 13s.
We have other boxes that can continue to serve as masters.
-denis
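
For anyone who wants to put a rough number on the DNS share of a reload, here is a minimal Python sketch. It is not part of conserver. It simply times one name lookup per console host. The function names `time_lookups` and `summarize` are made up for illustration, and the host list is assumed to be extracted from your own config by hand. Conserver's actual lookup pattern during a HUP may differ, so treat the result as an estimate only.

```python
import socket
import time


def time_lookups(hosts, resolver=socket.getaddrinfo):
    """Resolve each host once; return a list of (host, seconds, ok) tuples.

    `resolver` defaults to the system resolver, so /etc/hosts entries are
    honoured in whatever order the local resolver configuration dictates.
    """
    results = []
    for host in hosts:
        start = time.perf_counter()
        try:
            resolver(host, None)
            ok = True
        except OSError:  # socket.gaierror is a subclass of OSError
            ok = False
        results.append((host, time.perf_counter() - start, ok))
    return results


def summarize(results):
    """Return (total_seconds, slowest_entry_or_None, failed_hosts)."""
    total = sum(seconds for _, seconds, _ in results)
    slowest = max(results, key=lambda r: r[1]) if results else None
    failed = [host for host, _, ok in results if not ok]
    return total, slowest, failed
```

Calling `summarize(time_lookups(my_hosts))` with the hostnames of your terminal servers gives the total lookup time, the slowest single lookup, and any names that failed to resolve. If the total is small compared with the HUP time you observe, DNS is probably not the main cost.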