Mailing List Archive

Problems with conserver after OS upgrade
Apologies to those who feel this is an OS problem; you're
likely right. That, and/or a "stupid user trick".

I was running NetBSD 4.0_STABLE for more than a year, with
conserver 8.1.16 built from pkgsrc. This worked fine. I'd written a
perl program to auto-login over a [telnet] connection to port 2000+ on
my cisco access server. (I tried ssh at some point, but it appears I
wasn't using that. Maybe I never got it to work?)

Anyway, I recently upgraded the base OS on the console server, and
built a new version of conserver, still 8.1.16. Now, I seem to have
no working consoles.

If I run "console -x", I simply get no output. If I try to console
to a console, I get a brief pause followed by:

console: forwarding level too deep!

Interestingly, I'm also seeing the following in the log:

[Thu Jun 4 22:46:46 2009] conserver (1139): ERROR: FileRead(): SSL error on fd

Any idea what's gone wrong? The console and conserver binaries
should be built with the same version of openssl as are on the machine
at the moment. Any other ideas as to what might be causing this issue?

Thanks...

- Chris

_______________________________________________
users mailing list
users@conserver.com
https://www.conserver.com/mailman/listinfo/users
Re: Problems with conserver after OS upgrade
Well, this could be an interesting failure mode with the SSL code, or
something changed such that conserver no longer believes it's running on
the host that's supposed to manage the consoles. So, it's sending a
redirect but redirecting to itself (under an alternate identity?).

I'd look at the following output for clues:

conserver -DS
conserver -V
console -D <console>

If it's a name mismatch problem, perhaps just looking at the 'master'
entries (if there are any) in your conserver.cf file and making sure
they map to an ip address on your host would be a good first start.
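
One way to check what a 'master' name maps to is to resolve it the same way any program on the host would. The following is a minimal sketch, not part of conserver: the helper name first_address() is made up for illustration, and it reports only the first address the resolver returns.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netdb.h>

/* Hypothetical helper: resolve `name` and write its first address, as
 * numeric text, into `out`. Returns 0 on success, otherwise the error
 * code from getaddrinfo() or getnameinfo(). */
static int first_address(const char *name, char *out, size_t outlen)
{
    struct addrinfo hints, *res;
    int rc;

    assert(name != NULL && out != NULL);

    memset(&hints, 0, sizeof(hints));
    hints.ai_family = AF_UNSPEC;      /* accept IPv4 or IPv6 */
    hints.ai_socktype = SOCK_STREAM;

    rc = getaddrinfo(name, NULL, &hints, &res);
    if (rc != 0)
        return rc;

    /* NI_NUMERICHOST: print the address itself, no reverse lookup */
    rc = getnameinfo(res->ai_addr, res->ai_addrlen, out,
                     (socklen_t)outlen, NULL, 0, NI_NUMERICHOST);
    freeaddrinfo(res);
    return rc;
}
```

Calling first_address("localhost", buf, sizeof(buf)) and comparing the result with the addresses configured on the host shows whether the 'master' value points at this machine.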

Feel free to send me (directly) any of the info above to help poke
around and figure this out. It *seems* like a configuration issue, but
it's always possible it's something else.

Bryan

On Thu, Jun 04, 2009 at 10:48:39PM -0400, Chris Ross wrote:
> If I run "console -x", I simply get no output. If I try to console
> to a console, I get a brief pause followed by:
>
> console: forwarding level too deep!
>
> Interestingly, I'm also seeing the following in the log:
>
> [Thu Jun 4 22:46:46 2009] conserver (1139): ERROR: FileRead(): SSL
> error on fd
>
> Any idea what's gone wrong? The console and conserver binaries
> should be built with the same version of openssl as are on the machine
> at the moment. Any other ideas as to what might be causing this issue?
>
> Thanks...
>
> - Chris

RE: Problems with conserver after OS upgrade
Hm! OS Upgrade? Check your HOSTS file. Various Unixes handle
the local host name differently. If your node is phred, and its IP is,
say, 141.107.223.14, there are two variations of hosts file layout you
will see:

127.0.0.1 localhost.localdomain localhost phred

OR:

127.0.0.1 localhost.localdomain localhost
141.107.223.14 phred.my.domain phred

There are arguments for and against where to stick the local host name
(phred) - I have seen both, and I have seen applications break in one
versus the other. One very strong possibility is you had the hosts
file one way, and the um, "OS Upgrade" put it back the other way.

Something to look for.
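
To tell the two layouts apart mechanically, it is enough to find which line of the hosts file carries the local host name and look at the address in front of it. A minimal sketch, assuming the usual "address name [alias...]" line format; the function name hosts_line_has_name() is invented for this example.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: returns 1 if `name` appears as a host name or
 * alias on one hosts-style line, 0 otherwise. The first field of the
 * line (the address) is copied into `addr`. */
static int hosts_line_has_name(const char *line, const char *name,
                               char *addr, size_t addrlen)
{
    const char *p = line;
    size_t n, field = 0;
    int found = 0;

    assert(line != NULL && name != NULL && addr != NULL);
    if (addrlen > 0)
        addr[0] = '\0';

    for (;;) {
        p += strspn(p, " \t");          /* skip separators */
        n = strcspn(p, " \t#\n");       /* length of the next field */
        if (n == 0)
            break;                      /* end of line or a comment */
        if (field == 0)
            snprintf(addr, addrlen, "%.*s", (int)n, p);
        else if (strlen(name) == n && strncmp(p, name, n) == 0)
            found = 1;
        field++;
        p += n;
    }
    return found;
}
```

With the first layout above, "phred" is found on the 127.0.0.1 line; with the second, it is found on the 141.107.223.14 line instead.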

--kcb




-----Original Message-----
From: users-bounces@conserver.com [mailto:users-bounces@conserver.com]
On Behalf Of Bryan Stansell
Sent: Friday, June 05, 2009 12:30 PM
To: Conserver Users's Mailing List
Subject: Re: Problems with conserver after OS upgrade

Well, this could be an interesting failure mode with the SSL code, or
something changed such that conserver no longer believes it's running on
the host that's supposed to manage the consoles...................

Re: Problems with conserver after OS upgrade
On Friday 05 June 2009 13:30:03 Bryan Stansell wrote:
> Well, this could be an interesting failure mode with the SSL code, or
> something changed such that conserver no longer believes it's running on
> the host that's supposed to manage the consoles. So, it's sending a
> redirect but redirecting to itself (under an alternate identity?).
>
> I'd look at the following output for clues:
>
> conserver -DS
> conserver -V
> console -D <console>

I think there's something in there. conserver -DS shows lots of things, including that all of the consoles (of which I only have 6) are remote:

[Fri Jun 5 15:06:24 2009] conserver (2243): DEBUG: [main.c:762] Memory Usage (GRPENT objects): 0 (0)
[Fri Jun 5 15:06:24 2009] conserver (2243): DEBUG: [main.c:817] Memory Usage (CONSENT objects): 0 (0)
[Fri Jun 5 15:06:24 2009] conserver (2243): DEBUG: [main.c:830] Memory Usage (REMOTE objects): 211 (6)
[Fri Jun 5 15:06:24 2009] conserver (2243): DEBUG: [main.c:837] Memory Usage (ACCESS objects): 81 (3)
[Fri Jun 5 15:06:24 2009] conserver (2243): DEBUG: [main.c:844] Memory Usage (STRING objects): 856 (14)
[Fri Jun 5 15:06:24 2009] conserver (2243): DEBUG: [main.c:852] Memory Usage (userList objects): 46 (4)
[Fri Jun 5 15:06:24 2009] conserver (2243): DEBUG: [main.c:855] Memory Usage (total): 1194
[Fri Jun 5 15:06:24 2009] conserver (2243): DEBUG: [main.c:1004] DumpDataStructures(): remote: rserver=cfe-rack, rhost=localhost
[Fri Jun 5 15:06:24 2009] conserver (2243): DEBUG: [main.c:1004] DumpDataStructures(): remote: rserver=skaro, rhost=localhost
[Fri Jun 5 15:06:24 2009] conserver (2243): DEBUG: [main.c:1004] DumpDataStructures(): remote: rserver=usparc, rhost=localhost
[Fri Jun 5 15:06:24 2009] conserver (2243): DEBUG: [main.c:1004] DumpDataStructures(): remote: rserver=harmony, rhost=localhost
[Fri Jun 5 15:06:24 2009] conserver (2243): DEBUG: [main.c:1004] DumpDataStructures(): remote: rserver=cyteen, rhost=localhost
[Fri Jun 5 15:06:24 2009] conserver (2243): DEBUG: [main.c:1004] DumpDataStructures(): remote: rserver=c3620, rhost=localhost
[Fri Jun 5 15:06:24 2009] conserver (2243): terminated

One of these is a direct connection, and the other 5 are TCP connections
through a cisco access-server. But, I think that's not the relevant part.
I think it is a client/server issue.

When I run console -D foo, I see many "ok -> ok -> ssl_connect ->
ok -> @localhost -> goodbye", followed by another of the same,
connecting again to localhost, until it eventually fails with "forwarding
level too deep!"
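
The behaviour described above can be modelled as a client that keeps following redirects until either a server accepts the connection or a depth limit is hit. This is only an illustrative model, not conserver's client code: the limit of 5, the callback type and all function names are assumptions.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_FORWARD_DEPTH 5   /* hypothetical limit, not conserver's */

/* Hypothetical stand-in for asking a server about a console: returns
 * the host to try next, or NULL if the server handles it locally. */
typedef const char *(*redirect_fn)(const char *host);

/* Returns the number of redirects followed before a server accepted,
 * or -1 if the chain exceeded max_depth ("forwarding level too deep"). */
static int follow_redirects(const char *start, redirect_fn ask, int max_depth)
{
    const char *host = start;
    int depth;

    assert(start != NULL && ask != NULL);
    for (depth = 0; depth <= max_depth; depth++) {
        const char *next = ask(host);
        if (next == NULL)
            return depth;     /* this server manages the console */
        host = next;          /* otherwise follow the redirect */
    }
    return -1;
}

/* A server that believes every console is remote and lives on
 * "localhost", which is what the debug output above suggests. */
static const char *always_localhost(const char *host)
{
    (void)host;
    return "localhost";
}

/* A healthy server that manages the console itself. */
static const char *serves_locally(const char *host)
{
    (void)host;
    return NULL;
}
```

In this model, follow_redirects("localhost", always_localhost, MAX_FORWARD_DEPTH) can never succeed, because each answer sends the client back to the same server.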

I've never run this in any way other than as a single console server,
and only being able to connect from itself via localhost (127.0.0.1).

> If it's a name mismatch problem, perhaps just looking at the 'master'
> entries (if there are any) in your conserver.cf file and making sure
> they map to an ip address on your host would be a good first start.
>
> Feel free to send me (directly) any of the info above to help poke
> around and figure this out. It *seems* like a configuration issue, but
> it's always possible it's something else.

Okay. Default * has "master localhost". That's the only master I have.
localhost does, using IPv4 resolution, resolve to 127.0.0.1. I don't
know if the default family changed in this rev of the OS, but it seems
that "localhost" resolves only to the IPv4 address. That appears to be
true on another host still running NetBSD 4.0.

Thanks. Let me know if this reveals anything to you...

- Chris

Re: Problems with conserver after OS upgrade
> Okay. Default * has "master localhost". That's the only master I have.
> localhost does, using IPv4 resolution, resolve to 127.0.0.1. I don't
> know if the default family changed in this rev of the OS, but it seems
> that "localhost" resolves only to the IPv4 address. That appears to be
> true on another host still running NetBSD 4.0.

Does 'conserver -DS' show 127.0.0.1 in the ProbeInterfaces() output? If
it's not there (or ProbeInterfaces() isn't showing anything or the wrong
things), the mapping won't happen correctly. You should see all your
interfaces listed.
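
For comparison, the addresses the OS itself reports can be listed independently of conserver. The sketch below uses getifaddrs(3) rather than conserver's own ProbeInterfaces() code, so it shows what the list ought to contain, not how conserver builds it; the function name is made up and only IPv4 addresses are printed.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <ifaddrs.h>

/* Hypothetical helper: print "name address" for every IPv4 address on
 * the host. Returns the number printed, or -1 if getifaddrs() fails. */
static int list_ipv4_interfaces(FILE *out)
{
    struct ifaddrs *head, *ifa;
    int count = 0;

    assert(out != NULL);
    if (getifaddrs(&head) != 0)
        return -1;

    for (ifa = head; ifa != NULL; ifa = ifa->ifa_next) {
        char text[INET_ADDRSTRLEN];
        struct sockaddr_in *sin;

        /* entries without an address, or with a non-IPv4 one, are skipped */
        if (ifa->ifa_addr == NULL || ifa->ifa_addr->sa_family != AF_INET)
            continue;
        sin = (struct sockaddr_in *)ifa->ifa_addr;
        if (inet_ntop(AF_INET, &sin->sin_addr, text, sizeof(text)) == NULL)
            continue;
        fprintf(out, "%s %s\n", ifa->ifa_name, text);
        count++;
    }
    freeifaddrs(head);
    return count;
}
```

If 127.0.0.1 shows up here but not in the 'conserver -DS' output, the problem is in how conserver enumerates interfaces rather than in the host's configuration.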

Bryan

Re: Problems with conserver after OS upgrade
On Jun 6, 2009, at 18:06, Bryan Stansell wrote:
> Does 'conserver -DS' show 127.0.0.1 in the ProbeInterfaces() output? If
> it's not there (or ProbeInterfaces() isn't showing anything or the wrong
> things), the mapping won't happen correctly. You should see all your
> interfaces listed.

There's only one line of "ProbeInterfaces()" output. It doesn't
seem interesting.

The following is the beginning of the conserver -DS output,
including the one line of ProbeInterfaces() output, and the one line
after that.

[Sun Jun 7 21:29:23 2009] conserver (13449): DEBUG: [cutil.c:355] AllocString(): 0xbb818040 created string #1
[Sun Jun 7 21:29:23 2009] conserver (13449): DEBUG: [cutil.c:355] AllocString(): 0xbb818060 created string #2
[Sun Jun 7 21:29:23 2009] conserver (13449): DEBUG: [cutil.c:355] AllocString(): 0xbb818080 created string #3
[Sun Jun 7 21:29:23 2009] conserver (13449): performing configuration file syntax check
[Sun Jun 7 21:29:23 2009] conserver (13449): DEBUG: [main.c:1364] main(): bind address set to `0.0.0.0'
[Sun Jun 7 21:29:23 2009] conserver (13449): DEBUG: [cutil.c:2263] ProbeInterfaces(): ifc_len==4464 max_count==31
[Sun Jun 7 21:29:23 2009] conserver (13449): DEBUG: [cutil.c:355] AllocString(): 0xbb8180a0 created string #4


Does this mean the interfaces aren't being probed [correctly] ?
That would certainly explain the behaviour...

Thanks.

- Chris

Re: Problems with conserver after OS upgrade
Yep, something is broken dealing with enumerating all the interfaces.
Looks like it thinks there are 31 of them (is that even close?)...but
it's not saying anything about them, so something isn't looking right.

I don't have access to NetBSD machines (that I know of), but what
version are you running? 5.0? other?

A quick search found this thread:

http://mail-index.netbsd.org/tech-userlevel/2009/03/22/msg001912.html

which talks about how 5.0 is much less tolerant of apps that don't use
SIOCGIFCONF "correctly" (which could very well be the problem - but
conserver works across many platforms, so I wonder what's up). Probably
something simple to fix, but their suggested change is basically what
conserver already does. Is HAVE_SA_LEN defined in config.h (where you
built conserver)? Could be it isn't auto-detecting the right thing.
Just random guesses since I can't see it in person.

Bryan

On Sun, Jun 07, 2009 at 09:31:20PM -0400, Chris Ross wrote:
> Does this mean the interfaces aren't being probed [correctly] ?
> That would certainly explain the behaviour...

Re: Problems with conserver after OS upgrade
On Monday 08 June 2009 02:05:47 Bryan Stansell wrote:
> Yep, something is broken dealing with enumerating all the interfaces.
> Looks like it thinks there are 31 of them (is that even close?)...but
> it's not saying anything about them, so something isn't looking right.
>
> I don't have access to NetBSD machines (that I know of), but what
> version are you running? 5.0? other?

Yup. I'm running 5.0. (Actually, the 5.0_STABLE branch, shortly beyond the
release point, but...)

> A quick search found this thread:
>
> http://mail-index.netbsd.org/tech-userlevel/2009/03/22/msg001912.html
>
> which talks about how 5.0 is much less tolerant of apps that don't use
> SIOCGIFCONF "correctly" (which could very well be the problem - but
> conserver works across many platforms, so I wonder what's up). Probably
> something simple to fix, but their suggested change is basically what
> conserver already does. Is HAVE_SA_LEN defined in config.h (where you
> built conserver)? Could be it isn't auto-detecting the right thing.
> Just random guesses since I can't see it in person.

HAVE_SA_LEN is set. In looking at the code in the thread you mention, and
the code in conserver, they're clearly similar, but referencing different
elements of the ifreq structure. I'm not familiar with the ifreq structure
though, so they could well be the same.

In rebuilding from scratch, even before touching that code, I did notice the
following, which comes from that part of the code:

cc -O2 -I/usr/include -I.. -I.. -I. -DHAVE_CONFIG_H -DSYSCONFDIR=\"/usr/pkg/etc\" -I/usr/include -I/usr/include -I/usr/include -c -o cutil.o cutil.c
cutil.c: In function 'ProbeInterfaces':
cutil.c:2280: warning: dereferencing 'void *' pointer


That's the portion of code that sets what ifr points to:

ifr = (struct ifreq *)&ifc.ifc_buf[r];
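
The warning suggests that ifc_buf is declared as a void * on this system, so indexing it directly relies on a compiler extension. Assuming that is the case, a portable way to get the same byte offset is to cast to char * before doing the arithmetic. The helper name record_at() below is invented for illustration; this silences the warning but is not, by itself, a fix for the probing problem.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: return a pointer `offset` bytes into `buf`.
 * Arithmetic is done on char *, which is well defined, instead of on
 * void *, which standard C does not allow. */
static void *record_at(void *buf, size_t offset)
{
    assert(buf != NULL);
    return (char *)buf + offset;
}
```

In the context of the line above, the result of record_at(ifc.ifc_buf, r) would then be cast to struct ifreq *.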

I'm going to put some more debugging code in and see what I can find out.
Thanks for the help localizing the problem! I'll let you know if I find
anything...

- Chris

Re: Problems with conserver after OS upgrade
On Monday 08 June 2009 09:30:26 Chris Ross wrote:
> On Monday 08 June 2009 02:05:47 Bryan Stansell wrote:
> > A quick search found this thread:
> >
> > http://mail-index.netbsd.org/tech-userlevel/2009/03/22/msg001912.html
> >
> > which talks about how 5.0 is much less tolerant of apps that don't use
> > SIOCGIFCONF "correctly" (which could very well be the problem - but
> > conserver works across many platforms, so I wonder what's up). Probably
> > something simple to fix, but their suggested change is basically what
> > conserver already does.
>
> HAVE_SA_LEN is set. In looking at the code in the thread you mention,
> and the code in conserver, they're clearly similar, but referencing
> different elements of the ifreq structure.

Looking at this a little more, that difference was the important one.
The patch in the email thread which you mention above compares sa_len to
sizeof(ifr->ifr_ifru). The code in conserver, however, compares against
sizeof(ifr->ifr_addr). ifr_addr is a member of the union (ifr_ifru), but
not the largest one, so those sizeofs yield different results.

The attached patch causes it to find the interfaces and addresses, and
ProbeInterfaces() now reports them in conserver -DS output. (And all of the
consoles come up and work under normal use)

Was this a latent error in the conserver code that only shows up for me
because NetBSD's ifr_ifru is so much bigger than ifr_addr (128 vs 16 bytes)?
Or is this bug unique to NetBSD, so that there should be a local change for
NetBSD only?

Thanks...

- Chris
Re: Problems with conserver after OS upgrade
On Mon, Jun 08, 2009 at 09:58:03AM -0400, Chris Ross wrote:
> Looking at this a little more, that difference there was the important one.
> The patch in the email thread which you mention above compares sa_len to
> sizeof(ifr->ifr_ifru). The code in conserver, however, compares against
> sizeof(ifr->ifr_addr). ifr_addr is an element in the union (ifr_ifru), but
> not the largest one, so those sizeof's yield different results.

Thanks for digging into that for me...it certainly is a problem in the
conserver code. After you pointed out the specifics of the problem (I
glanced right over that), I found this too:

https://lists.isc.org/pipermail/dhcp-hackers/2007-September/000767.html

which explains the "problem" in detail...and why it happened to work
before but is broken now. The change you made looks appropriate for
any OS...certainly the right thing to do.
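
The effect of comparing against the wrong size can be reproduced with a toy model of the buffer walk. Everything below is a simplified illustration, not conserver's code: struct model_rec is a hypothetical stand-in for struct ifreq using the 16-byte and 128-byte sizes reported in this thread, the interface names are invented, and each entry is assumed to carry a 20-byte address in a fixed-size record.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define NAME_LEN  16    /* interface-name field */
#define UNION_LEN 128   /* union size reported for NetBSD 5.0 */
#define ADDR_LEN  16    /* ifr_addr size reported in the thread */

/* Hypothetical stand-in for struct ifreq; payload[0] plays sa_len. */
struct model_rec {
    char name[NAME_LEN];
    unsigned char payload[UNION_LEN];
};

/* Walk `len` bytes of records and count entries with a non-empty name.
 * `threshold` is the size sa_len is compared against to pick the step. */
static int count_named(const unsigned char *buf, size_t len, size_t threshold)
{
    size_t r = 0;
    int named = 0;

    assert(buf != NULL);
    while (r + sizeof(struct model_rec) <= len) {
        unsigned char sa_len = buf[r + NAME_LEN];

        if (buf[r] != '\0')
            named++;
        if (sa_len > threshold)
            r += NAME_LEN + sa_len;          /* treated as oversized */
        else
            r += sizeof(struct model_rec);   /* fits: fixed-size step */
    }
    return named;
}

/* Fill three fixed-size records, each with a 20-byte address length. */
static void fill_sample(struct model_rec recs[3])
{
    static const char *names[3] = { "lo0", "wm0", "wm1" };
    int i;

    memset(recs, 0, 3 * sizeof(struct model_rec));
    for (i = 0; i < 3; i++) {
        strcpy(recs[i].name, names[i]);
        recs[i].payload[0] = 20;
    }
}
```

With UNION_LEN as the threshold the walk lands on every record; with ADDR_LEN it takes a short step after the first record and ends up reading from the middle of entries. On a layout where the union is no larger than ifr_addr the two thresholds are the same number, which would be consistent with the old code appearing to work there.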

Thanks for tracking this down!

Bryan