Mailing List Archive

testing out, but bizarre results...
Hi all,

I have spread running on OpenBSD and have successfully compiled wackamole to
run. My setup consists of two machines (A/B) that are identical in
hardware, with two nics each. The problem I have is that when I attempt to
'ifconfig sis0 down' on B, on which wackamole is controlling 10.1.0.231 &
10.1.0.232, A assumes only the ip 10.1.0.231. When I 'ifconfig sis0 up' on
B, B then gets both ips, but A doesn't let go of 10.1.0.231 and an
arp war begins.

I would like to know if I did something wrong in a config. Below is my
info. Also, I go into more detail on how my system reacts at the end of
this info.

wackamole.conf on A:
-------------------------------
Spread = 4803@drella
Group = wack1
Control = /var/tmp/wack.it

prefer sis0:10.1.0.1/24


VirtualInterfaces {
{ sis0:10.1.0.231/16 }
{ sis0:10.1.0.232/16 }
}

Arp-Cache = 90s

Notify {
sis0:10.1.0.12/32
arp-cache
}
balance {
AcquisitionsPerRound = all
interval = 4s
}
mature = 5s
-------------------------------

wackamole.conf on B:
-------------------------------
Spread = 4803@bela
Group = wack1
Control = /var/tmp/wack.it

prefer sis0:10.1.0.5/24


VirtualInterfaces {
{ sis0:10.1.0.231/16 }
{ sis0:10.1.0.232/16 }
}

Arp-Cache = 90s

Notify {
sis0:10.1.0.12/32
arp-cache
}
balance {
AcquisitionsPerRound = all
interval = 4s
}
mature = 5s
-------------------------------

ifconfig on A for sis0:
inet 10.1.0.1 netmask 0xff000000 broadcast 255.255.0.0

ifconfig on B for sis0:
inet 10.1.0.5 netmask 0xff000000 broadcast 255.255.0.0

spread.conf on A:
-------------------------------
Spread_Segment 10.1.0.255:4803 {

bela 10.1.0.5
drella 10.1.0.1

}

EventLogFile = /var/log/spread_%h.log
EventTimeStamp = "[%a %d %b %Y %H:%M:%S]"

AllowedAuthMethods = "NULL"
AccessControlPolicy = "PERMIT"
-------------------------------

spread.conf on B:
-------------------------------
Spread_Segment 10.1.0.255:4803 {

bela 10.1.0.5
drella 10.1.0.1
}

EventLogFile = /var/log/spread_%h.log
EventTimeStamp = "[%a %d %b %Y %H:%M:%S]"

AllowedAuthMethods = "NULL"
AccessControlPolicy = "PERMIT"
-------------------------------

- I have spread running with no wackamole running.

ifconfig on A for sis0:
inet 10.1.0.1 netmask 0xff000000 broadcast 255.255.0.0

ifconfig on B for sis0:
inet 10.1.0.5 netmask 0xff000000 broadcast 255.255.0.0

- I start wackamole on A:
ifconfig on A:
inet 10.1.0.1 netmask 0xff000000 broadcast 255.255.0.0
inet 10.1.0.231 netmask 0xffff0000 broadcast 10.1.255.255
inet 10.1.0.232 netmask 0xffff0000 broadcast 10.1.255.255

- I start wackamole on B:
ifconfig on B:
inet 10.1.0.5 netmask 0xff000000 broadcast 255.255.0.0
inet 10.1.0.231 netmask 0xffff0000 broadcast 10.1.255.255
inet 10.1.0.232 netmask 0xffff0000 broadcast 10.1.255.255
ifconfig on A:
inet 10.1.0.1 netmask 0xff000000 broadcast 255.255.0.0

- I `ifconfig sis0 down` on B:
ifconfig on A:
inet 10.1.0.1 netmask 0xff000000 broadcast 255.255.0.0
inet 10.1.0.231 netmask 0xffff0000 broadcast 10.1.255.255

ifconfig on B:
inet 10.1.0.5 netmask 0xff000000 broadcast 255.255.0.0
inet 10.1.0.231 netmask 0xffff0000 broadcast 10.1.255.255
inet 10.1.0.232 netmask 0xffff0000 broadcast 10.1.255.255


As you can see A doesn't pick up the x.x.x.232 address.
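
The state above can also be checked mechanically. A minimal Python sketch, assuming the inet lines are pasted in as text; the helper names are hypothetical and not part of wackamole:

```python
from typing import Dict, Set

# The two virtual IPs from the VirtualInterfaces block above.
VIPS = {"10.1.0.231", "10.1.0.232"}

def inet_addresses(ifconfig_text: str) -> Set[str]:
    """Collect the addresses from 'inet <addr> netmask ...' lines."""
    addresses = set()
    for line in ifconfig_text.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[0] == "inet":
            addresses.add(fields[1])
    return addresses

def vip_owners(hosts: Dict[str, str]) -> Dict[str, Set[str]]:
    """Map each VIP to the set of hosts whose ifconfig output lists it."""
    owners: Dict[str, Set[str]] = {vip: set() for vip in VIPS}
    for host, text in hosts.items():
        for addr in inet_addresses(text) & VIPS:
            owners[addr].add(host)
    return owners

# ifconfig output copied from the step after `ifconfig sis0 down` on B.
state = {
    "A": (
        "inet 10.1.0.1 netmask 0xff000000 broadcast 255.255.0.0\n"
        "inet 10.1.0.231 netmask 0xffff0000 broadcast 10.1.255.255\n"
    ),
    "B": (
        "inet 10.1.0.5 netmask 0xff000000 broadcast 255.255.0.0\n"
        "inet 10.1.0.231 netmask 0xffff0000 broadcast 10.1.255.255\n"
        "inet 10.1.0.232 netmask 0xffff0000 broadcast 10.1.255.255\n"
    ),
}

owners = vip_owners(state)
for vip in sorted(owners):
    print(vip, sorted(owners[vip]))
# 10.1.0.231 ['A', 'B']   (configured on both hosts)
# 10.1.0.232 ['B']        (only on B, whose interface is down)
```

A VIP listed on more than one host is the precondition for the arp war described above once B's interface comes back up.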

Am I doing something wrong? Is 'ifconfig <iface> down' good enough to
simulate a downed interface card? If not, how can I test this, besides
actually unplugging the cable? Am I getting my broadcast and netmasks
messed up?

Any help is greatly appreciated.

-.mag
testing out, but bizarre results... [ In reply to ]
First, from your configuration file, I will assume you are running the
latest CVS version. I have annotated your wackamole conf file... try
some (all) of these changes:

Mark A. Garcia wrote:

>wackamole.conf on A:
>-------------------------------
>Spread = 4803@drella
>
Just say 4803, not 4803@drella... That way it uses the unix domain socket.

>Group = wack1
>Control = /var/tmp/wack.it
>
>prefer sis0:10.1.0.1/24
>
The "prefer" statement has to name one of the IPs below (the VIPs), not
the machine's own address. Unless you feel that it is important that
_this_ machine has a specific IP address whenever it is up, I highly
suggest commenting out the prefer statement and letting wackamole make
those decisions.

>VirtualInterfaces {
> { sis0:10.1.0.231/16 }
> { sis0:10.1.0.232/16 }
>}
>
I don't know much about OpenBSD, but if it is anything like FreeBSD,
then you do not want /16 netmask on those VIPs. You instead want them
to have a netmask of 255.255.255.255, so use /32. (That's in the
FreeBSD man page).
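
The netmask arithmetic here is easy to check by hand or with a few lines of code. A minimal Python sketch using only the standard ipaddress module; the function names are made up for illustration and nothing here talks to wackamole or ifconfig:

```python
import ipaddress

def hex_netmask_to_prefix(hex_mask: str) -> int:
    """Convert a BSD-style hex netmask such as '0xffff0000' to a prefix length."""
    mask = int(hex_mask, 16)
    # Render as 32 bits; a valid netmask is a run of 1s followed by a run of 0s.
    bits = format(mask, "032b")
    if "01" in bits:
        raise ValueError(f"non-contiguous netmask: {hex_mask}")
    return bits.count("1")

def host_alias(vip: str) -> ipaddress.IPv4Interface:
    """Return the VIP as a /32 host alias, the form suggested above."""
    return ipaddress.IPv4Interface(f"{vip}/32")

prefix = hex_netmask_to_prefix("0xffff0000")  # netmask ifconfig shows for the VIPs
alias = host_alias("10.1.0.231")
print(prefix)                # 16
print(alias.with_prefixlen)  # 10.1.0.231/32
print(alias.netmask)         # 255.255.255.255
```

The same helper shows that the 0xff000000 mask on the primary addresses is a /8, which does not match the /16 used for the VIPs.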

>Arp-Cache = 90s
>
>Notify {
> sis0:10.1.0.12/32
>
That's your router?

> arp-cache
>}
>balance {
> AcquisitionsPerRound = all
> interval = 4s
>}
>mature = 5s
>-------------------------------
>
>spread.conf on A:
>-------------------------------
>Spread_Segment 10.1.0.255:4803 {
>
> bela 10.1.0.5
> drella 10.1.0.1
>
>}
>
>EventLogFile = /var/log/spread_%h.log
>EventTimeStamp = "[%a %d %b %Y %H:%M:%S]"
>
>AllowedAuthMethods = "NULL"
>AccessControlPolicy = "PERMIT"
>-------------------------------
>
>spread.conf on B:
>-------------------------------
>Spread_Segment 10.1.0.255:4803 {
>
> bela 10.1.0.5
> drella 10.1.0.1
>}
>
>
If this is indeed a /16 like you suggest above, then your broadcast is
more likely to be 10.1.255.255 and not 10.1.0.255. That would throw
Spread for a loop. Though with only two machines, Spread might just
work right with an incorrect broadcast address.
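
To see which broadcast address goes with which prefix length, here is a minimal Python sketch using the standard ipaddress module. The function name is hypothetical; the addresses are the ones from this thread:

```python
import ipaddress

def broadcast_for(address: str, prefix_len: int) -> str:
    """Return the directed broadcast address of the network containing `address`."""
    # strict=False lets us pass a host address rather than the network address.
    network = ipaddress.ip_network(f"{address}/{prefix_len}", strict=False)
    return str(network.broadcast_address)

print(broadcast_for("10.1.0.5", 16))  # 10.1.255.255
print(broadcast_for("10.1.0.5", 24))  # 10.1.0.255
```

So 10.1.0.255 in Spread_Segment is the broadcast of a /24, while a /16 network would use 10.1.255.255.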

>As you can see A doesn't pick up the x.x.x.232 address.
>
>Am I doing something wrong? Is 'ifconfig <iface> down' good enough to
>simulate a downed interface card? If not, how can I test this, besides
>actually unplugging the cable? Am I getting my broadcast and netmasks
>messed up?
>
If you have more than one interface, then downing an interface might not
work quite right. Honestly, I tested mine by pulling the power plug on
the machine -- and downing the interface -- and killing spread -- and
killing wackamole. It is good to play with it a bit so that you truly
understand the nature of the beast.

If you "ifconfig down" an interface, wackamole will never be the wiser.
You must not manually remove aliases put in place by wackamole.
Wackamole will not be aware of changes made "underneath" it and will go
on with life as if nothing has changed. There is a small program
called wackatrl that comes with wackamole. It will allow you to "induce
failure". It doesn't do this by "alerting" other daemons; instead it
simply "disappears" (disconnects from Spread and drops all VIPs), and the
other machines are responsible for taking the appropriate action.

wackatrl -f (failure)
wackatrl -s (success)
wackatrl -l (listing of existing VIPs and owners)
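
If you want to drive these from a test script, a thin wrapper is enough. A minimal Python sketch, assuming wackatrl is on the PATH; only the three flags listed above are used, and the function and action names are made up for illustration:

```python
import subprocess
from typing import List

# Flags taken from the list above; nothing else about wackatrl's CLI is assumed.
WACKATRL_FLAGS = {
    "fail": "-f",     # induce failure: drop out of Spread and release all VIPs
    "succeed": "-s",  # undo the induced failure
    "list": "-l",     # list existing VIPs and their owners
}

def build_wackatrl_command(action: str, binary: str = "wackatrl") -> List[str]:
    """Build the argument list for one of the three wackatrl actions."""
    try:
        flag = WACKATRL_FLAGS[action]
    except KeyError:
        raise ValueError(f"unknown action: {action!r}") from None
    return [binary, flag]

def run_wackatrl(action: str) -> str:
    """Run wackatrl on the local machine and return whatever it prints."""
    result = subprocess.run(
        build_wackatrl_command(action),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit status
    )
    return result.stdout
```

A failover test would then call run_wackatrl("fail") on one machine, run_wackatrl("list") on the other to see who owns the VIPs, and run_wackatrl("succeed") to bring the first machine back.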

--
Theo Schlossnagle
Principal Consultant
OmniTI Computer Consulting, Inc. -- http://www.omniti.com/
Phone: +1 301 776 6376 Fax: +1 410 880 4879
1024D/82844984/95FD 30F1 489E 4613 F22E 491A 7E88 364C 8284 4984
2047R/33131B65/71 F7 95 64 49 76 5D BA 3D 90 B9 9F BE 27 24 E7
testing out, but bizarre results... [ In reply to ]
I applied all the changes and everything is nice and snappy. Works like a
charm. The Notify ip is actually our dns server.

Thanks for the help.

-.mag
On Wed, Jul 24, 2002 at 10:55:14AM -0400, Theo Schlossnagle wrote:
> First, from your configuration file, I will assume you are running the
> latest CVS version. I have annotated your wackamole conf file... try
> some (all) of these changes:
testing out, but bizarre results... [ In reply to ]
Mark A. Garcia wrote:

>I applied all the changes and everything is nice and snappy. Works like a
>charm. The Notify ip, is actually our dns server.
>
>Thanks for the help.
>
>
I would highly suggest adding your router/gateway/NAT box to the list of
notifications. There is nothing worse than failing over your machine and
not having the rest of the world notice :-)

--
Theo Schlossnagle
testing out, but bizarre results... [ In reply to ]
Ok... now a different problem.

So, I was able to get everything working on 3.0 obsd. I saved the conf.
Then, I reinstalled the server with a fresh new 3.1 obsd.

I did everything like the first time in compiling it and installing, and
used the same conf for starting it up. Now I just get the license text and
it exits.

I tried looking through the code and then set Debug = 1, compiled again, and
attempted to run it. Now I get the license text plus:

Now trying to reconnect
Clean_up called
SP_connect: connected with private group(15 bytes): #wack27988#bela
Sig_handler called
SIGSEGV Detected!
Clean_up called

Has anyone else had similar problems on 3.1 obsd? I'm not sure why it
worked on 3.0 obsd and not 3.1.

Any suggestions?

Cheers,
-.mag

On Wed, Jul 24, 2002 at 01:28:26PM -0400, Theo Schlossnagle wrote:
> Mark A. Garcia wrote:
>
> >I applied all the changes and everything is nice and snappy. Works like a
> >charm. The Notify ip, is actually our dns server.
> >
> >Thanks for the help.
> >
> >
> I would highly suggest adding your router/gateway/NAT box to the list of
> notifications. There is nothing worse than failing over your machine and
> not having the rest of the world notice :-)