Mailing List Archive

[mod_backhand-users] cluster distribution question
Hi,

Our web cluster is mainly made up of the same type of systems, all
single CPU machines with similar capabilities. However, we will be
introducing a 4 CPU system into the cluster, which should be able to
handle four times the load that the other systems can. How should I
distribute the work load around the cluster, taking advantage of the
fact that this machine can handle a higher load? Can I apply a weight to
"byLoad", can I use "byCost" in place of "byLoad" to get the desired
results, or must I build my own candidacy functions?

TIA
Monte
[mod_backhand-users] cluster distribution question [ In reply to ]
Monte - we have large web farms that are definitely heterogeneous in terms
of processing power and available memory. If your 4 CPU server will be
performing the same tasks as your other single CPU machines, it is most
probable that it will be able to handle many times the processes of a
single CPU server while maintaining a reasonably low load.

We are still experimenting with different configurations; however, using:

byAge
byLoad

.. seems to do just swell. Our multi-processor web servers that have the
most memory do the most work, all the way down the line to our lowly
single CPU servers with smaller amounts of memory.
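
A rough way to picture why byAge followed by byLoad favours the bigger machines is sketched below. This is only an illustration, assuming each candidacy function takes the current candidate list and returns a narrowed or reordered one. The Server class, the by_age/by_load/choose helpers, the 20 second cutoff and the load figures are all hypothetical names and values, not mod_backhand code.

```python
from dataclasses import dataclass

@dataclass
class Server:
    name: str
    load: float   # most recently reported load average (made-up values)
    age: float    # seconds since the last report was received

def by_age(servers, max_age=20.0):
    """Drop servers whose last report is older than max_age seconds.
    The 20 second cutoff is an assumption for this sketch."""
    return [s for s in servers if s.age <= max_age]

def by_load(servers):
    """Reorder the remaining candidates from least to most loaded."""
    return sorted(servers, key=lambda s: s.load)

def choose(servers, pipeline):
    """Run each candidacy function in order; the first survivor wins."""
    for fn in pipeline:
        servers = fn(servers)
    return servers[0] if servers else None

cluster = [
    Server("single-a", load=0.90, age=1.0),
    Server("single-b", load=0.70, age=45.0),  # stale report, filtered out
    Server("quad", load=0.40, age=1.0),       # more CPUs, so load stays lower
]
picked = choose(cluster, [by_age, by_load])
print(picked.name)
```

A machine with more CPUs tends to report a lower load for the same amount of work, so a plain load ordering keeps sending it requests until its load catches up with the others.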

On Tue, 4 Sep 2001, Monte Ohrt wrote:

> Hi,
>
> Our web cluster is mainly made up of the same type of systems, all
> single CPU machines with similar capabilities. However, we will be
> introducing a 4 CPU system into the cluster, which should be able to
> handle four times the load that the other systems can. How should I
> distribute the work load around the cluster, taking advantage of the
> fact that this machine can handle a higher load? Can I apply a wait to
> "byLoad", can I use "byCost" in place of "byLoad" to get the desired
> results, or must I build my own candidacy functions?
>
> TIA
> Monte
[mod_backhand-users] cluster distribution question [ In reply to ]
> Monte - we have large web farms that are definitely heterogeneous in terms
> of processing power and available memory. If your 4 CPU server will be
> performing the same tasks as your other single CPU machines, it is most
> probable that it will be able to handle many times the processes of a
> single CPU server while maintaining a reasonably low load.
>
> We are still experimenting with different configurations, however, using:
>
> byAge
> byLoad
>
> .. seems to do just swell. Our multi-processor web servers that have the
> most memory do the most work, all the way down the line to our lowly
> single CPU servers with smaller amounts memory.

Add byCost if at all possible because it uses the Arriba number. If
the machines you're working with are all doing the same thing, then
try:


Backhand byAge 3
# BackhandFromSO libexec/byHostname.so byHostname www
Backhand byLogWindow
Backhand byCost
Backhand addPrediction

If you want only your machines with www in their name, then
uncomment the BackhandFromSO line. Cost + Prediction = VERY good thing. ;)
-sc

--
Sean Chittenden
[mod_backhand-users] cluster distribution question [ In reply to ]
> > Monte - we have large web farms that are definitely heterogeneous in terms
> > of processing power and available memory. If your 4 CPU server will be
> > performing the same tasks as your other single CPU machines, it is most
> > probable that it will be able to handle many times the processes of a
> > single CPU server while maintaining a reasonably low load.
> >
> > We are still experimenting with different configurations, however, using:
> >
> > byAge
> > byLoad
> >
> > .. seems to do just swell. Our multi-processor web servers that have the
> > most memory do the most work, all the way down the line to our lowly
> > single CPU servers with smaller amounts memory.
>
> Add byCost if at all possible because it uses the Ariba #. If
> they machines you're working with are all doing the same thing, then
> try:
>
>
> Backhand byAge 3
> # BackhandFromSO libexec/byHostname.so byHostname www
> Backhand byLogWindow
> Backhand byCost
> Backhand addPrediction
>
> If you want only your machines with www in their name, then
> uncomment the previous line. Cost + Prediction = VERY good thing. ;)

Can someone please elaborate on addPrediction? I've seen it in
builtins.c and am at a loss as to what it's doing (other than the fact
that it's adding on to the load value of some server).
[mod_backhand-users] cluster distribution question [ In reply to ]
I was about to ask this same question; I don't see addPrediction in the
FAQ or anywhere on the site.

TIA
Monte

Neil Mansilla wrote:
>
> > > Monte - we have large web farms that are definitely heterogeneous in terms
> > > of processing power and available memory. If your 4 CPU server will be
> > > performing the same tasks as your other single CPU machines, it is most
> > > probable that it will be able to handle many times the processes of a
> > > single CPU server while maintaining a reasonably low load.
> > >
> > > We are still experimenting with different configurations, however, using:
> > >
> > > byAge
> > > byLoad
> > >
> > > .. seems to do just swell. Our multi-processor web servers that have the
> > > most memory do the most work, all the way down the line to our lowly
> > > single CPU servers with smaller amounts memory.
> >
> > Add byCost if at all possible because it uses the Ariba #. If
> > they machines you're working with are all doing the same thing, then
> > try:
> >
> >
> > Backhand byAge 3
> > # BackhandFromSO libexec/byHostname.so byHostname www
> > Backhand byLogWindow
> > Backhand byCost
> > Backhand addPrediction
> >
> > If you want only your machines with www in their name, then
> > uncomment the previous line. Cost + Prediction = VERY good thing. ;)
>
> Can someone please elaborate on addPrediction? I've seen it in the
> builtins.c and am at a loss at what it's doing (other than the fact it's
> adding on to the load value of some server).
>
> _______________________________________________
> backhand-users mailing list
> backhand-users@lists.backhand.org
> http://lists.backhand.org/mailman/listinfo/backhand-users

--
Monte Ohrt <monte@ispi.net>
http://www.ispi.net/
[mod_backhand-users] cluster distribution question [ In reply to ]
On a side note, it would be nice to see a summary of "byCost" in the
FAQ. Instead of sending people to a postscript file to rummage through
algorithms and theories, maybe summarize what "byCost" does, and what
resources it references to determine the "cost" of a web server.

Monte


--
Monte Ohrt <monte@ispi.net>
http://www.ispi.net/
[mod_backhand-users] cluster distribution question [ In reply to ]
> > Add byCost if at all possible because it uses the Ariba #. If
> > they machines you're working with are all doing the same thing, then
> > try:
> >
> >
> > Backhand byAge 3
> > # BackhandFromSO libexec/byHostname.so byHostname www
> > Backhand byLogWindow
> > Backhand byCost
> > Backhand addPrediction
> >
> > If you want only your machines with www in their name, then
> > uncomment the previous line. Cost + Prediction = VERY good thing. ;)
>
> Can someone please elaborate on addPrediction? I've seen it in the
> builtins.c and am at a loss at what it's doing (other than the fact it's
> adding on to the load value of some server).

In a nutshell:

byCost: a request on a big IBM RS/6000 is going to cost less than a
request on an old 386SX 33 MHz system. byCost uses the Arriba
number to take this into account and sends requests to the RS/6000
if both systems have a load of 0.00.

addPrediction:
if for some reason a server gets railed with requests,
addPrediction will update the information for a host in the
interim second before the next update from the rest of the
backhand cluster. This is only really useful for sites that are
under heavy traffic, where the state of the network/servers can
change dramatically within 1 second.
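
The addPrediction idea described above can be sketched roughly as follows. This is a toy model, not the code in builtins.c: the LocalView class, the integer load units and the per-request increment of 5 are assumptions made up for illustration. It shows one server bumping its own local estimate for a host each time it sends a request there, until the next cluster update overwrites the guess.

```python
class LocalView:
    """One server's local picture of cluster load (hypothetical model)."""

    def __init__(self, reported, per_request=5):
        self.estimate = dict(reported)   # load as of the last update
        self.per_request = per_request   # assumed load added per request

    def pick(self):
        # Choose the host with the lowest estimated load.
        return min(self.estimate, key=self.estimate.get)

    def dispatch(self):
        host = self.pick()
        # Prediction step: raise our local estimate right away instead of
        # waiting for the next update from the rest of the cluster.
        self.estimate[host] += self.per_request
        return host

    def on_update(self, reported):
        # Fresh reported numbers replace the local predictions.
        self.estimate = dict(reported)

# Loads are in arbitrary integer units to keep the arithmetic exact.
view = LocalView({"a": 10, "b": 20, "c": 30})
sent = [view.dispatch() for _ in range(6)]
print(sent)
```

Without the prediction step, every request arriving between two updates would go to host "a", because its reported load would not change until the next update came in.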

Make sense? -sc

--
Sean Chittenden
[mod_backhand-users] cluster distribution question [ In reply to ]
> byCost: a request on a big IBM RS6000 is going to cost less than a
> request on an old 386 SX33 Mhz system. byCost uses the arriba
> number to take this into account and send reuqests to the RS6000
> if both systems have a load of 0.00
>
> addPrediction:
> if for somereason a server gets railed with requests,
> addPrediction will update the information for a host in the
> interim second before the next update from the rest of the
> backhand cluster. This is only really useful for sites that are
> under heavey traffic and in 1 second the state of the
> network/servers can/has changed dramatically.

Quick note for addPrediction: it updates only the local information, not
the rest of the machines in the cluster. -sc

--
Sean Chittenden
[mod_backhand-users] cluster distribution question [ In reply to ]
Sean Chittenden wrote:
>
> > > Add byCost if at all possible because it uses the Ariba #. If
> > > they machines you're working with are all doing the same thing, then
> > > try:
> > >
> > >
> > > Backhand byAge 3
> > > # BackhandFromSO libexec/byHostname.so byHostname www
> > > Backhand byLogWindow
> > > Backhand byCost
> > > Backhand addPrediction
> > >
> > > If you want only your machines with www in their name, then
> > > uncomment the previous line. Cost + Prediction = VERY good thing. ;)
> >
> > Can someone please elaborate on addPrediction? I've seen it in the
> > builtins.c and am at a loss at what it's doing (other than the fact it's
> > adding on to the load value of some server).
>
> In a nutshell:
>
> byCost: a request on a big IBM RS6000 is going to cost less than a
> request on an old 386 SX33 Mhz system. byCost uses the arriba
> number to take this into account and send reuqests to the RS6000
> if both systems have a load of 0.00

Ok, so it uses the system load and the Arriba number to figure the cost?
Anything else, like CPU idle or memory consumption?


>
> addPrediction:
> if for somereason a server gets railed with requests,
> addPrediction will update the information for a host in the
> interim second before the next update from the rest of the
> backhand cluster. This is only really useful for sites that are
> under heavey traffic and in 1 second the state of the
> network/servers can/has changed dramatically.
>
> Make sense? -sc
>
> --
> Sean Chittenden
>

--
Monte Ohrt <monte@ispi.net>
http://www.ispi.net/
[mod_backhand-users] cluster distribution question [ In reply to ]
> > > > Add byCost if at all possible because it uses the Ariba #. If
> > > > they machines you're working with are all doing the same thing, then
> > > > try:
> > > >
> > > >
> > > > Backhand byAge 3
> > > > # BackhandFromSO libexec/byHostname.so byHostname www
> > > > Backhand byLogWindow
> > > > Backhand byCost
> > > > Backhand addPrediction
> > > >
> > > > If you want only your machines with www in their name, then
> > > > uncomment the previous line. Cost + Prediction = VERY good thing. ;)
> > >
> > > Can someone please elaborate on addPrediction? I've seen it in the
> > > builtins.c and am at a loss at what it's doing (other than the fact it's
> > > adding on to the load value of some server).
> >
> > In a nutshell:
> >
> > byCost: a request on a big IBM RS6000 is going to cost less than a
> > request on an old 386 SX33 Mhz system. byCost uses the arriba
> > number to take this into account and send reuqests to the RS6000
> > if both systems have a load of 0.00
>
> Ok, so it uses the system load and the arriba number to figure the cost?
> anything else, like cpu idle or memory consumption?

I believe so.... yeah, it looks like it takes CPU, amount of memory
available, and Arriba and mixes them in an academically proven/studied (?)
algorithm to figure out which server is best to use. Prediction updates
internal data so that if you get 1000 requests in a sec, you don't
send them all to one server (or you might, depending on the horsepower
of that one server). -sc

PS Theo, can you update the permissions for wackamole? anoncvs
can't get a directory lock in /storage/cvs/munjal/wacamole. Thanks.

--
Sean Chittenden
[mod_backhand-users] cluster distribution question [ In reply to ]
Hi,

A bit of an update about byCost and what's behind it.
The byCost function assigns a cost to each available resource according to
the Cost-Benefit Framework, a concept originally developed at Johns Hopkins by
Baruch Awerbuch and adapted to practice by several of us practical guys at Hopkins.

The essence is that the price of a resource is increased as its availability
is depleted, that is, as its utilization increases. Now, theory shows that if the
price is increased EXPONENTIALLY as the utilization increases, then an online
scheduling algorithm (that does not know what requests will happen in the future)
will be competitive with the optimal algorithm that knows everything in
advance and has infinite power to calculate the best strategy after the fact.
"Competitive" means that you can bound the difference between the byCost algorithm
and the optimal algorithm by a known factor. A highly important theoretical result.

This result applies to the worst case. What most practitioners care about is
the average case. We showed that in several different settings (a PVM cluster,
a MOSIX cluster, global multicast flow control) this algorithm outperforms, in
practice, the best algorithms out there in the AVERAGE case (about which our
theory makes no predictions).

So - actually an old 486 PC can be cheaper than the strongest Pentium III if,
for example, the Pentium III is extremely busy and the 486 is totally empty.
The more interesting question is how to compare a machine which is moderately
stronger but has much more available memory with another machine at
a particular time. How do you compare apples (CPU) and oranges (memory)?
This is exactly what the cost-benefit framework does by assigning a cost
based on the utilization.
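
To make the apples-and-oranges point concrete, here is a small sketch of exponential pricing. It is not the byCost implementation: the base of 16, the two-resource sum and every utilization figure below are assumptions picked only to illustrate the idea that each resource is priced by an exponential of its utilization, and that a request goes wherever it raises the total price the least.

```python
def resource_cost(utilization, base=16.0):
    """Price of one resource; grows exponentially with utilization in [0, 1]."""
    return base ** utilization

def marginal_cost(cpu, mem, d_cpu, d_mem, base=16.0):
    """Extra cost of placing one request on a machine: cost after minus
    cost before, summed over CPU and memory so both share one unit."""
    before = resource_cost(cpu, base) + resource_cost(mem, base)
    after = resource_cost(cpu + d_cpu, base) + resource_cost(mem + d_mem, base)
    return after - before

# A request eats a larger fraction of a weak machine than of a strong one
# (all utilization figures here are invented for the example).
idle_486 = marginal_cost(cpu=0.00, mem=0.10, d_cpu=0.20, d_mem=0.10)
busy_p3 = marginal_cost(cpu=0.95, mem=0.80, d_cpu=0.04, d_mem=0.02)
idle_p3 = marginal_cost(cpu=0.00, mem=0.10, d_cpu=0.04, d_mem=0.02)

print(f"idle 486: {idle_486:.2f}")
print(f"busy PIII: {busy_p3:.2f}")
print(f"idle PIII: {idle_p3:.2f}")
```

With these made-up numbers the idle Pentium III is the cheapest choice, but the nearly saturated Pentium III is more expensive than the idle 486, which matches the situation described above.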

Here is a good slide about this:
http://www.cnds.jhu.edu/funding/tolerant_networks/ar0101_cost/sld029.htm

Enjoy,

:) Yair.

