Mailing List Archive

[OT] distributed webservers?
this is probably not the best place to ask, but i'm not sure where else
would be. i've noticed lately that a lot of non-profit organisations are
running into trouble paying their bills. the costs come mainly from
bandwidth consumption, the result of the public migrating from
mainstream sources (cnn.com) to independent ones (indymedia.org). the
extra traffic hitting non-profit sites is making them very expensive to
run.

some have added advertising to their sites, others have just plain shut down, but
i was wondering how difficult it would be (or if such a thing already exists)
to run a distributed webserver: something that would hand off requests for a
page to multiple low-bandwidth satellite servers... like you and i
running boxes at home. some data (say, large video files) would only be
available from one pool of sources, while other data (html, jpg, png files) would
be available from a larger set.

of course there would have to be lots of checking to make sure that a server
doesn't blow up or anything, but how doable is this? and why haven't i seen
more non-profits & ngos doing this sort of thing? personally, i wouldn't
mind giving up a little bandwidth for a couple of sites i want to support but
don't have the cash to donate to...


--
travel is fatal to prejudice, bigotry, and narrow-mindedness, and many of our
people need it sorely on these accounts. broad, wholesome, charitable views
of man and things cannot be acquired by vegetating in one little corner of
the earth all one's lifetime.
- mark twain

--
gentoo-user@gentoo.org mailing list
Re: [OT] distributed webservers?
Depending on how fancy you want to get, there are many ways of doing
this. Gentoo has a similar concept for its rsync servers, and
SourceForge has something for its download mirrors. You could use a
round-robin DNS setup to spread requests across different
servers. You could write a script that would redirect requests from
the primary server to one of many helper servers. There are many ideas,
but the biggest problem would be that each server would need to have a
complete copy of the site's data. Making updates to a site would
become a nightmare unless each server was set to rsync with the
primary server all the time.
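
A minimal sketch of the redirect-script idea, in Python. The helper
hostnames are placeholders and make_redirector is a hypothetical name, not
part of any existing tool. It only rotates through the helpers in order and
assumes every helper already holds a full copy of the site.

```python
from itertools import cycle

# Placeholder helper mirrors (assumed names, not real servers).
HELPERS = [
    "http://mirror1.example.org",
    "http://mirror2.example.org",
    "http://mirror3.example.org",
]


def make_redirector(helpers):
    """Return a function that maps a request path to the next helper's URL."""
    rotation = cycle(helpers)

    def redirect_target(path):
        # Each call advances the rotation by one helper.
        return next(rotation) + path

    return redirect_target


redirect_target = make_redirector(HELPERS)
print(redirect_target("/index.html"))
print(redirect_target("/logo.png"))
```

The primary server would answer each request with an HTTP redirect whose
Location header is the returned URL, so it serves only the small redirect
response rather than the page itself.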

-jw

On Thu, 14 Oct 2004 15:34:21 -0400, daniel
<danstemporaryaccount@yahoo.ca> wrote:
Re: [OT] distributed webservers?
Jeffrey's suggestion sounds like a baby Akamai. Once you get the round-robin
DNS off the ground, you could enhance it by selecting the server with the lowest
latency for each request. You could also add server monitoring features,
alerts, traffic monitoring/logging, logic to handle times of heavy load
(slashdotting), etc...
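
As a rough illustration of latency-based selection with a basic health
check, here is a Python sketch. The function name, the latency table and the
2000 ms cutoff are all assumptions made for the example; how the latencies
get measured is left out.

```python
def pick_server(latencies_ms, max_latency_ms=2000.0):
    """Pick the helper with the lowest recently measured latency.

    latencies_ms maps a server name to its last measured latency in
    milliseconds, or to None if the last probe failed (treated as down).
    Servers slower than max_latency_ms are treated as overloaded.
    Returns None when no server qualifies.
    """
    healthy = {
        server: latency
        for server, latency in latencies_ms.items()
        if latency is not None and latency <= max_latency_ms
    }
    if not healthy:
        return None
    return min(healthy, key=healthy.get)


probes = {"mirror1": 120.0, "mirror2": 45.0, "mirror3": None}
print(pick_server(probes))
```

When nothing qualifies, the caller could fall back to the primary server.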

A better model for site replication would be a push model where one web server
is designated as the master and updates are pushed out to the slave servers.
It would have to be secured like crazy, but you could even do it in a staged
fashion where the update only becomes visible on the master server after all (or
most of) the slaves have successfully processed the update (or after a
timeout, etc). Sort of similar to the way most DBMSs handle replication.
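
The staged push could be sketched like this in Python. The names
staged_publish and push are hypothetical; push stands in for whatever
transfer mechanism is used (for example a wrapper around rsync over ssh),
and the timeout variant is not shown.

```python
import math


def staged_publish(update, replicas, push, quorum=1.0):
    """Push `update` to every replica and decide whether it may go live.

    push(replica, update) is a caller-supplied function returning True
    when the replica has successfully processed the update.
    quorum is the fraction of replicas that must acknowledge
    (1.0 means all of them, 0.5 means at least half).
    Returns (ok_to_publish, list_of_replicas_that_acknowledged).
    """
    acked = [replica for replica in replicas if push(replica, update)]
    needed = math.ceil(quorum * len(replicas))
    return len(acked) >= needed, acked


# Example with a fake push function in which one replica fails.
ok, acked = staged_publish(
    "site-v2", ["a", "b", "c"], lambda replica, update: replica != "c", quorum=0.6
)
print(ok, acked)
```

The master would switch to the new content only when the first return value
is true, and could retry the replicas missing from the second.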

Maintaining session state would also get really interesting. You would either
need a central server that all the web servers could retrieve a user's state
from (indexed by a unique ID stored as a cookie on the user's machine), or you
would have to store a ton of cookies on the client machine. It would be
interesting to find the method that saves the most bandwidth - I bet it would
be heavily influenced by the application being run.
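
A sketch of the client-side option in Python, assuming every web server
shares one secret key. The state is signed so that any server can trust
what the browser sends back without asking a central server. SECRET,
encode_state and decode_state are made-up names, and the state is signed
but not encrypted, so the user can still read it.

```python
import base64
import hashlib
import hmac
import json

SECRET = b"change-me"  # placeholder key; assumed to be shared by all servers


def encode_state(state, secret=SECRET):
    """Serialize the session state and append an HMAC-SHA256 signature."""
    raw = json.dumps(state, sort_keys=True).encode()
    payload = base64.urlsafe_b64encode(raw).decode()
    signature = hmac.new(secret, payload.encode(), hashlib.sha256).hexdigest()
    return payload + "." + signature


def decode_state(cookie, secret=SECRET):
    """Return the state if the signature checks out, otherwise None."""
    payload, _, signature = cookie.rpartition(".")
    expected = hmac.new(secret, payload.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(signature.encode(), expected.encode()):
        return None
    return json.loads(base64.urlsafe_b64decode(payload))


cookie = encode_state({"user": "dan", "cart": [1, 2]})
print(decode_state(cookie))
```

The whole cookie travels with every request, so the bandwidth cost grows
with the size of the state. The central-server option sends only a short ID
but adds a lookup between servers.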

-Dan


Quoting Jeffrey Wong <mindstormmaster@gmail.com>:

Re: [OT] distributed webservers?
A simple and basic HTML site wouldn't have any problem with session
state. Anything involving logins or sessions would get much more
complex.

Apache/Jakarta's Tomcat has a bit of a load balancer in the latest
releases. It sounds like it can send session data to other servers,
but I haven't tried it myself.

If you were to need session data it might be better to redirect the
request at the very beginning and keep the user on the same server for
their entire visit, eliminating session state problems. This does
bring up problems with bandwidth and disk space though.
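
One way to keep a visitor on the same server is to derive the server from
a stable client identifier. This is only a sketch: sticky_server is a
hypothetical name, and the identifier is assumed to be something like a
cookie value handed out on the first request.

```python
import hashlib


def sticky_server(client_id, servers):
    """Map a client identifier to one server, the same one every time."""
    digest = hashlib.sha256(client_id.encode()).digest()
    # Use the first 8 bytes of the hash as an integer index into the list.
    index = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[index]


servers = ["mirror1", "mirror2", "mirror3"]
first = sticky_server("visitor-1234", servers)
again = sticky_server("visitor-1234", servers)
print(first == again)  # True: the same visitor always maps to the same server
```

The mapping depends on the server list, so adding or removing a server
moves many visitors to a different machine and loses their sessions.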

-jw


On Fri, 15 Oct 2004 10:57:08 -0400, Dan Falcone <dan@falconeweb.net> wrote: