Mailing List Archive

lvs, squid and efficiency
Suppose I have two real servers running Squid, and an LVS in front of
them, with round-robin.
Now, suppose one web page is cached on the first real server, a client
requests that page, and LVS redirects it to the second real server. The
result will be a cache miss, even if the object was stored on the
other real server.

Is there any way LVS can be combined with an HTTP proxy, and not lose
cache efficiency?

--
Florin Andrei
Re: lvs, squid and efficiency
That's what the LBLC and DH schedulers are for. (LBLC uses persistence
memory, and DH uses a destination hash. LBLC will lead to better
balancing, DH is more scalable.)
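
The difference from round-robin can be sketched in a few lines of Python. This is only a toy model under stated assumptions: the cache names, the example destination addresses, and the use of CRC32 as the hash are all made up for illustration and are not what the LVS kernel code does.

```python
import zlib
from itertools import cycle

CACHES = ["squid1", "squid2"]  # hypothetical real-server names


def make_round_robin(caches):
    """Return a scheduler that ignores the destination and rotates caches."""
    rotation = cycle(caches)
    return lambda dest_ip: next(rotation)


def make_dest_hash(caches):
    """Return a scheduler that always maps one destination to one cache.

    CRC32 is an assumption made for this sketch, not the kernel's hash.
    """
    return lambda dest_ip: caches[zlib.crc32(dest_ip.encode()) % len(caches)]


def hit_rate(requests, scheduler, caches=CACHES):
    """Replay requests and return the fraction served from a warm cache."""
    stored = {cache: set() for cache in caches}
    hits = 0
    for dest_ip in requests:
        cache = scheduler(dest_ip)
        if dest_ip in stored[cache]:
            hits += 1
        else:
            stored[cache].add(dest_ip)  # miss: fetch and keep a copy
    return hits / len(requests)


# Three made-up origin servers, each requested four times.
requests = ["192.0.2.10", "192.0.2.20", "192.0.2.30"] * 4

print("round-robin:", hit_rate(requests, make_round_robin(CACHES)))  # 0.5
print("dest hash:  ", hit_rate(requests, make_dest_hash(CACHES)))    # 0.75
```

In this toy run, round-robin ends up storing every object on both caches, while the hash stores each object once and only misses on the first request for it.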

Florin Andrei wrote:

> Suppose I have two real servers running Squid, and an LVS in front of
> them, with round-robin.
> Now, suppose one web page is cached on the first real server, a client
> requests that page, and LVS redirects it to the second real server. The
> result will be a cache miss, even if the object was stored on the
> other real server.
>
> Is there any way LVS can be combined with an HTTP proxy, and not lose
> cache efficiency?


--
Joe Cooper <joe@swelltech.com>
Affordable Web Caching Proxy Appliances
http://www.swelltech.com
Re: lvs, squid and efficiency
Florin Andrei wrote:
>
> Suppose I have two real servers running Squid, and an LVS in front of
> them, with round-robin.
> Now, suppose one web page is cached on the first real server, a client
> requests that page, and LVS redirects it to the second real server. The
> result will be a cache miss, even if the object was stored on the

Thomas Proell gave us some code last October that handles this situation.
It's now the -dh scheduler (in the 2.4.x kernel code). I haven't tried it;
I've only seen the postings about it.

Joe

--
Joseph Mack PhD, Senior Systems Engineer, Lockheed Martin
contractor to the National Environmental Supercomputer Center,
mailto:mack.joseph@epa.gov ph# 919-541-0007, RTP, NC, USA
RE: lvs, squid and efficiency
I suggest you do a web search for Rice and LARD...
(I am not joking, you will find something useful
and relevant to your question ;-)

/sG
Re: lvs, squid and efficiency
Dear Florin,

I would suggest that you peer the two servers. If a cache miss occurs on
one server then, with peering, it will first contact the other server to
see if it has the item in its cache. If that fails, the first server
will then go and do a direct fetch. Setting up the two servers so that
they peer with each other would help your case a lot.

Yours sincerely

David Ruwoldt

Florin Andrei wrote:
>
> Suppose I have two real servers running Squid, and an LVS in front of
> them, with round-robin.
> Now, suppose one web page is cached on the first real server, a client
> requests that page, and LVS redirects it to the second real server. The
> result will be a cache miss, even if the object was stored on the
> other real server.
>
> Is there any way LVS can be combined with an HTTP proxy, and not lose
> cache efficiency?
>
> --
> Florin Andrei
>
> _______________________________________________
> LinuxVirtualServer.org mailing list - lvs-users@LinuxVirtualServer.org
> Send requests to lvs-users-request@LinuxVirtualServer.org
> or go to http://www.in-addr.de/mailman/listinfo/lvs-users

--
David Ruwoldt
Senior Systems Specialist
Information Technology Services
Level 7, 10 Pulteney St
ADELAIDE UNIVERSITY SA 5005
AUSTRALIA

-----------------------------------------------------------
This email message is intended only for the addressee(s)
and contains information which may be confidential and/or
copyright. If you are not the intended recipient please
do not read, save, forward, disclose, or copy the contents
of this email. If this email has been sent to you in error,
please notify the sender by reply email and delete this
email and any copies or links to this email completely and
immediately from your system. No representation is made
that this email is free of viruses. Virus scanning is
recommended and is the responsibility of the recipient.
Re: lvs, squid and efficiency
On 20 Mar 2001 09:10:20 +1030, David Ruwoldt wrote:
> Dear Florin,
>
> I would suggest that you peer the two servers. If a cache miss occurs on
> one server then with peering it will first contact the other server to
> see if it has the item in its cache. If that fails the first server

But what if that *doesn't* fail?
I'm not sure what peering does.

--
Florin Andrei
RE: lvs, squid and efficiency
On Mon, 19 Mar 2001, Steve Gonczi wrote:

> I suggest you do a web search for Rice and LARD...
> (I am not joking, you will find something useful
> and relevant to your question ;-)

Do they have working code? (I'll download the paper
at work tomorrow and look then.)

Joe

--
Joseph Mack mack@ncifcrf.gov
RE: lvs, squid and efficiency
Yes, they have a research prototype.
Be aware, it is NOT GPL-ed.

>do they have working code? (I'll download the paper
>at work tomorrow and look then)
RE: lvs, squid and efficiency
On Mon, 19 Mar 2001, Steve Gonczi wrote:

> Yes, they have a research prototype.
> Be aware, it is NOT GPL-ed.

So universities aren't even pretending to be founts of knowledge for
society now?

I downloaded their paper and didn't find anything about real code,
anywhere obvious (I couldn't find the big DOWNLOAD button for instance)
even in section 5 where they say they have working code.


Joe

--
Joseph Mack mack@ncifcrf.gov
Re: lvs, squid and efficiency
Dear Florin,

If it doesn't fail, then the server simply gets the data from the other
server and places it in its own cache. Example:

User Request for Web Address -> Server 1

Server 1 peered with Server 2

Server 1 does not have the information and asks Server 2. Server 2 has
the information and so passes it back to Server 1. Server 1 then has
the information cached and also passes it on to the user.

Server 1 does not have the information and asks Server 2. Server 2 replies
that it does not have the information. Server 1 goes out and gets the
information from the web, caches it, and passes it back to the user.

Server 1 has the information cached and so just passes it back to the user.

User Request for Web Address -> Server 2

Server 2 peered with Server 1

Server 2 does not have the information and asks Server 1. Server 1 has
the information and so passes it back to Server 2. Server 2 then has
the information cached and also passes it on to the user.

Server 2 does not have the information and asks Server 1. Server 1 replies
that it does not have the information. Server 2 goes out and gets the
information from the web, caches it, and passes it back to the user.

Server 2 has the information cached and so just passes it back to the user.

So you would want each server to peer with the other. If you lose a
server, then the server that is still up will just go direct to the web,
as it cannot contact the other server. Hope this makes sense.
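
The cases above can be walked through with a small Python model. It is a sketch only: the `Cache` class, the `origin` function and the URLs are hypothetical, and a real Squid sibling setup queries its peer over the network (for example with ICP or cache digests) rather than reading the peer's store directly.

```python
class Cache:
    """Toy proxy cache with one optional sibling (not Squid's real protocol)."""

    def __init__(self, name):
        self.name = name
        self.store = {}    # URL -> cached body
        self.peer = None   # sibling cache; None models an unreachable peer

    def fetch(self, url, origin):
        """Return (body, source), where source is 'local', 'peer' or 'origin'."""
        if url in self.store:
            return self.store[url], "local"
        if self.peer is not None and url in self.peer.store:
            body = self.peer.store[url]      # sibling hit
            self.store[url] = body           # keep our own copy too
            return body, "peer"
        body = origin(url)                   # both missed: go direct
        self.store[url] = body
        return body, "origin"


def origin(url):
    """Stand-in for fetching from the real web server."""
    return f"content of {url}"


server1, server2 = Cache("server1"), Cache("server2")
server1.peer, server2.peer = server2, server1

url = "http://example.com/"
print(server1.fetch(url, origin)[1])  # origin
print(server2.fetch(url, origin)[1])  # peer
print(server2.fetch(url, origin)[1])  # local

server1.peer = None                   # server 2 goes down
print(server1.fetch("http://example.org/", origin)[1])  # origin
```

Note that a peer hit still costs an extra hop between the two caches, and the object ends up stored on both of them.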

Yours sincerely

David Ruwoldt

Florin Andrei wrote:
>
> On 20 Mar 2001 09:10:20 +1030, David Ruwoldt wrote:
> > Dear Florin,
> >
> > I would suggest that you peer the two servers. If a cache miss occurs on
> > one server then with peering it will first contact the other server to
> > see if it has the item in its cache. If that fails the first server
>
> But what if that *doesn't* fail?
> I'm not sure what peering does.
>
> --
> Florin Andrei

--
David Ruwoldt
Senior Systems Specialist
Information Technology Services
Level 7, 10 Pulteney St
ADELAIDE UNIVERSITY SA 5005
AUSTRALIA

Re: lvs, squid and efficiency
On 20 Mar 2001 11:11:00 +1030, David Ruwoldt wrote:
>
> If it doesn't fail then the server simply gets the data from the other
> server and places it in its own cache. Example
>
> So you would want each server to peer with the other. If you lose a
> server then the server that is still up will just go direct to the web
> as it could not contact the other server. Hope this makes sense.

That's great! So, actually, I only need to use persistent port (besides
peering proxies). Just like the guys from JANET did.

--
Florin Andrei
Re: lvs, squid and efficiency
Florin Andrei wrote:
>
> On 20 Mar 2001 11:11:00 +1030, David Ruwoldt wrote:
> >
> > If it doesn't fail then the server simply gets the data from the other
> > server and places it in its own cache. Example
> >
> > So you would want each server to peer with the other. If you lose a
> > server then the server that is still up will just go direct to the web
> > as it could not contact the other server. Hope this makes sense.
>
> That's great! So, actually, I only need to use persistent port (besides
> peering proxies). Just like the guys from JANET did.

well yes, you have the idea, but the -dh scheduler handles that for you.
You don't want to go to the peer if you can go to the box that has the data.

Joe


--
Joseph Mack PhD, Senior Systems Engineer, Lockheed Martin
contractor to the National Environmental Supercomputer Center,
mailto:mack.joseph@epa.gov ph# 919-541-0007, RTP, NC, USA
Re: lvs, squid and efficiency
On 19 Mar 2001, Florin Andrei wrote:
> That's great! So, actually, I only need to use persistent port (besides
> peering proxies). Just like the guys from JANET did.

FWIW, we don't use persistent connections. Just a lot of cache digests :-)

--
Andrew Veitch, National & Local Web Cache Support andrew.veitch@man.ac.uk
Manchester Computing, University of Manchester http://wwwcache.ja.net/
Re: lvs, squid and efficiency
On 20 Mar 2001 06:21:10 -0500, Joseph Mack wrote:
>
> well yes, you have the idea, but the -dh scheduler handles that for you.
> You don't want to go to the peer if you can go to the box that has the data.

Sounds interesting, and seems to be faster than peering.
How stable is -dh? Has it been tested with kernel-2.4 a lot?

--
Florin Andrei
Re: lvs, squid and efficiency
Florin Andrei wrote:
>
> > well yes, you have the idea, but the -dh scheduler handles that for you.
> > You don't want to go to the peer if you can go to the box that has the data.
>
> Sounds interesting, and seems to be faster than peering.

that's the idea

> How stable is -dh? Has it been tested with kernel-2.4 a lot?

<exuberant sales pitch>
It's brand new in 2.4. It's the usual LVS quality and you
could be one of the first to try it out :-) (step this way...)

Actually, Thomas Proell spent most of last year working on it.
It was for his Master's thesis, and it wound up
as a patch to 2.2.17. Thomas has now gone off to a real
job where he is being paid to code, and he's too busy
paying off his debts to do anything else, so it doesn't
look like he'll have time to do a patch for 2.2.18.

Wensong doesn't have money pressures like this, and being an
academic he does what he likes, so he did the port to
2.4 as the -dh scheduler.
</exuberant sales pitch>

Joe

--
Joseph Mack PhD, Senior Systems Engineer, Lockheed Martin
contractor to the National Environmental Supercomputer Center,
mailto:mack.joseph@epa.gov ph# 919-541-0007, RTP, NC, USA
Re: lvs, squid and efficiency
On 20 Mar 2001 16:20:50 -0500, Joseph Mack wrote:
> Florin Andrei wrote:
> >
> > How stable is -dh? Has it been tested with kernel-2.4 a lot?
>
> <exuberant sales pitch>
> It's brand new in 2.4. It's the usual LVS quality and you
> could be one of the first to try it out :-) (step this way...)

lol...

Ok. Now I'm evaluating many different cache clusters: LVS, usual ICP,
digests, and so on.
There are pros and cons for each one.
Seems like LVS in front of a good server farm is the best for me. Only
the scheduler remains to be chosen.
-dh looks very nice. Maybe the load will be distributed a little bit
unevenly with it, but the overall performance has to be better.

--
Florin Andrei
Re: lvs, squid and efficiency
On 20 Mar 2001 17:23:10 +0000, Andrew Veitch wrote:
> On 19 Mar 2001, Florin Andrei wrote:
> > That's great! So, actually, I only need to use persistent port (besides
> > peering proxies). Just like the guys from JANET did.
>
> FWIW, we don't use persistent connections. Just a lot of cache digests :-)

But... why? Aren't persistent connections supposed to be more
appropriate for things like HTTP proxies?

--
Florin Andrei
Re: lvs, squid and efficiency
Florin Andrei wrote:

> On 20 Mar 2001 16:20:50 -0500, Joseph Mack wrote:
>
>> Florin Andrei wrote:
>>
>>> How stable is -dh? Has it been tested with kernel-2.4 a lot?
>>
>> <exuberant sales pitch>
>> It's brand new in 2.4. It's the usual LVS quality and you
>> could be one of the first to try it out :-) (step this way...)
>
>
> lol...
>
> Ok. Now I'm evaluating many different cache clusters: LVS, usual ICP,
> digests, and so on.
> There are pros and cons for each one.
> Seems like LVS in front of a good server farm is the best for me. Only
> the scheduler remains to be chosen.
> -dh looks very nice. Maybe the load will be distributed a little bit
> unevenly with it, but the overall performance has to be better.

I would expect that until you get into very big clusters (800+ reqs/sec,
probably) or multiple balancers, LBLC will work just as well. There was
quite a heavy debate on the list about 6 months ago, regarding the
choice of schedulers for web caches (Thomas Proell and I coming down
firmly on the side of a destination hash, and many others coming down
firmly on the side of what became LBLC). Thankfully, Wensong was wiser
than all of us, and decided not to choose...putting them both in! ;-)
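
For contrast with a pure hash, the "remember where each destination went" idea behind LBLC can be sketched roughly in Python. This is a simplification under assumptions, not the LVS implementation: the class name, the fixed overload threshold, and the reassignment rule are invented for illustration.

```python
class LblcSketch:
    """Toy locality-with-memory scheduler (simplified, hypothetical rules)."""

    def __init__(self, servers, overload_threshold):
        self.active = {server: 0 for server in servers}  # open connections
        self.assigned = {}                # destination IP -> remembered server
        self.threshold = overload_threshold

    def schedule(self, dest_ip):
        """Reuse the remembered server unless it is missing or overloaded."""
        server = self.assigned.get(dest_ip)
        if server is None or self.active[server] >= self.threshold:
            # Fall back to the least-loaded server and remember the choice.
            server = min(self.active, key=self.active.get)
            self.assigned[dest_ip] = server
        self.active[server] += 1
        return server

    def release(self, server):
        """Call when a connection to the given server finishes."""
        self.active[server] -= 1


lb = LblcSketch(["cache1", "cache2"], overload_threshold=2)
print(lb.schedule("192.0.2.10"))  # cache1
print(lb.schedule("192.0.2.10"))  # cache1
print(lb.schedule("192.0.2.10"))  # cache2
```

The third request moves to the other cache because the remembered one has reached the threshold. That is the balancing advantage, and also the reason the same content can drift across caches over time.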

But both will work quite well. DH will cause a few hotspots (one cache
may see 20-30% higher loads at times than others in the cluster), but it
does guarantee that you can keep scaling it almost forever. And you can
even add multiple balancers to the picture, since given an equal number
of caches and the same ordering of the caches, the same IPs will be
hashed to the same caches. (At least that's the theory; I haven't
tested multiple balancers at all.)
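
The multiple-balancer argument can be checked on paper with a short Python sketch. The assumptions are the same as before and are mine, not LVS's: CRC32 stands in for the real hash, and the cache names and destination addresses are made up.

```python
import zlib
from collections import Counter


def dest_hash(dest_ip, caches):
    """Pick a cache by hashing the destination IP (CRC32 is an assumption)."""
    return caches[zlib.crc32(dest_ip.encode()) % len(caches)]


balancer_a = ["cache1", "cache2", "cache3"]
balancer_b = list(balancer_a)                  # same caches, same order
balancer_c = ["cache2", "cache1", "cache3"]    # same caches, different order

dests = [f"198.51.100.{i}" for i in range(1, 101)]  # made-up destinations

agree_ab = all(dest_hash(d, balancer_a) == dest_hash(d, balancer_b) for d in dests)
agree_ac = sum(dest_hash(d, balancer_a) == dest_hash(d, balancer_c) for d in dests)
load = Counter(dest_hash(d, balancer_a) for d in dests)

print("A and B always agree:", agree_ab)             # True
print("A and C agree on", agree_ac, "of", len(dests))
print("destinations per cache:", dict(load))
```

Balancers A and B need no shared state to agree, which is what makes the scheme scale. Balancer C shows why the ordering matters. The per-cache counts only show how destinations are spread; real hotspots also depend on how popular each destination is.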

That said, I still fall on the side of preferring DH, as it is so
consistent. I worry about the long term content distribution across
caches when using LBLC, but I can't prove that there is a problem with
it, since I haven't done large scale long term testing (that is coming
on my to-do list in a few weeks).
--
Joe Cooper <joe@swelltech.com>
Affordable Web Caching Proxy Appliances
http://www.swelltech.com
Re: lvs, squid and efficiency
Hello,

On 19 Mar 2001, Florin Andrei wrote:

>
> Suppose I have two real servers running Squid, and an LVS in front of
> them, with round-robin.
> Now, suppose one web page is cached on the first real server, a client
> requests that page, and LVS redirects it to the second real server. The
> result will be a cache miss, even if the object was stored on the
> other real server.

I hope you will prefer LBLCR and not LBLC. And there is probably
another, better solution: for example, a variant with cooperation
between Squid and LVS. On LVS module load we can load the information
extracted from the proxy cache. This way the (new) LVS scheduler
will build the table of associations on start. Maybe the real servers
can be enumerated, etc. The problem is how to transform the
names (or AS numbers) to IP addresses or Class C networks, for example.
That is a job for user space. The new scheduler will need a way to receive
all these associations in an independent format. This way the information
will be persistent and will not change on scheduler/director restart.
Only an idea.

> Is there any way LVS can be combined with an HTTP proxy, and not lose
> cache efficiency?
>
> --
> Florin Andrei


Regards

--
Julian Anastasov <ja@ssi.bg>