Mailing List Archive

using memcached as a StateDB
I'm only toying with the idea at present, but it would make for a great
step up from MLDBM and company.
And although Apache::ASP::State makes quite a few assumptions about the
StateDB being a file, I think I could add memcached as a back end
without making it way too ugly (interface via Cache::Memcached(::Fast),
of course).
It sure beats having StateDB on ramdisk with NFS for clusters....

What do you think ?


---------------------------------------------------------------------
To unsubscribe, e-mail: asp-unsubscribe@perl.apache.org
For additional commands, e-mail: asp-help@perl.apache.org
Re: using memcached as a StateDB.. getting there
(ignoring the fact that no-one seems interested)

I came across Cache::Memcached::Tie which pretty much does most of the
work, however I'm having some issues - actually design decisions:
- I suppose StateDir should equal ``namespace'' in memcached parlance.
Do we want to make this transparent to the end user or add a
configuration option ?
- memcached has no concept of Lock(), UnLock() and some stuff I haven't
figured out yet. Add no-ops for these to Cache::Memcached::Tie or is
there some more elegant way to bypass them inside Apache::ASP ?
- It seems there's no obvious way to enumerate keys in memcached
(FIRSTKEY NEXTKEY). Perhaps keep a separate index of keys inside
memcached and add methods that use those (?)
- memcached needs a few extra configuration options. Most important is
obviously ``servers'', but also ``compress_threshold'' and ``debug''


RE: using memcached as a StateDB.. getting there
I, for one, am interested, though I haven't been able to do much or get
involved.

"Namespace" - I suggest making this an option that is user tunable, but pick
a good default value that hopefully won't need to be tweaked all the time.

"Lock/Unlock" - Each individual operation of memcached is atomic.. if two
processes attempt to write to one location at the same time, they will be
serialized and one will not corrupt the other. The problem is you don't
necessarily know which one will actually win. This means you can't guarantee
state if you don't do locking to ensure the right one wins. Consider an
incrementing counter. Assuming you do a get to read the value, increment it
by one, and then do a set to save it; each individual operation is safe
(get, set), but not the combination (get+set). If two hit at the same time,
the number would increment by 1 and not by 2 (one for each). I'd worry that
making lock and unlock a no-op would create new problems, especially as site
volume increases and chances of simultaneous updates increase.
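
The counter case can be replayed without a server. The sketch below uses a
small in-memory stand-in for memcached (the FakeCache class and its method
names are assumptions made for illustration, not the Cache::Memcached API)
and performs the bad interleaving by hand: both clients read before either
one writes. memcached itself offers an atomic incr command, which the last
part imitates.

```python
class FakeCache:
    """In-memory stand-in for a memcached server (illustration only)."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)

    def set(self, key, value):
        self._data[key] = value

    def incr(self, key, delta=1):
        # Done as a single step on the "server", so no other client can
        # slip in between the read and the write.
        self._data[key] += delta
        return self._data[key]


cache = FakeCache()
cache.set("hits", 0)

# Two clients each do get, add one, set; both reads happen before either write.
seen_by_a = cache.get("hits")
seen_by_b = cache.get("hits")
cache.set("hits", seen_by_a + 1)
cache.set("hits", seen_by_b + 1)
lost_update_result = cache.get("hits")   # one of the two increments is lost

# The same two increments done with the atomic operation.
cache.set("hits", 0)
cache.incr("hits")
cache.incr("hits")
atomic_result = cache.get("hits")

print(lost_update_result, atomic_result)
```

A session hash holds arbitrary values rather than a single counter, so an
atomic increment covers only this narrow case; anything else still needs a
lock or a check-and-set style retry.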

"No list of keys" - Usually I know what keys I'm stuffing into memcached and
I don't need to walk through the keys, or if I do I have a place outside of
memcache that has the keys to lookup. This one could get tricky. How would
you keep the list of keys in a second entry intact, especially if two
processes wanted to add a key at the same time?
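
One possible answer is memcached's check-and-set pair, gets and cas: read the
index together with a version token, and write it back only if nobody changed
it in the meantime, retrying otherwise. A minimal in-memory sketch of that
idea follows; FakeCache, add_to_index and the "__keys__" entry name are
hypothetical, and the stand-in simplifies one detail noted in the comments.

```python
class FakeCache:
    """In-memory stand-in imitating memcached's gets/cas semantics."""

    def __init__(self):
        self._data = {}      # key -> (value, version)

    def gets(self, key):
        # Returns (value, version token); (None, 0) if the key is missing.
        return self._data.get(key, (None, 0))

    def cas(self, key, value, version):
        # Store only if the entry still has the version the caller read.
        # Simplification: a real server rejects cas on a missing key, so the
        # very first write would have to use add instead.
        current_version = self._data.get(key, (None, 0))[1]
        if current_version != version:
            return False
        self._data[key] = (value, version + 1)
        return True


INDEX_KEY = "__keys__"   # hypothetical name for the index entry


def add_to_index(cache, new_key, max_tries=10):
    """Add new_key to the shared key index, retrying on conflicts."""
    for _ in range(max_tries):
        keys, version = cache.gets(INDEX_KEY)
        updated = sorted(set(keys or []) | {new_key})
        if cache.cas(INDEX_KEY, updated, version):
            return True
    return False


cache = FakeCache()

# Simulate a conflict: B reads the index, then A updates it before B writes.
stale_keys, stale_version = cache.gets(INDEX_KEY)
add_to_index(cache, "session-a")
conflict = cache.cas(INDEX_KEY, ["session-b"], stale_version)  # rejected

add_to_index(cache, "session-b")   # B goes through the retrying helper
print(conflict, cache.gets(INDEX_KEY)[0])
```

The stale write is refused instead of silently overwriting A's entry, and the
retry picks up the current index before adding to it.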

Other thoughts/suggestions:

Having not looked at your code or design this may not be applicable, but
consider making this generic, something that memcache or another cache
engine could be plugged into. If you're interested, and my time permits, I'd
be interested in working on part of this with you.

Greg


Re: using memcached as a StateDB.. getting there
Gregory S. Youngblood wrote:
> I, for one, am interested, though I haven't been able to do much or get
> involved.
>
> "Namespace" - I suggest making this an option that is user tunable, but pick
> a good default value that hopefully won't need to be tweaked all the time.
>
It would vaguely resemble ``StateDir''. Each separate StateDir signifies
a unique application.
Thus far, using StateDir in shmfs/tmpfs I'd use
/dev/shm/apache/website1 and /dev/shm/apache/website2 for two separate
applications. Something similar would apply to using a single memcached
for multiple applications, no ?
Of course, the default StateDir ``.state'' would mix things up a bit.
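
To picture what a namespace would buy here: it acts as a key prefix, so two
applications sharing one memcached can both store an ``application'' record
without colliding. A minimal sketch, with a plain dictionary standing in for
the shared server; NamespacedStore and the namespace strings are made up for
illustration.

```python
class NamespacedStore:
    """Prefix every key with a per-application namespace."""

    def __init__(self, backend, namespace):
        self._backend = backend          # shared dict standing in for one memcached
        self._prefix = namespace + ":"

    def set(self, key, value):
        self._backend[self._prefix + key] = value

    def get(self, key):
        return self._backend.get(self._prefix + key)


shared = {}
site1 = NamespacedStore(shared, "website1")
site2 = NamespacedStore(shared, "website2")

# Both applications use the same logical key.
site1.set("application", {"hits": 10})
site2.set("application", {"hits": 99})

print(site1.get("application"), site2.get("application"))
print(sorted(shared))
```

With a relative default such as ``.state'', two sites would end up with the
same prefix, which is the mix-up mentioned above.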
> "Lock/Unlock" - Each individual operation of memcached is atomic.. if two
> processes attempt to write to one location at the same time, they will be
> serialized and one will not corrupt the other. The problem is you don't
> necessarily know which one will actually win. This means you can't guarantee
> state if you don't do locking to ensure the right one wins.
Well, the original point of Lock and UnLock was to avoid corrupting the
on-disk files, but your point is valid..
> Consider an
> incrementing counter. Assuming you do a get to read the value, increment it
> by one, and then do a set to save it; each individual operation is safe
> (get, set), but not the combination (get+set). If two hit at the same time,
> the number would increment by 1 and not by 2 (one for each). I'd worry that
> making lock and unlock a no-op would create new problems, especially as site
> volume increases and chances of simultaneous updates increase.
>
Each user gets his own key (session-id in Apache::ASP) to write, apart
from the generic ``application''.
The case you're describing isn't handled in Apache::ASP already and it's
probably because the developer (end user in our case) should concern
himself with that.
But we can work on that one too if you feel so inclined.
> "No list of keys" - Usually I know what keys I'm stuffing into memcached and
> I don't need to walk through the keys, or if I do I have a place outside of
> memcache that has the keys to lookup. This one could get tricky. How would
> you keep the list of keys in a second entry intact, especially if two
> processes wanted to add a key at the same time?
>
According to perltie, tying hashes pretty much expects ``FIRSTKEY'' and
``NEXTKEY'' to work. But this is an inadequacy of Cache::Memcached::Tie
which we may or may not address.
I was (indirectly) asking Josh if these are actually needed for
Apache::ASP Sessions to work, seeing that they're defined in
Apache::ASP::Session but never actually called - not from within
Apache::ASP at least. But someone is bound to have used / wants to use
keys(%$Session) already..
> Other thoughts/suggestions:
>
> Having not looked at your code or design this may not be applicable, but
> consider making this generic, something that memcache or another cache
> engine could be plugged into. If you're interested, and my time permits, I'd
> be interested in working on part of this with you.
>
The thing is, Apache::ASP::State is too tightly bound to on-disk dbm
files and would require a major rewrite to facilitate other storage
engines. I suspect that's what held Josh back from implementing
Apache::Session storage. Come to think of it, there already is an
Apache::Session::Memcached thing out there, so perhaps we should focus
on making Apache::ASP::State work with an Apache::Session and friends
back-end instead of hacking around.
I'd feel more comfortable if we had help from the original author of
Apache::ASP for this (or at least his blessing ;)
Let's think this through the weekend and decide on Monday.

Best Regards,
Thanos Chatziathanassiou



Re: using memcached as a StateDB
I had my svn repository disk die on me recently, but still have my
working copy around and got some free time to hack it.
It turned into a real Apache::Session session store back-end for
Apache::ASP and seems to be working fine so far, although it is a bit
rough around the edges.

Due to ``internal'' and ``application'' not being valid keys for most
Apache::Session implementations, I had them hard-coded to
``00000000000000000000000000000000'' and
``ffffffffffffffffffffffffffffffff'' respectively. The ugly part is
Apache::ASP::State::ApacheSession::ServerAutoKeys, which creates these
automatically if they're needed when a request arrives.
I'm afraid it is prone to deadlock if there are multiple simultaneous
first requests and the session store is something heavier than memcached
(memcached and redis seem to be fine with it by design of course). Due
to this, it is off by default, but can be turned on in the server
configuration.
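
For stores that have it, one lock-free way to create the two reserved records
is an add-style operation, which stores a value only if the key does not yet
exist (memcached has an add command with that behaviour). Simultaneous first
requests can then all attempt the creation and exactly one wins. This is a
sketch of that idea and not of what ServerAutoKeys does; FakeCache and
ensure_reserved_keys are hypothetical names.

```python
INTERNAL_KEY = "0" * 32      # reserved id used for ``internal''
APPLICATION_KEY = "f" * 32   # reserved id used for ``application''


class FakeCache:
    """In-memory stand-in for a store with an add-if-absent operation."""

    def __init__(self):
        self._data = {}

    def add(self, key, value):
        # Like memcached's add: succeed only if the key is not present.
        if key in self._data:
            return False
        self._data[key] = value
        return True

    def get(self, key):
        return self._data.get(key)


def ensure_reserved_keys(cache):
    """Create the reserved records if missing; return the keys this call created."""
    return [key for key in (INTERNAL_KEY, APPLICATION_KEY) if cache.add(key, {})]


cache = FakeCache()
first = ensure_reserved_keys(cache)    # creates both records
second = ensure_reserved_keys(cache)   # finds them already there
print(len(first), len(second))
```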
The hard-coded 1MB item size limit in memcached is somewhat problematic
for internal and I had it run out without compression by creating about
28000 sessions. The space each session occupies in internal is fixed,
regardless of its contents, right ?
I suppose having the session manager clean up on a tighter schedule is
an option with a fast backend though.
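
As a rough check of that figure, assuming the limit is exactly 1 MiB and that
internal held little besides the per-session entries when it filled up, the
space per session can be estimated as follows:

```python
ITEM_LIMIT_BYTES = 1024 * 1024   # assumed: the 1MB item limit taken as 1 MiB
SESSIONS_AT_FAILURE = 28000      # approximate count reported above

bytes_per_session = ITEM_LIMIT_BYTES / SESSIONS_AT_FAILURE
print(f"{bytes_per_session:.1f} bytes per session")
```

That comes to roughly 37 bytes each, about the size of a 32-character key plus
a few bytes, which would fit a fixed-size entry but does not prove it.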

Some benchmarks look promising:
memcached running on localhost
$ perl multi_http.pl --requests=1000 --concurrency=5 --url=http://127.0.0.1:8080/asp/benchmarks/memcached/index.asp
Child 001: 200 requests success: 200, failure: 0 in 1.560713 sec
Child 003: 200 requests success: 200, failure: 0 in 1.564235 sec
Child 004: 200 requests success: 200, failure: 0 in 1.577891 sec
Child 002: 200 requests success: 200, failure: 0 in 1.630787 sec
Child 005: 200 requests success: 200, failure: 0 in 1.565008 sec
Parent total time: 1.66203 sec

classic sessions running on shmfs/tmpfs via MLDBM::Sync::SDBM_File
$ perl multi_http.pl --requests=1000 --concurrency=5 --url=http://127.0.0.1:8080/asp/benchmarks/classic/index.asp
Child 001: 200 requests success: 200, failure: 0 in 1.849258 sec
Child 005: 200 requests success: 200, failure: 0 in 1.973691 sec
Child 004: 200 requests success: 200, failure: 0 in 2.004447 sec
Child 002: 200 requests success: 200, failure: 0 in 2.064989 sec
Child 003: 200 requests success: 200, failure: 0 in 2.094894 sec
Parent total time: 2.106399 sec
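
Turning the two parent totals into throughput makes the comparison easier to
read. This is only arithmetic on the figures above, one run of each, so it
says nothing about variance between runs.

```python
TOTAL_REQUESTS = 1000   # 5 children x 200 requests each

# Parent total times copied from the two runs above, in seconds.
memcached_secs = 1.66203
classic_secs = 2.106399

memcached_rps = TOTAL_REQUESTS / memcached_secs
classic_rps = TOTAL_REQUESTS / classic_secs
time_saved = 1 - memcached_secs / classic_secs

print(f"memcached: {memcached_rps:.0f} req/s")
print(f"classic:   {classic_rps:.0f} req/s")
print(f"wall time saved: {time_saved:.0%}")
```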

I did not yet try with MLDBM::Sync::SDBM_File on an NFS mounted
filesystem (the server that would be hosting that died along with my svn
repository), but these results look tempting.
Since I'm afraid my disk might also die on me and I'd have to redo
everything from the start, I'm attaching what I have so far.

Once again, this has not been (properly) tested. This is quite ugly. It is
most definitely NOT production-stuff material. It is not even ready.
Caveat emptor. Still, it shows Apache::Session back-end can be made to work.


Configuration for the file served above:
for memcached version:
---8<---
PerlSetVar AllowSessionState 1
PerlSetVar AllowApplicationState 1
PerlSetVar StateDB Apache::Session::Memcached
#Memcached
PerlSetVar ApacheSessionParams Servers
PerlAddVar ApacheSessionParams 127.0.0.1:11211
PerlAddVar ApacheSessionParams Namespace
PerlAddVar ApacheSessionParams testing
PerlAddVar ApacheSessionParams CompressThreshold
PerlAddVar ApacheSessionParams 10000
PerlAddVar ApacheSessionParams AutoCreateApplicationInternalKeys
PerlAddVar ApacheSessionParams 1
---8<---
(The author of Apache::Session::Store::Memcached has not implemented
``Namespace'' and since I was already playing with his module, I cheated
and changed it so that it uses Cache::Memcached::Fast instead of plain
Cache::Memcached)
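
The ApacheSessionParams values above alternate between option names and option
values. One plausible way to read such a flat list, shown here as a sketch
and not as what the attached code does, is to pair the items up into a
mapping; pair_params is a hypothetical helper.

```python
def pair_params(flat):
    """Turn [name1, value1, name2, value2, ...] into a dict."""
    if len(flat) % 2 != 0:
        raise ValueError("expected an even number of items (name/value pairs)")
    items = iter(flat)
    # zip pulls from the same iterator twice, yielding (name, value) pairs.
    return dict(zip(items, items))


# The values from the memcached configuration above, in order.
flat = [
    "Servers", "127.0.0.1:11211",
    "Namespace", "testing",
    "CompressThreshold", "10000",
    "AutoCreateApplicationInternalKeys", "1",
]
params = pair_params(flat)
print(params["Servers"], params["Namespace"])
```

A forgotten value shifts every later pair, so rejecting odd-length lists early
catches the most likely configuration slip.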

for classic version:
---8<---
PerlSetVar AllowSessionState 1
PerlSetVar AllowApplicationState 1
PerlSetVar StateDB MLDBM::Sync::SDBM_File
PerlSetVar StateDir /dev/shm/apache
---8<---

Other tested back-ends:
#Sybase
PerlSetVar StateDB Apache::Session::Sybase
PerlSetVar ApacheSessionParams DataSource
PerlAddVar ApacheSessionParams dbi:Sybase:database=session_db;server=dev_server
PerlAddVar ApacheSessionParams UserName
PerlAddVar ApacheSessionParams username
PerlAddVar ApacheSessionParams Password
PerlAddVar ApacheSessionParams password
PerlAddVar ApacheSessionParams Commit
PerlAddVar ApacheSessionParams 1

#SQLite3
PerlSetVar StateDB Apache::Session::SQLite3
PerlSetVar ApacheSessionParams DataSource
PerlAddVar ApacheSessionParams dbi:SQLite:dbname=/tmp/session.db

#Postgres
PerlSetVar StateDB Apache::Session::Postgres
PerlSetVar ApacheSessionParams DataSource
PerlAddVar ApacheSessionParams DBI:Pg:dbname=dev_server;host=192.168.100.46
PerlAddVar ApacheSessionParams UserName
PerlAddVar ApacheSessionParams username
PerlAddVar ApacheSessionParams Password
PerlAddVar ApacheSessionParams password
PerlAddVar ApacheSessionParams Commit
PerlAddVar ApacheSessionParams 1

#Redis via self-made Apache::Session::Redis using Redis-0.08
PerlSetVar StateDB Apache::Session::Redis
PerlSetVar ApacheSessionParams Server
PerlAddVar ApacheSessionParams 127.0.0.1:6379
PerlAddVar ApacheSessionParams Namespace
PerlAddVar ApacheSessionParams testing