Mailing List Archive

Shambhala Modules Musings
After looking at this a bit closer this afternoon, I am beginning to
see some pretty incredible abuses of this module model.

I had kicked the following idea around with some members of this
group a while back, but now see a more obvious path to the dream.

Consider a CGI server...

You could create a separate module for each of your CGI applications.
Based on the AddType, you would call that particular module. You could
even have a "standalone" CGI server listening on another port for
handler requests from the main HTTP server. No more startup delays
for the various CGI programs. We're going to have to write a perl
bootstrapper to load in modules. :-)
Re: Shambhala Modules Musings
/*
* "Shambhala Modules Musings" by Randy Terbush <randy@zyzzyva.com>
* written Sun, 02 Jul 1995 17:59:43 -0500
*
* You could create a separate module for each of your CGI
* applications. Based on the AddType, you would call that particular
* module. You could even have a "standalone" CGI server listening on
* another port for handler requests from the main HTTP server. No
* more startup delays for the various CGI programs.

This is exactly what we did for Netsite's integrated imagemap. If
using magic MIME types makes you ill, in Netsite you can also assign
different functions to handle various URLs and patterns of URLs.

* We're going to have to write a perl bootstrapper to load in modules. :-)
*/

Also, think about foo.java

--Rob
Re: Shambhala Modules Musings [network wizard advice sought at bottom]
rst writes,
> The easiest way to fix this is to change the way that timeouts are
> handled. Here's one idea --- instead of doing an immediate longjmp(),
> just put the connection in an "aborted" state where the server will
> refuse to do any more I/O on it. (The easiest thing would be to just
> close the file descriptor).
>
> In fact, if I can safely assume that no sensible Unix-oid OS will
> attempt to restart I/O on a socket when a SIGALRM handler returns,
> this would allow me to ditch the longjmp() from the handling of client
> aborts. If that assumption is good, I'm going to try this (next
> weekend at the latest).

It might sound like an easy way out, but 0.7 would have a child
which hits a timeout/broken connection exit after logging it... It's
not as elegant as closing everything down and starting again, but
at least one doesn't have to worry about handling the signals in a
portable fashion.

My Unix guru friend says that anything other than a longjmp() from
a signal handler is prone to portability problems. If you do the
longjmp(), it has to do all the cleaning up.

An exit() after a timeout/broken connection shouldn't cause any
real efficiency problems.. At Cardiff they come in at a rate of
1 for every 30+ connections.. many of the children still manage to
reach their MaxRequestsPerChild (60 in this example) without hitting one.

-=-=-=

As for the config problem Rob asked for more info on..

I removed the offending line (it wasn't being used for anything) so
I no longer have the exact line, but here's the problem..

Shambhala says,

"AddType takes two arguments, a mime type followed by a file extension"

Shouldn't it allow things like

AddType text/html; charset=ISO-8859 html
and
AddType text/html; html htm

But the question remains,

is "charset=ISO-8859" a parameter or a file extension ?

maybe ""s should be allowed (to remove ambiguity)

AddType "text/html; charset=ISO-8859" html


Also, if AddType is meant to be equivalent to entries in mime.types, then
it needs to accept multiple file extensions

AddType image/gif; gif GIF
AddType text/html; html htm


Now might also be a good time to encourage the omission of "." from
the extensions... it'll mean a slight divergence from what's allowed
in 1.3, but the "." isn't supposed to be there... a well worded warning
message would probably suffice.


A bug fix..
http_main.c clen=sizeof(sa_client);
|
v
clen=sizeof(struct sockaddr_in)



rob
--
http://nqcd.lanl.gov/~hartill/
Re: Shambhala Modules Musings [network wizard advice sought at bottom]
> From: Randy Terbush <randy@zyzzyva.com>
> Reply-To: new-httpd@hyperreal.com
>
> After looking at this a bit closer this afternoon, I am beginning to
> see some pretty incredible abuses of this module model.
>
> Consider a CGI server...
>
> You could create a separate module for each of your CGI applications.
> Based on the AddType, you would call that particular module.

So far, not abusive at all... the only thing to beware of is that the
server itself is a rather less forgiving environment for anyone's code
than an external CGI process --- file descriptors it fails to close
and memory it fails to free all leak, etc. I think that for most
scripts, at most sites, it probably isn't worth the hassle. But if
you want the option, it's there.

> You could
> even have a "standalone" CGI server listening on another port for
> handler requests from the main HTTP server. No more startup delays
> for the various CGI programs.

On the other hand, if you're going to turn the scripts into
full-fledged internet servers in their own right, maybe turning them
into modules is an alternative to consider ;-).

> We're going to have to write a perl
> bootstrapper to load in modules. :-)

Ha! Bet you thought I wouldn't take that seriously!

What you're actually asking for is the ability to write modules in
Perl, java, or the other embedded language of your choice. The major
barrier to that in Shambhala as it stands is the handling of timeouts:
when a request times out or gets a SIGPIPE, the signal handler in
http_main.c does a longjmp() straight back to child_main(), which
frees everything in the per-transaction resource pool and asks for
another request.

The problem here is that everything which *isn't* tied into the
resource pool machinery in alloc.c is a potential leak. The code in
my development sandbox makes this a little less restrictive by
allowing just about anything to be registered with alloc.c as a
cleanup function, but you still can't get free of the resource pools
completely. What makes that a problem for embedded languages is that
files opened, memory allocated, etc., by the embedded language's
native primitives won't be tied in to any Shambhala resource pool, and
will be potential leaks.
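
To make the leak concern concrete, here is a minimal sketch of a pool that runs registered cleanup callbacks when it is cleared. Every name in it (`demo_pool`, `demo_register_cleanup`, `demo_clear_pool`, `demo_count_cleanup`) is hypothetical; it is not the alloc.c interface, only an illustration of the general idea of registering "just about anything" as a cleanup.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical names throughout; this is not the alloc.c API. */
typedef void (*cleanup_fn)(void *data);

struct cleanup {
    cleanup_fn fn;
    void *data;
    struct cleanup *next;
};

struct demo_pool {
    struct cleanup *cleanups;   /* most recently registered first */
};

/* Register fn(data) to run when the pool is cleared. Returns 0 on success. */
int demo_register_cleanup(struct demo_pool *p, cleanup_fn fn, void *data)
{
    struct cleanup *c;

    assert(p != NULL && fn != NULL);
    c = malloc(sizeof(*c));
    if (c == NULL)
        return -1;
    c->fn = fn;
    c->data = data;
    c->next = p->cleanups;
    p->cleanups = c;
    return 0;
}

/* Run every registered cleanup (newest first) and empty the list.
 * Returns the number of cleanups that ran. */
int demo_clear_pool(struct demo_pool *p)
{
    int ran = 0;

    assert(p != NULL);
    while (p->cleanups != NULL) {
        struct cleanup *c = p->cleanups;
        p->cleanups = c->next;
        c->fn(c->data);
        free(c);
        ran++;
    }
    return ran;
}

/* Example cleanup: count how many times it was called. */
void demo_count_cleanup(void *data)
{
    ++*(int *)data;
}
```

A file or buffer obtained through an embedded interpreter's own primitives never passes through `demo_register_cleanup`, so clearing the pool after a longjmp() would not release it. That is the leak described above.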

The easiest way to fix this is to change the way that timeouts are
handled. Here's one idea --- instead of doing an immediate longjmp(),
just put the connection in an "aborted" state where the server will
refuse to do any more I/O on it. (The easiest thing would be to just
close the file descriptor).

In fact, if I can safely assume that no sensible Unix-oid OS will
attempt to restart I/O on a socket when a SIGALRM handler returns,
this would allow me to ditch the longjmp() from the handling of client
aborts. If that assumption is good, I'm going to try this (next
weekend at the latest).

That leaves the following requirements on a language to be embedded
into Shambhala:

1) it should be reentrant --- that is, it should be possible for code
in, say, Perl to call the Shambhala C code, and to have that code
in turn invoke more Perl code. If not, you lose sub_requests.

2) Ideally, it should be thread-safe.

There probably aren't any essential conflicts between that and the
Perl5 embedding support (though it's a little hard to tell, given that
the documentation for embedded Perl5 consists, in its entirety, of the
sentence "Look at perl_main.c, and do something like that."). At any
rate, it doesn't seem infeasible, although you'd need someone
knowledgeable in Perl internals to do a proper job.

rst
Re: Shambhala Modules Musings [network wizard advice sought at bottom]
> > Now might also be a good time to encourage the omission of "." from
> > the extensions... it'll mean a slight divergence from what's allowed
> > in 1.3, but the "." isn't supposed to be there... a well worded warning
> > message would probably suffice.
> >
> > I'm not sure how you conclude what is or isn't "supposed" to exist in
> > these cases, but the '.'s *do* exist in a lot of places, and the cost
> > of skipping over them if they are present is essentially zero.
>
> The NCSA documentation. I think Randy looked it up at some point.
>
> True, they don't cause any real problems, other than confusing people
> as to what the config file options actually do. By having a warning, it
> should encourage existing users to clean up their conf files, that way
> they don't end up advising others of the incorrect syntax.. which is what
> has happened so far with AddType.

I do recall looking this up in the mime references we have linked into
the "library". It did not demand the omission, but seemed to indicate
that the proper syntax is sans '.'. I don't know if it is worth breaking
the server for. At about shambhala2, this was the behavior, and it failed
on an old config file on my test machine....
Re: Shambhala Modules Musings [network wizard advice sought at bottom]
> Ah, but one of my not-so-hidden agendas here is that I'm looking
> towards multithreading the thing (I'm not sure how it's possible to
> write a decent HTTP-NG server which *isn't* multithreaded, unless you
> turn the entire thing into a gigantic collage of state machines, which
> generally results in an unreadable mess). That doesn't give you the
> option of tossing the entire address space any time a timeout on one
> of the twenty-odd requests it might be serving gets aborted; you need
> to find another way out of it.

Sounds good. Will a threads-based system be portable? I'm thinking
of problems like the one Rob McCool pointed out, where a Solaris (I believe)
library was leaking, and hence forced the MaxRequestsPerChild approach.
Perhaps these issues are unrelated; I know nothing about threads.

> On the other subject... this syntax:
>
> AddType "text/html; charset=ISO-8859" html
>
> actually works, though you wouldn't know it because I forgot to make
> the ARENA_BUG_WORKAROUND properly conditional. (Sigh...). Actually,
> any "word" in a config file can be delimited by single or double
> quotes, in addition to blanks; backslash escapes should also work more
> or less as you expect.

nice... but see next comment.

> Also, if AddType is meant to be equivalent to entries in mime.types, then
> it needs to accept multiple file extensions
>
> AddType image/gif; gif GIF
> AddType text/html; html htm
>
> Easy enough; change the TAKE2 in the mod_mime command table to
> ITERATE2, and change the wording of the error message in the table to
> indicate that the extensions can be plural.

please read these points not as "how do I make the current version
of Shambhala do this", but as "a future version probably needs to do this".

At this stage I'm not interested in fine tuning the configuration, it
either works as I expect or it doesn't.. I can wait for updates.

> Now might also be a good time to encourage the omission of "." from
> the extensions... it'll mean a slight divergence from what's allowed
> in 1.3, but the "." isn't supposed to be there... a well worded warning
> message would probably suffice.
>
> I'm not sure how you conclude what is or isn't "supposed" to exist in
> these cases, but the '.'s *do* exist in a lot of places, and the cost
> of skipping over them if they are present is essentially zero.

The NCSA documentation. I think Randy looked it up at some point.

True, they don't cause any real problems, other than confusing people
as to what the config file options actually do. By having a warning, it
should encourage existing users to clean up their conf files, that way
they don't end up advising others of the incorrect syntax.. which is what
has happened so far with AddType.

Maybe someone will want to have a type which ends ..html :-)

> A bug fix..
> http_main.c clen=sizeof(sa_client);
> |
> v
> clen=sizeof(struct sockaddr_in)
>
>
> Hmmm... sizeof(*sa_client) is probably what's meant here, though the
> existing code is fairly clearly wrong...

same thing I guess; *sa_client would make it more portable should
sockaddr_in ever be changed for some weird platform.


rob
Re: Shambhala Modules Musings [network wizard advice sought at bottom]
> From: Rob Hartill <hartill@ooo.lanl.gov>
> Date: Mon, 3 Jul 95 10:21:43 MDT
>
> rst writes,
> > The easiest way to fix this is to change the way that timeouts are
> > handled. Here's one idea --- instead of doing an immediate longjmp(),
> > just put the connection in an "aborted" state where the server will
> > refuse to do any more I/O on it. (The easiest thing would be to just
> > close the file descriptor).
>
> It might sound like an easy way out, but 0.7 would have a child
> which hits a timeout/broken connection exit after logging it... It's
> not as elegant as closing everything down and starting again, but
> at least one doesn't have to worry about handling the signals in a
> portable fashion.

Ah, but one of my not-so-hidden agendas here is that I'm looking
towards multithreading the thing (I'm not sure how it's possible to
write a decent HTTP-NG server which *isn't* multithreaded, unless you
turn the entire thing into a gigantic collage of state machines, which
generally results in an unreadable mess). That doesn't give you the
option of tossing the entire address space any time a timeout on one
of the twenty-odd requests it might be serving gets aborted; you need
to find another way out of it.

> My Unix guru friend says that anything other than a longjmp() from
> a signal handler is prone to portability problems. If you do the
> longjmp(), it has to do all the cleaning up.

My counterproposal amounts to having the signal handler just change a
global variable... that's actually *less* than the longjmp().
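
A minimal sketch of that flag-setting approach follows. It assumes a POSIX system with sigaction(); installing the handler without SA_RESTART is what makes a blocked read() fail with EINTR rather than restart, which is the behaviour the proposal depends on. The names `demo_timed_out`, `demo_on_alarm`, `demo_install_alarm_handler` and `demo_read_with_timeout` are hypothetical and are not taken from http_main.c.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical flag name; sig_atomic_t is the type the C standard
 * guarantees can be written safely from a signal handler. */
static volatile sig_atomic_t demo_timed_out = 0;

static void demo_on_alarm(int sig)
{
    (void)sig;
    demo_timed_out = 1;     /* no longjmp(): just record the event */
}

/* Install the handler without SA_RESTART so a blocked read() returns
 * -1 with errno == EINTR instead of being restarted. */
int demo_install_alarm_handler(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = demo_on_alarm;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    return sigaction(SIGALRM, &sa, NULL);
}

/* Read with a timeout in seconds. Returns bytes read, 0 on EOF,
 * -1 on error, or -2 if the timeout fired. */
long demo_read_with_timeout(int fd, void *buf, size_t len, unsigned secs)
{
    long n;

    assert(buf != NULL);
    demo_timed_out = 0;
    alarm(secs);
    n = (long)read(fd, buf, len);
    alarm(0);               /* cancel any pending alarm */
    if (n < 0 && errno == EINTR && demo_timed_out)
        return -2;
    return n;
}
```

The caller, not the handler, decides what to do about the timeout, so ordinary cleanup code runs on the normal return path. One caveat of this simple form: an alarm that fires just before read() is entered would be missed, so a real server would need to close that window.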



On the other subject... this syntax:

AddType "text/html; charset=ISO-8859" html

actually works, though you wouldn't know it because I forgot to make
the ARENA_BUG_WORKAROUND properly conditional. (Sigh...). Actually,
any "word" in a config file can be delimited by single or double
quotes, in addition to blanks; backslash escapes should also work more
or less as you expect.
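
As an illustration of that word-splitting rule, here is a small sketch of a tokenizer that accepts blanks, single or double quotes, and backslash escapes. It is a hypothetical helper (`demo_get_word`), written only from the description above, and is not the server's config parser.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper, not the server's parser: copy the next "word"
 * from *line into out (at most outlen-1 chars) and advance *line.
 * Returns the stored word length, or -1 if no word is left. */
int demo_get_word(const char **line, char *out, size_t outlen)
{
    const char *s = *line;
    size_t n = 0;
    char quote = '\0';

    assert(outlen > 0);
    while (*s != '\0' && isspace((unsigned char)*s))
        s++;
    if (*s == '\0') {
        *line = s;
        out[0] = '\0';
        return -1;
    }
    if (*s == '"' || *s == '\'')
        quote = *s++;               /* remember which quote opened the word */

    while (*s != '\0') {
        if (quote != '\0' ? *s == quote : isspace((unsigned char)*s))
            break;
        if (*s == '\\' && s[1] != '\0')
            s++;                    /* backslash: take next char literally */
        if (n + 1 < outlen)
            out[n++] = *s;
        s++;
    }
    if (quote != '\0' && *s == quote)
        s++;                        /* step past the closing quote */
    out[n] = '\0';
    *line = s;
    return (int)n;
}
```

Called repeatedly on the AddType line above, this yields three words: `AddType`, `text/html; charset=ISO-8859`, and `html`.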

> Also, if AddType is meant to be equivalent to entries in mime.types, then
> it needs to accept multiple file extensions
>
> AddType image/gif; gif GIF
> AddType text/html; html htm

Easy enough; change the TAKE2 in the mod_mime command table to
ITERATE2, and change the wording of the error message in the table to
indicate that the extensions can be plural.

> Now might also be a good time to encourage the omission of "." from
> the extensions... it'll mean a slight divergence from what's allowed
> in 1.3, but the "." isn't supposed to be there... a well worded warning
> message would probably suffice.

I'm not sure how you conclude what is or isn't "supposed" to exist in
these cases, but the '.'s *do* exist in a lot of places, and the cost
of skipping over them if they are present is essentially zero.
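
The two points above (several extensions per AddType, and tolerating a leading '.') can be sketched together. This assumes that an ITERATE2-style directive calls its handler once per trailing argument with the first argument held fixed; the table layout and the names `demo_mime_table`, `demo_add_type` and `demo_find_type` are hypothetical, not mod_mime's.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DEMO_MAX_TYPES 16

/* Hypothetical table mapping an extension to a MIME type; the server's
 * real data structures are not shown in this thread. The strings are
 * not copied, so they must outlive the table. */
struct demo_mime_table {
    const char *ext[DEMO_MAX_TYPES];
    const char *type[DEMO_MAX_TYPES];
    size_t count;
};

/* Called once per extension, in the spirit of an ITERATE2-style
 * directive: first argument fixed, second argument repeated.
 * Returns 0 on success, -1 if the extension is empty or the table full. */
int demo_add_type(struct demo_mime_table *t, const char *type, const char *ext)
{
    assert(t != NULL && type != NULL && ext != NULL);
    if (*ext == '.')
        ext++;                      /* tolerate ".html" as well as "html" */
    if (*ext == '\0' || t->count >= DEMO_MAX_TYPES)
        return -1;
    t->ext[t->count] = ext;
    t->type[t->count] = type;
    t->count++;
    return 0;
}

/* Look up the type for an extension given without a leading dot. */
const char *demo_find_type(const struct demo_mime_table *t, const char *ext)
{
    size_t i;

    assert(t != NULL && ext != NULL);
    for (i = 0; i < t->count; i++)
        if (strcmp(t->ext[i], ext) == 0)
            return t->type[i];
    return NULL;
}
```

Skipping the dot costs one comparison per extension at config time, which is the "essentially zero" cost mentioned above.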

> A bug fix..
> http_main.c clen=sizeof(sa_client);
> |
> v
> clen=sizeof(struct sockaddr_in)


Hmmm... sizeof(*sa_client) is probably what's meant here, though the
existing code is fairly clearly wrong...
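
The difference is easy to see in isolation. The sketch below assumes, as the suggested fix implies, that sa_client is a pointer; `struct demo_addr` is a hypothetical stand-in for whatever address structure a given platform uses, and the function names are made up for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the platform's client address structure. */
struct demo_addr {
    char bytes[64];
};

/* What the existing code computes if sa_client is a pointer:
 * the size of the pointer itself, not of the structure. */
size_t demo_len_wrong(const struct demo_addr *sa_client)
{
    return sizeof(sa_client);
}

/* The suggested form: the size of whatever sa_client points to,
 * which stays correct if the structure type changes. */
size_t demo_len_right(const struct demo_addr *sa_client)
{
    assert(sa_client != NULL);
    return sizeof(*sa_client);
}
```

Writing `sizeof(*sa_client)` also avoids naming the structure type at all, which matters if that type differs between platforms.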

rst
Re: Shambhala Modules Musings [network wizard advice sought at bottom]
> same thing I guess; *sa_client would make it more portable should
> sockaddr_in ever be changed for some weird platform.
>
> Are Linux and Solaris weird enough for you? (It's not sockaddr_in for
> either of those, or for Next).

Maybe that explains some of the problems they were having with 0.7.x
I mistakenly thought they were using the same structure, but that's only
the case for sa_server, not sa_client... oh well, as long as we get it
right eventually.

rob
Re: Shambhala Modules Musings [network wizard advice sought at bottom]
> Failure to handle the '.foo' syntax was reported to me as a bug, and
> fixed, sometime after that. (Doing what the user obviously wanted in
> this case is at *least* as easy as doing anything else).

So is '..foo' possible? It doesn't sound like it, if '.foo' gets translated
to 'foo'. I'll never use it, so do whatever you think is best.
Re: Shambhala Modules Musings [network wizard advice sought at bottom]
> From: Rob Hartill <hartill@ooo.lanl.gov>
> Date: Mon, 3 Jul 95 11:34:33 MDT
>
> Sounds good. Will a threads-based system be portable? I'm thinking
> of problems like the one Rob McCool pointed out, where a Solaris (I believe)
> library was leaking, and hence forced the MaxRequestsPerChild approach.
> Perhaps these issues are unrelated; I know nothing about threads.

The irony here is that Solaris has one of the few kernels which has
decent native threads support. To answer the question: yes, when
you've got leaky C libraries, you have to rotate server processes
*eventually*, but at the same time, when a multithreaded server just
up and dies, every transaction that it was processing gets
unceremoniously aborted, which is not good. There are two ways of
dealing with this:

1) Live with the aborts, but rotate the processes rarely enough that
they aren't a *significant* problem (for some suitable value of
"significant"). Dying on every timeout is dying too often for
this --- I'm not terribly busy here (a mere 150K reqs/day), and
it's still not uncommon to see several timeouts a minute.

2) Adopt a more gradual approach to process rotation, where the
"waning" server process stops accepting new requests when the
"waxing" one comes on-line, but finishes the ones it had in
progress before actually kicking the bucket.

Either one is workable (though I have an obvious preference for #2).
In short, this issue can be worked out.

What's more of a concern is the portability of the threading interface
itself. I know there are some PD threads libraries which are at least
*supposed* to be moderately portable, but I admit that I need to
investigate the matter further.

> same thing I guess; *sa_client would make it more portable should
> sockaddr_in ever be changed for some weird platform.

Are Linux and Solaris weird enough for you? (It's not sockaddr_in for
either of those, or for Next).

rst
Re: Shambhala Modules Musings [network wizard advice sought at bottom]
> From: Randy Terbush <randy@zyzzyva.com>
> Reply-To: new-httpd@hyperreal.com
>
> I do recall looking this up in the mime references we have linked into
> the "library". It did not demand the omission, but seemed to indicate
> that the proper syntax is sans '.'. I don't know if it is worth breaking
> the server for. At about shambhala2, this was the behavior, and it failed
> on an old config file on my test machine....

Failure to handle the '.foo' syntax was reported to me as a bug, and
fixed, sometime after that. (Doing what the user obviously wanted in
this case is at *least* as easy as doing anything else).

rst
Re: Shambhala Modules Musings [network wizard advice sought at bottom]
> From: Rob Hartill <hartill@ooo.lanl.gov>
> Date: Mon, 3 Jul 95 14:22:29 MDT
>
> > Failure to handle the '.foo' syntax was reported to me as a bug, and
> > fixed, sometime after that. (Doing what the user obviously wanted in
> > this case is at *least* as easy as doing anything else).
>
> So is '..foo' possible? It doesn't sound like it, if '.foo' gets translated
> to 'foo'. I'll never use it, so do whatever you think is best.

Actually, '..foo' wouldn't ever have worked, because the code regards
'.' as a delimiter for suffixes --- on this theory, '..foo' is two
suffixes, not one, the first being null. You'd have to really mess
with the code to get it to treat the second '.' as part of a suffix
delimited by the first, and I'm not sure I see the point.
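
The "'.' is a delimiter" rule can be shown with a short sketch. This is a hypothetical helper (`demo_split_suffixes`), written from the description in the paragraph above rather than from the server's code: every '.' starts a new suffix, so two adjacent dots produce an empty suffix followed by a non-empty one.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: treat every '.' in name as a delimiter that
 * starts a new suffix. Stores a pointer to the start and the length of
 * each suffix (up to max) and returns how many were found. */
size_t demo_split_suffixes(const char *name, const char *starts[],
                           size_t lens[], size_t max)
{
    size_t n = 0;
    const char *dot;

    assert(name != NULL);
    dot = strchr(name, '.');
    while (dot != NULL && n < max) {
        const char *begin = dot + 1;
        const char *next = strchr(begin, '.');

        starts[n] = begin;
        lens[n] = next ? (size_t)(next - begin) : strlen(begin);
        n++;
        dot = next;     /* the next '.' opens the following suffix */
    }
    return n;
}
```

For a name such as `index..foo` this reports two suffixes, the first of length zero and the second `foo`, so a single suffix spelled `.foo` never appears.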

Besides, it seems clear that a whole lot of people, including many in
this group, are already using the 'AddType foo/bar .fubar' syntax no
matter *what* the documentation says. It might as well work,
particularly since that's no more trouble than doing anything else
with it (e.g., declaring it a syntax error and wrecking people's
existing setups).

rst
Re: Shambhala Modules Musings [network wizard advice sought at bottom]
> What's more of a concern is the portability of the threading interface
> itself. I know there are some PD threads libraries which are at least
> *supposed* to be moderately portable, but I admit that I need to
> investigate the matter further.

Chris Provenzano has a very good POSIX Threads package called pthreads
which is available at ftp://sipb.mit.edu/pub/pthreads

It has been ported to: Solaris, SunOS, Ultrix, Linux, NetBSD, FreeBSD, BSDI,
OSF/1 [for the alpha] (Digital Unix), and HPUX. He is currently working
on an IRIX port.

Matthew Gray --------------------------------- voice: (617) 577-9800
net.Genesis fax: (617) 577-9850
56 Rogers St. mkgray@netgen.com
Cambridge, MA 02142-1119 ------------- http://www.netgen.com/~mkgray