Mailing List Archive

Migration from Apache to Cherokee: URI tweaking
Hi,

I don't know if this is the proper place for such a question, so excuse me if this is considered "noise".

I'm presently testing Cherokee as an alternative to Apache for the LXR project (see http://lxr.sourceforge.net). As long as I use elementary LXR configuration features, it works fine.

But when I try to serve several databases (for short, a database is the closest ordinary equivalent of its "service") with a single application instance, a trick is played on the URI, and I can't figure out how to convert it.

LXR is driven by this kind of URI:

http://hostname/LXR_service_signature/DB_id/script_file/path_for_script?arguments

i.e. an argument-like element (DB_id) is interspersed inside the web path of the script. Under Apache, the AliasMatch directive strips off this information and simultaneously routes the request to an alternate document root. The important point is that the original URI is not changed and remains available, unaltered, for parsing by the script, which retrieves the DB_id from it.
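
To make the layout concrete, here is a minimal Python sketch of how such a URI decomposes. It is only an illustration of the URI shape described above, not LXR's actual parsing code; the function name `split_lxr_uri` and its `signature` parameter are hypothetical.

```python
from urllib.parse import urlsplit


def split_lxr_uri(uri, signature="LXR_service_signature"):
    """Split an LXR-style URI into (db_id, script, script_path, query).

    Hypothetical helper: it only mirrors the layout
    /signature/DB_id/script_file/path_for_script?arguments
    """
    parts = urlsplit(uri)
    # At most 4 pieces: signature, DB_id, script name, rest of the path.
    segments = parts.path.lstrip("/").split("/", 3)
    if len(segments) < 3 or segments[0] != signature:
        raise ValueError("not an LXR URI: " + uri)
    db_id, script = segments[1], segments[2]
    script_path = "/" + segments[3] if len(segments) > 3 else ""
    return db_id, script, script_path, parts.query


if __name__ == "__main__":
    print(split_lxr_uri(
        "http://hostname/LXR_service_signature/DB_id/script_file/path_for_script?arguments"
    ))
```

The point of the sketch is that the script can only recover DB_id if it still sees the whole original path.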

In my conversion attempt, I used either a directory rule (on LXR_service_signature) or a regexp rule, both with a redirect handler to remove the DB_id and other non-path-related bits. Unfortunately, this rewrites the URI and defeats the script processing, which can no longer retrieve the DB_id.

Is there a way in Cherokee to launch a script whose command line is generated from groups ($1, $2, ...) captured by the regexp-based rule, so that the URI is left unaltered (i.e. the environment variables reflect the initial URI)?

ajl
Re: Migration from Apache to Cherokee: URI tweaking [ In reply to ]
I haven't used Cherokee for a while, but this is known as a rewrite rule
("internal" redirect).


On Wed, May 1, 2013 at 8:27 PM, Anfré Littoz <page74010-chrk@yahoo.fr> wrote:

Re: Migration from Apache to Cherokee: URI tweaking [ In reply to ]
Yes, this is a rewrite rule and consequently it changes the URI.

What I need is a rule, which can be flagged "final", causing execution of a CGI script whose home directory is known (e.g. /cgi-bin) and whose name and, possibly, arguments are taken from the URI ($x substitutions), while the environment variables SERVER_NAME, SCRIPT_NAME, ... are set from the original URI.
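
As a rough model of the behaviour wanted from such a rule, the following Python sketch picks the script file from the regexp captures while deriving the CGI variables from the untouched request URI. It does not use any Cherokee API; the `route` function, the `RULE` pattern and the script names `source`, `ident` and `search` are placeholders chosen for illustration.

```python
import re

# Hypothetical rule: group 1 = signature + DB_id, group 2 = script name,
# group 3 = optional extra path passed to the script.
RULE = re.compile(r"^(/LXR_service_signature/[^/]+)/(source|ident|search)(/.*)?$")


def route(request_uri, script_dir="/cgi-bin"):
    """Return the CGI variables for a matching request, or None."""
    path, _, query = request_uri.partition("?")
    match = RULE.match(path)
    if match is None:
        return None
    virtroot, script, path_info = match.group(1), match.group(2), match.group(3) or ""
    return {
        # File actually executed: taken from the captured script name.
        "SCRIPT_FILENAME": script_dir + "/" + script,
        # Web path as the client sent it: DB_id is still in there.
        "SCRIPT_NAME": virtroot + "/" + script,
        "PATH_INFO": path_info,
        "QUERY_STRING": query,
        "REQUEST_URI": request_uri,
    }


if __name__ == "__main__":
    print(route("/LXR_service_signature/DB_id/source/a/b.c?v=1"))
```

Only SCRIPT_FILENAME depends on the substitution; everything else is copied from the original request, which is exactly what a redirect-style rewrite loses.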

From the documentation, I'm afraid this is not possible with Cherokee.

The alternative solution would be to remove DB_id from the "script web path" part, since this was a bad design decision (it mixes script path and argument), but this involves a major rewrite of LXR initialisation and configuration. Maybe this is the wise direction, since it could greatly simplify integration with web servers and allow for more server candidates.

ajl



________________________________
From: Daniel Lo Nigro <lists@dan.cx>
To: cherokee List <Cherokee@lists.octality.com>
Sent: Wednesday, 1 May 2013, 14:33
Subject: Re: [Cherokee] Migration from Apache to Cherokee: URI tweaking



I haven't used Cherokee for a while, but this is known as a rewrite rule ("internal" redirect).


_______________________________________________
Cherokee mailing list
Cherokee@lists.octality.com
http://lists.octality.com/listinfo/cherokee
Re: Migration from Apache to Cherokee: URI tweaking [ In reply to ]
Hmm I thought rewrite rules shouldn't change the URI if they're marked as
"internal". Does LXR have instructions for Nginx or Lighttpd? If so, you
should be able to convert those rules to Cherokee rules. Apache has so many
configuration options that sometimes it's hard to find the exact match.


On Wed, May 1, 2013 at 10:57 PM, Anfré Littoz <page74010-chrk@yahoo.fr> wrote:

Re: Migration from Apache to Cherokee: URI tweaking [ In reply to ]
I tried both "internal" and "external", with the same effect. "external" has the debugging advantage of showing the result of the substitution.

Nginx and lighttpd are based on original concepts, different from Apache.

* lighttpd:
You regexp-match on the URL and declare that a given initial part of the URL maps to a given document root, e.g.:
     $HTTP["url"] =~ "^/LXR_signature/" {
            alias.url += ( "/LXR_signature/DB_id" => "common_LXR_root_directory" )
     }
and there is another directive to name the files which are scripts.

* nginx:
There is no notion of a "master" DocumentRoot and Alias as in Apache. Every URL can be individually diverted to its own root. Of course, in the simplest case, this is equivalent to DocumentRoot or to Cherokee's directory rule or default rule. Part of a URL is regexp-matched by a location directive, and you state what you want to do with the bits without any rewrite, e.g.:
    server { ...
        location ~ ^/LXR_signature/[^/]+(.*)$ {
            alias /LXR_root_directory/$1 ;     # for ordinary files like .css or images
            location ~ ^(/LXR_signature/[^/]+/)(script_names) {
                set $virtroot $1;
                set $scriptname $2;
                alias /LXR_root_directory;
                include fastcgi.conf;
                fastcgi_param SCRIPT_FILENAME $document_root$scriptname;   # compute which script to launch
                fastcgi_param SCRIPT_NAME $virtroot$scriptname; # unaltered CGI variable
                fastcgi_pass unix://...;
            }
        }
    }

My idea was to mimic nginx's "action" block. Unfortunately, I could not figure out how to simultaneously "untweak" the initial URL part and launch the correct script. I had to break it into two separate rules. The first one identifies the LXR service and removes DB_id to provide an "ordinary" script path, but in doing so I lose DB_id. The second one is a common directory rule with a CGI handler, but since the URI has changed, the called script fails because it takes a segment of the web directory as the DB_id.

Anyway, considering the tweaks needed to port LXR to various web servers, I'm more and more convinced that putting the DB_id in the middle of the web directory name (precisely, just before the script name) was a bad design choice (but ages ago, you had Apache, full stop). That information belongs in the script parameters, maybe as a root for PATH_INFO. Even setting aside the compatibility issue with existing LXR sites, redesigning this needs a lot of effort (first for a neat design, next in trying not to break the core code).
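
For what it's worth, the PATH_INFO variant could look like the sketch below: with a URI such as /cgi-bin/script_file/DB_id/path, the server needs no special rule and the script peels DB_id off the front of PATH_INFO itself. This is only a guess at one possible redesign, not existing LXR code; `split_path_info` is a hypothetical name.

```python
def split_path_info(path_info):
    """Return (db_id, remaining_path) from a PATH_INFO whose first
    segment is the DB_id, e.g. "/DB_id/dir/file.c".
    """
    head, sep, tail = path_info.lstrip("/").partition("/")
    if not head:
        raise ValueError("PATH_INFO carries no DB_id")
    # Keep the leading slash on the remainder so it is still a path.
    return head, sep + tail


if __name__ == "__main__":
    print(split_path_info("/DB_id/dir/file.c"))
```

In a real CGI script the argument would come from the PATH_INFO environment variable set by the web server.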

Thus, if you know the name of a Cherokee variable holding the original URI, like nginx's $request_uri, this could temporarily solve the problem.

Thanks for your answers and your patience.



________________________________
From: Daniel Lo Nigro <lists@dan.cx>
To: Anfré Littoz <page74010-chrk@yahoo.fr>
Cc: cherokee List <Cherokee@lists.octality.com>
Sent: Wednesday, 1 May 2013, 15:03
Subject: Re: [Cherokee] Migration from Apache to Cherokee: URI tweaking



Hmm I thought rewrite rules shouldn't change the URI if they're marked as "internal". Does LXR have instructions for Nginx or Lighttpd? If so, you should be able to convert those rules to Cherokee rules. Apache has so many configuration options that sometimes it's hard to find the exact match.