Mailing List Archive

Strange problem with input charset ?
Hello,

I have a site whose config is a little unusual. It has worked for years, but I
just discovered that it fails for (at least) one file. Here is the scenario:

I want to log (in a DB) all PDF downloads, so I set things up this way:

EMBPERL_OBJECT_BASE base.epl
<FilesMatch "\.(pdf|html)$">
SetHandler perl-script
PerlHandler Embperl::Object
Options ExecCGI
</FilesMatch>

And base.epl starts like this :

[-
warn('here');
-]

[$ if ($ENV{SCRIPT_NAME} =~ /\.pdf$/) $]
[.-
$ENV{SCRIPT_NAME} =~ /.*\/(.*\.pdf)$/;
my $filename = $1;
Execute('auth.epl'); # This is where the logging takes place
if (open(PDF,$ENV{DOCUMENT_ROOT}.$ENV{SCRIPT_NAME})) {
$http_headers_out{'Content-type'}='application/pdf';
$http_headers_out{'Content-Disposition'}="attachment; filename=$filename";
local ($/);
local ($escmode);
my $pdf=<PDF>;
print OUT $pdf;
close(PDF);
$waspdfok=1;
}
else {
$http_headers_out{'Location'}=["http://$ENV{SERVER_NAME}",404];
exit(1);
}
-]
[$ endif $]

For just one PDF, I was getting this error:

Error in Perl code: Can't modify non-lvalue subroutine call in undef operator at /var/www/sites/ecm/data/pdf/J_2010_05_241.pdf line 59, at EOF

And the "here" warning is never printed to the logs, for this PDF only.

I say "was getting" because after I changed something in the config, the error no longer
appears, BUT the "here" warning is still never printed (and, as you would expect,
the PDF download is never logged).


As you can see, this may not be optimal (at all), but I think the PDF files themselves are
"sourced" by Embperl, and I guess the problem comes from this particular PDF containing
invalid/strange/UTF-8?/whatever character(s) that prevent base.epl from being called.
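
One way to test this guess is to look at the raw bytes of the PDF for sequences that happen to look like Embperl block openers, since a file parsed as an Embperl source would treat them as the start of code. The sketch below is only an illustration: the helper names `count_embperl_openers` and `slurp_raw` are made up for this example, and the list of openers covers only the common block types ([- [+ [! [* [$ [#).

```perl
use strict;
use warnings;

# Count byte sequences that look like Embperl block openers.
# Takes the raw file content as a byte string and returns a hash ref
# mapping each opener found (e.g. '[-') to its number of occurrences.
sub count_embperl_openers {
    my ($bytes) = @_;
    my %count;
    $count{$1}++ while $bytes =~ /(\[[-+!*\$#])/g;
    return \%count;
}

# Hypothetical helper: read a whole file in raw (binary) mode.
sub slurp_raw {
    my ($path) = @_;
    open(my $fh, '<:raw', $path) or die "Cannot open $path: $!";
    local $/;               # slurp mode: read everything at once
    my $data = <$fh>;
    close($fh);
    return $data;
}

# When a file name is given on the command line, print one line per opener found.
if (@ARGV) {
    my $found = count_embperl_openers(slurp_raw($ARGV[0]));
    printf "%s %d\n", $_, $found->{$_} for sort keys %$found;
}
```

Running it on a "bad" and a "good" PDF and comparing the counts would show whether the failing file contains such sequences; a hit does not prove that this is the cause, it only makes the guess more plausible.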


I've spent hours trying to understand/fix this but the best I could do was :

* get the PDF file to be downloadable (no more error preventing it) but not loggable
* some debug info showing the difference between how 2 PDF files are treated :

The bad one :

[22400]REQ: ***** Start Request at Wed Dec 15 08:57:41 2010
[22400]Use App: HtmlEcm
[22408]Embperl::Object Request Filename: /var/www/sites/ecm/data/pdf/J_2010_05_241.pdf
[22408]Embperl::Object basename: base.epl
[22408]Embperl::Object Check for base: /var/www/sites/ecm/data/pdf/base.epl
[22408]Embperl::Object Check for base: /var/www/sites/ecm/data/base.epl
[22408]Embperl::Object Check for base: /var/www/sites/ecm/base.epl
[22408]Embperl::Object Found Base: /var/www/sites/ecm/base.epl
[22408]Embperl::Object path: /var/www/sites/ecm/data/pdf /var/www/sites/ecm/data /var/www/sites/ecm
[22408]Embperl::Object import new file: /var/www/sites/ecm/data/pdf/J_2010_05_241.pdf
[22408] Use Recipe Embperl
[22400]Search for /var/www/sites/ecm/data/pdf/J_2010_05_241.pdf
[22400]Search: nothing to search return /var/www/sites/ecm/data/pdf/J_2010_05_241.pdf
#### things get different here ####
[22400]Reading /var/www/sites/ecm/data/pdf/J_2010_05_241.pdf as input using PerlIO (496821 Bytes)...
[22400]PERF: Parse Start Time: 0 ms
[22400]PERF: Parse End Time: 0 ms
[22400]PERF: Parse Time: 0 ms
[22400]PERF: DOMSTAT: MemUsage = 70288 Bytes numNodes = 32 numLevelLookup = 0 numLevelLookupItem = 0 numStr = 106 numReplace = 5
[22400]PERF: Compile Start Time: 0 ms
[22400]PERF: Compile End Time: 0 ms
[22400]PERF: After Compile Exec End Time: 0 ms
[22400]PERF: Perl Compile End Time: 10 ms
[22400]PERF: Compile Time: 10 ms
[22400]PERF: DOMSTAT: MemUsage = 68224 Bytes numNodes = 25 numLevelLookup = 0 numLevelLookupItem = 0 numStr = 103 numReplace = 5
[22400]ERR: 24: Error in Perl code: Can't modify non-lvalue subroutine call in undef operator at /var/www/sites/ecm/data/pdf/J_2010_05_241.pdf line 59, at EOF
[22408]Embperl::Object import file with ERRORS finished: /var/www/sites/ecm/data/pdf/J_2010_05_241.pdf, package =
[22400]Using APACHE for output...
[22400]PERF: input = ???
[22400]PERF: Time: 10 ms
[22400]Request finished. Wed Dec 15 08:57:41 2010
. Entry-SVs: 34083 Exit-SVs: 34121

A good one :

[22400]REQ: ***** Start Request at Wed Dec 15 08:58:00 2010
[22400]Use App: HtmlEcm
[22409]Embperl::Object Request Filename: /var/www/sites/ecm/data/pdf/J_2010_05_249.pdf
[22409]Embperl::Object basename: base.epl
[22409]Embperl::Object Check for base: /var/www/sites/ecm/data/pdf/base.epl
[22409]Embperl::Object Check for base: /var/www/sites/ecm/data/base.epl
[22409]Embperl::Object Check for base: /var/www/sites/ecm/base.epl
[22409]Embperl::Object Found Base: /var/www/sites/ecm/base.epl
[22409]Embperl::Object path: /var/www/sites/ecm/data/pdf /var/www/sites/ecm/data /var/www/sites/ecm
[22409]Embperl::Object import new file: /var/www/sites/ecm/data/pdf/J_2010_05_249.pdf
[22409] Use Recipe Embperl
[22400]Search for /var/www/sites/ecm/data/pdf/J_2010_05_249.pdf
[22400]Search: nothing to search return /var/www/sites/ecm/data/pdf/J_2010_05_249.pdf
#### things get different here ####
[22400]Search for /var/www/sites/ecm/data/pdf/J_2010_05_249.pdf
[22400]Search: nothing to search return /var/www/sites/ecm/data/pdf/J_2010_05_249.pdf
[22400]Search for /var/www/sites/ecm/data/pdf/J_2010_05_249.pdf
[22400]Search: nothing to search return /var/www/sites/ecm/data/pdf/J_2010_05_249.pdf
[22400]Search for /var/www/sites/ecm/data/pdf/J_2010_05_249.pdf
[22400]Search: nothing to search return /var/www/sites/ecm/data/pdf/J_2010_05_249.pdf
[22409]SYNTAX: switch to Embperl::Syntax::Embperl
[22400]Reading /var/www/sites/ecm/data/pdf/J_2010_05_249.pdf as input using PerlIO (513851 Bytes)...
[22400]PERF: Parse Start Time: 10 ms
[22400]PERF: Parse End Time: 10 ms
[22400]PERF: Parse Time: 0 ms
[22400]PERF: DOMSTAT: MemUsage = 113772 Bytes numNodes = 268 numLevelLookup = 0 numLevelLookupItem = 0 numStr = 284 numReplace = 28
[22400]PERF: Compile Start Time: 10 ms
[22400]PERF: Compile End Time: 10 ms
[22400]PERF: After Compile Exec End Time: 10 ms
[22400]PERF: Perl Compile End Time: 20 ms
[22400]PERF: Compile Time: 10 ms
[22400]PERF: DOMSTAT: MemUsage = 114812 Bytes numNodes = 268 numLevelLookup = 0 numLevelLookupItem = 0 numStr = 284 numReplace = 28
[22409]Embperl::Object import file finished: /var/www/sites/ecm/data/pdf/J_2010_05_249.pdf, package = Embperl::__16
[22400]Using APACHE for output...
[22409] Use Recipe Embperl
[22400]Search for /var/www/sites/ecm/base.epl
[22400]Search: nothing to search return /var/www/sites/ecm/base.epl
[22400]Reading /var/www/sites/ecm/base.epl as input using PerlIO (2832 Bytes)...
[22400]PERF: Parse Start Time: 20 ms
[22400]PERF: Parse End Time: 20 ms
[22400]PERF: Parse Time: 0 ms
[22400]PERF: DOMSTAT: MemUsage = 113772 Bytes numNodes = 285 numLevelLookup = 0 numLevelLookupItem = 0 numStr = 284 numReplace = 28
[22400]PERF: Compile Start Time: 20 ms
[22400]PERF: Compile End Time: 20 ms
[22400]PERF: After Compile Exec End Time: 20 ms
[22400]PERF: Perl Compile End Time: 20 ms
[22400]PERF: Compile Time: 0 ms
[22400]PERF: DOMSTAT: MemUsage = 114812 Bytes numNodes = 268 numLevelLookup = 0 numLevelLookupItem = 0 numStr = 284 numReplace = 33
[22400]ERR: 32: Warning in Perl code: ici at /var/www/sites/ecm/base.epl line 3.
[22409] Use Recipe Embperl
[22400]Search for basedb.epl
[22400]Search: #0 test dir=/var/www/sites/ecm/data/pdf, fn=/var/www/sites/ecm/data/pdf/basedb.epl (skip=0)
[22400]Search: #1 test dir=/var/www/sites/ecm/data, fn=/var/www/sites/ecm/data/basedb.epl (skip=0)
[22400]Search: #2 test dir=/var/www/sites/ecm, fn=/var/www/sites/ecm/basedb.epl (skip=0)
[22400]Search: found /var/www/sites/ecm/basedb.epl
[22400]PERF: Run Start Time: 20 ms
[22400]PERF: Run End Time: 20 ms
[22400]PERF: Run Time: 0 ms
[22400]PERF: DOMSTAT: MemUsage = 117764 Bytes numNodes = 269 numLevelLookup = 0 numLevelLookupItem = 0 numStr = 284 numReplace = 33
[22409] Use Recipe Embperl
[22400]Search for auth.epl
[22400]Search: #0 test dir=/var/www/sites/ecm/data/pdf, fn=/var/www/sites/ecm/data/pdf/auth.epl (skip=0)
[22400]Search: #1 test dir=/var/www/sites/ecm/data, fn=/var/www/sites/ecm/data/auth.epl (skip=0)
[22400]Search: #2 test dir=/var/www/sites/ecm, fn=/var/www/sites/ecm/auth.epl (skip=0)
[22400]Search: found /var/www/sites/ecm/auth.epl
[22400]PERF: Run Start Time: 20 ms
[22400]PERF: Run End Time: 30 ms
[22400]PERF: Run Time: 10 ms
[22400]PERF: DOMSTAT: MemUsage = 121996 Bytes numNodes = 273 numLevelLookup = 0 numLevelLookupItem = 0 numStr = 284 numReplace = 33
[22400]ERR: 32: Warning in Perl code: v2 at /var/www/sites/ecm/base.epl line 31, <GEN1> line 3.
[22400]PERF: Run Start Time: 20 ms
[22400]PERF: Run End Time: 40 ms
[22400]PERF: Run Time: 20 ms
[22400]PERF: DOMSTAT: MemUsage = 121916 Bytes numNodes = 277 numLevelLookup = 0 numLevelLookupItem = 0 numStr = 285 numReplace = 33
[22400]PERF: input = ???
[22400]PERF: Time: 40 ms
[22400]Request finished. Wed Dec 15 08:58:00 2010
. Entry-SVs: 37260 Exit-SVs: 37609


My question is: is there a better way than this <FilesMatch "\.(pdf|html)$">
hack to get what I want?
Or should I keep this technique but bypass any "read error"
(with something like EMBPERL_INPUT_CHARSET, which the documentation says is unimplemented)?

I use Embperl 2.3.0-1 on a Debian Lenny system.


Thanks for your help.

JC


---------------------------------------------------------------------
To unsubscribe, e-mail: embperl-unsubscribe@perl.apache.org
For additional commands, e-mail: embperl-help@perl.apache.org
Re: Strange problem with input charset ?
Well, maybe I should reword my question.

I want to place links to PDF files on my site like :

<a href="/pdf/v01.pdf">Click here to download</a>

And I want to log the fact that someone downloaded the file.

So I set Apache up so that all PDF files are handled by Embperl :

EMBPERL_OBJECT_BASE base.epl
<FilesMatch "\.(pdf|html)$">
SetHandler perl-script
PerlHandler Embperl::Object
Options ExecCGI
</FilesMatch>

Inside my "base.epl", I distinguish .pdf from .html and proceed
accordingly :

[$ if ($ENV{SCRIPT_NAME} =~ /\.pdf$/) $]
[.-
$ENV{SCRIPT_NAME} =~ /.*\/(.*\.pdf)$/;
my $filename = $1;
Execute('auth.epl'); # This is where the logging takes place
if (open(PDF,$ENV{DOCUMENT_ROOT}.$ENV{SCRIPT_NAME})) {
$http_headers_out{'Content-type'}='application/pdf';
$http_headers_out{'Content-Disposition'}="attachment; filename=$filename";
local ($/);
local ($escmode);
my $pdf=<PDF>;
print OUT $pdf;
close(PDF);
}
exit 1;
-]
[$ endif $]

This worked very well until I switched to UTF-8 and discovered that Embperl
was actually loading the PDF file itself (as if by an Execute()). I thought it
would only do that when it encountered:

Execute('*');

(which I wouldn't have done for a PDF, of course) but for some reason it does read
it. And some PDF files contain byte sequences that are invalid as UTF-8, which
prevents the script from working.
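
To check whether a given PDF really contains bytes that are not well-formed UTF-8, something like the following could be used. It is a minimal sketch based on the core Encode module; the function name `is_valid_utf8` is made up for this example, and it says nothing about how Embperl itself decodes its input.

```perl
use strict;
use warnings;
use Encode ();

# Return 1 if the byte string is well-formed UTF-8, 0 otherwise.
sub is_valid_utf8 {
    my ($bytes) = @_;
    # decode() may modify its argument when a CHECK value is given,
    # so work on a copy of the caller's data.
    my $copy = $bytes;
    # FB_CROAK makes decode() die on the first malformed sequence.
    my $ok = eval { Encode::decode('UTF-8', $copy, Encode::FB_CROAK()); 1 };
    return $ok ? 1 : 0;
}
```

Binary PDF content will typically fail this check, which is expected: a PDF is not text, so the test only confirms that the file cannot safely be read as a UTF-8 source.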

So, is there a way to do what I'm trying to do ?

Thanks a lot for reading this far and for your precious help/hints/guidance...

JC

Re: Strange problem with input charset ?
Jean-Christophe,
It seems reasonable to me that Embperl "executes" the .pdf files,
since you told it to do so in your httpd.conf.

Why don't you just write your links something like this:
<a href="authorize_log_and_deliver_pdf.ep?file=my.pdf">click here</a>

And then in the .ep file do something like:

[*
Execute('auth.epl'); # authorization and logging happen here
$http_headers_out{'Location'} = $fdat{'file'}; # redirect to the PDF itself

*]

In other words:
Do what you have to do and let Apache do its work.
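
Since the redirect target comes straight from the query string, it is probably wise to validate it first. The following is only a sketch of that idea: `pdf_redirect_target` is a made-up helper, and the "/pdf" base path is an assumption taken from the example link earlier in the thread. It also assumes that .pdf is no longer matched by the <FilesMatch> block, so that Apache serves the file itself after the redirect.

```perl
use strict;
use warnings;

# Hypothetical helper: turn the user-supplied "file" parameter into a
# redirect target, or return nothing if it is not a plain PDF file name.
sub pdf_redirect_target {
    my ($file, $base_url) = @_;
    return unless defined $file;
    # Accept only a bare file name (letters, digits, '_', '-', '.')
    # ending in .pdf: no slashes, no scheme, no leading dot.
    return unless $file =~ /\A[A-Za-z0-9_-][A-Za-z0-9_.-]*\.pdf\z/;
    return "$base_url/$file";
}
```

In the .ep page, the result could then be assigned to $http_headers_out{'Location'} only when it is defined, for example with pdf_redirect_target($fdat{'file'}, '/pdf').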

Jean-Christophe Boggio wrote:
> Well, maybe I should reword my question.
>
> I want to place links to PDF files on my site like :
>
> <a href="/pdf/v01.pdf">Click here to download</a>
>
> And I want to log the fact that someone downloaded the file.
>
> So I set Apache up so that all PDF files are handled by Embperl :
>
> EMBPERL_OBJECT_BASE base.epl
> <FilesMatch "\.(pdf|html)$">
> SetHandler perl-script
> PerlHandler Embperl::Object
> Options ExecCGI
> </FilesMatch>
>
> Inside my "base.epl", I distinguish .pdf from .html and proceed
> accordingly :
>
> [$ if ($ENV{SCRIPT_NAME} =~ /\.pdf$/) $]
> [.-
> $ENV{SCRIPT_NAME} =~ /.*\/(.*\.pdf)$/;
> my $filename = $1;
> Execute('auth.epl'); # This is where the logging takes place
> if (open(PDF,$ENV{DOCUMENT_ROOT}.$ENV{SCRIPT_NAME})) {
> $http_headers_out{'Content-type'}='application/pdf';
> $http_headers_out{'Content-Disposition'}="attachment; filename=$filename";
> local ($/);
> local ($escmode);
> my $pdf=<PDF>;
> print OUT $pdf;
> close(PDF);
> }
> exit 1;
> -]
> [$ endif $]
>
>


--
Kind regards,

Frank Wesemann
Fotofinder GmbH VAT ID DE812854514
Software Development Web: http://www.fotofinder.com/
Potsdamer Str. 96 Tel: +49 30 25 79 28 90
10785 Berlin Fax: +49 30 25 79 28 999

Registered office: Berlin
District Court Berlin-Charlottenburg (HRB 73099)
Managing Director: Ali Paczensky



