Getting really frustrated with mod_perl2's apparent inability to
properly read UTF8 input.
Here's my mod_perl2 setup:
Apache 2.2.[something]
mod_perl 2.0.7 (or nearly that)
ModPerl::Registry
Perl "script" with CGI.pm
Very early in my app:
    ## ensure utf8 CGI params:
    $CGI::PARAM_UTF8 = 1;
    binmode STDIN, ":utf8";
    binmode STDOUT, ":utf8";
    binmode STDERR, ":utf8";
This works fine in CGI mode: when I ask for $foo = $cgi->param('foo'),
DBI::data_string_desc($foo) shows a UTF8 string with the proper
discrepancy between bytes and chars.
But when I try to run it under mod_perl, the returned string appears
to be the raw, undecoded bytes, and definitely not a UTF8 character
string. Of course, when I store that in the database (using DBD::Pg),
those bytes are treated as "latin-1" and encoded to "utf-8" a second
time, and I get a bunch of weird chars on the output.
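To make the symptom concrete, here is a minimal sketch of the double encoding I think is happening. It uses only the core Encode module; the sample value is made up for illustration, and I'm assuming the bytes get treated as Latin-1 characters somewhere on the way to the database.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical form value: "e-acute" as it arrives on the wire,
# i.e. the two UTF-8 octets C3 A9.
my $raw = "\xC3\xA9";

# Decoded properly: a single character.
my $chars = decode('UTF-8', $raw);

# Left undecoded: Perl sees two Latin-1 characters, so encoding to
# UTF-8 again turns each of them into two octets.
my $double = encode('UTF-8', $raw);

printf "decoded: %d char(s)\n", length $chars;
printf "double-encoded: %d octet(s)\n", length $double;
```

The second case is the one that shows up as a pair of junk characters where one accented letter should be.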
Has anyone managed to round-trip UTF8 from form to database and back
using a setup similar to this?
I suspect part of the problem is this in CGI.pm:
    'read_from_client' => <<'END_OF_FUNC',
    # Read data from a file handle
    sub read_from_client {
        my($self, $buff, $len, $offset) = @_;
        local $^W=0;                # prevent a warning
        return $MOD_PERL
            ? $self->r->read($$buff, $len, $offset)
            : read(\*STDIN, $$buff, $len, $offset);
    }
    END_OF_FUNC
Since I binmode STDIN, the non-$MOD_PERL works ok here. What's the
equivalent of $r->read() that marks the incoming stream as UTF8, so I
get chars instead of bytes? Or can I just read(\*STDIN) in mod_perl2
as well? (I know that was supported at one point...)
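Failing a proper answer, one workaround I can imagine is decoding each value by hand after fetching it. This is only a sketch under assumptions: decoded_param is a name I made up, the sample values are invented, and I'm assuming the client really sends UTF-8. It does not answer whether $r->read() can be made to return characters.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper: turn a raw parameter value into characters.
# Values already flagged as character strings pass through untouched,
# so the same code can run in CGI mode, where decoding already happened.
sub decoded_param {
    my ($value) = @_;
    return $value if !defined $value || utf8::is_utf8($value);
    return decode('UTF-8', $value);
}

# Invented sample: "naive" with i-diaeresis, as raw octets (6 bytes),
# which is what I appear to get under mod_perl.
my $from_mod_perl = "na\xC3\xAFve";

# The same value already decoded to characters, as in CGI mode.
my $from_cgi = decode('UTF-8', $from_mod_perl);

print length(decoded_param($from_mod_perl)), "\n";
print length(decoded_param($from_cgi)), "\n";
```

Both calls should yield the same five-character string. The utf8::is_utf8 check is a heuristic rather than a guarantee: a pure-ASCII value may not carry the flag, but decoding it again is harmless.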
--
Randal L. Schwartz - Stonehenge Consulting Services, Inc. - +1 503 777 0095
<merlyn@stonehenge.com> <URL:http://www.stonehenge.com/merlyn/>
Perl/Unix consulting, Technical writing, Comedy, etc. etc.
Still trying to think of something clever for the fourth line of this .sig