> On Aug 2, 2021, at 4:28 PM, Harald Jörg <haj@posteo.de> wrote:
>
> Dan Book <grinnz@gmail.com> writes:
>
>> DBD::MariaDB, DBD::SQLite, and DBD::Pg are used with the unicode
>> option in any modern program. Thus they expect decoded strings.
>
> As far as DBD::SQLite is concerned, this is only half-true. The
> current version 1.70 changed how unicode handling is declared,
> but even with DBD_SQLITE_STRING_MODE_UNICODE_STRICT you can
> feed it UTF-8 encoded byte sequences and it "just works" (but maybe
> shouldn't).
Perl has no way of making it *not* work, alas.
>
> You see the downside of this when you have a non-ASCII literal in an
> iso-latin-1 encoded Perl source (e.g. "ä" or "\x{e4}"). For Perl, it is
> the same character as "\N{LATIN SMALL LETTER A WITH DIAERESIS}", but if
> you feed both to the database you get different results.
This should no longer be the case if you avoid DBD_SQLITE_STRING_MODE_PV.
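
To make the "same character" point concrete, here is a minimal core-Perl sketch. The variable names are mine and purely illustrative; nothing here touches the database. It forces one copy of the string into Perl's downgraded internal form and one into the upgraded form, then shows that Perl still compares them as equal even though the internal flag differs.

```perl
use strict;
use warnings;
use charnames ':full';   # needed for \N{...} on older perls

# One logical character, written two ways.
my $down = "\x{e4}";
my $up   = "\N{LATIN SMALL LETTER A WITH DIAERESIS}";

# Force the two internal representations explicitly so the
# demonstration does not depend on how the literals were parsed.
utf8::downgrade($down);   # stored internally as the single byte E4
utf8::upgrade($up);       # stored internally as UTF-8 (two bytes)

# At the Perl level the strings are identical ...
my $same_string = ($down eq $up) ? 1 : 0;

# ... but the internal UTF-8 flag differs.
my $flag_down = utf8::is_utf8($down) ? 1 : 0;
my $flag_up   = utf8::is_utf8($up)   ? 1 : 0;

print "eq: $same_string, flag(down): $flag_down, flag(up): $flag_up\n";
```

Any code that branches on that flag can treat the two copies differently, which is the behaviour being complained about.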
> It seems that the driver still inspects the infamous UTF-8-flag to
> decide whether a literal is encoded or not.
This is not the case, except with DBD_SQLITE_STRING_MODE_PV, which for backward compatibility reasons remains the default.
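
A rough sketch of opting out of the default, assuming DBD::SQLite 1.68 or later (where the sqlite_string_mode attribute and its constants exist) and an in-memory database. The hex() query and the variable names are only my illustration of how one might check what the driver actually hands to SQLite.

```perl
use strict;
use warnings;
use DBI;
use DBD::SQLite::Constants ':dbd_sqlite_string_mode';

my $dbh = DBI->connect('dbi:SQLite:dbname=:memory:', '', '',
    { RaiseError => 1, PrintError => 0 });

# Opt out of the backward-compatible default (DBD_SQLITE_STRING_MODE_PV).
$dbh->{sqlite_string_mode} = DBD_SQLITE_STRING_MODE_UNICODE_STRICT;

# Same logical string in both internal representations.
my $down = "\x{e4}"; utf8::downgrade($down);
my $up   = "\x{e4}"; utf8::upgrade($up);

# Ask SQLite which bytes it received for each bound value.
my ($hex_down) = $dbh->selectrow_array('SELECT hex(?)', undef, $down);
my ($hex_up)   = $dbh->selectrow_array('SELECT hex(?)', undef, $up);

print "downgraded: $hex_down\n";
print "upgraded:   $hex_up\n";
```

If the driver ignores the flag in this mode, the two lines should print the same bytes. Removing the sqlite_string_mode assignment falls back to the PV default and is the case where I would expect them to differ.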
-FG