Mailing List Archive

Case-insensitive file systems
I blew an hour of a client's time trying to figure out why I was getting
redefined subroutine warnings. I eventually tracked them down to this:

# use `require` instead of `use` to avoid a
# circular load
require Client::Db;


I had *just* written that code and my tests for it passed. However, the
warnings were showing up elsewhere, so I was led on a wild goose chase.

It should have been:

require Client::DB;


Because the default file system on Macs is case-insensitive, Perl didn't find
$INC{'Client/Db.pm'}, so it cheerfully loaded Client/DB.pm a second time and
caused the warning.

Is it possible to introduce some kind of lexically-scoped pragma to tell
Perl to verify the case of the filename (and directories, too, of
course)? This is not the first time I've been bitten by this and it would
be nice if it could be addressed.

Best,
Ovid
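
A minimal sketch of the mechanism described above, not taken from the thread: %INC is keyed by the path string exactly as it was requested, so two spellings that differ only by case are two separate entries. To keep it runnable on any file system, an @INC hook that ignores case stands in for the case-insensitive disk. The hook, the in-memory module source, and the sub name handle are assumptions made for illustration.

```perl
use strict;
use warnings;

# Source of a hypothetical module; note the package name is Client::DB.
my $source = <<'PERL';
package Client::DB;
use warnings;
sub handle { 'dbh' }
1;
PERL

# An @INC hook that ignores case, standing in for a case-insensitive
# file system: any spelling of Client/DB.pm is served the same source.
unshift @INC, sub {
    my ($self, $file) = @_;
    return unless lc($file) eq 'client/db.pm';
    open my $fh, '<', \$source or die "cannot open in-memory file: $!";
    return $fh;
};

# Collect warnings instead of printing them straight to STDERR.
my @warnings;
$SIG{__WARN__} = sub { push @warnings, @_ };

require Client::DB;   # recorded under the key 'Client/DB.pm'
require Client::Db;   # a different key, so the source is compiled again

my @loaded = sort grep { lc($_) eq 'client/db.pm' } keys %INC;
print "loaded as: @loaded\n";
print "warning: $_" for @warnings;
```

On a real case-insensitive file system the same double load should happen without any hook, because the second spelling resolves to the same file on disk.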
Re: Case-insensitive file systems [ In reply to ]
On Fri, 14 Apr 2023 19:10:28 +0200
Ovid <curtis.poe@gmail.com> wrote:

> I blew an hour of a client's time trying to figure out why I was getting
> redefined subroutine warnings. I eventually tracked them down to this:
>
> # use `require` instead of `use` to avoid a
> # circular load
> require Client::Db;
>
>
> I had *just* written that code and my tests for it passed. However, the
> warnings were showing up elsewhere, so I was led on a wild goose chase.
>
> It should have been:
>
> require Client::DB;
>
>
> Because the file system on Macs is case-insensitive, it didn't see
> $INC{'Client/Db.pm'} and it cheerfully loaded Client/DB.pm and caused the
> warning.
>
> Is it possible to introduce some kind of lexically-scoped pragma to tell
> Perl to verify the case of the filename (and directories, too, of
> course)? This is not the first time I've been bitten by this and it would
> be nice if it could be addressed.
>
> Best,
> Ovid

There's an open PR that adds a warning in this case:

https://github.com/Perl/perl5/pull/19419
Re: Case-insensitive file systems [ In reply to ]
On Fri, 14 Apr 2023 21:11:22 +0200
Tomasz Konojacki <me@xenu.pl> wrote:

> There's an open PR that adds a warning in this case:
>
> https://github.com/Perl/perl5/pull/19419

Actually, this warns only when import has arguments. So,
"use Client::Db 'foo'" would warn, but "use Client::Db" would not.
Re: Case-insensitive file systems [ In reply to ]
On 4/14/23 10:10, Ovid wrote:
> I blew an hour of a client's time trying to figure out why I was getting
> redefined subroutine warnings. I eventually tracked them down to this:
>
> # use `require` instead of `use` to avoid a
> # circular load
> require Client::Db;
>
>
> I had *just* written that code and my tests for it passed. However, the
> warnings were showing up elsewhere, so I was led on a wild goose chase.
>
> It should have been:
>
> require Client::DB;
>
>
> Because the file system on Macs is case-insensitive, it didn't see
> $INC{'Client/Db.pm'} and it cheerfully loaded Client/DB.pm and caused the
> warning.
>
> Is it possible to introduce some kind of lexically-scoped pragma to tell
> Perl to verify the case of the filename (and directories, too, of
> course)? This is not the first time I've been bitten by this and it would
> be nice if it could be addressed.
>
> Best,
> Ovid


The last time I installed macOS Big Sur (version 11), I chose to format
my system disk as "APFS case-sensitive". At some point, I upgraded to
macOS Monterey (version 12). While there are reports of application
problems on case sensitive file systems on macOS X (version 10), I have
not encountered any.


David
Re: Case-insensitive file systems [ In reply to ]
FWIW, git has issues on case-insensitive filesystems if you have a conflict in
its eyes, e.g.:

./gitrepo/foo.txt
./gitrepo/FOO.txt

It doesn't warn that the filesystem is case-insensitive; in effect, a system
exception is simply raised when it tries to write the second file. That seems
like the right approach.

For a Perl pragma, and to abstract it away from the filesystem, it seems the
real need is a more thorough search for the file.

E.g.,

require m/Client::DB/i;

Where it would have to search for all case sensitive permutations that this
implies on a case insensitive platform. So then you have to consider the
search order if on the case sensitive FS there are multiple valid paths:

.../Client/DB.pm
.../client/dB.pM
.../cLiEnT/Db.Pm
... etc

FWIW, there are things on CPAN under both "cPanel::" and "Cpanel::" name
spaces. It'd be a shame if someone uploaded a "cPanel::JSON::XS", at least
for anyone on a case insensitive filesystem.

Anyway, I don't know what the right approach here is, but the solution to the
problem is in the realm of namespace resolution and not in Perl being aware of
how the filesystem deals with upper and lower case.

Cheers,
Brett

* David Christensen <dpchrist@holgerdanske.com> [2023-04-14 13:26:30 -0700]:

> On 4/14/23 10:10, Ovid wrote:
> > I blew an hour of a client's time trying to figure out why I was getting
> > redefined subroutine warnings. I eventually tracked them down to this:
> >
> > # use `require` instead of `use` to avoid a
> > # circular load
> > require Client::Db;
> >
> >
> > I had *just* written that code and my tests for it passed. However, the
> > warnings were showing up elsewhere, so I was led on a wild goose chase.
> >
> > It should have been:
> >
> > require Client::DB;
> >
> >
> > Because the file system on Macs is case-insensitive, it didn't see
> > $INC{'Client/Db.pm'} and it cheerfully loaded Client/DB.pm and caused the
> > warning.
> >
> > Is it possible to introduce some kind of lexically-scoped pragma to tell
> > Perl to verify the case of the filename (and directories, too, of
> > course)? This is not the first time I've been bitten by this and it would
> > be nice if it could be addressed.
> >
> > Best,
> > Ovid
>
>
> The last time I installed macOS Big Sur (version 11), I chose to format my
> system disk as "APFS case-sensitive". At some point, I upgraded to macOS
> Monterey (version 12). While there are reports of application problems on
> case sensitive file systems on macOS X (version 10), I have not encountered
> any.
>
>
> David
>

--
--
oodler@cpan.org
oodler577@sdf-eu.org
SDF-EU Public Access UNIX System - http://sdfeu.org
irc.perl.org #openmp #pdl #native
Re: Case-insensitive file systems [ In reply to ]
On Fri, 14 Apr 2023 at 21:14, Tomasz Konojacki <me@xenu.pl> wrote:
>
> On Fri, 14 Apr 2023 21:11:22 +0200
> Tomasz Konojacki <me@xenu.pl> wrote:
>
> > There's an open PR that adds a warning in this case:
> >
> > https://github.com/Perl/perl5/pull/19419
>
> Actually, this warns only when import has arguments. So,
> "use Client::Db 'foo'" would warn, but "use Client::Db" would not.

The problem is that it is perfectly legit to have a package whose name
on disk does not match the package it installs functionality into. A
common pattern for instance is to have a file Whatever/Heavy.pm load
functionality that gets installed into the Whatever namespace instead
of into the Whatever::Heavy namespace. (using a file may populate zero
or many namespaces so there is no way to validate it is functioning as
intended). These days we don't have any examples of this in core that
I am aware of, but historically Carp::Heavy and Exporter::Heavy were
good examples of this pattern. We do have examples in core where
certain namespaces are populated as a side-effect of loading a
specific file. Tie::Hash for instance populates the namespaces
Tie::Hash, Tie::StdHash, Tie::ExtraHash.
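
A small sketch of the Tie::Hash case mentioned above, assuming a stock perl with the core Tie::Hash module installed. It loads one file and then checks which of the three packages can answer TIEHASH. The variable names are illustrative only.

```perl
use strict;
use warnings;

require Tie::Hash;    # a single request: Tie/Hash.pm

# Packages that have a TIEHASH method after that one require.
my @populated = grep { $_->can('TIEHASH') }
                qw(Tie::Hash Tie::StdHash Tie::ExtraHash);
print "populated: @populated\n";

# %INC records files, not packages, so it cannot tell us which
# namespaces a file was "supposed" to fill.
my @tie_files = sort grep { m{^Tie/} } keys %INC;
print "files recorded in %INC: @tie_files\n";
```

This is why a check of the form "file Foo/Bar.pm must define package Foo::Bar" cannot be applied blindly.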

We can validate if there was a typo when someone asks the newly loaded
package to export a symbol because if there is a mistake then
execution will end up inside of UNIVERSAL::import() instead of the
import method in the loaded package that should be handling the
export. But when there is nothing to export we cannot distinguish
between someone playing presumably legitimate games with package names
and file names and someone making a typo on the filename. After all
people regularly load modules with 'use' that do not contain an import
method, do not use Exporter or any hand rolled equivalents, and do not
export anything.

The fact that the filesystem's case insensitivity plays a role here
just makes things worse. E.g., you say "use Client::db;", the case
insensitivity "corrects" your mistake and loads "Client/Db.pm", but you
still think it should be called Client::db when in fact it is called
Client::Db. Another factor that also makes things worse is that we
don't have a standard built-in way of exporting functionality, and we
have a proliferation of modules that implement "exporter" semantics.
So we can't "just" introduce a solution to these problems in the
standard modules like Exporter.pm. [This came up previously with this
PR and related subject.] We either have to figure out tricky
solutions really low down the stack (e.g., the trickery in UNIVERSAL,
and the trickery in core itself to not error when import is missing),
or we have to leave the bug. [ASIDE: I consider the way Perl handles
symbol imports to be in the list of Larry's top 10 design mistakes for
Perl. It is a place where TIMTOWTDI is a very bad idea and has caused
huge trouble IMO: it seems nice at first, but on further reflection
causes a world of trouble.]

Anyway, that PR was stalled for various reasons, thanks for the
reminder. I'll try to get it picked up in the 5.39 dev cycle.

Interesting piece of trivia: The "import" method is handled via a
special case, so you can call it on any package or namespace and it
will not throw an exception if it is missing. The PR you linked to
fixes that so that the import method is treated as any other function,
and then ensures that UNIVERSAL::import() exists to handle any calls
to packages that do not implement import().
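
A minimal sketch of the special case described in this paragraph. The package Plain::Module and its sub are made up for illustration. Depending on the perl version, the import call is either silently ignored or produces a warning, so warnings are collected rather than printed. Either way it is not expected to throw, unlike any other missing method.

```perl
use strict;
use warnings;

# A hypothetical package with no import method and no Exporter.
package Plain::Module { sub hello { 'hi' } }

# Collect warnings so the output is the same shape on every perl.
my @warnings;
$SIG{__WARN__} = sub { push @warnings, @_ };

# Calling import with arguments on a package that never defined it.
my $import_survived = eval { Plain::Module->import('foo'); 1 } ? 1 : 0;

# Any other missing method is an ordinary runtime error.
my $other_survived = eval { Plain::Module->no_such_method; 1 } ? 1 : 0;
my $error = $@;

print "import call survived: $import_survived\n";
print "other call survived:  $other_survived\n";
print "error was: $error";
print "warnings seen: ", scalar(@warnings), "\n";
```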

cheers,
Yves








--
perl -Mre=debug -e "/just|another|perl|hacker/"
Re: Case-insensitive file systems [ In reply to ]
On Sun, 16 Apr 2023 13:39:15 +0200
demerphq <demerphq@gmail.com> wrote:

> The problem is that it is perfectly legit to have a package whose name
> on disk does not match the package it installs functionality into. A
> common pattern for instance is to have a file Whatever/Heavy.pm load
> functionality that gets installed into the Whatever namespace instead
> of into the Whatever::Heavy namespace. (using a file may populate zero
> or many namespaces so there is no way to validate it is functioning as
> intended). These days we don't have any examples of this in core that
> I am aware of, but historically Carp::Heavy and Exporter::Heavy were
> good examples of this pattern. We do have examples in core where
> certain namespaces are populated as a side-effect of loading a
> specific file. Tie::Hash for instance populates the namespaces
> Tie::Hash, Tie::StdHash, Tie::ExtraHash.
>
> We can validate if there was a typo when someone asks the newly loaded
> package to export a symbol because if there is a mistake then
> execution will end up inside of UNIVERSAL::import() instead of the
> import method in the loaded package that should be handling the
> export. But when there is nothing to export we cannot distinguish
> between someone playing presumably legitimate games with package names
> and file names and someone making a typo on the filename. After all
> people regularly load modules with 'use' that do not contain an import
> method, do not use Exporter or any hand rolled equivalents, and do not
> export anything.

An alternative, more complicated approach would be to check if the
filename of the loaded module matches what was requested.

When user calls "require File::Stat", I guess we could do something like
this:

1. open("File/Stat.pm").

2. Obtain the filename from the file handle and verify it's in the
correct case.

However, it's not portable and what about symlinks?
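
A rough sketch of the "verify the case" idea, with one substitution stated plainly: instead of asking an open file handle for its name (the non-portable part), it lists each directory with readdir and compares the entry names byte for byte. The function name exact_case_exists and the temporary directory layout are assumptions for illustration. Symlinks and @INC hooks are not handled.

```perl
use strict;
use warnings;
use File::Spec;
use File::Temp qw(tempdir);

# Returns true if every component of $relative exists under $base with
# exactly the spelling given, judged by the names readdir reports.
sub exact_case_exists {
    my ($base, $relative) = @_;
    my $dir = $base;
    for my $part (split m{/}, $relative) {
        opendir(my $dh, $dir) or return 0;
        my $found = grep { $_ eq $part } readdir $dh;
        closedir $dh;
        return 0 unless $found;
        $dir = File::Spec->catfile($dir, $part);
    }
    return 1;
}

# Build a throwaway library directory containing Client/DB.pm.
my $base = tempdir(CLEANUP => 1);
mkdir File::Spec->catdir($base, 'Client') or die "mkdir failed: $!";
my $module = File::Spec->catfile($base, 'Client', 'DB.pm');
open my $out, '>', $module or die "open failed: $!";
print {$out} "package Client::DB;\n1;\n";
close $out or die "close failed: $!";

my $right = exact_case_exists($base, 'Client/DB.pm');
my $wrong = exact_case_exists($base, 'Client/Db.pm');
print "Client/DB.pm spelled as on disk: ", ($right ? 'yes' : 'no'), "\n";
print "Client/Db.pm spelled as on disk: ", ($wrong ? 'yes' : 'no'), "\n";
```

The cost is one directory listing per path component per @INC entry, which is part of why this is described as the more complicated approach.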
Re: Case-insensitive file systems [ In reply to ]
On Sun, 16 Apr 2023 at 19:51, Tomasz Konojacki <me@xenu.pl> wrote:
>
> On Sun, 16 Apr 2023 13:39:15 +0200
> demerphq <demerphq@gmail.com> wrote:
>
> > The problem is that it is perfectly legit to have a package whose name
> > on disk does not match the package it installs functionality into. A
> > common pattern for instance is to have a file Whatever/Heavy.pm load
> > functionality that gets installed into the Whatever namespace instead
> > of into the Whatever::Heavy namespace. (using a file may populate zero
> > or many namespaces so there is no way to validate it is functioning as
> > intended). These days we don't have any examples of this in core that
> > I am aware of, but historically Carp::Heavy and Exporter::Heavy were
> > good examples of this pattern. We do have examples in core where
> > certain namespaces are populated as a side-effect of loading a
> > specific file. Tie::Hash for instance populates the namespaces
> > Tie::Hash, Tie::StdHash, Tie::ExtraHash.
> >
> > We can validate if there was a typo when someone asks the newly loaded
> > package to export a symbol because if there is a mistake then
> > execution will end up inside of UNIVERSAL::import() instead of the
> > import method in the loaded package that should be handling the
> > export. But when there is nothing to export we cannot distinguish
> > between someone playing presumably legitimate games with package names
> > and file names and someone making a typo on the filename. After all
> > people regularly load modules with 'use' that do not contain an import
> > method, do not use Exporter or any hand rolled equivalents, and do not
> > export anything.
>
> An alternative, more complicated approach would be to check if the
> filename of the loaded module matches what was requested.
>
> When user calls "require File::Stat", I guess we could do something like
> this:
>
> 1. open("File/Stat.pm").
>
> 2. Obtain the filename from the file handle and verify it's in the
> correct case.

Yes.

> However, it's not portable

No, likely not.

> and what about symlinks?

Indeed. Also perhaps we would have to consider what to do about INC hooks.

cheers,
Yves


--
perl -Mre=debug -e "/just|another|perl|hacker/"
Re: Case-insensitive file systems [ In reply to ]
* demerphq <demerphq@gmail.com> [2023-04-16 20:17:55 +0200]:

> On Sun, 16 Apr 2023 at 19:51, Tomasz Konojacki <me@xenu.pl> wrote:
> >
> > On Sun, 16 Apr 2023 13:39:15 +0200
> > demerphq <demerphq@gmail.com> wrote:
> >
> > > The problem is that it is perfectly legit to have a package whose name
> > > on disk does not match the package it installs functionality into. A
> > > common pattern for instance is to have a file Whatever/Heavy.pm load
> > > functionality that gets installed into the Whatever namespace instead
> > > of into the Whatever::Heavy namespace. (using a file may populate zero
> > > or many namespaces so there is no way to validate it is functioning as
> > > intended). These days we don't have any examples of this in core that
> > > I am aware of, but historically Carp::Heavy and Exporter::Heavy were
> > > good examples of this pattern. We do have examples in core where
> > > certain namespaces are populated as a side-effect of loading a
> > > specific file. Tie::Hash for instance populates the namespaces
> > > Tie::Hash, Tie::StdHash, Tie::ExtraHash.
> > >
> > > We can validate if there was a typo when someone asks the newly loaded
> > > package to export a symbol because if there is a mistake then
> > > execution will end up inside of UNIVERSAL::import() instead of the
> > > import method in the loaded package that should be handling the
> > > export. But when there is nothing to export we cannot distinguish
> > > between someone playing presumably legitimate games with package names
> > > and file names and someone making a typo on the filename. After all
> > > people regularly load modules with 'use' that do not contain an import
> > > method, do not use Exporter or any hand rolled equivalents, and do not
> > > export anything.
> >
> > An alternative, more complicated approach would be to check if the
> > filename of the loaded module matches what was requested.
> >
> > When user calls "require File::Stat", I guess we could do something like
> > this:
> >
> > 1. open("File/Stat.pm").
> >
> > 2. Obtain the filename from the file handle and verify it's in the
> > correct case.
>
> Yes.
>
> > However, it's not portable
>
> > No, likely not.
>
> > and what about symlinks?
>
> Indeed. Also perhaps we would have to consider what to do about INC hooks.
>
> cheers,
> Yves

I think the "ask" was to somehow make C<require> or C<use> case insensitive.

Correct me if I am wrong. If I am correct, then this would take the file system
out of the equation entirely. In OP's problem description, the issue was
*not* when the file was resolved (found) on the case-insensitive file system;
it was when the file was *not* found on the case-sensitive file system.

So the issue, it seems, is not checking case sensitivity on case-insensitive
file systems, but allowing a case-insensitive match on case-sensitive file systems.
I hope I got that right, I admit this is confusing.

Also if this is all still correct, then the targeted layer of abstraction is
oblivious to the filesystem, and therefore should be very portable since the
solution is one done in file resolution via @INC.

It might be portable, but I imagine it is also expensive, having to do an m//i
on all permutations of case in the eNtIrE::nAmEsPaCe for each base contained
in @INC.

This is why I wrote in my reply, something like the following as more a way
to clarify what I was understanding, not to suggest how it should look:

require qr/My::MoDuLe/i;

Here the case insensitivity that is inherent on Mac or Windows (e.g.,) would
have to be emulated on a typical *nix filesystem that is case sensitive.
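
To make the cost of that emulation concrete, here is a hedged sketch of a case-insensitive lookup under a single base directory. It is not a proposal for how require should work. The function name find_ignoring_case and the temporary directory are assumptions, and a real implementation would repeat this for every entry in @INC.

```perl
use strict;
use warnings;
use File::Spec;
use File::Temp qw(tempdir);

# Returns every path under $base whose components match those of
# $relative when case is ignored. Call it in list context.
sub find_ignoring_case {
    my ($base, $relative) = @_;
    my @candidates = ($base);
    for my $part (split m{/}, $relative) {
        my @next;
        for my $dir (@candidates) {
            opendir(my $dh, $dir) or next;   # skip anything that is not a directory
            push @next, map  { File::Spec->catfile($dir, $_) }
                        grep { lc($_) eq lc($part) } readdir $dh;
            closedir $dh;
        }
        @candidates = @next;
    }
    return sort @candidates;
}

# Build a throwaway library directory containing Client/DB.pm.
my $base = tempdir(CLEANUP => 1);
mkdir File::Spec->catdir($base, 'Client') or die "mkdir failed: $!";
my $module = File::Spec->catfile($base, 'Client', 'DB.pm');
open my $out, '>', $module or die "open failed: $!";
print {$out} "package Client::DB;\n1;\n";
close $out or die "close failed: $!";

my @found = find_ignoring_case($base, 'client/db.pm');
print "matches: @found\n";
```

On a case-sensitive file system several candidates can come back, which is where the search-order question raised earlier in the thread comes in.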

Maybe Ovid can clarify. I have nothing more to add here.

Cheers,
Brett

>
>
> --
> perl -Mre=debug -e "/just|another|perl|hacker/"

--
--
oodler@cpan.org
oodler577@sdf-eu.org
SDF-EU Public Access UNIX System - http://sdfeu.org
irc.perl.org #openmp #pdl #native
Re: Case-insensitive file systems [ In reply to ]
On Fri, 14 Apr 2023 at 19:10, Ovid <curtis.poe@gmail.com> wrote:
>
> I blew an hour of a client's time trying to figure out why I was getting redefined subroutine warnings. I eventually tracked them down to this:
>
> # use `require` instead of `use` to avoid a
> # circular load
> require Client::Db;
>
>
> I had just written that code and my tests for it passed. However, the warnings were showing up elsewhere, so I was led on a wild goose chase.
>
> It should have been:
>
> require Client::DB;
>
>
> Because the file system on Macs is case-insensitive, it didn't see $INC{'Client/Db.pm'} and it cheerfully loaded Client/DB.pm and caused the warning.
>
> Is it possible to introduce some kind of lexically-scoped pragma to tell Perl to verify the case of the filename (and directories, too, of course)? This is not the first time I've been bitten by this and it would be nice if it could be addressed.

I think it depends on what you mean by "bitten by this". The generic
problem of case-insensitive file systems hiding certain typos in
package names in require/use statements has no portable efficient
solution. If there was one we would have done something about it
already.

However I think we might be able to do something for the specific case
where loading a module *twice* causes redefined warnings. For instance,
when we detect a subroutine is being redefined we could apply a
heuristic to check whether the filename it was loaded from is the same
after case-folding the name; if it is, we could add a note to call your
attention to the fact. My point being that we may not be able to
detect which of "Class::DB" and "Class::Db" is correct, but we should
be able to detect that if Class::DB::foo() is redefined by Class/Db.pm
after being installed by Class/DB.pm, there is likely a
case-insensitive file system error involved.
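
A user-level approximation of that heuristic, as a sketch only: a __WARN__ handler that, when it sees a "Subroutine ... redefined" warning, looks for two %INC keys that differ only by case and appends a note. The real proposal would live inside perl and compare the files the sub was actually loaded from. Here the helper case_hint, the fake %INC entries, and the string eval that triggers the warning are all assumptions for illustration, and lc is used as a simple stand-in for full case folding.

```perl
use strict;
use warnings;

# Hypothetical helper: returns a note if two of the given %INC-style
# keys are equal once lowercased, otherwise an empty string.
sub case_hint {
    my @keys = @_;
    my %seen;
    for my $key (sort @keys) {
        my $prior = $seen{ lc($key) };
        return "  (note: '$prior' and '$key' differ only by case)\n"
            if defined $prior;
        $seen{ lc($key) } = $key;
    }
    return '';
}

# Decorate redefinition warnings and collect everything in @log.
my @log;
$SIG{__WARN__} = sub {
    my ($message) = @_;
    $message .= case_hint(keys %INC)
        if $message =~ /^Subroutine \S+ redefined/;
    push @log, $message;
};

# Simulate the aftermath of a double load: two %INC keys and a sub
# that gets compiled twice.
$INC{'Client/DB.pm'} = '/fake/lib/Client/DB.pm';
$INC{'Client/Db.pm'} = '/fake/lib/Client/Db.pm';
eval q{
    package Client::DB;
    sub handle { 'first' }
    sub handle { 'second' }
    1;
} or die $@;

print @log;
```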

I also wonder if we couldn't tie %INC so that underneath it stored the
filenames twice, once case-folded, such that we could detect whenever
someone was loading a module name that differed from another already
loaded only by case. Then we could warn when people made this type of
mistake. In theory people could be doing this kind of thing
deliberately on case-insensitive file systems, but I would hope that
generally people would avoid that kind of pattern given how common
case-insensitive file systems are.
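
A sketch of the bookkeeping behind that idea, with one substitution stated plainly: rather than tying %INC, it scans a plain hash of %INC-style keys after the fact and groups them by their case-folded form. The function name case_collisions and the sample hash are assumptions. fc needs perl 5.16 or later.

```perl
use strict;
use warnings;
use feature 'fc';

# Groups %INC-style keys by their case-folded form and returns an
# array reference for each group with more than one spelling.
sub case_collisions {
    my (%inc) = @_;
    my %by_folded;
    push @{ $by_folded{ fc($_) } }, $_ for sort keys %inc;
    return grep { @$_ > 1 } map { $by_folded{$_} } sort keys %by_folded;
}

# A stand-in for %INC after the double load described in the thread.
my %fake_inc = (
    'Client/DB.pm' => '/lib/Client/DB.pm',
    'Client/Db.pm' => '/lib/Client/Db.pm',
    'strict.pm'    => '/lib/strict.pm',
);

my @collisions = case_collisions(%fake_inc);
print "loaded under several spellings: @$_\n" for @collisions;
```

Passing the real %INC instead of %fake_inc, for example in an END block, would report collisions for a whole program run.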

cheers,
Yves
--
perl -Mre=debug -e "/just|another|perl|hacker/"
Re: Case-insensitive file systems [ In reply to ]
On Sun, 16 Apr 2023 at 22:00, demerphq <demerphq@gmail.com> wrote:
> I also wonder if we couldn't tie %INC so that underneath it stored the
> filenames twice, once fold cased, such that we could detect whenever
> someone was loading a module name that differed from another already
> loaded only by case. Then we could warn when people did this type of
> mistake. In theory people could be doing this kind of thing
> deliberately on case-insensitive file systems, but i would hope that
> generally people would avoid that kind of pattern given how common
> case insensitive file systems are.

While poking into this I realized that on perl 5.38 and 5.37.x perls
with the require hooks in them, you should be able to do something
like:

my %INC_LC;
${^HOOK}{require__before} = sub {
    my ($file) = @_;
    # Spelling under which this name (ignoring case) was first required.
    my $seen = $INC_LC{lc $file};
    if ($seen and $seen ne $file) {
        warn "Not loading $file which was already loaded as $seen\n";
        $_[0] = ""; # halt the require.
    } else {
        $INC_LC{lc $file} = $file;
    }
    return;
};

Untested as I don't have a case-insensitive file system, but the basic
idea is there.

Yves

--
perl -Mre=debug -e "/just|another|perl|hacker/"