Mailing List Archive

Path hacking [Long] (was Re: Relative Package Imports)
Finally, something I can relate to. Although I have a goal of
packagizing everything I write these days, I haven't experienced any
of the problems that lead others to suggest relative imports. The
most complicated app that I hack on (continuously) is Mailman, which
has a main package and several subpackages off the main one. I always
use absolute paths in my import statements, so I don't see what the
fuss is about. But I'm perfectly willing to admit that I don't have
enough experience.

However...

>>>>> "JCA" == James C Ahlstrom <jim@interet.com> writes:

JCA> I find the PYTHONPATH mechanism totally unreliable for
JCA> commercial programs anyway. It is a global object, and an
JCA> installation of a second Python program can break the first
JCA> one. I don't think there is any solution to this other than
JCA> specify sys.path on a per-application basis. If this is
JCA> false, what is the other solution?

I completely agree with JimA here. It's been a pain with the Knowbot
stuff, a pain with Mailman, and a pain with other packages that I've
installed for shared use within CNRI. The .pth files solve part of
the problem nicely. They let me install, say PIL or PCT in a shared
location, for access by all the Python users at my site, without the
users having to individually hack their dot-files, etc.

But this doesn't work so well for apps like Mailman or the Knowbot
stuff because we can't expect that the person installing those
applications will be able to install a .pth file in the right place.
Also, .pth files don't let you tightly control sys.path, e.g. you can
only add paths, not delete or reorder them.
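
A minimal sketch of that add-only behaviour, using the standard
site.addsitedir() call, which processes any .pth files found in the
directory it is given. The temporary directory and the names
local.pth and shared are invented for the example.

```python
import os
import site
import sys
import tempfile


def add_shared_dir_via_pth():
    """Create a throwaway directory holding a .pth file and register it."""
    site_dir = tempfile.mkdtemp()
    shared = os.path.join(site_dir, "shared")
    os.mkdir(shared)
    # Each line of a .pth file names a directory, relative to the
    # directory that contains the .pth file.
    with open(os.path.join(site_dir, "local.pth"), "w") as handle:
        handle.write("shared\n")
    # addsitedir() appends to sys.path; it cannot delete or reorder entries.
    site.addsitedir(site_dir)
    return shared


shared_dir = add_shared_dir_via_pth()
print(shared_dir in sys.path)
```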

Plus you have a global naming problem. Mailman's top level package is
called "Mailman", so I can be fairly confident that I'm not going to
collide, but it means that I have an extra directory level within my
install that contains all the core importable modules. I don't think
that's a big deal, but it's a convention that other packaged app
writers should follow.

The problem is getting Mailman's (or the Knowbots') top level
directory on sys.path, and in exactly controlling the contents of
sys.path.

Our first approach with Knowbots was to do direct sys.path.insert()s,
which is quite ugly and error prone. Plus if you're adding many
paths, or adding and deleting, that's a lot of gibberish at the top of
your entry level executables. And now let's say that you have a dozen
or two dozen entry level executables that all have to perform the same
sys.path magic. That's a lot of cutting-and-pasting (and /highly/
error prone patching when directory structures change). It's a lose.

So for Knowbots we wrote a small module called pathhack that all entry
level executables imported. pathhack was good because it put all that
sys.path munging nonsense in one place so it was manageable from a s/w
engineering standpoint. But it sucked because those executables had
to /find/ pathhack.py! Bootstrap lossage (we've actually gone back to
sys.path.insert).
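
The thread does not show the contents of pathhack, so the following
is only a sketch of what such a module might do. The function name
add_paths and the directories in the usage note are assumptions.

```python
import sys


def add_paths(directories, path=None):
    """Put the given directories at the front of the path, each one once."""
    if path is None:
        path = sys.path
    # Walk the list backwards so the first directory ends up first.
    for directory in reversed(list(directories)):
        if directory in path:
            path.remove(directory)
        path.insert(0, directory)
    return path
```

An entry-level executable would then call something like
add_paths(["/opt/knowbots", "/opt/knowbots/lib"]) before its other
imports, which still leaves the bootstrap problem of finding the
module that defines add_paths.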

With Mailman, I could solve that problem because I added a
configure/make phase. This let me write a module template called
paths.py.in which configure flippered into paths.py containing path
hackage based on --prefix. The next trick was that "make install"
copied that paths.py file into all the subdirectories that had top
level entry points into the Mailman system (e.g. the bin directory,
the cron directory, the cgi directory). So now, an executable need
only do

import paths
import Mailman.Utils
import Mailman.Logging.Utils

and absolute paths work like a charm. I can even provide a
`pythonlib' directory that contains newer versions of standard modules
that have fixes for folks running older Pythons. Thus I do

from Mailman.pythonlib import rfc822

and the rest of my code uses my special rfc822 module with no changes.
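
The substitution step can be sketched as below. This is not Mailman's
actual paths.py.in; the template text and the function name
render_paths_module are assumptions, and the sketch assumes the prefix
contains no quotes or backslashes.

```python
# Hypothetical template; @prefix@ is the placeholder that the
# configure step would replace with the value of --prefix.
TEMPLATE = '''\
import sys
prefix = "@prefix@"
if prefix not in sys.path:
    sys.path.insert(0, prefix)
'''


def render_paths_module(template, prefix):
    """Return the source of paths.py for the given install prefix."""
    return template.replace("@prefix@", prefix)


print(render_paths_module(TEMPLATE, "/usr/local/mailman"))
```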

I'm very happy with how this works for Mailman, however we can't use
the same approach (or let's say Guido doesn't want to use this
approach) for the Knowbots stuff because there /is/ no "make install"
step. You just unpack it and go. But it still has to play lots of
games searching the file system for various things.

What I've been thinking is that Python needs a registry <shudder>.
JPython's already got such a beast, and it integrates with Java's
system properties, so that things like the PYTHONPATH equivalent are
set in the registry and immediately available. But it's not very
flexible, and you still need an install step in order to bootstrap the
locating of the registry.

I think we can do a little bit better. Python already knows how to
find its sys module. We can add an object into sys, call it
sys.registry, which would contain things like sys.path definitions,
and all sorts of other application specific keys. This object would
be tied to a file (or files) which might be human readable, a
marshal/pickle (or both). Bootstrap location of this file(s) is an
issue, but see below.

This would let you do things like the following at the beginning of
every top level executable:

import sys
sys.application = 'zope'
sys.registry.setpath(sys.application+'.pythonpath')

I'm sure all kinds of lengthy discussion will now ensue about the
exact interface of the registry object, but I'll make just a few
observations:

- There should be a system wide registry and a user specific
registry. This lets an admin install shared applications easily,
but also lets individual users have their own overrides.

- The system-wide registry can be located in say
sys.prefix/lib/python<version>/site-packages. The user registry
would reside somewhere in $HOME. This could all be platform
specific so that on Windows, maybe the Python registry is integrated
with the Windows registry, while in JPython it would be integrated
with the standard JPython registry mechanism.

- You should be able to specify registry entries on the command line.

- There need to be defined rules for resolving registry keys b/w
system, user, and command line specifications. JPython has some
experience here (although there have been requests to change
JPython's lookup order), and at the very least, JPython and CPython
should be as consistent as possible (CPython won't have to merge in
Java's system properties).

- The sys.registry object should be read/writable. This would let an
install script do something like:

import sys
sys.registry.lock()
sys.registry.put('zope.pythonpath',
                 '@prefix@:@prefix@/matools:@prefix@/pythonlib')
sys.registry.write()
sys.registry.unlock()

which would write either the global system registry or the local
user registry, depending on permissions (or maybe that's spelled
explicitly in the API).

- In a sense you're pushing the namespace issue up a level into the
registry, but at least this is a domain we can completely control
from Python; it abstracts away the file system, and I don't think
there's any way to avoid requiring conventions and cooperation for
registry key naming. I also don't think it'll be a big problem in
practice. When I packagize and re-release my Zarathustra's Ocular
Python Experience virtual reality system, I'll try to think of a
non-colliding top level package name.

- (oh darn, I know I had more points, but Guido just popped in and I
lost my train of thought).
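
To make the lookup rules concrete, here is a toy sketch of a layered
registry. Nothing like sys.registry exists in Python; the class name,
its methods, and the precedence order (command line over user over
system) are all assumptions chosen for illustration.

```python
class Registry:
    """Toy layered key/value store; not an existing Python API."""

    def __init__(self, system=None, user=None, command_line=None):
        # Assumed precedence: command line beats user, user beats system.
        self.layers = [command_line or {}, user or {}, system or {}]

    def get(self, key, default=None):
        """Return the value from the highest-precedence layer that has key."""
        for layer in self.layers:
            if key in layer:
                return layer[key]
        return default

    def path_entries(self, key, separator=":"):
        """Split a path-style value into its non-empty components."""
        value = self.get(key, "")
        return [entry for entry in value.split(separator) if entry]


registry = Registry(
    system={"zope.pythonpath": "/usr/zope:/usr/zope/pythonlib"},
    user={"zope.pythonpath": "/home/me/zope"},
)
print(registry.path_entries("zope.pythonpath"))
```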

Well, this has gone on long enough so I might as well let you guys
shoot this idea all to hell. Let me close by saying that while I
think the Windows registry is a mess, I also think that it might be
useful for Python. Does it solve the same problem that the relative
imports is trying to solve? I dunno, but that's why I changed the
Subject: line above. :)

-Barry
Re: Path hacking [Long] (was Re: Relative Package Imports)
"Barry A. Warsaw" wrote:
>
> So for Knowbots we wrote a small module called pathhack that all entry
> level executables imported. pathhack was good because it put all that
> sys.path munging nonsense in one place so it was manageable from a s/w
> engineering standpoint. But it sucked because those executables had
> to /find/ pathhack.py! Bootstrap lossage (we've actually gone back to
> sys.path.insert).

Yes, exactly the problem I had, bootstrapping the import of pathhack.
Actually it gets worse because Python imports exceptions.py, site.py
and sitecustomize.py during Py_Initialize(), so if you are having a
really bad day, you might pick up the wrong version of those.

AFAIK, the only way to solve that currently is to use freeze
to build pathhack into the binary executable. That is what I
do anyway. But it is not an ideal solution.

> What I've been thinking is that Python needs a registry <shudder>.

Yikes! As you say, the Windows registry is a mess.

> [Lots of good ideas omitted...]

> - The system-wide registry can be located in say
> sys.prefix/lib/python<version>/site-packages. The user registry
> would reside somewhere in $HOME. This could all be platform
> specific so that on Windows, maybe the Python registry is integrated
> with the Windows registry, while in JPython it would be integrated
> with the standard JPython registry mechanism.

Python already has three directories it knows about: sys.executable is
the directory of the interpreter binary, sys.dllfullpath could be
the directory of the interpreter as a shared library (I have a
patch for this), and there is the directory of the main Python program
as given on the command line. Perhaps we can put the registry
in one of these directories. That would be consistent on all
platforms.

> - You should be able to specify registry entries on the command line.

This is vital because I am worried about a bad registry.

> - There need to be defined rules for resolving registry keys b/w
> system, user, and command line specifications. JPython has some
> experience here (although there have been requests to change

I am not sure a full registry is required. Once you can control
sys.path and can get an accurate import of sitecustomize.py, you
can do everything else there. Maybe just a command line option
is enough. But I will think about it...

Jim Ahlstrom
Re: Path hacking
I just had a long discussion with Barry and Fred, in response to his
registry proposal. We quickly decided that a Python registry is
overkill for the given problem. We also quickly came up with a nice
variant of Mailman's approach which will work well in a variety of
cases.

--> The context:

You have a large complicated application that contains many modules
spread over many packages, and which has many "top-level" scripts that
are invoked by the user (or via CGI, for example). All the code is
properly packagized, with sufficiently globally unique package names
being used all over the place.

--> The problem:

How to get the root directory of your application (where all your
packages live) on sys.path.

--> The rules:

Using $PYTHONPATH is right out.

You can't install new files in the core Python installation directory
(nor modify existing ones), so using .pth files is also out.

You don't want to have to edit each of the top-level scripts of your
application.

You want a cross-platform solution, in particular it should be
amenable to Windows.

--> The assumptions:

You can use a reasonably intelligent installer.

All your top-level scripts are installed in a single directory (or
perhaps in a small number of separate bin directories, e.g. bin and
cgi-bin).

--> The solution:

Suppose your application (as a whole, not the individual top-level
script) is called Spam -- this may well also be the name of your
top-level package. Then start each top-level script with the single
line

import Spam_path

before importing anything else.

Your installer, once it knows the absolute pathname of your
application's root directory, crafts a file Spam_path.py which
contains code that inserts the right absolute pathname into sys.path.

Your installer then installs a copy of this file (or a symbolic link
to it) *in each bin directory where it installs top-level Python
scripts*.

Because the script's directory is first on the default path, the Spam
scripts will pick up Spam_path without any help from $PYTHONPATH.
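
The installer step described above can be sketched as follows. The
function name write_path_module and the generated file's contents are
assumptions, one possible shape rather than a prescribed one.

```python
import os


def write_path_module(app_root, bin_dirs, module_name="Spam_path"):
    """Write <module_name>.py into each bin directory; return the file names."""
    # The generated module hardcodes the absolute application root.
    source = (
        "import sys\n"
        "_root = %r\n"
        "if _root not in sys.path:\n"
        "    sys.path.insert(0, _root)\n"
    ) % (os.path.abspath(app_root),)
    written = []
    for bin_dir in bin_dirs:
        target = os.path.join(bin_dir, module_name + ".py")
        with open(target, "w") as handle:
            handle.write(source)
        written.append(target)
    return written
```

A script installed in one of those directories then needs only its
import of Spam_path as the first line, since the script's own
directory is searched first.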

--> Notes:

If you are Spam's developer, you probably want to be able to use its
top-level scripts without having to install them. All you need to do
is create a file Spam_path.py pointing to the top of your development
tree, and set $PYTHONPATH to point to the directory that contains it.

(Perhaps you already have $PYTHONPATH pointing to a personal directory
of Python modules you like to have accessible -- then you can just
drop Spam_path.py there, or link to it from there.)

Note that adding a personal directory of Python goodies is about the
only use of $PYTHONPATH that I approve of -- this way, you can set
$PYTHONPATH in your .profile and never have to change it.

I know this doesn't resolve the relative import thread (how's that
going by the way? :-) but Barry & Fred & I agree that this is the best
solution to the problem stated in Barry's message to which I am
following up here.

--Guido van Rossum (home page: http://www.python.org/~guido/)
Re: Path hacking
"Guido van Rossum" wrote:
> --> The solution:

Ah, finally a specific proposal...

> Suppose your application (as a whole, not the individual top-level
> script) is called Spam -- this may well also be the name of your
> top-level package. Then start each top-level script with the single
> line
>
> import Spam_path
>
> before importing anything else.

This should not be necessary if you use the name "sitecustomize" instead
of "Spam_path" right? The file sitecustomize.py is automatically
imported.
Actually all this sounds like site.py all over again.

> Your installer, once it knows the absolute pathname of your
> application's root directory, crafts a file Spam_path.py which
> contains code that inserts the right absolute pathname into sys.path.

I don't think this is necessary either. The sys module is available.
So sitecustomize.py can say:
import sys
mydir = sys.path[0]
if not mydir:
    import os
    mydir = os.getcwd()
sys.path = [mydir] # To be really extreme about it
# Note: inserting mydir as sys.path[0] should be redundant but is not

> Your installer then installs a copy of this file (or a symbolic link
> to it) *in each bin directory where it installs top-level Python
> scripts*.
>
> Because the script's directory is first on the default path, the Spam
> scripts will pick up Spam_path without any help from $PYTHONPATH.

Hmmm. Is this really true? Nothing else, for example the registry, can
change sys.path[0]? Ever? Please say yes.

> I know this doesn't resolve the relative import thread (how's that
> going by the way? :-) but Barry & Fred & I agree that this is the best
> solution to the problem stated in Barry's message to which I am
> following up here.

This is a good idea, but there are a few problems.

It depends on sys.path[0] being the directory of the Python
file being executed as the main program. I guess I never
really trusted this before. I think if this is the case it
should never be ''. A relative path or no path on the command
line (the __main__ program) should be replaced by the full path
in the sys module setup. Then the "mydir = os.getcwd()" above
is not necessary. And inserting mydir as sys.path[0] is truly
redundant should the current directory change (as it certainly will).
This is currently a problem with sys.path[0] which should be
fixed no matter what else happens.

The files exceptions.py and site.py must be in all the bin
directories as well as sitecustomize.py because they are
automatically imported in Py_Initialize().

The above doesn't work when you start the Python command
interpreter (no main). I know, it's a minor point.

It seems to me this totally solves Jim Fulton's and Marc's
problem and makes "__" unnecessary. You just install zope
and mx in zopedir, perform the above, and presto you have a new
private name space where you can control all your names. But
there must be some problem here I haven't thought of.

I still worry that this is not powerful enough. Greg Stein
has volunteered to re-write import.c in Python (using imputil.py)
and this is a Great Idea. Lots of Python could probably be
written in itself. I would like to try writing the main
program in Python and eliminating the special freeze main
program. Once you start on this road (and I think it is a good road)
you have Python code which is more truly part of the binary
interpreter than a library.

Proposal:

Use a special PYTHONPATH on startup to find "special" Python
files which are really part of the interpreter. There are
three directories Python knows about. Namely sys.path[0]
(once it is fixed), sys.executable and sys.dllfullpath,
the directory of python15.dll or other shared library (once it is
added to sys). How about prepending the single directory sys.executable
to sys.path during Py_Initialize()? And demanding that modules
like the new Greg_importer.py[c], exceptions.py[c] and site.py[c]
be placed there.

Actually I would prefer sys.dllfullpath if it exists, since that
is where the interpreter is, and I am trying to associate these
special internal Python files exactly with their correct Python
interpreter.

Alternative Proposal:

Py_Initialize() first imports its files from sys.executable + '/' +
PyInternal.pyl (again I prefer sys.dllfullpath).
PyInternal.pyl is a Python library file (like a Java Jar
file) which would contain modules like exceptions, etc.
The PyInternal.pyl file has the standard Python library file
format (whatever that turns out to be). It is not an error if
this file is absent.

Jim Ahlstrom
Re: Re: Path hacking
[Guido]
> > import Spam_path
> >
> > before importing anything else.

[JimA]
> This should not be necessary if you use the name "sitecustomize" instead
> of "Spam_path" right? The file sitecustomize.py is automatically
> imported.
> Actually all this sounds like site.py all over again.

But the intention here is for the customization to be application
specific (hence the Spam in the name). sitecustomize doesn't know
whether I need the Mailman or the Knowbot root added to my path.

Or do you mean to imply that we can do this with zero text added to
the script, by simply dropping an appropriate sitecustomize.py in the
script dir? Unfortunately this does currently *not* work, because
sys.path[0] is added after Py_Initialize() is run.

> > Your installer, once it knows the absolute pathname of your
> > application's root directory, crafts a file Spam_path.py which
> > contains code that inserts the right absolute pathname into sys.path.
>
> I don't think this is necessary either. The sys module is available.
> So sitecustomize.py can say:
> import sys
> mydir = sys.path[0]
> if not mydir:
>     import os
>     mydir = os.getcwd()
> sys.path = [mydir] # To be really extreme about it
> # Note: inserting mydir as sys.path[0] should be redundant but is not

Hm, guessing based on the script directory might work, but seems less
reliable than hardcoding it through the installer. But you can use
this if it works for your application.

> > Your installer then installs a copy of this file (or a symbolic link
> > to it) *in each bin directory where it installs top-level Python
> > scripts*.
> >
> > Because the script's directory is first on the default path, the Spam
> > scripts will pick up Spam_path without any help from $PYTHONPATH.
>
> Hmmm. Is this really true? Nothing else, for example the registry, can
> change sys.path[0]? Ever? Please say yes.

Yes. (The registry can add module-specific paths, which will be
searched before sys.path is even looked at, but this is only for
specific modules. It cannot insert a general directory that is
searched.) The only way this can fail is if an embedding app fails to
call PySys_SetArgv().

> > I know this doesn't resolve the relative import thread (how's that
> > going by the way? :-) but Barry & Fred & I agree that this is the best
> > solution to the problem stated in Barry's message to which I am
> > following up here.
>
> This is a good idea, but there are a few problems.
>
> It depends on sys.path[0] being the directory of the Python
> file being executed as the main program. I guess I never
> really trusted this before. I think if this is the case it
> should never be ''. A relative path or no path on the command
> line (the __main__ program) should be replaced by the full path
> in the sys module setup. Then the "mydir = os.getcwd()" above
> is not necessary. And inserting mydir as sys.path[0] is truly
> redundant should the current directory change (as it certainly will).
> This is currently a problem with sys.path[0] which should be
> fixed no matter what else happens.

I have always resisted forcing path items to be absolute, although I'm
not sure that my reasons are valid any more (it has to do with the
fact that getcwd() may fail and the fact that portable path
concatenation is a pain). In any case, that's a separate issue -- I
agree that if sys.path[0] is '' (as it often is) it's better for
site.py or sitecustomize.py or Spam_path.py (or whoever) to absolutize
it (and everything else on the path) so that it will still work if the
app does a chdir().
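
A minimal sketch of that absolutizing step, which a module such as
Spam_path.py could run before the application changes directory. The
function name absolutize is an assumption.

```python
import os
import sys


def absolutize(path_entries, cwd=None):
    """Return the entries as absolute paths, resolved against cwd."""
    if cwd is None:
        cwd = os.getcwd()
    # '' means the current directory; joining it with cwd and
    # normalizing yields cwd itself. Absolute entries pass through.
    return [os.path.normpath(os.path.join(cwd, entry)) for entry in path_entries]


sys.path[:] = absolutize(sys.path)
```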

> The files exceptions.py and site.py must be in all the bin
> directories as well as sitecustomize.py because they are
> automatically imported in Py_Initialize().

Yes.

> The above doesn't work when you start the Python command
> interpreter (no main). I know, it's a minor point.

You could add the "import Spam_path" to your $PYTHONSTARTUP file.

> It seems to me this totally solves Jim Fulton's and Marc's
> problem and makes "__" unnecessary. You just install zope
> and mx in zopedir, perform the above, and presto you have a new
> private name space where you can control all your names. But
> there must be some problem here I haven't thought of.

I think no simple solution that *I* can come up with will satisfy
JimF's and Marc's desire for obscurity :-)

> I still worry that this is not powerful enough. Greg Stein
> has volunteered to re-write import.c in Python (using imputil.py)
> and this is a Great Idea. Lots of Python could probably be
> written in itself. I would like to try writing the main
> program in Python and eliminating the special freeze main
> program. Once you start on this road (and I think it is a good road)
> you have Python code which is more truly part of the binary
> interpreter than a library.

Yes, this is the plan for Python 2.0, and some of it may be
implemented in Python 1.6.

> Proposal:
>
> Use a special PYTHONPATH on startup to find "special" Python
> files which are really part of the interpreter. There are
> three directories Python knows about. Namely sys.path[0]
> (once it is fixed), sys.executable and sys.dllfullpath,
> the directory of python15.dll or other shared library (once it is
> added to sys). How about prepending the single directory sys.executable
> to sys.path during Py_Initialize()? And demanding that modules
> like the new Greg_importer.py[c], exceptions.py[c] and site.py[c]
> be placed there.

On Unix, this is a bin directory and it is strongly discouraged to put
non-program files there. Python already does something similar --
it looks around in sys.executable's ancestors for a specific landmark,
currently lib/python<version>/string.py. Arguably, it should search
for exceptions.py instead.
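
The landmark search can be sketched in Python as below. This only
illustrates the idea and is not the interpreter's actual startup
code; the function name find_prefix is an assumption.

```python
import os


def find_prefix(executable, landmark):
    """Return the nearest ancestor of executable that contains landmark."""
    directory = os.path.dirname(os.path.abspath(executable))
    while True:
        if os.path.isfile(os.path.join(directory, landmark)):
            return directory
        parent = os.path.dirname(directory)
        if parent == directory:
            # Reached the filesystem root without finding the landmark.
            return None
        directory = parent
```

With an interpreter at /usr/local/bin/python and the landmark
lib/python1.5/string.py, the search would try /usr/local/bin, then
/usr/local, and stop at the first directory containing the landmark.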

> Actually I would prefer sys.dllfullpath if it exists, since that
> is where the interpreter is, and I am trying to associate these
> special internal Python files exactly with their correct Python
> interpreter.

Is the full DLL path available at any point? This would certainly be
a good starting point -- especially when the DLL is loaded implicitly
as the result of some COM operation.

> Alternative Proposal:
>
> Py_Initialize() first imports its files from sys.executable + '/' +
> PyInternal.pyl (again I prefer sys.dllfullpath).
> PyInternal.pyl is a Python library file (like a Java Jar
> file) which would contain modules like exceptions, etc.
> The PyInternal.pyl file has the standard Python library file
> format (whatever that turns out to be). It is not an error if
> this file is absent.

I guess this is all up to the redesign of the import mechanism
(something like Greg Stein's imputil.py for sure).

--Guido van Rossum (home page: http://www.python.org/~guido/)
Re: Re: Path hacking
Guido van Rossum wrote:
> > It seems to me this totally solves Jim Fulton's and Marc's
> > problem and makes "__" unnecessary. You just install zope
> > and mx in zopedir, perform the above, and presto you have a new
> > private name space where you can control all your names. But
> > there must be some problem here I haven't thought of.
>
> I think no simple solution that *I* can come up with will satisfy
> JimF's and Marc's desire for obscurity :-)

Never mind, I'll use an imputil.py based approach to get relative
imports to work in my packages. That is when I get imputil.py
to work... it doesn't seem to be quite there yet (or I'm using
an old version).

BTW, I'm 100% behind you guys if you choose to reimplement Python's
import mechanism in Python using a similar approach as the one
Greg implemented in imputil. Should make everybody happy: those
who want obscure syntactic add-ons and others with a taste for
zlib'ed packaged byte code, plus those VMS freaks ;-)

Perhaps we should start a new thread on that topic...

Still needed are:

· Python level APIs for the platform specific magic on
Win32 and Macs (OS/2, BeOS ?), e.g. access to the Windows
registry and the Mac forks

· Patches to make the DirectoryImporter 100% backward compatible

Greg's imputil.py can be found at:

http://www.lyra.org/greg/small/

The trick would then be to install an application specific
importer in the setup module Spam_path or MyAppSetup which
then takes care of all the rest...

--
Marc-Andre Lemburg
______________________________________________________________________
Y2000: 107 days left
Business: http://www.lemburg.com/
Python Pages: http://www.lemburg.com/python/
Re: Re: Path hacking
Jim Ahlstrom wrote:

> "Guido van Rossum" wrote:
> > --> The solution:

Did the dev-list miss something? The last I see is Barry's post.



- Gordon
Re: Re: Path hacking
> From: "Gordon McMillan" <gmcm@hypernet.com>

> Jim Ahlstrom wrote:
>
> > "Guido van Rossum" wrote:
> > > --> The solution:
>
> Did the dev-list miss something? The last I see is Barry's post.

Hm. We had an email glitch. Apparently this message got lost:

Subject: Re: Path hacking
From: Guido van Rossum <guido@CNRI.Reston.VA.US>
To: python-dev@python.org
Date: Tue, 14 Sep 1999 15:57:51 -0400

I just had a long discussion with Barry and Fred, in response to his
registry proposal. We quickly decided that a Python registry is
overkill for the given problem. We also quickly came up with a nice
variant of Mailman's approach which will work well in a variety of
cases.

--> The context:

You have a large complicated application that contains many modules
spread over many packages, and which has many "top-level" scripts that
are invoked by the user (or via CGI, for example). All the code is
properly packagized, with sufficiently globally unique package names
being used all over the place.

--> The problem:

How to get the root directory of your application (where all your
packages live) on sys.path.

--> The rules:

Using $PYTHONPATH is right out.

You can't install new files in the core Python installation directory
(nor modify existing ones), so using .pth files is also out.

You don't want to have to edit each of the top-level scripts of your
application.

You want a cross-platform solution, in particular it should be
amenable to Windows.

--> The assumptions:

You can use a reasonably intelligent installer.

All your top-level scripts are installed in a single directory (or
perhaps in a small number of separate bin directories, e.g. bin and
cgi-bin).

--> The solution:

Suppose your application (as a whole, not the individual top-level
script) is called Spam -- this may well also be the name of your
top-level package. Then start each top-level script with the single
line

import Spam_path

before importing anything else.

Your installer, once it knows the absolute pathname of your
application's root directory, crafts a file Spam_path.py which
contains code that inserts the right absolute pathname into sys.path.

Your installer then installs a copy of this file (or a symbolic link
to it) *in each bin directory where it installs top-level Python
scripts*.

Because the script's directory is first on the default path, the Spam
scripts will pick up Spam_path without any help from $PYTHONPATH.

--> Notes:

If you are Spam's developer, you probably want to be able to use its
top-level scripts without having to install them. All you need to do
is create a file Spam_path.py pointing to the top of your development
tree, and set $PYTHONPATH to point to the directory that contains it.

(Perhaps you already have $PYTHONPATH pointing to a personal directory
of Python modules you like to have accessible -- then you can just
drop Spam_path.py there, or link to it from there.)

Note that adding a personal directory of Python goodies is about the
only use of $PYTHONPATH that I approve of -- this way, you can set
$PYTHONPATH in your .profile and never have to change it.

I know this doesn't resolve the relative import thread (how's that
going by the way? :-) but Barry & Fred & I agree that this is the best
solution to the problem stated in Barry's message to which I am
following up here.

--Guido van Rossum (home page: http://www.python.org/~guido/)
Re: Re: Path hacking
Gordon McMillan wrote:
>
> Jim Ahlstrom wrote:
>
> > "Guido van Rossum" wrote:
> > > --> The solution:
>
> Did the dev-list miss something? The last I see is Barry's post.

My mail system is flakey, so I have been reading this list
directly on python.org. I didn't get it by list either, so
I assumed my mailer ate it. See:

http://www.python.org/pipermail/python-dev/1999-September/000851.html

Jim Ahlstrom
Re: Re: Path hacking
Guido van Rossum wrote:
>
> But the intention here is for the customization to be application
> specific (hence the Spam in the name). sitecustomize doesn't know
> whether I need the Mailman or the Knowbot root added to my path.

Ah, you have multiple scripts in one directory and multiple
Foo_path, Bar_path etc. I was thinking with my Windows head.
A commercial Windows app generally has its own exclusive
install directory, so I was thinking single directory so a single
sitecustomize.py.

> Or do you mean to imply that we can do this with zero text added to
> the script, by simply dropping an appropriate sitecustomize.py in the
> script dir?

Yes, that is exactly what I was thinking.

> Unfortunately this does currently *not* work, because
> sys.path[0] is added after Py_Initialize() is run.

Yikes! That kills using sitecustomize.py. Your Spam_path
still works because it is imported later, but requires an
import in each Python main script just as you said.

Even worse, it means that exceptions.py and site.py can not
be found at all except using the normal PYTHONPATH, and
putting their path in Spam_path will *not* work.

> > > Because the script's directory is first on the default path, the Spam
> > > scripts will pick up Spam_path without any help from $PYTHONPATH.
> >
> > Hmmm. Is this really true? Nothing else, for example the registry, can
> > change sys.path[0]? Ever? Please say yes.
>
> Yes. (The registry can add module-specific paths, which will be
> searched before sys.path is even looked at, but this is only for
> specific modules. It cannot insert a general directory that is
> searched.) The only way this can fail is if an embedding app fails to
> call PySys_SetArgv().

Oh dear, I think I heard no instead of yes. Are you saying that if
someone else installs a Python app on my customer's machine after I do,
and sets a registry entry which says to use c:/other/path/to/site.py
for site.py (as he may very well want to do), then if my Python program
depends on getting my copy of site.py from my directory, it will then
use the other copy instead and may very well fail?

> In any case, that's a separate issue -- I
> agree that if sys.path[0] is '' (as it often is) it's better for
> site.py or sitecustomize.py or Spam_path.py (or whoever) to absolutize
> it (and everything else on the path) so that it will still work if the
> app does a chdir().

Point on the curve: Windows apps generally start from an icon
which contains their path and current working directory, and
these are generally different. So a Windows app in general will
*never* have had a getcwd() equal to the path of either the
binary interpreter or the Python main script.

> > The files exceptions.py and site.py must be in all the bin
> > directories as well as sitecustomize.py because they are
> > automatically imported in Py_Initialize().
>
> Yes.

Well, *no* right? This fails unless the bin directories are in
fact on PYTHONPATH. The only way to get exceptions.py is by using
sys.path as it exists within Py_Initialize(). So there is no
hacked sys.path[0] equal to the script dir. And since the
path hacks in site.py haven't happened yet either, we have
an incomplete sys.path at that point.

> > added to sys). How about prepending the single directory sys.executable
> > to sys.path during Py_Initialize()? And demanding that modules
> > like the new Greg_importer.py[c], exceptions.py[c] and site.py[c]
> > be placed there.
>
> On Unix, this is a bin directory and it is strongly discouraged to put
> non-program files there.

Ok, point taken.

> Is the full DLL path available at any point? This would certainly be
> a good starting point -- especially when the DLL is loaded implicitly
> as the result of some COM operation.

I don't know about loading by COM, but if it is a file, its absolute
path is reliably known in sys, the code is identical to that currently
used for sys.executable (on Windows), and I have a patch if you want.

JimA's conjecture: It is currently impossible to
ship a Python app which can not be damaged by the installation of a
second Python app without using a hacked custom binary.

Jim Ahlstrom
Re: Re: Path hacking [ In reply to ]
>>>>> "Gordo" == Gordon McMillan <gmcm@hypernet.com> writes:

Gordo> Jim Ahlstrom wrote:

>> "Guido van Rossum" wrote:
>> --> The solution:

Gordo> Did the dev-list miss something? The last I see is Barry's
Gordo> post.

I have a suspicion that python.org lost some email yesterday.

We had a period of time where mail simply stopped getting delivered
(thank you Solaris patch manager) and it took me a little while to
realize that things weren't working correctly. Since there's nothing
unexpected in the mail queue now, all I can say is that if you didn't
get it by now, you ain't gonna.

However, everything seemed to make it into the archives, so Guido's
message is available at:

http://www.python.org/pipermail/python-dev/1999-September/000880.html

-Barry
Re: Re: Path hacking [ In reply to ]
Hmm, I'm suspicious of the fact that no message from Barry
Warsaw ever gets "lost".

Stalin got started by being in charge of the Kremlin's
telephone system, you know...


> >>>>> "Gordo" == Gordon McMillan <gmcm@hypernet.com> writes:
>
> Gordo> Jim Ahlstrom wrote:
>
> >> "Guido van Rossum" wrote:
> >> --> The solution:
>
> Gordo> Did the dev-list miss something? The last I see is
> Gordo> Barry's post.
>
> I have a suspicion that python.org lost some email yesterday.
>
> We had a period of time where mail simply stopped getting
> delivered (thank you Solaris patch manager) and it took me a
> little while to realize that things weren't working correctly.
> Since there's nothing unexpected in the mail queue now, all I can
> say is that if you didn't get it by now, you ain't gonna.
>
> However, everything seemed to make it into the archives, so
> Guido's message is available at:
>
> http://www.python.org/pipermail/python-dev/1999-September/000880.html
>
> -Barry



- Gordon
Re: Re: Path hacking [ In reply to ]
>>>>> "Gordo" == Gordon McMillan <gmcm@hypernet.com> writes:

Gordo> Hmm, I'm suspicious of the fact that no message from Barry
Gordo> Warsaw ever gets "lost".

Gordo> Stalin got started by being in charge of the Kremlin's
Gordo> telephone system, you know...

Well, I managed to get rid of Ken so my world domination plan is right
on schedule!

guido-may-be-benevolent-but-you-can-bet-i-won't-be-ly y'rs,
-Barry
Re: Re: Path hacking [ In reply to ]
[me]
> > But the intention here is for the customization to be application
> > specific (hence the Spam in the name). sitecustomize doesn't know
> > whether I need the Mailman or the Knowbot root added to my path.

[JimA]
> Ah, you have multiple scripts in one directory and multiple
> Foo_path, Bar_path etc. I was thinking with my Windows head.
> A commercial Windows app generally has its own exclusive
> install directory, so I was thinking single directory so a single
> sitecustomize.py.
>
> > Or do you mean to imply that we can do this with zero text added to
> > the script, by simply dropping an appropriate sitecustomize.py in the
> > script dir?
>
> Yes, that is exactly what I was thinking.
>
> > Unfortunately this does currently *not* work, because
> > sys.path[0] is added after Py_Initialize() is run.
>
> Yikes! That kills using sitecustomize.py. Your Spam_path
> still works because it is imported later, but requires an
> import in each Python main script just as you said.

Not too bad (who cares about one more line of boilerplate...).

> Even worse, it means that exceptions.py and site.py can not
> be found at all except using the normal PYTHONPATH, and
> putting their path in Spam_path will *not* work.

Why would you want your own exceptions.py and site.py?

> > > > Because the script's directory is first on the default path, the Spam
> > > > scripts will pick up Spam_path without any help from $PYTHONPATH.
> > >
> > > Hmmm. Is this really true? Nothing else, for example the registry, can
> > > change sys.path[0]? Ever? Please say yes.
> >
> > Yes. (The registry can add module-specific paths, which will be
> > searched before sys.path is even looked at, but this is only for
> > specific modules. It cannot insert a general directory that is
> > searched.) The only way this can fail is if an embedding app fails to
> > call PySys_SetArgv().
>
> Oh dear, I think I heard no instead of yes. Are you saying that if
> someone else installs a Python app on my customer's machine after I do,
> and sets a registry entry which says to use c:/other/path/to/site.py
> for site.py (as he may very well want to do), then if my Python program
> depends on getting my copy of site.py from my directory, it will then
> use the other copy instead and may very well fail?

Again - why would anyone register their own site.py?

> > In any case, that's a separate issue -- I
> > agree that if sys.path[0] is '' (as it often is) it's better for
> > site.py or sitecustomize.py or Spam_path.py (or whoever) to absolutize
> > it (and everything else on the path) so that it will still work if the
> > app does a chdir().
>
> Point on the curve: Windows apps generally start from an icon
> which contains their path and current working directory, and
> these are generally different. So a Windows app in general will
> *never* have had a getcwd() equal to the path of either the
> binary interpreter or the Python main script.

You're lucky. It turns out that on Windows, under those circumstances
at least, sys.path[0] is the absolute pathname of the directory. You
only see '' if sys.argv[0] doesn't have any pathname information;
that's only possible if the script *does* live in the current
directory.
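
A minimal sketch of the "absolutize" step mentioned above, written in
modern Python. The helper name absolutize() and its arguments are
hypothetical; it only illustrates turning '' and relative entries into
absolute ones so that a later chdir() cannot change what they mean.

```python
import os


def absolutize(path_entries, cwd):
    """Return a copy of path_entries with every entry made absolute.

    An empty string stands for the current directory, so it becomes cwd.
    Entries that resolve to the same directory are kept only once.
    """
    result = []
    for entry in path_entries:
        if entry == "":
            absolute = cwd
        else:
            # os.path.join leaves an already absolute entry unchanged.
            absolute = os.path.join(cwd, entry)
        absolute = os.path.normpath(absolute)
        if absolute not in result:
            result.append(absolute)
    return result


if __name__ == "__main__":
    import sys

    # Freeze the meaning of the path at startup, before any chdir().
    sys.path[:] = absolutize(sys.path, os.getcwd())
```

Such a helper would have to run before the application changes
directory, which is why Spam_path.py or sitecustomize.py is the natural
place for it.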

> > > The files exceptions.py and site.py must be in all the bin
> > > directories as well as sitecustomize.py because they are
> > > automatically imported in Py_Initialize().
> >
> > Yes.
>
> Well, *no* right? This fails unless the bin directories are in
> fact on PYTHONPATH. The only way to get exceptions.py is by using
> sys.path as it exists within Py_Initialize(). So there is no
> hacked sys.path[0] equal to the script dir. And since the
> path hacks in site.py haven't happened yet either, we have
> an incomplete sys.path at that point.

Sorry, I've lost track of what we were after here. Indeed, the
scripts' directory (which I presume you meant by the bin directory)
doesn't occur in sys.path until after Py_Initialize() has run.

> > > added to sys). How about prepending the single directory sys.executable
> > > to sys.path during Py_Initialize()? And demanding that modules
> > > like the new Greg_importer.py[c], exceptions.py[c] and site.py[c]
> > > be placed there.
> >
> > On Unix, this is a bin directory and it is strongly discouraged to put
> > non-program files there.
>
> Ok, point taken.
>
> > Is the full DLL path available at any point? This would certainly be
> > a good starting point -- especially when the DLL is loaded implicitly
> > as the result of some COM operation.
>
> I don't know about loading by COM, but if it is a file, its absolute
> path is reliably known in sys, the code is identical to that currently
> used for sys.executable (on Windows), and I have a patch if you want.

I presume using GetModuleFileName()? Please send me the patch!

> JimA's conjecture: It is currently impossible to
> ship a Python app which can not be damaged by the installation of a
> second Python app without using a hacked custom binary.

Sounds right. All tricks to make the app unique require using a
different registry key, which requires a change to the DLL. However,
you can do this without recompiling! The version string used is
embedded in a resource, so you can patch it using some kind of
resource editor. Mark Hammond planned it this way!

--Guido van Rossum (home page: http://www.python.org/~guido/)
Re: Re: Path hacking [ In reply to ]
Guido van Rossum wrote:
> > Even worse, it means that exceptions.py and site.py can not
> > be found at all except using the normal PYTHONPATH, and
> > putting their path in Spam_path will *not* work.
>
> Why would you want your own exceptions.py and site.py?

I don't. I never change Python library files. I am worried
that they won't be found because I don't trust PYTHONPATH.

> > Oh dear, I think I heard no instead of yes. Are you saying that if
> > someone else installs a Python app on my customer's machine after I do,
> > and sets a registry entry which says to use c:/other/path/to/site.py
> > for site.py (as he may very well want to do), then if my Python program
> > depends on getting my copy of site.py from my directory, it will then
> > use the other copy instead and may very well fail?
>
> Again - why would anyone register their own site.py?

I wouldn't, I am worried that someone else will break my installation.
Remember that site.py was invented as a site-specific module, although
that function moved to sitecustomize.py.

> I presume using GetModuleFileName()? Please send me the patch!

Yes, and OK.

> > JimA's conjecture: It is currently impossible to
> > ship a Python app which can not be damaged by the installation of a
> > second Python app without using a hacked custom binary.
>
> Sounds right. All tricks to make the app unique require using a
> different registry key, which requires a change to the DLL. However,
> you can do this without recompiling! The version string used is
> embedded in a resource, so you can patch it using some kind of
> resource editor. Mark Hammond planned it this way!

I don't understand this. Is there documentation?

Jim Ahlstrom
Re: Re: Path hacking [ In reply to ]
[me]
> > Why would you want your own exceptions.py and site.py?
[JimA]
> I don't. I never change Python library files. I am worried
> that they won't be found because I don't trust PYTHONPATH.

Hmm... PYTHONPATH gets inserted in front of the default sys.path.
(Many moons ago that was different. But it has been like this for a
loooooong time.) So are you worried that someone put a *different*
exceptions.py or site.py on their path?
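
The ordering described in this thread (script directory first, then
PYTHONPATH, then the built-in defaults) can be sketched as follows.
This is only an illustration; build_search_path() and the example
directories are hypothetical and are not how the interpreter itself is
implemented.

```python
import os


def build_search_path(script_dir, pythonpath, default_path):
    """Model the search order discussed in the thread.

    script_dir   -- directory of the main script (becomes sys.path[0])
    pythonpath   -- raw value of the PYTHONPATH variable, or "" if unset
    default_path -- list of built-in default directories
    """
    entries = [script_dir]
    if pythonpath:
        # PYTHONPATH entries go in front of the defaults.
        entries.extend(p for p in pythonpath.split(os.pathsep) if p)
    entries.extend(default_path)
    return entries


if __name__ == "__main__":
    for entry in build_search_path(
        "/opt/spam/bin",
        os.environ.get("PYTHONPATH", ""),
        ["/usr/local/lib/python"],
    ):
        print(entry)
```

Anything placed on PYTHONPATH therefore shadows the standard library
copy of the same module, which is the failure mode being asked about.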

> > Again - why would anyone register their own site.py?
>
> I wouldn't, I am worried that someone else will break my installation.
> Remember that site.py was invented as a site-specific module, although
> that function moved to sitecustomize.py.

Hm, I dug out the oldest site.py I have (used in Python 1.4), and it
doesn't encourage editing it at all -- it tells you to use
sitecustomize.py. I guess they could break your installation anyway,
but only by messing with the general Python installation.

> > Sounds right. All tricks to make the app unique require using a
> > different registry key, which requires a change to the DLL. However,
> > you can do this without recompiling! The version string used is
> > embedded in a resource, so you can patch it using some kind of
> > resource editor. Mark Hammond planned it this way!
>
> I don't understand this. Is there documentation?

The usual :-)

Python/import.c shows that import calls PyWin_FindRegisteredModule()
to find a registered module before looking in sys.path (but after
checking for builtin and frozen modules).
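
As a rough model of that lookup order (not the actual C code), the
sketch below uses plain sets and dicts to stand in for the builtin
table, the frozen table, the registry and sys.path. All names are
hypothetical, and the relative order of the builtin and frozen checks
is an assumption.

```python
def locate_module(name, builtin, frozen, registered, on_sys_path):
    """Return (source, location) for name, or None if it is not found.

    builtin, frozen -- sets of module names
    registered      -- dict mapping module name to a registered file
    on_sys_path     -- dict mapping module name to a file found on sys.path
    """
    if name in builtin:
        return ("builtin", None)
    if name in frozen:
        return ("frozen", None)
    if name in registered:
        # A registered module wins over anything on sys.path.
        return ("registry", registered[name])
    if name in on_sys_path:
        return ("sys.path", on_sys_path[name])
    return None
```

With this ordering a registry entry for site overrides the site.py in
the script directory, which is exactly the case JimA is worried about.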

PC/import_nt.c shows that PyWin_FindRegisteredModule() uses a registry
key of the form
"Software\Python\PythonCore\<PyWin_DLLVersionString>\Modules\<modulename><debugstring>"
where <modulename> is the module name, <debugstring> is empty or
"\Debug" depending on whether we are compiled with _DEBUG defined. The
registry value points to a file (either .py, .pyc/.pyo, .pyd or .dll;
in fact any of the suffixes returned by imp.get_suffixes()).

PC/dl_nt.c shows that PyWin_DLLVersionString is set to string 1000
loaded from the string resource table.

PC/python_nt.rc shows that there's a stringtable with item 1000 being
the MS_DLL_ID string, set to "1.5" in that file.

Note that this value (PyWin_DLLVersionString) is also available to
Python code as sys.winver.
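
To see why patching the version string gives an application a private
set of registered modules, here is a small sketch that builds the key
in the form quoted above. The function name registered_module_key()
and the "1.5-Spam" string are hypothetical; only the key layout comes
from PC/import_nt.c as described.

```python
def registered_module_key(dll_version, module_name, debug=False):
    """Build the registry key consulted for a registered module."""
    debug_suffix = "\\Debug" if debug else ""
    return "Software\\Python\\PythonCore\\%s\\Modules\\%s%s" % (
        dll_version,
        module_name,
        debug_suffix,
    )


if __name__ == "__main__":
    # The stock DLL and a DLL with a patched version string look in
    # different places, so neither can see the other's registrations.
    print(registered_module_key("1.5", "site"))
    print(registered_module_key("1.5-Spam", "site"))
```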

I hope that Mark Hammond can point you to a tool that you can use to
edit a string resource in an executable or DLL.

--Guido van Rossum (home page: http://www.python.org/~guido/)
Re: Re: Path hacking [ In reply to ]
Guido van Rossum wrote:
>
> [me]
> > > Why would you want your own exceptions.py and site.py?
> [JimA]
> > I don't. I never change Python library files. I am worried
> > that they won't be found because I don't trust PYTHONPATH.
>
> Hmm... PYTHONPATH gets inserted in front of the default sys.path.
> (Many moons ago that was different. But it has been like this for a
> loooooong time.) So are you worried that someone put a *different*
> exceptions.py or site.py on their path?

When you jam every file into a module archive, you still have to leave
these two "outside" so that Python can find them when starting up. The
problem then breaks down into two parts:

1) locating them
2) ensuring they're the right versions

In my "small" distro, I relied on the current-dir always being in the
path, so I could always find these. The two files were distributed as
part of distro.

Anyhow: JimA is saying that he doesn't trust PYTHONPATH. Not so much bad
files/versions, but that he won't be able to find them because
PYTHONPATH has been monkeyed with in some way.

> > > Again - why would anyone register their own site.py?
> >
> > I wouldn't, I am worried that someone else will break my installation.
> > Remember that site.py was invented as a site-specific module, although
> > that function moved to sitecustomize.py.
>
> Hm, I dug out the oldest site.py I have (used in Python 1.4), and it
> doesn't encourage editing it at all -- it tells you to use
> sitecustomize.py. I guess they could break your installation anyway,
> but only by messing with the general Python installation.

If the file exists, then somebody can mess with it. JimA is trying to
create a bulletproof app here. To do this, you can distribute a
python.exe, exceptions.py, site.py, and an archive of your other
modules. site.py is scrapped in favor of installing an Importer to
access the archive (this implies you also distribute imputil.py). These
five files are the exact five in my "small" distro. It's pretty cool...
no need for registry changes and a very small "file count" footprint.
Gordon took this basis and added a bunch of features for bundling an
application in there. JimA has also been mixing in some frozen modules
(I forget exactly why/what).
[oh, my small distro doesn't ship a python15.dll, although it easily
could]
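
The idea of installing an Importer that serves modules from an archive
can be sketched in modern Python. Note that this uses importlib's
meta-path machinery rather than imputil, and that the "archive" is
just an in-memory dict of source strings; the class name
DictArchiveImporter and the module name spam_settings are hypothetical.

```python
import importlib.abc
import importlib.util
import sys


class DictArchiveImporter(importlib.abc.MetaPathFinder, importlib.abc.Loader):
    """Serve modules from a dict mapping module name to source text."""

    def __init__(self, sources):
        self.sources = sources

    def find_spec(self, fullname, path=None, target=None):
        # Claim only the modules that are stored in the archive.
        if fullname in self.sources:
            return importlib.util.spec_from_loader(fullname, self)
        return None

    def create_module(self, spec):
        # Returning None asks the import system for a default module object.
        return None

    def exec_module(self, module):
        name = module.__name__
        code = compile(self.sources[name], "<archive:%s>" % name, "exec")
        exec(code, module.__dict__)


# Put the importer ahead of the normal path-based finders.
sys.meta_path.insert(0, DictArchiveImporter({"spam_settings": "ROOT = 'spam'\n"}))

import spam_settings

print(spam_settings.ROOT)
```

Because the importer sits in front of the path-based finders, modules
in the archive are found without consulting sys.path or PYTHONPATH at
all, which is the property the "small" distro is after.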

>...
> I hope that Mark Hammond can point you to a tool that you can use to
> edit a string resource in an executable or DLL.

The win32api module has resource manipulation functions such as
BeginUpdateResource, UpdateResource, and EndUpdateResource.

Write a Python script to modify your version string :-)

A demo of resource munging can be seen in
<win32 source>/win32/scripts/VersionStamp/verstamp.py.

Cheers,
-g

--
Greg Stein, http://www.lyra.org/
Re: Re: Path hacking [ In reply to ]
Guido van Rossum wrote:

> Hmm... PYTHONPATH gets inserted in front of the default sys.path.
> (Many moons ago that was different. But it has been like this for a
> loooooong time.) So are you worried that someone put a *different*
> exceptions.py or site.py on their path?

Yes, and/or (2) added a sitecustomize.py with their special
import hook as has been proposed here over and
over, or (3) PYTHONPATH is screwed up and doesn't find anything.

Perhaps this is a prejudice of mine. I just look at "print sys.path"
and marvel at what I don't understand. At least I can see it is
not simple. I hate relying on things that are not simple.

And I hate custom import hooks. Unless they are mine of course ;-)

> Hm, I dug out the oldest site.py I have (used in Python 1.4), and it
> doesn't encourage editing it at all -- it tells you to use
> sitecustomize.py. I guess they could break your installation anyway,
> but only by messing with the general Python installation.

Just adding sitecustomize.py would do it. And this is encouraged.

I think Gordon put his finger on the issue. Either try to co-exist
with other installed Python software and take the risk that everyone
is playing by the rules, or build your own black-box
self-contained Python world and duplicate storage.

> The usual :-)
> [Documentation goes here...]

Thanks. This is very useful.

But it doesn't help, perhaps as a result of more of my prejudices.
This registry entry is meant to be used by a by-the-rules shared
Python installation, so I must not change it. And I hate changing
any registry entries at all. My commercial software keeps all its
settings in a regular .ini file in its install directory, and makes
only minimal and required registry entries. IMHO the Windows registry
is a software catastrophe which ranks right up there with JCL (anyone
else here old enough to remember that?). Anyone who doesn't agree
should go with me to our money center banking clients, and sit there
while they grill you on every registry entry and why it is required.
Money center banking clients do not like their registry messed with.

I do however see your point that I could change the version string
to something non-standard and use the registry to control imports.
I will think about this further. Maybe it would work.

My current "solution" is to use freeze to create a black-box install,
and worry about second Python installations and wasted storage when it
happens. I was hoping that this thread would result in a consensus on
what to do, but it has not.

So now I am hoping that Python library (jar) files will turn out to
be a practical solution, so I am pestering Greg and Gordon. We'll see.

Jim Ahlstrom
Re: Re: Path hacking [ In reply to ]
[Greg, replying to Guido's confusion on Jim's interest in
site.py and exceptions.py]

> When you jam every file into a module archive, you still have to
> leave these two "outside" so that Python can find them when
> starting up.

In the soon-to-be-published beta version of my installer, I've
got that down to exceptions.py.

Background: my (Win32) installer has, as a stated goal, the
ability to create quasi-frozen Python apps which won't interfere
with (or be influenced by) existing Python installations (if any).
And it doesn't require the programmer/user to have a compiler.

Thanks to patches given me by Thomas Heller, my
python.exe replacement is now a (minimal) embedding app,
and I do the same things that Greg does in site.py directly
from C code, (and turn off the SiteFlag, too).

I believe that if I freeze in exceptions.py and tweak the
resource in python15.dll (which is just the stock python15.dll),
I can have a completely safe executable.

I think the same techniques can be applied on *nix, (although
I'm pretty sure programmer/users won't be able to get away
without a compiler).

This gives me a strong interest in import hooks for two distinct
reasons:
- I rely completely on Greg's imputil to make this work.
- I rely on freeze's modulefinder to help build these things.

Bizarre import hooks in a normal Python installation will fool
modulefinder. Take a look at what Pmw does (fortunately,
Pmw comes with its own packager). So I'd like to see import
hooks follow some sort of pattern that can be followed by a
tool like modulefinder.

I also want the hooks. Right now I use imputil with archives,
but wouldn't it be cool if you could add another imputil importer
that checks for more recent versions at some home site on
the web and automatically updates the installation?

Summary: I'm very interested in seeing import and import
hooks get rationalized, and I think Greg's stuff goes a long,
long way towards that goal.

- Gordon
Re: Re: Path hacking [ In reply to ]
James C. Ahlstrom wrote:

[Guido explains Windows registry usage]
> Thanks. This is very useful.
>
> But it doesn't help, perhaps as a result of more of my
> prejudices. This registry entry is meant to be used by a
> by-the-rules shared Python installation, so I must not change it.

Without trying it, I doubt you have to. It looks like you could
set the resource to something that won't be found in the
registry, and then just use normal Python mechanisms.

> ... IMHO the Windows registry is a software catastrophe
> which ranks right up there with JCL (anyone else here old enough
> to remember that?).

Ah. In a rush to get it down to ops, tripping at the head of the
stairs and sending the card deck flying... Those were the days.



- Gordon
Re: Re: Path hacking [ In reply to ]
Skip Montanaro wrote:
>
> Gordon> In the soon-to-be-published beta version of my installer, I've
> Gordon> got that down to exceptions.py.
>
> Why not just run exceptions.py through Python2C, visually and experimentally
> verify that it works, then ship an exceptions.c as an optional module?
> People wanting to ship self-contained packages could then toss exceptions.py
> and build the C version of the exceptions module.
>
> Greg, is there anything in exceptions.py Python2C couldn't handle?

Nah, shouldn't have any problem at all. P2C will even create true class
objects and expose them in the interface.

I think a person might want to consider hand-tuning the output, though
:-)

Cheers,
-g

--
Greg Stein, http://www.lyra.org/
Re: Re: Path hacking [ In reply to ]
Gordon> In the soon-to-be-published beta version of my installer, I've
Gordon> got that down to exceptions.py.

Why not just run exceptions.py through Python2C, visually and experimentally
verify that it works, then ship an exceptions.c as an optional module?
People wanting to ship self-contained packages could then toss exceptions.py
and build the C version of the exceptions module.

Greg, is there anything in exceptions.py Python2C couldn't handle?

Skip Montanaro | http://www.mojam.com/
skip@mojam.com | http://www.musi-cal.com/~skip/
847-971-7098 | Python: Programming the way Guido indented...
Re: Re: Path hacking [ In reply to ]
Skip Montanaro wrote:
> Why not just run exceptions.py through Python2C,

What is Python2C? Is it the same as freeze?

Jim Ahlstrom
Re: Re: Path hacking [ In reply to ]
> Why not just run exceptions.py through Python2C, visually and experimentally
> verify that it works, then ship an exceptions.c as an optional module?
> People wanting to ship self-contained packages could then toss exceptions.py
> and build the C version of the exceptions module.

Alternatively (and probably easier) it (and site.py) could be stored
as frozen modules. All it takes is some edits to Python/frozen.c.

--Guido van Rossum (home page: http://www.python.org/~guido/)
Re: Re: Path hacking [ In reply to ]
Jim> What is Python2C. Is it the same as freeze?

Nope. Python in, compilable C out:

http://www.mudlib.org/~rassilon/p2c/

Courtesy of Greg Stein and Bill Tutt.

Skip Montanaro | http://www.mojam.com/
skip@mojam.com | http://www.musi-cal.com/~skip/
847-971-7098 | Python: Programming the way Guido indented...
