Mailing List Archive

Changing existing class instances
Currently, when you replace a class definition with an updated
version, it's really difficult to change existing class instances;
you'd have to essentially sweep every Python object and check if it's
an instance, starting at roots such as __main__ and sys.modules. This
makes developing code in a long-running process difficult, Zope being
the best example of this. When you modify a class definition used by
Zope code, you can't update existing instances floating around in
memory.
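
For illustration, the sweep described above can be approximated in present-day CPython 3 with the gc module, which did not exist when this message was written. This is only a sketch under those assumptions: retarget_instances is a hypothetical helper, it sees only objects the collector tracks, and assigning __class__ works only when the two classes have compatible layouts.

```python
import gc


def retarget_instances(old_cls, new_cls):
    """Point every tracked direct instance of old_cls at new_cls.

    Hypothetical helper: it walks every object the garbage collector
    tracks, so the cost grows with the size of the whole heap.
    """
    count = 0
    for obj in gc.get_objects():
        if type(obj) is old_cls:
            obj.__class__ = new_cls
            count += 1
    return count


class Greeter:
    def greet(self):
        return "old"


instances = [Greeter() for _ in range(3)]
OldGreeter = Greeter


class Greeter:  # the "updated" definition replaces the name, not the instances
    def greet(self):
        return "new"


changed = retarget_instances(OldGreeter, Greeter)
greetings = [g.greet() for g in instances]
```

The full heap scan is exactly the cost the message is trying to avoid; the sketch only makes that cost concrete.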

Over dinner, a friend and I were discussing this, and we thought it
probably isn't difficult to add an extra level of indirection to allow
fixing this. The only other options we could think of were a
complete scan of all objects, or inserting a forwarding pointer into
PyClassObjects that points to the replacing class if non-NULL, and then
chasing pointers when accessing PyInstanceObject->in_class.

A quick hack to implement the extra indirection took about
half an hour. It does these things:

* Defines a PyClassHandle type:
    struct _PyClassHandle {
        PyClassHandle *next;   /* ptr to next PyClassHandle in linked list */
        PyClassObject *klass;  /* The class object */
    };

* The in_class attribute of PyInstanceObject becomes a
PyClassHandle* instead of a PyClassObject*, and all code
such as inst->in_class becomes inst->in_class->klass.

* As a quick hack to allow changing the class object referenced
by a handle, I added a .forward( <newclassobject> ) method to
class objects. This basically does self.handle->klass =
<newclassobject>.

The end result is that obj.__class__.forward(newclass) changes obj to
be an instance of newclass, and all other instances of obj.__class__
also mutate to become newclass instances.
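
The effect of the handle can be modelled in pure Python. The following is a toy sketch, not the C patch itself: ClassHandle and Instance are hypothetical names standing in for PyClassHandle and PyInstanceObject, and method lookup is reduced to a single getattr through the handle.

```python
class ClassHandle:
    """Hypothetical stand-in for the PyClassHandle struct."""

    def __init__(self, klass):
        self.klass = klass

    def forward(self, new_klass):
        # Equivalent in spirit to: self.handle->klass = <newclassobject>
        self.klass = new_klass


class Instance:
    """Toy instance that reaches its class only through a handle."""

    def __init__(self, handle):
        self.in_class = handle

    def call(self, name, *args):
        # The extra dereference: inst->in_class->klass
        method = getattr(self.in_class.klass, name)
        return method(self, *args)


class OldFTP:
    def greet(self):
        return "old"


class NewFTP:
    def greet(self):
        return "new"


handle = ClassHandle(OldFTP)
a = Instance(handle)
b = Instance(handle)

before = (a.call("greet"), b.call("greet"))
handle.forward(NewFTP)          # one assignment retargets every instance
after = (a.call("greet"), b.call("greet"))
```

Because all instances share one handle, a single assignment changes the class seen by every one of them, at the price of one more pointer dereference per lookup.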

Making this purely automatic seems hard; you'd have to catch things
like 'import ftplib; ftplib.FTP = myclass', which would require
automatically calling ftplib.FTP.forward( myclass ) to make all
existing FTP instances mutate. Would it be worthwhile to export some
hook for doing this in 1.6? The cost is adding an extra pointer deref
to all access to PyInstanceObject->in_class.

(This could probably also be added to ExtensionClass, and probably
doesn't need to be added to core Python to help out Zope. Just a
thought...)

--
A.M. Kuchling http://starship.python.net/crew/amk/
Here the skull of a consumptive child becomes part of a great machine for
calculating the motions of the stars. Here, a yellow bird frets within the
ribcage of an unjust man.
-- Welcome to Orqwith, in DOOM PATROL #22
Re: Changing existing class instances
> Currently, when you replace a class definition with an updated
> version, it's really difficult to change existing class instances;
> you'd have to essentially sweep every Python object and check if it's
> an instance, starting at roots such as __main__ and sys.modules. This
> makes developing code in a long-running process difficult, Zope being
> the best example of this. When you modify a class definition used by
> Zope code, you can't update existing instances floating around in
> memory.

There might be another solution. When you reload a module, the module
object and its dictionary are reused.

Perhaps class and function objects could similarly be reused? It
would mean that a class or def statement looks for an existing object
with the same name and type, and overwrites that. Voila, all
references are automatically updated.

This is more work (e.g. for classes, a new bytecode may have to be
invented because the class creation process must be done differently)
but it's much less of a hack, and I think it would be more reliable.
(Even though it alters borderline semantics a bit.)

(Your extra indirection also slows things down, although I don't know
by how much -- not just the extra memory reference but also less
locality of reference so more cache misses.)

--Guido van Rossum (home page: http://www.python.org/~guido/)
RE: Changing existing class instances
[Guido, on Andrew's idea for automagically updating
classes]

> There might be another solution. When you reload a module,
> the module object and its dictionary are reused.
>
> Perhaps class and function objects could similarly be
> reused? It would mean that a class or def statement
> looks for an existing object with the same name and type,
> and overwrites that. Voila, all references are
> automatically updated.

Too dangerous, I think. While uncommon in general, I've certainly seen
(even written) functions that e.g. return a contained def or class. The
intent in such cases is very much to create distinct defs or classes
(despite having the same names). In this case I assume "the same name"
wouldn't *usually* be found, since the "contained def or class"'s name is
local to the containing function. But if there ever happened to be a
module-level function or class of the same name, brrrr.
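
A small example of the pattern in question, written for modern Python 3: a function that returns a contained class. make_counter_class is a made-up name for illustration. Each call is meant to produce a distinct class object even though every one of them is named Counter.

```python
def make_counter_class(step):
    """Return a brand-new class on every call; all are named 'Counter'."""

    class Counter:
        def __init__(self):
            self.value = 0

        def bump(self):
            self.value += step
            return self.value

    return Counter


ByOne = make_counter_class(1)
ByTen = make_counter_class(10)

same_name = ByOne.__name__ == ByTen.__name__
distinct = ByOne is not ByTen
bumps = (ByOne().bump(), ByTen().bump())
```

If a class statement silently overwrote an existing class of the same name, ByOne and ByTen could end up sharing behaviour, which is the danger described above.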

Modules differ because their namespace "search path" consists solely of the
more-global-than-global <wink> sys.modules.

> This is more work (e.g. for classes, a new bytecode may
> have to be invented because the class creation process
> must be done differently) but it's much less of a hack,
> and I think it would be more reliable. (Even though it
> alters borderline semantics a bit.)

How about an explicit function in the "new" module,

    new.update(class_or_def_old, class_or_def_new)

which overwrites old's guts with new's guts (in analogy with dict.update)?
Then no semantics change and you don't need new bytecodes. In return, a
user who wants to e.g. replace an existing class C would need to do

    oldC = C
    do whatever they do to get the new C
    new.update(oldC, C)

Building on that, a short Python loop could do the magic for every class and
function in a module; and building on *that*, a short "updating import"
function could be written in Python. View it as providing mechanism instead
of policy <0.9 wink>.
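
A rough sketch of the class half of such an update function, assuming modern Python 3 where class attributes can be set and deleted at run time. update_class is a hypothetical name (new.update was only a proposal), and the skip list reflects an assumption about which slots belong to the old class object itself.

```python
# Slots that belong to the old class object itself and are not copied.
_SKIP = {"__dict__", "__weakref__"}


def update_class(old, new):
    """Overwrite old's attributes with new's, keeping old's identity.

    Hypothetical stand-in for the proposed new.update(), classes only.
    """
    for name in list(vars(old)):
        if name not in _SKIP and name not in vars(new):
            delattr(old, name)          # drop attributes the new version lost
    for name, value in vars(new).items():
        if name not in _SKIP:
            setattr(old, name, value)   # install the new version's attributes


class C:
    def hello(self):
        return "v1"

    def legacy(self):
        return "going away"


obj = C()
oldC = C


class C:  # "whatever they do to get the new C"
    def hello(self):
        return "v2"


update_class(oldC, C)
C = oldC  # keep using the original, now updated, class object
```

After the call, obj is still an instance of the original class object, but it behaves like the new definition. Rebinding C to oldC afterwards is a design choice made here so that later instances share the same class object as the existing ones.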

> (Your extra indirection also slows things down, although
> I don't know by how much -- not just the extra memory
> reference but also less locality of reference so more
> cache misses.)

Across the universe of all Python programs on all platforms, weighted by
importance, it was a slowdown of nearly 4.317%.

if-i-had-used-only-one-digit-everyone-would-have-
known-i-was-making-it-up<wink>-ly y'rs - tim
RE: Changing existing class instances
Oh man, oh man... I think this is where I get to say something akin to "I
told you so."

:-)

I already described Tim's proposal in my type proposal paper, as a way to
deal with incomplete classes. Essentially, a class object is created
"empty" and is later "updated" with the correct bits. The empty class
allows two classes to refer to each other in the "recursive type"
scenario.

In other words, I definitely would support a new class object behavior
that allows us to update a class' set of bases and dictionary on the fly.
This could then be used to support my solution for the recursive type
scenario (which, in turn, means that we don't have to introduce Yet
Another Namespace into Python to hold type names).
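
For what it is worth, present-day CPython 3 allows both kinds of update on ordinary classes, subject to layout compatibility between the old and new bases. The sketch below uses made-up class names and is not the mechanism from the type proposal; it only shows a class being created with provisional contents and corrected afterwards.

```python
class OldBase:
    def who(self):
        return "old base"


class NewBase:
    def who(self):
        return "new base"


class Widget(OldBase):  # created with provisional bases and an empty body
    pass


w = Widget()
before = w.who()

# Later, "update" the class in place: swap its bases and add to its dictionary.
Widget.__bases__ = (NewBase,)
Widget.size = 3

after = (w.who(), w.size)
```

The existing instance w picks up both changes because attribute lookup goes through the class object each time.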

Note: I would agree with Guido, however, on the "look for a class object
with the same name", but with the restriction that the name is only
replaced in the *target* namespace. i.e. a "class Foo" in a function will
only look for Foo in the function's local namespace; it would not
overwrite a class in the global space, nor would it overwrite class
objects returned by a prior invocation of the function.

Cheers,
-g

--
Greg Stein, http://www.lyra.org/
Re: Changing existing class instances
"A.M. Kuchling" wrote:
>
> Currently, when you replace a class definition with an updated
> version, it's really difficult to change existing class instances;
> you'd have to essentially sweep every Python object and check if it's
> an instance, starting at roots such as __main__ and sys.modules. This
> makes developing code in a long-running process difficult, Zope being
> the best example of this. When you modify a class definition used by
> Zope code, you can't update existing instances floating around in
> memory.

In the case of Zope, if the objects that you care about happen to be
persistent objects, then it's relatively easy to arrange to get the
objects flushed from memory and reloaded with the new classes.
(There are some subtle issues to deal with, like worrying about
multiple threads, but in a development environment, you can deal with
these, for example, by limiting the server to one thread.)

Note that this is really only a special case of a much larger problem.

Reloading a module redefines the global variables in a module.
It doesn't update any references to those global references
from other places, such as instances or *other* modules.

For example, imports like:

    from foo import spam

are not updated when foo is reloaded.
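
The behaviour can be reproduced without touching the file system. In this sketch, re-executing new source in the same module dictionary stands in for reload(foo); that substitution is an assumption made to keep the example self-contained, and the module foo here is built by hand rather than imported.

```python
import types

foo = types.ModuleType("foo")
source_v1 = "def spam():\n    return 'old'\n"
source_v2 = "def spam():\n    return 'new'\n"

exec(source_v1, foo.__dict__)   # initial import of foo
spam = foo.spam                 # what "from foo import spam" would bind

exec(source_v2, foo.__dict__)   # stand-in for reload: same module, same dict

via_module = foo.spam()         # looked up in the module at call time
via_old_name = spam()           # still bound to the original function object
```

The module object and its dictionary are reused, so foo.spam sees the new function, while the separately bound name keeps pointing at the old one.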

Maybe you are expecting too much from reload.

Jim

--
Jim Fulton mailto:jim@digicool.com
Technical Director (888) 344-4332 Python Powered!
Digital Creations http://www.digicool.com http://www.python.org

Re: Changing existing class instances
> From: "Tim Peters" <tim_one@email.msn.com>
>
> [Guido, on Andrew's idea for automagically updating
> classes]
>
> > There might be another solution. When you reload a module,
> > the module object and its dictionary are reused.
> >
> > Perhaps class and function objects could similarly be
> > reused? It would mean that a class or def statement
> > looks for an existing object with the same name and type,
> > and overwrites that. Voila, all references are
> > automatically updated.
>
> Too dangerous, I think. While uncommon in general, I've certainly seen
> (even written) functions that e.g. return a contained def or class. The
> intent in such cases is very much to create distinct defs or classes
> (despite having the same names). In this case I assume "the same name"
> wouldn't *usually* be found, since the "contained def or class"'s name is
> local to the containing function. But if there ever happened to be a
> module-level function or class of the same name, brrrr.

Agreed that that would be bad. But I wouldn't search outer scopes --
I would only look for a class/def that I was about to stomp on.

> Modules differ because their namespace "search path" consists solely of the
> more-global-than-global <wink> sys.modules.

"The search path doesn't enter into it."

> > This is more work (e.g. for classes, a new bytecode may
> > have to be invented because the class creation process
> > must be done differently) but it's much less of a hack,
> > and I think it would be more reliable. (Even though it
> > alters borderline semantics a bit.)
>
> How about an explicit function in the "new" module,
>
>     new.update(class_or_def_old, class_or_def_new)
>
> which overwrites old's guts with new's guts (in analogy with dict.update)?
> Then no semantics change and you don't need new bytecodes.

Only a slight semantics change (which my full proposal would require
too): function objects would become mutable -- their func_code,
func_defaults, func_doc and func_globals fields (and, why not,
func_name too) should be changeable. If you make all these
assignable, it doesn't even have to be a privileged function.

> In return, a
> user who wants to e.g. replace an existing class C would need to do
>
>     oldC = C
>     do whatever they do to get the new C
>     new.update(oldC, C)
>
> Building on that, a short Python loop could do the magic for every class and
> function in a module; and building on *that*, a short "updating import"
> function could be written in Python. View it as providing mechanism instead
> of policy <0.9 wink>.

That's certainly a reasonable compromise. Note that the update on a
class should imply an update on its methods, right?

--Guido van Rossum (home page: http://www.python.org/~guido/)
RE: Changing existing class instances
[Tim worries about stomping on unintended classes/defs]

[Guido]
> Agreed that that would be bad. But I wouldn't search outer
> scopes -- I would only look for a class/def that I was about
> to stomp on.

Maybe I just don't grasp what that means, exactly. Fair enough, since I'm
not expressing myself clearly either!

Suppose someone does

    from Tkinter import *

in my.py, and later in my.py just *happens* to define, at module level,

    class Misc:
        blah blah blah

Now Misc was already in my.py's global namespace because Tkinter.py just
happens to export a class of that name too (more by accident than design --
but accidents are what I'm most worried about here).

At the time my.py defines Misc, does Misc count as a class we're "about to
stomp on"? If so-- & I've assumed so --it would wreak havoc.

But if not, I don't see how this case can be reliably distinguished "by
magic" from the cases where update is desired (if people are doing dynamic
updates to a long-running program, a new version of a class can come from
anywhere, so nothing like original file name or line number can distinguish
correctly either).

>> Modules differ because their namespace "search path"
>> consists solely of the more-global-than-global <wink>
>> sys.modules.

> "The search path doesn't enter into it."

I agree, but am at a loss to describe what's happening in the case above
using other terminology <wink>. In a sense, you need a system-wide "unique
handle" to support bulletproof updating, and while sys.modules has supplied
that all along for module objects (in the form of the module name), I don't
believe there's anything analogous to key off of for function or class
objects.

>> [suggesting]
>> new.update(class_or_def_old, class_or_def_new)

> Only a slight semantics change (which my full proposal
> would require too): function objects would become mutable
> -- their func_code, func_defaults, func_doc and func_globals
> fields (and, why not, func_name too) should be changeable.

Of course I meant "no new semantics" in the sense of "won't cause current
exception-free code to alter behavior in any way".

> If you make all these assignable, it doesn't even have to
> be a privileged function.

I'm all for that!

> [...sketching a Python approach to "updating import/reload"
> building on the hypothetical new.update]

> That's certainly a reasonable compromise. Note that the
> update on a class should imply an update on its methods,
> right?

Hadn't considered that! Of course you're right. So make it a pair of
nested loops <wink>.
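
Why the methods matter can be shown with a bound method that was saved before the update. This is a sketch in modern Python 3 with made-up names: replacing the entry in the class namespace is not enough for references that already hold the old function, whereas updating the function object in place reaches them too.

```python
class C:
    def hello(self):
        return "v1"


obj = C()
saved = obj.hello               # bound method; keeps the old function alive


def hello_v2(self):
    return "v2"


# Step 1: overwrite only the class namespace.
C.hello = hello_v2
namespace_only = (obj.hello(), saved())

# Step 2: also update the old function object in place.
saved.__func__.__code__ = hello_v2.__code__
in_place = saved()
```

So the inner loop over a class's methods would need to update each old function object, not merely rebind the attribute on the class.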

so-long-as-it-can-be-written-in-python-it's-easy-ly
y'rs - tim
Re: Changing existing class instances
> [Tim worries about stomping on unintended classes/defs]
>
> [Guido]
> > Agreed that that would be bad. But I wouldn't search outer
> > scopes -- I would only look for a class/def that I was about
> > to stomp on.
>
> Maybe I just don't grasp what that means, exactly. Fair enough, since I'm
> not expressing myself clearly either!
>
> Suppose someone does
>
>     from Tkinter import *
>
> in my.py, and later in my.py just *happens* to define, at module level,
>
>     class Misc:
>         blah blah blah
>
> Now Misc was already in my.py's global namespace because Tkinter.py just
> happens to export a class of that name too (more by accident than design --
> but accidents are what I'm most worried about here).

For a second I thought you got me there!

> At the time my.py defines Misc, does Misc count as a class we're "about to
> stomp on"? If so-- & I've assumed so --it would wreak havoc.
>
> But if not, I don't see how this case can be reliably distinguished "by
> magic" from the cases where update is desired (if people are doing dynamic
> updates to a long-running program, a new version of a class can come from
> anywhere, so nothing like original file name or line number can distinguish
> correctly either).

Fortunately, there's magic available: recently, all classes have a
__module__ attribute that is set to the full name of the module that
defined it (its key in sys.modules).

For functions, we would have to invent something similar.
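
A sketch of how the __module__ check might separate the two cases, in modern Python 3. is_redefinition is a hypothetical helper, and the two Misc classes are built with type() purely to simulate a class imported from Tkinter versus one defined in my.py.

```python
def is_redefinition(namespace, name, defining_module):
    """True if namespace[name] is a class that defining_module itself created."""
    existing = namespace.get(name)
    return isinstance(existing, type) and existing.__module__ == defining_module


# Stand-in for "from Tkinter import *" leaving a foreign Misc in my.py.
ForeignMisc = type("Misc", (), {"__module__": "Tkinter"})
my_namespace = {"__name__": "my", "Misc": ForeignMisc}
foreign_case = is_redefinition(my_namespace, "Misc", "my")

# Stand-in for my.py having defined its own Misc earlier.
OwnMisc = type("Misc", (), {"__module__": "my"})
my_namespace["Misc"] = OwnMisc
own_case = is_redefinition(my_namespace, "Misc", "my")
```

Only the second case would count as a class that is about to be stomped on, so the imported Tkinter class would be left alone.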

--Guido van Rossum (home page: http://www.python.org/~guido/)
RE: Changing existing class instances
[Greg Stein]
> ...
> In other words, I definitely would support a new class
> object behavior that allows us to update a class' set of
> bases and dictionary on the fly. This could then be used
> to support my solution for the recursive type scenario (which,
> in turn, means that we don't have to introduce Yet Another
> Namespace into Python to hold type names).

Parenthetically, I never grasped the appeal of the parenthetical comment.
Yet Another Namespace for Yet Another Entirely New Purpose seems highly
*desirable* to me! Trying to overload the current namespace set makes it so
much harder to see that these are compile-time gimmicks, and users need to
be acutely aware of that if they're to use it effectively. Note that I
understand (& wholly agree with) the need for runtime introspection.

different-things-different-rules-ly y'rs - tim
RE: Changing existing class instances
[Tim, still worried about stomping on unintended classes/defs]

[example abusing Tkinter.Misc]

> For a second I thought you got me there!

That's twice as long as I thought you'd think that, so I win after all
<wink>.

> Fortunately, there's magic available: recently, all classes
> have a __module__ attribute that is set to the full name
> of the module that defined it (its key in sys.modules).
>
> For functions, we would have to invent something similar.

OK! I didn't know about class.__module__ -- I hope you realize that relying
on your time machine is making you lazy <wink>.

I remain uncomfortable with automagic updating, but not as much so. Both
kinds of errors still seem possible to me:

1. Automagically updating when it wasn't wanted.

Examples of this are getting harder to come by <wink>. Off the top of my
head I'm reduced to stuff like this:

>>> adders = []
>>> for i in range(10):
        def adder(y, x=i):
            return y+x
        adders.append(adder)


>>> adders[2](40)
42
>>> adders[9](33)
42
>>>

"That kind of thing" has got to be rare, but can't be non-existent either
(well, isn't -- I've done it).

2. Failing to automagically update when it was wanted.

Implicit in the discussion so far is that long-running systems want to
update code at a granularity no finer than module level. Is that realistic?
I'm unsure. It's certainly easy to *imagine* the app running an updater
server thread, accepting new source for functions and classes, and offering
to compile and install the objects.

Under the explicit new.update scheme, such a service needn't bother clients
with communicating the full name of the original module; heck, in a *truly*
long-running app, over time the source tree will change, and classes and
functions will migrate across modules. That will be a problem for the
explicit scheme too (how does it know *which* "class Misc" to update) -- but
at least it's an explicit problem then, and not a "mysterious failure" of
hidden magic.


I could live with both of those (#1 is more worrisome); but think it easier
all around to give the users some tools and tell them to solve the problems
however they see fit.

or-maybe-we-already-agreed-about-that-ly y'rs - tim
Re: Changing existing class instances
On Thu, 20 Jan 2000, Guido van Rossum wrote:
> Tim Peters:
>...
> > At the time my.py defines Misc, does Misc count as a class we're "about to
> > stomp on"? If so-- & I've assumed so --it would wreak havoc.
> >
> > But if not, I don't see how this case can be reliably distinguished "by
> > magic" from the cases where update is desired (if people are doing dynamic
> > updates to a long-running program, a new version of a class can come from
> > anywhere, so nothing like original file name or line number can distinguish
> > correctly either).
>
> Fortunately, there's magic available: recently, all classes have a
> __module__ attribute that is set to the full name of the module that
> defined it (its key in sys.modules).
>
> For functions, we would have to invent something similar.

func.func_globals


__module__ and func_globals can prevent *other* modules from redefining
something accidentally, but it doesn't prevent Badness from within the
module.
[ Tim just posted an example of this: his "def adder()" example... ]
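
A sketch of the func_globals check, using the modern spelling __globals__. The two modules are built by hand with types.ModuleType and defined_in is a hypothetical helper; the point is only that a function's globals dictionary identifies the module that defined it, even after the function has been copied into another module.

```python
import types

mod_a = types.ModuleType("mod_a")
mod_b = types.ModuleType("mod_b")

exec("def helper():\n    return 1\n", mod_a.__dict__)   # defined in mod_a
mod_b.helper = mod_a.helper                             # merely imported into mod_b


def defined_in(func, module):
    """True if func was defined by code executed in module's namespace."""
    return func.__globals__ is module.__dict__


in_a = defined_in(mod_b.helper, mod_a)
in_b = defined_in(mod_b.helper, mod_b)
```

As noted above, this guards against other modules only. Repeated definitions of the same name inside one module all pass the check.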

Cheers,
-g

--
Greg Stein, http://www.lyra.org/
RE: Changing existing class instances
>>>>> "TP" == Tim Peters <tim_one@email.msn.com> writes:

TP> Under the explicit new.update scheme, such a service needn't
TP> bother clients with communicating the full name of the
TP> original module; heck, in a *truly* long-running app, over
TP> time the source tree will change, and classes and functions
TP> will migrate across modules. That will be a problem for the
TP> explicit scheme too (how does it know *which* "class Misc" to
TP> update) -- but at least it's an explicit problem then, and not
TP> a "mysterious failure" of hidden magic.

I completely agree. I think in general, such long running apps are
rare, and in those cases you probably want to be explicit about when
and how the updates occur anyway. The one place where automatic
updates would be convenient would be at the interactive prompt, so it
might be nice to add a module that could be imported by PYTHONSTARTUP,
and play hook games to enable automatic updates.

-Barry
RE: Changing existing class instances
[Barry A. Warsaw]
> I completely agree.

That's no fun <wink>.

> I think in general, such long running apps are rare,

By definition, they're non-existent under Windows <0.7 wink>. But it
depends on which field you're working in. The closer you get to being part
of a business or consumer service, the more important it gets; e.g., I've
seen serious RFQs for software systems guaranteed to suffer no more than 5
minutes of downtime per *year* (& stiff penalties for failure to meet that).
I've never been on the winning end of such an RFQ, so am not sure what it
takes to meet it.

It's interesting to ponder. Psion has published a little about the software
techniques they use in their PDAs (my Psion 3a's remarkably capable "Agenda"
app has been running non-stop for a bit over 3 years!).

> and in those cases you probably want to be explicit about when
> and how the updates occur anyway.

My guess is you'd want to be *paranoidly* explicit, leaving nothing to
chance.

> The one place where automatic updates would be convenient would
> be at the interactive prompt, so it might be nice to add a
> module that could be imported by PYTHONSTARTUP, and play hook
> games to enable automatic updates.

Returning the favor, I completely agree. The single thing people at work
gripe most about is how to do development under IDLE in such a way that
their package-laden systems exhibit the hoped-for changes in response to
editing a module deep in the bowels of the system. I don't have a *good*
answer to that now; reduced to stuff like writing custom scripts to
selectively clear out sys.modules.
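
One shape such a script might take, as a sketch in modern Python 3. purge_modules is a made-up helper and the demo_pkg names are fake entries planted in sys.modules for the demonstration; it removes a package and its submodules so that the next import re-executes them.

```python
import sys
import types


def purge_modules(prefix):
    """Remove prefix and every prefix.* submodule from sys.modules."""
    doomed = sorted(
        name for name in sys.modules
        if name == prefix or name.startswith(prefix + ".")
    )
    for name in doomed:
        del sys.modules[name]
    return doomed


# Plant fake modules so the demonstration does not depend on real packages.
for name in ("demo_pkg", "demo_pkg.core", "demo_pkg.core.util", "demo_pkg_other"):
    sys.modules[name] = types.ModuleType(name)

removed = purge_modules("demo_pkg")
survivor_present = "demo_pkg_other" in sys.modules

del sys.modules["demo_pkg_other"]   # clean up the remaining fake entry
```

This only forces fresh imports. Objects and names that already refer to the old modules keep doing so, which is the limitation discussed earlier in the thread.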

non-stop-ly y'rs - tim