Mailing List Archive

Python, Embedding Q's
Hi all,

I'm embedding Python in a multi-threaded server and have a
couple of questions regarding compiling files, caching, etc. Any
help or thoughts would be appreciated.

I wish to use Python's built-in threading of the interpreter,
as well as Python's memory management [for portability]. Will
this have any major adverse performance effects?
Does Python handle which thread does what?
Would it be better to create, say, x threads, each with its
own instance of the interpreter?

What is the PyDict format of the globals/locals returned by
PyModule_GetDict? What variables are declared in this module dict?
Is it necessary to add anything there?

Does PyCodeObject keep (or represent) the compiled code in memory?
Would it be worth maintaining a heap of PyCodeObjects (one for
each .pyx, or until the cache is full...) as opposed to recompiling
for each request (unless recompilation is desired, say for .pyd), e.g.
PyCodeObject **pcoCached = malloc( 500000 );
Will Python use this new heap for the code object pointers as well?
Will the code objects fill the heap, or will Python, when compiling
a code object, allocate memory elsewhere? Er, does sizeof(PyCodeObject)
include the Python bytecode?

If I were to maintain the PyCodeObjects, how then would I have
an instance of the interpreter run this code? Ah, obviously
PyEval_EvalCode, but what if Python is handling the sub-interpreters?
How would I say, ThatSubInterpreter->EvalCode( PyCodeObject )?

Is Py_NewInterpreter really a 'fresh copy' of the main interpreter
for use by a thread? If I add built-in modules with PyImport_AddModule,
do I need to do this for each Py_NewInterpreter? What about
Py_InitModule? Does this need to be called for each new interpreter as well?

In the examples, the telnet server seems to call Py_NewInterpreter
for each request. If I wished to limit the number of worker threads,
would I have to do the following when a request occurs: if threads >= max,
put the request in a queue, otherwise create a new thread with
Py_NewInterpreter, handle the request, check whether any new requests
are in the queue [and handle them too], then call Py_EndInterpreter?
What about reusing interpreters after they've been idle for a while?
(Say, a LIFO queue of interpreters... you know how it is... no action
for a while, then bang, everybody wants in on your server!!)

Cheers and thanks in advance,
bryn
Python, Embedding Q's [ In reply to ]
B Kingsford <bryn@mds01.itc.com.au> wrote:
: Hi all,

: I'm embedding Python in a multi-threaded server and have a
: couple of questions regarding compiling files, caching, etc. Any
: help or thoughts would be appreciated.

: I wish to use Python's built-in threading of the interpreter,
: as well as Python's memory management [for portability]. Will
: this have any major adverse performance effects?
: Does Python handle which thread does what?
: Would it be better to create, say, x threads, each with its
: own instance of the interpreter?

If you embedded Python in the threads, then yes, each thread could/
would have its own interpreter, but Python can handle multiple threads
itself. Python will handle the memory management for you; access to
Python is serialized, but non-Python code is free to be threaded.

Read up on the threading API:
http://www.python.org/doc/current/api/threads.html
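For illustration, here is a rough sketch of the pattern that page
describes, assuming one shared interpreter and that your server code
supplies the threads (the function names here are made up):

    #include <Python.h>

    static PyInterpreterState *main_interp;

    /* Main thread: initialize Python once, then give the lock up so
       the worker threads can take it when they need Python. */
    void init_python(void)
    {
        PyThreadState *tstate;

        Py_Initialize();
        PyEval_InitThreads();           /* creates and acquires the lock */
        tstate = PyEval_SaveThread();   /* release it; keep the state */
        main_interp = tstate->interp;
    }

    /* Each worker thread: make a thread state of its own and hold the
       lock only while actually calling into Python. */
    void run_in_worker(const char *script)
    {
        PyThreadState *tstate = PyThreadState_New(main_interp);

        PyEval_AcquireThread(tstate);   /* serialize access to Python */
        PyRun_SimpleString(script);     /* ... the Python work ... */
        PyThreadState_Clear(tstate);    /* lock must still be held here */
        PyEval_ReleaseThread(tstate);   /* let other threads in */
        PyThreadState_Delete(tstate);
    }

Because access to Python is serialized by that lock, heavy per-request
Python work still runs one thread at a time; the threading mainly pays
off when the C side does the slow parts (I/O and so on).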

: What is the PyDict format of the globals/locals returned by
: PyModule_GetDict? What variables are declared in this module dict?
: Is it necessary to add anything there?

The basic variables would be "__name__", "__builtins__" (bound to the
__builtin__ module), "__doc__" and possibly the "__file__" attribute.
Other variables you want to add at initialization should be set by
calling PyDict_SetItemString(moddict, "varname", object);

You probably want to access PyModule_GetDict() (which returns a
normal PyDictObject) only when you initialize the module, and use
PyObject_*AttrString() at other times.
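Something along these lines, for example; the module choice
("__main__") and the variable name are just placeholders:

    #include <Python.h>

    void setup_and_read_globals(void)
    {
        PyObject *mod, *moddict, *val;

        mod = PyImport_AddModule("__main__");   /* borrowed reference */
        moddict = PyModule_GetDict(mod);        /* borrowed reference */

        /* At initialization: put extra globals into the module dict. */
        val = PyInt_FromLong(42);
        PyDict_SetItemString(moddict, "max_workers", val);
        Py_DECREF(val);                 /* SetItemString doesn't steal it */

        /* Later on: prefer the attribute API over poking at the dict. */
        val = PyObject_GetAttrString(mod, "max_workers");  /* new reference */
        /* ... use val ... */
        Py_XDECREF(val);
    }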

: Does PyCodeObject keep (or represent) the compiled code in memory?
: Would it be worth maintaining a heap of PyCodeObjects (one for
: each .pyx, or until the cache is full...) as opposed to recompiling
: for each request (unless recompilation is desired, say for .pyd), e.g.
: PyCodeObject **pcoCached = malloc( 500000 );
: Will Python use this new heap for the code object pointers as well?
: Will the code objects fill the heap, or will Python, when compiling
: a code object, allocate memory elsewhere? Er, does sizeof(PyCodeObject)
: include the Python bytecode?

The code objects will be wrapped in function objects and bound to
variables in the modules you import. You can then call the functions
with PyObject_CallFunction(), or from PyRun_SimpleString(), depending
on how you are using the API.

It is probably better to let Python maintain the information for you,
and it will take care of the memory.
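In other words, a per-request flow roughly like this, where
"myhandlers" and "handle" are invented names for your own module and
function:

    #include <Python.h>

    /* Import the handler module once at startup; Python keeps the
       compiled code alive inside the function object, so nothing is
       recompiled per request. */
    static PyObject *handler_func;

    void load_handler(void)
    {
        PyObject *mod = PyImport_ImportModule("myhandlers");   /* new ref */
        handler_func = PyObject_GetAttrString(mod, "handle");  /* new ref */
        Py_DECREF(mod);
    }

    void handle_one_request(const char *request_line)
    {
        PyObject *result = PyObject_CallFunction(handler_func, "s",
                                                 request_line);
        Py_XDECREF(result);     /* ignore or inspect the return value */
    }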

: If I were to maintain the PyCodeObjects, how then would I have
: an instance of the interpreter run this code? Ah, obviously
: PyEval_EvalCode, but what if Python is handling the sub-interpreters?
: How would I say, ThatSubInterpreter->EvalCode( PyCodeObject )?

This is handled by switching the interpreter state (see the webpage
mentioned above). But you will want to run the code as functions. For
a good example of switching interpreter states, look at the _tkinter.c
code.
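A stripped-down version of that switching looks roughly like this;
sub_tstate is assumed to be the thread state returned by an earlier
Py_NewInterpreter() call, and the lock is assumed to be held already:

    #include <Python.h>

    /* Make a particular sub-interpreter current, run something in it,
       then switch back.  This is a much-simplified version of the
       juggling _tkinter.c does. */
    void run_in_subinterpreter(PyThreadState *sub_tstate, const char *script)
    {
        PyThreadState *prev = PyThreadState_Swap(sub_tstate);
        PyRun_SimpleString(script);
        PyThreadState_Swap(prev);
    }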

: Is Py_NewInterpreter really a 'fresh copy' of the main interpreter
: for use by a thread? If I add built-in modules with PyImport_AddModule,
: do I need to do this for each Py_NewInterpreter? What about
: Py_InitModule? Does this need to be called for each new interpreter as well?

As far as I understand it from looking at the code, yes, but I've never
used Py_NewInterpreter(), so maybe someone else can answer better. Each
new interpreter has its own set of imported modules, so yes, you will
need to import the modules individually. But Py_InitModule should be
called from the module's "init<modname>" function (triggered by a normal
"import <modname>" statement).
http://www.python.org/doc/current/ext/methodTable.html
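For reference, a built-in module in that style looks roughly like this
("server" and "log" are made-up names):

    #include <Python.h>

    static PyObject *server_log(PyObject *self, PyObject *args)
    {
        char *msg;
        if (!PyArg_ParseTuple(args, "s", &msg))
            return NULL;
        printf("LOG: %s\n", msg);
        Py_INCREF(Py_None);
        return Py_None;
    }

    static PyMethodDef server_methods[] = {
        {"log", server_log, METH_VARARGS, "Write a message to the log."},
        {NULL, NULL, 0, NULL}
    };

    void initserver(void)               /* run on "import server" */
    {
        Py_InitModule("server", server_methods);
    }

Registering it with PyImport_AppendInittab("server", initserver) before
Py_Initialize() should make it importable; each sub-interpreter that
does "import server" then initializes its own copy of the module.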

: In the examples, the telnet server seems to call Py_NewInterpreter
: for each request. If I wished to limit the number of worker threads,
: would I have to do the following when a request occurs: if threads >= max,
: put the request in a queue, otherwise create a new thread with
: Py_NewInterpreter, handle the request, check whether any new requests
: are in the queue [and handle them too], then call Py_EndInterpreter?
: What about reusing interpreters after they've been idle for a while?
: (Say, a LIFO queue of interpreters... you know how it is... no action
: for a while, then bang, everybody wants in on your server!!)

If this is the case, it would probably be better to create a thread
pool and work from a queue. There are other techniques you could use,
though.
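One possible shape for that, purely as a sketch (POSIX threads,
invented names, no error handling): each worker creates one
sub-interpreter up front and reuses it for every request it pulls off
the queue.

    #include <Python.h>
    #include <pthread.h>

    #define QSIZE 128

    static pthread_mutex_t qlock = PTHREAD_MUTEX_INITIALIZER;
    static pthread_cond_t  qcond = PTHREAD_COND_INITIALIZER;
    static const char *requests[QSIZE];   /* stand-in for a real queue */
    static int qhead, qtail;

    void push_request(const char *req)    /* called by the accept loop */
    {
        pthread_mutex_lock(&qlock);
        requests[qtail++ % QSIZE] = req;
        pthread_cond_signal(&qcond);
        pthread_mutex_unlock(&qlock);
    }

    static const char *pop_request(void)  /* blocks until work arrives */
    {
        const char *req;
        pthread_mutex_lock(&qlock);
        while (qhead == qtail)
            pthread_cond_wait(&qcond, &qlock);
        req = requests[qhead++ % QSIZE];
        pthread_mutex_unlock(&qlock);
        return req;
    }

    static void *worker(void *arg)
    {
        PyThreadState *tstate;
        (void)arg;

        /* One sub-interpreter per worker, created once and reused. */
        PyEval_AcquireLock();
        tstate = Py_NewInterpreter();
        PyEval_ReleaseThread(tstate);

        for (;;) {
            const char *req = pop_request();  /* no Python lock held here */
            PyEval_AcquireThread(tstate);     /* enter our interpreter */
            PyRun_SimpleString(req);          /* handle the request */
            PyEval_ReleaseThread(tstate);
        }
        return NULL;
    }

The main thread would call Py_Initialize() and PyEval_InitThreads(),
release the lock, start a fixed number of worker() threads with
pthread_create(), and feed requests in through push_request() from its
accept loop.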

-Arcege