Mailing List Archive

Memory and swapping question
Hi folks,

due to a question which came up in the tutor list, I'd like
to ask if somebody can explain the following:

(This is true for Python 1.5 under Win98 and Suse Linux at least)

Remember the size of your machine's main memory in MB.
On my machine, this is 64.

Start a fresh Python session with not many other tasks
active at the same time.

Now, divide the number by 16 and multiply by a million.

>>> my_mb = 64
>>> big = my_mb / 16 * 1000000
>>> big
4000000

(16 is a good guess for one integer entry in a list:
4 bytes the pointer, 12 bytes the object).

Now, create a list of numbers with the half of big,
and count the seconds. Afterwards, delete the list
and again count the seconds.

>>> x=range(big/2)
>>> del x
>>>

This will be quite fast, and the deletion will be somewhat
faster than the creation.

Now for the big WHY?
Do the same with big.

>>> x=range(big)
>>> del x
>>>

On my system, creation takes about 10 times as for big/2,
this is ok. But the del takes at least three times as long.
Besides the fact that integers are never really disposed but
build up a freelist, why is deletion so much slower now?

cheers - chris

--
Christian Tismer :^) <mailto:tismer@appliedbiometrics.com>
Applied Biometrics GmbH : Have a break! Take a ride on Python's
Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net
10553 Berlin : PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF
we're tired of banana software - shipped green, ripens at home
Memory and swapping question [ In reply to ]
> Now, create a list of numbers with the half of big,
> and count the seconds. Afterwards, delete the list
> and again count the seconds.
>
> >>> x=range(big/2)
> >>> del x
> >>>
>
> This will be quite fast, and the deletion will be somewhat
> faster than the creation.
>
> Now for the big WHY?
> Do the same with big.
>
> >>> x=range(big)
> >>> del x
> >>>
>
> On my system, creation takes about 10 times as for big/2,
> this is ok. But the del takes at least three times as long.
> Besides the fact that integers are never really disposed but
> build up a freelist, why is deletion so much slower now?

Clearly in the second case you're exceeding your physical memory and
paging VM pages in and out of swap space?

My guess: when creating the list you are allocating new VM pages,
which don't require any overhead until they need to be written, but
when deleting it, each page gets read from swap space, modified, and
then written back. Thus, you're I/O bound, and deleting requires more
I/O.

--Guido van Rossum (home page: http://www.python.org/~guido/)
Memory and swapping question [ In reply to ]
Guido van Rossum wrote:
...
> > >>> x=range(big)
> > >>> del x
> > >>>
> >
> > On my system, creation takes about 10 times as for big/2,
> > this is ok. But the del takes at least three times as long.
> > Besides the fact that integers are never really disposed but
> > build up a freelist, why is deletion so much slower now?
>
> Clearly in the second case you're exceeding your physical memory and
> paging VM pages in and out of swap space?

Yes, this was what I wanted to check out.

> My guess: when creating the list you are allocating new VM pages,
> which don't require any overhead until they need to be written, but
> when deleting it, each page gets read from swap space, modified, and
> then written back. Thus, you're I/O bound, and deleting requires more
> I/O.

Now I've got it. Right, since the structures almost fit main memory,
there is a little but not too much swapping, just pushing other
processes memory away.
Then, all the integers are deallocated which means they are
not deleted at all, but written again since they build up
the free list. This ensures that my whole memory gets written
to the disk. The same would happen if I'd read once through the
list.
But, not really. Doesn't this suggest to do memory deallocation
from the end to the start of a list? I could imagine that the
probability to touch something in memory is higher in this
case. Did someone try this before? (before I waste time)

thanks & cheers - chris

--
Christian Tismer :^) <mailto:tismer@appliedbiometrics.com>
Applied Biometrics GmbH : Have a break! Take a ride on Python's
Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net
10553 Berlin : PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF
we're tired of banana software - shipped green, ripens at home
Memory and swapping question [ In reply to ]
In article <371B5ED8.A9C82170@appliedbiometrics.com>,
Christian Tismer <tismer@appliedbiometrics.com> wrote:
> due to a question which came up in the tutor list, I'd like
> to ask if somebody can explain the following:....

<chomp><chomp>timings on making huge lists of integers...

> On my system, creation takes about 10 times as for big/2,
> this is ok. But the del takes at least three times as long.
> Besides the fact that integers are never really disposed but
> build up a freelist, why is deletion so much slower now?

Could be wrong, but this may be a case of a famous database
problem. The OS (typically) swaps out pages by picking the "least recently
used" page, but when you are decreffing (scanning) a HUGE list of sequentially
allocated objects this guarantees that the page you need next will
be swapped out by the time you get to it. Yikes! Allocation is faster
because you are really only "paging things out" (the first time)
and the write to the disk can be buffered until the disk is
ready, allowing the program to proceed (?I think?).

This is one reason why Oracle and Sybase, etc, like to do their own
memory and disk management ("gimme them sectors -- don't need no
g.d. filesystem, thanks!"). Just a guess, but a not completely
uneducated one.
-- Aaron Watters
===
stop procrastinating tomorrow.


-----------== Posted via Deja News, The Discussion Network ==----------
http://www.dejanews.com/ Search, Read, Discuss, or Start Your Own
Memory and swapping question [ In reply to ]
aaron_watters@my-dejanews.com wrote:
>
> In article <371B5ED8.A9C82170@appliedbiometrics.com>,
> Christian Tismer <tismer@appliedbiometrics.com> wrote:
> > due to a question which came up in the tutor list, I'd like
> > to ask if somebody can explain the following:....
>
> <chomp><chomp>timings on making huge lists of integers...
>
> > On my system, creation takes about 10 times as for big/2,
> > this is ok. But the del takes at least three times as long.
> > Besides the fact that integers are never really disposed but
> > build up a freelist, why is deletion so much slower now?
>
> Could be wrong, but this may be a case of a famous database
> problem. The OS (typically) swaps out pages by picking the "least recently
> used" page, but when you are decreffing (scanning) a HUGE list of sequentially
> allocated objects this guarantees that the page you need next will
> be swapped out by the time you get to it. Yikes! Allocation is faster
> because you are really only "paging things out" (the first time)
> and the write to the disk can be buffered until the disk is
> ready, allowing the program to proceed (?I think?).

Exactly.
In this case, things are even worse:
Iin the de-allocation phase, the internal integer cache
is scanned in order, to return the integers to the
freelist. This treats a stack-like structure as a heap,
making integer deallocation like a list.reverse() on
the cache file.
It also applies to other objects which are created
in-order and referenced by the list. The same disk trashing.
Which means that the malloc routines use a similar,
stack-like freelist, at least on Win98.

> This is one reason why Oracle and Sybase, etc, like to do their own
> memory and disk management ("gimme them sectors -- don't need no
> g.d. filesystem, thanks!"). Just a guess, but a not completely
> uneducated one.

Well, I changed the deallocation strategy of lists to free
objects beginning from the end. (1 line in listobject.c)
Now, all my examples run as expected, with del no longer
being more expensive than create.

cheers - chris

--
Christian Tismer :^) <mailto:tismer@appliedbiometrics.com>
Applied Biometrics GmbH : Have a break! Take a ride on Python's
Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net
10553 Berlin : PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF
we're tired of banana software - shipped green, ripens at home