Mailing List Archive

Memory profiling for perl5
We have Devel::DProf, which is quite nice for profiling time spent in
subroutines, but we appear to have nothing yet for profiling memory
consumption in perl5.

I have a server process which forks to parallelize certain operations,
and if I create a huge data structure in the parent process, this
bloats the memory of the children as well.

I'm starting to wonder how to profile memory consumption in perl5.
The crudest approach would be to parse the symbol table, and sum up
the amount of space taken up by each item. The question is, what's
the correct way to do this?

I am interested primarily in tracing through a large complex data
structure (hash of hashes of hashes of hashes of....) and showing
(optimally in a tree structure) how much memory is consumed by each
"leaf" on the structure.

How much memory does a single hash reference consume? If that is a
reference to a hash of hashes, how much space do the keys take up? I
am assuming I can use length() if the values are strings, but if they
are numeric, are they stored internally in a different format?

Help and pointers much appreciated. Prior art would be nice...

W. Phillip Moore Phone: (212)-762-2433
Information Technology Department FAX: (212)-762-1009
Morgan Stanley and Co. E-mail: wpm@ms.com
750 7th Ave, NY, NY 10019

"Grant me the serenity to accept the things I cannot change, the
courage to change the things I can, and the wisdom to hide the
bodies of the people that I had to kill because they pissed me
off."
-- Anonymous

"Every normal man must be tempted at times to spit on his
hands, hoist the black flag, and begin slitting throats."
-- H.L. Mencken
Re: Memory profiling for perl5
use the recently posted BSD::Resource package. I think it was
jarkko's, but he's probably sleeping. go snoop around cpan.

--tom

(wish he too were sleeping)
Re: Memory profiling for perl5
Phillip Moore writes:
> I'm starting to wonder how to profile memory consumption in perl5.
> The crudest approach would be to parse the symbol table, and sum up
> the amount of space taken up by each item. The question is, what's
> the correct way to do this?
>
> I am interested primarily in tracing through a large complex data
> structure (hash of hashes of hashes of hashes of....) and showing
> (optimally in a tree structure) how much memory is consumed by each
> "leaf" on the structure.
>

The newest debugger (available RSN ;-) takes a first step: it analyses
memory usage in a package. It is much cruder than what you describe
above: since it is Perl-based, it takes the size of a value to be the
length of its string representation.

It also does not descend into subtrees (at the moment).

On the other hand, when xsubpp 2.0 starts moving, I may add the real
calculator to ExtUtils::Peek. I think the XSUB should be as simple as
this: given a Perl structure as input, it returns the list of other
structures this one references. It looks like all the rest can be
done in Perl.
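To illustrate the proposed split (an illustration only; the names and
the pure-Perl "primitive" are mine, and the real primitive would be an
XSUB peeking at the internals rather than ref()-based guesswork): one
function returns the structures directly referenced by a given one,
and a plain-Perl breadth-first walk does "all the rest":

```perl
use strict;

# The primitive: given one structure, list the structures it directly
# references.  A real implementation would be an XSUB; this pure-Perl
# stand-in only understands plain hash/array/scalar refs.
sub referenced_by {
    my ($thing) = @_;
    my @out;
    if    (ref $thing eq 'HASH')  { @out = values %$thing }
    elsif (ref $thing eq 'ARRAY') { @out = @$thing }
    elsif (ref $thing eq 'REF' or ref $thing eq 'SCALAR') { @out = ($$thing) }
    return grep { ref $_ } @out;
}

# "All the rest" in Perl: walk the reference graph, using each ref's
# stringified address as its identity so shared/cyclic data is visited
# only once.
sub count_nodes {
    my ($root) = @_;
    my %seen;
    my @queue = ($root);
    while (@queue) {
        my $node = shift @queue;
        next if $seen{$node}++;
        push @queue, referenced_by($node);
    }
    return scalar keys %seen;
}

my $tree = { a => { b => [ 1, { c => 2 } ] } };
print count_nodes($tree), "\n";    # prints 4
```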

Best,
Ilya
Re: Memory profiling for perl5
> From: Ilya Zakharevich <ilya@math.ohio-state.edu>
>
> Phillip Moore writes:
> >
> > I am interested primarily in tracing through a large complex data
> > structure (hash of hashes of hashes of hashes of....) and showing
> > (optimally in a tree structure) how much memory is consumed by each
> > "leaf" on the structure.
>
> The newest debugger (available RSN ;-) takes a first step: it analyses
> memory usage in a package. It is much cruder than what you describe
> above: since it is Perl-based, it takes the size of a value to be the
> length of its string representation.
>
> It also does not descend into subtrees (at the moment).
>
> On the other hand, when xsubpp 2.0 starts moving, I may add the real
> calculator to ExtUtils::Peek. I think the XSUB should be as simple as
> this: given a Perl structure as input, it returns the list of other
> structures this one references. It looks like all the rest can be
> done in Perl.

Let's not forget that the real problem with hashes was not the memory
that's actually in use but rather the memory that _was_ in use and
has now been 'freed' (when the hash doubled in size 8->16->32, etc.).

A quick squint at Perl 5.002beta1's hv.c shows that Larry has
implemented some fancy new memory management code for hashes.

*Thank you*, Larry!

I've no time to study it at the moment. Anyone care to take a look and
summarise what it's doing?

Tim.
Re: Memory profiling for perl5
Tom Christiansen writes:
> use the recently posted BSD::Resource package. I think it was
> jarkko's, but he's probably sleeping. go snoop around cpan.
>
> --tom
>
> (wish he too were sleeping)

[Zzzz...wazzup?]

Sorry, but I do not think BSD::Resource is going to help Phillip very
much in tracking the memory usage of %hash_this or
%hash_that. BSD::Resource just implements the functionality
in <sys/resource.h>:

getrusage [gs]etrlimit [gs]etpriority

Even getrusage() is nothing of the sort (well, *rlimit is vaguely
on the spot, but only in the sense that you get signals when and if
you overstep some predetermined limits...). getrusage() only gives
the following per-process *) cumulative consumption figures:

struct rusage {
        struct timeval ru_utime;        /* user time used */
        struct timeval ru_stime;        /* system time used */
        long    ru_maxrss;              /* maximum resident set size */
#define ru_first        ru_ixrss
        long    ru_ixrss;               /* integral shared memory size */
        long    ru_idrss;               /* integral unshared data " */
        long    ru_isrss;               /* integral unshared stack " */
        long    ru_minflt;              /* page reclaims - total vmfaults */
        long    ru_majflt;              /* page faults */
        long    ru_nswap;               /* swaps */
        long    ru_inblock;             /* block input operations */
        long    ru_oublock;             /* block output operations */
        long    ru_msgsnd;              /* messages sent */
        long    ru_msgrcv;              /* messages received */
        long    ru_nsignals;            /* signals received */
        long    ru_nvcsw;               /* voluntary context switches */
        long    ru_nivcsw;              /* involuntary " */
#define ru_last         ru_nivcsw
};
#define ru_totflt       ru_minflt

and these do not help much in tracking memory consumption per
Perlish data structure.

++jhi;

*) and per-thread, too, at least in Digital UNIX; I will implement at
least the RUSAGE_THREAD flag in the next BSD::Resource release
Re: Memory profiling for perl5
>>>>> "Jarkko" == Jarkko Hietaniemi <jhi@snakemail.hut.fi> writes:

Jarkko> Sorry but I do not think BSD::Resource is going to help
Jarkko> Phillip very much on tracking the memomry usage of %hash_this
Jarkko> or %hash_that. BSD::Resource just implements the functionality
Jarkko> in <sys/resource.h>:

Well, what I really want, in a perfect world, would be a tmon.out-style
file which shows me how much memory each and every data structure
consumes, but I can live with less than that.

Jarkko> long ru_ixrss; /* integral shared memory size */
Jarkko> long ru_idrss; /* integral unshared data " */
Jarkko> long ru_isrss; /* integral unshared stack " */

What I plan to do for now is track changes in the above before and
after certain large operations. If I know how much memory usage
*changes* before and after the creation of a huge structure I can get
a crude handle on memory consumption.
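That before/after plan can be sketched as follows. The delta helper is
plain Perl; the driver in the comment assumes BSD::Resource's
list-returning getrusage() interface and is an untested sketch:

```perl
use strict;

# Given two snapshots of rusage-style counters (hashrefs of
# field => value), report how much each counter grew.
sub rusage_delta {
    my ($before, $after) = @_;
    my %delta;
    for my $field (keys %$after) {
        $delta{$field} = $after->{$field} - ($before->{$field} || 0);
    }
    return \%delta;
}

# With BSD::Resource installed, the snapshots would be taken roughly
# like this (untested sketch; fields in struct rusage order):
#
#   use BSD::Resource;
#   my @names = qw(utime stime maxrss ixrss idrss isrss);
#   my %before; @before{@names} = (getrusage())[0..5];
#   ... build the huge data structure here ...
#   my %after;  @after{@names}  = (getrusage())[0..5];
#   my $d = rusage_delta(\%before, \%after);
#   printf "%-8s grew by %s\n", $_, $d->{$_} for sort keys %$d;

my $d = rusage_delta({ maxrss => 1000 }, { maxrss => 1450, idrss => 20 });
print "$d->{maxrss} $d->{idrss}\n";    # prints "450 20"
```

Note the usual caveat with this approach: perl's malloc rarely returns
freed memory to the OS, so the deltas measure peak growth, not the net
size of the structure itself.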

It turns out that my real problem is the lack of copy-on-write memory
structures in fork() on SunOS 4.1.3. Since I obtain parallelism by
forking heavily (still waiting for use POSIX::Thread ;-), creating a
huge data structure in the top level parent is killing me.

The other 3 OS flavors I have available (IRIX 5.3, Solaris 2.[45], and
AIX [34].x) all use copy-on-write when forking; since the BULK of
the data is accessed read-only in the children, total memory
consumption there should be more reasonable.