Mailing List Archive

BEGIN using lots of time in profile?!
What the heck is with PerMsgStatus::BEGIN? The ExclSec is rather
large. And, oh yeah, it's called 3487 times for 200 messages.

The good news is that our overhead (time parsing, doing, getting, etc.)
for local tests without Bayes is quite low (seems lower than before, but
I haven't actually compared).

Profile:

* perl -d:DProf ./mass-check -f corpus.tiny
* 100 spam
* 100 ham
* use_bayes 0
* use_auto_whitelist 0

Total Elapsed Time = 14.39736 Seconds
User+System Time = 13.11736 Seconds
Exclusive Times
%Time ExclSec CumulS #Calls sec/call Csec/c Name
13.7 1.808 10.021 3487 0.0005 0.0029 ::PerMsgStatus::BEGIN
6.63 0.870 1.225 79609 0.0000 0.0000 ::PerMsgStatus::get
2.92 0.383 4.592 201 0.0019 0.0228 ::PerMsgStatus::_body_tests_0
2.52 0.330 0.351 26028 0.0000 0.0000 ::Message::Node::header
2.18 0.286 1.869 201 0.0014 0.0093 ::PerMsgStatus::_head_tests_0
1.52 0.199 0.249 2 0.0997 0.1243 ::Conf::_parse
1.45 0.190 0.371 26294 0.0000 0.0000 ::Message::Node::get_header
1.22 0.160 0.454 301 0.0005 0.0015 ::Message::new
1.14 0.150 0.114 72175 0.0000 0.0000 UNIVERSAL::can
1.14 0.150 0.169 464 0.0003 0.0004 ::Message::Metadata::parse_received_line
1.13 0.148 0.727 95 0.0016 0.0077 HTML::Parser::parse
1.07 0.140 0.140 201 0.0007 0.0007 ::PerMsgStatus::_check_unique_words
1.07 0.140 0.169 201 0.0007 0.0008 ::PerMsgStatus::_meta_tests_500
0.99 0.130 0.206 4103 0.0000 0.0001 ::HTML::html_text
0.99 0.130 0.133 4917 0.0000 0.0000 ::PerMsgStatus::_handle_hit
0.91 0.120 0.377 6490 0.0000 0.0001 ::HTML::html_tag
0.76 0.100 0.100 201 0.0005 0.0005 ::PerMsgStatus::PORN_15_body_test
0.76 0.100 0.094 11055 0.0000 0.0000 ::PerMsgStatus::html_range
0.76 0.100 0.099 2412 0.0000 0.0000 ::PerMsgStatus::html_eval
0.76 0.100 12.195 201 0.0005 0.0607 ::PerMsgStatus::check
0.69 0.090 0.090 201 0.0004 0.0004 ::PerMsgStatus::US_DOLLARS_3_body_test
0.69 0.090 0.090 201 0.0004 0.0004 ::PerMsgStatus::DIET_3_body_test
0.69 0.090 0.090 201 0.0004 0.0004 ::PerMsgStatus::HAIR_LOSS_body_test
0.69 0.090 0.090 201 0.0004 0.0004 ::PerMsgStatus::__NIGERIAN_BODY_45_body_test
0.69 0.090 0.077 26733 0.0000 0.0000 ::PerMsgStatus::html_test
0.69 0.090 0.107 87 0.0010 0.0012 ::Message::_parse_multipart
0.69 0.090 1.957 603 0.0001 0.0032 ::PerMsgStatus::do_head_tests
0.61 0.080 0.079 2048 0.0000 0.0000 ::HTML::html_font_invisible
0.61 0.080 0.080 201 0.0004 0.0004 ::PerMsgStatus::CONSOLIDATE_DEBT_body_test
0.61 0.080 0.080 201 0.0004 0.0004 ::PerMsgStatus::LESBIAN_body_test
0.61 0.080 0.080 201 0.0004 0.0004 ::PerMsgStatus::PORN_CELEBRITY_body_test
0.61 0.080 0.079 100 0.0008 0.0008 ::remove_spamassassin_markup
0.61 0.080 0.198 201 0.0004 0.0010 ::PerMsgStatus::_body_uri_tests_0
0.53 0.070 0.070 201 0.0003 0.0003 ::PerMsgStatus::AMATEUR_PORN_body_test
0.53 0.070 0.070 201 0.0003 0.0003 ::PerMsgStatus::SUBJ_REMOVE_body_test
0.53 0.070 12.677 200 0.0003 0.0634 main::wanted
0.46 0.060 0.233 4917 0.0000 0.0000 ::PerMsgStatus::got_hit
0.46 0.060 0.117 850 0.0001 0.0001 ::HTML::target_uri
0.46 0.060 0.107 201 0.0003 0.0005 ::PerMsgStatus::get_uri_list
0.38 0.050 0.050 201 0.0002 0.0002 ::PerMsgStatus::FULL_REFUND_body_test
0.38 0.050 0.050 528 0.0001 0.0001 ::PerMsgStatus::_check_whitelist_rcvd
0.38 0.050 0.050 201 0.0002 0.0002 ::PerMsgStatus::REFINANCE_NOW_body_test
0.38 0.050 0.050 201 0.0002 0.0002 ::PerMsgStatus::__NIGERIAN_BODY_41_body_test
0.38 0.050 0.050 201 0.0002 0.0002 ::PerMsgStatus::MARKETING_PARTNERS_body_test
0.38 0.050 0.049 1700 0.0000 0.0000 ::HTML::parse_uri
0.38 0.050 0.050 804 0.0001 0.0001 ::PerMsgStatus::check_for_helo_rdns_mismatch
0.38 0.050 0.050 201 0.0002 0.0002 ::PerMsgStatus::HARDCORE_PORN_body_test
0.38 0.050 0.178 4917 0.0000 0.0000 ::PerMsgStatus::handle_hit
0.38 0.050 12.737 200 0.0002 0.0637 ::ArchiveIterator::run_file
0.38 0.050 0.097 201 0.0002 0.0005 ::PerMsgStatus::get_envelope_from
0.38 0.050 0.077 603 0.0001 0.0001 ::PerMsgStatus::all_from_addrs
0.38 0.050 0.086 201 0.0002 0.0004 ::PerMsgStatus::_check_attachments
0.30 0.040 0.040 201 0.0002 0.0002 ::PerMsgStatus::BAD_CREDIT_body_test
0.30 0.040 0.040 201 0.0002 0.0002 ::PerMsgStatus::GUARANTEED_STUFF_body_test
0.30 0.040 0.040 201 0.0002 0.0002 ::PerMsgStatus::SENT_IN_COMPLIANCE_body_test
0.30 0.040 0.037 6312 0.0000 0.0000 ::Message::Node::_decode_header
...

--
Daniel Quinlan anti-spam (SpamAssassin), Linux,
http://www.pathname.com/~quinlan/ and open source consulting
Re: BEGIN using lots of time in profile?! [ In reply to ]
On Fri, Mar 19, 2004 at 12:52:50AM -0800, Daniel Quinlan wrote:
> What the heck is with PerMsgStatus::BEGIN? The ExclSec is rather
> large. And, oh yeah, it's called 3487 times for 200 messages.

I'm not seeing the same thing. Here is the top 10 from a similar run:
%Time ExclSec CumulS #Calls sec/call Csec/c Name
9.74 1.949 5.326 1005 0.0019 0.0053 ::PerMsgStatus::run_eval_tests
4.57 0.914 1.643 79511 0.0000 0.0000 ::PerMsgStatus::get
2.82 0.565 8.291 201 0.0028 0.0413 ::PerMsgStatus::_body_tests_0
2.27 0.455 0.480 1108 0.0004 0.0004 ::Message::Metadata::parse_received_line
2.07 0.414 0.821 28587 0.0000 0.0000 ::Message::Node::get_header
2.04 0.408 2.838 201 0.0020 0.0141 ::PerMsgStatus::_head_tests_0
1.92 0.385 0.427 24761 0.0000 0.0000 ::Message::Node::header
1.54 0.309 0.328 201 0.0015 0.0016 ::PerMsgStatus::_meta_tests_500
1.54 0.308 0.336 2 0.1540 0.1678 ::Conf::_parse
1.31 0.262 0.418 201 0.0013 0.0021 ::Message::new

PerMsgStatus::BEGIN doesn't show up until later:
0.55 0.110 0.189 71 0.0015 0.0027 ::PerMsgStatus::BEGIN

I wonder if it's something with ArchiveIterator and the reading in and
parsing of whatever is in corpus.tiny. I've got two lines in mine,
one spam mbox and one ham mbox. Also, all my messages have already
been cleaned of markup, so I wonder if it's something to do with the
remove_spamassassin_markup.

Michael
Re: BEGIN using lots of time in profile?! [ In reply to ]
On Fri, Mar 19, 2004 at 12:52:50AM -0800, Daniel Quinlan wrote:
> What the heck is with PerMsgStatus::BEGIN? The ExclSec is rather
> large. And, oh yeah, it's called 3487 times for 200 messages.

Also, does dprofpp -T show anything useful?

Michael