svn commit: rev 6247 - in incubator/spamassassin/trunk: . lib/Mail lib/Mail/SpamAssassin lib/Mail/SpamAssassin/MIME masses spamd
Author: felicity
Date: Tue Jan 20 14:33:37 2004
New Revision: 6247

Removed:
incubator/spamassassin/trunk/lib/Mail/SpamAssassin/AuditMessage.pm
incubator/spamassassin/trunk/lib/Mail/SpamAssassin/EncappedMIME.pm
incubator/spamassassin/trunk/lib/Mail/SpamAssassin/EncappedMessage.pm
incubator/spamassassin/trunk/lib/Mail/SpamAssassin/Message.pm
incubator/spamassassin/trunk/lib/Mail/SpamAssassin/PhraseFreqs.pm
Modified:
incubator/spamassassin/trunk/INSTALL
incubator/spamassassin/trunk/MANIFEST
incubator/spamassassin/trunk/README
incubator/spamassassin/trunk/USAGE
incubator/spamassassin/trunk/lib/Mail/SpamAssassin.pm
incubator/spamassassin/trunk/lib/Mail/SpamAssassin/EvalTests.pm
incubator/spamassassin/trunk/lib/Mail/SpamAssassin/MIME.pm
incubator/spamassassin/trunk/lib/Mail/SpamAssassin/MIME/Parser.pm
incubator/spamassassin/trunk/lib/Mail/SpamAssassin/NoMailAudit.pm
incubator/spamassassin/trunk/lib/Mail/SpamAssassin/PerMsgStatus.pm
incubator/spamassassin/trunk/lib/Mail/SpamAssassin/Received.pm
incubator/spamassassin/trunk/masses/mass-check
incubator/spamassassin/trunk/spamassassin.raw
incubator/spamassassin/trunk/spamd/spamd.raw
Log:
bug 2939: initial work to remove Mail::Audit code

Modified: incubator/spamassassin/trunk/INSTALL
==============================================================================
--- incubator/spamassassin/trunk/INSTALL (original)
+++ incubator/spamassassin/trunk/INSTALL Tue Jan 20 14:33:37 2004
@@ -46,8 +46,8 @@
spamassassin to filter your mail and then something else wrote it into a
folder for you, then you should be fine.

-Support for versions of the optional Mail::Audit module before 1.9 is no
-longer included.
+Support for versions of the optional Mail::Audit module is no longer
+included.

The default mode of tagging (which used to be ***SPAM*** in the subject
line) no longer takes place. Instead the message is rewritten.
@@ -387,28 +387,6 @@
Note that MIMEDefang users may need to set the 'pyzor_path'
configuration setting, since MIMEDefang does not set a PATH by
default.
-
-
- - Mail::Audit, Mail::Internet, Net::SMTP (from CPAN)
-
- If you want to use SpamAssassin with Mail::Audit, you will (obviously)
- require the Mail::Audit module, and any modules it requires (there's
- lots of them, unfortunately).
-
- Additionally, Mail::Internet is required if you wish to use the
- "-r/-w" options of the spamassassin program (reporting and replying,
- for spam-trap mail accounts).
-
- If you use procmail, KMail, 'spamassassin', or you plan to use
- 'spamd', you will *not* need these.
-
- Here's how to install them using CPAN.pm:
-
- perl -MCPAN -e shell
- o conf prerequisites_policy ask
- install Mail::Audit
- quit
-

- Net::Ident (from CPAN)


Modified: incubator/spamassassin/trunk/MANIFEST
==============================================================================
--- incubator/spamassassin/trunk/MANIFEST (original)
+++ incubator/spamassassin/trunk/MANIFEST Tue Jan 20 14:33:37 2004
@@ -26,7 +26,6 @@
configure
lib/Mail/SpamAssassin.pm
lib/Mail/SpamAssassin/ArchiveIterator.pm
-lib/Mail/SpamAssassin/AuditMessage.pm
lib/Mail/SpamAssassin/AutoWhitelist.pm
lib/Mail/SpamAssassin/Bayes.pm
lib/Mail/SpamAssassin/BayesStore.pm
@@ -35,8 +34,6 @@
lib/Mail/SpamAssassin/ConfSourceSQL.pm
lib/Mail/SpamAssassin/DBBasedAddrList.pm
lib/Mail/SpamAssassin/Dns.pm
-lib/Mail/SpamAssassin/EncappedMIME.pm
-lib/Mail/SpamAssassin/EncappedMessage.pm
lib/Mail/SpamAssassin/EvalTests.pm
lib/Mail/SpamAssassin/HTML.pm
lib/Mail/SpamAssassin/Locales.pm
@@ -44,13 +41,11 @@
lib/Mail/SpamAssassin/MIME.pm
lib/Mail/SpamAssassin/MIME/Parser.pm
lib/Mail/SpamAssassin/MailingList.pm
-lib/Mail/SpamAssassin/Message.pm
lib/Mail/SpamAssassin/NetSet.pm
lib/Mail/SpamAssassin/NoMailAudit.pm
lib/Mail/SpamAssassin/PerMsgLearner.pm
lib/Mail/SpamAssassin/PerMsgStatus.pm
lib/Mail/SpamAssassin/PersistentAddrList.pm
-lib/Mail/SpamAssassin/PhraseFreqs.pm
lib/Mail/SpamAssassin/Received.pm
lib/Mail/SpamAssassin/Reporter.pm
lib/Mail/SpamAssassin/SHA1.pm

Modified: incubator/spamassassin/trunk/README
==============================================================================
--- incubator/spamassassin/trunk/README (original)
+++ incubator/spamassassin/trunk/README Tue Jan 20 14:33:37 2004
@@ -69,10 +69,9 @@
[1]: http://razor.sourceforge.net/

The distribution provides "spamassassin", a command line tool to perform
-filtering, along with "Mail::SpamAssassin", a set of perl modules which
-implement a Mail::Audit plugin, allowing SpamAssassin to be used in a
-Mail::Audit filter, spam-protection proxy SMTP or POP/IMAP server, or a
-variety of different spam-blocking scenarios.
+filtering, along with the "Mail::SpamAssassin" module set which allows
+SpamAssassin to be used in spam-protection proxy SMTP or POP/IMAP server,
+or a variety of different spam-blocking scenarios.

In addition, Craig Hughes has contributed "spamd", a daemonized version of
SpamAssassin, which runs persistently. Using "spamc", a lightweight C

Modified: incubator/spamassassin/trunk/USAGE
==============================================================================
--- incubator/spamassassin/trunk/USAGE (original)
+++ incubator/spamassassin/trunk/USAGE Tue Jan 20 14:33:37 2004
@@ -37,17 +37,6 @@



-If you use Mail::Audit already:
-
- - run "perldoc Mail::SpamAssassin" and take a look at the synopsis, it
- outlines what you need to add to your audit script.
-
- - Copy the configuration files (see CUSTOMISING, below) to a known
- location, so your script can set the appropriate options for the
- Mail::SpamAssassin constructor to load them.
-
-
-
If you use KMail:

- http://kmail.kde.org/tools.html mentions:

Modified: incubator/spamassassin/trunk/lib/Mail/SpamAssassin.pm
==============================================================================
--- incubator/spamassassin/trunk/lib/Mail/SpamAssassin.pm (original)
+++ incubator/spamassassin/trunk/lib/Mail/SpamAssassin.pm Tue Jan 20 14:33:37 2004
@@ -59,7 +59,7 @@

=head1 NAME

-Mail::SpamAssassin - Mail::Audit spam detector plugin
+Mail::SpamAssassin - Spam detector and markup engine

=head1 SYNOPSIS

@@ -69,7 +69,7 @@
my $status = $spamtest->check ($mail);

if ($status->is_spam ()) {
- $status->rewrite_mail ();
+ $mail = $status->rewrite_mail ();
} else {
...
}
@@ -78,23 +78,19 @@

=head1 DESCRIPTION

-Mail::SpamAssassin is a module to identify spam using text analysis and several
-internet-based realtime blacklists.
+Mail::SpamAssassin is a module to identify spam using several methods
+including text analysis, internet-based realtime blacklists, statistical
+analysis, and internet-based hashing algorithms.

Using its rule base, it uses a wide range of heuristic tests on mail headers
-and body text to identify "spam", also known as unsolicited commercial email.
+and body text to identify "spam", also known as unsolicited bulk email.

-Once identified, the mail can then be optionally tagged as spam for later
-filtering using the user's own mail user-agent application.
+Once identified, the mail can then be tagged as spam for later filtering
+using the user's own mail user-agent application or at the mail transfer
+agent.

-This module also implements a Mail::Audit plugin, allowing SpamAssassin to be
-used in a Mail::Audit filter. If you wish to use a command-line filter tool,
-try the C<spamassassin> or C<spamd> tools provided.
-
-Note that, if you're using Mail::Audit, the constructor for the Mail::Audit
-object must use the C<nomime> option, like so:
-
- my $ma = new Mail::Audit ( nomime => 1 );
+If you wish to use a command-line filter tool, try the C<spamassassin>
+or C<spamd> tools provided.

SpamAssassin also includes support for reporting spam messages to collaborative
filtering databases, such as Vipul's Razor ( http://razor.sourceforge.net/ ).
@@ -417,8 +413,8 @@

=item $status = $f->check ($mail)

-Check a mail, encapsulated in a C<Mail::Audit> or
-C<Mail::SpamAssassin::Message> object, to determine if it is spam or not.
+Check a mail, encapsulated in a C<Mail::SpamAssassin::Message> object,
+to determine if it is spam or not.

Returns a C<Mail::SpamAssassin::PerMsgStatus> object which can be
used to test or manipulate the mail message.
@@ -435,8 +431,7 @@
local ($_);

$self->init(1);
- my $mail = $self->encapsulate_mail_object ($mail_obj);
- my $msg = Mail::SpamAssassin::PerMsgStatus->new($self, $mail);
+ my $msg = Mail::SpamAssassin::PerMsgStatus->new($self, $mail_obj);
# Message-Id is used for a filename on disk, so we can't have '/' in it.
$msg->check();
$msg;
@@ -446,8 +441,7 @@

=item $status = $f->learn ($mail, $id, $isspam, $forget)

-Learn from a mail, encapsulated in a C<Mail::Audit> or
-C<Mail::SpamAssassin::Message> object.
+Learn from a mail, encapsulated in a C<Mail::SpamAssassin::Message> object.

If C<$isspam> is set, the mail is assumed to be spam, otherwise it will
be learnt as non-spam.
@@ -478,8 +472,7 @@

require Mail::SpamAssassin::PerMsgLearner;
$self->init(1);
- my $mail = $self->encapsulate_mail_object ($mail_obj);
- my $msg = Mail::SpamAssassin::PerMsgLearner->new($self, $mail);
+ my $msg = Mail::SpamAssassin::PerMsgLearner->new($self, $mail_obj);

if ($forget) {
$msg->forget($id);
@@ -651,7 +644,7 @@

=item $f->report_as_spam ($mail, $options)

-Report a mail, encapsulated in a C<Mail::Audit> object, as human-verified spam.
+Report a mail, encapsulated in a C<Mail::SpamAssassin::Message> object, as human-verified spam.
This will submit the mail message to live, collaborative, spam-blocker
databases, allowing other users to block this message.

@@ -691,8 +684,6 @@
my @msg = split (/^/m, $self->remove_spamassassin_markup($mail));
$mail = Mail::SpamAssassin::NoMailAudit->new ('data' => \@msg);

- $mail = $self->encapsulate_mail_object ($mail);
-
# learn as spam if enabled
if ( $self->{conf}->{bayes_learn_during_report} ) {
$self->learn ($mail, undef, 1, 0);
@@ -707,7 +698,7 @@

=item $f->revoke_as_spam ($mail, $options)

-Revoke a mail, encapsulated in a C<Mail::Audit> object, as human-verified ham
+Revoke a mail, encapsulated in a C<Mail::SpamAssassin::Message> object, as human-verified ham
(non-spam). This will revoke the mail message from live, collaborative,
spam-blocker databases, allowing other users to block this message.

@@ -737,8 +728,6 @@
my @msg = split (/^/m, $self->remove_spamassassin_markup($mail));
$mail = Mail::SpamAssassin::NoMailAudit->new ('data' => \@msg);

- $mail = $self->encapsulate_mail_object ($mail);
-
# learn as nonspam
$self->learn ($mail, undef, 0, 0);

@@ -855,10 +844,9 @@
my $list = Mail::SpamAssassin::AutoWhitelist->new($self);

$self->init(1);
- my $mail = $self->encapsulate_mail_object ($mail_obj);

my @addrlist = ();
- my @hdrs = $mail->get_header ('From');
+ my @hdrs = $mail_obj->get_header ('From');
if ($#hdrs >= 0) {
push (@addrlist, $self->find_all_addrs_in_line (join (" ", @hdrs)));
}
@@ -949,8 +937,7 @@
}
}

- my $mail = $self->encapsulate_mail_object ($mail_obj);
- my $hdrs = $mail->get_all_headers();
+ my $hdrs = $mail_obj->get_all_headers();

# remove DOS line endings
$hdrs =~ s/\r//gs;
@@ -1001,7 +988,7 @@

my @newbody = ();
my $inreport = 0;
- foreach $_ (@{$mail->get_body()})
+ foreach $_ (@{$mail_obj->get_body()})
{
s/\r?$//; # DOS line endings

@@ -1130,8 +1117,7 @@
$self->init($use_user_prefs);

my $mail = Mail::SpamAssassin::NoMailAudit->new(data => \@testmsg);
- my $encapped = $self->encapsulate_mail_object ($mail);
- my $status = Mail::SpamAssassin::PerMsgStatus->new($self, $encapped,
+ my $status = Mail::SpamAssassin::PerMsgStatus->new($self, $mail,
{ disable_auto_learning => 1 } );
$status->word_is_in_dictionary("aba"); # load triplets.txt into memory
$status->check();
@@ -1174,8 +1160,7 @@
$self->{syntax_errors} += $self->{conf}->{errors};

my $mail = Mail::SpamAssassin::NoMailAudit->new(data => \@testmsg);
- my $encapped = $self->encapsulate_mail_object ($mail);
- my $status = Mail::SpamAssassin::PerMsgStatus->new($self, $encapped,
+ my $status = Mail::SpamAssassin::PerMsgStatus->new($self, $mail,
{ disable_auto_learning => 1 } );
$status->check();

@@ -1472,56 +1457,22 @@

###########################################################################

-sub encapsulate_mail_object {
- my ($self, $mail_obj) = @_;
-
- # first, check to see if this is not actually a Mail::Audit object;
- # it could also be an already-encapsulated Mail::Audit wrapped inside
- # a Mail::SpamAssassin::Message.
- if ($mail_obj->{is_spamassassin_wrapper_object}) {
- return $mail_obj;
- }
-
- if ($self->{use_my_mail_class}) {
- my $class = $self->{use_my_mail_class};
- (my $file = $class) =~ s/::/\//g;
- require "$file.pm";
- return $class->new($mail_obj);
- }
-
- # new versions of Mail::Audit can have one of 2 different base classes. URGH.
- # we can tell which class, by querying the is_mime() method. Support for
- # MIME::Entity contributed by Andrew Wilson <andrew@rivendale.net>.
- #
- my $ismime = 0;
- if ($mail_obj->can ("is_mime")) { $ismime = $mail_obj->is_mime(); }
-
- if ($ismime) {
- require Mail::SpamAssassin::EncappedMIME;
- return Mail::SpamAssassin::EncappedMIME->new($mail_obj);
- } else {
- require Mail::SpamAssassin::EncappedMessage;
- return Mail::SpamAssassin::EncappedMessage->new($mail_obj);
- }
-}
-
sub find_all_addrs_in_mail {
my ($self, $mail_obj) = @_;

$self->init(1);
- my $mail = $self->encapsulate_mail_object ($mail_obj);

my @addrlist = ();
foreach my $header (qw(To From Cc Reply-To Sender
Errors-To Mail-Followup-To))
{
- my @hdrs = $mail->get_header ($header);
+ my @hdrs = $mail_obj->get_header ($header);
if ($#hdrs < 0) { next; }
push (@addrlist, $self->find_all_addrs_in_line (join (" ", @hdrs)));
}

# find addrs in body, too
- foreach my $line (@{$mail->get_body()}) {
+ foreach my $line (@{$mail_obj->get_body()}) {
push (@addrlist, $self->find_all_addrs_in_line ($line));
}

@@ -1602,12 +1553,8 @@

=head1 PREREQUISITES

-C<Mail::Audit>
-C<Mail::Internet>
-
-=head1 COREQUISITES
-
-C<Net::DNS>
+C<HTML::Parser>
+C<Sys::Syslog>

=head1 MORE DOCUMENTATION
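
Note on the SYNOPSIS change above: after this revision, rewrite_mail() appears to return the rewritten message instead of modifying the checked one in place, so callers have to keep the return value. The following is only a minimal sketch of that calling convention. The Sketch::Message and Sketch::Status packages and the "[SPAM]" tag are hypothetical stand-ins invented for illustration; they are not the real Mail::SpamAssassin classes.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a message object (NOT a SpamAssassin class).
package Sketch::Message;
sub new       { my ($class, %opts) = @_; return bless { text => $opts{text} }, $class; }
sub as_string { return $_[0]->{text}; }

# Hypothetical stand-in for a per-message status object.
package Sketch::Status;
sub new {
  my ($class, $msg, $is_spam) = @_;
  return bless { msg => $msg, is_spam => $is_spam }, $class;
}
sub is_spam { return $_[0]->{is_spam}; }

# Returns a NEW message object; the message given to new() is left untouched.
sub rewrite_mail {
  my ($self) = @_;
  my $text = $self->{msg}->as_string();
  $text =~ s/^Subject: /Subject: [SPAM] /m if $self->{is_spam};
  return Sketch::Message->new(text => $text);
}

package main;

my $original = Sketch::Message->new(text => "Subject: hello\n\nbody\n");
my $status   = Sketch::Status->new($original, 1);

# Same shape as the updated SYNOPSIS: keep what rewrite_mail() returns.
my $mail = $original;
$mail = $status->rewrite_mail() if $status->is_spam();

print $mail->as_string();
```

A caller that still ignores the return value would go on using the unmodified message, which is presumably why the SYNOPSIS line was changed.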


Modified: incubator/spamassassin/trunk/lib/Mail/SpamAssassin/EvalTests.pm
==============================================================================
--- incubator/spamassassin/trunk/lib/Mail/SpamAssassin/EvalTests.pm (original)
+++ incubator/spamassassin/trunk/lib/Mail/SpamAssassin/EvalTests.pm Tue Jan 20 14:33:37 2004
@@ -331,7 +331,8 @@
$self->{mta_added_message_id_later} = 0;
$self->{mta_added_message_id_backup} = 0;

- my @received = grep(/\S/, split(/\n/, $self->get('Received')));
+ # We may get headers with continuations in them, so deal with it ...
+ my @received = grep(/\S/, map { s/\r?\n\s+/ /g; $_; } $self->get('Received'));
my $id = $self->get('Resent-Message-ID') || $self->get('Message-ID');
return unless defined($id) && $id;
my $local = 1;
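
Note on the hunk above: the new code joins folded (continued) header lines with the substitution s/\r?\n\s+/ /g instead of splitting on newlines. A minimal, self-contained sketch of that unfolding step follows. The unfold_headers name and the sample Received value are assumptions made for illustration, not part of the commit.

```perl
use strict;
use warnings;

# Join folded header lines: a line break followed by whitespace becomes
# a single space. Entries with no non-whitespace content are dropped.
sub unfold_headers {
  my @values = @_;               # copies, so the caller's data is untouched
  s/\r?\n\s+/ /g for @values;
  return grep { /\S/ } @values;
}

my @received = unfold_headers(
  "from a.example.com by b.example.com\n\twith SMTP; Tue, 20 Jan 2004\n",
  "\n",
);

print scalar(@received), "\n";
print $received[0];
```

The trailing newline of each value survives, because the pattern only matches a line break that is followed by more whitespace.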

Modified: incubator/spamassassin/trunk/lib/Mail/SpamAssassin/MIME.pm
==============================================================================
--- incubator/spamassassin/trunk/lib/Mail/SpamAssassin/MIME.pm (original)
+++ incubator/spamassassin/trunk/lib/Mail/SpamAssassin/MIME.pm Tue Jan 20 14:33:37 2004
@@ -63,6 +63,9 @@
use strict;
use MIME::Base64;
use Mail::SpamAssassin;
+use Mail::SpamAssassin::HTML;
+use MIME::Base64;
+use MIME::QuotedPrint;

# M::SA::MIME is an object method used to encapsulate a message's MIME part
#
@@ -121,30 +124,26 @@
$key =~ s/\s+$//;

if (@_) {
- my ( $decoded_value, $raw_value ) = @_;
- $raw_value = $decoded_value unless defined $raw_value;
+ my $raw_value = shift;
push @{ $self->{'header_order'} }, $rawkey;
- if ( exists $self->{'headers'}{$key} ) {
- push @{ $self->{'headers'}{$key} }, $decoded_value;
- push @{ $self->{'raw_headers'}{$key} }, $raw_value;
+ if ( !exists $self->{'headers'}->{$key} ) {
+ $self->{'headers'}->{$key} = [];
+ $self->{'raw_headers'}->{$key} = [];
}
- else {
- $self->{'headers'}{$key} = [$decoded_value];
- $self->{'raw_headers'}{$key} = [$raw_value];
- }
- return $self->{'headers'}{$key}[-1];
+
+ push @{ $self->{'headers'}->{$key} }, _decode_header($raw_value);
+ push @{ $self->{'raw_headers'}->{$key} }, $raw_value;
+
+ return $self->{'headers'}->{$key}->[-1];
}

- my $want = wantarray;
- if ( defined($want) ) {
- if ($want) {
- return unless exists $self->{'headers'}{$key};
- return @{ $self->{'headers'}{$key} };
- }
- else {
- return '' unless exists $self->{'headers'}{$key};
- return $self->{'headers'}{$key}[-1];
- }
+ if (wantarray) {
+ return unless exists $self->{'headers'}->{$key};
+ return @{ $self->{'headers'}->{$key} };
+ }
+ else {
+ return '' unless exists $self->{'headers'}->{$key};
+ return $self->{'headers'}->{$key}->[-1];
}
}

@@ -159,12 +158,12 @@
$key =~ s/\s+$//;

if (wantarray) {
- return unless exists $self->{'raw_headers'}{$key};
- return @{ $self->{'raw_headers'}{$key} };
+ return unless exists $self->{'raw_headers'}->{$key};
+ return @{ $self->{'raw_headers'}->{$key} };
}
else {
- return '' unless exists $self->{'raw_headers'}{$key};
- return $self->{'raw_headers'}{$key}[-1];
+ return '' unless exists $self->{'raw_headers'}->{$key};
+ return $self->{'raw_headers'}->{$key}->[-1];
}
}

@@ -316,6 +315,58 @@
return $self->{'type'};
}
}
+
+sub delete_header {
+ my($self, $hdr) = @_;
+
+ foreach ( grep(/^${hdr}$/i, keys %{$self->{'headers'}}) ) {
+ delete $self->{'headers'}->{$_};
+ delete $self->{'raw_headers'}->{$_};
+ }
+
+ my @neworder = grep(!/^${hdr}$/i, @{$self->{'header_order'}});
+ $self->{'header_order'} = \@neworder;
+}
+
+sub __decode_header {
+ my ( $encoding, $cte, $data ) = @_;
+
+ if ( $cte eq 'B' ) {
+ # base 64 encoded
+ return Mail::SpamAssassin::Util::base64_decode($data);
+ }
+ elsif ( $cte eq 'Q' ) {
+ # quoted printable
+ return Mail::SpamAssassin::Util::qp_decode($data);
+ }
+ else {
+ die "Unknown encoding type '$cte' in RFC2047 header";
+ }
+}
+
+=item _decode_header()
+
+Decode base64 and quoted-printable in headers according to RFC2047.
+
+=cut
+
+sub _decode_header {
+ my($header) = @_;
+
+ return '' unless $header;
+
+ # deal with folding and cream the newlines and such
+ $header =~ s/\n[ \t]+/\n /g;
+ $header =~ s/\r?\n//g;
+
+ return $header unless $header =~ /=\?/;
+
+ $header =~
+ s/=\?([\w_-]+)\?([bqBQ])\?(.*?)\?=/__decode_header($1, uc($2), $3)/ge;
+
+ return $header;
+}
+

sub dbg { Mail::SpamAssassin::dbg (@_); }
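
Note on _decode_header() above: it unfolds the header and then replaces each RFC 2047 encoded word by its decoded text. The sketch below shows the same idea using the core MIME::Base64 and MIME::QuotedPrint modules rather than the Mail::SpamAssassin::Util helpers the commit calls. The function names decode_word and decode_header_sketch are hypothetical, and the underscore handling is an addition based on RFC 2047, not something visible in the diff.

```perl
use strict;
use warnings;
use MIME::Base64 qw(decode_base64);
use MIME::QuotedPrint qw(decode_qp);

# Decode the payload of one encoded word; $cte is 'B' or 'Q'.
sub decode_word {
  my ($cte, $data) = @_;
  return decode_base64($data) if $cte eq 'B';
  if ($cte eq 'Q') {
    $data =~ tr/_/ /;            # RFC 2047: "_" stands for a space in Q encoding
    return decode_qp($data);
  }
  return $data;                  # unreachable with the pattern used below
}

sub decode_header_sketch {
  my ($header) = @_;
  return '' unless defined $header && length $header;

  $header =~ s/\r?\n[ \t]+/ /g;  # unfold continuation lines
  $header =~ s/\r?\n//g;         # drop any remaining line breaks
  return $header unless $header =~ /=\?/;

  # =?charset?encoding?payload?=  (the charset is matched but not used here)
  $header =~ s/=\?([\w-]+)\?([bqBQ])\?(.*?)\?=/decode_word(uc($2), $3)/ge;
  return $header;
}

print decode_header_sketch("=?iso-8859-1?Q?Hello_World?="), "\n";
print decode_header_sketch("=?us-ascii?B?SGVsbG8=?="), "\n";
```

Like the code in the diff, this sketch ignores the declared charset and returns the decoded bytes as they are.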


Modified: incubator/spamassassin/trunk/lib/Mail/SpamAssassin/MIME/Parser.pm
==============================================================================
--- incubator/spamassassin/trunk/lib/Mail/SpamAssassin/MIME/Parser.pm (original)
+++ incubator/spamassassin/trunk/lib/Mail/SpamAssassin/MIME/Parser.pm Tue Jan 20 14:33:37 2004
@@ -22,9 +22,6 @@

use Mail::SpamAssassin;
use Mail::SpamAssassin::MIME;
-use Mail::SpamAssassin::HTML;
-use MIME::Base64;
-use MIME::QuotedPrint;

=item parse()

@@ -70,28 +67,34 @@
my $msg = Mail::SpamAssassin::MIME->new();
my $header = '';

+ # Go through all the headers of the message
while ( my $last = shift @message ) {
+ # Store the non-modified headers in a scalar
$msg->{'pristine_headers'} .= $last;
- $last =~ s/\r?\n//;

# NB: Really need to figure out special folding rules here!
- if ( $last =~ s/^[ \t]+// ) { # if its a continuation
- $header .= " $last"; # fold continuations
+ if ( $last =~ /^[ \t]+/ ) { # if its a continuation
+ $header .= $last; # fold continuations
next;
}

+ # Ok, there's a header here, let's go ahead and add it in.
if ($header) {
my ( $key, $value ) = split ( /:\s*/, $header, 2 );
- $msg->header( $key, $self->_decode_header($value), $value );
+ $msg->header( $key, $value );
}

# not a continuation...
$header = $last;

- last if ( $last =~ /^$/m );
+ # Ok, we found the header/body blank line ...
+ last if ( $last =~ /^\r?$/m );
}

- #$msg->{'pristine_body'} = \@message;
+ # Store the pristine body for later -- store as a copy since @message will get modified below
+ $msg->{'pristine_body'} = join('', @message);
+
+ # Figure out the boundary
my ($boundary);
($msg->{'type'}, $boundary) = Mail::SpamAssassin::Util::parse_content_type($msg->header('content-type'));
dbg("main message type: ".$msg->{'type'});
@@ -277,39 +280,6 @@
# BTW: please leave this after add_body_parts() since it'll add it back.
#
delete $part_msg->{body_parts};
-}
-
-sub __decode_header {
- my ( $encoding, $cte, $data ) = @_;
-
- if ( $cte eq 'B' ) {
- # base 64 encoded
- return Mail::SpamAssassin::Util::base64_decode($data);
- }
- elsif ( $cte eq 'Q' ) {
- # quoted printable
- return Mail::SpamAssassin::Util::qp_decode($data);
- }
- else {
- die "Unknown encoding type '$cte' in RFC2047 header";
- }
-}
-
-=item _decode_header()
-
-Decode base64 and quoted-printable in headers according to RFC2047.
-
-=cut
-
-sub _decode_header {
- my($self, $header) = @_;
-
- return '' unless $header;
- return $header unless $header =~ /=\?/;
-
- $header =~
- s/=\?([\w_-]+)\?([bqBQ])\?(.*?)\?=/__decode_header($1, uc($2), $3)/ge;
- return $header;
}

sub dbg { Mail::SpamAssassin::dbg (@_); }
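
Note on the parse() changes above: the header loop now keeps continuation lines attached to their header, hands the raw value to header(), and stores the remaining lines as the pristine body. A simplified sketch of that loop is shown below. The split_message name and its return shape are assumptions for illustration, and unlike the diff the sketch leaves the blank separator line out of the pristine header text.

```perl
use strict;
use warnings;

# Split a message (list of lines) into header pairs, pristine header text
# and body text. Continuation lines stay attached to the header they extend.
sub split_message {
  my @message = @_;
  my $pristine_headers = '';
  my @headers;                       # list of [name, raw value] pairs
  my $current = '';

  while (defined(my $line = shift @message)) {
    last if $line =~ /^\r?\n?\z/;    # blank line separates headers from body
    $pristine_headers .= $line;

    if ($line =~ /^[ \t]/ && $current ne '') {
      $current .= $line;             # continuation of the previous header
      next;
    }

    push @headers, [ split(/:\s*/, $current, 2) ] if $current ne '';
    $current = $line;
  }
  push @headers, [ split(/:\s*/, $current, 2) ] if $current ne '';

  # Whatever is left in @message is the body.
  return (\@headers, $pristine_headers, join('', @message));
}

my ($headers, $pristine, $body) = split_message(
  "Subject: a long\n", "\tsubject line\n", "From: user\@example.com\n",
  "\n", "body text\n",
);

print scalar(@$headers), " headers\n";
print $body;
```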

Modified: incubator/spamassassin/trunk/lib/Mail/SpamAssassin/NoMailAudit.pm
==============================================================================
--- incubator/spamassassin/trunk/lib/Mail/SpamAssassin/NoMailAudit.pm (original)
+++ incubator/spamassassin/trunk/lib/Mail/SpamAssassin/NoMailAudit.pm Tue Jan 20 14:33:37 2004
@@ -71,233 +71,69 @@

use strict;
use bytes;
-use Fcntl qw(:DEFAULT :flock);

-use Mail::SpamAssassin::Message;
use Mail::SpamAssassin::MIME;
use Mail::SpamAssassin::MIME::Parser;

-@Mail::SpamAssassin::NoMailAudit::ISA = (
- 'Mail::SpamAssassin::Message'
-);
-
# ---------------------------------------------------------------------------

sub new {
my $class = shift;
my %opts = @_;

- my $self = $class->SUPER::new();
-
- $self->{is_spamassassin_wrapper_object} = 1;
- $self->{has_spamassassin_methods} = 1;
- $self->{headers} = { };
- $self->{header_order} = [ ];
+ my $self = {
+ mime_parts => Mail::SpamAssassin::MIME::Parser->parse($opts{'data'} || \*STDIN),
+ };

bless ($self, $class);
-
- # data may be filehandle (default stdin) or arrayref
- my $data = $opts{data} || \*STDIN;
-
- if (ref $data eq 'ARRAY') {
- $self->{textarray} = $data;
- } elsif (ref $data eq 'GLOB') {
- if (defined fileno $data) {
- $self->{textarray} = [ <$data> ];
- }
- }
-
- # Parse the message for MIME parts
- $self->{mime_parts} = Mail::SpamAssassin::MIME::Parser->parse($self->{textarray});
-
- # Parse the message to get header information
- $self->parse_headers();
return $self;
}

# ---------------------------------------------------------------------------

-sub parse_headers {
- my ($self) = @_;
- local ($_);
-
- $self->{headers} = { };
- $self->{header_order} = [ ];
- my ($prevhdr, $hdr, $val, $entry);
-
- while (defined ($_ = shift @{$self->{textarray}})) {
- # warn "parse_headers $_";
- if (/^\r*$/) { last; }
-
- $entry = $hdr = $val = undef;
-
- if (/^\s/) {
- if (defined $prevhdr) {
- $hdr = $prevhdr; $val = $_;
- $val =~ s/\r+\n/\n/gs; # trim CRs, we don't want them
- $entry = $self->{headers}->{$hdr};
- $entry->{$entry->{count} - 1} .= $val;
- next;
-
- } else {
- $hdr = "X-Mail-Format-Warning";
- $val = "No previous line for continuation: $_";
- $entry = $self->_get_or_create_header_object ($hdr);
- $entry->{added} = 1;
- }
-
- } elsif (/^From /) {
- $self->{from_line} = $_;
- next;
-
- } elsif (/^([\x21-\x39\x3B-\x7E]+):\s*(.*)$/s) {
- # format of a header, as defined by RFC 2822 section 3.6.8;
- # 'Any character except controls, SP, and ":".'
- $hdr = $1; $val = $2;
- $val =~ s/\r+//gs; # trim CRs, we don't want them
- $entry = $self->_get_or_create_header_object ($hdr);
- $entry->{original} = 1;
-
- } else {
- $hdr = "X-Mail-Format-Warning";
- $val = "Bad RFC2822 header formatting in $_";
- $entry = $self->_get_or_create_header_object ($hdr);
- $entry->{added} = 1;
- }
-
- $self->_add_header_to_entry ($entry, $hdr, $val);
- $prevhdr = $hdr;
- }
-}
-
-sub _add_header_to_entry {
- my ($self, $entry, $hdr, $line, $order) = @_;
-
- # Do a normal push if no specific order # is set.
- $order ||= @{$self->{header_order}};
-
- # ensure we have line endings
- if ($line !~ /\n$/s) { $line .= "\n"; }
-
- # Store this header
- $entry->{$entry->{count}} = $line;
-
- # Push the header and which count it is in header_order
- splice @{$self->{header_order}}, $order, 0, $hdr.":".$entry->{count};
-
- # Increase the count of this header type
- $entry->{count}++;
-}
-
-sub _get_or_create_header_object {
- my ($self, $hdr) = @_;
-
- if (!defined $self->{headers}->{$hdr}) {
- $self->{headers}->{$hdr} = {
- 'count' => 0,
- 'added' => 0,
- 'original' => 0
- };
- }
- return $self->{headers}->{$hdr};
-}
-
-# ---------------------------------------------------------------------------
-
-sub _get_header_list {
- my ($self, $hdr, $header_name_only) = @_;
-
- # OK, we want to do a case-insensitive match here on the header name
- # So, first I'm going to pick up an array of the actual capitalizations used:
- my $lchdr = lc $hdr;
- my @cap_hdrs = grep(lc($_) eq $lchdr, keys(%{$self->{headers}}));
-
- # If the request is just for the list of headers names that matched only ...
- if ( defined $header_name_only && $header_name_only ) {
- return @cap_hdrs;
- }
- else {
- # return the values in each of the headers
- return map($self->{headers}->{$_},@cap_hdrs);
- }
-}
-
sub get_pristine_header {
my ($self, $hdr) = @_;
+
+ return $self->{mime_parts}->{pristine_headers} unless $hdr;
my(@ret) = $self->{mime_parts}->{pristine_headers} =~ /^(?:$hdr:[ ]+(.*\n(?:\s+\S.*\n)*))/mig;
if (@ret) {
- return wantarray ? @ret : $ret[0];
+ return wantarray ? @ret : $ret[-1];
}
else {
return $self->get_header($hdr);
}
}

+#sub get { shift->get_header(@_); }
sub get_header {
my ($self, $hdr) = @_;

# And now pick up all the entries into a list
- my @entries = $self->_get_header_list($hdr);
-
- if (!wantarray) {
- # If there is no header like that, return undef
- if (scalar(@entries) < 1 ) { return undef; }
- foreach my $entry (@entries) {
- if($entry->{count} > 0) {
- my $ret = $entry->{0};
- $ret =~ s/^\s+//;
- $ret =~ s/\n\s+/ /g;
- return $ret;
- }
- }
- return undef;
-
- } else {
-
- if(scalar(@entries) < 1) { return ( ); }
-
- my @ret = ();
- # loop through each entry and collect all the individual matching lines
- foreach my $entry (@entries)
- {
- foreach my $i (0 .. ($entry->{count}-1)) {
- my $ret = $entry->{$i};
- $ret =~ s/^\s+//;
- $ret =~ s/\n\s+/ /g;
- push (@ret, $ret);
- }
- }
-
- return @ret;
+ # This is assumed to include a newline at the end ...
+ # This is also assumed to have removed continuation bits ...
+ my @hdrs;
+ foreach ( $self->{'mime_parts'}->raw_header($hdr) ) {
+ s/\r?\n\s+/ /g;
+ push(@hdrs, $_);
}
-}

-sub put_header {
- my ($self, $hdr, $text, $order) = @_;
-
- my $entry = $self->_get_or_create_header_object ($hdr);
- $self->_add_header_to_entry ($entry, $hdr, $text, $order);
- if (!$entry->{original}) { $entry->{added} = 1; }
+ if (wantarray) {
+ return @hdrs;
+ }
+ else {
+ return $hdrs[-1];
+ }
}

+#sub header { shift->get_all_headers(@_); }
sub get_all_headers {
my ($self) = @_;

+ my %cache = ();
my @lines = ();
- # warn "JMD".join (' ', caller);

- push(@lines, $self->{from_line}) if ( defined $self->{from_line} );
- foreach my $hdrcode (@{$self->{header_order}}) {
- $hdrcode =~ /^([^:]+):(\d+)$/ or next;
-
- my $hdr = $1;
- my $num = $2;
- my $entry = $self->{headers}->{$hdr};
- next unless defined($entry);
-
- my $text = $hdr.": ".$entry->{$num};
- if ($text !~ /\n$/s) { $text .= "\n"; }
- push (@lines, $text);
+ foreach ( @{$self->{mime_parts}->{header_order}} ) {
+ push(@lines, "$_: ".($self->get_header($_))[$cache{$_}++]);
}

if (wantarray) {
@@ -307,118 +143,38 @@
}
}

-sub replace_header {
- my ($self, $hdr, $text) = @_;
-
- # Figure out where the first case insensitive header of this name is stored.
- # We'll use this to add the new header with the same case and in the order.
- my($casehdr,$order) = ($hdr,undef);
- my $lchdr = lc "$hdr:0"; # just lc it once
-
- # Now find the header
- for ( my $count = 0; $count <= @{$self->{header_order}}; $count++ ) {
- next unless (lc $self->{header_order}->[$count] eq $lchdr);
-
- # Remember where in the order the header is, and the case of said header.
- $order = $count;
- ($casehdr = $self->{header_order}->[$count]) =~ s/:\d+$//;
-
- last;
- }
-
- # Remove all instances of this header
- $self->delete_header ($hdr);
-
- # Add the new header with correctly cased header and in the right place
- return $self->put_header($casehdr, $text, $order);
-}
-
sub delete_header {
my ($self, $hdr) = @_;
-
- # Delete all versions of the header, case insensitively
- foreach my $dhdr ( $self->_get_header_list($hdr,1) ) {
- @{$self->{header_order}} = grep( rindex($_,"$dhdr:",0) != 0, @{$self->{header_order}} );
- delete $self->{headers}->{$dhdr};
- }
+ $self->{mime_parts}->delete_header($hdr);
}

+#sub body { return shift->get_body(@_); }
sub get_body {
my ($self) = @_;
- return $self->{textarray};
-}
-
-sub replace_body {
- my ($self, $aryref) = @_;
- $self->{textarray} = $aryref;
+ my @ret = split(/^/m, $self->get_pristine_body());
+ return \@ret;
}

# ---------------------------------------------------------------------------

sub get_pristine {
my ($self) = @_;
- return join ('', $self->{mime_parts}->{pristine_headers}, @{ $self->{textarray} });
+ return $self->{mime_parts}->{pristine_headers} . $self->{mime_parts}->{pristine_body};
}

sub get_pristine_body {
my ($self) = @_;
- return join ('', @{ $self->{textarray} });
+ return $self->{mime_parts}->{pristine_body};
}

sub as_string {
my ($self) = @_;
- return join ('', $self->get_all_headers(), "\n",
- @{$self->get_body()});
-}
-
-sub replace_original_message {
- my ($self, $data) = @_;
-
- if (ref $data eq 'ARRAY') {
- $self->{textarray} = $data;
- } elsif (ref $data eq 'GLOB') {
- if (defined fileno $data) {
- $self->{textarray} = [ <$data> ];
- }
- }
-
- $self->parse_headers();
-}
-
-# ---------------------------------------------------------------------------
-# Mail::Audit emulation methods.
-
-sub get { shift->get_header(@_); }
-sub header { shift->get_all_headers(@_); }
-
-sub body {
- my ($self) = shift;
- my $replacement = shift;
-
- if (defined $replacement) {
- $self->replace_body ($replacement);
- } else {
- return $self->get_body();
- }
+ return $self->get_all_headers() . "\n" . $self->{mime_parts}->{pristine_body};
}

sub ignore {
my ($self) = @_;
exit (0) unless $self->{noexit};
-}
-
-# ---------------------------------------------------------------------------
-
-# does not need to be called it seems. still, keep it here in case of
-# emergency.
-sub finish {
- my $self = shift;
- delete $self->{textarray};
- foreach my $key (keys %{$self->{headers}}) {
- delete $self->{headers}->{$key};
- }
- delete $self->{headers};
- delete $self->{mail_object};
}

1;
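
Note on the new get_all_headers() above: it walks header_order and uses a per-name counter (%cache) to pick the next stored value each time a header name repeats, so repeated headers such as Received come out in their original positions. A minimal sketch of that counting technique follows. The all_headers function and the sample data are hypothetical, and the sketch uses exact-case names, whereas the code in the diff looks headers up through get_header().

```perl
use strict;
use warnings;

# Hypothetical data laid out like header_order plus per-name value lists.
my @header_order = ('Received', 'From', 'Received');
my %headers = (
  'Received' => [ "from a.example.com\n", "from b.example.com\n" ],
  'From'     => [ "user\@example.com\n" ],
);

# Rebuild "Name: value" lines in the original order. %seen counts how many
# times each name has been emitted, which selects the next stored value.
sub all_headers {
  my ($order, $values) = @_;
  my %seen;
  my @lines;
  foreach my $name (@$order) {
    my $index = $seen{$name}++;      # 0 for the first occurrence, then 1, ...
    push @lines, "$name: " . $values->{$name}[$index];
  }
  return @lines;
}

my @lines = all_headers(\@header_order, \%headers);
print @lines;
```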

Modified: incubator/spamassassin/trunk/lib/Mail/SpamAssassin/PerMsgStatus.pm
==============================================================================
--- incubator/spamassassin/trunk/lib/Mail/SpamAssassin/PerMsgStatus.pm (original)
+++ incubator/spamassassin/trunk/lib/Mail/SpamAssassin/PerMsgStatus.pm Tue Jan 20 14:33:37 2004
@@ -164,7 +164,7 @@
# TODO: change this to do whitelist/blacklists first? probably a plan
# NOTE: definitely need AWL stuff last, for regression-to-mean of score

- $self->clean_spamassassin_headers();
+ $self->{msg}->delete_header('X-Spam-.*');
$self->{learned_hits} = 0;
$self->{body_only_hits} = 0;
$self->{head_only_hits} = 0;
@@ -586,14 +586,11 @@
my ($self) = @_;

if ($self->{is_spam} && $self->{conf}->{report_safe}) {
- $self->rewrite_as_spam();
+ return $self->rewrite_as_spam();
}
else {
- $self->rewrite_headers();
+ return $self->rewrite_headers();
}
-
- # invalidate the header cache, we've changed some of them.
- $self->{hdr_cache} = { };
}

# rewrite the entire message as spam (headers and body)
@@ -753,44 +750,49 @@
EOM

my @lines = split (/^/m, $newmsg);
- $self->{msg}->replace_original_message(\@lines);
+ return Mail::SpamAssassin::NoMailAudit->new(data => \@lines);
}

sub rewrite_headers {
my ($self) = @_;

- if($self->{is_spam}) {
+ # put the pristine headers into an array
+ my(@pristine_headers) = $self->{msg}->get_pristine_header() =~ /^([^:]+:[ ]+(?:.*\n(?:\s+\S.*\n)*))/mig;
+ my $addition = 'headers_ham';

+ if($self->{is_spam}) {
# Deal with header rewriting
- foreach my $header (keys %{$self->{conf}->{rewrite_header}}) {
- $_ = $self->{msg}->get_header($header);
- my $tag = $self->_replace_tags($self->{conf}->{rewrite_header}->{$header});
+ while ( my($header, $value) = each %{$self->{conf}->{rewrite_header}}) {
+ unless ( $header =~ /^(?:Subject|From|To)$/ ) {
+ dbg("rewrite: ignoring $header = $value");
+ next;
+ }
+
+ # Figure out the rewrite piece
+ my $tag = $self->_replace_tags($value);
$tag =~ s/\n/ /gs;
- if ($header eq 'Subject') {
- s/^(?:\Q${tag}\E |)/${tag} /;
- }
- elsif ($header =~ /From|To/) {
- s/(?:\t\Q(${tag})\E|)$/\t(${tag})/;
- }
- $self->{msg}->replace_header($header,$_);
- }

- # Deal with header adding
- foreach my $header (keys %{$self->{conf}->{headers_spam}} ) {
- my $data = $self->{conf}->{headers_spam}->{$header};
- my $line = $self->_process_header($header,$data) || "";
- $self->{msg}->put_header ("X-Spam-$header", $line);
- }
+ # The tag should be a comment for this header ...
+ $tag = "($tag)" if ( $header =~ /^(?:From|To)$/ );

- } else {
-
- foreach my $header (keys %{$self->{conf}->{headers_ham}} ) {
- my $data = $self->{conf}->{headers_ham}->{$header};
- my $line = $self->_process_header($header,$data) || "";
- $self->{msg}->put_header ("X-Spam-$header", $line);
+ # Go ahead and markup the headers
+ foreach ( @pristine_headers ) {
+ # skip non-correct-header or headers that are already tagged
+ next if ( !/^${header}:/i );
+ s/^([^:]+:[ ]*)(?:\Q${tag}\E )?/$1${tag} /i;
+ }
}

+ $addition = 'headers_spam';
}
+
+ while ( my($header, $data) = each %{$self->{conf}->{$addition}} ) {
+ my $line = $self->_process_header($header,$data) || "";
+ push(@pristine_headers, "X-Spam-$header: $line\n");
+ }
+
+ push(@pristine_headers, "\n", split (/^/m, $self->{msg}->get_pristine_body()));
+ return Mail::SpamAssassin::NoMailAudit->new(data => \@pristine_headers);
}

sub _process_header {
@@ -907,28 +909,6 @@

###########################################################################

-=item $messagestring = $status->get_full_message_as_text ()
-
-Returns the mail message as a string, including headers and raw body text.
-
-If the message has been rewritten using C<rewrite_mail()>, these changes
-will be reflected in the string.
-
-Note: this is simply a helper method which calls methods on the mail message
-object. It is provided because Mail::Audit uses an unusual (ie. not quite
-intuitive) interface to do this, and it has been a common stumbling block for
-authors of scripts which use SpamAssassin.
-
-=cut
-
-sub get_full_message_as_text {
- my ($self) = @_;
- return join ("", $self->{msg}->get_all_headers(), "\n",
- @{$self->{msg}->get_body()});
-}
-
-###########################################################################
-
=item $status->finish ()

Indicate that this C<$status> object is finished with, and can be destroyed.
@@ -1355,7 +1335,7 @@
else {
my @hdrs = $self->{msg}->get_header ($hdrname);
if ($#hdrs >= 0) {
- $_ = join ("\n", @hdrs);
+ $_ = join ('', @hdrs);
}
else {
$_ = undef;
@@ -2422,34 +2402,6 @@

sub dbg { Mail::SpamAssassin::dbg (@_); }
sub sa_die { Mail::SpamAssassin::sa_die (@_); }
-
-###########################################################################
-
-sub clean_spamassassin_headers {
- my ($self) = @_;
-
- # attempt to restore original headers
- for my $hdr (('Content-Transfer-Encoding', 'Content-Type', 'Return-Receipt-To')) {
- my $prev = $self->{msg}->get_header ("X-Spam-Prev-$hdr");
- if (defined $prev && $prev ne '') {
- $self->{msg}->replace_header ($hdr, $prev);
- }
- }
- # delete the SpamAssassin-added headers
- $self->{msg}->delete_header ("X-Spam-Checker-Version");
- $self->{msg}->delete_header ("X-Spam-Flag");
- $self->{msg}->delete_header ("X-Spam-Level");
- $self->{msg}->delete_header ("X-Spam-Prev-Content-Transfer-Encoding");
- $self->{msg}->delete_header ("X-Spam-Prev-Content-Type");
- $self->{msg}->delete_header ("X-Spam-Report");
- $self->{msg}->delete_header ("X-Spam-Status");
- foreach my $header (keys %{$self->{conf}->{headers_spam}} ) {
- $self->{msg}->delete_header ("X-Spam-$header");
- }
- foreach my $header (keys %{$self->{conf}->{headers_ham}} ) {
- $self->{msg}->delete_header ("X-Spam-$header");
- }
-}

###########################################################################
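
Note on the rewrite_headers change above: the new code splits the pristine header block into one array element per header (folded continuation lines stay attached) and then prefixes the tag with a substitution that first consumes an existing copy of the tag. The following is a minimal standalone sketch of that behaviour, not the module's actual code path. The two regexes are copied from the diff; the sample header text and the helper name tag_header are hypothetical and exist only for illustration.

```perl
use strict;
use warnings;

# Hypothetical sample header block (not taken from the commit).
my $pristine = "From: alice\@example.com\n"
             . "Subject: cheap pills\n"
             . "Received: from a by b\n"
             . "\twith SMTP\n";

# Same pattern as in the diff: one element per header, with folded
# continuation lines kept in the same element as the header they belong to.
my @headers = $pristine =~ /^([^:]+:[ ]+(?:.*\n(?:\s+\S.*\n)*))/mig;

# Hypothetical helper wrapping the substitution from the diff. The optional
# group swallows an existing "tag + space", so the tag is never doubled.
sub tag_header {
    my ($headers, $name, $tag) = @_;
    foreach (@$headers) {
        next unless /^\Q$name\E:/i;    # only touch the requested header
        s/^([^:]+:[ ]*)(?:\Q${tag}\E )?/$1${tag} /i;
    }
}

tag_header(\@headers, 'Subject', '*****SPAM*****');
tag_header(\@headers, 'Subject', '*****SPAM*****');   # second pass changes nothing

print @headers;
```

Because the substitution is idempotent, running an already tagged message through the same step again should leave the Subject line unchanged, which appears to be the intent of the "already tagged" comment in the diff.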


Modified: incubator/spamassassin/trunk/lib/Mail/SpamAssassin/Received.pm
==============================================================================
--- incubator/spamassassin/trunk/lib/Mail/SpamAssassin/Received.pm (original)
+++ incubator/spamassassin/trunk/lib/Mail/SpamAssassin/Received.pm Tue Jan 20 14:33:37 2004
@@ -342,17 +342,17 @@
# so protect against that here. These will not appear in the final
# message; they're just used internally.

- if ($self->{msg}->can ("delete_header")) {
- $self->{msg}->delete_header ("X-Spam-Relays-Trusted");
- $self->{msg}->delete_header ("X-Spam-Relays-Untrusted");
-
- if ($self->{msg}->can ("put_metadata")) {
- $self->{msg}->put_metadata ("X-Spam-Relays-Trusted",
- $self->{relays_trusted_str});
- $self->{msg}->put_metadata ("X-Spam-Relays-Untrusted",
- $self->{relays_untrusted_str});
- }
- }
+# if ($self->{msg}->can ("delete_header")) {
+# $self->{msg}->delete_header ("X-Spam-Relays-Trusted");
+# $self->{msg}->delete_header ("X-Spam-Relays-Untrusted");
+#
+# if ($self->{msg}->can ("put_metadata")) {
+# $self->{msg}->put_metadata ("X-Spam-Relays-Trusted",
+# $self->{relays_trusted_str});
+# $self->{msg}->put_metadata ("X-Spam-Relays-Untrusted",
+# $self->{relays_untrusted_str});
+# }
+# }

$self->{tag_data}->{RELAYSTRUSTED} = $self->{relays_trusted_str};
$self->{tag_data}->{RELAYSUNTRUSTED} = $self->{relays_untrusted_str};

Modified: incubator/spamassassin/trunk/masses/mass-check
==============================================================================
--- incubator/spamassassin/trunk/masses/mass-check (original)
+++ incubator/spamassassin/trunk/masses/mass-check Tue Jan 20 14:33:37 2004
@@ -290,7 +290,7 @@
$ma->{noexit} = 1;

# remove SpamAssassin markup, if present and the mail was spam
- $_ = $ma->get ("X-Spam-Status");
+ $_ = $ma->get_header ("X-Spam-Status");
if (defined($_) && /^Yes, hits=/) {
my $newtext = $spamtest->remove_spamassassin_markup($ma);
my @newtext = split (/^/m, $newtext);
@@ -328,12 +328,12 @@
my $tests = join(",", sort(grep(length,$status->get_names_of_tests_hit(),$status->get_names_of_subtests_hit())));
my $extra = join(",", @extra);

- if (defined $opt_rewrite) {
- $status->rewrite_mail();
- open(REWRITE, "> " . ($opt_rewrite ? $opt_rewrite : "/tmp/out"));
- print REWRITE $status->get_full_message_as_text();
- close(REWRITE);
- }
+# if (defined $opt_rewrite) {
+# $status->rewrite_mail();
+# open(REWRITE, "> " . ($opt_rewrite ? $opt_rewrite : "/tmp/out"));
+# print REWRITE $status->get_full_message_as_text();
+# close(REWRITE);
+# }

$id =~ s/\s/_/g;


Modified: incubator/spamassassin/trunk/spamassassin.raw
==============================================================================
--- incubator/spamassassin/trunk/spamassassin.raw (original)
+++ incubator/spamassassin/trunk/spamassassin.raw Tue Jan 20 14:33:37 2004
@@ -208,20 +208,17 @@
$mail->ignore(); # will exit
}

-# not reporting? OK, do checks instead. Create a status object which
-# holds details of the message's spam/not-spam status.
+ # not reporting? OK, do checks instead. Create a status object which
+ # holds details of the message's spam/not-spam status.
my $status = $spamtest->check ($mail);
- $status->rewrite_mail ();
+ $mail = $status->rewrite_mail ();
+
+ print $mail->get_pristine();

if ($opt{'test-mode'}) {
- # add the spam report to the end of the body as well, if testing.
- my $lines = $mail->body();
- push (@{$lines}, split (/$/, $status->get_report()));
- $mail->body ($lines);
+ print $status->get_report();
}

-# if we're piping it, deliver it to stdout.
- print $mail->header(), "\n", join ('', @{$mail->body()});
if (defined $opt{'error-code'} && $status->is_spam ()) { exit ($opt{'error-code'} || 5) ; }
exit;

@@ -642,7 +639,6 @@
sa-learn(1)
Mail::SpamAssassin(3)
Mail::SpamAssassin::Conf(3)
-Mail::Audit(3)
Razor(3)

=head1 BUGS
@@ -652,15 +648,6 @@
=head1 AUTHOR

Justin Mason E<lt>jm /at/ jmason.orgE<gt>
-
-=head1 PREREQUISITES
-
-C<Mail::Audit>
-
-=head1 COREQUISITES
-
-C<Net::DNS>
-C<Razor>

=cut


Modified: incubator/spamassassin/trunk/spamd/spamd.raw
==============================================================================
--- incubator/spamassassin/trunk/spamd/spamd.raw (original)
+++ incubator/spamassassin/trunk/spamd/spamd.raw Tue Jan 20 14:33:37 2004
@@ -760,10 +760,10 @@
my $spamhdr = "Spam: $response_spam_status ; $msg_score / $msg_threshold";

if ($method eq 'PROCESS') {
- $status->rewrite_mail; #if $status->is_spam;
+ $mail = $status->rewrite_mail; #if $status->is_spam;

# Build the message to send back and measure it
- my $msg_resp = join '',$mail->header,"\n",@{$mail->body};
+ my $msg_resp = $mail->as_string();
my $msg_resp_length = length($msg_resp);
if($version >= 1.3) # Spamc protocol 1.3 means multi hdrs are OK
{