On Mon, 7 Apr 2014, Dave Warren wrote:
> On 2014-04-06 17:21, John Hardin wrote:
>> On Sun, 6 Apr 2014, Dave Warren wrote:
>>
>> > Is older ham useful? It specifically mentions that older spam isn't
>> > useful, and why, but I'm thinking older ham is probably useful since old
>> > mail clients and legitimately sent mail never dies. But I could filter
>> > based on date.
>>
>> There's some debate about that. :)
>>
>> I personally agree with you. Others disagree.
>
> I've been giving it some thought and I think that perhaps limiting it to the
> last few months will make it easier to get a sane set of TRUSTED_NETWORKS and
> INTERNAL_NETWORKS; I've got mail going back to ~2002 but no real recollection
> of how things were set up or named prior to 2007 or so.
>
> Initially I'll limit it to mail within the last couple of months, but perhaps
> expand that up to 24-36 months for non-spam and 6 months for spam, is that
> sane/reasonable?

Sure.

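For what it's worth, here is a rough sketch of the kind of date filter described above, applied while staging messages. It is only an illustration, not part of the masscheck scripts: the function name, the MAX_AGE_DAYS table and the 30-day month are assumptions, and it trusts the Date header, which spam often forges.

```python
from datetime import datetime, timedelta, timezone
from email import message_from_string
from email.utils import parsedate_to_datetime

# Assumed limits, taken from the figures discussed above:
# up to 36 months for ham, 6 months for spam (30-day months).
MAX_AGE_DAYS = {"ham": 36 * 30, "spam": 6 * 30}


def within_age_limit(raw_message, kind, now=None):
    """Return True if the message's Date header is recent enough for `kind`."""
    now = now or datetime.now(timezone.utc)
    msg = message_from_string(raw_message)
    date_header = msg.get("Date")
    if date_header is None:
        return False  # no Date header: leave it out rather than guess
    try:
        sent = parsedate_to_datetime(date_header)
    except (TypeError, ValueError):
        return False  # unparseable date: leave it out
    if sent.tzinfo is None:
        # Treat a date without a usable UTC offset as UTC.
        sent = sent.replace(tzinfo=timezone.utc)
    return now - sent <= timedelta(days=MAX_AGE_DAYS[kind])
```

Messages that fail the check would simply not be copied into the staging path; anything with a missing or broken Date header is excluded too, which seems the safer default for a corpus.
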
>> Yes, ham-only masscheck submissions would be very welcome.
>
> Perfect, glad to hear it. At this point I've built a dedicated box to run the
> masscheck scripts, so now it's just a matter of putting together a corpus and
> doing some sanity checking and testing.
>
> My current thought is to take user-fed spam and non-spam folders and place
> copies of messages into a staging path which will then be reviewed before
> being added to the corpus for learning. Hopefully I'll be ready to go live
> within a day or two.

Thanks for your participation!

--
John Hardin KA7OHZ http://www.impsec.org/~jhardin/
jhardin@impsec.org FALaholic #11174 pgpk -a jhardin@impsec.org
key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C AF76 D822 E6E6 B873 2E79
-----------------------------------------------------------------------
...every time I sit down in front of a Windows machine I feel as
if the computer is just a place for the manufacturers to put their
advertising. -- fwadling on Y! SCOX
-----------------------------------------------------------------------
6 days until Thomas Jefferson's 271st Birthday