Mailing List Archive

CHAOS: v1.2.2: Of Documentation
Simon Wilson wrote:
>> could you, please, finally, describe what does this module do,
>> here to the list and/or to the wiki?
>>
>> the description there is too hard to understand, especially at the
>> beginning,
>> and I couldn't force myself to understand it (multiple times).
>>
>> Maybe you should start with the easy parts and follow with those more
>> complicated functionality, because I feel the description starts with
>> the latter.
>
>
> I'm guessing from the silence in response that this will remain a mystery.
>
> Simon.
>
> ___________
> Simon Wilson
> M: 0400 12 11 16

Reads perfectly well to me.  I guess to be compatible with any other
plugin, I must delete all documentation entirely :)

Seriously, every single rule that this module can generate is listed. 
That's a good start, comparatively.

I answer, and have answered, all questions regarding this module.
Open-ended questions, or questions that are vague and ambiguous, are
ignored.  For instance, "Maybe you should start with easy parts"? OK,
what's easy?  I'm reminded of an old Star Trek episode where Dr. McCoy
is reattaching Spock's brain.  "It's so easy.  A child can do it", he
muses.  Questions have value.  Statements less so.

This module has some unique stuff that CANNOT be done in a pure
SpamAssassin environment.  It also has stuff that can be replicated
using standard rules.

1) The module, if installed and using the config file as is, does no
harm at all.  It will merely generate rules based upon what it finds. 
These are all scored at the low rate of 0.01.  It's up to the user to
decide what to do with them.  They can wrap up a generated rule in a meta
rule.  Example:

meta   JR_HATES_BEENTHERE   (JR_X_BEENTHERE)
score JR_HATES_BEENTHERE   8.0

2) Via a configuration file option, "chaos_mode", the module can be set
to automatically score its rules.

chaos_mode AutoISP

It will still run along with existing files, cranking out higher scores
for those rules marked with an asterisk.  That is still probably
acceptable for most people.  But it can cause problems. The popular KAM
ruleset scores SendGrid Emails with a high value. Mine is split into two
different values that are scored differently.  While they are both lower
than KAM's, combined, I see that as a potential problem.  I have no
knowledge of what somebody's rules are at any given moment.  Caveat
Emptor.  There I go again with the Latin :)

2A) What values do I set for these rules?  As a percentage of another
configuration file option, "chaos_tag":

chaos_tag 7

Per the example above, JR_X_BEENTHERE is a rule that is Auto-Scored. If
you lower the chaos_tag value, the score for this rule is reduced.  If
you increase the chaos_tag value, the score produced by this rule is
raised.
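
As a purely illustrative calculation: the 50% weighting below is an
assumption made up for this example, not a value taken from the module.
The real per-rule percentages are set inside the plugin.

# Hypothetical: suppose an auto-scored rule is weighted at 50% of chaos_tag.
chaos_mode AutoISP
chaos_tag 7       # that rule would score 0.50 * 7  = 3.5
# chaos_tag 10    # the same rule would score 0.50 * 10 = 5.0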

2B) The AutoISP mode, as is, should be fine for anybody running a spam
tag level of 8 to 12.

2C) The initial release of CHAOS.pm did all kinds of scoring.  One of
the knocks I have about SpamAssassin is that it does not maintain counts
of hits.  My complaints about this go all the way back to 2010.  Counts
and Amounts.  SA is great with Amounts.  It sucks with Counts.  To the
SA Development crew's credit, somewhere along the way, tflags were added
to allow that functionality in a very primitive fashion.  Many people
are happy with that.  I'm just not one of them.

I read somewhere, while looking at META rules, that SA internally builds
an array of the rules hit.  That way, as rules hit, METAs are then
appropriately updated.  Gee, an array.  Maybe we could add a count to
that array if the user wishes to?  I think that it is a lot of
development; not so much the actual process of doing it, but updating
all the User handling thereof.  Alas, it is what it is *SIGH*
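
For reference, the tflags approach mentioned above can be sketched like
this.  The rule names and the pattern are made up for illustration;
"multiple" and "maxhits" are the stock SpamAssassin tflags that let a
meta rule compare against a hit count:

# Illustrative names and pattern only; not part of CHAOS.
body      __LOCAL_UNSUB_COUNT  /\bunsubscribe\b/i
tflags    __LOCAL_UNSUB_COUNT  multiple maxhits=4
meta      LOCAL_UNSUB_3PLUS    (__LOCAL_UNSUB_COUNT >= 3)
describe  LOCAL_UNSUB_3PLUS    Body says "unsubscribe" at least 3 times
score     LOCAL_UNSUB_3PLUS    0.5

The count is consumed inside the meta expression, and maxhits caps how
many matches SA bothers to look for.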

2D) One thing about running AutoISP mode is that you can change a Rule's
name in the configuration file and, no matter what, you'll get the
Rulename that's hard-coded into the program.  When an Eval plugin
function is called, SA passes the rule name to the plugin. Most plugins
just ignore it, and simply return a Hit/Miss value for the Rulename.  I
ignore that completely.

2E) When I first released CHAOS, all it did was Automatic Scoring. And I
used all kinds of fancy algorithms, even logarithmic, to demonstrate
that.  That was pointless, as many pointed out at the time.  I don't do
that stuff anymore.

2F) Still, as is, AutoISP will work great for most people.

3) As the first release of CHAOS was about as successful as the
Hindenburg, I added the concept of Manual scoring.  This works in the
same fashion as most people are accustomed to.  This is set in the
configuration file:

chaos_mode Manual

There are currently two exceptions in Manual mode.  I don't allow
changing Rulenames for the mailer_check() and id_attachments() Eval
functions.  The reason is that these Evals can produce a lot of Rule
outputs.
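
A minimal sketch of what that might look like, assuming a generated rule
such as JR_X_BEENTHERE can be scored with an ordinary score line in
Manual mode (the value 2.5 is an arbitrary illustration):

chaos_mode Manual
score JR_X_BEENTHERE 2.5    # arbitrary example value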


OK, are you still with me?  If not, just implement Step 1) above.

4) Regarding overall development, rules, rules, rules, and
documentation, my priorities are these:

1) Bug fixes, first and foremost
2) New Stuff that's easy
3) New Stuff that's hard
4) Existing stuff that I'm committed to change
5) Standard rules distribution
6) CHAOS meta rules (using rules from #5 above)
7) Rework Documentation

5) Suggestions and comments are always welcome.  The "Hi
{emailuserpart}" development was the result of a need expressed here on
SA-Users.  When I first released CHAOS, I got a lot of criticism from many
senior people on this list.  I deserved it and I expected it.  These are
professionals who took the time to load the plugin to see what it is
about.  I adapted, made changes and came out better and wiser.  My
respect for these people increased a hundredfold. That's how I roll.

But if you're going to sit on the sidelines and complain, I have bad
news for you.  There's no shortage of stuff I can shove into /dev/null.


$0.02,

-- Jared Hall
Re: CHAOS: v1.2.2: Of Documentation
What would the elevator pitch be for this?

Re: CHAOS: v1.2.2: Of Documentation
On Fri, Jul 23, 2021 at 12:07:52AM -0400, Jared Hall wrote:
>
> 1) The module, if installed and using the config file as is, does no harm at
> all.  It will merely generate rules based upon what it finds.  These are all
> scored at the low rate of 0.01.  It's up to the user to decide what to do with
> them.  They can wrap up a generated rule in a meta rule.  Example:
>
> meta   JR_HATES_BEENTHERE   (JR_X_BEENTHERE)
> score JR_HATES_BEENTHERE   8.0

While I guess it's not illegal to whip up rules on the fly, it's awkward and
inflexible for the users.

> 2C) The initial release of CHAOS.pm did all kinds of scoring.  One of the
> knocks I have about SpamAssassin is that it does not maintain counts of hits.
> My complaints about this go all the way back to 2010.  Counts and Amounts.  SA
> is great with Amounts.  It sucks with Counts.  To the SA Development crew's
> credit, somewhere along the way, tflags were added to allow that functionality
> in a very primitive fashion.  Many people are happy with that.  I'm just not
> one of them.
> ...
> I read somewhere, while looking at META rules that SA internally builds an
> array of the rules hit.  That way, as rules hit, METAs are then appropriately
> updated.  Gee, an array.  Maybe we could add a count to that array if the user
> wishes to?  I think that it is a lot of development; not so much the actual
> process of doing it, but updating all the User handling thereof.  Alas, It is
> what it is *SIGH*

There's zero actual information here. What exactly are you finding hard to
"count"?
Re: CHAOS: v1.2.2: Of Documentation
----- Message from Jared Hall <jared@jaredsec.com> ---------
Date: Fri, 23 Jul 2021 00:07:52 -0400
From: Jared Hall <jared@jaredsec.com>
Subject: CHAOS: v1.2.2: Of Documentation
To: users@spamassassin.apache.org


> Simon Wilson wrote:
>>> could you, please, finally, describe what does this module do,
>>> here to the list and/or to the wiki?
>>>
>>> the description there is too hard to understand, especially at the
>>> beginning,
>>> and I couldn't force myself to understand it (multiple times).
>>>
>>> Maybe you should start with the easy parts and follow with those more
>>> complicated functionality, because I feel the description starts
>>> with the latter.
>>
>>
>> I'm guessing from the silence in response that this will remain a mystery.
>>
>> Simon.
>>
>> ___________
>> Simon Wilson
>> M: 0400 12 11 16
>
> Reads perfectly well to me.  I guess to be compatible with any other
> plugin, I must delete all documentation entirely :)

No - but perhaps a start would be to *really* listen when people ask
questions demonstrating you are not as good as you think you are at
writing things which make sense to people other than yourself.

>
> Seriously, every single rule that this module can generate is
> listed.  That's a good start, comparatively.
>
> I answer, and have answered, all questions regarding this module.

Again no. Perhaps not all mailing list emails make it through the module...

> Open-ended questions, or questions that are vague and ambiguous, are
> ignored.  For instance, "Maybe you should start with easy parts"?
> OK, what's easy?  I'm reminded of an old Star Trek episode where Dr.
> McCoy is reattaching Spock's brain.  "It's so easy.  A child can do
> it", he muses.  Questions have value.  Statements less so.

Like that one?

>
> This module has some unique stuff that CANNOT be done in a pure
> SpamAssassin environment.  It also has stuff that can be replicated
> using standard rules.
>
> 1) The module, if installed and using the config file as is, does no
> harm at all.  It will merely generate rules based upon what it
> finds.  These are all scored at the low rate of 0.01.  It's up to
> the user to decide what to do with them.  They can wrap up a generated
> rule in a meta rule.  Example:
>
> meta   JR_HATES_BEENTHERE   (JR_X_BEENTHERE)
> score JR_HATES_BEENTHERE   8.0
>
> 2) Via a configuration file option, "chaos_mode", the module can be
> set to automatically score its rules.
>
> chaos_mode AutoISP
>
> It will still run along with existing files, cranking out higher
> scores for those rules marked with an asterisk.  That is still
> probably acceptable for most people.  But it can cause problems. The
> popular KAM ruleset scores SendGrid Emails with a high value. Mine
> is split into two different values that are scored differently. 
> While they are both lower than KAM's, combined, I see that as a
> potential problem.  I have no knowledge of what somebody's rules are
> at any given moment.  Caveat Emptor.  There I go again with the
> Latin :)
>
> 2A) What values do I set for these rules?  As a percentage of
> another configuration file option, "chaos_tag":
>
> chaos_tag 7
>
> Per the example above JR_X_BEENTHERE is a rule that is Auto-Scored.
> If you lower the chaos_tag value, the score for this rule would be
> reduced.  If you increase the chaos_tag value, the score produced by
> this rule is raised.
>
> 2B) The AutoISP mode, as is, should be fine for anybody running  a
> spam tag level of 8 to 12.
>
> 2C) The initial release of CHAOS.pm did all kinds of scoring.  One
> of the knocks I have about SpamAssassin is that it does not maintain
> counts of hits.  My complaints about this go all the way back to
> 2010.  Counts and Amounts.  SA is great with Amounts.  It sucks with
> Counts.  To the SA Development crew's credit, somewhere along the
> way, tflags were added to allow that functionality in a very
> primitive fashion.  Many people are happy with that.  I'm just not
> one of them.
>
> I read somewhere, while looking at META rules that SA internally
> builds an array of the rules hit.  That way, as rules hit, METAs are
> then appropriately updated.  Gee, an array.  Maybe we could add a
> count to that array if the user wishes to?  I think that it is a lot
> of development; not so much the actual process of doing it, but
> updating all the User handling thereof.  Alas, It is what it is *SIGH*
>
> 2D) One thing about running AutoISP mode is that you can change a
> Rule's name in the configuration file and, no matter what, you'll
> get the Rulename that's hard-coded into the program.  When an Eval
> plugin function is called, SA passes the rule name to the plugin.
> Most plugins just ignore it, and simply return a Hit/Miss value for
> the Rulename.  I ignore that completely.
>
> 2E) When I first released CHAOS, all it did was Automatic Scoring.
> And I used all kinds of fancy algorithms, even logarithmic, to
> demonstrate that.  That was pointless, as many pointed out at the
> time.  I don't do that stuff anymore.
>
> 2F) Still, as is, AutoISP will still work great for most people.
>
> 3) As the first release of CHAOS was about as successful as the
> Hindenburg, I added the concept of Manual scoring.  This works in
> the same fashion as most people are accustomed to.  This is set in
> the configuration file:
>
> chaos_mode Manual
>
> There are currently two exceptions in Manual mode.  I don't allow
> changing Rulenames for the mailer_check() and id_attachments() Eval
> functions.  The reason is that these Evals can produce a lot of Rule
> outputs.
>
>
> OK, are you still with me?  If not, just implement Step 1) above.

Is this just a flippant remark? It's hard to tell amongst the rest of
it. Taking it at face value... if someone does not understand what
something is and/or what it does, the answer should NEVER be "install
it anyway and see what it does". Why/how do you think that is an
appropriate recommendation?

>
> 4) Regarding overall development, rules, rules, rules, and
> documentation, my priorities are these:
>
> 1) Bug fixes, first and foremost
> 2) New Stuff that's easy
> 3) New Stuff that's hard
> 4) Existing stuff that I'm committed to change
> 5) Standard rules distribution
> 6) CHAOS meta rules (using rules from #5 above)
> 7) Rework Documentation
>
> 5) Suggestions and comments are always welcome.  The "Hi
> {emailuserpart}" development was the result of a need expressed here
> on SA-Users.  When I first released CHAOS, I got a lot of criticism
> by many senior people on this list.  I deserved it and I expected
> it.  These are professionals that took the time to load the plugin
> to see what it is about.  I adapted, made changes and came out
> better and wiser.  My respect for these people increased 100 fold.
> That's how I roll.
>
> But if you're going to sit on the sidelines and complain, I have bad
> news for you.  There's no shortage of stuff I can shove into
> /dev/null.

I haven't seen anyone complain.
I have seen several smart people genuinely ask you for a *brief*
summary of what your module does.

>
>
> $0.02,


My "$0.02" would be that you may have more success with people
understanding this module, then using it, then contributing to it,
sharing it and recommending it if you respect not just the "senior
people" on this list, but also others who in good faith want to
understand your module. It has obviously had a lot of work put into it
- and (giving benefit of doubt) likely does something of use to the
community... I for one am genuinely curious (albeit that curiosity is
diminishing down the effort:benefit scale).

Simon


--
Simon Wilson
M: 0400 12 11 16
Re: CHAOS: v1.2.2: Of Documentation
On 23/07/2021 18:01, Simon Wilson wrote:

> ----- Message from Jared Hall <jared@jaredsec.com> ---------
> Date: Fri, 23 Jul 2021 00:07:52 -0400
> From: Jared Hall <jared@jaredsec.com>
> Subject: CHAOS: v1.2.2: Of Documentation
> To: users@spamassassin.apache.org
>
> Simon Wilson wrote:
> could you, please, finally, describe what does this module do,
> here to the list and/or to the wiki?
>
> the description there is too hard to understand, especially at the
> beginning,
> and I couldn't force myself to understand it (multiple times).
>
> Maybe you should start with the easy parts and follow with those more
> complicated functionality, because I feel the description starts with
> the latter.
>
> I'm guessing from the silence in response that this will remain a
> mystery.
>
> Simon.
>
> ___________
> Simon Wilson
> M: 0400 12 11 16

>> Reads perfectly well to me. I guess to be compatible with any other
>> plugin, I must delete all documentation entirely :)
>
> No - but perhaps a start would be to *really* listen when people ask
> questions demonstrating you are not as good as you think you are at
> writing things which make sense to people other than yourself.
>
>> Seriously, every single rule that this module can generate is listed.
>> That's a good start, comparatively.
>>
>> I answer, and have answered, all questions regarding this module.
>
> Again no. Perhaps not all mailing list emails make it through the
> module...

I've still yet to see a list post explaining what this thing does,
so no, he has not answered all questions about it. The most common-sense
thing of all time is that if you advertise your wares, you at least tell
people WTF it does; you don't send them to some web site to find out
(which, as some posters have indicated, apparently does not even tell
you).

I won't comment on the rest of his trash talk. Based on his useless
smart-arse replies, I don't care what this thing does; we won't be
touching it due to his childish, pathetic attitude. For all we know,
it's malware.

--
Regards,
Noel Butler

Re: CHAOS: v1.2.2: Of Documentation
On Fri, Jul 23, 2021 at 08:16:56AM +0300, Henrik K wrote:
>
> > 2C) The initial release of CHAOS.pm did all kinds of scoring.  One of the
> > knocks I have about SpamAssassin is that it does not maintain counts of hits.
> > My complaints about this go all the way back to 2010.  Counts and Amounts.  SA
> > is great with Amounts.  It sucks with Counts.  To the SA Development crew's
> > credit, somewhere along the way, tflags were added to allow that functionality
> > in a very primitive fashion.  Many people are happy with that.  I'm just not
> > one of them.
> > ...
> > I read somewhere, while looking at META rules that SA internally builds an
> > array of the rules hit.  That way, as rules hit, METAs are then appropriately
> > updated.  Gee, an array.  Maybe we could add a count to that array if the user
> > wishes to?  I think that it is a lot of development; not so much the actual
> > process of doing it, but updating all the User handling thereof.  Alas, It is
> > what it is *SIGH*
>
> There's zero actual information here. What exactly are you finding hard to
> "count"?

Looking at the emoji code, for example, you are doing all sorts of funny
stuff like creating dynamic rules with count names:

"The rulename, JR_SUBJ_EMOJIS or <YOUR_RULENAME> is appended with an
"_$count" whose score is 0.01. Example: YOUR_RULENAME_3. The rule's
description will reflect the number of Emojis found."

This is not really how SA is supposed to be used (even though it's
possible). It's just complex and confusing.

The normal way is to call the eval function multiple times with the
parameters you want to check; there are many examples in the stock rules:

body HTML_OBFUSCATE_05_10 eval:html_range('obfuscation_ratio','.05','.1')
body HTML_OBFUSCATE_10_20 eval:html_range('obfuscation_ratio','.1','.2')
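
Applied to the emoji case, the same pattern might look roughly like
this.  The eval name subject_emoji_range() and the rule names are
hypothetical, made up for illustration; they are not functions that
CHAOS or stock SpamAssassin is known to provide:

# Hypothetical eval function and rule names, for illustration only.
header LOCAL_SUBJ_EMOJI_1_2     eval:subject_emoji_range('1','2')
header LOCAL_SUBJ_EMOJI_3_PLUS  eval:subject_emoji_range('3','99')
score  LOCAL_SUBJ_EMOJI_1_2     0.5
score  LOCAL_SUBJ_EMOJI_3_PLUS  1.5

Each rule then has a fixed name that users can score or reference in
metas like any other rule.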
Re: CHAOS: v1.2.2: Of Documentation
On Fri, 2021-07-23 at 19:49 +1000, Noel Butler wrote:
> I've still yet to see a list post explaining what this thing does,
> so no, he has not answered all questions about it. The most common-sense
> thing of all time is that if you advertise your wares, you at least tell
> people WTF it does; you don't send them to some web site to find out
> (which, as some posters have indicated, apparently does not even tell
> you).
>

Yes, that is the same problem I have.

I understand that CHAOS generates rules and has fancy ways of setting
their scores but I've yet to understand:

- why it was developed in the first place, i.e. what problem(s) does it
solve that manually written rules fail to address?

- what are its design principles?

- what do its generated rules do that can't be done with manually
written rules?

- how, if at all, does it test the rules it writes and what does it do
with rules that either don't work as intended or hit ham instead of
spam? 

- does it accept human input about what is spam and what is ham and if
so, how is this input provided, maintained, and stored for future
reference?

IOW: 
- is it working entirely from messages found in the incoming mail
stream?
- what about the outbound mail stream?
- does it use mail archives or spam collections to test the rules it
generates?

Martin