[Bug 3471] add sanity check so too many tokens don't get expired
http://bugzilla.spamassassin.org/show_bug.cgi?id=3471





------- Additional Comments From felicity@kluge.net 2004-06-04 09:50 -------
Created an attachment (id=2000)
--> (http://bugzilla.spamassassin.org/attachment.cgi?id=2000&action=view)
suggested patch




------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

felicity@kluge.net changed:

What |Removed |Added
----------------------------------------------------------------------------
Priority|P5 |P4
Target Milestone|3.1.0 |3.0.0



------- Additional Comments From felicity@kluge.net 2004-06-04 09:51 -------
I think this is 3.0 material.



------- Additional Comments From parkerm@pobox.com 2004-06-04 16:38 -------
Subject: Re: New: add sanity check so too many tokens don't get expired

On Fri, Jun 04, 2004 at 09:50:27AM -0700, bugzilla-daemon@bugzilla.spamassassin.org wrote:
>
> I'm attaching a patch which adds a simple check to the DBM expiry (SQL doesn't do estimation?).
>

Actually, it does. It determines the optimal atime for expiration and
then deletes everything < that. If this is a problem for DBM, then it
could happen for SQL as well, but it would be slightly harder to recover from.

Michael





------- Additional Comments From felicity@kluge.net 2004-06-04 17:16 -------
Subject: Re: add sanity check so too many tokens don't get expired

On Fri, Jun 04, 2004 at 04:38:11PM -0700, bugzilla-daemon@bugzilla.spamassassin.org wrote:
> Actually, it does. It determines the optimal atime for expiration and
> then deletes everything < that. If this is a problem for DBM, then it
> could happen for SQL as well, but it would be slightly harder to recover from.

Ah. I saw the "delete ... < ..." bit, and got confused (it was a very
long night last night...)

I guess the way to guard against this in SQL is to do a "select
count(??) ..." before the delete, and see how many tokens would be removed.
If too many, abort. If not, do the delete call. It adds more overhead,
but ... at least for the delete, the appropriate rows would be in
cache. :)
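
[Editor's sketch, not part of the original message.] The count-then-delete guard described above could look roughly like the following. This is a minimal illustration in Python with sqlite3, not the SpamAssassin code or the attached patch. The table name bayes_token, the atime column, and the max_fraction threshold are all assumptions made for the example.

```python
import sqlite3


def guarded_expire(conn, cutoff_atime, max_fraction=0.5):
    """Delete tokens with atime < cutoff_atime unless that removes too many.

    Returns the number of deleted rows, or None if the delete was aborted.
    Table name, column name and max_fraction are hypothetical.
    """
    cur = conn.cursor()
    total = cur.execute("SELECT COUNT(*) FROM bayes_token").fetchone()[0]
    doomed = cur.execute(
        "SELECT COUNT(*) FROM bayes_token WHERE atime < ?", (cutoff_atime,)
    ).fetchone()[0]

    # Sanity check: refuse to expire more than the allowed share of tokens.
    if total == 0 or doomed > total * max_fraction:
        return None

    cur.execute("DELETE FROM bayes_token WHERE atime < ?", (cutoff_atime,))
    conn.commit()
    return doomed
```

The extra COUNT query is the overhead mentioned above. A caller that gets None back would skip the expiry or fall back to a slower, exact calculation of the cutoff.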





------- Additional Comments From parkerm@pobox.com 2004-06-04 18:35 -------
Subject: Re: add sanity check so too many tokens don't get expired

It's been a while since I've looked at this code, but I wonder if this
is just fixing the symptom and not the actual problem.

We should be figuring out the optimal atime value that doesn't expire
too many tokens. So is that part of the code doing the wrong thing?

Michael





------- Additional Comments From felicity@kluge.net 2004-06-04 19:43 -------
Subject: Re: add sanity check so too many tokens don't get expired

On Fri, Jun 04, 2004 at 06:35:31PM -0700, bugzilla-daemon@bugzilla.spamassassin.org wrote:
> We should be figuring out the optimal atime value that doesn't expire
> too many tokens. So is that part of the code doing the wrong thing?

it's all documented in the sa-learn docs. ;)

the expiry works in 2 passes. the first pass is what generates the
atime to use, and the second pass actually does the removal.

to make expiry go faster, the first pass will try to estimate what the
atime value should be based on what the last expiry run found, as long
as they are similar enough in what they want to accomplish (ie: if you
previously expired 5000 tokens, and you now want to expire 5000 tokens,
you can probably use the same atime delta as last time...). this works
on the idea that message flow is relatively constant.
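
[Editor's sketch, not part of the original message.] The "similar enough" test for reusing the previous run's atime delta might be sketched as below. The function name, its parameters and the max_ratio threshold are assumptions for illustration. The real conditions are the ones described in the sa-learn docs and may well be stricter.

```python
def estimate_delta(last_delta, last_expired, goal, max_ratio=1.5):
    """Reuse the previous atime delta if the two expiry goals are similar.

    last_delta   -- atime delta (seconds) used by the previous expiry run
    last_expired -- number of tokens the previous run expired
    goal         -- number of tokens we want to expire now
    max_ratio    -- hypothetical similarity threshold

    Returns last_delta when estimation looks safe, else None.
    """
    if last_delta <= 0 or last_expired <= 0 or goal <= 0:
        return None  # no usable history, so estimation is off the table
    ratio = max(last_expired, goal) / min(last_expired, goal)
    return last_delta if ratio <= max_ratio else None
```

With the numbers from the example above, estimate_delta(43200, 5000, 5000) hands back the old delta unchanged, while a goal of 20000 tokens is too far from the previous 5000 and yields None.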

if estimation shouldn't be used (it's rather picky), the first pass will
actually go through all the tokens to figure out what atime value should
be used based on an exponential scale (12 hours, 24 hours, up through
256 days).
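
[Editor's sketch, not part of the original message.] The full first pass over the exponential scale could be sketched as follows: one walk over all token atimes, counting how many tokens each candidate delta would expire. The ten deltas (12 hours doubling up to 256 days) come from the description above. The selection rule in pick_delta (the largest delta that still meets the goal) is an assumption, and the function names are made up for the example.

```python
HOUR = 3600
# 12 hours, 24 hours, ... doubling up to 12h * 2**9 = 6144h = 256 days.
DELTAS = [12 * HOUR * 2**i for i in range(10)]


def count_expirable(atimes, newest_atime, deltas=DELTAS):
    """For each delta, count tokens older than newest_atime - delta."""
    counts = [0] * len(deltas)
    for atime in atimes:
        age = newest_atime - atime
        for i, delta in enumerate(deltas):
            if age > delta:
                counts[i] += 1
    return counts


def pick_delta(atimes, newest_atime, goal, deltas=DELTAS):
    """Assumed rule: the largest delta that still expires >= goal tokens.

    Returns (delta, tokens_expired), or None if no delta meets the goal.
    """
    counts = count_expirable(atimes, newest_atime, deltas)
    for delta, n in sorted(zip(deltas, counts), reverse=True):
        if n >= goal:
            return delta, n
    return None
```

Because this pass has to touch every token, it is slower than the estimate, which is why the estimate is tried first.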

the problem is that doing estimation gives you, well, an estimate.
most of the time it's right, and sometimes it's wrong. it's the reason
why people don't like the weatherman on tv.

so to answer your question, we can always use a better estimation
algorithm/equation, but the nature of the beast is that estimations will
sometimes be wrong so we should just put a sanity check in the bottom
to make sure we're not about to shoot off our own foot.





------- Additional Comments From parkerm@pobox.com 2004-06-04 19:52 -------
Subject: Re: add sanity check so too many tokens don't get expired

Ahhh, yeah I forgot about the first pass/guess based on previous
expires. Makes sense.

Michael






felicity@kluge.net changed:

What |Removed |Added
----------------------------------------------------------------------------
Attachment #2000 is obsolete|0 |1



------- Additional Comments From felicity@kluge.net 2004-06-07 11:51 -------
Created an attachment (id=2005)
--> (http://bugzilla.spamassassin.org/attachment.cgi?id=2005&action=view)
new version of implementation

this version handles both DBM and SQL


