Mailing List Archive

Flagged bots to edit pages containing spam links
https://bugzilla.wikimedia.org/show_bug.cgi?id=13706

Perhaps a community discussion is necessary on the matter; I hereby initiate
it.

When a person tries to edit a page that contains a URL matching the spam
autoblocker regex, the user is prohibited from making the edit until the
spam link is removed. The spam autoblocker was intended to prevent the
addition of new spam.
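To make the behavior concrete, here is a rough Python sketch of such a
check (the blacklist entries are invented, and this is only an illustration
of the effect, not the actual filter code):

    import re

    # Invented blacklist entries, one regex per line as on the real list.
    BLACKLIST_LINES = [r"example-pills\.com", r"cheap-watches\.net"]
    BLACKLIST_RE = re.compile("|".join(BLACKLIST_LINES), re.IGNORECASE)

    def edit_allowed(new_text):
        # The filter runs against the whole submitted revision, so spam
        # that was already on the page before the URL was blacklisted
        # blocks the edit just as surely as newly added spam does.
        return BLACKLIST_RE.search(new_text) is None

    print(edit_allowed("A routine interwiki update."))            # True
    print(edit_allowed("Buy at http://example-pills.com/offer"))  # False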

Consider a scenario where a spambot adds spam links to Wikipedia, and the
spam URL is later added to the spam blacklist. A user then tries to edit a
page that already contains spam added before the URL was blacklisted. For a
human this is not much of a problem to deal with; for bots, however, it is
a different story.

Consider that you are operating a bot that makes non-controversial, routine
maintenance edits on a regular basis. The spam autoblocker would prevent
such edits. If your bot's task is dealing with images renamed or deleted on
Commons, or with interwiki links, this is particularly problematic.
Interwiki bots and Commons delinking bots often edit hundreds of pages a
day on hundreds of wikis. That's a lot of logs. So the suggestion that I
should spend perhaps hours per day reading log files for spam on pages in
languages I cannot even understand (or necessarily read the ?'s and %'s) is
quite unreasonable. This is a task better dealt with by the locals (humans)
of the wiki community than by bots performing mindless, routine,
non-controversial tasks.

There is also the matter of legitimate reasons to include spam links on
pages, such as an archived discussion of a spambot attack where example
URLs are cited before they make their way to the spam autoblocker.

- White Cat
Re: Flagged bots to edit pages containing spam links
Forum shopping for this after the lead developer and CTO has said no
is not the way to go about it.

From a technical standpoint: I agree with Brion. There are a whole host
of reasons why an edit might fail (locked db's, protected pages, or even
the server dying), and the bot needs to be designed to deal with that. If
your bot crashes, etc. due to an edit failing: well that's your fault as a
developer.
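
As a sketch of what "dealing with that" can look like -- the ApiError
class and the save_page callable here are stand-ins for whatever framework
the bot uses, and the error codes are illustrative:

    import time

    class ApiError(Exception):
        """Stand-in for the failure a wiki API reports on a rejected edit."""
        def __init__(self, code):
            self.code = code

    RETRYABLE = {"readonly", "maxlag"}              # transient: wait, retry
    SKIPPABLE = {"protectedpage", "spamblacklist"}  # permanent: log, move on

    def robust_save(save_page, title, text, retries=3):
        for attempt in range(retries):
            try:
                save_page(title, text)
                return True
            except ApiError as e:
                if e.code in RETRYABLE:
                    time.sleep(30 * (attempt + 1))  # back off, then retry
                elif e.code in SKIPPABLE:
                    print("SKIPPED %s (%s)" % (title, e.code))
                    return False
                else:
                    raise  # an unhandled failure here is a bot bug
        return False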

-Chad

On Mon, Apr 28, 2008 at 11:17 AM, White Cat
<wikipedia.kawaii.neko@gmail.com> wrote:
> https://bugzilla.wikimedia.org/show_bug.cgi?id=13706
> [snip]
Re: Flagged bots to edit pages containing spam links
I beg your pardon? Forum shopping on foundation-l? Seems
self-contradictory...

On Mon, Apr 28, 2008 at 7:34 PM, Chad <innocentkiller@gmail.com> wrote:

> Forum shopping for this after the lead developer and CTO has said no
> is not the way to go about it.
> [snip]
Re: Flagged bots to edit pages containing spam links
Brion already said that this wouldn't be implemented and discussion was
over. You now bring it up on Foundation-l. This is known as forum shopping.

Also known as "asking the other parent."

-Chad

On Mon, Apr 28, 2008 at 2:58 PM, White Cat
<wikipedia.kawaii.neko@gmail.com> wrote:
> I beg your pardon? Forum shopping on foundation-l? Seems
> self-contradictory...
> [snip]
Re: Flagged bots to edit pages containing spam links
On 4/28/08, Chad <innocentkiller@gmail.com> wrote:
> From a technical standpoint: I agree with Brion. There are a whole host
> of reasons why an edit might fail (locked db's, protected pages, or even
> the server dying), and the bot needs to be designed to deal with that. If
> your bot crashes, etc. due to an edit failing: well that's your fault as a
> developer.

It would be nice if flagged bots were exempt from the spamfilter.
Spam URLs and protected pages are the situations that my bots can't
handle -- for everything else, the bot can either wait or try again.

--
Mark
[[en:User:Carnildo]]

Re: Flagged bots to edit pages containing spam links
On Wed, Apr 30, 2008 at 7:37 PM, Mark Wagner <carnildo@gmail.com> wrote:
> On 4/28/08, Chad <innocentkiller@gmail.com> wrote:
> > [snip]
>
> It would be nice if flagged bots were exempt from the spamfilter.
> Spam URLs and protected pages are the situations that my bots can't
> handle -- for everything else, the bot can either wait or try again.

This is something that I can agree with: if a user is trusted enough
to receive the bot flag in the first place (or "trusted not to make
spam/vandalism/controversial mass edits"), we shouldn't have to worry
about spam filtering them.
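
Roughly, the exemption would amount to something like this (a sketch only;
the group name and the check itself are hypothetical, not an existing
feature):

    import re

    BLACKLIST_RE = re.compile(r"example-pills\.com")  # invented entry

    def edit_allowed(new_text, user_groups):
        # Hypothetical exemption: flagged bots skip the blacklist check
        # entirely, on the theory that the bot-approval process already
        # vetted them against spamming.
        if "bot" in user_groups:
            return True
        return BLACKLIST_RE.search(new_text) is None

    spam_page = "old spam: http://example-pills.com/x plus a fixed link"
    print(edit_allowed(spam_page, {"bot", "user"}))  # True: bot is exempt
    print(edit_allowed(spam_page, {"user"}))         # False: blocked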

--Andrew Whitworth

Re: Flagged bots to edit pages containing spam links
I am told that the devs aren't keen on making an exception, although they
(at least Tim Starling) agree the current method is rather messed up. They
were talking about a more permanent solution.

One suggestion was to make the spam autoblocker block an edit only if a
new spam link is being introduced, so that spam already on the page does
not get affected. This comes at the expense of performance, though.

Then there is the matter of the spam autoblocker page on Meta, which has
started getting very large. Soon it will not be possible to load the page.
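
As the list grows, the matching side can at least be kept tolerable by
compiling the lines in batches instead of testing thousands of separate
regexes -- a sketch of that technique, with an arbitrary chunk size and
nothing taken from the actual implementation:

    import re

    def compile_blacklist(lines, chunk_chars=4096):
        # Join lines into a few big alternations; matching a handful of
        # compiled patterns is much cheaper than looping over thousands.
        batches, current, size = [], [], 0
        for line in lines:
            current.append(line)
            size += len(line)
            if size > chunk_chars:
                batches.append(re.compile("|".join(current), re.IGNORECASE))
                current, size = [], 0
        if current:
            batches.append(re.compile("|".join(current), re.IGNORECASE))
        return batches

    def find_spam(text, batches):
        return [m.group(0) for rx in batches for m in rx.finditer(text)]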

- White Cat

On Thu, May 1, 2008 at 3:56 AM, Andrew Whitworth <wknight8111@gmail.com>
wrote:

> [snip]
>
> This is something that I can agree with: if a user is trusted enough
> to receive the bot flag in the first place (or "trusted not to make
> spam/vandalism/controversial mass edits"), we shouldn't have to worry
> about spam filtering them.
>
> --Andrew Whitworth
Re: Flagged bots to edit pages containing spam links
On Sun, May 4, 2008 at 2:19 PM, White Cat
<wikipedia.kawaii.neko@gmail.com> wrote:
> One suggestion was to make the spam autoblocker block an edit only if a
> new spam link is being introduced, so that spam already on the page does
> not get affected. This comes at the expense of performance, though.

No, it doesn't; it would just require an extra regex search (on the old
text) and some trivial array processing. Not a substantial difference. If
you're talking about getting a feature added, anyway, I think you're on
the wrong list.
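
Concretely, something like this sketch (the blacklist pattern is invented;
the point is the extra search over the old text plus a trivial set
difference):

    import re

    # Invented two-entry blacklist; (?:...) keeps findall returning the
    # full matched string rather than a capture group.
    BLACKLIST_RE = re.compile(
        r"\b(?:example-pills\.com|cheap-watches\.net)\b", re.IGNORECASE)

    def edit_allowed(old_text, new_text):
        old = {m.lower() for m in BLACKLIST_RE.findall(old_text)}  # extra search
        new = {m.lower() for m in BLACKLIST_RE.findall(new_text)}
        # Block only if this revision introduces a match the page did not
        # already contain; pre-existing spam no longer stops other edits.
        return not (new - old)

    page = "Text plus old spam: http://example-pills.com/x"
    print(edit_allowed(page, page + " [[ja:Example]]"))                # True
    print(edit_allowed(page, page + " http://cheap-watches.net/buy"))  # False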

Re: Flagged bots to edit pages containing spam links
Where am I supposed to propose it, then? Bugzilla is obviously the wrong
venue.

- White Cat

On Tue, May 6, 2008 at 2:20 AM, Simetrical
<Simetrical+wikilist@gmail.com> wrote:

> [snip]
>
> No, it doesn't; it would just require an extra regex search (on the old
> text) and some trivial array processing. Not a substantial difference. If
> you're talking about getting a feature added, anyway, I think you're on
> the wrong list.
Re: Flagged bots to edit pages containing spam links
On Thu, May 8, 2008 at 11:01 AM, White Cat
<wikipedia.kawaii.neko@gmail.com> wrote:
> Where am I supposed to propose it, then? Bugzilla is obviously the wrong
> venue.

It's the correct one, as you should know. That some features you've
proposed get rejected doesn't mean related features will also get
rejected.
