Mailing List Archive

Searchability of non-mainspace pages
My attention has repeatedly been drawn to serious negative effects created
by the ability of Google and other searches to search and display pages
outside the mainspace, including pages such as XfD's, DRV's, AN/I
discussions, and the like. Some of these discussions have taken place
on-wiki and others, I am advised, on discussion of OTRS tickets posted by
affected persons.

Given the visibility of Wikipedia results on Google and other searches, and
consistent with the overall intent of [[WP:BLP]] on En-Wiki (and what I hope
is its equivalent on other projects), we have a serious responsibility to
ensure that the overall effect of Wikipedia content is a responsible one.
This includes eliminating the likelihood that the first hit on the Google
search for a living person is (for example) a deletion discussion on how
insignificant and non-notable that individual is, or a page discussing the
ban of that individual (who might be a minor, for example) who chose to edit
Wikipedia under his or her real name and made some mistakes in doing so and
was criticized or even banned as a result.

There has been discussion from time to time about implementing a technical
modification such that only mainspace pages (or such other pages as the
community might consciously choose) would be visible to searches. In view
of the number of concerns raised about the current situation where
everything is searchable, it seems to me that the necessary changes should
be developed and implemented quickly.

The main argument in opposition to this change that I have seen is that the
internal Wikipedia search capability is not as strong as the external search
engines, so that it is desirable that the ability to conduct a complete
external search be maintained. I know that I have sometimes found it useful
to be able to search all spaces within the site in, for example, looking for
precedent cases while drafting EnWiki arbitration decisions. It therefore
would probably be desirable to upgrade our internal search capability.
However, in view of the number of third parties affected by the current
practice, I do not believe that implementation of the non-search capability
should await this development.

As a matter of disclosure, although I have raised this concern in passing on
prior occasions, my attention has been focused (this is something of an
understatement) on it again by an ongoing and extremely unpleasant thread
concerning me on the Wikipedia Review site. I understand that my concerns
in this matter might be discounted for that reason. Nonetheless, they are
sincere, of long standing, and I urge that they receive priority attention.

Newyorkbrad
_______________________________________________
foundation-l mailing list
foundation-l@lists.wikimedia.org
Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/foundation-l
Re: Searchability of non-mainspace pages
In reply to NYB's post here (which was also posted to wikien-l and which I
replied to with the same text)

The following bug, which I just entered into Bugzilla, may (if I worded it
right, and I welcome refinement or pointing out that it's a dup) be relevant
to this matter.

https://bugzilla.wikimedia.org/show_bug.cgi?id=13864

I agree with NYB that this is a serious matter, with the potential to cause
harm to innocent bystanders, and that we should do the right thing (whatever
it is) because it's the right thing to do, not because of what some external
site or person wants us to do or not do.

With the recent improvements in internal search, the time is ripe to
consider doing this.

Note that in the bug I have asked for the functionality to control the
defaults project-wide, not just on en-wp.

Larry Pieniazek
Hobby mail: Lar at Miltontrainworks dot com


Re: Searchability of non-mainspace pages
On Sun, Apr 27, 2008 at 7:06 PM, Newyorkbrad (Wikipedia)
<newyorkbrad@gmail.com> wrote:
> ...
>
> There has been discussion from time to time about implementing a technical
> modification such that only mainspace pages (or such other pages as the
> community might consciously choose) would be visible to searches. In view
> of the number of concerns raised about the current situation where
> everything is searchable, it seems to me that the necessary changes should
> be developed and implemented quickly.
>
> ...

What about developing a functionality where editors (or perhaps admins
and up only) could set an individual non-mainspace page to be
excluded? The vast majority of non-articlespace content is not
problematic, so the proposed sledgehammer solution seems a bit
overkill as compared to handling individual pages, and would certainly
be much less deleterious to the ability to use superior external
search capability while improved internal search is developed, while
still addressing the fact that some pages are problematic if they show
on search engines and allowing us to respond appropriately to
complaints and concerns about the same.

--
Freedom is the right to say that 2+2=4. From this all else follows.

Re: Searchability of non-mainspace pages
On Sun, Apr 27, 2008 at 7:05 PM, Todd Allen <toddmallen@gmail.com> wrote:

> What about developing a functionality where editors (or perhaps admins
> and up only) could set an individual non-mainspace page to be
> excluded? The vast majority of non-articlespace content is not
> problematic, so the proposed sledgehammer solution seems a bit
> overkill as compared to handling individual pages, and would certainly
> be much less deleterious to the ability to use superior external
> search capability while improved internal search is developed, while
> still addressing the fact that some pages are problematic if they show
> on search engines and allowing us to respond appropriately to
> complaints and concerns about the same.
>

Something like a special keyword such as "__NOINDEX__" would be just about
perfect. Then it could be embedded in templates and the like and placed
on individual pages as appropriate.
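
To make the idea concrete, here is a minimal sketch of how such a keyword
might be honoured when a page is rendered. It is only an illustration under
stated assumptions: the function names are hypothetical, the matching is a
plain substring test, and none of this is existing MediaWiki code. The only
established piece is the robots meta tag, which search engines already read.

```python
# Hypothetical marker name, taken from the suggestion above.
NOINDEX_KEYWORD = "__NOINDEX__"


def robots_meta_for(wikitext: str) -> str:
    """Return the robots <meta> tag for a page, given its expanded wikitext.

    Assumption: templates have already been expanded, so a keyword placed
    inside a template shows up in the text passed in here.
    """
    if NOINDEX_KEYWORD in wikitext:
        # Ask crawlers not to index this page, but still follow its links.
        return '<meta name="robots" content="noindex,follow">'
    return '<meta name="robots" content="index,follow">'


def strip_keyword(wikitext: str) -> str:
    """Remove the marker so it never appears in the rendered page."""
    return wikitext.replace(NOINDEX_KEYWORD, "")
```

With something along these lines, a deletion-discussion template could carry
the keyword once, and every page using that template would be excluded
without touching the rest of the namespace.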

I generally agree with Todd that blocking search engines from all
non-Article pages is effectively using a sledgehammer to kill a fly.

-Robert Rohde

Re: Searchability of non-mainspace pages
I think that this is a fairly awful idea. If there is a problem with how
discussions are being conducted on a particular project, the correct
procedure is to remind those involved of the basic rules of wikiquette, and
take appropriate measures against those who persist in violating basic
norms. Our processes should be self-vindicating in this regard. If not, we
have a problem, but should not fear the light of day.

That said, I believe it would be an enormous boon to Wikipedia (and any
other project which achieves a high pagerank in the future) if it was
*completely* removed from Google search. The website is a dangerous
distraction from the real work of creating high-quality, freely
redistributable content. As far as I know, however, I am alone in this
belief. :-)

Anyway, given that AFAICT the Foundation has deliberately restricted itself
to the role of ISP, not intervening in the affairs of the projects (even
when such intervention is desperately needed) except where required by law,
there is surely no role for it to play here.

-- Visviva (EN Wiktionary/Wikipedia)

Re: Searchability of non-mainspace pages
On Sun, Apr 27, 2008 at 7:19 PM, Robert Rohde <rarohde@gmail.com> wrote:

> Something like a special keyword such as "__NOINDEX__" would be just about
> perfect. Then it could be embedded in templates and the like and placed
> on individual pages as appropriate.


It appears the idea of a "__NOINDEX__" keyword is not a new one:
https://bugzilla.wikimedia.org/show_bug.cgi?id=8068

-Robert Rohde

Re: Searchability of non-mainspace pages
<http://blog.wikimedia.org/2008/04/29/robotstxt/>

I had actually drafted this blog post up before Newyorkbrad's post (I
had several people looking at it for me who can probably attest to it)
but was delayed in posting it.

I believe it bears relevance to this thread as well.

--
Cary Bass
Volunteer Coordinator

Your continued donations keep Wikipedia running! Support the Wikimedia
Foundation today: http://donate.wikimedia.org
Wikimedia Foundation, Inc.
Phone: 415.839.6885
Fax: 415.882.0495

E-Mail: cary@wikimedia.org

Re: Searchability of non-mainspace pages
Newyorkbrad writes:

> As a matter of disclosure, although I have raised this concern in passing on
> prior occasions, my attention has been focused (this is something of an
> understatement) on it again by an ongoing and extremely unpleasant thread
> concerning me on the Wikipedia Review site. I understand that my concerns
> in this matter might be discounted for that reason.

They're not discounted by me. I think your concern is a valid one.


--Mike




Re: Searchability of non-mainspace pages
On Mon, Apr 28, 2008 at 10:06 AM, Newyorkbrad (Wikipedia)
<newyorkbrad@gmail.com> wrote:
> My attention has repeatedly been drawn to serious negative effects created
> by the ability of Google and other searches to search and display pages
> outside the mainspace, including pages such as XfD's, DRV's, AN/I
> discussions, and the like. Some of these discussions have taken place
> on-wiki and others, I am advised, on discussion of OTRS tickets posted by
> affected persons.

...

> As a matter of disclosure, although I have raised this concern in passing on
> prior occasions, my attention has been focused (this is something of an
> understatement) on it again by an ongoing and extremely unpleasant thread
> concerning me on the Wikipedia Review site. I understand that my concerns
> in this matter might be discounted for that reason. Nonetheless, they are
> sincere, of long standing, and I urge that they receive priority attention.

Not at all. Your concerns sound healthy and thoughtful. Thanks for
bringing it up.

--
KIZU Naoko
http://d.hatena.ne.jp/Britty (in Japanese)
Quote of the Day (English): http://en.wikiquote.org/wiki/WQ:QOTD
