Repository of current Gentoo documentation?
I've been converting the Gentoo handbook into Plucker format for a while
now in an automated fashion, and I'd like to expand that to include as
much of the existing Gentoo documentation as possible. I'm not a Gentoo
user myself (though I've been using Linux actively for over a decade),
but a significant portion of our userbase is.

You can see the results of one of these conversions here:

http://code.plkr.org/gentoo

Is the documentation for Gentoo in a repository somewhere? svn? cvs?
rsync? I'd like to keep my local copy current more efficiently, rather
than hitting the Gentoo mirrors over http to refetch the same material
whenever it changes or is updated.

If not, would it be feasible to create one? I'd be more than happy to
host it if necessary (I host and maintain dozens of other projects over
at SourceFubar.Net).

I mirror LDP, Wikimedia's projects, Project Gutenberg, CPAN, and quite a
few other things. I'd love to mirror Gentoo's documentation for the
purpose of reconverting it to Plucker regularly.

Let me know, and I'll do what I can to help. I just don't like having to
use http to refetch the same HTML content over and over.

Thanks for your time.

--
David A. Desrosiers
desrod@gnu-designs.com
http://gnu-designs.com
Re: Repository of current Gentoo documentation?

Please read our documentation tips and tricks guide; a snapshot of the
docs is generated daily in tarball format:

http://www.gentoo.org/proj/en/gdp/doc/doc-tipsntricks.xml

That should do the trick; you can set up a cron job to automatically
fetch the latest snapshot. Note that you can replace [en] in the link
with the two-letter language code of your choice to get docs in other
languages.
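
For example, a nightly fetch could look something like this rough Python
sketch; the snapshot URL below is only a placeholder, since the real
path is spelled out in the guide:

#!/usr/bin/env python3
# Minimal nightly-fetch sketch. SNAPSHOT_URL is a placeholder: substitute
# the actual snapshot path from the tips and tricks guide, and swap "en"
# for another two-letter language code to get docs in another language.
import urllib.request

SNAPSHOT_URL = "http://www.gentoo.org/SNAPSHOT-PATH-FROM-GUIDE/en-docs.tar.gz"

with urllib.request.urlopen(SNAPSHOT_URL) as response:
    with open("en-docs.tar.gz", "wb") as out:
        out.write(response.read())

A crontab entry like "0 4 * * * python3 /path/to/fetch-docs.py" would
then refresh the local copy once a day.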

David A. Desrosiers wrote:
> Is the documentation for Gentoo in a repository somewhere? svn? cvs?
> rsync? I'd like to keep my local copy current more efficiently, rather
> than hitting the Gentoo mirrors over http to refetch the same material
> whenever it changes or is updated.

Re: Repository of current Gentoo documentation?
On Sun, 2006-04-16 at 13:38 -0700, Josh Saddler wrote:
> That should do the trick; you can set up a cron job to automatically
> fetch the latest snapshot. Note that you can replace [en] in the link
> with the two-letter language code of your choice to get docs in other
> languages.

While somewhat useful, in that these tarballs contain many more docs
than I could fetch one-by-one from the server over http, I'm still
fetching them from the server over http, only now I can't do it in a
way that guarantees I stay current.

The problem is that once a single byte in the original data has changed,
the entire compressed stream differs from that byte onward, and tools
like wget, curl, LWP, and so on don't know to rewind to the changed
bytes and refetch from THAT point forward. They simply look at
Content-Length and fetch from the current byte count until they reach
the end.
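
A quick sketch makes the point: flip one byte near the front of the
input and the gzip output diverges from roughly that offset onward, so
a byte-range "resume" of a changed tarball would just pull down garbage.
(gzip.compress here stands in for the tarball's gzip layer; mtime=0 only
makes the run reproducible.)

import gzip, os

data = os.urandom(4096) + b"x" * 4096
changed = bytes([data[0] ^ 0xFF]) + data[1:]   # exactly one byte differs

a = gzip.compress(data, mtime=0)      # mtime=0 keeps the gzip headers equal
b = gzip.compress(changed, mtime=0)

# First offset at which the two compressed streams differ:
diverge = next((i for i, (x, y) in enumerate(zip(a, b)) if x != y),
               min(len(a), len(b)))
print("compressed streams diverge at byte", diverge,
      "of", min(len(a), len(b)))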

So I end up fetching the tarball with wget, deleting it, and refetching
the whole thing whenever I know the content has changed, or whenever my
cron job runs.

This means I'm actually using (wasting?) MORE bandwidth than if I
fetched the rendered HTML (or XML) versions from the server directly,
checking Last-Modified headers and only downloading content when it has
changed.
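
For the individual pages, the conditional GET I have in mind looks
roughly like this sketch (the handbook URL is just an example page):

import urllib.request, urllib.error

URL = "http://www.gentoo.org/doc/en/handbook/handbook-x86.xml"  # example page
last_modified = None  # persist this between runs (a dotfile, dbm, etc.)

request = urllib.request.Request(URL)
if last_modified:
    request.add_header("If-Modified-Since", last_modified)

try:
    with urllib.request.urlopen(request) as response:
        last_modified = response.headers.get("Last-Modified")
        body = response.read()       # changed (or first fetch): download it
        print("fetched", len(body), "bytes")
except urllib.error.HTTPError as err:
    if err.code == 304:              # server says unchanged; nothing sent
        print("not modified; skipped")
    else:
        raise

A 304 costs a few hundred bytes of headers instead of the whole
document, which is exactly the saving the tarball approach throws away.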

I'll see what I can come up with as a better compromise in the meantime.
A public svn/cvs repository would be helpful (I've been told it's
coming, but "Not Ready Yet(tm)"), as would an rsync server to fetch the
docs directly from the tree.


--
David A. Desrosiers
desrod@gnu-designs.com
http://gnu-designs.com