Mailing List Archive

PDF4J Project: Gathering Feature Requests
I have set up the project PDF4J on SourceForge
(http://pdf4j.sourceforge.net). At this point we are simply gathering
requirements--we are not yet ready to start writing code. Supporting
the needs of tools like Lucene is one of our key target use cases, so we
would be very interested to know what Lucene integrators would want from
a PDF access library.

Thanks,

Eliot
--
W. Eliot Kimber, eliot@isogen.com
Consultant, ISOGEN International

1016 La Posada Dr., Suite 240
Austin, TX 78752 Phone: 512.656.4139

--
To unsubscribe, e-mail: <mailto:lucene-user-unsubscribe@jakarta.apache.org>
For additional commands, e-mail: <mailto:lucene-user-help@jakarta.apache.org>
Re: PDF4J Project: Gathering Feature Requests
This is very exciting.

Are you planning on basing the code on other pdf readers / writers?

--Peter


On 5/6/02 10:54 AM, "W. Eliot Kimber" <eliot@isogen.com> wrote:

> I have set up the project PDF4J on SourceForge
> (http://pdf4j.sourceforge.net). At this point we are simply gathering
> requirements--we are not yet ready to start writing code. Supporting
> the needs of tools like Lucene is one of our key target use cases, so we
> would be very interested to know what Lucene integrators would want from
> a PDF access library.


Re: PDF4J Project: Gathering Feature Requests
Peter Carlson wrote:
>
> This is very exciting.
>
> Are you planning on basing the code on other pdf readers / writers?

At this point I haven't found any Java PDF reader that meets my
requirements. One of the motivations for doing this is the problems we
had using Etymon's PJ library: both the license (GPL, not LGPL) and the
quality of the code itself, which does not meet our engineering
standards. I want to use an LGPL library so that people can use the code
in projects that are not themselves open sourced but I want the library
itself to be protected.

For writing, we may or may not be able to leverage existing code; I don't
know yet.

Note too that there are two aspects of writing: creating a valid PDF
data stream and creating meaningful page layouts--we are not addressing
the second of these (there are lots of libraries that will create useful
PDF output from various non-PDF inputs). Our main writing use case is the
rewriting of existing PDFs following some amount of manipulation through
our API.

A caution: I am still waiting to get approval from my employers to do
this work as open source--it may be a while before I can even start on
the coding.

Cheers,

Eliot

Re: PDF4J Project: Gathering Feature Requests
Have you looked at xpdf?

Www.foolabs.com/xpdf

--Peter

On 5/6/02 3:58 PM, "W. Eliot Kimber" <eliot@isogen.com> wrote:

> Peter Carlson wrote:
>>
>> This is very exciting.
>>
>> Are you planning on basing the code on other pdf readers / writers?
>
> At this point I haven't found any Java PDF reader that meets my
> requirements. One of the motivations for doing this is the problems we
> had using Etymon's PJ library: both the license (GPL, not LGPL) and the
> quality of the code itself, which does not meet our engineering
> standards. I want to use an LGPL library so that people can use the code
> in projects that are not themselves open sourced but I want the library
> itself to be protected.


Re: PDF4J Project: Gathering Feature Requests
----- Original Message -----
From: "Peter Carlson" <carlson@bookandhammer.com>
To: "Lucene Users List" <lucene-user@jakarta.apache.org>
Sent: Tuesday, May 07, 2002 7:51 AM
Subject: Re: PDF4J Project: Gathering Feature Requests


> Have you looked at xpdf?
>
> Www.foolabs.com/xpdf

From what I know of xpdf, it's not written in Java; I think it's C or C++.
Even if there are JNI hooks to the code, I doubt it would be as nice as a
pure Java library.

K


Re: PDF4J Project: Gathering Feature Requests
I was just thinking of an existing code base that you could port to pure
Java.

--Peter

On 5/6/02 5:26 PM, "Kelvin Tan" <kelvin@relevanz.com> wrote:

> there are JNI hooks to the code, I doubt it would be as nice as a Java
> library for that.
>
> K


Re: PDF4J Project: Gathering Feature Requests
Good point. Plus the xpdf project (AFAIK) is being actively developed. One
problem though: It's released under GPL, so any port will probably have to
adopt GPL too (unless they can be convinced to re-release it under a less
restrictive license).

----- Original Message -----
From: "Peter Carlson" <carlson@bookandhammer.com>
To: "Lucene Users List" <lucene-user@jakarta.apache.org>
Sent: Tuesday, May 07, 2002 10:36 AM
Subject: Re: PDF4J Project: Gathering Feature Requests


> I was just thinking of an existing code base that you could port to pure
> java.
>
> --Peter


Re: PDF4J Project: Gathering Feature Requests
Peter Carlson wrote:
>
> Have you looked at xpdf?
>
> Www.foolabs.com/xpdf

A quick look at the Web site suggests that it might be a good source for
seeing how certain PDF problems are solved, if nothing else. It might be
useful through JNI too, I don't know (I don't have any experience using
JNI to expose C libraries).

Thanks for the tip.

Cheers,

Eliot

Re: PDF4J Project: Gathering Feature Requests
I can't help but notice the similarities between this effort and the new POI
project at Apache (Jakarta).

A lot of the language, vision, etc. maps directly... just substitute "Adobe
PDF" for "Microsoft Excel" and you have a lot of your requirements and
perhaps your solution architecture.

POI seeks to both read and write. But on the read-side they use an
event-callback technique - similar in concept to SAX for XML. Sounds like a
nice way to do this sort of thing.
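
To make the idea concrete, here is a rough sketch of what a SAX-style
callback API for PDF text might look like. This is only an illustration:
PdfTextHandler, parse, TextCollector and extract are names I made up, not
part of POI, PDF4J or any existing library, and the "parser" just walks
pre-split text runs instead of decoding a real PDF.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class PdfEventSketch {

    /** Hypothetical callback interface, analogous to a SAX ContentHandler. */
    interface PdfTextHandler {
        void startPage(int pageNumber);
        void text(String run);
        void endPage(int pageNumber);
    }

    /**
     * Stand-in for a parser: fires events for pages that are already split
     * into text runs. A real reader would decode PDF content streams here.
     */
    static void parse(List<List<String>> pages, PdfTextHandler handler) {
        for (int i = 0; i < pages.size(); i++) {
            int pageNumber = i + 1;
            handler.startPage(pageNumber);
            for (String run : pages.get(i)) {
                handler.text(run);
            }
            handler.endPage(pageNumber);
        }
    }

    /** Handler that collects one plain-text string per page, e.g. for an indexer. */
    static class TextCollector implements PdfTextHandler {
        final List<String> pageTexts = new ArrayList<>();
        private StringBuilder current = new StringBuilder();

        public void startPage(int pageNumber) {
            current = new StringBuilder();
        }

        public void text(String run) {
            if (current.length() > 0) {
                current.append(' ');
            }
            current.append(run);
        }

        public void endPage(int pageNumber) {
            pageTexts.add(current.toString());
        }
    }

    /** Convenience wrapper: run the parser with a TextCollector. */
    static List<String> extract(List<List<String>> pages) {
        TextCollector collector = new TextCollector();
        parse(pages, collector);
        return collector.pageTexts;
    }

    public static void main(String[] args) {
        List<List<String>> pages = Arrays.asList(
                Arrays.asList("Hello", "PDF"),
                Arrays.asList("Second", "page"));
        System.out.println(extract(pages));
    }
}
```

The appeal for something like Lucene is that the caller never has to hold
the whole document in memory: the handler sees each piece of text as it is
read and can pass it straight to the indexer.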

And of course, there is the Lucene connection. They mention Lucene
specifically as a project that needs binary level read access to MS Office
documents.

But I think that PDF is mentioned even more often on this list.

----- Original Message -----
From: W. Eliot Kimber <eliot@isogen.com>
To: Lucene Users List <lucene-user@jakarta.apache.org>
Sent: Monday, May 06, 2002 3:58 PM
Subject: Re: PDF4J Project: Gathering Feature Requests


> Peter Carlson wrote:
> >
> > This is very exciting.
> >
> > Are you planning on basing the code on other pdf readers / writers?
>
> At this point I haven't found any Java PDF reader that meets my
> requirements. One of the motivations for doing this is the problems we
> had using Etymon's PJ library: both the license (GPL, not LGPL) and the
> quality of the code itself, which does not meet our engineering
> standards. I want to use an LGPL library so that people can use the code
> in projects that are not themselves open sourced but I want the library
> itself to be protected.


Re: PDF4J Project: Gathering Feature Requests
But POI only handles formats based on the OLE 2 Compound Document format.

CNew wrote:

>I can't help but notice the similarities between this effort and the new POI
>project at Apache (Jakarta).
>
>A lot of the language, vision, etc. maps directly... just substitute "Adobe
>PDF" for "Microsoft Excel" and you have a lot of your requirements and
>perhaps your solution architecture.
>
>POI seeks to both read and write. But on the read-side they use an
>event-callback technique - similar in concept to SAX for XML. Sounds like a
>nice way to do this sort of thing.
>
>And of course, there is the Lucene connection. They mention Lucene
>specifically as a project that needs binary level read access to MS Office
>documents.
>
>But I think that PDF is mentioned even more often on this list.



Re: PDF4J Project: Gathering Feature Requests
I wonder if Adobe would donate their Java code. I understand that they have
given up on it.
Does anyone have Adobe connections?

----- Original Message -----
From: W. Eliot Kimber <eliot@isogen.com>
To: Lucene Users List <lucene-user@jakarta.apache.org>
Sent: Monday, May 06, 2002 3:58 PM
Subject: Re: PDF4J Project: Gathering Feature Requests


> Peter Carlson wrote:
> >
> > This is very exciting.
> >
> > Are you planning on basing the code on other pdf readers / writers?
>
> At this point I haven't found any Java PDF reader that meets my
> requirements. One of the motivations for doing this is the problems we
> had using Etymon's PJ library: both the license (GPL, not LGPL) and the
> quality of the code itself, which does not meet our engineering
> standards. I want to use an LGPL library so that people can use the code
> in projects that are not themselves open sourced but I want the library
> itself to be protected.

