Mailing List Archive

Migrating WhitespaceTokenizerFactory from 8.2 to 9.4
I am migrating my project’s usage of Lucene from 8.2 to 9.4.
The migration documentation has been very helpful,
but doesn’t help me resolve this exception:

‘Caused by: java.lang.IllegalArgumentException: A SPI class of type org.apache.lucene.analysis.TokenizerFactory with name 'whitespace' does not exist. You need to add the corresponding JAR file supporting this SPI to your classpath. The current classpath supports the following names: [standard]’

My project includes the lucene-analysis-common JAR,
and my JAR includes org/apache/lucene/analysis/core/WhitespaceTokenizerFactory.class.

I am not familiar with how Java SPI is configured and built.

I tried creating META-INF/services/org.apache.lucene.analysis.TokenizerFactory
containing: org.apache.lucene.analysis.core.WhitespaceTokenizerFactory

What am I missing?

Any help would be appreciated.

Thanks,
David Shifflett
Re: Migrating WhitespaceTokenizerFactory from 8.2 to 9.4
Hi,

We can't help you here without the full source code and your build system
setup. Generally, these errors only happen if you are using shading
or another tool that creates uber JARs. E.g., for Maven's uber JARs you
need to add some resource transformers so that the result includes all
necessary files. I checked the Lucene JAR file; it has all the services entries.
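
As a rough illustration of how to check which services entries are actually visible at runtime, the following sketch uses only the JDK to list every copy of the TokenizerFactory services file on the classpath and the provider class names each one declares. The class name ListServiceEntries and the method providerLines are made up for this example, and it assumes classpath (not module path) loading.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;

// Hypothetical diagnostic helper; not part of Lucene.
public class ListServiceEntries {
    // Services file name taken from the question above.
    public static final String RESOURCE =
        "META-INF/services/org.apache.lucene.analysis.TokenizerFactory";

    // Returns one "url -> provider class" string per declared provider,
    // across every copy of the resource the class loader can see.
    public static List<String> providerLines(ClassLoader loader, String resource) {
        List<String> result = new ArrayList<>();
        try {
            Enumeration<URL> urls = loader.getResources(resource);
            while (urls.hasMoreElements()) {
                URL url = urls.nextElement();
                try (BufferedReader in = new BufferedReader(
                        new InputStreamReader(url.openStream(), StandardCharsets.UTF_8))) {
                    String line;
                    while ((line = in.readLine()) != null) {
                        // Services files allow '#' comments and blank lines.
                        int hash = line.indexOf('#');
                        if (hash >= 0) {
                            line = line.substring(0, hash);
                        }
                        line = line.trim();
                        if (!line.isEmpty()) {
                            result.add(url + " -> " + line);
                        }
                    }
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> lines =
            providerLines(ListServiceEntries.class.getClassLoader(), RESOURCE);
        if (lines.isEmpty()) {
            System.out.println("no entries found for " + RESOURCE);
        }
        lines.forEach(System.out::println);
    }
}
```

Run against the repackaged JAR, a single file listing only the standard tokenizer factory would point to entries having been lost during merging; with the original Lucene JARs on the classpath, one would expect to see one file per JAR.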

General recommendation: please do not repackage Lucene; use the
*original* JAR files. Also, if you are using the Java 11 module system in
your project, it is very important not to repackage JARs, otherwise it
breaks completely! The reason: when the module system is used on Java 11,
service providers are found through the module-info.class file that is part
of every JAR (META-INF is no longer used). A file with exactly the same
name is part of every JAR file, so when you merge them it breaks, as only
one survives (e.g. the one from Lucene Core, as I see in your output;
Lucene Core only has the standard tokenizer).
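
The "only one survives" effect can be sketched with plain Java maps standing in for JAR contents. This is a simplified model, not how any particular shading tool is implemented: the class MergeServicesDemo, its method names, and the two single-entry "JARs" are invented for illustration. It models the classpath case, where same-named META-INF/services files collide; a compiled module-info.class cannot be combined by simple concatenation in the same way.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model: each "JAR" is a map from entry name to text content.
public class MergeServicesDemo {
    public static final String SERVICES =
        "META-INF/services/org.apache.lucene.analysis.TokenizerFactory";

    // Naive merge: the first JAR that contributes an entry name wins,
    // so later services files with the same name are silently dropped.
    public static Map<String, String> mergeFirstWins(List<Map<String, String>> jars) {
        Map<String, String> merged = new LinkedHashMap<>();
        for (Map<String, String> jar : jars) {
            jar.forEach(merged::putIfAbsent);
        }
        return merged;
    }

    // Service-aware merge: entries under META-INF/services/ are concatenated
    // line by line; all other entries keep first-wins behaviour.
    public static Map<String, String> mergeConcatServices(List<Map<String, String>> jars) {
        Map<String, String> merged = new LinkedHashMap<>();
        for (Map<String, String> jar : jars) {
            jar.forEach((name, content) -> {
                if (name.startsWith("META-INF/services/")) {
                    merged.merge(name, content, (a, b) -> a + "\n" + b);
                } else {
                    merged.putIfAbsent(name, content);
                }
            });
        }
        return merged;
    }

    public static void main(String[] args) {
        // Illustrative contents only; real Lucene JARs list many more factories.
        Map<String, String> core = Map.of(
            SERVICES, "org.apache.lucene.analysis.standard.StandardTokenizerFactory");
        Map<String, String> common = Map.of(
            SERVICES, "org.apache.lucene.analysis.core.WhitespaceTokenizerFactory");
        List<Map<String, String>> jars = List.of(core, common);

        System.out.println("first wins:");
        System.out.println(mergeFirstWins(jars).get(SERVICES));
        System.out.println("concatenated:");
        System.out.println(mergeConcatServices(jars).get(SERVICES));
    }
}
```

In this model the first-wins merge leaves only the standard tokenizer factory registered, which matches the "[standard]" list in the exception, while the concatenating merge keeps both providers.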

Uwe

On 28.10.2022 at 21:46, Shifflett, David [USA] wrote:
> I am migrating my project’s usage of Lucene from 8.2 to 9.4.
> The migration documentation has been very helpful,
> but doesn’t help me resolve this exception:
>
> ‘Caused by: java.lang.IllegalArgumentException: A SPI class of type org.apache.lucene.analysis.TokenizerFactory with name 'whitespace' does not exist. You need to add the corresponding JAR file supporting this SPI to your classpath. The current classpath supports the following names: [standard]’
>
> My project includes the lucene-analysis-common JAR,
> and my JAR includes org/apache/lucene/analysis/core/WhitespaceTokenizerFactory.class.
>
> I am not familiar with how Java SPI is configured and built.
>
> I tried creating META-INF/services/org.apache.lucene.analysis.TokenizerFactory
> containing: org.apache.lucene.analysis.core.WhitespaceTokenizerFactory
>
> What am I missing?
>
> Any help would be appreciated.
>
> Thanks,
> David Shifflett
>
--
Uwe Schindler
Achterdiek 19, D-28357 Bremen
https://www.thetaphi.de
eMail: uwe@thetaphi.de


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org
Re: Migrating WhitespaceTokenizerFactory from 8.2 to 9.4
Based on what Uwe says, if your problem does relate to shading but you need
to package things in a single executable JAR,
https://github.com/nsoft/uno-jar (my fork of the old OneJar utility) might
work better for you, since it never unpacks any JAR file but instead teaches
Java how to treat the overall JAR file as if it were a filesystem. If you
try to use it and can't load services, file an issue. I have used it to
package JesterJ, which contains a processor that pre-analyzes fields
based on a supplied Solr schema, so that would have loaded tokenizers, and I
think there is some chance that I've already proved it handles what you
(maybe?) need. It has Ant and Gradle tasks, and someone wrote a Maven plugin
for it too, though that's in a different repository.

I adopted and sporadically maintain uno-jar because I've never been a fan
of shade/shadow-style uber JARs. Also, capsule-style packaging, which
extracts itself, litters the destination, which is unsavory (capsule also
appears to have gone 6 years without a commit, and the capsule.io web site is
broken for me, so it probably isn't maintained). This mail reminds me that
it's past time to look at Java 17 compatibility too, but uno-jar is not
abandoned, just chronically lagging because I'm busy... (of course,
contributions are welcome!)

-Gus

On Sat, Oct 29, 2022 at 5:57 AM Uwe Schindler <uwe@thetaphi.de> wrote:

> Hi,
>
> We can't help you here without the full source code and your build system
> setup. Generally, these errors only happen if you are using shading
> or another tool that creates uber JARs. E.g., for Maven's uber JARs you
> need to add some resource transformers so that the result includes all
> necessary files. I checked the Lucene JAR file; it has all the services entries.
>
> General recommendation: please do not repackage Lucene; use the
> *original* JAR files. Also, if you are using the Java 11 module system in
> your project, it is very important not to repackage JARs, otherwise it
> breaks completely! The reason: when the module system is used on Java 11,
> service providers are found through the module-info.class file that is part
> of every JAR (META-INF is no longer used). A file with exactly the same
> name is part of every JAR file, so when you merge them it breaks, as only
> one survives (e.g. the one from Lucene Core, as I see in your output;
> Lucene Core only has the standard tokenizer).
>
> Uwe
>

--
http://www.needhamsoftware.com (work)
http://www.the111shift.com (play)