Mailing List Archive

I resurrected a 2013 project (Lucene 4.2) and I want to convert it to 8.6
Hello. First, is there a Google forum or another site where the questions
from the mailing list can be browsed?

My project used an indexed dictionary, plus files that I wanted to check for
spelling errors and get suggestions for.

For fun, I tried just updating the Maven dependencies, and my code no longer
compiles. That was expected :)

So I'll rewrite it from scratch, which will be cleaner too.

At that time I used dictionaries from Wiktionary, and a script to convert
Hunspell dictionaries to wordlist format.

Are there official dictionaries that I can use directly now?

I found a project, LanguageTool, that has a lot of dictionaries and uses
Lucene plus a Hunspell wrapper (native -> Java), but it doesn't work on
Windows 10.


As a starting point, I want to create a small POC that uses English and
French dictionaries and parses a file to check for spelling errors.

After that, I want to add custom dictionaries, find suggestions, and
highlight the misspelled word in the text. That is what I had with Lucene 4.2.
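
The POC described above can be sketched in plain Java. This is only an illustration under stated assumptions: the class `SpellCheckSketch`, its methods, the `[[ ]]` highlight markers, and the tiny in-memory word lists are all hypothetical and are not taken from Lucene or from the original project. In practice the word lists would be read from dictionary files.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical POC sketch: plain Java only, no Lucene classes involved.
class SpellCheckSketch {

    // A word is a run of Unicode letters, so accented French letters are covered.
    private static final Pattern WORD = Pattern.compile("\\p{L}+");

    // Merges several word lists (e.g. English, French, custom) into one lower-cased set.
    @SafeVarargs
    static Set<String> mergeDictionaries(List<String>... wordLists) {
        Set<String> merged = new HashSet<>();
        for (List<String> list : wordLists) {
            for (String word : list) {
                merged.add(word.toLowerCase(Locale.ROOT));
            }
        }
        return merged;
    }

    // Returns the text with every word that is missing from the dictionary wrapped in [[ ]].
    static String highlightUnknown(String text, Set<String> dictionary) {
        StringBuilder out = new StringBuilder();
        Matcher m = WORD.matcher(text);
        int last = 0;
        while (m.find()) {
            out.append(text, last, m.start()); // copy punctuation and spaces unchanged
            String word = m.group();
            boolean known = dictionary.contains(word.toLowerCase(Locale.ROOT));
            out.append(known ? word : "[[" + word + "]]");
            last = m.end();
        }
        out.append(text, last, text.length());
        return out.toString();
    }

    public static void main(String[] args) {
        Set<String> dictionary = mergeDictionaries(
                Arrays.asList("the", "cat", "sleeps"), // stand-in for an English list
                Arrays.asList("le", "chat", "dort"),   // stand-in for a French list
                Arrays.asList("lucene"));              // stand-in for a custom list
        System.out.println(highlightUnknown("The caat sleeps; le chat dortt.", dictionary));
    }
}
```

Merging all lists into one set keeps the lookup simple, at the cost of not knowing which language a word came from. Keeping one set per language would be the alternative if that matters.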

Any thoughts on what has changed since 2013? I'll start by looking at the
code on GitHub.

Thanks
Re: I resurrected a 2013 project (Lucene 4.2) and I want to convert it to 8.6
You could probably search the web for a dictionary and download it as a text
file. For English, there is WordNet, which has a Java client for accessing it.

I think you would use a FuzzyQuery, or QueryParser with a tilde (~), to
indicate the terms you’d like to spell-check. This will find
terms within an edit distance of 2.
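
To make "within an edit distance of 2" concrete, here is a minimal plain-Java sketch of Levenshtein distance with a filter over a word list. It is an illustration only, not how Lucene implements FuzzyQuery, and the class and method names (`EditDistanceSketch`, `distance`, `candidates`) are hypothetical. It counts inserts, deletes, and substitutions; FuzzyQuery by default also treats swapping two adjacent characters as a single edit, so the two can disagree on transposed letters.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustration only: plain-Java Levenshtein distance, not Lucene's implementation.
class EditDistanceSketch {

    // Classic dynamic-programming distance over inserts, deletes, and substitutions.
    static int distance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // turning "" into b[0..j) takes j inserts
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // turning a[0..i) into "" takes i deletes
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // Returns the dictionary entries whose distance to the word is at most maxEdits.
    static List<String> candidates(String word, List<String> dictionary, int maxEdits) {
        List<String> result = new ArrayList<>();
        for (String entry : dictionary) {
            if (distance(word, entry) <= maxEdits) {
                result.add(entry);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> dictionary = Arrays.asList("lucene", "search", "index", "spelling");
        System.out.println(candidates("lucen", dictionary, 2));
    }
}
```

Scanning the whole word list like this is fine for a small demo, but it is linear in the dictionary size for every lookup, which is one reason to let an index do the matching instead.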



On Tue, 4 Aug 2020 at 4:17 AM, Sébastien Dionne <sebastien.dionne@gmail.com>
wrote:

Re: I resurrected a 2013 project (Lucene 4.2) and I want to convert it to 8.6
Well, a _lot_ has changed since 4.x. Rather than look through the code, I’d
start with the reference guide and the upgrade notes and major changes
that accompany any release.

As for “official dictionaries”, no, there aren’t any. “Somewhere out on the
web” there are certainly various word lists you can download. The problem is
that almost every Solr installation is specialized. An e-commerce site had
better have a lot of brand names. Insurance use cases need medical terms.
Chemistry… oh my aching head.

Best,
Erick

> On Aug 4, 2020, at 12:39 AM, Ali Akhtar <ali@ali.actor> wrote:


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org
Re: I resurrected a 2013 project (Lucene 4.2) and I want to convert it to 8.6
Erick, I don't think this is about Solr but Lucene. Your comments are still
applicable, though Lucene's "reference guide" is its Javadocs.

On Tue, Aug 4, 2020 at 1:55 PM Erick Erickson <erickerickson@gmail.com>
wrote:


--
Adrien
Re: I resurrected a 2013 project (Lucene 4.2) and I want to convert it to 8.6
Exactly. This is for Lucene. And there is no guide or tutorial, only the
Javadoc. I think I'll have to check the source code.

On Thu, Aug 6, 2020 at 9:15 AM Adrien Grand <jpountz@gmail.com> wrote:
