Mailing List Archive

Use Case clarification
Hi,

Before doing a deep dive into Lucene, I would appreciate it if you would
clarify a few things so I know whether this is the right project to fulfill my
objective.

1. It is my understanding that Google search is a more elaborate utility,
but not unlike the *nix search utility grep, which searches for a string
pattern recursively in text files, for example .java or .html files.
In this case the search starts from the current directory.

grep -RiIl 'search' .

Quick grep explanation:

-R - recursive search
-i - case-insensitive
-I - skip binary files
-l - print only the names of files that contain a match

2. Further to my understanding, if it is correct: is the objective of Lucene
pretty much the same, i.e. searching for string patterns recursively?

3. If Lucene is a search engine like Google or grep, do I just
point it to my website root directory?

4. Can I use Lucene as a web search engine like Google? If so, where
would I point it so that Lucene can recursively search websites on the
WWW?

5. Or is Lucene's use case something else entirely?
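
For readers less familiar with grep, the command in question 1 behaves
roughly like the Python sketch below. This is only an approximation under
stated assumptions: `grep_files` is a hypothetical helper name, the pattern
is treated as a plain substring rather than a regular expression, only ASCII
letters are case-folded, and a file counts as binary if it contains a NUL
byte.

```python
import os


def grep_files(root, pattern):
    """Return sorted paths under root whose contents contain pattern, ignoring ASCII case."""
    needle = pattern.lower().encode("utf-8")
    matches = []
    for dirpath, _dirnames, filenames in os.walk(root):  # -R: recurse into subdirectories
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                with open(path, "rb") as handle:
                    data = handle.read()
            except OSError:
                continue  # unreadable file: skip it
            if b"\x00" in data:  # -I: skip files that look binary (assumed heuristic)
                continue
            if needle in data.lower():  # -i: case-insensitive (ASCII only here)
                matches.append(path)  # -l: keep only the file name, not the matching lines
    return sorted(matches)


if __name__ == "__main__":
    # Start from the current directory, as in the grep example.
    for path in grep_files(".", "search"):
        print(path)
```

Note that every call re-reads every file under the starting directory, which
is the behaviour the questions above are comparing a search engine against.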


Thanks
Re: Use Case clarification
Hi

The following FAQ might be a bit outdated, but you should nevertheless
find some answers there:

https://cwiki.apache.org/confluence/display/lucene/LuceneFAQ

For example, for an answer to your question 4, see

https://cwiki.apache.org/confluence/display/lucene/LuceneFAQ#LuceneFAQ-CanIuseLucenetocrawlmysiteorothersitesontheInternet?

If I understand your questions correctly, your objective is to provide a
search engine for your company website?

HTH

Michael



On 05.04.21 at 11:34, Som Lima wrote:
> [...]


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org
Re: Use Case clarification
Thank you for your reply.
Yes, I would like to provide a search engine for my company website and, at
the same time, build a web search engine as a personal project.

On Mon, 5 Apr 2021, 10:57 Michael Wechner, <michael.wechner@wyona.com>
wrote:

> [...]
Re: Use Case clarification
Lucene is basically a search library and Solr is a search web
application using Lucene.

So, depending on where you want to set your "starting point", you can
definitely do this with Lucene. However, you might want to consider Solr,
which is based on Lucene and provides many features out of the box:

https://solr.apache.org/features.html
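
To make the "search library" idea a bit more concrete: unlike grep, which
scans every file on each query, Lucene builds an inverted index ahead of time
and answers queries from that index, while fetching the documents (crawling)
is left to the application. The Python sketch below is only a toy
illustration of that idea and not Lucene's API; `ToyIndex`, `add`, and
`search` are hypothetical names, and real Lucene adds analyzers, relevance
ranking, and on-disk storage on top.

```python
import re
from collections import defaultdict


class ToyIndex:
    """A toy in-memory inverted index: term -> set of document ids."""

    def __init__(self):
        self.postings = defaultdict(set)

    @staticmethod
    def tokenize(text):
        # Lowercase and split on non-alphanumeric characters
        # (a crude stand-in for what a real analyzer does).
        return re.findall(r"[a-z0-9]+", text.lower())

    def add(self, doc_id, text):
        # Indexing step: done once per document, before any query arrives.
        for term in self.tokenize(text):
            self.postings[term].add(doc_id)

    def search(self, query):
        """Return ids of documents that contain every query term (AND semantics)."""
        terms = self.tokenize(query)
        if not terms:
            return set()
        result = set(self.postings.get(terms[0], set()))
        for term in terms[1:]:
            result &= self.postings.get(term, set())
        return result


if __name__ == "__main__":
    index = ToyIndex()
    # The application decides what to feed in, e.g. pages fetched by a crawler.
    index.add("index.html", "Welcome to our company website")
    index.add("Search.java", "Search the website with Lucene")
    print(sorted(index.search("website lucene")))  # prints ['Search.java']
```

The query only looks up the terms in the index and never touches the
original files again, which is the main difference from the grep approach.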

Also see

https://cwiki.apache.org/confluence/display/SOLR/FAQ

Regarding crawlers in combination with Solr, see for example

https://cwiki.apache.org/confluence/display/SOLR/SolrEcosystem#SolrEcosystem-CrawlersAndConnectors

or

https://www.octoparse.com/blog/10-best-open-source-web-scraper#

Cheers

Michael




On 05.04.21 at 11:59, Som Lima wrote:
> [...]


Re: Use Case clarification
Thank you.
These look like the starting point I need.


On Mon, 5 Apr 2021, 11:16 Michael Wechner, <michael.wechner@wyona.com>
wrote:

> [...]