Please, does anyone have a JSPParser class that parses JSPs?
I hacked the HTMLParser class that comes with the Lucene demo and made it parse and index JSPs. But when I ran a search, the JSP tags
<%pageContext.setAttribute( "req", request );%>
<%@ page import="com.propelnewmedia.tags.BreadcrumbTrailer"%>
and so on, were included in the summary.
Then I figured out a way to get the JSP tags out of the summary (and, I think, out of the index as well).
What I did was designate JSP tags (anything starting with <% and ending with %>) as a third comment type in the void CommentTag() :, TOKEN :, and <WithinCommentN> TOKEN : sections of HTMLParser.jj.
I copied the relevant code for Comment2, adapted it for my new comment type, and then recompiled HTMLParser.jj using javacc.
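For anyone who would rather not touch the grammar, the same "anything from <% to %>" rule can be approximated with a regular expression applied to the page text before it reaches the parser. This is only a rough sketch of that alternative, not the HTMLParser.jj change described above; the class name JspTagStripper and its strip method are made up for illustration.

```java
import java.util.regex.Pattern;

// Hypothetical helper (not part of Lucene): removes JSP tags from page source.
public class JspTagStripper {
    // DOTALL lets a scriptlet that spans several lines match as one tag;
    // the reluctant quantifier stops at the first closing "%>".
    private static final Pattern JSP_TAG = Pattern.compile("<%.*?%>", Pattern.DOTALL);

    // Returns the source with every <% ... %> span removed.
    public static String strip(String source) {
        return JSP_TAG.matcher(source).replaceAll("");
    }

    public static void main(String[] args) {
        String page = "<%@ page import=\"com.propelnewmedia.tags.BreadcrumbTrailer\"%>"
                + "<p>Hello</p>"
                + "<%pageContext.setAttribute( \"req\", request );%>";
        System.out.println(strip(page));
    }
}
```

One known weakness: a scriptlet that contains the characters %> inside a Java string literal would be cut short, so this is a pre-filter for simple pages rather than a real JSP parser.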
I'm still not out of the woods, though. I still need to know how to keep Lucene from counting list element values and similar form content as search hits. For instance, if a keyword happens to appear in a <select> list, it gets counted as a hit.
Any suggestions (or, preferably, working code) would be massively appreciated! Thanks in advance.
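One possible direction, sketched under assumptions rather than tested against the demo parser: drop whole <select> ... </select> blocks from the page text before it is parsed and indexed, so their option values never become terms. The class name SelectBlockStripper is hypothetical, and the pattern assumes the lists are not nested and are properly closed.

```java
import java.util.regex.Pattern;

// Hypothetical helper (not part of Lucene): removes <select> lists from HTML.
public class SelectBlockStripper {
    // Matches an opening <select ...> tag through the nearest </select>,
    // ignoring case and allowing the block to span several lines.
    private static final Pattern SELECT_BLOCK = Pattern.compile(
            "<select\\b[^>]*>.*?</select\\s*>",
            Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    // Replaces each block with a single space so neighbouring words stay separate.
    public static String strip(String html) {
        return SELECT_BLOCK.matcher(html).replaceAll(" ");
    }

    public static void main(String[] args) {
        String html = "<p>a</p><SELECT name=\"x\"><option>foo</option></select><p>b</p>";
        System.out.println(strip(html));
    }
}
```

The same effect could presumably be achieved inside HTMLParser.jj by skipping text between the opening and closing tags, but that would need grammar changes I have not worked out.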