I am a software programmer from China. I have been reading the source code of Lucene 6.6.0 and I have three questions.
1. The first is about the method org.apache.lucene.codecs.blocktree.BlockTreeTermsWriter.TermsWriter.pushTerm(BytesRef). I cannot understand one line in this method: [ prefixStarts -= prefixTopSize-1 ]. In my opinion, this line is unnecessary, because prefixStarts is updated again at the end of the method, from index 'pos' to the end of the new text (the method's parameter 'text'). If the new text is at least as long as lastTerm, every entry of prefixStarts from 'pos' onward is overwritten at the end. If the new text is shorter, not every entry is overwritten, but only the entries whose index is smaller than the length of lastTerm are ever used, so the remaining entries do not need to be updated either. So I think the line [ prefixStarts -= prefixTopSize-1 ] can be removed.
2. The second is about the method org.apache.lucene.codecs.blocktree.BlockTreeTermsWriter.TermsWriter.finish(). I cannot understand why the line [ pushTerm(new BytesRef()); ] appears twice in this method. I ran some tests and found that the second call has no effect, so I think a single call would work fine.
3. The third is about the suffix in the file name. I noticed that the names of the tim/tip files contain a part called the suffix. For example, in '_7_Lucene50_0.tim', '0' is the suffix. I cannot understand what the suffix is for. In my opinion, the format name ('Lucene50' in this example) is needed to find the postings format in use, and that alone is enough. I have read the source but still cannot find any use of the suffix. What is it used for?
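To make questions 1 and 2 easier to check, here is a small toy model of the bookkeeping as I understand it. It is only a sketch and not the Lucene code: the class PushTermSketch, the 'decrement' switch, the log list, and the tiny MIN_ITEMS_IN_BLOCK value are all my own assumptions, and writeBlocks here only replaces the top entries of the pending stack with one placeholder instead of writing anything. I also assume the decrement is applied to the entry at index i of prefixStarts.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Toy model of the prefixStarts bookkeeping; NOT the real BlockTreeTermsWriter. */
public class PushTermSketch {
    // Assumed small threshold so that blocks form with only a few terms.
    static final int MIN_ITEMS_IN_BLOCK = 2;

    final boolean decrement;                         // apply the questioned line or skip it
    final List<String> pending = new ArrayList<>();  // stack of pending terms and blocks
    final List<String> log = new ArrayList<>();      // record of simulated block writes
    int[] prefixStarts = new int[8];
    byte[] lastTerm = new byte[0];

    PushTermSketch(boolean decrement) {
        this.decrement = decrement;
    }

    // Stand-in for writeBlocks: collapse the top `count` entries into one placeholder.
    void writeBlocks(int prefixLength, int count) {
        int start = pending.size() - count;
        String prefix = new String(lastTerm, 0, prefixLength, StandardCharsets.UTF_8);
        log.add("block prefix='" + prefix + "' items=" + pending.subList(start, pending.size()));
        pending.subList(start, pending.size()).clear();
        pending.add("BLOCK:" + prefix);
    }

    void pushTerm(byte[] text) {
        // Find the common prefix of the last term and the new term.
        int limit = Math.min(lastTerm.length, text.length);
        int pos = 0;
        while (pos < limit && lastTerm[pos] == text[pos]) {
            pos++;
        }
        // Close the abandoned suffix of the last term, longest prefix first.
        for (int i = lastTerm.length - 1; i >= pos; i--) {
            int prefixTopSize = pending.size() - prefixStarts[i];
            if (prefixTopSize >= MIN_ITEMS_IN_BLOCK) {
                writeBlocks(i + 1, prefixTopSize);
                if (decrement) {
                    prefixStarts[i] -= prefixTopSize - 1; // the line from question 1
                }
            }
        }
        if (prefixStarts.length < text.length) {
            prefixStarts = Arrays.copyOf(prefixStarts, text.length);
        }
        // Re-initialize the entries from 'pos' to the end of the new text.
        for (int i = pos; i < text.length; i++) {
            prefixStarts[i] = pending.size();
        }
        lastTerm = text;
    }

    /** Feeds sorted ASCII terms, then mimics finish() with two empty pushTerm calls. */
    public static List<String> run(boolean decrement, String... sortedTerms) {
        PushTermSketch w = new PushTermSketch(decrement);
        for (String t : sortedTerms) {
            w.pushTerm(t.getBytes(StandardCharsets.UTF_8));
            w.pending.add(t);
        }
        w.pushTerm(new byte[0]);           // first call: closes every open prefix
        int blocksAfterFirst = w.log.size();
        w.pushTerm(new byte[0]);           // second call: lastTerm is already empty
        w.log.add("second pushTerm wrote " + (w.log.size() - blocksAfterFirst) + " blocks");
        w.writeBlocks(0, w.pending.size()); // root block
        return w.log;
    }

    public static void main(String[] args) {
        String[] terms = {"a", "ab", "abc", "abd", "ac", "b", "ba", "bb"};
        List<String> with = run(true, terms);
        List<String> without = run(false, terms);
        with.forEach(System.out::println);
        System.out.println("same blocks without the decrement: " + with.equals(without));
    }
}
```

In this toy model the two runs can be compared directly, which is how I would check my reasoning. It does not prove anything about the real writer, which may depend on details that the model leaves out.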
I am sorry that my English is not very good.
Looking forward to the reply.