Mailing List Archive

Word Counting -- A Novel Approach
There was a thread here about word counting when reading in arbitrary
chunks instead of line-by-line.

I have a friend who continually reminds me "In Rome, do as the Romans", so
it seems to me the right way is to count with an object you 'feed()' data
into, like other non-line-based Python parsers (XML, HTML, etc.).

So I wrote a small word counting class, whose interface is:
* feed: Feed some data into the counter.
* flush: Force a word break. The next feed will start a new word.
This is useful, for example, when counting words in multiple
files, to make sure words are not concatenated across files.
* items: Return a list of (word, count) pairs.

(This is an excerpt from the documentation)
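
For readers who want to see the idea concretely, here is a minimal sketch of
a counter with that interface. It is not the class offered in this post: the
name WordCounter, the whitespace-only definition of a "word", and the choice
to have items() report only completed words are all assumptions made for
illustration.

```python
from collections import Counter


class WordCounter:
    """Hypothetical sketch of a feed()/flush()/items() word counter."""

    def __init__(self):
        self._counts = Counter()
        # Tail of the previous chunk that may be an unfinished word.
        self._partial = ""

    def feed(self, data):
        """Feed some data into the counter."""
        data = self._partial + data
        words = data.split()  # assumption: words are whitespace-separated
        # If the chunk does not end in whitespace, its last word may
        # continue in the next chunk, so hold it back instead of counting it.
        if words and not data[-1].isspace():
            self._partial = words.pop()
        else:
            self._partial = ""
        self._counts.update(words)

    def flush(self):
        """Force a word break: count any held-back partial word now."""
        if self._partial:
            self._counts[self._partial] += 1
            self._partial = ""

    def items(self):
        """Return a list of (word, count) pairs for completed words."""
        return list(self._counts.items())


counter = WordCounter()
counter.feed("hello wor")   # "wor" is held back, it may be incomplete
counter.feed("ld hello")    # completes "world"; "hello" is now held back
counter.flush()             # end of input: count the pending "hello"
print(sorted(counter.items()))  # [('hello', 2), ('world', 1)]
```

In this sketch a word split across two feed() calls is counted once, and
calling flush() between files keeps the last word of one file from being
joined to the first word of the next.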

I will happily mail this class to anyone who wants it.
--
Moshe Zadka <mzadka@geocities.com>.
QOTD: What fun to me! I'm not signing permanent.