On Sep 10, 2007, at 10:27 PM, Nathan Kurz wrote:
> On 9/10/07, Marvin Humphrey <marvin@rectangular.com> wrote:
>>> Looking at the object code [it] generates,
>>
>> This is something I've dabbled in, but would like to pursue in
>> earnest. Can you suggest some links or a course of study to get me
>> on my way?
>
> Unfortunately, I'm at best an advanced beginner at such things. The
> approach I used here was to compile with -ggdb3 and use 'objdump -S'
> on the object file. Then I stared at the output until it started to
> make sense.
I found what I was looking for. There's a book called "Programming
from the Ground Up", by Jonathan Bartlett, available both in dead-
tree format and as a PDF under the GNU Free Documentation License.
http://xrl.us/6t7x (Link to search.barnesandnoble.com)
http://savannah.nongnu.org/projects/pgubook/
http://download.savannah.gnu.org/releases/pgubook/

A reviewer from the Barnes and Noble site explains its utility:
Most people realize that Gnu's assembler (as, commonly referred to as
'Gas') is not fit for human consumption. After all, it's real purpose
in life is to assemble GCC output. As *nobody* writes in assembly
language anymore, who needs to know anything about Gas? Well,
surprise, surprise, people *do* still program in assembly language.
While there are other assemblers available for Linux and other
platforms where Gas can run (e.g., NASM, FASM, HLA), anyone wanting to
work with GNU tools will probably want to come to grips with GNU's
AT&T syntax at one point or another. The only problem is that there
are very few books on this subject. The FSF/GNU documentation is a
complete joke. This is where Jonathan Bartlett's book comes in. It's
the first, reasonable, beginner's book on Gas written for the x86 that
I've found. While I can't personally recommend that someone do all
their x86 work with Gas (though some will disagree with me), I'm also
of the opinion that anyone who works in assembly language is going to
have to deal with Gas sooner or later. And when that day arrives, this
book will prove very handy. Combined with the FSF/GNU documentation
for Gas (as a reference), this book will help someone overcome the
roadblock that Gas' AT&T syntax has been in the past.
> My analysis here went no farther than the generalization
> that shorter with fewer branches is better. While generally true,
> with modern processors one probably needs to test.
Sure. Some sophisticated benchmarking code has been contributed to
Lucene over the last year. I've been waiting for the pace of commits
to settle down before contemplating a port. It seems to have
stabilized over the last couple months, and might be worth a looksee
now.
Nevertheless, I think we'll get pretty far just by looking over
generated assembly code. We'll almost always have a good idea of
what we want the processor to be doing; if the output of 'gcc -S' or
'objdump -S' doesn't look like we think it should, we can tweak the C
code until we're satisfied.
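
As a minimal sketch of that workflow, here is a toy file to practice
on. The file name peek.c and the function sum_bytes() are made up for
illustration and are not part of KinoSearch; the commands in the
header comment are the ones mentioned above.

```c
/* peek.c: a toy function for practicing reading compiler output.
 * (Hypothetical example; not KinoSearch code.)
 *
 *   gcc -O2 -S peek.c          # write the generated assembly to peek.s
 *   gcc -O2 -ggdb3 -c peek.c   # build peek.o with debugging info
 *   objdump -S peek.o          # disassembly interleaved with the C source
 */
#include <assert.h>
#include <stddef.h>

/* Sum the bytes in a buffer.  The loop is simple enough that it is easy
 * to predict roughly what the processor should be doing. */
unsigned long
sum_bytes(const unsigned char *buf, size_t len)
{
    unsigned long total = 0;
    size_t i;

    /* Compiled out when building with -DNDEBUG, so it is easy to compare
     * the generated code with and without the check. */
    assert(buf != NULL || len == 0);

    for (i = 0; i < len; i++) {
        total += buf[i];
    }
    return total;
}
```

Comparing the output at -O0 and -O2 is probably a gentle place to
start, since the unoptimized version maps almost line for line onto
the C source.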
> For the purposes of improving KinoSearch, though, I think that the
> biggest room for improvement is going to be through integrating better
> with the Virtual Memory Manager
> (http://www.informit.com/content/images/0131453483/downloads/gorman_book.pdf)
> from profiling to find bottlenecks with Oprofile, and through L2
> cache optimization with Cachegrind. I still think proper use of mmap
> has tremendous potential.
The classes we should focus on are InStream and OutStream. First
order of business is to study how the simple functions look in
assembly: read_byte, read_int, etc.
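
To make "simple functions" concrete, here is a rough sketch of the
kind of code worth inspecting. It is not the actual InStream
implementation: the ByteReader struct, the function names, and the
big-endian byte order are all assumptions made for illustration.

```c
/* Hypothetical in-memory reader; NOT KinoSearch's InStream. */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct ByteReader {
    const uint8_t *ptr;   /* next byte to read */
    const uint8_t *end;   /* one past the last readable byte */
} ByteReader;

/* Read one byte and advance. */
static uint8_t
reader_read_byte(ByteReader *r)
{
    assert(r->ptr < r->end);
    return *r->ptr++;
}

/* Read a 32-bit unsigned integer, assuming big-endian byte order,
 * and advance by four bytes. */
static uint32_t
reader_read_u32_be(ByteReader *r)
{
    uint32_t value;

    assert((size_t)(r->end - r->ptr) >= 4);
    value = ((uint32_t)r->ptr[0] << 24)
          | ((uint32_t)r->ptr[1] << 16)
          | ((uint32_t)r->ptr[2] << 8)
          |  (uint32_t)r->ptr[3];
    r->ptr += 4;
    return value;
}
```

Functions this small should compile down to a handful of
instructions, which makes them a reasonable first target for reading
disassembly.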
After that, we move on to what are currently called VInts and
VLongs. (I'm contemplating renaming them C32 and C64 for "compressed
32/64-bit integer", since they are no longer the same as Lucene
VInts/VLongs -- they're now BER compressed integers, as used by
Perl's pack() function.) First, we'd like to see whether those
functions are fully optimized under the current scheme. Second, data
compression is the bugaboo for integrating mmap, and we can
brainstorm alternative approaches while optimizing.
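
For reference, here is a minimal sketch of BER compressed integer
encoding as Perl's pack() documents it for the "w" template: base
128, most significant group first, high bit set on every byte except
the last. The function names are hypothetical and this is not the
KinoSearch code, just an illustration of the format.

```c
/* Sketch of BER compressed integers (as in Perl's pack "w").
 * Function names are hypothetical; NOT the KinoSearch implementation. */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Encode value into buf, which must hold at least 5 bytes.
 * Returns the number of bytes written (1 to 5). */
static size_t
ber_encode_u32(uint32_t value, uint8_t *buf)
{
    uint8_t groups[5];
    size_t  count = 0;
    size_t  i;

    assert(buf != NULL);

    /* Collect 7-bit groups, least significant first. */
    do {
        groups[count++] = (uint8_t)(value & 0x7F);
        value >>= 7;
    } while (value != 0);

    /* Emit most significant group first; set the continuation bit
     * on every byte except the last. */
    for (i = 0; i < count; i++) {
        uint8_t group = groups[count - 1 - i];
        buf[i] = (i + 1 < count) ? (uint8_t)(group | 0x80) : group;
    }
    return count;
}

/* Decode one value from buf into *value_out.
 * Returns the number of bytes consumed.  Assumes well-formed input
 * that fits in 32 bits; no bounds checking is done here. */
static size_t
ber_decode_u32(const uint8_t *buf, uint32_t *value_out)
{
    uint32_t value = 0;
    size_t   count = 0;
    uint8_t  byte;

    assert(buf != NULL && value_out != NULL);

    do {
        byte  = buf[count++];
        value = (value << 7) | (uint32_t)(byte & 0x7F);
    } while (byte & 0x80);

    *value_out = value;
    return count;
}
```

For example, 300 encodes as the two bytes 0x82 0x2C. The decode loop
also hints at why compression complicates mmap: the width of each
value is only known after walking its bytes, so values can't simply
be read in place at fixed offsets.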
> Are there people actively compiling this under MSVC
> currently? I know nothing about Windows.
The maint branch compiles under MSVC, which makes it possible to
produce a PPM and support the good people at ActiveState. Randy
Kobes was instrumental in getting us this far; tye, Corion, and
some other Windows users from PerlMonks have also been helpful. It's
actually Randy who publishes the PPMs because KinoSearch's 5.8.3
requirement prevents ActiveState from publishing its own. There are
a number of people who have acquired KS via this route.
Basically the only place maint doesn't compile is under Solaris, I
think because of an alignment problem which is fixed in trunk. Trunk
does have some issues which currently prevent compiling under
Windows, but I know about them and plan to fix them. The goal is to
have KS work on everything except for esoteric systems where e.g.
floats don't conform to IEEE 754.
Supporting MSVC from within KinoSearch isn't too complicated or
painful, because most of the hard work is handled at a higher level.
In maint, we're using the information generated by Perl's Configure
script (itself generated by Metaconfig). In trunk, the Charmonizer
compatibility layer plays this role.
> I'll send you a version tomorrow with such things included for you to
> decide where you want to draw that line. It's out of sync with the
> patch right now.
I went ahead and committed the version you supplied as r2555. Thanks!
Some mild mods followed in r2556 (<http://xrl.us/6uch>):
* The "inline" keyword has been replaced with the recently introduced
INLINE Charmonizer macro, which is empty if the compiler doesn't
support inline functions.
* A sanity check was added at the top of winnow_anchors() to prevent
possible invalid pointer de-refs. Technically, this wasn't
necessary because the calling function cannot currently supply data
which triggers such a problem, but I was uneasy about the absence of
a local safety mechanism.
* The assignment of the iteration variable "i" was moved from the
variable declaration at the top of PhraseScorer_calc_phrase_freq
to the loop initiation. This will thwart problems if stuff gets
moved around and i winds up with a new value prior to the loop.
Again, not technically necessary, just defensive programming
practice.
* winnow_anchors() now returns a u32_t rather than a size_t,
because it's returning a count of u32_t rather than char and the
C standard defines size_t as "A type to define sizes of strings
and memory blocks."
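
To illustrate the first item, here is a rough sketch of how a
portable INLINE macro can behave. This is not how Charmonizer defines
it: the sketch guesses from predefined preprocessor symbols, and the
clamp_ulong() helper is hypothetical, included only to show the macro
in use.

```c
/* Sketch of a portable INLINE macro.  NOT Charmonizer's definition;
 * the compiler checks below are assumptions made for illustration. */
#include <assert.h>

#if defined(__cplusplus) \
    || (defined(__STDC_VERSION__) && __STDC_VERSION__ >= 199901L)
  #define INLINE inline       /* C99 and C++ have the keyword */
#elif defined(__GNUC__)
  #define INLINE __inline__   /* GCC spelling, usable in C89 mode */
#elif defined(_MSC_VER)
  #define INLINE __inline     /* MSVC spelling for C code */
#else
  #define INLINE              /* unknown compiler: expand to nothing */
#endif

/* Hypothetical helper: clamp value into the range [lo, hi]. */
static INLINE unsigned long
clamp_ulong(unsigned long value, unsigned long lo, unsigned long hi)
{
    assert(lo <= hi);
    if (value < lo) { return lo; }
    if (value > hi) { return hi; }
    return value;
}
```

When INLINE expands to nothing, the helper is just an ordinary static
function, so the code still compiles everywhere and merely loses the
inlining hint.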
>> PS: Tabs suck.
>
> Oops. Have I been including them in things I send?
In this patch at least. No big deal, I zapped 'em.
Marvin Humphrey
Rectangular Research
http://www.rectangular.com/