<color><param>0100,0100,0100</param>On 19 Jun 99, at 22:46, Jonathon wrote:
<color><param>7F00,0000,0000</param>> Andrew M. Kuchling (akuchlin@mems-exchange.org) wrote:
> : I'll wager that the_file_is winds up with some sort of unexpected
>
> It appears top have the correct value.
</color>But what is "correct"? If, unexpectedly, a != b, then one can
deduce either (I) a is wrong (II) b is wrong (III) both are wrong. Here
we seem to have a Type II stuff-up; see below --- in point 5 --- there
were a few other things that triggered my fulminator :-)
<color><param>7F00,0000,0000</param>>
> : value, and causes your variable to never be assigned a value,. because
>
> But the variables I am initializing in the loop don't get
> initialized. It doesn't look like the file goes into the loop.
</color>0. Too true. Otherwise you would have sprung a NameError on
"strip".
Either
import string
foo = string.strip(bar)
or
from string import strip
foo = strip(bar)
<color><param>7F00,0000,0000</param>>
> For those who would like to see the script, I have it
> at http://www.stamp-coin.com/python/problem.html.
</color>Aha! The real source at last ..
<color><param>7F00,0000,0000</param>src</color>> the_file_is = "jan-www.db"
src> <color><param>0100,0100,0100</param><FontFamily><param>Courier New</param>the_data_file_is = "/temp/data/jan-www.data"
1. You shouldn't hard-wire file names. It makes
it extremely inconvenient to run it with some
test data and then run it with the real data.
Get the names from the command line.
</color>2. Use test data!
<color><param>0100,0100,0100</param>3. Instead of having two "filenames", use
os.path functions basename() and splitext() to
get the part of the filename (jan-www) that is
semantically significant (to you).
4. Replace the meaningless "the" and "is" in
your variable names with something meaningful
e.g. to distinguish between names and handles
for files e.g.
input_file = open(input_file_name, ...
output_file = open(output_file_name, ...
=============================================
[much much later]</color><FontFamily><param>Arial</param>
src> <color><param>0100,0100,0100</param><FontFamily><param>Courier New</param>elif ( the_file_is == "jan-ww.db" ) :
</color>5. "WWWhoops" != "WWhoops"
<FontFamily><param>Arial</param>Just in case that "appears" to be a contradiction:
THREE (3) Ws c.f. TWO (2) Ws
6. We're the Pythonistas; we don't need no stinkin' parentheses [at
least not wrapped around conditional expressions] ...
==============================================
5.1 And we have a case of "us" != "usa".
C:\junk>grep "[-][a-z]*[.]db" stamps.py
the_file_is = "jan-www.db"
the_file_is = "jan-usa.db"
the_file_is = "john-ww.db"
the_file_is = "john-usa.db"
if ( the_file_is == "john-ww.db" ) : ### uh-oh
elif ( the_file_is == "jan-ww.db" ) :
elif ( the_file_is == "jan-us.db" ) : ### uh-oh
elif ( the_file_is == "john-us.db") : ### uh-oh
elif ( the_file_is == "jan-usa.db" ):
elif ( the_file_is == "john-usa.db" ):
:-) Do I hear mutterings of "serve 'em right" from the Brazilians &
Dutch?
==============================================
<color><param>7F00,0000,0000</param>src><color><param>0100,0100,0100</param><FontFamily><param>Courier New</param> elif ( temp_country_number == "104" ) :
var_country_name = " Aden "
var_country_number = "104"
elif ( temp_country_number == "108" ) :
var_country_name = " Afars And Issas "
var_country_number = "108"
elif ( temp_country_number == "110" ) :
var_country_name = " Afghanistan "
var_country_number = "110"
</color><FontFamily><param>Arial</param>7. (minor point) Why not assign var_country_number =
temp_country_number ONCE?
8. (minor point) Add the TWO (!?) leading spaces and the one
trailing space to the country name at the last minute when you are
formatting the data prettily (if that's what you are doing).
9. (major point) Put all this stuff in a text file:
104 Aden
108 Afars and Issas
110 Afghanistan
etc
At the start of your script, read the file and stuff the entries into a
dictionary. [.If you got really carried away, you could make a
separate module out of the country dictionary kit. An even better
trick would be to lash up a module that handled dictionaries
generally; then you could use it for your "mint or used" and "grade"
mappings as well.]
Then you can replace almost all of the long and
SLOOOOOWWWWW chain of elifs by a dictionary lookup.
if country_name_dict.has_key(temp_country_number):
var_country_name = country_name_dict[temp_country_number]
else:
# handle exception condition
==========================================
src> <color><param>0100,0100,0100</param><FontFamily><param>Courier New</param>temp_line = ( temp_record_id + bang +
var_country_name + bang + temp_price + bang +
temp_scott_number + bang + var_url + bang +
...
10. "bang" is not defined. Neither is "bang_two"
(used elsewhere). You are heading back into
NameError territory.
11. Read the doco on string.join. It can save
you typing and run-time if there are more than a
few fields.
12. Try this:
string.join((fld1, fld2, ..., fldn), "!")
# assumes bang means "!"; if not, RTFJargonFile</color><FontFamily><param>Arial</param>
=================================================
src>
<color><param>0100,0100,0100</param><FontFamily><param>Courier New</param>if ( temp_photo == "X" ) :
var_url = string.join[the_graphic_prefix_is,
temp_record_id , ".jpg" ]
</color><FontFamily><param>Arial</param>13. When/if this is executed, you will get a "TypeError:
unsubscriptable object" because join is a method, not a sequence
or mapping.
You need to do
var_url = string.join((this, that, the_other), "")
which may be faster than the clearer
var_url = this + that + the_other
=======================================
src><color><param>0100,0100,0100</param><FontFamily><param>Courier New</param> if temp_deleted == "D" :
ver_field_list == "y-del"
elif temp_deleted == "d" :
ver_field_list == "y-del"
</color><FontFamily><param>Arial</param>14. I think you meant var_field_list, nor ver_field_list.
Aside: Timbot, have you started on
are_they_misleadingly_close_names_or_just_a_typo_nanny.py
yet?
================================================
15.
src> temp_record_id = int(validate_line[2:18])
temp_country_text = strip ( validate_line[20:33] )
temp_country_number = int(validate_line[35:48])
temp_country_section = int(validate_line[50:63])
etc etc
Is one to infer from this that (a) there are TWO (!!??) unused(!?)
column positions between each field (e.g. 18 and 19) or (b) you
need to read the doco about the slice operator? The "sample data"
on your web page is useless as it has variable-length fields
separated by "vertical bar" [is this what you call bang?] characters
whereas your code is obviously trying to read fixed-column-position
data.
Variable-length fields would be a better idea; with fixed-length
fields, if the size of one field changes you are in for a lot of (error-
prone) re-calculating and retyping. If the "unused" guff is a
delimiter, use it!
fields = string.split(validate_line, delimiter)
assert len(fields) == appropriate_number, "uh-oh"
for i in range(len(fields)):
fields[i] = string.strip(fields[i])
(
leading_empty_field, temp_record_id, temp_country_text,
temp_country_number, ..., trailing_empty_field
) = fields
I doubt that wrapping int(...) around some of these fields is
desirable, let alone necessary.
=================================================
:-)
No representations are made that the above constitutes a
comprehensive code review. There may well be something left for
the gleaners.
:-)
Cheers,
John Machin