So, I've been using Trac for a couple of weeks now. I noticed it show
up in the Debian repository just about the time I was getting a Wiki set
up for my projects at work. I decided to check Trac out and have been
quite impressed. For now Trac hasn't been used too heavily at work
since I'm mainly working on a couple projects by myself, but as I start
to shift over to our new product and some new developers start to take
over some of my work I think it will become more important.
So, as I've been looking through the Trac code, there are places I
believe the code could be tidied up, or abstracted a bit. Now, this
isn't really a bug so I didn't want to file a ticket, but I think some
things could be done to improve the readability and reduce some code
duplication.
In general I like to use a more functional style programming when
working in Python, which leads me to use iterators quite frequently.
I've noticed there are a variety of places that loop over the rows
returned by a SQL query like:
while 1:
row = cursor.fetchone()
if not row:
break
# do something with row
I'd suggest adding a generator that would abstract this:
def row_iterator(cursor):
while 1:
row = cursor.fetchone()
if not row:
raise StopIteration
yield row
Now the loop becomes:
for row in row_iterator(cursor):
# do something with row
The itertools package has some nice functions for working with
iterators. I tend to define a simple function based off of them to keep
track of an index while iterating over an object:
def counter(obj):
return izip(count(), obj)
Now, I can pass a list or other iterable object to the function and
replace the loops like:
for i in range(len(lst)):
value = lst[i]
# do something with i and value
with:
for i, value in counter(lst):
# do something with i and value
Also, I've noticed that when building url strings there are places that
manual manipulation could be replaced by using the urlparse package.
I'm also not in favor of manually escaping variables to use in building
SQL query strings. I generally find it preferable to use '?' or if
possible named parameters to bind variables since the database driver
will handle the necessary conversions.
In core.py the parse_path_info function uses a bunch of regular
expressions to parse out the pieces of the path to determine the module
and arguments needed. I think a nicer way to handle this would be just
to look at the first part of the path to determine which module should
handle it, then leave the rest of the parsing up to the module. Now,
even if you didn't want to move that functionality into the modules, the
function could still be reworked. One of the telling features of that
function is that it's pretty easy to recognize that a lot of the code
was copied and pasted, a clear sign that there's probably a better way
to abstract the code.
Anyways, I hope that didn't come off as pretentious. I know that when
you're working you generally focus more on working through your current
problem than thinking about ways to generalize what you're doing to use
in the future. But looking through the codebase for the first time, I'm
just trying to make sense of it, so it stands out when a portion of code
isn't as straightforward as it could be.
I'd like to help start making some of these types of changes in Trac,
but I'm not really sure what the protocol should be. I don't want to
file tickets for every change like this. I suppose I'll post patches to
the mailing list. The mailing list volume doesn't seem too heavy at
this point, but at some point it may be good to create separate user and
developer lists.
--
Matthew Good <trac@matt-good.net>
up in the Debian repository just about the time I was getting a Wiki set
up for my projects at work. I decided to check Trac out and have been
quite impressed. For now Trac hasn't been used too heavily at work
since I'm mainly working on a couple projects by myself, but as I start
to shift over to our new product and some new developers start to take
over some of my work I think it will become more important.
So, as I've been looking through the Trac code, there are places I
believe the code could be tidied up, or abstracted a bit. Now, this
isn't really a bug so I didn't want to file a ticket, but I think some
things could be done to improve the readability and reduce some code
duplication.
In general I like to use a more functional style programming when
working in Python, which leads me to use iterators quite frequently.
I've noticed there are a variety of places that loop over the rows
returned by a SQL query like:
while 1:
row = cursor.fetchone()
if not row:
break
# do something with row
I'd suggest adding a generator that would abstract this:
def row_iterator(cursor):
while 1:
row = cursor.fetchone()
if not row:
raise StopIteration
yield row
Now the loop becomes:
for row in row_iterator(cursor):
# do something with row
The itertools package has some nice functions for working with
iterators. I tend to define a simple function based off of them to keep
track of an index while iterating over an object:
def counter(obj):
return izip(count(), obj)
Now, I can pass a list or other iterable object to the function and
replace the loops like:
for i in range(len(lst)):
value = lst[i]
# do something with i and value
with:
for i, value in counter(lst):
# do something with i and value
Also, I've noticed that when building url strings there are places that
manual manipulation could be replaced by using the urlparse package.
I'm also not in favor of manually escaping variables to use in building
SQL query strings. I generally find it preferable to use '?' or if
possible named parameters to bind variables since the database driver
will handle the necessary conversions.
In core.py the parse_path_info function uses a bunch of regular
expressions to parse out the pieces of the path to determine the module
and arguments needed. I think a nicer way to handle this would be just
to look at the first part of the path to determine which module should
handle it, then leave the rest of the parsing up to the module. Now,
even if you didn't want to move that functionality into the modules, the
function could still be reworked. One of the telling features of that
function is that it's pretty easy to recognize that a lot of the code
was copied and pasted, a clear sign that there's probably a better way
to abstract the code.
Anyways, I hope that didn't come off as pretentious. I know that when
you're working you generally focus more on working through your current
problem than thinking about ways to generalize what you're doing to use
in the future. But looking through the codebase for the first time, I'm
just trying to make sense of it, so it stands out when a portion of code
isn't as straightforward as it could be.
I'd like to help start making some of these types of changes in Trac,
but I'm not really sure what the protocol should be. I don't want to
file tickets for every change like this. I suppose I'll post patches to
the mailing list. The mailing list volume doesn't seem too heavy at
this point, but at some point it may be good to create separate user and
developer lists.
--
Matthew Good <trac@matt-good.net>