Mailing List Archive

wrapped around the axle on regexpressions and file searching
Hey Pythoniers!

I'm attempting to locate log files on my drive(s) and do some comparisons.
Here's a simplified snippet where I'm getting bottlenecked.

>>> import os, re
>>> regexp = re.compile('.log')
>>> def find_log_files(arg, directory, names):
...     for name in os.listdir(directory):
...         if regexp.search(name):
...             print directory + "\\" + name
...
>>> os.path.walk('D:\\', find_log_files, None)


Here are my questions: 1. This prints out not only files with the file
extension ".log" but also any file name that has "log" in its name. How
would I rewrite it to avoid that?

2. Is there a better, faster way of doing this? My end goal is to open the
files and compare time sequences to one another.

3. Is there any way to determine the number of drives on a system? Obviously I
am hardcoding the top-level drive letter "D:\"; is there any way to search the
entire system, much like win32's find file search?

TIA

wrapped around the axle on regexpressions and file searching [ In reply to ]
On Sat, 24 Apr 1999 15:21:03 GMT, "Gordon McMillan"
<gmcm@hypernet.com> wrote:

>msrisney writes:

>> Here are my questions: 1. This prints out not only files with the
>> file extension ".log" but also any file name that has "log" in its
>> name. How would I rewrite it to avoid that?
>
>In a regex, a "." is a wildcard character. If you want a literal "."
>you need to escape it:
> re.compile('\\.log')
>or
> re.compile(r'\.log')

Wouldn't that be re.compile('\\.log$'), so as to avoid things like
tree.log.jpg?
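
A minimal sketch of what the trailing "$" buys, assuming the pattern is applied with search() as in the original snippet. The file names are made up for illustration.

```python
import re

unanchored = re.compile(r'\.log')    # literal ".log" anywhere in the name
anchored = re.compile(r'\.log$')     # literal ".log" only at the end

# Hypothetical file names, chosen only to show the difference.
names = ["app.log", "tree.log.jpg"]

unanchored_hits = [n for n in names if unanchored.search(n)]
anchored_hits = [n for n in names if anchored.search(n)]

print(unanchored_hits)   # ['app.log', 'tree.log.jpg']
print(anchored_hits)     # ['app.log']
```

Note that "$" also matches just before a trailing newline; "\Z" matches only at the very end of the string, which may matter if names are read from a file without stripping.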

doesn't-yet-grok-python-but-does-grok-regexps-ly y'rs

Steve
--
-- Steve Atkins -- steve@blighty.com
wrapped around the axle on regexpressions and file searching [ In reply to ]
Gordon McMillan wrote:
>
> msrisney writes:

...

> At any rate, os.path.exists('e:/') is an effective way of finding out
> if e: exists. Though specifying 'a:/' will pop up one of those lovely
> Abort/Retry/Fail dialogs if nothing is in the drive.

Not always. On my Win98 box, the floppy drive brushes teeth for
a second, and then I get back *one* from os.path.exists.
This is really bad, since I didn't insert a floppy.

> in-nearly-20-years-I-still-haven't-figured-out-the-difference-between
> -Abort-and-Fail-ly y'rs

Some of my early DOS tools aborted immediately from the DOS
function when I hit "Abort", while they caught the error, and
some were able to continue, when I answered with "Fail".
So I think the designed intent was to give the user a chance
to decide whether the program should shut down or an internal
error handler should get a chance.

Anyway no good solution - ciao - chris

--
Christian Tismer :^) <mailto:tismer@appliedbiometrics.com>
Applied Biometrics GmbH : Have a break! Take a ride on Python's
Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net
10553 Berlin : PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF
we're tired of banana software - shipped green, ripens at home
wrapped around the axle on regexpressions and file searching [ In reply to ]
msrisney writes:
>
> I'm attempting to locate log files on my drive(s) and do some
> comparisons. Here's a simplified snippet where I'm getting
> bottlenecked.
>
> >>> import os, re
> >>> regexp = re.compile('.log')
> >>> def find_log_files(arg, directory, names):
> ...     for name in os.listdir(directory):
> ...         if regexp.search(name):
> ...             print directory + "\\" + name
> ...
> >>> os.path.walk('D:\\', find_log_files, None)
>
>
> Here are my questions: 1. This prints out not only files with the
> file extension ".log" but also any file name that has "log" in its
> name. How would I rewrite it to avoid that?

In a regex, a "." is a wildcard character. If you want a literal "."
you need to escape it:
    re.compile('\\.log')
or
    re.compile(r'\.log')
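
To see the wildcard behaviour side by side, here is a small sketch. The file names are hypothetical, and search() is used the same way as in the original snippet.

```python
import re

# Hypothetical file names, chosen only to illustrate the two patterns.
names = ["app.log", "catalog.txt", "syslog", "tree.log.jpg"]

wildcard = re.compile('.log')     # "." matches any single character
literal = re.compile(r'\.log')    # "\." matches only a real dot

wildcard_hits = [n for n in names if wildcard.search(n)]
literal_hits = [n for n in names if literal.search(n)]

print(wildcard_hits)   # all four names: "alog" and "slog" match too
print(literal_hits)    # ['app.log', 'tree.log.jpg']
```

The escaped pattern still accepts "tree.log.jpg", because search() looks for ".log" anywhere in the name rather than only at the end.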

> 2. Is there a better, faster way of doing this? My end goal is to
> open the files and compare time sequences to one another.

Any number of ways would be faster. Since you've already got
os imported, you could use os.path.splitext():
    if os.path.splitext(name)[1] == '.log':
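
Put together, the splitext() test might look like the sketch below. It assumes a current Python 3, where os.path.walk no longer exists and os.walk is the replacement; the name find_log_files is reused from the original snippet but the signature is different.

```python
import os

def find_log_files(top):
    """Yield the path of every file under `top` whose extension is '.log'."""
    # os.walk already supplies the file names in each directory,
    # so no separate os.listdir() call is needed.
    for directory, _subdirs, names in os.walk(top):
        for name in names:
            if os.path.splitext(name)[1] == '.log':
                yield os.path.join(directory, name)

if __name__ == "__main__":
    for path in find_log_files("D:\\"):
        print(path)
```

The comparison is case-sensitive, so a file named "APP.LOG" would be skipped; lowercasing the extension before comparing is one way around that.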

> 3. Is there any way to determine the number of drives on a system?
> Obviously I am hardcoding the top-level drive letter "D:\"; is there
> any way to search the entire system, much like win32's find file
> search?

Do you know how to search multiple drives with FindxxxFile()? I've
always thought that you either got the current drive or specified the
drive to search.

At any rate, os.path.exists('e:/') is an effective way of finding out
if e: exists. Though specifying 'a:/' will pop up one of those lovely
Abort/Retry/Fail dialogs if nothing is in the drive.
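
One way to turn that into a drive list is sketched here. The function name existing_drives and its parameters are made up for illustration; starting at C: by default is an assumption intended to sidestep the floppy prompt, and the `exists` parameter is there only so the logic can be exercised without touching real drives.

```python
import os
import string

def existing_drives(letters=string.ascii_uppercase[2:], exists=os.path.exists):
    """Return the root path of every drive letter that appears to exist.

    `letters` defaults to C..Z (an assumption: A: and B: are skipped
    because probing an empty floppy drive may block or prompt).
    """
    return [letter + ":/" for letter in letters if exists(letter + ":/")]

if __name__ == "__main__":
    # On a non-Windows system this will normally print an empty list.
    print(existing_drives())
```

Each returned root could then be handed to the directory walk in place of the hardcoded 'D:\\'.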

in-nearly-20-years-I-still-haven't-figured-out-the-difference-between
-Abort-and-Fail-ly y'rs

- Gordon
wrapped around the axle on regexpressions and file searching [ In reply to ]
[code to find and process files ending in .log]

>2. is there a better, faster way of doing this??,

For matching literal text, string.find, .rfind, .index, .rindex
are faster than regexes. Since you want to match text at the end only,
    if filename[-4:] == '.log'
will be even faster.
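
As a small sketch of the slice test: the helper name is_log_file is hypothetical, and the endswith() comparison assumes a Python new enough to have string methods.

```python
def is_log_file(filename):
    """True if the last four characters of `filename` are '.log'."""
    return filename[-4:] == '.log'

# Hypothetical names used only to exercise the check.
for name in ["app.log", "catalog.txt", "tree.log.jpg", "log"]:
    # str.endswith() expresses the same test without a hardcoded length.
    assert is_log_file(name) == name.endswith('.log')
    print(name, is_log_file(name))
```

No timing figures are given here; which variant is fastest depends on the interpreter version, so measure before relying on it.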

TJR
wrapped around the axle on regexpressions and file searching [ In reply to ]
>>>>> "m" == msrisney <msrisney@my-dejanews.com> writes:

m> Here are my questions: 1. This prints out not only files with the
m> file extension ".log" but also any file name that has "log" in
m> its name. How would I rewrite it to avoid that?

How about using "fnmatch.fnmatch" instead of a regular expression?
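
A minimal sketch of that suggestion, with made-up file names. The shell-style pattern "*.log" matches the whole name, so no escaping or anchoring is needed.

```python
import fnmatch

# Hypothetical file names, chosen only for illustration.
names = ["app.log", "catalog.txt", "syslog", "tree.log.jpg"]

# Test one name at a time...
log_names = [n for n in names if fnmatch.fnmatch(n, "*.log")]

# ...or filter a whole list in one call.
filtered = fnmatch.filter(names, "*.log")

print(log_names)   # ['app.log']
print(filtered)    # ['app.log']
```

fnmatch.fnmatch() normalizes case the way the operating system does, so on Windows "APP.LOG" would match as well; fnmatch.fnmatchcase() compares case-sensitively on every platform.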

--
===== R.Hooft@EuroMail.net http://www.xs4all.nl/~hooft/rob/ =====
===== R&D, Nonius BV, Delft http://www.nonius.nl/ =====
===== PGPid 0xFA19277D ========================== Use Linux! =========