Mailing List Archive

dircache.py
Pursuant to my volunteering to implement Guido's plan to
combine cmp.py, cmpcache.py, dircmp.py and dircache.py
into filecmp.py, I did some investigating of dircache.py.

I find it completely unreliable. On my NT box, the mtime of the
directory is updated (on average) 2 secs after a file is added,
but within 10 tries, there's always one in which it takes more
than 100 secs (and my test script quits). My Linux box hardly
ever detects a change within 100 secs.

I've tried a number of ways of testing this ("this" being
checking for a change in the mtime of the directory), the latest
of which is below. Even if dircache can be made to work
reliably and surprise-free on some platforms, I doubt it can be
done cross-platform. So I'd recommend that it just get dropped.

Comments?

---------------------------------------------------
import os
import sys
import time

d = os.getcwd()
atimes = []   # secs taken to detect each add

def test():
    m = os.stat(d)[8]   # index 8 of the stat tuple is the mtime
    for i in range(10):
        fnm = 's%d.tmp' % i
        open(fnm, 'w').write('dummy - delete me')
        # poll the directory's mtime every 0.01 secs, for up to 100 secs
        for j in range(10000):
            newm = os.stat(d)[8]
            if newm != m:
                atimes.append(j*0.01)
                m = newm
                break
            time.sleep(0.01)
        else:
            # inner loop ran out without seeing a change: give up
            print("At round %d, failed to detect add within %3.2f secs" % (i, j*0.01))
            break

def report():
    if atimes:
        print("detect adds: min= %3.2f max= %3.2f avg= %3.2f" % (min(atimes), max(atimes), sum(atimes)/len(atimes)))
    else:
        print("no successfully detected adds")

test()
report()

- Gordon
Re: dircache.py
Gordon McMillan wrote:
>
> Pursuant to my volunteering to implement Guido's plan to
> combine cmp.py, cmpcache.py, dircmp.py and dircache.py
> into filecmp.py, I did some investigating of dircache.py.
>
> I find it completely unreliable. On my NT box, the mtime of the
> directory is updated (on average) 2 secs after a file is added,
> but within 10 tries, there's always one in which it takes more
> than 100 secs (and my test script quits). My Linux box hardly
> ever detects a change within 100 secs.
>
> I've tried a number of ways of testing this ("this" being
> checking for a change in the mtime of the directory), the latest
> of which is below. Even if dircache can be made to work
> reliably and surprise-free on some platforms, I doubt it can be
> done cross-platform. So I'd recommend that it just get dropped.
>
> Comments?

Note that you'll have to flush and close the tmp file to actually
have it written to the file system. That's why you are not seeing
any new mtimes on Linux.
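
To illustrate the flush-and-close point, here is a minimal sketch of writing the test file so that it is explicitly flushed and closed before the directory is stat()ed again. The helper names write_and_close() and demo() are hypothetical, and the os.fsync() call is an extra precaution that the note above does not ask for.

```python
import os
import shutil
import tempfile

def write_and_close(dirname, name, text):
    """Create a file and make sure it is flushed and closed on return."""
    path = os.path.join(dirname, name)
    with open(path, "w") as f:      # the with-block closes the file
        f.write(text)
        f.flush()                   # push Python's buffer to the OS
        os.fsync(f.fileno())        # ask the OS to write it out (extra precaution)
    return path

def demo():
    """Return the directory mtime before and after adding a file, plus the listing."""
    d = tempfile.mkdtemp()
    try:
        before = os.stat(d).st_mtime
        write_and_close(d, "s0.tmp", "dummy - delete me")
        after = os.stat(d).st_mtime
        return before, after, sorted(os.listdir(d))
    finally:
        shutil.rmtree(d)            # clean up the scratch directory
```

Whether `after` actually differs from `before` still depends on the filesystem's timestamp resolution, so the sketch only guarantees that the file exists by the time the second stat() runs.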

Still, I'd suggest declaring it obsolete. Filesystem access is
usually cached by the underlying OS anyway, so adding another layer of
caching on top of it seems not worthwhile (plus, the OS knows
better when and what to cache).

Another argument against using stat() time entries for caching
purposes is their resolution of 1 second. That alone makes
dircache.py unreliable for fast-changing directories.
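
The resolution problem can be reproduced deterministically. The sketch below uses a simplified stand-in for an mtime-keyed listing cache (cached_listdir() is hypothetical, not the actual dircache.py code) and calls os.utime() to pin the directory's mtime to the same whole second, which simulates two changes landing within one second.

```python
import os
import shutil
import tempfile

_cache = {}   # path -> (mtime in whole seconds, list of names)

def cached_listdir(path):
    """Re-read the directory only when its whole-second mtime has changed."""
    mtime = int(os.stat(path).st_mtime)
    entry = _cache.get(path)
    if entry is not None and entry[0] == mtime:
        return entry[1]             # cache hit: may be stale
    names = sorted(os.listdir(path))
    _cache[path] = (mtime, names)
    return names

def demo_stale_listing():
    """Show that a second add within the same second is missed."""
    d = tempfile.mkdtemp()
    fixed = 1000000000              # arbitrary timestamp, assumed for the demo
    try:
        open(os.path.join(d, "a.tmp"), "w").close()
        os.utime(d, (fixed, fixed))
        first = list(cached_listdir(d))
        open(os.path.join(d, "b.tmp"), "w").close()
        os.utime(d, (fixed, fixed)) # same second as before
        second = list(cached_listdir(d))
        actual = sorted(os.listdir(d))
        return first, second, actual
    finally:
        _cache.pop(d, None)
        shutil.rmtree(d)
```

In this setup the second call returns the old listing even though b.tmp is already on disk, because the mtime the cache compares against has not moved.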

The problem is most probably even worse for NFS, and on Samba-mounted
WinXX filesystems the mtime trick doesn't work at all (stat()
returns the creation time for atime, mtime and ctime).

--
Marc-Andre Lemburg
______________________________________________________________________
Y2000: 60 days left
Business: http://www.lemburg.com/
Python Pages: http://www.lemburg.com/python/