Mailing List Archive

Tar (but not cp) is incredibly slow on certain dirs; request for comments/solution ideas/clues.
Hi people,
I have the following weird behaviour on a Compaq Proliant: 1 GB physical RAM, a
Smart2 Compaq RAID adapter with a 6-disk RAID 5 array, and 2 Xeon CPUs.
I tried using both CPUs or only one (disabling it from the BIOS). I tried
kernels 2.0.36 and 2.2.0pre7 (always with SMP compiled in, even when only 1
CPU was used). I also tried restricting the kernel memory to 64MB (side
note: it appeared (!) to be a little faster with this setting; I assume
handling/searching 1 GB of disk caches is much too slow. If it is not
possible to speed up the handling of cache memory, maybe it would be smarter
to allow setting the kernel to use at most n MB of buffer/cache memory).
Anyway, the effect under all those kernels and CPU counts is as follows: I try
to back up / (the whole Linux installation is in one partition) to another
disk, to a file on the same disk, or even to /dev/null (it doesn't matter). It
runs nicely. But when it comes to /usr/src/linux-2.2.0pre7, where the
2.2.0pre7 sources are, it slows down to a crawl. That means it backs up one of
these really tiny *.c and *.h files every 1 or 2 seconds. Basically, it is
impossible to back this dir up in any reasonable time.
I used tar -l so as to exclude /proc and /mnt, where the destination disk was
mounted. Still, even 'tar -cvf /dev/null /usr/src' showed the exact same
behaviour, although it slowed down sooner since it reached
/usr/src/linux-2.2.0pre7 sooner.
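A hedged aside on the flag itself: in GNU tar of that era, -l meant "stay on the local filesystem"; the long spelling below is the unambiguous one (later tar versions reassigned -l). The /tmp paths are throwaway examples, not the original machine's layout:

```shell
# Create a small throwaway tree and archive it without crossing
# filesystem boundaries; old GNU tar's -l behaved like this flag.
mkdir -p /tmp/onefs-demo
echo 'int main(void){return 0;}' > /tmp/onefs-demo/a.c
tar --one-file-system -cf /tmp/onefs-demo.tar -C /tmp onefs-demo
# List the archive contents to confirm what was captured:
tar -tf /tmp/onefs-demo.tar
```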
During the slowdown, top claims the system is well over 90% idle; the CPU time
consumed by tar and the general system time spent are virtually zero (1 or
2%). Tar is not locked in an uninterruptible sleep waiting for a device ('D'),
nor is there any apparent high disk activity (it just gets these tiny files
every few seconds).
To make things even weirder, cp -rv /usr/src /mnt works like a greased
weasel. So a simple filesystem/disk cache/dname cache/disk device or
driver issue can IMHO be excluded.
Now, honestly, this is the very first time with Linux that I really have no
clue what's going on. Therefore any comments and ideas are appreciated.
Michael.
P.S. Files and disks are currently being moved around a bit more on this
machine. I'll see whether the same holds once /usr/src/linux-2.2.0pre7 has
been moved to another partition.
--
Michael Weller: eowmob@exp-math.uni-essen.de, eowmob@ms.exp-math.uni-essen.de,
or even mat42b@spi.power.uni-essen.de. If you encounter an eowmob account on
any machine in the net, it's very likely it's me.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu
Please read the FAQ at http://www.tux.org/lkml/
Tar (but not cp) is incredibly slow on certain dirs; request for comments/solution ideas/clues.
Michael Weller <eowmob@exp-math.uni-essen.de> writes:
> Anyway, effect under all those kernels/# of CPU's is as follows: I
> try to backup / (whole linux is in one partition) to another disk,
> file on same disk, or even /dev/null (it doesn't matter). It runs
> nicely. But when it comes to /usr/src/linux-2.2.0pre7 where the
> 2.2.0pre7 sources are, it slows down to a crawl. That means it
> backs up one of these really tiny *.c and *.h files there per 1 or
> 2 seconds. Basically, it is impossible to back this dir up in any
> reasonable time.
Do the numeric uids/gids under /usr/src/linux-2.2.0pre7 correspond to
a known user? (Try `find -nouser'.) If not, tar can become _really_
slow, asking NIS for a username anew for each file.
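The check above can be sketched as follows; /tmp/nouser-demo stands in for /usr/src/linux-2.2.0pre7:

```shell
# List files whose numeric uid or gid maps to no known user/group --
# exactly the files that would force tar into a name-service lookup.
mkdir -p /tmp/nouser-demo
touch /tmp/nouser-demo/file.c
find /tmp/nouser-demo -nouser -o -nogroup
# A file we just created is owned by the current (known) user,
# so this find prints nothing.
```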
I had this phenomenon when I upgraded my home machine to SuSE 6.0
with libc6. The default /etc/nsswitch.conf has NIS included, even if
NIS is not installed.
--
Thorsten Ohl, Physics Department, TU Darmstadt -- ohl@hep.tu-darmstadt.de
http://heplix.ikp.physik.tu-darmstadt.de/~ohl/ [<=== PGP public key here]
Re: Tar (but not cp) is incredibly slow on certain dirs; request for comments/solution ideas/clues.
On Thu, 21 Jan 1999, Dr. Michael Weller wrote:
>Anyway, effect under all those kernels/# of CPU's is as follows: I try to
>backup / (whole linux is in one partition) to another disk, file on same
>disk, or even /dev/null (it doesn't matter). It runs nicely. But when it
>comes to /usr/src/linux-2.2.0pre7 where the 2.2.0pre7 sources are, it
>slows down to a crawl. That means it backs up one of these really tiny *.c
>and *.h files there per 1 or 2 seconds. Basically, it is impossible to
>back this dir up in any reasonable time.
>
>I used tar -l as to exclude /proc and /mnt where the destination disk was
>mounted. Still, even 'tar -cvf/dev/null /usr/src' showed the exact same
>behaviour although it slowed down faster as it came faster to
>/usr/src/linux-2.2.0pre7 .
>
>During the slow down, top claims system is well over 90% percent idle,
>CPU time consumed by tar and general system time spent is virtually zero
>(1 or 2 %). Tar is not locked in an uninterruptible sleep waiting for a
>device ('D') nor is there any apparent high disk activity (it just gets
>these tiny files every few seconds).
You could try
# chown -R root: /usr/src/linux*
and check whether it makes any difference.
Something similar showed up on linux-kernel some time ago:
it seems that tar finds those "strange" 1046/1046 values for uid/gid
and asks around (i.e. asks NYS) for them, sleeping while waiting for an answer.
>To make things even more weird, cp -rv /usr/src /mnt works like a greased
>weasel. So, a simple filesystem/disk cache/dname cache/disk device or
>driver issue can IMHO be excluded.
cp doesn't try to resolve uid/gid into names; tar does.
So this *could* be a point in favour of my explanation.
>Now, honestly, this is the very first time with linux I really have no
>clue what's going on. Therefore any comments and ideas are appreciated.
I just sent my idea. Maybe I am wrong...
>Michael.
>
Massimiliano Ghilardi
------------------------------------------------------------------
| I have yet to meet a person who had a bad experience of Linux. |
| Most have never had an experience.                             |
------------------------------------------------------------------
Re: Tar (but not cp) is incredibly slow on certain dirs; request for comments/solution ideas/clues.
On Thu, Jan 21, 1999 at 11:20:54AM +0100, Dr. Michael Weller wrote:
> Hi people,
>
> I've the following weird behaviour on a Compaq Proliant, 1Gig phys ram,
> Smart2 Compaq raid adapter with 6 disk Raid 5 array, 2 Xeon CPU's.
>
> I tried using both CPU's or only one (disabling it from the bios). I tried
> kernel 2.0.36 and 2.2.0pre7 (always with SMP compiled in (even when only 1
> CPU was used)). I also tried restricting the kernel memory to 64MB (side
> note: It appeared (!) to be a little faster with this setting, I assume
> handling/searching 1Gig of disk caches is much too slow. If it is not
> possible to speed the handling of cache memory up, maybe it is smarter to
> allow to set the kernel to use a maximum of n MB of buffer/caches memory.
>
> Anyway, effect under all those kernels/# of CPU's is as follows: I try to
> backup / (whole linux is in one partition) to another disk, file on same
> disk, or even /dev/null (it doesn't matter). It runs nicely. But when it
> comes to /usr/src/linux-2.2.0pre7 where the 2.2.0pre7 sources are, it
> slows down to a crawl. That means it backs up one of these really tiny *.c
> and *.h files there per 1 or 2 seconds. Basically, it is impossible to
> back this dir up in any reasonable time.
>
> I used tar -l as to exclude /proc and /mnt where the destination disk was
> mounted. Still, even 'tar -cvf/dev/null /usr/src' showed the exact same
> behaviour although it slowed down faster as it came faster to
> /usr/src/linux-2.2.0pre7 .
>
> During the slow down, top claims system is well over 90% percent idle,
> CPU time consumed by tar and general system time spent is virtually zero
> (1 or 2 %). Tar is not locked in an uninterruptible sleep waiting for a
> device ('D') nor is there any apparent high disk activity (it just gets
> these tiny files every few seconds).
>
> To make things even more weird, cp -rv /usr/src /mnt works like a greased
> weasel. So, a simple filesystem/disk cache/dname cache/disk device or
> driver issue can IMHO be excluded.
>
> Now, honestly, this is the very first time with linux I really have no
> clue what's going on. Therefore any comments and ideas are appreciated.
>
It might be the same problem that bit me some time ago with NIS:
Linus has uid/gid 1046/1046 or so set in the kernel tar archive.
If this matches neither /etc/passwd nor NIS, it causes
a NIS request for _every_ file. It seems that misses don't get
cached with Linux glibc NIS support.
I found out by running strace on tar, plus tcpdump, and fixed it with
entries for torvalds in the NIS passwd and group maps :-)).
Does your problem persist after chown -R root:root
/usr/src/linux-2.2.0pre7 ?
Ciao
Dietmar
--
Reporter (to Mahatma Gandhi): Mr Gandhi, what do you think of Western
Civilization? Gandhi: I think it would be a good idea.
Dietmar Goldbeck, E-Mail: dietmar@telemedia.de; phone +49-5241-80-7646
Re: VM20 behavior on a 486DX/66MHz with 16MB of RAM
On Thu, 21 Jan 1999, Stephen C. Tweedie wrote:
> No. The algorithm should react to the current *load*, not to what it
> thinks the ideal parameters should be. There are specific things you
Obviously, when the system has a lot of freeable memory in flight there are
no constraints. When instead the system is very low on memory, you have to
choose what to do.
Two choices:
1. You want to give most of the available memory to the process that is
thrashing the VM; in this case you keep the balance percentage of
freeable pages low.
2. You keep the number of freeable pages higher; this way other
interactive processes will run smoothly even with the thrashing program
in the background.
The freeable-page balance percentage you want at any given time can't be
known by the algorithm. 5% of freeable pages always works well here, but you
may want 30% (note, though, that too many pages in the swap cache
are a real risk for the __ugly__ O(n) search we currently do in the
cache, so raising the freeable percentage too much could theoretically
decrease performance (while obviously increasing the swap space
really available for pages not in RAM)).
> can do to the VM which completely invalidate any single set of cache
> figures. For example, you can create large ramdisks which effectively
> lock large amounts of memory into the buffer cache, and there's nothing
> you can do about that. If you rely on magic numbers to get the
> balancing right, then performance simply disappears when you do
> something unexpected like that.
My current VM (not yet diffed and released, because I had no time to play
with Linux today due to off-topic lessons at University) tries to stay close
to a balance percentage (5%) of freeable pages. Note: by freeable pages I
mean pages in the file cache (swapper_inode included) with a reference count
of 1, really shrinkable (does "shrinkable" exist? ;) by shrink_mmap(). I
implemented two new functions, page_get() and page_put() (and hacked
free_pages() and friends), to keep nr_freeable_pages up to date.
> This is not supposition. This is the observed performance of VMs which
> think they know how much memory should be allocated for different
> purposes. You cannot say that cache should be larger than or smaller
> than a particular value, because only the current load can tell you how
> big the cache should be and that load can vary over time.
I know that (though I hadn't noticed, because the old code was quite good).
The reason I can't trust the cache size is that some parts of the cache are
not freeable; in fact I just changed my VM to check the percentage of
_freeable_ pages. The algorithm tries to get close to that percentage
because it knows it's reasonable, but it works fine even if it can't
reach that value. If you don't try to go in a reasonable direction, you
risk swapping out even when there are tons of freeable pages in the swap
cache (maybe because the pages are not distributed equally across the mmap,
so shrink_mmap() expires too early).
The current VM balance is based on (num_physpages << 1) / (priority+1),
and I find this bogus. My current VM changes essentially nothing whether it
uses a starting prio of 6 or of 1. Sure, starting from 1 is less responsive,
but the vmstat numbers are about the same.
> > If I am missing something (again ;) comments are always welcome.
>
> Yes. Offer the functionality of VM limits, sure. Relying on it is a
> disaster if the user does something you didn't predict.
Do you still think this even though I am trying to balance the number of
_freeable_ pages? Note, I never studied memory management.
Everything I do comes from my instinct, so I can be wrong of course... but
I am telling you what I think right now.
BTW, do you remember the benchmark I am using, the one that dirties 160 MB in
a loop and reports how many seconds each loop takes?
Well, it was taking 100 sec in pre1; it started taking 50 sec once I
killed kswapd and started asynchronous swapout also from process context;
and with my current experimental code (not yet released) it runs at 20
sec per loop (the best ever seen here ;). I don't know whether my new
experimental code (with the new nr_freeable_pages) is good in all
aspects, but it sure gives a big boost, and it's real fun (at least Linux
is fun; University is going worse and worse).
Andrea Arcangeli
Re: Tar (but not cp) is incredibly slow on certain dirs; request for comments/solution ideas/clues.
On Thu, 21 Jan 1999, Dr. Michael Weller wrote:
> I tried using both CPU's or only one (disabling it from the bios). I tried
> kernel 2.0.36 and 2.2.0pre7 (always with SMP compiled in (even when only 1
> CPU was used)). I also tried restricting the kernel memory to 64MB (side
> note: It appeared (!) to be a little faster with this setting, I assume
> handling/searching 1Gig of disk caches is much too slow. If it is not
> possible to speed the handling of cache memory up, maybe it is smarter to
> allow to set the kernel to use a maximum of n MB of buffer/caches memory.
Which distro and libc? RedHat 5.0 shipped with a broken glibc which
couldn't completely disable NIS and NIS+, so when it came across a uid
in the filesystem that it didn't know about (like torvalds=1046 :) it
broadcast around the network for a NIS server to identify the miscreant.
Then it waited 2-3 secs, gave up, and stored only the numeric
ID in the tar file. And then it went on to the next file...
I suspect that this is your problem.
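The owner handling described above shows up directly in an archive listing; the /tmp paths below are throwaway examples:

```shell
# GNU tar records a resolved user/group name in the archive header when
# the lookup succeeds, falling back to the numeric id when it fails.
mkdir -p /tmp/tarowner-demo
echo 'hello' > /tmp/tarowner-demo/file.c
tar -cf /tmp/tarowner-demo.tar -C /tmp tarowner-demo
# The verbose listing shows owner/group as names when they resolve:
tar -tvf /tmp/tarowner-demo.tar
```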
Matthew.
Re: Tar (but not cp) is incredibly slow on certain dirs; request for comments/solution ideas/clues.
Dietmar Goldbeck <dietmar@telemedia.de> writes:
> If this doesn't match in /etc/passwd neither in NIS it causes
> a NIS request for _every_ file. Seems that misses don't get
> cached with Linux glibc NIS support.
The library itself must not cache these things. But if you used the
nscd from glibc 2.1 you'd see there is no problem anymore.
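For reference, nscd's caching is driven by /etc/nscd.conf. A minimal sketch of the relevant directives (the directive names are real nscd.conf options; the values are illustrative, not recommendations):

```
# /etc/nscd.conf -- cache passwd and group lookups, and keep negative
# results, so an unknown uid like 1046 is asked about once instead of
# once per file.
enable-cache            passwd      yes
positive-time-to-live   passwd      600
negative-time-to-live   passwd      60
enable-cache            group       yes
positive-time-to-live   group       600
negative-time-to-live   group       60
```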
--
---------------. drepper at gnu.org ,-. 1325 Chesapeake Terrace
Ulrich Drepper \ ,-------------------' \ Sunnyvale, CA 94089 USA
Cygnus Solutions `--' drepper at cygnus.com `------------------------
Re: Tar (but not cp) is incredibly slow on certain dirs; request for comments/solution ideas/clues.
> During the slow down, top claims system is well over 90% percent idle,
> CPU time consumed by tar and general system time spent is virtually zero
You have NIS configured in /etc/nsswitch.conf but are not running NIS.
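A minimal sketch of the corresponding fix, assuming no NIS server exists on the network: remove nis from the lookup order so unknown ids fail immediately instead of timing out (lines shown are illustrative, not a complete nsswitch.conf):

```
# /etc/nsswitch.conf -- look up users and groups in local files only;
# with "nis" listed here but no ypserv running, every miss times out.
passwd: files
group:  files
shadow: files
```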