Performance issues when performing git clone
Hello All,

I've been evaluating ClamAV for only a few months now. What I'm seeing may be a configuration issue on my part. We will see. Anyway, the issue I'm having:

Git clone operations are very slow when clamonacc is running. Granted, I shouldn't have to do this often; however, some users have also complained about slow git fetch operations. A brief description of the process:


* Git clone <repo>. The clamav project itself is a great example.
* Running clamdtop in a separate window shows a single file being scanned. This is a "pack" file located at <repo>/.git/objects/pack/tmp_pack_<random string>. The clamd log (I keep the clamd, clamonacc and clamscan logs separate) produces a whole slew of entries similar to the following:

2021-01-19T15:54:29.288903-06:00 vdr-l999-024 clamd[3561]: THRMGR: active jobs for 0x564759671330: 2
2021-01-19T15:54:29.288945-06:00 vdr-l999-024 clamd[3561]: Consumed entire command
2021-01-19T15:54:29.288985-06:00 vdr-l999-024 clamd[3561]: fds_poll_recv: timeout after 120 seconds
2021-01-19T15:54:29.289038-06:00 vdr-l999-024 clamd[3561]: THRMGR: queue (single) crossed low threshold -> signaling
2021-01-19T15:54:29.289080-06:00 vdr-l999-024 clamd[3561]: THRMGR: queue (bulk) crossed low threshold -> signaling
2021-01-19T15:54:29.289118-06:00 vdr-l999-024 clamd[3561]: Finished scanthread
2021-01-19T15:54:29.289157-06:00 vdr-l999-024 clamd[3561]: THRMGR: group_finished: 0x564759671330, 2
2021-01-19T15:54:29.289195-06:00 vdr-l999-024 clamd[3561]: THRMGR: active jobs for 0x564759671330: 1
2021-01-19T15:54:29.289232-06:00 vdr-l999-024 clamd[3561]: THRMGR: queue (single) crossed low threshold -> signaling
2021-01-19T15:54:29.289270-06:00 vdr-l999-024 clamd[3561]: THRMGR: queue (bulk) crossed low threshold -> signaling
2021-01-19T15:54:31.290165-06:00 vdr-l999-024 clamd[3561]: Received POLLIN|POLLHUP on fd 12
2021-01-19T15:54:31.290615-06:00 vdr-l999-024 clamd[3561]: got command STATS (6, 12), argument:


* The above log entries continue for the duration of the clone operation, which lasts at least a few minutes. There's no discernible break; the next cycle begins less than a second later.
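
To put a number on how tightly these cycles repeat, the clamd log can be scanned for a recurring message and the time between occurrences measured. The following is a minimal sketch, not part of ClamAV: it assumes the syslog layout shown in the excerpt above (ISO 8601 timestamp, host, process tag, message), and the names parse_line and marker_gaps are hypothetical. The sample uses three lines copied from the excerpt; on a full log, a once-per-cycle marker such as "Finished scanthread" would likely be more telling.

```python
from datetime import datetime

# Three lines copied verbatim from the clamd log excerpt above.
SAMPLE_LOG = """\
2021-01-19T15:54:29.288903-06:00 vdr-l999-024 clamd[3561]: THRMGR: active jobs for 0x564759671330: 2
2021-01-19T15:54:29.289118-06:00 vdr-l999-024 clamd[3561]: Finished scanthread
2021-01-19T15:54:29.289195-06:00 vdr-l999-024 clamd[3561]: THRMGR: active jobs for 0x564759671330: 1
"""


def parse_line(line):
    """Split one log line into (timestamp, message).

    Assumes the layout '<ISO timestamp> <host> <process>: <message>'.
    """
    stamp, _host, _process, message = line.split(" ", 3)
    return datetime.fromisoformat(stamp), message.strip()


def marker_gaps(log_text, marker):
    """Return the seconds elapsed between consecutive lines containing marker."""
    stamps = []
    for line in log_text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        stamp, message = parse_line(line)
        if marker in message:
            stamps.append(stamp)
    return [(later - earlier).total_seconds()
            for earlier, later in zip(stamps, stamps[1:])]


# Gap between the two "active jobs" entries in the sample.
print(marker_gaps(SAMPLE_LOG, "THRMGR: active jobs"))
```

Running the same function over the whole log for the duration of a clone would show whether the spacing stays under a second throughout, as it appears to from watching the log.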

If a user attempts to interrupt the operation with <ctrl>-c, the git process becomes defunct, the file continues to be scanned, and the system load grows until either the system is rebooted or the clamonacc service is stopped. Stopping clamonacc allows the git operation to exit completely. I've seen the load average on an otherwise idle system grow to 50.
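
For comparing runs with clamonacc running and stopped, a small timing harness may help. This is only a sketch under stated assumptions: timed_run is a hypothetical helper, os.getloadavg() is available on Unix-like systems only, and the demo runs a trivial stand-in command so that it works anywhere; for a real measurement the command list would be replaced with the git clone invocation.

```python
import os
import subprocess
import sys
import time


def timed_run(command):
    """Run command and return (elapsed_seconds, return_code, load_before, load_after).

    load_before and load_after are the 1-, 5- and 15-minute load averages
    sampled immediately before and after the command runs.
    """
    load_before = os.getloadavg()
    start = time.monotonic()
    completed = subprocess.run(command, check=False)
    elapsed = time.monotonic() - start
    return elapsed, completed.returncode, load_before, os.getloadavg()


# Stand-in command; swap in e.g. ["git", "clone", repo_url, target_dir]
# (repo_url and target_dir are placeholders to be supplied by the reader).
elapsed, return_code, load_before, load_after = timed_run([sys.executable, "-c", "pass"])
print(f"elapsed: {elapsed:.3f}s, exit code: {return_code}")
print(f"1-minute load before/after: {load_before[0]:.2f} / {load_after[0]:.2f}")
```

Running it once with clamonacc active and once with the service stopped gives two elapsed times and load readings to compare directly.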

I've seen the above behavior on both RHEL7 and RHEL8. The RHEL7 systems were configured to communicate with clamd@scan over the network port (127.0.0.1:3301). The logs above came from a RHEL8.3 system that communicated with clamd@scan over the local socket.

It seems to me, and this is what I'm reporting, that processes in which a given file receives updates over the network cause issues. I can cite other cases, but for now I think the git clone performance is an easily reproducible issue. Hah, I wish I could un-produce it. Any thoughts? I've reviewed the existing bugs on Bugzilla, but none of them caught my eye.

Many Thanks,
-Andrew
