[clamav-users] ERROR: VirusEvent: fork failed.

Feb 6, 2020, 6:48 AM

Post #1 of 10 (1019 views)

So I have Clam set up in network mode. On the server I have the
VirusEvent line in the clamd.conf file uncommented, and in place of the
example I have it set to run a script which is supposed to grab the
last line of the clamd.log file and add it to a text file, which is then
emailed to us. I can run the script manually with no problems, and
during my initial testing VirusEvent was working correctly.

Starting two days ago, the email stopped being sent when a virus was
found while I was running tests. I saw the "fork failed" error, and after
some troubleshooting which did not reveal anything I tried rebooting
the server. After the server came back up VirusEvent started working,
so I chalked it up to the server just needing a reboot. Yesterday the
same thing started to happen: during testing I realized that the
emails were not being sent. I checked the logs on the server and saw
the "fork failed" error again, and tried another reboot, but this time that
has not worked. I have found two other threads in this mailing list
with the same error, but neither has any solution to the problem. I
know this setup can work; I'm just stuck on why this error keeps
popping up. Is there anything I can do to get more information from
Clam about what is happening, to hopefully point me to a solution?

_______________________________________________

clamav-users mailing list
clamav-users@lists.clamav.net
https://lists.clamav.net/mailman/listinfo/clamav-users

Help us build a comprehensive ClamAV guide:
https://github.com/vrtadmin/clamav-faq

http://www.clamav.net/contact.html#ml

Re: [clamav-users] ERROR: VirusEvent: fork failed.

clamav-users at lists

Feb 7, 2020, 6:51 AM

Post #2 of 10 (1017 views)

Hi there,

On Thu, 6 Feb 2020, Tom Ossman via clamav-users wrote:

> So I have Clam setup in network mode.

I'm not sure that I know what that means. Please elaborate in as much
detail as it would take for me to reproduce your system.

> On the server I have the VirusEvent line in the clamd.conf file

So I guess you're running clamd. Be aware that there have been some
problems with the VirusEvent feature which have only fairly recently
been fixed (as late as October 2019 - see for example this link:
https://blog.clamav.net/2019/10/clamav-01020-has-been-released.html),
and you might expect that, depending on your use case, there could be
relatively new code in there which hasn't yet been as well exercised
as some of the other code has been.

> uncommented and in place of the example I have it set to run a
> script which is supposed to grab the last line of the clamd.log file
> add that to a text file which is then emailed to us.

Please tell us

What is the server; what resources it has (particularly CPU & memory);
what operating system it uses; what version of ClamAV it uses and how
that was installed; the full configuration files; the exact VirusEvent
script; what you are scanning, how, and how it is presented to ClamAV;
an example line of the log file that you're looking for; how you know
that the last line is the one you're looking for; what other processes
are running on the server and what resources are used by them; relevant
log extracts etc.; and as much about the client(s) too - how many of
them; what they are; what load they present to the server; etc.

> Starting two days ago the email stopped being sent when a virus was
> found when I was running tests. Saw the "fork failed" error and after
> some troubleshooting which did not reveal anything

Please tell us

the EXACT error message; where you found it; what the troubleshooting
was; the test results; what you were doing at the time; and what you
were looking for which was not revealed in the test results.

> I tried rebooting the server. After the server came back up
> VirusEvent started working

It seems like the server might have been running out of resources, but
that's just my conjecture. Please tell us what you have done to
verify that the server has enough resources to do the tasks which it
has to do - for example, have you studied the 'man' page for 'top'?

> so I chalked it up to the server just needing a reboot.

Very woolly thinking, a bit like working with Windows boxes. I run
servers for sometimes more than a year without a reboot, including
servers which run several clamd daemons. I never expect any server to
be "just needing a reboot", and if a production server does need a
reboot to make it work, in the absence of extenuating circumstances I
will consider it broken, and fix it.

> Yesterday same thing started to happen, during testing I realized
> that the emails were not being sent.

Please describe the testing - carefully - and the mail system.

> Checked the logs on the server and saw the "fork failed" error
> again, tried another reboot but this time that has not worked.

Please tell us what IS working; what resources are being used; etc.

> I have found two other threads in this mailing list with the same
> error, but neither has any solutions to the problem. I know this
> setup can work I'm just stuck on why this error keeps popping up.

Please point us to those threads as I'm sure some of the list threads
about failed forks are not relevant to this issue. The only one I see
which might be relevant is over three years old (January 2017, which
is very old in terms of ClamAV development) and, as you say, it was in
any case uninformative all round.

> Is there anything I can do to get more information from Clam about
> what is happening to hopefully point me to a solution?

You might enable debug logging, but at the moment the issues are more
about us getting information from you than you getting it from ClamAV.

--

73,
Ged.

Re: [clamav-users] ERROR: VirusEvent: fork failed.

clamav-users at lists

Feb 10, 2020, 9:05 AM

Post #3 of 10 (1013 views)

Hello,

> > So I have Clam setup in network mode.
>
> I'm not sure that I know what that means. Please elaborate in as much
> detail as it would take for me to reproduce your system.

The whole setup is in AWS. I have one instance set up as a "ClamAV server"
and 7 instances set up as "ClamAV clients"; this link
<https://xn--blgg-hra.no/2016/03/clamav-clientserver-setup/> was the main
reference for this setup. Basically, in the clamd.conf file on each
instance I have the TCP port and IP address lines uncommented and
configured with the IP address of the "server" instance. TCP port 3310 is
allowed between all the instances involved. I installed Clam on each
instance from source. On the "server", clamd and freshclam are both
running. On the "clients" I have disabled clamd and freshclam; clamonacc
is set up as a service to watch a specific uploads directory on each
instance, and a script runs a full system clamdscan starting at 2 am,
staggered by 15 minutes per instance. As I said, the config files on the
"clients" point to the "server", and they also contain the file/directory
exclusions. The "server's" config file is where VirusEvent is configured.

> > On the server I have the VirusEvent line in the clamd.conf file
>
> So I guess you're running clamd. Be aware that there have been some
> problems with the VirusEvent feature which have only fairly recently
> been fixed (as late as October 2019 - see for example this link:
> https://blog.clamav.net/2019/10/clamav-01020-has-been-released.html),
> and you might expect that, depending on your use case, there could be
> relatively new code in there which hasn't yet been as well exercised
> as some of the other code has been.

You are correct, on the server I am running clamd. I was not aware that
the code was that new; I'll review the link you provided.

> Please tell us
> What is the server; what resources it has (particularly CPU & memory);
> what operating system it uses; what version of ClamAV it uses and how
> that was installed; the full configuration files; the exact VirusEvent
> script; what you are scanning, how, and how it is presented to ClamAV;
> an example line of the log file that you're looking for; how you know
> that the last line is the one you're looking for; what other processes
> are running on the server and what resources are used by them; relevant
> log extracts etc.; and as much about the client(s) too - how many of
> them; what they are; what load they present to the server; etc.

The "server" instance is a t3a.small with 2 CPUs and 2 GB of memory, running
Ubuntu 18.04. The ClamAV version is 0.102.1 and, as I stated previously, it
was installed from source using the guide on the ClamAV site. I've attached
the server's full configuration file (cleansed) as well as the VirusEvent
script (also cleansed). I have both clamonacc and clamdscan scanning.
Clamonacc scans a particular directory on each client 24/7; I have
created a clamonacc.service file and loaded that into systemd. Clamdscan
is set up to do a full system scan on each client instance starting at 2 am
EST, staggered by 15 minutes each; a cron job kicks the scan off. Here is
the log output in question:

Wed Feb 5 20:00:16 2020 -> instream(10.5.1.217@44956):
Eicar-Test-Signature(aa991d6e29bf8eb4c1b56c599dffce0a:70) FOUND
Wed Feb 5 20:00:16 2020 -> ERROR: VirusEvent: fork failed.
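
Taking "the last line" of the log assumes nothing else is written between
the detection and the moment the script reads the file. In the excerpt above
the last line is already the error rather than the detection. A minimal
Python sketch that picks the most recent FOUND entry instead; the function
name and the regular expression are assumptions inferred only from the two
log lines shown here, and each entry is assumed to be a single line in the
real log (the wrapping above comes from the mail client).

```python
import re

# Pattern inferred from the excerpt: "<timestamp> -> <source>: <signature> FOUND"
FOUND_RE = re.compile(r"^(?P<when>.+?) -> (?P<source>.+?): (?P<signature>.+) FOUND$")


def last_detection(log_text):
    """Return the most recent FOUND entry as a dict, or None if there is none."""
    # Walk the log backwards so the newest matching entry wins.
    for line in reversed(log_text.splitlines()):
        match = FOUND_RE.match(line.strip())
        if match:
            return match.groupdict()
    return None
```

Called on the contents of clamd.log, this would skip the trailing
"fork failed" line and return the Eicar entry before it.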

I know that the script is giving me the last line of the log file because
it includes the timestamp: if I run the script myself and then look at
the log file, I can see that the last line of the log is the same as the
line that the script included in the email. The server instance was spun
up specifically for this, so other than what is included in the default
Ubuntu AWS image the only things installed on it are ClamAV and msmtp (for
sending the emails). The 7 clients are all running Ubuntu as well. As for
load, they are not all speaking to the server at once; here is a screenshot
of the resources box presented in clamdtop when one of the servers is
running a full scan:
[image: image.png]

> > Starting two days ago the email stopped being sent when a virus was
> > found when I was running tests. Saw the "fork failed" error and after
> > some troubleshooting which did not reveal anything
>
> Please tell us
> the EXACT error message; where you found it; what the troubleshooting
> was; the test results; what you were doing at the time; and what you
> were looking for which was not revealed in the test results.

Exact error message is above, but to keep things clear here it is again:

Wed Feb 5 20:00:16 2020 -> ERROR: VirusEvent: fork failed.

Admittedly the troubleshooting I did was fairly basic. First, I made sure
that msmtp could still send an email from the server (it could). Second,
I made sure the script could be run (it can). Third, I made sure that the
client instance was connecting to the server instance when running a scan
(it was; I verified that by watching clamdtop on the server instance while
the client was running the scan).

> > I tried rebooting the server. After the server came back up
> > VirusEvent started working
>
> It seems like the server might have been running out of resources, but
> that's just my conjecture. Please tell us what you have done to
> verify that the server has enough resources to do the tasks which it
> has to do - for example, have you studied the 'man' page for 'top'?

If I watch the resources being used on the server during a scan on one of
the clients the CPU usage averages around 85%.

> > so I chalked it up to the server just needing a reboot.
>
> Very woolly thinking, a bit like working with Windows boxes. I run
> servers for sometimes more than a year without a reboot, including
> servers which run several clamd daemons. I never expect any server to
> be "just needing a reboot", and if a production server does need a
> reboot to make it work, in the absence of extenuating circumstances I
> will consider it broken, and fix it.

Fair enough.

> > Yesterday same thing started to happen, during testing I realized
> > that the emails were not being sent.
>
> Please describe the testing - carefully - and the mail system.

I was testing the clamdscan cron job. I have msmtp installed on the server;
it is set up to send emails through AWS SES.

> > Checked the logs on the server and saw the "fork failed" error
> > again, tried another reboot but this time that has not worked.
>
> Please tell us what IS working; what resources are being used; etc.

Scans are working/running, clamd and freshclam are working/running, virus
detection is happening; just the VirusEvent is not working.

> > I have found two other threads in this mailing list with the same
> > error, but neither has any solutions to the problem. I know this
> > setup can work I'm just stuck on why this error keeps popping up.
>
> Please point us to those threads as I'm sure some of the list threads
> about failed forks are not relevant to this issue. The only one I see
> which might be relevant is over three years old (January 2017, which
> is very old in terms of ClamAV development) and, as you say, it was in
> any case uninformative all round.

I found two, one is the same one that you reference, the other I cannot
find now but contained even less information than the January '17 one.

Re: [clamav-users] ERROR: VirusEvent: fork failed.

clamav-users at lists

Feb 11, 2020, 4:00 AM

Post #4 of 10 (1013 views)

Hi there,

Thanks for the excellent extra information, it makes things a lot clearer.

On Mon, 10 Feb 2020, Tom Ossman via clamav-users wrote:

> ... the VirusEvent script (also cleansed).

Does the script contain the first two lines as in the version which
you sent to me? If so, remove them. See the 'man' page for the
'file' utility and use it on your script. :)

> I was not aware that the code was that new, I'll review the link ...

Maybe this will help too:

https://blog.clamav.net/2019/09/understanding-and-transitioning-to.html

> The "server" instance is a t3a.small, 2 CPUs and 2 GB of memory...

I'm not sure that'd be enough memory for what you're doing; I suggest
at least 4GB. From personal experience I'd recommend Nagios or Icinga
to monitor resource usage. Depending on your familiarity with things
like Apache it can be a steep learning curve, but once you have it
under your belt it's difficult to imagine working without something
like that when you have to look after more than about half a dozen
systems.
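
Short of setting up Nagios or Icinga, a quick look at /proc/meminfo gives a
first idea of memory headroom on a Linux box. A minimal Python sketch; the
helper names are made up for illustration, and MemAvailable is only reported
by reasonably recent kernels, so the code returns None when it is missing.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of field name -> number."""
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        fields = rest.split()
        # Most fields are reported in kB; a few (HugePages_*) are plain counts.
        if sep and fields and fields[0].isdigit():
            values[name.strip()] = int(fields[0])
    return values


def available_fraction(meminfo):
    """Fraction of total RAM the kernel estimates is available, or None."""
    total = meminfo.get("MemTotal")
    available = meminfo.get("MemAvailable")
    if not total or available is None:
        return None
    return available / total


def read_meminfo(path="/proc/meminfo"):
    """Read and parse the live values (Linux only)."""
    with open(path) as handle:
        return parse_meminfo(handle.read())
```

Running available_fraction(read_meminfo()) on the server while a client scan
is in progress would show how much room is left at the time a VirusEvent
would fire.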

--

73,
Ged.

Re: [clamav-users] ERROR: VirusEvent: fork failed.

clamav-users at lists

Feb 11, 2020, 6:51 AM

Post #5 of 10 (1013 views)

> > ... the VirusEvent script (also cleansed).
>
> Does the script contain the first two lines as in the version which
> you sent to me? If so, remove them. See the 'man' page for the
> 'file' utility and use it on your script. :)

Are you referring to the comment and shebang? If so, the comment isn't
there but the shebang is, and if that is one of the lines you are referring
to, correct me if I am wrong, but doesn't a script need the shebang to run?

> > The "server" instance is a t3a.small, 2 CPUs and 2 GB of memory...
>
> I'm not sure that'd be enough memory for what you're doing; I suggest
> at least 4GB. From personal experience I'd recommend Nagios or Icinga
> to monitor resource usage. Depending on your familiarity with things
> like Apache it can be a steep learning curve, but once you have it
> under your belt it's difficult to imagine working without something
> like that when you have to look after more than about half a dozen
> systems.

I am familiar with Nagios (never ran it though). I can watch the memory
usage on the server a bit more closely during a test scan, but from what I
have observed in the past the memory did not look like it was being maxed
out. But spinning up a larger instance with more memory is not a big deal.

*Tom Ossman*

Re: [clamav-users] ERROR: VirusEvent: fork failed.

clamav-users at lists

Feb 11, 2020, 8:29 AM

Post #6 of 10 (1010 views)

Hi there,

On Tue, 11 Feb 2020, Tom Ossman via clamav-users wrote:
>>
>>> ... the VirusEvent script (also cleansed).
>>
>> Does the script contain the first two lines as in the version which
>> you sent to me? If so, remove them. See the 'man' page for the
>> 'file' utility and use it on your script. :)
>
> Are you referring to the comment and shebang? If so, the comment isn't
> there but the shebang is and if that is one of the lines you are referring
> to, correct me if I am wrong but doesn't a script need the shebang to run?

In the script which you attached the shebang line is the third line.
The first two lines were a non-shebang line comment and a blank line.
Obviously the shebang line must be the first line in the real script.
(When I asked you to show us the script, I didn't mean for you to show
us some rough approximation to it. :)
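
The kernel only honours a shebang when "#!" are the very first two bytes of
the file. A minimal Python sketch of that one check, which is roughly the
part of what 'file' reports that matters here; the function name is made up
for illustration.

```python
def shebang_interpreter(script_bytes):
    """Return the interpreter line if the script starts with '#!', else None."""
    # A comment or blank line ahead of the shebang means there is no shebang.
    if not script_bytes.startswith(b"#!"):
        return None
    first_line = script_bytes.split(b"\n", 1)[0]
    return first_line[2:].decode("utf-8", "replace").strip()
```

Reading the script in binary mode and passing its contents to this function
would return None for a file whose first line is a comment, as in the
attached version.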

You might want to use full path names for things like 'cat' in the
script so that it doesn't depend e.g. on environment variables which
might not be set.
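
As a sketch of the same idea in Python (this is not the script from the
attachment): the mailer is called by absolute path, so nothing depends on
PATH. The msmtp location and the recipient address are assumptions for
illustration. The CLAM_VIRUSEVENT_* variables are the ones described in the
clamd.conf documentation for VirusEvent; check that your version sets them.

```python
import subprocess

MSMTP = "/usr/bin/msmtp"          # assumed install location, adjust as needed
RECIPIENT = "alerts@example.com"  # hypothetical address


def build_message(virus_name, file_name):
    """Build a minimal mail message (headers, blank line, body)."""
    return (
        f"To: {RECIPIENT}\n"
        f"Subject: ClamAV detection: {virus_name}\n"
        "\n"
        f"clamd reported {virus_name} in {file_name}\n"
    )


def send_alert(environ):
    """Send one alert based on the variables clamd exports to the handler."""
    virus = environ.get("CLAM_VIRUSEVENT_VIRUSNAME", "unknown")
    fname = environ.get("CLAM_VIRUSEVENT_FILENAME", "unknown")
    # The absolute path means the call does not rely on PATH being set.
    result = subprocess.run(
        [MSMTP, RECIPIENT],
        input=build_message(virus, fname),
        text=True,
        check=False,
    )
    return result.returncode
```

A handler would call send_alert(os.environ) after importing os; the message
builder is kept separate so it can be checked without sending mail.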

> ... from what I have observed in the past the memory did not look like
> it was being maxed out. But spinning up a larger instance with more
> memory is not a big deal.

Please let me know if you have any more information about resources.

We still don't know that this is not a fault in ClamAV itself of
course, but as it appears that at some time the system _was_ working,
it seems much more likely to be an issue with your configuration.

--

73,
Ged.

Re: [clamav-users] ERROR: VirusEvent: fork failed.

clamav-users at lists

Feb 11, 2020, 8:57 AM

Post #7 of 10 (1010 views)

Wanted to add a bit of insight to this convo from the dev side of things:

VirusEvent currently works by forking the existing clamd process into a new, short-lived process that handles execution of the user's script.

This is a legacy design choice and is problematic for a number of reasons--most relevant here is that you will need at minimum 2x the amount of resources clamd is already using to execute the VirusEvent. It was this resource drain, combined with the threaded nature of the old on access code, which led to us disabling the feature (only for on access scanning, not clamd/clamdscan).

From what I can tell, your problem is that the fork() system call is failing (the code path for that error requires a negative return from fork())--very likely due to lack of resources on the server.

Ideally, we would fix this resource consumption issue on its own, or better, as part of a larger redesign of clamd, but for now--like Ged, I would also recommend increasing memory resources and seeing if that solves the issue.
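
For illustration, the fork-then-exec pattern described above looks roughly
like this in Python. This is not clamd's code (clamd is written in C); it is
a sketch of the same pattern with a hypothetical function name. In Python a
failed fork() raises OSError instead of returning a negative value; ENOMEM
and EAGAIN are the usual errno values when memory or process limits are the
cause.

```python
import errno
import os


def run_event(argv):
    """Fork, exec argv in the child, and return the child's exit status.

    Returns None if fork() itself fails, which is the situation the
    "VirusEvent: fork failed" message reports.
    """
    try:
        pid = os.fork()
    except OSError as exc:
        # C's fork() would return -1 here; Python raises instead.
        name = errno.errorcode.get(exc.errno, "unknown")
        print(f"fork failed: {name}: {exc.strerror}")
        return None
    if pid == 0:
        # Child: replace the process image with the handler.
        try:
            os.execv(argv[0], argv)
        finally:
            os._exit(127)  # only reached if exec failed
    # Parent: wait for the short-lived child.
    _, status = os.waitpid(pid, 0)
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)
    return -os.WTERMSIG(status)
```

The handler path passed in argv[0] must be absolute, since execv() does not
search PATH.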

-Mickey

Re: [clamav-users] ERROR: VirusEvent: fork failed.

clamav-users at lists

Feb 14, 2020, 9:47 AM

Post #8 of 10 (1010 views)

Thank you both, I spun up a larger instance (t3a.xlarge) and everything is
working as expected now.

*Tom Ossman*

tossman@aspirevc.com | aspirevc.com | +1.717.468.0293

100 North Queen Street | Suite 300 | Lancaster, PA 17603

Engage with us on Twitter <https://twitter.com/AspireVC> | LinkedIn
<https://www.linkedin.com/company/aspire_ventures> | Facebook
<https://www.facebook.com/aspirevc>

The information contained in this electronic message is legally privileged
and confidential information intended only for the person to whom the
message is addressed. If the reader of this message is not the intended
recipient, you are hereby notified that any dissemination, distribution, or
copy of this electronic message is strictly prohibited. If you have
received this electronic message in error, please immediately notify us by
return electronic message, and then delete this electronic message. Thank
you.

On Tue, Feb 11, 2020 at 11:58 AM Mickey Sola (micksola) via clamav-users <
clamav-users@lists.clamav.net> wrote:

> Wanted to add a bit of insight to this convo from the dev side of things:
>
> VirusEvent currently works by forking the existing clamd process into a
> new, short-lived process that handles execution of the user's script.
>
> This is a legacy design choice and is problematic for a number of
> reasons--most relevant here is that you will need at minimum 2x the amount
> of resources clamd is already using to execute the VirusEvent. It was this
> resource drain, combined with the threaded nature of the old on access
> code, which led to us disabling the feature (only for on access scanning,
> not clamd/clamdscan).
>
> From what I can tell, your problem is that the fork system command is
> failing (code path for that error requires a negative return for
> fork())--very likely due to lack of resources on the server.
>
> Ideally, we would fix this resource consumption issue on its own, or
> better, as part of a larger redesign of clamd, but for now--like Ged, I
> would also recommend increasing memory resources and seeing if that solves
> the issue.
>
> -Mickey
>
>
>
> On 2020-02-11 11:30:11-05:00 clamav-users wrote:
>
> Hi there,
>
> On Tue, 11 Feb 2020, Tom Ossman via clamav-users wrote:
> >>
> >>> ... the VirusEvent script (also cleansed).
> >>
> >> Does the script contain the first two lines as in the version which
> >> you sent to me? If so, remove them. See the 'man' page for the
> >> 'file' utility and use it on your script. :)
> >>
> >> Are you refering to the comment and shebang? If so, the comment isn't
> > there but the shebang is and if that is one of the lines you are refering
> > to, correct me if I am wrong but doesn't a script need the shebang to run?
>
> In the script which you attached the shebang line is the third line.
> The first two lines were a non-shebang line comment and a blank line.
> Obviously the shebang line must be the first line in the real script.
> (When I asked you to show us the script, I didn't mean for you to show
> us some rough approximation to it. :)
>
> You might want to use full path names for things like 'cat' in the
> script so that it doesn't depend e.g. on environment variables which
> might not be set.
>
> > ... from what I have observed in the past the memory did not look like
> > it was being maxed out. But spinning up a larger instance with more
> > memory is not a big deal.
>
> Please let me know if you have any more information about resources.
>
> We still don't know that this is not a fault in ClamAV itself of
> course, but as it appears that at some time the system _was_ working,
> it seems much more likely to be an issue with your configuration.
>
> --
>
> 73,
> Ged.
>

Re: [clamav-users] ERROR: VirusEvent: fork failed. [ In reply to ]

clamav-users at lists

Feb 16, 2020, 6:33 PM

Post #9 of 10 (1010 views)

Permalink

On 2020-02-11 at 16:57 +0000, Mickey Sola (micksola) via clamav-users
wrote:
> Wanted to add a bit of insight to this convo from the dev side of
> things:
>
> VirusEvent currently works by forking the existing clamd process into
> a new, short-lived process that handles execution of the user's
> script.
>
> This is a legacy design choice and is problematic for a number of
> reasons--most relevant here is that you will need at minimum 2x the
> amount of resources clamd is already using to execute the VirusEvent.
> It was this resource drain, combined with the threaded nature of the
> old on access code, which led to us disabling the feature (only for on
> access scanning, not clamd/clamdscan).
>
> From what I can tell, your problem is that the fork system command is
> failing (code path for that error requires a negative return for
> fork())--very likely due to lack of resources on the server.
>
> Ideally, we would fix this resource consumption issue on its own, or
> better, as part of a larger redesign of clamd, but for
> now--like Ged, I would also recommend increasing memory resources and
> seeing if that solves the issue.
>
> -Mickey

This can be easily solved by changing the fork() to vfork():

--- ./clamd/others.c 2020-02-04 15:59:26.000000000 +0100
+++ ./clamd/others.c.new 2020-02-17 02:25:10.404123000 +0100
@@ -157,9 +157,9 @@

     pthread_mutex_lock(&virusaction_lock);
     /* We can only call async-signal-safe functions after fork(). */
-    pid = fork();
+    pid = vfork();
     if (pid == 0) { /* child */
-        exit(execle("/bin/sh", "sh", "-c", buffer_cmd, NULL, env));
+        _exit(execle("/bin/sh", "sh", "-c", buffer_cmd, NULL, env));
     } else if (pid > 0) { /* parent */
         pthread_mutex_unlock(&virusaction_lock);
         while (waitpid(pid, NULL, 0) == -1 && errno == EINTR) continue;

You will surely find text saying that nowadays there is no need for
vfork(), since fork() uses copy-on-write and has practically no
penalty.
Well, I found years ago that there *is* a difference (at least on
Linux, which is used by the OP). When the memory usage of the process
goes over a certain point¹ (in the default overcommit_memory=0) fork()
will start failing, while vfork() still works.

The context around that code means that it is safe to exchange the
fork() with vfork(), as it only modifies a pid_t variable "before
successfully calling _exit(2) or one of the exec(3) family of
functions".
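
For illustration only, the pattern the patch relies on can be sketched as a standalone helper. This is not ClamAV code: the name run_shell_command is hypothetical, and passing the caller's environ stands in for the env array that clamd builds. The child does nothing between vfork() and exec except fall through to _exit() if the exec fails.

```c
#define _DEFAULT_SOURCE /* ask glibc to declare vfork() under strict -std modes */
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

extern char **environ;

/* Hypothetical helper (not a ClamAV function): run `cmd` through /bin/sh
 * and return its exit status, or -1 if it could not be run or waited for. */
static int run_shell_command(const char *cmd)
{
    int status = 0;
    pid_t pid;

    assert(cmd != NULL);

    pid = vfork();
    if (pid == 0) {
        /* Child: the parent is suspended and shares our memory until we
         * exec or _exit, so do nothing else here. */
        execle("/bin/sh", "sh", "-c", cmd, (char *)NULL, environ);
        _exit(127); /* reached only if execle() failed */
    }
    if (pid < 0)
        return -1; /* vfork() itself failed */

    /* Parent: reap the child, retrying if a signal interrupts the wait. */
    while (waitpid(pid, &status, 0) == -1) {
        if (errno != EINTR)
            return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling run_shell_command("exit 3") should hand back the shell's exit status. The value 127 for a failed exec follows the usual shell convention and is a choice made for this sketch, not something taken from the patch.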

Note that I am changing the original exit(3) call, used if execle
fails, to _exit(2). Using exit(3) seems like a bug even in the fork()
version, since atexit() handlers would be called in the child
process, which could have undesired effects.
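
The atexit() point can be checked with a small experiment. The sketch below is an illustration written for this note, not code from ClamAV, and all of its names are made up: it registers a handler that writes one byte to a pipe, forks a child that leaves through either exit(3) or _exit(2), and reports whether the handler ran in the child.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Write end of the pipe the handler reports on; -1 means "stay silent". */
static int notify_fd = -1;

static void on_exit_handler(void)
{
    if (notify_fd >= 0) {
        ssize_t n = write(notify_fd, "x", 1);
        (void)n;
    }
}

/* Hypothetical demo: returns 1 if the atexit() handler ran in the child,
 * 0 if it did not, and -1 on setup failure. */
static int handler_ran_in_child(int use_plain_exit)
{
    static int registered = 0;
    int fds[2];
    char byte;
    ssize_t n;
    pid_t pid;

    assert(use_plain_exit == 0 || use_plain_exit == 1);

    if (!registered) {
        if (atexit(on_exit_handler) != 0)
            return -1;
        registered = 1;
    }
    if (pipe(fds) != 0)
        return -1;

    notify_fd = fds[1];
    pid = fork();
    if (pid < 0) {
        notify_fd = -1;
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {
        close(fds[0]);
        if (use_plain_exit)
            exit(0);  /* runs atexit() handlers inherited from the parent */
        _exit(0);     /* skips them */
    }

    /* Parent: silence the handler for our own exit, then collect the result. */
    notify_fd = -1;
    close(fds[1]);
    while (waitpid(pid, NULL, 0) == -1 && errno == EINTR)
        continue;
    n = read(fds[0], &byte, 1); /* 1 byte if the handler ran, 0 at EOF */
    close(fds[0]);
    return n == 1;
}
```

With exit(3) the child runs the handler it inherited from the parent; with _exit(2) it does not. In a long-running daemon such handlers may flush buffers or remove files, which is presumably the kind of undesired effect meant above.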

It has been about 10 years since I last tested this, but I have made a
cheap program showing the issue. Hope it doesn't get stripped by the
mailing list.

Compiling with
gcc -o vfork-test vfork-test.c -O3
will, depending on the host memory, quickly stop after a few iterations
under the normal overcommit_memory=0, fail much earlier for
non-overcommitting (2), and "never" for always overcommit (1).

If instead of fork() you use vfork():
gcc -o vfork-test -Dfork=vfork vfork-test.c -O3

you get the "always forking" behavior also under overcommit_memory=0,
without changing the overcommit behavior systemwide (and a slight
increase under non-overcommitting mode).
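
The attached vfork-test.c is not reproduced in this archive. As a rough stand-in, and explicitly not the original program, a probe along the following lines could be used to explore the same effect. The function name and the MiB-based interface are assumptions made for this sketch. It keeps the call spelled fork() so that the -Dfork=vfork trick above still applies, and the child only calls _exit(), which is also safe under vfork().

```c
#define _DEFAULT_SOURCE /* ask glibc to declare vfork() if built with -Dfork=vfork */
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical probe: hold `mib` MiB of heap, then attempt a single fork.
 * Returns 1 if the fork succeeded, 0 if it failed, -1 if malloc failed. */
static int fork_works_with_heap(size_t mib)
{
    size_t bytes = mib * 1024 * 1024;
    char *block;
    pid_t pid;
    int ok;

    assert(mib > 0);

    block = malloc(bytes);
    if (block == NULL)
        return -1;
    /* Touch the pages so they are actually backed by memory. An optimizing
     * compiler may elide this; build with -O0 to be sure it happens. */
    memset(block, 0xA5, bytes);

    pid = fork();
    if (pid == 0)
        _exit(0); /* child: leave immediately */

    ok = (pid > 0);
    if (ok) {
        while (waitpid(pid, NULL, 0) == -1 && errno == EINTR)
            continue;
    }
    free(block);
    return ok;
}
```

Calling this in a loop with growing sizes, on a test machine rather than a production server, should show the size at which fork() starts to fail under the current overcommit setting. The exact threshold depends on the host's RAM and swap.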

Kind regards

Note: use -O0 if you want the program to force the memory to be
allocated.

¹ https://unix.stackexchange.com/questions/378172/what-does-heuristics-in-overcommit-memory-0-mean

Re: [clamav-users] ERROR: VirusEvent: fork failed. [ In reply to ]

clamav-users at lists

Feb 17, 2020, 8:34 PM

Post #10 of 10 (1008 views)

Permalink

Ángel,

Thanks! I've added a link to your email (and attached example) to our Jira task for improving VirusEvent. We'll definitely try it out.

-Micah

Micah Snyder
ClamAV Development
Talos
Cisco Systems, Inc.
