Mailing List Archive

race condition in clamd
Hello!

I think I fixed race condition in clamd. You initialized startup time of
thread after its flag set to active and after thread is created. Creation of
new thread sometimes switches timeslice to the monitor, it compares current
time with that unitialized value and sometimes kills the new thread
complaining about timeout to syslog. Try to

clamdscan bigfile & clamdscan bigfile & clamdscan bigfile & clamdscan
bigfile & clamdscan bigfile & clamdscan bigfile & clamdscan bigfile &
clamdscan bigfile & clamdscan bigfile & clamdscan bigfile & clamdscan
bigfile & clamdscan bigfile & clamdscan bigfile & clamdscan bigfile &
clamdscan bigfile & clamdscan bigfile & clamdscan bigfile & clamdscan
bigfile &

with enough big amount of maxchildren (I use 20) and look if some of them
die because of instant timeout.

I suggest to initialize ths[i].active and ths[i].start in thread function
itself. And it is good to close descriptor if the thread did not start (due
to lack of system resources, for example).

Another thing in clamd is it uses FILE *tmp to create temporary file for
stream, but closes only file descriptor, not whole stream.

Clamav-milter sometimes hang on final while(recv(...)) trying to wait for
clamd to close stream.

And thing not covered by this patch is file descriptor leakage in clamd.
Under linux go to /proc/<CLAMD-PID>/fd and compare its contents right after
startup and after several hours of milter operation. With my patch you'll be
able to see increasing filedescriptor number when accepting connection.
netstat shows hanging connections to clamd command socket.

Following patch added some stability on my system...

misha.

diff -ru clamav-devel/clamav-milter/clamav-milter.c clamav-devel.misha/clamav-milter/clamav-milter.c
--- clamav-devel/clamav-milter/clamav-milter.c Wed Oct 22 23:44:01 2003
+++ clamav-devel.misha/clamav-milter/clamav-milter.c Wed Oct 29 19:30:39 2003
@@ -1358,8 +1365,9 @@
/*
* Flush the remote end so that clamd doesn't get a SIGPIPE
*/
- while(recv(privdata->cmdSocket, buf, sizeof(buf), 0) > 0)
- ;
+// clamd seems to forget about this...
+// while(recv(privdata->cmdSocket, buf, sizeof(buf), 0) > 0)
+// ;
close(privdata->cmdSocket);
privdata->cmdSocket = -1;
}
diff -ru clamav-devel/clamd/scanner.c clamav-devel.misha/clamd/scanner.c
--- clamav-devel/clamd/scanner.c Sun Oct 26 09:00:57 2003
+++ clamav-devel.misha/clamd/scanner.c Wed Oct 29 22:00:12 2003
@@ -27,6 +27,7 @@
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
+#include <errno.h>
#include <clamav.h>

#include "cfgfile.h"
@@ -180,7 +181,7 @@
char *virname, buff[32768];
struct sockaddr_in server;
struct cfgstruct *cpt;
- FILE *tmp;
+ FILE *tmp=NULL;


while(!binded && portscan--) {
@@ -219,12 +220,13 @@
}


- logg("*Accepted connection on port %d\n", port);
+ logg("*Accepted connection on port %d, fd %d\n", port, acceptd);

if(cfgopt(copt, "StreamSaveToDisk")) {
if((tmp = tmpfile()) == NULL) {
shutdown(sockfd, 2);
close(sockfd);
+ close(acceptd);
mdprintf(odesc, "Temporary file ERROR\n");
logg("!ScanStream: Can't create temporary file.\n");
return -1;
@@ -240,18 +242,22 @@
if(maxsize && (size + sizeof(buff)) > maxsize) {
shutdown(sockfd, 2);
close(sockfd);
+ close(acceptd);
mdprintf(odesc, "Size exceeded ERROR\n");
logg("^ScanStream: Size exceeded (stopped at %d, max: %d)\n", size, maxsize);
- close(tmpd);
+ if(tmp)
+ fclose(tmp);
return -1;
}

if(write(tmpd, buff, bread) < 0) {
shutdown(sockfd, 2);
close(sockfd);
+ close(acceptd);
mdprintf(odesc, "Temporary file -> write ERROR\n");
- logg("!ScanStream: Can't write to temporary file.\n");
- close(tmpd);
+ logg("!ScanStream: Can't write %d bytes to temporary file: %d\n", bread, errno);
+ if(tmp)
+ fclose(tmp);
return -1;
}

@@ -259,7 +265,8 @@

lseek(tmpd, 0, SEEK_SET);
ret = cl_scandesc(tmpd, &virname, scanned, root, limits, options);
- close(tmpd);
+ if(tmp)
+ fclose(tmp);

} else
ret = cl_scandesc(acceptd, &virname, scanned, root, limits, 0);
diff -ru clamav-devel/clamd/server.c clamav-devel.misha/clamd/server.c
--- clamav-devel/clamd/server.c Wed Oct 8 14:40:53 2003
+++ clamav-devel.misha/clamd/server.c Thu Oct 30 00:28:26 2003
@@ -63,6 +63,8 @@
sigset_t sigset;
int bread, options, maxwait = CL_DEFAULT_MAXWHILEWAIT;

+ ths[tharg->sid].start=time(NULL);
+ ths[tharg->sid].active=1;

/* ignore all signals */
sigfillset(&sigset);
@@ -512,6 +514,7 @@
sigaddset(&sigact.sa_mask, SIGHUP);
sigaction(SIGINT, &sigact, NULL);
sigaction(SIGTERM, &sigact, NULL);
+ sigaction(SIGPIPE, &sigact, NULL);
#ifndef CL_DEBUG
sigaction(SIGSEGV, &sigact, NULL);
#endif
@@ -582,9 +585,10 @@
tharg->options = options;

ths[i].desc = acceptd;
- ths[i].active = 1;
- pthread_create(&ths[i].id, &thattr, threadscanner, tharg);
- ths[i].start = time(NULL);
+ if(pthread_create(&ths[i].id, &thattr, threadscanner, tharg)!=0) {
+ logg("Session(%d) did not start. Dropping connection.",i);
+ close(acceptd);
+ }
}
}

@@ -625,6 +629,9 @@
case SIGHUP:
sighup = 1;
logg("SIGHUP catched: log file re-opened.\n");
+ break;
+ case SIGPIPE:
+ logg("SIGPIPE catched by pthread id %d.\n", (int)pthread_self());
break;
}
}
Re: race condition in clamd [ In reply to ]
Hi Misha,

> thread after its flag set to active and after thread is created.
> Creation of new thread sometimes switches timeslice to the monitor, it
> compares current time with that unitialized value and sometimes kills
> the new thread complaining about timeout to syslog. Try to

I agree - the start time must be initialized before the thread structure
is activated.

> I suggest to initialize ths[i].active and ths[i].start in thread
> function

This is not a good idea - it (theoritically) introduces a new race
condition in the main thread - if for some reason there will be a delay
in new thread execution (eg. some context switching), the main thread
may assign the same thread session once more.

> itself. And it is good to close descriptor if the thread did not start
> (due to lack of system resources, for example).

Right.

Thank you.

Best regards,
Tomasz Kojm
--
oo ..... http://www.clamav.net/gpg/tkojm.gpg
(\/)\......... 0DCA5A08407D5288279DB43454822DC8985A444B
\..........._ Fri Oct 31 22:37:53 CET 2003
//\ /\
Re: race condition in clamd [ In reply to ]
On Thu, 30 Oct 2003 09:41:35 +0300 (EET)
Michael Dankov <misha@btrc.ru> wrote:

> Another thing in clamd is it uses FILE *tmp to create temporary file
> for
> stream, but closes only file descriptor, not whole stream.

You're right, thanks !

29 22:00:12 2003
> @@ -27,6 +27,7 @@
> #include <sys/types.h>
> #include <sys/socket.h>
> #include <netinet/in.h>
> +#include <errno.h>

Never use a global error variable in multithreaded environment because
it introduces a race condition and the resulting error number is not
reliable.

> + case SIGPIPE:
> + logg("SIGPIPE catched by pthread id %d.\n",
> (int)pthread_self());

Some thread libraries use internal structures for pthread_t so we
shouldn't cast it because it makes the code not portable.

Thanks.

Best regards,
Tomasz Kojm
--
oo ..... http://www.clamav.net/gpg/tkojm.gpg
(\/)\......... 0DCA5A08407D5288279DB43454822DC8985A444B
\..........._ Sat Nov 1 03:20:19 CET 2003
//\ /\
Re: race condition in clamd [ In reply to ]
On Sat, Nov 01, 2003 at 03:27:54AM +0100, Tomasz Kojm wrote:
> 29 22:00:12 2003
> > @@ -27,6 +27,7 @@
> > #include <sys/types.h>
> > #include <sys/socket.h>
> > #include <netinet/in.h>
> > +#include <errno.h>
> Never use a global error variable in multithreaded environment because
> it introduces a race condition and the resulting error number is not
> reliable.

I missed the context here, but if it is 'errno', errno only appears to
be a global variable on most systems that support threading. For example,
in linux/errno.h:

# define errno (*__errno_location ())

Where __errno_location() returns a thread-local memory location that
is reliable.

Well, I suppose a signal handler could overwrite it, but that is true
of a non-threaded program as well. Signal handlers should play nice. :-)

Cheers,
mark

--
mark@mielke.cc/markm@ncf.ca/markm@nortelnetworks.com __________________________
. . _ ._ . . .__ . . ._. .__ . . . .__ | Neighbourhood Coder
|\/| |_| |_| |/ |_ |\/| | |_ | |/ |_ |
| | | | | \ | \ |__ . | | .|. |__ |__ | \ |__ | Ottawa, Ontario, Canada

One ring to rule them all, one ring to find them, one ring to bring them all
and in the darkness bind them...

http://mark.mielke.cc/
Re: race condition in clamd [ In reply to ]
On Sat, 1 Nov 2003 01:59:16 -0500
Mark Mielke <mark@mark.mielke.cc> wrote:

> On Sat, Nov 01, 2003 at 03:27:54AM +0100, Tomasz Kojm wrote:
> > 29 22:00:12 2003
> > > @@ -27,6 +27,7 @@
> > > #include <sys/types.h>
> > > #include <sys/socket.h>
> > > #include <netinet/in.h>
> > > +#include <errno.h>
> > Never use a global error variable in multithreaded environment
> > because it introduces a race condition and the resulting error
> > number is not reliable.
>
> I missed the context here, but if it is 'errno', errno only appears to
> be a global variable on most systems that support threading. For
> example, in linux/errno.h:

I agree, but this is not portable and we cannot rely on it.

Best regards,
Tomasz Kojm
--
oo ..... http://www.clamav.net/gpg/tkojm.gpg
(\/)\......... 0DCA5A08407D5288279DB43454822DC8985A444B
\..........._ Sat Nov 1 14:14:02 CET 2003
//\ /\
Re: race condition in clamd [ In reply to ]
On Sat, Nov 01, 2003 at 02:15:36PM +0100, Tomasz Kojm wrote:
> On Sat, 1 Nov 2003 01:59:16 -0500
> Mark Mielke <mark@mark.mielke.cc> wrote:
> > > > +#include <errno.h>
> > > Never use a global error variable in multithreaded environment
> > > because it introduces a race condition and the resulting error
> > > number is not reliable.
> > I missed the context here, but if it is 'errno', errno only appears to
> > be a global variable on most systems that support threading. For
> > example, in linux/errno.h:
> I agree, but this is not portable and we cannot rely on it.

Apache 2.0 is in use on more platforms than clamav, I suspect, and they
do not claim that errno should not be used:

http://httpd.apache.org/docs-2.0/developer/thread_safety.html

Rather, they claim that all good threaded programs should define
_REENTRANT. This is compatible with my understanding. Any thread
implementation that does not redefine errno is broken as most common
system calls use errno exclusively as a means of passing information
about a failure to the calling function.

Perhaps you, or whoever has experienced problems with errno in a
threaded environment on some platforms, did not define _REENTRANT when
compiling the source code?

Cheers,
mark

--
mark@mielke.cc/markm@ncf.ca/markm@nortelnetworks.com __________________________
. . _ ._ . . .__ . . ._. .__ . . . .__ | Neighbourhood Coder
|\/| |_| |_| |/ |_ |\/| | |_ | |/ |_ |
| | | | | \ | \ |__ . | | .|. |__ |__ | \ |__ | Ottawa, Ontario, Canada

One ring to rule them all, one ring to find them, one ring to bring them all
and in the darkness bind them...

http://mark.mielke.cc/
Re: race condition in clamd [ In reply to ]
On Sat, 1 Nov 2003 22:06:11 -0500
Mark Mielke <mark@mark.mielke.cc> wrote:

> Perhaps you, or whoever has experienced problems with errno in a
> threaded environment on some platforms, did not define _REENTRANT when
> compiling the source code?

I never used errno in multithreaded software but I think it's quite a
good idea to implement an autoconf macro that will check for a
thread-safe errno. Some older software may not be compatible (just like
old linux kernels don't support sendfile() and clamav compilation fails)
and I think a code using errno should be #ifdefined.

Best regards,
Tomasz Kojm
--
oo ..... http://www.clamav.net/gpg/tkojm.gpg
(\/)\......... 0DCA5A08407D5288279DB43454822DC8985A444B
\..........._ Sun Nov 2 04:20:07 CET 2003
//\ /\
Re: race condition in clamd [ In reply to ]
On Sun, Nov 02, 2003 at 04:26:22AM +0100, Tomasz Kojm wrote:
> On Sat, 1 Nov 2003 22:06:11 -0500
> Mark Mielke <mark@mark.mielke.cc> wrote:
> > Perhaps you, or whoever has experienced problems with errno in a
> > threaded environment on some platforms, did not define _REENTRANT when
> > compiling the source code?
> I never used errno in multithreaded software but I think it's quite a
> good idea to implement an autoconf macro that will check for a
> thread-safe errno. Some older software may not be compatible (just like
> old linux kernels don't support sendfile() and clamav compilation fails)
> and I think a code using errno should be #ifdefined.

How do you determine why read() or write() fail in a thread, then?

Unless you can point at a sensible implementation of threads that does
not override errno, I strongly disagree with your conclusion.

Errno is the *only* way of performing error handling for most UNIX system
calls. There is *no* reasonable alternative. Your example of sendfile() is
not correct as sendfile() is not a standard UNIX system call.

mark

--
mark@mielke.cc/markm@ncf.ca/markm@nortelnetworks.com __________________________
. . _ ._ . . .__ . . ._. .__ . . . .__ | Neighbourhood Coder
|\/| |_| |_| |/ |_ |\/| | |_ | |/ |_ |
| | | | | \ | \ |__ . | | .|. |__ |__ | \ |__ | Ottawa, Ontario, Canada

One ring to rule them all, one ring to find them, one ring to bring them all
and in the darkness bind them...

http://mark.mielke.cc/
Re: race condition in clamd [ In reply to ]
On Sat, 1 Nov 2003 23:51:17 -0500
Mark Mielke <mark@mark.mielke.cc> wrote:

> Errno is the *only* way of performing error handling for most UNIX
> system calls. There is *no* reasonable alternative. Your example of

I agree. The most annoying thing for me is that cannot find any
official statement clarifying errno usage in multithreaded environments.
Anyway, I think there's no big risk with errno because in the worst case
scenario it may only give a false result. I just verified that my
favorite (and probably the best one) user level thread library - GNU Pth
- implements errno on per-thread basis. I used to negate errno usage in
clamav but after this discussion (and some google'ing) I'm no longer an
opponent. Thanks ;)

> sendfile() is not correct as sendfile() is not a standard UNIX system
> call.

Well, it seems errno usage in threads is not standardized, too.

Best regards,
Tomasz Kojm
--
oo ..... http://www.clamav.net/gpg/tkojm.gpg
(\/)\......... 0DCA5A08407D5288279DB43454822DC8985A444B
\..........._ Sun Nov 2 13:23:01 CET 2003
//\ /\
Re: race condition in clamd [ In reply to ]
Am 02.11.2003 um 13:45 schrieb Tomasz Kojm:

> I agree. The most annoying thing for me is that cannot find any
> official statement clarifying errno usage in multithreaded
> environments.

Let me quote a little from [1]:

1.9.3 Checking for errors

"Pthreads introduces a new way to report errors, without using the
errno variable.
...
Pthreads also provides a per-thread errno, which supports other code
that uses errno. This means that when one thread calls some function
that reports an error using errno, the value cannot be overwritten, or
read, by any other thread - you may go on using errno just as you
always have."

I recommend you get (and read...) [1].

- Marc


[1] Butenhof, David R., Programming with POSIX Threads
Re: race condition in clamd [ In reply to ]
On Sun, 2 Nov 2003 14:18:52 +0100
Marc Balmer <marc@msys.ch> wrote:

> 1.9.3 Checking for errors
>
> "Pthreads introduces a new way to report errors, without using the
> errno variable.
> ...
> Pthreads also provides a per-thread errno, which supports other code
> that uses errno. This means that when one thread calls some function
> that reports an error using errno, the value cannot be overwritten, or
>
> read, by any other thread - you may go on using errno just as you
> always have."

That's what I was looking for, thanks.

> I recommend you get (and read...) [1].

I have no much time for books :-((

Best regards,
Tomasz Kojm
--
oo ..... http://www.clamav.net/gpg/tkojm.gpg
(\/)\......... 0DCA5A08407D5288279DB43454822DC8985A444B
\..........._ Sun Nov 2 14:28:38 CET 2003
//\ /\
Re: race condition in clamd [ In reply to ]
On Sun, Nov 02, 2003 at 02:18:52PM +0100, Marc Balmer wrote:
> >I agree. The most annoying thing for me is that cannot find any
> >official statement clarifying errno usage in multithreaded
> >environments.
> Let me quote a little from [1]:
> 1.9.3 Checking for errors
> "Pthreads introduces a new way to report errors, without using the
> errno variable.
> ...
> Pthreads also provides a per-thread errno, which supports other code
> that uses errno. This means that when one thread calls some function
> that reports an error using errno, the value cannot be overwritten, or
> read, by any other thread - you may go on using errno just as you
> always have."
> I recommend you get (and read...) [1].

Unfortunately, ISO C and C99 have neglected (on accident, or on
purpose? I don't know...) to document errno as being thread-safe. I
believe their only requirement is "it must be a modifiable lvalue" or
something to that effect.

This leaves it entirely up to the threading implementation to define
whether errno is safe. Even if POSIX threads are defined to support
a thread-safe errno, not all threading implementations claim to support
POSIX, and some that do claim only partial support, or have not implemented
POSIX threads correctly (i.e. the original Linux native threads package,
libpthread).

So, I don't blame Tomasz for being conservative. :-)

After everything, though, having errno not be thread-safe, under UNIX,
is disastrous. If any UNIX platform implements threads without a thread-safe
errno, threading support on this platform should be boycotted until they
fix their implementation. UNIX requires a thread-safe errno for any systems
calls to be accessed without synchronization. Requiring synchronization on all
systems calls is ridiculous, and if necessary, should be done by libc.a, not
by the application.

Anyways, enough about this. Google'ing turns up dozens of such
discussions on various other smaller mailing lists such as this one. Each
arrives at the same conclusion via a different route. :-)

Cheers!
mark

--
mark@mielke.cc/markm@ncf.ca/markm@nortelnetworks.com __________________________
. . _ ._ . . .__ . . ._. .__ . . . .__ | Neighbourhood Coder
|\/| |_| |_| |/ |_ |\/| | |_ | |/ |_ |
| | | | | \ | \ |__ . | | .|. |__ |__ | \ |__ | Ottawa, Ontario, Canada

One ring to rule them all, one ring to find them, one ring to bring them all
and in the darkness bind them...

http://mark.mielke.cc/