Mailing List Archive

review request: omprog
Hi all,

a forum post made me (again) aware of functionality missing in rsyslog:

http://kb.monitorware.com/problem-to-migrate-from-syslog-ng-to-rsyslog-t8982.
html

That is the execution of a program which receives all log messages passed in
via stdin. I have now done a first, rough, implementation of "omprog" which
shall provide this feature.

I would appreciate if some could quickly review the code, especially lines 97
to 135, where I clean up after fork and before I exec the program. The code
can be found here:

http://git.adiscon.com/?p=rsyslog.git;a=blob;f=plugins/omprog/omprog.c;h=2a07
8a6d862c2230a8caa9a489e08a7b5ea4cb29;hb=refs/heads/omprog#l97

There are three questions that I have:

1. is the method used sufficiently secure?
2. is there a better way to close open file handles
3. am I resetting the sigaction() correctly?

Especially #3 puzzles me, because I can not use sigterm to cancel a child
(via a different bash). Also, waitpid() always returns -1 and tells me "there
are no children".

So I am under the impression I am doing something wrong. As I also have
limited experience in that area of executing external programs, I'd
appreciate advice from those in the know.

Feel free to pass this along, if you are able to motivate someone else on
this topic ;)

Thanks,
Rainer
_______________________________________________
rsyslog mailing list
http://lists.adiscon.net/mailman/listinfo/rsyslog
http://www.rsyslog.com
Re: review request: omprog [ In reply to ]
I only had a quick glance, but have a couple of questions:

a) is popen() not suitable for some reason? it's a little less efficient
(since it starts a shell to interpret the command line passed in) but
your code would be much simpler.

b) are you sure you need to close the file handles and reset signal
handlers yourself? from the execve() man page:

"execve() does not return on success, and the text, data, bss, and stack
of the calling process are overwritten by that of the program loaded.
The program invoked inherits the calling process's PID, and any open
file descriptors that are not set to close-on-exec. Signals pending on
the calling process are cleared. Any signals set to be caught by the
calling process are reset to their default behaviour. The SIGCHLD signal
(when set to SIG_IGN) may or may not be reset to SIG_DFL."

Sounds like execve is doing both for you. It'd be easy to verify - write
a little external program that dumps the open fds and signal handlers
when it starts.

-- Paul

Rainer Gerhards wrote:
> Hi all,
>
> a forum post made me (again) aware of functionality missing in rsyslog:
>
> http://kb.monitorware.com/problem-to-migrate-from-syslog-ng-to-rsyslog-t8982.
> html
>
> That is the execution of a program which receives all log messages passed in
> via stdin. I have now done a first, rough, implementation of "omprog" which
> shall provide this feature.
>
> I would appreciate if some could quickly review the code, especially lines 97
> to 135, where I clean up after fork and before I exec the program. The code
> can be found here:
>
> http://git.adiscon.com/?p=rsyslog.git;a=blob;f=plugins/omprog/omprog.c;h=2a07
> 8a6d862c2230a8caa9a489e08a7b5ea4cb29;hb=refs/heads/omprog#l97
>
> There are three questions that I have:
>
> 1. is the method used sufficiently secure?
> 2. is there a better way to close open file handles
> 3. am I resetting the sigaction() correctly?
>
> Especially #3 puzzles me, because I can not use sigterm to cancel a child
> (via a different bash). Also, waitpid() always returns -1 and tells me "there
> are no children".
>
> So I am under the impression I am doing something wrong. As I also have
> limited experience in that area of executing external programs, I'd
> appreciate advice from those in the know.
>
> Feel free to pass this along, if you are able to motivate someone else on
> this topic ;)
>
> Thanks,
> Rainer
> _______________________________________________
> rsyslog mailing list
> http://lists.adiscon.net/mailman/listinfo/rsyslog
> http://www.rsyslog.com

_______________________________________________
rsyslog mailing list
http://lists.adiscon.net/mailman/listinfo/rsyslog
http://www.rsyslog.com
Re: review request: omprog [ In reply to ]
> -----Original Message-----
> From: rsyslog-bounces@lists.adiscon.com [mailto:rsyslog-
> bounces@lists.adiscon.com] On Behalf Of Paul Chambers
> Sent: Thursday, April 02, 2009 1:45 PM
> To: rsyslog-users
> Subject: Re: [rsyslog] review request: omprog
>
> I only had a quick glance, but have a couple of questions:
>
> a) is popen() not suitable for some reason? it's a little less
> efficient
> (since it starts a shell to interpret the command line passed in) but
> your code would be much simpler.

I have two problems with popen() - the "real" one is that I cannot obtain the
pid of the started program. Or better phrased, no search brought up how to do
that. But (the second problem) the search turned out that there exist
multiple cross-platform issues with popen() and *a lot* of folks recommended
against it.

>
> b) are you sure you need to close the file handles and reset signal
> handlers yourself? from the execve() man page:
>
> "execve() does not return on success, and the text, data, bss, and
> stack
> of the calling process are overwritten by that of the program loaded.
> The program invoked inherits the calling process's PID, and any open
> file descriptors that are not set to close-on-exec. Signals pending on
> the calling process are cleared. Any signals set to be caught by the
> calling process are reset to their default behaviour. The SIGCHLD
> signal
> (when set to SIG_IGN) may or may not be reset to SIG_DFL."

On my Fedora 10, the execve man page is much less specific on this (actually,
open files are not mentioned at all in the "what is cleaned up" list. This
lets me believe that there at least is a portability problem. However, you
are right with the signals, so I'd only need to clear SIGCHLD. But what if
there is some platform who preserves something else? As resetting signals is
very quick, it's probably better to do it for all of them

Probably it would make sense to add these notes as comments into the function
header. Guess other's will have the same questions :)

>
> Sounds like execve is doing both for you. It'd be easy to verify -
> write
> a little external program that dumps the open fds and signal handlers
> when it starts.

How can I check which fd's are open? That would be the solution also to see
what I need to close... Maybe I am overlooking the obvious.

Thanks for looking at the code, definitely helpful!

Rainer

>
> -- Paul
>
> Rainer Gerhards wrote:
> > Hi all,
> >
> > a forum post made me (again) aware of functionality missing in
> rsyslog:
> >
> > http://kb.monitorware.com/problem-to-migrate-from-syslog-ng-to-
> rsyslog-t8982.
> > html
> >
> > That is the execution of a program which receives all log messages
> passed in
> > via stdin. I have now done a first, rough, implementation of "omprog"
> which
> > shall provide this feature.
> >
> > I would appreciate if some could quickly review the code, especially
> lines 97
> > to 135, where I clean up after fork and before I exec the program.
> The code
> > can be found here:
> >
> >
> http://git.adiscon.com/?p=rsyslog.git;a=blob;f=plugins/omprog/omprog.c;
> h=2a07
> > 8a6d862c2230a8caa9a489e08a7b5ea4cb29;hb=refs/heads/omprog#l97
> >
> > There are three questions that I have:
> >
> > 1. is the method used sufficiently secure?
> > 2. is there a better way to close open file handles
> > 3. am I resetting the sigaction() correctly?
> >
> > Especially #3 puzzles me, because I can not use sigterm to cancel a
> child
> > (via a different bash). Also, waitpid() always returns -1 and tells
> me "there
> > are no children".
> >
> > So I am under the impression I am doing something wrong. As I also
> have
> > limited experience in that area of executing external programs, I'd
> > appreciate advice from those in the know.
> >
> > Feel free to pass this along, if you are able to motivate someone
> else on
> > this topic ;)
> >
> > Thanks,
> > Rainer
> > _______________________________________________
> > rsyslog mailing list
> > http://lists.adiscon.net/mailman/listinfo/rsyslog
> > http://www.rsyslog.com
>
> _______________________________________________
> rsyslog mailing list
> http://lists.adiscon.net/mailman/listinfo/rsyslog
> http://www.rsyslog.com
_______________________________________________
rsyslog mailing list
http://lists.adiscon.net/mailman/listinfo/rsyslog
http://www.rsyslog.com
Re: review request: omprog [ In reply to ]
Rainer Gerhards wrote:
>> -----Original Message-----
>> From: rsyslog-bounces@lists.adiscon.com [mailto:rsyslog-
>> bounces@lists.adiscon.com] On Behalf Of Paul Chambers
>> Sent: Thursday, April 02, 2009 1:45 PM
>> To: rsyslog-users
>> Subject: Re: [rsyslog] review request: omprog
>>
>> I only had a quick glance, but have a couple of questions:
>>
>> a) is popen() not suitable for some reason? it's a little less
>> efficient
>> (since it starts a shell to interpret the command line passed in) but
>> your code would be much simpler.
>>
>
> I have two problems with popen() - the "real" one is that I cannot obtain the
> pid of the started program. Or better phrased, no search brought up how to do
> that.
>
I don't know offhand of a portable/clean way to do that. Though pclose()
gets it from the FILE * somehow. So looking at the source for pclose()
should tell you.
> But (the second problem) the search turned out that there exist
> multiple cross-platform issues with popen() and *a lot* of folks recommended
> against it.
>
Fair enough. I can certainly believe there would be some implementation
differences across platforms, though I don't have a feel for how
significant they'd be.
>> b) are you sure you need to close the file handles and reset signal
>> handlers yourself? from the execve() man page:
>>
>> "execve() does not return on success, and the text, data, bss, and
>> stack
>> of the calling process are overwritten by that of the program loaded.
>> The program invoked inherits the calling process's PID, and any open
>> file descriptors that are not set to close-on-exec. Signals pending on
>> the calling process are cleared. Any signals set to be caught by the
>> calling process are reset to their default behaviour. The SIGCHLD
>> signal
>> (when set to SIG_IGN) may or may not be reset to SIG_DFL."
>>
>
> On my Fedora 10, the execve man page is much less specific on this (actually,
> open files are not mentioned at all in the "what is cleaned up" list. This
> lets me believe that there at least is a portability problem. However, you
> are right with the signals, so I'd only need to clear SIGCHLD. But what if
> there is some platform who preserves something else? As resetting signals is
> very quick, it's probably better to do it for all of them
>
The 'close on exec' flag (FD_CLOEXEC) is one of the few things posix
specifies as part of the file descriptor flags.

From the posix spec for execve:
"File descriptors open in the calling process image shall remain open in
the new process image, except for those whose close-on- /exec/ flag
FD_CLOEXEC is set. For those file descriptors that remain open, all
attributes of the open file description remain unchanged. For any file
descriptor that is closed for this reason, file locks are removed as a
result of the close as described in /close/()
<http://www.opengroup.org/onlinepubs/000095399/functions/close.html>.
Locks that are not removed by closing of file descriptors remain unchanged."

So I seriously doubt it'd be a portability issue. I'd suggest looking at
the glibc and posix documentation for execve to get the whole story on
execve.
> Probably it would make sense to add these notes as comments into the function
> header. Guess other's will have the same questions :)
>
Quite probably :)
>> Sounds like execve is doing both for you. It'd be easy to verify -
>> write
>> a little external program that dumps the open fds and signal handlers
>> when it starts.
>>
>
> How can I check which fd's are open? That would be the solution also to see
> what I need to close... Maybe I am overlooking the obvious.
>
One method on linux is to look at the contents of /proc/<pid>/fd
(directory of symlinks) and /proc/<pid>/fdinfo (directory of dynamic
text files with current flags and position).

Calling getrlimit(RLIMIT_NOFILE) will tell you the maximum number of
files that the process is permitted to have open simultaneously. I think
sysconf(_SC_OPEN_MAX) is another way to get the same info, if setrlimit
hasn't been used. That'd avoid the 64k loop iterations, at least.

fcntl(fd, F_GETFD, 0) will return the flags set on a given file
descriptor, F_SETFD will set them. FD_CLOEXEC is the bit in the flags
that controls if a file is closed when an exec happens.

I've never looked at the source for 'lsof' or 'pfiles', those may also
contain tricks for finding out what files are open.
> Thanks for looking at the code, definitely helpful!
>
Glad to help where I can. Not that I consider myself an expert in this
area...

By the way, I may start writing a custom plugin myself soon. Got some
ideas that I'd like to experiment with for embedded devices (my day job
- Amazon Kindle, Palm Pre, TiVo, etc. :)

-- Paul
_______________________________________________
rsyslog mailing list
http://lists.adiscon.net/mailman/listinfo/rsyslog
http://www.rsyslog.com