Mailing List Archive

cache_acceptor_poll.c oddities (Solaris)
Hello,

Starting varnish (current code from svn) on Solaris, without any
requests I see it sits there with one thread using up all of 1 cpu,
stuck in a poll loop (per truss).

I see on Solaris it ends up using cache_acceptor_poll.c by default and
the problem is that vca_main() calls poll with a pollfd mostly full of
bad data.

Happens because vca_poll() is called earlier once with one fd (fd=7
here) and so it initializes pollfd[7], but all of pollfd[1..6] remain
uninitialized. I see code in vca_pollspace() that presumably attempted to
initialize the space but doesn't really (the loop only sets "p->fd = -1"
with a constant p so only the 0th entry is initialized). It seems as if
the code was really wanting to do something like shown in diff below?

Presumably this must be working correctly on other platforms so before
filing a bug I thought I'd ask if I'm missing something obvious.
I'll try it on my Linux box later to see what's different. (This diff
isn't enough to make it work at all on Solaris, there are other
problems I see, but wanted to clarify this bit first).

Thanks...


Index: cache_acceptor_poll.c
===================================================================
--- cache_acceptor_poll.c (revision 2590)
+++ cache_acceptor_poll.c (working copy)
@@ -58,6 +58,7 @@
{
struct pollfd *p;
unsigned u, v;
+ unsigned oldnpoll = npoll;

if (fd < npoll)
return;
@@ -69,26 +70,40 @@
p = realloc(pollfd, u * sizeof *p);
XXXAN(p); /* close offending fd */
memset(p + npoll, 0, (u - npoll) * sizeof *p);
- for (v = npoll ; v <= u; v++)
- p->fd = -1;
+ for (v = oldnpoll ; v < u; v++)
+ (p+v)->fd = -1;
pollfd = p;
npoll = u;
}


--
Jyri J. Virkki - Santa Cruz, CA






--
cache_acceptor_poll.c oddities (Solaris) [ In reply to ]
Hello,

Starting varnish (current code from svn) on Solaris, without any
requests I see it sits there with one thread using up all of 1 cpu,
stuck in a poll loop (per truss).

I see on Solaris it ends up using cache_acceptor_poll.c by default and
the problem is that vca_main() calls poll with a pollfd mostly full of
bad data.

Happens because vca_poll() is called earlier once with one fd (fd=7
here) and so it initializes pollfd[7], but all of pollfd[1..6] remain
uninitialized. I see code in vca_pollspace() that presumably attempted to
initialize the space but doesn't really (the loop only sets "p->fd = -1"
with a constant p so only the 0th entry is initialized). It seems as if
the code was really wanting to do something like shown in diff below?

Presumably this must be working correctly on other platforms so before
filing a bug I thought I'd ask if I'm missing something obvious.
I'll try it on my Linux box later to see what's different. (This diff
isn't enough to make it work at all on Solaris, there are other
problems I see, but wanted to clarify this bit first).

Thanks...


Index: cache_acceptor_poll.c
===================================================================
--- cache_acceptor_poll.c (revision 2590)
+++ cache_acceptor_poll.c (working copy)
@@ -58,6 +58,7 @@
{
struct pollfd *p;
unsigned u, v;
+ unsigned oldnpoll = npoll;

if (fd < npoll)
return;
@@ -69,26 +70,40 @@
p = realloc(pollfd, u * sizeof *p);
XXXAN(p); /* close offending fd */
memset(p + npoll, 0, (u - npoll) * sizeof *p);
- for (v = npoll ; v <= u; v++)
- p->fd = -1;
+ for (v = oldnpoll ; v < u; v++)
+ (p+v)->fd = -1;
pollfd = p;
npoll = u;
}


--
Jyri J. Virkki - Santa Cruz, CA






--
cache_acceptor_poll.c oddities (Solaris) [ In reply to ]
"Jyri J. Virkki" <jyri at virkki.com> writes:
> Starting varnish (current code from svn) on Solaris, without any
> requests I see it sits there with one thread using up all of 1 cpu,
> stuck in a poll loop (per truss).

The poll code has seen very little maintenance or testing lately, simply
because there are better alternatives on all the platforms we officially
support. Theo Schlossnagle has an acceptor implementation that uses
Solaris event ports; you may have more luck with that than with poll.

In any case, thank you for the patch; I will commit it as soon as
possible.

DES
--
Dag-Erling Sm?rgrav
Senior Software Developer
Linpro AS - www.linpro.no
cache_acceptor_poll.c oddities (Solaris) [ In reply to ]
"Jyri J. Virkki" <jyri at virkki.com> writes:
> Starting varnish (current code from svn) on Solaris, without any
> requests I see it sits there with one thread using up all of 1 cpu,
> stuck in a poll loop (per truss).

The poll code has seen very little maintenance or testing lately, simply
because there are better alternatives on all the platforms we officially
support. Theo Schlossnagle has an acceptor implementation that uses
Solaris event ports; you may have more luck with that than with poll.

In any case, thank you for the patch; I will commit it as soon as
possible.

DES
--
Dag-Erling Sm?rgrav
Senior Software Developer
Linpro AS - www.linpro.no
cache_acceptor_poll.c oddities (Solaris) [ In reply to ]
Once upon a time Dag-Erling Sm?rgrav wrote:
>
> The poll code has seen very little maintenance or testing lately, simply
> because there are better alternatives on all the platforms we officially
> support. Theo Schlossnagle has an acceptor implementation that uses
> Solaris event ports; you may have more luck with that than with poll.
>
> In any case, thank you for the patch; I will commit it as soon as
> possible.

While I'd meant it as more of an illustrative than final patch, I see
you massaged it during checkin already, so thanks!

The other immediate problem I saw I've filed as ticket 222[1] and
included the patch there (related to IOV_MAX limit on Solaris, which I
see from the archives was discussed earlier and the define was changed,
but didn't quite work).

I realize the poll implementation isn't that interesting but I figured
I'd start with it as an initial experiment and add an event port
alternative later, but nice to hear it's been done.

Theo, are you planning on contributing it to the main trunk to make it
easier to access and keep in sync?



[1]http://varnish.projects.linpro.no/ticket/222
--
Jyri J. Virkki - Santa Cruz, CA






--
cache_acceptor_poll.c oddities (Solaris) [ In reply to ]
Once upon a time Dag-Erling Sm?rgrav wrote:
>
> The poll code has seen very little maintenance or testing lately, simply
> because there are better alternatives on all the platforms we officially
> support. Theo Schlossnagle has an acceptor implementation that uses
> Solaris event ports; you may have more luck with that than with poll.
>
> In any case, thank you for the patch; I will commit it as soon as
> possible.

While I'd meant it as more of an illustrative than final patch, I see
you massaged it during checkin already, so thanks!

The other immediate problem I saw I've filed as ticket 222[1] and
included the patch there (related to IOV_MAX limit on Solaris, which I
see from the archives was discussed earlier and the define was changed,
but didn't quite work).

I realize the poll implementation isn't that interesting but I figured
I'd start with it as an initial experiment and add an event port
alternative later, but nice to hear it's been done.

Theo, are you planning on contributing it to the main trunk to make it
easier to access and keep in sync?



[1]http://varnish.projects.linpro.no/ticket/222
--
Jyri J. Virkki - Santa Cruz, CA






--
cache_acceptor_poll.c oddities (Solaris) [ In reply to ]
On Mar 14, 2008, at 8:49 PM, Jyri J. Virkki wrote:

> Once upon a time Dag-Erling Sm?rgrav wrote:
>>
>> The poll code has seen very little maintenance or testing lately,
>> simply
>> because there are better alternatives on all the platforms we
>> officially
>> support. Theo Schlossnagle has an acceptor implementation that uses
>> Solaris event ports; you may have more luck with that than with poll.
>>
>> In any case, thank you for the patch; I will commit it as soon as
>> possible.
>
> While I'd meant it as more of an illustrative than final patch, I see
> you massaged it during checkin already, so thanks!
>
> The other immediate problem I saw I've filed as ticket 222[1] and
> included the patch there (related to IOV_MAX limit on Solaris, which I
> see from the archives was discussed earlier and the define was
> changed,
> but didn't quite work).
>
> I realize the poll implementation isn't that interesting but I figured
> I'd start with it as an initial experiment and add an event port
> alternative later, but nice to hear it's been done.
>
> Theo, are you planning on contributing it to the main trunk to make it
> easier to access and keep in sync?

Here's a more recent patch:

http://lethargy.org/~jesus/misc/varnish-cache-2543-sol10-2.diff

I'd love to have the changes put back. There are a few outstanding
issues:

1) the ping/pong stuff (cross notify) is done to leverage the
efficiency of event ports, but it is a bad hack from the integration
perspective. I can fix this when I get some time -- don't have enough
of that at the moment.
2) The socket timeouts aren't supported which is bad and nontrivial to
fix. However, adding a very thin read(v)/write(v)/sendfile(v)
abstraction layer would make this very easy and make the SSL
featureset trivial to add as well. I could add this in if the
developers are interested in it?
3) The VCL compiler stuff is changes to support a short-coming of the
Sun Studio toolchain. And while the change should work everywhere,
phk expressed some dislike for it. Apparently, it's a system("")
call, so I could do some trickery by chaining commands to work around
the issue. I think my work-around is more to-the-point. I doubt that
will be taken back, which is fine -- it's subjective argument over
which approach is better.
4) The IOV_MAX stuff is still an issue as the Sun Studio C
preprocessor doesn't work with
#if (IOV_MAX < (HTTP_HDR_MAX * 2))

Aside from those issues. The patch is working well for us in
production. We've hit no issues with (2) at this point which is the
only part that is bonafide technical problem.

For 2) above, the approach I would take is to change the session fd to
be an:

typedef struct varnish_io_opset {
int (*read)(void *op_ctx, void *buf, size_t nbyte);
int (*write)(void *op_ctx, const void *buf, size_t nbyte);
... readv ...
... writev ...
... sendfile ...

} varnish_io_opset_t;
typedef struct varnish_socket {
varnish_io_opset_t *ops;
void *op_ctx;
} varnish_socket_t;

This would also obviate the IOV_MAX trickery as Solaris would get its
own I/O opset.

Best regards,

Theo

--
Theo Schlossnagle
Esoteric Curio -- http://lethargy.org/
OmniTI Computer Consulting, Inc. -- http://omniti.com/
cache_acceptor_poll.c oddities (Solaris) [ In reply to ]
On Mar 14, 2008, at 8:49 PM, Jyri J. Virkki wrote:

> Once upon a time Dag-Erling Sm?rgrav wrote:
>>
>> The poll code has seen very little maintenance or testing lately,
>> simply
>> because there are better alternatives on all the platforms we
>> officially
>> support. Theo Schlossnagle has an acceptor implementation that uses
>> Solaris event ports; you may have more luck with that than with poll.
>>
>> In any case, thank you for the patch; I will commit it as soon as
>> possible.
>
> While I'd meant it as more of an illustrative than final patch, I see
> you massaged it during checkin already, so thanks!
>
> The other immediate problem I saw I've filed as ticket 222[1] and
> included the patch there (related to IOV_MAX limit on Solaris, which I
> see from the archives was discussed earlier and the define was
> changed,
> but didn't quite work).
>
> I realize the poll implementation isn't that interesting but I figured
> I'd start with it as an initial experiment and add an event port
> alternative later, but nice to hear it's been done.
>
> Theo, are you planning on contributing it to the main trunk to make it
> easier to access and keep in sync?

Here's a more recent patch:

http://lethargy.org/~jesus/misc/varnish-cache-2543-sol10-2.diff

I'd love to have the changes put back. There are a few outstanding
issues:

1) the ping/pong stuff (cross notify) is done to leverage the
efficiency of event ports, but it is a bad hack from the integration
perspective. I can fix this when I get some time -- don't have enough
of that at the moment.
2) The socket timeouts aren't supported which is bad and nontrivial to
fix. However, adding a very thin read(v)/write(v)/sendfile(v)
abstraction layer would make this very easy and make the SSL
featureset trivial to add as well. I could add this in if the
developers are interested in it?
3) The VCL compiler stuff is changes to support a short-coming of the
Sun Studio toolchain. And while the change should work everywhere,
phk expressed some dislike for it. Apparently, it's a system("")
call, so I could do some trickery by chaining commands to work around
the issue. I think my work-around is more to-the-point. I doubt that
will be taken back, which is fine -- it's subjective argument over
which approach is better.
4) The IOV_MAX stuff is still an issue as the Sun Studio C
preprocessor doesn't work with
#if (IOV_MAX < (HTTP_HDR_MAX * 2))

Aside from those issues. The patch is working well for us in
production. We've hit no issues with (2) at this point which is the
only part that is bonafide technical problem.

For 2) above, the approach I would take is to change the session fd to
be an:

typedef struct varnish_io_opset {
int (*read)(void *op_ctx, void *buf, size_t nbyte);
int (*write)(void *op_ctx, const void *buf, size_t nbyte);
... readv ...
... writev ...
... sendfile ...

} varnish_io_opset_t;
typedef struct varnish_socket {
varnish_io_opset_t *ops;
void *op_ctx;
} varnish_socket_t;

This would also obviate the IOV_MAX trickery as Solaris would get its
own I/O opset.

Best regards,

Theo

--
Theo Schlossnagle
Esoteric Curio -- http://lethargy.org/
OmniTI Computer Consulting, Inc. -- http://omniti.com/
cache_acceptor_poll.c oddities (Solaris) [ In reply to ]
"Jyri J. Virkki" <jyri at virkki.com> writes:
> The other immediate problem I saw I've filed as ticket 222[1] and
> included the patch there (related to IOV_MAX limit on Solaris, which I
> see from the archives was discussed earlier and the define was changed,
> but didn't quite work).

Thanks

DES
--
Dag-Erling Sm?rgrav
Senior Software Developer
Linpro AS - www.linpro.no
cache_acceptor_poll.c oddities (Solaris) [ In reply to ]
"Jyri J. Virkki" <jyri at virkki.com> writes:
> The other immediate problem I saw I've filed as ticket 222[1] and
> included the patch there (related to IOV_MAX limit on Solaris, which I
> see from the archives was discussed earlier and the define was changed,
> but didn't quite work).

Thanks

DES
--
Dag-Erling Sm?rgrav
Senior Software Developer
Linpro AS - www.linpro.no