Mailing List Archive

1.1.1 progress
I thought I'd let you know what's currently going on at our top-secret
underground Varnish R&D facility:

- Poul-Henning has been working hard to resolve the stability issues
and assertion failures some of you have reported (#136, #137, #138,
#139, #140, #141 and #143 should be fixed, while #132, #142 and
#144 are still being worked on) as well as working on 2.0 features.

- I have been working on other bugs and build issues (#128, #130,
#131 and #135 should be fixed, while #129 is still being worked on)
as well as improving our test framework (and writing additional
test cases) and trying to get Varnish to build and run reliably on
Mac OS X and Solaris 10.

- Cecilie will be back on Monday and resume work on 2.0 features.

- I just merged a ton of bug fixes from trunk to branches/1.1.
Please please *please*, if you are running 1.0.4 or 1.1 today, take
the time to test branches/1.1 and let me know about any remaining
bugs or regressions. I would love to be able to close all
outstanding tickets on 1.0.4 and 1.1 and release 1.1.1 on August
20th. Pretty please? With sugar on top?
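For anyone willing to help, the test drive is roughly the following sketch (the checkout URL is the one given later in this thread; the --prefix is a placeholder, adjust for your system):

```shell
# Rough sketch of test-driving branches/1.1 (checkout URL from later in
# this thread; --prefix is a placeholder, adjust for your setup).
svn co http://varnish.projects.linpro.no/svn/branches/1.1 varnish-1.1
cd varnish-1.1
./autogen.sh
./configure --prefix=/usr/local
make
sudo make install
```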

Thanks,

DES
--
Dag-Erling Smørgrav
Senior Software Developer
Linpro AS - www.linpro.no
1.1.1 progress [ In reply to ]
Tnx for the fixes.

I have just installed the latest svn from branches/1.1.
It's been running OK for 20 minutes at around 600 req/sec with no slave process
restarts.

I had to modify the Makefiles manually for it to find curses.h, but that could
as well be a problem on my side.

Thanks once again; I will see if it holds till tomorrow.

Janis

On Thursday 09 August 2007 17:12, Dag-Erling Smørgrav wrote:
> I thought I'd let you know what's currently going on at our top-secret
> underground Varnish R&D facility:
>
> - Poul-Henning has been working hard to resolve the stability issues
> and assertion failures some of you have reported (#136, #137, #138,
> #139, #140, #141 and #143 should be fixed, while #132, #142 and
> #144 are still being worked on) as well as working on 2.0 features.
>
> - I have been working on other bugs and build issues (#128, #130,
> #131 and #135 should be fixed, while #129 is still being worked on)
> as well as improving our test framework (and writing additional
> test cases) and trying to get Varnish to build and run reliably on
> Mac OS X and Solaris 10.
>
> - Cecilie will be back on Monday and resume work on 2.0 features.
>
> - I just merged a ton of bug fixes from trunk to branches/1.1.
> Please please *please*, if you are running 1.0.4 or 1.1 today, take
> the time to test branches/1.1 and let me know about any remaining
> bugs or regressions. I would love to be able to close all
> outstanding tickets on 1.0.4 and 1.1 and release 1.1.1 on August
> 20th. Pretty please? With sugar on top?
>
> Thanks,
>
> DES
1.1.1 progress [ In reply to ]
Janis Putrams <janis.putrams at delfi.lv> writes:
> I had to modify Makefiles manually for it to find curses.h but that
> could as well be problem on my side.

What OS are you running on, and where is your curses.h installed?

DES
--
Dag-Erling Smørgrav
Senior Software Developer
Linpro AS - www.linpro.no
1.1.1 progress [ In reply to ]
Dag-Erling Smørgrav wrote:
> I thought I'd let you know what's currently going on at our top-secret
> underground Varnish R&D facility:
>
> - Poul-Henning has been working hard to resolve the stability issues
> and assertion failures some of you have reported (#136, #137, #138,
> #139, #140, #141 and #143 should be fixed, while #132, #142 and
> #144 are still being worked on) as well as working on 2.0 features.
>
> - I have been working on other bugs and build issues (#128, #130,
> #131 and #135 should be fixed, while #129 is still being worked on)
> as well as improving our test framework (and writing additional
> test cases) and trying to get Varnish to build and run reliably on
> Mac OS X

Please allow me to cheer-lead: Woohooo! You rock! :-)

> and Solaris 10.
>
> - Cecilie will be back on Monday and resume work on 2.0 features.
>
> - I just merged a ton of bug fixes from trunk to branches/1.1.
> Please please *please*, if you are running 1.0.4 or 1.1 today, take
> the time to test branches/1.1 and let me know about any remaining
> bugs or regressions. I would love to be able to close all
> outstanding tickets on 1.0.4 and 1.1 and release 1.1.1 on August
> 20th. Pretty please? With sugar on top?

Are there any OS X fixes in the 1.1 branch? If so, I'll try to test this
weekend.

Martin

--
Acquisition is a jealous mistress
1.1.1 progress [ In reply to ]
Martin Aspeli <optilude at gmx.net> writes:
> Are there any OS X fixes in the 1.1 branch?

Some, yes. At least, I hope the -flat_namespace hack is no longer
required, but I don't have a Mac to test on right now.

DES
--
Dag-Erling Smørgrav
Senior Software Developer
Linpro AS - www.linpro.no
1.1.1 progress [ In reply to ]
07:37:04 root[pts/1]@slow ~# uname -a
Linux slow 2.6.16.52-1smp #1 SMP Thu May 31 19:32:54 CEST 2007 i686 Intel(R) Xeon(R) CPU 5150 @ 2.66GHz PLD Linux
07:37:05 root[pts/1]@slow ~# locate curses.h
/usr/include/ncurses/ncurses.h
/usr/include/ncurses/curses.h
/usr/include/ncursesw/ncurses.h
/usr/include/ncursesw/curses.h
07:37:27 root[pts/1]@slow ~# rpm -qf /usr/include/ncurses/curses.h
ncurses-devel-5.5-2
07:37:38 root[pts/1]@slow ~#

When running ./configure from the source package varnish-1.1.tar.gz, the Makefile got:
CURSES_LIBS = -lcurses

When I checked the source out from svn, ran autogen.sh and then ./configure, it set:
CURSES_LIBS =

In the second case I had to add both -lcurses and -I /usr/include/ncurses; in the first case, just -I /usr/include/ncurses.
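One way to avoid hand-editing the generated Makefiles is to hand the paths to configure up front; CPPFLAGS and LIBS are standard autoconf variables, and the include path below is taken from the locate output above:

```shell
# Point configure at the ncurses headers and library up front, rather
# than patching Makefiles afterwards (paths from the locate output above).
CPPFLAGS="-I/usr/include/ncurses" LIBS="-lcurses" ./configure
```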


Currently varnish from svn has been running for a day. The master process does not die, but the slave process still restarts approximately once an hour.
I will try to get logs from the moments when the slave restarts.

janis


On Thursday 09 August 2007 18:49, Dag-Erling Smørgrav wrote:
> Janis Putrams <janis.putrams at delfi.lv> writes:
> > I had to modify Makefiles manually for it to find curses.h but that
> > could as well be problem on my side.
>
> What OS are you running on, and where is your curses.h installed?
>
> DES
1.1.1 progress [ In reply to ]
Janis Putrams <janis.putrams at delfi.lv> writes:
> On Thursday 09 August 2007 18:49, Dag-Erling Smørgrav wrote:
> > Janis Putrams <janis.putrams at delfi.lv> writes:
> > > I had to modify Makefiles manually for it to find curses.h but that
> > > could as well be problem on my side.
> > What OS are you running on, and where is your curses.h installed?
> [...]

You still haven't told me what OS you're running.

DES
--
Dag-Erling Smørgrav
Senior Software Developer
Linpro AS - www.linpro.no
1.1.1 progress [ In reply to ]
> > > What OS are you running on, and where is your curses.h installed?
> > [...]
>
> You still haven't told me what OS you're running.

GNU/Linux (there is a uname in the last post: Linux slow
2.6.16.52-1smp; from the kernel version I would guess the distribution is
PLD Linux)

Greetings
Christoph
1.1.1 progress [ In reply to ]
PLD Linux Distribution <http://www.pld-linux.org/>.
PLD 2.0 (Ac)

janis

On Friday 10 August 2007 16:40, Dag-Erling Smørgrav wrote:
> Janis Putrams <janis.putrams at delfi.lv> writes:
> > On Thursday 09 August 2007 18:49, Dag-Erling Smørgrav wrote:
> > > Janis Putrams <janis.putrams at delfi.lv> writes:
> > > > I had to modify Makefiles manually for it to find curses.h but that
> > > > could as well be problem on my side.
> > >
> > > What OS are you running on, and where is your curses.h installed?
> >
> > [...]
>
> You still haven't told me what OS you're running.
>
> DES
1.1.1 progress [ In reply to ]
"C. Handel" <fragfutter at gmail.com> writes:
> Dag-Erling Smørgrav <des at linpro.no> writes:
> > You still haven't told me what OS you're running.
> GNU/LInux (there is a uname in the last post: Linux slow
> 2.6.16.52-1smp, from the kernel version i guess distribution
> pld-linux)

Linux is not an OS. Linux is a kernel. I need to know what OS Janis
is running, and I don't like to guess.

DES
--
Dag-Erling Smørgrav
Senior Software Developer
Linpro AS - www.linpro.no
1.1.1 progress [ In reply to ]
Dag-Erling Smørgrav wrote:
> Martin Aspeli <optilude at gmx.net> writes:
>> Are there any OS X fixes in the 1.1 branch?
>
> Some, yes. At least, I hope the -flat_namespace hack is no longer
> required, but I don't have a Mac to test on right now.

I just tried it, doing the following:

$ svn co http://varnish.projects.linpro.no/svn/branches/1.1
$ ./autogen.sh
$ ./configure.sh --prefix=/path/to/install/dir
$ make
$ make install

And it works! I no longer have to do the libtool patch from
http://varnish.projects.linpro.no/ticket/118.

Note that my environment is maybe not 100% "normal", in that I have GNU
versions of various tools (including libtool) installed via MacPorts.
But I definitely had to do the workaround before, and I no longer do, so
I assume it's fixed.

If others can confirm, you may be able to close 118 now.

Thanks!
Martin

--
Acquisition is a jealous mistress
1.1.1 progress [ In reply to ]
On Aug 12, 2007, at 2:48 PM, Martin Aspeli wrote:

> Dag-Erling Smørgrav wrote:
>> Martin Aspeli <optilude at gmx.net> writes:
>>> Are there any OS X fixes in the 1.1 branch?
>>
>> Some, yes. At least, I hope the -flat_namespace hack is no longer
>> required, but I don't have a Mac to test on right now.
>
> I just tried it, doing the following:
>
> $ svn co http://varnish.projects.linpro.no/svn/branches/1.1
> $ ./autogen.sh
> $ ./configure.sh --prefix=/path/to/install/dir
> $ make
> $ make install
>
> And it works! I no longer have to do the libtool patch from
> http://varnish.projects.linpro.no/ticket/118.
>
> Note that my environment is maybe not 100% "normal", in that I have
> GNU
> versions of various tools (including libtool) installed via MacPorts.
> But I definitely had to do the workaround before, and I no longer
> do, so
> I assume it's fixed.
>
> If others can confirm, you may be able to close 118 now.
>
> Thanks!
> Martin


Not so fast. I still needed to update automake as per the
instructions on http://varnish.projects.linpro.no/wiki/Installation
(slightly modified since the autogen.sh file is now a little different).

Also, I think you meant ./configure, not ./configure.sh

Running OS X 10.4.10 on an Intel Core Duo. Mostly stock with just a
couple of Fink and MacPorts installed tools.

Ric
1.1.1 progress [ In reply to ]
Ricardo Newbery <ric at digitalmarbles.com> writes:
> Not so fast. I still needed to update automake as per the
> instructions on http://varnish.projects.linpro.no/wiki/Installation
> (slightly modified since the autogen.sh file is now a little
> different).

Why? I would expect that /opt/local/bin would be in your path if you
had anything of interest installed there.

DES
--
Dag-Erling Smørgrav
Senior Software Developer
Linpro AS - www.linpro.no
1.1.1 progress [ In reply to ]
Dag-Erling Smørgrav <des at linpro.no> writes:
> Ricardo Newbery <ric at digitalmarbles.com> writes:
> > Not so fast. I still needed to update automake as per the
> > instructions on http://varnish.projects.linpro.no/wiki/Installation
> > (slightly modified since the autogen.sh file is now a little
> > different).
> Why? I would expect that /opt/local/bin would be in your path if you
> had anything of interest installed there.

I see that these instructions say that the automake version that comes
with Mac OS X (Xcode, actually) is too old, but I haven't had any
problems with it. Just use the system's automake and ignore the
warning from autogen.sh.

DES
--
Dag-Erling Smørgrav
Senior Software Developer
Linpro AS - www.linpro.no
1.1.1 progress [ In reply to ]
On Aug 13, 2007, at 3:49 AM, Dag-Erling Smørgrav wrote:

> Dag-Erling Smørgrav <des at linpro.no> writes:
>> Ricardo Newbery <ric at digitalmarbles.com> writes:
>>> Not so fast. I still needed to update automake as per the
>>> instructions on http://varnish.projects.linpro.no/wiki/Installation
>>> (slightly modified since the autogen.sh file is now a little
>>> different).
>> Why? I would expect that /opt/local/bin would be in your path if you
>> had anything of interest installed there.
>
> I see that these instructions say that the automake version that comes
> with Mac OS X (Xcode, actually) is too old, but I haven't had any
> problems with it. Just use the system's automake and ignore the
> warning from autogen.sh.
>
> DES
> --
> Dag-Erling Smørgrav
> Senior Software Developer
> Linpro AS - www.linpro.no



Without the updated automake, I get this...

$ ./autogen.sh

+ aclocal
+ glibtoolize --copy --force
You should update your `aclocal.m4' by running aclocal.
+ autoheader
+ automake --add-missing --copy --foreign
configure.ac: installing `./install-sh'
configure.ac: installing `./missing'
bin/varnishadm/Makefile.am: installing `./compile'
bin/varnishadm/Makefile.am: installing `./depcomp'
lib/libvarnish/Makefile.am:5: Libtool library used but `LIBTOOL' is undefined
lib/libvarnish/Makefile.am:5:
lib/libvarnish/Makefile.am:5: The usual way to define `LIBTOOL' is to add `AC_PROG_LIBTOOL'
lib/libvarnish/Makefile.am:5: to `configure.ac' and run `aclocal' and `autoconf' again.
lib/libvarnishapi/Makefile.am:5: Libtool library used but `LIBTOOL' is undefined
lib/libvarnishapi/Makefile.am:5:
lib/libvarnishapi/Makefile.am:5: The usual way to define `LIBTOOL' is to add `AC_PROG_LIBTOOL'
lib/libvarnishapi/Makefile.am:5: to `configure.ac' and run `aclocal' and `autoconf' again.
lib/libvarnishcompat/Makefile.am:5: Libtool library used but `LIBTOOL' is undefined
lib/libvarnishcompat/Makefile.am:5:
lib/libvarnishcompat/Makefile.am:5: The usual way to define `LIBTOOL' is to add `AC_PROG_LIBTOOL'
lib/libvarnishcompat/Makefile.am:5: to `configure.ac' and run `aclocal' and `autoconf' again.
lib/libvcl/Makefile.am:5: Libtool library used but `LIBTOOL' is undefined
lib/libvcl/Makefile.am:5:
lib/libvcl/Makefile.am:5: The usual way to define `LIBTOOL' is to add `AC_PROG_LIBTOOL'
lib/libvcl/Makefile.am:5: to `configure.ac' and run `aclocal' and `autoconf' again.

And configure is not created so ignoring the warning is not an option.
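For reference, one way to get a newer automake on OS X is via MacPorts. This is only a sketch: the package names are assumptions, and it relies on /opt/local/bin shadowing the Xcode-supplied tools in PATH:

```shell
# Sketch: install newer autotools from MacPorts (assumed package names)
# and make sure they are found ahead of the Xcode-supplied versions.
sudo port install automake autoconf libtool
export PATH=/opt/local/bin:$PATH
./autogen.sh && ./configure
```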

Ric
1.1.1 progress [ In reply to ]
Ricardo Newbery <ric at digitalmarbles.com> writes:
> Without the updated automake, I get this...
>
> $ ./autogen.sh
>
> + aclocal
> + glibtoolize --copy --force
> You should update your `aclocal.m4' by running aclocal.
> + autoheader
> + automake --add-missing --copy --foreign
> configure.ac: installing `./install-sh'
> configure.ac: installing `./missing'
> bin/varnishadm/Makefile.am: installing `./compile'
> bin/varnishadm/Makefile.am: installing `./depcomp'
> lib/libvarnish/Makefile.am:5: Libtool library used but `LIBTOOL' is undefined

Yes, that's the usual symptom, but I didn't get that when I tested
with automake 1.6 on Tiger last month, which is why I changed
autogen.sh to only issue a warning. Go figure...

DES
--
Dag-Erling Smørgrav
Senior Software Developer
Linpro AS - www.linpro.no
1.1.1 progress [ In reply to ]
Hi!
I have been running varnishd from svn since Aug 16 06:17 and I managed to capture a master crash log.
It is 900 MB in size, but it goes as follows:

-----------------------------------------------
...
317 VCL_call c deliver
317 VCL_return c deliver
317 TxProtocol c HTTP/1.1
317 TxStatus c 304
317 TxResponse c Not Modified
317 TxHeader c Date: Fri, 17 Aug 2007 02:38:15 GMT
317 TxHeader c Via: 1.1 varnish
317 TxHeader c X-Varnish: 1235756158
317 TxHeader c Last-Modified: Mon, 10 Apr 2006 14:22:09 GMT
317 TxHeader c Connection: keep-alive
317 ReqEnd c 1235756158 1187318295.812228918 1187318295.812259912 0.116736889 0.000012159 0.000018835
0 StatAddr 213.21.215.95 0 2 4 20 0 0 0 3806 32572
69 Debug c "Pipe Shut read(read)"
301 Debug c "Pipe Shut write(read)"
69 Debug c "Pipe Shut read(read)"
301 Debug c "Pipe Shut write(read)"
69 Debug c "Pipe Shut read(read)"
....
a lot of "Pipe Shut .." messages with some normal requests in between
....
435 Debug c "Pipe Shut write(read)"
69 Debug c "Pipe Shut read(read)"
301 Debug c "Pipe Shut write(read)"
277 Debug c "Pipe Shut read(read)"
435 Debug c "Pipe Shut write(read)"
69 Debug c "Pipe Shut read(read)"
301 Debug c "Pipe Shut write(read)"
277 Debug c "Pipe Shut read(read)"
435 Debug c "Pipewrite(read)%00%00ead)"
69 Debug c "Pipe Shut rad)%00%00%14%00EP"
8275 (null) hut read(read)
28773 (null) Shut write(read)
30578 (null) ite(read)
277 VCL_return c
26992 (null) e Shut write(read)
8306 (null) ead(read)
25888 (null) Shut write(read)
24932 (null) )
276 VCL_return b
26992 (null) e Shut read(read)
30578 (null) ite(read)
....
some messages with "(null)" type
....
26992 RxURL e Shut read(read)
30578 (null) ite(read)
277 VCL_return c
26992 (null) e Shut write(read)
8306 (null) ead(read)
25888 (null) Shut write(read)
24932 (null) )
276 VCL_return b
26992 (null) e Shut read(read)
30578 (null) ite(read)
277 VCL_return c
26992 ExpBan e Shut write(read)
-----------------------------------------------

06:05:39 root[pts/1]@slow /www# cat master_crash.log |sort|uniq -c |sort -n -r|head -n 100
7414921 69 Debug c "Pipe Shut read(read)"
7414919 301 Debug c "Pipe Shut write(read)"
2330830 277 Debug c "Pipe Shut read(read)"
2330825 435 Debug c "Pipe Shut write(read)"
8670 25641 (null)
3419 276 VCL_return b
3410 24932 (null) )
3408 30578 (null) ite(read)
3399 8306 (null) ead(read)
3386 277 VCL_return c
3379 25888 (null) Shut write(read)
1723 26992 RxURL e Shut read(read)
1713 26992 (null) e Shut write(read)
1697 26992 (null) e Shut read(read)
1677 26992 ExpBan e Shut write(read)
742 25640 (null) read)
407 0 VCL_return discard
407 0 VCL_call timeout
234 61 VCL_return c deliver
184 799 VCL_return c deliver
180 722 VCL_return c deliver
148 719 VCL_return c deliver

Please let me know if you need anything else.

--
janis

On Thursday 09 August 2007 17:12, Dag-Erling Smørgrav wrote:
> I thought I'd let you know what's currently going on at our top-secret
> underground Varnish R&D facility:
>
> - Poul-Henning has been working hard to resolve the stability issues
> and assertion failures some of you have reported (#136, #137, #138,
> #139, #140, #141 and #143 should be fixed, while #132, #142 and
> #144 are still being worked on) as well as working on 2.0 features.
>
> - I have been working on other bugs and build issues (#128, #130,
> #131 and #135 should be fixed, while #129 is still being worked on)
> as well as improving our test framework (and writing additional
> test cases) and trying to get Varnish to build and run reliably on
> Mac OS X and Solaris 10.
>
> - Cecilie will be back on Monday and resume work on 2.0 features.
>
> - I just merged a ton of bug fixes from trunk to branches/1.1.
> Please please *please*, if you are running 1.0.4 or 1.1 today, take
> the time to test branches/1.1 and let me know about any remaining
> bugs or regressions. I would love to be able to close all
> outstanding tickets on 1.0.4 and 1.1 and release 1.1.1 on August
> 20th. Pretty please? With sugar on top?
>
> Thanks,
>
> DES
1.1.1 progress [ In reply to ]
Janis Putrams <janis.putrams at delfi.lv> writes:
> I have been running varnishd from svn @Aug 16 06:17 and I managed to
> capture master crash log.

I have no idea what a "master crash log" is, but the varnish log you
included is corrupted and doesn't tell me much, except that most of
your traffic is piped to the backend rather than cached. If you're
having trouble with the management and / or worker process crashing,
you should run varnish in the foreground (-d -d) to capture the error
messages (most likely assertion failures), and get a backtrace.
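A sketch of that workflow, reusing the varnishd invocation Janis posts later in the thread (the varnishd binary path and core path in the gdb comment are examples, not taken from this thread):

```shell
# Sketch: run varnishd in the foreground (-d -d) so assertion messages
# reach the terminal, with core dumps enabled for a later backtrace.
ulimit -c unlimited
varnishd -d -d -f /etc/varnishd/varnishd.conf -T 127.0.0.1:81
# After a crash, load the core into gdb and type "bt":
#   gdb /usr/sbin/varnishd /path/to/core
```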

DES
--
Dag-Erling Smørgrav
Senior Software Developer
Linpro AS - www.linpro.no
1.1.1 progress [ In reply to ]
Hi!

By "master crash log" I meant the output of varnishlog when the management
process dies.

Anyway, I wanted to capture a core file, so I ran varnishd in the foreground.

I set
CXXFLAGS="-g"
LDFLAGS="-g"
in configure.ac, then ran
./configure --prefix=/usr --enable-debug && make clean && make
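For what it's worth, autoconf also accepts compiler flags on the configure command line, which avoids editing configure.ac at all; and since varnishd is written in C, CFLAGS rather than CXXFLAGS is the variable the build actually uses:

```shell
# Pass the debug flags straight to configure instead of editing
# configure.ac (CFLAGS, since varnishd is C rather than C++).
./configure --prefix=/usr --enable-debug CFLAGS="-g -O0" && make clean && make
```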

05:17:28 root[pts/0]@slow ~# ulimit -a|grep core
core file size (blocks, -c) 650000


08:06:24 root[pts/0]@slow ~# varnishd -d -d -f /etc/varnishd/varnishd.conf -T 127.0.0.1:81 -P /var/run/varnishd.pid
file /var/spool/varnishd/varnishd_storage.bin size 536870912 bytes (131072 fs-blocks, 131072 pages)
Using old SHMFILE
rolling(1)...
rolling(2)...
start
CLI <start>
start child pid 23327
200 0

Child said (2, 23327): <<Child starts
managed to mmap 536870912 bytes of 536870912
Ready
CLI ready
>>
Child said (2, 23327): <<socktest: linger=0 sndtimeo=0 rcvtimeo=0
>>
Cache child died pid=23327 status=0xb
Clean child
Child cleaned
start child pid 23404
Child said (2, 23404): <<Child starts
managed to mmap 536870912 bytes of 536870912
Ready
CLI ready
>>
Child said (2, 23404): <<socktest: linger=0 sndtimeo=0 rcvtimeo=0
>>
Cache child died pid=23404 status=0xb
Clean child
Child cleaned
start child pid 23484
...
>>
Child said (2, 28871): <<socktest: linger=0 sndtimeo=0 rcvtimeo=0
>>
Cache child died pid=28871 status=0xb
Clean child
Child cleaned
start child pid 28900
Child said (2, 28900): <<Child starts
managed to mmap 536870912 bytes of 536870912
Ready
CLI ready
>>
Child said (2, 28900): <<socktest: linger=0 sndtimeo=0 rcvtimeo=0
>>
Cache child died pid=28900 status=0xb
Clean child
Child cleaned
start child pid 28936
Pushing vcls failed:
CLI communication error
Child said (1, 28936): <<Child starts
>>
unlink ./bin.XXQ9wDR6
14:00:53 root[pts/0]@slow ~#

However, I found no core file after this. Please suggest what I could do to trace it.

Thanks,
Janis

On Saturday 18 August 2007 13:39, Dag-Erling Smørgrav wrote:
> Janis Putrams <janis.putrams at delfi.lv> writes:
> > I have been running varnishd from svn @Aug 16 06:17 and I managed to
> > capture master crash log.
>
> I have no idea what a "master crash log" is, but the varnish log you
> included is corrupted and doesn't tell me much, except that most of
> your traffic is piped to the backend rather than cached. If you're
> having trouble with the management and / or worker process crashing,
> you should run varnish in the foreground (-d -d) to capture the error
> messages (most likely assertion failures), and get a backtrace.
>
> DES
1.1.1 progress [ In reply to ]
Janis Putrams <janis.putrams at delfi.lv> writes:
> Anyway I wanted to capture core file so run varnishd in foreground.
> [...]
> 05:17:28 root[pts/0]@slow ~# ulimit -a|grep core
> core file size (blocks, -c) 650000

You should use 'unlimited' instead; a varnishd core dump can easily
reach tens or hundreds of megabytes.

> Though I found no core file after this. Please suggest what could I
> do to trace it.

It will be in $localstatedir/varnish/$varnish_name (probably
/var/lib/varnish/`uname -n`)
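Concretely, that would look something like the following sketch (the directory follows the hint above; the varnishd binary path is an example):

```shell
# Sketch: look for the core under varnish's working directory and pull a
# backtrace (directory per the advice above; binary path is an example).
cd /var/lib/varnish/$(uname -n)
ls -l core*
gdb /usr/sbin/varnishd core   # then at the (gdb) prompt: bt
```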

DES
--
Dag-Erling Smørgrav
Senior Software Developer
Linpro AS - www.linpro.no
1.1.1 progress [ In reply to ]
Hi!

Tnx for the advice; I increased the core file size to unlimited, ran in the
foreground, and am waiting for a crash.
Meanwhile I encountered a slave process restart that did not kill the management
process, but gave this log:

...
Cache child died pid=25094 status=0xb
Clean child
Child cleaned
start child pid 25738
Child said (2, 25738): <<Child starts
managed to mmap 536870912 bytes of 536870912
Ready
CLI ready
>>
Child said (2, 25738): <<socktest: linger=0 sndtimeo=0 rcvtimeo=0
>>
Child said (2, 25738): <<Assert error in wrk_do_one(), cache_pool.c line 199:
Condition(!isnan(w->used)) not true.
errno = 104 (Connection reset by peer)
Assert error in wrk_do_one(), cache_pool.c line 199:
Condition(!isnan(w->used)) not true.
errno = 104 (Connection reset by peer)
>>
Cache child died pid=25738 status=0x6
Clean child
Child cleaned
start child pid 26209
Child said (2, 26209): <<Child starts
managed to mmap 536870912 bytes of 536870912
Ready
CLI ready
...

Hope this helps.
Janis

p.s.
Please let me know if I should be sending these messages to tickets, privately,
or somewhere else.

On Saturday 18 August 2007 13:39, Dag-Erling Smørgrav wrote:
> Janis Putrams <janis.putrams at delfi.lv> writes:
> > I have been running varnishd from svn @Aug 16 06:17 and I managed to
> > capture master crash log.
>
> I have no idea what a "master crash log" is, but the varnish log you
> included is corrupted and doesn't tell me much, except that most of
> your traffic is piped to the backend rather than cached. If you're
> having trouble with the management and / or worker process crashing,
> you should run varnish in the foreground (-d -d) to capture the error
> messages (most likely assertion failures), and get a backtrace.
>
> DES
1.1.1 progress [ In reply to ]
In message <200708201611.32080.janis.putrams at delfi.lv>, Janis Putrams writes:

>Assert error in wrk_do_one(), cache_pool.c line 199:
> Condition(!isnan(w->used)) not true.
> errno = 104 (Connection reset by peer)

Try this fix:

In cache_center.c, function cnt_lookup(), before the call to SES_Charge(),
insert:

sp->wrk->used = TIM_real();

--
Poul-Henning Kamp | UNIX since Zilog Zeus 3.20
phk at FreeBSD.ORG | TCP/IP since RFC 956
FreeBSD committer | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.
1.1.1 progress [ In reply to ]
Janis Putrams <janis.putrams at delfi.lv> writes:
> Child said (2, 25738): <<Assert error in wrk_do_one(), cache_pool.c line 199:
> Condition(!isnan(w->used)) not true.
> errno = 104 (Connection reset by peer)
>>>

See ticket #150.

DES
--
Dag-Erling Smørgrav
Senior Software Developer
Linpro AS - www.linpro.no
1.1.1 progress [ In reply to ]
Hi!
Thank you. I added the line yesterday as soon as you sent it, and it seems to
have helped: no slave processes die with that message any more.
Slave processes do restart from time to time (with no additional information),
but the management process has been running OK so far.

janis

On Monday 20 August 2007 16:20, Poul-Henning Kamp wrote:
> In message <200708201611.32080.janis.putrams at delfi.lv>, Janis Putrams
writes:
> >Assert error in wrk_do_one(), cache_pool.c line 199:
> > Condition(!isnan(w->used)) not true.
> > errno = 104 (Connection reset by peer)
>
> Try this fix:
>
> In cache_center.c, function cnt_lookup(), before the call to SES_Charge(),
> insert:
>
> sp->wrk->used = TIM_real();