Cyrus terminated abnormally
Hi amd64 friends!

I have recently had some woes with my Cyrus daemon, and I have not been
successful at troubleshooting it myself. I did a system update over the
weekend, and ever since then Cyrus works for a bit and then dies with
this in the logs:

Aug 6 20:19:43 skull imaps[16458]: IOERROR: opening /var/imap/user_deny.db: No such file or directory
Aug 6 20:19:43 skull imaps[16458]: accepted connection
Aug 6 20:19:43 skull master[17776]: about to exec /usr/lib64/cyrus/imapd
Aug 6 20:19:43 skull imaps[17776]: executed
Aug 6 20:19:43 skull imaps[16458]: imapd:Loading hard-coded DH parameters
Aug 6 20:19:43 skull imaps[16458]: EOF in SSL_accept() -> fail
Aug 6 20:19:43 skull imaps[16458]: imaps TLS negotiation failed: [2001:470:8:97a:9829:b3a:e34c:1b2]
Aug 6 20:19:43 skull imaps[16458]: Fatal error: tls_start_servertls() failed
Aug 6 20:19:43 skull master[16810]: process 16458 exited, status 75
Aug 6 20:19:43 skull master[16810]: service imaps pid 16458 in BUSY state: terminated abnormally

When this happens, none of my clients can connect to the server until I
restart the service. Sometimes when I restart it, it will run for hours,
sometimes only for minutes. I haven't been able to identify a specific
activity that causes it to have this issue.
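
One way to look for a pattern is to pull every abnormal exit out of the
syslog and list it with its timestamp, so the times can be compared
against client activity. What follows is only a minimal Python sketch.
It assumes each log entry sits on a single line in the format shown
above; the function name and the regular expression are illustrative
and are not part of Cyrus.

```python
import re
from collections import Counter

# Matches master's exit reports, for example:
#   Aug 6 20:19:43 skull master[16810]: process 16458 exited, status 75
# The pattern is an assumption based on the log excerpt in this thread.
EXIT_RE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s+[\d:]{8})\s+\S+\s+master\[\d+\]:\s+"
    r"process (?P<pid>\d+) exited, status (?P<status>\d+)"
)


def abnormal_exits(lines):
    """Return (timestamp, pid, status) for each non-zero exit logged by master."""
    events = []
    for line in lines:
        match = EXIT_RE.match(line)
        if match and int(match["status"]) != 0:
            events.append(
                (match["stamp"], int(match["pid"]), int(match["status"]))
            )
    return events


# Two lines taken from the excerpt above; only the second is an exit report.
sample = [
    "Aug 6 20:19:43 skull imaps[16458]: EOF in SSL_accept() -> fail",
    "Aug 6 20:19:43 skull master[16810]: process 16458 exited, status 75",
]
events = abnormal_exits(sample)
print(events)
# Tally how often each exit status occurs.
print(Counter(status for _, _, status in events))
```

Exit status 75 matches EX_TEMPFAIL in sysexits.h, which signals a
temporary failure; whether every status 75 here has the same cause is
not something this sketch can tell.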

The update did involve a rebuild of cyrus-imap-admin, I think due to a
Perl update (or maybe it was an auto rebuild? I can't remember.) I can't
think of why that would cause this, but it was the only obviously
related package update.

I can't think of what steps I can take to look into this in a more
informative way. Searching for these error messages turns up many people
with many different kinds of problems (and posts dating back a
decade!), so it is difficult to narrow down what my problem is. I have
tried raising maxchild to large values (100) and to unlimited (-1), and
that does not resolve the issue.
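
One further diagnostic step would be to probe the imaps port on a
schedule and record when the TLS handshake stops succeeding. Below is a
rough Python sketch of such a probe, using only the standard socket and
ssl modules. The function name, the default port of 993, and the
timeout are assumptions for illustration, and certificate verification
is deliberately switched off because this is a diagnostic, not a client.

```python
import socket
import ssl


def probe_imaps(host, port=993, timeout=5.0):
    """Attempt one TLS handshake and return (ok, detail)."""
    context = ssl.create_default_context()
    # Diagnostic only: skip certificate checks so that a self-signed
    # certificate does not hide the result of the handshake itself.
    context.check_hostname = False
    context.verify_mode = ssl.CERT_NONE
    try:
        with socket.create_connection((host, port), timeout=timeout) as raw:
            with context.wrap_socket(raw, server_hostname=host) as tls:
                # Handshake completed; report the negotiated protocol.
                return True, tls.version()
    except OSError as exc:
        # OSError covers refused connections, timeouts, and ssl.SSLError.
        return False, f"{type(exc).__name__}: {exc}"
```

Calling probe_imaps() with the server's hostname every minute or so and
logging the result would give a timeline to compare with the syslog.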

Any ideas? Do any of you use Cyrus? Is your server still working?

I've considered migrating to Dovecot if I can't figure this out, so I'd
welcome opinions on Cyrus vs. Dovecot as a side thread.

--
R
Re: Cyrus terminated abnormally
I believe this issue is a kernel bug, which I was quite surprised to
learn. I filed a bug about it here if anyone is interested:

https://bugs.gentoo.org/show_bug.cgi?id=519666

--
R
Re: Cyrus terminated abnormally
On 08/14/2014 10:00 AM, Randy Barlow wrote:
> I believe this issue is a kernel bug, which I was quite surprised to
> learn. I filed a bug about it here if anyone is interested:
>
> https://bugs.gentoo.org/show_bug.cgi?id=519666

I am now not so sure that this is a kernel bug. The reason I thought
that was that booting to kernel 3.12 would resolve the issue, while
running 3.14 would exhibit the issue. I've since tried 3.16, and the
issue remains there. It seems unlikely that a bug this severe would be
present across three kernel releases.

Could it be that Cyrus just isn't compatible with kernel 3.14 and 3.16?
Is anybody successfully using Cyrus with 3.14 or 3.16?

--
R
Re: Cyrus terminated abnormally
On Sun, Aug 24, 2014 at 1:03 PM, Randy Barlow
<randy@electronsweatshop.com> wrote:
> On 08/14/2014 10:00 AM, Randy Barlow wrote:
>> I believe this issue is a kernel bug, which I was quite surprised to
>> learn. I filed a bug about it here if anyone is interested:
>>
>> https://bugs.gentoo.org/show_bug.cgi?id=519666
>
> I am now not so sure that this is a kernel bug. The reason I thought
> that was that booting to kernel 3.12 would resolve the issue, while
> running 3.14 would exhibit the issue. I've since tried 3.16, and the
> issue remains there. It seems unlikely that a bug this severe would be
> present across three kernel releases.
>
> Could it be that Cyrus just isn't compatible with kernel 3.14 and 3.16?
> Is anybody successfully using Cyrus with 3.14 or 3.16?
>
> --
> R
>

I somehow doubt that the application isn't compatible with a newer
kernel. If that were the case, you should tell Linus/LKML about it,
because they do a lot to ensure that doesn't happen without really good
reasons. However, don't go in that direction until the Cyrus people
tell you that it's the case.

What follows is just opinion and not something you'd want to depend on...

One possibility is that there is a software API issue down deep
somewhere that emerge isn't catching. I'd personally look at
rebuilding the app and _all_ of its libraries, checking the app's
config files, etc. That has usually fixed problems like this for me.

I've had this sort of problem only a couple of times over the years.
I sort of hate to say it, but my go-to solution is to just do an
emerge -e @world about once a year: I check gcc-config, rebuild gcc,
then rebuild the whole machine. When all else fails, that has fixed
this sort of problem for me a couple of times.

I'll also say that if it really is a kernel issue, consider bringing
up a new kernel with a config file written completely from scratch.
I've seen reports over the years where make oldconfig might have
caused a problem with a newer kernel.

I know some folks will say it's a waste of time, but I've got time
for a rebuild (overnight on a home server, etc.) and no patience for
apps failing during the day when I'm trying to work.

Best of luck,
Mark