Mailing List Archive

pacemaker-1.1.12 - lots of Could not establish cib_ro connection: Resource temporarily unavailable (11) errors
Hello,

I'd like to ask about the following problem, which has been troubling me for
some time and which I wasn't able to find a solution for:

I've got a cluster with quite a lot of resources, and when I try to do
multiple operations at a time, I get a lot of resource failures (i.e.
failed starts).

The only related information I was able to find is the following snippet of the log:

crmd: notice: process_lrm_event: Operation vmtnv03_start_0: unknown error (node=v1b, call=748, rc=1, cib-update=211, confirmed=true)
crmd: notice: process_lrm_event: v1b-vmtnv03_start_0:748 [. Error: 'Could not establish cib_ro connection: Resource temporarily unavailable (11)'\n ]

The OCF script does not do anything special; the start action basically just runs some
Python command.

Does anybody have a tip on what to do about this problem, or how to debug it further?

My system is the latest CentOS 6, with pacemaker-1.1.12-4.el6, cman-3.0.12.1-68.el6 and
resource-agents-3.9.5-12.

thanks a lot in advance!

with best regards

nik


--
-------------------------------------
Ing. Nikola CIPRICH
LinuxBox.cz, s.r.o.
28.rijna 168, 709 00 Ostrava

tel.: +420 591 166 214
fax: +420 596 621 273
mobil: +420 777 093 799
www.linuxbox.cz

mobil servis: +420 737 238 656
email servis: servis@linuxbox.cz
-------------------------------------
Re: pacemaker-1.1.12 - lots of Could not establish cib_ro connection: Resource temporarily unavailable (11) errors
> On 9 Feb 2015, at 8:06 pm, Nikola Ciprich <nikola.ciprich@linuxbox.cz> wrote:
> The OCF script does not do anything special; the start action basically just runs some
> Python command.

So the Python command has nothing to do with the cluster, and no reason to connect to the CIB?



_______________________________________________
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org
Re: pacemaker-1.1.12 - lots of Could not establish cib_ro connection: Resource temporarily unavailable (11) errors
Hello Andrew,

I'm really sorry for replying so late.
>
> The python command has nothing to do with the cluster and no reason to connect to the cib?
Well, the Python script actually executes crm_mon to do some internal sanity checks.
Is this a problem?

nik

Re: pacemaker-1.1.12 - lots of Could not establish cib_ro connection: Resource temporarily unavailable (11) errors
> On 14 Mar 2015, at 5:53 pm, Nikola Ciprich <nikola.ciprich@linuxbox.cz> wrote:
>
> Well, the Python script actually executes crm_mon to do some internal sanity checks.
> Is this a problem?

It certainly explains the log message.
Do you have a lot of these resources querying the CIB? Perhaps it's overloaded.

Re: pacemaker-1.1.12 - lots of Could not establish cib_ro connection: Resource temporarily unavailable (11) errors
Hello Andrew,

> It certainly explains the log message.
> Do you have a lot of these resources querying the CIB? Perhaps it's overloaded.

Well, it keeps happening when I try to start many of those resources at the same moment
(by many I mean, for example, 10), meaning they execute that many crm_mon processes at once.

Is there some internal CIB limit I could increase to prevent those errors?

nik

Re: pacemaker-1.1.12 - lots of Could not establish cib_ro connection: Resource temporarily unavailable (11) errors
> On 18 Mar 2015, at 7:01 pm, Nikola Ciprich <nikola.ciprich@linuxbox.cz> wrote:
> is there some internal cib limit I could increase to prevent those errors?

No, you'd need to retry on the client side.
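
For illustration only, a client-side retry could look roughly like the sketch below. It is not taken from the poster's script: the helper names run_with_retry and cluster_status, the attempt count and the delays are all assumptions. It treats any non-zero exit status as retryable, since the exact exit code crm_mon uses for a failed CIB connection is not stated in this thread, and it adds random jitter so that resources started at the same moment do not all retry in lockstep. It sticks to subprocess.Popen so that it should also run on the older Python shipped with CentOS 6.

```python
import random
import subprocess
import time


def run_with_retry(cmd, attempts=5, base_delay=0.5, sleep=time.sleep):
    """Run cmd, retrying with exponential backoff while it exits non-zero.

    Returns (returncode, stdout, stderr) of the last attempt.
    All parameter names and defaults here are illustrative assumptions.
    """
    for attempt in range(attempts):
        proc = subprocess.Popen(cmd,
                                stdout=subprocess.PIPE,
                                stderr=subprocess.PIPE,
                                universal_newlines=True)
        out, err = proc.communicate()
        if proc.returncode == 0:
            break
        if attempt < attempts - 1:
            # Exponential backoff plus jitter: concurrent starts should not
            # all hit the CIB again at exactly the same time.
            delay = base_delay * (2 ** attempt)
            sleep(delay + random.uniform(0, delay))
    return proc.returncode, out, err


def cluster_status():
    """Hypothetical wrapper: one-shot crm_mon ("-1" prints status once)."""
    return run_with_retry(["crm_mon", "-1"])
```

The resource agent's sanity check would then call cluster_status() instead of invoking crm_mon directly, and fail the start only if the last attempt still returns a non-zero code. The total retry time has to stay well below the start operation's timeout.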
