Mailing List Archive

MRA failover doesn't work, Cisco TAC agrees, says it's a documentation defect
We have a fairly common and standard deployment for our MRA solution. All are running CUCM 14+, latest Expressway, etc...

VMware Server 1 (in DMZ)
Expressway-E-1

VMware Server 2 (in DMZ)
Expressway-E-2

VMware Server 3 (in Core)
CUCM Publisher
Expressway-C-1

VMware Server 4 (in Core)
CUCM Subscriber
Expressway-C-2



1. If either Expressway-E VM fails, redundancy works fine.
2. If either CUCM fails, redundancy works fine.
3. If either Expressway-C VM fails, redundancy works fine.
4. If VMware Server 4 fails (say, during patching, hardware maintenance, or a hardware failure), redundancy fails. Remote phones unregister and never re-register, no matter what is done. If either the CUCM Subscriber or Expressway-C-2 is brought back online, the phones register.

Cisco TAC claims that this is a limitation of our Cisco 88xx SIP MRA phones and is not solvable unless we purchase two new VMware servers and split CUCM and Expressway-C onto separate servers so they won't both go down at once. Since VMware Servers 3 and 4 are at different locations, vMotion isn't an option because there is no shared storage.

Has anyone run into this, or does anyone have suggestions? We have engaged our VAR and Cisco rep and may have to replace our phone system, since we are all working from home and MRA support, including redundancy, is critical to us.
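
As a rough aid for reasoning about the layout above, here is a minimal Python sketch that maps each host to its VMs and reports which hosts take down more than one tier at once. It is only an inventory model built from the names in the post; it does not model SIP, MRA, or phone behavior, and the helper names (`roles_lost`, `multi_role_hosts`) are made up for illustration.

```python
# Hypothetical inventory model of the deployment described in the post.
# It only answers "which roles lose a node if this host goes down?".
HOSTS = {
    "VMware Server 1 (DMZ)": ["Expressway-E-1"],
    "VMware Server 2 (DMZ)": ["Expressway-E-2"],
    "VMware Server 3 (Core)": ["CUCM Publisher", "Expressway-C-1"],
    "VMware Server 4 (Core)": ["CUCM Subscriber", "Expressway-C-2"],
}

# Checked in order; the names do not overlap, so order is not significant here.
ROLE_PREFIXES = ("CUCM", "Expressway-C", "Expressway-E")


def roles_lost(host):
    """Return the set of roles that lose a node when `host` goes down."""
    lost = set()
    for vm in HOSTS[host]:
        for role in ROLE_PREFIXES:
            if vm.startswith(role):
                lost.add(role)
                break
    return lost


def multi_role_hosts():
    """Return the hosts whose failure removes a node from more than one role."""
    return sorted(h for h in HOSTS if len(roles_lost(h)) > 1)


if __name__ == "__main__":
    for host in HOSTS:
        print(host, "->", sorted(roles_lost(host)))
    print("Hosts that remove two roles at once:", multi_role_hosts())
```

Both core hosts remove a CUCM node and an Expressway-C node together. The post only reports the outcome for Server 4, so the Server 3 case is an open question in this model, not a known result.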
Re: MRA failover doesn't work, Cisco TAC agrees, says it's a documentation defect
This sounds more like a config issue…

I have run into issues where Expressways go stupid when boxes go offline.

As for it being the 88xx phones: does the same happen with Jabber or Webex? If it does, I’d requeue the case…

Kent

_______________________________________________
cisco-voip mailing list
cisco-voip@puck.nether.net
https://puck.nether.net/mailman/listinfo/cisco-voip
Re: MRA failover doesn't work, Cisco TAC agrees, says it's a documentation defect
We don’t use Jabber or Webex.

The Cisco TAC case has been escalated, and they have been working on it for over 2 months. I have repeatedly sent Expressway logs and PRT logs from the phone. After working with Cisco engineering, they claim it is “working as intended” and plan on updating the documentation to reflect the limitation that if you lose both the subscriber and the redundant Expressway-C server, failover won’t happen.

I’d love to be proven wrong since we may have to completely replace our solution.


Re: [External] Re: MRA failover doesn't work, Cisco TAC agrees, says it's a documentation defect
It might be worth setting up a Jabber test endpoint just to see.

Some questions though:
- Does every Expressway-E know about every Expressway-C?
- Does every Expressway-C know about every CUCM?

I'm trying to figure out what the desired architecture is, and/or how
this problem would happen.
In our environment, the above are both true. So the loss of any number
of anything should not result in failover issues - and that is the
behavior we have seen (we have shut down entire sites due to
maintenance, power failure, etc. and failover worked).
In fact, we have found MRA phones to be great at failover in this way
(our MRA phones are all 8851s). Jabber has been the problem child.
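
The two questions above amount to a full-mesh check, which can be sketched in Python as follows. This is an illustration only: the `knows` table is assumed data typed in by hand, not something read from Expressway or CUCM, and `missing_links` is a hypothetical helper name.

```python
# Hypothetical inventory; fill in `knows` from what each node actually has configured.
EXPRESSWAY_E = {"Expressway-E-1", "Expressway-E-2"}
EXPRESSWAY_C = {"Expressway-C-1", "Expressway-C-2"}
CUCM = {"CUCM Publisher", "CUCM Subscriber"}

# Maps each node to the set of next-tier peers it is configured with (assumed data).
knows = {
    "Expressway-E-1": {"Expressway-C-1", "Expressway-C-2"},
    "Expressway-E-2": {"Expressway-C-1", "Expressway-C-2"},
    "Expressway-C-1": {"CUCM Publisher", "CUCM Subscriber"},
    "Expressway-C-2": {"CUCM Publisher", "CUCM Subscriber"},
}


def missing_links(sources, targets, table):
    """Return sorted (source, target) pairs where source is not configured with target."""
    return sorted(
        (s, t) for s in sources for t in targets if t not in table.get(s, set())
    )


gaps = missing_links(EXPRESSWAY_E, EXPRESSWAY_C, knows) + missing_links(
    EXPRESSWAY_C, CUCM, knows
)

if __name__ == "__main__":
    print("full mesh" if not gaps else f"missing links: {gaps}")
```

An empty result only says the inventory as written is a full mesh; it says nothing about how phones behave when nodes fail.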

--
Hunter Fuller (they)
Router Jockey
VBH M-1C
+1 256 824 5331

Office of Information Technology
The University of Alabama in Huntsville
Network Engineering

Re: [External] Re: MRA failover doesn't work, Cisco TAC agrees, says it's a documentation defect
We have no interest in setting up a Jabber environment in order to debug Cisco's issue.

Yes, every Expressway-E knows about every Expressway-C, and every Expressway-C knows about CUCM. Cisco TAC has verified the configuration, logs, and diagnostics. I've been working with them for 2 months and it's been escalated to backline engineering. They looked at the Cisco phone PRT logs and confirmed that it's a known limitation, and there is no solution.

Maybe it's an issue with later versions of CUCM and/or Expressway? We are running the latest, including the latest phone firmware.

Failover works great except in one scenario, where the CUCM Subscriber and the Expressway-C that reside on the same machine are both shut down. Bring either one up, and the phone registers.


Re: [External] Re: MRA failover doesn't work, Cisco TAC agrees, says it's a documentation defect
I'm a bit miffed about the need for the extra Expressway-C. We have very few MRA phones, but hadn't had this type of problem (an Expressway is somehow busted and not accepting registrations). Did they offer any explanation as to why that piece is needed?

The only thing I'd look into is how the CM list is being populated, and whether changing the CM group to bump the shut-down subscriber lower in the list (assuming registration order is sub -> pub) changes anything, since that has come up before. Expressway doesn't seem to be configured to be aware that a UCM has gone away, despite the zone going down, at least for UDS and discovery. I'm sure TAC looked at that, though. I have a conversation going with them about this and Jabber SSO for a similar reason: the device's configuration isn't dynamic enough to represent the state of the infrastructure, so devices sometimes get stuck trying something that won't work and fail even though other components are available to serve them. That probably doesn't help with anything other than to say we're in a similar boat, just with Jabber and MRA.

Adam Pawlowski
Network Engineer | Network and Communication Services
University at Buffalo Information Technology (UBIT)
243 Computing Center, Buffalo, NY 14260


Re: [External] Re: MRA failover doesn't work, Cisco TAC agrees, says it's a documentation defect
What Cisco is saying doesn't make sense to me.

In a scenario like you have described, where every server knows about
every other server, what is the difference in the
Expressway-C and CUCM being on the same machine? Since they are all
"equals" in this configuration, it should not matter where the
Expressway-C and CUCM are.

I guess what I'm suggesting is: is it possible that the failure of ANY
CUCM and ANY Exp-C at the same time is causing this issue?

Another test could be to shut down and move the Exp-C VMs between
hosts. (Not using vMotion obviously)
If this resolves the issue (e.g., CUCM Subscriber and Expressway-C-1
are on the same host now, and killing that host does NOT result in an
outage anymore), then we will learn that there is some specific thing
lurking in the config between specific CUCM and specific Exp-C that is
causing the issue.
If it does not resolve the issue, then you can test manually powering
off CUCM and Exp-C but on different hosts. This would test whether it
is just an issue with simultaneous failure of ANY CUCM+Exp-C at once.

I hope what I'm saying makes sense. The UC architecture does not
"know" about what VM host the apps live on. So there should be no
special relationship between VMs on the same host. That is why it
smells like something else is going on (despite what Cisco says).
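
The test plan above can be written out as a small matrix of simultaneous CUCM + Exp-C failures. The Python sketch below just enumerates the combinations and marks whether each pair shares a host under the placement from the original post. The outcome of each test is deliberately left out, since only the tester can fill that in; `HOST_OF` and `failure_matrix` are illustrative names.

```python
from itertools import product

# Current VM placement, taken from the original post. Edit this after moving
# the Exp-C VMs between hosts to regenerate the matrix for the swapped layout.
HOST_OF = {
    "CUCM Publisher": "Server 3",
    "Expressway-C-1": "Server 3",
    "CUCM Subscriber": "Server 4",
    "Expressway-C-2": "Server 4",
}
CUCMS = ["CUCM Publisher", "CUCM Subscriber"]
EXP_CS = ["Expressway-C-1", "Expressway-C-2"]


def failure_matrix(host_of):
    """Return (cucm, exp_c, same_host) for every simultaneous CUCM + Exp-C failure."""
    return [
        (cucm, exp_c, host_of[cucm] == host_of[exp_c])
        for cucm, exp_c in product(CUCMS, EXP_CS)
    ]


if __name__ == "__main__":
    for cucm, exp_c, same_host in failure_matrix(HOST_OF):
        where = "same host" if same_host else "different hosts"
        print(f"Power off {cucm} + {exp_c} ({where}): phones re-register? ______")
```

If phones fail to re-register for different-host pairs as well, that would point away from VM placement and toward simultaneous loss of any CUCM and any Exp-C.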
--
Hunter Fuller (they)
Router Jockey
VBH M-1C
+1 256 824 5331

Office of Information Technology
The University of Alabama in Huntsville
Network Engineering
Re: [External] Re: MRA failover doesn't work, Cisco TAC agrees, says it's a documentation defect
The only issue with them being on the same server is that both have to be shut down to do hardware maintenance or VMware patching.

Their solution is to buy another VMware server and separate the Expressway and CUCM onto separate servers so that they can be shut down separately. I guess they expect 1 ESXi host = 1 VM. /boggle.

I don't think it's the fact that they are on the same host. I think the phone only downloads limited knowledge of the environment, and when there is "enough" of a failure, it doesn't know enough to contact the other servers. It looks like a design defect in the phone firmware/MRA, not necessarily CUCM.


-----Original Message-----
From: Hunter Fuller <hf0002@uah.edu>
Sent: Tuesday, June 21, 2022 1:05 PM
To: Matthew Huff <mhuff@ox.com>
Cc: Kent Roberts <dvxkid@gmail.com>; cisco-voip voyp list <cisco-voip@puck.nether.net>
Subject: Re: [External] Re: [cisco-voip] MRA failover doesn't work, Cisco TAC agrees, says it's a documentation defect

What Cisco is saying doesn't make sense to me.

In a scenario like you have described, where every server knows about every other server, what is the difference in the Expressway-C and CUCM being on the same machine? Since they are all "equals" in this configuration, it should not matter where the Expressway-C and CUCM are.

I guess what I'm suggesting is: is it possible that the failure of ANY CUCM and ANY Exp-C at the same time is causing this issue?

Another test could be to shut down and move the Exp-C VMs between hosts (not using vMotion, obviously). If this resolves the issue (e.g., CUCM Subscriber and Expressway-C-1 are on the same host now, and killing that host does NOT result in an outage anymore), then we will learn that there is some specific thing lurking in the config between a specific CUCM and a specific Exp-C that is causing the issue.
If it does not resolve the issue, then you can test manually powering off a CUCM and an Exp-C that are on different hosts. This would test whether it is just an issue with simultaneous failure of ANY CUCM+Exp-C at once.

I hope what I'm saying makes sense. The UC architecture does not "know" about what VM host the apps live on. So there should be no special relationship between VMs on the same host. That is why it smells like something else is going on (despite what Cisco says).
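
The test plan above can be written down as a small matrix so no combination gets skipped. This is only a sketch: the inventory dictionary mirrors the four core VMs described in this thread, and the function names are made up for illustration. It lists every CUCM + Exp-C pair to power off together, marks whether the pair shares a host, and repeats the list with the two Exp-C VMs swapped between hosts.

```python
from itertools import product

# Hypothetical inventory mirroring the core layout described in this thread.
HOST_OF = {
    "CUCM Publisher": "VMware Server 3",
    "CUCM Subscriber": "VMware Server 4",
    "Expressway-C-1": "VMware Server 3",
    "Expressway-C-2": "VMware Server 4",
}


def paired_failure_tests(host_of):
    """Return (cucm, exp_c, same_host) for every CUCM/Exp-C pair to power off together."""
    cucms = sorted(name for name in host_of if name.startswith("CUCM"))
    exp_cs = sorted(name for name in host_of if name.startswith("Expressway-C"))
    return [(c, e, host_of[c] == host_of[e]) for c, e in product(cucms, exp_cs)]


def swap_exp_c(host_of):
    """Return a copy of the inventory with the two Expressway-C VMs swapped between hosts."""
    swapped = dict(host_of)
    swapped["Expressway-C-1"] = host_of["Expressway-C-2"]
    swapped["Expressway-C-2"] = host_of["Expressway-C-1"]
    return swapped


if __name__ == "__main__":
    for label, inventory in (("current layout", HOST_OF),
                             ("Exp-C VMs swapped", swap_exp_c(HOST_OF))):
        print(label)
        for cucm, exp_c, same_host in paired_failure_tests(inventory):
            print(f"  power off {cucm} + {exp_c} (same host: {same_host})")
```

If the outage follows a particular CUCM/Exp-C pair regardless of host, that points at the pairing; if it follows any simultaneous CUCM + Exp-C loss, that points at the behavior TAC described.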
--
Hunter Fuller (they)
Router Jockey
VBH M-1C
+1 256 824 5331

Office of Information Technology
The University of Alabama in Huntsville
Network Engineering
On Tue, Jun 21, 2022 at 11:54 AM Matthew Huff <mhuff@ox.com> wrote:
>
> We have no interest in setting up a Jabber environment in order to debug Cisco's issue.
>
> Yes, every Expressway-E knows about all Expressway-Cs, and all Expressway-Cs know about CUCM. Cisco TAC has verified the configuration, logs, and diagnostics. I've been working with them for 2 months and it's been escalated to backline engineering. They looked at the Cisco phone PRT logs and confirmed that it's a known limitation, and there is no solution.
>
> Maybe it's an issue with later versions of CUCM and/or Expressway? We are running the latest, including the latest phone firmware.
>
> Failover works great except in one scenario, where the CUCM subscriber and the Expressway-C that reside on the same machine are both shut down. Bring either one up, and the phones register.
>
>
> -----Original Message-----
> From: Hunter Fuller <hf0002@uah.edu>
> Sent: Tuesday, June 21, 2022 12:41 PM
> To: Matthew Huff <mhuff@ox.com>
> Cc: Kent Roberts <dvxkid@gmail.com>; cisco-voip voyp list
> <cisco-voip@puck.nether.net>
> Subject: Re: [External] Re: [cisco-voip] MRA failover doesn't work,
> Cisco TAC agrees, says it's a documentation defect
>
> It might be worth setting up a Jabber test endpoint just to see.
>
> Some questions though:
> - Does every Expressway-E know about every Expressway-C?
> - Does every Expressway-C know about every CUCM?
>
> I'm trying to figure out what the desired architecture is, and/or how this problem would happen.
> In our environment, the above are both true. So the loss of any number of anything should not result in failover issues, and that is the behavior we have seen (we have shut down entire sites due to maintenance, power failure, etc., and failover worked).
> In fact, we have found MRA phones to be great at failover in this way (our MRA phones are all 8851s). Jabber has been the problem child.
>
> --
> Hunter Fuller (they)
> Router Jockey
> VBH M-1C
> +1 256 824 5331
>
> Office of Information Technology
> The University of Alabama in Huntsville
> Network Engineering
>
> On Tue, Jun 21, 2022 at 9:13 AM Matthew Huff <mhuff@ox.com> wrote:
> >
> > We don’t use Jabber nor Webex.
> >
> >
> >
> > The Cisco TAC case has been escalated, and they have been working on this for over 2 months. I have repeatedly sent Expressway logs and PRT logs from the phone. After working with Cisco engineering, they claim it is “working as intended” and plan on updating the documentation to reflect the limitation that if you lose both the subscriber and the redundant Expressway-C server, failover won’t happen.
> >
> >
> >
> > I’d love to be proven wrong since we may have to completely replace our solution.
Re: [External] Re: MRA failover doesn't work, Cisco TAC agrees, says it's a documentation defect [ In reply to ]
Yes, that sounds almost exactly like what we are experiencing. I think it's a design defect in the MRA architecture, with the end device not downloading/retrying with the full environment.

It's the same issue we have with partial registrations. We have a number of shared SIP lines (think SALES line) that can silently fail on a phone. It will try for a few minutes, but give up after that. The user doesn't know that the shared SIP line is disconnected; they just don't get calls on it. We had to add complex SNMP monitoring so that we can be alerted when this happens and remotely reset the phones. Cisco TAC is aware of this issue and also told us it's "working as intended". We had a sales trader lose about $10k of commission because he missed a call, and he was not a happy camper.
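
For anyone wanting to build a similar alert, the check itself is simple once the per-line registration state has been collected. The sketch below is not our actual monitoring. It assumes the state has already been gathered by some other means (SNMP polling or otherwise) into a plain dictionary, and the device names, line names, and function name are all hypothetical. It flags phones where some lines are registered and others are not, which is the silent case the user never notices.

```python
def partially_registered(phones):
    """Return {phone: [unregistered lines]} for phones with some, but not all, lines registered.

    `phones` maps a device name to {line name: True if registered, False if not}.
    Phones with every line down are skipped, since that failure is visible to the user.
    """
    alerts = {}
    for phone, lines in phones.items():
        down = sorted(line for line, registered in lines.items() if not registered)
        if down and len(down) < len(lines):
            alerts[phone] = down
    return alerts


# Hypothetical sample data: one healthy phone, one silent partial failure, one fully down.
SAMPLE = {
    "SEP-A": {"1001": True, "SALES": False},
    "SEP-B": {"1002": True, "SALES": True},
    "SEP-C": {"1003": False, "SALES": False},
}

if __name__ == "__main__":
    for phone, lines in partially_registered(SAMPLE).items():
        print(f"{phone}: unregistered lines {lines}, candidate for remote reset")
```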



-----Original Message-----
From: cisco-voip <cisco-voip-bounces@puck.nether.net> On Behalf Of Adam Pawlowski
Sent: Tuesday, June 21, 2022 1:04 PM
To: cisco-voip voyp list <cisco-voip@puck.nether.net>
Subject: Re: [cisco-voip] [External] Re: MRA failover doesn't work, Cisco TAC agrees, says it's a documentation defect

I'm a bit miffed about the need for the extra Expressway-C. We have very few MRA phones, but haven't had this type of problem, where an Expressway is somehow busted and not accepting registrations. Did they offer any explanation as to why that piece is needed?

The only thing I'd go look up is how the CM list is being populated, and whether changing the CM group to bump the shut-down subscriber lower in the list (assuming reg order is sub -> pub) helps, just since that has come up before. Expressway doesn't seem to be configured to be aware that a UCM has gone away, despite the zone going down, at least for UDS and discovery. I'm sure TAC looked at that though. I have a conversation going with them about this and Jabber SSO for a similar reason: the device's configuration isn't dynamic enough to represent the state of the infrastructure, and sometimes devices get stuck trying something that won't work and fail despite other components being available to serve them. That probably doesn't help with anything other than to say we're in a similar boat, just with Jabber and MRA.

Adam Pawlowski
Network Engineer | Network and Communication Services
University at Buffalo Information Technology (UBIT)
243 Computing Center, Buffalo, NY 14260 


Re: [External] Re: MRA failover doesn't work, Cisco TAC agrees, says it's a documentation defect [ In reply to ]
I was miffed when they said no real MRA redundancy until you upgrade to v14. But now, hearing this, man, what a disappointment.

I had a similar discussion with the Expressway folks and ESXi compatibility/testing and they were like, yeah, you should probably have separate UCS boxes for Expressways different than your CUCMs.

And I was all, "wait, what?"

They want us to run a completely separate ESXi box with only an E or a C on it to get full MRA redundancy?

What a letdown.

-----Original Message-----
From: cisco-voip <cisco-voip-bounces@puck.nether.net> On Behalf Of Matthew Huff
Sent: Tuesday, June 21, 2022 1:37 PM
To: Adam Pawlowski <ajp26@buffalo.edu>; cisco-voip voyp list <cisco-voip@puck.nether.net>
Subject: Re: [cisco-voip] [External] Re: MRA failover doesn't work, Cisco TAC agrees, says it's a documentation defect

Yes, that sounds almost exactly like what we are experiencing. I think it's a design defect in the MRA architecture, with the end device not downloading/retrying with the full environment.

It's the same issue we have with partial registrations. We have a number of shared SIP lines (think SALES line) that can silently fail on a phone. It will try for a few minutes, but give up after that. The user doesn't know that the shared SIP line is disconnected; they just don't get calls on it. We had to add complex SNMP monitoring so that we can be alerted when this happens and remotely reset the phones. Cisco TAC is aware of this issue and also told us it's "working as intended". We had a sales trader lose about $10k of commission because he missed a call, and he was not a happy camper.



Re: [External] Re: MRA failover doesn't work, Cisco TAC agrees, says it's a documentation defect [ In reply to ]
Yes, they want 6 boxes for redundancy: two for Expressway-E, two for CUCM, two for Expressway-C. /Boggle

Even then, that doesn't provide 100% redundancy. We want to place our CUCM and Expressways at different datacenters connected by a 10 Gb WAN. If we were to lose the WAN, MRA would still fail, since we would lose both ESXi hosts.
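
To make that concrete, here is a rough sketch of the check I have in mind. It assumes the limitation is exactly as TAC described it (losing a CUCM node and an Expressway-C at the same moment breaks failover). The six-box host names and datacenter labels are hypothetical placeholders, not our real inventory. It groups components by failure domain and reports every domain whose loss takes out at least one CUCM and at least one Expressway-C.

```python
from collections import defaultdict


def domains_losing_both(placement):
    """Return failure domains whose loss removes at least one CUCM and one Expressway-C.

    `placement` maps a component name to the failure domain (host, site, ...) it lives in.
    """
    by_domain = defaultdict(set)
    for component, domain in placement.items():
        by_domain[domain].add(component)
    risky = []
    for domain, components in sorted(by_domain.items()):
        has_cucm = any(c.startswith("CUCM") for c in components)
        has_exp_c = any(c.startswith("Expressway-C") for c in components)
        if has_cucm and has_exp_c:
            risky.append(domain)
    return risky


# Current four-host layout from this thread, keyed by ESXi host.
FOUR_HOSTS = {
    "Expressway-E-1": "VMware Server 1",
    "Expressway-E-2": "VMware Server 2",
    "CUCM Publisher": "VMware Server 3",
    "Expressway-C-1": "VMware Server 3",
    "CUCM Subscriber": "VMware Server 4",
    "Expressway-C-2": "VMware Server 4",
}

# Hypothetical six-box layout: every core VM gets its own host.
SIX_HOSTS = {
    "CUCM Publisher": "Host A1",
    "Expressway-C-1": "Host A2",
    "CUCM Subscriber": "Host B1",
    "Expressway-C-2": "Host B2",
}

# The same six-box layout viewed by site, where a WAN cut isolates a whole datacenter.
SIX_HOSTS_BY_SITE = {
    "CUCM Publisher": "Datacenter A",
    "Expressway-C-1": "Datacenter A",
    "CUCM Subscriber": "Datacenter B",
    "Expressway-C-2": "Datacenter B",
}

if __name__ == "__main__":
    print("four hosts:", domains_losing_both(FOUR_HOSTS))
    print("six hosts, per host:", domains_losing_both(SIX_HOSTS))
    print("six hosts, per site:", domains_losing_both(SIX_HOSTS_BY_SITE))
```

Splitting the VMs across more hosts clears the per-host check, but the per-site view still flags both datacenters, which is the WAN-loss case above.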


-----Original Message-----
From: Lelio Fulgenzi <lelio@uoguelph.ca>
Sent: Tuesday, June 21, 2022 4:30 PM
To: Matthew Huff <mhuff@ox.com>; Adam Pawlowski <ajp26@buffalo.edu>; cisco-voip voyp list <cisco-voip@puck.nether.net>
Subject: RE: [cisco-voip] [External] Re: MRA failover doesn't work, Cisco TAC agrees, says it's a documentation defect

I was miffed when they said no real MRA redundancy until you upgrade to v14. But now, hearing this, man, what a disappointment.

I had a similar discussion with the Expressway folks and ESXi compatibility/testing and they were like, yeah, you should probably have separate UCS boxes for Expressways different than your CUCMs.

And I was all, "wait, what?"

They want us to run a completely separate ESXi box with only an E or a C on it to get full MRA redundancy?

What a letdown.

> On Tue, Jun 21, 2022 at 9:13 AM Matthew Huff <mhuff@ox.com> wrote:
> >
> > We don’t use Jabber or Webex.
> >
> > Cisco TAC has escalated this, and they have been working on it for
> > over 2 months. I have repeatedly sent Expressway logs and PRT logs from
> > the phone. After working with Cisco engineering, they claim it is
> > “working as intended” and plan on updating the documentation to reflect
> > the limitation that if you lose both the subscriber and the redundant
> > Expressway-C server, failover won’t happen.
> >
> > I’d love to be proven wrong, since we may have to completely replace
> > our solution.
_______________________________________________
cisco-voip mailing list
cisco-voip@puck.nether.net
https://puck.nether.net/mailman/listinfo/cisco-voip
Re: [External] Re: MRA failover doesn't work, Cisco TAC agrees, says it's a documentation defect [ In reply to ]
At some point I’ll have to dig deep to understand the Jabber MRA redundancy and the limitations.

Our users don’t want to hear a decision tree regarding redundancy.

In the end, I may have to just announce our maintenance windows with:

You may experience issues with Jabber as we restart servers. Try signing out and back in again. If that doesn’t work, try again later. If that still doesn’t work, wait until the window is completed.

Nice. What am I installing all these extra servers for, then? Only to carry load?

Sent from my iPhone

> On Jun 21, 2022, at 4:55 PM, Matthew Huff <mhuff@ox.com> wrote:
>
> Yes, they want 6 boxes for redundancy. Two for Expressway-E, two for CUCM, two for Expressway-C. /Boggle
>
> Even then, that doesn't provide 100% redundancy. We want to place our CUCM and Expressways at different datacenters connected by a 10 Gb WAN. If we were to lose the WAN, we would still fail with MRA, since we would lose both ESXi hosts.
>
>
> -----Original Message-----
> From: Lelio Fulgenzi <lelio@uoguelph.ca>
> Sent: Tuesday, June 21, 2022 4:30 PM
> To: Matthew Huff <mhuff@ox.com>; Adam Pawlowski <ajp26@buffalo.edu>; cisco-voip voyp list <cisco-voip@puck.nether.net>
> Subject: RE: [cisco-voip] [External] Re: MRA failover doesn't work, Cisco TAC agrees, says it's a documentation defect
>
> I was miffed when they said no real MRA redundancy until you upgrade to v14. But now, hearing this, man, what a disappointment.
>
> I had a similar discussion with the Expressway folks and ESXi compatibility/testing and they were like, yeah, you should probably have separate UCS boxes for Expressways different than your CUCMs.
>
> And I was all, "wait, what?"
>
> They want us to run a completely separate ESXi box with only an E or a C on it to get full MRA redundancy?
>
> What a letdown.
>
> -----Original Message-----
> From: cisco-voip <cisco-voip-bounces@puck.nether.net> On Behalf Of Matthew Huff
> Sent: Tuesday, June 21, 2022 1:37 PM
> To: Adam Pawlowski <ajp26@buffalo.edu>; cisco-voip voyp list <cisco-voip@puck.nether.net>
> Subject: Re: [cisco-voip] [External] Re: MRA failover doesn't work, Cisco TAC agrees, says it's a documentation defect
>
> Yes, that sounds almost exactly like what we are experiencing. I think it's a design defect in the MRA architecture, with the end device not downloading/retrying with the full environment.
>
> It's the same issue we have with partial registrations. We have a number of shared SIP lines (think SALES line) that can silently fail on a phone. It will try for a few minutes, but give up after that. The user doesn't know that the shared SIP line is disconnected; they just don't get calls on it. We had to add complex SNMP monitoring so that we can be alerted when this happens and remotely reset the phones. Cisco TAC is aware of this issue and also told us it's "working as intended". We had a sales trader lose about $10k of commission because he missed a call, and he was not a happy camper.
Re: [External] Re: MRA failover doesn't work, Cisco TAC agrees, says it's a documentation defect [ In reply to ]
Expressway is a very temperamental child. You sneeze and it will act up. We are in design reviews again with a Cisco Expressway expert, and that’s all he does. They want things set their way and they are watching the entire build.

The redundancy has its own set of problems. Ultimately, we are now deploying 6 boxes in data center 1 and 6 boxes in data center 2 as entirely different clusters, as traffic across the WAN was causing its own set of replication problems with Expressway. Bottom line is Cisco acquired the product, and it’s been added on to so many times that it has its own set of issues.
We have multiple 10 gig links between the centers, and multiple 10 gig links to the internet. Low voice traffic on Expressway, but we have faced lots of fun over the last 2 years.

Re: [External] Re: MRA failover doesn't work, Cisco TAC agrees, says it's a documentation defect [ In reply to ]
Expressway is (one of) the only Collab products that doesn’t follow the “what we list is the minimum version of ESXi, maintenance releases and updates are good to go” rule.

Which is fine, but then they also don’t make an effort to test subsequent ESXi updates either.

They currently only support ESXi 6.5U2 which means I had to stick to the patch just before U3 in order for the version information to show U2.
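
A small sketch of the difference between the two support rules described above. The version labels, the parsing helper and the reading of the "minimum version" rule (same major.minor line, any later update accepted) are assumptions made for illustration, not Cisco's published logic:

```python
import re
from typing import Tuple


def parse_esxi(version: str) -> Tuple[int, int, int]:
    """Parse a label such as '6.5U2' or '7.0' into (major, minor, update)."""
    m = re.fullmatch(r"(\d+)\.(\d+)(?:U(\d+))?", version.strip(), re.IGNORECASE)
    if m is None:
        raise ValueError(f"unrecognised ESXi version label: {version!r}")
    major, minor, update = m.groups()
    # A label without an update suffix is treated as update 0 (assumption).
    return int(major), int(minor), int(update or 0)


def ok_minimum_rule(host: str, listed: str) -> bool:
    """Listed version is a minimum: same major.minor, equal or later update."""
    h, l = parse_esxi(host), parse_esxi(listed)
    return h[:2] == l[:2] and h[2] >= l[2]


def ok_exact_rule(host: str, listed: str) -> bool:
    """Only the exact listed version counts as supported."""
    return parse_esxi(host) == parse_esxi(listed)


if __name__ == "__main__":
    for host in ("6.5U2", "6.5U3"):
        print(host, ok_minimum_rule(host, "6.5U2"), ok_exact_rule(host, "6.5U2"))
```

With the listed version set to 6.5U2, a host reporting 6.5U3 passes the minimum rule but fails the exact rule, which is the situation that forces staying on the last patch before U3.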

And this is where the conversation skewed to “you should have Expressway on its own ESXi boxes.”

Ugh.


> On Jun 21, 2022, at 7:51 PM, Kent Roberts <dvxkid@gmail.com> wrote:
>
> Expressway is a very temperamental child. You sneeze and it will act up. We are in design reviews again with a Cisco Expressway expert, and that’s all he does. They want things set their way and they are watching the entire build.
>
> The redundancy has its own set of problems. Ultimately we are now deploying 6 boxes in data center 1 and 6 boxes in data center 2 as entirely separate clusters, because traffic across the WAN was causing its own set of replication problems with Expressway. Bottom line: Cisco acquired the product, and it has been added onto so many times that it has its own set of issues.
> We have multiple 10 gig links between the data centers, and multiple 10 gig links to the internet. Voice traffic on Expressway is low, but we have faced lots of fun over the last 2 years.
>
>> On Jun 21, 2022, at 14:55, Matthew Huff <mhuff@ox.com> wrote:
>>
>> Yes, they want 6 boxes for redundancy: two for Expressway-E, two for CUCM, two for Expressway-C. /Boggle
>>
>> Even then, that doesn't provide 100% redundancy. We want to place our CUCM and Expressways at different datacenters connected by a 10GB WAN. If we were to lose the WAN, we would still fail with MRA since we would lose both ESXi hosts.
>>
>>
>> -----Original Message-----
>> From: Lelio Fulgenzi <lelio@uoguelph.ca>
>> Sent: Tuesday, June 21, 2022 4:30 PM
>> To: Matthew Huff <mhuff@ox.com>; Adam Pawlowski <ajp26@buffalo.edu>; cisco-voip voyp list <cisco-voip@puck.nether.net>
>> Subject: RE: [cisco-voip] [External] Re: MRA failover doesn't work, Cisco TAC agrees, says it's a documentation defect
>>
>> I was miffed when they said no real MRA redundancy until you upgrade to v14. But now, hearing this, man, what a disappointment.
>>
>> I had a similar discussion with the Expressway folks and ESXi compatibility/testing and they were like, yeah, you should probably have separate UCS boxes for Expressways different than your CUCMs.
>>
>> And I was all, "wait, what?"
>>
>> They want us to run a completely separate ESXi box with only an E or a C on it to get full MRA redundancy?
>>
>> What a letdown.
>>
>> -----Original Message-----
>> From: cisco-voip <cisco-voip-bounces@puck.nether.net> On Behalf Of Matthew Huff
>> Sent: Tuesday, June 21, 2022 1:37 PM
>> To: Adam Pawlowski <ajp26@buffalo.edu>; cisco-voip voyp list <cisco-voip@puck.nether.net>
>> Subject: Re: [cisco-voip] [External] Re: MRA failover doesn't work, Cisco TAC agrees, says it's a documentation defect
>>
>> Yes, that sounds almost exactly like what we are experiencing. I think it's a design defect in the MRA architecture: the end device doesn't download the full environment or retry against it.
>>
>> It's the same issue we have with partial registrations. We have a number of shared SIP lines (think SALES line) that can silently fail on a phone. The phone will retry for a few minutes, but gives up after that. The user doesn't know that the shared SIP line is disconnected; they just don't get calls on it. We had to add complex SNMP monitoring so that we are alerted when this happens and can remotely reset the phones. Cisco TAC is aware of this issue and also told us it's "working as intended". We had a sales trader lose about $10k of commission because he missed a call, and he was not a happy camper.
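
The check behind that kind of alerting could look roughly like the sketch below. It assumes the per-phone line states have already been collected (by SNMP polling or otherwise; the collection step is not shown), and the device names and data shapes are hypothetical:

```python
from typing import Dict, List


def phones_needing_reset(
    expected: Dict[str, List[str]],
    registered: Dict[str, List[str]],
) -> Dict[str, List[str]]:
    """Return {phone: [missing lines]} for phones that are up but missing lines."""
    flagged: Dict[str, List[str]] = {}
    for phone, lines in expected.items():
        active = set(registered.get(phone, []))
        if not active:
            # Fully unregistered phone: a different alarm, not a partial registration.
            continue
        missing = [line for line in lines if line not in active]
        if missing:
            flagged[phone] = missing
    return flagged


if __name__ == "__main__":
    # Hypothetical inventory: which lines each phone should have registered.
    expected = {
        "SEP001122334455": ["1001", "SALES"],
        "SEP00AABBCCDDEE": ["1002", "SALES"],
    }
    # Hypothetical snapshot of what is registered right now.
    registered = {
        "SEP001122334455": ["1001"],
        "SEP00AABBCCDDEE": ["1002", "SALES"],
    }
    print(phones_needing_reset(expected, registered))
```

The first phone is flagged because its shared SALES line has dropped while its primary line is still registered, which is the silent failure the user never sees.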
>>
>>
>>
>> -----Original Message-----
>> From: cisco-voip <cisco-voip-bounces@puck.nether.net> On Behalf Of Adam Pawlowski
>> Sent: Tuesday, June 21, 2022 1:04 PM
>> To: cisco-voip voyp list <cisco-voip@puck.nether.net>
>> Subject: Re: [cisco-voip] [External] Re: MRA failover doesn't work, Cisco TAC agrees, says it's a documentation defect
>>
>> I'm a bit miffed about the need for the extra Expressway-C. We have very few MRA phones, but we haven't had this type of problem, where an Expressway is somehow busted and not accepting registrations. Did they offer any explanation as to why that piece is needed?
>>
>> The only thing I'd look up is how the CM list is being populated, and whether changing the CM group to bump the shut-down subscriber lower (assuming reg order is sub -> pub) makes a difference, just since that has come up before. Expressway doesn't seem to be configured to be aware that a UCM has gone away, despite the zone going down, at least for UDS and discovery. I'm sure TAC looked at that though. I have a conversation going with them about this and Jabber SSO for a similar reason: the device's configuration isn't dynamic enough to represent the state of the infrastructure, so devices sometimes get stuck trying something that won't work and fail despite other components being available to serve them. That probably doesn't help with anything other than to say we're in a similar boat, just with Jabber and MRA.
>>
>> Adam Pawlowski
>> Network Engineer | Network and Communication Services University at Buffalo Information Technology (UBIT)
>> 243 Computing Center, Buffalo, NY 14260
>>
>>
>>> -----Original Message-----
>>> From: cisco-voip <cisco-voip-bounces@puck.nether.net> On Behalf Of
>>> Matthew Huff
>>> Sent: Tuesday, June 21, 2022 12:54 PM
>>> To: Hunter Fuller <hf0002@uah.edu>
>>> Cc: cisco-voip voyp list <cisco-voip@puck.nether.net>
>>> Subject: Re: [cisco-voip] [External] Re: MRA failover doesn't work,
>>> Cisco TAC agrees, says it's a documentation defect
>>>
>>> We have no interest in setting up a Jabber environment in order to
>>> debug Cisco's issue.
>>>
>>> Yes, every expressway-e knows about all expressway-c, all expressway-c
>>> know about CUCM. Cisco TAC has verified the configuration, logs, and
>>> diagnostic. I've been working with them for 2 months and it's been
>>> escalated to backline-engineering. They looked at the Cisco Phone PRT
>>> logs and confirmed that it's a known limitation, and there is no solution.
>>>
>>> Maybe it's an issue with later versions of CUCM and/or expressway? We
>>> are running the latest including latest phone firmware.
>>>
>>> Failover works great except in one scenario, where the CUCM
>>> subscriber and the Expressway-C that reside on the same machine are both shut down.
>>> Bringing either one back up lets the phone register.
>>>
>>>
>>> -----Original Message-----
>>> From: Hunter Fuller <hf0002@uah.edu>
>>> Sent: Tuesday, June 21, 2022 12:41 PM
>>> To: Matthew Huff <mhuff@ox.com>
>>> Cc: Kent Roberts <dvxkid@gmail.com>; cisco-voip voyp list <cisco-
>>> voip@puck.nether.net>
>>> Subject: Re: [External] Re: [cisco-voip] MRA failover doesn't work,
>>> Cisco TAC agrees, says it's a documentation defect
>>>
>>> It might be worth setting up a Jabber test endpoint just to see.
>>>
>>> Some questions though:
>>> - Does every Expressway-E know about every Expressway-C?
>>> - Does every Expressway-C know about every CUCM?
>>>
>>> I'm trying to figure out what the desired architecture is, and/or how
>>> this problem would happen.
>>> In our environment, the above are both true. So the loss of any number
>>> of anything, should not result in failover issues - and that is the
>>> behavior we have seen (we have shut down entire sites due to
>>> maintenance, power failure, etc. and failover worked).
>>> In fact, we have found MRA phones to be great at failover in this way
>>> (our MRA phones are all 8851s). Jabber has been the problem child.
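
Those two questions can be turned into a quick reachability check. The sketch below models the four-host layout from the original post and assumes a full mesh (every Expressway-E knows every Expressway-C, every Expressway-C knows every CUCM). The host keys are illustrative labels, and the model only answers whether a path exists on paper, not how a phone actually retries:

```python
from typing import Dict, Iterable, Set

# VM placement as described in the original post.
HOSTS: Dict[str, Set[str]] = {
    "vmware-1": {"Expressway-E-1"},
    "vmware-2": {"Expressway-E-2"},
    "vmware-3": {"CUCM-Publisher", "Expressway-C-1"},
    "vmware-4": {"CUCM-Subscriber", "Expressway-C-2"},
}

# An MRA registration needs at least one live node in each role.
ROLES: Dict[str, Set[str]] = {
    "Expressway-E": {"Expressway-E-1", "Expressway-E-2"},
    "Expressway-C": {"Expressway-C-1", "Expressway-C-2"},
    "CUCM": {"CUCM-Publisher", "CUCM-Subscriber"},
}


def path_exists(failed_hosts: Iterable[str]) -> bool:
    """True if every role still has a surviving node after the given hosts fail."""
    down: Set[str] = set()
    for host in failed_hosts:
        down |= HOSTS[host]
    return all(nodes - down for nodes in ROLES.values())


if __name__ == "__main__":
    for host in HOSTS:
        print(host, "fails ->", "path exists" if path_exists([host]) else "no path")
```

Under these assumptions every single-host failure, including VMware server 4, still leaves a complete path, so the topology alone does not explain the outage reported in this thread.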
>>>
>>> --
>>> Hunter Fuller (they)
>>> Router Jockey
>>> VBH M-1C
>>> +1 256 824 5331
>>>
>>> Office of Information Technology
>>> The University of Alabama in Huntsville Network Engineering
>>>
>>>>> On Tue, Jun 21, 2022 at 9:13 AM Matthew Huff <mhuff@ox.com> wrote:
>>>>
>>>> We don’t use Jabber or Webex.
>>>>
>>>> The Cisco TAC case has been escalated and they have been working on this for
>>>> over 2 months. I have repeatedly sent Expressway logs and PRT logs from the phone.
>>>> After working with Cisco engineering, they claim it is “working as intended”
>>>> and plan on updating the documentation to reflect the limitation that
>>>> if you lose both the subscriber and the redundant Expressway-C server,
>>>> failover won’t happen.
>>>>
>>>> I’d love to be proven wrong since we may have to completely replace
>>>> our solution.
>>>>
>>>> From: Kent Roberts <dvxkid@gmail.com>
>>>> Sent: Tuesday, June 21, 2022 10:09 AM
>>>> To: Matthew Huff <mhuff@ox.com>
>>>> Cc: cisco-voip voyp list <cisco-voip@puck.nether.net>
>>>> Subject: Re: [cisco-voip] MRA failover doesn't work, Cisco TAC
>>>> agrees, says it's a documentation defect
>>>>
>>>>
>>>>
>>>> This sounds more like a config issue…
>>>>
>>>> I have run into issues where Expressways go stupid when boxes go offline.
>>>>
>>>> As for it being the 88xx phones: does the same happen with Jabber or Webex? If it does, I’d requeue the case…
>>>>
>>>> Kent
>>>>
>>>>
>>>>
>>>>> On Jun 21, 2022, at 07:47, Matthew Huff <mhuff@ox.com> wrote:
>>>>
>>>> We have a fairly common and standard deployment for our MRA solution.
>>>> All are running CUCM 14+, latest Expressway, etc…
>>>>
>>>> VMware server 1 (in DMZ)
>>>> Expressway-E-1
>>>>
>>>> VMware server 2 (in DMZ)
>>>> Expressway-E-2
>>>>
>>>> VMware server 3 (in core)
>>>> CUCM Publisher
>>>> Expressway-C-1
>>>>
>>>> VMware server 4 (in core)
>>>> CUCM Subscriber
>>>> Expressway-C-2
>>>>
>>>> 1. If either Expressway-E VM fails, redundancy works fine.
>>>> 2. If either CUCM fails, redundancy works fine.
>>>> 3. If either Expressway-C VM fails, redundancy works fine.
>>>> 4. If VMware server 4 fails (say during patching, hardware maintenance or hardware failure), redundancy fails. Remote phones unregister and never re-register no matter what is done. If either the CUCM Subscriber or Expressway-C-2 is brought back online, the phones register.
>>>>
>>>> Cisco TAC claims that this is a limitation of our Cisco 88xx SIP MRA phones and is not solvable unless we purchase two new VMware servers and split CUCM and Expressway-C onto separate servers so they won’t both go down at once. Since VMware servers 3 and 4 are at different locations, vMotion isn’t an option because there is no shared storage.
>>>>
>>>> Anyone run into this or have any suggestions? We have engaged our VAR and Cisco rep and may have to replace our phone system, since we are all working from home and MRA support including redundancy is critical to us.
Re: [External] Re: MRA failover doesn't work, Cisco TAC agrees, says it's a documentation defect [ In reply to ]
14 should be ESXi 7 compatible. And it’s Tandberg, adapted by Cisco.
So, there’s that.

Kent

> On Jun 21, 2022, at 18:44, Lelio Fulgenzi <lelio@uoguelph.ca> wrote:
>
> Expressway is (one of) the only Collab products that doesn’t follow the “what we list is the minimum version of ESXi, maintenance releases and updates are good to go” rule.
>
> Which is fine, but then they also don’t make an effort to test subsequent ESXi updates either.
>
> They currently only support ESXi 6.5U2 which means I had to stick to the patch just before U3 in order for the version information to show U2.
>
> And this is where the conversation skewed to “you should have Expressway on its own ESXi boxes.”
>
> Ugh.