Troubleshooting Keepalives on CSS11503

I had a switch die the other day. I have redundant CSS11503s.

When the switch died, the CSS moved a couple of VIPs from the active state to the suspended state. When the switch came back online, the CSS never returned the VIPs to the active state; they stayed suspended.

I am wondering what I need to look at to make sure everything comes back to active after a failure and recovery. My typical service configuration is below:

Any thoughts from anyone about why the VIPs would have stayed suspended? The complicating factor was that when I ran show arp on my 3750 switch, it showed the IP for the VIP on the port to the CSS that had the VIP in the suspended state. This meant that when everything SHOULD have been up, the traffic for the VIP was going up the link to the WRONG CSS. It should have been going to the OTHER CSS, but because the ARP entry pointed at the wrong port, it wasn't.
