Cisco Catalyst Center High Availability Guide, Release 2.3.7.x
A comprehensive guide for configuring and administering High Availability (HA) for Cisco Catalyst Center 2.3.7.x, including cluster deployment, node maintenance, and failure recovery procedures.
Quick answers from the manual
Quick answer
- Catalyst Center HA requires a three-node cluster configuration to provide software and hardware redundancy. It supports database and security replication across nodes. p. 2
Key actions
- Activate High Availability p. 4
First start
- Configure nodes p. 4
Problems and fixes
- If a failure persists for more than 5 minutes, check the status. If it is a hardware failure, contact Cisco TAC. p. 11, 12
Maintenance and reset
- Shut down all nodes p. 5
Technical specifications
| Parameter | Value | Meaning | Pages |
|---|---|---|---|
| Cluster size | 3 nodes | Required for quorum | p. 2 |
Where to find it in the PDF
- High Availability Requirements p. 1
- Cluster Administration p. 5, 6
- Failure Scenarios p. 10, 11, 12
Quick Guide from the Manual
This guide outlines the requirements and procedures for implementing High Availability (HA) in a Cisco Catalyst Center environment. HA is designed to reduce downtime and ensure network resilience through a three-node cluster configuration.
- Requirement: A three-node cluster is mandatory for quorum.
- Network: All nodes must reside in the same network and site with a round-trip time (RTT) of 10 ms or less.
- Maintenance: Catalyst Center enters maintenance mode during upgrades and HA activation; plan accordingly.
- Hardware: Appliances must have the same number of cores and run the same software version.
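The requirements above lend themselves to a simple pre-flight check. The following is a minimal Python sketch, not a Cisco tool: the Node record, its field names, and the preflight function are hypothetical, and the values are assumed to come from your own inventory and RTT measurements. The "same network" requirement is not modeled here; only the site is compared.

```python
from dataclasses import dataclass

MAX_RTT_MS = 10.0    # RTT limit stated in the guide
REQUIRED_NODES = 3   # cluster size stated in the guide


@dataclass(frozen=True)
class Node:
    """Hypothetical inventory record; field names are illustrative only."""
    name: str
    cores: int
    version: str
    site: str
    rtt_ms: float  # RTT to the other nodes, measured by whatever means you use


def preflight(nodes):
    """Return a list of problems; an empty list means every check passed."""
    problems = []
    if len(nodes) != REQUIRED_NODES:
        problems.append(f"expected {REQUIRED_NODES} nodes, found {len(nodes)}")
    if len({n.cores for n in nodes}) > 1:
        problems.append("core counts differ between appliances")
    if len({n.version for n in nodes}) > 1:
        problems.append("software versions differ between appliances")
    if len({n.site for n in nodes}) > 1:
        problems.append("nodes are spread across more than one site")
    for n in nodes:
        if n.rtt_ms > MAX_RTT_MS:
            problems.append(f"{n.name}: RTT {n.rtt_ms} ms exceeds {MAX_RTT_MS} ms")
    return problems
```

Calling preflight with three matching Node records returns an empty list; any mismatch is reported as a separate message so that all problems can be fixed before HA activation is scheduled.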
High Availability Overview
Catalyst Center's HA framework provides both software and hardware redundancy. It supports database and security replication (including X.509 certificates) across nodes. The system is designed to handle single-node failures automatically.
Deployment Recommendations
When deploying an HA-enabled cluster, follow these best practices:
- Subnets: Use the default link-local subnets (169.x.x.x), or ensure that custom subnets conform to RFC 1918 and RFC 6598.
- Interfaces: Keep cluster and enterprise traffic separate by using dedicated Cluster and Enterprise interfaces.
- Network Links: Do not span a LAN across slow links, as this increases susceptibility to network failures.
- Timing: Enable HA during off-hours, as the system will be unavailable while redistributing services.
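The subnet recommendation can be checked mechanically. The sketch below uses Python's standard ipaddress module to test whether a candidate IPv4 subnet falls inside the RFC 1918 or RFC 6598 address blocks. It also accepts the standard IPv4 link-local block 169.254.0.0/16, which is an assumption about what the guide's "169.x.x.x" shorthand refers to. The function name classify_subnet is hypothetical.

```python
import ipaddress

# Private (RFC 1918) and shared (RFC 6598) address space, plus the IPv4
# link-local block. Treating "169.x.x.x" as 169.254.0.0/16 is an assumption.
ALLOWED_BLOCKS = {
    "RFC 1918": [
        ipaddress.ip_network("10.0.0.0/8"),
        ipaddress.ip_network("172.16.0.0/12"),
        ipaddress.ip_network("192.168.0.0/16"),
    ],
    "RFC 6598": [ipaddress.ip_network("100.64.0.0/10")],
    "link-local": [ipaddress.ip_network("169.254.0.0/16")],
}


def classify_subnet(cidr):
    """Return the label of the block containing `cidr`, or None if none does.

    Raises ValueError if `cidr` is malformed or has host bits set.
    """
    net = ipaddress.ip_network(cidr, strict=True)
    if net.version != 4:
        return None  # only IPv4 blocks are modeled in this sketch
    for label, blocks in ALLOWED_BLOCKS.items():
        if any(net.subnet_of(block) for block in blocks):
            return label
    return None
```

For example, classify_subnet("100.64.0.0/16") returns "RFC 6598", while a public range such as "8.8.8.0/24" returns None and should not be used as a custom subnet.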
Cluster Administration
Administrative tasks include shutting down, rebooting, and updating nodes. Important: You cannot simultaneously reboot or shut down two nodes in a three-node cluster, as this breaks the quorum requirement.
- Shutting down all nodes: Run sudo shutdown -h now on all nodes simultaneously.
- Shutting down one node: Use maglev node drain followed by sudo shutdown -h now.
- Rebooting: Use sudo shutdown -r now.
- RMA: Follow specific drain and removal procedures before replacing a failed node.
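The quorum rule behind these procedures can be illustrated with a short Python sketch. It assumes quorum means a simple majority of nodes, which matches the guide's statement that at least two of three nodes must stay operational; the function names are hypothetical, and the sketch does not call any Catalyst Center command.

```python
def quorum_size(cluster_size):
    """Simple majority (an assumption): more than half of the nodes."""
    return cluster_size // 2 + 1


def can_take_down(cluster_size, nodes_up, nodes_to_stop):
    """Decide whether stopping `nodes_to_stop` nodes at once is acceptable.

    A deliberate shutdown of the whole healthy cluster is allowed, as in the
    "shutting down all nodes" procedure. Any partial shutdown must leave a
    quorum of nodes running.
    """
    if not 0 <= nodes_to_stop <= nodes_up <= cluster_size:
        raise ValueError("inconsistent node counts")
    if nodes_to_stop == nodes_up == cluster_size:
        return True  # full, planned cluster shutdown
    return nodes_up - nodes_to_stop >= quorum_size(cluster_size)
```

With three healthy nodes, stopping one node or all three is accepted, but stopping two is rejected. If one node is already down, stopping a second one is rejected as well.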
High Availability Failure Scenarios
Catalyst Center detects failures within 5 minutes. If a failure persists longer, user intervention may be required.
- Node failure (< 5 mins): The system typically recovers automatically.
- Node failure (> 5 mins): Services are migrated to other nodes; the GUI remains usable on remaining nodes.
- Two nodes fail: The cluster breaks, and the GUI becomes inaccessible. Contact Cisco TAC.
- Hardware failure: Replace the failed component (fan, power supply, disk drive) or the node itself.
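The scenarios above amount to a small decision table. The Python sketch below encodes it for use in a runbook or monitoring script; the function name and the returned strings are illustrative. The guide does not say how an outage of exactly 5 minutes is handled, so treating it as a long outage is an assumption.

```python
AUTO_RECOVERY_WINDOW_S = 5 * 60  # 5-minute threshold from the guide

HEALTHY = "no failure"
AUTO_RECOVERY = "automatic recovery expected"
MIGRATED = "services migrate to remaining nodes; GUI remains usable"
CLUSTER_BROKEN = "cluster broken, GUI inaccessible; contact Cisco TAC"


def expected_outcome(failed_nodes, outage_seconds):
    """Map a failure in a three-node cluster to the behavior the guide describes."""
    if failed_nodes < 0 or outage_seconds < 0:
        raise ValueError("counts and durations must be non-negative")
    if failed_nodes == 0:
        return HEALTHY
    if failed_nodes >= 2:
        return CLUSTER_BROKEN
    if outage_seconds < AUTO_RECOVERY_WINDOW_S:
        return AUTO_RECOVERY
    # Exactly 5 minutes is treated as a long outage (assumption).
    return MIGRATED
```

A single node that has been down for two minutes maps to AUTO_RECOVERY. The same node down for ten minutes maps to MIGRATED, and any two-node failure maps to CLUSTER_BROKEN regardless of duration.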
Explanation of Pending State During a Failover
Pods in a 'Pending' state behave according to their type:
- Stateful set: Node-bound using local persistent volume (LPV).
- DaemonSet: Strictly node-bound.
- Stateless/deployment: Can move across nodes based on cluster state, though some have node anti-affinity rules.
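The classification above can be written as a lookup, which helps when deciding whether a Pending pod is waiting for its original node to return or could be placed elsewhere. This Python sketch is illustrative only: the function name and the kind strings are assumptions, and it does not query a cluster.

```python
# Workload kinds that the guide describes as tied to a specific node.
NODE_BOUND_KINDS = {"StatefulSet", "DaemonSet"}
MOVABLE_KINDS = {"Deployment"}


def can_move_to_another_node(kind, has_anti_affinity=False):
    """Return True if a pod of this kind may be rescheduled on a different node.

    `has_anti_affinity` marks deployments with node anti-affinity rules. This
    sketch treats them conservatively as not freely movable, since the rules
    may exclude the surviving nodes.
    """
    if kind in NODE_BOUND_KINDS:
        return False
    if kind in MOVABLE_KINDS:
        return not has_anti_affinity
    raise ValueError(f"unknown workload kind: {kind}")
```

Under these assumptions, a Pending StatefulSet or DaemonSet pod is expected to wait for its node, while a plain Deployment pod can be placed on a surviving node.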
Manufacturer information
Cisco Systems, Inc.
Practical help
Common problems
- Ensure at least two nodes are operational. Do not shut down two nodes simultaneously.
- If the failure lasts longer than 5 minutes, check the GUI for status messages. If hardware failure is confirmed, contact Cisco TAC.
- Catalyst Center is unavailable during upgrades and HA activation. Schedule these operations during off-hours.
Before use
- Ensure three appliances with the same core count are available.
- Verify all nodes are on the same network and at the same site.
- Check that the round-trip time (RTT) is 10 ms or less.
- Ensure secondary appliances are running the same software version as the primary.
- Prepare for maintenance mode downtime during HA activation.
Specs in practice
- 3-node cluster: The required configuration for quorum and HA operations.
Model compatibility
- Does not support mixed clusters with third-generation appliances in version 2.3.7.5.
- Does not support clusters with more than three nodes.
- Does not support distribution of nodes across multiple networks or sites.
Manual page author
David Miller
Documentation analyst
Organizes user manual content into clear summaries, with attention to model details, product context, and everyday usability.