Table Of Contents
Configuring Failover
Understanding Failover
Failover System Requirements
Hardware Requirements
Software Requirements
License Requirements
The Failover and Stateful Failover Links
Failover Link
Stateful Failover Link
Active/Active and Active/Standby Failover
Active/Standby Failover
Active/Active Failover
Determining Which Type of Failover to Use
Regular and Stateful Failover
Regular Failover
Stateful Failover
Failover Health Monitoring
Unit Health Monitoring
Interface Monitoring
Configuring Failover
Configuring Active/Standby Failover
Prerequisites
Configuring Cable-Based Active/Standby Failover (PIX Security Appliance Only)
Configuring LAN-Based Active/Standby Failover
Configuring Optional Active/Standby Failover Settings
Configuring Active/Active Failover
Prerequisites
Configuring Cable-Based Active/Active Failover (PIX Security Appliance Only)
Configuring LAN-Based Active/Active Failover
Configuring Optional Active/Active Failover Settings
Configuring Failover Communication Authentication/Encryption
Verifying the Failover Configuration
Using the show failover Command
Viewing Monitored Interfaces
Displaying the Failover Commands in the Running Configuration
Testing the Failover Functionality
Controlling and Monitoring Failover
Forcing Failover
Disabling Failover
Restoring a Failed Unit or Failover Group
Monitoring Failover
Failover System Messages
Debug Messages
SNMP
Failover Configuration Examples
Cable-Based Active/Standby Failover Example
LAN-Based Active/Standby Failover Example
LAN-Based Active/Active Failover Example
Configuring Failover
This chapter describes the security appliance failover feature, which lets you configure two security appliances so that one will take over operation if the other one fails.
This chapter includes the following sections:
•
Understanding Failover
•
Configuring Failover
•
Controlling and Monitoring Failover
•
Failover Configuration Examples
Understanding Failover
The failover configuration requires two identical security appliances connected to each other through a dedicated failover link and, optionally, a Stateful Failover link. The health of the active interfaces and units is monitored to determine if specific failover conditions are met. If those conditions are met, failover occurs.
The security appliance supports two failover configurations, Active/Active failover and Active/Standby failover. Each failover configuration has its own method for determining and performing failover.
With Active/Active failover, both units can pass network traffic. This lets you configure load balancing on your network. Active/Active failover is only available on units running in multiple context mode.
With Active/Standby failover, only one unit passes traffic while the other unit waits in a standby state. Active/Standby failover is available on units running in either single or multiple context mode.
Both failover configurations support stateful or stateless (regular) failover.
Note
VPN failover is not supported on units running in multiple context mode. VPN failover is available for Active/Standby failover configurations only.
This section includes the following topics:
•
Failover System Requirements
•
The Failover and Stateful Failover Links
•
Active/Active and Active/Standby Failover
•
Regular and Stateful Failover
•
Failover Health Monitoring
Failover System Requirements
This section describes the hardware, software, and license requirements for security appliances in a failover configuration. This section contains the following topics:
•
Hardware Requirements
•
Software Requirements
•
License Requirements
Hardware Requirements
The two units in a failover configuration must have the same hardware configuration. They must be the same model, have the same number and types of interfaces, and have the same amount of RAM.
Note
The two units do not have to have the same size Flash memory. If using units with different Flash memory sizes in your failover configuration, make sure the unit with the smaller Flash memory has enough space to accommodate the software image files and the configuration files. If it does not, configuration synchronization from the unit with the larger Flash memory to the unit with the smaller Flash memory will fail.
Software Requirements
The two units in a failover configuration must be in the same operating modes (routed or transparent, single or multiple context). They must have the same major (first number) and minor (second number) software version. However, you can use different versions of the software during an upgrade process; for example, you can upgrade one unit from Version 7.0(1) to Version 7.0(2) and have failover remain active. We recommend upgrading both units to the same version to ensure long-term compatibility.
License Requirements
On the PIX security appliance platform, at least one of the units must have an unrestricted (UR) license. The other unit can have a Failover Only (FO) license, a Failover Only Active-Active (FO_AA) license, or another UR license. Units with a Restricted license cannot be used for failover, and two units with FO or FO_AA licenses cannot be used together as a failover pair.
Note
The FO license does not support Active/Active failover.
On the ASA security appliance platform, both units must have the same hardware, software, and licensing to support failover. On the ASA 5505 and ASA 5510, both units must have a Security Plus license to support failover.
The FO and FO_AA licenses are intended to be used solely for units in a failover configuration and not for units in standalone mode. If a failover unit with one of these licenses is used in standalone mode, the unit will reboot at least once every 24 hours until the unit is returned to failover duty. A unit with an FO or FO_AA license operates in standalone mode if it is booted without being connected to a failover peer with a UR license. If the unit with a UR license in a failover pair fails and is removed from the configuration, the unit with the FO or FO_AA license will not automatically reboot every 24 hours; it will operate uninterrupted unless it is manually rebooted.
When the unit automatically reboots, the following message displays on the console:
=========================NOTICE=========================
This machine is running in secondary mode without
a connection to an active primary PIX. Please
check your connection to the primary system.
========================================================
The ASA platform does not have this restriction.
The Failover and Stateful Failover Links
This section describes the failover and the Stateful Failover links, which are dedicated connections between the two units in a failover configuration. This section includes the following topics:
•
Failover Link
•
Stateful Failover Link
Failover Link
The two units in a failover pair constantly communicate over a failover link to determine the operating status of each unit. The following information is communicated over the failover link:
•
The unit state (active or standby).
•
Power status (cable-based failover only—available only on the Cisco PIX security appliance platform).
•
Hello messages (keep-alives).
•
Network link status.
•
MAC address exchange.
•
Configuration replication and synchronization.
Caution 
All information sent over the failover and Stateful Failover links is sent in clear text unless you secure the communication with a failover key. If the security appliance is used to terminate VPN tunnels, this information includes any usernames, passwords and preshared keys used for establishing the tunnels. Transmitting this sensitive data in clear text could pose a significant security risk. We recommend securing the failover communication with a failover key if you are using the security appliance to terminate VPN tunnels.
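As a minimal sketch of the recommendation above, the shared key is set with the failover key command in global configuration mode on the active unit (in the system execution space for multiple context mode). The hostname prompt and the key value are placeholders, not values from this chapter.

```
! Placeholder key; choose your own shared secret.
hostname(config)# failover key Example-Shared-Secret
```

The same key must be in effect on both units for the secured failover communication to work.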
On the PIX security appliance, the failover link can be either a LAN-based connection or a dedicated serial Failover cable. On the ASA platform, the failover link can only be a LAN-based connection.
This section includes the following topics:
•
LAN-Based Failover Link
•
Serial Cable Failover Link (PIX Security Appliance Only)
LAN-Based Failover Link
You can use any unused Ethernet interface on the device as the failover link. You cannot specify an interface that is currently configured with a name. The failover link interface is not configured as a normal networking interface; it exists only for failover communication. This interface should only be used for the failover link (and optionally for the Stateful Failover link). You can connect the LAN-based failover link in the following ways:
•
Using a dedicated switch with no hosts or routers on the link. This is the recommended method.
•
Using a crossover Ethernet cable to link the units directly. This configuration is not recommended. If one of the failover link interfaces fails, both interfaces are marked as failed; the security appliance cannot determine which interface caused the failure.
Note
When using VLANs, use a dedicated VLAN for the failover link. Sharing the failover link VLAN with any other VLANs can cause intermittent traffic problems and ping and ARP failures. If you use a switch to connect the failover link, use dedicated interfaces on the switch and security appliance for the failover link; do not share the interface with subinterfaces carrying regular network traffic.
On systems running in multiple context mode, the failover link resides in the system context. This interface and the Stateful Failover link, if used, are the only interfaces that you can configure in the system context. All other interfaces are allocated to and configured from within security contexts.
Note
The IP address and MAC address for the failover link do not change at failover.
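The following sketch shows one way the LAN-based failover link described in this section might be defined on the primary unit. The interface (Ethernet3), the link name (folink), and the addresses are illustrative assumptions; substitute values that match your network.

```
! Enable the physical interface reserved for the failover link.
hostname(config)# interface Ethernet3
hostname(config-if)# no shutdown
hostname(config-if)# exit
! PIX security appliance only: enable LAN-based failover.
hostname(config)# failover lan enable
! Designate this unit as primary and define the failover link.
hostname(config)# failover lan unit primary
hostname(config)# failover lan interface folink Ethernet3
! Active and standby addresses must be in the same subnet.
hostname(config)# failover interface ip folink 192.168.254.1 255.255.255.0 standby 192.168.254.2
hostname(config)# failover
```

The secondary unit needs the same link definition, with failover lan unit secondary in place of the primary designation, before it can contact the active unit.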
Serial Cable Failover Link (PIX Security Appliance Only)
The serial failover cable, or "cable-based failover," is only available on the PIX security appliance platform. If the two units are within six feet of each other, then we recommend that you use the serial failover cable.
The cable that connects the two units is a modified RS-232 serial link cable that transfers data at 117,760 bps (115 Kbps). One end of the cable is labeled "Primary." The unit attached to this end of the cable automatically becomes the primary unit. The other end of the cable is labeled "Secondary." The unit attached to this end of the cable automatically becomes the secondary unit. You cannot override these designations in the PIX security appliance software. If you purchased a PIX security appliance failover bundle, this cable is included. To order a spare, use part number PIX-FO=.
The benefits of using cable-based failover include the following:
•
The PIX security appliance can immediately detect a power loss on the peer unit, and it can differentiate a power loss from an unplugged cable.
•
The standby unit can communicate with the active unit and can receive the entire configuration without having to be bootstrapped for failover. In LAN-based failover you need to configure the failover link on the standby unit before it can communicate with the active unit.
•
The switch between the two units in LAN-based failover can be another point of hardware failure; cable-based failover eliminates this potential point of failure.
•
You do not have to dedicate an Ethernet interface (and switch) to the failover link.
•
The cable determines which unit is primary and which is secondary, eliminating the need to manually enter that information in the unit configurations.
The disadvantages of using cable-based failover include the following:
•
Distance limitation—the units cannot be separated by more than 6 feet.
•
Slower configuration replication.
Stateful Failover Link
To use Stateful Failover, you must configure a Stateful Failover link to pass all state information. You have three options for configuring a Stateful Failover link:
•
You can use a dedicated Ethernet interface for the Stateful Failover link.
•
If you are using LAN-based failover, you can share the failover link.
•
You can share a regular data interface, such as the inside interface. However, this option is not recommended.
If you use a dedicated Ethernet interface for the Stateful Failover link, you can use either a switch or a crossover cable to directly connect the units. If you use a switch, no other hosts or routers should be on this link.
Note
Enable the PortFast option on Cisco switch ports that connect directly to the security appliance.
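On a Cisco IOS switch, PortFast is typically enabled per port as sketched below. The switch prompt and the port name FastEthernet0/1 are assumptions for illustration; use the port that actually connects to the security appliance.

```
Switch(config)# interface FastEthernet0/1
Switch(config-if)# spanning-tree portfast
```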
If you use a data interface as the Stateful Failover link, you will receive the following warning when you specify that interface as the Stateful Failover link:
******* WARNING ***** WARNING ******* WARNING ****** WARNING *********
Sharing Stateful failover interface with regular data interface is not
a recommended configuration due to performance and security concerns.
******* WARNING ***** WARNING ******* WARNING ****** WARNING *********
Sharing a data interface with the Stateful Failover interface can leave you vulnerable to replay attacks. Additionally, large amounts of Stateful Failover traffic may be sent on the interface, causing performance problems on that network segment.
Note
Using a data interface as the Stateful Failover interface is only supported in single context, routed mode.
In multiple context mode, the Stateful Failover link resides in the system context. This interface and the failover interface are the only interfaces in the system context. All other interfaces are allocated to and configured from within security contexts.
Note
The IP address and MAC address for the Stateful Failover link do not change at failover unless the Stateful Failover link is configured on a regular data interface.
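As a sketch of the first two options listed in this section, the Stateful Failover link can be defined on a dedicated interface or pointed at an existing failover link. The interface (Ethernet2), the link names (statelink, folink), and the addresses are illustrative assumptions.

```
! Option 1: dedicated Ethernet interface for the Stateful Failover link.
hostname(config)# interface Ethernet2
hostname(config-if)# no shutdown
hostname(config-if)# exit
hostname(config)# failover link statelink Ethernet2
hostname(config)# failover interface ip statelink 192.168.253.1 255.255.255.0 standby 192.168.253.2

! Option 2: share an already configured LAN-based failover link
! (here assumed to be named folink); only the name is supplied.
hostname(config)# failover link folink
```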
Caution 
All information sent over the failover and Stateful Failover links is sent in clear text unless you secure the communication with a failover key. If the security appliance is used to terminate VPN tunnels, this information includes any usernames, passwords, and preshared keys used for establishing the tunnels. Transmitting this sensitive data in clear text could pose a significant security risk. We recommend securing the failover communication with a failover key if you are using the security appliance to terminate VPN tunnels.
Failover Interface Speed for Stateful Links
If you use the failover link as the Stateful Failover link, you should use the fastest Ethernet interface available. If you experience performance problems on that interface, consider dedicating a separate interface for the Stateful Failover interface.
Use the following failover interface speed guidelines for Cisco PIX security appliances and Cisco ASA adaptive security appliances:
•
Cisco ASA 5520/5540/5550 and PIX 515E/535
–
The stateful link speed should match the speed of the fastest data link.
•
Cisco ASA 5510 and PIX 525
–
The stateful link speed can be 100 Mbps, even though the data interface can operate at 1 Gbps, because of the CPU speed limitation.
For optimum performance when using long-distance LAN failover, the latency for the failover link should be less than 10 milliseconds and must not exceed 250 milliseconds. If latency is more than 10 milliseconds, some performance degradation occurs due to retransmission of failover messages.
All platforms support sharing of failover heartbeat and stateful link, but we recommend using a separate heartbeat link on systems with high Stateful Failover traffic.
Active/Active and Active/Standby Failover
This section describes each failover configuration in detail. This section includes the following topics:
•
Active/Standby Failover
•
Active/Active Failover
•
Determining Which Type of Failover to Use
Active/Standby Failover
This section describes Active/Standby failover and includes the following topics:
•
Active/Standby Failover Overview
•
Primary/Secondary Status and Active/Standby Status
•
Device Initialization and Configuration Synchronization
•
Command Replication
•
Failover Triggers
•
Failover Actions
Active/Standby Failover Overview
Active/Standby failover lets you use a standby security appliance to take over the functionality of a failed unit. When the active unit fails, it changes to the standby state while the standby unit changes to the active state. The unit that becomes active assumes the IP addresses (or, for transparent firewall, the management IP address) and MAC addresses of the failed unit and begins passing traffic. The unit that is now in standby state takes over the standby IP addresses and MAC addresses. Because network devices see no change in the MAC to IP address pairing, no ARP entries change or time out anywhere on the network.
Note
For multiple context mode, the security appliance can fail over the entire unit (including all contexts) but cannot fail over individual contexts separately.
Primary/Secondary Status and Active/Standby Status
The main differences between the two units in a failover pair are related to which unit is active and which unit is standby, namely which IP addresses to use and which unit actively passes traffic.
However, a few differences exist between the units based on which unit is primary (as specified in the configuration) and which unit is secondary:
•
The primary unit always becomes the active unit if both units start up at the same time (and are of equal operational health).
•
The primary unit MAC address is always coupled with the active IP addresses. The exception to this rule occurs when the secondary unit is active, and cannot obtain the primary MAC address over the failover link. In this case, the secondary MAC address is used.
Device Initialization and Configuration Synchronization
Configuration synchronization occurs when one or both devices in the failover pair boot. Configurations are always synchronized from the active unit to the standby unit. When the standby unit completes its initial startup, it clears its running configuration (except for the failover commands needed to communicate with the active unit), and the active unit sends its entire configuration to the standby unit.
The active unit is determined by the following:
•
If a unit boots and detects a peer already running as active, it becomes the standby unit.
•
If a unit boots and does not detect a peer, it becomes the active unit.
•
If both units boot simultaneously, then the primary unit becomes the active unit and the secondary unit becomes the standby unit.
Note
If the secondary unit boots without detecting the primary unit, it becomes the active unit. It uses its own MAC addresses for the active IP addresses. However, when the primary unit becomes available, the secondary unit changes the MAC addresses to those of the primary unit, which can cause an interruption in your network traffic. To avoid this, configure the failover pair with virtual MAC addresses. See the "Configuring Active/Standby Failover" section for more information.
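A minimal sketch of the virtual MAC address configuration mentioned in the note above, for Active/Standby failover. The interface name and both MAC addresses are placeholders; the first address is the active MAC address and the second is the standby MAC address.

```
hostname(config)# failover mac address Ethernet0 00a0.c969.87c8 00a0.c918.95d8
```

With virtual MAC addresses configured, the active IP addresses keep the same MAC addresses regardless of which unit boots first.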
When the replication starts, the security appliance console on the active unit displays the message "Beginning configuration replication: Sending to mate," and when it is complete, the security appliance displays the message "End Configuration Replication to mate." During replication, commands entered on the active unit may not replicate properly to the standby unit, and commands entered on the standby unit may be overwritten by the configuration being replicated from the active unit. Avoid entering commands on either unit in the failover pair during the configuration replication process. Depending upon the size of the configuration, replication can take from a few seconds to several minutes.
On the standby unit, the configuration exists only in running memory. To save the configuration to Flash memory after synchronization:
•
For single context mode, enter the copy running-config startup-config command on the active unit. The command is replicated to the standby unit, which proceeds to write its configuration to Flash memory.
•
For multiple context mode, enter the copy running-config startup-config command on the active unit from the system execution space and from within each context on disk. The command is replicated to the standby unit, which proceeds to write its configuration to Flash memory. Contexts with startup configurations on external servers are accessible from either unit over the network and do not need to be saved separately for each unit. Alternatively, you can copy the contexts on disk from the active unit to an external server, and then copy them to disk on the standby unit, where they become available when the unit reloads.
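For multiple context mode, the save sequence described above might look like the following on the active unit. The context name ctx1 is hypothetical; repeat the last two commands for each context whose configuration is stored on disk.

```
! Save the system configuration.
hostname# changeto system
hostname# copy running-config startup-config
! Save a context configuration (ctx1 is a placeholder name).
hostname# changeto context ctx1
hostname/ctx1# copy running-config startup-config
```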
Command Replication
Command replication always flows from the active unit to the standby unit. As commands are entered on the active unit, they are sent across the failover link to the standby unit. You do not have to save the active configuration to Flash memory to replicate the commands.
Note
Changes made on the standby unit are not replicated to the active unit. If you enter a command on the standby unit, the security appliance displays the message "**** WARNING **** Configuration Replication is NOT performed from Standby unit to Active unit. Configurations are no longer synchronized." This message displays even when you enter commands that do not affect the configuration.
If you enter the write standby command on the active unit, the standby unit clears its running configuration (except for the failover commands used to communicate with the active unit), and the active unit sends its entire configuration to the standby unit.
For multiple context mode, when you enter the write standby command in the system execution space, all contexts are replicated. If you enter the write standby command within a context, the command replicates only the context configuration.
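The two scopes of the write standby command can be sketched as follows, entered on the active unit. The context name ctx1 is a placeholder.

```
! From the system execution space: all contexts are replicated.
hostname# changeto system
hostname# write standby
! From within a context: only that context configuration is replicated.
hostname# changeto context ctx1
hostname/ctx1# write standby
```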
Replicated commands are stored in the running configuration. To save the replicated commands to the Flash memory on the standby unit:
•
For single context mode, enter the copy running-config startup-config command on the active unit. The command is replicated to the standby unit, which proceeds to write its configuration to Flash memory.
•
For multiple context mode, enter the copy running-config startup-config command on the active unit from the system execution space and within each context on disk. The command is replicated to the standby unit, which proceeds to write its configuration to Flash memory. Contexts with startup configurations on external servers are accessible from either unit over the network and do not need to be saved separately for each unit. Alternatively, you can copy the contexts on disk from the active unit to an external server, and then copy them to disk on the standby unit.
Failover Triggers
The unit can fail if one of the following events occurs:
•
The unit has a hardware failure or a power failure.
•
The unit has a software failure.
•
Too many monitored interfaces fail.
•
The no failover active command is entered on the active unit or the failover active command is entered on the standby unit.
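The manual and interface-based triggers listed above correspond to commands along these lines. The interface name inside and the 20% threshold are illustrative assumptions, not recommended values.

```
! On the active unit: give up the active role.
hostname# no failover active
! Alternatively, on the standby unit: take over the active role.
hostname# failover active

! Monitor an interface and set how many monitored interfaces
! must fail before the unit is considered failed.
hostname(config)# monitor-interface inside
hostname(config)# failover interface-policy 20%
```

The threshold can be given as a percentage, as shown, or as an absolute number of interfaces.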
Failover Actions
In Active/Standby failover, failover occurs on a unit basis. Even on systems running in multiple context mode, you cannot fail over individual contexts or groups of contexts.
Table 11-1 shows the failover action for each failure event. For each failure event, the table shows the failover policy (failover or no failover), the action taken by the active unit, the action taken by the standby unit, and any special notes about the failover condition and actions.
Table 11-1 Failover Behavior
Failure Event | Policy | Active Action | Standby Action | Notes
Active unit failed (power or hardware) | Failover | n/a | Become active; mark active as failed | No hello messages are received on any monitored interface or the failover link.
Formerly active unit recovers | No failover | Become standby | No action | None.
Standby unit failed (power or hardware) | No failover | Mark standby as failed | n/a | When the standby unit is marked as failed, the active unit will not attempt to fail over, even if the interface failure threshold is surpassed.
Failover link failed during operation | No failover | Mark failover interface as failed | Mark failover interface as failed | You should restore the failover link as soon as possible because the unit cannot fail over to the standby unit while the failover link is down.
Failover link failed at startup | No failover | Mark failover interface as failed | Become active | If the failover link is down at startup, both units will become active.
Stateful Failover link failed | No failover | No action | No action | State information will become out of date, and sessions will be terminated if a failover occurs.
Interface failure on active unit above threshold | Failover | Mark active as failed | Become active | None.
Interface failure on standby unit above threshold | No failover | No action | Mark standby as failed | When the standby unit is marked as failed, the active unit will not attempt to fail over, even if the interface failure threshold is surpassed.
Active/Active Failover
This section describes Active/Active failover. This section includes the following topics:
•
Active/Active Failover Overview
•
Primary/Secondary Status and Active/Standby Status
•
Device Initialization and Configuration Synchronization
•
Command Replication
•
Failover Triggers
•
Failover Actions
Active/Active Failover Overview
Active/Active failover is only available to security appliances in multiple context mode. In an Active/Active failover configuration, both security appliances can pass network traffic.
In Active/Active failover, you divide the security contexts on the security appliance into failover groups. A failover group is simply a logical group of one or more security contexts. You can create a maximum of two failover groups on the security appliance. The admin context is always a member of failover group 1, and any unassigned security contexts are also members of failover group 1 by default.
The failover group forms the base unit for failover in Active/Active failover. Interface failure monitoring, failover, and active/standby status are all attributes of a failover group, rather than the unit. When an active failover group fails, it changes to the standby state while the standby failover group becomes active. The interfaces in the failover group that becomes active assume the MAC and IP addresses of the interfaces in the failover group that failed. The interfaces in the failover group that is now in the standby state take over the standby MAC and IP addresses.
Note
A failover group failing on a unit does not mean that the unit has failed. The unit may still have another failover group passing traffic on it.
When creating the failover groups, you should create them on the unit that will have failover group 1 in the active state.
Note
Active/Active failover generates virtual MAC addresses for the interfaces in each failover group. If you have more than one Active/Active failover pair on the same network, it is possible to have the same default virtual MAC addresses assigned to the interfaces on one pair as are assigned to the interfaces of the other pairs because of the way the default virtual MAC addresses are determined. To avoid having duplicate MAC addresses on your network, make sure you assign each physical interface a virtual active and standby MAC address.
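A sketch of assigning virtual MAC addresses within a failover group, as the note above recommends. The interface name and the MAC addresses are placeholders; the first address is the active MAC address and the second is the standby MAC address.

```
hostname(config)# failover group 1
hostname(config-fover-group)# mac address Ethernet0 0000.a000.a011 0000.a000.a012
```

Repeat the mac address command for each physical interface used by contexts in the group, and choose addresses that are unique across all failover pairs on the network.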
Primary/Secondary Status and Active/Standby Status
As in Active/Standby failover, one unit in an Active/Active failover pair is designated the primary unit, and the other unit the secondary unit. Unlike Active/Standby failover, this designation does not indicate which unit becomes active when both units start simultaneously. Instead, the primary/secondary designation determines which unit provides the running configuration to the pair and on which unit each failover group appears in the active state when both start simultaneously.
Each failover group in the configuration is given a primary or secondary unit preference. This preference determines on which unit in the failover pair the contexts in the failover group appear in the active state when both units start simultaneously. You can have both failover groups be in the active state on a single unit in the pair, with the other unit containing the failover groups in the standby state. However, a more typical configuration is to assign each failover group a different role preference to make each one active on a different unit, balancing the traffic across the devices.
Note
The security appliance does not provide load balancing services. Load balancing must be handled by a router passing traffic to the security appliance.
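The typical configuration described above, with each failover group preferring a different unit, might be sketched as follows in the system execution space. The context name ctx1 is hypothetical; the admin context and any unassigned contexts remain in failover group 1.

```
! Failover group 1 prefers the primary unit.
hostname(config)# failover group 1
hostname(config-fover-group)# primary
hostname(config-fover-group)# exit
! Failover group 2 prefers the secondary unit.
hostname(config)# failover group 2
hostname(config-fover-group)# secondary
hostname(config-fover-group)# exit
! Assign a context (placeholder name) to failover group 2.
hostname(config)# context ctx1
hostname(config-ctx)# join-failover-group 2
```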
Device Initialization and Configuration Synchronization
Configuration synchronization occurs when one or both units in a failover pair boot.
When a unit boots while the peer unit is not available, then both failover groups become active on the unit regardless of the primary or secondary designation for the failover groups and the unit. Configuration synchronization does not occur. Some reasons a peer unit may not be available are that the peer unit is powered down, the peer unit is in a failed state, or the failover link between the units has not been established.
When a unit boots while the peer unit is active (with both failover groups active on it), the booting unit contacts the active unit to obtain the running configuration. By default, the failover groups remain active on the active unit regardless of the primary or secondary preference of each failover group and the unit designation. The failover groups remain active on that unit until either a failover occurs or you manually force them to the other unit with the no failover active command. However, if a failover group is configured with the preempt command, the failover group automatically becomes active on its preferred unit when that unit becomes available.
When both units boot at the same time, the primary unit becomes the active unit. The secondary unit obtains the running configuration from the primary unit. Once the configuration has been synchronized, each failover group becomes active on the preferred unit.
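As a sketch of the preempt behavior described in this section, the following configures failover group 2 to move back to the secondary unit whenever that unit becomes available. The choice of group and preference is an assumption for illustration.

```
hostname(config)# failover group 2
hostname(config-fover-group)# secondary
hostname(config-fover-group)# preempt
```

Without the preempt command, the group stays active wherever it currently is until a failover occurs or you move it manually.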
Command Replication
After both units are running, commands are replicated from one unit to the other as follows:
•
Commands entered within a security context are replicated from the unit on which the security context appears in the active state to the peer unit.
Note
A context is considered in the active state on a unit if the failover group to which it belongs is in the active state on that unit.
•
Commands entered in the system execution space are replicated from the unit on which failover group 1 is in the active state to the unit on which failover group 1 is in the standby state.
•
Commands entered in the admin context are replicated from the unit on which failover group 1 is in the active state to the unit on which failover group 1 is in the standby state.
Failure to enter the commands on the appropriate unit for command replication to occur will cause the configurations to be out of synchronization. Those changes may be lost the next time the initial configuration synchronization occurs.
You can use the write standby command to resynchronize configurations that have become out of sync. For Active/Active failover, the write standby command behaves as follows:
•
If you enter the write standby command in the system execution space, the system configuration and the configurations for all of the security contexts on the security appliance are written to the peer unit. This includes configuration information for security contexts that are in the standby state. You must enter the command in the system execution space on the unit that has failover group 1 in the active state.
•
If you enter the write standby command in a security context, only the configuration for the security context is written to the peer unit. You must enter the command in the security context on the unit where the security context appears in the active state.
Replicated commands are not saved to Flash memory when they are replicated to the peer unit. They are added to the running configuration. To save replicated commands to Flash memory on both units, use the write memory or copy running-config startup-config command on the unit on which you made the changes. The command is replicated to the peer unit and causes the configuration to be saved to Flash memory on the peer unit.
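As a sketch of the resynchronization rules above, the following session assumes that failover group 1 is active on the local unit and that a context named ctx1 is also active there. The context name is a hypothetical example; the commands are the ones named in this section.

```text
! Illustrative sketch; lines starting with ! are annotations, not commands.
! The context name ctx1 is a hypothetical example.
!
! System execution space, on the unit where failover group 1 is active:
! writes the system configuration and all context configurations to the peer.
hostname# write standby
!
! Inside one context, on the unit where that context is active:
! writes only that context's configuration to the peer.
hostname# changeto context ctx1
hostname/ctx1# write standby
!
! Save the running configuration to Flash memory on the unit where the
! changes were made; the command is replicated to the peer unit.
hostname/ctx1# write memory
```

Entering write standby on the wrong unit would overwrite the peer with the configuration from the unit where it was entered, so it is worth checking the failover group state first.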
Failover Triggers
In Active/Active failover, failover can be triggered at the unit level if one of the following events occurs:
•
The unit has a hardware failure.
•
The unit has a power failure.
•
The unit has a software failure.
•
The no failover active or the failover active command is entered in the system execution space.
Failover is triggered at the failover group level when one of the following events occurs:
•
Too many monitored interfaces in the group fail.
•
The no failover active group group_id command is entered.
You configure the failover threshold for each failover group by specifying the number or percentage of interfaces within the failover group that must fail before the group fails. Because a failover group can contain multiple contexts, and each context can contain multiple interfaces, it is possible for all interfaces in a single context to fail without causing the associated failover group to fail.
See the "Failover Health Monitoring" section for more information about interface and unit monitoring.
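The group-level threshold and the manual trigger described above might be entered as follows. This is a minimal sketch that assumes the interface-policy command of failover group configuration mode; the group number and the values 2 and 50% are arbitrary examples.

```text
! Illustrative sketch; lines starting with ! are annotations, not commands.
! Group number and threshold values are assumptions.
hostname(config)# failover group 1
! Fail the group when 2 monitored interfaces in it have failed ...
hostname(config-fover-group)# interface-policy 2
! ... or, as an alternative, when 50% of them have failed.
hostname(config-fover-group)# interface-policy 50%
hostname(config-fover-group)# exit
hostname(config)# exit
!
! Manually trigger failover of group 1 on the unit where it is active.
hostname# no failover active group 1
```

Only one interface-policy setting applies at a time; the second form replaces the first.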
Failover Actions
In an Active/Active failover configuration, failover occurs on a failover group basis, not a system basis. For example, if you designate both failover groups as active on the primary unit, and failover group 1 fails, then failover group 2 remains active on the primary unit while failover group 1 becomes active on the secondary unit.
Note
When configuring Active/Active failover, make sure that the combined traffic for both units is within the capacity of each unit.
Table 11-2 shows the failover action for each failure event. For each failure event, the policy (whether or not failover occurs), actions for the active failover group, and actions for the standby failover group are given.
Table 11-2 Failover Behavior for Active/Active Failover
Failure Event | Policy | Active Group Action | Standby Group Action | Notes
A unit experiences a power or software failure | Failover | Become standby; mark as failed | Become active; mark active as failed | When a unit in a failover pair fails, any active failover groups on that unit are marked as failed and become active on the peer unit.
Interface failure on active failover group above threshold | Failover | Mark active group as failed | Become active | None.
Interface failure on standby failover group above threshold | No failover | No action | Mark standby group as failed | When the standby failover group is marked as failed, then the active failover group will not attempt to fail over, even if the interface failure threshold is surpassed.
Formerly active failover group recovers | No failover | No action | No action | Unless configured with the preempt command, the failover groups remain active on their current unit.
Failover link failed at startup | No failover | Become active | Become active | If the failover link is down at startup, both failover groups on both units will become active.
Stateful Failover link failed | No failover | No action | No action | State information will become out of date, and sessions will be terminated if a failover occurs.
Failover link failed during operation | No failover | n/a | n/a | Each unit marks the failover interface as failed. You should restore the failover link as soon as possible because the unit cannot fail over to the standby unit while the failover link is down.
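The table notes that a recovered failover group does not reclaim the active role unless preemption is configured. A minimal sketch of that setting follows, assuming the primary and preempt commands of failover group configuration mode; the group number is an example only.

```text
! Illustrative sketch; lines starting with ! are annotations, not commands.
! The group number is an assumption.
hostname(config)# failover group 1
! Prefer the primary unit for this group.
hostname(config-fover-group)# primary
! Let the group become active on its preferred unit again when that unit
! becomes available.
hostname(config-fover-group)# preempt
hostname(config-fover-group)# exit
```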
Determining Which Type of Failover to Use
The type of failover you choose depends upon your security appliance configuration and how you plan to use the security appliances.
If you are running the security appliance in single mode, then you can only use Active/Standby failover. Active/Active failover is only available to security appliances running in multiple context mode.
If you are running the security appliance in multiple context mode, then you can configure either Active/Active failover or Active/Standby failover.
•
To provide load balancing, use Active/Active failover.
•
If you do not want to provide load balancing, use Active/Standby or Active/Active failover.
Table 11-3 provides a comparison of some of the features supported by each type of failover configuration:
Table 11-3 Failover Configuration Feature Support
Feature | Active/Active | Active/Standby
Single Context Mode | No | Yes
Multiple Context Mode | Yes | Yes
Load Balancing Network Configurations | Yes | No
Unit Failover | Yes | Yes
Failover of Groups of Contexts | Yes | No
Failover of Individual Contexts | No | No
Regular and Stateful Failover
The security appliance supports two types of failover, regular and stateful. This section includes the following topics:
•
Regular Failover
•
Stateful Failover
Regular Failover
When a failover occurs, all active connections are dropped. Clients need to reestablish connections when the new active unit takes over.
Stateful Failover
When Stateful Failover is enabled, the active unit continually passes per-connection state information to the standby unit. After a failover occurs, the same connection information is available at the new active unit. Supported end-user applications are not required to reconnect to keep the same communication session.
The state information passed to the standby unit includes the following:
•
NAT translation table.
•
TCP connection states.
•
UDP connection states.
•
The ARP table.
•
The Layer 2 bridge table (when running in transparent firewall mode).
•
The HTTP connection states (if HTTP replication is enabled).
•
The ISAKMP and IPSec SA table.
•
GTP PDP connection database.
The information that is not passed to the standby unit when Stateful Failover is enabled includes the following:
•
The HTTP connection table (unless HTTP replication is enabled).
•
The user authentication (uauth) table.
•
The routing tables. After a failover occurs, some packets may be lost or routed out of the wrong interface (the default route) while the dynamic routing protocols rediscover routes.
•
State information for Security Service Modules.
Note
If failover occurs during an active Cisco IP SoftPhone session, the call will remain active because the call session state information is replicated to the standby unit. When the call is terminated, the IP SoftPhone client will lose connection with the Call Manager. This occurs because there is no session information for the CTIQBE hangup message on the standby unit. When the IP SoftPhone client does not receive a response back from the Call Manager within a certain time period, it considers the Call Manager unreachable and unregisters itself.
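HTTP connection state appears in both lists above, depending on whether HTTP replication is enabled. As a hedged illustration, the setting is assumed here to be turned on with the failover replication http command in global configuration mode; confirm the syntax against the optional failover settings for your release.

```text
! Illustrative sketch; lines starting with ! are annotations, not commands.
! Include HTTP connection state in Stateful Failover replication.
hostname(config)# failover replication http
```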
Failover Health Monitoring
The security appliance monitors each unit for overall health and for interface health. See the following sections for more information about how the security appliance performs tests to determine the state of each unit:
•
Unit Health Monitoring
•
Interface Monitoring
Unit Health Monitoring
The security appliance determines the health of the other unit by monitoring the failover link. When a unit does not receive hello messages on the failover link, then the unit sends an ARP request on all interfaces, including the failover interface. The security appliance retries a user-configurable number of times. The action the security appliance takes depends on the response from the other unit. See the following possible actions:
•
If the security appliance receives a response on any interface, then it does not fail over.
•
If the security appliance does not receive a response on any interface, then the standby unit switches to active mode and classifies the other unit as failed.
•
If the security appliance does not receive a response on the failover link only, then the unit does not fail over. The failover link is marked as failed. You should restore the failover link as soon as possible because the unit cannot fail over to the standby while the failover link is down.
Note
If a failed unit does not recover and you believe it should not be failed, you can reset the state by entering the failover reset command. If the failover condition persists, however, the unit will fail again.
Interface Monitoring
You can monitor up to 250 interfaces divided between all contexts. You should monitor important interfaces. For example, you might configure one context to monitor a shared interface (because the interface is shared, all contexts benefit from the monitoring).
When a unit does not receive hello messages on a monitored interface, it runs the following tests:
1.
Link Up/Down test—A test of the interface status. If the Link Up/Down test indicates that the interface is operational, then the security appliance performs network tests. The purpose of these tests is to generate network traffic to determine which (if either) unit has failed. At the start of each test, each unit clears its received packet count for its interfaces. At the conclusion of each test, each unit looks to see if it has received any traffic. If it has, the interface is considered operational. If one unit receives traffic for a test and the other unit does not, the unit that received no traffic is considered failed. If neither unit has received traffic, then the next test is used.
2.
Network Activity test—A received network activity test. The unit counts all received packets for up to 5 seconds. If any packets are received at any time during this interval, the interface is considered operational and testing stops. If no traffic is received, the ARP test begins.
3.
ARP test—A reading of the unit ARP cache for the 2 most recently acquired entries. One at a time, the unit sends ARP requests to these machines, attempting to stimulate network traffic. After each request, the unit counts all received traffic for up to 5 seconds. If traffic is received, the interface is considered operational. If no traffic is received, an ARP request is sent to the next machine. If at the end of the list no traffic has been received, the ping test begins.
4.
Broadcast Ping test—A ping test that consists of sending out a broadcast ping request. The unit then counts all received packets for up to 5 seconds. If any packets are received at any time during this interval, the interface is considered operational and testing stops.
If all network tests fail for an interface, but this interface on the other unit continues to successfully pass traffic, then the interface is considered to be failed. If the threshold for failed interfaces is met, then a failover occurs. If the other unit interface also fails all the network tests, then both interfaces go into the "Unknown" state and do not count towards the failover limit.
An interface becomes operational again if it receives any traffic. A failed security appliance returns to standby mode if the interface failure threshold is no longer met.
Note
If a failed unit does not recover and you believe it should not be failed, you can reset the state by entering the failover reset command. If the failover condition persists, however, the unit will fail again.
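A short sketch of the commands involved in interface monitoring follows. The interface name inside is an assumption chosen for illustration, and the monitor-interface and show monitor-interface commands are assumed to be available as in the optional failover settings. The failover reset command is the one named in the note above.

```text
! Illustrative sketch; lines starting with ! are annotations, not commands.
! The interface name "inside" is a hypothetical example.
hostname(config)# monitor-interface inside
hostname(config)# exit
! Review which interfaces are monitored and their current state.
hostname# show monitor-interface
! After correcting the fault, clear the failed state.
hostname# failover reset
```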
Configuring Failover
This section describes how to configure failover and includes the following topics:
•
Configuring Active/Standby Failover
•
Configuring Active/Active Failover
•
Configuring Failover Communication Authentication/Encryption
•
Verifying the Failover Configuration
Configuring Active/Standby Failover
This section provides step-by-step procedures for configuring Active/Standby failover. This section includes the following topics:
•
Prerequisites
•
Configuring Cable-Based Active/Standby Failover (PIX Security Appliance Only)
•
Configuring LAN-Based Active/Standby Failover
•
Configuring Optional Active/Standby Failover Settings
See the "Failover Configuration Examples" section for examples of typical failover configurations.
Prerequisites
Before you begin, verify the following:
•
Both units have the same hardware, software configuration, and proper license.
•
Both units are in the same mode (single or multiple, transparent or routed).
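One way to check these prerequisites is to compare the output of a few show commands on both units. The sketch below lists only the commands. No output is shown because it varies by platform and release, and which fields to compare is left to the reader.

```text
! Illustrative sketch; lines starting with ! are annotations, not commands.
! Run on each unit and compare the results.
!
! Hardware model, software version, and licensed features.
hostname# show version
! Security context mode (single or multiple).
hostname# show mode
! Firewall mode (routed or transparent).
hostname# show firewall
```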
Configuring Cable-Based Active/Standby Failover (PIX Security Appliance Only)
Follow these steps to configure Active/Standby failover using a serial cable as the failover link. The commands in this task are entered on the primary unit in the failover pair. The primary unit is the unit that has the end of the cable labeled "Primary" plugged into it. For devices in multiple context mode, the commands are entered in the system execution space unless otherwise noted.
You do not need to bootstrap the secondary unit in the failover pair when you use cable-based failover. Leave the secondary unit powered off until instructed to power it on.
Cable-based failover is only available on the PIX security appliance platform.
To configure cable-based Active/Standby failover, perform the following steps:
Step 1
Connect the Failover cable to the PIX security appliances. Make sure that you attach the end of the cable marked "Primary" to the unit you use as the primary unit, and that you attach the end of the cable marked "Secondary" to the other unit.
Step 2
Power on the primary unit.
Step 3
If you have not done so already, configure the active and standby IP addresses for each interface (routed mode) or for the management interface (transparent mode). The standby IP address is used on the security appliance that is currently the standby unit, and it must be in the same subnet as the active IP address. To receive packets from both units in a failover pair, standby IP addresses need to be configured on all interfaces.
Note
Do not configure an IP address for the Stateful Failover link if you are going to use a dedicated Stateful Failover interface. You use the failover interface ip command to configure a dedicated Stateful Failover interface in a later step.
hostname(config-if)# ip address active_addr netmask standby standby_addr
Note
In multiple context mode, you must configure the interface addresses from within each context. Use the changeto context command to switch between contexts. The command prompt changes to hostname/context(config-if)#, where context is the name of the current context.
Step 4
(Optional) To enable Stateful Failover, configure the Stateful Failover link.
a.
Specify the interface to be used as the Stateful Failover link:
hostname(config)# failover link if_name phy_if
The if_name argument assigns a logical name to the interface specified by the phy_if argument. The phy_if argument can be the physical port name, such as Ethernet1, or a previously created subinterface, such as Ethernet0/2.3. This interface should not be used for any other purpose.
b.
Assign an active and standby IP address to the Stateful Failover link:
hostname(config)# failover interface ip if_name ip_addr mask standby ip_addr
Note
If the Stateful Failover link uses a data interface, skip this step. You have already defined the active and standby IP addresses for the interface.
The standby IP address must be in the same subnet as the active IP address. You do not need to identify the standby IP address subnet mask.
The Stateful Failover link IP address and MAC address do not change at failover unless the Stateful Failover link uses a data interface. The active IP address always stays with the primary unit, while the standby IP address stays with the secondary unit.
c.
Enable the interface:
hostname(config)# interface phy_if
hostname(config-if)# no shutdown
Step 5
Enable failover:
hostname(config)# failover
Step 6
Power on the secondary unit and enable failover on the unit if it is not already enabled:
hostname(config)# failover
The active unit sends the configuration in running memory to the standby unit. As the configuration synchronizes, the messages "Beginning configuration replication: sending to mate." and "End Configuration Replication to mate" appear on the primary console.
Step 7
Save the configuration to Flash memory on the primary unit. Because the commands entered on the primary unit are replicated to the secondary unit, the secondary unit also saves its configuration to Flash memory.
hostname(config)# copy running-config startup-config
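Steps 3 through 7 can be sketched as a single session on the primary unit. All interface names and addresses below (Ethernet1, Ethernet2, the link name state, and the 192.168.x.x networks) are assumptions for illustration; substitute your own values. The sketch assumes routed, single context mode and a dedicated Stateful Failover interface.

```text
! Illustrative sketch; lines starting with ! are annotations, not commands.
! Interface names and addresses are assumptions.
!
! Step 3: active and standby addresses on a data interface (same subnet).
hostname(config)# interface Ethernet1
hostname(config-if)# ip address 192.168.1.1 255.255.255.0 standby 192.168.1.2
hostname(config-if)# exit
! Step 4 (optional): dedicated Stateful Failover link named "state".
hostname(config)# failover link state Ethernet2
hostname(config)# failover interface ip state 192.168.253.1 255.255.255.0 standby 192.168.253.2
hostname(config)# interface Ethernet2
hostname(config-if)# no shutdown
hostname(config-if)# exit
! Step 5: enable failover.
hostname(config)# failover
! Step 6: power on the secondary unit and wait for replication to finish.
! Step 7: save the configuration to Flash memory.
hostname(config)# copy running-config startup-config
```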
Configuring LAN-Based Active/Standby Failover
This section describes how to configure Active/Standby failover using an Ethernet failover link. When configuring LAN-based failover, you must bootstrap the secondary device to recognize the failover link before the secondary device can obtain the running configuration from the primary device.
Note
If you are changing from cable-based failover to LAN-based failover, you can skip any steps, such as assigning the active and standby IP addresses for each interface, that you completed for the cable-based failover configuration.
This section includes the following topics:
•
Configuring the Primary Unit
•
Configuring the Secondary Unit
Configuring the Primary Unit
Follow these steps to configure the primary unit in a LAN-based, Active/Standby failover configuration. These steps provide the minimum configuration needed to enable failover on the primary unit. For multiple context mode, all steps are performed in the system execution space unless otherwise noted.
To configure the primary unit in an Active/Standby failover pair, perform the following steps:
Step 1
If you have not done so already, configure the active and standby IP addresses for each interface (routed mode) or for the management interface (transparent mode). The standby IP address is used on the security appliance that is currently the standby unit, and it must be in the same subnet as the active IP address. To receive packets from both units in a failover pair, standby IP addresses need to be configured on all interfaces.
Note
Do not configure an IP address for the Stateful Failover link if you are going to use a dedicated Stateful Failover interface. You use the failover interface ip command to configure a dedicated Stateful Failover interface in a later step.
hostname(config-if)# ip address active_addr netmask standby standby_addr
Note
In multiple context mode, you must configure the interface addresses from within each context. Use the changeto context command to switch between contexts. The command prompt changes to hostname/context(config-if)#, where context is the name of the current context.
Step 2
(PIX security appliance platform only) Enable LAN-based failover.
hostname(config)# failover lan enable
Step 3
Designate the unit as the primary unit.
hostname(config)# failover lan unit primary
Step 4
Define the failover interface.
a.
Specify the interface to be used as the failover interface.
hostname(config)# failover lan interface if_name phy_if
The if_name argument assigns a name to the interface specified by the phy_if argument. The phy_if argument can be the physical port name, such as Ethernet1, or a previously created subinterface, such as Ethernet0/2.3.
b.
Assign the active and standby IP address to the failover link.
hostname(config)# failover interface ip if_name ip_addr mask standby ip_addr
The standby IP address must be in the same subnet as the active IP address. You do not need to identify the standby address subnet mask.
The failover link IP address and MAC address do not change at failover. The active IP address for the failover link always stays with the primary unit, while the standby IP address stays with the secondary unit.
c.
Enable the interface.
hostname(config)# interface phy_if
hostname(config-if)# no shutdown
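Steps 2 through 4 might look like the following on a PIX security appliance. The link name folink, the port Ethernet3, and the 192.168.254.x addresses are assumptions for illustration only.

```text
! Illustrative sketch; lines starting with ! are annotations, not commands.
! The link name, port, and addresses are assumptions.
!
! Step 2 (PIX security appliance platform only): enable LAN-based failover.
hostname(config)# failover lan enable
! Step 3: designate this unit as the primary unit.
hostname(config)# failover lan unit primary
! Step 4a: name the failover interface.
hostname(config)# failover lan interface folink Ethernet3
! Step 4b: active and standby addresses in the same subnet.
hostname(config)# failover interface ip folink 192.168.254.1 255.255.255.0 standby 192.168.254.2
! Step 4c: enable the physical interface.
hostname(config)# interface Ethernet3
hostname(config-if)# no shutdown
hostname(config-if)# exit
```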
Step 5
(Optional) To enable Stateful Failover, configure the Stateful Failover link.
a.
Specify the interface to be used as the Stateful Failover link.
hostname(config)# failover link if_name phy_if
Note
If the Stateful Failover link uses the failover link or a data interface, then you only need to supply the if_name argument.
The if_name argument assigns a logical name to the interface specified by the phy_if argument. The phy_if argument can be the physical port name, such as Ethernet1, or a previously created subinterface, such as Ethernet0/2.3. This interface should not be used for any other purpose (except, optionally, the failover link).
b.
Assign an active and standby IP address to the Stateful Failover link.
Note
If the Stateful Failover link uses the failover link or a data interface, skip this step. You have already defined the active and standby IP addresses for the interface.
hostname(config)# failover interface ip if_name ip_addr mask standby ip_addr