Advanced Services Case Study on DLSw+ Ethernet Redundancy

Table Of Contents

White Paper

Introduction

Topology Description

DLSw+ Partial Mesh

Topology

Normal Operation

Test Results

Fault Tolerance

Configuration

Caveats

DLSw+ Full Mesh

Topology

Test Results

Configuration

Caveats

Mapping SNA Resources

Topology

Test Results

Configurations

Caveats

Adding Hub Technology

Topology

Test Results

Caveats

Conclusion

Appendix A: Configurations-Session Generators

Initiator

Host


Introduction

This document describes the options available to customers who wish to implement Data-Link Switching Plus (DLSw+) Ethernet Redundancy. Customer environments often build in several levels of redundancy (particularly in campus networks), and this, combined with the nature of switched Ethernet technology, creates an inherent problem when SNA clients connect over DLSw+ with multiple paths to the host. SNA devices historically ran across Token Ring (via source-route bridging), and multiple paths were possible by means of the Routing Information Field (RIF). The RIF is a field in the Token Ring frame that plots the "route" (bridge number, ring number) from a source to a destination, providing a unique series of coordinates. Transparent bridging does not use a RIF, so it is impossible to tell whether a frame has already traversed a particular domain, which creates an opportunity for loops.

For Cisco IOS releases prior to 12.0, the recommendation for multiple paths over DLSw+ (over Ethernet) was to use a single primary peer with a secondary peer as backup. This meant that there would be only one active peer at any one time, preventing any looping conditions. This paper covers the solutions available in Release 12.1 and later.

Topology Description

Figure 1 shows a typical customer topology when deploying DLSw+ in a switched Ethernet environment.

Figure 1 Typical DLSw+ over Ethernet Topology

In this example, the remote site usually has two Cisco Catalyst switches (for redundancy) connecting to two devices running Cisco IOS Software, which could be either multilayer switch feature card (MSFC) technology or external routers. Two front-end processors (FEPs) share the load across all the remote sites. Host connectivity in the data center could be provided by a FEP or Channel Interface Processor (CIP), and the medium could be Token Ring, Fiber Distributed Data Interface (FDDI), or even switched Ethernet, although DLSw+ Ethernet Redundancy would also need to be employed.

For the customer to perform guaranteed load balancing via DLSw+ Ethernet Redundancy, a major piece of work is required: either all the SNA clients need to have their destination media access control (MAC) address altered, or the FEP needs to undergo a MAC address change.

The following sections detail some design alternatives with associated caveats, which may lessen the pain, particularly if the customer has a strategy to migrate from SNA to IP and considers the use of DLSw+ Ethernet Redundancy as an interim phase.

DLSw+ Partial Mesh

A partial mesh is considered good practice in typical DLSw+ topologies because it limits the risk of looping explorers and avoids added complexity. This section describes the topology and configuration for this design and the issues that may be encountered.

Topology

The topology is relatively simple: the DLSw+ Ethernet Redundancy routers (master/slave) are considered as the remote devices, whereas the central site routers locally bridge to the SNA host. There are single peer relationships on each of the DLSw+ pairings (see Figure 2).

Figure 2 DLSw+ Partial Mesh Design

Normal Operation

In this instance the Ethernet Redundancy master router is at the remote site, and an initial test saw all circuits connect via the slave router, as shown in the following output:


sh dls trans nei

Interface Et2/0
    0080.6935.290f  SELF                     Master
    0080.6973.0589  Rcvd Master-Accepted     VALID

The master router is identified by the Master keyword beside its MAC address. On the slave router, the following output can be seen:


sh dls trans nei

Interface Et0/1
    0080.42e6.3085  SELF                     Slave
    0080.4234.8881  Connected                MASTER.

The master/slave relationship is determined by the master-priority parameter, or by the lowest MAC address in the event of a tie. Note that election and participation are possible only when the DLSw+ Ethernet Redundancy routers are in the same multicast group.
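The election rule described above can be illustrated with a minimal Python sketch. It is not Cisco's implementation. It assumes that a numerically lower master-priority value wins (consistent with the sample configurations, where the master is given master-priority 1) and that an unconfigured router uses a default of 100; both points are assumptions to verify against the command reference. The names Router and elect_master are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Router:
    mac: str                     # interface MAC, for example "0080.6935.290f"
    master_priority: int = 100   # ASSUMPTION: default value; lower is assumed to win


def mac_to_int(mac: str) -> int:
    """Convert a dotted MAC string to an integer so MACs compare numerically."""
    return int(mac.replace(".", ""), 16)


def elect_master(routers):
    """Pick the master: best (lowest) priority first, lowest MAC as tie-breaker."""
    return min(routers, key=lambda r: (r.master_priority, mac_to_int(r.mac)))


# MAC addresses taken from the neighbor output above.
group = [
    Router("0080.6973.0589"),                     # no explicit priority
    Router("0080.6935.290f", master_priority=1),  # configured to win
]
print(elect_master(group).mac)
```

With equal priorities, the same function falls back to the lowest MAC address, which mirrors the tie-break rule in the text.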

The circuits connected to all DLSw+ routers in the multicast group can be viewed from the master, as follows:


nsa-voip-3661-2#sh dls trans c

Interface Et2/0
 Circuit Cache

local addr(lsap)    remote addr(dsap)  state          Owner
0000.6c04.0200(04)  0000.6c04.0000(04) NEGATIVE        0080.6973.0589
0000.6c04.0200(08)  0000.6c04.0000(04) NEGATIVE        0080.6973.0589
0000.6c04.0200(0C)  0000.6c04.0000(04) NEGATIVE        0080.6973.0589
0000.6c04.0200(10)  0000.6c04.0000(04) NEGATIVE        0080.6973.0589
0000.6c04.0200(14)  0000.6c04.0000(04) NEGATIVE        0080.6973.0589
Total number of circuits in the Cache: 5

In the previous output, there are five circuits connected to all the routers in the multicast group, and their state is negative. This means that the circuit is not attached to the master router, and the owner field identifies the MAC address of the router attached to these circuits.

If the same command is performed on the slave router, the following output can be seen:


nsa-voip-3661-1#sh dls trans cache

Interface Et3/0
 Circuit Cache

local addr(lsap)    remote addr(dsap)  state          Owner
0000.6c04.0200(04)  0000.6c04.0000(04) POSITIVE        SELF
0000.6c04.0200(08)  0000.6c04.0000(04) POSITIVE        SELF
0000.6c04.0200(0C)  0000.6c04.0000(04) POSITIVE        SELF
0000.6c04.0200(10)  0000.6c04.0000(04) POSITIVE        SELF
0000.6c04.0200(14)  0000.6c04.0000(04) POSITIVE        SELF
Total number of circuits in the Cache: 5
nsa-voip-3661-1#

The state and ownership shown on the slave identify that all the circuits are connected via this router. Note that if the master owned any of the circuits, they would not be seen by the slave router. The circuits connected to all routers in the same multicast group can be viewed from the master only.
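When many circuits are involved, the circuit cache output can be summarized per owner. The following Python sketch parses text in the layout shown above; the sample rows are copied from the master router output in this section, and the helper name circuits_by_owner is hypothetical. It assumes the column layout does not vary between releases.

```python
import re
from collections import defaultdict

# Rows copied from the master router output earlier in this section.
SAMPLE = """\
local addr(lsap)    remote addr(dsap)  state          Owner
0000.6c04.0200(04)  0000.6c04.0000(04) NEGATIVE        0080.6973.0589
0000.6c04.0200(08)  0000.6c04.0000(04) NEGATIVE        0080.6973.0589
0000.6c04.0200(0C)  0000.6c04.0000(04) NEGATIVE        0080.6973.0589
0000.6c04.0200(10)  0000.6c04.0000(04) NEGATIVE        0080.6973.0589
0000.6c04.0200(14)  0000.6c04.0000(04) NEGATIVE        0080.6973.0589
Total number of circuits in the Cache: 5
"""

MAC = r"[0-9A-Fa-f]{4}(?:\.[0-9A-Fa-f]{4}){2}"
SAP = r"\(([0-9A-Fa-f]{2})\)"
# Groups: 1 local MAC, 2 LSAP, 3 remote MAC, 4 DSAP, 5 state, 6 owner
ROW = re.compile(
    r"^(" + MAC + ")" + SAP + r"\s+(" + MAC + ")" + SAP + r"\s+(\w+)\s+(\S+)\s*$"
)


def circuits_by_owner(text):
    """Return {owner: [lsap, ...]} for every circuit row found in the text."""
    owners = defaultdict(list)
    for line in text.splitlines():
        match = ROW.match(line.strip())
        if match:  # header and summary lines do not match and are skipped
            owners[match.group(6)].append(match.group(2))
    return dict(owners)


print(circuits_by_owner(SAMPLE))
```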

Test Results

The previous discussion has dealt with normal operation. This section covers tests performed in the Advanced Services laboratory. The first test performed was to identify how load balancing could be achieved, so the CAM table on the Catalyst 6500 Series switch was cleared (via the command clear cam dyn), and the DLSw+ circuits were reset. The following debug details the activity after the command was issued:


Jul 19 12:28:06.437: CSM: update local cache for mac 0000.6c04.0200, Ethernet2/0
Jul 19 12:28:06.437: CSM: Received CLSI Msg : TEST_STN.Ind   dlen: 40 from Ethernet2/0
Jul 19 12:28:06.437: CSM: smac 0000.6c04.0200, dmac 0000.6c04.0000, ssap 14, dsap 0
Jul 19 12:28:06.437: DLSW+:Ethernet2/0 I d=0000.6c04.0000-0 s=0000.6c04.0200-14

Jul 19 12:28:06.437: CSM: test_frame_proc: ws_status = NO_CACHE_INFO
Jul 19 12:28:06.437: CSM: Write to all peers ok.
Jul 19 12:28:06.437: CSM: adding new icr pend record - test_frame_proc
Jul 19 12:28:06.437: CSM: update local cache for mac 0000.6c04.0200, Ethernet2/0

In this output, the Test frame (TEST_STN.Ind) can be seen entering the DLSw+ router on Ethernet2/0. The DLSw+ router checks its reachability cache and realizes that it is empty (ws_status = NO_CACHE_INFO) before sending to all available peers (Write to all peers ok). This operation was repeated for the remaining Test frames.

The next stage shows the return of the ICR (ICANREACH) frames to the slave router. As a result, the slave sent an IWANTIT (IW) message to the master. In the following message, the MAC address of the slave appears in front of the source MAC (SMAC)/source service access point (SSAP) and destination MAC (DMAC)/destination SAP (DSAP) pairing:


Jul 19 12:28:06.485: DLSW-ER:Et2/0:dm_action_j: Rcvd IW <- 0080.6973.0589 0000.6c04.0000:4 0000.6c04.0200:14

At this point, with the responses coming back from the remote DLSw+ peers, both DLSw+ Ethernet Redundancy routers started communicating with each other to agree on ownership of the circuits coming in:


Jul 19 12:28:07.503: CSM: CanUReach-CS frame...cache status is FOUND
Jul 19 12:28:07.503: DLSW-ER:Et0/1:CSM->MS: C_INQ:NEW: 0000.6c04.0200:14 0000.6c04.0000:4
Jul 19 12:28:07.507: DLSW-ER:Et0/1:IW -> 0080.4234.8881: 0000.6c04.0200:14 0000.6c04.0000:4
Jul 19 12:28:07.507: DLSW-ER:Et0/1:CSM->MS: IW:PENDING: 0000.6c04.0200:14 0000.6c04.0000:4
Jul 19 12:28:07.511: DLSW-ER:Et0/1:dm_action_k: Rcvd UG for 0000.6c04.0200:14 
0000.6c04.0000:4: CSM: update local cache for mac 0000.6c04.0200, Ethernet2/0
Jul 19 12:28:07.457: CSM: Received CLSI Msg : TEST_STN.Ind   dlen: 40 from Ethernet2/0
Jul 19 12:28:07.457: CSM:   smac 0000.6c04.0200, dmac 0000.6c04.0000, ssap 10, dsap 0
Jul 19 12:28:07.457: DLSW+:Ethernet2/0 I d=0000.6c04.0000-0 s=0000.6c04.0200-10
Jul 19 12:28:07.457: CSM: test_frame_proc: ws_status = FOUND
Jul 19 12:28:07.457: CSM: sending TEST to Ethernet2/0
Jul 19 12:28:07.457: CSM: update local cache for mac 0000.6c04.0200, Ethernet2/0
Jul 19 12:28:07.457: CSM: Received CLSI Msg : TEST_STN.Ind   dlen: 40 from Ethernet2/0
Jul 19 12:28:07.457: CSM:   smac 0000.6c04.0200, dmac 0000.6c04.0000, ssap C , dsap 0
Jul 19 12:28:07.457: DLSW+: Ethernet2/0 I d=0000.6c04.0000-0 s=0000.6c04.0200-C

Interestingly, the first ICR responses were coming via the slave router, in this case SSAP ID of 14. The CAM table for the switch had cached the entry of the DMAC address pointing to the port of the slave router. The slave router requested the circuit via the IW (IWANTIT) activity. However, as the dialogue continued, the master router was already in the process of sending explorer frames and started receiving responses also:


Jul 19 12:28:07.461: CSM: Received CLSI Msg : ID_STN.Ind   dlen: 48 from Ethernet2/0
Jul 19 12:28:07.461: CSM:   smac 0000.6c04.0200, dmac 0000.6c04.0000, ssap 8 , dsap 4
Jul 19 12:28:07.461: CSM: new_connection: ws_status = FOUND
Jul 19 12:28:07.461: DLSW-ER:Et2/0:CSM->MS: C_INQ:NEW: 0000.6c04.0000:4 0000.6c04.0200:8
Jul 19 12:28:07.461: DLSW-ER:Et2/0:CSM->MS: IW:PENDING: 0000.6c04.0000:4 0000.6c04.0200:8
Jul 19 12:28:07.461: CSM: Received CLSI Msg : ID_STN.Ind   dlen: 48 from Ethernet2/0
Jul 19 12:28:07.461: CSM:   smac 0000.6c04.0200, dmac 0000.6c04.0000, ssap 10, dsap 4
Jul 19 12:28:07.461: CSM: new_connection: ws_status = FOUND
Jul 19 12:28:07.461: DLSW-ER:Et2/0:CSM->MS: C_INQ:NEW: 0000.6c04.0000:4 0000.6c04.0200:10
Jul 19 12:28:07.461: DLSW-ER:Et2/0:CSM->MS: IW:PENDING: 0000.6c04.0000:4 0000.6c04.0200:10
Jul 19 12:28:07.511: CSM: Calling csm_to_core with SSP_START_NEWDL - dlsw_csm_er_handler
Jul 19 12:28:07.461: CSM: Received CLSI Msg : ID_STN.Ind   dlen: 48 from Ethernet2/0
Jul 19 12:28:07.461: CSM:   smac 0000.6c04.0200, dmac 0000.6c04.0000, ssap C , dsap 4
Jul 19 12:28:07.461: CSM: new_connection: ws_status = FOUND
Jul 19 12:28:07.461: DLSW-ER:Et2/0:CSM->MS: C_INQ:NEW: 0000.6c04.0000:4 0000.6c04.0200:C
Jul 19 12:28:07.461: DLSW-ER:Et2/0:CSM->MS: IW:PENDING: 0000.6c04.0000:4 0000.6c04.0200:C
Jul 19 12:28:07.461: CSM: Received CLSI Msg : ID_STN.Ind   dlen: 48 from Ethernet2/0
Jul 19 12:28:07.461: CSM:   smac 0000.6c04.0200, dmac 0000.6c04.0000, ssap 4 , dsap 4
Jul 19 12:28:07.461: CSM: new_connection: ws_status = FOUND
Jul 19 12:28:07.461: DLSW-ER:Et2/0:CSM->MS: C_INQ:NEW: 0000.6c04.0000:4 0000.6c04.0200:4
Jul 19 12:28:07.461: DLSW-ER:Et2/0:CSM->MS: IW:PENDING: 0000.6c04.0000:4 0000.6c04.0200:4

As the master router received the responses to the explorers for the DMAC of the FEP, the cache in the CAM table on the switch was updated to reflect this. Subsequently, all new circuits destined for this remote MAC address were forwarded to the master router within the cache aging time (default 5 minutes of inactivity). The next series of messages determined the ownership of the circuits:


Jul 19 12:28:07.485: DLSW-ER:Et2/0:UG -> 0080.6973.0589: 0000.6c04.0000:4 0000.6c04.0200:14
Jul 19 12:28:07.553: CSM: Received CLSI Msg : ID_STN.Ind   dlen: 48 from Ethernet2/0
Jul 19 12:28:07.553: CSM:   smac 0000.6c04.0200, dmac 0000.6c04.0000, ssap 14, dsap 4
Jul 19 12:28:07.553: CSM: new_connection: ws_status = FOUND
Jul 19 12:28:07.557: DLSW-ER:Et2/0:CSM->MS: C_INQ:CAC_NEG: 0000.6c04.0000:4 0000.6c04.0200:14
Jul 19 12:28:08.461: DLSW-ER:Et2/0:MS->CSM:UGotIt 0000.6c04.0000:4 0000.6c04.0200:8
Jul 19 12:28:08.461: CSM: Calling csm_to_core with CLSI_START_NEWDL - dlsw_csm_er_handler
Jul 19 12:28:08.461: DLSW-ER:Et2/0:MS->CSM:UGotIt 0000.6c04.0000:4 0000.6c04.0200:10
Jul 19 12:28:08.461: CSM: Calling csm_to_core with CLSI_START_NEWDL - dlsw_csm_er_handler
Jul 19 12:28:08.461: DLSW-ER:Et2/0:MS->CSM:UGotIt 0000.6c04.0000:4 0000.6c04.0200:C
Jul 19 12:28:08.461: CSM: Calling csm_to_core with CLSI_START_NEWDL - dlsw_csm_er_handler
Jul 19 12:28:08.461: DLSW-ER:Et2/0:MS->CSM:UGotIt 0000.6c04.0000:4 0000.6c04.0200:4
Jul 19 12:28:08.461: CSM: Calling csm_to_core with CLSI_START_NEWDL - dlsw_csm_er_handler

The previous output shows two distinct cases. The first message (UG) for SSAP 14 had the slave MAC address prepended to the message. All corresponding IWANTIT messages were local and were followed by a new circuit setup request (the equivalent details for the slave router appear in the slave router debug output).

Some further activity can be seen after the UGOTIT decisions had been made in the following output:


Jul 19 12:28:08.557: DLSW-ER:Et2/0:dm_action_j: Rcvd IW <- 0080.6973.0589 0000.6c04.0000:4 0000.6c04.0200:8
Jul 19 12:28:08.557: DLSW-ER:Et2/0:CT -> 0080.6973.0589: 0000.6c04.0000:4 0000.6c04.0200:8
Jul 19 12:28:08.557: DLSW-ER:Et2/0:dm_action_j: Rcvd IW <- 0080.6973.0589 0000.6c04.0000:4 0000.6c04.0200:10
Jul 19 12:28:08.557: DLSW-ER:Et2/0:CT -> 0080.6973.0589: 0000.6c04.0000:4 0000.6c04.0200:10
Jul 19 12:28:08.557: DLSW-ER:Et2/0:dm_action_j: Rcvd IW <- 0080.6973.0589 0000.6c04.0000:4 0000.6c04.0200:C
Jul 19 12:28:08.557: DLSW-ER:Et2/0:CT -> 0080.6973.0589: 0000.6c04.0000:4 0000.6c04.0200:C
Jul 19 12:28:08.557: DLSW-ER:Et2/0:dm_action_j: Rcvd IW <- 0080.6973.0589 0000.6c04.0000:4 0000.6c04.0200:4
Jul 19 12:28:08.557: DLSW-ER:Et2/0:CT -> 0080.6973.0589: 0000.6c04.0000:4 0000.6c04.0200:4

An IWANTIT request had been sent from the slave to the master, which responded with a circuit taken (CT) message. This occurred because the new circuit for the SMAC/SSAP and DMAC/DSAP pairing had already been created on the master router, which prevented the duplicate circuit issue.
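The arbitration just described (the master grants a circuit to the first requester and answers later requests for the same SMAC/SSAP and DMAC/DSAP pairing with circuit taken) can be sketched in Python as follows. This is an illustrative model only, not the IOS state machine; the class name CircuitArbiter and the owner label "SELF" for the master's own requests are assumptions made for the example.

```python
class CircuitArbiter:
    """Toy model of the master's decision for each IWANTIT (IW) request."""

    def __init__(self):
        # (smac, ssap, dmac, dsap) -> owner that was granted the circuit
        self.owners = {}

    def i_want_it(self, requester, circuit):
        """Return "UG" (UGOTIT) for the first request, "CT" (circuit taken) otherwise."""
        if circuit in self.owners:
            return "CT"            # a circuit for this pairing already exists
        self.owners[circuit] = requester
        return "UG"


arbiter = CircuitArbiter()
slave = "0080.6973.0589"
ssap_14 = ("0000.6c04.0200", "14", "0000.6c04.0000", "04")
ssap_08 = ("0000.6c04.0200", "08", "0000.6c04.0000", "04")

print(arbiter.i_want_it(slave, ssap_14))   # slave asks first for SSAP 14
print(arbiter.i_want_it("SELF", ssap_08))  # master claims SSAP 8 locally
print(arbiter.i_want_it(slave, ssap_08))   # slave's later request for SSAP 8
```

Keying the table on the full SMAC/SSAP and DMAC/DSAP tuple is what prevents two routers from building duplicate circuits for the same session.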

When all the circuits had been established, the following operational commands could be entered on the master router to check the status:


sh dls trans cache

Interface Et2/0
 Circuit Cache

local addr(lsap)    remote addr(dsap)  state          Owner
0000.6c04.0200(04)  0000.6c04.0000(04) POSITIVE        SELF
0000.6c04.0200(08)  0000.6c04.0000(04) POSITIVE        SELF
0000.6c04.0200(0C)  0000.6c04.0000(04) POSITIVE        SELF
0000.6c04.0200(10)  0000.6c04.0000(04) POSITIVE        SELF
0000.6c04.0200(14)  0000.6c04.0000(04) NEGATIVE        0080.6973.0589
Total number of circuits in the Cache: 5

The output confirms the fact that the SSAP of 14 connected via the slave router and all subsequent circuits connected via the master.

The Cisco Discovery Protocol (CDP) neighbor output from the switch was as follows:


VOIP-6509-2 (enable) sh cdp nei
* - indicates vlan mismatch.
# - indicates duplex mismatch.
Port     Device-ID                       Port-ID                   Platform
-------- ------------------------------- ------------------------- ------------
 3/1     nsa-voip-3661-2                 Ethernet2/0               cisco 3660
 3/3     nsa-voip-3661-1                 Ethernet3/0               cisco 3660
 3/5     VOIP-3620-4                     Ethernet0/1               cisco 3620
VOIP-6509-2 (enable)

The CDP adjacency is useful to identify which port connects to which router. When the Test frames received a response (destination is 0000.3620.0000), the CAM was updated to reflect the router on port 3/1 as follows:


VOIP-6509-2 (enable) sh cam dyn
* = Static Entry. + = Permanent Entry. # = System Entry. R = Router Entry.
X = Port Security Entry

VLAN  Dest MAC/Route Des  [CoS]  Destination Ports or VCs / [Protocol Type]
----- ------------------  -----  -------------------------------------------
399    00-01-96-ac-94-f0             3/1 [ALL]
399    00-00-36-20-40-00             3/5 [ALL]
399    00-01-96-ce-a0-91             3/3 [ALL]
Total Matching CAM Entries Displayed = 3
VOIP-6509-2 (enable) sh cam dyn
* = Static Entry. + = Permanent Entry. # = System Entry. R = Router Entry.
X = Port Security Entry

VLAN  Dest MAC/Route Des  [CoS]  Destination Ports or VCs / [Protocol Type]
----- ------------------  -----  -------------------------------------------
399    00-01-96-ac-94-f0             3/1 [ALL]
399    00-00-36-20-00-00             3/1 [ALL]
399    00-00-36-20-40-00             3/5 [ALL]
399    00-01-96-ce-a0-91             3/3 [ALL]
Total Matching CAM Entries Displayed = 4

To prove that all subsequent circuits would connect to the DLSw+ router that "owned" the entry in the CAM table, the DLSw+ circuits were cleared, and the output was displayed as follows:


sh dls trans cache

Interface Et2/0
 Circuit Cache

local addr(lsap)    remote addr(dsap)  state          Owner
0000.6c04.0200(04)  0000.6c04.0000(04) POSITIVE        SELF
0000.6c04.0200(08)  0000.6c04.0000(04) POSITIVE        SELF
0000.6c04.0200(0C)  0000.6c04.0000(04) POSITIVE        SELF
0000.6c04.0200(10)  0000.6c04.0000(04) POSITIVE        SELF
0000.6c04.0200(14)  0000.6c04.0000(04) POSITIVE        SELF
Total number of circuits in the Cache: 5
nsa-voip-3661-2#

As can be seen, all circuits attached via the master router. On the slave router the following activity was observed:


Jul 19 12:35:52.335: DLSW-ER:Et3/0:CG -> 0080.6935.290f: 0000.6c04.0000:4 0000.6c04.0200:14
Jul 19 12:35:52.335: DLSW-ER:Et3/0:CSM->MS: CG:OK: 0000.6c04.0000:4 0000.6c04.0200:14
Jul 19 12:35:52.999: DLSW-ER:Et3/0:dm_action_i: Slave Rcvd CG: 0000.6c04.0000:4 0000.6c04.0200:8
Jul 19 12:35:52.999: DLSW-ER:Et3/0:dm_action_i: Slave Rcvd CG: 0000.6c04.0000:4 0000.6c04.0200:10
Jul 19 12:35:53.003: DLSW-ER:Et3/0:dm_action_i: Slave Rcvd CG: 0000.6c04.0000:4 0000.6c04.0200:C
Jul 19 12:35:53.003: DLSW-ER:Et3/0:dm_action_i: Slave Rcvd CG: 0000.6c04.0000:4 0000.6c04.0200:4

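When the circuits were cleared, the slave advised the master that its circuit for SSAP 14 had gone, and the master advised the slave that the other four circuits had gone (CG, circuit gone). No other activity was seen on the slave during the subsequent Test requests and responses.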

Fault Tolerance

This test was designed to examine redundancy and fault tolerance, so the master router was deliberately failed to see how fast the failover mechanism would perform. All the following output was taken from the slave router:


Jul 19 12:49:21.913: DLSW-ER:Et3/0:dm_action_r: LLC2 session dead to neighbor 0080.6935.290f
Jul 19 12:49:21.913: DLSW-ER:Et3/0: Sending MP Frame
Jul 19 12:49:21.913: DLSW-ER:Et3/0:dm_action_u: Freeing current master 0080.6935.290f
Jul 19 12:49:21.913: DLSW-ER:dm_action_u: Changing state to Master

When the failure was detected (bearing in mind that the master and slave routers set up a Logical Link Control, type 2 (LLC2) session, which is connection-oriented), the DLSw+ Ethernet Redundancy component changed the role of the slave to become the new master.

Approximately 6 minutes passed (12:49:21 to 12:55:24 in the debug timestamps) before any Test frames were sent from the DLSw+ router to its remote peers:


Jul 19 12:55:24.304: CSM: update local cache for mac 0000.6c04.0200, Ethernet3/0
Jul 19 12:55:24.304: CSM: Received CLSI Msg : TEST_STN.Ind   dlen: 40 from Ethernet3/0
Jul 19 12:55:24.304: CSM:   smac 0000.6c04.0200, dmac 0000.6c04.0000, ssap 14, dsap 0
Jul 19 12:55:24.304: DLSW+:Ethernet3/0 I d=0000.6c04.0000-0 s=0000.6c04.0200-14
Jul 19 12:55:24.304: CSM: test_frame_proc: ws_status = NO_CACHE_INFO
Jul 19 12:55:24.304: CSM: Write to all peers ok.
Jul 19 12:55:24.304: CSM: adding new icr pend record - test_frame_proc

Subsequently, the reachability cache went through the usual routine of being repopulated and so on, and all circuits connected via this router.
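The length of the outage can be read directly from the debug timestamps. The following Python sketch computes the gap between the role change and the first Test frame seen by the new master, using the two log lines quoted above. The debug output carries no year, so the year in the sketch is an arbitrary placeholder used only to make the timestamps parseable.

```python
from datetime import datetime

ROLE_CHANGE = "Jul 19 12:49:21.913: DLSW-ER:dm_action_u: Changing state to Master"
FIRST_TEST = "Jul 19 12:55:24.304: CSM: update local cache for mac 0000.6c04.0200, Ethernet3/0"

PLACEHOLDER_YEAR = "2000"  # ASSUMPTION: not present in the logs, any year works


def debug_time(line):
    """Parse the leading 'Mon DD HH:MM:SS.mmm' stamp of an IOS debug line."""
    stamp = line.split(": ", 1)[0]
    return datetime.strptime(PLACEHOLDER_YEAR + " " + stamp, "%Y %b %d %H:%M:%S.%f")


gap = (debug_time(FIRST_TEST) - debug_time(ROLE_CHANGE)).total_seconds()
print(f"{gap:.3f} seconds ({gap / 60:.1f} minutes)")
```

The result is a little over six minutes, which is in line with a CAM entry that must first age out (default 5 minutes of inactivity) before Test frames reach the surviving router.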

Configuration

The following configurations are samples of those used in the partial mesh design:


nsa-voip-3661-1 (slave)

dlsw local-peer peer-id 6.0.0.1
dlsw remote-peer 0 tcp 2.2.2.3
dlsw transparent switch-support
!
interface Ethernet3/0
dlsw transparent redundancy-enable 9999.9999.9999

nsa-voip-3661-2 (master)

dlsw local-peer peer-id 4.4.4.2
dlsw load-balance circuit-count
dlsw remote-peer 0 tcp 2.2.2.2
dlsw transparent switch-support
!
interface Ethernet2/0
dlsw transparent redundancy-enable 9999.9999.9999 master-priority 1
!

Caveats

Although this design simplifies the DLSw+ topology, there are some major caveats that a customer may consider unacceptable. The main problem is that load balancing is virtually nonexistent, because circuit placement relies heavily on the entry in the switch CAM table. Another issue is the time taken for the CAM entry to time out, which causes at least a 5-minute delay (assuming the default aging time) before any Test frames take the alternative path via the other DLSw+ router.
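Both caveats follow from how a transparent switch learns addresses: one port per MAC address, refreshed only when a frame is received from that address. The following Python sketch is a simplified model of that behavior, not Catalyst code. The class name CamTable is hypothetical, and the 300-second aging value reflects the 5-minute default mentioned above.

```python
class CamTable:
    """Simplified MAC learning table: one port per MAC, aged on inactivity."""

    def __init__(self, aging_seconds=300):
        self.aging_seconds = aging_seconds
        self.entries = {}  # mac -> (port, time the MAC was last seen as a source)

    def learn(self, src_mac, port, now):
        """Record (or overwrite) the port on which a source MAC was seen."""
        self.entries[src_mac] = (port, now)

    def lookup(self, dst_mac, now):
        """Return the egress port, or None if the frame would be flooded."""
        entry = self.entries.get(dst_mac)
        if entry is None:
            return None
        port, last_seen = entry
        if now - last_seen >= self.aging_seconds:
            del self.entries[dst_mac]  # entry aged out
            return None
        return port


cam = CamTable()
fep = "00-00-36-20-00-00"
cam.learn(fep, "3/1", now=0)      # Test response arrives via the router on port 3/1
print(cam.lookup(fep, now=10))    # every new circuit follows port 3/1
print(cam.lookup(fep, now=299))   # still pinned if that router has gone silent
print(cam.lookup(fep, now=300))   # aged out: frame floods, other router can answer
```

Because the lookup never returns more than one port, all Test frames for the FEP address reach a single DLSw+ router until the entry is relearned or ages out.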

DLSw+ Full Mesh

The full mesh topology provides each DLSw+ router with a second peer, offering some potential for load balancing on a per-router basis.

Topology

Similar to the partial mesh topology, in the full mesh topology the DLSw+ Ethernet Redundancy routers are the remote routers, and the central site routers connect via a Token Ring or Ethernet hub to the SNA host (see Figure 3).

Figure 3 DLSw+ Full Mesh Design

The following section details a test that shows how the full mesh topology worked in terms of circuit setup, load balancing, and the DLSw+ Ethernet Redundancy behavior.


Note: For effective results, it was necessary to include the command dlsw timers explorer-wait-time 10.


Test Results

Initially, we had to assume that the CAM table had already been populated and that the router nsa-voip-3661-2 was connected to the switch port that the CAM entry points to for the FEP MAC address. The reachability cache contained two fresh entries (state is FOUND) for the DMAC address (0000.3620.0000 in canonical format, displayed by DLSw+ in noncanonical format as 0000.6c04.0000):


nsa-voip-3661-2#sh dls reach remote 

DLSw Remote MAC address reachability cache list
Mac Addr         status     Loc.    peer
0000.6c04.0000   FOUND      REMOTE  2.2.2.2(2065) max-lf(1500)
                                    2.2.2.3(2065) max-lf(1500)
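The switch CAM table and the DLSw+ output show the same addresses in different bit orders: Ethernet (canonical) format on the switch and Token Ring (noncanonical) format in DLSw+. Converting between the two means reversing the bit order within each byte. The following Python sketch shows the conversion; the function name swap_bit_order is hypothetical.

```python
def swap_bit_order(mac: str) -> str:
    """Reverse the bits in every byte of a MAC address.

    Accepts dotted (0000.3620.0000), dashed, or colon-separated input and
    returns Cisco dotted notation. Applying it twice returns the original.
    """
    digits = mac.replace(".", "").replace("-", "").replace(":", "")
    raw = bytes.fromhex(digits)
    swapped = bytes(int(f"{byte:08b}"[::-1], 2) for byte in raw)
    text = swapped.hex()
    return ".".join(text[i:i + 4] for i in range(0, 12, 4))


print(swap_bit_order("00-00-36-20-00-00"))  # FEP address from the CAM table
print(swap_bit_order("00-00-36-20-40-00"))  # session initiator from the CAM table
```

The two results correspond to the DMAC and SMAC values that appear throughout the DLSw+ debug output in this paper.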

The next stage was to initiate six SNA sessions from a router behind the switch, remembering that the CAM entry forces all traffic to the 3661-2 router:


Jul 23 13:32:12: DLSW-ER:Et2/0: Sending MP Frame
Jul 23 13:32:22: DLSW-ER:Et2/0: Sending MP Frame
Jul 23 13:32:24: update local cache for mac 0000.6c04.0200, Ethernet2/0
Jul 23 13:32:24: CSM: Received CLSI Msg : TEST_STN.Ind   dlen: 40 from Eth2/0
Jul 23 13:32:24: CSM: smac 0000.6c04.0200, dmac 0000.6c04.0000, ssap 10, dsap 0
Jul 23 13:32:24: DLSW+: Ethernet2/0 I d=0000.6c04.0000-0 s=0000.6c04.0200-10
Jul 23 13:32:24: CSM: test_frame_proc: ws_status = FOUND
Jul 23 13:32:24: CSM: sending TEST to Ethernet2/0
Jul 23 13:32:24: CSM: update local cache for mac 0000.6c04.0200, Ethernet2/0
Jul 23 13:32:24: CSM: Received CLSI Msg : TEST_STN.Ind   dlen: 40 from Eth2/0
Jul 23 13:32:24: CSM: smac 0000.6c04.0200, dmac 0000.6c04.0000, ssap 18, dsap 0
Jul 23 13:32:24: DLSW+: Ethernet2/0 I d=0000.6c04.0000-0 s=0000.6c04.0200-18
Jul 23 13:32:24: CSM: test_frame_proc: ws_status = FOUND
Jul 23 13:32:24: CSM: sending TEST to Ethernet2/0
Jul 23 13:32:24: CSM: update local cache for mac 0000.6c04.0200, Ethernet2/0
Jul 23 13:32:24: CSM: Received CLSI Msg : TEST_STN.Ind   dlen: 40 from Eth2/0
Jul 23 13:32:24: CSM: smac 0000.6c04.0200, dmac 0000.6c04.0000, ssap 8, dsap 0
Jul 23 13:32:24: DLSW+: Ethernet2/0 I d=0000.6c04.0000-0 s=0000.6c04.0200-8
Jul 23 13:32:24: CSM: test_frame_proc: ws_status = FOUND
Jul 23 13:32:24: CSM: sending TEST to Ethernet2/0
Jul 23 13:32:24: CSM: update local cache for mac 0000.6c04.0200, Ethernet2/0
Jul 23 13:32:24: CSM: Received CLSI Msg : TEST_STN.Ind   dlen: 40 from Eth2/0
Jul 23 13:32:24: CSM: smac 0000.6c04.0200, dmac 0000.6c04.0000, ssap C , dsap 0
Jul 23 13:32:24: DLSW+: Ethernet2/0 I d=0000.6c04.0000-0 s=0000.6c04.0200-C
Jul 23 13:32:24: CSM: test_frame_proc: ws_status = FOUND
Jul 23 13:32:24: CSM: sending TEST to Ethernet2/0
Jul 23 13:32:24: CSM: update local cache for mac 0000.6c04.0200, Ethernet2/0
Jul 23 13:32:24: CSM: Received CLSI Msg : TEST_STN.Ind   dlen: 40 from Eth2/0
Jul 23 13:32:24: CSM: smac 0000.6c04.0200, dmac 0000.6c04.0000, ssap 14, dsap 0
Jul 23 13:32:24: DLSW+: Ethernet2/0 I d=0000.6c04.0000-0 s=0000.6c04.0200-14
Jul 23 13:32:24: CSM: test_frame_proc: ws_status = FOUND
Jul 23 13:32:24: CSM: sending TEST to Ethernet2/0
Jul 23 13:32:24: CSM: update local cache for mac 0000.6c04.0200, Ethernet2/0
Jul 23 13:32:24: CSM: Received CLSI Msg : TEST_STN.Ind   dlen: 40 from Eth2/0
Jul 23 13:32:24: CSM: smac 0000.6c04.0200, dmac 0000.6c04.0000, ssap 4 , dsap 0
Jul 23 13:32:24: DLSW+: Ethernet2/0 I d=0000.6c04.0000-0 s=0000.6c04.0200-4
Jul 23 13:32:24: CSM: test_frame_proc: ws_status = FOUND
Jul 23 13:32:24: CSM: sending TEST to Ethernet2/0

The SNA session generator sent out six Test frames, one for each of the SSAPs, which can be seen in the debug (that is, SSAP 4, 8, C, 10, 14, and 18).

The DLSw+ Ethernet Redundancy routers then communicated whether they could own the circuit, so some communication between the slave and master in the form of LLC2 traffic was performed. The IW indicated that the IWANTIT message was being received by the master. Because no associated MAC address was "tagged" onto this message (that is, no MAC address was present), one could assume that the master itself was requesting the circuit. The following output is a segment of debug from two of the requests (out of six):


Jul 23 13:32:24: CSM: Received CLSI Msg : ID_STN.Ind   dlen: 48 from Eth2/0
Jul 23 13:32:24: CSM: smac 0000.6c04.0200, dmac 0000.6c04.0000, ssap 10, dsap 4
Jul 23 13:32:24: CSM: new_connection: ws_status = FOUND
Jul 23 13:32:24: DLSW-ER:Et2/0:CSM->MS: C_INQ:NEW: 0000.6c04.0000:4 0000.6c04.0200:10
Jul 23 13:32:24: DLSW-ER:Et2/0:CSM->MS: IW:PENDING: 0000.6c04.0000:4 0000.6c04.0200:10
Jul 23 13:32:24: CSM: Received CLSI Msg : ID_STN.Ind   dlen: 48 from Eth2/0
Jul 23 13:32:24: CSM: smac 0000.6c04.0200, dmac 0000.6c04.0000, ssap 18, dsap 4
Jul 23 13:32:24: CSM: new_connection: ws_status = FOUND
Jul 23 13:32:24: DLSW-ER:Et2/0:CSM->MS: C_INQ:NEW: 0000.6c04.0000:4 0000.6c04.0200:18
Jul 23 13:32:24: DLSW-ER:Et2/0:CSM->MS: IW:PENDING: 0000.6c04.0000:4 0000.6c04.0200:18

The next action was for the DLSw+ master router to decide on the resources and agree on which router would own the circuit. No IWANTIT frames came from the slave router (because of the CAM entry pointing to the master). All circuits connected that way as follows:


Jul 23 13:32:25: DLSW-ER:Et2/0:MS->CSM:UGotIt 0000.6c04.0000:4 0000.6c04.0200:10
Jul 23 13:32:25: CSM: Calling csm_to_core with CLSI_START_NEWDL - dlsw_csm_er_handler
Jul 23 13:32:25: DLSW-ER:Et2/0:MS->CSM:UGotIt 0000.6c04.0000:4 0000.6c04.0200:18
Jul 23 13:32:25: CSM: Calling csm_to_core with CLSI_START_NEWDL - dlsw_csm_er_handler
Jul 23 13:32:25: DLSW-ER:Et2/0:MS->CSM:UGotIt 0000.6c04.0000:4 0000.6c04.0200:8
Jul 23 13:32:25: CSM: Calling csm_to_core with CLSI_START_NEWDL - dlsw_csm_er_handler
Jul 23 13:32:25: DLSW-ER:Et2/0:MS->CSM:UGotIt 0000.6c04.0000:4 0000.6c04.0200:C
Jul 23 13:32:25: CSM: Calling csm_to_core with CLSI_START_NEWDL - dlsw_csm_er_handler
Jul 23 13:32:25: DLSW-ER:Et2/0:MS->CSM:UGotIt 0000.6c04.0000:4 0000.6c04.0200:14
Jul 23 13:32:25: CSM: Calling csm_to_core with CLSI_START_NEWDL - dlsw_csm_er_handler
Jul 23 13:32:25: DLSW-ER:Et2/0:MS->CSM:UGotIt 0000.6c04.0000:4 0000.6c04.0200:4
Jul 23 13:32:25: CSM: Calling csm_to_core with CLSI_START_NEWDL - dlsw_csm_er_handler

If the slave router was offered a circuit, the slave router's MAC address was inserted in the UGOTIT message at the beginning of the line, before the SMAC address. When all the circuits were set up (that is, when CUR/ICR activity was complete) and the reachability cache had remained fresh for both paths to the DMAC address via DLSw+ (while using round-robin load balancing), the following results were obtained:


sh dls pe
Peers:                state     pkts_rx   pkts_tx  type  drops ckts TCP   uptime
 TCP 2.2.2.2         CONNECT        283       288  conf      0    3   0 02:03:15
 TCP 2.2.2.3         CONNECT        292       328  conf      0    3   0 01:49:16
Total number of connected peers: 2
Total number of connections:     2

This was tested several times and was successful in all attempts.
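The even split in the ckts column (three circuits per peer) is what a simple rotation over the reachable peers produces. The following Python sketch illustrates the idea with the six SSAPs and two peer addresses from this test; it is a model of round-robin assignment in general, not the DLSw+ implementation, and the function name assign_round_robin is hypothetical.

```python
from collections import Counter
from itertools import cycle


def assign_round_robin(circuits, peers):
    """Assign each new circuit to the next peer in a fixed rotation."""
    rotation = cycle(peers)
    return {circuit: next(rotation) for circuit in circuits}


ssaps = ["04", "08", "0C", "10", "14", "18"]
peers = ["2.2.2.2", "2.2.2.3"]

placement = assign_round_robin(ssaps, peers)
print(Counter(placement.values()))
```

Note that the rotation balances circuits across the remote peers of one router only; which local router receives the Test frames is still decided by the switch CAM table.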

Configuration

The following configurations are samples of those used in the full mesh design:


nsa-voip-3661-1 (slave)

dlsw local-peer peer-id 6.0.0.1
dlsw load-balance round-robin
dlsw remote-peer 0 tcp 2.2.2.3
dlsw transparent switch-support
!
interface Ethernet3/0
dlsw transparent redundancy-enable 9999.9999.9999

nsa-voip-3661-2 (master)

dlsw local-peer peer-id 4.4.4.2
dlsw load-balance round-robin
dlsw remote-peer 0 tcp 2.2.2.2
dlsw transparent switch-support
!
interface Ethernet2/0
dlsw transparent redundancy-enable 9999.9999.9999 master-priority 1
!

Caveats

There are a few caveats with this configuration that need to be considered. The first concerns load balancing. Although load balancing is possible between the different FEP sites, it can be achieved on only one DLSw+ router at a time. This follows from the second caveat: the CAM table allows only one entry for the DMAC address, reachable via one port only. In this configuration it is likely that all circuits will establish on one of the DLSw+ routers, but with round-robin load balancing they will spread across both remote peers. Fault tolerance is also an issue, as described in the partial mesh section, because the Test frames will not reach the secondary router until the CAM entry for the "old" router has aged out. The CAM aging time is a configurable parameter, and changing it may add overhead on the switch depending on the existing size of the CAM table.

Mapping SNA Resources

For accurate load balancing and appropriate fault tolerance in a switched Ethernet environment, Cisco recommends using DLSw+ Ethernet Redundancy where each resource targets a MAC address that appears as a logical MAC address on the DLSw+ Ethernet Redundancy router and is subsequently mapped to a real MAC address for the FEP. Each logical MAC address on the DLSw+ Ethernet Redundancy router is different, meaning that 50 percent of the clients will target MAC address 1, while the others will target MAC address 2. In addition, each peer knows about the resources that are attached to the other peer via the LLC2 session that communicates between both DLSw+ Ethernet Redundancy routers, and in the event of a failure, one of the routers can back up the other.

Topology

For the purpose of the testing, the simplest solution was offered, partial mesh instead of full mesh, as shown in Figure 4.

Figure 4 DLSw+ Mapping Resources Design

Test Results

Before any of the tests were started, the master was checked and found to have perfect load balancing: three LLC2 sessions targeted the master as primary, and the other three targeted the virtual MAC address of the slave:


sh dls tran c
Interface Et2/0
 Circuit Cache

local addr(lsap)    remote addr(dsap)  state          Owner
0000.6c04.0200(04)  0000.6c04.0000(04) NEGATIVE        0000.6c06.0080
0000.6c04.0200(08)  0000.6c04.0000(04) NEGATIVE        0000.6c06.0080
0000.6c04.0200(0C)  0000.6c04.0000(04) NEGATIVE        0000.6c06.0080
0000.6c04.0200(10)  0000.6c04.0000(04) POSITIVE        SELF
0000.6c04.0200(14)  0000.6c04.0000(04) POSITIVE        SELF
0000.6c04.0200(18)  0000.6c04.0000(04) POSITIVE        SELF
Total number of circuits in the Cache: 6
nsa-voip-3661-2#deb dls tr ma
nsa-voip-3661-2#

Interface Et2/0
      LOCAL Mac          REMOTE MAC      BACKUP
      ---------          ----------      ------
    4000.3660.0002     0000.6c04.0000    0000.6c06.0080     STATIC
    4000.3660.0001     0000.6c04.0000    0000.6c06.0080     DYNAMIC(Passive)
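
The circuit cache listing above can be tallied per owner to confirm the balance. The following is a minimal sketch that assumes the column layout shown in the output; the function name `circuits_per_owner` and the regular expression are illustrative and not part of any Cisco tooling.

```python
import re
from collections import Counter

# Circuit cache rows copied from the output above.
SAMPLE = """\
0000.6c04.0200(04)  0000.6c04.0000(04) NEGATIVE        0000.6c06.0080
0000.6c04.0200(08)  0000.6c04.0000(04) NEGATIVE        0000.6c06.0080
0000.6c04.0200(0C)  0000.6c04.0000(04) NEGATIVE        0000.6c06.0080
0000.6c04.0200(10)  0000.6c04.0000(04) POSITIVE        SELF
0000.6c04.0200(14)  0000.6c04.0000(04) POSITIVE        SELF
0000.6c04.0200(18)  0000.6c04.0000(04) POSITIVE        SELF
"""

# local addr(lsap), remote addr(dsap), state, owner
ROW = re.compile(
    r"^(?P<lmac>[0-9a-f.]{14})\((?P<lsap>[0-9A-F]{2})\)\s+"
    r"(?P<rmac>[0-9a-f.]{14})\((?P<dsap>[0-9A-F]{2})\)\s+"
    r"(?P<state>\w+)\s+(?P<owner>\S+)$"
)


def circuits_per_owner(text):
    """Count circuit cache rows per owner (SELF or a neighbor MAC)."""
    counts = Counter()
    for line in text.splitlines():
        match = ROW.match(line.strip())
        if match:
            counts[match.group("owner")] += 1
    return counts


counts = circuits_per_owner(SAMPLE)
print(dict(counts))  # {'0000.6c06.0080': 3, 'SELF': 3}
```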

At this point, the 3661-1 router had failed, and the master router was receiving the details from the slave advising that it had cleared the circuits that were previously attached (CG):


nsa-voip-3661-2#
Mar  1 20:00:15.003: DLSW-ER:Et2/0:dm_action_h: Rcvd CG <- 0000.6c06.0080 
0000.6c04.0000:4 0000.6c04.0200:4
Mar  1 20:00:15.007: DLSW-ER:Et2/0:dm_action_h: Rcvd CG <- 0000.6c06.0080 
0000.6c04.0000:4 0000.6c04.0200:8
Mar  1 20:00:15.007: DLSW-ER:Et2/0:dm_action_h: Rcvd CG <- 0000.6c06.0080 
0000.6c04.0000:4 0000.6c04.0200:C

The virtual MAC address previously owned by the failed router was now owned by the master. Subsequently, any frame targeting this MAC address had its destination MAC address replaced with the real host address 0000.6c04.0000:


Mar  1 20:00:44.427: DLSW-ER:Replacing dmac 4000.3660.0001 with 0000.6c04.0000 on a frame 
from Et2/0
Mar  1 20:00:44.427: DLSW-ER:Replacing dmac 4000.3660.0001 with 0000.6c04.0000 on a frame 
from Et2/0
Mar  1 20:00:44.427: DLSW-ER:Replacing dmac 4000.3660.0001 with 0000.6c04.0000 on a frame 
from Et2/0
Mar  1 20:00:45.499: DLSW-ER:Replacing dmac 4000.3660.0001 with 0000.6c04.0000 on a frame 
from Et2/0
Mar  1 20:00:46.511: DLSW-ER:Replacing dmac 4000.3660.0001 with 0000.6c04.0000 on a frame 
from Et2/0
Mar  1 20:00:55.367: DLSW-ER:Swapping SMAC 0000.6c04.0000 with 4000.3660.0002 before 
sending on Et2/0
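
The address replacement seen in the debug output above can be sketched as follows. This is an illustrative model only, assuming a simple lookup table keyed by logical MAC address; the names `Frame`, `rewrite_from_lan`, and `rewrite_to_lan` are hypothetical and do not describe the Cisco IOS implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Frame:
    smac: str
    dmac: str


# Logical (virtual) MAC -> real host MAC, as in the map table shown earlier.
MAP = {
    "4000.3660.0002": "0000.6c04.0000",  # static entry on the master
    "4000.3660.0001": "0000.6c04.0000",  # entry backed up for the neighbor
}


def rewrite_from_lan(frame, owned):
    """Frame from the Ethernet: replace an owned logical DMAC with the host MAC."""
    if frame.dmac not in owned:
        return None  # this router does not own the logical MAC; ignore
    return Frame(smac=frame.smac, dmac=MAP[frame.dmac])


def rewrite_to_lan(frame, logical_mac):
    """Frame toward the Ethernet: swap the host SMAC for the logical MAC."""
    return Frame(smac=logical_mac, dmac=frame.dmac)


client = "0000.6c04.0200"
test_frame = Frame(smac=client, dmac="4000.3660.0001")

# Before the failure the master owns only its own logical MAC.
before = rewrite_from_lan(test_frame, owned={"4000.3660.0002"})
# After the failure it owns both.
after = rewrite_from_lan(test_frame, owned={"4000.3660.0002", "4000.3660.0001"})
reply = rewrite_to_lan(Frame(smac="0000.6c04.0000", dmac=client), "4000.3660.0002")

print(before)  # None
print(after)   # Frame(smac='0000.6c04.0200', dmac='0000.6c04.0000')
print(reply)   # Frame(smac='4000.3660.0002', dmac='0000.6c04.0200')
```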

Now that the Master router "owned" both MAC addresses, the circuits previously connected to the slave router followed the normal process of connecting to the master router, as follows:


Mar  1 20:01:17.460: DLSW-ER:Et2/0:CSM->MS: C_INQ:NEW: 0000.6c04.0000:4 0000.6c04.0200:4
Mar  1 20:01:17.460: DLSW-ER:Et2/0:CSM->MS: IW:PENDING: 0000.6c04.0000:4 0000.6c04.0200:4
Mar  1 20:01:17.464: DLSW-ER:Replacing dmac 4000.3660.0001 with 0000.6c04.0000 on a frame 
from Et2/0
Mar  1 20:01:17.464: DLSW-ER:Et2/0:CSM->MS: C_INQ:NEW: 0000.6c04.0000:4 0000.6c04.0200:8
Mar  1 20:01:17.464: DLSW-ER:Et2/0:CSM->MS: IW:PENDING: 0000.6c04.0000:4 0000.6c04.0200:8
Mar  1 20:01:17.464: DLSW-ER:Replacing dmac 4000.3660.0001 with 0000.6c04.0000 on a frame 
from Et2/0
Mar  1 20:01:17.464: DLSW-ER:Et2/0:CSM->MS: C_INQ:NEW: 0000.6c04.0000:4 0000.6c04.0200:C
Mar  1 20:01:17.464: DLSW-ER:Et2/0:CSM->MS: IW:PENDING: 0000.6c04.0000:4 0000.6c04.0200:C
Mar  1 20:01:18.460: DLSW-ER:Et2/0:MS->CSM:UGotIt 0000.6c04.0000:4 0000.6c04.0200:4
Mar  1 20:01:18.460: DLSW-ER:action_a(): target mapped from (wan) 0000.6c04.0000 ---> 
4000.3660.0001
Mar  1 20:01:18.464: DLSW-ER:Et2/0:MS->CSM:UGotIt 0000.6c04.0000:4 0000.6c04.0200:8
Mar  1 20:01:18.464: DLSW-ER:action_a(): target mapped from (wan) 0000.6c04.0000 ---> 
4000.3660.0001
Mar  1 20:01:18.464: DLSW-ER:Et2/0:MS->CSM:UGotIt 0000.6c04.0000:4 0000.6c04.0200:C

Finally, all the circuits connected and could be observed as follows:


Interface Et2/0
 Circuit Cache

local addr(lsap)    remote addr(dsap)  state          Owner
0000.6c04.0200(04)  0000.6c04.0000(04) POSITIVE        SELF
0000.6c04.0200(08)  0000.6c04.0000(04) POSITIVE        SELF
0000.6c04.0200(0C)  0000.6c04.0000(04) POSITIVE        SELF
0000.6c04.0200(10)  0000.6c04.0000(04) POSITIVE        SELF
0000.6c04.0200(14)  0000.6c04.0000(04) POSITIVE        SELF
0000.6c04.0200(18)  0000.6c04.0000(04) POSITIVE        SELF
Total number of circuits in the Cache: 6

The previously failed router finally recovered, and the following information could be observed:


Mar  1 20:02:41.464: DLSW-ER:Et2/0: New neighbor: master 0000.6c06.0040, neighbor 
0000.6c06.0080
Mar  1 20:02:41.464: DLSW-ER:Et2/0:dm_action_a: Rcvd MP with worse priority from 
0000.6c06.0080
Mar  1 20:02:41.464: DLSW-ER:Et2/0:dm_action_l: LLC2 up for neighbor 0000.6c06.0080
Mar  1 20:02:41.464: DLSW-ER:Et2/0: Sending MC to 0000.6c06.0080
Mar  1 20:02:41.464: DLSW-ER:Et2/0:dm_action_d: Received MA from neighbor 0000.6c06.0080

These statements show the election process that occurred between the master and slave routers and confirm the use of LLC2 to maintain this relationship. The MA frame was an acknowledgement from the slave router (0000.6c06.0080) that the local router (0000.6c06.0040) had become the master. When this operation was complete, a further series of messages passed between the master and slave to identify which router would back up which MAC address. This exchange caters for the case in which there are multiple DLSw+ Ethernet Redundancy routers in the same multicast domain:


Mar  1 20:02:41.464: DLSW-ER: Sending BACKMEUP_REQ  4000.3660.0002 --> 0000.6c04.0000 to 
neighbor 0000.6c06.0080 (61F61014)
Mar  1 20:02:41.464: DLSW-ER:Et2/0: Sending DN to 0000.6c06.0080
Mar  1 20:02:41.464: DLSW-ER:Et2/0:Rcvd BACKMEUP_REQ from 0000.6c06.0080 for map entry 
4000.3660.0001 --> 0000.6c04.0000

Last, when all initial activity was complete, an admin-stop was placed on the circuits that should be owned by the recently recovered DLSw+ Ethernet Redundancy router:


Mar  1 20:02:41.464: DLSW-ER:calling admin_stop for ckt(0000.6c04.0200(4) 
0000.6c04.0000(4)) with lmac 4000.3660.0001
Mar  1 20:02:41.468: DLSW-ER:calling admin_stop for ckt(0000.6c04.0200(8) 
0000.6c04.0000(4)) with lmac 4000.3660.0001
Mar  1 20:02:41.468: DLSW-ER:calling admin_stop for ckt(0000.6c04.0200(C) 
0000.6c04.0000(4)) with lmac 4000.3660.0001
Mar  1 20:02:41.468: DLSW-ER:Et2/0:Rcvd BACKMEUP_ACK from 0000.6c06.0080 for map

Unfortunately, this action caused another session disruption and forced all the circuits back onto the original DLSw+ Ethernet Redundancy target router:


Mar  1 20:02:41.468: DLSW-ER:Sourcing a TestFrame 4000.3660.0002 --> 0000.6c06.0080 on 
Et2/0
Mar  1 20:02:46.660: DLSW-ER:Et2/0: Sending MP Frame
Mar  1 20:02:50.472: DLSW-ER:Et2/0:CG -> 0000.6c06.0080: 0000.6c04.0000:4 
0000.6c04.0200:4
Mar  1 20:02:50.472: DLSW-ER:Et2/0:CSM->MS: CG:OK: 0000.6c04.0000:4 0000.6c04.0200:4
Mar  1 20:02:50.472: DLSW-ER:Et2/0:CG -> 0000.6c06.0080: 0000.6c04.0000:4 
0000.6c04.0200:8
Mar  1 20:02:50.472: DLSW-ER:Et2/0:CSM->MS: CG:OK: 0000.6c04.0000:4 0000.6c04.0200:8
Mar  1 20:02:50.472: DLSW-ER:Et2/0:CG -> 0000.6c06.0080: 0000.6c04.0000:4 
0000.6c04.0200:C
Mar  1 20:02:50.472: DLSW-ER:Et2/0:CSM->MS: CG:OK: 0000.6c04.0000:4 0000.6c04.0200:C
Mar  1 20:02:51.468: DLSW-ER: Sending BACKMEUP_ACK  4000.3660.0001 --> 0000.6c04.0000 to 
neighbor 0000.6c06.0080
Mar  1 20:03:43.529: DLSW-ER:Et2/0:dm_action_j: Rcvd IW <- 0000.6c06.0080 
0000.6c04.0000:4 0000.6c04.0200:4
Mar  1 20:03:43.529: DLSW-ER:Et2/0:dm_action_j: Rcvd IW <- 0000.6c06.0080 
0000.6c04.0000:4 0000.6c04.0200:8
Mar  1 20:03:43.533: DLSW-ER:Et2/0:dm_action_j: Rcvd IW <- 0000.6c06.0080 
0000.6c04.0000:4 0000.6c04.0200:C
Mar  1 20:03:44.529: DLSW-ER:Et2/0:UG -> 0000.6c06.0080: 0000.6c04.0000:4 
0000.6c04.0200:4
Mar  1 20:03:44.529: DLSW-ER:Et2/0:UG -> 0000.6c06.0080: 0000.6c04.0000:4 
0000.6c04.0200:8
Mar  1 20:03:44.533: DLSW-ER:Et2/0:UG -> 0000.6c06.0080: 0000.6c04.0000:4 
0000.6c04.0200:C

Finally, the following output confirmed that all circuits were now available on their specified DLSw+ Ethernet Redundancy router:


nsa-voip-3661-2#sh dls tr c

Interface Et2/0
 Circuit Cache

local addr(lsap)    remote addr(dsap)  state          Owner
0000.6c04.0200(04)  0000.6c04.0000(04) NEGATIVE        0000.6c06.0080
0000.6c04.0200(08)  0000.6c04.0000(04) NEGATIVE        0000.6c06.0080
0000.6c04.0200(0C)  0000.6c04.0000(04) NEGATIVE        0000.6c06.0080
0000.6c04.0200(10)  0000.6c04.0000(04) POSITIVE        SELF
0000.6c04.0200(14)  0000.6c04.0000(04) POSITIVE        SELF
0000.6c04.0200(18)  0000.6c04.0000(04) POSITIVE        SELF
Total number of circuits in the Cache: 6

The Catalyst 6500 Series switch showed how the CAM table behaved while one DLSw+ Ethernet Redundancy router was unavailable. First, the CAM table was correctly populated, with 0000.3660.0001 on port 3/1 and 0000.3660.0002 on port 3/3:


VOIP-6509-2 (enable) sh cam dyn
* = Static Entry. + = Permanent Entry. # = System Entry. R = Router Entry.
X = Port Security Entry
VLAN  Dest MAC/Route Des  [CoS]  Destination Ports or VCs / [Protocol Type]
----- ------------------  -----  -------------------------------------------
399    00-00-36-60-00-02             3/3 [ALL]
399    00-00-36-60-00-01             3/1 [ALL]
399    00-00-36-20-40-00             3/5 [ALL]
399    02-00-6c-06-00-40             3/3 [ALL]
399    02-00-6c-06-00-80             3/1 [ALL]
Total Matching CAM Entries Displayed = 5

Then the 3661-1 router was reloaded, and the CAM entries to this router were deleted (for both real and virtual addresses):


VOIP-6509-2 (enable) sh cam dyn
2002 Jul 30 06:17:37 %PAGP-5-PORTFROMSTP:Port 3/1 left bridge port 3/1

* = Static Entry. + = Permanent Entry. # = System Entry. R = Router Entry.
X = Port Security Entry

VLAN  Dest MAC/Route Des  [CoS]  Destination Ports or VCs / [Protocol Type]
----- ------------------  -----  -------------------------------------------
399    00-00-36-60-00-02             3/3 [ALL]
399    00-00-36-20-40-00             3/5 [ALL]
399    02-00-6c-06-00-40             3/3 [ALL]
Total Matching CAM Entries Displayed = 3

When the master DLSw+ Ethernet Redundancy router took ownership of the MAC address and started responding to the Test frames from the SNA generator, the CAM table was subsequently updated as follows:


VOIP-6509-2 (enable) sh cam dyn
* = Static Entry. + = Permanent Entry. # = System Entry. R = Router Entry.
X = Port Security Entry

VLAN  Dest MAC/Route Des  [CoS]  Destination Ports or VCs / [Protocol Type]
----- ------------------  -----  -------------------------------------------
399    00-00-36-60-00-02             3/3 [ALL]
399    00-00-36-20-40-00             3/5 [ALL]
399    02-00-6c-06-00-40             3/3 [ALL]
399    02-00-6c-06-00-80             3/3 [ALL]
Total Matching CAM Entries Displayed = 4

Note that the virtual MAC addresses appear in the CAM table in their bit-swapped format: 02-00-6c-06-00-40 and 02-00-6c-06-00-80 correspond to 4000.3660.0002 and 4000.3660.0001, respectively.
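
The CAM table changes across the failure can also be compared programmatically. The sketch below assumes the `sh cam dyn` row layout shown above; the helper names `parse_cam` and `diff_cam` are illustrative. It compares the table before the failure with the table after the master took ownership.

```python
import re

# Rows copied from the first and last CAM listings above.
BEFORE = """\
399    00-00-36-60-00-02             3/3 [ALL]
399    00-00-36-60-00-01             3/1 [ALL]
399    00-00-36-20-40-00             3/5 [ALL]
399    02-00-6c-06-00-40             3/3 [ALL]
399    02-00-6c-06-00-80             3/1 [ALL]
"""

AFTER = """\
399    00-00-36-60-00-02             3/3 [ALL]
399    00-00-36-20-40-00             3/5 [ALL]
399    02-00-6c-06-00-40             3/3 [ALL]
399    02-00-6c-06-00-80             3/3 [ALL]
"""

ENTRY = re.compile(
    r"^(\d+)\s+((?:[0-9a-f]{2}-){5}[0-9a-f]{2})\s+(\S+)\s+\[ALL\]$"
)


def parse_cam(text):
    """Return a dict of MAC address -> port from CAM table rows."""
    table = {}
    for line in text.splitlines():
        match = ENTRY.match(line.strip())
        if match:
            table[match.group(2)] = match.group(3)
    return table


def diff_cam(before, after):
    """Return (entries that changed port, entries that disappeared)."""
    moved = {
        mac: (before[mac], after[mac])
        for mac in before
        if mac in after and before[mac] != after[mac]
    }
    removed = sorted(set(before) - set(after))
    return moved, removed


moved, removed = diff_cam(parse_cam(BEFORE), parse_cam(AFTER))
print(moved)    # {'02-00-6c-06-00-80': ('3/1', '3/3')}
print(removed)  # ['00-00-36-60-00-01']
```

The result matches the narrative: the virtual address of the failed router moves to the master's port, and the failed router's real address is no longer present.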

Configurations

The following configurations are samples of those used in the mapping resources design:


3661-2 (master)

dlsw local-peer peer-id 4.4.4.2
dlsw timer explorer-wait-time 10
dlsw remote-peer 0 tcp 2.2.2.2
dlsw transparent switch-support
!
interface Ethernet2/0
 mac-address 0000.3660.0002
 dlsw transparent redundancy-enable 9999.9999.9999 master-priority 1
 dlsw transparent map local-mac 0200.6c06.0040  remote-mac 0000.3620.0000 neighbor 0000.3660.0001

3661-1 (slave)

dlsw local-peer peer-id 6.0.0.1
dlsw timer explorer-wait-time 10
dlsw remote-peer 0 tcp 2.2.2.3
dlsw transparent switch-support
!
interface Ethernet3/0
 mac-address 0000.3660.0001
 dlsw transparent redundancy-enable 9999.9999.9999
 dlsw transparent timers sna 500
 dlsw transparent map local-mac 0200.6c06.0080  remote-mac 0000.3620.0000 neighbor 0000.3660.0002
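
The MAC addresses in these configurations are written in Ethernet bit order, whereas the DLSw+ debug output earlier in this section shows the same addresses bit-swapped (for example, 0200.6c06.0080 appears as 4000.3660.0001). The following minimal sketch converts between the two forms by reversing the bit order within each byte; the function name `bitswap_mac` is illustrative.

```python
def bitswap_mac(mac):
    """Reverse the bit order within each byte of a MAC address.

    Accepts dotted (0000.3620.0000), dashed, or colon-separated input and
    returns the dotted form. Applying the function twice returns the input.
    """
    digits = mac.replace(".", "").replace("-", "").replace(":", "")
    if len(digits) != 12:
        raise ValueError(f"expected 12 hex digits, got {mac!r}")
    swapped = bytes(int(f"{b:08b}"[::-1], 2) for b in bytes.fromhex(digits))
    h = swapped.hex()
    return f"{h[0:4]}.{h[4:8]}.{h[8:12]}"


print(bitswap_mac("0200.6c06.0080"))  # 4000.3660.0001 (slave local-mac)
print(bitswap_mac("0200.6c06.0040"))  # 4000.3660.0002 (master local-mac)
print(bitswap_mac("0000.3620.0000"))  # 0000.6c04.0000 (remote-mac, the host)
print(bitswap_mac("0000.3660.0001"))  # 0000.6c06.0080 (neighbor address)
```

This makes it easier to match a configured address against the circuit cache, the map table, and the debug messages.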

Caveats

Although this configuration is preferred by Cisco, very few customers actually deploy it, mainly because it requires wide-scale changes. For example, when two DLSw+ Ethernet Redundancy routers are in operation, 50 percent of the SNA clients must target one router and 50 percent the other, and reconfiguring the clients accordingly is a large administrative task. At a time when many customers are considering migrating away from traditional SNA technology to IP-based applications, the topology may be seen as an interim measure with a large overhead. Alternatively, or additionally, customers would likely have to change the MAC address on their FEP or CIP connections, which adds further administrative overhead.

Adding Hub Technology

As the DLSw+ Ethernet Redundancy configurations without mappings (that is, the full and partial mesh designs) show, the major restriction is that the CAM table allows only one entry for the destination MAC address, which causes problems with load balancing. However, when a passive hub is placed between the Catalyst 6500 Series switches and the DLSw+ routers, the single CAM entry for the destination MAC address points to the hub port, behind which both routers sit. The switch therefore no longer needs to learn the same MAC address on more than one port.

Topology

Figure 5 shows the topology for DLSw+ Ethernet Redundancy configured with a passive hub.

Figure 5 DLSw+ Ethernet Redundancy with Hub Design

Test Results

All caches were cleared and the DLSw+ circuits reset. The following output could be seen on the DLSw+ master router when all the normal reachability information had been populated:


Jul 23 15:07:12.926: DLSW-ER:Et2/0:CSM->MS: C_INQ:NEW: 0000.6c04.0000:4 0000.6c04.0200:18
Jul 23 15:07:12.926: DLSW-ER:Et2/0:CSM->MS: IW:PENDING: 0000.6c04.0000:4 
0000.6c04.0200:18
Jul 23 15:07:12.926: DLSW-ER:Et2/0:dm_action_j: Rcvd IW <- 0000.6c06.0080 
0000.6c04.0000:4 0000.6c04.0200:18
Jul 23 15:07:12.926: DLSW-ER:Et2/0:CT -> 0000.6c06.0080: 0000.6c04.0000:4 
0000.6c04.0200:18

In the previous output, the DLSw+ master router handled two IWANTIT messages: the first (IW:PENDING) it generated itself, and the second came from the slave router (Rcvd IW). The master router made a decision and sent a CT (Circuit Taken) message back to the slave. The slave router (interface Et3/0) logged the same exchange as follows:


Jul 23 15:07:12.928: DLSW-ER:Et3/0:CSM->MS: C_INQ:NEW: 0000.6c04.0000:4 0000.6c04.0200:18
Jul 23 15:07:12.928: DLSW-ER:Et3/0:IW -> 0000.6c06.0040: 0000.6c04.0000:4 
0000.6c04.0200:18
Jul 23 15:07:12.928: DLSW-ER:Et3/0:CSM->MS: IW:PENDING: 0000.6c04.0000:4 
0000.6c04.0200:18
Jul 23 15:07:12.928: DLSW-ER:Et3/0:dm_action_g: MS->CSM:CT for 0000.6c04.0000:4
0000.6c04.0200:18

This process occurred for all the circuits, with the master making a decision on each one based on the current number of circuits and reachability. A final check of the circuit cache on the master showed the following:


nsa-voip-3661-2#sh dls transparent c

Interface Et2/0
 Circuit Cache

local addr(lsap)    remote addr(dsap)  state          Owner
0000.6c04.0200(04)  0000.6c04.0000(04) POSITIVE        SELF
0000.6c04.0200(08)  0000.6c04.0000(04) NEGATIVE        0000.6c06.0080
0000.6c04.0200(0C)  0000.6c04.0000(04) NEGATIVE        0000.6c06.0080
0000.6c04.0200(10)  0000.6c04.0000(04) POSITIVE        SELF
0000.6c04.0200(14)  0000.6c04.0000(04) NEGATIVE        0000.6c06.0080
0000.6c04.0200(18)  0000.6c04.0000(04) POSITIVE        SELF
Total number of circuits in the Cache: 6
nsa-voip-3661-2#
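
The idea of a master arbitrating each circuit by current load can be sketched as follows. This is a toy model under stated assumptions, not the DLSw+ algorithm: it considers circuit counts only, ignores reachability and message timing, and breaks ties in favor of the first claimant. The function name `arbitrate` is hypothetical. The sketch is not expected to reproduce the exact per-circuit ownership shown in the output above, only the even split.

```python
def arbitrate(claimants, circuit_counts):
    """Award a circuit to the claimant that currently owns the fewest circuits.

    Assumption: ties go to the first claimant in the list.
    """
    winner = min(claimants, key=lambda router: circuit_counts[router])
    circuit_counts[winner] += 1
    return winner


counts = {"master": 0, "slave": 0}
# Both routers claim each of the six circuits (one IWANTIT from each).
owners = [arbitrate(["master", "slave"], counts) for _ in range(6)]

print(owners)
print(counts)  # {'master': 3, 'slave': 3}
```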

The CAM table entries for both routers always pointed to the port where the hub resided (3/7), so failover did not depend on CAM aging timers:


VOIP-6509-2 (enable) sh cam dyn
* = Static Entry. + = Permanent Entry. # = System Entry. R = Router Entry.
X = Port Security Entry

VLAN  Dest MAC/Route Des  [CoS]  Destination Ports or VCs / [Protocol Type]
----- ------------------  -----  -------------------------------------------
399    00-00-36-60-00-02             3/7 [ALL]
399    00-00-36-60-00-01             3/7 [ALL]
399    00-00-36-20-00-00             3/7 [ALL]
399    00-00-36-20-40-00             3/5 [ALL]
Total Matching CAM Entries Displayed = 4

Caveats

There are many advantages to this implementation. For example, configuration is greatly simplified, which is a key consideration, and the design provides the ability to perform perfect load balancing. However, it does mean that passive hubs need to be placed between the Catalyst 6500 Series switches and DLSw+ routers, which results in additional cost. Also, the hub creates a single point of failure. To address this issue, hubs could be daisy-chained, so that the destination MAC address would be available via more than one port again, subsequently bringing spanning tree into operation. In this case, some delay may occur while spanning tree converges.

Conclusion

Cisco would prefer that customers move to a configuration with mappings, because this design provides effective (controlled and accurate) load balancing and fault tolerance. However, this choice might entail a large project, particularly if a customer is on the verge of, or already in the process of, migrating to IP-based SNA applications (via TN3270, HPR/IP, and so on). Changing the MAC address configuration on all SNA clients is a large task, particularly if it is an interim, short-term requirement.

The alternatives described in this document all have caveats. The full and partial mesh designs have issues associated with the CAM table, although the full mesh solution provides load balancing across the remote peers. The hub solution could be considered a step backward, but it resolves the load balancing issue and simplifies the configuration. It also adds a single point of failure into the equation.

Cisco Advanced Services would lean toward either the full-mesh solution or the hub solution if the mapping of resources design is considered inappropriate. Either technique would satisfy most customers' end goal, which is to perform reasonable load balancing across the DLSw+ topology.

Cisco IOS Release 12.2 mainline is the software version associated with DLSw+ Ethernet Redundancy. Although the train is reaching a reasonable level of maturity for these technologies, it should be regularly monitored for new defects.

Finally, all the tests were performed in a controlled laboratory environment with basic routing configurations and Cisco IOS Release 12.2(10a). The SNA traffic was generated by a Cisco IOS Software utility called Downstream Physical Unit (DSPU), which emulates a PU 2-to-host LLC2 session and makes it possible to observe circuits across DLSw+.

Appendix A: Configurations-Session Generators

Initiator

For the standard tests, the initiator configuration was consistent. For the test with the mappings, the MAC addresses were changed to target the bit-swapped logical MAC address on each of the DLSw+ Ethernet Redundancy routers:

dspu host TARGET3 xid-snd 01700003 rmac 0000.3620.0000 rsap 4 lsap 20
dspu host TARGET4 xid-snd 01700004 rmac 0000.3620.0000 rsap 4 lsap 8
dspu host TARGET1 xid-snd 01700001 rmac 0000.3620.0000 rsap 4 lsap 16
dspu host TARGET2 xid-snd 01700002 rmac 0000.3620.0000 rsap 4 lsap 12
dspu host TARGET10 xid-snd 01700010 rmac 0000.3620.0000 rsap 4 lsap 4
dspu host TARGET9 xid-snd 01700009 rmac 0000.3620.0000 rsap 4 lsap 24
!
interface Ethernet0/1
 mac-address 0000.3620.4000
 ip address 6.0.0.2 255.255.255.0
 load-interval 30
 no keepalive
 half-duplex
 dspu enable-host lsap 4
 dspu enable-host lsap 8
 dspu enable-host lsap 12 
 dspu enable-host lsap 16
 dspu enable-host lsap 20
 dspu enable-host lsap 24
 dspu start TARGET3
 dspu start TARGET4
 dspu start TARGET1
 dspu start TARGET2
 dspu start TARGET10
 dspu start TARGET9

Host

The host configuration was as follows:

dspu pu PUNUM3 xid-rcv 01700003
dspu pu PUNUM4 xid-rcv 01700004
dspu pu PUNUM1 xid-rcv 01700001
dspu pu PUNUM2 xid-rcv 01700002
dspu pu PUNUM10 xid-rcv 01700010
dspu pu PUNUM9 xid-rcv 01700009
!
interface Ethernet0/0
 mac-address 0000.3620.0000
 bridge-group 1
 dspu enable-pu lsap 4
!
bridge 1 protocol ieee
!