Transport Diversity: Performance Routing (PfR)

Table Of Contents

Transport Diversity: Performance Routing (PfR)

Preface

Technology Primer

Design and Implementation Considerations

General

Routing Protocol Specific Items

Details

Limitations

Passive and Active Monitoring

Reachability Must Be Verified

Sup720/RSP720 (Earl7) Limitations

Authentication

Process Flow

Principles of Operation

Routing Protocol Interaction

Operational Modes

Network Prefix States

Default

Inpolicy

Out-of-Policy (OOP)

Holddown

Key Concepts

Feature Summary

Best Practices, Tips and Techniques

Load Interval and Bandwidth

Solution Overview

Internet Content Server

Design Requirements and Considerations

Scalability Considerations

Prefix Management

Scalability and Performance Results

Performance Results Summary

Topology

Traffic Profile

Software Release

Tested Configuration

Cisco 7200VXR NPE-G2 as Master Controller

Cisco RSP720 as Master Controller

Cisco 3845 as Master Controller

Troubleshooting

Standby Master Controller

Operational Overview

Topology

Authentication

Master Controller Configuration

Summary

WAN Hub: Dual MPLS Service Providers

Design Requirements and Caveats

Scalability Considerations

Scalability and Performance Results

Performance Results Summary

Topology

Traffic Profile

Software Release

Load Sharing Performance Results

Latency Optimization Performance Results

Tested Configuration

Summary

Branch/SOHO VPN Deployment

Design Requirements and Considerations

Design Limitations

Scalability Considerations

Topology

Delay Generation

VoIP Quality Verification

One-Way Delay

Jitter

Configuration Examples

Troubleshooting

show oer master appl detail

Syslog File

Policy Routing of Application(s)

Summary

Branch VPN Deployment with Cisco Wide Area Application Services (WAAS)

Design Requirements and Considerations

General Topology

Failure Situation

Parent Routes

Recovery

Policy Routing

Design Limitations

Topology with WAAS Network Module

Test Results

TCP Connection Failures

Scalability Considerations

Policy-Based Routing

Router CPU Consumption

Configuration Example—Single Branch Router with WAAS module

Dual Branch Router with WAAS Appliance

Topology Including WAAS Appliance

Test Results

Branch WAAS Compressions Ratios

OER Master State Change

Syslog Output

Configuration Example—Dual Branch Routers with WAAS Appliance

Primary Master Controller and Border Router

Standby Master Controller and Border Router

Branch WAAS Appliance

Campus WAAS Appliance

Campus WAAS Central Manager

Troubleshooting

Application Monitoring with oer-maps

Summary

Troubleshooting

DMVPN and EIGRP Integration

Routing Changes Outside of OER Control

OER Probes and External Interfaces

Passive Monitoring Caveats

Passive Mode Example

Out-of-Policy (OOP) Example

Appendix

References

Acknowledgements

Classless Inter-Domain Routing (CIDR) to Dotted Decimal Notation

Reference Configuration for Load Balancing

Caveats

Cisco Validated Design


Transport Diversity: Performance Routing (PfR)


Cisco Validated Design

September 11, 2008

Preface

Transport diversity is a general term for selecting or preferring a network exit point for end-user application traffic across network topologies that have a variety of characteristics. These characteristics include monetary cost, reliability or availability, available bandwidth, and latency.

One example of transport diversity is a branch office environment that has a primary path using Frame Relay and a backup or alternate path using basic rate ISDN. The importance of diversity is evident in Frame Relay outages that affected over 6,000 customers following a series of events that included a software upgrade of a Frame Relay switch. Enterprise customers who relied solely on Frame Relay for their branch office connectivity may have experienced outages lasting several hours or days. Enterprise customers who deployed branches with a primary link provisioned as Frame Relay and a backup link using basic rate ISDN were able to maintain branch office connectivity throughout the network failure. In this form of WAN diversity, path decisions are driven by link failure.

As WAN technologies advance and mature, the concept of transport diversity also advances to include path selection over "always on" links like cable, DSL, wireless broadband, or satellite. It is now economically feasible to maintain multiple dedicated WAN transport links, because there is no variable cost structure or dial-up delay as there is with ASYNC or ISDN dial.

Performance Routing (PfR), then, is the general term for features that take diverse WAN characteristics into account and make an informed decision on the best path to reach a network or application, given multiple choices that may have varied performance characteristics. PfR by its nature takes into account network performance (delay, loss, and link loading), whereas traditional routing protocols typically rely solely on cost (total bandwidth) once reachability exists across a WAN link, that is, once there is a neighbor relationship between routers.

Among Interior Gateway Protocols (IGPs), Open Shortest Path First (OSPF) uses a single metric component, cost, which is based on the bandwidth of the link. Enhanced Interior Gateway Routing Protocol (EIGRP) is slightly more aware of the link characteristics in that it calculates a metric based on cumulative delay (delay is simply an arbitrarily assigned value) and the minimum bandwidth value encountered between the source and destination. The only commonly used Exterior Gateway Protocol (EGP), Border Gateway Protocol (BGP), by default uses the number of hops (a hop being all routers within an autonomous system (AS)) to determine the best path to the destination network address.

With both IGP and EGP protocols, the concept of transport diversity means equal or unequal cost load-sharing through the use of the routing protocols such as Routing Information Protocol (RIP), EIGRP, or OSPF and through external BGP Multipath (maximum-paths n) to insert multiple routes for a destination network address into the routing table.

The concept of load sharing is often associated with the capabilities of a routing protocol; however, the routing protocol only serves to inject more than one route into the IP routing table. Once routes are in the routing table, it is the function of the switching path (process, fast, or Cisco Express Forwarding (CEF)) to actually accomplish a degree of load sharing or load balancing.

Load balancing is the term used to describe two or more links that are used equally between two sites. However, to achieve an equitable distribution across the links when the number of flows is small, per-packet load balancing is usually required. As an example, consider a file transfer using FTP. With such a single large flow between the two sites, fast or CEF switching uses only one of the links, because fast switching selects an exit based on the destination IP address and CEF switching selects an exit per source and destination IP address pair. In either case, only one link is used unless CEF per-packet load balancing is enabled.


Tip In most cases, as the number of flows between two source and destination networks increases, so does the ability of any load sharing mechanism to distribute packets more equally across multiple links. Per-packet load sharing can address load sharing with a single flow or a few flows, but at the cost of increasing the likelihood of packets arriving out of sequence, which introduces inefficiencies.
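The difference between per-flow and per-packet distribution can be illustrated with a short Python sketch. It is a simplified model only: the CRC32 hash, the link names, and the host addresses are assumptions made for illustration and do not represent the actual CEF hashing algorithm.

```python
import zlib
from collections import Counter
from itertools import cycle

# Hypothetical exit links; names are illustrative only.
LINKS = ["link-A", "link-B"]


def per_flow_link(src: str, dst: str) -> str:
    """Pick a link from a hash of the source/destination pair.

    Every packet of the same flow hashes to the same link. CRC32 is a
    stand-in hash, not the algorithm a router uses.
    """
    key = f"{src}->{dst}".encode()
    return LINKS[zlib.crc32(key) % len(LINKS)]


def share_per_flow(packets):
    """Count packets per link when each flow is pinned to one link."""
    return Counter(per_flow_link(src, dst) for src, dst in packets)


def share_per_packet(packets):
    """Count packets per link when packets alternate links round-robin."""
    rr = cycle(LINKS)
    return Counter(next(rr) for _ in packets)


# A single large FTP-like flow: 1000 packets between the same two hosts.
single_flow = [("10.1.1.10", "10.2.2.20")] * 1000
per_flow_counts = share_per_flow(single_flow)      # all packets on one link
per_packet_counts = share_per_packet(single_flow)  # packets split evenly
```

With a single flow, the per-flow model places all 1000 packets on one link, while the per-packet model splits them evenly, at the price of possible reordering on a real network.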


Complicating path selection is the overlay of logical interfaces (IPSec tunnels, for example), which means that path selection must be addressed inside the tunnel. The tunnel destination endpoint may also have multiple paths between source and destination. The V3PN: Redundancy and Load Sharing Design Guide (http://www.cisco.com/en/US/netsol/ns742/networking_solutions_program_category_home.html) was written to assist the network manager in implementing IPSec encryption in the presence of multiple paths or dial-up connections to provide a higher degree of availability. As a general recommendation, load sharing inside the tunnel interface and configuring the tunnel with an affinity to a particular physical interface will provide the best results.

PfR is a technology used to improve on the capabilities of routers and routing protocols to make more granular and intelligent decisions on injecting routes into the routing table so application performance can be optimized to meet the needs of the end-user applications.

Technology Primer

As with any emerging technology, basic features and capabilities are initially implemented in Early Deployment (ED) releases of the Cisco IOS and supported on the most commonly used hardware platforms. As the technology is adopted, customer feedback is used to enhance the capability of the existing features and add new features as well as support additional product lines. Performance Routing (PfR) is no exception to this implementation life cycle.

PfR is Cisco's strategy for advanced route optimization. Optimized Edge Routing (OER) was designed to provide route optimization to destination IP prefixes. PfR leverages OER technology to provide application route optimization and other application services. In this document, references to OER should be in the context of a subset of the broader subject of PfR.

OER was initially targeted at Internet and WAN reliability, specifically the issue where the routing protocol, typically BGP to an Internet service provider (ISP), provides network reachability vectors but does not address transient connectivity failures (brownouts) or offer load-sharing based on measured network performance. Additionally, routing protocols like BGP are not aware of the monetary cost of links that may incur a per-byte or per-packet fee. Some links have both a fixed cost and a variable cost structure. In other words, there may be a monthly charge for the link and some additional charge per byte, or additional charges once some threshold (or usage tier) is reached.

Enterprise customers use the Internet extensively for electronic commerce, and often the entire business model is based on sales of product through their Internet portal. Network managers wanted some means of controlling the exit point of their traffic to optimize network performance for their users, but without tools like OER, the solution was to purchase network connectivity from as many ISP networks as practical and hope that the best path to a user was through the ISP that offered the shortest autonomous system (AS) path. With OER, metrics like delay can also be used to determine the best path, rather than relying only on the length of the AS path advertised by the respective ISPs.


Tip BGP chooses, by default, the best path based on the fewest autonomous systems between the source and destination. OER, on the other hand, can influence traffic based on reachability, delay, loss, jitter, throughput, load, monetary cost, and even mean opinion score (MOS).


OER uses various Cisco IOS capabilities, such as NetFlow and IP SLA, to create these advanced metrics for best path selection to improve the user experience.

Design and Implementation Considerations

This section provides an overview of the design and implementation considerations the network manager must take into account when implementing OER.

General

In any OER implementation, a master controller (MC) and at least one border router (BR) must be configured. The MC commands and controls the BRs and maintains a central repository for the data collected by the BRs. BRs are in the user traffic switching path. BRs collect data from their NetFlow cache and the IP SLA probes they generate, provide a degree of aggregation of this information, and influence the packet switching path to manage user traffic. The MC communicates with the BRs over an authenticated TCP socket, but has no requirement for populating its own IP routing table with anything more than a route to reach the BRs.

Because OER is a path selection technology, there must be at least two external interfaces under the control of OER and at least one internal interface. There must be at least one BR configured. If there is only one BR configured, then both external interfaces are attached to the single BR. If more than one BR is configured, then the two or more external interfaces are configured across these BRs. External links, or exit points, are therefore owned by the BR; they may be logical (tunnel interfaces) or physical links.

The MC function can be collocated (configured) on the same router as the BR, or it can be a dedicated, standalone chassis. The MC is the decision maker. Typically, at a headend campus location, the MC is a standalone chassis while at branch locations the MC is collocated (configured) on the same chassis as the BR. As a general rule, the headend campus location manages more network prefixes and/or applications than a branch deployment and thus consumes more CPU and memory resources for the MC function. Therefore, it makes a good design practice to dedicate a chassis for the MC at the headend campus. The branch typically manages fewer network prefixes and/or applications and due to the costs associated with dedicating a chassis at each branch, the network manager can collocate the MC and BR on the same chassis.


Tip If there are two distinct BRs, only one is configured as the MC. If there are two external interfaces on one branch BR and a third external interface on a separate BR, the MC should be configured on the BR with the two external interfaces. This way, should the BR with the single exit fail, the surviving BR/MC has two functional exits to meet the requirement for at least one internal and two external exits.


Routing Protocol Specific Items

OER can learn prefixes dynamically through the traffic statistics from the NetFlow cache. Both TCP and non-TCP traffic can be learned based on highest throughput. Delay learning is limited to TCP traffic, but throughput can be calculated for non-TCP traffic. Network prefixes can be manually defined without configuring learning, or prefixes can be both learned dynamically and configured statically. In any of these use cases, a parent route is required to manage a network prefix or application. Parent routes are routes injected into the routing table by either eBGP or static routes, which OER then augments with more specific routes (or with policy-based routing (PBR)) to manage traffic across the external interfaces. It follows that the parent routes must be of equal cost and administrative distance so that more than one path for the parent route exists in the routing table of the border router at the same time.

OER learns prefixes that fall under a parent route. The least specific parent route is a default route (0/0); more specific networks and masks may also be configured. For example, 10.0.0.0/8 could be used as a parent route. The learning of network prefixes that fall within the parent route is a function of NetFlow. NetFlow is enabled automatically by OER; however, it does not appear in the running or startup configuration.

In the current implementation of OER, external BGP or static routes can serve as parent routes with external interfaces being point-to-point or multipoint interfaces (Ethernet) with a single next hop. In other words, multipoint GRE interfaces (as with a DMVPN configuration) that have multiple next hops reachable from the mGRE interface are not supported. Additionally, Ethernet interfaces with multiple next hops, which are a common BGP peering deployment topology, are not currently supported.

IP routing is not required on the MC; it simply must be able to communicate with the BRs. The MC may be protected by a firewall or access control lists. The MC and BRs communicate with each other on TCP port 3494 by default, but this is configurable. The MC listens on TCP port 3494 and the BRs initiate the TCP connection.

Details

By default, OER manages external interfaces by priority of WAN performance (delay), then loading (utilization). This means that one exit point may be more fully used than another if that exit point exhibits lower (better) application latency (delay) than the other exit point.
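The default ordering described above (delay first, then utilization) can be sketched as a simple sort key. This is a minimal illustration under stated assumptions: the Exit class, the exit names, and the measured values are hypothetical, and the sketch ignores the thresholds and other policy settings that OER applies in practice.

```python
from dataclasses import dataclass


@dataclass
class Exit:
    """Hypothetical summary of one external interface (exit point)."""
    name: str
    delay_ms: float      # measured application delay over this exit
    utilization: float   # link loading as a fraction, 0.0 to 1.0


def best_exit(exits):
    """Prefer the lowest delay; break ties with the lowest utilization."""
    return min(exits, key=lambda e: (e.delay_ms, e.utilization))


# The busier exit still wins because its delay is lower.
exits = [Exit("Tu100", 40.0, 0.80), Exit("Tu200", 95.0, 0.10)]
chosen = best_exit(exits)

# With equal delay, the less loaded exit is selected.
tied = [Exit("Tu100", 40.0, 0.80), Exit("Tu200", 40.0, 0.30)]
chosen_on_tie = best_exit(tied)
```

In this sketch Tu100 is chosen in the first case even though it carries more load, which mirrors the statement that one exit point may be more fully used than another.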


Note OER is designed to optimize end-to-end application performance, not simply WAN load balancing. Historically, routing protocols were geared to load sharing across multiple links in hopes of providing better application performance, but load sharing is link (or hop) specific. OER can deduce end-to-end application performance and optimize the exit point to achieve optimal application performance across the internetwork.


In learn mode, delay (and reachability) is determined by observing TCP flows. Round trip delay is determined by the amount of delay observed in TCP flows during session setup, that is, the first two exchanges of the TCP three-way handshake. The client active open is a TCP SYN to the server. In response, the server replies with a SYN-ACK. This level of visibility into TCP flows is obtained by observing flows (through the NetFlow cache) traversing the border routers.
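The handshake-based measurement described above can be sketched in Python. The event tuples, flow identifiers, and timestamps below are hypothetical; the sketch only shows the arithmetic of pairing a SYN with its SYN-ACK and is not a description of the NetFlow implementation.

```python
def handshake_rtt_ms(events):
    """Return round trip delay per flow from SYN / SYN-ACK timestamps.

    events: iterable of (timestamp_seconds, flow_id, tcp_flag) tuples,
    in time order. A flow with no SYN in the observed window yields no
    measurement, even if it carries traffic.
    """
    syn_seen = {}
    rtts = {}
    for ts, flow, flag in events:
        if flag == "SYN":
            syn_seen.setdefault(flow, ts)           # first SYN of the flow
        elif flag == "SYN-ACK" and flow in syn_seen and flow not in rtts:
            rtts[flow] = round((ts - syn_seen[flow]) * 1000, 3)
    return rtts


observed = [
    (0.000, "flow-1", "SYN"),
    (0.045, "flow-1", "SYN-ACK"),   # 45 ms after the SYN
    (0.050, "flow-1", "ACK"),
    (1.000, "flow-2", "ACK"),       # long-lived flow, handshake not observed
]
rtts = handshake_rtt_ms(observed)
```

Here flow-1 yields a delay sample of about 45 ms, while flow-2 yields none because its handshake occurred outside the observed window.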


Tip When OER is configured in passive learn mode, TCP flows must be observed by the border routers to manage prefixes. This means that to test OER in a lab environment, one tool to generate actual TCP/UDP traffic and another to introduce delay, loss, and so on are necessary to observe meaningful results.


Limitations

There are limitations that the network manager must be aware of in order to successfully implement OER. Cisco Express Forwarding (CEF) must be enabled on all border routers. Up to 10 border routers and a total of 20 external interfaces are supported per master controller. If using BGP as parent routes, the border routers must have external BGP neighbors on directly connected interfaces. That neighbor cannot be an iBGP neighbor, although the border router(s) must be iBGP neighbors with other routers in the network to advertise the exit point of the OER managed routes. Static routes are supported as parent routes.
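The interface and scale limits stated in this guide (at least one border router, at least two external interfaces and one internal interface under OER control, and no more than 10 border routers and 20 external interfaces per master controller) can be checked against a planned design with a short script. The following Python sketch is illustrative only; the data layout, router names, and interface names are assumptions.

```python
MAX_BORDER_ROUTERS = 10        # per master controller, as stated in the text
MAX_EXTERNAL_INTERFACES = 20   # per master controller, as stated in the text


def check_oer_topology(border_routers):
    """Return a list of problems found in a planned OER topology.

    border_routers: dict mapping a border router name to a dict with
    "external" and "internal" interface lists (hypothetical layout).
    """
    problems = []
    n_ext = sum(len(br["external"]) for br in border_routers.values())
    n_int = sum(len(br["internal"]) for br in border_routers.values())
    if not border_routers:
        problems.append("at least one border router is required")
    if len(border_routers) > MAX_BORDER_ROUTERS:
        problems.append("more than 10 border routers")
    if n_ext < 2:
        problems.append("at least two external interfaces are required")
    if n_ext > MAX_EXTERNAL_INTERFACES:
        problems.append("more than 20 external interfaces")
    if n_int < 1:
        problems.append("at least one internal interface is required")
    return problems


# A single border router owning both exits satisfies the minimum.
ok_design = {"br1": {"external": ["Tunnel100", "Tunnel200"],
                     "internal": ["FastEthernet0"]}}
# One exit only: OER has no alternative path to select.
bad_design = {"br1": {"external": ["Tunnel100"],
                      "internal": ["FastEthernet0"]}}
ok_problems = check_oer_topology(ok_design)
bad_problems = check_oer_topology(bad_design)
```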

Depending on the BGP configuration, the use of the maximum-paths 2 command may be required to insert more than one BGP-learned network prefix in the routing table. The Cisco IOS hidden command bgp bestpath as-path multipath-relax is also used for this purpose. This feature is introduced by enhancement CSCea19918 - BGP: need to do multipath with different as-paths.

EIGRP/OSPF learned routes can satisfy the parent route requirement in future Cisco IOS Releases incorporating CSCsk39768 - PfR-EIGRP integration or CSCsm34644 PfR-OSPF integration.

NAT/pNAT compatibility has been added as of the Cisco IOS Release 12.4(15)T; however, NAT/pNAT is not tested in this design guide. The number of network prefixes able to be managed by OER is discussed in the Internet Content Server section.

The use of multipoint interfaces (mGRE) and multiple next hop addresses is not currently supported. The tracking number CSCsi69186 provides additional information regarding future release integration.

Passive and Active Monitoring

Passive monitoring is the act of OER gathering information on user packets assembled into flows by NetFlow. OER, when enabled, automatically enables NetFlow on the managed interfaces on the border routers. By aggregating this information on the border routers and periodically reporting the collected data to the master controller, the network prefixes and applications in use can automatically be learned. Additionally, attributes like throughput, reachability, loading, packet loss, and latency can be deduced from the collected flows.

Active monitoring is the act of generating IP SLA probes as test traffic to obtain information about the characteristics of the WAN links. Active probes can either be implicitly generated by OER when passive monitoring has identified destination hosts, or explicitly configured by the network manager in the OER configuration. An example of configuring an explicit IP SLA jitter probe is shown in Branch/SOHO VPN Deployment.

Reachability Must Be Verified

For OER to consider an exit interface as a candidate for traffic, reachability to the target network prefix must be verified. When OER is configured in passive mode (mode monitor passive), TCP flows must be present across the exit interface for OER to verify reachability across that exit. Note that a parent route needs to be present to direct traffic for a target network out the external interfaces, so that the NetFlow subsystem can verify reachability through the TCP flows. Given this, if there is no TCP traffic out an exit interface, no passive measurements are available to NetFlow/OER. Likewise, if there are long-lived TCP flows (flows lasting longer than the OER monitor period), no TCP SYN and TCP SYN-ACK are seen during the monitor period. In this case, traffic may be active, but because the TCP SYN and TCP SYN-ACK are not seen during the monitor period, no delay and reachability can be deduced from the long-lived flow.


Tip Passive monitoring of delay, loss, and reachability relies on OER observing the NetFlow-reported TCP traffic over an exit interface. OER can learn prefixes based on throughput for non-TCP flows. Excluding VoIP, which is UDP-based, TCP-based applications represent the largest share of traffic on the Internet and most enterprise networks.


For OER to function optimally in passive monitor mode, more TCP flows equate to more data points for the master controller to analyze and manage. As the number of TCP flows increases, the database becomes more granular, meaning more delay, loss, and reachability information is available for a given network prefix.

Passive Mode Example illustrates the need for traffic to be observed by NetFlow over more than one exit interface when mode monitor passive is configured.

Sup720/RSP720 (Earl7) Limitations

Because of architectural limitations with the NetFlow implementation on the EARL 7 (PFC3)-based hardware present on supervisor engines of the 6500/7600 series, OER cannot determine performance (delay/loss/reachability) characteristics from passive monitoring of TCP flows. Passive throughput is supported by EARL 7. Throughput is the calculation of the number of packets output from the external interfaces of the OER border routers over a unit of time, usually represented as a rate per second (as in megabits per second). Therefore, throughput is synonymous with using OER to manage for load sharing.

More information regarding this limitation is available in the Internet Content Server section of this document.

Authentication

Communication between the master controller and border routers must be authenticated through a referenced key chain, and both sides must share the same key string. In the following configuration examples, assume that the border router and master controller, whether collocated on the same chassis or on separate chassis, reference a key chain in their respective configuration files with the same key string. An example follows:

!
! Example of key chain with master controller
! and border router on the same device
!
interface Loopback0
 ip address 10.0.0.1 255.255.255.255
!
!
key chain BLUE
 key 10
   key-string 7 0035262F277034241D2E5B40
!   
!         
oer master
 !        
 border 10.0.0.1 key-chain BLUE
!         
oer border
 master 10.0.0.1 key-chain BLUE
!         
end

Warning OER authentication fails with a key string greater than 15 bytes. See CSCsd00633 for more details.


Process Flow

The OER configuration section in the Cisco IOS command line interface provides a means to be very granular in selecting which types of applications or network prefixes are targeted for performance routing. Additionally, the policy associated with each application or network prefix can differ from the default policy and can be specific to the network prefixes or applications identified in the configuration.

To better understand this process, one sample configuration is provided (see Figure 1) to demonstrate how a network manager may configure a master controller policy to identify four remote branch networks and apply different policies through three separate OER maps. Note that the second OER map identifies two network prefixes while the first and third identify a single prefix.

The OER master configuration section is parsed through the policy-rule reference to OER map FOO. The sequenced references to FOO are parsed and the distinct policy is associated with the selected addresses identified in the prefix-list. Upon completion of parsing the oer-maps, the remainder of the global OER master configuration is parsed. In this case, prefix learning (learn is referenced under the oer master construct) is configured, meaning that this OER master configuration identifies network prefixes both through explicitly configured addresses and through learning prefixes based on traffic identified by NetFlow.

Figure 1 demonstrates this process flow.

Figure 1 Process Flow for OER Configuration

Viewed graphically, an OER policy is analogous to a container or bucket that holds match and set statements. These statements are evaluated in order of the OER map sequence numbers; once the OER map is completely evaluated, the global OER master configuration statements are evaluated. This is shown in Figure 2.
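The evaluation order described above can be modeled in a few lines of Python. This is a sketch under stated assumptions, not the IOS parser: the policy keys, threshold values, and branch prefixes are hypothetical, and the sketch assumes that the lowest matching sequence number supplies the set values, with the global configuration filling in anything the map entry does not set.

```python
import ipaddress

# Hypothetical global policy values standing in for the oer master section.
GLOBAL_POLICY = {"delay_ms": 500, "loss_ppm": 10000}

# Hypothetical OER map FOO: (sequence, matched prefixes, set statements).
# Three sequences cover four branch networks, as in the example in the text.
OER_MAP_FOO = [
    (20, ["10.2.0.0/24", "10.3.0.0/24"], {"delay_ms": 200}),
    (10, ["10.1.0.0/24"], {"delay_ms": 100}),
    (30, ["10.4.0.0/24"], {"loss_ppm": 5000}),
]


def effective_policy(prefix, oer_map, global_policy):
    """Return (policy, sequence) for a prefix.

    Map entries are checked in ascending sequence order. Set statements
    from the matching entry override the global values; prefixes that
    match no entry receive the global policy and a sequence of None.
    """
    target = ipaddress.ip_network(prefix)
    for seq, prefixes, sets in sorted(oer_map, key=lambda entry: entry[0]):
        if any(target == ipaddress.ip_network(p) for p in prefixes):
            return {**global_policy, **sets}, seq
    return dict(global_policy), None


policy_branch3, seq_branch3 = effective_policy("10.3.0.0/24", OER_MAP_FOO, GLOBAL_POLICY)
policy_learned, seq_learned = effective_policy("10.9.0.0/24", OER_MAP_FOO, GLOBAL_POLICY)
```

In this model a prefix matched by sequence 20 inherits the global loss value but uses the delay value set by the map, while a learned prefix that matches no sequence is governed by the global policy alone.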

Figure 2 OER Policy

The policies in effect can be shown by using the show oer master policy command. Additionally, the show oer master prefix n.n.n.n/n policy command can be used to display the policy in effect for a particular prefix. An example of the output of this command is shown in Displaying the Policy for a Prefix.

Principles of Operation

This section examines the principles of operation for the OER sub-system within the Cisco IOS and how it interacts with other subsystems including the IP routing table, BGP, NetFlow, and IP SLA.

Routing Protocol Interaction

This section describes how OER functions in a basic configuration using passive monitoring and prefix learn mode. This is the simplest means to enable OER, and it relies on NetFlow data of TCP sessions to provide this function.

OER can use either static routes or BGP to provide parent routes; each method is shown. OER is configured to control routes (route control mode), rather than simply to observe.

Static Routing

First, look at a simple configuration in Figure 3 where there is one OER border router with a single internal interface and two external interfaces. The OER master controller is shown as a separate router in this configuration, but it could also be configured on the same chassis as the OER border router.

A single campus switch is shown in the topology (see Figure 3). Because both exits are on the same chassis, the Layer 3 campus switch routes all packets to the OER border router, allowing it to make the exit interface decision off its own IP routing table. In this example, OER is simply influencing static routes in the IP routing table and these statics do not need to be redistributed into an IGP as there is only one Layer 3 campus switch in the topology.

Figure 3 Static Routing

The principles of operation for OER in this topology are described as follows:

Parent routes (static routes pointing out the external interfaces) are injected into the IP routing table as equal cost routes to the destination network(s). These routes are manually configured and present in the startup/running configuration.

IP CEF switches user traffic (packets) using these equal cost parent routes out the OER external interfaces. CEF switching is enabled by default and is required for OER to function.

NetFlow, enabled automatically and transparently by OER, captures the resulting flow data from packets using the exit points.

The OER border router reports this learned flow data to the OER master for analysis.

When the OER master controller detects traffic out-of-policy, it instructs the OER border router to inject a static route directly to the IP routing table. This directs out-of-policy traffic through a new path to reach the destination network.

By default, OER injects the static into the IP routing table as a /24 network prefix. This length is configurable. However, the key point is that OER is influencing network traffic through a prefix with a longer mask than the parent route. For example, route control of /24 prefixes may be sufficient for Internet load-sharing policies, but /24 is too short for branch office load-sharing if the branch has a /24 or longer subnet design.
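The relationship between a destination host, the controlled prefix, and the parent route can be sketched with the Python ipaddress module. The function name and the host address are hypothetical; the sketch only shows the prefix arithmetic (aggregate to a configurable length, then confirm the result is more specific than the parent route that covers it).

```python
import ipaddress


def controlled_prefix(host, parent, prefix_len=24):
    """Aggregate a destination host to a prefix OER could control.

    Returns the network of length prefix_len containing the host, or
    None when that network is not a more specific route that falls
    under the parent route.
    """
    parent_net = ipaddress.ip_network(parent)
    net = ipaddress.ip_network(f"{host}/{prefix_len}", strict=False)
    if prefix_len <= parent_net.prefixlen or not net.subnet_of(parent_net):
        return None
    return net


# Default /24 aggregation under the 10.0.0.0/8 parent route.
default_len = controlled_prefix("10.16.151.37", "10.0.0.0/8")
# A longer mask, as might suit a branch with a /24 or longer subnet design.
longer_len = controlled_prefix("10.16.151.37", "10.0.0.0/8", prefix_len=28)
# A host with no covering parent route cannot be managed.
no_parent = controlled_prefix("192.0.2.1", "10.0.0.0/8")
```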

The following output illustrates the relationship between the OER master prefix database and the IP routing table. There are two parent routes in the routing table: 10.0.0.0/8 and 64.102.0.0/16.

ip route 10.0.0.0 255.0.0.0 10.81.7.225 30 tag 300 name OER_parent
ip route 10.0.0.0 255.0.0.0 10.81.7.193 30 tag 300 name OER_parent
ip route 64.102.0.0 255.255.0.0 Tunnel200 tag 300 name OER_parent
ip route 64.102.0.0 255.255.0.0 Tunnel100 tag 300 name OER_parent
joeking-vpn-1811#show ip route static
     64.0.0.0/8 is variably subnetted, 2 subnets, 2 masks
S       64.102.0.0/16 is directly connected, Tunnel200
                      is directly connected, Tunnel100
S       64.102.223.16/28 [1/0] via 192.168.2.1
     10.0.0.0/8 is variably subnetted, 4 subnets, 3 masks
S       10.0.0.0/8 [30/0] via 10.81.7.225
                   [30/0] via 10.81.7.193

The following output identifies prefixes that are currently being controlled by OER and are in the INPOLICY state. This example follows 10.16.151.0/24.


joeking-vpn-1811#show oer master prefix learned                     
OER Prefix Statistics:
 Pas - Passive, Act - Active, S - Short term, L - Long term, Dly - Delay (ms),
 P - Percentage below threshold, Jit - Jitter (ms), 
 MOS - Mean Opinion Score
 Los - Packet Loss (packets-per-million), Un - Unreachable (flows-per-million),
 E - Egress, I - Ingress, Bw - Bandwidth (kbps), N - Not applicable
 U - unknown, * - uncontrolled, + - control more specific, @ - active probe all
 # - Prefix monitor mode is Special, & - Blackholed Prefix
 % - Force Next-Hop, ^ - Prefix is denied

Prefix                  State     Time Curr BR         CurrI/F         Protocol
                      PasSDly  PasLDly   PasSUn   PasLUn  PasSLos  PasLLos
                      ActSDly  ActLDly   ActSUn   ActLUn      EBw      IBw
                      ActSJit  ActPMOS
--------------------------------------------------------------------------------
...
joeking-vpn-1811#show oer master prefix learned | inc INPOLICY      
64.102.16.0/24          INPOLICY        0 10.81.7.73      Tu100           STATIC  
64.102.4.0/24           INPOLICY        0 10.81.7.73      Tu200           STATIC  
64.102.6.0/24           INPOLICY        0 10.81.7.73      Tu100           STATIC  
10.16.151.0/24          INPOLICY        0 10.81.7.73      Tu100           STATIC  
64.102.31.0/24          INPOLICY        0 10.81.7.73      Tu100           STATIC  

In the display above, the protocol shown is STATIC, meaning a static route has been injected into the IP routing table. Reviewing the IP routing table:


joeking-vpn-1811#show ip route 10.16.151.0 255.255.255.0
Routing entry for 10.16.151.0/24
  Known via "static", distance 1, metric 0
  Tag 5000
  Routing Descriptor Blocks:
  * 10.81.7.225
      Route metric is 0, traffic share count is 1
      Route tag 5000

Note that the next hop is identified from the value specified for the parent route (next hop is IP address 10.81.7.225) and that its route tag is 5000. OER by default uses a route tag value of 5000.


Note OER injects static routes into the running configuration. They are not in the startup Cisco IOS configuration.


Looking at all static routes in the IP routing table, there are two OER parent routes, the route to 64.102.0.0/16 and 10.0.0.0/8. Note that in the first case, the next hop is specified by the logical interface name (Tunnel200 and Tunnel100) and for the second case, the next hop is specified by IP address.


joeking-vpn-1811#show ip route static
     64.0.0.0/8 is variably subnetted, 7 subnets, 3 masks
S       64.102.6.0/24 [1/0] via 0.0.0.0, Tunnel100
S       64.102.4.0/24 [1/0] via 0.0.0.0, Tunnel200
S       64.102.0.0/16 is directly connected, Tunnel200
                      is directly connected, Tunnel100
S       64.102.19.0/24 [1/0] via 0.0.0.0, Tunnel100
S       64.102.16.0/24 [1/0] via 0.0.0.0, Tunnel100
S       64.102.31.0/24 [1/0] via 0.0.0.0, Tunnel100
S       64.102.223.16/28 [1/0] via 192.168.2.1
     10.0.0.0/8 is variably subnetted, 5 subnets, 4 masks
S       10.0.0.0/8 [30/0] via 10.81.7.225
                   [30/0] via 10.81.7.193
S       10.16.151.0/24 [1/0] via 10.81.7.225

Now with this /24 route in the IP routing table, user traffic for 10.16.151.0 is directed out one of the two exits.

BGP Routing

Using BGP as a source for parent routes is also an option. Compared with the principles of operation described in the previous section, only one item changes: how OER injects a route to influence the overall path selection. When BGP is configured, OER injects a network prefix and mask into the BGP table, not the IP routing table. In turn, these BGP routes are advertised to the other BGP routers and are installed in the routing table through the BGP selection process.

In Figure 4, the topology is modified slightly to have two border routers, each with one external interface.

Figure 4 OER and BGP Routing

These OER border routers are external BGP (eBGP) peers with their respective ISPs, while the Layer 3 campus switch and the two OER border routers are iBGP peers. In the previous example, the Layer 3 campus switch needed no dynamic routing protocol because all packets were forwarded to the single OER border router. Now the iBGP session between the Layer 3 campus switch and the two OER border routers is used to advertise the OER-managed prefixes, which are injected into the BGP table rather than directly into the IP routing table, to influence a subset of the total traffic. The BGP routing process then scans the BGP table and inserts routes from the BGP table into the IP routing table of both OER border routers as well as that of the Layer 3 campus switch.


Note In this topology, OER could instead use static parent routes and redistribute the OER static routes that are injected into the IP routing table of the OER border routers into a dynamic IGP routing protocol such as OSPF, RIP, or EIGRP. This example, however, is intended to show the use of BGP for parent routes.


To reiterate, the eBGP sessions provide the OER parent routes. Their existence in the IP routing table, along with IP CEF, NetFlow, and the reporting by the OER border routers to the OER master controller, causes the master controller to direct the border router to inject routes into the BGP table. In turn, these entries in the BGP table are advertised to the configured iBGP peers, and then potentially injected into the IP routing table of these peers.

Internal BGP (iBGP) is therefore the means to influence path selection upstream from the OER border router. In this example, the upstream device(s) is the Layer 3 campus switch.

The following is a sample of a prefix (192.168.192.0/24) that is injected into the IP routing table through BGP.

vpn-jk2-3725-1#show oer master prefix
OER Prefix Statistics:
 Pas - Passive, Act - Active, S - Short term, L - Long term, Dly - Delay (ms),
 P - Percentage below threshold, Jit - Jitter (ms),
 MOS - Mean Opinion Score
 Los - Packet Loss (packets-per-million), Un - Unreachable (flows-per-million),
 E - Egress, I - Ingress, Bw - Bandwidth (kbps), N - Not applicable
 U - unknown, * - uncontrolled, + - control more specific, @ - active probe all
 # - Prefix monitor mode is Special, & - Blackholed Prefix
 % - Force Next-Hop, ^ - Prefix is denied

Prefix                  State     Time Curr BR         CurrI/F         Protocol
                      PasSDly  PasLDly   PasSUn   PasLUn  PasSLos  PasLLos
                      ActSDly  ActLDly   ActSUn   ActLUn      EBw      IBw
                      ActSJit  ActPMOS
--------------------------------------------------------------------------------
192.168.17.0/24         INPOLICY      @36 192.168.131.2   AT3/0.135       BGP

                               U        U        0        0        0        0
                               9        9        0        0        1        1
                               N        N
192.168.33.0/24         INPOLICY       83 192.168.131.1   AT2/0.235       BGP

                               U        U        0        0        0        0
                               4        4        0        0        1        1
                               N        N
192.168.193.0/24        HOLDDOWN       56 192.168.131.2   AT3/0.135       BGP

                               U        U        0        0        0        0
                               U        U        0        0        0        0
                               N        N
192.168.192.0/24        INPOLICY       46 192.168.131.1   AT2/0.235       BGP

                               U        U        0        0        0        0
                               U        U        0        0        0        1
                               N        N

This prefix is displayed from the BGP table:

vpn-jk2-3725-1# show ip bgp 192.168.192.0/24
BGP routing table entry for 192.168.192.0/24, version 228
Paths: (1 available, best #1, table Default-IP-Routing-Table, not advertised to
EBGP peer)
  Advertised to update-groups:
        1
  65001 65002, (injected path from 192.168.192.0/18)
    192.168.129.5 from 192.168.129.5 (192.168.191.1)
      Origin IGP, localpref 100, valid, external, best
      Community: no-export
vpn-jk2-3725-1#


Note OER injected routes remain local to this AS as they have a community value of no-export, meaning do not advertise this route to EBGP peers.


And, in this example, the route is also injected into the IP routing table by the BGP process:

vpn-jk2-3725-1#show ip route bgp
B    192.168.192.0/24 [20/0] via 192.168.129.5, 00:07:49
...

vpn-jk2-3725-1#show ip route 192.168.192.0 255.255.255.0
Routing entry for 192.168.192.0/24
  Known via "bgp 65030", distance 20, metric 0
  Tag 65001, type external
  Redistributing via eigrp 100
  Advertised by eigrp 100 route-map ELIMINATE_RIB_failure
  Last update from 192.168.129.5 00:06:29 ago
  Routing Descriptor Blocks:
  * 192.168.129.5, from 192.168.129.5, 00:06:29 ago
      Route metric is 0, traffic share count is 1
      AS Hops 2
      Route tag 65001

vpn-jk2-3725-1#

Now that the method of OER influencing traffic has been shown, the next section explores how OER then verifies and re-evaluates on an ongoing basis.

Operational Modes

Mode Monitor Passive

The border routers report traffic flows identified by NetFlow to the master controller. The average delay of the flows, packet loss, and reachability, along with the outbound throughput in bits per second, are determined for the destination IP prefixes observed in the NetFlow data.

Measurements of TCP traffic flows are characterized by:

Delay: Time between the TCP SYN and the TCP SYN/ACK in a TCP three-way handshake.

Loss: TCP sequence numbers are tracked; loss can be estimated when sequence numbers lower than the highest sequence number observed are seen.

Reachability: Repeated TCP SYNs without an accompanying TCP SYN/ACK identify reachability failures.

Throughput: Throughput is calculated from NetFlow and measured in bits per second (bps).

Measurements of non-TCP traffic flows are characterized by throughput only.
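The delay and reachability measurements described above can be illustrated with a short Python sketch. This is not how Cisco IOS implements passive monitoring; the `TcpFlow` record, its field names, and the sample values are hypothetical and exist only to show the arithmetic. Delay is taken as the time from the SYN to the SYN/ACK, and a flow is counted as unreachable when it shows repeated SYNs and no SYN/ACK.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class TcpFlow:
    """One observed TCP connection attempt (hypothetical record, not a NetFlow format)."""
    syn_time_ms: float                 # time the first SYN was seen
    syn_count: int                     # number of SYNs seen for this attempt
    syn_ack_time_ms: Optional[float]   # time the SYN/ACK was seen, or None


def summarize(flows: List[TcpFlow]) -> Tuple[Optional[float], int]:
    """Return (average delay in ms, unreachable flows per million)."""
    # Delay: time between SYN and SYN/ACK, for flows that were answered.
    delays = [f.syn_ack_time_ms - f.syn_time_ms
              for f in flows if f.syn_ack_time_ms is not None]
    # Reachability: repeated SYNs with no SYN/ACK count as a failure.
    unreachable = sum(1 for f in flows
                      if f.syn_count > 1 and f.syn_ack_time_ms is None)
    avg_delay = sum(delays) / len(delays) if delays else None
    unreachable_fpm = unreachable * 1_000_000 // len(flows) if flows else 0
    return avg_delay, unreachable_fpm


# Illustrative sample: two answered flows and one that retried its SYN three times.
flows = [TcpFlow(0.0, 1, 40.0), TcpFlow(10.0, 1, 70.0), TcpFlow(20.0, 3, None)]
avg_delay_ms, unreachable_fpm = summarize(flows)
print(avg_delay_ms, unreachable_fpm)
```

With this sample, the two answered flows contribute delays of 40 ms and 60 ms, and one flow in three is unreachable.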

Mode Monitor Active

In this mode, Cisco IOS IP service level agreement (SLA) probes are generated by the border routers and transmitted at the configured probe frequency value. Active probes are created implicitly by OER; however, the network manager may also explicitly create active probes.

By default, an active probe is an ICMP echo. If VoIP is to be characterized, the network manager may choose to explicitly configure an active probe. Following is an example from an oer-map using a traffic-class that matches VoIP streams.

set active-probe jitter 10.1.1.1 target-port 33033 codec g729a
!
set probe frequency 2

In this example, the target IP address configured in the explicit active probe (10.1.1.1) must be a Cisco router configured with the ip sla responder command. Most IP hosts respond to an ICMP echo unless it is administratively disabled or prohibited. However, to determine MOS, jitter, and other characteristics associated with VoIP quality measurements, the IP SLA responder function must be enabled on the target.

It is possible, and practical, to use active monitoring for a specific traffic class or IP prefix, identified through an oer-map referenced by a policy-rules statement, for measuring VoIP traffic, while using a global configuration option that defaults to passive monitoring of all other traffic through the TCP flows.


Note An example of using both active monitoring for VoIP and passive monitoring for the remaining traffic flows is shown in Displaying the Policy for a Prefix.


Mode Monitor Both

Mode monitor both is the default value and combines the capabilities of passive and active monitoring. Up to five IP addresses are actively probed for each destination prefix learned through passive monitoring. By default, an IP SLA ICMP ECHO probe is automatically generated for the learned IP addresses.

By monitoring both actively and passively, additional data points regarding a network prefix can be obtained through two separate and distinct tools: NetFlow for the passive measurements and IP SLA for the active measurements. However, the inclusion of active probing also has disadvantages. The ICMP ECHO requests that are generated by default constitute additional background traffic on the network. When used on the Internet, active probing may not be desirable because ICMP packets may be blocked or administratively prohibited, and the probes may be considered a threatening or abusive posture by the target hosts. Because of this, mode monitor both is best suited for use within the private internal network of the enterprise. Unlike mode monitor fast, which is described in a later section, this mode does not probe all exit points continuously. It probes only the current exit point, provided the status is INPOLICY, and probes are generated after the prefix timer value is exhausted.

To illustrate, prefix 192.168.33.0/24 is being monitored by both passive and active probing. In looking at the detail display of the prefix, several items bear notice:

State of INPOLICY*: The asterisk (*) indicates this prefix is uncontrolled by OER (the parent route controls routing), but is currently inpolicy.

The at sign (@) on the Time Remaining value means the prefix is being actively probed. The numerical value is a countdown timer indicating when this state will expire.

The latest statistics from the active probes are shown for each of the five individual target IP addresses. Note that each of the five target IP addresses was attempted two times, and each attempt completed successfully. The sum of the delay values (DSum), along with the minimum, maximum, and derived average delay (Dly), is shown:

vpn4-3800-15#show oer master prefix 192.168.33.0/24 detail
Prefix: 192.168.33.0/24
   State: INPOLICY*   Time Remaining: @11     
   Policy: Default

   Most recent data per exit
   Border          Interface         PasSDly  PasLDly  ActSDly  ActLDly
  *192.168.131.1   Gi0/1.651               5        5        0        0
   192.168.131.2   Gi0/1.652               6        6        0        0

   Latest Active Stats on Current Exit:
   Type     Target          TPort Attem Comps    DSum     Min     Max     Dly
   echo     192.168.33.5        N     2     2       7       3       4       3
   echo     192.168.33.6        N     2     2      20       8      12      10
   echo     192.168.33.25       N     2     2       2       1       1       1
   echo     192.168.33.16       N     2     2       8       4       4       4
   echo     192.168.33.26       N     2     2       8       4       4       4

Prefix performance history records
 Current index 50, S_avg interval(min) 5, L_avg interval(min) 60

Age       Border          Interface       OOP/RteChg Reasons                  
Pas: DSum  Samples  DAvg  PktLoss  Unreach   Ebytes   Ibytes     Pkts    Flows
Act: Dsum Attempts  DAvg    Comps  Unreach   Jitter LoMOSCnt   MOSCnt
00:00:55  192.168.131.1   Gi0/1.651                                           
       36        6     6        0        0    13268    18783      287       76
        0        0     0        0        0        N        N        N
00:02:11  192.168.131.1   Gi0/1.651                                           
       24        5     4        0        0    10472    15276      237       66
        0        0     0        0        0        N        N        N

In this example, also note that both passive and active data points are shown collectively in the output for the most recent data statistics as well as the performance history section. The performance history section can be used to see trends in the characteristics of a given prefix.
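As a cross-check on the display above, the following Python sketch recomputes the average delay columns from the sums and counts shown. The values are copied from the output; the assumption that the average is the sum divided by the count with the fraction discarded is an inference from these numbers, not a documented statement of how Cisco IOS computes them.

```python
def average_delay(delay_sum, count):
    """Average delay as sum // count; integer division is an assumption
    inferred from the displayed values."""
    return delay_sum // count if count else None


# (target, Comps, DSum, Dly) from "Latest Active Stats on Current Exit".
active_stats = [
    ("192.168.33.5", 2, 7, 3),
    ("192.168.33.6", 2, 20, 10),
    ("192.168.33.25", 2, 2, 1),
    ("192.168.33.16", 2, 8, 4),
    ("192.168.33.26", 2, 8, 4),
]

# (age, Samples, DSum, DAvg) from the passive rows of the history records.
passive_history = [
    ("00:00:55", 6, 36, 6),
    ("00:02:11", 5, 24, 4),
]

for label, count, delay_sum, shown in active_stats + passive_history:
    computed = average_delay(delay_sum, count)
    print(f"{label:15} sum={delay_sum:3} count={count} computed={computed} shown={shown}")
```

Every computed value matches the displayed Dly or DAvg column, including the rows where the exact quotient is fractional (7/2 and 24/5).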

Mode Monitor Special

Mode monitor special is an alternate syntax to mode monitor both for the Cisco 6500 and Cisco 7600 series implementations. Active probing is enabled to accommodate the EARL 7 (PFC3) passive monitoring limitations described in a previous section.

Mode Monitor Fast

This feature was introduced in Cisco IOS Release 12.4(15)T as a key component of the Fast Reroute feature. This mode generates active probes through all exits continuously at the configured probe frequency. This differs from active or both mode, which generate probes through alternate paths (exits) only when the current path is out-of-policy. One way to describe this behavior is that the OER subsystem normally quantifies the alternatives only when the current path is known to be deficient, whereas with Fast Reroute the characteristics of the alternative paths are always known, allowing immediate use as required. If unreachable is determined to be out-of-policy for the current exit, the alternate exit is selected as the current exit, assuming the unreachable value for the alternate exit is in policy.

The unreachable threshold is calibrated in number of failed probes per million probe attempts. If the unreachable value is set to 1 and a single probe fails on the current exit, an attempt is made to locate an alternate exit. However, if the alternate exits also have a single failed probe, they are not selected because they too are out-of-policy.


Warning Setting the unreachable threshold to a value of 1 may cause an alternate exit to be out-of-policy in the event a transient error occurred in the past, but which has now cleared.
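The effect of a very low unreachable threshold can be sketched in Python. The function names and the probe counts are hypothetical, and treating a rate at or above the threshold as out-of-policy is an assumption chosen so that a single failed probe trips a threshold of 1, as described above.

```python
def failures_per_million(failed, attempts):
    """Failed probes scaled to a per-million rate."""
    return failed * 1_000_000 // attempts if attempts else 0


def unreachable_oop(failed, attempts, threshold):
    """Assumption: an exit is out-of-policy when its rate reaches the threshold."""
    return failures_per_million(failed, attempts) >= threshold


def choose_exit(current, alternates, stats, threshold):
    """Stay on the current exit unless it is out-of-policy AND some
    alternate exit is in policy for unreachable."""
    if not unreachable_oop(*stats[current], threshold):
        return current
    for name in alternates:
        if not unreachable_oop(*stats[name], threshold):
            return name
    return current  # no in-policy alternate is available


# Hypothetical (failed, attempts) counters per exit.
stats = {"exit1": (1, 500), "exit2": (1, 500), "exit3": (0, 500)}

print(failures_per_million(1, 500))                     # one failure in 500 probes
print(choose_exit("exit1", ["exit2"], stats, 1))        # exit2 also has a failure
print(choose_exit("exit1", ["exit2", "exit3"], stats, 1))
```

In the second call, the only alternate has its own past failure, so it is not selected, which is the situation the warning describes.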


The Fast Reroute feature, therefore, allows rerouting actions to be taken at an interval slightly longer than the configured probe frequency value. Probe frequency can be set as low as 2 seconds if fast mode is configured. While the Fast Reroute feature was not scale tested in this design guide, in an ideal deployment, rerouting can occur in as little as 3 seconds.


Note This feature may be best described as continuous monitoring of alternate paths, as opposed to as-required monitoring of alternate paths.


The obvious drawback to this feature is the additional network traffic overhead of the probes themselves and the additional CPU resources consumed on the OER border routers, which are the source of the active probes. Unless the prefix is deleted or in the default state, probes are generated.

The active probe results are used to determine out-of-policy conditions and to control routing. Passive data is collected for information only, except that the transmit and receive throughput values in kbps (show oer mast border detail) are used for load balancing.

Network Prefix States

Default

A network prefix may be shown in the default state if it is manually configured or learned but has not been determined to be in or out-of-policy. Prefixes may revert to the default state if, for some reason, OER can no longer control the prefix. This may happen if all the exits are out-of-policy.

The default state means that the parent IP routes control the exit for this destination prefix. This would be the same behavior as if OER were not configured or were shut down.

Inpolicy

The prefix is inpolicy, which means that it meets the policy associated with this prefix or application. The prefix can be inpolicy and controlled by OER, or inpolicy* and not controlled by OER. The presence of the asterisk (*) on the state attribute indicates the network is known to OER, but is under the control of the parent route. When no asterisk (*) is present, the prefix is being controlled by OER. The state of inpolicy is considered to be a desirable state.

Out-of-Policy (OOP)

The prefix or application has been identified as failing to meet its respective policy. If traffic is identified as being out-of-policy, OER moves the traffic to an alternative exit to bring the traffic inpolicy or unmanages the traffic, allowing it to revert back to the default exits as determined by the parent routes in the IP routing table. If the traffic reverts back to the default state, OER will again cycle this traffic, like all other traffic on the network, in an attempt to optimize based on the configured or default OER policy. The state of out-of-policy is considered undesirable.

Holddown

The holddown state is entered when a traffic class is initially controlled by OER. Holddown is applied to prevent churn, that is, OER-managed routes being repeatedly injected into and withdrawn from the IP routing table (and subsequently redistributed by some IGP) or the BGP table.

Once a prefix has been changed, it enters holddown for the specified (holddown) period before it can be deemed in or out-of-policy. A network prefix can leave holddown state before the timer expires if the current exit point experiences an unreachable out-of-policy condition. All other out-of-policy conditions are ignored during holddown state.
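The holddown rule can be expressed as a small decision function. This Python sketch is only an illustration of the rule as stated above; the function name, the reason strings, and the timer representation are hypothetical.

```python
def may_change_exit(holddown_remaining_s, oop_reasons):
    """Decide whether an out-of-policy condition may trigger an exit change.

    During holddown, only an 'unreachable' condition on the current exit
    is acted on; all other out-of-policy reasons are ignored.
    """
    if holddown_remaining_s > 0:
        return "unreachable" in oop_reasons
    # Holddown has expired: any out-of-policy reason may trigger a change.
    return bool(oop_reasons)


print(may_change_exit(120, {"delay"}))                 # ignored during holddown
print(may_change_exit(120, {"delay", "unreachable"}))  # unreachable ends holddown early
print(may_change_exit(0, {"delay"}))                   # timer expired
```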

Key Concepts

This section provides an overview of some of the terminology and concepts of OER.

Variance

The concept of variance in OER is similar in implementation to the variance keyword that enables EIGRP to install multiple unequal cost routes in the local routing table. Variance is a means to specify a range in which two unequal values are considered similar enough to be treated as equal.

In the context of OER, variance is a percentage from 1 to 100. If delay is set to an absolute value of 80 ms and a 10 percent variance is configured, delay values from 80 to 88 ms are considered equal.
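The variance arithmetic can be checked with a few lines of Python. The function name is hypothetical; the 80 ms reference value and 10 percent variance are the figures from the text. Integer arithmetic is used so the upper bound of 88 ms is compared exactly.

```python
def within_variance(value_ms, reference_ms, variance_pct):
    """True when value_ms lies between the reference and the reference
    plus variance_pct percent, so the two are treated as equal."""
    upper_times_100 = reference_ms * (100 + variance_pct)
    return reference_ms * 100 <= value_ms * 100 <= upper_times_100


for delay in (80, 85, 88, 89):
    print(delay, within_variance(delay, 80, 10))
```

Delays of 80, 85, and 88 ms fall inside the range and are treated as equal to 80 ms, while 89 ms falls outside it.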

Path Selection

OER selects the best path based on:

Excluding links currently overloaded (refer to Max BW from the output of the show oer master border detail command).

Best performing link depending on configured priorities and their associated variance.

Granularity

Without sufficient granularity, meaning the number of flows and prefixes being learned or configured, OER cannot effectively do optimal load balancing. This is also true of CEF or fast switching when load balancing two equal cost paths in the IP routing table. CEF has an advantage over fast switching in that CEF load balances based on source and destination IP address, while fast switching only load balances based on destination IP address. However, OER has an advantage over CEF or fast switching in that it can load balance based on Layer 4 fields or ToS byte (DSCP) instead of simply network (destination) prefixes, to provide better visibility and granularity. OER also takes into consideration link utilization, where CEF does not. From a campus headend, granularity may not be an issue; however, it may be from the branch router.


Tip OER performs more effectively the more flows, and the resulting network prefixes, it has observed. If the number of destination network prefixes is low, consider adding more granularity by monitoring applications instead of simply network prefix addresses.


Interval Period

The configured interval period value determines how often traffic is analyzed.

Monitor Period

The configured monitor period determines how long traffic is measured before being reported by the border router to the master controller. This is the means of specifying the learning interval. The default is 5 minutes. Flows are aggregated on the border router during this interval. At the end of the interval, the top prefixes based on throughput are reported to the master controller; the number reported is set by the prefixes keyword value, which is subordinate to the learn command.
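The aggregate-then-report step can be sketched in Python. This is an illustration only, not the border router implementation: the flow tuples are made up, and aggregating on a /24 boundary is an assumption that mirrors the /24 prefixes seen in the displays in this document.

```python
import ipaddress
from collections import Counter


def top_prefixes(flows, top_n, prefix_len=24):
    """Aggregate (destination IP, bytes) flow records into prefixes and
    return the top_n prefixes by total bytes, highest first."""
    totals = Counter()
    for dst_ip, nbytes in flows:
        # strict=False masks the host bits, e.g. 64.102.16.10 -> 64.102.16.0/24.
        net = ipaddress.ip_network(f"{dst_ip}/{prefix_len}", strict=False)
        totals[str(net)] += nbytes
    return totals.most_common(top_n)


# Hypothetical flows collected during one monitor period.
flows = [
    ("64.102.16.10", 5000),
    ("64.102.16.99", 2500),
    ("64.102.4.7", 6000),
    ("10.16.151.20", 1000),
]

report = top_prefixes(flows, top_n=2)
print(report)
```

Here `top_n` plays the role of the prefixes keyword value: only the two busiest prefixes are reported, and 10.16.151.0/24 is left out.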

Loss

Packet loss is based on packets per million (PPM) regardless of how many hosts are involved, and loss is based on both passive and active monitoring; however, with active monitoring, loss is reported only for jitter probes. Loss is specified as a relative percentage or maximum number of packets.


Note If the fast re-route feature is implemented to support voice or video over IP and packet loss is one criterion desired to trigger the reroute, then an explicitly configured jitter probe is required.


Unreachable

Unreachable is based on flows per million (FPM). Unreachable hosts only apply to TCP sessions. Reachability failures are determined by TCP SYNs without an accompanying TCP SYN/ACK. Unreachable can either be an absolute maximum number or a relative percentage.

Feature Summary

Table 1 lists the features and the release train implemented.

Table 1 Implemented Features

Release        Feature
12.4(6)T       Voice Traffic Optimization
12.4(9)T       DSCP Monitoring
               BGP Inbound Optimization
12.4(11)T      Dynamically Learned Well Known Applications
12.4(15)T      Link Grouping, Fast Re-route, NAT/pNAT
12.2(33)SRB    OER on 7600
12.2(33)SXH    OER BR on 6500

Best Practices, Tips and Techniques

This section demonstrates useful commands, best practices and other tips and techniques to assist the network manager in deploying and maintaining OER in a production environment.

Load Interval and Bandwidth

To provide the most granular and accurate information to the master controller, configure the load-interval on internal and external interfaces on the border routers to the minimum value of 30 seconds. Additionally, the bandwidth statement on the interface should also be appropriately configured.

interface Tunnel100
 bandwidth 256
 load-interval 30

joeking-vpn-1811#show oer mast bor det
Border           Status   UP/DOWN             AuthFail  Version
10.81.7.73       ACTIVE   UP       00:32:08          0  2.0
 Tu200           EXTERNAL UP             
 Tu100           EXTERNAL UP             
 Vl1             INTERNAL UP             

 External         Capacity      Max BW   BW Used    Load Status          Exit Id
 Interface         (kbps)       (kbps)    (kbps)    (%)                         
 ---------        --------      ------   ------- ------- ------           ------
 Tu200                 256         204        74      28 UP                    2
                                   192         0       0
 Tu100                 256         192        67      25 UP                    1
                                   192        91      35
!

Note The above display was captured while a file transfer was executing an FTP PUT through Tunnel 200 and a VoIP call was active on Tunnel 100. This accounts for the display showing bidirectional data on Tunnel 100, but primarily unidirectional data on Tunnel 200.


The Max BW value is derived from the default value of 75 percent or the configured value. For Tunnel 200, the configured 80 percent of 256 kbps is shown as 204 kbps, and for Tunnel 100, the default value of 75 percent (which is not shown in the configuration) is shown as 192 kbps. This display was captured from a router using the following configuration:

oer master
 policy-rules BRANCH
 logging
 !
 border 10.81.7.73 key-chain GREEN
  interface Tunnel200 external
   max-xmit-utilization percentage 80
  interface Tunnel100 external
  interface Vlan1 internal
 !        

The max-xmit-utilization value bounds the path selection algorithm. Links that are currently overloaded (links whose load exceeds the maximum bandwidth value) are removed from consideration when selecting the best path.
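The Max BW figures in the display and the overload check can be reproduced with a short Python sketch. The capacity, percentage, and BW Used values come from the display and configuration above; the function names are hypothetical, and discarding the fraction (204.8 shown as 204) is inferred from the display rather than documented.

```python
DEFAULT_MAX_XMIT_PCT = 75  # default stated in the text


def max_bw_kbps(capacity_kbps, pct=DEFAULT_MAX_XMIT_PCT):
    """Max BW = capacity * percentage, fraction discarded (assumption)."""
    return capacity_kbps * pct // 100


def eligible_exits(exits):
    """Drop exits whose transmit load exceeds their Max BW."""
    return [name for name, e in exits.items()
            if e["used_kbps"] <= max_bw_kbps(e["capacity_kbps"], e["pct"])]


# Egress values from the 'show oer master border detail' display.
exits = {
    "Tu200": {"capacity_kbps": 256, "pct": 80, "used_kbps": 74},
    "Tu100": {"capacity_kbps": 256, "pct": 75, "used_kbps": 67},
}

print(max_bw_kbps(256, 80), max_bw_kbps(256))
print(eligible_exits(exits))

# Hypothetical overload: if Tunnel200 carried 210 kbps it would be excluded.
exits["Tu200"]["used_kbps"] = 210
print(eligible_exits(exits))
```

With the captured values both tunnels remain candidates; in the hypothetical overload case only Tunnel100 is left for path selection.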

Displaying the Policy for a Prefix

This command displays the policy in effect for a particular prefix:

vpn-jk2-3725-1#show oer master prefix 192.168.193.0/24 policy
Default Policy Settings:
  backoff 90 3000 300
  delay relative 50
  holddown 300
  periodic 180
  probe frequency 56
  mode route control
  mode monitor both
  mode select-exit best
  loss relative 10
  jitter threshold 20
  mos threshold 3.60 percent 30
  unreachable relative 50
  resolve delay priority 11 variance 20
  resolve utilization priority 12 variance 20
 *tag 0


Note If the output of the show oer master prefix command is empty, then that prefix has not been learned or configured.


Active and Passive Combined

This example also demonstrates a configuration for OER Fast re-route. This feature was introduced in Cisco IOS Release 12.4(15)T:

!
hostname vpn-jk2-3725-1
!
! System image file is "flash:c3725-advipservicesk9-mz.124-15.T"
!
key chain GREEN
 key 10
   key-string 7 11283B263343595F500F0D03
!
!
oer master
 policy-rules ENTERPRISE_CAMPUS
 logging
 !
 border 192.168.131.1 key-chain GREEN
  interface ATM2/0.235 external
  interface FastEthernet0/1.100 internal
  interface FastEthernet0/1.102 internal
 !
 border 192.168.131.2 key-chain GREEN
  interface ATM3/0.135 external
  interface FastEthernet1/0.100 internal
  interface FastEthernet1/0.102 internal
 !
 learn
  throughput
  delay
  periodic-interval 0
  monitor-period 1
  prefixes 2500
  expire after time 30
 backoff 90 3000 300
 mode route control
 mode select-exit best
 periodic 180
 !
!
!
oer border
 logging
 local FastEthernet0/1.100
 master 192.168.131.1 key-chain GREEN
!
!
ip access-list extended VOICE
 permit udp any 10.0.0.0 0.255.255.255 dscp ef
 permit udp any 10.0.0.0 0.255.255.255 dscp af41
 permit udp any 10.0.0.0 0.255.255.255 dscp cs5
!
! For each branch you would need one map entry (sequence no.) because we 
! are manually configuring the probe destination IP address.
!
oer-map ENTERPRISE_CAMPUS 10
 match traffic-class access-list VOICE
 set holddown 300
 set delay threshold 150
 set mode route control
 set mode monitor fast
!
! The order in priority is jitter, delay then MOS
!
 set resolve jitter priority 1 variance 10
 set resolve delay priority 2 variance 10
 set resolve mos priority 10 variance 10
 set jitter threshold 15
 set mos threshold 4.00 percent 15
 set active-probe jitter 10.1.1.1 target-port 33033 codec g729a
!

Tip IP SLA responder must be configured on the target router at 10.1.1.1.


!
 set probe frequency 2
!
end

Solution Overview

This solution comprises the following deployment models:

Internet Content Server

WAN Hub: Dual MPLS Service Providers

Branch/SOHO VPN Deployment

Branch VPN Deployment with Cisco Wide Area Application Services (WAAS)

The deployment models are described and documented each in their own section; however, there are some similarities across the sections.

The Internet Content Server section focuses on master controller scalability.

The WAN Hub: Dual MPLS Service Providers section focuses on border router scalability, but the master controller scalability findings are applicable to both deployments.

The Branch VPN Deployment with Cisco Wide Area Application Services (WAAS) section builds on the topology and results described in the Branch/SOHO VPN Deployment section.

In the Internet Content Server and the Branch VPN Deployment with Cisco Wide Area Application Services (WAAS) sections, a standby master controller configuration is tested and documented.

Internet Content Server

This represents an Internet edge deployment with two or more ISP links receiving full Internet routing table advertisements. The remote users are unknown individual user clients accessing web hosting servers. As the bulk of the traffic is from server to client, OER is used to control only routing to the Internet. The OER configuration deployed uses simple passive monitoring of TCP traffic and a dedicated chassis for the control function.

The goal is to determine the scalability limits of managing a large number of IP network prefixes to manage user traffic. Because of architectural limitations with the NetFlow implementation on the EARL 7 (PFC3)-based architectures, OER cannot deduce performance (delay, loss, reachability) characteristics from passive monitoring of TCP flows. In addition, many Internet hosts do not respond to the default active probe, the IP SLA ICMP echo probe. Because of these limitations, the PFC3-based architectures are not used as border routers in this section. There is a feature enhancement request, CSCsi59058, to add support for Internet path availability probing for load-balancing. This feature is targeted to support the PFC3-based architectures for Internet load sharing.

In the topology tested in this section, the Cisco 7200VXR series of routers are deployed as OER border routers. The master controller function is tested using the Cisco 7200VXR NPE-G2, Cisco 3845, and Cisco 7600-rsp720. An active/standby master controller configuration is also tested to demonstrate this function and to document a working configuration.

Design Requirements and Considerations

The Internet content server use case is the most common deployment scenario, as this is the primary customer use case the OER technology was developed to address: optimization of traffic for large numbers of client devices reached through several ISP connections. In terms of megabits per second, the bulk of the user traffic is from server to client. OER, therefore, is configured to address the path selection from server to client over two or more links, typically to multiple ISPs.

The majority of the user traffic is TCP traffic, specifically HTTP (port 80) and SSL/HTTPS (port 443). The tested configuration uses two Cisco 7200VXR NPE-G2 as WAN edge routers terminating links to their respective ISPs. These are OER border routers in all test cases.

The MC function is tested using the Cisco 7200VXR NPE-G2, Cisco 3845, and Cisco 7600-rsp720. An active/standby MC configuration is also tested to demonstrate this function and to document a tested working configuration.

The objective of testing an Internet content server deployment is to determine what resource (memory or CPU utilization) is the limiting factor in scaling a dedicated master controller. In this deployment, it is assumed that the master controller is deployed on a separate chassis rather than collocated on a border router, because the goal is to scale the total number of prefixes being managed the resources consumed by the master controller function should not be limited or reserved in order to switch user packets or process other network functions like QoS, BGP peering, NAT, access-lists or Cisco IOS firewall.

It is important to note that the concept of performance routing is an optimization technique. In other words, it adds to or is an enhancement of the core function of the WAN aggregation role of switching packets to and from the Internet service providers and maintaining BGP peering sessions to send and receive network prefix advertisements. As such, using a dedicated chassis not only provides better opportunity to scale the performance routing function but also isolates it from the core function of WAN aggregation. It is a good design practice to dedicate a chassis for key functions where stability and isolation are important to the overall design. Using a dedicated chassis for the master controller function is analogous to using a dedicated route-reflector in large BGP deployment, a dedicated DLSw peer or TN3270 server.

Additionally, to provide design guidance and verification, a standby master controller is implemented and tested in this section. A standby master controller maintains the performance routing function if the primary master controller must be taken offline for service or experiences a hardware or software failure.

Scalability Considerations

On July 2, 2007, a Cisco RTP campus Internet gateway (BGP peering with AS 7018 and AS 701) had 220,508 network prefixes in the BGP table, using approximately 25 MB of memory. If it is assumed that an Internet content server receives flows from 1 to 2% of these network prefixes during any given interval of time, managing 2,000 to 4,000 network prefixes is an expected requirement of the customer deployment. This snapshot provides context, as a point of reference, for the number of prefixes the typical enterprise customer may encounter. Selected content service providers may have more aggressive requirements.
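
The estimate above is simple arithmetic, and the following minimal Python sketch reproduces it from the figures quoted in the text. The function and constant names are illustrative only and are not part of any Cisco tool. The 1 to 2% active fraction is the assumption stated above, not a measured value.

```python
# Figures quoted in the text (July 2, 2007 snapshot of the BGP table).
BGP_TABLE_PREFIXES = 220_508
ACTIVE_FRACTION_LOW = 0.01   # assumption from the text: 1% of prefixes carry flows
ACTIVE_FRACTION_HIGH = 0.02  # assumption from the text: 2% of prefixes carry flows


def managed_prefix_estimate(total_prefixes, active_fraction):
    """Return the number of prefixes expected to carry flows in an interval."""
    return round(total_prefixes * active_fraction)


low = managed_prefix_estimate(BGP_TABLE_PREFIXES, ACTIVE_FRACTION_LOW)
high = managed_prefix_estimate(BGP_TABLE_PREFIXES, ACTIVE_FRACTION_HIGH)
print(low, high)  # 2205 4410
```

The result, roughly 2,200 to 4,400 prefixes, is consistent with the rounded 2,000 to 4,000 range used as the design target.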

OER, however, does not use the BGP routing table as the source of data to populate the master controller database. Instead, active flows from the NetFlow cache are used to populate the database. Actual user traffic, as cached by NetFlow, is used to determine which network prefixes are to be managed. The next section examines the basic configuration used in scale testing and how the configurable parameters influence prefix collection and retention.
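
To illustrate the idea of deriving managed prefixes from observed traffic rather than from the routing table, the following Python sketch aggregates flow records into prefixes and keeps the busiest ones. It is a conceptual model only and is not the OER implementation: the function name `learn_prefixes`, the /24 aggregation length, the top-N cutoff, and the sample flow records are all hypothetical. Because the bulk of the traffic is server to client, the destination address of each flow is taken to be the client.

```python
from collections import Counter
import ipaddress


def learn_prefixes(flows, prefix_len=24, top_n=2):
    """Aggregate (dst_ip, byte_count) flow records into prefixes.

    prefix_len and top_n are hypothetical knobs chosen for this sketch.
    Returns the top_n prefixes ordered by total bytes observed.
    """
    volume = Counter()
    for dst_ip, byte_count in flows:
        # strict=False masks the host bits, giving the covering prefix.
        prefix = ipaddress.ip_network(f"{dst_ip}/{prefix_len}", strict=False)
        volume[str(prefix)] += byte_count
    return volume.most_common(top_n)


# Hypothetical flow records using documentation address ranges.
sample_flows = [
    ("192.0.2.10", 1500),
    ("192.0.2.77", 4000),
    ("198.51.100.5", 9000),
    ("203.0.113.9", 200),
]

learned = learn_prefixes(sample_flows)
print(learned)
```

In this model, only prefixes that actually carry traffic ever enter the database, which is why the number of managed prefixes tracks active flows rather than the size of the BGP table.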

Prefix Management

This section describes how network prefixes are collected, aggregated, stored, and reported between the border routers and the master controller. In these illustrations, the pictures of the plastic pails represent the collection and storage of network prefixes.

Underlying Routing

First, the underlying routing configuration must be discussed. A typical Internet edge deployment is shown in Figure 5.

Figure 5 Internet Edge Deployment Example

In Figure 5, there are two Layer 3 campus switches and two WAN edge routers. One of the Layer 3 campus switches is shown grey or subdued. In the lab test phase of this section, the second Layer 3 campus switch is not included in the topology, as this switch is deployed to provide redundancy. The testing was not meant to cover campus switching redundancy. However, for the purpose of explaining the un