As physics research progresses rapidly, the world's leading physicists need to share data samples ranging up to 100 terabits in size, drawn from data stores 100 times larger. A global network now proves it possible to support such enormous demands, thanks to Caltech and Cisco Systems®.
BUSINESS CHALLENGE
With a faculty that includes four Nobel laureates and off-campus facilities such as the Jet Propulsion Laboratory, Palomar Observatory, and W. M. Keck Observatory, the California Institute of Technology (Caltech) is considered one of the world's greatest research centers. Located in Pasadena, California, the institute works with research partners that include the greatest minds and institutions in Europe, Asia, and North and South America. True international collaboration has driven a need for powerful, high-speed data networks capable of transporting huge data files (ranging up to hundreds of terabits in size) and supporting CPU-intensive calculations and simulations.
A primary goal of particle-physics research is finding the Higgs particle, thought to be responsible for mass. For the past 20 years, research institutions have engaged in major international collaborations in which experiments conducted overseas have enabled U.S. physicists to participate effectively without being physically present. To facilitate true global participation, next-generation, high-energy physics research will require networks interlinking many national-scale and medium-sized university-based computing facilities. These networks will need to deliver sustained bandwidth of 10 to 100 Gbps over transcontinental distances. An annual event, the Supercomputing Bandwidth Challenge, sponsored by Qwest, allows scientists to demonstrate network solutions that contribute to the development of these ultra-high-performance networks. The 2004 event in Pittsburgh, Pennsylvania, offered a preview of the globally distributed grid system that is under development in the United States, Europe, and Asia. Its objective is to support the next generation of high-energy physics experiments, scheduled to begin operation in 2007 at CERN, a research facility based in Geneva, Switzerland.
The Caltech-led team won the 2003 Sustained Bandwidth Award, achieving network throughput of 23.2 Gbps to and from the show floor. In 2004, Caltech teamed with partners from the Stanford Linear Accelerator Center (SLAC) in Stanford, California; Fermilab in Chicago, Illinois; CERN in Geneva, Switzerland; the University of Florida in Jacksonville, Florida; the University of Manchester and UKLight in the United Kingdom; Rio de Janeiro State University and the State University of Sao Paulo in Rio de Janeiro and Sao Paulo, Brazil, respectively; and Kyungpook National University in Daegu, Korea. The goal was to exceed the previous year's mark by using a network that closely approximates the actual research environment's typical network usage.
According to Harvey Newman, professor of physics at Caltech, the research network architecture is hierarchical. It is designed to serve high-energy physics collaborations that will search for new phenomena at CERN's Large Hadron Collider (LHC). Binary research data obtained by large particle detectors is first sent from the Tier-0 CERN laboratory to approximately 12 national research centers worldwide. While CERN staff members provide overall project management, staff at the national centers maintain Tier-1 data centers for storing large amounts of data (multiple petabytes) and running reliable data services that deliver data samples to Tier-2 research centers. A Tier-1 center typically supports 500 to 1500 computing nodes, used to reconstruct and analyze data; a large set of disk arrays to provide robust, reliable online data service; and tape backup for archiving and reconstructing data.
The university-based Tier-2 centers have smaller staffs and support most of the required data analysis and simulation. A Tier-2 center typically serves 40-70 physicists working with individual data samples (accessed over the network from Tier-1 centers) that may total up to ten terabytes. The United States will be home to 10-15 Tier-2 centers, each with its own cluster of computing, networking, and data storage equipment. Finally, there are thousands of individual scientists working with desktops and laptops, performing collaborative development activities, such as improving the selection strategies and data processing methods for finding new physics signals and identifying new physics processes.
To fully exploit the potential for scientific discoveries, physicists foresee storing, processing, distributing, and analyzing multiple petabytes of data that require the extraction and transport of terabyte-scale data samples on demand.
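To give a sense of scale, the short sketch below estimates how long it would take to move a ten-terabyte data sample over links in the 10-to-100-Gbps range discussed above. It is an illustration only: the function name and the efficiency parameter are hypothetical, decimal units are assumed (1 terabyte = 10^12 bytes), and the link is assumed to be fully available to a single transfer.

```python
def transfer_seconds(size_terabytes: float, rate_gbps: float,
                     efficiency: float = 1.0) -> float:
    """Estimate the seconds needed to move a data sample over one link.

    Assumptions (not from the case study): decimal units and a link
    dedicated to this transfer, scaled by an optional efficiency factor.
    """
    if rate_gbps <= 0 or not 0 < efficiency <= 1:
        raise ValueError("rate must be positive and efficiency in (0, 1]")
    bits = size_terabytes * 1e12 * 8          # terabytes -> bits
    return bits / (rate_gbps * 1e9 * efficiency)


if __name__ == "__main__":
    # A ten-terabyte sample at the low and high ends of the target range.
    for rate in (10, 100):
        minutes = transfer_seconds(10, rate) / 60
        print(f"10 TB at {rate} Gbps: about {minutes:.1f} minutes")
```

Under these assumptions, a ten-terabyte sample needs roughly two hours at 10 Gbps and under a quarter of an hour at 100 Gbps, which is why sustained bandwidth matters for on-demand access.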
"In preparing for the Bandwidth Challenge, these long-term goals were in mind," says Newman. "We already use networking equipment from Cisco Systems at our campus, and Cisco equipment is also used at CERN. When evaluating overall cost performance, network flexibility, and ability to develop a long-term relationship with the vendor, Cisco was a natural choice." Caltech's UltraLight project, led by Newman to carry out the required network and grid development, uses two 10-Gigabit/sec links to facilities in downtown Los Angeles, California. Here the UltraLight equipment peers with the Internet2 Abilene research and education network and National Lambda Rail (NLR).
NETWORK SOLUTION
The team's entry at the Supercomputing 2004 Bandwidth Challenge was designed to demonstrate high-speed data transfers between host labs and collaborating institutions. The application was a typical real-time event-analysis application requiring the transfer of large physics data sets. Caltech implemented Cisco® 7609 routers for configurations at the show floor in Pittsburgh, as well as in Jacksonville, CERN, Fermilab, and SLAC locations; Cisco 7606 routers were used for installations at its Pasadena facility and at the NLR POP in Los Angeles. Enabled by the Cisco Supervisor Engine 720, these routers deliver 40 Gbps per slot for line-rate Gigabit Ethernet (GigE) and 10-GigE services. Caltech also used Cisco Catalyst® 6509-E switches.
Seven 10-Gbps links connected the Cisco routers at the Caltech and CACR booth; three 10-Gbps links connected to the SLAC and Fermilab booth and to the TeraGrid. External network connections included four dedicated wavelengths of NLR between the show floor in Pittsburgh and the Los Angeles site (two waves), the Chicago site, and the Jacksonville site. Ten-Gbps connections linked the Abilene network (two 10-Gbps links), the TeraGrid (three 10-Gbps links), and the U.S. Department of Energy's Energy Sciences Network (ESNet) across the SciNet network infrastructure at the show. Institutions in Korea; Japan; Miami, Florida; and Brazil were connected to all other participants through ESNet and Abilene. Within the show booth, servers were connected to the network using 10-GigE interfaces.
"We're happy with the solution we chose," says Newman. "It was the most essential item and it lived up to our expectations."
Deploying the solution was a challenge because of the number of participants and the routing complexity involved. Multiple subnets had to be routed to different wavelengths and along different paths. Also, during implementation some of the routes were changed, which created additional complexity.
"In less than three days, we had to build one of the largest backbones in the world," says Sylvain Ravot, senior network engineer from Caltech and resident at CERN. "We were able to simply turn on the Cisco equipment and it worked."
BUSINESS VALUE
During the test, the network achieved 101.13 Gbps, which, according to Bandwidth Challenge sponsor Wes Kaplow of Qwest, exceeded the sum of all throughput marks submitted in the present and previous years by other entrants. The record data-transfer speed is equivalent to downloading three complete DVD movies per second, or transmitting all of the content of the Library of Congress in 15 minutes. The network sustained the transport for several minutes, which is a phenomenal achievement, and links over both the Abilene and ESNet networks operated successfully at up to 99 percent of full capacity.
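The equivalences quoted for the record can be checked with simple arithmetic. The sketch below is a rough illustration, not a measurement: it assumes a 4.7-gigabyte single-layer DVD and decimal units, and the function names are hypothetical. It makes no claim about the size of the Library of Congress; it only reports how much data the record rate would move in 15 minutes.

```python
RECORD_GBPS = 101.13   # peak throughput reported for the 2004 entry
DVD_GB = 4.7           # assumption: single-layer DVD capacity in gigabytes


def gigabytes_per_second(rate_gbps: float) -> float:
    """Convert a line rate in gigabits per second to gigabytes per second."""
    return rate_gbps / 8


def dvds_per_second(rate_gbps: float, dvd_gb: float = DVD_GB) -> float:
    """How many DVDs of dvd_gb gigabytes the rate could carry each second."""
    return gigabytes_per_second(rate_gbps) / dvd_gb


def terabytes_moved(rate_gbps: float, seconds: float) -> float:
    """Data volume in terabytes carried at rate_gbps for the given time."""
    return gigabytes_per_second(rate_gbps) * seconds / 1000


if __name__ == "__main__":
    print(f"{gigabytes_per_second(RECORD_GBPS):.2f} GB per second")
    print(f"{dvds_per_second(RECORD_GBPS):.2f} DVDs per second")
    print(f"{terabytes_moved(RECORD_GBPS, 15 * 60):.2f} TB in 15 minutes")
```

With these assumptions the rate works out to about 2.7 DVDs per second, consistent with the rounded figure of three, and a little over 11 terabytes in 15 minutes.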
"The Bandwidth Challenge is a breakthrough for developing global networks and grids," says Newman, "as well as for demonstrating international cooperation in high-energy projects. The Cisco network delivered the highest performance ever tested in this event and it paves the way for flexible, efficient sharing of data and collaboration across researchers in many countries."
Newman and his colleagues see no reason why the high-speed links cannot be used to carry traffic at full wire speed in two directions. Individual links delivered from 13 Gbps to 19 Gbps during the test, with long-distance transmissions over routed networks being the most efficient.
"Cisco understood what we were attempting to achieve," Newman says. "The Cisco team was exceptional in its support and there was nothing they failed to deliver."
NEXT STEPS
Success at the Bandwidth Challenge demonstrated that multiple 10-Gbps wavelengths can be used efficiently over continental and transoceanic distances, often in both directions simultaneously, achieving the goal of a global grid that can support multi-terabyte and larger data transactions. It also makes it possible to construct "hybrid" networks that can integrate traditional packet switching and routing with dynamically constructed optical paths to support extremely large data flows.
Caltech will be implementing new Cisco Catalyst 6509-E switches for secure, converged services. The Cisco Catalyst 6500 switches provide scalable, intelligent, multilayer switching performance with up to 1152 10/100-Mbps Ethernet ports, and support network cores that forward hundreds of millions of packets per second (Mpps) over multiple-gigabit and 10-Gbps trunks. Five systems will be deployed at Caltech, SLAC, Fermilab, the University of Michigan, and Jacksonville.
To support large-scale physics experiments, the required bandwidth on long-distance network links is expected to increase from one 10-gigabit link currently to four by 2007, reaching a terabit by approximately 2013.
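The projection above implies a steep growth rate. As a rough sketch, the snippet below computes the compound annual growth factor implied by going from four 10-gigabit links (40 Gbps) in 2007 to one terabit per second (1000 Gbps) in 2013. The function name is hypothetical, and treating the forecast as smooth year-over-year growth is an assumption made only for illustration.

```python
def annual_growth_factor(start_gbps: float, end_gbps: float, years: int) -> float:
    """Compound annual growth factor that takes start_gbps to end_gbps."""
    if start_gbps <= 0 or end_gbps <= 0 or years <= 0:
        raise ValueError("rates and years must be positive")
    return (end_gbps / start_gbps) ** (1 / years)


if __name__ == "__main__":
    # Four 10-gigabit links in 2007, a terabit by approximately 2013.
    factor = annual_growth_factor(40, 1000, 2013 - 2007)
    print(f"Implied growth: about {(factor - 1) * 100:.0f} percent per year")
```

Under that assumption, required long-distance bandwidth would grow by roughly 70 percent each year over the period.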
Newman believes that Caltech's new Cisco Systems switching equipment represents a cost-effective solution for achieving the kind of scalability and reliability that the institute will need to continue its rapid research progress.
"The Supercomputing 2004 Bandwidth Challenge was a demonstration, but many of the basic elements we used are realistic," he says. "We reached 2.8-Gbps full-duplex speeds between the United States and Brazil during the event and we anticipate that soon, those kinds of transfers will be normal."
This customer story is based on information provided by the California Institute of Technology and describes how that particular organization benefits from the deployment of Cisco products. Many factors may have contributed to the results and benefits described; Cisco does not guarantee comparable results elsewhere.