Data Center Handbook (4): Design
Infrastructure
Topology diagram
Switching Path
L3 routing at aggregation layer
L2 switching at access layer
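An L3 switch combines three functions:
RP (route processor): handles routing protocols
SP (switch processor): handles L2 protocols
ASIC (application-specific integrated circuit): rewrites packet headers
There are several methods of traffic forwarding:
- Process switching: every packet goes through the IP input process and is handled by the CPU, which searches the entire routing table. This makes it the slowest method.
- Fast switching: the result of the route lookup for the first packet is stored in a cache, and subsequent packets are forwarded by looking up the cache directly.
- CEF: the fastest method. It preprocesses the routing table into a forwarding information base (FIB) that can be searched quickly, so the first packet and all subsequent packets are looked up equally fast. The lookup can also be performed by dedicated ASIC hardware.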
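The three forwarding methods above can be contrasted with a small Python sketch. It is only an illustration: the route entries and next-hop names are hypothetical, and a sorted list stands in for the FIB, whereas real CEF implementations use more specialized lookup structures and hardware.

```python
import ipaddress

# Hypothetical routing table: (prefix, next hop). Values are illustrative only.
ROUTES = [
    ("0.0.0.0/0", "core1"),
    ("10.0.0.0/8", "core2"),
    ("10.0.0.0/24", "vlan10"),
]

def process_switch_lookup(routes, dst):
    """Process switching: scan the whole routing table for every packet."""
    addr = ipaddress.ip_address(dst)
    best = None
    for prefix, next_hop in routes:
        net = ipaddress.ip_network(prefix)
        if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, next_hop)
    return best[1] if best else None

class FastSwitchCache:
    """Fast switching: the first packet takes the slow path, later ones hit the cache."""
    def __init__(self, routes):
        self.routes = routes
        self.cache = {}
        self.slow_lookups = 0

    def forward(self, dst):
        if dst not in self.cache:
            self.slow_lookups += 1
            self.cache[dst] = process_switch_lookup(self.routes, dst)
        return self.cache[dst]

def build_fib(routes):
    """CEF-like step: preprocess the routing table once, longest prefix first."""
    fib = [(ipaddress.ip_network(p), nh) for p, nh in routes]
    fib.sort(key=lambda entry: entry[0].prefixlen, reverse=True)
    return fib

def fib_lookup(fib, dst):
    """The first match in the presorted FIB is the longest-prefix match."""
    addr = ipaddress.ip_address(dst)
    for net, next_hop in fib:
        if addr in net:
            return next_hop
    return None
```

All three lookups return the same next hop; they differ only in how much work is done per packet.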
Use VLAN
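VLANs provide good Layer 2 isolation.
An L3 switch allows different VLANs to communicate through an L3 interface called an SVI (switched virtual interface). It is a virtual interface on the VLAN with no corresponding physical port, used only for communication between VLANs.
When a VLAN has no active physical ports, its virtual interface is also set to down, so no more packets are routed to that VLAN. This behavior is called Autostate.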
Link Redundancy and Load Distribution
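Fault tolerance and traffic distribution:
Use EtherChannels to increase bandwidth: multiple links are bundled together and appear to STP as a single link. LACP can be used to negotiate the bundle.
For L2 load distribution, we consider only the loop-free case.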
HSRP, VRRP, and GLBP are the key protocols to provide redundancy when working with a static routing environment. HSRP is a Cisco proprietary protocol (RFC 2281, informational), VRRP is an Internet Engineering Task Force (IETF)–proposed standard (RFC 2338), and GLBP is a Cisco proprietary protocol.
With HSRP, only one of the two routers (the active router) is responsible for routing the servers’ traffic; the standby router assumes responsibility for the task when the active router fails.
Aggregation1 and Aggregation2 both have an interface on VLAN 10: 10.0.0.253 and 10.0.0.254.
Together, they provide the default gateway to the servers: 10.0.0.1.
Aggregation1 is the active HSRP router: when the server sends an ARP request for 10.0.0.1, Aggregation1 responds with the MAC address 0000.0c07.ac01, which is a virtual MAC (vMAC) address; the burned-in MAC address (BIA) for Aggregation1 is 0003.6c43.8c0a.
In case the interface of Aggregation1 on VLAN 10 is lost, Aggregation2 takes over 10.0.0.1 and the MAC address 0000.0c07.ac01.
HSRP Group
One VLAN segment can have multiple HSRP groups, which allows multiple virtual IP addresses to be used concurrently.
One single router interface can belong to multiple groups and be active for one group and standby for another one.
You assign half of the servers to use the HSRP IP address of group 1 (10.0.0.1) as the default gateway and the other half to use the HSRP IP address of group 2 (10.0.0.2).
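As a rough illustration of the two-group setup, the sketch below derives the HSRP virtual MAC from the group number (the 0000.0c07.acXX pattern, which matches the 0000.0c07.ac01 address quoted earlier for group 1) and alternates servers between the two virtual gateways. The function names and the alternating assignment are assumptions made for this example, not a Cisco tool.

```python
def hsrp_v1_vmac(group: int) -> str:
    """Virtual MAC for an HSRP (version 1) group: 0000.0c07.acXX, XX = group in hex."""
    if not 0 <= group <= 255:
        raise ValueError("HSRP v1 group must be between 0 and 255")
    raw = f"00000c07ac{group:02x}"
    return ".".join(raw[i:i + 4] for i in range(0, 12, 4))

# Virtual IP of each group, taken from the text above.
GROUP_VIP = {1: "10.0.0.1", 2: "10.0.0.2"}

def default_gateway(server_index: int) -> str:
    """Hypothetical policy: even-numbered servers use group 1, odd-numbered use group 2."""
    group = 1 if server_index % 2 == 0 else 2
    return GROUP_VIP[group]
```

With Aggregation1 active for group 1 and Aggregation2 active for group 2, each router then forwards traffic for roughly half of the servers while backing up the other half.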
VRRP is conceptually similar to HSRP.
In the presence of multiple routers on a VLAN segment, VRRP elects a router as master and the other routers as backup for a given virtual router (equivalent to an HSRP group).
VRRP has preemption enabled by default. You can use the command no vrrp group preempt to disable preemption.
The master router sends hello packets to the multicast IP address 224.0.0.18 (MAC 0100.5e00.0012) every 1 sec, and the backup detects the failure of the master after three hello packets are lost.
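The MAC address quoted for 224.0.0.18 follows from the standard IPv4 multicast mapping: the low 23 bits of the IP address are copied into the 0100.5e00.0000 prefix. The sketch below shows that mapping, plus the failure detection time described above. The function names are chosen for this example, and the detection time ignores the additional skew that real VRRP timers include.

```python
import ipaddress

def multicast_mac(ip: str) -> str:
    """Map an IPv4 multicast address to its Ethernet multicast MAC address."""
    addr = ipaddress.IPv4Address(ip)
    if not addr.is_multicast:
        raise ValueError(f"{ip} is not an IPv4 multicast address")
    low23 = int(addr) & 0x7FFFFF              # keep the low 23 bits of the IP
    raw = f"{0x01005E000000 | low23:012x}"    # place them under the 01-00-5e prefix
    return ".".join(raw[i:i + 4] for i in range(0, 12, 4))

def backup_detection_time(hello_interval: float = 1.0, missed_hellos: int = 3) -> float:
    """Simplified: seconds until the backup declares the master down."""
    return hello_interval * missed_hellos
```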
GLBP makes it possible for the peer routers providing redundancy to the servers to be active concurrently on the VLAN segment.
All ARP requests for the default gateway from the servers are directed to the virtual IP address (vIP) 10.0.0.1.
Only one of the routers, the active virtual gateway (AVG), is authorized to respond to the ARP request.
This router answers the ARP requests by performing a round-robin among a number of vMAC addresses (for example, two MACs).
Each vMAC address identifies a router in the GLBP group; for example, 0007.B400.0101 is the vMAC for Aggregation1 and 0007.B400.0102 is the vMAC for Aggregation2.
By answering with different vMACs to different servers, the AVG achieves load distribution: half of the servers use Aggregation1 as their default gateway, and the other half uses Aggregation2.
Each router is an active virtual forwarder (AVF) for a given virtual MAC. Aggregation1 is the AVF for 0007.B400.0101 and Aggregation2 is the AVF for 0007.B400.0102. Should Aggregation1 fail, Aggregation2 becomes the AVF for both the vMACs.
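The AVG and AVF roles can be pictured with a toy model. This is a simulation of the behavior described above, not GLBP itself; the class and method names are invented for this example.

```python
from itertools import cycle

class GlbpGroup:
    """Toy model: the AVG hands out vMACs round-robin; each vMAC has an AVF."""

    def __init__(self, forwarders):
        # forwarders maps each vMAC to the router currently acting as its AVF.
        self.forwarders = dict(forwarders)
        self._vmacs = cycle(list(forwarders))

    def arp_reply(self):
        """AVG answer to an ARP request for the vIP: the next vMAC in turn."""
        return next(self._vmacs)

    def fail(self, router, survivor):
        """When a router fails, the survivor becomes AVF for its vMACs too."""
        for vmac, owner in self.forwarders.items():
            if owner == router:
                self.forwarders[vmac] = survivor

group = GlbpGroup({
    "0007.B400.0101": "Aggregation1",
    "0007.B400.0102": "Aggregation2",
})
replies = [group.arp_reply() for _ in range(4)]  # four servers ARP for 10.0.0.1
```

Here `replies` alternates between the two vMACs, so two of the four servers send their traffic to each aggregation switch.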
L3 load distribution methods
The links between the aggregation switches and the core are typically Layer 3 links, and it is desirable to take advantage of the bandwidth provided by all these links.
OSPF allows four equal-cost routes by default, which you can extend to eight routes with the command maximum-paths under the router ospf configuration.
EIGRP allows load balancing for four equal-cost routes by default. You can modify this parameter with the maximum-paths command. Unlike OSPF, EIGRP can also load-balance unequal-cost routes if you use the variance command.
There are several ways to load-balance across routes:
Per-packet: Each packet is treated independently, and the router round-robins the packets on all the available routes (equal-cost routes). Packets may arrive out of order.
Per-destination: Traffic destined to a specific host always takes the same next hop; packets from different clients for the same destination take the same next hop.
Per-source-and-destination: Load balancing on both the source IP address and the destination IP address allows better load distribution without breaking the packet sequence for a specific flow.
Process switching uses per-packet load balancing.
Fast switching uses per-destination load balancing.
CEF uses either per-packet or per-source-and-destination load balancing.
Flow-based MLS typically uses no load balancing by default. You can configure it to per-source-and-destination load balancing by changing the flowmask to source-destination.
CEF-based MLS typically uses per-source-and-destination load balancing (source and destination IP address) by default.
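The difference between the three load-balancing modes comes down to what selects the next hop. The sketch below is a simplified illustration: CRC32 is used only as a convenient deterministic hash, and real switches use their own hardware hash functions.

```python
import ipaddress
import zlib
from itertools import cycle

def per_packet(next_hops):
    """Per-packet: return a chooser that round-robins over the next hops."""
    hops = cycle(next_hops)
    return lambda: next(hops)

def per_destination(dst, next_hops):
    """Per-destination: the destination address alone selects the next hop."""
    key = ipaddress.ip_address(dst).packed
    return next_hops[zlib.crc32(key) % len(next_hops)]

def per_source_and_destination(src, dst, next_hops):
    """Per-source-and-destination: both addresses feed the hash."""
    key = ipaddress.ip_address(src).packed + ipaddress.ip_address(dst).packed
    return next_hops[zlib.crc32(key) % len(next_hops)]
```

Because the hash input is fixed for a given source and destination pair, every packet of a flow takes the same path and stays in order, while different clients talking to the same server can still be spread across links.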
Dual-Attached Servers
Attach dual-NIC servers to a Layer 2 infrastructure for a loop-free design.
Security
Areas that need protection
Internet Edge
You can provide security at the Internet Edge using the following methods:
Deploying antispoofing filtering to prevent DoS attacks by limiting IP spoofing
RFC 1918 filtering:
RFC 1918 filtering makes sure that no packets using source IP addresses from the private address space are sent to or received from the Internet.
RFC 2827 filtering:
RFC 2827 filtering prevents the spoofing of the enterprise address space by blocking incoming packets with source IP addresses belonging to the public address space reserved for the enterprise’s public services.
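Both filters amount to a source-address check at the Internet Edge. A minimal sketch follows, assuming a hypothetical enterprise public block of 192.0.2.0/24 (a documentation prefix used here purely as a placeholder); a real deployment would express the same logic as router ACLs.

```python
import ipaddress

# Private address space defined by RFC 1918.
RFC1918 = [ipaddress.ip_network(n)
           for n in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")]

# Hypothetical public block assigned to the enterprise (placeholder value).
ENTERPRISE_PUBLIC = ipaddress.ip_network("192.0.2.0/24")

def is_private(addr):
    return any(addr in net for net in RFC1918)

def permit_inbound(src: str) -> bool:
    """Packets arriving from the Internet: drop private or spoofed enterprise sources."""
    addr = ipaddress.ip_address(src)
    if is_private(addr):
        return False              # RFC 1918 filtering
    if addr in ENTERPRISE_PUBLIC:
        return False              # RFC 2827 filtering as described above
    return True

def permit_outbound(src: str) -> bool:
    """Packets leaving toward the Internet: never leak private source addresses."""
    return not is_private(ipaddress.ip_address(src))
```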
Using uRPF, also to prevent DoS attacks by limiting IP spoofing
When uRPF is enabled, the router looks up not only the destination IP address of each packet but also its source IP address in the routing table.
It verifies that the routing table contains a route to the source IP address of the packet and that this route is associated with the interface the packet came from.
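The check can be sketched as a reverse lookup on the source address. The routes and interface names below are hypothetical, and the sketch models only the strict form of the check described above.

```python
import ipaddress

# Hypothetical routing table: (prefix, outgoing interface).
ROUTES = [
    ("10.0.0.0/24", "vlan10"),
    ("10.0.1.0/24", "vlan20"),
    ("0.0.0.0/0", "isp"),
]

def best_route_interface(address: str):
    """Longest-prefix match; returns the interface of the best route, or None."""
    addr = ipaddress.ip_address(address)
    best = None
    for prefix, iface in ROUTES:
        net = ipaddress.ip_network(prefix)
        if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, iface)
    return best[1] if best else None

def urpf_pass(src: str, in_iface: str) -> bool:
    """Accept the packet only if the route back to its source uses the arrival interface."""
    return best_route_interface(src) == in_iface
```

A packet that claims an internal source such as 10.0.0.5 but arrives on the ISP-facing interface fails the check, which is how spoofed traffic gets dropped.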
ACL
Allow only access to and from the public services provided by the enterprise.
These filters permit the typical services used in a Data Center, such as DNS, HTTP, Simple Mail Transfer Protocol (SMTP), ICMP, and Network Time Protocol (NTP).
Implementing traffic rate limiting to reduce the effect of DoS and DDoS attacks
Traffic rate limiting consists of implementing queuing mechanisms that control the volume of traffic forwarded through a router.
The traffic is usually classified based on protocol, source and destination IP address, and port numbers.
Each defined traffic type is assigned a threshold, after which packets are processed at a lower priority or are simply discarded.
You can use traffic rate limiting to reduce the effects of DoS attacks and their large volumes of data.
Drawbacks:
fixed thresholds
legitimate packets often cannot be distinguished from DoS packets
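One common way to enforce a per-class threshold is a token bucket. The text above does not name a specific mechanism, so treat this as one possible sketch with made-up rates; timestamps are passed in explicitly to keep the example deterministic.

```python
class TokenBucket:
    """Allow up to `rate_pps` packets per second, with bursts of up to `burst` packets."""

    def __init__(self, rate_pps: float, burst: int):
        self.rate = rate_pps
        self.burst = burst
        self.tokens = float(burst)   # start with a full bucket
        self.last = 0.0              # time of the previous packet, in seconds

    def allow(self, now: float) -> bool:
        # Refill according to the elapsed time, capped at the bucket size.
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True              # within the threshold: forward normally
        return False                 # above the threshold: drop or deprioritize

# One bucket per traffic class; the classes and limits are hypothetical.
limits = {("icmp", None): TokenBucket(rate_pps=2, burst=3)}
```

The sketch also shows both drawbacks: the threshold is fixed, and the bucket counts packets without knowing which ones are legitimate.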
Securing routing protocols to avoid trust exploitation and routing disruptions
When you use dynamic routing, you implement Border Gateway Protocol (BGP) between the ISP and the Internet Edge routers, and you deploy an Interior Gateway Protocol (IGP) such as Open Shortest Path First (OSPF) or Enhanced Interior Gateway Routing Protocol (EIGRP) to propagate routing information to the interior of the enterprise network.
Attackers may inject illegitimate routing updates.
Protocols such as BGP, OSPF, Intermediate System-to-Intermediate System (IS-IS), EIGRP, and Routing Information Protocol Version 2 (RIPv2) provide mechanisms to ensure that routing updates are valid and are received from legitimate routing peers. They achieve this goal by using route filters and neighbor router authentication.
Route filters are typically deployed at the ISP router to ensure that only the public networks assigned to the enterprise are externally advertised.
Internet Edge routers should use neighbor router authentication to ensure that routing updates are valid and are received only from legitimate peers.
1. The routers are configured with a shared secret key that is used to sign and validate each routing update.
2. Every time a router has to send a routing update, the routing update is processed with a hash function that uses the secret key to produce a digest.
3. The resulting digest is appended to the routing update. In this way, the routing update message contains the actual routing update plus its corresponding digest.
4. Once the message is sent, the receiving router processes the routing update with the same hash function and secret key.
5. The receiving router compares the result with the digest in the routing update message. A match means that the sender has signed the update using the same secret key and hashing algorithm and that the message has not changed while in transit.
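The five steps above can be sketched with Python's standard hmac module. HMAC-SHA256 is used here only as a stand-in for a keyed hash; each routing protocol defines its own digest construction and message format, and the key and update payload below are placeholders.

```python
import hashlib
import hmac

DIGEST_LEN = hashlib.sha256().digest_size  # 32 bytes

def sign_update(update: bytes, key: bytes) -> bytes:
    """Steps 2 and 3: hash the update with the shared key and append the digest."""
    digest = hmac.new(key, update, hashlib.sha256).digest()
    return update + digest

def verify_update(message: bytes, key: bytes):
    """Steps 4 and 5: recompute the digest and compare; return the update or None."""
    if len(message) < DIGEST_LEN:
        return None
    update, digest = message[:-DIGEST_LEN], message[-DIGEST_LEN:]
    expected = hmac.new(key, update, hashlib.sha256).digest()
    return update if hmac.compare_digest(digest, expected) else None

shared_key = b"example-shared-secret"           # step 1: configured on both routers
message = sign_update(b"route 10.0.0.0/24 via 10.0.0.253", shared_key)
```

A receiver holding the same key recovers the update, while a modified message or a sender with a different key fails the comparison.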
Deploying stateful firewalls to prevent unauthorized access
The use of stateful firewalls has two main goals: protecting the Internet server farm and controlling the traffic between the Internet and the rest of the enterprise network.
Implementing intrusion detection to detect network reconnaissance activities and to identify threats and intruders
When you deploy the network-based sensor in a switched infrastructure, you must use features such as switch port analyzer (SPAN) or capture to forward traffic to the monitoring interface of the IDS sensor.
DNS signatures: Examples are 6050 - DNS HINFO Request, 6051 - DNS Zone Transfer, 6052 - DNS Zone Transfer from High Port, 6053 - DNS Request for All Records, 6054 - DNS Version Request, 6055 - DNS Inverse Query Buffer Overflow, and 6056 - DNS NXT Buffer Overflow.
HTTP signatures: Examples are 5188 - HTTP Tunneling, 5055 - HTTP Basic Authentication Overflow, 3200 - WWW Phf Attack, 3202 - WWW .url File Requested, 3203 - WWW .lnk File Requested, 3204 - WWW .bat File Requested, 3212 - WWW NPH-TEST-CGI Attack, and 3213 - WWW TEST-CGI Attack.
FTP signatures: Examples are 3150 - FTP Remote Command Execution, 3151 - FTP SYST Command Attempt, 3152 - FTP CWD ~root, 3153 - FTP Improper Address Specified, 3154 - FTP Improper Port Specified, 3155 - FTP RETR Pipe Filename Command Execution, 3156 - FTP STOR Pipe Filename Command Execution, 3157 - FTP PASV Port Spoof, 3158 - FTP SITE EXEC Format String, 3159 - FTP PASS Suspicious Length, and 3160 - Cesar FTP Buffer Overflow.
E-mail signatures: Examples are 3100 - Smail Attack, 3101 - Sendmail Invalid Recipient, 3102 - Sendmail Invalid Sender, 3103 - Sendmail Reconnaissance, 3104 - Archaic Sendmail Attacks, 3105 - Sendmail Decode Alias, 3106 - Mail Spam, and 3107 - Majordomo Execute Attack.
Host-based IDSs specifically target host vulnerabilities, including the following:
- Protection against e-mail worm attacks such as GONER or NIMDA
- Protection against application hijacking using a dynamic link libraries (DLLs) control hook
- Protection against downloading files using instant-messenger applications
- Protection against known buffer-overflow attacks
- Control of application execution in the system
Campus Core
Disable any unnecessary services and harden the configuration of the switches and routers that build the campus core.
The second recommendation is to secure the exchange of routing updates with routing-update authentication, route filters, and neighbor definitions.
Use secure protocols such as Secure Shell (SSH) and Simple Network Management Protocol Version 3 (SNMPv3), and avoid insecure protocols that do not protect usernames and passwords
Intranet Server Farms
Management Isolation
Performance
Traffic Patterns
Internet Traffic Patterns
Several organizations conduct research in this area:
San Diego Supercomputer Center (SDSC) http://www.sdsc.edu/
The Cooperative Association for Internet Data Analysis (CAIDA) http://www.caida.org/
The National Laboratory for Applied Network Research, Measurement Network Analysis Group (NLANR) http://www.nlanr.net/
Wide-Area Internet Traffic Patterns and Characteristics
TCP averages 95 percent of bytes, 90 percent of packets, and at least 75 percent of flows on the link.
User Datagram Protocol (UDP) averages 5 percent of bytes, 10 percent of packets, and 20 percent of flows.
Web traffic makes up 75 percent of bytes, 70 percent of packets, and 75 percent of flows in the TCP category.
In addition to Web traffic, Domain Name System (DNS), Simple Mail Transfer Protocol (SMTP), FTP data, Network News Transfer Protocol (NNTP), and Telnet are identified as contributing a visible percentage.
DNS represents 18 percent of flows but only 3 percent of total packets and 1 percent of total bytes.
SMTP makes up 5 percent of bytes, 5 percent of packets, and 2 percent of flows.
FTP data produces 5 percent of bytes, 3 percent of packets, and less than 1 percent of flows.
NNTP contributes 2 percent of bytes and less than 1 percent of packets and flows.
Intranet Traffic Pattern
A good source of information for measuring performance of IP networks is the paper “Measuring IP Network Performance” by Geoff Huston in the Internet Protocol Journal at http://www.cisco.com/warp/customer/759/ipj_6-1/ipj_6-1_measuring_ip_networks.html.
Common performance metrics
Throughput: The maximum rate at which none of the offered frames are dropped by the device.
Frame loss: Percentage of frames that should have been forwarded by a network device under steady state (constant) load that were not forwarded due to lack of resources.
Latency for store and forward devices: The time interval starting when the last bit of the input frame reaches the input port and ending when the first bit of the output frame is seen on the output port.
Latency for bit-forwarding devices: The time interval starting when the end of the first bit of the input frame reaches the input port and ending when the start of the first bit of the output frame is seen on the output port.
Connection processing rate: The maximum rate of new connections the device is able to process.
Concurrent connections (CC): The number of simultaneous connections the device is able to track and process.
Multilayer Switch Metrics
Throughput:
Throughput is measured in bits per second (BPS) or packets per second (PPS). BPS gives the absolute throughput number; PPS multiplied by the packet size in bits yields the same figure.
Multilayer switches process frames or packets.
You obtain the maximum throughput values using the maximum transmission unit (MTU) size.
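The relation between PPS and BPS is simple arithmetic. The sketch below also estimates the highest frame rate an Ethernet link can carry, assuming the usual 20 bytes of per-frame wire overhead (preamble, start-of-frame delimiter, and inter-frame gap); that overhead figure is background knowledge about Ethernet, not something stated in these notes.

```python
# Per-frame wire overhead on Ethernet: 8 bytes preamble/SFD + 12 bytes inter-frame gap.
ETHERNET_OVERHEAD_BYTES = 20

def bps_from_pps(pps: int, packet_bytes: int) -> int:
    """Throughput in bits per second carried by `pps` packets of `packet_bytes` each."""
    return pps * packet_bytes * 8

def max_pps(link_bps: int, frame_bytes: int) -> int:
    """Highest frame rate a link can carry for a given frame size."""
    return link_bps // ((frame_bytes + ETHERNET_OVERHEAD_BYTES) * 8)

small = max_pps(1_000_000_000, 64)      # 1,488,095 frames/s on Gigabit Ethernet
large = max_pps(1_000_000_000, 1518)    # 81,274 frames/s
```

Small frames stress the PPS capacity of a device, while large frames at the MTU size yield the highest BPS figure, which is why maximum throughput values are quoted at MTU size.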
Frame and Packet Loss:
Frame and packet loss tests reveal the actual processing limits of the device under test (DUT) under a constant load.
Latency:
Latency generally increases as the depth of packet inspection increases.
Firewall Metrics
The DoS handling tests determine how the firewall deals with a high rate of TCP connection requests (SYN packets). This maximum rate indicates how well the firewall would fare under such an attack (a SYN flood attack).
HTTP transfer rate refers to how the firewall handles entire HTTP transactions that include the TCP connection request, the transfer of the objects associated with the URL in the request, and the final connection teardown.
HTTP transaction rate refers to the transaction rate per unit of time that the firewall is able to support.
Illegal traffic handling refers to the capability of the firewall to handle both legal and illegal traffic concurrently.
IP fragmentation handling refers to the capability of the firewall to process fragments that might require re-assembly before a rule could be applied.
Load Balancer Performance Metrics
CPS describes how many new connection requests per second a load balancer can process.
The term processing implies the successful completion of the connection handshake and connection teardown.
CC refers to the number of simultaneous connections a load balancer can support.
PPS describes how many packets per second a load balancer can process.
A load balancer has the potential to add more latency than other devices because it can execute tasks deeper in the payload of packets.
At Layer 4, the load balancer must perform the following tasks:
- 5-tuple lookup
- Lookup of content policy information on TCP/IP headers
- Rewrite of MAC header information
- Rewrite of IP header information
- Checksum calculations for TCP
- Calculation and rewrite of other TCP/UDP header information
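A rough sketch of the Layer 4 path follows: a 5-tuple lookup in a connection table, server selection for new flows, and a rewrite of the destination address. The class, the round-robin policy, and the addresses are illustrative assumptions, and the MAC rewrite and checksum steps are left out.

```python
from typing import NamedTuple

class FiveTuple(NamedTuple):
    protocol: str
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int

class Layer4LoadBalancer:
    def __init__(self, servers):
        self.servers = servers
        self.connections = {}    # 5-tuple -> chosen real server
        self._next = 0

    def forward(self, flow: FiveTuple) -> FiveTuple:
        server = self.connections.get(flow)           # 5-tuple lookup
        if server is None:
            # New connection: pick a server round-robin and remember the choice.
            server = self.servers[self._next % len(self.servers)]
            self._next += 1
            self.connections[flow] = server
        # IP header rewrite: replace the virtual IP with the real server address.
        return flow._replace(dst_ip=server)

lb = Layer4LoadBalancer(["10.0.0.11", "10.0.0.12"])
flow_a = FiveTuple("tcp", "198.51.100.7", 40000, "10.0.0.100", 80)
flow_b = FiveTuple("tcp", "198.51.100.8", 40001, "10.0.0.100", 80)
```

The size of the `connections` table corresponds to the CC metric, and the rate at which new entries are created corresponds to CPS.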
At Layer 5, the load balancer performs all Layer 4 tasks in addition to the following:
- Spoofing TCP connections toward the client side
- Lookup of content policy information on packet payload
- Initiating new TCP connections with the server
- Maintaining both client and server connection synchronization, which requires SEQ and checksum calculation, in addition to other header rewrite operations for both connections
Response time is loosely defined as the elapsed time between the end of an application layer request (the user presses the Enter key) and the end of the response (the data is displayed on the user’s screen).
SSL Offloaders Performance Metrics
The CPS rate that you should measure for SSL offloaders is related to the number of SSL handshakes they can complete. This metric is often called transactions per second (TPS) or sessions per second.
Concurrent connections or rather concurrent SSL sessions are mostly related to long-lived sessions and therefore indicate the memory capacity to hold them.
As with load balancers, measuring PPS requires real traffic or at least real SSL connections.
Latency on an SSL offloader indicates the time it would take the device to process the data, which in this case is the SSL handshake and subsequent encryption/decryption of packets.
Testing Tools
First are the web load tools:
http://www.testingfaqs.org/t-load.html lists a number of tools.
http://www.softwareqatest.com/qatweb1.html lists a number of tools under the category of load and performance tools.
http://www.aptest.com/resources.html lists a number of testing tools under the category of web test tools.
The next list outlines specific testing tools:
HTTPLOAD from ACME offers a variety of tools for HTTP-related tests at http://www.acme.com/software/http_load/.
The Web Application Stress Tool from Microsoft is at http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dnw2kmag00/html/StressTool.asp.
WebStone from Mindcraft for benchmarking Web servers is at http://www.mindcraft.com/webstone/.
WebBench from Ziff Davis is at http://www.etestinglabs.com/benchmarks/webbench/webbench.asp.
SPECweb99 from Standard Performance Evaluation Corporation is at http://www.spec.org/osg/web99/.