Congestion Avoidance in TCP
- Consequence of lack of congestion control
- When a popular resource is shared without regulation the result is always over-utilization
- With the introduction of TCP in 1983, users could write networking applications that require reliability with greater ease
- As more applications became available, more data and information were exchanged on the Internet.
- In a mere 3 years' time, the Internet had its first breakdown....
- A classic paper by Jacobson contains the following introduction:
"In October of '86, the Internet had the first of what became a series of congestion collapses. ..., the data throughput from LBL to UC Berkeley (sites separated by 400 yards and 2 IMP - i.e., routers - hops) dropped from 32 Kilo bits/sec to 40 bits/sec."
- Jacobson's paper can be found here: click here
- History of Congestion Control in TCP
- Many (increasingly sophisticated) congestion avoidance mechanisms have been added to TCP since Jacobson's work on Congestion Control.
- The Congestion Control mechanism in TCP is still evolving.... (it remains a research topic !)
- The most popular versions of TCP - named after places in and around Nevada - are:
- TCP Tahoe
- This is the original version of TCP congestion control as implemented by Jacobson
- Congestion detection mechanism is based on packet loss
- Techniques used for congestion control:
- Slow Start
- Congestion Avoidance
- Fast Retransmit
- TCP Reno
- For many years this was the most widely deployed version of the TCP congestion control mechanism.
- Techniques used for congestion control:
- same as TCP Tahoe (Slow Start, Congestion Avoidance and Fast Retransmit), plus
- Fast Recovery
- TCP Vegas
- This is a completely new implementation
- Congestion detection mechanism is based on end-to-end delay
- TCP packet size
- TCP is a byte oriented protocol
- However, TCP would not send every byte in a separate packet since this would result in an enormous overhead...
- User data is carried inside a TCP packet, which itself is carried inside an IP packet:
- Example:
- If no additional options are used (no additional packet header information beside the necessary ones), and each packet carries a single byte of data, the IP packet size will be 41 bytes.
- The efficiency (useful part of the packet) would be 1/41 or 2.4%
- That's equivalent to Uncle Sam taking 97.6% in income taxes !!!
- TCP will always try to send multiple bytes in a packet to improve efficiency
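The overhead arithmetic in the example above can be checked with a few lines of Python. This is only a sketch: it assumes 20-byte IP and TCP headers without options (which gives the 41-byte packet of the example), and the function name `efficiency` and the 100-byte and 1460-byte payloads are illustrative choices, not part of these notes.

```python
IP_HEADER = 20   # bytes, IP header without options (assumed)
TCP_HEADER = 20  # bytes, TCP header without options (assumed)

def efficiency(payload_bytes):
    """Fraction of the IP packet that is user data."""
    total = IP_HEADER + TCP_HEADER + payload_bytes
    return payload_bytes / total

if __name__ == "__main__":
    # Compare a 1-byte payload with larger (hypothetical) payload sizes.
    for payload in (1, 100, 1460):
        percent = round(100 * efficiency(payload), 1)
        print(payload, "data bytes per packet:", percent, "% useful")
```

Under these assumptions, the larger the payload, the smaller the share of the packet that is taken up by headers.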
- Since we are dealing with congestion control, we will assume the worst-case scenario: the TCP source is transmitting a large amount of data continuously
In other words:
We assume that every TCP source is sending at its maximum data rate. This is achieved by sending packets that are as large as possible.
- Maximum Packet (Segment) Size
- User/System can impose a maximum packet size used in TCP
- The maximum packet size is called Maximum Segment Size or MSS
- The MSS will play an important part in describing the congestion avoidance mechanism used in TCP....
- Transmission Data Rate
- The transmission data rate is indirectly dependent on the transmit window size...
The dependency is pretty complicated and very dynamic in nature
The following examples will derive a simple relationship between the data transmission rate and the transmit window size.
- Example 1:
- Suppose the sender has a lot of data to transmit (transmits continuously) and the transmit window size is equal to MSS (Maximum Segment Size). The following will happen:
- Because the transmit window size is equal to MSS, the sender can send only 1 packet at a time and must then stop (it has promised not to send more than MSS bytes before hearing back from the receiver on how the data were received).
- The ACK for the packet will return in approximately RTT (round trip time) seconds
- When the ACK returns, the sender sends the next packet. (If the ACK does not return for a long time, the sender will retransmit - the sender assumes the packet was lost).
- The resulting transmission rate is approximately MSS/RTT bytes per second (i.e., 8*MSS/RTT bits per second).
- Example 2: If window size = 2*MSS, the sender can send 2 packets per round trip, so it sends faster (approximately 2*MSS/RTT bytes per second):
But do not conclude that data rate is proportional to the window size. The above examples are "idealized". Network delays, route changes and other factors can make the relationship very unpredictable and dynamic.
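The idealized relationship in the two examples can be sketched as follows. The values of `MSS` and `RTT` are made-up illustration values, and the function name `approx_rate` is hypothetical; as the warning above says, a real network will not follow this simple formula exactly.

```python
def approx_rate(window_bytes, rtt_sec):
    """Idealized sending rate in bytes/sec: one full window per round trip."""
    return window_bytes / rtt_sec

MSS = 1000  # bytes per packet (assumed value)
RTT = 0.1   # round trip time in seconds (assumed value)

if __name__ == "__main__":
    for packets in (1, 2, 4):
        rate = approx_rate(packets * MSS, RTT)
        # Convert bytes/sec to Kbps (1 byte = 8 bits) for display.
        print(f"window = {packets} x MSS: about {rate * 8 / 1000:.0f} Kbps")
```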
- TCP Transmit Window Size
- Terminology:
Transmit Window = the number of packets (of MSS bytes each) that the sender can transmit without having to wait for an acknowledgement
- The size of the Transmit Window is computed using 2 windows:
- Advertised Window Size
- TCP's Congestion Window Size
- We have learned about the Advertised Window Size previously (see: click here ):
- The advertised window size is the amount of data that the receiver is willing to buffer when data arrives out of order:
- TCP's Congestion Window Size
- An intuitive definition of the Congestion Window Size is:
Congestion Window Size = the number of packets (of MSS bytes of data each) that the sender can transmit into the network without causing congestion in the network.
- Notice that this amount of data depends on the current network status and thus varies over time...
In fact, it changes faster than the weather and it is just as unpredictable...
- Relationship of the Transmit Window and the Congestion Window
- When users send a large amount of data through the shared Internet, they must be courteous in regard to:
- The receiving party (do not overwhelm a slower receiver)
- The shared transmission medium (Internet)
- In other words, the sender must NOT transmit more data than:
- The receiver can handle
- The network links can handle
- In yet other words:
- TCP Transmit Window size <= Advertised Window size, and:
- TCP Transmit Window size <= Congestion Window size
- Notations used in TCP congestion control scheme
- Advertised Window Size (AWS) = amount of data that the receiver will buffer.
AWS is carried in the TCP header of every segment that the receiver sends, so it can change during the connection; to keep the discussion simple, we treat it as a given value
- Congestion Window Size (CWND) = window size imposed by the TCP congestion mechanism to avoid causing congestion in the network
CWND changes over time !!!
- Transmit Window Size (TWS) = the amount of unacknowledged data, i.e., data that TCP transmits in a burst without receiving any indication on what happened to the data.
- Relationship:
TWS = min (AWS, CWND)
- How TCP controls its transmission rate
- Recall that the Advertised Window Size (AWS) is contained in the TCP header (so TCP has this information at its disposal)
- The TCP congestion control algorithm will compute the value of CWND according to (implicit) signals/events from the network (such as timeouts and duplicate ACKs, see: click here )
(We have not yet discussed HOW TCP changes the value of CWND - that will come next)
- From the values of AWS and CWND, TCP computes the transmit window size as:
- TWS = min (AWS, CWND)
- Because AWS is not under the sender's control (it is determined by the receiver), we will leave this value out of the discussion.
In the remainder of the discussion, we will discuss how TCP updates the value of CWND
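The rule TWS = min (AWS, CWND) can be sketched in a few lines. The function names and the notion of `bytes_in_flight` (data sent but not yet acknowledged) are illustrative assumptions used only for this sketch.

```python
def transmit_window(aws, cwnd):
    """TWS = min(AWS, CWND): respect both the receiver and the network."""
    return min(aws, cwnd)

def can_send(aws, cwnd, bytes_in_flight):
    """Bytes the sender may still transmit before it must wait for an ACK."""
    return max(0, transmit_window(aws, cwnd) - bytes_in_flight)

if __name__ == "__main__":
    # The network is the limiting factor here (CWND < AWS).
    print(can_send(aws=65535, cwnd=4000, bytes_in_flight=3000))
    # The receiver is the limiting factor here (AWS < CWND).
    print(can_send(aws=2000, cwnd=4000, bytes_in_flight=3000))
```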
- TCP modes/phases of operation
- The key to understanding why TCP operates the way it does is to remember that network conditions change constantly
- New TCP connections can start at any time, which will reduce the available network capacity for existing TCP connections
- Existing TCP connections can end at any time, which will increase the available network capacity for the remaining TCP connections
- To accommodate this uncertainty, TCP operates in two different modes/phases
- Slow Start Mode/Phase:
- This is the start-up mode of operation of TCP
- In this mode/phase, TCP has an idea (guess) about the maximum transmission rate and is trying to reach this transmission rate
- Although TCP has an idea (guess) about this maximum transmission rate, TCP will NOT transmit at this rate instantaneously
Rather, TCP will try to reach this maximum transmission rate in a piecemeal fashion
- In this phase, TCP will start by transmitting ONE packet and, after each successful transmission epoch, TCP will DOUBLE the number of packets (resulting in an exponential increase in the number of packets over time).
- Congestion Avoidance Phase:
- This is the phase that begins AFTER the start-up phase
The start-up phase ends when TCP has reached the maximum transmission rate that it "believed" to be safe.
- In other words, TCP is now in uncharted territory....
Because TCP has reached the maximum safe level without trouble, it would appear that there is still some more capacity available - it would be a shame NOT to use the available capacity !!!
- But ! TCP has no idea what the new maximum capacity is... so it must be careful !
- In this phase, TCP will increase the number of packets much more slowly than in the start-up phase (the increase will not be exponential, but linear)
- TCP congestion strategy: A video game analogy
- What TCP is doing is somewhat the same strategy as playing a video game...
- In some adventure video games, there are "danger" areas where the player gets killed by some booby trap.
- So how do you play such a video game ?
- You just walked into a trap and got killed....
- Restart the game, and play quickly up to the point where you got killed.
- From that point on, play very carefully.....
- The life of TCP is like a never-ending video game:
- When TCP detects congestion (through a packet loss), it sets a threshold (called SSThresHold in the sections that follow) to half of the transmit window size that was in use when the packet loss occurred.
(Because the current transmit window size caused a packet loss, half of the current transmit window size is a conservative estimate of the NEW safe level to operate at !)
- Then TCP will restart by transmitting using CWND = 1 (ONE packet outstanding) and increase CWND exponentially (from ONE) up to the new safe level
This phase is the slow start phase
- When TCP reaches the new safe level (this is the point up to which TCP believes it is safe), it will enter the second phase and increase the window size much less aggressively (linearly instead of exponentially)
This phase is the congestion avoidance phase
- The congestion avoidance phase ends when TCP detects a packet loss and the cycle starts again from the top....
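The cycle described above can be illustrated with a toy simulation that advances one round trip at a time. This is a deliberately idealized sketch, not how a real TCP stack is written: windows are counted in whole packets, a loss is assumed to happen exactly when the window exceeds a fixed `capacity`, and the names `simulate`, `capacity` and `ssthresh` (the "safe level") are illustrative assumptions.

```python
def simulate(capacity, rounds, initial_ssthresh):
    """Return the window size (in packets) used in each round trip."""
    cwnd = 1
    ssthresh = initial_ssthresh   # the level TCP believes to be safe
    trace = []
    for _ in range(rounds):
        trace.append(cwnd)
        if cwnd > capacity:
            # Packet loss: remember half of the window that caused the loss,
            # then restart from ONE packet.
            ssthresh = max(cwnd // 2, 2)
            cwnd = 1
        elif cwnd < ssthresh:
            # Slow start: double per round trip, but stop at the safe level.
            cwnd = min(cwnd * 2, ssthresh)
        else:
            # Congestion avoidance: one extra packet per round trip.
            cwnd += 1
    return trace

if __name__ == "__main__":
    print(simulate(capacity=20, rounds=12, initial_ssthresh=64))
```

With a capacity of 20 packets, the trace grows exponentially, overshoots, restarts from 1, climbs quickly to half of the overshoot value and then creeps up one packet at a time.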
- Overview of the (idealized) TCP congestion control operation:
- When TCP starts out, it sets its target level (SSThresHold) to AWS (it tries to send as much data as the receiver can handle)
If the network can handle this transmission rate, TCP will not need to do any congestion control !!! (Because the bottleneck is at the receiver...)
The picture above shows a scenario where the network capacity is less than what the receiver can handle - i.e., the network is the bottleneck.
- At some point (in the figure, it happens when the sender transmits at 50 Kbps), packets are dropped and congestion is detected.
Because the packet drop happens at the moment when the sender was transmitting at 50 Kbps, the new target rate is set to 25 Kbps
- TCP increases the transmission rate exponentially until it reaches 25 Kbps
- From 25 Kbps onwards, TCP will increase the transmission rate linearly - until it discovers a packet loss
(In the figure, it happens when the sender is transmitting at 30 Kbps)
Because the packet drop happens at the moment when the sender was transmitting at 30 Kbps, the new target rate is set to 15 Kbps
- TCP increases the transmission rate exponentially until it reaches 15 Kbps
- From 15 Kbps onwards, TCP will increase the transmission rate linearly - until it discovers a packet loss
And so on....
- NOTES:
- Remember: the goal of TCP is to get the highest possible throughput.
- This goal is not achieved by sending as fast as possible, but by sending as much as the network can handle !!!
- The available network bandwidth changes constantly.
- TCP tries to determine the available capacity by remembering the data rate at which a packet drop occurred the last time, and proceeds carefully starting from half of this capacity level.
- NOTE: the first time that TCP starts, it has no idea what the network capacity is and the only thing that it can do is to set the level to what the receiver can handle...
- An overview of techniques used in TCP Congestion control
- We have just seen a high-level discussion of the TCP congestion control algorithm, which consists of 2 different phases
In the slow start phase, the transmission rate increases exponentially in time.
In the congestion avoidance phase, the transmission rate increases linearly in time.
- So basically, the difference between the 2 phases is the rate of increase in transmission speed.
- Now it's time to see how the increase in transmission speed is realised.
- TCP uses the following 3 mechanisms with very sexy sounding names:
- Slow Start
- Fast Retransmit
- Fast Recovery
We will look at each mechanism separately and indicate when the mechanism is appropriate.
The SLOW START Phase
- The Slow Start Mechanism
- During the slow start phase, TCP uses the slow start mechanism for congestion control.
- Information needed to implement the slow start mechanism:
- SSThresHold
- The window size that TCP believes to be safe
- SSThresHold = AWS when TCP begins for the first time
- SSThresHold is set to TWS/2 when TCP detects a packet loss
- CWND
- The (current) congestion window size
CWND and AWS together determine the transmit window size of TCP
- Operation of the slow start mechanism is as follows:
Initialization:
- SSThresHold is set to AWS (when TCP first begins) or Transmit Window/2 (when TCP detects congestion)
Slow Start:
- Set CWND = MSS (i.e., ONE packet)
- TCP increases CWND by MSS whenever TCP receives a NEW ACK packet (= an ACK message that TCP has never seen before)
(NOTE: If TCP receives a duplicate ACK, no updates are made to the CWND variable)
- Example of TCP operation in the slow start phase:
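- Initially (at time 0), CWND = 1
- At time RTT (round trip time), CWND = 2
- At time 2 RTT, CWND = 4
- At time 3 RTT (not in figure), CWND = 8
- And so on...
- When you plot CWND over time, CWND will increase exponentially (each of the CWND packets sent during one RTT produces a NEW ACK, and each NEW ACK adds one packet to CWND, so CWND doubles every RTT)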
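The doubling in this example follows directly from the per-ACK rule "increase CWND by MSS for every NEW ACK". The sketch below counts in whole packets (so `MSS = 1`) and assumes that every packet sent in a round trip is acknowledged; the function names are illustrative.

```python
MSS = 1  # count the window in packets for readability (assumed unit)

def slow_start_round(cwnd):
    """One RTT: cwnd packets are sent and each NEW ACK adds one MSS."""
    acks = cwnd // MSS            # one ACK per packet sent in this round
    for _ in range(acks):
        cwnd += MSS
    return cwnd

def slow_start_trace(rounds):
    """CWND at times 0, RTT, 2 RTT, ... while no packet is lost."""
    cwnd = MSS
    trace = [cwnd]
    for _ in range(rounds):
        cwnd = slow_start_round(cwnd)
        trace.append(cwnd)
    return trace

if __name__ == "__main__":
    print(slow_start_trace(3))
```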
- When to BEGIN a new Slow Start epoch
- A slow start epoch can be initiated in 2 situations:
- When a TCP connection is first established.
In this case, SSThresHold is set to AWS
- When TCP has detected a packet loss
In this case, SSThresHold is set to Transmit Window/2
- When to END a Slow Start epoch
- The slow start epoch can be ended by 2 events:
- When CWND > SSThresHold
- This is a "normal" termination.
In this case, TCP will enter the congestion avoidance phase:
TCP is now in "uncharted" territory and will increase its congestion window more slowly
- When TCP detects a packet loss
- This is an "abnormal" termination.
In this case, TCP will re-enter the slow start phase:
TCP first sets SSThresHold = Transmit Window/2
Then TCP resets CWND = 1 to start a new Slow Start epoch
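The two ways in which a slow start epoch ends can be written as one small decision function. This is a sketch only: the function name, its arguments and the phase labels are hypothetical, and on a loss it uses SSThresHold = Transmit Window/2, the same rule that is used when a slow start epoch begins after a packet loss.

```python
def check_slow_start_exit(cwnd, ssthresh, transmit_window, loss_detected, mss):
    """Return (phase, cwnd, ssthresh) after checking both exit conditions."""
    if loss_detected:
        # "Abnormal" termination: halve the safe level, restart from 1 packet.
        return ("slow_start", mss, transmit_window / 2)
    if cwnd > ssthresh:
        # "Normal" termination: continue more carefully.
        return ("congestion_avoidance", cwnd, ssthresh)
    # Neither event happened: stay in the current slow start epoch.
    return ("slow_start", cwnd, ssthresh)

if __name__ == "__main__":
    print(check_slow_start_exit(16, 32, 16, False, 1))
    print(check_slow_start_exit(33, 32, 33, False, 1))
    print(check_slow_start_exit(16, 32, 16, True, 1))
```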
- A $64,000 question:
- Why would TCP use a "slow start" procedure to increase CWND from ONE all the way to SSThresHold ?
Why not just set CWND to SSThresHold and be done with it ???
- Answer:
- TCP uses timeouts to tell if packets are lost
- The timeout value used must be estimated because we don't know in advance how far away the receiver is located.
- So TCP must maintain an estimate of the RTT to the receiver, and the timeout interval is a function of the RTT
- By sending packets gradually instead of in one burst, TCP can measure the RTT of packets more accurately
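These notes do not say which function of the RTT is used for the timeout. As one possible illustration, the sketch below follows the smoothed estimator standardized in RFC 6298 (weights 1/8 and 1/4, timeout = smoothed RTT + 4 x RTT variation); it leaves out the clock granularity term and the minimum timeout of that RFC, and the class and method names are hypothetical.

```python
class RttEstimator:
    ALPHA = 1 / 8   # weight of a new sample in the smoothed RTT
    BETA = 1 / 4    # weight of a new sample in the RTT variation

    def __init__(self):
        self.srtt = None      # smoothed RTT estimate (seconds)
        self.rttvar = None    # smoothed RTT variation (seconds)

    def update(self, sample):
        """Feed one measured RTT (seconds) and return the new timeout."""
        if self.srtt is None:
            # First measurement initializes both estimates.
            self.srtt = sample
            self.rttvar = sample / 2
        else:
            deviation = abs(self.srtt - sample)
            self.rttvar = (1 - self.BETA) * self.rttvar + self.BETA * deviation
            self.srtt = (1 - self.ALPHA) * self.srtt + self.ALPHA * sample
        return self.timeout()

    def timeout(self):
        """Timeout interval as a function of the RTT estimates."""
        return self.srtt + 4 * self.rttvar

if __name__ == "__main__":
    est = RttEstimator()
    for rtt in (0.10, 0.10, 0.30):   # made-up RTT samples in seconds
        print(round(est.update(rtt), 4))
```

Noisy RTT samples increase the variation term and therefore the timeout, which is one way to see why clean RTT measurements matter.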
- How can you call the EXPONENTIAL increase of transmission rate in "Slow Start" SLOW ???
- The name "slow start" is probably one of the worst misnomers in networking...
- How on earth can you call an exponential increase in window size SLOW ???
- To understand the terminology, you have to look at history....
- Prior to Jacobson's work, TCP operated as follows:
- A new TCP connection first negotiates an advertised window size (AWS)
- The source immediately transmits an amount of data that is equal to the advertised window size (e.g., when a large file is transferred).
- Now, compared to sending AWS bytes of data at once, the new way of starting by transmitting ONE packet first is indeed slower...
The Congestion Avoidance Phase
- TCP's Congestion Avoidance mode
- TCP enters the congestion avoidance phase when the slow start phase terminates normally (i.e., CWND > SSThresHold)
- Operation of the congestion avoidance phase is as follows:
- Ideally, TCP increases CWND by ONE packet or MSS bytes after every RTT seconds
It is quite complex to keep track of how many bytes have been acknowledged...
It is far easier to increase CWND each time you receive a NEW acknowledgement
- Notice that if TCP is transmitting maximum size packets, and the congestion window is CWND bytes, then there are approximately CWND/MSS packets sent using the transmit window
So if we add MSS/CWND packets to the congestion window for each acknowledgement, we will have effectively increased CWND by ONE packet after all the acknowledgement packets return (they will return in RTT seconds)
- So practically, this can be (approximately) accomplished by increasing the congestion window CWND by MSS/CWND packets or MSS * MSS/CWND bytes after TCP receives a NEW acknowledgement
So:
CWND = CWND + MSS * MSS/CWND
when TCP receives a NEW acknowledgement
(Again, when a duplicate (old) ACK is received, CWND is not updated)
Example of TCP operation in the congestion avoidance phase:
- Suppose that CWND = 4 MSS when TCP enters the congestion avoidance phase...
TCP sends out 4 packets (each containing MSS bytes) to the receiver.
- If there is no congestion, 4 NEW ACK packets will be received in approximately RTT seconds
- When the first ACK packet is received (CWND = 4 MSS), TCP updates CWND as follows:
CWND = CWND + MSS * MSS/CWND = 4 MSS + MSS * MSS/(4 MSS) = 4 MSS + MSS * 1/4 = 4.25 MSS
- When the second ACK packet is received (CWND = 4.25 MSS), TCP updates CWND as follows:
CWND = CWND + MSS * MSS/CWND = 4.25 MSS + MSS * MSS/(4.25 MSS) = 4.25 MSS + MSS * 1/4.25 = 4.485 MSS
- When the third ACK packet is received (CWND = 4.485 MSS), TCP updates CWND as follows:
CWND = CWND + MSS * MSS/CWND = 4.485 MSS + MSS * MSS/(4.485 MSS) = 4.485 MSS + MSS * 1/4.485 = 4.708 MSS
- When the fourth (and final) ACK packet is received (CWND = 4.708 MSS), TCP updates CWND as follows:
CWND = CWND + MSS * MSS/CWND = 4.708 MSS + MSS * MSS/(4.708 MSS) = 4.708 MSS + MSS * 1/4.708 = 4.92 MSS
- So you can see that CWND is increased by approximately MSS or ONE packet after RTT seconds
(In the slow start phase, CWND DOUBLES after each RTT seconds)
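The four updates in this example can be reproduced with a short loop. The sketch measures the window in units of MSS (so `mss = 1.0`) and assumes that exactly one NEW ACK returns for every whole packet in the window; the function names are illustrative.

```python
def congestion_avoidance_ack(cwnd, mss):
    """Update applied for every NEW ACK: CWND = CWND + MSS * MSS/CWND."""
    return cwnd + mss * mss / cwnd

def one_round(cwnd, mss):
    """Apply the per-ACK update once for each packet sent in this RTT."""
    acks = int(cwnd // mss)       # whole packets that fit in the window
    for _ in range(acks):
        cwnd = congestion_avoidance_ack(cwnd, mss)
    return cwnd

if __name__ == "__main__":
    cwnd = 4.0
    for i in range(4):
        cwnd = congestion_avoidance_ack(cwnd, 1.0)
        print("after ACK", i + 1, "CWND =", round(cwnd, 3), "MSS")
```

Because CWND grows a little with every ACK, each later increment is slightly smaller than 1/4, which is why the window ends the round a bit below 5 MSS.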
- NOTE: the actual implementation of TCP (see Stevens - Volume 2) increases CWND during congestion avoidance slightly faster than above, using the following formula:
CWND = CWND + MSS * MSS/CWND + MSS/8
- Why does TCP want to keep increasing CWND ?
- Why does TCP not keep CWND constant after reaching the "safe" operation level SSThresHold ?
- When TCP keeps increasing CWND, it will eventually cause congestion !!!
So why be so foolish ???
- The reason is:
- TCP does not know the current network capacity... because network conditions keep changing.
- The goal of TCP is to transfer data as fast as possible.
If TCP stopped increasing CWND, it would not be true to its goal.
- So during the congestion avoidance period, TCP is testing the tolerance of the network: after it has successfully transferred a window's worth of data, it adds one more packet to the congestion window and tests the network again.
(This technique is similar to kids testing their boundary by asking their parents for favors over and over again... The boundary may have moved :-))
- When does the Congestion Avoidance phase BEGIN ?
- When the Slow Start phase terminates successfully
- When does the Congestion Avoidance phase END ?
- Eventually, TCP will push the congestion window too far and cause some packets to be dropped.
- When a packet loss occurs, it can cause the sending TCP to time out
- When a timeout occurs, the congestion avoidance phase ends and TCP will begin a slow start phase:
- TCP sets SSThresHold = CWND/2.
This is the new "safe" operation level...
- Then TCP sets CWND = 1 x MSS (i.e., 1 packet worth of data) and increases CWND at an exponential rate towards SSThresHold (the "safe" level)
Fast Retransmit
- Fast Retransmit
- Before the introduction of Fast Retransmit, TCP was not "pro-active".
- Example:
- When a packet is lost and the receiver transmits duplicate ACKs back to the sender, the sender would not react to them:
- Assume all packets up to packet 13 have been received and acknowledged
- When packet 14 is lost, and packets #15, #16, #17 and #18 arrive out of order at the receiver, the receiver will send back ACK 13 for each of them to indicate that the last consecutive packet received was #13
- Prior to the introduction of Fast Retransmit, TCP did not act upon the multiple duplicate ACK messages from the receiver.
- The sender would TIME OUT and then retransmit the lost packet
- FURTHERMORE, when TCP times out, TCP will enter the SLOW START phase
- Fast Retransmit: using duplicate ACKs as indicators for a lost packet
- Clearly, when the receiver keeps sending the same (duplicate) ACK, some packet may have been lost
- BUT, because packets can arrive OUT OF ORDER, an occasional duplicate ACK can arrive even when nothing was lost:
- So to eliminate most of these false loss indications, it was decided that when TCP receives 3 duplicate ACKs (so TCP has received a total of 4 identical ACK packets), TCP concludes that the packet is lost and retransmits the lost packet IMMEDIATELY (without waiting for a timeout):
- Having received 3 duplicate ACKs provides a high probability that a packet was lost, but does not provide certainty...
- After TCP retransmits the lost packet, it enters the Slow Start phase (because a packet loss has occurred)
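The "3 duplicate ACKs" rule can be sketched as a small counter. To stay close to the example above, the sketch numbers ACKs by packet (ACK 13 means "everything up to packet 13 has arrived"), which is a simplification of real TCP; the class name `DupAckDetector` and the method `on_ack` are hypothetical.

```python
DUP_ACK_THRESHOLD = 3   # number of duplicate ACKs that triggers a retransmit

class DupAckDetector:
    def __init__(self):
        self.last_ack = None   # highest packet number acknowledged so far
        self.dup_count = 0     # duplicates seen for last_ack

    def on_ack(self, ack_no):
        """Return the packet number to retransmit now, or None."""
        if ack_no == self.last_ack:
            self.dup_count += 1
            if self.dup_count == DUP_ACK_THRESHOLD:
                # The packet right after the last in-order one is presumed lost.
                return ack_no + 1
            return None
        # A NEW ACK: remember it and reset the duplicate counter.
        self.last_ack = ack_no
        self.dup_count = 0
        return None

if __name__ == "__main__":
    detector = DupAckDetector()
    # ACK 13 arrives once normally, then once more for each of #15..#18.
    for ack in (12, 13, 13, 13, 13, 13):
        print("ACK", ack, "retransmit:", detector.on_ack(ack))
```

The retransmission is triggered exactly once, on the third duplicate (the fourth identical ACK); later duplicates of the same ACK do not trigger it again.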
- Time to show TCP congestion control mechanism in action...
- TCP Tahoe Demo
- TCP Tahoe (the original version by Jacobson) incorporates the Slow Start, Congestion Avoidance and Fast Retransmit mechanisms.
- We will look at the operation of TCP Tahoe in this sample network:
- Here is a NS2 source file to simulate a TCP Tahoe source: click here
- Right click and save the file in your directory.
- Run program with:
export PATH=/usr/local/gnu/gcc/4.1.0/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/gnu/gcc/4.1.0/lib:$LD_LIBRARY_PATH
/home/cheung/NS/run-ns Tahoe.tcl
- You should see the Network Animator window when it finishes running... click PLAY to see the simulation in action
- You don't need to run the simulation to see the animation... I have saved a copy of the animation file generated by the simulation.
The NAM (Network Animation) output file is here: click here
- To see the animation, save the NAM file in your directory and use this command:
/home/cheung/NS/bin/nam Tahoe.nam
- The Congestion Window CWND plot data output file is here: click here
- To see the plot of the CWND of TCP, save the Congestion Window CWND plot file in your directory and run gnuplot
In gnuplot, issue the command:
plot "WinFile" using 1:2 title "Flow 1" with lines lt 1
You should see this plot:
You can see the operation of TCP Tahoe clearly from the above figure:
- At approximately time 0, TCP Tahoe starts in slow start mode: the congestion window size increases exponentially
- At approximately time 5, packet loss is detected.
TCP sets SSThresHold = 25 (approximately) and begins another slow start
- When it reaches CWND = 25 (approximately), CWND increases linearly - here TCP Tahoe enters the congestion avoidance mode
- At approximately time 19, TCP Tahoe detects packet loss and begins a slow start.
SSThresHold is approximately 22.
- TCP begins another slow start and so on...
Fast Recovery
- Added Improvement: Fast Recovery:
- Fast recovery is a beautiful little improvement made to TCP that significantly increased TCP's performance level.
- Research discovered that packet losses often occur during transient congestion situations, i.e., congestion that clears up very quickly.
- Recall that TCP performs a SLOW START after TCP performs Fast Retransmit (because there was a packet loss)
- Instead of performing a SLOW START (which reduces CWND down to 1 x MSS), research found that TCP can use a larger congestion window without causing network congestion !
- Fast recovery:
When TCP performs a fast retransmit (so TCP did not time out):
- set SSThresHold = CWND/2
- set CWND = SSThresHold + 3 * MSS.
(The rationale is that 3 duplicate ACKs are worth 3 MSS bytes: each duplicate ACK means that a packet has left the network)
- TCP continues to use congestion avoidance (but using the new values of SSThresHold and CWND).
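The difference between the reaction to a timeout and the reaction to a fast retransmit can be put side by side. This is a sketch of the two rules as stated in these notes, with windows measured in units of MSS; the function names are illustrative and a real implementation tracks more state than this.

```python
def on_timeout(cwnd, mss):
    """Timeout: halve the safe level and restart slow start from 1 packet."""
    ssthresh = cwnd / 2
    return mss, ssthresh              # (new CWND, new SSThresHold)

def on_fast_retransmit(cwnd, mss):
    """Fast recovery: halve the safe level, but keep a larger window."""
    ssthresh = cwnd / 2
    new_cwnd = ssthresh + 3 * mss     # 3 duplicate ACKs are worth 3 MSS
    return new_cwnd, ssthresh

if __name__ == "__main__":
    print("timeout:        ", on_timeout(20.0, 1.0))
    print("fast retransmit:", on_fast_retransmit(20.0, 1.0))
```

Starting from the same window, fast recovery continues from a window well above 1 packet, which is where the performance gain comes from.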
- TCP Reno Demo (Reno implements Fast Recovery)
- Here is a NS2 source file to simulate a TCP Reno source: click here
- Right click and save the file in your directory.
- Run program with:
export PATH=/usr/local/gnu/gcc/4.1.0/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/gnu/gcc/4.1.0/lib:$LD_LIBRARY_PATH
/home/cheung/NS/run-ns Reno.tcl
- You should see the Network Animator window when it finishes running... click PLAY to see the simulation in action
- You don't need to run the simulation to see the animation... I have saved a copy of the animation file generated by the simulation.
The NAM (Network Animation) output file is here: click here
- To see the animation, save the NAM file in your directory and use this command:
/home/cheung/NS/bin/nam Reno.nam
- The Congestion Window CWND plot data output file is here: click here
- To see the plot of the CWND of TCP, save the Congestion Window CWND plot file in your directory and run gnuplot
In gnuplot, issue the command:
plot "Reno-Window" using 1:2 title "Flow 1" with lines lt 1
You should see this plot:
You can see that this small change in TCP Reno has resulted in a huge performance improvement:
- Again, at time 20, TCP Reno is in congestion avoidance mode
- At approximately time 27, TCP Reno detects packet loss and performs a fast retransmit, and the fast recovery is successful. TCP Reno continues in the congestion avoidance mode without performing a slow start (we can see this because CWND did not restart from 1)
- Again, at approximately time 40, TCP Reno detects packet loss and performs a fast retransmit, and the fast recovery is successful. TCP Reno continues in the congestion avoidance mode without performing a slow start (we can see this because CWND did not restart from 1)
- Further TCP Research: the TCP Flow Synchronization problem
- Do NOT think that this is the end of the story about Congestion Control on the Internet !
- This material on TCP is only the tip of the iceberg...
There were many more problems and issues with TCP after TCP Reno was introduced
- For example, research has discovered that different TCP flows sharing a bottleneck link will synchronize with each other !!!
(Here is a paper that points out the phenomenon: click here )
Example that illustrates TCP synchronization:
- Source 1 (red) starts transmitting at time 0.1 sec
- Source 2 (blue) starts transmitting at time 20.0 sec
- Here is a NS2 source file to simulate 2 TCP Reno sources sharing the bottle neck link: click here
- Right click and save the file in your directory.
- Run program with:
/home/cheung/NS/run-ns Reno.tcl
- You should see the Network Animator window when it finishes running... click PLAY to see the simulation in action
- You need to run the simulation to see the animation because I deleted the animation file generated by the simulation.
- The Congestion Window CWND plot data output files are here:
- Window plot for TCP source 1: click here
- Window plot for TCP source 2: click here
To see the plot of the CWND of TCP, save the Congestion Window CWND plot file in your directory and run gnuplot
In gnuplot, issue the command:
plot "WinFile" using 1:2 title "Flow 1" with lines 1, \
     "WinFile2" using 1:2 title "Flow 2" with lines 2
You should see this plot:
- The plot shows clearly that:
- Flow 1 starts early
- Flow 2 starts later and causes packet drops for both flows
- The 2 TCP flows perform Slow Start together and reduce their windows simultaneously
- Eventually, the congestion windows of both flows are synchronized !!!
This kind of behavior is not good, because the best way to utilize all of the network capacity is for only one of the flows to cut back
(But it should NOT always be the same flow, otherwise you have unfairness)
- The research to solve this problem triggered the development of "Active Queue Management (AQM)"; among these mechanisms, the "Random Early Drop/Detection (RED)" queue is the best-known representative.
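To make the idea behind RED a little more concrete: instead of dropping packets only when the queue is full (which tends to hit all flows at the same moment), a RED queue starts dropping packets at random as its average queue length grows, so different flows are likely to cut back at different times. Below is a minimal Python sketch of the basic drop-probability rule. The threshold values `min_th`, `max_th` and `max_p` are illustrative assumptions (they are not taken from the simulation above), and the sketch leaves out the running average of the queue length and other refinements of a real RED queue.

```python
import random


def red_drop_probability(avg_queue, min_th=5.0, max_th=15.0, max_p=0.1):
    """Basic RED rule: drop probability as a function of the average queue length."""
    if avg_queue < min_th:
        return 0.0  # short queue: never drop
    if avg_queue >= max_th:
        return 1.0  # long queue: drop every arriving packet
    # In between: probability rises linearly from 0 up to max_p.
    return max_p * (avg_queue - min_th) / (max_th - min_th)


def red_should_drop(avg_queue, rng=random):
    """Randomly decide whether to drop one arriving packet."""
    return rng.random() < red_drop_probability(avg_queue)


for q in (2, 10, 20):
    print(q, red_drop_probability(q))
```

Because each arriving packet is dropped independently with a small probability, it is unlikely that both flows lose packets in the same round trip, which is what breaks the synchronization.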
- Further TCP Research: RTT unfairness problem
- TCP is NOT fair when different TCP connections share a bottleneck link but have different Round Trip Times (RTT)
- Example that illustrates TCP unfairness when RTTs differ:
- Source 1 (red) starts transmitting at time 0.1 sec, RTT is 2 x 150 msec
- Source 2 (blue) starts transmitting at time 20.0 sec, RTT is 2 x 640 msec
- Due to its higher RTT, the CWND of flow 2 will increase more slowly !!!
- Here is a NS2 source file to simulate 2 TCP Reno sources sharing the bottleneck link: click here
- Right click and save the file in your directory.
- Run program with:
/home/cheung/NS/run-ns Reno.tcl
- You should see the Network Animator window when it finishes running... click PLAY to see the simulation in action
(The NAM file is too big and I deleted it...)
- The Congestion Window CWND plot data output files are here:
- Window plot for TCP source 1: click here
- Window plot for TCP source 2: click here
To see the plot of the CWND of TCP, save the Congestion Window CWND plot file in your directory and run gnuplot
In gnuplot, issue the command:
plot "WinFile" using 1:2 title "Flow 1" with lines 1, \
     "WinFile2" using 1:2 title "Flow 2" with lines 2
You should see this plot:
- You can see that the average congestion window size of flow 2 is lower than that of flow 1, and the two do not converge...
- BTW, you can also see the TCP flow synchronization problem in this plot:
- Both flows will often perform
- at (approximately) the same time
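The reason the flow with the higher RTT falls behind can be checked with simple arithmetic: in congestion avoidance, CWND grows by about 1 packet per RTT, so a longer RTT means fewer increases per second. The Python sketch below uses the two RTT values from the example (2 x 150 msec and 2 x 640 msec); the function name `cwnd_after` and the loss-free, idealized growth model are assumptions made for illustration only.

```python
def cwnd_after(seconds, rtt, start_cwnd=1.0):
    """Idealized congestion avoidance: +1 packet per RTT, no packet loss."""
    return start_cwnd + seconds / rtt


rtt1 = 2 * 0.150  # flow 1: 0.30 sec round trip
rtt2 = 2 * 0.640  # flow 2: 1.28 sec round trip

# Compare how far each window gets in the same amount of time.
for t in (10, 20, 30):
    print(t, round(cwnd_after(t, rtt1), 1), round(cwnd_after(t, rtt2), 1))
```

In this idealized model, flow 1 adds more than 4 packets to its window in the time that flow 2 adds 1, so after every (synchronized) window reduction flow 1 recovers much faster.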
- Further TCP Research: Gigabit networks
- Another area of intense research is adapting TCP for higher speed networks (gigabit or terabit networks).
In these networks, the usable window size is huge... hundreds of thousands of packets.
TCP cannot afford the luxury of increasing its window size by only 1 in each RTT.
In order to reach the full capacity of the network, TCP must increase faster...
- Example that illustrates TCP's behavior in a high speed network:
- Source starts transmitting at time 0.1 sec
- Here is a NS2 source file to simulate TCP Reno on a high speed (gigabit) network: click here
- Right click and save the file in your directory.
- Run program with:
/home/cheung/NS/run-ns Reno.tcl
- You should see the Network Animator window when it finishes running... click PLAY to see the simulation in action
(The NAM file is too big and I deleted it...)
- The Congestion Window CWND plot data output files are here:
- Window plot for TCP: click here
To see the plot of the CWND of TCP, save the Congestion Window CWND plot file in your directory and run gnuplot
In gnuplot, issue the command:
plot "WinFile" using 1:2 title "Flow 1" with lines 1
You should see this plot:
- You can see that TCP performed 2 unsuccessful slow starts
- At approximately 16 sec, TCP performs the 3rd slow start.
- This slow start terminates at approximately 18 sec.
- Then TCP performs Congestion Avoidance... all the way up from CWND = 20 (approximately) to 250+. The simulation ended after 140 sec and TCP had still not reached full capacity !!!
- If TCP wants to take advantage of high speed links, it must increase the congestion window more aggressively than additively
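A rough calculation shows how slow additive increase is on such a link. If CWND grows by 1 packet per RTT, the time to climb from one window size to another is simply the difference times the RTT. In the Python sketch below, the target window of 100,000 packets and the RTT of 0.1 sec are hypothetical values chosen for illustration (they are not taken from the simulation), and packet losses are ignored.

```python
def seconds_to_reach(target_cwnd, start_cwnd, rtt):
    """Time for additive increase (+1 packet per RTT) to grow the window, ignoring losses."""
    return max(target_cwnd - start_cwnd, 0) * rtt


# Hypothetical example: start at CWND = 20, target 100,000 packets, RTT = 0.1 sec.
t = seconds_to_reach(100_000, 20, 0.1)
print(f"{t:.0f} sec (about {t / 3600:.1f} hours)")
```

Under these assumptions the window needs close to 3 hours without a single packet loss to reach the target, which is why a faster-than-additive increase is needed.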
http://www.mathcs.emory.edu/~cheung/Courses/558-old/Syllabus/6-transport/TCP.html