Bug 2484 - If TCP times out during CA_Recovery phase, it should not leave the CA_LOSS state until it recovers above highest tx mark
If TCP times out during CA_Recovery phase, it should not leave the CA_LOSS st...
Status: RESOLVED FIXED
Product: ns-3
Classification: Unclassified
Component: tcp
ns-3-dev
All All
: P5 major
Assigned To: natale.patriciello
:
Depends on:
Blocks:
  Show dependency treegraph
 
Reported: 2016-08-24 10:58 UTC by l.salameh
Modified: 2016-08-30 03:51 UTC (History)
3 users (show)

See Also:


Attachments
Patch for changing congestion state on timeout. (882 bytes, text/plain)
2016-08-24 10:58 UTC, l.salameh
Details
don't change state until above m_recover (921 bytes, patch)
2016-08-24 11:01 UTC, l.salameh
Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description l.salameh 2016-08-24 10:58:07 UTC
Created attachment 2555 [details]
Patch for changing congestion state on timeout.

If TCP times out during CA_Recovery phase, it should not leave the CA_LOSS state until after it receives an ACK > m_recover + 1.

The scenario is as follows:

A TCP flow receives 3 dup acks, sets m_recover, and goes into fast retransmit. But if there are too many losses, TCP will time out in this recovery phase.

It will restart sending all the packets from the HeadSequence(). We will get Dup Acks corresponding to duplicate data. But when we get any partial acks, the current code will immediately go back to CA_OPEN.

We shouldn't be doing this. In fact we should wait until we fully recover above m_recover before going back to CA_OPEN, otherwise we will go right back to CA_RECOVERY and wrongly inflate CWND due to duplicates.

Please see patch for fix.
Comment 1 l.salameh 2016-08-24 11:01:09 UTC
Created attachment 2556 [details]
don't change state until above m_recover
Comment 2 natale.patriciello 2016-08-24 11:32:46 UTC
Agree, you have reported something similar in the past. However, I fixed this in the SACK version (and other thing). I think in the current code also the Retransmission phase is wrong.

Tom, I can include this patch, but if the schedule of ns3.26 is delayed (say, 1 month more) I would prefer to include SACK, which fixes this (and other bugs) and (in my opinion) is waaay more clear, since I inserted a lot of comments that follows the RFC point after point.
Comment 3 Tom Henderson 2016-08-28 08:53:49 UTC
(In reply to natale.patriciello from comment #2)
> Agree, you have reported something similar in the past. However, I fixed
> this in the SACK version (and other thing). I think in the current code also
> the Retransmission phase is wrong.
> 
> Tom, I can include this patch, but if the schedule of ns3.26 is delayed
> (say, 1 month more) I would prefer to include SACK, which fixes this (and
> other bugs) and (in my opinion) is waaay more clear, since I inserted a lot
> of comments that follows the RFC point after point.

I am hoping to release ns-3.26 shortly, such as a code freeze/release candidate in a day or two.

I would prefer to include this now, regardless of a future SACK patch, if you agree with the patch correctness.  This aspect of NewReno is central to the algorithm (see section 4 of RFC 6582).
Comment 4 natale.patriciello 2016-08-30 03:51:40 UTC
Fixed in 12284:f2f398679f6c