Bugzilla – Bug 2769
TCP RTO event sets ssthresh to 2*MSS always
Last modified: 2017-09-17 18:01:00 UTC
Michele Polese reports the following possible problem: In TcpSocketBase::ReTxTimeout, the following operations are done after an RTO: 1 - all the packets in the SentList of the TxBuffer are marked as lost, and moved in the AppList (i.e., as if they were new packets); 2 - the recovery point i set to HighData (RFC 6675, Sec. 5.1); 3 - the RTO is doubled (RFC 6298); 4 - the ssthresh is set to (with NewReno) to max (2*MSS, bytesInFlight/2); 5 - cwnd is updated; 6 - the state is set to CA_LOSS; 7 - DoRetransmit is called. Step 4 appears to be impacted by step 1, and the SentList perhaps should not be cleared in step 1. When in step 4 the TcpTxBuffer::BytesInFlight() is called (via TcpSocketBase::BytesInFlight()), it scans the whole m_sentList. However, this was reset at step 1, and now it contains a single packet, which is marked as lost, thus the total amount of bytes in flight that the method returns is 0. Consequently, the ssthresh is set (with NewReno) to max(bytesInFlight/2=0, 2*MSS) = 2*MSS. Logs for the BytesInFlight and the ssthresh are attached for an example (MSS=14k bytes), and at 0.488996 an RTO is triggered. If my interpretation is correct, then the ssthresh after an RTO is always set to 2*MSS.
Created attachment 2894 [details] ssthresh log
Created attachment 2895 [details] bytes in flight log
Here is relevant specification text from RFC 5681: When a TCP sender detects segment loss using the retransmission timer and the given segment has not yet been resent by way of the retransmission timer, the value of ssthresh MUST be set to no more than the value given in equation (4): ssthresh = max (FlightSize / 2, 2*SMSS) (4) where, as discussed above, FlightSize is the amount of outstanding data in the network. On the other hand, when a TCP sender detects segment loss using the retransmission timer and the given segment has already been retransmitted by way of the retransmission timer at least once, the value of ssthresh is held constant.
Thanks Tom for filing this. A quick fix for the first part is swapping step 4 and step 1, or at least saving the amount of in flight bytes before doing step 1.
Thank you Michele, The report is perfect and lead to the very simple approach proposed by Tom: diff --git i/src/internet/model/tcp-socket-base.cc w/src/internet/model/tcp-socket-base.cc index 1a5faa1bb..c61ebfaba 100644 --- i/src/internet/model/tcp-socket-base.cc +++ w/src/internet/model/tcp-socket-base.cc @@ -3152,6 +3152,8 @@ TcpSocketBase::ReTxTimeout () return; } + uint32_t inFlight = BytesInFlight (); + // From RFC 6675, Section 5.1 // [RFC2018] suggests that a TCP sender SHOULD expunge the SACK // information gathered from a receiver upon a retransmission timeout @@ -3204,7 +3206,7 @@ TcpSocketBase::ReTxTimeout () // retransmission timer, decrease ssThresh if (m_tcb->m_congState != TcpSocketState::CA_LOSS || !m_txBuffer->IsHeadRetransmitted ()) { - m_tcb->m_ssThresh = m_congestionControl->GetSsThresh (m_tcb, BytesInFlight ()); + m_tcb->m_ssThresh = m_congestionControl->GetSsThresh (m_tcb, inFlight); } // Cwnd set to 1 MSS What I would like to see is also the proper assert added in the TCP timeout test. I would like to leave it as an exercise for some of you; do you have time to do the patch? Thanks!
Created attachment 2906 [details] Proposed patch
Created attachment 2907 [details] Test draft Test for the ssthresh after an RTO.
Natale, Tom, I created the patch and also a draft of the test. However, I have some issues in re-creating in the test the condition for which bytesInFlight/2 > 2*SMSS. Do you have any suggestion? Thanks!
I think it is not necessary to make the connection fail; I think we need only this (pseudocode): - send segment [1, 100] - drop segment 50 - send segment 50 (triggered by fast recovery) - drop segment 50 - .... (time passes) - RTO fires for segment 50, check ssthresh half of the in flight (25 segments in this particular case) - send segment 50 (triggered by RTO) - drop segment 50 - .... (time passes) - RTO fires for segment 50, check ssthresh == as before (no halving)
Created attachment 2908 [details] Test draft
Hello Michele, the test is running fine even without the patch applied... I would like to include it before the next release (next week). Do you have time to work on it?
I can try. The thing is that it is difficult to replicate in the test environment the condition for which the amount of bytes in flight is larger that the mss. Do you have any suggestion?
Pushed in commit 13070; test case in 13069.