Bug 647 - Ns-3 implementation of TCP fails to produce limited-queue CW sawtooth
Ns-3 implementation of TCP fails to produce limited-queue CW sawtooth
Status: RESOLVED FIXED
Product: ns-3
Classification: Unclassified
Component: internet
pre-release
All All
: P1 blocker
Assigned To: George Riley
:
Depends on:
Blocks:
  Show dependency treegraph
 
Reported: 2009-07-31 07:41 UTC by Vladimir Tregub
Modified: 2009-10-07 23:12 UTC (History)
3 users (show)

See Also:


Attachments
Test case (5.83 KB, text/x-c++src)
2009-09-28 14:36 UTC, Josh Pelkey
Details
Patch for bug 647 (467 bytes, patch)
2009-09-28 14:37 UTC, Josh Pelkey
Details | Diff
Testcase for the test suite (9.78 KB, patch)
2009-09-30 20:28 UTC, Josh Pelkey
Details | Diff
Testcase for the test suite (9.32 KB, patch)
2009-09-30 20:33 UTC, Josh Pelkey
Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Vladimir Tregub 2009-07-31 07:41:29 UTC
When run with a limited queue on its point-to-point net device channel, a trivial two-node, one flow ns-3 simulation of a bulky TCP transfer gives odd results. The packet log traces are incompatible with those obtained from ns-2 simulations for identical scenarios (the same channel, parameters, TCP/Tahoe etc.). The difference is substantial and far beyond fine distinctions sort of effects from simplistic initial handshake implementation in ns-2. The simulation even fails to produce the familiar TCP sawtooth CW graph typical for large window/limited queue parameters.

So I turned on a TcpSocketImpl log, examined the log results, and noticed odd-length window sizes among records. I suspected it suggests there is something wrong with window size calculation. After some scrutiny given to the code of the ns-3 TCP implementation, I edited the TcpSocketImpl::RxBufferFreeSpace(), as a means to give a quick check to my suspicions:

uint32_t TcpSocketImpl::RxBufferFreeSpace()
{
  /* added code */
  if ((int32_t) m_rxBufSize < 0) {
    return m_rxBufMaxSize - m_rxBufSize + 1;
  }
  /* end of added code */

  return m_rxBufMaxSize - m_rxBufSize;
}

and everything went OK with my simulations. The ns-2 and ns-3 simulations for two-node, TCP transfers are now well agreed.

Is it a bug or is it a gross misunderstanding on my part?

Vladimir Tregub
Comment 1 Josh Pelkey 2009-09-28 14:36:02 UTC
Created attachment 603 [details]
Test case

I've confirmed this problem with the test case attached.  I also think I've tracked down the issue, and I've tested it with the same test case.

In TcpSocketImpl::NewRx, case 2, where a packet is received out-of-order (i.e. seq. number is greater than the one expected) the data is buffered, but the m_rxBufferSize is not updated (the size of data buffered is not added).  When TcpSocketImpl::Recv is called to send the data up, the rxbuffer is decremented.  This is particularly a problem when the rxbuffer is zero, and it gets decremented.  This makes the rxbuffer seem very full (since it's unsigned).  Therefore, the free space in the buffer is miscalculated, and in this case, leads to a very small Rx window size.

The output of the congestion window is shown for both the old code, and the fixed code below:
http://www.nsnam.org/~jpelkey3/cwnd-old.png
http://www.nsnam.org/~jpelkey3/cwnd-new.png

Patch coming up in next comment.
Comment 2 Josh Pelkey 2009-09-28 14:37:44 UTC
Created attachment 604 [details]
Patch for bug 647

Proposed fix.  The problem is described above.  All the regression tests passed even after applying this patch.
Comment 3 Tom Henderson 2009-09-29 09:44:43 UTC
(In reply to comment #2)
> Created an attachment (id=604) [details]
> Patch for bug 647
> 
> Proposed fix.  The problem is described above.  All the regression tests passed
> even after applying this patch.
> 

+1.  Can you please add your test case to the codebase so that the test suite fails without your patch?
Comment 4 Josh Pelkey 2009-09-29 10:17:25 UTC
(In reply to comment #3)
> (In reply to comment #2)
> > Created an attachment (id=604) [details] [details]
> > Patch for bug 647
> > 
> > Proposed fix.  The problem is described above.  All the regression tests passed
> > even after applying this patch.
> > 
> 
> +1.  Can you please add your test case to the codebase so that the test suite
> fails without your patch?
> 

I'd be happy to, though I'm unfamiliar with the process.  Do you mean a regression test?  Does this just involve adding it to the examples folder and then a command to generate the regression (then pushing both the traces and ns-3-dev)?  If there is a how-to for this already, or a discussion somewhere, you can just direct me there.
Comment 5 Tom Henderson 2009-09-29 11:58:24 UTC
(In reply to comment #4)
> (In reply to comment #3)
> > (In reply to comment #2)
> > > Created an attachment (id=604) [details] [details] [details]
> > > Patch for bug 647
> > > 
> > > Proposed fix.  The problem is described above.  All the regression tests passed
> > > even after applying this patch.
> > > 
> > 
> > +1.  Can you please add your test case to the codebase so that the test suite
> > fails without your patch?
> > 
> 
> I'd be happy to, though I'm unfamiliar with the process.  Do you mean a
> regression test?  Does this just involve adding it to the examples folder and
> then a command to generate the regression (then pushing both the traces and
> ns-3-dev)?  If there is a how-to for this already, or a discussion somewhere,
> you can just direct me there.
> 

I would put a test into our new test framework, rather than an example.

On Sunday I added a sample test program in src/test/ directory that can be cloned to produce a new test suite.  Maybe you can instead squeeze your test into a TestCase of an existing ns3tcp test suite.  Please have a read of this, especially section 3 and 4:
http://www.nsnam.org/docs/testing.html

If you already have a working example, the main thing that needs to be done is to convert the main() to a TestCase::DoRun() method.  And the main decision you will need to make is how to write some kind of assert or test data comparison that will fail when your bug fix is not present, and will succeed when your bug fix is present.  (i.e. figure out how to capture the sawtooth behavior in a testable way).

For reference, I took the example testcase program in bug 643 and made a new wifi-interference-test-suite.cc program under src/test/ns3wifi, in case you want to see a worked example.


Comment 6 Craig Dowell 2009-09-29 12:18:26 UTC
There are a couple of TCP test suites in src/test/ns3tcp for your viewing pleasure.
Comment 7 Josh Pelkey 2009-09-30 20:28:33 UTC
Created attachment 611 [details]
Testcase for the test suite

Here is a testcase that I've added to the ns3-tcp-cwnd suite.  You'll notice that I renamed the testcase that was already in there (appended a 1).  I wasn't sure if this is what should be done, or if more descriptive names were needed.  My testcase is Ns3TcpCwndTestCase2.

This fails without the patch mentioned in comment 2.
Comment 8 Josh Pelkey 2009-09-30 20:33:16 UTC
Created attachment 612 [details]
Testcase for the test suite

Accidentally left in some incorrect comments.  New patch attached.
Comment 9 Tom Henderson 2009-10-07 00:33:05 UTC
(In reply to comment #8)
> Created an attachment (id=612) [details]
> Testcase for the test suite
> 
> Accidentally left in some incorrect comments.  New patch attached.
> 

Thanks for making the new test case.  IMO this patch + test should be merged.
Comment 10 Josh Pelkey 2009-10-07 23:12:18 UTC
changeset 7bc9c7bd4d8f and 5d2766b6aafc