Bug 1400

Summary: tcp valgrind errors due to rtt estimator changes
Product: ns-3 Reporter: Tom Henderson <tomh>
Component: tcp Assignee: Adrian S.-W. Tam <adrian.sw.tam>
Status: RESOLVED FIXED    
Severity: normal CC: ns-bugs
Priority: P5    
Version: pre-release   
Hardware: All   
OS: All   

Description Tom Henderson 2012-03-27 16:50:41 UTC
While testing for bug 1399, I found unexpected behavior when running the test-runner binary outside of test.py. ./test.py produced consistent results (the test always failed), but running test-runner under waf produced inconsistent results, sometimes passing and sometimes failing.

Here is an example:

1) check out a version of ns-3 that has a broken test (test traces are not correct):
[tomh@ns-test ns-3-dev]$ hg co -r c4c585d16c68

[tomh@ns-test ns-3-dev]$ hg sum
parent: 7801:c4c585d16c68
 repair broken nsc workaround that was uncovered by the fix in changeset 62dee74123ca

2) now run this repeatedly:

[tomh@ns-test ns-3-dev]$ ./waf --run "test-runner --suite=routing-aodv-regression"
Waf: Entering directory `/home/tomh/ns-3-allinone/ns-3-dev/build'
Waf: Leaving directory `/home/tomh/ns-3-allinone/ns-3-dev/build'
'build' finished successfully (2.865s)
PASS routing-aodv-regression 0.420ms

[tomh@ns-test ns-3-dev]$ ./waf --run "test-runner --suite=routing-aodv-regression"
Waf: Entering directory `/home/tomh/ns-3-allinone/ns-3-dev/build'
Waf: Leaving directory `/home/tomh/ns-3-allinone/ns-3-dev/build'
'build' finished successfully (2.861s)
PASS routing-aodv-regression 0.420ms

[tomh@ns-test ns-3-dev]$ ./waf --run "test-runner --suite=routing-aodv-regression"
Waf: Entering directory `/home/tomh/ns-3-allinone/ns-3-dev/build'
Waf: Leaving directory `/home/tomh/ns-3-allinone/ns-3-dev/build'
'build' finished successfully (2.851s)
FAIL routing-aodv-regression 0.430ms

However, "./test.py -s routing-aodv-regression" always fails in this case.

I could not detect a pattern; sometimes test-runner passes, sometimes it fails.
Comment 1 Tom Henderson 2012-03-27 19:48:36 UTC
Well, I have some evidence now that this is an artifact of some memory errors in the RTT estimator as patched recently for bug 1351.  Valgrind reports conditional jumps or moves that depend on uninitialized values, which could account for the behavior observed.  I am working on this.
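
For illustration only, here is a minimal sketch of the kind of defect valgrind flags as "Conditional jump or move depends on uninitialised value(s)". ToyRttEstimator and its members are hypothetical names, not the ns-3 RttEstimator code, and the smoothing gain is an assumed value. The sketch shows the safe form: every member is set in the constructor's initializer list, so the first branch taken in Measurement() is the same on every run.

```cpp
#include <cassert>

// Hypothetical toy class for illustration; NOT the ns-3 RttEstimator.
class ToyRttEstimator
{
public:
  // Every member gets a defined value here. If m_nSamples were left out
  // of this list it would hold an indeterminate value, and the branch in
  // Measurement() would depend on whatever happened to be in memory.
  explicit ToyRttEstimator (double initialEstimate)
    : m_estimate (initialEstimate),
      m_nSamples (0)
  {
  }

  void Measurement (double sample)
  {
    assert (sample >= 0.0); // an RTT sample cannot be negative
    if (m_nSamples == 0)
      {
        // The first sample replaces the initial guess.
        m_estimate = sample;
      }
    else
      {
        // Simple exponentially weighted average; the 0.1 gain is assumed.
        m_estimate = 0.9 * m_estimate + 0.1 * sample;
      }
    ++m_nSamples;
  }

  double GetEstimate () const { return m_estimate; }
  unsigned GetNSamples () const { return m_nSamples; }

private:
  double m_estimate;   // current RTT estimate
  unsigned m_nSamples; // number of samples seen so far
};
```

A branch on an uninitialized counter like this could plausibly make a regression test pass on one run and fail on the next, since the outcome depends on leftover memory contents rather than on the simulation inputs.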
Comment 2 Tommaso Pecorella 2012-03-28 13:55:56 UTC
I double-checked the RTT estimation code and it seems to be OK. Maybe (just a hint) some dirty memory is overlapping with the RTT estimator's state.

T.
Comment 3 Tommaso Pecorella 2012-03-28 14:13:56 UTC
One thing I noticed (dunno if it's important, maybe):

RttEstimator::GetTypeId (void)
{
  static TypeId tid = TypeId ("ns3::RttEstimator")
    .SetParent<Object> ()
[...]

The line:
    .AddConstructor<RttEstimator> ()
is missing.

I don't know whether it matters or what the consequences are.

T.
Comment 4 Tom Henderson 2012-04-04 13:16:00 UTC
These valgrind issues were fixed in 6c1a7055aeba.  However, the RttEstimator class still needs some unit testing.  I will close this and open a separate issue.