Bugzilla – Bug 790
Memory leak in TestSuite routing-aodv-regression
Last modified: 2010-01-18 04:03:03 UTC
Fails valgrind on ns-regression

VALGR: TestSuite routing-aodv-regression

> lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 8.04.3 LTS
Release: 8.04
Codename: hardy

> gcc -v
Using built-in specs.
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --enable-languages=c,c++,fortran,objc,obj-c++,treelang --prefix=/usr --enable-shared --with-system-zlib --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --enable-nls --with-gxx-include-dir=/usr/include/c++/4.2 --program-suffix=-4.2 --enable-clocale=gnu --enable-libstdcxx-debug --enable-objc-gc --enable-mpfr --enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=x86_64-linux-gnu
Thread model: posix
gcc version 4.2.4 (Ubuntu 4.2.4-1ubuntu4)

[ns-regression] ~/repos/ns-3-allinone-dev/ns-3-dev > ./test.py -g -v -s routing-aodv-regression
Building: ./waf
Waf: Entering directory `/home/craigdo/repos/ns-3-allinone-dev/ns-3-dev/build'
Waf: Leaving directory `/home/craigdo/repos/ns-3-allinone-dev/ns-3-dev/build'
'build' finished successfully (1.661s)
NS3_ACTIVE_VARIANT == debug
NS3_BUILDDIR == /home/craigdo/repos/ns-3-allinone-dev/ns-3-dev/build
NS3_MODULE_PATH == ['/usr/lib/gcc/x86_64-linux-gnu/4.2.4', '/home/craigdo/repos/ns-3-allinone-dev/nsc/linux-2.6.18', '/home/craigdo/repos/ns-3-allinone-dev/nsc/linux-2.6.26', '/home/craigdo/repos/ns-3-allinone-dev/ns-3-dev/build/debug']
ENABLE_NSC == False
ENABLE_REAL_TIME == True
ENABLE_EXAMPLES == True
os.environ["LD_LIBRARY_PATH"] == /usr/lib/gcc/x86_64-linux-gnu/4.2.4:/home/craigdo/repos/ns-3-allinone-dev/nsc/linux-2.6.18:/home/craigdo/repos/ns-3-allinone-dev/nsc/linux-2.6.26:/home/craigdo/repos/ns-3-allinone-dev/ns-3-dev/build/debug:/usr/lib/gcc/x86_64-linux-gnu/4.2.4:/home/craigdo/repos/ns-3-allinone-dev/nsc/linux-2.6.18:/home/craigdo/repos/ns-3-allinone-dev/nsc/linux-2.6.26:/home/craigdo/repos/ns-3-allinone-dev/ns-3-dev/build/debug
Queue routing-aodv-regression
Launch utils/test-runner --suite=routing-aodv-regression
Synchronously execute valgrind --suppressions=/home/craigdo/repos/ns-3-allinone-dev/ns-3-dev/testpy.supp --leak-check=full --error-exitcode=2 /home/craigdo/repos/ns-3-allinone-dev/ns-3-dev/build/debug/utils/test-runner --suite=routing-aodv-regression --basedir=/home/craigdo/repos/ns-3-allinone-dev/ns-3-dev --tempdir=testpy-output/2010-01-12-23-35-45-CUT --out=testpy-output/2010-01-12-23-35-45-CUT/routing-aodv-regression.xml
Return code = 2
stderr =
==27520== Memcheck, a memory error detector
==27520== Copyright (C) 2002-2009, and GNU GPL'd, by Julian Seward et al.
==27520== Using Valgrind-3.5.0 and LibVEX; rerun with -h for copyright info
==27520== Command: /home/craigdo/repos/ns-3-allinone-dev/ns-3-dev/build/debug/utils/test-runner --suite=routing-aodv-regression --basedir=/home/craigdo/repos/ns-3-allinone-dev/ns-3-dev --tempdir=testpy-output/2010-01-12-23-35-45-CUT --out=testpy-output/2010-01-12-23-35-45-CUT/routing-aodv-regression.xml
==27520==
==27520==
==27520== HEAP SUMMARY:
==27520==     in use at exit: 17,680 bytes in 156 blocks
==27520==   total heap usage: 25,471 allocs, 25,315 frees, 1,612,731 bytes allocated
==27520==
==27520== 17,480 (224 direct, 17,256 indirect) bytes in 2 blocks are definitely lost in loss record 70 of 70
==27520==    at 0x4C2397E: operator new(unsigned long) (vg_replace_malloc.c:220)
==27520==    by 0x574AE40: ns3::Ptr<ns3::Node> ns3::CreateObject<ns3::Node>() (object.h:515)
==27520==    by 0x5B2DE79: ns3::NodeContainer::Create(unsigned int) (node-container.cc:96)
==27520==    by 0x5A071C7: ns3::aodv::ChainRegressionTest::CreateNodes() (aodv-regression.cc:112)
==27520==    by 0x5A08344: ns3::aodv::ChainRegressionTest::DoRun() (aodv-regression.cc:90)
==27520==    by 0x54BFEFF: ns3::TestCase::Run() (test.cc:152)
==27520==    by 0x54C069B: ns3::TestSuite::DoRun() (test.cc:684)
==27520==    by 0x54BFBE1: ns3::TestSuite::Run() (test.cc:459)
==27520==    by 0x4026A8: main (test-runner.cc:263)
==27520==
==27520== LEAK SUMMARY:
==27520==    definitely lost: 224 bytes in 2 blocks
==27520==    indirectly lost: 17,256 bytes in 150 blocks
==27520==      possibly lost: 0 bytes in 0 blocks
==27520==    still reachable: 200 bytes in 4 blocks
==27520==         suppressed: 0 bytes in 0 blocks
==27520== Reachable blocks (those to which a pointer was found) are not shown.
==27520== To see them, rerun with: --leak-check=full --show-reachable=yes
==27520==
==27520== For counts of detected and suppressed errors, rerun with: -v
==27520== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 3 from 3)
...
Actually this is not only for routing-aodv-regression:

utils/test-runner --suite=routing-aodv-regression
  still reachable: 936 bytes in 7 blocks
utils/test-runner --suite=routing-aodv
  still reachable: 280 bytes in 6 blocks
utils/test-runner --suite=routing-olsr-regression
  still reachable: 200 bytes in 4 blocks
utils/test-runner --suite=routing-olsr-header
  definitely lost: 224 bytes in 2 blocks.
  indirectly lost: 17,256 bytes in 150 blocks.
  possibly lost: 0 bytes in 0 blocks.
  still reachable: 200 bytes in 4 blocks.
utils/test-runner --suite=ipv6-protocol
  still reachable: 200 bytes in 4 blocks
utils/test-runner --suite=packetbb-test-suite
  still reachable: 5,816 bytes in 42 blocks
utils/test-runner --suite=drop-tail-queue
  still reachable: 720 bytes in 16 blocks
utils/test-runner --suite=packet-metadata
  still reachable: 936 bytes in 7 blocks
utils/test-runner --suite=buffer
  200 bytes in 4 blocks
utils/test-runner --suite=object-name-service
  still reachable: 200 bytes in 4 blocks
examples/stats/wifi-example-sim
  still reachable: 200 bytes in 4 blocks
examples/tcp/star
  still reachable: 200 bytes in 4 blocks

and there are some others like those, plus valgrind is returning 0, which doesn't allow us to detect the errors. I think this is more than just an error in a test program. Any thoughts?
> examples/tcp/star
> still reachable: 200 bytes in 4 blocks
>
> and there are some others like those, plus valgrind is returning 0,
> which doesn't allow us to detect the errors. I think this is more
> than just an error in a test program. Any thoughts?

This is expected behavior. It has been like this since ns-3.1 and is due to the fact that valgrind doesn't consider still-reachable blocks an "important" error since "such blocks don't need direct fixing by the programmer."
Created attachment 723 [details]
commented ping

If I comment out installing ping, valgrind is happy.
Created attachment 724 [details]
Proposed fix

Just stop the ping a nanosecond before the regression test ends.
(In reply to comment #4)
> Created an attachment (id=724) [details]
> Proposed fix
>
> Just stop the ping a nanosecond before the regression test ends

Woow. Why is this fixing the leak? Is it because the Socket::Close function is doing something special? If so, what?
StopApplication is never called, and the RawSocket is not closed.
You schedule the stop event for the application like this:

m_stopEvent = Simulator::Schedule (m_stopTime, &Application::StopApplication, this);

and the Simulator may already be destroyed by then.
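To illustrate the point about the stop event, here is a minimal sketch using a toy event queue. It is not the ns-3 Simulator API: `ToySimulator`, `RunUntil`, and `SocketClosedAfterRun` are hypothetical names, and the sketch assumes that any event scheduled at or after the end of the run is simply discarded. Under that assumption, a stop handler scheduled exactly at the end time never runs, while one scheduled a nanosecond earlier does.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <map>
#include <utility>

// Hypothetical toy scheduler; it only mimics the idea of a simulator
// that drops pending events when the run ends.
class ToySimulator {
 public:
  void Schedule(std::uint64_t timeNs, std::function<void()> fn) {
    events_.emplace(timeNs, std::move(fn));
  }

  // Assumption: only events strictly before stopNs are executed.
  // Whatever is still pending afterwards is discarded without running.
  void RunUntil(std::uint64_t stopNs) {
    while (!events_.empty() && events_.begin()->first < stopNs) {
      auto fn = std::move(events_.begin()->second);
      events_.erase(events_.begin());
      fn();
    }
    events_.clear();  // stands in for tearing the simulator down
  }

 private:
  std::multimap<std::uint64_t, std::function<void()>> events_;
};

// Returns true if the application's stop handler (which would close
// the socket) ran before the simulation ended.
bool SocketClosedAfterRun(std::uint64_t appStopNs, std::uint64_t simStopNs) {
  ToySimulator sim;
  bool closed = false;
  sim.Schedule(appStopNs, [&closed] { closed = true; });
  sim.RunUntil(simStopNs);
  return closed;
}
```

With these assumptions, `SocketClosedAfterRun(10, 10)` reports that the handler never ran, and `SocketClosedAfterRun(9, 10)` reports that it did, which matches the effect of the proposed fix in attachment 724.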
Created attachment 725 [details]
Clear the list of sockets in ipv4-l3-protocol.cc

This fix will clear the memory leak. In ipv4-l3-protocol.cc, the DoDispose function doesn't clear the list of sockets it holds, and that is what causes the memory leak. Here is a patch that fixes it.
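One way an uncleared socket list can leak is through a reference cycle. The sketch below uses `std::shared_ptr` rather than ns-3's `Ptr<>`, and the `Protocol` and `Socket` types are hypothetical stand-ins, not the real classes. It assumes each socket holds a counted reference back toward its owner, which this thread does not confirm for ns-3. If that holds, the objects keep each other alive unless the dispose step empties the list.

```cpp
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-ins for the protocol and its sockets.
static int g_live = 0;  // number of objects currently alive

struct Protocol;

struct Socket {
  std::shared_ptr<Protocol> owner;  // assumed back-reference
  Socket() { ++g_live; }
  ~Socket() { --g_live; }
};

struct Protocol {
  std::vector<std::shared_ptr<Socket>> sockets;
  Protocol() { ++g_live; }
  ~Protocol() { --g_live; }

  // Clearing the list here is what breaks the cycle.
  void DoDispose(bool clearSockets) {
    if (clearSockets) sockets.clear();
  }
};

// Returns how many objects are still alive after all external
// references have been dropped.
int LiveObjectsAfterTeardown(bool clearSockets) {
  std::weak_ptr<Protocol> watcher;
  {
    auto proto = std::make_shared<Protocol>();
    auto sock = std::make_shared<Socket>();
    sock->owner = proto;
    proto->sockets.push_back(sock);
    watcher = proto;
    proto->DoDispose(clearSockets);
  }  // proto and sock go out of scope here
  const int live = g_live;
  // Demo-only cleanup so the sketch itself does not leak: detach the
  // list before destroying it, then let the last references go.
  if (auto p = watcher.lock()) {
    auto detached = std::move(p->sockets);
    detached.clear();
  }
  return live;
}
```

In this sketch, `LiveObjectsAfterTeardown(false)` reports two surviving objects (the protocol and its socket), while `LiveObjectsAfterTeardown(true)` reports none.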
I verified that clearing m_sockets cleans up the following errors:

==7504== definitely lost: 224 bytes in 2 blocks
==7504== indirectly lost: 17,256 bytes in 150 blocks

Another error remains after this patch is applied:

==7504== still reachable: 200 bytes in 4 blocks

I have filed a separate bug on these, so it seems to me that applying the patch above should close this particular bug. I'm a little troubled that something as blatant as this would not appear somewhere else, though; so maybe this is all related.
Changeset: 8f94a0ca3964