Bug 381 - Wifi crashes on shutdown
Status: RESOLVED FIXED
Product: ns-3
Classification: Unclassified
Component: wifi
Version: ns-3.2
Hardware: All All
Importance: P1 normal
Assigned To: Faker Moatamri
Reported: 2008-10-17 08:22 UTC by Gustavo J. A. M. Carneiro
Modified: 2009-12-10 06:43 UTC (History)

Attachments
Wifi-failure (336 bytes, patch)
2009-04-29 05:05 UTC, Kirill V. Andreev
alternate patch (1.52 KB, patch)
2009-07-15 14:59 UTC, Mathieu Lacage
program reproducing the bug (6.27 KB, text/x-c++src)
2009-08-26 06:36 UTC, Nicola Baldo
new program reproducing the bug (6.27 KB, text/x-c++src)
2009-11-26 04:36 UTC, Nicola Baldo

Description Gustavo J. A. M. Carneiro 2008-10-17 08:22:41 UTC
Sorry, I don't have code to reproduce this problem, but my colleague occasionally gets a segfault when calling Simulator::Destroy().  The bug appears to be inside the wifi module.  Here's a valgrind error:

==6605== Invalid read of size 8
==6605==    at 0x5489DBD: ns3::MacLow::CancelAllEvents() (mac-low.cc:330)
==6605==    by 0x548DD0B: ns3::MacLow::DoDispose() (mac-low.cc:273)
==6605==    by 0x51EA9DF: ns3::Object::MaybeDelete() const (object.cc:235)
==6605==    by 0x42E7F9: ns3::Object::Unref() const (object.h:346)
==6605==    by 0x54A888C: ns3::Ptr<ns3::MacLow>::operator=(ns3::Ptr<ns3::MacLow> const&) (ptr.h:421)
==6605==    by 0x549E460: ns3::DcaTxop::DoDispose() (dca-txop.cc:143)
==6605==    by 0x51EA9DF: ns3::Object::MaybeDelete() const (object.cc:235)
==6605==    by 0x42E7F9: ns3::Object::Unref() const (object.h:346)
==6605==    by 0x54C24E6: ns3::Ptr<ns3::DcaTxop>::operator=(ns3::Ptr<ns3::DcaTxop> const&) (ptr.h:421)
==6605==    by 0x54CADA1: ns3::NqapWifiMac::DoDispose() (nqap-wifi-mac.cc:98)
==6605==    by 0x51EAEDC: ns3::Object::Dispose() (object.cc:136)
==6605==    by 0x54E0453: ns3::WifiNetDevice::DoDispose() (wifi-net-device.cc:79)
==6605==  Address 0x409EA10 is 0 bytes inside a block of size 16 free'd
==6605==    at 0x4C2153D: operator delete(void*) (vg_replace_malloc.c:244)
==6605==    by 0x54A8AC2: ns3::DcaTxop::TransmissionListener::~TransmissionListener() (dca-txop.cc:71)
==6605==    by 0x549E54A: ns3::DcaTxop::DoDispose() (dca-txop.cc:145)
==6605==    by 0x51EA9DF: ns3::Object::MaybeDelete() const (object.cc:235)
==6605==    by 0x42E7F9: ns3::Object::Unref() const (object.h:346)
==6605==    by 0x54C24E6: ns3::Ptr<ns3::DcaTxop>::operator=(ns3::Ptr<ns3::DcaTxop> const&) (ptr.h:421)
==6605==    by 0x54CAD47: ns3::NqapWifiMac::DoDispose() (nqap-wifi-mac.cc:97)
==6605==    by 0x51EAEDC: ns3::Object::Dispose() (object.cc:136)
==6605==    by 0x54E0453: ns3::WifiNetDevice::DoDispose() (wifi-net-device.cc:79)
==6605==    by 0x51EAEDC: ns3::Object::Dispose() (object.cc:136)
==6605==    by 0x52E3B96: ns3::Node::DoDispose() (node.cc:147)
==6605==    by 0x51EAEDC: ns3::Object::Dispose() (object.cc:136)
pure virtual method called
terminate called without an active exception
Comment 1 Kirill V. Andreev 2009-04-29 05:05:33 UTC
Created attachment 434 [details]
Wifi-failure

I have observed the following problem: DcaTxop is destroyed first and MacLow second. When DoDispose of MacLow runs, it calls CancelAllEvents(), which sometimes (if one of the events is running) calls m_listener->Cancel() on the listener that DcaTxop has already freed, and this causes the segmentation fault. The solution to this problem is to add "m_listener = 0;" before CancelAllEvents in the destructor.
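
The teardown order described above can be sketched in isolation. This is a minimal illustration, not ns-3 code: Listener, MacLowSketch, StartTransmission, HasPendingEvent and TeardownIsSafe are hypothetical stand-ins, and the body of CancelAllEvents() is only assumed to have roughly this shape. It shows why clearing the pointer before cancelling avoids touching freed memory.

```cpp
#include <cassert>

// Hypothetical, heavily simplified stand-ins for the classes named in this
// report. None of this is the real ns-3 implementation.
struct Listener {
  int cancelCalls = 0;
  void Cancel() { ++cancelCalls; }
};

class MacLowSketch {
 public:
  void SetListener(Listener* listener) { m_listener = listener; }
  void StartTransmission() { m_eventRunning = true; }
  bool HasPendingEvent() const { return m_eventRunning; }

  // Assumed shape of CancelAllEvents(): the listener is notified only when
  // at least one event was still pending.
  void CancelAllEvents() {
    bool oneRunning = m_eventRunning;
    m_eventRunning = false;
    if (oneRunning && m_listener != nullptr) {
      m_listener->Cancel();
    }
  }

  // Ordering proposed in comment 1: forget the listener first, so that
  // CancelAllEvents() can no longer dereference it.
  void DoDispose() {
    m_listener = nullptr;
    CancelAllEvents();
  }

 private:
  Listener* m_listener = nullptr;  // raw, non-owning pointer
  bool m_eventRunning = false;
};

// Replays the reported order: the owner of the listener is disposed first,
// then MacLowSketch. Without the reset in DoDispose(), the second step would
// call Cancel() on freed memory.
inline bool TeardownIsSafe() {
  MacLowSketch low;
  Listener* listener = new Listener;  // owned by the "DcaTxop" side
  low.SetListener(listener);
  low.StartTransmission();            // an event is pending at shutdown
  delete listener;                    // owner frees the listener first
  low.DoDispose();                    // pointer is cleared, never read
  return !low.HasPendingEvent();
}
```

Calling TeardownIsSafe() from a test program and running it under valgrind should report no invalid read, because the stale pointer is overwritten before anything dereferences it.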
Comment 2 Nicola Baldo 2009-07-07 08:13:51 UTC


(In reply to comment #0)
> Sorry, I don't have code to reproduce this problem, but my colleague
> occasionally gets a segfault when calling Simulator::Destroy().  The bug
> appears to be inside the wifi module.  Here's a valgrind error:

I can reproduce this bug. I am using a slightly modified version of ns-3.5, the only differences being:

- a custom application

- a custom trace sink for wifi events (connected to DevTxTrace, DevRxTrace, PhyRxOkTrace, PhyRxErrorTrace, PhyTxTrace, and PhyStateTrace)

- WIFI_PREAMBLE_SHORT used instead of WIFI_PREAMBLE_LONG in mac-low.cc


------------------- gdb output --------------------


Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0xb6c36700 (LWP 11593)]
0xb7e35e5a in ns3::MacLow::CancelAllEvents (this=0x9d1db28) at ../src/devices/wifi/mac-low.cc:331
331	      m_listener->Cancel ();
(gdb) back
#0  0xb7e35e5a in ns3::MacLow::CancelAllEvents (this=0x9d1db28) at ../src/devices/wifi/mac-low.cc:331
#1  0xb7e36168 in ns3::MacLow::DoDispose (this=0x9d1db28) at ../src/devices/wifi/mac-low.cc:274
#2  0xb7bea88f in ns3::Object::MaybeDelete (this=0x9d1db28) at ../src/core/object.cc:254
#3  0xb7e449d8 in ns3::DcaTxop::DoDispose (this=0x9d1dee0) at optimized/ns3/object.h:355
#4  0xb7bea88f in ns3::Object::MaybeDelete (this=0x9d1dee0) at ../src/core/object.cc:254
#5  0xb7e647b0 in ns3::NqapWifiMac::DoDispose (this=0x9d1da20) at optimized/ns3/object.h:355
#6  0xb7bea7b8 in ns3::Object::Dispose (this=0x9d1da20) at ../src/core/object.cc:137
#7  0xb7e71a95 in ns3::WifiNetDevice::DoDispose (this=0x9d1d910) at ../src/devices/wifi/wifi-net-device.cc:76
#8  0xb7bea7b8 in ns3::Object::Dispose (this=0x9d1d910) at ../src/core/object.cc:137
#9  0xb7cf5484 in ns3::Node::DoDispose (this=0x9cf9250) at ../src/node/node.cc:155
#10 0xb7bea7b8 in ns3::Object::Dispose (this=0x9cf9250) at ../src/core/object.cc:137
#11 0xb7d0eb16 in ~NodeListPriv (this=0x9cf94b8) at ../src/node/node-list.cc:110
#12 0xb7bea8a3 in ns3::Object::MaybeDelete (this=0x9cf94b8) at ../src/core/object.cc:265
#13 0xb7d0faf0 in ns3::NodeListPriv::Delete () at optimized/ns3/object.h:355
#14 0xb7c9018c in Notify (this=0x9cfb6f0) at ../src/simulator/make-event.cc:19
#15 0xb7c77810 in ns3::EventImpl::Invoke (this=0x9d1e058) at ../src/simulator/event-impl.cc:39
#16 0xb7c87a9b in ns3::DefaultSimulatorImpl::Destroy (this=0x9cfb2c0) at ../src/simulator/default-simulator-impl.cc:83
#17 0xb7c7b298 in ns3::Simulator::Destroy () at ../src/simulator/simulator.cc:131
#18 0x0805064d in main (argc=2, argv=0xbfb0d414) at ../examples/wifi-voip-ns3-like-extreme.cc:277
(gdb) 

(gdb) print m_listener
$2 = (class ns3::MacLowTransmissionListener *) 0x9d1e058

(gdb) info locals
oneRunning = true

Comment 3 Nicola Baldo 2009-07-07 08:22:15 UTC
(In reply to comment #1)
> Created an attachment (id=434) [details]
> Wifi-failure
> 
> I have observed the following problem: DcaTxop is destroyed first, and MacLow
> is destroyed second. Running DoDispose of MacLow, we call CancelAllEvents(), which
> sometimes (if one of the events is running) calls m_listener->Cancel() and causes
> a segmentation fault. The solution to this problem is to add "m_listener = 0;"
> before CancelAllEvents in the destructor.
> 

I get the impression that this might not be the desired fix.
With this proposed solution, m_listener->Cancel () is never executed. This of course makes the error disappear, but I am not sure if it is the intended behavior.

Nicola
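
The concern can be made concrete with a small sketch. It is only an illustration under assumptions: CountingListener, MacLowModel and the two helper functions are hypothetical names, and the second ordering is one possible alternative, not a description of any patch attached to this bug. It compares how many times Cancel() is delivered when the pointer is reset first versus when the events are cancelled while the listener is still alive.

```cpp
#include <cassert>

// Hypothetical listener that counts how often it is told about a cancel.
struct CountingListener {
  int cancelCalls = 0;
  void Cancel() { ++cancelCalls; }
};

// Hypothetical, simplified model of the MAC layer holding a raw pointer.
class MacLowModel {
 public:
  void SetListener(CountingListener* listener) { m_listener = listener; }
  void StartTransmission() { m_eventRunning = true; }

  void CancelAllEvents() {
    bool oneRunning = m_eventRunning;
    m_eventRunning = false;
    if (oneRunning && m_listener != nullptr) {
      m_listener->Cancel();
    }
  }

 private:
  CountingListener* m_listener = nullptr;
  bool m_eventRunning = false;
};

// Ordering A: reset the pointer, then cancel. The crash is avoided, but the
// listener never learns that its transmission was cancelled.
inline int CancelCallsWhenResetFirst() {
  CountingListener listener;
  MacLowModel low;
  low.SetListener(&listener);
  low.StartTransmission();
  low.SetListener(nullptr);
  low.CancelAllEvents();
  return listener.cancelCalls;
}

// Ordering B: cancel while the listener is still alive, then detach it.
// The listener is notified, and no stale pointer remains afterwards.
inline int CancelCallsWhenCancelledFirst() {
  CountingListener listener;
  MacLowModel low;
  low.SetListener(&listener);
  low.StartTransmission();
  low.CancelAllEvents();
  low.SetListener(nullptr);
  return listener.cancelCalls;
}
```

In this model, ordering A yields zero Cancel() calls and ordering B yields one, which is the behavioral difference the question above is about. Whether the notification matters during shutdown depends on what the real listener does in Cancel().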
Comment 4 Mathieu Lacage 2009-07-15 14:59:00 UTC
Created attachment 532 [details]
alternate patch

Please, can you verify that this fixes your problem?
Comment 5 Nicola Baldo 2009-07-16 05:54:38 UTC
(In reply to comment #4)
> Please, can you verify that this fixes your problem?

I confirm that your alternate patch fixes the problem in my setup. 

Comment 6 Mathieu Lacage 2009-07-16 06:31:39 UTC
changeset 02bf728f7e39
Comment 7 Nicola Baldo 2009-08-26 06:35:56 UTC
I experienced this bug once again, so I am re-opening it.
This time I can reproduce it with plain ns-3-dev. See attached program below.
Comment 8 Nicola Baldo 2009-08-26 06:36:45 UTC
Created attachment 572 [details]
program reproducing the bug
Comment 9 Faker Moatamri 2009-11-25 08:43:10 UTC
Hi Nicola,
I tried your program with the latest ns-3-dev (5777:a7ca957db043), valgrind-3.3.0, and g++-4.3.2. Valgrind reports memory leak errors but no invalid read of size 8.

==13264== 8 bytes in 2 blocks are still reachable in loss record 1 of 4                                                                                                   
==13264==    at 0x4A06D5C: operator new(unsigned long) (vg_replace_malloc.c:230)                                                                                          
==13264==    by 0x52089A9: ns3::ObjectRefCount<ns3::Object, ns3::ObjectBase>::ObjectRefCount() (object-ref-count.h:42)                                                    
==13264==    by 0x51F92D3: ns3::Object::Object() (object.cc:86)                                                                                                           
==13264==    by 0x5337CCE: ns3::Scheduler::Scheduler() (scheduler.h:54)                                                                                                   
==13264==    by 0x5338E66: ns3::MapScheduler::MapScheduler() (map-scheduler.cc:44)                                                                                        
==13264==    by 0x5338EA9: ns3::TypeId ns3::TypeId::AddConstructor<ns3::MapScheduler>()::Maker::Create() (type-id.h:429)                                                  
==13264==    by 0x52084E5: ns3::FunctorCallbackImpl<ns3::ObjectBase* (*)(), ns3::ObjectBase*, ns3::empty, ns3::empty, ns3::empty, ns3::empty, ns3::empty, ns3::empty, ns3::empty, ns3::empty, ns3::empty>::operator()() (callback.h:166)                                                                                                            
==13264==    by 0x522FF9F: ns3::Callback<ns3::ObjectBase*, ns3::empty, ns3::empty, ns3::empty, ns3::empty, ns3::empty, ns3::empty, ns3::empty, ns3::empty, ns3::empty>::operator()() const (callback.h:407)                                                                                                                                         
==13264==    by 0x522F66A: ns3::ObjectFactory::Create() const (object-factory.cc:69)                                                                                      
==13264==    by 0x5361BFB: ns3::Ptr<ns3::Scheduler> ns3::ObjectFactory::Create<ns3::Scheduler>() const (object-factory.h:110)                                             
==13264==    by 0x53607A3: ns3::DefaultSimulatorImpl::SetScheduler(ns3::ObjectFactory) (default-simulator-impl.cc:92)                                                     
==13264==    by 0x5348881: _ZN3ns3L7GetImplEv (simulator.cc:110)                                                                                                          
==13264==                                                                                                                                                                 
==13264==                                                                                                                                                                 
==13264== 32 bytes in 2 blocks are still reachable in loss record 2 of 4                                                                                                  
==13264==    at 0x4A0739E: malloc (vg_replace_malloc.c:207)                                                                                                               
==13264==    by 0x51F930C: ns3::Object::Object() (object.cc:86)                                                                                                           
==13264==    by 0x5361D1A: ns3::SimulatorImpl::SimulatorImpl() (simulator-impl.h:36)                                                                                      
==13264==    by 0x5360DC4: ns3::DefaultSimulatorImpl::DefaultSimulatorImpl() (default-simulator-impl.cc:49)                                                               
==13264==    by 0x5360E54: ns3::TypeId ns3::TypeId::AddConstructor<ns3::DefaultSimulatorImpl>()::Maker::Create() (type-id.h:429)                                          
==13264==    by 0x52084E5: ns3::FunctorCallbackImpl<ns3::ObjectBase* (*)(), ns3::ObjectBase*, ns3::empty, ns3::empty, ns3::empty, ns3::empty, ns3::empty, ns3::empty, ns3::empty, ns3::empty, ns3::empty>::operator()() (callback.h:166)                                                                                                            
==13264==    by 0x522FF9F: ns3::Callback<ns3::ObjectBase*, ns3::empty, ns3::empty, ns3::empty, ns3::empty, ns3::empty, ns3::empty, ns3::empty, ns3::empty, ns3::empty>::operator()() const (callback.h:407)                                                                                                                                         
==13264==    by 0x522F66A: ns3::ObjectFactory::Create() const (object-factory.cc:69)                                                                                      
==13264==    by 0x535AB4B: ns3::Ptr<ns3::SimulatorImpl> ns3::ObjectFactory::Create<ns3::SimulatorImpl>() const (object-factory.h:110)                                     
==13264==    by 0x5348704: _ZN3ns3L7GetImplEv (simulator.cc:103)                                                                                                          
==13264==    by 0x5348F7B: ns3::Simulator::Cancel(ns3::EventId const&) (simulator.cc:301)                                                                                 
==13264==    by 0x5336C4C: ns3::EventId::Cancel() (event-id.cc:42)                                                                                                        
==13264==                                                                                                                                                                 
==13264==                                                                                                                                                                 
==13264== 88 bytes in 1 blocks are still reachable in loss record 3 of 4                                                                                                  
==13264==    at 0x4A06D5C: operator new(unsigned long) (vg_replace_malloc.c:230)                                                                                          
==13264==    by 0x5338E9C: ns3::TypeId ns3::TypeId::AddConstructor<ns3::MapScheduler>()::Maker::Create() (type-id.h:429)                                                  
==13264==    by 0x52084E5: ns3::FunctorCallbackImpl<ns3::ObjectBase* (*)(), ns3::ObjectBase*, ns3::empty, ns3::empty, ns3::empty, ns3::empty, ns3::empty, ns3::empty, ns3::empty, ns3::empty, ns3::empty>::operator()() (callback.h:166)                                                                                                            
==13264==    by 0x522FF9F: ns3::Callback<ns3::ObjectBase*, ns3::empty, ns3::empty, ns3::empty, ns3::empty, ns3::empty, ns3::empty, ns3::empty, ns3::empty, ns3::empty>::operator()() const (callback.h:407)                                                                                                                                         
==13264==    by 0x522F66A: ns3::ObjectFactory::Create() const (object-factory.cc:69)                                                                                      
==13264==    by 0x5361BFB: ns3::Ptr<ns3::Scheduler> ns3::ObjectFactory::Create<ns3::Scheduler>() const (object-factory.h:110)                                             
==13264==    by 0x53607A3: ns3::DefaultSimulatorImpl::SetScheduler(ns3::ObjectFactory) (default-simulator-impl.cc:92)                                                     
==13264==    by 0x5348881: _ZN3ns3L7GetImplEv (simulator.cc:110)                                                                                                          
==13264==    by 0x5348F7B: ns3::Simulator::Cancel(ns3::EventId const&) (simulator.cc:301)                                                                                 
==13264==    by 0x5336C4C: ns3::EventId::Cancel() (event-id.cc:42)                                                                                                        
==13264==    by 0x554D7E0: ns3::TcpSocketImpl::CancelAllTimers() (tcp-socket-impl.cc:1516)                                                                                
==13264==    by 0x5559D04: ns3::TcpSocketImpl::~TcpSocketImpl() (tcp-socket-impl.cc:201)                                                                                  
==13264==
==13264==
==13264== 96 bytes in 1 blocks are still reachable in loss record 4 of 4
==13264==    at 0x4A06D5C: operator new(unsigned long) (vg_replace_malloc.c:230)
==13264==    by 0x5360E47: ns3::TypeId ns3::TypeId::AddConstructor<ns3::DefaultSimulatorImpl>()::Maker::Create() (type-id.h:429)
==13264==    by 0x52084E5: ns3::FunctorCallbackImpl<ns3::ObjectBase* (*)(), ns3::ObjectBase*, ns3::empty, ns3::empty, ns3::empty, ns3::empty, ns3::empty, ns3::empty, ns3::empty, ns3::empty, ns3::empty>::operator()() (callback.h:166)
==13264==    by 0x522FF9F: ns3::Callback<ns3::ObjectBase*, ns3::empty, ns3::empty, ns3::empty, ns3::empty, ns3::empty, ns3::empty, ns3::empty, ns3::empty, ns3::empty>::operator()() const (callback.h:407)
==13264==    by 0x522F66A: ns3::ObjectFactory::Create() const (object-factory.cc:69)
==13264==    by 0x535AB4B: ns3::Ptr<ns3::SimulatorImpl> ns3::ObjectFactory::Create<ns3::SimulatorImpl>() const (object-factory.h:110)
==13264==    by 0x5348704: _ZN3ns3L7GetImplEv (simulator.cc:103)
==13264==    by 0x5348F7B: ns3::Simulator::Cancel(ns3::EventId const&) (simulator.cc:301)
==13264==    by 0x5336C4C: ns3::EventId::Cancel() (event-id.cc:42)
==13264==    by 0x554D7E0: ns3::TcpSocketImpl::CancelAllTimers() (tcp-socket-impl.cc:1516)
==13264==    by 0x5559D04: ns3::TcpSocketImpl::~TcpSocketImpl() (tcp-socket-impl.cc:201)
==13264==    by 0x51F31A1: ns3::Object::DoDelete() (object.cc:378)

Is that the error you are reporting?
Comment 10 Nicola Baldo 2009-11-26 04:36:19 UTC
Created attachment 680 [details]
new program reproducing the bug

Here is a slightly modified version of the program that addresses a minor change in the Application interface.
Comment 11 Nicola Baldo 2009-11-26 04:39:04 UTC
(In reply to comment #9)
> Hi Nicola,
> I tried your program with the latest ns-3-dev (5777:a7ca957db043),
> valgrind-3.3.0, and g++-4.3.2. Valgrind reports memory leak errors but no
> invalid read of size 8.
[snip]
> Is that the error you are reporting?

No. It really crashes with a segmentation fault.

nicola@pcnbaldo:~/locale/ns-3-dev$ ./waf --run scratch/bug381
Waf: Entering directory `/home/nicola/locale/ns-3-dev/build'
Waf: Leaving directory `/home/nicola/locale/ns-3-dev/build'
'build' finished successfully (1.048s)
Command ['/home/nicola/locale/ns-3-dev/build/debug/scratch/bug381'] exited with code -11


nicola@pcnbaldo:~/locale/ns-3-dev$ ./waf --command="gdb %s" --run scratch/bug381
Waf: Entering directory `/home/nicola/locale/ns-3-dev/build'
Waf: Leaving directory `/home/nicola/locale/ns-3-dev/build'
'build' finished successfully (1.027s)
GNU gdb 6.8-debian
Copyright (C) 2008 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "i486-linux-gnu"...
(gdb) run
Starting program: /home/nicola/locale/ns-3-dev/build/debug/scratch/bug381 
[Thread debugging using libthread_db enabled]
[New Thread 0xb6140720 (LWP 18684)]

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0xb6140720 (LWP 18684)]
0x085f9053 in ?? ()
(gdb) back
#0  0x085f9053 in ?? ()
#1  0xb7b7b456 in ns3::MacLow::DoDispose (this=0x85f8820) at ../src/devices/wifi/mac-low.cc:306
#2  0xb75234ee in ns3::Object::DoDelete (this=0x85f8820) at ../src/core/object.cc:363
#3  0x08055607 in ns3::ObjectRefCount<ns3::Object, ns3::ObjectBase>::Unref (this=0x85f8820) at debug/ns3/object-ref-count.h:83
#4  0xb7b96090 in ns3::Ptr<ns3::MacLow>::operator= (this=0x85f9138, o=@0xbf93adec) at debug/ns3/ptr.h:455
#5  0xb7c2b7eb in ns3::EdcaTxopN::DoDispose (this=0x85f9110) at ../src/devices/wifi/edca-txop-n.cc:130
#6  0xb75234ee in ns3::Object::DoDelete (this=0x85f9110) at ../src/core/object.cc:363
#7  0x08055607 in ns3::ObjectRefCount<ns3::Object, ns3::ObjectBase>::Unref (this=0x85f9110) at debug/ns3/object-ref-count.h:83
#8  0xb7c0dd3e in ns3::Ptr<ns3::EdcaTxopN>::operator= (this=0x85f92a4, o=@0xbf93af5c) at debug/ns3/ptr.h:455
#9  0xb7c0a83f in ns3::QadhocWifiMac::DoDispose (this=0x85f86e0) at ../src/devices/wifi/qadhoc-wifi-mac.cc:118
#10 0xb7522850 in ns3::Object::Dispose (this=0x85f86e0) at ../src/core/object.cc:205
#11 0xb7bdc64f in ns3::WifiNetDevice::DoDispose (this=0x85f8580) at ../src/devices/wifi/wifi-net-device.cc:84
#12 0xb7522850 in ns3::Object::Dispose (this=0x85f8580) at ../src/core/object.cc:205
#13 0xb77949fc in ns3::Node::DoDispose (this=0x85f6298) at ../src/node/node.cc:156
#14 0xb7522850 in ns3::Object::Dispose (this=0x85f6298) at ../src/core/object.cc:205
#15 0xb77cc8c8 in ~NodeListPriv (this=0x85f6400) at ../src/node/node-list.cc:110
#16 0xb752353b in ns3::Object::DoDelete (this=0x85f6400) at ../src/core/object.cc:378
#17 0x08055607 in ns3::ObjectRefCount<ns3::Object, ns3::ObjectBase>::Unref (this=0x85f6400) at debug/ns3/object-ref-count.h:83
#18 0xb77ce3cc in ns3::Ptr<ns3::NodeListPriv>::operator= (this=0xb7f74c98, o=@0xbf93b1b8) at debug/ns3/ptr.h:455
#19 0xb77cdec9 in ns3::NodeListPriv::Delete () at ../src/node/node-list.cc:95
#20 0xb76a8eba in Notify (this=0x85f61c8) at ../src/simulator/make-event.cc:19
#21 0xb7675f4a in ns3::EventImpl::Invoke (this=0x85f61c8) at ../src/simulator/event-impl.cc:37
#22 0xb7695a44 in ns3::DefaultSimulatorImpl::Destroy (this=0x85f61e0) at ../src/simulator/default-simulator-impl.cc:84
#23 0xb767c877 in ns3::Simulator::Destroy () at ../src/simulator/simulator.cc:143


nicola@pcnbaldo:~/locale/ns-3-dev$ uname -a
Linux pcnbaldo 2.6.28-16-generic #57-Ubuntu SMP Wed Nov 11 09:47:24 UTC 2009 i686 GNU/Linux

nicola@pcnbaldo:~/locale/ns-3-dev$ g++ --version
g++ (Ubuntu 4.3.3-5ubuntu4) 4.3.3
Comment 12 Nicola Baldo 2009-11-26 04:49:30 UTC
I forgot to mention that the above was produced with ns-3-dev 5770:bb1eea10412f. I just tried the latest ns-3-dev (5779:6642920ad056), and the program still crashes, though with SIGILL instead of SIGSEGV. The backtrace produced by gdb looks identical to the one I posted above.
Comment 13 Faker Moatamri 2009-12-10 06:43:48 UTC
changeset:   5846:f7a4e1b3f632