Bug 754

Summary: Occasional crashes in emulation mode (tap-bridge)
Product: ns-3 Reporter: Fabian Mauchle <f1mauchl>
Component: devices Assignee: ns-bugs <ns-bugs>
Status: RESOLVED FIXED    
Severity: normal CC: craigdo
Priority: P5    
Version: ns-3-dev   
Hardware: PC   
OS: Linux   
Attachments: patch

Description Fabian Mauchle 2009-11-26 11:43:26 UTC
Created attachment 682
patch

Setup:
Host with dual-core cpu,
virtual machine running in VirtualBox (1 virtual CPU, 1 network interface connected to a tap device),
ns-3 in realtime-mode (emulation), connecting the virtual machine with the host via 2x TapBridge,
virtual machine has constant traffic (audio stream)

After some time (between 1 and 72 hours), ns-3 crashes. I suspect the crash is caused somewhere in TapBridge::ReadThread. Maybe the Ref and Unref methods are executed concurrently, which eventually drives the refcount to 0 and deletes the SimulatorImpl.

I'm currently testing the patch in a long run.
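
To illustrate the suspected failure mode, here is a minimal sketch that replays by hand one possible interleaving of two unsynchronized Ref calls. It is not ns-3 code: Counter, InterleavedRefs and PrematureDelete are hypothetical names, and the two "threads" are simulated sequentially so the outcome is deterministic.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for an object with a plain (non-atomic) refcount.
struct Counter {
    uint32_t count = 1;  // the owner already holds one reference
};

// A non-atomic Ref is really three steps: load, add, store.
// The steps of two threads A and B are interleaved by hand here.
uint32_t InterleavedRefs(Counter &c) {
    uint32_t a = c.count;  // A loads 1
    uint32_t b = c.count;  // B loads 1
    c.count = a + 1;       // A stores 2
    c.count = b + 1;       // B stores 2, so A's increment is lost
    return c.count;
}

// Returns true if the count reaches 0 while the owner still holds
// its reference, i.e. the object would be deleted too early.
bool PrematureDelete() {
    Counter c;           // owner: count == 1
    InterleavedRefs(c);  // two Refs, but count == 2 instead of 3
    --c.count;           // A's Unref: count == 1
    --c.count;           // B's Unref: count == 0, object deleted
    return c.count == 0;
}
```

With this interleaving the two Refs add only one to the count, so the two matching Unrefs consume the owner's reference as well.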
Comment 1 Fabian Mauchle 2009-11-27 11:21:44 UTC
ns-3 has now been running for 100 hours without any problem (and is still running).
Comment 2 Fabian Mauchle 2009-12-02 05:18:18 UTC
200 hours now. So I think this patch indeed solves the bug.
Comment 3 Craig Dowell 2009-12-02 14:53:59 UTC
I addressed this in a slightly different way in changeset d3f02a8dee76

If this is a problem with the Ptr<RealtimeSimulatorImpl> reference count increment in GetImplementation in the presence of multithreading, changing from N of these to 1 (where N is the number of packets received) in the presence of multithreading reduces the probability of failure *dramatically*, but the possibility of a problem still exists once when each ReadThread is spun up.  To remove this possibility I moved the GetImplementation into the main thread of the simulator while I was doing some other work.
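
The difference between the two approaches can be sketched as follows. This is an illustration under assumptions, not the code from the changeset: FakeSimulatorImpl, FakeBridge and both read loops are hypothetical names, the Ref call stands in for the GetImplementation lookup, and the loops are called directly rather than from a real reader thread so that the Ref calls can be counted deterministically.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical simulator implementation with a plain (non-atomic) refcount.
class FakeSimulatorImpl {
public:
    void Ref() { ++m_count; ++m_refCalls; }
    void Unref() { assert(m_count > 0); --m_count; }
    void ScheduleNow() { ++m_scheduled; }

    uint32_t m_count = 1;      // the simulator itself holds one reference
    uint32_t m_refCalls = 0;   // how often Ref() was called
    uint32_t m_scheduled = 0;  // how many packets were handed over
};

// Per-packet lookup: N packets cause N Ref/Unref pairs in the reader.
void ReadLoopPerPacket(FakeSimulatorImpl &sim, uint32_t nPackets) {
    for (uint32_t i = 0; i < nPackets; ++i) {
        sim.Ref();  // stands in for the per-packet GetImplementation()
        sim.ScheduleNow();
        sim.Unref();
    }
}

// Cached lookup: the reference is taken once, in the constructor, which
// is assumed to run in the main thread before any reader is started.
class FakeBridge {
public:
    explicit FakeBridge(FakeSimulatorImpl &sim) : m_sim(&sim) { m_sim->Ref(); }
    ~FakeBridge() { m_sim->Unref(); }

    // Would run in the reader thread; it never touches the refcount.
    void ReadLoop(uint32_t nPackets) {
        for (uint32_t i = 0; i < nPackets; ++i) {
            m_sim->ScheduleNow();
        }
    }

private:
    FakeSimulatorImpl *m_sim;
};
```

In the cached variant the reader loop performs no Ref or Unref at all, so the refcount is only ever modified from the thread that constructs and destroys the bridge.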

Can you see if the current version in ns-3-dev works to your satisfaction?
Comment 4 Fabian Mauchle 2009-12-03 08:53:35 UTC
(In reply to comment #3)
> I addressed this in a slightly different way in changeset d3f02a8dee76
> 
> If this is a problem with the Ptr<RealtimeSimulatorImpl> reference count
> increment in GetImplementation in the presence of multithreading, changing from
> N of these to 1 (where N is the number of packets received) in the presence of
> multithreading reduces the probability of failure *dramatically*, but the
> possibility of a problem still exists once when each ReadThread is spun up.  To
> remove this possibility I moved the GetImplementation into the main thread of
> the simulator while I was doing some other work.
> 
> Can you see if the current version in ns-3-dev works to your satisfaction?

Looks fine to me.