Bug 2477

Summary: DCF manager assert
Product: ns-3
Reporter: Tom Henderson <tomh>
Component: wifi
Assignee: sebastien.deronne
Status: RESOLVED FIXED
Severity: normal
CC: adarshpatel, ns-bugs, sebastien.deronne
Priority: P3
Version: ns-3.26
Hardware: All
OS: All
Attachments:
- test case displaying the error
- proposed patch to fix
- new patch to fix

Description Tom Henderson 2016-08-16 01:21:56 UTC
Created attachment 2541 [details]
test case displaying the error

Reported in this forum entry:

https://groups.google.com/forum/#!topic/ns-3-users/5_cJ1TVpu2E

"I am running the script attached to this post in ns-3-dev on Ubuntu 12.04 and for a number of nodes inferior to 36 there is no problem but for a number of nodes equal or superior to 36 I have the following error:

assert failed. cond="Simulator::Now () - m_lastRxStart <= m_sifs", file=../src/wifi/model/dcf-manager.cc, line=763

(test case attached)
Comment 1 Tom Henderson 2016-09-16 02:17:28 UTC
In this test case, the assert fires because of the scheduled transmission of a block ack.  The assert fires at time 161.254010337s on node 55.  The block ack is scheduled on node 55 at time 161.245902644s, upon receipt of the first MPDU in an A-MPDU from node 49.  Node 49 sends node 55 six MPDUs (between times 161.245902 and 161.252646).  However, there are reception problems on node 55, and node 55 drops most of these:

161.249949412s 55 YansWifiPhy:StartReceivePreambleAndHeader(0xe8cac0, 0x10c1cb0, -83.9128, HtMcs0, 5, 1)
161.249949412s 55 drop packet because no PLCP preamble/header has been received
161.251298335s 55 YansWifiPhy:StartReceivePreambleAndHeader(0xe8cac0, 0x1361a20, -83.9095, HtMcs0, 5, 1)
161.251298335s 55 drop packet because no PLCP preamble/header has been received
161.252647258s 55 YansWifiPhy:StartReceivePreambleAndHeader(0xe8cac0, 0x1144cf0, -83.9062, HtMcs0, 5, 2)

and then starts to receive another transmission (from node 61).  It is this reception that causes the assert: when node 55 sends the block ack, the PHY is in the receive state, and the DCF is notified that a transmission is occurring.

It seems to me that the comment in dcf-manager.cc is not completely correct anymore:

       //this may be caused only if PHY has started to receive a packet
       //inside SIFS, so, we check that lastRxStart was maximum a SIFS ago

and that it is possible for the DCF to get out of sync with the PHY in some of these cases.  So, the proposed patch for now (which clears the error and passes all other tests) basically just removes this assert.  This doesn't really change the model behavior; the PHY cancels the receive event and starts to transmit.
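
To make the failed condition concrete, here is a minimal sketch (not ns-3 code) that evaluates the assert's condition on timestamps quoted in this report. The helper name, the 16 us SIFS, and the choice of the last reception start shown in the log excerpt as m_lastRxStart are assumptions for illustration only; the log does not say which reception actually set m_lastRxStart.

```cpp
#include <cassert>
#include <cstdint>

// Times are integer nanoseconds, to avoid floating-point rounding.
using Nanoseconds = std::int64_t;

// Hypothetical helper mirroring the condition of the failed assert:
//   Simulator::Now () - m_lastRxStart <= m_sifs
bool RxStartedWithinSifs (Nanoseconds now, Nanoseconds lastRxStart, Nanoseconds sifs)
{
  assert (now >= lastRxStart); // a reception cannot have started in the future
  return now - lastRxStart <= sifs;
}

// Timestamps quoted in this report, converted to nanoseconds.
const Nanoseconds kBlockAckScheduled = 161245902644; // block ack scheduled on node 55
const Nanoseconds kAssertFired = 161254010337;       // assert fires on node 55
const Nanoseconds kLastLoggedRxStart = 161252647258; // last reception start in the log excerpt

// Assumed SIFS of 16 us; the real value depends on the configured standard.
const Nanoseconds kAssumedSifs = 16000;
```

With these assumed inputs, RxStartedWithinSifs (kAssertFired, kLastLoggedRxStart, kAssumedSifs) is false: the gap is 1363079 ns, far more than one SIFS, and the block ack had been scheduled 8107693 ns before the assert fired.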

However, I think that what should be done is to patch this for the 3.26 release but keep this tracker issue open for a bit to think about whether more should be done.  Specifically, if a receiver gives up on an A-MPDU, is there a way to cancel the scheduled block ack event in MacLow?
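
As a rough illustration of the cancellation idea raised above, the following toy model (not the ns-3 scheduler or MacLow; every class and method name here is hypothetical) shows a receiver that schedules a block ack on the first MPDU and cancels it if it later gives up on the A-MPDU.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <utility>

// Toy stand-in for a scheduled simulator event (not the ns-3 EventId class).
class CancellableEvent
{
public:
  explicit CancellableEvent (std::function<void ()> action)
    : m_action (std::move (action))
  {
    assert (m_action); // an event without an action would be a bug
  }

  // Mark the event so that it does nothing when its time comes.
  void Cancel () { m_cancelled = true; }

  bool IsPending () const { return !m_cancelled && !m_fired; }

  // Called by the (hypothetical) scheduler at the event's expiry time.
  void Fire ()
  {
    if (IsPending ())
      {
        m_fired = true;
        m_action ();
      }
  }

private:
  std::function<void ()> m_action;
  bool m_cancelled = false;
  bool m_fired = false;
};

// Hypothetical receiver-side logic.
class ToyReceiver
{
public:
  // First MPDU of an A-MPDU arrives: schedule the block ack response.
  void OnFirstMpduOfAmpdu ()
  {
    m_blockAck.reset (new CancellableEvent ([this] { ++m_blockAcksSent; }));
  }

  // The PHY reports that it gave up on the A-MPDU: cancel the response.
  void OnAmpduAbandoned ()
  {
    if (m_blockAck)
      {
        m_blockAck->Cancel ();
      }
  }

  // The scheduled time of the block ack is reached.
  void OnBlockAckTimerExpired ()
  {
    if (m_blockAck)
      {
        m_blockAck->Fire ();
      }
  }

  int BlockAcksSent () const { return m_blockAcksSent; }

private:
  std::unique_ptr<CancellableEvent> m_blockAck;
  int m_blockAcksSent = 0;
};
```

In this sketch, a receiver that sees OnAmpduAbandoned () before the timer expires sends no block ack, while one that never abandons the A-MPDU sends exactly one.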
Comment 2 Tom Henderson 2016-09-16 02:18:03 UTC
Created attachment 2582 [details]
proposed patch to fix
Comment 3 Adarsh Patel 2016-09-18 23:27:25 UTC
Hi Tom H., I solved the issue the same way as your proposed fix; this may be a coincidence. I am doing a PhD, so due to lack of time I was unable to create a patch, run regressions, etc.

I have doubts about this solution, though. I think it should be fixed in WifiPhyStateHelper::SwitchToTx (wifi-phy-state-helper.cc), where it is decided that "Yes notify Tx started", but I am not sure I am right. Please look into the issue.
Comment 4 Tom Henderson 2016-09-28 00:11:03 UTC
For ns-3.26, I committed the patch as changeset 12346:10fae18dcfd2, which avoids the assert, but I will leave this open for further study as to whether there needs to be some better way to keep DCF in sync and to cancel scheduled block ack transmissions when the TXOP has been released prematurely.
Comment 5 sebastien.deronne 2016-10-01 07:05:29 UTC
Tom, thanks for the workaround. I will also think about how we can better fix this issue.
Comment 6 sebastien.deronne 2016-10-11 17:25:31 UTC
I have not yet checked the traces carefully, but the PHY should not lock onto another signal if it did not receive the preamble.
Or does this mean the preamble starts after the last MPDU is received?
Comment 7 Tom Henderson 2016-10-11 17:39:27 UTC
(In reply to sebastien.deronne from comment #6)
> I have not yet checked the traces carefully, but the PHY should not lock onto
> another signal if it did not receive the preamble.
> Or does this mean the preamble starts after the last MPDU is received?

I believe that the issue is that the last MPDU is not received: the PHY gives up on the A-MPDU and captures another MPDU from another sender.
Comment 8 sebastien.deronne 2016-10-14 08:29:57 UTC
I looked a bit further into the issue.

The last MPDU is indeed not received.
In this scenario, the node moved, so the signal is too weak and we are no longer locked onto the transmitter (drop packet because signal power too Small). This resets the PHY so that it can start capturing other frames, which is expected.
But since the BACK is still scheduled, it results in the assert.

I see two solutions:
- MAC skips transmitting BACK if the PHY is not idle (i.e. state is either rx or tx)
- PHY triggers the MAC if we are no longer locked on the ongoing A-MPDU.

I actually prefer the first solution for which I have a patch already.
Comment 9 sebastien.deronne 2016-10-14 08:32:17 UTC
Created attachment 2616 [details]
new patch to fix

I removed the previous change in DcfManager and added a check in MacLow that decides, based on the PHY state, whether the scheduled block ack transmission should be skipped.
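
For readers without the attachment, the idea can be sketched roughly as follows. This is a toy model under stated assumptions, not the attached patch and not the MacLow API: the enum, the struct, and all function names are hypothetical.

```cpp
#include <cassert>

// Simplified PHY states for illustration.
enum class PhyState { Idle, Rx, Tx };

// Hypothetical check: send the scheduled block ack only if the PHY is idle.
bool ShouldSendBlockAck (PhyState state)
{
  return state == PhyState::Idle;
}

// Toy MAC that counts sent and skipped block acks.
struct ToyMac
{
  PhyState phyState = PhyState::Idle;
  int sent = 0;
  int skipped = 0;

  // Runs when the scheduled block ack transmission time is reached.
  void OnBlockAckTimerExpired ()
  {
    if (!ShouldSendBlockAck (phyState))
      {
        ++skipped; // PHY is busy (rx or tx): drop the stale response
        return;
      }
    SendBlockAck ();
  }

  void SendBlockAck ()
  {
    assert (phyState == PhyState::Idle); // transmission only starts from idle
    phyState = PhyState::Tx;
    ++sent;
  }
};
```

In this model, a ToyMac whose phyState is Rx when the timer expires skips the block ack and leaves the PHY state untouched, so the DCF is never told about a transmission that conflicts with an ongoing reception.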
Comment 10 sebastien.deronne 2016-10-24 16:15:23 UTC
Is the latest patch ok for everyone?
Comment 11 Tom Henderson 2016-10-26 00:49:08 UTC
Yes, passes my tests.
Comment 12 sebastien.deronne 2016-10-30 04:22:29 UTC
Fixed in changeset 12384:aa1a97ec0289