Bug 2185

Summary: WiFi MacLow may respond to errored frames that it should ignore
Product: ns-3 Reporter: Tom Henderson <tomh>
Component: wifiAssignee: sebastien.deronne
Status: RESOLVED FIXED    
Severity: normal CC: matis18, ns-bugs
Priority: P5    
Version: ns-3-dev   
Hardware: All   
OS: All   
Attachments: patch to fix
patch to trigger a PHY-RXEND indication at the end of the complete frame
patch to check destination address before replying

Description Tom Henderson 2015-09-25 20:06:30 UTC
Created attachment 2149 [details]
patch to fix

MacLow::ReceiveError() is called when the Phy has a reception error.  It may happen that the reception error is for a frame destined to another station.  However, there are no checks on destination address in ReceiveError() to determine whether further action should be taken.

This can eventually end up causing an assert elsewhere, such as 

MacLow::SendBlockAckAfterAmpdu(... originator...) 
 NS_ASSERT (i != m_bAckCaches.end ());

because there is no block ack state for the originator.

Attached patch clears the problem for me.
Comment 1 sebastien.deronne 2015-09-26 05:20:51 UTC
Tom, I thought (I was wrong apparently) this was already fixed by a previous small change I did in MacLow (which in any case adds some more safety to our code, please refer to commit 11614:50d7bddfc00b).

Anyhow, I am in favor of including your patch, this will definitely prevents from a bad behavior.
Comment 2 Tom Henderson 2015-09-26 09:21:25 UTC
(In reply to sebastien.deronne from comment #1)
> Tom, I thought (I was wrong apparently) this was already fixed by a previous
> small change I did in MacLow (which in any case adds some more safety to our
> code, please refer to commit 11614:50d7bddfc00b).
> 
> Anyhow, I am in favor of including your patch, this will definitely prevents
> from a bad behavior.

You may be right; the code that was triggering it for me was the ns-3-dev-laa code whose version of mac-low.cc predates 11614.  I have not reproduced the problem on ns-3-dev.  If you want, I could hold off on this patch until we update ns-3-dev-laa and see if 11614 in fact clears it.  Or maybe this check is good to have anyway to prevent future such errors from appearing.

I was wondering about the MacLow::ReceiveError() method and that maybe at the very least a comment should be added about what is being modelled here.  This method is triggered from the Phy due to a reception error (even PLCP header decode error), yet the Mac is taking action on the packet fields.  Shouldn't the packet be disregarded?  The doxygen says:

   * This method is typically invoked by the lower PHY layer to notify
   * the MAC layer that a packet was unsuccessfully received.

However the ReceiveError treats it as if at least some portions were successfully received.  Is the model cheating here?
Comment 3 sebastien.deronne 2015-09-26 09:52:55 UTC
> You may be right; the code that was triggering it for me was the
> ns-3-dev-laa code whose version of mac-low.cc predates 11614.  I have not
> reproduced the problem on ns-3-dev.  If you want, I could hold off on this
> patch until we update ns-3-dev-laa and see if 11614 in fact clears it.  Or
> maybe this check is good to have anyway to prevent future such errors from
> appearing.

I think we should add this check, at least for preventive reasons.
 
> I was wondering about the MacLow::ReceiveError() method and that maybe at
> the very least a comment should be added about what is being modelled here. 
> This method is triggered from the Phy due to a reception error (even PLCP
> header decode error), yet the Mac is taking action on the packet fields. 
> Shouldn't the packet be disregarded?  The doxygen says:
> 
>    * This method is typically invoked by the lower PHY layer to notify
>    * the MAC layer that a packet was unsuccessfully received.
> 
> However the ReceiveError treats it as if at least some portions were
> successfully received.  Is the model cheating here?

You are right, we should normally not read fields there. However, we have currently no other solution as long as the PHY layer is not improved to handle A-MPDUs. 

In order to avoid this workaround, we would need a PHY-RXEND indication NOT at the end of each MPDU (as done currently) but at the end of the complete frame (based on PPDU header’s Length field). If the MAC gets this indication and previously received at least 1 MPDU, it should send a Block ACK. This would no longer require the fields of the failed PPDU.

If you also have the same feeling, we should open another thread in the tracker for this improvement, which I think is not too much work but would definitely clean MacLow::ReceiveError and better stick with the standard. Then I'll work on a patch in the very short loop.
Comment 4 sebastien.deronne 2015-10-10 05:19:21 UTC
Created attachment 2161 [details]
patch to trigger a PHY-RXEND indication at the end of the complete frame

I prepared a patch that triggers a PHY-RXEND indication at the end of the complete frame. This avoids the need to take actions on the packet fields in MacLow::ReceiveError.
Comment 5 Matías Richart 2015-10-19 13:08:39 UTC
The patch looks good to me.
Comment 6 Tom Henderson 2015-10-20 16:48:59 UTC
+1
Comment 7 sebastien.deronne 2015-10-27 17:58:15 UTC
Committed in changeset 11732:5438504862ac
Comment 8 Tom Henderson 2015-11-16 22:52:01 UTC
I am still seeing instances of the problem described here:

https://groups.google.com/forum/#!searchin/ns-3-users/m_bAckCaches.end/ns-3-users/GAm50Z-FhcU/dyVgtOWj82QJ

I don't have a good test case at the moment as it is occurring in some simulations not from a public repo.  I will post a test case if I find it elsewhere.

The attached patch fixes it for me and it passes all test.py tests.  I am uncertain whether there can be ReceiveError() called where the WifiMacHeader is not next to be popped (in which case it is not appropriate to PeekHeader on it); I would think that individual MPDUs in an A-MPDU would be like this.   I would like some review on this patch; maybe something else is preferable.

Anyway, the bottom line is that it is dangerous to reply to things from ReceiveError() unless the destination can be determined to be the local host, and even then, it is probably best to clean ReceiveError() from replying at all.
Comment 9 Tom Henderson 2015-11-16 22:52:54 UTC
Created attachment 2189 [details]
patch to check destination address before replying
Comment 10 sebastien.deronne 2015-11-17 08:55:15 UTC
It seems good to add this check.
Does this fix the issues you still observed?
Comment 11 Tom Henderson 2015-11-17 09:54:20 UTC
(In reply to sebastien.deronne from comment #10)
> It seems good to add this check.
> Does this fix the issues you still observed?

Yes, but can it ever be the case that we receive a Packet here without a WifiMacHeader?  If so, we need to add additional checking for this case.
Comment 12 sebastien.deronne 2015-11-17 10:07:29 UTC
(In reply to Tom Henderson from comment #11)
> (In reply to sebastien.deronne from comment #10)
> > It seems good to add this check.
> > Does this fix the issues you still observed?
> 
> Yes, but can it ever be the case that we receive a Packet here without a
> WifiMacHeader?  If so, we need to add additional checking for this case.

No, this case will never happen.
Comment 13 sebastien.deronne 2015-11-26 07:51:40 UTC
Is there still issues with the latest changes you proposed?
Comment 14 Tom Henderson 2015-11-26 11:34:47 UTC
(In reply to sebastien.deronne from comment #13)
> Is there still issues with the latest changes you proposed?

Yes, I still ran into issues with that patch.  They are related to the bug 2224 issues I think; if the node is an AP, which STA does the variable m_receivedAtLeastOneMpdu pertain to?

The current code that I have in the ns-3-lbt repository, which is avoiding asserts from happening, disables all replying to packets from the ReceiveError() method.  I think this is a better solution in the long run.  I made a small change to the SpectrumWifiPhy to only send NotifyRxEnd() upon last MPDU in aggregate:

http://code.nsnam.org/laa/ns-3-lbt/file/fbb156e4ed76/src/wifi/model/spectrum-wifi-phy.cc#l1359

but I wonder whether more is needed?  Can you review this type of solution?  My concern with it is that the state of "RxEnd" means something different in the context of the InterferenceHelper state machine (which refers to a MPDU) and the context of the Phy state machine (which refers to an A-MPDU)-- is that a concern?
Comment 15 sebastien.deronne 2015-11-27 03:02:54 UTC
(In reply to Tom Henderson from comment #14)
> (In reply to sebastien.deronne from comment #13)
> > Is there still issues with the latest changes you proposed?
> 
> Yes, I still ran into issues with that patch.  They are related to the bug
> 2224 issues I think; if the node is an AP, which STA does the variable
> m_receivedAtLeastOneMpdu pertain to?

To the ongoing transmission, i.e. the transmission on which the AP is synched to.
Once the ongoing A-MPDU transmission finished, it should be reset to false.

> The current code that I have in the ns-3-lbt repository, which is avoiding
> asserts from happening, disables all replying to packets from the
> ReceiveError() method.  I think this is a better solution in the long run. 
> I made a small change to the SpectrumWifiPhy to only send NotifyRxEnd() upon
> last MPDU in aggregate:
> 
> http://code.nsnam.org/laa/ns-3-lbt/file/fbb156e4ed76/src/wifi/model/spectrum-
> wifi-phy.cc#l1359
> 
> but I wonder whether more is needed?  Can you review this type of solution? 
> My concern with it is that the state of "RxEnd" means something different in
> the context of the InterferenceHelper state machine (which refers to a MPDU)
> and the context of the Phy state machine (which refers to an A-MPDU)-- is
> that a concern?

Do you have a patch that I can apply to ns-3-dev?
Comment 16 sebastien.deronne 2015-12-26 04:07:47 UTC
Tom,
Could you please check whether you still observe issues once the fix for bug 2224 is applied?
Comment 17 sebastien.deronne 2016-02-23 18:32:57 UTC
changeset 11913:11d2776a1f25