Bug 1605

Summary: Multihop network is not working at all
Product: ns-3 Reporter: Ehsan Elahi <ehsan.zahoor>
Component: wifi Assignee: Nicola Baldo <nicola>
Status: RESOLVED DUPLICATE    
Severity: normal CC: julian.quindt, ns-bugs, ruben, sebastien.deronne, tomh, tommaso.pecorella
Priority: P4    
Version: ns-3.16   
Hardware: PC   
OS: Linux   
See Also: https://www.nsnam.org/bugzilla/show_bug.cgi?id=2369

Description Ehsan Elahi 2013-03-20 11:14:32 UTC
I ran the example given in the mesh module, i.e. mesh.cc, without any change, but no packet is received at the sink node. When I reduced the distance between nodes from 100 to 70, packets were received at the sink. But when I increased the grid from 3x3 to just 4x4, again no packet was received at the sink.
This clearly means that either the routing protocol or the packet forwarding mechanism is not working at all, and only direct neighbours can send and receive packets.
Comment 1 Julian 2013-04-07 16:45:18 UTC
(In reply to comment #0)
> I run the example given in mesh module i.e. mesh.cc without any change but
> there is no packet received at sink node. However when I reduced the distance
> between nodes from 100 to 70, the packets are being received at sink. But when
> I increased the network density from 3x3 grid to even 4x4, again no packet is
> received at sink.
> It clearly means that routing protocol is not working at all or packet
> forwarding mechanism is not working and only direct node neighbours can send
> and receive packets.

Hi. I saw this behavior, too. Here is what I have tried so far:
-Varied the distance between the nodes
-Varied the number of nodes
-Changed the propagation model from default (Log) to Range and RSS
-Varied the network load (i.e. UDP ping interval) 

The things I tried do change the symptoms a little bit, but none has been found to work reliably for me.

Main symptoms are:
-Links to other nodes are not established at all if the number of nodes exceeds a certain amount (for instance 22 nodes or in another setup 26)
-Links are established, but throughput is near zero. The pcap data shows that many frames have the retry flag set
-Excessive execution duration for some of the parameter sets. See the bug I filed before you.

However, I can confirm that a diametral ping through the network does work sometimes (rarely and unreliably).
Comment 2 Kirill Andreev 2013-04-14 08:35:02 UTC
Hi all!

The problem is actually not in the mesh model itself.

The actual problem is that a node's packet processing time is exactly zero, and the model is highly sensitive to this.


I have run the mesh example and have discovered that no data packets can be delivered.


The first thing I changed was to add a random jitter to all packets that go from the mesh interface MAC to EdcaTxopN, and the network started to work properly.


The explanation is the following:
1. Some node sends a packet (this may be broadcast ARP request, or broadcast HWMP PREQ or something else)
2. Several nodes receive this packet
3. The received packet passes through the node immediately and arrives at the outgoing wifi queue (EDCA-TXOP)
4. All received packets are forwarded immediately (under the wifi backoff procedure, no backoff is calculated in this case, because there was no previous contention in the network)
5. The packets collide
6. No management data can pass through the network
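
The steps above can be illustrated with a toy model. This is only a sketch and not ns-3 code: the function names, the 4 us carrier-sense time and the uniform delay range are assumptions made for illustration. Every forwarder hears the same broadcast at t = 0 and starts transmitting after its own processing delay; a later node defers only if it starts late enough to sense the earlier frame.

```cpp
#include <algorithm>
#include <cassert>
#include <random>
#include <vector>

// Toy model (hypothetical, not ns-3): startUs holds the instants, in
// microseconds, at which each forwarder begins to transmit. A later node is
// assumed to sense the earlier frame and defer only if it starts at least
// senseUs after it; otherwise the two frames overlap and collide.
bool FirstForwardSurvives (std::vector<double> startUs, double senseUs)
{
  assert (!startUs.empty ());
  std::sort (startUs.begin (), startUs.end ());
  // The earliest frame survives if the runner-up starts late enough to defer.
  return startUs.size () == 1 || startUs[1] - startUs[0] >= senseUs;
}

// Fraction of trials in which the earliest forwarded frame is not hit, for
// `nodes` forwarders whose delays are uniform in [0, maxJitterUs).
double SurvivalRate (int nodes, double maxJitterUs, double senseUs,
                     int trials, unsigned seed)
{
  std::mt19937 rng (seed);
  std::uniform_real_distribution<double> unit (0.0, 1.0);
  int ok = 0;
  for (int t = 0; t < trials; ++t)
    {
      std::vector<double> start (nodes);
      for (double &s : start)
        {
          s = maxJitterUs * unit (rng); // zero jitter means zero processing time
        }
      ok += FirstForwardSurvives (start, senseUs) ? 1 : 0;
    }
  return static_cast<double> (ok) / trials;
}
```

With maxJitterUs set to 0, all forwarders start at the same instant and the survival rate is zero, which matches steps 4 to 6. With a jitter of up to 100 us, most trials leave the earliest frame intact in this model.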


The symptoms are the following:
1. Successful transmission is possible only between two nodes, or
2. The successful transmission is possible only in a chain topology


Introducing the jitter looks as follows:

diff -r c9534df44a2d src/mesh/model/mesh-wifi-interface-mac.cc
--- a/src/mesh/model/mesh-wifi-interface-mac.cc	Sat Apr 13 00:04:21 2013 +0900
+++ b/src/mesh/model/mesh-wifi-interface-mac.cc	Sun Apr 14 16:21:15 2013 +0400
@@ -18,7 +18,7 @@
  * Authors: Kirill Andreev <andreev@iitp.ru>
  *          Pavel Boyko <boyko@iitp.ru>
  */
-
+#include "ns3/random-variable.h"
 #include "ns3/mesh-wifi-interface-mac.h"
 #include "ns3/mesh-wifi-beacon.h"
 #include "ns3/log.h"
@@ -269,7 +269,10 @@
   m_stats.sentFrames++;
   m_stats.sentBytes += packet->GetSize ();
   NS_ASSERT (m_edca.find (ac) != m_edca.end ());
-  m_edca[ac]->Queue (packet, hdr);
+  Time delay = MicroSeconds (UniformVariable (0,100).GetValue ());
+  std::cout << "Delay = " << delay << std::endl;
+  Simulator::Schedule(delay, &EdcaTxopN::Queue, m_edca[ac], packet, hdr);
+  //m_edca[ac]->Queue (packet, hdr);
 }
 void
 MeshWifiInterfaceMac::SendManagementFrame (Ptr<Packet> packet, const WifiMacHeader& hdr)
@@ -300,11 +303,19 @@
    */
   if (hdr.GetAddr1 () != Mac48Address::GetBroadcast ())
     {
-      m_edca[AC_VO]->Queue (packet, header);
+      Time delay = MicroSeconds (UniformVariable (0,100).GetValue ());
+      std::cout << "Delay = " << delay << std::endl;
+      Simulator::Schedule(delay, &EdcaTxopN::Queue, m_edca[AC_VO], packet, header);
+
+      //m_edca[AC_VO]->Queue (packet, header);
     }
   else
     {
-      m_edca[AC_BK]->Queue (packet, header);
+      Time delay = MicroSeconds (UniformVariable (0,100).GetValue ());
+      std::cout << "Delay = " << delay << std::endl;
+      Simulator::Schedule(delay, &EdcaTxopN::Queue, m_edca[AC_BK], packet, header);
+
+      //m_edca[AC_BK]->Queue (packet, header);
     }
 }
 SupportedRates


Of course, this is not a solution (because it may change packet order), but this dirty fix shows the cause of this strange behavior.

What I can suggest is a special queue in the mesh interface MAC, which passes packets to the wifi DCA queue with some random delay.
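
One way such a queue could keep packet order is sketched below. It is a standalone illustration, not ns-3 code and not a proposed patch: ReleaseTimes is a hypothetical helper, and times are assumed to be non-negative microseconds. Each packet is released after its own random delay, but never before the packet queued ahead of it.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of ns-3): given packet arrival times and
// per-packet random delays, both in microseconds, compute the instants at
// which a FIFO jitter queue would hand each packet to the wifi queue.
std::vector<double> ReleaseTimes (const std::vector<double> &arrivalUs,
                                  const std::vector<double> &jitterUs)
{
  assert (arrivalUs.size () == jitterUs.size ());
  std::vector<double> releaseUs;
  releaseUs.reserve (arrivalUs.size ());
  double previous = 0.0; // assumes all times are non-negative
  for (std::size_t i = 0; i < arrivalUs.size (); ++i)
    {
      // Never release a packet before the one queued ahead of it.
      double t = std::max (arrivalUs[i] + jitterUs[i], previous);
      releaseUs.push_back (t);
      previous = t;
    }
  return releaseUs;
}
```

For arrivals at 0, 10 and 20 us with delays of 90, 5 and 80 us, scheduling each packet independently would release them at 90, 15 and 100 us, so the second packet overtakes the first. The helper above yields 90, 90 and 100 us instead, which preserves the original order.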
Comment 3 Tommaso Pecorella 2013-04-14 08:45:10 UTC
To be honest, I don't like the solution too much.

I'd first go through the standard, as the fact that the packet is transmitted without a backoff sounds quite strange to me.

Without the backoff, the network is simply showing a normal behaviour: collisions and congestion. Up to the point that it's a matter of luck if a packet can be received.

I'd be extremely surprised if the standards would say that the backoff isn't necessary for packet rebroadcasting. Really surprised, as it would mean, basically, that we're 100% sure that a collision will happen in a "diamond" network.

Right now I can't look at the standard (over-busy with work). Can someone check this ?

T.
Comment 4 Kirill Andreev 2013-04-14 08:56:35 UTC
This problem has already been discussed in the past. I suggested some changes to the DCF manager (which did fix this problem!), but they were rejected because the fix was contrary to the standard.

(In reply to comment #3)
> To be honest, I don't like too much the solution.
> 
> I'd first go through the standard, as the fact that the packet is transmitted
> without a backoff sounds quite strange to me.
> 
> Without the backoff, the network is simply showing a normal behaviour:
> collisions and congestion. Up to the point that it's a matter of luck if a
> packet can be received.
> 
> I'd be extremely surprised if the standards would say that the backoff isn't
> necessary for packet rebroadcasting. Really surprised, as it would mean,
> basically, that we're 100% sure that a collision will happen in a "diamond"
> network.
> 
> Right now I can't look at the standard (over-busy with work). Can someone check
> this ?
> 
> T.
Comment 5 Kirill Andreev 2013-04-14 09:00:50 UTC
The long discussion of the problem is presented here:
https://www.nsnam.org/bugzilla/﷒0
Comment 6 Tommaso Pecorella 2013-04-14 17:13:41 UTC
I see... indeed the discussion was long.

Now, about *this* bug, I'd take another approach. Instead of putting the delay deep into the MAC, why don't we put it (as an optional attribute) into a higher layer ?

From a quick analysis (and I might be wrong), the routing protocol should be the right place.

When a packet is received, it goes up to the routing protocol and then, if needed, is re-injected for forwarding. That is the perfect place to add a random delay without changing the lower layers.
Plus, it mimics the real devices, with their random delays due to internal processing.

Sorry if I can't elaborate a patch right now, but I'm too busy to study the mesh model (which I never used for real).

Cheers,

T.
Comment 7 Nicola Baldo 2015-04-01 10:00:11 UTC
After re-reading this thread and comparing with Bug 737 and Bug 912, I am marking this as a duplicate of Bug 912.

*** This bug has been marked as a duplicate of bug 912 ***
Comment 8 Tom Henderson 2016-09-28 00:19:00 UTC
This was not a duplicate of 912 but of 1465; anyway, the patch to bug 2369 resolves this.

*** This bug has been marked as a duplicate of bug 1465 ***