Bug 1435

Summary: LTE tests do not terminate on OS X
Product: ns-3
Component: lte
Reporter: Tom Henderson <tomh>
Assignee: Manuel Requena <manuel.requena>
CC: manuel.requena, ns-bugs
Status: RESOLVED FIXED
Severity: normal
Priority: P1
Version: pre-release
Hardware: All
OS: Mac OS

Description Tom Henderson 2012-05-24 01:35:34 UTC
This seems to be limited to OS X, but the LTE tests do not run to completion. To reproduce, try:
./test.py -s lte-rlc-am-e2e
Comment 1 Tommaso Pecorella 2012-05-24 17:13:51 UTC
This looks like a real bug, possibly showing up only on Mac OS and not on other systems by luck (or bad luck).

The piece of code responsible for this is in lte-rlc-am.cc, line 706:

      if ( m_rxonBuffer[ m_vrMs.GetValue () ].m_pduComplete )
        {
          while ( m_rxonBuffer[ m_vrMs.GetValue () ].m_pduComplete )
            {
              m_vrMs++;
              NS_LOG_LOGIC ("Incr VR(MS) = " << m_vrMs);
            }
          NS_LOG_LOGIC ("New VR(MS) = " << m_vrMs);
        }

The loop never ends. The reason is that m_vrMs++ is a modulus-based increment (at 1023 it wraps back to 0), so when *all* the elements in m_rxonBuffer have the m_pduComplete flag set, the loop condition never becomes false.
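
To make the wraparound concrete, here is a minimal standalone sketch, not the ns-3 code. SeqNum, PduBuffer, CountAdvances and MakeFullBuffer are hypothetical stand-ins, and the modulus of 1024 is taken from the "at 1023 it goes back to 0" behaviour described above. A step cap is added only so that the demonstration itself terminates.

```cpp
#include <cassert>
#include <cstdint>
#include <map>

// Hypothetical stand-in for the sequence number type: wraps from 1023 to 0.
struct SeqNum
{
  explicit SeqNum (uint16_t v = 0) : value (v) {}
  SeqNum &operator++ ()
  {
    value = static_cast<uint16_t> ((value + 1) % 1024);
    return *this;
  }
  uint16_t value;
};

// Hypothetical stand-in for one reception buffer entry.
struct PduBuffer
{
  bool m_pduComplete = false;
};

// Same loop shape as in the report, but it gives up after maxSteps
// iterations so the demo always returns. Returns the number of increments.
int CountAdvances (std::map<uint16_t, PduBuffer> &rxonBuffer, SeqNum start, int maxSteps)
{
  int steps = 0;
  SeqNum vrMs = start;
  while (rxonBuffer[vrMs.value].m_pduComplete && steps < maxSteps)
    {
      ++vrMs;
      ++steps;
    }
  return steps;
}

// Builds a buffer in which every one of the 1024 entries is complete.
std::map<uint16_t, PduBuffer> MakeFullBuffer ()
{
  std::map<uint16_t, PduBuffer> buf;
  for (uint16_t sn = 0; sn < 1024; ++sn)
    {
      buf[sn].m_pduComplete = true;
    }
  return buf;
}
```

With a full buffer the loop only stops because of the cap, whatever value the cap has. As soon as one entry is incomplete, the loop stops there, even if it has to wrap past 1023 to reach it.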
 
I see two problems in this piece of code:
1) m_rxonBuffer is a map. If an element is not in the map (is this possible?), simply referencing it with operator[] will create it. The "safe" function for this case is map.find(key). See http://www.sgi.com/tech/stl/Map.html
2) there is no check for a complete scan of the map, i.e., whether the counter has wrapped all the way around.
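
Both points can be addressed in a few lines. The sketch below is only an illustration under assumptions, not the fix that was later committed: AdvancePastComplete, PduBuffer and kSnModulus are hypothetical names, and the sequence number is modelled as a plain uint16_t with modulus 1024. It uses find() so that missing entries are not created, and it bounds the scan to one full pass over the sequence number space.

```cpp
#include <cassert>
#include <cstdint>
#include <map>

// Assumed size of the sequence number space (1023 wraps to 0).
constexpr uint16_t kSnModulus = 1024;

// Hypothetical stand-in for one reception buffer entry.
struct PduBuffer
{
  bool m_pduComplete = false;
};

// Advances vrMs past consecutive complete PDUs and returns the new value.
// The map is taken by const reference, so it cannot be modified here.
uint16_t AdvancePastComplete (const std::map<uint16_t, PduBuffer> &rxonBuffer, uint16_t vrMs)
{
  // At most one full pass over the sequence number space.
  for (uint16_t scanned = 0; scanned < kSnModulus; ++scanned)
    {
      auto it = rxonBuffer.find (vrMs);
      if (it == rxonBuffer.end () || !it->second.m_pduComplete)
        {
          break; // missing or incomplete PDU: stop here
        }
      vrMs = static_cast<uint16_t> ((vrMs + 1) % kSnModulus);
    }
  return vrMs;
}
```

If every entry is complete, the function returns after 1024 steps with vrMs back at its starting value. Whether that is the right protocol behaviour is a separate question; the point here is only that the scan terminates and leaves the map untouched.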

Note: I found the problem using a complex MonteCarlo run analysis (I ran the debugger and stopped it randomly 'til I found it was stuck in the same point). I can *not* exclude similar issues elsewhere in the code.

I'm raising the bug priority, as it seems more than a simple test failing on a particular OS.
Comment 2 Manuel Requena 2012-05-25 08:20:25 UTC
According to your description, it seems to be a bug in lte-rlc-am. I will take care of it.
Comment 3 Manuel Requena 2012-05-30 12:56:03 UTC
The following changesets solve the problem:

changeset:   8828:ccee8110ddb5
tag:         tip
user:        Manuel Requena <manuel.requena@cttc.es>
date:        Wed May 30 18:04:22 2012 +0200
summary:     Fix condition of assert message

changeset:   8827:988a5b38cd6e
user:        Manuel Requena <manuel.requena@cttc.es>
date:        Wed May 30 17:51:05 2012 +0200
summary:     Protect rxonBuffer against missing PDUs

I have tested on Fedora 15 and Mac OS X 10.6.8.