Bug 1435 - LTE tests do not terminate on OS X
Status: RESOLVED FIXED
Product: ns-3
Classification: Unclassified
Component: lte
Version: pre-release
Hardware: All Mac OS
Importance: P1 normal
Assigned To: Manuel Requena
Reported: 2012-05-24 01:35 UTC by Tom Henderson
Modified: 2012-05-30 12:56 UTC (History)

Description Tom Henderson 2012-05-24 01:35:34 UTC
This seems to be limited to OS X, but the LTE tests do not run to completion. To reproduce, try:
./test.py -s lte-rlc-am-e2e
Comment 1 Tommaso Pecorella 2012-05-24 17:13:51 UTC
This looks like a genuine bug, possibly showing up only on Mac OS and not on other systems by luck (good or bad).

The piece of code responsible for this is in lte-rlc-am.cc, line 706:

      if ( m_rxonBuffer[ m_vrMs.GetValue () ].m_pduComplete )
        {
          while ( m_rxonBuffer[ m_vrMs.GetValue () ].m_pduComplete )
            {
              m_vrMs++;
              NS_LOG_LOGIC ("Incr VR(MS) = " << m_vrMs);
            }
          NS_LOG_LOGIC ("New VR(MS) = " << m_vrMs);
        }

The loop never ends. The reason is that m_vrMs++ is a modulus-based increment (at 1023 it wraps back to 0), so when *all* the elements in m_rxonBuffer have the m_pduComplete flag set, the loop condition is never false and the loop never terminates.
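
To illustrate the wrap-around, here is a minimal sketch that does not use ns-3. Seq10 and StepsUntilWrap are hypothetical names standing in for a 10-bit sequence number like m_vrMs; they are not the ns-3 classes.

```cpp
#include <cstdint>

// Hypothetical stand-in for a 10-bit sequence number (values 0..1023).
// Incrementing 1023 wraps back to 0, as described for m_vrMs above.
struct Seq10
{
  std::uint16_t value;
  Seq10& operator++ ()
  {
    value = static_cast<std::uint16_t> ((value + 1) % 1024);
    return *this;
  }
};

// Counts the increments needed to come back to the starting value.
// A loop of the form "while (entry[seq] is complete) ++seq;" has no other
// way out when every entry is complete: it just keeps going around.
int StepsUntilWrap (std::uint16_t start)
{
  std::uint16_t first = static_cast<std::uint16_t> (start % 1024);
  Seq10 seq = { first };
  int steps = 0;
  do
    {
      ++seq;
      ++steps;
    }
  while (seq.value != first);
  return steps;
}
```

After 1024 increments the counter is back where it started, so a condition that only looks at the current element can stay true forever.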
 
I see two problems in this piece of code:
1) m_rxonBuffer is a map. If an element is not in the map (is this possible?), simply referencing it with operator[] will create it. The "safe" function for this case is map.find(key). See http://www.sgi.com/tech/stl/Map.html
2) There is no check for a complete scan of the map, i.e., whether the loop has already gone all the way around.
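
A rough sketch of how both points could be addressed follows. This is not the committed fix, just an illustration under assumptions: PduEntry, kSeqModulus and AdvancePastComplete are hypothetical names, and the real buffer entry in lte-rlc-am holds more than a single flag.

```cpp
#include <cstdint>
#include <map>

// Hypothetical, simplified buffer entry (assumption: only the flag matters here).
struct PduEntry
{
  bool complete;
};

// Size of the 10-bit sequence number space.
const std::uint16_t kSeqModulus = 1024;

// Advances 'start' past consecutive complete PDUs and returns the new value.
// - find() is used instead of operator[], so a missing key is never inserted.
// - At most kSeqModulus entries are visited, so the loop ends even when
//   every entry in the buffer is marked complete.
std::uint16_t AdvancePastComplete (const std::map<std::uint16_t, PduEntry>& buffer,
                                   std::uint16_t start)
{
  std::uint16_t seq = static_cast<std::uint16_t> (start % kSeqModulus);
  for (std::uint16_t visited = 0; visited < kSeqModulus; ++visited)
    {
      std::map<std::uint16_t, PduEntry>::const_iterator it = buffer.find (seq);
      if (it == buffer.end () || !it->second.complete)
        {
          break; // missing or incomplete PDU: stop here
        }
      seq = static_cast<std::uint16_t> ((seq + 1) % kSeqModulus);
    }
  return seq;
}
```

Taking the map by const reference also makes an accidental operator[] a compile error, since std::map has no const operator[].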

Note: I found the problem using a complex Monte Carlo run analysis (I ran the debugger and stopped it at random until I found it was always stuck at the same point). I *cannot* exclude similar issues elsewhere in the code.

I'm raising the bug priority, as it seems more than a simple test failing on a particular OS.
Comment 2 Manuel Requena 2012-05-25 08:20:25 UTC
According to your description, it seems to be a bug in lte-rlc-am. I will take care of it.
Comment 3 Manuel Requena 2012-05-30 12:56:03 UTC
The following changesets solve the problem:

changeset:   8828:ccee8110ddb5
tag:         tip
user:        Manuel Requena <manuel.requena@cttc.es>
date:        Wed May 30 18:04:22 2012 +0200
summary:     Fix condition of assert message

changeset:   8827:988a5b38cd6e
user:        Manuel Requena <manuel.requena@cttc.es>
date:        Wed May 30 17:51:05 2012 +0200
summary:     Protect rxonBuffer against missing PDUs

I have tested on Fedora 15 and Mac OS 10.6.8.