Bug 596

Summary: unbounded tcp queue is evil
Product: ns-3 Reporter: Gustavo J. A. M. Carneiro <gjcarneiro>
Component: internet Assignee: ns-bugs <ns-bugs>
Status: RESOLVED FIXED    
Severity: major CC: mathieu.lacage, tomh
Priority: P5    
Version: ns-3-dev   
Hardware: All   
OS: All   
Attachments: test script
massif log file (run through ms_print to see the report)
patch

Description Gustavo J. A. M. Carneiro 2009-06-16 13:03:06 UTC
Created attachment 468 [details]
test script

Run the attached program with massif.  The resulting memory graph is like this:

    MB
122.6^                                                                 ,,#    
     |                                                              ,@@@@#    
     |                                                          ,,@@@@@@@#    
     |                                                       ,,@@@@@@@@@@#    
     |                                                    ..@@@@@@@@@@@@@#.   
     |                                                   .::@@@@@@@@@@@@@#:   
     |                                               .,: :::@@@@@@@@@@@@@#:   
     |                                           ,,@@:@: :::@@@@@@@@@@@@@#::  
     |                                         @@@@@@:@: :::@@@@@@@@@@@@@#::  
     |                                    ,, @ @@@@@@:@: :::@@@@@@@@@@@@@#:.  
     |                                  .:@@ @ @@@@@@:@: :::@@@@@@@@@@@@@#::  
     |                              ,@: ::@@ @ @@@@@@:@: :::@@@@@@@@@@@@@#::  
     |                            @:@@: ::@@ @ @@@@@@:@: :::@@@@@@@@@@@@@#::. 
     |                        ,:@ @:@@: ::@@ @ @@@@@@:@: :::@@@@@@@@@@@@@#::: 
     |                    ,,@@@:@ @:@@: ::@@ @ @@@@@@:@: :::@@@@@@@@@@@@@#::: 
     |                ,,: @@@@@:@ @:@@: ::@@ @ @@@@@@:@: :::@@@@@@@@@@@@@#::::
     |              ,@@@: @@@@@:@ @:@@: ::@@ @ @@@@@@:@: :::@@@@@@@@@@@@@#::::
     |           ,: @@@@: @@@@@:@ @:@@: ::@@ @ @@@@@@:@: :::@@@@@@@@@@@@@#:::.
     |       ., :@: @@@@: @@@@@:@ @:@@: ::@@ @ @@@@@@:@: :::@@@@@@@@@@@@@#::::
     |    ,@@:@ :@: @@@@: @@@@@:@ @:@@: ::@@ @ @@@@@@:@: :::@@@@@@@@@@@@@#::::
   0 +----------------------------------------------------------------------->Gi
     0                                                                   1.769

This is not reported as a memory leak per se; it's just that *someone* is holding on to pointers to packets, or the packet buffer caching system is going berserk. I don't know which.

I first hit this problem in ns-3.2; I then ported the test script to ns-3-dev and the problem remains.


 53  1,509,550,840      109,677,888       83,993,535    25,684,353            0
76.58% (83,993,535B) (heap allocation functions) malloc/new/new[], --alloc-fns, etc.
->46.60% (51,111,280B) 0x6A4E469: ns3::OnOffApplication::SendPacket() (ptr.h:237)
[...]
->10.72% (11,761,319B) 0x691EDD1: ns3::Buffer::Create(unsigned int) (buffer.cc:136)
| ->10.55% (11,568,620B) 0x691EDF8: ns3::Buffer::Initialize(unsigned int) (buffer.cc:245)
| | ->10.55% (11,568,620B) 0x694398E: ns3::Packet::Packet(unsigned int) (packet.cc:173)
| | | ->10.55% (11,568,620B) 0x6A4E476: ns3::OnOffApplication::SendPacket() (ptr.h:237)
[...]
Comment 1 Gustavo J. A. M. Carneiro 2009-06-16 13:04:20 UTC
Created attachment 469 [details]
massif log file (run through ms_print to see the report)
Comment 2 Mathieu Lacage 2009-06-17 09:12:04 UTC
Does the TCP tx queue hold all the packets?
Comment 3 Mathieu Lacage 2009-06-17 11:26:59 UTC
This appears to be an instance of TCP having unbounded tx buffers. If you modify your code to call CommandLine ().Parse (argv) and then run your program with --ns3::TcpSocket::SndBufSize=512, you will see fixed-size memory usage.
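
To illustrate why capping the send buffer bounds memory, here is a minimal toy model. It is not ns-3 code: TxBufferModel, SimulatePeak, and the offered/drained rates are made up for illustration, and a capacity of 0 is used here to mean "unbounded". The application offers data faster than the link drains it, which is roughly what the test script appears to do.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Toy model of a socket tx buffer (hypothetical; not the ns-3 implementation).
struct TxBufferModel {
    std::size_t capacity;  // maximum queued bytes; 0 means "no limit" in this sketch
    std::size_t queued;    // bytes currently held in the buffer
    std::size_t peak;      // largest value 'queued' has ever reached

    explicit TxBufferModel(std::size_t cap) : capacity(cap), queued(0), peak(0) {}

    // The application offers n bytes; returns how many were accepted.
    std::size_t Send(std::size_t n) {
        std::size_t accepted = n;
        if (capacity != 0) {
            accepted = std::min(n, capacity - queued);  // accept only what fits
        }
        queued += accepted;
        peak = std::max(peak, queued);
        assert(capacity == 0 || queued <= capacity);
        return accepted;
    }

    // The link transmits (and the peer acknowledges) up to n bytes.
    void Drain(std::size_t n) { queued -= std::min(n, queued); }
};

// Offers 1000 bytes per step while the link drains only 400 bytes per step
// (both rates are arbitrary). Returns the peak buffer occupancy in bytes.
std::size_t SimulatePeak(std::size_t capacity, int steps) {
    TxBufferModel buf(capacity);
    for (int i = 0; i < steps; ++i) {
        buf.Send(1000);
        buf.Drain(400);
    }
    return buf.peak;
}
```

In this model, SimulatePeak(0, steps) keeps growing as steps increases, whereas SimulatePeak(131072, steps) never exceeds 131072 bytes no matter how long the run is.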

Comment 4 Gustavo J. A. M. Carneiro 2009-06-17 12:36:17 UTC
This is seriously broken.  We have bounded device tx queues, as it should be, so why do we have unbounded tcp queues?!
Comment 5 Gustavo J. A. M. Carneiro 2009-06-18 06:25:56 UTC
According to [1], the Linux default maximum TCP send buffer size is 128k.  I propose to change it to that value.

[1] http://ipsysctl-tutorial.frozentux.net/chunkyhtml/tcpvariables.html
Comment 6 Gustavo J. A. M. Carneiro 2009-06-18 06:29:03 UTC
Created attachment 472 [details]
patch
Comment 7 Mathieu Lacage 2009-06-18 07:27:52 UTC
+1
Comment 8 Gustavo J. A. M. Carneiro 2009-06-18 09:52:52 UTC
Grr, the fix works for ns-3-dev but not for ns-3.2 :(
Comment 9 Gustavo J. A. M. Carneiro 2009-06-18 12:46:33 UTC
Ok, in ns-3.2, a 128k send buffer makes tcp behave as if it were unbounded.  Reducing the value tenfold (to 12.8k) seems to fix the problem.  32k also seems to work in ns-3.2.
Comment 10 Tom Henderson 2009-06-19 18:07:00 UTC
Are people OK with 131072 (128K) for ns-3-dev?

Note, I haven't applied this yet because it causes a regression in the ppp-level traces that I haven't figured out yet.
Comment 11 Mathieu Lacage 2009-06-26 07:11:28 UTC
(In reply to comment #10)
> Are people OK with 131072 (128K) for ns-3-dev?
> 
> Note, I haven't applied this yet because it causes a regression in the
> ppp-level traces that I haven't figured out yet.

The regression is caused by bugs in the test application, which does not handle short writes correctly.
Comment 12 Mathieu Lacage 2009-06-26 07:41:54 UTC
I rewrote the tcp test application to handle short writes. The only remaining difference in the regression files is in the ascii output, where the tcp packet sizes differ because the tx buffer is now filled differently.

changeset eb6e86305f4f
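
For readers unfamiliar with short writes: once the send buffer is bounded, a send call may accept fewer bytes than requested, and the application has to resume from where it left off. The sketch below is a self-contained toy model, not the code from the changeset above; BoundedSocketModel, WriteAll, and WriteResult are hypothetical names. In a real ns-3 application the retry would presumably be driven by the socket's send callback rather than by an explicit Drain call.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a socket with a bounded send buffer.
class BoundedSocketModel {
public:
    explicit BoundedSocketModel(std::size_t capacity)
        : m_capacity(capacity), m_queued(0) {}

    // Accepts at most the free space; may return less than n (a short write).
    std::size_t Send(std::size_t n) {
        std::size_t accepted = std::min(n, m_capacity - m_queued);
        m_queued += accepted;
        return accepted;
    }

    // Models acknowledged data leaving the buffer.
    void Drain(std::size_t n) { m_queued -= std::min(n, m_queued); }

private:
    std::size_t m_capacity;
    std::size_t m_queued;
};

struct WriteResult {
    std::size_t bytesSent;    // total bytes the socket accepted
    std::size_t shortWrites;  // number of times Send accepted less than asked
};

// Writes 'total' bytes in chunks of at most 'chunk' bytes, resuming after
// every short write instead of assuming the whole chunk was accepted.
WriteResult WriteAll(BoundedSocketModel& sock, std::size_t total,
                     std::size_t chunk, std::size_t drainPerRetry) {
    assert(chunk > 0 && drainPerRetry > 0);  // otherwise the loop cannot progress
    WriteResult result = {0, 0};
    while (result.bytesSent < total) {
        std::size_t want = std::min(chunk, total - result.bytesSent);
        std::size_t got = sock.Send(want);
        result.bytesSent += got;  // count only what was actually accepted
        if (got < want) {
            ++result.shortWrites;
            sock.Drain(drainPerRetry);  // stand-in for waiting until space frees up
        }
    }
    return result;
}
```

An application that ignores the return value of Send and always advances by the full chunk would silently lose data here. It would also fill the buffer in a different pattern, which is consistent with the differing packet sizes seen in the ascii traces.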