Bug 1167

Summary: TCP Socket Deferring CLOSE forever.
Product: ns-3 Reporter: Frederic Urbani <frederic.urbani>
Component: tcpAssignee: natale.patriciello
Status: RESOLVED INVALID    
Severity: normal CC: natale.patriciello, ns-bugs, tomh, tommaso.pecorella
Priority: P5    
Version: ns-3.10   
Hardware: All   
OS: All   
Attachments: tcap file
NS3 Logs
Testcase using only NS-3-DEV
Updated test (ns-3.25, ns-3-dev)

Description Frederic Urbani 2011-05-19 08:34:15 UTC
Created attachment 1144 [details]
tcap file

My Test case :

Using DCE I have a client and server connected via a socket using de loopback,
the server write within a loop to the socket until timeout,
and the client read slowly the socket using a sleep between reads,
when the server write timeout because of the saturation of out buffers, it stop and close the socket,
the client should detect the socket close and stop but it is not the case 
Pseudo Code:

CLIENT:
  sleep(1)

  sock = socket (AF_INET, SOCK_STREAM, 0);

  fill_addr(ad, 1236);
  status = connect (sock, (struct sockaddr *) &ad, sizeof(ad) );

  tot = 0;
  do
    {
      // Slow read until remote closure
      sleep (10);
      status = recv (sock, readBuffer, TEST3_LEN, 0);
    }
  while (status > 0);

  status = close (sock);

  printf("Client3: end\n ");


SERVER:
  sock = socket (AF_INET, SOCK_STREAM, 0);

  status = bind (sock, (struct sockaddr *) &ad, sizeof(ad));

  status = listen (sock, 1);
  sockin = accept (sock, NULL, NULL);
  status = close (sock);

  struct timeval tiout;

  tiout.tv_sec = 3;

  status = setsockopt (sockin, SOL_SOCKET, SO_SNDTIMEO, &tiout, sizeof(tiout));

  do
    {
      status = send (sockin, sendBuffer, TEST3_LEN, 0);
    }
  while (status > 0);

  status = close (sockin);

--------------------------------------

I put pcap traces and NS3 logs as attachements.

In NS3 log I can notice this suspicious lines :

1.2s 0 1.2 [node 0] TcpSocketBase:SendPendingData(): SendPendingData sent 0 packets
1.2s 0 1.2 [node 0] TcpSocketBase:SendPendingData(): SendPendingData sent 0 packets
1.2s 0 1.2 [node 0] TcpSocketBase:SendPendingData(): SendPendingData sent 0 packets
1.2s 0 1.2 [node 0] TcpSocketBase:SendPendingData(): SendPendingData sent 0 packets
1.2s 0 1.2 [node 0] TcpSocketBase:SendPendingData(): SendPendingData sent 0 packets
2.1234s 0 2.1234 [node 0] TcpSocketBase:SendPendingData(): SendPendingData sent 0 packets

7.00008s 0 7.00008 [node 0] TcpSocketBase:Close(): Socket 0xcf1400 deferring close, state ESTABLISHED

but after the FIN is never sended and the simulation end without more events...

At this time I am not able to fix this problem, is it an already known problem ?
Comment 1 Frederic Urbani 2011-05-19 08:34:39 UTC
Created attachment 1145 [details]
NS3 Logs
Comment 2 Adrian S.-W. Tam 2011-09-29 13:20:41 UTC
This shall be fixed by the patch in bug 1274. I would appreciate if the original test program is provided to verify this.
Comment 3 Frederic Urbani 2011-09-30 04:38:31 UTC
(In reply to comment #2)
> This shall be fixed by the patch in bug 1274. I would appreciate if the
> original test program is provided to verify this.

With the patch I obtain the same result :

  17.0001s 0 17.0001 [node 0] TcpSocketBase:Close(): Socket 0x230fc80 deferring close, state ESTABLISHED

You should install DCE :

  1. checkout http://code.nsnam.org/furbani/ns-3-dce/

  2. install with help of script ns-3-dce/utils/clone_and_compile_ns3_dce.sh
     
  3. do not forget to set env vars : .  ns-3-dce/utils/setenv.sh

  4. test with NS_LOG=\*  ./build/bin/ns3test-dce 

  5. See the result: cat files-0/var/log/*/*

  6. the test source is in : test/test-tcp-client.cc (see client3 and server3 functions)
Comment 4 Frederic Urbani 2011-10-12 10:14:49 UTC
(In reply to comment #3)
> (In reply to comment #2)
> > This shall be fixed by the patch in bug 1274. I would appreciate if the
> > original test program is provided to verify this.
> 
> With the patch I obtain the same result :
> 
>   17.0001s 0 17.0001 [node 0] TcpSocketBase:Close(): Socket 0x230fc80 deferring
> close, state ESTABLISHED
> 
> You should install DCE :
> 
>   1. checkout http://code.nsnam.org/furbani/ns-3-dce/
> 
>   2. install with help of script ns-3-dce/utils/clone_and_compile_ns3_dce.sh
> 
>   3. do not forget to set env vars : .  ns-3-dce/utils/setenv.sh
> 
>   4. test with NS_LOG=\*  ./build/bin/ns3test-dce 
> 
>   5. See the result: cat files-0/var/log/*/*
> 
>   6. the test source is in : test/test-tcp-client.cc (see client3 and server3
> functions)

 test/test-tcp-socket.cc and not test/test-tcp-client.cc :(
Comment 5 Frederic Urbani 2011-10-13 10:12:35 UTC
Created attachment 1256 [details]
Testcase using only NS-3-DEV
Comment 6 natale.patriciello 2015-10-27 06:48:33 UTC
I'm not sure on what the behavior should be in this case. Do you have any hints?
Comment 7 natale.patriciello 2016-05-18 11:36:53 UTC
Created attachment 2438 [details]
Updated test (ns-3.25, ns-3-dev)

I have corrected that testcase, and re-run it. The output is:

0.016s BUG:ServerHandleSend(): Server Send --> 131072
0.016s BUG:ServerHandleSend(): Server Closing socket after first send
BUG:main(): Simulation with early close, client see close : YES Total Recv Len 0
0.018s BUG:ServerHandleSend(): Server Send --> 131072
0.018s BUG:ServerHandleSend(): Server Send --> 536
0.018s BUG:ServerHandleSend(): Server Send --> 1072
[cut]
0.018s BUG:ServerHandleSend(): Server Send --> 824
15.218s BUG:ServerHandleSend(): Server Send --> 536
15.218s BUG:ServerHandleSend(): Server Send --> 1072
[cut]
15.218s BUG:ServerHandleSend(): Server Send --> 824
30.418s BUG:ServerHandleSend(): Server Send --> -1
[cut]
BUG:main(): Simulation with tardive close, client see close : YES Total Recv Len 262144
Comment 8 natale.patriciello 2016-05-18 11:38:44 UTC
Reopen if a new testcase proving the problem is provided; a simple approach (suitable for students) to test this Close handling is to insert a check in tcp-general-test.cc after the close callback is invoked.
Comment 9 Tommaso Pecorella 2016-05-19 15:29:57 UTC
For the records (and to understand the problem).

The socket receiving the FIN (i.e., the one that did NOT start the Close procedure) will call CloseAndNotify immediately, thus triggering the Close callback.

The socket sending the FIN will have to wait a timer to close for real - because the standard say so.
This timer is set to 2 * MSL (https://en.wikipedia.org/wiki/Maximum_segment_lifetime). MSL is (arbitrarily) set to 2 minutes *in the standard itself*. As a consequence, the socket initiating the Close will notify the upper layers 4 minutes after the Close has been done.

This is normal.

Note that MaxSegLifetime is a TcpSocketBase attribute, and the user can change it.
Comment 10 natale.patriciello 2016-05-19 16:11:39 UTC
note that before, there was a condition in which the Close call was not called, even after 2*MSL. Now the problem seems gone (I don't know exactly what commit fixed it)