Bug 2243 - TCP Socket Fork() fails to copy some parameters, causing connections to close prematurely on retransmit.
Status: RESOLVED FIXED
Product: ns-3
Classification: Unclassified
Component: internet
Version: ns-3-dev
Hardware: All All
Importance: P5 blocker
Assigned To: natale.patriciello
Depends on:
Blocks:
Reported: 2015-12-07 15:13 UTC by l.salameh
Modified: 2015-12-14 19:14 UTC (History)


Attachments
initialize all members (2.99 KB, patch)
2015-12-11 10:12 UTC, natale.patriciello

Description l.salameh 2015-12-07 15:13:39 UTC
In an experiment where a TCP socket is forked from a listening socket to handle an incoming connection, any drops and duplicate ACKs on that connection cause the socket to close immediately instead of performing a retransmit.

The culprit is the copy constructor in tcp-socket-base.cc. When we Fork() a new connection, the copy constructor does not copy the m_dataRetries member. As a result, when the retransmit function checks its value, it is zero, and the connection is terminated.

Easy fix:

--- a/src/internet/model/tcp-socket-base.cc	Mon Dec 07 19:52:59 2015 +0000
+++ b/src/internet/model/tcp-socket-base.cc	Mon Dec 07 20:11:11 2015 +0000
@@ -298,6 +298,7 @@
     m_delAckMaxCount (sock.m_delAckMaxCount),
     m_noDelay (sock.m_noDelay),
     m_synRetries (sock.m_synRetries),
+    m_dataRetries (sock.m_dataRetries),
     m_delAckTimeout (sock.m_delAckTimeout),
     m_persistTimeout (sock.m_persistTimeout),
     m_cnTimeout (sock.m_cnTimeout),
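
To illustrate the failure mode outside ns-3, here is a minimal sketch. BuggySocket, FixedSocket, CanRetransmit and RunForkSketch are hypothetical stand-ins, not ns-3 code. The retransmit check is a simplified assumption, and the in-class default of 0 is an assumption that makes the omitted member deterministic, matching the zero value observed in this report.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the socket before the fix (not ns-3 code).
struct BuggySocket
{
  uint32_t m_synRetries = 0;
  uint32_t m_dataRetries = 0; // assumed default so the sketch is deterministic

  BuggySocket (uint32_t synRetries, uint32_t dataRetries)
    : m_synRetries (synRetries), m_dataRetries (dataRetries) {}

  // Hand-written copy constructor that omits m_dataRetries:
  // the copy silently falls back to the default of 0.
  BuggySocket (const BuggySocket &sock)
    : m_synRetries (sock.m_synRetries) {}

  BuggySocket Fork () const { return BuggySocket (*this); }
};

// Hypothetical stand-in for the socket after the fix.
struct FixedSocket
{
  uint32_t m_synRetries = 0;
  uint32_t m_dataRetries = 0;

  FixedSocket (uint32_t synRetries, uint32_t dataRetries)
    : m_synRetries (synRetries), m_dataRetries (dataRetries) {}

  // Every member appears in the initializer list.
  FixedSocket (const FixedSocket &sock)
    : m_synRetries (sock.m_synRetries), m_dataRetries (sock.m_dataRetries) {}

  FixedSocket Fork () const { return FixedSocket (*this); }
};

// Simplified assumption of the retransmit check: with no data
// retries available, the connection is closed instead of resending.
template <typename Socket>
bool CanRetransmit (const Socket &sock)
{
  return sock.m_dataRetries > 0;
}

// Exercises both variants; the asserts document the expected outcome.
void RunForkSketch ()
{
  BuggySocket listener (6, 6);
  assert (CanRetransmit (listener));
  assert (!CanRetransmit (listener.Fork ())); // forked copy lost its retry budget

  FixedSocket fixedListener (6, 6);
  assert (CanRetransmit (fixedListener.Fork ())); // retry budget survives Fork()
}
```

The listening socket itself keeps working in this sketch; only the forked copy loses the value, which is consistent with the bug showing up only on accepted connections.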
Comment 1 Tommaso Pecorella 2015-12-07 15:45:58 UTC
I agree with the analysis, but... how is it possible that this bug wasn't reported before?
Comment 2 l.salameh 2015-12-07 15:49:28 UTC
I think the change that added dataRetries is relatively recent, so perhaps not many people have been using top-of-tree ns-3 and run into it.
Comment 3 Tommaso Pecorella 2015-12-07 15:54:03 UTC
Mmmm... it seems that there are more variables not copied. E.g.:
  Time              m_minRto;          //!< minimum value of the Retransmit timeout
  Time              m_clockGranularity; //!< Clock Granularity used in RTO calc
These will cause trouble as well.

Changing the bug topic accordingly and handing it over to Natale (he did the TCP refactoring).
Comment 4 natale.patriciello 2015-12-11 07:53:36 UTC
I have a patch ready for this that initializes all members of TcpSocket*. However, valgrind reports errors in many simulations; I'm trying to find the source.

Nat
Comment 5 natale.patriciello 2015-12-11 10:12:16 UTC
Created attachment 2204
initialize all members

Patch which initializes all members
Comment 6 Tommaso Pecorella 2015-12-11 17:41:03 UTC
Are the valgrind tests passing?
Can we push this, or do you want to add a unit test for it?


(In reply to natale.patriciello from comment #5)
> Created attachment 2204
> initialize all members
> 
> Patch which initializes all members
Comment 7 natale.patriciello 2015-12-14 04:15:27 UTC
No, they aren't, but I don't think the problem is in this patch.
Comment 8 natale.patriciello 2015-12-14 06:16:33 UTC
This is my list of VALGR failures under current ns-3-dev; they are the same with and without patch:

List of VALGR failures:
    aggregation-wifi
    angles
    animation-interface
    aodv-routing-id-cache
    attributes
    average
    basic-data-calculators
    basic-energy-harvester
    buffer
    build-profile
    building-position-allocator
    buildings-helper
    buildings-pathloss-test
    buildings-shadowing-test
    callback
    codel-queue
    command-line
    config
    cosine-antenna-model
    csma-system
    degrees-radians
    devices-mesh
    devices-mesh-dot11s
    devices-mesh-dot11s-regression
    devices-mesh-flame
    devices-mesh-flame-regression
    devices-point-to-point
    devices-uan
    devices-wifi
    devices-wifi-dcf
    devices-wifi-tx-duration
    double-probe
    drop-tail-queue
    epc-gtpu
    epc-s1u-downlink
    epc-s1u-uplink
    eps-tft-classifier
    error-model
    event-garbage-collector
    examples/energy/energy-model-example
    examples/error-model/simple-error-model
    examples/ipv6/icmpv6-redirect
    examples/ipv6/ping6
    examples/ipv6/radvd
    examples/ipv6/radvd-two-prefix
    examples/ipv6/test-ipv6
    examples/naming/object-names
    examples/realtime/realtime-udp-echo
    examples/routing/dynamic-global-routing
    examples/routing/global-injection-slash32
    examples/routing/global-routing-slash32
    examples/routing/mixed-global-routing
    examples/routing/simple-alternate-routing
    examples/routing/simple-global-routing
    examples/routing/simple-routing-ping6
    examples/routing/static-routing-slash32
    examples/stats/wifi-example-sim
    examples/tcp/star
    examples/tcp/tcp-large-transfer
    examples/tcp/tcp-star-server
    examples/tcp/tcp-variants-comparison
    examples/tutorial/fifth
    examples/tutorial/first
    examples/tutorial/fourth
    examples/tutorial/hello-simulator
    examples/tutorial/second
    examples/tutorial/seventh
    examples/tutorial/sixth
    examples/tutorial/third
    examples/udp/udp-echo
    examples/wireless/mixed-wireless
    examples/wireless/multirate --totalTime=0.3s --rateManager=ns3::AarfcdWifiManager
    examples/wireless/multirate --totalTime=0.3s --rateManager=ns3::AmrrWifiManager
    examples/wireless/multirate --totalTime=0.3s --rateManager=ns3::CaraWifiManager
    examples/wireless/multirate --totalTime=0.3s --rateManager=ns3::IdealWifiManager
    examples/wireless/multirate --totalTime=0.3s --rateManager=ns3::MinstrelWifiManager
    examples/wireless/multirate --totalTime=0.3s --rateManager=ns3::OnoeWifiManager
    examples/wireless/multirate --totalTime=0.3s --rateManager=ns3::RraaWifiManager
    examples/wireless/ofdm-ht-validation
    examples/wireless/ofdm-validation
    examples/wireless/ofdm-vht-validation
    examples/wireless/power-adaptation-distance --manager=ns3::AparfWifiManager --outputFileName=aparf --steps=5 --stepsSize=10
    examples/wireless/power-adaptation-distance --manager=ns3::ParfWifiManager --outputFileName=parf --steps=5 --stepsSize=10
    examples/wireless/wifi-ap --verbose=0
    examples/wireless/wifi-simple-adhoc
    examples/wireless/wifi-simple-adhoc-grid
    examples/wireless/wifi-simple-infra
    examples/wireless/wifi-simple-interference
    examples/wireless/wifi-wired-bridging
    geo-to-cartesian
    global-route-manager-impl
    global-value
    hash
    histogram
    int64x64
    ipv4-address-generator
    ipv4-address-helper
    ipv4-forwarding
    ipv4-fragmentation
    ipv4-global-routing
    ipv4-header
    ipv4-list-routing
    ipv4-packet-info-tag
    ipv4-protocol
    ipv4-raw
    ipv4-static-routing
    ipv6-address
    ipv6-address-generator
    ipv6-address-helper
    ipv6-dual-stack
    ipv6-extension-header
    ipv6-forwarding
    ipv6-fragmentation
    ipv6-list-routing
    ipv6-packet-info-tag
    ipv6-protocol
    ipv6-raw
    ipv6-ripng
    isotropic-antenna-model
    itu-r-1411-los
    itu-r-1411-nlos-over-rooftop
    kun-2600-mhz
    li-ion-energy-source
    lr-wpan-ack
    lr-wpan-clear-channel-assessment
    lr-wpan-collision
    lr-wpan-energy-detection
    lr-wpan-error-model
    lr-wpan-packet
    lr-wpan-plme-pd-sap
    lr-wpan-spectrum-value-helper
    lte-antenna
    lte-cell-selection
    lte-cqa-ff-mac-scheduler
    lte-cqi-generation
    lte-downlink-power-control
    lte-downlink-sinr
    lte-earfcn
    lte-epc-e2e-data
    lte-handover-delay
    lte-handover-target
    lte-harq
    lte-interference
    lte-interference-fr
    lte-link-adaptation
    lte-mimo
    lte-pathloss-model
    lte-phy-error-model
    lte-rlc-am-e2e
    lte-rlc-am-transmitter
    lte-rlc-header
    lte-rlc-um-e2e
    lte-rlc-um-transmitter
    lte-rrc
    lte-spectrum-value-helper
    lte-tdbet-ff-mac-scheduler
    lte-test-deactivate-bearer
    lte-ue-measurements
    lte-ue-measurements-handover
    lte-ue-measurements-piecewise-1
    lte-ue-measurements-piecewise-2
    lte-uplink-power-control
    lte-uplink-sinr
    lte-x2-handover
    lte-x2-handover-measures
    mobility
    mobility-ns2-trace-helper
    mobility-trace
    ns3-tcp-loss
    ns3-tcp-no-delay
    ns3-tcp-socket
    ns3-tcp-state
    ns3-wifi-interference
    ns3-wifi-msdu-aggregator
    object
    object-name-service
    okumura-hata
    packet
    packet-metadata
    packet-socket-apps
    packetbb-test-suite
    parabolic-antenna-model
    pcap-file
    power-rate-adaptation-wifi
    propagation-loss-model
    ptr
    rand-cart-around-geo
    random-number-generators
    random-variable-stream-generators
    red-queue
    rocketfuel-topology-reader
    routing-aodv
    routing-aodv-loopback
    routing-aodv-regression
    routing-dsdv
    routing-dsr
    routing-olsr
    routing-olsr-header
    routing-olsr-regression
    rtt-estimator
    sample
    sequence-number
    simulator
    sixlowpan-fragmentation
    sixlowpan-hc1
    sixlowpan-iphc
    spectrum-converter
    spectrum-ideal-phy
    spectrum-interference
    spectrum-value
    src/aodv/examples/aodv
    src/bridge/examples/csma-bridge
    src/bridge/examples/csma-bridge-one-hop
    src/buildings/examples/buildings-pathloss-profiler
    src/core/examples/main-callback
    src/core/examples/main-ptr
    src/core/examples/sample-random-variable
    src/core/examples/sample-simulator
    src/csma/examples/csma-broadcast
    src/csma/examples/csma-multicast
    src/csma/examples/csma-one-subnet
    src/csma/examples/csma-packet-socket
    src/csma/examples/csma-ping
    src/csma/examples/csma-raw-ip-socket
    src/energy/examples/li-ion-energy-source
    src/energy/examples/rv-battery-model-test
    src/fd-net-device/examples/dummy-network
    src/fd-net-device/examples/fd2fd-onoff
    src/internet/examples/main-simple
    src/lr-wpan/examples/lr-wpan-data
    src/lr-wpan/examples/lr-wpan-error-distance-plot
    src/lr-wpan/examples/lr-wpan-error-model-plot
    src/lr-wpan/examples/lr-wpan-packet-print
    src/lr-wpan/examples/lr-wpan-phy-test
    src/lte/examples/lena-cqi-threshold
    src/lte/examples/lena-dual-stripe
    src/lte/examples/lena-dual-stripe --epc=1 --fadingTrace=../../src/lte/model/fading-traces/fading_trace_EPA_3kmph.fad --simTime=0.01
    src/lte/examples/lena-dual-stripe --epc=1 --simTime=0.0 --nApartmentsX=1 --homeEnbDeploymentRatio=0.5 --nMacroEnbSites=0 --macroUeDensity=0 --nBlocks=1
    src/lte/examples/lena-dual-stripe --epc=1 --simTime=0.01
    src/lte/examples/lena-dual-stripe --epc=1 --useUdp=0 --simTime=0.01
    src/lte/examples/lena-dual-stripe --nBlocks=1  --nMacroEnbSites=0 --macroUeDensity=0 --homeEnbDeploymentRatio=1 --homeEnbActivationRatio=1 --homeUesHomeEnbRatio=2 --macroEnbTxPowerDbm=0 --simTime=0.01
    src/lte/examples/lena-dual-stripe --nMacroEnbSites=0 --macroUeDensity=0 --nBlocks=1 --nApartmentsX=4 --nMacroEnbSitesX=0 --homeEnbDeploymentRatio=1 --homeEnbActivationRatio=1 --macroEnbTxPowerDbm=0 --epcDl=1 --epcUl=0 --epc=1 --numBearersPerUe=4 --homeUesHomeEnbRatio=15 --simTime=0.01
    src/lte/examples/lena-dual-stripe --simTime=0.0 --nApartmentsX=1 --homeEnbDeploymentRatio=0.5 --nMacroEnbSites=0 --macroUeDensity=0 --nBlocks=1
    src/lte/examples/lena-dual-stripe --simTime=0.01
    src/lte/examples/lena-fading
    src/lte/examples/lena-intercell-interference --simTime=0.1
    src/lte/examples/lena-pathloss-traces
    src/lte/examples/lena-profiling
    src/lte/examples/lena-profiling --simTime=0.1 --nUe=2 --nEnb=5 --nFloors=0
    src/lte/examples/lena-profiling --simTime=0.1 --nUe=3 --nEnb=6 --nFloors=1
    src/lte/examples/lena-rem
    src/lte/examples/lena-rem-sector-antenna
    src/lte/examples/lena-rlc-traces
    src/lte/examples/lena-simple
    src/lte/examples/lena-simple-epc
    src/lte/examples/lena-simple-epc --simTime=1.1 --ns3::LteHelper::Scheduler=ns3::FdBetFfMacScheduler
    src/lte/examples/lena-simple-epc --simTime=1.1 --ns3::LteHelper::Scheduler=ns3::FdMtFfMacScheduler
    src/lte/examples/lena-simple-epc --simTime=1.1 --ns3::LteHelper::Scheduler=ns3::FdTbfqFfMacScheduler
    src/lte/examples/lena-simple-epc --simTime=1.1 --ns3::LteHelper::Scheduler=ns3::PfFfMacScheduler
    src/lte/examples/lena-simple-epc --simTime=1.1 --ns3::LteHelper::Scheduler=ns3::PssFfMacScheduler
    src/lte/examples/lena-simple-epc --simTime=1.1 --ns3::LteHelper::Scheduler=ns3::RrFfMacScheduler
    src/lte/examples/lena-simple-epc --simTime=1.1 --ns3::LteHelper::Scheduler=ns3::TdBetFfMacScheduler
    src/lte/examples/lena-simple-epc --simTime=1.1 --ns3::LteHelper::Scheduler=ns3::TdMtFfMacScheduler
    src/lte/examples/lena-simple-epc --simTime=1.1 --ns3::LteHelper::Scheduler=ns3::TdTbfqFfMacScheduler
    src/lte/examples/lena-simple-epc --simTime=1.1 --ns3::LteHelper::Scheduler=ns3::TtaFfMacScheduler
    src/lte/examples/lena-x2-handover
    src/mesh/examples/mesh
    src/mobility/examples/main-grid-topology
    src/mobility/examples/main-random-topology
    src/mobility/examples/main-random-walk
    src/network/examples/main-packet-header
    src/network/examples/main-packet-tag
    src/network/examples/red-tests
    src/nix-vector-routing/examples/nix-simple
    src/olsr/examples/simple-point-to-point-olsr
    src/spectrum/examples/adhoc-aloha-ideal-phy
    src/spectrum/examples/adhoc-aloha-ideal-phy-matrix-propagation-loss-model
    src/spectrum/examples/adhoc-aloha-ideal-phy-with-microwave-oven
    src/stats/examples/double-probe-example
    src/stats/examples/file-aggregator-example
    src/stats/examples/file-helper-example
    src/stats/examples/gnuplot-aggregator-example
    src/stats/examples/gnuplot-helper-example
    src/uan/examples/uan-cw-example
    src/uan/examples/uan-rc-example
    src/virtual-net-device/examples/virtual-net-device
    src/wave/examples/wave-simple-80211p
    src/wave/examples/wave-simple-device
    src/wimax/examples/wimax-ipv4
    src/wimax/examples/wimax-multicast
    src/wimax/examples/wimax-simple
    steady-state-rwp-mobility-model
    tcp
    tcp-cong-avoid-test
    tcp-endpoint-bug2211-test
    tcp-fast-retr-test
    tcp-header
    tcp-highspeed-test
    tcp-hybla-test
    tcp-option
    tcp-rto-test
    tcp-slow-start-test
    tcp-timestamp
    tcp-wscaling
    tcp-zero-window-test
    test-asn1-encoding
    threaded-simulator
    time
    timer
    traced-callback
    traced-callback-typedef
    traced-value-callback
    tv-helper-distribution
    tv-spectrum-transmitter
    type-id
    type-traits
    uan-energy-model
    udp
    udp-client-server
    watchdog
    wave-mac-extension
    waveform-generator
    waypoint-mobility-model
    wifi-80211p-ocb
    wifi-block-ack
    wimax-fragmentation
    wimax-mac-messages
    wimax-phy-layer
    wimax-qos
    wimax-service-flow
    wimax-ss-mac-layer
    wimax-tlv
Comment 9 Tommaso Pecorella 2015-12-14 18:33:17 UTC
(In reply to natale.patriciello from comment #8)
> This is my list of VALGR failures under current ns-3-dev; they are the same
> with and without patch:

Basically all of them. Either you changed something else as well, or one of your libraries is generating the errors; this happens on some OSes.

I checked on a Linux box and all seems to be fine.

Pushed in changeset:   11784:72e40787d4c6
Comment 10 Tom Henderson 2015-12-14 19:14:48 UTC
(In reply to natale.patriciello from comment #7)
> No, they aren't, but I don't think the problem is in this patch.

On Linux, make sure that you pass --disable-gtk at configure time when using valgrind; otherwise all valgrind tests will fail.