We can use similar code to demonstrate that the TCP stack performs a fast retransmit after having received three duplicate acknowledgments. This script is available from :download:`/exercises/packetdrill_scripts/frr.pkt`.
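The duplicate acknowledgment rule can be sketched in a few lines of Python. This is a simplified model, not the Linux implementation: the function name ``fast_retransmit_points`` is hypothetical, and the sketch ignores selective acknowledgments and the congestion window.

```python
DUPACK_THRESHOLD = 3  # duplicate acknowledgments needed to trigger a fast retransmit


def fast_retransmit_points(acks):
    """Return the acknowledgment numbers for which a sender would fast retransmit.

    `acks` is the sequence of cumulative acknowledgment numbers received.
    """
    retransmitted = []
    last_ack = None
    dup_count = 0
    for ack in acks:
        if ack == last_ack:
            dup_count += 1
            # Retransmit exactly once, when the threshold is reached.
            if dup_count == DUPACK_THRESHOLD:
                retransmitted.append(ack)
        else:
            # A new acknowledgment resets the duplicate counter.
            last_ack = ack
            dup_count = 0
    return retransmitted


# One original acknowledgment for byte 1001 followed by three duplicates.
print(fast_retransmit_points([1001, 1001, 1001, 1001]))  # prints [1001]
```

With only two duplicates the function returns an empty list, which corresponds to the case where the sender has to wait for its retransmission timer.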
We can now explore how TCP's retransmission techniques interact with the congestion control scheme. The Linux TCP code that combines these two techniques contains several heuristics to improve their performance. We start with a transfer of 8 KBytes where the penultimate segment is not received by the remote host. In this case, only one segment follows the lost one, so TCP does not receive enough duplicate acknowledgments to trigger the fast retransmit and it must wait for the expiration of the retransmission timer. This script is available from :download:`/exercises/packetdrill_scripts/slow-start-rto2.pkt`.
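The following sketch illustrates why the position of the lost segment matters. It assumes that the transfer is split into eight segments, that the receiver acknowledges every segment, and that all segments sent after the lost one arrive. The function names are hypothetical and the model ignores the heuristics of the Linux stack.

```python
DUPACK_THRESHOLD = 3  # duplicate acknowledgments needed for a fast retransmit


def duplicate_acks_after_loss(total_segments, lost_index):
    """Count the duplicate acknowledgments caused by one lost segment.

    `lost_index` is 1-based. Each segment received after the lost one
    triggers one duplicate acknowledgment (simplifying assumption).
    """
    return total_segments - lost_index


def recovery_mechanism(total_segments, lost_index):
    """Tell which mechanism would recover the lost segment in this model."""
    dupacks = duplicate_acks_after_loss(total_segments, lost_index)
    if dupacks >= DUPACK_THRESHOLD:
        return "fast retransmit"
    return "retransmission timer"


# Eight segments, penultimate one lost: a single duplicate acknowledgment.
print(recovery_mechanism(8, 7))  # prints retransmission timer
# Eight segments, second one lost: six duplicate acknowledgments.
print(recovery_mechanism(8, 2))  # prints fast retransmit
```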
We can now close the connection gracefully. Let us first inject a segment with the ``FIN`` flag set.
Unless otherwise noted, we assume for the questions in this section that the following conditions hold.
To understand the operation of the TCP congestion control, it is often useful to write time-sequence diagrams for different scenarios. The example below shows the operation of the TCP congestion control scheme in a very simple scenario. The initial congestion window (``cwnd``) is set to 1000 bytes and the receive window (``rwin``) advertised by the receiver (supposed constant for the entire connection) is set to 2000 bytes. The slow-start threshold (``ssthresh``) is set to 64000 bytes.
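Before drawing such a diagram, it can help to compute the windows round by round. The sketch below uses the values of this scenario and assumes an MSS of 1000 bytes, no losses, and one acknowledgment per segment. The function ``slow_start_rounds`` is hypothetical. It only models slow-start and leaves ``cwnd`` unchanged once it reaches ``ssthresh``.

```python
def slow_start_rounds(cwnd, rwin, ssthresh, mss, rounds):
    """Return a list of (cwnd, bytes_sent) pairs, one per round-trip time."""
    history = []
    for _ in range(rounds):
        # The sender is limited by both the congestion and the receive window.
        window = min(cwnd, rwin)
        segments = window // mss
        history.append((cwnd, window))
        # Slow-start: cwnd grows by one MSS for each acknowledged segment.
        for _ in range(segments):
            if cwnd < ssthresh:
                cwnd += mss
    return history


for rtt, (cwnd, sent) in enumerate(slow_start_rounds(1000, 2000, 64000, 1000, 4)):
    print(f"RTT {rtt}: cwnd={cwnd} bytes, sent={sent} bytes")
```

In this model the receive window becomes the limiting factor from the second round-trip time onwards: ``cwnd`` keeps growing, but the sender never has more than 2000 bytes in flight.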
This TCP segment is sent immediately by the stack. The ``SYN`` flag is set and the dot next to the ``S`` character indicates that the ``ACK`` flag is also set. The SYN+ACK segment does not contain any data but its acknowledgment number is set to 1 (relative to the initial sequence number). For outgoing packets, packetdrill_ does not verify the value of the advertised window. In this line, it also accepts any TCP options (``<...>``).
The variables that are included in TCP_INFO are defined in https://github.com/torvalds/linux/blob/master/include/uapi/linux/tcp.h
the value of the duplicate acknowledgment threshold is fixed and set to 3
the transmission delay for a TCP acknowledgment is negligible
The third segment of the three-way handshake is sent by packetdrill_ after a delay of 0.1 seconds. The connection is now established and the accept system call will succeed.
The ``tcpi_state`` variable used in this script is returned by ``TCP_INFO`` [#ftcpinfo]_. It tracks the state of the TCP connection according to TCP's finite state machine [#fstates]_. This script is available from :download:`/exercises/packetdrill_scripts/client.pkt`.
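Outside packetdrill_, the same variable can be read from a regular program. The sketch below assumes a Linux host, where ``tcpi_state`` is the first byte of the ``tcp_info`` structure and the state numbers are those of ``tcp_states.h``. The helper names and the buffer size of 104 bytes are choices made for this illustration.

```python
import socket

# State numbers as listed in include/net/tcp_states.h of the Linux kernel.
TCP_STATES = {
    1: "TCP_ESTABLISHED",
    2: "TCP_SYN_SENT",
    3: "TCP_SYN_RECV",
    4: "TCP_FIN_WAIT1",
    5: "TCP_FIN_WAIT2",
    6: "TCP_TIME_WAIT",
    7: "TCP_CLOSE",
    8: "TCP_CLOSE_WAIT",
    9: "TCP_LAST_ACK",
    10: "TCP_LISTEN",
    11: "TCP_CLOSING",
    12: "TCP_NEW_SYN_RECV",
}


def decode_tcpi_state(tcp_info_bytes):
    """Return the name of the state stored in the first byte of a tcp_info buffer."""
    return TCP_STATES.get(tcp_info_bytes[0], "UNKNOWN")


def read_tcpi_state(sock):
    """Query TCP_INFO on a TCP socket (Linux only) and decode tcpi_state."""
    raw = sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_INFO, 104)
    return decode_tcpi_state(raw)
```

Calling ``read_tcpi_state`` on a connected socket should return ``TCP_ESTABLISHED``, which is the value that the packetdrill_ script checks after the three-way handshake.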
These states are defined in https://github.com/torvalds/linux/blob/master/include/net/tcp_states.h
the sender/receiver performs a single :manpage:`send(3)` of `x` bytes
the round-trip-time is fixed and does not change during the lifetime of the TCP connection. We assume a fixed value of 100 milliseconds for the round-trip-time and a fixed value of 200 milliseconds for the retransmission timer.
The `Push` flag is one of the TCP flags defined in :rfc:`793`. TCP stacks usually set this flag when transmitting a segment that empties the send buffer. This is the reason why we observe this push flag in our example.
The :manpage:`accept` system call returns a new file descriptor, in this case value ``4``. At this point, packetdrill_ can write data on the socket or inject packets.
the initial value of the congestion window is one MSS-sized segment
The examples above have demonstrated how TCP retransmits lost segments. However, they did not consider the interactions with the congestion control scheme since they use a large initial congestion window. We now set the initial congestion window to two MSS-sized segments and use the ``tcpi_snd_cwnd`` and ``tcpi_snd_ssthresh`` variables from ``TCP_INFO`` to explore the evolution of the TCP congestion control scheme. Our first script looks at the evolution of the congestion window during a slow-start when there are no losses. This script is available from :download:`/exercises/packetdrill_scripts/slow-start.pkt`.
The description of TCP packets in packetdrill_ uses a syntax that is very close to the tcpdump_ one. The ``+0`` timing indicates that the line is executed immediately after the previous event. The ``<`` sign indicates that packetdrill_ injects a TCP segment and the ``S`` character indicates that the ``SYN`` flag must be set. Like tcpdump_, packetdrill_ uses sequence numbers that are relative to the initial sequence number. The three numbers that follow are the sequence number of the first byte of the payload of the segment (``0``), the sequence number of the last byte of the payload of the segment (``0`` after the colon) and the length of the payload (``0`` between parentheses) of the ``SYN`` segment. This segment does not contain a valid acknowledgment but advertises a window of 1000 bytes. All ``SYN`` segments must also include the ``MSS`` option. In this case, we set the MSS to 1000 bytes. The next line of the packetdrill_ script verifies the reply sent by the instrumented Linux kernel.
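To make the notation concrete, the sketch below splits such a line into its fields with a regular expression. It is not part of packetdrill_ and only covers the fields discussed above. The sample line is reconstructed from the description in the text, and the name ``parse_packet_line`` is hypothetical.

```python
import re

# direction, flags, then "first:last(length)" as in the tcpdump-like notation
PACKET_RE = re.compile(
    r"(?P<direction>[<>])\s+(?P<flags>[A-Z.]+)\s+"
    r"(?P<start>\d+):(?P<end>\d+)\((?P<length>\d+)\)"
)


def parse_packet_line(line):
    """Extract direction, flags and sequence numbers from a packet line."""
    match = PACKET_RE.search(line)
    if match is None:
        raise ValueError(f"not a packet line: {line!r}")
    fields = match.groupdict()
    return {
        "injected": fields["direction"] == "<",  # "<" means injected by the tool
        "syn": "S" in fields["flags"],
        "ack": "." in fields["flags"],           # the dot stands for the ACK flag
        "start": int(fields["start"]),
        "end": int(fields["end"]),
        "length": int(fields["length"]),
    }


print(parse_packet_line("+0 < S 0:0(0) win 1000 <mss 1000>"))
```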
the delay required to transmit a single TCP segment containing MSS bytes is small and set to 1 millisecond, independently of the MSS size