TCP Deep Dive

*UNDER WORK*

1. The History of The TCP Protocol

1.1. Historical Origin of the TCP Protocol

Work on TCP and IP dates back to the 1970s. Vinton Cerf and Robert Kahn published the first paper on the idea of an internet, “A Protocol for Packet Network Intercommunication,” in IEEE Transactions on Communications in 1974. Later that year, in December 1974, Vinton Cerf, Yogen Dalal, and Carl Sunshine published RFC 675, “Specification of Internet Transmission Control Program.” The initial RFC 675 design was not fully workable, so the authors revised it several times, and finally, in 1981, the ‘v4’ specification of TCP/IP was published. This time it was not one but two separate RFCs:

  • RFC 791 “Internet Protocol”

  • RFC 793 “Transmission Control Protocol”

1.2. Why Was TCP Developed

  • TCP was developed to address the need for reliable communication over unreliable network connections, specifically in packet-switched networks.

  • The goal was to create a protocol that could handle data transmission between different types of computer systems and networks, enabling seamless interconnectivity.

  • TCP aimed to provide end-to-end reliability, flow control, and congestion control mechanisms to ensure efficient and error-free data transfer. (Just to be clear, TCPv4 did not have congestion control, it was later added as a variant in TCP Tahoe.)

  • It achieved these goals by being:

  • Reliable in data delivery: TCP uses acknowledgments, sequence numbers, and retransmission mechanisms to ensure that data is delivered reliably and in the correct order.

  • Connection-Oriented: TCP establishes a connection between the sender and receiver before data transfer, allowing for reliable and ordered transmission.

  • Flow Control: TCP incorporates flow control mechanisms to regulate the rate of data transmission based on the receiver's capacity, preventing the receiver from being overwhelmed.

  • Congestion Control: TCP adapts its transmission rate to network conditions, avoiding congestion and ensuring fair sharing of network resources.

  • Interoperability: TCP's design allows it to work across different types of networks and operating systems, making it a widely adopted and standardized protocol.

2. TCP Fundamentals

(This is based on a Wireshark capture and Wireshark terminology, though the terms apply to most similar tools.)

2.1. The TCP Three-Way Handshake

The TCP three-way handshake is built around four connection states:

Listen: The server is listening for incoming requests (essentially waiting for a connection to be made).

SYN-Sent: The client initiates the connection by sending the first SYN. At its minimum (TCP over IPv4, no options), that SYN packet carries 40 bytes of headers, though every network configuration is different; with TCP over IPv6, for instance, the headers come to 60 bytes.

SYN-Received, or SYN-ACK: The server answers with a SYN-ACK. Just like the SYN, the minimum a SYN-ACK can weigh is 40 bytes of headers.

Established, or ACK: The client sends back an ACK to the server, and the connection is established.
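
Here is a minimal sketch of where the handshake happens from the application's point of view, using Python's standard socket module (the loopback address and port 8080 are placeholders, not taken from the capture): the kernel performs the SYN, SYN-ACK, ACK exchange inside connect() and accept().

```python
# A minimal sketch of the handshake from the application's point of view.
# The address and port below are placeholder values.
import socket

# Server side: LISTEN state -- waiting for incoming SYNs.
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("0.0.0.0", 8080))
server.listen()                      # the socket now sits in LISTEN

# Client side (normally a different machine or process): sends the SYN.
client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
client.connect(("127.0.0.1", 8080))  # SYN -> SYN-ACK -> ACK happens here

# Server side: accept() returns once the handshake has completed
# and the connection is ESTABLISHED.
conn, addr = server.accept()
print("Established with", addr)

conn.close()
client.close()
server.close()
```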

2.2. What Information Is Exchanged During the Handshake

  • Source Port & IP: The source port and IP address of the sender (client) are included in the packet's header, identifying the source of the TCP segment.

  • Destination Port & IP: The destination port and IP address of the receiver (server) are included in the packet's header, specifying the destination of the TCP segment.

  • Stream Index: A Wireshark-assigned number that identifies the particular conversation between the client and server (it is not a field carried in the packet itself).

  • Sequence Number: Each TCP segment has a sequence number assigned to it. During the handshake, the initial sequence number (ISN) is exchanged between the client and server.

  • Acknowledgment Number: The acknowledgment number field is used to acknowledge the receipt of data. During the handshake, the acknowledgment number acknowledges the initial sequence number received from the other party.

  • TCP Options: Additional TCP options can be exchanged during the handshake, allowing for specific configurations and features.

  • TCP Flags: Various TCP flags, such as SYN, ACK, and FIN, are set during the handshake to indicate the state and purpose of the packets.

  • Window Buffer: The window size or buffer indicates the amount of data the sender is willing to receive without acknowledgment.

  • Window Scale and MSS: These parameters help optimize the TCP connection by adjusting the window size and maximum segment size based on network conditions.
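
To make the list above concrete, here is a sketch of what a client's SYN might carry, built with the third-party scapy library (assuming it is installed). The addresses, ports, sequence number, and option values are made-up illustrations, not values from a real capture.

```python
# Sketch of a SYN as it might appear in a capture (requires scapy).
# All concrete values below are illustrative assumptions.
from scapy.all import IP, TCP

syn = IP(src="192.0.2.1", dst="192.0.2.10") / TCP(
    sport=58312,                # source port (ephemeral)
    dport=80,                   # destination port (HTTP)
    flags="S",                  # SYN flag set
    seq=1937372673,             # initial sequence number (ISN)
    window=65535,               # advertised receive window
    options=[("MSS", 1460),     # maximum segment size
             ("WScale", 7),     # window scale factor (multiply window by 2**7)
             ("SAckOK", b"")],  # selective acknowledgment permitted
)
syn.show()  # prints the fields Wireshark would show for this segment
```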

2.3. Sequence Number (Raw) (Wireshark)

Each device generates a random sequence number, also known as an ISN (Initial Sequence Number). How the ISN is generated depends on the operating system. Sequence numbers play a role in tracking and verifying the transmission and receipt of data. It's important to note that the sequence numbers of the client and server will not match. For instance, Client A might have a sequence number of 1937372673, while Server B might have 284380740. Each device counts the bytes it sends relative to its own sequence number. (Wireshark displays relative sequence numbers, starting at 0, purely for readability.)

Let's simulate a TCP handshake scenario: Client A on port 58312 sends a SYN packet to Server B on port 80, including its sequence number. If the connection can be established, Server B responds with a SYN-ACK packet containing its own sequence number and an acknowledgment number equal to Client A's sequence number plus 1. Client A then sends an ACK acknowledging Server B's sequence number plus 1, and from that point each side counts onward from its own incremented number. These extra bytes counted during the handshake are "ghost bytes": each SYN consumes one sequence number even though it carries no actual data. The handshake confirms the existence of Server B, proves that routing to Server B works, and lets Server B allocate resources for the connection.
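
A tiny worked example of that arithmetic, reusing the made-up ISNs from the scenario above; the point is only that each SYN "consumes" one sequence number even though it carries no data.

```python
# Handshake sequence/acknowledgment arithmetic (ISNs are illustrative).
client_isn = 1937372673
server_isn = 284380740

# SYN:      client -> server
syn     = {"seq": client_isn, "ack": 0}
# SYN-ACK:  server -> client, acknowledging the client's SYN (ISN + 1)
syn_ack = {"seq": server_isn, "ack": client_isn + 1}
# ACK:      client -> server, acknowledging the server's SYN (ISN + 1)
ack     = {"seq": client_isn + 1, "ack": server_isn + 1}

for name, seg in [("SYN", syn), ("SYN-ACK", syn_ack), ("ACK", ack)]:
    print(f"{name:8} seq={seg['seq']}  ack={seg['ack']}")
```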

2.4. Send Buffer

The send buffer is closely related to the sequence number. It keeps a copy of data that has been sent but not yet acknowledged, so the data can be retransmitted over an unreliable or slow connection, ensuring that every byte sent is eventually received.
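
As a rough illustration, the operating system exposes the size of this per-socket send buffer, and Python's socket module can read it and request a change. The 256 KiB value below is just an example; the kernel may round or clamp whatever is requested.

```python
# Inspecting and requesting a send buffer size (values are illustrative;
# the kernel may round or clamp the requested size).
import socket

sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
print("default send buffer:", sock.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF))

sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, 256 * 1024)
print("after request:      ", sock.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF))
sock.close()
```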

2.5. Stream Index

The Stream Index is Wireshark's shorthand for a conversation identified by its "four-tuple" (source IP, source port, destination IP, destination port); it helps identify the specific TCP stream being analyzed. For example, if the SYN packets of two different connections are captured, one will be labeled [Stream Index: 0] while the other will be [Stream Index: 1].

2.6. TCP Options

TCP options are exchanged during the handshake, and their size varies, which is why the header is noticeably larger during the SYN, SYN-ACK, ACK exchange. Some options, such as MSS and Window Scale, appear only in the handshake and are never renegotiated afterwards. Capturing the handshake is therefore crucial for identifying network issues caused by improper settings.

2.7. TCP Flags

  • SYN (Synchronization): Used in the connection establishment phase to synchronize sequence numbers between hosts. It is 0 by default and set to 1 in the SYN and SYN-ACK segments of the handshake.

  • ACK (Acknowledgement): Sent to acknowledge successfully received packets. It is 0 by default and set to 1 in every segment after the initial SYN, starting with the SYN-ACK.

  • FIN (Finish): Requests connection termination and frees reserved resources.

  • RST (Reset): Terminates the connection if something is wrong or unexpected.

  • URG (Urgent): Indicates prioritized and urgent data within the packet.

  • PSH (Push): Requests immediate delivery of data without waiting for buffering.

  • WND (Window): Communicates the size of the receive window to the sender.

  • CHK (Checksum): Verifies the integrity of the TCP segment during transmission.

  • SEQ (Sequence Number): Identifies the position of the segment's data in the byte stream, enabling ordered, reliable delivery.

  • ACK (Acknowledgement Number): Communicates the next expected sequence number and acknowledges received segments.

(Strictly speaking, the last four entries are TCP header fields rather than single-bit flags. There are also a few more flags that I have not mentioned here, as they are rarely used. A small sketch of how the single-bit flags are packed follows below.)
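
The single-bit flags all live in one field of the TCP header. Here is a minimal sketch of decoding that field from its raw value; the bit positions follow RFC 793, and the example value 0x12 is the SYN+ACK combination seen in the second handshake packet.

```python
# Decode the TCP flags field (bit positions per RFC 793).
TCP_FLAGS = {
    0x01: "FIN",
    0x02: "SYN",
    0x04: "RST",
    0x08: "PSH",
    0x10: "ACK",
    0x20: "URG",
}

def decode_flags(raw: int) -> list[str]:
    """Return the names of the flag bits set in `raw`."""
    return [name for bit, name in TCP_FLAGS.items() if raw & bit]

print(decode_flags(0x02))  # ['SYN']        -- first handshake packet
print(decode_flags(0x12))  # ['SYN', 'ACK'] -- second handshake packet
print(decode_flags(0x10))  # ['ACK']        -- third handshake packet
```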

2.8. Window Buffer

The window size is first determined and exchanged during the handshake. The client declares its desired window size, indicating the maximum number of bytes it is willing to receive from the server before sending an acknowledgment (ACK). However, the largest window that fits in the 16-bit header field is 65,535 bytes (roughly 64 KB), which may be insufficient for efficient data transfer. To illustrate, it's like attempting to fill a pool using a shot glass, which would be time-consuming. (Credit to Chris for the analogy.) Because of this header-size limitation, the window buffer cannot grow beyond that limit on its own. This is where the concept of Window Scale comes into play.

2.9. Window Scale

To overcome the limitation of the Window Buffer field, Window Scale was introduced. It works by multiplying the advertised window value by a power of two; the scale factor is chosen by each side (depending on the operating system and its configuration) and exchanged once during the handshake. The scale factor itself does not change afterwards, but the advertised window does, shrinking and growing as the receive buffer fills and drains. For example, a Window Scale of 6 multiplies the Window Buffer by 64, while a Window Scale of 9 multiplies it by 512. So even though the window field alone can only declare roughly 64 KB, the effective window can be far larger once the scale factor is applied. This dynamic sizing of the Window Buffer can be likened to filling my pool from a 10-liter barrel of water. The delivery person has already delivered 8 liters; meanwhile I have emptied 3 of those into the pool, so even though moving water from the barrel to the pool takes time, I can still receive more. I can tell the delivery person, "Hey, I have 5 liters of free space, keep sending me water."
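
A short sketch of the arithmetic behind those numbers: the effective window is the 16-bit advertised window shifted left by the negotiated scale factor (the advertised-window values below are made up for illustration).

```python
# Effective receive window = advertised window * 2**window_scale.
# The advertised window fits in 16 bits (max 65535); scaling lifts the
# practical ceiling far beyond that. Values below are illustrative.
def effective_window(advertised: int, scale: int) -> int:
    return advertised << scale   # same as advertised * 2**scale

print(effective_window(65535, 0))   # 65,535 bytes    (no scaling, ~64 KB cap)
print(effective_window(65535, 6))   # 4,194,240 bytes (scale 6 -> x64)
print(effective_window(65535, 9))   # 33,553,920 bytes (scale 9 -> x512)
```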

In subsequent sections, I will delve deeper into troubleshooting. One particular issue related to the Window Buffer is when it keeps shrinking without ever being emptied. This indicates that a service or your computer cannot drain its receive buffer fast enough, slowing the connection down while also tying up server resources unnecessarily.

3. MTU and MSS and the Differences Between Them

MTU and MSS are concepts that people often find hard to tell apart because they are so similar.

3.1. MTU (Maximum Transmission Unit)

The MTU is essentially the size cap for a single packet: on standard Ethernet the MTU is 1500 bytes, meaning each frame can carry at most 1500 bytes of payload (the IP packet). Around that payload, the Ethernet frame adds its own fields: Destination = 6 bytes, Source = 6 bytes, EtherType = 2 bytes, and FCS = 4 bytes, for up to 1518 bytes on the wire. Please note that this is just the standard/default sizing; different link types and configurations change these numbers. The addressing inside the payload differs too: IPv4 uses 32-bit addresses in a 20-byte header, while IPv6 uses 128-bit addresses in a 40-byte header, so switching to IPv6 leaves less room for TCP data within the same MTU.

The Data Field in an MTU (Maximum Transmission Unit) is a crucial component that plays a significant role in network communication. It represents the portion of a network packet that carries the actual payload or data being transmitted. Understanding the behavior of the Data Field within the MTU is essential for efficient and reliable data transmission.

Let's consider a scenario where Client A wants to communicate with Server B, with 10 hops (intermediate network devices) between them. Client A's side of the connection, as declared via the handshake, can handle packets of 1460 bytes, while Server B's side can handle 1420 bytes. To communicate without fragmentation, the connection settles on the smaller of the two values.

Now imagine the network path between Client A and Server B. On hop 4 there is a node that enforces an MTU of 1360 bytes due to a firewall setting. When Client A's packet reaches this hop, it does not fit within the 1360-byte MTU. In a scenario where fragmentation is allowed, the packet would be split in two: Server B would receive a first fragment of roughly 1360 bytes and then the remainder as a second, smaller fragment. Fragmentation slows the network down and places an extra burden on Server B, which must reassemble the fragments.

Now consider the case where the node at hop 4 does not allow fragmentation. The packet is simply dropped because it exceeds the MTU. To recover, Client A relies on Path MTU Discovery: the dropping node sends back an ICMP "Fragmentation Needed" message, and Client A lowers its packet size accordingly (if those ICMP messages are filtered, the connection may hang and eventually have to be terminated).

During the TCP handshake, the MSS value is negotiated between Client A and Server B based only on their own interfaces, not on the minimum MTU along the path. The MSS represents the maximum size of the TCP payload inside the IP packet, after accounting for the TCP/IP headers. By ensuring that the TCP segment fits within the path MTU, fragmentation can be avoided and data flows smoothly.
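
A hedged sketch of the fragmentation arithmetic in the scenario above: IPv4 fragment offsets are expressed in 8-byte units, so every fragment's payload except the last must be a multiple of 8 bytes, and the sizes below assume a plain 20-byte IPv4 header with no options.

```python
# Split an IP packet's payload into fragments that fit a smaller MTU.
# Assumes a plain 20-byte IPv4 header and no IP options.
IP_HEADER = 20

def fragment_sizes(packet_size: int, path_mtu: int) -> list[int]:
    payload = packet_size - IP_HEADER
    max_frag_payload = (path_mtu - IP_HEADER) // 8 * 8  # multiple of 8 bytes
    sizes = []
    while payload > 0:
        chunk = min(payload, max_frag_payload)
        sizes.append(chunk + IP_HEADER)   # each fragment gets its own IP header
        payload -= chunk
    return sizes

# The 1420-byte packet from the scenario, squeezed through a 1360-byte MTU:
print(fragment_sizes(1420, 1360))   # [1356, 84]
```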

3.2. MSS (Maximum Segment Size)

MSS (Maximum Segment Size) is the data part of the MTU. The two are easy to confuse: the MSS covers only the TCP payload, while the MTU covers the IP header, the TCP header, and the MSS together. The client and server each advertise an MSS during the handshake and use the lower of the two values between themselves; they do not account for the nodes that sit between them, which is exactly the situation described in the MTU section above.
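
A minimal sketch of the relationship, assuming the common case of a 20-byte IPv4 header and a 20-byte TCP header with no options (the two advertised MSS values at the end are made-up examples).

```python
# MSS is the MTU minus the IP and TCP headers (20 bytes each in the
# common, option-free IPv4 case).
IP_HEADER = 20
TCP_HEADER = 20

def mss_for(mtu: int) -> int:
    return mtu - IP_HEADER - TCP_HEADER

print(mss_for(1500))   # 1460 -- the MSS typically advertised on Ethernet

# Each side advertises its own MSS in the SYN / SYN-ACK; the connection
# uses the smaller of the two (illustrative values):
client_mss, server_mss = 1460, 1380
print(min(client_mss, server_mss))   # 1380
```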

3.3. Data Offset

The Data Offset field indicates where the actual data (payload) begins within the TCP segment. Its value is the number of 32-bit words from the start of the TCP segment to the beginning of the data. For example, if the Data Offset is 5, the TCP header occupies 5 * 4 = 20 bytes; the payload starts at byte 20, and everything before that is header rather than payload.

By using the Data Offset field, the receiving end can determine the exact position in the TCP segment where the payload starts. This allows for proper extraction and processing of the data without considering the header information that precedes it.
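
A small sketch of reading the Data Offset from a raw TCP header with Python's struct module: the field is the upper four bits of byte 12, and multiplying it by 4 gives the header length in bytes. The header bytes below are fabricated for illustration.

```python
# Extract the Data Offset from the raw bytes of a TCP header.
# The field is the high 4 bits of byte 12 and counts 32-bit words.
import struct

# A fabricated 20-byte TCP header: ports 58312 -> 80, seq/ack 0,
# data offset 5 (0x50), SYN flag (0x02), window 65535, checksum 0, urgent 0.
raw_header = struct.pack("!HHIIBBHHH", 58312, 80, 0, 0, 0x50, 0x02, 65535, 0, 0)

data_offset_words = raw_header[12] >> 4   # number of 32-bit words
header_length = data_offset_words * 4     # header length in bytes

print(data_offset_words)   # 5
print(header_length)       # 20 -- the payload starts at this byte offset
```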

Analogy:

Let's imagine you want to send a letter to a friend, and the letter consists of multiple parts: the envelope, the letterhead, and the actual message. The envelope contains information about the sender and the recipient, just like the TCP header contains control information. The letterhead contains details about the letter, such as the date and the subject.

In this analogy, the Data Offset represents the size of the letterhead. It tells you how many pages the letterhead occupies in the letter. Now, when your friend receives the letter, they need to know where the actual message starts so they can read it. By knowing the Data Offset (1 page), they can skip the letterhead and go directly to the beginning of the message. If the Data Offset were larger, they would need to skip more pages to find the start of the message.

