[NETENG] Python TCP Programming

All the scripts I use here come from the latest edition of Foundations of Python Network Programming by Brandon Rhodes, which you can buy here: https://www.amazon.com/Foundations-Python-Network-Programming-Brandon/dp...

The almighty Transmission Control Protocol, or TCP as we all know it, can be seen as a demigod among network protocols, holding most of the internet in its grasp. To understand the way TCP works, you have to consider the three-way handshake, SYN - SYN/ACK - ACK, that begins a connection, and the FIN - FIN/ACK - ACK exchange that ends one. If you don't already know about it, consider looking it up. I went ahead and looked for a really good tutorial on YouTube, which you can check out here: https://www.youtube.com/watch?v=F27PLin3TV0

Anyway, the simplest thing to look at is the code for a basic TCP client and server scenario... however, with an added twist explained later on:

#!/usr/bin/env python3
import argparse, socket

def recvall(sock, length):
    #define empty data buffer
    data = b''
    #while the data buffer isnt the size of the incoming data
    while len(data) < length:
        #receive the (length - len(data)) bytes
        more = sock.recv(length - len(data))
        #if the server/client receives nothing
        if not more:
            #raise EOF error
            raise EOFError('was expecting {0} bytes but only received {1} bytes before the socket closed'.format(length, len(data)))
        data += more
    return data

def server(interface, port):
    #create socket with the reuse option for server
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind((interface, port))
    sock.listen(1)
    print('Listening at', sock.getsockname())
   
    while True:
        #accept the incoming socket
        print('Waiting to accept a new connection')
        sc, sockname = sock.accept()
        print('We have accepted a connection from', sockname)
        #getsockname() is this (server) end of the connection, getpeername() is the client
        print('  Socket name:', sc.getsockname())
        print('  Socket peer:', sc.getpeername())
        #call Recvall
        message = recvall(sc, 16)
        print('  Incoming sixteen-octet message:', repr(message))
        sc.sendall(b'Farewell, client')
        sc.close()
        print('  Reply sent, socket closed')

def client(host, port):
    #connect client to the server  
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.connect((host, port))
    print('Client has been assigned socket name', sock.getsockname())
    sock.sendall(b'Hi there, server!') #Note that the exclamation mark doesn't get received!
    reply = recvall(sock, 16)
    print('The server said', repr(reply))
    sock.close()

if __name__ == '__main__':
    choices = {'client': client, 'server': server}
    parser = argparse.ArgumentParser(description='Send and receive over TCP')
    parser.add_argument('role', choices=choices, help='which role to play')
    parser.add_argument('host', help='interface the server listens at;'
                        ' host the client sends to')
    parser.add_argument('-p', metavar='PORT', type=int, default=1060,
                        help='TCP port (default 1060)')
    args = parser.parse_args()
    function = choices[args.role]
    function(args.host, args.p)

The main things to consider here are three major calls: connect(), sendall(), and recvall(). Unlike its UDP counterpart, a TCP socket has to be either bound and listening (the server) or connected (the client) before any data can flow. Binding, as you might have already guessed, assigns an IP and port to a socket; it doesn't trigger a listener mode (that is what listen() is for), but rather tells the program that the socket has X IP and Y port. connect(), on the other hand, triggers the three-way handshake I mentioned earlier when used on a TCP socket, while on a UDP socket it sends nothing and merely records the destination address.
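
To make the bind/connect difference concrete, here is a small sketch of my own (it is not from the book). The function name bind_and_connect_demo is made up for illustration, and it assumes the loopback address 127.0.0.1 is usable; port 0 asks the OS to pick any free port.

```python
import socket

def bind_and_connect_demo():
    # bind() only names the socket; port 0 lets the OS choose a free port.
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(('127.0.0.1', 0))
    address = listener.getsockname()
    listener.listen(1)  # only now will the OS answer incoming handshakes

    # On a TCP socket, connect() performs the three-way handshake.
    tcp = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    tcp.connect(address)
    conn, peer = listener.accept()
    same = (peer == tcp.getsockname())

    # On a UDP socket, connect() sends nothing; it just records the
    # destination, so it succeeds even with no UDP server on that port.
    udp = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    udp.connect(address)
    udp_peer = udp.getpeername()

    for s in (conn, tcp, udp, listener):
        s.close()
    return address, same, udp_peer

if __name__ == '__main__':
    print(bind_and_connect_demo())
```

The address the server sees from accept() is the same one the client reports from getsockname(), which is a handy way to convince yourself which end is which.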

Something to note about sendall() and recvall() is why they are used to send and receive data. The reason you want sendall() as opposed to send() is that send() can transmit fewer bytes than you handed it (it returns the number of bytes it actually accepted), so sendall() fixes this by looping until the entire buffer has been sent. Sadly, recv() never got a recvall() counterpart in the standard library, so the program listed above defines it. The concept is similar: a single recv() can return fewer bytes than you asked for, and on its own it can't tell you when you've reached the end of the message. Now there are a few ways to implement this, and the example above defines one of them. At first glance the extra if statement that raises an error can throw the reader off, so it's best to examine the loop without it to understand what's going on. Spoken aloud, it reads: "Define an empty buffer. As long as the buffer is shorter than the expected length, receive at most the number of bytes still missing into more, and append more to the buffer (which may or may not contain data already)." You can re-read that as many times as you need to before you get it.

It's imperative you understand this concept because, unlike the datagrams in the UDP tutorial, TCP is a stream of bytes with no message boundaries. If, for example, I send 3000 bytes of data (like a string of 3000 ASCII characters, where one character is one byte long) on my socket, and my recv() call is only set to handle 1024 bytes, then that call returns at most 1024 bytes; the remaining 1976 bytes (1976 characters left on the string) stay in the receive buffer and will never be considered unless recv() is called again! If you want to look further into other ways of defining recvall(), you can check out this link http://code.activestate.com/recipes/408859-socketrecv-three-ways-to-turn... for more info.
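
Here is a rough sketch of both points, written by me rather than taken from the book. send_all and demo are hypothetical names; send_all only illustrates the kind of loop sendall() saves you from writing, and socket.socketpair() is used so the whole thing runs in one process without a real server.

```python
import socket

def send_all(sock, data):
    # Hand-rolled stand-in for sock.sendall(): send() reports how many
    # bytes it accepted, so keep sending the unsent tail until none is left.
    total = 0
    while total < len(data):
        total += sock.send(data[total:])
    return total

def demo():
    # socketpair() returns two already-connected sockets.
    a, b = socket.socketpair()
    payload = b'x' * 3000
    send_all(a, payload)
    first = b.recv(1024)          # one call returns at most 1024 bytes
    data = first
    while len(data) < len(payload):
        data += b.recv(1024)      # the rest was waiting in the buffer
    a.close()
    b.close()
    return len(first), data == payload

if __name__ == '__main__':
    print(demo())
```

The first recv() never hands back more than 1024 bytes, and the full 3000 only show up because the loop keeps asking.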

Let's assume for a second that you didn't follow this protocol; then for large data transfers you could end up with a deadlock between the two programs. See, TCP stacks use buffers, both so that they have somewhere to place incoming packet data until an application is ready to read it and so that they can collect outgoing data until the network hardware is ready to transmit an outgoing packet. The buffers are limited in size. If the client keeps sending without reading the replies, the replies pile up until the buffers between server and client are full and the server's sendall() blocks. The server then stops calling recv(), the buffers in the other direction fill up as well, and the client's sendall() blocks too. This results in both the server and the client waiting on each other, which is a deadlock. Here is an example:

#!/usr/bin/env python3
import argparse, socket, sys

def server(host, port):
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind((host, port))
    sock.listen(1)
    print('Listening at', sock.getsockname())
    while True:
        sc, sockname = sock.accept()
        print('Processing up to 1024 bytes at a time from', sockname)
        n = 0
        while True:
            data = sc.recv(1024)
            if not data:
                break
            output = data.decode('ascii').upper().encode('ascii')
            sc.sendall(output)  # send it back uppercase
            n += len(data)
            print('\r  %d bytes processed so far' % (n,), end=' ')
            sys.stdout.flush()
        print()
        sc.close()
        print('  Socket closed')

def client(host, port, bytecount):
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    bytecount = (bytecount + 15) // 16 * 16  # round up to a multiple of 16
    message = b'capitalize this!'  # 16-byte message to repeat over and over

    print('Sending', bytecount, 'bytes of data, in chunks of 16 bytes')
    sock.connect((host, port))

    sent = 0
    while sent < bytecount:
        sock.sendall(message)
        sent += len(message)
        print('\r  %d bytes sent' % (sent,), end=' ')
        sys.stdout.flush()

    print()
    sock.shutdown(socket.SHUT_WR)

    print('Receiving all the data the server sends back')

    received = 0
    while True:
        data = sock.recv(42)
        if not received:
            print('  The first data received says', repr(data))
        if not data:
            break
        received += len(data)
        print('\r  %d bytes received' % (received,), end=' ')

    print()
    sock.close()

if __name__ == '__main__':
    roles = ('client', 'server')
    parser = argparse.ArgumentParser(description='Get deadlocked over TCP')
    parser.add_argument('role', choices=roles, help='which role to play')
    parser.add_argument('host', help='interface the server listens at;'
                        ' host the client sends to')
    parser.add_argument('bytecount', type=int, nargs='?', default=16,
                        help='number of bytes for client to send (default 16)')
    parser.add_argument('-p', metavar='PORT', type=int, default=1060,
                        help='TCP port (default 1060)')
    args = parser.parse_args()
    if args.role == 'client':
        client(args.host, args.p, args.bytecount)
    else:
        server(args.host, args.p)

Running the script as a server first, then running it as a client trying to process 1 GB of data, yields the following results:
server:
$ ./TCP_deadlock server ""
Listening at ('0.0.0.0', 1060)
Processing up to 1024 bytes at a time from ('127.0.0.1', 55214)
  8000848 bytes processed so far

client:
$ ./TCP_deadlock client 127.0.0.1 1073741824
Sending 1073741824 bytes of data, in chunks of 16 bytes
  15056144 bytes sent

At this point both programs freeze in a deadlock, and if you look closely, the entire gigabyte from the client was not even sent through to the server. Don't worry about understanding the script if you don't want to; it's just meant for you to understand the concept.
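
For the curious, one possible way out of this kind of deadlock is to keep reading replies while you are still sending. The sketch below is my own and makes several assumptions: it uses a second thread to drain the replies, it runs both ends in one process over socket.socketpair() instead of a real TCP connection, and the names upper_server, drain, and run_client are made up for illustration. Other approaches exist (non-blocking sockets, for instance), so treat it as one option rather than the fix.

```python
import socket
import threading

def upper_server(sc):
    # Same per-chunk behaviour as the server above: read, uppercase, reply.
    while True:
        data = sc.recv(1024)
        if not data:
            break
        sc.sendall(data.upper())
    sc.close()

def drain(sock, counter):
    # Runs in its own thread so replies are consumed while we still send.
    while True:
        data = sock.recv(4096)
        if not data:
            break
        counter[0] += len(data)

def run_client(bytecount):
    client_sock, server_sock = socket.socketpair()
    threading.Thread(target=upper_server, args=(server_sock,),
                     daemon=True).start()
    counter = [0]
    reader = threading.Thread(target=drain, args=(client_sock, counter))
    reader.start()
    message = b'capitalize this!'  # 16-byte message, as in the script above
    sent = 0
    while sent < bytecount:
        client_sock.sendall(message)
        sent += len(message)
    client_sock.shutdown(socket.SHUT_WR)  # tell the server we are done
    reader.join()                         # wait for the last reply
    client_sock.close()
    return sent, counter[0]

if __name__ == '__main__':
    print(run_client(1 << 20))
```

Because the reader thread empties the buffers as fast as the server fills them, the sender never gets stuck waiting on a full buffer.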

The last thing I can talk about in regards to TCP is the closing of sockets; well, I suppose this can apply to UDP sockets as well. Simply put, when we think of sockets, what we should really think about is the fact that they're treated like files, each with an integer attached to it like a tag on a cow's ear (that integer is known as a file descriptor). If you've done any extensive programming you've most likely dealt with opening, reading, and closing files; likewise, when you communicate through sockets you are essentially doing the same thing, except you're using socket functions to do so. As you may have guessed, the close() method releases the socket in question, and your assumption is correct. However, say you want to simply stop data flowing in one direction of the socket while keeping it open; then you use the shutdown() method. In C, shutdown() takes two parameters, the file descriptor followed by the direction you wish to shut down; in Python it is a method on the socket object, so you only pass the direction. Here are the options with a quick rundown on each one:

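SHUT_WR - stop sending on the socket; the peer sees end-of-file once it has read everything already sent
SHUT_RD - stop receiving on the socket
SHUT_RDWR - stop both sending and receiving (unlike close(), the file descriptor stays allocated until you call close())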
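
As a quick illustration of the half-close that SHUT_WR gives you, here is a minimal sketch of my own. half_close_demo is a made-up name, and socket.socketpair() stands in for a real client/server pair so it runs in one process.

```python
import socket

def half_close_demo():
    a, b = socket.socketpair()
    a.sendall(b'request')
    a.shutdown(socket.SHUT_WR)   # a promises to send nothing more
    request = b.recv(1024)
    eof = b.recv(1024)           # b'' means a has finished sending
    b.sendall(b'reply')          # the other direction still works
    reply = a.recv(1024)
    a.close()                    # close() is still needed to free the sockets
    b.close()
    return request, eof, reply

if __name__ == '__main__':
    print(half_close_demo())
```

This is the same trick the deadlock client uses with sock.shutdown(socket.SHUT_WR): it signals "I'm done talking" while staying around to hear the answer.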