Kubernetes has a reasonably nice REST-ish HTTP API, which the system itself uses pervasively to do pretty much all of its work. It is very open and reasonably well documented, which makes it excellent to integrate with, so you can manage your cluster from code. However, the API has one concept that does not map directly onto HTTP: WATCH. A watch lets an API user be notified of changes to resources as they happen. Unfortunately, using this watch functionality turns out to be non-trivial.
The Anatomy of a WATCH Request
Using the Kubernetes API from within Python is very easy with the Requests library: the API is well behaved and always consumes and returns JSON messages. However, when it comes to issuing a watch request, things become more complicated. There are supposedly two ways to issue a watch request: a normal HTTP request that returns a streaming result using chunked encoding, and one using websockets. Unfortunately, in testing, a Kubernetes 1.1 master did not seem to use the websocket protocol correctly, so the streaming result is the way to go.
When using chunked-encoding streaming, the Kubernetes master starts a chunk by sending the chunk size. It does not fill that chunk with a large block of data, though: it sends just one line of text terminated by a newline. This line of text is a JSON-encoded object with the event and the changed resource item inside it. So the protocol is very much line-based, and the chunked encoding is just a way to stream the results as they become available. On the face of it this doesn't seem so difficult to do with Requests: the request could be made using requests.get('http://localhost:8001/api/v1/pods', params={'watch': 'true'}, stream=True), and .iter_lines() could be used on the response. But the iter_lines method does not do what you would expect. It keeps an internal buffer, which means you will never see the last event, since you are still waiting for that buffer to fill.
The issue that raised this problem suggests a work-around: implement your own iter_lines() function that reads from the raw socket of the response. Unfortunately that simple solution makes a few mistakes. Firstly, it does not process the chunked encoding correctly, so the octets describing the chunk size will appear in the output. But more importantly, there is another layer of buffering going on, one that you cannot work around. This additional buffering exists because Requests uses the raw socket's makefile method to read data from it. This makes sense for Requests: the Python standard library and the OS are good at making things fast by buffering. However, it does mean that once Requests has parsed the response headers, the buffering has already consumed an unknown number of bytes from the response body, with no way of retrieving those bytes. This makes it impossible to consume the watch API using Requests.
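To make the framing described above a bit more concrete, here is a minimal sketch of what one watch event looks like on the wire and how the chunk-size octets can be stripped. The decode_chunked helper and the sample event are made up for illustration, and the sketch assumes each chunk carries exactly one newline-terminated JSON line; it is not a complete chunked-encoding parser (no trailers, no error handling).

```python
import json


def decode_chunked(raw: bytes) -> bytes:
    """Strip HTTP/1.1 chunked framing from the complete chunks in raw.

    Hypothetical helper for illustration: incomplete trailing data is
    simply ignored rather than kept for a later call.
    """
    body = b''
    pos = 0
    while True:
        eol = raw.find(b'\r\n', pos)
        if eol == -1:
            break  # the chunk-size line itself is incomplete
        # The size is hexadecimal; anything after ';' is a chunk extension.
        size_field = raw[pos:eol].split(b';', 1)[0]
        size = int(size_field.decode('ascii'), 16)
        if size == 0:
            break  # a zero-sized chunk terminates the body
        start = eol + 2
        end = start + size
        if len(raw) < end + 2:
            break  # chunk data or its trailing CRLF has not arrived yet
        body += raw[start:end]
        pos = end + 2
    return body


# A made-up event line and the bytes it would be framed as on the wire.
event = b'{"type": "ADDED", "object": {"kind": "Pod"}}\n'
wire = b'%x\r\n' % len(event) + event + b'\r\n'

# Splitting the raw bytes on newlines leaks the chunk size into the output.
naive_lines = wire.split(b'\n')

# Stripping the framing first leaves only the JSON line.
decoded = json.loads(decode_chunked(wire).decode('utf-8'))
```

Here naive_lines starts with the hexadecimal chunk size rather than with JSON, which is the first mistake mentioned above, while decoded is the event dictionary.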
Manually Doing HTTP
So how can you consume the watch API from Python? By making the request and processing the response yourself. This is easier than it sounds; socket programming isn't so scary. First you need to connect the socket to the server and send the HTTP request. HTTP is very simple, you just send some headers over the socket:

    import socket

    request = ('GET /api/v1/pods?watch=true HTTP/1.1\r\n'
               'Host: localhost\r\n'
               '\r\n')
    sock = socket.create_connection(('localhost', 8001))
    # sendall() keeps sending until the whole request has been written
    sock.sendall(request.encode('ASCII'))

Note that the Host header is required for the Kubernetes master to accept the request.
Parsing the HTTP response is a little more involved. However, the http-parser library neatly implements the HTTP parsing side of things without getting involved with sockets or anything network-like. Thanks to this we can easily read and parse the response:

    import io
    import http_parser.parser

    parser = http_parser.parser.HttpParser()
    while not parser.is_headers_complete():
        chunk = sock.recv(io.DEFAULT_BUFFER_SIZE)
        if not chunk:
            raise Exception('No response!')
        nreceived = len(chunk)
        nparsed = parser.execute(chunk, nreceived)
        if nparsed != nreceived:
            raise Exception('Ok, http_parser has a real ugly error-handling API')

Now the response headers have been parsed. Some body data may already have been received as well, but that is fine: it will just stay buffered in the parser until we retrieve it. But first let's keep reading data until there is no more left (don't do this in production, it's bad for your memory):

    import select

    readers = [sock]
    writers = out_of_band = []
    timeout = 0
    while True:
        rlist, _, _ = select.select(readers, writers, out_of_band, timeout)
        if not rlist:
            # No more data queued by the kernel
            break
        chunk = sock.recv(io.DEFAULT_BUFFER_SIZE)
        if not chunk:
            # remote closed the connection
            sock.close()
            break
        nreceived = len(chunk)
        nparsed = parser.execute(chunk, nreceived)
        if nparsed != nreceived:
            raise Exception('Something bad happened to the HTTP parser')

This shows how you can use select to read data only when some is available, instead of having to block until data is available again. Of course, by the time this has consumed all the data the Kubernetes master may already have sent the next update to the PodList, but let's read the events received so far:

    import json

    data = parser.recv_body()
    lines = data.split(b'\n')
    pending = lines.pop(-1)
    events = [json.loads(l.decode('utf-8')) for l in lines]

That's it! If the data received ends in a newline, the data.split() call will return an empty bytestring (b'') as the last item. If the data did not end in a newline, an incomplete event was received, so we need to save it for later when we get the rest of the data.
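The "save it for later" step is easy to get wrong, so here is a minimal sketch of how the leftover bytes could be carried over between reads. The EventBuffer class and its feed method are names made up for this illustration, not part of http-parser or of any Kubernetes client; feed() expects body bytes that have already had the HTTP framing removed, such as what parser.recv_body() returns.

```python
import json


class EventBuffer:
    """Turn arbitrary slices of the response body into whole events.

    Hypothetical helper for illustration only.
    """

    def __init__(self):
        # Bytes of an event whose terminating newline has not arrived yet.
        self._pending = b''

    def feed(self, data: bytes) -> list:
        """Return the events completed by data; keep any partial line."""
        lines = (self._pending + data).split(b'\n')
        # The last item is b'' if the data ended in a newline, otherwise
        # it is the start of an event that is still incomplete.
        self._pending = lines.pop(-1)
        return [json.loads(line.decode('utf-8')) for line in lines if line]


# Usage with made-up data: the second event is split across two reads.
buf = EventBuffer()
first = buf.feed(b'{"type": "ADDED"}\n{"type": "MOD')
second = buf.feed(b'IFIED"}\n')
```

After the first call only the ADDED event is returned; the MODIFIED event appears once its newline has arrived with the second call.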
Conclusion
So to correctly consume the response from a Kubernetes watch API call you need to create your own socket connection and parse the HTTP response yourself. Luckily this isn't all that difficult, as hopefully I've managed to show here. But you don't need to write all this yourself! We've already implemented all this and more in our kube project, which offers a decent implementation of the above wrapped into a nice iterator API. Kube itself still needs a lot more features, but the watch implementation is already very useful.