HTTP persistent connections

Basic HTTP
The original model for HTTP 0.9 and 1.0 was for each connection to be a one-off request-response pair:


 * client opens TCP connection to server
 * server accepts connection
 * client sends a request (GET this URL, POST to that URL) and some headers with it
 * server sends back some headers and data
 * server closes connection
 * client drops the dead connection
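The steps above can be sketched end-to-end with raw sockets. This is a toy illustration only (a real client would use an HTTP library): a throwaway one-shot server on localhost plays the server role, and the client reads until EOF because connection close is the only end-of-body signal in this model.

```python
import socket
import threading

# toy one-shot server: accept, read the request, reply, close
srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
srv.bind(("127.0.0.1", 0))          # OS picks a free port
srv.listen(1)
port = srv.getsockname()[1]

def serve_once():
    conn, _ = srv.accept()          # server accepts connection
    conn.recv(4096)                 # read the client's request + headers
    conn.sendall(b"HTTP/1.0 200 OK\r\n"
                 b"Content-Type: text/plain\r\n"
                 b"\r\n"
                 b"hello")          # headers, then data
    conn.close()                    # server closes: that *is* end-of-body

t = threading.Thread(target=serve_once)
t.start()

# client side: open connection, send request, read until EOF
cli = socket.create_connection(("127.0.0.1", port))
cli.sendall(b"GET / HTTP/1.0\r\nHost: localhost\r\n\r\n")
chunks = []
while True:
    data = cli.recv(4096)
    if not data:                    # empty read == connection closed == done
        break
    chunks.append(data)
cli.close()
t.join()
srv.close()

body = b"".join(chunks).split(b"\r\n\r\n", 1)[1]
print(body)                         # b'hello'
```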

The advantage is simplicity: neither server nor client needs to worry about how much data there's going to be! Just pump data out until you're done, then cut the connection.

There are two fairly obvious disadvantages to that:
 * don't know how long something's going to take to download (bad human usability for large files)
 * if you're going to make multiple requests, you need to keep reopening new connections

Even with HTTP 1.1, this is still a pretty frequent model: closing the connection is the default way to end a response for most web scripting setups, such as PHP, when content is produced dynamically and the size isn't known up front.

Persistent connections
For a dynamic web site, the key point is that you're probably going to transfer multiple files. Hit a few pages, load some scripts, CSS and images, make some AJAX submissions, and so on. Keeping connections open means there's less overhead:


 * don't have to wait for TCP connection setup for every file
 * on HTTPS, don't have to wait for SSL handshakes for every file

Especially for clients who aren't close to the server on the network, this can save a lot of network round trips and greatly improve user-visible response time. (Even more so if pipelining is enabled in the client, allowing it to push several requests at once, then receive the responses as they come.)
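As a toy sketch of pipelining (illustrative only, using a throwaway localhost server): the client below pushes two requests down one connection before reading anything back, and the server answers them in order with Content-Length-delimited bodies.

```python
import socket
import threading

srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
srv.bind(("127.0.0.1", 0))
srv.listen(1)
port = srv.getsockname()[1]

def serve():
    conn, _ = srv.accept()
    buf = b""
    while buf.count(b"\r\n\r\n") < 2:        # wait for both pipelined requests
        buf += conn.recv(4096)
    for body in (b"first", b"second"):       # answer in request order
        conn.sendall(b"HTTP/1.1 200 OK\r\n"
                     b"Content-Length: " + str(len(body)).encode() +
                     b"\r\n\r\n" + body)
    conn.close()

t = threading.Thread(target=serve)
t.start()

cli = socket.create_connection(("127.0.0.1", port))
# both requests go out back-to-back: one round trip instead of two
cli.sendall(b"GET /a HTTP/1.1\r\nHost: example\r\n\r\n"
            b"GET /b HTTP/1.1\r\nHost: example\r\n\r\n")

f = cli.makefile("rb")

def read_response(f):
    length = 0
    while True:
        line = f.readline()
        if line in (b"\r\n", b""):           # blank line ends the headers
            break
        if line.lower().startswith(b"content-length:"):
            length = int(line.split(b":", 1)[1])
    return f.read(length)                    # body is exactly this long

bodies = [read_response(f), read_response(f)]
cli.close()
t.join()
srv.close()
print(bodies)                                # [b'first', b'second']
```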

At its most basic, to keep a connection open for multiple requests, HTTP 1.1 needs to add one thing: a way for the client to know when a response body has been fully transferred, without closing the connection.

Content-Length header
In some ways this is the simplest: if the server knows the size of the response body before it starts sending, that size can simply be included as an HTTP response header:

HTTP/1.1 200 OK
Content-Type: text/html; charset=utf-8
Content-Length: 1234

bla bla bla

When the client reads that many bytes in, it knows it's done and the connection is free for another request.
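A sketch of that framing logic from the client's side, reading from an in-memory stream that stands in for a persistent connection carrying two back-to-back responses (names here are illustrative):

```python
import io

def read_response(stream):
    # collect header bytes up to the blank line ending the header block
    headers = b""
    while not headers.endswith(b"\r\n\r\n"):
        byte = stream.read(1)
        if not byte:
            raise EOFError("connection closed mid-headers")
        headers += byte
    length = 0
    for line in headers.split(b"\r\n"):
        if line.lower().startswith(b"content-length:"):
            length = int(line.split(b":", 1)[1])
    return stream.read(length)      # read exactly that many bytes, then stop

# two responses queued up on the same "connection"
wire = io.BytesIO(
    b"HTTP/1.1 200 OK\r\nContent-Length: 5\r\n\r\nfirst"
    b"HTTP/1.1 200 OK\r\nContent-Length: 6\r\n\r\nsecond"
)
print(read_response(wire))          # b'first'
print(read_response(wire))          # b'second' -- connection still usable
```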

For dynamic content, this generally means buffering your complete output before sending it:
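A minimal sketch of that buffer-then-send pattern (in Python for illustration; the function names are made up):

```python
def render_buffered(generate):
    body = b"".join(generate())     # run the whole generator: buffer it all
    return (b"HTTP/1.1 200 OK\r\n"
            b"Content-Type: text/html; charset=utf-8\r\n"
            b"Content-Length: " + str(len(body)).encode() + b"\r\n"
            b"\r\n" + body)

def page():                         # stands in for the dynamic page logic
    yield b"<html><body>"
    yield b"dynamically generated bits"
    yield b"</body></html>"

response = render_buffered(page)
```

Nothing hits the wire until the whole body exists, which is exactly the downside discussed below.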

In PHP, the final bit of output logic can be encapsulated in a custom output buffering callback to reduce impact on the code.

This is roughly what MediaWiki does (with some added logic for gzipping), with the principal goal of making sure caching proxies in Europe or Asia can keep open connections to master servers in the US.

Downsides:
 * can't start sending data until it's all generated
 * output data must be buffered in memory, which may count against memory limits

If generating huge output such as a backup/export feed containing the complete history of a user, this might not work very well.

Chunked transfer encoding
The alternative is to use chunked transfer encoding. This breaks the data stream up into pieces, each of known size.

This way, rather than having to precalculate the size of the entire body, smaller buffers can be sent out one after the other, as data comes along.
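On the wire, each chunk is its size in hex, a CRLF, the bytes themselves, and another CRLF; a zero-length chunk terminates the body. A minimal encoder sketch:

```python
def encode_chunk(data: bytes) -> bytes:
    # one chunk: size in hex, CRLF, the payload bytes, CRLF
    return format(len(data), "x").encode() + b"\r\n" + data + b"\r\n"

def encode_body(pieces) -> bytes:
    # each piece can be sent as soon as it's produced; no total size needed
    # (skip empty pieces: a zero-size chunk would falsely end the body)
    out = b"".join(encode_chunk(p) for p in pieces if p)
    return out + b"0\r\n\r\n"       # zero-length chunk marks the end

print(encode_body([b"Wiki", b"pedia"]))
# b'4\r\nWiki\r\n5\r\npedia\r\n0\r\n\r\n'
```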

Check: is the PHP+Apache combo smart enough to do chunking on its own based on its default low-level buffering, or must this be done manually?

It might be a little tricky to integrate manual chunking in with XMLWriter, but it probably could be combined with the general output buffering in some way.

Questions

 * How does our load-balancing system interact with persistent connections?
 * Can PHP+Apache do basic chunking for us itself?
 * If not, can we do it easily with an output buffering handler?
 * Can we benefit from specific arrangements of stuff?

Other fun tricks

 * experiments with using XMLHttpRequest and chunked encoding to start executing JavaScript _while_ it's loading
 * looks like a clever hack! :) Primary use is probably in Comet-style things (like the RealTime system; I think Meteor actually uses a similar technique as one of its transports)