HTTPClient interface

From StatusNet


Quick implementation notes as of 2009-11-01... --brion 14:54, 2 November 2009 (UTC)

HTTPClient is an extended version of HTTP_Request2. On top of the base package, it adds automatic following of redirects; this is done portably and should work equally well on the socket-based and CURL-based backends.

I went ahead and changed the existing file_get_contents() and CURL usages I could find in StatusNet core and the included plugins, except where they're in externally sourced libraries. Yadis, the Facebook REST library, and a couple of others still use their own methods.

I've started working on memcached-backed caching in a branch but won't commit it to 0.9.x until unit tests are in place to make sure it's working. :)


Usage

Simple:

 $request = HTTPClient::start();
 $response = $request->get('http://example.com/');
 // $response = $request->post('http://example.com/submit', null, $vars);
 // Connection errors will throw an exception.
 if ($response->isOk()) {
     do_something($response->getBody());
 }

Fancy:

 $request = new HTTPClient('http://example.com/', 'POST');
 $request->setConfig(array('follow_redirects' => false));
 $request->setHeader('X-Thingy: foo');
 $request->addPostParameter($vars);
 try {
   $response = $request->send();
   if ($response->isOk()) {
     // isOk() is an added convenience method on HTTPResponse,
     // which checks the HTTP status code; a 20x is considered OK.
     do_something($response->getBody());
   }
 } catch (HTTP_Request2_Exception $e) {
   // Connection failures will throw an exception.
   // Errors returned successfully over HTTP do not throw,
   // so check $response->getStatus() after send() for those.
 }

Extensions over HTTP_Request2

  • logging successes and failures via StatusNet's logging infrastructure
  • Automatic following of redirects unless disabled in request config; defaults are:
    • 'follow_redirects' => true
    • 'max_redirs' => 10
    • loop detection might be wise, but the max will do for now
  • convenience functions on HTTPClient:
    • get(), post(), head() quickie functions to consolidate a couple of steps
  • convenience functions on HTTPResponse:
    • getUrl(), getRedirectCount() to aid in following redirects: getUrl() returns the final URL, getRedirectCount() the number of redirects followed
    • isOk() to check if returned status is OK without hard-coding checks for 200
  • caching in memcached (future)
    • if-modified-since, etag support
    • obey expiration times from expires, cache-control headers
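
To make the redirect behavior above concrete, here is a minimal sketch of capped redirect following in plain PHP. It is not the HTTPClient implementation: fetch_following_redirects(), is_ok_status(), and the $fetch callback (a stand-in for one HTTP round trip) are hypothetical names, the array-shaped response is an assumption, and Location values are assumed to be absolute URLs. The visited-URL check illustrates the loop detection idea from the list, which is noted there as not yet implemented. Treating any 2xx status as OK is likewise an assumption about what isOk() accepts.

```php
<?php
// Hypothetical helper: any 2xx status counts as success (assumption),
// so callers do not need to hard-code a check for 200.
function is_ok_status($status)
{
    return $status >= 200 && $status < 300;
}

// Hypothetical sketch of capped redirect following.
// $fetch takes a URL and returns
//   array('status' => int, 'headers' => array(lowercase name => value), 'body' => string).
function fetch_following_redirects($fetch, $url, $maxRedirs = 10)
{
    $redirects = 0;
    $seen = array($url => true); // loop detection; an addition for illustration

    while (true) {
        $response = call_user_func($fetch, $url);
        $isRedirect = in_array($response['status'], array(301, 302, 303, 307), true)
            && isset($response['headers']['location']);

        if (!$isRedirect) {
            // Record where we ended up and how many hops it took.
            $response['url'] = $url;
            $response['redirect_count'] = $redirects;
            return $response;
        }
        if ($redirects >= $maxRedirs) {
            throw new RuntimeException("Too many redirects (limit $maxRedirs)");
        }
        $url = $response['headers']['location']; // assumed to be an absolute URL
        if (isset($seen[$url])) {
            throw new RuntimeException("Redirect loop detected at $url");
        }
        $seen[$url] = true;
        $redirects++;
    }
}

// Usage with a fake transport, so no network access is needed.
$pages = array(
    'http://example.com/a' => array(
        'status' => 302,
        'headers' => array('location' => 'http://example.com/b'),
        'body' => '',
    ),
    'http://example.com/b' => array('status' => 200, 'headers' => array(), 'body' => 'hello'),
);
$fetch = function ($url) use ($pages) {
    return $pages[$url];
};
$result = fetch_following_redirects($fetch, 'http://example.com/a');
```

Passing the transport in as a callback keeps the redirect logic independent of the backend, which is the same property that lets redirect following work on both the socket and CURL backends.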

Planning

Evan originally sent the message below to the mailing list. HTTPClient is now part of the 0.9.x branch, using the PEAR HTTP_Request2 package as a backend, and most old CURL or file_get_contents() usages have been reworked to use it. HTTP_Request2 supports both native sockets and CURL.

Hello, everyone. As people who've poked around in the Laconica codebase know, we have a lot of different HTTP client access methods: curl, file_get_contents(), the Yadis tool's HTTPFetcher, and maybe even others.

In Laconica 0.9.x, I would like to have a single HTTP client class that all HTTP client code uses (except our external libraries, of course).

The benefit to admins is that they don't have to worry about dependencies. For developers, our code becomes more readable. Also, improvements to the client class will improve all HTTP-using code.

I see a few options:

  • Use the PEAR HTTP_Client or HTTP_Request (or effin' HTTP_Request2... stupid PEAR) tools. Of these, I like HTTP_Request2, since it seems to use pluggable engines. http://pear.php.net/packages.php?catpid=11&php=all
  • Always use curl, and always require curl. Maybe not such a bad idea, although I think the curl programming interface is horrible.
  • Use PECL's HTTP client class, and require it.
  • Write our own wrapper class, which uses whichever is the best available HTTP client method.

Some things I'd like to add to our HTTP client functionality, no matter what:

  • Caching of HTTP results in memcached, using If-Modified-Since and If-None-Match to fetch from a server only if the cache is out of date.
  • Follow HTTP redirects automatically.
  • Don't get into HTTP redirect loops.
  • Sane timeouts and size limits for requests.
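
As an illustration of the caching item in the list above, a conditional request could be put together roughly like this. This is a sketch under assumptions, not code from HTTPClient: conditional_headers(), resolve_response(), and the shape of the cache entry are hypothetical, response header names are assumed to be lowercased, and the memcached storage itself is left out.

```php
<?php
// Hypothetical cache entry shape (assumption):
//   array('body' => string, 'etag' => string|null, 'last_modified' => string|null)

// Build the validators to send with a request when we already hold a cached copy.
function conditional_headers($cached)
{
    $headers = array();
    if ($cached === null) {
        return $headers; // nothing cached, so make a plain request
    }
    if (!empty($cached['etag'])) {
        $headers['If-None-Match'] = $cached['etag'];
    }
    if (!empty($cached['last_modified'])) {
        $headers['If-Modified-Since'] = $cached['last_modified'];
    }
    return $headers;
}

// Decide what to keep after the server answers.
// A 304 means the cached copy is still good; anything else replaces it.
function resolve_response($cached, $status, $headers, $body)
{
    if ($status === 304 && $cached !== null) {
        return $cached;
    }
    return array(
        'body' => $body,
        'etag' => isset($headers['etag']) ? $headers['etag'] : null,
        'last_modified' => isset($headers['last-modified']) ? $headers['last-modified'] : null,
    );
}

// Usage: revalidate an existing entry, then handle both possible outcomes.
$cached = array(
    'body' => 'old',
    'etag' => '"abc"',
    'last_modified' => 'Sun, 01 Nov 2009 00:00:00 GMT',
);
$requestHeaders = conditional_headers($cached);
$unchanged = resolve_response($cached, 304, array(), '');
$updated = resolve_response($cached, 200, array('etag' => '"def"'), 'new');
```

The server only sends a full body when the content has changed, so a 304 costs a round trip but no transfer.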

My current feeling is that we should have our own wrapper class. Yet more extra code, but it seems like the most efficient, best chance for clean code, and best chance to use available resources effectively.
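
The "whichever is the best available" choice in a wrapper like this could be sketched as follows. The function name pick_http_backend(), the 'curl' and 'socket' labels, and the preference for curl when it is loaded are all assumptions made for illustration; the message above does not specify an order.

```php
<?php
// Hypothetical backend selection for a wrapper class.
// $hasCurl can be passed explicitly (handy for testing); when null,
// it is detected from the running PHP installation.
function pick_http_backend($hasCurl = null)
{
    if ($hasCurl === null) {
        $hasCurl = extension_loaded('curl');
    }
    // Prefer curl when the extension is present (assumed preference),
    // otherwise fall back to plain sockets.
    return $hasCurl ? 'curl' : 'socket';
}

$backend = pick_http_backend();
```

Because the fallback needs no extension, an admin without curl installed still gets a working client, which is the dependency benefit described earlier in the message.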
