As part of one of my day jobs, I have had to yet again bash together a set of REST APIs. This is so we can start wiring a proper AWS-style, scalable microservices architecture into the monolithic beast that is the current incarnation of the software I’m working on.

Anyway, here are a few gotchas if you intend to start using the proper REST-style HTTP verbs (PUT/PATCH/DELETE), rather than doing everything via GET and POST like almost everyone else.

No easy way to access variables

If you’re familiar with the standard $_POST mechanism for accessing passed variables, you’ll be disappointed to learn that PHP doesn’t, by default, provide a nice way of accessing these for PUT and PATCH.

So, you’re going to have to extract them yourself. It’s not overly tricky, but it is irritating.
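For example, a minimal sketch (it assumes the client sends an application/x-www-form-urlencoded body, as a browser form would — other content types would need their own decoding):

```php
<?php
// PHP only populates $_POST for POST requests, so for PUT and PATCH
// we have to read the raw request body from php://input and decode
// the url-encoded variables ourselves.
function parse_request_body($raw) {
    parse_str($raw, $vars);
    return $vars;
}

$method = isset($_SERVER['REQUEST_METHOD']) ? $_SERVER['REQUEST_METHOD'] : 'GET';

if ($method === 'PUT' || $method === 'PATCH') {
    // e.g. a body of "title=Hello&tags[]=a&tags[]=b" becomes
    // array('title' => 'Hello', 'tags' => array('a', 'b'))
    $vars = parse_request_body(file_get_contents('php://input'));
} else if ($method === 'POST') {
    $vars = $_POST;
} else {
    $vars = $_GET;
}
```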

Requests not coming through

If you find that your APIs work fine on your local machine but break when deployed, you might want to check your server configuration.

It is quite common for web servers (especially on shared hosts) to block HTTP verbs other than the most common GET and POST. ModSecurity’s default configuration definitely blocks these methods.

You should also check any proxies or load balancers you have in front of your REST endpoint; these may need some configuration tweaks as well.
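As an illustration, a shared host’s Apache configuration will often contain something along these lines (a hypothetical snippet, not taken from any particular host), which quietly rejects everything except the listed methods:

```apache
# Anything not listed in LimitExcept is denied,
# so PUT, PATCH and DELETE all receive a 403.
<LimitExcept GET POST HEAD>
    Order deny,allow
    Deny from all
</LimitExcept>
```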

Hopefully this will save you some time and frustration!

Back in July I gave a talk at Oxford Geek Nights entitled “#DigitalBritain fail”, in which I discussed the Digital Britain report and some of its many shortcomings.

One of the potential courses of action I suggested that people could take was to essentially smile, say “that’s nice dear” and continue innovating. To take the typically open source approach adopted by the guys at OpenStreetMap (among others) and recreate proprietary datasets in the public domain.

I was therefore delighted when I came across the guys at Ernest Marples, who were attempting to provide a free version of the postcode-to-location database.

As a bit of background: in the UK the state (via Royal Mail Holdings, of which the state is the sole shareholder) has a monopoly on all postcode-to-location lookups. This monopoly is protected by Crown copyright and a royal charter, which basically means that even though the dataset was produced using taxpayers’ money it is owned by the Crown (in the case of Crown copyright), and the charter means that nobody else is permitted to provide the same service.

This means that in order to do anything with postcodes you need to pay a licence fee to the Post Office, pricing the small players out of the game or limiting them to using a service provider such as Yahoo (which has its own terms of use). A similar situation exists for geolocation in general, but in that instance you have to pay the Ordnance Survey.

This situation is archaic and was a hot topic at Barcamp Transparency. Data produced with taxpayers’ money should be freely available to all, and I had hoped that the dissolution of Crown copyright would have been one of the first things the Digital Britain report recommended.

Yesterday, Ernest Marples announced on their blog that they were shutting down their service in the face of a legal challenge from Royal Mail, who pretty much accused them of stealing their database. Although the Ernest Marples guys were a little cagey about where they got their data (with hindsight this was probably a mistake), they did explicitly state that they were not using the Royal Mail database in any way.

Under the terms of the charter, however, they are simply not permitted to provide this service and compete with Royal Mail, and this is the basis of the legal challenge.

I am saddened to see this promising project go, and especially sorry to see that they don’t have the funds to get their day in court. A court case of this nature could provide a useful forum to hold a long overdue debate as to the relevance of the charter and crown copyright in general in the twenty-first century.

Crown copyright is a problem (as well as being morally dubious), and a monopoly is always bad (especially when state-enforced). It is sad to see promising UK innovation stifled by entrenched interests, but it seems to be a recurring theme in modern Britain. As we have just seen, it puts severe limits on just how far a project can go in opening up and recreating data sets, and this worries me.

I wish the project and its organisers all the best for the future.

Top image “postbox_20may2009_0830” by Patrick H. Lauke

Another one of Elgg‘s less documented but very powerful features is the ability to expose functionality from the core and user modules in a standard way, via a REST-like API.

This gives you the opportunity to develop interoperable web services and provide them to the users of your site, all in a standardised way.

The endpoint

To make an API call you must direct your query at a special URL. The query will be either a GET or a POST (depending on the command you are executing), and the specific endpoint you use depends on the format you want the return value in.

The endpoint:

http://yoursite.com/pg/api/[protocol]/[return format]/

Where:

  • [protocol] is the protocol being used; in this case, and for the moment, only “rest” is supported.
  • [return format] is the format you want your information returned in, either “php”, “json” or “xml”.

This endpoint should then be passed the method and any parameters as GET variables, so for example:

http://yoursite.com/pg/api/rest/xml/?method=test.test&myparam=foo&anotherparam=bar

Would pass “foo” and “bar” as the given named parameters to the function “test.test” and return the result in XML format.
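Building these query strings by hand gets error-prone quickly; a small helper along these lines (hypothetical, not part of Elgg) keeps things tidy:

```php
<?php
// Hypothetical helper: build an Elgg REST API URL for a given site,
// return format, method name and set of named parameters.
function build_api_url($site, $format, $method, array $params = array()) {
    // "method" always comes first; remaining parameters are appended
    // and url-encoded by http_build_query().
    $params = array('method' => $method) + $params;
    return rtrim($site, '/') . '/pg/api/rest/' . $format . '/?'
        . http_build_query($params);
}
```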

Notice here also that the API uses the “pg” page handler extension; this means it would be a relatively simple matter to add a new API protocol, or to replace the entire API subsystem in a module – should you be so inclined.

Return result

The result of the API call will be an entity encoded in your chosen format.

This entity will have a “status” parameter – zero denotes success, non-zero denotes an error. Result data will be in the “result” parameter. You may also receive some messages and debug information.
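For instance, a successful call returning JSON might look roughly like this (a hypothetical sketch of the shape; the exact fields beyond “status” and “result” may vary between versions):

```json
{
    "status": 0,
    "result": {
        "foo": "bar"
    }
}
```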

Exporting a function

Any Elgg function – core or module – can be exposed via the API. All you have to do is declare it using expose_function() from within your code, passing the method name, handler and any parameters (note that these parameters must be declared in the same order as they appear in your function).
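A sketch of what this looks like in a module (the handler name is made up, and the name => array('type' => …) parameter-spec format is an assumption – check your Elgg version’s documentation for the exact signature):

```php
<?php
// The handler is an ordinary PHP function; its parameters must be
// declared to expose_function() in the same order as they appear in
// the function signature.
function my_test_method($myparam, $anotherparam) {
    return "$myparam:$anotherparam";
}

// Guarded so the sketch also runs outside Elgg; inside a module you
// would just call expose_function() directly.
if (function_exists('expose_function')) {
    expose_function(
        'test.test',        // API method name clients will call
        'my_test_method',   // handler function
        array(              // parameter spec, in declaration order
            'myparam'       => array('type' => 'string'),
            'anotherparam'  => array('type' => 'string'),
        )
    );
}
```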

Listing functions

You can see a list of all registered functions using the built-in API command “system.api.list”; this is also a useful test to see whether your client is configured correctly.

E.g.

http://yoursite.com/pg/api/rest/xml/?method=system.api.list

Authorising and authenticating

Most commands will require some form of authorisation in order to function. There are two main types of authorisation: protocol level, which determines whether a given client is permitted to connect, and user level, whereby a user requires a special token passed in lieu of a username and password.

Protocol level authentication
Protocol level authentication is a way to ensure that commands only come from approved clients to which you have previously given keys. This is in keeping with many web-based API systems, and permits you to disconnect clients who abuse your system, or to track usage for accounting purposes.

The client must send an HMAC signature together with a set of special HTTP headers when making a call. This ensures that the API call is being made by the stated client and that the data has not been tampered with.

Eagle-eyed readers with long memories will see a lot of similarity with the ElggVoices API I wrote about previously.

The HMAC must be constructed over the following data:

  • The Secret Key provided by the target Elgg install (as provided easily by the APIAdmin plugin).
  • The current Unix time in microseconds as a floating point decimal, as produced by microtime(true).
  • Your API key identifying you to the Elgg api server (companion to your secret key).
  • A URL-encoded string representation of any GET variable parameters, e.g. “method=test.test&foo=bar”.
  • If you are sending post data, the hash of this data.

Some extra information must be added to the HTTP header in order for this data to be correctly processed:

  • X-Elgg-apikey – The API key (not the secret key!)
  • X-Elgg-time – Microtime used in the HMAC calculation
  • X-Elgg-hmac – The HMAC as hex characters.
  • X-Elgg-hmac-algo – The algorithm used in the HMAC calculation – eg, sha1, md5 etc

If you are sending POST data you must also send:

  • X-Elgg-posthash – The hash of the POST data.
  • X-Elgg-posthash-algo – The algorithm used to produce the POST data hash – eg, md5.
  • Content-type – The content type of the data you are sending (if in doubt use application/octet-stream).
  • Content-Length – The length in bytes of your POST data.

Much of this will be handled for you if you use the built-in Elgg API Client.
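If you are writing your own client, the header construction for a GET call can be sketched like this (the concatenation order fed to the HMAC is an assumption based on the list above – verify it against the server implementation of your Elgg version before relying on it):

```php
<?php
// Build the X-Elgg-* authentication headers for a GET API call.
// $query is the url-encoded parameter string, e.g. "method=test.test&foo=bar".
function build_auth_headers($api_key, $secret_key, $query) {
    $time = (string) microtime(true);

    // HMAC keyed with the secret key, over time + API key + parameters.
    // (Assumed order - check against your server's API code.)
    $data = $time . $api_key . $query;
    $hmac = hash_hmac('sha1', $data, $secret_key);

    return array(
        'X-Elgg-apikey'    => $api_key,   // the public key, never the secret
        'X-Elgg-time'      => $time,
        'X-Elgg-hmac'      => $hmac,      // hex characters
        'X-Elgg-hmac-algo' => 'sha1',
    );
}
```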

User level tokens

User level tokens are used to identify a specific user on the target system, in much the same way as if they were to log in with their username and password, but without the need to send those credentials with every API call.

Tokens are time limited, and so it will be necessary for your client to periodically refresh the token they use to identify the user.

Tokens are generated using the API command “auth.gettoken”, passing the username and password as parameters, e.g.:

http://yoursite.com/pg/api/rest/xml/?method=auth.gettoken&username=foo&password=bar

Anonymous methods
Anonymous methods (such as “system.api.list”) can be executed without any form of authentication, accepting connections from any client regardless of whether a user token is provided. This is useful in certain situations, but it goes without saying that you shouldn’t expose sensitive functionality this way.

To do so set $anonymous=true in your call to expose_function().

Image “In UR Reality” by XKCD