Apache and RESTful Web Services

  
  
  

REST (REpresentational State Transfer) services are a lightweight alternative to SOAP or RPC that use HTTP requests to make queries (GET), create or update data (POST, PUT), or erase data (DELETE). You can install a full-scale framework to implement RESTful web services, but if you only need a few services, some lines of PHP code plus some Apache configuration work offer a viable alternative. In this article we'll focus on providing access to RESTful web services through Apache, either by using aliases, a simple approach that requires extra dispatching code, or through URL rewriting, which is somewhat more complex but requires no extra code. In both cases, you'll be up and running with little coding, and you'll see some Apache configuration techniques you can use for other purposes.

For this article I created three web services to deliver geographical information. I used data regarding countries (identified by a unique code, such as UY for Uruguay, my home, as given in ISO 3166), their regions (identified by a composite key: the country code plus a region code), and cities (with a pure ASCII name, an accented version of the name that may include foreign characters, a population if known, a latitude, and a longitude). The countries data can be found at http://www.maxmind.com/app/iso3166; check http://www.maxmind.com/app/iso3166_2 for the USA and Canada region codes, and http://www.maxmind.com/fips10-4.csv for the rest of the world; and go to http://www.maxmind.com/app/worldcities for the cities themselves.

Here are the data structures I created in MySQL:

CREATE SCHEMA world
    DEFAULT CHARACTER SET=latin1
    DEFAULT COLLATE=latin1_general_ci;

USE world;

CREATE TABLE countries (
  countryCode char(2) NOT NULL,
  countryName varchar(50),
  PRIMARY KEY (countryCode),
  KEY countryName (countryName)
) ENGINE=MyISAM COMMENT='Countries of the World';

CREATE TABLE regions (
  countryCode char(2) NOT NULL,
  regionCode char(2) NOT NULL,
  regionName varchar(50),
  PRIMARY KEY (countryCode,regionCode),
  KEY regionName (regionName)
) ENGINE=MyISAM COMMENT='Regions, per country';

CREATE TABLE cities (
  countryCode char(2) NOT NULL,
  cityName varchar(50),
  cityAccentedName varchar(50),
  regionCode char(2) NOT NULL,
  population bigint(20),
  latitude float(10,7),
  longitude float(10,7),
  KEY countryRegion (countryCode, regionCode),
  KEY cityName (cityName),
  KEY cityAccentedName (cityAccentedName)
) ENGINE=MyISAM;

The cities table lacks a proper key, so we have to invent one; after we load the data for all tables, we can add it with the SQL command:

ALTER TABLE cities
    ADD cityCode BIGINT NOT NULL AUTO_INCREMENT FIRST,
    ADD PRIMARY KEY (cityCode);

Now let's consider what services to provide.

Planning the REST services

REST isn't truly a standard, but rather a design style for web services. Such services are HTTP-based and platform- and programming language-independent, and mesh well with AJAX and modern RIAs (Rich Internet Applications). A REST architecture is based upon resources, themselves identified by URIs (Uniform Resource Identifiers, or URLs as used everywhere on the Internet). Accessing a URI with, say, a GET HTTP method gets you a representation of the corresponding resource, while using a PUT method lets you update the resource. This may sound abstract, but our example should help clarify things.

Given our data, we must provide access to three collections (countries, regions, cities). We also want to be able to access not only the complete sets (all the countries) but also more specific data (all of Uruguay, or maybe just its capital, Montevideo). We'd also like to get several entities at the same time (for instance, all the "Mercosur" countries – Argentina, Brazil, Paraguay, and Uruguay – in a single request) and be able to run searches (for instance, all countries with "GUA" in their names). The following table lists the paths to some possible URIs and what they refer to.

The URIs for our examples

URI                               Expected result
--------------------------------  ------------------------------------------------------
/countries                        All countries
/countries/UY                     Data of Uruguay ("UY" code)
/countries/AR,BR,PY,UY            Data for the four Mercosur countries
/countries?countrynamelike=GUA    All countries that include GUA in their names
/regions                          All regions of all countries
/regions/UY                       All regions of Uruguay
/regions/UY/10                    Data of a specific region of Uruguay
/regions/UY/10,14,15              Data of three regions of Uruguay
/regions/UY?regionnamelike=G      Data of the Uruguay regions with a G in their names
/cities                           List of all cities – this is big, don't try it at home!
/cities/153585                    Data of the single city identified by the given code; in this example, Darwin in Australia
/cities?citynamelike=DARWIN       All cities with "DARWIN" in their names
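
To make the mapping from URI to data more concrete, here is a minimal sketch of how a request such as /countries/AR,BR,PY,UY?countrynamelike=GUA could be turned into a parameterized query against the countries table. It is written in Python for illustration only (the article's services are in PHP), and the function name build_country_query is hypothetical. It assumes a DB-API driver that uses %s placeholders, and it does not escape LIKE wildcards inside the search text.

```python
def build_country_query(codes_segment="", name_like=None):
    """Return (sql, params) for a GET on the countries collection.

    codes_segment is the comma-separated path part ("AR,BR,PY,UY"), and
    name_like is the value of the countrynamelike query parameter.
    """
    sql = "SELECT countryCode, countryName FROM countries"
    conditions, params = [], []

    # Split "AR,BR,PY,UY" into individual codes, ignoring empty pieces.
    codes = [c.strip() for c in codes_segment.split(",") if c.strip()]
    if codes:
        placeholders = ", ".join(["%s"] * len(codes))
        conditions.append(f"countryCode IN ({placeholders})")
        params.extend(codes)

    # A "like" search matches the text anywhere in the name.
    if name_like:
        conditions.append("countryName LIKE %s")
        params.append(f"%{name_like}%")

    if conditions:
        sql += " WHERE " + " AND ".join(conditions)
    return sql + " ORDER BY countryCode", params
```

Passing the values as parameters rather than pasting them into the SQL string keeps the path and query string from being interpreted as SQL.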

Accessing the URIs with a GET method would produce an XML representation of their values. (You could ask for, say, JSON, by using the Accept header or the query string, but we won't deal with that.) A DELETE method would delete the resources, but in our particular case, we have forbidden deleting complete sets, since any such request would likely be a mistake. Finally, a PUT method updates the corresponding resource (or creates it, if it doesn't exist), and the POST method creates a new resource. POSTing a resource requires the server to provide the resource key; since countries and regions have standard codes, we won't allow keyless requests.

The GET, PUT, and DELETE requests are said to be idempotent, meaning that repeating the same request leaves the server in the same state as making it once; POST requests may not be. That is, if you PUT a resource several times, only the first request will actually change anything; the rest will just set the resource to the same values. On the other hand, every time you POST a resource without giving its key, the system will assign a new key on its own and send it back to you, so each repetition creates another resource. If you know the resource's key, PUT and POST will have the same effect here, but conceptually they aren't the same.
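
The difference can be illustrated with a small in-memory sketch. This is Python written for illustration, not the article's PHP code; the CityStore class and its methods are hypothetical stand-ins for the cities service.

```python
import itertools


class CityStore:
    """Toy stand-in for the cities collection (illustrative only)."""

    def __init__(self):
        self._cities = {}
        self._next_code = itertools.count(1)

    def put(self, code, data):
        # PUT names the resource: repeating it leaves the same state behind.
        created = code not in self._cities
        self._cities[code] = dict(data)
        return 201 if created else 204

    def post(self, data):
        # POST without a key: the server picks an unused key every time.
        code = next(self._next_code)
        while code in self._cities:
            code = next(self._next_code)
        self._cities[code] = dict(data)
        return 201, code

    def __len__(self):
        return len(self._cities)
```

Calling put(10, ...) twice leaves a single city with code 10, while calling post(...) twice with the same data leaves two cities with different codes.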

REST services use common HTTP return codes to signal the results of operations. We will be using the following values, with the given meanings. Note that we could also be considering authorization, and possibly return a 401 UNAUTHORIZED code for some operations.

HTTP return codes used by our services

Code  Meaning             Example
----  ------------------  ------------------------------------------------------------
200   OK                  Did a GET for a country and gave the correct code
201   CREATED             Did a POST for a city and it could be created
204   NO CONTENT          Did a DELETE for a region and it could be deleted; no results are sent
400   BAD REQUEST         Asked for something other than a GET, DELETE, PUT, or POST
403   FORBIDDEN           Wanted to DELETE all countries
404   NOT FOUND           Asked for a country, but gave a wrong code
405   METHOD NOT ALLOWED  Tried to POST a new country without giving the country code
409   CONFLICT            Tried to create a country but something happened, possibly because of other users

The data for a country would include its code and name, plus the collection of its regions, given by their appropriate URIs. For example, the data returned for Uruguay (code UY) should be something like the following. In usual REST fashion, the data for regions isn't given in full (though you could do that if required); instead, a link is provided so the client can access it on its own.

<?xml version='1.0' encoding='UTF-8'?>
<countries>
    <country>
        <link rel='Uruguay' href='http://192.168.1.200/rest_alias/countries/UY' />
        <code>UY</code>
        <name>Uruguay</name>
        <regions>
            <link rel='Artigas' href='http://192.168.1.200/rest_alias/regions/UY/01' />
            <link rel='Canelones' href='http://192.168.1.200/rest_alias/regions/UY/02' />
            ...
            several lines snipped
            ...
            <link rel='Tacuarembo' href='http://192.168.1.200/rest_alias/regions/UY/18' />
            <link rel='Treinta y Tres' href='http://192.168.1.200/rest_alias/regions/UY/19' />
        </regions>
    </country>
</countries>

Similarly, the results for the "153585" city GET request would be along the lines of the following:

<?xml version='1.0' encoding='UTF-8'?>
<cities>
    <city>
        <link rel='darwin' href='http://192.168.1.200/rest_alias/cities/153585' />
        <country href='http://192.168.1.200/rest_alias/countries/au' />
        <region href='http://192.168.1.200/rest_alias/regions/au/03' />
        <code>153585</code>
        <name>darwin</name>
        <accentedName>Darwin</accentedName>
        <pop>93081</pop>
        <coordinates lat='-12.4572201' long='-12.4572201' />
    </city>
</cities>
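
As a rough illustration of how such a representation can be produced, the following Python sketch builds the country document shown earlier with the standard xml.etree.ElementTree module. The function name country_to_xml and its parameters are hypothetical, and the article's own services (written in PHP) may assemble the XML differently.

```python
import xml.etree.ElementTree as ET


def country_to_xml(base_url, code, name, regions):
    """Build the <countries> document for one country.

    regions is a list of (region_code, region_name) pairs; each region is
    emitted as a link only, so the client can fetch it on its own.
    """
    root = ET.Element("countries")
    country = ET.SubElement(root, "country")
    ET.SubElement(country, "link", rel=name,
                  href=f"{base_url}/countries/{code}")
    ET.SubElement(country, "code").text = code
    ET.SubElement(country, "name").text = name

    regions_el = ET.SubElement(country, "regions")
    for region_code, region_name in regions:
        ET.SubElement(regions_el, "link", rel=region_name,
                      href=f"{base_url}/regions/{code}/{region_code}")

    # encoding="unicode" returns a str; the XML declaration is not included.
    return ET.tostring(root, encoding="unicode")
```

For example, country_to_xml("http://192.168.1.200/rest_alias", "UY", "Uruguay", [("01", "Artigas"), ("02", "Canelones")]) yields a document with one link per region.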

Finally, a mere access to the base address without anything else could provide a list of links to the given collections, but that's usually not too relevant, so we won't bother. Let's now turn to the services coding, and then move on to enabling access to them through Apache.

Implementing the web services

I've provided full source code in ws_countries.php, ws_regions.php, ws_cities.php, and commonsetup.php, so we'll just skim over the structure of our three services. The structure is pretty much the same for all of them; for cities, there is a slight difference between PUT and POST, because a POST without giving a city code is allowed. Let's analyze the countries service:

get all parameters, such as country code
if the method is DELETE:
    if the country code wasn't given, send back a 400 BAD REQUEST code; deleting the whole collection isn't allowed
    if the country cannot be deleted without affecting other tables' integrity, send back a 403 FORBIDDEN code
    if the country doesn't exist, send back a 404 NOT FOUND code
    otherwise, delete the country and send back a 204 code

if the method is GET:
    if no countries match the country codes and search queries, send back a 404 NOT FOUND code
    otherwise, build an XML string and send it back with a 200 OK code

if the method is POST or PUT:
    if the country code wasn't given, send back a 405 METHOD NOT ALLOWED code
    if any resource parameters are missing, send back a 403 FORBIDDEN code with an appropriate explanation
    try creating the country; on success, send back a 201 CREATED code, and a Location header pointing to the new resource
    otherwise, try updating the country; on success, send back a 204 NO CONTENT code
    otherwise, something weird is happening; send back a 409 CONFLICT code

if the method is any other, send back a 400 BAD REQUEST
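
The outline above can be sketched as a single function. The version below is in Python and works against an in-memory dictionary instead of MySQL, so it is only an illustration of the decision flow, not the code in ws_countries.php. The name handle_country, the referenced set (countries that other tables still point to), and the returned tuples are all assumptions. The search filter and the 409 CONFLICT branch are left out, because a dictionary update cannot fail the way a database write can.

```python
def handle_country(method, countries, referenced, code=None, name=None):
    """Return (status, payload) following the outline for the countries service.

    countries maps country code -> name; referenced holds codes that cannot
    be deleted without breaking other tables.
    """
    if method == "DELETE":
        if code is None:
            return 400, None            # deleting the whole collection
        if code in referenced:
            return 403, None            # would break integrity elsewhere
        if code not in countries:
            return 404, None
        del countries[code]
        return 204, None

    if method == "GET":
        wanted = code.split(",") if code else list(countries)
        found = {c: countries[c] for c in wanted if c in countries}
        return (200, found) if found else (404, None)

    if method in ("POST", "PUT"):
        if code is None:
            return 405, None            # keyless requests are not allowed
        if not name:
            return 403, "missing country name"
        if code not in countries:
            countries[code] = name
            return 201, f"/countries/{code}"   # value for a Location header
        countries[code] = name
        return 204, None

    return 400, None                    # any other method
```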

Working with alias and rewrite
Before trying out the examples in this article, you must have the required "alias" and "rewrite" modules loaded in your Apache setup. Use the apache2ctl -M command to see if they are already present:

# apache2ctl -M | grep -E "alias|rewrite"
 alias_module (shared)
 rewrite_module (shared)

If the modules do not show up, you'll have to reconfigure Apache. The way to do this depends on your specific distro; for example, in OpenSUSE, I had to edit the /etc/sysconfig/apache2 file, change the APACHE_MODULES="actions auth_basic ..." line so it would include both alias and rewrite, and then restart Apache with /etc/init.d/apache2 restart.

A slight enhancement: if a variable like _method=PUT is passed to the service, then the request is assumed to be a PUT, instead of whatever it actually was. This helps us test the services directly from the browser, as we'll see. (And if you use a debugger such as Firebug, you can also inspect the returned headers.) I created the required files in the Apache document root, in a directory called services. Given that my server is at IP 192.168.1.200, I could easily test services by navigating to addresses such as http://192.168.1.200/services/ws_countries.php?countrycode=UY or http://192.168.1.200/services/ws_regions.php?_method=PUT&countrycode=UY&regioncode=10&regionname=Montevideo. Of course these URLs are not really REST-y in the sense we saw above.
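
The _method override takes only a few lines. The following Python sketch shows the idea; the function name effective_method is hypothetical, and ignoring unknown override values is an assumption rather than something the article's PHP code is known to do.

```python
from urllib.parse import urlsplit, parse_qsl

ALLOWED_METHODS = {"GET", "POST", "PUT", "DELETE"}


def effective_method(request_method, url):
    """Return the method the service should act on for this request."""
    params = dict(parse_qsl(urlsplit(url).query))
    override = params.get("_method", "").upper()
    # Assumption: an unrecognized override is ignored, not treated as an error.
    if override in ALLOWED_METHODS:
        return override
    return request_method.upper()
```

With the second test URL above, a plain browser GET is handled as a PUT; without the _method parameter the real request method is used.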

Accessing the web services through URL aliases

There are two ways of allowing the REST URIs through Apache: through an alias that will send all requests to a specific dispatcher program that will have to "interpret" the URI and route the request to the appropriate service code, or through URL rewrites that transform a REST-like URL into something Apache can work with. Let's start by trying an alias, which is more straightforward.

We'll plan to use paths like 192.168.1.200/rest_alias/countries?countrynamelike=GUA and 192.168.1.200/rest_alias/regions/UY/10. To do so, add the line Alias /rest_alias "/srv/www/htdocs/services/router.php" to the default-server.conf configuration file, then restart the Apache service. This line implies that all requests to a path that starts with /rest_alias will be rerouted to router.php, which will then deal with the request. The router code can get the original path (that is, /rest_alias/countries?countrynamelike=GUA or /rest_alias/regions/UY/10) and proceed with a logic along the lines of the following pseudocode:

get the original path
extract from it a query string, if present
split the remainder along the slashes, so:
  - the first element is "rest_alias"
  - the second element ("countries," say) points to the required service
  - the rest of the elements are parameters to the service
if the required service exists, call it and pass all parameters to it
otherwise, return a 404 NOT FOUND error status code
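
The same steps can be sketched in Python. This is not the article's router.php; the function name route and the SERVICES table are hypothetical. The countrycode and regioncode parameter names are the ones the services use elsewhere in this article, while citycode is an assumed name.

```python
from urllib.parse import urlsplit, parse_qsl

# Service name -> names of the positional path parameters it accepts.
SERVICES = {
    "countries": ("countrycode",),
    "regions": ("countrycode", "regioncode"),
    "cities": ("citycode",),   # assumed parameter name
}


def route(original_path):
    """Return (status, service, params) for a /rest_alias/... request."""
    parts = urlsplit(original_path)
    params = dict(parse_qsl(parts.query))         # the query string, if any
    segments = [s for s in parts.path.split("/") if s]

    if len(segments) < 2 or segments[0] != "rest_alias":
        return 404, None, {}
    service, values = segments[1], segments[2:]

    names = SERVICES.get(service)
    if names is None or len(values) > len(names):
        return 404, None, {}

    # The remaining path elements become named parameters for the service.
    params.update(zip(names, values))
    return 200, service, params
```

A real dispatcher would go on to call the chosen service with params; here the function only reports what it would call.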

Whenever the Apache server receives a request for a resource, on matching the "/rest_alias" part, it passes the request to our router code to do the final work and invoke the actual required service. Normally, Apache itself calls services, but in this case, the router code deals with that.

This is a simple technique that requires little modification to your Apache setup, but a little extra coding (the dispatcher source code is fewer than 20 lines long), so it works well for a small number of services. Adding new resources requires adding a bit to the code, but it's not complex; see the official documentation for more information.

Let's now turn to a different solution that requires a bit more Apache configuration but works without any extra dispatching code.

Accessing the web services through URL rewriting

The rewrite module for Apache lets you do far more than just create aliases, and is able to do all kinds of work with a request string. Before you can use it you must edit your default-server.conf file and add several lines, as below. The first three lines enable the rewriting engine, set the logging level to a reasonable value, and define the logging file to which the module will report whatever it does. (This is just about the only debugging tool you'll have with the rewrite module, so don't sneer at it! Note that the RewriteLog and RewriteLogLevel directives belong to Apache 2.2; Apache 2.4 dropped them in favor of the LogLevel directive with a rewrite:trace level, so adapt those two lines if you run a newer server.) The rest of the lines are rewriting rules that define how an original URL is converted into the definitive one:

RewriteEngine On
RewriteLogLevel 3
RewriteLog "/var/log/apache2/mod_rewrite.log"

RewriteRule ^/rest_rewrite/countries/(.*)$ /srv/www/htdocs/services/ws_countries.php?countrycode=$1 [L,QSA]
RewriteRule ^/rest_rewrite/countries$ /srv/www/htdocs/services/ws_countries.php [L,QSA]
RewriteRule ^/rest_rewrite/regions/(.*)/(.*)$ /srv/www/htdocs/services/ws_regions.php?countrycode=$1&regioncode=$2 [L,QSA]
RewriteRule ^/rest_rewrite/regions/(.*)$ /srv/www/htdocs/services/ws_regions.php?countrycode=$1 [L,QSA]
RewriteRule ^/rest_rewrite/regions$ /srv/www/htdocs/services/ws_regions.php [L,QSA]

Each RewriteRule includes a pattern and a substitution. If the current URL matches the pattern, it gets rewritten according to the substitution. An optional third parameter holds flags: [L] means no further rules are processed after this one matches; if it is missing, Apache goes on trying the following rules against the rewritten URL, even after a match. Patterns can become complicated quickly, so check the official documentation for the details. In our case, however, we can make do with just a few cases.

Let's consider a path such as /rest_rewrite/countries/UY; this should be routed to /srv/www/htdocs/services/ws_countries.php?countrycode=UY, which suggests changing "/rest_rewrite/countries" into "/srv/www/htdocs/services/ws_countries.php" and adding whatever follows the last slash to the query string. The symbols ^ and $ stand for the beginning and the end of the path part of the given URL. The (.*) pattern stands for zero or more characters, and becomes $1 in the replacement string. (If you had another pattern in parentheses, it would be referred to as $2, and so on; check the third RewriteRule for such a case.) So, the first rewrite rule takes care of finding an optional country code and copying it to the replacement URL – but we're not done yet.

What would happen with a path such as /rest_rewrite/countries/AR,BR,PY,UY?countrynamelike=GUA, which requests picking, out of the four given countries, only those with GUA in their names? We need the rewrite module to add the original query string to the one we created, and we manage this by adding QSA to the parameters of the rewrite rule. The complete rule is correct as shown above.
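
To see how the patterns, the $1 and $2 references, and the QSA flag interact, here is a small simulation of the five rules using Python's re module. It only approximates what mod_rewrite does (Apache uses PCRE and has many more features), and the function name rewrite is hypothetical. Python writes back-references as \1 and \2 where Apache uses $1 and $2.

```python
import re

# (pattern, substitution) pairs, in the same order as the RewriteRule lines.
RULES = [
    (r"^/rest_rewrite/countries/(.*)$",
     r"/srv/www/htdocs/services/ws_countries.php?countrycode=\1"),
    (r"^/rest_rewrite/countries$",
     r"/srv/www/htdocs/services/ws_countries.php"),
    (r"^/rest_rewrite/regions/(.*)/(.*)$",
     r"/srv/www/htdocs/services/ws_regions.php?countrycode=\1&regioncode=\2"),
    (r"^/rest_rewrite/regions/(.*)$",
     r"/srv/www/htdocs/services/ws_regions.php?countrycode=\1"),
    (r"^/rest_rewrite/regions$",
     r"/srv/www/htdocs/services/ws_regions.php"),
]


def rewrite(path, query=""):
    """Apply the first matching rule, as [L] does; return None if none match."""
    for pattern, substitution in RULES:
        match = re.match(pattern, path)
        if match:
            target = match.expand(substitution)
            if query:
                # QSA: append the original query string to the new one.
                target += ("&" if "?" in target else "?") + query
            return target
    return None
```

Calling rewrite("/rest_rewrite/countries/AR,BR,PY,UY", "countrynamelike=GUA") produces a target whose query string carries both countrycode and countrynamelike, which is the effect QSA is there to achieve.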

Let's check the second rewrite rule. It looks similar to the first one, but there's a slash missing, and no pattern for the country code; this gets turned into a simple reference to ws_countries.php, once again with the optional query string appended to the end, if present. You must be careful when ordering the rewrite rules, because the first one that matches will be applied. If things don't work out as expected, you won't get any help from your browser other than an error, but you can check the rewrite log file to see what changes were applied, and what the results were, as in the example below, somewhat shortened for clarity.

(2) init rewrite engine with requested uri /rest_rewrite/countries/
(3) applying pattern '^/rest_rewrite/countries/(.*)$' to uri '/rest_rewrite/countries/'
(2) rewrite '/rest_rewrite/countries/' -> '/srv/www/htdocs/services/ws_countries.php?countrycode='
(3) split uri=/srv/www/htdocs/services/ws_countries.php?countrycode= -> uri=/srv/www/htdocs/services/ws_countries.php, args=countrycode=&countrycode=AR,BR,PY,UR?countrynamelike=GUA
(2) local path result: /srv/www/htdocs/services/ws_countries.php
(1) go-ahead with /srv/www/htdocs/services/ws_countries.php [OK]

Rewriting URLs is powerful but it can also be complex and error-prone. We used two different paths (/rest_alias and /rest_rewrite) so there would be no confusion. You can use both aliases and rewriting rules in your configuration, but keep in mind that rewrites come first, and aliases later.

In conclusion

Developing RESTful services isn't necessarily hard, and by mixing a bit of PHP plus some Apache configuration techniques, you can quickly be off and running without complex frameworks.

If you'd like to learn more about REST and RESTful services:

  • Read the original Ph.D. dissertation by Roy Fielding that gave birth to the REST concept, for the earliest reference.
  • Check the RESTful Web services article by Alex Rodriguez for more on REST design.
  • Get the official list of HTTP Status Codes as defined in the HTTP 1.1 standard.



This work is licensed under a Creative Commons Attribution 3.0 Unported License

Comments

I must say, I can't understand why contents like this doesn't get the deserved attention on the internet. 
 
Great post and thanks for sharing!
Posted @ Sunday, April 21, 2013 8:38 PM by Pulgafree
Thanks for your kind comments!!
Posted @ Tuesday, May 07, 2013 10:30 AM by Federico Kereki
Hey hey, thanks a lot for this great tutorial. you may consider to fix a small mistake on SQL query. A comma is missing in the end of the following line. ;) 
 
KEY countryRegion (countryCode, regionCode), 
 
Thanks a lot once more!
Posted @ Thursday, June 05, 2014 6:49 AM by Gabriel Almeida
Hey, thanks for your comments, and good catch on the comma!
Posted @ Thursday, June 05, 2014 7:59 AM by Federico Kereki