Developing modules for the Apache HTTP Server 2.4
This document explains how you can develop modules for the Apache HTTP Server 2.4.
- Introduction
- Defining a module
- Getting started: Hooking into the server
- Building a handler
- Adding configuration options
- Context aware configurations
- Summing up
- Some useful snippets of code
Introduction
What we will be discussing in this document
This document will discuss how you can create modules for the Apache HTTP Server 2.4, by exploring an example module called mod_example. In the first part of this document, the purpose of this module will be to calculate and print out various digest values for existing files on your web server, whenever we access the URL http://hostname/filename.sum. For instance, if we want to know the MD5 digest value of the file located at http://www.example.com/index.html, we would visit http://www.example.com/index.html.sum.
In the second part of this document, which deals with configuration directives and context awareness, we will be looking at a module that simply writes out its own configuration to the client.
Prerequisites
First and foremost, you are expected to have a basic knowledge of how the C programming language works. In most cases, we will try to be as pedagogical as possible and link to documents describing the functions used in the examples, but there are also many cases where it is necessary to either just assume that "it works" or do some digging yourself into the hows and whys of various function calls.
Lastly, you will need to have a basic understanding of how modules are loaded and configured in the Apache HTTP Server, as well as how to get the headers for Apache if you do not have them already, as these are needed for compiling new modules.
Compiling your module
To compile the source code we are building in this document, we will be using APXS. Assuming your source file is called mod_example.c, compiling, installing and activating the module is as simple as:
apxs -i -a -c mod_example.c
Defining a module
Every module starts with the same declaration, or name tag if you will, that defines a module as a separate entity within Apache:
module AP_MODULE_DECLARE_DATA example_module =
{
    STANDARD20_MODULE_STUFF,
    create_dir_conf, /* Per-directory configuration handler */
    merge_dir_conf,  /* Merge handler for per-directory configurations */
    create_svr_conf, /* Per-server configuration handler */
    merge_svr_conf,  /* Merge handler for per-server configurations */
    directives,      /* Any directives we may have for httpd */
    register_hooks   /* Our hook registering function */
};
This bit of code lets the server know that we have now registered a new module in the system, and that its name is example_module. The name of the module is used primarily for two things:
- Letting the server know how to load the module using the LoadModule directive
- Setting up a namespace for the module to use in configurations
For now, we're only concerned with the first purpose of the module name, which comes into play when we need to load the module:
LoadModule example_module modules/mod_example.so
In essence, this tells the server to open up mod_example.so and look for a module called example_module.
Within this name tag of ours is also a bunch of references to how we would like to handle things: which directives we respond to in a configuration file or .htaccess, how we operate within specific contexts, and which handlers we are interested in registering with the Apache HTTP Server. We'll return to all these elements later in this document.
Getting started: Hooking into the server
An introduction to hooks
When handling requests in Apache HTTP Server 2.4, the first thing you will need to do is create a hook into the request handling process. A hook is essentially a message telling the server that you are willing to either serve or at least take a glance at certain requests given by clients. All handlers, whether it's mod_rewrite, mod_authn_*, mod_proxy and so on, are hooked into specific parts of the request process. As you are probably aware, modules serve different purposes: some are authentication/authorization handlers, others are file or script handlers, while others rewrite URIs or proxy content. Furthermore, in the end, it is up to the user of the server how and when each module will come into play. Thus, the server itself does not presume to know which module is responsible for handling a specific request, and will ask each module whether it has an interest in a given request or not. It is then up to each module to either gently decline serving a request, accept serving it, or flat out deny the request from being served, as authentication/authorization modules do.
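The "ask each module in turn" idea can be sketched in plain C without any httpd headers. Everything below is a hypothetical stand-in: hook_request, hook_fn, the two toy modules and run_hooks are invented for illustration and are not part of the httpd API. Only the numeric values of OK (0) and DECLINED (-1) mirror httpd.h. The sketch reports a 404 when every module declines, which is this sketch's own choice; what the real server does depends on the phase being run.

```c
#include <stddef.h>
#include <string.h>

/* Stand-in constants; httpd.h defines OK as 0 and DECLINED as -1. */
#define HOOK_OK               0
#define HOOK_DECLINED       (-1)
#define HOOK_HTTP_FORBIDDEN  403
#define HOOK_HTTP_NOT_FOUND  404

/* Hypothetical, heavily reduced stand-in for request_rec. */
typedef struct {
    const char *uri;
} hook_request;

/* Hypothetical hook signature: inspect the request, return a status. */
typedef int (*hook_fn)(const hook_request *r);

/* A toy module that only accepts URIs ending in ".sum". */
static int sum_module(const hook_request *r)
{
    size_t len = strlen(r->uri);
    if (len < 4 || strcmp(r->uri + len - 4, ".sum") != 0) {
        return HOOK_DECLINED;
    }
    return HOOK_OK;
}

/* A toy module that denies anything under /private/ and ignores the rest. */
static int guard_module(const hook_request *r)
{
    if (strncmp(r->uri, "/private/", 9) == 0) {
        return HOOK_HTTP_FORBIDDEN;
    }
    return HOOK_DECLINED;
}

/* Ask each hook in order; the first answer other than DECLINED wins. */
static int run_hooks(const hook_fn *hooks, size_t count, const hook_request *r)
{
    size_t i;
    for (i = 0; i < count; i++) {
        int rc = hooks[i](r);
        if (rc != HOOK_DECLINED) {
            return rc;
        }
    }
    return HOOK_HTTP_NOT_FOUND; /* sketch-only fallback: nobody wanted it */
}
```

With the hooks ordered as guard_module followed by sum_module, a request for /index.html.sum is accepted, /private/x.sum is denied before the second module is ever asked, and /index.html is declined by both.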
To make it a bit easier for handlers such as our mod_example to know whether the client is requesting content we should handle or not, the server has directives for hinting to modules whether their assistance is needed or not. Two of these are AddHandler and SetHandler. Let's take a look at an example using AddHandler. In our example case, we want every request ending with .sum to be served by mod_example, so we'll add a configuration directive that tells the server to do just that:
AddHandler example-handler .sum
What this tells the server is the following: Whenever we receive a request for a URI ending in .sum, we are to let all modules know that we are looking for whoever goes by the name of "example-handler". Thus, when a request is being served that ends in .sum, the server will let all modules know that this request should be served by "example-handler". As you will see later, when we start building mod_example, we will check for this handler tag relayed by AddHandler and reply to the server based on the value of this tag.
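As a rough mental model of the hint described above, the following standalone sketch maps a file suffix to a handler name and then performs the same kind of name check a handler does. It is a simplification under stated assumptions: suffix_mapping, has_suffix, handler_for and is_our_request are hypothetical names, and the server's real extension matching is more involved than a plain "ends with" test.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical mapping entry: file suffix -> handler name. */
typedef struct {
    const char *suffix;
    const char *handler;
} suffix_mapping;

/* Return 1 if 'name' ends with 'suffix', 0 otherwise. */
static int has_suffix(const char *name, const char *suffix)
{
    size_t nlen = strlen(name);
    size_t slen = strlen(suffix);
    return nlen >= slen && strcmp(name + nlen - slen, suffix) == 0;
}

/* Look up the handler name for a requested path, or NULL if none matches. */
static const char *handler_for(const suffix_mapping *map, size_t count,
                               const char *path)
{
    size_t i;
    for (i = 0; i < count; i++) {
        if (has_suffix(path, map[i].suffix)) {
            return map[i].handler;
        }
    }
    return NULL;
}

/* Return 1 if the handler tag names us, 0 if we should decline. */
static int is_our_request(const char *handler_name)
{
    return handler_name != NULL
        && strcmp(handler_name, "example-handler") == 0;
}
```

Given a single mapping of .sum to example-handler, a path such as /index.html.sum yields the tag "example-handler", while /index.html yields no tag at all, so is_our_request reports 0 and the module would decline.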
Hooking into httpd
To begin with, we only want to create a simple handler that replies to the client browser when a specific URL is requested, so we won't bother setting up configuration handlers and directives just yet. Our initial module definition will look like this:
module AP_MODULE_DECLARE_DATA example_module =
{
    STANDARD20_MODULE_STUFF,
    NULL,
    NULL,
    NULL,
    NULL,
    NULL,
    register_hooks /* Our hook registering function */
};
This lets the server know that we are not interested in anything fancy; we just want to hook onto the requests and possibly handle some of them.
The reference in our example declaration, register_hooks, is the name of a function we will create to manage how we hook onto the request process. In this example module, the function has just one purpose: to create a simple hook that gets called after all the rewrites, access control etc. have been handled. Thus, we will let the server know that we want to hook into its process as one of the last modules:
static void register_hooks(apr_pool_t *pool)
{
    /* Create a hook in the request handler, so we get called when a request arrives */
    ap_hook_handler(example_handler, NULL, NULL, APR_HOOK_LAST);
}
The example_handler reference is the function that will handle the request. We will discuss how to create a handler in the next chapter.
Other useful hooks
Hooking into the request handling phase is but one of many hooks that you can create. Some other ways of hooking are:
- ap_hook_child_init: Place a hook that executes when a child process is spawned (commonly used for initializing modules after the server has forked)
- ap_hook_pre_config: Place a hook that executes before any configuration data has been read (very early hook)
- ap_hook_post_config: Place a hook that executes after configuration has been parsed, but before the server has forked
- ap_hook_translate_name: Place a hook that executes when a URI needs to be translated into a filename on the server (think mod_rewrite)
- ap_hook_quick_handler: Similar to ap_hook_handler, except it is run before any other request hooks (translation, auth, fixups etc)
- ap_hook_log_transaction: Place a hook that executes when the server is about to add a log entry of the current request
Building a handler
A handler is essentially a function that receives a callback when a request to the server is made. It is passed a record of the current request (how it was made, which headers and requests were passed along, who's giving the request and so on), and is put in charge of either telling the server that it's not interested in the request or handling the request with the tools provided.
A simple "Hello, world!" handler
Let's start off by making a very simple request handler that does the following:
- Check that this is a request that should be served by "example-handler"
- Set the content type of our output to text/html
- Write "Hello, world!" back to the client browser
- Let the server know that we took care of this request and everything went fine
In C code, our example handler will now look like this:
static int example_handler(request_rec *r)
{
    /* First off, we need to check if this is a call for the "example-handler" handler.
     * If it is, we accept it and do our things, if not, we simply return DECLINED,
     * and the server will try somewhere else.
     */
    if (!r->handler || strcmp(r->handler, "example-handler")) return (DECLINED);
    /* Now that we are handling this request, we'll write out "Hello, world!" to the client.
     * To do so, we must first set the appropriate content type, followed by our output.
     */
    ap_set_content_type(r, "text/html");
    ap_rprintf(r, "Hello, world!");
    /* Lastly, we must tell the server that we took care of this request and everything went fine.
     * We do so by simply returning the value OK to the server.
     */
    return OK;
}
Now, we put all we have learned together and end up with a program that looks like mod_example_1.c. The functions used in this example will be explained later in the section "Some useful functions you should know".
The request_rec structure
The most essential part of any request is the request record. In a call to a handler function, this is represented by the request_rec* structure passed along with every call that is made. This struct, typically just referred to as r in modules, contains all the information you need for your module to fully process any HTTP request and respond accordingly.
Some key elements of the request_rec structure are:
- r->handler (char*): Contains the name of the handler the server is currently asking to do the handling of this request
- r->method (char*): Contains the HTTP method being used, e.g. GET or POST
- r->filename (char*): Contains the translated filename the client is requesting
- r->args (char*): Contains the query string of the request, if any
- r->headers_in (apr_table_t*): Contains all the headers sent by the client
- r->connection (conn_rec*): A record containing information about the current connection
- r->user (char*): If the URI requires authentication, this is set to the username provided
- r->useragent_ip (char*): The IP address of the client connecting to us
- r->pool (apr_pool_t*): The memory pool of this request. We'll discuss this in the "Memory management" chapter.
A complete list of all the values contained within the request_rec structure can be found in the httpd.h header file or at http://ci.apache.org/projects/httpd/trunk/doxygen/structrequest__rec.html.
Let's try out some of these variables in another example handler:
static int example_handler(request_rec *r)
{
    /* Set the appropriate content type */
    ap_set_content_type(r, "text/html");
    /* Print out the IP address of the client connecting to us: */
    ap_rprintf(r, "<h2>Hello, %s!</h2>", r->useragent_ip);
    /* If we were reached through a GET or a POST request, be happy, else sad. */
    if ( !strcmp(r->method, "POST") || !strcmp(r->method, "GET") ) {
        ap_rputs("You used a GET or a POST method, that makes us happy!<br/>", r);
    }
    else {
        ap_rputs("You did not use POST or GET, that makes us sad :(<br/>", r);
    }
    /* Lastly, if there was a query string, let's print that too!
     * The query string is supplied by the client, so we escape it before
     * echoing it back as HTML.
     */
    if (r->args) {
        ap_rprintf(r, "Your query string was: %s", ap_escape_html(r->pool, r->args));
    }
    return OK;
}
Return values
Apache relies on return values from handlers to signify whether a request was handled or not, and if so, whether the request went well or not. If a module is not interested in handling a specific request, it should always return the value DECLINED. If it is handling a request, it should either return the generic value OK, or a specific HTTP status code, for example:
static int example_handler(request_rec *r)
{
    /* Return 404: Not found */
    return HTTP_NOT_FOUND;
}
Returning OK or an HTTP status code does not necessarily mean that the request will end. The server may still have other handlers that are interested in this request, for instance the logging modules which, upon a successful request, will write down a summary of what was requested and how it went. To do a full stop and prevent any further processing after your module is done, you can return the value DONE to let the server know that it should cease all activity on this request and carry on with the next, without informing other handlers.
General response codes:
- DECLINED: We are not handling this request
- OK: We handled this request and it went well
- DONE: We handled this request and the server should just close this thread without further processing
HTTP specific return codes (excerpt):
- HTTP_OK (200): Request was okay
- HTTP_MOVED_PERMANENTLY (301): The resource has moved to a new URL
- HTTP_UNAUTHORIZED (401): Client is not authorized to visit this page
- HTTP_FORBIDDEN (403): Permission denied
- HTTP_NOT_FOUND (404): File not found
- HTTP_INTERNAL_SERVER_ERROR (500): Internal server error (self explanatory)
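To make the choice between these return values concrete, here is a standalone sketch of the decision order a handler typically follows: decline first if the request is not ours, and only then translate failures into HTTP status codes. It compiles without httpd headers. demo_request and demo_handler are hypothetical stand-ins, and the DEMO_ constants are local copies whose numeric values match the corresponding httpd.h definitions.

```c
#include <stddef.h>
#include <string.h>

/* Stand-in constants with the same numeric values as in httpd.h. */
#define DEMO_OK                0
#define DEMO_DECLINED        (-1)
#define DEMO_HTTP_FORBIDDEN   403
#define DEMO_HTTP_NOT_FOUND   404

/* Hypothetical, heavily reduced stand-in for request_rec. */
typedef struct {
    const char *handler;  /* handler name picked by the server, may be NULL */
    int stat_ok;          /* non-zero if the target file could be inspected */
    int is_regular_file;  /* non-zero if the target is a regular file */
} demo_request;

static int demo_handler(const demo_request *r)
{
    /* Not addressed to us: let the server try someone else. */
    if (r->handler == NULL || strcmp(r->handler, "example-handler") != 0) {
        return DEMO_DECLINED;
    }
    /* From here on the request is ours, so failures become status codes. */
    if (!r->stat_ok) {
        return DEMO_HTTP_FORBIDDEN;
    }
    if (!r->is_regular_file) {
        return DEMO_HTTP_NOT_FOUND;
    }
    /* A real handler would write its response here. */
    return DEMO_OK;
}
```

The ordering matters: returning an HTTP error before checking the handler name would make the module answer requests that were never meant for it.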
Some useful functions you should know
- ap_rputs(const char *string, request_rec *r): Sends a string of text to the client. This is a shorthand version of ap_rwrite.
    ap_rputs("Hello, world!", r);
- ap_rprintf: This function works just like printf, except it sends the result to the client.
    ap_rprintf(r, "Hello, %s!", r->useragent_ip);
- ap_set_content_type(request_rec *r, const char *type): Sets the content type of the output you are sending.
    ap_set_content_type(r, "text/plain"); /* force a raw text output */
Memory management
Managing your resources in Apache HTTP Server 2.4 is quite easy, thanks to the memory pool system. In essence, each server, connection and request has its own memory pool that gets cleaned up when its scope ends, e.g. when a request is done or when a server process shuts down. All your module needs to do is latch onto this memory pool, and you won't have to worry about having to clean up after yourself - pretty neat, huh?
In our module, we will primarily be allocating memory for each request, so it's appropriate to use the r->pool reference when creating new objects. A few of the functions for allocating memory within a pool are:
- void* apr_palloc( apr_pool_t *p, apr_size_t size): Allocates size number of bytes in the pool for you
- void* apr_pcalloc( apr_pool_t *p, apr_size_t size): Allocates size number of bytes in the pool for you and sets all bytes to 0
- char* apr_pstrdup( apr_pool_t *p, const char *s): Creates a duplicate of the string s. This is useful for copying constant values so you can edit them
- char* apr_psprintf( apr_pool_t *p, const char *fmt, ...): Similar to sprintf, except the server supplies you with an appropriately allocated target variable
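The ownership model behind these functions (allocate freely, release everything in one step when the scope ends) can be illustrated with a toy pool written in plain C. This is only a sketch of the idea and not how APR implements pools, which carve allocations out of larger blocks. All toy_ names are hypothetical and exist only in this example.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Header placed in front of every allocation so the pool can find it again. */
typedef union toy_block {
    union toy_block *next;  /* link to the previously allocated block */
    long double align_ld;   /* the two members below only force alignment */
    long long align_ll;
} toy_block;

/* Hypothetical toy pool: a linked list of everything allocated from it. */
typedef struct {
    toy_block *head;
    size_t live;            /* number of outstanding allocations */
} toy_pool;

static void toy_pool_init(toy_pool *p)
{
    p->head = NULL;
    p->live = 0;
}

/* Allocate zero-filled memory owned by the pool (compare apr_pcalloc). */
static void *toy_pcalloc(toy_pool *p, size_t size)
{
    toy_block *b = calloc(1, sizeof(toy_block) + size);
    if (b == NULL) {
        return NULL;
    }
    b->next = p->head;
    p->head = b;
    p->live++;
    return b + 1;           /* payload starts right after the header */
}

/* Duplicate a string into the pool (compare apr_pstrdup). */
static char *toy_pstrdup(toy_pool *p, const char *s)
{
    size_t len = strlen(s) + 1;
    char *copy = toy_pcalloc(p, len);
    if (copy != NULL) {
        memcpy(copy, s, len);
    }
    return copy;
}

/* Release every allocation at once, as happens when a pool's scope ends. */
static void toy_pool_clear(toy_pool *p)
{
    while (p->head != NULL) {
        toy_block *next = p->head->next;
        free(p->head);
        p->head = next;
    }
    p->live = 0;
}
```

A caller never frees individual allocations: it calls toy_pcalloc and toy_pstrdup as often as needed and then toy_pool_clear once, which is the role the server plays for r->pool at the end of a request.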
Let's put these functions into an example handler:
static int example_handler(request_rec *r)
{
    const char *original = "You can't edit this!";
    char *copy;
    int *integers;
    /* Allocate space for 10 integer values and set them all to zero. */
    integers = apr_pcalloc(r->pool, sizeof(int)*10);
    /* Create a copy of the 'original' variable that we can edit. */
    copy = apr_pstrdup(r->pool, original);
    return OK;
}
This is all well and good for our module, which won't need any pre-initialized variables or structures. However, if we wanted to initialize something early on, before the requests come rolling in, we could simply add a call to a function in our register_hooks function to sort it out:
static void register_hooks(apr_pool_t *pool)
{
    /* Call a function that initializes some stuff */
    example_init_function(pool);
    /* Create a hook in the request handler, so we get called when a request arrives */
    ap_hook_handler(example_handler, NULL, NULL, APR_HOOK_LAST);
}
In this pre-request initialization function we would not be using the same pool as we did when allocating resources for request-based functions. Instead, we would use the pool given to us by the server for allocating memory on a per-process level.
Parsing request data
In our example module, we would like to add a feature that checks which type of digest, MD5 or SHA1, the client would like to see. This could be solved by adding a query string to the request. A query string typically consists of several keys and values put together in a string, for instance valueA=yes&valueB=no&valueC=maybe. It is up to the module itself to parse these and get the data it requires. In our example, we'll be looking for a key called digest, and if set to md5, we'll produce an MD5 digest, otherwise we'll produce a SHA1 digest.
Since the introduction of Apache HTTP Server 2.4, parsing request data from GET and POST requests has never been easier. All we require to parse both GET and POST data is four simple lines:
apr_table_t *GET;
apr_array_header_t *POST;
ap_args_to_table(r, &GET);
ap_parse_form_data(r, NULL, &POST, -1, 8192);
In our specific example module, we're looking for the digest value from the query string, which now resides inside a table called GET. To extract this value, we need only perform a simple operation:
/* Get the "digest" key from the query string, if any. */
const char *digestType = apr_table_get(GET, "digest");
/* If no key was returned, we will set a default value instead. */
if (!digestType) digestType = "sha1";
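For intuition about what a key lookup on a query string involves, the following standalone sketch finds a key in a string of the form key=value&key=value and applies the same default as the snippet above. It uses no httpd or APR headers. query_get and choose_digest are hypothetical helpers written for this illustration, and the sketch performs no URL-decoding, so in a real module the server-provided helpers remain the better choice.

```c
#include <stddef.h>
#include <string.h>

/*
 * Copy the value of 'key' from a query string such as "a=1&digest=md5"
 * into 'out' (at most outlen - 1 characters, always NUL-terminated).
 * Returns 1 if the key was found, 0 otherwise.
 */
static int query_get(const char *query, const char *key,
                     char *out, size_t outlen)
{
    size_t keylen = strlen(key);
    const char *p = query;

    if (query == NULL || outlen == 0) {
        return 0;
    }
    while (*p != '\0') {
        /* The current pair runs up to the next '&' or the end of the string. */
        const char *end = strchr(p, '&');
        size_t pairlen = end ? (size_t)(end - p) : strlen(p);

        if (pairlen > keylen
            && strncmp(p, key, keylen) == 0
            && p[keylen] == '=') {
            size_t vallen = pairlen - keylen - 1;
            if (vallen >= outlen) {
                vallen = outlen - 1;   /* truncate values that do not fit */
            }
            memcpy(out, p + keylen + 1, vallen);
            out[vallen] = '\0';
            return 1;
        }
        p += pairlen;
        if (*p == '&') {
            p++;                       /* step over the separator */
        }
    }
    return 0;
}

/* "md5" only when explicitly requested; everything else falls back to "sha1". */
static const char *choose_digest(const char *query)
{
    char value[16];
    if (query_get(query, "digest", value, sizeof value)
        && strcmp(value, "md5") == 0) {
        return "md5";
    }
    return "sha1";
}
```

Note that keys are matched only at the start of a pair, so a key named xdigest does not satisfy a lookup for digest.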
The structures used for the POST and GET data are not exactly the same, so if we were to fetch a value from POST data instead of the query string, we would have to resort to a few more lines, as outlined in this example in the last chapter of this document.
Making an advanced handler
Now that we have learned how to parse form data and manage our resources, we can move on to creating an advanced version of our module, that spits out the MD5 or SHA1 digest of files:
static int example_handler(request_rec *r)
{
    int rc, exists;
    apr_finfo_t finfo;
    apr_file_t *file;
    char *filename;
    char buffer[256];
    apr_size_t readBytes;
    int n;
    apr_table_t *GET;
    apr_array_header_t *POST;
    const char *digestType;
    /* Check that the "example-handler" handler is being called. */
    if (!r->handler || strcmp(r->handler, "example-handler")) return (DECLINED);
    /* Figure out which file is being requested by removing the .sum from it */
    filename = apr_pstrdup(r->pool, r->filename);
    filename[strlen(filename)-4] = 0; /* Cut off the last 4 characters. */
    /* Figure out if the file we request a sum on exists and isn't a directory */
    rc = apr_stat(&finfo, filename, APR_FINFO_MIN, r->pool);
    if (rc == APR_SUCCESS) {
        /* filetype is an enumeration, not a bit mask, so compare with != */
        exists =
        (
            (finfo.filetype != APR_NOFILE)
            && (finfo.filetype != APR_DIR)
        );
        if (!exists) return HTTP_NOT_FOUND; /* Return a 404 if not found. */
    }
    /* If apr_stat failed, we're probably not allowed to check this file. */
    else return HTTP_FORBIDDEN;
    /* Parse the GET and, optionally, the POST data sent to us */
    ap_args_to_table(r, &GET);
    ap_parse_form_data(r, NULL, &POST, -1, 8192);
    /* Set the appropriate content type */
    ap_set_content_type(r, "text/html");
    /* Print a title and some general information */
ap_rprintf