How to install mod_gzip
Performing the installation itself isn't difficult, but finding the method that best suits the needs of your Apache installation may take some time.
Therefore it is highly recommended that you read this chapter completely and become aware of the pros and cons of the different options before you select the operation method and perform the installation.
This document covers in particular the internal processing model of mod_gzip as an Apache module and may thus provide information that helps in understanding how mod_gzip evaluates its configuration directives.
Basics
Introduction
The Apache web server supports two different methods of integrating a module into its program code: static integration and dynamic integration (both described below).
Depending on the operation concept used for your Apache server and the given requirements
- one of these two operation methods for mod_gzip has to be selected and
- the set of files required for this method has to be downloaded.
Static integration of an Apache module
Static integration means that the module becomes a permanent part of the linked program binary httpd which implements the Apache server.
For this the Apache server source code has to be
- extended by this module's source code and then
- compiled as a whole by using a C compiler.
Normally each administrator may want to use a different set of features for his Apache server adapted to his own requirements; therefore it doesn't seem feasible to provide program files ready to run on a multitude of platforms for download.
Dynamic integration of an Apache module
Dynamic integration means that the module can be loaded from a separate module file as a shared object when the Apache process is started.
For this
- the shared object file of this module for the required target platform has to be available and
- the Apache configuration file has to be extended by a directive to load this module.
The structure of mod_gzip's module source code
Most Apache modules consist of one single source code file only. This file can be compiled by invoking the apxs program (with corresponding parameter values); for the installation of the shared object created this way one more invocation of apxs (with different parameter values) would be required.
From version 1.3.26.1a on, mod_gzip's source code is divided into three separate files:
- mod_gzip.c (about 8000 lines) contains all functions that are necessary to implement the processing logic of the mod_gzip Apache module. This file depends heavily on the module interface of Apache version 1.3 (which didn't change for many years).
- mod_gzip_debug.c (about 500 lines) contains functions that are required only for the developer's debugging tasks; some of these functions are not even contained in a mod_gzip compiled 'the normal way' (depending on the values of the compiler directives of the -D type, which define symbolic constants).
- mod_gzip_compress.c (about 3000 lines) contains Kevin Kiley's implementation of the gzip compression function, the one that 'actually does the work'. This part is not dependent on any specific Apache version and (from a purely technical point of view) might be used by other compression tools as well (like mod_deflate, which currently uses 'zlib' for compression).
This structure of the source code makes the mod_gzip maintenance a little easier - and the installation a little more complicated (as now several source files have to be compiled instead of just one). Therefore mod_gzip now provides Makefiles to simplify this installation process.
Download
Depending on which of the operation concepts named above is to be run, different files (suitable for the respective purpose of use) are to be used.
As of this writing the following files are available for each mod_gzip version at the download page for the mod_gzip project:
If mod_gzip is to be run with dynamic integration on some other platform then the shared module file for this platform has to be created by the administrator.
Static integration of mod_gzip
The normal compilation of the Apache webserver
The procedure for installing the Apache webserver on a UNIX machine documented by the Apache Group reads like this (in the short version):
- download the archive with the Apache source code from the WWW
- unpack the archive
- navigate into the directory created by the previous operation
- read and understand the file INSTALL
- start the shell script ./configure with the appropriate parameter values (this will cause the creation of Makefile files in a large number of subdirectories)
- run make install (this will cause the compilation and installation of the Apache server including its online documentation)
If the Apache webserver has been created this way then the shipped modules normally become static parts of the created program file httpd in the Apache program directory - unless you specified something different in the parameters of the configure call.
(The actual call of configure may become very extensive, depending on the degree of deviation from the standard parameter values. I recommend storing this call in a small shell script, which at the same time documents how the installation was performed.)
The integration of mod_gzip into the Apache source code
The source code of the official Apache modules is contained in the src/modules subdirectory of the unpacked tar archive of the Apache software.
To let mod_gzip be treated like a standard Apache module by this mechanism, the following preparations are necessary:
- Uncompress and unpack the content of the download archive containing the mod_gzip source code (which will create a directory mod_gzip-versionnumber),
- Create a directory src/modules/gzip within the directory tree of the Apache source
- Copy all files with the extensions *.c, *.h and *.tmpl into this new gzip directory.
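The three preparation steps above can be sketched as a small helper. This is only an illustration, not part of the mod_gzip distribution: the function name stage_mod_gzip and both directory arguments are assumptions, and the archive is expected to be unpacked already.

```python
import shutil
from pathlib import Path


def stage_mod_gzip(mod_gzip_dir, apache_src_dir):
    """Copy *.c, *.h and *.tmpl files into <apache>/src/modules/gzip.

    Both arguments are hypothetical paths: the unpacked mod_gzip
    directory and the top of the unpacked Apache source tree.
    """
    source = Path(mod_gzip_dir)
    target = Path(apache_src_dir) / "src" / "modules" / "gzip"
    # Create src/modules/gzip if it does not exist yet.
    target.mkdir(parents=True, exist_ok=True)

    copied = []
    for pattern in ("*.c", "*.h", "*.tmpl"):
        for path in sorted(source.glob(pattern)):
            shutil.copy2(path, target / path.name)
            copied.append(path.name)
    return copied
```

Files with other names (for example the Makefile shipped for the shared object build) are deliberately left out, because only the three extensions named above are matched.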
As the next step, you extend the configure call by the parameter --activate-module=src/modules/gzip/mod_gzip.c. Now the configure script will find the mod_gzip source code and create a suitable Makefile from the shipped file Makefile.tmpl - logged by the messages
+ activated gzip module (modules/gzip/mod_gzip.c)
and
Creating Makefile in src/modules/gzip
- the latter one just like for Apache's own modules. (The Makefile shipped with mod_gzip is not suitable for this type of installation - this one is only for the creation of a shared object file.)
Now the Apache installation will work as usual - and mod_gzip will be treated like a normal Apache module.
But configure knows that a module integrated via the --activate-module parameter is a 3rd-party module that may have specific requirements, and thus will load mod_gzip automatically on top of the module stack so that it will have access to the incoming HTTP request prior to all other modules - which is exactly what mod_gzip urgently needs.
On some platforms Apache's configure doesn't seem to automatically set the value of the $(LIBEXT) environment variable to the proper value of .a. In this case the compilation of mod_gzip will fail. The exact reason for this behaviour is unknown as of now; as workaround you may replace the line
LIB=libgzip.$(LIBEXT)
by
LIB=libgzip.a
within the shipped file Makefile.tmpl, i. e. insert the proper value manually.
Be sure to use an editor that doesn't expand tab characters to spaces for this task!
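If no suitable editor is at hand, the same replacement can be made by a script that works on the raw bytes of the file, so that tab characters stay exactly as they are. The following is a minimal sketch; the function name patch_libext is an assumption, and only the single line quoted above is changed.

```python
from pathlib import Path

OLD_LINE = b"LIB=libgzip.$(LIBEXT)"
NEW_LINE = b"LIB=libgzip.a"


def patch_libext(makefile_tmpl):
    """Hard-wire the library extension in Makefile.tmpl.

    The file is read and written as bytes, so tabs and line endings
    are never converted. Returns True if the file was changed.
    """
    path = Path(makefile_tmpl)
    data = path.read_bytes()
    patched = data.replace(OLD_LINE, NEW_LINE)
    if patched != data:
        path.write_bytes(patched)
    return patched != data
```

Calling the function a second time on the same file returns False, because the line to be replaced no longer exists.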
(To be tested: What happens when integrating more than one 3rd-party module with --activate-module? Is the order of the parameter values relevant in this case?)
Verify that mod_gzip has been integrated correctly
To check whether mod_gzip actually has been integrated into the Apache program code as requested, the Apache server provides the httpd -l command. This will display a list of all integrated modules (in the order in which they will be loaded); mod_gzip.c should be the last entry displayed there.
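This check can also be automated. The sketch below assumes that the output of httpd -l contains one module source name per line, each ending in .c; the sample text and the function names are illustrative only and not copied from a real server.

```python
def compiled_in_modules(httpd_l_output):
    """Return the module source names (lines ending in .c) in listed order."""
    return [
        line.strip()
        for line in httpd_l_output.splitlines()
        if line.strip().endswith(".c")
    ]


def mod_gzip_is_last(httpd_l_output):
    """True if mod_gzip.c is the last module in the list."""
    modules = compiled_in_modules(httpd_l_output)
    return bool(modules) and modules[-1] == "mod_gzip.c"


# Illustrative sample, shortened; a real list is much longer.
SAMPLE_OUTPUT = """Compiled-in modules:
  http_core.c
  mod_env.c
  mod_gzip.c
"""
```

To apply this to a real installation, the text would first have to be captured from the server binary, for example with subprocess.run([path_to_httpd, "-l"], capture_output=True, text=True).stdout, where path_to_httpd is whatever path your installation uses.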
Dynamic Integration of mod_gzip
The concept of loadable Apache modules
The Apache webserver supports the concept of loadable modules.
Nearly every Apache module can be
- compiled as a shared object and then
- dynamically loaded into Apache's address space at the start of the Apache server (by use of the corresponding configuration directives).
The handling of loadable modules requires additional knowledge about the Apache configuration (because the order in which these modules are loaded may be significant for their functioning) but allows the Apache server's functionality to be changed without having to recompile its source code.
On platforms like Windows (where not many Apache administrators have a C development environment at hand to compile and link the Apache code) the use of loadable modules may often be the only way to extend the functionality of the Apache server.
The Apache 1.3 documentation provides the following articles about this topic:
- Dynamic Shared Object (DSO) Support - the description of the corresponding concept for the Apache webserver
- Module mod_so - the description of the Apache module for loading other modules and the required configuration directives
Directive for loading mod_gzip
To dynamically add the mod_gzip shared object to the Apache code, one of the following configuration directives is required:
# ---------------------------------------------------------------------
# load a DLL / Windows:
LoadModule gzip_module modules/ApacheModuleGzip.dll
# ---------------------------------------------------------------------
# load a DSO / UNIX:
LoadModule gzip_module modules/mod_gzip.so
# ---------------------------------------------------------------------
# (none of both if module statically integrated)
# ---------------------------------------------------------------------
The actual file name can be freely selected - it only has to match the name of the file effectively used. On the other hand, this name can depend on the operating system platform and even on the compilation method used for this module - in this case either the directive shown above has to be adapted or the file has to be renamed accordingly.
The handling of modules by Apache 1.3
The Apache server can dynamically load any number of modules. While doing so, the corresponding LoadModule directives are processed in the order of their occurrence within the configuration file.
But the modules are loaded on a stack within the working memory: The module that has been loaded last will be the first to get access to handling the corresponding HTTP request to the Apache webserver - and may then decide whether to consider itself responsible for handling this request or not.
Only one of all modules in question can be responsible for handling a request in Apache 1.3 - subsequent modules will not even be asked.
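The effect of this stack can be illustrated with a toy model. It is not Apache code: the module names and the accept functions are hypothetical, and real modules decide on much more than the URL.

```python
def dispatch(load_order, url):
    """Ask the modules in reverse load order.

    load_order is a list of (name, accepts) pairs in LoadModule order;
    the first module that accepts the request handles it, and modules
    loaded earlier are not asked at all.
    """
    for name, accepts in reversed(load_order):
        if accepts(url):
            return name
    return None


# Hypothetical modules, listed in the order of their LoadModule lines.
LOAD_ORDER = [
    ("loaded_first", lambda url: True),
    ("loaded_second", lambda url: url.endswith(".cgi")),
    ("loaded_last", lambda url: url.startswith("/special/")),
]
```

With this list, dispatch(LOAD_ORDER, "/special/x.cgi") selects loaded_last although both other modules would accept the request as well, while "/index.html" falls through to loaded_first.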
The integration of mod_gzip into Apache's evaluation of a request
Therefore, to be able to process the output of arbitrary modules, mod_gzip has to do something that actually contradicts the Apache 1.3 architecture: It has to 'handle' a request but subsequently revoke the responsibility for handling this request. Only in this way can the module which is effectively responsible for handling the request still be activated by the Apache server at all.
In this first phase the 'handling' of this request by mod_gzip does not mean compressing the page content to be served - because this content doesn't even exist yet, it still has to be generated by another module! Instead, at this point in time mod_gzip just prepares to be asked again whether it wants to do anything after the page content has been created. Only in this second phase of its activation (where the content of the HTTP response is already available) can mod_gzip perform its essential task, which is compressing the content of an HTTP response packet (and modifying certain HTTP headers).
This 'registration' for later postprocessing the HTTP response performed by mod_gzip is necessary only if mod_gzip cannot already determine at this stage that it definitely won't be interested in processing the response content anyway.
Thus in this first phase mod_gzip already performs a part of the evaluation of the filter directives specified in the Apache configuration: It checks those rules where it can do this based upon the request description alone (i. e. the content of the corresponding HTTP headers). This applies to the mod_gzip_item_include / mod_gzip_item_exclude rules of the type
- reqheader (content of the HTTP request headers of the request),
- url (URL of the requested HTTP resource),
- file (name of the file affected by this request, after evaluation of all Alias translations etc.) and
- handler (name of the handler responsible for evaluating this request, according to the Apache configuration).
If the evaluation of these filter rules already proves that this request's result must not be compressed, i. e. if
- at least one exclude rule is satisfied or
- none of the include rules is satisfied or
- any other condition for performing the compression isn't satisfied (e. g. at this stage it can already be verified whether the client has allowed compressed data to be served at all by sending the Accept-Encoding: gzip HTTP header)
then it is not necessary for mod_gzip to check the remaining rules after the creation of the response content - so this won't happen then, because mod_gzip remembers the result of the first evaluation phase for each request and terminates the second phase immediately in this case.
Otherwise in the second phase of its operation mod_gzip checks the remaining filter rules that can be evaluated only based on the actual content of the generated response packet:
- rspheader (content of the HTTP response headers) as well as
- mime (HTTP content type of the result).
Furthermore some other conditions are tested now, such as the size of the response packet (directives mod_gzip_minimum_file_size and mod_gzip_maximum_file_size).
Only if all of these tests lead to a positive result will the compression of the response packet actually be performed.
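The two evaluation phases described above can be summarized in a simplified model. This is a sketch of the description in this chapter, not mod_gzip's actual C code: the function names, the dictionary keys and the use of Python regular expressions are assumptions, and the model assumes that each phase needs at least one matching include rule of its own rule types.

```python
import re

PHASE1_TYPES = {"reqheader", "url", "file", "handler"}
PHASE2_TYPES = {"rspheader", "mime"}


def _select(rules, kinds):
    """Keep only the (type, pattern) rules whose type belongs to kinds."""
    return [(kind, pattern) for kind, pattern in rules if kind in kinds]


def _any_match(rules, facts):
    """True if at least one rule matches the fact of the same type."""
    return any(re.search(pattern, facts.get(kind, "")) for kind, pattern in rules)


def phase1(includes, excludes, request):
    """Decide from the request alone whether phase 2 is worth running."""
    if "gzip" not in request.get("accept_encoding", ""):
        return False
    if _any_match(_select(excludes, PHASE1_TYPES), request):
        return False
    return _any_match(_select(includes, PHASE1_TYPES), request)


def phase2(includes, excludes, response, min_size, max_size, phase1_ok):
    """Check the response-based rules; stop at once if phase 1 said no."""
    if not phase1_ok:
        return False
    if _any_match(_select(excludes, PHASE2_TYPES), response):
        return False
    if not _any_match(_select(includes, PHASE2_TYPES), response):
        return False
    return min_size <= response["size"] <= max_size
```

The phase1_ok argument stands for the result that mod_gzip remembers per request; passing False makes the second phase return immediately, as described above.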
The position of mod_gzip within the loading sequence of all Apache modules
To be able to perform all tasks described above, the mod_gzip module must have access to handling the HTTP request before every other Apache module whose output it is meant to handle. Because Apache modules get access to a request in the reverse order of their loading, mod_gzip should be loaded as the last one of all Apache modules.
For the static integration this module order is defined by the 'blueprint' of the httpd program during the compilation of the Apache source code. The procedure for compiling the Apache source code shipped by the Apache Group, activated by the configure shell script, knows all dependencies between the shipped modules (and ensures a corresponding order of these modules) but not the requirements of 3rd party modules like mod_gzip which are integrated into the compilation process by the configure parameter --add-module=file. To give these 3rd party modules a maximum of influence, they are loaded last, i. e. on top of the module stack.
So if mod_gzip is to be integrated into an Apache server as the only 3rd party module then configure automatically does the right thing. In case of using more than one 3rd party module the administrator is responsible for ordering these modules (maybe by the order of his --add-module= values? I didn't test this yet).
Compiling mod_gzip using apxs
The function of the Apache compilation utility apxs
If an Apache server is operated to support the dynamic integration of modules (i. e. uses the mod_so module) then a utility program named apxs will be generated in Apache's bin directory during the Apache installation.
This program allows its user to compile the source code of an Apache module (using a C compiler) and to create a corresponding shared object file without requiring the complete source code of the Apache server to be available: apxs knows all required Apache program interfaces and supplies the C compiler with the necessary information.
Creating a shared object for mod_gzip using make
To save the user from finding out how exactly mod_gzip has to be compiled and installed completely when using apxs, a file named Makefile is provided within the source code archive.
Using this Makefile reduces the installation procedure to the following steps:
- Extract the files from the downloaded mod_gzip source code archive file into a (new, temporary) directory of your choice and change your current directory to it.
- Find out the path name of the program apxs from your Apache installation.
- Perform the compilation by running the command
make APXS=your_apxs_pathname
This will create the shared object file mod_gzip.so within the current directory.
(This step of the operating sequence may be omitted as it will then be covered by the subsequent step.)
- Perform the installation by running the command
make install APXS=your_apxs_pathname
This will not only copy the shared object file into the corresponding directory of the Apache installation but automatically extend the Apache configuration file httpd.conf by the required directives LoadModule and AddModule as well ... if you don't like foreign programs to rewrite your precious configuration files you might prefer to perform this final step manually, or at least make a backup of your Apache configuration first.
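The backup recommended in the last step can be as simple as copying the file next to itself before running make install. The following sketch shows one way to do this; the function name and the backup suffix are assumptions, and the path of httpd.conf depends on your installation.

```python
import shutil
from pathlib import Path


def backup_config(conf_path, suffix=".before-mod_gzip"):
    """Copy the configuration file next to itself and return the backup path.

    The suffix is a hypothetical choice; any name that doesn't collide
    with an existing file will do.
    """
    conf = Path(conf_path)
    backup = conf.with_name(conf.name + suffix)
    # copy2 also preserves the modification time of the original file.
    shutil.copy2(conf, backup)
    return backup
```

After the installation the two files can be compared to see exactly which lines have been added to the configuration.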
Besides these necessary steps the Makefile supports the following commands (which might rather be of interest to developers):
- make clean removes all created object module files of the mod_gzip source code files from the current directory (i. e. all files with the name pattern *.o).
- make clean additionally removes the created shared object file mod_gzip.so as well.
(Michael Schröpl, 2003-09-24)