Apache access logs, error logs, and Cronolog

Apache keeps two main logs for each vhost: the access_log and the error_log. They track just what you might think: visitor information for each vhost, and any errors encountered, respectively. Though Apache offers a lot of built-in customisation for these logs (including the format of each, the log location, et cetera), it does not offer much in the way of organising them. Before delving into the solution to that problem, let’s look at some of the customisation that can be done for each of these logs within Apache itself. For the access_log, one can choose what information is logged by setting up a LogFormat directive such as the one below:


LogFormat "%h %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-agent}i\"" combined

What does this directive actually do? The first portion, LogFormat, is the Apache directive itself, and it tells Apache that everything after it describes the desired format of its logs. Each element within the quotation marks tells Apache to log a particular piece of information, and here’s what each element listed above means:

  • %h – The host of the remote connection (the IP address of the visitor)
  • %u – The authenticated user name, when the resource is protected by HTTP Authentication (like Apache BasicAuth); logged as - otherwise
  • %t – The timestamp of the request (in the default format of [DD/MMM/YYYY:HH:MM:SS +/-offset from GMT], square brackets included)
  • %r – The request line, which contains the method (e.g. GET or POST), the resource requested, and the protocol used
  • %>s – The final HTTP status code that the user received for that resource (200 OK, 404 Not Found, et cetera)
  • %b – The size (in bytes) of the response sent to the user, excluding the HTTP headers; logged as - when no bytes were sent
  • %{Referer}i – The site from which the user came to request this resource
  • %{User-agent}i – Information about the user’s browser (e.g. Firefox, Chromium) and their OS (e.g. Windows 7, Linux, Android)
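
To see how those fields line up in practice, here is a minimal Python sketch that splits one line written in the format above back into named fields. The sample line, the regular expression, and the names LOG_PATTERN, SAMPLE, and parse_line are my own illustrations rather than anything that ships with Apache, and the pattern assumes that no quoted field contains an escaped quotation mark.

```python
import re

# One named group per element of the LogFormat above, in the same order.
# Assumption: quoted fields contain no escaped quotation marks.
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) '            # %h
    r'(?P<user>\S+) '            # %u
    r'\[(?P<time>[^\]]+)\] '     # %t
    r'"(?P<request>[^"]*)" '     # %r
    r'(?P<status>\d{3}) '        # %>s
    r'(?P<size>\S+) '            # %b
    r'"(?P<referer>[^"]*)" '     # %{Referer}i
    r'"(?P<agent>[^"]*)"'        # %{User-agent}i
)

# A made-up line in the format above (not taken from a real log).
SAMPLE = (
    '203.0.113.7 - [23/Jul/2012:08:15:00 +0000] '
    '"GET /index.html HTTP/1.1" 200 5120 '
    '"http://example.com/" "Mozilla/5.0 (X11; Linux x86_64)"'
)


def parse_line(line):
    """Return a dict of the logged fields, or None if the line does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    fields = match.groupdict()
    fields["status"] = int(fields["status"])
    # %b is written as "-" when no bytes were sent, so treat that as zero.
    fields["size"] = 0 if fields["size"] == "-" else int(fields["size"])
    return fields


fields = parse_line(SAMPLE)
```

With the sample line, fields["user"] is "-" because the request was not authenticated, and fields["status"] and fields["size"] come back as integers.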

The last portion of the LogFormat directive is the nickname, which in this case is “combined.” That nickname can be referenced in conjunction with the CustomLog directive in each vhost in order to call the particular format. For instance, if I had a particular vhost for which I wanted this LogFormat, I would reference it with a line like:


CustomLog /var/www/domains/somedomain.com/host/logs/access_log combined

Here, the first field (CustomLog) is the directive itself, the second field is the location of the log, and the third field is the nickname of the LogFormat that we previously defined.

One thing that Apache’s logging directives don’t do on their own, though, is organise the logs by time frame or similar. There is a programme that, even though it hasn’t been updated for some time now, works really well in achieving this goal. That programme is Cronolog. Many distributions might not have it in their repositories, and in those cases, one needs to compile it from source. However, Gentoo does offer it via Portage. Once you have it installed (either from your distribution’s package manager or compiled from source), configuring Cronolog is quite simple and straightforward. All that needs to be done is a slight change of syntax in the CustomLog directive from above. Here’s an example:


CustomLog "|/usr/sbin/cronolog /var/www/domains/somedomain.com/host/logs/%Y/%m/access_log" combined

Let’s break down that syntactical change piece by piece. The first field remains the same, as it is the Apache directive of CustomLog itself. The second field is now enclosed in quotation marks, and begins with a pipe. The pipe takes the log entries that Apache generates and passes them to the programme named after it; namely, Cronolog (which is, by default, located at /usr/sbin/cronolog). After that comes the same log location, with one key difference: the %Y and %m folders are now in the path. As you may have guessed, Cronolog creates a folder for the year (%Y), and a subfolder within it for each month (%m). When either of those stamps changes (in practice, each month), Cronolog creates the new folders / subfolders as needed, and writes the entries it receives from Apache to the access_log file within the appropriate subfolder.
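
As a rough illustration of that idea (and not Cronolog’s actual implementation), the Python sketch below expands the same path template with strftime and appends a line to the resulting file, creating the year and month folders when they are missing. The function names log_path and append_line are hypothetical, as are the dates used.

```python
import os
from datetime import datetime

# The same template that is passed to Cronolog in the CustomLog line above.
TEMPLATE = "/var/www/domains/somedomain.com/host/logs/%Y/%m/access_log"


def log_path(template, when):
    """Expand strftime-style stamps such as %Y and %m for the given time."""
    return when.strftime(template)


def append_line(template, line, when):
    """Append one log line to the file that the template names at `when`."""
    path = log_path(template, when)
    # Create the year / month folders the first time they are needed.
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "a") as log_file:
        log_file.write(line.rstrip("\n") + "\n")
    return path


# Only the paths are computed here; nothing is written to disk.
july = log_path(TEMPLATE, datetime(2012, 7, 31, 23, 59))
august = log_path(TEMPLATE, datetime(2012, 8, 1, 0, 0))
```

Here july ends in 2012/07/access_log and august ends in 2012/08/access_log, so an entry that arrives just after midnight on the first of the month lands in a new subfolder without Apache needing to know anything about it.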

Pretty neat, and easy to set up, eh? I hope that you feel empowered to use Cronolog within your Apache environment.

Cheers,
Zach

2 comments

    • Frederik on Monday, 23 July 2012 at 08:15

    Just a note: Apache, from at least version 2.2, includes a tool that does nearly everything (all?) that cronolog does. It is called rotatelogs, and you can find it under Apache’s bin directory.

      • Zach on Monday, 23 July 2012 at 15:54
        Author

      Hi Frederik,

      Yes, I thought about using the included ‘rotatelogs’ binary, but I have used cronolog in the past. When I found that it was readily available within Portage, I decided to use it for the time being. I may switch to rotatelogs at some point though. Thanks for bringing it to my attention.

      Cheers,
      Zach
