AWStats to analyse HAProxy logs

Medibuntu used HAProxy as its load balancer. Getting AWStats to analyse the HAProxy logs turned out to be a bit of a challenge. Here is a howto; there may well be a simpler way to do it, but I couldn't find one.

This article assumes that you know how to configure HAProxy, rsyslog and AWStats.

Setting up the logs

The first step is to make HAProxy logs available. We will use rsyslog to do so.

First, modify the HAProxy configuration to enable CLF (Common Log Format) logs and to send them to syslog:

# haproxy.cfg snippet
global
    log 127.0.0.1 local0

defaults
    log     global       # use the log definition from the global section
    mode    http
    option  httplog clf
    option  dontlognull

The local0 syslog facility is used. Now we need to make rsyslog listen for UDP datagrams on localhost:

# rsyslog.conf snippet
$ModLoad imudp
# the bind address has to be set before the listener is started
$UDPServerAddress 127.0.0.1
$UDPServerRun 514
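
To check that rsyslog really receives datagrams on that socket, a test message can be sent by hand. The following is only a minimal Python sketch and was not part of the original setup: the function names, the haproxy-test tag and the message text are made up for illustration. It builds a BSD-syslog style datagram whose priority value is facility * 8 + severity, local0 being facility 16.

```python
import socket
import time

LOCAL0 = 16   # numeric code of the local0 facility
INFO = 6      # numeric code of the "info" severity


def build_packet(text, when=None, tag="haproxy-test"):
    """Return a BSD-syslog style datagram: "<PRI>timestamp tag: text"."""
    pri = LOCAL0 * 8 + INFO                      # 134 for local0.info
    stamp = time.strftime("%b %d %H:%M:%S", when or time.localtime())
    return f"<{pri}>{stamp} {tag}: {text}".encode("utf-8")


def send_test_message(text, host="127.0.0.1", port=514):
    """Send one datagram to the UDP listener configured above (hypothetical helper)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(build_packet(text), (host, port))
```

Calling send_test_message("hello") on the HAProxy host should make the message show up wherever rsyslog currently writes local0 messages.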

Next we need to redirect the HAProxy logs to the correct location. In addition, HAProxy on Medibuntu handled several backends, so an explicit filter was added. The names of the interesting backends all start with medibuntu (pattern ^medibuntu.*). The frontend is called http_in.

The rsyslog rule is:

# /etc/rsyslog.d/haproxy.conf
$template rawFormat,"%msg%\n"
if $syslogfacility-text == 'local0' and $msg contains '\"http_in\" \"medibuntu' then /var/log/haproxy-medibuntu.log;rawFormat
if $syslogfacility-text == 'local0' then ~

  • The first line defines a log format in which only the message is stored (no time, no facility, no priority, no program name).
  • The second line is the filter. The message must come from the local0 facility and must contain the http_in frontend name immediately followed by a backend name starting with medibuntu.
  • The last line stops the processing of all remaining local0 messages (if it is omitted, the messages also end up in the default syslog file).
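
The effect of the two filter rules can be sketched in a few lines of Python. This is only an illustration of the logic, not something rsyslog runs: the route function and the two sample messages are invented, and the exact field layout of the sample lines is an assumption.

```python
# Substring tested by the rsyslog rule: frontend name followed by the
# beginning of the backend name, both quoted by HAProxy.
NEEDLE = '"http_in" "medibuntu'


def route(facility, msg):
    """Return the destination of a message, or None if it is discarded."""
    if facility == "local0" and NEEDLE in msg:
        return "/var/log/haproxy-medibuntu.log"   # first rule
    if facility == "local0":
        return None                               # second rule: "~" discards
    return "default"                              # other facilities are untouched


# Made-up messages shaped like HAProxy CLF lines (note the leading space).
kept = (' 10.0.0.1 - - [04/Jun/2011:08:21:17 +0000] "GET / HTTP/1.1" 200 512 '
        '"" "" 4242 17 "http_in" "medibuntu_repo" "srv1"')
dropped = (' 10.0.0.1 - - [04/Jun/2011:08:21:18 +0000] "GET / HTTP/1.1" 200 512 '
           '"" "" 4243 18 "http_in" "other_backend" "srv2"')
```

Here route("local0", kept) gives the Medibuntu log file, while route("local0", dropped) gives None. The match is a plain substring test, so it relies on HAProxy writing the frontend name directly before the backend name.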

The rules can be simplified if HAProxy handles only one backend. It could be something like this (untested):

$template rawFormat,"%msg%\n"
local0.* /var/log/file.log;rawFormat
local0.* ~

Restart HAProxy and rsyslog; logging should now be working.

AWStats

The AWStats configuration file needs nothing specific except the log format. Set LogFormat to 4:

LogFormat=4

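awstats.pl itself, however, must be modified: rsyslog begins every log line with a space, and AWStats (7.0) doesn't like that at all. I am not sure where the space comes from. It is most likely the separator that follows the syslog tag, which rsyslog keeps as part of %msg%; in that case a template using %msg:2:$% instead of %msg% should drop it (untested here). The solution I chose was simply to patch awstats.pl. Here is the diff: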

--- awstats.pl.orig  2011-06-04 10:24:04.000000000 +0200
+++ awstats.pl       2011-06-04 08:21:17.000000000 +0200
@@ -8918,7 +8918,7 @@
             elsif ( $LogFormat eq '4' ) {    # Same than "%h %l %u %t \"%r\" %>s %b"
                      # %u (user) is "(.+)" instead of "[^ ]+" because can contain space (Lotus Notes).
                     $PerlParsingFormat =
-"([^ ]+) [^ ]+ (.+) \\[([^ ]+) [^ ]+\\] \\\"([^ ]+) ([^ ]+)(?: [^\\\"]+|)\\\" ([\\d|-]+) ([\\d|-]+)";
+" *([^ ]+) [^ ]+ (.+) \\[([^ ]+) [^ ]+\\] \\\"([^ ]+) ([^ ]+)(?: [^\\\"]+|)\\\" ([\\d|-]+) ([\\d|-]+)";
                     $pos_host    = 0;
                     $pos_logname = 1;
                     $pos_date    = 2;
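
The effect of the patch can be reproduced outside AWStats. The sketch below transcribes the LogFormat 4 pattern from the diff into Python and applies it to a made-up log line. It assumes that AWStats anchors the pattern at the start of the line, which the need for the patch suggests. The parse helper and the sample line are illustrative only.

```python
import re

# Python transcription of the LogFormat=4 pattern from the diff above.
ORIGINAL = (r'([^ ]+) [^ ]+ (.+) \[([^ ]+) [^ ]+\] '
            r'"([^ ]+) ([^ ]+)(?: [^"]+|)" ([\d|-]+) ([\d|-]+)')
PATCHED = " *" + ORIGINAL   # the patch only prepends " *"

# Made-up sample line, with the leading space added by rsyslog.
line = ' 10.0.0.1 - - [04/Jun/2011:08:21:17 +0000] "GET /dists/ HTTP/1.1" 200 512 "" ""'


def parse(pattern, text):
    """Return the captured fields, or None when the line is rejected."""
    match = re.match(pattern, text)   # re.match anchors at the start of text
    return match.groups() if match else None
```

With these definitions parse(ORIGINAL, line) returns None, whereas parse(PATCHED, line) returns the host, user, date, method, URL, status and size fields. Stripping the leading space from the line before parsing makes the original pattern work as well.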

AWStats can now be run and should parse the log file correctly.

Large log files

Medibuntu generated a large volume of log messages. Take care if that is your case too: the log files need to be rotated. We used logrotate with this configuration:

/var/log/haproxy-medibuntu.log {
        daily
        missingok
        rotate 7
        compress
        notifempty
        create 640 root adm
        sharedscripts
        prerotate
                cp /var/log/haproxy-medibuntu.log /var/log/haproxy-medibuntu-awstats.log
        endscript
}

Note that the log file is copied by the prerotate script before the rotation is performed. AWStats is configured to read this copy (/var/log/haproxy-medibuntu-awstats.log).