Angelo’s blog

A systems administrator’s diary
  • rss
  • Home
  • About me
  • Computers
  • Bibliography
  • Guides
    • IPSec VPN using FreeBSD
    • Setting up OpenVPN using radius on FreeBSD
    • Protect OWA using a reverse proxy
    • Monitoring a Windows machine with extended counters
  • Résumé

Graphing requests per second out of apache log files

March 19, 2009 | 11:00 am

I’ve been doing some analysis on some log files I got from a client. Since I have to propose an infrastructure that can support these web applications, I’d like to do some reconnaisance first, and see what the traffic is, and how many requests/sec are actually happening. Are we talking 10 req/s or 1000 req/s?

So, I wrote some scripts to graph out the requests/sec from an apache logfile, and I want to share this with you.

First of all, I want to merge the logfiles, so I get one big log file of all cluster nodes. For this, I use the logresolvemerge.pl script that comes with AWStats.

perl logresolvemerge.pl example.node1.log example.node2.log > example.log

Then basicly what I do, is I sort the log file (aparently apache log files are not really chronologic, probably because of the difference between the time a request is made, and the time a request is completed), then I run a script to get the number of requests per minute using this requests.rb script. I then create an rrd file, post all the information about the requests per minute into it, and then I export graphs :)

To sort the logfile, save this script as _sortlog.sh (you might want to use another location than ~/tmp as temporary sorting dir:

#!/bin/sh
if [ ! -f $1 ]; then
    echo "Usage: $0 "
    exit
fi
echo "Sorting $1"
sort -t ' ' -k 4.9,4.12n -k 4.5,4.7M -k 4.2,4.3n -k 4.14,4.15n -k 4.17,4.18n -k 4.20,4.21n -T ~/tmp $1 > $2

Here’s a complete script. Keep in mind I use FreeBSD, so on other *nix flavours you might have to change the script a bit (use expr instead of gexpr, etc). Requirements are the _sortlog.sh script and the requests.rb script mentioned above, and on FreeBSD the packages gexpr and rrdtool12 (or maybe rrdtool 1.3 wil word as well). The 4 timestamps mentioned in the script are for the exporting of the graphs. (My blog will replace straigt double quotes with the curly ones as well, you should replace them back):

#/bin/sh
#ERROR HANDLING
if [ $# -ne 1 ]; then
    echo "Usage: $0 "
    exit
fi
LOGFILENAME=`basename $1 .log`
TIMESTAMP_WEEK_BEGIN=1236654000
TIMESTAMP_WEEK_END=1237258800
TIMESTAMP_DAY_BEGIN=1237071600
TIMESTAMP_DAY_END=1237158000
echo "`date` started script for log $LOGFILENAME.log.."
#SORTING
echo `date` sorting log file "$LOGFILENAME".log to "$LOGFILENAME"_sorted.log..
sh _sortlog.sh "$LOGFILENAME".log "$LOGFILENAME"_sorted.log
#GETTING REQUESTS
echo `date` calculating requests per minute in file "$LOGFILENAME"_requests.txt..
cat "$LOGFILENAME"_sorted.log | ./requests.rb > "$LOGFILENAME"_requests.txt
#CREATING RRD
echo `date` making script "$LOGFILENAME"_createrrd.sh..
FIRSTTIMESTAMP=`head -n 1 "$LOGFILENAME"_requests.txt | awk '{print $1}'`
FIRSTTIMESTAMP=`gexpr "$FIRSTTIMESTAMP - 60"`
echo rrdtool create $LOGFILENAME.rrd \\ > "$LOGFILENAME"_createrrd.sh
echo --step 60  \\ >> "$LOGFILENAME"_createrrd.sh
echo --start $FIRSTTIMESTAMP \\ >> "$LOGFILENAME"_createrrd.sh
echo DS:requests:ABSOLUTE:60:0:U \\ >> "$LOGFILENAME"_createrrd.sh
echo RRA:AVERAGE:0.5:1:500 \\ >> "$LOGFILENAME"_createrrd.sh
echo RRA:AVERAGE:0.5:1:600 \\ >> "$LOGFILENAME"_createrrd.sh
echo RRA:AVERAGE:0.5:6:700 \\ >> "$LOGFILENAME"_createrrd.sh
echo RRA:AVERAGE:0.5:24:775 \\ >> "$LOGFILENAME"_createrrd.sh
echo RRA:AVERAGE:0.5:1440:3985 \\ >> "$LOGFILENAME"_createrrd.sh
echo RRA:MIN:0.5:1:600 \\ >> "$LOGFILENAME"_createrrd.sh
echo RRA:MIN:0.5:6:700 \\ >> "$LOGFILENAME"_createrrd.sh
echo RRA:MIN:0.5:24:775 \\ >> "$LOGFILENAME"_createrrd.sh
echo RRA:MIN:0.5:1440:3985 \\ >> "$LOGFILENAME"_createrrd.sh
echo RRA:MAX:0.5:1:500 \\ >> "$LOGFILENAME"_createrrd.sh
echo RRA:MAX:0.5:1:600 \\ >> "$LOGFILENAME"_createrrd.sh
echo RRA:MAX:0.5:6:700 \\ >> "$LOGFILENAME"_createrrd.sh
echo RRA:MAX:0.5:24:775 \\ >> "$LOGFILENAME"_createrrd.sh
echo RRA:MAX:0.5:1440:3985 \\ >> "$LOGFILENAME"_createrrd.sh
echo RRA:LAST:0.5:1:600 \\ >> "$LOGFILENAME"_createrrd.sh
echo RRA:LAST:0.5:6:700 \\ >> "$LOGFILENAME"_createrrd.sh
echo RRA:LAST:0.5:24:775 \\ >> "$LOGFILENAME"_createrrd.sh
echo RRA:LAST:0.5:1440:3985 >> "$LOGFILENAME"_createrrd.sh
echo `date` deleting "$LOGFILENAME".rrd..
rm "$LOGFILENAME".rrd
echo `date` executing script "$LOGFILENAME"_createrrd.sh..
sh "$LOGFILENAME"_createrrd.sh
 
#DO THE UPDATES
echo `date` making script "$LOGFILENAME"_updaterrd.sh..
rm "$LOGFILENAME"_updaterrd.sh
cat "$LOGFILENAME"_requests.txt | while read line; do
  AWKSTRING="{print \"rrdtool update $LOGFILENAME.rrd \" \$1 \":\" \$2}"
  echo `echo ${line} | awk "$AWKSTRING"` >> "$LOGFILENAME"_updaterrd.sh
done
echo `date` executing script "$LOGFILENAME"_updaterrd.sh..
sh "$LOGFILENAME"_updaterrd.sh
# MAKING GRAPH
echo `date` making script "$LOGFILENAME"_graph_week.sh..
echo /usr/local/bin/rrdtool graph "$LOGFILENAME"_week.png \\ >  "$LOGFILENAME"_graph_week.sh
echo --imgformat=PNG \\ >> "$LOGFILENAME"_graph_week.sh
echo --start=$TIMESTAMP_WEEK_BEGIN \\ >> "$LOGFILENAME"_graph_week.sh
echo --end=$TIMESTAMP_WEEK_END \\ >> "$LOGFILENAME"_graph_week.sh
echo --title=\"$LOGFILENAME requests\" \\ >> "$LOGFILENAME"_graph_week.sh
echo --rigid \\ >> "$LOGFILENAME"_graph_week.sh
echo --base=1000 \\ >> "$LOGFILENAME"_graph_week.sh
echo --height=120 \\ >> "$LOGFILENAME"_graph_week.sh
echo --width=500 \\ >> "$LOGFILENAME"_graph_week.sh
echo --alt-autoscale-max \\ >> "$LOGFILENAME"_graph_week.sh
echo --lower-limit=0 \\ >> "$LOGFILENAME"_graph_week.sh
echo --vertical-label=\"requests per second\" \\ >> "$LOGFILENAME"_graph_week.sh
echo --slope-mode \\ >> "$LOGFILENAME"_graph_week.sh
echo --font TITLE:9:/usr/local/share/rrdtool/fonts/DejaVuSansMono-Roman.ttf \\ >> "$LOGFILENAME"_graph_week.sh
echo --font AXIS:7:/usr/local/share/rrdtool/fonts/DejaVuSansMono-Roman.ttf \\ >> "$LOGFILENAME"_graph_week.sh
echo --font LEGEND:8:/usr/local/share/rrdtool/fonts/DejaVuSansMono-Roman.ttf \\ >> "$LOGFILENAME"_graph_week.sh
echo --font UNIT:7:/usr/local/share/rrdtool/fonts/DejaVuSansMono-Roman.ttf \\ >> "$LOGFILENAME"_graph_week.sh
echo DEF:a=\"$LOGFILENAME.rrd\":requests:AVERAGE \\ >> "$LOGFILENAME"_graph_week.sh
echo DEF:b=\"$LOGFILENAME.rrd\":requests:LAST \\ >> "$LOGFILENAME"_graph_week.sh
echo DEF:c=\"$LOGFILENAME.rrd\":requests:MIN \\ >> "$LOGFILENAME"_graph_week.sh
echo DEF:d=\"$LOGFILENAME.rrd\":requests:MAX \\ >> "$LOGFILENAME"_graph_week.sh
echo AREA:a#6EA1007F:\"HTTP Requests\"  \\ >> "$LOGFILENAME"_graph_week.sh
echo GPRINT:b:LAST:\"Current\\:%8.0lf\"  \\ >> "$LOGFILENAME"_graph_week.sh
echo GPRINT:a:AVERAGE:\"Average\\:%8.0lf\"  \\ >> "$LOGFILENAME"_graph_week.sh
echo GPRINT:d:MAX:\"Maximum\\:%8.0lf\"  \\ >> "$LOGFILENAME"_graph_week.sh
echo LINE1:a#6EA100FF:\"\" >> "$LOGFILENAME"_graph_week.sh
echo `date` executing script "$LOGFILENAME"_graph_week.sh..
sh "$LOGFILENAME"_graph_week.sh
 
echo `date` making script "$LOGFILENAME"_graph_day.sh..
echo /usr/local/bin/rrdtool graph "$LOGFILENAME"_day.png \\ >  "$LOGFILENAME"_graph_day.sh
echo --imgformat=PNG \\ >> "$LOGFILENAME"_graph_day.sh
echo --start=$TIMESTAMP_DAY_BEGIN \\ >> "$LOGFILENAME"_graph_day.sh
echo --end=$TIMESTAMP_DAY_END \\ >> "$LOGFILENAME"_graph_day.sh
echo --title=\"$LOGFILENAME requests\" \\ >> "$LOGFILENAME"_graph_day.sh
echo --rigid \\ >> "$LOGFILENAME"_graph_day.sh
echo --base=1000 \\ >> "$LOGFILENAME"_graph_day.sh
echo --height=120 \\ >> "$LOGFILENAME"_graph_day.sh
echo --width=500 \\ >> "$LOGFILENAME"_graph_day.sh
echo --alt-autoscale-max \\ >> "$LOGFILENAME"_graph_day.sh
echo --lower-limit=0 \\ >> "$LOGFILENAME"_graph_day.sh
echo --vertical-label=\"requests per second\" \\ >> "$LOGFILENAME"_graph_day.sh
echo --slope-mode \\ >> "$LOGFILENAME"_graph_day.sh
echo --font TITLE:9:/usr/local/share/rrdtool/fonts/DejaVuSansMono-Roman.ttf \\ >> "$LOGFILENAME"_graph_day.sh
echo --font AXIS:7:/usr/local/share/rrdtool/fonts/DejaVuSansMono-Roman.ttf \\ >> "$LOGFILENAME"_graph_day.sh
echo --font LEGEND:8:/usr/local/share/rrdtool/fonts/DejaVuSansMono-Roman.ttf \\ >> "$LOGFILENAME"_graph_day.sh
echo --font UNIT:7:/usr/local/share/rrdtool/fonts/DejaVuSansMono-Roman.ttf \\ >> "$LOGFILENAME"_graph_day.sh
echo DEF:a=\"$LOGFILENAME.rrd\":requests:AVERAGE \\ >> "$LOGFILENAME"_graph_day.sh
echo DEF:b=\"$LOGFILENAME.rrd\":requests:LAST \\ >> "$LOGFILENAME"_graph_day.sh
echo DEF:c=\"$LOGFILENAME.rrd\":requests:MIN \\ >> "$LOGFILENAME"_graph_day.sh
echo DEF:d=\"$LOGFILENAME.rrd\":requests:MAX \\ >> "$LOGFILENAME"_graph_day.sh
echo AREA:a#6EA1007F:\"HTTP Requests\"  \\ >> "$LOGFILENAME"_graph_day.sh
echo GPRINT:b:LAST:\"Current\\:%8.0lf\"  \\ >> "$LOGFILENAME"_graph_day.sh
echo GPRINT:a:AVERAGE:\"Average\\:%8.0lf\"  \\ >> "$LOGFILENAME"_graph_day.sh
echo GPRINT:d:MAX:\"Maximum\\:%8.0lf\"  \\ >> "$LOGFILENAME"_graph_day.sh
echo LINE1:a#6EA100FF:\"\" >> "$LOGFILENAME"_graph_day.sh
echo `date` executing script "$LOGFILENAME"_graph_day.sh..
sh "$LOGFILENAME"_graph_day.sh
 
echo `date` done

You can run this script using the command:

$ ./go.sh example.log

And the output would look like:

Wed Mar 18 17:30:30 CET 2009 started script for log example.log..
Wed Mar 18 17:30:30 CET 2009 sorting log file example.log to example_sorted.log..
Sorting example.log
Wed Mar 18 17:54:23 CET 2009 calculating requests per minute in file example_requests.txt..
Thu Mar 19 03:22:42 CET 2009 making script example_createrrd.sh..
Thu Mar 19 03:22:42 CET 2009 deleting example.rrd..
rm: example.rrd: No such file or directory
Thu Mar 19 03:22:42 CET 2009 executing script example_createrrd.sh..
Thu Mar 19 03:22:42 CET 2009 making script example_updaterrd.sh..
rm: example_updaterrd.sh: No such file or directory
Thu Mar 19 03:23:10 CET 2009 executing script example_updaterrd.sh..
Thu Mar 19 03:23:27 CET 2009 making script example_graph_week.sh..
Thu Mar 19 03:23:27 CET 2009 executing script example_graph_week.sh..
595×199
Thu Mar 19 03:23:27 CET 2009 making script example_graph_day.sh..
Thu Mar 19 03:23:27 CET 2009 executing script example_graph_day.sh..
595×199
Thu Mar 19 03:23:27 CET 2009 done

Note the requests.rb file can take long, the above output was for an 18GB file. The graphs will perhaps look like this:

example_day
example_week

Categories
Linux/Unix, Misc
Comments rss
Comments rss
Trackback
Trackback

« Seamless patterns Drop all extended properties in a MSSQL database »

3 responses

I've been looking all over fro something that does this

Paul | January 22, 2010 | 3:54 am

I’ve been looking all over fro something that does this for quite a while. Your requests.rb link is broken. Do you have a copy of this available?

The link works for me, but here's a mirror: http://files.hongens.nl/2010/01/23/requests.rb

angelo | January 23, 2010 | 10:06 am

The link works for me, but here’s a mirror: http://files.hongens.nl/2010/01/23/requests.rb

You're right, the timestamps being out of order is because

Andrew | July 22, 2010 | 2:48 am

You’re right, the timestamps being out of order is because of the difference between the time at which a request starts (the timestamp on the log line) and when it finishes (which is when it’s written into the log line).

This is very useful information. It may not be something you want for your current purpose, but it would be a shame to throw this information away.

It’s very useful to be able to identify the log lines where the timestamp recorded is significantly behind the latest timestamp seen by that point in the logs, as this identifies the long running requests, which are likely to be performance issues on your site.

Leave a comment

You can use these tags : <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>

Pages

  • Guides
    • IPSec VPN using FreeBSD
    • Monitoring a Windows machine with extended counters
    • Protect OWA using a reverse proxy
    • Setting up OpenVPN using radius on FreeBSD

Categories

  • Games
  • Hardware
  • Linux/Unix
  • Mac
  • Microsoft
  • Misc
  • Security/privacy
  • Virtualization

Archives

  • February 2010
  • March 2009
  • November 2008
  • June 2008
  • February 2008
  • December 2007
  • November 2007
  • October 2007
  • September 2007
  • August 2007
  • May 2007
  • March 2007
  • February 2007
  • January 2007
  • December 2006
  • October 2006

Friends' sites

  • ErwinK’s site
  • Jef’s site
  • Judith’s site

Work

  • NetMatch
rss Comments rss valid xhtml 1.1 design by jide powered by Wordpress get firefox