Web server logs are a powerful resource for extracting information about your Web sites and applications. When logging is enabled, Web servers log information about each request. By analyzing these logs you can gain knowledge about which resources are popular, what kinds of browsers people are using, how much bandwidth a Web site consumes, the request trend during a given time span, and so on.
Because log analysis is so important, there are many commercial software applications that create useful and good-looking reports. Unfortunately, they’re often also very expensive, usually out of budget range for people with small sites. In addition, some log analyzers have a performance impact on the server itself. For example, recent versions of WebTrends (one of the more popular commercial packages) require considerable RAM and processor time. If you can’t afford to devote an entire server to Web log analysis and report creation, then you need a cheap and fast solution that isn’t terribly resource-intensive. You’re in luck; such a solution exists, and its name is AWStats.
AWStats is a free, open source log analyzer. It can analyze a wide variety of logs, and it creates good reports. They are not the full-featured, interactive, special-effect-filled reports generated by costly commercial software, but they are still useful, with all the most important data, presented tastefully. It is also very fast, and doesn’t consume many resources. In fact, when you run it in the offline mode described below, AWStats doesn’t consume any resources at all except while it is generating reports.
AWStats is a Perl application with a simple structure. It requires an interpreter that can execute Perl scripts and has a small footprint on the server. The Perl source is neither very clear nor particularly readable, so it’s not easy to change the behavior of the software; but you can use it “as is” for most common needs.
AWStats works in two modes. The first is interactive, or “online”, mode. When used in online mode, AWStats updates its reports only when a user requests it. The second is the “offline” mode, in which AWStats analyzes data and creates static reports as HTML pages publishable via any standard Web server.
In this article, you’ll see how to install and use AWStats only in offline mode, which is preferable because it minimizes both security risks and resource usage, so it doesn’t undermine your Web server’s performance. In offline mode you schedule report generation during off-peak hours and avoid running it during busier times.
You’ll see how to install AWStats, analyze logs, and publish generated reports from IIS6 log files.
To install AWStats on Windows, you need to download a Perl interpreter (if you don’t already have one) and the AWStats software (scripts). I recommend ActivePerl 5.8, which you can download from http://www.activestate.com/ for free. ActivePerl 5.8 installs using a standard MSI installation. During the installation process, you have the option of installing an ISAPI Perl extension, but I recommend not installing it, because you don’t need it to run AWStats in offline mode as discussed in this article. However, if you do want to run AWStats (or other Perl scripts) online, go ahead and install it.
Next, download AWStats (currently at version 6.4).
Author’s Note: You need to be aware that earlier AWStats versions have known security problems, so if you’re already running AWStats and haven’t updated to the latest version, you should do so immediately. If you’re planning to install AWStats, make sure you get the latest version before performing the install.
The download is a .zip file that you can expand wherever you want. The AWStats .zip file contains three folders: docs, tools, and wwwroot. I suggest deleting the docs folder (you can find documentation online, or you can copy it to your workstation, but you don’t need it on your production server). Next, I suggest you create a new folder where you will copy only the files you really need to run AWStats, both to minimize the installation complexity and to create a “personal distribution” of the software. For example, you can create an E:\myApps\awstats-6.4\bin folder. This is the folder I’ve used for the sample application; I’ll refer to it from here on out as simply the bin folder.
Here’s the rest of the procedure:
- Copy the folders named css, icon, lang, lib, and plugin from the wwwroot\cgi-bin folder (created when you unzipped the AWStats download) to the bin folder.
- From that same folder, also copy the file awstats.pl.
- Next, copy the awstats_buildstaticpages.pl file from the tools folder.
- Finally, create a bin\dirdata folder that you’ll use to manage AWStats’ databases.
That’s it; you don’t need any of the other files and folders from the download to run AWStats in offline mode.
You’ll use the bin folder to hold the configuration files defined later in this article as well as batch scripts to automatically run the application.
You control AWStats by creating or (more often) altering configuration files. A configuration file is a text file with a .conf extension that contains information about a Web site’s logs. You must create a configuration file for each Web site you want to analyze.
You should name your configuration files using the naming pattern awstats.CONFIGNAME.conf, where “CONFIGNAME” is variable and is the name of the configuration you want to create. You can later refer to specific configuration files using that variable name. For example, when you instruct AWStats to apply a “CONFIGNAME” configuration, it will look for a file named awstats.CONFIGNAME.conf in the bin folder.
A configuration file can “include” options defined in another file, so it’s best to create a base/general configuration file that contains the standard options you want to apply to all reports, and then define a specific file for each Web site. The site-specific file contains information such as where the logs to analyze are, how to process them for this site, and so on. Extracting the standard options into a separate general file and then including that in the site-specific files simplifies managing a hosting environment where you want to analyze data for many Web sites.
AWStats comes with a template configuration called awstats.model.conf that you can use to create new files (you can find it in the wwwroot\cgi-bin folder from the installation).
Put all .conf files in the bin folder you created earlier.
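Because every Web site needs its own small configuration file, the files lend themselves to being generated. The following Python sketch is only an illustration and is not part of AWStats: the function names are hypothetical, the log path in the usage line is a placeholder, and it writes just four directives (Include, LogFile, SiteDomain, and HostAliases), leaving everything else to the shared base file.

```python
def conf_filename(config_name):
    """Return the file name AWStats looks for when given this configuration name."""
    return f"awstats.{config_name}.conf"


def render_site_conf(site_domain, log_file, common_conf="awstats.common.conf"):
    """Build the text of a minimal site-specific configuration file.

    common_conf is the shared base file pulled in with the Include directive.
    """
    lines = [
        f'Include "{common_conf}"',
        f'LogFile="{log_file}"',
        f'SiteDomain="{site_domain}"',
        f'HostAliases="{site_domain}"',
    ]
    return "\n".join(lines) + "\n"


# Usage: print the file name and contents for one (hypothetical) site.
print(conf_filename("www.website.com"))
print(render_site_conf("www.website.com", r"E:\SomeLogFolder\ex%YY-0%MM-0.log"))
```

To use it for real, you would write the returned text to a file with the returned name in the folder that holds your other .conf files.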
IIS 6 can record data in either log files or database tables; the precise fields it records are configurable. To use IIS 6 with AWStats, you need to make sure that IIS 6 logging is turned on and that it’s saving data to log files, and you need to specify which data fields to collect and how to write them. You can control these options from the IIS MMC snap-in management application, editing the log properties for a specific Web site or for all Web sites simultaneously. You can launch the IIS MMC through Control Panel by double-clicking the Administrative Tools item, and then Internet Information Services.
After launching the IIS MMC:
- Enable extended logging capabilities by checking the “Enable logging” option.
- Accept the default “W3C Extended Log File Format” option.
- Next, click the Properties button to edit the logging options.
- Set up a log schedule (I suggest monthly, because that schedule makes it easy to manage through AWStats configuration). Check the “Use local time for file naming and rollover” option; however, note that doing so doesn’t change the way IIS records time values in the log, only how it names and rolls over the files.
- Select a log file folder (I suggest moving the logs to a different one than the default, so you can have a simple log storage path for all your Web sites).
- Set the fields that IIS will log. To do that, select the Advanced tab, and check only these options: date, time, client IP, user name, method, URI stem, URI query, protocol status, bytes sent, protocol version, user agent, and referrer. Check your selections, because AWStats log analysis depends on having the correct format.
Finally, if there’s an existing and active log file for your Web sites, delete (or rename) it, which will force IIS to create a new one and begin filling it using the new format. If you cannot access this file because IIS is using it, restart IIS by entering iisreset.exe at a command prompt.
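Before handing a log to AWStats, you can confirm that IIS really is writing the fields you selected. W3C extended logs list their columns in a “#Fields:” directive near the top of the file. The following Python sketch compares that directive with the twelve fields selected above; the W3C identifiers in EXPECTED_FIELDS are my assumption of how the option names in the dialog map to column names, and the function names and sample header are made up for illustration.

```python
# Assumed W3C identifiers for: date, time, method, URI stem, URI query,
# user name, client IP, protocol version, user agent, referrer,
# protocol status, and bytes sent.
EXPECTED_FIELDS = [
    "date", "time", "cs-method", "cs-uri-stem", "cs-uri-query",
    "cs-username", "c-ip", "cs-version", "cs(User-Agent)",
    "cs(Referer)", "sc-status", "sc-bytes",
]


def fields_from_header(lines):
    """Return the field names from the last '#Fields:' directive, or None."""
    fields = None
    for line in lines:
        if line.startswith("#Fields:"):
            fields = line[len("#Fields:"):].split()
    return fields


def compare_fields(lines, expected=EXPECTED_FIELDS):
    """Return (missing, extra) field names relative to the expected list."""
    found = fields_from_header(lines) or []
    missing = [name for name in expected if name not in found]
    extra = [name for name in found if name not in expected]
    return missing, extra


# Usage with a made-up header that has the wrong selection of fields.
sample = [
    "#Software: Microsoft Internet Information Services 6.0",
    "#Fields: date time s-ip cs-method cs-uri-stem c-ip sc-status",
]
missing, extra = compare_fields(sample)
print("missing:", missing)
print("extra:", extra)
```

An empty result for both lists suggests the log matches the selection; anything else means the logging options are worth rechecking before you analyze the file.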
It’s best to edit AWStats configuration files using WordPad (not Notepad or another Windows text editor), because the provided templates are in a Unix file format, which uses a single line feed (LF) character for line endings rather than the carriage return/line feed (CR/LF) character pair that Windows uses, so it’s not easy to manage them with Notepad. A typical configuration file for a Web site might look like the following (note that I’ve removed all of the AWStats template notes, but you can find them included in the downloadable code).
Include "awstats.common.conf"
LogFile="E:\Logs\Web\W3SVC529796009\ex%YY-0%MM-0.log"
SiteDomain="www.website.com"
HostAliases="www.website.com"
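These settings tell AWStats which Web site the file describes and where to look for its IIS log files. AWStats replaces the %YY-0 and %MM-0 tags in the LogFile value with the current two-digit year and month, which matches the way IIS names monthly logs: by default, IIS dynamically creates file names using an exYYMM.log pattern. For example, for November 2005, IIS would use a file named ex0511.log. You can refer to the AWStats documentation to check supported patterns.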
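To see how a LogFile pattern resolves to an actual file name, here is a small Python sketch that mimics the substitution for a few date tags. It is an approximation for illustration only, not AWStats’ own code: it assumes that the number after the dash means “this many hours ago” and it handles only the %YYYY, %YY, %MM, and %DD tags. The function name is hypothetical.

```python
import re
from datetime import datetime, timedelta

# strftime codes for the handful of tags this sketch understands.
TAG_FORMATS = {"YYYY": "%Y", "YY": "%y", "MM": "%m", "DD": "%d"}

# YYYY must come before YY so the longer tag wins.
TAG_PATTERN = re.compile(r"%(YYYY|YY|MM|DD)-(\d+)")


def expand_logfile(pattern, now):
    """Replace %TAG-n markers with date parts taken from n hours before 'now'."""
    def replace(match):
        tag, hours_back = match.group(1), int(match.group(2))
        moment = now - timedelta(hours=hours_back)
        return moment.strftime(TAG_FORMATS[tag])

    return TAG_PATTERN.sub(replace, pattern)


# Usage: the November 2005 example from the text.
print(expand_logfile("ex%YY-0%MM-0.log", datetime(2005, 11, 15)))  # ex0511.log
```

A nonzero offset is useful when a report runs just after a rollover; for example, a value of 24 in every tag would point at yesterday’s file under this sketch’s assumption.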
The included file awstats.common.conf is the “base” file discussed above that you can use to manage all shared configuration options. You’ll “include” this file with more specific configurations. Create the file as a copy of the awstats.model.conf template provided with the AWStats installation. You can leave all options as defined in this template, remove all “include” commands, and overwrite the following entries:
LogFormat="date time cs-method cs-uri-stem cs-uri-query cs-username c-ip cs-version cs(User-Agent) cs(Referer) sc-status sc-bytes"
DirData="E:\myApps\awstats-6.4\bin\dirdata"
DirIcons="icon"
Logo="logo_huge.gif"
LogoLink="http://www.huge.it"
LoadPlugin="timezone +1"
In the preceding code, the LogFormat option is one of the most important options. LogFormat=2, the default option for IIS logs, does not work on IIS 6 because IIS 6 records the fields in an order that AWStats doesn’t expect for that setting. Instead, you must explicitly specify the field order as shown above. The pattern I provided works well on IIS 6, so you can simply copy and paste it.
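The reason field order matters becomes obvious if you split a log row yourself: each value is identified only by its position. The following Python sketch pairs the names in the LogFormat string with the space-separated values of one row. It illustrates the idea rather than AWStats’ actual parser, and the sample row and function name are invented for the example.

```python
LOG_FORMAT = (
    "date time cs-method cs-uri-stem cs-uri-query cs-username "
    "c-ip cs-version cs(User-Agent) cs(Referer) sc-status sc-bytes"
)


def parse_line(line, log_format=LOG_FORMAT):
    """Map one log row to {field name: value}; return None if it does not fit."""
    names = log_format.split()
    values = line.split()
    # Directive lines start with '#'; rows with the wrong number of
    # columns cannot be matched to the format and are rejected.
    if line.startswith("#") or len(values) != len(names):
        return None
    return dict(zip(names, values))


# Usage with a made-up row (IIS writes '+' instead of spaces in the user agent).
row = ("2005-11-30 23:30:00 GET /default.htm - - 192.0.2.10 HTTP/1.1 "
       "Mozilla/4.0+(compatible;+MSIE+6.0) - 200 5120")
record = parse_line(row)
print(record["cs-uri-stem"], record["sc-status"], record["sc-bytes"])
```

If the log contained the same twelve values in a different order, the row would still split into twelve pieces, but the status code might land under the wrong name, which is why the LogFormat string has to mirror the log exactly.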
DirData is the working folder for AWStats. AWStats saves its working files (its statistics databases) in this folder. Although you can change it to whatever you like, the example uses the bin\dirdata folder you created during the installation.
DirIcons is the folder that contains all graphic files used to create reports.
Logo defines the file name of the logo to publish on the reports (typically your company logo), and LogoLink defines the URL that the logo links to. In this case, you may put the logo file in the bin\icon\others folder. You can leave the standard configuration values for experimentation purposes, but you should be aware that you can create reports with a custom logo.
The LoadPlugin timezone option tells AWStats to “correct” time values found in log files. IIS records data using Greenwich Mean Time (GMT), so, if you want your reports to be based on your local time zone, you have to correct the values from the logs. For example, “timezone +1” adds 1 hour to time values recorded by IIS (+1 is Italy’s time zone, so with this setting I can read reports with results based on my local time). While AWStats manages time zone adjustment using this setting, it doesn’t correct for daylight saving time, so you must manually update this value when daylight saving time begins and ends.
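The arithmetic behind the time zone correction is simple, but it has one consequence worth seeing: a request logged late in the evening can move into the next day, or even the next month, of your report. The following Python sketch shows only that arithmetic; it is not the plugin’s implementation, and the function name is hypothetical.

```python
from datetime import datetime, timedelta


def to_local(date_text, time_text, offset_hours):
    """Shift a logged GMT date and time by a fixed number of hours."""
    logged = datetime.strptime(f"{date_text} {time_text}", "%Y-%m-%d %H:%M:%S")
    return logged + timedelta(hours=offset_hours)


# Usage: a hit logged at 23:30 GMT on November 30 belongs to December 1
# once the +1 offset is applied.
print(to_local("2005-11-30", "23:30:00", 1))  # 2005-12-01 00:30:00
```

During daylight saving time the offset for Italy would be 2 rather than 1, which is the manual change described above.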
Running AWStats is quite simple: you need to run only awstats.pl and awstats_buildstaticpages.pl; however, you do need to provide some parameters. The main parameter is -config, which names the configuration file the program uses to analyze data and create reports.
Eventually, you’ll schedule report generation automatically, but at first, it’s useful to run AWStats directly from the command line, both to see how it works and to test that your configuration files are OK. First, create an awstats.www.companysite.com.conf file in your bin folder as described in the previous section. You will need to have an IIS log file to analyze, so you can point your .conf file at the log file you specified using the IIS MMC.
Now select Start, Run and enter the command cmd.exe /f:on. In the command window that opens, change directories to the bin folder that contains all the AWStats resources.
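To run your first analysis, enter a command of the following form, where the -config value is the CONFIGNAME part of your configuration file name:
awstats.pl -update -config=www.companysite.com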
That command causes AWStats to open your log file, analyze it, show the results, and then end. If everything worked, you’ll find a new file in your bin\dirdata folder. The actual file name depends on the configuration file name, the analyzed month, and so on. If any problems occur (you can check this from the results report), make sure that your Perl interpreter is installed and working, that your .conf files are defined correctly, and that your IIS log file is in the correct format and exists in the location where AWStats expects to find it. If you find that AWStats failed to decode only some rows (the results will indicate such problems), don’t panic. If most rows are decoded, that’s OK; otherwise you probably have problems with your IIS log format.
Run the same command again; you’ll see that this time AWStats runs very fast, skipping all the records it’s already analyzed. That’s because it builds on old data, checking and analyzing only new records in the IIS log.
After analyzing the log data, you can make AWStats create a report. Create a folder to contain the report (for this example, I used E:\Reports), and copy the bin\icon folder to it. Also create a www.companysite.com directory in this folder, to archive all report files for this particular configuration/Web site (you can create a different folder for each configuration/Web site you manage). Now, from the same command window you used previously, run this command:
awstats_buildstaticpages.pl -update -config=www.companysite.com -dir=E:\Reports\www.companysite.com -diricons=../icon
Make sure to change the path to the report path you created. The preceding command causes AWStats to create an HTML-formatted report in the specified folder. You can open the HTML file with a browser to see the analyzed data.
As you can see, after setting up the correct folders and configuration, analyzing log data and creating reports requires running only two scripts. In fact, you actually need to run only awstats_buildstaticpages.pl, because that runs awstats.pl in the background. Remember, you launched awstats.pl manually only to check the output and verify that the environment is OK.
So, scheduling report generation is simple; however, you still need to keep security in mind, a topic I’ll discuss a little later.
Some Tips for Running AWStats
AWStats is powerful and simple to use, but it does have some problems running on Windows. To save you time, I’ve created a list of tips that you can use to run AWStats with fewer problems.
- AWStats creates a cache of data it analyzed for each web site (one cache for each .conf file you create). Every time you run it, it checks to see if the cache contains previously analyzed data; if so, it uses those to avoid reanalyzing the entire log file. Instead, it starts reading the log immediately after the last line read during the previous execution. So, if you need to clear all the data and reanalyze your logs, you must delete the cache files. You can find them in the dirdata folder you created during your installation.
- AWStats needs to analyze data in strict sequence. So, for example, if you have already analyzed the October log file, you can’t later analyze the September log. If you need to work out-of-sequence, you must first delete the cache files from your dirdata folder, and then analyze the September log before the October one.
- AWStats skips log files with incorrect formatting. If that happens, stop IIS 6, rename (or delete) the current log file, make sure the log file options are correct, and then restart the Web server. IIS 6 will then create a new log file with the correct format, which you can analyze with AWStats.
- AWStats can do DNS lookups of IP addresses that it finds in log files. This can be a nice feature, because it gives you more information about where the requests are coming from, but it also requires a lot of time because AWStats must query the DNS server for each IP. So, despite the advantages, it’s usually better not to enable this feature.
- By default, AWStats focuses on “monthly” reports, analyzing and creating reports with a month-centric view (in fact, it focuses the report on the current month by default). If you want to have different reports, you can specify a specific month or date range.
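The strict-sequence tip is easy to get wrong when you have a backlog of monthly logs. As a minimal sketch, the following Python helper sorts exYYMM.log file names chronologically and tells you whether any of them predates the last month you already analyzed, in which case the cache files would have to be deleted first. The function names are hypothetical, and the code assumes that all logs date from the year 2000 or later.

```python
import re

LOG_NAME = re.compile(r"^ex(\d{2})(\d{2})\.log$", re.IGNORECASE)


def log_month(name):
    """Return (year, month) for an exYYMM.log name, or None if it does not match."""
    match = LOG_NAME.match(name)
    if not match:
        return None
    # Assumption: two-digit years are 2000 or later.
    return 2000 + int(match.group(1)), int(match.group(2))


def plan_analysis(log_names, last_analyzed=None):
    """Return (logs in chronological order, whether the cache must be cleared)."""
    dated = sorted((log_month(n), n) for n in log_names if log_month(n))
    must_clear = last_analyzed is not None and any(
        month < last_analyzed for month, _ in dated
    )
    return [name for _, name in dated], must_clear


# Usage: October 2005 was already analyzed, and a September log turns up.
print(plan_analysis(["ex0510.log", "ex0509.log"], last_analyzed=(2005, 10)))
# (['ex0509.log', 'ex0510.log'], True)
```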
Even though you’re using AWStats in offline mode, you probably want to create updated reports automatically. The simplest way to do this is to use the Windows scheduler. First, create a new standard user account on the Web server (or in Active Directory) with no extended rights (you can create the account in the Users group). It’s a good idea to assign a strong password to this account. You’ll use the account to create a scheduled task and nothing else.
Next, create a batch file that launches the data analysis for each Web site for which you want to create reports. Here’s a typical batch file that analyzes three separate Web sites on the server:
start /low /wait awstats_buildstaticpages.pl -update -config=www.companysite.com -dir=E:\Logs\Reports\www.companysite.com -diricons=../icon
start /low /wait awstats_buildstaticpages.pl -update -config=www.companysite2.com -dir=E:\Logs\Reports\www.companysite2.com -diricons=../icon
start /low /wait awstats_buildstaticpages.pl -update -config=www.huge.it -dir=E:\Logs\Reports\www.huge.it -diricons=../icon
Save the file with a .bat extension. Note that the batch file uses start.exe rather than running the Perl scripts directly, because Perl is an interpreted language, and you cannot define a task priority or a maximum CPU usage value when you run a Perl program. Running the commands with start.exe, and passing the /low parameter runs Perl in low-priority mode, letting the Windows process scheduler assign more CPU time to standard programs, and running the log analyzer with less impact on the overall system. The /wait option causes start.exe to wait until program execution completes before running the next command. If you omit the /wait option the batch file will launch all the defined AWStats processes (three in this case) at one time, which will consume too many server resources.
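If you host many sites, typing one nearly identical line per site invites mistakes, so you may prefer to generate the batch file. The following Python sketch builds the lines from a list of configuration names. It is just one way to do it: the function names and the report root folder are placeholders of mine, and the generated lines follow the same pattern as the batch file above.

```python
def batch_line(config_name, report_root):
    """Build one low-priority, blocking report command for a configuration."""
    return (
        "start /low /wait awstats_buildstaticpages.pl -update "
        f"-config={config_name} -dir={report_root}\\{config_name} "
        "-diricons=../icon"
    )


def render_batch(config_names, report_root):
    """Return the text of a batch file with one command per configuration."""
    return "\n".join(batch_line(name, report_root) for name in config_names) + "\n"


# Usage: three sites, reports stored under a hypothetical root folder.
sites = ["www.companysite.com", "www.companysite2.com", "www.huge.it"]
print(render_batch(sites, r"E:\Logs\Reports"))
```

Because every line keeps the /wait option, the generated file still processes the sites one after another rather than all at once.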
Using a similar batch file and the user you defined, you can create an NT scheduler task to update your log reports at off-peak times and at convenient intervals.
Security and NTFS settings
When you schedule (or run) a program, it’s best to restrict its permissions as much as possible. To create reports, you can assign NTFS permissions to the user account you created to run the scheduled task. You’ll have to assign these permissions:
- Execute, on c:\program files\perl
- List folder contents only (or the less restrictive read option) on the root folder where you put the AWStats files (the root of the disk containing the bin folder)
- Execute, on the bin folder
- Modify, on the bin\dirdata folder
- Read, on the folder containing the IIS logs
- Modify, on the folder where you want AWStats to create the report files
Setting only these permissions restricts file and folder access for the user account running AWStats, which is a common “best practice” for every server and application.
AWStats has many additional options and features beyond the ones mentioned here. You can find the complete documentation online, so you can easily find and test additional features yourself. The documentation is quite Linux-centric, but after you get AWStats working on Windows as described in this article, you’ll find that you can refer to the documentation with few problems.