PySquiLA - Python Squid Log Analyzer
Table of Contents:
- Thanks
- What is it
- Installation
- Configuration
- Usage
- Templates and Languages
THANKS
Thanks go to my lovely wife, who understands my programming cravings and
helped me with some of the English (this file hasn't been revised yet! :));
you're my strength. To my son, who slept in the cradle by my side during the
programming nights.
To kov, my Debian guru and Python online help :)
To the people on the Python-Brasil group (http://www.pythonbrasil.com.br) and
the #python IRC channel on freenode.net, for answering questions and sharing
code.
WHAT IS IT
PySquiLA is a Squid log parser and analyzer. It works in two ways:
- Console Mode: It runs, parses Squid's access.log and puts all the data in
a MySQL table. It does some checking, like skipping older entries and blank
lines.
-> Attention: The check for old entries is done by comparing the last entry
in the database with the time in the log file. That means that after you
add the data from a log, older logs will not be added. So it's good to run
pysquila once a day, before the log rotation, or in a cron job (every 2 or
4 hours) to avoid losing data.
- Web CGI Mode: Accessing the script as a CGI app lets you view some
statistics and graphs, like Top Sites by Downloads or Top Sites by Bytes.
I'm still looking for better names for these stats. I define "Top Sites by
Bytes" as the sum of the "normal" web traffic, such as HTML, JavaScript, text
and Flash animations. "Top Sites by Downloads" is almost the same sum, but
it deals only with binaries like applications, compressed files, downloads
and stuff like that. I tried to optimize the queries and the MySQL table
structure a little, but I think that only using sub-queries in MySQL 4.1
will gain some speed. Until then, the reports will be not-so-fast, but they
work.
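The old-entry check described under Console Mode can be sketched roughly as
below. This is only an illustration, not PySquiLA's actual code: the function
name 'new_entries' and the 'last_timestamp' argument are made up for the
example, and it assumes Squid's native access.log format, where the first
field of each line is a UNIX timestamp with milliseconds. In a real run,
'last_timestamp' would come from the newest entry already in the database.

```python
def new_entries(lines, last_timestamp):
    """Yield (timestamp, line) for log entries newer than last_timestamp.

    Hypothetical helper: blank lines and entries that are not newer than
    the last entry stored in the database are skipped.
    """
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        fields = line.split()
        try:
            timestamp = float(fields[0])
        except ValueError:
            continue  # first field is not a timestamp, ignore the line
        if timestamp <= last_timestamp:
            continue  # same age as, or older than, what is already stored
        yield timestamp, line


# Made-up sample lines in Squid's native log format.
sample = [
    "1066036250.129 110 192.168.0.5 TCP_MISS/200 1024 GET "
    "http://example.com/ - DIRECT/192.0.2.10 text/html",
    "",
    "1066036300.500 95 192.168.0.6 TCP_HIT/200 2048 GET "
    "http://example.org/a.zip - NONE/- application/zip",
]

# Pretend the first entry is the last one already in the database.
kept = list(new_entries(sample, last_timestamp=1066036250.129))
```

Only the second sample entry survives the filter, which is why a log that is
older than the last stored entry adds nothing when it is parsed late.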
The PySquiLA project is hosted at GNA!, where you can find support, bug
reports and patches (please, send patches! :)). The project page is at
http://home.gna.org/pysquila. Thanks to the GNA! people for the outstanding
service.
INSTALLATION
Just read the INSTALL file.
CONFIGURATION
PySquiLA uses a Python-format config file, which can reside in the same dir as
the application or in '/etc/pysquila'. I recommend using the etc dir, as it
makes updates easier without losing the configuration. After the first run,
pysquila's config.py will be byte-compiled by Python and a 'config.pyc' will
be created. Don't worry, it's completely normal. You can delete 'config.pyc'
after changing something in 'config.py', but you don't have to: Python always
checks whether the .py is newer than the .pyc and, if so, uses it and
byte-compiles it again.
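As a rough sketch of how a Python-format config file can be looked up in more
than one directory, consider the code below. It is not taken from PySquiLA:
the 'load_config' function, the search order and the 'templates_dir' option
name are all assumptions made for the example, and it uses the importlib
module from current Python versions.

```python
import importlib.util
import os
import tempfile


def load_config(search_dirs):
    """Return the first config.py found in search_dirs as a module.

    Hypothetical helper: the directories are tried in the order given.
    """
    for directory in search_dirs:
        path = os.path.join(directory, "config.py")
        if os.path.isfile(path):
            spec = importlib.util.spec_from_file_location(
                "pysquila_config", path)
            module = importlib.util.module_from_spec(spec)
            spec.loader.exec_module(module)  # runs the config file
            return module
    raise FileNotFoundError(
        "config.py not found in: " + ", ".join(search_dirs))


# Demo with a throwaway directory standing in for the real locations.
with tempfile.TemporaryDirectory() as demo_dir:
    with open(os.path.join(demo_dir, "config.py"), "w") as handle:
        # 'templates_dir' is a made-up option name for this example.
        handle.write("templates_dir = '/usr/share/pysquila/templates'\n")
    missing_dir = os.path.join(demo_dir, "does-not-exist")
    config = load_config([missing_dir, demo_dir])
```

With the real locations, the call could look like
load_config(['/etc/pysquila', application_dir]), so that a config kept in the
etc dir is found first and survives updates of the application dir.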
The config file has all options commented with the possible values and effects,
so use it as the reference :)
The source code can help too; I tried to comment it and make it readable for
everyone.
USAGE
Run pysquila.py once a day before the log rotation, or in cron jobs 2 or 4
times a day. There are no command line options yet; all configuration is done
in the config.py file. The application puts the data in the MySQL database
and, when it finishes, shows a summary of the actions. To access the CGI
interface, just point your web browser to the cgi-bin script, as in:
http://servername/cgi-bin/pysquila.py
TEMPLATES AND LANGUAGES
The 'templates' dir must be set in the config file. I suggest you put it
somewhere under the '/usr/share' directory. All templates can be translated,
as long as you pay attention to the names of the special placeholders (like
'%(cgi_base_path)s') and the name of the template's dir.
Don't change the operational part of the tags, like the names or values of the
input fields, unless, of course, you know what you're doing :)
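Placeholders like '%(cgi_base_path)s' follow Python's dictionary-based string
formatting, so a translated template keeps working as long as the placeholder
names stay the same. The sketch below shows the idea; the template text and
the 'site' field name are invented for the example and are not copied from
the templates shipped with PySquiLA.

```python
# A made-up template fragment, translated to Portuguese. Only the visible
# text changed; the placeholder and the input field name are untouched.
template = (
    '<form action="%(cgi_base_path)s" method="get">\n'
    '  Procurar site: <input type="text" name="site">\n'
    '</form>\n'
)

# Values the application would supply when rendering the page.
values = {"cgi_base_path": "/cgi-bin/pysquila.py"}

# Each %(name)s is replaced by the matching dictionary entry.
page = template % values
```

Renaming a placeholder in the template would raise a KeyError at this point,
and a literal percent sign in the text has to be written as '%%'.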
