Thursday, March 27, 2008

Webalizer DNS problems (fails to resolve any domains)

I had the (dis)pleasure of reinstalling Webalizer on a server the other night to run some stats. I ended up wasting a bunch of my time on a silly problem, and thought I should post my solution since I googled in vain for an answer.

Basically, the Coles Notes version: I had run grep for a particular domain name across all of the (several gigs of) apache2 log files sitting in the log folder, hoping to create a log for just that domain. Unfortunately, I forgot that when grep searches multiple files (using filename wildcards on the command line), it prefixes every matching line with the name of the file the match was found in. I didn't notice this happening, since I had redirected the output to a new file. I later tried running Webalizer on this file, only to find that DNS lookups weren't working properly and it wasn't resolving any of the visitors' IPs.
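For anyone who wants to see the filename prefix for themselves, here is a minimal sketch. The log contents, the example.com vhost and the scratch directory are all made up for illustration; the only real point is grep's -h option, which suppresses the filename prefix when searching several files.

```shell
# Build two tiny sample logs in a scratch directory (hypothetical names and contents).
demo_dir=$(mktemp -d)
printf '%s\n' 'www.example.com 192.0.2.10 - - [31/Jan/2006:10:00:00 -0500] "GET / HTTP/1.1" 200 512 "-" "-"' > "$demo_dir/access_log-20060131"
printf '%s\n' 'www.example.com 192.0.2.11 - - [01/Feb/2006:10:00:00 -0500] "GET /a HTTP/1.1" 200 128 "-" "-"' > "$demo_dir/access_log-20060201"

(
  cd "$demo_dir" || exit 1

  # With more than one input file, grep prefixes each match with "filename:".
  grep 'www.example.com' access_log-* > with_prefix.log

  # -h suppresses that prefix, so the matching lines come out unchanged.
  grep -h 'www.example.com' access_log-* > without_prefix.log
)

# Compare the first line of each result.
head -n 1 "$demo_dir/with_prefix.log"
head -n 1 "$demo_dir/without_prefix.log"
```

Had I used -h in the first place, the rest of this post would not have been necessary.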

After googling, it became obvious that DNS issues are a common problem with Webalizer. Unfortunately for me, my erroneous log file was the actual cause, and I wasted *a lot* of time on the usual suggestions first. I tried rebuilding Webalizer from source using the suggested --enable-dns option, and I tried a source RPM from a newer distribution (the system is openSUSE 10.0 based, and I tried the 10.2 or 10.3 source package), but it had different requirements that didn't seem solvable on openSUSE 10.0.

Eventually I looked more closely at the log file I was trying to process (I had looked at it several times before without noticing my error). I also determined that Webalizer doesn't like the vhost at the beginning of each line (don't quote me on this.. I may be dreaming.. it was late...), so I went on to find a way to remove that as well.

Tools/methods to fix the problems:

To remove the filename prefix "access_log-20060131:" (and its date variations) from in front of the vhost name on each line, I used sed. Note that the pattern matches a fixed number of characters, so all the original log filenames need to be the same length for it to catch every line; otherwise the pattern would need to be modified a bit:

sed -e 's/access_log-2006.....//g' testlog

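This removes the filename and the colon (:) from in front of the vhost name on each line; the five dots match the remaining four date digits plus the colon. Keep in mind that sed prints the result to standard output, so redirect it to a new file if you want to keep it.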
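If the filenames are not all the same length, a slightly stricter pattern might work better. This is only a sketch on a made-up sample line (the vhost and IP are placeholders): it anchors the match to the start of the line and accepts any run of digits before the colon, so it should not touch anything else on the line.

```shell
# Hypothetical sample line, as produced by grepping several log files at once.
line='access_log-20060131:www.example.com 192.0.2.10 - - [31/Jan/2006:10:00:00 -0500] "GET / HTTP/1.1" 200 512'

# ^ anchors the match to the start of the line;
# [0-9]* accepts a date suffix of any length, followed by the colon.
cleaned=$(printf '%s\n' "$line" | sed -e 's/^access_log-[0-9]*://')

printf '%s\n' "$cleaned"
```

Because the match is anchored, the colons inside the timestamp are left alone.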

To remove the vhost names for Webalizer, I used split-logfile2. Once again, I'm not positive you need to do this, but I also wanted to make sure I only had log entries that originated from the particular vhost. My grep may have caught some entries that were referred by the particular vhost to other vhost domains, and this step eliminates that problem:

split-logfile2 < testlog

split-logfile2 splits a vhost_combined format apache2 logfile into separate files based on the vhost names contained in the logfile, and also removes the vhost name from the start of each line. Since each output file is named after its vhost, you can still tell which domain each log belongs to.
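To illustrate the idea (this is not the actual split-logfile2 script, just a rough awk sketch of the same behaviour), the following takes a made-up two-line vhost_combined log and writes one file per vhost with the vhost column removed. The ".log" output naming, the sample vhosts and the scratch directory are assumptions for the example.

```shell
# Hypothetical vhost_combined sample with two different vhosts.
split_dir=$(mktemp -d)
printf '%s\n' \
  'www.example.com 192.0.2.10 - - [31/Jan/2006:10:00:00 -0500] "GET / HTTP/1.1" 200 512' \
  'blog.example.com 192.0.2.11 - - [31/Jan/2006:10:00:05 -0500] "GET /post HTTP/1.1" 200 2048' \
  > "$split_dir/testlog"

(
  cd "$split_dir" || exit 1
  awk '{
    vhost = tolower($1)       # first field is the vhost name
    sub(/^[^ ]+ /, "")        # drop the vhost and the space after it
    print > (vhost ".log")    # one output file per vhost
  }' testlog
)

ls "$split_dir"
```

A real log could contain odd or hostile values in the vhost field, so for anything beyond a quick test the real tool is the safer choice.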

I hadn't known about split-logfile2 before all of this, and now that I do, it will definitely come in handy down the road.

After all this, Webalizer was more than happy to finally parse the logs, and even reverse-resolved all the IPs, providing much better results in terms of visitor information, etc.

On a side note, in my several hours of pissing around, I ran across several programs / projects / packages that have either forked off of Webalizer or forked off of other common log analyzers (it seems there are many), and I will try to put a post up with what I found... if only for my own reference later. Webalizer results really do seem kind of basic after all is said and done, so finding another analyzer suite (I've used AWStats before as well) may be in the cards if I want to garner more information. Another alternative is Google Analytics, of course, but that involves having its code in the pages being served, so it only captures data while things are happening. It's always nice to be able to process log files after the fact using tools such as Webalizer and AWStats.

"Upgraded" from Kingston 4GB USB Flash Drive to Patriot Xporter 8GB USB Flash Drive

I traded "up" and got a Patriot Xporter 8GB (not the XT designated model) USB Flash Drive to replace my "little" 4GB Kingston USB Flash Drive. I had only purchased the 4GB a month or two ago for $20, but the 8GB Xporter was on sale for $30 last week, so that's why I picked it up. Plus, I have a friend who could use the 4GB drive.

While the space is an upgrade, read performance is less than half that of the Kingston (roughly 42%). The Kingston reads (most of the drive) at 33 MB/s, while the Patriot Xporter only manages a steady 14 MB/s across the whole drive. Write speed is also nearly half that of the Kingston, though I can't recall the exact figures. The program I used (HD Tach) only supports read tests (raw, not filesystem based, as far as I know), so I worked out the write speeds manually by timing how long it took to write a 150MB file to each drive.
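The manual write-speed calculation is just file size divided by elapsed time. As a small sketch (the helper name and the 10 second timing are hypothetical, since I didn't keep my real numbers):

```shell
# Throughput in MB/s from a file size in MB and an elapsed time in seconds.
# throughput_mb_s is a made-up helper name for this example.
throughput_mb_s() {
  awk -v mb="$1" -v secs="$2" 'BEGIN { printf "%.1f\n", mb / secs }'
}

# A 150MB file written in a hypothetical 10 seconds works out to 15.0 MB/s.
throughput_mb_s 150 10
```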

In this case, I'm willing to accept the speed downgrade for the doubled capacity, allowing me to store/transfer some rather large files when I need to, or simply not have to "clean it up" as often, in order to make sure I have room on it.

Install Native Ubuntu system from the comfort of Windows GUI/System using WUBI

http://wubi-installer.org/

A friend posted this link to me. It's basically a GUI Windows (and Linux) based application that allows a user to install Ubuntu directly on a Windows system, on the Windows filesystem (using large file(s)).

Later on, you can move the Ubuntu system onto its own partition if you want a speed upgrade, as well as greater security for your data in case of power outages. One of the notes about running Ubuntu from a file on the Windows filesystem is that it is more prone to errors during hard shutdowns / reboots than it would be if it were installed on its own partition.

Looks cool... I've used some other ones before (forget the names), and I imagine this one is an evolution of those. Seems this stuff is always getting better and easier.

Oh yeah, and what's kinda cool is that you can simply go to Add/Remove Programs in Windows if you want to remove the Linux installation later, in case you want to try a different version or need the space.

CentOS 4.4 LVM snapshots crash system when they become full (run out of space)

Just a note:

On a CentOS 4.4 server, LVM snapshots that eventually became full (and hence invalid) would crash the system. This problem seems to be fixed in CentOS 4.6 (I didn't try 4.5). I can now run snapshots that eventually run out of space, and successfully lvremove them without crashing the system.