I started by just parsing all the fields and dumping them to the screen. Then I added more record keeping as I thought of interesting things I wanted to know. For example, I divided hits into four categories: myself, attackers, web crawlers, and actual visitors. With so much information, I decided the best thing to do would be to produce a web page with the results. Here is an example of what it does right now (2002-12-14).
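A rough sketch of the parse-and-categorize step, in Python for illustration. It assumes the log is in Apache "combined" format, which the page does not actually state. The address in MY_IPS, the hint lists, and the function names are hypothetical stand-ins, not what analyzeLog really uses.

```python
import re

# Assumption: Apache "combined" log format. The real log format is not stated.
LOG_LINE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

# Hypothetical values, chosen only for illustration.
MY_IPS = {"192.0.2.10"}
CRAWLER_HINTS = ("bot", "crawler", "spider", "slurp")
ATTACK_HINTS = ("cmd.exe", "root.exe", "default.ida", "..")


def parse_line(line):
    """Return a dict of named fields, or None if the line does not match."""
    m = LOG_LINE.match(line)
    return m.groupdict() if m else None


def classify(hit):
    """Put one parsed hit into one of the four categories."""
    if hit["ip"] in MY_IPS:
        return "myself"
    request = hit["request"].lower()
    if any(hint in request for hint in ATTACK_HINTS):
        return "attacker"
    agent = hit["agent"].lower()
    if any(hint in agent for hint in CRAWLER_HINTS):
        return "crawler"
    return "visitor"
```

The order of the checks matters: a crawler that requests an attack-looking URL is counted as an attacker here, which is a judgment call.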
There are lots of things I want to add. The analysis of a single day is moderately interesting. I'm more interested in trends over time. So, I'll have to extract summary info from each log and then combine all of those into a separate page that shows trends over time.
I was surprised how hard it turns out to be to identify a visitor, or even a web crawler. Many hits can represent a single visit, and I'm more interested in visits, so I want to massage the data until I can give a better estimate of them. I can't just track IP addresses. People who use AOL seem to go through a web cache, so related hits come from different IPs. I want to try to tie these together into one visit.
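One possible heuristic for tying hits together, sketched in Python. It is a guess at an approach, not something analyzeLog does today: group hits by user agent plus the first two octets of the IP address, and start a new visit after 30 minutes of silence. Both the prefix length and the timeout are assumptions, and the function names are made up for the example.

```python
from datetime import datetime, timedelta

# Assumed inactivity gap after which a new visit begins.
VISIT_TIMEOUT = timedelta(minutes=30)


def visit_key(ip, agent):
    """Group by the first two IP octets plus user agent (a rough proxy heuristic)."""
    prefix = ".".join(ip.split(".")[:2])
    return (prefix, agent)


def count_visits(hits):
    """Count visits in an iterable of (timestamp, ip, agent) sorted by timestamp."""
    last_seen = {}
    visits = 0
    for ts, ip, agent in hits:
        key = visit_key(ip, agent)
        prev = last_seen.get(key)
        # A key never seen before, or silent for too long, starts a new visit.
        if prev is None or ts - prev > VISIT_TIMEOUT:
            visits += 1
        last_seen[key] = ts
    return visits
```

This will merge two different people behind the same proxy range who happen to use the same browser, so it gives an estimate rather than an exact count.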
The referer field often contains a search string. It would be interesting to show that information in the report. Sometimes it's pretty funny!
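Pulling the search string out of a referer could look something like this Python sketch. The parameter names in SEARCH_PARAMS are an assumption about what search engines put in their URLs, and the function name is hypothetical.

```python
from urllib.parse import urlparse, parse_qs

# Assumed query parameter names used by search engines.
SEARCH_PARAMS = ("q", "query", "p")


def search_string(referer):
    """Return the decoded search phrase from a referer URL, or None."""
    if not referer or referer == "-":
        return None
    params = parse_qs(urlparse(referer).query)
    for name in SEARCH_PARAMS:
        if name in params:
            return params[name][0]
    return None
```

parse_qs handles the URL decoding, so "late+night+hacking" comes back as "late night hacking".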
I also want to do some automated analysis to detect attacks. For example, I'd like to get e-mailed when someone starts accessing weird URLs.
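A minimal sketch of how the e-mail alert might work, in Python. What counts as a "weird" URL is not defined yet, so the SUSPICIOUS patterns and ALERT_THRESHOLD below are placeholders, and both function names are invented for the example.

```python
from collections import Counter
from email.message import EmailMessage

# Placeholder patterns and threshold; the real criteria are still undecided.
SUSPICIOUS = ("cmd.exe", "root.exe", "default.ida", "/..")
ALERT_THRESHOLD = 3


def suspicious_ips(requests):
    """Map each IP to its suspicious request count, for IPs at or over the threshold.

    requests is an iterable of (ip, url) pairs.
    """
    counts = Counter(
        ip for ip, url in requests
        if any(pattern in url.lower() for pattern in SUSPICIOUS)
    )
    return {ip: n for ip, n in counts.items() if n >= ALERT_THRESHOLD}


def build_alert(offenders, to_addr):
    """Build (but do not send) an alert message listing the offending IPs."""
    msg = EmailMessage()
    msg["Subject"] = "analyzeLog: suspicious requests"
    msg["To"] = to_addr
    msg.set_content(
        "\n".join(f"{ip}: {n} suspicious requests"
                  for ip, n in sorted(offenders.items()))
    )
    return msg
```

Actually sending the message is left out. Something like smtplib.SMTP("localhost").send_message(msg) would do it, assuming a local mail server is available.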
Lots to do. :)
2003-09-14 : added some more details to the output, notably the list of search strings that led people to my site.
analyzeLog [options]
  -?               this help
  -log <fileName>  log file to analyze
  -out <fileName>  optional file to write results to
You can get the source via anonymous CVS at
cvs -d :pserver:email@example.com:/code-cvsroot co 2002/analyzeLog
Louis K. Thomas <loui sth@hotm ail.co m>, 2003-09-14