Complete Source Code: Putting It All Together

This document discusses using regular expressions to analyze httpd log files and summarize the results. It shows how to use regexes to split log lines into fields, extract the URL, file type, hour of the request, and count requests by each of those fields. A report_section subroutine is defined to output the results in a consistent format with a header and sorted contents.


Putting it all together

Regular expressions have many practical uses. We'll look at an httpd log analyzer as an example. In our last article, one of the play-around items was to write a simple log analyzer. Now, let's make it a bit more interesting: a log analyzer that will break down your log results by file type and give you a list of total requests by hour. (Complete source code.) First, let's look at a sample line from an httpd log:
127.12.20.59 - - [01/Nov/2000:00:00:37 -0500] "GET /gfx2/page/home.gif HTTP/1.1" 200 2285

The first thing we want to do is split this into fields. Remember that the split() function takes a regular expression as its first argument. We'll use /\s/ to split the line at each whitespace character:
@fields = split(/\s/, $line);
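As a quick sketch of what this produces, here is the split applied to the sample line above (field indices are zero-based):

```perl
use strict;
use warnings;

my $line = '127.12.20.59 - - [01/Nov/2000:00:00:37 -0500] '
         . '"GET /gfx2/page/home.gif HTTP/1.1" 200 2285';
my @fields = split(/\s/, $line);

print scalar(@fields), "\n";   # 10
print $fields[3], "\n";        # [01/Nov/2000:00:00:37
print $fields[6], "\n";        # /gfx2/page/home.gif
print $fields[8], "\n";        # 200
print $fields[9], "\n";        # 2285
```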

This gives us 10 fields. The ones we're concerned with are the fourth field ($fields[3], the time and date of the request), the seventh ($fields[6], the URL), and the ninth and tenth ($fields[8] and $fields[9], the HTTP status code and the size in bytes of the server response). First, we'd like to make sure that we turn any request for a URL that ends in a slash (like /about/) into a request for the index page from that directory (/about/index.html). We'll need to escape out the slashes so that Perl doesn't mistake them for terminators in our s/// statement.
$fields[6] =~ s/\/$/\/index.html/;

This line is difficult to read, because anytime we come across a literal slash character we need to escape it out. This problem is so common, it has acquired a name: leaning-toothpick syndrome. Here's a useful trick for avoiding the leaning-toothpick syndrome: You can replace the slashes that mark regular expressions and s/// statements with any other matching pair of characters, like { and }. This allows us to write a more legible regex where we don't need to escape out the slashes:
$fields[6] =~ s{/$}{/index.html};

(If you want to use this syntax with a matching expression, you'll need to put an m in front of it: /foo/ would be rewritten as m{foo}.) Now, we'll assume that any URL request that returns a status code of 200 (request OK) is a request for the file type of the URL's extension (a request for /gfx/page/home.gif returns a GIF image). Any URL request without an extension returns a plain-text file. Remember that the period is a metacharacter, so we need to escape it out!
if ($fields[8] eq '200') {
    if ($fields[6] =~ /\.([a-z]+)$/i) {
        $type_requests{$1}++;
    } else {
        $type_requests{'txt'}++;
    }
}
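To see the file-type counting in action, here's a minimal self-contained sketch. The @requests status/URL pairs are made up for illustration; %type_requests plays the same role as above:

```perl
use strict;
use warnings;

my %type_requests;

# Hypothetical (status, URL) pairs for illustration
my @requests = (
    ['200', '/gfx2/page/home.gif'],
    ['200', '/about/index.html'],
    ['404', '/missing.png'],          # non-200: not counted
    ['200', '/cgi-bin/somescript'],   # no extension: counted as plain text
);

for my $r (@requests) {
    my ($status, $url) = @$r;
    if ($status eq '200') {
        if ($url =~ /\.([a-z]+)$/i) {
            $type_requests{$1}++;
        } else {
            $type_requests{'txt'}++;
        }
    }
}

print "$_: $type_requests{$_}\n" for sort keys %type_requests;
# gif: 1
# html: 1
# txt: 1
```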

Next, we want to retrieve the hour each request took place. The hour is the first string in $fields[3] that will be two digits surrounded by colons, so all we need to do is look for that. Remember that Perl will stop when it finds the first match in a string:
# Log the hour of this request
$fields[3] =~ /:(\d{2}):/;
$hour_requests{$1}++;
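Here's a quick sketch of that extraction on its own; $stamp is just the value $fields[3] holds for the sample line above:

```perl
use strict;
use warnings;

my %hour_requests;
my $stamp = '[01/Nov/2000:00:00:37';   # $fields[3] from the sample line

# Perl stops at the first match, so $1 is the hour, not the minutes
if ($stamp =~ /:(\d{2}):/) {
    $hour_requests{$1}++;
}

print join(', ', map { "$_ => $hour_requests{$_}" } sort keys %hour_requests), "\n";
# 00 => 1
```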

Finally, let's rewrite our original report() sub. We're doing the same thing over and over (printing a section header and the contents of that section), so we'll break that out into a new sub. We'll call the new sub report_section():
sub report {
    print "Total bytes requested: ", $bytes, "\n";
    print "\n";
    report_section("URL requests:", %url_requests);
    report_section("Status code results:", %status_requests);
    report_section("Requests by hour:", %hour_requests);
    report_section("Requests by file type:", %type_requests);
}

The new report_section() sub is very simple:


sub report_section {
    my ($header, %type) = @_;
    print $header, "\n";
    for $i (sort keys %type) {
        print $i, ": ", $type{$i}, "\n";
    }
    print "\n";
}

We use the keys function to return a list of the keys in the %type hash, and the sort function to put it in alphabetic order. We'll play with sort a bit more in the next article.
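Putting all of these pieces together, the whole analyzer might look something like the sketch below. The article doesn't show how %url_requests, %status_requests, and $bytes are accumulated, so those lines (and reading from an inlined @log array rather than a log file) are assumptions for illustration:

```perl
#!/usr/bin/perl
use strict;
use warnings;

my (%url_requests, %status_requests, %hour_requests, %type_requests);
my $bytes = 0;

# In real use these lines would come from an httpd log file; they're
# inlined here so the sketch is self-contained.
my @log = (
    '127.12.20.59 - - [01/Nov/2000:00:00:37 -0500] "GET /gfx2/page/home.gif HTTP/1.1" 200 2285',
    '127.12.20.59 - - [01/Nov/2000:13:01:12 -0500] "GET /about/ HTTP/1.1" 200 1342',
);

for my $line (@log) {
    my @fields = split(/\s/, $line);

    # Turn a bare directory request into a request for its index page
    $fields[6] =~ s{/$}{/index.html};

    $url_requests{$fields[6]}++;
    $status_requests{$fields[8]}++;
    $bytes += $fields[9];

    # Count successful requests by file extension; no extension means text
    if ($fields[8] eq '200') {
        if ($fields[6] =~ /\.([a-z]+)$/i) {
            $type_requests{$1}++;
        } else {
            $type_requests{'txt'}++;
        }
    }

    # Log the hour of this request
    $hour_requests{$1}++ if $fields[3] =~ /:(\d{2}):/;
}

report();

sub report {
    print "Total bytes requested: ", $bytes, "\n";
    print "\n";
    report_section("URL requests:",          %url_requests);
    report_section("Status code results:",   %status_requests);
    report_section("Requests by hour:",      %hour_requests);
    report_section("Requests by file type:", %type_requests);
}

sub report_section {
    my ($header, %type) = @_;
    print $header, "\n";
    for my $i (sort keys %type) {
        print $i, ": ", $type{$i}, "\n";
    }
    print "\n";
}
```

(One small departure from the article's version: under `use strict`, the loop variable in report_section() must be declared with my.)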
