Python Programming Language
Read and Write Data:
You can read data from files for processing and
analysis, as well as write data to files for storage or
sharing.
Modify Existing Data:
Files allow you to modify existing data by
appending, updating, or deleting specific parts.
Data Transformation:
You can transform data from one format to another,
such as converting CSV to JSON or vice versa.
Integration with Other Systems:
Data Exchange:
You can easily share data with other systems by
writing it to a file in a standard format.
Flexibility and Versatility:
Support for Different File Formats:
Python supports a wide range of file formats,
including text files, CSV, JSON, binary files, and
more.
Cross-Platform Compatibility:
File handling code written in Python can work
seamlessly on different operating systems, making
it portable and versatile.
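To make the points above concrete, here is a small sketch that combines basic reading, writing, and appending with a CSV-to-JSON conversion. The file names and sample data are made up purely for illustration.

```python
import csv
import json

# Write, append, then read back a plain text file
with open("notes.txt", "w") as f:
    f.write("first line\n")
with open("notes.txt", "a") as f:
    f.write("second line\n")
with open("notes.txt") as f:
    print(f.read())

# Convert a small CSV file to JSON
with open("users.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["name", "age"])
    writer.writerow(["Asha", "21"])

with open("users.csv", newline="") as f:
    rows = list(csv.DictReader(f))   # each row becomes a dict

with open("users.json", "w") as f:
    json.dump(rows, f, indent=2)

print(rows)  # [{'name': 'Asha', 'age': '21'}]
```

The same pattern works for the other direction (JSON to CSV) by reading with `json.load` and writing rows with `csv.DictWriter`.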
LOG FILE ANALYZER
Q. What is a log file?
A log file is like a diary or record book for a computer
program or system. It keeps a detailed list of what the
program does, including actions, events, and any
problems that occur.
Let’s understand the use of a log file with an example:
Suppose we run a fintech platform. The log file keeps a
record of all traffic (organic, inorganic, or paid), whether
it comes from human users or from bots.
1. Detecting different types of bots visiting our website:
Say our site is crashing frequently, but Google Search
Console shows only 20K users on the website; that alone
gives no clear picture of the problem. Using log files,
we can find out how many bots, and of what type, are
visiting the site. We may then discover that in addition
to the 20K human users we also have 1K bots hitting the
site, which reveals the actual source of the trouble.
2. Troubleshooting: When something goes wrong, log
files provide detailed information about errors,
crashes, or unexpected behavior. This helps
developers and IT professionals diagnose and fix
problems.
3. Security: Logs can reveal unauthorized access
attempts, suspicious activities, or security breaches.
By analyzing these records, you can enhance the
security of your systems and detect potential threats
early. For example: an SQL injection attack.
4. Auditing: They provide a historical record of actions
and changes made within a system. This is useful for
compliance with regulations or for reviewing what
happened during a particular period.
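As a concrete illustration of the security use case above, here is a naive sketch that scans log lines for common SQL injection signatures. The signature list and sample log lines are assumptions for demonstration; a real scanner would be far more thorough.

```python
# Illustrative signatures often seen in SQL injection attempts
suspicious_patterns = ["' OR 1=1", "UNION SELECT", "DROP TABLE"]

# Hypothetical web server log lines
logs = [
    "2024-01-05 10:12:01 - INFO: GET /login user=alice",
    "2024-01-05 10:12:09 - WARNING: GET /login user=admin' OR 1=1 --",
    "2024-01-05 10:13:44 - INFO: GET /home user=bob",
]

# Case-insensitive match of any signature against each line
flagged = [line for line in logs
           if any(p.lower() in line.lower() for p in suspicious_patterns)]

for line in flagged:
    print("Possible SQL injection attempt:", line)
```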
[Figure: screenshot of a sample log file]
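In case the screenshot does not reproduce well, here is a minimal sketch that generates a small sample log file in the format assumed throughout this project ("YYYY-MM-DD HH:MM:SS - LEVEL: message"). The file name and entries are illustrative.

```python
# Sample entries in the assumed log format
sample_lines = [
    "2024-01-05 09:15:23 - INFO: User logged in",
    "2024-01-05 13:40:02 - WARNING: Disk space at 85%",
    "2024-01-05 22:05:47 - ERROR: Database connection failed",
]

# Write them out so the analyzer below has something to read
with open("logfile.txt", "w") as f:
    f.write("\n".join(sample_lines) + "\n")
```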
Q. What is a log file analyzer?
A log file analyzer helps you make sense of log files,
which are just like diary entries.
A log file analyzer reads through these diary entries to:
1. Find Patterns: It looks for unusual or repetitive
entries to spot issues or errors.
2. Troubleshoot Problems: If something goes wrong,
the analyzer helps you figure out what happened
and why.
3. Monitor Performance: It tracks how well the system
or application is performing, helping you spot
slowdowns or bottlenecks.
4. Improve Security: By analyzing logs, you can detect
unauthorized access or other security threats.
Q. HOW TO MAKE A LOG FILE ANALYZER USING
PYTHON?
STEP 1:
Read the Log File
First, you need to read the log file.
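A minimal sketch for this step: load every line of the log file into a list. The file name is an assumption; substitute your own log file.

```python
# Read all lines from the log file; return an empty list if it is missing
def read_log_file(filename):
    try:
        with open(filename, "r") as file:
            return file.readlines()
    except FileNotFoundError:
        print(f"File '{filename}' not found.")
        return []

logs = read_log_file("logfile.txt")
print(f"Loaded {len(logs)} log lines.")
```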
STEP 2:
Parse the Log Data
Parsing in Python means converting raw text into
structured data. For example, parsing a date means
converting a string that represents a date into a
datetime object, which lets you easily manipulate and
work with the date. Likewise, parsing a log file means
reading each line and extracting meaningful information
from it. Log files usually have a consistent format, so
you parse each line to pull out the relevant fields.
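As a sketch, assuming each line follows the "YYYY-MM-DD HH:MM:SS - LEVEL: message" format used in this project, one line can be split into a timestamp, a severity level, and a message:

```python
from datetime import datetime

# Split one log line into (timestamp, level, message)
def parse_log_line(line):
    timestamp_part, rest = line.strip().split(" - ", 1)
    level, message = rest.split(": ", 1)
    timestamp = datetime.strptime(timestamp_part, "%Y-%m-%d %H:%M:%S")
    return timestamp, level, message

ts, level, msg = parse_log_line(
    "2024-01-05 22:05:47 - ERROR: Database connection failed")
print(ts.hour, level, msg)  # 22 ERROR Database connection failed
```

Turning the timestamp into a `datetime` object is what makes later steps, such as filtering by date or grouping by hour, straightforward.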
STEP 3:
Analyze the Data
You can now perform various analyses, such as counting
how many requests were successful and how many
resulted in errors.
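A minimal sketch of such counting, using a few made-up log lines in the project's assumed format:

```python
# Hypothetical parsed log lines
logs = [
    "2024-01-05 09:15:23 - INFO: Request served",
    "2024-01-05 10:02:11 - ERROR: Timeout",
    "2024-01-05 11:30:45 - INFO: Request served",
]

# Count successes (INFO) versus errors (ERROR)
error_count = sum(1 for line in logs if "ERROR" in line)
success_count = sum(1 for line in logs if "INFO" in line)
print(f"Successful: {success_count}, Errors: {error_count}")  # Successful: 2, Errors: 1
```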
STEP 4:
Handle Large Log Files
For very large log files, consider using more advanced
techniques, such as processing the file line by line or
using libraries like pandas for efficient data manipulation.
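One simple technique suggested above is streaming the file line by line instead of loading it all at once. A minimal sketch (the severity names follow the format used elsewhere in this project):

```python
from collections import Counter

# Iterate over the file object so only one line is in memory at a time,
# instead of calling readlines() on a multi-gigabyte file
def count_levels(filename):
    counts = Counter()
    with open(filename) as f:
        for line in f:
            for level in ("INFO", "WARNING", "ERROR"):
                if level in line:
                    counts[level] += 1
    return counts
```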
Summary
1. Read the file: Use Python’s file handling methods.
2. Parse the data: Extract useful information from each
log entry.
3. Analyze the data: Count occurrences, search for
patterns, or visualize.
4. Handle large files: Use efficient data processing
techniques if needed.
Key Functions
read_log_file: Reads logs from the specified file.
analyze_logs: Calculates total logs, errors, and
warnings.
filter_logs_by_date: Filters logs based on a user-
provided date.
search_for_keyword: Searches logs for a specified
keyword.
count_by_severity: Counts INFO, WARNING, and
ERROR logs.
count_by_time_period: Categorizes logs into time
periods.
percentage_distribution: Calculates percentage
representation of log types.
top_error_sources: Retrieves the most frequent
error messages.
export_summary: Saves the summary of the analysis
into a text file.
SOURCE CODE:
from collections import Counter

# Function to read logs from the specified file
def read_log_file(filename):
    try:
        with open(filename, 'r') as file:
            return file.readlines()
    except FileNotFoundError:
        print(f"File '{filename}' not found.")
        return []

# Function to calculate total logs, errors, warnings, and INFO entries
def analyze_logs(logs):
    total_logs = len(logs)
    error_count = sum(1 for log in logs if "ERROR" in log)
    warning_count = sum(1 for log in logs if "WARNING" in log)
    info_count = sum(1 for log in logs if "INFO" in log)
    return total_logs, error_count, warning_count, info_count

# Function to filter logs by a user-provided date (YYYY-MM-DD)
def filter_logs_by_date(logs, date):
    filtered_logs = []
    for log in logs:
        # Assuming each log line starts with the date
        if log.startswith(date):
            filtered_logs.append(log)
        else:
            continue
    return filtered_logs

# Function to search for a keyword in logs
def search_for_keyword(logs, keyword):
    return [log for log in logs if keyword in log]

# Function to count logs by time period
def count_by_time_period(logs):
    morning = afternoon = evening = night = 0
    for log in logs:
        # Assuming logs are in the format: "YYYY-MM-DD HH:MM:SS - Log message"
        time = log.split()[1]
        hour = int(time.split(':')[0])
        if 5 <= hour < 12:
            morning += 1
        elif 12 <= hour < 17:
            afternoon += 1
        elif 17 <= hour < 21:
            evening += 1
        else:
            night += 1
    return morning, afternoon, evening, night

# Function to calculate percentage distribution of log types
def percentage_distribution(info_count, warning_count, error_count, total_logs):
    info_percent = (info_count / total_logs) * 100
    warning_percent = (warning_count / total_logs) * 100
    error_percent = (error_count / total_logs) * 100
    return info_percent, warning_percent, error_percent

# Function to get top error sources
def top_error_sources(logs):
    error_messages = []
    for log in logs:
        if "ERROR" in log:
            try:
                # Ensure the log line contains the separator before splitting
                error_message = log.split(" - ")[1]
                error_messages.append(error_message)
            except IndexError:
                # If the log doesn't have the expected format, skip it
                continue
    return Counter(error_messages).most_common(3)

# Function to export summary to a file
def export_summary(total_logs, info_count, error_count, warning_count, top_errors):
    with open("log_analysis_summary.txt", 'w') as file:
        file.write("Log Analysis Summary\n")
        file.write("=====================\n")
        file.write(f"Total Logs: {total_logs}\n")
        file.write(f"INFO Logs: {info_count}\n")
        file.write(f"WARNING Logs: {warning_count}\n")
        file.write(f"ERROR Logs: {error_count}\n")
        file.write("Top Error Sources:\n")
        for message, count in top_errors:
            file.write(f"  {message.strip()}: {count}\n")
    print("Summary exported to 'log_analysis_summary.txt'.")

# Step 1: Read the log file
logs = read_log_file("logfile.txt")

if logs:
    # Step 2: Analyze total logs, errors, and warnings
    total_logs, error_count, warning_count, info_count = analyze_logs(logs)
    print(f"Total Logs: {total_logs}, Errors: {error_count}, "
          f"Warnings: {warning_count}, INFO: {info_count}")

    # Step 3: Search the logs for a keyword
    keyword = input("Enter a keyword to search for: ")
    keyword_logs = search_for_keyword(logs, keyword)
    print(f"\nLogs containing '{keyword}':")
    for log in keyword_logs:
        print(log.strip())

    # Percentage distribution of log types
    info_percent, warning_percent, error_percent = percentage_distribution(
        info_count, warning_count, error_count, total_logs)
    print(f"INFO: {info_percent:.2f}%, "
          f"WARNING: {warning_percent:.2f}%, ERROR: {error_percent:.2f}%")

    # Step 4: Export a summary, including the top error sources
    top_errors = top_error_sources(logs)
    export_summary(total_logs, info_count, error_count, warning_count, top_errors)
else:
    print("Let's continue then...")
OUTPUT:
[Figure: screenshot of the program's console output]
SOURCES:
https://www.w3schools.com/python/python_intro.asp
https://phoenixnap.com/kb/file-handling-in-python#:~:text=File%20handling%20is%20an%20integral,%2C%20writing%2C%20and%20appending%20information.
https://www.youtube.com/watch?v=ndoLTHtZX8Q&t=1091s
https://www.dremio.com/wiki/log-file-analysis/#:~:text=Log%20File%20Analysis%20offers%20a,case%20of%20a%20security%20breach.