Python Programming Language

The document provides an overview of Python programming, focusing on file handling and log file analysis. It explains the importance of log files, their advantages over text files, and how to create a log file analyzer using Python. Key features, source code, and functions for analyzing log files are also included.

INDEX

1. PYTHON PROGRAMMING LANGUAGE
2. FILE HANDLING
3. LOG FILE ANALYSER
4. WHY USE LOG FILES INSTEAD OF TEXT FILES?
5. WHAT IS A LOG FILE ANALYSER?
6. HOW TO MAKE A LOG FILE ANALYSER USING PYTHON
7. KEY FEATURES
8. SOURCE CODE
9. OUTPUT SCREENSHOTS
10. REFERENCES
PYTHON PROGRAMMING LANGUAGE
Python is an interpreted, object-oriented, high-level
programming language with dynamic semantics,
developed by Guido van Rossum and first released in
1991. Designed to be easy to use as well as fun, the
name "Python" is a nod to the British comedy group
Monty Python.
FOLLOWING ARE ITS SALIENT FEATURES:
1. Free and Open Source
2. Easy to Learn
3. Easy to Code
4. Cross-platform Language
5. Database Support
6. Interpreted Language
7. Object-oriented Language
8. Extensible
9. GUI Programming Support
10. High-level Language
11. Large Standard Library
12. Frontend and Backend Development
13. Dynamically-typed Language
DATA/FILE HANDLING
File handling in Python allows you to interact with
files on your computer, enabling you to read data
from them, write data to them, and perform other
operations.
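As a minimal sketch of these operations (the file name "example.txt" is just an assumption for illustration):

```python
# Write a few lines to a text file, then read them back.
with open("example.txt", "w") as f:
    f.write("first line\n")
    f.write("second line\n")

with open("example.txt") as f:
    lines = f.readlines()

print(lines)  # ['first line\n', 'second line\n']
```

The with statement closes the file automatically, even if an error occurs while reading or writing.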
WHY USE FILE HANDLING?
File handling in Python offers several advantages:
Data Persistence:
 Store and Retrieve Data:
Unlike data stored in memory, which is lost when
the program terminates, files allow you to store
data persistently on disk. This is crucial for
applications that need to retain information across
sessions.
 Work with Large Datasets:
Files enable you to work with datasets that exceed
the available RAM. You can process data in chunks,
making it possible to handle massive amounts of
information.
Data Manipulation:
 Read and Write Data:
You can read data from files for processing and
analysis, as well as write data to files for storage or
sharing.
 Modify Existing Data:
Files allow you to modify existing data by
appending, updating, or deleting specific parts.
 Data Transformation:
You can transform data from one format to another,
such as converting CSV to JSON or vice versa.
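A hypothetical example of such a transformation, converting a small CSV file to JSON with the standard library (the file names and data are assumptions):

```python
import csv
import json

# Create a small sample CSV file.
with open("data.csv", "w", newline="") as f:
    f.write("name,score\nAsha,91\nRavi,85\n")

# Read each CSV row as a dictionary.
with open("data.csv", newline="") as f:
    rows = list(csv.DictReader(f))

# Write the rows out as JSON.
with open("data.json", "w") as f:
    json.dump(rows, f, indent=2)

print(rows[0])  # {'name': 'Asha', 'score': '91'}
```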
Integration with Other Systems:
 Data Exchange:
You can easily share data with other systems by
writing it to a file in a standard format.
Flexibility and Versatility:
 Support for Different File Formats:
Python supports a wide range of file formats,
including text files, CSV, JSON, binary files, and
more.
 Cross-Platform Compatibility:
File handling code written in Python can work
seamlessly on different operating systems, making
it portable and versatile.
LOG FILE ANALYSER
Q. What is a log file?
A log file is like a diary or record book for a computer
program or system. It keeps a detailed list of what the
program does, including actions, events, and any
problems that occur.
Let’s understand the use of a log file with an example:
Suppose we have a fintech platform; its log file keeps a
record of all organic/inorganic/paid traffic, whether from
human users or bots.
1. Detecting different types of bots visiting our website:
If our site is crashing frequently and Google Search
Console shows only 20K users on our website, we may
not get a clear idea of the cause. But using log files we
can find out how many bots, and of what type, are
visiting our website. This way we learn that in addition
to the 20K human users we also have 1K bots visiting
our website, which reveals the actual source of the
trouble.
2. Troubleshooting: When something goes wrong, log
files provide detailed information about errors,
crashes, or unexpected behavior. This helps
developers and IT professionals diagnose and fix
problems.
3. Security: Logs can reveal unauthorized access
attempts, suspicious activities, or security breaches.
By analyzing these records, you can enhance the
security of your systems and detect potential threats
early, for example an SQL injection attack.
4. Auditing: They provide a historical record of actions
and changes made within a system. This is useful for
compliance with regulations or for reviewing what
happened during a particular period.
[Figure: a picture of a log file]
Q. WHY USE LOG FILES INSTEAD OF TEXT FILES?
LOG FILE:
 Typically follows a standardized format that includes
timestamps, log levels (INFO, WARN, ERROR), and
message content. This structure makes it easier to
parse and analyze data.
 Consistent formatting across entries makes it easier
to automate parsing and analysis; many logging
frameworks enforce this consistency.
 Designed to be continuously updated and monitored.
Tools can be configured to watch log files for specific
patterns or events, enabling real-time monitoring and
alerting.

TEXT FILE:
 Often lacks a standardized structure, which can make
extracting and interpreting data more challenging.
 May vary widely in format and content, leading to
potential inconsistencies that complicate automated
processing.
 Can be monitored, but is not as readily integrated
into monitoring systems and tools.
Q. What is a log file analyzer?
A log file analyzer helps you make sense of log files. If
log entries are like diary entries, a log file analyzer reads
through those diary entries to:
1. Find Patterns: It looks for unusual or repetitive
entries to spot issues or errors.
2. Troubleshoot Problems: If something goes wrong,
the analyzer helps you figure out what happened
and why.
3. Monitor Performance: It tracks how well the system
or application is performing, helping you spot
slowdowns or bottlenecks.
4. Improve Security: By analyzing logs, you can detect
unauthorized access or other security threats.
Q. HOW TO MAKE A LOG FILE ANALYSER USING
PYTHON?
STEP 1:
Read the Log File
First, you need to read the log file.
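A minimal sketch of this step, using a small sample log created on the spot (the file name and log lines are assumptions):

```python
# Create a small sample log file so the example is self-contained.
sample = (
    "2024-05-01 09:15:00 - INFO - Server started\n"
    "2024-05-01 13:40:00 - ERROR - Disk full\n"
)
with open("server.log", "w") as f:
    f.write(sample)

# Step 1: read the whole log file into a list of lines.
with open("server.log") as f:
    logs = f.readlines()

print(len(logs))  # 2
```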
STEP 2:
Parse the Log Data
Parsing a date in Python means converting a string that
represents a date into a datetime object, which allows
you to easily manipulate and work with the date data.
Thus, in Python we read the file and extract meaningful
information from it. Log files usually have a consistent
format, so you’ll need to parse each line to extract the
relevant information.
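A sketch of parsing one line, assuming the format "YYYY-MM-DD HH:MM:SS - LEVEL - message" (the sample line is an assumption):

```python
from datetime import datetime

line = "2024-05-01 13:40:00 - ERROR - Disk full"

# The first two whitespace-separated fields are the date and time.
date_part, time_part = line.split()[0], line.split()[1]
timestamp = datetime.strptime(f"{date_part} {time_part}", "%Y-%m-%d %H:%M:%S")

# The second " - "-separated field is the log level.
level = line.split(" - ")[1]

print(timestamp.year, timestamp.hour, level)  # 2024 13 ERROR
```

Once the timestamp is a datetime object, it can be compared, sorted, or bucketed by hour directly.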
STEP 3:
Analyze the Data
You can now perform various analyses, such as counting
how many requests were successful and how many
resulted in errors.
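For instance, counting log levels in a few sample entries (the lines below are assumptions):

```python
# Count how many entries of each severity appear.
logs = [
    "2024-05-01 09:15:00 - INFO - Request served",
    "2024-05-01 13:40:00 - ERROR - Disk full",
    "2024-05-01 14:05:00 - INFO - Request served",
]
error_count = sum("ERROR" in log for log in logs)
info_count = sum("INFO" in log for log in logs)

print(info_count, error_count)  # 2 1
```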
STEP 4:
Handle Large Log Files
For very large log files, consider using more advanced
techniques, such as processing the file line by line or
using libraries like pandas for efficient data manipulation.
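A sketch of line-by-line processing (the file name and contents are assumptions):

```python
# Create a small stand-in for a large log file.
with open("big.log", "w") as f:
    f.write("INFO ok\nERROR bad\nINFO ok\n")

# Iterating over the file object streams one line at a time,
# so memory use stays roughly constant no matter how large the log grows.
error_count = 0
with open("big.log") as f:
    for line in f:
        if "ERROR" in line:
            error_count += 1

print(error_count)  # 1
```

Unlike readlines(), which loads the entire file into memory, this loop reads lazily and works even when the log is larger than RAM.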
Summary
1. Read the file: Use Python’s file handling methods.
2. Parse the data: Extract useful information from each
log entry.
3. Analyze the data: Count occurrences, search for
patterns, or visualize.
4. Handle large files: Use efficient data processing
techniques if needed.
Key Functions
 read_log_file: Reads logs from the specified file.
 analyze_logs: Calculates the total number of logs and
counts INFO, WARNING, and ERROR entries.
 filter_logs_by_date: Filters logs based on a user-
provided date.
 search_for_keyword: Searches logs for a specified
keyword.
 count_by_time_period: Categorizes logs into time
periods (morning, afternoon, evening, night).
 percentage_distribution: Calculates the percentage
representation of each log type.
 top_error_sources: Retrieves the most frequent
error messages.
 export_summary: Saves the summary of the analysis
into a text file.
SOURCE CODE:
from collections import Counter


# Function to read the log file
def read_log_file(file_name):
    try:
        with open(file_name, 'r') as file:
            logs = file.readlines()
        return logs
    except FileNotFoundError:
        print("File not found. Please check the file name and try again.")
        return []


# Function to analyze logs
def analyze_logs(logs):
    total_logs = len(logs)
    error_count = sum("ERROR" in log for log in logs)
    warning_count = sum("WARNING" in log for log in logs)
    info_count = total_logs - error_count - warning_count  # INFO logs
    return total_logs, error_count, warning_count, info_count


# Function to filter logs by date
def filter_logs_by_date(logs, date_filter):
    filtered_logs = []
    for log in logs:
        try:
            # Check if log has enough parts before accessing date
            log_date = log.split()[0]
            if log_date == date_filter:
                filtered_logs.append(log)
        except IndexError:
            # Log line doesn't have expected format, skip it
            continue
    return filtered_logs


# Function to search for a keyword in logs
def search_for_keyword(logs, keyword):
    return [log for log in logs if keyword in log]


# Function to count logs by time period
def count_by_time_period(logs):
    morning = afternoon = evening = night = 0
    for log in logs:
        # Assuming logs are in the format: "YYYY-MM-DD HH:MM:SS - Log message"
        time = log.split()[1]
        hour = int(time.split(':')[0])
        if 5 <= hour < 12:
            morning += 1
        elif 12 <= hour < 17:
            afternoon += 1
        elif 17 <= hour < 21:
            evening += 1
        else:
            night += 1
    return morning, afternoon, evening, night


# Function to calculate percentage distribution of log types
def percentage_distribution(info_count, warning_count, error_count, total_logs):
    info_percent = (info_count / total_logs) * 100
    warning_percent = (warning_count / total_logs) * 100
    error_percent = (error_count / total_logs) * 100
    return info_percent, warning_percent, error_percent


# Function to get top error sources
def top_error_sources(logs):
    error_messages = []
    for log in logs:
        if "ERROR" in log:
            try:
                # Ensure the log line contains the separator before splitting
                error_message = log.split(" - ")[1]
                error_messages.append(error_message)
            except IndexError:
                # If the log doesn't have the expected format, skip it
                continue
    return Counter(error_messages).most_common(3)


# Function to export summary to a file
def export_summary(total_logs, info_count, error_count, warning_count, top_errors):
    with open("log_analysis_summary.txt", 'w') as file:
        file.write("Log Analysis Summary\n")
        file.write("=====================\n")
        file.write(f"Total Logs: {total_logs}\n")
        file.write(f"INFO Logs: {info_count}\n")
        file.write(f"WARNING Logs: {warning_count}\n")
        file.write(f"ERROR Logs: {error_count}\n")
        file.write("\nTop Error Sources:\n")
        for error, count in top_errors:
            file.write(f"{error}: {count} occurrences\n")
    print("Summary exported to 'log_analysis_summary.txt'.")


# Main function to run the log analyzer
def main():
    file_name = input("Enter the log file name (with extension): ")

    # Step 1: Read the log file
    logs = read_log_file(file_name)

    if logs:
        # Step 2: Analyze total logs, errors, and warnings
        total_logs, error_count, warning_count, info_count = analyze_logs(logs)
        print(f"Total Logs: {total_logs}, Errors: {error_count}, "
              f"Warnings: {warning_count}, INFO: {info_count}")

        # Step 3: Filter logs by date
        date_filter = input("Enter a date to filter logs (YYYY-MM-DD) or press Enter to skip: ")
        if date_filter:
            filtered_logs = filter_logs_by_date(logs, date_filter)
            print(f"\nLogs for {date_filter}:")
            for log in filtered_logs:
                print(log.strip())

        # Step 4: Search for a keyword in the logs
        keyword = input("Enter a keyword to search in logs or press Enter to skip: ")
        if keyword:
            keyword_logs = search_for_keyword(logs, keyword)
            print(f"\nLogs containing '{keyword}':")
            for log in keyword_logs:
                print(log.strip())

        # Step 5: Get top error sources
        top_errors = top_error_sources(logs)
        print("\nTop Error Sources:")
        for error, count in top_errors:
            print(f"{error}: {count} occurrences")

        # Step 6: Analyze logs by time period
        morning, afternoon, evening, night = count_by_time_period(logs)
        a = input("Do you want to count logs on the basis of time period... (y/n): ")
        if a == "y":
            print(f"\nLogs by Time Period - Morning: {morning}, Afternoon: "
                  f"{afternoon}, Evening: {evening}, Night: {night}")
        else:
            print("Let's continue then...")

        # Step 7: Calculate percentage distribution of log types
        info_percent, warning_percent, error_percent = percentage_distribution(
            info_count, warning_count, error_count, total_logs)
        a = input("Do you want me to show percent distribution of log types... (y/n): ")
        if a == "y":
            print(f"\nPercentage Distribution - INFO: {info_percent:.2f}%, "
                  f"WARNING: {warning_percent:.2f}%, ERROR: {error_percent:.2f}%")
        else:
            print("Let's continue then...")

        # Step 8: Export the summary to a file
        export = input("Do you want to export the summary to a file? (yes/no): ")
        if export.lower() == 'yes':
            export_summary(total_logs, info_count, error_count,
                           warning_count, top_error_sources(logs))


# Entry point of the program
if __name__ == "__main__":
    main()
OUTPUT:
[Screenshots of the log analyzer's output]
SOURCES:
 https://www.w3schools.com/python/python_intro.asp
 https://phoenixnap.com/kb/file-handling-in-python
 https://www.youtube.com/watch?v=ndoLTHtZX8Q&t=1091s
 https://www.dremio.com/wiki/log-file-analysis/