Analyzing and Simplifying Log Files Using Python IJERTV9IS050113

This document summarizes a research paper about analyzing and simplifying log files using a Python tool called YM Log Analyzer. The tool was developed to more easily analyze server-based log files on Linux systems, such as logs from Apache, mail servers, DNS, DHCP, FTP, authentication, syslog, and command histories. The tool has both a script and graphical user interface version. It allows administrators to more easily search log files for keywords or events within specific time periods to help troubleshoot issues and enhance system security. Prior related work discussed includes analyzing logs from vehicular networks, oil industry systems, and supercomputers. The document focuses on Linux log files due to Linux's widespread use in servers, smartphones, cloud infrastructure, space
Published by : International Journal of Engineering Research & Technology (IJERT)

http://www.ijert.org ISSN: 2278-0181


Vol. 9 Issue 05, May-2020

Analyzing and Simplifying Log Files using Python


Yaser Mowlaiwzadah
Master Degree Student
ECE Department
REVA University
Bangalore, India

P. I. Basarkod
ECE Department
REVA University
Bangalore, India

R. B. Manjula
ECE Department
REVA University
Bangalore, India

Abstract—Nowadays computer security has become an important subject. It deals with detecting and preventing unauthorized access to computer systems, and people around the world who have access to the internet transmit their sensitive data through it. All user activity on computer systems and the internet is logged into log files, which play a key role in finding information about attacks on, and unauthorized access to, systems and servers. Today's computer systems produce a massive number of logs of various kinds, which may be security logs or any other type of log. Analyzing these logs can help an investigator find useful information about system vulnerabilities and apply techniques to guard against them. The purpose of this study is to simplify and analyze log files with the YM Log Analyzer tool, developed in the Python programming language. It focuses on server-based (Linux) logs such as Apache, Mail, DNS (Domain Name System), DHCP (Dynamic Host Configuration Protocol), FTP (File Transfer Protocol), Authentication, Syslog, and command History logs. The program has two versions, a script version and a graphic version: the script version is used on servers with no GUI, and the graphic version is for desktop users. Using this tool, the administrator is able to find out what is happening in the systems and realize the importance of log files in system security.

Keywords—Logs, Server-based, GUI, Security.


I. INTRODUCTION
First of all, what are logs? Logs provide users with a timeline of events about the operating system, applications, and the system as a whole, and they can be very useful for troubleshooting. Logs are stored in files called log files, and after encountering a problem, the log files are usually the first thing an administrator should look through.
Logs stored in files are hard to comprehend for someone new to Linux administration and networking. Fig. 1 is a screenshot of logs located in the /var/log/syslog file, which is responsible for storing system logs. Consider someone new to Linux who is searching for a keyword in a log file, or who wants the logs within a known period of time. Even for a professional in the field, it is good to have a ready program that searches for specific keywords or time periods, since it avoids wasted time.
Fig. 1 Syslog File Logs
A. Prior Work
It is often said that oil is no longer the most important and valuable asset and that data is now more valuable than oil. Logs are one type of data and make up a big share of it, and anyone with access to this data can do many things with it, so much research has been done on logs in all fields of computer systems, not only networks. As mentioned before, when a system faces a problem, the first thing to be analyzed is the logs, so that the system admin can answer who or what caused the problem, when it started, and why it happened, and can make sure that the same thing never happens again. For all of these reasons logs play a big part in system troubleshooting, maintenance, and security, so there is a need for research in this field. Such research is already widespread and continues every day, and it covers every field and type of subject. Some prior works worth looking at are:
- Vehicular networks: although vehicular networks are a new technology, log analysis has already reached them. Analyzing logs on these networks gives good information about the driver, trip, vehicle, and road traffic conditions, so the logs are analyzed in real time to give useful data for avoiding predictable problems by monitoring services and conditions.
- Oil industry: logs are also widely used in the oil industry, where research has been done and implemented that uses log files to provide real-time monitoring and reports for better management of an FPSO (Floating Production Storage and Offloading) system, which is responsible for production, hydrocarbon processing, and oil storage.
- Supercomputers: these are another field where log files are widely used. Many algorithms and applications have been developed to extract and analyze log files on supercomputers and use them for better management of the systems and, most of the time, for real-time monitoring and reports.
- Digital communication: logs are also widely used in digital communication, where data transactions need advanced technology and good management. Logs are used to provide better-optimized solutions for data transactions in this technology, as well as for the many usual tasks they also serve on other platforms and technologies.
There are also many advanced tools available for log analysis. They provide many useful functionalities and can be used on many platforms. All of them have GUI interfaces but too few of them have a non-graphical interface; they can still work with servers, because they connect to the servers, access the logs there, and provide useful reports. Some of these tools are:
- SolarWinds Manager: designed for Windows only; a centralized log analyzing tool; data in transit is encrypted to prevent unauthorized access; not free.
- PRTG: has tools for both Windows log detection and syslogs; admins can set alarms for some types of predefined activities; a highly customizable notification system that can also send Email or SMS messages; not free.
- Datadog: provides log analysis in the form of graphs, so network performance can also be seen in real time; provides centralized management, so all logs are collected in central storage; not free.
- Papertrail: filters can be applied to the output information; the free trial only allows up to 100 MB per month.
Besides the tools above, there are many other tools, perhaps a hundred, that all do the same thing and provide the same service. A look at these tools and the functionalities they provide makes it clearer how important logs and log files are in today's technological world.
B. Why Linux
As mentioned in the abstract, we analyze logs on Linux systems. Why are log files studied specifically on the Linux operating system in this paper? The wide usage of Linux should be considered: the top 500 supercomputers run Linux; 96.3 percent of the top 1 million web servers run Linux; by 2019, 1.99% of personal computers around the globe were running Linux; 90% of cloud infrastructure runs Linux; 85% of all smartphones run Linux, since Android is built on the Linux kernel, which means that roughly 4 out of 5 smartphones are Linux based; Linux is run by major space programs, for example SpaceX with its Falcon 9 vehicle; even in Hollywood, 90% of special effects are made using Linux; certain countries have announced it as their national OS; and many military establishments prefer Linux to any other operating system, with more migrating to it. These are all cases in which Linux is used, along with many more fields not mentioned above, and they are the reasons this paper is focused on Linux.
Logs in Linux are a valuable troubleshooting tool whenever a system admin encounters an issue. Everything in Linux has logs, for example the system, package manager, kernel, Apache, boot processes, etc.
In addition, an IEEE paper from 2015 named “A comparative study of network based system log management tools” compares five log analyzing tools, of which only one supports Linux while the other four do not. There is therefore a need for more log analyzers with different capabilities that support Linux, given that Linux is the leading operating system in many fields and its use is growing widely.
Logs in Linux are mostly stored in the /var/log directory. Log files in this directory can be explored and checked easily with a text editor or any command-line command capable of doing so, such as cat, less, tail, or head.
C. Tools Involved
The tools involved and used in this research paper are:
Python: the fundamental tool and language used to develop this tool is the Python programming language.
Tkinter: Python's Tkinter GUI package is used to develop the graphical interface for systems with graphical interfaces.
OS package: the program also invokes some Linux system commands, which rely on bash or the default Linux shell, so the OS package is also imported and used.
D. Why Python
Python is a powerful, high-level, easy-to-use, general-purpose programming language. It can be used in different fields such as artificial intelligence, data analysis, web design, software development, networking software development, and many others. Some key reasons why Python is used here are:
Automatic compilation to byte code, high-level data types and operations, a wide range of supported extensions, readable code with a distinct C-like quality that supports maintenance, a large library of contributed applications and tools, an object-oriented model, portability across architectures, and excellent documentation.
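To make the intended use concrete, the keyword and time-period search described in the Introduction can be sketched with nothing but Python's standard library. This is a minimal illustration under stated assumptions and not the YM Log Analyzer source: the function names `parse_timestamp` and `search_log`, the sample log lines, and the supplied year (classic syslog timestamps carry no year) are all hypothetical.

```python
import re
from datetime import datetime

# Classic syslog prefix, e.g. "May 10 14:03:22 host sshd[812]: message".
# The year is not part of this format, so the caller supplies it (assumption).
SYSLOG_PREFIX = re.compile(
    r"^([A-Z][a-z]{2})\s+(\d{1,2})\s+(\d{2}):(\d{2}):(\d{2})\s"
)
MONTHS = {
    name: number
    for number, name in enumerate(
        "Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec".split(), start=1
    )
}


def parse_timestamp(line, year):
    """Return the line's timestamp, or None if it has no syslog-style prefix."""
    match = SYSLOG_PREFIX.match(line)
    if match is None or match.group(1) not in MONTHS:
        return None
    month = MONTHS[match.group(1)]
    day, hour, minute, second = (int(g) for g in match.groups()[1:])
    try:
        return datetime(year, month, day, hour, minute, second)
    except ValueError:  # impossible date such as "Feb 30"
        return None


def search_log(lines, keyword=None, start=None, end=None, year=2020):
    """Yield lines that contain `keyword` and fall inside [start, end]."""
    for line in lines:
        # Case-insensitive keyword filter (skipped when no keyword is given).
        if keyword is not None and keyword.lower() not in line.lower():
            continue
        # Time-window filter (skipped when neither bound is given).
        if start is not None or end is not None:
            stamp = parse_timestamp(line, year)
            if stamp is None:
                continue  # the line cannot be placed in time, so skip it
            if start is not None and stamp < start:
                continue
            if end is not None and stamp > end:
                continue
        yield line.rstrip("\n")


# Hypothetical sample lines standing in for the contents of a syslog file.
sample = [
    "May 10 14:03:22 web01 sshd[812]: Failed password for root from 10.0.0.5",
    "May 10 14:05:01 web01 CRON[901]: (root) CMD (run-parts /etc/cron.hourly)",
    "May 11 09:12:47 web01 sshd[1203]: Accepted password for admin from 10.0.0.7",
]

window_start = datetime(2020, 5, 10, 0, 0, 0)
window_end = datetime(2020, 5, 10, 23, 59, 59)
hits = list(search_log(sample, keyword="sshd", start=window_start, end=window_end))
for hit in hits:
    print(hit)
```

Because `search_log` accepts any iterable of lines, an open file object could be passed in place of `sample`, for instance one opened on a file under /var/log, provided the user has permission to read it. Skipping lines without a parseable timestamp when a time window is requested is a design choice of this sketch; a real tool might instead attach such lines to the preceding entry.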

II. METHODOLOGY
The program is developed only for the Linux operating system and cannot be used on any other platform. It consists of two separate files:
GUI file: this is the main file for local systems with graphical user interface access. When this file is run, the following windows are opened.
Fig. 2 Local Section Interface
As illustrated, the program consists of two main parts, local and server. In the local part, logs located on your own local system can be searched and analyzed. The server part is only a predesigned page for future work, should anyone be willing to develop it; only a preliminary login page has been developed.
Fig. 3 Server Section Interface
Server-based file: this file contains the code for non-GUI and server environments, where there is no access to a graphical interface. It works in the same way as the GUI file, except that it has no graphical interface.
Fig. 4 Non GUI Interface
There are differences between the GUI and non-GUI versions in workflow and results: in the GUI version you can search for a specific word or date, but in the non-GUI version it is not possible to search with this much specificity. Since the program is open source, the directories of the log files can be changed inside the program if the user wishes.
III. RESULT AND DISCUSSION
As a result, the developed program is capable of picking out the requested logs from log files with thousands of lines. This avoids wasted time and simplifies the enormous amount of data that is produced and stored daily on systems. The code files are available online and can be altered as convenient for the user.
IV. CONCLUSION AND FUTURE SCOPE
As for future development, the server log analyzing and simplifying part of the program is ready to be developed and provided with more features. Both the GUI and script files are uploaded online so that they are available to the public and to anyone who wishes to work on them. To download the files, click the link; after opening the link, both files are available for download and can be recognized by their names.
REFERENCES
[1] Amit, A., & Shyam Tukadiya. (2015). A Comparative Study of Network Based System Log Management Tools. IEEE, 6.
[2] Chen, R., Ji, W., Duan, S., Ling, Q., & Li, F. (2017). A Novel Method to Analyze Logs Generated by Wireless Telecommunication Systems. IEEE, 4.
[3] Hacker, T. (2016). A Markov Random Field Based Approach for Analyzing Supercomputer System Logs. IEEE, 14.
[4] Hongli, W. (2019). A Flow Real-time Data Analyzer for Log of FPSO Central Control System. IEEE, 3.
[5] Jeffrey, S., & Purtilo, J. (2014). Mining Security Vulnerabilities from Linux Distribution Metadata. IEEE, 6.
[6] Matsumoto, S., Sato, A., Shinjo, Y., Nakai, H., Itano, K., Shomura, Y., & Yoshida, K. (2010). A Method for Analyzing Network Traffic Using Cardinality Information in Firewall Logs. IEEE, 6.
[7] Shaout, A., Mysuru, D., & Raghupathy, K. (2018). CAN Sniffing for Vehicle Condition, Driver Behavior Analysis and Data Logging. IEEE, 6.
[8] Stackify. (2017, June 23). What are Linux Logs? How to View Them, Most Important Directories, and More. Retrieved from Stackify: https://stackify.com/linux-logs/
[9] Stackify. (2017, June 23). What are Linux Logs? How to View Them, Most Important Directories, and More. Retrieved from Stackify: https://stackify.com/linux-logs/


[10] Swati, C., Hitendra, C., Tomar, S., & Anil, R. (2014). User and Device Tracking in Private Networks by Correlating Logs: A System for Responsive Forensic Analysis. IEEE, 6.
[11] Team, D. (2019, December 10). Advantages and Disadvantages of Python – How it is dominating programming world. Retrieved from Data-Flair Training: https://data-flair.training/blogs/advantages-and-disadvantages-of-python/
[12] Vaughan-Nichols, S. J. (2015, October 15). Can the Internet exist without Linux? Retrieved from ZDNet: https://www.zdnet.com/article/can-the-internet-exist-without-linux/
[13] Keary, T. (2019, April 26). 11 Best Log Analysis Tools. Retrieved from Comparitech: https://www.comparitech.com/net-admin/best-log-analysis-tools/

IJERTV9IS050113 www.ijert.org 128


(This work is licensed under a Creative Commons Attribution 4.0 International License.)
