Detection of Malicious Social Bots Using Learning Automata With URL Features in Twitter Network

Vidvek InfoTech Private Limited, founded in 2012, is an engineering consultancy based in Hyderabad and Kalaburgi, specializing in power electronics, digital signal processing, and IT solutions. The company offers a range of services including software development, training, and research and development, aiming to produce skilled professionals for the industry. Their mission is to contribute to technological education and research while maintaining partnerships with various organizations in the field.

Uploaded by

khsheth

Detection Of Malicious Social Bots Using Learning Automata With URL Features In Twitter Network

CHAPTER 1
ABOUT THE ORGANIZATION
1.1 COMPANY PROFILE

Name: VIDVEK INFOTECH PRIVATE LIMITED

Registration: VIDVEK INFOTECH PRIVATE LIMITED

Company Logo: [image]

Foundation Logo: [image]

Tag line: Quick and Reliable Solutions

Founder: Vivekanand M Bonal

Establishing Year: 2012

Website: www.vidvek.com

Email: [email protected]

Products: Java, Web Development, Research Outsourcing, Digital Marketing, Electronics & Control System Research Optimization, Cloud Application Development, Research for IoT

Affiliations: Karnataka, Telangana

Registered and Verifying Authority: Govt. of Karnataka and Govt. of Telangana, INDIA

Branches Address: Opp: Poonam Ladies Corner, PDA Engineering College Rd, Near Maharaj Hotel, Aiwan-E-Shahi Area, Gulbarga, Karnataka 585102

Dept. of CSE, Sharnbasva University Kalaburgi 1



1.2 ABOUT COMPANY

VIDVEK INFOTECH PRIVATE LIMITED is a Hyderabad- and Kalaburgi-based engineering project consultancy with an independent R&D unit, established in 2012. The company's areas of strength are Power Electronics, Digital Image/Signal Processing, Embedded Systems, and DSP applications in Electric Drives. It also provides IT solutions, including web design and office automation software such as attendance management, billing, library management, and payroll systems.

The company also provides training and projects in Embedded Systems, Power Systems, Power Electronics, Electric Drives, Machines, DSP/DIP, VLSI, Data Warehousing, .NET, C#, Java/J2EE, and Linux, and develops its own range of quality embedded products. Vidvek InfoTech has trained thousands of students and professionals. Its teaching philosophy strives to create in-depth knowledge of the subject at hand, in the belief that depth is an essential ingredient for achieving heights in training and development.

➢ MISSION AND VISION

The main mission of Vidvek InfoTech is "To be a key player in deployment of Knowledge and Technology, By Producing and Providing Versatile and Resourceful Professionals to the growing Industries".

The vision of Vidvek InfoTech is to involve its students in never-ending engineering research and help them excel in quality education. In the years to come, its vision is to keep its students at the forefront of engineering and technological education, and to remain a world-class institution for technological education and scientific research for the public good.

1.3 CLIENTS AND PARTNERSHIPS

➢ DELTA SEMICONDUCTOR TECHNOLOGIES


Delta Semiconductor Technologies Pvt. Ltd. is a Bangalore start-up company that provides cutting-edge solutions in VLSI and Embedded Systems design, in pursuit of "Service with Leadership".

The company strongly believes in customer success through cooperation, unity, and service. It specializes in custom IC development, which includes Analog/Mixed-Signal Design, Foundation IP Development (viz. Memory Compilers, IOs, and Standard Cells), IBIS, and Verilog-A modelling. The team has extensive experience in circuit design, characterization, and layout down to 20nm/22nm process nodes.

➢ TECHKSHETRA INFO SOLUTIONS PVT. LTD

Techkshetra Info Solutions Pvt. Ltd is an ISO 9001:2015 certified software development firm with its corporate office in Bangalore and its registered office at KEONICS IT Park, Kalaburagi. It provides IT services and solutions, focusing on delivering beautiful, scalable, and high-quality products and apps globally. It specializes in product development, high-end mobile apps, and custom software development.

➢ TANNA EDUCATION

Tanna Educational Services is located in the heart of the city of Rajkot, which is one of the fastest developing educational and industrial centers in Gujarat. They have another branch in Santa Clara, CA, USA, which is a hub of worldwide semiconductor and software development known as Silicon Valley.

It is a leading educational institute specializing in Embedded Systems, VLSI, and Database Management. Its staff are highly trained, with many years of experience working in multinational companies. They offer a variety of teaching and training programs to help engineers sharpen their skills and broaden their knowledge. Please refer to their "Services" section for detailed information about their various tech services.

➢ INNOVETECH

Innovetech Pro Solutions Pvt. Ltd.: Linking Technologies to Students

Innovetech Pro Solutions trainers have published several books with world-class publishers such as PHI India Pvt. Ltd. They have been recognized as the best trainers by top MNCs such as Wipro, TCS, Logical CMG, I flex, IGATE, L&T, and Satyam. Innovetech Pro Solutions has been involved in training and academic project programs in more than 100 engineering colleges in Tamil Nadu and Pondicherry.

➢ SOOXMA TECHNOLOGIES

Sooxma Technologies, based in Hyderabad, is a leading organization providing excellent training in Robotics, Embedded Systems, Embedded Linux, Android, and the latest innovative technologies. Sooxma Technologies works round the clock to cater to equipment needs, and its excellent record over 9 years is a manifestation of the quality of the knowledge it imparts.

➢ KEONICS

Karnataka State Electronics Development Corporation Limited (KEONICS), an ISO 9001:2008 certified organization, has been in existence since 1976 with the sole objective of promoting electronics industries in Karnataka. To ensure that Karnataka is always at the forefront of electronics and IT, KEONICS provides high-quality infrastructure for IT industries, spreads IT education to the masses, and provides state-of-the-art products and services for e-Governance.


KEONICS focuses on setting up world-class infrastructure for the electronics and IT related industries in the state. Aiming to provide a platform for these industries to operate from, it works with Software Technology Parks of India to develop the estates, provide connectivity, and administer export-oriented schemes. KEONICS diversified into IT related activities in 1990.

➢ TURNING POINT

Turning Point Computers is a leading training institute in North Karnataka with the mission of providing the best quality computer education to all classes of people. It is a franchisee of KEONICS (a Government of Karnataka enterprise) for government certification and employment. Over the past few years the growth of the computer industry has been remarkable, and today it is the fastest growing industry. Not just students or housewives but even experienced professionals benefit greatly from upgrading their skills at Turning Point Computers. The organization not only provides a platform to build a bright professional career in the computer field but also provides placement opportunities. As computer knowledge has become a primary requirement for everyone, the institute aims to make all classes of people computer literate and to help them secure a brighter future.

1.4 SERVICES OFFERED

1. Lab Equipments/ Educational Board

2. Software Services

3. Development Board

4. Academic Projects

5. Training and Placement

6. Research & Development


1.4.1 LAB EQUIPMENTS/ EDUCATIONAL BOARD

The company manufactures Microprocessor Trainers and Interface Boards. With the efforts of its Research and Development team, the company has expanded its activities into various areas: Microcontrollers, Process Control Instrumentation, Digital Signal Processing, Power Electronics & Drives, Data Acquisition Systems, Personal Computer Trainer Systems, VLSI & Embedded Systems, etc.

At present, the company is concentrating on various new fields such as Advanced Control Systems, Solar Heat Pump Trainers, Advanced Process Control, Chemical Reactors, Distillation Columns, Image Processing, Nuclear Electronics, and Defense projects.

1.4.2 SOFTWARE SERVICES

The company provides software services in the areas of software development, mobile app development, and digital marketing, covering Electronics and Communication, Computer Science, and Mechanical domains, with advanced tools as required by customer, college, and industry needs.

1.4.3 DEVELOPMENT BOARDS

A development board is a printed circuit board containing a controller and the minimal support logic needed for an engineer to become acquainted with the microprocessor on the board and to learn to program it. It also serves users of the microprocessor as a way to prototype applications in products. Examples include microcontroller, FPGA, and PCB boards.

1.4.4 ACADEMIC PROJECTS

• B.E/B.Tech – EEE, ECE, TCE, CSE, IT, MECH.

• M.E/M.Tech – P.E&D, Power Systems, Java, Embedded Systems, Digital Signal Processing, Digital Electronics and Communications, Applied Electronics, CSE & MCA.

• Diploma – DEEE, DECE, DCSE.

1.4.5 TRAINING AND PLACEMENT

This Institute offers a 24-week Advanced Course in Embedded Systems. The course is designed to offer application-oriented training and real-time exposure to students, thereby bridging the gap between the industry's requirements and students' academic skill set. By pursuing the Institute's Program in Embedded Systems, students gain ready acceptance in the market.

The Institute's Advanced Course in Embedded Systems serves the interests of practicing embedded software engineers as well as engineers planning to enter the embedded field. The course content of this Program is approved by the industry, and it:

• Presents practical lessons and techniques for use in designing, implementing, integrating, and testing software for modern embedded systems
• Describes what an embedded system is, what makes embedded systems different, and what embedded systems designers need to know to develop them
• Provides the student with a life cycle view for designing multi-objective, multi-discipline embedded systems
• Imparts a solid understanding of the role of embedded systems and embedded systems design and development in today's technology-enabled society

1.4.6 RESEARCH & DEVELOPMENT

The company carries out research and development (R&D) in Power Electronics, Communication, VLSI, Image Processing, Surveillance Systems, Computer Science, Information Technology, Bio-Medical, and Management. The R&D unit is managed by highly experienced doctorates: Dr. Vinyanadatta Kohir, Dr. TC Manjunath, and Dr. GK Guta (MBA).

1.5 OPERATION OF COMPANY

➢ EMBEDDED SYSTEM

An embedded system is some combination of computer hardware and software, either fixed in capability or programmable, that is specifically designed for a particular function. Examples include industrial machines, automobiles, medical equipment, and cameras.

➢ WEB DESIGN & MAINTENANCE

Web design is the process of creating websites. It encompasses several different aspects, including webpage layout, content production, and graphic design. While the terms web design and web development are often used interchangeably, web design is technically a subset of the broader category of web development.

➢ DOMAIN REGISTER & HOSTING

A domain name registrar is an organization or commercial entity that manages the reservation of internet domain names. A domain name registrar must be accredited by a generic top-level domain (gTLD) registry and/or a country code top-level domain (ccTLD) registry. The management is done in accordance with the guidelines of the designated domain name registries.

➢ POWER SYSTEM & POWER ELECTRONICS

Electric power systems comprise components that produce electrical energy and transmit it to consumers, lowering the voltage to the level needed by consumer equipment. The production and transmission.

➢ MATLAB

MATLAB is a high-performance language for technical computing. It integrates computation, visualization, and programming in an easy-to-use environment where problems and solutions are expressed in familiar mathematical notation.

➢ VLSI

VLSI (very-large-scale integration) is the current level of computer microchip miniaturization and refers to microchips containing hundreds of thousands of transistors. LSI (large-scale integration) meant microchips containing thousands of transistors.

➢ MECHATRONICS

Mechatronics is an interdisciplinary area of engineering that combines mechanical and electrical engineering and computer science. A typical mechatronic system picks up signals from the environment and processes them to generate output signals.

➢ CLOUD COMPUTING

Cloud computing, also known as on-demand computing, is a kind of internet-based computing in which shared resources and information are provided to computers and other devices on demand. It is a model for enabling ubiquitous, on-demand access to a shared pool of configurable computing resources.

➢ DATA MINING

Data mining is an interdisciplinary subfield of computer science concerned with discovering patterns in data sets using methods at the intersection of artificial intelligence, machine learning, statistics, and database systems. The overall goal of the data mining process is to extract information from a data set and transform it into an understandable structure for further use. Aside from the raw analysis step, it involves database and data management aspects, data pre-processing, model and inference considerations, interestingness metrics, complexity considerations, post-processing of discovered structures, visualization, and online updating.

➢ BIG DATA

Big data is a broad term for data sets so large or complex that traditional data processing applications are inadequate. Challenges include analysis, capture, data curation, search, sharing, storage, transfer, visualization, and information privacy. The term often refers simply to the use of predictive analytics or certain other advanced methods to extract value from data, and seldom to a particular size of data set. Accuracy in big data may lead to more confident decision making, and better decisions can mean greater operational efficiency, cost reduction, and reduced risk.


➢ JAVA/.NET

.NET requires a heavy framework to be installed, with higher hardware requirements than Java. C# is the most popular .NET language and can be used to build many kinds of programs, such as web applications.

➢ ANDROID

Android is a mobile operating system developed by Google as an open-source, Linux-based platform for third-party companies to build on. The open nature of Android means developers and manufacturers can work with the platform for free and change it.

➢ NS2

NS2 is an open-source simulation tool that runs on Linux. It is a discrete-event simulator targeted at networking research and provides substantial support for simulation of routing, multicast protocols, and IP protocols such as UDP, TCP, RTP, and SRM.
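
To illustrate what "discrete-event" means here, the following is a minimal Python sketch of an event loop that processes timestamped events in order. It is not NS2 code and does not use any NS2 API (NS2 itself is scripted differently); the function name `simulate`, the event kinds `"send"` and `"recv"`, and the fixed `link_delay` are assumptions made purely for illustration.

```python
import heapq


def simulate(events, link_delay):
    """Process (time, kind, packet_id) events in timestamp order.

    Hypothetical model: every "send" event schedules a matching "recv"
    event link_delay time units later. Returns the ordered event log.
    """
    queue = list(events)
    heapq.heapify(queue)  # priority queue keyed on event time
    log = []
    while queue:
        time, kind, packet_id = heapq.heappop(queue)  # earliest event first
        log.append((time, kind, packet_id))
        if kind == "send":
            # Simulated time jumps straight to the next event; nothing
            # happens "between" events in a discrete-event simulator.
            heapq.heappush(queue, (time + link_delay, "recv", packet_id))
    return log


if __name__ == "__main__":
    for entry in simulate([(1.0, "send", 2), (0.0, "send", 1)], link_delay=0.5):
        print(entry)
```

The priority queue is the key design choice: the simulator never steps through idle time, it only advances the clock to the timestamp of the next pending event.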


CHAPTER 2

ABOUT THE DEPARTMENT

2.1 SPECIFIC FUNCTIONS


2.1.1 DIGITAL SIGNAL PROCESSING (DSP)

Digital signal processing (DSP) is the numerical manipulation of signals, usually with
the intention to measure, filter, produce or compress continuous analog signals. It is
characterized by the use of digital signals to represent these signals as discrete time, discrete
frequency, or other discrete domain signals in the form of a sequence of numbers or symbols
to permit the digital processing of these signals.

Theoretical analyses and derivations are typically performed on discrete-time signal models, created by the abstract process of sampling. Numerical methods require a digital signal, such as those produced by an analog-to-digital converter (ADC). The processed result might be a frequency spectrum or a set of statistics. But often it is another digital signal that is converted back to analog form by a digital-to-analog converter (DAC). Even if that whole sequence is more complex than analog processing and has a discrete value range, the application of computational power to signal processing allows for many advantages over analog processing in many applications, such as error detection and correction in transmission as well as data compression.
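
The sampling, ADC, and DAC steps described above can be sketched in a few lines of Python. This is an idealized illustration under stated assumptions, not a model of any particular converter: the function names `sample_sine`, `quantize`, and `reconstruct`, the uniform quantizer, and the full-scale range of ±1.0 are all hypothetical choices.

```python
import math


def sample_sine(freq_hz, sample_rate_hz, num_samples, amplitude=1.0):
    """Sample a sine wave at uniform time steps (a discrete-time signal)."""
    return [amplitude * math.sin(2 * math.pi * freq_hz * n / sample_rate_hz)
            for n in range(num_samples)]


def quantize(samples, bits, full_scale=1.0):
    """Idealized ADC: map each sample to the nearest of 2**bits levels."""
    step = 2 * full_scale / (2 ** bits - 1)
    codes = []
    for x in samples:
        clipped = max(-full_scale, min(full_scale, x))  # clip out-of-range input
        codes.append(round((clipped + full_scale) / step))
    return codes


def reconstruct(codes, bits, full_scale=1.0):
    """Idealized DAC: convert integer codes back to amplitudes."""
    step = 2 * full_scale / (2 ** bits - 1)
    return [c * step - full_scale for c in codes]


if __name__ == "__main__":
    samples = sample_sine(freq_hz=1.0, sample_rate_hz=8.0, num_samples=8)
    codes = quantize(samples, bits=3)
    approx = reconstruct(codes, bits=3)
    for s, c, a in zip(samples, codes, approx):
        print(f"{s:+.3f} -> code {c} -> {a:+.3f}")
```

The difference between each sample and its reconstructed value is the quantization error, which for this uniform quantizer is at most half a step. This is the "discrete value range" trade-off the paragraph mentions.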

Digital signal processing and analog signal processing are subfields of signal processing. DSP applications include audio and speech signal processing, sonar and radar signal processing, sensor array processing, spectral estimation, statistical signal processing, digital image processing, signal processing for communications, control of systems, biomedical signal processing, and seismic data processing, among others. DSP algorithms have long been run on standard computers, as well as on specialized processors called digital signal processors, and on purpose-built hardware such as application-specific integrated circuits (ASICs). Currently, there are additional technologies used for digital signal processing, including more powerful general-purpose microprocessors, field-programmable gate arrays (FPGAs), digital signal controllers (mostly for industrial applications such as motor control), and stream processors, among others.

Digital signal processing can involve linear or nonlinear operations. Nonlinear signal
processing is closely related to nonlinear system identification and can be implemented in
the time, frequency, and spatio-temporal domains.

➢ APPLICATIONS OF DSP

The main applications of DSP are audio signal processing, audio compression, digital image processing, video compression, speech processing, speech recognition, digital communications, radar, sonar, financial signal processing, seismology, and biomedicine. Specific examples are speech compression and transmission in digital mobile phones, room correction of sound in hi-fi and sound reinforcement applications, weather forecasting, economic forecasting, seismic data processing, analysis and control of industrial processes, medical imaging such as CAT scans and MRI, MP3 compression, computer graphics, image manipulation, hi-fi loudspeaker crossovers and equalization, and audio effects for use with electric guitar amplifiers.

2.1.2 DIGITAL IMAGE PROCESSING

Digital image processing (DIP) deals with the manipulation of digital images through a digital computer. It is a subfield of signals and systems but focuses particularly on images. DIP focuses on developing a computer system that is able to perform processing on an image. The input of that system is a digital image; the system processes that image using efficient algorithms and gives an image as output. The most common example is Adobe Photoshop, one of the most widely used applications for processing digital images.
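
As a small illustration of the "image in, algorithm, image out" pipeline described above, the sketch below applies a 3x3 mean (box blur) filter to a grayscale image stored as a list of rows. The function name `box_blur`, the list-of-lists image representation, and the choice of replicating border pixels are assumptions for illustration; real applications typically use dedicated imaging libraries.

```python
def box_blur(image):
    """Return a new image where each pixel is the mean of its 3x3 neighbourhood.

    `image` is a non-empty list of equal-length rows of integer intensities.
    Neighbours outside the image are replaced by the nearest border pixel.
    """
    height, width = len(image), len(image[0])
    output = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            total = 0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy = min(max(y + dy, 0), height - 1)  # clamp row index
                    xx = min(max(x + dx, 0), width - 1)   # clamp column index
                    total += image[yy][xx]
            output[y][x] = total // 9  # integer mean of the 9 samples
    return output


if __name__ == "__main__":
    noisy = [[0, 0, 0],
             [0, 90, 0],
             [0, 0, 0]]
    for row in box_blur(noisy):
        print(row)
```

The input is left unmodified and a new image of the same size is returned, which mirrors the input/output description in the text.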

➢ APPLICATIONS OF DIGITAL IMAGE PROCESSING

Some of the major fields in which digital image processing is widely used are listed below:

• Image sharpening and restoration

• Medical field

• Remote sensing

• Transmission and encoding

• Machine/Robot vision

• Color processing

• Pattern recognition

• Video processing

• Microscopic imaging

2.1.3 DIGITAL VIDEO PROCESSING

A video signal is the term used to describe any sequence of time-varying images. A still image is a spatial distribution of intensities that remains constant with time, while a time-varying image has a spatial intensity distribution that varies with time. Movies (films) and television are both examples of video signals, as are the signals that drive computer monitors and laptop and PDA displays. It is widely expected that video communications in particular will be the next application driving the mobile and handheld device market. This course should give you the tools to understand the components that are necessary for such systems to operate effectively.


➢ APPLICATIONS OF DIGITAL VIDEO PROCESSING

• Motion tracking

• HD videos

• Video editing

• Surveillance cameras

• Video compression

• Video coding

2.1.4 EMBEDDED SYSTEMS

An embedded system is a computer system with a dedicated function within a larger mechanical or electrical system, often with real-time computing constraints. It is embedded as part of a complete device, often including hardware and mechanical parts. Embedded systems control many devices in common use today.

Examples of properties typical of embedded computers when compared with general-purpose ones are low power consumption, small size, rugged operating ranges, and low per-unit cost. This comes at the price of limited processing resources, which make them significantly more difficult to program and to interface with. However, by building intelligence mechanisms on top of the hardware, taking advantage of possible existing sensors and the existence of a network of embedded units, one can both optimally manage available resources at the unit and network levels and provide augmented functionalities well beyond those available. For example, intelligent techniques can be designed to manage the power consumption of embedded systems.

Modern embedded systems are often based on microcontrollers (i.e. CPUs with integrated memory or peripheral interfaces), but ordinary microprocessors (using external chips for memory and peripheral interface circuits) are also still common, especially in more complex systems. In either case, the processor(s) used may be of types ranging from general purpose to those specialised in a certain class of computations, or even custom designed for the application at hand. A common standard class of dedicated processors is the digital signal processor (DSP).

Since the embedded system is dedicated to specific tasks, design engineers can
optimize it to reduce the size and cost of the product and increase the reliability and
performance. Some embedded systems are mass-produced, benefiting from economies of
scale.

Embedded systems range from portable devices such as digital watches and MP3
players, to large stationary installations like traffic lights, factory controllers, and largely
complex systems like hybrid vehicles, MRI, and avionics. Complexity varies from low, with
a single microcontroller chip, to very high with multiple units, peripherals and networks
mounted inside a large chassis or enclosure.

2.1.5 ROBOTICS

Robotics is the branch of mechanical engineering, electrical engineering, electronic engineering, and computer science that deals with the design, construction, operation, and application of robots, as well as computer systems for their control, sensory feedback, and information processing.


These technologies deal with automated machines that can take the place of humans
in dangerous environments or manufacturing processes, or resemble humans in appearance,
behavior, and/or cognition. Many of today's robots are inspired by nature, contributing to the
field of bio-inspired robotics.

The concept of creating machines that can operate autonomously dates back to
classical times, but research into the functionality and potential uses of robots did not grow
substantially until the 20th century. Throughout history, it has been frequently assumed that
robots will one day be able to mimic human behavior and manage tasks in a human-like
fashion. Today, robotics is a rapidly growing field, as technological advances continue;
researching, designing, and building new robots serve various practical purposes, whether
domestically, commercially, or militarily. Many robots do jobs that are hazardous to people,
such as defusing bombs and mines and exploring shipwrecks.

➢ APPLICATIONS OF ROBOTICS

• Space Robotics

• Underwater Robotics

• Electric Mobility

• Logistics, Production and Consumer (LPC)

• Search and Rescue (SAR) & Security Robotics

• Assistance and Rehabilitation Systems

• Agricultural Robotics

2.1.6 VLSI

Very-large-scale integration (VLSI) is the process of creating an integrated circuit (IC) by combining thousands of transistors into a single chip. VLSI began in the 1970s when complex semiconductor and communication technologies were being developed. The microprocessor is a VLSI device. Before the introduction of VLSI technology, most ICs had a limited set of functions they could perform. An electronic circuit might consist of a CPU, ROM, RAM, and other glue logic. VLSI lets IC designers add all of these into one chip.

➢ APPLICATIONS OF VLSI

• Application-specific integrated circuit

• Caltech Cosmic Cube

• Design rules checking

• Electronic design automation

• Polysilicon

• Mead & Conway revolution

2.2 TOOLS AND TECHNIQUES ADOPTED

➢ SOFTWARE ADOPTED

• Mentor Graphics (Cadence Tool) – for Analog Design

• Mentor Graphics (ModelSim) – for Digital Design based on VHDL, Verilog and SystemVerilog

• Quartus II 10.1 for VLSI-based Design and Development

• Spice Tools – NGSpice, HSpice, PSpice and PSCAD

• Xilinx ISE Design Suite

• Xilinx Vivado

• MATLAB & Simulink

• Multisim & NI Simulator

• DSCH & Microwind


➢ HARDWARE ADOPTED

• FPGA - Altera Cyclone II & IV, Spartan Kits and Mojo FPGA

• Microcontroller - PIC, ARM, Raspberry Pi, AVR, 8051, ATmega, Cortex

• DSP - TMS320C6416

• Wireless Communication - IoT, Wi-Fi, Bluetooth, Zigbee and GSM

2.3 ONGOING R&D PROJECTS

• Wireless Communication

• Internet of Things

• Security and Surveillance System

• GSM-based remote control of industrial processes

• Image compression/decompression, MP3 coding and decoding

• Fingerprint classification, hand tracking, gesture recognition

• Embedded systems

• ANN-based alphanumeric classification

• Face detection, signature recognition, speech recognition

• Office/Library automation, device drivers (Linux/Windows)

• Simulation of power diodes, radar target detection, 3-phase motor performance

• Face Recognition Using DCT-HMM Approach

• MIMO Communication System

• BER Performance for Communication System

• Critical Analysis of MIMO Communication System


CHAPTER 3

TASK PERFORMED
3.1 INTRODUCTION

Social networks are an emerging area of research today. The amount of information carried by Online Social Networks in the form of text and images is of immense value for data mining and knowledge extraction. There are many approaches to social network analysis, including machine learning. Machine learning algorithms work on a set of observable features extracted from user information. Applications of machine learning in the field of online social network analysis include spammer detection, user classification, link prediction, troll page detection, friend suggestions, community or cluster identification, trend analysis, sentiment analysis of political blogging, etc.
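
To make "observable features extracted from user information" concrete, here is a minimal Python sketch that turns an account record into a small numeric feature vector. The account fields (`followers`, `following`, `tweets`), the three features, and the URL regular expression are illustrative assumptions only; they are not the feature set of any particular paper or of the Twitter API.

```python
import re

# Simple pattern for http/https links; an assumption, not a full URL grammar.
URL_PATTERN = re.compile(r"https?://\S+")


def extract_features(account):
    """Compute a few hypothetical features from an account dictionary."""
    tweets = account["tweets"]
    count = len(tweets)
    tweets_with_url = sum(1 for text in tweets if URL_PATTERN.search(text))
    return {
        # Few followers relative to accounts followed can hint at spam.
        "follower_ratio": account["followers"] / max(account["following"], 1),
        # Fraction of tweets that contain at least one URL.
        "url_ratio": tweets_with_url / count if count else 0.0,
        # Mean tweet length in characters.
        "avg_tweet_length": sum(len(t) for t in tweets) / count if count else 0.0,
    }


if __name__ == "__main__":
    example = {
        "followers": 10,
        "following": 100,
        "tweets": ["buy now http://x.example/a", "hello"],
    }
    print(extract_features(example))
```

A classifier would then be trained on such vectors together with labels (for example, spammer or legitimate); which features are informative is an empirical question.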

With the Internet, it has become very easy to obtain any sort of information from any source across the world. The increased demand for social sites permits users to collect an abundant amount of information and data about other users. The huge volumes of data available on these sites also draw the attention of fake users [1]. Twitter has rapidly become an online source for acquiring real-time information about users. Twitter is an Online Social Network (OSN) where users can share anything and everything, such as news, opinions, and even their moods. Arguments can be held over different topics, such as politics, current affairs, and important events. When a user tweets something, it is instantly conveyed to his/her followers, allowing them to spread the received information at a much broader level [2]. With the evolution of OSNs, the need to study and analyze users' behaviors on online social platforms has intensified. Many people who do not have much information about OSNs can easily be tricked by fraudsters. There is also a demand to combat and control people who use OSNs only for advertisements and thus spam others' accounts. Recently, the detection of spam on social networking sites has attracted the attention of researchers. Spam detection is a difficult task in maintaining the security of social networks.

Most people use the internet and trust the content they find on it. Because anyone can post
a review, spammers have an open opportunity to generate fake reviews of products and
services. Identifying these intruders and their fake content is a widely debated research
issue. Although a large number of studies have been carried out, existing work still falls
short in distinguishing spam reviews, and none of it reports a significant result for each
type of feature collected.

Due to the continuous growth of data on large platforms such as social media, detecting
fraudulent accounts on these platforms is becoming more difficult. Although social media is
preferred for communication purposes, it is becoming an increasingly attractive target for
spammers and fraudsters. A recommendation system can be developed to offer better products
to customers by analyzing the shares and interactions of people on social media. But if the
messages are not sent by real people, the analysis will be wrong.

3.2 TECHNICAL PAPER REFERRED

A Survey on Machine Learning Methodologies in Social Network Analysis [3]: This paper
surveys existing work on (a) fake profile detection, (b) personality trait recognition and
(c) depression detection using machine learning algorithms in social network analysis, and
presents a comparative study of the different approaches.

Survey on Spam Filtering Using Netspam Framework [4]: This paper presents an application
that uses a new framework, called NetSpam, which uses spam features to model product review
data sets as heterogeneous information networks in order to design a spam review detection
method for such networks. Using the importance of spam features helps to achieve better
results on review data sets with respect to different metrics. The results indicate that
NetSpam outperforms previous methods and that, among the four categories of features
(review-behavioral, user-behavioral, review-linguistic and user-linguistic), the first
category performs better than the others.

Detection of Fake Twitter Accounts with Multiple Classifier and Data Augmentation
Technique [5]: In this study, malicious accounts are identified from the messages sent by
social media users, in order to prevent dirty and false information from circulating on the
internet. For this purpose, a system was developed to classify accounts as automated or
normal using intelligent techniques. The nearest neighbor, logistic regression and random
forest algorithms were used to identify counterfeit accounts. The classification
performances of these methods were compared, and SMOTE and majority voting techniques were
applied to the related algorithms to improve performance.

Machine Learning Based Twitter Spam Account Detection: A Review [6]: Twitter is one of the
biggest microblogging platforms; on average, more than half a billion tweets are posted
every day by millions of users. With such versatility and widespread use, Twitter is easily
infiltrated by malicious activities, which include malware intrusion, spam distribution and
social attacks. Spammers use social engineering attack strategies to send spam tweets, spam
URLs, etc. This has made Twitter an ideal arena for the proliferation of anomalous spam
accounts. The impact motivates researchers to develop models that analyze, detect and
recover from defamatory actions on Twitter. The Twitter network is inundated with tens of
millions of fake spam profiles, which may jeopardize normal users' security and privacy.
Improving the safety of real users and identifying spam profiles have therefore become key
parts of the research.

Trust-Aware Review Spam Detection [7]: This paper aims to provide an efficient and effective
method to identify review spammers by incorporating social relations, based on two
assumptions: that people are more likely to consider reviews from those connected with them
as trustworthy, and that review spammers are less likely to maintain a large relationship
network with normal users. The contributions of the paper are two-fold: (1) it elaborates
how social relationships can be incorporated into review rating prediction and proposes a
trust-based rating prediction model using proximity as trust weight, and (2) it designs a
trust-aware detection model based on rating variance, which iteratively calculates
user-specific overall trustworthiness scores as the indicator of spamicity. Experiments on a
dataset collected from Yelp.com show that the proposed trust-based prediction achieves a
higher accuracy than the standard CF method, and that there is a strong correlation between
social relationships and the overall trustworthiness scores.

3.3 DESIGN METHODOLOGY

TWEET SERVER

In this module, the admin has to log in using a valid user name and password. After a
successful login, the admin can perform operations such as viewing and authorizing users,
adding short URLs, listing all friend requests and responses, listing all tweets posted by
users, listing all tweets and re-tweets with comments, viewing all spammers' URLs, viewing
URL shortening users and post details, finding all clicked shortened URLs and the
corresponding users, and viewing chart results.

Viewing and Authorizing Users - In this module, the admin views the details of all users and
authorizes them for login. User details include user name, address, email id and mobile
number.

Viewing all Friends Request and Response - In this module, the admin can see the history of
all friend requests and responses. Details include the requesting user's name and image, the
requested user's name and image, the status and the date.

List all User Posted Tweets - In this module, the admin can see all the tweets posted by the
users. Tweet details include tweet name, tweet image, tweet description, tweet uses, date of
post and the name of the posting user.

View all Inference Attackers - In this module, the details of all inference attackers are
listed. The details consist of the comment that contains shortening URLs (like t.co, goo.gl,
bit.ly), the tweet name and the date of attack.

View URL Shortening Users and Post Details - In this, the admin can see all URL shortening
users and their post details. This includes the number of times a particular user used these
URLs (t.co, goo.gl, bit.ly) while commenting on tweets.

View all Clicked Shortened URLs and Corresponding End Users - In this, the admin can view
all the users who clicked on these URLs (t.co, goo.gl, bit.ly) and the number of times they
did so.

Find Number of times Posted URL shortening users in Chart - In this, the admin can see a
graph showing the number of times a particular user used these shortened URLs while tweeting
or re-tweeting (posting their comment).

Find Number of times used URL shortening users in Chart - In this, the admin can see a graph
showing the number of times a particular shortened URL was used by the users while tweeting
or re-tweeting (posting their comment).
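The two chart views above are driven by simple counts. The following is a minimal Python sketch of how such counts could be computed; it is not the project's Java/JSP code. The function name `count_shortened_urls`, the regular expression and the sample comments are illustrative assumptions; only the three shortener domains come from the report.

```python
import re
from collections import Counter

# Shortener domains named in the report; the pattern itself is an assumption.
SHORTENER_RE = re.compile(r"\b(?:https?://)?(t\.co|goo\.gl|bit\.ly)/\S+", re.IGNORECASE)

def count_shortened_urls(comments):
    """Count shortened URLs per user and per shortener domain.

    `comments` is an iterable of (user_name, comment_text) pairs.
    """
    per_user, per_domain = Counter(), Counter()
    for user, text in comments:
        for match in SHORTENER_RE.finditer(text):
            per_user[user] += 1
            per_domain[match.group(1).lower()] += 1
    return per_user, per_domain

# Hypothetical sample data, for illustration only.
sample = [
    ("alice", "Nice post! see http://bit.ly/abc and https://t.co/xyz"),
    ("bob", "I agree with this tweet"),
    ("alice", "more at goo.gl/q1"),
]
per_user, per_domain = count_shortened_urls(sample)
print(per_user.most_common())    # data for the "posted by user" chart
print(per_domain.most_common())  # data for the "used URL" chart
```

The first counter corresponds to the per-user chart and the second to the per-URL chart; in the real system these values would come from the database rather than an in-memory list.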


USER

In this module, there are n users. A user must register before performing any operation.
Once a user registers, their details are stored in the database. After successful
registration, the user has to log in using the authorized user name and password. Once the
login is successful, the user can perform operations such as viewing their profile details,
searching for friends and sending friend requests, accepting friend requests, viewing
friends' details, posting their own tweets, finding friends' tweets and re-tweeting, listing
their own tweets and comments, and finding tweets posted by inference attack users.

Tweet Server

1. List all users and authorize
2. List all Friends Req and Res
3. List all user posted tweets
4. List All Tweets Re Tweets with comments
5. View all Spammers URLs (those who use URL shortening (like t.co, goo.gl, bit.ly) tweets)
6. View URL shortening users and post details (user, number of time posted these URLs)
7. Find all Clicked shortened URLs and corresponding end users (user, number of time used
these URLs)
8. Find number of time posted URL shortening users in Chart
9. Find number of time used URL shortening users in chart

Remote User

1. View your profile
2. Search friends, req / res friends
3. View all your friends and their details
4. Post your tweet by tweet name, tweet des, tweet uses, tweet image
5. Find your friends tweets and re tweet
6. List all your tweets and comments
7. Find inference attack on your tweets

Figure 3.1 System architecture


Viewing Profile Details, Search and Request Friends - In this module, the user can see their
own profile details, such as address, email, mobile number and profile image. The user can
search for friends, send friend requests and accept friend requests.

Post Tweets - In this, the user can post their own tweets by providing details such as tweet
image, tweet name, tweet description and tweet uses.

View Friends Tweets on Posts and Re-Tweet - In this, the user can see all of his/her friends'
tweets on posts and can re-tweet them by adding their own comment (if the comment contains
shortened URLs, that is, t.co, goo.gl or bit.ly, then the user becomes an inference
attacker).
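The rule stated above (a comment containing a t.co, goo.gl or bit.ly link marks its author as an inference attacker) can be sketched as follows. This is an illustrative Python sketch rather than the project's own implementation; the function names `shortened_urls` and `is_inference_attacker` and the punctuation-stripping step are assumptions.

```python
from urllib.parse import urlparse

# Shortener hosts named in the report.
SHORTENER_HOSTS = {"t.co", "goo.gl", "bit.ly"}

def shortened_urls(comment):
    """Return the whitespace-separated tokens whose host is a known shortener."""
    found = []
    for token in comment.split():
        candidate = token.strip(".,;:!?()[]\"'")  # drop surrounding punctuation
        # urlparse only recognises the host when a scheme is present.
        url = candidate if "://" in candidate else "http://" + candidate
        try:
            host = (urlparse(url).hostname or "").lower()
        except ValueError:  # malformed token, e.g. a stray bracket
            continue
        if host in SHORTENER_HOSTS:
            found.append(candidate)
    return found

def is_inference_attacker(comment):
    """Apply the report's rule: any shortened URL in the comment flags the user."""
    return bool(shortened_urls(comment))

print(is_inference_attacker("check https://bit.ly/abc now"))
print(is_inference_attacker("read http://example.com/page instead"))
```

Matching on the parsed host rather than on a substring avoids flagging unrelated domains that merely contain "t.co" inside a longer name.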

View Inference Attack on User Posts (Tweets) - In this, the user can see all the inference
attackers who have posted shortening URLs in their comments on the user's posts.

3.4 DESIGN FLOW DIAGRAM

Figure 3.2 Design Methodology


Dataset: The benefit of including words in the feature set based on their entropy score is
that it reduces uncertainty in the final prediction, because these words show markedly
different frequency counts in spam and non-spam URLs.

Pre-processing: Before training, the messages must be pre-processed. First, all characters
are converted to lowercase, so that a word appearing in both uppercase and lowercase is
treated as the same word rather than as two different words. Then each message in the data
set is tokenized.
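The two pre-processing steps can be illustrated with a short Python sketch. The tokenization rule (runs of letters and digits) and the function name `preprocess` are assumptions, since the report does not specify how tokens are delimited.

```python
import re

def preprocess(message):
    """Lowercase a message and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", message.lower())

tokens = preprocess("Free FREE offer, click NOW!")
print(tokens)
```

With this rule "Free" and "FREE" map to the same token. Note that a URL such as bit.ly/abc would be split into several tokens, so a real pipeline might extract URLs before tokenizing.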

Feature Selection: The main advantage of using the words present in the dataset as features
is that they reduce uncertainty in the final prediction, because these words show markedly
different frequency counts in spam and ham comments containing URLs.
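The report does not define how the entropy score of a word is computed. One common reading, shown here purely as an assumption, is the entropy of the spam/ham label among the messages that contain the word: a score near 0 means the word occurs almost only in one class and is therefore informative. The function name `word_entropy` and the sample messages are hypothetical.

```python
import math
from collections import defaultdict

def word_entropy(labelled_docs):
    """Map each word to the entropy (in bits) of the label among documents containing it.

    `labelled_docs` is an iterable of (tokens, is_spam) pairs.
    """
    counts = defaultdict(lambda: [0, 0])  # word -> [ham_count, spam_count]
    for tokens, is_spam in labelled_docs:
        for word in set(tokens):  # count each word once per document
            counts[word][1 if is_spam else 0] += 1
    scores = {}
    for word, (ham, spam) in counts.items():
        total = ham + spam
        entropy = 0.0
        for n in (ham, spam):
            if n:
                p = n / total
                entropy -= p * math.log2(p)
        scores[word] = entropy
    return scores

# Hypothetical labelled messages (True = spam).
docs = [
    (["free", "prize", "click"], True),
    (["free", "click", "now"], True),
    (["meeting", "now"], False),
    (["lunch", "meeting"], False),
]
scores = word_entropy(docs)
for word in sorted(scores, key=scores.get):
    print(word, round(scores[word], 3))
```

Words with the lowest scores would be kept as features; a word that appears equally often in spam and ham gets the maximum score of 1 bit and adds little to the prediction.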

Feature Extraction and Feature Engineering: Attribute importance is a supervised function
that ranks attributes according to their significance in predicting a target. Here Count
Vectorizer is used, which converts a "collection of text documents to a matrix of token
counts". It is configured through the following options:

• N-grams: N-grams are used to improve accuracy. A single word is normally treated on its
own, but when two words occur together the overall meaning can change. Accuracy therefore
tends to improve when the text is split into tokens of two or more words rather than single
words only.

• Analyzer: "Whether the feature should be made of word or character n-grams. Option
'char_wb' creates character n-grams only from text inside word boundaries; n-grams at the
edges of words are padded with space."

• Vocabulary: "Either a Mapping (e.g., a dict) where keys are terms and values are indices
in the feature matrix, or an iterable over terms. If not given, a vocabulary is determined
from the input documents. Indices in the mapping should not be repeated and should not have
any gap between 0 and the largest index."

• Binary: "If True, all non zero counts are set to 1. This is useful for discrete
probabilistic models that model binary events rather than integer counts."

• Max_Features: "If not None, build a vocabulary that only considers the top max_features
ordered by term frequency across the corpus. This parameter is ignored if vocabulary is not
None."

Model Building: After pre-processing, a model must be constructed that preserves the
functionality of the project with respect to the labelled data; it is built using a
supervised algorithm.
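To make the n-gram, binary and max_features options concrete, here is a plain-Python sketch of what a count vectorizer computes. It is a simplified illustration and not scikit-learn's implementation: the function names are assumptions, tokens are split on whitespace only, and ties at the max_features cut-off are broken alphabetically as an arbitrary choice.

```python
from collections import Counter

def word_ngrams(tokens, ngram_range=(1, 1)):
    """Return all word n-grams with lengths from ngram_range[0] to ngram_range[1]."""
    lo, hi = ngram_range
    return [" ".join(tokens[i:i + n])
            for n in range(lo, hi + 1)
            for i in range(len(tokens) - n + 1)]

def count_vectorize(docs, ngram_range=(1, 1), binary=False, max_features=None):
    """Build a vocabulary and a document-term count matrix (list of rows)."""
    per_doc = [Counter(word_ngrams(doc.lower().split(), ngram_range)) for doc in docs]
    totals = Counter()
    for counts in per_doc:
        totals.update(counts)
    # Keep the most frequent terms across the corpus when max_features is set.
    terms = sorted(totals, key=lambda t: (-totals[t], t))
    if max_features is not None:
        terms = terms[:max_features]
    vocabulary = {term: index for index, term in enumerate(sorted(terms))}
    matrix = []
    for counts in per_doc:
        row = [0] * len(vocabulary)
        for term, n in counts.items():
            if term in vocabulary:
                row[vocabulary[term]] = 1 if binary else n
        matrix.append(row)
    return vocabulary, matrix

docs = ["not good", "good good deal"]
vocab, matrix = count_vectorize(docs, ngram_range=(1, 2))
print(vocab)
print(matrix)
```

With `ngram_range=(1, 2)` the phrase "not good" becomes a feature of its own, which is the effect the N-grams option is meant to capture; `binary=True` would record only presence or absence of each term.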

AdaBoost is a boosting algorithm that is widely adopted in practice. It combines many weak
classifiers into a single strong classifier. The weak learners are decision stumps, that is,
decision trees with a single split. AdaBoost weights the training instances by difficulty:
it puts more weight on the instances that are harder to classify and less weight on those
that are already handled correctly.

Each decision stump splits the data into two subsets using a threshold value, so every data
point falls either above or below the threshold. A single stump is only moderately accurate
on the dataset, because it fails on values that are exceptions to the threshold rule. A
decision tree is a series of true-or-false questions asked about the data, eventually
leading to a predicted class or a continuous value. It tries to form nodes that contain a
high proportion of data points from a single class by finding the feature values that divide
the data into classes.
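The boosting scheme described above can be sketched in a few lines of Python. This is a minimal illustration of discrete AdaBoost with threshold stumps on labels +1 and -1, not the code used in the project; the function names, the number of rounds and the toy one-feature dataset are assumptions.

```python
import math

def best_stump(X, y, w):
    """Return (error, feature, threshold, polarity) with the lowest weighted error."""
    best = None
    for j in range(len(X[0])):
        for thr in sorted({row[j] for row in X}):
            for polarity in (1, -1):
                preds = [polarity if row[j] > thr else -polarity for row in X]
                err = sum(wi for wi, p, yi in zip(w, preds, y) if p != yi)
                if best is None or err < best[0]:
                    best = (err, j, thr, polarity)
    return best

def stump_predict(stump, row):
    _, j, thr, polarity = stump
    return polarity if row[j] > thr else -polarity

def adaboost_fit(X, y, rounds=3):
    """Train `rounds` stumps, re-weighting the instances after each one."""
    n = len(X)
    w = [1.0 / n] * n
    model = []
    for _ in range(rounds):
        stump = best_stump(X, y, w)
        err = min(max(stump[0], 1e-10), 1 - 1e-10)  # avoid division by zero
        alpha = 0.5 * math.log((1 - err) / err)     # vote of this stump
        model.append((alpha, stump))
        # Misclassified instances gain weight, correctly classified ones lose it.
        w = [wi * math.exp(-alpha * yi * stump_predict(stump, row))
             for wi, yi, row in zip(w, y, X)]
        total = sum(w)
        w = [wi / total for wi in w]
    return model

def adaboost_predict(model, row):
    score = sum(alpha * stump_predict(stump, row) for alpha, stump in model)
    return 1 if score >= 0 else -1

# Toy data: no single threshold separates the two classes.
X = [[0], [1], [2], [3], [4], [5], [6], [7], [8], [9]]
y = [1, 1, 1, -1, -1, -1, 1, 1, 1, -1]
model = adaboost_fit(X, y, rounds=3)
print([adaboost_predict(model, row) for row in X])
```

The toy labels cannot be separated by one threshold, which is the situation where a lone stump fails on "exception" values; the weighted vote of several stumps is what lets the combined classifier fit them.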

A decision tree is a nonlinear model built from many linear boundaries. The model is given
both labels and features so that it learns to classify points based on the features; because
it tends to overfit the data, it is not as accurate as the other algorithms. Random forest
combines a number of decision trees into a single model; it too is not as accurate as the
other algorithms considered. Logistic regression is used to predict binomial or multinomial
values of a variable. It uses a statistical approach to find the outcome, which here is
binary in nature. It uses the logit function to predict the probability of a binary outcome,
which follows a Bernoulli distribution, so the outcome is exactly one of two values, x or y.
Here it is applied to the dataset to predict x or y, that is, spam or ham.
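The prediction step of logistic regression can be written out as a short Python sketch. The logit relation is log(p / (1 - p)) = b + w·x, so the probability is the sigmoid of the linear score. The weights, bias, feature meanings and the 0.5 threshold below are hypothetical values chosen for illustration, not parameters learned in the project.

```python
import math

def sigmoid(z):
    """Inverse of the logit function: maps a real score to a probability."""
    return 1.0 / (1.0 + math.exp(-z))

def predict_proba(weights, bias, features):
    """Probability that the message is spam under a fitted linear model."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(z)

def predict_label(weights, bias, features, threshold=0.5):
    """Turn the probability into one of the two outcomes, spam or ham."""
    return "spam" if predict_proba(weights, bias, features) >= threshold else "ham"

# Hypothetical model: feature 0 = count of shortened URLs, feature 1 = count of "meeting".
weights, bias = [2.0, -1.0], -1.0
print(predict_label(weights, bias, [1, 0]))
print(predict_label(weights, bias, [0, 1]))
```

In practice the weights would be estimated from the labelled token-count matrix; the sketch only shows how a fitted model turns feature counts into a spam or ham decision.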


CHAPTER 4

SPECIFIC OUTCOME

4.1 RESULTS

Figure 4.1 Home Page

Figure 4.2 User login


Figure 4.3 User Registration Page

Figure 4.4 Server Side Home Page


4.2 EXPERIENCE AND ASSESSMENT

The experience at the company was satisfactory; the people work in co-ordination and the
company environment is very safe and studious. The reason for choosing this company was that
it was offering an internship in wireless, which is my core specialization in my PG degree,
and I wanted to benefit from this experience. I also got to learn new tools like JAVA/J2EE
(JSP, SERVLET).

I used to spend nearly 5 to 6 hours daily in the company trying out different circuits and
making their layouts manually. I thank my guide, who was always by my side throughout my
internship, giving me advice, feedback and tips on how people work in an industry
environment.

On the whole, this internship was a useful experience. I have gained new knowledge and
skills and met many new people. Internships help us to learn more about ourselves. Through
an internship, we gain clarity on our strengths, weaknesses and interests.

Internships increase our professional confidence and also improve our communication skills.
Through an internship, we get a chance to learn what it is really like to work in a company,
in an industry, and in various job functions. Internships help us to develop better work
habits, learn how to manage tasks and projects, and learn how to carry ourselves in a
professional environment. We can also learn from our colleagues by observing their positive
and negative work habits.

4.3 TECHNICAL OUTCOMES


We studied the basics of Java and MySQL, used software such as the Java Virtual Machine
(JVM), and learned how to run web applications. An application that runs on the server side
and creates dynamic pages is called a web application. Currently, technologies such as
Servlet, JSP, Struts, Spring, Hibernate and JSF are used for creating web applications in
Java.

4.4 NON-TECHNICAL OUTCOMES

• Developed analytical skills
• Developed problem solving skills
• Improvement in verbal and written communication skills
• Effective utilization of resources
• Effective time management
• Loss of stage fear
• Learnt the methods of writing technical articles
• Improvement in presentation skills
• Personality development
• Learnt the importance of team work and group discussions
• Better coordination and interaction with people
• Stress management
• Exposure to innovative ideas.

4.5 NON-TECHNICAL SKILLS

• Procedure to be followed to submit technical papers to several conferences
• Method of preparing technical papers
• IEEE format for technical papers
• Improved oral and written communication
• How to face the crowd
• Overall personality development
• How to interact with people
• Improved level of confidence
• Method of paper presentation
• Improved typing speed.

4.6 INTERPERSONAL SKILLS

Interpersonal skills influence business cultures because they affect job performance,
which in turn helps to decide the outcome of a company's success. Interpersonal skills include
interaction with others, good communication skills, listening skills and attitude. Companies
should realize that interpersonal skills are not learned in a classroom; rather they are
characteristics that an individual may possess naturally.

Interpersonal skills allow people to communicate effectively, handle conflicts and respond
to others' needs accordingly. In an ever-changing job market, these skills teach workers and
business owners how to be agile, solve complex problems, think critically on their feet and
manage diverse relationships both internally and externally. Measuring a potential
employee's ability to interact with others in a respectful and appropriate manner determines
how likely they are to thrive in a team-oriented environment.

Some of the major factors that make up a person's interpersonal skills are diplomacy,
helpfulness, optimism, influence and flexibility. Also vital are collaboration skills,
empathy, tolerance and frankness. These characteristics often align with corporate culture
as well as small business culture.

Ways to improve interpersonal skills include touring different sites, managing by walking
around, arranging lunches and corresponding consistently via phone or email. Having good
interpersonal skills promotes approachability, likability and comfort. Managers who possess
strong interpersonal skills motivate their staff to challenge themselves and do a better
job. Most importantly, they make workers feel as if they can go to their bosses with any
problems or concerns.

Both verbal and non-verbal interpersonal skills are extremely important when it comes to a
company's success. When you can speak to people in an articulate manner, you avoid
communication errors and are more likely to have happy customers. It is just as important to
maintain the correct tone of voice. Non-verbal communication consists of facial expressions,
hand gestures and body language. It can also determine whether or not your interaction
results in a satisfied customer. When you combine both verbal and non-verbal skills, the
result is a powerful demeanor that may help to determine the success of a company. In
addition, superb interpersonal skills encompass listening, problem-solving, decision-making
and negotiation skills. The ability to communicate internally with employees and coworkers
is just as important as building and maintaining solid relationships with customers.

4.7 PERSONALITY DEVELOPMENT

Personality development grooms individuals and helps them to make a mark for themselves in
society. Individuals need to have a style of their own for others to follow them. We should
not blindly copy others; we need to set an example for the people around us. Personality
development not only makes us look good and presentable but also helps us to face the world
with a smile.


Personality development goes a long way in reducing stress and conflicts. It encourages
individuals to look at the brighter side of life. Personality development helps us to
develop a positive attitude in life. An individual with a negative attitude finds a problem
in every situation. It is essential for individuals to behave well with the people around
them. Being polite with others not only makes an individual popular among other people, but
also helps to earn respect and pride.

Personality development helps us to differentiate between our personal and professional
lives. It is essential to keep a balance between the two in order to lead a peaceful and
stress-free life. Personality development helps an individual to inculcate positive
qualities like punctuality, a flexible attitude, willingness to learn, a friendly nature,
eagerness to help others and so on.


CHAPTER 5

CONCLUSION AND FUTURE WORK

In this internship we reviewed the methods used to identify spammers on Twitter. We have
also provided a taxonomy of Twitter spam detection approaches and classified them as fake
content detection, spam detection based on URLs, spam detection in trending topics and fake
user detection. We have compared the presented techniques on a number of characteristics,
such as user features, content features, graph features, structure features and time
features. In addition, the techniques were compared in terms of their stated objectives and
the data sets used. The study presented is expected to help researchers find information on
state-of-the-art Twitter spam detection techniques in a consolidated form.


REFERENCES

[1]. B. Erşahin, Ö. Aktaş, D. Kilinç, and C. Akyol, "Twitter fake account detection," in
Proc. Int. Conf. Comput. Sci. Eng. (UBMK), Oct. 2017, pp. 388–392.

[2]. F. Benevenuto, G. Magno, T. Rodrigues, and V. Almeida, "Detecting spammers on Twitter,"
in Proc. Collaboration, Electron. Messaging, AntiAbuse Spam Conf. (CEAS), vol. 6, Jul. 2010,
p. 12.

[3]. Runa Ganguli, Akash Mehta, and Soumya Sen, "A Survey on Machine Learning Methodologies
in Social Network Analysis," 2020 8th International Conference on Reliability, Infocom
Technologies and Optimization (Trends and Future Directions) (ICRITO), IEEE, 2020.

[4]. Khushabu Solanke and Mayura Kulkarni, "Survey on Spam Filtering Using Netspam
Framework," 2019 International Conference on Recent Advances in Energy-efficient Computing
and Communication (ICRAECC), IEEE, 2019.

[5]. Mehmet Sevi and İlhan Aydin, "Detection of Fake Twitter Accounts with Multiple
Classifier and Data Augmentation Technique," 2019 International Artificial Intelligence and
Data Processing Symposium (IDAP), IEEE, 2019.

[6]. Shivangi Gheewala and Rakesh Patel, "Machine Learning Based Twitter Spam Account
Detection: A Review," 2018 Second International Conference on Computing Methodologies and
Communication (ICCMC), IEEE, 2018.

[7]. Hao Xue, Fengjun Li, Hyunjin Seo, and Roseann Pluretti, "Trust-Aware Review Spam
Detection," 2015 IEEE Trustcom/BigDataSE/ISPA, vol. 1, IEEE, 2015.

Dept. of CSE, Sharnbasva University Kalaburgi 34
