
Restricting Unsolicited Approaches and Counterfeit Users: Batch No: 28 Guided by Done by

The document describes a project that aims to restrict unsolicited approaches and counterfeit users on social networks. The proposed system will retrieve data from Twitter using its API and analyze tweets using machine learning models like SVM. It will detect potentially fake content, URLs, hashtags and words. It will also determine the geolocation of users and analyze the sentiment of words in tweets. The system will then familiarize users with counterfeit accounts so they can block them or take other actions. Key modules include live stream data collection, tweet analysis, geolocation processing and result analysis comparison. The goal is to increase user immunity against hoaxes on social media.

Uploaded by

Hem Ramesh

RESTRICTING UNSOLICITED APPROACHES AND COUNTERFEIT USERS

BATCH NO: 28

GUIDED BY:
Mrs. S. MAHESHWARI, M.E., ASSISTANT PROFESSOR

DONE BY:
VIGNESH. V - 311416104079
VISHAL. P - 311416104083
YOGESH KUMAR. S - 311416104087

SUPERVISOR: Mrs. V. VIDHYA, M.E.
INTERNAL EXAMINER: Dr. C H. PRAMEELA DEVI, M.E., Ph.D
PROJECT COORDINATOR: Mrs. D. SUDHA, M.E., (Ph.D)
OUTLINE
• OBJECTIVE

• ABSTRACT

• LITERATURE SURVEY

• EXISTING SYSTEM

• PROPOSED SYSTEM

• MODULES

• SYSTEM REQUIREMENTS

• REFERENCES
OBJECTIVE

To create immunity against unsolicited approaches and hoaxes in social networks.
ABSTRACT

• Social networks have a huge impact on day-to-day life.

• Not all content on social media is legitimate or necessary.

• The required datasets are collected from the Online Social Network (OSN) and fed to the Machine Learning (ML) models.

• The ML models detect fake content and spam in the manner they were trained to.

• The geolocation of victim users is identified.

• Users are forewarned about those profiles and their content.


LITERATURE SURVEY
1. A Novel Stream Clustering Framework for Spammer Detection in Twitter
Author: Hadi Tajalizadeh, Reza Boostani
Year: 2019
Technique (or) Algorithm: Random Forest classification; Bayesian classification
Advantages: Efficient in assigning micro-clusters; quick evaluation
Disadvantages: Multi-density data; familiarization

2. A Neural Network-Based Ensemble Approach for Spam Detection in Twitter
Author: Sreekanth Madisetty and Maunendra Sankar Desarkar
Year: 2018
Technique (or) Algorithm: Neural network-based ensemble technique
Advantages: Larger data processing; scalability
Disadvantages: Time consumption; trained schemas

3. A Hybrid Approach for Detecting Automated Spammer in Twitter
Author: Mohd Fazil and Muhammad Abulaish
Year: 2018
Technique (or) Algorithm: Page ranking; graph partitioning
Advantages: Robust; reliable
Disadvantages: No log of spam

4. Co-evolution of Mobile Malware and Anti-Malware
Author: Sevil Sen, Emre Aydogan, Ahmet I. Aysan
Year: 2018
Technique (or) Algorithm: GP-based mobile malware detection
Advantages: Detection possible; dictionary updation
Disadvantages: Mysterious attacks; untraceable

5. NetSpam: a Network-based Spam Detection Framework for Reviews in Online Social Media
Author: Saeedreza Shehnepoor, Mostafa Salehi, Reza Farahbakhsh, Noel Crespi
Year: 2016
Technique (or) Algorithm: Semi-supervised learning; unsupervised learning
Advantages: Keyword-enhanced search; hybrid learning process
Disadvantages: No information filtering; media hypocrite
EXISTING SYSTEM
• Web crawlers are used to fetch datasets from the OSN.

• The collected datasets are processed with a data processing algorithm.

• A literature review methodology is used to verify the authenticity of the processed data.

• Spam-related content is then separated from the rest of the processed data.

• Random forest classification and a Bayesian network algorithm are used.
ARCHITECTURE - EXISTING SYSTEM

[Architecture diagram. Components: Web Crawler, Social Media Server, Database for storing data, Classifier Model. Flow labels: Requests for Data, Initiates Data Request, Serves Data, Fetches Data, Applies Data, Results on Spam content.]
ADVANTAGES:

• Efficient in identifying spam.

• Word-by-word review can be done.

• Efficient in assigning micro-clusters.

DISADVANTAGES:

• Insufficient flexibility in stream clustering.

• Users are not familiarized with spam.

• Space consumption is high.


PROPOSED SYSTEM
• The Twitter API retrieves the required data from the OSN as a dataset in CSV format, which is then processed using ML models.

• The Support Vector Machine (SVM) detects the content, URLs, hashtags, and words in the raw data. Those words are then stored.

• The geolocation of each user is found, along with the content of their tweets.

• A sentiment analysis technique is used to rank each word based on our preprocessed results.

• With the processed results, counterfeit accounts are made familiar to the user, who can block them or take other action.
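The detection step above works on URLs, hashtags, and words taken from the raw tweet text. A minimal sketch of that extraction follows, written in Python with only the standard library. The function name `extract_features`, the regular expressions, and the sample tweet are illustrative assumptions, not the project's actual code.

```python
import re

# Hypothetical patterns; real tweets may need more careful tokenization.
URL_RE = re.compile(r"https?://\S+")
HASHTAG_RE = re.compile(r"#(\w+)")
WORD_RE = re.compile(r"[a-z']+")


def extract_features(text):
    """Split raw tweet text into URLs, hashtags, and plain lowercase words."""
    urls = URL_RE.findall(text)
    # Remove URLs first so their fragments are not counted as words.
    without_urls = URL_RE.sub(" ", text)
    hashtags = [tag.lower() for tag in HASHTAG_RE.findall(without_urls)]
    # Remove hashtags so they are not counted a second time as words.
    without_tags = HASHTAG_RE.sub(" ", without_urls)
    words = WORD_RE.findall(without_tags.lower())
    return {"urls": urls, "hashtags": hashtags, "words": words}


# Made-up sample tweet for illustration.
sample = "Win a FREE phone now! http://example.com/win #giveaway #Free"
features = extract_features(sample)
print(features)
```

Keeping the three kinds of token separate lets each one be stored and fed to a model as its own feature group.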
ARCHITECTURE DIAGRAM - PROPOSED SYSTEM
[Architecture diagram. Components: Deployment platform, OSN Server, Machine Learning Models, Hash-tag, Lexical analysis, Ranking, Geolocation, Hoax. Flow labels: Request Data, Fetches Data.]
ADVANTAGES:
• Familiarization with hoax and counterfeit content.

• Higher space efficiency.

• Lower execution time for detecting spam.


MODULES

• LIVE STREAM DATA

• TWEET ANALYSIS

• GEOLOCATION BASED PROCESSING

• RESULT ANALYSIS COMPARISON


LIVE STREAM DATA

• The OSN holds all the data, which is fetched from the server using the API.

• Each record falls under several parameters, such as tweet.fields, user.fields, search.fields, and tweet.location.

• We omit the unwanted data; tweet.fields is used for training the model.

• Processing this data with the Support Vector Machine and several other machine learning algorithms gives a result for each tweet.

The data from the Twitter API reaches us in CSV format with predefined parameters.
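As a rough illustration of the "omit the unwanted data" step, the Python sketch below reads a CSV export and keeps only a few columns. The column names (`tweet_id`, `text`, `location`) and the sample rows are hypothetical; the real export depends on which fields were requested from the API.

```python
import csv
import io

# Hypothetical column names; the real export depends on the fields requested.
KEEP_COLUMNS = ("tweet_id", "text", "location")


def load_tweets(csv_file, keep=KEEP_COLUMNS):
    """Read tweet rows from a CSV file object, keeping only the wanted columns."""
    reader = csv.DictReader(csv_file)
    rows = []
    for row in reader:
        text = (row.get("text") or "").strip()
        if not text:
            continue  # omit rows that carry no tweet text
        rows.append({name: (row.get(name) or "").strip() for name in keep})
    return rows


# Made-up sample data standing in for the API export.
SAMPLE_CSV = """tweet_id,text,location,retweet_count,lang
1,Claim your prize today,Chennai,0,en
2,,Mumbai,3,en
3,Lovely weather this morning,,1,en
"""

tweets = load_tweets(io.StringIO(SAMPLE_CSV))
for tweet in tweets:
    print(tweet)
```

Passing a file object rather than a path keeps the function usable with both saved CSV files and in-memory data.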
TWEET ANALYSIS

• The nature of a tweet (positive, negative, or neutral) is determined by analyzing the words it contains.

• The words we use on a regular basis, which include positive, negative, and neutral words, are carefully reviewed.

• These words are then stored in the database for analysis.

• The most important senti-features are used, and the analytics provide details on each tweet.

Using the Support Vector Machine algorithm, the large amount of data in the dataset was analyzed and processed, then plotted with the trained model.

• Initial SVM plot with fewer trials on the data: only positive and negative content was found by the model in this trial.

• SVM plot with larger data after more trials: the model is able to find all positive, negative, and neutral content.
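The slides name a Support Vector Machine but do not show how it was trained. As a rough illustration only, the Python sketch below trains a two-class (positive/negative) linear SVM by subgradient descent on the regularized hinge loss, using word counts as features. The training sentences, hyperparameters, and function names are made up. The sketch has no bias term and no neutral class, and it is not the project's model.

```python
from collections import Counter


def bag_of_words(text):
    """Turn text into a sparse word-count vector."""
    return Counter(text.lower().split())


def train_linear_svm(samples, epochs=50, lr=0.1, lam=0.01):
    """Fit word weights by subgradient descent on the regularized hinge loss.

    samples: list of (text, label) pairs, label +1 (positive) or -1 (negative).
    """
    weights = {}
    for _ in range(epochs):
        for text, label in samples:
            x = bag_of_words(text)
            margin = label * sum(weights.get(w, 0.0) * c for w, c in x.items())
            # L2 regularization shrinks every weight slightly on each step.
            for w in weights:
                weights[w] *= 1.0 - lr * lam
            # Inside the margin or misclassified: move toward the sample.
            if margin < 1.0:
                for w, c in x.items():
                    weights[w] = weights.get(w, 0.0) + lr * label * c
    return weights


def predict(weights, text):
    """Return +1 for positive text and -1 otherwise."""
    score = sum(weights.get(w, 0.0) * c for w, c in bag_of_words(text).items())
    return 1 if score > 0 else -1


# Toy training data, invented for this sketch.
TRAINING = [
    ("great happy love", 1),
    ("love great day", 1),
    ("fake scam hate", -1),
    ("hate scam spam", -1),
]

model = train_linear_svm(TRAINING)
print(predict(model, "great love"), predict(model, "scam hate"))
```

A three-way positive/negative/neutral split, as in the later plot, would need an extension such as one-vs-rest training, which is left out here for brevity.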
GEOLOCATION BASED PROCESSING

• The precise location of real-time, country-level tweets is tracked by the geolocation system.

• This methodology enables us to perform a complete analysis of a tweet's geolocation.

• It gives scope for revealing the best approaches for an accurate country-level location classifier.

• Identifying spam over a region can be done easily.

With the help of the Google API, the precise location of the user is detected, and top-level parameters such as CITY, STATE, and COUNTRY are saved for processing.
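A minimal Python sketch of saving CITY, STATE, and COUNTRY is shown below. It assumes a response shaped like the JSON returned by Google's Geocoding API (a `results` list whose entries carry `address_components` with `types`). The sample response is hand-written, and the function name and type-to-field mapping are illustrative assumptions. No network call is made.

```python
import json

# Assumed mapping from component types in the response to the saved fields.
TYPE_TO_FIELD = {
    "locality": "CITY",
    "administrative_area_level_1": "STATE",
    "country": "COUNTRY",
}


def top_level_location(response):
    """Pull CITY, STATE, and COUNTRY out of a geocoding response dict."""
    location = {"CITY": None, "STATE": None, "COUNTRY": None}
    results = response.get("results") or []
    if response.get("status") != "OK" or not results:
        return location  # nothing usable; leave all fields empty
    for component in results[0].get("address_components", []):
        for kind in component.get("types", []):
            field = TYPE_TO_FIELD.get(kind)
            if field and location[field] is None:
                location[field] = component["long_name"]
    return location


# Hand-written sample response for illustration only.
SAMPLE = json.loads("""
{
  "status": "OK",
  "results": [
    {
      "address_components": [
        {"long_name": "Chennai", "short_name": "Chennai",
         "types": ["locality", "political"]},
        {"long_name": "Tamil Nadu", "short_name": "TN",
         "types": ["administrative_area_level_1", "political"]},
        {"long_name": "India", "short_name": "IN",
         "types": ["country", "political"]}
      ]
    }
  ]
}
""")

print(top_level_location(SAMPLE))
```

Returning empty fields instead of raising on a failed lookup lets the rest of the pipeline continue for tweets whose location cannot be resolved.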
RESULT ANALYSIS COMPARISON

• Sentiment analysis (Emotion AI) refers to the use of Natural Language Processing and text analysis.

• It is used to identify, extract, quantify, and study affective states and subjective information.

• The polarity of the expressions in the datasets is recognized by this processing method.

• The effective and familiar hoax words in existence are recognized by applying it.
SENTI-WORD CALL

• In this module, the data to be analyzed is scored by calling the senti-word file.

• The senti-word file is a scoring file defined according to the uniqueness of the content.

• A score is generated for each individual word, and this score is used to generate the analysis result.
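The per-word scoring described above can be sketched in Python as follows. The slides do not give the layout of the senti-word file, so the tab-separated `word<TAB>score` format, the sample entries, and the function names here are assumptions made for illustration.

```python
import io


def load_senti_words(lines):
    """Parse 'word<TAB>score' lines into a dict; lines starting with '#' are skipped."""
    lexicon = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        word, score = line.split("\t")
        lexicon[word.lower()] = float(score)
    return lexicon


def score_words(text, lexicon):
    """Return (word, score) pairs; words missing from the file score 0.0."""
    return [(word, lexicon.get(word, 0.0)) for word in text.lower().split()]


# Made-up file contents standing in for the real senti-word file.
SENTI_FILE = io.StringIO("# word\tscore\ngood\t0.5\nfree\t-0.25\nscam\t-1.0\n")

lexicon = load_senti_words(SENTI_FILE)
pairs = score_words("Free prize scam", lexicon)
total = sum(score for _, score in pairs)
print(pairs, total)
```

Here the per-word scores are simply summed into one tweet-level score; other aggregations, such as an average, would fit the same structure.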
FLOWCHART
[Flowchart. Nodes: Start, Trained data, Tweet Post, TWEETS, Sentiment analysis, Data Sets from Tweet, Polarity, Processing Data Sets, Display result, End.]
USER FAMILIARIZATION

• The final score is generated from the result of the comparison with the senti-word file.

• The scores are compared against a threshold to classify the tweet as negative, positive, or neutral.

• After the results, positive and neutral content is forwarded, and negative content is measured on the server side.

• The tweets that are found are then made familiar to the user.
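The threshold step and the forwarding rule above can be sketched in Python as below. The threshold value of 0.1, the function names, and the sample scores are hypothetical; the slides do not state the actual threshold.

```python
def label_tweet(final_score, threshold=0.1):
    """Map a final score to 'positive', 'negative', or 'neutral'."""
    if final_score > threshold:
        return "positive"
    if final_score < -threshold:
        return "negative"
    return "neutral"


def route(scored_tweets, threshold=0.1):
    """Split (text, score) pairs into forwarded tweets and flagged ones.

    Positive and neutral tweets are forwarded; negative tweets are held
    back so they can be examined server-side and shown to the user.
    """
    forwarded, flagged = [], []
    for text, score in scored_tweets:
        if label_tweet(score, threshold) == "negative":
            flagged.append(text)
        else:
            forwarded.append(text)
    return forwarded, flagged


# Made-up scores for illustration.
SCORED = [
    ("Lovely weather this morning", 0.5),
    ("Click here to claim your prize", -0.75),
    ("Meeting at noon", 0.0),
]

forwarded, flagged = route(SCORED)
print("forwarded:", forwarded)
print("flagged:", flagged)
```

Using a symmetric band around zero for "neutral" is one simple choice; separate positive and negative cut-offs would work with the same routing logic.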


SYSTEM REQUIREMENTS
HARDWARE REQUIREMENTS:
• Processor - Pentium III
• RAM - 4 GB
• Hard Disk - 260 GB
• Monitor - SVGA

SOFTWARE REQUIREMENTS:
• Operating System - Windows 95/98/2000/XP
• Front End - HTML, Java, JSP
• Scripts - JavaScript
• Server side Script - Java Server Pages
• Database - MySQL
• Database Connectivity - JDBC
REFERENCES
[1] C. Chen et al., “A performance evaluation of machine learning-based streaming spam tweets detection,” IEEE Trans. Comput. Social Syst., vol. 2, no. 3, pp. 65–76, Sep. 2015.
[2] C. Yang, R. Harkreader, and G. Gu, “Empirical evaluation and new design for fighting evolving Twitter spammers,” IEEE Trans. Inf. Forensics Security, vol. 8, no. 8, pp. 1280–1293, Aug. 2013.
[3] O. Kurasova, V. Marcinkevicius, V. Medvedev, A. Rapecka, and P. Stefanovic, “Strategies for big data clustering,” in Proc. IEEE 26th Int. Conf. Tools Artif. Intell., Nov. 2014, pp. 740–747.
[4] S. Sedhai and A. Sun, “Semi-supervised spam detection in Twitter stream,” IEEE Trans. Comput. Social Syst., vol. 5, no. 1, pp. 169–175, Mar. 2018.
[5] F. Benevenuto, G. Magno, T. Rodrigues, and V. Almeida, “Detecting spammers on Twitter,” in Proc. 7th Annu. Collaboration, Electron. Messaging, Anti Abuse Spam Conf., Redmond, WA, USA, Jul. 2010.

API-REFERENCE:
TWITTER: https://developer.twitter.com/en/docs/tweets/search/api-reference/get-search-tweets
GOOGLE MAPS: https://maps.googleapis.com/maps/api/js?key=AIzaSyBOU-GKNx-YL5o-b8cvlqgyn0rso6iQtUk&callback=showlocation
THANK
YOU
