Project Report
DETECTION
A PROJECT REPORT
Submitted by
KARTHIKA.M(412517205034)
PRAVEENA.B(412517205055)
Submitted to
FACULTY OF INFORMATION
TECHNOLOGY
of
BACHELOR OF TECHNOLOGY
IN
INFORMATION TECHNOLOGY
ANNA UNIVERSITY: CHENNAI 600 025
BONAFIDE CERTIFICATE
supervision.
SIGNATURE SIGNATURE
Dr.T.Sheela, V.K.G.Kalaiselvi
Submitted for the Anna University Project Examination held on ________ Sri Sairam
Engineering College(Autonomous), Chennai – 600 044
ACKNOWLEDGEMENT
We express our gratitude to our respected Founder and Chairman, Late Thiru. MJF.
Ln. LEO MUTHU, who always had the vision that change can be made only through
education, which fulfils all needs. His constant blessings are the driving force that
enabled us to complete the project work successfully.
We express our gratitude to our CEO Mr. J. SAI PRAKASH LEO MUTHU for his
constant encouragement in completing this project.
We express our thanks to our Principal Dr.A.RAJENDRA PRASAD for having given
us his spontaneous and whole-hearted encouragement for completing this project.
We are indebted to our Head of the Department Dr. T. SHEELA for her constant
support during the entire course of this project work.
Finally, we would like to thank all the staff members of the Department of Information
Technology and all others who contributed directly or indirectly to the successful
completion of our project.
We thank the Almighty for the blessings that made this project a success.
ABSTRACT
With the growth of Internet technology, social media has gained great popularity as a
medium of interaction. As social media conversation increases, so do aggressive and
intentional acts carried out through digital communication, such as sending abusive
messages and posting immoral comments. The user base of social media platforms such as
Facebook and Twitter has grown many times over in the recent past. There has also been a
sharp increase in unstructured data in the form of reviews and comments on online portals.
With this volume of data growing, there is considerable scope to use big data technologies
to analyse it. In our project, we propose to handle unstructured data using machine learning
algorithms, and we use these technologies to address cyberbullying on Twitter. In the proposed
system, a random forest regressor, a machine learning algorithm, is trained on a dataset
consisting of a collection of abusive comments. By accurately predicting abusive comments,
the system can be used by social media platforms to block them, thus improving the
experience of using and accessing social media platforms.
LIST OF FIGURES
FIGURE NO.    NAME OF THE FIGURE
1.1 Machine learning architecture
ML Machine Learning
SARSA State-Action-Reward-State-Action
TABLE OF CONTENTS
CHAPTER NO.    TITLE
ABSTRACT
LIST OF FIGURES
1 INTRODUCTION
1.1 GENERAL
1.3 OBJECTIVE
2 LITERATURE SURVEY
2.1 INTRODUCTION
3.4 ADVANTAGES
3.5 APPLICATIONS
4 SYSTEM DESIGN
4.2 WORKING
Machine learning plays an important role in cybersecurity and online fraud detection.
Because of growing online monetary fraud, companies like PayPal have started using
machine learning techniques for protection against money laundering. The prediction
problem in fraud detection can be divided into two types: classification and
regression. Some of the most widely used machine learning approaches for this type of
prediction problem are Logistic Regression, Decision Tree, Random Forest, and Neural Networks.
Modern machine learning has two objectives: one is to classify data based on
models that have been developed, and the other is to make predictions of future
outcomes based on these models. A hypothetical algorithm for classifying data might
apply computer vision to images of moles, coupled with supervised learning, in order to
classify cancerous moles, whereas a machine learning algorithm for stock trading might
inform the trader of likely future movements.
Machine Learning is the field of study that gives computers the capability to learn
without being explicitly programmed. ML is one of the most exciting technologies that one
may come across. As is evident from the name, it gives the computer the ability that
makes it more similar to humans: the ability to learn. Machine learning is actively being
used today, perhaps in many more places than one would expect.
A subset of machine learning is closely related to computational statistics, which
focuses on making predictions using computers; but not all machine learning is statistical
learning. The study of mathematical optimization delivers methods, theory and application
domains to the field of machine learning. Data mining is a related field of study, focusing
on exploratory data analysis through unsupervised learning. In its application across business
problems, machine learning is also referred to as predictive analytics.
Machine learning involves computers discovering how they can perform tasks without
being explicitly programmed to do so. It involves computers learning from data provided so
that they carry out certain tasks. For simple tasks assigned to computers, it is possible to
program algorithms telling the machine how to execute all steps required to solve the
problem at hand; on the computer's part, no learning is needed.
The machine learning field is continuously evolving, and with that evolution comes a
rise in demand and importance. There is one crucial reason why data scientists need machine
learning: ‘high-value predictions that can guide better decisions and smart actions
in real time without human intervention.’
Machine learning as a technology helps analyze large volumes of data, easing the tasks of
data scientists through automation, and it is gaining prominence and recognition.
It has changed the way data extraction and interpretation work by introducing
automatic sets of generic methods that have replaced traditional statistical techniques.
For more advanced tasks, it can be challenging for a human to manually create the
needed algorithms. In practice, it can turn out to be more effective to help the machine
develop its own algorithm, rather than having human programmers specify every needed
step.
Figure 1.1 Machine learning architecture
1.2.1.1 Machine Learning approaches
Machine learning approaches are traditionally divided into three broad categories,
depending on the nature of the "signal" or "feedback" available to the learning system:
Supervised learning
A supervised learning algorithm analyses the training data (a set of training examples) and
produces a correct outcome from labelled data.
Supervised learning is where there are input variables (X) and an output variable (Y)
and an algorithm is used to learn the mapping function from the input to the output:
Y = f(X)
The goal is to approximate the mapping function so well that when there is new input
data (X), the output variable (Y) for that data can be predicted.
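The mapping Y = f(X) can be illustrated with a minimal sketch. The toy data, the meaning of the two features and the choice of scikit-learn's DecisionTreeClassifier are assumptions made only for illustration; they are not taken from the project's dataset.

```python
from sklearn.tree import DecisionTreeClassifier

# Hypothetical labelled training data: each row of X_train is an input,
# and y_train holds the known output (label) for that input.
X_train = [[1, 2], [2, 1], [8, 9], [9, 8]]
y_train = [0, 0, 1, 1]

# Learn an approximation of the mapping function f from the labelled examples.
model = DecisionTreeClassifier(random_state=0)
model.fit(X_train, y_train)

# Predict the output Y for new, previously unseen inputs X.
X_new = [[1, 1], [9, 9]]
y_pred = model.predict(X_new)
print(list(y_pred))
```

The model never sees X_new during training; it predicts the labels from the mapping it approximated on the labelled examples.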
Unsupervised learning
Unsupervised learning algorithms take a set of data that contains only inputs, and find
structure in the data, like grouping or clustering of data points. The algorithms, therefore,
learn from data that has not been labeled, classified or categorized. Instead of responding
to feedback, unsupervised learning algorithms identify commonalities in the data and react
based on the presence or absence of such commonalities in each new piece of data. A central
application of unsupervised learning is in the field of density estimation in statistics, such as
finding the probability density function, though unsupervised learning also encompasses other
domains involving summarizing and explaining data features.
Unsupervised learning is the training of a machine using information that is neither classified
nor labeled, allowing the algorithm to act on that information without guidance. Here the
task of the machine is to group unsorted information according to similarities, patterns and
differences without any prior training on the data. Unlike supervised learning, no teacher is
provided, which means no labelled examples are given to the machine. Therefore, the machine is
left to find the hidden structure in unlabeled data by itself.
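Grouping unlabeled data by similarity can be sketched with k-means clustering. The points below and the use of scikit-learn's KMeans are illustrative assumptions; the report itself does not specify a clustering method.

```python
import numpy as np
from sklearn.cluster import KMeans

# Hypothetical unlabeled inputs: two visibly separate groups of 2-D points.
points = np.array([
    [1.0, 1.0], [1.2, 0.8], [0.9, 1.1],
    [8.0, 8.0], [8.2, 7.9], [7.9, 8.1],
])

# No labels are supplied; the algorithm only receives the inputs
# and the number of groups to look for.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
cluster_ids = kmeans.fit_predict(points)
print(cluster_ids)
```

The numeric cluster ids are arbitrary; what matters is that points in the same group receive the same id.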
Semi-supervised Learning
Reinforcement learning
reward. Due to its generality, the field is studied in many other disciplines, such as game
theory, control theory, operations research, information theory, simulation-based
optimization, multi-agent systems, swarm intelligence, statistics and genetic algorithms. In
machine learning, the environment is typically represented as a Markov decision
process (MDP). Many reinforcement learning algorithms use dynamic
programming techniques. Reinforcement learning algorithms do not assume knowledge of an
exact mathematical model of the MDP, and are used when exact models are infeasible.
Reinforcement learning algorithms are used in autonomous vehicles or in learning to play a
game against a human opponent.
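As a rough illustration of learning without an exact model of the MDP, the sketch below runs tabular Q-learning on a tiny corridor. The environment, the reward scheme and the hyperparameters are all hypothetical, and Q-learning is only one of many reinforcement learning algorithms.

```python
import random

N_STATES = 5                 # states 0..4; state 4 is the terminal goal
ACTIONS = (-1, +1)           # index 0 moves left, index 1 moves right
ALPHA, GAMMA, EPSILON = 0.5, 0.9, 0.2

def step(state, move):
    """Environment dynamics; the agent only observes the result, not this function."""
    next_state = min(max(state + move, 0), N_STATES - 1)
    reward = 1.0 if next_state == N_STATES - 1 else 0.0
    return next_state, reward

def train(episodes=500, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        while state != N_STATES - 1:
            # Explore with probability EPSILON (or when both actions look equal),
            # otherwise take the action with the higher estimated value.
            if rng.random() < EPSILON or q[state][0] == q[state][1]:
                a = rng.randrange(2)
            else:
                a = 0 if q[state][0] > q[state][1] else 1
            next_state, reward = step(state, ACTIONS[a])
            # Q-learning update: move the estimate toward reward + discounted best next value.
            target = reward + GAMMA * max(q[next_state])
            q[state][a] += ALPHA * (target - q[state][a])
            state = next_state
    return q

q_table = train()
policy = ["left" if q[0] > q[1] else "right" for q in q_table[:-1]]
print(policy)
```

The agent is never given the transition rules; it estimates action values purely from the rewards it observes while interacting with the environment.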
Feature Learning
Several learning algorithms aim at discovering better representations of the inputs
provided during training. Classic examples include principal components analysis and
cluster analysis. Feature learning algorithms, also called representation learning algorithms,
often attempt to preserve the information in their input but also transform it in a way that
makes it useful, often as a pre-processing step before performing classification or predictions.
This technique allows reconstruction of the inputs coming from the unknown data-generating
distribution, while not being necessarily faithful to configurations that are implausible under
that distribution. This replaces manual feature engineering, and allows a machine to both
learn the features and use them to perform a specific task.
Feature learning is motivated by the fact that machine learning tasks such as
classification often require input that is mathematically and computationally convenient to
process. However, real-world data such as images, video, and sensory data has not yielded to
attempts to algorithmically define specific features. An alternative is to discover such features
or representations through examination, without relying on explicit algorithms.
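Principal components analysis, named above as a classic example, can be sketched as follows. The synthetic data and the use of scikit-learn's PCA are assumptions for illustration only.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
hidden = rng.normal(size=200)

# Hypothetical 3-D inputs that mostly vary along one hidden direction,
# plus a small amount of noise.
X = np.column_stack([hidden, 2 * hidden, -hidden]) + 0.01 * rng.normal(size=(200, 3))

# Learn a 1-D representation that preserves most of the information in X.
pca = PCA(n_components=1)
features = pca.fit_transform(X)
explained = pca.explained_variance_ratio_[0]

print(features.shape)
print(round(float(explained), 4))
```

The learned one-dimensional feature can then be passed to a classifier or regressor in place of the original three columns.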
1.3 OBJECTIVE
• To effectively develop a system to help social media platforms detect and identify abusive
comments.
• To make use of random forest regressor, a regression machine learning algorithm for
accurate detection of abusive comments.
CHAPTER -2
LITERATURE SURVEY
2.1 INTRODUCTION
The following table summarizes a survey of existing work related to abuse and spam
detection on social media, together with popular existing techniques used for power forecasting.
Title [Year] | Authors | Technique | Outcome | Limitation
Long-Term Retail Energy Forecasting With Consideration of Residential Customer Attrition [2017] | Jingrui Xie (SAS Institute, Cary, NC, USA); Tao Hong; Joshua Stroud | Makes use of regression algorithms for forecasting | The proposed methodology has been implemented and the results are very accurate. | The system only focuses on the forecasting in residential aspects of energy.
Probabilistic Load Forecasting via Quantile Regression Averaging on Sister Forecasts [2018] | Bidong Liu; Jakub Nowotarski; Tao Hong; Rafał Weron | Makes use of sister point forecast method machine learning algorithms. | Compared with several benchmark methods, the proposed approach leads to dominantly better performance as measured by the pinball loss function and the Winkler score. | The system does not focus on power forecasting.
An attention-based unsupervised adversarial model for review spam detection [2020] | Yuan Gao; Maoguo Gong; Yu Xie; A. K. Qin | Makes use of unsupervised model | Identifies spam movie reviews | Not very accurate
• The existing system focuses on identifying offensive comments; it is very slow, and a
high rate of false positives is observed.
• The system is not very accurate.
• If a categorical variable has a category in the test dataset that was not observed in the
training dataset, the model assigns it zero probability and is unable
to make a prediction. This is often known as the zero-frequency problem.
CHAPTER - 3
SYSTEM ANALYSIS
3.1 EXISTING SYSTEM
With the prevalence of the Internet, online reviews have become a valuable
information resource for people. However, the authenticity of online reviews remains a
concern, and deceptive reviews have become one of the most urgent network security
problems to be solved. Review spam misleads users into making suboptimal choices and
damages their trust in online reviews. Most existing research relies on manually extracted
features and manually labeled training samples, which are usually complicated and
time-consuming to produce.
The existing system focuses primarily on a neglected emerging domain, movie reviews, and develops
a novel unsupervised spam detection model with an attention mechanism. By extracting the
statistical features of reviews, it reveals that users express their sentiments on
different aspects of movies in their reviews. An attention mechanism is introduced in the review
embedding, and a conditional generative adversarial network is exploited to learn users’
review style for different genres of comments. The experimental results reported for that
approach demonstrate its superior performance.
3.2 DISADVANTAGES OF EXISTING SYSTEM
• The existing system focuses on identifying offensive comments; it is very slow, and a
high rate of false positives is observed.
• The system is not very accurate.
• Naive Bayes assumes that all predictors (or features) are independent, which rarely
happens in real life. Its estimates can therefore be wrong in some cases, so its probability
outputs cannot be taken at face value.
• The algorithm faces the 'zero-frequency problem', where it assigns zero
probability to a categorical variable whose category in the test dataset wasn't
available in the training dataset.
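The zero-frequency problem can be shown with a small worked example. The word list is hypothetical, and the additive (Laplace) smoothing shown alongside is a commonly used remedy included only for contrast; it is not described in this report.

```python
from collections import Counter

# Hypothetical training words observed in abusive comments.
train_words_abusive = ["idiot", "stupid", "idiot", "loser"]

# The vocabulary also contains "moron", which never appears in the training words.
vocabulary = {"idiot", "stupid", "loser", "moron"}

counts = Counter(train_words_abusive)
total = len(train_words_abusive)

def likelihood(word, alpha):
    """Estimate P(word | abusive) with additive smoothing strength alpha."""
    return (counts[word] + alpha) / (total + alpha * len(vocabulary))

# Without smoothing, the unseen word gets probability 0 / 4, which zeroes out
# any product of likelihoods that includes it.
p_unsmoothed = likelihood("moron", alpha=0)

# With alpha = 1, the estimate becomes (0 + 1) / (4 + 1 * 4).
p_smoothed = likelihood("moron", alpha=1)

print(p_unsmoothed, p_smoothed)
```

Because Naive Bayes multiplies per-feature likelihoods, a single zero estimate forces the whole class probability to zero, which is why the unseen category prevents a meaningful prediction.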
recent past. There has also been a sharp increase in unstructured data in the form of reviews and
comments on online portals. With this volume of data growing, there is considerable scope to use
big data technologies to analyse it. In our project, we propose to handle unstructured data
using machine learning algorithms, and we use these technologies to address cyberbullying on
Twitter. In the proposed system, a random forest regressor, a machine learning algorithm, is
trained on a dataset consisting of a collection of abusive comments. By accurately
predicting abusive comments, the system can be used by social media platforms to
block them, thus improving the experience of using and accessing
social media platforms.
3.4 ADVANTAGES OF PROPOSED SYSTEM:
⮚ A technological solution for detecting abusive comments on social media, with a low rate of
false positives.
⮚ A random forest regressor is used, which provides accurate detection.
3.5 APPLICATIONS
CHAPTER-4
SYSTEM DESIGN
DETAILED DESIGN OF THE PROJECT:
This chapter describes the overall and the detailed architectural design. It also describes
each module that is to be implemented along with Data Flow diagram.
4.1 ARCHITECTURE DIAGRAM:
4.2 WORKING:
The aim of this project is to investigate and implement algorithms that can
detect and identify abusive comments on social media. Data mining techniques and
machine learning algorithms can be used for the prediction and detection of abusive
comments. The initial step is the collection, from the internet, of different datasets of
abusive comments posted on social media; these datasets are then aligned to a common form.
Next, the data is separated into training and testing sets: the
training set is used to train the model, and the testing set is used to evaluate
it. Dataset pre-processing is then carried out, which brings all the datasets into a
specific category. Several regression algorithms in machine learning can be used to develop
an abusive comment detection model, such as the random forest regressor. In the
proposed system, a random forest regressor, a regression machine learning algorithm, is
trained on the dataset of abusive comments for detection and identification. Accurately
detecting and marking abusive comments on social media can be very helpful
in improving the social media experience.
A data set is a collection of data. Machine learning has become the go-to method for
solving many challenging real-world problems, and it is among the best performing
methods for prediction tasks. Machine learning models that work this well
need fuel, and lots of it; that fuel is data. The more labelled data available, the better a
model tends to perform. The idea of more data leading to better performance has even been
explored at large scale by Google with a dataset of 300 million images. When deploying a machine
learning model in a real-world application, data must be constantly fed to it to continue improving
its performance. In the machine learning era, data is arguably the most
valuable resource. There are three steps of collecting data.
Regression. The algorithm is expected to yield a numeric value. For example, if you spend too
much time coming up with the right price for your product because it depends on many factors,
regression algorithms can aid in estimating this value.
Ranking. Some machine learning algorithms just rank objects by a number of features.
Ranking is actively used to recommend movies in video streaming services or show the
products that a customer might purchase with a high probability based on his or her previous
search and purchase activities.
generator. For our dataset, we split the data into training and testing sets in an 80:20 ratio,
with the random state set to 0.
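The 80:20 split with a random state of 0 can be sketched with scikit-learn's train_test_split, which the source code in Chapter 6 imports. The ten placeholder tweets and labels below are hypothetical stand-ins for the real dataset.

```python
from sklearn.model_selection import train_test_split

# Hypothetical stand-in data: ten short texts with alternating labels.
tweets = [f"tweet {i}" for i in range(10)]
labels = [i % 2 for i in range(10)]

# test_size=0.2 keeps 20% of the rows for testing; random_state=0 fixes the
# shuffling so that the same split is produced on every run.
X_train, X_test, y_train, y_test = train_test_split(
    tweets, labels, test_size=0.2, random_state=0
)

print(len(X_train), len(X_test))
```

Fixing the random state makes the evaluation repeatable, since the same rows end up in the test set each time the code is run.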
Figure 4.4 Dataset preprocessing
4.3.4 TRAINING WITH ALGORITHM
The random forest regressor is the regression machine learning algorithm used to train on the
dataset consisting of abusive comments in social media. Machine learning models require a lot
of data in order to perform well. Usually, when training a machine learning model,
one needs to collect a large, representative sample of data for the training set. Data in the
training set can be as varied as a corpus of text, a collection of images, or data collected
from individual users of a service. Overfitting is something to watch out for when training a
machine learning model. Models trained on biased data can produce skewed or
undesired predictions; algorithmic bias is a potential result of data not being fully prepared
for training.
Random forest Algorithm
Random forest is a flexible, easy to use machine learning algorithm that produces,
even without hyper-parameter tuning, a great result most of the time. It is also one of the
most used algorithms, because of its simplicity and diversity (it can be used for both
classification and regression tasks).
Random forest is a supervised learning algorithm. The "forest" it builds is an ensemble
of decision trees, usually trained with the “bagging” method. The general idea of the bagging
method is that a combination of learning models improves the overall result. Random forest
builds multiple decision trees and merges them together to get a more accurate and stable
prediction. Technically, it is an ensemble method (based on the divide-and-conquer approach)
of decision trees generated on a randomly split dataset. This collection of decision
trees is also known as the forest.
One big advantage of random forest is that it can be used for both classification and
regression problems, which form the majority of current machine learning systems. Random
forest has nearly the same hyperparameters as a decision tree or a bagging classifier.
Fortunately, there's no need to combine a decision tree with a bagging classifier because you
can easily use the classifier-class of random forest.
Below you can see what a random forest would look like with two trees:
Figure 4.5 Random forest regressor
2. It will construct a Decision Tree for each sample and get a prediction result from each
Decision Tree.
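The bagging idea behind a random forest regressor can be sketched by hand: draw bootstrap samples, fit one decision tree per sample, and average the trees' predictions. The synthetic data, the number of trees and the tree depth are assumptions for illustration; a full random forest additionally samples features at each split, which this sketch omits.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)

# Hypothetical data: one feature in [0, 10); the target is 1.0 when it exceeds 5.
X = rng.uniform(0, 10, size=(200, 1))
y = (X[:, 0] > 5).astype(float)

trees = []
for _ in range(25):
    # Bootstrap sample: draw row indices with replacement.
    idx = rng.integers(0, len(X), size=len(X))
    tree = DecisionTreeRegressor(max_depth=3, random_state=0)
    trees.append(tree.fit(X[idx], y[idx]))

def forest_predict(samples):
    """Average the predictions of all trees, as a regression forest does."""
    return np.mean([t.predict(samples) for t in trees], axis=0)

scores = forest_predict(np.array([[1.0], [9.0]]))
print(scores)
```

In practice, scikit-learn's RandomForestRegressor and RandomForestClassifier wrap these steps in a single estimator.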
CHAPTER 5
SOFTWARE DESCRIPTION
5.1 Jupyter notebook
In this project, Jupyter Notebook is used as the IDE.
In this context, "notebook" or "notebook documents" denote documents that contain both
code and rich text elements, such as figures, links, and equations. Because of the mix of code
and text elements, these documents are the ideal place to bring together an analysis
description and its results, and they can be executed to perform the data analysis in real
time.
At some point, we all need to show our work. Most programming work is shared either
as raw source code or as a compiled executable. The source code provides complete
information, but in a way that’s more “tell” than “show.” The executable shows us what the
software does, but even when shipped with the source code it can be difficult to grasp exactly
how it works.
A notebook integrates code and its output into a single document that combines
visualizations, narrative text, mathematical equations, and other rich media. In other words:
it's a single document where you can run code, display the output, and also add explanations,
formulas, charts, and make your work more transparent, understandable, repeatable, and
shareable.
Using Notebooks is now a major part of the data science workflow at companies across
the globe. If your goal is to work with data, using a Notebook will speed up your workflow
and make it easier to communicate and share your results.
Imagine being able to view the code and execute it in the same UI, so that you could
make changes to the code and view the results of those changes instantly, in real time? That’s
just what Jupyter Notebook offers.
Jupyter Notebook was created to make it easier to show one’s programming work, and
to let others join in. Jupyter Notebook allows you to combine code, comments, multimedia,
and visualizations in an interactive document — called a notebook, naturally — that can be
shared, re-used, and re-worked.
And because Jupyter Notebook runs via a web browser, the notebook itself could be
hosted on your local machine or on a remote server.
One major feature of the Jupyter notebook is the ability to display plots that are the
output of running code cells. The IPython kernel is designed to work seamlessly with the
matplotlib plotting library to provide this functionality. Specific plotting
library integration is a feature of the kernel.
Each .ipynb file is one notebook, so each time you create a new notebook, a
new .ipynb file will be created.
Each .ipynb file is a text file that describes the contents of your notebook in a format
called JSON. Each cell and its contents, including image attachments that have been
converted into strings of text, is listed therein along with some metadata.
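Because a notebook file is plain JSON, it can be inspected with Python's standard json module. The minimal two-cell notebook below is a hypothetical example built in memory, with its contents invented for illustration; real notebooks carry more metadata.

```python
import json

# A hypothetical, minimal notebook in the .ipynb JSON layout: a list of cells
# plus notebook-level metadata and format version numbers.
notebook_text = json.dumps({
    "cells": [
        {"cell_type": "markdown", "metadata": {},
         "source": ["# Abusive comment detection"]},
        {"cell_type": "code", "metadata": {}, "execution_count": None,
         "outputs": [], "source": ["import pandas as pd"]},
    ],
    "metadata": {},
    "nbformat": 4,
    "nbformat_minor": 5,
})

# Parse the text back into Python objects and list the type of each cell.
notebook = json.loads(notebook_text)
cell_types = [cell["cell_type"] for cell in notebook["cells"]]
print(cell_types)
```

The same json.load call works on a saved .ipynb file opened in text mode.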
Jupyter Notebooks are a powerful way to write and iterate on your Python code for data
analysis. Rather than writing and re-writing an entire program, you can write lines of code
and run them one at a time. Then, if you need to make a change, you can go back and make
your edit and rerun the program again, all in the same window.
Jupyter Notebook is built off of IPython, an interactive way of running Python code in
the terminal using the REPL model (Read-Eval-Print-Loop). The IPython Kernel runs the
computations and communicates with the Jupyter Notebook front-end interface. It also allows
Jupyter Notebook to support multiple languages. Jupyter Notebooks extend IPython through
additional features, like storing your code and output and allowing you to keep markdown
notes.
5.2 Python
Python is Interactive − You can actually sit at a Python prompt and interact with the
interpreter directly to write your programs.
languages. In turn, this reduces the cost of program maintenance and development because it
allows teams to work collaboratively without significant language and experience barriers.
Additionally, Python supports the use of modules and packages, which means that
programs can be designed in a modular style and code can be reused across a variety of
projects. Once you've developed a module or package you need, it can be scaled for use in
other projects, and it's easy to import or export these modules.
One of the most promising benefits of Python is that both the standard library and the
interpreter are available free of charge, in both binary and source form. There is no
exclusivity either, as Python and all the necessary tools are available on all major platforms.
Therefore, it is an enticing option for developers who don't want to worry about paying high
development costs.
If this description of Python seems over your head, don't worry; you'll understand it soon
enough. What you need to take away from this section is that Python is a programming
language used to develop software on the web and in app form, including mobile. It's
relatively easy to learn, and the necessary tools are available to all free of charge.
import pandas as pd
The statement import pandas as pd imports the pandas library into the current namespace, but
rather than using the name pandas, it binds the library to the name pd instead. This lets you
write pd.whatever instead of having to type out pandas.whatever every time, as you would after
a plain import pandas.
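A short sketch of the pd alias in use is given below. The two-row table is hypothetical; the column names tweet and label mirror those used in the source code of Chapter 6.

```python
import pandas as pd

# Hypothetical two-row table using the same column names as the project's CSV.
frame = pd.DataFrame({
    "tweet": ["have a nice day", "you are an idiot"],
    "label": [0, 1],
})

# Select only the rows labelled as abusive (label == 1).
abusive = frame[frame["label"] == 1]
print(abusive["tweet"].tolist())
```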
import numpy as np
NumPy is an open-source numerical Python library. NumPy contains a multi-dimensional
array and matrix data structures. It can be utilised to perform a number of mathematical
operations on arrays such as trigonometric, statistical, and algebraic routines. NumPy is an
extension of Numeric and Numarray.
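The kinds of array operations mentioned above can be sketched briefly. The values are arbitrary and chosen only to keep the results easy to check by hand.

```python
import numpy as np

# Trigonometric routine applied element-wise to an array of angles.
angles = np.array([0.0, np.pi / 2, np.pi])
sines = np.sin(angles)

# Statistical and algebraic routines on a 2 x 2 matrix.
matrix = np.array([[1, 2], [3, 4]])
column_means = matrix.mean(axis=0)      # mean of each column
product = matrix @ np.array([1, 1])     # matrix-vector product (row sums here)

print(np.round(sines, 6), column_means, product)
```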
import random
The statement import random imports the random module, which contains a variety of functions
for random number generation. Among these is the random() function, which
generates random floating-point numbers in the range from 0 (inclusive) to 1 (exclusive).
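A minimal sketch of the random() function follows. The seed value is an arbitrary choice for illustration; seeding simply makes the sequence of numbers repeatable.

```python
import random

# Seeding makes the generated sequence repeatable across runs.
random.seed(0)
first = random.random()

# Re-seeding with the same value reproduces the same number.
random.seed(0)
repeated = random.random()

print(0.0 <= first < 1.0, first == repeated)
```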
import matplotlib.pyplot as plt
Pyplot is a collection of functions in the popular visualization package Matplotlib. Its
functions manipulate elements of a figure, such as creating a figure, creating a plotting area,
plotting lines, adding plot labels, etc.
The steepness of ROC curves is also important, since it is ideal to maximize the true positive
rate while minimizing the false positive rate.
ROC curves are typically used in binary classification to study the output of a classifier. In
order to extend ROC curve and ROC area to multi-label classification, it is necessary to
binarize the output. One ROC curve can be drawn per label, but one can also draw a ROC
curve by considering each element of the label indicator matrix as a binary prediction.
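For the binary case, an ROC curve and its area can be computed with scikit-learn's roc_curve and auc functions. The four labels and scores below are a small hypothetical example, not results from this project.

```python
from sklearn.metrics import auc, roc_curve

# Hypothetical true labels and classifier scores for four comments.
y_true = [0, 0, 1, 1]
y_score = [0.1, 0.4, 0.35, 0.8]

# roc_curve sweeps the decision threshold and returns the false positive rate
# and true positive rate at each threshold.
fpr, tpr, thresholds = roc_curve(y_true, y_score)

# The area under the curve summarizes the trade-off in a single number.
roc_auc = auc(fpr, tpr)
print(roc_auc)
```

An area of 1.0 would mean every abusive comment is scored above every non-abusive one, while 0.5 corresponds to random ranking.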
CHAPTER 6
SOURCE CODE
Dataset Collection code
import pandas as pd
import numpy as np
import random
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import train_test_split,GridSearchCV
import matplotlib.pyplot as plt
import seaborn as sns
label_data = pd.read_csv("bullying_dataset.csv")
type(label_data)
label_data.head()
Preprocessing Dataset
y = label_data["label"]
label_data1 = label_data["tweet"]
def makeTokens(f):
    # Split the text into tokens on '/', '-' and '.' and return the unique tokens.
    # str(f) is used instead of str(f.encode('utf-8')), which in Python 3 would
    # wrap the text in a b'...' literal and pollute the first and last tokens.
    tkns_BySlash = str(f).split('/')  # make tokens after splitting by slash
    total_Tokens = []
    for i in tkns_BySlash:
        tokens = str(i).split('-')  # make tokens after splitting by dash
        tkns_ByDot = []
        for j in range(0, len(tokens)):
            temp_Tokens = str(tokens[j]).split('.')  # make tokens after splitting by dot
            tkns_ByDot = tkns_ByDot + temp_Tokens
        total_Tokens = total_Tokens + tokens + tkns_ByDot
    total_Tokens = list(set(total_Tokens))  # remove redundant tokens
    return total_Tokens
vectorizer = CountVectorizer(tokenizer=makeTokens)
X = vectorizer.fit_transform(label_data1)
from collections import Counter
from sklearn.preprocessing import LabelEncoder
from keras.utils import np_utils
def clean_str(string):
    # Normalize a string by trimming surrounding whitespace and lower-casing it.
    return string.strip().lower()
Confusion Matrix Using Random Forest Classifier
from sklearn.metrics import confusion_matrix
confusion_matrix1 = pd.DataFrame(confusion_matrix(y_test, Y_rt_predict))
plt.figure()
plt.title('Confusion Matrix using RandomForest Classifier')
sns.heatmap(confusion_matrix1,annot=True,cmap='Greens',fmt='.2f')
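The listing above uses y_test and Y_rt_predict, but the step that produces them is not shown in this chapter. The following is a minimal sketch of how they might be obtained, assuming a RandomForestClassifier, an 80:20 split and a tiny hypothetical set of comments; the estimator settings and the data are assumptions, not the project's actual training code.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split

# Hypothetical comments: label 1 marks an abusive comment, 0 a harmless one.
tweets = [
    "you are an idiot", "have a nice day", "what a loser", "great work team",
    "shut up idiot", "see you tomorrow", "you loser", "lovely weather today",
    "stupid idiot", "thanks for the help",
]
labels = [1, 0, 1, 0, 1, 0, 1, 0, 1, 0]

# Turn each comment into a bag-of-words count vector.
X = CountVectorizer().fit_transform(tweets)

# Hold out 20% of the rows for testing, with a fixed random state.
X_train, X_test, y_train, y_test = train_test_split(
    X, labels, test_size=0.2, random_state=0
)

# Train the forest and predict labels for the held-out comments.
rf = RandomForestClassifier(n_estimators=100, random_state=0)
rf.fit(X_train, y_train)
Y_rt_predict = rf.predict(X_test)

# Rows are true labels and columns are predicted labels, in the order [0, 1].
cm = confusion_matrix(y_test, Y_rt_predict, labels=[0, 1])
print(cm)
```

Passing labels=[0, 1] keeps the matrix 2 x 2 even when the small test set happens to contain only one class.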
CHAPTER 7
TESTING
CHAPTER 8
RESULTS AND DISCUSSION
8.1 FINAL RESULTS OBTAINED:
To begin testing the trained model, we split the project into the modules of
implementation that were completed.
Dataset collection involves the process of collecting abusive comments from various
internet sources and social media.
Various datasets were collected; the screenshot below shows a sample of one of the
collected datasets:
The below image shows the training using random forest classifier:
Figure 8.2 Training using random forest classifier
The below image shows the graph obtained using random forest classifier:
Figure 8.4 Graph obtained using random forest classifier
The below image shows the effective detection and prediction of abusive comments:
CHAPTER – 9
CONCLUSION AND FUTURE WORK
9.1 CONCLUSION
In this project, we have successfully implemented a system for effective detection and
identification of abusive comments on social media. The abusive comments dataset is
used to train a random forest classifier, a machine learning algorithm. The present detection
methods are inaccurate and inefficient, whereas the proposed system provides an easy, efficient
and cost-effective solution.
Hence, the goal of our project is to filter tweets or comments using machine learning
algorithms and to create an environment that is healthy for interaction and the transfer of
information between individuals.
We, as the future generation, are responsible for countering the ongoing toxic environment
on social media and for creating a healthy one in its place.
The application has good use in social media. In the social media space, there is ample
opportunity to develop or adapt this project in many ways. Thus, this project has good
scope in the future, where manual detection and prediction can be converted to
computerized prediction at low cost.
CHAPTER 10
REFERENCES
[1] Yuan Gao, Maoguo Gong, Yu Xie, and A. K. Qin, “An attention-based unsupervised
adversarial model for review spam detection”, [2020]
[2] Qingzhi Zhao, Yang Liu, Xiongwei Ma, Wanqiang Yao, Yibin Yao, and Xin Li, “An Improved
Rainfall Forecasting Model Based on GNSS Observations”, [2020]
[3] Yi Wang, Qixin Chen, Mingyang Sun, Chongqing Kang, and Qing Xia, “An Ensemble
Forecasting Method for the Aggregated Load With Subprofiles”, [2019]
[4] Bidong Liu, Jakub Nowotarski, Tao Hong, and Rafał Weron, “Probabilistic Load Forecasting
via Quantile Regression Averaging on Sister Forecasts”, [2018]
[5] Jingrui Xie (SAS Institute, Cary, NC, USA), Tao Hong, and Joshua Stroud, “Long-Term Retail
Energy Forecasting With Consideration of Residential Customer Attrition”, [2017]
[6] Bala Sundara Ganapathy. N, Helda Mercy. M, and Giftson Vasanth Samuel Raj. A, “A
Framework for Social Media Network to Curtail the Banned and Abused Images”, [2017]
[7] Han Hu, Pranavi Moturu, Kannan Neten Dharan, James Geller, Sophie Di Iorio, and Hai
Phan, “Deep Learning Model for Classifying Drug Abuse Risk Behavior in Tweets”, [2016]
[8] Sayeed Ahsan Khan, Mohammed Hazim Alkawaz, and Hewa Majeed Zangana, “The Use and
Abuse of Social Media for Spreading Fake News”, [2016]
[9] Laihang Yu, Lin Feng, Chen Chen, Tie Qiu, Li Li, and Jun Wu, “A Novel Multi-Feature
Representation of Images for Heterogeneous IoTs”, [2015]
[10] Ryan Sequeira, Avijit Gayen, Niloy Ganguly, Sourav Kumar Dandapat, and Joydeep
Chandra, “A Large-Scale Study of the Twitter Follower Network to Characterize the Spread of
Prescription Drug Abuse Tweets”, [2018]
[11] Semiu Salawu, Yulan He, and Joanna Lumsden, “Approaches to Automated Detection of
Cyberbullying: A Survey”, [2014]
[12] Yubao Zhang, Xin Ruan, Haining Wang, Hui Wang, and Su He, “Twitter Trends
Manipulation: A First Look Inside the Security of Twitter Trending”, [2013]
[13] Ryan Sequeira, Avijit Gayen, Niloy Ganguly, Sourav Kumar Dandapat, and Joydeep
Chandra, “A Large-Scale Study of the Twitter Follower Network to Characterize the Spread of
Prescription Drug Abuse Tweets”, [2012]
[14] Gaoyang Liu, Chen Wang, Kai Peng, Haojun Huang, Yutong Li, and Wenqing Cheng,
“SocInf: Membership Inference Attacks on Social Media Health Data With Machine
Learning”, [2011]
[15] J. Jiménez, K. Donado, and C. G. Quintero, “A Methodology for Short-Term Load
Forecasting”
[16] Mao Tan, Siping Yuan, Shuaihu Li, Yongxin Su, Hui Li, and Feng He, “Ultra-short-term
industrial power demand forecasting using LSTM based hybrid ensemble learning”, [2010]
[17] Chinnawat Surussavadee, “Evaluation of High-Resolution Tropical Weather Forecasts
Using Satellite Passive Millimeter-Wave Observations”, [2009]
[18] Ming Yang, You Lin, and Xueshan Han, “Probabilistic Wind Generation Forecast Based on
Sparse Bayesian Classification and Dempster-Shafer Theory”
CHAPTER 11
CERTIFICATES AND PROCEEDING COPY
Name : M.Karthika
Reg No : 412517205034
Name : B.Praveena
Reg No : 412517205055