Batch-9 Paper
LEARNING APPROACH TO
CYBERBULLYING DETECTION
B.V. Chowdary, Associate Professor, Dept of IT, Vignan Institute of Technology and Science (A), Hyderabad, [email protected]
Mavoori Akhil, UG Scholar, Dept of IT, Vignan Institute of Technology and Science (A), Hyderabad, [email protected]
Komirishetty Pavan, UG Scholar, Dept of IT, Vignan Institute of Technology and Science (A), Hyderabad, [email protected]
Proposed System
This section describes the framework for identifying cyberbullying; its primary components are shown in Figure 1. The first component is natural language processing (NLP) and the second is machine learning (ML). In the initial stage, data are gathered and processed with NLP to build datasets of bullying words, messages, and similar posts for the machine learning techniques. Once the datasets have been examined, machine learning algorithms are trained to identify harassment or cyberbullying interactions on online platforms such as YouTube and Twitter.
Techniques
• Natural Language Processing: Real-world content and posts include a variety of extraneous characters. For instance, punctuation and numerals have no bearing on whether bullying is detected. The comments therefore need to be cleaned before the machine learning techniques are applied.
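As a rough illustration of the cleaning step described above, the following Python sketch strips extraneous characters from a comment before it is passed to a learning algorithm. The function name `clean_comment` and the specific rules (lowercasing, dropping URLs, mentions, and hashtags) are assumptions made for illustration; the source only states that extraneous characters such as numerals should be removed.

```python
import re


def clean_comment(text: str) -> str:
    """Normalize one raw comment (hypothetical helper, not the authors' code)."""
    text = text.lower()                          # assumption: case is not informative
    text = re.sub(r"https?://\S+", " ", text)    # assumption: drop URLs
    text = re.sub(r"[@#]\w+", " ", text)         # assumption: drop mentions and hashtags
    text = re.sub(r"[^a-z\s]", " ", text)        # drop numerals, punctuation, symbols
    return re.sub(r"\s+", " ", text).strip()     # collapse repeated whitespace


raw = "U r SO stupid!!! 123 @bob http://t.co/x"
print(clean_comment(raw))
```

Keeping the rules in one small function makes it easy to apply the same cleaning to both the training data and any new comment that is later classified.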
A. System Architecture
integration and interoperability between system elements.
Data Storage: System architecture describes how data is stored, managed, and accessed. It includes databases, file systems, and data structures. Data storage mechanisms are crucial for ensuring data integrity, security, and efficient retrieval.
Scalability and Performance: System architecture addresses how the system can handle increased loads and demands. Scalability features ensure that the system can expand its capabilities as the user base or data volume grows.
Deployment: System architecture outlines how the system is deployed in various environments. It includes considerations for physical deployment (such as server locations), cloud-based deployment, and virtualization strategies.
Fig. 1. System Architecture
B. Modules
The development of the project is based on the dataset considered and on effective tuning of the parameters of the machine learning algorithms. The system consists of four phases:
1) Data Gathering
2) Data Processing
3) Training Phase
4) Testing Phase
1) Data Gathering: The dataset is a collection of tweets gathered using the Twitter API. It contains more than 1000 tweets from different time periods. The following images depict the datasets indicating Text Labels.
2) Data Processing: Preparing raw data for regression modeling is a critical step, as data obtained from online sources are often inconsistent, incomplete, or noisy. These irregularities must be addressed to create a dataset suitable for machine learning algorithms. In our case, we focused on obtaining relevant data metrics related to profanity in daily online comments to train our models effectively. The initial dataset was in XML format, which we converted to the standard CSV format commonly used for machine learning purposes. During preprocessing, we handled missing values, removed noise, and addressed inconsistencies in the data. Additionally, we ensured that variables were appropriately scaled and transformed to prevent any single variable from dominating the model's predictions. These data preparation steps were crucial to creating a clean and reliable dataset, providing a solid foundation for our regression modeling efforts.
3) Training Phase: To train the model, we first import a specific algorithm class/module and create an instance of it. Using that instance, we fit the model to the training data. We then validate it by checking its accuracy score and tuning its parameters until we get the required results.
4) Testing Phase: To test the model, we compare the values it predicts after the training phase with the test data. We then input some different values for prediction and check whether it predicts them correctly. If the predictions are wrong, we fine-tune the algorithm's parameters and fit the model again.
V. IMPLEMENTATION
A. PyCharm IDE
PyCharm is a widely used Integrated Development Environment (IDE) created by JetBrains especially for Python development. It provides a robust and user-friendly platform tailored to the needs of Python developers, with a comprehensive set of features that enhance productivity, code quality, and collaboration.
The IDE offers advanced code error detection and smart suggestions, allowing developers to write code faster and with fewer mistakes. Its powerful refactoring tools simplify the process of restructuring code, making it easier to maintain and improve the quality of existing projects. PyCharm also includes a built-in visual debugger that assists in identifying and fixing bugs efficiently.
PyCharm excels in supporting various web frameworks, including Flask and Pyramid. It offers dedicated project templates, integrated tools for database management, and seamless integration with popular version control systems like Git. The IDE's web development capabilities streamline the creation of dynamic web applications and ensure smooth collaboration among team members.
(Figure caption, continued) so that the user can register with the unique information
Additionally, PyCharm promotes efficient testing with its integrated test runner and comprehensive testing tools. It facilitates running unit tests and behavioral tests, and supports popular testing frameworks such as pytest. Its version control features enable seamless collaboration by allowing developers to manage and merge code changes.
Furthermore, PyCharm enhances the development process with powerful tools for data science and scientific computing. It supports libraries such as pandas and Matplotlib, enabling data analysis and visualization within the IDE. PyCharm's user-friendly interface and integration capabilities make it a preferred choice for Python developers, whether they are working on web applications, data science projects, or any other Python-based software development.
Fig. 3. Registration Status
Fig. 4. Displays the posted information of the members of the website and their friends
B. Python
Python is an interpreted, high-level, dynamic, cross-platform, open-source programming language. Python's 'philosophy' prioritizes readability, clarity, and simplicity while maximizing the programmer's power and expressiveness. The greatest compliment for a Python programmer is that their code is elegant rather than merely clever. For these reasons, Python makes an excellent 'first language' but can also be a very potent tool in the hands of a seasoned coder.
Python is an incredibly versatile language and is used extensively for a variety of purposes. Common applications include:
• Writing web applications using frameworks like Django, Zope, and TurboGears
• Writing basic scripts for systems tasks
• Creating desktop applications using GUI toolkits such as Tkinter or wxPython (and, more recently, Windows Forms with IronPython)
• Developing Windows applications
Fig. 4. Post Page
Fig. 5. It displays the profile of the user, where he can update and post information
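To make the Training and Testing phases described earlier more concrete, the following is a minimal Python sketch of the loop they outline: create an instance of an algorithm class, fit it, check its accuracy, tune a parameter, and predict on new input. It is not the authors' implementation. The paper does not name the algorithm or library used, so `TinyNaiveBayes` is a hypothetical stand-in classifier written with the standard library only, and the example comments, labels, and `alpha` values are invented purely for illustration.

```python
import math
from collections import Counter, defaultdict


class TinyNaiveBayes:
    """Hypothetical stand-in for 'a specific algorithm class' (assumption)."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha  # smoothing strength; the parameter tuned below

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(text.lower().split())
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def _log_score(self, words, label):
        total_docs = sum(self.class_counts.values())
        denom = sum(self.word_counts[label].values()) + self.alpha * len(self.vocab)
        score = math.log(self.class_counts[label] / total_docs)
        for w in words:
            if w in self.vocab:  # ignore words never seen during training
                score += math.log((self.word_counts[label][w] + self.alpha) / denom)
        return score

    def predict(self, texts):
        return [
            max(self.class_counts, key=lambda c: self._log_score(t.lower().split(), c))
            for t in texts
        ]

    def score(self, texts, labels):
        preds = self.predict(texts)
        return sum(p == y for p, y in zip(preds, labels)) / len(labels)


# Invented toy data: 1 = bullying, 0 = not bullying.
train_texts = [
    "you are so stupid and ugly", "nobody likes you loser", "shut up you idiot",
    "have a great day friend", "thanks for the lovely photo", "see you at the game tonight",
]
train_labels = [1, 1, 1, 0, 0, 0]
val_texts, val_labels = ["you stupid loser", "lovely day friend"], [1, 0]

# Training phase: fit, check accuracy on held-out data, tune the parameter.
best_alpha, best_acc = None, -1.0
for alpha in (0.1, 1.0, 10.0):
    acc = TinyNaiveBayes(alpha).fit(train_texts, train_labels).score(val_texts, val_labels)
    if acc > best_acc:
        best_alpha, best_acc = alpha, acc

# Testing phase: refit with the chosen parameter and predict on new input.
model = TinyNaiveBayes(best_alpha).fit(train_texts, train_labels)
print(best_alpha, best_acc, model.predict(["shut up loser"]))
```

In practice a library classifier exposing the same fit, predict, and score steps would likely replace the stand-in class, and the held-out data would be far larger than two comments.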
Fig. 2. Login Status
Fig. 3. It is the registration page of our application
VII. CONCLUSION
The cyberbullying detection project is an initiative to promote online safety and foster a positive digital atmosphere. It addresses the pressing issue of cyberbullying across diverse online platforms. The implementation of robust algorithms not only facilitates early intervention and mental health support for victims but also encourages responsible online behavior, making significant strides toward creating secure online spaces. Despite the challenges, including privacy concerns and algorithmic biases, the project's potential for impact is immense. As technologies evolve, it is imperative to refine these systems continually, ensuring they strike the right balance between safeguarding users and preserving freedom of expression. The project not only contributes to immediate online safety but also serves as a foundation for ongoing research, paving the way for an empathetic, respectful digital landscape where individuals can engage, learn, and express themselves without the fear of cyberbullying.
ACKNOWLEDGEMENT
First of all, we would like to extend our deepest
appreciation to Mr. B.V. Chowdary, Associate
Professor, who served as our project’s mentor. Next,
we would like to express our heartfelt gratitude to
Vignan Institute of Technology and Science,
Hyderabad, and especially the Department of
Information Technology for providing our team with
all the tools, resources, help, and direction required
to finish this project.