Text Mining in Big Data Analytics

This document outlines a project on text mining in big data analytics submitted for a Bachelor of Engineering degree. The project aims to design a grid framework for executing web pages in a distributed manner using threads run by executors managed by a central manager. The document includes an abstract, introduction, literature review, system design, implementation details, and conclusion.

Uploaded by

azim momin


Text mining in big data analytics

Submitted in partial fulfilment of the requirements of the degree of

Bachelor of Engineering

By
Mr. Shoaib Moosa ARMIET/BE/CS20MD218
Mr. Azim Momin ARMIET/BE/CS20MM229
Mr. Deepesh Panday ARMIET/BE/CS20MD219
Mr. Deevesh Panday ARMIET/BE/CS20TS220

Under the Guidance of

PROF. Vivek Pandey

ALAMURI RATNAMALA INSTITUTE OF ENGINEERING

AND TECHNOLOGY

Affiliated to

UNIVERSITY OF MUMBAI

Department of Computer Engineering

Academic Year – 2022-2023
CERTIFICATE

This dissertation report entitled “Text Mining in big data analytics” by Mr. Shoaib Abdul
Razzak Moosa is approved for the degree of Bachelor of Engineering in Computer
Engineering for academic year 2022 - 2023.

Examiners

Supervisor

(Prof. Archana Khelurkar)

Head of the Department Principal

Date:

Place:
Declaration
I declare that this written submission represents my ideas in my own words and where others'
ideas or words have been included, I have adequately cited and referenced the original
sources. I also declare that I have adhered to all principles of academic honesty and integrity
and have not misrepresented, fabricated or falsified any idea/data/fact/source in my
submission. I understand that any violation of the above will be cause for disciplinary action
by the Institute and can also invoke penal action from the sources which have thus not been
properly cited or from whom proper permission has not been taken when needed.

Mr. Shoaib Moosa

Date:
ACKNOWLEDGEMENT
We have immense pleasure in presenting the report for our project entitled “Text Mining in
big data analytics”.

We would like to take this opportunity to express our gratitude to a number of people who
have been sources of help and encouragement during the course of this project.

We are very grateful and indebted to our project guide PROF. Vivek Pandey and our
respected HOD PROF. MAYANK MANGAL for their enduring patience, guidance and
invaluable suggestions. They never let our morale down and always supported us through
thick and thin. They were a constant source of inspiration for us and took the utmost
interest in our project.

We would also like to thank all the staff members for their invaluable co-operation and for
permitting us to work in the computer lab.

We are also thankful to all the students for giving us their useful advice and immense
co-operation. Their support made working on this project very pleasant.
PREFACE
This project has been submitted in partial fulfilment of the requirements for the degree of
Bachelor of Engineering. We, the team members of this project, take pleasure in presenting the
detailed project report that reflects our efforts in the academic year 2022-23.

Our project involves designing a Grid framework for executing a web page, where the process
is divided into threads and the threads are executed by the executors. The outputs
generated by the executors are given back to the manager, which in turn gives the results to
the owner. This is a dedicated grid in which the manager can select particular executors to run
the web page.

Initially, the manager is started by connecting it to a storage application. The executors are
connected to the manager by providing the required credentials. Once the executors are
connected to the manager, the execution of the required web page can be started.

Additionally, there is a Grid console which keeps track of the executors connected and the
web page running. A record of all the operations performed is maintained in a log file.
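
The manager and executor flow described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: a local thread pool stands in for the connected executors, the job is a word count over a page of text, and the names `Manager` and `execute_thread` are hypothetical and do not come from the project code.

```python
from concurrent.futures import ThreadPoolExecutor
import logging

logging.basicConfig(level=logging.INFO, format="%(message)s")
log = logging.getLogger("grid")  # stands in for the log file mentioned above


def execute_thread(executor_id, chunk):
    """Work done by one executor: here, count the words in its chunk."""
    log.info("executor %d processing %d characters", executor_id, len(chunk))
    return len(chunk.split())


class Manager:
    """Hypothetical manager: splits a job, dispatches it, collects results."""

    def __init__(self, n_executors):
        self.n_executors = n_executors

    def run(self, page_text):
        # Split the job into one chunk per executor (lines dealt round-robin).
        lines = page_text.splitlines()
        chunks = ["\n".join(lines[i::self.n_executors])
                  for i in range(self.n_executors)]
        with ThreadPoolExecutor(max_workers=self.n_executors) as pool:
            results = list(pool.map(execute_thread,
                                    range(self.n_executors), chunks))
        # The manager hands the combined result back to the owner.
        return sum(results)


page = "text mining finds patterns\nin large collections\nof documents"
total_words = Manager(n_executors=2).run(page)
print(total_words)
```

In a real grid the executors would be separate machines that authenticate with the manager; the thread pool only mimics the split, dispatch and collect pattern.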

Group Members:
1. Shoaib Moosa
2. Azim Momin
3. Deepesh Panday
4. Deevesh Panday
CONTENTS
CH. NO.   TOPIC NAME
1         INTRODUCTION
1.1       AIM AND OBJECTIVE
1.2       PROBLEM STATEMENT
2         REVIEW OF LITERATURE
3         EXISTING SYSTEM
4         SYSTEM ARCHITECTURE
5         FLOW CHART
6         PROPOSED SYSTEM
7         SYSTEM DESIGN
7.1       SOFTWARE REQUIREMENTS
7.2       HARDWARE REQUIREMENTS
8         IMPLEMENTATION
9         CONCLUSION
10        REFERENCES
LIST OF FIGURES

FIGURE NO.   FIGURE NAME
1            SYSTEM ARCHITECTURE
2            FLOW CHART
3            EXPECTED OUTCOME
4            STYLE YOUR APPLICATION
5            GENERATING A COMPANY INFORMATION AND GRAPHS
6            CREATING THE MACHINE LEARNING MODEL
7            DEPLOYING THE PROJECT ON HEROKU

Abstract

Text mining in big data analytics is emerging as a powerful tool for harnessing the power of
unstructured textual data by analyzing it to extract new knowledge and to identify significant
patterns and correlations hidden in the data. This study seeks to determine the state of text
mining research by examining the developments within published literature over past years and
provide valuable insights for practitioners and researchers on the predominant trends, methods,
and applications of text mining research. In accordance with this, more than 200 academic
journal articles on the subject are included and discussed in this review; the state-of-the-art text
mining approaches and techniques used for analyzing transcripts and speeches, meeting
transcripts, and academic journal articles, as well as websites, emails, blogs, and social media
platforms, across a broad range of application areas are also investigated. Additionally, the
benefits and challenges related to text mining are also briefly outlined.
Text Mining In Big Data Analysis

CHAPTER-1
INTRODUCTION

ALAMURI RATNAMALA INSTITUTE OF ENGINEERING & TECHNOLOGY


Introduction-

In recent years, we have witnessed an increase in the quantities of available digital textual
data, generating new insights and thereby opening up opportunities for research along new
channels. In this rapidly evolving field of big data analytic techniques, text mining has gained
significant attention across a broad range of applications. In both academia and industry,
there has been a shift towards research projects and more complex research questions that
mandate more than the simple retrieval of data. Due to the increasing importance of artificial
intelligence and its implementation on digital platforms, the application of parallel
processing, deep learning, and pattern recognition to textual information is crucial. All types
of business models, market research, marketing plans, political campaigns, or strategic
decision-making are facing an increasing need for text mining techniques in order to address
the competition.

Aim And Objective: -

Widely used in knowledge-driven organizations, text mining is the process of examining
large collections of documents to discover new information or help answer specific research
questions. Text mining identifies facts, relationships and assertions that would otherwise
remain buried in the mass of textual big data.

PROBLEM STATEMENT: -
Many issues occur during the text mining process and affect the efficiency and effectiveness
of decision making. Complexities can arise at the intermediate stage of text mining. In the
pre-processing stage, various rules are defined to standardize the text, which makes the
text mining process efficient. Before pattern analysis can be applied to a document, the
unstructured data must be converted into an intermediate form, but this stage of the mining
process has its own complications. Sometimes the real theme or data loses its importance
because the sequence of the text has been modified.


CHAPTER-2
REVIEW OF LITERATURE


One study described gathering, extracting, pre-processing, text transformation, feature
extraction, pattern selection, and evaluation as the steps of the text mining process. In
addition, widely used text mining techniques, i.e., clustering, categorization, and decision tree
categorization, and their applications in diverse fields were surveyed. [8] highlighted the issues
in text mining applications and techniques. They noted that dealing with unstructured text is
more difficult than dealing with structured or tabular data using traditional mining tools and
techniques. They showed applications of the text mining process in bioinformatics,
business intelligence and national security systems. Natural language processing and entity
recognition techniques have reduced the issues that occur during the text mining process.
However, some issues still need attention.

Another study explored the MEDLINE biomedical database by integrating a framework for named
entity recognition, text classification, hypothesis generation and testing, relationship and
synonym extraction, and abbreviation extraction. This framework helps to eliminate unnecessary
details and extract valuable information. A further study analyzed text using text mining
patterns and showed that term-based approaches cannot handle synonymy and polysemy properly.
Moreover, a prototype model was designed for the specification of patterns, assigning weights
according to their distribution. This approach helps to enhance the efficiency of the text
mining process. Another work presented a crime detection system using text mining tools, in
which a relation discovery algorithm was designed to correlate terms with abbreviations.

Another study presented a top-down and bottom-up approach for web-based text mining. To group
similar text documents, it applied the k-means clustering technique for bottom-up partitioning.
To measure similarity within the documents, the TF-IDF (Term Frequency-Inverse Document
Frequency) algorithm was used to find information regarding specific subjects. A separate
overview covered the applications, tools and issues that arise in text mining. Its authors
discussed that documents may be structured, semi-structured or unstructured, and that extracting
useful information is a tiresome task. They presented a generic framework for concept-based
mining, which can be visualized as text refinement and knowledge distillation phases. The
intermediate form of entity representation depends on the specific domain.
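
The TF-IDF weighting mentioned above can be illustrated with a short, self-contained Python sketch. It assumes the plain textbook variant (relative term frequency multiplied by the logarithm of the inverse document frequency) and whitespace tokenization; the surveyed work may have used a different variant, and the sample documents are made up for illustration.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {term: weight} dict per document.

    tf  = term count / number of terms in the document
    idf = log(N / number of documents containing the term)
    """
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: in how many documents does each term occur?
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        weights.append({
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


# Illustrative documents (not from the report).
docs = [
    "stock price rises",
    "stock price falls",
    "email corpus analysis",
]
scores = tf_idf(docs)
# Terms shared by several documents ("stock", "price") score lower than
# terms unique to one document ("rises").
print(sorted(scores[0].items(), key=lambda kv: -kv[1]))
```

The resulting weight vectors are what a clustering step such as k-means would then compare to group similar documents.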

Another work presented innovative and efficient pattern discovery techniques. It used pattern
evolving and pattern discovering techniques to enhance the effectiveness of finding relevant and
appropriate information. BM25 and support vector machine based filtering were performed on
Reuters Corpus Volume 1 and Text REtrieval Conference (TREC) data to estimate the effectiveness
of the suggested technique. A further study performed various classification experiments using
multi-word features of the text. Its authors proposed a hand-crafted method to extract
multi-word features from the data set. To classify the multi-word text, they used support vector
machines with linear and nonlinear polynomial kernels, which improved the effectiveness of the
extracted data.


CHAPTER-3
EXISTING SYSTEM


In the existing system, we propose that a company’s performance, in terms of its stock
price movement, can be predicted from internal communication patterns. To obtain early warning
signals, we believe it is vital for patterns in company communication networks to
be detected early, so that serious stock price movements can be predicted and possible
adversities that an organization could face in the securities market can be avoided,
protecting stakeholders’ interests as much as possible. Despite the potential
importance of such data regarding corporate communication, very little work has been done in
this direction. We attempt to bridge these research gaps by employing a data-mining
method to examine the linkage between a firm’s communication data and its share price. As
Enron Corporation’s e-mail messages constitute the only corpus available to the public, we
make use of Enron’s e-mail corpus as the training and testing data for our proposed
algorithm.

Prediction of stock and Forex prices has always been a popular and profitable area of study.
Deep learning applications have been shown to deliver better accuracy and returns in the
field of financial prediction and forecasting. In this survey, we selected research papers from
the Digital Bibliography & Library Project (DBLP) database for comparison and analysis.
We grouped the papers by type of deep learning method: Convolutional Neural Network (CNN);
Long Short-Term Memory (LSTM); Deep Neural Network (DNN); Recurrent Neural Network (RNN);
Reinforcement Learning; and other deep learning methods such as Hybrid Attention Networks
(HAN), self-paced learning mechanism (NLP), and WaveNet. Furthermore, this paper examines the
dataset, variables, model, and results of each article. The survey presents the results
through the most used performance metrics: Root Mean Square Error (RMSE), Mean Absolute
Percentage Error (MAPE), Mean Absolute Error (MAE), Mean Square Error (MSE), accuracy, Sharpe
ratio, and return rate. We found that recent models combining LSTM with other methods, for
example, DNN, are widely researched. Reinforcement learning and other deep learning methods
yielded strong returns and performance. We conclude that, in recent years, the use of
deep-learning-based methods for financial modelling has been increasing exponentially.
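
For reference, the four error metrics named in the survey can be computed as in the following sketch. The formulas are the standard definitions; the price values are invented purely to exercise the functions and are not results from any of the surveyed papers.

```python
import math


def mae(actual, predicted):
    """Mean Absolute Error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mse(actual, predicted):
    """Mean Square Error."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root Mean Square Error: the square root of MSE, in the data's units."""
    return math.sqrt(mse(actual, predicted))


def mape(actual, predicted):
    """Mean Absolute Percentage Error, in percent.

    Undefined when an actual value is zero.
    """
    return 100 * sum(abs((a - p) / a)
                     for a, p in zip(actual, predicted)) / len(actual)


# Illustrative closing prices and model forecasts (made-up numbers).
actual = [100.0, 200.0, 400.0]
predicted = [110.0, 190.0, 400.0]

for name, metric in [("MAE", mae), ("MSE", mse), ("RMSE", rmse), ("MAPE", mape)]:
    print(name, round(metric(actual, predicted), 3))
```

MAE and RMSE are expressed in the units of the predicted quantity, whereas MAPE is scale-free, which is why surveys comparing models across different assets often report it alongside the others.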


CHAPTER-4
SYSTEM ARCHITECTURE


FIGURE 1: - SYSTEM ARCHITECTURE


CHAPTER-5
FLOW CHART


FIGURE 2: - FLOW CHART


CHAPTER-6
PROPOSED SYSTEM


Manufacturing and product development:

Analysis of machine logs and maintenance tickets can pinpoint problems in the
manufacturing process, as well as in the finished product.

Email filtering:
Email system providers mine incoming email to identify distinctive characteristics of spam
and phishing messages, automatically deleting or quarantining messages before they are
delivered to employees. This helps businesses minimize the risk of cyberattacks.

Competitive marketing analysis:

Mining the sentiment of competitor reviews in sources such as Yelp enables a business to
assess the competition's strengths and weaknesses.

Human resources:
By analyzing the content of emails and other communications within the company, HR teams
can gain insights into employee concerns and measure employee engagement.
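
As one illustration of the use cases listed above, the competitive marketing analysis could start from a very simple lexicon-based sentiment score. The word lists and reviews below are hypothetical and far too small for real use; the sketch only shows the idea of turning review text into a number.

```python
import re

# Hypothetical, tiny word lists; a real system would use a curated lexicon
# or a trained model.
POSITIVE = {"great", "friendly", "fast", "excellent"}
NEGATIVE = {"slow", "rude", "expensive", "poor"}


def sentiment_score(review):
    """Return (number of positive words) - (number of negative words)."""
    words = re.findall(r"[a-z']+", review.lower())
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)


# Made-up competitor reviews.
reviews = [
    "Great food and friendly staff",
    "Slow service and rude waiter",
    "Excellent coffee but expensive",
]
review_scores = [sentiment_score(r) for r in reviews]
for review, score in zip(reviews, review_scores):
    print(score, review)
```

A positive score suggests a strength of the competitor, a negative score a weakness; aggregating scores per topic would be the natural next step.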


CHAPTER-7
SYSTEM DESIGN


7.1 Software Requirements

• Operating System: Windows 98, Windows XP, Windows 7 or better
• Language: Python

7.2 Hardware Requirements

• Random Access Memory (RAM): 1 GB or above
• Central Processing Unit (CPU): 1.7 GHz Processor and above
• Operating System (OS): Windows 8 and above

The system is designed such that it works in the following way:

1. In the case study, we need to visualize customer habits and patterns from different
perspectives. This step should not be approached carelessly; otherwise, the results
will be messy and disorganized.

2. The next step is to assemble the data and look for further patterns and biases
within the datasets.

3. K-means clustering is a well-known method of unsupervised machine learning. It
partitions the data into distinct clusters, grouping similar points together while
keeping each cluster as compact as possible.

4. Determining the best set of hyperparameters for the algorithm is the
next step in customer segmentation with ML, because it helps us obtain
the most meaningful customer groups.

5. Next, we visualize the results using the open-source Plotly Python library, a plotting
library for making interactive graphs, plots, and charts. We then interpret
the charts and graphs to inform business decisions.

6. Finally, our script is deployed and can be accessed by anyone in the world.
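
Steps 3 and 4 above can be sketched without any external library. The code below is a minimal illustration, not the project's implementation: it runs Lloyd's k-means algorithm with a deterministic farthest-point initialisation (chosen here so that the result is reproducible; random or k-means++ initialisation is more common), and it treats the number of clusters k as the hyperparameter to compare. The customer points, given as (annual income, spending score) pairs, are invented.

```python
def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def farthest_point_init(points, k):
    """Start from the first point, then repeatedly add the point that is
    farthest from all centroids chosen so far."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(max(
            points,
            key=lambda p: min(squared_distance(p, c) for c in centroids)))
    return centroids


def k_means(points, k, n_iter=100):
    """Lloyd's algorithm. Returns (centroids, labels, inertia)."""
    centroids = farthest_point_init(points, k)
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: attach each point to its nearest centroid.
        labels = [min(range(k), key=lambda c: squared_distance(p, centroids[c]))
                  for p in points]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append(tuple(sum(dim) / len(members)
                                           for dim in zip(*members)))
            else:
                new_centroids.append(centroids[c])  # empty cluster: keep as is
        if new_centroids == centroids:
            break  # converged
        centroids = new_centroids
    # Inertia: total squared distance of points to their cluster centroid.
    inertia = sum(squared_distance(p, centroids[lab])
                  for p, lab in zip(points, labels))
    return centroids, labels, inertia


# Made-up customers: (annual income, spending score).
customers = [(15, 80), (16, 85), (18, 78),
             (70, 15), (72, 20), (75, 18),
             (45, 50), (47, 48), (44, 52)]

centroids, labels, inertia = k_means(customers, k=3)
print(labels, round(inertia, 2))

# "Elbow" comparison: inertia for several values of k.
inertias = {k: k_means(customers, k)[2] for k in range(1, 5)}
for k, value in inertias.items():
    print(k, round(value, 2))
```

The value of k after which the inertia stops dropping sharply is a common choice for the number of customer segments.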


CHAPTER-8
IMPLEMENTATION


A. Expected Outcome:
In the first step of this data science project, we will perform data exploration. We will import
the essential packages required for this role and then read our data. Finally, we will go through
the input data to gain necessary insights about it.

We will now display the first six rows of our dataset using the head() function and use the
summary() function to output a summary of it.
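
A minimal sketch of this exploration step follows. It uses only the Python standard library so that it runs anywhere; with pandas, `DataFrame.head(6)` and `DataFrame.describe()` cover the same ground (pandas has no `summary()` method, so `describe()` is the closest counterpart). The sample rows and column names are assumptions made for illustration, not the project's dataset.

```python
import csv
import io
import statistics

# Hypothetical sample in the shape of a customer-segmentation dataset;
# the column names are assumptions, not taken from the report.
RAW = """CustomerID,Gender,Age,AnnualIncome,SpendingScore
1,Male,19,15,39
2,Male,21,15,81
3,Female,20,16,6
4,Female,23,16,77
5,Female,31,17,40
6,Female,22,17,76
7,Female,35,18,6
8,Male,64,19,3
"""

customer_data = list(csv.DictReader(io.StringIO(RAW)))

# First six rows: the same view that head() gives.
for row in customer_data[:6]:
    print(row)


def summarize(rows, column):
    """Minimum, median, mean and maximum of one numeric column."""
    values = [float(r[column]) for r in rows]
    return {
        "min": min(values),
        "median": statistics.median(values),
        "mean": statistics.mean(values),
        "max": max(values),
    }


for column in ("Age", "AnnualIncome", "SpendingScore"):
    print(column, summarize(customer_data, column))
```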


B. Customer Gender Visualizations:

In this step, we will create a bar plot and a pie chart to show the gender distribution across
our customer_data dataset.
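
The numbers behind both charts can be computed as sketched below. The gender values are made up, and the text rendering is only a stand-in for the actual plots, which could be drawn with a plotting library such as Plotly (for example `plotly.express.bar` and `plotly.express.pie`).

```python
from collections import Counter

# Hypothetical gender column; the real values would come from customer_data.
genders = ["Female", "Male", "Female", "Female", "Male",
           "Female", "Female", "Male", "Female", "Female"]

counts = Counter(genders)
total = sum(counts.values())

# Bar-plot view: one bar per category, length proportional to the count.
for gender, count in counts.most_common():
    print(f"{gender:<8}{'#' * count} {count}")

# Pie-chart view: each category's share of the whole, in percent.
shares = {gender: round(100 * count / total, 1)
          for gender, count in counts.items()}
print(shares)
```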

[Figure: code listing and output screenshots]


C. Visualization of Age Distribution:

Let us plot a histogram to view the frequency distribution of customer ages. We
will first proceed by taking a summary of the Age variable.
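
The summary and the histogram counts can be sketched as follows. The ages are invented and the bin width of 10 years is an assumption; the printed rows are a text stand-in for the plotted histogram.

```python
from collections import Counter
import statistics

# Hypothetical ages; the real values would come from the Age column.
ages = [19, 21, 20, 23, 31, 22, 35, 64, 30, 58, 24, 35, 46, 52, 27]

# Summary of the Age variable.
print("min", min(ages), "median", statistics.median(ages),
      "mean", round(statistics.mean(ages), 1), "max", max(ages))

BIN_WIDTH = 10  # assumed bin width in years


def histogram(values, bin_width):
    """Count values per bin; each bin is keyed by its lower edge."""
    return Counter((v // bin_width) * bin_width for v in values)


bins = histogram(ages, BIN_WIDTH)
for lower in sorted(bins):
    print(f"{lower:>3}-{lower + BIN_WIDTH - 1:<3} {'#' * bins[lower]}")
```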

[Figure: code listing and output]


D. Analyzing Spending Score of the Customers:

[Figure: code listing and output]


CHAPTER-9
CONCLUSION


CONCLUSION
New technologies have facilitated access to immense quantities of digital text, recording an
ever increasing share of human interaction, communication, and culture. Text mining
provides a framework to maximize the value of information within large quantities of text;
thereby, the use of text mining technologies has increased steadily in recent years and has
become highly diverse.


CHAPTER-10
REFERENCES


REFERENCES
[1] Blanchard, T., Bhatnagar, P., Behera, D. (2019). Data Science for Marketing Analytics: Achieve
your marketing goals with the data analytics power of Python. Packt Publishing Ltd.

[2] Griva, A., Bardaki, C., Pramatari, K., Papakyriakopoulos, D. (2018). Retail business analytics:
Customer visit segmentation using market basket data. Expert Systems with Applications, 100, 1-16.

[3] Hong, T., Kim, E. (2011). Segmenting customers in online stores based on factors that affect the
customer's intention to purchase. Expert Systems with Applications, 39 (2), 2127-2131.

[4] Hwang, Y. H. (2019). Hands-On Data Science for Marketing: Improve your marketing strategies
with machine learning using Python and R. Packt Publishing Ltd.

[5] Premkanth, P. (2012). Market Segmentation and Its Impact on Customer Satisfaction with
Special Reference to Commercial Bank of Ceylon PLC. Global Journal of Management and
Business Research. Publisher: Global Journals Inc. (USA). Print ISSN: 0975-5853. Volume 12
Issue 1.

[6] Premkanth, P. (2012). Market Segmentation and Its Impact on Customer Satisfaction with
Special Reference to Commercial Bank of Ceylon PLC. Global Journal of Management and
Business Research. Publisher: Global Journals Inc. (USA). Print ISSN: 0975-5853. Volume 12
Issue 1.

[7] Goyat, S. (2011). The basis of market segmentation: a critical review of the literature. European
Journal of Business and Management, www.iiste.org. ISSN 2222-1905 (Paper), ISSN 2222-2839
(Online). Vol 3, No. 9.

[8] Thomas, J. W. (2007). Accessed at: www.decisionanalyst.com on July 12, 2015.

