LINKEDIN TREND ANALYSIS
BY USING LATENT
DIRICHLET ALLOCATION
Team Members:
1. A.Abhishek (21H71A0503)
2. D.Trisanthi (21H71A0559)
3. P.Balaji (22H75A0509)
4. K.Tejesh (21H71A0558)
Batch No: A16
Name of the Guide: Ms. G.Rohini
Designation: Assistant Professor
AGENDA
INTRODUCTION
PROBLEM STATEMENT
LITERATURE REVIEW
EXISTING SYSTEM
PROPOSED SYSTEM
ARCHITECTURE
METHODOLOGY
TOOLS
OUTPUT
CONCLUSION
INTRODUCTION
THIS PROJECT LEVERAGES NATURAL LANGUAGE PROCESSING (NLP) TECHNIQUES, SPECIFICALLY LATENT DIRICHLET ALLOCATION (LDA), TO AUTOMATE THE IDENTIFICATION AND ANALYSIS OF EMERGING INDUSTRY TRENDS.
SOCIAL MEDIA PLATFORMS, PARTICULARLY LINKEDIN, SERVE AS VALUABLE REPOSITORIES OF INDUSTRY-SPECIFIC DISCUSSIONS, INSIGHTS, AND SKILL DEMANDS.
BY ANALYZING THE VAST AMOUNT OF UNSTRUCTURED TEXT DATA FROM LINKEDIN POSTS, ORGANIZATIONS CAN UNCOVER KEY THEMES THAT SHAPE WORKFORCE TRENDS AND BUSINESS STRATEGIES.
BY APPLYING LDA TO LINKEDIN POST DATA, THIS PROJECT IDENTIFIES RECURRING THEMES AND SKILL DEMANDS ACROSS VARIOUS INDUSTRIES.
PROBLEM STATEMENT
However, identifying these trends in real time from vast amounts of unstructured data, such as LinkedIn posts, presents a significant challenge.
There is a need for an automated system that can efficiently process large-scale textual data, extract key insights, and highlight evolving trends to support data-driven decision-making.
LITERATURE REVIEW
Title | Author | Techniques | Methodology | Limitation
A proposed Academic Chatbot System using NLP Techniques | N. Raeesh, Nabi, N. Bavigha | NLP techniques, Hybrid model approach | Dataset preparation, Preprocessing, Hybrid intelligence, AI-driven approach, Feedback loops | Lack of emotional intelligence, outdated information
Enhancing Programming Education with Open-source Generative AI Chatbots | A. Savecic, I. Tonic, A. Inulin | Local & cloud-based LLM, domain-specific recognition, RAG (Retrieval Augmented Generation) | Selection, Fine-tuning, Performance evaluation, enhancement via training | Balancing accuracy, quality, and coverage of training data
Large language model-based chatbots in Higher Education | Dicle Yügi, Savas Tüggü, Aydoğan Ozan | NLP, Transformer models (LLM-GPT), Training techniques | Dataset curation, Preprocessing, Model fine-tuning | Inaccuracies, Biases in training data, Reliability issues
Adoption of AI chatbots to enhance student learning experience in higher education in India | N. Sandu, E. Gide | Discipline analysis, Surveys, AI-driven personalization | Quantitative research, Statistical analysis | Sample diversity, Limited scope of research
A chatbot system for education NLP using deep learning | B. Kavitha, B. K.P. Kavitha | NLP-mixed, Knowledge-based, AI-powered chatbot system | Integration of NLP, Active learning, Customization | Linguistic challenges, Maintenance across multiple regions
EXISTING SYSTEM
TWITTER TREND ANALYSIS USING LATENT DIRICHLET ALLOCATION (LDA)
OVERVIEW : TWITTER TREND ANALYSIS IS A POWERFUL TOOL FOR
UNDERSTANDING PUBLIC OPINION, SENTIMENT, AND BEHAVIOUR. LATENT
DIRICHLET ALLOCATION (LDA) IS A POPULAR TOPIC MODELLING TECHNIQUE
USED FOR EXTRACTING INSIGHTS FROM LARGE VOLUMES OF TEXT DATA,
SUCH AS TWEETS.
THE CURRENT METHODS OF IDENTIFYING LINKEDIN TRENDS INVOLVE
MANUAL BROWSING OR USING THIRD-PARTY ANALYTICS TOOLS THAT OFTEN
PROVIDE GENERALIZED METRICS WITHOUT OFFERING DETAILED TOPIC-
BASED INSIGHTS.
PROPOSED SYSTEM
THE PROPOSED SYSTEM EMPLOYS LDA FOR
ADVANCED TOPIC MODELLING TO EXTRACT
DETAILED INSIGHTS FROM LINKEDIN POSTS.
IT IDENTIFIES DOMINANT TOPICS, TRENDS, AND
KEYWORDS RELEVANT TO VARIOUS INDUSTRIES
AND TRACKS HOW THESE EVOLVE OVER TIME.
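Tracking how topics evolve over time can be sketched as below. This is a minimal illustration, assuming each post has already been given a month label and a dominant topic by the LDA model. The sample records, the topic names, and the helper `topic_share_by_period` are hypothetical and not part of the project's code.

```python
from collections import Counter, defaultdict

def topic_share_by_period(records):
    """Return {period: {topic: share}} from (period, topic) pairs."""
    counts = defaultdict(Counter)
    for period, topic in records:
        counts[period][topic] += 1
    shares = {}
    for period, counter in sorted(counts.items()):
        total = sum(counter.values())
        # Convert raw counts into the fraction of posts per topic.
        shares[period] = {t: n / total for t, n in counter.items()}
    return shares

# Hypothetical (month, dominant topic) pairs, one per post.
records = [
    ("2024-01", "cloud"), ("2024-01", "cloud"), ("2024-01", "ai"),
    ("2024-02", "ai"), ("2024-02", "ai"), ("2024-02", "cloud"), ("2024-02", "ai"),
]

shares = topic_share_by_period(records)
for period, dist in shares.items():
    print(period, {t: round(s, 2) for t, s in dist.items()})
```

Comparing the share of a topic between consecutive periods gives a simple signal of whether it is rising or fading.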
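ARCHITECTURE
[Architecture diagram. Model pipeline: Data Collection -> Cleaned text -> Data Analysis -> Apply LDA -> LDA Topic Model -> LDA predictions. Query path: USER -> User query -> Checks in data set -> NO: No data found / YES: Topic predictions -> Visualize -> RESULTS. Additional label: Job role.]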
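The query path in the architecture (user query, check in the data set, then either topic predictions or a "no data found" result) might look roughly like the sketch below. It assumes that predictions are precomputed and keyed by job role. The dictionary `TOPIC_PREDICTIONS`, its contents, and the helpers `clean_text` and `lookup_topics` are hypothetical.

```python
import re

# Hypothetical store of precomputed LDA topic keywords per job role.
TOPIC_PREDICTIONS = {
    "data scientist": ["machine learning", "python", "statistics"],
    "cloud engineer": ["aws", "kubernetes", "devops"],
}

def clean_text(text):
    """Lowercase the text, replace non-letters with spaces, collapse whitespace."""
    text = re.sub(r"[^a-z\s]", " ", text.lower())
    return " ".join(text.split())

def lookup_topics(user_query):
    """Return topic keywords for the query, or a 'No data found' message."""
    key = clean_text(user_query)
    if key in TOPIC_PREDICTIONS:      # YES branch: query is in the data set
        return TOPIC_PREDICTIONS[key]
    return "No data found"            # NO branch

print(lookup_topics("  Data Scientist! "))
print(lookup_topics("Astronaut"))
```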
METHODOLOGY:
LATENT DIRICHLET ALLOCATION (LDA) IS A PROBABILISTIC TOPIC MODELING ALGORITHM
THAT DISCOVERS UNDERLYING TOPICS WITHIN A COLLECTION OF DOCUMENTS BY MODELING
DOCUMENTS AS MIXTURES OF TOPICS AND TOPICS AS DISTRIBUTIONS OF WORDS.
TOPIC MODELING IS A TYPE OF NLP (NATURAL LANGUAGE PROCESSING) TECHNIQUE.
APPLIED TO A LARGE CORPUS, LDA GROUPS WORDS THAT FREQUENTLY CO-OCCUR (AND SO TEND TO SHARE A MEANING) INTO TOPICS, AND IT REPORTS THE PROPORTION OF EACH TOPIC IN EVERY DOCUMENT, EVEN ACROSS MILLIONS OF DOCUMENTS.
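To make the "documents as mixtures of topics, topics as distributions of words" idea concrete, here is a small, self-contained sketch of LDA fitted with collapsed Gibbs sampling. It is not the project's implementation (the project lists Gensim for this step). The standard library is used only so that the example runs anywhere. The function `lda_gibbs`, the hyperparameter values, and the toy posts are all assumptions made for illustration.

```python
import random

def lda_gibbs(docs, num_topics, iterations=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA by collapsed Gibbs sampling on tokenized documents.

    Returns (theta, phi, vocab): theta[d][k] is the share of topic k in
    document d; phi[k][w] is the probability of vocab[w] under topic k.
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    V = len(vocab)
    doc_topic = [[0] * num_topics for _ in docs]        # tokens in doc d with topic k
    topic_word = [[0] * V for _ in range(num_topics)]   # times word w has topic k
    topic_total = [0] * num_topics                      # tokens assigned to topic k
    assignments = []

    # Start from a random topic for every token.
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            k = rng.randrange(num_topics)
            z_doc.append(k)
            doc_topic[d][k] += 1
            topic_word[k][word_id[w]] += 1
            topic_total[k] += 1
        assignments.append(z_doc)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                wid = word_id[w]
                k = assignments[d][i]
                # Remove this token's current assignment from the counts.
                doc_topic[d][k] -= 1
                topic_word[k][wid] -= 1
                topic_total[k] -= 1
                # Conditional weight of each topic given all other assignments.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][wid] + beta)
                    / (topic_total[t] + V * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][wid] += 1
                topic_total[k] += 1

    # Smoothed estimates of the document-topic and topic-word distributions.
    theta = [
        [(doc_topic[d][k] + alpha) / (len(doc) + num_topics * alpha)
         for k in range(num_topics)]
        for d, doc in enumerate(docs)
    ]
    phi = [
        [(topic_word[k][w] + beta) / (topic_total[k] + V * beta) for w in range(V)]
        for k in range(num_topics)
    ]
    return theta, phi, vocab

# Hypothetical, already tokenized posts.
docs = [
    ["python", "machine", "learning", "data", "python"],
    ["cloud", "aws", "devops", "cloud"],
    ["data", "learning", "python", "machine"],
    ["aws", "cloud", "devops", "kubernetes"],
]
theta, phi, vocab = lda_gibbs(docs, num_topics=2)
for k, row in enumerate(phi):
    top = sorted(range(len(vocab)), key=lambda w: -row[w])[:3]
    print("topic", k, [vocab[w] for w in top])
```

Each row of `theta` is the topic mixture of one post, and each row of `phi` is one topic's word distribution, which is what the top-word listing prints.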
TOOLS
PROGRAMMING LANGUAGE : PYTHON (BACK-END)
NLP LIBRARIES : GENSIM, NLTK, SPACY
DATA PROCESSING : PANDAS, NUMPY
VISUALIZATION : MATPLOTLIB, SEABORN, PYLDAVIS
DATABASE : SQLITE OR MONGODB (OPTIONAL, FOR STORING LINKEDIN DATA)
FRAMEWORK : FLASK / DJANGO FOR WEB DEPLOYMENT
FRONT-END : HTML, CSS, REACTJS
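For the optional SQLite storage, a minimal sketch using Python's built-in `sqlite3` module could look like this. The table name `posts`, its columns, the sample rows, and the helper functions are assumptions made for illustration; the project's actual schema is not described in these slides.

```python
import sqlite3

def create_store(path=":memory:"):
    """Open a database and create the (hypothetical) posts table if missing."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS posts ("
        "id INTEGER PRIMARY KEY, posted_on TEXT, job_role TEXT, body TEXT)"
    )
    return conn

def add_posts(conn, rows):
    """Insert (posted_on, job_role, body) tuples using parameter binding."""
    conn.executemany(
        "INSERT INTO posts (posted_on, job_role, body) VALUES (?, ?, ?)", rows
    )
    conn.commit()

def posts_for_role(conn, job_role):
    """Return post bodies for one job role, oldest first."""
    cur = conn.execute(
        "SELECT body FROM posts WHERE job_role = ? ORDER BY posted_on",
        (job_role,),
    )
    return [row[0] for row in cur.fetchall()]

conn = create_store()
add_posts(conn, [
    ("2024-02-10", "data scientist", "Hiring for machine learning roles"),
    ("2024-01-05", "data scientist", "Python skills in demand"),
    ("2024-01-20", "cloud engineer", "Kubernetes adoption is growing"),
])
print(posts_for_role(conn, "data scientist"))
```

An in-memory database is used here so the sketch leaves no file behind; passing a file path to `create_store` would persist the data.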
OUTPUT
CONCLUSION
THIS PROJECT SUCCESSFULLY DEMONSTRATES THE APPLICATION OF LATENT
DIRICHLET ALLOCATION (LDA) IN ANALYSING LINKEDIN POST DATA TO
IDENTIFY EMERGING INDUSTRY TRENDS.
BY LEVERAGING NLP TECHNIQUES, THE SYSTEM EFFECTIVELY UNCOVERS KEY
THEMES, SKILL DEMANDS, AND WORKFORCE INSIGHTS, PROVIDING VALUABLE
INTELLIGENCE FOR PROFESSIONALS AND BUSINESSES.
THE RESULTS REVEAL HOW TRENDS EVOLVE OVER TIME, ENABLING DATA-
DRIVEN DECISION-MAKING IN VARIOUS SECTORS.
USING PYTHON LIBRARIES LIKE GENSIM, NLTK, AND PANDAS, THE PROJECT
TRANSFORMS VAST UNSTRUCTURED TEXT INTO ACTIONABLE INSIGHTS.