Text Classification using NLP

The project focuses on enhancing text processing and classification through language detection and LSTM-based models for improved accuracy. It outlines the implementation of language detection using libraries like LangDetect and TextBlob, and the architecture of LSTM for better handling of text sequences. Challenges include accuracy in language detection for code-mixed texts and the need for large datasets for LSTM training, with future plans to integrate transformer models and develop a real-time API.

Uploaded by

mitalimeshram4

Project Seminar on

Text Processing and Classification Using Natural Language Processing
in partial fulfillment of
VIII Semester Bachelor of Engineering (B.Tech)
in
Electronics Engineering

PROJECT PHASE-II (ENP 459)

Prutha Golar-72
Mitali Meshram-76
Ishan Sahare-113
Guided by
Dr. Deepali Kotambkar
Department of Electronics Engineering
Shri Ramdeobaba College of Engineering and Management,
Ramdeo Tekadi, Gittikhadan, Katol Road, Nagpur 440013, India.
Session 2024-25
Enhancements in Text Processing & Classification using NLP

Language Detection & LSTM-based Text Classification
Introduction

•Objective of Update:
• Enhance text processing by incorporating Language Detection.
• Improve text classification accuracy using LSTM (Long Short-Term Memory) models.

•Why These Updates?
• Multilingual text classification support.
• Improved model performance using deep learning.

Title of Project 3
Language Detection Feature
• What is Language Detection?
• Identifies the language of a given text before classification.

• Method Used:
• Implemented using LangDetect or TextBlob libraries.
• Model trained on a dataset containing multiple languages.

• How It Works:
• Input text → Tokenization → Character & Word Frequency Analysis → Predicted Language Output.

• Use Case:
• Helps in multilingual content classification.
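
The character-frequency step in the flow above can be sketched with the standard library alone. This is a minimal illustration, not the LangDetect or TextBlob implementation: the sample sentences in `SAMPLES`, the trigram size, the cosine-similarity scoring, and all function names are assumptions made for the example.

```python
from collections import Counter
import math

# Tiny hypothetical training samples; a real detector is built from large corpora.
SAMPLES = {
    "en": "the quick brown fox jumps over the lazy dog and this is a simple english sentence with common words",
    "es": "el rapido zorro marron salta sobre el perro perezoso y esta es una frase sencilla con palabras comunes",
    "fr": "le rapide renard brun saute par dessus le chien paresseux et ceci est une phrase simple avec des mots courants",
}

def char_ngrams(text, n=3):
    """Count overlapping character n-grams, padding with a space at both ends."""
    padded = f" {text.lower().strip()} "
    return Counter(padded[i:i + n] for i in range(len(padded) - n + 1))

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[gram] for gram, count in a.items() if gram in b)
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

# One trigram-frequency profile per language.
PROFILES = {lang: char_ngrams(sample) for lang, sample in SAMPLES.items()}

def detect_language(text):
    """Return the language whose trigram profile is most similar to the text."""
    query = char_ngrams(text)
    return max(PROFILES, key=lambda lang: cosine(query, PROFILES[lang]))

print(detect_language("this is the dog"))
print(detect_language("esta es una casa"))
```

With profiles this small the sketch is only reliable on text that resembles the samples; short or code-mixed inputs would need far larger profiles or a library-backed detector.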

LSTM-Based Text Classification
•Why LSTM for Text Classification?
• Captures long-term dependencies in text sequences.
• Handles context better than traditional models.

•LSTM Model Architecture:
• Embedding Layer → LSTM Layer → Dense Layer → Softmax Activation.

•Performance Improvement:
• Compared LSTM with traditional models (Logistic Regression, Naive Bayes, etc.).
• Achieved better classification accuracy and contextual understanding.
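
To make the Embedding → LSTM → Dense → Softmax chain concrete, the forward pass can be written out in plain Python. This is a sketch under stated assumptions, not the project's model: the layer sizes, the random untrained weights, and every name below are hypothetical, and a real system would use a deep learning framework and trained parameters.

```python
import math
import random

random.seed(0)  # fixed seed so the random (untrained) weights are reproducible

VOCAB, EMBED, HIDDEN, CLASSES = 10, 4, 3, 2  # hypothetical toy sizes

def rand_matrix(rows, cols):
    return [[random.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

def matvec(m, v):
    return [sum(w * x for w, x in zip(row, v)) for row in m]

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

embedding = rand_matrix(VOCAB, EMBED)  # one row per token id
# One weight matrix and bias per gate, applied to the concatenation [h_prev ; x_t].
gates = {g: (rand_matrix(HIDDEN, HIDDEN + EMBED), [0.0] * HIDDEN) for g in "ifoc"}
dense_w, dense_b = rand_matrix(CLASSES, HIDDEN), [0.0] * CLASSES

def lstm_step(x, h, c):
    """One LSTM time step: gate the old cell state and add new candidate content."""
    z = h + x  # list concatenation: [h_prev ; x_t]
    pre = {g: [a + b for a, b in zip(matvec(w, z), bias)] for g, (w, bias) in gates.items()}
    i = [sigmoid(v) for v in pre["i"]]    # input gate
    f = [sigmoid(v) for v in pre["f"]]    # forget gate
    o = [sigmoid(v) for v in pre["o"]]    # output gate
    cand = [math.tanh(v) for v in pre["c"]]  # candidate cell values
    c_new = [fv * cv + iv * gv for fv, cv, iv, gv in zip(f, c, i, cand)]
    h_new = [ov * math.tanh(cv) for ov, cv in zip(o, c_new)]
    return h_new, c_new

def softmax(logits):
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def classify(token_ids):
    """Embedding -> LSTM over the sequence -> Dense -> Softmax."""
    h, c = [0.0] * HIDDEN, [0.0] * HIDDEN
    for t in token_ids:
        h, c = lstm_step(embedding[t], h, c)
    logits = [a + b for a, b in zip(matvec(dense_w, h), dense_b)]
    return softmax(logits)

probs = classify([1, 4, 7, 2])
print(probs)
```

The cell state `c` is what carries information across time steps: the forget gate decides how much of it survives each step, which is the mechanism behind the long-term dependency claim above.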

Implementation Flow

1. User Inputs Text
2. Language Detection
3. Text Preprocessing (Tokenization, Stopword Removal, Stemming)
4. Feature Extraction (TF-IDF / Word Embeddings)
5. LSTM-based Classification
6. Predicted Category Output
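
Steps 3 and 4 of the flow above can be sketched as follows. This is an illustrative outline only: the stopword list, the suffix-stripping `naive_stem` helper, the example documents, and the plain `tf * log(N / df)` weighting are assumptions for the example, standing in for whatever stemmer and TF-IDF variant the project actually used.

```python
import math
import re
from collections import Counter

STOPWORDS = {"the", "is", "a", "an", "and", "of", "to", "in", "are"}  # hypothetical short list

def naive_stem(word):
    """Strip a few common English suffixes; a crude stand-in for a real stemmer."""
    for suffix in ("ing", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def preprocess(text):
    """Step 3: tokenize, drop stopwords, stem."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [naive_stem(t) for t in tokens if t not in STOPWORDS]

def tfidf(docs):
    """Step 4: one {term: weight} dict per document, using tf * log(N / df)."""
    tokenized = [preprocess(d) for d in docs]
    # Document frequency: in how many documents each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    n = len(docs)
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({t: (c / len(tokens)) * math.log(n / df[t]) for t, c in counts.items()})
    return vectors

docs = [
    "The team is winning the match",
    "The market is falling and investors are worried",
    "Investors watched the match",
]
vectors = tfidf(docs)
print(preprocess(docs[0]))
print(vectors[0])
```

TF-IDF vectors discard word order, so they suit bag-of-words classifiers; an LSTM consumes the token sequence itself, typically as token ids fed to an embedding layer.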

Challenges Faced

•Language Detection Accuracy:
• Handling code-mixed texts (e.g., Hinglish, Spanglish).
• Low-resource languages pose difficulties.

•LSTM Model Training:
• Requires a large dataset and high computational power.
• Overfitting mitigated using Dropout layers.

•Data Preprocessing for Multilingual Texts:
• Different stemming and tokenization techniques for each language.
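
The Dropout mitigation mentioned under LSTM Model Training can be illustrated with the common "inverted dropout" formulation. This is a minimal sketch, not the project's training code: the function name, the example activations, and the rate of 0.5 are assumptions chosen for the example.

```python
import random

def dropout(values, rate, training=True, rng=random):
    """Inverted dropout: during training, zero each value with probability `rate`
    and scale the survivors by 1 / (1 - rate); at inference, pass values through."""
    if not training or rate == 0.0:
        return list(values)
    keep = 1.0 - rate
    return [v / keep if rng.random() < keep else 0.0 for v in values]

rng = random.Random(42)  # seeded generator so the mask is reproducible
hidden = [0.5, -1.0, 2.0, 0.25]  # hypothetical layer activations
dropped = dropout(hidden, rate=0.5, rng=rng)
unchanged = dropout(hidden, rate=0.5, training=False)
print(dropped, unchanged)
```

Scaling the surviving activations keeps their expected value the same in training and inference, so no rescaling is needed when the model is used for prediction.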

Results and Future Scope
•Results:
1) Improved accuracy in text classification.
2) Expanded support for multilingual classification.

•Future Enhancements:
1) Incorporate Transformer-based models (BERT, GPT) for better contextual understanding.
2) Build a real-time API for language detection & classification.
3) Expand dataset coverage to more low-resource languages.
