Classifier
Shirley Cheng
Department of Computer Science
Stanford University
[email protected]
Abstract
Text classification plays a crucial role in natural language processing, enabling
applications such as text retrieval, processing, and recommendation systems. This
project focuses on the task of classifying the ideological leanings of comments on
Reddit, a domain characterized by short and often sparse text in which traditional
text classification methods struggle. Leveraging the Word2Vec model, I train
word embeddings on the Reddit Corpus. I use these word embeddings as input features and
experiment with different classification approaches, including a Support Vector Machine (SVM)
classifier, a Convolutional Neural Network (CNN) classifier, a Long Short-Term Memory (LSTM)
classifier, and a hybrid CNN-LSTM. I find that the hybrid CNN-LSTM architecture achieves the best
performance, with an accuracy of .662.
2 Introduction
Text classification is the task of assigning labels to textual data based on its content. It is relevant
to many other natural language processing tasks, including text retrieval, processing, and sentiment
analysis. Traditional text classification relies on techniques that extract features from text, such as
TF-IDF, bag of words, and n-grams. However, in the context of short text classification, these
techniques struggle due to sparsity and dimensionality issues, as well as their inability to capture
contextual dependencies.
Recent advances in deep learning and neural networks have been applied to text classification tasks
to address these problems. These approaches leverage the power of distributed representations and
hierarchical feature learning, leading to significant improvements in performance. For example,
word embedding methods such as Word2Vec learn distributed representations by mapping words to
dense vectors in a continuous vector space where semantically similar words are mapped to
proximate points. These embeddings address both the sparseness and dimensionality issues.
Additionally, they capture contextual and semantic similarities between words, providing a more
nuanced feature set for classification models.
I examine the problem of short text classification in the context of political ideology. I address the
task of classifying ideological leanings of Reddit posts and comments in political subreddits. With
rising political polarization, political discussion on Reddit is an important domain in which to examine
applications of NLP to the social sciences. Reddit provides a rich corpus for studying political
conversations, as subreddit communities form along shared political beliefs.
There are several problems associated with the task of classifying Reddit posts and comments by
political ideology. Notably, many Reddit posts and comments are sparse and short, often consisting
of just a few words or sentences. This limited amount of text can make it difficult for classification
models to extract meaningful features and accurately classify the content. Moreover, Reddit users
frequently rely on informal language and internet slang, which further complicates feature extraction.
3 Related Work
Mikolov et al. (2013) introduced the Word2Vec model, whose key idea is to learn distributed
representations of words in a continuous vector space. Word2Vec models are typically trained on
large corpora of text data, and during training they learn to map words to vectors in such a way that
similar words are represented by vectors that are close together in the vector space. Mikolov et al.
trained 300-dimensional word embeddings on the Google News dataset with a vocabulary of 3
million words (2013).
Zhang and Han showed that Word2Vec embeddings used in conjunction with an SVM can support
short text classification, in the context of classifying the subject category of microblogs (2020). The
SVM aims to find a hyperplane that best separates the data into different classes. Zhang and Han
trained an SVM classifier on mean word embeddings and obtained a baseline accuracy of .896.
Onan further explored how word embeddings can be used as input features to different deep neural
architectures for the task of sentiment analysis of Twitter product reviews (2020). Onan proposed a
CNN-LSTM architecture, which yielded an accuracy of .835 when trained on padded Word2Vec
embeddings. The author finds that the combined CNN-LSTM approach outperforms both a
standalone CNN and a standalone LSTM approach. These hybrid architectures leverage the strengths
of both CNNs and LSTMs, allowing them to capture both local and long-range dependencies in text data.
For the task of classifying political leanings of Reddit posts and comments, I replicate Zhang and
Han's approach of training an SVM classifier on mean word embeddings (2020). I also replicate
Onan's approach of building a hybrid CNN-LSTM classifier (2020). Reddit is an important domain in
which to apply these techniques, as Reddit posts and comments may exhibit even more extreme sparseness
than conventional classification tasks. For example, it is not uncommon for a Reddit comment to just
be "lol nice." Therefore, it is relevant to investigate how well current short text classification methods
perform in this domain.
4 Approach
4.1 Word2Vec Model
Word2Vec consists of two main architectures: Continuous Bag of Words (CBOW) and Skip-gram.
CBOW aims to predict the target word given its context words, while Skip-gram predicts context
words given a target word. Typically, CBOW is efficient and works well with frequent words in the
dataset, making it suitable for tasks where word order is less important. Skip-gram, on the other
hand, predicts the context words given a target word, capturing detailed semantic relationships
and contextual nuances. I train two different sets of word embeddings by using both architectures.
Because Reddit users may be more likely to use internet slang, I derive domain specific word
embeddings by training Word2Vec on political Reddit posts and comments. I obtain posts and
comments from the following subreddits
Figure 1: Word embeddings trained using CBOW
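As a rough illustration of this training step, the sketch below shows how the two embedding sets could be produced with gensim's Word2Vec implementation. The placeholder corpus and the hyperparameter values (300-dimensional vectors, window size, minimum count) are my assumptions, not the exact settings used in this project.

```python
# Minimal sketch of training CBOW and Skip-gram embeddings with gensim.
# `tokenized_comments` is a placeholder; in this project it would hold the
# tokenized, stemmed posts/comments from the political subreddits.
from gensim.models import Word2Vec

tokenized_comments = [
    ["gun", "control", "work"],
    ["lol", "nice"],
]

# sg=0 selects the CBOW architecture; sg=1 selects Skip-gram.
cbow_model = Word2Vec(sentences=tokenized_comments, vector_size=300,
                      window=5, min_count=1, sg=0, workers=4)
skipgram_model = Word2Vec(sentences=tokenized_comments, vector_size=300,
                          window=5, min_count=1, sg=1, workers=4)

# The learned embeddings can then be inspected, e.g. cbow_model.wv.most_similar("gun").
```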
4.2 Classification Models
I build four different classification models: SVM, CNN, LSTM, and CNN-LSTM.
SVM aims to find the hyperplane that best separates the data points of different classes. The
hyperplane is a decision boundary that maximizes the margin, which is the distance between the
hyperplane and the nearest data points of each class. I utilize a linear kernel when optimizing the
hyperplane between the two classes of data.
CNNs, originally developed for computer vision tasks, have been adapted for text classification by
treating words or character n-grams as spatial features. Convolutional layers capture local patterns
in the input data, making them effective at learning hierarchical representations of text. I utilize the
following CNN architecture.
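The exact layer configuration is given in the referenced figure; as a rough illustration only, a Keras CNN classifier of this kind, operating on the mean word embedding described in Section 5 and with layer sizes that are my assumptions, might look like the following sketch.

```python
# Illustrative CNN classifier over a mean word embedding treated as a 1-D signal.
# All filter counts and kernel sizes are assumptions, not the paper's values.
from tensorflow import keras
from tensorflow.keras import layers

EMBED_DIM = 300  # assumed embedding dimensionality

cnn = keras.Sequential([
    keras.Input(shape=(EMBED_DIM, 1)),
    layers.Conv1D(64, kernel_size=5, activation="relu"),
    layers.MaxPooling1D(pool_size=2),
    layers.Conv1D(32, kernel_size=3, activation="relu"),
    layers.GlobalMaxPooling1D(),
    layers.Dense(32, activation="relu"),
    layers.Dense(1, activation="sigmoid"),  # binary output: Democrat vs. Republican
])
```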
LSTMs are specialized variants of RNNs. RNNs are designed to process sequential data and are
well-suited for tasks involving variable-length inputs; however, traditional RNNs suffer from
the vanishing gradient problem, which limits their ability to capture long-range dependencies in text.
By incorporating mechanisms such as forget and input gates, LSTMs address the shortcomings of
traditional RNNs and are capable of learning long-term dependencies in sequential data. I utilize
the following LSTM architecture. In each LSTM layer, I apply input dropout and recurrent dropout
to encourage robustness. Input dropout prevents the model from relying too heavily on specific
input features while recurrent dropout prevents overfitting by introducing noise into the recurrent
connections.
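As with the CNN, the exact configuration is given in the referenced figure; the sketch below only illustrates how an LSTM classifier with input and recurrent dropout can be expressed in Keras, with the sequence length, layer width, and dropout rates as assumptions.

```python
# Illustrative LSTM classifier over padded sequences of word embeddings.
from tensorflow import keras
from tensorflow.keras import layers

MAX_LEN = 100    # assumed padded sequence length
EMBED_DIM = 300  # assumed embedding dimensionality

lstm_model = keras.Sequential([
    keras.Input(shape=(MAX_LEN, EMBED_DIM)),
    # `dropout` applies to the layer's inputs and `recurrent_dropout`
    # to the recurrent connections, as described above.
    layers.LSTM(64, dropout=0.2, recurrent_dropout=0.2),
    layers.Dense(1, activation="sigmoid"),
])
```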
A CNN-LSTM can effectively combine the separate advantages of the CNN and LSTM architectures to
detect both local patterns and long-range dependencies. Following Onan's work, which showed the
potential of a hybrid CNN-LSTM architecture, I utilize the CNN-LSTM architecture shown
below. I also apply input and recurrent dropout in the LSTM layer.
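As an illustration of the hybrid design (layer sizes and dropout rates are my assumptions, not the configuration shown in the figure), the convolution-and-pooling front end feeds its feature maps into an LSTM layer:

```python
# Illustrative CNN-LSTM hybrid: Conv1D + pooling capture local n-gram patterns,
# and the LSTM models longer-range dependencies across the pooled features.
from tensorflow import keras
from tensorflow.keras import layers

MAX_LEN = 100
EMBED_DIM = 300

cnn_lstm = keras.Sequential([
    keras.Input(shape=(MAX_LEN, EMBED_DIM)),
    layers.Conv1D(64, kernel_size=5, activation="relu"),
    layers.MaxPooling1D(pool_size=2),
    layers.LSTM(64, dropout=0.2, recurrent_dropout=0.2),
    layers.Dense(1, activation="sigmoid"),
])
```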
5 Experiments
5.1 Data
My classification task is to classify Reddit posts and comments by their political leaning. To create
the training dataset for the classification problem, I load posts and comments from r/democrats
and r/Republicans through ConvoKit. Because r/Republicans is more active than r/democrats, I
randomly sample 50,000 comments and posts from each subreddit to create an evenly distributed
dataset. Each training example is a tokenized and stemmed post/comment. A training example is
labelled as 0 if it was posted on r/democrats and 1 if it was posted on r/Republicans. I use a .8/.2
split to create training and test data.
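A minimal sketch of this dataset construction is shown below. The ConvoKit corpus names ("subreddit-democrats", "subreddit-Republicans") follow ConvoKit's subreddit naming scheme but are assumptions, as are the tokenizer and stemmer choices.

```python
# Illustrative dataset construction: load utterances with ConvoKit, tokenize
# and stem them, label by subreddit, and make an 80/20 train/test split.
import random
from convokit import Corpus, download
from nltk.stem import PorterStemmer
from nltk.tokenize import word_tokenize  # requires nltk.download("punkt")
from sklearn.model_selection import train_test_split

stemmer = PorterStemmer()

def load_examples(corpus_name, label, n=50_000):
    corpus = Corpus(filename=download(corpus_name))
    texts = [utt.text for utt in corpus.iter_utterances() if utt.text]
    texts = random.sample(texts, min(n, len(texts)))
    examples = [[stemmer.stem(tok) for tok in word_tokenize(t.lower())] for t in texts]
    return examples, [label] * len(examples)

dem_x, dem_y = load_examples("subreddit-democrats", 0)    # assumed corpus name
rep_x, rep_y = load_examples("subreddit-Republicans", 1)  # assumed corpus name

X_train, X_test, y_train, y_test = train_test_split(
    dem_x + rep_x, dem_y + rep_y, test_size=0.2, random_state=42)
```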
In order to evaluate the performance of various classification methods, I compare their accuracy rates
using
P = Number of Correct Assignments / Total Number of Assignments
Accuracy is a commonly reported performance metric; using it therefore allows for comparison
against the baselines established in prior work.
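Equivalently, accuracy can be computed with scikit-learn; the labels below are made up purely for illustration.

```python
from sklearn.metrics import accuracy_score

y_true = [0, 1, 1, 0]
y_pred = [0, 1, 0, 0]
print(accuracy_score(y_true, y_pred))  # 3 correct out of 4 assignments -> 0.75
```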
Utilizing the word embeddings I trained through Word2Vec, I obtain a mean word embedding for
each training example, which I feed into the SVM and CNN classifiers. I also derive sequential word
embeddings, which I feed into the LSTM and CNN-LSTM classifiers.
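As a sketch of these two feature representations (reusing the `cbow_model` and the assumed `MAX_LEN` from the earlier sketches):

```python
# Mean embedding (for SVM/CNN) and padded sequence of embeddings (for LSTM/CNN-LSTM).
import numpy as np

MAX_LEN = 100
EMBED_DIM = cbow_model.wv.vector_size

def mean_embedding(tokens):
    vecs = [cbow_model.wv[t] for t in tokens if t in cbow_model.wv]
    return np.mean(vecs, axis=0) if vecs else np.zeros(EMBED_DIM)

def sequence_embedding(tokens):
    vecs = [cbow_model.wv[t] for t in tokens[:MAX_LEN] if t in cbow_model.wv]
    padded = np.zeros((MAX_LEN, EMBED_DIM), dtype=np.float32)
    if vecs:
        padded[:len(vecs)] = np.asarray(vecs)
    return padded
```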
I leverage scikit-learn's SVM classifier, which is based on hinge loss, for the problem of binary
classification. I set the regularization parameter C=.1, which balances the trade-off between achieving
a low error on the training data and minimizing the norm of the weights of the decision function.
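A sketch of this baseline, assuming the linear kernel from Section 4 and the mean-embedding features above (whether scikit-learn's SVC or LinearSVC was used is not stated, so this is illustrative):

```python
# Illustrative SVM baseline on mean word embeddings with a linear kernel and C=0.1.
import numpy as np
from sklearn.svm import SVC

svm = SVC(kernel="linear", C=0.1)
X_train_mean = np.stack([mean_embedding(x) for x in X_train])
X_test_mean = np.stack([mean_embedding(x) for x in X_test])
svm.fit(X_train_mean, y_train)
print("SVM accuracy:", svm.score(X_test_mean, y_test))
```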
I implement the CNN, LSTM, and CNN-LSTM models within a TensorFlow/Keras framework.
For all models, I use a binary cross-entropy loss.
loss = −(y log(p) + (1 − y) log(1 − p))
I initialize all models with a learning rate of .001 and use the Adam optimizer to speed up
convergence. Training is performed over 30 epochs with a batch size of 64.
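A sketch of this training configuration, applied to the models from Section 4 (the sequential feature array is assumed to be built with the `sequence_embedding` helper above):

```python
# Compile the Keras models with binary cross-entropy and Adam (learning rate 0.001),
# then train for 30 epochs with a batch size of 64.
import numpy as np
from tensorflow.keras.optimizers import Adam

for model in (cnn, lstm_model, cnn_lstm):
    model.compile(optimizer=Adam(learning_rate=0.001),
                  loss="binary_crossentropy",
                  metrics=["accuracy"])

X_train_seq = np.stack([sequence_embedding(x) for x in X_train])
history = cnn_lstm.fit(X_train_seq, np.array(y_train),
                       epochs=30, batch_size=64)
```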
5.4 Results
All my models performed considerably worse than the baselines established by Zhang and Han and by
Onan, who were able to achieve accuracy rates in the high .80s. A few factors influence this result.
First, to reduce memory usage, I trained word embeddings on only six subreddits. In contrast,
Hofmann et al. (2022) defined 605 subreddits that compose the Reddit Politosphere. Thus, my
word embeddings may be of low quality due to insufficient training data. Additionally, the low accuracy
rates also highlight the inherent difficulty of classifying political sentiment based on a single Reddit
comment. For example, one comment appearing on r/democrats reads, "Hahaha!! Not quite, but I'll
take it." There is no indication from this comment that it expresses an explicit liberal leaning. Taking
into account that many comments do not express explicit political leanings, achieving
an accuracy rate of .662 with CNN-LSTM represents reasonable performance.
In general, excluding the CNN model, CBOW embeddings outperformed Skip-gram embeddings. Because CBOW
focuses on target word prediction, CBOW embeddings capture better representations of frequent
words. Since political discussions often focus on certain key issues (for example, "Trump" is a very
common keyword), my task was able to leverage CBOW's advantage. On the other hand, Skip-gram
performs better when capturing fine-grained semantic relationships. However, since my task involves
capturing broader political sentiments, it was not able to leverage Skip-gram's fine-grained advantage.
6 Analysis
Due to the large number of comments that do not indicate explicit political ideology, my models were
only able to obtain moderate accuracy rates. For example, one comment from the preprocessed
dataset simply reads:
’sens’
This comment was posted on r/democrats, but due to its incoherence, all four models
classified it as Republican.
In contrast, the classifiers performed well on comments that expressed political sentiment. Another
comment reads:
Even though this comment is short, all four models successfully assigned it its true label,
r/Republicans.
Additionally, despite the large number of incoherent comments, CNN-LSTM with CBOW word
embeddings was still able to achieve a reasonable accuracy of .662. I show that the CNN-LSTM is
able to effectively leverage the separate advantages of the CNN and LSTM to obtain better performance.
For example, both CNN and CNN-LSTM were able to correctly classify the following comment as
Republican, while LSTM incorrectly classified it as Democrat.
’dunno’, ’lot’, ’appeal’, ’ego’
As another example, both the LSTM and CNN-LSTM were able to correctly classify the following
comment as Democrat, while the CNN incorrectly classified it as Republican.
’obtain’, ’gun’, ’legal’, ’proof’, ’gun’, ’control’, ’work’, ’bar’, ’bui’, ’gun’, ’legal’, ’obtain’
In this case, the CNN-LSTM and LSTM classifiers were both able to capture the long-range
dependency between guns and gun control.
7 Conclusion
This project investigated the application of deep neural methods to the task of short text classification
in the context of classifying Reddit posts and comments by political leaning. Despite problems of
extreme sparseness in Reddit comments, a hybrid CNN-LSTM architecture was able to achieve
reasonable performance, with an accuracy of .662. I showed that the CNN-LSTM architecture is able to
leverage the separate advantages of the CNN and LSTM models to achieve superior performance.
In the context of Reddit posts and comments, most of the problems associated with classifying short
texts remain in feature extraction. When comments have sufficient context, deep neural methods
perform well. However, when a comment is only a single word long, there is simply insufficient
information to determine its political leaning. Therefore, a major next step for this project is to
augment my training data with more features. For example, single-word comments on Reddit
are typically in response to a post or a larger thread. Therefore, I can augment my training data by
incorporating the surrounding thread that a comment appears in as features. Additionally, I can also
improve the quality of the word embeddings by scaling training to include the entire Reddit Politosphere.
Altogether, deep neural networks demonstrate potential even when dealing with datasets as sparse
as Reddit comments. To improve text classification of political discussions, the main steps going
forward involve enriching the input features to these deep neural networks in order to improve
feature representation.
References
ConvoKit Developers. (2023). Reddit Corpus (by subreddit). Cornell University. Retrieved from
https://convokit.cornell.edu/documentation/subreddit.html
Hofmann, V., Schütze, H., Pierrehumbert, J. B. (2022). The Reddit Politosphere: A
Large-Scale Text and Network Resource of Online Political Discourse [Data set]. Zenodo.
https://doi.org/10.5281/zenodo.58517293
Mikolov, T., Sutskever, I., Chen, K., Corrado, G., Dean, J. (2013). Distributed representations of
words and phrases and their compositionality. arXiv preprint arXiv:1310.4546.
Onan, A. (2020). Sentiment analysis on product reviews based on weighted word embeddings
and deep neural networks. Concurrency and Computation: Practice and Experience, e5909.
https://doi.org/10.1002/cpe.5909
Zhang, R., & Han, Y. (2020). Research on Short Text Classification Based on Word2Vec Microblog.
2020 International Conference on Computer Science and Management Technology (ICCSMT),
Shanghai, China, pp. 178-182. doi: 10.1109/ICCSMT51754.2020.00042.
https://ieeexplore.ieee.org/document/944400