Research Proposal

1. The research proposal aims to improve Twitter sentiment analysis and bug prediction in code changes using deep convolutional neural networks and feature selection techniques. 2. The methodology involves creating a corpus of labeled code changes, extracting features, performing feature selection using filter and wrapper methods to reduce features, training a classification model on the reduced features, and using the trained classifier to predict whether new changes contain bugs. 3. The project will be conducted over 12 months, involving tasks such as literature review, methodology development, implementation, testing, and submission. A budget of 115,000 is requested for books, equipment, travel, and services.

Uploaded by

Neha Mule
Copyright: © All Rights Reserved

RESEARCH PROPOSAL

Project Title: Deep Convolution Neural Networks for Twitter Sentiment Analysis

Introduction: Twitter, with over 319 million monthly active users, has become a goldmine for
organizations that have a strong political, social, and economic interest in maintaining and
enhancing their clout and reputation. Sentiment analysis gives these organizations the ability to
survey various social media sites in real time. Text sentiment analysis is an automatic process for
determining whether a text segment contains objective or opinionated content, and it can
furthermore determine the text’s sentiment polarity. The goal of Twitter sentiment classification is
to automatically determine whether a tweet's sentiment polarity is negative or positive.

Literature Review:

1. S. Kim, E. Whitehead Jr., and Y. Zhang, “Classifying Software Changes: Clean or Buggy?”
IEEE Trans. Software Eng., vol. 34, no. 2, pp. 181-196, Mar./Apr. 2008.

This paper describes the change classification approach and the techniques for extracting features
from source code and change histories. It characterizes the performance of change classification
across 12 open source projects and evaluates the predictive power of different groups of features.
It also introduces a new bug prediction technique that works at the granularity of an individual
file-level change and has accuracy comparable to the best existing bug prediction techniques in
the literature (78 percent on average).

2. K. Gao, T. Khoshgoftaar, H. Wang, and N. Seliya, “Choosing Software Metrics for Defect
Prediction: An Investigation on Feature Selection Techniques,” Software: Practice and Experience,
vol. 41, no. 5, pp. 579-606, 2011.

Defect prediction models help project managers make better use of valuable project resources for
software quality improvement. A fault-proneness prediction model is only as effective and useful
as the quality of the software measurement data it is built on. The authors study seven
filter-based feature ranking techniques and three filter-based subset selection search algorithms.
The search space for the subset selection algorithms is reduced to make the proposed approach more
practical to apply.

3. V. Challagulla, F. Bastani, I. Yen, and R. Paul, “Empirical Assessment of Machine Learning
Based Software Defect Prediction Techniques,” Proc. IEEE 10th Int’l Workshop Object-Oriented
Real-Time Dependable Systems, pp. 263-270, 2005.

The authors evaluate different predictor models on four real-time software defect data sets. The
results show that a combination of 1R and Instance-based Learning, along with the Consistency-based
Subset Evaluation technique, gives more consistent prediction accuracy than the other models. The
results also show that “size” and “complexity” metrics are not sufficient for accurately predicting
real-time software defects.

Objectives:

1. To select features using multiple techniques.

2. To improve bug prediction in code changes by reducing the number of features.

METHODOLOGY

Creating Corpus
1. File-level change deltas are extracted from the revision history of a project, as stored in its
SCM repository.
2. The bug fix changes for each file are identified by examining keywords in SCM change log
messages.
3. The bug-introducing and clean changes at the file level are identified by tracing backward in the
revision history from bug fix changes.
4. Features are extracted from all changes, both buggy and clean. Features include all terms in the
complete source code, the lines modified in each change (delta), and change meta-data such as
author and change time. Complexity metrics, if available, are computed at this step. At the end, a
project-specific corpus has been created: a set of labeled changes with a set of features associated
with each change.
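
As an illustration of steps 2 and 4, the following is a minimal Python sketch, not the proposal's actual implementation. The keyword pattern, the function names is_bug_fix and extract_features, and the "term:", "author:" and "hour:" feature prefixes are all assumptions made for this example; the proposal does not say which keywords or feature encoding are used. Tracing backward to bug-introducing changes (step 3) needs access to the SCM history and is not covered here.

```python
import re
from collections import Counter

# Hypothetical keyword list: the proposal does not specify which keywords are used.
BUG_FIX_PATTERN = re.compile(
    r"\b(fix(?:e[sd])?|bugs?|defects?|patch(?:e[sd])?)\b", re.IGNORECASE
)

def is_bug_fix(log_message: str) -> bool:
    """Step 2: flag a change as a bug fix if its log message contains a keyword."""
    return BUG_FIX_PATTERN.search(log_message) is not None

def extract_features(delta: str, author: str, hour: int) -> Counter:
    """Step 4: turn one file-level change into a bag of features.

    Terms come from the modified lines (delta); author and change hour
    are added as meta-data features.
    """
    tokens = re.findall(r"[A-Za-z_][A-Za-z_0-9]*", delta)
    features = Counter(f"term:{token}" for token in tokens)
    features[f"author:{author}"] = 1
    features[f"hour:{hour}"] = 1
    return features
```

A call such as extract_features("if (x == null) return x;", "alice", 14) yields term counts for if, x, null and return, plus one author feature and one hour feature.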

Feature Selection
5. Perform a feature selection process that employs a combination of wrapper and filter methods to
compute a reduced set of features. The filter methods used are the Gain Ratio, Chi-Squared,
Significance, and Relief-F feature rankers. The wrapper methods are based on the Naive Bayes and
SVM classifiers. Feature selection is performed iteratively until one feature is left. At the end,
there is a reduced feature set that performs optimally for the chosen classifier metric.
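
The filter side of step 5 can be sketched as follows, using the Chi-Squared ranker as one example. This is a simplified sketch under stated assumptions: each change is represented as a set of feature names, labels are 1 (buggy) or 0 (clean), and the feature list is halved in each iteration. The halving schedule and the names chi_squared, rank_features and select_best_subset are choices made for this example and are not specified in the proposal. The evaluate argument stands for whatever classifier metric is chosen.

```python
def chi_squared(present, labels):
    """Chi-squared statistic of a binary feature against binary labels (2x2 table)."""
    n = len(labels)
    a = sum(1 for f, y in zip(present, labels) if f and y)          # present, buggy
    b = sum(1 for f, y in zip(present, labels) if f and not y)      # present, clean
    c = sum(1 for f, y in zip(present, labels) if not f and y)      # absent, buggy
    d = n - a - b - c                                               # absent, clean
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0
    return n * (a * d - b * c) ** 2 / denom

def rank_features(changes, labels):
    """Rank all features by chi-squared score, best first (ties broken by name)."""
    names = sorted(set().union(*changes))
    scores = {
        name: chi_squared([name in change for change in changes], labels)
        for name in names
    }
    return sorted(names, key=lambda name: (-scores[name], name))

def select_best_subset(ranked, evaluate):
    """Shrink the ranked list until one feature is left; keep the best-scoring subset.

    evaluate is a caller-supplied function mapping a feature list to a score.
    Halving per iteration is an assumption of this sketch.
    """
    current = list(ranked)
    best_subset, best_score = list(current), evaluate(current)
    while len(current) > 1:
        current = current[: max(1, len(current) // 2)]
        score = evaluate(current)
        if score > best_score:
            best_subset, best_score = list(current), score
    return best_subset, best_score
```

A wrapper method follows the same loop, with evaluate training and scoring the target classifier (Naive Bayes or SVM) on each candidate subset.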

Classification
6. Using the reduced feature set, a classification model is trained.
7. Once a classifier has been trained, it is ready to use. New code changes can now be fed to the
classifier, which determines whether a new change is more similar to a buggy change or a clean
change. Classification is performed at the code-change level, using file-level change deltas as
input to the classifier.
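
Steps 6 and 7 can be illustrated with a small Naive Bayes classifier, one of the two classifier families named above. This is a minimal sketch and not the proposal's implementation: the class name NaiveBayesChangeClassifier, the representation of a change as a set of feature names, the "buggy" and "clean" label strings, and the use of Laplace smoothing are assumptions made for this example.

```python
import math

class NaiveBayesChangeClassifier:
    """Bernoulli-style Naive Bayes over a fixed, reduced set of binary features."""

    def __init__(self, features):
        self.features = list(features)
        self.log_prior = {}
        self.log_prob = {}  # (label, feature) -> (log P(present), log P(absent))

    def fit(self, changes, labels):
        """Step 6: estimate per-class feature probabilities from labeled changes."""
        for label in set(labels):
            rows = [c for c, y in zip(changes, labels) if y == label]
            self.log_prior[label] = math.log(len(rows) / len(labels))
            for feature in self.features:
                count = sum(1 for c in rows if feature in c)
                p = (count + 1) / (len(rows) + 2)  # Laplace smoothing
                self.log_prob[(label, feature)] = (math.log(p), math.log(1 - p))
        return self

    def predict(self, change):
        """Step 7: return the label whose training changes the new change resembles most."""
        def score(label):
            total = self.log_prior[label]
            for feature in self.features:
                present, absent = self.log_prob[(label, feature)]
                total += present if feature in change else absent
            return total
        return max(self.log_prior, key=score)
```

Features outside the reduced set are ignored at prediction time, so a new change only needs to be checked against the features that survived selection.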

Project Plan:

Month & Year      Task
June 2018         Information gathering
July 2018         Survey
August 2018       Problem definition and finalization
September 2018    Literature Survey
October 2018      Methodology algorithm
November 2018     Modelling and organizing tools/techniques
December 2018     Implementation
January 2019      Testing and implementation
February 2019     Verification
March 2019        Final Method
April 2019        Submission

Financial Assistance Required:

Sr. No.   Item                       Estimated Expenditure
1.        Books & Journals           15,000
2.        Equipment                  50,000
3.        Field Work & Travel        10,000
4.        Chemicals and Glassware    -
5.        Contingency                30,000
6.        Hiring Services            10,000
          Total Amount               115,000

Books & Journals:

B. Efron and R. Tibshirani, An Introduction to the Bootstrap - 14,000
E. Alpaydin, Introduction to Machine Learning - 1,000
