Exemplary RPF
1. Name:
2. Section:
6. Research done so far and possible future plan of research (on the first and second choice).
Please include – Academic rationale/justification, Scope and Method of your proposed
investigation, Possible resources required.
a. English Language and Literature A:
Prior research has analyzed the language used by Donald J. Trump and Hillary Clinton in
interactions such as speeches and debates leading up to the 2016 American Presidential election,
and how this language helped them further their political goals and ideology and influence public
opinion. Research has also analyzed how the language used by several other politicians in the past
helped them propagate their values and influence public opinion. Examples include ‘Trump’s and
Clinton’s Style and Rhetoric during the 2016 Presidential Election’ by Jacques Savoy, ‘US Election
Analysis 2016: Media, Voters and the Campaign’ by Lilleker et al., ‘The Rhetoric of Blame and
Bluster: An Analysis of How Donald Trump Uses Language to Advance His Political Goals’ by Saku
Korhonen, ‘Negative Affective Language in Politics’ by Stephen M. Utych, and ‘Social Media
Analysis and Public Opinion: The 2010 UK General Election’ by Anstead et al.
Research has also examined the ways in which language used on social media platforms, including
Twitter, can help politicians influence opinion and propagate particular values and goals, also with
reference to the 2016 US Presidential election, as noted in ‘Twitter Language Use Reflects
Psychological Differences between Democrats and Republicans’ by Sylwester et al., ‘Gendered
campaign tweets: The cases of Hillary Clinton and Donald Trump’ by Lee et al., ‘Introduction: Social
Media, Political Marketing and the 2016 U.S. Election’ by Christine B. Williams, and ‘Sentiment
analysis of tweets for the 2016 US presidential election’ by Joyce et al.
Moreover, research has presented an appropriate methodology and theoretical framework for
social media language analysis in a political context, as seen in ‘Social Media and Political
Communication - A Social Media Analytics Framework’ by Stieglitz et al.
As part of my future plan of research, I intend to analyze a chosen set of tweets posted in 2016 from
the accounts of Donald J. Trump (@realDonaldTrump) and Hillary Clinton (@HillaryClinton), and
show how these tweets reflected a wide range of their values and ideologies and were effective in
influencing American public opinion leading up to the 2016 US Presidential election. I plan to
analyze not only the language and the rhetorical, stylistic, and formal devices present in the text of
the tweets, but also the language and graphosemantic features observed in any images or
photographs attached to the tweets, in the temporal, geographical, and socio-political context of
the 2016 US Presidential election. I shall look at how the tweets reflected the candidates’ ideology
and values including, but not limited to, their views on each other, gender, race, the state of their
country, optimism for the future, and the American public. I also intend to analyze and evaluate
the ways in which the tweets by Donald Trump and Hillary Clinton reflected their respective
political goals leading up to the election, and how effectively they achieved their purpose of
persuading their audience to agree with each candidate’s value system and to vote for that
candidate.
The academic rationale behind my research is that in recent years, social media has had an
increasing impact on the ways in which politicians and other personalities are able to propagate
their ideology to the public and on how the public perceives such personalities and their values. The
use of social media by politicians and related personalities has therefore drastically increased in the
past decade, and can now be seen as a powerful means of influencing public opinion and achieving
political and other goals, with Twitter representing an especially preferred platform for politicians to
communicate their views. Hence, it is important and valuable that the language and linguistic
features in tweets in a significant and widely covered socio-political context like the 2016 American
Presidential election be analyzed and their effect on influencing public opinion evaluated, to better
understand how politicians can wield social media platforms like Twitter to advance their political
goals in the current technological paradigm. Analyzing the effect of different linguistic techniques
and devices on the writer’s purpose and on public opinion can also identify which techniques or
devices are relatively more effective, which adds to the value of this research. Because the use of
social media platforms to achieve political goals has risen only recently, research on this topic is
not yet extensive, and thus my research can also help further the boundaries of knowledge in this
domain and be considered original.
The scope of my research includes identifying the values, ideology, and political goals of Donald
Trump and Hillary Clinton, analyzing the use of stylistic, formal, rhetorical, and graphosemantic
features in their tweets in 2016, and evaluating the effect of these tweets and the language used on
influencing public opinion. The methodology of my research shall include the theoretical
framework of critical political discourse analysis, visual textual analysis methods such as the Big 5
Method of Analysis, and sentiment analysis of the selected tweets, the last of which can be
performed using a computer program which I have previously created.
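The sentiment analysis step named in the methodology above could, in its simplest form, be sketched as a lexicon-based scorer. This is only an illustrative sketch and not the program referred to in the proposal: the word lists `POSITIVE` and `NEGATIVE`, the function `sentiment_score`, and the sample texts are all hypothetical, and the sample texts are invented placeholders rather than real tweets.

```python
import re

# Hypothetical mini-lexicon for illustration only; a real study would rely on
# a published sentiment lexicon or a trained classifier.
POSITIVE = {"great", "strong", "win", "proud", "hope"}
NEGATIVE = {"failing", "weak", "disaster", "sad", "terrible"}


def sentiment_score(text):
    """Return (positive words - negative words) / total words, in [-1, 1]."""
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return 0.0
    positive = sum(word in POSITIVE for word in words)
    negative = sum(word in NEGATIVE for word in words)
    return (positive - negative) / len(words)


# Invented placeholder texts, not real tweets.
samples = [
    "We will win and make our country strong",
    "A total disaster. Sad!",
]
for sample in samples:
    print(round(sentiment_score(sample), 3), sample)
```

A score above zero suggests predominantly positive wording and a score below zero predominantly negative wording. Such a count ignores negation, sarcasm, and context, so it would complement the discourse analysis rather than replace it.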
The possible resources I shall require include a set of tweets from the Twitter accounts of Donald
Trump and Hillary Clinton, and a computer capable of performing sentiment analysis of these
tweets, both of which I already have access to.
I am extremely passionate about researching the use of language in the media and in political
contexts, and hence writing an Extended Essay on such a topic is of great interest to me. I have been
consistently achieving 7s in Language and Literature and am in the process of completing an online
course on ‘Language and Composition’ offered by Northwestern University. I therefore believe I am
sufficiently proficient to write an Extended Essay in English Language and Literature A, and I am
deeply committed to this subject. Furthermore, I regularly followed the 2016 US Presidential
election and am thus aware of the socio-political context of my research question. I also thoroughly
enjoy analyzing digital texts and visual media, as I frequently write and design digital and visual
texts myself. This research question additionally gives me an opportunity to perform computational
sentiment analysis, which aligns with my interests, and I therefore look forward to researching this
topic in greater depth.
RRS Evidence:
b. Mathematics:
As seen in ‘Support-Vector Networks’ by Vapnik and Cortes and ‘Support Vector Machines for
Classification’ by Awad et al., prior research has proposed the idea of support vector machines
(SVMs) and presented the mathematics behind how this algorithm works for classification
problems, including the hyperplane used in an SVM and the constraints and objectives for its
position in n-dimensional space.
Research has also been done on using kernel functions in SVMs for linear and non-linear decision
boundaries, on the mathematical reasoning behind how they work, on the various types of kernel
functions (such as linear, Gaussian, polynomial, and chi-square kernels), and on the applications of
SVMs with kernels in specific domains, as observed in ‘Support Vector Machines - Kernels and
the Kernel Trick’ by Martin Hofmann, in ‘Kernel methods and support vector machines for
handwriting recognition’ by Ahmad et al., and in ‘Effect of Various Kernels and Feature Selection
Methods on SVM Performance for Detecting Email Spams’ by Trivedi and Dey.
In my future plan of research, I first intend to use the mathematics presented in prior literature on
how SVMs work, and on the constraints and objectives for the position of a hyperplane, to derive
an objective function whose optimization fixes the position of the hyperplane in an SVM, so that
the hyperplane acts as a large margin classifier separating the mapped data. This maximizes the
margins between the hyperplane and the positive and negative data points in n-dimensional space,
thus increasing the robustness and classification accuracy of the SVM. In order to achieve this, I
shall present this problem as a quadratic optimization problem, examine the dual form of the
problem, and then utilize ideas from linear algebra and calculus, including Lagrange multipliers,
partial derivatives, vector dot products and inner products, and the Karush-Kuhn-Tucker
conditions.
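As a sketch of the derivation outlined above, the standard hard-margin formulation can be written as follows. The notation is an assumption introduced here for illustration and does not appear in the proposal: training points $x_i \in \mathbb{R}^n$ with labels $y_i \in \{-1, +1\}$, hyperplane $w \cdot x + b = 0$, and Lagrange multipliers $\alpha_i$.

```latex
% Primal problem: the margin width is 2 / ||w||, so maximizing the margin
% is equivalent to minimizing ||w||^2 / 2 under the separation constraints.
\min_{w,\,b} \; \tfrac{1}{2}\lVert w \rVert^{2}
\quad \text{subject to} \quad y_i\,(w \cdot x_i + b) \ge 1, \qquad i = 1, \dots, m

% Lagrangian with multipliers \alpha_i \ge 0
L(w, b, \alpha) = \tfrac{1}{2}\lVert w \rVert^{2}
  - \sum_{i=1}^{m} \alpha_i \bigl[\, y_i\,(w \cdot x_i + b) - 1 \,\bigr]

% Stationarity: set the partial derivatives to zero
\frac{\partial L}{\partial w} = 0 \;\Rightarrow\; w = \sum_{i=1}^{m} \alpha_i y_i x_i,
\qquad
\frac{\partial L}{\partial b} = 0 \;\Rightarrow\; \sum_{i=1}^{m} \alpha_i y_i = 0

% Dual problem: depends on the data only through inner products x_i . x_j
\max_{\alpha} \; \sum_{i=1}^{m} \alpha_i
  - \tfrac{1}{2} \sum_{i=1}^{m} \sum_{j=1}^{m} \alpha_i \alpha_j\, y_i y_j\, (x_i \cdot x_j)
\quad \text{subject to} \quad \alpha_i \ge 0, \quad \sum_{i=1}^{m} \alpha_i y_i = 0

% Karush-Kuhn-Tucker complementary slackness
\alpha_i \bigl[\, y_i\,(w \cdot x_i + b) - 1 \,\bigr] = 0, \qquad i = 1, \dots, m
```

Complementary slackness implies that only points lying exactly on the margin (the support vectors) can have $\alpha_i > 0$. Because the dual involves the data only through inner products, a kernel function can later be substituted for $x_i \cdot x_j$.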
Following this, I plan to use my solution to the optimization problem above to arrive at a cost
function for SVMs that, when minimized, maximizes the margins of the hyperplane. Next, I shall
discuss kernel functions using Mercer’s theorem, Gram matrices, and eigenvectors, before
critically comparing and evaluating Gaussian, linear, polynomial, and chi-square kernels and their
properties with regard to their effect on the position of the hyperplane in n-dimensional space and
on the accuracy of SVMs, using mathematical reasoning and analysis. To support my mathematical
analysis, I also plan to write a computer program that trains a number of SVMs, one for each type
of kernel function discussed, on a breast cancer diagnosis classification dataset. I shall visualize
their hyperplanes using graphs, test their accuracies on the specific application of breast cancer
diagnosis, and analyze the numerical and graphical data collected. Finally, I plan to present how
kernel functions can be customized and learnt for a specific dataset, by solving a convex
optimization problem, to better optimize the hyperplane position of an SVM.
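The four kernel types named above, together with the Gram matrix used in Mercer-style arguments, could be sketched in plain Python as follows. The parameter names (`gamma`, `degree`, `coef0`), the default values, and the toy data points are assumptions made for illustration. The chi-square kernel is written in one common exponential form that assumes non-negative features; other variants exist.

```python
import math


def linear_kernel(x, y):
    """Plain inner product x . y."""
    return sum(a * b for a, b in zip(x, y))


def polynomial_kernel(x, y, degree=3, coef0=1.0):
    """(x . y + coef0) ** degree."""
    return (linear_kernel(x, y) + coef0) ** degree


def gaussian_kernel(x, y, gamma=0.5):
    """exp(-gamma * ||x - y||^2)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))


def chi_square_kernel(x, y, gamma=1.0):
    """exp(-gamma * sum((x_i - y_i)^2 / (x_i + y_i))), non-negative features."""
    total = sum((a - b) ** 2 / (a + b) for a, b in zip(x, y) if a + b > 0)
    return math.exp(-gamma * total)


def gram_matrix(kernel, points):
    """Matrix K with K[i][j] = kernel(points[i], points[j])."""
    return [[kernel(p, q) for q in points] for p in points]


def quadratic_form(K, z):
    """z^T K z; non-negative for every z when K is positive semidefinite."""
    n = len(z)
    return sum(z[i] * K[i][j] * z[j] for i in range(n) for j in range(n))


# Made-up toy data, not the breast cancer dataset.
points = [[1.0, 2.0], [2.0, 0.5], [0.0, 1.0]]
kernels = [
    ("linear", linear_kernel),
    ("polynomial", polynomial_kernel),
    ("gaussian", gaussian_kernel),
    ("chi-square", chi_square_kernel),
]
for name, kernel in kernels:
    K = gram_matrix(kernel, points)
    print(name, [[round(value, 3) for value in row] for row in K])
```

A valid (Mercer) kernel yields a symmetric Gram matrix whose quadratic form is non-negative for every vector, which is the finite-sample property that an eigenvalue analysis of the Gram matrix would check. Spot-checking a few vectors, as `quadratic_form` allows, can only disprove this property and does not prove it.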
The academic rationale behind my research is that due to an increase in data and computational
power in recent years, SVMs and other machine learning algorithms are being increasingly used for
a range of classification and regression problems, and therefore, it is important and valuable to
derive mathematical methods of optimizing the position of the hyperplane in SVMs, in order to
maximize the algorithm’s accuracy in various problems. Several kernel functions have also been
proposed in recent years, as shown in the research papers cited above. It is therefore valuable to
critically compare, contrast, and evaluate the properties of each kernel function, and to find ways
to customize kernels, in order to identify the kernel function that best optimizes the hyperplane
position and maximizes SVM accuracy, with reference to a significant classification problem such
as breast cancer diagnosis. My plan for research involves going beyond the knowledge that already
exists regarding SVMs and optimizing hyperplane position, and hence can be considered valuable.
The scope of my research includes discussions and analyses about prior research and findings
related to support vector machines and constraints and objectives for the derivation of Lagrangian
and cost functions to optimize the hyperplane position for large margin classification, discussions of
different types of kernel functions and their comparison and evaluation using mathematical and
computational analysis in the breast cancer diagnosis problem, and the customization of kernel
functions. The methodology of my research shall include proof by deduction, various methods of
optimization such as using Lagrange multipliers, second derivatives, and quadratic programming,
derivations of functions from a set of constraints and objectives, empirical testing of various
algorithms using a computer program, and numerical and graphical analysis and reasoning of the
empirical data collected.
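One common form of the cost function mentioned in the scope above is the regularized hinge loss of the soft-margin SVM, sketched here for illustration. The symbols are assumptions introduced for this sketch: $w$ and $b$ define the hyperplane, $(x_i, y_i)$ with $y_i \in \{-1, +1\}$ are the training points, and $C > 0$ is a trade-off parameter. The function actually derived in the essay may differ.

```latex
% Soft-margin cost function: the first term rewards a wide margin
% (margin width = 2 / ||w||); the second penalizes points that fall
% inside the margin or on the wrong side of the hyperplane.
J(w, b) = \tfrac{1}{2}\lVert w \rVert^{2}
  + C \sum_{i=1}^{m} \max\bigl(0,\; 1 - y_i\,(w \cdot x_i + b)\bigr)
```

When the data are linearly separable and $C$ is taken very large, minimizing $J$ approaches the hard-margin problem, in which every point satisfies $y_i\,(w \cdot x_i + b) \ge 1$ and only the margin term remains.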
The possible resources required for my research include research papers and lectures available
online regarding support vector machines, kernel functions, and concepts of linear algebra and
calculus, a dataset available online for breast cancer diagnosis, and a computer capable of running
the SVM algorithm, all of which I already have free access to.
I am passionate about this field and specifically this research question, and hence eagerly desire to
work on researching this topic for my Extended Essay. I am highly experienced and knowledgeable
about the mathematics behind algorithms such as the Support Vector Machine, and about ideas from
calculus and linear algebra, which I will regularly utilize in my essay. I have completed over 8
MOOCs (online specialization courses) related to machine learning algorithms and the
mathematics (linear algebra and calculus) behind them, offered on websites such as Coursera and
edX by reputable universities including Stanford University. I also attended a summer program
last year at the Georgia Institute of Technology, where I studied this domain in great depth and
received a 5/5 professor recommendation. I am, and have always been, committed to this subject.
RRS Evidence:
7. Annotated bibliography of the first choice (explain how the cited works helped you to
conceptualize your topic). There should be reference to at least one academic journal/research
monograph in your bibliography.
Ahmad, A. R., M. Khalid, and R. Yusof. “Kernel Methods and Support Vector Machines for
Handwriting Recognition.” Student Conference on Research and Development, Shah Alam,
Malaysia, 2002, pp. 309-312. : Helped me review the different types of kernel functions, and
helped me conceptualize how I could perform empirical testing of the kernel functions for a
particular application using SVMs as part of my research.
Hofmann, Martin. “Support Vector Machines - Kernels and the Kernel Trick.” 26 June
2006. : Provided me with information on how kernel functions work to optimize the hyperplane
position in SVMs, by transforming the mapped data into a higher-dimensional space before plotting
a new decision boundary in this higher-dimensional space. Introduced me to kernel functions,
which helped me include an analysis and evaluation of kernels as part of my topic and research
plan.
Kim, Seung-Jean. “Learning the Kernel via Convex Optimization.” Stanford University. :
Introduced me to the idea of learning a kernel function as a convex optimization problem, which I
shall further expand on in my research to find a way to customize the kernel to a specific dataset
using some of the methods presented in this paper.
Trivedi, Shrawan Kumar, and Shubhamoy Dey. “Effect of Various Kernels and Feature
Selection Methods on SVM Performance for Detecting Email Spams.” International Journal
of Computer Applications, vol. 66, no. 21, 21 Mar. 2013. : This paper, published in the
International Journal of Computer Applications, showed me the possible effects of using different
kernel functions on the accuracy of SVMs in a specific classification problem. Showed me a
possible methodology I can utilize to perform empirical testing and analysis of the performance of
SVMs using different kernel methods.
Vapnik, Vladimir, and Corinna Cortes. “Support-Vector Networks.” Machine Learning, vol.
20, no. 3, 1995, pp. 273-297. : This was the original paper which first presented the idea
of Support Vector Machines. It provided me with knowledge regarding the mathematics behind
how SVMs work and regarding the different parameters which can affect the hyperplane position.
This paper presented to me the constraints and objectives placed upon the hyperplane position. This
source helped me conceptualize my topic as it gave me sufficient background knowledge regarding
SVMs, and will help provide the basic constraints and objective statements which I can utilize to
derive Lagrangian and cost functions to optimize the hyperplane position for large margin
classification, during my research process.