NLP Tutorial
Last Updated : 04 Aug, 2022
Machine comprehension is a very interesting but challenging task in both Natural Language Processing (NLP) and artificial intelligence (AI) research. There are several approaches to natural language processing tasks. With recent breakthroughs in deep learning algorithms, hardware, and user-friendly APIs like TensorFlow, some tasks have become feasible with reasonable accuracy. This article covers TensorFlow implementations of various deep learning models, with a focus on problems in natural language processing. Its purpose is to help machines understand the meaning of sentences, which improves the efficiency of machine translation, and to let users interact with computing systems to obtain useful information from them.
Our ability to evaluate the relationship between sentences is essential for tackling a
variety of natural language challenges, such as text summarization, information
extraction, and machine translation. This challenge is formalized as the natural
language inference task of Recognizing Textual Entailment (RTE), which involves
classifying the relationship between two sentences as one of entailment, contradiction,
or neutrality. For instance, the premise “Garfield is a cat” naturally entails the statement “Garfield has paws”, contradicts the statement “Garfield is a German Shepherd”, and is neutral with respect to the statement “Garfield enjoys sleeping”.
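Concretely, RTE can be framed as three-way classification over (premise, hypothesis) pairs. The snippet below is a minimal sketch of that framing using the Garfield examples above; the label names follow the standard entailment/contradiction/neutral scheme used by datasets such as SNLI.

```python
# RTE framed as three-way classification over (premise, hypothesis) pairs.
# The three labels are the standard scheme used by datasets such as SNLI.
examples = [
    ("Garfield is a cat", "Garfield has paws",             "entailment"),
    ("Garfield is a cat", "Garfield is a German Shepherd", "contradiction"),
    ("Garfield is a cat", "Garfield enjoys sleeping",      "neutral"),
]

for premise, hypothesis, label in examples:
    print(f"premise={premise!r}  hypothesis={hypothesis!r}  ->  {label}")
```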
Natural language processing is the ability of a computer program to understand human language as it is spoken. NLP is a component of artificial intelligence that deals with the interaction between computers and human languages, in particular how to process and analyze large amounts of natural language data. NLP systems can perform several different tasks, including:
- Answering questions (what Siri, Alexa, and Cortana can do).
- Sentiment analysis (determining whether an attitude is positive, negative, or neutral).
- Image-to-text mapping (creating captions for an input image).
- Machine translation (translating text into different languages).
- Speech recognition.
- Part-of-speech (POS) tagging (see the sketch after this list).
- Entity identification.
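As a quick, concrete illustration of one of these tasks, here is a minimal sketch of POS tagging. It uses the NLTK library; the article itself does not prescribe a toolkit, so NLTK is just one convenient choice here.

```python
# A minimal sketch of POS tagging with NLTK (NLTK is an assumption;
# the article does not specify which toolkit to use for this task).
import nltk

nltk.download("punkt", quiet=True)                       # tokenizer models
nltk.download("averaged_perceptron_tagger", quiet=True)  # POS tagger model

tokens = nltk.word_tokenize("Garfield enjoys sleeping on the couch")
print(nltk.pos_tag(tokens))
# e.g. [('Garfield', 'NNP'), ('enjoys', 'VBZ'), ('sleeping', 'VBG'), ...]
```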
The traditional approach to NLP involved a lot of domain knowledge of linguistics
itself.
Deep learning, at its most basic level, is all about representation learning. In convolutional neural networks (CNNs), compositions of different filters are used to classify objects into categories. Taking a similar approach, this article builds representations of words from large datasets.
Word Vectors:
Words need to be represented as input to machine learning models, and one mathematical way to do this is with vectors. There are an estimated 13 million words in the English language, but many of them are related.
The goal is to find an N-dimensional vector space (where N << 13 million) that is sufficient to encode all the semantics of our language. To do this, there needs to be an understanding of the similarities and differences between words. The concept of vectors and the distances between them (cosine, Euclidean, etc.) can be exploited to quantify how similar or different two words are.
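For example, cosine similarity measures the angle between two vectors, so word vectors that point in similar directions score close to 1. Here is a minimal NumPy sketch (the vectors are made-up toy values, not trained embeddings):

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine of the angle between vectors a and b: 1.0 = same direction."""
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

# Toy 3-dimensional "word vectors" (illustrative values only).
hotel = np.array([0.9, 0.1, 0.4])
motel = np.array([0.8, 0.2, 0.5])
cat   = np.array([0.1, 0.9, 0.1])

print(cosine_similarity(hotel, motel))  # high: ~0.98
print(cosine_similarity(hotel, cat))    # low:  ~0.24
```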
If a separate vector is used for each of the 13+ million words in the English vocabulary, several problems occur. First, the vectors are huge, consisting almost entirely of zeroes with a single one (whose position identifies the word). This is known as one-hot encoding. Second, when searching for a phrase such as “hotels in New Jersey” in Google, we expect results pertaining to “motel”, “lodging”, and “accommodation” in New Jersey as well, but with one-hot encoding these words have no natural notion of similarity. Ideally, the dot product (since we are dealing with vectors) of vectors for synonymous or similar words would be high, reflecting their relatedness.
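The following sketch makes that limitation concrete: with one-hot vectors, the dot product between any two distinct words is exactly zero, so “hotel” is no closer to “motel” than to any other word.

```python
import numpy as np

vocab = ["hotel", "motel", "cat"]  # toy vocabulary (13M+ words in practice)

def one_hot(word):
    """Vector of zeros with a single 1 at the word's vocabulary index."""
    v = np.zeros(len(vocab))
    v[vocab.index(word)] = 1.0
    return v

# Every pair of distinct one-hot vectors is orthogonal:
print(np.dot(one_hot("hotel"), one_hot("motel")))  # 0.0 -- no similarity
print(np.dot(one_hot("hotel"), one_hot("cat")))    # 0.0 -- same score
```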
Word2vec is a group of models which helps derive the relations between a word and its contextual words. Beginning with small, random initializations of the word vectors, the predictive model learns the vectors by minimizing a loss function. In Word2vec, this is done with a feed-forward neural network and optimization techniques such as stochastic gradient descent (SGD). There are also count-based models, which build a co-occurrence matrix of the words in the corpus: a large matrix with a row for each word and a column for each context. The number of contexts is of course large, since it is essentially combinatorial in size. To overcome this size issue, singular value decomposition (SVD) can be applied to the matrix, reducing its dimensions while retaining maximum information.
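The count-based approach can be sketched in a few lines of NumPy: build a word-by-word co-occurrence matrix over a toy corpus, then apply truncated SVD to obtain dense, low-dimensional word vectors. This is a minimal illustration of the idea, not the article's TensorFlow implementation.

```python
import numpy as np

corpus = [["i", "like", "nlp"],
          ["i", "like", "deep", "learning"],
          ["i", "enjoy", "flying"]]
vocab = sorted({w for sent in corpus for w in sent})
index = {w: i for i, w in enumerate(vocab)}

# Symmetric co-occurrence counts within a context window of 1.
M = np.zeros((len(vocab), len(vocab)))
for sent in corpus:
    for i, w in enumerate(sent):
        for j in range(max(0, i - 1), min(len(sent), i + 2)):
            if i != j:
                M[index[w], index[sent[j]]] += 1

# Truncated SVD: keep the top-k singular vectors as k-dimensional embeddings.
U, s, Vt = np.linalg.svd(M)
k = 2
word_vectors = U[:, :k] * s[:k]   # each row is a dense word vector
print(dict(zip(vocab, word_vectors.round(2))))
```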
Software and Hardware:
The programming language used is Python 3.5.2, with Intel Optimization for TensorFlow as the framework. For training and computation, the Intel AI DevCloud, powered by Intel Xeon Scalable processors, was used. For the right application and use case, the Intel AI DevCloud can provide a significant performance bump over the host CPU, thanks to its 50+ cores and its own memory, interconnect, and operating system.
Conclusion:
As shown, NLP provides a wide set of techniques and tools that can be applied in many areas of life. By learning these models and using them in everyday interactions, quality of life can improve considerably. NLP techniques help improve communication, reach goals, and improve the outcomes of every interaction. NLP puts tools and techniques that already exist within people's reach; learned properly, they help people achieve goals and overcome obstacles.
In the future, NLP will move beyond both statistical and rule-based systems toward a natural understanding of language. Some improvements have already been made by tech giants. For example, Facebook has tried to use deep learning to understand text without parsing, tags, named-entity recognition (NER), etc., and Google is trying to convert language into mathematical expressions. Endpoint detection using grid long short-term memory (LSTM) networks and end-to-end memory networks on the bAbI tasks, performed by Google and Facebook respectively, show the advances that can still be made in NLP models.