Meeting Insights Summarisation Using Speech Recognition
ISSN No:-2456-2165
Dr. P. Supraja3
Associate Professor, Department of Networking and Communications, SRM Institute of Science and Technology,
Chennai, Tamil Nadu, India
Abstract:- Speech is the strongest mode of discourse through which people express their emotions and ideas in numerous languages. Speech-based authorization has varied applications, as it provides a hassle-free procedure that does not require physical contact, unlike fingerprint authorization. Speech summarisation methods take human speech as input and produce a condensed form as spoken or written language. Such systems offer a variety of applications spanning from computer technology to medical care, including improving language libraries and reducing clinical paperwork load. Every language has its unique collection of spoken features. Even among speakers of the same language, speed and accent differ from individual to individual, which can make the conveyed message difficult for some people to comprehend. Meetings are an important part of every organisation's operation, regardless of whether they take place online or in person. Meeting transcription and summarisation, on the other hand, are typically unwelcome tasks because they require time-consuming manual work. This project aims to extract insights from meetings, such as the number of times each person spoke as an indicator of their level of input, and to summarise the meeting for all the employees involved, identifying their insights through the words they spoke.

Keywords:- Speech Recognition, Speech Summarization, Speech Pre-Processing, Spacy, Gensim.

I. INTRODUCTION

Speech is a highly powerful mode of communication through which humans express their thoughts and feelings in numerous languages. Each language has its unique set of linguistic qualities. Even when people speak the same language, speed and accent vary from person to person, which makes it difficult for some listeners to comprehend the conveyed message. Long speeches can be difficult to follow at times owing to factors such as differing pronunciation and pace. Speech recognition, a cross-disciplinary field within computational linguistics, contributes to the advancement of technology that allows for the recognition and translation of speech into text. Text summarisation pulls the most significant information from a text-based source and offers an effective summary of it.

Speech summarisation is the process of condensing human speech into a more concise and manageable form. It aims to produce a summary that is suited to a specific task. The summary should be more coherent than a direct transcription of speech, as it eliminates common disfluencies such as pauses, repairs, and repetitions. The recent interest in speech summarisation is driven by improvements in the precision of speech recognition systems, the quality of audio capture, and the rising use of natural language as a computer interface.

The process of speech summarisation involves several technological components, such as automated speech recognition (ASR), which translates voice into written form, and summarisation modules, which condense the key parts of the transcription. Users can use web-based voice APIs to capture audio and submit it to a speech recognition web service for processing.

Speech summarisation has a range of real-world applications, such as summarising broadcast news, podcasts, clinical conversations, and meetings. It presents a challenge in speech understanding research and can be achieved through extractive or abstractive summarisation techniques. Extractive summarisation preserves the original wording and is typically more fluent, while abstractive summarisation is more concise and flexible. In either case, the summary of speech ought to be more intelligible than a straight transcript.

Meetings are a common and important part of business operations. They provide opportunities for team members to collaborate, exchange ideas, and make decisions. However, meetings can also be time-consuming and distracting, making it difficult for attendees to retain key information and insights. To address this challenge, speech recognition and summarisation technology has gained attention as a way to process meeting content efficiently and effectively.
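Two of the ideas raised in this introduction, counting how often each participant speaks and extractive summarisation of a transcript, can be illustrated with a minimal sketch. It is not the system described in this paper: it assumes the meeting has already been transcribed into (speaker, utterance) pairs, and the function names, the small stop-word list, and the word-frequency scoring rule are all illustrative assumptions.

```python
import re
from collections import Counter

# Small illustrative stop-word list (an assumption, not a standard resource).
STOPWORDS = {"the", "a", "an", "is", "are", "on", "in", "of", "to",
             "from", "before", "and", "or", "it", "we", "for"}


def count_turns(transcript):
    """Count speaking turns per participant.

    transcript is a list of (speaker, utterance) pairs.
    """
    return Counter(speaker for speaker, _ in transcript)


def content_words(text):
    """Lower-case word tokens with stop words removed."""
    return [w for w in re.findall(r"[a-z']+", text.lower())
            if w not in STOPWORDS]


def split_sentences(text):
    """Naive sentence splitter: break after '.', '!' or '?'."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def extractive_summary(text, max_sentences=2):
    """Keep the sentences whose content words are most frequent overall.

    Each sentence is scored by the mean document frequency of its content
    words; the top sentences are returned in their original order.
    """
    sentences = split_sentences(text)
    freq = Counter(content_words(text))

    def score(sentence):
        words = content_words(sentence)
        return sum(freq[w] for w in words) / len(words) if words else 0.0

    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(sentences[i]), reverse=True)
    return [sentences[i] for i in sorted(ranked[:max_sentences])]


if __name__ == "__main__":
    meeting = [
        ("Asha", "The budget review is due on Friday."),
        ("Ravi", "Um, okay."),
        ("Asha", "The budget needs approval from finance before the review."),
    ]
    print(count_turns(meeting))
    print(extractive_summary(" ".join(u for _, u in meeting)))
```

Because the summary reuses whole sentences from the transcript, this is an extractive approach; an abstractive system would instead generate new sentences.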
Spacy's primary characteristics include the following:

Language support: Spacy handles a number of languages, including Spanish, German, English, Dutch, French, Italian, and others.

Pre-trained models: Spacy offers models that have been trained for many languages, which can be loaded with only a few lines of code. These pipelines can be used in a variety of NLP tasks, including dependency parsing, part-of-speech tagging, named entity recognition, and others.

Tokenization: Spacy uses rule-based tokenization to split text into individual words and punctuation marks. It can handle a range of languages and can also split contractions and hyphenated compound words.

Part-of-speech tagging: Spacy can automatically assign parts of speech, such as noun, verb, adjective, or adverb, to every token in a sentence. This is helpful for a variety of uses, including sentiment analysis and text categorization.

Text classification: Spacy includes a trainable text classification component that can be applied to tasks such as sentiment analysis and topic labelling. It can be trained on custom datasets to create more accurate models for specific use cases.

Customization: Spacy provides a range of tools for customising and training models on specific tasks or domains. This allows developers to create more accurate models for specific use cases and can help improve performance on specific datasets.

Overall, Spacy is a powerful and flexible library for natural language processing in Python. Its speed and flexibility make it suitable for production use, and its range of features and customisation options make it an appealing choice for researchers and engineers developing a wide variety of NLP applications.
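Two of the Spacy features listed above, tokenization and part-of-speech tagging, can be sketched as follows. The function names are illustrative. The tokenization example uses a blank English pipeline, so it needs no trained model, while the tagging example assumes the small English pipeline `en_core_web_sm` has been installed separately (for example with `python -m spacy download en_core_web_sm`).

```python
import spacy


def tokenize(text):
    """Split text into tokens with Spacy's rule-based English tokenizer."""
    # A blank pipeline contains only the tokenizer; no trained model is needed.
    nlp = spacy.blank("en")
    return [token.text for token in nlp(text)]


def tag_parts_of_speech(text, model_name="en_core_web_sm"):
    """Return (token, coarse part-of-speech tag) pairs.

    Assumes the named trained pipeline is already installed; loading it
    raises an error otherwise.
    """
    nlp = spacy.load(model_name)
    return [(token.text, token.pos_) for token in nlp(text)]


if __name__ == "__main__":
    # The contraction "don't" is split into two tokens.
    print(tokenize("We don't agree."))
```

In a meeting-summarisation setting, the token list would typically feed later steps such as stop-word removal and sentence scoring, while part-of-speech tags can help select content-bearing words such as nouns and verbs.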