Computers and Linguistics

PRACTICE

1. Speech Synthesis
Analyze the pronunciation of the given words and provide rules for a TTS system.
a. call, cab, cake, cone, cob, cinder, city, cell, cent, cello
b. zoo, boo, moon, spoon, food, room, good, stood, book
c. tough, rough, plough, enough, cough, bough
d. mould, could, would, should
e. bone, home, rode, stove, dove, love, done, move
Here's a breakdown of the pronunciation rules and irregular words:
Pronunciation Rules:
a. Letter <c>: Pronounce <c> as /k/ except when followed by <e>, <i>, or <y>, in
which case pronounce it as /s/.
b. Vowels:
 Pronounce <oo> as /uː/ (zoo, moon, food) except in "good," "stood," and
"book," where it is the short /ʊ/.
 Pronounce <ough> as /ʌf/ (tough, rough, enough); "cough" ends in /ɒf/, while
"plough" and "bough" end in /aʊ/.
 Pronounce <ould> as /ʊd/ with a silent <l> (could, would, should) except in
"mould," where it is pronounced /məʊld/.
 Pronounce <o> + consonant + silent <e> as /oʊ/ (bone, home, rode, stove)
except in "dove" and "love" (/ʌv/), "done" (/ʌn/), and "move" (/uːv/).
Irregular Words
 a. "cello" (pronounced /ˈtʃɛloʊ/, not */ˈsɛloʊ/)
 b. "good," "stood," "book"
 c. "cough," "plough," "bough"
 d. "mould"
 e. "dove," "love," "done," "move"
TTS Handling:
1. Lexicon: Maintain a lexicon or dictionary that stores the correct
pronunciation for irregular words. When the system encounters an
irregular word, it can refer to the lexicon to retrieve the correct
pronunciation.
2. Phonetic Transcription: Represent the correct pronunciation of irregular
words using a phonetic transcription system like the International
Phonetic Alphabet (IPA).
3. Machine Learning: Train a machine learning model on a large dataset of
words and their correct pronunciations.
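The lexicon-plus-rules strategy above can be sketched in code. This is a minimal, illustrative grapheme-to-phoneme sketch, not a real TTS front end: the lexicon holds one irregular word and the rule component implements only the <c> rule from exercise (a), passing other letters through unchanged.

```python
# Minimal sketch of TTS pronunciation handling: consult an exception
# lexicon first; otherwise fall back to letter-to-sound rules.
# Only the <c> rule from exercise (a) is implemented here.

# Exception lexicon mapping irregular words to IPA pronunciations.
LEXICON = {
    "cello": "ˈtʃɛloʊ",
}

def c_rule(word):
    """Apply the <c> rule: /s/ before <e>, <i>, <y>; /k/ otherwise."""
    phones = []
    for i, ch in enumerate(word):
        if ch == "c":
            nxt = word[i + 1] if i + 1 < len(word) else ""
            phones.append("s" if nxt in "eiy" else "k")
        else:
            phones.append(ch)  # other letters pass through unchanged
    return "".join(phones)

def pronounce(word):
    """Lexicon lookup first, then fall back to the rules."""
    word = word.lower()
    return LEXICON.get(word) or c_rule(word)

print(pronounce("city"))   # <c> before <i> -> "sity"
print(pronounce("cake"))   # <c> before <a> -> "kake"
print(pronounce("cello"))  # irregular word, served from the lexicon
```

A production system would use full phonetic transcriptions and many more rules, but the lookup order (lexicon before rules) is the key design point.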
2. Automatic Speech Recognition
Choose one of your favorite ASR systems on your mobile phone, such as Siri
(iOS) or Google Assistant (Android).
Then try speaking to your ASR, for example:
1. Start with basic commands. For example:
“What’s the weather today?”
“What date is it today?”
2. Choose words or phrases that you think might be problematic. For
example:
Words with multiple meanings (e.g., "lead" vs. "led").
Words with difficult sounds (e.g., "squirrel," "rural").
3. Try speaking in different accents or using regional pronunciations. For
instance:
Use a British accent versus an American accent.
Try regional dialects, accents, or slang.

Some types of words or phrases may cause confusion, influenced by factors such as:


1. Homophones
2. Technical Terms
3. Names
4. Contextual Phrases
5. Sound-Alike Words
6. Accent Variations
3. Corpus Linguistics
Analyzing a corpus of English literary texts, including the rationale for the chosen
order.
1. Part-of-Speech Tagging
Part-of-speech tagging involves labeling each word in a text with its
corresponding part of speech, such as noun, verb, adjective, or adverb.
Example:
In the sentence "The quick brown fox jumps over the lazy dog," tagging would
label "The" (determiner), "quick" (adjective), "brown" (adjective), "fox" (noun),
"jumps" (verb), "over" (preposition), "the" (determiner), "lazy" (adjective), and
"dog" (noun).
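The example above can be reproduced with a toy dictionary-based tagger. This is deliberately simplified: a real system would use a trained statistical tagger (such as NLTK's), but the input/output shape is the same.

```python
# Toy dictionary-based POS tagger for the example sentence. Unknown
# words receive the placeholder tag "X"; a real tagger would predict
# tags from context instead of looking them up.
TAGS = {
    "the": "DET", "quick": "ADJ", "brown": "ADJ", "fox": "NOUN",
    "jumps": "VERB", "over": "PREP", "lazy": "ADJ", "dog": "NOUN",
}

def pos_tag(sentence):
    """Return (token, tag) pairs for a whitespace-tokenized sentence."""
    return [(tok, TAGS.get(tok.lower(), "X"))
            for tok in sentence.rstrip(".").split()]

tagged = pos_tag("The quick brown fox jumps over the lazy dog")
print(tagged)  # [('The', 'DET'), ('quick', 'ADJ'), ...]
```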
2. Identifying Subjects, Direct Objects, and Indirect Objects
This step analyzes the grammatical structure of sentences to identify key syntactic
components: subjects (who or what the sentence is about), direct objects (who or
what receives the action), and indirect objects (to whom or for whom the
action is performed).
Example:
In the sentence "The teacher gave the students homework," "The teacher" is the
subject, "homework" is the direct object, and "the students" is the indirect object.
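A rule-of-thumb extractor for ditransitive sentences of the shape "NP VERB NP NP" can recover these roles from tagged tokens: the first noun phrase is the subject, the one right after the verb is the indirect object, and the last is the direct object. This heuristic is only a sketch; real parsers assign dependency labels (e.g. nsubj, iobj, dobj) instead.

```python
# Heuristic role extractor over POS-tagged tokens for sentences of the
# shape "NP VERB NP NP", e.g. "The teacher gave the students homework".
def extract_roles(tagged):
    """tagged: list of (word, tag) pairs; returns a role dictionary."""
    phrases, current = [], []
    verb = None
    for word, tag in tagged:
        if tag in ("DET", "ADJ"):
            current.append(word)           # modifiers join the phrase
        elif tag == "NOUN":
            current.append(word)
            phrases.append(" ".join(current))  # the noun closes the phrase
            current = []
        elif tag == "VERB":
            verb = word
    return {
        "subject": phrases[0],
        "verb": verb,
        "indirect_object": phrases[1],
        "direct_object": phrases[2],
    }

tagged = [("The", "DET"), ("teacher", "NOUN"), ("gave", "VERB"),
          ("the", "DET"), ("students", "NOUN"), ("homework", "NOUN")]
print(extract_roles(tagged))
```

Running this on the example sentence reproduces the roles given above: "The teacher" (subject), "the students" (indirect object), "homework" (direct object).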
3. Building Syntactic Trees
A syntactic tree is a visual representation of the grammatical structure of a
sentence. It shows how words group together into phrases and how those phrases
relate to one another.
Example:
For the sentence "Tree structures are very easy," the syntactic tree would show
"Tree structures" as a noun phrase and "are very easy" as a verb phrase, with
"very" as a degree modifier and "easy" as an adjective.
[S [NP Tree structures] [VP [V are] [AdjP [Deg very] [Adj easy]]]]
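One simple way to represent such a tree in code is as nested (label, children) tuples, with plain strings as the leaf words. The sketch below builds the tree for the example sentence and walks it to recover the words in order.

```python
# A syntactic tree for "Tree structures are very easy" as nested
# (label, *children) tuples; leaves are plain word strings.
tree = ("S",
        ("NP", "Tree", "structures"),
        ("VP", ("V", "are"),
               ("AdjP", ("Deg", "very"), ("Adj", "easy"))))

def leaves(node):
    """Collect the words at the leaves, left to right."""
    if isinstance(node, str):
        return [node]
    words = []
    for child in node[1:]:  # node[0] is the phrase label, skip it
        words.extend(leaves(child))
    return words

print(" ".join(leaves(tree)))  # reconstructs the original sentence
```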
4. Producing Word Roots
Producing word roots involves reducing words to their base or root form, a
process known as lemmatization. Lemmatization typically considers the context
and part of speech of a word to derive the correct root.
Example:
The words "running," "ran," and "runs" would all be reduced to the root "run."
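A minimal lemmatizer follows the same pattern as the TTS lexicon: irregular forms come from an exception table, regular forms from suffix-stripping rules. The rules below are deliberately toy-sized and tuned only to the "run" example; real lemmatizers (e.g. WordNet-based ones) also use part-of-speech information.

```python
# Toy lemmatizer: exception table first, then crude suffix stripping.
IRREGULAR = {"ran": "run"}

def lemmatize(word):
    word = word.lower()
    if word in IRREGULAR:
        return IRREGULAR[word]          # irregular past tense, etc.
    if word.endswith("ning") and len(word) > 5:
        return word[:-4]                # "running" -> "run" (drop doubled n + ing)
    if word.endswith("ing"):
        return word[:-3]
    if word.endswith("s") and not word.endswith("ss"):
        return word[:-1]                # "runs" -> "run"
    return word

print([lemmatize(w) for w in ["running", "ran", "runs"]])
# all three reduce to "run"
```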

Summary of the Order


1. Start with Part-of-Speech Tagging: This provides the essential context for each
word.
2. Identify Subjects and Objects: Using the tagged information allows for more
accurate identification of grammatical roles.
3. Build Syntactic Trees: The previously identified components facilitate a clearer
structure of the sentence.
4. Produce Word Roots: Finalizing the analysis with roots allows for broader
linguistic insights across the corpus.
By following this order, each step builds logically upon the last, ensuring a
comprehensive understanding of the corpus and its grammatical structures.
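The ordering above can be made concrete as a pipeline in which each stage consumes the previous stage's output. The stages here are small stand-ins for the fuller implementations discussed in steps 1-4 (tree building is omitted for brevity); the point is the data flow, not the linguistics.

```python
# Sketch of the pipeline order: tag -> identify roles -> produce roots.
def tag(tokens):
    # Stage 1: attach a (toy) part-of-speech label to each token.
    lex = {"the": "DET", "teacher": "NOUN", "gave": "VERB",
           "students": "NOUN", "homework": "NOUN"}
    return [(t, lex.get(t.lower(), "X")) for t in tokens]

def roles(tagged):
    # Stage 2: use the tags to pick out the nouns and the verb.
    return {"nouns": [w for w, t in tagged if t == "NOUN"],
            "verb": next(w for w, t in tagged if t == "VERB")}

def root(word):
    # Stage 4: crude suffix stripping standing in for lemmatization.
    return word[:-1] if word.endswith("s") else word

tokens = "The teacher gave the students homework".split()
analysis = roles(tag(tokens))
analysis["nouns"] = [root(n) for n in analysis["nouns"]]
print(analysis)
# {'nouns': ['teacher', 'student', 'homework'], 'verb': 'gave'}
```

Because stage 2 reads the tags produced by stage 1, and stage 4 operates on the words stage 2 selected, reordering the stages would break the pipeline, which is exactly the rationale for the chosen order.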
Conclusion:
The integration of language and computers has significantly advanced fields such
as linguistics, natural language processing, and computational linguistics. This
practice enables the analysis, understanding, and generation of human language
by machines, impacting everything from automated translation to speech
recognition.
Strengths
Efficiency: Automated systems can process and analyze large volumes of text
much faster than humans, making it easier to extract information and identify
patterns.
Consistency: Computers apply the same rules and algorithms consistently,
reducing the variability that might occur in human analysis.
Accessibility: Language technologies facilitate communication for diverse
populations, including those with disabilities, through tools like speech-to-text
and text-to-speech systems.
Innovative Applications: Advances in machine learning and AI lead to
innovative applications, such as chatbots, sentiment analysis, and real-time
translation.
Weaknesses
Context Sensitivity: Computers often struggle with understanding context,
idioms, and nuances of language that can lead to misinterpretations.
Data Dependence: The performance of language models heavily relies on the
quality and diversity of training data, which can introduce biases or limitations.
Complexity of Human Language: Language is inherently complex and variable,
making it challenging for algorithms to capture all linguistic subtleties and
regional variations.
Resource Intensive: Developing and maintaining advanced language processing
systems can be resource-intensive, requiring significant computational power and
expertise.
Overall, while the practice of integrating language and computers brings
numerous benefits and innovations, it also presents challenges that require
ongoing research and refinement to improve accuracy and inclusivity.
