Summary of Language Processing

This document summarizes various topics related to language processing, including psycholinguistics, human language processing, speech perception and comprehension, lexical access, syntactic processing, speech production, and computational linguistics. It discusses how spoken language is produced and understood by the human mind, involving both bottom-up and top-down processing. It also covers how computational models can analyze and synthesize human speech and language at the phonetic, lexical, morphological, syntactic and semantic levels.

Uploaded by

Lia Aftanty
Copyright
© © All Rights Reserved

LIA AFTANTY / 21202241042 / PBI B

SUMMARY OF LANGUAGE PROCESSING

A. PSYCHOLINGUISTICS
Psycholinguistics is the field of linguistic study in which researchers investigate the
psychological processes involved in the use of language, including language comprehension,
language (speech or sign) production, and first and second language acquisition.
B. THE HUMAN MIND AT WORK: HUMAN LANGUAGE PROCESSING
Speaking and comprehending speech can be viewed as a speech chain, a kind of
brain-to-brain linking.
A spoken utterance starts as a message in the speaker’s brain/mind. It is put into
linguistic form and interpreted as articulation commands, emerging as an acoustic signal.
The signal is processed by the listener’s ear and sent to their brain/mind, where it is
interpreted.
This means that language processing is more than grammar alone. The study of linguistic
performance tries to detail the psychological mechanisms that work with the grammar to
permit language production and comprehension.

C. COMPREHENSION
One of the aims of psycholinguistics is to describe the processes people normally
use in speaking and understanding language. The various breakdowns in performance, such
as “tip-of-the-tongue” phenomena, speech errors, and failure to comprehend tricky
sentences tell us a great deal about how language is processed.
o The speech signal
In linguistics, speech is a system of communication that uses spoken words (or
sound symbols). The vibrations of our vocal cords cause variations in air pressure, and
the sounds we produce can be described in terms of:
i. Fundamental frequency (pitch): how fast the variations of air pressure
occur.
ii. Intensity: the magnitude of the acoustic signal, which is perceived as
loudness.
iii. Quality: determined by the shape of the vocal tract, because the shape affects
how the sound waves travel.

D. SPEECH PERCEPTION AND COMPREHENSION


Speech is a continuous signal. The speech signal can be broken into strings of:
phonemes, syllables, morphemes, words, and phrases.
a) The "segmentation problem" -> how do listeners carve up the continuous speech
signal into meaningful units? They draw on lexical access, stress, and intonation.
b) The "lack of invariance problem" -> how do listeners recognize different speech
sounds when they are used in different contexts and spoken by different people?
Listeners can normalize their perceptions to account for rate of speech and speaker
pitch differences.
E. BOTTOM-UP AND TOP-DOWN MODELS
a) Top-down processing: proceeding from semantic and syntactic information to the
lexical information gained from the sensory input.
- Listeners can predict that if a speaker says "the", then an NP is coming.
- In experiments, listeners seem to make much use of top-down information.
b) Bottom-up processing: moving from the sensory phonetic input to phonemes, then
morphemes, and so on up to semantic interpretation.
- Listeners wait to construct an NP until they hear "the" followed by a noun.
F. LEXICAL ACCESS AND WORD RECOGNITION
In order to discover more about lexical access or word recognition, psycholinguists
have devised several experiments:
a) Lexical decision -> A task in which subjects decide whether or not a string of
letters or sounds is a word. Frequently used words such as car are responded to more
quickly than infrequent words such as fig. This leads researchers to believe that
frequent words are more easily accessed in the lexicon than infrequent words.
b) Lexical access experiments with ambiguous words suggest that people initially
retrieve all the meanings of a word, even those that do not fit the context.
G. SYNTACTIC PROCESSING
Listeners need to build phrase structure representations of sentences as they hear
them in order to understand the sentence. The mind uses two principles in parsing sentences,
and these can lead people astray when they encounter garden path sentences:
a) Minimal attachment: In comprehending language, listeners create the simplest
structure consistent with the grammar. Example: The horse raced past the barn is
interpreted as a complete sentence rather than a noun phrase containing a relative
clause, as if it were the horse (that was) raced past the barn. Listeners are therefore
led astray when the sentence continues: The horse raced past the barn fell.
b) Late closure: In comprehending language, listeners attach incoming material to the
phrase that was most recently processed. Example: He said that he slept yesterday
associates yesterday with he slept rather than with he said.
H. SPEECH PRODUCTION: PLANNING UNITS
Although speech sounds are linearly ordered, slips of the tongue (including
spoonerisms) reveal that speech is conceptualized before it is uttered. Example:
ad hoc → odd hack (The vowel /æ/ of the first word and the vowel /ɔ/ or /ɒ/ of the
second are exchanged.)
I. LEXICAL SELECTION
Word substitutions are seldom random: we tend to accidentally replace a word with
a semantically related word.
Sometimes we produce a blend, which is part of one word and part of another:
a) Splinters/blisters → splisters
b) Edited/annotated → editated
c) A swinging/hip chick → a swip chick
d) Frown/scowl → frowl
J. APPLICATION AND MISAPPLICATION OF RULES
Sometimes speakers also make errors with morphological and syntactic rules.
a. Rules may be applied to create possible but nonexistent words such as ambigual.
b. Regular rules may accidentally be applied to irregular words as in swimmed.
K. NON-LINGUISTIC INFLUENCES
The discussion on speech comprehension suggested that non-linguistic factors are
involved in and sometimes interfere with linguistic processing. They also affect speech
production. Example:
a) One speaker said, "I’ve never heard of classes on 9 April" instead of the intended
"on Good Friday", which fell on 9 April that year.
b) The two phrases have nothing in common phonologically or morphologically, yet the
non-linguistic association was enough to prompt the error.
L. COMPUTER PROCESSING OF HUMAN LANGUAGE
Computational linguistics is a subfield of linguistics and computer science that
focuses on the interactions of human language and computers.
Computational linguistics includes the analysis of:
a) Written texts and spoken discourse
b) The translation of text and speech from one language into another
c) The use of human (not computer) languages for communication between computers
and people
d) The modelling and testing of linguistic theories.
M. COMPUTATIONAL PHONETICS AND PHONOLOGY
Computational phonetics and phonology is concerned with processing speech.
There are two sides of computational phonetics and phonology:
a) Speech recognition -> Process of analysing the speech signal into its component
phones and phonemes, and producing, in effect, a phonetic transcription of the
speech.
b) Speech synthesis -> Process of creating electronic signals that simulate the phones
and prosodic features of speech and assemble them into words and phrases for
output to an electronic speaker, or for further processing, as in a speech-generation
application.
- SPEECH SYNTHESIS
Speech sounds can be reduced to a small number of acoustic components that can
be mixed together like a recipe, which is known as formant synthesis:
a) Start with a tone at the same frequency as vibrating vocal folds (higher if a woman’s
or child’s voice is being synthesised, lower for a man’s).
b) Emphasise the harmonics corresponding to the formants required for a particular
vowel, liquid or nasal quality.
c) Add hissing or buzzing for fricatives.
d) Add nasal resonances for nasal sounds.
e) Temporarily cut off sound to produce stops and affricates, and so on.
Another approach is known as concatenative synthesis, which relies on recorded
units from humans that are assembled to form the desired utterance.
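The first two steps of the formant synthesis recipe can be sketched roughly in Python. This is only a minimal illustration, not how a real synthesizer works: the function name synthesize_vowel, the bandwidth parameter, and the formant values used in the example are all assumptions made for the sketch.

```python
import math

def synthesize_vowel(f0, formants, duration=0.5, sample_rate=16000, bandwidth=100.0):
    """Return a list of samples in [-1, 1] for a vowel-like tone.

    f0: fundamental frequency in Hz (the tone of the vibrating vocal folds).
    formants: formant centre frequencies in Hz (illustrative values only).
    """
    n_samples = round(duration * sample_rate)
    nyquist = sample_rate / 2

    # Harmonics are whole-number multiples of f0 below the Nyquist limit.
    harmonics = []
    k = 1
    while k * f0 < nyquist:
        freq = k * f0
        # Emphasise harmonics near a formant: the weight is close to 1 at a
        # formant frequency and falls off with distance from it.
        weight = max(1.0 / (1.0 + ((freq - f) / bandwidth) ** 2) for f in formants)
        harmonics.append((freq, weight))
        k += 1

    # Mix the weighted harmonics together, sample by sample.
    samples = []
    for n in range(n_samples):
        t = n / sample_rate
        samples.append(sum(w * math.sin(2 * math.pi * fr * t) for fr, w in harmonics))

    # Scale so that the loudest sample has magnitude 1.
    peak = max(abs(s) for s in samples) or 1.0
    return [s / peak for s in samples]

# A lower f0 for a man's voice, a higher f0 for a woman's or child's voice.
low_voice = synthesize_vowel(120.0, [700.0, 1200.0, 2600.0], duration=0.1)
high_voice = synthesize_vowel(220.0, [700.0, 1200.0, 2600.0], duration=0.1)
```

The sketch leaves out the remaining steps (hissing for fricatives, nasal resonances, and cutting off sound for stops).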
N. TEXT TO SPEECH
Text-to-speech programs convert input text into a phonetic representation (for formant
synthesizers) or a representation of whatever units are to be combined (for concatenative
synthesizers). Two problems with text-to-speech programs are:
a) Homographs that are pronounced differently. Complex structural knowledge is required
to know whether to pronounce read as [rid] or [rɛd] in I have read the book.
b) Inconsistencies in spelling.
O. COMPUTATIONAL MORPHOLOGY
The processing of word structure by computers is computational morphology. Computers
need to handle morphology and be able to identify morphemes.
One strategy would be to compile all the morphological forms of all a language’s words
into a dictionary. Another method of morphological analysis is stemming (the process
of detecting affixes and stripping them from roots to identify morphemes).
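Stemming can be illustrated with a very small suffix stripper. The suffix list and the min_root threshold below are assumptions chosen for the example, and the sketch handles suffixes only, not prefixes.

```python
# Hypothetical toy suffix list, ordered so that longer suffixes are tried first.
SUFFIXES = ["ing", "ed", "es", "s"]

def stem(word, min_root=3):
    """Split a word into (root, suffix) by stripping one known suffix.

    The root must keep at least `min_root` letters, so that short words
    such as "sing" or "bus" are not stripped.
    """
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= min_root:
            return word[:-len(suffix)], suffix
    return word, ""

for w in ["walked", "walking", "cats", "boxes", "sing"]:
    print(w, "->", stem(w))
```

Rules this simple make mistakes (swimming comes out as swimm + ing), which is one reason a dictionary of forms is sometimes used as well.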
P. COMPUTATIONAL SYNTAX
Computers must also be able to determine syntactic structure. A parser is a program
that uses a grammar to assign phrase structure to a string of words.
a) A top-down parser proceeds by first consulting the grammar rules (for example,
S → NP VP, NP → Det N and so forth) and then examining
the input string to see if the first word could begin an S.
b) A bottom-up parser looks at the input string first and then finds phrasal categories.
c) A transition network composed of nodes (circles) and arcs (arrows) may be used to
model syntactic processing.
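A top-down parser of the kind described in a) can be sketched as follows. The rules S → NP VP and NP → Det N come from the summary; the VP rules, the small lexicon, and the function names are assumptions added so that the example can run.

```python
# Phrase structure rules. The VP rules are an assumption for illustration.
GRAMMAR = {
    "S": [["NP", "VP"]],
    "NP": [["Det", "N"]],
    "VP": [["V", "NP"], ["V"]],
}

# Hypothetical mini lexicon mapping words to lexical categories.
LEXICON = {
    "the": "Det", "a": "Det",
    "child": "N", "dog": "N",
    "sees": "V", "sleeps": "V",
}

def parse(symbol, words, pos):
    """Yield (tree, next_pos) for each way `symbol` can start at words[pos]."""
    if symbol in GRAMMAR:
        # Top-down: consult the grammar rules first, then look at the input.
        for rhs in GRAMMAR[symbol]:
            yield from expand(symbol, rhs, words, pos, [])
    elif pos < len(words) and LEXICON.get(words[pos]) == symbol:
        yield (symbol, words[pos]), pos + 1

def expand(symbol, rhs, words, pos, children):
    """Match the categories in `rhs` one after another, starting at `pos`."""
    if not rhs:
        yield (symbol, *children), pos
        return
    for child, next_pos in parse(rhs[0], words, pos):
        yield from expand(symbol, rhs[1:], words, next_pos, children + [child])

def parse_sentence(sentence):
    """Return a tree for the whole sentence, or None if it cannot be parsed."""
    words = sentence.lower().split()
    for tree, end in parse("S", words, 0):
        if end == len(words):
            return tree
    return None

print(parse_sentence("the child sees a dog"))
print(parse_sentence("dog the sleeps"))
```

A bottom-up parser would work in the opposite direction, starting from the categories of the words and combining them into phrases.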
1. COMPUTATIONAL SEMANTICS
Computational semantics is concerned with 1) producing a semantic representation of the
input in the computer and 2) producing natural language to represent meanings.
2. COMPUTATIONAL PRAGMATICS
Computers use semantic and pragmatic knowledge to analyze structurally ambiguous
sentences.
3. COMPUTATIONAL SIGN LANGUAGE
a) Linguists at Boston University are currently working on computer algorithms that
will recognize sign language in the way that spoken language can be recognized.
b) The signer stands in front of a camera and the computer recognizes the distinctive
features of sign language such as hand shape, movement, and orientation.
4. COMPUTER MODELS OF GRAMMAR
a) Computers can be programmed to model the grammar of language
b) This forces linguists to be explicit in formulating the rules of the grammar
c) If the program cannot generate a possible grammatical sentence, then there is an
error in the grammar
d) If the program generates an ungrammatical sentence, then there is an error in the
grammar
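The idea of testing a grammar by programming it can be shown with a toy example. The rules and words below are assumptions made up for the sketch; the program lists every sentence the rules generate so that each one can be checked by hand.

```python
from itertools import product

# Hypothetical toy grammar: each symbol maps to its possible expansions.
RULES = {
    "S": [["NP", "VP"]],
    "NP": [["Det", "N"]],
    "VP": [["V"], ["V", "NP"]],
    "Det": [["the"]],
    "N": [["child"], ["dog"]],
    "V": [["sleeps"], ["sees"]],
}

def generate(symbol):
    """Return every word sequence that the rules derive from `symbol`."""
    if symbol not in RULES:
        return [[symbol]]  # a plain word: nothing left to expand
    results = []
    for rhs in RULES[symbol]:
        parts = [generate(s) for s in rhs]
        # Combine every choice for each symbol on the right-hand side.
        for combo in product(*parts):
            results.append([word for part in combo for word in part])
    return results

sentences = {" ".join(words) for words in generate("S")}
for sentence in sorted(sentences):
    print(sentence)
```

The output includes the ungrammatical "the child sleeps the dog", so by point d) this toy grammar contains an error: it lets every verb take an object.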
Q. TEXT AND SPEECH ANALYSIS
FREQUENCY ANALYSIS, CONCORDANCES, AND COLLOCATIONS
Computers can be used to:
a) Do frequency analyses to reveal the most common words in written (the, of, and,
to, a, in, that, is, was, he) and spoken (I, and, the, to, that, you, it, of, a, know)
American English.
b) Do concordances, which specify the location of any particular word and its context.
c) Do collocation analyses, which reveal the occurrences of two or more words
within a short space of each other in a corpus and provide evidence that the
presence of one word in a text affects the occurrence of other words.
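The three kinds of analysis can be sketched with the Python standard library. The sample text is invented for the example, and the function names and the width and window parameters are assumptions; a real analysis would use a large corpus and a statistical measure of association instead of raw counts.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def word_frequencies(tokens):
    """Frequency analysis: count how often each word occurs."""
    return Counter(tokens)

def concordance(tokens, target, width=3):
    """Concordance: the position of each occurrence of `target` and its context."""
    lines = []
    for i, tok in enumerate(tokens):
        if tok == target:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append((i, f"{left} [{tok}] {right}".strip()))
    return lines

def collocations(tokens, window=2):
    """Collocation: count word pairs occurring within `window` words of each other."""
    pairs = Counter()
    for i, tok in enumerate(tokens):
        for other in tokens[i + 1:i + 1 + window]:
            pairs[tuple(sorted((tok, other)))] += 1
    return pairs

# Invented sample text, used only to exercise the functions.
tokens = tokenize("The dog saw the cat. The cat saw the dog and the dog ran.")
print(word_frequencies(tokens).most_common(3))
print(concordance(tokens, "cat", width=2))
print(collocations(tokens, window=1).most_common(3))
```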
- COMPUTATIONAL LEXICOGRAPHY
Computational linguists need more information about words and morphemes than
just the meanings.
a. The culturomic revolution
Culturomics is a quantitative analysis of a very large corpus of digitised texts, which
may reveal previously undocumented words or pinpoint periods of accelerated
language change.
b. Twitterology
The computer analysis of short electronic textual communications known as
tweets.
- INFORMATION RETRIEVAL AND SUMMARIZATION
Information retrieval: the use of computers to locate and display data from possibly
very large databases (data mining).
Summarization programs allow computers to eliminate redundancy and identify the
most salient features of a body of information.
- SPELL CHECKER AND MACHINE TRANSLATION
Spell checkers range in sophistication from mindless dictionary lookups to
intelligent flagging of incorrect homonyms (your for you're, bear for bare, etc.). The goal of
automatic machine translation is to input a message from the source language and have it
translated into the target language.
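The simplest kind of spell checker, a dictionary lookup, can be sketched as below. The word list and the table of confusable pairs are tiny and hypothetical. Deciding whether your or you're is actually correct in a sentence needs knowledge of the context, which this sketch does not attempt; it only marks such words for a second look.

```python
import re

# Hypothetical tiny dictionary; a real checker would load a full word list.
WORDS = {"i", "hope", "your", "you're", "well", "the", "bear", "bare", "is", "big"}

# Hypothetical table of easily confused words, used only to flag them for review.
CONFUSABLE = {"your": "you're", "you're": "your", "bear": "bare", "bare": "bear"}

def check(text):
    """Return (word, message) pairs for unknown words and confusable words."""
    report = []
    for token in re.findall(r"[a-z']+", text.lower()):
        if token not in WORDS:
            report.append((token, "not in dictionary"))
        elif token in CONFUSABLE:
            report.append((token, f"check against '{CONFUSABLE[token]}'"))
    return report

print(check("I hope your wel"))
```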
- COMPUTATIONAL FORENSIC LINGUISTICS
Computational forensic linguistics is a subarea of forensic linguistics that concerns
itself with computer applications in matters involving language, the law and the judicial
system.
- Computational linguistics can be used in legal disputes regarding trademarks:
1. A computer search proved that the bound morpheme Mc- is now used productively
to mean “basic” or “inexpensive”.
2. But a judge ruled that another company could not use Mc- for their product because
it was too firmly associated with McDonald’s for consumers.
- Computational linguistics can also be used for the interpretation of legal terms:
1. A court case hinged on the meaning of the word visa, and by searching a
multimillion-word corpus, a computational linguist concluded that visa meant “a
kind of permit to enter a country” not “a permit to request permission to enter a
country”.
2. This finding affects laws surrounding international travel.
R. SPEAKER IDENTIFICATION
Speaker identification is the use of computers to assist in the task of ascertaining
the identity of a speaker. Displays of wave forms (which show the amplitude changes of
speech over time) and spectrograms (which show the frequencies of speech over time) can
help provide evidence in cases needing speaker identification.
