
Chapter – 2

Sound and Audio


2.1. Basic sound concepts, representation and formats

Sound is a physical phenomenon produced by the vibration of matter, such as a violin string or a block of wood. As the matter vibrates, pressure variations are created in the air surrounding it. This alternation of high and low pressure is propagated through the air in a wave-like motion. When the wave reaches the human ear, a sound is heard.

The pattern of this pressure oscillation is called a waveform.

The waveform repeats the same shape at regular intervals, and this portion is called a period. Since sound waves occur naturally, they are never perfectly smooth or uniformly periodic. However, sounds that display a recognizable periodicity tend to be more musical than those that are nonperiodic. Examples of periodic sound sources are musical instruments, vowel sounds, whistling wind and bird songs. Nonperiodic sound sources include unpitched percussion instruments, coughs and sneezes, and rushing water.

Computer representation of sound

The smooth, continuous curve of a sound waveform is not directly represented in a computer. A computer measures the amplitude of the waveform at regular time intervals to produce a series of numbers. Each of these measurements is a sample.

The mechanism that converts an audio signal into digital samples is the Analog-
to-Digital Converter (ADC). The reverse conversion is performed by a Digital-to-
Analog Converter (DAC).

Sampling Rate

The rate at which a continuous waveform is sampled is called the sampling rate. Like frequencies, sampling rates are measured in Hz. The CD-standard sampling rate of 44,100 Hz means that the waveform is sampled 44,100 times per second.

Quantization

Just as a waveform is sampled at discrete times, the value of each sample is also discrete. The resolution or quantization of a sample value depends on the number of bits used in measuring the height of the waveform. A waveform sampled with 3-bit quantization can take on only eight possible values: 0.75, 0.5, 0.25, 0, -0.25, -0.5, -0.75 and -1. The shape of the waveform becomes less recognizable as the quantization is lowered, i.e., the lower the quantization, the lower the quality of the sound (the result might be a buzzing sound).
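
To make sampling and quantization concrete, here is a minimal Python sketch (assuming NumPy is available) that samples a 440 Hz sine wave at the CD rate and rounds each sample to the eight 3-bit levels listed above:

    import numpy as np

    SAMPLE_RATE = 44_100          # samples per second (CD standard)
    BITS = 3                      # quantization resolution
    FREQ = 440.0                  # frequency of the test tone, in Hz

    # Sampling: measure the waveform's amplitude at regular time intervals.
    t = np.arange(0, 0.01, 1.0 / SAMPLE_RATE)      # 10 ms of sample times
    amplitude = np.sin(2 * np.pi * FREQ * t)       # continuous values in [-1, 1]

    # Quantization: with 3 bits there are 2**3 = 8 levels, spaced 0.25 apart,
    # matching the values in the text (0.75, 0.5, ..., -0.75, -1).
    step = 2.0 / 2 ** BITS                         # full range [-1, 1) is 2.0 wide
    quantized = np.clip(np.floor(amplitude / step) * step, -1.0, 1.0 - step)

    print(sorted(set(np.round(quantized, 2))))     # at most 8 distinct values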

Sound Hardware

Before sound can be processed, a computer needs input/output devices. A microphone jack and built-in speakers are devices connected to an ADC and a DAC, respectively, for the input and output of audio.

Audio Formats

The important format parameters for the specification of audio are the sampling rate (e.g. 8012.8 samples/second) and the sample quantization (e.g. 8-bit quantization).
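
Together these two parameters fix the uncompressed data rate of a recording. A quick sketch of the arithmetic (the first line uses the example figures above; the stereo CD figures are a common reference point):

    # Uncompressed data rate = sampling rate x bits per sample x channels
    def bit_rate(sampling_rate_hz, bits_per_sample, channels=1):
        return sampling_rate_hz * bits_per_sample * channels

    print(bit_rate(8012.8, 8))        # ~64 kbit/s, the example format above
    print(bit_rate(44_100, 16, 2))    # 1,411,200 bit/s for stereo CD audio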

2.2. Basic music [MIDI] concepts, devices, messages, standards, and software

MIDI (Musical Instrument Digital Interface) is a standard that manufacturers of electronic musical instruments have agreed upon. It is a set of specifications they use in building their instruments so that the instruments of different manufacturers can, without difficulty, communicate musical information to one another.

A MIDI interface has two different components:

 Hardware connects the equipment. It specifies the physical connection between musical instruments and deals with the electronic signals that are sent over the cable.
 A data format encodes the information travelling through the hardware. The encoding includes the notion of the beginning and end of a note, basic frequency and sound volume.

A MIDI device is capable of communicating with other MIDI devices through channels. The MIDI standard specifies 16 channels. Music data transmitted through a channel are reproduced at the receiver side by a synthesizer instrument. The MIDI standard identifies 128 instruments, including noise effects, each with a unique number. For example, 0 is for the Acoustic Grand Piano, 12 for the marimba, 40 for the violin, 73 for the flute, etc.
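
A small sketch of how a device is told to switch to one of these instruments: a Program Change message is a status byte 0xC0 combined with the channel number, followed by a single data byte carrying the instrument number (the helper below is illustrative, not part of any particular library):

    # Build a raw MIDI Program Change message.
    def program_change(channel, program):
        assert 0 <= channel <= 15 and 0 <= program <= 127
        status = 0xC0 | channel        # 0xC0-0xCF: Program Change on channel n
        return bytes([status, program])

    print(program_change(0, 40).hex())   # 'c028' -> select Violin on channel 1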

MIDI Devices

Through the MIDI interface, a computer can control the output of individual instruments. On the other hand, the computer can receive, store or process coded musical data through the same interface.

The heart of any MIDI system is the MIDI synthesizer device. A typical
synthesizer looks like a simple piano keyboard with a panel full of buttons.
Most synthesizers have the following common components:

 Sound Generator
The principal purpose of the generator is to produce an audio signal that
becomes sound when fed into a loudspeaker. By varying the voltage
oscillation of the audio signal, a sound generator changes the quality of
the sound – its pitch, loudness and tone – to create a wide variety of
sounds and notes.
 Microprocessor
The microprocessor communicates with the keyboard to know what
notes the musician is playing, and with the control panel to know what
commands the musician wants to send to the microprocessor. The
microprocessor then sends note and sound commands to the sound
generators.
 Keyboard
The keyboard affords the musician direct control of the synthesizer.
Pressing keys on the keyboard tells the microprocessor what notes to
play and how long to play them.
 Control Panel
The control panel controls those functions that are not directly
concerned with notes and durations. It includes: a slider that sets the
overall volume of the synthesizer, a button that turns the synthesizer on
and off, and a menu that calls up different patches for the sound
generators to play.

 Auxiliary Controllers
Auxiliary controllers give the musician more control over the notes
played on the keyboard.
 Memory
Synthesizer memory is used to store patches for the sound generators
and settings on the control panel.

MIDI Messages

MIDI messages transmit information between MIDI devices and determine what kinds of musical events can be passed from device to device. The format of a MIDI message consists of a status byte (the first byte of any MIDI message), which describes the kind of message, and data bytes (the following bytes). MIDI messages are divided into two types (a byte-level sketch follows the list below):

 Channel Messages
Channel messages go only to specified devices. There are two types of
channel messages:

- Channel voice messages send actual performance data between MIDI devices, describing keyboard action, controller action and control-panel changes. They describe music by defining pitch, amplitude, duration and other sound qualities.

- Channel mode messages determine the way that a receiving MIDI device responds to channel voice messages.

 System Messages
System messages go to all devices in a MIDI system because no channel
numbers are specified. There are three types of system messages:

- System real-time messages are very short and simple, consisting of only one byte. They are typically used for system reset, the timing clock, etc.
- System common messages are commands that prepare sequencers and synthesizers to play a song. They are used for song selection, tuning the synthesizers, etc.

- System exclusive messages allow MIDI manufacturers to create customized MIDI messages to send between their MIDI devices.
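
As the byte-level sketch promised above: a Note On channel voice message, for example, consists of the status byte 0x90 combined with the channel number, followed by two data bytes giving the note number and the velocity (how hard the key was struck). A minimal, illustrative Python helper:

    # Build a raw MIDI Note On message (a channel voice message).
    def note_on(channel, note, velocity):
        # Status byte: upper nibble 0x9 = Note On, lower nibble = channel.
        # Data bytes are 7-bit values (0-127), so their top bit is always 0.
        assert 0 <= channel <= 15 and 0 <= note <= 127 and 0 <= velocity <= 127
        return bytes([0x90 | channel, note, velocity])

    print(note_on(0, 60, 100).hex())   # '903c64' -> middle C on channel 1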

MIDI Standards

MIDI reproduces traditional note length using MIDI clocks, which are
represented through timing clock messages. Using a MIDI clock, a receiver can
synchronize with the clock cycles of the sender. For example, a MIDI clock
helps to keep separate sequencers in the same MIDI system playing at the
same tempo. When a master sequencer plays a song, it sends out a stream of
'Timing Clock' messages to convey the tempo to the other sequencers. The
faster the timing clock messages come in, the faster the receiving sequencer
plays the song.
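
A sketch of the arithmetic involved, assuming the conventional resolution of 24 timing-clock messages per quarter note:

    # MIDI sends 24 Timing Clock messages per quarter note.
    CLOCKS_PER_QUARTER = 24

    def clock_interval_seconds(tempo_bpm):
        seconds_per_quarter = 60.0 / tempo_bpm
        return seconds_per_quarter / CLOCKS_PER_QUARTER

    print(clock_interval_seconds(120))   # ~0.0208 s between clock messages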

MIDI Software

Once a computer is connected to a MIDI system, a variety of MIDI applications can run on it. Digital computers afford the composer or sound designer unprecedented levels of control over the evolution and combination of sonic events.

The software applications generally fall into four major categories:

 Music recording and performance applications
This category of applications provides functions such as recording of MIDI
messages as they enter the computer from other MIDI devices, and
possibly editing and playing back the messages in performance.

 Musical notation and printing applications
This category allows writing music using traditional musical notation.
The user can then play back the music using a performance program or
print the music on paper for live performance or publication.
 Synthesizer patch editors and librarians
These programs allow information storage of different synthesizer
patches in the computer's memory and disk drives, and editing of
patches in the computer.

 Music education applications
These software applications teach different aspects of music using the
computer monitor, keyboard and other controllers of attached MIDI
instruments.

2.3. Speech: concept, generation, analysis and transmission

Speech is a form of sound that can be understood and generated by humans and also by machines. The human ear is most sensitive in the range from 600 Hz to 6,000 Hz. A machine can also support speech generation and recognition. Today, workstations and personal computers can recognize on the order of 25,000 words.

Speech Generation

An important requirement for speech generation is real-time signal generation. With such a requirement met, a speech output system can transform text into speech automatically without any lengthy preprocessing. An example is the spoken time announcement of a telephone answering service.
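
A minimal sketch of driving text-to-speech from a program, assuming the third-party pyttsx3 library is installed (any speech-synthesis engine would serve equally well):

    # Speak a time announcement, as a telephone answering service might.
    import pyttsx3                 # third-party TTS engine (an assumption)

    engine = pyttsx3.init()        # use the platform's default voice
    engine.say("The time is ten forty-five")
    engine.runAndWait()            # block until the utterance finishes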

In speech, two basic classes of sounds are distinguished:

 Vowels – speech sounds created by the relatively free passage of breath
through the larynx and oral cavity, usually forming the most prominent
and central sound of a syllable.
 Consonants – speech sounds produced by a partial or complete
obstruction of the air stream by any of the various constrictions of the
speech organs.
Speech Analysis

 Human speech has certain characteristics determined by the speaker.
Hence, speech analysis can serve to determine who is speaking, i.e. to
recognize a speaker for his/her identification and verification.
 Another main task of speech analysis is to analyze what has been said,
i.e. to recognize and understand the speech signal itself. Based on the
speech sequence, the corresponding text is generated.
 Another area of speech analysis researches speech patterns with
respect to how a certain statement was said. For example, a spoken
sentence sounds different if a person is angry or calm. An application
of this research could be a lie detector.

The given figure describes the different components of speech recognition and understanding. The system applies the following principle several times:

 In the 1st step, the principle is applied to a sound pattern and/or word
model: an acoustical and phonetic analysis is performed.
 In the 2nd step, certain speech units go through syntactical analysis;
thereby, the errors of the previous step can be recognized. In this case,
syntactical analysis provides additional decision help, and the result is
recognized speech.
 The 3rd step deals with the semantics of the previously recognized
language. Here the decision errors of the previous step can be
recognized and corrected with other analysis methods. The result of this
step is understood speech. (A schematic sketch of the three steps
follows this list.)
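
As a purely illustrative pipeline mirroring the three steps (every function below is a hypothetical stand-in, not a real recognizer):

    # Step 1: map the sound pattern onto candidate speech units.
    def acoustic_phonetic_analysis(sound_pattern):
        return ["candidate", "speech", "units"]

    # Step 2: use syntax to reject step-1 errors -> recognized speech.
    def syntactical_analysis(units):
        return [u for u in units if u]       # placeholder filtering

    # Step 3: resolve meaning -> understood speech.
    def semantic_analysis(words):
        return " ".join(words)

    print(semantic_analysis(syntactical_analysis(
        acoustic_phonetic_analysis(b"sound pattern"))))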
Speech Transmission

The given figure describes the components of a speech transmission system based on source coding. The goal is to provide the receiver with the same speech/sound quality as was generated at the sender side.

Here the original signal (the analog speech signal) is converted into a digital signal with the help of an A/D converter. The speech is then analyzed and coded for transmission on the carrier signal. On the receiver side the signal is reconstructed, and with the help of a D/A converter the digital signal is finally converted back into the original analog signal.
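
A schematic sketch of that chain, again with hypothetical placeholder functions only:

    # Illustrative source-coding transmission chain (all placeholders).
    def a_d_converter(analog):     return list(analog)    # sample the signal
    def analyze_and_code(samples): return bytes(samples)  # compress for the channel
    def reconstruct(coded):        return list(coded)     # decode at the receiver
    def d_a_converter(samples):    return bytes(samples)  # back to an analog signal

    sent = analyze_and_code(a_d_converter(b"speech"))
    heard = d_a_converter(reconstruct(sent))
    assert heard == b"speech"   # goal: receiver hears what the sender produced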

The figure shows the components of a transmission system based on recognition/synthesis. Speech analysis (recognition) takes place on the sender side of the speech transmission system, and speech synthesis (generation) on the receiver side.
