Voice Morphing 101113123852 Phpapp01 121221072434 Phpapp01 PDF

Voice morphing
Presented by: H.Mohammed.Sabir (09AT1A0461)
Supervised by: Shreedhar Sir
SEMINAR OUTLINE
What is it?
Need for Voice Morphing
Description of Morphing
Technical details of Morphing
Application areas
What is Voice Morphing?
• Voice morphing is a technique for modifying a (source)
speaker's speech to sound as if it were spoken by a
different (target) speaker.
• In simpler terms, it is the ability to change the speech of
one speaker into that of another speaker.
• The technology was developed at the Los Alamos National
Laboratory in New Mexico, USA, by George Papcun.
• Applications for voice morphing range from recreational
ones to security ones.
What does it actually do?
• It modifies a source speaker's speech so that it sounds
as if it were spoken by a target speaker.
• Voice morphing enables speech patterns to be cloned.
• An accurate copy of a person's voice can be made, so
that anything one wishes to say can be said in the
voice of someone else.
Need for voice morphing
• Text-to-speech (TTS) systems
• Public speech systems
• Special effects (just as video or image morphing is done)
• To diminish ethnic barriers
How to Morph Voice?
• We need to effectively change the pitch from that of a male
speaker to that of a female speaker. Recall that the
excitation signal carries information about the speaker.
• We find the LPC coefficients of the source and target signals
and use these coefficients to interpolate between the two
signals.
• We get the new LPC (linear predictive coding) coefficients
using the formula
new lpc coeff = const*(lpc source) + (1 - const)*(lpc target)
• 0 <= const <= 1
…
How to Morph Speech? (contd…)
• The pitch of a female speaker is typically higher than that
of a male speaker, up to about twice as high. In our example
the pitch of the male speaker is 141 Hz and that of the
female speaker is 210 Hz, a ratio of about 1.5.
• So we need a time-stretching algorithm with which to
implement pitch shifting. We obtain the residue of the
source signal and stretch it according to the value of
const, which indicates the position of the morphed signal
between the source and the target.
• For example, with the weighting in the formula above,
const = 0.8 puts a weight of 0.8 on the source, so the morphed
signal will be closer to the source signal, while const = 0.2
will give a result that is closer to the target signal.
How do we shift the Pitch?
• We break the residue signal into small windows and apply a fade-in
and fade-out to each block. We then recombine the blocks to form the
pitch-shifted signal. By choosing the factor alpha, we can time-stretch
the residue as required.
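The windowing and recombination described above resembles a plain overlap-add (OLA) time stretch, which can be sketched as follows. This is a minimal sketch under stated assumptions: `alpha` is taken to be the ratio of output duration to input duration, a Hann window stands in for the fade-in and fade-out, and the names `ola_time_stretch` and `frame_len` are hypothetical.

```python
import numpy as np

def ola_time_stretch(x, alpha, frame_len=256):
    """Time-stretch x by roughly a factor alpha using windowed overlap-add.

    alpha > 1 lengthens the signal, alpha < 1 shortens it (assumed meaning).
    """
    x = np.asarray(x, dtype=float)
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    if len(x) < frame_len:
        raise ValueError("signal is shorter than one frame")
    synth_hop = frame_len // 4          # spacing of frames in the output
    ana_hop = synth_hop / alpha         # spacing of frames read from the input
    window = np.hanning(frame_len)      # fade-in / fade-out for each block
    n_frames = int((len(x) - frame_len) / ana_hop) + 1
    out_len = (n_frames - 1) * synth_hop + frame_len
    out = np.zeros(out_len)
    norm = np.zeros(out_len)
    for i in range(n_frames):
        start_in = int(round(i * ana_hop))
        start_out = i * synth_hop
        frame = x[start_in:start_in + frame_len]
        out[start_out:start_out + frame_len] += frame * window
        norm[start_out:start_out + frame_len] += window
    norm[norm < 1e-8] = 1.0             # avoid dividing by zero at the edges
    return out / norm
```

Plain OLA ignores the periodicity of the signal, so it can smear voiced segments; pitch-synchronous variants are often used when quality matters.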
How do we Morph finally?
• We now have the pitch-shifted residue signal and the new
LPC coefficients. We resample the pitch-shifted residue so
that it is played at a faster rate (remember that after
time stretching the residue lasts longer). Filtering the
resampled residue through the all-pole synthesis filter
built from the new LPC coefficients, that is, the inverse
of the LPC analysis filter, then yields the morphed signal.
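The final step can be sketched in Python as a resampling stage followed by an all-pole filter. The helper names `resample_linear` and `all_pole_filter`, the use of linear interpolation for resampling, and the coefficient layout [1, a1, ..., ap] are assumptions for illustration only; the presentation does not specify them.

```python
import numpy as np

def resample_linear(x, n_out):
    """Resample x to n_out samples by linear interpolation (a simple stand-in)."""
    x = np.asarray(x, dtype=float)
    old_t = np.linspace(0.0, 1.0, num=len(x))
    new_t = np.linspace(0.0, 1.0, num=n_out)
    return np.interp(new_t, old_t, x)

def all_pole_filter(residue, lpc_coeffs):
    """Run the residue through 1/A(z), with A(z) = 1 + a1*z^-1 + ... + ap*z^-p.

    Implements y[n] = e[n] - sum_k a[k] * y[n - k].
    """
    e = np.asarray(residue, dtype=float)
    a = np.asarray(lpc_coeffs, dtype=float)
    a = a / a[0]                         # normalise so that a[0] == 1
    y = np.zeros(len(e))
    for n in range(len(e)):
        acc = e[n]
        for k in range(1, len(a)):
            if n - k >= 0:
                acc -= a[k] * y[n - k]
        y[n] = acc
    return y
```

A typical call would shorten the stretched residue back to the original number of samples with `resample_linear` and then pass it to `all_pole_filter` together with the interpolated coefficients.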
Block Diagram
Time Domain Plots of Source and Target featuring the Pitch
Matching and Warping
• DTW (Dynamic Time Warping)
- Dynamic Time Warping (DTW) is used to
find the best match between the pitches of
the two sounds.
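For readers unfamiliar with DTW, here is a small Python sketch of the classic dynamic-programming formulation applied to two pitch contours. The function name `dtw`, the absolute-difference cost, and the example contours are assumptions chosen for illustration, not the settings used in the presentation.

```python
import numpy as np

def dtw(a, b):
    """Return (total cost, alignment path) between two 1-D sequences."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])           # local distance
            cost[i, j] = d + min(cost[i - 1, j - 1],  # match
                                 cost[i - 1, j],      # advance in a only
                                 cost[i, j - 1])      # advance in b only
    # Trace the cheapest path back from the end to the start
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        step = int(np.argmin([cost[i - 1, j - 1], cost[i - 1, j], cost[i, j - 1]]))
        if step == 0:
            i, j = i - 1, j - 1
        elif step == 1:
            i -= 1
        else:
            j -= 1
    path.reverse()
    return cost[n, m], path
```

For example, `dtw([100, 100, 120, 140], [100, 120, 120, 140])` pairs up frames of two hypothetical pitch contours (in Hz) that follow the same shape at different speeds.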
Signal Re-Estimation
• Loss during signal re-estimation
- Because the signals are transformed into the
cepstral domain, a magnitude function is
used. This results in a loss of phase
information in the representation of the
data.
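The phase loss can be seen in a few lines of Python. The sketch below computes a real cepstrum (inverse FFT of the log magnitude spectrum) and is an assumption about the kind of cepstral representation meant here; the name `real_cepstrum` and the small `eps` guard are hypothetical.

```python
import numpy as np

def real_cepstrum(frame, eps=1e-12):
    """Real cepstrum: inverse FFT of the log magnitude spectrum."""
    spectrum = np.fft.fft(np.asarray(frame, dtype=float))
    log_magnitude = np.log(np.abs(spectrum) + eps)   # phase is discarded here
    return np.fft.ifft(log_magnitude).real

# Two frames with the same magnitude spectrum but different phase:
rng = np.random.default_rng(0)
frame = rng.standard_normal(64)
shifted = np.roll(frame, 5)          # a circular shift changes only the phase
same_cepstrum = np.allclose(real_cepstrum(frame), real_cepstrum(shifted))
```

Because both frames map to the same cepstrum, the original waveform cannot be recovered from the cepstrum alone, which is why a re-estimation step is needed.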
Limitations
• Many normalization problems.
• Some applications require extensive sound libraries.
• Different languages require different phonetics.
• It is very seldom complete.
Advantages
• Allows a speech model to be duplicated and an exact
copy of a person's voice to be made.
• Powerful combat zone weapon.
Disadvantages
• Can be used to extract useful information from others.
• It hides the actual identity of the user.
Conclusion
• The approach we have adopted separates the sounds into two
forms:
- Spectral envelope information
- Pitch and voicing information
• Dynamic Time Warping
- Aligns the sounds with respect to their pitches.
• Signal re-estimation algorithm
- Frames are converted back into a time-domain
waveform.
Application Areas
• Fake telephone conversations as evidence in courts of
law.
• Powerful battlefield weapon.
- Provide fake orders to the enemy's troops,
appearing to come from their own commanders.
Future Scope
• Extending the functionality of the tool.
- Create a powerful and flexible morphing
tool.
• Increased user interaction.
- A graphical user interface could be
designed and integrated to make the
package more 'user-friendly'.
Thank you!!!
Questions??
