Voice Gender Recognition Using Deep Learning: December 2016
All content following this page was uploaded by Ali Osman Çıbıkdiken on 25 October 2017.
Advances in Computer Science Research, volume 58
A website has been developed using the Django framework:
https://www.konya.edu.tr/acdal/projects/deep-learning-voice-gender-detection
The model is built in the same way as in the training part. After
compilation, the saved HDF5 file is loaded and the model weights
are set. The user can upload a wav or mp3 file through a web
browser; mp3 files are converted to wav format. The rpy2 library
is used to run R code inside Django. After the file has been
uploaded and converted, its filename is passed to the R code
through rpy2. The voice file is read as a data frame and passed to
the specan function of the warbleR library. The specan function
returns 22 parameters describing the loaded file. The 20 chosen
parameters are then passed to our model to predict the result.
The result is returned to Django and shown to the user. All
computations have been performed in the Advanced Computing and
Data Analysis Laboratory (ACDAL), Necmettin Erbakan University,
Konya.
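The prediction step described above can be sketched as follows. This is a minimal illustration, not the deployed code: the names select_features and classify_upload are hypothetical, the R call through rpy2 and the Keras model are replaced by injected callables, and the 20 feature names are assumed to match the columns of the public voice.csv data set, since the chosen parameters are not listed in this section. The 0.5 decision threshold is likewise an assumption.

```python
from typing import Callable, List, Mapping, Sequence

# Assumption: the 20 model inputs are the acoustic columns of the public
# voice.csv data set. This section of the paper does not list them.
MODEL_FEATURES = (
    "meanfreq", "sd", "median", "Q25", "Q75", "IQR", "skew", "kurt",
    "sp.ent", "sfm", "mode", "centroid", "meanfun", "minfun", "maxfun",
    "meandom", "mindom", "maxdom", "dfrange", "modindx",
)


def select_features(specan_row: Mapping[str, float],
                    names: Sequence[str] = MODEL_FEATURES) -> List[float]:
    """Pick the model inputs from one specan result row, in a fixed order."""
    missing = [n for n in names if n not in specan_row]
    if missing:
        raise KeyError(f"specan output lacks: {missing}")
    return [float(specan_row[n]) for n in names]


def classify_upload(wav_path: str,
                    run_specan: Callable[[str], Mapping[str, float]],
                    predict_male_probability: Callable[[Sequence[float]], float],
                    threshold: float = 0.5) -> str:
    """Hypothetical glue code: wav file -> specan row -> 20 features -> label."""
    row = run_specan(wav_path)                   # in the paper: R code via rpy2
    features = select_features(row)              # keep 20 of the 22 parameters
    p_male = predict_male_probability(features)  # in the paper: the Keras model
    return "male" if p_male >= threshold else "female"


# Stand-ins for the R call and the trained network, for illustration only.
example_row = {name: 0.1 for name in MODEL_FEATURES}
example_row.update({"unused_1": 2.0, "unused_2": 0.0})  # placeholder extras
example_label = classify_upload("sample.wav",
                                lambda path: example_row,
                                lambda feats: 0.8)
```

Passing the feature extractor and the predictor in as callables keeps the selection logic independent of R and Keras, so the ordering of the 20 inputs can be checked without either runtime installed.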
VI. CONCLUSION
The model obtained in this paper shows that the acoustic
properties of voice and speech can be used to detect voice
gender. An MLP has been used to obtain the classification model
from a data set containing the parameters of voice samples. A
larger data set of voice samples could reduce the incorrect
classifications caused by intonation. The web page has been
published so that the model can be developed further from
uploaded male and female voice samples.
ACKNOWLEDGMENT
This work is supported in part by the Necmettin Erbakan
University, BAP Coordination Office.