
Jan-June2024 7

The document discusses the integration of artificial intelligence in music processing, focusing on the MP3 format and its equalization techniques. It highlights the challenges faced by users in adjusting equalizer settings for optimal sound quality and proposes a new MP3 format that embeds equalization settings within the audio file. The paper outlines a four-phase implementation plan to enhance MP3 players to automatically adjust equalizer settings based on the embedded information, improving user experience and sound fidelity.


GLIMPSE - Journal of Computer Science • Vol. 3(1), January-June 2024, pp. 32-36

Novel MP3: The Music with Senses

Kamna Singh
Department of Computer Science and Engineering, Ajay Kumar Garg Engineering College, Ghaziabad, India

Abstract: The field of computer science known as artificial intelligence (AI) is dedicated to building machines that exhibit behaviours deemed intelligent by humans. People have long been fascinated by the idea of creating intelligent robots. With the development of AI programming techniques, this dream is now starting to come true. Researchers are developing ways to imbue computers with human-like intelligence. Systems are being developed that can imitate human thought processes and recognize speech, gestures, fingerprints, and even facial features. These systems can also outperform the world's top chess player, among many other previously unachievable tasks. Humans are inherently musical, which is one of the main characteristics that set us apart from other species. Although most people equate music with emotional expression, the intellect is also a major component of musical endeavours. The interaction between these two components is a topic of study in many scientific domains, including neuroscience.

Keywords: MP3 track, equalizer, gain level

I. Introduction
Music is unquestionably one of the most fascinating applications of human intelligence, and it is studied in fields such as the cognitive sciences and artificial intelligence (AI), to name just a few. By examining simulations of this task, scholars endeavour to unravel the enigmas inherent in both music and intellect. From a practical standpoint, however, the ultimate objective of research on AI and music is to teach computers to perform like expert musicians. In addition to highly specialist skills like composition, analysis, improvisation, and instrument playing, skilled musicians should be able to carry out less specialized tasks like reading concert reviews in the newspaper and interacting with other musicians. In this scenario, the music machine would require a rudimentary comprehension of human social issues, including grief and happiness. Researchers are currently looking for ways to integrate the ability to perform a variety of such tasks into their projects. For example, mechanisms from music analysis systems are combined with systems for composition to enable computers to compose music autonomously in the style of pieces that have been analysed, filtering different aspects of the music, adding echo effects, adjusting speed and effects, etc.

The processing of an auditory signal, or sound, is known as digital audio processing. It covers the sending and receiving of sound that has been stored digitally. Digital audio is an audio signal that has been encoded in digital form: an analog signal is transformed into a digital signal at a specific sampling rate and bit resolution. In digital audio processing, the main focus is usually on analysing the audibility of the signal mathematically. An audio stream can, for instance, be altered for many reasons and adjusted within the audible domain. In addition to the physiology of the human hearing system, psychological characteristics play a major role in determining which portions of the signal are perceived as heard and which are not. Psychoacoustics is the field that analyses these features. The processing techniques and application areas include storage, level compression, data compression, transmission, and enhancement (such as equalization, filtering, noise elimination, echo or reverb addition, etc.).

Because the MP3 format allows sound files to be compressed while maintaining a high degree of quality, making them much easier to store or send online, it has become the most widely used audio format today. The majority of MP3 players are equipped with a built-in graphic equalizer, a high-fidelity audio control that lets the user view and separately adjust several different frequency bands. Equalization of audio is particularly beneficial in improving sound quality and limiting the physical extremes that might otherwise result from recording analog records. Higher-end and mid-range stereophonic sound systems for home use frequently have graphic equalizers. Every time a user listens to a sound track, these equalizer settings need to be adjusted. This can cause a number of issues, including the deterioration of high-quality speakers due to abrupt changes of track without corresponding equalizer settings, car accidents caused by drivers being distracted while frequently adjusting these settings by hand, listener annoyance from constantly adjusting the settings, and more [1].

II. Review of the Literature
A plethora of material is available on the present MP3 format and its decoding, although very little research is directly applicable to the desired music format or its decoding. Articles about the various equalization techniques that can be applied to an audio stream are also available.

MPEG-1 Audio Layer 3, or MP3, is a popular lossy compression and digital audio encoding format. Its goal is to significantly reduce the amount of data needed to represent audio while maintaining what most listeners consider to be a faithful reproduction of the original, uncompressed audio. The
audio-specific compression format is called MP3. Compared to simple techniques, it represents pulse-code modulation (PCM) encoded audio in a substantially smaller amount of space by employing psychoacoustic models to filter out sounds that are less detectable to human ears and storing the remaining information efficiently. A variety of bit rates can be used to compress MP3 audio, offering a range of trade-offs between sound quality and data size. As a general rule, the higher the bit rate employed, the higher the quality during playback, since more information from the original sound file is retained. The possible sampling frequencies in MPEG-1 Layer 3 are 32, 44.1, and 48 kHz, and the available bit rates are 32, 40, 48, 56, 64, 80, 96, 112, 128, 160, 192, 224, 256, and 320 kbit/s [5].

Since MP3 files contain audio separated into frames, each of which carries its own bit rate, the bit rate can be altered dynamically while the file is being encoded. This method further improves quality and reduces storage space by using more bits for parts of the sound with higher dynamics (more movement in the sound) and fewer bits for parts with lower dynamics.

2.1. Graphic Equalizer: A graphic equalizer is a set of filters, each with a fixed center frequency that cannot be changed. The only control you have is the amount of boost or cut in each frequency band. This boost or cut is most often controlled with sliders. This interface is quite intuitive because the frequency response of the equalizer resembles the positions of the sliders themselves. The sliders are a graphic representation of the frequency response. A graphic equalizer uses a set of bandpass filters that are designed to completely isolate certain frequency bands. Each filter in the graphic equalizer has the same input. Their job is to allow only a small band of frequencies through. The filters are arranged in parallel. For each filter that is added in series, its phase response is added to the phase response of the other filters. The phase response also reveals how the filter actually delays the signal.

Figure 1. Graphic Equalizer

An MP3 file is made up of multiple MP3 frames, each consisting of an MP3 header and MP3 data. The MP3 data is the actual audio payload. As the figure shows, the MP3 header contains a sync word, which is used to identify the start of a valid frame. It is followed by a bit indicating that this is the MPEG standard and two bits indicating that Layer 3 is used; hence MPEG-1 Audio Layer 3, or MP3. After this, the values vary depending on the MP3 file. Most MP3 files today also contain ID3 metadata, which comes either before or after the MP3 frames; the figure illustrates this as well.

2.1.1 This article explains the equalization of audio signals. It covers tone controls, parametric equalizers, and graphic equalizers. Equalization, or EQ, is the practice of boosting or cutting specific frequency components in a signal. When recording, equalization is a crucial tool for enhancing the sound of an instrument. In playback, it helps reproduce music faithfully for listeners in their rooms, giving them the impression that they are at a live performance.

Tone controls: Probably the most widely used equalization mechanism, tone controls are present on the majority of stereo systems. They offer a quick and simple way to partially offset the acoustics of the room and customize the sound to your preferences. The "bass" control adjusts a low-pass shelving filter, and the "treble" control a high-pass shelving filter. A signal is amplified by a gain larger than one and attenuated by a gain smaller than one. Shelving filters boost or cut only part of the spectrum, leaving the remainder untouched. "In-between" frequencies are affected by "mid" controls, as in the 3-band equalizers frequently seen on mixers. Such a control typically does not try to isolate certain frequencies, but instead selectively boosts or cuts a specific range of frequencies without affecting other frequency bands.

Graphic equalizers: A graphic equalizer consists of a collection of filters, each having a fixed center frequency that cannot be changed. You are only able to adjust how much boost or cut is applied to each frequency band. Most typically, sliders are used to control this cut or boost. Because the frequency response of the equalizer mimics the slider positions, this interface is rather intuitive [2]. The sliders show the frequency response graphically. Graphic equalizers use a set of bandpass filters intended to fully isolate specific frequency bands. The input of every filter in the graphic equalizer is the same. Their function is to restrict the range of frequencies that can pass through. The filters are arranged in parallel. The phase response of each filter applied in series is added to the phase response of the preceding filters. The actual signal delay caused by the filter is clearly visible in the phase response [3].

2.2 Parametric Equalizers: A single parametric equalizer lets you adjust the bandwidth and center frequency in addition to the boost or cut. A parametric EQ with a lot of cut, often known as a notch filter, can be placed precisely at the frequency where feedback is occurring in order to cancel it. Theoretically speaking, shelving and peaking filters are just particular kinds of parametric equalizers; nonetheless, the majority of commercial devices do have parameter limitations.
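To make the three parameters of a parametric equalizer band concrete (center frequency, bandwidth, and boost or cut), the following is a minimal Python sketch of a single peaking filter. It uses the widely circulated "Audio EQ Cookbook" biquad formulas rather than anything specified in this paper; the function names, the use of a quality factor `q` to express bandwidth, and the example values are assumptions chosen for illustration.

```python
import cmath
import math


def peaking_eq_coefficients(f0_hz, gain_db, q, fs_hz):
    """Return biquad coefficients (b, a) for one peaking band, with a[0] == 1.

    f0_hz   -- center frequency of the band
    gain_db -- boost (positive) or cut (negative) at the center frequency
    q       -- quality factor; a higher q means a narrower bandwidth
    fs_hz   -- sampling rate of the audio
    """
    amp = 10.0 ** (gain_db / 40.0)
    w0 = 2.0 * math.pi * f0_hz / fs_hz
    alpha = math.sin(w0) / (2.0 * q)
    b = [1.0 + alpha * amp, -2.0 * math.cos(w0), 1.0 - alpha * amp]
    a = [1.0 + alpha / amp, -2.0 * math.cos(w0), 1.0 - alpha / amp]
    # Normalize so that the leading denominator coefficient is 1.
    return [x / a[0] for x in b], [x / a[0] for x in a]


def gain_db_at(b, a, freq_hz, fs_hz):
    """Magnitude response of the biquad at freq_hz, in decibels."""
    z_inv = cmath.exp(-1j * 2.0 * math.pi * freq_hz / fs_hz)
    num = b[0] + b[1] * z_inv + b[2] * z_inv ** 2
    den = a[0] + a[1] * z_inv + a[2] * z_inv ** 2
    return 20.0 * math.log10(abs(num / den))


# Example: a 6 dB boost centered on 1 kHz at a 44.1 kHz sampling rate.
b, a = peaking_eq_coefficients(1000.0, 6.0, 1.0, 44100.0)
print(round(gain_db_at(b, a, 1000.0, 44100.0), 3))
```

A large negative `gain_db` with a high `q` approximates the narrow, deep cut described above for suppressing feedback, while frequencies far from the center are left almost unchanged.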


2.3 A Whole MP3 Decoder on a Chip: Hugo Hedberg, Thomas Lenart, and Henrik Svensson (CCCD, Department of Electroscience, Lund University, 118, SE-221 00 Lund, Sweden). This paper presents two distinct hardware decoder implementations, one for FPGA and the other for ASIC. Along with implementation details for the two decoders, it also offers basic information regarding the MP3 format. The work was carried out in a project-based learning environment [4].

Figure 2: The MP3 Decoding Process

The existing approach requires the equalizer settings to be adjusted for each sound track each time it is listened to. This causes a number of issues, including the deterioration of high-end speakers when tracks change abruptly without matching equalizer settings, car accidents caused by drivers getting distracted when they frequently adjust these settings by hand, and listener annoyance from tuning the settings repeatedly [5]. If the equalization settings were saved in the audio format itself along with each sound file, these issues could be resolved. Music players could then decode the settings and apply them automatically before playing each track. MP3 players can be configured to decode embedded equalization settings and apply them properly before playing any audio. These settings can be stored in the tags of the MP3 format. To allow vocals of different frequencies to be tuned with different equalizations, it is suggested that the new format incorporate equalizer settings for the individual parts of the recording rather than a single set of equalizer settings for the entire file. This would be helpful for songs sung by performers whose voice frequencies differ significantly, as well as for songs that include both male and female vocalists. Because the frequency of a woman's voice is higher than that of a man's, different settings are required to improve the audio response. Time-varying equalization therefore gives more control over the sound track.

Figure 4: Example of MP3 Header

The comments section of the ID3 tag of the current MP3 format can contain equalization settings for different time durations of a song. MP3 players can be configured to decode these settings and adjust the equalizer accordingly while the track is playing.

III. Suggested Approach:
Equalization of audio is very helpful for improving sound quality and for limiting the physical extremes that would arise from recording analogue records. Most MP3 players have an embedded graphic equalizer, a high-fidelity audio control that allows the user to see graphically and control individually a number of different frequency bands. The equalizer settings for different time durations are stored in semicolon-separated fields; within a given time duration, the gain level values are stored in colon-separated fields. The equalization settings are kept in the comments field of the ID3 tag of the MP3 format. The new format can be played on current MP3 players without the equalizer being changed, and the old MP3 format can also be played on the new MP3 player with the equalizer set to default values, because care has been taken to preserve the audio information contained in the existing MP3 format. A program has been created that can convert existing MP3 files into the new format, recover equalizer settings from any file in the new format, and play newly created MP3 files in a player by properly adjusting the equalizer while the music is playing [6,7].

There are four stages to the implementation. Phase One: An application was created to insert equalizer settings into a pre-existing MP3 file, converting it from the old format to the new one. Phase Two: The MP3 player was configured to correctly decode and apply these settings. This also entails giving Windows Media Player a new skin. Phase Three: The player was reprogrammed to decode the time-varying equalizer settings and set the equalizer properly. Phase Four: The application was enhanced and the front end was improved.

Figure 3: MP3 Format (existing MP3 format: MP3 header and MP3 data frames with ID3v2x metadata; new MP3 format: the same layout with the equalizer settings embedded)
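The semicolon- and colon-separated layout described above can be sketched in Python as follows. The paper does not give the exact encoding, so this sketch assumes a hypothetical layout in which each semicolon-separated segment starts with its start time in seconds, followed by ten colon-separated gain levels; the function names `parse_eq_comment` and `gains_at` and the example string are likewise illustrative assumptions, not the author's implementation.

```python
from bisect import bisect_right

NUM_BANDS = 10  # the paper describes a ten-band equalizer


def parse_eq_comment(comment):
    """Parse an ID3 comment string into a sorted list of (start_seconds, gains).

    Assumed (hypothetical) layout:
        "start:g1:g2:...:g10;start:g1:g2:...:g10;..."
    Semicolons separate time segments; colons separate the fields of a segment.
    """
    segments = []
    for field in comment.split(";"):
        field = field.strip()
        if not field:
            continue  # tolerate a trailing semicolon
        parts = field.split(":")
        start = float(parts[0])
        gains = [float(p) for p in parts[1:]]
        if len(gains) != NUM_BANDS:
            raise ValueError(f"expected {NUM_BANDS} gain levels, got {len(gains)}")
        segments.append((start, gains))
    segments.sort(key=lambda segment: segment[0])
    return segments


def gains_at(segments, position_seconds, default=0.0):
    """Return the gain levels that apply at the given track position."""
    starts = [segment[0] for segment in segments]
    index = bisect_right(starts, position_seconds) - 1
    if index < 0:
        # No settings apply yet (or none are embedded): use a flat equalizer.
        return [default] * NUM_BANDS
    return segments[index][1]


# Example: one setting from 0 s and another from 90 s onward.
comment = "0:3:2:0:0:-1:-1:0:1:2:3;90:0:0:1:2:3:3:2:1:0:0"
segments = parse_eq_comment(comment)
print(gains_at(segments, 95.0))
```

Falling back to a flat equalizer when no settings are found mirrors the behaviour described for old-format files, which are played with the equalizer at its default values.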


Figure 5: Conversion of Existing MP3 Format to New MP3 Format

The implementation has been divided into four phases. First Phase: An application was designed to convert the existing MP3 into the new format, i.e. equalizer settings were embedded into the already existing MP3 format. Second Phase: The MP3 player was programmed to decode and set these settings appropriately. This also includes creating a new skin for Windows Media Player. Third Phase: The equalizer settings were made time-varying and the player was reprogrammed to decode these settings and set the equalizer appropriately. Fourth Phase: The front end was improved and the application was enhanced.

After that, the modules were combined, and the finished program was tested and debugged.

3.2.1 Phase I: The first phase of the project has been implemented using VB.NET and CDDBControl.dll, which provides an interface to the ID3 tag of the currently used MP3 format. CDDBControl.dll contains the CddbID3Tag class, which has the following data members, each of type String: Album, BeatsPerMinute, Comments, CopyrightHolder, CopyrightYear, FileId, Genre, ISRC, Label, LeadArtist, Movie, PartOfSet, TrackPosition, and Year.

The equalizer settings for a ten-band equalizer are stored in the comments section of the ID3 tag. The frequency bands are 31 Hz, 62 Hz, 125 Hz, 250 Hz, 500 Hz, 1 kHz, 2 kHz, 4 kHz, 8 kHz, and 16 kHz. For each band, the gain level is stored in the comments section of the ID3 tag for different time durations; within a given time duration, the gain level values are stored in colon-separated fields. To do this, the source file is copied to the target location (URL) and then converted to the new format.

Additionally, the application uses VB.NET to present the equalization controls in an attractive layout labelled with subjective terms, making them user-friendly even for inexperienced users. The application offers the ability to scroll through, pause, stop, and play MP3 songs in order to determine the proper equalizer settings. The Windows Media Player ActiveX component has been used to do this.

3.2.2 Phase II: A new skin for Windows Media Player has been created in order to implement Phase II. Skins can be used to add functionality to Windows Media Player that is not already there. The primary interop assembly of the Windows Media Player 9 Series SDK, wmppia.dll, is made available by Microsoft and contains the WMPEqualizerSettingsCtrl class. The EQUALIZERSETTINGS element has been used to set the equalizer to the gain levels stored in the new MP3 format. An event is triggered whenever a media file is opened via the skin, and JScript handles this event. Before playing the media, this event handler changes the equalization to the settings included in the new MP3 format. The following files make up the skin:
• Skin Definition File: a master file with the .wms file extension that specifies how the other files are to be used. The skin definition file contains the basic instructions for the functions of the skin and the locations of the other files it uses. The instructions in the skin definition file are written in Extensible Markup Language (XML).
• Art Files: These files hold the graphic components of the skin. BMP, GIF, JPEG, and PNG formats are among them.
• JScript Files: Script files, which have the .js file extension and are written in Microsoft JScript, are used to construct more intricate reactions to events.

The art files include the following images:
• Primary images: Users view these images after installing the skin. The main image is made up of one or more images produced by particular skin controls.
• Mapping images: These images are used in image mapping to trigger skin-related actions. Windows Media Player uses the graphics in an image map file to trigger actions when the user clicks on the skin; they are not intended for user viewing.
• Alternate images: Other graphics can be programmed to appear in response to user actions. For instance, while the mouse is over a button, an alternate image of the button will appear.

3.2.3 Phase III: The player was reprogrammed to decode the time-varying equalizer settings and adjust the equalizer accordingly. The time-varying equalizer settings are recorded in semicolon-separated fields for different time durations, and the gain level values for a given time duration are stored in colon-separated fields. The comments field of the ID3 tag of the MP3 format has been used to store these settings. The settings are integrated into the current MP3 format with the aid of VB.NET and CDDBControl.dll.

The player has been reprogrammed to decode these settings as follows: the equalization settings for the various time durations are extracted first, and the value of each gain level is then extracted from these settings. The WMS file uses a timer,
and the JScript file handles the ontimer event. The handler compares the current track position to the time period of the retrieved settings and, if a match is found, sets the equalizer to the settings extracted for that specific duration.

3.2.4 Phase IV: During this stage of the project, the front end of the application was made more user-friendly, several menu-driven controls were added, and a comprehensive project help file was produced. A way to access and retrieve the equalization configuration saved in any MP3 file in the new format was also provided. In addition, the ability to construct playlists in HTML, TXT, and M3U formats was introduced. The new player, which was created in Phase III, also supports these playlists. As a result, it is possible to create a playlist that contains both new and old MP3 files. When such a playlist is chosen for playback, the equalization for files in the old MP3 format is set to the factory defaults, and for files in the new MP3 format the settings are determined by the parameters embedded in the format.

IV. Conclusion
In order to save consumers from having to adjust the equalizer manually every time they listen to the same MP3 music, a new MP3 format has been proposed that can include embedded equalizer settings for different time durations of a track. The MP3 player is designed to decode the embedded equalization settings and apply them appropriately while a track is playing. The equalizer settings are part of the MP3 format tags. The new format can be played on existing MP3 players without the equalization being set, and the old MP3 format can also be played on the new MP3 player with the equalizer set to default settings, because care has been taken to preserve the audio information included in the old MP3 format. A program has been created that can convert existing MP3 files into the new format, retrieve equalizer settings from any file in the new format, and play newly created MP3 files in a player by automatically adjusting the equalizer while the music is playing. Additionally, playback and playlist creation capabilities have been included. To play files in the new MP3 format, a new skin for Windows Media Player has been created, which automatically changes the equalizer to the settings encoded in the new MP3 format.

V. Future Scope
In addition to equalization, other improvements might be made to the current MP3 standard. In a similar way, this work can also be extended to support other audio formats, such as WAV, WMA, etc.

REFERENCES
[1] Rupini RV and Nandagopal R, "A study on the influence of senses and the efficacy of sensory branding," Journal of Psychiatry, 2015. DOI: 10.4172/1994-8220.1000236
[2] Malige Gangappa, Avuluri Nikhitha, Bondugula Reshma, Gundabathina Manasa, and Pooja Kumari Singh, "Deep Learning Based Music Recommendation System," International Research Journal of Engineering and Technology (IRJET), Volume 10, Issue 02, Feb 2023.
[3] Xin H., "Enhanced Deep Neural Network-Based Attention Mechanism-Based Music Recommendation Algorithm," Hindawi Mobile Information Systems, Volume 2022, Article ID 4112575, 11 pages, https://doi.org/10.1155/2022/4112575
[4] Yunzhe Dong, "A Machine Learning-Based Music Recommendation System," AMMMP 2023, Volume 47 (2023).
[5] Shubham Kedari, Jui Walimbe, Sakshi Warule, and Madhuri Thorat, "Music Recommendation Using Deep Learning A Study," International Journal of Creative Research and Thoughts, Volume 10, Issue 5, May 2022.
[6] Markus Schedl, "Deep Learning in Music Recommendation System," Frontier in Applied Mathematics and Statistics, 2019.
[7] Andreu Vall, Matthias Dorfer, Hamid Eghbal-zadeh, Markus Schedl, Keki Burjorjee, and Gerhard Widmer, "Feature-combination hybrid recommender systems for automated music playlist continuation," User Modeling and User-Adapted Interaction (2019) 29:527–572, https://doi.org/10.1007/s11257-018-9215-8

Kamna Singh is an Assistant Professor at the Ajay Kumar Garg Engineering College in Ghaziabad. She holds a B.Tech. from JIIT, Noida, and an M.Tech. from UCER, Allahabad. Her research interests include the Internet of Things, cyber security, information security, computer networks, and data mining. She has numerous information security publications in reputable journals and multiple research publications in Scopus-indexed international journals.
