Identification of Musical Notes in Sheet Music Images Using Colors
Abstract— Optical Music Recognition reads notes from musical score sheets by analyzing the interplay of various musical symbols, paying special attention to the placement of musical notes on the stave lines or stave spaces to derive their meaning. We present an alternative method for identifying notes. Each of the five stave lines and four stave spaces was colored differently, and the musical note heads were given a uniform color. The colors resulting from their overlay were used to identify the letter conversion of each musical note on the G-clef staves.

Index Terms— Optical Music Recognition, Artificial Intelligence, Image Processing, sheet music

Presented to the Faculty of the Institute of Computer Science, University of the Philippines Los Baños in partial fulfillment of the requirements for the Degree of Bachelor of Science in Computer Science.
© 2006 ICS University of the Philippines Los Baños
CMSC 190 SPECIAL PROBLEM, INSTITUTE OF COMPUTER SCIENCE

I. INTRODUCTION

A. Background of the Problem

Western Music Notation (WMN) has been widely used and accepted by musicians worldwide as a standard format for representing musical compositions. Using this format, musical notes and symbols are placed systematically on staves, each a series of five lines and four spaces. Each note, along with its accompanying elements, is then visually identified by humans, enabling them to understand the meaning or semantics of the score sheet as a whole.

Several studies have arisen in this field of music wherein the interpretation of a musical score is automated by a machine instead of a human user. A branch of Artificial Intelligence (AI) known as Optical Music Recognition (OMR) has been developed to address this task. OMR systems take as input a digital image of a musical score sheet and create from it an output file suitable for computer manipulation. Output formats include wave files (.wav), .pdf files, or even text files that contain the musical score sheet's converted form, which non-music practitioners who have difficulty reading raw musical scores can understand. Applications and systems in this field have been indispensable for musicians and non-musicians alike, from the interpretation and conversion of a musical score into a form the general public can understand, to the detailed planning of an orchestral piece, and many more.

To convert a musical score sheet into such formats, OMR systems must identify the stave lines, locate the musical objects on the score, identify the symbols that the objects represent, and understand what the positions of the symbols relative to the stave lines and to each other mean [11].

Since the input is an image, OMR systems usually make intensive use of various image processing techniques as their engine. But unlike predecessors in AI such as Optical Character Recognition (OCR), wherein text characters are recognized against a uniform or non-uniform background, classifying notes or objects in OMR is made extra difficult by the fact that the musical notes and symbols are superimposed on the stave lines; that is, both the objects of interest (the notes and symbols) and the background (the stave lines) are of the same color. Special techniques must be employed to distinguish foreground from background.

B. Statement of the Problem

Image processing of musical score sheets has been a challenge for researchers because of the superimposition of notes and symbols on the staves [10]. The musical notes and symbols are usually regarded as more important than the stave lines, for they carry the bulk of the sheet's meaning, and various methods of separating them from the stave lines have been tried in the past. But removing the stave lines usually damages the notes, since some pixels belonging to a note are removed along with the stave lines, thus impairing recognition.

The semantics or meaning of a musical score sheet is given by the sheet itself as a whole, and notes and symbols are easily recognized by the musician through the interaction of musical notes with the spaces and stave lines. Optical Music Recognition systems do the same by keeping track of each note's position relative to the stave lines or stave spaces. Vertical lines corresponding to note stems, which are possible candidates for a note's X-coordinates, are stored in a list, and components such as the note heads that precede a note stem are stored in another list [1][3][12]. With this, I propose to recognize the meaning of notes and symbols by assigning colors to each of the five lines and four spaces, varying their colors in intensity for the different octaves, such that when musical notes, also given a color other than black, coincide with them, they produce colors matched to a specific note's meaning.

C. Significance of the Study

The study, which poses a new method for classifying musical notes, tries to identify the meaning of musical notes by mixing their color with that of the colored stave line or space they fall on. It aims to further improve musical object recognition through the interactive use and assignment of colors for the stave lines, stave spaces, and musical objects.

D. Scope and Limitations

The system implemented in the study focused on the recognition of machine-printed and scanned musical score sheets, taking only the notes on the Treble Staff (the right hand) and not the interaction of the Treble Staff and the Bass Staff together. In the real world, musical notes on the Treble Staff have accompanying notes on the Bass Staff which must be played at the same time to produce a harmonic blending of tones. Synchronizing both staves in an OMR system is a separate topic that focuses on the semantics of the musical score sheet as a whole. My proposed study, which deals only with the recognition of musical notes, does not cover this area.

The system was tested on several full-page machine-printed musical score sheets of varying sizes and resolutions. Note recognition was implemented on the first four spaces and first five lines only, plus the first three upper and lower ledger lines.

E. Objective of the Study

The general objective of our study is to implement and test a method that may further enhance the recognition of musical notes for OMR systems using colors. It makes use of the result of blending unique colors assigned to each stave line, stave space, and musical note to classify the meaning of a musical note.

The specific objectives are:

1) to assign different colors for the staff lines,
2) to assign different colors for the staff spaces,
3) to assign a uniform non-black color to the musical notes,
4) to identify the meaning of the musical notes based on their color blended with the stave line or stave space; and
5) to make a letter-based output (C,D,E,F,G,A,B) of the musical note's converted form.

II. REVIEW OF RELATED LITERATURE

Converting musical score sheets into formats suitable for computer manipulation, and into general formats that non-musically inclined individuals can understand, has driven researchers in the field of Computer Science to explore the possibilities and intricacies of doing so.

Standard Music Notation (SMN) is the widely used format for writing musical compositions that started in the 17th century. It involves the positioning of note characters on one or more staves that consist of five lines and four spaces, each representing a particular musical tone that occupies different positions on the Treble Staff and the Bass Staff. Lines and spaces represent sequential tones, and the note characters differ according to their duration. Most people consider it more difficult to mentally translate the position of each note character to the position that will generate the indicated tone on a musical instrument, especially in fast musical passages, where the mind has little time to process the position of the note character and translate it to the corresponding tone on the instrument. Thus, for most people the only difficult aspect of learning to read SMN is being able to determine in an instant what tone is being represented by a note on the staff [10].

Fig. 1. Parts of a Staff.

One of the initial challenges in any OMR system is the treatment of the staves. For musicians, stave lines are required to facilitate reading the notes. For the machine, however, they become an obstacle that makes the segmentation of the symbols very difficult. The task of separating background from foreground figures is an unsolved problem in many machine pattern recognition systems in general. There are two approaches to this problem in OMR systems. One is to try to remove the stave lines without removing the parts of the music symbols that are superimposed on them. The other is to leave the stave lines untouched and devise a method to segment the symbols [8].

Most OMR systems first locate the position of the stave lines prior to removing them. Many solutions and approaches have been proposed, among them the use of horizontal and vertical projections. Five distinct peaks corresponding to the location of the stave lines are detected in the horizontal projection. The vertical projection, on the other hand, produces peaks corresponding to the widths of the musical notes and symbols. Fujinaga, in his Adaptive Optical Music Recognition (AOMR) in 1988, used these methods along with connected components analysis and run-length coding to successfully isolate a musical score sheet into its primitive parts: the note stems, note heads, bar lines, slurs, symbols and, finally, the staff itself.

Groups of compound musical objects and single objects are isolated after the stave lines are removed. The next phase is to recognize the primitive objects that make up each object. The primitives recognized here are later used to analyze the note's semantics, for there are many ways these may be combined to form a note. For example, a note head with a stem, two bars and a dot makes up a 16th note with one-half of its duration added, depending on the time signature. A commonly used method for the recognition of primitive objects is to match a set of templates against the objects that have been isolated [7][1][6].

The meaning of musical notes is determined by their position or placement on the stave line or stave space [1][3][12]. Bainbridge's CANTOR System [1], Fujinaga's Adaptive Opti-
E. Template Matching

Template matching aims to give a correct guess of an object's group or classification by comparing it with a set of templates, which are more or less different renditions and samples of the said object obtained from other sources. If an object is found to be quite like the objects in the template set, then that object must also be a member of that group or class. The object's invariant features, or characteristics that remain constant in spite of the image's rotation, skew, scale and other severe warping factors, must be wisely selected before they are compared with the same features in the set. The number of black pixels and the ratio of black pixels to area are the most commonly used features for the template matching of musical symbols. K-Nearest Neighbors, one of the most effective and popular classifiers and template matching methods, wherein an object becomes a member of the class whose members are nearest to it by voting, has been implemented in Fujinaga's AOMR and Bainbridge's CANTOR system [1][2][6].

F. Artificial Intelligence (AI)

Artificial Intelligence is a branch of Computer Science whose goal is to understand intelligence by building computer programs that exhibit intelligent behaviors. It is concerned with the concepts and methods of symbolic inference, or reasoning by a computer, and how the knowledge used to make those inferences will be represented inside the machine [5].

A. Assignment of Colors

Colors are assigned to each of the five stave lines and four spaces. A set of dark colors was chosen to represent them. Pure cyan blue (6,105,178) is assigned to the note heads and other musical symbols.

Fig. 6. Stave spaces' color assignments and corresponding note letter conversions.

B. Image acquisition and normalization

The image obtained at this stage may or may not be purely black and white, and some may even be uneven in coloring. All input images must first be binarized, or converted to their black-and-white versions, a large sea of 1's and 0's. Then, the horizontal projection of the binarized sheet music is derived. A group of five prominent peaks in the horizontal projection corresponds to the staff lines. Their y-coordinates are recorded for future use.

Fig. 7. Horizontal projection of a straight sheet music.
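The horizontal projection step described under Image acquisition and normalization can be sketched as follows. This is only a minimal illustration, not the implementation used in the study: it assumes the binarized image is a list of rows with 1 for black and 0 for white, and the names `horizontal_projection`, `staff_line_rows` and the threshold `min_fill` are hypothetical. A row is treated as part of a peak when at least half of its pixels are black, which is an assumed rule of thumb rather than a value taken from the paper.

```python
from typing import List

Image = List[List[int]]  # binarized image: 1 = black pixel, 0 = white pixel


def horizontal_projection(image: Image) -> List[int]:
    """Count the black pixels in every row of a binarized image."""
    return [sum(row) for row in image]


def staff_line_rows(image: Image, min_fill: float = 0.5) -> List[int]:
    """Return the y-coordinate (center row) of each projection peak.

    A row belongs to a peak when at least `min_fill` of its pixels are
    black (an assumed threshold). Adjacent peak rows are merged, since a
    scanned staff line is usually more than one pixel thick.
    """
    width = len(image[0])
    projection = horizontal_projection(image)
    centers: List[int] = []
    start = None
    # The trailing 0 is a sentinel that closes a peak touching the last row.
    for y, count in enumerate(projection + [0]):
        if count >= min_fill * width:
            if start is None:
                start = y
        elif start is not None:
            centers.append((start + y - 1) // 2)
            start = None
    return centers
```

On a straight, unskewed sheet, calling `staff_line_rows` on the binarized image would return one y-coordinate per staff line, in groups of five per staff. A skewed scan spreads each line over several rows and weakens the peaks, so this sketch assumes the sheet is straight, as in Fig. 7.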
to the horizontal staff lines would be recorded, and colors will be assigned to each horizontal line. An effective method for the removal of stave lines, demonstrated by Fujinaga [6], uses vertical black runs. First, the vertical black runs of the entire sheet are taken. The most frequent black-run length corresponds to the staff lines, because they cover most of the page. All vertical runs smaller than or equal to the most common black run (the staff lines) are removed, leaving most of the musical objects without the staff lines. The staff line height and staff space height can also be deduced from the initial vertical black runs.

note heads are irrelevant and would be considered as noise. This includes the sharps, flats, beams, slurs, bar lines, and other musical symbols, with the vertical note stems included. These must be removed such that only the note heads (as much as possible) remain. To remove bar lines and other thick or thin horizontal symbols, symbols whose horizontal black runs are longer than 2.0*noteWidth should be removed. One way to remove long vertical lines is to remove all black pixels whose note stack width from Figure 10.0 is higher than 60 percent of the staff space height. Another method is to remove all horizontal black runs whose widths are greater than the staff space width. After removing the note stems and other symbols, only the note heads and other small symbols will remain. Color them pure cyan blue (6,10,178).
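The run-length ideas in this section can be illustrated with a short sketch. It is not the code used in the study: the image representation (rows of 1 for black, 0 for white) and the names `runs`, `staff_metrics` and `remove_long_horizontal_runs` are assumptions made for the example. The sketch estimates the staff line height and staff space height as the most common vertical black-run and white-run lengths, and erases horizontal black runs longer than a caller-supplied limit such as 2.0*noteWidth. It assumes the image contains at least one black and one white pixel, and it does not show the staff line removal itself.

```python
from collections import Counter
from typing import Iterator, List, Tuple

Image = List[List[int]]  # binarized image: 1 = black pixel, 0 = white pixel


def runs(pixels: List[int]) -> Iterator[Tuple[int, int, int]]:
    """Yield (value, start, length) for each run of equal pixels."""
    start = 0
    for i in range(1, len(pixels) + 1):
        if i == len(pixels) or pixels[i] != pixels[start]:
            yield pixels[start], start, i - start
            start = i


def staff_metrics(image: Image) -> Tuple[int, int]:
    """Estimate (staff line height, staff space height).

    The most common vertical black-run length is taken as the line height
    and the most common vertical white-run length as the space height.
    """
    black: Counter = Counter()
    white: Counter = Counter()
    for x in range(len(image[0])):
        column = [row[x] for row in image]
        for value, _, length in runs(column):
            (black if value else white)[length] += 1
    return black.most_common(1)[0][0], white.most_common(1)[0][0]


def remove_long_horizontal_runs(image: Image, max_len: float) -> Image:
    """Return a copy in which horizontal black runs longer than max_len are erased."""
    cleaned = [row[:] for row in image]
    for row in cleaned:
        # Materialize the runs first so that erasing does not disturb iteration.
        for value, start, length in list(runs(row)):
            if value and length > max_len:
                row[start:start + length] = [0] * length
    return cleaned
```

For example, with a hypothetical measured note head width `note_width`, `remove_long_horizontal_runs(image, 2.0 * note_width)` keeps note-head-sized blobs and drops long horizontal strokes. The same `runs` helper serves both the vertical statistics and the horizontal filter, which keeps the two steps consistent with each other.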
Fig. 21. Melodies of Life (Final Fantasy 9 OST)

Fig. 22. Optical Music Recognition Using RGB

Fig. 23. Unclean scanned musical score sheet (top) and binarized image (bottom)

musical score sheets is also yet to be implemented and a straightforward method of determining the skew angle and correcting the skew by the detected angle is also recommended. Furthermore, the use of colors in identifying notes could also be tested on both the G-clef and the F-clef staves.

VII. ACKNOWLEDGMENT

Many thanks to Him for inspiring me with ideas when my mind goes blank, to Ma'am Marge for being His vessel of guidance and for providing me with wonderful ideas and suggestions for the topic, to Sir Vlad for the priceless knowledge shared in the CMSC170 and CMSC191 sessions, and to my dear loving parents, to whom all of this is dedicated.

REFERENCES

[1] D. Bainbridge and T. Bell, "An extensible optical music recognition system," presented at the Nineteenth Australasian Computer Science Conference, 1996.
[2] T. Bell and K. Lin, "Integrating paper and digital music information systems," Master's thesis, University of Canterbury, Christchurch, New Zealand, 2000.
[3] I. Bloch and F. Rossant, "Robust and adaptive OMR system including fuzzy modeling, fusion of musical rules, and possible error detection," EURASIP Journal on Advances in Signal Processing, vol. 2007, p. 25, August 2006.
[4] N. Carter, "Automatic recognition of printed music in the context of electronic publishing," Ph.D. dissertation, University of Surrey, London, England, February 1989. [Online]. Available: http://www.npcimaging.com/thesis/thesis.html
[5] R. S. Engelmore. (1993, May) Knowledge-based systems in Japan. [Online]. Available: http://www.wtec.org/loyola/kb/
[6] I. Fujinaga, "Optical music recognition using projections," Master's thesis, McGill University, Montreal, Canada, 1988.
[7] I. Fujinaga and M. Droettboom, "Interpreting the semantics of music notation using an extensible and object-oriented system," in Proceedings of the Ninth International Python Conference, pp. 71–85.
[8] S. George, Visual Perception of Music Notation: On-Line and Off-Line Recognition. Australia: Idea Group Publishing, 2004.
[9] S. E. George and S. Sheridan, "Defacing music scores for improved recognition," in Proceedings of the Second Australian Undergraduate Students Computing Conference, Dec. 8–10, 2004, pp. 142–148.
[10] D. Kestenbaum and V. M. Hyman. (2006, Dec) Colored music notation system and method of colorizing music notation. [Online]. Available: http://www.freepatentsonline.com/7148414.html
[11] J. McPherson, "Coordinating knowledge to improve optical music recognition," Ph.D. dissertation, The University of Waikato, New Zealand, Sept 2006. [Online]. Available: http://www.cs.waikato.ac.nz/ davidb/publications/acsc96/final.html
[12] M. Roth, An approach to recognition of printed music. Germany: Eidgenössische Technische Hochschule Zurich, Departement Informatik, Institut für Theoretische, 1994.
[13] (2008) The Audiveris website. [Online]. Available: https://audiveris.dev.java.net/
[14] (1988) Finale (software). [Online]. Available: http://en.wikipedia.org/wiki/Finale Music Notation