Languages for Computer Music

Roger B. Dannenberg
School of Computer Science, Carnegie Mellon University, Pittsburgh, PA, United States
Specialized languages for computer music have long been an important area of
research in this community. Computer music languages have enabled composers who
are not software engineers to nevertheless use computers effectively. While powerful
general-purpose programming languages can be used for music tasks, experience has
shown that time plays a special role in music computation, and languages that embrace
musical time are especially expressive for many musical tasks. Time is expressed
in procedural languages through schedulers and abstractions of beats, duration and
tempo. Functional languages have been extended with temporal semantics, and object-
oriented languages are often used to model stream-based computation of audio.
This article considers models of computation that are especially important for music
programming, how these models are supported in programming languages, and how
this leads to expressive and efficient programs. Concrete examples are drawn from some
of the most widely used music programming languages.
Keywords: music languages, real-time, music representations, functional programming, object-oriented
programming, sound synthesis, visual programming
INTRODUCTION

Music presents a rich set of design goals and criteria for written expression. Traditional music notation evolved to denote musical compositions that were more-or-less fixed in form. While not exactly a programming language, music notation contains control structures, such as repeats and optional endings, that are analogous to the control constructs of modern programming languages (Jin and Dannenberg, 2013).

Traditional music notation and theory about musical time developed in the thirteenth century, while the comparable use of graphs to plot time-based phenomena in science did not occur until the sixteenth century (Crosby, 1997). Perhaps music can also motivate revolutionary thinking in computer science. Certainly, music is unlike many conventional applications of computers. Music exists over time, and its computation must unfold in time, while in conventional computation, faster is always better. Music often includes many voices singing in harmony or counterpoint, while conventional computer architectures and programming languages are sequential, and parallelism is often considered to be a special case. Music making is often a collaborative process, while computation is often viewed as discrete operations where input is provided at the beginning and output occurs at the end. Perhaps music will help to expand thinking about computer languages in general.

Before embarking on a broad discussion of languages for computer music, it will be useful to survey the landscape. To begin, the timeline in Figure 1 shows a number of computer music languages and their dates of introduction or development. A number of these will be used throughout this article to illustrate different trends and concepts.

In the next paragraphs, we will introduce some of the dimensions of programming languages, including their syntax, semantics, implementation issues, and resources for their users. These will be useful in the following sections, where we will describe what makes music special and different (section Why Is Music Different), models of musical time (section Models of Time and Scheduling), models for sound synthesis and audio signal processing (section Models for Sound Synthesis), and examples (section Some Examples). Section Conclusions presents conclusions.

FIGURE 1 | A timeline of representative and historically significant computer music languages. The selection here includes only languages that support digital audio signal generation and processing.
Syntax
Syntax refers to the surface level of notation. Most computer music languages are text-based languages with a syntax similar to other programming languages; for example, one might write x + y to add two variables, f(x,y) to evaluate a function with 2 arguments, or if (c) then f(x,y) to perform a conditional operation.

Graphical syntax has been especially popular in computer music. Figure 2 illustrates simple expressions in this form, and we will discuss graphical music programming languages later. Whether the syntax is text-based or graphical, music languages have to deal with timing, concurrency, and signals, so perhaps even more important than syntax is the program behavior, or semantics.

FIGURE 2 | Graphical syntax examples. At left, a Pd program to add 3 + 4. At right, a Max program to show the number of button presses in the previous 5 s.

Semantics
Semantics refers to the "meaning" or the interpretation of text or graphical notation in a programming language. Semantically, many programming languages are quite similar. They deal with numbers, text strings, arrays, and aggregates in ways that are very much a reflection of the underlying hardware, which includes a large memory addressable by integers and a central processing unit that sequentially reads, writes, and performs arithmetic on data.

Since music computation often includes parallel behaviors, carefully timed output, signal processing, and the need to respond to real-time input, we often find new and interesting semantics in music languages. Music languages include special data types such as signals and scores, explicit specifications for temporal aspects of program behavior, and provisions for real-time scheduling and interaction.

Run-Time Systems
Semantics at the language design level often relate to the "run-time system" at the implementation level. The term "run-time system" describes the organization of computation and a collection of libraries, functions, and resources available to the running program. In short, the run-time system describes the "target" of the compiler or interpreter. A program is evaluated ("run") by translating it into a lower-level language expressed in terms of the run-time system.

Run-time systems for computer music, like music language semantics, are often driven by the special requirements of musical systems. In systems with audio signal processing, special attention must be paid both to efficiency and to the need for synchronous sample-by-sample processing. Concurrency in music often motivates special run-time support such as threads, processes, functional programming, lazy evaluation, or other approaches. The importance of time in music leads to scheduling support and the association of explicit timing with computation or musical events.

Libraries
Most musicians are not primarily software developers. Their main interest is not to develop new software, but to explore musical ideas. Ready-made modules often facilitate exploration or even inspire new musical directions; thus, libraries of reusable program modules are important for most computer musicians. This sometimes inhibits the adoption of new languages, which do not emerge with a mature set of ready-made capabilities and examples. One trend in computer music software is "plug-in" architectures, allowing libraries (especially audio effects and software synthesizers) to be used by multiple languages and software systems.

Programming Environment
Another important factor for most computer musicians is the programming environment. In earlier days of computing, programs were prepared with a simple text editor, compiled with a translator, and executed by the operating system. Modern language development is typically more integrated, with language-specific editors to check syntax and offer documentation, background compilation to detect semantic errors, and the ability to tie run-time errors directly to locations in the program text. Some programming languages support "on-the-fly" programming (or "live coding"), where programs can be modified during program execution. Some music programming environments include graphical time-based or score-like representations in addition to text (Lindemann, 1990; Assayag et al., 1999; Yi, 2017).

Community and Resources
Like other programming languages, computer music languages often enjoy communities of users who author tutorials, help answer questions online, post example code, and maintain open source implementations. While a vibrant community may have little to do with the technical merits of a computer music language, the existence of a helpful community has a large impact on the attractiveness of a language to typical users.

WHY IS MUSIC DIFFERENT

We have mentioned a number of dimensions in which computer music languages differ from "ordinary" general-purpose programming languages. This section will focus on some of these differences, and in particular, the importance of time in music programming. The standard measures of a program in traditional computer science theory are time complexity (how long will a program run?) and space complexity (how much memory is required?).

Music Demands Producing "Answers" at the Right Time
While computing answers quickly is important, in music the "answer" itself is situated in time. In real-time music systems, the timing of musical events is as important as their content. Computer music languages deal with time in different ways:

• An event-based, implicitly timed approach views computation as arising from input events such as a key pressed on a musical keyboard. Programs describe what to do when an input event arrives, and the response is as fast as possible; thus, timing is implicitly determined by the times of events.
• Explicit "out-of-time" systems do not run in real time and instead keep track of musical time as part of the computation. For example, an automatic composition system might generate a musical score as quickly as possible, but it must keep track of the logical progression of time in the music and pass this time along into the program output, which is some kind of musical score representation or perhaps an audio file.
• Precisely timed systems adapt the explicit "out-of-time" approach to a real-time, or "in-time," system. The idea is to maintain an accurate accounting of the "ideal" time of each output event so that even if real computation lags behind now and then, cumulative error can be eliminated. This approach is widely used and is particularly useful when there are multiple processes that need to be musically synchronized (a sketch of this idea appears after this list).
• Sample-synchronous computation is required for audio signal processing. Here, time is effectively quantized into sample periods (typically 44,100 samples per second, the sample rate used by CD audio). Computation proceeds strictly sample-by-sample in a largely deterministic fashion. In practice, operating systems cannot schedule and run a computation for every sample (e.g., every 22 µs), so samples are computed slightly ahead of time in batches of around 1 to 20 ms of audio. Thus, while computation logically proceeds synchronously sample-by-sample, the process must actually compute faster than and slightly ahead of real time to avoid any interruptions in the flow of samples to the output.
Events, Gestures, and Sound
Musical computation takes place at different levels of granularity. A musical "event" usually refers to a macro-scale behavior that has a beginning, duration, and end time. Conventional musical notes can be considered events, but so can the performance of an entire movement of a sonata, for example. Events are often represented by the invocation of functions in programming languages.

"Gestures" in the computer music community usually refer to a continuous function of time, typically a time sequence of sensor values. Examples include pitch-bend information from a MIDI keyboard, accelerometer data from a dancer, and the X-Y path of a mouse. Gestural computations require concurrent processing over time, and there may be special language support for this.

"Sound" refers to sequences of audio samples. Because audio sample periods are in the range of microseconds while the (worst case) response time of computers to events is often multiple milliseconds, audio processing usually requires special organization, buffers, and scheduling, and this has a great impact on computer music languages, as we will see in the next section.

The need to process events, gestures, and sounds is one of the main motivations for computer music languages. Often, computer music languages borrow most of their designs from conventional programming languages, and it is these time-based concepts that drive the differences.
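To make the three granularities concrete, the following C++ types sketch how a system might represent them (the type names and fields are our own, purely illustrative, and not drawn from any language discussed in this article):

#include <utility>
#include <vector>

struct Event   { double onset, duration; int pitch; };              // macro-scale: e.g., a note
struct Gesture { std::vector<std::pair<double, float>> samples; };  // timestamped sensor values
struct Sound   { double sample_rate; std::vector<float> samples; }; // audio, ~microsecond periods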
Data Flow and Synchronous Behavior
Many computer music languages enable audio processing. Music audio is often large, e.g., a 20-min composition in 8 channels of floating point samples takes ~1.7 gigabytes of storage. To deal with such large sizes and also to enable real-time control, audio is usually computed incrementally by "streaming" the audio samples through a graph of generators and operators.

Figure 3 illustrates a simple example that mixes two incoming channels, delays them, filters them, and pans the result to two output channels. The computation expressed by this graph is synchronous, meaning that for each computational step, a node accepts a sample from each input (on the left) and generates a sample for each output (on the right). Samples must never be ignored (dropped) or duplicated. This style of processing is sometimes called "data flow" and is quite different from processing in more common procedural and object-oriented languages. The need to support this type of synchronous signal processing has had a strong influence on computer music language design, as we shall see.

FIGURE 3 | An audio computation graph.
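The synchronous discipline can be sketched in C++ as a graph of objects, each producing exactly one output sample per tick from one sample of each input. This is a toy model for illustration, not the implementation of any system described here:

#include <cstddef>

// One sample flows through each edge of the graph per tick; no sample is
// ever dropped or duplicated.
struct Node { virtual float tick() = 0; virtual ~Node() {} };

struct Source : Node {                 // placeholder input channel
    float tick() override { return 0.0f; }
};

struct Mix : Node {                    // mixes two upstream nodes
    Node *a, *b;
    float tick() override { return a->tick() + b->tick(); }
};

struct OnePole : Node {                // a simple low-pass filter with internal state
    Node *in;
    float state = 0.0f, coef = 0.99f;
    float tick() override {
        state = coef * state + (1.0f - coef) * in->tick();
        return state;
    }
};

int main() {
    Source left, right;
    Mix mix;     mix.a = &left; mix.b = &right;
    OnePole lp;  lp.in = &mix;
    for (int i = 0; i < 64; i++) lp.tick();   // each step advances every node by one sample
}

Because every tick() consumes exactly one sample from each input, samples are never dropped or duplicated, which is the defining property of this style.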
A Good Music Programming Language Should Be a Good Language
There are many languages designed specifically to describe musical scores and event sequences. In particular, languages such as ABC (Walshaw, 2017) for encoding music notation are common. See also Adagio (Dannenberg, 1998), Guido (Hoos et al., 1998), MusicXML (Good, 2001), and Lilypond (Nienhuys and Nieuwenhuizen, 2003). In spite of the success of these examples, music is not so restricted and well-defined that it does not need the power of general-purpose programming languages. In particular, traditional music compositions rely on notation to communicate a fixed sequence of notes and rhythms, whereas modern computer music composition and performance emphasize dynamic computation of notes, rhythms, and, in general, sound events. This requires a more powerful and expressive language.

To illustrate this point, consider one of the earliest programming languages for music, Music V, which included a specialized "score" language to invoke sound synthesis events (e.g., "notes") along a timeline, and an "orchestra" language that described the signal processing to occur within each sound event. Since the score language simply presented a static list of events, their times, and parameters, Music V was not a very general language in terms of computation. However, users soon developed more general "score generating languages" as front-ends to overcome limitations of Music V. This is evidence that music programming languages should be flexible enough to allow any computation to be expressed.

Music Is Not One Thing
Just as there are many styles of music, there are many ways to approach music computationally. Some languages attempt to focus on just one aspect of computation; e.g., Faust (Orlarey et al., 2009) is a language for describing audio signal processing algorithms, but it mostly leaves instantiation and control of these algorithms to other languages and systems. Other languages, such as Nyquist (Dannenberg, 1997b) and OpenMusic (Bouche et al., 2017), strive to be more general, with facilities for scores, automated music composition, control, signal analysis, and sound synthesis. The variety of musical problems and language design goals makes the study and design of computer music languages all the more interesting.

Flexibility to Compose Solutions Is More Important Than Ready-Made Capabilities
It is difficult to define a general language that cleanly addresses many types of problems. Languages attempting to support many different tasks often have several "sub-languages" to handle different programming requirements. For example, Music V has separate score and orchestra languages, and Max/MSP has a similar syntax but different semantics and scheduling for control, audio signals, and image processing. In general, ready-made "solutions" within programming languages and systems tend to be overly specific and ultimately limiting. Therefore, more general languages with the flexibility to create new solutions for the problems at hand are more broadly useful. Within these languages, specific solutions are often developed as sharable packages and libraries.
MODELS OF TIME AND SCHEDULING

Time is essential to music, and musicians have sophisticated abstractions of time. In this section, we will consider some of these abstractions and how they are reflected in programming languages.

Logical Time Systems
The most important time concept for computer music systems is the idea of logical time. Logical time is also a key concept for computer simulations that model behaviors and the progress of time. A simulation must keep track of simulated time even though simulations may run faster or slower than real time. Similarly, music systems can keep track of simulated, or logical, time, computing the precise, ideal time at which events should occur. When a real-time system falls behind (the logical time is less than real time), the system can compute the next event earlier to catch up, and if logical time is greater than real time, the system can wait. Thus, systems based on precise logical times can avoid accumulating timing errors.

For example, Nyquist has operators to control when computations take place within a logical time system. To create musical notes at times 0.5 and 3, one could write:

sim(pluck(c4) @ 0.5, pluck(d4) @ 3)

Nyquist instantiates the pluck function at logical times 0.5 and 3, and the resulting sounds are combined by sim. In practice, Nyquist runs ahead of real time, keeping samples in a playback buffer, and output timing is accurate to within a sample period.
Tempo and Time Deformation
In addition to logical time, music systems often model tempo and beats, which essentially "warp" or "deform" musical time relative to real time. FORMULA (Anderson and Kuivila, 1990) was an early system with elaborate mechanisms for tempo and time deformation. In FORMULA, tempo changes are precisely scheduled events, and tempo can be hierarchical. For example, one process can regulate tempo, and another process, operating within the prescribed tempo framework, can implement a temporary speeding up and slowing down, or rubato.

In Nyquist, tempo changes are represented by mappings from one time system to another. These mappings can be combined through function composition to create nested structures such as tempo and rubato. These mappings can be specified using continuous functions, represented as sequences of samples just like audio (Dannenberg, 1997a). Tempo can be expressed as slope (derivative), and mappings can be nested to represent hierarchical structures such as a swing feel (local perturbations of time) within an overall increasing tempo.
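A rough C++ sketch of nested time deformation, assuming time maps are modeled as simple functions from beat time to real time (Nyquist actually represents maps as sampled functions, so this is only an approximation of the idea):

#include <cmath>
#include <functional>

using TimeMap = std::function<double(double)>;  // maps beat time to real time (seconds)

int main() {
    TimeMap tempo = [](double beat) { return beat * 0.5; };  // a constant 120 BPM
    TimeMap swing = [](double beat) {                        // local perturbation of time:
        double whole = std::floor(beat), frac = beat - whole;
        return whole + (frac < 0.5 ? frac * 1.2 : 0.6 + (frac - 0.5) * 0.8);
    };
    // Function composition nests the swing feel inside the overall tempo:
    TimeMap warp = [=](double beat) { return tempo(swing(beat)); };
    double t = warp(3.25);  // real time at which beat 3.25 should sound
    (void)t;
}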
Non-preemptive Threads in FORMULA and ChucK
One way to express concurrency and precise logical timing is to use threads or co-routines. Threads allow multiple computations to proceed concurrently rather than sequentially. Precise timing is obtained by introducing "sleep" or "wait" functions that pause computation of one thread, perhaps allowing other threads to run. In conventional programs, calling a "sleep" function would result in an approximately timed pause; then the thread would be scheduled to resume at the next opportunity. This, of course, can lead to the accumulation of small timing errors that can be critical for music applications.

A solution used in many computer music languages is to keep track of logical time within each thread. When the thread "sleeps," its logical time is advanced by a precise amount. The thread with the lowest logical time always runs next until another "sleep" is made to advance its logical time. In this scheme, threads do not preempt other threads because, logically, time does not advance until a thread sleeps.

In FORMULA, "threads" are called processes, and "sleeping" is achieved by calling time_advance(delay), which indicates quite directly that logical time is manipulated. The decision to actually suspend computation depends on the relationship between logical time and real time. If logical time is greater, the process should suspend until real time catches up. If logical time is less, the process is behind schedule and should continue to compute as fast as possible until it catches up to real time.

In ChucK, threads are called "shreds," and the "sleep" operation is denoted by the "ChucK" operator (=>). For example, advancing time by 500 ms is performed by the command

500::ms => now;

In many cases, it is not sufficient to wait to run threads until real time meets their logical time. Output is often audio, and audio samples must be computed ahead of real time in order to be transferred to digital-to-analog converters. Therefore, some form of "time advance" is used, where threads are scheduled to keep their logical time a certain time interval ahead of real time. Thus, output is computed slightly early, and there is time to transfer output to device driver buffers ahead of deadlines.
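The following C++ sketch shows a minimal non-preemptive scheduler of this kind. It is hypothetical: FORMULA and ChucK use co-routines, which we approximate here by representing each thread as a resumable callback that advances its own logical time (the counterpart of time_advance or "dur => now"):

#include <chrono>
#include <functional>
#include <queue>
#include <thread>
#include <vector>

struct Task { double logical_time; std::function<void(double&)> resume; };
struct Later {
    bool operator()(const Task& a, const Task& b) const {
        return a.logical_time > b.logical_time;   // orders the queue by logical time
    }
};

int main() {
    std::priority_queue<Task, std::vector<Task>, Later> runnable;
    auto start = std::chrono::steady_clock::now();
    // Each "thread" does its work and then advances its own logical time.
    runnable.push({0.0, [](double& t) { /* play drum  */ t += 0.25; }});
    runnable.push({0.0, [](double& t) { /* play chord */ t += 1.0;  }});
    while (!runnable.empty()) {
        Task task = runnable.top(); runnable.pop();       // lowest logical time runs next
        std::this_thread::sleep_until(                    // suspend until real time catches up
            start + std::chrono::duration<double>(task.logical_time));
        double t = task.logical_time;
        task.resume(t);                                   // runs without preemption
        if (t < 4.0) runnable.push({t, task.resume});     // reschedule at new logical time
    }
}

In a real system, the scheduler would sleep until some interval ahead of each logical time rather than exactly at it, so samples reach the output buffers before their deadlines.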
MODELS FOR SOUND SYNTHESIS

As mentioned earlier, sound and signal computation is synchronous and often expressed as a graph where nodes represent computations and edges represent the flow of audio samples (Bernardini and Rocchesso, 1998). In some cases, the computational nodes have parameters such as filter frequencies, delay times, scale factors, etc., that can be updated, resulting in a hybrid that combines synchronous data-flow computation with asynchronous parameter updates.

In Max/MSP and Pd, audio computation graphs are described graphically. Figure 4 illustrates a simple program in Pd that generates a sinusoid tone, with a slider to adjust the frequency parameter.

Functional Programming
Another approach to representing audio signal graphs is functional programming languages. In the functional approach, unit generators act as functions, and an audio graph is expressed by nested function application, with the output of one generator becoming an input of the next.

Objects and Updates
Unit generators can also be viewed as objects that compute signals while responding to discrete updates. For example, if a vibrato oscillator is represented by an object, then at any time, the program could send a "set_frequency" message to the object to change the rate of vibrato.

Thus, the object-oriented approach provides an intuitive way to mix synchronous sample-by-sample computations with asynchronous real-time event processing. Amatriain (2005) goes further to propose a general model for interactive audio and music processing where the fundamental building blocks are objects with:

• other signal-generating objects as inputs,
• event-like inputs that invoke object methods (functions),
• signals as outputs, and
• event-like outputs, providing a way for the object to send notifications about the signal.

Figure 5 illustrates this object model, and, of course, the model is recursive in that a signal-processing object can be composed from the combination of other signal-processing objects. The CLAM system (Amatriain et al., 2006) used this model within a C++ language framework.

FIGURE 5 | Amatriain's object model of signal computation.
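In the spirit of this model, a hypothetical C++ unit generator might look as follows (the names are invented for illustration): it consumes and produces signals sample-by-sample, accepts an asynchronous update through a method call, and emits an event-like notification:

#include <cmath>

// Signal in, signal out, an event-like input (set_frequency),
// and an event-like output (on_clip).
class Vibrato {
public:
    void set_frequency(double hz) { freq = hz; }   // asynchronous update between samples
    float process(float in) {                      // synchronous, sample-by-sample
        phase = std::fmod(phase + freq / 44100.0, 1.0);
        float out = in * (1.0f + 0.01f * (float)std::sin(6.2831853 * phase));
        if (out > 1.0f && on_clip) on_clip();      // notification about the signal
        return out;
    }
    void (*on_clip)() = nullptr;
private:
    double freq = 5.0, phase = 0.0;
};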
Block Computation
Music audio computation speed can be a significant problem, especially for real-time systems. One way to make computation more efficient is to compute samples in vectors or blocks. Audio computation requires the program to follow links to objects, call functions, and load coefficients into registers. All of this overhead can take as much time as the arithmetic operations on samples. With block computation, we compute block-by-block instead of sample-by-sample. Thus, much of the computational overhead can be amortized over multiple samples. While this is a seemingly small detail, it can result in a factor of two speedup.

Unfortunately, block computation creates a number of headaches for language designers that have yet to be resolved. The main problem with block computation is that logical computation times are quantized into larger steps than would be the case for individual samples, which typically have a period of 22 µs. In contrast, blocks of 64 samples (a common choice) have periods of 1.4 ms. There are signal-processing situations where 1.4 ms is simply not precise enough. A simple example is this: suppose that an amplitude envelope is composed of linear segments that can change slope at each block boundary. This means that the rise time of a sudden onset can be 0, 1.4, or 2.8 ms, but nothing in between. We are sensitive to the sound quality of these different rise times.

Some languages, such as Csound (Lazzarini et al., 2016), allow the user to specify the block size so that smaller blocks can be used when it matters, but using smaller blocks to solve one particular problem will result in less efficient computation everywhere. Max/MSP allows different parts of the audio computation to use different block sizes. Music V uses a variable block size: a central scheduler keeps track of the logical time of the next event, which might begin a note or other signal processing operation. If the next logical time is far enough in the future, a full-sized block is computed. If the next logical time is a few samples in the future, then the audio computation graph is traversed to compute just the next few samples. This allows the graph to be updated with sample-accurate timing.

Another way to save computation is to compute some signals at a lower control rate. Many signals are used to control amplitude, frequency, filter cutoffs, and other parameters that change relatively slowly. Typically, these control signals can outnumber audio signals, so control rate computation can save a substantial amount of computation. This affects language design because signal-processing primitives may come in two types: audio rate and control rate. Often, the control rate is set to the block rate so that a control signal has just one sample per block while audio signals have many samples (e.g., 8 to 64) per block.
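The following hypothetical C++ fragment shows both ideas at once: the sine computation runs at audio rate inside the block loop, while the amplitude envelope is updated only once per 64-sample block, i.e., at the control rate:

#include <cmath>
#include <vector>

const int BLOCK = 64;
const double SR = 44100.0, TWO_PI = 6.283185307179586;

// Audio rate inside the inner loop; control rate once per block.
void synthesize(std::vector<float>& out, int blocks) {
    double phase = 0.0, env = 0.0;
    for (int b = 0; b < blocks; b++) {
        env = 0.999 * env + 0.001;                 // control rate: one update per block
        for (int i = 0; i < BLOCK; i++) {          // audio rate: every sample
            phase = std::fmod(phase + 440.0 / SR, 1.0);
            out.push_back((float)(env * std::sin(TWO_PI * phase)));
        }
    }
}

int main() { std::vector<float> out; synthesize(out, 100); }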
SOME EXAMPLES

Now that we have explored some issues and design elements that influence computer music programming languages, it is time to look at some specific languages. There is not space for complete descriptions, so this section aims to convey the flavor and unique characteristics found in a selection of examples.

Music N
The earliest computer music languages were created by Max Mathews at Bell Labs starting in 1957. A series of languages named MUSIC I, MUSIC II, etc., inspired similar languages including MUSIC 360 for the IBM 360, MUSIC 4BF implemented in FORTRAN, MUSIC 11 for the PDP-11, and Csound implemented in the C programming language. Because the underlying semantics are so similar, these programs are often referred to as "Music N."

The most characteristic aspect of Music N is the separation of the "score" and "orchestra" languages, so Music N is really two languages that work together. The score language is essentially an unstructured list of events. Each event has a starting time, duration, an instrument, and a list of parameters to be interpreted by the instrument. The following example shows a few notes from a Music V score (Mathews et al., 1969):

NOT 0 1 .50 125 8.45 ;
NOT .75 1 .17 250 8.45 ;
NOT 1.00 1 .50 500 8.45 ;
Each line plays one note. The first note (line 1) starts at time 0, uses instrument #1, has a duration of 0.5 s, and has two more parameters for amplitude and pitch control.

The orchestra language defines instrument #1 as follows:

INS 0 1 ;
OSC P5 P6 B2 F2 P30 ;
OUT B2 B1 ;
END ;
The idea is that for each note in the score, an instance of an instrument is created. An instance consists of data for each of the signal processing "objects" OSC and OUT. These objects take parameters from the score using "P" variables (e.g., the amplitude and pitch are denoted by P5 and P6, which take the 5th and 6th fields from the note in the score).

While its syntax is primitive due to 1960's-era computing capabilities, Music V paved the way for many future languages. One big idea in Music V is that instruments are created with a time and duration that applies to all of their signal-processing elements. This idea was extended in Nyquist so that every function call takes place within an environment that specifies time, duration (or stretch factor), and other parameters. Although Music V instruments describe sequential steps (e.g., OSC before OUT in this example), there are clear data-dependent connections (OSC outputs to buffer B2, which is read by OUT), and in a modern language like Nyquist, merely by allowing nested expressions, one can write something like play(osc(c4)), indicating data dependencies and data flow in a functional style.

Thus, Music N made several contributions and innovations. First is the idea that one can create virtual "instruments" by combining various signal processing elements that generate, filter, mix, and process streams of digital audio samples. In Music V, these elements (such as OSC and OUT in the example) are called "unit generators." Nearly all software synthesizers use this concept. Second, Music V introduced the idea that virtual instruments can be invoked somewhat like functions in sequential programming languages, but instances of instruments can exist over time and in parallel, computing a block of samples at each time step. Third, Music V introduced the essential ingredient of time into computer music languages. When instruments are invoked, they are given a starting time and duration, which affect not only the instrument but also all of the unit generators activated inside the instrument.

It should be mentioned that the "modern" Music N language is Csound, which has progressed far beyond the early Music V. In particular, Csound supports a much more modern expression syntax, many more unit generators, some procedural programming, and provisions for real-time control.

Nyquist, mentioned earlier, is also a descendent of Music V. If one considers a Music V score to denote the sum of the results of many timed function calls, and Music V orchestras as functions that return sounds, defined in terms of unit generators and other functions, then much of the semantics of Music V can be expressed in a functional programming style. This is more powerful than Music N because it unifies the ideas of score and orchestra, allowing "instruments" to contain scores and sequences, scores to be hierarchically composed of sub-scores, and instruments to contain sub-instruments.

Max/MSP and Pd
In contrast to Music N, Max/MSP (Puckette, 2002) and its open-source twin Pd (Puckette, 1996) are visual programming languages, but they at least share the idea of "unit generators." Sound synthesis is accomplished by "drawing" unit generators and "connecting" them together with virtual wires, creating data flow graphs as we saw in Figure 4.

Max/MSP does not normally structure data-flow graphs into "instruments," make instances of graphs, or attach time and duration to graphs. These are limitations, but Max/MSP has the nice property that there is a one-to-one correspondence between the visual interface and the underlying unit generators.

Timing, sequencing, and control in Max/MSP are accomplished by sending "update" messages to unit generators. For example, the "patch" in Figure 6 contains a mixture of signal-processing unit generators and message-passing objects. The dashed lines represent signals and the solid lines represent connections for message passing. This patch uses sfplay~ to play a sound file. The output of sfplay~ is passed through *~, which multiplies the signal by another value to implement a volume control, and the audio output is represented by dac~. To control the patch, there is no score. Instead, when "1" is sent to sfplay~, it plays to the end of the file (the user simply "drags and drops" the desired file from the computer's file browser). The user can click on the message box containing "1" to send that message. Similarly, the user can control volume by dragging the cursor up or down on the number box containing "0.72." Changes are sent as messages to *~, which updates its internal scale factor to the new number.

FIGURE 6 | A simple "patch" in Max/MSP to play a sound file with volume control.
SuperCollider
SuperCollider (McCartney, 2002) is primarily a real-time interactive computer music language, having roughly the same goals as Max/MSP. However, SuperCollider is text-based and emphasizes (1) more flexible control structures, (2) treating objects as data, and (3) support for functional programming. For the most part, SuperCollider is organized around object classes. The class UGen represents unit generators, class Synth represents a collection of unit generators that produce audio output, and class Stream represents a sequence of values, which are often parameters for sound events (e.g., notes).

SuperCollider separates control from synthesis, using two processes communicating by messages. One reason for this is to insulate the time-critical signal processing operations in the synthesis engine, scsynth, from less predictable control computations in the composition language, sclang.

SuperCollider illustrates some of the trade-offs faced by language designers. Earlier versions of SuperCollider had more tightly coupled control and synthesis systems, allowing control processing to be activated directly and synchronously by audio events. Also, audio processing algorithms, or instruments, could be designed algorithmically at the time of instrument instantiation. In the latest version, instrument specifications can be computed algorithmically, but instruments must be prepared and compiled in advance of their use. This makes it faster to instantiate instruments, but creates a stronger separation between control and synthesis aspects of programs.

Figure 7 contains a very short SuperCollider program to play a sequence of chords transposed by a random amount. In this program, the SynthDef describes an instrument named sawSynth, which consists of three sawtooth oscillators (Saw, parameterized by an array of 3 frequencies, returns three oscillators, which are reduced to stereo by Splay). The sound is then low-pass filtered by LPF, which is controlled by a slowly varying cutoff frequency generated by LFNoise2. The ar() and kr() methods denote audio rate and control rate versions of unit generators, where lower-frequency control-rate processing is used for efficiency. The instrument is compiled and loaded into the synthesizer engine.

FIGURE 7 | A simple SuperCollider composition and synthesis program, based on examples by Bruno Ruviaro (http://sccode.org/1-54H).

The Pbind expression constructs messages using patterns Pseq, Prand, and Pkey to generate parameter values for the synthesizer and to control the duration of each event. For example, Pseq alternately selects the array [50, 53, 55, 57], generating one chord, and another array [53, 56, 58, 60], offset by a random integer from 0 to 10 (using Pwhite). The result of each pattern generator is of type Stream, which represents an infinite sequence of values. In this case, playing the stream generates an infinite sequence of events representing chords, and sends the events to be played by the synthesizer.
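The stream concept itself is easy to model. Here is a hypothetical C++ sketch of a Pseq-like pattern, a stream that cycles through an array and can be asked for values indefinitely (SuperCollider's actual pattern classes are far richer than this):

#include <cstddef>
#include <vector>

// A stream produces values on demand and never runs out.
struct Pseq {
    std::vector<int> items;
    std::size_t i = 0;
    int next() { int v = items[i]; i = (i + 1) % items.size(); return v; }
};

int main() {
    Pseq roots{{50, 53, 55, 57}};       // compare the first array used in Figure 7
    for (int n = 0; n < 8; n++) {
        int pitch = roots.next();       // an unbounded supply of parameter values
        (void)pitch;                    // a real system would build an event from this
    }
}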
ChucK
ChucK (Wang et al., 2015) is an open source project that originally focused on "live coding," or writing and modifying code in real time as a music performance practice. ChucK terminology uses many plays on words; for example, ChucK refers to precise timing (described earlier) as "strongly timed," which is a reference to a class of conventional languages that are "strongly typed."

ChucK uses a construct called a "shred" (similar to, and rhyming with, the conventional term "thread," and possibly a reference to "shredding," or virtuosic lead electric guitar playing). A shred is the basic module of ChucK, providing a thread and some local variables whose lifetimes are that of the thread. In Figure 8, the first line creates a data flow graph using the unit generators SinOsc, ADSR, and dac. The "ChucK" operator (=>) forms the connections. Notice how this syntax not only establishes connections, but also names the unit generators (s and e, short for "sinusoid" and "envelope"). This ability to reference unit generators and update them corresponds to Amatriain's "Objects and Updates" model, as we will see in the code.

One might expect a different syntax, e.g., dac(SinOsc(ADSR)), which would allow unit generators (such as mixers and multipliers) to accept multiple inputs and parameters. In ChucK, additional parameters are set in various ways using additional statements. You can see e.set(...) used to set envelope parameters, and 0.5 => s.gain used to set the oscillator gain.

After initializing variables, values, and connections, a typical ChucK shred enters a while loop that describes a repeating sequential behavior. Here, we see e.keyOn() and e.keyOff(), used to start and stop the envelope, and 800::ms => now, used to pause the shred for 800 ms of logical time. During the pause, the unit generators continue to run and generate audio.

In ChucK, unit generators compute one sample at a time, which is less efficient than block-at-a-time computation, but it allows the thread to awaken and update unit generators with sample-period accuracy. This allows for some very interesting control and synthesis strategies that interleave "standard" unit generators with custom control changes.
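This interleaving can be sketched in C++ as follows (a hypothetical, simplified engine, not ChucK's implementation): control tasks are indexed by sample number and run between samples, so an update can land on exactly the desired sample:

#include <cmath>
#include <functional>
#include <map>
#include <vector>

int main() {
    double freq = 440.0, phase = 0.0;
    std::multimap<long, std::function<void()>> tasks;  // sample index -> control task
    tasks.insert({22050, [&] { freq = 880.0; }});      // update exactly 0.5 s in
    std::vector<float> out;
    for (long i = 0; i < 44100; i++) {
        auto range = tasks.equal_range(i);             // run any control tasks due now,
        for (auto it = range.first; it != range.second; ++it) it->second();
        phase = std::fmod(phase + freq / 44100.0, 1.0);  // then compute one sample
        out.push_back((float)std::sin(6.283185307179586 * phase));
    }
}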
Faust
While most computer music languages provide some way to express "notes" or sound events that instantiate an audio flow graph (or, in Max/MSP, send messages to audio objects that already exist), Faust (Orlarey et al., 2009) is designed purely to express audio signal computation. Faust also differs from most other languages because it does not rely on a limited set of built-in unit generators. Instead, Faust programs operate at the audio sample level and can express unit generators. In fact, Faust can output code that compiles into unit generators for a variety of languages such as Csound, described earlier.

Rather than inter-connecting pre-compiled unit generators at run time like many other languages, Faust produces code in the C++ programming language (also other languages such as Java, JavaScript, and WebAssembly) that must then be compiled. This compilation avoids much of the overhead of passing data between unit generators, allowing the primitive elements of Faust to be very simple operators such as add, multiply, and delay. These operations apply to individual samples.

Although Faust is text-based, it uses an unusual syntax to encode block diagrams as an alternative to more conventional functional notation. In functional notation, we write f(input1, input2) to denote the output of f with inputs input1 and input2, but in Faust, we write input1, input2 : f as if signals are flowing left-to-right. The comma (,) denotes "parallel composition," which is to say that input1 and input2 are fed in parallel to f. The colon (:) denotes "sequential composition": data flows from input1, input2 into f. Other composition operators describe feedback loops, splitting (fan-out), and summing (fan-in).

Figure 9 contains an example of a Faust program to generate a sine tone with frequency and amplitude controls (Orlarey, 2015). Here, we see a mixture of functional notation, "block diagram" notation, and infix notation. Typically, a sine tone computation would be built into a language, because at best, it would be very inefficient to describe this computation in terms of available operations. Because Faust works at the sample level and writes code for an optimizing compiler, it is practical to describe oscillators, filters, and many signal processing algorithms. In fact, this is the main goal of Faust.

In Figure 9, the second line describes the phase computation. The third line simply scales the phase by 2π and computes the sine. In functional notation, we would say osc(f) = sin(2π · phasor(f)). The phasor function should therefore return a signal where samples increase step-by-step from 0 to 1, then wrap around (using the modulus function fmod) to 0. In a procedural programming language, we might define a variable phase and assign new values to it, e.g.,

phase = fmod(phase + increment, 1);
But Faust is a functional language with no variables or assignment operators, so the algorithm is expressed using feedback denoted by "~_". This says to take a copy of the output and feed it back into the input. Thus, the previous sample of phase is combined with f/ma.SR, and these two signals are added (by "+") and become the first argument to fmod.

The last definition, of process, illustrates that graphical user interface elements can be considered signal generators. Here, a slider labeled "freq" controls the amount by which phase is incremented as a way to change the oscillator frequency, and "level" controls a scale factor of the samples output by osc.

Faust is specialized to describe audio signal processing algorithms. For example, it would be difficult to use Faust to compose a melody. Nevertheless, Faust has become quite popular for creating unit generators and signal processing plug-ins that can be used in other languages and systems. There are substantial libraries of Faust functions, Faust is able to generate ready-to-use modules for a number of different systems, and Faust helps developers avoid many low-level details of programming directly in C or C++.
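For comparison with the Faust version, here is the same oscillator written procedurally in C++ (a hand-written sketch of what such generated code might amount to, not actual Faust output):

#include <cmath>
#include <vector>

// phasor ramps from 0 to 1 at the given frequency; osc scales by 2*pi and
// takes the sine, exactly as described for Figure 9.
std::vector<float> osc(double freq, double sample_rate, int n) {
    std::vector<float> out;
    double phase = 0.0;
    for (int i = 0; i < n; i++) {
        phase = std::fmod(phase + freq / sample_rate, 1.0);  // the phasor
        out.push_back((float)std::sin(6.283185307179586 * phase));
    }
    return out;
}

int main() { osc(440.0, 44100.0, 44100); }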
CONCLUSIONS

Computer music languages offer a fascinating collection of techniques and ideas. Computer music languages differ from other languages in that they must deal with time, complex concurrent behaviors, and audio signals. All of these concepts are fairly intuitive as they relate to music, but they can be very tricky to program in conventional programming languages. Because music making is more a creative process than an engineering discipline, it is important for languages to support rapid prototyping and experimentation, which also leads to specialized notations, syntax, and semantics.

Computer Music Language Challenges
We have seen a number of design issues and some solutions, but there are many challenges for future language designers. We saw how Music N described music synthesis in two languages: the score language, which specifies timing and parameters for synthesis computations, and the orchestra language, which describes the synthesis computations as a graph of interconnected unit generators. In practice, there is a third language needed to describe the internal processing of each unit generator. Nyquist showed a way to unify the score and orchestra languages into a mostly-functional programming language, but it would be even better if Nyquist could define new unit generators as well. Chronic (Brandt, 2000, 2002) was one approach to bridging this gap, but it required special conventions for expressing signal processing algorithms, almost as if using a separate language. Faust (Orlarey et al., 2004, 2009) and Kronos (Norilo, 2015) offer very clean notations for describing unit generators, but neither includes a flexible or powerful notation for events, scores, dynamic instantiation of concurrent behaviors, or time-based scheduling.

The functional programming approach seems natural for signal processing because it is a good match to synchronous data flow or stream-processing behaviors that we see inside unit generators. Functional programming is also natural for the expression of interconnected graphs of unit generators. However, it is also natural to view unit generators as stateful objects that operate on signals synchronously while allowing asynchronous updates to parameters such as volume, frequency, and resonance. It is the nature of music that things change. If "change" could always be expressed as a signal, perhaps music representations would be simpler, but in practice, "change" often arises from discrete events, for example key presses on a musical keyboard. Intuitively, these are state changes, so an important challenge in music language design is providing "natural" functional descriptions of signal flow while at the same time enabling the expression of state changes, discrete events, and their interaction with signals.

A third challenge is to facilitate the inspection and understanding of complex real-time music programs. Max/MSP, with its visual representation of computation graphs, makes it easy to insert probes and visualizations of messages and signals; thus, it ranks highly in terms of transparency. However, Max/MSP does not encourage abstraction in the form of functions, classes, multiple concurrent instances of behaviors, or recursion, and even iteration can be awkward to design and observe. Gibber (Roberts et al., 2015), a live-coding language, takes the innovative approach of animating the source code to indicate when statements are evaluating. Even the font sizes of expressions can be modulated in real time to reflect the amplitude of the signal they are generating. This kind of dynamic display of programs draws attention to active code and helps to associate sounds with the expressions that gave rise to them. In Aura (Dannenberg, 2004), one can design instruments from unit generators with a visual editor (inspired by Max) as well as with code. The visual instrument editor automatically generates a graphical control panel for the instrument so that the user can test the instrument interactively, either before incorporating the instrument into a composition, or afterward when questions arise. The general challenge for language designers is to provide useful inspection and debugging facilities, especially for real-time music performances involving timed, concurrent behaviors.
Language Development Is Active
Although computer music language development began in the 1950's, there is quite a lot of activity today. If anything, fast computing hardware has opened new capabilities, created more demand for creative music software, and encouraged more development. Faster computers also facilitate software development. Ambitious language development projects can be accomplished faster than ever before. On the other hand, users have become accustomed to advanced programming environments with automatic command completion, pop-up hints, reference materials, syntax-directed editing, and other conveniences, and this adds to the burdens of language development. Failure to provide these amenities makes new languages more difficult to learn and use.

Music applications range from theoretical music analysis to live coding. Other applications include generating and controlling MIDI data (an interface designed for and universally used by commercial synthesizers), algorithmic composition, and music typesetting. Applications we have already discussed include music signal processing and event-based real-time systems. Each application area motivates different language organizations and semantics. To some extent, different levels of technical expertise, from beginner to professional software developer, also place emphasis on different aspects of music programming. For all these reasons, we can expect that music language design and development will remain active and interesting for the foreseeable future.
REFERENCES

Amatriain, X. (2005). An Object-Oriented Metamodel for Digital Signal Processing. Ph.D. thesis, Universitat Pompeu Fabra, Barcelona.

Amatriain, X., Arumi, P., and Garcia, D. (2006). "CLAM: a framework for efficient and rapid development of cross-platform audio applications," in Proceedings of ACM Multimedia (New York, NY: ACM), 951–954.

Anderson, D. P., and Kuivila, R. (1990). "A system for computer music performance," in ACM Transactions on Computer Systems (New York, NY: ACM), 56–82.

Assayag, G., Rueda, C., Laurson, M., Agon, C., and Delerue, O. (1999). Computer-assisted composition at IRCAM: PatchWork & OpenMusic. Comput. Music J. 23, 59–72.

Bernardini, N., and Rocchesso, D. (1998). "Making sounds with numbers: a tutorial on music software dedicated to digital audio," in Proceedings of COST G-6 DAFX (Barcelona).

Bouche, D., Nika, J., Chechile, A., and Bresson, J. (2017). Computer-aided composition of musical processes. J. New Music Res. 46, 3–14. doi: 10.1080/09298215.2016.123013

Brandt, E. (2000). "Temporal type constructors for computer music programming," in Proceedings of the 2000 International Computer Music Conference (San Francisco, CA: International Computer Music Association), 328–331.

Brandt, E. (2002). Temporal Type Constructors for Computer Music Programming. Ph.D. thesis, School of Computer Science, Carnegie Mellon University, Pittsburgh. Available online at: https://www.cs.cmu.edu/~music/chronic/

Crosby, A. (1997). The Measure of Reality: Quantification and Western Society, 1250-1600. Cambridge: Cambridge University Press.

Dannenberg, R. B. (1997a). Abstract time warping of compound events and signals. Comput. Music J. 21, 61–70.

Dannenberg, R. B. (1997b). Machine tongues XIX: Nyquist, a language for composition and sound synthesis. Comput. Music J. 21, 50–60.

Dannenberg, R. B. (1997c). The implementation of Nyquist, a sound synthesis language. Comput. Music J. 21, 71–82.

Dannenberg, R. B. (1998). The CMU MIDI Toolkit. Manual, Carnegie Mellon University, Pittsburgh. Available online at: http://www.cs.cmu.edu/~music/cmt/cmtman.txt

Dannenberg, R. B. (2004). "Combining visual and textual representations for flexible interactive signal processing," in The ICMC 2004 Proceedings (San Francisco, CA: International Computer Music Association).

Good, M. (2001). "MusicXML for notation and analysis," in The Virtual Score: Representation, Retrieval, Restoration, eds W. B. Hewlett and E. Selfridge-Field (Cambridge: MIT Press), 113–124.

Hoos, H., Hamel, K., Renz, K., and Kilian, J. (1998). "The GUIDO music notation format - a novel approach for adequately representing score-level music," in Proceedings of the International Computer Music Conference (San Francisco, CA: International Computer Music Association), 451–454.

Jin, Z., and Dannenberg, R. B. (2013). "Formal semantics for music control flow," in Proceedings of the 2013 International Computer Music Conference (San Francisco, CA: International Computer Music Association), 85–92.

Lazzarini, V., Yi, S., Ffitch, J., Heintz, J., Brandtsegg, Ø., and McCurdy, I. (2016). Csound: A Sound and Music Computing System. Cham: Springer.

Lindemann, E. (1990). "ANIMAL - a rapid prototyping environment for computer music systems," in Proceedings of the International Computer Music Conference (San Francisco, CA: International Computer Music Association), 241–244.

Mathews, M. V., Miller, J. E., Moore, F. R., Pierce, J. R., and Risset, J. C. (1969). The Technology of Computer Music. Cambridge, MA: MIT Press.

McCartney, J. (2002). Rethinking the computer music language: SuperCollider. Comput. Music J. 26, 61–68. doi: 10.1162/014892602320991383

Nienhuys, H.-W., and Nieuwenhuizen, J. (2003). "LilyPond, a system for automated music engraving," in Proceedings of the XIV Colloquium on Musical Informatics (XIV CIM 2003) (Firenze: Tempo Reale), 1–6.

Norilo, V. (2015). Kronos: a declarative metaprogramming language for digital signal processing. Comput. Music J. 39, 30–48. doi: 10.1162/COMJ_a_00330

Orlarey, Y. (2015). A Sine Oscillator. Available online at: http://faust.grame.fr/examples/2015/09/30/oscillator.html (accessed January 26, 2018).

Orlarey, Y., Fober, D., and Letz, S. (2004). Syntactical and semantical aspects of Faust. Soft Comput. 8, 623–632. doi: 10.1007/s00500-004-0388-1

Orlarey, Y., Fober, D., and Letz, S. (2009). "FAUST: an efficient functional approach to DSP programming," in New Computational Paradigms for Computer Music (Paris: Editions Delatour).

Puckette, M. (1996). "Pure Data," in Proceedings, International Computer Music Conference (San Francisco, CA: International Computer Music Association), 224–227.

Puckette, M. (2002). Max at seventeen. Comput. Music J. 26, 31–43. doi: 10.1162/014892602320991356

Roberts, C., Wakefield, G., Wright, M., and Kuchera-Morin, J. (2015). Designing musical instruments for the browser. Comput. Music J. 39, 27–40. doi: 10.1162/COMJ_a_00283

Walshaw, C. (2017). Available online at: abcnotation.com (accessed January 29, 2018).

Wang, G., Cook, P. R., and Salazar, S. (2015). ChucK: a strongly timed computer music language. Comput. Music J. 39, 10–29. doi: 10.1162/COMJ_a_00324

Yi, S. (2017). Blue - A Music Composition Environment for Csound. Available online at: http://blue.kunstmusik.com/ (accessed January 29, 2018).

Conflict of Interest Statement: The author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

The reviewer, GF, and handling editor declared their shared affiliation at the time of the review.

Copyright © 2018 Dannenberg. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.