Developing Virtual Synthesizers With VCV Rack
Developing Virtual Synthesizers with VCV Rack takes the reader step by step through the process of
developing synthesizer modules, beginning with the elementary and leading up to more engaging
examples. Using the intuitive VCV Rack and its open-source C++ API, this book will guide even
the most inexperienced reader to master efficient DSP coding to create oscillators, filters, and
complex modules.
Examining practical topics related to releasing plugins and managing complex graphical user
interaction, with an intuitive study of signal processing theory specifically tailored for sound
synthesis and virtual analog, this book covers everything from theory to practice. With exercises and
example patches in each chapter, the reader will build a library of synthesizer modules that they can
modify and expand.
Supplemented by a companion website, this book is recommended reading for undergraduate and
postgraduate students of audio engineering, music technology, computer science, electronics, and
related courses; audio coding and do-it-yourself enthusiasts; and professionals looking for a quick
guide to VCV Rack. VCV Rack is free and open-source software, available online.
If, like me, you enjoy creating your own music tools, this book will help you bring your ideas to life
inside the wonderful world of VCV Rack. Whether you are a beginner or an expert in DSP, you will
learn something interesting or new.
Dr. Leonardo Laguna Ruiz, Vult DSP
Developing Virtual Synthesizers
with VCV Rack
Leonardo Gabrielli
First published 2020
by Routledge
52 Vanderbilt Avenue, New York, NY 10017
and by Routledge
2 Park Square, Milton Park, Abingdon, Oxon, OX14 4RN
Routledge is an imprint of the Taylor & Francis Group, an informa business
© 2020 Taylor & Francis
The right of Leonardo Gabrielli to be identified as author of this work has been asserted by him in
accordance with sections 77 and 78 of the Copyright, Designs and Patents Act 1988.
All rights reserved. No part of this book may be reprinted or reproduced or utilized in any form or
by any electronic, mechanical, or other means, now known or hereafter invented, including
photocopying and recording, or in any information storage or retrieval system, without permission
in writing from the publishers.
Trademark notice: Product or corporate names may be trademarks or registered trademarks, and
are used only for identification and explanation without intent to infringe.
Library of Congress Cataloging-in-Publication Data
Names: Gabrielli, Leonardo, author.
Title: Developing virtual synthesizers with VCV Rack / Leonardo Gabrielli.
Description: New York, NY : Routledge, 2020. | Includes bibliographical references and index.
Identifiers: LCCN 2019039167 (print) | LCCN 2019039168 (ebook) | ISBN 9780367077730
(paperback) | ISBN 9780367077747 (hardback) | ISBN 9780429022760 (ebook) | ISBN
9780429663321 (epub) | ISBN 9780429660603 (mobi) | ISBN 9780429666049 (adobe pdf)
Subjects: LCSH: VCV Rack. | Software synthesizers. | Computer sound processing. | Sound
studios–Equipment and supplies.
Classification: LCC ML74.4.V35 G325 2020 (print) | LCC ML74.4.V35 (ebook) |
DDC 786.7/4134–dc23
LC record available at https://lccn.loc.gov/2019039167
LC ebook record available at https://lccn.loc.gov/2019039168
Preface
Acknowledgments
Chapter 1: Modular Synthesis: Theory
1.1 Why Modular Synthesis?
1.2 An Historical Perspective
1.2.1 The Early Electronic and Electroacoustic Music Studios
1.2.2 The Birth and Evolution of Modular Synthesizers
1.2.3 The Advent of a Standard
1.2.4 The Software Shift
1.3 Modular Synthesis Basics
1.3.1 Sound Sources
1.3.2 Timbre Modification and Spectral Processing
1.3.3 Envelope, Dynamics, Articulation
1.3.4 “Fire at Will,” or in Short: Sequencers
1.3.5 Utility Modules
Chapter 2: Elements of Signal Processing for Synthesis
2.1 Continuous-Time Signals
2.2 Discrete-Time Signals
2.3 Discrete-Time Systems
2.4 The Frequency Domain
2.4.1 Discrete Fourier Series
2.4.2 Discrete Fourier Transform
2.4.3 Properties of the Discrete Fourier Transform
2.4.4 Again on LTI Systems: The Frequency Response
2.4.5 Computational Complexity and the Fast Fourier Transform
2.4.6 Short-Time Fourier Transform
2.5 Once Again on LTI Systems: Filters
2.6 Special LTI Systems: Discrete-Time Differentiator and Integrator
2.7 Analog to Digital and Back
2.8 Spectral Content of Typical Oscillator Waveforms
2.9 Understanding Aliasing
In September 2017, a new open-source software project was released in the wild of the Internet,
starting from a few hundred lines of source code written by a talented developer. A couple of
months later, thousands of passionate musicians and developers were eagerly following its quick
growth. With a user base of tens of thousands, VCV Rack is now one of the trendiest platforms for
modular software synthesis, and it is particularly appealing for its open-source codebase and ease of
use.
In October 2017, I was more than happily having fun out in the sun during the weekends, trying to
soak up the last sun of the trailing summer. My teaching and R&D activities were already quite
engaging and kept me more than busy. However, after a glimpse at its source code, I was instantly
aware of the potential of this platform: VCV Rack was the first platform ever that would really
allow coders, even self-taught ones, to build sounding objects with ease. The modern and swift way of
programming a plugin allows almost anyone with some coding background to create their own
modules. The sample-wise processing architecture allows students to focus on signal processing
experiments rather than spending time debugging cumbersome buffering mechanisms or learning
complex application programming interfaces (APIs). As a teaching tool, it allows students to learn
real-world coding instead of plugging graphical blocks or writing scripts in scientific computing
platforms. As a prototyping tool, it also allows developers to test DSP code quickly before adapting
it to their own framework and delivering it.
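To give a flavor of what sample-wise processing means in practice, here is a minimal sketch in plain C++; it is deliberately independent of the actual Rack API, and all names are illustrative. Each call produces exactly one output sample, so there is no buffer management at all.

```cpp
#include <cmath>

// Illustrative sample-wise processor: one call per audio sample,
// so no buffering logic is required. Not the actual VCV Rack API.
struct SineVoice {
    static constexpr float kTwoPi = 6.2831853f;
    float phase = 0.f; // normalized phase in [0, 1)

    // Produce one sample of a sine at `freq` Hz.
    // `sampleTime` is the sampling period, 1/fs.
    float process(float freq, float sampleTime) {
        phase += freq * sampleTime; // advance the phase
        if (phase >= 1.f)
            phase -= 1.f;           // wrap once per cycle
        return std::sin(kTwoPi * phase);
    }
};
```

A host (or a test bench) simply calls `process()` once per sample; porting the same logic to a buffered API would mean splitting state across block boundaries, which is exactly the bookkeeping Rack spares you.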
I was really tempted to help spread the use of this platform and write a teaching resource for
students and enthusiasts. Encouraged by good friends and synthesizer experts, I got into this
adventure that eventually demanded quite a lot of my time. I hope this resource will prove useful for
beginners and students, allowing them a quick bootstrap into the world of virtual synthesizers, sound
synthesis, and digital signal processing (DSP), and I really believe that Rack has the potential to
serve as a great didactical tool in sound and music engineering teaching programmes.
Finally, I humbly apologize for mistakes or shortcomings that may still be present in the book. The
preparation of this manuscript required a lot of effort to track the swift software changes occurring
in the beta versions, before Rack v1 would freeze, and to release a timely resource in this liquid
world required me to go straight to the point and prepare a compact book that could serve as
a primer, an appetizer, or a quick guide, depending on the reader’s needs. Arranging the contents
was not an easy task either. Writing the theoretical section required a lot of trade-offs to adapt to
heterogeneous readers. I hope that my take on DSP and the selection of modules will make most of
the readers happy.
Preface

Objectives
This book is meant as a guide to get people with some object-oriented programming basics into the
development of synthesizers. Some DSP theory is covered, specifically tailored for sound synthesis
and virtual analog, sufficiently intuitive and suitable for readers not coming from a technical
background (some high school math is required, though). There are many books covering sound
synthesis and DSP for audio, with examples using scientific programming languages, quick to use,
but far from a real-world implementation. Two books published by Focal Press provide a thorough
guide for the development of audio effects and synthesizer plugins in C++. I regard my book as the
perfect starting point. It covers the foundations of digital musical instruments development and
prototyping, gets the reader involved with sound synthesis, and provides some theory in an intuitive
way. It is also an important resource to quickly learn how to build plugins with VCV Rack and get
into its specificities. After this, one could move to more complex platforms such as the VST/AU
software development kits (SDKs) or get involved with more theoretical studies.
The VCV Rack API is very simple, and it only takes a few minutes to create a module from scratch.
Unlike other commercial SDKs, it does not pull your focus away from what's important: learning the DSP
foundations, improving the sound quality of your algorithms, and making the user experience immediate.
You will gain experience with algorithms and be able to prototype novel modules very easily. The
API is general enough to let you port your code to other platforms if you wish. But if you want to
get some revenue from your coding, there is no need to shift to other commercial SDKs, as the
VCV Rack plugin store has numerous options for selling your products. Experienced developers will
thus find sections related to the release of commercial plugins.
Later, in Chapter 9, more advanced concepts are provided. Chapters 6, 7, and 8 introduce readers to
the development of plugins by proposing several examples of increasing mathematical complexity.
Chapter 10 deals with a few extra topics of general interest regarding the development
process, how to get into the third-party plugin list, and how to make some money out of your
modules. A final chapter will suggest some ideas for further reading and new projects.
This book details the development of several synthesizer modules. These are collected in a Rack
plugin called ABC, available online at www.leonardo-gabrielli.info/vcv-book. Each module is meant
to teach something new about Rack and its API; look out for the tips, indicated by a light bulb icon.
The theoretical section has been enriched with boxes where some intuitive hints are given and
mappings between theory and practice are drawn to improve readability and stimulate the reader. As
an engineering student, I always made myself mappings between theoretical concepts and audio
systems; this helped me a lot to remember, understand and have fun. Exercises are also available,
proposing little challenges to start working autonomously. Some of the online modules include
solutions for the exercises, highlighted using C preprocessor macros.
The ABC modules are designed to be simple to avoid confusing the reader with extra functionalities.
They are not fitted with lots of bells and whistles, and they are not meant to be the best in their
category; they are just meant to get you straight to the point and gain an extra piece of information
each time. The plugin store is full of great plugins, and it is up to you to challenge other developers
and come up with even better ones. For this reason, all ABC modules humbly start with an “A”:
I describe a sequencer, “ASequencer”; a clock, “AClock”; and so forth. Maybe you will be the one to
develop “the” sequencer and “the” clock that anybody will use in the Rack community! Have fun!
Acknowledgments
Lots of people have contributed to this book during its conception and writing.
First of all, the book would have never been started without the enthusiastic support of Enrico
Cosimi, who is a fabulous teacher, a charismatic performer, and a great entertainer.
The work has drained much of my energy, patiently restored by Alessandra Cardinali. She also
provided useful suggestions on the math and many occasions for discussion.
The mechanical design and the crafting of metal parts for the hardware Rack prototype have been
carried out with the help of Francesco Trivilino, who is also a valuable modular musician and
a builder of rare machines, such as reproductions of Luigi Russolo’s Intonarumori.
I’m thankful to Stefano D’Angelo for suggesting some corrections and to Andrew Belt for providing
insights. Reading his code is a breeze and there is a lot to learn from him.
My involvement with modular synthesis would have never been so strong without ASMOC
(Acusmatiq-Soundmachines Modular Circus), a yearly performance format that was created by Paolo
Bragaglia and Davide Mancini after a boisterous brainstorming that included Fabrizio Teodosi.
Finally, I want to state my sincere gratitude to my family for instilling in me the love for my region, Le
Marche, a land of organic farmers and ill-minded synth manufacturers. My love for musical
instrument crafting is intertwined with my connection to these towns and the seaside. Thanks to my
dad for passing on to me the love for musical instrument design, coding, and signal processing. I feel
like a third-generation music technologist after him, and my grandfather, who designed the first
plastic reeds for melodicas. I hope to keep the tradition alive and take it to the next level, with
emerging technologies and paradigms, and I hope that this book will help lots of students and
enthusiasts in getting engaged with digital instrument design and development.
CHAPTER 1
Modular Synthesis: Theory
The scope of this book is twofold: while focusing on modular synthesizers – a very fascinating and
active topic – it tries to bootstrap the reader into the broader music-related DSP coding, without the
complexity introduced by popular DAW plugin formats. As such, an introduction to modular synthesis
cannot be neglected, since Rack is heavily based on the modular synthesis paradigm. More specifically, it
faithfully emulates Eurorack mechanical and electric standards. Rack, as the name tells, opens up as an
empty rack where the user can place modules. Modules are the basic building blocks that provide all
sorts of functionalities. The power of the modular paradigm comes from the cooperation of small, simple
units. Indeed, modules are interconnected at will by cables that transmit signals from one to another.
Although most common hardware modules are analog electronic devices, I encourage the reader to
remove any preconception about analog and digital. These are just two domains where differential
equations can be implemented to produce or affect sound. With Rack, you can add software modules
to a hardware setup (Section 11.3 will tell you how), emulate complex analog systems (pointers to
key articles and books will be provided in Section 11.1), or implement state-of-the-art numerical
algorithms of any sort, such as discrete wavelet transform, non-negative matrix factorization, and
whatnot (Section 11.2 will give you some ideas). To keep it simple, this book will mainly cover
topics related to oscillators, filters, envelopes, and sequencers, including some easy virtual analog
algorithms, and provide some hints to push your plugins further.
Even though each module in its own right is well understood by its engineer or developer, the whole
system often fails to be analytically tractable, especially when feedback paths are employed. This is
where the fun starts for us humans (a little bit less fun if we are also in charge of analyzing it using
mathematical tools).
Figure 1.1: C.1960, Alfredo Lietti at the Studio di Fonologia Musicale, RAI, Milan. Lietti is shown standing
in front of rackmount devices he developed for the studio, mostly in the years 1955–1956. A cord patch bay
is clearly visible on the right. Credits: Archivio NoMus, Fondo Lietti.
how communication technology had an impact on the development of electronic music devices.
The Studio di Fonologia at its best boasted third-octave and octave filter banks, other bandpass
filters, noise and tone generators, ring modulators, amplitude modulators, a frequency shifter, an
echo chamber, a plate reverb, a mix desk, tape recorders, various other devices, and the famous
nine sine oscillators (Novati and Dack, 2012). The nine oscillators are often mentioned as an
example of how so few and simple devices were at the heart of a revolutionary musical practice
(Donati and Paccetti, 2002), with the emblematic statement “Avevamo nove oscillatori” (“we had
nine oscillators”).
All the electronic music centers of the 1950s had rackmount oscillators, filter banks, modulators, and
tape devices. These would be poor means by today’s standards to shape and arrange sound, yet they
were exactly what composers required at the time to rethink musical creation. Reorganizing musical
knowledge required elementary tools that would allow the composers to treat duration, dynamics,
pitch, and timbre in analytical terms. The electronic devices used in these early electronic music
studios were typically adapted from communication technology.2 After all, these first studios were
hosted by broadcast companies.
What was extraordinary about the first electronic and electroacoustic works of the 1950s was their
eclecticism. It turned out that a few rudimentary devices were very flexible and could serve the
grand ideas of experimental composers. The research directions that were suggested by early
electronic and electroacoustic music composers of the time, such as Olivier Messiaen, Pierre
Schaeffer, and Karlheinz Stockhausen, proved fruitful. In the works of these studios, we can see the
premises of the advent of modular synthesis, where electronic tools are employed creatively,
suggesting an idea of the studio as a musical instrument or device. A few fundamental steps were
still missing at the time:
• Packing a studio in a case (or in a laptop, as we do nowadays) was impossible. In the 1950s,
the technology was still underdeveloped in terms of integration, size, and power consumption.
The first bipolar junction transistor had just been patented and produced in 1948 and 1950,
respectively, at the Bell Labs, USA, and years of engineering were required to make production
feasible on a large scale and with small form factors.
• Electronic music was very academic and far from audiences, except for early science fiction
movies. These new forms of music would still take a couple of decades to leak into popular
music, as we shall see in the next section.
the multiplier-type ring modulator and other sound modifying devices […] such as an envelope
follower, a tone-burst-responsive envelope generator, a voltage-controlled amplifier, formant and
other filters, mixers, a pitch extractor, a comparator and frequency divider for the extracted
pitch, and a tape loop repeater with dual channel processing. The modular concept proved
attractive due to its versatility, and it was adopted by Robert Moog when he created his modular
synthesizer in 1964.
(p. 733)
Robert Moog (1934–2005), who probably requires no introduction to any reader, started developing
modular synthesizers in the 1960s. This eventually led him to assemble the first integrated
synthesizer ever based on subtractive synthesis. He presented his first voltage-controlled synthesizer
using a keyboard controller at the 16th annual fall convention of the Audio Engineering Society in
1964. The full paper can be found in the Journal of the Audio Engineering Society (Moog, 1965).
This describes his development of an electronic music system for real-time performance, based on
the concepts of modularity, voltage-controlled devices (filters, oscillators, etc.), and the keyboard as
a control device. Indeed, the paper shows a prototype consisting of a five-octave keyboard and
a small wooden case with wired modules. A part of the paper is devoted to discussing the voltage
control paradigm (Figure 1.2).
Robert Moog is best known as the creator of the Minimoog Model D (1970), widely acknowledged
as one of the most influential synthesizers of all time because of its playability and compactness.
However, all the prototyping and improvements that made it great would not have been possible
Figure 1.2: Robert Moog’s experience with modular systems led him to create the first compact synthesizer,
the Minimoog Model D. In this picture, an early prototype system is shown from a scientific paper
by Robert Moog. Image courtesy of the Audio Engineering Society (www.aes.org/e-lib/browse.cfm?
elib=1204).
without the large amount of work conducted on his modular systems side by side with great
musicians of his time.3 The Model D came after several revisions, and the first of them – the Model
A – was composed of modules. All the above shows how modular synthesis, although obscure and
underground (until recently), played a key role in the birth of the electronic synthesizer. It is only
after years of experimentation and engineering that Moog came out with what would be for years
regarded as the perfect companion to keyboard players of the progressive rock era. The Moog
Modular and the Minimoog had a tremendous impact on popular music. Stanley Kubrick’s
A Clockwork Orange (1971) featured Wendy Carlos’ transpositions of Elgar, Purcell, and Beethoven,
as well as other unpublished works of classical inspiration, all played on the Moog Modular. At the
same time, jazz and progressive rock keyboard players Sun Ra (1914–1993), Keith Emerson
(1944–2016), and Rick Wakeman (1949–) adopted the Minimoog for live usage, inspiring dozens of
keyboard players of their time.
While the subtractive synthesizer with keyboard was appealing to many keyboard players, a different
approach was carried out by a slightly lesser-known pioneer in electronic music history: Donald
“Don” Buchla (1937–2016). He designed modular instruments for composers who were interested in
programmable electronic instruments, and his efforts started independently from Robert Moog.
During his whole life, Don Buchla fostered the development of modular synthesis, following an
experimental approach where user interaction is done through sequencers, touch plates, and the panel
itself. Notable composers working with Don Buchla systems were Pauline Oliveros (1932–2016),
Morton Subotnick (1933–), Suzanne Ciani (1946–), and Ramon Sender (1934–).
Together with Don Buchla, another famous name in modular synthesizer history is that of Serge
Tcherepnin. They are both recognized as references for the so-called West Coast approach to
synthesis, in opposition to the East Coast approach.
Figure 1.3: A typical Eurorack modular system by Doepfer, including oscillators, filters, envelope generators,
and various utility modules. Image courtesy of Doepfer Musikelektronik GmbH.
The Eurorack standard is so widespread that VCV Rack adopts it officially: even though voltages
and modules are virtual, all Rack modules comply (ideally) with the Eurorack format. Thus, you
should bear these specifications in mind.
In recent years, the arsenal of Eurorack modules has grown, including digital modules (devoted to
sequencing, signal processing, etc.), programmable modules, often based on open-source hardware
and software (e.g. Arduino and related), or analog-to-digital and digital-to-analog converters for
MIDI conversion or to provide DC-coupled control voltages to a personal computer (Figure 1.3).
Native Instruments’ Reaktor is also an alternative software solution for the emulation of modular
synthesizers. Despite being a modular software environment since its inception, only with version 6
(2015) did it feature Blocks (i.e. rackmount-style modules). One of the highlights in their
advertisement is the excellence of their virtual analog signal processing. I should also mention that
a lot of research papers in the field are written by employees of this company, and that one of them
shares for free, via the company website, the largest reference on virtual analog filters (Zavalishin, 2018),
a mandatory read for those interested in virtual analog!
Finally, there’s VCV Rack. Launched in September 2017 by Andrew Belt, it is the latest platform
for software modular synthesis, and it has grown quickly thanks to its community of developers and
enthusiasts. It differs from the commercial solutions above in that it provides an open-source SDK to
create C++ modules. But of course, you already know this!
Figure 1.5: Typical synthesizer oscillator waveforms include (from left to right) sawtooth, triangular, and
rectangular shapes.
Oscillators usually have at least one controllable parameter: the pitch (i.e. the fundamental frequency
they emit). Oscillators also offer control over some spectral properties. For example, rectangular
waveform oscillators may allow pulse width modulation (PWM) (i.e. changing the duty cycle Δ,
discussed later). Another important feature of oscillators is the synchronization to another signal.
Synchronization to an external input (a master oscillator) is available on many oscillator designs. So-
called hard sync allows an external rising edge to reset the waveform of the slave oscillator and is
a very popular effect to apply to oscillators. The reset implies a sudden transient in the waveform
that alters the spectrum, introducing high-frequency content. Other effects known as weak sync and
soft sync have different implementations. Generally, with soft sync, the oscillator reverses direction
at the rising edge of the external signal. Finally, weak sync is similar to hard sync, but the reset is
applied only if the waveform is close to the beginning or ending of its natural cycle. It must be
noted, however, that there is no consensus on the use of the last two terms, and different
synthesizers have different behaviors. All these synchronization effects require a different period
between slave and master. More complex oscillators have other ways to alter the spectrum of
a simple waveform (e.g. by using waveshaping). Since there are specific modules that perform
waveshaping, we shall discuss them later. Oscillators may allow frequency modulation (i.e. roughly
speaking, controlling the pitch with a high-frequency signal). Frequency modulation is the basis for
FM synthesis techniques, and can be either linear or logarithmic (linear FM is the preferred one for
timbre sculpting following the path traced by John Chowning and Yamaha DX7’s sound designers).
To conclude, tone generation may be obtained from modules not originally conceived for this aim,
such as an envelope generator (discussed later) triggered with extremely high frequency.
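As a rough illustration of the hard-sync behavior described above, here is a minimal sketch in plain C++ (not the Rack API; class and member names are mine): a naive sawtooth slave whose phase is reset whenever a rising edge is detected on the master signal.

```cpp
// Illustrative hard sync: the slave's phase is reset to zero whenever
// the master signal crosses zero upward (a rising edge).
struct HardSyncSaw {
    float phase = 0.f;      // slave phase in [0, 1)
    float lastMaster = 0.f; // previous master sample, for edge detection

    float process(float freq, float sampleTime, float master) {
        if (lastMaster <= 0.f && master > 0.f)
            phase = 0.f;    // rising edge on the master: reset the slave
        lastMaster = master;
        phase += freq * sampleTime;
        if (phase >= 1.f)
            phase -= 1.f;
        return 2.f * phase - 1.f; // naive sawtooth in [-1, 1)
    }
};
```

The abrupt reset is precisely what enriches the spectrum, but in the digital domain both the naive sawtooth and the reset transient cause aliasing, a topic taken up in Chapter 2.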
Noise sources also belong to the tone generators family. These have no pitch, since noise is
a broadband signal, but may allow the selection of the noise coloration (i.e. the slope of the spectral
rolloff), something we shall discuss in the next chapter. Noise sources are very useful to create
percussive sounds, to create drones, or to add character to pitched sounds.
Finally, the recent introduction of digital modules allows for samplers to be housed in a Eurorack
module. Samplers are usually capable of recording tones from an input or recalling recordings from
a memory (e.g. an SD card) and triggering their playback. Other all-in-one modules are available that
provide advanced tone generation techniques, such as modal synthesis, FM synthesis, formant
synthesis, and so on. These are also based on digital architecture with powerful microcontrollers or
digital signal processors (DSPs).
While engineering textbooks consider filters as linear devices, most analog musical filters can be
operated in a way that leads to nonlinear behavior, requiring specific knowledge to model them in
the digital domain.
Waveshaping devices have been extensively adopted by synthesizer developers such as Don Buchla
and others in the West Coast tradition to create distinctive sound palettes. A waveshaper introduces
new spectral components by distorting the waveform in the time domain. A common form of
waveshaper is the foldback circuit, which wraps the signal over a desired threshold. Other processing
circuits that are common with guitar players are distortion and clipping circuits. Waveshaping in the
digital domain requires a lot of attention in order to reduce undesired artifacts (aliasing).
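To make the foldback idea concrete, here is a minimal sketch in plain C++ (the function name and the symmetric threshold convention are my own): anything exceeding the threshold is reflected back into range, rather than clipped flat.

```cpp
#include <cmath>

// Illustrative foldback waveshaper: reflects ("folds") the part of the
// signal exceeding +/- threshold back into range, repeating until the
// sample fits. Unlike a clipper, this creates rich new partials.
float foldback(float x, float threshold) {
    while (std::fabs(x) > threshold) {
        x = (x > 0.f) ? 2.f * threshold - x
                      : -2.f * threshold - x;
    }
    return x;
}
```

Like any static nonlinearity, this produces heavy aliasing in the digital domain unless oversampling or similar countermeasures are applied, as noted above.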
Other effects used in modular synthesizers are so-called modulation effects, most of which are based
on delay lines: chorus, phaser, flanger, echo and delay, reverb, etc. Effects can be of any sort and are
not limited to spectral processing or coloration, so the list can go on.
Vocoders have had a large impact on the history of electronic music and its cross-pollinations. They
also played a major role in the movie industry to shape robot voices. Several variations exist;
however, the main idea behind them is to modify the spectrum of a first sound source with a second
one that provides spectral information. An example is the use of a human voice to shape
a synthesized tone, giving it a speech-like character. This configuration is very popular.
Figure 1.6: A CRB Voco-Strings, exhibited at the temporary Museum of the Italian Synthesizer in 2018, in
Macerata, Italy. This keyboard was manufactured in 1979–1982. It was a string machine with vocoder
and chorus, designed and produced not more than 3 km from where I wrote most of this book.
Photo courtesy of Acusmatiq MATME. Owner: Riccardo Pietroni.
Figure 1.8: Advanced envelope generation schemes may go beyond the ADSR scheme. The panel of a
Viscount-Oberheim OB12 is shown, featuring an initial delay (DL) and a double decay (D1, D2) in
addition to the usual controls.
amplitude of a tone through a VCA, they are performing tremolo. Finally, if they are used to
shape the timbre of a sound (e.g. by modulating the cutoff of a filter), they are performing what is sometimes called wobble.
Other tools for articulation are slew limiters, which smooth step-like transitions of a control voltage.
A typical use is the smoothing of a keyboard control voltage that provides a glide or portamento
effect by prolonging the transition from one pitch value to another.
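A slew limiter can be sketched as a per-sample routine that bounds how fast the output may move toward the input. This is a hypothetical minimal version (the names and the fixed per-sample step are my own simplifications; real modules expose separate rise and fall rates):

```cpp
// Slew limiter sketch: the output chases the input, but may move at
// most maxStep per sample. Applied to a stepped pitch CV, this
// produces a glide/portamento between notes.
float slewLimit(float out, float in, float maxStep) {
    float diff = in - out;
    if (diff > maxStep)  diff = maxStep;   // limit rising slope
    if (diff < -maxStep) diff = -maxStep;  // limit falling slope
    return out + diff;
}
```

Calling this once per sample, a step in the control voltage becomes a linear ramp whose duration depends on maxStep.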
A somewhat related type of module is the sample and hold (S&H). This module does the inverse of
a slew limiter by taking the value at given time instants and holding it for some time, giving rise to
a step-like output. The operation of an S&H device is mathematically known as a zero-order hold filter.
An S&H device requires an input signal and depends on a clock that sends triggering pulses. When these
are received, the S&H outputs the instantaneous input signal value and holds it until a new trigger
arrives. Its output is inherently step-like and can be used to control a range of other modules.
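The S&H behavior described above translates almost literally into code; a minimal sketch (the naming is mine):

```cpp
// Sample-and-hold sketch: on each trigger pulse the instantaneous
// input value is latched; between triggers the last latched value is
// held, producing the characteristic step-like output.
struct SampleHold {
    float held = 0.0f;
    float process(float in, bool trigger) {
        if (trigger)
            held = in;   // latch the current input value
        return held;     // hold it until the next trigger
    }
};
```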
machine-like precision and their obsessive repetition on standard time signatures. Sequencers are made
of an array or a matrix of steps, each representing equally spaced time divisions. For drum machines,
each step stores one bit of information: fire/do not fire. The sequencer cycles repeatedly along the steps
and fires whenever one of them is armed. We may call this a binary sequencer. For synthesizers, each
step has one or more control voltage values associated, selectable through knobs or sliders. These can be
employed to control any of the synth parameters, most notably the pitch, which is altered cyclically,
following the values read at each step. Sequencers may also include both control voltage and a binary
switch, the latter for arming the step. Skipping some steps allows creating pauses in the sequence.
Sequencers are usually controlled by a master clock at metronome rate (e.g. 120 bpm), and at each clock
pulse a new step is selected for output, sending the value or values stored in that step. This allows, for
example, storing musical phrases if the value controls the pitch of a VCO, or storing time-synchronized
modulations if the value controls other timbre-related devices. Typical sequencers consist of an array of
8 or 16 steps, used in electronic dance music (EDM) genres to store a musical phrase or a drumming
sequence of one or two bars with time signature 4/4. The modular market, however, provides all sorts of
weird sequencers that allow for generative music, polyrhythmic composition, and so on.
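A binary sequencer of the kind just described can be sketched in a few lines (a plain C++ illustration with my own naming, driven by an external clock calling clock() once per step):

```cpp
#include <vector>

// Binary step sequencer sketch: an array of armed/disarmed steps is
// cycled at every clock pulse; the sequencer fires whenever the
// current step is armed.
struct StepSequencer {
    std::vector<bool> steps;
    int index;
    explicit StepSequencer(const std::vector<bool>& s)
        : steps(s), index(-1) {}
    bool clock() {
        index = (index + 1) % (int)steps.size();  // advance cyclically
        return steps[index];                      // fire if armed
    }
};
```

Extending each step to also store a control voltage value would turn this into the combined CV-plus-gate sequencer described above.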
Binary sequencers are used for drum machines to indicate whether a part of the drum should fire or not.
Several rows are required, one for each drum part. Although the Roland TR-808 is widely recognized
as one of the first drum machines that could be programmed using a step sequencer, the first drum
machine ever to host a step sequencer was the Eko Computer Rhythm, produced in 1972 and developed
by Italian engineers Aldo Paci, Giuseppe Censori, and Urbano Mancinelli. This sci-fi wonder has six
rows of 16 lit switches, one per step. Each row can play up to two selectable drum parts (Figure 1.9).
Figure 1.9: The Eko Computer Rhythm, the first drum machine ever to be programmed with a step
sequencer. It was devised and engineered not more than 30 km away from where this book
was written. Photo courtesy of Acusmatiq MATME. Owner: Paolo Bragaglia. Restored by Marco Molendi.
Notes
1 An early keyboard-controlled vacuum-tube synthesizer.
2 Indeed, many technological breakthroughs in the musical field derive from previous work in the communication field: from filters to modulators, from speech synthesis to digital signal processors, there is a lot we take from communication engineering.
3 Among the musicians that assisted Moog in his “modular” years (i.e. before the Model D would be released), we should
mention at least Jean-Jacques Perrey and Wendy Carlos as early adopters and Herb Deutsch for assisting Moog in all the
developments well before 1964.
4 In this case, the filter can be treated as an oscillator (i.e. a tone generator).
CHAPTER 2
Elements of Signal Processing for Synthesis
Modular synthesis is all about manipulating signals to craft an ephemeral but unique form of art. Most
modules can be described mathematically, and in this chapter we are going to deal with a few basic aspects
of signal processing that are required to develop modules. The chapter will do so in an intuitive way, to help readers who are passionate about synthesizers and computer music get into digital signal processing without the effort of reading engineering textbooks. Some equations will be necessary, and particularly useful for undergraduate or postgraduate students who require more detail. The math uses the discrete-time notation as much as possible, since the rest of the book deals with discrete-time signals. Furthermore, operations on discrete-time series are simpler to grasp, since they replace integral and differential operators with sums and differences. In my experience with students from non-engineering faculties, this makes some critical points easier to understand.
One of my aims in writing the book has been to help enthusiasts get into the world of synthesizer
coding, and this chapter is necessary reading. The section regarding the frequency domain is the toughest; the rest requires only some high school maths. For the eager reader, the bible of digital signal processing is Discrete-Time Signal Processing (Oppenheim and Schafer, 2009). Engineers and scholars with experience in the field must forgive me for the overly simplified approach. For all
readers, this chapter sets the notation that is used in the following chapters.
This book does not cover acoustic and psychoacoustic principles, which can be studied in specialized textbooks (e.g. see Howard and Angus, 2017), and it does not repeat the tedious account of sound, pitch, loudness, and timbre, which is as far from my view of musical signals as tonal theories are from modular synthesizer composition theories.
TIP: Analog synthesizers and effects work with either voltage or current signals. Like any physical quantity, these are continuous-time signals whose amplitude can take any real value – this is what we call an analog signal. Analog synthesizers do produce analog signals, and thus we need to introduce this class of signals.
A signal is defined as a function or a quantity that conveys some information related to a physical
system. The information resides in the variation of that quantity in a specific domain. For instance,
the voltage across a microphone capsule conveys information regarding the acoustic pressure applied
to it, and hence of the environment surrounding it.
From a mathematical standpoint, we represent signals as functions of one or more independent
variables. The independent variable is the one we indicate between parentheses (e.g. when we write y = f(x), the independent variable is x). In other domains, such as image processing, there are usually two independent variables, the vertical and horizontal axes of the picture. However, most audio signals are represented in the time domain (i.e. in the form f(t), with t being the time variable).
Sine: sin(2πft)
Cosine: cos(2πft)
Sawtooth: (t mod T)
Time can be defined as either continuous or discrete. Physical signals are all continuous-time
signals; however, discretizing the time variable allows for efficient signal processing, as we shall see
later.
Let us define an analog signal as a signal with continuous time and continuous amplitude:
s = f(t) : R → R
The notation indicates that the variable t belongs to the real set and maps into a value that is a function of t and equally belongs to the real set. The independent variable is taken from a set (in this case, R) called the domain, while the dependent variable is taken from a set called the codomain. For any time instant t, s(t) takes a known value.
Most physical signals, however, have finite length, and this holds true for musical signals as well; otherwise, recording engineers would have to be immortal, which is one of the few qualities they still lack. For a finite continuous-time signal that lives in the interval [T1, T2], we define it over this shorter time span:

s = f(t) : [T1, T2] → R
TIP: Discrete-time signals are crucial to understand the theory behind DSP. However, they differ
from digital signals, as we shall see later. They represent an intermediate step from the real
world to the digital world where computation takes place.
Let us start with a question: Why do we need discrete-time signals? The short answer is that
computers do not have infinite computational resources. I will elaborate on this further. You need to
know that:
approximation of the signal, but a good one, if we set the sampling interval according to certain
laws, which we shall review later.
In this chapter, we shall discuss mainly the third point (i.e. sampling). Quantization (the second point)
is a secondary issue for DSP beginners, and is left for further reading. Here, I just want to point out
that if you are not familiar with the term “quantization,” it is exactly the same thing you do when
measuring the length of your synth to buy a new shelf for it. You take a reference, the measuring tape,
and compare the length of the synth to it. Then you approximate the measure to a finite number of
digits (e.g. up to the millimeter). Knowing the length with a precision up to the nanometer is not only impractical by eye, but also useless and hard to write, store, and communicate. Quantization has
only marginal interest in this book, but a few hints on numerical precision are given in Section 2.13.
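The shelf-measuring analogy can be restated in code: quantizing an amplitude is just rounding it to the nearest available level. A sketch (the rounding scheme and the names are my own simplification):

```cpp
#include <cmath>

// Amplitude quantization sketch: round a value in [-1, 1] to the
// nearest of the levels offered by the given number of bits.
float quantize(float x, int bits) {
    float levels = float(1 << (bits - 1));  // levels per unit amplitude
    return std::round(x * levels) / levels;
}
```

With only 3 bits, an input of 0.333 is forced onto the nearest grid value, 0.25; more bits make the grid finer and the rounding error smaller.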
Let us discuss sampling. As we have hinted, the result of the sampling process is a discrete-time
signal (i.e. a signal that exists only at specific time instants). Let us now familiarize ourselves with
discrete-time signals. These signals are functions, like continuous-time signals, with the independent
variable not belonging to the real set. It may, for example, belong to the integer set Z (i.e. to the set
of all positive and negative integer numbers). While continuous-time signals are defined for any time instant t, discrete signals are defined only at equally spaced time instants n belonging to Z. This set is less populated than R because it lacks the values between integers. There are, for instance, infinitely many values between instants n = 1 and n = 2 in the real set that the integer set does not have. However, do not forget that even Z is an infinite set, meaning that theoretically our signal can go on forever.
Getting more formal, we define a discrete-time signal as s = f[n] : Z → R, where we adopt a notation usual in DSP books: discrete-time signals are denoted by square brackets, and the variable n is used instead of t. As you may notice, the signal is still a real-valued signal. As
we have discussed previously, another required step is quantization. Signals with their amplitude
quantized do not belong to R anymore. Amplitude quantization is a step that is independent of time discretization. Indeed, we could have continuous-time signals with quantized amplitude (although this is not very common). Most of the inherent beauty and convenience of digital signal processing is related to the properties introduced by time discretization, while amplitude quantization has only a few side effects that require some attention during the implementation phase. We should state clearly that digital signals are discretized in both time and amplitude. However, for simplicity, we shall now focus on signals that have continuous amplitude.
A discrete-time signal is a sequence of ordered numbers and is generally represented as shown in
Figure 2.1. Any such signal can be decomposed into single pulses, known as Dirac pulses. A Dirac
pulse sequence with unitary amplitude is shown in Figure 2.2, and is defined as:
Figure 2.2: The Dirac sequence δ[n] (a) and the sequence 10δ[n − 3] (b).
Figure 2.3: The signal in Figure 2.1 can be seen as the sum of shifted and weighted copies of the Dirac delta.
δ[n] = 1 for n = 0, and δ[n] = 0 for n ≠ 0   (2.1)

Aδ[n − T] = A for n = T, and 0 for n ≠ T   (2.2)
By shifting multiple Dirac pulses in time and properly weighting each one of them (i.e. multiplying by a different real coefficient), we can obtain any arbitrary signal, such as the one in Figure 2.1. This process is depicted in Figure 2.3. The Dirac pulse is thus a sort of elementary particle in this quantum game we call DSP.2 If you have enough patience, you can sum infinite pulses and obtain an infinite-length discrete-time signal. If you are lazy, you can stop after N samples and obtain a finite-length discrete-time signal s = f[n] : [0, N] → R. If you are lucky, you can give a mathematical definition of a signal and let it build up indefinitely for you. This is the case of a sinusoidal signal, which is infinite in length but can be described by a finite set of samples (i.e. one period).
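The last remark can be demonstrated with a short sketch: store one period of a sinusoid and read it cyclically to extend the signal as far as you like (this is also the core idea behind wavetable oscillators; the code and names are my own illustration):

```cpp
#include <cmath>
#include <vector>

const double kPi = 3.14159265358979323846;

// Store one period (N samples) of a discrete-time sinusoid.
std::vector<float> onePeriod(int N) {
    std::vector<float> table(N);
    for (int n = 0; n < N; n++)
        table[n] = std::sin(2.0 * kPi * n / N);  // one full cycle
    return table;
}

// Read sample n of the (conceptually infinite) signal by cycling
// through the stored period.
float readCyclic(const std::vector<float>& table, int n) {
    return table[n % table.size()];
}
```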
Table 2.2 reports notable discrete-time sequences that are going to be of help in progressing through
the book.
Table 2.2: Notable discrete-time sequences, their mathematical notation and their graphical
representation
TIP: Discrete-time systems must not be confused with digital systems. As with discrete-time sig-
nals, they represent an intermediate step between continuous-time systems and digital systems.
They have a conceptual importance, but we do not find many of them in reality. One notable
exception is the Bucket-Brigade delay. This kind of delay, used in analog effects for chorus, flangers, and other modulation effects, samples the continuous-time input signal at a frequency
imposed by an external clock. This will serve as a notable example in this section.
Figure 2.4: Unidirectional graph for an LTI system, composed of a summation point, unitary delays (z⁻¹), and products by a scalar (gains), depicted with a triangle. The graph implements the difference equation y[n] = a1 x[n] + x[n − 1] − a2 y[n − 1].
fundamental to understand the theory underlying systems theory, control theory, acoustics, physical
modeling, circuit theory, and so on. Unfortunately, useful systems are nonlinear and time-variant
and often have many inputs and/or outputs. To describe such systems, most theoretical approaches
try to look at them in the light of the LTI theory and address their deviations with different
approaches. For this reason, we shall follow the textbook approach and prepare some background
regarding the LTI systems. Nonlinear systems are much more complex to understand and model.
These include many acoustical systems (e.g. hammer-string interaction, a clarinet reed, magnetic
response to displacement in pickups and loudspeaker coils) and most electronic circuits
(amplifiers, waveshapers, etc.). However, to deal with these, a book is not enough, let alone
a single chapter! A few words on nonlinear systems will be spent in Sections 2.11 and 8.3.
Further resources will be given so you can continue your reading and start discrete-time modeling of nonlinear systems.
Now it is time to get into the mathematical description of LTI systems and their properties. A single-input, single-output discrete-time system is an operator T{·} that transforms an input signal x[n] into an output signal y[n]:

y[n] = T{x[n]}   (2.3)
y[n] = x[n − L]   (2.4)

with the time index n running possibly from minus infinity to infinity and L being an integer. Equation 2.4 describes a delaying system introducing a delay of L samples. You can prove
yourself that this system delays a signal with pen and paper, fixing the length L and showing that for
each input value the output is shifted by L samples to the right. The delay will often be denoted as z⁻ᴸ, for reasons that are clear to the expert reader but less apparent to other readers. This is
related to the Z-transform (the discrete-time equivalent of the Laplace transform), which is not
covered in this book for simplicity. Indeed, the Z-transform is an important tool for the analysis and
design of LTI systems, but for this hands-on book it is not strictly necessary, and as usual
I recommend reading a specialized DSP textbook.
Let us now take a break and discuss the Bucket-Brigade delay (BBD). This kind of device consists of a cascade of L one-sample delay cells (z⁻¹), creating a so-called delay line (z⁻ᴸ). Each of these cells is a capacitor, able to store a charge, and each cell is separated by means of transistors acting as a gate. These
gates are activated at once by a periodic clock, thus allowing the charge of each cell to be transferred, or
poured, to the next one. In this way, the charge stored in the first cell travels through all the cells, reaching the
last one after L clock cycles, thus delayed by L clock cycles. You can clearly see the analogy with a bucket of
water that is passed from hand to hand in a line of firefighters having to extinguish a fire. Figure 2.6 may
help you understand this concept. This kind of device acts as a discrete-time system; however, the stored
values are physical quantities, and thus no digital processing happens. However, being a discrete-time
system, it is subject to the sampling theorem (discussed later) and its implications.
Figure 2.6: Simplified diagram of a Bucket-Brigade delay. Charge storage cells are separated by
switches, acting as gates that periodically open, letting the signals flow from left to right. When the
switches are open (a), no signal is flowing, letting some time pass. When the switches close (b),
each cell transmits its value to the next one. At the same time, the input signal is sampled, or frozen, and its current value is stored in the first cell. It then travels the whole delay line, through the cells, one by one.
Back to the theory, we have now discussed three basic operations: sum, product, and delay. By
composing these three operations, one can obtain all sorts of LTI systems, characterized by the
linearity and the time-invariance properties. Let us examine these two properties.
Linearity is related to the principle of superposition. Let y1[n] be the response of a system to input x1[n], and similarly y2[n] be the response to input stimulus x2[n]. A system is linear if – and only if – Equations 2.5 and 2.6 hold true:

T{x1[n] + x2[n]} = y1[n] + y2[n]   (2.5)

T{ax1[n]} = ay1[n]   (2.6)
where a is a constant. These properties can be combined into the following equation, stating the principle of superposition:

T{ax1[n] + bx2[n]} = T{ax1[n]} + T{bx2[n]}   (2.7)
In a few words, a system is linear whenever the output to each different stimulus is independent of other additive stimuli and their amplitudes.
LTI systems are only composed of sums, scalar products, and delays. Any other operation would
violate the linearity property. Take, for instance, a simple system that instead of a scalar product
multiplies two signals. This is described by the following difference equation:

y[n] = x1[n] · x2[n]

This simple system is memoryless (the output is not related to any previous input) and nonlinear: feeding both inputs with the same signal x[n] yields the squaring system y[n] = x[n]², and exponentiation is not a linear function. It is straightforward to see that (x1[n] + x2[n])² ≠ x1[n]² + x2[n]², except for trivial cases where at least one of the two inputs is zero.
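This failure of superposition can be checked numerically with a trivial sketch:

```cpp
// The memoryless squaring system y[n] = x[n]^2: a minimal example of
// a nonlinear system, since the response to a sum of inputs is not
// the sum of the individual responses.
float square(float x) {
    return x * x;
}
```

With x1 = 1 and x2 = 2, the response to the sum is 9, while the sum of the individual responses is 5.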
Another interesting property of LTI systems is time-invariance. A system is time-invariant if it always operates in the same way regardless of the moment when a stimulus is provided, or, more rigorously, if:

y[n − L] = T{x[n − L]}
Time-variant systems do change their behavior depending on the moment we are looking at them.
Think of an analog synthesizer filter affected by changes in temperature and humidity of the room, and
thus behaving differently depending on the time of the day. We can safely say that such a system is
a (slowly) time-variant one. But even more important, it can be operated by a human and changes its
response while the user rotates its knobs. It is thus (quickly) time-variant while the user modifies its
parameters. A synthesizer filter, therefore, is only an approximation of an LTI system: it is time-variant. We
can neglect slow variations due to the environment (digital filters are not affected by this – that’s one of
their nice properties), and assume time-invariance while they are not touched, but we cannot neglect the
effect of human intervention. This, in turn, calls for a check on stability while the filter is manipulated.
Causality is another property, stating that the output of a system depends only on the current and past inputs. An anti-causal system can be treated in mathematical terms but has no real-time implementation in the real world, since nobody knows the future (yet).
Anti-causal systems can be implemented offline (i.e. when the entire signal is known). A simple
example is the reverse playback of a track. You can invert the time axis and play the track from the
end to the beginning because it has been entirely recorded. Another example is a reverse reverb effect. Real-time implementations of this effect rely on storing a short portion of the input signal, reversing it, and then passing it through a reverb. This is offline processing as well, because the signal is first stored and then reversed.
This means that if the input signal is bounded (both in the positive and negative ranges) by a value
Bx for its whole length, then the output of the system will be equally bounded by another maximum value By, and both bounding values are finite. In other words, the output will never go to infinity, no matter what the input is, excluding those cases where the input itself is infinite (in such cases, we can accept the output going to infinity).
Unstable systems are really bad for your ears. An example of an unstable system is the feedback
connection of a microphone, an amplifier, and a loudspeaker. Under certain conditions (the
Barkhausen stability criterion), the system becomes unstable, growing its output with time and
creating a so-called Larsen effect. Fortunately, no Larsen grows up to infinity, because no public
address system is able to generate infinite sound pressure levels. It can hurt, though. In Gabrielli et al.
(2014), a virtual acoustic feedback algorithm to emulate guitar howling without hurting your ears is
proposed and some more information on the topic is given.
One reason why we are focusing our attention on LTI systems is the fact that they are quite common
and they are tractable with relatively simple math. Furthermore, they are uniquely defined by a property called the impulse response. The impulse response is the output of the system after a Dirac pulse is applied to its input. In other words, it is the sequence:

h[n] = T{δ[n]}
An impulse response may be of finite or infinite length. This distinction is very important and leads
to two classes of LTI systems: infinite impulse response (IIR) systems and finite impulse response (FIR) systems. Both are very important in the music signal processing field.
Knowing the impulse response of a system allows us to compute the output as:

y[n] = T{x[n]} = Σ_{k=−∞}^{+∞} h[k] x[n − k]   (2.12)
Those who are familiar with math and signal processing will recognize that the sum in Equation 2.12 is the convolution between the input and the impulse response:

y[n] = x[n] * h[n]   (2.14)
where the convolution is denoted by the symbol *. As you may notice, the convolution sum can be quite expensive if the impulse response is long: for each output sample, you have to evaluate up to infinitely many products! Fortunately, there are ways to deal with IIR systems that do not require infinite operations per sample (indeed, most musical filters are IIR, but very cheap). Conversely, FIR systems with a very long impulse response are hard to deal with (partitioning the convolution can help by exploiting parallel processing units). In this case, dealing with signals in the frequency domain provides some help, as the convolution x[n] * h[n] can be seen in the frequency domain as the product of the two Fourier transforms (this will be shown in Section 2.4.3). Of particular use are those LTI systems whose impulse response can be formulated in terms of difference equations of the type:
Σ_{k=0}^{N} ak y[n − k] = Σ_{k=0}^{M} bk x[n − k]   (2.15)
These LTI systems, most notably filters, are easy to implement and analyze. As you can see, they are based on three operations only: sums, products, and delays. The presence of delays is not obvious from Equation 2.15, but consider that to recall past elements (x[n − 1], x[n − 2], etc.), some sort of delay mechanism is required: the present input must be stored and recalled after k samples. Delays are obtained by storing values in memory. Summing, multiplying, writing a value to memory, reading a value from memory: these operations are implemented in all modern processors3 by means of dedicated instructions. Factoring the LTI impulse response as in Equation 2.15 is what makes IIR filters cheap: there is no need to perform an infinite convolution, since storing the previous outputs keeps the memory of all the past history.
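As a concrete instance of Equation 2.15, here is a first-order IIR filter sketched in plain C++ (the coefficients and names are mine; a single stored output value stands in for the whole infinite impulse response):

```cpp
// First-order IIR sketch implementing y[n] = b0*x[n] - a1*y[n-1],
// i.e. Equation 2.15 with N = M = 1, a0 = 1 and b1 = 0. The one
// stored output sample carries the memory of the entire past history.
struct OnePole {
    float b0, a1;
    float y1;  // previous output
    OnePole(float b0_, float a1_) : b0(b0_), a1(a1_), y1(0.0f) {}
    float process(float x) {
        float y = b0 * x - a1 * y1;
        y1 = y;
        return y;
    }
};
```

With b0 = 0.5 and a1 = −0.5, the recursion y[n] = 0.5 x[n] + 0.5 y[n − 1] smooths its input, a simple lowpass-like averager.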
f = ω / 2π,  i.e.  ω = 2πf = 2π / T   (2.16)
If the concept of frequency is clear, we can now apply this to signal processing. Follow me with
a little patience as I point out some concepts and then connect them together. In signal processing, it
is very common to use transforms. These are mathematical operators that allow you to observe the signal in a different light (i.e. they change the shape5 of the input signal and take it to a new domain, without affecting its informative content). The domain is the realm where the signal lives. Without getting too abstract, let us take the mathematical expression:

s(x) = sin(ωx)   (2.17)
If x ∈ R, then the resulting signal s(x) lives in the continuous-time domain. If x ∈ Z, then s(x) lives in the discrete-time domain. We already know these two domains. We are soon going to define a third domain, the frequency domain. We shall see that in this new domain, the sine signal of Equation 2.17 has a completely new shape (a line!) but still retains the same meaning: a sinusoidal wave oscillating at ω rad/s.
There is a well-known Internet meme showing that a solid cylinder may look like a circle or a square if projected on a wall with light coming from the side or from the bottom, suggesting that observing one phenomenon from different perspectives can yield different results, yet the truth is a complex mixture of both observations (see Figure 2.7). Similarly, in our field, the same phenomenon (a signal) can be observed and projected in different domains. In each domain, the signal looks different, because each domain describes the signal in a different way, yet both domains speak of the same signal.
In the present case, one domain is time and the other is frequency. Often the signal is generated in the
Figure 2.7: The projection of a cylinder on two walls from orthogonal perspectives. The object appears with
different shapes depending on the point of view. Similarly, any signal can be projected in the time and the
frequency domain, obtaining different representations of the same entity.
time domain, but signals can be synthesized in the frequency domain as well. There are also techniques
to observe the signal in both domains, projecting the signal in one of the known time-frequency
domains (more in Section 2.4.6), similar to the 3D image of the cylinder in Figure 2.7. Mixed time-
frequency representations are often neglected by sound engineers, but are very useful to grasp the
properties of a signal at a glance, similar to what the human ear does.6
We shall first discuss the frequency domain and how to take a time-domain signal into the frequency
domain.7
Figure 2.8: The signal of Figure 2.1 is periodic, and can thus be decomposed into a sum of periodic
signals (sines).
As we know, the frequency of this signal is the reciprocal of the period (i.e. fF = 1/N, or ωF = 2π/N). In this section, we will refer to the frequency as the angular frequency. The fundamental frequency is usually denoted with a 0 subscript, but for a cleaner notation we will stick with the F subscript in the following lines.
Given a periodic signal with period N, there is a small number of sinusoidal signals whose period is equal to N or an integer divisor of N. What if we write the periodic signal as a sum of those signals? Let us take, for example, cosines as these basic components. Each cosine has an angular frequency ωk that is an integer multiple of the fundamental angular frequency ωF. Each cosine will have its own amplitude and phase. The presence of a constant term a0 is required to take into account an offset, or bias, or DC component (i.e. a component at null frequency – it is not oscillating, but it fulfills Equation 2.18). This is the real-form discrete Fourier series (DFS).
x̃[n] = a0 + Σ_{k=1}^{N−1} ak cos(ωk n + φk),   ωk = kωF   (2.19)
Equation 2.19 tells us that any periodic signal in the discrete-time domain can be seen as a sum of a finite number of cosines at frequencies that are multiples of the fundamental frequency, each weighted by an amplitude coefficient ak and shifted by a phase coefficient φk. The cosine components are the harmonic partials,9 or harmonics, of the original signal, and they represent the spectrum of the signal.
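Equation 2.19 can also be read as a recipe for additive synthesis: sum a DC term and a handful of weighted, phase-shifted cosines. A sketch in plain C++ (the function name and the coefficient values used below are arbitrary illustrations of mine):

```cpp
#include <cmath>
#include <vector>

const double kPi = 3.14159265358979323846;

// Real-form DFS synthesis sketch (Equation 2.19): build one period
// (N samples) of a periodic signal from a DC term a0 plus cosines at
// integer multiples of the fundamental, with amplitudes amp[k] and
// phases phase[k] for harmonics k = 1, 2, ...
std::vector<float> dfsSynth(int N, float a0,
                            const std::vector<float>& amp,
                            const std::vector<float>& phase) {
    std::vector<float> x(N, a0);
    double wF = 2.0 * kPi / N;  // fundamental angular frequency
    for (int n = 0; n < N; n++)
        for (size_t k = 0; k < amp.size(); k++)
            x[n] += amp[k] * std::cos((k + 1) * wF * n + phase[k]);
    return x;
}
```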
We can further develop our math to introduce a more convenient notation. Bear with me and
Mr. Jean Baptiste Joseph Fourier for a while. It can be shown that a cosine having frequency and phase components in its argument can be seen as the sum of a sine and a cosine of the same frequency with no phase term in their arguments, as shown in Equation 2.20. In other words, the added sinusoidal term takes into account the phase shift given by φ. After all, intuitively, you already know that cosine and sine are the same thing, just shifted by a phase offset:

ak cos(ωk n + φk) = ak cos(φk) cos(ωk n) − ak sin(φk) sin(ωk n)   (2.20)
We can plug Equation 2.20 into Equation 2.19, obtaining a new formulation of the Fourier theorem that employs sines and cosines and requires the same number of coefficients (ak, bk instead of ak, φk) to store information regarding x̃[n]:

x̃[n] = a0 + Σ_{k=1}^{N} [ak cos(ωk n) + bk sin(ωk n)]   (2.21)
Now, if you dare to play with complex numbers, signal processing theory offers another mathematical notation that helps to develop the theory further. If you want to stop here, this is OK, but you won't be able to follow the remainder of the section regarding Fourier theory. You can jump to Section 2.4.3, and you should be able to understand almost everything.
We start from Euler's formula:

e^{jθ} = cos(θ) + j sin(θ)   (2.22)

from which the cosine and sine can be expressed in terms of complex exponentials:
cos θ = (e^{jθ} + e^{−jθ}) / 2 = (1/2) e^{jθ} + (1/2) e^{−jθ}

sin θ = (e^{jθ} − e^{−jθ}) / 2j = −(1/2) j e^{jθ} + (1/2) j e^{−jθ}
   (2.23)
Now we shall apply these expressions to the second formulation of the DFS. If we take θ = ωk n, we can substitute the sine and cosine expressions of Equation 2.23 into Equation 2.21, yielding:

x̃[n] = a0 + Σ_{k=1}^{N} [ ak ((1/2) e^{jθ} + (1/2) e^{−jθ}) + bk (−(1/2) j e^{jθ} + (1/2) j e^{−jθ}) ]
The last step consists of transforming all the constant terms ak, bk into one single vector that shall be our frequency representation of the signal:

x̃[n] = Σ_{k=−N}^{N} X̃[k] e^{jθ} = Σ_{k=−N}^{N} X̃[k] e^{jωk n}   (2.24)
where

X̃[k] = (1/2) ak − (1/2) j bk   for k ≥ 1
X̃[k] = a0   for k = 0
X̃[k] = (1/2) a|k| + (1/2) j b|k|   for k ≤ −1
   (2.25)
This little trick not only makes Equation 2.21 more compact, but also instructs us how to construct a new discrete signal X̃[k] that bears all the information related to x̃[n].
Several key concepts can be drawn by observing how X̃[k] is constructed:
• X̃[k] is complex-valued, except for X̃[0].
• X̃[0] is the offset component, now seen as a null-frequency component, or DC term.
• X̃[k] is composed of terms related to negative frequency components (k ≤ −1) and positive frequency components (k ≥ 1). Negative frequencies have no physical interpretation, but nonetheless are very important in DSP.
• Negative frequency components have identical coefficients, apart from the sign of bk (i.e. they are complex conjugates, X̃[−k] = X̃*[k]). Roughly speaking, negative frequency components can be neglected when we observe a signal.11
To conclude, we are now able to construct a frequency representation of a time-domain signal for
periodic signals. The process is depicted in Figure 2.9.
Figure 2.9: A signal in time (left), decomposed as a sum of sinusoidal signals (waves in the back). The
frequency and amplitudes of the sinusoidal signals are projected into a frequency domain signal (right).
The projection metaphor of Figure 2.7 is now put in context.
Figure 2.10: A finite duration signal $x[n]$ (a) is converted into a periodic signal $\tilde{x}[n]$ by juxtaposing
replicas (b). The periodic version of $x[n]$ can now be analyzed with the DFS.
$$x[n] = \sum_{k=0}^{N-1} X[k]\, e^{j(2\pi/N)kn} \qquad (2.26)$$
Please note that in Equation 2.26 we do not refer to a fundamental frequency $\omega_k$, since $x[n]$ is
not periodic, but we made the argument of the exponential explicit, where $k(2\pi/N)$, $k = 0, \ldots, N-1$,
are the N frequencies of the complex exponentials, and N can be selected arbitrarily, as we shall
later detail.
Equation 2.26 not only states that a discrete-time signal can be seen as a sum of elementary
components. It also tells how to transform a spectrum $X[k]$ into a time-domain signal. But wait
a minute: we do not yet know how to obtain the spectrum! For tutorial reasons, we focused on
understanding the Fourier series and its derivatives. In practice, however, we need to know how to
transform a signal from the time domain to the frequency domain. The discrete Fourier transform
(DFT) is a mathematical operator that takes a real13 discrete-time signal of length N into a complex
discrete signal of length N that is a function of frequency:
$$\mathcal{F}: \mathbb{R}^N \rightarrow \mathbb{C}^N \qquad (2.27)$$
The DFT is invertible, meaning that the signal in the frequency domain can be transformed back to obtain
the original signal in the time domain without degradation of the signal (i.e. $\mathcal{F}^{-1}\{\mathcal{F}\{x[n]\}\} = x[n]$). The
direct and inverse DFT are done through Equations 2.28 and 2.29:

$$X[k] = \sum_{n=0}^{N-1} x[n]\, e^{-j(2\pi/N)kn} \qquad \text{(DFT)} \qquad (2.28)$$

$$x[n] = \frac{1}{N} \sum_{k=0}^{N-1} X[k]\, e^{j(2\pi/N)kn} \qquad \text{(IDFT)} \qquad (2.29)$$
The equations are very similar. The added term $1/N$ is required to preserve the energy. As seen
above, $X[k]$ is a complex-valued signal. How do we interpret this? The real and imaginary parts of each
frequency bin do not alone tell us anything of practical use. To gather useful information, we need
to further process the complex-valued spectrum to obtain the magnitude spectrum and the phase
spectrum. The magnitude spectrum is obtained by calculating the modulus (or norm) of each
complex value, while the phase is obtained by the argument function:

$$\|X_k\| = \sqrt{\Re\{X_k\}^2 + \Im\{X_k\}^2} \qquad \text{(Modulus)} \qquad (2.30)$$

$$\angle X_k = \arctan \frac{\Im\{X_k\}}{\Re\{X_k\}} \qquad \text{(Argument)} \qquad (2.31)$$
The magnitude and phase spectra provide different kinds of information. The first is most commonly
used in audio engineering because it provides information that we directly understand (i.e. the
energy of the signal at each frequency component, something our ears are very sensitive to). The
phase spectrum, on the other hand, tells the phase of the complex exponentials along the frequency
domain. It has great importance in the analysis of LTI systems.
It is important to note that the length of the spectrum is equal to the length of the signal in the time domain.
Since the number of DFT points, or DFT bins (i.e. the number of frequencies), is equal to the length
N of the signal, we incur a practical issue: very short signals have very few DFT bins (i.e. our
knowledge of the signal in frequency is quite rough). To gather more insight, there is a trick called
zero-padding. It consists of adding zeros at the end or at the beginning of the signal. This artificially
increases the signal length to $M > N$ points by padding with zeros, thus increasing the number of
DFT bins. Zero-padding is also employed to round the length of the signal to the closest power of 2
that is larger than the signal, which makes it suitable for efficient implementations of the DFT (a few
words regarding computational complexity will come later). Applying zero-padding (i.e. calculating
the M-point DFT) does not alter the shape of the spectrum; it just adds M−N points by interpolating
the values of the N-point DFT.
So far, we have discovered:
• Any finite-length discrete-time signal can be described in terms of weighted complex exponentials (i.e. a sum of sinusoids). The weights are called the spectrum of the signal. The complex exponentials are equally spaced in frequency.
• From the weights of these exponentials, we can obtain the signal back in the time domain.
• Both $x[n]$ and $X[k]$ have the same length N.
Let us now review the DFT of notable signals.
meaning that the DFT of the sum of two signals is equal to the sum of the two DFTs. This property
has a lot of practical and theoretical consequences.
Time scaling:

$$\mathcal{F}\{x(at)\} = \frac{1}{a} X\!\left(\frac{k}{a}\right) \qquad (2.33)$$

One of the consequences of this property is of interest to sound designers: it implies the pitch
shifting of a signal that is played back faster (a > 1) or slower (a < 1).
Periodicity:

$$W_N^{k(N+n)} = W_N^{(k+N)n} = W_N^{kn} \qquad (2.34)$$
This means that the product of two DFTs is equal to the DFT of the convolution between the two signals. On
par with that, it is also true that the product of two signals in the time domain is equivalent to the
convolution of their spectra in the frequency domain (i.e. $x[n] \cdot y[n] \Leftrightarrow X(k) * Y(k)$).
Parseval’s theorem:

$$\sum_{n=-\infty}^{+\infty} |x[n]|^2 = \frac{1}{2\pi} \int_{-\pi}^{\pi} \left| X(e^{j\omega}) \right|^2 d\omega$$

meaning that the energy stays the same whether we evaluate it in the time domain (left) or in the
frequency domain (right).
The frequency response is even more useful than the impulse response for musical applications. If
you want to design a filter, for example, you can design its desired spectrum and transform it into
the time domain through the inverse DFT.14 The frequency response of two systems can be
compared to determine which one best suits your frequency specifications.
But there is one more use for the frequency response: you can use it to process a signal.
Processing a signal with an LTI system implies the convolution between the signal and the
system impulse response. This can be expensive in some cases. Computational savings can be
obtained by transforming both into the frequency domain, multiplying them, and transforming
them back into the time domain. This is possible thanks to the convolution property, reported in Section
2.4.3. Processing the signal through the LTI system in such a way may reduce the computational
cost, provided that our latency constraints allow us to buffer the input samples for the DFT. The
larger this buffer, the larger the computational savings. Why so? Keep on reading through the
next section.
Studio engineers and music technology practitioners often confuse the terms “frequency response”
and “magnitude frequency response.” The frequency response obtained by computing the DFT of the
impulse response is a complex-valued signal, not very informative for us humans. Computing its
magnitude provides the sort of curves we are more used to seeing for evaluating loudspeaker quality
or the spectral balance of a mix. However, these miss the phase information, which can be computed
as the argument of the frequency response.
$$X[k] = \sum_{n=0}^{N-1} x[n]\, W_N^{kn}, \quad k = 0, 1, \ldots, N-1 \qquad (2.37)$$

where $W_N = e^{-j\frac{2\pi}{N}}$.
For a sequence of length N, Equation 2.37 requires $N^2$ complex products and $N(N-1)$ complex sums.
Summing two complex numbers requires summing the respective real and imaginary parts, and thus it
requires two real sums. The product of two complex numbers $X_1 = a_1 + jb_1$, $X_2 = a_2 + jb_2$ instead
requires four real products, because it is expressed as $X_1 X_2 = a_1 a_2 - b_1 b_2 + j(a_1 b_2 + b_1 a_2)$.
The computational cost is thus of the order of $N^2$, because for increasing N, the cost of computing
the $N^2$ products prevails over the $N(N-1)$ sums.15
The computational cost of the DFT can be improved by exploiting some of its properties to
factor the number of operations. One famous algorithm for a fast Fourier transform (FFT)
was devised by Cooley and Tukey (1965). From that time, the FFT acronym has been widely
misused instead of DFT. It should be noted that the DFT is the transform, while the FFT is
just a fast implementation of the DFT, and there is no difference between the two but the
computational cost. The computational savings of the FFT are obtained by noting that it can
be written as:
$$X[k] = \sum_{m=0}^{N/2-1} x[2m]\, W_{N/2}^{km} + W_N^k \sum_{m=0}^{N/2-1} x[2m+1]\, W_{N/2}^{km} \qquad (2.38)$$
thus reducing the DFT to two smaller DFTs of length N/2. The coefficients computed by these two
DFTs are used as partial results for the computation of two of the final DFT bins. If N/2 is even,
each of these two DFTs can be split into two smaller ones. This procedure can be iterated. If N is
a power of 2, the procedure can be repeated up to a last iteration where we have several DFTs of
length 2, called butterflies. If, for example, $N = 2^M$, there are M stages, and for each stage only N/2
butterflies need be computed. Since each butterfly has one complex product and two complex sums,
we have in total $MN/2$ complex products and $MN$ complex sums, yielding an order of $N \log_2 N$
operations. To evaluate the advantage of the FFT, let us consider N = 1024 bins. The DFT requires
approximately $1024 \times 1024$ operations, while the FFT requires approximately $10 \times 1024$ operations, a saving of
100 times (i.e. two orders of magnitude)!
beginning and end, and unitary amplitude in the middle (see Figure 2.11b). Windowing consists
of multiplying the slice sample-wise with the window, thus applying a short fade-in and fade-out
to the signal to be analyzed. It can be shown that multiplying a signal with a window alters its
frequency content. Please note that even if you do not apply windowing explicitly, slicing
a signal is still equivalent to multiplying the original signal by a rectangular window
(Figure 2.11a). Thus, altering the signal is unavoidable in any case.16
The shape of the window determines its mathematical properties, and thus its effect on the
windowed signal. Indeed, windows are formulated in order to maximize certain criteria. In general,
there is no optimal window, and selection depends on the task.
To recap what we have discovered in this section:
• Any infinite-length discrete-time signal, or any finite-length signal that is too long to analyze in its entirety, can be windowed to obtain a finite-length signal, in order to apply the DFT. If the signal is non-stationary, the DFT will highlight the spectral information of the portion of the signal we have windowed.
• The operation of windowing alters the content of the signal.
• Low-pass filters (LPFs). These ideally cancel all content above a so-called cutoff frequency. In
reality, there is a transition band where the frequency components are progressively attenuated.
The steepness of the filter in the transition band is also called roll-off. The roll-off may end,
reaching a floor that is non-zero. This region, called stop-band, has a very high, but not infinite,
attenuation. Synthesizer filters usually have a roll-off of 12 or 24 dB/oct, and generally have
a small resonant bell at the cutoff frequency.
• High-pass filters (HPFs). These ideally cancel those frequency components below a cutoff frequency. They are thus considered the dual of low-pass filters. HPFs also have a roll-off and may have a resonance at the cutoff frequency.
Figure 2.12: Different types of filters: (a) low-pass, band-pass, and high-pass, with pass band and stop
band as indicated in the figure, and transition bands in between; (b) shelving and peaking filters; (c) notch
and band-stop filters; (d) feedforward comb; and (e) feedback comb.
• Band-pass filters (BPFs). These are filters that only select a certain frequency band by attenuat-
ing both the low- and the high-frequency content. They are defined by their bandwidth (or its
inverse, the quality factor, Q) and their central frequency. They have two transition bands and
two stop bands.
• Shelving filters. These are filters that cut or boost the content in the bass or treble range, and
leave the rest unmodified. They are used in equalizers to treat the two extremes of the audible
range. They are defined by the cutoff, the roll-off, and the gain.
• Peaking filters. These cut or boost a specific bandwidth, and leave the rest unmodified. They are
defined by the bandwidth (or the quality factor, Q), the central frequency, and the gain. They are
used in parametric equalizers.
• Comb filters. These filters apply a comb-like pattern that alters (cuts or boosts) the spectrum at
periodic frequency intervals. They can have feedback (boost) or feedforward (cut)
configurations.
Another class of filters, widely employed in DAWs, the so-called brickwall filters, are just low-pass
or high-pass filters with a very steep roll-off. They are used in audio track mastering applications to
remove any content above or below a certain cutoff frequency.
Most synthesizer filters are of the IIR type and have a gentle roll-off (Figure 2.12).
TIP: This is the 101 for virtual analog modeling of electronic circuits. Read carefully if you want
to get started on this topic.
Later it will be useful to obtain the discrete-time derivative of a signal. A bit of high school math
will refresh the concept of the derivative. The derivative of a curve is the slope of the curve at a given
point. A line of the form
$$y(x) = mx + q \qquad (2.39)$$

has slope m and offset q. A line never changes slope, so its derivative is $\dot{y}(x) = m$ for all values of
x. With the line, we compute the slope as:

$$m = \frac{\Delta y}{\Delta x} \qquad (2.40)$$

where the two quantities are the difference between any two points, $\Delta x = x_2 - x_1$ with $x_2 > x_1$, and
their y coordinates, $\Delta y = y(x_2) - y(x_1)$. A line never changes its derivative, and thus any two points
are good. For generic signals, the slope can change with time, and thus for each instant an
approximation of the derivative is obtained using Equation 2.40 and considering two very close
points. In the case of discrete-time signals, the choice of the points is pretty obvious:17 there are no
two closer points than two consecutive samples $y[n], y[n-1]$. The difference equation for the first-order
backward differentiator is thus expressed as:
$$y[n] = \frac{x[n] - x[n-1]}{\Delta x} \qquad (2.41)$$
The quantity $\Delta x$ is the time corresponding to one sample. When transferring a continuous-time
problem to the discrete-time domain, therefore, $\Delta x = T_s$. Otherwise, the term can be expressed in
terms of samples, thus $\Delta x = 1$. Many DSP books adopt this rule, while physical modeling texts and
numerical analysis books use the former to retain the relation between timescales in the two
domains.
The frequency response of the ideal differentiator is a ramp rising by 6 dB per octave, meaning that
for each doubling of the frequency there is a doubling of the output value. It also means that any
constant term (e.g. an offset in the signal) is canceled, because it lies at null frequency, as we would
expect from a differentiator, which calculates only the difference between pairs of values. It must be
said, for completeness, that the digital differentiator of Equation 2.41 slightly deviates at high
frequencies from the behavior of an ideal differentiator. Nonetheless, it is a widely used
approximation of the differentiation operator in the discrete-time domain. More sophisticated
differentiation schemes exist, but they bear additional complexity and issues, and thus they
have very limited application to music signal processing.
A very different case is that of digital integrators, where several approximations exist and are selected
depending on the use case. Let us first consider the integration operation. In the continuous-time domain,
the integral is the area under the curve corresponding to a signal. In analog electronics, this is done by
using an operational amplifier (op-amp) with a feedback capacitor, allowing you to accumulate the signal
amplitude over time. Similarly, in the discrete-time domain, it is sufficient to indefinitely accumulate the
value of the incoming signal. Similar to the digital differentiator, the digital integrator is just an
approximation of the ideal integrator. Two extremely simple forms exist, the forward and the backward
Euler, or rectangular, integrators, described by Equations 2.42 and 2.43:

$$y[n] = y[n-1] + T_s\, x[n-1] \qquad \text{(forward Euler)} \qquad (2.42)$$

$$y[n] = y[n-1] + T_s\, x[n] \qquad \text{(backward Euler)} \qquad (2.43)$$
The forward Euler integrator requires two memory elements but features otherwise similar
characteristics. If you want proof that the difference equations (Equations 2.42 and 2.43) implement
an integrator, consider this: integrating a curve, or a function, by definition, implies evaluating the
underlying area. Our input samples tell us the shape of the curve at successive discrete points. We
can take many rectangles that approximate the area between each two consecutive points of that
curve. Figure 2.13, for example, shows two rectangles approximating the area under the curve (black
line), of which we know three points. The area of these two rectangles slightly underestimates the
real value of the area under the curve. Intuitively, by fitting this curve with more rectangles (i.e.
reducing their width and increasing the number of points, that is, the sampling rate) the error
reduces.
A computer can thus calculate an approximation of the area by summing those rectangles, as
follows: the distance between two consecutive samples is the sampling interval $T_s$. This gives us the width of
each rectangle. We only have to decide the height of the rectangle. We can take either the earlier
sample x[n] or the later one x[n+1], the former for the forward Euler integrator and the latter for the backward integrator. The
integrator used in Figure 2.13 is the forward Euler, as the height of the rectangle is taken from x[n−1].
Figure 2.13: A discrete-time feedback integrator applied to a signal. The hatched area results from the
integration process, while the remaining white area under the signal (solid bold line) is the approximation
error. The shorter the time step, the lower the error.
To obtain the area in real time, as the new samples come in, we don’t have to sum all the rectangles
at each time step. We can just accumulate the area of one rectangle at a time and store the value in the
variable $y[n-1]$. At the next step, we will add the area of a new rectangle to the cumulative value
of all previous ones, and so forth.
Other integrators exist that have superior performance but higher computational cost. In this context,
we are not interested in developing the theory further, but the reader can refer to Lindquist (1989)
for further details.
$$s(t) = \sum_{n=-\infty}^{\infty} \delta(t - nT_s) \qquad (2.44)$$

where the time interval $T_s$ is the inverse of the sampling frequency (or sampling rate, or sample rate)
$F_s = \frac{1}{T_s}$. The operation is thus:
$$x_s(t) = x_c(t)\, s(t) = x_c(t) \sum_{n=-\infty}^{+\infty} \delta(t - nT_s) \qquad (2.45)$$

The result of this operation is a signal that is zero everywhere except at the instants $nT_s$. These
instantaneous values are stored, using a suitable method, in the vector $x[n] = x_c(nT_s)$.
Is there any particular choice for $F_s$? Yes, there is! Let us take a look at $x_s(t)$ in the frequency
domain. By evaluating the continuous-time Fourier transform of $s(t)$ and applying its properties, it
can be shown that the sampled signal $X_s(f)$ is periodic in the frequency domain, as shown in Figure
2.15. The product of $x_c(t)$ and $s(t)$ is equivalent to a convolution in the frequency domain (per the
convolution property of the Fourier transform). The spectrum of $x_c(t)$ is thus replicated along both the positive and
negative frequency axes, and the replicas are spaced at a distance equal to $F_s$. From this observation,
we can deduce that there are values of $F_s$ that are too low and will make the replicas overlap. Is this
a problem? Yes, it is! Not convinced? Let us examine what happens next when we convert the
discrete signal back into a continuous-time signal.
This process is called reconstruction of the continuous-time signal. It creates a continuous-time
signal by filling the gaps. First, a continuous time-domain signal $x_s(t)$ is obtained from $x[n]$ by
using the samples to weight a train of Dirac pulses:
$$x_s(t) = \sum_{n=-\infty}^{+\infty} x[n]\, \delta(t - nT_s) \qquad (2.46)$$
Figure 2.15: The first stage of the sampling process consists of multiplying $x(t)$ by $s(t)$. The product of two
signals in the time domain corresponds to the convolution of their Fourier transforms in the frequency
domain. The figure shows the two magnitude spectra and the result of the frequency-domain convolution,
consisting of replicas of the spectrum of $x(t)$.
At this point, the filtered signal is the reconstructed signal we were looking for, and will be exactly
the same as the input signal $x_c(t)$. Perfect reconstruction is possible if, and only if, the replicas
are at a distance $F_s$ that is large enough to allow the reconstruction filter to cancel all other replicas
and their tails. If the replicas do overlap, the filter will let parts of the replicas in, causing aliasing.
What we have reported so far is the essence of a pillar theorem, the Nyquist sampling theorem,
stating that the sampling rate must be chosen to be at least twice the bandwidth of the input signal
in order to uniquely determine $x_c(t)$ from the samples $x[n]$. If this condition is true, then the
sampling and reconstruction process do not affect the signal at all.
In formal terms, the Nyquist theorem18 states that if $x_c(t)$ is band-limited:

$$X_c(f) = 0 \quad \text{for } |f| > F_N$$
Table 2.3: DFT of notable signals

• Dirac delta: $x[n] = \delta[n]$, with $X[k] = 1$
• Cosine: $x[n] = \cos(2\pi f n)$, with $X[k] = \pi[\delta(k - 2\pi f) + \delta(k + 2\pi f)]$
• Rectangular pulse or window: $x[n] = 1$ for $-L < n < L$, $0$ elsewhere, with $X[k] = L\,\frac{\sin(kL)}{kL}$
• Pulse train: $x[n] = \sum_{m=-\infty}^{\infty} \delta(n - mT)$, with $X[k] = \sum_{m=-\infty}^{\infty} \delta\!\left(k - m\frac{2\pi}{T}\right)$

Note: The figures are only for didactic purposes. They do not consider the effect of sampling that replicates the shown spectra
with period $F_s$. Aliasing will occur for non-band-limited signals.
then $x_c(t)$ is uniquely determined by its samples $x[n] = x_c(nT_s)$, $n = 0, \pm 1, \pm 2, \ldots$, if, and only if:

$$F_s > 2F_N \qquad (2.48)$$

$F_N$ is also called the Nyquist frequency, and is considered the upper frequency limit for a signal
sampled at a given $F_s$.
If Equation 2.48 is not respected, spurious content is added to the original signal that is usually
perceived as distortion, or, properly, aliasing distortion.
A question may now arise: If we are not lucky and our signal is not band-limited, how do we deal
with it? The only answer is: we band-limit it!19 Any real-world signal that needs to be recorded and
sampled is first low-pass filtered with a so-called anti-aliasing filter, having cutoff at the Nyquist
frequency, thus eliminating any content above it.
A final topic related to filtering: we have seen that the reconstruction filter has its cutoff at the Nyquist
frequency, like the anti-aliasing filter, in order to guarantee perfect reconstruction.20 The ideal
reconstruction filter is a low-pass filter with cutoff frequency at the Nyquist frequency and vertical
slope; in other words, it would look like a rectangle in the frequency domain. From Table 2.3, we
observe that its inverse Fourier transform (i.e. the impulse response of this filter) is a symmetrical pulse
called sinc that goes to zero at infinity, and thus has infinite length. Such an infinite impulse response is
hard to obtain. In general, ideal low-pass filters can only be approximated (e.g. by truncating the
impulse response to a certain, finite, length). This makes the reconstruction process imperfect,
but trade-offs can be drawn to obtain excellent acoustic results.
where T is the period of the signal. The signal is divided by T to keep it in the range from −1 to 1,
and it is shifted by −1 to have zero mean.
The sawtooth spectrum has even and odd harmonics, and the amplitude of each harmonic is
dependent on its frequency with a 1-over-f relation (i.e. with a spectral roll-off of 6 dB/oct). Such
a rich spectrum is related to the abrupt jump from 1 to −1. It must be noted that a reversed ramp
(i.e. falling) has similar properties.
Triangle waves, on the other hand, are composed of rising and falling ramps, and thus they have no
discontinuity. The spectrum of a triangle wave is thus softer than that of a sawtooth wave. Its
mathematical description can be derived from that of the sawtooth, by applying an absolute value to
obtain the ascending and descending ramps:
The triangle wave has only odd harmonics, and its decay is steeper than that of the sawtooth (12
dB/oct).
The triangle wave has a discontinuity in its first derivative. This can be observed at the end of the
ramps: a descending ramp has a negative slope, and thus a negative derivative, while an ascending
ramp has a positive slope, and thus a positive derivative. What does the triangle derivative look like?
It looks like a square wave! Since the relation between a signal and its derivative, in the frequency
domain, is an increase proportional to the frequency, we can easily conclude that the square wave
has the same harmonics as the triangle (odd), but with a slower spectral decay (6 dB/oct). The
square wave thus sounds brighter than the triangle wave. If we look at the time domain, we can
justify the brighter timbre by observing that it has abrupt steps from 1 to −1 that were not present in
the triangle wave. The square waveform $q[n]$ can thus be described in terms of the triangle $r[n]$ as:

$$q[n] = \frac{dr[n]}{dn} = r[n] - r[n-1] \qquad (2.51)$$
where we approximated the time derivative with a backward difference as in Equation 2.41. At
this point, an important distinction must be made between the square and rectangular waveforms.
A square wave is a rectangular waveform with a 50% duty cycle. The duty cycle Δ is the ratio
between the time the wave is up and the period, $\Delta = T_{ON}/T$, as shown in Figure 2.16. This
alteration of the wave symmetry also affects the spectral content, and thus the timbre. Remember
from Table 2.3 that the DFT of a rectangular window has its zeros at frequencies that depend on
the length of the window. Considering that a square wave is a periodic rectangular window, it
can be seen as the convolution of a train of Dirac pulses with period T and a rectangular
window with $T_{ON}$ equal to half the period T. In the frequency domain, this is equal to the
product of the spectra of the window and the pulse train. It can then be shown that the square
wave has zeros at all even harmonics. However, if the duty cycle changes, the ratio between the
period and the $T_{ON}$ changes as well, shifting the zeros in frequency and allowing the even
harmonics to rise. In this case, the convolution of the Dirac train and the rectangular window will
result in lobes that may or may not kill harmonics, but will certainly affect the amplitude of the
harmonics, as seen in Figure 2.17. Going toward the limit, with the duty cycle getting close to zero,
the result is a train of Dirac pulses, and thus the frequency response gets closer to that of a Dirac
pulse train, with the first lobe reaching infinity, and thus all harmonics having the same amplitude.
This, however, does not occur in practice.

Figure 2.16: Pulse width modulation (PWM) consists of the alteration of the duty cycle (i.e. the ratio
between $T_{ON}$ and the period T).

Figure 2.17: A rectangular wave with duty cycle different from 50% has a spectrum that results from the
convolution (in time) or the product (in frequency) of the window (a sinc function, dashed lines) and a train
of Dirac pulses (producing the harmonics, solid lines).
It should be noted that practical oscillator designs often generate slight variations on the
theoretical waveform. An example is provided by the Minimoog Voyager sawtooth studied in
Pekonen et al. (2011), which is shown to be smoother than the ideal sawtooth wave. Many
oscillator designs differ from the ideal waveforms described in this section, and a good virtual analog
emulation should take this difference into consideration.
Aliasing is one of the most important issues in virtual synthesizers, together with computational cost,
and the two often conflict with each other: to reduce aliasing, you have to increase the
computational cost of the algorithm. The presence of aliasing affects the quality of the signal. Analog
synthesizers were not subject to aliasing at all. One notable exception, again, is the bucket-brigade
delay (BBD). BBD circuits are discrete-time systems, and thus subject to the Nyquist sampling theorem. As
such, they can generate aliasing. But any other analog gear had no issues of this kind. With the
advent of virtual analog in the 1990s, solutions had to be found to generate waveforms without
aliasing on the available signal processors of the time. Nowadays, x86 and ARM processors allow
a whole lot of flexibility for the developer and improved audio quality for the user.
Common oscillator waveforms (sawtooth, square, and triangle waves) are not band-limited: they
have a harmonic content that decays indefinitely, typically at a rate of 6 or 12 dB/oct, and thus extends
well above the 20 kHz limit (ask your dog). This does not pose problems in a recording setting.
Any analog-to-digital converter applies an anti-aliasing filter to suppress any possible component
above $F_N$, thus limiting the bandwidth of the signal. Any digital audio content, therefore, if recorded
and sampled properly, does not exhibit noticeable aliasing. Unfortunately, this is not the case when
such waveforms are generated digitally, because, as we said, generating a discrete-time signal employing a non-band-limited function is
equivalent to sampling that non-band-limited function (i.e. freezing its theoretical behavior at
discrete points in time, exactly as the sampling process does).
The outcome of an improper sampling process, or of the discretization of a non-band-limited
signal, is the leak of spurious content into the audible range, which is undesired (unless
you are producing dubstep, but that is another story). With periodic signals, such as a sawtooth
wave sweeping up in frequency, the visible and audible effect of aliasing is the mirroring of
harmonics approaching the Nyquist frequency and being reflected back into the $0$ to $F_N$ range. As
aliasing gets even more severe, other replicas of the spectrum get into the $0$ to $F_N$ range, and
the visible effect is the mirroring of (already mirrored) harmonics at the left bound of the
$0$ to $F_N$ range.
After aliasing has occurred, there is generally no way to repair it, since the proper spectrum and the
overlapping spectrum are embedded together.
Fortunately, the scientific literature is rich with methods to overcome aliasing in virtual synthesizers.
In Sections 8.2 and 8.3, a couple of methods are presented to deal with aliasing in oscillators and
waveshapers, while other methods are referenced in Section 11.1 for further reading.
Let us examine a signal in the frequency domain. In Section 2.7, we described the sampling process
and discovered that it necessarily generates aliases of the original spectrum, but these can be
separated by a low-pass filter. The necessary condition is that these are well separated. However, if
the original spectrum gets over the Nyquist frequency, the aliases start overlapping with each other,
as shown in Figure 2.18. Zooming in, and considering a periodic signal, as in Figure 2.19, with
equally spaced harmonics, you can see that after the reconstruction filter, the harmonics of the right alias get in the way. It looks as though the harmonics of the original signal are mirrored. The aliasing
components mirrored to the right come, in reality, from the right alias, while those mirrored at 0 Hz
come from the left alias.
Let us consider aliasing in the time domain and take a sinusoidal signal. If the frequency of the sine
is much lower than the sampling rate (Figure 2.20a), each period of the sine is described by a large
number of discrete samples. If the frequency of the sine gets as high as the Nyquist frequency
(Figure 2.20b), we have only two samples to describe a period, but that is enough (under ideal
circumstances). When, however, the frequency of the sine gets even higher, the sampling process is
“deceived.” This is clear after the reconstruction stage (i.e. after the reconstruction filter tries to fill
in the gaps between the samples). Intuitively, since the filter is a low-pass with cutoff at the Nyquist
Figure 2.18: Overlapping of aliases, occurring due to an insufficient sampling frequency for a given broadband signal. The ideal reconstruction filter is shown (dashed, from −FN to FN).
Figure 2.19: The effect of aliasing on the spectrum of a periodic signal. The ideal spectrum of a continuous-
time periodic signal is shown. The signal is not band-limited (i.e. a sawtooth). When the signal is sampled without band-limiting, the aliases get in the way of the original spectrum. The effect looks like a mirroring of the components above the Nyquist frequency (highlighted with a triangle) and a further mirroring at 0 Hz
(highlighted with a diamond), producing the zigzag (dashed line). Please note that this “mirroring” in reality
comes from the right and left aliases.
Figure 2.20: Effect of aliasing for sinusoidal signals. (a) A continuous-time sine of frequency FN/4 is shown to be sampled at periodic intervals, denoted by the vertical grid with values highlighted by the circles. (b) A sine at exactly FN is still reconstructed correctly with two points per period. (c) A continuous-time sine at frequency (7/4)FN is mistaken for a sine of frequency FN/4, showing the effect of aliasing. The dashed curve
shows the reconstruction that is done after sampling, very similar to the signal in (a). In other words, the
sine is “mirrored.”
frequency, the only thing it can do is fill the gaps (or join the dots) to describe a sine at a frequency
below Nyquist (Figure 2.20c). As you can see, when aliasing occurs, a signal fools the sampling process and is taken for another signal.
From its difference equation, it is clear that five coefficients should be computed (we are not covering here how) and four variables are required to store the previous two inputs and previous two outputs. The computational cost of this filter is five products and four sums. The difference equation can be translated into the signal-flow graph shown in Figure 2.21a. This is a realization of
the filter that is called direct form 1 (DF1) because it directly implements the difference equation
(Equation 2.52).
Without going into deep detail, we shall mention here that other implementations of the same
difference equation exist. The direct form 2 (DF2), for example, obtains the same difference
equation but saves two memory storage locations, as shown by Figure 2.21b. In other words, it is
equivalent but cheaper. Other realizations exist that obtain the same frequency response but have
other pros and cons regarding their numerical stability and quantization noise.
It should be clear to the reader that the same difference equation yields different kinds of
filters, depending on the coefficients’ design. As a consequence, you can develop code for
a generic second-order filter and change its response (low-pass, high-pass, etc.) or cutoff
frequency just by changing its coefficients. If we want to do this in C++, we can thus define
a class similar to the following:
Figure 2.21: Signal-flow graph of a biquad filter in its direct form 1 (DF1) realization (a) and in its direct
form 2 (DF2) realization (b).
class myFilter {
private:
float b0, b1, ...; // coefficients
float xMem1, xMem2, yMem1, ...; // memory elements
public:
void setCoefficients(int type, float cutoff);
float process(float in);
};
In this template class, we have defined a public function, setCoefficients, that can be called
to compute the coefficients according to some design strategy (e.g. by giving a type and a cutoff).
The coefficients cannot be directly modified by other classes. This protects them from unwanted
errors or bugs that may break the filter integrity (even small changes to the coefficients can cause
instability). The process function is called to process an input sample and returns an output
sample. An implementation of the DF1 SOS would be:
As you can see, the input and output values are stored in the first memory elements, while the first
memory elements are propagated to the second memory elements.
Please note that, usually, audio processing is done with buffers to reduce the overhead of calling the
process function once for each sample. A more common process function would thus take an input
buffer pointer, an output buffer pointer, and the length of the buffers:
void process(float *in, float *out, int length);
As we shall see, this is not the case with VCV Rack, which works sample by sample. The reasons behind this will be clarified in Section 4.1.1; working sample by sample greatly simplifies the task of coding.
2.11.1 Waveshaping
Waveshaping is the process of modifying the appearance of a wave in the time domain in order to
alter its spectral content. This is a nonlinear process, and as such it generates novel partials in the
signal, hence the timbre modification. The effect of the nonlinearity is typically assessed by using a pure sine as input and evaluating the number and level of new partials added to the output. The two most
important waveshaping effects are distortion effects and wavefolding, or foldback.
For foldback, we have an entire section that discusses its implementation in Rack (see Section 8.3).
We shall now spend a few words on the basics of waveshaping and distortion. When waveshaping is
defined as a static nonlinear function, the signal is supposed to enter the nonlinear function and read
the corresponding value. If x is the value of the input signal at a given time instant, the output of the
nonlinear function is just f(x). Figure 2.22 shows this with a generic example.
Figure 2.22: A static nonlinear function affecting an input signal. As the term “waveshaper” implies, the
output wave is modified in its shape by the nonlinear function. It should be clear to the reader that changing
the amplitude of the input signal has drastic effects. In this case, a gain factor that reduces the input signal
amplitude below the knees of f(x) will leave the output wave unaltered.
As you can see, the waveshaping (i.e. the distortion applied to the signal) simply follows the
application of the mapping from x to y. In Figure 2.23, we show how a sine wave is affected by
a rectifier. A rectifier is a circuit that computes the absolute value of the input, and it can be
implemented by diodes in the hardware realm. The input sine “enters” the nonlinear function, and
for each input value we compute the output value by visually inspecting the nonlinear function. As
an example, the input value x1 is mapped to the output value y1 and similarly for other selected
points.
Since the wave shape is very important in determining the signal timbre, any factor affecting the
shape is important. As an example, adding a small offset to the input signal, or inversely to the
nonlinear function, affects the timbre of the signal. Let us take the rectifier and add an offset. This
affects the input sine wave, as shown in Figure 2.24. The nonlinear function is now y = |x − a|. You
can easily show yourself, by sketching on paper, that the output is equivalent to adding an offset to
the sine input and using the previous nonlinearity y = |x|.
A static nonlinearity is said to be memoryless. A system that is composed of one or more static
nonlinear functions and one or more linear systems (filters) is said to be nonlinear and dynamical,
since the memory elements (in the discrete-time domain, the filter states) make the behavior of the
system dependent not only on the input, but also on the previous values (its history). We are not
going to deal with these systems due to the complexity of the topic. The user should know, however,
Figure 2.23: Rectification of a sine wave by computing the function y = |x| (absolute value).
Figure 2.24: Rectification of a sine wave by computing the function y = |x − a|. The same result is obtained
by offsetting the input waveform by a before feeding it to the nonlinearity.
that there are several ways to model these systems. One approach is to use a series of functions (e.g.
Volterra series) (Schetzen, 1980).21 Another one is to use the so-called Hammerstein, Wiener, or
Hammerstein-Wiener models. These models are composed of the cascade of a linear filter and
a nonlinear function (Wiener model), or a nonlinear function and a linear filter (Hammerstein
model), or the cascade of a filter, a nonlinearity, and another filter (Hammerstein-Wiener model)
(Narendra and Gallman, 1966; Wiener, 1942; Oliver, 2001). Clearly, the position of the linear filters
is crucial when there is a nonlinearity in the cascade. Two linear filters can be put in any order
thanks to the commutative property of linear systems, but when there is a nonlinear component in
the cascade this does not hold true anymore. There are many techniques to estimate the parameters
of both the linear parts and the nonlinear function, in order to match the behavior of a desired
nonlinear system (Wills et al., 2013), which, however, requires some expertise in the field.
Back to our static nonlinearities, let us discuss distortion. Distortion implies saturating a signal when
it gets close to some threshold value. The simplest way to do this is clipping the signal to the
threshold value when this is reached or surpassed. The clipping function is thus defined as:
f(x) = τ,   if x ≥ τ
f(x) = −τ,  if x ≤ −τ    (2.53)
f(x) = x,   elsewhere
where τ is a threshold value. You surely have heard of clipping in a digital signal path, such as the
one in a digital audio workstation (DAW). Undesired clipping happens in a DAW (or in any other
digital path) when the audio level is so high that the digits are insufficient to represent values over a certain threshold. This happens easily with integer representations, while floating-point
representations are much less prone to this issue.22 Clipping has a very harsh sound, and most of the time it is undesired. Nicer forms of distortion are still S-shaped, but with smooth corners. An
example is a soft saturation nonlinearity of the form:
f(x) = tanh(x)    (2.54)
Other distortion functions may be described by polynomials such as the sum of the terms:
f(x) = x + a2 x² + a3 x³ + … = Σ_{p=1}^{N} ap x^p,    (2.56)
where a1 = 1. This equation generates a signal with a second harmonic of amplitude a2, a third harmonic of amplitude a3, and so forth, up to the Nth harmonic. It is important to note that even
harmonics and odd harmonics sound radically different to the human ear. The presence of even
harmonics is generally considered to add warmth to the sound. A nonlinear function that is even
only introduces even harmonics:
f(x) = f(−x)    (2.57)
There are a number of well-known distortion functions, and many are described by functions that are
hard to compute in real time. In such cases, lookup tables (LUTs) may be employed, which help to
reduce the cost. This technique is described in Section 10.2.1.
An important parameter to modify the timbre in distortion effects is the input gain, or drive. This gain
greatly affects the timbre. Consider the clipping function of Equation 2.53, shown in Figure 2.25a. When
the input level is low, the output is exactly the same as the input. However, by raising the input gain, the
signal reaches the threshold and gets clipped, resulting in the clipping distortion. Other distortion functions
Figure 2.25: Several distortion curves. The clipping function of Equation 2.53 (a), the hyperbolic tangent
function (b), and the saturation of Equation 2.55 (c).
behave similarly, with just a smoother introduction of the distortion effect. The smaller the signal, the
more linear the behavior (indeed, the function in Figure 2.25b is almost a line in the surroundings of the
origin). Again, if we put the drive gain after the nonlinear function, its effect is a mere amplification and it
does not affect the timbre of the signal that comes out of the nonlinear function.
Such a process is also called heterodyning.24 The result in the frequency domain is the shift of the
original spectrum, considering the positive and negative frequencies, as shown in Figure 2.26. It can
be easily noticed that the modulation of the carrier wave with itself yields a signal at double the frequency (fc + fc) and a signal at DC (fc − fc). This trick has been used by several effect units to
generate an octaver effect. If the input signal, however, is not a pure sine, many more partials will
be generated.
The conventional AM differs, as mentioned above, in the presence of the carrier. This is introduced as follows:

y(t) = Ac [1 + β m(t)] cos(2πfc t),
where, besides the carrier, we introduced the modulation index β to weight the modulating signal
differently than the carrier. The result is shown in Figure 2.27.
It must be noted that DSB-SC AM is better suited to musical applications as the presence of the
carrier signal may not be aesthetically pleasing. From now on, when referring to amplitude
modulation, we shall always refer to DSB-SC AM.
Figure 2.26: The continuous-time Fourier transforms of the modulating and carrier signals, and their
product, producing a DSB-SC amplitude modulation, also called ring modulation.
Figure 2.27: Fourier transform of the conventional AM scheme. In this case, the modulating signal is
depicted as a band-pass signal to emphasize the presence of the carrier in the output spectrum.
In subtractive synthesis, it is common to shape the temporal envelope of a signal with an envelope
produced by an envelope generator. This is obtained by a VCA module. In mathematical terms, the
VCA is a simple mixer25 as in ring modulation, with the notable difference that envelope signals
should never be negative (or should be clipped to zero when they go below zero). When, instead of
an EG, an LFO with sub-20 Hz frequency is used, the result is a tremolo effect. In this case, the
processing does not give rise to new frequency components. The modulating signal is slowly time-varying and can be considered locally26 linear.
The first studies of FM in sound synthesis were conducted by John Chowning, composer and professor
at Stanford University. In the 1960s, he started studying frequency modulation for sound synthesis
with the rationale that such a simple modulation process between signals in the audible range produces complex and inharmonic27 timbres, useful to emulate metallic tones such as those of bells,
brasses, and so on (Chowning, 1973). He was a percussionist, and was indeed interested in the
applications of this technique to his compositions. Among these, Stria (1977), commissioned by
IRCAM, Paris, is one of the most recognized. At the time, FM synthesis was implemented in digital
computers.28
In the 1970s, Japanese company Yamaha visited Chowning at Stanford University and started
developing the technology for a digital synthesizer based on FM synthesis. The technology was
patented by Chowning in 1974 and rights were sold to Yamaha. In the 1980s, the technology was
mature enough, and in 1983 the company started selling the most successful FM synthesizer in
history, the DX7, with more than 200,000 units sold until production ceased in 1989. Many
competing companies wanted to produce an FM synthesizer at the time, but the technology was
patented. However, it can be shown that the phase of a signal can be modulated, obtaining similar
effects. Phase modulation (PM) was thus used by other companies, leading to identical results, as we
shall see later.
where k is a constant that determines the amount of frequency skew. At this point, it is important to
note that there is a strong relationship between the instantaneous phase and frequency of a periodic
signal. Thus, Equation 2.61 can be rewritten as:
fi(t) = fc + (1/2π) · dφ(t)/dt    (2.62)

where φ(t) is the instantaneous phase of the signal. We can thus consider changing the phase of the
cosine, instead of its frequency, to obtain the same effect given by Equation 2.61, leading to the
following equation:
y(t) = Ac cos(2πfc t + φ(t)),    (2.63)

where the phase term φ(t) is a modulating signal. If it assumes a constant value, the carrier wave is
a simple cosine wave and there is no modulation. However, if the phase term is time-varying, we
obtain phase modulation (PM). It is thus very important to note that FM and PM are essentially
the same thing, and from Equation 2.62 we can state that frequency modulation is the same as
a phase modulation where the modulating signal has been first integrated. Similarly, phase
modulation is equivalent to frequency modulation where the input signal has been first
differentiated.
Let us now consider the outcome of frequency or phase modulation. For simplicity, we shall use the
phase modulation notation. Let the modulating signal be m(t) = sin(2πfm t). The result is a phase modulation of the form:

y(t) = Ac cos(2πfc t + β sin(2πfm t))    (2.64)

Considering that the sine is the integral of a cosine, the result of Equation 2.64 is also the same as
a frequency modulation with cosð2πfm tÞ as the modulating signal.
In the frequency domain, it is not very easy to determine how the spectrum will look. The three
impacting factors are fc, fm, and β. When the modulation index is low (i.e. β ≪ 1), the result is approximately:

y(t) ≈ Ac cos(2πfc t) − Ac β sin(2πfm t) sin(2πfc t),    (2.65)

which states, in other words, that the result is similar to an amplitude modulation with the presence
of the carrier (cosine) and the amplitude modulation between the carrier (sine) and the modulating
signal. This is called narrowband FM. With increasing modulation index, the bandwidth increases,
as does the complexity of the timbre. One rule of thumb is Carson’s rule, which states that
approximately 98% of the total energy is spread in the band:
Bc = 2(β + 1) W,    (2.66)
where W is the bandwidth of the modulating signal. This rule is useful for broadcasting, to
determine the channels’ spacing, but does not say much about the actual spectrum of the resulting
signal, an important parameter to determine its sonic result. Chowning studied the relations between
the three factors above, and particularly the ratio fc/fm, showing that if the ratio is a rational number,29 the spectrum is harmonic and the fundamental frequency can be computed as f0 = fc/N1 = fm/N2, where N1/N2 is the ratio fc/fm reduced to lowest terms.
Equation 2.63 describes what is generally called an operator,30 and may or may not have an input
mðtÞ. Operators can be summed, modulated (one operator acts as input to a second operator), or fed
back (one operator feeds its output back to its input), adding complexity to the sound. Furthermore,
operators may not always be sinusoidal. For this reason, sound design is not easy on such
a synthesizer. The celebrated DX7 timbres, for instance, were obtained in a trial-and-error fashion,
and FM synthesizers are not meant for timbre manipulation during a performance as subtractive
synthesizers are.
To conclude the section, it must be noted that there are two popular forms of frequency modulation.
The one described up to this point is dubbed linear FM. Many traditional synthesizers, however,
exploit exponential FM as a side effect (it comes for free) of their oscillator design. Let us consider
a voltage-controlled oscillator (VCO) that uses the V/oct paradigm. Its pitch is determined as:

f0 = fref · 2^vi    (2.67)

where the reference frequency fref may be the frequency of a given base note, and the input voltage vi is the CV coming from the keyboard. If the keyboard sends, for example, a voltage value of 2 V and fref = 440 Hz, then the obtained tone has f0 = 1760 Hz, two octaves higher than the reference frequency. If, however, we connect a sinusoidal oscillator to the input of the VCO, we
modulate the frequency of the VCO with an exponential behavior. The modulation index can be
implemented as a gain at the input of the VCO. Please note that if the input signal is a low-
frequency signal (e.g. an LFO signal), the result is a vibrato effect.
Exponential FM has some drawbacks with respect to linear FM, such as dependence of the pitch on
the modulation index, and is not really used for the creation of synthesis engines. It is rather left as
a modulation option on subtractive synthesizers. Indeed, exponential FM is used mostly with LFOs. Being slowly time-varying, frequency modulation with a sub-20 Hz signal can be considered locally linear,31 and there are no novel frequency components in the output, but rather a vibrato effect.
TIP: Random signals are key elements in modular synthesizers: white noise is a random signal,
and is widely used in sound design practice. Generative music is also an important topic and
relies on random, probabilistic, pseudo-periodic generation of note events. In this section, we
shall introduce some intuitive theory related to random signals, avoiding probability and statistical theoretical frameworks, fascinating but complex.
noise are important as well, especially for those kinds of noise that do not have equal amplitude
over the frequency range.
Back to the theory: if we consider the evolution of the random value in time, we obtain a random
signal. With analog circuits, there are tons of different stochastic processes that we can observe to
extract a random signal. In a computing environment, however, we cannot toss coins, and it is not
convenient at all to dedicate an analog-to-digital converter to sample random noise just to generate
a random signal. We have specific algorithms instead that we can run on a computer, each one with
its own statistical properties, and each following a different distribution.
The distribution of a process is, roughly speaking, the number of times each value appears if we
make the process last for an infinite amount of time. Each value thus has a frequency of occurrence
(not to be confused with the temporal or angular frequency discussed in Section 2.4). In this context,
the frequency of occurrence is interpreted as the number of occurrences of that value in a time
frame.
Let us consider an 8-bit unsigned integer variable (unsigned char). This takes values from 0 to 255.
Let us create a sequence of ten values generated by a random number generator that I will disclose
later. We can visualize the distribution of the signal by counting how many times each of the 256
values appears. This is called a histogram. The ten-value sequence and its histogram are shown in
Figure 2.28a. As you can see, the values span the whole range 0–255, with unpredictable changes.
Since we have a very small number of generated values, most values have zero occurrences, some
can count one occurrence, and one value has two occurrences, but this is just by chance. If we
repeat the generation of numbers, we get a totally different picture (Figure 2.28b). These are, in
technical terms, two realizations of the same stochastic process. Even though both sequences and
their histograms are not similar at all, they have been generated using exactly the same
algorithm. Why do they look different? Because we did not perform the experiment properly. We
need to perform it on a larger run. Figure 2.29 shows two different runs of the same experiment
with 10,000 generated values. As you can see, in both runs the values tend to be equally frequent, with random fluctuations from value to value. As the number of generated values increases, these fluctuations in the histogram get smaller and smaller. With 1 million
generated values, this gets clearer (Figure 2.30), and with the number of samples approaching
infinity the distribution gets totally flat.
What we have investigated so far is the distribution of the values in the range 0–255. As we have
seen, this distribution tends to be uniform (i.e. all values have the same probability of occurring).
The algorithm used to generate this distribution is an algorithm meant to specifically generate
a uniform distribution. Such an algorithm can be thought of as an extraction of a lottery number
from a raffle box. At each extraction, the number is put back into the box. The numbers obtained by
such an algorithm follow the uniform distribution, since at any time every number has the same
chance of being extracted.
Another notable distribution is the so-called normal distribution or Gaussian distribution. The
loudspeaker samples mentioned above may follow such a distribution, with all loudspeakers having
a slight deviation from the expected impedance value (the desired one). The statistical properties of
such a population are studied in terms of average μ (or mean) and standard deviation σ. If we model
the impedance as a statistic variable following a normal distribution, we can describe its behavior in
terms of mean and standard deviation and visualize the distribution as a bell-like function (a
Gaussian) that has its peak at the mean value and has width dependent on the standard deviation.
Figure 2.28: Two different sequences of random numbers generated from the same algorithm (top) and
their histograms (bottom).
The standard deviation is interpreted as how much the samples deviate from the mean value. If the
loudspeakers of the previous example are manufactured properly, the mean is very close to the
desired impedance value and the standard deviation is quite small, meaning that very few are far
from the desired value. Another way to express the width of the bell is the variance σ², which is the square of the standard deviation. To calculate the mean and standard deviation from a finite population (or,
equally, from discrete samples of a signal), you can use the following formulae:
Figure 2.29: Histograms of two random sequences of 10,000 numbers each. Although different, all values
tend to have similar occurrences.
Figure 2.30: Histogram of a sequence of 1 million random numbers. All values tend to have the same
occurrences.
μ = (1/N) Σ_{i=1}^{N} xi    (2.68)

σ = √[ (1/N) Σ_{i=1}^{N} (xi − μ)² ]    (2.69)
Figure 2.31: Comparison of two normal distributions. The mean values are indicated by the diamonds, and are, respectively, 248 and 255. The broader curve has larger variance, while the taller curve has lower variance. If these populations come from two loudspeaker production lines and the desired impedance is 250 Ω,
which one of the two would you choose?
Figure 2.32: Power spectral density of white noise (a) and pink noise (b). The latter exhibits a −6 dB/oct (or −20 dB/decade) spectral decay.
frequency. In other words, it decays in frequency by −20 dB/decade or −6 dB/octave. For this reason,
it is darker than white noise and more usable for musical application. Since the slope imposed by
integration of a signal is −6 dB/octave, it can also be obtained from white noise by integration.
Having provided a short, intuitive description of random signals, a few words should be dedicated to
discrete-time random number generation algorithms. Without the intent of being exhaustive, we shall
just introduce a couple of key concepts of random number generators. While natural processes such
as thermal noise are totally unpredictable, random number generators are, as any other software
algorithm, made by sequences of machine instructions. As such, these algorithms are deterministic
(i.e. algorithms that always produce the same outcome given the same initial conditions). This makes
them somewhat predictable. For this reason, they are said to be pseudorandom, as they appear
random on a local scale, but they are generated by a deterministic algorithm.
Thinking in musical terms, we can produce white noise by triggering a white noise sample.
However, this sample is finite, and requires looping to make it last longer. Further repetitions of the
sample will be predictable, as we can store the first playback and compare the new incoming values to the ones we stored, predicting how the next samples will look. The periodicity of the noise sample
can be spotted by the human ear if the period is not too long.
There are better ways to produce random numbers that are based on complex algorithms; however,
these algorithms are subject to the same problem of periodicity, with the added value that they do not require memory to store samples, and thus their period can be extremely long (possibly longer than your life) without requiring storage resources. Most of these algorithms are based on simple arithmetic
and bit operations, making them computationally cheap and quick. The C code snippet below
calculates a random 0/1 value at each iteration and can serve as a taster. It combines individual bits of the 16-bit variable rval, forming a linear feedback shift register. This is a very simple example, and thus has a low periodicity. Indeed, if you send it to your sound card, you will spot the periodicity, especially at high sampling rates, because the period gets shorter.
uint16_t rval = 0xACE1; // seed: any non-zero value
for (i = 0; ; i++) {
  fout = (rval >> 0) ^ (rval >> 1);    // XOR taps at bits 0 and 1
  fout ^= (rval >> 11) ^ (rval >> 13); // XOR taps at bits 11 and 13
  fout &= 1;
  rval <<= 1;
  rval |= fout; // create a feedback loop into bit 0
  // use the random value fout for your scopes
}
Random number generators are initialized by a random seed (i.e. an initial value – often the
current timestamp – that changes from time to time), ensuring the algorithm is not starting every
time with the same conditions. If we start the algorithm twice with the same seed, the output will
be exactly the same. In general, this is more important for cryptography than for our musical generation of noise, but keep in mind, especially during debugging, that random sequences will stay the same if we use the same seed each time, making an experiment based on random values reproducible.
sum of two floating-point operands takes three cycles. Similarly, the product of two integer
operands takes three cycles, while with floating-point operands it takes five. The division between
two integer operands requires at least 23 cycles, while for floating-point operands only 14 to 16
are required. The division instruction also calculates the remainder (modulo). As you can see, divisions take roughly an order of magnitude longer. For this reason, divisions should be reduced
as much as possible, especially in those parts of the code that are executed more often (e.g. at
each sample). Sometimes divisions can be precalculated and the result can be employed for the
execution of the algorithm. This is usually done automatically by the C++ compiler if
optimizations are enabled. For instance, to reduce the amplitude of a signal by a factor of 10, both
the following are possible:
If compiler optimizations are enabled, the compiler automatically switches from the first to
the second in order to avoid division. However, when both operands can change at runtime, some
strategies are still viable. Here is an example: if the amplitude of a signal should be reduced by
a user-selectable factor, we can call a function only when the user inputs a new value:
normalizationFactor = 1 / userFactor;
and then use normalizationFactor in the code executed for each sample:
out = in * normalizationFactor;
This will save resources because the division is done only once, while a product is done during
normal execution of the signal processing routine.
Another issue with division is the possibility of dividing by zero. In floating point, dividing a non-zero
value by zero yields infinity, while the indeterminate form 0/0 yields the special value “not a number”
(NaN). Both must be avoided and filtered away at any cost: division by zero should be prevented by
proper coding techniques, such as adding a small non-null number to the denominator of
a division. If you develop a filter that may in some circumstances produce NaNs, you should add
a check on its output that resets the filter states and clears these values from the output.
Calculating the modulo is less expensive when the second operand is a power of 2. In this case, bit
operations can be used, requiring fewer clock cycles. For instance, to reset an increasing variable
when it reaches a threshold value that is a power of 2 (say 16), it is sufficient to write:
var++;
var = var & 0xF;
where “0x” is used to denote the hexadecimal notation. 0xF is decimal 15. When var is, say, 10
(0xA, or in binary format 1010), a binary AND with 0xF (binary 1111) will give back number 10.
When var is 16 (0x10, binary 10000), the binary AND will give 0.
As for double- and single-precision execution times: in general, double-precision
instructions execute more slowly than single-precision floating-point instructions, but on x86 processors
the difference is not very large. To obtain maximum speedup, in any case, it is useful to specify the
format of a number to the compiler so that unnecessary operations do not take place. In the
following line, for example:
float var1, var2;
var2 = 2.1 * var1;
the literal 2.1 will be understood by the compiler as a double-precision number. The variable var1
will thus be converted to double precision, the product will be done as a double-precision product,
and finally the result will be converted to single precision for the assignment to var2. A lot of
unnecessary steps! The solution is to simply state that the literal should be a single-precision
floating-point value (unless the other operand is a double and you know that the operation should
have a double precision):
var2 = 2.1f * var1;
The “f” suffix will do that. If you are curious about how compilers transform your code, you can
perform comparisons using online tools called compiler explorers or interactive compilers.
These are handy tools to see how your code will likely be compiled, whether you are simply
curious or an expert coder looking for extreme optimizations.
To conclude, let us briefly talk about conditional statements. The use of conditional clauses
(if-else, the ‘?’ operator) is not recommended in those parts of the code that the compiler can highly
optimize, such as for loops. When no conditional statements are present in a loop, the
compiler can unroll the loop and parallelize it so that it executes in a batch fashion. This is
possible because the compiler knows that the code will always execute in the same way. However, if
a conditional statement is present, the compiler cannot know beforehand how the code will execute,
whether it will jump and where, and thus the code will not be optimized. In DSP applications, the main
reason for writing for loops is that audio is processed in buffers. As we shall see later, in
VCV Rack, the audio is processed sample by sample, so loops are needed less often.
• The FFT is just an algorithm to compute the DFT, although the two terms are used interchangeably
in audio processing parlance.
• The sampling theorem is discussed, together with its implications on aliasing and audio quality.
• Ideal periodic signals generated by typical synthesizer oscillators have been discussed.
• Random signals have been introduced with some practical examples. These signals are dealt
with differently than other common signals. White noise is an example of a random signal.
• The frequency of occurrence of a random value should not be confused with the angular
frequency of a signal.
• Only pseudorandom signals can be generated by computers, but they are sufficient for our
applications.
• A few useful tips related to math and C++ are reported.
Notes
1 As a student, it bothered me that a positive time shift of T is obtained by subtracting T, but you can easily realize why it
is so.
2 Sometimes it is fun to draw connections between signal processing and other scientific fields. Interesting relations
between time and frequency exist that can only be explained with quantum mechanics, and carry to a revised version of
Heisenberg’s uncertainty principle, where time and frequency are complementary variables (Gabor, 1946).
3 Nowadays, even inexpensive 8-bit microcontrollers such as Atmel 8-bit AVR, used in Arduino boards, have multiply
instructions.
4 Spoiler alert: We shall see later that the whole spectrum that we can describe in a discrete-time setting runs from 0 to 2π,
and the interesting part of it goes from 0 to π, the upper half being a replica of the lower half.
5 Another tip about scientific intersections: signal processing is a branch of engineering that draws a lot of theory from
mathematics and geometry. Transforms are also studied by functional analysis and geometry, because they can be seen in
a more abstract way as means to reshape signals, vectors, and manifolds by taking them from a space into another. In
signal processing, we are interested in finding good transforms for practical applications or topologies that allow us to
determine distances between signals and find how similar they are, but we heavily rely on the genius of mathematicians
who are able to treat these high-level concepts without even needing practical examples to visualize them!
6 Different from the transforms that we shall deal with, the human ear maps the input signal into a two-dimensional space
with time and frequency as independent variables, where the frequency is approximately logarithmically spaced.
Furthermore, the ear has a nonlinear behavior. Psychoacoustic studies give us very complex models of the transform implied by
the human ear. Here, we are interested in a simpler and more elegant representation that is guaranteed by the Fourier
series and the Fourier transform.
7 We shall see that there are slightly different interpretations of the frequency domain, depending on the transform that is
used or whether the input signal is a continuous- or discrete-time one.
8 Following the convention of other textbooks, we shall highlight that a signal is periodic by using the tilde, if this property
is relevant for our discussion.
9 Partials may also be inharmonic, as we shall see with non-periodic signals.
10 Engineering textbooks usually employ the letter j for the imaginary part, while high school textbooks usually employ the
letter i.
11 The negative frequency components are redundant for us humans, but not for the machine. In general, we cannot just
throw these coefficients away.
12 Non-stationary signals cannot be described in the framework of the Fourier transform, but require mixed time-frequency
representations such as the short-time Fourier transform (STFT), the wavelet transform, and so on. Historically, one of the
first comments regarding the limitations of the continuous-time Fourier transform can be found in Gabor (1946) or Gabor
(1947).
13 For the sake of completeness: the DFT also accepts complex signals, but audio signals are always real.
14 Do not try this at home! Read a book on FIR filter design by windowing first.
15 The exact number is 4N² real products and N(4N − 2) real sums.
16 Unless the original signal is zero everywhere outside the taken slice.
17 Although different approximation schemes exist.
18 It is very interesting to take a look at the original paper from Nyquist (1928), where the scientist developed his theory to
find the upper speed limit for transmitting pulses in telegraph lines. These were all but digital. Shannon (1949) provided
a proof for the Nyquist theorem, integrating it in his broader information theory, later leading to a revolution in digital
communications.
19 More on this in the next section.
20 It actually is the same as the anti-aliasing filter. Symmetries are very common in math.
21 I cannot avoid mentioning the famous Italian mathematician Vito Volterra. He was born in my region, but his fortune lay
in having escaped this land of farmers and ill-minded synth manufacturers early in his childhood.
22 The maximum value obtained with a signed integer is 2^(N−1) − 1. This is 32,767 for N = 16 bits, or around 8.3 million for
N = 24 bits. Floating-point numbers can reach up to 3.4 × 10^38.
23 Not to be confused with an audio mixer (i.e. a mixing console). In electronic circuits and communication technology,
a mixer is just a multiplier.
24 Incidentally, it should be mentioned that the Theremin exploits the same principle to control the pitch of the tone.
25 Again, here, mixer is not intended as an audio mixing console.
26 The modulating signal can be considered constant for a certain time frame.
27 Inharmonic means that the partials of the sound are not integer multiples of the fundamental, and are thus not harmonic.
28 The original research by Chowning was conducted by implementing FM as a dedicated software on a PDP-1 computer
and as a MUSIC V program.
29 The two frequencies are integers.
30 Actually, Equation 2.63 describes a PM operator. An FM operator requires an integral term, resulting in a somewhat
more complex equation: y(t) = Ac cos(2πfc t + 2πk ∫ m(τ)dτ). Yamaha operators also feature an EG and a digitally controlled
amplifier, but for simplicity we shall not discuss this.
31 The modulating signal is approximately constant for a certain time frame.
32 Similar considerations can be done for a continuous-time random signal, but dealing with discrete-time signals is easier.
33 Unless we are at absolute zero.
34 The term “white” comes from optics, where white light was soon identified as a radiation with equal energy in all the
visible frequency spectrum.
CHAPTER 3
VCV Rack Basics
This chapter provides an overview of VCV Rack. It is not meant as a tutorial on synthesis with Rack, nor
as a guide about all available modules. There are hundreds of modules and dozens of developers,
making it hardly possible to track them all. Furthermore, the development community is very
lively and active, and the modules are often updated or redesigned. The chapter just provides a common
ground for the rest of the book regarding the platform, the user interface, and how a user expects to
interact with it. It shall therefore introduce the terminology and get you started as a regular user, with the
aim of helping you design your own modules with an informed view on the Rack user experience.
So, what is VCV Rack exactly? VCV Rack is a standalone software, meant to emulate a modular
Eurorack system. It can also host VST instruments employing a paid module called VCV Host. In
future releases, a paid VST/AU plugin version of Rack will be available, to be loaded into a DAW.
According to VCV founder Andrew Belt, however, the standalone version will always be free of
charge. At the moment, Rack can be connected to a DAW using a module called Bridge; however,
its support will be discontinued as soon as the VST/AU plugin version is out. To record audio in
VCV, the simplest option is to use the VCV Recorder module.
At the first launch, Rack opens up showing a template patch, a simple synthesizer, where modules
can be added, removed, and connected. Patches can be saved and loaded, allowing reuse, exchange,
and quick switch for performances.
Modules are available from the plugin store. In addition, they can be loaded from a local folder.
This is the case for modules that you are developing and are not yet public but you need to test on
your machine. Modules can be divided into a few families:
• Core modules. Built-in modules that are installed together with Rack, for audio and MIDI
connectivity.
• Fundamental modules. Basic building blocks provided by VCV for free. These include
oscillators, filters, amplifiers, and utilities.
• VCV modules. Advanced modules from VCV. These are not free; their price sustains the VCV
Rack project.
• Third-party modules. These are developed independently by the community and are available
under the online plugin store. Some of them are free, some of them are paid, some are
open-source, and some are not. If you are reading this book, you are probably willing to build
new third-party plugins. Some developers provide both a commercial and a smaller free package
(e.g. see Vult Premium and Free plugins, which provide exceptional virtual analog modelling of
real hardware circuits).
• Eurorack replicas. Some modules available on the store are authorized replicas of existing
Eurorack modules. These are free and include, among others, Audible Instrument, Befaco,
Synthesis Technology (E-series), and Grayscale modules.
VCV Rack Basics 75
Figure 3.1: The Rack GUI shows modules mounted on the rack rails. When launching the software for the
first time, this template patch shows up, implementing a basic monophonic synthesizer.
Figure 3.2: Anatomy of a module, showing the module name, the switches, the lights, the input/output
connectors, and the context menu (right-click) with its options.
by one author. Of course, an author can develop different plugins, each containing one or more
modules.
Plugins may also contain expander modules. These are modules that expand the functionalities of other
modules. They may, for example, add ports and knobs to provide additional controls to a parent
module. The parent will thus have reduced size, sacrificing some functionalities that can be restored by
adding an expander. An expander module must be “connected” to the parent module by placing the two
side by side. The expander should go to the right of the parent. This feature mimics expander modules
available in the Eurorack world, which are connected by flat cables on the back of the panel.
3.4.2 MIDI-CV
This module provides MIDI input to your modular patch. It provides 12 different outputs that translate
MIDI functionalities into CV. The module can be monophonic or polyphonic. The top three rows allow
you to select a MIDI source by choosing the driver, the device, and the channel. Available drivers
include the VCV Bridge, the computer keyboard (yes!), and a gamepad input (yes, yes!). Using the
Bridge allows you to get MIDI input from a DAW (e.g. for sequencing events). The computer keyboard
allows you to quickly play a patch even if you have no real MIDI device with you. The driver is selected
by right-clicking the top row of the module, then you can select the device by right-clicking the second
row, and finally the MIDI channel by right-clicking the third row. The available output connectors are:
3.4.3 MIDI-CC
This module translates MIDI Control Change (CC) messages into virtual voltages. Up to 16 CCs can
be converted into the assignable outputs. Each connector is initially assigned to a CC from 0 to 15,
as denoted by the 16-number box on the module panel. To assign a different CC to any of the
connectors, click the related number (you will see the string “LRN,” standing for “learn,” in place of
the number) and instruct the module by sending a CC message to VCV (e.g. by twisting a knob) or
by typing the CC number on the computer keyboard. Please note that the MIDI learn functionality can
work only if you have previously selected the correct MIDI driver, device, and channel from the
three upper rows of the module panel.
The outputs send a 0–10 V voltage corresponding to the CC data 0–127. As an exception, some
gamepad drivers can generate MIDI values from −128 to 127 that are translated to a voltage from
−10 to 10 V.
3.4.4 MIDI-GATE
This utility module is used to send gate signals corresponding to specific notes (e.g. to employ
a MIDI controller to trigger events using keys or percussive pads). When connected to MIDI sequencers
that send Note On and Note Off messages in a row, a 1 ms trigger voltage is produced; otherwise,
10 V gate signals are produced in the interval between Note On and Note Off messages.
If you want the gate signal to have amplitude corresponding to the Note On velocity value, you can
tick the “Velocity” label in the right-click context menu.
A Panic option is available in this module too, to reset the MIDI status.
3.4.5 MIDI-MAP
This module allows you to map a MIDI-CC message to parameters of any module in the rack. When
the mapping is done, a full sweep of the CC from 0 to 127 allows the parameter to go from its
minimum to its maximum values. For binary switches, values less than 64 map to 0 and greater than
or equal to 64 map to 1.
The procedure to create the mapping follows:
• Click an empty slot – the text changes to “Mapping.”
• Click a parameter of a module.
• Send a CC message from your MIDI controller.
A yellow square is printed near the mapped parameter. Maps can be undone by right-clicking them.
VCO-1 is a versatile virtual voltage-controlled oscillator with sine, saw, triangle, and square outputs
all available at once on separate outputs. It has exponential frequency modulation with hard/soft
sync and pulse width modulation (PWM) for the square wave. It features analog waveform
emulation but also digital waveform generation. The analog waveforms feature pitch drift and react
to pitch changes with some slew. The digital waveforms have quantized pitch and introduce more
aliasing. Figure 3.3 compares the analog and digital sawtooth waveforms.
VCO-2 is a stripped-down version of VCO-1, with morphing between the same waveform type seen
in VCO-1.
Fundamental VCF is a low-pass/high-pass voltage-controlled filter, emulating a four-pole transistor
ladder filter with overdrive and resonance. There is no switch to select the filtering mode, but a low-pass
and a high-pass output. The cutoff frequency CV is subject to an integrated attenuverter (FREQ CV),
while the resonance (RES) and drive (DRIVE) CV are summed to the related knobs. The sum is subject
to clipping (i.e. when the RES knob is turned to the maximum level, any positive CV sent to the RES
input will not affect the resonance because it already reached the maximum value).
The ADSR module generates envelopes to control the evolution of a sound. Its output can be used as
a control voltage for any module. It follows the ubiquitous Attack-Decay-Sustain-Release paradigm.
Most often it will be used together with the VCA module. The VCA module is a voltage-controlled
amplifier. In essence, it multiplies the input signal with a control signal in order to shape its amplitude.
The response to the control signal can be linear or exponential. If the envelope is sent to the exponential
input, the attack and decay ramps will decay “linearly in dB.” Confusing, huh? In other words, the decay
is exponential, but converted to dB (using the logarithm, which is the inverse operation of
exponentiation) the decay turns out linear. As shown in Figure 3.4, an exponential decay looks linear on
a dB scale, and we can say that it feels more natural and linear to the ear as well.
LFO-1 and LFO-2 are the low-frequency oscillators in the Fundamental series. Similar to
voltage-controlled oscillators, they have an extremely low frequency, which goes from tens of
seconds per cycle up to 261 Hz (the pitch of C4). LFOs are generally used to modulate a signal and
improve expressivity. LFO-2 is more compact than LFO-1 and offers morphing between wave
shapes instead of individual outputs. This can be a lot of fun if the wave type is modulated using
another signal or LFO.
Figure 3.3: Comparison of the analog (top) and digital (bottom) sawtooth waveforms generated by
Fundamental VCO-1.
Figure 3.4: Comparison of decay plots. A linear decay decreases by a fixed term per unit of time, looking like a
straight line on a linear plot, or as a logarithmic curve on a logarithmic plot (in this case, a dB plot). An
exponential decay decreases by a fixed number of decibels per unit of time, thus looking like a line on a dB plot.
Besides the traditional synthesizer building blocks, Fundamental also offers a delay effect, a mixer,
and other utilities. Let us examine them.
DELAY is a digital delay with control over feedback, delay time, and tone color. It has a dry/wet
knob. The implementation is very good, featuring smoothing on time adjustments, thus avoiding
annoying glitches that even hardware effects sometimes have when abrupt changes occur on the
delay line length.
A small mixer, VC MIXER, is provided. This is a four-channel mixer with independent CV control
of each channel level and overall level. It can be used in conjunction with the Core AUDIO module
to set levels prior to sending out the signals, but it can be used for many other mixing operations
inside your patches.
The attenuverters module, 8VERT, consists of eight attenuverters and can also be used to fade
signals or even invert their phase. An attenuverter multiplies the input signal by
a gain that ranges from −1 to 1. If the gain is negative, the signal gets inverted (minus sign)
and attenuated by the absolute value of the gain. As an extra utility, when no input is connected
to a row, the output of that row corresponds to the value of the knob in the range −10 to +10V.
This allows the module to create constant CV outputs that can be used to parametrize other
modules.
Another utility module, UNITY, contains two six-channel direct mixers, with no gain control. It can
sum input signals or average them. In the latter case, the inputs are summed and scaled by the
number of connected inputs. An inverted phase output is also available (INV).
Fundamental MUTES has ten input/output pairs, and a toggle switch for each one, muting the output.
Outputs with empty input copy their signal from the first non-empty input above.
SEQ-3 is an eight-step sequencer with three rows of control voltages. It is driven by an internal
clock, which has a tempo knob (CLOCK), or by an external signal (EXT CLK). Each time the clock
ticks, the sequencer fires a gate signal (GATE output on top) and advances by one step. For each
step, a gate signal is also sent individually (bottom row of outputs). For each step, the three control
voltages assigned to that step (corresponding to the three rows) are sent to the three related outputs
(ROW 1, ROW 2, ROW 3). The clock can be shut off (RUN button) or reset (RESET). The number
of steps does not necessarily have to be eight. Using the STEPS knob, a lower number of steps can
be used, allowing, for example, a change of time signature. To recap, at each clock step:
• the sequencer advances to the next step;
• the three knob values (one per row) are sent to each of the CV outs (ROW 1, ROW 2, and ROW 3);
• a gate signal is fired to the GATE output; and
• the same gate signal is fired on the GATE OUT output corresponding to the current step (bottom
row) – this individual gate signal can be disabled by clicking on the green button close to the output.
A clock generator and a simpler sequencer will be the object of our work in Sections 6.5 and 6.6. If
something is not clear enough at this point, it will become clearer later.
Two utility modules, SS-1 and SS-2, are also present for multiplexing and demultiplexing (in short,
muxing and demuxing). SS-1 is a demultiplexer that scrolls through the outputs in a round-robin
fashion, based on a clock signal. In other words, at each clock pulse, it sends the input signal to the
next output. SS-2 is a multiplexer that scrolls through the inputs in a round-robin fashion, based on
a clock signal. At each clock pulse, it sends a different input to the output. If this is not totally clear
by now, do not worry – more on muxing and demuxing will come in Section 6.3, where we shall
build a mux and demux module.
Utility modules to deal with polyphonic cables are SPLIT, MERGE, SUM, and VIZ. SPLIT takes
a polyphonic cable and splits it into monophonic cables. MERGE takes up to 16 input monophonic
cables and merges these in one polyphonic output cable. SUM takes a polyphonic input and outputs
the sum of all its channels into a monophonic cable. Finally, VIZ takes a polyphonic input and
visualizes the intensity of its individual channels through 16 LEDs.
Finally, SCOPE is an oscilloscope emulator, with external trigger, internal trigger based on
a threshold, X-Y mode, vertical zoom and offset, and time zoom. We shall see in the next section
how this works in detail. You should get to master it in order to debug modules.
Figure 3.5: A template patch hosting MIDI inputs and audio outputs.
parameters. As for the audio, we opted for a simple solution that allows four
inputs to be routed to a stereo output. The Mixer module has four inputs and faders. The outputs
can be mixed with Unity and get routed to the left and right output. For a larger number of inputs,
two Mixer modules can be employed and the Mix output of each one can be directly routed to the
left and right outputs, respectively. Please note that the AUDIO module inputs and outputs should
be understood as the module inputs and outputs, respectively. In other words, you should connect
the signal that you want to send to the sound card outputs to the AUDIO module inputs!
By saving such a template and recalling it when starting a new patch, you will be able to reduce your
working time significantly.
Figure 3.6: A simple East Coast monophonic synthesizer to build with VCV Rack.
Figure 3.7: A simple East Coast patch with VCO-VCF-VCA cascade, monophonic input, and vibrato.
Figure 3.8: Triggering the signal on the first input using the internal trigger.
edge occurs on the X IN and this edge crosses the trigger value. The threshold is selected using the
TRIG knob and shown in the SCOPE window as a small arrow indicated with a “T.” In Figure 3.9,
a sine wave is shown in the SCOPE, starting with a nonzero phase, due to the alignment with
a negative trigger. Please note that in the absence of X IN, the SCOPE will not synchronize to the Y IN.
An external trigger can be used as well to synchronize the view with a third signal connected to the
EXT input. This resets the view when its value passes the threshold imposed by the TRIG knob.
The SCOPE provides useful statistics related to the X and Y input signals. These are on the top and
bottom of the SCOPE screen, written in a tiny font. Peak-to-peak (pp), maximum, and minimum
values are provided. By default, one vertical division spans 5 V, and thus the display spans from
−10 V to +10 V. If the X SCL or Y SCL knobs are increased or decreased by one step, the vertical
division gets halved or doubled, respectively.
Finally, the SCOPE can plot a Lissajous curve in the X-Y view. The X and Y signals control the
horizontal and vertical coordinates of the drawing point, respectively. This mode is particularly
useful to tune the phase of two signals with the same frequency. As you can see in Figure 3.10a, two
sines with π phase difference are shown as a circle. On the contrary, when in perfect phase, they
are aligned, showing one thin line that goes from bottom left to top right, as shown in Figure 3.10b.
In the X-Y configuration, the TIME knob adjusts the persistence of the signal on the screen.
Figure 3.9: Triggering the SCOPE using an external input.
Figure 3.10: The SCOPE in X-Y mode, showing two sine signals with the same frequency, and (a) π phase
shift and (b) almost zero phase shift.
Figure 3.11: Using the SCOPE as a plotter for cool visuals. A sine and a frequency-modulated saw are
plotted one against the other (left) and in X-Y mode (right).
Of course, the SCOPE can be used to create cool visuals. In the old-school modular tradition, analog
signals were used to drive a cathode-ray tube screen. Figure 3.11 shows the X-Y plot of a frequency-
modulated signal.
(a)
(b)
Figure 3.12: The spectrum of a sawtooth wave generated with VCO-1 at approximately 1 kHz. (a) The
effect of the aliasing is clearly visible (partials connected with a white line). (b) The exact frequency of the
tone is slightly changed so that the aliased partials hide below the skirts of the proper partials. Please note
that in the first case, although visible, the aliasing is not noticeable by ear, as the undesired partials are tens of dB
below the proper ones.
The answer is: it depends on the source module. When connecting ports, the system knows if
a cable needs to be poly or mono. If the output port supports more than one channel, the cable
automatically becomes polyphonic. The MIDI-CV module, for example, supports mono and poly
modes. This setting is available from the context menu, under the Polyphony Channels item. When
you drag the V/OCT output of the Core MIDI-CV module with polyphony channels (right-click) set
to 1, the cable will be monophonic. If polyphony is set to a number 2–16, the cable will be thicker,
indicating that it is a polyphonic cable.
Even though a cable is polyphonic, it can be connected to an input port handling only one channel.
What happens in this case? It depends on the module: some will discard all channels but the first
one, some will sum all the channels into one. The general rule is:
• Audio inputs should sum all the channels to avoid losing some of them.
• CV inputs or hybrid audio/CV inputs should only take the first one.
More on this will come later. Now let us focus on making the East Coast synthesizer polyphonic.
Open the east-coast.vcv patch. Right-click the MIDI-CV module and set any number from 2 to 16
in the Polyphony Channels menu. Done! Yes, it is as simple as that!
The cables stemming from V/OCT and GATE are now polyphonic. As a domino effect, the VCO
and the ADSR modules are now polyphonic and their outputs will be polyphonic, thus similarly
affecting the VCF, VCA-2, and so on.
Notes
1 When MIDI-CV is connected to VCV Bridge, it receives the Start, Stop, and Continue events from the DAW.
2 Idem.
3 Idem.
CHAPTER 4
Developing with VCV Rack
In Rack parlance, the term “module” refers to the DSP implementation of a virtual Eurorack-style
module, while we use the term “plugin” to indicate a collection of modules, all bundled in a folder
containing one compiled file, the shared object or dynamic library. Table 4.1 summarizes all the terms in
the Rack jargon.
VCV Rack modules are similar to the “plugins” or “externals” or “opcodes” used in other popular
open-source or commercial standards in that they are compiled into a dynamic library that is loaded
at runtime by the application (provided that it passes a first sanity check). However, Rack modules
differ in all other regards. We shall highlight some of the differences in Section 4.1, then we shall
go straight to showing the components required to create a module (Section 4.2), and then give an
overview of basic APIs and classes used to define it and make it work (Section 4.3). At the end of
this chapter, we shall start the practical work of setting up your coding environment (Section 4.4)
and building your first “Hello world” module (Section 4.5).
Please note that some programming basics are assumed throughout this part of the book. You should have
some experience with C++. We will recall some basic properties of the language at various points
of the book as a reminder for readers with little experience in C++ or for developers who are used to
working with other programming languages.
plugin: A collection of Eurorack-style modules implemented in a single dynamic library, developed by a single author or company. The Rack browser lists all plugins and allows you to explore all the modules contained in a plugin. Not to be confused with the term as used in the DAW and VST jargon, where it usually refers to one single virtual instrument.
Plugin: A C++ struct implementing the plugin. We shall use Plugin in monospaced font only when explicitly referring to the C++ struct.
Model: A C++ struct that collects the DSP and the GUI of a Eurorack-style module.
module: Not to be confused with the C++ struct Module. We shall use the generic term “module” to indicate either real or virtual Eurorack-style modules.
Module: A C++ struct that implements the guts of a module (i.e. the DSP part).
ModuleWidget: A C++ struct that implements the graphical aspects of a module.
92 Developing with VCV Rack
stress the fact that we need to get below the theoretical limit P. In this time frame, the computer needs to execute not only our code, but also all other operating system tasks.
We can also define the real-time factor as the ratio between an execution time and the engine
period:
RT = E / P [%]    (4.1)

usually expressed as a percentage. It is obvious that RT must always be less than 100%.
Please note that the audio engine requires some time to pass the output to the sound card, and since
we are working with general-purpose computers there is always some random interrupt incoming
that has priority over our audio application, so stable audio processing without glitches can be obtained only with RT factors well below 100% (this is very sensitive to the operating system, its scheduler, the audio driver, and many other factors).
For most musical programming applications, the buffer size is a power of 2 and can often be
selected by the user to obtain a trade-off between latency and computational resources. Pure Data,
for example, by default has a buffer size of 64 samples. In this respect, however, Rack differs from all the other platforms: only one sample at a time is passed to the periodical processing function,
called process(...).
The definition of the RT factor for Rack changes to:

RT = E / T [%]    (4.2)

because in this case B = 1. The maximum execution time is now E1 ≤ 1/48,000 s ≈ 20.8 μs, lower
than the case above. Fortunately, in this short time, we just have to compute one sample, not
64. If the execution time were linearly proportional to the number of samples to process, then the following would hold: E64 = 64 · E1. In that case, the RT factor would stay the same for batch processing and for sample-wise processing. However, in practice, it turns out that processing one sample at a time reduces the chances to optimize code and exploit instruction pipelining and parallelization. Furthermore, a lot of overhead is added. It thus turns out that E64 < 64 · E1.
If you are wondering why sample-wise processing is less efficient, think about a factory with an
assembly line, processing a batch of 100 shoes compared to a single craftsman producing one shoe
at a time. The assembly line has specialized workers, each excelling at an operation. Shoes are
moved in batch from one machine to another. The assembly line will take less time to make each
shoe than the craftsman does, making the factory more efficient (although questions of quality and
ethics arise).
At this point, you may be wondering why, in VCV Rack, we are not exploiting the efficiency inherent in processing large buffers. In Rack, the main objective is the simulation of electronic
circuits, with the ability to create near delayless loops, allowing signal feedback like in modular
synthesizers.
Imagine the cascade of a module “A” and a second module “B.” Let the output of “B” be fed back
to “A.” In an analog environment, we have a delayless loop. Instantaneous hardware feedback is
a key concept for so-called No-Input Mixing music performances. In a discrete-time setup, however,
the feedback is delayed by – at least – one sample. Suppose that processing is sample-wise: the
computer must, first, execute the instructions for module “A” and extract one output sample. The
output is used by module “B” to compute its output sample. At the next time instant, the output of
“B” is fed back to “A” and the cycle restarts. If, however, modules A and B are programmed to
compute N output values from N input values, the feedback delay increases to N: samples will be
computed in batches of N and then fed to the next module. With increasing N, the simulation of
a feedback electronic system departs from reality. The only way to get close to a real analog system
is thus to process the smallest unit of signal (i.e. a single sample) and make it as short as possible
(i.e. increasing the sample rate). The higher the sample rate, the closer to the hardware system,
because the time step gets closer to zero.
Sample-wise processing is a necessary ingredient to make Rack get very close to the behavior of
modular systems. Of course, this comes with some CPU sacrifice.
4.1.2 Scheduling
Since there is a scheduling order for the audio processing objects, parallel execution of objects
in many cases is not an option. Imagine a chain of three objects, implementing an oscillator,
a filter, and an amplifier. The audio engine will schedule them serially, as the second needs to
wait for the execution of the first, and the third needs the second one to be executed. The audio
engine lets each one of them run and return the processed signal, which is subsequently fed to
the next object. After the third object has finished, the audio engine can pass the processed
signal on to the sound card. This implies that any of the objects may potentially steal CPU time from the others. If the available execution time expires before they have finished processing, there will be audio glitches, as the sound card has no new data to convert
and will recycle old data or fill with zeros. The RT factor must be lower than 100% to avoid
audio dropouts (i.e. loss of data).
As we said, all music programming platforms have an audio processing function. They have other
periodically called functions too. Control data may come in a synchronous or asynchronous way. In
the first case, a rate will apply, although lower than the audio engine rate. In the second case,
asynchronous events may arise, such as MIDI events or user events. In this context, all platforms
exhibit differences. Max, for example, processes control data at a fixed rate of 1 kHz. Rack does not
have any control data. This is one of its prominent characteristics: as in modular synthesizers, all
information is transmitted by a (virtual) voltage.
Fortunately, current CPUs have multiple cores. Rack is thus able to distribute computing tasks
among one or more cores. This means that several process routines are computed at the same time,
reducing the risk of dropouts.
Other periodic functions are the GUI rendering functions. In Rack, each graphical widget has a draw
method that is called periodically at a rate, ideally, of 60 frames per second.
From this discussion, it should be clear that any musical platform forces the developer to strip all tasks that are not time-critical or audio-related from the audio processing function, to keep it quick.
All these accessory tasks should be handled separately in other processes that may be periodically or
asynchronously called (Figure 4.1).
Figure 4.1: Scheduling of the process functions “P” and the draw functions “D” in VCV Rack in a single-core scenario. The process functions are invoked with a period equal to the sample time (i.e. T = 1/Fs), where Fs is the audio sampling rate (e.g. 44,100 Hz). The draw functions are called with a larger period of V = 1/Fv, with Fv being the video rate, usually 60 Hz. During the sample time, all the modules must perform their process function. The sum of all their execution times Etot is the total execution time of the patch. The spare time is left to all other lower-priority tasks. The real-time factor RT is thus Etot/T.
decades of computer music API and with the additional advantage of adopting the latest OOP and C++
standards. Consider the fact that older platforms are starting to wrap their codebase to allow C++ code
on top of a bare C core (e.g. this is the case of Csound (Lazzarini, 2017)).
Finally, the ability to write sample-wise processing code makes developing much easier and makes the
code easier to read. A lot of aspiring developers started writing code for Rack just by learning from the
first few examples provided by VCV (Fundamental, Befaco, and Audible Instruments plugins).
After all, if Rack was not so developer-friendly, this book would never have even been thought of.
Credit for all this goes to VCV Rack’s main developer.
For this reason, it is crucial to compile and run for the right version of Rack. The “right” version is
the one that provides not only the same API, but also the same ABI (application binary interface),
that was used for compiling Rack, and thus symbols match.
std::vector<Param> params;
std::vector<Input> inputs;
std::vector<Output> outputs;
std::vector<Light> lights;
The first four members are vectors of parameters (e.g. knobs), input and output connectors, and
lights that are initialized in the constructor. This means that when we subclass the Module struct, we
are able to impose a certain number of these elements. This is hard-coded (we cannot add an output during execution, but this makes perfect sense for a virtual modular system).
The next four methods are important for our purposes. The process method is the fundamental method for signal processing! It is called periodically, once per sample, as discussed in Section 4.1. It takes a constant argument as input: the ProcessArgs structure, which contains global information such as the
sampling rate. The config method is necessary to indicate the number of parameters, ports, and
lights to create. The configParam describes the range of a parameter, its labels, and more.
Other useful methods that can be overridden are onSampleRateChange, to handle special conditions
at the change of the internal engine sample rate, onReset, to handle special conditions during
reinitialization of the module, and onRandomize, to make special randomization of the parameters.
The ModuleWidget is the object related to the appearance of the module, and it hosts all GUI
elements such as knobs, ports, and so on. It also handles mouse events such as mouse clicks that can
be managed by developing custom code:
The development of novel models is made easy by Rack APIs. Among these, we want to highlight
basic components such as:
• a library of GUI elements, from abstract widgets to knobs and switches (examined in
Section 5.2);
• DSP and signal processing functions and objects, from the Schmitt Trigger (in include/dsp/
digital.hpp) to interpolation (in include/math.hpp), from efficient convolution (include/dsp/fir.
hpp) to fast Fourier transform (in include/dsp/fft.hpp); and
• various utilities such as random generator functions, string handling (in include/string.hpp), or
clamping and complex multiplications (in include/math.hpp).
We will explore more in the following pages; however, you are encouraged to dive into the source code
of Rack to inspect all the pre-implemented utilities it features.
4.4.1 Linux
Several packages need to be installed from your package manager, if not already present.
On Ubuntu, for example, open a terminal window (Ctrl+Alt+T) and type:
On Arch Linux:
pacman -S git wget gcc gdb make cmake tar unzip zip curl jq
That’s it.
4.4.2 Windows
You need to install MSYS2, a software package that allows you to manage the building process as on Unix-like platforms. It also features a package manager, which is very useful for getting the right tools easily
with one console command. Currently, MSYS2 is hosted at www.msys2.org/, where you can also
find the 32- or 64-bit installers. You need to install the 64-bit version. Please also note that Rack is
not supported on 32-bit OSs, so it would make no sense anyway to install 32-bit tools.
Once done, launch the 64-bit shell (open the Start menu and look for mingw64 shell) and issue the
command:
pacman -Syu
pacman -Su git wget make tar unzip zip mingw-w64-x86_64-gcc mingw-w64-x86_64-gdb mingw-w64-x86_64-cmake autoconf automake mingw-w64-x86_64-libtool mingw-w64-x86_64-jq
This will tell the MSYS2 package manager to install the required software tools.
4.4.3 macOS
You need to install Xcode with your Apple ID. Then install Homebrew, a command-line package
manager that will help you install the required tools. Go to https://brew.sh/ and follow the
instructions to install Homebrew (it is generally done by pasting a command into a terminal
window). After Homebrew is ready, in the same terminal you can type:
/home/myusername/Apps/Rack/
This will prepare a local Git repository and download its latest content. If you want instead to go
straight with one specific branch (e.g. branch “v1”), you can do the following:
cd Rack
Now you need to download other software modules required by Rack available from Git:
and other software dependencies that are automatically downloaded and prepared, invoking the
following command:
make dep
This step will take some time and requires an active Internet connection in order to download packages. It may happen that the required packages cannot be downloaded due to temporary server errors. Take a look at the output. All operations need to be successful, otherwise you will
miss fundamental software components and will not be able to compile and run.
The real compilation of Rack is issued by:
make
make run
will start your compiled version of Rack. Please note that make run will start Rack in development
mode, disabling the plugin manager and loading plugins from the local plugins folder.
If you want to speed up the compilation process, you can issue all the make commands adding the
“-j <parallel jobs>” option. This starts several parallel jobs, up to the number of parallel threads your
CPU supports.
You may encounter some issues in the process. It is very important that you collect the output of the
terminal before asking for suggestions on the official forum, on the GitHub issue tracker, or from users of the Rack developers Facebook page.
cd Rack/plugins/HelloWorld
make
This should be sufficient, and you should see one of the following files in the HelloWorld root,
depending on the operating system:
• plugin.so (Linux)
• plugin.dll (Windows)
• plugin.dylib (macOS)
Run Rack to verify that the HelloWorld module is loaded (Figure 4.3):
cd ../../
make run
Fine, but we can do better than this! However, before we start programming, we should take a look
at the source code skeleton we have built so far.
• res/CTemplate.svg. The background image file for the module. We shall use this throughout the
whole book to give our modules a simple yet elegant look and make them look similar.
• Makefile. This is the project makefile (i.e. what is executed after the make command is issued).
You don’t need to supply this to other users.
• LICENSE. You should always add a license file to your modules.
• build/. This folder is created automatically for building. You can disregard its content, and you
don’t need to supply this to other users.
• The plugin dynamic library (plugin.so, plugin.dll, or plugin.dylib, depending on the OS).
The plugin.json file contains data regarding the plugin, including name, authorship, and license, and the modules included in the plugin, with name, description, and tags. Allowed tags are listed in Rack/
src/plugin.cpp (see const std::set<std::string> allowedTags). The plugin.json for the
HelloWorld plugin follows:
{
  "slug": "HelloWorld",
  "name": "Hello World Example",
  "brand": "LOGinstruments",
  "version": "1.0.0",
  "license": "CC0-1.0",
  "author": "L.Gabrielli",
  "authorEmail": "[email protected]",
  "authorUrl": "www.leonardo-gabrielli.info/vcv-book",
  "sourceUrl": "www.leonardo-gabrielli.info/vcv-book",
  "modules": [
    {
      "slug": "HelloWorld",
      "name": "HelloWorld",
      "description": "Empty Module for Demonstration Purpose",
      "tags": ["Blank"]
    }
  ]
}
Note: The plugin version number must follow a precise rule. From Rack 1.0 onward, it is R_MAJOR.
P_MAJOR.P_MINOR, where R_MAJOR is the major version of Rack for which you provide
compatibility (e.g. 1 for Rack 1.0) and P_MAJOR and P_MINOR correspond to the version of your
plugin (e.g. 2.4 if you provided four minor changes to the second major revision of your plugin).
The makefile is a script that instructs the make utility how to build your plugin. You will notice that
it includes ../../plugin.mk, which in turn includes ../../arch.mk, a section to spot the architecture of
your PC, and ../../compile.mk, the part containing all the compilation flags.
The source code of the plugin includes at least a header and a C++ file, which take the name from
the plugin (e.g. HelloWorld.hpp and HelloWorld.cpp). The former gives access to the Rack API by
including rack.hpp and declares a pointer to the plugin type. It also declares external models (i.e. the
meta-objects) that contain the ModuleWidget (i.e. the GUI) and the Module
(i.e. the DSP code) for each one of your virtual modules. The HelloWorld.hpp file looks as follows:
#include "rack.hpp"
In HelloWorld.cpp, we instantiate the plugin pointer and we initialize it, adding our modules to it
with the addModel() method:
#include "HelloWorld.hpp"

Plugin *pluginInstance;

void init(Plugin *p) {
	pluginInstance = p;
	p->addModel(modelHello);
}
Each model is implemented in a separate C++ file. Any such file will include:
• The Module child class, defining the process(...) method and any other method related to
the signal processing.
• The ModuleWidget child class, defining the GUI and other properties of the module, such as
the context menu, and so on.
• The dynamic allocation of the Model pointer.
The Module is subclassed as follows:
struct HelloModule : Module {
	enum ParamIds {
		NUM_PARAMS,
	};
	enum InputIds {
		NUM_INPUTS,
	};
	enum OutputIds {
		NUM_OUTPUTS,
	};
	enum LightIds {
		NUM_LIGHTS,
	};

	HelloModule() {
		config(NUM_PARAMS, NUM_INPUTS, NUM_OUTPUTS, NUM_LIGHTS);
	}

	void process(const ProcessArgs &args) override {
		// Empty: this module is just a blank panel.
	}
};
As you can see, objects are implemented as a struct. In C++, class and struct are almost
equivalent, differing only in the fact that a struct defaults all methods and members to public,
while for a class they default to private. For this reason, throughout the book we will sometimes
use the term “class” as a synonym of “object,” generically including both “struct” and “class” under
this broad term, when both are to be addressed. Since we are following an OOP paradigm, it is worth stressing the difference from a C struct, which is intended only as a collection of variables (a plain C struct cannot have member functions; a struct gains methods only when compiled as C++).
The enums are empty, as you see, and the compiler will default NUM_PARAMS, NUM_INPUTS,
NUM_OUTPUTS, and NUM_LIGHTS to zero. There is no need to use the enums; we could just say:
config(0, 0, 0, 0);
However, it is better to keep this skeleton module more general, to have a little more flexibility in all the practical cases where you will need to add inputs, outputs, lights, or knobs. Most of the modules we are going to create have non-zero NUM_* enums.
As you can see, the class also declares the process(…) method, stating that it will override the same
method from the base class Module. The implementation is empty for now as the HelloModule class
does not define any action on signals; it is just a blank panel (more on blank panels in Section 10.8.3).
We then define the module widget child class and implement its constructor:
/* MODULE WIDGET */
struct HelloModuleWidget : ModuleWidget {
	HelloModuleWidget(HelloModule *module) {
		setModule(module);
		setPanel(APP->window->loadSvg(asset::plugin(pluginInstance, "res/HelloModule.svg")));
	}
};
As you can see, HelloModuleWidget inherits the ModuleWidget class. The constructor needs
to have a pointer to the Module subclass it will handle. More on this later. The setPanel method
takes an SVG file, loaded with loadSvg, and applies it as background to the module panel. Please
remember that, by default, the module size will be adapted to the size of the SVG file. The SVG file
must thus respect the size specifications, which impose the height of the module to be 128.5 mm (3
HU in the Eurorack format) and the width to be an integer multiple of 5.08 mm (1 HP). We will
discuss the creation of the SVG panel in more detail in Chapter 5. Please note that the panel size
can be changed from code, as we shall see for the ABC plugins.
When the zoom is 100%, a module height of 128.5 mm corresponds to 380 pixels:
#define RACK_GRID_HEIGHT 380 // include/app/common.hpp
Finally, the file ends with the creation of the Model, indicating the Module and the
ModuleWidget as template arguments:
Model *modelHello = createModel<HelloModule, HelloModuleWidget>("HelloWorld");
The input argument takes the slug of the module. The template arguments are the subclass of
Module and ModuleWidget.
Now Rack has all the required bits to compile your Model and add it to the plugin.
Figure 4.4: The Eclipse IDE, with the Project Explorer (left panel) for browsing projects and files, the main
panel (top right) showing multiple source files, and the Console panel (bottom right) showing the makefile
log after building the ABC project.
project structure, launch a build, and have automatic suggestions and code verification, then an
integrated development environment (IDE) is better suited.
One option I would suggest to all readers is the Eclipse platform,3 maintained by the Eclipse
foundation, open-source and highly flexible. This tool is cross-platform (Linux, Windows, macOS)
and is available in different flavors, depending on needs and the language you use. For C/C++
developing, I recommend downloading a build of the C/C++ developers IDE, or adding the Eclipse
CDT package to an existing Eclipse installation.
Eclipse (and similarly most C/C++ IDEs) shows up with several distinct panels, for efficient project
management. The Eclipse CDT shows a main panel with multiple tabs for reading and writing the source
code, a Project Explorer for browsing projects and the related files, and additional panels for searching
keywords, building, debugging, and so on. A typical Eclipse window is shown in Figure 4.4.
Figure 4.5: Eclipse code editor showing automatic suggestions, available by pressing Ctrl+Space.
resides). I suggest including the version in the project name (e.g. Rack101 for v1.0.1), so you will
be able to add future versions to Rack without having to delete previous ones. The toolchain is not
important – you can leave it to <None>.
Now, similarly, we will import the ABC library (or any other plugin folder) with File → New →
Makefile Project with Existing Code. The project location should be the ABC folder, where the makefile
resides, and I suggest you include the version of Rack you are compiling against in the project name.
Now you have the ABC project in the project explorer. Right-click it and go to Properties. Select Project
References and check the Rack project name. Now the indexer will include Rack in its search for classes
and declarations. To force building the C/C++ index, go to Project → C/C++ Index → Rebuild.
Now open a source file (e.g. AComparator). If you Ctrl+Click a Rack object name (e.g. Module),
Eclipse will take you to its declaration. This speeds up your learning curve and development time!
You can also experience automatic suggestions. Automatic suggestions are available by pressing
Ctrl+Space. The Eclipse indexing tool will scan for available suggestions. By typing more letters, you will refine the search by limiting it to words starting with the typed letters. If you write, for
example, the letter “A” and then press Ctrl+Space, you will get a large number of suggested objects,
functions, and so on. However, if you type “AEx,” you will get only “AExpADSR,” a class that we
have implemented in our ABC project (Figure 4.5).
Figure 4.6: The Project settings window, showing default settings to be used for building the ABC project.
As you can see, the build command is the default one (make).
a terminal. You can check this by right-clicking on the name of the project in the Project Explorer and
clicking Properties, or going to Project → Properties. On the left, you will see C/C++ Build, and the
Builder Settings should be set by default so that it uses the default build command (“make”). The build
directory is the root folder of your plugin. This is exemplified in Figure 4.6.
To test whether the Build command works fine, go to Project → Build Project. You should see the log of
the build process in the Console window, as seen in Figure 4.4. You should see no errors in the log.
This section gave a few hints on setting up your Eclipse IDE and getting ready to go. We will not go
further into the vast topic of developing with Eclipse, as it would be out of our scope, but you are
encouraged to read the documentation, or compare with any other IDE you are already familiar with.
Notes
1 This is the suggested behavior that should be followed by all developers.
2 Rack loads the plugins only once at startup, so you need to restart Rack if any of the plugins have been updated.
3 Experienced developers will have their own choice for the IDE, and some old-school developers will prefer developing
from a terminal using applications such as Vi or Emacs. Some will complain that I’m suggesting this IDE instead of
another one. That’s fine – let’s make peace, not war. Eclipse will suit all platforms and integrates nicely with VCV Rack –
that’s why I’m suggesting it in the first place.
CHAPTER 5
We all recognize how much the look and feel of hardware synthesizer modules influence the user at
first sight. This applies to software modules as well. They should look nice, be clear to understand, and possibly have a touch of flair, together with a “trademark” color scheme or a logo that makes all the modules from the same developer instantly recognizable. Ugly, uninspiring panels are often
disregarded, especially in the Rack third-party plugins list, where there’s already so much stuff.
A good mix of creativity and design experience makes the GUI catchy and inspiring, improving the
user experience (UX).
In this chapter, we will deal with the practical details of creating a user interface. With Rack, the
user interface is created in two stages:
1. Graphical design of the panel background (SVG file) using vector drawing software.
2. Code development to add components and bind them to parameters and ports of the module.
The SVG file is used as a simple background for the panel. All the interactive parts, such as knobs,
connectors, and so on, are added in C++. You will be guided in the design of the background file in
Section 5.1.
As reported in Section 4.5.2, the whole module is contained in a ModuleWidget subclass that collects all the “physical” elements (knobs, inputs, outputs, and other graphical widgets) and
links them to the signals in the Module subclass. The widget class fills the front panel with the
dynamic components that are required for interaction or graphical components that may change over
time (e.g. a graphic display, an LCD display, or an LED). There is a fair choice of components that
you can draw from the Rack APIs (see Section 5.2), but in order to design your customized user
interface you will need to create new widgets. This last topic is postponed to Chapter 9.
The Graphical User Interface 111
SVG files can be drawn with several vector graphics applications. However, we will show how to do that using an open-source application, Inkscape. Inkscape is available for all platforms and
generates SVG files that are ready to use with Rack. It can be downloaded from https://inkscape.org/
or from the software store available with some operating systems.
We will refer to the graphical design of the ABC plugins. These are extremely simple and clean in appearance, to make the learning curve quick. The ABC plugins adopt the following graphical
specifications1:
• Light gray background fill with full opacity (RGBA color: 0xF0F0F0FF).
• A small 5 px-high blue band at the bottom (RGBA color: #AACCFFFF).
• Title on top using a “sans serif” font (DejaVu Sans Mono, in our case, which is open-source)
with font size 20 (RGBA color: #AACCFFFF).
• Minor texts (ports and knobs) will be all uppercase with DejaVu Sans Mono font but font
size 14.
• Major texts (sections of a module, important knobs or ports) will be white in a blue box (RGBA
color: #AACCFFFF).
Figure 5.2: Comparison of a path object (top) and a text object (bottom). A path object is a vector object
defined by nodes, which can be edited. A text object, instead, can be edited as … well … text, with all the
font rendering options and so on.
object and transform it using the menu Path → Object to Path. Now the text cannot be edited
anymore.
This may get annoying, as you always need to keep one version of the module with the texts and another with the paths, to allow for later editing whenever you change your mind about the name of a knob or its appearance.
For the ABC modules, we will often print text directly from the code. This practice should be
avoided for released modules, as it adds some overhead, but it facilitates the development cycle.
From this hierarchy, the last two are usable in your designs, while the first two are meant only to be inherited: they will not show up on your module, because they do not define an SVG image file for rendering.
The file include/componentlibrary.hpp defines all the available objects.
To include one of the components widgets in your ModuleWidget, you need to add it according to
the methods described below.
The syntax for an input or output port is:

addInput(createInput<port-type>(position, module, number));
addOutput(createOutput<port-type>(position, module, number));

where:
• Port-type is the name of the widget port class we want to instantiate (e.g. one of the classes for
inputs and outputs available in include/componentlibrary.hpp, such as PJ301MPort). This basically defines what the port will look like.
• Position is a Vec(x,y) instance telling the position of the top-left corner of the widget, in
pixels.
• Module is the module class pointer; you usually don’t have to change this.
• Number is the index of the related input or output according to the InputIds or OutputIds enum, indicating which signal is generated from this input (or which signal is sent to this output). The signals are accessible with inputs[number].value or outputs[number].value, as we shall see later. However, it is generally safer to use the name from the enum (e.g. myModule::MAIN_INPUT) so that you don’t risk mixing up the numbers.
The syntax for a parameter (knob, switch, etc.) follows:
addParam(ParamWidget::create<class-type>(position, module, number, min, max, default));
where:
• Class-type is the name of the class we want to instantiate (e.g. one of the classes for knobs and
buttons seen in include/componentlibrary.hpp, such as RoundBlackKnob or NKK for a knob or
a type of toggle switch).
• Position is a Vec(x,y) instance telling the position of the top-left corner of the widget, in
pixels.
• Module is the module class pointer; you usually don’t have to change this.
• Number is the index of the related parameter according to the ParamIds enum, indicating
which parameter value is generated when the user interacts with this object. The value
generated by this object is accessible with params[number].value, as we shall see later.
However, it is generally safer to use the name from the enum (e.g. myModule::CUTOFF_
PARAM) so that you don’t risk messing up with the numbers.
• Min, max, default: these three values indicate the range of the knob/button (min to max) and its default value. For knobs, the value may be any floating-point number. For buttons, min and max should be something like 0.f and 1.f, with 0.f as the default.
The Graphical User Interface 115
Figure 5.3: Using coloured placeholders (circle and squares on the left) for components in a panel SVG file.
The helper.py script generates code for the ModuleWidget struct, analyzing the shape type, position, and color
of the placeholders in the “components” layer of the SVG file. The final rendering is done by Rack when the
compiled module is loaded.
An explanatory example is provided in Figure 5.3, where three colored shapes in the SVG are
translated to the C++ createParam, createInput, and createOutput commands that draw
the respective components during the in-application rendering. Let us call them, from top to bottom,
CUTOFF, FILTER_IN, and FILTER_OUT by setting their ID in the SVG. We save the file in Rack/
plugins/myPlugin/res/myModule.svg. By issuing the following commands:
cd Rack/plugins/myPlugin/
../../helper.py createmodule myModule res/myModule.svg src/myModule.cpp
MyModule() {
config(NUM_PARAMS, NUM_INPUTS, NUM_OUTPUTS, NUM_LIGHTS);
configParam(CUTOFF_PARAM, 0.f, 1.f, 0.f, "");
}
addChild(createWidget<ScrewSilver>(Vec(RACK_GRID_WIDTH, 0)));
addChild(createWidget<ScrewSilver>(Vec(box.size.x - 2 * RACK_GRID_WIDTH, 0)));
addChild(createWidget<ScrewSilver>(Vec(RACK_GRID_WIDTH, RACK_GRID_HEIGHT - RACK_GRID_WIDTH)));
addChild(createWidget<ScrewSilver>(Vec(box.size.x - 2 * RACK_GRID_WIDTH, RACK_GRID_HEIGHT - RACK_GRID_WIDTH)));
addParam(createParam<RoundBlackKnob>(mm2px(Vec(23.666, 22.16)), module, MyModule::CUTOFF_PARAM));
addInput(createInput<PJ301MPort>(mm2px(Vec(26.615, 56.027)), module, MyModule::FILTER_IN_INPUT));
addOutput(createOutput<PJ301MPort>(mm2px(Vec(26.615, 84.506)), module, MyModule::FILTER_OUT_OUTPUT));
}
};
The skeleton of the module is ready. The components have been placed with the createParam,
createInput, and createOutput methods, and the names of the components are those set in
the SVG file. Positions are given using Vec objects and converting the values from mm (as used
in the SVG) to px (as used in the widget). Most importantly, these names are also used in the
MyModule struct, in the related enums, and in the config and configParam function calls. At this
point, one can start coding and tweaking the GUI by modifying the components.
Note
1 Please note that colors are expressed as RGBA, which stands for red, green, blue, alpha (i.e. transparency). Please bear in mind that an alpha value of 0 corresponds to a totally transparent color, thus invisible, while an alpha value of 255 corresponds to no transparency. The 8-bit values (0–255) are provided in hexadecimal notation.
CHAPTER 6
Let’s Start Programming
The boring part is over; from here on, we are going to build and test modules!
If you followed the previous chapters, you are already able to compile a module and edit its graphical
user interface. In this chapter, we start with coding. The easiest way to get acquainted with coding for VCV Rack is to build “utility” modules that are functional but do not require extensive DSP
knowledge or complex APIs. We are going to set up a comparator module, a multiplexer/demultiplexer
(in short, mux and demux), and a binary frequency divider, shown in Figure 6.1.
6.1 Creating a New Plugin from Scratch, Using the Helper Script
Before we start with the modules, we need to create the basics for a new plugin. This will allow us
to add modules one at a time.
120 Let’s Start Programming
TIP: In this section, we will discuss the whole code and process involved in creating your first
complete module.
As a first example, we start with a simple module, a comparator. Given two input voltages,
a comparator module simply provides a high output voltage if input 1 is higher than input 2, or
a low voltage otherwise. In our case, the high and low voltages are going to be 10 V and 0 V. The
module we are going to design has:
• two inputs;
• one output; and
• one light indicator.
These are handled by the process(…) function, which reads the input and evaluates the output value.
The light indicator also follows the output. We shall, first, describe the way inputs, outputs, and
lights are handled in code.
Let us first concentrate on monophonic inputs and outputs. As discussed in Section 4.1.3, Rack
provides a fancy extension to real-world cables (i.e. polyphonic cables), but for the sake of
simplicity we shall leave these for later.
Input and output ports are, by default, monophonic. Three methods allow you to access basic
functionalities valid for all types of ports:
• isConnected();
• getVoltage(); and
• setVoltage(float voltage).
The first method allows you to check whether a cable is connected to the port. The getVoltage()
method returns the voltage applied to the port. This is normally used for reading the value of an
input port. Finally, the last method sets the voltage of the port (i.e. assigns a value to an output
port). These three methods are enough for most of the functionalities we shall cover in this book.
The lights have their own set/get methods:
• getBrightness(); and
• setBrightness(float brightness).
These clearly have similar functionalities; however, we are going to use the latter most of the time
to assign a brightness value to the lights.
While the port voltage follows the Eurorack standards (see Section 1.2.3), the lights take a brightness value in the range [0, 1]. The value is squared, so negative values become positive as well, and values larger than 1 are clipped.
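This squaring and clipping behavior can be sketched in a few lines (an illustrative approximation, not Rack’s exact implementation):

```cpp
#include <algorithm>

// Illustrative sketch of the brightness mapping described above:
// the value is squared, so negative inputs become positive, and
// the result is clipped to the [0, 1] range.
float lightResponse(float brightness) {
    float v = brightness * brightness; // squaring makes negatives positive
    return std::min(v, 1.f);           // values above 1 are clipped
}
```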
Now that you have all the necessary bits of knowledge, we can move to the development of the
comparator module. The development steps (not necessarily in chronological order) follow:
• Prepare all the graphics in res/.
• Add the module to plugin.json.
• Add the C++ file AComparator.cpp to src/.
• Develop a struct (AComparator) that subclasses Module.
• Develop a struct (AComparatorWidget) that subclasses ModuleWidget and its constructor.
• Create a model pointer modelAComparator that takes the above two structs as template arguments.
• Add the model pointer to the plugin into ABC.cpp.
• Declare the model pointer as extern in ABC.hpp.
Naturally, the helper script may speed up the process.
We will start defining how the module behaves. This is done by creating a new struct that inherits
the Module class.
Let’s take a look at the class definition, which shall be placed in AComparator.cpp:
struct AComparator : Module {
enum ParamIds {
NUM_PARAMS,
};
enum InputIds {
INPUT1,
INPUT2,
NUM_INPUTS,
};
enum OutputIds {
OUTPUT1,
OUTPUT2,
NUM_OUTPUTS,
};
enum LightsIds {
LIGHT_1,
LIGHT_2,
NUM_LIGHTS,
};
AComparator() {
config(NUM_PARAMS, NUM_INPUTS, NUM_OUTPUTS, NUM_LIGHTS);
}
The module defines some enums that conveniently provide numbering for all our inputs, outputs,
and lights. You may want to give meaningful names to the enum entries. Always remember that, by default, enums start numbering from 0.
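The NUM_* trick relies on plain C++ enum numbering: each unlabeled entry is one more than the previous, so the trailing entry automatically equals the count. A minimal sketch with illustrative names:

```cpp
// Each entry is one more than the previous, starting from 0, so the
// final NUM_INPUTS entry equals the number of real inputs before it.
enum DemoInputIds {
    INPUT1,     // 0
    INPUT2,     // 1
    NUM_INPUTS, // 2 == number of inputs
};
```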
We don’t have members for this module, so we go straight into declaring the methods: we have
a constructor, AComparator(), and a process(…), which will implement the DSP stuff. That’s
all we need for this basic example.
The constructor is the place where we usually allocate stuff, initialize variables, and so on. We leave
it blank for this module and go straight into the process function.
The process(…) function is the heart of your module. It is called periodically at sampling rate,
and it handles all the inputs, outputs, params, and lights. It implements all the fancy stuff that you
need to do for each input sample.
Each time the process function is executed, we compare the two inputs and send a high-voltage or
a low-voltage value to the output. This is done by a simple comparison operator.
In C++ code:
To spare computational resources and avoid inconsistent states, it is sometimes better to avoid processing the inputs if they are not “active” (i.e. connected to a wire). If they are both
connected, we evaluate their values and use the operator “?” to compare the two. The result is
a float value that takes the values of 0.f or 1.f. The output goes to the light (so we have visual
feedback even without an output wire) and to the output voltage.
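The comparison at the heart of the module can be sketched as a plain function (illustrative names and structure; in the actual module this logic sits inside process(), reading inputs[] and writing outputs[] and lights[]):

```cpp
// Sketch of the comparator logic: 10 V when in1 > in2, 0 V otherwise.
// The intermediate float 'out' (0.f or 1.f) can drive both the light
// brightness and, scaled by 10, the output voltage.
float comparatorOut(float in1, float in2) {
    float out = (in1 > in2) ? 1.f : 0.f; // the "?" operator at work
    return 10.f * out;                   // scale to Eurorack high voltage
}
```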
Now that the module guts are ready, let’s get started with the widget. We declare it in AComparator.
cpp. This widget is pretty easy and it only requires a constructor to be explicitly defined.
AComparatorWidget(AComparator* module) {
setModule(module);
setPanel(APP->window->loadSvg(asset::plugin(pluginInstance, "res/AComparator.svg")));
addChild(createLight<TinyLight<GreenLight>>(Vec(80, 150), module, AComparator::LIGHT_1));
}
};
What is the constructor doing? It first tells AComparatorWidget to take care of AComparator by
passing the pointer to the setModule(…) function. It then sets the SVG graphic file as the panel
background. The file is loaded as an asset of the plugin and automatically tells Rack about the panel
width.
At the end of the widget constructor, we add other children, such as the inputs and outputs. Let’s have a detailed look
into this. The addInput method requires a PortWidget*, created using the createInput method.
This latter method requires a port template, which basically defines the graphical aspects (as far as
we are concerned now) of this new port. The creation argument of the port defines the position, in
terms of an (x,y) coordinate vector Vec, the parent module, and the port number of that module. We
use, for convenience and readability, the enum name instead of a number. We do the same for the output ports and the parameters (although there are no parameters in this example). Other widgets,
such as lights and screws,1 use an addChild method.
The widget configuration, as you can see, is pretty basic, and most modules will follow the same
guidelines:
Finally, the Model is created, setting the Module subclass and the ModuleWidget subclass as
template arguments and the module slug as input argument.
The last step is to create this model and add it to the list of models contained in our plugin. We do
this in ABC.cpp inside the init(rack::Plugin *p) function as:
p->addModel(modelAComparator);
and we declare this model pointer as extern in ABC.hpp, so that the compiler knows that it exists:
extern Model* modelAComparator;
OK, pretty easy! We can compile the code against the provided version of the VCV Rack source
code, et voilà, the module is ready to be opened in VCV Rack!
Now it is time for you to test your skills with some exercises. You will find this simple code in the
book repository. Go to the exercise section of this chapter to see how to expand the module and
improve your skills!
Exercise 1
The comparator module has only two inputs and an output port. Wasting space is a pity, so why
don’t we replicate another comparator so that each module has two of them? You can now try to
add two extra inputs and one extra output and light. You can double most parts of the code, but
I suggest you iterate with a for loop through all the inputs and outputs. Try on your own, and at the
end – if you can’t figure it out on your own – take a look at how this is done in the AComparator
found with the online resources.
Exercise 2
Sometimes you want the output to latch for a few milliseconds (i.e. to avoid changing too quickly).
Think of the case where input “A” is a signal with some background noise and input “B” is zero. Every time “A” goes above zero, the output will go high. This will happen randomly, following the randomness of the noisy “A” signal. How do you avoid this? A hysteresis system is what you need. Try to design a conditional mechanism yourself that:
• drives the output high only when input “A” surpasses a threshold that is slightly higher than
input “B”; and
• drives the output low only when input “A” drops below a threshold that is slightly lower than
input “B.”
TIP: In this section, you will learn how to add parameters such as knobs, and to pick the right
one from the component library.
The terms “mux” and “demux” in engineering stand for multiplexing and demultiplexing, which are
the processes that make multiple signals share one single medium, in our case a wire (or its virtual
counterpart). In analog or virtual circuits, we need multiplexers to allow two or more signals to be
transferred over a wire, one at a time. Similarly, we demultiplex a wire when we redirect it to
multiple outputs. This is achieved in analog circuits by means, for example, of voltage-controlled
switches. In the digital domain, we do it a bit differently.
Considering an integer variable, selector, holding the index of the input port to be muxed to the
output, a C++ snippet for a multiplexer could be as follows:
output = inputs[selector]; // MUX
Similarly, a C++ snippet for a demultiplexer, where one input can be sent to one of many outputs, looks like:
outputs[selector] = input; // DEMUX
How do we translate this into Rack port APIs? Try this on your own first – write it on paper, as an
exercise.
This second module we are designing requires a selector knob. This is a parameter, in the Rack
terminology. Let us discuss how parameters work and how to add them to a module. Parameters
may be knobs, sliders, switches, or any other graphical widget of your invention that can store
a value and be manipulated by the user. The two important methods to know are:
• getValue(); and
• setValue(float value).
As with lights and ports, the set/get methods return or set the value of the parameter. While the
latter is of use in some special cases (e.g. when you reset or randomize the parameter), you will use
getValue() most of the time to read any change in the value from the user input.
The parameters are added to the ModuleWidget subclass to indicate their position and bind them
to one of the parameters in the enum ParamIds. Let us look at this example:
addParam(createParam<ParameterType>(Vec(X, Y), module, MyModule::
OUTPUT_GAIN));
The details you have to supply in this line are the ParameterType (i.e. the name of a class defining the graphical aspect and other properties of the object), the positioning coordinates X and Y, and the parameter from the enum ParamIds of the Module struct to which you bind this widget.
The last bit of information to give to the system is the value mapping and a couple of strings. All these
are set through the configParam method, which goes into the Module constructor. One example follows:
enum ParamIds {
M_SELECTOR_PARAM,
D_SELECTOR_PARAM,
NUM_PARAMS,
};
enum InputIds {
M_INPUT_1,
M_INPUT_2,
M_INPUT_3,
M_INPUT_4,
D_MAIN_IN,
NUM_INPUTS,
N_MUX_IN = M_INPUT_4,
};
enum OutputIds {
D_OUTPUT_1,
D_OUTPUT_2,
D_OUTPUT_3,
D_OUTPUT_4,
M_MAIN_OUT,
NUM_OUTPUTS,
N_DEMUX_OUT = D_OUTPUT_4,
};
enum LightsIds {
M_LIGHT_1,
M_LIGHT_2,
M_LIGHT_3,
M_LIGHT_4,
D_LIGHT_1,
D_LIGHT_2,
D_LIGHT_3,
D_LIGHT_4,
NUM_LIGHTS,
};
We declare two integer variables to hold the two selector values. To make their values consistent
across each step of the process(…) method, we declare them as members of the AMuxDemux struct:
unsigned int selMux, selDemux;
They are zeroed in the Module constructor, where the value mapping of the selectors is also defined:
AMuxDemux() {
config(NUM_PARAMS, NUM_INPUTS, NUM_OUTPUTS, NUM_LIGHTS);
configParam(M_SELECTOR_PARAM, 0.0, 3.0, 0.0, "Mux Selector");
configParam(D_SELECTOR_PARAM, 0.0, 3.0, 0.0, "Demux Selector");
selMux = selDemux = 0;
}
void process(const ProcessArgs &args) override {
/* MUX */
lights[selMux].setBrightness(0.f);
selMux = (unsigned int)clamp((int)params[M_SELECTOR_PARAM].getValue(), 0, N_MUX_IN);
lights[selMux].setBrightness(1.f);
if (outputs[M_MAIN_OUT].isConnected()) {
if (inputs[selMux].isConnected()) {
outputs[M_MAIN_OUT].setVoltage(inputs[selMux].getVoltage());
}
}
/* DEMUX */
lights[selDemux+N_MUX_IN+1].setBrightness(0.f);
selDemux = (unsigned int)clamp((int)params[D_SELECTOR_PARAM].getValue(), 0, N_DEMUX_OUT);
lights[selDemux+N_MUX_IN+1].setBrightness(1.f);
if (inputs[D_MAIN_IN].isConnected()) {
if (outputs[selDemux].isConnected()) {
outputs[selDemux].setVoltage(inputs[D_MAIN_IN].getVoltage());
}
}
}
As you can see, there are several checks and casts. We need to cast the parameter value to an integer because it is used as an index into the array of inputs. The selector values are always clamped using the Rack function clamp(), which constrains a value between a minimum and a maximum. This may look a little paranoid, but it is very important that the selector values do not exceed the range of the arrays they index; otherwise, an unpleasant segmentation fault occurs and Rack crashes. You can avoid this by making the parameters D_SELECTOR_PARAM and M_SELECTOR_PARAM properly constrained in your widget.
You will notice that we shut off a light each time before evaluating the new value for selDemux or
selMux, even if the selector has not changed. This makes the code more elegant and reduces the
number of if statements.
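The index arithmetic behind the demux lights is worth a closer look; it can be sketched in isolation (a small illustration of the offsets defined by the LightsIds enum above):

```cpp
// In the LightsIds enum, M_LIGHT_1..M_LIGHT_4 occupy indices 0..3 and
// D_LIGHT_1..D_LIGHT_4 occupy indices 4..7. With N_MUX_IN = 3 (the
// value of M_INPUT_4), the light for demux selector s sits at
// s + N_MUX_IN + 1, which is the offset used in the process code.
const int N_MUX_IN = 3;

int demuxLightIndex(int selDemux) {
    return selDemux + N_MUX_IN + 1;
}
```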
Finally, the lines concerning the outputs are reached only if the relevant inputs and outputs are connected.
The assignments to the output ports are done using the setVoltage() and getVoltage() methods. Compare this with the lines you wrote previously for the exercise. Did you get it right?
To conclude the module, let us look at the graphical aspect. We defined through the configParam
method that the selectors span the range 0–3. The last step is to tell the ModuleWidget subclass
where to place the selector, and what it looks like. The latter is defined by the template
<ParameterType> of the createParam method, seen above. We have a lot of options – just open
the include/componentlibrary.hpp and scroll. This header file defines a lot of components:
lights, ports, and parameters. Let us look at the knobs: RoundKnob, RoundLargeBlackKnob,
Davies1900hWhiteKnob, and so on – there are a lot of options! As discussed in Section 5.2, they
are organized hierarchically, by inheritance, and their properties are easy to interpret from the source
code directly.
Besides the graphical aspect, there is one important property to make the mux/demux module work
flawlessly: the snap property. Knobs can snap to integer values, as a hardware selector/encoder
would do, skipping all the real-valued positions between two integer values. This way, the user can easily select which of the input/output ports to mux/demux, with visual feedback. We take the
RoundBlackSnapKnob from the component library and place it where it fits as follows:
Before moving on to the last remarks, take a look at the provided code, AMuxDemux.cpp. The
module implements both a multiplexer (upper part) and a demultiplexer (bottom part), so you can
test it easily.
Note: The function clamp is an overloaded function. This means that there are two definitions of the function, one for integer and one for float variables. You can use the same function name with either float or integer values, and the compiler will figure out which of the two implementations to use, depending on the argument types.
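A sketch of such an overloaded pair (illustrative names, not Rack’s exact signatures):

```cpp
// Two overloads sharing one name: the compiler selects the integer or
// float version from the argument types, just as with Rack's clamp.
int clampDemo(int x, int lo, int hi) {
    return x < lo ? lo : (x > hi ? hi : x);
}

float clampDemo(float x, float lo, float hi) {
    return x < lo ? lo : (x > hi ? hi : x);
}
```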
Exercise 1
What about employing an input to select the output (input) of a multiplexer (demultiplexer)? This is
handy as we may want to automate our patches with control voltages. To implement this, it is sufficient to replace the param variable with an input variable.
Exercise 2
You may decide to have both a knob and an input as a selector. How could you make the two work
together? There is no standard for this, and we may follow different rules:
• Sum the input CV and the knob values after casting to int: this is quite easy to understand for
the user. It may not be of use if the knob is turned all the way up: the input signal will not affect
the behavior in any way unless it is negative.
• Sum the input and the knob values and apply a modulo operator: this way, there will always be
a change, although it may be more difficult to have control on it.
• Exclude the knob when the CV input is active.
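The second rule can be sketched as follows (illustrative names; the modulo keeps the combined index in range):

```cpp
// Sketch of the "sum and wrap" rule: add the knob index and the
// CV-derived index, then apply modulo so the result always stays
// within the number of ports, guaranteeing the selection changes.
int combineSelectors(int knobSel, int cvSel, int numPorts) {
    return (knobSel + cvSel) % numPorts;
}
```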
applied after computing the logarithms. Please note that the denominator should not be zero, thus
b ≠ 1. Finally, if the base is positive, the base is raised to the power of v.
Examples of these mappings are shown in Figure 6.2.
Please note that the displayed value is computed by the GUI in Rack only for display. The value
provided by the Param::getValue method will always be v (i.e. the linear value in the range
from minValue to maxValue). This leaves you free to use it as it is, to convert it in the same way it
is computed according to the configParam, or even compute in a different way. Considering the
computational cost of transcendental functions such as std::pow or std::log, you should
Figure 6.2: Mapping the knob value into a displayed value according to the configParam arguments. In (a), the values v, ranging from 0.1 to 10, are shown. In (b), the mapping of Equation 6.1a is shown with m = 2, o = 20. In (c) and (d), the mappings are shown according to Equation 6.1b for b = 0.1 and b = 10, respectively. In (e) and (f), the exponentials are shown with b = 0.1 and b = 2.
avoid evaluating the value according to Equations 6.1b and 6.1c at each time step. Good
approximations of logarithms can be obtained by roots. Roots and integer powers of floats are
quicker to compute than logarithms or float powers.
In the following chapters, different strategies will be adopted to map the parameters according to
perception and usability.
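As an illustration of this family of mappings, a common exponential taper (an assumption in the spirit of the equations above, not the book’s exact formula) maps v in [0, 1] to a range so that equal knob movements give equal ratios:

```cpp
#include <cmath>

// Hypothetical exponential taper: v = 0 maps to minVal, v = 1 maps to
// maxVal, and the midpoint lands at the geometric mean of the two.
// Useful for frequency-like parameters spanning several octaves.
float expMap(float v, float minVal, float maxVal) {
    return minVal * std::pow(maxVal / minVal, v);
}
```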
TIP: In this section, you will learn how to generate short clock pulses using the PulseGenerator
Rack object.
A clock generator is at the heart of any modular patch that conveys a musical structure. The notion
of a clock generator differs slightly among the domains of music theory, computer science,
communication, and electronics engineering. It is best, in this book, to abstract the concept of clock
and consider it as a constant-rate event generator. In the Eurorack context, an event is a trigger
pulse of short duration, with a voltage rising from low to high voltage. In VCV Rack, the high
voltage is 10 V and the pulse duration is 1 ms.
In this section, we will develop a simple clock that generates trigger pulses at a user-defined rate
expressed in beats per minute (bpm). The generated pulses can be used to feed a frequency divider
or an eight-step sequencer as those developed in this chapter.
The operating principle of a digital clock is simple: it takes a reference clock at frequency fref and scales
it down by issuing an event every L pulses of the reference clock. The obtained frequency is thus
fclk = fref/L. The reference clock must be very stable and reliable, otherwise the obtained clock will suffer
jitter (i.e. bad timing of its pulses). In the hardware domain, a reference clock is a high-frequency PLL or
quartz crystal oscillator. In the software domain, a reference clock is a reliable system timer provided by
the operating system or by a peripheral device. The clock generator will count the number of pulses
occurring in the reference clock and will issue a pulse itself when L pulses are reached.
As an example, consider a reference clock running at 10 kHz. If I need my clock generator to run at
5 kHz, I will make it issue a pulse every second pulse of the reference clock. Each time I pulse my
clock, I also need to reset the counter.
In Figure 6.3, the frequency to be generated was an integer divisor of the reference clock frequency.
However, when we have a clock ratio that is not integer, things change.
Figure 6.3: Generation of a 5 kHz clock from a 10 kHz clock by firing a pulse each second pulse of the
reference clock.
Let us consider the following case. We need to generate a 4 kHz clock from a 10 kHz clock. The
exact number of pulses to wait from the reference clock before firing a pulse is L4k = 10 kHz / 4 kHz = 2.5, which is not an integer. With the aforementioned method, I can only count up to 2 or 3 before issuing
a pulse. If I fire a pulse when the counter reaches 3, I generate a clock of about 3.33 kHz, yielding an error of roughly 0.67 kHz, approximately 667 Hz. Not very precise. Firing a pulse when the counter reaches
2 means generating a clock at 5 kHz, which is even farther from 4 kHz.
If I could increase the frequency of the reference clock, the error would be lower, even zero. If
I have a clock at, for example, 100 kHz, I could count up to 25 and obtain a clock of 4 kHz.
A larger clock ratio fref/fclk is thus of benefit to reduce the error, or even cancel it when fref is a multiple of fclk.
However, if I cannot change the ratio between the two clocks, a better strategy needs to be
devised. In software, we can use fractional numbers for the counter. One simple improvement can
thus be done by using a floating-point counter instead of an integer one. Let us again tackle the
case where we want to generate a 4 kHz clock from a 10 kHz reference clock. As before, I can
still fire a pulse only in the presence of a pulse from the reference clock, which is equivalent to
setting an integer threshold for the counter, but this time I allow my counter to go below zero and
have a fractional part.
The counter threshold L̂ is set to 2 and the floating-point counter starts from zero. At the beginning,
after counting two pulses of the reference clock, a pulse will be fired, but without resetting the
counter to 0. Instead, L4k is subtracted from the counter, making it go to −0.5. After two oscillations
of the reference clock, the counter will reach 1.5, not yet ready to toggle. It will be toggled after the
third reference pulse because the counter goes beyond 2 (more precisely, it reaches 2.5). At this
point, L4k is subtracted and the counter goes to 0, and we get back to the starting point. If you
continue with this process, you will see that it fires once after two reference pulses and once after
three reference pulses. In other words, the first pulse comes too early, while the second comes too
late. Alternating the two, they compensate each other. If you look over a large time frame (e.g.
over 100 reference clock pulses), you see 40 clock pulses (i.e. the average frequency of the clock is
4 kHz and the error – on average – is null). If you think about the whole process, it is very similar
to the leap year (or bissextile year), where one day is added every four years because we cannot
have a year that lasts for 365 days and a fraction.2
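The alternating early/late behavior can be verified with a few self-contained lines (a sketch of the principle, not the module code): with L = 2.5, exactly 40 pulses fire over 100 reference ticks.

```cpp
// Fractional-counter divider: advance by 1 per reference tick; when the
// counter reaches L, fire a pulse and subtract L (instead of resetting
// to 0), carrying the fractional error into the next period.
int pulsesFired(float L, int refTicks) {
    float counter = 0.f;
    int fired = 0;
    for (int i = 0; i < refTicks; i++) {
        counter += 1.f;
        if (counter >= L) {
            fired++;
            counter -= L;
        }
    }
    return fired;
}
```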
Please note that this way of generating a clock may not be desirable for all use cases, because the
generated clock frequency is correct only on average. In fact, it slightly deviates from period to
period, generating a jitter (i.e. an error between the expected time of the pulse and the time the
pulse happens). In the previous example, the jitter was heavy (see Figure 6.4) because we had
a very low clock ratio. For our purposes, however, the method is more than fine, and we will not dive
into more complex methods to generate a clock. We are assuming that the clock ratio will always
be large because we will be using the audio engine as a clock source to generate bpm pulses,
whose periods are many times longer than the sampling period. The audio engine is very stable and reliable,
otherwise the audio itself would be messy or highly jittered. The only case when the assumption
fails is when the system is suffering from heavy load, which can cause overruns or underruns.
However, in that case, the audio is so corrupted that we won’t care about the clock generator
module. The audio engine also provides a large clock ratio if we intend to build a bpm clock
generator (i.e. a module that at its best will generate a few pulses per second). Indeed, the ratio
between the reference clock (44,100 at lowest) and the bpm clock is much larger than the example
above. While in the example above the reference clock was less than an order of magnitude faster,
Figure 6.4: The process of generating a clock that has an average frequency of 4 kHz from a 10 kHz clock
source. The figure shows the reference clock (a), the floating-point counter (b), the clock triggered by the
counter (c), which will be 4 kHz only on average, the desired clock at precisely 4 kHz (d), and the time
difference between the generated clock and the desired one (e). Although the generated clock has an
average frequency of 4 kHz, it is subject to heavy jitter because of the low clock ratio.
in our worst case here we have a reference clock that is more than three orders of magnitude
faster. If you take, for example, 360 bpm as the fastest clock you want to generate, this reduces to
6 beats per second, while the audio engine generates at lowest 44,100 events per second. The ratio
is 7,350. In such a setting, the jitter cannot be high, and the average error, as shown above,
reduces to zero.
Let us now see how to build the module in C++.
We first describe the PulseGenerator object. This is a struct provided by Rack to help you create
standard trigger signals for use with any other module. The struct helps you generate a short positive
voltage pulse of duration defined at the time of triggering it. You just need to trigger when needed,
providing the duration of the pulse, and then let it go: the object will take care of toggling back to 0
V when the given pulse duration has expired. The struct has two methods:
• process(float deltaTime); and
• trigger(float duration).
The latter is used to start a trigger pulse. It takes a float argument, telling the duration of the pulse
in seconds (e.g. 1e-3 for 1 ms). The process method must be invoked periodically, providing, as
input argument, the time that has passed since the last call. If the pulse is still high, the output will
be 1. If the last pulse trigger has expired, the output will be 0. Always remember that you need to
multiply the output of the process function to obtain a Eurorack-compatible trigger (e.g. if you want
to generate a 10 V trigger, you need to multiply by 10.f).
Figure 6.5 shows a trigger pulse generated by PulseGenerator, where the trigger method is first
called specifying the pulse duration. At each process step, the output is generated, checking whether
the trigger duration time has passed, and thus yielding the related output value.
For modules generating a clock signal, a good duration value is 1 ms (i.e. 0.001 s, or 1e-3f in the
engineering notation supported by C++). A positive voltage of 10 V is OK for most uses.
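The behavior can be sketched with a minimal, self-contained stand-in for the Rack object (illustrative, not Rack’s exact implementation):

```cpp
// Minimal sketch of a PulseGenerator-like helper: trigger() arms a
// pulse of the given duration; process() advances time by deltaTime
// and returns 1.f while the pulse is high, 0.f once it has expired.
struct MiniPulse {
    float remaining = 0.f;

    void trigger(float duration) {
        remaining = duration;
    }

    float process(float deltaTime) {
        if (remaining > 0.f) {
            remaining -= deltaTime;
            return 1.f;
        }
        return 0.f;
    }
};
```

With a 1 ms pulse at a 44.1 kHz sample rate, process() returns 1.f for about 45 consecutive calls before falling back to 0.f.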
We now move to the module implementation, discussing how to convert bpm to a period
expressed in samples, and how the floating-point counter algorithm described above is put into
practice.
The clock module is based on the struct AClock, a Module subclass, shown below:
struct AClock : Module {
enum ParamIds {
BPM_KNOB,
NUM_PARAMS,
};
enum InputIds {
NUM_INPUTS,
};
enum OutputIds {
PULSE_OUT,
NUM_OUTPUTS,
};
enum LightsIds {
PULSE_LIGHT,
NUM_LIGHTS,
};
dsp::PulseGenerator pgen;
float counter, period;
AClock() {
config(NUM_PARAMS, NUM_INPUTS, NUM_OUTPUTS, NUM_LIGHTS);
configParam(BPM_KNOB, 30.0, 360.0, 120.0, "Tempo", " BPM");
counter = period = 0.f;
}
The enums define a knob for bpm selection, one output, and a light to show when the trigger is
sent. The knob is configured with minimum, maximum, default values, and with two strings: the
name of the parameter and the unit, in this case beats per minute (bpm). One spacing character is
intentionally inserted to the left of “BPM”. The module also contains a PulseGenerator object and two auxiliary float variables, initialized to 0.
The process function is pretty short, thanks to the use of the PulseGenerator object:
void process(const ProcessArgs &args) override {
period = 60.f * args.sampleRate / params[BPM_KNOB].getValue();
if (counter > period) {
pgen.trigger(1e-3f);
counter -= period;
}
counter++;
float out = pgen.process( args.sampleTime );
outputs[PULSE_OUT].setVoltage(10.f * out);
lights[PULSE_LIGHT].setBrightness(out);
}
In the process function, we first convert the bpm value to a period expressed in samples, which tells us how far apart the triggers are in time. A bpm value expresses how many beats
are to be sent in a minute. For the sake of simplicity, our clock sends one trigger pulse for each
beat. To get the number of pulses per second, we need to divide the bpm by 60. For
instance, a bpm of 120 is equivalent to a frequency of 2 Hz (i.e. two beats per second). However,
we need to convert a frequency to a period. By definition, a frequency is the reciprocal of
a period. Remember that the reciprocal of a division is a division with inverted order of the
terms (i.e. ðBPM=60Þ1 ¼ 60=BPM). Finally, the period is in seconds, but we are working in
a discrete-time setting, where the sampling period is the basic time step. We thus need to multiply
the seconds by the number of time steps in a second to get the total time steps to wait between
each trigger. As an example, if the bpm is 120, the period is 0.5 s, corresponding to 22,050 time
steps at a sampling rate of 44,100 Hz. This means that every 22,050 calls to the process function, we
need to trigger a new pulse.
Back to the code: once the period is computed, we compare it against the counter, which increases by
one at each time step. When the counter is larger than the period, a trigger is started by calling
the trigger method of the PulseGenerator pgen. At this point, if we were to reset the counter to 0, we
would keep accumulating the error given by the difference between period and counter at each pulse.
Instead, subtracting the period from counter (which is always greater than or equal to period when we
hit the code inside the if) compensates for this error, as discussed in the opening of this
section.
At the end of the process function, we increase the counter and we connect the output of the pulse
generator to the output of the module.
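The same mechanism can be sketched outside the Rack API. The following is a minimal, self-contained rendition of the drift-free counter described above (the struct and helper names are my own, not the book's AClock code):

```cpp
#include <cassert>

// Minimal sketch of the drift-free clock counter described above.
// Names are illustrative; the book's AClock embeds this logic in process().
struct CounterClock {
    float counter = 0.f;

    // Returns true when a trigger should fire at the current sample.
    // 'period' is the trigger period in samples and may be fractional.
    bool process(float period) {
        bool fire = false;
        if (counter > period) {
            counter -= period; // keep the fractional error instead of resetting to 0
            fire = true;
        }
        counter += 1.f;
        return fire;
    }
};

// bpm -> period in samples, as derived in the text: 60 * Fs / BPM.
inline float bpmToPeriodSamples(float bpm, float sampleRate) {
    return 60.f * sampleRate / bpm;
}
```

Subtracting the period instead of zeroing the counter is what keeps the long-term tempo accurate when the period is not an integer number of samples.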
As promised, we will now discuss the use of lights and show how to obtain smooth transitions. The
first way to turn the light on while the pulse trigger is high is to simply set its value, as we did in
the previous module, using the setBrightness method.
There is one issue with this code, however. The pulse triggers are so short that we cannot ensure
that the light will be turned on and be visible to the user. If the video frame rate is approximately
60 Hz (i.e. the module GUI is updated each 16 ms), it may happen that the pulse turns on and off in
the time between two consecutive GUI updates, and thus we will not see its status change.
To overcome this issue, there are a couple of options:
• Instantiate a second pulse generator with a trigger duration longer than 16 ms that is triggered
simultaneously with pgen but lasts longer, giving a visual cue.
• Use another method of the Light object, called setSmoothBrightness, which rises instantly
but smooths its falling envelope, making it last longer.
The first option is rather straightforward but inconvenient. Let us discuss the second one.
The setSmoothBrightness method provides an immediate rise of the brightness in response
to a rising voltage and a slow fall in response to a falling voltage, improving persistence of the
visual cue.
To invoke the setSmoothBrightness method to follow the pulse output, you can do as follows:
lights[PULSE_LIGHT].setSmoothBrightness(out, 5e-6f);
where the first argument is the brightness value and the second argument scales the fall duration.
The provided value gives us a nice-looking falling light.
Well done! If you are still with me, you can play with the code and follow on with the exercises.
This chapter has illustrated how to move from an algorithm to its C++ implementation. From now
on, most modules will implement digital signal processing algorithms of increasing complexity, but
don’t worry – we are going to do this step by step.
Exercise 1
Try implementing a selector knob to choose whether the bpm value means beats per minute or bars
per minute. The selector is implemented using an SvgSwitch subclass.
Exercise 2
A useful addition to the module may be a selector for the note duration that divides or multiplies the
main tempo. Try implementing a selector knob to choose between three different note durations:
TIP: In this section, you will learn about an important Rack object, the SchmittTrigger. This
allows you to respond to trigger events coming from other modules.
An N-step sequencer stores several values, ordered from 1 to N, and cycles through them at each
clock event. Values may be continuous values (e.g. a voltage to drive an oscillator pitch in order to
create a melodic riff) or binary values to tell whether at the nth step a drum should be triggered or
not. Of course, in modular synthesis, creative uses are encouraged, but to keep it simple at this time
we will just think of a step sequencer as a voltage generator that stores N values and outputs one of
them sequentially, with timing given by a clock generator.
A step sequencer is a simple state machine that transitions from the first to the last state and resets
automatically at the last one. For each state, the module sends an output value that is set using a knob. The
simplest interface has one knob per state, allowing the user to change any of the N values at any time.
In this section, we want to build a basic eight-step sequencer, with eight knobs and a clock input
(see Figure 6.6). At each pulse of the clock, the state machine will progress to the next state and
start sending out a fixed voltage equal to the one of the knob corresponding to the current state.
Eight lights will be added to show which one is the currently active status.
The challenge offered by this module is the detection of trigger pulses on the input. This can be
simply done using the SchmittTrigger object provided by Rack. A Schmitt trigger is an electronic
device that acts as a comparator with hysteresis. It detects a rising edge and consequently outputs
a high value, but avoids turning low again if the input drops below the threshold for a short amount
of time. The SchmittTrigger implemented in Rack loosely follows the same idea, turning high when
the input surpasses a fixed threshold and turning low again only when the input drops to zero or
below. It provides a process(float in) method that takes a voltage value as input. The method returns
true when a rising edge is detected, and false otherwise. The object relies on a small state machine
Figure 6.6: An eight-step sequencer seen as a state machine. Transitions between states are unidirectional
and are issued by an input trigger. For each state, a sequencer module outputs the voltage value
corresponding to the current state. Conventional sequencers have one knob per state, allowing for easy
selection of the values to be sent to output. Please note that an additional reset button may allow
transitions to happen from the current state toward the first one.
that also detects the falling edge so that the user can always check whether the input signal is still in
its active state or has already fallen down, using the isHigh method.
Our sequencer module, ASequencer, will react to each rising edge at the clock input and advance by
one step. The Module subclass follows:
enum LightsIds {
LIGHT_STEP_1,
LIGHT_STEP_2,
LIGHT_STEP_3,
LIGHT_STEP_4,
LIGHT_STEP_5,
LIGHT_STEP_6,
LIGHT_STEP_7,
LIGHT_STEP_8,
NUM_LIGHTS,
};
dsp::SchmittTrigger edgeDetector;
int stepNr = 0;
ASequencer() {
config(NUM_PARAMS, NUM_INPUTS, NUM_OUTPUTS, NUM_LIGHTS);
for (int i = 0; i < ASequencer::NUM_LIGHTS; i++) {
configParam(PARAM_STEP_1 + i, 0.0, 5.0, 1.0);
}
}
};
As you can see, the knobs also provide storage of the state values.
The process function is as simple as follows:
if (edgeDetector.process(inputs[MAIN_IN].getVoltage())) {
stepNr = (stepNr + 1) % 8; // advance and wrap back to 0 after the last step
}
outputs[MAIN_OUT].setVoltage(params[PARAM_STEP_1 + stepNr].getValue());
}
The SchmittTrigger processes the input signal in order to reveal any trigger pulse. Whenever
a rising edge is detected, the current step is increased, with the caveat that we need to wrap it back
to 0 after the eighth step. We can do that using the C++ modulus operator “%” as shown above, or the bit
manipulation trick shown in Section 2.13. You will find the latter in the actual implementation of the
ABC plugins.
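The edge detection and wrap-around logic can be illustrated with a self-contained sketch. This mimics, in simplified form, the behavior described for dsp::SchmittTrigger; the struct name and threshold value are assumptions, not Rack's implementation:

```cpp
#include <cassert>

// Illustrative Schmitt-trigger sketch (not Rack's dsp::SchmittTrigger):
// goes high when the input crosses a fixed threshold, and re-arms only
// when the input falls to zero or below, as described in the text.
struct SimpleSchmitt {
    bool high = false;
    float threshold = 1.f;

    // Returns true only on the rising edge.
    bool process(float in) {
        if (!high && in >= threshold) {
            high = true;
            return true;
        }
        if (high && in <= 0.f)
            high = false;
        return false;
    }
};

// Step advance with modulus wrap, mirroring the sequencer logic above.
inline int nextStep(int stepNr, int numSteps) {
    return (stepNr + 1) % numSteps;
}
```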
Exercise 1
Now that we have both a Clock and a Sequencer module, we can build musical sequences by driving
the sequencer with a clock at a given bpm. Try building a simple musical phrase using the sequencer,
the clock, and a VCO.
Exercise 2
It may be useful to reset the step counter at any time using a button or a gate input. Try
implementing both using a SchmittTrigger. It will help to debounce the button and detect rising edges on the
input signal.
Exercise 3
Try implementing a binary step sequencer, replacing knobs with switches, for programming drums.
You can use the CKSS parameter type. It has two states, suiting your needs.
TIP: In this section, you will learn how to encapsulate and reuse code.
Frequency-dividing circuits and algorithms are necessary to slow down a clocking signal (i.e.
a periodic pulse train) to obtain different periodic clock signals. In digital electronics (and in
music production too!), these are usually called clock dividers. Many usage examples of
frequency dividers exist. Back in the analog era, DIN SYNC clocks had no standard, and to
synchronize a machine employing 48 pulses per quarter note (ppq) with a slave machine
requiring 24 ppq, one needed to use a clock divider with a division factor of 2. In modular
synthesis, we can use a clock divider for many purposes, including sequencing, stochastic event
generation, and MIDI synchronization (remember that MIDI clock events are sent with a rate of
24 ppq). Binary frequency-dividing trees are employed to obtain eighth notes, quarter notes, half
notes, and so on from a fast-paced clock signal. For this reason, we may want to implement
a tree of binary frequency dividers, which allows us to get multiple outputs from one single
clock, with division factors of 2, 4, 8, 16, and so on. We may see this as an iterative process,
where output[i] has half the frequency of output[i-1] (i.e. the same as implementing multiple
factor-2 frequency dividers and cascading each one of them).
There are many C++ implementations out there of binary frequency-dividing trees even in the VCV
Rack community. Some of them are based on if/else conditional statements. Here, we want to
emphasize code reuse to speed up development. We will implement a clock divider by creating
a struct implementing a factor-2 divider and cascading multiple instances of it, taking each
output out.
We start with the implementation of a 2-divider object implemented as follows:
struct div2 {
bool status = false;
bool process() {
status ^= 1;
return status;
}
};
To divide the clock frequency by a factor of 2, we can simply output a high voltage every second
pulse we receive as input. In C++ terms, we may toggle a Boolean variable status at each call
of its process() function, so that it goes high once every two calls. This is done by XOR-ing the status with 1.
The module, ADivider, has a bunch of div2 objects, one per output. Each div2 object is connected to
a PulseGenerator, which in turn sends its signal to the output. Each divider feeds another divider,
except for the last one, so that at each stage the toggling frequency slows down.
We also use the SchmittTrigger to detect edges in the input signal. In this case, we will activate at the
rising edge of an incoming clock signal, initiating the process of going through our clock dividers.
This is done by a method that must be called recursively to explore the cascade of our div2 objects
until we reach the last of our clock dividers or until the current divider is not active. In other words,
if the current divider activates, it propagates the activation to the next one. The function is called
iterActiv because it iterates recursively looking for activations.
The Module subclass is shown below:
enum LightsIds {
LIGHT1,
LIGHT2,
LIGHT4,
LIGHT8,
LIGHT16,
LIGHT32,
NUM_LIGHTS,
};
div2 dividers[NUM_OUTPUTS];
dsp::PulseGenerator pgen[NUM_OUTPUTS];
dsp::SchmittTrigger edgeDetector;
ADivider() {
config(NUM_PARAMS, NUM_INPUTS, NUM_OUTPUTS, NUM_LIGHTS);
}
Finally, we implement the processing with the cascade of div2 objects. Please note that we start the
recursive iteration only when the SchmittTrigger activates on a clock edge. Each output is processed
so that it gives a +10 V pulse if the corresponding PulseGenerator is high:
if (edgeDetector.process(inputs[MAIN_IN].getVoltage())) {
iterActiv(0); // this will run the first divider (/2) and iterate through the next ones if necessary
}
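To make the cascade concrete, here is a self-contained sketch of a divider tree with a recursive activation function in the spirit of iterActiv. The struct and member names are assumptions, not the book's ADivider code, and the PulseGenerator outputs are replaced by simple pulse counters so the behavior can be checked:

```cpp
#include <cassert>

// A factor-2 divider, as in the text.
struct div2 {
    bool status = false;
    bool process() {
        status ^= 1;
        return status;
    }
};

// Sketch of a divider cascade with recursive activation propagation,
// loosely modeled on the iterActiv idea described above (an illustration,
// not the book's ADivider code).
struct DividerTree {
    static const int N = 6; // /2, /4, /8, /16, /32, /64
    div2 dividers[N];
    int pulseCount[N] = {0};

    // Called once per input clock edge. Runs divider i; if it toggles
    // high, count an output pulse and propagate to the next divider.
    void iterActiv(int i) {
        if (i >= N)
            return;
        if (dividers[i].process()) {
            pulseCount[i]++;
            iterActiv(i + 1);
        }
    }
};
```

Each stage is clocked only when the previous stage goes high, so stage i toggles at half the rate of stage i-1, exactly as the text describes.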
Before closing this section, we may want to note that the execution time of this implementation is
quite variable as in some cycles there will be no call to iterActiv, while in some other cases this will
be called recursively until the last output. For this module, we are not worried by this because the
number of iterations is bounded by NUM_OUTPUTS and the function is quick to execute, but in
DSP practice recursion is possibly dangerous. Always weigh the pros and cons and evaluate your
code thoroughly. In your DSP development, take into account that one overly slow processing cycle
may create a domino effect on other DSP modules or system functions, with unpredictable results.
Suppose that you can implement the same algorithm in two ways: one with constant computational
cost and another with variable computational cost. If the latter has a lower computational cost, on
average, but with rare peaks of computing requirements, the first solution may still be better: the
system will be more robust to unpredictable computing peaks. Of course, it depends a lot on the
application, and experience will teach what is best.
A final remark about the panel graphics. Take a look at the background SVG that was created for
ADivider. The use of a vertical spacing of 40 px was easy to accomplish in Inkscape, since the
Shift+Down combination shifts the selected object by 20 px. Also, it is very easy to create vertical
lines of 40 px length thanks to the ruler function in the status bar of Inkscape. This tells the angle
and distance (i.e. the length) of your line.
Exercise 1
What if you want to reset all your counters (e.g. for a tempo change or to align to a different metro-
nome)? You need a reset method in div2 to be called for each divider at once. You also need a button,
and you need this button to be processed only at its pressure, using a SchmittTrigger. Your turn!
Exercise 2
There are a few lines of code we did not explore – those related to the widget. There is nothing new in
these lines; however, I want you to take a small challenge now: since the number of outputs and
lights is bounded by NUM_OUTPUTS, you may try to write the widget code with a for loop that
instantiates a light and an output port for each of the items in the OutputIds enum. Of course, each
port and light will have coordinates that are dependent on the iteration counter. Remember to con-
nect each port and light to the correct item in the respective enum.
Exercise 3
We ended up coding a few lines to get a nice and useful module. Compare this module with other
modules from the VCV Rack community performing the same function and look at their code. Try to
understand how they work and make a critical comparison: in what respects is this module better, and
what could represent a disadvantage? Take into account computational cost, reusability, code
readability, memory allocation, and so on.
Exercise 4
The dsp namespace (see include/dsp/digital.hpp) has a ClockDivider struct. Try to create a clock
divider tree module using this struct and compare it to the ADivider module.
TIP: In this section, you will learn how to generate random numbers using Rack utility
functions.
Random number generation is an enabling feature for a large array of compositional techniques,
allowing unpredictable behaviors to jump in or even leaving the whole generation process to chance.
Moreover, generating random numbers at the sampling rate is equivalent to generating noise, and thus
a random number generator can be used as a noise generator. Finally, random
numbers are used to randomize the parameters of your module or can be necessary for some
numerical algorithms (e.g. to initialize a variable). For a recap on random signals and random
generation algorithms, please refer to Section 2.12.
VCV Rack provides random number generation routines in its APIs, under the random namespace.
These are:
• void random::init(). Provides initialization of the random seed.
• uint32_t random::u32(). Generates random uniform numbers in the range of a 32-bit
unsigned int variable.
• uint64_t random::u64(). Generates random uniform numbers in the range of a 64-bit
unsigned int variable.
• float random::uniform(). Generates random uniform float numbers ranged 0 to +1.
• float random::normal(). Generates random float numbers following a normal distribution
with 0 mean and standard deviation 1.
Please note that only the normal distribution has zero mean, and thus the other three will have a bias
that is half the range. Of course, from these APIs provided by Rack, you can obtain variations of the
normal and uniform distributions by processing their outputs. By adding a constant term, you can
alter the bias. By multiplying the distributions, you change the range and, accordingly, the variance.
Furthermore, by squaring or calculating the absolute value of these values, you get new distributions.
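As an illustration of these transformations, the following self-contained sketch reshapes a raw uniform generator by scaling and offsetting. Here rand01() stands in for random::uniform() and is not part of the Rack API; the helper names are my own:

```cpp
#include <cassert>
#include <cstdlib>

// Stand-in for random::uniform(): uniform floats in [0, 1).
static float rand01() {
    return (float) std::rand() / ((float) RAND_MAX + 1.f);
}

// Offset: adding a constant term changes the bias (mean).
static float uniformBipolar() {
    return 2.f * rand01() - 1.f; // range [-1, 1), zero mean
}

// Scale: multiplying changes the range and, accordingly, the variance.
static float uniform0to10() {
    return 10.f * rand01(); // range [0, 10), mean at half the range
}
```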
In this section, we describe how a random number generator module, ARandom, is developed to
generate random signals at variable rate. Our module will thus have one knob only, to set the hold
time (i.e. how much time to wait before producing a new output value). This is not unlike applying
a zero-order hold circuit to a noise source (i.e. sampling and holding a random value for a given
amount of time). However, equivalently, and conveniently, we can just call the random generation
routine when we need it. At one end of its excursion, the knob will produce one value per sampling
interval, producing pure noise; at the other end, one value per second, producing a slow but abruptly
changing control voltage. An additional control that we can host on the module is a switch to select between the two
built-in noise distributions (normal and uniform).
The module struct follows:
enum LightsIds {
NUM_LIGHTS,
};
enum {
DISTR_UNIFORM = 0,
DISTR_NORMAL = 1,
NUM_DISTRIBUTIONS
};
int counter;
float value;
ARandom() {
config(NUM_PARAMS, NUM_INPUTS, NUM_OUTPUTS, NUM_LIGHTS);
configParam(PARAM_HOLD, 0, 1, 1, "Hold time", " s");
configParam(PARAM_DISTRIB, 0.0, 1.0, 0.0, "Distribution");
counter = 0;
value = 0.f;
}
The ParamIds enum hosts the hold time knob and a distribution selector. We enhance code
readability by adding another enum to the struct, associating the name of the two distributions to
two numbers (i.e. the two values of the PARAM_DISTRIB switch). Two variables are declared,
a float that stores the last generated value and an integer storing the number of samples since we last
generated a random number. The latter will increase once per sample time, and will be compared to the
hold value to see if it is time to generate a new sample.
Before going into the details of the process function, let us consider the configParam for the hold
time knob. The configured range goes from 0 to 1 second, for simplicity. Obviously, when the knob
is turned to the minimum value, the hold time will be equivalent to one sample, for continuous
generation of random values. When turned to the maximum, the hold time will be equivalent to Fs samples.
We are getting now to the process function. This only has to read the parameters and let the random
routines generate a new value, if it is time to do that. The code follows:
float hold = params[PARAM_HOLD].getValue() * args.sampleRate;
int distr = (int) params[PARAM_DISTRIB].getValue();
if (counter >= hold) {
if (distr == DISTR_UNIFORM)
value = 10.f * random::uniform();
else
value = clamp(5.f * random::normal(), -10.f, 10.f);
counter = 0;
}
counter++;
outputs[RANDOM_OUT].setVoltage(value);
}
The routine first reads the status of the parameters. The number of samples to wait until
the next number is generated is stored into the variable hold and is computed by multiplying
the value of the knob by the sampling rate. The PARAM_HOLD knob goes from 0 to 1
seconds.
If the counter is equal to or greater than3 hold, the module generates a number according to the
chosen distribution. When hold is 0 or 1, the module generates a new sample at each call of
process.
Each distribution is scaled properly to fit the Eurorack voltage range: random uniform will be
ranged 0 to 10 V, random normal will give random numbers around 0 V, possibly exceeding the
−10 to +10 V range, and thus clamping it to the standard voltage range is a good idea.
The rest of the module is quite standard, so you can refer to the source code to see how the module
is created.
Exercise 1
The hold knob maps time linearly in the range [0, 1]. The first quarter turn of the knob compresses
most of the useful values. What about replacing time with tempo? By considering the value on
a beats per minute (bpm) scale, we have a musically meaningful distribution of time values along the
knob excursion range that provides finer resolution where it matters more. Specifically, the knob
should display bpm values in the range 15 to 960. This can be done according to the following:
BPM = 60 · 2^v (6.2)
where the knob value is v ∈ [−2, 4]. This is easily achieved exploiting optional arguments of the
Module::configParam method.
The process method should compute the values according to the above, and convert the bpm value
into a time in samples according to the following:
hold|smp = 60 · Fs / BPM = Fs / 2^v (6.3)
Such a module may be useful for generating random values at a steady tempo, but it is not generating
noise anymore. One workaround to this would be to increase the upper bpm limit to a value that
makes the hold time equal to or shorter than one sample. Do consider, however, that the sample
duration depends on the sampling rate, and thus one would have to adjust the upper limit if the engine
sample rate is changed by the user. Unfortunately, Rack v1 APIs do not allow for this, and thus your
best solution is to consider the worst case (highest sampling rate).
One last tip related to this example: change the strings in configParam to indicate that the knob is
a tempo, not an absolute time.
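The two mappings of Equations 6.2 and 6.3 can be sketched as plain helper functions (the function names are my own):

```cpp
#include <cassert>
#include <cmath>

// Equation 6.2: BPM = 60 * 2^v, with the raw knob value v in [-2, 4].
inline float knobToBpm(float v) {
    return 60.f * std::pow(2.f, v);
}

// Equation 6.3: hold time in samples, 60 * Fs / BPM = Fs / 2^v.
inline float bpmToHoldSamples(float v, float sampleRate) {
    return sampleRate / std::pow(2.f, v);
}
```

The endpoints confirm the requested range: v = −2 gives 15 bpm and v = 4 gives 960 bpm.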
Exercise 2
The module of Exercise 1 is synchronized to an internal bpm tempo. This is useful for generating
rhythmic random modulations, but cannot synchronize with other clock sources. Modify the ARandom
module to include an input port that allows synchronization to an external clock source. By checking
whether the input port is connected, you can exclude the internal clock generation mechanism. This
system is the equivalent of a sample and hold circuit applied to a random generator.
6.9.2 Polyrhythms
By using two sequencers driven by the same clock, you can generate interesting rhythmic patterns.
We can connect AClock to two ASequencer modules, but they both would have eight steps for a 4/4
time signature. You can use ADivider to have them running at multiple rates. However, if you want
to have one running with a 4/4 signature and the other one at a 3/4 signature, for example, you need
to modify ASequencer to add a knob to select how many steps to loop (from one to eight), or have
a Reset input and ensure it is triggered at the right time (after 3/4).
Notes
1 Please note, we are not going to place screws in the ABC modules for the sake of simplicity.
2 The period of a year consists of 365.24 days approximately.
3 Tip: In principle, the condition to trigger the generation of a new number should be counter == hold. However, this
comparison may produce bugs if for some reason (e.g. an overrun due to preemption of Rack, or a variable overwritten by
accident by a buggy write access) we skip this code. From then on, the condition will never be verified, because
counter will always be larger than hold, and the algorithm will fail to generate new random numbers. Thus, it is always
better to use the equal to or greater than condition, to avoid such issues.
CHAPTER 7
Getting Serious:
DSP “Classroom” Modules
By now, you have the basics to develop Rack modules. In this chapter, we will start delving into
DSP stuff, which is ultimately what we are aiming for when we program in VCV. This chapter will
guide you into several examples and will report, along the way, useful tips for DSP programming.
Finally, one of the modules will be modified to allow for polyphony.
The modules presented in this chapter are shown in Figure 7.1.
float getVoltageSum()
This sums together all channels. It is advisable to scale the result by the number of active channels
to avoid overly large signals.
For what concerns polyphonic outputs, we shall see in Section 7.6 how to turn a monophonic
module into a polyphonic one with a few minor tweaks. The setVoltage method, by default, writes
the value to channel 0 of the polyphonic cable. However, the second argument can be exploited to
write the value to any of the 16 channels:
outputs[<outputID>].setVoltage(<value>); // writes to channel 0
outputs[<outputID>].setVoltage(<value>, 1); // writes to channel 1
The system should also be informed about the number of channels that the cable is carrying,
otherwise it will treat it as a default mono cable, even though nonzero values have been written to
the upper channels. The method for setting the number of channels is:
This method also takes care of zeroing those channels that are not used.
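The interplay between per-channel writes, the channel count, and the voltage sum can be mimicked with a toy 16-channel port. This is a simplified stand-in, not Rack's actual Port class:

```cpp
#include <cassert>

// Toy model of a 16-channel polyphonic port, illustrating why setting the
// channel count also zeroes the unused channels (an illustration only,
// not Rack's Port implementation).
struct PolyPort {
    float voltages[16] = {0.f};
    int channels = 1;

    void setVoltage(float v, int channel = 0) {
        voltages[channel] = v;
    }
    void setChannels(int n) {
        channels = n;
        for (int c = n; c < 16; c++)
            voltages[c] = 0.f; // clear channels above the active count
    }
    float getVoltageSum() {
        float sum = 0.f;
        for (int c = 0; c < channels; c++)
            sum += voltages[c];
        return sum;
    }
};
```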
When designing polyphonic modules, special care must be taken to match the cardinality of the
input and output channels. Let us consider a poly filter with N inputs. This should produce N filtered
outputs by instantiating N filter engines. Similarly, a poly oscillator with N V/OCT inputs should
produce N outputs. How about the secondary inputs?
The method handles all cases, provided that 0 V is written to input channels that are not used (this is
normally done by the setChannels method, executed beforehand by the module providing the cable).
All this will be discussed with an example later in this chapter.
Astep = Tg / (Fs (At + ε)) (7.1)
If this is clear, we can generalize the reasoning to the other two step evaluations (i.e. those of the
decay Dstep and the release Rstep). In these cases, the target value is the distance to travel from the
top (1.0) to the sustain level and from the sustain level to zero. To compute Rstep, we also add ε
to the numerator, to avoid a zero step when the sustain level is zero.
This is the code for computing the steps:
Please note that the step constants are clamped to avoid overshoots, because the smallest attack
times may yield a step larger than the entire excursion range, depending on the EPSILON and the
sampling rate.
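A step constant of the form of Equation 7.1 can be computed and clamped as in the following sketch (the EPSILON value, the clamping range, and the function names are assumptions, not the book's listing):

```cpp
#include <cassert>
#include <algorithm>

// Assumed small constant to avoid division by zero (Equation 7.1).
static const float EPSILON = 1e-4f;

inline float clampf(float x, float lo, float hi) {
    return std::min(std::max(x, lo), hi);
}

// target: signed distance to travel; time: stage time in seconds; fs: sample rate.
// The result is clamped so the smallest times cannot produce a step larger
// than the whole excursion range.
inline float envStep(float target, float time, float fs) {
    float step = target / (fs * (time + EPSILON));
    return clampf(step, -1.f, 1.f);
}
```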
A state machine determines the envelope generation stage and requires a SchmittTrigger to detect
a rising edge on the gate input for determining the attack phase. If the state machine is properly
built, the release phase is inferred by a low value of the gate signal. Overall, the finite-state machine
corresponding to an ADSR EG is reported in Figure 7.2.
The state machine we are building needs to go to Attack any time the rising edge is detected. While
the gate input is active, the state machine controls the status of the envelope, conditionally switching
Figure 7.2: The state machine corresponding to an ADSR envelope generator. Jumps to the Attack phase or
to the Release phase on gate rising or falling edge are shown by the dashed line.
to the next phase: when the target value is reached, the Attack ends and the Decay starts. Similarly,
when the sustain value is reached, the Decay stage ends, and the Sustain stage begins. In Release,
when zero is reached, the envelope generation is considered closed and zero will be written to the
output until a new rising edge gate input is detected.
The code follows:
if (isRunning) {
if (gate) {
//Attack
if (isAtk) {
env += Astep;
if (env >= 1.0)
isAtk = false;
}
else {
//Decay
if (env <= sus + 0.001) {
env = sus;
}
else {
env += Dstep;
}
}
} else {
//Release
env += Rstep;
if (env <= Rstep)
isRunning = false;
}
} else {
env = 0.0;
}
if (outputs[OUT_ENVELOPE].isConnected()) {
outputs[OUT_ENVELOPE].setVoltage(10.f * env);
}
One thing worth mentioning: the target value is set to 1.0 for simplicity, and the generated envelope
is multiplied by 10 in the last lines to comply with the 10 V Eurorack standard.
Exercise 1
In Equation 7.1, we used a small value ε to avoid division by zero. Conversely, we can enforce At ≥ ε
when we set the knob excursion range in the configParam method, and save the sum in the denominator
of Equation 7.1. Try this out in your code for all the step evaluations and the time input values.
You will not save much in computational terms here, but, in the code optimization stage, reducing the
number of operations that are done at least once per sample sums up to an improved outcome.
Exercise 2
Generalizing the concept of the EG, you can devise a module that generates an envelope signal by
ramping between N points, of which you can set the time and the amplitude by knobs. Figure 7.3
provides an idea of this envelope generation method. Please note, this module also accepts a trigger
signal as input, since the falling edge of the gate has no use.
Figure 7.3: An envelope generator following a generic scheme with N points with adjustable amplitude
and time. Amplitude and time are shown as coordinates of the points (Ai, ti).
Exercise 3
Referring to the module of Exercise 2, can you allow the state machine to loop indefinitely between
the N points? What kind of module are you obtaining if the loop period is very short, say lower than
approximately 20 ms?
TIP: In this section, you will learn how to model a simple RC circuit, making sense of those
high school lessons on electric circuits. We will take the circuit from the analog to the digital
domain, encapsulate it in a class, and use it to process a signal.
The linear envelope generator we developed in the previous section has somewhat limited use in
musical synthesizers for two reasons: our perception of loudness is not linear, and analog envelope
generators are usually exponential, due to the simplicity of generating exponential rise and fall
curves. For this reason, we are now going to implement an envelope generator with exponential
curves, writing the process function from scratch. We could just adapt the code of the linear
envelope generator to this task by calculating the exponential of the linear ramp, but this could be
expensive. Transcendental functions, such as logarithms, exponentials, and trigonometric functions
from the math library, are expensive as they are computed with high precision. In this context,
we can obtain an exponential decay much more efficiently.
There are a few ways to implement an exponential ADSR. The one I prefer to report here is easy,
reusable, and efficient, and it does not involve explicit evaluation of exponentials.
In other words, it varies exponentially, decaying from an initial value Vi with a time constant that is
generally defined as τ = RC. Please note that τ is the time required for the signal to fall to Vi/e
when the initial voltage is Vi and a null voltage is applied to the input (short circuit). The RC circuit
is also known to be a passive low-pass circuit (see Figure 7.4).
Let us move through the process of obtaining a discrete-time model of the continuous-time RC circuit. We
are interested in the input–output relation. We can get this from evaluating the voltage drop across the
resistance. According to Ohm’s law, this is proportional to the current that flows through it according to:
Figure 7.4: An analog first-order low-pass RC filter. If the capacitor is charged (vo(t) = vi(t) = Vi for t < 0)
and the input voltage drops to zero for t = 0, the output voltage follows a falling exponential curve.
We can also look at the circuit and note that vi(t) − vo(t) = vR(t). Now we can show that the current
flowing into a capacitor is:
i(t) = C · dvo(t)/dt    (7.4)
which is derived by plugging the definition of capacitance C = Q/v(t) into the definition of
current i(t) = dQ/dt.
By combining Equations 7.3 and 7.4, we see that:
vi(t) − vo(t) = RC · dvo(t)/dt    (7.5)
This differential equation can be discretized to obtain a discrete-time implementation of the RC filter
approximation:
x[n] − y[n] = RC · (y[n] − y[n−1]) / T    (7.6)
where we applied the definition of the derivative in the discrete-time domain, seen in Section 2.6, using
Δx = T to make time consistent across the two domains:
dvo(t)/dt → (y[n] − y[n−1]) / T    (7.7)
where T is the sampling period (i.e. T = 1/Fs) and y[n], x[n] are the discrete-time output and input
voltage, respectively. Note that the input sequence is known and the previous output value can be
stored in a memory element. We can thus rearrange Equation 7.6 to evaluate the output sequence as:
y[n] = a·y[n−1] + (1 − a)·x[n]    (7.8)

where:

a = RC/(RC + T)    (7.9)
Equation 7.8 informs us that the output is the cumulative contribution of itself and a new input sample.
T process(T xn) {
yn = a * yn1 + (1-a) * xn;
yn1 = yn;
return yn;
}
This way, the time set from the module knobs will be the tau time (i.e. the time to reach
1 − 1/e ≈ 63%, roughly 2/3, of the whole excursion).
Now that we have designed our RC filter, we only need the control code to build our new module.
Different from the linear ADSR, this one will generate constant voltage values that the RC filter will
follow. To create the attack ramp, for example, we let the RC filter follow a step signal. The RC
filter will react slowly to the step signal, getting closer and closer to the constant value of the step
signal, in an exponential fashion. Once the signal reaches the target value (or gets very close to it),
the step function drops to the sustain level, allowing the filter to tend to its value, and so on, as
shown in Figure 7.5. The gate rising and falling edges are detected, employing the SchmittTrigger
class already discussed in the previous chapters.
Figure 7.5: The RC filter output (thick solid line) following a step-like function (thin solid line). Due to its
inner nature, the RC filter will decay or rise exponentially.
SchmittTrigger gateDetect;
RCFilter<float> * rcf = new RCFilter<float>(0.999);
The process function is implemented below, where you can find the knob parameter reading, the
gate processing, and the ADSR state machine:
if (isRunning) {
if (gate) {
if (isAtk) {
//Attack
rcf->setTau(Atau);
env = rcf->process(1.0);
if (env >= 1.0 - 0.001) {
isAtk = false;
}
}
else {
//Decay
rcf->setTau(Dtau);
if (env <= sus + 0.001)
env = sus;
else
env = rcf->process(sus);
}
} else {
//Release
rcf->setTau(Rtau);
env = rcf->process(0.0);
if (env <= 0.001)
isRunning = false;
}
} else {
env = 0.0;
}

if (outputs[OUT_ENVELOPE].isConnected()) {
outputs[OUT_ENVELOPE].setVoltage(10.0 * env);
}
Please note that the exponential function reaches the target value only asymptotically, and thus the
condition for switching from attack to decay and from decay to sustain is based on a threshold level,
arbitrarily set to 0.001 in the code above.
All the other parts of the module follow the linear ADSR we already designed.
Note: The Rack API provides a state-variable first-order RC filter, implemented in include/dsp/filter.
hpp. Its process method differs from the one we implemented here, as it does not return an output
value. This is because it provides two output signals, low-pass and high-pass, which can be obtained
separately after each call to process by invoking the low-pass and high-pass member functions.
Furthermore, it does not have a method to set the decay rate parameter tau.
Exercise 1
This module builds from theory quite easily. We are going further in understanding bits of circuit
theory and digital signal processing – well done! Now take a look at other envelope generators out
there in the VCV Rack community. You will find that many implementations are less efficient compared
to this one (e.g. some of them compute the exponentiation using powf(), which is not as efficient as
two sums and multiplications), and they are not as simple. Of course, some of them will
handle more intricate functions in addition to ADSR generation that this implementation may not
allow, so try to make a fair comparison. To compare the computational resources required by each
module, you can turn on the CPU Timer from the Engine menu.
TIP: In this section, we will reuse code, showing the benefits of OOP in a DSP context.
The envelope of a signal represents its short-term peak value and outlines its extremes, as shown in
Figure 7.6. This is done by extracting local peaks and smoothly interpolating between them.
Figure 7.6: A sinusoidal signal modulated by a sine wave of lower frequency. The envelope (solid bold line)
of the signal shows the modulating low-frequency sine wave.
Detecting the envelope of a signal is useful for level monitoring, dynamic effects, gain reduction,
demodulation, and more. Depending on the task, we may be interested in getting the peak envelope,
as in Figure 7.6, or the root mean square envelope. Here, we shall focus on the peak value.
The diode is described by the well-known exponential law (the Shockley equation):

iD(t) = Is · (e^(vD(t)/(η·VT)) − 1)

where vD(t) is the voltage across the diode and iD(t) is the diode current. From the equation, we
observe that the time-varying current is related to the time-varying voltage, while all other terms are
constants related to the fabrication process, silicon properties, and temperature. Let us consider a simple
circuit including a resistor, as shown in Figure 7.7. When the voltage vD(t) is positive (Figure 7.7a),
the current flows across the diode, charging the capacitor and producing a voltage at the envelope
detector output terminals. When the voltage is negative (Figure 7.7b), the diode cannot conduct (it
does produce an extremely small current Is, which is negligible in this context) and the output voltage is the voltage
of the charged capacitor. A faithful model of the diode current requires some more computational
power (e.g. see D’Angelo et al., 2019), which is not necessary here. In the context of this chapter,
(a) (b)
Figure 7.7: A diode-resistor circuit. When a positive voltage V+ is applied, the output voltage is positive.
When a negative voltage is applied, the output is zero and the reverse current flow is stopped by the diode.
we can very roughly approximate a diode with a switch that conducts current when the voltage at its
input is positive (current runs in the forward direction) and blocks current when the input voltage is
negative (current would run in the reverse direction). This is shown in Figure 7.7. Note how the diode symbol
used in schematics recalls this mechanism, being composed of an arrow in the forward direction and
a perpendicular line, like a barrier, in the reverse direction.
In our software applications, we can consider the diode as a box that simply applies the following
C/C++ instruction:
y = x > 0 ? x: 0;
where x and y are, respectively, the input and output virtual voltage.
A simple envelope detector can be made of a series diode and a shunt RC circuit, as depicted in
Figure 7.8.
We shall thus consider two scenarios: when the diode is active, it acts as a pump that charges the
capacitor, while when it is interdicted the circuit reduces to a parallel RC network. The resistor in
parallel with the capacitor will slowly discharge it, as shown in Figure 7.9b. The values of the resistor
and the capacitor determine the decay time constant. This can be left as a user parameter, as we want to
keep the system usable over a wide range of input signals.
We can improve this design slightly by rectifying the negative wave too, resulting in a full-wave
rectifier. This is done using a rectifying diode bridge: when four diodes are connected together as
Figure 7.8: A simple diode envelope follower and the resulting output, approximating the ideal envelope.
Figure 7.9: Equivalent circuit of the envelope detector during diode conduction (a) and diode interdiction
(b). In the first case, the diode injects current that charges the capacitor, while during interdiction the
capacitor discharges through the resistor.
Figure 7.10: A diode rectifier or diode bridge. The output voltage takes the absolute value of the input
voltage. In other words, the negative wave is rectified.
shown in Figure 7.10, the output is approximately the absolute value of the input. In C/C++, this is
obtained by an if statement of the kind:
if (x < 0)
y = -x;
else
y = x;
or the equivalent:
y = (x > 0) ? x: -x;
In terms of efficiency, we have often repeated that conditional instructions, such as if or the “?” operator,
should be avoided. However, simple instructions, such as those for the absolute value or for
rectification, can be compiled efficiently. If compiler optimizations are enabled, the compiler is able to
translate the conditional instruction into a dedicated processor instruction that implements the statement
in a few cycles. A compiler explorer, such as https://godbolt.org/, allows you to watch how
a snippet of C++ code is compiled into assembler. You can provide the pertinent compiler flags (those
related to the code optimization and math libraries that the Rack makefile imposes) and observe what
processor instructions are invoked to implement your C++ statements. As an example, consider the
following implementation of the absolute value:
float absol(float x) {
return (x > 0) ? x : -x;
}
The code is compiled into a single instruction: andps, which basically is a bitwise AND, showing how
bit manipulation can do the trick. You can also watch how the standard C++ implementation of the
absolute value is compiled by replacing the previous code with the following:
#include <cmath>
float absol(float x) {
return std::abs(x);
}
The diode alone is not sufficient for recovering the envelope. We need at least an additional element,
a capacitor, to store the charge when the maximum value is reached. Figure 7.11 shows how the
capacitor retains the value of a rectified wave and a fully rectified wave (absolute value). The ripple
in the estimated envelope must be as low as possible, and the full-wave rectifier helps in this regard.
Changing the capacitor value (i.e. the discharge time) affects the ripple amount.
Figure 7.11: Comparison between the diode rectifier (a) and the full-wave rectifier (b). The output of the
diode or the full-wave rectifier (solid thin line) is shown against the output of the envelope detector (bold
solid line).
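The chapter encapsulates this diode-plus-RC behavior in an RCDiode class, whose charge method is discussed next. The following is a minimal self-contained sketch (names follow the chapter; the repository version may differ, e.g. setTau here takes the sample rate explicitly so the snippet can stand alone):

```cpp
#include <cassert>

// Sketch of the RCDiode: process() applies the usual one-pole RC decay
// toward the input, while charge() instantly sets the stored voltage,
// as the conducting diode charging the capacitor would.
template <typename T>
struct RCDiode {
    T a;    // feedback coefficient
    T yn1;  // stored capacitor voltage

    explicit RCDiode(T coeff = T(0.999)) : a(coeff), yn1(T(0)) {}

    void setTau(T tau, T Fs) { a = tau / (tau + T(1) / Fs); }

    // Natural RC response toward the input xn (diode interdicted).
    T process(T xn) {
        yn1 = a * yn1 + (T(1) - a) * xn;
        return yn1;
    }

    // Instantly charge the capacitor to vi (diode conduction).
    T charge(T vi) {
        yn1 = vi;
        return vi;
    }
};
```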
The charge method sets the current output and the previous state (the stored voltage) to a given
value (which is also returned for convenience):
T charge(T vi) {
yn = yn1 = vi;
return vi;
}
Since we delegated most of the DSP code to RCDiode, the process function results in a fairly
short snippet, as follows:
rcd->setTau(tau);
float rectified = std::abs(inputs[MAIN_IN].getVoltage());
if (rectified > env)
env = rcd->charge(rectified);
else
env = rcd->process(rectified);
if (outputs[MAIN_OUT].isConnected()) {
outputs[MAIN_OUT].setVoltage(lights[ENV_LIGHT].value = env);
}
Figure 7.12: AM modulation and demodulation patch. The VCO-1 (carrier signal) is modulated by LFO-1
(low-frequency modulating signal) using a conventional VCA. The resulting signal is demodulated using the
envelope follower. The scope shows the oscillating modulated signal and the envelope on top of it.
As you can see, the charging is done in one line as the charge method return value is assigned
to env.
The tau value is clamped, as it was for the exponential ADSR, to avoid overly quick or slow filter
responses. When the rectified voltage is larger than the current output envelope, the charge method
takes over, so that the output envelope follows the rectified voltage and the capacitor is virtually
charged. Otherwise, there is no input to the RC filter and the output voltage follows its natural decay.
A good way to test this module is to perform amplitude modulation and then reconstruct the
envelope with this module. This is shown in one of the demo patches, depicted in Figure 7.12.
Exercise 1
Try alternative calculations of the peak amplitude. You may experiment with a simple max hold algo-
rithm with linear decay, such as:
float in = std::abs(inputs[MAIN_IN].getVoltage());
if (in > env + 0.001) {
env = in;
} else {
env += Rstep;
}
You need to adapt the Rstep from the linear envelope generator of Section 7.2.
Try to enumerate the differences between this and the previous method. Is the resulting envelope
going to be smooth?
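As a hint, one possible adaptation (an illustrative sketch; the linearReleaseStep helper and the Rtime parameter are hypothetical names): with the envelope normalized to [0, 1], a linear decay that reaches zero in Rtime seconds needs a negative per-sample step:

```cpp
#include <cassert>
#include <cmath>

// Hypothetical per-sample decrement for a linear release of Rtime
// seconds at sample rate Fs, added once per sample to the envelope.
float linearReleaseStep(float Rtime, float Fs) {
    return -1.f / (Rtime * Fs);
}
```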
Exercise 2
Try adding a smoothing RC filter to the output of the envelope follower to improve the quality of the
detected envelope. Can you notice the phase delay applied to the envelope signal? Does it change
according to the tau parameter?
TIP: This section shows a fundamental building block of any musical toolchain: the digital
state-variable filter. Its formulation is derived by discretization using basic virtual analog
notions. Analog state-variable filters have been used in many famous synthesizers, such as the
Oberheim SEM.
This section introduces you to the design and development of one of the most used digital filters in
musical applications. Filter design is a wide topic and research is still ongoing, with statistical and
computational intelligence experts chiming in to find novel optimization techniques to fulfill filter
design constraints. In musical applications, however, a few filter topologies are used most of the
time and need to be learned. Among these, the state-variable filter is “a must” because it has several
interesting features:
• It is multimode, having low-pass, high-pass, and band-pass outputs all available at once.1
• It has independent control of resonance and cutoff.
• It is rather inexpensive and has good robustness to sudden filter coefficient modifications.
The RC filter of Section 7.3 is nothing but a first-order low-pass filter, and it has somewhat limited
applicability for processing audio signals. In this section, we are going to introduce the second-order
state-variable filter (SVF). Its name comes from the fact that it is formulated according to a
state-space model (Franklin et al., 2015), gaining advantage in several regards if compared to
other filter formulations.
Figure 7.13: The topology of a state-variable filter. This mathematical description of the SVF can be
implemented in both the analog and the digital domain.
The filter has three outputs, yH(t), yL(t), and yB(t): the high-pass, low-pass, and band-pass outputs,
respectively. Furthermore, a notch output can be calculated as yN(t) = yL(t) + yH(t).
The filter requires two integrators. In the analog domain, these are implemented by operational
amplifiers with a feedback capacitor. In the digital domain, we can approximate an ideal integrator,
as discussed in Section 2.6. With this filter topology, the choice of the integrator type is imposed by
problems of delay-free loops. Indeed, if choosing a backward rectangular integrator for the first
integrator, we would have the flow graph shown in Figure 7.14a. The output of the block would not
be computable in a discrete-time setting as yB ½n depends on itself. Looking at the flow graph, you
can extract the expression:
(a) (b)
(c)
Figure 7.14: Discretization of the state-variable filter topology. The circuit surrounding the first integrator is
discretized using the backward rectangular integration (a), resulting in a delay-free loop that prevents
computability. If forward rectangular integration is used, the circuit is computable in a discrete-time setting (b).
A more efficient version of this circuit is shown in (c), where the two delays are refactored into one.
Getting Serious 167
This mathematical model of the filter is legitimate, but it cannot be computed, and thus it would be of
no use to us. By using a forward rectangular integrator, we obtain a computable system, as shown in
Figure 7.14b, leading to the following expression:
yB[n] = D{ωc·T·(u[n] + yB[n])} + yB[n−1]
      = ωc·T·(u[n−1] + yB[n−1]) + yB[n−1]    (7.12)
where we denoted the delay operator as D{·} (corresponding to the z⁻¹ symbol in the figures).
Before moving on, there is a small step we can take to improve the efficiency of the system by
removing one memory element. Since all the addends in Equation 7.12 are delayed by one sample
(i.e. all are referred to time instant [n−1]), we can just use one delay element at the end of the
forward branch that applies to all the addends, as shown in Figure 7.14c. The change in the flow
graph is reflected in the code implementation. Please note that the mathematical expression is still
that of Equation 7.12, and thus we have not changed the system; we have just changed its
implementation to make it feasible.
The discretization of the second integrator is not subject to computability issues now, and thus we
can simply use the backward rectangular integrator that only requires one memory element. The
discrete-time flow graph for the SVF is shown in Figure 7.15, where γ = 2ρ. This topology is also
known as the Chamberlin filter (Chamberlin, 1985). Similar topologies have been derived by
Zölzer and Wise (Wise, 2006). A different route has been taken by Zavalishin, using trapezoidal
integration (not covered in this book) and delay-free loop elimination techniques (Zavalishin,
2018).
The filter coefficients have stayed the same in the digital domain, besides the sampling time T.
The cutoff coefficient is thus:

φ = ωc·T = 2π·Fc·T    (7.13)
To translate the signal flow graph into C++ code, we can look at Figure 7.15. At time step n, we
can start from the band-pass output, looking back to the previous band-pass output and the previous
high-pass output. Then we move on, evaluating the low-pass output from the current band-pass
output and the previous low-pass output. Finally, we evaluate the high-pass output, taking the current
input, band-pass, and low-pass outputs. The equations are rewritten in code as:
bp = phi*hp + bp;
lp = phi*bp + lp;
hp = x - lp - gamma*bp;
Another issue related to the discrete-time implementation of the filter is stability. A high cutoff
frequency and a large damping factor can lead to instability (the filter output diverges to infinity,
making it unusable). Stability conditions that ensure a safe use of the filter were derived, for
example, by Dattorro (2002):
0 ≤ φ ≤ 1;   0 ≤ γ ≤ 1    (7.15)
These conditions are slightly conservative and limit the cutoff frequency to approximately π=3 (i.e.
Fs =6). Other stability conditions can be obtained that consider both the coefficients and allow for
a higher cutoff frequency by reducing the damping (e.g. see Wise, 2006). If you are curious enough
and keen to dig into the scientific literature, you will find other discretizations of the SVF or
variations of the Chamberlin filter that can improve the usability range (e.g. see Wise, 2006;
Zavalishin, 2018).
public:
void reset() {
hp = bp = lp = 0.f;
}
Following Rack guidelines, we declare it as a struct, which will make the methods and members
public by default. The setCoeffs method clamps the coefficients to enforce the stability conditions.
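A minimal self-contained sketch of such a struct follows. The coefficient phi uses the prewarped form 2·sin(π·Fc/Fs) (an assumption here, consistent with the sin() cost mentioned in Exercise 1), setCoeffs takes the sample rate explicitly so the snippet stands alone, and the clamping follows the conservative bounds of Equation 7.15; the class in the book repository may differ in details:

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Sketch of the SVF struct: Chamberlin update order, with coefficients
// clamped to the conservative stability region 0 <= phi, gamma <= 1.
template <typename T>
struct SVF {
    T phi = T(0), gamma = T(1);
    T hp = T(0), bp = T(0), lp = T(0);

    void reset() { hp = bp = lp = T(0); }

    void setCoeffs(T fc, T damp, T Fs) {
        const T kPi = T(3.14159265358979);
        phi = std::min(T(2) * std::sin(kPi * fc / Fs), T(1));
        gamma = std::min(std::max(T(2) * damp, T(0)), T(1));
    }

    // One sample of the filter, as in the flow graph of Figure 7.15.
    void process(T x, T *hpf, T *bpf, T *lpf) {
        bp = phi * hp + bp;
        lp = phi * bp + lp;
        hp = x - lp - gamma * bp;
        *hpf = hp; *bpf = bp; *lpf = lp;
    }
};
```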
The state-variable filter module can be designed like a wrapper for the SVF class, as it needs only to
handle input and output signals and to set the filter cutoff and damping parameters. The process
function follows:
outputs[LPF_OUT].setVoltage(lpf);
outputs[BPF_OUT].setVoltage(bpf);
outputs[HPF_OUT].setVoltage(hpf);
}
The module is monophonic, but the input gracefully handles the presence of multiple signals by
summing them all into one, using the getVoltageSum method.
The reader is encouraged to read the rest of the code in the book repository.
Exercise 1
Evaluating the coefficient f requires a sin(), which has a cost. We should avoid calculating the
coefficients each time the process function is invoked, by checking whether the input arguments have
changed. Can you formulate a way to do this? This trick reduces the CPU time by at least a factor of
2 on my system. What about yours? Observe the values reported by the CPU timer with and without
the trick.
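One possible realization of the trick (a sketch; the prevFc/prevDamp members and the updates counter are hypothetical illustration, not the book's code):

```cpp
#include <cassert>
#include <cmath>

// Recompute the coefficients only when the parameters actually change.
struct CoeffCache {
    float prevFc = -1.f, prevDamp = -1.f;  // invalid values force the first update
    float phi = 0.f, gamma = 0.f;
    int updates = 0;                       // counts the expensive recomputations

    void update(float fc, float damp, float Fs) {
        if (fc == prevFc && damp == prevDamp)
            return;  // nothing changed: skip the sin()
        prevFc = fc;
        prevDamp = damp;
        phi = 2.f * std::sin(3.14159265f * fc / Fs);
        gamma = 2.f * damp;
        updates++;
    }
};
```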
Exercise 2
The cutoff of the filter is a very important parameter for the user, and the linear mapping used here is
not so handy, as the most relevant frequency range is compressed into a short knob turn. Try to
implement a mapping that provides unequal spacing, such as an exponential one, where v is the knob
value and the range spanned by the knob is approximately 10 Hz to 10 kHz.
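One such mapping (an assumption for illustration, not necessarily the book's exact formula) is Fc = 10 · 10^(3v) with v in [0, 1], which spans 10 Hz to 10 kHz, dedicating equal knob travel to each decade:

```cpp
#include <cassert>
#include <cmath>

// Hypothetical exponential mapping of a [0, 1] knob value v to a cutoff
// frequency: 10 Hz at v = 0, 10 kHz at v = 1.
float knobToCutoff(float v) {
    return 10.f * std::pow(10.f, 3.f * v);
}
```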
Exercise 3
One annoyance of the formulation followed here is the use of a damping coefficient, which is quite
unusual for musical use. Can you figure out how to replace the damping knob with a resonance or
“Q” knob (i.e. inverting its value range)?
Exercise 4
The filter cutoff is an important parameter to automate by means of control signals. Add a cutoff CV
input to the module, so that it can follow LFO or EG signals.
Exercise 5
After adding the cutoff CV, you will run into a computational cost issue: if the CV signal is not constant,
the CPU-preserving trick of Exercise 1 will lose its effectiveness. Indeed, if the coefficients are continuously
modulated, a faster implementation of the sine function will help. We discuss this topic in Sections
10.2.2–10.2.3. Try to implement these solutions and evaluate the effect of the approximation on the CPU
time and the precision of the coefficient.
TIP: In this section, you will learn how to handle input and output polyphonic cables.
Up to now, we have been considering ports and cables in the traditional way (i.e. as single-signal
carriers). However, Rack allows you to transmit multiple signals in one cable, hence the term
“polyphonic cable.” This feature can be exploited in multiple ways. One could, for example, convey
heterogeneous data over one cable for convenience, such as pitch, velocity, gate, and aftertouch CVs
coming from Core MIDI-CV. This is totally legitimate and reduces wires on the screen. However,
the dual of wire reduction is module reduction, and I think that the true power of polyphonic cables
resides in this scenario. Harnessing polyphonic cables allows you to obtain instant polyphony
without having to duplicate modules. This, in turn, reduces the computational burden by avoiding
drawing multiple modules and performing a lot of overhead function calls.
In this section, we are going to see how simple it is to make the state-variable filter module
polyphonic. The only thing we need to duplicate is the number of SVF instances. The process
function needs to perform the filtering and input/output handling in a for loop. If you look at the
differences between ASVFilter.cpp and APolySVFilter.cpp, you will notice some more differences
related to a change of the enum names, in order to clarify which ports are polyphonic, but this is
just for the sake of readability.
Please note that we are going to take advantage of the code for Exercises 2 and 4 from the previous
section, and thus we are including the exponential mapping of the frequency knob and the presence
of a CV for the frequency (this will be polyphonic too).
The main differences in the code are outlined in the following. The filter member is now an array of
pointers to instances of SVF, instead of a single pointer. The constructor creates new instances of the
SVF<float> type assigning one to each pointer, while in the monophonic module this was done
directly during the declaration of filter. The polyphonic module thus looks like:
enum OutputIds {
POLY_LPF_OUT,
POLY_BPF_OUT,
POLY_HPF_OUT,
NUM_OUTPUTS,
};
enum LightsIds {
NUM_LIGHTS,
};
SVF<float> * filter[POLYCHMAX];
float hpf, bpf, lpf;
APolySVFilter() {
config(NUM_PARAMS, NUM_INPUTS, NUM_OUTPUTS, NUM_LIGHTS);
configParam(PARAM_CUTOFF, 1.f, 2.5f, 2.f, "Cutoff");
configParam(PARAM_DAMP, 0.000001f, 0.5f, 0.25f);
The process function now wraps in a for loop the computing of the frequency (the sum of the
frequency knob and the CV input), the filtering, and writing of the outputs. The reading of the knob
is done only once, before the loop. One thing to consider at this point is the different methods
provided by the Rack API to handle multiple channels. The getVoltage and setVoltage methods are
defined using default arguments to allow getting and setting the values of each channel:

float getVoltage(int channel = 0)
void setVoltage(float voltage, int channel = 0)
In fact, up to now, we always exploited these methods without providing the channel number, thus
defaulting to the first channel. As outlined in previous chapters, all cables in Rack are polyphonic by
design; they just change their appearance when they carry one channel only.
The number of channels on an input port can be detected by the following:
int getChannels()
The following method sets the number of output channels, and automatically zeroes the unused
channels (if any):

void setChannels(int channels)
int ch;
for (ch = 0; ch < inChanN; ch++) {
float fc = knobFc +
std::pow(rescale(inputs[POLY_CUTOFF_CV].
getVoltage(ch), -10.f, 10.f, 0.f, 2.f), 10.f);
filter[ch]->setCoeffs(fc, damp);
filter[ch]->process(inputs[POLY_IN].getVoltage(ch), &hpf,
&bpf, &lpf);
outputs[POLY_LPF_OUT].setVoltage(lpf, ch);
outputs[POLY_BPF_OUT].setVoltage(bpf, ch);
outputs[POLY_HPF_OUT].setVoltage(hpf, ch);
}
outputs[POLY_LPF_OUT].setChannels(ch); // after the loop, ch == inChanN
outputs[POLY_BPF_OUT].setChannels(ch);
outputs[POLY_HPF_OUT].setChannels(ch);
}
This is it. Please notice that to avoid unnecessary processing, we limit the number of iterations in
the for loop to inChanN (i.e. the number of input channels). We could always loop over the entire
16-channel bundle; however, this would only be a waste of CPU resources. The number of active
input channels is obtained from the getChannels() method. On the other hand, we do not care about
the number of channels at the POLY_CUTOFF_CV input, since it only affects the behavior of the
filter cutoff.
One important tip: the transition from monophonic to polyphonic is simplified by the fact that input
and output ports are polyphonic by default, but mono-compatible.
At this point of the book, this is the only polyphonic module we have developed. For a quick test of its
functionalities, however, we can rely on Fundamental modules Split and Merge.
Note
1 Additionally, band-stop mode can be achieved by summing the low-pass and high-pass outputs.
CHAPTER 8
This chapter deals with some more advanced topics that may be useful in designing Rack
modules. Since the range of topics to cover is incredibly wide, three selected topics have been
covered that open endless exploration:
• modal synthesis;
• virtual analog oscillators with limited aliasing; and
• waveshaping with reduced aliasing.
The latter two are surely very complex, especially in their mathematical details. For this reason, we
will cover some theoretical background, but we will limit the complexity of the discussion, resorting
to intuition as much as possible.
The modules in this chapter can get computationally expensive (see Figure 8.1). Each section
discusses computational cost issues.
TIP: This section gives you a first look into physical modeling principles. You will also learn how
to provide additional options using the context menu.
Modal synthesis (Adrien, 1991) is a sound generation technique strongly connected to physical
modeling of acoustic instruments and virtual acoustics. It is based on the analysis of modes (i.e. the
resonant frequencies of a vibrating body) and uses this knowledge to synthesize sound in an
expressive and natural way. When used in physical modeling applications, the analysis of a body (a
struck metal bar, a piano soundboard, etc.) is done in the time/frequency domain to characterize the
number of modes, their frequency, bandwidth, and decay times. On the basis of the analysis stage,
sound synthesis is performed employing resonating filter banks tuned to emulate the body resonant
modes.
As with additive synthesis, the modal synthesis process can be computationally heavy, depending on
the number of modes to emulate, because each mode requires one resonator. A resonator can be
implemented by a second-order filter with tunable frequency and damping (or, inversely, Q factor).
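For illustration, one classic way to realize such a resonator is a two-pole filter with a complex-conjugate pole pair at radius r (which sets the decay) and angle 2πf/Fs (which sets the frequency). This generic textbook form is only a sketch; the module developed below builds its resonators differently (from the SVF seen in the previous chapter):

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Generic two-pole resonator: y[n] = x[n] + b1*y[n-1] + b2*y[n-2],
// with b1 = 2*r*cos(w), b2 = -r*r, w = 2*pi*f/Fs. An impulse excites
// a decaying sinusoid at frequency f, ringing longer as r approaches 1.
struct TwoPoleResonator {
    float b1 = 0.f, b2 = 0.f;  // feedback coefficients
    float y1 = 0.f, y2 = 0.f;  // past outputs

    void setCoeffs(float f, float r, float Fs) {
        float w = 2.f * 3.14159265f * f / Fs;
        b1 = 2.f * r * std::cos(w);
        b2 = -r * r;
    }

    float process(float x) {
        float y = x + b1 * y1 + b2 * y2;
        y2 = y1;
        y1 = y;
        return y;
    }
};
```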
176 Crunching Numbers
Tones with hundreds of partials are better generated employing other synthesis techniques; however,
in many cases, the modal synthesis method is extremely versatile and rewarding. We shall see
a simple yet powerful modal engine implementation that includes amplitude modulation to generate
bell-like tones.
Figure 8.2: The exciter/resonator interaction is at the heart of the modal-based synthesis technique. The
feedback reaction (dashed arrow) may or may not be considered in the modeling.
The advantage of modal synthesis is its flexibility and adaptability to several applications. The
modeling of the exciter and resonator allows independent control of the salient timbre features.
However, one of the main issues in designing a modal synthesis engine is the control of the
parameters. The interaction mechanism and the exciter may have several degrees of freedom. The
resonator, composed of second-order oscillators, has two degrees of freedom for each one of them.
While in the emulation of physical objects a physical law can be employed to derive the parameters,
in a modular synthesis setting these can be left to the user for maximum freedom.
may emulate a bow friction or a sharp hit with multiple bounces. By the way, feedback
interaction can be created, thanks to the architecture of VCV Rack, which allows one-sample
latency between blocks.
You will also note that “hitting” the resonating system while it is still active2 generates slightly
different spectral envelopes each time, depending on the status of the filters.
The AModal module will have the following parameters:
enum ParamIds {
PARAM_F0, // fundamental frequency
PARAM_DAMP, // global damping
PARAM_INHARM, // inharmonicity
PARAM_DAMPSLOPE,// damping slope in frequency
PARAM_MOD_CV, // modulation input amount
NUM_PARAMS,
};
We will have two inputs, one to hit the resonators and the other one for the modulating signal:
enum InputIds {
MAIN_IN, // main input (excitation)
MOD_IN, // modulation input for AM
NUM_INPUTS,
};
Figure 8.3: Architecture of the Modal Synthesis module, made of N resonators and an AM modulation stage.
if (changeValues) {
for (int i = 0; i < MAX_OSC; i++) {
float f = f0 * (float)(i+1);
if ((i % 2) == 1)
f *= inhrm;
float d = damp;
if (dsl >= 0.0)
d += (i * dsl);
else
d += ((MAX_OSC-i) * (-dsl));
osc[i]->setCoeffs(f, d);
}
}
float in = inputs[MAIN_IN].getVoltage();
if (inputs[MOD_IN].isConnected())
cumOut += cumOut * mod_cv * inputs[MOD_IN].getVoltage();
The upper portion of the code is devoted to elaborating the parameters. In particular, to avoid
unneeded operations, the resonator coefficients are changed only when a parameter change is
detected. In that case, the changeValues Boolean variable is true and the conditional part where all
the resonator coefficients are computed is executed.
At this point, the process function of each oscillator is called, getting the sinusoidal output,
which is accumulated into cumOut and normalized by dividing by the number of resonators. If the
modulation input is active, its signal modulates the resonator output. It is worth mentioning that
if a constant value is sent to the modulation input, it acts as a gain. Finally, the signal is sent to
the output.
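The accumulation just described can be sketched as follows. Here Resonator is a toy placeholder standing in for the chapter's SVF-based oscillators, and processBank is a hypothetical helper, not the actual module code:

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Toy stand-in for the SVF-based resonator: a bare decaying state.
struct Resonator {
    float state = 1.f;
    float process(float in) {
        state *= 0.99f;  // toy exponential decay
        return state + in;
    }
};

// Sum the resonator outputs, normalize by their number, and apply
// the optional amplitude modulation, mirroring the module's process code.
float processBank(std::vector<Resonator> &osc, float in,
                  bool modConnected, float mod, float mod_cv) {
    float cumOut = 0.f;
    for (auto &o : osc)
        cumOut += o.process(in);
    cumOut /= (float)osc.size();          // normalize by resonator count
    if (modConnected)
        cumOut += cumOut * mod_cv * mod;  // a constant mod acts as a gain
    return cumOut;
}
```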
than artistic choice. One can, for example, decide to use the module as a filter (remember that the
oscillator is of the SVF type), limiting the oscillators to one, or decide how many of them to use
depending on their computational cost. Furthermore, muting or adding oscillators creates sudden
changes in the output that may be undesired. By adding this value to the context menu, we hide it
from the panel, leaving space for other functions, and we place it under the hood, to avoid
distractions.
The context menu we are going to add has these additional items:
• an empty line, for spacing;
• a title for the new section, “Oscillators”;
• 1;
• 16;
• 32; and
• 64.
In general, tweaking the context menu is done in two steps, roughly:
1. Create a new struct, inheriting MenuItem. This defines how the new item or items will look and
interact with our module.
2. Implement the appendContextMenu method to add the required items.
We inherit MenuItem and call the new struct nActiveOscMenuItem, which contains a pointer to the
modal Module and a number equal to the oscillators this item represents. We also override its
onAction method. This determines what happens when the nActiveOscMenuItem is clicked.
Specifically, it sets the number of active oscillators in the modal Module, so that the change takes
effect immediately:
As to the overriding of appendContextMenu, this is one of the module widget methods, but it is
declared virtual, allowing us to implement its behavior. We declare it in our module widget. Now
that the compiler knows it will be overridden, we can implement it.
Each of these new items must be added as a child of the context menu. We add the title
“Oscillators” as a MenuLabel type, a text-only item (see include/ui/MenuLabel.hpp). Then we add
the four items, one for each oscillator number, from 1 to 64. Each item is a pointer to a new
nActiveOscMenuItem and consists of: a string (the ->text field); a pointer to the modal module
(necessary to take action when the item is clicked, as seen above); and a number (->nOsc) that tells
the application how many oscillators have been selected by the user. The item also has a right text
field, which displays a checkmark “✔” or an empty string. The CHECKMARK define applies the
checkmark if its argument condition is true.
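The pattern above can be modeled in a self-contained way. The real Rack types (MenuItem, Menu, MenuLabel, the CHECKMARK define) are replaced here by minimal stubs, and the AModal/nActiveOsc names are assumptions based on the text; only the structure of the pattern is what matters.

```cpp
#include <initializer_list>
#include <string>
#include <vector>

// Stub standing in for the modal Module.
struct AModalStub { unsigned int nActiveOsc = 16; };

// Stand-in for the MenuItem subclass described in the text.
struct nActiveOscMenuItem {
    std::string text, rightText;   // label and checkmark field
    AModalStub *module = nullptr;  // pointer back to the module
    unsigned int nOsc = 0;         // oscillator count this item represents
    void onAction() { module->nActiveOsc = nOsc; } // clicked: takes effect immediately
};

// Mirrors the body of appendContextMenu: one item per allowed count.
std::vector<nActiveOscMenuItem> buildOscItems(AModalStub *m) {
    std::vector<nActiveOscMenuItem> items;
    for (unsigned int n : {1u, 16u, 32u, 64u}) {
        nActiveOscMenuItem it;
        it.text = std::to_string(n);
        it.rightText = (m->nActiveOsc == n) ? "\u2714" : ""; // CHECKMARK-like
        it.module = m;
        it.nOsc = n;
        items.push_back(it);
    }
    return items;
}
```

In the actual plugin, each item is instead allocated with new and passed to menu->addChild, but the onAction/rightText logic is the same.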
menu->addChild(new MenuEntry);
When the user clicks one of these items, onAction is called and the number of active oscillators is
instantly changed. There is no need for further operations but, if any were required, they could be
added to the onAction method.
Exercise 1
Try computing the cost of a single oscillator in terms of sums and multiplications. Refer, for simplicity,
to the flow graph of Figure 7.14c.
Exercise 2
Try to replicate the tests on your platform with optimization turned on (the default). Do not compile
a non-optimized version of the plugin unless you know what you are doing.
Exercise 3
Add a method to AModal that fades out the output and then resets the resonators. Call this method
when a different number of resonators is selected from the context menu.
Table 8.1: Average execution times for the modal synthesis engine,
with a number of resonators varying between 1 and 64
Resonators: 1 16 32 64
TIP: Our first virtual analog oscillator! In this section, you will also learn how oversampling is
handled in Rack. A polyphonic version will also be discussed at the end.
It is now time to deal with the scariest beast of virtual analog synthesis: aliasing! None of the
techniques explored up to now posed aliasing problems. However, when designing oscillators and
nonlinear components, aliasing is a serious issue that must be considered. In Chapter 2, we
conducted a review of the basics of sampling and aliasing. This section discusses the
implementation of a trivial oscillator design and derives two intuitive variations that reduce aliasing.
Please note that the square wave can also be obtained as the time derivative of the triangle wave. If
you require an intuitive proof of this, think about the triangle wave shape: it is composed of two
segments, one with positive slope m and one with negative slope –m. The first-order derivative of
a function, by definition, gives the slope of that function. Differentiating the trivial triangular waveform
will thus give us a function that toggles between m and –m: a square wave!
Crunching Numbers 185
Figure 8.4: Methods for generating trivial waveforms in the digital domain. Each method shows how
a counter and one or two thresholds (top line) is sufficient to generate several mathematical waveforms
(bottom line).
The fundamental frequency of these waveforms can be set by either using a fixed step size and
selecting the threshold accordingly, or by setting a fixed threshold and computing the required step
size. For a square and a triangle wave, this method is used to determine the half-period, while for
a sawtooth and a rectangle wave it determines the length of the period.
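The step-size computation for a fixed threshold can be sketched as follows (function names are illustrative, not from the book's code). A sawtooth counter must travel the threshold once per period, while a triangle counter travels it once per half-period:

```cpp
// Counter step that yields frequency f at sample rate fs, for threshold T.
float stepForSaw(float T, float f, float fs) {
    return T * f / fs;       // T is reached f times per second (one full period)
}
float stepForTriangle(float T, float f, float fs) {
    return 2.f * T * f / fs; // T is reached every half-period
}
```

For example, a 441 Hz sawtooth at 44,100 Hz with threshold 1 needs a step of 0.01 per sample.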
These methods are easy to implement, and it takes a few lines of code to get these waveforms
generated in VCV Rack. However, as stated before, the mathematical operations described above are
done in a discrete-time setting. Since we are producing a mathematical signal with infinite
bandwidth in a discrete-time setting (without any additional processing), the discrete-time signal is
aliased as though it had been generated in the continuous-time domain and discretized through
a sampling process. Generating signals with such discontinuities thus produces aliasing that cannot
be removed. Discontinuities both in the signal itself, such as the ramp reset in the sawtooth
waveform, and in its first derivative, such as the slope change in the triangle wave, imply an infinite
bandwidth. However, these signals have a monotonically decreasing spectral envelope of −6 dB/oct
or −12 dB/oct. It may therefore happen that the mirrored components (aliasing) are low enough not to be heard.
Furthermore, psychoacoustics may alleviate the problem, because some masking phenomena occur
that reduce our perception of aliasing. For virtual analog synthesis, we are thus not required to
totally suppress aliasing. We may just reduce aliasing down to that threshold that makes it
imperceptible by our auditory system. The perception of aliasing is discussed in Välimäki et al.
(2010a) with experiments and data.
Let us now discuss the implementation of a trivial sawtooth. A snippet of code follows:
In this case, the increment step is variable and the threshold is fixed. This has the advantage
of obtaining signals of amplitude threshold for the whole frequency range. With a fixed step
size, the amplitude would be inversely proportional to the frequency. The increment is
computed in order to reach the threshold pitch times in one second (i.e. in getSampleRate()
steps).
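A minimal standalone sketch of that step follows (the book's actual snippet is in the ABC plugin source; the names and packaging here are illustrative). The increment pitch/sampleRate makes the counter reach the threshold 1.0 pitch times per second:

```cpp
// One sample of a trivial sawtooth: variable step, fixed threshold.
float trivialSawStep(float &phase, float pitch, float sampleRate) {
    phase += pitch / sampleRate; // increment computed from the pitch
    if (phase > 1.f)
        phase -= 1.f;            // ramp reset at the threshold
    return phase - 0.5f;         // remove the 0.5 DC offset
}
```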
Figure 8.5: The spectrum of a 3,312 Hz sawtooth wave generated using the trivial method at 44,100 Hz. All
components besides the fundamental and the harmonics up to the sixth (numbered) are undesired aliased
components. The SNR is 8.13 dB, considering all aliased components as noise.
The moduloCounter is always positive. Its mean value is thus positive, introducing a DC offset, or
bias. In the case of the sawtooth, square, and triangle waveforms, the DC offset is 0.5, and should
be subtracted:
As discussed previously, the trivial generation techniques run into trouble with aliasing. Figure
8.5 shows the spectrum of a 3,312 Hz sawtooth wave generated at 44,100 Hz with the trivial
method in VCV Rack. Its signal-to-noise ratio (SNR) is only 8.13 dB. This means that the
signal has barely 2.5 times the power of all the aliased components combined! Considering that
such a pitch is still within the range of the piano keyboard, such a low SNR is unacceptable.
Figure 8.6: Two uses of oversampling to reduce aliasing. In (a), a trivial waveform is generated at a large
sampling rate and then decimated. In (b), a nonlinear effect, generating aliasing, is run at a large sampling
rate. Sampling rate conversion is done by interpolating and decimating by integer factors M and D. If
M = D, x[n] and y[n´] are subject to the same sampling rate. If M = D and no effect is applied in the
oversampled domain, x[n] = y[n´].
Figure 8.7: Effect of oversampling for reduced-aliasing oscillators. The effect of aliasing is shown at the
sampling frequency FS (a) and at the higher sampling frequency FH (b). The undesired partials are
highlighted by a triangle cap down to the noise floor (bottom line). If the signal is generated at FH and
decimated using the decimation low-pass filter (gray area), the sampling frequency is reverted to FS with
reduced aliasing (c).
Please note that oversampling is also used for nonlinear effects that may generate aliasing, such as
distortion and waveshaping. In that case, the input signal must be first converted to the higher
sampling rate by means of interpolation and then processed at the higher sample rate where aliasing
is less of an issue, as shown in Figure 8.6b.
The use of oversampling is pretty straightforward and allows you to easily implement the trivial
oscillator waveforms with reduced aliasing. However, it increases the computational requirements for
two reasons:
1. The number of operations per second is increased by a factor equal to the oversampling factor.
2. The process of downsampling requires a low-pass decimation filter that needs to be as steep as
possible, and is thus expensive.
The constructor of this class designs the filter impulse response and has a default input argument,
0.9, meaning that the default cutoff frequency of the filter will be 0.9 · FH/2, but this setting can be
overridden using another float input when calling the constructor. The class also has a process
function, which is called to process the input vector in, of length OVERSAMPLE. The function is
thus called every time we want to generate an output sample (e.g. at each call of the module process
function). The higher sampling rate is FH = OVERSAMPLE · FS.
We can now start developing a module that implements trivial generation of a waveform both at the
audio engine sampling rate and at higher sampling rates. Only integer oversampling and decimation
factors are considered, for simplicity. Only the sawtooth oscillator is described, leaving to you the
fun of implementing other waveforms.
We want to take into consideration oversampling factors of 1× (no oversampling), 2×, 4×, and 8×.
We shall define a convenience enum that makes the code more readable, where the allowed
oversampling factors are defined:
enum {
OVSF_1 = 1,
OVSF_2 = 2,
OVSF_4 = 4,
OVSF_8 = 8,
MAX_OVERSAMPLE = OVSF_8,
};
enum InputIds {
VOCT_IN,
FMOD_IN,
NUM_INPUTS,
};
enum OutputIds {
SAW_OUT,
NUM_OUTPUTS,
};
enum LightsIds {
NUM_LIGHTS,
};
ATrivialOsc() {
config(NUM_PARAMS, NUM_INPUTS, NUM_OUTPUTS, NUM_LIGHTS);
out = 0.0;
}
};
The module has two inputs and two parameters. VOCT_IN is the typical V/OCT input, taking a CV
to drive the pitch from, for example, a keyboard, while the FMOD_IN takes a signal for logarithmic
frequency modulation of the oscillator pitch. The amount of frequency modulation is decided by
a knob, FMOD_PARAM, a simple gain to be applied to the frequency modulation input. Finally,
a pitch knob, PITCH_PARAM, is available, which sets the pitch when no other input is available, or
offsets the pitch in the presence of non-zero inputs. Its range goes from −54 to +54 semitones with
respect to the frequency of a C4. We shall see later how the displayBase and displayMultiplier values
are chosen. A float array, saw_out, is statically allocated, large enough to consider the maximum
storage required (i.e. for the maximum oversampling factor). Several decimation filters are also
created, d2, d4, and d8, one for each oversampling ratio. As you can see, a function is defined for
setting the oversampling factor, onOvsFactorChange. The function sets the oversampling factor and
cleans the saw_out array to avoid glitches during the change of the oversampling factor.
The process function follows:
if (ovsFactor > 1) {
saw_out[0] = saw_out[ovsFactor-1] + incr;
if (saw_out[0] > 1.0) saw_out[0] -= 1.0;
for (unsigned int i = 1; i < ovsFactor; i++) {
saw_out[i] = saw_out[i-1] + incr;
if (saw_out[i] > 1.0) saw_out[i] -= 1.0;
}
} else {
saw_out[0] += incr;
if (saw_out[0] > 1.0) saw_out[0] -= 1.0;
}
switch(ovsFactor) {
case OVSF_2:
out = d2.process(saw_out);
break;
case OVSF_4:
out = d4.process(saw_out);
break;
case OVSF_8:
out = d8.process(saw_out);
break;
case OVSF_1:
default:
out = saw_out[0];
break;
}
if(outputs[SAW_OUT].isConnected()) {
outputs[SAW_OUT].setVoltage(out - 0.5);
}
The upper portion of the process function computes the pitch of the sawtooth and the increment
step. The base pitch is that of a C4 (261.626 Hz). All the other contributions (i.e. frequency knob,
V/OCT and FMOD inputs) add to that on a semitone scale. The base pitch is thus altered by
a number of semitones equal to the sum of the pitch knob, the V/OCT input, and the FMOD input
(attenuated or amplified by the frequency modulation knob).
Given a reference tone having frequency fb, the general formula to compute the pitch of a note that
is v semitones away from it is:
f_v = f_b · 2^(v/12)   (8.1)
where v can be any real number. To clarify this, we provide a short example. Given a certain pitch
(i.e. that of a C4), we obtain the pitch of a D4 by adding two semitones: f_D4 = f_C4 · 2^(2/12).
Similarly, to get the pitch of a B3, one semitone below, f_B3 = f_C4 · 2^(−1/12).
We allow the pitch range to span nine octaves, −54 to +54 semitones with respect to the base pitch.
The pitch is thus calculated as:
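The actual snippet is in the ABC plugin source; a minimal standalone sketch of the same computation follows, with illustrative names. The V/OCT input contributes 12 semitones per volt, and the FMOD input is scaled by the FMOD knob before being added on the same semitone scale:

```cpp
#include <cmath>

// Pitch in Hz from the knob (semitones) and the two CV inputs (volts).
float computePitch(float pitchKnob, float voctIn, float fmodIn, float fmodKnob) {
    const float FREQ_C4 = 261.626f;                            // base pitch f_b
    float v = pitchKnob + 12.f * (voctIn + fmodKnob * fmodIn); // semitones from C4
    return FREQ_C4 * std::pow(2.f, v / 12.f);                  // Eq. (8.1)
}
```

With all inputs at zero, the result is the base pitch of a C4; one volt on the V/OCT input, or +12 semitones on the knob, doubles it.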
The value displayed on the module frequency knob is calculated similarly. Its multiplier is still
dsp::FREQ_C4, while the base is 2^(1/12), and thus for any value v of the knob the displayed result
shall be f_C4 · 2^(v/12).
For what concerns the trivial ramp generation, the increment is calculated as in the trivial
case, with the exception that the increment should be divided by the oversampling factor.
Indeed, the oversampled trivial waveform is computed as though we were at a higher sampling
rate.
If no oversampling is applied, only one output is computed and stored in saw_out[0]. However, if
oversampling is applied, the trivial algorithm is repeated in a for loop as many times as required by
the oversampling factor (e.g. twice for the 2× case, four times for the 4× case, etc.). These values
are stored in the saw_out array, using a part of it for 2× or 4× oversampling, or the full array in the
8× oversampling case. After computing samples at a higher sampling rate, we need to convert this
signal to the engine sampling rate, using the decimator filter and yielding one sample that will go to
the module output. The decimator filters all the content above FS =2, so that the subsequent step (i.e.
the sample rate reduction) will not add aliasing. The sample rate reduction is done by simply picking
one sample over N, where N is the oversampling factor. Both the filtering process and the sampling
rate reduction are conducted using one of the three Decimator filters, outputting one sample at FS
from a vector of samples at FH. Finally, since the sawtooth wave has a DC offset, we shift the wave
down by subtracting the offset.
To conclude the implementation of the module, we need to add the oversampling factor options
to the context menu. We follow the same path used in Section 8.1.4 to add a separate entry
for each sampling rate the module supports and define a subclass of MenuItem, called
OscOversamplingMenuItem. There is one tiny difference with the Modal module. In this case, we
call a method implemented in the Module, instead of directly changing a value. This method is
onOvsFactorChange, and it takes care of resetting the saw_out buffer to avoid glitches.
This is it. If the implementation is clear enough, we can now move on to evaluate the execution
times. The evaluation follows the same criteria and code discussed in Section 8.1.3. The
execution time is measured with only the output port connected to some other module and the
knob in its initial position (the frequency of a C4). We also evaluate the execution times turning
the frequency knob to some other values or by connecting both inputs to another module. The
outcome is quite interesting. First of all, we notice that the execution times do not increase
linearly with the oversampling factor. We would expect, for example, that the 8× case would take
eight times more CPU time than the 1× case. This is not the case, since there is a constant cost
that even the lightest modules have. For instance, the AComparator module has an execution time
of 0.07 μs, even though it does not perform any tough processing. Taking this value as the
average overhead of each call to the process function, we can hypothesize that the trivial
sawtooth oscillator takes 0.02 μs for each cycle. Following this reasoning, the 2× oversampling
implementation should take twice this time plus 0.07 μs. The same goes for the 4× and 8× cases.
We can see that our hypothesis is in good accordance with the data except for a small error
(0.11 μs, 0.15 μs, and 0.23 μs versus 0.11 μs, 0.14 μs, and 0.25 μs). This means that a part of
the execution time increases linearly with the oversampling factor; however, a constant term is
always present.
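The cost model described above can be written down explicitly (a sketch; the two constants are the values hypothesized in the text, not exact measurements):

```cpp
// Predicted execution time: a constant per-call overhead (0.07 us, taken from
// AComparator) plus a per-cycle term (0.02 us) linear in the oversampling factor.
float predictTime(int ovsFactor) {
    const float overheadUs = 0.07f;
    const float perCycleUs = 0.02f;
    return overheadUs + ovsFactor * perCycleUs;
}
```

This reproduces the predicted 0.11 μs, 0.15 μs, and 0.23 μs for the 2×, 4×, and 8× cases quoted above.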
We also notice that the frequency knob value has an impact on the performance! When it is not in
its initial position (0 semitones), the execution time increases considerably! This is due to compiler
optimizations that avoid computing the power 2^0. In any other case (pitchKnob or pitchCV not
zero), the exponentiation is computed, increasing the computational cost. There are lots of practical
cases where the actual values of constants and signals impact the performance.5
Table 8.2: Average execution times for the trivial sawtooth oscillator with
and without oversampling
Oversampling factor: 1× 2× 4× 8×
smarter ways have been proposed for more than 20 years already. There is thus a large number of
algorithms that may be implemented to generate reduced-aliasing VA oscillators. For this reason, selecting
one for this book was not easy. Psychoacoustic tests and comparisons have been proposed for many of
them (Välimäki and Huovilainen, 2007; Nam et al., 2010), suggesting that the method known as band-
limited step (BLEP) (Brandt, 2001), derived from the band-limited impulse train (BLIT) (Stilson and Smith,
1996), is among the best in terms of aliasing suppression. Explaining the principles of the BLIT method
and deriving the related BLEP formulation requires some background that this book does not cover. These
methods also require experimenting with parameters to find an optimal trade-off. Their code is also not very
straightforward to read and understand, since their goal is to correct the samples surrounding the steep
transitions in sawtooth and square oscillators. This also makes their computational cost not easy to determine.
An alternative method that is easier to understand and implement is the differentiated parabolic
waveform (DPW) method (Välimäki, 2005) and its extensions (Välimäki et al., 2010a). We will give
an intuitive explanation of the method and later consider its advantages.
We know from high school calculus classes that the integral of the derivative of a function is
the function itself plus a constant value (the integration constant):

∫ f′(x) dx = f(x) + c   (8.2)
Now, if we examine a single period of the sawtooth ramp, we can easily tell that it is a segment of
a linear function of the kind f ð xÞ ¼ mx þ q, where the constant term q is chosen so that the
waveform has zero DC offset and the slope coefficient m sets the ramp growth time. Let us consider
a wave with q = 0. If we integrate this linear function and we differentiate it afterwards, we will get
the original function:
F(x) = ∫ f(x) dx = ∫ mx dx = (m/2) x² + c   (integration step)

dF(x)/dx = F′(x) = mx   (differentiation step)   (8.3)
Of course, it makes no sense to integrate a function and immediately differentiate it, as the two operations cancel out. However, it was noticed by
Välimäki that if we obtain the parabolic function x2 from squaring the ramp, instead of integrating it,
and then we differentiate the parabolic function, what we obtain is a very good approximation of the
sawtooth signal with the additional benefit of reduced aliasing. The alias reduction comes from the
spectral envelope tilt introduced by the two operations, as shown in Figure 8.8. The benefit of the
DPW algorithm is very high, even compared to the oversampling generation method seen in Section
8.2.2. The difference between a trivial and a DPW sawtooth is shown in Figure 8.9 at two different
pitches. A slight correction of the sample values is responsible for such a large spectral difference.
Figure 8.8: Comparison between several 5 kHz sawtooth waveforms sampled at 48 kHz. The trivial
sawtooth (a), the squared sawtooth (parabolic function) (c) and its first derivative (e), and a trivial
sawtooth generated with 2× oversampling (g). The respective magnitude spectra are shown on the right
(b, d, f, h). The harmonics of the trivial sawtooth are highlighted in (b) with a diamond, while the rest of the
partials are the effect of aliasing. The crosses in (f) and (h) show the position of the harmonics of the trivial
sawtooth. A slight attenuation of the higher harmonics can be noticed by looking, for example, at the
difference between the ideal amplitude (cross) and the actual amplitude of the 4th harmonic (20 kHz).
While such a small difference is not noticeable, the aliasing present in (b) is largely reduced in (f). The
2× oversampling algorithm does not
perform as well as the DPW algorithm and has a higher cost.
Figure 8.9: Comparison between a trivial sawtooth waveform (solid line) and the differentiated parabolic
waveform (stem plot) at 3 kHz (a) and 5 kHz (b). The difference between the trivial and the alias-suppressed
waveforms gets larger with increasing pitch.
The DPW algorithm is depicted in the flow graph of Figure 8.10b. Although the DPW has
been shown not to be as effective as BLEP methods, it has low code complexity, requires
very little memory, and has a constant computational cost. Another advantage is the possibility
of iterating the method to obtain improved aliasing reduction. It is possible, in other words, to
extend the concept by finding a polynomial function of order N that provides a low aliasing
sawtooth approximation when differentiated N-1 times. The general procedure to obtain such
polynomials is described in Välimäki et al. (2010a) and the extension to the third order is
shown in Figure 8.10c. The polynomials are reported in Table 8.3. One issue with the iterative
differentiation is that the process reduces the waveform amplitude, especially at low frequency,
increasing the relative effect of quantization by lowering the signal-to-quantization-noise ratio. This can
be overcome by rescaling the waveform at each differentiation stage using a scaling factor that
depends on the order of the polynomial and the period of the waveform, shown as a gain in
Figure 8.10c. For DSP newbies, the amplitude reduction due to differentiation can be intuitively
explained: you implement it by subtracting two contiguous values. Those values are going to
be very close to each other, especially if the signal has a low frequency, and thus the output of
the differentiator is generally small. Of course, the outcome of the differentiation gets large
when abrupt or very quick changes happen (i.e. when contiguous samples are not similar to
one another).

Table 8.3: Polynomial functions for generating a sawtooth or a triangular wave using DPW with orders
from 1 to 6 and the related scaling factors to be used during differentiation

DPW Order | Polynomial Function (SAW) | Polynomial Function (TRI) | Scaling Factor

Figure 8.10: General flow graph of the DPW family of algorithms (a), the 2nd-order sawtooth DPW
algorithm (b), and the 3rd-order DPW algorithm (c).
Several approaches have been proposed to compute the scaling factor. An accurate solution able to
keep the amplitude almost constant over the oscillator frequency range is provided in Table 8.3.
Some factors can be precomputed to reduce the computational cost.
The DPW technique can also be extended to the triangle wave by computing different polynomials,
as shown in Table 8.3, and generating a trivial triangle wave as input. The square wave can then be
obtained by differentiating the triangle wave. Table 8.3 reports the polynomials required by each
DPW order for both the sawtooth and the triangular wave. Increasing the order of the DPW will
increase the computational cost and reduce the aliasing further.
typedef enum {
DPW_1 = 1,
DPW_2 = 2,
DPW_3 = 3,
DPW_4 = 4,
MAX_ORDER = DPW_4,
} DPWORDER;
typedef enum {
TYPE_SAW,
TYPE_SQU,
TYPE_TRI,
} WAVETYPE;
The first four orders of the DPW are defined, and the maximum order is the fourth. We also define
three types of waveform. The struct, called DPW, is templated, to allow float and double
implementations, for reasons that you will discover later.
Since DPW is based on the generation of a trivial waveform, we have pitch and phase variables.
The gain is the correction factor c applied during differentiation to compensate the amplitude
loss due to the differentiator. The DPW order and the waveform type are stored in two variables.
A buffer diffB stores the last samples, for the differentiator, and an index dbw stores the
position in that buffer where the next sample will be written. This buffer is treated as circular:
when the last memory element is written, the index wraps to zero, so that we start writing again
from the beginning.
Finally, a variable init is used at initialization or reinitialization of the buffer to prevent glitches, as
we shall see later:
template <typename T>
struct DPW {
T pitch, phase;
T gain = 1.0;
unsigned int dpwOrder = 1;
WAVETYPE waveType;
T diffB[MAX_ORDER];
unsigned int dbw = 0; // diffB write index
int init;
...
The constructor sets the waveform type to the default value, clears the buffer, computes the
parameters, and initializes init to the order of the DPW:
DPW() {
waveType = TYPE_SAW;
memset(diffB, 0, sizeof(diffB));
paramsCompute();
init = dpwOrder;
}
The paramsCompute function finds the correct scaling factor to compensate for the amplitude
reduction given by the differentiator. This function is also called by setPitch, to recompute the
correct gain, since the latter depends on the pitch:
void paramsCompute() {
if (dpwOrder > 1)
gain = std::pow(1.f / factorial(dpwOrder) *
std::pow(M_PI / (2.f * sin(M_PI * pitch * APP->engine->getSampleTime())),
dpwOrder - 1.f), 1.f / (dpwOrder - 1.f));
else
gain = 1.0;
}
The generation of the DPW sawtooth starts from the trivial sawtooth. We have a method for this,
where we only implement the sawtooth case – the rest is left to the reader:
T trivialStep(int type) {
switch(type) {
case TYPE_SAW:
return 2 * phase - 1;
default:
return 0; // implementing other trivial waveforms is left as an exercise.
}
}
The generation of one output sample relies on the process method, which in turn calls the trivialStep
method and the dpwDiff method, used for differentiating the polynomial values. These are reported below:
T dpwDiff(int ord) {
ord = clamp(ord, 1, MAX_ORDER); // avoid unexpected behavior
T tmpA[MAX_ORDER];
int dbr = (dbw + ord - 1) % ord; // newest sample in the circular buffer
for (int i = 0; i < ord; i++, dbr = (dbr + ord - 1) % ord)
tmpA[i] = diffB[dbr]; // unroll the buffer, newest first
for (int s = 0; s < ord - 1; s++) // differentiate ord-1 times
for (int i = 0; i < ord - 1 - s; i++)
tmpA[i] = gain * (tmpA[i] - tmpA[i + 1]); // scale at each stage
return tmpA[0];
}
T process() {
// differentiation
diffB[dbw++] = poly;
if (dbw >= dpwOrder) dbw = 0;
if (init) {
init--;
return poly;
}
return dpwDiff(dpwOrder);
}
The process function generates the trivial waveform and updates its phase. The polynomial is
computed according to the order. Obviously, with order 1, the trivial value is just returned. The
differentiation then takes place: the last value is stored into the diffB buffer and its index is
advanced, with wrapping to zero. Now, if init is nonzero, we are in the very first samples, right
after creation of the class or reinitialization of the DPW order. In this case, the value is just
returned and the differentiation is skipped. This avoids large output values that would cause glitches. Indeed, if
the differentiation is performed with the buffer still not filled up completely, a difference
between a value and zero would occur. This would get multiplied by the scaling factor gain,
possibly generating values higher than the Eurorack voltage limits. The variable init is
decreased, so that when it gets to zero the section of code inside the if will not be executed
anymore. At this point, the buffer is filled, and from now on the differentiation will be always
performed. The buffer is as large as the highest DPW order allowed. However, only a part of it
will be used: the buffer is treated as though it were only as large as the current order. The actual
size of the buffer is thus controlled by the wrapping conditions imposed on dbw. If this index gets
larger than the current order, it will wrap to zero. Similarly, the number of values to wait before
starting to differentiate is exactly the order of the DPW (Figures 8.11 and 8.12).

Figure 8.11: The cascade of three differentiators as required by the DPW of order 4.

Figure 8.12: The DPW::dpwDiff method repeats several differentiation steps, as required by the algorithm.
The DPW of order 4 requires differentiating three times.
As you can see, when the order changes, the code below gets executed:
unsigned int onDPWOrderChange(unsigned int newdpw) {
if (newdpw > MAX_ORDER)
newdpw = MAX_ORDER;
dpwOrder = newdpw;
memset(diffB, 0, sizeof(diffB));
paramsCompute();
init = dpwOrder;
return newdpw;
}
The DPW order is set, and the buffer is erased using memset. The gain is recomputed and init is
set to the current order, so that diffB is filled with dpwOrder values before differentiation starts.
Finally, the new DPW order is returned. This last line is useful to return the correct value, as the
check performed at the beginning of the method may alter the input argument.
This is it for the oscillator struct!
Now let’s get to the module implementation. Its skeleton resembles that of the ATrivialOsc module. The
module name is ADPWOsc and it will be a templated module. It has the same enums seen in ATrivialOsc
and the parameters are configured similarly. The main difference is a pointer to the oscillator object, DPW.
This is initialized in the constructor by dynamically allocating a DPW object. ADPWOsc also has
a method to be called when changing the DPW order. This, in turn, calls the onDPWOrderChange method
of the DPW object and stores the order returned by it, just in case it has been corrected.
enum InputIds {
VOCT_IN,
FMOD_IN,
NUM_INPUTS,
};
enum OutputIds {
SAW_OUT,
NUM_OUTPUTS,
};
enum LightsIds {
NUM_LIGHTS,
};
DPW<T> *Osc;
unsigned int dpwOrder = 1;
ADPWOsc() {
config(NUM_PARAMS, NUM_INPUTS, NUM_OUTPUTS, NUM_LIGHTS);
configParam(PITCH_PARAM, -54.f, 54.f, 0.f, "Pitch", " Hz",
std::pow(2.f, 1.f/12.f), dsp::FREQ_C4, 0.f);
configParam(FMOD_PARAM, 0.f, 1.f, 0.f);
Osc = new DPW<T>();
}
};
The process function follows. It processes the inputs and the parameters, sets the pitch, and calls the
process method of the oscillator to get one output sample. Please note that to avoid unnecessary
computing of coefficients, the setPitch method verifies that the pitch has not changed.
Osc->setPitch(pitch);
T out = Osc->process();
if(outputs[SAW_OUT].isConnected()) {
outputs[SAW_OUT].setVoltage(out);
}
}
Similar to ATrivialOsc, where we used the context menu to choose an oversampling factor, in this
case we use the context menu to pick a DPW order, allowing us to evaluate the amount of aliasing
in real time. The code is very similar, but you can find it in the ABC plugin source code.
Table 8.4: Average execution times for the DPW sawtooth oscillator (DPW orders 1–4)
Figure 8.13: The spectrum of a 3,312 Hz sawtooth tone produced at 44,100 Hz sampling rate using 8×
oversampling (a) and 2nd order DPW (b). The SNR is 22.5 dB (a) and 18 dB (b).
int ch;
for (ch = 0; ch < inChanN; ch++) {
Osc[ch]->setPitch(pitch);
T out = Osc[ch]->process();
if(outputs[POLY_SAW_OUT].isConnected()) {
outputs[POLY_SAW_OUT].setVoltage(out, ch);
}
}
outputs[POLY_SAW_OUT].setChannels(ch);
}
The number of outputs is determined by the number of input V/OCT CV that we have, inChanN.
This makes sense, as we are expecting N note CV signals to set the pitch of N output sawtooth
waves. We iterate over inChanN oscillators to get the output of each one of them and set the voltage
of the related channel using the overloaded setVoltage(out, ch). Finally, we tell the output port how
many channels were written using the setChannels method.
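The pattern above — write one voltage per channel, then declare the channel count — can be sketched against a simplified stand-in for the port API. The PolyPort struct below is an illustration, not Rack's real Port class.

```cpp
#include <vector>

// Minimal stand-in for a polyphonic output port, illustrating the
// setVoltage(value, channel) / setChannels(n) pattern described above.
struct PolyPort {
    std::vector<float> voltages = std::vector<float>(16, 0.f);
    int channels = 1;
    void setVoltage(float v, int ch) { voltages[ch] = v; }
    void setChannels(int n) { channels = n; }
};

// Write one sample per input channel, then declare how many were written.
void writePoly(PolyPort &out, const std::vector<float> &samples) {
    int inChanN = (int)samples.size();
    for (int ch = 0; ch < inChanN; ch++)
        out.setVoltage(samples[ch], ch);
    out.setChannels(inChanN); // tell downstream modules the channel count
}
```

The channel count equals the number of input V/OCT channels, mirroring the one-oscillator-per-note structure of the module.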
That’s it! Considering that it took analog engineers years to get from the first monophonic
synthesizer to the first polyphonic one, I think this two-minute job can be regarded as a considerable
leap.
Exercise 1
The DPW oscillator is templated. A define at the top of ADPWOsc.cpp allows you to switch from
float to double. Try compiling the code with either data type and observe with a Scope the waveform
with DPW order 4 at a low frequency (e.g. a C2, i.e. 65.41 Hz). What difference do you observe? Which
data type would you choose, in light of this observation and the computational costs reported
above? This should also prompt a question about design priorities when developing a synthesizer
oscillator.
Exercise 2
Cheap tricks: referring to the issue observed in Exercise 1, there is one easy solution that avoids the
computational burden of the double precision implementation. At low frequency, the aliasing of the
trivial waveform is not noticeable. You can just implement the oscillator in single-precision float and
switch to the trivial waveform below a certain frequency. Even better, you can design a transition fre-
quency band where the trivial and the DPW oscillator output are crossfaded to ensure a smooth tran-
sition in timbre.
Exercise 3
Add the capability of generating square, rectangle, and triangle wave oscillators to the trivial and the
oversampling methods.
Exercise 4
Add square and triangle waves to the DPW oscillator.
Exercise 5
The DPW oscillator makes use of iterations. This keeps the code small; however, it can be shown that
the cascade of several differentiators can be reduced to a feedforward filter. If implemented in such
a way, the DPW has a reduced computational cost. If you wish to improve the efficiency of your code,
you can implement a different function for each order of differentiation you want to support and
make it as fast as possible. A graphical approach can help us find the difference equation of
a cascade of differentiators and an optimized method to compute it. The cascade of three
differentiators (DPW of order 4) is shown in Figure 8.10c. The difference equation can be computed by
looking at all the branches that the input signal can take.
This can be implemented in C very easily. For other DPW orders, there will be different difference
equations. As a first assignment, try to get the correct difference equation for higher DPW orders. As
a second assignment, implement this in code and try evaluating the computation times for the recur-
sive and the optimized versions.
Exercise 6
Wait, we forgot to add a V/OCT input to tune the pitch of the AModal module! Now that you know
how to deal with a V/OCT input, add such an input to that module.
8.3 Wavefolding
Waveshaping is one of the distinctive traits of West Coast synthesis. Waveshapers are meant to
provide rich harmonic content from simple signals such as sinusoidal or triangular waveforms
by applying nonlinearities. One common kind of waveshaper in West Coast synthesis is the
wavefolder, or foldback. This was one of the earliest designs by Don Buchla, consisting of
a device that folds back the waveform toward zero. The discontinuities that are created in the
input signal at the reaching of a threshold value are determinant in the bandwidth expansion of
the original signal. Unfortunately, this expansion is also a concern for virtual analog
developers since the widened spectrum may fall over the Nyquist frequency and cause aliasing.
A simple wavefolder has an input/output mapping that is piecewise linear according to the relations:
$$ y = f(x) = \begin{cases} \mu - (x - \mu) = 2\mu - x, & x > \mu \\ x, & -\mu \le x \le \mu \\ -\mu + (-\mu - x) = -2\mu - x, & x < -\mu \end{cases} \qquad (8.5) $$
where μ is a threshold value. The nonlinear mapping can be seen in Figure 8.14a along with the
output of such a wavefolder on a sinusoidal input in Figure 8.14b.
(a)
(b)
Figure 8.14: The piecewise linear mapping (a) and its application to a sinusoidal signal (b), where the thin
solid line is the input sinusoidal signal and the bold line is the output of the wavefolder.
if (x > mu)
out = 2 * mu - x;
else if (x < -mu )
out = - 2 * mu - x;
else
out = x;
Both the upper and lower cases can be condensed into a single “if” by means of a sign function.
Since the C++ standard library does not supply a sign function by default, we can implement it as
a template function with inline substitution. It works on most signed data types and is branchless,
to help the compiler optimize execution:
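A minimal sketch of such a sign function, together with the condensed single-“if” fold branch it enables, could look as follows; the function names are illustrative, not necessarily those of the ABC plugin.

```cpp
#include <cmath>

// Branchless sign function: works on signed arithmetic types and
// compiles to straight-line code (the comparisons yield 0 or 1).
template <typename T>
inline T sgn(T x) {
    return (T(0) < x) - (x < T(0));
}

// With sgn(), both fold branches of Eq. 8.5 collapse into one "if":
// for |x| > mu, out = sgn(x) * 2 * mu - x covers both cases.
inline float foldOnce(float x, float mu) {
    if (std::fabs(x) > mu)
        return sgn(x) * 2.f * mu - x;
    return x;
}
```

For x > μ this yields 2μ − x, and for x < −μ it yields −2μ − x, matching the two branches of the trivial implementation above.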
As we can see from Figure 8.15a, the aliasing introduced by this trivial implementation is not
negligible, at least with high-pitched input and high gain. Oversampling alleviates the problem, but
there is another method that we can suggest, described in the next section. In the next section, we
shall also describe how the module is implemented, including both trivial and anti-aliasing
implementations.
Figure 8.15: Magnitude spectra of a sine wave folded with the trivial foldback (a), the alias-reducing
foldback (b), and a 2× oversampled version of the trivial foldback (c). The input signal is a 1,250 Hz sine wave
with unitary gain. The foldback threshold is 0.7. The alias reduction of the oversampling method is slightly
worse than that of the algorithm of Section 8.3.2. Please note that the foldback we described is a perfectly odd
function (symmetric with respect to the origin), and thus only odd harmonics are generated.
Depending on the formulation of the function f(·), the computational cost of evaluating its
antiderivative F(·) may vary largely, and thus tricks may need to be applied. Let us examine our
function and proceed with integration to see how complex our F(·) is. Starting from Equation 8.5,
we can quickly integrate the linear terms and the constant terms using basic integral relations. We
must also add the integration constant, resulting in the following:
$$ F(x) = \begin{cases} -\dfrac{x^2}{2} + 2\mu x + c_1, & x > \mu \\[4pt] \dfrac{x^2}{2} + c_2, & -\mu \le x \le \mu \\[4pt] -\dfrac{x^2}{2} - 2\mu x + c_3, & x < -\mu \end{cases} \qquad (8.7) $$
To obtain a smooth function, we need to match these three curves at their endpoints. This can be
done by setting this constraint and calculating the integration coefficients accordingly.
Our first constraint is that the first and the second curves need to match at the concatenation point:
$$ -\frac{x^2}{2} + 2\mu x + c_1 = \frac{x^2}{2} + c_2 \quad \text{for } x = \mu \qquad (8.8) $$

$$ c_1 = c_2 - \mu^2 \qquad (8.9) $$
Now we only need to assign a value to c₂. If we look at Equation 8.7, we notice that the second
condition is the equation of a parabola, symmetrical with respect to the vertical axis, with offset c₂. We can
impose that the minimum of the parabola lies at the origin by setting c₂ = 0. We can also notice
that the third condition in Equation 8.7 is symmetrical to the first one, so it is harmless to say that for
our goal c₁ = c₃, and thus c₁ = c₃ = −μ².
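With these constants, the antiderivative can be written as a small self-contained function. This is a sketch following Equation 8.7 with c₂ = 0 and c₁ = c₃ = −μ², not the plugin's exact code.

```cpp
#include <cmath>

// Antiderivative of the foldback (Eq. 8.7) with the integration constants
// derived in the text: c2 = 0 and c1 = c3 = -mu^2, which makes F continuous
// at the breakpoints x = +/- mu.
inline double foldF(double x, double mu) {
    if (x > mu)
        return -0.5 * x * x + 2.0 * mu * x - mu * mu;
    if (x < -mu)
        return -0.5 * x * x - 2.0 * mu * x - mu * mu;
    return 0.5 * x * x;
}
```

Evaluating the first branch at x = μ gives −μ²/2 + 2μ² − μ² = μ²/2, which equals the middle branch at the same point, confirming continuity.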
It is a good idea to store the F(x[n]) term in an auxiliary variable, so that at each step F(x[n−1])
need not be computed again.
Thus, in the module constructor, we calculate the actual threshold voltage, mu, as the product of
the positive peak of the Eurorack standard for oscillating signals and the above define. We also
calculate its squared value to spare some computational resources later. If using the anti-aliasing
method, Fn1 and xn1 will be of use. These variables store the values of F(x[n−1]) and x[n−1],
respectively.
enum InputIds {
MAIN_IN,
GAIN_IN,
OFFSET_IN,
NUM_INPUTS,
};
enum OutputIds {
MAIN_OUT,
NUM_OUTPUTS,
};
enum LightsIds {
NUM_LIGHTS,
};
double t, tsqr;
double Fn1, xn1;
bool antialias = true;
AWavefolder() {
config(NUM_PARAMS, NUM_INPUTS, NUM_OUTPUTS, NUM_LIGHTS);
configParam(PARAM_GAIN_CV, 0.0, 1.0, 0.0, "Gain CV Amount");
configParam(PARAM_OFFSET_CV, 0.0, 1.0, 0.0, "Offset CV Amount");
configParam(PARAM_GAIN, 0.1, 3.0, 1.0, "Input Gain");
configParam(PARAM_OFFSET, -5.0, 5.0, 0.0, "Input Offset");
t = 5.0 * WF_THRESHOLD;
tsqr = t*t;
Fn1 = xn1 = 0.0;
}
};
Now let us get to the DSP part. The anti-aliasing implementation of the foldback can be
implemented according to the following pseudocode:
One issue with the method we introduced is the division by the difference x[n] − x[n−1], which can
get very close to zero if the input signal is slowly time-varying or very low. This may introduce
numerical errors or may even result in a division by zero. To avoid this ill-conditioning issue, it is
advisable to adopt a different expression when the difference x[n] − x[n−1] falls below a certain
threshold. In this case, the output should be evaluated as:
Figure 8.16: Magnitude spectra of three wavefolded signals: a 5 kHz sine tone (a), a 5.5 kHz sine tone (b),
and the sum of two sine tones at 5 kHz and 5.5 kHz, both scaled by 1/2 (c). As can be seen, the output in
(c) shows not only the harmonics of the input tones, as seen in (a) and (b), but also a series of harmonics of
a “ghost” 500 Hz tone (i.e. the frequency delta between the two input tones).
$$ y_{as}[n] \simeq f\!\left(\frac{x[n] + x[n-1]}{2}\right) \qquad (8.10) $$
To account for this issue, we first compute the difference and store it in a variable, dif, then decide
based on its value whether to adopt the antiderivative-based expression (using Equation 8.7) or the
approximation of Equation 8.10. The code follows. Please note that the output variable, out, is set at
the beginning, and then replaced in case the signal is over the threshold, according to one of the
techniques discussed in this chapter:
if(antialias) {
double dif = x - xn1;
if (outputs[MAIN_OUT].isConnected())
outputs[MAIN_OUT].setVoltage(out);
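Putting the pieces together, a self-contained sketch of the anti-aliased foldback step might look as follows; it combines the antiderivative-based output with the Equation 8.10 fallback. The numerical threshold and the names are illustrative assumptions, not the ABC plugin's actual code.

```cpp
#include <cmath>

// First-order antiderivative anti-aliasing for the foldback: fold() is the
// piecewise map of Eq. 8.5 and F() its antiderivative (Eq. 8.7, with the
// constants derived in the text). The 1e-6 guard threshold is illustrative.
struct FoldADAA {
    double mu = 0.7 * 5.0;          // threshold: WF_THRESHOLD * 5 V peak
    double xn1 = 0.0, Fn1 = 0.0;    // x[n-1] and F(x[n-1])

    double fold(double x) const {
        if (x > mu)  return 2.0 * mu - x;
        if (x < -mu) return -2.0 * mu - x;
        return x;
    }
    double F(double x) const {
        if (x > mu)  return -0.5 * x * x + 2.0 * mu * x - mu * mu;
        if (x < -mu) return -0.5 * x * x - 2.0 * mu * x - mu * mu;
        return 0.5 * x * x;
    }
    double process(double x) {
        double dif = x - xn1;
        double Fn = F(x);
        double out;
        if (std::fabs(dif) < 1e-6)
            out = fold(0.5 * (x + xn1));  // Eq. 8.10 fallback
        else
            out = (Fn - Fn1) / dif;       // antiderivative-based output
        xn1 = x;                          // store x[n-1] and F(x[n-1])
        Fn1 = Fn;
        return out;
    }
};
```

Storing Fn in Fn1 before returning implements the auxiliary-variable trick mentioned earlier: F is evaluated only once per sample.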
Exercise 1
You can try to implement a clipping nonlinearity employing the anti-aliasing method shown in this
section. Integrating it is no more difficult than what we have done with the wavefolder.
Exercise 2
To alter the spectral content of the signal, this module uses a gain and an offset. The gain, how-
ever, also affects the output amplitude, which may not be desired. For instance, a large output
signal could be clipped or saturated at the input of the next module. Two solutions can be
devised for this issue:
The latter is quick to add given the current implementation of the module. Be careful, however, to
avoid dividing by zero or by very large numbers. Keeping the input gain constrained in the range
0.1–3.0, for example, works without issues.
Exercise 3
Practical wavefolding is generally realized with a cascade of several simple wavefolding stages (e.g. see
the Lockhart wavefolder) (Esqueda et al., 2017). Experiment with a cascade of wavefolders with differ-
ent settings, gains, or saturating nonlinearities between them. This can be done by encapsulating the
wavefolder code in a separate class.
Exercise 4
As discussed above, numerical issues may arise with small values of the variable dif; this is why we
proposed the use of Equation 8.10 as a solution. Let us observe what happens and when. What is the
simplest periodic waveform that has a zero difference almost everywhere?
Try commenting out the “if” clause that leads to the implementation of Equation 8.10 and watch
what happens when such an input signal is fed to the anti-aliasing technique.
How can the small numerical threshold be estimated to safely avoid these issues?
Notes
1 The only concern regards computational efficiency. By hardcoding a bank of parallel second-order filters in one class and
explicitly using pragma statements to exploit processor parallelization, the computational cost can be greatly reduced. How-
ever, although often misquoted, I think Donald Knuth’s statement “premature emphasis on efficiency is a big mistake” fits
well in this case.
2 What in musical practice is called ribattuto.
3 Remember the assembly line metaphor we used in Chapter 4? It applies to the computation of batches of oscillators, not
only of batches of samples.
4 Similar to the anti-aliasing filter used during the analog-to-digital conversion.
5 One example above all is the issue with denormal floating-point values (e.g. Goldberg, 1991).
6 The antiderivative is also known as the indefinite integral.
7 Please remember that the wavefolder we described in this chapter is an odd function.
CHAPTER 9
The Graphical User Interface
While Chapter 5 was meant to give a quick introduction to the creation of a GUI in Rack, using
ready-made widgets and allowing the reader to move quickly to the development of the first plugins,
this chapter discusses the development of new widgets. We start by introducing the rendering library,
NanoVG, used to draw user-defined lines and shapes and to render SVG files, then we suggest how
to customize available classes to get custom knobs and text labels. At the end of the chapter, we
propose a variation of the Modal Synthesis plugin that adds an interactive GUI element for sound
generation.
The nvgRect defines a rectangle in the NVGcontext * vg starting at the coordinates defined by the
2nd and 3rd arguments, with width and height defined by the 4th and 5th arguments. A fill color is
218 The Graphical User Interface
defined by nvgFillColor and the rectangle is filled with nvgFill. All these operations are preceded by
nvgBeginPath, which must be called before each path or shape is created.
Colors are declared as NVGcolor objects, and have red, green, blue, and alpha floating-point values.
However, an NVGcolor can be created using several utility functions:
• nvgRGB (or nvgRGBf for floats), taking unsigned char (or float) red, green, and blue values and
setting the alpha to the maximum by default (no transparency);
• nvgRGBA (nvgRGBAf), taking unsigned char (float) red, green, blue, and alpha arguments;
• nvgLerpRGBA, returning an NVGcolor interpolated from two NVGcolors given as input
arguments; and
• nvgHSL (nvgHSLA) taking floating point hue, saturation lightness (and alpha) as input and
transforming these to a regular RGBA NVGcolor.
Moving beyond the concept of color, we have the concept of paint (i.e. of a varied stroke or fill).
The NVGpaint object allows for gradients or patterns. Gradients can be linear, boxed, or radial,
while a pattern is a repetition of an image in both the horizontal and vertical axes.
Some example images are shown in Figure 9.1.
NanoVG also allows for rotations, translations, matrix transforms, scaling, and skew, making it
possible to move objects and transform them.
If you have some basics of vector graphics, you can scroll through the functions of NanoVG.
The place to start is src/nanovg.h from the NanoVG repository. However, we will examine
some basic functions in this chapter. Basically, the average reader will not need to dive deeper into
NanoVG, because when inheriting a Widget or a Widget subclass they will only need to
override the draw method and add the code they require to be executed.
Figure 9.1: Examples of objects rendered in NanoVG: a solid color square (left) and a rounded rectangle
with linear gradient paint (right).
Section 8.1.3. This widget, described in Section 9.3, has a bouncing object that responds to
mouse clicks in order to “hit” the modal synthesis engine. By learning from these examples,
you should be able to entirely customize your interface by changing the appearance of standard
elements such as the knobs, to add informative text, or even to create touchable interfaces that
react to the user.
1. Sketch your idea on paper and decide the width and height of the component in
millimeters (mm).
2. Design an SVG image for the component with the exact size as decided at the previous step,
and save it to the res/ folder of your plugin.
3. Create a new struct by inheriting a basic component and define the SVG file to be used for the
new component.
The reason for having the exact size decided at the beginning is that Rack does not support
rescaling (it did only in versions below 0.6), and thus the size of the SVG file will also apply to
the knob in Rack.
For the design of the SVG file, we take Inkscape again as the reference application. You should first
open a new document and set its size. Go to File → Document Properties (or Shift+Ctrl+D) and:
• make “mm” the Default Units;
• make “mm” the Units (under Custom Size line); and
• set the Width and Height in mm as decided at step 1.
It is also suggested to use a grid, for better alignment, snapping, and positioning of objects. In the
Document Properties, go to the Grid tab, create a New Rectangular Grid, and verify that the spacing
for both X and Y is 1 mm and the origin is at 0, 0. Checking the “Show dots instead of lines” box
is suggested for improved visibility. In Figure 9.2, we show a simple blue knob where the bounding
box of the circle has been snapped to the page corners.
Exporting the drawing to SVG in the ABC/res/ folder will make it available for use in our modules.
Let us see how.
In Section 5.2, we reported that the RoundBlackKnob object inherits RoundKnob and adds an SVG
resource, used as image. We will similarly inherit RoundKnob and add our own SVG resource to
create a new class RoundBlueKnob. The code follows:
Figure 9.2: A blue knob drawn in Inkscape with size 10×10 mm.
Please note that we are using an SVG file that has size 10×10 mm. Conversely, if we want to create
a small trimpot, similar to the Trimpot class, we have to use a differently sized SVG, because we are
not allowed to scale the SVG inside the code. We have to design a smaller knob (e.g. of size 6×6 mm).
Suppose that we also want to change the minimum and maximum angles for knob turning. We’d
better inherit SvgKnob, which is a parent of RoundKnob. Indeed, the only difference between the
two is that RoundKnob defines the turn angles to a fixed value, and we want to be free to set our
own. We inherit from SvgKnob (defined in the app namespace) and define a new range for the knob
turn:
Similar considerations apply to input and output ports. A generic port graphic is defined as:
setSVG(APP->window->loadSvg(asset::system("path/to/my/file.SVG")));
}
};
Please note that ports and knobs have a shadow drawn by default. The default parameters for the
blur radius, the opacity, and the position can be overridden by changing the following properties in
the constructor:
• shadow->blurRadius. The radius of the blurring (default: 0.0).
• shadow->opacity. The opacity of the shadow. The closer to 0.0, the more transparent
(default: 0.15).
• shadow->box.pos. The position of the shadow. By default, it is Vec(0.0, box.size.y * 0.1) (i.e.
the x coordinate is exactly the same as the knob, while the vertical position is slightly below by
a factor dependent on the knob’s y size). Remember that the y coordinate increases from top to
bottom.
By changing these values, you can drop the shadow slightly to a side, blur it a bit, or change its
transparency (or inversely its opaqueness). In general, making shadows consistent is a good choice
for a pleasing graphical appearance. Figure 9.3 shows the default shadow setting and a shadow
dropping slightly to a side with opacity 0.3 and blurRadius 2 mm.
Giving a custom look to switches follows a similar principle; however, while knobs require one
image only (i.e. rotated), switches require several SVG images that are interchanged when the
switch is pressed.
Figure 9.3: A custom knob with the default shadow settings (a) and modified shadow settings (b)
(horizontal shift, opacity 0.3, and blurred shadow).
These different images are called frames. The number of such frames determines the
number of states that the switch has. The Switch object can implement a momentary button or
a maintained switch. A momentary button is active only while the user presses it. A maintained switch
changes its value each time the user presses it. The type of switch is set by the momentary bool variable.
When it is set to false (the default), more than one state is available, depending on the number of
SVG frames we add to the newly defined class:
It’s as easy as that. Please note that SvgSwitch is a child of Switch that adds, for example, the
pointer to the vector of the SVG frames.
AStepDisplay(Vec pos);
void setColor(unsigned char r, unsigned char g, unsigned char b,
unsigned char a);
This class has a constant font height of 20 px, an NVGcolor struct that defines the text color, and
a pointer to font type. It has a pointer to the ASequencer module for reading the step number. The
setColor method sets the text color, using 8-bit values for red, green, blue, and alpha (transparency).
The draw method is the graphical equivalent of the process method for modules: Rack calls the draw
method of all the instantiated widgets periodically, in order to redraw the content of each one. It is
usually called with a frame rate of approximately 60 fps. When designing new widgets, we have to
override the default method and add to it our custom code. The code for the draw method is:
TransparentWidget::draw(args);
drawBackground(args);
drawTxt(args, tbuf);
}
This method takes a struct containing arguments that are needed to draw. Among these, there is the
pointer to the NVGcontext, found under args.vg. This is necessary, as we have seen above, for
calling NanoVG methods. In our overridden draw, we first check for the existence of the module, to
avoid a segmentation fault. We then read its stepNr variable, storing the current sequencer step. This
is translated into text using the snprintf C library function. Finally, we call the original draw method
from TransparentWidget and we supplement it with our custom drawing methods: drawBackground
for the background and drawTxt for the text. The text is written after the background so that it
appears over it. The functions are reported below:
The background consists of a rounded rectangle that adds contrast to the text and highlights it.
Drawing a rectangle can be done using the nvgRect command seen above. However, for this custom
shape, we need to create a path point by point and fill it afterwards. To create a path, the
nvgBeginPath command is issued, then a “virtual pen” is “placed on the paper” by moving its tip to
a starting point with nvgMoveTo. Several nvgLineTo commands are issued to move the pen without
detaching it from the paper. Finally, the nvgClosePath command tells NanoVG that the path should
be closed. If you want to create curves, you have to use nvgBezierTo, nvgQuadTo, and nvgArcTo
commands instead of nvgLineTo. A little knowledge of vector graphics is required to apply these
functions correctly, but, as an example, we can see how to draw a rectangle with rounded sides, as
in Figure 9.4. We call nvgQuadTo in order to make the sides rounded instead of straight lines. This
method requires two additional arguments, providing the coordinates of the control point that
shapes the quadratic Bézier curve. Once the path is drawn, it can be filled (in white) and stroked
(in black, with alpha 0xF).
nvgFontSize(args.vg, fh);
nvgFontFaceId(args.vg, font->handle);
nvgTextLetterSpacing(args.vg, -2);
nvgTextAlign(args.vg, NVG_ALIGN_CENTER);
Figure 9.4: A rectangle with rounded sides, drawn using NanoVG library. Text is superimposed by drawing
it afterwards.
Figure 9.5: The rounded path drawn according to the code snippets provided above. The bounding box is
shown in a dashed line.
As you can see, drawTxt calls several functions from the NanoVG library. All operations are
conducted on the NanoVG context pointer vg. Font properties are set:
• the size, via nvgFontSize;
• the actual font, via nvgFontFaceId;
• the letter spacing, via nvgTextLetterSpacing, reduced slightly to fit more text; and
• the alignment, via nvgTextAlign.
Finally, the text fill color is set and the text stored in the text string is written.
This text widget is provided as a basic didactic example to override the default draw method. You
are encouraged to experiment with widgets, have a look at NanoVG, design novel concepts, and try
to implement them.
Figure 9.6: The AModalGUI module. The blue mass falls down due to gravity. By clicking on the black area,
the bouncing mass is repositioned and left falling. When it hits the ground, it impacts the resonators, thus
stimulating their oscillation.
• On mouse click, the mass (a blue circle) will start falling down from the set height.
• The mass will bounce as a result of an inelastic collision.
• When the mass hits the ground (hit point), the velocity is transferred to the Modal module to
“kick” the resonators.
A pointer to the related AModal module is required, and will be initialized in AModalGUIWidget.
The variables we initialize are the mass x and y coordinates, massX and massY. The velocity of the
mass, massV, starts at zero. A constant acceleration, massA, is defined, as well as the mass radius (5 px)
and a restitution ratio, massR, explained later. A final constant is thresV, a threshold velocity for
stopping the mass from bouncing.
The onButton event handler is defined as follows:
This method gets the button event, e, and stops it from propagating to any child widgets. If the
pressed button is the left one, it calls the moveMass method, which changes the position of the
mass, and consumes the event. The moveMass method just changes the mass coordinates:
Now, how do we ensure that the mass will fall down? The draw method will periodically update the
mass position, changing its vertical position according to a constant-acceleration law. Let us step into
the draw method to see what happens at each periodic call.
The top of draw handles the mass model:
// FALL MODEL
if (massY <= impactY) {
// free mass
massV += massA;
massY += massV;
} else {
// impact
module->impact(massV, massX); // transmit velocity
massV = -massR * massV;
if (fabs(massV) < thresV)
massV = 0.f;
else
massY = impactY;
}
When the mass is “suspended in air” (i.e. massY is above the ground), the mass is in free
fall. Please note that the y coordinate increases from top to bottom, and thus the mass is above the hit
point impactY if its y value is smaller. The mass fall is simulated by simply increasing the velocity
massV by a constant quantity, massA, the mass acceleration, and computing the y coordinate by
accumulating the velocity. When the mass touches the resonating body (i.e. is below the impactY
position), we have an impact. We invoke the impact method of AModal, sending the mass velocity
and x coordinate. We will see later how these values are used. We also invert the mass velocity
vector by imposing a negative sign and reduce its absolute value to simulate a transfer of energy to
the resonating body (otherwise bounces would go on indefinitely). Finally, we place the mass exactly
at the position impactY to allow it to be in the free mass condition at the next round. Notice that this is
not a real physical model of an impact, but is intuitive and functional enough for our goals here.
To prevent infinite bounces, when the residual energy from the impact is too small (below a constant
threshold), we prevent the mass from bouncing again. Specifically, if the mass velocity is below the
threshold, thresV, it is stopped, by imposing a null velocity, and its position is left as is, in touch
with the resonating body, preventing it from being in the free condition again.
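The fall-and-bounce logic can be exercised in isolation with a small sketch; the constants below are illustrative, and the velocity handoff to the module is replaced by a plain member variable.

```cpp
#include <cmath>

// Standalone sketch of the bouncing-mass update described above.
// Constants are illustrative, not the module's actual values; y grows
// downward, so impactY is the "ground" position.
struct BounceSketch {
    float massY = 0.f, massV = 0.f;
    float impactY = 100.f;        // ground position
    float massA = 0.5f;           // constant acceleration per frame
    float massR = 0.6f;           // restitution ratio
    float thresV = 0.2f;          // stop threshold
    float lastImpactV = 0.f;      // stand-in for module->impact()

    void step() {
        if (massY <= impactY) {   // free fall
            massV += massA;
            massY += massV;
        } else {                  // impact
            lastImpactV = massV;  // velocity that would excite the resonators
            massV = -massR * massV;  // invert and damp the velocity
            if (std::fabs(massV) < thresV)
                massV = 0.f;      // rest: stop bouncing
            else
                massY = impactY;  // re-enter the free-fall condition
        }
    }
};
```

Because each bounce scales the velocity by massR, the impact velocity decays geometrically until it falls below thresV and the mass comes to rest.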
Now, the drawing. The mass, the waveform, and the background overlap. We thus have to draw
them from the background to the foremost element, respecting their layering: the last element drawn
will cover the previous ones. The background is rendered as a dark rectangle:
//background
nvgFillColor(args.vg, nvgRGB(20, 30, 33));
nvgBeginPath(args.vg);
nvgRect(args.vg, 0, 0, box.size.x, box.size.y);
nvgFill(args.vg);
We set a dark fill color (20, 30, 33) for a rectangle of the size of the entire widget and we fill it.
The variable args.vg is a pointer to NVGcontext.
The resonator is graphically represented as a waveform hinting at its oscillations: we will take the
output from the module, buffer it, and draw it. The reason for buffering is that we need to draw
multiple points and connect them in a path. Furthermore, the draw method is executed at a lower
frequency than the process method, and thus it would be impossible to update the drawing for each
output sample. AModalGUI will fill this buffer periodically and HammDisplay will read it. The
waveform path is drawn, taking samples from module->audioBuffer, a float array of size
SCOPE_BUFFERSIZE. The module also exposes an integer counter that tells were the last written
sample was in the array. The array is treated as a circular buffer by resetting the counter to zero
every time it hits the end of the array. The drawing code follows:
// Draw waveform
nvgStrokeColor(args.vg, nvgRGBA(0xe1, 0x02, 0x78, 0xc0));
float * buf = module->audioBuffer;
Rect b = Rect(Vec(0, 15), box.size.minus(Vec(0, 15*2)));
nvgBeginPath(args.vg);
unsigned int idx = module->idx;
for (int i = 0; i < SCOPE_BUFFERSIZE; i++) {
float x, y;
x = (float)i / float(SCOPE_BUFFERSIZE-1);
y = buf[idx++];
if (idx >= SCOPE_BUFFERSIZE) idx = 0;
Vec p;
p.x = b.pos.x + b.size.x * x;
p.y = impactY + y * 10.f;
if (i == 0)
nvgMoveTo(args.vg, p.x, p.y);
else
nvgLineTo(args.vg, p.x, p.y);
}
nvgLineCap(args.vg, NVG_ROUND);
nvgMiterLimit(args.vg, 2.0);
nvgStrokeWidth(args.vg, 1.5);
nvgGlobalCompositeOperation(args.vg, NVG_LIGHTER);
nvgStroke(args.vg);
We first set the color of the path with nvgStrokeColor. We get the audio buffer, which is collected
inside AModal, and for each element of this buffer we set the y coordinate of a new point proportional
to the value of the audio sample (a 10.f multiplier magnifies the value), and the x coordinate is
a progressively increasing value, such that the last sample of the buffer will be drawn to the extreme
right. The y value is offset by impactY to make it appear exactly where the impact point with the mass
happens. Finally, the two commands used to draw lines are nvgMoveTo for the first point of the shape
and nvgLineTo for any following element. nvgLineTo will create a segment between the last point
(previously set by nvgLineTo or nvgMoveTo) and the new given point. Remember that the path is created
by a “virtual pen.” To draw a line, the pen must be dragged on the paper (e.g. see nvgLineTo). To move
the pen without drawing, you have to lift it and move it to a new position (nvgMoveTo).
After the shape has been drawn, a few commands are invoked to set the line properties: the line cap,
the miter limit, the stroke width (i.e. the thickness of the line), and the layer compositing. With the
nvgStroke, we complete the operations and finally get the line visible.
The mass is the last element we need to draw:
// Draw Mass
NVGcolor massColor = nvgRGB(25, 150, 252);
nvgFillColor(args.vg, massColor);
nvgStrokeColor(args.vg, massColor);
nvgStrokeWidth(args.vg, 2);
nvgBeginPath(args.vg);
nvgCircle(args.vg, massX, massY, massRadius);
nvgFill(args.vg);
nvgStroke(args.vg);
The operations are similar to those seen for the other elements. We set an RGB color and use it as both
the fill and stroke color of the object we are going to draw. We set the stroke width, begin a new path,
and make it a circle at coordinates (massX, massY) with radius massRadius. We fill and stroke it. Done!
Now let’s have a look at the changes done in AModal. The impact method is simply:
That is, when an impact occurs, the velocity and impact x coordinate are communicated to
AModalGUI. The impact x coordinate is transformed into a value that is useful to modify the damp
slope (i.e. it is mapped to the range ±DAMP_SLOPE_MAX).
In the process function, the damp slope is affected by hitPoint:
dsl += hitPoint;
The hit velocity is treated as an impulse that adds with the input signal port:
if (hitVelocity) {
in += hitVelocity;
hitVelocity = 0.f;
}
Whenever this value is nonzero, it is added to the input and zeroed as we consumed it.
audioBuffer[idx++] = cumOut;
if (idx >= SCOPE_BUFFERSIZE) idx = 0;
The audio buffer index idx wraps when the end of the buffer memory is reached; whoever reads the
buffer (the HammDisplay widget) will need to take care of wrapping too.
To conclude this section, the AModalGUIWidget has a few changes with respect to AModalWidget,
in that it has a different size and it adds the HammDisplay widget as follows:
The HammDisplay is created and its properties are set, including position, a reference to the
AModalGUI module, and the impact point position.
TIP: One important tip to remember when you design custom widgets: in Rack v1, the module
browser draws a preview for each module, directly calling the draw method of the module
widget and its children. When doing so, it does not allocate the module, and thus the pointer
to the module will be NULL. It is important to enforce a check on the module pointer: if this is
NULL and we proceed to call its methods and members, we get a segmentation fault! Thus, in
the draw method of your custom widgets, always perform the following check:
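In Rack v1 this guard is simply an early return when the module pointer is NULL. The following self-contained illustration shows the pattern; note that the Module and DrawArgs types below are simplified stand-ins for the real Rack classes, used only so the example compiles on its own:

```cpp
#include <cassert>

// Simplified stand-ins for rack::Module and the draw arguments.
struct Module { int hitPoint = 0; };
struct DrawArgs {};

struct HammDisplayLike {
    Module *module = nullptr; // NULL when drawn by the module browser preview
    bool drewModuleData = false;

    void draw(const DrawArgs &args) {
        if (!module)
            return; // guard: never dereference a NULL module pointer
        // Safe to access module members from here on.
        drewModuleData = (module->hitPoint >= 0);
    }
};
```

The one-line check `if (!module) return;` at the top of draw is all it takes to make the widget safe in the module browser preview.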
Exercise 1
One concern about this implementation is related to the buffering mechanism. The draw and the pro-
cess methods are concurrently working on the same variables: audioBuffer and idx. One task may pre-
empt the other, inadvertently modifying the contents of these variables. This issue can be solved by
introducing two buffers, one for writing and the other for reading. A synchronization mechanism, pro-
vided by a bool variable, tells whether the read buffer is ready for drawing. All buffers and the bool vari-
able are instantiated in the module. The widget can access them through the pointer to the module.
Take a pen and paper, design your system, and try to prevent possible issues before starting coding!
Synchronization of concurrent access to data is a delicate topic – take your time.
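One possible design for this exercise might be sketched as follows. This is a single-threaded simulation: the names, the buffer size, and the publishing policy are my own choices, and keep in mind that a plain bool is not a true synchronization primitive; real code must still reason about preemption between the audio and GUI threads.

```cpp
#include <cassert>
#include <cstring>

static const int SCOPE_BUFFERSIZE = 512; // illustrative size

struct ScopeBuffers {
    float writeBuf[SCOPE_BUFFERSIZE]; // filled by process()
    float readBuf[SCOPE_BUFFERSIZE];  // consumed by draw()
    int idx = 0;
    bool readReady = false; // set when readBuf holds a complete frame

    // Called from process(): store one sample, publish a frame on wrap-around.
    void push(float sample) {
        writeBuf[idx++] = sample;
        if (idx >= SCOPE_BUFFERSIZE) {
            idx = 0;
            if (!readReady) { // only publish once draw() consumed the last frame
                std::memcpy(readBuf, writeBuf, sizeof(readBuf));
                readReady = true;
            }
        }
    }

    // Called from draw(): returns true if a fresh frame was available.
    bool consume(float *dest) {
        if (!readReady)
            return false;
        std::memcpy(dest, readBuf, sizeof(readBuf));
        readReady = false;
        return true;
    }
};
```

The key property is that process() only touches writeBuf and draw() only touches readBuf; the readReady flag arbitrates the hand-off between the two.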
CHAPTER 10
Additional Topics
The “--args” flag tells GDB to call the application with the arguments provided after its name, in
this case the “-d” flag. The debugger will start and load Rack and its symbols. If it finds these, it
will show “Reading symbols from ./Rack…done.”
Now a prompt appears and you can start the debug session. If you have never used GDB, a few
commands will come in handy. A non-exhaustive list is reported in Table 10.1.
You will usually set some breakpoints before you start running the application. At this point, the
debugger only knows symbols from Rack. Since your plugins will be loaded dynamically by Rack
after you start it, their symbols are not yet known. If you ask GDB to set a breakpoint at:
break ADPWOsc<double>::process
the debugger will ask: “Make breakpoint pending on future shared library load?” If you are
sure that you spelled the symbol correctly, you can answer “y”. Indeed, when Rack
loads the libraries, their symbols will be visible to the debugger as well, and GDB will be able
to make use of them. Now you can run the debuggee:
run
Rack is running and you can interact with it as usual. Now load ADPWOsc. Rack will load it, and
as soon as the process function is called for the first time GDB will stop the execution at the
beginning of it, since we set a breakpoint there. Whenever the debugger stops execution, it shows
a prompt where we can type commands. The debugger will also show the line of code where the
debuggee was stopped (in this case, line 51):
Thread 4 “Engine” hit Breakpoint 1, ADPWOsc<double>::process
(this=0x1693600, args=...)
at src/ADPWOsc.cpp:51
51 float pitchKnob = params[PITCH_PARAM].getValue();
The line shown has not been executed yet, and thus the value of pitchKnob is not initialized.
Indeed, printing it with:
print pitchKnob
will probably result in a weird value: the memory area assigned to the variable has not been cleaned
or zeroed, and thus we are reading values that were written in that memory area by other processes
or threads in the past. If we want to read its value, we can advance by one line with:
next
Now we can again print the value of pitchKnob and see the actual value, given by the knob position.
Please note that the next and step commands differ in that the former jumps to the next line of code in
the current context, while the latter enters any function call it finds on its way. For instance, doing step at line 51
(shown above) would enter the getValue() method, while next goes to line 52 of the process
function.
Please note that you can print only variables that belong to the current context and global variables.
Let us show this with an example. Set a breakpoint at the process method of DPW and issue
a continue to run until you get past it:
break DPW<double>::process
continue
Here, you will be able to examine, for example, triv, but not pitchKnob, which now belongs to the
parent function and is not in the current context. If you are not convinced, try printing the call stack:
where
at src/engine/Engine.cpp:251
#3 0x00000000006da256 in rack::engine::Engine_step (that=0xdde060) at
src/engine/Engine.cpp:334
As you can see, the current context is DPW<double>::process. This was called by
ADPWOsc<double>::process, which in turn was called by the engine function stepModules, and so
on and so forth.
Now advance through the DPW::process until you get to the return statement. Going past it will take
you back to ADPWOsc::process. Once you are back there, you will be able to see pitchKnob
again and all the other variables that are in the scope.
This code clearly causes a segmentation fault, because you will soon end up writing “1” into
a protected memory area.
Now start a debug session with GDB:
If you add AComparator to the rack, as soon as the module constructor is executed, the execution
will stop and you will get the following:
Thread 1 “Rack” received signal SIGSEGV, Segmentation fault.
0x00007fffec007b8c in AComparator::AComparator (this=0x197e810) at src/
AComparator.cpp:31
31 crashme[i++] = 1;
The debugger is telling you where in the code the issue is. You can print the counter i, to see how
far it went before crossing a protected memory area. Printing the values of the variables in context
gives valuable information.
Please note that the fault is detected only when the counter has crossed the end of the memory area
reserved to the process. The offending code section may have written “1” over many other variables
that belong to the process, corrupting them! In any case, you should always take care
when incrementing or decrementing array indexes or pointers.
default” and provide the path of the folder where the Rack executable resides. Finally, go to the
“Common” tab and add it to the “Debug” favorite menu, for recalling it quickly.
This should be enough to launch a debug session inside Eclipse. You can now launch the debug
session by clicking on the debug button and selecting the “debug Rack” configuration.
The system should start Rack and stop at the beginning of main(), as seen in Figure 10.1. You can
click on the continue button to go ahead. Rack will open up and run. Now you can add modules of
yours, and add breakpoints to them. The debugger will be able to stop on these breakpoints. All the
other benefits of a debugging session will be at your disposal, including watching variables or
reading memory areas.
To handle the program flow, Eclipse provides the following buttons:
• Resume. See GDB command continue.
• Suspend. Suspends the application at any time, wherever it is.
• Terminate. Aborts execution.
• Step into. The equivalent of GDB command step.
• Step over. The equivalent of GDB command next.
Breakpoints can be added or deleted by double-clicking on the code window, on the column to the
left, showing line numbers.
Figure 10.1: A debug session for Rack in the Eclipse IDE. On the top left, the call stack is printed, showing
that the application is suspended at main() in main.cpp. The expressions window on the top right shows
a list of variables or expressions and their values. The memory browser on the bottom right allows reading
memory areas, such as arrays or strings. Finally, breakpoints are summarized in the right pane.
enum InputIds {
NUM_INPUTS,
};
enum OutputIds {
NUM_OUTPUTS,
};
enum LightsIds {
NUM_LIGHTS,
};
FILE *dbgFile;
const char * moduleName = "HelloModule";
HelloModule() {
config(NUM_PARAMS, NUM_INPUTS, NUM_OUTPUTS, NUM_LIGHTS);
std::string dbgFilename = asset::user("dbgHello.log");
dbgFile = fopen(dbgFilename.c_str(), "w");
dbgPrint("HelloModule constructor\n");
}
~HelloModule() {
fclose(dbgFile);
}
};
The resulting output, read from dbgHello.log after opening HelloModule in Rack and closing
Rack, is:
[HelloModule]: HelloModule constructor
[HelloModule]: HelloModule destructor
Printing must not be abused. It is pointless to write a printf statement inside the process()
function without a conditional statement: it will flood your system with data that you cannot
read anyway, because it is printed tens of thousands of times per second. If you need to monitor
a variable inside the process() function, you should use a counter to reduce the periodicity of
execution of the DEBUG or vfprintf. An example may look as follows:
if (counter > args.sampleRate * 0.5f) {
DEBUG("%f", myVar);
counter = 0;
} else counter++;
The counter variable is increased at each process invocation, and when it reaches a threshold it
prints out the value of the variable myVar. This happens only twice per second because we set the
threshold to half the audio sampling rate.
10.2 Optimization
Optimization is an important topic in time-critical contexts. Real-time signal processing falls into
this category, and as a developer you are responsible for releasing reliable and optimized code, in
order to use the least computational resources necessary for the task. Since your modules will
always be part of a patch containing several other modules, the computational cost of each one has
a weight on the overall performance, and all developers should avoid draining resources from other
developers’ modules.
Optimization, however, should not be considered prematurely. Yes, you should design your code
before you start coding. You should also consider possible bottlenecks and computational issues of
the implementation strategy you are adopting. You should also rely on classes and functions of the
Rack library, since these are generally well thought and optimized. However, you should not waste
time optimizing too early in the development process. Well-designed code requires optimization only
after it has been tested and it is mature enough. Look at the CPU meter: if the module looks to be
draining too many resources compared to the other ones (for a given task), then you may start
optimizing.
In the optimization process, you should look for bottlenecks first. Bottlenecks are code fragments
that take most of the execution time. It does not make sense to improve the speed of sections of
code that do not weigh much on the overall execution time.
In general, expensive DSP tasks are those that:
• work on double-precision and complex numbers;
• iterate over large data (e.g. convolutions/FIR filters with many taps);
• nest conditional statements into iterative loops; and
• compute transcendental functions, such as exp, pow, log, sin, and so on.
Modern x86 processors have many instructions for swiftly iterating over data, working on double
and complex data types, or processing data in parallel. SIMD processor instructions can be exploited to
improve code speed several times over. This topic, however, falls outside the scope of this book.
Other dangerous activities are those related to I/O and system calls. These operations may be
blocking, and should not be called in the process method, which is time-constrained. The execution
of the process function should be deterministic (i.e. the function should complete in a regular
amount of time, without too much variation from time to time).
In the following subsections, we discuss methods to reduce the cost of computing transcendental
functions. We discuss sine and cosine approximation techniques, and table lookup for storing
precomputed values of a transcendental function.
where μ and σ are adjustable constants, determining the horizontal offset and the width of the bell-like
shape. Computing the value for each input x may be quite expensive due to the presence of several
operations, including two divisions and an exponentiation.
This and many other waveshaping functions have polynomials, trigonometric series, or exponential
functions that are expensive to compute on the fly. The best option is to precompute these functions
at discrete input values and store them into a table. This is shown in Figure 10.2, where a set of
precomputed output values (a saturating nonlinearity) are stored together with the corresponding
input values. As the figure suggests, however, we do not know how to evaluate the output when the
input has intermediate values not stored in the table. For instance, what will it look like?
The solution to the issue is interpolation. There are several interpolation methods. The simplest one
is called linear interpolation and is nothing more than a weighted average between two known
points. The concept is very simple. Consider the mean between two values. You compute it as
(a + b)/2. This will give a value that is halfway between a and b (i.e. the unweighted average
between the two). However, in general, we are interested in getting any intermediate value that sits
Figure 10.2: A lookup table (LUT) storing pairs of discrete input values and the corresponding output
values, and its illustration. How do you compute the output for input values that have no output value
precomputed in the table?
in the interval between a and b. By observing that (a + b)/2 = 0.5a + 0.5b, you can figure out
intuitively that this is a special case of a broader range of expressions. The weighted average is thus
obtained by (1 − w)a + wb. The weight w tells where, in the interval between a and b, we should
look. Let us get back to the example of Figure 10.2. If we want to evaluate the output y8.5, corresponding to
the input halfway between x8 and x9, we compute y8.5 = 0.5·y8 + 0.5·y9. If we want to get the
output for input x8.25, which is halfway between x8 and x8.5, we should do y8.25 = 0.75·y8 + 0.25·y9.
The approach for getting a value out of a lookup table using linear interpolation is:
i = ⌊x⌋ (10.2a)
w = x − i (10.2b)
ỹ(x) = (1 − w) · y(i) + w · y(i + 1) (10.2c)
where i is the integer part of x (the brackets ⌊·⌋ denote the floor operation) and the weight w is the
fractional part of x. In geometrical terms, linear interpolation essentially consists of connecting the
available points with segments. Of course, this will introduce some error, as the function to be
interpolated is smooth (see Figure 10.4); however, the number of operations it involves is small.
Other interpolation schemes are known, which connect the points with curves instead of lines,
obtaining more accurate results. There is a computational cost increase involved with these, so we
will neglect them here.
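Equations (10.2a)–(10.2c) translate almost directly into code; here is a minimal standalone version, independent of the Rack library (the function name is my own):

```cpp
#include <cassert>
#include <cmath>

// Linear interpolation into a lookup table, following Eq. (10.2):
// i = floor(x), w = x - i, y~(x) = (1 - w) * y[i] + w * y[i + 1].
// The caller must guarantee 0 <= x < tableSize - 1.
float lutLinear(const float *y, float x) {
    int i = (int) std::floor(x);            // (10.2a) integer part
    float w = x - (float) i;                // (10.2b) fractional part
    return (1.f - w) * y[i] + w * y[i + 1]; // (10.2c) weighted average
}
```

For instance, with a table {0, 1, 4, 9}, reading at position 1.5 yields the average of entries 1 and 2, i.e. 2.5.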
Let us now discuss the practical aspects of linear interpolation. In Rack, we have a utility function,
interpolateLinear, from include/math.hpp, which handles linear interpolation for us. This
function requires the pointer to the LUT and the input value as arguments and it returns the
Figure 10.3: Linear interpolation for three points. Linear interpolation approximates the curve connecting
the known points with segments.
Figure 10.4: Three points of a lookup table (black dots), the curve represented by the lookup table (solid
line), and the curve obtained by linear interpolation from the lookup table points. The figure shows the linear
interpolation error e = y1 − y1L , where y1L is the linearly interpolated output corresponding to input value x1 ,
while y1 is the expected output for input x1 .
What we do is define the LUT length M and its values f[0] .. f[M-1]. It is not necessary to
store the values if we follow a given convention. The convention could be that M values are equally
spaced in the interval [−10, 10]. Given this convention, we also need to ensure that the input value
x never exceeds the upper and lower bounds to avoid reading outside the table boundaries. This is
done using the built-in clamp function of Rack. To translate the input range into the table length
(i.e. to map it into the range [0, M − 1]), we use the built-in function rescale. We are assuming
that x = M/2 is the origin of the LUT. Function interpolateLinear reads the array and
interpolates the value corresponding to the second input argument.
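Putting clamp, rescale, and linear interpolation together, the whole read operation might be sketched as follows. Note that clampf and rescalef below are standalone stand-ins for Rack's clamp and rescale (re-implemented here only so the example compiles on its own), and the [−10, 10] input range follows the convention above:

```cpp
#include <cassert>
#include <cmath>

// Standalone equivalents of Rack's clamp() and rescale() helpers.
float clampf(float x, float lo, float hi) {
    return x < lo ? lo : (x > hi ? hi : x);
}
float rescalef(float x, float xMin, float xMax, float yMin, float yMax) {
    return yMin + (x - xMin) / (xMax - xMin) * (yMax - yMin);
}
// Linear interpolation as in Eq. (10.2).
float lutLinear(const float *y, float x) {
    int i = (int) std::floor(x);
    float w = x - (float) i;
    return (1.f - w) * y[i] + w * y[i + 1];
}

// Read a LUT of M values that covers inputs in [-10, 10].
float lutRead(const float *table, int M, float x) {
    x = clampf(x, -10.f, 10.f);                        // stay inside the table
    float pos = rescalef(x, -10.f, 10.f, 0.f, (float)(M - 1));
    if (pos >= (float)(M - 1))                         // keep index i+1 valid
        pos = (float)(M - 1) - 1e-6f;
    return lutLinear(table, pos);
}
```

The final clipping of pos is exactly the warning below in action: the interpolation position must always stay strictly below the last table index, otherwise y[i + 1] reads past the end of the array.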
Warning: The second argument to interpolateLinear must always be lower than the size of
the lookup table. Reading outside the memory area of the table will yield unpredictable values and –
even worse – a possible segmentation fault!
Regarding the choice of the table size, bear in mind that a trade-off must be considered between
memory occupation and interpolation quality. The interpolation error depends on the number of
values we store in the table: the bigger the LUT, the lower the error (noise). However, note that
when defining the LUT in your C++ code, this will be part of the compiled code, increasing the size
of the plugin. When the size of a table gets larger than a few thousand values, it is better to save it
as a file and develop some code for loading the file into memory.
Large LUTs should not be copy-pasted into the code directly, but are better included as external
binary resources (e.g. in the form of a *.dat file). Such a file can be referenced from the source
code. To do this:
1. In the plugin Makefile, add:
BINARIES += myLUT.dat
2. In the .cpp file, place the following line where you would declare the LUT:
BINARY(myLUT_dat)
BINARY_START(myLUT_dat)
BINARY_END(myLUT_dat)
BINARY_SIZE(myLUT_dat)
Please note that the first two return a (const void *) type.
For further reference, you can find these defines in include/common.hpp.
As a hint for further reading, we should also add that the table lookup technique is used for
wavetable synthesis as well. In this case, the table stores samples of a waveform that is read periodically to
produce the output waveform. Depending on the read speed, the pitch of the output waveform changes.
This technique uses a playback head that can assume floating-point values, and thus requires
interpolation. In Fundamental VCO (plugins/Fundamental/src/VCO.cpp), tables for the sawtooth and
triangle waveforms are included. To interpolate, they use interpolateLinear.
The set of points, spaced by 0.1, is shown in Figure 10.5 for the first half of a sine period.
Since the remaining part of the sine period is identical apart from vertical mirroring, it is not
meaningful to store values over π, as we can compute them from the range 0 − π. By adding
some more complexity to the code, even a quarter of a sine (0 − π/2) could be sufficient, since the values repeat
(although also mirrored horizontally).
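The mirroring idea can be sketched as follows: only the first quarter of the waveform is tabulated, and the rest is reconstructed by index and sign mirroring. The table size and the nearest-sample (non-interpolated) read below are illustrative choices of mine:

```cpp
#include <cassert>
#include <cmath>

static const int QSIZE = 64;               // samples covering [0, pi/2]; illustrative
static const float HALF_PI = 1.57079632679f;
static float quarter[QSIZE + 1];           // one guard value at the top

void buildQuarterTable() {
    for (int i = 0; i <= QSIZE; i++)
        quarter[i] = std::sin(HALF_PI * i / QSIZE);
}

// Approximate sin(2*pi*phase), with phase in [0, 1),
// reading only the first quarter of the waveform.
float sineFromQuarter(float phase) {
    float q = phase * 4.f;          // which quarter of the period are we in?
    int quad = (int) q;             // 0..3
    float frac = q - quad;          // position inside that quarter
    if (quad == 1 || quad == 3)
        frac = 1.f - frac;          // horizontal mirroring
    int idx = (int) (frac * QSIZE); // nearest-sample read (no interpolation)
    float s = quarter[idx];
    return (quad >= 2) ? -s : s;    // vertical mirroring for the second half
}
```

With 64 samples per quarter and no interpolation, the error stays within a few hundredths; linear interpolation between table entries, as described above, would reduce it further.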
Figure 10.5: Half a period of sine as floating-point values stored in a lookup table.
sin(2πx) ≈ 1 − 16(x − 1/4)²,  for 0 ≤ x ≤ 0.5
sin(2πx) ≈ −1 + 16(x − 3/4)²,  for 0.5 < x ≤ 1   (10.4)
This sine approximation will show additional harmonics because it is not perfect, but the result is
sufficient for many practical use cases.
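A member of this family of low-cost approximations is the piecewise-parabolic sine, built from one parabola per half-wave; the exact polynomial used below is a common textbook choice, shown only as an illustration:

```cpp
#include <cassert>
#include <cmath>

// A cheap piecewise-parabolic sine approximation: one parabola per half-wave.
// phase is in [0, 1), approximating sin(2 * pi * phase).
float parabolicSine(float phase) {
    if (phase < 0.5f) {
        float d = phase - 0.25f;
        return 1.f - 16.f * d * d;   // positive half-wave, peak +1 at phase = 1/4
    }
    float d = phase - 0.75f;
    return -1.f + 16.f * d * d;      // negative half-wave, trough -1 at phase = 3/4
}
```

The parabola matches the sine at the zero crossings and at the peaks, but deviates in between; this deviation is exactly the source of the additional harmonics mentioned above.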
For this oscillator, the output that we previously associated to the low-pass filter is now a sinusoidal
output, while the output previously associated to the band-pass filter is now a cosinusoidal output.
A large number of low-cost sine oscillators have been proposed in the scientific literature, as well as
quadrature sine/cosine oscillators. Some of these excel in numerical properties such as stability
under coefficient changes, a property that is necessary for frequency modulation, for example. Others have
good precision at extremely low frequencies. For the eager reader, I suggest comparing different
topologies and evaluating the best ones, depending on your needs. The second-order digital
waveguide oscillator is discussed in Smith and Cook (1992). The direct-form oscillator and
variations of the coupled-form oscillator are shown in Dattorro (2002). Lastly, a white paper by
Martin Vicanek supplies a brief comparison of several sine oscillator algorithms (Vicanek, 2015).
"frameRateLimit": 70.0
A value of 25 may be a good compromise between usability and performance. The graphics processing for
a patch of about 15 modules on an old 64-bit processor with no GPU can be heavy. In a test on a
third-generation Intel i5, I reduced the CPU load from 70% to 50% by lowering the frame rate from 70 to
25 fps.
• At the beginning of a new time step, Rack examines messageFlipRequested and swaps the two
pointers if it is true; producerMessage now points to s2 and consumerMessage points to s1 .
• At time step 1, module A::process is called:
○ Module A writes data to producerMessage.
○ Module A sets messageFlipRequested to true.
• At time step 1, module B::process is called:
○ Module B reads data from the consumerMessage pointer of module A; this corresponds to s1 ,
which contains data from time step 0.
• And so on.
In this example, we considered – for simplicity – a case with module A writing data to module
B. However, the mechanism allows both modules to read and write. The ping-pong mechanism
implied by the two swapping pointers allows one module to read from the data structure referenced by
consumerMessage while the other writes to the second data structure, referenced by
producerMessage. The two are then swapped, and thus the modules can always refer to the same
pointer transparently. One obvious consequence of this mechanism is a one-sample delay between the
writing and the reading of messages. Whether this is acceptable for your application is up to you.
Consider that a one-sample delay is implied by connecting modules through regular cables anyway.
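The swap mechanism described above can be simulated in isolation. The sketch below is a single-threaded model of my own (not the real rack::engine code) that makes the one-sample delay visible:

```cpp
#include <cassert>
#include <utility>

struct Msg { float value = 0.f; };

// Minimal simulation of Rack's producer/consumer message mechanism:
// two buffers, two pointers, swapped at the start of every time step.
struct ExpanderSim {
    Msg buf[2];
    Msg *producerMessage = &buf[0];
    Msg *consumerMessage = &buf[1];
    bool messageFlipRequested = false;

    // Called by the engine at the beginning of each time step.
    void beginStep() {
        if (messageFlipRequested) {
            std::swap(producerMessage, consumerMessage);
            messageFlipRequested = false;
        }
    }
    // The parent module writes its data at time step n.
    void parentWrite(float v) {
        producerMessage->value = v;
        messageFlipRequested = true;
    }
    // The expander reads: it always sees the value written one step earlier.
    float expanderRead() { return consumerMessage->value; }
};
```

Running a few steps of this simulation shows that the expander always reads the value the parent wrote at the previous time step: the one-sample latency of Figure 10.6.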
Now we shall consider, as an example, the implementation of an expander module that exposes all
16 individual outputs of the APolyDPWOsc. These are the specifications:
• The expander should host 16 output ports.
• The data structure that is passed to the expander is defined as an array of 16 float values.
• No data is sent back from the expander.
The data structure is defined as:
struct xpander16f {
float outs[16];
};
Its definition should be available to both APolyDPWOsc and its expander, APolyXpander.
Let us consider, first, the changes that we have to apply to APolyDPWOsc.
Two xpander16f structures are allocated statically, for simplicity, in APolyDPWOsc:
xpander16f xpMsg[2];
In the constructor, these two are assigned to the consumer and producer pointers:
rightExpander.producerMessage = (xpander16f*) &xpMsg[0];
rightExpander.consumerMessage = (xpander16f*) &xpMsg[1];
The first line verifies the presence of a module to the immediate right, and checks whether it is of the
modelAPolyXpander kind. If so, a convenience pointer of the xpander16f type is created
(remember that rightExpander.producerMessage is a void pointer, and thus we have to
cast it). Then we use this pointer to write the individual float values, reading the outputs (the
computation of the oscillator outputs has been done already). Finally, we request the flipping, or
swapping, of the producer and consumer pointers, which will take place at the next time step.
Now let us get to the Expander. This is developed as any other module, so it consists of two structs,
APolyXpander: Module and APolyXpanderWidget: ModuleWidget, and a model,
modelAPolyXpander. The module has 16 outputs. To ease your wrists, there is a convenient
define, ENUMS, that spares you from writing down all 16 outputs in the OutputIds enum. It
simply states that there are 16 outputs as follows:
enum OutputIds {
ENUMS(SEPARATE_OUTS,16),
NUM_OUTPUTS,
};
We will see how to address each one of them. To avoid putting each one of them in the
ModuleWidget, we can iterate as follows:
This creates two columns of eight ports each. As you can see, each output is addressed using
SEPARATE_OUTS plus an index.
The process method has nothing to do but:
• check for the presence of an APolyDPWOsc module to the left; and
• if it is present, read its data and write the values to the outputs.
As you can see, the first line checks if the pointer to the module on the left is not null and whether it is of
the APolyDPWOsc type. Then, if the parent is present, it looks for its consumerMessage buffer, sets
a convenience pointer to it, and starts copying its values to each individual output of the expander. If the
parent module is not present, all outputs are set to zero. The mechanism is depicted graphically in
Figure 10.6.
If you want to investigate how the system works, you can start a debug session. By printing the
memory address to which producerMessage and consumerMessage point, you will notice that they
are swapped at each time step.
Before concluding, we should clarify that the exchange mechanism can be bidirectional. In order to do
so, two additional buffers can be allocated in the expander for pushing data to the parent,
symmetrically to what has been described so far. Furthermore, to accommodate data from the expander,
the data structure can be increased, adding any sort of variables and data types. If, for instance, knobs or
buttons are added to the expander to affect the behavior of the DPW oscillator, this data needs to be
written back to the oscillator. A larger data structure should supersede xpander16f. Let us define it as:
struct xpanderPacket {
float outs[16];
bool reset;
};
Not all values are written or read at all times: the oscillator module would be responsible for reading
the Boolean variable and writing the 16 float values, while the expander module would be
responsible for writing reset and reading the 16 float values.
Figure 10.6: The message exchange mechanism used to send data from the parent module to the expander.
Two twin data structures are allocated in the parent module. Two pointers, producerMessage and
consumerMessage, point to one such structure each. At each time step, the parent writes to
producerMessage and the expander reads from consumerMessage. At the beginning of each time step,
the producerMessage and consumerMessage pointers are swapped. This introduces a one-sample
latency. Please note that for bidirectional data exchange, two other buffers are required. These can be
allocated in the expander module and treated similarly.
What is JSON, by the way? JavaScript Object Notation, aka JSON, is a data exchange format that is
similar in scope to markup languages such as XML. Despite the name, it is agnostic to the
programming language, and by now libraries for reading and writing JSON exist in all the most
popular programming languages. It is human-readable, easy to parse, and widely used in all
fields of ICT. Its main components are objects or tuples of the type:
where values can be of any kind: a string, an integer or real number, another key:value object, an
array, a Boolean value, and so on.
To understand this practically, let us examine the content of a VCV file, opening it with a text
editor. We start opening VCV Rack with an empty rack and put only one of our modules (e.g. ABC
ALinADSR). The file content will be something like:
{
  "version": "1.dev",
  "modules": [
    {
      "plugin": "ABC",
      "version": "1.0.0",
      "model": "ALinADSR",
      "params": [
        {
          "id": 0,
          "value": 0.5
        },
        {
          "id": 1,
          "value": 0.5
        },
        {
          "id": 2,
          "value": 0.5
        },
        {
          "id": 3,
          "value": 0.5
        }
      ],
      "id": 22,
      "pos": [
        0,
        0
      ]
    }
  ],
  "cables": []
}
This is standard JSON. As you can see, the content is easily understood by visual inspection. In
summary, we have a key:value object where “version” and “value” are specified, an array of modules,
in this case only ALinADSR, with position (0,0) in the rack, and with the parameters all set to 0.5.
Finally, there is an empty cables array, meaning that we did not connect any input/output port.
This is sufficient to store and recall most modules in the Rack community. The operation of storing
module information in JSON and recalling from JSON is done by two methods of the Module object:
virtual json_t *dataToJson();
virtual void dataFromJson(json_t *root);
These will take care of storing parameter values and basic module information. However, some
special modules may have more data to store. This is the case, for example, for special GUI
elements that you design, whose content needs to be saved explicitly, because Rack, by default,
only knows how to save parameter values. Similarly, you can store the values of custom context menu
items, such as the DPW order in the ADPWOsc module.
Rack employs jansson, a C library to store and recall JSON data. This library is part of Rack's
dependencies, and is downloaded when you execute the make dep command while preparing your
setup (see Section 4.4.4). The following function from jansson is used to store a new {key, value} pair:
int json_object_set_new(json_t *json, const char *key, json_t *value)
where the JSON content is given as the first argument, and the {key, value} tuple is given as second
and third arguments, respectively.
To find the value corresponding to a key in a JSON content, you can use the following function from jansson:
json_t *json_object_get(const json_t *object, const char *key)
where the first argument is the JSON content and the key is the second argument. The function
returns a pointer to a JSON element, which contains a value. Please note that it may also return
a NULL pointer if the key is not found, so be aware that you need to check it. The pointer can be
parsed to get the proper value and store it in one of the variables of your module. For a float value,
for example, you can recall the value and store it into a variable as follows:
json_t *valueToRecall = json_object_get(root, "myKey");
if (valueToRecall) {
myFloatVar = json_number_value(valueToRecall);
}
The key strings will be “hitPoint” and “nActiveOsc” (the same as the variable names, for
simplicity). The values are float and integer, respectively. We need to declare that we will be
overriding the base class methods, so we add the following lines to the declaration of AModalGUI:
json_t *dataToJson() override;
void dataFromJson(json_t *rootJ) override;
As you can see, we first instantiate a json_t object. We make use of the json_object_set_new method
to set the key and value pairs in the json root object. The key must be a const char *, so passing
a constant char array such as “hitPoint” is fine, but for simplicity the key strings are defined by the
JSON_NOSC_KEY and JSON_XCOORD_KEY macros. The value object must be cast to
a json_t * type. This is done through convenience functions such as json_real and
json_integer, which take a real (double or float) and an integer value, respectively, and return
a pointer to json_t.
This function saves data to JSON. Now, to restore this data back during loading, we can do the
following.
We scan the root for the JSON_NOSC_KEY and JSON_XCOORD_KEY strings using
json_object_get. This function returns a NULL pointer if it does not find the key, so a check
is performed before converting the result into a numeric value with the json_number_value
and json_integer_value functions, respectively. The converted value is stored directly into the variable
of interest. Special care must be taken for the hit point variable. This belongs to the module struct.
However, we need to change the massX variable in the HammDisplay widget in order to really
change the position of the mass. A mechanism to propagate this information from the module to the
widget follows: we add a Boolean value to the module (defaulting to false) that tells whether the hit
point has been changed in the module. We set this to true after reading the new value from JSON.
In the HammDisplay draw method, we add the following:
if (module->hitPointChanged) {
    massX = (module->hitPoint/DAMP_SLOPE_MAX + 0.5f)*MASS_BOX_W;
    module->hitPointChanged = false;
}
This ensures that if hitPoint has been changed from the module, we read it and set massX
accordingly. We also set hitPointChanged back to false, so that we do not enter this if clause
again. Please note that the conversion of hitPoint into massX is the inverse of what is done in
the impact method of AModalGUI.
Finally, we can test the code we just added. Open an empty Rack patch, place AModalGUI in it, and
click on a position to the right so that the mass falls down. Save the patch and close Rack. Launch
Rack again and open the patch: the ball will fall at the same x coordinate as when we saved the patch,
so the same timbre is available. If we take a look at the .vcv file, we notice that the
"hitPoint" and "nActiveOsc" keys are there:
{
  "version": "1.dev",
  "modules": [
    {
      "plugin": "ABC",
      "version": "1.0.0",
      "model": "AModalGUI",
      "params": [
        {
          "id": 0,
          "value": 1.0
        },
        {
          "id": 1,
          "value": 0.00999999978
        },
        {
          "id": 2,
          "value": 1.0
        },
        {
          "id": 3,
          "value": 0.0
        },
        {
          "id": 4,
          "value": 0.0
        }
      ],
      "data": {
        "nActiveOsc": 1,
        "hitPoint": 0.000701605692
      },
      "id": 23,
      "pos": [
        0,
        0
      ]
    }
  ],
  "cables": []
}
To conclude this section, strings can also be added to a JSON file. The following convenience
function wraps a string in a json_t pointer so that it can be stored:
json_t *json_string(const char *value);
The conversion functions to get a string from a json_t pointer are reported below. Additionally,
we can also get the string length in advance, so that we can allocate a char array to store it:
const char *json_string_value(const json_t *string);
size_t json_string_length(const json_t *string);
fix 1.0.0 and continue developing 1.1.0. After the fix, you will supply version 1.0.1 and will get to
finish 1.1.0. After that, you’ll want to merge the 1.1.0 branch with the main version.
This is just an example of how useful versioning is. If you are inexperienced with developing and
versioning, you will find a few words on the topic here, leaving most of the subject to more specific
textbooks.
Several software versioning systems are in use nowadays, but git is probably the most popular at
the time of writing. Git is also the versioning system employed by VCV Rack, which is well
integrated with it. You have already learned to use git commands for cloning the Rack repository.
Now let us see how to manage our own repository:
1. Create a GitHub account. There are many free git repository hosting services, but this is the one used
by VCV. Once you have a GitHub account, you will also be able to open tickets in the Rack issue tracker.
2. Create a new (empty) repository from your home page at github.com (e.g. called myPlugin1).
3. Initialize your local repository. Go to an empty folder where you will be fiddling with code (e.g.
Documents/sandbox/Rack/plugins/myPlugin1/) and type:
git init
4. Create a README.md file. This is part of GitHub’s best practices and is not mandatory.
However, we want to add at least one file to our first commit:
touch README.md
git add README.md
git commit -m "first commit"
5. Link your local repository to the remote one we just created on GitHub, and push the commit:
git remote add origin https://github.com/yourgithubname/myPlugin1.git
git push -u origin master
Now you can check online that the readme file has been pushed by heading your web browser to
https://github.com/yourgithubname/myPlugin1. From now on, if you work alone on the project,
you can:
• add existing files to the list of versioned files with the git add command;
• create a "commit" (i.e. a small milestone of your project) that will remain visible in the
local repository, with git commit; and
• push the local commits to the remote repository so that they will be available to others, with
git push.
If you are not familiar with git, a nice alternative to using it from the command line is a git client.
A good cross-platform choice is SmartGit, which is free for non-commercial projects.
This is especially useful for viewing the history, merging your work with others, and managing tags. Tags
are useful to keep some of the commits bookmarked with a version number. If, for example, you want to
allow a collaborator or an external user to download a specific version of your code, you can tag it with
a string such as "v1.2.3" or similar. Please remember that this may also be useful to keep track of
compatibility with Rack versions (e.g. jumping from v1.2.3 to v2.2.3 whenever Rack switches from v1
to v2).
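As a concrete illustration, the maintenance scenario from the beginning of this section (fixing 1.0.0 while 1.1.0 is in development) combines branches and tags. A hypothetical walk-through in a scratch repository (commit messages and version numbers are made up for illustration):

```shell
# Work in a throwaway directory with a dummy git identity.
cd "$(mktemp -d)"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
git init -q
git commit -q --allow-empty -m "v1.0.0 release"
git tag -a v1.0.0 -m "First stable release"
git commit -q --allow-empty -m "start 1.1.0 work"
# Branch off the released tag, apply the fix there, and tag the fix:
git checkout -q -b fix-1.0.1 v1.0.0
git commit -q --allow-empty -m "bugfix"
git tag -a v1.0.1 -m "Bugfix release"
# Back to the development branch, merging the fix into the upcoming 1.1.0:
git checkout -q -
git merge -q --no-edit fix-1.0.1
```

With a remote configured, `git push origin v1.0.1` would then publish the tag so users can fetch exactly that version.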
Please note that you don’t need to synchronize to a remote repository; a nice feature of git is that it
can also work locally. However, having a remote repository backs you up in case your PC has any
hardware or software faults.
Finally, if you sign up for a free GitHub account, your repository will be public. If you want to keep
your project closed-source, you need to pay or sign up with another repository hosting service that
allows you to make your repository private.
10.8.2 Donations
Whether you are a newbie wanting to give it a try or a free-software activist accepting no
compromise in releasing code to the community, donations are an option for earning a small
revenue from your modules. Donations help developers maintain the code: keeping it up to date,
cleaning out bugs, improving functionality, and expanding the module collection. The VCV
Rack plugin manager offers the option to add a “Donate” link that allows users to send you an
arbitrary amount of money through platforms such as PayPal.
The procedure to add the Donate link to the VCV store is pretty straightforward:
1. Sign up for an account on a money transfer platform such as PayPal. This should give you
a link that anyone can use to send you money.
2. Add the link to your plugin manifest (see Section 10.7) under the item "donation."
3. Send a pull request after you have added the manifest to get the plugin manager updated.
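For illustration, the manifest entry might look like the fragment below. The slug, name, and URL are hypothetical, and note that recent Rack manifests spell the field donateUrl, so check the current manifest documentation for the exact key name:

```json
{
  "slug": "myPlugin1",
  "name": "My Plugin",
  "version": "1.0.0",
  "donateUrl": "https://paypal.me/yourname"
}
```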
Do not be upset if users do not send you any donations! It is up to them to decide. You can try
raising the bar, or making your plugins more easily distinguishable from others, so that people
associate your brand with the quality of your modules.
Notes
1 If allocation is done dynamically, these need to be deleted in the module destructor.
2 If dynamic allocation were used, a check to avoid dereferencing a NULL pointer would be necessary.
3 The version field is empty if you are running from a version of Rack compiled from git.
CHAPTER 11
After Reading This Book
The first aim of this book is to leverage the fun and simplicity of VCV Rack to help the reader
get involved with programming, digital signal processing, and music experimentation. This book
is thus not only a quick guide to Rack, but also a bridge toward in-depth learning of many other
subjects. With this in mind, some suggestions for further reading are left to the reader, in the
hope that his or her next read will be a deeper and more complex book on one of the suggested
topics. Other suggestions are also provided for those who prefer to play with electronics.
Have fun!
2019). Three detailed papers (old but gold) from Jon Dattorro covering reverberation, filtering,
modulation effects, and pseudo-noise generation that may be interesting to the reader can also be
found (Dattorro, 1997a, 1997b, 2002).
Besides the traditional approaches, interesting novel variations of morphing and hybridization are
obtained using non-negative matrix factorization (NMF) (Burred, 2014), the discrete wavelet
transform (DWT) (Gabrielli and Squartini, 2012), and live convolution (Brandtsegg et al., 2018).
Deep learning is the newest approach for sound synthesis (van den Oord et al., 2016; Gabrielli
et al., 2018; Mor et al., 2018; Boilard et al., 2019). At the moment, it is not computationally
feasible to fit these algorithms in real-time music processing platforms, but in the years to come
this may change. Tones can be also generated offline by these algorithms and be processed in real
time, as is done for the NSynthsuper project, which mixes tones generated by the Nsynth neural
synthesizer (Engel et al., 2017). Machine learning is also a valuable tool for generative music.
Generating musical events is not so demanding in computational terms, and thus a generative
model based on neural networks or other machine learning techniques is feasible. A thorough
review of computational intelligence techniques for generative music is provided in Liu and Ting
(2017). Cellular automata have been one of the preferred choices for generative music for decades
(Burraston and Edmonds, 2005). Currently, a lot of tools are released openly by Google AI
researchers taking part in the Magenta project (https://magenta.tensorflow.org/), mainly based on
JavaScript and Python.
Novel types of oscillators can be designed by discretizing chaotic nonlinear dynamical systems.
Examples are the Chua-Felderhoff circuit (Borin et al., 2000) and the Lotka-Volterra biological
population model (Fontana, 2012). Putting these into a novel oscillator module may be an interesting
idea. Other computationally efficient oscillators that can produce crazy and unpredictable sounds are
those based on feedback amplitude modulation (Kleimola et al., 2011b; Timoney et al., 2014).
Vector phaseshaping is another interesting abstract form of synthesis (Kleimola et al., 2011a).
Figure 11.1: A proof of concept of a virtual modular Eurorack synthesizer with touchscreen and an
embedded PC. A digital Eurorack module acts as a DC-coupled signal interface (module on the bottom row). Two
Eurorack modules act as input devices: a proximity antenna (top row, right) and a brainwave reader (to be
connected to a Bluetooth EEG headset). Additional potentiometers allow expressive control of the
performance (below the screen).
buy the digitizer. The environment will be grateful. I suggest buying a digitizer that comes mounted
on robust glass to make it live gig-proof. Any sound card can be mounted, of course, but the best
way to integrate this setup with Eurorack modules is to use a DC-coupled interface, able to read
and generate control voltage signals; otherwise you will only be able to exchange audio signals,
not CV signals. Finally, the project can be complemented with Eurorack modules, either mounted
on board or in a separate case, and possibly a set of knobs and sliders that are read by
a microcontroller. Arduino boards and the like are a good option and are easy to
program. These can read the voltage across the potentiometers and send MIDI data accordingly
through a USB cable to the board. This provides additional knobs that will improve the live
performance.
During the installation of the PC you will need an external keyboard and mouse, but once the
touchscreen is up and working you can use the on-screen keyboard provided by the operating
system. Ubuntu provides a great touch experience from version 18.04, but Windows 10 also has
a nice interface (tablet mode).
At the time of writing, the use of Rack with a touchscreen is not officially supported, but it is much
more engaging to use it with a touchscreen than with a mouse. One fundamental option must be
tweaked to allow moving knobs with a touchscreen.
Figure 11.2: The architecture of the proposed hybrid hardware/software modular synthesizer based on VCV
Rack. The optional elements include Eurorack modules and a microcontroller to expand the capabilities of
the system by adding additional control elements (e.g. knobs) that send MIDI messages to the computer via
MIDI-USB link.
In the Rack folder, open settings.json with a text editor and change the allowCursorLock to false:
"allowCursorLock": false
Without this change, you will not be able to move knobs with a touch of your finger.
As the one proposed here is just a proof of concept, I expect enthusiasts to build even better
solutions in the future.
Bibliography
Jean-Marie Adrien, “The missing link: Modal synthesis.” In Giovanni De Poli, Aldo Piccialli, and Curtis Roads,
Eds., Representations of Musical Signals, pp. 269–297. MIT Press, Cambridge, MA, 1991.
Federico Avanzini, Stefania Serafin, and Davide Rocchesso, “Interactive simulation of rigid body interaction with
friction-induced sound generation.” IEEE Transactions on Audio, Speech, and Language Processing 13.6
(November 2005): 1073–1081.
Balázs Bank, Stefano Zambon, and Federico Fontana, “A modal-based real-time piano synthesizer.” IEEE Trans-
actions on Audio, Speech, and Language Processing 18.4 (2010): 809–821.
Stefan Bilbao, Wave and Scattering Methods for Numerical Simulation. John Wiley & Sons, Hoboken, NJ, 2004.
Harald Bode, “A new tool for the exploration of unknown electronic music instrument performances.” Journal of
the Audio Engineering Society 9 (October 1961): 264.
Harald Bode, “History of electronic sound modification.” Journal of the Audio Engineering Society 32.10 (1984):
730–739.
Jonathan Boilard, Philippe Gournay, and Roch Lefebvre, “A literature review of WaveNet: Theory, application,
and optimization.” In Audio Engineering Society Convention 146, Dublin, Ireland, 2019.
Gianpaolo Borin, Giovanni De Poli, and Davide Rocchesso, “Elimination of delay-free loops in discrete-time
models of nonlinear acoustic systems”, IEEE Transactions on Speech and Audio Processing 8.5 (September
2000): 597–605.
E. Brandt, “Hard sync without aliasing.” In Proceedings of International Computer Music Conference (ICMC),
Havana, Cuba, 2001, pp. 365–368.
Øyvind Brandtsegg, Sigurd Saue, and Victor Lazzarini, “Live convolution with time-varying filters.” Applied
Sciences 8.1 (2018).
Dave Burraston and Ernest Edmonds, “Cellular automata in generative electronic music and sonic art: A historical
and technical review.” Digital Creativity 16.3 (2005): 165–185.
Juan José Burred, “A framework for music analysis/resynthesis based on matrix factorization.” In The Proceedings
of the International Computer Music Conference, Athens, Greece, 2014, pp. 1320–1325.
Hal Chamberlin, Musical Applications of Microprocessors. Hayden Book Company, Indianapolis, IN, 1985.
John Chowning, “The synthesis of complex audio spectra by means of frequency modulation.” Journal of the
Audio Engineering Society 21.7 (1973): 526–534.
Perry R. Cook, “Physically informed sonic modeling (PhISM): Synthesis of percussive sounds.” Computer Music
Journal 21.3 (1997): 38–49.
James W. Cooley and John W. Tukey, “An algorithm for the machine calculation of complex Fourier series.”
Mathematics of Computation 19.90 (1965): 297–301.
Stefano D’Angelo, Virtual Analog Modeling of Nonlinear Musical Circuits. PhD thesis, Aalto University, Espoo,
Finland, November 2014.
Stefano D’Angelo, Leonardo Gabrielli, and Luca Turchet, “Fast approximation of the Lambert W Function for
virtual analog modeling.” In Proc. DAFx, Birmingham, UK, 2019.
Jon Dattorro, “Effect design – Part 1: Reverberator and other filters.” Journal of the Audio Engineering Society
45.9 (1997a): 660–684.
Jon Dattorro, “Effect design – Part 2: Delay-line modulation and chorus.” Journal of the Audio Engineering Soci-
ety 45.10 (1997b): 764–788.
Jon Dattorro, “Effect design – Part 3: Oscillators – sinusoidal and pseudonoise.” Journal of the Audio Engineering
Society 50.3 (2002): 115–146.
Giovanni de Sanctis and Augusto Sarti, “Virtual analog modeling in the wave-digital domain.” IEEE Transactions
on Audio, Speech, and Language Processing 18.4 (May 2010): 715–727.
Paolo Donati and Ettore Paccetti, C’erano una volta nove oscillatori: lo studio di Fonologia della RAI di Milano
nello sviluppo della nuova musica in Italia. RAI Teche, Rome, 2002.
Jesse Engel, Cinjon Resnick, Adam Roberts, Sander Dieleman, Mohammad Norouzi, Douglas Eck, and
Karen Simonyan, “Neural audio synthesis of musical notes with WaveNet autoencoders.” In Proceeding of
the 34th International Conference on Machine Learning (ICML17) – Volume 70, Sydney, NSW, Australia,
2017, pp. 1068–1077.
Fabián Esqueda, Henri Pontynen, Julian Parker, and Stefan Bilbao, “Virtual analog model of the Lockhart
wavefolder.” In Proceedings of the Sound and Music Computing Conference (SMC), Espoo, Finland, 2017,
pp. 336–342.
Fabián Esqueda, Vesa Välimäki, and Stefan Bilbao, “Rounding corners with BLAMP.” In Proceedings of the 19th
International Conference on Digital Audio Effects (DAFx-16), Brno, Czech Republic, 5–9 September 2016,
pp. 121–128.
Antoine Falaize-Skrzek and Thomas Hélie, “Simulation of an analog circuit of a wah pedal: A Port-Hamiltonian
approach.” In Proceedings of 135th AES Convention, New York, October 2013.
Agner Fog, Optimization Manuals, Chapter 4: “Instruction tables.” Technical report, 2018, available online at:
www.agner.org/optimize/instruction_tables.pdf (last accessed 4 November 2018).
Federico Fontana, “Interactive sound synthesis by the Lotka-Volterra population model.” In Proceedings of the
19th Colloquium on Music Informatics, Trieste, Italy, 2012.
Gene F. Franklin, J. David Powell, and Abbas Emami-Naeini, Feedback Control of Dynamic Systems, 7th Edition,
Pearson, London, 2015.
Dennis Gabor, “Theory of communication – Part 1: The analysis of information.” The Journal of the Institution of
Electrical Engineers 93.26 (1946): 429–441.
Dennis Gabor, “Acoustical quanta and the theory of hearing.” Nature 159.4044 (1947): 591–594.
L. Gabrielli, M. Giobbi, S. Squartini, and V. Välimäki, “A nonlinear second-order digital oscillator for virtual
acoustic feedback.” In 2014 IEEE International Conference on Acoustics, Speech and Signal Processing
(ICASSP), Florence, 2014, pp. 7485–7489.
Leonardo Gabrielli, Carmine E. Cella, Fabio Vesperini, Diego Droghini, Emanuele Principi, and Stefano Squartini,
“Deep learning for timbre modification and transfer: An evaluation study.” In Audio Engineering Society Con-
vention 144, Milan, Italy, 2018.
Leonardo Gabrielli and Stefano Squartini, “Ibrida: A new DWT-domain sound hybridization tool.” In AES 45th
International Conference, Helsinki, Finland, 2012.
Leonardo Gabrielli, and Stefano Squartini, Wireless Networked Music Performance, Springer, New York, 2016.
David Goldberg, “What every computer scientist should know about floating-point arithmetic.” ACM Computing
Surveys (CSUR) 23.1 (1991): 5–48.
Geoffrey Gormond, Fabián Esqueda, Henri Pöntynen, and Julian D. Parker, “Waveshaping with Norton amplifiers:
Modeling the Serge triple waveshaper.” In International Conference on Digital Audio Effects, Aveiro, Portu-
gal, 2018, pp. 288–295.
Aki Härmä, “Implementation of recursive filters having delay free loops.” In Proc. Intl. Conf. Acoust., Speech, and
Signal Process. (ICASSP 1998), vol. 3, Seattle, WA, May 1998, pp. 1261–1264.
David H. Howard and Jamie A.S. Angus, Acoustics and Psychoacoustics, 5th Edition, Focal Press, Waltham, MA,
2017.
Jari Kleimola, Victor Lazzarini, Joseph Timoney, and Vesa Välimäki, “Vector phaseshaping synthesis.” In The Pro-
ceedings of the 14th Int. Conference on Digital Audio Effects (DAFx-11), Paris, France, September 2011a.
Jari Kleimola, Victor Lazzarini, Vesa Välimäki, and Joseph Timoney, “Feedback amplitude modulation synthesis.”
EURASIP Journal on Advances in Signal Processing (2011b).
Donald E. Knuth, “Structured programming with go to statements.” ACM Computing Surveys (CSUR) 6.4 (1974):
261–301.
Victor Lazzarini, “Supporting an object-oriented approach to unit generator development: The Csound Plugin
Opcode Framework.” Applied Sciences 7.10 (2017).
Claude Lindquist, Adaptive and Digital Signal Processing. Steward & Sons, 1989.
Chien-Hung Liu and Chuan-Kang Ting, “Computational intelligence in music composition: A survey”. IEEE
Transactions on Emerging Topics in Computational Intelligence 1.1 (February 2017): 2–15.
Bernhard Maschke, Arjan van der Schaft, and P.C. Breedveld, “An intrinsic Hamiltonian formulation of network
dynamics: Non-standard Poisson structures and gyrators.” J. Franklin Institute 329.5 (September 1992):
923–966.
Dana C. Massie, “Wavetable sampling synthesis.” In M. Kahrs and K. Brandenburg, Eds., Applications of Digital
Signal Processing to Audio and Acoustics, pp. 311–341. Kluwer, Boston, MA, 1998.
Robert A. Moog, “Voltage-controlled electronics music modules.” Journal of the Audio Engineering Society, 13.3
(July 1965): 200–206.
Noam Mor, Lior Wolf, Adam Polyak, and Yaniv Taigman, “A universal music translation network.” arXiv
preprint arXiv:1805.07848 (2018).
Juhan Nam, Vesa Välimäki, Jonathan S. Abel, and Julius O. Smith, “Efficient antialiasing oscillator algorithms
using low-order fractional delay filters.” IEEE Transactions on Audio Speech and Language Processing 18.4
(2010): 773–785.
K.S. Narendra and P.G. Gallman, “An iterative method for the identification of nonlinear systems using
a Hammerstein model.” IEEE Transactions on Automatic Control 11.7 (1966): 546–550.
Maddalena Novati and John Dack, Eds., The Studio di Fonologia: A Musical Journey. Ricordi,
Milan, 2012.
Harry Nyquist, “Certain topics in telegraph transmission theory.” Transactions of the American Institute of Elec-
trical Engineers 47.2 (1928): 617–644.
Oliver Nelles, Nonlinear System Identification: From Classical Approaches to Neural Networks and Fuzzy Models.
Springer, New York, 2001.
Alan V. Oppenheim, and Ronald W. Schafer, Digital Signal Processing, 3rd Edition. Prentice Hall, Upper Saddle
River, NJ, 2009.
Julian Parker, “Efficient dispersion generation structures for spring reverb emulation.” EURASIP Journal on
Advances in Signal Processing 2011.1 (2011): 646134.
Julian D. Parker, Vadim Zavalishin, and Efflam Le Bivic, “Reducing the aliasing of nonlinear waveshaping using
continuous-time convolution.” In Proceedings of Int. Conf. Digital Audio Effects (DAFx-16), Brno, Czech
Republic, 2016, pp. 137–144.
Jussi Pekonen, Victor Lazzarini, Joseph Timoney, Jari Kleimola, and Vesa Välimäki, “Discrete-time modeling of
the Moog Sawtooth oscillator waveform.” EURASIP Journal on Advances in Signal Processing (2011).
Will Pirkle, Designing Software Synthesizer Plug-Ins in C++: For RackAFX, VST3, and Audio Units. Focal Press,
Waltham, MA, 2014.
Will Pirkle, Designing Audio Effect Plug-Ins in C++, 2nd Edition. Focal Press, Waltham, MA, 2019.
Augusto Sarti and Giovanni De Sanctis, “Systematic methods for the implementation of nonlinear wave digital
structures.” IEEE Transactions on Circuits and Systems I, Regular Papers 56 (2009): 470–472.
Martin Schetzen, The Volterra and Wiener Theories of Nonlinear Systems. John Wiley & Sons, Hoboken, NJ,
1980.
Claude E. Shannon, “Communication in the presence of noise.” In Proceedings of the Institute of Radio Engineers,
1949. Reprinted in: Proceedings of the IEEE 86.2, February 1998, pp. 447–457.
Julius O. Smith III and Perry R. Cook, “The second-order digital waveguide oscillator.” In Proceedings of the
1992 International Computer Music Conference, San Jose, CA, 1992, pp. 150–153.
T. Stilson and Julius O. Smith, “Alias-free digital synthesis of classic analog waveforms.” In Proceedings of Inter-
national Computer Music Conference, Hong Kong, China, 1996, pp. 332–335.
Karlheinz Stockhausen, Texte Zur Musik 1984–1991, Vol. 8 Dienstag aus Licht; Elektronische Musik, Eds.
Christoph von Blumröder. Stockhausen-Verlag, Kürten, 1998.
Joseph Timoney, Jussi Pekonen, Victor Lazzarini, and Vesa Välimäki, “Dynamic signal phase distortion using
coefficient-modulated Allpass filters.” Journal of the Audio Engineering Society 62.9 (September 2014): 596–610.
Vesa Välimäki, “Discrete-time synthesis of the sawtooth waveform with reduced aliasing.” IEEE Signal Process-
ing Letters 12.3 (March 2005): 214–217.
Vesa Välimäki and Antti Huovilainen, “Antialiasing oscillators in subtractive synthesis.” IEEE Signal Processing
Magazine 24.2 (2007): 116–125.
Vesa Välimäki, Juhan Nam, Julius O. Smith, and Jonathan Abel, “Alias-suppressed oscillators based on differenti-
ated polynomial waveforms.” IEEE Transactions on Audio, Speech, and Language Processing 18.4 (2010a):
786–798.
Vesa Välimäki, Julian Parker, and Jonathan S. Abel, “Parametric spring reverberation effect.” Journal of the Audio
Engineering Society 58.7/8 (2010b): 547–562.
Aaron van den Oord, Sander Dieleman, Heiga Zen, Karen Simonyan, Oriol Vinyals, Alex Graves,
Nal Kalchbrenner, Andrew Senior, and Koray Kavukcuoglu, Wavenet: A Generative Model for Raw Audio.
White paper, 2016, available online at: https://arxiv.org/pdf/1609.03499.pdf (last accessed 30 October 2019).
Martin Vicanek, A New Recursive Quadrature Oscillator. White paper, 21 October 2015, available online at:
www.vicanek.de/articles/QuadOsc.pdf (last accessed 30 March 2019).
N. Wiener, “Response of a nonlinear system to noise.” Technical report, Radiation Lab MIT 1942, restricted.
Report V-16, no. 129 (112 pp.). Declassified July 1946. Published as rep. no. PB-1-58087, U.S. Department
of Commerce, 1942.
Adrian Wills, Thomas B. Schön, Lennart Ljung, and Brett Ninness, “Identification of Hammerstein–Wiener
models.” Automatica 49.1 (January 2013): 70–81.
Duane K. Wise, “The modified Chamberlin and Zölzer filter structures.” In Proceedings of the 9th International
Digital Audio Effects Conference (DAFX06), Montreal, Canada, 2006, pp. 53–56.
Stefano Zambon, Leonardo Gabrielli, and Balazs Bank, “Expressive physical modeling of keyboard instruments:
From theory to implementation.” In Audio Engineering Society Convention 134, Rome, Italy, 2013.
Vadim Zavalishin, The Art of VA Filter Design, 2018, available online at: www.native-instruments.com/fileadmin/
ni_media/downloads/pdf/VAFilterDesign_2.1.0.pdf (last accessed 28 May 2018).
Vadim Zavalishin and Julian D. Parker, “Efficient emulation of tapelike delay modulation behavior.” In Proceedings
of 21st Int. Conf. Digital Audio Effects (DAFx-18), Aveiro, Portugal, 2018, pp. 3–10.
Udo Zölzer, DAFX: Digital Audio Effects, 2nd Edition, John Wiley & Sons, Hoboken, NJ, 2011.
Index