Adam Taylor
Dan Binnun
Saket Srivastava
Library of Congress Cataloging-in-Publication Data
A catalog record for this book is available from the U.S. Library of Congress.
ISBN-13: 978-1-63081-683-4
All rights reserved. Printed and bound in the United States of America. No part
of this book may be reproduced or utilized in any form or by any means, elec-
tronic or mechanical, including photocopying, recording, or by any information
storage and retrieval system, without permission in writing from the publisher.
All terms mentioned in this book that are known to be trademarks or service
marks have been appropriately capitalized. Artech House cannot attest to the
accuracy of this information. Use of a term in this book should not be regarded
as affecting the validity of any trademark or service mark.
10 9 8 7 6 5 4 3 2 1
Contents
Acknowledgments xi
Introduction xv
CHAPTER 1
Programmatic and System-Level Considerations 1
1.1 Introduction: SensorsThink 1
1.2 Product Development Stages 2
1.3 Product Development Stages: Tailoring 5
1.4 Product Development: After Launch 6
1.5 Requirements 7
1.5.1 The V-Model 10
1.5.2 SensorsThink: Product Requirements 12
1.5.3 Creating Useful Requirements 13
1.5.4 Requirements: Finishing Up 16
1.6 Architectural Design 16
1.6.1 SEBoK: An Invaluable Resource 17
1.6.2 SensorsThink: Back to the Journey 20
1.6.3 Systems Engineering: An Overview 21
1.6.4 Architecting the System, Logically 23
1.6.5 Keep Things in Context (Diagrams) 24
1.6.6 Monitor Your Activity (Diagrams) 26
1.6.7 Know the Proper Sequence (Diagrams) 28
1.6.8 Architecting the System Physically 28
1.6.9 Physical Architecture: Playing with Blocks 29
1.6.10 Trace Your Steps 31
1.6.11 System Verification and Validation: Check Your Work 32
1.7 Engineering Budgets 33
1.7.1 Types of Budgets 33
1.7.2 Engineering Budgets: Some Examples 34
1.7.3 Engineering Budgets: Finishing Up 38
1.8 Interface Control Documents 38
1.8.1 Sticking Together: Signal Grouping 39
1.8.2 Playing with Legos: Connectorization 41
1.8.3 Talking Among Yourselves: Internal ICDs 42
1.9 Verification 42
1.9.1 Verifying Hardware 43
1.9.2 How Much Testing Is Enough? 43
1.9.3 Safely Navigating the World of Testing 44
1.9.4 A Deeper Dive into Derivation (of Test Cases) 45
1.10 Engineering Governance 47
1.10.1 Not Just Support for Design Reviews 48
1.10.2 Engineering Rule Sets 48
1.10.3 Compliance 49
1.10.4 Review Meetings 49
References 50
CHAPTER 2
Hardware Design Considerations 53
2.1 Component Selection 53
2.1.1 Key Component Identification for the SoC Platform 53
2.1.2 Key Component Selection Example: The SoC 57
2.1.3 Key Component Selection Example: Infrared Sensor 63
2.1.4 Key Component Selection: Finishing Up 65
2.2 Hardware Architecture 66
2.2.1 Hardware Architecture for the SoC Platform 66
2.2.2 Hardware Architecture: Interfaces 70
2.2.3 Hardware Architecture: Data Flows 72
2.2.4 Hardware Architecture: Finishing Up 72
2.3 Designing the System 72
2.3.1 What to Worry About 73
2.3.2 Power Supply Analysis, Architecture, and Simulation 73
2.3.3 Processor and FPGA Pinout Assignments 74
2.3.4 System Clocking Requirements 75
2.3.5 System Reset Requirements 75
2.3.6 System Programming Scheme 75
2.3.7 Summary 76
2.3.8 Example: Zynq Power Sequence Requirements 76
2.4 Decoupling Your Components 78
2.4.1 Decoupling: By the Book 78
2.4.2 To Understand the Component, You Must Be the Component 78
2.4.3 Types of Decoupling 79
2.4.4 Example: Zynq-7000 Decoupling 80
2.4.5 Additional Thoughts: Specialized Decoupling 81
2.4.6 Additional Thoughts: Simulation 81
2.5 Connect with Your System 82
2.5.1 Contemplating Connectors 82
2.5.2 Example: System Communications 83
2.6 Extend the Life of the System: De-Rate 87
2.6.1 Why De-Rate? 87
2.6.2 What Can Be De-Rated? 87
CHAPTER 3
FPGA Design Considerations 123
3.1 Introduction 123
3.2 FPGA Development Process 124
3.2.1 Introduction to the Target Device 126
3.2.2 FPGA Requirements 127
3.2.3 FPGA Architecture 129
3.3 Accelerating Design Using IP Libraries 130
3.4 Pin Planning and Constraints 131
3.4.1 Physical Constraints 134
3.4.2 Timing Constraints 137
3.4.3 Timing Exceptions 138
3.4.4 Physical Constraints: Placement 139
3.5 Clock Domain Crossing 140
3.6 Test Bench and Verification 143
3.6.1 What Is Verification? 143
3.6.2 Self-Checking Test Benches 144
3.6.3 Corner Cases, Boundary Conditions, and Stress Testing 145
3.6.4 Code Coverage 146
3.6.5 Test Functions and Procedures 146
3.6.6 Behavioral Models 147
3.6.7 Using Text IO Files 147
CHAPTER 4
When Reliability Counts 199
4.1 Introduction to Reliability 199
4.2 Mathematical Interpretation of System Reliability 201
4.2.1 The Bathtub Curve 202
4.2.2 Failure Rate (λ) 202
4.2.3 Early Life Failure Rate 204
4.2.4 Key Terms 205
4.2.5 Repairable and Nonrepairable Systems 206
4.2.6 MTTF, MTBF, and MTTR 206
4.2.7 Maintainability 208
4.2.8 Availability 209
4.3 Calculating System Reliability 210
4.3.1 Scenario 1: All Critical Components Connected in Series 212
4.3.2 Scenario 2: All Critical Components Connected in Parallel 215
4.3.3 Scenario 3: All Critical Components Are Connected in Series-Parallel Configuration 216
4.4 Faults, Errors, and Failure 216
4.4.1 Classification of Faults 218
4.4.2 Fault Prevention Versus Fault Tolerance: Which One Can Address System Failure Better? 219
4.5 Fault Tolerance Techniques 221
4.5.1 Redundancy Technique for Hardware Fault Tolerance 223
4.5.2 Software Fault Tolerance 224
4.6 Worst-Case Circuit Analysis 227
4.6.1 Sources of Variation 230
4.6.2 Numerical Analysis Using SPICE Modeling 233
Selected Bibliography 240
About the Authors 241
Index 243
Acknowledgments
I have been truly lucky in my career to be able to work with some amazing engi-
neers and companies on a range of exciting projects, many of which are literally
out of this world. In recent years, I have been extremely fortunate to be able to
run my own engineering consultancy focusing on embedded system development.
The opportunity to run my own business came from a blog called the
MicroZed Chronicles (www.microzedchronicles), which I wrote over a period of 8 years. Now I write weekly
blogs on embedded system development.
However, I have often wanted to go a little further than allowed in the blog or
series of blogs and really walk new and younger engineers through the life cycle
of an engineering project from concept to delivery. Hopefully, this book will act as
a guide and aide to those who are starting out in what I consider to be one of the
most rewarding careers there is. Unlike many engineering texts, my coauthors and
I have set out to tell the story of an embedded development project: a board that we
have designed, manufactured, and commissioned in parallel with the writing of the book.
All design information will be made available, along with additional technical
content, on the book’s website, looking further into deep technical areas than sev-
eral books possibly could.
The creation of a book is a challenge and there are several people to thank,
most especially my coauthors Dan and Saket. It has been truly a pleasure to work
on the creation of this book with such talented engineers.
I would like to thank Steve Leibson, Mike Santarini, and Max Maxfield, who
first published my articles; Rebecca To and the wider Xilinx community, who have
been so supportive over the years as I work with their products to create blogs,
projects, and real applications for clients; and the Avnet team of Kristine Hatfield,
Bryan Fletcher, Kevin Keryk, Fred Kellerman, Dan Rozwood, and the wider Avnet/
Hackster and Element 14 communities.
No man is an island, and I must thank my wife, Jo, and my son, Dan, for understanding
the long hours that I spend both working for clients and creating content.
—Adam Taylor
him was what was going to be the flavor of this book, as it was always my desire
to write something that would not only help readers solve real-life challenges but
also be easy to understand and follow. When we had our preliminary discussions,
it was exciting to see that all of us were aiming for the same goals, which
really motivated me to join the team. That was the easy part; writing a book from
scratch and, most importantly, making sure that it is technically sound yet still
easy to read and follow was the most challenging part. During the journey
of writing this book, we all learned a lot from each other. Having never worked
in industry myself, it was enlightening to see the point of view of
my coauthors, both of whom have extensive knowledge of what the work of embedded design
engineers in industry is really like.
I would like to thank my wife, parents, and other family members, who have
been encouraging me throughout my life to achieve higher academic goals and, for
as long as I can remember, to write a book. I especially thank my wife, Nupur,
who helped me keep my focus even during the trying times of the Covid-19 pandemic,
when major portions of this book were written, and who did not let me lose track
when I was running out of new ideas.
I again thank my wonderful coauthors for the patience that they showed
throughout this journey, even when I was missing deadlines, and Artech House,
for keeping faith in this project despite the delays. Without the understanding and
commitment of the entire team, this book project might not have been successful.
—Saket Srivastava
Introduction
Before we can design, we must have a clear and frozen technical baseline, which
allows for progressive assurance and risk retirement.
Another day begins bright and early at SensorsThink, a product design company
that specializes in sensing technology solutions for all types and sizes of markets.
A typical day for an engineering team member here is similar to a typical day for
most engineering team members: chock-full of engineering change orders (ECOs),
project status meetings, and other interesting, but mostly turn-the-crank engineer-
ing work. Today will be different and more exciting. There is a buzz in the office: a
new product platform has been talked about for months, and the day may finally
have arrived when management and marketing are ready to unveil the idea that
will become the next great SensorsThink product. After months of research and
speaking with potential customers, they have concluded that one product platform
can be developed to suit the needs of several potentially large and growing markets:
In situations like this, there will often be a project kick-off meeting to generate
some momentum for the work. This may take the form of a company-wide meeting
or an email to the project team, or engineers may be pulled aside by the project team
leader and given an overview. So if you were to find yourself as part of a new prod-
uct development team, perhaps pulled into a meeting with representatives from
management and marketing, what would come next? How would you approach
the problem of creating a product from what may be as vague as a bullet-point list
of must-have features?
This is a situation in which almost every design engineer will be put at some
point in their careers, and it can seem daunting, overwhelming, and intimidating.
By the time you have read through this book, the task at hand will seem more
manageable and you will understand how to take that list of features and turn them
into aspects of a well-engineered and well-designed product.
In order to set the stage for further understanding of what we consider best practices
for doing product development work, it is vitally important that you understand
the stages (or phases) of product design and development. It is equally important to
note that there is a clear distinction between methods used in product design and
development (Agile, for example, is an extremely popular method in the software
engineering world) and the generalized stages that a development life cycle has.
The focus here is not on the methods applied, but rather the stages that should be
worked through.
For the purpose of this book, a classic stage-gate style approach will be dis-
cussed. However, before exploring that approach, it should be noted that there are
many other approaches that are used in industry. These approaches are regularly
tailored on a project-by-project basis for any number of reasons: project complex-
ity, project timeline, product volumes, and cost target, for example. Any of these
factors can change the approach or tailor the approach that a company uses.
Another key concept to understand is that the length of the entire process, or
the length of a stage within the overall process, can vary wildly from project to
project. Often, this is directly proportional to product complexity and anticipated
product volumes.
A critical takeaway to have is that while the process may be documented in a
certain way by your company, it may not always be followed to the letter of the
law. Adaptability is a critical characteristic to have in the modern development
world with aggressive timelines and aggressive requirements often seeming to be in
direct opposition with each other. Remember the spirit of the process, but try not
to be paralyzed from making progress by feeling restricted by it.
A classical stage-gate type product development typically consists of several
stages of work, followed by checkpoints in the process referred to as gates. It is
at these checkpoints where stakeholders in a project determine whether the project
and/or product are still viable both from a business and technical perspective. One
of several items that is evaluated at these gates is risk. Risk can be from any number
of areas: financial risk, market risk, technical risk, and staffing risk, to name a few.
In general, the best-performing projects tend to reduce risk early in this stage-gate
process so that as you near the product launch date, the remaining risk is minimal,
and no major risk items are left for the final hours. This book is not intended to
serve as a complete reference for a stage-gate process; there are many excellent re-
sources available both online and in reference texts for learning more about these
types of processes and how they can be tailored to a specific company, product, or
project.
For the purpose of this development journey, we will assume that SensorsThink
has chosen to follow a 5-stage development path, shown as a simple flow diagram
in Figure 1.1 [1].
A brief overview of each stage in the process is given here to provide further
understanding and context.
•• Stage 1: During this stage of the process, the main goals are essentially to
evaluate the product and project from all angles. What size is the market?
What is your competition doing? What competition already exists? What
will differentiate your product from those of your competitors? Does your
company possess the skills, experience, and know-how to deliver?
•• Stage 2: Building a business case is exactly as it sounds; this is where the
company must determine the following:
• What is the product? What precisely is going to be offered and what
technical capabilities will it have? What level of reliability is required by
the consumers of the product?
• What legal and regulatory requirements will be imposed upon the new
product prior to it entering the marketplace? In safety-critical applica-
tions, the level of rigor and documentation can be quite daunting and is
very resource-intensive. Knowing this well ahead of time can go a long
way towards having a viable case for developing and launching that new
product.
• What risks are present in the product? This includes business risks, fi-
nancial risks, and technical risks.
• What is the plan for the project? This can vary in level of detail greatly,
but most often will include key milestones, required staffing, and re-
quired financial resources (including any capital expenditures for new
equipment or for expensive time at qualification labs at the end of the
project).
• Is this feasible? After spending a great deal of time and energy in analyz-
ing the project and product, the company then decides if it is worth do-
ing. If they feel it is worth doing, stage 3 will occur; this is typically when
most design engineers will be involved in the process.
•• Stage 3: This is the stage in the product development process with which
most engineers are familiar. This is the execution of the (hopefully) detailed
and rigorous analysis performed in the prior stages of work. This can include
deployment of prototypes to customers during the process to get valuable
customer feedback on the product as it is being developed. Ultimately, the
product development team is tasked with delivering a prototype product
ready for the next stage of the process, testing and validation.
•• Stage 4: Testing and validation take on many forms, depending on the department in which you work.
• As a design engineer doing hardware, you will likely be rigorously test-
ing your hardware against requirements and specifications to ensure
performance.
• As a software developer, you will be checking your code for cover-
age, perhaps compliance with coding standards, and regression testing
releases.
• In a regulatory and compliance role, you will be testing the product
against worldwide standards for things such as electromagnetic inter-
ference/electromagnetic compatibility (EMI/EMC) and general product
safety. Depending on what type of product you have designed and the
markets in which you will be selling your product, there are different
harmonized standards and directives against which your product will be
evaluated.
• There is also field testing, perhaps redeploying updated or further refined
products to your early adopters from stage 3.
• Lastly, there is market testing, where a business may tease the product
and gauge interest prior to moving to the final stage of development,
product launch.
•• Stage 5: This stage is the culmination of the prior 4 stages of work (there are
actually a few additional stages sometimes) to which the company has devoted
countless resources. This is a very exciting time for all involved in the
development process, from marketing and engineering to sales, field support,
and service. It is also a very intense time, with everyone hoping that the
product will be successful.
Before we proceed further into our journey, let’s touch on some of the other aspects
of the development process. As mentioned previously, this is not an in-depth guide
to the product development process, just an overview. However, some additional
discussion on the topic is useful to provide a broader understanding for anyone not
familiar with a stage-gate process. The other aspects that are perhaps most impor-
tant to touch on are:
A full 5-stage development process is not always necessary, even when consid-
ering the intent of always doing best-practice engineering development. This can be
true when a product development is not starting from ground zero. For example, consider
the case of a value engineering (also known as cost reduction) or a customizable
product family that may need a minor modification to complete a sale, sometimes
referred to as a design-in-win. These are very different cases from the new idea to
new product development cycle, and as such, the process can be tailored according-
ly. This does not mean that steps should be skipped randomly and caution thrown
to the wind. Generally speaking, the same deliverables are going to be required,
but stages can be combined. We will examine these two cases in a bit more detail.
For the case of a value engineering project, it might be prudent to apply a
slightly condensed version of this process; refer to Figure 1.2.
In this example, assume that the changes being made are to the enclosure paint
and hardware of an existing product. These changes may have an
impact on performance in terms of regulatory or safety issues, but hopefully no
impact to functionality. After scoping the work, the business chooses to go directly
to development and test in one step, knowing that they intend to replace the cur-
rent product offering with the new and lower-cost product. Once the engineering
and product management teams are satisfied with the test results of the changed
product (having skipped the gate to proceed to testing), the product simply “cuts
in” to production (meaning that, as old inventory is depleted, the new, lower-cost
model replaces it).
For the case of a design-in-win scenario, it might be prudent to condense the
process even further; refer to Figure 1.3.
In this example, assume that an existing product needs only to have the paint
color changed and a prospective customer will purchase 1,000 units. In a case like
this, complete engineering rigor cannot be justified. After
a quick scoping of the necessary changes and product management impact, this
product can be launched as a new orderable part number for the customer in a
rapid fashion. The same deliverables can be required of the team, but most can be
recycled from the original product.
Being able to adapt to these conditions and differences in product development
undertakings as a design engineer is very important in today’s landscape. A truly
brand-new product development is far less common than a derivative product;
think, for example, of mobile phones or televisions or even cars and other vehicles.
Perhaps there are a few new features, but upon close examination, the similarities
almost always outweigh the differences.
Lastly, it is important to understand that this process of development must be
fed ideas; this idea buffet can be filled from many different places: fundamental
research (true research and development (R&D)), internal idea capture from em-
ployees, and voice of customer (which can be anything from direct interaction with
potential customers to focus groups and even collaborative design efforts with the
customer).
In this development adventure on which we are about to embark, we are as-
suming that this is a true ground-up development project for a new platform. We
are also under the assumption that marketing and management have done their
homework properly, and this idea has been vetted in terms of the marketplace and
in terms of the available skills the company has to deliver upon it. This places the
bulk of the content of our journey in what was previously described as stage 3 (de-
velopment) and stage 4 (testing and validation).
As a final note, once a product is released, it enters a new life cycle of sorts, often
divided into four distinct stages: introduction, growth, maturity, and decline.
1.5 Requirements
At this point in our journey, SensorsThink has announced their intention to develop
and launch a new product platform; they have done their homework through the
first two stages of a typical 5-stage development process. Marketing and manage-
ment sit the broader engineering team down in a room (yourself included) and
present a set of requirements to help the team start to better comprehend the task
at hand (see Figure 1.4).
Figure 1.4 is a Systems Modeling Language (SysML) requirements diagram,
meant to be a visual representation of the high-level requirements of the new prod-
uct platform; it does not contain textual information when presented this way,
but that textual information can be viewed in a number of ways depending on the
systems engineering toolset that is in use. For SensorsThink, Enterprise Architect
14 (created by Sparx Systems) [2] is the in-house tool of choice. The diagram pre-
sented in Figure 1.4 is an abstracted view of the full set of textual requirements
through which we will be working during our development process. (As an aside,
if marketing and management ever present something that looks like this, a cel-
ebration is in order. Starting out with requirements entered in a modeling tool is a
fantastic starting point and often is not the norm.)
A modeling tool that supports SysML is extremely useful during product devel-
opment. It makes the application of many modern engineering best practices much
easier and, in some cases, may be the only way to implement them in a satisfactory
way for a critical or safety relevant system. Some of these best practices supported
by SysML tools such as Enterprise Architect include:
•• The ability to trace multiple SysML element types to and from each other;
requirements to other requirements, requirements to test cases, requirements
to use cases, and requirements to design elements.
•• The ability to monitor the status of a requirement (or any other element) as
reviewed or approved with built-in attributes for modeled elements. This can
be vitally important in a safety-critical system environment. Even better is
the ability to make these things visible in diagrams.
•• Traceability through the use of relationship matrices to view existing traceability
between element types and identify gaps in traceability. Refer to Figure 1.5.
•• Checking for unrealized requirements; as an example, consider a large system
with thousands of requirements. Manually checking that each requirement
has been realized through a use case would be extremely cumbersome. Model
validation in Enterprise Architect makes this task much more manageable.
These are just a few examples of the kinds of tasks that must be considered and
properly managed throughout the development process and through a product’s
life cycle.
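To make the traceability and model-validation checks described above concrete, here is a minimal, tool-agnostic sketch in Python. It is our illustration only, not Enterprise Architect's API or the SensorsThink model; the requirement and use-case names are hypothetical. Requirements and the use cases that realize them are held in plain dictionaries, and the script reports any requirement with no realizing use case, along with a crude relationship-matrix view.

# Illustrative only: a tiny, tool-agnostic traceability check. The requirement
# and use-case names below are hypothetical, not taken from the SensorsThink model.

# Each requirement ID maps to a short description.
requirements = {
    "REQ-001": "Measure ambient temperature across the required range",
    "REQ-002": "Provide an Ethernet communications interface",
    "REQ-003": "Time-stamp all sensor samples",
}

# Each use case lists the requirement IDs it realizes (the trace links).
use_cases = {
    "UC-01 Acquire temperature": ["REQ-001", "REQ-003"],
    "UC-02 Publish data over Ethernet": ["REQ-002"],
}

def unrealized_requirements(reqs, ucs):
    """Return requirement IDs that no use case traces to."""
    traced = {req_id for linked in ucs.values() for req_id in linked}
    return sorted(set(reqs) - traced)

def relationship_matrix(reqs, ucs):
    """Yield (use case, requirement, traced?) tuples, like a simple matrix view."""
    for uc_name, linked in ucs.items():
        for req_id in reqs:
            yield uc_name, req_id, req_id in linked

if __name__ == "__main__":
    gaps = unrealized_requirements(requirements, use_cases)
    print("Unrealized requirements:", gaps or "none")
    for uc, req, traced in relationship_matrix(requirements, use_cases):
        print(f"{uc:35s} {req}  {'X' if traced else '.'}")

In a real project, the element data would of course come from the modeling tool's own export, reporting, or automation interface rather than from hand-written dictionaries.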
It is important to note that what is shown here is not intended to represent a
full set of product requirements; if you are familiar with the requirements develop-
ment process, you probably already know that a product of even moderate com-
plexity can have thousands of requirements. A full elicitation and decomposition of
requirements is not the focus of this journey. The requirements that are shown here
will be addressed throughout the later parts of this text, so that the full process of
requirement creation to requirement validation can be shown.
•• The model is too simple to reflect the realities of software development and
is more suited towards management than developers or users.
•• It is inflexible and has no inherent capability to respond well to changes.
•• The approach often forces testing to be fit into a small window at the end of
the development cycle as earlier stages take longer than anticipated, but the
launch date remains fixed [4].
That said, there is certainly merit to the use of the V-model. It has been widely
adopted by both the German government and U.S. government. The model pro-
vides assistance on how to implement an activity and the steps to take during
that implementation. It also provides instructions and recommendations for those
activities. Starting with the V-model (or any established model) is almost always
better than trying to reinvent the wheel. A framework like this can be tailored and
adapted to a company’s culture or a specific project’s needs quite easily, and most
mature models allow for such tailoring, providing guidance on how to approach
that task.
Taking a cursory look at the V-model, it is evident that there are both different
sides and different levels, with links traversing the model. It is important to note
that from a purely theoretical perspective, as you move down the left side, across
the bottom, and up the right side of the model, you are always moving forward in
time. This makes the V-model an extension of sorts of waterfall product develop-
ment and project management.
a look at the textual requirements themselves, rather than the pure diagrammatical
view presented earlier. Since this is a sensing platform, let’s look closely at the re-
quired sensing technologies and associated requirements as identified by marketing
and management.
If we refer to the higher-level elements of the diagrams from Enterprise Archi-
tect 14, we can see that the basic sensing technology requirements are:
These are the key functional elements for the new sensing platform that Sen-
sorsThink will be attempting to integrate into a single product platform. It is not
uncommon for the starting point of a project to be nothing more than a bullet-
point list such as the one above, but in our case, we are in much better shape.
Our management and marketing teams have gone above and beyond, putting in
some details around the requirements for us to help engineering better understand
the system and engineer the right product to meet the market demands. Required
ranges have been provided for all sensor technologies:
We also see a nice-to-have feature request, which tells engineering that, if pos-
sible (after proper analysis and architecture), multiple sensors should be colocated
or all sensors should be on a single sensing head. This is a preferred (but not re-
quired) solution for the platform. Note that this is not a requirement element, but
rather an informational statement.
This type of definition is a wonderful starting point; if we refer back to the V-
model, we can see how this can easily be decomposed into domain requirements
for hardware and software capabilities and even lends itself nicely to the creation
of accurate level 10 validation testing.
refinement, before they are actionable by engineering (or other disciplines, for that
matter) [5].
The International Council on Systems Engineering (INCOSE) is an organiza-
tion founded to develop and propagate the principles and practices that enable
the creation of successful systems. The organization was formed in 1990 and has
become the de facto standard for many aspects of systems engineering, includ-
ing requirements writing. The INCOSE Guide for Writing Requirements [6] is
a document that is often cited in company policy for how requirements are to be
constructed. The document is comprehensive and will be heavily cited here as it is
an excellent reference for all those tasked with creating, reviewing, designing to, or
testing against requirements.
The document is specifically about how a requirement is properly expressed in
a textual format, in other words: how to properly write a requirement. It is not a
guide on how one might go about gathering requirements or turning those require-
ments into SysML diagrams. It is important to understand that, while often done
by the same people, the tasks are quite different.
The general approach of the guide is to present characteristics and practical
rules for crafting requirements. The characteristics express why a rule makes sense,
and the rules express how to write the requirements. This helps to provide readers
with context knowing that rules must often be adapted to fit or suit a certain need.
For the purpose of this discussion, the focus will be on the characteristics of a well
written requirement statement:
(When reading the following list, start each item with “a well-written require-
ment is…”)
Once the requirements are agreed to at the customer level, several key engineering
activities are ready to begin:
1. System-level requirements;
2. System-level architecture and design;
3. System validation plan.
System-level requirements are the next step down the decomposition ladder
in requirements elicitation; for the SensorsThink product development project ex-
ample, the given requirements are more of a hybrid level 10/level 20 requirements
set. Remember that the focus here is the framework for approaching product
design and development for embedded systems, not a fully and completely elicited
requirements set.
Done concurrently with system requirements, system architecture is something
that is absolutely essential to delivering a well-engineered product and solution that
meets the needs of the customer and is neither overengineered nor underengineered.
The right product is always the holy grail of development: the right features, the
right performance, and the right price.
In our journey, we will not be taking a deep dive into modern systems engi-
neering methodology, since it is tangential to our core focus. However, a discussion
on the topic is certainly valuable and warranted to provide proper context and
understanding.
These two purposes are certainly relevant for us and the engineering staff at
SensorsThink. Without a solid foundation in place that comes from proper systems
engineering, the product being developed is not nearly as likely to be the right prod-
uct, with the right features, performance, and price.
For technology development, there is a wonderful excerpt from the SEBoK
website that should be kept in mind as we architect our new sensor platform [13]:
used to create the overall road map for the new product offering. In these cases,
new product ideas impose requirements on new technology developments.
On the other hand, when technology developments or breakthroughs drive
product innovation or the generation of new markets, the technology develop-
ments may also generate requirements on product features and functionalities.
technical concept at hand and the market (and markets in our case) being targeted,
that is essential to a successful product development and launch.
1. System requirements;
2. System architecture:
(a) Logical architecture model development;
(b) Physical architecture model development.
Let’s proceed with the assumption that our set of approximately 30 require-
ments is the full set for our new product, even though we know better. The next
step would be to architect the system.
by customers, but also by other stakeholders in the process: test and quality assur-
ance, production, and end-of-life procedures. Recall the four stages of a product’s
life from Section 1.2: introduction, growth, maturity, and decline. A well-executed
architecture will consider all of these stages in addition to the functional require-
ments of the product in the marketplace.
It is very important that the intent of logical architecture is well understood.
Logical architecture describes how a product works. This can be applied at a higher
level of abstraction (consider the use case of a product being decommissioned) as
well as lower levels of abstraction (how a certain piece of functionality should
work); it is vital to separate this conceptual architecture from the physical way in
which it will be implemented [16]. These logical architecture diagrams typically
make use of three types of models:
•• Functional: A set of functions and their subfunctions that defines the work
done by the system to meet the system requirements.
•• Behavioral: Arranging functions and interfaces (inputs and outputs) to de-
fine the execution sequencing, control flow, data flow, and performance level
necessary to satisfy the system requirements.
•• Temporal: Classifying the functions of a system based upon how often they
execute. This includes, for example, the definition of what parts of a function
are synchronous or asynchronous.
Note that all of these end users will be interacting with the system controller
through the same communications interface (or interfaces, in our case). This helps
to put definition to those interfaces and to things such as logging in to the system,
access levels, and security needs for the interface.
We also see that our environment or item of interest is shown in this context di-
agram. This is what the sensor platform is monitoring and taking measurements of.
It is important to take note that this diagram is not prescriptive; it does not de-
fine anything about a particular interface or feature of the system. It is not defining
what type or volume of information will flow over various interfaces. It simply
provides the context in which the system will be operating for this particular use case.
Diagrams like this are extremely useful in defining the behavior of a system at
varying levels of depth; the shutdown routine of a safety-critical system may war-
rant many layers of complex diagrams, contrasted against the shutdown routine of
a coffee maker, which may only require one or two actions be shown (it may be as
simple as the user turning the power button off).
is to create models and views of a physical solution that accommodates the logical
architecture model (in other words, accounts for all functional, behavioral, and
temporal models and diagrams) and satisfies system requirements. Physical ele-
ments (hardware, software, mechanical parts) have to be identified that can support
these models and requirements.
One of the key drivers for the physical architecture model may be interface
standards; they may be one of the key drivers for the overall system. However, it
is also sometimes the case that interfaces are chosen later on in the model develop-
ment process. In our system, we have both of these cases present: the communica-
tions interfaces that were stated as system requirements (Ethernet and Wi-Fi or
Bluetooth) and the sensor communications interfaces that were not specified in any
meaningful way.
Because the system will likely be communicating over established networks to
the outside world, it makes sense that these protocols are defined and required at a
very high level. On the other end of the system, there are several internal interfaces
that are only exposed to the Smart Sensor Controller. These sensor communica-
tions interfaces can be of any type as long as the selected sensor meets its sensing
performance requirements; the real driver for the sensors is those performance
requirements, not the way in which data is transmitted between the sensor node and
the controller.
Note that most of the physical blocks here are represented (at first glance) by
hardware only, but a closer look would reveal that there is a large volume of firm-
ware and software work that must occur for the system to operate properly and reliably.
There must be accurate date and time keeping for time-stamping data; data
integrity must be guaranteed; and sensor fusion, image processing (for our infrared
sensor), data storage, and data transmission over multiple interfaces (perhaps up to
three) must all be handled. When analyzing the system and what needs to be at the
heart of it, there could be several implementations:
1. Discrete microprocessor;
2. Discrete microprocessor and coprocessor/field programmable gate array
(FPGA);
3. System on Chip (SoC).
The goal of any engineering project is to deliver the final product on time, to quality,
and on budget. Naturally, we consider the budget element to indicate the financial
budget allocated for the development, which it primarily is. However, successful
project delivery will require the achievement of many engineering budgets across
several subsystems of the design.
Engineering budgets are a response to requirements, contained in the system
requirement specification. The engineering budget allocates portions of the overall
target defined by a system requirement to subsystems in the design, providing clear
guidance and targets to the designers.
It is natural when doing this allocation to retain a portion of the overall budget
as contingency, in case one subsystem is unable to meet its budget. Retaining the
contingency at the highest level prevents each subsystem from holding its own
contingency, which would make its allocation harder to achieve and therefore increase
the technical risk, potentially impacting the delivery timescales and the cost of
trying to achieve the budget.
•• Unit price cost: A financial target that defines the manufacturing cost of the
assembled product. This includes not only the cost of the bill of materials but
also the cost of assembly and manufacture.
•• Mass budget: A physical target that defines the acceptable mass of the finished
system. Achieving this budget is critical for mass-constrained applications
(e.g., aerospace, space, and automotive). Mass budgets will allocate budgets
to elements of the design such as the enclosure, circuit cards, power supplies,
and actuators.
•• Power budget: A physical target that defines the maximum power that may be
drawn by the application. A power budget may define different operating
conditions and the allowable power for each of those conditions. Examples of
this include low-power and reduced-power operating requirements. Power
budgets will also include allocations for direct current (dc)/dc conversion.
•• Memory budgets: A storage target that defines the nonvolatile and volatile
memory required for the overall application.
•• Performance budgets: This will vary from project to project depending upon
the application; however, the performance budget will allocate the overall
performance requirement across the subsystems. This enables challenging
subsystems to have a larger proportion of the budget than less challenging ones.
For example, when measuring real-world parameters using a sensor, the analog
element of the design, which is more affected by temperature, tolerance, and
drift, will be allocated more of the budget.
•• Thermal budget: On many applications, the thermal dissipation can be challenging;
examples include wearable products and space and aerospace applications,
where thermal management is difficult or where excess heat could result in harm.
The thermal budget is therefore closely linked with the power budget and will
address not only the external interface temperatures but also the allowable
dissipation of the subsystems and even of key components within the design that
drive the thermal solution.
•• Enclosure and fixings: The enclosure of the system, which protects it from its
environment and provides thermal dissipation.
•• The ac/dc converter: A power converter that converts the input voltage to the
secondary voltage required by the system.
•• Circuit card assembly: The circuit cards that contain the electronic compo-
nents, which implement the required functionality of the system.
•• Contingency: 500g;
•• Enclosure and fixings: 3,500g;
•• The dc/dc converter: 500g;
•• Circuit card assembly: 500g.
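As a quick illustration of how such an allocation might be tracked, here is a small Python sketch (ours, not from the book's design files) that sums the mass allocations listed above, keeps the 500g contingency as a system-level line item as discussed earlier, and checks hypothetical reported subsystem masses against their allocations. The 5,000g system target is simply the sum of these allocations and is assumed for illustration.

# Illustrative mass-budget tracker using the allocations listed above.
# The reported masses are made-up numbers for the sake of the example.

allocations_g = {
    "Contingency (held at system level)": 500,
    "Enclosure and fixings": 3500,
    "dc/dc converter": 500,
    "Circuit card assembly": 500,
}

# Current best estimates from each subsystem design (hypothetical values).
reported_g = {
    "Enclosure and fixings": 3420,
    "dc/dc converter": 540,   # over its allocation: must draw on contingency
    "Circuit card assembly": 480,
}

system_budget_g = sum(allocations_g.values())   # 5,000g assumed system target
print(f"System mass budget: {system_budget_g} g")

overrun_total = 0
for item, allocated in allocations_g.items():
    reported = reported_g.get(item)
    if reported is None:
        continue  # the contingency line has no reported mass of its own
    overrun = max(0, reported - allocated)
    overrun_total += overrun
    status = f"over by {overrun} g" if overrun else "within allocation"
    print(f"{item:28s} allocated {allocated:5d} g, reported {reported:5d} g -> {status}")

contingency = allocations_g["Contingency (held at system level)"]
print(f"Contingency remaining: {contingency - overrun_total} g")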
•• The payload shall provide regulated supply rails at 3.3V, 5.0V, and 12.0V.
•• The total current drawn from each rail may be up to 600 mA.
•• The total power dissipation during the sunlit orbit shall be 400 mW.
up-to-date to ensure that any power increases due to scope creep are addressed and
are still within budget.
The budgets that we must achieve for this CubeSat mission are:
In this example the CubeSat payload contains an FPGA along with configura-
tion memories, SRAM memory, and support circuits such as oscillators.
Creating the power budget for this application requires the use of power es-
timation spreadsheets provided by the FPGA manufacturer (e.g., Xilinx XPE) to
determine the maximum power required for the FPGA application. This estimation
needs to include clocking rates, logic, and block random access memory (BRAM)
resources, specialist macros (e.g., DSP, Processor, and Clocking). At early stages of
the project, it is also good practice to include a margin factor in this estimation.
It also goes without saying that our power budget should be based upon the
worst-case power dissipation taking into account process, voltage, and tempera-
ture (PVT) variation.
For this CubeSat application, the power estimation for the FPGA and support
circuitry can be seen in Tables 1.1 and 1.2.
These power estimations identify the power drawn from each of the subregu-
lated rails; they do not include the power required from the supply rail.
The proposed architecture uses the 3v3 supply to directly power the FPGA 3v3
IO, non-volatile random-access memory (NVRAM), and controller area network
(CAN) interface. As such, for the 3v3 supply rail we do not need any further
conversion; the load on the 3v3 supply rail is 1.59W, or 78% of the available 1.98W.
The power drawn from the 3v3 rail can therefore be said to be within the avail-
able budget, while the 1v0, 1v8, and 2v5 rails are generated using dc-dc switching
converters connected to the 5v0 rail. When converting voltages with a dc-dc converter, we must
account for the switching efficiency; depending upon the efficiency, the power re-
quired from the supply rail may be considerably higher than that drawn by the
load. When trying to meet tight power budgets using dc-dc converters, we want to
select high-efficiency converters. For this example, a converter efficiency of 85%
was assumed.
The total power drawn from the 5v0 rail is 2.1W; this is again within the allowable
budget of 3W, representing a 70% load on the 5v0 rail.
The loads on the 3v3 and 5v0 rails provide margins of 22% and 30%, respectively.
We also need to ensure that the power budget is able to address short-duration
start-up currents; typically, these will be addressed in a worst-case analysis of the
design.
With a total power draw of 3.66W, the final determination is the time that the
CubeSat payload can be operational while achieving the 400-mW average during the
45-minute sunlit orbit.
Calculating 400 mW/3.66W gives the ratio required to achieve the average; as such,
the payload can be operational for just over one-tenth (10.92%) of the 45-minute
operational time.
This becomes a limiting factor on the FPGA design: it must be able to achieve
its task within the allocated timescale as determined by the power budget. This
time limit can therefore be included as a derived requirement in the FPGA-level
requirements.
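The arithmetic above can be captured in a few lines of Python. The sketch below is our illustration using the figures quoted in this example; the combined dc-dc output load is a value we infer so that the 5v0 draw matches the quoted 2.1W at the assumed 85% efficiency.

# Reproduces the worst-case CubeSat power-budget arithmetic described above.
# Figures are taken (or inferred) from the example; this is an illustration only.

DCDC_EFFICIENCY = 0.85          # assumed dc-dc converter efficiency
ORBIT_MINUTES = 45.0            # sunlit orbit duration
AVERAGE_BUDGET_W = 0.400        # allowed orbit-average dissipation (400 mW)

p_3v3_direct_w = 1.59           # FPGA 3v3 I/O, NVRAM, CAN powered straight from 3v3

# Combined load on the 1v0/1v8/2v5 rails, inferred so that the 5v0 draw matches
# the ~2.1W quoted in the example once converter losses are included.
p_dcdc_load_w = 1.76

p_5v0_draw_w = p_dcdc_load_w / DCDC_EFFICIENCY       # ~2.07W drawn from the 5v0 rail
total_draw_w = p_3v3_direct_w + p_5v0_draw_w         # ~3.66W total worst-case draw

duty_cycle = AVERAGE_BUDGET_W / total_draw_w         # fraction of the orbit we can run
on_time_min = duty_cycle * ORBIT_MINUTES

print(f"5v0 rail draw:    {p_5v0_draw_w:.2f} W")
print(f"Total draw:       {total_draw_w:.2f} W")
print(f"Allowed duty:     {duty_cycle:.1%}")
print(f"Operational time: {on_time_min:.1f} minutes per sunlit orbit")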
One final point of note: throughout this example we used the worst-case figures. If
we had used the nominal figures, the power requirements from the 3v3 and 5v0 rails
would be 1.485W and 1.445W, respectively, resulting in a total power of 3.185W. This
nominal dissipation of 3.185W compared to the worst case of 3.66W is a 13%
difference, and it would result in the 400-mW sunlit orbit average being exceeded if
the operational time were based on the nominal power dissipation and process,
voltage, and temperature variation pushed the power dissipation closer to the
maximum. This increased power dissipation may result in system-level impacts as the
battery experiences more dissipation than expected.
When it comes to designing our system, the system interfaces represent an area of
risk. This risk arises because the interfaces involve multidisciplinary factors (e.g.,
electronic and mechanical design, along with the complexity of interfacing with
external systems). As engineers, to mitigate issues arising from incorrect interfacing
between units or modules, we define the interfaces accurately in an interface control
document (ICD). ICDs enable us to clearly define the different interfaces within
our system and will cover a range of disciplines, such as:
•• Thermal ICD: A thermal ICD will define the thermal interfaces of the sys-
tem. This will include the thermal conductivity between the enclosure and
the mounting points, along with any additional thermal management meth-
ods. These additional thermal management methods may include forced air
cooling, conductive paths, or liquid cooling. The thermal ICD enables the
system designers to understand the thermal environment and design a system
which can operate within it. The thermal ICD will also be very useful when
(if) it comes to determining the part stress analysis and reliability figure.
•• Electrical ICD: An electrical ICD will define all the electrical interfaces of the
system. Often this will be split into an external electrical ICD and an internal
electrical ICD. The electrical ICDs will define the power supplies, signal
interfaces, and grounding. Obviously, when working with electrical interfaces
we should try to use industry-standard interfaces such as Ethernet,
RS232, and USB, depending upon the needs of the system. This is especially
true for external interfaces, where the system is required to interface with
others. Internally within the system, standard interfaces should again be used
wherever possible; however, if they are not, the interfaces should be defined
in depth in the internal EICD. One critical element of the EICD, along
with the signal and interface types, is the definition of the electrical connector
types and pin allocations. Internally, this enables the individual circuit
card designers to develop solutions from a single reference document, ensuring
that they are fully aware of the connector type required and the exact pin
allocations.
•• Maximum power: The maximum power that can be drawn from the power
rail.
•• Noise characteristics: How much ac and dc ripple is present on the rail.
•• Ground return: The return path for the supply current.
Defining these different groups within the ICD is also useful in the later stages
of the design process when layout is underway. These different groups can have
different layout constraints applied to them regarding length matching and even
separation between signal groups.
Once the different signal groups are created, we are then in a position to begin
creating the actual interface information (e.g., signal, direction, and pin allocation).
The signals are then allocated to the pins between the return and power pins at
either end. If the application requires high reliability, it may be further necessary to
isolate different groups of signals from each other.
In this case, guard pins may be used to ensure that the critical signal can-
not short. Guard pins therefore surround the critical signal pin. These guard pins
should be connected to ground via a high value resistor, ensuring that the critical
signal cannot be shorted to either return or the supply rail.
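To show how this kind of pin-allocation data might be captured and sanity-checked, here is a small Python sketch (our illustration; the connector, signal names, and pin count are hypothetical, not from the SensorsThink ICD). It lays out a connector with power and return pins at either end and verifies that a critical signal is surrounded by guard pins.

# Hypothetical internal-ICD pin allocation for a small connector.
# Pins 1..10; power and return at either end, a critical signal guarded on both sides.

pin_allocation = {
    1: ("P3V3", "power"),
    2: ("GND", "return"),
    3: ("SPI_SCLK", "signal"),
    4: ("SPI_MOSI", "signal"),
    5: ("GUARD", "guard"),        # guard pins tied to ground via a high-value resistor
    6: ("SYNC_CRITICAL", "signal"),
    7: ("GUARD", "guard"),
    8: ("SPI_MISO", "signal"),
    9: ("GND", "return"),
    10: ("P3V3", "power"),
}

CRITICAL_SIGNALS = {"SYNC_CRITICAL"}

def critical_signals_guarded(pins, critical):
    """Check that every critical signal pin has guard pins on both neighbours."""
    ok = True
    for pin, (name, kind) in pins.items():
        if name not in critical:
            continue
        neighbours = [pins.get(pin - 1), pins.get(pin + 1)]
        if not all(n is not None and n[1] == "guard" for n in neighbours):
            print(f"Pin {pin} ({name}) is NOT fully guarded")
            ok = False
    return ok

if __name__ == "__main__":
    print("Guarding OK" if critical_signals_guarded(pin_allocation, CRITICAL_SIGNALS)
          else "Guarding violations found")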
1.9 Verification
After completing design work, and pouring hundreds upon hundreds (sometimes
thousands upon thousands) of hours of effort into just coming up with design con-
cepts, and then even more hours piled on top for doing design work and design
reviews, a milestone is reached. New PCBs, mechanical enclosures, critical compo-
nents, and sensors are finally ordered. This is a major milestone, but it is far from
the end of a product development process. (There is a reason those pesky revision
blocks exist on drawings after all.)
Having already touched upon the V-model and looked at how systems engineering
and architecture work lays the groundwork for product design, implementation,
and ultimately validation, we should now consider the other V in V&V:
verification. Recall the distinction mentioned earlier in the book: validation is
often regarded as "was the right system built?" testing, and verification is
often regarded as "was the system built correctly?" testing.
compared with that given by the specification (consider an eye-mask diagram for
the Ethernet).
Fault injection testing means intentionally introducing faults in the hardware
product and analyzing the response of the hardware. Model-based fault injection
(e.g., SPICE, IBIS, FEA, and thermal) is also applicable, especially when fault injec-
tion testing is overly complex to do at the hardware product level. (Consider trying
to perform single-bit error testing that might be expected from cosmic rays.)
Electrical testing aims at verifying compliance with requirements within the
specified voltage range. Again, the example of a power supply performing over a
wide input range is a good one to consider.
The next level up from the last couple of examples is what we would consider
to be integration level testing, where the hardware has been unit-tested, and these
individual components and circuits have been verified. Combining this set of hard-
ware with software that performs nominal operation of the hardware is really the
first big integration milestone for an embedded system. Once this is ready and the
underlying hardware is believed to be adequately tested, the overall embedded sys-
tem (or, if in a complex enough system, perhaps a subsystem) is ready for robust-
ness testing.
During environmental testing with basic functional verification, the embedded
system is put under various environmental conditions during which performance is
assessed. For the SoC platform that we are designing, this may consist of perform-
ing a strenuous Ethernet traffic test over a wide range of temperature and humidity
and over a very long Ethernet cable. This type of test gives strong confidence in the
robustness of an interface if the test is successful.
Expanded functional testing is intended to verify the response of the system to
input conditions that are expected to occur only rarely or that are outside the
nominal specification of the hardware; consider, for example, receiving an invalid
command or having a data transmission intentionally interrupted. If the system can
gracefully handle these conditions, that is a good indicator of robustness.
Accelerated life testing is a topic worthy of its own book (or several for that
matter) and its own subject matter experts. The short version of the intent of the
testing is to predict the behavior and performance of the product as it ages by
exposing it to conditions outside of its normal operating conditions. This is a test
method based on analytical models of failure mode acceleration and can provide
valuable information related to the robustness and margin available in the design.
These tests must be properly designed and executed to provide value [22].
Mechanical endurance testing is intended to stress the product to the point
of either failure or noticeable physical damage. Things such as vibration testing,
drop testing, and impact testing are often performed to ensure that the product can
withstand the environment in which it is going to operate. Performing the testing
to failure gives information on the available margin in the design. Consider a drop
test for the product that must pass at 10 feet. If the test passes at 10 feet, but fails
at 10 feet and 1 inch, the engineering team may want to take a closer look at the
design. If the test does not fail until 15 feet, the team can probably rest easy on that
particular test.
EMI/EMC testing is always a hot topic in engineering circles and again war-
rants a book of its own; it is certainly its own area of expertise. For most prod-
uct engineers, this EMI/EMC testing is the gateway into the marketplace. Without
passing those tests, your product will not be approved for sale. That is why they are
such a good marker for product robustness and a necessary one at that. During this
testing, your product will be zapped with high-voltage transients, have high-power
RF energy pointed at it, and have extremely sensitive RF antennas listening to it
while it operates. If the product cannot take the heat or if it makes too much noise,
the engineering team has to go back to the drawing board.
Luckily for us, the requirements provided to us are relatively easy to derive test
cases from, and it would not be too demanding a task to put together a cogent and
defensible test plan to show that the hardware portion of the system performs
robustly for the application.
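As a sketch of what deriving test cases from a range requirement can look like, the snippet below generates the usual boundary-value test points for a sensing range. It is our illustration only; the -40 to +85 degree range is hypothetical, since the actual SensorsThink ranges are defined in the requirements model rather than reproduced here.

# Illustrative boundary-value test-point derivation for a range requirement.
# The temperature range used here is hypothetical, not a SensorsThink requirement.

def derive_range_test_points(minimum, maximum, step=1.0):
    """Return labelled test points for a 'measure between min and max' requirement."""
    nominal = (minimum + maximum) / 2
    return {
        "below minimum (expect rejection/flag)": minimum - step,
        "at minimum": minimum,
        "nominal": nominal,
        "at maximum": maximum,
        "above maximum (expect rejection/flag)": maximum + step,
    }

if __name__ == "__main__":
    for label, value in derive_range_test_points(-40.0, 85.0).items():
        print(f"{label:40s} {value:+.1f} C")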
SensorsThink is getting off the hook a bit easily in this regard, as theirs is not a
safety product and it is not operating in overly harsh environments. That does
not mean that we do not want to do a good job verifying our design. It just means
that the level of scrutiny placed on the engineering team is not as high as it might
be in another situation. In the hardware design part of our journey, we will take a
closer look at how we can design our product to make testing easier. For now, this
should put in perspective the necessity of proper testing and some additional food
for thought on how SensorsThink may want to approach their product testing.
Engineering governance is probably one of the least fun elements of the design pro-
cess, but at the same time it is one of the most important. Engineering governance
defines how we progress from the initial concept of the project to the completed
delivery and even beyond to end of life and safe disposal.
As you would expect, engineering governance is tied closely with the design
life cycles and the design reviews, which are held as the program progresses. The
engineering governance program defines which of the checklists and reviews need
Complying with the design rule sets as the design progresses can be challeng-
ing, as often the rules are defined in published documents, which, despite best
intentions, are easily overlooked.
1.10.3 Compliance
The best way to ensure compliance with engineering governance rules is to incorpo-
rate them within the CAD tools we are using to implement our embedded system.
Within schematics and layout tools, we can implement rules using constraints and
design for manufacturing settings. If additional checks are required, it is possible to
create scripts in Python or Tool Command Language (TCL) to implement these ad-
ditional checks. When it comes to ensuring compliance with the RTL and software
coding standards, linting tools can be used to ensure compliance against rule sets
that include naming conventions and design rules.
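As a small illustration of what such a scripted check might look like, the Python sketch below scans net names exported from a schematic tool against a hypothetical naming rule. The file format, the prefixes, and the rule itself are assumptions made for this example; they are not drawn from any particular CAD tool or rule set.

import re
import sys

# Hypothetical rule: net names are upper case, start with a functional
# prefix, and use underscores (e.g., PWR_3V3, CLK_25M, SPI0_MOSI).
NET_NAME_RULE = re.compile(r"^(PWR|GND|CLK|RST|SPI\d|I2C\d|UART\d|GPIO)_[A-Z0-9_]+$")

def check_netlist(path: str) -> int:
    """Return the number of rule violations in a text file of net names (one per line)."""
    violations = 0
    with open(path) as netlist:
        for line_no, raw in enumerate(netlist, start=1):
            name = raw.strip()
            if not name or name.startswith("#"):
                continue  # skip blanks and comments
            if not NET_NAME_RULE.match(name):
                violations += 1
                print(f"{path}:{line_no}: net '{name}' violates the naming rule")
    return violations

if __name__ == "__main__":
    # Exit nonzero so the check can gate a sign-off script or a CI job.
    sys.exit(1 if check_netlist(sys.argv[1]) else 0)

A script like this is only a complement to the constraint and DFM settings built into the tools, but it makes the additional project-specific rules repeatable and easy to demonstrate at sign-off.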
The automation of rule sets directly in the CAD tool itself obviously increases
the compliance with the rule set. It also makes demonstrating compliance with the
rule set easier when it is time for design sign-off.
If automation cannot be achieved, then checklists can be used to ensure that the
necessary design rules have been complied with. To ensure repeatability, the check-
lists should be stored under a configuration control system, enabling the design
team to access the latest version.
Checklists, however, tend to be written generically to address a wide range of
developments. There will also be corner cases that do not align with the checklist.
This is where the roles of the chief engineer and independent expert come into
play along with the engineering team. Each rule violation needs to be analyzed
and allocated to one of two categories:
•• Must be fixed: These rule violations must be addressed before the next stage
of the project can be started.
•• Will not be fixed: These rule violations do not need to be addressed. For each
violation, a detailed explanation of the reasoning behind the exception must
be recorded.
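One lightweight way to keep those dispositions auditable is to record each one as structured data rather than as free text in meeting minutes. The sketch below shows one possible shape for such a record; the field names and the example values are illustrative only and are not taken from any particular tool or project.

from dataclasses import dataclass, asdict
from enum import Enum
import json

class Disposition(Enum):
    MUST_FIX = "must be fixed"          # blocks the next project stage
    WILL_NOT_FIX = "will not be fixed"  # requires a recorded justification

@dataclass
class RuleViolation:
    rule_id: str             # rule number from the engineering rule set
    location: str            # schematic sheet, source file, or drawing reference
    description: str
    disposition: Disposition
    justification: str = ""  # mandatory for WILL_NOT_FIX dispositions
    reviewer: str = ""       # chief engineer or independent expert who agreed

    def validate(self) -> None:
        if self.disposition is Disposition.WILL_NOT_FIX and not self.justification:
            raise ValueError(f"{self.rule_id}: 'will not be fixed' needs a justification")

# Example record, ready to be committed to the change management system.
violation = RuleViolation(
    rule_id="HW-DR-042",
    location="power_supply.sch, sheet 3",
    description="Decoupling capacitor farther than 2 mm from the supply pin",
    disposition=Disposition.WILL_NOT_FIX,
    justification="Pin belongs to an unused bank left unpowered in this variant",
    reviewer="Chief engineer",
)
violation.validate()
print(json.dumps({**asdict(violation), "disposition": violation.disposition.value}, indent=2))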
Once the design rules review has been completed, the design review notes and the
categorization of rule violations must be stored within a change management or
product life-cycle management system. This provides a record that the review occurred
and makes the data easily available for higher-level design reviews and certification
reviews.
References
[1] Cooper, R., Winning at New Products: Creating Value Through Innovation, 3rd edition, New York: Basic Books, 2011.
[2] Sparx Systems Pty Ltd., “Enterprise Architect 14,” August 22, 2019. https://www.sparxsystems.com/products/ea/14/index.html.
[3] Reliable Software Technologies, “New Models for Test Development,” 1999. http://www.exampler.com/testing-com/writings/new-models.pdf.
[4] Forsberg, K., and H. Mooz, “The Relationship of Systems Engineering to the Project Cycle,” National Council on Systems Engineering (NCOSE) and American Society for Engineering Management (ASEM) Conference, Chattanooga, TN, October 21, 1991. https://web.archive.org/web/20090227123750/http://www.csm.com/repository/model/rep/o/pdf/Relationship%20of%20SE%20to%20Proj%20Cycle.pdf.
[5] INCOSE, “About: INCOSE,” https://www.incose.org/about-incose.
[6] Requirements Working Group, International Council on Systems Engineering (INCOSE), “Guide for Writing Requirements,” San Diego, CA, 2012.
[7] International Electrotechnical Commission (IEC), “Available Basic EMC Publications,” August 22, 2019. https://www.iec.ch/emc/basic_emc/basic_emc_immunity.htm.
[8] RoHS Guide, “RoHS Guide,” August 22, 2019. https://www.rohsguide.com/.
[9] European Commission, “REACH,” August 22, 2019. https://ec.europa.eu/environment/chemicals/reach/reach_en.htm.
[10] Ericsson Inc., “Reliability Prediction Procedure for Electronic Equipment,” SR-332, Issue 4, March 2016. https://telecom-info.telcordia.com/site-cgi/ido/docs.cgi?ID=SEARCH&DOCUMENT=SR-332&.
[11] BKCASE Editorial Board, “The Guide to the Systems Engineering Body of Knowledge (SEBoK), v. 2.0,” Hoboken, NJ.
[12] SEBoK Authors, “Scope of the SEBoK,” August 18, 2019. https://www.sebokwiki.org/w/index.php?title=Scope_of_the_SEBoK&oldid=55750.
[13] SEBoK Authors, “Product Systems Engineering Background,” August 18, 2019. https://www.sebokwiki.org/w/index.php?title=Product_Systems_Engineering_Background&oldid=55805.
The System on Chip (SoC) Platform project is well underway at this point at Sen-
sorsThink. Requirements have been established (refer to Section 1.5: Requirements),
system architecture has been defined (refer to Section 1.6), and the engineering
domains (hardware, software, firmware, mechanical) are starting to do their engi-
neering wizardry to turn those requirements and ideas into a real-life functioning
product.
For hardware design, this process starts with component selection, more spe-
cifically, key component selection. Let’s explore what we mean by a key component
in a bit more detail before we dive into the selection process for some of the key
components in our design.
For the sake of discussion, let’s assume that the following two criteria are how
we can identify a key component:
1. A component that, once selected, will not have a viable second source op-
tion or drop-in replacement. This excludes things such as resistors, capaci-
tors, and diodes (with a few exceptions).
2. A component that is crucial to realizing the required functionality of the
product and directly traces back to requirements. In our design, a great
example of this is the selection of the actual sensors that will be providing
data to the processing system.
It is natural to wonder why some of the other components are not considered
key components; the answer (as it usually is) is that it depends. Things like memory
can typically be found from at least two suppliers as true drop-in replacements
because of the JEDEC standards around memory.
In JEDEC’s own words [1]:
The other major architectural blocks that we did not identify as key compo-
nents are connectors and power conversion and monitoring. The reasoning here is
that these components will be driven by the other components around them. We
need to know the power requirements for the key components to design a power
system, and we need to know what kinds of connectors may be required to inter-
face to sensors or a debug adapter for whatever processor or SoC that we choose
to be the core of the design.
So now that we have identified the key components that will drive our hard-
ware design forward, there are a couple other key considerations to have in mind
before we start to scour the internet and component supplier websites for the per-
fect parts:
For the purposes of this exercise, cost will not be part of the Pugh matrix, so as to keep the focus on
technical decision making.
A quick review of the three supplier SoC offerings brings us our three options
to put into the Pugh matrix:
A basic Pugh matrix starts by defining a set of criteria that are scored and then
summed. This gives each option a final score, which can then be ranked.
A weighted Pugh matrix evolves the concept further by weighting the criteria
in order of importance. The more important the criterion, the higher the weight
it should be given (a larger number by which to multiply the score). This way,
the resulting final score more accurately captures the fit of each option when the
values are summed.
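To make the mechanics concrete, the short Python sketch below scores three hypothetical SoC options against weighted criteria. The criteria, weights, and ratings are invented for illustration; they are not the values SensorsThink used in its evaluation.

# Weighted Pugh matrix: score = sum(weight * rating) for each option.
criteria_weights = {
    "processing performance": 5,
    "programmable logic resources": 4,
    "documentation and tools": 3,
    "package options": 2,
    "temperature range": 2,
}

# Ratings per option, per criterion (1 = poor fit, 5 = excellent fit).
options = {
    "Supplier A SoC": {"processing performance": 4, "programmable logic resources": 4,
                       "documentation and tools": 5, "package options": 4, "temperature range": 5},
    "Supplier B SoC": {"processing performance": 5, "programmable logic resources": 3,
                       "documentation and tools": 3, "package options": 3, "temperature range": 4},
    "Supplier C SoC": {"processing performance": 3, "programmable logic resources": 5,
                       "documentation and tools": 4, "package options": 2, "temperature range": 3},
}

def weighted_score(ratings: dict) -> int:
    return sum(criteria_weights[c] * r for c, r in ratings.items())

# Rank the options by their weighted totals, highest first.
ranking = sorted(options, key=lambda name: weighted_score(options[name]), reverse=True)
for name in ranking:
    print(f"{name}: {weighted_score(options[name])}")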
All of the information for the Pugh matrix was gathered by sorting through the
device family documentation for each device type. This can be a time-consuming
exercise and can require contacting applications engineers to really help with sort-
ing through all the details and nuances of each family. For SensorsThink, the clear
winner here is the Xilinx Zynq-7000 series of devices, and because we took an ana-
lytical approach to the selection it is difficult to argue against the decision.
From this key decision, we can now start to build the framework around our
main SoC device and choose the supporting components around it.
Before we get too far ahead of ourselves, we should also consider the compo-
nent life cycle, temperature range, and packaging type. We previously identified
these as important, so a review of the options is certainly worth a brief discussion.
A great place to start is a product table (or product selection guide); suppliers
will often summarize many of these options for us visually to make the informa-
tion easier to consume. Another good resource can be parametric search tables
from suppliers of devices, which allow the designer to sort and filter parts based on
specific functionality, package size, or environmental ratings, among other things.
For SensorsThink, a dual-core ARM is the most sensible starting point. The
reasoning behind this is that it allows for things such as asymmetric multiprocess-
ing. One core can run the Linux operating system, and the other can run FreeRTOS
(or another real-time operating system). This really opens up options for optimizing
performance in the end application. You could also dedicate both cores to Linux if
necessary. The flexibility is the attractive selling point for a dual-core system.
That leaves two device branches to explore: the cost-optimized family, built on
Artix-7 fabric, and the mid-range family based on Kintex-7 fabric.
The analysis of Kintex-7 fabric versus Artix-7 fabric is out of the scope of our
discussion; however, cost is a very easy barometer to indicate what might make
sense given the features and trade-offs. A quick search on a distributor website
such as Digikey or Mouser is a great way to get a rough-order estimate of the cost
of each device relative to the other. Let’s examine the highest-end cost-optimized
device and the lowest-end mid-range device, the Z-7020 and Z-7030, respectively
(Figure 2.6).
That is quite the increase in cost, essentially twice the cost for jumping the
barrier from cost-optimized to mid-range. From Figure 2.5, we can see that some
key differentiators are PCI Express lanes, logic elements, and DSP slices. In our ap-
plication, we are doing some sensor fusion and some mathematical operations will
certainly occur. However, without compelling input from a software engineer or
systems engineer, the additional cost is hard to justify. For this reason, we can push
closer to arriving at the major milestone of choosing the SoC for our platform. The
Z-7010, Z-7015, and Z-7020 are the group of devices left to choose from (Figure
2.7).
Since this is an early stage of the development process, having a good migration
path within our device family is something to aim for if at all possible.
By examining Figure 2.7, we can see that a CLG400 footprint provides us
with a tremendous amount of flexibility, which could lead to cost reduction at the
end of the project if we find that the initial selection has excessive horsepower for
our needs. The compatibility extends from the Z-7007S, a single-core version of the Zynq-7000,
up to the Z-7020, the highest-density device available in our dual-core, cost-optimized
family of components.
Note: The shading indicates the footprint compatibility range. The numbers in
each cell are the I/O available to the FPGA and the processor system, and the
number of high-speed GTP transceivers.
Based on this, we will start our development project with the Z-7020 device
in a CLG400 package. We will also need the industrial temperature range to ac-
commodate the requirement to operate down to –10°C. We will also start in the –2
speed grade, the faster of the two options available in our temperature range. This
provides us the most processing power in our migration path and allows us to see
if we can migrate down as we near completion of the development.
Figure 2.8 Zynq-7000 Technical Reference Manual: DDR Memory Section Table 10. (Adapted from Xilinx.)
It is worth taking the time to review more detailed documentation from Xilinx for the Zynq-7000 series to
catch any potentially missed nuances for implementing DDR3. The technical reference
manual for the device family is the deepest dive that you can take, and it is where
we will go next.
The first row in this table confirms that our design intent is valid. If we choose
a 4-Gb component at 16 bits wide and place 2 of them in parallel, we will have a
32-bit-wide data bus at 8 Gb (1 GB) of total memory density. A suitable part num-
ber for us to use is Micron MT41K256M16TW-107; this part has the right bus
width and density, is on the supported list from Micron, and is currently an active
part. This is one more design decision checked off the list for our SoC platform.
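The width and density bookkeeping behind that choice is simple enough to capture in a few lines; the sketch below merely restates the arithmetic for two 16-bit, 4-Gb devices in parallel.

# Two x16, 4-Gb DDR3 devices in parallel form the 32-bit bus the Zynq expects.
devices = 2
width_per_device_bits = 16
density_per_device_gbit = 4

bus_width_bits = devices * width_per_device_bits         # 32-bit data bus
total_density_gbit = devices * density_per_device_gbit   # 8 Gb total
total_density_gbyte = total_density_gbit / 8             # 1 GB total

print(f"Bus width: {bus_width_bits} bits, "
      f"density: {total_density_gbit} Gb ({total_density_gbyte:.0f} GB)")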
Note that there is one other commercially available option from Melexis, but it
does not have the required focal plane array size, as it is only 32 × 24 pixels.
The FLIR Lepton 3.5 image sensor meets all of our stated requirements:
Because of the limited options available, the selection of this sensor becomes
merely an exercise in finding available options. If there was no sensor available that
met all of the stated requirements, the hardware engineers and product manage-
ment at SensorsThink would need to sit down and discuss alternative paths for-
ward. That could mean reducing the requirement to align with available sensors,
creating a partnership with a specialist partner company to develop a sensor, or
removing the sensing technology altogether. Luckily for us in the hardware group,
the requirements are able to be met.
The next step in hardware design is to start to put the design into place. Finally,
after months of research and legwork, requirements, and component selection, the
design work starts to take shape. All the work that has been put into the project so
far is essential for the hardware designer at SensorsThink to start the nitty-gritty de-
sign process. Without key components chosen and a system-level concept in place,
the hardware design process is akin to assembling a puzzle with the pieces upside
down. However, with those things well-defined, the puzzle pieces turn right-side up.
It may be a 100-piece kids’ puzzle sometimes and a 2,000-piece monster puzzle at
others, but it is always easier to take that task on with the pieces flipped over.
A natural question at this juncture would be regarding the difference between
the system-level physical architecture that was done previously and this current
architectural design step. The diagrams can often look quite similar; however, this is
not always the case. In complex systems with more abstract black boxes, the diagrams
can become more layered, with a system-of-systems architecture starting to
reveal itself as the implementation path forward.
In our project, this is not the case, but the highest return on the investment for
going through this part of the design process is always achieved by doing it up-
front, before the real design work has started.
Hardware architecture ties into the discussion here in taking the time to docu-
ment the key components, their interfaces, and the data that flows over those in-
terfaces. It allows the entire engineering team to understand the real nuts-and-bolts
impact of design decisions. If a new sensor comes along that would be a nice-to-have
in the system but would require the SoC Platform to make a technology
jump and add high-speed transceivers to our SoC, we can see that much more clearly
with a well-defined and documented architecture, fed by key component
selection.
ISO 26262 also provides some guidance as to what makes a well-planned ar-
chitecture; SensorsThink would be wise to consider Table 2.1.
As a final point, consider this last bit of advice from ISO 26262: “Nonfunctional
causes for failure of a safety-related hardware component shall be considered dur-
ing hardware architectural design, including the following influences, if applicable:
temperature, vibrations, water, dust, EMI, cross-talk originating either from other
hardware components of the hardware architecture or from its environment.”
If we again simply remove the word safety and replace it with the word “sys-
tem,” we can see that this is wise advice to heed on all projects and product de-
velopments. Consider other failure modes aside from functional failures. Consider
how to protect hardware against its operating environment, consider how to pro-
tect circuits from each other, and consider this early in the design process. It is
much easier to populate a set of component pads to help with an EMI/EMC prob-
lem than it is to cut a board up, hack a part into place, and then respin your design
to pass the testing. An ounce of prevention is worth a pound of cure.
•• VoSPI does not utilize the master-out, slave-in (MOSI) line, and that pin
should be set to a logic 0 or connected to signal GND.
From these pieces of information, we can very clearly convey the requirements
for the master SPI device (our SoC) to our firmware engineering team and also start
to put the puzzle pieces together in terms of the number of peripherals required of
our SoC to communicate to our sensors. Buried deeper in the section of the data-
sheet is the number of bits in each video frame and the different modes of commu-
nication in which the Lepton can operate.
As an example, let’s consider the Raw14 video format mode. In this mode,
there are 164 bytes per packet and 60 packets per video frame. That gives us ap-
proximately 10 kB per video frame. This is useful information to know for further
consideration on where to place the peripheral and to be able to share as a starting
point for firmware engineering. This is one of the two interfaces shown on the
diagram. The other one was I2C; let's take a look at that interface next.
Table 2.3 shows the entry from the Lepton datasheet for pins on the Command
and Control interface. In addition to the reference to the appropriate page section,
we also see that this is an I2C-compatible interface. This is key information that
should be (and is) captured on the hardware architecture diagram.
If we now jump into that section of the datasheet, we can find information on
how the interface is used and how to properly communicate with it.
Lepton provides a command and control interface (CCI) via a two-wire inter-
face similar to I2C. In this section, we are given a very useful reference to a separate
interface description document (IDD) to fully explain the interface [5]. There is
also a list of some of the parameters that are controllable via this interface; things
such as gain, telemetry, frame averaging, and radiometry mode selection are all
done via this interface.
Again, all the pieces of information that we need are here to help us to make
informed decisions in piecing the puzzle together to better understand our embed-
ded system.
is where having a solid design process in place can make the insurmountable seem
manageable.
For a system such as the one being designed by SensorsThink, the process to
follow is the same as for a system being designed for a coffee maker or a server
grade motherboard. The differences come in the complexity of those steps, and the
amount of planning, checking, and replanning that likely go into each step.
Note that it is important to take a step back and acknowledge that, in engineer-
ing and in design work, there is always more than one way to get a job done prop-
erly. The intent of this story and the steps being laid out are to provide a frame-
work that can be followed, which hopefully leads to successful and well-engineered
designs and solutions. Over the course of your career, you will find what works
best for you, and you will find what does not work for you at all. It is all part of
the journey of learning the engineering trade and learning about a specific field of
engineering. Design will evolve, software tools will change and improve, and the
way that design is approached will have to change as well in order to keep up with
the times. Now let’s jump back into the story and outline some high-level steps to
follow when designing an embedded system.
This tool brings a number of very useful power system design capabilities together under
one umbrella, including individual power supply design and power system design,
and it works hand-in-hand with LTspice to run more detailed simulations of power
supplies. As an example, consider Figure 2.12 (tools used with the friendly permission
of Analog Devices Inc.).
This diagram is not for the SensorsThink design, but it is a good example of
how you can visualize an entire power system, include all of the nominal loads, and
really get a high level of confidence that your solution is a good one. After entering
all of your design data, the tool also does power calculations on the system to show
you things such as total power consumption, dissipation, and margin.
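Even before opening a dedicated tool, a rough power budget can be sanity-checked in a few lines of Python. The rails, load currents, and regulator efficiencies below are placeholders chosen for illustration; they are not the SensorsThink numbers.

# Hypothetical rails: (output voltage in V, load current in A, regulator efficiency).
rails = {
    "1.0V core": (1.0, 2.0, 0.90),
    "1.8V aux":  (1.8, 0.5, 0.88),
    "3.3V I/O":  (3.3, 0.8, 0.92),
}
input_voltage = 12.0  # V, assumed system input

total_input_power = 0.0
for name, (v, i, eff) in rails.items():
    load_power = v * i
    input_power = load_power / eff          # power drawn from the input rail
    dissipation = input_power - load_power  # lost in the regulator
    total_input_power += input_power
    print(f"{name}: load {load_power:.2f} W, input {input_power:.2f} W, "
          f"dissipated {dissipation:.2f} W")

input_current = total_input_power / input_voltage
print(f"Total input power: {total_input_power:.2f} W "
      f"({input_current:.2f} A at {input_voltage} V)")

A dedicated tool does the same accounting with far more fidelity, but writing the budget down in this form makes the assumptions explicit and easy to revisit as the design matures.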
If we then want to look at the transient performance of a specific power supply,
we can jump into LTspice and run that simulation as well, linking the simulation
to this system-level view and making for a really nice design artifact.
2.3.7 Summary
There is no shortage of design topics that can be concerning for engineers doing
hardware design work. In high-speed digital systems, there can be months of work
doing signal integrity simulations planning for routing structures (vias, fanout pat-
tern, pin escape) on multigigabit-per-second interfaces. In sensitive analog systems,
there can be many months of work doing analog simulation and analysis on op-
amp circuit stability, looking at Bode plots, and doing noise analysis before arriving
at what should be committed to the schematic. The more preparation work that
is done, the more at ease the designer will ultimately feel when it comes time to
stamp a schematic and PCB design with his or her approval. Let’s take a look at an
example of one of the topics we just mentioned in a bit more detail to get an idea
of examples of what to expect when undertaking this type of work. Let’s focus on
the Zynq device, as it is the most complex device on the SensorsThink Platform
towards which we have been marching.
During design reviews, it is often what seems to be the most mundane topics that
get the most attention: the color of LEDs, the percentage tolerance of specific resis-
tors, what sheet size was used for the design; on and on it can go, seemingly with no
value added. One topic that can seem to fall into that category is decoupling capaci-
tors, but doing so would be a considerable mistake, and, as such, the designers at
SensorsThink are wise to consider decoupling strategy in great detail.
Despite the boring exterior, decoupling is absolutely essential to PCB design.
Without it, you risk having a totally nonfunctional circuit. With too much or too
little of it, you can have instability in all kinds of circuits: power supplies, loop
compensation circuits, phase locked loops (PLLs), and more.
A note to the reader: In the course of my career, I have spent countless hours,
days, and probably weeks discussing decoupling with seasoned engineers from ana-
log, power, and digital perspectives. Decoupling is not a one-size-fits-all approach,
and knowing that is a big part of conquering the battle with those little capacitors.
journey, less work, and you disturbed no one else on your journey to get that en-
ergy. Before you know it, that little bucket gets filled back up, and the next time
that you need energy, it is sitting right there for you again.
By providing a local energy storage device (decoupling capacitor) for your cir-
cuits, you make their lives easier, they work better and more efficiently, and they
are friendly to their neighbors. This is a win-win scenario.
Broadly speaking, the most commonly used capacitor types are:
1. Aluminum electrolytic;
2. Aluminum polymer;
3. Tantalum;
4. Ceramic;
5. Film.
Each type of capacitor has its own strengths and weaknesses, some due to the
type of dielectric material, some due to packaging sizes, and some due to material
properties and applications. A full discussion on each capacitor type and the ad-
vantages and disadvantages warrants a book all to itself. As a starting point, you
can check the references at the end of the chapter for some very insightful discus-
sion and information [9–12].
As the designer, you have all the information that you need to properly de-
couple the Zynq at a schematic level, make smart component selection choices and
substitutions for parts that may be hard to find, and understand the role each type
of component plays.
These days, almost anything can be simulated with powerful tools such as HyperLynx [14,
15] or Ansys SIwave. There are tools that can analyze your capacitor placement
such as HyperLynx DRC [16] and alert you to physical placement issues that are
easy to miss on large or complex designs. Using those tools takes time, but the time
invested is almost always worth the return on that investment.
In Section 2.4, we talked about a seemingly mundane topic that has many more nu-
ances and layers than meet the eye. A topic that is much more straightforward and
requires less math and more common sense is connector choice or connectorization.
To be clear, connectorization is a made-up word; you cannot find it in a dictionary
or in any kind of conversation outside of engineering. That will not stop the word
from being used in conference rooms at SensorsThink, however; and it will not stop
you (nor should it) from using the word either.
A note to the reader: Sometimes picking connectors is really easy, and some-
times it is not. There are often organizational influences on connector choices:
what tooling in which the company may have already invested, what companies
are preferred partners, or what families are already in heavy use across different
products. These can all be important in the bigger picture decision-making process,
especially if you are working on a product with massive volumes (hundreds of
thousands or millions of units). For the sake of discussion, we are going to keep the
focus on technical reasons to choose a connector and assign signals to locations,
because it is impossible to predict all of the organizational influences that may or
may not be present in your situation.
With so many things to consider, it is a fair question to wonder why this topic
was just stated to be straightforward. The easiest way to demonstrate that is to
walk through an example of choosing a connector for an interface of our Smart
Sensor Controller.
the service personnel will use the interface. That means that at every service call,
the RJ45 connector could be unplugged from the customer network, plugged into
a service computer, and then reverted back to the customer network. That equals
four mating cycles per visit, two visits per year: eight cycles. Eight cycles for 12
years equal 96 cycles. Let’s keep with the theme of round numbers and round up to
100 cycles for maintenance.
Let’s also assume that at least once per year the customer damages the wired
connection and has to replace the cable to the interface; that is another two cycles
per year for 12 years, or 24 cycles. We will go with 25 for our counting purposes.
For all intents and purposes, this is the bulk of the use that the interface will
see. There are other times the interface will see action, during production testing,
for example. There are the outlier customers that will change cables once a month
or unplug the connector once a week for their application, maybe to clean around
the system, for example.
To give ourselves some additional margin, and because the total number (125)
is so modest, we will triple our calculation and assume that 375 mating cycles is
the target.
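The same back-of-the-envelope arithmetic can be written down explicitly so the assumptions are easy to revisit later; the sketch below simply reproduces the numbers above.

import math

service_visits_per_year = 2
cycles_per_visit = 4           # unplug, plug into service PC, unplug, plug back in
cable_replacements_per_year = 1
cycles_per_replacement = 2
product_life_years = 12
margin_factor = 3              # generous margin because the totals are so modest

service_cycles = service_visits_per_year * cycles_per_visit * product_life_years        # 96
service_cycles = math.ceil(service_cycles / 100) * 100                                  # round up to 100
replacement_cycles = cable_replacements_per_year * cycles_per_replacement * product_life_years  # 24
replacement_cycles = math.ceil(replacement_cycles / 25) * 25                            # round up to 25

target_cycles = (service_cycles + replacement_cycles) * margin_factor
print(f"Target mating cycle rating: {target_cycles}")  # prints 375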
There are no requirements for water or dust ingress protection, so this is all the
information we need to make our choice.
This all points towards this being a good choice for our application and this
interface. It did take a bit of information gathering and a bit of common sense
and guesstimation (another made-up word) to round things out, but it was pretty
straightforward compared to many other design tasks encountered during
hardware development.
If the team at SensorsThink follows a similar approach for all of the interfaces
and connectors, they will end up making sound technical choices for the system.
Component ratings that are commonly considered for de-rating include:
1. Power;
2. Voltage;
3. Current;
4. Temperature;
5. Actuation cycles.
As we have done for other topics, such as verification, we can look to harmo-
nized standards for a good starting point for discussion and investigation into this
topic.
IEC 61508 [20] is an international standard for functional safety, and it is often
cited as a baseline by other industries when they develop their own safety
standards.
From the document’s first section, we can refer to the introductory paragraph
to better understand the intent of the IEC 61508 series:
This International Standard sets out a generic approach for all safety lifecycle ac-
tivities for systems comprised of electrical and/or electronic and/or programmable
electronic (E/E/PE) elements that are used to perform safety functions. This unified
approach has been adopted in order that a rational and consistent technical policy
be developed for all electrically-based safety-related systems.
For the engineering team at SensorsThink, adhering to this standard is not re-
quired for the SoC Platform, as it is not a safety system or one used in a safety ap-
plication. However, as we used the guidance from ISO 26262 to better understand
the value of, and methods for, proper verification, we can take similar lessons for
de-rating from IEC 61508 and create a very robust hardware design, rooted in the
principles of the most robustly designed systems in the world.
De-rating (see IEC 61508-7) should be considered for all hardware components.
Justification for operating any hardware elements at their limits shall be docu-
mented (see IEC 61508-1, Clause 5).
NOTE: Where de-rating is appropriate, a de-rating factor of approximately two-
thirds is typical.
In other words, in a safety-related system that falls under the scope of IEC
61508, a 100-V rated capacitor can only be used on a voltage supply of 67V or less,
unless there is supporting justification and documentation that shows why a higher
applied voltage is acceptable.
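A de-rating check like this lends itself to a simple script run against a bill of materials. In the sketch below, the two-thirds factor follows the guidance quoted above and the 40% figure for solid tantalum anticipates the practice discussed next; the part numbers and voltages are invented for illustration.

# De-rating factors by component class (fraction of the rated value that may be used).
DERATING_FACTORS = {
    "capacitor": 2 / 3,          # generic two-thirds guidance (IEC 61508-7)
    "tantalum_capacitor": 0.40,  # solid tantalum: 40% of rated voltage is common practice
}

def check_derating(part: str, component_class: str,
                   rated_voltage: float, applied_voltage: float) -> bool:
    """Return True if the applied voltage respects the de-rating factor; flag it otherwise."""
    allowed = DERATING_FACTORS[component_class] * rated_voltage
    ok = applied_voltage <= allowed
    status = "OK" if ok else "NEEDS JUSTIFICATION"
    print(f"{part}: {applied_voltage:.1f} V applied vs {allowed:.1f} V allowed "
          f"({rated_voltage:.0f} V rated) -> {status}")
    return ok

# Hypothetical examples:
check_derating("C12", "capacitor", rated_voltage=100.0, applied_voltage=48.0)           # OK: 48 V <= 66.7 V
check_derating("C45", "tantalum_capacitor", rated_voltage=25.0, applied_voltage=12.0)   # flagged: 12 V > 10 V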
factor in safety systems, and solid tantalum capacitors are usually de-rated to 40%
of their listed working voltage due to their long-standing reputation as devices that
self-ignite during failure.
Knowing the guidance from IEC 61508 is a great starting point, but it is not
a substitute for engineering rigor and for deeper understanding of the exceptions
to the rules, especially in safety-critical systems. The engineering team that ignores
those rules and exceptions will not get very far in a safety assessment by functional
safety experts.
Even if your system is not part of a functional safety product, that does not
necessarily preclude the use of de-rating to ensure a robust system and product.
De-rating can sometimes drive cost and can lead to discussions about overengineering a
particular system, but there is little harm in erring on the side of reliability and
robustness. Most importantly, engineers must be aware of all the trade-offs that
they are making and make the right ones for the product that they are designing.
performance of particular sensors, or elaborate stress tests for the high-speed por-
tions of the design, a common occurrence when too many engineers gather in the
same room.
Before discussing which pseudo-random bit pattern should be sent over a com-
munications link or how many nanovolts of measurement accuracy are needed for
a measurement, we should all remember the basics.
When a new PCB arrives, it must be treated as a complete unknown. Despite
the thousands of hours of effort that went into conceptualizing and designing the
circuits in the past several months, never forget that things happen. Human error
is always a factor and, despite our best efforts, mistakes will be made. A well-
constructed verification plan will push these errors to the forefront and allow them
to be found, corrected, and prevented from escaping into the field.
It is best to start with a very conservative approach and slowly expand the
scope of testing as confidence in the design builds. Consider the following series of
steps for taking a brand-new PCB out of the box and getting ready to test it:
This list of 10 steps can save a lot of headaches for engineers of all disciplines
and only takes a short amount of time to perform on an appropriate sample size of
the total lot of PCBs that have been received.
For particularly costly or complex designs, 100% inspection may well be war-
ranted. In some cases, if very high-cost components are used, it is good practice to
have several boards assembled without those components so that the underlying
hardware environment can be verified to be safe for those high-value components.
For example, consider a very high-end FPGA such as the Xilinx Virtex UltraScale+
Series of devices with on-board high bandwidth memory (HBM). These devices can
cost tens of thousands (sometimes over hundreds of thousands) of dollars each.
Taking extra precautions is absolutely warranted and is the right thing to do in
cases like that.
It is reasonable to wonder what value each of the 10 steps listed provides
to the engineering team, especially given that it can be quite exciting to get a new
piece of hardware in hand. Oftentimes, the instinct is to just plug it in, turn it on,
and see what happens. Patience is a virtue, as the old saying goes, so let’s explore
that further.
(a) The last step before diving into the nitty-gritty verification world is to
make sure that the processors and/or FPGA devices on the PCB can
be programmed; this is usually accomplished via a JTAG (or another
debugger) interface.
(b) In some cases, the JTAG interface can be used to program attached
FLASH memory storage; if that is the case in a particular design, the
time to check for basic functionality of that interface is early, before
too much detailed checking has started.
By doing these tests for at least one representative PCB prior to distributing
them to the other engineers who are part of the team, there is a reasonable level of
confidence that a nominally functional hardware platform is in the hands of other
engineers for further testing and development. Any necessary rework is identified early
and can be applied to all boards at the same time, ensuring a consistent deployment
of hardware. This is crucial in teams of any size, but the larger the team, the
more important it becomes.
•• 10/100 Ethernet:
• 100BASE-TX Physical Layer;
• MII/RMII interface (payload data);
• MDIO interface (PHY management).
•• DDR3;
•• SPI:
• Accelerometer;
• Infrared image sensor (video data);
That is quite the list. Going page by page and breaking it down, we essentially
have created an outline for the majority of our test plan. Now the details can be
filled in: how the tests will be executed, what measurements will be made, what the
pass/fail criteria are, and so on.
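One lightweight way to fill in those details consistently is to give every test the same fields from the start. The sketch below shows one possible structure; the field names and the example values are illustrative only and are not drawn from the actual SensorsThink test plan.

from dataclasses import dataclass, field

@dataclass
class HardwareTestCase:
    interface: str         # which interface or circuit is exercised
    requirement_ids: list  # requirements this test traces back to (hypothetical IDs)
    procedure: str         # how the test will be executed
    measurement: str       # what will be measured, and with what instrument
    pass_criteria: str     # unambiguous pass/fail criteria
    results: dict = field(default_factory=dict)

# Example entry in the SoC Platform test plan outline:
ethernet_link_test = HardwareTestCase(
    interface="10/100 Ethernet (100BASE-TX)",
    requirement_ids=["HW-REQ-014"],
    procedure="Connect to a link partner and transfer 1 GB of test data",
    measurement="Negotiated link speed, packet loss, MDIO register status",
    pass_criteria="Link negotiates 100 Mb/s full duplex with zero packet loss",
)
print(ethernet_link_test)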
(For some memory devices, such as those that hold a bootloader, readback of data from memory to
host may be much more important to test than write performance to the memory.
It is always good to understand the use case of each component.)
In this design, the interface communicates at 3.3-V logic levels and it is con-
nected to the Zynq processor complex.
Let’s summarize what we have so far:
1. I2C interface;
2. 3.3-V voltage levels;
3. 1-MHz maximum frequency;
4. Write values;
5. Read values.
We are starting to piece a test case together, which is excellent progress. Let’s
dig a bit deeper into the details. How can we really make sure that the device is
working correctly?
2.7.4.2 Specifications
In nearly all digital communications devices, there will be both dc and ac character-
istics provided for the device; these are very important to review during the design
phase, and equally important to review when designing a test plan (Table 2.7).
Values for minimum and maximum voltage levels that are interpreted as 0s and
1s are given in the dc specification, and things such as rise time and fall time are
given in the ac specification for the device.
From the dc characteristics, we know now what voltage levels to expect on the
interface and what voltage levels must be met for proper operation for both inputs
to the device and outputs from it. For the I2C interface, the clock is always an input
to the device; during a write transaction, the data is an input, and during a read
transaction, it is an output.
Therefore, when we write to the device, we need to provide signals that reach
at least as low as 3.3 * 0.2 (remember, 3.3V is our VCC) or 0.66V. When we read
from the device, we should expect that the voltage level for a logic state of 1 should
be no less than 3.3 * 0.7, or 2.31V.
From the ac characteristics, we can now compare the measurements of the sig-
nals on our design to the requirements of the device (Table 2.8). In this case, they
are dependent on the period of the clock cycle. For a 1-MHz clock, the period is
1 µs; 9% of 1 µs is 0.09 µs, or 90 ns. This is a really clear pass/fail criterion against
which we can measure and verify our design's performance.
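Those pass/fail numbers are easy to regenerate, and keep consistent across the test plan, with a few lines of code. The 0.2 and 0.7 input-level factors and the 9% rise-time allowance below come from the device tables discussed above; for any other part they would need to be replaced with that part's datasheet values.

VCC = 3.3          # V, interface supply
BUS_FREQ_HZ = 1e6  # 1-MHz maximum I2C clock

# DC thresholds from the device's dc characteristics table.
vil_max = 0.2 * VCC  # drive below this for a valid logic 0 -> 0.66 V
vih_min = 0.7 * VCC  # expect reads at or above this for a logic 1 -> 2.31 V

# AC limit from the ac characteristics table: rise time <= 9% of the clock period.
period_s = 1.0 / BUS_FREQ_HZ
rise_time_limit_s = 0.09 * period_s  # 90 ns at 1 MHz

print(f"VIL(max) = {vil_max:.2f} V, VIH(min) = {vih_min:.2f} V")
print(f"Rise time limit = {rise_time_limit_s * 1e9:.0f} ns")

# Compare a measured oscilloscope value against the limit.
measured_rise_time_s = 62e-9  # hypothetical measurement
print("Rise time PASS" if measured_rise_time_s <= rise_time_limit_s else "Rise time FAIL")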
2.8 Integrity: Important for Electronics
Part of verifying your design comes in the form of simulation, and this can take
on many forms in embedded system design. For hardware specifically, we can con-
sider two broad topics for simulation: power integrity and signal integrity.
A note to the reader: The topics of power and signal integrity are massive top-
ics to try to explain in a concise manner, and they warrant books and conferences
of their own. Because of that reality, it is not within the scope or spirit of this
particular book to dive into great depths of detail for all things related to power
and signal integrity. The intent of this brief discussion is to bring awareness to the
topics for further consideration in your particular project or design. Many titans of
the industry have written excellent books on the topics: Howard Johnson, Martin
Graham, and Eric Bogatin are among our favorites, but there are many other excel-
lent resources as well.
With all of the above information, a very in-depth analysis can be performed: the
power delivery networks can be extended through regulators and down to their
loads, providing high confidence in the capability of the power delivery interconnect
to supply the power needed for the design.
Performing this kind of power integrity analysis typically requires:
1. The desired target impedance for critical components: This type of information is sometimes available in datasheets; other times, more direct questions must be asked of IC suppliers to get the information.
2. Accurate capacitor models for the components used in the design: Typically, these will be in the form of S-parameter files, which are considered more accurate for high-frequency analysis because they capture frequency-dependent behaviors such as the variability of series inductance over frequency.
3. A clear understanding of the specific pins of interest for the devices that require this analysis: Not all package pins are created equal, and a good understanding of which pins supply which particular sections of the silicon die is essential.
The reason that this type of analysis is becoming more common is due to the
increased edge rates of modern silicon. Even devices such as logic gates from the
74LVC logic family boast rise time specifications on the order of only a few nano-
seconds. An important concept to bear in mind with signal integrity is that the fre-
quency of the signal is sometimes not the cause for concern, but rather the rise and
fall times of the edges. A 1-ns rise time implies significant frequency content approaching
1 GHz. At these frequencies, it is easy to see why designing with signal integrity
in mind is so important. Performing basic signal integrity analysis requires accurate
driver and receiver models and basic information about the interface such as:
1. Operating voltage level, in order to correctly select the model of the buffer
to use in simulation. For even simple devices such as logic gates, there are
often IBIS models describing the behavior of the driver at various nominal
operating voltages. Choosing them incorrectly can lead to very misleading
and strange results.
2. Operating frequency, in order to properly stimulate the interconnect. For
an SPI that is intended to run at no faster than 10 MHz, using a 500-MHz
input stimulus provides no value.
3. Specific drive strength and slew rate information for drivers. In modern
devices, it is not uncommon to have multiple options for setting the drive
strength (given in mA, typically) and slew rate (fast, medium, slow) for
individual IO or specific peripherals. Changing these parameters can have
a drastic impact on simulated and real-world performance.
We can use HyperLynx software to perform the pre-layout analysis and to verify that we have
a sound design architecture in mind for the ULPI interface.
In the exercise of pre-layout simulation, the goal is to ensure that under ideal-
ized conditions the interfaces and design details (such as selected series termina-
tion) produce a very clean and idealized signal.
If the pre-layout simulation does not show nearly ideal signal behavior, it is an
indicator that further investigation is warranted. There is something that is amiss,
either with the setup of the simulation or with the design topology and component
selection. Regardless of root cause, the time to find it and address it is certainly in
the pre-layout stage and not later.
In addition to validating design decisions, pre-layout signal integrity analysis is
a wonderful learning tool, as you can freely try things and look at the impact to the
signals without any real-world risk or impact; this kind of learning opportunity is
always welcomed in engineering.
For the ULPI interface, we have a relatively simple architecture to consider:
A note to the reader: PCB stack-up design is an art unto itself that is brief-
ly covered later in this book. For the purposes of pre-layout signal integrity, the
requirements are only to have a stack-up well enough defined to facilitate the cre-
ation of transmission lines of a known impedance.
For our design, the interface is intended to operate from a 1.8-V supply, and,
per the ULPI specification, the maximum clock frequency and data rate are 60
MHz. This allows us to run a baseline simulation and view the waveform result
relatively quickly. In order to get a clear view of the quality of the signal, it is pos-
sible to run an edge simulation (either rising or falling) or an oscillator simulation
at a specific frequency. Additional options can be set for the corner of interest:
slow-weak, typical, or fast-strong. In this example, we will run an oscillator simu-
lation at 60 MHz of the typical IBIS model. This simulation will be of data flowing
from the Zynq to the USB3320 device.
We can observe some signal overshoot and ringing on this initial simulation
(Figure 2.20), and this is something for us as designers to watch during post-layout
analysis (which takes into account actual routing geometry). However, we can also
study the impact of series termination so that if post-layout analysis reveals similar
problems, we can have a fix ready to implement.
A note to the reader: In a classical sense, series source termination is placed as
close as possible to the transmitting device in order to try to better match the source
impedance of the driver to the transmission line. On a bidirectional bus, it is ideal
to place the termination at the mid-point between the two devices in order to try
to match both transmitting devices. It is important to keep in mind that the ideal
signal is almost impossible to implement on real electronics, which is why pass/
fail criteria and a thorough understanding of the way that an interface works are
crucial to these kinds of exercises.
In the revised circuit (Figure 2.21), we have split the transmission line in half
and placed a 22-Ω resistor in series with the transmission line pair.
We can see that the signal quality is greatly improved on this simulation run,
with significant reduction in overshoot and ringing (Figure 2.22).
These kinds of simulations can be repeated in all kinds of varying conditions:
longer transmission lines, different driver models, additional fanout, and differ-
ent frequencies. Knowing what to simulate at different stages of the design comes
with experience and a little bit of experimentation. However, the value and insight
gained along the way are invaluable, so it is worth erring on the side of simulation
whenever any doubt exists.
In terms of the amount of work that goes into designing a PCB, it can be argued
that the majority of the hours put into the design are accumulated during the layout
phase. Layout is the phase of PCB design in which the schematic interconnect is
turned into the physical pieces of copper that (hopefully) result in a functional piece
Figure 2.21 HyperLynx schematic model of the ULPI data path: added series termination.
of hardware that implements the many months (and sometimes years) of planning,
design work, and development efforts of countless engineers and management.
PCB layout is an often-misunderstood engineering discipline, sometimes even
an overlooked one. Too often schematic netlists are “thrown over the wall” to an
unsuspecting PCB design engineer who is given little input, less guidance, and the
highest of expectations.
As with nearly all tasks in engineering, garbage in produces garbage out. If the
engineers at SensorsThink provide a well-designed schematic along with a clear set
of design constraints and layout considerations to follow and they engage in the
layout as a collaborative effort, the team will be set up for success.
PCB layout is an art form for sure, but one that begins with a tremendous
amount of planning and detailed consideration. There are nearly endless design
trade-offs that will be made out of necessity throughout the process of
designing a PCB, and navigating those trade-offs is both part of the fun and part of
the battle of PCB layout.
Mechanical details such as mounting holes and the physical size of the PCB are all
important to consider from the very start of floor planning.
Electrical constraints are yet another consideration. What circuits are sensitive
and likely to be impacted by electrical noise? What circuits are likely to annoy their
neighbors? In keeping with the theme of a house, where do we want our jacks for
cable and internet? Where does the dishwasher go or the washing machine? These
things are considered long before the carpenters and tradespeople begin the work
of building the house. If they are not, we can imagine that the home would be built
ad hoc, and the results would likely be a mixed bag at best with even the most tal-
ented tradespeople performing the labor.
This is an improvement: the connections now all have a clear line of sight to the pins on the
IC. While this does look promising, a quick look in a 3-D rendered environment
will show the problem with the current arrangement (Figure 2.25).
We can imagine that it might be less than ideal to try to plug a cable into the
connectors with the IC sitting right in front of the jack (Figure 2.26).
If we take one more attempt at this, we can probably arrive at a reasonable
placement for these three components, relative to one another.
In Figure 2.27, we arrive at what is likely the best we can do with the informa-
tion that we have at hand for these three components and the rat’s nest connections
that are showing. It looks like some of the lines are crossing over each other (Figure
2.27); wasn't it much better before? It was, but it was not a reasonable physical
approach when we viewed things in the 3-D world in which we live. So we had to
make the first of the many trade-offs that will be necessary during the process of
laying out this (and any) PCB. There are many ways to unravel these crisscrossed
lines during the routing phase of the design, and many of those tricks are learned by
doing and come from years of practice. Experienced PCB layout engineers see these
solutions as quickly as they see the problems, and the trade-offs that must be made
are clear to them very quickly. Until that experience is something that can be relied
upon, mistakes will be made and learned from; all of that is part of the process.
In nearly every design, there are going to be two types of circuits: circuits that
create noise and interference, and circuits that are particularly susceptible to noise.
In floor planning, understanding which circuits are noise creators is vitally impor-
tant; equally important is knowing which circuits are susceptible to malfunction in
the presence of noise. Once these circuits are identified, we can try to use physical
space between the two types of circuits as our best ally for mitigation against un-
wanted interference and circuits that misbehave (Figure 2.28).
Power supplies and high-speed switching digital electronics are typically considered
noise creators, while analog signal conditioning and data conversion circuits are
the ones that should ideally be kept free and clear of them. In Figure 2.28, we see
an example of a floor plan that can be used to
help mitigate circuit interference using physical space as one of the primary means
of achieving the goal.
Another type of electrical constraint is high-voltage considerations, be it in the
form of ac mains-connected power or high-voltage dc power supply sources that
are used in scientific instrumentation systems. Understanding where these inter-
faces and interconnect are going to be at the start of the design can make or break
the success of a high-voltage PCB layout. Once those constraints are understood,
the PCB design engineer can begin considering how to address them: it could be
slots in the PCB to prevent high-voltage breakdown or, if the required distances
cannot be accommodated, encapsulation, surrounding the components with a
material that withstands and suppresses high-voltage discharges better than air can.
A note to the reader: High-voltage distances are called creepage and clearance
distances and are the distance along a surface and through the air, respectively.
IPC-2221B is an example of one reference for determining the required creepage
and clearance distances needed in a specific situation. Also important are product
safety directives and standards, which may impose additional constraints.
The next task is to estimate how many routing layers the design will need. Xilinx
provides an excellent resource that walks through a sound methodology
for estimating the number of routing layers needed to accommodate a PCB
design that includes BGA devices of various sizes. We will be referring heavily to
this resource, UG1099 BGA Device Design Rules.
A very quick way to estimate the number of layers required is based on the
following equation:
Layers = Signals / (Routing Channels × Routes per Channel)
For Xilinx devices, a ratio of 60% of the total BGA pads is given as an approxi-
mate rule of thumb for finding the number of signals that need to be routed. The
other 40% are typically power supply and return pads.
A routing channel is defined as a path to get a trace from a BGA pad to the area
of the PCB outside of that pad, and the number of routes per channel is the number
of traces that can fit within each channel, almost always one or two.
If we use our Zynq device on the SoC Platform as an example, we can deter-
mine how many routing layers that we will need to reserve for the PCB design.
With 400 BGA pins in our device package (the CLG400), we can assume that
at maximum we will have 60% of these pads to route out to other components;
60% of 400 is 240.
The number of routing channels can be determined by examining the PCB foot-
print, and we can refer to Xilinx UG865 for this.
With 20 total rows and columns, there are 19 total routing channels available
per side; and with four sides, that calculates out to be 76 routing channels in total
for the device.
The number of routes per channel can be either one or two; we will calculate
the required layers for both of these cases. The reason for this is that, right now,
we do not have all of the constraints in place to rule out one approach or the other.
We can now plug these values into the equation with which we started to get a
rough idea for the required number of routing layers:
Case 1: One Route per Channel

Layers = 240 / (76 × 1) ≈ 3.2

Case 2: Two Routes per Channel

Layers = 240 / (76 × 2) ≈ 1.6
We cannot have half of a routing layer, so we can say that we will need ap-
proximately two or three routing layers for this design in order to fully route the
Zynq SoC device.
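The estimate above is easy to wrap into a small helper so it can be rerun for other packages. The 60% signal ratio and the channel counting follow the UG1099-style rule of thumb described above; the rounding to a practical whole number of layers is left to the engineer.

def estimate_routing_layers(total_balls: int, balls_per_side: int,
                            routes_per_channel: int) -> float:
    """Rough routing-layer estimate for a square BGA, per the rule-of-thumb method above."""
    signals = 0.6 * total_balls          # roughly 60% of pads carry signals
    channels = (balls_per_side - 1) * 4  # channels between ball rows, on all four sides
    return signals / (channels * routes_per_channel)

# Zynq-7020 in the CLG400 package: 400 balls in a 20 x 20 grid.
for routes in (1, 2):
    layers = estimate_routing_layers(total_balls=400, balls_per_side=20,
                                     routes_per_channel=routes)
    print(f"{routes} route(s) per channel -> ~{layers:.1f} routing layers")
# Prints ~3.2 and ~1.6, which the team rounds to roughly three or two routing layers.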
The additional steps of determining the possibility of routing one or two traces
per channel can be followed step by step in UG1099. They will not be covered in
detail here, but are mentioned for completeness and to give a clear overview of the
complexities that can be involved.
In addition to this basic calculation, a slew of other considerations are impor-
tant in determining what is feasible to expect, among them are:
•• Pad size and pad pitch: The size of the BGA pads and the space between them
directly determine how many traces can fit in each routing channel.
A 1.0-mm pitch device can much more easily accommodate two traces per
channel, while a 0.5-mm pitch device almost assuredly cannot.
•• Fabrication technologies: Things such as via-in-pad, blind and buried vias,
and micro vias (which are part of a category of PCB design technologies re-
ferred to as high-density interconnect (HDI) technologies) can greatly reduce
the number of routing layers required by freeing up space in the PCB sand-
wich for other traces to be routed. We will cover HDI later in this chapter.
•• PCB thickness: PCB thickness has a direct impact on the minimum mechanical
drill size that can be used on any particular design. A typical and
safe aspect ratio is 10:1, meaning that the board cannot be any thicker than
10 times the diameter of the mechanically drilled hole. For a typical drilled
hole of approximately 8 mils (0.008 inch), that means the thickest stack-up
that we can design is 80 mils (see the sketch after this list). If we try to add too many layers to a board and
do not adjust the minimum drill size accordingly, our PCB will be impossible
to manufacture.
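The aspect-ratio constraint in the last bullet is another quick calculation worth keeping next to the stack-up notes; the 10:1 ratio and 8-mil drill below follow the example in the text, while the 93-mil board in the second check is a hypothetical value.

def max_board_thickness_mils(min_drill_mils: float, aspect_ratio: float = 10.0) -> float:
    """Maximum stack-up thickness supported by the smallest mechanical drill."""
    return aspect_ratio * min_drill_mils

min_drill = 8.0  # mils (0.008 inch)
print(f"Max thickness for an {min_drill:.0f}-mil drill: "
      f"{max_board_thickness_mils(min_drill):.0f} mils")  # 80 mils

# Going the other way: a thicker (hypothetical) 93-mil stack-up needs a larger minimum drill.
required_thickness = 93.0
print(f"Minimum drill for a {required_thickness:.0f}-mil board: "
      f"{required_thickness / 10.0:.1f} mils")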
Typically, performing this exercise for the highest pin-count and/or smallest
pad-pitch device will yield a stack-up design and routing layer count that works for
all the devices on the PCB.
However, having three routing layers is not the end of the story here. In addition
to routing layers, we also have to consider power and return layers, and
oftentimes the top and bottom layers of the PCB are so full of components that
routing on them is not always feasible.
Taking all of this into account, we can see that an 8-layer stack-up is probably a
reasonable starting point for this design. However, additional layers may be required,
and ending up with a 10-layer or 12-layer stack-up would be completely reasonable
as well.
inch). The use of a particular style and thickness of dielectric depends greatly on the
situation for which it is being designed.
A note to the reader: The best practice for creating and verifying PCB stack-up
designs is to engage directly with the contract manufacturer that will be fabricating
your PCB. This is the best way to make sure that your PCB stack-up design is able
to be manufactured.
2.9.5.5 HDI
Earlier in this chapter, we touched on several technology types that are part of a
group of fabrication methods referred to as HDI. HDI is one of the more rapidly
evolving spaces in the PCB manufacturing domain. IPC-2226 classifies PCB de-
signs into six groups, or types, and also outlines material types and mechanical and
physical properties that are acceptable for use in these types of designs. Let’s refer
back to our house analogy for PCB design. HDI is similar to adding fancy modern
118 ������������������������������
Hardware Design Considerations
gadgets and extras to your house. You may not need them, but they can make life
easier. They almost always cost more than the old-fashioned way of doing things
(like a tankless water heater versus a traditional tank-style water heater), but often
come with performance benefits.
•• Micro-vias: Typically, a via spanning two layers (layer 1 to layer 2, for ex-
ample) and typically will be a laser-drilled via. These micro-vias usually re-
quire an aspect ratio of 1:1, meaning that they cannot drill deeper than the
diameter of the hole that they create.
•• Via-in-pad: This is exactly what it sounds like, drilling a via directly in the
pad of a component. This technology requires an additional process step of
plugging the via holes so that the solder does not drain into the pad holes.
Via-in-pad can be a traditional through-via, micro-via, or blind-via.
•• Blind vias: A via that only goes partially through the PCB, for example, on
an 8-layer board, from layer 1 to layer 4. These vias can connect any two (or
more) layers that they span.
•• Buried vias: A via that is not exposed to the outer layers of the PCB, for ex-
ample, on an 8-layer board, from layer 4 to layer 6. These vias can connect
any two (or more) layers that they span.
•• Multiple laminations: Any PCB assembly that needs to be laminated (pressed
together) more than one time in order to accommodate blind or buried vias.
•• If you need to design a very small product with a lot of interconnect, you are
likely going to need at least one HDI technology.
•• If you have extremely high-speed digital interfaces, you are likely going to
need some type of advanced HDI processing to achieve signal integrity goals.
•• If you have very fine-pitch devices, you will likely need via-in-pad or micro-
vias to even be able to fit a trace into the BGA area.
There are some handy graphs and plots for assessing when HDI may be the
best path forward; the HDI Handbook [23] has one such region map that tells
us that the key metrics for assessing this are component density (the number of
parts per square inch) and component complexity (the number of pins per compo-
nents). So just because you are working on a design with 1,000 components with
80 components per square inch, it does not mean that you need HDI if 998 of those
2.9 PCB Layout: Not for the Faint of Heart 119
components have two pins. Conversely, just because a design has only 5 parts per
square inch, it does not preclude the use of HDI if the components have an average
of 80 pins.
•• Very tight control of the dielectric constant over frequency and environmen-
tal conditions;
•• Low loss, or dissipation factor at high frequencies (specified as high as 40
GHz).
Figure 2.31 Dielectric constant variation versus frequency of standard FR-4 style material.
There are many aspects of PCB design that we did not cover here, things such
as design for manufacturability (DFM), design for assembly (DFA), and design
for test (DFT). These topics are all broad and deep at the same time and really are
worthy of entire books all on their own.
Hopefully, this has shed some light on the complexities and trade-offs to expect
with PCB design work and why it truly takes a talented team of engineers to design
a highly functioning PCB. If the engineering team working on the SoC Platform
takes these factors into consideration, they will be on the path to success, even if
there are many bumps along the way.
References
[17] IEEE SA, “802.3-2018 - IEEE Standard for Ethernet,” February 15, 2020. https://stan-
dards.ieee.org/standard/802_3-2018.html.
[18] Bel Magnetic Solutions, February 15, 2020. https://belfuse.com/resources/drawings/mag-
neticsolutions/dr-mag-0826-1x1t-32-f.pdf.
[19] Murata, “Chip Multilayer Ceramic Capacitors,” November 27, 2017. https://www.mu-
rata.com/~/media/webrenewal/support/library/catalog/products/capacitor/mlcc/c02e.
ashx?la=en-us.
[20] International Electrotechnical Commission (IEC), “IEC 61508:2010 CMV,” March 28,
2020. https://webstore.iec.ch/publication/22273.
[21] PCI SIG, “PCI Express 4.0 Electrical Previews,” 2014. https://pcisig.com/sites/default/files/
files/PCI_Express_4_0_Electrical_Previews.pdf.
[22] Bogatin, E., Signal and Power Integrity: Simplified, Upper Saddle River, NJ: Prentice Hall,
2009.
[23] Holden, H., et. al, The HDI Handbook, Seaside, OR: BR Publishing, 2009, https://www.
hdihandbook.com/.
[24] Bogatin, E., and L. D. Smith, Principles of Power Integrity for PDN Design—Simplified:
Robust and Cost Effective Design for High Speed Digital Products, Upper Saddle River,
NJ: Prentice Hall, 2017.
[25] Graham, M., and H. Johnson, High Speed Digital Design: A Handbook of Black Magic,
Upper Saddle River, NJ: Prentice Hall, 1993.
[26] Johnson, H., High Speed Signal Propagation: Advanced Black Magic, Upper Saddle River,
NJ: Prentice Hall, 2003.
[27] Cooper, R., Winning at New Products: Creating Value Through Innovation, 3rd ed., New
York: Basic Books, 2011.
[29] Sparx Systems Pty Ltd., “Enterprise Architect 14,” August 22, 2019. https://www.sparxsys-
tems.com/products/ea/14/index.html.
[29] Reliable Software Technologies, “Mew Models for Test Development,” 1999. http://www.
exampler.com/testing-com/writings/new-models.pdf.
[30] Forsberg, K., and H. Mooz, “The Relationship of Systems Engineering to the Project Cy-
cle,” Joint Conference of the National Council on Systems Engineering (NCOSE) and
American Society for Engineering Management (ASEM), Chattanooga, TN, October 21–
23, 1991. https://web.archive.org/web/20090227123750/http://www.csm.com/repository/
model/rep/o/pdf/Relationship%20of%20SE%20to%20Proj%20Cycle.pdf.
[31] INCOSE, “About: INCOSE,” https://www.incose.org/about-incose.
[32] Requirements Working Group, International Council on Systems Engineering (INCOSE),
“Guide for Writing Requirements,” San Diego, CA, 2012.
[33] International Electrotechnical Commission (IEC), “Available Basic EMC Publications,”
August 22, 2019. https://www.iec.ch/emc/basic_emc/basic_emc_immunity.htm.
[34] RoHS Guide, “RoHS Guide,” August 22, 2019. https://www.rohsguide.com/.
[35] European Commission, “REACH,” August 22, 2019. https://ec.europa.eu/environment/
chemicals/reach/reach_en.htm.
[36] Ericcson Inc., August 22, 2019. https://telecom-info.telcordia.com/site-cgi/ido/docs.cgi?ID
=SEARCH&DOCUMENT=SR-332&.
[37] BKCASE Editorial Board, “The Guide to the Systems Engineering Body of Knowledge (SE-
BoK), v. 2.0,” Hoboken.
[38] SEBoK Authors, “Scope of the SEBoK,” August 18, 2019. https://www.sebokwiki.org/w/
index.php?title=Scope_of_the_SEBoK&oldid=55750.
[39] SEBoK Authors, “Product Systems Engineering Background,” August 18, 2019.
https://www.sebokwiki.org/w/index.php?title=Product_Systems_Engineering_
Background&oldid=55805.
[40] SEBoK Authors, “System Design,” August 18, 2019. https://www.sebokwiki.org/w/index.
php?title=System_Design&oldid=55827.
122 ������������������������������
Hardware Design Considerations
3.1 Introduction
123
124 ��������������������������
FPGA Design Considerations
In this chapter, we will explore these techniques in detail, thus enabling the
SensorsThink engineering team to correctly architect and implement the FPGA
solution.
The development process for an FPGA follows elements of both hardware and
software development.
The FPGA development flow starts in the systems engineering phase, when the
subsystem segmentation allocates specific requirements to be implemented within
the FPGA. These requirements are further expanded upon with the FPGA require-
ments, which includes not only the system-allocated requirements, but also derived
requirements that are required for the FPGA. These derived requirements will in-
clude such requirements as FPGA technology (SRAM, flash, or One Time Program-
mable), operating temperature, operating frequency, and fixed-point performance
requirements.
Once the requirements have been agreed upon, the next stage is the architec-
tural design. During the architectural design, the architecture and hierarchy of the
design are outlined, reviewed, and agreed upon. This architecture will identify the
following:
•• Clock architecture: The number of different clock domains and clock do-
main crossing approach.
•• Functional architecture: The number of modules required to implement the
overall requirement set.
3.2 FPGA Development Process 125
Following sign-off of the architecture, the next stage of the development pro-
cess is detailed design and verification; this will include:
constraints will help to achieve the required performance across the op-
erating conditions. Incorrectly defined timing constraints may result in
the device not working as expected across the operating conditions. It
may also significantly increase the implementation time as the imple-
mentation tool searches for a solution that is not realistic or achievable.
• Placement constraints: Define the physical location of logic elements/
modules or groups of modules within the FPGA. Placement constraints
can be used to help the tool achieve the timing performance and provide
isolation between functional modules for security or safety.
•• Synthesis: Synthesis translates the HDL written at a register transfer level into
the logic resources, which are available in the targeted FPGA. To guide the
synthesis tool, synthesis constraints may be used for timing-driven synthesis.
•• Placement: Places the synthesized logic resources within the FPGA logic
elements.
•• Routing: The final stage of the implementation. Routing attempts to connect
all the placed logic elements in the FPGA together, while achieving the timing
performance defined by the constraints.
•• Bit stream generation: Once the implementation has been completed, the
implemented design needs to be converted into a format that can be down-
loaded or used to program the FPGA.
•• FLIR Lepton 3.5: interface with the FPGA using I2C and SPI interfaces;
•• LSM303AHTR Magnetometer Sensor: interfaces with the FPGA using a I2C
interface;
•• SHTW2 Temperature Sensor: interfaces with the FPGA using an I2C
interface;
•• ADCMXL3021 Vibration Sensor: interfaces with the FPGA using an SPI;
•• LSM6DS3USTR Accelerometer Sensor: interfaces with the FPGA using an
I2C sensor;
•• HX94 Remote Humidity and Temperature Sensor: interfaces with the FPGA
using XADC.
The FPGA is also able to help the processor expand its limited multiplexed IO
(MIO). The processor has 54 MIO pins that can be implemented in interfaces such
as I2C, SPI, Gigabit Ethernet (GigE), and Universal Serial Bus (USB), as shown in
Figure 3.1. However, in some applications, the processor requires more MIO than
are available. In this instance, for many of the low-speed interfaces, the peripheral
signals can be routed to the programmable logic. This enables the PS interfaces to
use the programmable logic IO for interfacing; this is called extended MIO (EMIO)
as the MIO are extended into the programmable logic. In the SensorThink solu-
tion, EMIO is used to route out PL IO pins to support the Wi-Fi and Bluetooth
interfaces.
At the FPGA requirements level, the requirements must trace back to the
higher-level system requirements. The FPGA requirements can be seen in detail at
SensorsThink.com.
128 ��������������������������
FPGA Design Considerations
These requirements drive the overall architecture of the solution between the
PS and the PL.
each of the dedicated sensor controllers. Once the sensor data from all four of the
sensors has been received, the words are bundled together and written into the
BRAM and an interrupt is issued to the processing system. The processing system
can then, on receipt of this interrupt, read the BRAM via the AXI BRAM control-
ler IP. Once in the processing system, the software is then able to process and work
on the data.
Communicating with the controller and scheduling block using the BRAM en-
ables the processing system to define several parameters used in the control and
scheduling block, for example, the time between sampling cycles.
The FLIR input path is like the sensor path; however, the FLIR imager, once
configured, will be free-running outputting frames at 9 Hz. The FLIR Lepton is
configured over an I2C link; as such, in this application, the M1 processor is used
connected to an I2C interface. The M1 processor can configure the FLIR Lepton
under the control of the processing system, for example, changing the pixel out-
put from grayscale to red, green, blue (RGB) false color or enabling the auto gain
control. The M1 processor is operating from a program stored within BRAM; as
this memory is dual-ported, updates to the M1 program can be executed by the
processing system.
Video data output from the FLIR Lepton uses a video over SPI format. This
means that packets of data are sent over the master in, serial out (MISO) port,
when the master provides the serial clock (SCLK) and asserts chip select. In the
video over SPI format, the master out, serial in (MOSI) port is not used.
The video packets output by the Lepton are either valid packets or invalid pack-
ets to comply with export regulations regarding frame rate. Once a valid packet is
output with a valid line number, each of the line numbers increment until the entire
frame has been read out. If the packet header indicates that the frame is incorrect,
it needs to read out the entire corrupted frame and discard it.
As the frame rate is so low that a custom register-transfer level (RTL) block will
be created that checks the frame header to determine if the header is valid or not.
If the header is valid, the frame data will be written to block RAM, until the entire
frame has been read out and written to BRAM. Once this has been completed, an
interrupt will be generated to inform the processor a new image is available.
The processor can then access the image from the BRAM using the block ran-
dom access memory (RAM) controller over the maser AXI interface.
Now that we understand the architecture, the next step is to start creating and
verifying the RTL.
Before the SensorsThink engineering team begins to start RTL development, they
should consider the use of existing intellectual property blocks, often called IP
blocks. These IP blocks are provided by device vendors, third-party commercial
IP vendors, and open-source initiatives (e.g., open cores). IP blocks fall within two
groupings. The first group is those that are created to implement commonly used
functions, for example, block RAM memories, first in, first out (FIFOs), and clock-
ing structures. These basic IP building blocks are universal across a diverse range
of FPGA developments and can be used to accelerate the design capture process
3.4 Pin Planning and Constraints 131
by providing IP blocks that enable the designer to focus on creating the value-
added element of the solution, while the second group is domain-specific IP blocks.
Domain-specific IP blocks focus upon a specific domain and implementation of
functions commonly used within that domain. Example domains include quantita-
tive finance, machine learning, and embedded vision. Domain-specific IP blocks are
often significantly more complex than basic building blocks, for example, imple-
menting algorithms such as color space conversion, video layering for embedded
vision applications, or rate setting for quantitative finance. Using IP blocks provides
a significant time-scale reduction as it reduces the number of custom blocks that
need to be developed and verified. Many IP blocks also use the AXI standard out-
lined above that further reinforces the benefit of using a standard interface, as easy
integration with existing IP blocks is provided.
The inclusion of a programable logic device on our Smart Sensor SoC means that
we need to carefully consider the pin allocations. Programmable logic offers com-
plete flexibility in assigning pins to signals; however, to get the optimal solution, we
need to undertake careful pin planning. This pin planning is required to ensure that:
•• Correct clocking resources are used: Xilinx Seven series provide both global
and regional clocks.
•• Correct pins are allocated for differential pairs.
•• Correct IO standard is allocated to an IO bank.
•• All IO standards used on a bank are compatible.
Once the MIO and DDR entries have been created, we are able to elaborate
the design. From an elaborated design, we can export an IBIS file that describes the
MIO and DDR interfaces. This can be critical for verifying that high-speed proces-
sor interfaces in the PCB layout have the correct signal integrity. As this can impact
the design of the board, it is best to generate the IBIS file early on in the design
process (Figure 3.7).1
1. The use of these IBIS files for signal integrity performance is discussed in detail in Chapter 2.
3.4 Pin Planning and Constraints 133
With the processor element of the pin allocation completed, we can create a
pin planning project to verify the pins allocated to the programmable logic half of
the Zynq.
Here we can allocate a pin to an FPGA signal and IO standard and control the
speed and drive strength and termination. Once all the pins have been allocated,
the developer can run a design rule check (DRC) to ensure that the FPGA pin al-
location does not break the IO banking rules. Once complete, the pin out must be
verified to be valid by passing the DRC.
These pins can be written out then as Xilinx Design Constraints File, which is
how the Vivado understands which pin and IO standard to use for specific designs
input and output.
set_property PACKAGE_PIN C20[get_ports HX94B_TEMP_BUF]
set_property PACKAGE_PIN E17[get_ports HX94B_TRH_BUF]
134 ��������������������������
FPGA Design Considerations
Constraints are particularly important in FPGA design for not only ensuring ex-
ternal interfacing is defined, but also a range of other applications ranging from pin
allocation to controlling timing performance and even location of logic in the pro-
grammable logic implementation. Now SensorsThink engineers have understood
how we can pin plan our FPGA design (Figure 3.8); let us take a few moments to
understand the wider role of constraints in our programmable logic development.
As always, there are a few constraints which sit outside these groups. Vivado
has three, and they are predominantly used on the netlist.
signal integrity tool may be used; these tools require an IBIS model. We can extract
an IBIS model of our design from Vivado when we have the implemented design
open. Using the File->Export->Export IBIS model option, this file can then be used
to close the system level signal integrity issues and timing analysis of the final PCB
layout.
Once the design team is happy with the signal integrity performance and timing
of the system as a whole, we end up with a number of constraints as below for the
IOs in the design.
set_property PACKAGE_PIN G17 [get_ports {dout}]
set_property IOSTANDARD LVCMOS33 [get_ports {dout}]
set_property SLEW SLOW [get_ports {dout}]
set_property DRIVE 4 [get_ports {dout}]
With the HP IO banks, we can also use the digitally controlled impedance to
terminate the IO correctly and increase the signal integrity of the system without
the need for external termination schemes. We must also consider the effects of the
IO if there is no signal driving it. For instance, if it is connected to an external con-
nector, we can use the IO constraints to implement a pull-up or pull-down resistor
to prevent the FPGA input signal from floating, which can cause system issues.
We can also use physical constraints to improve the timing of our design by
implementing the final output flip-flop within the IO block itself; doing so reduces
the clock to output timing. We can also do the same thing on input signals which
allows the design to meet the pin-to-pin setup and hold timing requirements.
3.4 Pin Planning and Constraints 137
To determine with which type of clock we are dealing with, we can use the
clock report produced by Vivado to aid us in identifying asynchronous and unex-
pandable clocks.
With these identified, we can use the set clock group constraint to disable tim-
ing analysis between them. Vivado uses SDC-based constraints; as such, we can use
the command below to define a clock group:
set_clock_groups –name –logically_exclusive –physically_exclusive –asynchro-
nous –group
The –name is the name given to the group; the –group option is where one can
define the members of the group (i.e., the clocks that have no timing relationship).
The logically and physically exclusive options are used when we have multiple
clock sources that are selected between to drive a clock tree (e.g., BUFGMUX or
BUFGCTL); therefore, the clocks cannot be present upon the clock tree at the same
time. As such, we do not want Vivado to analyze the relationship between these
clocks, as they are mutually exclusive, while the –asynchronous is used to define
asynchronous clock paths.
The final aspect of establishing the timing relationship is to consider the noni-
deal relationship of the clocks in particular jitter. Jitter needs to be considered in
two forms: input jitter and system jitter. Input jitter is present upon the primary
clock inputs and is the difference between when the transition occurs and when it
should have occurred under ideal conditions, while system jitter results from noise
existing within the design.
We can use the set_input_jitter constraint to define the jitter for each primary
input clock, while the system jitter is set for the whole design (those are all the
clocks) using the set_system_jitter constraint.
What this means for the simple example that we have been following is that the
hold multiplier is defined by the equation
•• BEL: The Basic Element of Logic (BEL) allows a netlist element to be placed
within a slice.
•• LOC: Location (LOC) places an element from the netlist to a location within
a device.
•• PBlock (Physical Block): This can be used to constrain logic blocks to a re-
gion of the FPGA.
Thus, while a LOC allows you to define a slice or other location within the de-
vice, a BEL constraint allows you to target at a finer granularity than the LOC and
identify the flip-flop (FF) to use within the slice. PBlocks can be used to group logic
together; they are also used for defining logical regions when we wish to perform
partial reconfiguration.
The PBlock is great for when we wish to segment large areas of our design or
if we wish to perform partial reconfiguration. In some instances, we will wish to
group together smaller logic functions to ensure that the timing is optimal. While
we could achieve this using PBlocks, it is more common for us to use relationally
placed macros (RPMs).
RPMs allow design elements such as DSP, FF, lookup table (LUT), and RAMS
to be grouped together in the placement. Unlike PBlocks, they do not constrain the
location of these to a specific area of the device (unless you want it to) but instead
RPMs group these elements together when they are placed.
Placing design elements close together allow us to achieve two things. It allows
us to improve resource efficiency, and it means that we can fine-tune interconnec-
tion lengths to enable better timing performance.
140 ��������������������������
FPGA Design Considerations
The RLOC constraints use the definition RLOC = XmYm where the X and
Y relate to the coordinates of the FPGA array. When we define an RLOC, we can
make this either a relative or absolute coordinate. Depending upon if we add in the
RPM_GRID attribute, this makes the definition absolute and not relative.
As these constraints are defined within the HDL, as in Figure 3.12, it is often
necessary for a place and route iteration to be run initially before the constraints
are added to the HDL file to correctly define the placement.
As we start developing the application for our Smart Sensor SoC, engineers at
SensorsThink need to be careful with the clocking architecture used within the
programmable logic. Data is transferred between registers within the FPGA syn-
chronously on a clock edge.
As we firm up the architecture of the programmable logic implementation,
we may need to use several different clocks at different frequencies. The need for
several clocks within the solution stems from the need to be able to interface with
the sensor at specific clock frequencies. The four local sensors located on the board
range in clock frequencies from 400 kHz to 14 MHz (Table 3.1).
The main clock to transfer data into the processing system is at a much higher
rate, as it must be able to read multiple sensors in a short period of time. This
means that sensor data must safely be transferred between the sensor clock and the
processing system clock.
Transferring data between clocks can present several challenges to us as we
develop the programmable logic solution. Failure to implement a safe transfer be-
tween clocks can result in a data corruption that could impact the performance of
the system.
Transferring data between the clocks is often referred to as CDC; safely trans-
ferring data from one clock to another requires the designer to understand the
concept of a clock domain.
A clock domain is a grouping of clocks that are all related. Related means
that the clocks are generated from the same source. A clock domain may include
generated clocks from the source at different frequencies using a phase lock loop,
multiclock manager, or traditional counter. Clocks generated in this way are related
if the output clock is an integer multiple of the source clock. Related clocks there-
fore have a known and unchanging timing relationship between clock edges; this
enables flip-flop setup and hold windows to be achieved. This means transferring
data between clocks in the same clock domain does not require any special syn-
chronization or handshaking (Figure 3.13).
Clocks are in a different clock domain if the source of the clock is different,
even if the clock is the same frequency. Generated clocks are also in different clock
domains if they are not an integer multiple of the source clock. This fractional
relationship guarantees that the setup and hold windows of registers cannot be
achieved (Figure 3.14).
Violating the setup and hold time of the registers results in registers going meta-
stable; that is, the output of the flip-flop goes to a value between 0 and 1. Eventual-
ly, the flip-flop will recover, and the output will go to either a 0 or 1. This recovery
time of the flip-flop and the random resulting output value mean that if we do not
correctly synchronize CDC, our system can end up with corrupted data.
Multiclock designs therefore require additional analysis and design effort to
ensure that CDC is safe and reliable.
Detecting CDC within a design can be complex; it is unlikely to be detected
during RTL simulation, as this simulation does not include timing information
for registers, whereas gate-level simulation that does include timing information
requires considerable time and that the right question be asked (e.g., the unrelated
clocks being timed exactly right to cause metastability). This is hard to create in a
simulation but oddly easy to create on the deployed system in the field.
The engineers at SensorsThink would be well served to perform clock domain
analysis; it is a critical analysis to be performed in multiclock domains. There are
several ECAD tools available that can perform CDC analysis. For the development
Figure 3.13 Related clock: a 10-MHz source used to generate a 5-MHz clock.
Figure 3.14 Unrelated clock: a 10-MHz source used to generate a 4-MHz clock.
142 ��������������������������
FPGA Design Considerations
of the programmable logic design, engineers at SensorsThink use the Blue Pearl
Software Visual Verification Suite (Figure 3.15). This provides us the ability for
static code analysis and CDC analysis; Blue Pearl can analyze our design and iden-
tify not only the clock domains, but also signals that cross from one clock domain
to the next.
Once the clock domain crossing signals have been identified, we can implement
CDC structures to safely transfer data from one clock domain to another. Safely
crossing clock domains requires the ability to resynchronize signals from one clock
domain to another. The synchronization scheme used depends upon the type of
signal being passed between clock domains.
•• Single static signal: Transferring a single signal between clock domains can
be implemented using a two-register synchronization structure.
•• Pulsed data: Transferring pulsed data, especially from a faster clock domain
to a slower clock domain, can present a challenge.
•• Data Bus: Mux-based synchronizer.
•• Data Bus: Handshake synchronizer.
•• Data Bus: Asynchronous FIFO.
•• Data Bus: Gray code.
We must also be careful to ensure that, when working with single logic signals,
several single logic signals do not converge in a receiving clock domain (Figure
3.16), as the random delay through the synchronizer path can result in logic issues.
As we are implementing our design targeting a Xilinx programmable logic de-
vice, we should consider the specialist CDC structures provided. If we need to
implement CDC structures within a Xilinx programmable logic, we can leverage
the Xilinx Parameterized Macros (XPM), which provide several CDC structures.
Figure 3.15 Clock domain identification using Blue Pearl (1 synchronized and 2 unsynchronized
crossings).
3.6 Test Bench and Verification 143
Figure 3.16 Reconvergence of two standard logic vectors in a different clock domain.
verification and test benches, we are ensuring that the UUT performs against its
specification; with the size and complexity of modern FPGA designs, this can be a
considerable task. The engineering team must therefore determine at the start of the
project what the verification strategy will be; choices can include:
Whatever the verification strategy, the engineering team will need to produce
a test plan that shows how the individual modules and the final FPGA are to be
verified and how all requirements will be addressed.
Typically, to ensure that you can verify most requirements placed upon the
FPGA, a test bench is required to stimulate the UUT, thus ensuring that the UUT
performs as specified.
Figure 3.18 Boundary condition and corner cases for a 16-bit adder.
146 ��������������������������
FPGA Design Considerations
65,535, and finally when A equals 65,535 and B equals 0. The boundary cases are
the values between the corner cases.
In some applications, we may also want to stress test the UUT to ensure a mar-
gin above and beyond the normal operation. Stress testing will change depending
upon the module, but it could involve repeated operation, attempting to overflow
buffers and FIFOs. Stress testing a design allows the verification engineer to have a
little fun trying to break the UUT.
Achieving 100% in the above parameters will not prove that the UUT is meet-
ing functional requirements; however, it does easily identify sections of the UUT
that have not been exercised by the test bench. It is especially important at this
point to mention that code coverage only addresses what is within the UUT.
Good output checking functions can make use of the VHDL signal attributes
stable, delayed, last_value, and last_event. These attributes can be especially im-
portant when confirming that the UUT is compliant with the timing interfaces
required for memories or other interfaces. These can also be useful for the ensuring
that setup and hold times are achieved and reporting violations if they are not.
•• Moore: State machine outputs are a function of the present state only, a clas-
sic example is a counter.
•• Mealy: State machine outputs are a function of the present state and inputs,
a classic example of this is the Richards controller.
1. State box: This is associated with the state name and contains a list of the
state outputs (Moore).
2. Decision box: This tests for a condition being true and allows the next state
to be determined.
3. Conditional output box: This allows the state machine to describe Mealy
outputs dependent upon the current state and inputs.
Figure 3.21 Algorithmic state chart for the state machines in Figure 3.20 Moore (left) and Mealy
(right).
A state machine does not have to be just Moore or Mealy; it is possible to cre-
ate a hybrid state machine that uses both styles to create a more efficient implemen-
tation of the function required.
�Figure 3.22 Showing simulation results for Mealy and Moore outputs (Mealy bottom).
process contains both current and next state logic within a single process. The al-
ternative two process approach separates the current and next state logic.
As a rule, engineers prefer to implement a single process state machine, these
are seen as having several benefits over the traditional two process approach:
Regardless of which approach you decide to use to implement it, the next state
determination and any outputs will be evaluated using a CASE statement. Figure
3.23 shows a side-by-side comparison between a Moore (left) and Mealy (right)
using a single process approach.
Both sequential and Gray encoding schemes will require several flip-flops that
can be determined by:
LOG10 ( States )
FlipFlops = Ceil
LOG10 ( 2)
while one-hot encoding schemes require the same number of states as there are
flip-flops.
The automatic assignment of state encoding depends upon the number of states
that the state machine contains. While this will vary depending upon the synthesis
tool selected, as a rule of thumb, the following encoding will be used:
Often, you will not necessarily think about what state encoding to use, instead
allowing the synthesis engine to determine the correct implementation getting in-
volved if the chosen style causes an issue. However, should the engineers need to
take things into their own hands and define the state encoding, there is no need
for them to define the state encoding long-hand, defining constants for each state
within the state encoding. Instead, the engineers can use an attribute within the
code to drive the synthesis tool to choose a particular encoding style as demon-
strated below.
TYPE state IS (idle, led_on, led_off);
SIGNAL current_state : state := idle;
ATTRIBUTE syn_encoding STRING;
ATTRIBUTE syn_encoding OF current_state : SIGNAL IS “sequential”;
3.7 Finite State Machine Design 153
where “sequential” can be also “gray” and “onehot”; these three choices can also
be combined with the safe attribute to ensure that the state machine can recover to
a valid state, should the state machine enter an illegal state.
You can also use the syn_encoding attribute to define the values of the state
encoding directly. For example, suppose that the engineer desired to encode the
3-state state machine using the following state encoding; Idle = “11” led_on =
“10,” led_off = “01” as opposed to the more traditional sequence of “00,” “01,”
and “10.”
TYPE state IS (idle, led_on, led_off) ;
SIGNAL current_state : state := idle;
ATTRIBUTE syn_encoding STRING;
ATTRIBUTE syn_encoding OF current_state : SIGNAL IS “sequential”;
As the engineer, you are responsible for setting the correct settings in the syn-
thesis tool to ensure that attributes are not ignored.
Figure 3.24 Implementation and timing performance for the state machine when all functionality
is included.
Figure 3.25 Implementation and timing performance for the state machine when all functionality
is decoupled.
Decoupling the state machine from the counters, RAM interfacing, and shift
register results in slightly decreased utilization, but also a greatly increased maxi-
mum frequency of 228.88 MHz.
3.7 Finite State Machine Design 155
Note that the equation for conversion from WNS to Fmax is 1/(T – WNS)
where T is the required target clock period. In this case, it is 10 ns, as both designs
were targeted for 100-MHz operation.
Following these good design practices will enable our state machines to be
not only effective, but also easier to debug and maintain as the life of the product
evolves, which it no doubt will.
ID and CRC followed by 160 bytes of video when the imager is configured for
raw output or 240 bytes when outputting RGB888. To read out the entire array,
we need to read out 240 packets. To ensure frame compliance with export control
rules, only 9 valid frames per second are output; in between these frames, the FLIR
Lepton outputs corrupted frames. To maintain synchronization, these corrupted
frames must be read out and discarded. After this has been achieved, the FLIR and
the Zynq will be synchronized; to maintain synchronization, the Zynq must:
1. Not violate the intrapacket timeout. Once a packet starts, it must be com-
pletely clocked out within 3 line periods.
2. Not fail to read out all packets for a given frame before the next frame is
available.
3. Not fail to read out all available frames.
CRC, or video data. If the video packet is corrupted, it is discarded and the remain-
ing video is read out. If the packet is valid, then the video is written to the connect-
ed block RAM to enable access from the processor for the image to be rendered.
The implementation for the state machine results in the images of FLIR Lepton
being able to be synchronized and maintain synchronization with the Zynq and the
images being made available to the process for further deployment in the Internet
of Things (IOT) application.
As we develop our Smart Sensor SoC, we must be careful about ensuring that it
can work in harsh environments as outlined in the requirements in Chapter 1. This
harsh environment could mean that the system has to be able to work in electrically
noisy, high-altitude, or dynamic environments.
Engineers at SensorsThink must therefore carefully consider the operating en-
vironment against the need to design defensively in the implementation of state
machines. If the SensorsThink engineering team considers it appropriate to design
defensively, the engineer must take additional care in the design and implementa-
tion of the state machines (as well as all accompanying logic) inside the program-
mable logic device to ensure that they can function within the requirements.
One of the major causes of errors within state machines is single-event upsets
caused by either a high-energy neutron or an alpha particle striking sensitive sec-
tions of the device silicon. Single event upsets (SEUs) can cause a bit to flip its state
(0 -> 1 or 1 -> 0), resulting in an error in device functionality that could potentially
lead to the loss of the system or even endanger life if incorrectly handled. Because
these SEUs do not result in any permanent damage to the device itself, they are
called soft errors.
Terrestrially single event upsets are more likely to occur at higher altitudes and
certain latitudes.
The backbone of most programmable logic design is the finite state machine,
a design methodology that engineers use to implement control, data flow and al-
gorithmic functions. When implementing state machines within FPGAs, designers
will choose one of two styles (binary or one-hot), although, in many cases, most
engineers allow the synthesis tool to determine the final encoding scheme. Each
implementation scheme presents its own challenges when designing reliable state
machines for mission-critical systems. Indeed, even a simple state machine can en-
counter several problems (Figure 3.27). You must pay close attention to the encod-
ing scheme and, in many cases, take the decision about the final implementation
encoding away from the synthesis tool.�
Figure 3.27 Even a simple state machine can encounter several types of errors.
2N number of states when defining the state machine signal and cover the unused
states with the “others clause” at the end of the case statement. The others clause
will typically set the outputs to a safe state and send the state machine back to its
idle state or another state, as identified by the design engineer. This approach will
require the use of synthesis constraints to prevent the synthesis tool from optimizing
these unused states from the design, as there are no valid entry points. This typically
means synthesis constraints within the body of the RTL code (“syn_keep”).
The second method of handling the unused states is to cycle through them at
startup following reset release. Typically, these states also keep the outputs in a safe
state; should they be accidentally entered, the machine will cycle around to its idle
state again.
One-hot state machines have one flip-flop for each state, but only the current
state is set high at any one time. Corruption of the machine by having more than
one flip-flop set high can result in unexpected outcomes. You can protect a one-
hot machine from errors by monitoring the parity of the state registers. Should
you detect a parity error, you can reset the machine to its idle state or to another
predetermined state.
With both methods, the state machine’s outputs go to safe states and the state
machine restarts from its idle position. State machines that use these methods can
be said to be SEU detecting, as they can detect and recover from an SEU, although
the state machines’ operation will be interrupted. You must take care during syn-
thesis to ensure that register replication does not result in registers with a high fan-
out being reproduced and hence left unaddressed by the detection scheme. Take
care also to ensure that the error does not affect other state machines with which
this machine interacts.
Many synthesis tools offer the option of implementing a safe state machine
option. This option often includes more logic to detect the state machine enter-
ing an illegal state and send it back to a legal one, normally the reset state. For
3.8 Defensive State Machine Design 159
a high-reliability application, design engineers can detect and verify these illegal
state entries more easily by implementing any of the previously described meth-
ods. Using these approaches, the designers must also consider what would happen
should the detection logic suffer from an SEU. What effect would this have upon
the reliability of the design? Figure 3.28 is a flowchart that attempts to map out the
decision-making process for creating reliable state machines.�
The techniques presented so far detect or prevent an incorrect change from one
legal state to another legal state. Depending upon the end application, this could re-
sult in anything from a momentary system error to the complete loss of the mission.
Figure 3.28 This flowchart maps out the decision process for creating reliable state machines.
160 ��������������������������
FPGA Design Considerations
state and hence been reset to idle without generating the needed signal. To avoid
deadlock, it is therefore good practice to provide timeout counters on critical state
machines. Wherever possible, these counters should not be included inside the state
machine but placed within a separate process that outputs a pulse when it reaches
its terminal count. Be sure to write these counters in such a way as to make them
reliable.
When checking that a counter has reached its terminal count, it is preferable to
use the greater-than-or-equal-to operator, as opposed to just the equal-to operator.
This is to prevent an SEU from occurring near the terminal count and hence no
output being generated. You should declare integer counters to a power of 2, and,
if you are using VHDL, they should also be modulo to the power of 2 to ensure
in simulation that they will wrap around as they will in the final implementation
[count <= (count + 1) Mod 16; for a 0-bit to 15-bit integer counter]. Unsigned
counters do not require this because there is no simulation mismatch between RTL
and post-route simulation regarding wraparound. You can replicate data paths
within the design and compare outputs on a cycle-by-cycle basis to detect whether
one of them has been subjected to an SEU event. Wherever possible, edge-detect
signals to enable the design to cope with inputs that are stuck high or stuck low.
You should analyze each module within the design at design time to determine how
it will react to stuck-high or stuck-low inputs. This will ensure that it is possible to
detect these errors and that they cannot have an adverse effect upon the function
of the module.
Using the state machine presented in Figure 3.27 as an example, the attribute
can be applied in the RTL source code as shown.
type state is (idle, rd, wr, unmapped);
signal current_state : state;
signal transfer_word : std_logic_vector(31 downto 0);
162 ��������������������������
FPGA Design Considerations
Proof the state machine has been implemented using a Hamming 3 encoding
can be observed in both the synthesis report and the elaborated design.
Along with the finite state machine control structures, SensorsThink FPGA engi-
neers are going to need to implement several mathematical algorithms. It is there-
fore crucial that all the SensorsThink engineering team understand how math is
implemented in programmable logic. One of the many benefits of an FPGA-based
solution is that we engineers can implement the mathematical algorithm in the best
possible solution for the problem at hand. For example, if response time is critical,
then we can pipeline stages of mathematics. If accuracy of the result is more criti-
cal, we can use more bits to ensure the desired accuracy is achieved. Many modern
FPGAs also provide us the benefit of embedded multipliers and DSP slices, which
can be used to ensure the optimal implementation is achieved in the target device.
the range of a signed number depends upon the encoding scheme used: sign and
magnitude, one’s complement, or two’s complement.
Sign and magnitude utilize the leftmost bit to represent the sign of the number
(0 = positive, 1 = negative); the remainder of the bits represents the magnitude.
Therefore, in the sign and magnitude system, both positive and negative numbers
have the same magnitude; however, the sign bit differs. Due to this, it is possible
to have both a positive and negative zero within the sign and magnitude system.
The one’s complement uses the same unsigned representation for positive num-
bers as sign and magnitude representation. However, for negative numbers, the
inversion (one’s complement) of the positive number is used.
The two’s complement is the most widely used encoding scheme for represent-
ing signed numbers. Just like sign and magnitude, one’s complement schemes that
are positive are represented in the same manner as an unsigned number, whereas
negative numbers are represented as the binary number that you add to a positive
number of the same magnitude to get zero. A negative two’s complement number is
calculated by first taking the one’s complement (inversion) of the positive number
and then adding 1 to it. The two’s complement number system allows subtraction
of one number from another by performing an addition of the two numbers. The
range that a two’s complement number can represent is given by
–(2n – 1) to +(2n – 1 – 1)
One method that we can use to convert a number to its two’s complement
format is to work right to left, leaving the number the same until the first one is
encountered; after this, each bit is inverted.
The normal way of representing the split between integer and fractional bits within
a fixed-point number is x,y where x represents the number of integer bits and y
represents the number of fractional bits. For example, 8,8 represents 8 integer bits
and 8 fractional bits, while 16,0 represents 16 integer buts and 0 fractional bits. In
many cases, the correct choice of the number of integer and fractional bits required
will be undertaken at design time normally following conversion from a floating-
point algorithm. Thanks to the flexibility of FPGAs, we can represent a fixed-point
number of any bit length; the number of integer bits required depends upon the
maximum integer value that the number is required to store, while the number of
fractional bits will depend upon the accuracy of the final result. To determine the
number of integer bits required, we can use the following equation
LOG10 Integer_Maximum
Integer Bits Required = Ceil
LOG10 2
For example, the number of integer bits required to represent a value between
0.0 and 423.0 is given by
164 ��������������������������
FPGA Design Considerations
LOG10 423
9 = Ceil
LOG10 2
meaning that we would need 9 integer bits, allowing a range of 0 to 511 to be rep-
resented. If we want to represent the number using 16 bits, this would allow for 7
fractional bits. The accuracy that this representation would be capable of providing
is given by
Storing the integer of the result (9) will result in the number being stored as
1.37329101563 × 10–4 (9/65,536). This difference between the number required to
be stored and the stored number is substantial and could lead to an unacceptable
error in the calculated result. We can obtain a more accurate result by scaling the
number up by a factor of 2, which produces a result between 32,768 and 65,535,
therefore still allowing storage in a 16-bit number. Using the earlier example of
storing 1.45309806319 × 10–4, we can, by multiplying by factor of 228, obtain
a number that can be stored in 16 bits and will be highly accurate of the desired
number.
The integer of the result will result in the stored number being 1.45308673382
× 10–4, which will give a much more accurate calculation, provided that the scaling
factor of 228 can be addressed at a later stage within the calculation. For example, a
multiplication of the scaled number with a 16-bit number scaled 4,12 will produce
a result of 4, 40 (28 + 12); however, the result will be stored in a 32-bit result.
234.58* 28 = 60,052.48
312.732 * 27 = 40,029.69
The two numbers to be added together are 60,052 and 40,029; however, before
the two numbers can be added together the decimal point must be aligned. To align
the decimal points by scaling up the number with the largest number of integer bits,
the 9,7 format number must be scaled up by a factor of 21:
40,029 * 21 = 80,058
1
= 0.066666 ′
15
This will produce a result that is formatted 9,23 when the two numbers are
multiplied together:
174886701
= 20.8481193781
8388608
While the expected result is 20.8488, if the result is not accurate enough, then
the reciprocal can be scaled up by a larger factor to produce a more accurate result.
Therefore, never divide by a number when you can multiply by the reciprocal.
3.10.2 Overflow
When implementing algorithms, we must ensure that the result is not larger than
what is capable of being stored within the result register. When this condition oc-
curs, it is known as overflow; when overflow occurs, the stored result will be incor-
rect and the most significant bits are lost. A remarkably simple example of overflow
would occur if two 16-bit numbers each with a value of 65,535 were added to-
gether and the result was stored within a 16-bit register.
The above calculation would result in the 16-bit result register containing a
value of 65,534, which is incorrect. The simplest way to prevent overflow is to
determine the maximum value, which will result from the mathematical operation,
and use the first equation in this chapter to determine the size of the result register
required. If an averager were being developed to calculate the average of up to 50
16-bit inputs, the size of the required result register can be calculated.
50 * 65,535 = 3,276,750
Using the first equation in this chapter would require a 22-bit result register to
prevent overflow occurring. Care must also be taken when using signed numbers
to ensure that overflow does not occur when using negative numbers. Using the
averager example again, this time taking 10 averages of a signed 16-bit number and
returning a 16-bit result:
10 * –32,768 = –327,680
The input value will range between 0 and 10 millibars with a resolution of 0.1
millibar. The output of the module is required to be accurate to +/–0.01m. As the
module specification does not determine the input scaling, this can be determined
by the following equation.
LOG10 10
4 = Ceil
LOG10 2
LOG10 133.29
8 = Ceil
LOG10 2
LOG10 1.7673
1 = Ceil
LOG10 2
These scaling factors allow the scaled spreadsheet to be calculated, this can be
seen in Table 3.4. The results of each stage of the calculation will produce a result
that will require more than 16 bits.
The calculation of the Cx2 will produce a result that is 32 bits long when
formatted 4,12 + 4,12 = 8,24; this is then multiplied by the constant C producing
a result that will be 48 bits long when formatted 8,24 + 0,16 = 8,40. For the ac-
curacy required in this example, 40 bits of fractional representation are excessive;
therefore, the result will be divide by 232 to produce a result with a bit length of 16
bits formatted as 8,8. The same reduction to 16 bits is carried out upon the calcula-
tion of Bx to produce a result that is formatted as 5,11. The result is the addition
of columns Cx2, Bx, and A; however, to obtain the correct result, the radix points
must first be aligned. This can be achieved through either the shifting up of A and
Cx2 to align the numbers in an x,11 format. The alternate approach would be to
shift down the calculated Bx to a format of 8,8 aligning the radix points with the
calculated values of A and Cx2.
In this example, the calculated value was shifted down by 23 to align the radix
points in an 8,8 format. This approach simplified the number of shifts required
and therefore reduces the logic need to implement the example. Note that if the
accuracy could not be achieved through by shifting down to align the radix points,
then the radix points must be aligned by shifting up the calculated values of A and
Cx2. In this example, the calculated result is scaled up by a power of 28; the result
can then be scaled down and compared against the result calculated obtained with
unscaled values. The difference between the calculated result and the expected re-
sult is then the accuracy; using the spreadsheet commands of MAX() and MIN(),
the maximum and minimum error of the calculated result can be obtained across
the entire range of spreadsheet entries. Once the calculated spreadsheet confirms
that the required accuracy can be achieved, the RTL code can be written and
3.10 Fixed Point Mathematics 169
simulated. If desired, the testbench can be designed such that the input values are
the same as those used in the spreadsheet; this allows the simulation outputs to be
compared against the spreadsheet’s calculated results to ensure the correct RTL
implementation.
As the SensorsThink team develops the FPGA solution, they may be required to im-
plement several significant mathematical algorithms. The SensorsThink engineer-
ing team may find implementing these algorithms directly will result in a complex
implementation. This complex implementation may have knock-on effects in tim-
ing closure and verification. One technique that the SensorsThink engineering team
might consider for complex algorithms is a different approach to direct implemen-
tation. This alternative approach is to leverage polynomial approximation.
3.11 Polynomial Approximation 171
( )
R = R0 × 1 + a × t + b × t 2
where R0 is the resistance at 0°C and a and b are coefficents of the PRT.
Most systems that use a PRT will know the resistance from the design of the
electronic circuit and the calculating temperature is what is desired; this requires
rewriting the equation as:
a look-up table, this may prove acceptable; however, you will still need a linear
interpolator function which can be complex mathematically.
Figure 3.31 Trend line and polynomial equation for the temperature transfer function.
3.12 The CORDIC Algorithm 173
Figure 3.32 Plotting between 269°C and 300°C to provide a more accurate result.
174 ��������������������������
FPGA Design Considerations
The CORDIC algorithm is one of the most important algorithms in the FPGA engineer's toolbox, and one that few
engineers are aware of, although as engineers they will almost certainly have used results
calculated by a scientific calculator that uses the algorithm for trigonometric and
exponential functions, starting with the HP35 and continuing to present-day calcu-
lators. The real beauty of this algorithm is that it can be implemented with a very
small FPGA footprint and requires only a small look-up table, along with logic
to perform shifts and additions; importantly, the algorithm requires no dedicated
multipliers or dividers to implement.
This algorithm is one of the most useful for DSP and industrial and control ap-
plications. Depending upon its mode and configuration, the CORDIC implements
some very useful mathematical functions. The CORDIC algorithm can operate in
one of three configurations: linear, circular, or hyperbolic. Within each of these
configurations, the algorithm functions in one of two modes: rotation or vector-
ing (see Table 3.5). In the rotation mode, the input vector is rotated by a specified
angle, while in the vectoring mode, the algorithm rotates the input vector to the x
axis while recording the angle of rotation required.
Additionally, there are other functions that can be derived from the CORDIC
outputs. In many cases, these could even be implemented by the use of another
CORDIC in a different configuration.
SQR = (X² − Y²)^0.5 (Note 2)
Xi+1 = Xi − m ∗ Yi ∗ di ∗ 2^−i
Yi+1 = Yi + Xi ∗ di ∗ 2^−i
Zi+1 = Zi − di ∗ ei
where m defines the configuration, either hyperbolic (m = –1), linear (m = 0), or
circular (m = 1); correspondingly, the value of ei, the elementary angle of rotation, changes de-
pending upon the configuration. The value of ei is normally implemented as a small
look-up table within the FPGA (Table 3.6).
di is the direction of rotation, which depends upon the mode of operation: in
rotation mode, di = –1 if Zi < 0, else +1, while in vectoring mode, di = +1 if Yi <
0, else –1.
When configured in either the circular or hyperbolic configuration using rotation mode, the out-
put results will have a gain that can be precalculated from the number of rotations
using the equation:

An = ∏ √(1 + 2^−2i)
This gain is typically fed back into the initial setting of the algorithm to remove
the need for post-scaling of the result (Table 3.7).
While the algorithm presented above is particularly important to the design
engineer, it has to be noted that the CORDIC algorithm only operates within a
strict convergence zone that may require the engineer to perform some prescaling
to ensure the algorithm performs as expected. It is worth noting that the algorithm
will get more accurate the more iterations (serial) or stages (parallel) the engineer
decides to implement. A general rule of thumb is that for n bits of precision, n itera-
tions or stages are required. However, all of this is easily modeled in simple tools
such as Excel or MATLAB prior to cutting code to ensure the accuracy is obtained
with the selected iterations.
3.13 Convergence
The CORDIC algorithm as defined will only converge (work) across a limited range
of input values. For circular configurations of the CORDIC algorithm, convergence is
guaranteed for angles below the sum of the angles in the look-up table (i.e., be-
tween –99.7° and 99.7°). For angles outside of this range, the engineer must use a trigono-
metric identity to translate the angle to one within it. The same is true for convergence within the
linear configuration. However, in the hyperbolic configuration, to gain convergence, certain
iterations must be repeated (4, 13, 40, …, k, 3k + 1). In this case, the maximum input
of θ is approximately 1.118 radians.
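One common form of prescaling is to fold the input angle into the convergence zone using standard trigonometric identities. The C sketch below (illustrative only, for the circular configuration) folds any angle into ±π/2 and flags when the cosine output must be negated.

#include <math.h>

#define PI 3.14159265358979323846

/* Fold an arbitrary angle (radians) into the circular CORDIC convergence
   zone using sin(theta) = sin(PI - theta) and cos(theta) = -cos(PI - theta).
   The caller negates the cosine output when *negate_cos is set. */
double fold_angle(double theta, int *negate_cos)
{
    while (theta >   PI) theta -= 2.0 * PI;   /* wrap into (-PI, PI] */
    while (theta <= -PI) theta += 2.0 * PI;

    *negate_cos = 0;
    if (theta > PI / 2.0) {                   /* second quadrant */
        theta = PI - theta;
        *negate_cos = 1;
    } else if (theta < -PI / 2.0) {           /* third quadrant  */
        theta = -PI - theta;
        *negate_cos = 1;
    }
    return theta;                             /* now within +/-90 degrees */
}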
CORDICs are used in a wide range of applications, including digital signal pro-
cessing, image processing, and industrial control systems. The most basic method
of using a CORDIC is to generate sine and cosine waves when coupled with a
phase accumulator. The use of the algorithm to generate these waveforms can, if
correctly done, result in a high spurious free dynamic range (SFDR). Good SFDR
performance is required for most signal processing applications. Within the field
of robotics, CORDICs are used within kinematics, where the combination of existing coordi-
nate values with new coordinate values can easily be accomplished using a circular
CORDIC in the vectoring mode. Within the field of image processing, 3-D operations
such as lighting and vector rotation are perfect candidates for implementation
using the algorithm. However, perhaps the most common use of the algorithm is in
implementing traditional mathematical functions as shown in Table 3.1 where mul-
tipliers, dividers, or more interesting mathematical functions are required in devices
where there are no dedicated multipliers or DSP blocks. This means that CORDICs
are used in many small industrial controllers to implement mathematical transfer
functions, with true RMS measurement being one example. CORDICs are also be-
ing used in biomedical applications to compute fast Fourier transforms (FFTs) to
analyze the frequency content of many physiological signals. In this application,
along with performing the traditional mathematical functions, the CORDIC is used
to implement the FFT twiddle factors.
One of the simplest methods of modeling your CORDIC algorithm before cutting
code is to put together a simple Excel spreadsheet that allows you to model the
number of iterations and the gain (An), initially using a floating-point number system
and later using a scaled fixed-point system, providing a reference for verification of
the code during simulation (Table 3.8).
As can be seen from the Excel implementation, the initial X input is scaled to compensate
for the CORDIC gain An, removing the need for postprocessing of the result, and the initial
angle argument is set in Z, which is defined in radians, as are the results.
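For readers who prefer code to a spreadsheet, the same floating-point model can be sketched in C. The function below is a minimal rotation-mode, circular CORDIC with the initial X pre-scaled by the reciprocal of the gain An; it is intended purely as a reference model for verification, not as RTL, and the function name is an assumption.

#include <math.h>
#include <stdio.h>

#define ITERATIONS 15

/* Floating-point reference model of a circular, rotation-mode CORDIC. */
static void cordic_rotate(double angle, double *cos_out, double *sin_out)
{
    /* Precompute the gain An = product of sqrt(1 + 2^-2i) and pre-scale
       x by its reciprocal so the outputs need no post-scaling. */
    double an = 1.0;
    for (int i = 0; i < ITERATIONS; i++)
        an *= sqrt(1.0 + pow(2.0, -2.0 * i));

    double x = 1.0 / an;
    double y = 0.0;
    double z = angle;                               /* argument in radians */

    for (int i = 0; i < ITERATIONS; i++) {
        double d  = (z < 0.0) ? -1.0 : 1.0;         /* rotation-mode direction   */
        double e  = atan(pow(2.0, -i));             /* ArcTan look-up table entry */
        double xn = x - d * y * pow(2.0, -i);
        double yn = y + d * x * pow(2.0, -i);
        z -= d * e;
        x = xn;
        y = yn;
    }
    *cos_out = x;                                   /* x converges to cos(angle) */
    *sin_out = y;                                   /* y converges to sin(angle) */
}

int main(void)
{
    double c, s;
    cordic_rotate(0.5, &c, &s);
    printf("cos: %f (ref %f)  sin: %f (ref %f)\n", c, cos(0.5), s, sin(0.5));
    return 0;
}

Running the model for a range of input angles and comparing against the library functions gives the same maximum/minimum error picture as the spreadsheet approach.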
3.16 Implementing the CORDIC
Unless there is a good reason not to, the simplest method of implementing a
CORDIC algorithm within an FPGA is to utilize a tool such as the Xilinx Core
Generator. The Core Generator provides a comprehensive interface allowing the
engineer to define the exact functionality of the CORDIC (rotate, vector).
Unfortunately, the Core Generator does not provide options for working with
CORDICs within the linear mode (the Core Generator does provide separate cores
to perform these functions). However, the VHDL code required to implement the
algorithm can be written in a very few lines, as the simple example of a circular
implementation below shows. There are two basic topologies for implementing a
CORDIC in an FPGA, using either a state machine-based approach or a pipelined
approach. If the processing time is not critical, the algorithm can be implemented
as a state machine that computes one CORDIC iteration per cycle until the desired
number of cycles has been completed. If a high calculation speed is required, then
a parallel architecture is more appropriate. This code implements a 15-stage paral-
lel CORDIC operating within the rotation mode. It uses a simple look-up table of
ArcTan, coupled with a simple array structure to implement the parallel stages.
LIBRARY ieee;
USE ieee.std_logic_1164.all;
USE ieee.numeric_std.all;

-- The entity declaration is abridged in this excerpt: along with clk and
-- resetn, the full listing declares the x_ip, y_ip, and z_ip inputs, the
-- rotated outputs, the x_array, y_array, and z_array pipeline signals,
-- and the tan_array ArcTan look-up table constant.
clk : IN std_logic;
resetn : IN std_logic;

BEGIN
  PROCESS(resetn, clk)
  BEGIN
    IF resetn = '0' THEN
      x_array <= (OTHERS => (OTHERS => '0'));
      z_array <= (OTHERS => (OTHERS => '0'));
      y_array <= (OTHERS => (OTHERS => '0'));
    ELSIF rising_edge(clk) THEN
      -- Stage 0: apply the first micro-rotation directly to the inputs
      IF signed(z_ip) < to_signed(0,18) THEN
        x_array(x_array'low) <= signed(x_ip) + signed('0' & y_ip);
        y_array(y_array'low) <= signed(y_ip) - signed('0' & x_ip);
        z_array(z_array'low) <= signed(z_ip) + tan_array(0);
      ELSE
        x_array(x_array'low) <= signed(x_ip) - signed('0' & y_ip);
        y_array(y_array'low) <= signed(y_ip) + signed('0' & x_ip);
        z_array(z_array'low) <= signed(z_ip) - tan_array(0);
      END IF;
      -- Stages 1 to 14: shift-and-add micro-rotations of the pipelined values.
      -- The loop body is reconstructed from the description in the text; the
      -- original listing continues beyond this excerpt.
      FOR i IN 1 TO 14 LOOP
        IF z_array(i-1) < to_signed(0,17) THEN
          x_array(i) <= x_array(i-1) + shift_right(y_array(i-1), i);
          y_array(i) <= y_array(i-1) - shift_right(x_array(i-1), i);
          z_array(i) <= z_array(i-1) + tan_array(i);
        ELSE
          x_array(i) <= x_array(i-1) - shift_right(y_array(i-1), i);
          y_array(i) <= y_array(i-1) + shift_right(x_array(i-1), i);
          z_array(i) <= z_array(i-1) - tan_array(i);
        END IF;
      END LOOP;
    END IF;
  END PROCESS;
3.17 Digital Filter Design and Implementation
The Smart Sensor SoC being developed by SensorsThink interfaces with several
sensors that gather real-world data. This data will be subject to noise. As such, Sen-
sorsThink engineers may want to consider using digital filters to remove noise from
the signals, especially the HX94 remote humidity and temperature sensor,
which is sampled directly by the XADC.
Digital filters are a key part of any signal-processing system, and as modern
applications have grown more complex, so has filter design. FPGAs provide the
ability to design and implement filters with performance characteristics that would
be exceedingly difficult to re-create with analog methods. These digital filters are
immune to certain issues that plague analog implementations, notably component
drift and tolerances (over temperature, aging and radiation, for high-reliability ap-
plications). These analog effects significantly degrade the filter performance, espe-
cially in areas such as passband ripple.
Digital models have their own quirks. The rounding schemes used within the
mathematics of the filters can be a problem, as these rounding errors will accumu-
late, impacting performance by, for example, raising the noise floor of the filter.
The engineer can fall back on a number of approaches to minimize this impact,
and schemes such as convergent rounding will provide better performance than
traditional rounding. In the end, rounding-error problems are far less severe than
those of the analog component contribution.
One of the major benefits of using an FPGA as the building block for a filter is
the ability to easily modify or update the filter coefficients late in the design cycle
with minimal impact, should the need for performance changes arise due to inte-
gration issues or requirement changes.
Highpass filters are the inverse of the lowpass and will only allow through
frequencies above the cutoff frequency. Bandpass filters allow a predetermined
bandwidth of frequencies, preventing other frequencies from entering. Finally,
band-reject filters are the inverse of the bandpass variety and therefore reject a
predetermined bandwidth, while allowing all others to pass through.
Most digital filters are implemented by one of two methods: finite impulse
response (FIR) and infinite impulse response (IIR). Let’s take a closer look at how
to design and implement FIR filters, which are also often called windowed-sinc
filters. So why are we focusing upon FIR filters? The main difference between the
two filter styles is the presence or lack of feedback. The absence of feedback within
the FIR filter means that for a given input response, the output of the filter will
eventually settle to zero. For an IIR filter subject to the same input that contains
feedback, the output will not settle back to zero. The lack of feedback within the
filter implementation makes the FIR filter inherently stable, as all the filter’s poles
are located at the origin. The IIR filter is less forgiving. Its stability must be care-
fully considered as you are designing it, making the windowed sinc filter easier for
engineers new to DSP to understand and implement. If you were to ask an engineer
to draw a diagram of the perfect lowpass filter in the frequency domain, most
would produce a sketch similar to that shown in Figure 3.33.
The frequency response shown in Figure 3.33 is often called the brick-wall
filter. That is because the transition from passband to stopband is very abrupt
and much sharper than can realistically be achieved. The frequency response also
exhibits other perfect features, such as no passband ripple and perfect attenuation
within the stopband. If you were to extend this diagram such that it was sym-
metrical around 0 Hz extending out to both +/–FS Hz (where FS is the sampling
frequency) and perform an inverse discrete Fourier transform (IDFT) upon the
response, you would obtain the filter's impulse response, as shown in Figure 3.34.
This is the time-domain representation of the frequency response of the perfect
filter shown in Figure 3.33, often called the filter kernel. It is from this response
that FIR or windowed-sinc filters get their name, as the impulse response is what is
achieved if you plot the sinc function.
h[i] = sin(2π × Fc × i) / (i × π)
Combined with the step response of the filter, these three responses—frequen-
cy, impulse, and step—provide all the information on the filter performance that
you need to know to demonstrate that the filter under design will meet the require-
ments placed upon it.
In practice, the impulse response must be truncated to N + 1 points, chosen symmetrically around the center main lobe, where N is the desired filter
length (remember that N must be an even number). This truncation affects
the filter's performance in the frequency domain due to the abrupt cutoff of the new,
truncated impulse response. If you were to take a discrete Fourier transform (DFT)
of this truncated impulse response, you would notice ripples in both the passband
and stopband along with reduced roll-off performance. Therefore, it is common
practice to apply a windowing function to improve the performance.
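As a concrete illustration of the truncate-and-window approach just described, the following C sketch generates an (N + 1)-tap lowpass kernel from the sinc function and applies a Hamming window. The function name and the choice of the Hamming window are illustrative assumptions; other windows (e.g., Blackman) are equally valid.

#include <math.h>

#define PI 3.14159265358979323846

/* Generate an (N + 1)-tap windowed-sinc lowpass kernel. fc is the cutoff
   as a fraction of the sample rate (0 < fc < 0.5) and N must be even. */
void windowed_sinc(double h[], int N, double fc)
{
    double sum = 0.0;
    for (int i = 0; i <= N; i++) {
        int k = i - N / 2;                               /* centre the kernel   */
        double s = (k == 0) ? 2.0 * PI * fc              /* sinc limit at k = 0 */
                            : sin(2.0 * PI * fc * k) / k;
        double w = 0.54 - 0.46 * cos(2.0 * PI * i / N);  /* Hamming window      */
        h[i] = s * w;
        sum += h[i];
    }
    for (int i = 0; i <= N; i++)                         /* normalise for unity */
        h[i] /= sum;                                     /* gain at DC          */
}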
There are two common methods of converting a lowpass filter design into a highpass one. The simplest method, called spectral inversion, changes the stopband
into the passband and vice versa. You perform spectral inversion by inverting each
of the samples, while at the same time adding one to the center sample. The second
method of converting to a highpass filter is spectral reversal, which mirrors the fre-
quency response; to perform this operation, simply invert every other coefficient.
Having designed the lowpass and highpass filters, it is easy to make combinations
that you can use to generate bandpass and band-reject filters. To generate a band-
reject filter, just place a highpass and a lowpass filter in parallel with each other
and sum together the outputs. Meanwhile, you can assemble a bandpass filter by
placing a lowpass and a highpass filter in series with each other.
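A minimal C sketch of the spectral inversion operation described above (the function name is illustrative); it assumes an (N + 1)-tap kernel with N even.

/* Convert an (N + 1)-tap lowpass kernel h[] into the complementary
   highpass kernel by spectral inversion. */
void spectral_inversion(double h[], int N)
{
    for (int i = 0; i <= N; i++)
        h[i] = -h[i];          /* invert every sample...            */
    h[N / 2] += 1.0;           /* ...then add one to the centre tap */
}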
The SensorsThink engineering team will also be working with ADCs and filters;
several of these may require verification in the frequency domain during testing. As
such, it is important that the SensorsThink development team understands how to
operate in the frequency domain. For many engineers, working in the frequency do-
main does not come as naturally as working within the time domain; however, the
engineer needs the ability to work within both domains to unlock the true potential
of FPGA-based solutions.
Depending upon the type of the signal, repetitive or nonrepetitive, discrete or non-
discrete, there are several methods that one can use to convert between the time
and frequency domains, for instance, Fourier series, Fourier transforms, or z-transforms.
where x[i] is the time-domain signal, i ranges from 0 to N – 1, and k ranges from 0
to N/2. The algorithm above is called the correlation method, and what it does is
multiply the input signal with the sine or cosine wave for that iteration to determine
its amplitude.
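The correlation method can be sketched directly in C as a reference model. Note that the sign convention for the imaginary part varies between texts (some authors negate the sine correlation); the plain form is used here and is matched by the synthesis sketch later in this section. The function name and array layout are assumptions.

#include <math.h>

#define PI 3.14159265358979323846

/* Real DFT of an N-point signal x[] into N/2 + 1 real (rex[]) and
   imaginary (imx[]) correlation amplitudes. */
void real_dft(const double x[], double rex[], double imx[], int N)
{
    for (int k = 0; k <= N / 2; k++) {
        rex[k] = 0.0;
        imx[k] = 0.0;
        for (int i = 0; i < N; i++) {
            rex[k] += x[i] * cos(2.0 * PI * k * i / N);  /* cosine correlation */
            imx[k] += x[i] * sin(2.0 * PI * k * i / N);  /* sine correlation   */
        }
    }
}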
At some point in our application, we will wish to transform back from the
frequency domain into the time domain; to do so, we can use the synthesis equa-
tion, which combines the real and imaginary waveforms to recreate a time-domain
signal:
x[i] = Σ (k = 0 to N/2) ReX̄[k] cos(2πki/N) + Σ (k = 0 to N/2) ImX̄[k] sin(2πki/N)
However, ReX̄ and ImX̄ are scaled versions of the cosine and sine wave amplitudes, so
we need to scale them: ImX[k] and ReX[k] are divided by N/2 to determine
Figure 3.36 N bits in the time domain to n/2 real and imaginary parts in the frequency domain.
the values for ReX̄ and ImX̄ in all cases, except for ReX[0] and ReX[N/2], which are
instead divided by N.
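A C sketch of this synthesis equation, applying the scaling just described, is shown below. It pairs with the real_dft sketch shown earlier (plain-sign correlation convention); the function name is an assumption.

#include <math.h>

#define PI 3.14159265358979323846

/* Real IDFT: rebuild x[] from rex[] and imx[]. Every bin is divided by
   N/2, except bins 0 and N/2, which are divided by N. */
void real_idft(double x[], const double rex[], const double imx[], int N)
{
    for (int i = 0; i < N; i++) {
        x[i] = 0.0;
        for (int k = 0; k <= N / 2; k++) {
            double scale = (k == 0 || k == N / 2) ? (double)N : N / 2.0;
            x[i] += (rex[k] / scale) * cos(2.0 * PI * k * i / N)
                  + (imx[k] / scale) * sin(2.0 * PI * k * i / N);
        }
    }
}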
For obvious reasons, this is called the IDFT. Having explained the algorithms
used for determining the DFT and IDFT, it may be useful to know what we can use
these for.
We can use tools such as Octave, MATLAB, and even Excel to perform DFT
calculations upon captured data and many lab tools such as oscilloscopes can per-
form DFT upon request.
However, before we progress, it is worth pointing out that both the DFT and
IDFT above are referred to as real DFT and real IDFT in that the input is a real
number and not complex; why we need to know this will become apparent.
DFTtime = N × N × Kdft

where Kdft is the processing time for each iteration; as such, the DFT can become
quite time-consuming to compute. As a result, DFTs within FPGAs are normally
implemented using the fast Fourier transform (FFT) algorithm. The FFT has often
been called "the most important algorithm of our lifetime," as it has had such an
enabling impact on many industries. The FFT differs slightly from the DFT algorithm
explained previously in that it calculates the complex DFT; that is, it expects real and
imaginary time-domain signals and produces frequency-domain results that are N
points wide as opposed to N/2. This means that when we wish to calculate a real DFT, we must
first address this by setting the imaginary part to zero and moving the time-domain
signal into the real part. If we wish to implement an FFT within our Xilinx FPGA,
we have two options; we can write one from scratch using the HDL of our choice
or use the FFT IP provided within the Vivado IP catalog or another source. Unless
there are pressing reasons not to use the IP, the reduced development time resulting
from using the IP core should drive its selection and use.
The basic approach of the FFT is that it decomposes the time-domain signal
into several single-point time-domain signals. This is often called bit reversal, as the
samples are reordered. The number of stages that it takes to create these single-
point time-domain signals is calculated by log2 N, where N is the number of points,
if a bit reversal algorithm is not used as a shortcut.
These single-point, time-domain signals are then used to calculate the frequen-
cy spectra for each of these points. This is straightforward as the frequency spectra
are equal to the single-point time domain.
It is in the recombination of these single frequency points that the FFT algo-
rithm gets complicated. We must recombine these spectra points one stage at a
time, which is the opposite of the time-domain decomposition; thus, it will again
take Log2 N stages to recreate the spectra. This is where the famous FFT butterfly
comes into play.
When compared with the DFT execution time, the FFT takes approximately

FFTtime = Kfft × N × log2 N

The noise floor of an FFT is given by

FFT Noise Floor (dB) = 6.02n + 1.77 + 10 log10(FFTsize / 2)
where n is the number of quantized bits within the time domain and FFTsize is
the FFT size. For FPGA-based implementations, this is normally a power of 2, for
example, 256, 512, or 1,024. The frequency bins will be evenly spaced at

Bin Width = (FS / 2) / FFTsize

For a simple example, an FS of 100 MHz with an FFT size of 128 would
give a frequency resolution of 0.39 MHz. This means that frequencies within 0.39
MHz of each other cannot be distinguished.
Commonly, FPGA-based systems must interface with the real world, having per-
formed the task that they were designed to do. Sadly, the real world tends to function
with analog signals as opposed to digital ones; therefore, conversion is re-
quired to and from the analog domain. As with selecting
the correct FPGA for the job, the engineers are faced with a multitude of choices
when selecting the correct ADC or DAC for their system. The first thing for the
engineers to establish is the sampling rate required for the signal to be converted, as
this will drive not only the converter selection but also impact the FPGA selection
as well to ensure that the processing speed and logic footprint required can be ad-
dressed. As all engineers should be aware, the sampling rate of the converter needs
to be at least twice that of the signal being sampled. Therefore, if the engineers re-
quire sampling a signal at 50 MHz, the sampling rate needs to be at least 100 MHz;
otherwise, the converted signal will be aliased back upon itself and not be correctly
represented. This aliasing is not always a bad thing and can be used by engineers
to fold back signals into the useable bandwidth if the converter bandwidth is wide
enough.
•• Flash converters are known for their speed and use a series of scaled analog
comparators to compare the input voltage against a reference voltage and
use the outputs of these comparators to determine the digital code.
DACs are also provided in several different implementations; some of the most
common are binary-weighted, R-2R Ladder, and pulse width modulation.
•• Binary-weighted: These are one of the fastest DAC architectures and sum
the result of individual conversions for each logic bit; for example, a resistor-
based implementation will switch resistors in or out depending upon the current code.
•• R-2R Ladder: These converters use a structure of cascaded resistors of value
R-2R. Due to the ease with which precision resistors can be produced and
matched, these are more accurate than the binary-weighted approach.
•• Pulse width modulation: The simplest type of DAC that passes the pulse
width modulation waveform through a simple lowpass analog filter. This
is commonly used in motor control but also forms the basis for delta sigma
converters.
When performing this testing, the engineer must take care that the
FFT used is correctly sized, so that the noise floor shown is not
inadvertently limited by the size of the FFT selected. The FFT noise floor is
given by

FFT Noise Floor (dB) = 6.02n + 1.77 + 10 log10(FFTsize / 2)
These tests should be performed using a single-tone test, normally a simple sine
wave, to reduce the complexity of the output spectrum. For the engineer to get the
best results, coherent sampling of the output is a must. Coherent sampling occurs
when there is an integer number of cycles within the data window, that is, when

Fin/FS = Ncycles/FFTsize
3.20.3 Communication
As with all external devices, ADCs and DACs are provided with several
different interfacing options. The most common differentiation is whether the interface
is parallel or serial. Typically, higher-speed devices will implement a parallel
interface, while slower-speed devices will utilize a serial interface. However, the
choice of application may drive the engineer down a particular route. It is easier,
for instance, to detect a stuck-at bit in a serial interface than it is in a parallel one.
High-speed interfaces provide multiple output buses (I and Q) or use double data
rate outputs; some devices might even offer both options. This allows the data
rate to be maintained while reducing the frequency of operation required for the
interface. For example, an interface sampling at 600 MHz will produce an output
at 300 MHz (half the sampling frequency); it is much easier to recover if the clock
frequency is 75 MHz (FS/4) and there are two data buses that provide the samples
from the device using double data rate (DDR), as is the case with the ADC10D1000
from National Semiconductor. This relaxes the input timing that the FPGA
engineer must achieve. Many high-speed converters utilize
LVDS signaling on their I/O, as the lower voltage swing and low current reduce the
coupling that can occur with other signaling standards (e.g., LVCMOS), which can
affect the mixed-signal performance of the converter.
A = sin(π × Fin/FS) / (π × Fin/FS)
Create the correction factor that is the reciprocal of those calculated for the
roll-off. The engineer can then take an inverse Fourier transform to obtain the
coefficients needed to implement the filter. Typically, this filter can be implemented
with a few taps. Table 3.10 shows the first 10 coefficients for the filter, while Fig-
ures 3.39 and 3.40 show the implementation of the filter in the frequency domain.
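The roll-off amplitude A above and its reciprocal correction factor can be sketched in C as follows; the function names are assumptions.

#include <math.h>

#define PI 3.14159265358979323846

/* sin(x)/x (sinc) roll-off of a sampled output at frequency fin for
   sample rate fs, and the reciprocal correction factor described above. */
double sinc_rolloff(double fin, double fs)
{
    double x = PI * fin / fs;
    return (x == 0.0) ? 1.0 : sin(x) / x;   /* amplitude roll-off A         */
}

double sinc_correction(double fin, double fs)
{
    return 1.0 / sinc_rolloff(fin, fs);     /* reciprocal correction factor */
}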
pattern generators, spectrum analyzers) to achieve the desired testing and determine
the actual performance. However, the reprogrammable flexibility of the FPGA al-
lows specific test programs to be inserted into the device that either capture and
analyze the output of an ADC or provide the stimulus for a DAC, hopefully reducing
the need for additional test equipment.
So far, the approaches to logic design for SensorsThink have focused on one of
the two hardware description languages, VHDL. Both hardware description
languages, VHDL and Verilog, operate at what is commonly known as the register
transfer level (RTL), as the languages are used to create registers and describe the
transfer of data between registers. Both developing and verifying RTL designs can be a time-consuming process.
The output of these 3 stages is Verilog and VHDL source code, which can be
imported as an IP block into the implementation tool.
However, C does not describe parallel operations as hardware description lan-
guages do. To address this, the HLS compiler can be instructed by the developer
where parallel structures exist in the source code. The user identifies parallel struc-
tures in the code using pragmas; these pragmas are used to control the compiler's
implementation of the design. Example pragmas include:
•• Interval: The number of clock cycles between the synthesized IP module be-
ing able to start processing a new input.
•• Unrolling: Where loops are used to process data elements, by default, HLS
tools will keep loops rolled up, ensuring minimal resource requirements.
However, keeping loops rolled results in decreased performance as each it-
eration of the loop uses the same hardware and incrementing a loop counter
requires at least one clock. Unrolling a loop results in the creation of hard-
ware for each iteration of the loop significantly reducing the execution time
of the loop, while at the same time increasing the required resources. This
resource for performance trade-off is one of the key aspects of optimizing
HLS designs.
•• Pipelining: Pipelining is another technique that can be used to increase
throughput in your HLS implementation. Pipelining is an alternative to
unrolling and enables loops to be pipelined such that a new input can be
processed on each clock cycle (other constraints permitting). This does not
affect the latency of the pipeline, but it increases the throughput enabling the
design to achieve higher performance than a nonpipelined design.
•• Memory partitioning and reshaping: By default, arrays in C code are synthe-
sized to the block RAMs. Block RAMs can act as bottlenecks when arrays
are accessed, preventing unrolling or pipelining with an optimal interval.
The bottleneck occurs as block RAMs have limited access ports that pro-
hibit multiple accesses at the same time. Partitioning block RAMs allows a
single block RAM to be split into several smaller block RAMs. The HLS tool
is then able to arrange the data across the smaller block RAMs to enable
multiple accesses in parallel, removing the bottleneck. In the most extreme
cases, the block RAM contents can be partitioned into individual registers.
Reshaping is like partitioning; however, instead of splitting the words across
block RAMS, reshaping changes the size of the words stored and accessed.
Along with the optimization pragmas, HLS compilers can also use pragmas to
define the interfacing standard or standards that are used in the synthesized HDL
to interface with larger programmable logic design.
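A hedged C sketch of how pragmas of the kind described above might be applied to a trivial loop is shown below. The pragma spellings follow the Vitis HLS style, but the exact syntax and the achievable optimization depend on the tool version, and the function itself is purely illustrative.

#define N 32

/* Illustrative only: partition the arrays and unroll the loop so each
   iteration gets its own hardware. */
void scale_samples(const int in[N], int out[N], int gain)
{
#pragma HLS ARRAY_PARTITION variable=in complete    /* remove the block RAM access bottleneck */
#pragma HLS ARRAY_PARTITION variable=out complete
    for (int i = 0; i < N; i++) {
#pragma HLS UNROLL                                  /* create hardware for every iteration */
        out[i] = in[i] * gain;
    }
}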
HLS is ideal for accelerating the development of signal processing and image
processing applications. As engineers developing our IOT sensor application, we
may need to perform filtering on sensor data to remove noise; this is an ideal ap-
plication for HLS.
The HLS development will implement a 16-bit block average; that is, 16 ele-
ments will be sampled and the average of the 16 elements will be output. As the
IOT board application contains a Xilinx Zynq SoC, the HLS tool used for the de-
velopment of this filter module is Vitis HLS.
Creating such an example in HLS is amazingly simple; we can implement this
in less than 10 lines of C, as shown in Figure 3.41.
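The code in Figure 3.41 is not reproduced here, but a minimal sketch of what such a block-average function might look like is shown below; the function name, data types, and the PIPELINE pragma are assumptions rather than the figure's exact content.

#include <stdint.h>

#define BLOCK_SIZE 16

/* Accumulate 16 input samples and output their average. */
int16_t block_average(const int16_t samples[BLOCK_SIZE])
{
    int32_t acc = 0;
    for (int i = 0; i < BLOCK_SIZE; i++) {
#pragma HLS PIPELINE II=1
        acc += samples[i];                       /* accumulate the 16 samples */
    }
    return (int16_t)(acc / BLOCK_SIZE);          /* output the block average  */
}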
Before we can synthesize the HLS design, we need to know that the C algo-
rithm works as expected; within Vitis HLS, we have two verification methods: C simulation and C/RTL cosimulation.
The C simulation created to verify the average filter can be seen in Figure 3.42;
this test bench is very flexible, loading in the verification values from a text file and
applying them to the average filter. The resulting average value generated by the
average IP core is printed for verification.
Synthesizing this code results in an IP core that can calculate one average every
18 clock cycles and begin to process a new input every 19 clock cycles.
To ensure the most optimal implementation in our FPGA, we can use pragmas
to remove potential bottlenecks in the design. Examining the design analysis view
in Vitis HLS provided in Figure 3.43, we see that the first clock cycle is used to start the IP
core. The main accumulation loop takes 2 clock cycles and is run 16 times. How-
ever, as the HLS tool has been able to pipeline the loop's access to memory (indicated
by the II=1 in the loop descriptor), the loop completes in only 16 clock
cycles, while the final clock cycle is used to output the average calculated
from the accumulated value.
Examining the analysis flow shows that the main bottleneck in being able to
increase performance and reduce the number of clock cycles taken is the reading of
the 16 input values from the external block RAM.
CHAPTER 4
When Reliability Counts

In the end, a well-designed system is only as good as the reliability it offers. While
designers may have worked hard to meet all technical requirements and system
specifications, none of it would matter if the system failed because reliability
concerns were not addressed.
Over the past few years, the penetration of embedded systems into all aspects of hu-
man life has put more focus on the importance of incorporating reliability
into design practices. While the question of reliability arises in almost any new project,
the emphasis placed on reliable design can vary greatly from project to project.
In some projects, the reliability aspect of the system design process
is given far greater importance than other considerations, such as time to
market. In general, reliability is given the highest consideration in industries where
system failure could lead to loss of life (e.g., automobile or aeronautical industries)
or a severe impact on the company’s reputation (e.g., financial systems or large-
scale factory automation). However, there are several other major industrial sec-
tors where reliability as a design aspect is not given a similar importance as time
to market. One good example is the present-day smartphone industry. Given the
nature and dynamics of the market and the fast pace of technology, if the designers
test the devices for all possible reliability concerns over a period of time, the tech-
nology used in the smartphone model could be out of date even before it hits the
market. Keeping this in mind, designers must understand what level of emphasis
is required for addressing reliability in the project that they have undertaken and
what industry or end user it will serve.
While it is easy to recognize the importance of reliability in mass-produced
systems whose failure could lead to loss of life (such as automobiles and aircraft), the reli-
ability of a system generally has a far greater financial impact on the brand and
profitability of companies engaged in other electronics industries, such as mobile
devices or consumer electronics.
The emphasis on the reliability of a component varies greatly from one industry
to another. A component deployed in a product used in the smartphone industry
could have a completely different requirement for reliability compared to the
same component being used in another industry (e.g., aerospace). As shown in
Table 4.1, errors can occur during any stage of the design life cycle and can have
various degrees of impact on the reliability of a system. Therefore, it is important
to study reliability from a technical point of view. Before we dive into the techni-
cal definitions of reliability and several other parameters associated with it, let us
consider a few scenarios to emphasize the importance of reliability in the design
process.
RS(t) = Prob(to ≤ t ≤ tf), ∀ f ∈ F
Here to is the time at which the system is introduced into service and tf is the
time at which the first critical fault f takes place. f belongs to a set of critical faults
F that can cause the system to fail. The failure probability QS(t) of a system is
complementary to the reliability RS(t) of the system. Hence, failure rate (or failure
probability) of a system can be calculated by using the relation:
QS (t ) + RS (t ) = 1
As mentioned above, each system (or subsystem) consists of one or more indi-
vidual critical components and the reliability of the system RS(t) is dependent on
the individual reliability of each critical component. Let us denote the reliability of
an individual critical component as RC(t). Using the same formula as above, the
failure rate (probability) for a component QC(t) is complementary to the reliability
RC(t) of the component and can be calculated by using the relation:
QC (t ) + RC (t ) = 1
The failure rate QC(t) generally varies throughout the life of the component;
hence, a single value of it is not very useful for calculating the reliability. Failure of a system can be de-
fined as the moment in time at which the system ceases to fulfill its specified function.
Mathematically, failure rate of a system is defined as:
Figure 4.1 The bathtub curve (also known as the component reliability model).
Table 4.2 Summary of Failure Rates and Possible Causes During Different Phases of the Bathtub Curve

Burn-in — Failure rate: decreasing. Possible causes: manufacturing defects, soldering, assembly errors, part defects, poor QC, poor design. Possible improvement actions: better QC, acceptance testing, burn-in testing, screening, highly accelerated stress screening.

Useful life — Failure rate: constant. Possible causes: environment, random loads, human errors, random events. Possible improvement actions: excess strength, redundancy, robust design.

Wear-out — Failure rate: increasing. Possible causes: fatigue, corrosion, aging, friction. Possible improvement actions: derating, preventive maintenance, parts replacement, better materials, improved designs, technology.
assumed to be constant during the useful life of the component, as shown in Figure
4.1, and is generally expressed in failures in time (FIT). For example, if there is an
occurrence of 10 failures for every 10⁹ hours, then λ will be equal to 10 FIT. The
value for λ can also be calculated using the formulas given below with the MTBF
or MTTF shown in the reliability data. By making use of λ, we can model the reli-
ability of a component using an exponential distribution.
RC(t) = e^(−λt)
Example 4.1
During the trial of a key component for the SoC Platform project, sensor company
SensorsThink observed that 95 components failed during a test with a total
operating time of 1 million hours (total time for all items: both failed and passed).
Calculate the reliability of the key component after 1,000 hours (i.e., t = 1,000).
Solution:
Failure rate is given by:
Failure Rate = λ = Total Number of Failed Components / Total Operating Time

∴ λ = 95 / 1,000,000 = 9.5 × 10⁻⁵ per hour
Reliability of a component at time t: RC(t) = e^(−λt)

RC(1,000) = e^(−9.5 × 10⁻⁵ × 1,000) = 0.909
Therefore, there is a 90.9% probability that the designed component for the
sensor platform project will survive after 1,000 hours.
MTBF = ∫₀^∞ R(t) dt = 1/λ
adjustments to prevent future failures. It is obvious that the final onus on causes
of failure thus lies with either the designer or the production engineer, which may
not be welcomed by either of them unless a robust classification of failure types is
already agreed upon. When speaking about the reliability of a system, we end up
talking more about the following three aspects: failure causes, failure modes, and
failure mechanisms.
While it is important for all embedded system design engineers to know these
terms and their importance while working on a new system design, it is also impor-
tant to understand that most big companies have a separate reliability and fault test-
ing division to undertake such studies. This is mainly because there is an inherent
conflict of interest between designers and testers, even though they both are aiming
towards the same goal of providing the best product to their clients or customers.
However, for smaller and mid-sized companies, it may not be financially viable to
maintain a completely separate department dedicated to reliability testing; hence, the
risk of conflict of interest is managed using other operating protocols.
MTTF = E(tf) = ∫₀^∞ R(t) dt = 1/λ
MTBF is a term used for repairable electronic systems and is defined as the ex-
pected time between two failures of a repairable system as shown in Figure 4.2. It
is calculated in the same way as MTTF, with an assumption that the system failure
is repairable.
MTBF = ∫₀^∞ R(t) dt = 1/λ
MTTR is a term used in the context of repairable systems and is the amount of
time required to repair a system and make it operational again.
Table 4.3 Key Reliability Characteristics for Repairable and Nonrepairable Systems

Availability — Nonrepairable: function of reliability. Repairable: function of reliability and maintainability.
Key factor — Nonrepairable: failure rate. Repairable: failure rate and repair rate.
Time to failure — Nonrepairable: MTTF. Repairable: MTBF.
Expected life — Nonrepairable: mean residual life (MRL). Repairable: MRL (economic justification).
Figure 4.2 MTBF for repairable systems. MTTF, which is a characteristic of nonrepairable systems,
is a special case of this plot that calculates the first instance of failure (tf1).
While calculating both MTBF and MTTR, it is safe to assume that there
is more than one failure and repair during the operational life of the system.
Example 4.2
Let us assume that the SoC Platform project by Sensor Company SensorsThink will
eventually be deployed as a subsystem to detect fuel efficiency in a car. The pro-
totype was tested in 200 cars for a total of 28,000 hours. Overall, 7 failures were
observed. Calculate the MTBF and failure rate.
Solution:
Assuming that the overall car system is a repairable system (since it is highly un-
likely that someone will change the car due to a fuel efficiency detector failure), we
can use MTBF
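Working through the numbers given above:

MTBF = Total Operating Time / Total Number of Failures = 28,000 / 7 = 4,000 hours

λ = 1 / MTBF = 1 / 4,000 = 2.5 × 10⁻⁴ failures per hour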
Example 4.3
Let us consider a scenario where SoC Platform project by sensor company Sensors-
Think is being deployed as the primary control system to detect ignition tempera-
ture at a chemical plant. Five such systems were tested in with failure hours of 157,
201, 184, 239, and 225. Calculate the MTTF and failure rate.
Solution:
Considering the primary control system used in the chemical plant as a nonrepair-
able system (needs replacement), we use MTTF:
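Working through the numbers given above:

MTTF = (157 + 201 + 184 + 239 + 225) / 5 = 1,006 / 5 ≈ 201.2 hours

λ = 1 / MTTF ≈ 4.97 × 10⁻³ failures per hour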
4.2.7 Maintainability
The maintainability of an embedded system is the measure of the ability of the sys-
tem or component to be retained or restored to a specified condition when mainte-
nance is performed by qualified personnel using specified procedure and resources.
It is an important parameter to be considered during the design phase, as ease of main-
tenance leads to both reduced downtime and reduced costs. Maintainability
also helps one determine the level of fault tolerance and redundancy techniques
required for a certain system. Maintainability is measured by MTTR.
MTTR = [Σ (i = 1 to n) λi Rti] / [Σ (i = 1 to n) λi]
Here n is the number of systems available, λi is the failure rate of the ith system,
and Rti is the repair time for the ith system. The probability of performing a mainte-
nance action within an allowable time interval can be calculated using:

M(t) = 1 − e^(−t/MTTR)
Example 4.4
Let us assume that sensor company SensorsThink has ordered 4 prototypes from
the SoC Platform project that will be used in solar panel direction control systems
for maximizing the angle of incidence of solar rays on the panel. During testing,
it was found that one of the components on each of the control systems fails at 1,100
hours, 1,150 hours, 1,050 hours, and 1,200 hours, respectively. The system could
be repaired by replacing the faulty component, and the total time to repair all 4
systems was 48 hours. Calculate the time to failure of this system if it was: (1) classified as a repairable system; (2) classified as a nonrepairable system.
Solution:
MTTR for this system = 48/4 = 12 hours.
1. In this type of application, the solar panel controller can easily be classified
as a repairable system; therefore, we use MTBF:

MTBF = 1/λ = Total Operational Time / Total Number of Failures = (1,100 + 1,150 + 1,050 + 1,200 − 48) / 4

∴ MTBF = 1,113 hours

2. If the same system is instead classified as a nonrepairable system, we use MTTF:

MTTF = 1/λ = Total Time / Total Number of Failures = (1,100 + 1,150 + 1,050 + 1,200) / 4

∴ MTTF = 1,125 hours
As seen in this example, the average time to failure of a system can vary signifi-
cantly based on whether it is classified as a repairable system or a nonrepairable
system.
Example 4.5
Let us consider a scenario where SoC Platform project requested by sensor com-
pany SensorsThink will be deployed as a subsystem in an automatic airflow control
system, developed for an industrial warehouse, with an MTTR of 5 hours. What is
the probability that the system can be repaired within 3 hours?
Solution:
Assuming that the airflow control system is a repairable system,
the probability of completing the repair within 3 hours can be found using:

M(t) = 1 − e^(−t/MTTR) = 1 − e^(−3/5)

∴ M(t) = 1 − 0.549 = 0.451
4.2.8 Availability
Availability of an embedded system is a measure of the time during which the sys-
tem is operational versus the time that the system was planned to operate. It is the
probability that the system is operational at any random time t. The availability of
a system is a very critical measure for repairable systems, as it determines the overall
operating efficiency and costs of the larger system. For example, an airplane with
an availability of 50% would take twice the number of years just to break even on the
cost of purchase, and it could also result in higher expenditure with regard to
buying more planes to serve the same amount of traffic. There are three common
measures of availability: inherent availability, achieved availability, and operational
availability.
However, in the context of this book, we will just cover inherent availability, as
there are other texts available to study this topic in more detail.
The inherent availability (Ai) is one of the most common measures to calculate
the availability of a system. It does not consider the time for preventive mainte-
nance and assumes that there is a negligible time lag between the time of failure and
the start of repair. For repairable systems, it is defined as follows:
Ai = µ / (λ + µ) = MTBF / (MTTR + MTBF)
where μ is the repair rate (=1/MTTR) and λ is the failure rate (=1/MTBF).
As evident in the examples given above, classification of a system as repairable
or nonrepairable depends on several parameters and scenarios. Moreover, there
are multiple parameters that define the reliability of the system. Some of these
parameters are more important than others depending on the application and the
environment in which the sensor system is used.
Example 4.6
The sensors company SensorsThink develops an SoC Platform to be used within a
gas detection system developed for a food storage warehouse, which has an MTTR
of 15 hours and an MTBF of 3,050. What is the inherent availability of the system?
Solution:
Ai = MTBF / (MTTR + MTBF) = 3,050 / (15 + 3,050) = 0.9951
Like the bathtub curve shown in Figure 4.1 for key components of the system, it is
important to plot a bathtub curve for the embedded system that you have designed,
which will include one or many such key (critical) components. Once designers
have enough information to plot the bathtub curve for the system being designed,
they have a powerful visualization tool available to estimate and predict system
failure probability during different phases of the design cycle. The bathtub curve
is also a good indicator of the useful life of the product that you are developing.
As mentioned earlier, the useful life of a product is the time during which the sys-
tem operates with a constant failure rate. Useful life is an important measure as it
determines the cost of the system and the frequency of replacements as well as the
duration of warranty. It is vital to understand that the product reliability (for the
embedded system that you have designed) will vary greatly on the following factors:
1. Total number of components in the system including both critical and non-
critical components;
2. Total number of critical components in the system;
3. The configuration in which components have been placed during the de-
sign and production phase;
4. The criteria that determine what constitutes a system failure.
Keeping the above factors in mind, designers need to come up with a bathtub
model for the designed system that can capture a basic estimate of overall
system reliability. Even a simple embedded system can have hundreds, if not thou-
sands, of individual components, making it an onerous task to calculate system
reliability. At the same time, it is almost impossible to design a system and sell it to
your client without at least a rough estimate of system reliability, an idea of which
components have the highest probability of failure, and an understanding of whether
the reliability of the system can be improved by clever design techniques, good
programming, or providing fault tolerance. Let us try to understand the effect of these factors on the reliability
of the system SoC Platform that you have been asked to design.
The first task is to determine the total number of components used in your
design. The sensor platform project requested by SensorsThink, in Chapter 2, has a
total 941 individual components. Trying to find the individual failure rate of each
of those 941 components and then trying to come up with a single mathematical
formula for system reliability that includes the individual failure rate of all 941
components is an almost impossible exercise. However, a good design engineer still
needs to know all the components used in the system design. The more important
task is to understand which of the components are critical and noncritical.
We have provided a technical definition of the key component for a sensor
platform project requested by SensorsThink in Chapter 2. In simple terms, a criti-
cal (key) component of an embedded system can be defined as a piece of hardware
or software, the failure of which would cause the system to fail altogether or seri-
ously affect its capability to perform the desired task. A good example of a critical
component would be a microprocessor chip in a laptop or a power supply unit.
A noncritical component is one that could potentially affect the
performance of the system but would not lead to critical failure. For example, a
microphone or webcam failure on a laptop could potentially have a serious effect on the user
experience but would probably not lead to replacement of the laptop. There are
many components, in each system, that could be labeled as critical or noncritical
depending on how they are designed and how they are used. For example, a system
designed to handle only one high capacity memory chip would fail if the memory
device failed. However, if the same system can handle multiple chips, each with
RCi(t) = e^(−λi t)
As the failure rates of all individual critical components are statistically inde-
pendent, the overall system reliability in this case will be a product term of n indi-
vidual critical component reliability RS(t)
Figure 4.3 Critical path of a system if all key components are connected in series.
RS(t) = RC1(t) × RC2(t) × RC3(t) × … × RCn(t)

⇒ RS(t) = ∏ (i = 1 to n) RCi(t)
The overall system reliability can therefore be represented in terms of the indi-
vidual failure rate (λi) of each component:
RS(t) = e^(−t Σ (i = 1 to n) λi) = e^(−t λS)
where λS is the overall serial failure rate and can be represented by:
λS = Σ (i = 1 to n) λi
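As a small illustration of these series formulas, the following C sketch (hypothetical function name) sums component failure rates given in FIT and evaluates RS(t) for a given operating time.

#include <math.h>

/* Series-system reliability from n statistically independent component
   failure rates supplied in FIT (failures per 10^9 hours). */
double series_reliability(const double fit[], int n, double hours)
{
    double lambda_s = 0.0;                 /* overall serial failure rate  */
    for (int i = 0; i < n; i++)
        lambda_s += fit[i] * 1e-9;         /* convert FIT to failures/hour */
    return exp(-lambda_s * hours);         /* R_S(t) = e^(-lambda_S * t)   */
}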
A special case of series connection can be if the failure rates of critical compo-
nents are not statistically independent. In that case, we need to derive a failure rate
equation based on the subset of critical components that form the critical path.
In such a case, the failure probability of the system can be found by determin-
ing the subsets of critical path components, whose functioning ensures the func-
tioning of the system even if other critical components outside the critical path fail.
For example, in Figure 4.4, we can have four such critical paths: {1,2,6}, {1,2,4,5},
{3,4,5}, and {3,6}. However, this is a special-case scenario: the system most likely
malfunctions whenever a faulty critical path is chosen and functions properly
otherwise. In a way, it again comes down to how we define system
failure and critical components.
Example 4.7
Consider a sensor system that has 50 critical components connected in series. If the
reliability of each component is 0.999, what will be the overall system reliability?
Solution:
Since the components are connected in series, the system will operate or fail if any
one of the critical components fail. The reliability of the system is given by the
equation:
RS(t) = ∏ (i = 1 to n) RCi(t)
Figure 4.4 Critical path of a series system with dependent critical components.
Here, individual reliability for all critical components is the same; therefore,
the equation reduces to:
RS(t) = (RC(t))^n

Plugging in the values RC(t) = 0.999 and n = 50, we get

RS(t) = (0.999)^50 = 0.9512
Example 4.8
Let us take an example of the embedded system design for SensorsThink and deter-
mine the reliability of the system being developed after 5 years. Refer to Chapter 2,
where we performed component selection and listed the key components used in the
design. For the sake of simplicity, we will choose only a subset of key components
used in the design.
C1. SoC: Xilinx Zynq-7000 Series; XC7Z020-2CLG400I
C2. DDR3: (2) Micron 16-bit/4 Gbit MT41K256M16TW-107
C3. Filing System Storage: Micro SD Card
C4. Ethernet PHY: Microchip KSZ9031
C5. USB 2.0: Microchip USB3320
C6. USB to UART Converter: FTDI FT230X
C7. Wi-Fi/Bluetooth Module: Microchip ATWILC3000
C8. Local Relative Humidity/Temperature Sensor: Sensirion SHTW2
C9. Infrared Sensor: FLIR Lepton 500-0771-01
Component No. | Component Description | Failure Rate (λ) | Component Reliability RC(t) = e^(−λt)
C1 | Xilinx Zynq-7000 series SoC | 22 FIT | 0.999
C2 | DDR3 RAM module | 100 FIT | 0.996
C3 | Micro SD card | 220 FIT | 0.990
C4 | Ethernet PHY IC | 45 FIT | 0.998
C5 | USB 2.0 IC | 165 FIT | 0.993
C6 | USB to UART converter module | 425 FIT | 0.982
C7 | WiFi/Bluetooth module | 185 FIT | 0.992
C8 | Humidity/temperature sensor | 400 FIT | 0.983
C9 | Infrared sensor | 35 FIT | 0.998
[Note 1: 1 FIT is equal to one failure in 10⁹ (one billion) hours of operation; in other words,
FIT = λ(per hour) × 10⁹.]
[Note 2: In the table above, the value of RC(t) is calculated using t = 5 years =
43,800 hours.]
Since each of the above components is critical, we can assume that the com-
ponents are connected in series for the purpose of reliability calculation. The reli-
ability of the system is given by the equation:
RS(t) = ∏ (i = 1 to n) RCi(t)
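Multiplying the reliabilities listed in the table (equivalently, summing the failure rates to λS = 1,597 FIT and evaluating e^(−λS t)), the arithmetic works out to approximately:

RS(5 years) ≈ 0.999 × 0.996 × 0.990 × 0.998 × 0.993 × 0.982 × 0.992 × 0.983 × 0.998 ≈ 0.93

In other words, there is roughly a 93% probability that this subset of key components survives 5 years of operation.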
QCi(t) = 1 − RCi(t)
⇒ QCi(t) = 1 − e^(−λi t)

Assuming that the failure rates of all critical components are statistically inde-
pendent, we get:

QS(t) = ∏ (i = 1 to n) QCi(t)
Figure 4.5 Critical path of a system if all key components are connected in parallel.
Example 4.9
Consider a sensor system that has 5 critical components connected in parallel. If
the reliability of each component is 0.98, what will be the overall system reliability?
Solution:
Since the components are connected in parallel, the system will operate reliably if at
least one of the critical components does not fail. The failure probability of a single
component is:
QCi (t ) = 1 − RCi (t )
⇒ QCi (t ) = 1 − 0.98 = 0.02
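Carrying the calculation through with the parallel-system formula above:

QS(t) = (0.02)⁵ = 3.2 × 10⁻⁹

RS(t) = 1 − QS(t) ≈ 0.9999999968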
So far, we have seen that the reliability of a system is a function of the individual reli-
abilities of its key components. As a system designer, most of the time you will
not have control over the reliability and failure rates of the components that you use
in your design. The little leverage you do have is to make an informed
choice and select the most reliable components possible, based on their
technical documentation. However, this does not mean that designers cannot do
Figure 4.6 Critical path of a system with key components in series-parallel combination.
anything to improve system reliability, even when they are forced to choose a com-
ponent that may not be offering the best reliability (lowest possible failure rate).
In simple terms, the reliability of a system is a function of its failure rate. Failure
in a system occurs when the delivered service deviates from specified service. Gen-
erally, the failure of a system occurs when a key component of the system (hard-
ware or software) is erroneous. The cause of this error is termed the fault. In
other words, a fault creates a latent error, which becomes effective when activated.
When this activated error affects the delivered service, it results in a failure. This
relationship between fault, error, and system failure is shown in Figure 4.7.
There can be several reasons for a fault to occur. Some of the most common
ones are:
•• Design errors;
•• Software or hardware (either mechanical or electronic);
•• Manufacturing problems (related to production and quality control);
•• Damage, fatigue, and deterioration (operating the system in the wearout
phase of the bathtub curve);
•• External disturbances (can be environmental, situational, operational, or
manual);
•• Harsh environmental conditions such as electromagnetic interference and
ionization radiation;
•• System misuse (one of the leading causes of system failure).
Faults can be broadly classified under the following four categories shown in
Figure 4.8.
easier option than hardware changes (which sometimes lead to a system redesign),
the presence of software faults is generally unacceptable within a system.
Software does not deteriorate with age, so it is either correct or incorrect, but some
faults can remain dormant for long periods. Moreover, in the case of embedded
systems, a change in software due to hardware changes could potentially lead to
bugs or errors being introduced in the system (as is the case when we upload new
drivers or update libraries). That kind of fault can be classified as both hardware
and software-related.
Figures 4.9 and 4.10 show the classification of faults and errors based on a
number of factors discussed above.
4.4.2 Fault Prevention Versus Fault Tolerance: Which One Can Address System
Failure Better?
There are two main approaches to building a reliable embedded system: fault pre-
vention and fault tolerance. Fault prevention attempts to eliminate the possibility
of a fault occurring in a system while it is operational. Fault tolerance is the ability
of a system to deliver expected performance even in the presence of faults. Both
these approaches attempt to produce reliable systems that have well-defined failure
modes. As discussed earlier in this section, faults lead to errors and errors cause fail-
ure. Therefore, failure mode analysis needs to consider different types and sources
of faults and errors that lead to system failure as shown in Figure 4.11.
some reason or another, hardware components will fail (with some probability of
failure). Moreover, the fault prevention approach is considered unsuccessful when
either the frequency or duration of repair is too high or if the system is inacces-
sible for maintenance and repair activities (e.g., a satellite in space or a mission to
another planet).
make an informed decision on the level of fault tolerance that needs to be incorpo-
rated into a design, we need to answer these three questions (Figure 4.13):
•• How critical is the component? For example, will the sensor design platform
ordered by SensorsThink be considered a highly critical system if it were to
be used as one of the control systems in a nuclear power plant or in a fighter
jet?
•• How likely is the component to fail? Some components, such as the outer casing
of the sensor design platform, are not likely to fail, so no fault tolerance is
needed.
•• How expensive is it to make the component fault tolerant? Requiring a re-
dundant Zynq processor in the sensor design platform, for example, would
likely be too expensive both economically and in terms of weight, space, and
power consumption to be considered.
Fault tolerance techniques for embedded systems can broadly be classified into
two categories: (1) hardware fault tolerance techniques, and (2) software fault tol-
erance techniques. It needs to be understood that both these fault tolerance catego-
ries have been well studied and researched over the past few decades. However,
embedded system design is a special case where both techniques are implemented
together and with equal importance, because the reliability of an embedded system is a function of the reliable design of both its hardware and its software.
Figure 4.14 The TMR technique for hardware fault tolerance. All three processor units (PR1, PR2,
and PR3) process the same input and the majority voter (MV) compares the results and outputs the
majority vote (two of three).
A TMR arrangement comprises three identical subcomponents and a majority voting circuit. All three subcomponents process the same input, and the majority voter compares the outputs from all three. If any one output differs from the other two, that subcomponent is masked out and considered faulty.
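As a minimal sketch of the two-out-of-three voting step only (not the redundant hardware itself), the following C function compares three replicated results and masks a single disagreeing unit; the type and function names are illustrative assumptions, not part of the SoC Platform design.

#include <stdint.h>
#include <stdbool.h>

/* Hypothetical result of a two-out-of-three majority vote.
 * Unit indices: 0 = PR1, 1 = PR2, 2 = PR3. */
typedef struct {
    uint32_t value;   /* voted output                         */
    bool     valid;   /* false if all three results disagree  */
    int      faulty;  /* index of the masked unit, -1 if none */
} tmr_result_t;

/* Compare the three redundant outputs and mask a single disagreeing unit. */
static tmr_result_t tmr_vote(uint32_t pr1, uint32_t pr2, uint32_t pr3)
{
    tmr_result_t r = { 0u, true, -1 };

    if (pr1 == pr2 || pr1 == pr3) {
        r.value = pr1;
        if (pr1 != pr3)      r.faulty = 2;   /* PR3 disagrees */
        else if (pr1 != pr2) r.faulty = 1;   /* PR2 disagrees */
    } else if (pr2 == pr3) {
        r.value  = pr2;
        r.faulty = 0;                        /* PR1 disagrees */
    } else {
        r.valid = false;                     /* no majority at all */
    }
    return r;
}

In an FPGA such as the Zynq programmable logic, the same vote would normally be implemented as combinational logic rather than software, but the masking behavior is identical.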
Error detection and exception handling are desirable and advantageous in each version of the software being used in a multiversion scheme. Two of the most widely used multiversion software fault tolerance techniques are the recovery block (RB) scheme and the N-version programming (NVP) scheme. Of these two, NVP is a static method, like the triple modular redundancy used in hardware fault tolerance, whereas the RB scheme is a dynamic redundancy approach. Both NVP and RB require design diversity (for the various alternate versions of the software) as well as hardware redundancy to implement. Design diversity is a software design approach in which different versions of the software are built independently but deliver the same service. The fundamental assumption of design diversity is that components built differently will fail differently.
NVP is a static scheme in which the output is computed using N independently designed software modules, or versions, whose results are sent to a decision algorithm that determines a single result. The primary belief behind the NVP scheme is that the “independence of programming efforts will greatly reduce the probability of identical software faults occurring in two or more versions of the program.” The first step in the NVP scheme is the generation of N ≥ 2 functionally equivalent programs (also called alternate versions) from the same initial specification. Because the goal of the NVP scheme is to minimize the probability of similar errors in each version of the software component, it encourages the use of different algorithms, programming languages, environments, and tools wherever possible. With such broad diversity in the design of the different versions of the same software component, NVP requires considerable development effort and increases problem complexity. According to some studies, a 25% rise in problem complexity can lead to a 100% rise in program complexity. However, in the case of NVP, the increase in complexity is not greater than the inherent complexity of building a single version.
The different versions of the program execute concurrently with the same inputs, and their results are compared by a decision algorithm (driver), as shown in Figure 4.16. The decision algorithm should be capable of detecting the software versions with erroneous outputs and preventing the propagation of bad values to the main output, and it should be developed with the security and reliability of the application in mind. In applications that require embedded system design using SoCs, the NVP scheme is quite common. Significant advancements in the chip design industry are leading to greater design complexity; therefore, the probability of a design fault is higher because complete verification of the design is very difficult to achieve within a given timeframe. The use of N versions of components allows the continued use of chips with design faults as long as their errors at the decision points are not similar.
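To make the decision algorithm concrete, the C sketch below runs N versions (passed in as function pointers, purely an illustrative interface) on the same input and accepts a value returned by a strict majority of all N versions; a production driver would also handle timeouts, tolerance-based comparison of results, and error reporting.

#include <stddef.h>
#include <stdbool.h>

/* Each independently developed version exposes the same interface
 * (an assumption for this sketch): it maps an input to an output and
 * returns true on success. */
typedef bool (*version_fn)(double input, double *output);

/* Run all N versions on the same input and accept a value produced by a
 * strict majority of all N versions. Returns true if a majority agreed. */
static bool nvp_decide(version_fn versions[], size_t n,
                       double input, double *decision)
{
    double results[16];      /* sketch: supports up to 16 versions */
    size_t produced = 0;

    if (n == 0 || n > 16) {
        return false;
    }

    for (size_t i = 0; i < n; i++) {
        double out;
        if (versions[i](input, &out)) {
            results[produced++] = out;   /* keep only successful versions */
        }
    }

    /* Count exact agreements; real drivers typically compare within a
     * tolerance rather than exactly. */
    for (size_t i = 0; i < produced; i++) {
        size_t votes = 0;
        for (size_t j = 0; j < produced; j++) {
            if (results[j] == results[i]) {
                votes++;
            }
        }
        if (votes > n / 2) {             /* more than half of all N versions */
            *decision = results[i];
            return true;
        }
    }
    return false;    /* no majority: propagate the failure upward */
}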
The efficiency of the NVP scheme depends on a number of factors, primarily the initial design specification. If the specification is incomplete or incorrect, the same error will manifest in all versions of the software, rendering the scheme useless. Design diversity is also key to the success of the NVP scheme. Finally, the development budget limits the number of independent versions of the software that can be used. A three-version scheme will require roughly three times the budget of a single version, which raises another question: would the resources and time needed to implement an N-version scheme be better spent producing a more robust single version? The answer probably depends on the industry for which the product is intended, but most companies will opt for a single, highly robust version rather than an N-version scheme.
The recovery block scheme, shown in Figure 4.17, uses alternate versions of the software to perform run-time error detection using an acceptance test applied to the results delivered by the first (primary) version. If the system fails the acceptance test, the state is restored to what existed prior to the execution of that algorithm and an alternative module is executed. If the alternative module also fails the acceptance test, the program is restored to the recovery point and yet another module is executed. Recovery is considered complete when the acceptance test is passed. A sample test script for a recovery block scheme is shown in Figure 4.18. Checkpoint memory is needed to recover the state after a version fails, providing a valid starting operational point for the next version.
Figure 4.17 The recovery block scheme consists of three key elements: a primary module to ex-
ecute critical software functions, an acceptance test (AT) for the output of the primary module, and
alternate modules, which perform the same functions as the primary module and deliver output in
the event of a fault with the primary module.
If all modules fail, then the system itself fails and recovery must take place at
a higher level.
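The control flow just described can be sketched in C as follows; the state size, module interface, and acceptance-test signature are illustrative assumptions rather than the actual SoC Platform software.

#include <stddef.h>
#include <stdbool.h>
#include <string.h>

#define RB_STATE_SIZE 64u

/* Illustrative state and module interfaces for a recovery block. */
typedef struct { unsigned char data[RB_STATE_SIZE]; } rb_state_t;
typedef bool (*rb_module_fn)(rb_state_t *state);       /* primary/alternates */
typedef bool (*rb_accept_fn)(const rb_state_t *state); /* acceptance test    */

/* Try the primary module first, then the alternates; restore the checkpoint
 * before each retry. Returns false only if every module fails the test. */
static bool recovery_block(rb_state_t *state,
                           rb_module_fn modules[], size_t n_modules,
                           rb_accept_fn acceptance_test)
{
    rb_state_t checkpoint;
    memcpy(&checkpoint, state, sizeof(checkpoint));      /* establish recovery point */

    for (size_t i = 0; i < n_modules; i++) {
        if (modules[i](state) && acceptance_test(state)) {
            return true;                                 /* acceptance test passed */
        }
        memcpy(state, &checkpoint, sizeof(checkpoint));  /* roll back and try next */
    }
    return false;   /* all modules failed: recovery must occur at a higher level */
}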
While both the NVP and RB schemes are quite useful in providing software fault tolerance, there are some key differences between them, summarized in Table 4.4.
With the ever-increasing complexity and omnipresence of embedded systems in different aspects of our everyday lives, the comparison between the NVP and RB schemes deserves far greater depth than a single section of this book can provide, so we have taken only a cursory view of the most commonly used hardware and software fault tolerance techniques with which readers should be acquainted. For a deeper understanding of this topic, there are some outstanding texts available in the market.
4.6 Worst-Case Circuit Analysis
Worst-case circuit analysis (WCCA) is a technique commonly used in industry to specify the boundary parameters for the operating conditions over the useful life of a product. WCCA examines the effects of potentially large variations in the components used in the embedded system design, beyond their initial tolerance levels. While it may not seem obvious, variation in component parameters by just a few percentage points can lead to a big difference in system reliability under the worst-case scenario. The variations can be the result of aging or environmental influences, which can cause circuit outputs to drift out of specification.
WCCA, combined with failure mode and effects analysis (FMEA) and MTBF analyses, is essential to the design of any reliable system; however, there is one key difference. WCCA is a quantitative analysis technique, as opposed to the part-based approach used for stress testing, FMEA, and MTBF analysis (Figure 4.19). WCCA determines the mathematical sensitivity of circuit performance to parameter variations (for components and subsystems) and provides both statistical and nonstatistical methods for handling the variables that affect system performance. In a nutshell, WCCA is used to predict the most likely scenarios under which there is a high probability of system failure.
The WCCA technique usually involves a considerable amount of planning and data collection before it can be performed on a circuit. Before commencing, we should first document the theory of operation and develop a functional breakdown of the circuit (block diagrams and outlines of critical subcomponents). We should also have access to the technical documentation for most of the parts used in the design (see step C below). The flowchart in Figure 4.20 presents a general overview of the WCCA technique and demonstrates the interrelationship of the various tasks required to complete a WCCA.
The worst-case limits are computed from the nominal value, the summed bias terms, and the root-sum-square of the random effects:

Worst-Case Minimum = Nominal value − (Nominal value × ∑ Negative biases) − Nominal value × √(∑ (Random effects)²)

Worst-Case Maximum = Nominal value + (Nominal value × ∑ Positive biases) + Nominal value × √(∑ (Random effects)²)
Example 4.10
Using the table below, determine the worst-case minimum and maximum values for a 1,200-μF CLR capacitor. These parameters are used to determine the potential resultant effect of CLR capacitor drift on circuit applications.

Capacitance Parameter              Negative Bias (%)   Positive Bias (%)   Random (%)
Initial Tolerance at 25°C          —                   —                   18
Low Temperature (−20°C)            25                  —                   —
High Temperature (+80°C)           —                   15                  —
Other Environments (Hard Vacuum)   18                  —                   —
Radiation (10 kR, 10¹³ N/cm²)      —                   9                   —
Aging                              —                   —                   12
TOTAL VARIATION                    43                  24                  √((20)² + (10)²) = 22.4
Solution:
The worst-case minimum and maximum occur when the variation due to bias has the same sign as the random variation, hence the effects of both are added:
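A hedged numerical completion, using the tabulated totals (negative bias 43%, positive bias 24%, random 22.4%):

Worst-Case Minimum = 1,200 μF × (1 − 0.43 − 0.224) ≈ 415 μF
Worst-Case Maximum = 1,200 μF × (1 + 0.24 + 0.224) ≈ 1,757 μF

These figures follow directly from the formulas above; the published result may differ slightly if the random contributions are totaled differently.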
While designers can choose any of the worst-case design techniques listed in Table 4.7, the most common industry practice is to perform Monte Carlo analysis (MCA), due to the availability of software tools that can help accomplish the task. LTspice is a free SPICE simulation tool that is widely used and, although its user interface and features do not match the standards of licensed software such as Altium Designer, one can easily perform MCA on smaller circuits in very little time.
There are several circuit simulators available for engineering analysis and verification of electrical circuits based on SPICE technology. These simulators use SPICE technology because each element of an electric circuit can be represented by a mathematical model, and the set of element models together with their interconnections forms the model of the electric circuit. In this manner, the circuit simulator can perform mathematical calculations on electric circuits, which allows it to carry out a number of computed experiments (such as WCCA using the Monte Carlo method) with high reliability.
EVA (extreme value analysis)
Advantages:
•• Does not require statistical inputs for circuit parameters (easiest to apply);
•• Database needs only supply part parameter variation extremes (easiest to apply);
•• If a circuit passes EVA, it will always function properly (high confidence for critical production applications).
Disadvantages:
•• If a circuit fails EVA, further analysis is needed to assess risk (modify the circuit to meet EVA requirements, or apply RSS or MCA for less conservatism).

RSS (root-sum-squared)
Advantages:
•• More realistic estimate of worst-case performance than EVA;
•• Knowledge of the part parameter probability density function (pdf) is not required;
•• Provides a limited degree of risk assessment (percentage of units to pass or fail).
Disadvantages:
•• Standard deviation of the piece-part parameter probability distribution is required;
•• Assumes circuit sensitivities remain constant over the range of parameter variability;
•• Assumes circuit performance variability follows a normal distribution.

MCA (Monte Carlo analysis)
Advantages:
•• Provides the most realistic estimate of true worst-case performance;
•• Provides additional information in support of circuit/product risk assessment.
Disadvantages:
•• Requires use of a computer;
•• Consumes a large amount of CPU time;
•• Requires knowledge of the part parameter pdf.
However, MCA has some practical limitations. First, SPICE models are not available for all the components used in a design, so it may be a tedious task to find one for the component that you want to use. While most leading manufacturers provide SPICE model libraries, many others do not, so one needs to be aware of this when performing MCA. Second, it takes a long time to assign SPICE model parameters to all circuit components and then use the model to vary the parameters and observe and record the outputs for the worst-case scenario. Performing the analysis on the entire circuit at once also pushes the WCCA towards the end of the project cycle, which is not an ideal scenario if the analysis results do not match the specifications. For example, the design used in this book has approximately 941 components; assigning tolerance values to each of them and running a SPICE simulation for the entire circuit would be an onerous task with so many component variations happening simultaneously.
Therefore, another standard industry practice is to perform worst-case analysis on smaller circuit blocks as they are being designed and tested, which limits any potential surprises down the line. Figure 4.21 shows the Altium Designer schematic of the LDO voltage regulator circuit used in the SoC Platform design project for SensorsThink. Let us use this circuit as an example to estimate the worst-case output voltage range. In our design, we have used an input voltage tolerance of 2% and a component tolerance of 1%, as shown in Table 4.8.
In order to perform WCCA using MCA, we used LTSPICE to run a dc sweep
of the input voltage from 2V to 3.3V, as shown in Figure 4.22.
Ideally, the dc sweep should be run from 0 to 3.3V, but since the ideal output
voltage is 2.8V, we are only interested in input values of 2.8V and higher, as lower
input values in the dc sweep analysis only give a zero output. We are interested in
checking the maximum and minimum output swing around the ideal output volt-
age of 2.8V. Moreover, to keep it simple to understand, we have assumed that the
Figure 4.21 Schematic of an LDO voltage regulator circuit used in SoC Platform project.
Figure 4.22 LTSPICE schematic of an LDO voltage regulator circuit with SPICE modeling
components.
Figure 4.23 LTSPICE dc sweep simulation of an LDO voltage regulator circuit with voltage supply
tolerance = 2% and resistance/capacitance tolerance = 1%.
Figure 4.24 LTSPICE schematic of an LDO voltage regulator circuit with higher tolerance values for
SPICE modeling components.
Figure 4.25 LTSPICE dc sweep simulation of an LDO voltage regulator circuit with voltage supply
tolerance = 3% and resistance/capacitance tolerance = 5%.
With these higher tolerance values, the simulated output voltage is between 2.64V and 2.96V, which is a significant variation from the ideal 2.8-V output level.
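As a rough software cross-check of this kind of Monte Carlo sweep, the C sketch below randomizes the reference voltage and feedback divider of a generic adjustable LDO, modeled as VOUT = VREF × (1 + R1/R2), within the stated tolerances and records the extreme outputs. The reference voltage and resistor values are illustrative assumptions chosen to give a 2.8-V nominal output; they are not the actual SoC Platform component values, and the model ignores load and temperature effects.

#include <stdio.h>
#include <stdlib.h>
#include <time.h>

/* Return a uniformly distributed value in [nominal*(1-tol), nominal*(1+tol)]. */
static double with_tolerance(double nominal, double tol)
{
    double u = (double)rand() / RAND_MAX;        /* 0..1 */
    return nominal * (1.0 - tol + 2.0 * tol * u);
}

int main(void)
{
    const double vref  = 1.2;      /* assumed reference voltage (V)      */
    const double r1    = 40.2e3;   /* assumed upper feedback resistor    */
    const double r2    = 30.1e3;   /* assumed lower feedback resistor    */
    const double tol_v = 0.02;     /* 2% reference/supply tolerance      */
    const double tol_r = 0.01;     /* 1% resistor tolerance              */
    const int    runs  = 100000;

    double vmin = 1e9, vmax = -1e9;
    srand((unsigned)time(NULL));

    for (int i = 0; i < runs; i++) {
        double v = with_tolerance(vref, tol_v) *
                   (1.0 + with_tolerance(r1, tol_r) / with_tolerance(r2, tol_r));
        if (v < vmin) vmin = v;
        if (v > vmax) vmax = v;
    }
    printf("Monte Carlo worst case: Vout = %.3f V to %.3f V\n", vmin, vmax);
    return 0;
}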
Example 4.11
A schematic diagram of a clock generator circuit used in the SoC Platform design project for SensorsThink is shown below. We will perform a Monte Carlo analysis of the worst-case performance of this circuit when the supply voltage tolerance is 2% and the resistance or capacitance tolerance is 2%. Assume that the LTC6908 oscillator has 0% tolerance (ideal).
Solution:
In order to understand this better, let us run the Monte Carlo simulation for this circuit in two steps. In the first case, we run a transient analysis of the circuit and observe the impact of the supply voltage variation while keeping the other components ideal. The LTspice schematic of this circuit with component parameters and tolerances is shown below.
Running a transient analysis on this circuit for 20 μs with a 2% voltage tolerance and 0% (ideal) tolerance for all other components gives the following output.
It can be observed that the 2% supply voltage variation only affects the output voltage level, which varies between 3.2V and 3.4V. These values are approximately equal to the voltage variation calculated theoretically using VOUT = VIN × (1 ± 0.02).
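As a quick hedged check, assuming a 3.3-V nominal supply (inferred from the observed output levels), VOUT = 3.3 × (1 ± 0.02) gives approximately 3.23V to 3.37V, consistent with the simulated 3.2V to 3.4V swing.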
It is important to note that the LTspice simulation examples given above assume that the designer has a good knowledge of SPICE modeling and circuit analysis. It is beyond the scope of this book to present a tutorial on SPICE modeling; however, there are some very good reference materials in print and online that can get one started with SPICE modeling.
Finally, it is important to understand that reliability and worst-case analyses for a given circuit or system are only as good as the robustness of the system specifications and the reliability of the individual components. While ideally we would like everything to be 100% reliable and fault tolerant, that is never going to happen, due to all the factors affecting the design cycle discussed earlier. All we can do is manage the reliability of a circuit within a reasonable range that takes into account the application and the operating environment. If the design itself cannot bring the reliability numbers within a reasonable range, then the only other (more expensive) option is to provide redundancy to improve the fault tolerance of the circuit or system.
Selected Bibliography
Avizienis, A., “Fault Tolerance and Fault Intolerance: Complementary Approaches to Reliable Computing,” Proc. 1975 Int. Conf. on Reliable Software, Los Angeles, CA, April 21–27, 1975, pp. 458–464.
Avizienis, A., “The N-Version Approach to Fault-Tolerant Software,” IEEE Transactions on Software Engineering, Vol. SE-11, No. 12, December 1985, pp. 1491–1501.
Ireson, W. G., C. F. Coombs, and R. Y. Moss, Handbook of Reliability Engineering and Management, 2nd ed., New York: McGraw-Hill, 1996.
Laprie, J. C., “Dependable Computing and Fault Tolerance: Concepts and Terminology,” Proceedings of the 15th International Symposium on Fault-Tolerant Computing (FTCS-15), 1985, pp. 2–11.
LTspice, Analog Devices.
O’Connor, P. D. T., and A. Kleyner, Practical Reliability Engineering, New York: Wiley, 2012.
RAC Publication CRTA-WCCA, Worst Case Circuit Analysis Application Guidelines, 1993.
About the Authors
Dan Binnun has been involved in product development for nearly his entire career,
spanning companies as small as 12 people to large, multinational, publicly traded
corporations. He was exposed early in his career to product development and re-
search and development processes and has a wide range of experiences that has
given him an engineering process and development perspective.
Mr. Binnun has expertise in electrical system design, embedded system architec-
ture, and printed circuit board design. He founded E3 Designers to aid his growth
as an engineer. He graduated from the University of Hartford in 2007 with a bach-
elor’s degree in electrical engineering. Since graduating, Mr. Binnun has continued
his education in engineering by taking courses in EMI/EMC and high-speed digital
design. You can connect with him on LinkedIn at www.linkedin.com/in/dbinnun/
or follow E3 Designers at https://www.linkedin.com/company/e3-designers.