CONTENTS
Introduction
IoT Fundamentals
Fog Computing
Edge Computing
IoT Security
Threat Vectors
Authentication
Authorization
Network Segmentation
Network Visibility
Summary
Review Questions
References
Cloud Fundamentals
Essential Characteristics
Service Models
Public Cloud
Private Cloud
Community Cloud
Hybrid Cloud
Multicloud
Performance Routing
Cloud Security
Workload Migration
Compute Virtualization
Virtual Machines
Containers
Virtualization Functions
Cloud Connectivity
AWS
Multicloud Connectivity
Cisco SD-WAN
Virtual Switching
Kubernetes
Volumes
Labels
Kubernetes Cluster
Kubernetes Networking
Creating a Pod
OpenStack
Artifact Repositories
Multitenant
Application Migration
Summary
Review Questions
References
Northbound API
Southbound API
NETCONF
ConfD
DevNet
Discover
Technologies
Community
Support
Ansible
gRPC
Summary
Review Questions
References
INTRODUCTION
Cisco is once more leading the way in building a workforce capable of moving with
technological changes through the evolution of its certification programs. Changes
to the Expert-Level (CCIE/CCDE) programs will enable candidates to bridge their
core technology expertise with knowledge of the evolving technologies that
organizations are adopting at an accelerated pace, such as cloud, IoT, and network
programmability.
Combining this book with the other Cisco Press certification books that are written
for a specific track will provide a complete source of knowledge to help CCIE and
CCDE candidates succeed on their written exams.
The most important and somewhat obvious goal of this book is to help you pass
the written CCIE and CCDE exams. One key methodology used in this book is to
help you discover the exam topics that you need to review in more depth, to help
you fully understand and remember those details, and to help you prove to
yourself that you have retained your knowledge of those topics. This book does not
try to help you pass by memorization, but helps you truly learn and understand the
topics.
This book is not designed to be a general networking topics book, although it can
be used for that purpose. This book is intended to tremendously increase your
chances of passing the evolving technologies components of the CCIE and CCDE
written exams. Although other objectives can be achieved from using this book, the
book is written with one goal in mind: to help you pass the exam.
Although this book could be read cover to cover, it is designed to be flexible and
allow you to easily move between chapters and sections of chapters to cover just
the material that you need to review further.
The questions for each certification exam are a closely guarded secret. However,
we do know which topics you must know to successfully complete the evolving
technologies portion of all CCIE/CCDE-level written exams. Cisco publishes them
as an exam blueprint for CCIE/CCDE Evolving Technologies. Table I-1 lists each
exam topic listed in the blueprint along with a reference to the book chapter that
covers the topic. These are the same topics you should be proficient in when
working with cloud, IoT, and network programmability in the real world.
Each version of the exam can have topics that emphasize different functions or
features, and some topics can be rather broad and generalized. The goal of this
book is to provide the most comprehensive coverage to ensure that you are well
prepared for the exam. Although some chapters might not address specific exam
topics, they provide a foundation that is necessary for a clear understanding of
important topics.
It is also important to understand that this book is a “static” reference, whereas the
exam topics are dynamic. Cisco can and does change the topics covered on
certification exams often.
This exam guide should not be your only reference when preparing for the
certification exam. You can find a wealth of information available at Cisco.com that
covers each topic in great detail. If you think that you need more detailed
information on a specific topic, read the Cisco documentation that focuses on that
topic.
Note that as technologies continue to evolve, Cisco reserves the right to change the
exam topics without notice. Although you can refer to the list of exam topics in
Table I-1, always check Cisco.com to verify the actual list of topics to ensure that
you are prepared before taking the exam. You can view the current exam topics on
any current Cisco certification exam by visiting the Cisco.com website, hovering
over Training & Events, and selecting from the Certifications list. Note also that, if
needed, Cisco Press might post additional preparatory content on the web page
associated with this book at
http://www.ciscopress.com/title/9780789759726. It’s a good idea to check
the website a couple of weeks before taking your exam to be sure you have up-to-
date content.
As with any Cisco certification exam, you should strive to be thoroughly prepared
before taking the exam. There is no way to determine exactly what questions are
on the exam, so the best way to prepare is to have a good working knowledge of all
subjects covered on the exam. Schedule yourself for the exam and be sure to be
rested and ready to focus when taking the exam.
The best place to find out the latest available Cisco training and certifications is
under the Training & Events section at Cisco.com.
Exam candidates never really know whether they are adequately prepared for the
exam until they have completed about 30 percent of the questions. At that point, if
you are not prepared, it is too late. The best way to determine your readiness is to
work through the “Review Questions” at the end of each chapter and review the
corresponding section for any questions you answered incorrectly. It is best to
work your way through the entire book unless you can complete each subject
without having to do any research or look up any answers.
We also recommend that you periodically check back to the Cisco Press web page
associated with this book to view any errata or supporting book files that may be
available.
The Internet of Things, also known as IoT, has become the latest industry
buzzword. So what exactly does it mean? At its simplest, IoT is a network of things
(or devices) that traditionally have not been part of a computer network, in the way
that printers, laptops, servers, and cell phones have. With all these things connected
to the network, data can be collected from them and analyzed in ways that were almost
impossible to imagine before. This information can be used to change the ways
people have done things in the past and improve the way we live, work, play, and
learn.
Over the past years, specific companies have dominated their respective markets.
The barrier to entry into those markets has been high, making it difficult for
new companies to compete.
For example, ride-share companies like Uber and Lyft have revolutionized and
transformed the taxi industry with the use of smartphones. Customers request a
ride from their phone, are picked up, and are then transported to their destination
seamlessly. The process of hailing a taxicab or trying to find a taxicab company
(which varies from city to city) is simplified. Drivers can choose to work a number
of hours based on their availability, making it easy for them to maintain work-life
balance—and making it more desirable employment than working for a cab
company. Digitization offers all these benefits to customers and employees while
providing an experience that is better, faster, and cheaper than taking a traditional
taxi.
Other companies like Airbnb and VRBO have disrupted the hotel markets. In fact,
most companies have realized that they need to embrace the digitization process
and look at other business models so that they can maintain their existing
customer base while acquiring new customers through different business use
cases.
IoT is transforming a wide range of industry verticals, including the following:
Manufacturing
Mining
Oil and gas
Utilities
Smart buildings
Health and medical
Retail
Hospitality
Transportation
Connected cities and emergency services
IoT Fundamentals
The first computer network derives from the mainframe architecture. Mainframe
computers contained massive amounts of processing power, storage, and memory.
Mainframes were very expensive to acquire and operate, but they could run
multiple programs simultaneously, making them cheaper than other computer
systems at the time. Direct console access to the mainframe was limited, and users
were forced to connect to the mainframe from “thin” (dumb) terminal clients via the
first computer network.
Just as the design of these IT networks evolved as more and more computers were
connected to them, the network architecture for IoT networks must evolve too.
Traditional IT networks follow the common Access-Distribution-Core model that
can connect thousands of devices.
Note
OT and IoT networks are often maintained by engineers who are not from IT
departments and hence might not be familiar with traditional IT best practices.
IoT networks are similar to the first computer networks, which ran multiple
network protocols like IPX/SPX, NetBEUI, and TCP/IP. IoT networks must support
a range of devices that use a variety of protocols. This is because the hardware
refresh cycle is longer than that of corporate IT systems. As new systems are installed in
some parts of a facility, the older systems using different protocols are not
updated. Connecting devices that use different protocols requires some skill, and
the architecture must also be able to scale to thousands of things while providing
security.
A standards-based reference architecture provides benefits in the IoT space,
considering that multiple things made by different manufacturers often must
communicate using different protocols. The M2M IoT architecture focuses on IoT
services, applications, and networks by providing interoperability through a
variety of application programming interfaces (APIs):
Physical Devices (Layer 1): The bottom layer, which contains things
(devices, sensors, and so on).
Connectivity (Layer 2): This layer provides connectivity among the things in
Layer 1 and up to the edge computing functions in Layer 3. Communications
and connectivity are concentrated in this one level.
Edge Computing (Layer 3): The functions in this level are determined by
the need to convert data into information that is ready for processing at a
higher level.
Data Accumulation (Layer 4): This layer is responsible for storing data that
was traditionally transmitted live across the wire. The storage of the data
allows for analysis or computation at a later time.
Data Abstraction (Layer 5): This layer is responsible for rendering data
and its storage in ways that enable the development of faster or simpler
applications. It is responsible for reconciling multiple data formats from
different sources, assuring consistent semantics, and confirming that the
data is complete.
Application (Layer 6): This layer is where information is analyzed and
interpreted.
Collaboration and Processes (Layer 7): This layer encompasses people
and processes. In essence, this layer is responsible for providing people the
right data, with the right analysis, at the right time so that they can engage
the correct process.
Multiple other IoT reference models have continued to evolve over the years. Some
of the models have minor differences between them, whereas others have major
differences in how they interact. A simplified IoT Architecture consists of the
following:
Things
Communications
IoT applications
Figure 1-3 demonstrates the components of the Common IoT model that will be
discussed in the following sections.
Things
A smart thing (also known as a smart object) can take relevant information from
the data it receives and take appropriate action after analysis of that information.
Some of the newer wireless electronic thermostats are examples of smart things
because they have the capability to be programmed remotely.
Some electronic thermostats can poll a remote temperature sensor and average
its reading with that of the local sensor to identify the overall temperature before
turning the HVAC unit on or off. To some people that might not seem very “smart.”
However, what if the electronic thermostat could check two different types of
sensors? The electronic thermostat could check a temperature sensor and a sensor
attached to a door to see if the door is open. If the door is open, turning on the
HVAC system would result in a waste of electricity, and now the thermostat has
relevant information about whether the cooled air would just flow out through the
open door.
Note
Vibration, energy consumption, and location sensors are used in some of the most
common business use cases.
Vibration: Placing vibration and torque sensors on motors and machines allows
computers to detect when a failure will occur on a device (motor, engine, truck,
robot, and so on). This allows maintenance to be scheduled in advance and repairs
to be made without affecting other processes. For example, if a rock truck breaks
down in a mine, it blocks the tunnel, preventing other rock trucks from passing in
the tunnel. The broken-down rock truck must be towed out of the tunnel to allow
operations to be restored. During this time, the mine cannot haul raw product.
Energy consumption: Users and businesses are now capable of tracking the
amount of energy being consumed in real time. This may result in changes in how
a business uses a building or machine in an attempt to lower its cost. However, by
connecting these sensors with the utility companies, businesses can now create
smart grids. This allows companies to detect power outages and pinpoint failure
locations quicker so that service can be restored. Another benefit of using a smart
grid is that it allows consumers to use green technologies like solar power to run
their homes, or return power to the grid.
Location: Location sensors allow for real-time location services (RTLS) such as the
following:
Enabling the transportation industry to track its vehicles and make that
information available to its customers.
Locating specific resources (forklifts, torque tools, laptops, and so on)
quickly. These sensors can also be used for inventory management for
manufacturers.
Locating employees. Generally, this is done to locate an employee with a
specific skillset, but could also be used for man-down safety reasons in
chemical/refinery environments.
Restricting or prohibiting certain things or devices from entering a specific
area. Once an area is defined in the RTLS system, a tag crossing the boundary
creates an alarm. This is known as geo-fencing (see the sketch following this
list).
Table 1-1 provides a high-level overview of typical IoT sensors used in various
market verticals.
The verticals covered are Manufacturing; Mining; Oil and Gas; Utilities; Smart
Buildings; Health and Medical; Retail; Hospitality; Transportation; Connected
Roadways; and Emergency Services. The sensor types mapped against them are
Temperature, Humidity, Vibration, Pressure, Occupancy, Air Quality, Water Quality,
Energy Consumption, and Location.
A variety of factors need to be taken into account concerning the type of smart
thing that should be used. Specifically, the type of data and the business use case
should always dictate which sensors are used. The selection of things should also
take into account the following items:
Power source: Is the power source wired or battery driven? If battery driven, how long can the battery last,
and how often does it need to be replaced or recharged?
Ability to move: Is the thing physically attached to an immovable object
(steel beam, pole, and so on) or is it mobile via a wearable, attached to an
animal or a cart?
Frequency that data is transmitted: How often is data expected to be
transmitted from the thing? The amount of data and frequency can directly
correlate to the life of the thing before a temporary power source must be
recharged or replaced.
Connectivity method (wired or wireless): How does the thing
communicate with other devices? Does it use a wired technology (serial,
Ethernet, and so on) or does it use a wireless technology? The transmission
rate for that media and distance to the next upstream network device should
be accounted for, too.
Density of things: How many things are needed in an area to effectively
capture data? Can data be transmitted through other things like a mesh
network?
Processing capability: Not all IoT devices have the same amount of
processing power or storage. These components vary based on the function
and size of the device and directly correlate to the cost.
Communications
From the connectivity context, the gateway connects PANs, HANs, LANs, FANs,
and LPWANs to the backbone network. Gateways connect to a backbone
network to provide a high-speed and hierarchical architecture.
Gateways can provide a method of aggregation of things and for processing data at
the edge of the network. This concept is explained further in the “Data
Transportation and Computation” section.
Note
The network protocol should be an open standard considering that most IoT
networks are built from devices and things from different manufacturers.
Proprietary protocols would require translation and introduce complexity,
because things that communicate with proprietary protocols cannot interoperate
with other manufacturers’ things that do not have access to those protocols. The
Internet Protocol (IP) has been well established and
has become one of the de facto network protocols for IoT.
Note
Network protocols include the IPv6 protocol too. The use of IPv6 is encouraged
because the address space does not have the limitation of the IPv4 protocol. In fact,
multiple IoT specifications such as 6LoWPAN and RPL do not include IPv4.
Data Exchange Protocols: The flow of data between things and systems must
either be initiated or requested. Some sensors will push data at regular intervals,
whereas other sensors require polling of information (also known as a pull). Table
1-2 displays multiple data exchange protocols and some key factors used for
selecting a data exchange protocol.
IoT Applications
Now that all of these smart things can communicate with each other, the true
power of IoT can be realized with the usage of applications. IoT applications are
classified as analytic and control applications.
An example of an IoT analytic application could be one that collects the amount of
energy consumed in a manufacturing plant on a thing-by-thing basis. The analytic
application could provide reports that contain power consumption on a daily basis
that could be correlated with the types of machines and product yield for that day.
The plant planner could make changes to his schedule or process based on this
information to improve efficiency.
IoT Control Application: This application controls smart things. A key concept is
that the IoT control application contains the ability to process logic that the smart
thing does not contain, or to provide a level of orchestration across multiple things.
An example of an IoT control application comes from the oil and gas segment,
where oil needs to be moved from one location to another. The application would
open the necessary valves in the pipeline and then start the pumps so that oil can
flow from one location to another. In this example, the valves would not have any
knowledge of the pumps, or vice versa, and the control application would
orchestrate turning the pumps on/off and opening/closing the valves.
It is important to note that some IoT applications can perform both control and
analytic functions.
Note
Smart services are IoT applications that increase overall efficiency. For example, a
program on a thermostat that can detect if a door is open and overrides the
thermostat from turning on the HVAC system can be considered a smart service.
The design of an IoT application must take into account the data structure,
frequency, size, and volume of the data transmitted. All of this information directly
correlates to the volume of the data transmitted from things to the IoT
applications:
For example, if a thing transmits 100KB of data every minute and there are
1,000 such things sending data, then the daily data amounts to only 144GB. So
even if a thing sends more data, if the frequency is lower, then the overall amount
of data is lower.
Normal corporate networks use a hierarchical network model that divides the
network into three layers: core, distribution, and access. Most end-user devices
connect to the access layer, and the data center (DC) connects to the core devices
(or a server switch block that connects to the core). Most of the data is generated,
stored, and processed in the data center.
However, IoT networks generate almost all of their data at the access layer. IoT
networks are generally very large, and they must scale appropriately to handle the
data from all of the smart things. The data structure and number of smart things
directly correlate to the volume of traffic that is analyzed.
A large IoT network with a million sensors that transmit 15KB of data every 30
seconds can result in 1.8 terabytes (1,800,000MB) of data on an hourly basis (~16
petabytes of data annually). IoT networks of this size must take into account the
amount of available bandwidth the access layer has. Depending on the physical
access medium, an update could saturate the link.
Figure 1-5 demonstrates four things that transmit via radio towers that can only
support 150KB of network traffic at a time. The size of an update is 45KB, which
can easily pass across such a link. However, when Thing 1’s packet is combined with the
updates for Things 2, 3, and 4, the total data exceeds the supported speed of the
radio towers. There is not enough bandwidth for all four things’ updates.
Connected vehicles can generate large amounts of data that exceed 20GB per day.
The transfer of this data to the manufacturer on a daily basis is not necessary
because most of the relevant processing should be performed within the vehicle.
Only a small subset of the data generated is needed by the manufacturer to
understand how the vehicle can be improved based on its usage.
Data centers are the heart of delivering IT services by providing storage, compute
resources, communications, and networking to the devices, users, and business
processes and applications. Data centers are designed with proper power and
cooling requirements to sustain large amounts of servers for the processing of
data. Recently, data analytics have added value and growth in data centers by
providing insight into many aspects of an enterprise and its processes.
Latency can become an issue when sending all the data from sensors to the cloud or
DC. The time required for a packet to travel from a thing to the DC, for the data to
be analyzed there, and for the proper response to be sent back to the thing could
exceed an acceptable amount of time. For example, suppose a camera takes a photo
of a manufacturing line as parts leave a stamping press for visual inspection as part
of the quality control process. Inferior parts should be discarded and removed
from the conveyor belt as they are moved to the next station. If there is too much
latency, a failed part could move down the conveyor belt. One solution could be to
lower the speed of the conveyor belt, but that might reduce the performance of
that manufacturing facility.
Fog Computing
Fog computing provides a mechanism for reducing data crossing the network by
converting flows into information that is ready to be stored or processed at higher
levels. The processing occurs in the gateways and network devices.
Note
The term “fog computing” comes from the fact that the processing is occurring
close to the data source, which is similar to how traditional fog is close to the
ground. This is in direct contrast to “cloud computing,” which is far away from the
source and often outsourced to a third party.
Fog computing can reduce the amount of northbound network traffic through
analysis and reduction of data transmitted. The reduction of data occurs outside of
typical network data compression techniques. Most network data compression is a
data de-duplication algorithm that examines crossing between two devices for
redundant patterns. This means that a device examines packets looking for
patterns in incremental sizes (for example, 4KB, 16KB, and so on) and creates a
signature for each of those patterns. If the pattern is sent a second time, the data is
replaced with a signature by the first network device. The signature is sent across
the WAN link and replaced with the data by the second network device. This
drastically reduces the size of the packet as it crosses the WAN, but still keeps the
original payload between the devices communicating.
Analysis and reduction of data could result in changing the way the data flows. For
example, if a sensor is sending temperature data from a sensor in 5-second
intervals that includes just the sensor’s IP address, temperature, and timestamp,
this could result in a 5KB packet being sent every 5 seconds (60KB a minute).
However, if the temperature is constant, the fog device could send the first update,
which includes the sensor’s IP address, temperature, and start time, but not
provide an update if the temperature stays the same. When the temperature does
change, the fog device would send a 6KB update that includes the sensor’s IP
address, temperature, and the start and stop times for that temperature cycle. It
would then send a second update with the sensor’s IP address, the new
temperature, and start time.
Just within the timespan of a minute, the fog device reduces 60KB of traffic to 11KB
if the temperature does not change. Now if the temperature remains consistent for
1 hour, that would result in a difference of 3.6MB of data, compared to 11KB,
which is a drastic reduction in data transmitted.
Note
Another common use case for fog computing is the translation of information
between different device vendors. For example, one sensor may report
temperature in Fahrenheit and a different sensor may report temperature in
Celsius. It is important for the data analysis to be consistent, so the fog device
could translate both sensors’ temperature to Kelvin before transmitting for
computation.
Edge Computing
Fog computing delivers benefits to IoT applications at a very quick pace
because of its proximity to the devices and sensors. Those same advancements can
be refined further by increasing the amount of compute resources on the actual
smart things. This allows for basic low-level analytics to be performed, and for
even faster decisions to be made. In addition, edge computing can still occur and
provide feedback to the local operations in the event of a network failure.
Note
The term “mist computing” has been assigned to edge computing because mist is
even lower to the ground when compared to fog.
The use of edge, fog, and centralized computing (data center and cloud) should be
structured in a hierarchical manner to provide the most efficient form of
processing: edge computing for the fastest low-level decisions on the things
themselves, fog computing for local aggregation and reduction of data, and
centralized computing for long-term storage and heavy analysis.
The IoT network architecture needs to take into account multiple factors, such as
the amount of data that the things generate, the density of things (from a cell, area,
or region), the network access medium, application latency requirements, and the
duration for which data is retained.
IoT Security
The following list details the terms and functions associated with data
confidentiality, data integrity, and data availability:
Data confidentiality: Ensuring that only authorized users and systems can read
the data, typically by encrypting data in transit and at rest.
Data integrity: Ensuring that data is not altered or tampered with, typically by
using hashing or digital signatures.
Data availability: Ensuring that the network is always available allows for
the secure transport of the data. Redundancy and proper design ensure data
availability.
Threat Vectors
Most enterprise environments use a castle-and-moat security theory, relying
heavily on security technologies (such as firewalls and intrusion prevention
systems) applied only at the perimeter of the network.
Starting at the bottom of the Common IoT model, things become the first point of
compromise. Some of these things have very small and limited processing
capability. Most of the smaller devices probably will not have a method to secure
access to their operating system. Even if they did, how can 20,000 things be
patched upon discovery of a security exploit? And that assumes the
device manufacturer is even aware of the exploit and produces a patch.
The next threat vector comes from the fact that a majority of the manufacturing
protocols are insecure. When Profinet, CIP, EtherNet/IP, and Modbus were built, the
programmable logic controllers (PLCs), robots, and other devices all had minimal
compute and memory resources, so the protocol was built with efficiency in mind.
Security was never taken into consideration because the devices operated in a
closed-loop environment.
The physical access can present challenges to securing the network, too. Most OT
engineers are concerned with basic connectivity when connecting devices with
Ethernet. OT engineers would deploy unmanaged network switches that could not
support 802.1x network access control (NAC). The more advanced OT engineers
have learned the benefits of a managed switch, but still have yet to deploy 802.1x
NAC. In either scenario, anyone can walk up to a switch and have direct access to
PLCs, robots, or other components of a network.
Note
Many companies have had their OT networks impacted or shut down because a
vendor attached a device to the network that lacked InfoSec protections matching
the company standard and accidentally released viruses into the company’s
network.
Note
Just like some manufacturing protocols do not have security mechanisms included,
some of the wireless mediums like LPWAN are built for small packets using low
bandwidth, so the use of encryption technologies could have a significant impact
on the amount of traffic that can be transmitted across them.
Developing a security strategy for an IoT network must take into account the OT
network priorities:
Availability
Integrity
Confidentiality
Availability takes priority because an IoT network directly correlates with the
capability to generate revenue. If an IoT network is down, it could result in
hundreds of dollars to millions of dollars of lost revenue per hour. These values are
calculated by loss of goods manufactured, wasted man-hours (assembly workers
are still paid even if a robot fails on the line), wasted raw product, or financial
penalties for failing to meet service level agreements (SLAs) for delivery of utilities
(that is, power 99.999% of the time).
Note
Some industries contain InfoSec engineers for IT systems, but they do not have
InfoSec engineers for OT systems. This results in different levels of security at the
OT level, depending on the facility.
Deploying proper security mechanisms in an IoT network involves the
technologies and techniques described in the following sections.
Most IT-based systems provide authentication within the application itself. For
example, a login is required to access a file server or an email account. So even if a
user was able to gain access to the network, they would still have to bypass the
security mechanisms for that application.
However, OT systems do not have these security components integrated into them
because they were developed in a closed-loop system. This means that access to
the network needs to be restricted to only approved users and devices. When NAC
is deployed, users can plug in their computers to a network port, but they will not
be able to send or receive network traffic until the NAC system grants them access.
Authentication
The central component to NAC is the authentication layer used to provide and
verify the identity information of an endpoint. In IT environments, the identity is
associated to a user ID; however, in OT environments, the identity must be linked
to a device. The identity could be a MAC address, static user credentials,
certificates, or some other form of hardware identification.
Authorization
Vendors are further expanding the capabilities of NAC by adding context. For
example, a U.S.-based employee may be able to authenticate to the network while
visiting his company’s office in Australia, but the NAC system can detect that he is
out of the country and restrict his access. Context within NAC can now take into
account items such as user identity, device type, location, time of day, and
device posture.
Most IoT devices and applications do not have a built-in authentication mechanism
to prevent unauthorized access. This limitation can be overcome by deploying NAC
globally in an environment to restrict access to devices by user ID and dynamic
ACLs, TrustSec, or similar technologies.
Network Segmentation
The role of the network engineer is to ensure that network traffic can flow from
one device to the other, and the role of an InfoSec engineer is to put security
measures and controls in place to keep the network safe. The Purdue model does a
great job of forcing segmentation between the IT and OT networks, but IT and OT
networks are still vulnerable to east/west traffic. For example, assume that a
user/device becomes infected with a computer virus in one portion of the network.
Because access is not restricted in an area, the virus can spread freely to all the
devices within the same network segment.
Outbreaks like this have forced network engineers to become more conscious of
security. They are being forced to understand the traffic patterns between devices
and insert security mechanisms to permit only business-relevant traffic between
areas of the network. This same concept needs to be applied to the OT space
because most systems only need to communicate within their area. It is the fog or
edge devices that need to communicate northbound, and that traffic could then be
inspected further by firewalls or intrusion prevention systems.
Figure 1-7 demonstrates a set of four manufacturing cells. Each cell contains its
own HMI stations, PLCs, input/output blocks, and robots. There is no interaction
between these cells. Proper network segmentation would block communication
between cells at the area switch. This can be accomplished with ACLs, private
VLANs, or by forcing traffic through a firewall at a higher level.
Note
Network segmentation should occur at every level possible: the cell, area, and
zone. Conduits are then created to allow proper communication between devices
that need to talk to each other.
Network Visibility
Historically, InfoSec engineers would review the logs and information provided
by their firewalls and intrusion prevention systems to detect security
incidents. This information would be fed into a security information and event
management (SIEM) system that would cross-correlate events to detect an attack.
However, the ratio of security devices to hosts or network devices is very small,
and these security devices might not detect every security incident.
More advanced security departments have realized that they can collect
information from network devices (routers and switches) that can provide
additional contextual information for traffic patterns. Specifically, collecting
information about network traffic (source/destination IP address, protocol, and
port) through NetFlow can provide relevant data. This data can then be processed
with analytics to create a baseline of normal network traffic. If a host starts to
transmit data that is irregular from a new application or volume, it could alert
InfoSec engineers of a potential security incident so that it can be investigated. This
same type of analytics can also be used by NAC to help provide context to the
device and identify the type or function of the device as part of the authorization
process.
Note
There are significant advances in network visibility systems that provide direct
interaction with NAC systems. For example, a threat could be detected on the
network visibility system that forwards commands to the NAC system. The NAC
system notifies the switch to shut down the switch port that the attacker/infected
PC is connected to.
How can an organization be assured that the vendor has verified that it is safe to
operate a device (like a robot) without causing injury to someone nearby? This
question alone reiterates the fact that no traffic from the enterprise level should be
allowed direct access to the manufacturing/industrial zone.
The best way is to provide a jump box system in the industrial DMZ. A jump box is
a dedicated computer with all of the appropriate tools to manage devices in the OT
portion of the network. The vendor then establishes connectivity to the OT
network through the jump box in the DMZ. For example, an on-site technician can
implement all the proper physical safety controls and then launch a screen share
session on the jump box.
Note
Remote access via dial-up modem, LTE router, or other VPN device that is placed in
the OT network could represent a threat. Some vendors will place these devices as
part of their design, unbeknownst to the customer, as a backdoor. In some countries,
this is an illegal activity for certain industries like utility companies.
Summary
The Internet of Things has grown drastically over the past years. Businesses are
now realizing the value that can be extracted from sensors when combined with
real-time analytics. These technologies are allowing them to digitize their business
model and reinvent themselves by inserting technology into markets that reside
outside of the tech industry. These technologies provide them with a differentiator,
thus allowing them to be more efficient and profitable. When designing an IoT
network, the following aspects must be taken into account:
Things: What type of thing is being used, how much data is being sent, how
frequently is the data being sent, and how many things are deployed in an
area?
Communications: What network medium is being selected, is a gateway
being used, what does the backhaul network look like, what network
protocol is being used, and which protocol is being used to exchange
information between devices?
Computation models: Are the things capable of edge processing, or can we
use fog-based processing to reduce latency and bandwidth requirements on
the network?
Security: How can we ensure secure communication between the devices,
and how can the number of threat vectors be reduced?
The widespread adoption of IoT devices has begun. Some companies have
migrated their product portfolio to IoT devices because of the future growth.
Analysts predict that there will be 50 billion devices that produce over 3.3
zettabytes (that is, 3,300,000,000,000,000,000,000 bytes) of traffic in the near
future (2021).
Review Questions
1. Edge nodes
2. Fog nodes
3. Data centers
4. Mainframe computers
5. Cloud service providers
2. A fog node can process data and send the data to another fog node. True or
False?
1. True
2. False
3. IoT networks are easier to secure than traditional IT networks. True or False?
1. True
2. False
4. When defining an IoT security model, which factor is the most important to
consider?
1. Cost of solution
2. Availability
3. Complexity
4. Integrity
5. Confidentiality
6. None of the above
5. What is one significant factor that must be taken into consideration with an OT
network as compared to an IT network?
1. Data confidentiality
2. Safety
3. Data integrity
4. OT protocol capability for encryption
5. Remote access for vendors
References
Hanes, David, Gonzalo Salgueiro, Patrick Grossetete, Rob Barton, and Jerome Henry.
IoT Fundamentals. Indianapolis: Cisco Press, 2017. Print.
Cloud Fundamentals
Performance, Scalability, and High Availability
Security Implications, Compliance, and Policy
Workload Migration
Compute Virtualization
Virtualization Functions
Cloud Connectivity
Automation and Orchestration Tools
Cloud Fundamentals
Essential Characteristics
The following are the five essential characteristics, as defined by NIST, that all
cloud services should offer. If any of them is lacking, it is likely not a cloud
service.
On-demand self-service: Consumers can provision computing capabilities
automatically, without requiring human interaction with the provider.
Broad network access: Capabilities are available over the network and accessed
through standard mechanisms by heterogeneous client platforms.
Resource pooling: The provider’s resources are pooled to serve multiple
consumers using a multitenant model, with resources dynamically assigned
according to demand.
Rapid elasticity: Capabilities can be elastically provisioned and released to
scale rapidly with demand, often appearing to the consumer to be unlimited.
Measured service: Resource usage is automatically monitored, controlled, and
reported, providing transparency for both the provider and the consumer.
Note
One big difference between cloud computing and traditional virtualization is that
although virtualization can abstract resources from the underlying physical
infrastructure, it lacks the orchestration to pool them together and deliver them to
consumers on demand and without manual intervention.
Service Models
The following sections describe the different cloud service models available today.
Infrastructure as a Service (IaaS)
Figure 2-2 illustrates that with IaaS, the service provided to the customer is
virtualization, servers (compute capacity), storage, networking, and other
fundamental computing resources where the customer is able to deploy and run
any OS, middleware, and applications. The customer does not manage or control
the underlying cloud infrastructure but has control over operating systems,
storage, and deployed applications. The cloud service providers may also include
limited control of networking components (such as a virtual switch).
This type of service model is great for IT departments because it allows them to
migrate their on-prem applications easily into the cloud.
Platform as a Service (PaaS)
PaaS (IBM Bluemix is one example) is very similar to IaaS; the only difference, as
illustrated by Figure 2-3, is that
PaaS also provides the OS as well as development application platforms (that is,
the ability to run PHP, Python, or other programming languages), databases, file
storage, collaboration, machine learning, big data processing, and so on. The
consumer can deploy third-party applications supported by the CSP or they can
develop their own application in the cloud without worrying about the
complexities of managing the underlying infrastructure services.
This type of service model is ideal for developers because they don’t have to worry
about having to install a database or any middleware. With a PaaS service model,
all of this is included and there is no need to manage the underlying servers,
networks, or other infrastructure. All that needs to be provided is the application
and the application data.
Software as a Service (SaaS)
In this type of service model, everything is provided by the cloud service provider
except the application data, as illustrated in Figure 2-4. This is the type of service
that everyone uses on a daily basis; for example, web-based email (such as Gmail),
Dropbox, Facebook, and so on, are all SaaS services.
Salesforce.com
SAP Business by Design
Oracle on Demand
Office365
Cisco Webex
Box
Everything as a Service (XaaS)
This can be seen as a combination of IaaS, PaaS, and SaaS where everything can be
a service. Here are some examples:
Desktop as a Service
Backup as a Service
Database as a Service
Security as a Service
IP Telephony as a Service
Public Cloud
For this type of cloud deployment model, the cloud infrastructure is available to
the general public over the Internet. This type of cloud is owned, managed, and
operated by the CSP. Some examples include Amazon Web Services (AWS),
Microsoft Azure, and Google Cloud Platform (GCP).
Private Cloud
For this type of cloud deployment model, the cloud infrastructure is provisioned
for exclusive use by a single organization. It can be built with platforms such as
the following:
OpenStack
Microsoft Azure Stack
VMware vCloud Suite Private Cloud
Cloud providers can also emulate a private cloud within a public cloud
environment (think of it as a cloud within a cloud). Amazon Web Services and
Google Cloud Platform call this type of private cloud a Virtual Private Cloud (VPC),
whereas Microsoft Azure calls it a Virtual Network (VNet).
A VPC or VNet isolates resources for a cloud tenant from other tenants through a
private IP subnet and a virtual network segment defined by the user.
Community Cloud
Community clouds are very common in the public sector and government because
those organizations must meet regulatory standards, which are described later in
this chapter in the “Security Implications, Compliance, and Policy” section. Some
examples include the following:
AWS GovCloud
Google Apps for Government
Microsoft Government Community Cloud
Salesforce Community Cloud
Capital Markets Community Platform (NYSE)
Healthcare Community Cloud (Carpathia)
Hybrid Cloud
Figure 2-5 illustrates a hybrid cloud, which is when a private cloud and a public
cloud combine and are bound together by standardized or proprietary technology
that enables data and application portability (that is, cloud bursting for load
balancing between clouds).
Multicloud
A multicloud solution uses services from more than one public cloud provider.
Benefits of adopting multicloud include avoiding lock-in to a single provider,
choosing best-of-breed services from each provider, and improving resiliency by
spreading workloads across providers.
Performance, Scalability, and High Availability
Elasticity can give the illusion of infinite cloud resources while addressing
performance, scalability, and high availability concerns.
One key factor that can have an impact on application performance is that most
traditional applications do not take the network characteristics into account and
typically rely on protocols like TCP for communicating between different systems.
Traditional IT applications tend to be chatty and are usually designed for LAN
environments that can provide high-speed bandwidth and are not easily
congested. When an application resides across a WAN, most users assume the
application responsiveness or performance is directly related to the available
bandwidth on the WAN link. While bandwidth is one of the factors that can affect
application performance, there are other factors that can also affect performance,
such as path latency, congestion, and application behavior.
When those applications are migrated to the cloud, they are typically accessed
across the WAN from branch locations. To make sure the application performance
is not affected, consistent, high-quality performance with maximum reliability and
minimum latency is required on the WAN link, and this can be achieved with WAN
optimization.
Cisco Wide Area Application Service (WAAS) and Akamai Connect technologies
provide a complete WAN optimization and application acceleration solution for
overcoming WAN performance issues that can have a direct negative impact on
application performance, which consequently affects end-user productivity. They
are transparent to the endpoints (clients and servers) as well as devices between
the WAAS/Akamai Connect devices.
Cisco WAAS and Akamai Connect also provide a method of caching objects locally.
Caching repeat content locally shrinks the path between two devices and can
reduce latency on chatty applications. For example, suppose the latency between a
branch desktop and the local object cache (WAAS/Akamai Connect) is 10ms, and it
takes 100ms to retrieve a file from the cloud. Only the initial file transfer would
take the 100ms delay, and subsequent requests for the same file are provided
locally from the cache with the 10ms response time.
A detailed discussion of QoS is outside the scope of this book. Further information
on QoS deployment can be found in the Cisco Press publication End-to-End QoS
Network Design: Quality of Service for Rich-Media & Cloud Networks, Second Edition,
by Tim Szigeti, Christina Hattingh, Robert Barton, and Kenneth Briley.
Performance Routing
Cloud bursting is a way to scale out a private cloud into a public cloud whenever
there is an increase in demand that goes above a specified threshold. When the
demand decreases, the public cloud resources are released. Another way to look at
cloud bursting is as elasticity in a multicloud environment that allows scaling out
and in between cloud providers.
A good use case for cloud bursting would be retail; for example, if a good volume of
sales is expected during the holiday season, more resources from the private cloud
could be required during that time. If the private cloud does not have enough
resources to cope with the increase in service demands, it can then burst into the
public cloud, leveraging extra capacity as required, and when the resources are not
needed, they can be released and the consumer can just pay for the resources that
were used.
Cloud providers offer service level agreements (SLAs) in which they indicate the
monthly uptime percentage of availability/uptime for their services. For example,
at the time of writing, AWS EC2, Google Cloud Platform, and Microsoft Azure all
offer 99.99% availability/uptime, and if they do not meet the agreement, they all
offer financial credit that varies, depending on the severity of the downtime and
other stipulations in their SLAs. In other words, 99.99% availability/uptime
essentially implies that AWS, Google, and Microsoft Azure can go down every day
for about 8.6 seconds (roughly 52.6 minutes over an entire year). If
the downtime goes beyond this, then the cloud consumer is eligible for financial
credit.
Cloud security is more than just securing the cloud; it is also about the new
security implications that come with the shift in how the networks and endpoints
access data and applications. Organizations accessing data and applications in the
cloud have shared risks and responsibilities with the CSPs that vary depending on
which type of cloud service model is in use.
The cloud service models (IaaS, PaaS, and SaaS) can be thought of as stacks, where
IaaS is at the bottom of the stack, with PaaS in the middle and SaaS at the top. In
this stack, the security responsibilities are shared between the cloud customer and
the CSP, where at the bottom of the stack the consumer has the most responsibility
and, as you move further up the stack, the CSP has the most responsibility. In other
words, security responsibilities in the cloud map to the degree of control that the
cloud customer or CSP has over the service model; this is known as the shared
responsibility model.
Figure 2-7 shows an example of the shared responsibility model: at the IaaS end
of the stack the customer is mostly responsible, and as you move up the stack
toward SaaS, the CSP becomes mostly responsible.
Cloud Security Alliance’s “Security Guidance for Critical Areas of Focus in Cloud
Computing v4.0” includes recommendations that directly correlate with the shared
responsibility model: cloud providers should clearly document their internal
security controls and customer security features, and cloud users should build a
responsibilities matrix documenting who implements which controls and how.
The Cloud Security Alliance (CSA) provides two tools to help meet these
requirements:
The Consensus Assessments Initiative Questionnaire (CAIQ), a standard template
cloud providers use to document their security and compliance controls.
The Cloud Controls Matrix (CCM), a framework of cloud-specific security controls
mapped to leading standards, best practices, and regulations.
With cloud computing environments, data can reside anywhere and services are
delivered on demand to any endpoint. Public clouds allow organizations to reduce
their IT infrastructure costs, as well as management costs, by storing data assets in
a multitenant, but secured, cloud-hosted environment. Organizations can also build
their own private clouds that can deliver cloud-based services to their own
organization.
There are different regulatory compliance laws for different verticals, such as the
following:
Payment Card Industry Data Security Standard (PCI DSS): This is for
companies that handle credit card information, and it serves to protect
customer data in an attempt to reduce credit card fraud.
Federal Risk and Authorization Management Program (FedRAMP) and
the Federal Information Security Management Act (FISMA) / NIST 800-
53: These are for government agencies and their service providers. They
assist in assessing and meeting FISMA requirements to attract government
agency business moving to the cloud as part of FedRAMP.
Health Insurance Portability and Accountability Act (HIPAA) and
Health Information Technology for Economic and Clinical Health
(HITECH): These are for the healthcare segment, bringing multilocation
medical centers and healthcare organizations into compliance.
International Organization for Standardization (ISO) 27001 (2013):
Provides requirements for establishing, implementing, maintaining, and
continuously improving an information security management system (ISMS).
ISO 27018 (2014): Code of practice for protection of personally identifiable
information (PII) in public clouds acting as PII processors.
ISO 27017 (2015): Code of practice for information security controls based
on ISO / International Electrotechnical Commission (IEC) 27002 for cloud
services.
Service Organization Controls (SOC), SOC1 / SOC2: SOC 1 is the reporting
option for which the Statement on Standards for Attestation Engagements
(SSAE) 16 professional standard is used, resulting in a SOC 1 SSAE 16 Type 1
and/or a SOC 1 SSAE 16 Type 2 report. SOC 2 is the reporting option
specifically designed for many of today’s cloud computing services, SaaS, and
technology-related service organizations.
Independent parties need to perform periodic audits for most of these standards to
validate an organization’s continuous compliance. Cloud environments and their
security risks are taken into consideration in such audits.
The CSA Top Threats working group conducted a survey to compile industry
expert professional opinions on the greatest security issues within cloud
computing. With the survey results and their expertise, they crafted a report titled
“The Treacherous 12—Top Threats to Cloud Computing in 2016.” Table 2-1
displays the 12 top threats included in the report. They are arranged based on
order of severity, as per the survey results, where 1 is the highest severity.
10. Abuse and nefarious use of cloud services: This includes using cloud services
to launch DDoS attacks, email spam and phishing, digital currency mining, and
hosting of malicious or pirated content. Mitigation: a CSP must have an incident
response framework to address misuse of resources.
11. Denial of service: This is an attack that consumes an inordinate amount of
cloud resources or targets a specific vulnerability to make the cloud service slow
or completely unavailable to other cloud users. Mitigations: employ distributed
denial of service (DDoS) attack detection, put a mitigation plan in place, and use
multicloud.
12. Shared technology vulnerabilities: CSPs rely on multitenant environments,
which can lead to shared technology vulnerabilities that can potentially be
exploited in all delivery models. Mitigations: multifactor authentication and
intrusion prevention systems (IPS).
Note
Keep in mind that these security issues are not unique to cloud computing and
could happen in a traditional IT environment.
To craft appropriate cloud security strategies, the CSA recommends using its
threat research report in conjunction with the latest available versions of its
“Security Guidance for Critical Areas of Focus in Cloud Computing” and its Cloud
Controls Matrix (CCM).
Cloud Security
Cloud services provide reduced cost and efficiency gains for businesses as long as
security policies, processes, and best practices are taken into account. If not,
businesses are vulnerable to security data breaches or other threats, as mentioned
earlier in this chapter, which can eliminate any benefits gained from switching to
cloud technology.
Table 2-2 includes the most important Cisco security products shown in
Figure 2-8, covering their capabilities, their form factors, and the part of the
security ecosystem in which they operate.
Advanced Malware Protection (AMP): Antimalware and malware sandboxing.
Operates in: endpoint, network, cloud. Form factors: endpoint agent, NGIPS (AMP
for Networks), Next-Generation Firewall, Meraki MX UTM platform, branch router
(ISR), cloud service.
AnyConnect: Network Admission Control (NAC) and virtual private network (VPN).
Operates in: endpoint. Form factors: endpoint agent.
Cloudlock: User behavior analytics (UEBA), data security, DLP, and cloud access
security broker (CASB). Operates in: cloud. Form factors: cloud service.
Cisco Email Security Appliance (ESA): Email security. Operates in: network,
cloud. Form factors: physical or virtual appliance.
Firepower (NGFW, NGFWv): Firewall. Operates in: network. Form factors: physical
or virtual appliance.
Firepower Threat Defense (FTD): IPS. Operates in: network. Form factors:
physical or virtual appliance.
Cisco Identity Services Engine (ISE): NAC and context. Operates in: network.
Form factors: physical or virtual appliance.
Stealthwatch: Flow analytics. Operates in: network. Form factors: physical or
virtual appliance.
Stealthwatch Cloud: UEBA and flow analytics. Operates in: cloud. Form factors:
cloud service.
Tetration: Flow analytics. Operates in: network, cloud. Form factors: endpoint
agent, cloud service.
Umbrella: Domain Name System (DNS) security and web security. Operates in:
cloud. Form factors: cloud service.
Cisco Web Security Appliance (WSA, WSAv): Web security. Operates in: network,
cloud. Form factors: physical or virtual appliance.
Workload Migration
There are also performance-intensive applications that are better suited to run on
bare-metal servers (for example, trading applications where fast transactions
dictate revenue). Furthermore, security compliance and regulations may keep
certain applications from running in public clouds.
When evaluating which applications or workloads are suitable for the cloud, an
organization should build a matrix that assesses each application against criteria
such as the following:
Feature requirements
Service model (IaaS, PaaS, or SaaS)
Cloud deployment (public, private, community, or hybrid)
Performance sensitivity
Availability requirements
Application migration priority
Application complexity
Type of application (development, test, and so on)
Security considerations (using CSA’s security tools)
Regulatory compliance
Dependencies to applications that cannot be migrated
Business impact (critical or not)
Benefit of migrating
Networking requirements
Hardware dependencies
Software dependencies
Migration complexity
Applications that can be used as pilots
After identifying the applications that are suitable for migration, the organization
should identify a migration strategy for each and add them to the matrix as well.
Amazon AWS has identified six different migration strategies (called the 6 R’s) that
they see customers implementing when migrating applications to the cloud:
Rehost (“lift and shift”): Move the application to the cloud as-is.
Replatform (“lift, tinker, and shift”): Make a few cloud optimizations without
changing the core architecture.
Repurchase (“drop and shop”): Move to a different product, typically a SaaS
offering.
Refactor/re-architect: Reimagine the application using cloud-native features.
Retire: Decommission applications that are no longer needed.
Retain: Keep applications on-premises for now and revisit them later.
Note
AWS’s 6 R’s migration strategies build upon the 5 R’s migration strategies
originally outlined by Gartner.
After the migration strategy is completed, the remaining migration steps follow.
As can be seen, migrating applications can be quite involved; fortunately, Cisco has
a solution that can simplify this whole process called Cisco CloudCenter (CCC).
The CCC solution, along with AppDynamics, iQuate, and CloudEndure, can help
seamlessly migrate applications to the cloud and monitor them there. CloudCenter
is discussed in the “Automation and Orchestration Tools” section of this chapter.
Compute Virtualization
VMs and containers are discussed in this section, and for you to be able to clearly
understand their differences, it is necessary to understand all the following basic
server components:
Central processing unit (CPU): The CPU (aka the processor) is without a
doubt the most important component of a computer. It is responsible for the
majority of processing jobs and calculations.
Internal storage: The most common internal storage devices are hard drives,
which are a permanent or persistent form of storage. This is where the
operating system, applications, and application data are typically stored.
Main memory: Main memory, also known as random access memory (RAM),
is a fast, volatile (temporary) form of storage that is constantly being
accessed by the CPU. If the CPU had to constantly access the hard drive to
retrieve every piece of data it needs, it would operate very slowly.
When an application runs, its most important components that need to be accessed
by the CPU are loaded into RAM, and other pieces of the application are loaded as
required. Once the application is running in RAM, any file that is opened with the
application is also loaded into RAM. When you save the file, it is saved to the
specified storage device (for example, the hard drive), and when you close the
application, the application as well as the file are purged from RAM to make room
for other applications to run.
There are two types of kernels: monolithic and microkernel. The difference
between the two is that a monolithic kernel executes all its services and functions
in the kernel space, which makes the kernel rigid and difficult to modify and
enhance, whereas a microkernel only executes basic process communication and
I/O control in the kernel space while the remaining system services such as the file
system, device drivers, and so on, are executed in the user space. This makes
microkernel OSs more flexible and allows for easy enhancements and
modifications, which makes them ideal for cloud computing. Recent versions of
Windows and macOS are built on hybrid kernel designs that incorporate
microkernel concepts.
Virtual Machines
One key capability of VMs is that they can be migrated from one server to another
while preserving transactional integrity during movement. This can enable many
advantages; for example, if a physical server needs to be upgraded (for example, by
adding more memory), the VMs can be migrated to other servers with no
downtime. Another advantage is that it provides high availability; for example, if a
server fails, the VMs can be spun up on other servers in the network, as illustrated
in Figure 2-12.
Containers
Figure 2-13 shows a side-by-side comparison of VMs and containers. Notice how
each VM requires an OS and that containers all share the same OS while remaining
isolated from each other.
Containers, on the other hand, share the underlying resources of the host
operating system, and each application, along with the dependencies that it needs
to run, is completely isolated, which makes the applications very lightweight (small
size) and portable (easy to move/migrate). In other words, a container is typically
just a tarball (that is, an archive file similar to a ZIP file) that packages the code
along with the runtime and dependencies the application needs to run.
A container does not try to virtualize a physical server like VMs do; instead, the
abstraction is the application or the components that make up the application.
Here is one more example to help you understand the difference between VMs and
containers: When a VM starts, the OS needs to load first, and once it’s operational,
the application in the VM can then start and run, which usually takes minutes.
When a container starts, it leverages the kernel of the host OS, which is already
running, and it typically takes a few seconds to start.
As previously mentioned, one of the benefits of containers is that they make it easy
for developers to know that their software will run, no matter where it is deployed,
without having to worry about any dependencies their apps or components might
have. Another benefit of containers is that they facilitate applications based on a
microservices architecture. Instead of having one large, monolithic application, a
microservices architecture breaks an application down into multiple, smaller
components that can communicate with each other, and each of these components
can be placed into a container. This facilitates a continuous integration/continuous
delivery (CI/CD) approach, which increases feature velocity; in other words,
different development teams can more easily work on different components of an
application, and as long as they don’t make any major changes to how those
application components interact, they can work independently of each other. This
allows for delivering small batches of software into production continuously.
Cloud-native applications and services are specifically built for the cloud to
leverage all the advantages of cloud computing (availability, elasticity, and so on),
and they typically use containerized microservices. The reason for using
containers and not VMs is that VMs are too heavy (since they emulate a full
physical server with an OS) and are very slow to start, whereas containers are very
lightweight and start fast.
Cloud-native applications or services are often developed using the 12 Factor App
(https://12factor.net/) design methodology as a baseline. This methodology
dictates that applications employ a number of best practices to ensure portability,
flexibility, and resiliency. The 12 Factor App methodology was created by
developers at Heroku and was presented for the first time by Adam Wiggins circa
2011.
Virtualization Functions
NFV provides benefits similar to those of server virtualization and cloud
environments.
NFV infrastructure (NFVI) is all the hardware and software components that
comprise the platform environment in which virtual network functions are
deployed. It is the section highlighted in green in Figure 2-14.
A VNF, as its name implies, is the virtual or software version of an NF, and it
typically runs on a hypervisor as a VM. VNFs are commonly used for L4–L7
functions such as those provided by load balancers (LBs) and application delivery
controllers (APCs), firewalls, intrusion detection systems (IDSs), WAN
optimization appliances, and so on. However, they are not limited to L4–L7
functions; they can also perform lower-level L2–L3 functions, such as those
provided by routers and switches.
The NFV Orchestrator is responsible for creating, maintaining, and tearing down
VNF network services. If multiple VNFs are part of a network service, the NFV
Orchestrator enables the creation of an end-to-end network service over multiple
VNFs. The VNF Manager manages the lifecycle of one or multiple VNFs as well as
FCAPS for the virtual components of a VNF. The NFV Orchestrator and VNF
Manager together are known as NFV Management and Orchestration (MANO).
Figure 2-16 shows Cisco’s NFV solution architecture that was designed to meet the
NFVI standards as well as the requirements just discussed. The NFVI components
are shown in the green box, which should be looked at as a single integrated
platform (composed of software and hardware) that can be used to deploy VNFs.
The solution has been tested to onboard simple and complex VNFs from multiple
third-party vendors spanning different network services—for example, routing,
firewalls, session border controllers, Virtualized Evolved Packet Core (vEPC),
Virtualized Policy and Charging Rules Function (vPCRF), Virtualized IP Multimedia
Subsystem (vIMS), and so on.
The components shown at the top of Figure 2-16 include Cisco’s and third-party
VNFs as well as Cisco’s Network Services Orchestrator (NSO) as the NFV
orchestration component and Cisco’s Elastic Services Controller (ESC) as the VNF
Manager. These components, although part of Cisco’s NFV solution, are optional,
and the customer may choose any third-party components that meet the NFV
requirements.
There are currently three Cisco-integrated solutions available that rely on Cisco's
NFV solution offering.
Cloud Connectivity
AWS
Speeds of 1Gbps to 10Gbps are available through the co-location option for a single
link. Speeds of 50Mbps to 500Mbps can be ordered from any APN partner
supporting AWS Direct Connect.
Microsoft’s direct network connectivity option ExpressRoute (ER) allows for the
extension of a private network to any of Microsoft’s cloud services, including
Microsoft Azure, Office 365, and Dynamics 365. A customer’s WAN router connects
to one of many peering locations and gains access to all regions within the
geopolitical region. A premium add-on is available to extend connectivity across
geopolitical regions around the world.
Cloud Interconnect provides two options for extending the customer’s private
network into their GCP VPC networks.
The Dedicated Interconnect option supports speeds of 10Gbps per circuit with a
maximum of eight circuits per Dedicated Interconnect connection. If less speed is
desired, the Partner Interconnect option offers speeds from 50Mbps up to
10Gbps per VLAN attachment.
The “big three” CSPs (AWS, Microsoft Azure, and GCP) as well as other CSPs such as
IBM have global infrastructures that are built around regions and zones. How each
CSP defines regions and zones varies slightly, but in the end, they all serve the
same purpose. In this section, AWS will be used as an example to define regions
and zones.
AWS has multiple regions around the world. A region is an independent separate
geographic area, and each region has multiple, isolated availability zones (AZs) that
are connected to each other through low-latency links, as illustrated in Figure 2-17.
Customers can deploy their applications and databases across multiple AZs within
a region to make them highly available, fault tolerant, and scalable.
In addition, to increase redundancy and fault tolerance even further, AWS allows
for replicating applications and data across multiple regions. Resources aren’t
replicated across regions unless explicitly specified.
Multicloud Connectivity
Organizations all over the world are steadily adopting multiple CSPs, in part due
to the different cloud services and functionalities each one offers.
However, most organizations are not fully migrating their applications and data to
the CSPs; instead, CSPs are becoming extensions to their on-prem or private-cloud
environments where workloads and data are expected to move across their WAN
to multiple CSPs, as well as between CSP VPCs/vNETs, while providing secure
connectivity.
One of the ways to achieve multicloud connectivity is by using a virtual router such
as the CSR 1000V router. The CSR 1000V ensures secure, scalable, and consistent
connectivity for multicloud networking. It can run on VMware ESXi, Red Hat KVM,
Citrix Xen, and Microsoft Hyper-V, as well as on Microsoft Azure and Amazon Web
Services. Some of the security features it supports include IPSec VPNs and built-in
zone-based firewalls. It also supports Cisco Digital Network Architecture (DNA)
Encrypted Traffic Analytics (ETA), which has the ability to find threats in
encrypted traffic.
Cisco has co-developed solutions using the CSR 1000V with cloud providers such
as Transit VPC/VNET. The Transit VPC solution, illustrated in Figure 2-18, is a hub-
and-spoke design built on AWS by deploying two CSRs in the transit VPC (the hub)
for redundancy and using AWS VGWs (virtual private gateways) on the spoke
VPCs, which host customer applications. The spoke VPCs join the transit VPC via
automation, and this allows for spoke VPCs to be able to communicate with each
other. The on-prem network can also be extended into the cloud securely using
IPSec encryption through the Transit VPC. This is useful for enterprise customers
using multiple VPCs for different departments or projects, such as Development
VPC, Production VPC, and Test VPC, and they need connectivity between VPCs as
well as to on-prem resources.
Campus Fabric: An evolved campus network that allows for host mobility
without stretching VLANs, network segmentation without Multiprotocol
Label Switching (MPLS), and RBAC without end-to-end support for TrustSec.
SD-Access can be used to extend TrustSec into AWS Transit VPC to control access
to spoke VPCs. This can be achieved based on Security Group Tags (SGTs) and ISE
policy enforcement on the Transit VPC Hub CSR 1000Vs, as illustrated in Figure 2-
19. The table in the lower left of the figure shows the VPCs the end users would
have access to according to their role.
This is all deployed from the DNA Center GUI, which works along with ISE to push
the SGT Tag and Policy Enforcement configuration to the CSR 1000Vs. Extending
TrustSec to the Transit VPC Hub has the following benefits:
enforcement at the application level within a VPC rather than at the VPC level, as in
the previous transit VPC case. The SGT policy is pushed by ISE to the virtual router
or firewall, and it can then be mapped to the cloud provider’s own Security Groups.
The table in the lower left of the figure shows the apps the end users would have
access to according to their role.
Lower costs and risks with simple WAN automation and orchestration.
Extend their enterprise networks (such as branch or on-prem) seamlessly
into the public cloud.
Provide optimal user experience for SaaS applications.
Be able to leverage a transport-independent WAN for lower cost and higher
diversity.
Enhance application visibility and use that visibility to improve performance
with intelligent path control to meet SLAs for business-critical and real-time
applications.
Provide end-to-end WAN traffic segmentation and encryption for protecting
critical enterprise compute resources.
Note
At the time of writing, Viptela functionality was actively being integrated into all
IOS XE enterprise routing platforms, such as Integrated Services Router (ISR), CSR,
ASR 1000 Series Router (ASR1K), CSR 1000V, and Enterprise Network Compute
System 5K (ENCS 5K) platforms, along with the advanced services mentioned
previously (voice, compute, and so on). This will bring Viptela/Cisco SD-WAN the
advanced services capabilities of IWAN through a simple software upgrade.
All three SD-WAN solutions work in a similar fashion, but in this chapter only Cisco
SD-WAN based on Viptela will be covered.
Cisco SD-WAN
The Cisco SD-WAN solution is composed of four main components and an optional
analytics service:
vEdge routers: These are routers that support standard router features,
such as OSPF, BGP, ACLs, QoS, and routing policies, in addition to the SD-
WAN overlay control and data plane functions. Each vEdge router
automatically establishes a secure DTLS connection with the vSmart
controller and standard IPSec sessions with other vEdge routers in the
fabric.
These capabilities can bring many benefits that are not possible without
vAnalytics; for example, if a branch office is experiencing latency or loss on its
MPLS link, vAnalytics will detect this, and it will then compare that loss or latency
with that of other organizations in the area that it also monitors, to see whether
they are experiencing the same loss and latency on their circuits. If so, the
organization can report the issue to its service provider with confidence.
vAnalytics can also help predict how much
bandwidth is truly required for any location, and this is useful to decide if a circuit
can be downgraded to a lower bandwidth, resulting in reduced costs.
Out of these components, the vEdge routers and the vBond orchestrator are
available as physical appliances and VMs, whereas vManage and vSmart are only
available as VMs.
All of the VMs, including the vEdge router, can be hosted on-prem using ESXi or
KVM, or they can be hosted in AWS and Microsoft Azure.
Note
At the time of writing, vManage capabilities were being integrated into DNA Center
to bring full DNA Center capabilities to Cisco SD-WAN, such as assurance, analytics,
and integrated workflows.
Traditional enterprise WAN architectures are not designed for the cloud. As
organizations adopt more SaaS applications like Office 365 and public cloud
infrastructures like AWS and Microsoft Azure, the current network infrastructure
poses major problems related to the level of complexity and end-user experience.
fabric to the public cloud while at the same time increasing high availability and
scale.
SaaS applications reside mainly on the Internet, and to be able to achieve optimal
SaaS application performance, the best-performing Internet exit point needs to be
selected.
Figure 2-22 illustrates a remote site with dual direct Internet access (DIA) circuits
from two different Internet service providers (ISP1 and ISP2). When Cloud
OnRamp for SaaS is configured for a SaaS application on vManage, the vEdge
router at the remote site will start sending small HTTP probes to the SaaS
application through both DIA circuits to measure latency and loss. Based on the
results, the vEdge router will know which circuit is performing better and will
send the SaaS application traffic out of that circuit (ISP2). The process of probing
continues, and if a change in performance characteristics of the ISP2’s DIA circuit
occurs (for example, due to loss or latency), the remote site vEdge router will make
an appropriate forwarding decision.
Figure 2-23 illustrates another example of Cloud OnRamp for SaaS. The remote site
has a single DIA circuit to ISP1 and an SD-WAN fabric DTLS session to the regional
hub.
Similar to the previous case, Cloud OnRamp for SaaS can be configured on the
vManage and become active on the remote site vEdge router. However, in this case,
Cloud OnRamp for SaaS also gets enabled on the regional hub vEdge router and is
designated as the gateway node. Quality probing service via HTTP toward the
cloud SaaS application of interest starts on both the remote site vEdge and the
regional hub vEdge.
Unlike the HTTP probe sent towards the SaaS application via the DIA link,
Bidirectional Forwarding Detection (BFD) runs through the DTLS session between
the remote site and the regional hub. BFD is a detection protocol originally
designed to provide fast forwarding path failure detection times between two
adjacent routers. For SD-WAN, it is leveraged to detect path liveliness (up/down)
and measure quality (loss/latency/jitter and IPSec tunnel MTU).
For SaaS over DIA, BFD is not used because there is no vEdge router on the SaaS
side to form a BFD session with. The Regional hub vEdge router reports its HTTP
connection loss and latency characteristics to the remote site vEdge router via an
Overlay Management Protocol (OMP) message exchange through the vSmart
controllers. At this time, the remote site vEdge router can evaluate the
performance characteristics of its local DIA circuit versus the performance
characteristics as reported by the regional hub vEdge. It also takes into
consideration the loss and latency incurred by traversing the SD-WAN fabric
between the remote site and the hub site (calculated via BFD) and then makes an
appropriate forwarding decision.
Multicloud is now the new norm for enterprises. With multicloud, certain
enterprise workloads remain within the boundaries of the private data centers,
while others are hosted in the public cloud environments, such as Amazon Web
Services (AWS) and Microsoft Azure. This approach provides enterprises the
greatest flexibility in consuming compute infrastructure, as required.
With the Cisco Software-Defined WAN (SD-WAN) solution, you can extend
ubiquitous connectivity, zero-trust security, end-to-end segmentation, and
application-aware Quality of Service (QoS) policies of the organizational WAN into
the IaaS environments, as illustrated in Figure 2-24. The transport-agnostic
capability of the Cisco SD-WAN solution allows the use of a variety of connectivity
methods by securely extending the SD-WAN fabric into the public cloud
environment across all underlying transport networks. These include the Internet,
MPLS, 3G/4G LTE, satellite, and dedicated circuits such as AWS’s DX and Microsoft
Azure’s ER.
Virtual Switching
A virtual switch (vSwitch) operates like a physical Layer 2 Ethernet switch, and it
enables VMs to communicate with each other within a virtualized server and with
external physical networks via the physical network interface cards (pNICs).
Multiple vSwitches can be created under a virtualized server, but network traffic
cannot flow directly from one vSwitch to another vSwitch within the same host,
and they cannot share the same pNIC.
Figure 2-25 illustrates a virtualized server with three vSwitches connected to the
virtual network interface cards (vNICs) of the VMs as well as the pNICs. vSwitch1
and vSwitch3 are uplinked to pNIC 1 and pNIC 3, respectively, to access the
physical network, whereas vSwitch2 is not uplinked to any pNICs. Since network
traffic cannot flow from one vSwitch to another, network traffic from VM1 destined
to the external network, or VM0, will need to flow through the firewall (NGFWv).
The downside to using standard vSwitches is that every vSwitch that is part of a
cluster of virtualized servers needs to be configured individually in every virtual
host, which is where distributed switching comes into the picture. Distributed
vSwitches behave as a single switch spanning all the hosts in a cluster, so
configuration is defined once and applied consistently across every host.
Examples of virtual distributed switches include Cisco Nexus 1000V, Cisco VM-FEX,
Cisco AVS, HPE 5900v, Open vSwitch (OVS), IBM DVS 5000v, and vSphere
Distributed Switch.
Table 2-3 provides a quick glance at the vSwitch options available from multiple
vendors.
Hypervisor | vSwitch Options
VMware ESXi | vSphere Distributed Switch, Cisco Nexus 1000V, HPE 5900v*
Microsoft Hyper-V | Cisco Nexus 1000V, NEC, Broadcom
KVM | Linux Bridge (some distributions include OVS natively), OVS, Cisco Nexus 1000V
XEN | OVS
* VMware ceased support for the third-party virtual switches starting with
vSphere 6.5U2 and beyond.
Containers, just like VMs, also rely on vSwitches (aka virtual bridges) for
communication within a node (server). Docker, for example, by default creates a
virtual bridge called Docker0, which is assigned the default subnet block
172.17.0.0/16 (with 172.17.0.1 as the bridge's own address). This default subnet
can be customized, and user-defined custom
bridges can also be used.
Figure 2-26 illustrates how every container created by Docker is assigned a virtual
Ethernet interface (veth) on Docker0. The veth interface appears to the container
as eth0. The eth0 interface is then assigned an IP address from the bridge’s subnet
block. As more containers are created by Docker within the node, they are each
assigned an eth0 interface and an IP address from the same private address space.
This results in all containers being able to communicate with each other only if
they are within the same node. Containers in other nodes are not reachable by
default, which is something that needs to be managed via routing at the OS level, or
by using an overlay network.
If Docker is installed on another node using the default configs, then it will end up
with the same IP addressing as the first node, and this needs to be resolved on a
node-by-node basis.
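One way to avoid this per-node overlap without a full orchestrator is to give each node its own user-defined bridge subnet. The following is a minimal sketch using the Docker SDK for Python; the bridge name, subnet, and container image are illustrative assumptions.

import docker

# Connect to the local Docker daemon (assumes the docker Python SDK
# is installed: pip install docker).
client = docker.from_env()

# Create a user-defined bridge with a non-default subnet so this node's
# containers do not overlap with the default 172.17.0.0/16 block.
ipam = docker.types.IPAMConfig(
    pool_configs=[docker.types.IPAMPool(subnet="10.42.1.0/24")]  # assumed subnet
)
client.networks.create("custom-bridge", driver="bridge", ipam=ipam)

# Containers attached to this bridge receive addresses from 10.42.1.0/24.
container = client.containers.run("alpine", "sleep 300", detach=True,
                                  network="custom-bridge")
print(container.name)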
A better way to manage and scale containers and the networking connectivity
between them within and across nodes is to use a container orchestrator such as
Kubernetes.
Kubernetes
Google has been using containers for over a decade. Every application that runs at
Google—for example, Gmail, Google apps, YouTube, or GCP—is a distributed
application that runs on containers, so needless to say, Google is an expert in
managing distributed containerized applications at scale. In June 2015, Google
published a research paper at EuroSys with details about Borg, Google's internal
container-oriented large cluster-management system.
Volumes
A pod volume (persistent storage) is required for containers that need to maintain
the state of an application after reboots, relocations, and crashes. The reason is
that a container's file system lives only as long as the container does; if the
container restarts, any data and state are lost.
Labels
Kubernetes Cluster
The Kubernetes master node can be thought of as the control plane of Kubernetes,
and it is recommended to have at least three for redundancy. The master node is
composed of five main components:
Note
In some special cases, the master node can also act as a worker node, which means
it would also include a container runtime, a kubelet, and a kube-proxy.
and as they come in, the scheduler needs to figure out where to put them in a
very similar way to a Tetris game.
controller-manager: The controller manager is a daemon that runs all the
controllers included with Kubernetes. A control loop is a nonterminating
loop that regulates the state of a system. A controller is a control loop that
watches the shared state of the cluster through the apiserver and makes
changes, attempting to move the current state toward the desired state.
Logically, each controller is a separate process, but to abstract complexity, they all
run as a single process.
The worker nodes are where the pods are run, and they include the following
components:
Kubernetes Networking
The default switch for a node is called cbr0 (instead of Docker0). This bridge is
connected to the pods as well as to the external network. How it connects to the
external network depends on the networking plug-in used.
While there are many options on how to deploy an external network for inter-pod
communication in Kubernetes, the Cisco recommended option is to use Cisco
Application Centric Infrastructure (Cisco ACI), which is designed to offer policy-
based automation, security, mobility, and visibility for application workloads,
regardless of whether they run on bare-metal servers, hypervisors, or Linux
containers. The Cisco ACI system-level approach extends the support for Linux
containers by providing tight integration of Kubernetes.
Creating a Pod
A simple way to deploy a pod (or pods) is by creating a declarative YAML or JSON
deployment configuration file. Declarative means the configuration file has
instructions on what is desired (the intent), but it does not include details on how
to do it; that is for Kubernetes to figure out.
The high-level steps that take place when a pod is created, as shown in Figure 2-31,
are as follows:
1. The file is posted to the API server via a REST API post through the
Kubernetes GUI, the kubectl CLI, or from an application such as CloudCenter.
2. The information in the file is then sent to the scheduler, and it is also stored
in etcd.
3. The scheduler then needs to figure out the intent of the configuration file: the
constraints, policies, and requirements, such as how many CPU cores and how
much memory the pod needs, any affinity to a specific node, replica sets (how
many pods should be created), specific persistence requirements (volumes), and
so on. The scheduler then selects a node that meets the intent of the
configuration file.
4. The scheduler sends a message to the kubelet running on the selected node
concerning what the requirements are for the pod.
5. The kubelet then contacts Docker and instructs it on what it needs to do (for
example, number of pods, number of containers within the pod, IP address
assignment for the pod, image download, and so on). At this point, the pod is
created and ready for use.
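To make these steps concrete, here is a minimal sketch that performs step 1 programmatically using the official Kubernetes Python client; the pod name, image, and namespace are illustrative assumptions, and the same manifest could just as well be written as a YAML file and posted with the kubectl CLI or the GUI.

from kubernetes import client, config

# Load credentials from the local kubeconfig (an assumption; code running
# in-cluster would use config.load_incluster_config() instead).
config.load_kube_config()

# A declarative manifest: it states the intent (one nginx container),
# not how Kubernetes should realize it.
pod_manifest = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "demo-pod", "labels": {"app": "demo"}},
    "spec": {
        "containers": [
            {"name": "web", "image": "nginx:1.25",
             "ports": [{"containerPort": 80}]}
        ]
    },
}

# POST the manifest to the API server (step 1); scheduling and the
# kubelet/container runtime work then proceed as described above.
v1 = client.CoreV1Api()
v1.create_namespaced_pod(namespace="default", body=pod_manifest)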
OpenStack
OpenStack is a cloud operating system that can be used to build private and public
clouds; however, it is more prevalently used to build private clouds. It has a
modular architecture, where each of its components (aka services or projects) is
developed in individual development projects. Table 2-4 includes the OpenStack
projects available in the Queens release. The services highlighted in bold are part
of the core functionality of OpenStack.
Project | Category | Description
OpenStack-Ansible | Deployment/lifecycle tools | Ansible playbooks for deploying OpenStack
Kolla | Deployment/lifecycle tools | Deploys OpenStack services as Docker containers
TripleO | Deployment/lifecycle tools | Heat templates for deploying OpenStack
Kuryr | Container infrastructure | Docker network plug-in that uses Neutron to provide networking services to Docker containers
Tacker | NFV | ETSI MANO NFV Orchestrator / VNF Manager
Blazar | Reservation | A resource reservation service for virtual and physical resources
Note
The codenames are usually related to the service function; for example, cinder
provides block storage (to remember this, think of a cinderblock). As another
example, ironic provisions bare-metal servers (think of iron as bare metal).
New versions of OpenStack are released every six months, which usually include
new feature improvements and sometimes new projects. The OpenStack version-
naming convention follows an alphabetical order—for example, Austin, Bexar,
Cactus, Diablo, Essex, Folsom, Grizzly, Havana, Icehouse, Juno, Kilo, Liberty, Mitaka,
Newton, Ocata, Pike, and Queens, with Queens being the latest release available at
the time of writing.
CCM (CloudCenter Manager) is the primary interface for users and administrators, and it can be accessed
through a web browser user interface, CLI, or REST API. It includes all the
management tools required for modeling, securely deploying, and managing
applications in multiple on-prem, private cloud and public cloud environments. It
also includes functions that can be used by administrators to have full visibility
and control across all applications, users, governance rules, and clouds. For a
traditional on-prem CCM deployment, the manager is delivered as a preinstalled
virtual appliance, and only one CCM is required for each deployment, but
additional managers can be added to meet disaster-recovery or high-availability
requirements.
Figure 2-33 shows CCM’s simple, visual topology modeler. Creating an application
profile is as easy as dragging and dropping from the library of ready-to-use or
customized services, images, and containers on the left pane onto the topology
modeler and then just adding the connections between them. The application can
be modeled with a mixture of containers, VMs, and services.
Figure 2-35 CCC Cloud Application Deployment
Deploy simply refers to clicking on the deploy button to deploy a new application
profile as well as related components and data to any on-prem or cloud
environment.
management commands such as deploy, start, stop, and remove. The Orchestrator
runs those commands and sends a status update back to the CCM.
The Orchestrator abstracts the unique API and services offered by each cloud, and
it uses the same communication mechanism back to the Manager regardless of the
cloud on which the Orchestrator is installed.
Artifact Repositories
Multitenant
Full isolation: With Cisco CloudCenter, each tenant can be fully isolated
from other peer tenants. In this way, two completely independent business
units can use a single Cisco CloudCenter instance while strictly separating
tenants.
Partial isolation: Cisco CloudCenter offers an option for partial isolation
between parent and child tenants. In some cases, a central IT organization
may offer shared services, delivered either on-prem or through the CSP, that
are consumed by various independent business units. For these business
units, the central IT organization may want to
enforce OS image standards, require use of specific artifact repositories, or
require a common rules-based governance framework.
Flexible sharing: Cisco CloudCenter facilitates sharing within each tenant.
Powerful features for sharing application profiles, application services,
deployment environments, and more multiply the speed and agility benefits
of an application-defined management solution.
Application Migration
It’s a challenge to find all the application components for existing applications not
created via CloudCenter (also known as brownfield applications), but with the help
of application discovery tools, the application dependencies can be derived and the
grouping of virtual machines that make up the application can be discovered. Once
the application discovery is done, the migration of the application can be planned.
Application Discovery
iQuate provides an agentless SaaS service, which means there is no need to install
agents on the VMs to monitor them; all that is required is to install an iQuate
virtual appliance in the environment where the applications reside and to provide
the login credentials for the VMs that need to be monitored. That is all it needs to
generate a report with application discovery data.
To make sure that the application performance is the same or better in the public
cloud, a pre-migration performance baseline needs to be performed. The
application discovery data report generated by iQuate can be used to identify the
VMs that make up the full application. The VMs are imported into CCC to enable
AppD agents on them to capture the performance baseline of the application
before migration based on technical and business metrics. Using technical and
business metrics for the performance baseline allows measuring the performance
not only at the technical level but also at the business level (for example, the end-
user experience with the application, understanding business key performance
indicators, and so on).
Migrate
After application discovery and initial performance baselining, the application can
now be migrated. Two types of migrations can be performed.
Compare Performance
Summary
In this chapter, a basic definition of cloud was provided using NIST’s definition,
which includes the following characteristics and models:
Essential Characteristics
o On-demand self-service: Portal
o Measured service: Billing
o Rapid elasticity: Scale out and scale in automatically
o Resource pooling: Massive shared resources
Performance, scalability, and high availability for cloud were described with more
focus on how to apply features and technologies (WAAS, PfR, and so on) to
enhance application performance, which is directly correlated to what matters the
most—the end-user experience.
How organizations share security risks and responsibilities with the CSPs was
explained, as was security compliance guidelines, cloud threats, and
recommendations on how to secure the cloud with Cisco solutions.
Virtual machines and containers were described and compared. Also, a brief
introduction was given showing how containers enable cloud native applications
using microservices architectures.
The various cloud connectivity options offered by the three leading CSPs—
Microsoft Azure, GCP, and AWS—were discussed. These include direct connect, co-
location, and IPSec VPN (the most popular). User-to-cloud access control using
TrustSec SGTs and the CSR 1000V as well as other Cisco Security virtual platforms
was shown. In addition, Cisco’s SD-WAN solutions were included as the preferred
way to access IaaS and SaaS services from the cloud using Cloud OnRamp, which is
part of Cisco’s cloud-first next-gen SD-WAN solution that enables multicloud.
In the last section, the Kubernetes architecture and its components were
described, along with a high-level example of how pods can be deployed, and the
OpenStack Queens release and its projects were introduced.
Review Questions
1. True
2. False
1. True
2. False
1. Rehosting
2. Replatforming
3. Repurchasing
4. Refactoring
5. Retain
1. vManage
2. vSmart
3. vBond Orchestrator
4. vAnalytics
5. vEdge Routers
References
Lloyd Noronha. “Getting poor Cloud performance? Cisco SD-WAN is here for you”
(presented at Cisco Live, Barcelona, 2018).
Darrin Miller. “Advanced Security Group Tags: The Detailed Walk Through”
(presented at Cisco Live, Barcelona, 2018).
OpenStack, https://www.OpenStack.org.
Data Models
Controller-Based Network Design
Configuration Management Tools
There are many different ways to connect to and manage your network. The most
common method is to use the command-line interface, or CLI. This has been the
most widely used way to configure your network for the last 30 years. However,
the CLI, like anything, else has its pros and cons. Perhaps one of the most glaring
and biggest flaws with using the CLI to manage your network is misconfiguration.
Oftentimes, businesses have a high frequency of change in their network
environment, and some of those changes can be extremely complex. When
businesses have increased complexity in their network, the cost of something
failing can be very high. This can stem from the increased time it takes to
troubleshoot the issues in a complex network.
Pros:
Well known and documented.
Commonly used method.
Commands can be scripted.
Syntax help is available for each command.
Connection to the CLI can be encrypted (SSH).

Cons:
Difficult to scale.
Large number of commands.
Inflexible (you must know the command syntax).
Can be slow to execute commands.
Not intuitive.
Can only execute one command at a time.
CLI and commands can change between software versions and platforms.
Using the CLI can pose a security threat if an unencrypted transport such as Telnet is used.
Context-sensitive help is built into the Cisco CLI. When “?” is issued after a
command, all available options in that configuration mode are displayed. Figure
3-1 illustrates the context-sensitive help available for the show command.
Northbound API
Northbound APIs are often used to communicate from a network controller to its
management software. For example, Cisco DNA Center has a software graphical
user interface (GUI) that is used to manage its own network controller. Typically,
when a network operator logs in to a controller to manage their network, the
information that is being passed from the management software is leveraging a
northbound API. Best practices suggest that the traffic is encrypted between the
software and the controller. Most types of APIs have the ability to use encryption
to secure the data in flight.
Southbound API
An API that uses REST is often referred to as a RESTful API. What does this mean?
RESTful APIs use HTTP methods to gather and manipulate data. Because there is a
defined structure on how HTTP works, it offers a consistent way to interact with
APIs from multiple vendors. REST uses different HTTP functions to interact with
the data. Table 3-2 lists some of the most common HTTP functions and their
associated use cases.
HTTP functions are very similar to the functions that most applications or
databases use to store or alter data, whether it is stored in a database or within the
application itself. These functions are called “CRUD” functions. CRUD is an acronym
that stands for CREATE, READ, UPDATE, and DELETE. For example, in a SQL
database, the CRUD functions are what are used to interact with or manipulate the
data stored in the database. Table 3-3 lists the CRUD functions and their associated
actions and use cases.
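As a quick illustration of how the HTTP functions line up with the CRUD actions, the following Python sketch uses the requests library against a hypothetical endpoint; the URL and payloads are assumptions, not a real API.

import requests

BASE = "https://api.example.com/users"                 # hypothetical endpoint

requests.post(BASE, json={"name": "Jason"})            # CREATE -> POST
requests.get(f"{BASE}/42")                             # READ   -> GET
requests.put(f"{BASE}/42", json={"name": "Jamie"})     # UPDATE -> PUT (replace)
requests.patch(f"{BASE}/42", json={"name": "Luke"})    # UPDATE -> PATCH (partial)
requests.delete(f"{BASE}/42")                          # DELETE -> DELETE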
Whether you are trying to learn how APIs interact with applications or controllers,
want to test code and outcomes, or want to become a full-time developer, one of
the most important pieces of interacting with any software via APIs is testing.
Testing your code helps ensure you accomplish the desired outcome when
executing it. This section covers some tools and resources that will help you
practice using APIs and REST functions and hone your coding skills in becoming a
more efficient network engineer.
Earlier, this chapter mentioned being able to interact with a software controller
using RESTful APIs. It also discussed being able to test your code to see if the desired
outcomes are accomplished when executing it. Please keep in mind that APIs are
software interfaces into an application or a controller. This means just like with
any other device, you will need to authenticate to gain access to utilize the APIs.
However, once you are authenticated, any changes you have access to make via the
APIs will impact that application. This means that if you use a REST API call to
delete data, that data will be removed from the application or controller, just as if
you logged in to the device via the CLI and deleted the data. It is a best practice to
use a test lab or the Cisco DevNet sandbox while learning or practicing any of these
concepts. This way, there will be no accidental impact to your production
environment.
Postman is an application that allows us to interact with APIs using a console-based
approach. Postman allows us to use various data types and formats to interact
with REST-based APIs. Figure 3-3 depicts the main Postman application dashboard.
Note
The screenshots of Postman used at the time of this writing may differ from the
currently available version.
Within the Postman application, you can see the various sections you can interact
with. Under the default configuration, the focus will be on using the “Builder”
portion of the dashboard. The following sections are the ones that will require the
most focus of our attention:
History tab
Collections window
New tab
URL bar
The History tab will show a list of all the recent API calls that were made using
Postman. You have the option to clear your entire history at any time should you
want to remove the complete list of API calls that have been made. This is done by
clicking the “Clear all” link at the top of the Collections window. You also have the
ability to remove individual API calls from the history list by simply hovering the
mouse over the API call and clicking the trash can icon from the submenu that pops
up. An example of this is shown in Figure 3-4.
API calls can be stored into groups that are specific to a structure that fits your
needs. These groups are called “collections.” Collections can follow any naming
convention and appear as a folder hierarchy. For example, it’s possible to have a
collection called DNA-C to store all of your DNA Center API calls in. This helps
during testing phases, as API calls can easily be found and sorted by saving them
into a collection. You can also select the collection to be a favorite by clicking the
star icon to the right of the collection’s name. Figure 3-5 illustrates a collection
called DNA-C that is selected as a favorite.
Tabs are another very convenient way to work with various API calls. Each tab can
have its own API call and parameters that are completely independent of any other
tab. For example, this means you can have one tab open with API calls interacting
with the DNA Center controller and another tab open that is interacting with a
completely different platform, such as a Cisco Nexus switch. This is because each
tab has its own URL bar to be able to use a specific API. Remember that an API call
using REST is very much like an HTTP transaction. Each API call in a RESTful API
maps back to an individual URL for a particular function. This means every
configuration change or poll to retrieve data you make in a REST API has a unique
URL for it, regardless of whether it is a GET, POST, PUT, PATCH, or DELETE function.
Figures 3-6 and 3-7 illustrate two different tabs using unique URLs for different
API calls.
Figure 3-6 Postman URL Bar with DNA Center Token API Call
Figure 3-7 Postman URL Bar with DNA Center Host API Call
Now that the Postman dashboard has been shown, it’s time to discuss two of the
most common data formats used with RESTful APIs. The first one is called
Extensible Markup Language (XML). This format may look familiar because it is the
same format that is commonly used when constructing web services. XML is a tag-
based language. This means that when you enter a tag within XML, it must begin
with the < symbol. This also means it must end with the > symbol as well. For
example, if you wanted to create a start tag named “interface,” it would be
represented as <interface>. One rule that goes along with XML is that if you start a
section, you must end it. Or, in other words, for every beginning there is an end.
Because you created a start tag called <interface>, the section will need to be
closed by using an end tag. The end tag contains the same string as the start tag
you are working with, preceded by a / character. In this example, the end tag for
<interface> would be </interface>. Now that there is a start tag and an end tag,
different types of code and parameters can be put inside or in between the tags.
Example 3-1 depicts a snippet of XML output with both start and end tags as well
as some various configuration parameters.
<users>
<user>
<name>root</name>
</user>
<user>
<name>Jason</name>
</user>
<user>
<name>Jamie</name>
</user>
<user>
<name>Luke</name>
</user>
</users>
Notice that each section of Example 3-1 has a start tag and an end tag. The data is
structured within a section called “users” that contains the following four
individual users:
root
Jason
Jamie
Luke
Before and after each username is a start tag called <user> and an end tag called
</user>. The output also contains a start tag called <name> and an end tag called
</name>. These items start and end the tag that contains the actual user's name. If
you wanted to create another section to add on, you could simply follow the same
logic as used in the previous example and build out more XML code. A key thing to
keep in mind is that indentation of your XML sections is very important. For
instance, if you didn’t use indentation, it would be much harder to read and follow
each section in the XML output. This is why XML has been deemed so easy to read.
Not only can humans read it, but applications can also read it. Another very
common XML snippet is one that shows available interfaces on a device, such as a
router or a switch. The Example 3-2 snippet, however, shows an XML code snippet
without indentation to illustrate the difference in legibility. Although indentation is
not required, it is certainly a recommended best practice.
Example 3-2 XML Code Snippet Without Indentation
<interfaces>
<interface>
<name>GigabitEthernet1</name>
</interface>
<interface>
<name>GigabitEthernet11</name>
</interface>
<interface>
<name>Loopback100</name>
</interface>
<interface>
<name>Loopback101</name>
</interface>
</interfaces>
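The same tag structure that makes XML readable to humans is what makes it readable to applications. As a minimal sketch, Python's standard library can walk the interface list from Example 3-2:

import xml.etree.ElementTree as ET

xml_doc = """
<interfaces>
  <interface><name>GigabitEthernet1</name></interface>
  <interface><name>Loopback100</name></interface>
</interfaces>
"""

# Each <interface> element is a child of <interfaces>; pull out each <name>.
root = ET.fromstring(xml_doc)
for iface in root.findall("interface"):
    print(iface.find("name").text)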
The second data format that is important to cover is called JavaScript Object
Notation (JSON). Although JSON has not been around as long as XML, it is currently
taking the industry by storm. This data format is gaining popularity because it can
be argued that JSON is much easier to work with than XML. It is simple to read and
create, and the way the data is structured is much cleaner. JSON stores all its
information in key-value pairs. There is much debate about whether JSON will
eventually replace XML. Much like with XML, data that is indented is much cleaner
in appearance and more legible. However, even without indentation, JSON is
extremely easy to read. Like the name suggests, JSON uses objects for its format.
Unlike XML, JSON objects start with { and end with }. These are commonly referred
to as curly braces. Example 3-3 shows the same username example shown earlier
in the XML section, but now in JSON format. It also can be read as having four
separate key-value pairs—one for each user’s name.
{
"user": "root",
"user": "Jason",
"user": "Jamie",
"user": "Luke"
}
In the case of this JSON code snippet, you can see that the key used is “user” and
the value for each key is a unique username.
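One caution: most JSON parsers keep only the last value when the same key is repeated inside a single object, so in practice a list is the more idiomatic way to express the four users. A minimal sketch with Python's standard library:

import json

# Restructured as a list so no key is repeated within one object.
doc = '{"users": ["root", "Jason", "Jamie", "Luke"]}'
data = json.loads(doc)
for user in data["users"]:
    print(user)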
Now that the XML and JSON data formats have been explained, it is important to
circle back to actually using the REST API and the associated responses and
outcomes of doing so. First, covering the different HTTP response codes and their
importance is of top priority. Most Internet users have experienced navigating to a
website and getting the dreaded “404 Not Found” error. What might not be clear to
the users is what that error actually means. Table 3-4 lists the most common HTTP
response codes as well as the reasons users may receive each one.
The first example covered in this section is the DNA Center Token API mentioned
earlier in this chapter. The DNA Center Controller expects all incoming data from
the REST API to be in JSON format. It is also important to note that the HTTP POST
function is used to send the credentials to the DNA Center Controller. DNA Center
uses an authorization concept of Basic Auth to pass a username and password to
the DNA Center Token API to authenticate users. This API is used to authenticate
users to the DNA Center Controller in order to make additional API calls. Just as
users do when logging in to a device via the CLI, if the platform is secured properly,
they should be prompted for login credentials. The same method applies to using
an API to authenticate to software. The key pieces of information necessary to
successfully set up the API call in Postman are as follows:
Once Postman has been set up, you will need to select the Send button to pass the
credentials to the DNA Center Controller via the Token API. Figure 3-8 illustrates
the Postman setup required to authenticate with the DNA Center Controller.
Figure 3-8 Postman URL Bar with DNA Center Token API Call
Once you successfully authenticate to the DNA Center Controller, you will receive a
“token” that contains a string similar to the following:
"eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJzdWIiOiI1YTU4Y2QzN2UwNWJiYTA
wOGVmNjJiOTIiLCJhdXRoU291cmNlIjoiaW50ZXJuYWwiLCJ0ZW5hbnROYW1lIjoiV
E5UM
CIsInJvbGVzIjpbIjVhMzE1MTYwOTA5MGZiYTY5OGIyZjViNyJdLCJ0ZW5hbnRJZCI6
I
jVhMzE1MTlkZTA1YmJhMDA4ZWY2MWYwYSIsImV4cCI6MTUyMTQ5NzI2NCwid
XNlcm5hb
WUiOiJkZXZuZXR1c2VyIn0.tgAJfLc1OaUwaJCX6lzfjPG7Om2x97oiTIozUpAzomM"
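For readers who prefer scripting to Postman, the same authentication call can be made with Python's requests library. This is a hedged sketch: the controller address and credentials are placeholders, and the endpoint path is an assumption based on common DNA Center releases, so verify it against your controller's API documentation.

import requests
from requests.auth import HTTPBasicAuth

DNAC = "https://sandboxdnac.cisco.com"        # example controller address
# Endpoint path is an assumption; verify it against your DNA Center release.
url = f"{DNAC}/dna/system/api/v1/auth/token"

resp = requests.post(url, auth=HTTPBasicAuth("devnetuser", "password"),
                     verify=False)            # lab only; verify certs in production
resp.raise_for_status()                       # non-2xx responses raise here
token = resp.json()["Token"]                  # token string for later API calls
print(token)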
This token is needed for all future API calls to the DNA Center Controller. Think of
it as a hash that is generated from your login credentials. This token will change
each time you authenticate to the DNA Center Controller.
You can see that an HTTP Status code of “200 OK” was received from the DNA
Center Controller. Based on Table 3-4, you know that an HTTP Status code of 200
means that the API call completed successfully. In addition, you can also see how
long it took to process the HTTP POST request. In this case, it was 980ms.
Now that you have successfully authenticated to the DNA Center Controller, you
can now look at some of the other available API calls. The first API call covered in
this section is the Network Device API, which allows users to retrieve a list of
devices currently in inventory that are being managed by the DNA Center
Controller. The first step is to prepare Postman to use the token that was
generated when you authenticated to the controller. The following steps are
necessary to leverage Postman to utilize the Network Device API:
The last step is to click Send to pass the token to the DNA Center Controller and
perform an HTTP GET to retrieve a device inventory list using the Network Device
API. Figure 3-10 illustrates the proper setup of Postman to use the Network Device
API.
Figure 3-10 Postman Setup to Retrieve Network Device Inventory via an API Call
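Outside of Postman, the equivalent call is a single HTTP GET with the token passed in the X-Auth-Token header. Here is a minimal sketch in Python; the endpoint path is an assumption, as some releases expose /api/v1/network-device instead.

import requests

DNAC = "https://sandboxdnac.cisco.com"            # example controller address
token = "eyJ0eXAiOiJKV1QiLCJhbGciOi..."            # value returned by the Token API
url = f"{DNAC}/dna/intent/api/v1/network-device"   # assumed endpoint path

resp = requests.get(url, headers={"X-Auth-Token": token}, verify=False)
print(resp.status_code)                            # expect 200 on success
for device in resp.json()["response"]:
    print(device["hostname"], device["managementIpAddress"])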
Note
The token will be different from the one in this book. It is unique to each
authenticated user.
Based on the response received from the DNA Center Controller, you can see that
you not only received an HTTP Response code of 200 OK, you also successfully
retrieved the device inventory. Example 3-4 shows a list of devices in the inventory
that were pulled via the Network Device API.
Example 3-4 Device Inventory Pulled via the Network Device API Call in Postman
{
"response": [
{
"type": "Cisco ASR 1001-X Router",
"family": "Routers",
"location": null,
"errorCode": null,
"macAddress": "00:c8:8b:80:bb:00",
"lastUpdateTime": 1521645053028,
"apManagerInterfaceIp": "",
"associatedWlcIp": "",
"bootDateTime": "2018-01-11 15:47:04",
"collectionStatus": "Managed",
"interfaceCount": "10",
"lineCardCount": "9",
"lineCardId": "a2406c7a-d92a-4fe6-b3d5-ec6475be8477, 5b75b5fd-
21e3-4deb-a8f6-6094ff73e2c8, 8768c6f1-e19b-4c62-a4be-
51c001b05b0f,
afdfa337-bd9c-4eb0-ae41-b7a97f5f473d, c59fbb81-d3b4-4b5a-81f9-
fe2c8d80aead, b21b6024-5dc0-4f22-bc23-90fc618552e2, 1be624f0-
1647-4309-8662-a0f87260992a, 56f4fbb8-ff2d-416b-a7b4-
4079acc6fa8e,
164716c3-62d1-4e48-a1b8-42541ae6199b",
"managementIpAddress": "10.10.22.74",
"memorySize": "3956371104",
"platformId": "ASR1001-X",
"reachabilityFailureReason": "",
"reachabilityStatus": "Reachable",
"series": "Cisco ASR 1000 Series Aggregation Services Routers",
"snmpContact": "",
"snmpLocation": "",
"tunnelUdpPort": null,
"waasDeviceMode": null,
"locationName": null,
"role": "BORDER ROUTER",
"hostname": "asr1001-x.abc.inc",
"upTime": "68 days, 23:23:31.43",
"inventoryStatusDetail": "<status><general code=\"SUCCESS\"/></status>",
"softwareVersion": "16.6.1",
"roleSource": "AUTO",
"softwareType": "IOS-XE",
"collectionInterval": "Global Default",
"lastUpdated": "2018-03-21 15:10:53",
"tagCount": "0",
"errorDescription": null,
"serialNumber": "FXS1932Q1SE",
"instanceUuid": "d5bbb4a9-a14d-4347-9546-89286e9f30d4",
"id": "d5bbb4a9-a14d-4347-9546-89286e9f30d4"
},
{
"type": "Cisco Catalyst 9300 Switch",
"family": "Switches and Hubs",
"location": null,
"errorCode": null,
"macAddress": "f8:7b:20:67:62:80",
"lastUpdateTime": 1521644291747,
"apManagerInterfaceIp": "",
"associatedWlcIp": "",
"bootDateTime": "2018-01-11 14:42:33",
"collectionStatus": "Managed",
"interfaceCount": "41",
"lineCardCount": "2",
"lineCardId": "feb42c9f-323f-4e17-87d3-c2ea924320cb, 0f0c473e-b2e0-
4dcf-af11-9e7cf7216473",
"managementIpAddress": "10.10.22.66",
"memorySize": "889225360",
"platformId": "C9300-24UX",
"reachabilityFailureReason": "",
"reachabilityStatus": "Reachable",
"series": "Cisco Catalyst 9300 Series Switches",
"snmpContact": "",
"snmpLocation": "",
"tunnelUdpPort": null,
"waasDeviceMode": null,
"locationName": null,
"role" : "ACCESS",
"hostname": "cat_9k_1.abc.inc",
"upTime": "69 days, 0:15:51.44",
"inventoryStatusDetail": "<status><general code=\"SUCCESS\"/></status>",
"softwareVersion": "16.6.1",
"roleSource": "AUTO",
"softwareType": "IOS-XE",
"collectionInterval": "Global Default",
"lastUpdated": "2018-03-21 14:58:11",
"tagCount": "0",
"errorDescription": null,
"serialNumber": "FCW2136L0AK",
"instanceUuid": "6d3eaa5d-bb39-4cc4-8881-4a2b2668d2dc",
"id": "6d3eaa5d-bb39-4cc4-8881-4a2b2668d2dc"
},
{
"type": "Cisco Catalyst 9300 Switch",
"family": "Switches and Hubs",
"location": null,
"errorCode": null,
"macAddress": "f8:7b:20:71:4d:80",
"lastUpdateTime": 1521644755520,
"apManagerInterfaceIp": "",
"associatedWlcIp": "",
"bootDateTime": "2018-01-11 14:43:33",
"collectionStatus": "Managed",
"interfaceCount": "41",
"lineCardCount": "2",
"lineCardId": "789e00f9-f52d-453d-86c0-b18f642462ee, 242debfd-ff6c-
4147-9bf6-574e488c5174",
"managementIpAddress": "10.10.22.70",
"memorySize": "889225280",
"platformId": "C9300-24UX",
"reachabilityFailureReason": "",
"reachabilityStatus": "Reachable",
"series": "Cisco Catalyst 9300 Series Switches",
"snmpContact": "",
"snmpLocation": "",
"tunnelUdpPort": null,
"waasDeviceMode": null,
"locationName": null,
"role" : "ACCESS",
"hostname": "cat_9k_2.abc.inc",
"upTime": "69 days, 0:22:17.07",
"inventoryStatusDetail": "<status><general code=\"SUCCESS\"/></status>",
133
"softwareVersion": "16.6.1",
"roleSource": "AUTO",
"softwareType": "IOS-XE",
"collectionInterval": "Global Default",
"lastUpdated": "2018-03-21 15:05:55",
"tagCount": "0",
"errorDescription": null,
"serialNumber": "FCW2140L039",
"instanceUuid": "74b69532-5dc3-45a1-a0dd-6d1d10051f27",
"id": "74b69532-5dc3-45a1-a0dd-6d1d10051f27"
},
{
"type": "Cisco Catalyst38xx stack-able ethernet switch",
"family": "Switches and Hubs",
"location": null,
"errorCode": null,
"macAddress": "cc:d8:c1:15:d2:80",
"lastUpdateTime": 1521644825918,
"apManagerInterfaceIp": "",
"associatedWlcIp": "",
"bootDateTime": "2018-01-11 15:20:34",
"collectionStatus": "Managed",
"interfaceCount": "59",
"lineCardCount": "2",
"lineCardId": "15d76413-5289-4a99-98b6-fcacfe76b977, f187f561-9078-
4f30-b1a1-c6c6284bd075",
"managementIpAddress": "10.10.22.69",
"memorySize": "873744896",
"platformId": "WS-C3850-48U-E",
"reachabilityFailureReason": "",
"reachabilityStatus": "Reachable",
"series": "Cisco Catalyst 3850 Series Ethernet Stackable Switch",
"snmpContact": "",
"snmpLocation": "",
"tunnelUdpPort": null,
"waasDeviceMode": null,
"locationName": null,
"role": "CORE",
"hostname": "cs3850.abc.inc",
"upTime": "65 days, 11:23:52.43",
"inventoryStatusDetail": "<status><general code=\"SUCCESS\"/></status>",
"softwareVersion": "16.6.2s",
"roleSource": "MANUAL",
"softwareType": "IOS-XE",
"collectionInterval": "Global Default",
Hopefully, you can start to see how powerful using APIs can be. Within a few
moments, users are able to gather a tremendous amount of information about the
devices currently being managed by the DNA Center Controller. One could even say
that in the time it takes someone to log in to a device via the CLI and issue all the
relevant show commands to gather this data, an API call can be used to gather it
for the entire network within seconds. Talk about giving engineers time back to do other things!
Manipulating data using filters and offsets is very common when working with APIs.
A great example is leveraging the Network Device API to gather information on only
a single device in the inventory, such as the second device in the list. This is where
the API documentation becomes so valuable, because most APIs document exactly
what they can be used to accomplish.
In Postman, it is possible to modify the Network Device API URL and add ?limit=1
to the end of the URL to show only a single device in the inventory. You can also add
the &offset=2 parameter to the end of the URL to state that the output should start
at the second device in the inventory. Although it may seem confusing at first, the
limit parameter simply states that a user wants to retrieve only one record from the
inventory, and the offset parameter states that the user wants that record to be the
second record in the inventory. Figure 3-11 shows how a user can adjust the Network
Device API URL in Postman to apply both filters (?limit=1&offset=2) and display
only the second device in the inventory.
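The same filtered call can also be made from a short script instead of Postman. The
following Python sketch assumes the requests library is installed, that a token has
already been obtained from the controller's authentication API, and that the
controller address and token shown here are placeholders:

import requests

# Placeholder values: substitute your own controller address and a token
# obtained from the DNA Center authentication API.
DNAC = "https://dnac.example.com"
TOKEN = "replace-with-your-token"

headers = {"X-Auth-Token": TOKEN, "Content-Type": "application/json"}
params = {"limit": 1, "offset": 2}  # one record, starting at the second

response = requests.get(
    f"{DNAC}/api/v1/network-device",
    headers=headers,
    params=params,
    verify=False,  # lab-only: skip certificate validation
)

# The device list is returned under the "response" key.
for device in response.json()["response"]:
    print(device["hostname"], device["macAddress"], device["managementIpAddress"])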
Based on the response, you can see that the second device is consistent with the
output you saw in the initial Network Device API call that was shown in Example 3-4. That device is a “Cisco Catalyst 9300 Switch” with the MAC address of
f8:7b:20:67:62:80.
This section covers some of the most common data models and tools and how they
are leveraged in a programmatic approach, including YANG data models, NETCONF,
and ConfD.
The Simple Network Management Protocol (SNMP) is widely used for fault
handling and monitoring. However, it is not often used for configuration changes.
CLI scripting is used more often than other methods. YANG data models are an
alternative to SNMP Management Information Bases (MIBs) and are becoming the
standard for data definition languages. YANG was defined in RFC 6020 and uses
data models. Data models are used to describe whatever can be configured on a
device, everything that can be monitored on a device, and all the administrative
actions that can be executed on a device, such as resetting counters and rebooting
the device. This includes all the notifications that the device is capable of
generating. All of these variables can be represented within a YANG model. Data
models are very powerful in that they create a uniform way to describe data, which
can be beneficial across vendors’ platforms. Data models allow network operators
to configure, monitor, and interact with network devices holistically across the
entire enterprise environment.
YANG models use a tree structure. Within that structure, the models are
constructed similarly to the XML format and are built in modules. These modules are
hierarchical in nature and contain all the different data and types that make up a
YANG device model. YANG models make a clear distinction between configuration
data and state information. The tree structure represents how to reach a specific
element of the model. These elements can be either configurable or not
configurable.
Elements all have a defined type. For example, an interface can be configured to be
on or off. However, the interface state cannot be configured (for example, up or
down). Example 3-5 illustrates a simple YANG module taken from RFC 6020.
container food {
  choice snack {
    case sports-arena {
      leaf pretzel {
        type empty;
      }
      leaf beer {
        type empty;
      }
    }
    case late-night {
      leaf chocolate {
        type enumeration {
          enum dark;
          enum milk;
          enum first-available;
        }
      }
    }
  }
}
The output in the previous example can be read as follows: You have food. Of that
food, you have a choice of snacks. In the case that you are in the sports arena, your
snack choices are pretzels and beer. If it is late at night, your snack choices are two
different types of chocolate. You can choose to have milk chocolate or dark
chocolate, and if you are in a hurry and do not want to wait, you can have the first
available chocolate, whether it is milk chocolate or dark chocolate. To put this into
more network-oriented terms, see Example 3-6.
list interface {
  key "name";

  leaf name {
    type string;
  }
  leaf speed {
    type enumeration {
      enum 10m;
      enum 100m;
      enum auto;
    }
  }
  leaf observed-speed {
    type uint32;
    config false;
  }
}
The YANG model shown here can be read as follows: You have a list of interfaces.
Of the available interfaces, there is a specific interface that has three configurable
speeds. Those speeds are 10Mbps, 100Mbps, and auto, as listed in the leaf named
“speed.” The leaf named “observed-speed” cannot be configured, due to the config
false statement. As the leaf's name implies, the speeds in this leaf are those that
were auto-detected (observed); hence, it is not a configurable leaf.
NETCONF
NETCONF was defined in RFC 4741 and RFC 6241. NETCONF is an IETF standard
protocol that uses the YANG data models to communicate with the various devices
on the network. NETCONF runs over SSH, TLS, or the Simple Object Access
Protocol (SOAP). Some of the key differences between SNMP and NETCONF are
listed in Table 3-5. One of the most important differences is that SNMP can’t
distinguish between configuration data and operational data, but NETCONF can.
Another key differentiator is that NETCONF uses paths to describe resources,
whereas SNMP uses object identifiers (OIDs). A NETCONF path can be similar to
interfaces/interface/eth0, which is much more descriptive than what you would
expect out of SNMP. Here is a list of some of the more common use cases for
NETCONF:
Table 3-5 Differences Between SNMP and NETCONF

                          SNMP                              NETCONF
Resources                 OIDs (object identifiers)         Paths
Data Models               Defined in MIBs                   YANG core models
Data Modeling Language    SMI (Structure of Management      YANG
                          Information)
Management Operations     SNMP                              NETCONF
Encoding                  BER (Basic Encoding Rules)        XML, JSON
Transport Stack           UDP                               SSH/TCP
Example 3-7 illustrates a NETCONF element from RFC 4741. This NETCONF output
can be read as follows: There is an XML list of users named “users.” In that list,
there are the following individual users: root, Fred, and Barney.
<rpc-reply message-id="101"
xmlns="urn:ietf:params:xml:ns:netconf:base:1.0">
<data>
<top xmlns="http://example.com/schema/1.2/config">
<users>
<user>
<name>root</name>
</user>
<user>
<name>fred</name>
</user>
<user>
<name>barney</name>
</user>
</users>
</top>
</data>
</rpc-reply>
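To try NETCONF programmatically, the open source ncclient Python library can issue
the same kind of <get-config> RPC. The following is a minimal sketch, assuming a
NETCONF-enabled device reachable on the default port 830 and placeholder
credentials:

from ncclient import manager

# Placeholder connection details for a NETCONF-enabled device.
with manager.connect(
    host="192.0.2.1",
    port=830,                # default NETCONF-over-SSH port
    username="admin",
    password="admin",
    hostkey_verify=False,    # lab-only: skip SSH host key checking
) as m:
    # <get-config> retrieves configuration data only, one of the key
    # capabilities that separates NETCONF from SNMP.
    reply = m.get_config(source="running")
    print(reply.data_xml)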
(Figure: an example hierarchical data tree, with Beverages divided into Soft Drinks,
containing Cola and Root Beer, and Tea, containing Sweetened and Unsweetened.)
ConfD
Network operators today are looking for a way to manage their networks
holistically, rather than managing them on a box-by-box basis. To do this, many
network devices have northbound APIs that allow a management tool or suite of
tools to interact with the devices across the network. This allows for service
applications and device setup to be done uniformly across the campus
environment. This type of automation introduces the concept of a transactional
deployment model.
This is a much cleaner and less error-prone way to deploy certain features. In the
transactional model, there wouldn’t be any partially configured features resident
in the campus devices, which ensures the integrity of the network as a whole. This
also functionally makes the network act as a federated database rather than as a
collection of separated devices.
Service applications can include things like VPN provisioning, QoS, firewall
capabilities, and so forth. Device setup components can include configuration
templates, scripts, and other device-specific operations. ConfD is a device-
management framework that is very different from the traditional management
tools. ConfD uses YANG data models to interact with various network devices. It
can also use NETCONF, among other things, as a protocol to carry the different
transactions to the equipment to be executed.
Once you have a YANG model for a device, ConfD automatically renders all the
management protocols mentioned earlier. For example, you will automatically
have a WEB UI into the YANG model of the device without having to program
anything. The YANG model also supplies the configuration database (CDB, covered
in the next section) schema, so the structure of the fields is taken from the YANG
model as well. Table 3-6 compares some of the differences between ConfD and
other traditional management tools.
ConfD has many different management protocols that can be used northbound to
manage the product. Some of these protocols are shown in Figure 3-12. Those
management protocols are as follows:
NETCONF
SNMP v1, v2c, v3
REST
CLI
WEB UI
Some of the different components of ConfD are the Core Engine, CDB database,
Managed Object API, and the Data Provider API. The CDB database is where the
configurations are stored in the ConfD solution. The internal CDB database is
optional, and network operators can choose to have external databases for their
configurations or use both internal and external databases to fit their needs. This
may make sense if you have a number of legacy applications already using an
external database. Most often, the internal CDB database is used.
Note
You can also store alarms, performance data, internal state, and other items in the
CDB database.
Multiple components comprise the ConfD Core Engine, including the following:
Transaction management
Session management/authentication
Role-based access control (RBAC)
Redundancy/replication for HA
Event logging/audit trails
Validation syntax/semantic
Rollback management
The rollback management function is another very useful component of ConfD. For
instance, after a network operator authenticates to the ConfD system and gets a
role assigned to them, the rollback management component will track all the
transactions made by that operator within their current session and create
rollback files in the event that the transactions need to be reverted to a
previous state. This creates a timeline of transactions that allows the operator to
pick a point in time to roll back to.
DevNet
The examples and tools discussed in this chapter are all available to use and
practice at http://developer.cisco.com. This is the home for Cisco DevNet. DevNet
is a single place to go when looking to enhance or increase your skills with APIs,
coding, Python, and even controller concepts. In DevNet you will find learning labs
and content that will help solidify your knowledge in network programmability.
Regardless of whether you are just getting started or are a seasoned programmatic
professional, DevNet is the place to be! In this section, DevNet will be covered from
a high-level overview perspective, including the different sections of DevNet as
well as some of the labs and content you can expect to interact with. Figure 3-14
shows the DevNet main page.
Across the top of the main page you can see a few menu options. These menu
options will be covered individually and are as follows:
Discover
Technologies
Community
Support
Discover
First is the Discover tab. This is where you navigate the different offerings DevNet
has available. Under this tab you find subsections like guided learning tracks.
These learning tracks guide you through various technologies and the
associated API labs. Some of the labs you can interact with are Programming the
Digital Network Architecture (DNA), ACI Programmability, Getting Started with
Cisco Spark APIs, and Introduction to DevNet, to name a few. Once you pick a
learning lab and start the module, the website will track all your progress and
allow you to continue where you left off. It’s excellent for continuing your
education over multiple days or weeks.
Technologies
The Technologies section allows you to pick relevant content based on the
technology that you want to study and to dive directly into the associated labs and
training. Figure 3-15 illustrates some of the networking content that is currently
available.
Note
Available labs may differ from those shown in this chapter. Please visit developer.
cisco.com to see the latest content available and to interact with the latest learning
labs.
Community
Support
The Support section of DevNet is where you can post questions and get answers
from some of the best in the industry. Technology-focused professionals are
available to answer your questions both from a technical perspective and a
theoretic perspective, meaning you can ask questions about specific labs or the
overarching technology (for example, Python or YANG models). You can also open
a case with the DevNet Support team and your questions will be tracked and
answered within a minimal amount of time. This is a great place to ask one-on-one
questions with the Support team as well as to tap into the expertise of the Support
engineers. Figure 3-17 shows the DevNet Support page as well as where to open a
Support case.
Let’s say, for example, your business is a software company. When developers
work on creating and coding software, each development team is responsible for
their own section or piece of the overall software suite or program. This means
that there may be different teams of developers working on creating a specific
portion of the software. Maybe they are creating a subset of features or developing
a single use case into the software code. At some point, all these developers have
to consolidate their separate versions of code into the overall codeset that makes
up the software suite or program they are all developing. If each developer used
different coding or error-checking methods, you can imagine that once all the code
is put together into a single codeset, it would likely be error-prone and might not
function as desired. This is where Continuous Integration (CI) comes into play. If
every developer followed a common structure or guideline that includes error
checking and testing,
structure or guideline for every developer that includes error checking and testing,
the codeset would have a higher likelihood of working the way the developers
designed it to work. Continuous Integration automatically tests all of the code as it
is loaded to ensure that any new changes to the code do not impact the overall
codeset. Figure 3-18 illustrates an example of a common Continuous Integration
workflow.
Once the changes in the code have been successfully tested, the code will need to
be packaged up to be deployed to either a test development environment or
production. This is where the code gets put into a container, along with any startup
scripts, so that the software and any of its dependencies are all in the same
location. This allows the developers to ensure that the code will launch and run
consistently. Figure 3-19 depicts the process used once the developers’ code has
been consolidated and tested. This concept is called Continuous Packaging.
Now that the developers’ code has been packaged and put into a container, the
package will need to be deployed, as mentioned earlier, to a development or
production environment. Once this has been completed, the software will be
available for people to use. The entire process that was just covered is commonly
called a CI/CD (Continuous Integration/Continuous Delivery) pipeline.
One of the most commonly used Source Control Management platforms is Git,
which not only allows developers to have version control over their files, but also
allows them to work cohesively with other developers on those files. This allows
developers to accelerate working on software projects without sacrificing the
integrity of the data. Git has two data structures that make it function. The first
data structure is called an Index and is often referred to as the “cache.” This is
because it caches the information of the working directory you are using for your
files as well as the unsaved or uncommitted changes to those files in the working
directory. The second data structure is called the Object Database, which contains
four different types of objects: blobs, trees, commits, and tags. Table 3-7 lists
these four object types.
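As a purely illustrative sketch, this basic workflow can even be driven from Python
with the standard subprocess module; the file name and identity values here are
hypothetical, and each call maps to the equivalent git CLI command:

import subprocess
from pathlib import Path

def git(*args):
    # Run a git command and raise an error if it fails.
    subprocess.run(["git", *args], check=True)

git("init")                                    # create a new local repository
git("config", "user.name", "Dev Example")      # placeholder identity
git("config", "user.email", "dev@example.com")

Path("JSON_Example.txt").write_text('{"user": "root"}\n')
git("add", "JSON_Example.txt")                 # stage the file in the index ("cache")
git("commit", "-m", "Add JSON example")        # store commit, tree, and blob objects
git("log", "--oneline")                        # review the commit history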
One of the most efficient and commonly adopted ways of using Git is to use GitHub.
GitHub is a hosted web-based repository for Git. It has capabilities for bug tracking
and task management as well. GitHub provides one of the easiest ways to track
changes on your files, collaborate with other developers, and share code with the
online community. It is a great place to look for code to get started on the
programmability journey. Oftentimes, other engineers or developers are trying to
accomplish similar tasks and have already created and tested the necessary code.
One of the most powerful features of using GitHub is the ability to rate and provide
feedback on other developer’s code. Peer review is encouraged when it comes to
the coding community. Figure 3-21 shows the main GitHub web page.
Fortunately, a guide is available that steps you through how to create a repository,
start a branch, add comments, and open a pull request. You can also just start a
GitHub project after you are more familiar with the GitHub tool and its associated
processes. Projects are repositories that contain code files. This is where you will
have a single pane to create, edit, and share your code files. Figure 3-22 shows a
repository called “EvolvingTech” that contains three files:
EvolvingTech.txt
JSON_Example.txt
README.md
GitHub also provides a great summary of commit logs, meaning when you save a
change in one of your files or create a new file, it shows the details on the main
repository page. This can be seen in Figure 3-22 as well. By drilling down into one
of the files in the repository, you can see how easy it is to edit and save your code.
If you drill down into JSON_Example.txt, you will see its contents and how to edit
the file in the repository. Once you click the filename JSON_Example.txt, you can
see that the file has seven lines of code and is 76 bytes in size. Figure 3-23 shows
the contents of the JSON_Example.txt file and the associated options for what you
can do with the file.
The pencil icon allows you to go into editing mode so you can alter the file
contents. This editor is very similar to any text editor. Developers can simply type
into the editor or copy and paste code from other files directly into the editor. The
example in Figure 3-24 shows the addition of another user named “Zuul.” If we
were to commit the changes, the file would be saved with the new user added to
the file. Now that the file is available in the repository, other GitHub users and
developers can contribute to the code, or add and delete lines of code based on the
code we created. This is the true power of sharing your code. For example, if you
have some code to add a new user via JSON syntax, someone could use that code
and simply modify the usernames or add to the code to enhance it.
Ansible
Ansible is an automation tool designed with the following goals in mind:
Consistent
Secure
Highly reliable
Minimal learning curve
Unlike many other automation tools, Ansible is an agentless tool. This means that
no software or agent needs to be installed on the client machines to be managed.
This is considered by some to be a major advantage of using Ansible versus other
products. Ansible communicates via SSH for a majority of devices. It can also
support WinRM and other transport methods to the clients it manages. Ansible
also doesn’t need an administrative account on the client. It can use built-in
authorization escalation, like sudo, for example, when it needs to raise the level of
administrative control. Ansible uses the concept of a "control station," the
computer from which all requests are sent. This could simply be a laptop or a
server sitting in a data center; it is quite literally the machine used to run
Ansible and to issue changes or requests to the remote hosts. Figure
3-25 illustrates the workflow.
These concepts are some of the reasons that administrators, developers, and IT
managers seek to use Ansible. This allows for an easy ramp-up for users who aim
to create new projects and sets the stage for long-term automation initiatives
and processes that further benefit the business. Previously we discussed
the risk of human error and the impact it has on the business. Automation, by
nature, reduces the risk of human error because we are taking known best
practices that have been thoroughly tested in our environment and duplicating
them in an automatic process. However, it is important to note that if a bad process
or erroneous configuration is automated, it can be detrimental as well. When
you’re preparing to automate a task or set of tasks, it is important to start with the
desired outcome of automating the task(s). Once a desired outcome has been
documented, you can then move on to creating a plan to achieve the outcome. This
process follows the PPDIOO (Prepare, Plan, Design, Implement, Operate, Optimize)
methodology. Figure 3-26 outlines the PPDIOO lifecycle.
Now that you know what the basic structure of a Playbook is, you need to further
understand the language used to create Playbooks. Ansible Playbooks are written
in YAML syntax. YAML originally stood for Yet Another Markup Language and is now
a recursive acronym for "YAML Ain't Markup Language." Ansible YAML files usually
begin with a series of three dashes (---) and end with a series of three periods
(...). YAML files also contain lists and dictionaries. Example 3-9 illustrates a
YAML file containing a list of musical genres.
---
# List of music genres
Music:
  - Metal
  - Rock
  - Rap
  - Country
...
YAML lists are very easy to read and consume. You can see based on the previous
example that we have the ability to add descriptions to the YAML file by using the
hash or pound sign (#) and adding text immediately following it. You can also
see the example of the leading --- and trailing ..., which indicate the start and
end of the YAML file, respectively. More importantly, you can see that we started
each line of the list with a dash and a space (-). Indentation is also important in
YAML files.
YAML also has the concept of dictionaries. YAML dictionaries are very similar to
JSON dictionaries because they also use key-value pairs. Remember from earlier in
this chapter that key-value pairs are represented by “key: value”. Example 3-10
depicts a YAML dictionary containing an example employee record.
---
# HR Employee record
Employee1:
  Name: John Smith
  Title: Network Architect
  Nickname: D'Bug
Lists and dictionaries can be used together as well in YAML. Example 3-11 shows a
dictionary with a list in a single YAML file.
---
# HR Employee records
- Employee1:
    Name: John Dough
    Title: Developer
    Nickname: Mr. D'Bug
    Skills:
      - Python
      - YAML
      - JSON
- Employee2:
    Name: Jane Dough
    Title: Network Architect
    Nickname: Lay D'Bug
    Skills:
      - CLI
      - Security
      - Automation
YAML Lint is a free online tool used to check the format of YAML files to make sure
they are in a valid syntax. Simply go to www.yamllint.com, paste the contents of a
YAML file into the interpreter, and click Go. This will tell you if you have an error in
your YAML file. Figure 3-27 shows the same YAML dictionary and list example, but
it cleans up the formatting and removes the description line with the # sign.
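The same check can also be performed locally. The following sketch assumes the
open source PyYAML library is installed (pip install pyyaml); yaml.safe_load()
raises a YAMLError when the syntax is invalid and otherwise returns ordinary
Python lists and dictionaries:

import yaml

DOCUMENT = """
---
# HR Employee record
Employee1:
  Name: John Smith
  Title: Network Architect
  Nickname: D'Bug
"""

try:
    data = yaml.safe_load(DOCUMENT)     # returns a dictionary
    print(data["Employee1"]["Title"])   # prints: Network Architect
except yaml.YAMLError as err:
    print(f"Invalid YAML: {err}")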
Ansible has a CLI tool that can be used to run Playbooks or ad-hoc CLI commands
on targeted hosts. This CLI tool has very specific commands needed in order to
enable automation. The most common Ansible CLI commands are shown in Table
3-9 with their associated use cases.
Ansible uses an inventory file to keep track of the hosts it manages. The inventory
can be a named group of hosts or a simple list of individual hosts. Hosts can belong
to multiple groups and can be represented by either an IP address or a resolvable
DNS name. Example 3-12 shows the contents of a host inventory file with the host
192.168.10.1 in two different groups.
[routers]
192.168.10.1
192.168.20.1
[switches]
192.168.10.25
192.168.10.26
[primary-gateway]
192.168.10.1
Now that we have seen the fundamental concepts of Ansible and YAML, it is time to
cover some useful examples. This section illustrates some basic examples of
Ansible Playbooks used to accomplish common tasks. Imagine using a Playbook to
deploy an interface configuration on a device without having to manually configure
it. Perhaps taking this idea a step further, you could use a Playbook to configure an
interface and deploy an EIGRP routing process. Example 3-13 illustrates the
contents of an Ansible Playbook called ConfigureInterface.yaml. This Playbook will
be used to configure the GigabitEthernet2 interface on a CSR1000V router.
Leveraging the ios_config Ansible module, this Playbook adds a description and an
IP address to the GigabitEthernet2 interface on the CSR1KV-1 router and brings the
interface up with no shutdown:
---
- hosts: CSR1KV-1
  gather_facts: false
  connection: local

  tasks:
    - name: Configure GigabitEthernet2 Interface
      ios_config:
        lines:
          - description Configured by ANSIBLE!!!
          - ip address 10.1.1.1 255.255.255.0
          - no shutdown
        parents: interface GigabitEthernet2

The Playbook is then executed from the control station with the ansible-playbook
ConfigureInterface.yaml command.
Building out a Playbook can greatly simplify configuration tasks. Example 3-14
shows an alternate version of the ConfigureInterface.yaml Playbook named
EIGRP_Configuration_Example.yaml, where EIGRP is added along with the ability
to save the configuration by issuing a write memory (the latter is accomplished by
leveraging the ios_command module in Ansible). This Playbook configures the
GigabitEthernet2 and GigabitEthernet3 interfaces, adds an EIGRP routing process to
the global configuration, and then saves the configuration on the CSR1KV-1 router:
---
- hosts: CSR1KV-1
  gather_facts: false
  connection: local

  tasks:
    - name: Configure GigabitEthernet2 Interface
      ios_config:
        lines:
          - description Configured by ANSIBLE!!!
          - ip address 10.1.1.1 255.255.255.0
          - no shutdown
        parents: interface GigabitEthernet2

    - name: Configure GigabitEthernet3 Interface
      ios_config:
        lines:
          - description Configured By ANSIBLE!!!
          - no ip address
          - shut
        parents: interface GigabitEthernet3

    - name: Configure EIGRP 100
      ios_config:
        lines:
          - network 10.1.1.0 0.0.0.255
        parents: router eigrp 100

    - name: WR MEM
      ios_command:
        commands:
          - write memory
Once the Playbook is run, the output will show the tasks as they are completed and
their associated status. Based on the output in Figure 3-29, we can see that the
Configure GigabitEthernet2 Interface, Configure GigabitEthernet3 Interface, and
Configure EIGRP 100 tasks complete and return a status of "changed."
Furthermore, the WR MEM task completes as well, which is evident by the “ok:
[CSR1KV-1]” status. At the bottom of the output, we see in the PLAY RECAP section
that we have a status of ok=4 and changed=3. This means that out of the four tasks,
three actually modified the router and made a configuration change, while one task
saved the configuration after it was modified.
!
interface GigabitEthernet1
ip address 172.16.38.101 255.255.255.0
negotiation auto
no mop enabled
no mop sysid
!
interface GigabitEthernet2
Since the last task in the Playbook is to issue the write memory command,
verification quite simply involves issuing the show startup-config command with
some filters to see the relevant configuration on the CSR1KV-1 router. Figure 3-30
illustrates the output from the show startup-config | se
GigabitEthernet2|net3|router eigrp 100 command.
gRPC
gRPC is an open source remote procedure call (RPC) framework originally developed
by Google. It supports a wide range of programming languages, including the
following:
Python
C++
Java
Node.JS
C#
Ruby
Go
Objective-C
Dart (beta)
As you can see, this is a powerful technology that offers a wide support structure.
The reason this is so important is that you can have multiple developers working
on their own portion of a cloud project in their own programming language and
still have the ability for them all to work together as a holistic system. For example,
one developer can be working on a section of the application in Go, and the code
can interact and work with code developed by another developer using Python.
Figure 3-31 depicts the typical communication flow between multiple
programming languages.
Figure 3-31 gRPC Traffic Flow
Google is one of the biggest consumers of this technology as of late. Google is using
this for most of its cloud products and externally facing APIs. gRPC offers the
ability to create a highly scalable and agile distributed system capable of handling
workloads from different programming languages, making it very appealing to
large companies looking to develop cloud-scale applications.
As with other remote procedure calls, the data being sent across the connection
needs to be serialized. gRPC has the ability to leverage HTTP/2, which offers
quite a few benefits over its predecessor, HTTP/1.1. This is especially important
when serializing multiple data streams over a single TCP connection. This is called
multiplexing. Table 3-10 illustrates some of the key differences between the two
versions of HTTP.
In order to understand the benefit gRPC has with data serialization, it is important
to cover protocol buffers. Protocol buffers are Google’s open source method of
serializing structured data. Structured data can be a variety of data formats such as
JSON. First, you have to define the data to be serialized in a text file with a
.proto extension. In Python, we define objects by name; a proto file applies an
integer to each definition. This means that when serializing the data, instead of
having to send the whole name you defined, you can simply send a number that
represents the defined object. In effect, the proto file contains a list of fields
that just map
the object names to numbers. Example 3-16 shows the content of a sample proto
file.
message Employee {
  string EmployeeName = 1;
  int32 EmployeeID = 2;
}
Based on this example, we can send a 1 or a 2 instead of having to send the whole
field name, like EmployeeName or EmployeeID, and still have the same outcome.
This can save a significant amount of time and reduce latency as well when
serializing the data. Essentially, one can send a single number that represents a
much larger field or definition of structured data.
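To make this concrete, here is a rough Python sketch. It assumes the Employee
message shown earlier has been compiled with the protoc compiler (protoc
--python_out=. employee.proto), producing a hypothetical employee_pb2 module:

import employee_pb2  # hypothetical module generated by protoc

record = employee_pb2.Employee()
record.EmployeeName = "Jane Dough"
record.EmployeeID = 42

# Serialize to a compact byte string; the field numbers 1 and 2 are
# placed on the wire instead of the full field names.
wire_data = record.SerializeToString()
print(len(wire_data))   # a handful of bytes rather than a full JSON document

# The receiver parses the bytes back into a structured object.
decoded = employee_pb2.Employee()
decoded.ParseFromString(wire_data)
print(decoded.EmployeeName, decoded.EmployeeID)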
Summary
This chapter covered some of the foundational skills needed to get started with
network programmability. This chapter also covered some of the key capabilities
and benefits from using a programmatic approach to managing a network. The
tools covered in this chapter are available online and are very useful in terms of
building skill and expertise. DevNet is a single place where network operators and
developers can visit to practice any of the technologies and examples covered in this
chapter. You will often hear the following advice in various forms in reference to
programmability: “Start small, but just start!” It is best to practice using a sandbox
environment and just build code and run it to see what you can accomplish. The
best way to learn any of these topics is to get started and practice. You are only
limited by your imagination and coding skills! Remember to have fun and keep in
mind that programmability is a journey, not a destination. Separate your learning
into smaller, more manageable chunks. You will get better with practice and time.
Review Questions
1. True or false: Using the CLI to log in to each device individually is an
efficient, low-risk way to configure a large number of network devices.
1. True
2. False
2. To authenticate with Cisco’s DNA Center, which type of HTTP request method
must be used?
1. PUT
2. PATCH
3. GET
4. POST
5. HEAD
4. Which two of the following tools can be used to specify a RESTful API or URL in
order to practice or test API functions? (Choose two.)
1. Google Postman
2. Ansible
3. TELNET
4. XML
5. JSON
6. SSH
5. Which of the following is a valid JSON data format?
1. {
   "user": "root",
   "user": "Jason",
   "user": "Jamie",
   "user": "Luke"
   }
2. <users>
   <user>
   <name>root</name>
   </user>
   <user>
   <name>Jason</name>
   </user>
   <user>
   <name>Jamie</name>
   </user>
   <user>
   <name>Luke</name>
   </user>
   </users>
3. root
   Jason
   Jamie
   Luke
4. [users[root|Jason|Jamie|Luke]]
1. 201
2. 400
3. 401
4. 403
5. 404
9. Ansible uses the TAML syntax for creation of Playbook files that start with three
dashes (---).
1. True
2. False
10. What is the proper command for executing a Playbook using Ansible?
1. ansible-playbook ConfigureInterface.yaml
2. ansible ConfigureInterface.yaml
3. play ansible-book ConfigureInterface.yaml
4. play ansible-book ConfigureInterface.taml
References
RFC 6020, “YANG—A Data Modeling Language for the Network Configuration
Protocol (NETCONF),” M. Bjorklund, Ed., IETF, https://tools.ietf.org/html/rfc6020,
October 2010.
APPENDIX
Chapter 1
3. a. Explanation: IoT networks are more complex because they have a larger
scale and more attack vectors than traditional IT networks.
Chapter 2
1. c. Explanation: NIST only provides definitions for SaaS, PaaS, and IaaS.
Chapter 3
1. b. Explanation: Configuring a large number of devices via the CLI is not only
time consuming, it also leads to an increase in human error, ultimately putting the
business at risk.
3. b. Explanation: CRUD stands for CREATE, READ, UPDATE, and DELETE. These
are the common actions associated with the manipulation of data. For example, a
database uses these actions.
4. a and b. Explanation: Postman and Ansible both have the ability to specify a
RESTful API or URL to be used to practice or test API functions. Telnet and SSH are
transport methods, whereas XML and JSON are data structures.
5. a. Explanation: A JSON data format is built from key-value pairs. For example,
“user”: “Jason” is a key value pair, where user is the key and Jason is the value.
9. b. Explanation: Ansible uses Yet Another Markup Language (YAML) for the
creation of Playbook files. TAML doesn’t exist.