Breaking the Availability Barrier II: Achieving Century Uptimes with Active/Active Systems

Ebook, 440 pages, 4 hours

Author: Dr. Bruce Holenstein
ISBN: 9781434316059

By Dr. Bruce Holenstein and Dr. Bill Highleyman

About this ebook

This book is Volume 2 of a three-part series on active/active systems. It describes techniques that can be used today to extend the interval between system failures from years to centuries, often at little or no additional cost.

As our daily lives and corporate well-being become more dependent upon computers, system reliability grows increasingly important. No longer are frequent system outages acceptable. In many cases, failure intervals must now be measured in centuries.

The volume starts with a summary of Volume 1 and reviews techniques for achieving extraordinary availability. These techniques use active/active architectures, in which multiple independent nodes sharing a common distributed database cooperate in a common application. Should a node fail, all that is required is to switch the users on that node to a surviving node.
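The failover step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Node` class, the `fail_over` function, and the least-loaded placement rule are hypothetical choices made for this sketch and are not taken from the book. The shared distributed database is assumed to be kept current on every node by a replication mechanism that is not modeled here.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical processing node in an active/active system."""
    name: str
    healthy: bool = True
    users: set = field(default_factory=set)


def fail_over(nodes, failed_name):
    """Mark one node as failed and move its users to surviving nodes.

    Assumption: every surviving node can serve any user because all
    nodes work against the same replicated database.
    """
    failed = next(n for n in nodes if n.name == failed_name)
    failed.healthy = False
    survivors = [n for n in nodes if n.healthy]
    if not survivors:
        raise RuntimeError("no surviving node to take over the users")
    # Place each displaced user on the currently least-loaded survivor.
    for user in sorted(failed.users):
        target = min(survivors, key=lambda n: len(n.users))
        target.users.add(user)
    failed.users.clear()
    return nodes


if __name__ == "__main__":
    cluster = [Node("a", users={"u1", "u2"}), Node("b", users={"u3"}), Node("c")]
    fail_over(cluster, "a")
    for node in cluster:
        print(node.name, node.healthy, sorted(node.users))
```

The sketch only moves user assignments. In a real deployment the switch would happen in the network or client layer, and the placement rule could be anything from static mapping to load-based routing.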

Equally important to the achievement of high availability is the ability to upgrade the system hardware and software without denying service to the users. The procedures to do this within an active/active system are described.

The secret to high availability is to let it fail, but fix it fast. This volume explores the server, database, and network redundancy techniques that allow fast-fix to happen. The cost considerations involved in such redundant architectures are also explored.
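A rough calculation shows why fixing fast matters so much. The sketch below uses two common textbook approximations, which are assumptions of this example rather than formulas quoted from the book: steady-state availability of one node is MTBF / (MTBF + MTR), and a redundant pair is lost only if the second node fails while the first is still under repair, giving a pair failure interval of about MTBF^2 / (2 * MTR). The one-year node MTBF and the two repair times are hypothetical inputs.

```python
HOURS_PER_YEAR = 8760.0


def availability(mtbf_hours, mtr_hours):
    """Steady-state availability of a single repairable node."""
    return mtbf_hours / (mtbf_hours + mtr_hours)


def pair_mtbf_hours(mtbf_hours, mtr_hours):
    """Approximate mean time between failures of a redundant pair.

    Assumes independent failures and a repair time that is short
    compared with the node MTBF.
    """
    return mtbf_hours ** 2 / (2.0 * mtr_hours)


if __name__ == "__main__":
    node_mtbf = HOURS_PER_YEAR  # hypothetical: one node failure per year
    for mtr in (24.0, 4.0):     # hypothetical repair times in hours
        years = pair_mtbf_hours(node_mtbf, mtr) / HOURS_PER_YEAR
        print(f"repair time {mtr:4.0f} h: "
              f"node availability {availability(node_mtbf, mtr):.5f}, "
              f"pair failure interval about {years:.1f} years")
```

With these inputs, cutting the repair time from 24 hours to 4 hours moves the estimated pair failure interval from roughly 180 years to roughly 1,100 years, which is the sense in which a fast fix turns years into centuries.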

Category: Computers
Language: English
Publisher: AuthorHouse
Release date: Jun 1, 2007
ISBN: 9781434316059

Author

Dr. Bruce Holenstein

Dr. Bill Highleyman, Paul J. Holenstein, and Dr. Bruce Holenstein have a combined experience of over 90 years in the implementation of fault-tolerant, highly available computing systems. This experience ranges from the early days of custom redundant systems to today’s fault-tolerant offerings from HP (NonStop) and Stratus. Dr. Bill Highleyman has done extensive work on the effect of failure mode reduction on system availability. He has built fault-tolerant systems for train control, racetrack wagering, securities trading, message communication, and other applications. He is the Managing Editor of the Availability Digest (availabilitydigest.com). Paul J. Holenstein and Dr. Bruce Holenstein have architected and implemented the various data replication techniques required for the availability enhancements described in this book. Their company, Gravic, provides the Shadowbase line of data replication products to the fault-tolerant community.

4 min read
4 Ways To Protect Your Small Business From Cyberattacks
TechLife News
UNLIMITED
4 Ways To Protect Your Small Business From Cyberattacks
May 21, 2022
3 min read
Remote Support Software 2020
PC Pro Magazine
UNLIMITED
Remote Support Software 2020
Aug 13, 2020
3 min read
Edge and Cloud Computing Can They Coexist Peacefully?
Techfastly
UNLIMITED
Edge and Cloud Computing Can They Coexist Peacefully?
Jun 1, 2022
6 min read
Workflow
Linux Format
UNLIMITED
Workflow
Nov 17, 2020
3 min read
In Crosshairs Of Ransomware Crooks, Cyber Insurers Struggle
AppleMagazine
UNLIMITED
In Crosshairs Of Ransomware Crooks, Cyber Insurers Struggle
Jul 9, 2021
5 min read
Cybersecurity Firm Fireeye Says Was Hacked By Nation State
TechLife News
UNLIMITED
Cybersecurity Firm Fireeye Says Was Hacked By Nation State
Dec 12, 2020
3 min read
Get Into Coding!
Linux Format
UNLIMITED
Get Into Coding!
Aug 23, 2022
1 min read
Host Your Own Cloud
PC Pro Magazine
UNLIMITED
Host Your Own Cloud
Jul 9, 2020
8 min read
Wi-Fi 6E THE 6GHZ REVOLUTION
APC
UNLIMITED
Wi-Fi 6E THE 6GHZ REVOLUTION
Apr 19, 2021
11 min read
Cybersecurity Courses Ramp Up Amid Shortage Of Professionals
TechLife News
UNLIMITED
Cybersecurity Courses Ramp Up Amid Shortage Of Professionals
Jun 18, 2022
7 min read
Answers
Linux Format
UNLIMITED
Answers
Nov 14, 2023
9 min read
Build A Self-hosted Fediverse Server
Linux Format
UNLIMITED
Build A Self-hosted Fediverse Server
Jan 11, 2022
7 min read

Related categories

Skip carousel

Reviews for Breaking the Availability Barrier Ii

Rating: 0 out of 5 stars

0 ratings

0 ratings0 reviews

Book preview

Breaking the Availability Barrier Ii - Dr. Bruce Holenstein

Dedication

Forward

What is This Book?

Achieving Extreme Availabilities

A Roadmap Through This Book

Authors’ Notes

Acknowledgements

About the Authors

Part 1-Survivable Systems for Enterprise Computing

Chapter 1-Achieving Century Uptimes

What is Reliability?

System Availability

The 9s Measure of Availability

The Price of Reliability

The Why of Century Uptimes

The How of Century Uptimes

The Acceptance of Active/Active Technology

What’s Next

Chapter 2-Reliability of Distributed Computing Systems

Active/Active Systems Reviewed

The Availability Relationship

The Importance of Repair Time

The Importance of Recovery Time and the 4 Rs

System Splitting

Failover Time

Failover Faults

Environmental Faults

What’s Next?

Chapter 3-An Active/Active Primer

A General Solution

Database Locality

Database Synchronization

Synchronous Replication

Failure Mechanisms

Controlling Database Costs

The Availability/Performance/Cost Compromise

What’s Next?

Part 2-Building and Managing Active/Active Systems

Chapter 4-Active/Active Topologies

Architectural Topologies

Network Topologies

What’s Next

Chapter 5-Redundant Reliable Networks

The Need for Network Redundancy

Reliability Is More Than Just Redundancy

The Great Protocol Wars of the Twentieth Century

Redundancy Configurations

Backup Networks

Reconfigurable Networks

The Internet

Fault Detection

Fault Recovery

Fault Repair

Transaction, Session, and Connection Loss

Cost

A Case Study

What’s Next

Chapter 6-Distributed Databases

The Need for Distributed Databases

Database Synchronization

Issues with Distributing a Database

Issues with Remote Access

Failover

Database Recovery

What’s Next

Chapter 7-Node Failures

Causes of Node Failures

Detecting Failures

Failover

Switching Users

Node Recovery

Other Issues

What’s Next

Chapter 8-Eliminating Planned Outages with Zero Downtime Migrations

Introduction

Yes! Application Availability Counts

The ZDM Solution

Uses for ZDM

The Online Copy Facility

Zero Downtime Migrations with Low Risk

Planned Outages Eliminated

What’s Next?

Chapter 9-Total Cost of Ownership (TCO)

Choosing the Solution

Return on Investment (ROI)

Total Cost of Ownership (TCO)

Initial System Cost

Recurring Costs

Putting It All Together

What’s Next

Appendices

Appendix 1-Rules of Availability

Volume 1 Rules

Volume 2 Rules

Volume 3 Rules

References and Suggested Reading

About the Authors

Endnotes

Dedication

This book is dedicated to our spouses,

Denise, Janice, and Karen,

for their enduring patience and support.

We also dedicate this book to Jim Gray

for his fundamental contributions to transaction

processing technology on which this book is based.

Jim, an avid sailor, has been missing at sea

since January 28, 2007.

Forward

Given today’s technology, [six 9s] is unachievable for all practical purposes, and an unrealistic goal.

-Evan Marcus and Hal Stern, 2000¹

My, how things change in just a few years! Not only are we going to talk about achieving systems with six 9s availability but also with eight 9s availability and beyond. Furthermore, we are not talking just about system availability. We are talking about application service availability. After all, following a failure of some sort, if the users of an application are being serviced in an unacceptable manner (such as experiencing excessively long response times), then the application is essentially not available.

If you could configure your current system to:

• provide extreme availability (MTBFs measured in centuries),

• affect only a subset of users upon a failure,

• recover from any failure in subseconds to seconds,

• lose little if any data as the result of a failure,

• eliminate planned downtime,

• achieve disaster tolerance,

• use all available capacity,

• load balance at will,

• be easily expandable,

• require no change to existing applications,

• all at little or no additional cost,

wouldn’t you be interested? We think so, and that is what this book is all about. Active/active systems can and do provide these benefits today.

Abe Lincoln said that it is better to remain silent and be thought a fool than to speak out and remove all doubt. At the risk of sounding foolish to some, we recognize that there are naysayers who will argue that extreme availabilities cannot be achieved. In this book we are speaking out, confident that the many examples of successful installations of active/active systems will prove us not to be fools, notwithstanding Abe.

What is This Book?

We referred to this book in the previous section. Actually, when we started to write this book, we intended it to be the second in a series on active/active systems. However, when we finished it, it became apparent that it was much too long to be a comfortable single book to read. Therefore, we decided to break it up into two volumes.

We will refer to the (now) three volumes as Volumes 1, 2, and 3. This book comprises Volumes 2 and 3. The titles of the active/active trilogy are:

Volume 1: Breaking the Availability Barrier: Survivable Systems for Enterprise Computing, published by AuthorHouse in 2004,

Volume 2: Breaking the Availability Barrier II: Achieving Century Uptimes with Active/Active Systems, published by AuthorHouse in 2007 along with Volume 3,

Volume 3: Breaking the Availability Barrier III: Active/Active Systems in Practice, published by AuthorHouse in 2007 along with Volume 2.

In keeping with Volumes 2 and 3 being essentially one book, this Forward is the same in each volume. However, the content of each volume is markedly different.

Let us now return to the introduction of active/active systems.

Achieving Extreme Availabilities

The secret to the achievement of extreme availabilities is in the configuration. By configuring (or re-configuring) your monolithic system as an active/active architecture, the benefits described in our introduction can all be achieved.

What is an active/active system? We define it as a network of independent processing nodes, each having access to a common replicated database. All nodes can cooperate in a common application, and users can be serviced by multiple nodes.

[Figure: An Active/Active System]

Note an important implication of this definition. Active/active architectures are not just about protecting against hardware failures. In most cases, any event that will bring down a monolithic system will only bring down one node in an active/active system. Such failure events include not only hardware faults, but also software faults, operator errors, environmental failures (air conditioning, power, etc.), and manmade or natural disasters. Active/active architectures protect users against all of these faults, allowing service to be continued by simply switching users from a failed node to one or more surviving nodes.

Another implication is what active/active is not. Active/active is not a technology; it is a business solution. Active/active is not about distributed database synchronization; it is about achieving century uptimes. More specifically,

• Active/active systems are not co-located clusters. A basic tenet of active/active systems is that they protect against area-wide problems. If the nodes cannot be geographically separated, then they are not part of an active/active system.

• Active/active systems are not independent nodes using a common database. In such an architecture, the database cannot be geographically distributed and represents a single point of failure.

• Active/active systems are not those that use hardware replication for database synchronization. Hardware replication cannot guarantee referential integrity.² As a consequence, applications at synchronized sites cannot use the database copies.

• By the same token, active/active systems are not those that use software replication engines that do not guarantee referential integrity.

• Active/active systems are not clusters. Users on an active/active system can be put back into service in seconds by switching them to another operating node. Clusters require that another node be brought online, a process that typically takes minutes. This time delay precludes century uptimes.

• Active/active systems are not lock-stepped or voting systems because such designs require each node to process the same requests, thus precluding scalability.

• Active/active systems are not limited to enterprise applications. There are active/active distributed database systems on the market that are loosely coupled and synchronized by replication.

• Active/active systems do not require distributed disk-resident databases. Many active/active systems maintain their databases in memory.

Of course, in some cases, there may be no need for a database in an application (for example, a cluster of Web servers). In such systems, there is no context saved between operations. Implementing clusters of systems such as these is not a difficult task as it is only necessary to route any transaction to any surviving server. However, if an active database is involved such that context is retained from transaction to transaction, then providing a redundant synchronized database is necessary. This brings with it a myriad of issues. These volumes concentrate on applications which depend upon an integrated and updatable distributed database.

In many cases, the nodes in the application network are completely symmetric. Any transaction can be routed to any node, which can read or update any set of data items in the database. Should a node fail, users at the other nodes are unaffected. Furthermore, the users at the failed node can be switched quickly to surviving nodes, with their services restored in seconds or less.

In seconds is the secret. Common today is the use of cluster technology to provide high availability. Should a node in the cluster fail, users are switched to a backup node. However, the applications on that node must be brought up and database tables and files opened before application services can be offered to the users. This process typically takes several minutes or more. In active/active configurations, all applications are already up and running on each node and are actively processing transactions. All that must be done is to switch over affected users to surviving nodes.
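As a rough illustration of why switching users is so quick, the following minimal sketch reassigns the users of a failed node to the surviving nodes. It is only a sketch under stated assumptions: the function name fail_over, the round-robin policy, and the node and user names are hypothetical and are not taken from any product described in these volumes.

```python
def fail_over(assignments, failed_node, surviving_nodes):
    """Return a new user-to-node map with the failed node's users
    spread round-robin across the surviving nodes.

    Assumption: every surviving node is already running the application,
    so reassignment is the only step required.
    """
    if not surviving_nodes:
        raise ValueError("no surviving nodes to take over the users")

    # Users on healthy nodes keep their current assignment.
    updated = dict(assignments)

    # Only the users of the failed node are moved.
    affected = sorted(u for u, node in assignments.items() if node == failed_node)
    for i, user in enumerate(affected):
        updated[user] = surviving_nodes[i % len(surviving_nodes)]
    return updated


# Hypothetical three-node application network.
before = {"alice": "A", "bob": "A", "carol": "B", "dave": "C"}
after = fail_over(before, failed_node="A", surviving_nodes=["B", "C"])
print(after)
```

Because no application has to be started and no database has to be opened, the work done at failover is limited to this kind of reassignment, which is why recovery can be measured in seconds.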

Let us say that an active/active system can recover services in three seconds and that the equivalent cluster can recover in five minutes (300 seconds). The cluster will be down one-hundred times longer than will the active/active system. This lops off two nines from the cluster’s availability relative to the equivalent active/active system. A six 9s active/active system would be reduced to an availability of four 9s if it were in a cluster configuration. No wonder in 2007 many pundits still state that six 9s is not possible. But it is, as we will show in these volumes.

This leads to one of our availability rules:

Rule 36: To achieve extreme reliabilities, let it fail; but fix it fast.³
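The two-nines arithmetic behind this rule can be checked with a short calculation. This is a sketch under stated assumptions: both configurations are assumed to fail equally often, and the interval between failures (MTBF_SECONDS) is a hypothetical value chosen so that a three-second recovery yields roughly six 9s. The function names are illustrative only.

```python
import math


def unavailability(mtbf_seconds, recovery_seconds):
    """Fraction of time the service is down: downtime / (uptime + downtime)."""
    return recovery_seconds / (mtbf_seconds + recovery_seconds)


def nines(unavail):
    """Number of 9s of availability, e.g. unavailability 1e-6 -> 6 nines."""
    return -math.log10(unavail)


# Hypothetical interval between failures, identical for both configurations.
MTBF_SECONDS = 3_000_000

active_active = nines(unavailability(MTBF_SECONDS, 3))   # 3-second recovery
cluster = nines(unavailability(MTBF_SECONDS, 300))       # 5-minute recovery

print(f"active/active: {active_active:.2f} nines")
print(f"cluster:       {cluster:.2f} nines")
print(f"difference:    {active_active - cluster:.2f} nines")
```

With the failure interval held fixed, a recovery time that is one hundred times longer multiplies the unavailability by roughly one hundred, and the base-10 logarithm of one hundred is the two 9s that are lost.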

Are extreme availabilities important to you? Are the four 9s available with HP NonStop servers or with PC or Unix clusters acceptable? As we will discuss later, surveys have shown that the costs of downtime can range from USD $100,000 to several million dollars an hour, depending upon the application. Perhaps even worse, downtime can lead to the dreaded CNN Moment and massive losses in stock value (see Chapter 9, Total Cost of Ownership (TCO), in Volume 2 for what happened to AOL in 1996 and eBay in 1999). At the extreme, downtime can lead to significant property loss or even loss of life.

Only you can make this judgment. If extreme availabilities are important to your enterprise, this book is for you.

A Roadmap Through This Book

As we explained earlier, this book is in fact Volumes 2 and 3 of our trilogy describing how to achieve extreme availabilities with active/active systems. The first volume in this series, published in 2004 by AuthorHouse (www.authorhouse.com) and entitled Breaking the Availability Barrier: Survivable Systems for Enterprise Computing, referred to herein as Volume 1, lays the groundwork and the theory supporting the concepts of active/active systems. These two current volumes focus more on the practical aspects of implementing these systems.

They are broken into four parts, Parts 1 and 2 being in Volume 2 and Parts 3 and 4 being in Volume 3:

• Part 1, Survivable Systems for Enterprise Computing, summarizes and expands on Volume 1 and provides the background for the further topics discussed in these Volumes 2 and 3. Volume 1 is not needed to understand the content or the conclusions of Volumes 2 and 3.

• Part 2, Building and Managing Active/Active Systems, demonstrates how to build the redundancy required by active/active systems and how to control their cost and performance.

• Part 3, Infrastructure Case Study, describes an example of commercially available infrastructure products known to the authors to be suitable for production active/active systems. It also provides a valuable performance analysis tool for these products.

• Part 4, Active/Active Systems at Work, summarizes many of the beneficial uses of active/active systems, provides several case studies of active/active systems in use today, and describes various related technologies and issues.

The authors’ intended audience for these Volumes 2 and 3 and their predecessor Volume 1 includes IT executives who feel that they must reduce the downtime of their systems, system architects and senior developers who must build these systems or modify existing systems to achieve the required availability, and operations staff who must run these systems and recover from system faults.

Part 1-Survivable Systems for Enterprise Computing

As the French biologist Louis Pasteur said, “Chance favors the prepared mind.” To prepare ourselves to understand active/active systems, Volume 1 of this series laid the groundwork for active/active systems and supported the concepts with mathematical analyses. As noted earlier, Part 1 of this Volume 2 summarizes and expands upon the contents of Volume 1.

In Chapter 1, Achieving Century Uptimes, we discuss what reliability is and how to quantify it. We then extend these concepts to extremely reliable system configurations called active/active systems.

Chapter 2, Reliability of Distributed Computing Systems, summarizes the mathematical foundations for active/active systems. Readers who are averse to mathematics will be pleased to know that the rest of this book uses minimal mathematics (except for the data replication engine performance model, which is relegated to Appendix 2). In fact, Chapter 2 can be skipped without missing the main points of the material in the following chapters.

An overview of active/active systems is presented in Chapter 3, An Active/Active Primer. Here we discuss in some detail the structure and characteristics of the all-important data replication engine. We also look briefly at the various failure modes and how to recover from them as well as how to control the costs of active/active architectures. These latter subjects are analyzed in much greater detail in Part 2 of this volume.

Part 2-Building and Managing Active/Active Systems

The whole rationale behind active/active systems is active redundancy, which masks failures by recovering from them so rapidly that no one notices. A similar but localized philosophy is used in HP’s NonStop servers, in which critical software processes are supported by backup processes in other processors resident in the same node and ready to take over in subsecond time. Also, all databases are redundant so that disk faults are masked.

There are a variety of application network topologies that have the characteristics of active/active systems. In Chapter 4, Active/Active Topologies, examples of many of these configurations are described.

In active/active systems, the inherent redundancy includes networks, databases, and processing nodes. Chapter 5, Redundant Reliable Networks, discusses ways in which to build the reliable networks needed for data replication to provide database synchronization between distributed database copies, for heartbeats to monitor the health of the processing nodes, and for users to be switched between nodes.

Chapter 6, Distributed Databases, describes how data replication engines can be used to keep in synchronism the multiple copies of a database in the application network. It discusses issues with replication such as data collisions and loss of data following a failure. Recovery from a failed database copy and access to a viable database copy following a node or network failure are explored.

The monitoring of a processing node’s health is discussed in Chapter 7, Node Failures. A node can be considered to have failed if the processing system comprising that node has failed, if its database has failed, or if it has lost connectivity to the rest of the application network due to network faults. Techniques for recovering from a node failure are discussed, including issues such as tug-of-wars and operating in split-brain mode.
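The three failure conditions just listed can be expressed as a simple health check. The sketch below is illustrative only: the NodeStatus fields, the node_failed function, and the three-second heartbeat timeout are hypothetical choices, not the detection mechanism described in Chapter 7.

```python
from dataclasses import dataclass


@dataclass
class NodeStatus:
    last_heartbeat: float   # time (seconds) the last heartbeat was received
    database_ok: bool       # the node's database copy is usable
    network_ok: bool        # the node can reach the rest of the application network


def node_failed(status, now, heartbeat_timeout=3.0):
    """A node is treated as failed if its processing system has stopped
    sending heartbeats, its database has failed, or it has lost connectivity.

    Assumption: a missing heartbeat for longer than heartbeat_timeout
    seconds stands in for a failed processing system.
    """
    heartbeat_missed = (now - status.last_heartbeat) > heartbeat_timeout
    return heartbeat_missed or not status.database_ok or not status.network_ok


# Hypothetical readings taken at time now = 100.0 seconds.
healthy = NodeStatus(last_heartbeat=99.0, database_ok=True, network_ok=True)
silent = NodeStatus(last_heartbeat=90.0, database_ok=True, network_ok=True)
isolated = NodeStatus(last_heartbeat=99.5, database_ok=True, network_ok=False)

for name, status in [("healthy", healthy), ("silent", silent), ("isolated", isolated)]:
    print(name, node_failed(status, now=100.0))
```

A check this simple cannot tell a dead node from a node that is merely unreachable, which is one reason the split-brain and tug-of-war issues mentioned above need separate treatment.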

A highly beneficial use of controlled failures is shown in Chapter 8, Eliminating Planned Outages with Zero Downtime Migration (ZDM). Planned downtime is one of the major causes of reduced application availability. In many installations, the planned downtime required to upgrade a system or to execute other maintenance functions far exceeds unplanned downtime due to faults. In active/active systems, a node can be taken out of service purposefully with little or no impact on the users. This capability can be used to advantage to upgrade hardware, operating system software, application software, database structures, and so on. This technique also allows the capacity of the application network to be easily expanded by adding new nodes online.

Controlling the cost of an active/active system is as important as it is with any other system. However, active/active systems present an additional level of complexity. There are many ways to configure an active/active system to manage the appropriate compromise between cost, availability, and performance. As we look at different potential configurations, how do we know which contenders are the least costly? What are the factors that enter into the total cost of ownership equation? These topics are discussed in some detail in Chapter 9, Total Cost of Ownership (TCO).

Part 3-Infrastructure Case Study

In the first two parts of this book, we describe why active/active systems can provide such high availability and how to build these systems. A set of tools is described that forms a basis for the implementation of active/active systems. In Part 3, we look at a set of commercially available tools that fill the needs of active/active systems and at a performance model that can be used to gauge the effectiveness of such tools. The tools described are necessarily ones with which the authors are quite familiar, but they are otherwise representative of several such tools in the marketplace.⁴

The above chapters have covered two of the three legs of the active/active triangle: availability and cost. The third leg is performance. At the heart of most active/active systems is the data replication engine, and the performance of an active/active system is directly related to this engine. In Chapter 10, Performance of Active/Active Systems, we create a performance model for a generic data replication engine and show how its various performance measures are affected by a variety of replication engine architectures. The mathematics behind the performance model are left for Appendix 2, Replication Engine Performance Model, in Volume 3.

The primary facility that is required is an appropriate data replication engine. Chapter 11, Shadowbase, describes the Shadowbase data replication engine that has been used in many such implementations. Shadowbase is an example of a data replication engine with a very low replication latency (the time it takes for a change that is made to a source database to be propagated to the target database). Low replication latency is important to minimize data collisions and also to minimize data loss following a failure.
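To see why replication latency bounds data loss, consider a back-of-the-envelope estimate. This is a sketch under a stated assumption, not the performance model of Chapter 10: with asynchronous replication, the changes still in the replication pipeline when a node fails are the ones at risk, and on average their number is the update rate multiplied by the replication latency. The function name and the sample figures are hypothetical.

```python
def changes_at_risk(update_rate_per_second, replication_latency_seconds):
    """Average number of changes in flight (not yet applied to the target
    database) at any instant: update rate x replication latency.

    Assumption: updates arrive at a steady rate and every change spends
    the same time in the replication pipeline.
    """
    return update_rate_per_second * replication_latency_seconds


# Hypothetical workload of 100 updates per second.
for latency in (5.0, 0.5):
    at_risk = changes_at_risk(100, latency)
    print(f"latency {latency} s -> about {at_risk:.0f} changes at risk")
```

Under the same assumption, a shorter latency also narrows the window in which two nodes can update the same data item before either sees the other's change, which is how lower latency reduces data collisions.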

In order to take a node out of service and later return it to service, it is important to have a database copy facility that can copy the contents of an active database to a node about to be put into service (or even after it has been placed into service) while the source database is being updated. Chapter 12, SOLV, describes such a utility. Working with Shadowbase, SOLV can efficiently make a copy of an active database even while that database is being updated. In addition, future versions of SOLV will verify that two online databases are in synchronism and will resynchronize two active databases by repairing rows with differing content.⁵

In Chapter 13, ZDM with Shadowbase, we discuss the use of Shadowbase and SOLV to upgrade nodes in an active/active system without taking down the applications. With Zero Downtime Migrations, planned downtime can be completely eliminated since nodes in an application network can be upgraded without denying service to any user. Upgrades can include the hardware, operating system, applications, database, and networks, among others. In addition, ZDM can be used to add nodes dynamically into an application network to expand its capacity.

Part 4-Active/Active Systems at Work

After learning how to build an active/active system and having seen an example of a tool set needed to do this, Part 4 looks at some actual uses of this technology in place today. It also describes some related technologies and issues.

We start in Chapter 14, Benefits of Multiple Nodes in Practice, by summarizing the various active/active system benefits that we have discussed in the book. These benefits include achieving extreme availability and very fast response time in the face of unplanned outages and even disasters, the elimination of scheduled downtime, the efficient use of all available processing capacity, the simplification of recovery testing, and application capacity expansion, both symmetric and asymmetric.

In Chapter 15, Case Studies, we look at a variety of actual uses of active/active technology. Our examples come from a wide variety of industries, including financial institutions, telecommunications, travel, web services, brokerages, plant management, and even casinos.

Finally, in Chapter 16, Related Technologies and Drivers, we explore some technologies that are related to availability. They include Grid Computing, the NonStop Server Advanced Architecture, Split Mirrors, the Real-Time Enterprise, Bulletproof Storage, and Virtual Tape. We also discuss the large number of regulatory requirements that may affect your availability decisions.

Appendices

Throughout all three volumes of this trilogy, a variety of rules applicable to highly available systems have been stated. These rules are summarized in Appendix 1, Rules of Availability. These are annotated with volume and chapter so that their context can easily be found and studied.

Appendix 1 is contained in both Volumes 2 and 3. The remaining appendices will be found in Volume 3.

Appendix 2, Replication Engine Performance Model, sets forth the detailed mathematics behind the data replication engine performance model summarized in Chapter 10, Performance of Active/Active Systems. It also structures the resulting model into a set of tables suitable for creating an Excel spreadsheet for convenient performance calculations.

Appendix 3, Regulatory Requirements, summarizes the various regulatory issues that may have a bearing on the availability and operations of processing systems. These regulations are referenced in Chapter 16, Related Technologies and Drivers.

Additionally, we asked a noted consultant in the field of highly available systems, Dr. Werner Alexi, President of CS Software, Concepts, and Solutions, GmbH, to provide his comments and critique on active/active systems. His views are presented in Appendix 4, A Consultant’s Critique.

Authors’ Notes

You may have noted that this is a long book when both volumes are considered. As Winston Churchill said, the length of this document defends it well against the risk of its being read. To mitigate this, we would like to point out that most detail is summarized in snippets that can easily be scanned, often as rules. For instance, you might want to just hunt for the rules and read the supporting text. This will give you a good feeling for where we are trying to take you.

In many places throughout this book, reference is made to HP NonStop systems. NonStop systems were originally developed by Tandem Computers to provide very high availability. Tandem Computers was subsequently acquired by Compaq Computers, and Compaq was then acquired by HP. HP has changed the name of the Tandem systems to HP NonStop servers. The authors have considerable experience with these systems. However, concepts and recommendations presented in this book are extendable to all types of commodity systems to make them redundant, including HP Superdome, Windows Server clusters, Unix clusters, Linux servers, and IBM Parallel Sysplex systems.

Each of the chapters in this book has been written to be self-standing at the risk of some repetition. Therefore, the reader is encouraged to pick and choose the topics of interest and to read only those chapters that apply. Adequate reference is made to other chapters to suggest further reading.

Acknowledgements

All three volumes of Breaking The Availability Barrier have benefited from reviews by many people. We gratefully acknowledge Mary Heck for her contributions to Appendix 3 and Dr. Werner Alexi for his critique, published in Appendix 4. We also thank Burt Liebowitz and John Carson, whose book Multiple Processing Systems for Real-Time Applications provided background for this work, and Jim Gray, whose many writings fueled the fire. They and others who have influenced this volume include:

Werner Alexi, CS Software

Wendy Bartlett, HP

Victor Berutti, Gravic

Richard Buckle, Insession

Robert Cline, SunGard Securities Processing

Dan Coughlin, First Data Corp.

Michael Crispyn, Fifth Third Bank

Terry Cumaranatunge, Motorola

Dick Davis, Gravic

Giampaolo Gandini, Telecom Italia Mobile

Jeff Glatstein, SunGard Securities Processing

Jim Gray, Microsoft

Jon Healy, SunGard Securities Processing

Mary Heck, Gravic

Tom Hoffmann, Motorola

Bill Holenstein, Gravic

Denise Holenstein, Gravic

Dan Hoppmann, A. G. Edwards

ITUG Connection staff

Clark Jablon, Akin Gump

Gene Jarema, Gravic

Jim Johnson, The Standish Group

Tim Keefauver, HP

Rob Klotz, First Data Corp.

Bill Knapp, Gravic

Bob Kossler, HP

Burt Liebowitz, Consultant

Bob Loftis, HP

Mike Nemerowski, SunGard Securities Processing

Carl Niehaus, HP

Kate Noer, SunGard Securities Processing

Gianfranco Pompado, Telecom Italia Mobile

Tullio Privitera, Telecom Italia Mobile

Janice Reeder, The Sombers Group

Steve Saltwick, HP

Harry Scott, Carr Scott Software

Scott Sitler, HP

Gary Strickler, Gravic

Bart van Leeuwen, Rabobank

Joanne Welk, Motorola

About the Authors

Paul J. Holenstein is Executive Vice President of Gravic, Inc., the maker of the Shadowbase line of data replication products. Shadowbase is a low latency, high-performance, real-time data replication engine that provides business continuity as well as heterogeneous data integration and synchronization.

Enjoying the preview?

Page 1 of 1

CONTENTS

Dedication

Foreword

Given today’s technology, [six 9s] is unachievable for all practical purposes, and an unrealistic goal.

What is This Book?

Achieving Extreme Availabilities
