IAU ST Lecture2

The document discusses the design and architecture of reliable, scalable, and maintainable data-intensive applications. It highlights key concerns such as reliability, scalability, and maintainability, and outlines strategies to address challenges like fault tolerance, human errors, and system complexity. Additionally, it emphasizes the importance of operability, simplicity, and evolvability in software systems to minimize maintenance costs and adapt to changing requirements.

Uploaded by

asa5tanha


Big Data Analytics

Lecture 2
Mohammad Hamzei
Department of Computer Engineering
Islamic Azad University, South Tehran Branch
[email protected]
Reliable, Scalable, and Maintainable Applications
Introduction

• Many applications today are data-intensive, as opposed to compute-intensive.
– The main challenges are usually the amount of data, the complexity of data, and the speed at which it is changing, rather than raw CPU power.
Introduction

Standard building blocks of data-intensive applications:
• Store data so that the application, or another application, can find it again later (databases)
• Remember the result of an expensive operation, to speed up reads (caches)
• Allow users to search data by keyword or filter it in various ways (search indexes)
• Send a message to another process, to be handled asynchronously (stream processing)
• Periodically crunch a large amount of accumulated data (batch processing)
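
As a minimal sketch of the caching building block, the Python snippet below memoizes a slow function with the standard library's functools.lru_cache. The function expensive_lookup and the key "user:42" are hypothetical stand-ins for a real database query; they are not part of the lecture.

```python
from functools import lru_cache

calls = 0  # counts how often the expensive work actually runs


@lru_cache(maxsize=128)
def expensive_lookup(key: str) -> str:
    """Hypothetical stand-in for a slow database query or computation."""
    global calls
    calls += 1
    return key.upper()


expensive_lookup("user:42")  # first call: the result is computed
expensive_lookup("user:42")  # second call: the result comes from the cache
```

The second call never reaches the function body, which is the whole point of a cache: the read gets faster at the cost of keeping a copy of the result around.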
Architecture (Example)
Challenges

• If you are designing a data system or service, a lot of tricky questions arise.
– How do you ensure that the data remains correct
and complete, even when things go wrong
internally?
– How do you provide consistently good performance
to clients, even when parts of your system are
degraded?
– How do you scale to handle an increase in load?
– What does a good API for the service look like?
Concerns

• Important concerns in most software systems:
1. Reliability
2. Scalability
3. Maintainability
Reliability

• Reliability: The system should continue to work correctly (performing the correct function at the desired level of performance) even in the face of adversity (hardware or software faults, and even human error).
Reliability

• “Continuing to work correctly, even when things go wrong.” Typical expectations:
– The application performs the function that the user expected.
– It can tolerate the user making mistakes or using the software in unexpected ways.
– Its performance is good enough for the required use case, under the expected load and data volume.
– The system prevents any unauthorized access and abuse.
Fault and Failure

• The things that can go wrong are called faults, and systems that anticipate faults and can cope with them are called fault-tolerant or resilient.
• A fault is usually defined as one component of the system deviating from its spec, whereas a failure is when the system as a whole stops providing the required service to the user.
• Fault tolerance: use mechanisms that prevent faults from causing failures.
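
One way to picture the fault/failure distinction is a read that fails over between replicas. The sketch below is illustrative only: read_with_failover, broken_replica, and healthy_replica are hypothetical names, and a real system would also need timeouts and retry limits.

```python
from typing import Callable, Optional, Sequence


def read_with_failover(replicas: Sequence[Callable[[str], str]], key: str) -> str:
    """Try each replica in turn; raise only if every one of them faults."""
    last_error: Optional[Exception] = None
    for read in replicas:
        try:
            return read(key)
        except ConnectionError as exc:
            # A fault: one component deviated from its spec. Try the next one.
            last_error = exc
    # A failure: the system as a whole cannot provide the service.
    raise RuntimeError("all replicas failed") from last_error


def broken_replica(key: str) -> str:
    """Hypothetical replica whose disk has crashed."""
    raise ConnectionError("disk crashed")


def healthy_replica(key: str) -> str:
    """Hypothetical replica that works normally."""
    return f"value-for-{key}"


value = read_with_failover([broken_replica, healthy_replica], "k1")
```

Here the first replica faults, but the caller still gets an answer, so the fault does not become a failure.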
Fault tolerance

• Although we generally prefer tolerating faults over preventing faults, there are cases where prevention is better than cure (e.g., because no cure exists).
– This is the case with security matters, for example: if
an attacker has compromised a system and gained
access to sensitive data, that event cannot be
undone.
Faults and Errors

• Hardware faults

• Software errors

• Human errors
Hardware Faults

• Hard disks crash, RAM becomes faulty, the power grid has a blackout, someone unplugs the wrong network cable.
• Solutions:
– Add redundancy to the individual hardware components in order to reduce the failure rate of the system.
– Disks may be set up in a RAID configuration.
– Servers may have dual power supplies and hot-swappable CPUs.
– Datacenters may have batteries and diesel generators for backup power.
Hardware Faults

• There is a move toward systems that can tolerate the loss of entire machines, by using software fault-tolerance techniques in preference or in addition to hardware redundancy.
• On cloud platforms such as Amazon Web Services (AWS), it is fairly common for virtual machine instances to become unavailable without warning, as the platforms are designed to prioritize flexibility and elasticity over single-machine reliability.
Software Errors

• There is no quick solution to the problem of systematic faults in software.
• Lots of small things can help:
– Carefully thinking about assumptions and interactions in the system
– Thorough testing
– Process isolation
– Allowing processes to crash and restart
– Measuring, monitoring, and analyzing system behavior in production
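
The "crash and restart" idea can be sketched as a tiny supervisor loop. This is an assumption-laden illustration, not a production supervisor: supervise and flaky_worker are hypothetical names, and real supervisors usually restart separate OS processes and add back-off delays.

```python
def supervise(worker, max_restarts=3):
    """Run worker(); if it crashes, restart it up to max_restarts times.

    Returns a tuple (result, number_of_restarts).
    """
    restarts = 0
    while True:
        try:
            return worker(), restarts
        except Exception:
            if restarts >= max_restarts:
                raise  # give up: the fault has become a failure
            restarts += 1


attempts = {"count": 0}


def flaky_worker():
    """Hypothetical worker that crashes on its first two runs."""
    attempts["count"] += 1
    if attempts["count"] <= 2:
        raise RuntimeError("simulated crash")
    return "done"


result, restarts = supervise(flaky_worker)
```

In this run the worker crashes twice and succeeds on the third attempt, so the caller sees a normal result plus a restart count that monitoring could record.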
Human Errors

• How do we make our systems reliable, in spite of unreliable humans?
• The best systems combine several approaches:
– Design systems in a way that minimizes opportunities for error.
– Decouple the places where people make the most mistakes from the places where they can cause failures.
• Provide fully featured non-production sandbox environments where people can explore and experiment safely, using real data, without affecting real users.
Human Errors

• The best systems combine several approaches (continued):
– Test thoroughly at all levels, from unit tests to whole-system integration tests and manual tests.
– Allow quick and easy recovery from human errors, to minimize the impact in the case of a failure.
– Set up detailed and clear monitoring, such as performance metrics and error rates.
Scalability

• Scalability: As the system grows (in data volume, traffic volume, or complexity), there should be reasonable ways of dealing with that growth.
• Scalability is the term we use to describe a system’s ability to cope with increased load.
• How do we maintain good performance even when our load parameters increase by some amount?
Scalability

• Describing the performance of a system:
– In a batch processing system such as Hadoop, we
usually care about throughput—the number of
records we can process per second, or the total time
it takes to run a job on a dataset of a certain size.
– In online systems, what’s usually more important is
the service’s response time—that is, the time
between a client sending a request and receiving a
response.
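
To make the two metrics concrete, the sketch below measures both for a batch of requests handled in-process. The names measure and handler are hypothetical, and the timings cover only the handler call; a real client-side response time would also include network and queueing delays.

```python
import time


def measure(handler, requests):
    """Return (throughput in requests per second, list of response times in seconds)."""
    response_times = []
    start = time.perf_counter()
    for request in requests:
        t0 = time.perf_counter()
        handler(request)
        response_times.append(time.perf_counter() - t0)
    elapsed = time.perf_counter() - start
    throughput = len(requests) / elapsed
    return throughput, response_times


def handler(request):
    """Hypothetical request handler that does a little CPU work."""
    return sum(range(1000)) + request


throughput, response_times = measure(handler, list(range(200)))
slowest = max(response_times)
```

Throughput summarizes the whole run in one number, while the response times form a distribution, which is why online systems tend to look at individual requests rather than only the total.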
Scalability

• Scale up (vertical scaling)

– moving to a more powerful machine

• Scale out (horizontal scaling)

– distributing the load across multiple smaller
machines
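
A minimal sketch of scaling out is to route each request key to one of several machines by hashing it. The function pick_node and the node names are hypothetical. Note that this naive modulo scheme reassigns most keys whenever the number of nodes changes, so it is shown only to illustrate the idea of spreading load.

```python
import hashlib


def pick_node(key: str, nodes: list[str]) -> str:
    """Map a key to one node using a stable hash of the key."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(nodes)
    return nodes[index]


nodes = ["node-a", "node-b", "node-c"]  # hypothetical machine names
assignments = {key: pick_node(key, nodes) for key in ("alice", "bob", "carol")}
```

Because the hash is deterministic, the same key always lands on the same node, which matters once the nodes hold state.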
Scalability

• Some systems are elastic, meaning that they can automatically add computing resources when they detect a load increase.
• Other systems are scaled manually.
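
An elastic system needs some rule for deciding how many machines to run. The sketch below shows one simple rule, assuming load is measured as requests per second per node. The function desired_nodes and all of its parameters are hypothetical; they do not describe any particular platform's autoscaler.

```python
import math


def desired_nodes(current_nodes: int, load_per_node: float,
                  target_load_per_node: float, max_nodes: int) -> int:
    """Return how many nodes are needed to bring each node back to the target load."""
    total_load = current_nodes * load_per_node
    needed = math.ceil(total_load / target_load_per_node)
    # Never drop below one node or exceed the configured ceiling.
    return max(1, min(max_nodes, needed))


scale_out = desired_nodes(4, 150.0, 100.0, 10)  # overloaded: add nodes
scale_in = desired_nodes(4, 10.0, 100.0, 10)    # underused: remove nodes
```

A manually scaled system applies the same kind of reasoning, except that a human looks at the load and changes the node count.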
Scalability

• Distributing stateless services across multiple machines is fairly straightforward.
• Taking stateful data systems from a single node to a distributed setup can introduce a lot of additional complexity.
– Keep your database on a single node (scale up) until scaling cost or high-availability requirements force you to make it distributed.
Scalability

• The problem may be:
– the volume of reads
– the volume of writes
– the volume of data to store
– the complexity of the data
– the response time requirements
– the access patterns
– or (usually) some mixture of all of these plus many more issues.
Maintainability

• It is well known that the majority of the cost of software is not in its initial development, but in its ongoing maintenance:
– fixing bugs
– keeping its systems operational
– investigating failures
– adapting it to new platforms
– modifying it for new use cases
– repaying technical debt
– adding new features.
Maintainability

• Three design principles for software systems to minimize pain during maintenance:
– Operability
– Simplicity
– Evolvability
Operability

Operability: Make it easy for operations teams to keep the system running smoothly.
• Data systems can do various things to make routine tasks easy, including:
– Providing visibility into the runtime behavior and
internals of the system, with good monitoring
– Providing good support for automation and
integration with standard tools
– Avoiding dependency on individual machines
Operability

• Data systems can do various things to make routine tasks easy, including (continued):
– Providing good documentation and an easy-to-
understand operational model
– Providing good default behavior, but also giving
administrators the freedom to override defaults
when needed
– Self-healing where appropriate, but also giving
administrators manual control over the system state
when needed
– Exhibiting predictable behavior, minimizing surprises
Simplicity

• Make it easy for new engineers to understand the system, by removing as much complexity as possible from the system.
• In complex software, there is also a greater risk
of introducing bugs when making a change:
– when the system is harder for developers to
understand and reason about, hidden assumptions,
unintended consequences, and unexpected
interactions are more easily overlooked
Simplicity

• Making a system simpler does not necessarily mean reducing its functionality; it can also mean removing accidental complexity.
• Complexity is accidental if it is not inherent in
the problem that the software solves (as seen by
the users) but arises only from the
implementation.
• One of the best tools we have for removing
accidental complexity is abstraction.
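
As an illustration of abstraction hiding implementation detail, the sketch below puts storage behind a small interface so that application code does not depend on how data is kept. KeyValueStore, InMemoryStore, and remember_visit are hypothetical names chosen for this example.

```python
from typing import Optional, Protocol


class KeyValueStore(Protocol):
    """The only operations the application code is allowed to rely on."""

    def put(self, key: str, value: str) -> None: ...

    def get(self, key: str) -> Optional[str]: ...


class InMemoryStore:
    """One possible implementation; a disk-backed or remote store could replace it."""

    def __init__(self) -> None:
        self._data: dict[str, str] = {}

    def put(self, key: str, value: str) -> None:
        self._data[key] = value

    def get(self, key: str) -> Optional[str]:
        return self._data.get(key)


def remember_visit(store: KeyValueStore, user: str) -> int:
    """Application logic written only against the abstraction."""
    count = int(store.get(user) or "0") + 1
    store.put(user, str(count))
    return count


store = InMemoryStore()
first = remember_visit(store, "alice")
second = remember_visit(store, "alice")
```

The function remember_visit would work unchanged with any other store that offers the same two methods, so the storage details stay out of the application logic.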
Evolvability

• Make it easy for engineers to make changes to the system in the future, adapting it for unanticipated use cases as requirements change.
• Also known as extensibility, modifiability, or plasticity.
• The Agile community has also developed
technical tools and patterns that are helpful
when developing software in a frequently
changing environment, such as test-driven
development (TDD) and refactoring.
