DC Unit 1
Distributed Computing [417531]
Dr. R. V. Babar
Head of Dept.
Dept. of Artificial Intelligence and Data Science
SRTTC-FOE, Kamshet
[email protected]
Cell. +91 7588048922/9423558020
Suman Ramesh Tulsiani Charitable Trust’s
UNIT No.: 1
Introduction to Distributed Computing
▪ Mission:
▪ To impart knowledge- and skill-based education in collaboration with industry, academia and
research organizations.
▪ To prepare competent engineers with a spirit of entrepreneurship by conducting various
technical events and signing MoUs.
▪ To prepare engineers to respond to the current and future needs of industry, higher studies and
research.
▪ To instill a sense of social responsibility and leadership in our graduates,
enabling them to make positive contributions to society at large.
▪ Vision:
▪ The Department of Artificial Intelligence and Data Science is dedicated to persistently improving its
educational environment in order to develop rural youth with strong academic and
technical backgrounds.
▪ Mission:
▪ To encourage students to become dynamic, problem-solving individuals who can find
and understand the knowledge needed to succeed in the profession.
▪ To enrich the Industry Institute Interaction program so that students become accustomed to corporate culture.
▪ To prepare students to work with pioneering technology, meet IT industry
needs and contribute to the progress of the nation.
Distributed computing refers to the use of multiple interconnected computers or processors that work together to solve a
complex problem or perform a task.
1. Concurrency
2. Fault Tolerance
3. Scalability
4. Interprocess Communication
5. Transparency
6. Heterogeneity
7. Consistency and Replication
8. Load Balancing
9. Security
Issues of Distributed Computing
● Heterogeneity
● Scalability
● Openness
● Transparency
● Concurrency
● Security
● Failure Handling
Types of Distributed System
1. Distributed Computing System
➔ Cluster Computing
➔ Grid Computing
Distributed System Models:
● Distributed transaction processing: transactions run across different servers and may use multiple communication models. A
transaction has four characteristic properties (ACID):
● Atomic: to the outside world the transaction is indivisible; it either completes entirely or has no effect.
● Consistent: the transaction leaves the system in a valid state and does not violate its invariants.
● Isolated: concurrent transactions do not interfere with one another.
● Durable: once a transaction commits, its changes are permanent.
Transactions are often constructed as several sub-transactions, jointly forming a nested transaction.
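The atomic property across servers can be illustrated with a toy two-phase commit. The slides do not name a commit protocol, so two-phase commit is an assumption chosen for illustration, and the Participant class, the transfer function and the server names are hypothetical in-memory stand-ins for real servers. A minimal sketch:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Participant:
    """Hypothetical in-memory stand-in for one server holding an account."""
    name: str
    balance: int
    pending: Optional[int] = None  # change staged during the prepare phase

    def prepare(self, delta: int) -> bool:
        # Phase 1: vote "yes" only if the change keeps the balance non-negative.
        if self.balance + delta < 0:
            return False
        self.pending = delta
        return True

    def commit(self) -> None:
        # Phase 2 (all voted yes): apply the staged change.
        if self.pending is not None:
            self.balance += self.pending
            self.pending = None

    def abort(self) -> None:
        # Phase 2 (someone voted no): discard the staged change.
        self.pending = None


def transfer(src: Participant, dst: Participant, amount: int) -> bool:
    """Move `amount` from src to dst: both sides change, or neither does."""
    votes = [src.prepare(-amount), dst.prepare(amount)]
    if all(votes):
        src.commit()
        dst.commit()
        return True
    src.abort()
    dst.abort()
    return False


a = Participant("server-a", balance=100)
b = Participant("server-b", balance=20)
ok = transfer(a, b, 70)      # both participants vote yes, so it commits
failed = transfer(a, b, 50)  # server-a would go negative, so it aborts
print(ok, failed, a.balance, b.balance)
```

The aborted transfer leaves both balances untouched, which is the all-or-nothing behaviour the Atomic property describes. A real system would also need durable logging and timeout handling, which this sketch omits.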
Introduction to AI & Data Science in DC
Artificial intelligence (AI) and data science are two quickly developing fields that use sophisticated computational methods to
mine data for insight. Combining them with distributed computing opens up possibilities across a range of domains, from
improving industrial efficiency to tackling challenging scientific problems.
The following are some essential ideas and methods for allocating computing tasks:
1. Workload Division
2. Communication and Synchronization
3. Scheduling and Load Balancing
4. Orchestration and Fault Tolerance
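The first three ideas above can be sketched on a single machine. This is only an illustration under stated assumptions: threads stand in for separate computers, and the helper names split, partial_sum and parallel_sum_of_squares are hypothetical. Orchestration and fault tolerance are not covered.

```python
from concurrent.futures import ThreadPoolExecutor


def split(data, n_chunks):
    """Workload division: cut data into n_chunks nearly equal slices."""
    size, extra = divmod(len(data), n_chunks)
    chunks, start = [], 0
    for i in range(n_chunks):
        end = start + size + (1 if i < extra else 0)
        chunks.append(data[start:end])
        start = end
    return chunks


def partial_sum(chunk):
    """Work done independently by one worker."""
    return sum(x * x for x in chunk)


def parallel_sum_of_squares(data, n_workers=4):
    chunks = split(data, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        # Scheduling: the pool hands each chunk to an idle worker.
        # Synchronization: list() waits until every partial result is back.
        partials = list(pool.map(partial_sum, chunks))
    # Communication: only the small partial results are combined here.
    return sum(partials)


total = parallel_sum_of_squares(list(range(1, 11)))
print(total)
```

In a real distributed setting the chunks would be sent to other machines over the network, for example through a message-passing library, rather than to threads in one process.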
Data Storage and Access:
● Batch processing: analyzing large datasets offline with frameworks such as Hadoop MapReduce or Apache Spark.
● Stream processing: analyzing data streams in real time with tools such as Apache Kafka or Apache Flink.
● In-memory computing: storing and processing data in RAM, which allows faster analysis at the cost of higher
resource use.
● Distributed analytics systems: platforms such as Spark and Google BigQuery that provide scalable analysis of large
datasets.
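The batch-processing style used by MapReduce can be illustrated without Hadoop or Spark. The sketch below is a single-process imitation of the map, shuffle and reduce phases for a word count; the function names and sample lines are hypothetical, and it does not use any real Hadoop or Spark API.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Reduce: combine the values of each key into one result."""
    return {key: sum(values) for key, values in groups.items()}


lines = ["the quick brown fox", "the lazy dog", "The fox"]

# In a real cluster each line (or file block) would be mapped on a different node.
mapped = [pair for line in lines for pair in map_phase(line)]
counts = reduce_phase(shuffle(mapped))
print(counts["the"], counts["fox"], counts["dog"])
```

Because each call to map_phase depends only on its own line, the map step can be spread across many machines, which is what makes this pattern suitable for large offline datasets.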
Data Management and Quality:
● Data integration: Creating a single, cohesive perspective by combining data from several sources.
● Data cleaning: Fixing mistakes and discrepancies in the data.
● Data governance: Creating guidelines and protocols to guarantee privacy, security, and correctness in data management.
● Data compression: reducing the volume of stored data without losing information.
● Apache Hadoop: an open-source framework for distributed data storage and processing.
● Apache Spark: a unified analytics engine for batch and stream processing.
● Apache Kafka: a distributed streaming platform for real-time data processing.
● Google BigQuery: a cloud-based data warehouse designed for large-scale data analysis.
● Amazon Redshift: a cloud-based data warehouse for analytics.
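The data compression item above refers to lossless compression: the stored form is smaller, yet the original can be restored exactly. A minimal sketch using Python's standard zlib module follows; the sample sensor records are made up for illustration.

```python
import zlib

# Hypothetical, highly repetitive sensor log (repetition compresses well).
records = b"sensor_id,temperature\n" + b"42,21.5\n" * 1000

compressed = zlib.compress(records, level=9)  # 9 = strongest, slowest setting
restored = zlib.decompress(compressed)

print(len(records), len(compressed), restored == records)
```

The round trip returns the original bytes unchanged, so no information is sacrificed. How much space is saved depends on how repetitive the data is.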
● Understanding Parallel Processing:
● Multiple processors or cores: most modern computers have several processing cores that can handle multiple tasks
at once.
● Distributed systems: Even more parallelization is possible when the processing capacity of several computers is
combined over a network.
● Algorithms and code optimization: Parallelization is a natural fit for some algorithms but not for others. For the best use
of processor cores, proper code optimization is essential.
● Faster execution: workloads are split up and processed concurrently, which can substantially reduce completion time.
● Scalability: performance can be improved further by adding more processing power (cores or machines).
● Real-time capabilities: for time-critical applications, parallelization allows real-time analysis and response.
● Resource optimization: Tasks are divided among several cores or computers to make efficient use of the resources that
are available.
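How much faster a parallelized workload runs depends on how much of it can actually run in parallel. Amdahl's law, which is not named in the slides and is brought in here as a standard estimate, gives the speedup as 1 / ((1 - p) + p / n) for a parallelizable fraction p and n workers. A small sketch, with a hypothetical function name:

```python
def amdahl_speedup(parallel_fraction, n_workers):
    """Estimated speedup under Amdahl's law (ignores communication overhead)."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if n_workers < 1:
        raise ValueError("n_workers must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / n_workers)


# A workload that is 90% parallelizable, on growing numbers of workers.
for n in (1, 10, 100, 1000):
    print(n, round(amdahl_speedup(0.9, n), 2))
```

With 10% of the work strictly sequential, the speedup stays below 10 however many workers are added, which is why adding cores or machines improves performance only up to a point.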
Strategies for Leveraging Parallel Processing:
● Determining which tasks can be parallelized: some tasks cannot be parallelized because of dependencies or a need
for sequential execution. Examine your workflow to find suitable candidates.
● Selecting the appropriate libraries and tools: parallel programming and task distribution features are provided by
frameworks such as CUDA (Compute Unified Device Architecture), MPI, and OpenMP. Your specific needs and the design of
your system will determine which tool is best for you.
● Tuning performance and optimization: You can overcome potential bottlenecks and greatly increase parallelization
efficiency by fine-tuning your code and algorithms.
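One way to find which tasks can run in parallel is to write down their dependencies and group the tasks into stages. The sketch below uses graphlib.TopologicalSorter from the Python standard library (Python 3.9 or later); the pipeline, its task names and the parallel_stages helper are hypothetical.

```python
from graphlib import TopologicalSorter

# Hypothetical pipeline: each task maps to the set of tasks it depends on.
dependencies = {
    "load_a": set(),
    "load_b": set(),
    "clean_a": {"load_a"},
    "clean_b": {"load_b"},
    "join": {"clean_a", "clean_b"},
    "report": {"join"},
}


def parallel_stages(deps):
    """Group tasks into stages; tasks within one stage have no mutual dependencies."""
    sorter = TopologicalSorter(deps)
    sorter.prepare()
    stages = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # every task whose inputs are done
        stages.append(ready)
        sorter.done(*ready)                 # mark the whole stage as finished
    return stages


stages = parallel_stages(dependencies)
for number, stage in enumerate(stages, start=1):
    print(number, stage)
```

Stages that contain several tasks are candidates for parallel execution, while single-task stages such as join and report are sequential bottlenecks.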
Applications of Integrating AI & DS in Distributed Systems
Predictive maintenance (PdM) is among the most advanced approaches to managing maintenance in process plants.
It differs from other types of maintenance in many ways. Let’s start by looking at
the different types of maintenance, such as:
SAMGUARD Tool
Fraud Detection
Fraud detection is another area where combining data science and artificial intelligence has considerable potential in
distributed systems. Some particular use cases and applications:
Intelligent transportation systems (ITS) are another illustration of how AI and data science are applied in distributed
systems. By combining several technologies, they seek to increase the sustainability, safety, and efficiency of transportation
networks:
Supply chain optimization is another field where data science and artificial intelligence are applied to
distributed systems. A few particular application domains:
Energy management is another area where AI and data science combine in distributed systems to produce significant
improvements. A few crucial areas in which they are applied:
In healthcare, AI and data science are transforming methods for diagnosing diseases and providing
medical care. A few crucial areas where they are having a large influence:
1. Abstract
2. Introduction
3. Problem Statement
4. Solution
5. Implementation
6. Results
7. Conclusion