Cloud Lab
Cloud computing refers to the delivery of computing services like storage, processing power,
databases, software, and networking, through the internet rather than relying on local servers
or personal devices. These services are provided by cloud service providers such as Amazon
Web Services (AWS), Microsoft Azure, and Google Cloud.
Cloud computing offers various service models, each serving different needs:
Cloud computing is categorized into different deployment models based on who manages the
infrastructure and how the resources are made available to the users. These models are:
1. Public Cloud:
Overview: Public clouds are owned and operated by third-party providers, and the
infrastructure is shared with multiple organizations.
Examples: Amazon Web Services (AWS), Microsoft Azure, Google Cloud.
Advantages: Cost-effective, highly scalable, and the provider handles infrastructure maintenance.
Challenges: Less control over infrastructure, potential security concerns.
2. Private Cloud:
3. Hybrid Cloud:
Overview: A hybrid cloud combines both public and private clouds, enabling data and
applications to be shared between them.
Advantages: Greater flexibility, scalability, and better control over critical data.
Challenges: Complexity in managing multiple cloud environments, higher costs.
4. Community Cloud:
These models define the type of services that cloud providers offer:
1. Infrastructure as a Service (IaaS):
Overview: IaaS offers virtualized computing resources such as virtual machines, storage,
and networks, all accessible via the internet.
Examples: Amazon EC2 (AWS), Google Compute Engine, Azure Virtual Machines.
Advantages: Flexible, scalable, and does not require users to manage hardware.
Challenges: Requires users to manage operating systems and software on top of the
infrastructure.
2. Platform as a Service (PaaS):
Overview: PaaS provides a platform for developers to build, deploy, and manage
applications without the need to manage the underlying infrastructure.
Examples: Google App Engine, Microsoft Azure App Service.
Advantages: Speeds up application development and scales easily as needed.
Challenges: Limited control over the infrastructure, potential vendor lock-in.
Conclusion:
In this lab, we explored the key components of cloud computing, including its various service
models like IaaS, PaaS, SaaS, and FaaS, as well as the deployment models such as public,
private, hybrid, and community clouds. By understanding these components, we can make
informed decisions on the appropriate cloud solutions that best meet the needs of individuals
and organizations.
Lab 02: Cloud Service Providers
Objective:
The objective of this lab is to understand the various cloud service providers in the industry
today, specifically Amazon Web Services (AWS), Microsoft Azure, and Google Cloud Platform
(GCP). We will explore their key features, advantages, and disadvantages to gain insights into
which provider may best suit different business needs.
Introduction:
Cloud computing providers offer a wide range of services, from computing power to data
storage, networking, and specialized services like artificial intelligence and machine learning.
The major cloud providers — AWS, Microsoft Azure, and Google Cloud Platform (GCP) — each
offer a set of solutions that vary in capabilities, pricing, and integration options. Understanding
these services will help businesses choose the right provider based on their unique
requirements.
1. Amazon Web Services (AWS):
Amazon Web Services (AWS) is a leading and highly popular cloud platform that provides a
comprehensive suite of cloud-based services, covering everything from computing power to
storage, databases, analytics, networking, and more.
Key Features:
Compute Services: Elastic Compute Cloud (EC2) for scalable computing power.
Storage Services: Elastic Block Store (EBS), Simple Storage Service (S3), etc.
Databases: Relational Database Service (RDS), DynamoDB, etc.
Networking: Virtual Private Cloud (VPC), Route 53, etc.
Machine Learning & AI: SageMaker, Comprehend, Rekognition, etc.
Security: Identity and Access Management (IAM), AWS Shield, etc.
Pros:
Cons:
The large number of services can be overwhelming, making it complex to manage for
new users.
Pricing can be hard to estimate without proper planning, especially when scaling up.
2. Microsoft Azure:
Microsoft Azure is a cloud platform from Microsoft that offers a robust set of cloud services,
ranging from computing to analytics and networking. It is particularly known for its integration
with Microsoft products like Windows Server, SQL Server, and Active Directory.
Key Features:
Compute Services: Azure Virtual Machines (VM), Azure Kubernetes Service (AKS), etc.
Storage Services: Azure Blob Storage, Disk Storage, etc.
Databases: Azure SQL Database, Cosmos DB, etc.
Networking: Azure Virtual Network, Load Balancer, etc.
Machine Learning & AI: Azure AI, Azure Cognitive Services, etc.
Security: Azure Active Directory (now Microsoft Entra ID), Azure Security Center (now Microsoft Defender for Cloud), etc.
Pros:
Strong integration with Microsoft products, ideal for businesses using Windows-based
environments.
Excellent hybrid cloud capabilities, enabling seamless integration between on-premises
and cloud systems.
Enterprise-grade security and compliance, making it suitable for industries with strict
data regulations.
Cons:
New users might find the platform’s interface and features complex.
Some services may have a steeper learning curve compared to AWS and GCP.
3. Google Cloud Platform (GCP):
Google Cloud Platform (GCP) offers a powerful suite of cloud services focused on computing,
data storage, and big data analytics. GCP excels in machine learning, artificial intelligence, and
open-source technologies.
Key Features:
Networking: Google Virtual Private Cloud, Cloud Load Balancing, etc.
Machine Learning & AI: TensorFlow, BigQuery, etc.
Security: Identity and Access Management (IAM), Google Cloud Security Command
Center, etc.
Pros:
Cutting-edge tools for machine learning and data analytics, especially in big data
scenarios.
Strong support for open-source technologies such as Kubernetes and TensorFlow.
Global infrastructure with advanced networking capabilities.
Cons:
Conclusion:
In this lab, we explored the three leading cloud service providers: AWS, Microsoft Azure, and
Google Cloud Platform (GCP). Each provider offers a distinct set of features and benefits. AWS is
widely recognized for its vast service offerings, scalability, and flexibility. Azure is ideal for
organizations already using Microsoft products, with strong hybrid cloud capabilities. GCP is
well-suited for businesses that prioritize machine learning and data analytics, thanks to its
powerful tools and open-source support.
Choosing the right cloud provider depends on several factors, including the types of services
required, existing infrastructure, and budget considerations. By understanding the strengths
and weaknesses of each platform, businesses can make more informed decisions on which
cloud provider will best meet their needs.
Lab 03: Regions and Zones in Cloud Computing
Objective:
The purpose of this lab is to explore how cloud providers use regions and availability zones to
manage and distribute resources effectively, ensuring optimal performance, availability, and
fault tolerance for applications.
Region:
A region is a geographically defined area where a cloud service provider operates multiple data
centers. The region determines where your applications and data are stored and processed.
Cloud providers typically offer multiple regions to allow businesses to deploy resources closer
to their users for better performance and to comply with data sovereignty regulations.
Availability Zone:
An availability zone (AZ) is a specific data center or a cluster of data centers within a region.
These zones are designed to be isolated from failures that might occur in other zones, yet they
are interconnected through low-latency, high-throughput networks. AZs enable high availability
and fault tolerance because even if one zone experiences an issue, other zones can handle the
load.
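The fault-tolerance idea above can be sketched in a few lines of Python. This is only an illustration, not a provider API: the region name, the zone names (written in the AWS style of region plus a letter), and the `route` helper are assumptions made for the example.

```python
# Hypothetical region layout: zone name -> is the zone currently healthy?
REGION = "us-east-1"
ZONES = {"us-east-1a": True, "us-east-1b": True, "us-east-1c": True}

def healthy_zones(zones):
    """Return the names of the zones that are still healthy, in sorted order."""
    return sorted(name for name, ok in zones.items() if ok)

def route(request_id, zones):
    """Spread requests round-robin across the healthy zones only."""
    candidates = healthy_zones(zones)
    if not candidates:
        raise RuntimeError("no healthy availability zone in the region")
    return candidates[request_id % len(candidates)]

# With all zones healthy, requests are spread across all three.
before = [route(i, ZONES) for i in range(3)]

# Simulate an outage in one zone: traffic shifts to the remaining zones.
ZONES["us-east-1b"] = False
after = [route(i, ZONES) for i in range(4)]

print(before)
print(after)
```

After the simulated outage, no request is sent to the failed zone, which is the behavior the availability-zone design is meant to provide.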
Cloud providers offer different regions where users can deploy resources. Below is how you can
create resources in different regions on AWS and Google Cloud.
AWS Example:
Google Cloud Example:
1. In the Google Cloud Console, navigate to Compute Engine > VM Instances.
2. Open the region selector and choose the desired region.
3. Configure the VM by selecting the machine type, boot disk, and other configuration options,
then create the instance.
In cloud computing, clusters and instances are two key concepts with distinct uses.
Cluster:
A cluster consists of a group of interconnected virtual machines (VMs) or nodes that work
together as a unified system to deliver high availability, fault tolerance, and scalability. Clusters
are typically used for large-scale applications that require continuous availability and horizontal
scaling, such as web services, big data processing, or distributed systems like Kubernetes and
Hadoop. They offer fault tolerance: if one node fails, other nodes can pick up the workload,
ensuring continuous operation. However, clusters cost more because they require multiple
instances, and the cost grows as the cluster expands.
Instance:
An instance, in contrast, is a single virtual server that runs applications, processes, and
workloads in the cloud. Instances are better suited for smaller, standalone applications that do
not require complex infrastructure or high availability. They are easier to manage since they
operate independently. Instances can be scaled both vertically (by adding more resources to
the existing instance) and horizontally (by adding additional instances). While instances can be
part of a larger cluster, they are often used for isolated tasks, such as hosting a website or
running a database.
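The difference between vertical and horizontal scaling can be illustrated with a small Python sketch. The `Instance` class and the helper functions are hypothetical names introduced for this example; they do not correspond to any cloud provider's API.

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class Instance:
    vcpus: int
    memory_gb: int

def scale_vertically(instance, factor):
    """Return a larger version of the same instance (more CPU and memory)."""
    return replace(instance,
                   vcpus=instance.vcpus * factor,
                   memory_gb=instance.memory_gb * factor)

def scale_horizontally(fleet, extra):
    """Return a fleet with `extra` additional copies of the first instance."""
    return fleet + [fleet[0]] * extra

def total_vcpus(fleet):
    """Total number of vCPUs across every instance in the fleet."""
    return sum(instance.vcpus for instance in fleet)

base = Instance(vcpus=2, memory_gb=8)
vertical = [scale_vertically(base, 2)]       # one bigger instance
horizontal = scale_horizontally([base], 1)   # two instances of the original size

print(total_vcpus(vertical), total_vcpus(horizontal))
```

Both approaches reach the same total capacity here, but only the horizontally scaled fleet still has a running instance if one of them fails, which is why clusters are preferred for high availability.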
o Choose a name for your instance.
o Select a region and availability zone based on where you want your instance to be
located.
o Choose the machine type (e.g., general-purpose, compute-optimized).
o Select the boot disk, such as Ubuntu or Windows, depending on your requirements.
o Enable firewall rules to allow HTTP/HTTPS traffic.
o Optionally configure other settings like networking and SSH keys for secure access.
Conclusion:
In this lab, we learned about the concepts of regions and availability zones in cloud computing.
We explored how cloud providers use these geographic areas to offer better resource
management, performance optimization, and fault tolerance for applications. We also
discussed the difference between clusters and instances. Clusters are ideal for large-scale, high-
availability applications, while instances are suitable for smaller, isolated tasks. Finally, we went
through the process of creating resources in different regions on AWS and Google Cloud.
By understanding regions, zones, and the differences between clusters and instances, users can
better design and deploy cloud-based applications to meet their business needs.
Lab 04: Understanding MapReduce in Collaborative Tools
Objective:
The objective of this lab is to gain a deeper understanding of the MapReduce programming
model, which is widely used for processing large datasets efficiently. In this lab, we will explore
how the MapReduce model breaks down tasks into smaller parts (Map phase) and then
combines the results (Reduce phase) to produce a final output. We will apply this model to a
real-world scenario by simulating data processing from a collaborative tool such as Google
Sheets.
Introduction:
MapReduce is a programming model for processing large-scale data across distributed
systems. Initially popularized by Google, it has become a standard model for processing big
data. The MapReduce model consists of two main phases:
Map Phase: The input data is split into smaller, manageable pieces, and a map function
turns each piece into key-value pairs. The pieces are processed in parallel across
different systems.
Reduce Phase: After the Map phase, the key-value pairs are grouped by key, and each
group is aggregated (e.g., by summing, counting, or averaging) to produce a
consolidated result.
Steps Performed:
In this step, we simulate the Map function by collecting raw meeting data. The data includes
details such as:
Task Name
Assigned Participant
Status of the task
Time Spent on the task
Each row can be expressed as key-value pairs (for example, the participant as the key and the
time spent as the value) that can be processed in parallel, enabling faster data handling.
The Reduce phase processes the mapped data by aggregating it into useful summaries. The
following operations were performed during this phase:
Counting the number of tasks assigned to each participant.
Summing the total time spent by each participant.
For Ram:
=COUNTIF(B2:B6, "Ram")
For Hari:
=COUNTIF(B2:B6, "Hari")
For Shyam:
=COUNTIF(B2:B6, "Shyam")
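The same Map and Reduce steps can be sketched in Python. This is a minimal illustration under stated assumptions: the five rows below are hypothetical sample data standing in for the sheet range B2:B6, and `map_row` and `reduce_pairs` are names chosen for this example.

```python
from collections import defaultdict

# Hypothetical rows: (task name, assigned participant, status, hours spent)
rows = [
    ("Draft agenda", "Ram", "Done", 2.0),
    ("Book room", "Hari", "Done", 0.5),
    ("Prepare slides", "Ram", "In progress", 3.0),
    ("Send invites", "Shyam", "Done", 1.0),
    ("Take minutes", "Hari", "Pending", 1.5),
]

def map_row(row):
    """Map step: emit (participant, (task count, hours)) for one row."""
    _task, participant, _status, hours = row
    return participant, (1, hours)

def reduce_pairs(pairs):
    """Reduce step: total the task counts and hours for each participant."""
    totals = defaultdict(lambda: [0, 0.0])
    for participant, (count, hours) in pairs:
        totals[participant][0] += count
        totals[participant][1] += hours
    return {name: tuple(values) for name, values in totals.items()}

summary = reduce_pairs(map(map_row, rows))
for name, (tasks, hours) in summary.items():
    print(f"{name}: {tasks} task(s), {hours} hour(s)")
```

The task count per participant plays the role of the COUNTIF formulas above, and the hour totals correspond to summing the time spent by each participant.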
Through this lab, we saw how the MapReduce framework can simplify and optimize the process
of handling and summarizing large datasets, making it a powerful tool for data-driven decision-
making.
Conclusion:
In this lab, we learned how to apply the MapReduce programming model to process and
analyze data collected from a collaborative tool like Google Sheets. The Map phase helped us
break down the raw data into smaller chunks for efficient processing, while the Reduce phase
allowed us to aggregate that data into meaningful summaries. This method of data processing
can be applied to other real-world scenarios where large datasets need to be handled
and analyzed efficiently.
Lab 05: VMware Installation and Virtual Machine Creation
Objective:
The goal of this lab is to guide you through the installation process of VMware Workstation or
VMware Player on your system and demonstrate how to create a Virtual Machine (VM). This
will enable you to run multiple operating systems on one physical machine, which is an
essential skill for virtualization and creating software development environments.
Introduction:
VMware is a widely used software suite for creating and managing virtual machines.
Virtualization allows users to run several operating systems (OS) on a single physical computer,
providing flexibility and more efficient use of hardware. VMware Workstation and Player are
hosted (Type 2) hypervisors: they run on top of the host operating system and allocate CPU,
memory, and disk to each virtual machine, so every guest runs isolated from the others and
from the host.
6. Configure Virtual Machine Hardware:
VMware will display the default hardware configuration, such as memory (RAM),
processors (CPU), and hard disk size.
You can click on Customize Hardware to adjust settings (e.g., allocate more RAM, CPU
cores, etc.).
A common starting point is at least 2 GB of RAM and 20 GB of disk space for the virtual
machine; adjust these values to the requirements of the guest operating system.
Conclusion:
In this lab, you have successfully installed VMware Workstation/Player, created a Virtual
Machine, and installed an operating system on it. VMware allows you to run multiple operating
systems on a single physical machine, making it an excellent tool for developers, testers, and IT
professionals. You now have a virtual environment where you can experiment with different
OSes without affecting your host system.
Lab 06: Collaboration Tools – Google Meet, Zoom, and Microsoft Teams
Objective:
The objective of this lab is to explore the use of collaboration tools such as Google Meet, Zoom,
and Microsoft Teams. These tools are essential for remote communication and collaboration,
enabling teams to conduct virtual meetings, share files, and collaborate in real time. In this lab,
we will learn how to set up meetings, manage participants, share content, and use other
features to enhance collaboration.
Introduction:
In today's remote work environment, collaboration tools like Google Meet, Zoom, and
Microsoft Teams have become essential for communication within teams. These tools support
virtual meetings, video conferencing, screen sharing, file sharing, and messaging, which are
crucial for remote work. This lab will guide you through setting up and using these tools,
helping you understand their features and how they support efficient collaboration.
3. Invite Participants:
o Once the meeting starts, click Copy joining info or Add people to share
the meeting link or invite participants via email.
Set Up Zoom:
1. Sign in to Zoom:
o Download and install the Zoom app from the Zoom website
(https://zoom.us/download) or sign in directly through the Zoom web platform.
o Sign in with your Zoom account or create one if you don’t already have one.
3. Invite Participants:
o Once the meeting is created, you will see the meeting ID and link. Copy the
invitation link and share it with the participants.
Set Up Microsoft Teams
4. Invite Participants:
o After setting up the meeting, click Send to send out the meeting invite.
o Participants can join via the invitation link or directly through Teams if they are
already members of the Team.
Conclusion:
In this lab, we explored three popular collaboration tools: Google Meet, Zoom, and Microsoft
Teams. Each tool has its unique strengths, but all provide essential features for remote
communication, meetings, and collaboration. By creating meetings, managing participants, and
utilizing key features like screen sharing and chat, we have learned how to efficiently use these
tools for professional and personal virtual communication.
Lab 07: CSV Files in Google Colab
Objective:
The objective of this lab is to learn how to upload a CSV file to a Google Colab environment,
read it into a pandas DataFrame, and display the first few rows of the dataset. This is an
essential skill for data analysis as CSV files are one of the most common formats for storing and
sharing data.
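The three steps in the objective (upload, read, display) can be sketched as follows. This is a minimal sketch, assuming Colab's `files.upload()` helper and the pandas library; the fallback CSV content and its column names are hypothetical and exist only so the example also runs outside Colab.

```python
import io

import pandas as pd

try:
    # Only importable inside Google Colab.
    from google.colab import files
    uploaded = files.upload()  # opens a file picker; returns {file name: bytes}
except ImportError:
    # Outside Colab: use a small in-memory CSV instead (hypothetical data).
    uploaded = {"sample.csv": b"name,score\nAsha,91\nBikash,78\nChandra,85\n"}

# Take the first uploaded file and read its bytes into a DataFrame.
file_name = next(iter(uploaded))
df = pd.read_csv(io.BytesIO(uploaded[file_name]))

# Display the first few rows of the dataset.
print(df.head())
```

`head()` shows the first five rows by default; passing a number, such as `df.head(10)`, changes how many rows are displayed.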
Conclusion:
In this lab, we learned how to upload a CSV file to Google Colab, read the file into a pandas
DataFrame, and display the first few rows. This is a fundamental skill for any data analysis task,
as CSV files are commonly used for storing and sharing datasets. By using pandas, we can
manipulate, clean, and analyze the data efficiently.
Lab 08: Cloud Storage and Collaboration Tools - Google Drive, OneDrive, Dropbox, Slack
Objective:
The objective of this lab is to explore and understand the usage of various cloud storage and
collaboration tools, including Google Drive, OneDrive, Dropbox, and Slack. These tools are
widely used for storing, sharing, and collaborating on files across devices and teams. In this lab,
we will learn how to use these tools effectively for both personal and professional purposes.
Introduction:
Cloud storage and collaboration tools have become essential in today's digital world, enabling
seamless file access, sharing, and real-time collaboration. Google Drive, OneDrive, and Dropbox
are popular cloud storage services that allow users to store files online, while Slack is a
collaboration tool designed for team communication, project management, and file sharing.
This lab focuses on:
Understanding the features and benefits of cloud storage services: Google Drive,
OneDrive, and Dropbox.
Exploring the use of Slack for team collaboration, messaging, and integration with cloud
storage services.
Learning how to use these tools to store, share, and collaborate on files.
1. Google Drive
4. Collaborating on Files:
o For files like Google Docs, Sheets, and Slides, you can collaborate in real time
with others.
o Multiple users can edit and comment on the same document at the same time.
2. OneDrive
1. Access OneDrive:
o Go to OneDrive in your web browser.
o Sign in with your Microsoft account.
4. Collaborating on Files:
o For files such as Microsoft Office documents (Word, Excel, PowerPoint), multiple
users can edit them simultaneously.
o You can use OneDrive to sync files across devices, ensuring you have access to
them wherever you go.
3. Dropbox
1. Access Dropbox:
o Go to Dropbox in your web browser.
o Sign in with your Dropbox account.
4. Collaborating on Files:
o You can invite others to a shared folder where everyone can contribute, access, and
modify files.
o Dropbox supports real-time collaboration through integrations with Microsoft Office and
Google Workspace files.
4. Slack for Collaboration
1. Access Slack:
o Go to Slack and create an account or log in to your workspace.
o Download the desktop app or use the web version.
2. Create a Workspace:
o Create or join a Slack workspace for team collaboration.
o Set up channels for various topics, projects, or departments (e.g., #general, #marketing,
#development).
5. Collaborating in Slack:
o Use @mentions to notify team members.
o You can also create to-do lists, share documents, and work on tasks in real time.
o Slack supports integration with other tools (e.g., Trello, Jira) for enhanced project
management.
Conclusion:
In this lab, we explored the use of popular cloud storage and collaboration tools: Google Drive,
OneDrive, Dropbox, and Slack. These tools are integral in modern workplaces for storing,
sharing, and collaborating on files in real time. We learned how to upload, organize, and share
files, and how to collaborate using the features of these tools. Additionally, we explored Slack for team
communication, task management, and integrating cloud storage services for efficient file
access and collaboration.
Lab 09: Implementing MapReduce for Word Count in Google Colab
Objective:
The objective of this lab is to implement the MapReduce algorithm for counting the
occurrences of specific words (e.g., "deer" and "beer") in a large unstructured text file (CSV
format) using Google Colab. The goal is to understand how the MapReduce framework works in
practice by using a local Python-based approach.
Introduction:
MapReduce is a programming model that allows for the parallel processing of large datasets
across distributed systems. It consists of two main phases: the map phase, where data is
processed and split into key-value pairs, and the reduce phase, where the results are
aggregated. In this lab, we count occurrences of the words "deer" and "beer" in a sample
text dataset uploaded as a CSV file. We simulate MapReduce on a single machine using
Python in Google Colab, mimicking the processing steps that would typically occur in a
large distributed cluster.
Materials Needed:
Pre-requisites:
First, we will upload the CSV file that contains the text data (for example, product reviews or
any other unstructured text). In Google Colab, use the following code to upload the file.
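A minimal sketch of the upload step, assuming Colab's `files.upload()` helper. The `except` branch is not part of the lab: it substitutes a hypothetical in-memory file so the snippet also runs outside Colab.

```python
try:
    # Only importable inside Google Colab.
    from google.colab import files
    uploaded = files.upload()  # returns a dict of {file name: file contents as bytes}
except ImportError:
    # Outside Colab: stand-in for an uploaded file (hypothetical name and content).
    uploaded = {"reviews.csv": b"id,text\n1,Deer Bear River\n2,Beer deer car\n"}

print(f"Received {len(uploaded)} file(s)")
```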
This will trigger the file upload interface, where you can select and upload your file. Once
uploaded, the file name will be automatically captured.
Step 2: Extract the File Name
After uploading the file, we dynamically capture the file name to use it in the subsequent steps:
file_name = list(uploaded.keys())[0]
print(f"Uploaded file name: {file_name}")
This step ensures that the file is correctly identified and used for processing.
Now we implement the MapReduce algorithm for counting occurrences of the words "deer"
and "beer". We'll use the mapper function to tokenize the text and emit key-value pairs, and
the reducer function will aggregate the counts.
import csv
from collections import defaultdict
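A minimal sketch of how the mapper and reducer described above might look. The function names, the regular-expression tokenizer, and the sample CSV text are assumptions made for illustration; in the lab, the text would come from the uploaded file rather than from an inline string.

```python
import csv
import io
import re
from collections import defaultdict

TARGET_WORDS = {"deer", "beer"}  # the words named in the lab objective

def mapper(row):
    """Map step: emit (word, 1) for every target word found in one CSV row."""
    for cell in row:
        for word in re.findall(r"[a-z]+", cell.lower()):
            if word in TARGET_WORDS:
                yield word, 1

def reducer(pairs):
    """Reduce step: sum the counts emitted for each word."""
    counts = defaultdict(int)
    for word, count in pairs:
        counts[word] += count
    return dict(counts)

def word_count(csv_text):
    """Run the map step over every row, then reduce all emitted pairs."""
    reader = csv.reader(io.StringIO(csv_text))
    mapped = (pair for row in reader for pair in mapper(row))
    return reducer(mapped)

# Hypothetical sample data standing in for the uploaded file's contents.
sample = "id,text\n1,Deer Bear River\n2,Beer deer car\n3,beer Beer beer\n"
result = word_count(sample)
print(result)
```

Matching is case-insensitive because each cell is lowercased before tokenizing. On a real cluster the mapper would run on many rows in parallel; here the rows are simply processed one after another.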
Conclusion:
In this lab, we simulated the MapReduce algorithm using Python and Google Colab to count the
occurrences of specific words ("deer" and "beer") in an uploaded CSV file. The MapReduce
approach allows for parallel processing and is efficient for handling large datasets, though in
this case, we implemented it locally in Google Colab for simplicity.