
Cloud Lab

The document provides an introduction to cloud computing, detailing its definition, key service models (IaaS, PaaS, SaaS), and deployment models (public, private, hybrid, community). It also explores major cloud service providers like AWS, Microsoft Azure, and Google Cloud Platform, highlighting their features, advantages, and disadvantages. Additionally, it discusses the concepts of regions, availability zones, clusters, and instances in cloud computing, as well as the Map Reduce programming model for processing large datasets.

Uploaded by

sana986781
Copyright
© All Rights Reserved

Lab 01: Introduction to Cloud Computing

Understanding Cloud Computing:

Cloud computing refers to the delivery of computing services like storage, processing power,
databases, software, and networking, through the internet rather than relying on local servers
or personal devices. These services are provided by cloud service providers such as Amazon
Web Services (AWS), Microsoft Azure, and Google Cloud.

Key Service Models in Cloud Computing:

Cloud computing offers various service models, each serving different needs:

1. Infrastructure as a Service (IaaS):


o This model provides virtualized computing resources, including virtual machines
and storage.
o It offers the flexibility of running applications on virtualized infrastructure
without managing physical servers.

2. Platform as a Service (PaaS):


o PaaS provides a platform for developers to create, run, and manage applications
without worrying about the underlying infrastructure.
o It allows developers to focus on coding while the service provider handles server
maintenance.

3. Software as a Service (SaaS):


o SaaS delivers fully functional software applications via the internet. These
applications are hosted and managed by cloud providers, and users only need a
browser to access them.
o Examples include web-based email services, productivity tools, and file-sharing
services.
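
The split of responsibilities described above can be sketched in a few lines of Python. This is a simplified illustration, not a provider's official definition: the five layer names and the cut-off points per model are assumptions chosen to match the descriptions in this lab.

```python
# Simplified stack layers, lowest to highest (illustrative assumption).
LAYERS = ["hardware", "virtualization", "operating system", "runtime", "application"]

# How many layers, counted from the bottom, the provider manages in each model.
PROVIDER_MANAGED = {"IaaS": 2, "PaaS": 4, "SaaS": 5}


def responsibilities(model: str) -> dict[str, list[str]]:
    """Split the stack into provider-managed and customer-managed layers."""
    cut = PROVIDER_MANAGED[model]
    return {"provider": LAYERS[:cut], "customer": LAYERS[cut:]}


for model in PROVIDER_MANAGED:
    split = responsibilities(model)
    print(model, "-> customer manages:", split["customer"])
```

Moving from IaaS to SaaS, the customer-managed list shrinks until nothing in the stack is left for the user to maintain.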

Types of Cloud Computing Based on Deployment Models:

Cloud computing is categorized into different deployment models based on who manages the
infrastructure and how the resources are made available to the users. These models are:

1. Public Cloud:

 Overview: Public clouds are owned and operated by third-party providers, and the
infrastructure is shared with multiple organizations.
 Examples: Amazon Web Services (AWS), Microsoft Azure, Google Cloud.
 Advantages: Cost-effective, highly scalable, no need for maintenance.

 Challenges: Less control over infrastructure, potential security concerns.

2. Private Cloud:

 Overview: A private cloud is used by a single organization, and it can either be
maintained on-premises or by a third-party provider.
 Advantages: More control, enhanced security, and privacy.
 Challenges: High cost, requires ongoing maintenance and management.

3. Hybrid Cloud:

 Overview: A hybrid cloud combines both public and private clouds, enabling data and
applications to be shared between them.
 Advantages: Greater flexibility, scalability, and better control over critical data.
 Challenges: Complexity in managing multiple cloud environments, higher costs.

4. Community Cloud:

 Overview: A community cloud is shared by multiple organizations that have common
concerns (e.g., security or compliance).
 Advantages: More cost-effective for shared resources, good for collaboration between
organizations.
 Challenges: Less flexibility compared to public and private clouds, security risks.

Cloud Computing Service Models:

The IaaS and PaaS models introduced above are compared here in more detail:

1. Infrastructure as a Service (IaaS):

 Overview: IaaS offers virtualized computing resources such as virtual machines, storage,
and networks, all accessible via the internet.
 Examples: Amazon EC2, Google Compute Engine, Azure Virtual Machines.
 Advantages: Flexible, scalable, and does not require users to manage hardware.
 Challenges: Users must still manage the operating systems and software running on top of
the infrastructure.

2. Platform as a Service (PaaS):

 Overview: PaaS provides a platform for developers to build, deploy, and manage
applications without the need to manage the underlying infrastructure.
 Examples: Google App Engine, Microsoft Azure App Service.
 Advantages: Speeds up application development and scales easily as needed.
 Challenges: Limited control over the infrastructure, potential vendor lock-in.
Conclusion:

In this lab, we explored the key components of cloud computing, including its service
models (IaaS, PaaS, and SaaS) and its deployment models (public, private, hybrid, and
community clouds). By understanding these components, we can make informed decisions
about which cloud solutions best meet the needs of individuals and organizations.

Lab 02: Cloud Service Providers

Objective:

The objective of this lab is to understand the various cloud service providers in the industry
today, specifically Amazon Web Services (AWS), Microsoft Azure, and Google Cloud Platform
(GCP). We will explore their key features, advantages, and disadvantages to gain insights into
which provider may best suit different business needs.

Introduction:

Cloud computing providers offer a wide range of services, from computing power to data
storage, networking, and specialized services like artificial intelligence and machine learning.
The major cloud providers — AWS, Microsoft Azure, and Google Cloud Platform (GCP) — each
offer a set of solutions that vary in capabilities, pricing, and integration options. Understanding
these services will help businesses choose the right provider based on their unique
requirements.

Types of Cloud Providers:

1. Amazon Web Services (AWS):

Amazon Web Services (AWS) is a leading and highly popular cloud platform that provides a
comprehensive suite of cloud-based services, covering everything from computing power to
storage, databases, analytics, networking, and more.

Key Features:

 Compute Services: Elastic Compute Cloud (EC2) for scalable computing power.
 Storage Services: Elastic Block Store (EBS), Simple Storage Service (S3), etc.
 Databases: Relational Database Service (RDS), DynamoDB, etc.
 Networking: Virtual Private Cloud (VPC), Route 53, etc.
 Machine Learning & AI: SageMaker, Comprehend, Rekognition, etc.
 Security: Identity and Access Management (IAM), AWS Shield, etc.

Pros:

 Highly scalable and flexible, supporting businesses of all sizes.


 A wide range of services catering to diverse business requirements.
 Pay-as-you-go pricing model, offering cost flexibility.
 Extensive global infrastructure, with data centers across multiple regions.

Cons:

 The large number of services can be overwhelming, making it complex to manage for
new users.
 Pricing can be hard to estimate without proper planning, especially when scaling up.

2. Microsoft Azure:

Microsoft Azure is a cloud platform from Microsoft that offers a robust set of cloud services,
ranging from computing to analytics and networking. It is particularly known for its integration
with Microsoft products like Windows Server, SQL Server, and Active Directory.

Key Features:

 Compute Services: Azure Virtual Machines (VM), Azure Kubernetes Service (AKS), etc.
 Storage Services: Azure Blob Storage, Disk Storage, etc.
 Databases: Azure SQL Database, Cosmos DB, etc.
 Networking: Azure Virtual Network, Load Balancer, etc.
 Machine Learning & AI: Azure AI, Azure Cognitive Services, etc.
 Security: Azure Active Directory, Azure Security Center, etc.

Pros:

 Strong integration with Microsoft products, ideal for businesses using Windows-based
environments.
 Excellent hybrid cloud capabilities, enabling seamless integration between on-premises
and cloud systems.
 Enterprise-grade security and compliance, making it suitable for industries with strict
data regulations.

Cons:

 New users might find the platform’s interface and features complex.
 Some services may have a steeper learning curve compared to AWS and GCP.

3. Google Cloud Platform (GCP):

Google Cloud Platform (GCP) offers a powerful suite of cloud services focused on computing,
data storage, and big data analytics. GCP excels in machine learning, artificial intelligence, and
open-source technologies.

Key Features:

 Compute Services: Google Compute Engine, Kubernetes Engine, etc.
 Storage Services: Google Cloud Storage, Persistent Disk, etc.
 Databases: Cloud SQL, Cloud Bigtable, Firestore, etc.
 Networking: Google Virtual Private Cloud, Cloud Load Balancing, etc.
 Machine Learning & Analytics: TensorFlow, BigQuery, etc.
 Security: Identity and Access Management (IAM), Google Cloud Security Command
Center, etc.

Pros:

 Cutting-edge tools for machine learning and data analytics, especially in big data
scenarios.
 Strong support for open-source technologies such as Kubernetes and TensorFlow.
 Global infrastructure with advanced networking capabilities.

Cons:

 Offers a smaller selection of services compared to AWS and Azure.


 Some users may find GCP’s user interface and support less intuitive, particularly for
beginners.

Conclusion:

In this lab, we explored the three leading cloud service providers: AWS, Microsoft Azure, and
Google Cloud Platform (GCP). Each provider offers a distinct set of features and benefits. AWS is
widely recognized for its vast service offerings, scalability, and flexibility. Azure is ideal for
organizations already using Microsoft products, with strong hybrid cloud capabilities. GCP is
well-suited for businesses that prioritize machine learning and data analytics, thanks to its
powerful tools and open-source support.

Choosing the right cloud provider depends on several factors, including the types of services
required, existing infrastructure, and budget considerations. By understanding the strengths
and weaknesses of each platform, businesses can make more informed decisions on which
cloud provider will best meet their needs.

Lab 03: Regions and Zones in Cloud Computing

Objective:

The purpose of this lab is to explore how cloud providers use regions and availability zones to
manage and distribute resources effectively, ensuring optimal performance, availability, and
fault tolerance for applications.

1. Regions and Zones:

Region:

A region is a geographically defined area where a cloud service provider operates multiple data
centers. The region determines where your applications and data are stored and processed.
Cloud providers typically offer multiple regions to allow businesses to deploy resources closer
to their users for better performance and to comply with data sovereignty regulations.

Availability Zone (AZ):

An availability zone (AZ) is a specific data center or a cluster of data centers within a region.
These zones are designed to be isolated from failures that might occur in other zones, yet they
are interconnected through low-latency, high-throughput networks. AZs enable high availability
and fault tolerance because even if one zone experiences an issue, other zones can handle the
load.
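
To make the fault-tolerance argument concrete, the following sketch spreads replicas of an application across the zones of one region and counts how many survive if a single zone fails. It is a toy model under stated assumptions: the zone names and the round-robin placement rule are hypothetical and do not reflect any provider's scheduler.

```python
from collections import Counter


def spread_replicas(replica_count: int, zones: list[str]) -> list[str]:
    """Place replicas across zones in round-robin order."""
    return [zones[i % len(zones)] for i in range(replica_count)]


def surviving_replicas(placement: list[str], failed_zone: str) -> int:
    """Count replicas that keep running if one zone goes down."""
    return sum(1 for zone in placement if zone != failed_zone)


zones = ["zone-a", "zone-b", "zone-c"]  # hypothetical zone names
placement = spread_replicas(6, zones)
print(Counter(placement))                        # two replicas per zone
print(surviving_replicas(placement, "zone-a"))   # replicas left after zone-a fails
```

With six replicas over three zones, losing one zone still leaves four replicas to handle the load, whereas placing all six in a single zone would leave none.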

2. Creating Resources in Different Regions:

Cloud providers offer different regions where users can deploy resources. Below is how you can
create resources in different regions on AWS and Google Cloud.

AWS Example:

To create an EC2 instance in a different region on AWS, follow these steps:

1. Log in to the AWS Management Console.


2. In the EC2 Dashboard, locate the region selector at the top-right corner.
3. Choose a different region, such as US East (N. Virginia), whose region code is us-east-1.
4. Proceed with launching a new EC2 instance by selecting the instance type, configuring storage,
and completing the instance creation process.

Google Cloud Example:

To create a virtual machine (VM) instance in a different region on Google Cloud:

1. In the Google Cloud Console, navigate to Compute Engine > VM Instances and click Create Instance.
2. In the creation form, choose the desired region and zone.
3. Configure the VM by selecting the machine type, boot disk, and other configuration options,
then create the instance.

Comparison of Cluster and Instance in Cloud Computing:

In cloud computing, clusters and instances are two key concepts with distinct uses.

Cluster:

A cluster consists of a group of interconnected virtual machines (VMs) or nodes that work
together as a unified system to deliver high availability, fault tolerance, and scalability. Clusters
are typically used for large-scale applications that require continuous availability and horizontal
scaling, such as web services, big data processing, or distributed systems like Kubernetes and
Hadoop. They offer fault tolerance: if one node fails, other nodes can pick up the workload,
ensuring continuous operation. However, clusters can be more expensive as they require
provisioning multiple instances, which increases the cost as the cluster expands.
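
The idea that other nodes pick up the workload when one fails can be sketched as a round-robin dispatcher that skips unhealthy nodes. This is only an illustration: the node names, the health flags, and the `dispatch` helper are hypothetical, and real cluster managers such as Kubernetes use far richer scheduling logic.

```python
from itertools import cycle


def dispatch(requests: list[str], nodes: dict[str, bool]) -> dict[str, str]:
    """Assign each request to the next healthy node in round-robin order."""
    healthy = [name for name, is_up in nodes.items() if is_up]
    if not healthy:
        raise RuntimeError("no healthy nodes left in the cluster")
    rotation = cycle(healthy)
    return {request: next(rotation) for request in requests}


# Hypothetical three-node cluster in which node-2 has failed.
nodes = {"node-1": True, "node-2": False, "node-3": True}
assignment = dispatch(["r1", "r2", "r3"], nodes)
print(assignment)  # {'r1': 'node-1', 'r2': 'node-3', 'r3': 'node-1'}
```

The requests are still served, but by fewer nodes, which is why clusters are usually provisioned with spare capacity.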

Instance:

An instance, in contrast, is a single virtual server that runs applications, processes, and
workloads in the cloud. Instances are better suited for smaller, standalone applications that do
not require complex infrastructure or high availability. They are easier to manage since they
operate independently. Instances can be scaled both vertically (by adding more resources to
the existing instance) and horizontally (by adding additional instances). While instances can be
part of a larger cluster, they are often used for isolated tasks, such as hosting a website or
running a database.

Creating the Instance in Google Cloud:

Follow these steps to create an instance in Google Cloud:

1. Sign Up for Google Cloud:


o If you do not have a Google Cloud account, sign up and log into the Google Cloud
Console.

2. Navigate to Compute Engine:


o In the Google Cloud Console, go to Compute Engine > VM Instances.

3. Create a New Instance:


o Click the Create Instance button to start configuring a new virtual machine.

4. Configure the Instance:

o Choose a name for your instance.
o Select a region and zone based on where you want your instance to be
located.
o Choose the machine type (e.g., general-purpose, compute-optimized).
o Select the boot disk image, such as Ubuntu or Windows Server, depending on your requirements.
o Enable firewall rules to allow HTTP/HTTPS traffic.
o Optionally configure other settings like networking and SSH keys for secure access.

5. Finalize and Create:


o Once all configurations are complete, click Create to provision the instance.
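
The same configuration can also be expressed on the command line. The sketch below only assembles a `gcloud compute instances create` command as a list of arguments and prints it; it does not run anything. The instance name, zone, machine type, and image values are illustrative assumptions and should be replaced with values valid for your project, and the network tags only have an effect if matching firewall rules exist.

```python
import shlex


def build_create_command(name: str, zone: str, machine_type: str,
                         image_family: str, image_project: str) -> list[str]:
    """Assemble a `gcloud compute instances create` command as an argument list."""
    return [
        "gcloud", "compute", "instances", "create", name,
        f"--zone={zone}",                    # the zone determines the region
        f"--machine-type={machine_type}",
        f"--image-family={image_family}",    # boot disk image
        f"--image-project={image_project}",
        "--tags=http-server,https-server",   # tags used by HTTP/HTTPS firewall rules
    ]


# Illustrative values only (assumptions, not taken from the lab).
command = build_create_command(
    "lab-vm", "us-central1-a", "e2-medium", "ubuntu-2204-lts", "ubuntu-os-cloud"
)
print(shlex.join(command))
```

Keeping the command as a list makes it easy to pass to `subprocess.run` later if you decide to execute it from a script.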

Conclusion:

In this lab, we learned about the concepts of regions and availability zones in cloud computing.
We explored how cloud providers use these geographic areas to offer better resource
management, performance optimization, and fault tolerance for applications. We also
discussed the difference between clusters and instances. Clusters are ideal for large-scale, high-
availability applications, while instances are suitable for smaller, isolated tasks. Finally, we went
through the process of creating resources in different regions on AWS and Google Cloud.

By understanding regions, zones, and the differences between clusters and instances, users can
better design and deploy cloud-based applications to meet their business needs.

Lab 04: Understanding MapReduce in Collaborative Tools

Objective:

The objective of this lab is to gain a deeper understanding of the MapReduce programming
model, which is widely used for processing large datasets efficiently. In this lab, we will explore
how the MapReduce model breaks down tasks into smaller parts (Map phase) and then
combines the results (Reduce phase) to produce a final output. We will apply this model to a
real-world scenario by simulating data processing from a collaborative tool such as Google
Sheets.

Introduction:

MapReduce is a programming model for processing large-scale data across distributed
systems. Initially popularized by Google, it has become a standard model for processing big
data. The MapReduce model consists of two main phases:

 Map Phase: The input data is split into smaller, manageable pieces, and a map function
turns each record into intermediate key-value pairs. The pieces are processed in parallel
across different machines.
 Reduce Phase: After the Map phase, the key-value pairs are grouped by key, and a reduce
function performs a calculation (e.g., summing, counting, averaging) on each group to
produce a consolidated result.
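
The two phases can be sketched as a small single-process Python function. This is a minimal illustration of the model, not a distributed implementation: nothing runs in parallel here, and the `map_reduce` helper and the sensor readings are hypothetical.

```python
from collections import defaultdict


def map_reduce(records, mapper, reducer):
    """Run a mapper over every record, group the emitted values by key, then reduce each group."""
    groups = defaultdict(list)
    for record in records:
        # Map: one record may emit zero or more (key, value) pairs.
        for key, value in mapper(record):
            # Group: collect all values that share a key.
            groups[key].append(value)
    # Reduce: collapse each key's list of values into one result.
    return {key: reducer(values) for key, values in groups.items()}


# Hypothetical input: (sensor name, reading) tuples.
readings = [("sensor-a", 3), ("sensor-b", 5), ("sensor-a", 4)]
totals = map_reduce(readings, lambda r: [(r[0], r[1])], sum)
print(totals)  # {'sensor-a': 7, 'sensor-b': 5}
```

Swapping the reducer changes the summary without touching the map step: passing `max` instead of `sum` would return the largest reading per sensor.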

Steps Performed:

Step 1: Data Collection (Map Phase)

In this step, we simulate the Map function by collecting raw meeting data. The data includes
details such as:

 Task Name
 Assigned Participant
 Status of the task
 Time Spent on the task

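Each row can be treated as a key-value pair keyed by participant (for example, the
participant's name paired with the time spent). Because the rows are independent of one
another, they can be processed in parallel, enabling faster data handling.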

Step 2: Data Aggregation (Reduce Phase)

The Reduce phase processes the mapped data by aggregating it into useful summaries. The
following operations were performed during this phase:

 Counting the number of tasks assigned to each participant.
 Summing the total time spent by each participant.

We applied COUNTIF formulas in Google Sheets to count the tasks assigned to each participant
(the range B2:B6 holds the Assigned Participant column):

 For Ram:

=COUNTIF(B2:B6, "Ram")

 For Hari:

=COUNTIF(B2:B6, "Hari")

 For Shyam:

=COUNTIF(B2:B6, "Shyam")
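
For comparison, the same aggregation can be sketched in Python, including the per-participant time totals that a SUMIF formula would produce in Sheets. The five task rows below are hypothetical sample data invented for illustration; only the participant names come from this lab.

```python
from collections import defaultdict

# Hypothetical rows: (task name, assigned participant, status, minutes spent).
rows = [
    ("Draft agenda", "Ram", "Done", 30),
    ("Book room", "Hari", "Done", 15),
    ("Send invites", "Ram", "Pending", 20),
    ("Prepare slides", "Shyam", "In progress", 45),
    ("Take minutes", "Hari", "Pending", 25),
]

task_count = defaultdict(int)
time_spent = defaultdict(int)
for _task, participant, _status, minutes in rows:
    task_count[participant] += 1        # plays the role of COUNTIF
    time_spent[participant] += minutes  # plays the role of SUMIF

for participant in task_count:
    print(participant, task_count[participant], time_spent[participant])
```

Each loop iteration corresponds to mapping one row to its participant key, and the two dictionaries hold the reduced results.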

Results and Observations:

Through this lab, we saw how the MapReduce framework can simplify and optimize the process
of handling and summarizing large datasets, making it a powerful tool for data-driven decision-
making.

Conclusion:

In this lab, we learned how to apply the MapReduce programming model to process and
analyze data collected from a collaborative tool like Google Sheets. The Map phase helped us
break down the raw data into smaller chunks for efficient processing, while the Reduce phase
allowed us to aggregate that data into meaningful summaries. This method of data processing
can be easily applied to other real-world scenarios where large datasets need to be handled
and analyzed efficiently.

Lab 05: VMware Installation and Virtual Machine Creation

Objective:

The goal of this lab is to guide you through the installation process of VMware Workstation or
VMware Player on your system and demonstrate how to create a Virtual Machine (VM). This
will enable you to run multiple operating systems on one physical machine, which is an
essential skill for virtualization and creating software development environments.

Introduction:

VMware is a widely used software suite for creating and managing virtual machines.
Virtualization allows users to run several operating systems (OS) on a single physical computer,
providing more flexibility and efficient resource management. VMware uses a hypervisor to
manage resources and control virtual machines, allowing them to operate independently of the
physical hardware.

Create a New Virtual Machine

1. Start VMware Workstation/Player:


 Open VMware Workstation or VMware Player from your desktop or start menu.

2. Create a New Virtual Machine:


 From the main interface, select File > New Virtual Machine or click on Create a New
Virtual Machine.
 Opt for Typical (recommended) configuration for a smoother and more automated
setup.

3. Select Installation Media:


 VMware will prompt you to select the installation media for the guest operating system.
 Browse and select the ISO file for the OS you wish to install (e.g., Ubuntu, Windows).

4. Select the Guest Operating System:


 Choose the OS type (e.g., Linux, Windows) and version (e.g., Ubuntu, Windows 10) from
the list.
 VMware will automatically adjust the VM settings based on your OS choice.

5. Name the Virtual Machine:


 Provide a name for your virtual machine.
 Choose the location on your system where the VM will be stored.

6. Configure Virtual Machine Hardware:
 VMware will display the default hardware configuration, such as memory (RAM),
processors (CPU), and hard disk size.
 You can click on Customize Hardware to adjust settings (e.g., allocate more RAM, CPU
cores, etc.).
 As a starting point, allocate at least 2 GB of RAM and 20 GB of disk space to the virtual
machine, then adjust these values to the requirements of the guest operating system.

7. Finish the Setup:


 After configuring the virtual machine hardware, click Finish.
 VMware will create the virtual machine and add it to the VMware interface.

Install the Operating System on the Virtual Machine

1. Power On the Virtual Machine:


 Select the newly created virtual machine in the VMware interface and click Power On to start it.
 The virtual machine will boot from the ISO file you selected earlier.

2. Install the Operating System:


 Follow the on-screen instructions to install the operating system, just as you would on a physical
computer.
 This includes setting up your language preferences, disk partitioning (usually the default settings
are fine), and configuring the network.
 For Linux (e.g., Ubuntu), follow the installation wizard to complete the setup.
 For Windows, you may need to enter a product key if prompted.

3. Complete the Installation:


 Once the installation is complete, the virtual machine will prompt you to restart.
 After rebooting, you should be able to log into your new virtual machine and begin using the
installed operating system.

Install VMware Tools

1. Install VMware Tools (Optional but Recommended):


 After the OS is installed, you can install VMware Tools, which enhances the performance of the
virtual machine and enables better integration with the host system.
 In VMware Workstation or Player, click VM > Install VMware Tools.
 Follow the prompts within the guest operating system to complete the installation (for Linux,
you may need to mount and extract the VMware Tools CD).

Conclusion:

In this lab, you have successfully installed VMware Workstation/Player, created a Virtual
Machine, and installed an operating system on it. VMware allows you to run multiple operating
systems on a single physical machine, making it an excellent tool for developers, testers, and IT
professionals. You now have a virtual environment where you can experiment with different
OSes without affecting your host system.

Lab 06: Collaboration Tools – Google Meet, Zoom, and Microsoft Teams

Objective:

The objective of this lab is to explore the use of collaboration tools such as Google Meet, Zoom,
and Microsoft Teams. These tools are essential for remote communication and collaboration,
enabling teams to conduct virtual meetings, share files, and collaborate in real-time. In this lab,
we will learn how to set up meetings, manage participants, share content, and use other
features to enhance collaboration.

Introduction:

In today's remote work environment, collaboration tools like Google Meet, Zoom, and
Microsoft Teams have become essential for communication within teams. These tools support
virtual meetings, video conferencing, screen sharing, file sharing, and messaging, which are
crucial for remote work. This lab will guide you through setting up and using these tools,
helping you understand their features and how they support efficient collaboration.

Steps to Perform the Lab:

Set Up Google Meet

1. Sign in to Google Meet:


o Go to the Google Meet website or open the Google Meet app if you have it
installed.
o Sign in with your Google account (either personal or work account).

2. Create a New Meeting:


o Click on the New Meeting button.
o A menu of meeting options will appear. For this lab, choose Start an instant meeting.

3. Invite Participants:
o Once the meeting starts, click on the Copy Joining Info or Add People to share
the meeting link or invite participants via email.

Set Up Zoom

1. Sign in to Zoom:
o Download and install the Zoom app from the Zoom website
(https://zoom.us/download) or sign in directly through the Zoom web platform.
o Sign in with your Zoom account or create one if you don’t already have one.

2. Create a New Meeting:


o Once signed in, click on Host a Meeting to start a new meeting.
o You can also click on Schedule a Meeting if you prefer to set up a future meeting.

3. Invite Participants:
o Once the meeting is created, you will see the meeting ID and link. Copy the
invitation link and share it with the participants.

Set Up Microsoft Teams

1. Sign in to Microsoft Teams:


o Go to the Microsoft Teams website (https://teams.microsoft.com) or open the
Teams app on your computer.
o Sign in using your Microsoft account or work/school account.

2. Create a New Team (Optional):


o If you haven’t created a team already, click on Join or create a team at the
bottom of the Teams interface.
o Choose Create a team and set up the team name, privacy, and members.

3. Create a New Meeting:


o Go to the Calendar tab in Teams.
o Click on New Meeting to schedule a meeting or choose Meet Now for an instant
meeting.
o Provide a title, start time, and invite participants by adding their emails or Teams
usernames.

4. Invite Participants:
o After setting up the meeting, click Send to send out the meeting invite.
o Participants can join via the invitation link or directly through Teams if they are
already members of the Team.

Conclusion:

In this lab, we explored three popular collaboration tools: Google Meet, Zoom, and Microsoft
Teams. Each tool has its unique strengths, but all provide essential features for remote
communication, meetings, and collaboration. By creating meetings, managing participants, and
utilizing key features like screen sharing and chat, we have learned how to efficiently use these
tools for professional and personal virtual communication.

Lab 07: CSV Files in Google Colab

Objective:

The objective of this lab is to learn how to upload a CSV file to a Google Colab environment,
read it into a pandas DataFrame, and display the first few rows of the dataset. This is an
essential skill for data analysis as CSV files are one of the most common formats for storing and
sharing data.

Steps to Perform the Lab:

Step 1: Upload the CSV File

from google.colab import files
uploaded = files.upload()  # opens a file picker; returns {filename: file contents as bytes}

Step 2: Import the pandas Library

import pandas as pd

Step 3: Load the CSV File into a DataFrame

file_name = list(uploaded.keys())[0]  # name of the first uploaded file
df = pd.read_csv(f"/content/{file_name}")  # Colab saves uploads to /content by default

Step 4: Display the First 5 Rows of the Dataset

print("First 5 rows of the dataset:")
print(df.head())
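
For readers working outside Colab, the preview step can be approximated with Python's standard library alone. This is a rough sketch of what loading a CSV and showing its first rows involves, not a replacement for pandas: the `head` helper and the sample data are hypothetical, and every value stays a string because no type inference is performed.

```python
import csv
import io
from itertools import islice


def head(csv_text: str, n: int = 5) -> list[dict[str, str]]:
    """Return the first n data rows of CSV text as dictionaries keyed by column name."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return list(islice(reader, n))


# Hypothetical sample data standing in for an uploaded file.
sample = "name,score\nA,1\nB,2\nC,3\n"
for row in head(sample, 2):
    print(row)
```

The first line of the text is consumed as the header, so `head(sample, 2)` returns the rows for A and B only.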

Conclusion:

In this lab, we learned how to upload a CSV file to Google Colab, read the file into a pandas
DataFrame, and display the first few rows. This is a fundamental skill for any data analysis task,
as CSV files are commonly used for storing and sharing datasets. By using pandas, we can
manipulate, clean, and analyze the data efficiently.

Lab 08: Cloud Storage and Collaboration Tools - Google Drive, OneDrive, Dropbox, Slack

Objective:

The objective of this lab is to explore and understand the usage of various cloud storage and
collaboration tools, including Google Drive, OneDrive, Dropbox, and Slack. These tools are
widely used for storing, sharing, and collaborating on files across devices and teams. In this lab,
we will learn how to use these tools effectively for both personal and professional purposes.

Introduction:

Cloud storage and collaboration tools have become essential in today's digital world, enabling
seamless file access, sharing, and real-time collaboration. Google Drive, OneDrive, and Dropbox
are popular cloud storage services that allow users to store files online, while Slack is a
collaboration tool designed for team communication, project management, and file sharing.
This lab focuses on:

 Understanding the features and benefits of cloud storage services: Google Drive,
OneDrive, and Dropbox.
 Exploring the use of Slack for team collaboration, messaging, and integration with cloud
storage services.
 Learning how to use these tools to store, share, and collaborate on files.

Steps to Perform the Lab:

1. Google Drive

1. Access Google Drive:


o Go to Google Drive in your web browser.
o Sign in with your Google account.

2. Upload Files to Google Drive:


o Click on the New button on the left-hand side.
o Select File upload or Folder upload to upload files from your computer.

3. Sharing Files in Google Drive:


o Right-click on the file/folder you want to share and select Share.
o You can share files with individuals or get a shareable link.
o You can set permissions like Viewer, Commenter, or Editor for collaborators.

4. Collaborating on Files:

o For files like Google Docs, Sheets, and Slides, you can collaborate in real-time
with others.
o Multiple users can edit and comment on the same document at the same time.

2. OneDrive

1. Access OneDrive:
o Go to OneDrive in your web browser.
o Sign in with your Microsoft account.

2. Upload Files to OneDrive:


o Click on the Upload button to upload files or folders.
o You can drag and drop files directly into the OneDrive window.

3. Sharing Files in OneDrive:


o Right-click the file or folder you want to share and select Share.
o Choose to send a sharing link to individuals or groups and set permissions (view
or edit).

4. Collaborating on Files:
o For files such as Microsoft Office documents (Word, Excel, PowerPoint), multiple
users can edit them simultaneously.
o You can use OneDrive to sync files across devices, ensuring you have access to
them wherever you go.

3. Dropbox

1. Access Dropbox:
o Go to Dropbox in your web browser.
o Sign in with your Dropbox account.

2. Upload Files to Dropbox:


o Click on Upload files or Upload folder to upload content.
o You can also drag and drop files into Dropbox for easy uploading.

3. Sharing Files in Dropbox:


o Right-click the file or folder you want to share and select Share.
o You can share the file with specific people or generate a link.
o Dropbox also supports setting permissions (can edit or can view).

4. Collaborating on Files:
o You can invite others to a shared folder where everyone can contribute, access, and
modify files.
o Dropbox allows for real-time collaboration with integrations for Microsoft Office or
Google Workspace files.

4. Slack for Collaboration

1. Access Slack:
o Go to Slack and create an account or log in to your workspace.
o Download the desktop app or use the web version.

2. Create a Workspace:
o Create or join a Slack workspace for team collaboration.
o Set up channels for various topics, projects, or departments (e.g., #general, #marketing,
#development).

3. Send Messages and Share Files:


o In Slack, you can send messages in channels or direct messages.
o You can share files from Google Drive, OneDrive, Dropbox, or directly upload files to
Slack.

4. Integrate Cloud Storage with Slack:
o You can integrate Google Drive, OneDrive, and Dropbox with Slack to easily share files.
o When you upload files to Slack, they can be directly linked to your cloud storage for
easier access.

5. Collaborating in Slack:
o Use @mentions to notify team members.
o You can also create to-do lists, share documents, and work on tasks in real-time.
o Slack supports integration with other tools (e.g., Trello, Jira) for enhanced project
management.

Conclusion:

In this lab, we explored popular cloud storage and collaboration tools: Google Drive,
OneDrive, Dropbox, and Slack. These tools are integral to modern workplaces for storing,
sharing, and collaborating on files in real time. We learned how to upload, organize, and
share files, and how to collaborate using the features of these tools. Additionally, we
explored Slack for team communication, task management, and integration with cloud storage
services for efficient file access and collaboration.

Lab 9: Implementing MapReduce for Word Count in Google Colab

Objective:

The objective of this lab is to implement the MapReduce algorithm for counting the
occurrences of specific words (e.g., "deer" and "beer") in a large unstructured text file (CSV
format) using Google Colab. The goal is to understand how the MapReduce framework works in
practice by using a local Python-based approach.

Introduction:

MapReduce is a programming model that allows for the parallel processing of large datasets
across distributed systems. It consists of two main phases: the map phase, where data is
processed and split into key-value pairs, and the reduce phase, where the results are
aggregated. In this lab, we simulate the MapReduce approach to count occurrences of the
words "deer" and "beer" in a sample text dataset uploaded as a CSV file. We will simulate
MapReduce on a single machine using Python in Google Colab, mimicking the processing steps
that would typically occur in a large distributed cluster.
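
The two phases, together with the intermediate grouping step that frameworks usually call the shuffle, can be sketched in plain Python. This is a minimal single-machine illustration under stated assumptions, not a distributed implementation: the function names mapper, shuffle, and reducer and the sample lines are made up for this sketch.

```python
from collections import defaultdict

def mapper(line):
    """Map phase: emit a (word, 1) pair for every word in one line."""
    return [(word, 1) for word in line.lower().split()]

def shuffle(pairs):
    """Shuffle step: group all emitted values by their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reducer(key, values):
    """Reduce phase: aggregate the values collected for one key."""
    return key, sum(values)

# Hypothetical sample input; each line stands in for one input split.
lines = ["Deer Bear River", "Car Car River", "Deer Car Bear"]

mapped = [pair for line in lines for pair in mapper(line)]
grouped = shuffle(mapped)
counts = dict(reducer(key, values) for key, values in grouped.items())
print(counts)  # {'deer': 2, 'bear': 2, 'river': 2, 'car': 3}
```

In a real cluster, each call to mapper would run on a different node and the shuffle would move data over the network; here everything runs in one process.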

Materials Needed:

• Google Colab (accessible via browser).
• A CSV file containing unstructured text data (uploaded to Google Colab).
• Python Libraries: csv, collections (for processing and storing word counts).

Pre-requisites:

• Basic understanding of the MapReduce algorithm.
• Familiarity with Python programming and the Google Colab environment.
• Knowledge of CSV file format and how to work with text data.

Steps to Perform the Lab:

Step 1: Upload the CSV File

First, we will upload the CSV file that contains the text data (for example, product reviews or
any other unstructured text). In Google Colab, use the following code to upload the file.

from google.colab import files

uploaded = files.upload()

Running this cell opens the file upload dialog, where you can select and upload your file.
files.upload() returns a dictionary keyed by the name of each uploaded file, which the next
step uses to obtain the file name.
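
If no dataset is at hand, a small test file can be generated in the notebook instead of uploaded. The sketch below is only meant for trying out the later steps; the file name sample_reviews.csv, the column name review, and the rows themselves are invented for illustration.

```python
import csv

# Hypothetical sample data: a header row followed by one review per row.
sample_rows = [
    ["review"],
    ["The deer crossed the road near the brewery"],
    ["Cold beer and good food"],
    ["We saw a deer and then had a beer"],
]

sample_name = "sample_reviews.csv"  # assumed file name, not part of the lab

# newline="" lets the csv module control line endings itself.
with open(sample_name, "w", newline="", encoding="utf-8") as f:
    csv.writer(f).writerows(sample_rows)

print(f"Wrote {len(sample_rows) - 1} reviews to {sample_name}")
```

If you use this generated file, set file_name = sample_name rather than reading the name from uploaded.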

Step 2: Extract the File Name

After uploading the file, we dynamically capture the file name to use it in the subsequent steps:

file_name = list(uploaded.keys())[0]
print(f"Uploaded file name: {file_name}")

This step ensures that the file is correctly identified and used for processing.
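
Step 3 assumes a header row and the text in the first column, so it can help to look at the first few rows before processing. The following is a minimal sketch: preview_csv is a hypothetical helper, and demo_preview.csv is a throwaway file created here only so the snippet runs on its own. In the lab you would pass file_name instead.

```python
import csv
from itertools import islice

def preview_csv(path, n=3):
    """Return the first n rows of a CSV file as lists of strings."""
    with open(path, "r", newline="", encoding="utf-8") as f:
        return list(islice(csv.reader(f), n))

# Throwaway demo file (hypothetical content) so the example is self-contained.
with open("demo_preview.csv", "w", newline="", encoding="utf-8") as f:
    csv.writer(f).writerows([["review"], ["deer in the park"], ["beer on tap"]])

for row in preview_csv("demo_preview.csv", n=2):
    print(row)
```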

Step 3: Implement MapReduce Logic

Now we implement the MapReduce logic for counting occurrences of the words "deer" and
"beer". For simplicity, the process_file function below combines both phases in a single
pass: the map step tokenizes each row and selects the target words, and the reduce step
accumulates their counts in a dictionary.

import csv
import string
from collections import defaultdict

TARGET_WORDS = {"deer", "beer"}

# Process the uploaded CSV file and count the target words
def process_file(file_name):
    word_count = defaultdict(int)

    # Read the CSV file and process it row by row
    with open(file_name, "r", newline="", encoding="utf-8") as f:
        reader = csv.reader(f)
        next(reader, None)  # Skip the header row (no error if the file is empty)
        for row in reader:
            if row:
                review = row[0].lower()  # Assuming the review is in the first column
                for token in review.split():
                    # Strip surrounding punctuation so "deer," still matches "deer"
                    word = token.strip(string.punctuation)
                    if word in TARGET_WORDS:
                        word_count[word] += 1
    return word_count

# Process the file and get word counts
word_counts = process_file(f"/content/{file_name}")

# Print the word counts
print("Word counts for 'deer' and 'beer':")
for word, count in word_counts.items():
    print(f"{word}: {count}")
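
The function above counts in a single pass. To make the map and reduce phases visible, the same count can be split into separate steps: each chunk of rows is mapped to partial counts, and a reducer merges the partial results, which is roughly how the work would be divided among workers in a cluster. This is a sketch only; map_chunk, reduce_counts, chunked, the chunk size, and the in-memory rows are assumptions for illustration, and the chunks are processed sequentially here rather than in parallel.

```python
from collections import Counter
from functools import reduce
import string

TARGET_WORDS = {"deer", "beer"}

def chunked(rows, size):
    """Split the input rows into consecutive chunks (simulated input splits)."""
    for start in range(0, len(rows), size):
        yield rows[start:start + size]

def map_chunk(rows):
    """Map phase for one chunk: count the target words in its rows."""
    partial = Counter()
    for text in rows:
        for token in text.lower().split():
            word = token.strip(string.punctuation)
            if word in TARGET_WORDS:
                partial[word] += 1
    return partial

def reduce_counts(left, right):
    """Reduce phase: merge two partial counts into one."""
    return left + right

# Hypothetical rows standing in for the first column of the CSV file.
rows = ["A deer, a beer.", "Beer again", "No match here", "Deer deer"]

partials = [map_chunk(chunk) for chunk in chunked(rows, size=2)]
totals = reduce(reduce_counts, partials, Counter())
print(dict(totals))
```

Because each call to map_chunk depends only on its own chunk, the list of partial counts could be produced by separate workers without changing the reducer.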

Conclusion:

In this lab, we simulated the MapReduce algorithm using Python and Google Colab to count the
occurrences of specific words ("deer" and "beer") in an uploaded CSV file. MapReduce is
designed for parallel processing of large datasets across a cluster; in this case, we ran
the same logic on a single machine in Google Colab for simplicity.
