Vikas Kumar Report Final
in
Computer Science and Engineering
Submitted by
Vikas Kumar
199301298
Submitted to
AUGUST, 2022
Date: 17.08.2022
CERTIFICATE
This is to certify that Mr. Vikas Kumar has completed the Minor/Major Industrial
Training during the period from June 2022 to July 2022 in our Organization/Industry,
in partial fulfilment of the degree of Bachelor of Technology in Computer Science and
Engineering of Manipal University Jaipur, during the academic year 2021-22. He was
trained in the field of Google Cloud Computing.
I hereby declare that the Industrial Training Report on Google Cloud Computing
Certification is an authentic record of my own work, carried out as a requirement of the
Minor/Major Industrial Training during the period from June 2022 to July 2022, for the
award of the degree of B.Tech. (Computer Science and Engineering), Manipal University
Jaipur, Rajasthan, under the guidance of Dr. Manmohan.
Vikas Kumar
199301298
Date: 17.08.2022
Certified that the above statement made by the student is correct to the best of our
knowledge and belief.
Examined by:
(Signature)
(Signature)
Head of Department
ACKNOWLEDGMENTS
First and foremost, I wish to express my sincere thanks and gratitude to my esteemed
mentor, Dr. Manmohan, who contributed so much to the successful completion of my
Industrial Training through his thoughtful reviews and valuable guidance. Next, I would
like to extend my sincere thanks to Dr. Sandeep Chaurasia (Head of the Computer
Science and Engineering Department), the department's industrial training coordinator,
and the faculty members whose assistance was sought during the training, for their
cooperation and encouragement.
Vikas Kumar
199301298
ABSTRACT
Cloud computing is a technology that uses the internet to store and manage data on
remote servers, with the data then accessed over the internet. This model allows users to
work remotely. Cloud computing customers do not own the physical infrastructure;
they rent usage from a third-party provider. The essential characteristics of cloud
services are on-demand self-service, broad network access, resource pooling, and rapid
elasticity. Cloud computing is successful largely because of its simplicity of use, and it is
a cost-effective solution for enterprises. Its main features are optimal server utilization,
on-demand cloud services (satisfying the client), dynamic scalability, and virtualization
techniques. One such example is Google Cloud, a suite of public cloud services offered
by Google. All application development runs on Google hardware. The services include
Google Compute Engine, App Engine, Google Cloud Storage, and Google Container
Engine.
Cloud computing is a general term for anything that involves delivering hosted services over
the internet. These services are divided into three main categories or types of cloud
computing: infrastructure as a service (IaaS), platform as a service (PaaS) and software as
a service (SaaS).
A cloud can be private or public. A public cloud sells services to anyone on the internet. A
private cloud is a proprietary network or a data center that supplies hosted services to a
limited number of people, with certain access and permissions settings. Private or public,
the goal of cloud computing is to provide easy, scalable access to computing resources and
IT services.
Cloud infrastructure involves the hardware and software components required for proper
implementation of a cloud computing model. Cloud computing can also be thought of as
utility computing or on-demand computing.
The name cloud computing was inspired by the cloud symbol that's often used to represent
the internet in flowcharts and diagrams.
LIST OF TABLES
Table No  Table Title
1  Cloud Functions Use Cases
2  Template Parameters
LIST OF FIGURES
Figure No  Figure Title
1  Networking in Cloud
2  BeyondCorp Components and Access Flow
3  Multiple VPC Networks
4  HTTP Load Balancer with Cloud Armor
5  High-Level Internal TCP/UDP Load Balancer
6  Vertex AI
Contents
Acknowledgements
Abstract
List of Figures
List of Tables
INTRODUCTION
Chapter 1 GOOGLE CLOUD ESSENTIALS
1.1 Creating a Virtual Machine
1.2 Cloud Shell and gcloud
1.3 Kubernetes Engine
1.4 Network Load Balancer & HTTP Load Balancer
Creating a Virtual Machine
Most of the regions have three or more zones.
In this hands-on lab, we create virtual machine instances of various machine types
using the Google Cloud Console and the gcloud command line, and learn how to
connect an NGINX web server to a virtual machine.
Resources that live in a zone are referred to as zonal resources. Virtual machine
Instances and persistent disks live in a zone. To attach a persistent disk to a virtual
machine instance, both resources must be in the same zone. Similarly, if you want to
assign a static IP address to an instance, the instance must be in the same region as
the static IP.
Cloud Shell and gcloud
Cloud Shell provides you with command-line access to computing resources hosted on
Google Cloud. Cloud Shell is a Debian-based virtual machine with a persistent 5-GB
home directory, which makes it easy for you to manage your Google Cloud projects and
resources. The gcloud command-line tool and other utilities you need are pre-installed in
Cloud Shell, which allows you to get up and running quickly.
Commands:
● gcloud compute allows you to manage your Compute Engine resources in a
format that's simpler than the Compute Engine API.
● If you omit the --zone flag, the gcloud tool can infer your desired zone based on
your default properties. Other required instance settings, such as machine
type and image, are set to default values if not specified in the create command.
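The default-inference behaviour described above can be sketched as follows; this is an illustrative Python sketch of the idea, not gcloud's actual implementation, and the default values shown are assumptions:

```python
# Illustrative sketch (not gcloud source): how a create command can fall back
# to configured default properties when flags such as --zone are omitted.
DEFAULTS = {"zone": "us-central1-a", "machine_type": "e2-medium", "image_family": "debian-11"}

def resolve_instance_settings(**flags):
    """Merge explicitly passed flags over the configured defaults."""
    settings = dict(DEFAULTS)
    settings.update({k: v for k, v in flags.items() if v is not None})
    return settings

# An explicit zone wins; machine_type and image_family fall back to defaults.
print(resolve_instance_settings(zone="europe-west1-b"))
```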
Kubernetes Engine
When you run a GKE cluster, you also gain the benefit of advanced cluster management
features that Google Cloud provides. These include:
Network Load Balancer & HTTP Load Balancer
You can configure a network load balancer for TCP, UDP, ESP, GRE, ICMP, and ICMPv6 traffic.
● Google Cloud VMs that have internet access through Cloud NAT or instance-based NAT
● Load-balanced packets are received by backend VMs with the packet's source and
destination IP addresses, protocol, and, if the protocol is port-based, the source
and destination ports unchanged.
● Responses from the backend VMs go directly to the clients, not back through the
load balancer. The industry term for this is direct server return.
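A toy model of this pass-through behaviour, with hypothetical names and a simplified round-robin balancer (a sketch of the concept, not Google Cloud's implementation):

```python
# Toy model of pass-through network load balancing with direct server return:
# the balancer forwards the packet unchanged and the backend answers the
# client directly, bypassing the balancer on the return path.
import itertools

class NetworkLoadBalancer:
    def __init__(self, backends):
        self._rr = itertools.cycle(backends)  # round-robin backend selection

    def forward(self, packet):
        backend = next(self._rr)
        # Source/destination addresses, protocol, and ports are left unchanged.
        return backend.handle(dict(packet))

class Backend:
    def __init__(self, name):
        self.name = name

    def handle(self, packet):
        # The response goes straight back to packet["src"] (direct server return).
        return {"to": packet["src"], "served_by": self.name}

lb = NetworkLoadBalancer([Backend("vm-a"), Backend("vm-b")])
reply = lb.forward({"src": "203.0.113.7", "dst": "198.51.100.10", "proto": "TCP"})
print(reply)  # {'to': '203.0.113.7', 'served_by': 'vm-a'}
```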
HTTP(S) Load Balancing
External HTTP(S) Load Balancing is a proxy-based Layer 7 load balancer that enables you to run
and scale your services behind a single external IP address. External HTTP(S) Load Balancing
distributes HTTP and HTTPS traffic to backends hosted on a variety of Google Cloud platforms
(such as Compute Engine, Google Kubernetes Engine (GKE), Cloud Storage, and so on), as well
as external backends connected over the internet or via hybrid connectivity. For details, see Use
cases.
Modes of operation
You can configure External HTTP(S) Load Balancing in the following modes:
● Global external HTTP(S) load balancer (classic): This is the classic external HTTP(S)
load balancer that is global in Premium Tier but can be configured to be regional in
Standard Tier. This load balancer is implemented on Google Front Ends (GFEs). GFEs are
distributed globally and operate together using Google's global network and control
plane.
Networking in Google Cloud
Google Cloud is divided into regions, which are further subdivided into zones.
● A region is a geographic area where the round trip time (RTT) from one VM to another
is typically under 1 ms.
● A zone is a deployment area within a region that has its own fully isolated and
independent failure domain.
This means that no two machines in different zones or in different regions share the same fate
in the event of a single failure.
At the time of this writing, Google has more than 27 regions and more than 82 zones,
serving 200+ countries. This includes 146 network edge locations and a CDN to deliver
content. This is the same network that also powers Google Search, Maps, Gmail, and
YouTube.
Google network infrastructure
Google’s physical network infrastructure powers the global virtual network that you need to run
your applications in the cloud. It offers virtual networking and tools needed to lift-and-shift,
expand, and/or modernize your applications:
Figure 1: Networking in Cloud
BeyondCorp Enterprise (BCE)
Virtually every company today uses firewalls to enforce perimeter security. However, this
security model is problematic because, when that perimeter is breached, an attacker has
relatively easy access to a company’s privileged intranet. As companies adopt mobile and cloud
technologies, the perimeter is becoming increasingly difficult to enforce. Google is taking a
different approach to network security. We are removing the requirement for a privileged
intranet and moving our corporate applications to the Internet.
Google’s BeyondCorp initiative is moving to a new model that dispenses with a privileged
corporate network. Instead, access depends solely on device and user credentials, regardless of
a user’s network location—be it an enterprise location, a home network, or a hotel or coffee
shop. All access to enterprise resources is fully authenticated, fully authorized, and fully
encrypted based upon device state and user credentials. We can enforce fine-grained access to
different parts of enterprise resources. As a result, all Google employees can work successfully
from any network, and without the need for a traditional VPN connection to the privileged
network. The user experience between local and remote access to enterprise resources is
effectively identical, apart from potential differences in latency.
VPC NETWORK
A Virtual Private Cloud (VPC) network is a virtual version of a physical network, implemented
inside of Google's production network, using Andromeda. A VPC network provides the following:
● Offers native Internal TCP/UDP Load Balancing and proxy systems for Internal HTTP(S)
Load Balancing.
● Connects to on-premises networks using Cloud VPN tunnels and Cloud Interconnect
attachments.
● VPC networks, including their associated routes and firewall rules, are global resources.
They are not associated with any particular region or zone.
● Subnets are regional resources.
● Each subnet defines a range of IPv4 addresses. Subnets in custom mode VPC networks
can also have a range of IPv6 addresses.
● Traffic to and from instances can be controlled with network firewall rules. Rules are
implemented on the VMs themselves, so traffic can only be controlled and logged as it
leaves or arrives at a VM.
● Resources within a VPC network can communicate with one another by using internal
IPv4 addresses, internal IPv6 addresses, or external IPv6 addresses, subject to applicable
network firewall rules. For more information, see communication within the network.
Routes
Routes define paths for packets leaving instances (egress traffic). For details about Google
Cloud route types, see the routes overview.
For a description of dynamic routing mode options, see Effects of dynamic routing mode in the
Cloud Router documentation.
Firewall rules
Both hierarchical firewall policies and VPC firewall rules apply to packets sent to and from VM
instances (and resources that depend on VMs, such as Google Kubernetes Engine nodes). Both
types of firewalls control traffic even if it is between VMs in the same VPC network.
Communication within the network
The system-generated subnet routes define the paths for sending traffic among instances
within the network by using internal IP addresses. For one instance to be able to communicate
with another, appropriate firewall rules must also be configured because every network has an
implied deny firewall rule for ingress traffic.
Except for the default network, you must explicitly create higher priority ingress firewall rules to
allow instances to communicate with one another. The default network includes several firewall
rules in addition to the implied ones, including the default-allow-internal rule, which permits
instance-to-instance communication within the network. The default network also comes with
ingress rules allowing protocols such as RDP and SSH.
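The implied-deny-plus-priority behaviour can be sketched as follows; the rule shapes, ranges, and priorities here are illustrative assumptions, not the real VPC rule engine:

```python
# Hypothetical sketch of VPC-style ingress firewall evaluation: rules are
# checked in priority order (lower number wins) and every network carries an
# implied lowest-priority "deny all ingress" rule.
import ipaddress

IMPLIED_DENY = {"priority": 65535, "action": "deny", "source": "0.0.0.0/0"}

def evaluate_ingress(rules, source_ip):
    """Return the action of the highest-priority rule matching source_ip."""
    addr = ipaddress.ip_address(source_ip)
    for rule in sorted(rules + [IMPLIED_DENY], key=lambda r: r["priority"]):
        if addr in ipaddress.ip_network(rule["source"]):
            return rule["action"]

# A default-allow-internal style rule permits instance-to-instance traffic.
rules = [{"priority": 1000, "action": "allow", "source": "10.128.0.0/9"}]
print(evaluate_ingress(rules, "10.128.0.2"))   # allow (matches the internal range)
print(evaluate_ingress(rules, "203.0.113.9"))  # deny  (implied ingress deny)
```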
Latency
The measured inter-region latency for Google Cloud networks can be found in our live
dashboard. The dashboard shows Google Cloud's median inter-region latency and throughput
performance metrics and the methodology to reproduce these results using PerfKit
Benchmarker.
Google Cloud typically measures round-trip latencies less than 55 μs at the 50th percentile and
tail latencies less than 80μs at the 99th percentile between c2-standard-4 VM instances in the
same zone.
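Percentile figures like these are computed from sorted samples; a minimal nearest-rank sketch (the sample latencies below are invented, not measured Google Cloud numbers):

```python
# Illustrative nearest-rank percentile computation of the kind behind
# "p50 / p99 latency" figures.
import math

def percentile(samples, pct):
    """Nearest-rank percentile: value at ceil(pct/100 * n) in sorted order."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]

latencies_us = [42, 45, 47, 50, 51, 53, 55, 60, 70, 79]  # round trips in microseconds
print(percentile(latencies_us, 50))  # median (50th percentile)
print(percentile(latencies_us, 99))  # tail latency (99th percentile)
```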
Packet loss
Google Cloud tracks cross-region packet loss by regularly measuring round-trip loss between all
regions. We target the global average of those measurements to be lower than 0.01%.
Multiple VPC Networks
In this lab we create several VPC networks and VM instances and test connectivity across
networks. Specifically, we create two custom mode networks (managementnet and privatenet)
with firewall rules and VM instances as shown in this network diagram:
HTTP LOAD BALANCER WITH CLOUD ARMOR
Google Cloud HTTP(S) load balancing is implemented at the edge of Google's network in
Google's points of presence (POP) around the world. User traffic directed to an HTTP(S) load
balancer enters the POP closest to the user and is then load balanced over Google's global
network to the closest backend that has sufficient capacity available.
Cloud Armor IP allowlists/denylists enable you to restrict or allow access to your HTTP(S) load
balancer at the edge of Google Cloud, as close as possible to the user and to malicious
traffic. This prevents malicious users or traffic from consuming resources or entering your
virtual private cloud (VPC) networks.
In this lab, we configure an HTTP Load Balancer with global backends, as shown in the diagram
below. Then we stress test the Load Balancer and denylist the stress-test IP with Cloud Armor.
Internal TCP/UDP Load Balancer
An Internal TCP/UDP Load Balancing service has a frontend (the forwarding rule) and a backend
(the backend service). You can use either instance groups or GCE_VM_IP zonal NEGs as
backends on the backend service. This example shows instance group backends.
Packet Mirroring with OpenSource IDS
Traffic Mirroring is a key feature in Google Cloud networking for security and network analysis.
Its functionality is similar to that of a network tap or a span session in traditional networking. In
short, Packet Mirroring captures network traffic (ingress and egress) from select "mirrored
sources", copies the traffic, and forwards the copy to "collectors".
It is important to note that Packet Mirroring captures the full payload of each packet and thus
consumes additional bandwidth. Because Packet Mirroring is not based on any sampling
period, it is able to be used for better troubleshooting, security solutions, and higher layer
application-based analysis.
Packet Mirroring is founded on a "Packet Mirroring Policy", which contains the following
attributes:
● Region
● VPC Network(s)
● Mirrored Source(s)
● Collector (destination)
● Mirrored traffic (filter)
Here are some key points that also need to be considered:
● Only TCP, UDP and ICMP traffic may be mirrored. This, however, should satisfy the
majority of use cases.
● "Mirrored Sources" and "Collectors" must be in the SAME region, but can be in different
zones and even different VPCs, as long as those VPCs are properly peered.
● Additional bandwidth charges apply, especially between zones. To limit the traffic being
mirrored, filters can be used.
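The filtering rules above can be sketched as a simple predicate; the source names and packet shapes are hypothetical:

```python
# Hypothetical sketch of a Packet Mirroring policy filter: only TCP, UDP and
# ICMP packets from the mirrored sources are copied to the collector.
MIRRORABLE_PROTOCOLS = {"TCP", "UDP", "ICMP"}

def mirror(packets, mirrored_sources):
    """Return copies of the packets that the policy would forward to collectors."""
    return [dict(p) for p in packets
            if p["src"] in mirrored_sources and p["proto"] in MIRRORABLE_PROTOCOLS]

packets = [
    {"src": "vm-web", "proto": "TCP"},
    {"src": "vm-web", "proto": "GRE"},   # not a mirrorable protocol
    {"src": "vm-db",  "proto": "UDP"},   # not a mirrored source
]
copies = mirror(packets, mirrored_sources={"vm-web"})
print(len(copies))  # 1 — only the TCP packet from vm-web is mirrored
```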
One prime use case for "Packet Mirroring" is to use it in an Intrusion Detection System (IDS)
solution. Some cloud-based IDS solutions require a special service to run on each source VM, or
to put an IDS virtual appliance in line between the network source and destination. Both of these
have significant implications. For example, the service-based solution, though fully distributed,
requires that the guest operating system supports the software. The "in-line" solution can create
a network bottleneck, as all traffic must be funnelled through the IDS appliance. The in-line
solution is also unable to capture "east-west" traffic between VMs in the same VPC.
Google Cloud Packet Mirroring does not require any additional software on the VMs and it is
fully distributed across each of the mirrored virtual machines. The "Collector" IDS is placed out-
of-path using an Internal Network Load Balancer (ILB) and will receive both "north-south" traffic
and "east-west" traffic.
Baseline: Infrastructure
The concept of a baseline as a complete picture of infrastructure has only become possible
because of cloud computing. It’s a lot like a map versus a photograph. A map is incomplete and
only focuses on certain features such as the exit numbers or street names. But a photograph
shows everything with all of its details.
Before the cloud, a traditional data centre was more of a map than a photograph. You could see
boxes and even how they are connected, but the data centre was still full of mystery. For
example, if you look at a switch, you have to read the procedural code configuring the switch to
understand what it is doing. But with the cloud, the infrastructure configuration is exposed and
configured via an API. Everything is discoverable and can be understood. Because of this, a
baseline is a 100% resolution picture of a cloud infrastructure environment that the industry has
never had before.
Cloud Storage
Cloud Storage allows worldwide storage and retrieval of any amount of data at any time. You
can use Cloud Storage for a range of scenarios including serving website content, storing data
for archival and disaster recovery, or distributing large data objects to users via direct
download.
● Standard Storage: Good for “hot” data that’s accessed frequently, including websites,
streaming videos, and mobile apps.
● Nearline Storage: Low cost. Good for data that can be stored for at least 30 days,
including data backup and long-tail multimedia content.
● Coldline Storage: Very low cost. Good for data that can be stored for at least 90 days,
including disaster recovery.
● Archive Storage: Lowest cost. Good for data that can be stored for at least 365 days,
including regulatory archives.
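Choosing among these classes usually comes down to how long data sits between accesses; a hypothetical helper (not an official API) might look like:

```python
# Illustrative helper mapping expected time between accesses to the
# storage classes described above; thresholds follow the minimum storage
# durations in the text.
def suggest_storage_class(days_between_accesses):
    if days_between_accesses >= 365:
        return "Archive"
    if days_between_accesses >= 90:
        return "Coldline"
    if days_between_accesses >= 30:
        return "Nearline"
    return "Standard"

print(suggest_storage_class(7))    # Standard: "hot", frequently accessed data
print(suggest_storage_class(45))   # Nearline: e.g. backups
print(suggest_storage_class(400))  # Archive: e.g. regulatory archives
```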
CLOUD IAM
Google Cloud's Identity and Access Management (IAM) service lets you create and manage
permissions for Google Cloud resources. Cloud IAM unifies access control for Google Cloud
services into a single system and provides a consistent set of operations.
Cloud IAM provides the right tools to manage resource permissions with minimum fuss and
high automation. You don't directly grant users permissions. Instead, you grant them roles,
which bundle one or more permissions. This allows you to map job functions within your
company to groups and roles. Users get access only to what they need to get the job done, and
admins can easily grant default permissions to entire groups of users.
● Predefined Roles
● Custom Roles
Predefined roles are created and maintained by Google. Their permissions are automatically
updated as necessary, such as when new features or services are added to Google Cloud.
Custom roles are user-defined, and allow you to bundle one or more supported permissions to
meet your specific needs. Custom roles are not maintained by Google; when new permissions,
features, or services are added to Google Cloud, your custom roles will not be updated
automatically. You create a custom role by combining one or more of the available Cloud IAM
permissions. Permissions allow users to perform specific actions on Google Cloud resources.
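The role-bundling model can be illustrated with a toy evaluator; the role and permission names below are made up for the example, not the real IAM catalogue:

```python
# Toy model of IAM-style bindings: users are granted roles, roles bundle
# permissions, and an action is allowed only if some granted role carries it.
ROLES = {
    "storage.viewer": {"storage.objects.get", "storage.objects.list"},
    "storage.admin":  {"storage.objects.get", "storage.objects.list",
                       "storage.objects.create", "storage.objects.delete"},
}

def has_permission(granted_roles, permission):
    """True if any granted role bundles the requested permission."""
    return any(permission in ROLES[r] for r in granted_roles)

print(has_permission(["storage.viewer"], "storage.objects.get"))     # True
print(has_permission(["storage.viewer"], "storage.objects.delete"))  # False
```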
CLOUD MONITORING
Cloud Monitoring provides visibility into the performance, uptime, and overall health of cloud-
powered applications. Cloud Monitoring collects metrics, events, and metadata from Google
Cloud, Amazon Web Services, hosted uptime probes, application instrumentation, and a variety
of common application components including Cassandra, Nginx, Apache Web Server,
Elasticsearch, and many others. Cloud Monitoring ingests that data and generates insights via
dashboards, charts, and alerts. Cloud Monitoring alerting helps you collaborate by integrating
with Slack, PagerDuty, HipChat, Campfire, and more.
SLO monitoring
Automatically infer or custom-define service-level objectives (SLOs) for applications and get
alerted when SLO violations occur. Check out our step-by-step guide to learn how to set SLOs,
following SRE best practices.
Managed metrics collection for Kubernetes and virtual machines
Google Cloud’s operations suite offers Managed Service for Prometheus for use with
Kubernetes, which features self-deployed and managed collection options to simplify
metrics collection, storage, and querying. For VMs, you can use the Ops Agent, which
combines logging and metrics collection into a single agent that can be deployed at
scale using popular configuration and management tools.
Google Cloud integration
Discover and monitor all Google Cloud resources and services, with no additional
instrumentation, integrated right into the Google Cloud console.
CLOUD FUNCTIONS
Cloud Functions is a serverless execution environment for building and connecting cloud
services. With Cloud Functions you write simple, single-purpose functions that are attached to
events emitted from your cloud infrastructure and services.
Your Cloud Function is triggered when an event being watched is fired. Your code executes in a
fully managed environment. There is no need to provision any infrastructure or worry about
managing any servers.
Cloud Functions are written in JavaScript and executed in a Node.js environment on Google
Cloud. You can take your Cloud Function and run it in any standard Node.js runtime, which
makes both portability and local testing a breeze.
Cloud Functions provides a connective layer of logic that lets you write code to connect and
extend cloud services. Listen and respond to a file upload to Cloud Storage, a log change, or an
incoming message on a Cloud Pub/Sub topic.
Cloud Functions augments existing cloud services and allows you to address an increasing
number of use cases with arbitrary programming logic. Cloud Functions have access to the
Google Service Account credential and are thus seamlessly authenticated with the majority of
Google Cloud services such as Datastore, Cloud Spanner, Cloud Translation API, Cloud Vision
API, as well as many others. In addition, Cloud Functions are supported by numerous Node.js
client libraries, which further simplify these integrations.
Cloud events are things that happen in your cloud environment. These might be things like
changes to data in a database, files added to a storage system, or a new virtual machine
instance being created.
Events occur whether or not you choose to respond to them. You create a response to an event
with a trigger. A trigger is a declaration that you are interested in a certain event or set of events.
Binding a function to a trigger allows you to capture and act on events. For more information on
creating triggers and associating them with your functions, see Google Cloud, Cloud Functions
Documentation, Events and Triggers.
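The event/trigger/function relationship can be sketched as a small simulation; this is an illustrative model, not the Cloud Functions runtime, and the event names are assumptions:

```python
# Minimal sketch of the relationship described above: a trigger declares
# interest in an event type and binds a function to it; events occur whether
# or not anything is bound, but only bound functions respond.
from collections import defaultdict

_triggers = defaultdict(list)

def bind(event_type, fn):
    """Create a trigger: run fn whenever event_type fires."""
    _triggers[event_type].append(fn)

def fire(event_type, payload):
    """Deliver an event to every function bound to its type."""
    return [fn(payload) for fn in _triggers[event_type]]

bind("storage.object.finalize", lambda e: f"processing upload {e['name']}")
print(fire("storage.object.finalize", {"name": "photo.jpg"}))
print(fire("vm.instance.created", {"name": "vm-1"}))  # no trigger bound: []
```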
Serverless
Cloud Functions remove the work of managing servers, configuring software, updating
frameworks, and patching operating systems. The software and infrastructure are fully
managed by Google so you just add code. Furthermore, the provisioning of resources
happens automatically in response to events. This means that a function can scale from
a few invocations a day to many millions of invocations without any work from you.
Use cases
Asynchronous workloads like lightweight ETL, or cloud automations like triggering application
builds now no longer need their own server and a developer to wire it up. You simply deploy a
Cloud Function bound to the event you want and you're done.
The fine-grained, on-demand nature of Cloud Functions also makes it a perfect candidate for
lightweight APIs and webhooks. In addition, the automatic provisioning of HTTP endpoints
when you deploy an HTTP Function means there is no complicated configuration required as
there is with some other services. See the following table for additional common Cloud
Functions use cases:
Data Processing / ETL: Listen and respond to Cloud Storage events such as when a file is
created, changed, or removed. Process images, perform video transcoding, validate and
transform data, and invoke any service on the Internet from your Cloud Function.
Webhooks: Via a simple HTTP trigger, respond to events originating from 3rd-party systems
like GitHub, Slack, Stripe, or from anywhere that can send HTTP requests.
Lightweight APIs: Compose applications from lightweight, loosely coupled bits of logic that
are quick to build and that scale instantly. Your functions can be event-driven or invoked
directly over HTTP/S.
Mobile Backend: Use Google's mobile platform for app developers, Firebase, and write your
mobile backend in Cloud Functions. Listen and respond to events from Firebase Analytics,
Realtime Database, Authentication, and Storage.
GOOGLE CLOUD Pub/Sub
Google Cloud Pub/Sub is a messaging service for exchanging event data among applications
and services. A producer of data publishes messages to a Cloud Pub/Sub topic. A consumer
creates a subscription to that topic. Subscribers either pull messages from a subscription or are
configured as webhooks for push subscriptions. Every subscriber must acknowledge each
message within a configurable window of time.
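A toy in-memory model of this flow, assuming simplified semantics (no acknowledgement deadlines or push delivery):

```python
# Toy model of the Pub/Sub flow described above: a publisher sends a message
# to a topic, every subscription receives its own copy, and subscribers pull
# messages from their subscription.
from collections import deque

class Topic:
    def __init__(self):
        self.subscriptions = {}

    def subscribe(self, name):
        """A consumer creates a subscription to the topic."""
        self.subscriptions[name] = deque()

    def publish(self, message):
        for queue in self.subscriptions.values():
            queue.append(message)  # each subscription gets its own copy

    def pull(self, name):
        queue = self.subscriptions[name]
        return queue.popleft() if queue else None

topic = Topic()
topic.subscribe("billing")
topic.subscribe("audit")
topic.publish({"event": "order.created", "id": 42})

print(topic.pull("billing"))  # {'event': 'order.created', 'id': 42}
print(topic.pull("audit"))    # the audit subscription gets its own copy
```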
Pub/Sub allows services to communicate asynchronously, with latencies on the order of 100
milliseconds.
Pub/Sub is used for streaming analytics and data integration pipelines to ingest and distribute
data. It's equally effective as a messaging-oriented middleware for service integration or as a
queue to parallelize tasks.
Publishers send events to the Pub/Sub service, without regard to how or when these events are
to be processed. Pub/Sub then delivers events to all the services that react to them. In systems
communicating through RPCs, publishers must wait for subscribers to receive the data.
However, the asynchronous integration in Pub/Sub increases the flexibility and robustness of
the overall system.
● Pub/Sub service: This messaging service is the default choice for most users and
applications. It offers the highest reliability and largest set of integrations, along with
automatic capacity management. Pub/Sub guarantees synchronous replication of all
data to at least two zones and best-effort replication to a third additional zone.
● Pub/Sub Lite service: A separate but similar messaging service built for lower cost. It
offers lower reliability compared to Pub/Sub. It offers either zonal or regional topic
storage. Zonal Lite topics are stored in only one zone. Regional Lite topics replicate data
to a second zone asynchronously. Also, Pub/Sub Lite requires you to pre-provision and
manage storage and throughput capacity. Consider Pub/Sub Lite only for applications
where achieving a low cost justifies some additional operational work and lower
reliability.
Common use cases
● Ingesting user interaction and server events: To use user interaction events from end-user
apps or server events from your system, you might forward them to Pub/Sub. You
can then use a stream processing tool, such as Dataflow, which delivers the events to
databases. Examples of such databases are BigQuery, Cloud Bigtable, and Cloud
Storage. Pub/Sub lets you gather events from many clients simultaneously.
● Parallel processing and workflows: You can efficiently distribute many tasks among
multiple workers by using Pub/Sub messages to connect to Cloud Functions. Examples
of such tasks are compressing text files, sending email notifications, evaluating AI
models, and reformatting images.
● Enterprise event bus: You can create an enterprise-wide real-time data sharing bus,
distributing business events, database updates, and analytics events across your
organization.
BASELINE: DATA, ML, AI
Data:
Big data is a combination of structured, semi-structured and unstructured data collected by
organizations that can be mined for information and used in machine
learning projects, predictive modelling and other advanced analytics applications.
Systems that process and store big data have become a common component of data
management architectures in organizations, combined with tools that support big data
analytics uses. Big data is often characterized by the three V's:
ML:
By studying and experimenting with machine learning, programmers test the limits of how much
they can improve the perception, cognition, and action of a computer system.
Deep learning, an advanced method of machine learning, goes a step further. Deep learning
models use large neural networks — networks that function like a human brain to logically
analyze data — to learn complex patterns and make predictions independent of human input.
AI:
Artificial Intelligence is the field of developing computers and robots that are capable of
behaving in ways that both mimic and go beyond human capabilities. AI-enabled programs can
analyze and contextualize data to provide information or automatically trigger actions without
human interference.
Today, artificial intelligence is at the heart of many technologies we use, including smart
devices and voice assistants such as Siri on Apple devices. Companies are incorporating
techniques such as natural language processing and computer vision — the ability for
computers to use human language and interpret images — to automate tasks, accelerate
decision making, and enable customer conversations with chatbots.
Vertex AI
Vertex AI is Google Cloud's next-generation, unified platform for machine learning
development and the successor to AI Platform, announced at Google I/O in May
2021. By developing machine learning solutions on Vertex AI, you can leverage the latest
pre-built ML components and AutoML to significantly enhance development productivity,
scale your workflows and decision-making with your data, and accelerate
time to value.
Features
A unified UI for the entire ML workflow:
Vertex AI brings together the Google Cloud services for building ML under one, unified UI
and API. In Vertex AI, you can now easily train and compare models using AutoML or
custom code training and all your models are stored in one central model repository.
These models can now be deployed to the same endpoints on Vertex AI.
Pre-trained APIs for vision, video, and natural language:
Easily infuse vision, video, translation, and natural language ML into existing applications
or build entirely new intelligent applications across a broad range of use cases
(including Translation and Speech-to-Text). AutoML enables developers to train high-quality
models specific to their business needs with minimal ML expertise or effort, backed by
a centrally managed registry for all datasets across data types (vision, natural language,
and tabular).
End-to-end integration for data and AI:
Through Vertex AI Workbench, Vertex AI is natively integrated with BigQuery, Dataproc
and Spark. You can use BigQuery ML to create and execute machine learning models in
BigQuery using standard SQL queries on existing business intelligence tools and
spreadsheets, or you can export datasets from BigQuery directly into Vertex AI
Workbench and run your models from there. Use Vertex Data Labeling to generate highly
accurate labels for your data collection.
Support for all open source frameworks:
Vertex AI integrates with widely used open source frameworks such as TensorFlow,
PyTorch, and scikit-learn, and supports all ML frameworks and artificial intelligence
branches via custom containers for training and prediction.
Figure 6 Vertex AI
DATAPREP
Dataprep by Trifacta is an intelligent data service for visually exploring, cleaning, and preparing
structured and unstructured data for analysis, reporting, and machine learning. Because
Dataprep is serverless and works at any scale, there is no infrastructure to deploy or manage.
Your next ideal data transformation is suggested and predicted with each UI input, so you don’t
have to write code.
Serverless simplicity
Dataprep is an integrated partner service operated by Trifacta and based on their industry-
leading data preparation solution. Google works closely with Trifacta to provide a seamless
user experience that removes the need for up-front software installation, separate licensing
costs, or ongoing operational overhead. Dataprep is fully managed and scales on demand to
meet your growing data preparation needs so you can stay focused on analysis.
GOOGLE CLOUD DATAFLOW TEMPLATE
Pub/Sub Subscription to BigQuery
The Pub/Sub Subscription to BigQuery template is a streaming pipeline that reads JSON-
formatted messages from a Pub/Sub subscription and writes them to a BigQuery table. You can
use the template as a quick solution to move Pub/Sub data to BigQuery. The template reads
JSON-formatted messages from Pub/Sub and converts them to BigQuery elements.
● The data field of Pub/Sub messages must use the JSON format, described in this JSON
guide. For example, messages with values in the data field formatted as {"k1":"v1",
"k2":"v2"} can be inserted into a BigQuery table with two columns, named k1 and k2, with
a string data type.
● The output table must exist prior to running the pipeline. The table schema must match
the input JSON objects.
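The conversion step the template performs can be sketched locally in Python: the data field of each Pub/Sub message is a JSON object, and each top-level key becomes a column in the resulting BigQuery row. The message payload below is hypothetical:

```python
import json

# Local sketch of the template's conversion step: parse the JSON-formatted
# data field of a Pub/Sub message into a row dict whose keys correspond to
# BigQuery column names.
def message_to_row(data: bytes) -> dict:
    """Parse a JSON-formatted Pub/Sub data payload into a BigQuery row dict."""
    row = json.loads(data.decode("utf-8"))
    if not isinstance(row, dict):
        raise ValueError("payload must be a JSON object")
    return row

row = message_to_row(b'{"k1": "v1", "k2": "v2"}')
print(row)  # {'k1': 'v1', 'k2': 'v2'}
```

In the real pipeline this mapping runs inside Dataflow, and the row must match the schema of the pre-existing output table.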
Template parameters
Parameter Description
DATAPROC
Dataproc is a fully managed service for running Apache Spark and Apache Hadoop
workloads on Google Cloud. You can build data applications connecting Dataproc to
BigQuery, Vertex AI, Cloud Spanner, Pub/Sub, or Data Fusion.
CLOUD NATURAL LANGUAGE API
Cloud Natural Language API lets you extract information about people, places, events,
and more mentioned in text documents, news articles, or blog posts. You can use it to
understand sentiment about your product on social media, or to parse intent from
customer conversations in a call center or a messaging app. You can even upload text
documents for analysis.
Syntax Analysis: Extract tokens and sentences, identify parts of speech (PoS) and create
dependency parse trees for each sentence.
Integrated REST API: Access via REST API. Text can be uploaded in the request or
integrated with Cloud Storage.
Features
AutoML
Train your own high-quality machine learning custom models to classify, extract, and
detect sentiment with minimum effort and machine learning expertise using Vertex AI for
natural language, powered by AutoML. You can use the AutoML UI to upload your training
data and test your custom model without a single line of code.
Natural Language API
The powerful pre-trained models of the Natural Language API empower developers to
easily apply natural language understanding (NLU) to their applications, with features
including sentiment analysis, entity analysis, entity sentiment analysis, content
classification, and syntax analysis.
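As a sketch, a sentiment-analysis request to the Natural Language API's `documents:analyzeSentiment` REST method carries a body like the one below. The sample text is made up, and actually sending it would additionally require an authenticated HTTP call:

```python
import json

# Sketch of a request body for the Natural Language API's
# documents:analyzeSentiment REST method. Only the payload is built here;
# no network call is made.
request_body = {
    "document": {
        "type": "PLAIN_TEXT",
        "content": "The new release is fast and the UI is delightful.",
    },
    "encodingType": "UTF8",
}

payload = json.dumps(request_body)
print("PLAIN_TEXT" in payload)  # True
```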
Healthcare Natural Language AI
Gain real-time analysis of insights stored in unstructured medical text. The Healthcare
Natural Language API allows you to distil machine-readable medical insights from
medical documents, while AutoML Entity Extraction for Healthcare makes it simple to
build custom knowledge extraction models for healthcare and life sciences apps, with
no coding skills required.
REINFORCEMENT LEARNING
Like many other areas of machine learning research, reinforcement learning (RL) is evolving at
breakneck speed. Just as they have done in other research areas, researchers are leveraging
deep learning to achieve state-of-the-art results.
In particular, reinforcement learning has significantly outperformed prior ML techniques in game
playing, reaching human-level and even world-best performance on Atari games, beating the
human Go champion, and showing promising results in more difficult games like StarCraft II.
Due to its generality, reinforcement learning is studied in many disciplines, such as game
theory, control theory, operations research, information theory, simulation-based
optimization, multi-agent systems, swarm intelligence, and statistics. In the operations research
and control literature, reinforcement learning is called approximate dynamic
programming, or neuro-dynamic programming. The problems of interest in reinforcement
learning have also been studied in the theory of optimal control, which is concerned mostly with
the existence and characterization of optimal solutions, and algorithms for their exact
computation, and less with learning or approximation, particularly in the absence of a
mathematical model of the environment. In economics and game theory, reinforcement learning
may be used to explain how equilibrium may arise under bounded rationality.
The purpose of reinforcement learning is for the agent to learn an optimal, or nearly-optimal,
policy that maximizes the "reward function" or other user-provided reinforcement signal that
accumulates from the immediate rewards. This is similar to processes that appear to occur in
animal psychology. For example, biological brains are hardwired to interpret signals such as
pain and hunger as negative reinforcements, and interpret pleasure and food intake as positive
reinforcements. In some circumstances, animals can learn to engage in behaviors that optimize
these rewards. This suggests that animals are capable of reinforcement learning.
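The reward-maximizing policy learning described above can be sketched with tabular Q-learning on a toy problem. The five-state corridor environment and the hyperparameters below are illustrative, not from the report:

```python
import random

# Minimal tabular Q-learning sketch on a toy 5-state corridor: the agent
# starts at state 0 and receives a reward of +1 only on reaching state 4.
random.seed(0)

N_STATES, ACTIONS = 5, (1, -1)           # move right (+1) or left (-1)
alpha, gamma, epsilon = 0.5, 0.9, 0.1    # learning rate, discount, exploration
Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}

for episode in range(200):
    s = 0
    while s != N_STATES - 1:
        # Epsilon-greedy action selection (ties break toward moving right).
        a = random.choice(ACTIONS) if random.random() < epsilon \
            else max(ACTIONS, key=lambda a: Q[(s, a)])
        s_next = min(max(s + a, 0), N_STATES - 1)
        r = 1.0 if s_next == N_STATES - 1 else 0.0
        # Q-learning update: bootstrap from the best next-state value.
        best_next = max(Q[(s_next, a2)] for a2 in ACTIONS)
        Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])
        s = s_next

# After training, the greedy policy moves right in every non-terminal state.
policy = [max(ACTIONS, key=lambda a: Q[(s, a)]) for s in range(N_STATES - 1)]
print(policy)  # [1, 1, 1, 1]
```

The immediate reward appears only at the goal, yet the discounted update propagates value backward until every state prefers the action leading toward it, which is exactly the accumulation of immediate rewards into a policy described above.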
VIDEO INTELLIGENCE
Google Cloud Video Intelligence makes videos searchable and discoverable by extracting
metadata with an easy-to-use REST API. You can now search every moment of every video file
in your catalogue. It quickly annotates videos stored in Cloud Storage and helps you identify key
entities (nouns) within your video, along with when they occur. Separate signal from
noise by retrieving relevant information from the entire video, shot by shot, or frame by frame.
The Video Intelligence API allows developers to use Google video analysis technology as part of
their applications. The REST API enables users to annotate videos stored locally or in Cloud
Storage, or live-streamed, with contextual information at the level of the entire video, per
segment, per shot, and per frame.
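A request to the Video Intelligence API's `videos:annotate` REST method carries a body like the sketch below. The Cloud Storage URI is hypothetical, and an authenticated call would be needed to actually submit the request:

```python
import json

# Sketch of a request body for the Video Intelligence API's
# videos:annotate REST method. Only the payload is built here;
# no network call is made.
request_body = {
    "inputUri": "gs://my-bucket/sample-video.mp4",
    "features": ["LABEL_DETECTION", "SHOT_CHANGE_DETECTION"],
}

payload = json.dumps(request_body)
print("LABEL_DETECTION" in payload)  # True
```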
Features:
Precise video analysis
Recognize over 20,000 objects, places, and actions in stored and streaming video.
Extract rich metadata at the video, shot, or frame level. Create your own custom entity
labels with AutoML Video Intelligence.
Simplify media management
Search your video catalogue the same way you search documents. Extract metadata that
can be used to index, organize, and search your video content, as well as control and
filter content for what’s most relevant.
Easily create intelligent video apps
Gain insights from video in near real-time using streaming video annotation and trigger
events based on objects detected. Build engaging customer experiences with highlight
reels, recommendations, and more.
Conclusion
This training was really helpful in understanding Google Cloud infrastructure. From basic labs
to advanced AI & ML labs, it can be concluded that Google's cloud infrastructure can be used
for almost any task, whether big data analytics or building a server for your institution. Google
Cloud is safe and secure; one only needs high-speed internet to reach Google's servers and
perform tasks.
The Google Cloud Console is a highly interactive platform where you can access resources
using the API. With the Cloud Console you do not need to write commands for your tasks;
you can simply click an icon and the cloud does the work for you. Whether you are creating
an instance or running a big data analysis, you have an interactive platform with built-in
features, and tasks can be completed in a few clicks by providing the required details in the
given section.
This platform is secure, as you and only you can access your Google Cloud resources by
logging in to your Google account.
Certificates