
Professional Data Engineer on Google Cloud Platform

Google Professional-Data-Engineer

Version Demo

Total Demo Questions: 10

Total Premium Questions: 173


Buy Premium PDF

https://dumpsarena.com

[email protected]
QUESTION NO: 1

You are building a model to predict whether or not it will rain on a given day. You have thousands of input features and want
to see if you can improve training speed by removing some features while having a minimal effect on model accuracy.
What can you do?

A. Eliminate features that are highly correlated to the output labels.

B. Combine highly co-dependent features into one representative feature.

C. Instead of feeding in each feature individually, average their values in batches of 3.

D. Remove the features that have null values for more than 50% of the training records.

ANSWER: B
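
For reference, a minimal Python sketch of the idea in option B, combining a group of highly co-dependent features into one representative feature with PCA; the file and column names are hypothetical.

import pandas as pd
from sklearn.decomposition import PCA

# Hypothetical training data with several highly correlated humidity readings.
df = pd.read_csv("weather_features.csv")
correlated = ["humidity_9am", "humidity_noon", "humidity_3pm"]

# Collapse the co-dependent columns into a single representative feature.
pca = PCA(n_components=1)
df["humidity_combined"] = pca.fit_transform(df[correlated]).ravel()

# Train on fewer inputs by dropping the original columns.
df = df.drop(columns=correlated)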

QUESTION NO: 2

Your company is running their first dynamic campaign, serving different offers by analyzing real-time data during the holiday
season. The data scientists are collecting terabytes of data that rapidly grows every hour during their 30-day campaign. They
are using Google Cloud Dataflow to preprocess the data and collect the feature (signals) data that is needed for the machine
learning model in Google Cloud Bigtable. The team is observing suboptimal performance with reads and writes of their initial
load of 10 TB of data. They want to improve this performance while minimizing cost. What should they do?

A. Redefine the schema by evenly distributing reads and writes across the row space of the table.

B. The performance issue should be resolved over time as the size of the Bigtable cluster is increased.

C. Redesign the schema to use a single row key to identify values that need to be updated frequently in the cluster.

D. Redesign the schema to use row keys based on numeric IDs that increase sequentially per user viewing the offers.

ANSWER: A
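
For reference, a minimal Python sketch of the row-key idea behind option A, spreading writes across the key space instead of using monotonically increasing IDs; the project, instance, table, and column-family names are hypothetical.

import hashlib
from google.cloud import bigtable

client = bigtable.Client(project="my-project")
table = client.instance("campaign-instance").table("offer_signals")

def make_row_key(user_id: str, event_ts: str) -> bytes:
    # A short hash prefix keeps keys for consecutive users on different
    # tablet ranges, avoiding hotspots during the 10 TB initial load.
    prefix = hashlib.md5(user_id.encode()).hexdigest()[:4]
    return f"{prefix}#{user_id}#{event_ts}".encode()

row = table.direct_row(make_row_key("user-42", "2024-12-01T10:15:00Z"))
row.set_cell("signals", "offer_viewed", "offer-123")
row.commit()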

QUESTION NO: 3

You used Cloud Dataprep to create a recipe on a sample of data in a BigQuery table. You want to reuse this recipe on a
daily upload of data with the same schema, after the load job with variable execution time completes. What should you do?

A. Create a cron schedule in Cloud Dataprep.

B. Create an App Engine cron job to schedule the execution of the Cloud Dataprep job.

C. Export the recipe as a Cloud Dataprep template, and create a job in Cloud Scheduler.

D. Export the Cloud Dataprep job as a Cloud Dataflow template, and incorporate it into a Cloud Composer job.

ANSWER: C
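
For reference, a hedged Python sketch of the mechanics behind option D, launching an exported Cloud Dataflow template once the variable-length load job has completed; the job, bucket, and template names are hypothetical.

import subprocess

# Launch the exported Dataflow template; in option D this call would be
# wrapped in a Cloud Composer task that runs after the load job finishes.
subprocess.run(
    [
        "gcloud", "dataflow", "jobs", "run", "daily-dataprep-run",
        "--gcs-location=gs://my-bucket/templates/dataprep_recipe_template",
        "--region=us-central1",
    ],
    check=True,
)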

QUESTION NO: 4

You have data stored in BigQuery. The data in the BigQuery dataset must be highly available. You need to define a
storage, backup, and recovery strategy for this data that minimizes cost. How should you configure the BigQuery table?

A. Set the BigQuery dataset to be regional. In the event of an emergency, use a point-in-time snapshot to recover the data.

B. Set the BigQuery dataset to be regional. Create a scheduled query to make copies of the data to tables suffixed with the
time of the backup. In the event of an emergency, use the backup copy of the table.

C. Set the BigQuery dataset to be multi-regional. In the event of an emergency, use a point-in-time snapshot to recover the
data.

D. Set the BigQuery dataset to be multi-regional. Create a scheduled query to make copies of the data to tables suffixed with
the time of the backup. In the event of an emergency, use the backup copy of the table.

ANSWER: B
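
For reference, a minimal Python sketch of the backup step in options B and D, copying the table to a time-suffixed backup table; in practice this would run as a BigQuery scheduled query, and the project, dataset, and table names are hypothetical.

from datetime import datetime, timezone
from google.cloud import bigquery

client = bigquery.Client()
suffix = datetime.now(timezone.utc).strftime("%Y%m%d")
source = "my-project.analytics.events"
backup = f"my-project.analytics_backup.events_{suffix}"   # backup dataset assumed to exist

# Copy the table and wait for the copy job to finish.
client.copy_table(source, backup).result()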

QUESTION NO: 5

You want to migrate an on-premises Hadoop system to Cloud Dataproc. Hive is the primary tool in use, and the data format
is Optimized Row Columnar (ORC). All ORC files have been successfully copied to a Cloud Storage bucket. You need to
replicate some data to the cluster’s local Hadoop Distributed File System (HDFS) to maximize performance. What are two
ways to start using Hive in Cloud Dataproc? (Choose two.)

A. Run the gsutil utility to transfer all ORC files from the Cloud Storage bucket to HDFS. Mount the Hive tables locally.

B. Run the gsutil utility to transfer all ORC files from the Cloud Storage bucket to any node of the Dataproc cluster. Mount
the Hive tables locally.

C. Run the gsutil utility to transfer all ORC files from the Cloud Storage bucket to the master node of the Dataproc cluster.
Then run the Hadoop utility to copy them to HDFS. Mount the Hive tables from HDFS.

D. Leverage Cloud Storage connector for Hadoop to mount the ORC files as external Hive tables. Replicate external Hive
tables to the native ones.

E. Load the ORC files into BigQuery. Leverage BigQuery connector for Hadoop to mount the BigQuery tables as external
Hive tables. Replicate external Hive tables to the native ones.

ANSWER: B C
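
For reference, a hedged Python sketch of the copy step described in option C, run on the Dataproc master node; the bucket and paths are hypothetical.

import os
import subprocess

# Pull the ORC files from Cloud Storage onto the master node's local disk...
os.makedirs("/tmp/orc", exist_ok=True)
subprocess.run(["gsutil", "-m", "cp", "gs://my-bucket/orc/*.orc", "/tmp/orc/"], check=True)

# ...then push them into HDFS so the Hive tables can be mounted from there.
subprocess.run(["hadoop", "fs", "-mkdir", "-p", "/warehouse/orc"], check=True)
subprocess.run(["hadoop", "fs", "-put", "/tmp/orc/", "/warehouse/orc/"], check=True)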

QUESTION NO: 6

Your company produces 20,000 files every hour. Each data file is formatted as a comma separated values (CSV) file that is
less than 4 KB. All files must be ingested on Google Cloud Platform before they can be processed. Your company site has a
200 ms latency to Google Cloud, and your Internet connection bandwidth is limited to 50 Mbps. You currently deploy a
secure FTP (SFTP) server on a virtual machine in Google Compute Engine as the data ingestion point. A local SFTP client
runs on a dedicated machine to transmit the CSV files as is. The goal is to make reports with data from the previous day
available to the executives by 10:00 a.m. each day. This design is barely able to keep up with the current volume, even
though the bandwidth utilization is rather low.

You are told that due to seasonality, your company expects the number of files to double for the next three months. Which
two actions should you take? (Choose two.)

A. Introduce data compression for each file to increase the rate of file transfer.

B. Contact your internet service provider (ISP) to increase your maximum bandwidth to at least 100 Mbps.

C. Redesign the data ingestion process to use gsutil tool to send the CSV files to a storage bucket in parallel.

D. Assemble 1,000 files into a tape archive (TAR) file. Transmit the TAR files instead, and disassemble the CSV files in the
cloud upon receiving them.

E. Create an S3-compatible storage endpoint in your network, and use Google Cloud Storage Transfer Service to transfer
on-premises data to the designated storage bucket.

ANSWER: C E
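
For reference, a minimal Python sketch of option C, replacing the SFTP hop with a parallel gsutil copy into a Cloud Storage bucket; the local path and bucket name are hypothetical.

import subprocess

# The -m flag uploads the many small CSV files concurrently, which matters
# far more here than raw bandwidth, given the 200 ms round-trip latency.
subprocess.run(
    ["gsutil", "-m", "cp", "-r", "/data/outgoing/csv/", "gs://ingest-bucket/incoming/"],
    check=True,
)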

QUESTION NO: 7

You are running a pipeline in Cloud Dataflow that receives messages from a Cloud Pub/Sub topic and writes the results to a
BigQuery dataset in the EU. Currently, your pipeline is located in europe-west4 and has a maximum of 3 workers, instance
type n1-standard-1. You notice that during peak periods, your pipeline is struggling to process records in a timely fashion,
when all 3 workers are at maximum CPU utilization. Which two actions can you take to increase performance of your
pipeline? (Choose two.)

A. Increase the number of max workers

B. Use a larger instance type for your Cloud Dataflow workers

C. Change the zone of your Cloud Dataflow pipeline to run in us-central1

D. Create a temporary table in Cloud Bigtable that will act as a buffer for new data. Create a new step in your pipeline to
write to this table first, and then create a new pipeline to write from Cloud Bigtable to BigQuery

E. Create a temporary table in Cloud Spanner that will act as a buffer for new data. Create a new step in your pipeline to
write to this table first, and then create a new pipeline to write from Cloud Spanner to BigQuery

ANSWER: B E
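
For reference, a hedged Python sketch of the tuning knobs named in options A and B, expressed as pipeline options for an Apache Beam pipeline on Dataflow; the project, bucket, and machine type are hypothetical.

from apache_beam.options.pipeline_options import PipelineOptions

# Raise the autoscaling ceiling and use a larger worker machine type,
# then pass these options to beam.Pipeline(options=options).
options = PipelineOptions([
    "--runner=DataflowRunner",
    "--project=my-project",
    "--region=europe-west4",
    "--temp_location=gs://my-bucket/tmp",
    "--max_num_workers=10",
    "--worker_machine_type=n1-standard-4",
])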

QUESTION NO: 8

You need to create a data pipeline that copies time-series transaction data so that it can be queried from within BigQuery by
your data science team for analysis. Every hour, thousands of transactions are updated with a new status. The size of the
initial dataset is 1.5 PB, and it will grow by 3 TB per day. The data is heavily structured, and your data science team will
build machine learning models based on this data. You want to maximize performance and usability for your data science
team. Which two strategies should you adopt? (Choose two.)

A. Denormalize the data as much as possible.

B. Preserve the structure of the data as much as possible.

C. Use BigQuery UPDATE to further reduce the size of the dataset.

D. Develop a data pipeline where status updates are appended to BigQuery instead of updated.

E. Copy a daily snapshot of transaction data to Cloud Storage and store it as an Avro file. Use BigQuery’s support for
external data sources to query.

ANSWER: D E
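
For reference, a hedged Python sketch of option D, appending status changes as new rows and reading the latest status per transaction at query time; the table and column names are hypothetical.

from google.cloud import bigquery

client = bigquery.Client()
latest_status_sql = """
SELECT * EXCEPT(rn)
FROM (
  SELECT *,
         ROW_NUMBER() OVER (PARTITION BY transaction_id ORDER BY updated_at DESC) AS rn
  FROM `my-project.payments.transaction_status_events`
)
WHERE rn = 1
"""
# Each status change is appended as a new row; this query returns only the
# most recent row per transaction.
rows = client.query(latest_status_sql).result()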

QUESTION NO: 9

You want to process payment transactions in a point-of-sale application that will run on Google Cloud Platform. Your user
base could grow exponentially, but you do not want to manage infrastructure scaling.

Which Google database service should you use?

A. Cloud SQL

B. BigQuery

C. Cloud Bigtable

D. Cloud Datastore

ANSWER: A

QUESTION NO: 10

You decided to use Cloud Datastore to ingest vehicle telemetry data in real time. You want to build a storage system that will
account for the long-term data growth, while keeping the costs low. You also want to create snapshots of the data
periodically, so that you can make a point-in-time (PIT) recovery, or clone a copy of the data for Cloud Datastore in a
different environment. You want to archive these snapshots for a long time. Which two methods can accomplish this?
(Choose two.)

A. Use managed export, and store the data in a Cloud Storage bucket using Nearline or Coldline class.

B. Use managed export, and then import to Cloud Datastore in a separate project under a unique namespace reserved for
that export.

C. Use managed export, and then import the data into a BigQuery table created just for that export, and delete temporary
export files.

D. Write an application that uses Cloud Datastore client libraries to read all the entities. Treat each entity as a BigQuery table
row via BigQuery streaming insert. Assign an export timestamp for each export, and attach it as an extra column for each
row. Make sure that the BigQuery table is partitioned using the export timestamp column.

E. Write an application that uses Cloud Datastore client libraries to read all the entities. Format the exported data into a
JSON file. Apply compression before storing the data in Cloud Source Repositories.

ANSWER: C E
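
For reference, a hedged Python sketch of the managed-export step shared by options A and C: export Datastore entities to Cloud Storage, then load one kind's export metadata file into BigQuery; the bucket, kind, and paths are hypothetical.

import subprocess
from google.cloud import bigquery

# Managed export to a Cloud Storage bucket (for option A the bucket would
# use the Nearline or Coldline storage class).
subprocess.run(
    ["gcloud", "datastore", "export", "gs://telemetry-archive/exports/2024-12-01"],
    check=True,
)

# For option C, load the export into a BigQuery table created for this snapshot.
client = bigquery.Client()
job_config = bigquery.LoadJobConfig(
    source_format=bigquery.SourceFormat.DATASTORE_BACKUP
)
client.load_table_from_uri(
    "gs://telemetry-archive/exports/2024-12-01/all_namespaces/kind_Telemetry/"
    "all_namespaces_kind_Telemetry.export_metadata",
    "my-project.telemetry.snapshot_2024_12_01",
    job_config=job_config,
).result()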
