Question 8
TerramEarth's 20 million vehicles are scattered around the world. Based on the vehicle's location, its telemetry data
is stored in a Google Cloud Storage (GCS) regional bucket (US, Europe, or Asia). The CTO has asked you to run a
report on the raw telemetry data to determine why vehicles are breaking down after 100K miles. You want to run
this job on all the data.
What is the most cost-effective way to run this job?
A. Move all the data into 1 zone, then launch a Cloud Dataproc cluster to run the job
B. Move all the data into 1 region, then launch a Google Cloud Dataproc cluster to run the job
C. Launch a cluster in each region to preprocess and compress the raw data, then move the data into a multi-
region bucket and use a Dataproc cluster to finish the job
D. Launch a cluster in each region to preprocess and compress the raw data, then move the data into a region
bucket and use a Cloud Dataproc cluster to finish the job
Comments
One thing is certain here: moving or copying data between continents costs money, so compressing the data before
copying it to another region or continent makes sense.
Preprocessing also makes sense, because we probably want to reduce the data to smaller chunks first (remember the
100K mileage criterion).
That leaves the type of target bucket: multi-region or regional? A multi-region bucket is good for high availability and low
latency at a slightly higher cost, but the question does not require either of those features.
So I think a regional bucket is the right choice, since it costs less.
My answer is D.
upvoted 66 times
Moving data from one region to another incurs network egress cost; compressing the data before moving it reduces that
cost. Running Dataproc for preprocessing in each region incurs additional cost, but it also reduces the cost of the final
Dataproc job, which then runs on the smaller preprocessed data. That saving offsets the additional cost of the regional
clusters.
upvoted 1 times
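To make the egress argument in the comments concrete, here is a rough back-of-the-envelope sketch. All figures are assumptions for illustration only: the per-region data sizes, the 5x compression ratio, and the 0.10 USD/GB transfer price are hypothetical and are not taken from the question or from Google's price list. The function names are also made up for this example.

```python
def egress_cost_usd(size_gb: float, price_per_gb: float) -> float:
    """Cost of moving size_gb out of a region at a flat per-GB price."""
    return size_gb * price_per_gb


def compare_transfer_costs(raw_gb_by_region, target_region,
                           compression_ratio, price_per_gb):
    """Return (cost of moving raw data, cost of moving compressed data).

    Only data stored outside target_region has to be moved.
    """
    moved_gb = sum(gb for region, gb in raw_gb_by_region.items()
                   if region != target_region)
    raw_cost = egress_cost_usd(moved_gb, price_per_gb)
    compressed_cost = egress_cost_usd(moved_gb / compression_ratio, price_per_gb)
    return raw_cost, compressed_cost


# Hypothetical data volumes per regional bucket, in GB.
raw_gb_by_region = {"us": 300_000.0, "europe": 200_000.0, "asia": 100_000.0}

raw_cost, compressed_cost = compare_transfer_costs(
    raw_gb_by_region,
    target_region="us",      # assumed destination region
    compression_ratio=5.0,   # assumed: preprocessing shrinks data 5x
    price_per_gb=0.10,       # assumed flat inter-region price, USD/GB
)

print(f"move raw data:        {raw_cost:,.2f} USD")
print(f"move compressed data: {compressed_cost:,.2f} USD")
print(f"transfer saving:      {raw_cost - compressed_cost:,.2f} USD")
```

Options C and D only come out ahead if this transfer saving, plus the cheaper final job, exceeds what the per-region preprocessing clusters cost, which is the trade-off the second comment describes.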