Question 1
Your task is to predict whether a person suffers from a disease by setting up a binary classification
model. Your solution needs to detect the classification errors that may appear.
Given the description below, which error type applies?
“A person suffers from a disease. Your model classifies the case as having no disease.”
1 / 1 point
True negatives
False positives
False negatives
True positives
Correct
A false negative is an outcome where the model incorrectly predicts the negative class.
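As a minimal sketch of how the four outcome types are counted, the snippet below tallies them from two label lists. The data is made up for illustration, with 1 meaning "has the disease", and the helper name confusion_counts is hypothetical.

```python
def confusion_counts(y_true, y_pred):
    """Count TP, FP, FN, TN for binary labels where 1 is the positive class."""
    tp = fp = fn = tn = 0
    for actual, predicted in zip(y_true, y_pred):
        if actual == 1 and predicted == 1:
            tp += 1  # disease present, model says disease
        elif actual == 0 and predicted == 1:
            fp += 1  # no disease, model says disease
        elif actual == 1 and predicted == 0:
            fn += 1  # disease present, model says no disease
        else:
            tn += 1  # no disease, model says no disease
    return {"tp": tp, "fp": fp, "fn": fn, "tn": tn}


# Illustrative labels only (not from the quiz).
y_true = [1, 1, 0, 0, 1]
y_pred = [1, 0, 0, 1, 0]
counts = confusion_counts(y_true, y_pred)
```

The case in the question (disease present, predicted as no disease) lands in the "fn" bucket.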
2.
Question 2
As a senior data scientist, you need to evaluate a binary classification machine learning model.
You have to use precision as the evaluation metric. Which visualization is the most
appropriate?
0 / 1 point
Gradient descent
Scatter plot
Violin plot
Incorrect
Try going back to Train and evaluate Classification models.
3.
Question 3
In order to do a multi-class classification using an unbalanced training dataset, you have to apply
C-Support Vector classification. You use the following Python code for the C-Support Vector
classification:
from sklearn.svm import svc
import numpy as np
model1 = svc.fit(X_train, y)
Your task is to evaluate the C-Support Vector classification code. What is the
most appropriate evaluation statement?
0 / 1 point
Incorrect
Try going back and reviewing Train and Evaluate Classification Models.
4.
Question 4
Your task is to create and evaluate a model. One of the metrics is an absolute metric in the
same unit as the label.
0 / 1 point
Incorrect
Try going back and reviewing Explore & Analyse Data with Python.
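The answer options for this question are not shown in the source. Root mean squared error (RMSE) is one metric that fits the description of an absolute metric in the label's unit, and a minimal sketch of it follows. The price values are invented for illustration.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error: same unit as the label being predicted."""
    squared_errors = [(a - p) ** 2 for a, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Illustrative labels in some currency unit (not from the quiz).
actual_prices = [100.0, 150.0, 200.0]
predicted_prices = [110.0, 140.0, 190.0]
error = rmse(actual_prices, predicted_prices)
```

Here every prediction is off by 10, so the error comes out as 10 in the same unit as the prices.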
5.
Question 5
Python is known for offering extensive functionality through powerful statistical and
numerical libraries. What is TensorFlow used for?
1 / 1 point
Correct
TensorFlow supplies machine learning and deep learning capabilities.
6.
Question 6
From the list below, choose the evaluation metric that is described as a relative metric where the
higher the value, the better the fit of the model.
1 / 1 point
Correct
This is the evaluation metric described. In essence, this metric represents how much of the
variance between predicted and actual label values the model is able to explain.
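The selected option is not shown in the source, but the description matches the coefficient of determination (R-squared). A minimal sketch of that calculation, using made-up values, might look like this:

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - (residual variation / total variation)."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((a - p) ** 2 for a, p in zip(y_true, y_pred))
    ss_tot = sum((a - mean_true) ** 2 for a in y_true)
    return 1 - ss_res / ss_tot


# Illustrative values only (not from the quiz).
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.0, 2.0, 3.0, 5.0]
score = r_squared(y_true, y_pred)
perfect = r_squared(y_true, y_true)
```

A perfect fit gives 1.0, and the score drops as the predictions explain less of the variation in the labels.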
7.
Question 7
If you use the sklearn.metrics classification report to evaluate how your model performs, what
does the F1-Score metric tell you?
1 / 1 point
How many instances of this class are there in the test dataset
An average metric that takes both precision and recall into account.
Of the predictions the model made for this class, what proportion were correct
Out of all of the instances of this class in the test dataset, how many did the model identify
Correct
This is what the F1-Score provides.
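As a rough illustration of how the F1-Score combines precision and recall (it is their harmonic mean), the sketch below computes it from raw counts. The counts and the helper name f1_from_counts are assumptions for the example.

```python
def f1_from_counts(tp, fp, fn):
    """Harmonic mean of precision and recall, computed from confusion counts."""
    precision = tp / (tp + fp)  # of the predictions for this class, how many were right
    recall = tp / (tp + fn)     # of the actual instances of this class, how many were found
    return 2 * precision * recall / (precision + recall)


# Illustrative counts: precision is 0.8 and recall is 0.5.
score = f1_from_counts(tp=8, fp=2, fn=8)
```

Because it is a harmonic mean, the result sits closer to the lower of the two values than a plain average would.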
8.
Question 8
The K-Means clustering algorithm is associated with which of the following machine learning
types?
1 / 1 point
Reinforcement learning
Correct
Clustering is a form of unsupervised machine learning in which the training data does not include
known labels.
9.
Question 9
You configured a deep neural network to train a classification model that predicts, based on
8 numeric features, which class an observation belongs to.
From the list below, which statement is true of the network architecture?
1 / 1 point
Correct
The output layer should contain a node for each possible class value.
10.
Question 10
Which of the layer types below is a principal layer type that extracts important features from
images and works by applying a filter to them?
0 / 1 point
Flattening layer
Pooling layer
Convolutional layer
Incorrect
Try going back and reviewing Train a deep neural network.
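The operation described in the question, sliding a filter over an image, is what a convolutional layer performs. A minimal pure-Python sketch, assuming no padding and a stride of 1 and using a made-up image and kernel, could look like this:

```python
def convolve2d(image, kernel):
    """Slide the kernel over the image (no padding, stride 1) and sum the products.

    Like most deep learning libraries, this does not flip the kernel,
    so strictly speaking it is a cross-correlation.
    """
    k_rows, k_cols = len(kernel), len(kernel[0])
    out_rows = len(image) - k_rows + 1
    out_cols = len(image[0]) - k_cols + 1
    output = []
    for r in range(out_rows):
        row = []
        for c in range(out_cols):
            total = 0
            for i in range(k_rows):
                for j in range(k_cols):
                    total += image[r + i][c + j] * kernel[i][j]
            row.append(total)
        output.append(row)
    return output


# Illustrative 3x4 image with a vertical edge between columns 1 and 2.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
kernel = [[-1, 1]]  # hypothetical 1x2 filter that responds to left-to-right increases
feature_map = convolve2d(image, kernel)
```

The resulting feature map is non-zero only where the edge sits, which is the sense in which the filter "extracts" a feature.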
11.
Question 11
The company that you work for decides to expand its use of machine learning. The company
decides not to set up another compute environment in Azure. At the moment, you have the
compute environments below at your disposal.
0 / 1 point
1 mlc_cluster, 2 aks_cluster
1 nb_server, 2 aks_cluster
1 mlc_cluster, 2 nb_server
1 nb_server, 2 mlc_cluster
Incorrect
Try going back and reviewing Work with Compute in Azure Machine Learning.
12.
Question 12
You have a set of CSV files that contain sales records. Your CSV files follow an identical data
schema.
Each CSV file holds the sales records for a certain month and is named sales.csv.
Each file is stored in a folder that indicates the month and year in which the data was
recorded. A datastore has been set up in an Azure Machine Learning workspace for the
folders, which are kept in an Azure blob container. The parent folder named sales contains the
folders, organized in the hierarchical structure below:
/sales
    /01-2019
        /sales.csv
    /02-2019
        /sales.csv
    /03-2019
        /sales.csv
A new folder containing a month’s sales is added to the sales folder every time that month has
ended. You want to train a machine learning model by using the sales data while complying with
the requirements below:
- A dataset must load all of the sales data to date into a structure that can easily be
converted to a dataframe.
- You have to ensure that experiments can be run using only the data created up to a specific
previous month, disregarding any data added after the selected month.
- You have to keep the number of registered datasets to the minimum possible.
Considering that the sales data has to be registered as a dataset in the Azure Machine Learning
service workspace, what actions should you take?
1 / 1 point
Create a tabular dataset that references the datastore and explicitly specifies each
'sales/mm-yyyy/sales.csv' file. Register the dataset with the name sales_dataset each month as a
new version and with a tag named month indicating the month and year it was registered. Use this
dataset for all experiments, identifying the version to be used based on the month tag as
necessary.
Create a tabular dataset that references the datastore and explicitly specifies each
'sales/mm-yyyy/sales.csv' file every month. Register the dataset with the name sales_dataset each
month, replacing the existing dataset and specifying a tag named month indicating the month and
year it was registered. Use this dataset for all experiments.
Create a new tabular dataset that references the datastore and explicitly specifies each
'sales/mm-yyyy/sales.csv' file every month. Register the dataset with the name
sales_dataset_MM-YYYY each month with appropriate MM and YYYY values for the month and
year. Use the appropriate month-specific dataset for experiments.
Create a tabular dataset that references the datastore and specifies the path 'sales/*/sales.csv',
register the dataset with the name sales_dataset and a tag named month indicating the month
and year it was registered, and use this dataset for all experiments.
Correct
This is the correct approach to this scenario.
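To illustrate how a wildcard path such as 'sales/*/sales.csv' differs from listing each file explicitly, the sketch below uses Python's standard fnmatch module as a stand-in. This is only an analogy: it is not how Azure Machine Learning resolves dataset paths, and the "04-2019" folder and "readme.txt" file are hypothetical.

```python
from fnmatch import fnmatchcase

pattern = "sales/*/sales.csv"

# The first three paths come from the folder structure in the question;
# the last two are hypothetical additions for illustration.
paths = [
    "sales/01-2019/sales.csv",
    "sales/02-2019/sales.csv",
    "sales/03-2019/sales.csv",
    "sales/04-2019/sales.csv",
    "sales/readme.txt",
]

# A wildcard pattern picks up new monthly folders without being edited.
matched = [path for path in paths if fnmatchcase(path, pattern)]
```

A list of explicit file paths, by contrast, has to be extended each time a month is added.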
13.
Question 13
You decide to use the code below to deploy a model as an Azure Machine Learning
real-time web service:
service.wait_for_deployment(True)
You have to troubleshoot the deployment failure in order to determine which actions were taken
during deployment and to identify the action that failed.
For this scenario, which of the following code snippets should you use?
1 / 1 point
service.state
service.serialize()
service.update_deployment_state()
service.get_logs()
Correct
You can print out detailed Docker engine log messages from the service object.
You can view the log for ACI, AKS, and Local deployments.
14.
Question 14
You decide to register and train a model in your Azure Machine Learning workspace.
Your pipeline needs to ensure that the client applications are able to use the model for batch
inferencing.
Your single ParallelRunStep step pipeline uses a Python inferencing script in order to obtain
predictions from the input data.
Your task is to configure the inferencing script for the ParallelRunStep pipeline step.
Which two functions should you use? Each correct answer presents part of the solution.
0 / 1 point
run(mini_batch)
init()
Correct
This function is called when the pipeline is initialized.
score(mini_batch)
batch()
main()
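For orientation, a batch inferencing script built around init() and run(mini_batch) can be sketched as below. This is a simplified stand-in that runs without Azure: the "model" is a hypothetical placeholder function, and in a real ParallelRunStep the mini-batch would typically be a list of file paths or a pandas DataFrame rather than a list of numbers.

```python
model = None  # set once by init(), then reused by every run() call


def init():
    """Called once when the worker starts; load the model here."""
    global model
    # Hypothetical stand-in for loading a registered model from disk.
    model = lambda value: value * 2


def run(mini_batch):
    """Called once per mini-batch; returns one result per input item."""
    return [f"{item}: {model(item)}" for item in mini_batch]


# Local simulation of what the pipeline step does.
init()
results = run([1, 2, 3])
```

Loading the model in init() rather than in run() avoids reloading it for every mini-batch.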
15.
Question 15
You decide to deploy a real-time inference service for a trained model.
Your model supports a business-critical application, and you have to be able to monitor the
data that is submitted to the web service, as well as the predictions generated from that data.
While keeping the administrative effort to a minimum, you have to implement a
monitoring solution for the deployed model. What action should you take?
1 / 1 point
View the log files generated by the experiment used to train the model.
Enable Azure Application Insights for the service endpoint and view logged data in the Azure
portal.
Create an MLflow tracking URI that references the endpoint, and view the data logged by
MLflow.
Correct
You can also enable Azure Application Insights from Azure Machine Learning studio.
16.
Question 16
If you want to install the Azure Machine Learning SDK for Python, which package manager
and CLI command should you use?
1 / 1 point
nuget azureml-sdk
Correct
This package manager hosts the Azure ML SDK for Python, and this is the CLI command that
installs it.
17.
Question 17
Which SDK command should you choose if you want to retrieve a specific version of a dataset?
1 / 1 point
Correct
This is the correct command for this request.
18.
Question 18
Which SDK command should you choose if you want to publish the
pipeline that you created?
1 / 1 point
publishedpipeline = pipeline_publish(name='training_pipeline',
                                     version='1.0')
published_pipeline = pipeline.publish(name='training_pipeline',
                                      version='1.0')
published.pipeline = pipeline_publish(name='training_pipeline',
                                      version='1.0')
published.pipeline = pipeline.publish(name='training_pipeline',
                                      version='1.0')
Correct
This is the correct command for publishing a pipeline using the SDK.
19.
Question 19
True or False?
1 / 1 point
True
False
Correct
You must define parameters for a pipeline before publishing it.
20.
Question 20
If you want to set up a parallel run step, which of the SDK commands below should you choose?
1 / 1 point
parallelrun_step = ParallelRunStep(
    name='batch-score',
    parallel.run.config=parallel_run_config,
    inputs=[batch_data_set.as_named_input('batch_data')],
    output=output_dir,
    arguments=[],
    allow_reuse=True
)
parallelrun.step = ParallelRunStep(
    name='batch-score',
    parallel_run_config=parallel_run_config,
    inputs=[batch_data_set.as_named_input('batch_data')],
    output=output_dir,
    arguments=[],
    allow_reuse=True
)
parallelrun_step = ParallelRunStep(
    name='batch-score',
    parallel_run_config=parallel_run_config,
    inputs=[batch_data_set.as_named_input('batch_data')],
    output=output_dir,
    arguments=[],
    allow_reuse=True
)
parallelrun_step = ParallelRunStep(
    name='batch-score',
    parallel_run_config=parallel.run.config,
    inputs=[batch_data_set.as_named_input('batch_data')],
    output=output_dir,
    arguments=[],
    allow_reuse=True
)
Correct
These are the correct commands.
21.
Question 21
What Python code should you write to implement a median stopping policy?
1 / 1 point
early_termination_policy = MedianStoppingPolicy(truncation_percentage=10,
evaluation_interval=1,
delay_evaluation=5)
evaluation_interval=1,
delay_evaluation=5)
early_termination_policy = MedianStoppingPolicy(evaluation_interval=1,
delay_evaluation=5)
Correct
This is the correct code for this task.
22.
Question 22
What code should you write for a PFIExplainer if you have a model named loan_model?
0 / 1 point
initialization_examples=X_test,
classes=['loan_amount','income','age','marital_status'],
features=['reject', 'approve'])
from interpret.ext.blackbox
initialization_examples=X_test,
features=['loan_amount','income','age','marital_status'],
classes=['reject', 'approve'])
explainable_model= DecisionTreeExplainableModel,
features=['loan_amount','income','age','marital_status'],
classes=['reject', 'approve'])
features=['loan_amount','income','age','marital_status'],
classes=['reject', 'approve'])
Incorrect
Try going back and reviewing Using explainers.
23.
Question 23
Your task is to train a binary classification model that targets the correct
subjects in a marketing campaign.
What actions should you take to ensure that your model is fair and does not discriminate
based on ethnicity?
1 / 1 point
Evaluate each trained model with a validation dataset, and use the model with the highest
accuracy score. An accurate model is inherently fair.
Correct
By using ethnicity as a sensitive field, and comparing disparity between selection rates and
performance metrics for each ethnicity value, you can evaluate the fairness of the model.
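A minimal sketch of the selection-rate comparison described above follows. The group labels "A" and "B" and the predictions are made up for illustration; a dedicated fairness library would normally be used instead of this hand-rolled helper.

```python
def selection_rates(groups, predictions):
    """Fraction of positive predictions (1) within each sensitive-feature group."""
    totals, selected = {}, {}
    for group, prediction in zip(groups, predictions):
        totals[group] = totals.get(group, 0) + 1
        selected[group] = selected.get(group, 0) + prediction
    return {group: selected[group] / totals[group] for group in totals}


# Illustrative data only: sensitive-feature value and model prediction per person.
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]
predictions = [1, 1, 1, 0, 1, 0, 0, 0]

rates = selection_rates(groups, predictions)
disparity = max(rates.values()) - min(rates.values())
```

A large gap between groups is a signal to investigate further. The same per-group comparison can be repeated for performance metrics such as recall or precision.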
24.
Question 24
You decided to preprocess your AirBnB housing dataframe and filter it down to only the
relevant columns.
The columns that you kept are: id, host_name, bedrooms, neighbourhood_cleansed, price.
In order to obtain the first initial from the host_name column, you have written the following
function, which you named firstInitialFunction:
def firstInitialFunction(name):
    return name[0]
firstInitialFunction("George")
Your goal is to use the spark.sql.register in order to create a UDF from the function above,
because you want the UDF to be created in the SQL namespace.
1 / 1 point
airbnbDF.createOrReplaceTempView("airbnbDF")
spark.udf.register("sql_udf", firstInitialFunction)
airbnbDF.createAndReplaceTempView("airbnbDF")
spark.udf.register(sql_udf.firstInitialFunction)
airbnbDF.createTempView("airbnbDF")
spark.udf.register(sql_udf = firstInitialFunction)
airbnbDF.replaceTempView("airbnbDF")
spark.udf.register("sql_udf", firstInitialFunction)
Correct
This is the correct code for the task.
25.
Question 25
To track the runs of a Linear Regression model on your AirBnB dataset, you decide to use
MLflow.
You want to make use of all the features included in your dataset.
At this point, you have created and logged the pipeline, and you have logged the parameters.
1 / 1 point
predDF = pipelineModel.transform(testDF)
rmse = regressionEvaluator.setMetricName("rmse").evaluate(predDF)
r2 = regressionEvaluator.setMetricName("r2").evaluate(predDF)
predDF = pipelineModel.estimate(testDF)
rmse = regressionEvaluator.setMetricName("rmse").evaluate(predDF)
r2 = regressionEvaluator.setMetricName("r2").evaluate(predDF)
predDF = pipelineModel.transform(testDF)
rmse = regressionEvaluator.setMetricName("rmse").evaluate(predDF)
r2 = regressionEvaluator.setMetricName("r2").evaluate(predDF)
predDF = pipelineModel.evaluate(testDF)
rmse = regressionEvaluator.setMetricName("rmse").evaluate(predDF)
r2 = regressionEvaluator.setMetricName("r2").evaluate(predDF)
Correct
This is the correct code for the task.
26.
Question 26
You decided to use Python code interactively in your Conda environment. You have all the
required Azure Machine Learning SDK and MLflow packages in the environment.
You have to use MLflow to log metrics in your Azure Machine Learning experiment named
mlflow-experiment.
To give the correct answer, you have to replace the bolded code comments with
suitable code options from the answer area.
Considering this, which snippet should you choose to complete the code?
import mlflow
ws = Workspace.from_config()
print("Finished!")
0 / 1 point
#1 mlflow.set_tracking_uri(ws.get_mlflow_tracking_uri()), #2 mlflow.set_experiment('mlflow-experiment'), #3 mlflow.start_run(), #4 mlflow.log_metric
#1 mlflow.set_tracking_uri(ws.get_mlflow_tracking_uri()), #2 mlflow.get_run('mlflow-experiment'), #3 mlflow.start_run(), #4 run.log()
#1 mlflow.set_tracking_uri(ws.get_mlflow_tracking_uri()), #2 mlflow.get_run('mlflow-experiment'), #3 mlflow.start_run(), #4 mlflow.log_metric
Incorrect
Try going back and reviewing Use MLflow to track experiments, log metrics, and compare runs.
27.
Question 27
From the list below, choose the supervised learning problem type that usually outputs
quantitative values.
1 / 1 point
Clustering
Classification
Regression
Correct
Regression is the problem type to use here because the label being predicted is a numeric value.
28.
Question 28
From the descriptions below, choose the one that explains what a correlation of -1
means.
1 / 1 point
For each unit increase in one variable, the same decrease is seen in the other
For each unit increase in one variable, the same increase is seen in the other
Correct
This is what a negative correlation of -1 indicates.
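As a quick check of what a correlation of -1 looks like numerically, the sketch below computes the Pearson correlation coefficient by hand for made-up data in which y falls by a fixed amount for every unit increase in x.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variation_x = sum((x - mean_x) ** 2 for x in xs)
    variation_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / math.sqrt(variation_x * variation_y)


# Illustrative data: y drops by 2 for each unit increase in x.
xs = [1, 2, 3, 4]
ys = [8, 6, 4, 2]
r = pearson(xs, ys)
```

The coefficient is -1 whenever the points fall exactly on a downward-sloping line. The size of the decrease per unit does not have to be 1, only constant.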
29.
Question 29
Your task is to extract the last run from the experiments list.
1 / 1 point
runs[0].data.metrics
runs[0].data.metrics
runs[0].data.metrics
Correct
This is the correct code syntax.
30.
Question 30
From the list below, choose the cross-validation techniques that belong to the exhaustive type.
1 / 1 point
Leave-one-out cross-validation
Correct
Leave-one-out cross-validation (LOOCV) is a particular case of leave-p-out cross-validation with p
= 1, which makes it an exhaustive type of cross-validation.
Holdout cross-validation
Leave-p-out cross-validation
Correct
Leave-p-out cross-validation (LpO CV) is an exhaustive type of cross-validation technique. It
involves using p observations as the validation set and the remaining observations as the training
set. This is repeated for every way of splitting the original sample into a validation set of p
observations and a training set.
K-fold cross-validation
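To make "exhaustive" concrete, the sketch below enumerates every leave-p-out split of a small sample with itertools.combinations. The sample size of 4 and the helper name leave_p_out_splits are assumptions for illustration.

```python
from itertools import combinations


def leave_p_out_splits(n_samples, p):
    """Yield (training_indices, validation_indices) for every possible split."""
    indices = range(n_samples)
    for validation in combinations(indices, p):
        training = tuple(i for i in indices if i not in validation)
        yield training, validation


# Every way to hold out 2 of 4 observations, and every way to hold out 1.
lpo_splits = list(leave_p_out_splits(4, 2))
loo_splits = list(leave_p_out_splits(4, 1))
```

With p = 1 this reduces to leave-one-out, which yields one split per observation. The number of splits grows combinatorially with p, which is why these techniques get expensive on larger datasets.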
31.
Question 31
You decided to use Azure Machine Learning, and your goal is to train a Diabetes Model and build a
container image for it.
You choose to use the scikit-learn ElasticNet linear regression model.
You want to use Azure Kubernetes Service (AKS) to deploy the model to production.
You have to create an active AKS cluster by using the Azure ML SDK.
1 / 1 point
aks_target = ComputeTarget.create(workspace = workspace,
name = aks_cluster_name,
provisioning_configuration = prov_config)
name = aks_cluster_name,
provisioning_configuration = prov_config)
(name = aks_cluster_name,
provisioning_configuration = prov_config)
name = aks_cluster_name,)
Correct
This is the correct code for this task.
32.
Question 32
If you want to list the files generated after your experiment run has completed, which method
of the run object should you choose?
1 / 1 point
list_file_names
download_file
download_files
get_file_names
Correct
You can use the run object's get_file_names method to list the files generated. Standard practice
is for scripts that train models to save them in the run's outputs folder.
33.
Question 33
Your hyperparameter tuning needs to have a search space defined. The batch_size
hyperparameter can take the values 128, 256, or 512, and the learning_rate hyperparameter is
drawn from a normal distribution with a mean of 10 and a standard deviation of 3.
What Python code should you write in order to achieve this goal?
0 / 1 point
param_space = {
'--learning_rate': qnormal(10, 3)
param_space = {
'--learning_rate': uniform(10, 3)
param_space = {
'--learning_rate': lognormal(10, 3)
param_space = {
'--learning_rate': normal(10, 3)
Incorrect
You did not select an answer.
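To show what such a search space means in practice, the sketch below draws random configurations using Python's standard random module as a stand-in for the tuning service's sampling. It does not use the Azure ML parameter expressions themselves; the helper name sample_config and the seed are assumptions.

```python
import random


def sample_config(rng):
    """Draw one hyperparameter combination from the search space in the question."""
    return {
        "--batch_size": rng.choice([128, 256, 512]),  # discrete choice
        "--learning_rate": rng.gauss(10, 3),          # normal: mean 10, std dev 3
    }


rng = random.Random(0)  # fixed seed so the sketch is repeatable
samples = [sample_config(rng) for _ in range(1000)]
mean_learning_rate = sum(s["--learning_rate"] for s in samples) / len(samples)
```

Over many draws the batch size stays within the three allowed values, while the learning rate clusters around the mean of 10.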
34.
Question 34
You decided to use Azure Machine Learning, and your goal is to train a Diabetes Model and build a
container image for it.
You choose to use the scikit-learn ElasticNet linear regression model.
You want to use Azure Kubernetes Service (AKS) to deploy the model to production.
At this point, you have deployed the image of the model to the desired AKS cluster.
After using different hyperparameters to train the new model, your goal is to deploy the new
image of the model to the AKS cluster.
1 / 1 point
prod_webservice.create (image=model_image_updated)
prod_webservice.wait_for_deployment(show_output = True)
prod_webservice.deploy (image=model_image_updated)
prod_webservice.wait_for_deployment(show_output = True)
prod_webservice.delete (image=model_image_updated)
prod_webservice.wait_for_deployment(show_output = True)
prod_webservice.update(image=model_image_updated)
prod_webservice.wait_for_deployment(show_output = True)
Correct
This is the correct code for this task.
35.
Question 35
You are evaluating a completed binary classification machine learning model.
0 / 1 point
Gradient descent
Binary classification confusion matrix
A violin plot
Box plot
Incorrect
Try going back and reviewing Create a classification model with Azure AI.