Databricks Databricks-Machine-Learning-Associate Actual Questions

Databricks Certified Machine Learning Associate Exam Questions and Answers

Question 13

A data scientist has defined a Pandas UDF function predict to parallelize the inference process for a single-node model:

They have written the following incomplete code block to use predict to score each record of Spark DataFrame spark_df:

Which of the following lines of code can be used to complete the code block to successfully complete the task?

Options:

A.

predict(*spark_df.columns)

B.

mapInPandas(predict)

C.

predict(Iterator(spark_df))

D.

mapInPandas(predict(spark_df.columns))

E.

predict(spark_df.columns)

Question 14

A machine learning engineer has identified the best run from an MLflow Experiment. They have stored the run ID in the run_id variable and identified the logged model name as "model". They now want to register that model in the MLflow Model Registry with the name "best_model".

Which lines of code can they use to register the model associated with run_id to the MLflow Model Registry?

Options:

A.

mlflow.register_model(run_id, "best_model")

B.

mlflow.register_model(f"runs:/{run_id}/model", "best_model")

C.

mlflow.register_model(f"runs:/{run_id}/model")

D.

mlflow.register_model(f"runs:/{run_id}/best_model", "model")
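As a point of reference for the options above, the following is a minimal sketch of how a model logged under a run is usually addressed and registered. It assumes the `runs:/<run_id>/<artifact_path>` URI scheme and the `mlflow.register_model(model_uri, name)` call; the helper names `build_runs_uri` and `register_logged_model` are hypothetical and exist only for illustration.

```python
def build_runs_uri(run_id: str, artifact_path: str) -> str:
    """Build a runs:/ URI pointing at a model logged under a given run."""
    return f"runs:/{run_id}/{artifact_path}"


def register_logged_model(run_id: str, artifact_path: str, registry_name: str):
    """Register the logged model under registry_name (hypothetical helper).

    mlflow is imported lazily so the URI helper above works without it.
    """
    import mlflow

    model_uri = build_runs_uri(run_id, artifact_path)
    # First argument: where the model was logged; second: its registry name.
    return mlflow.register_model(model_uri, registry_name)
```

With the values from the question, the call would be `register_logged_model(run_id, "model", "best_model")`: the logged name goes into the URI and the registry name is passed separately.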

Question 15

A data scientist has created two linear regression models. The first model uses price as a label variable and the second model uses log(price) as a label variable. When evaluating the RMSE of each model by comparing the label predictions to the actual price values, the data scientist notices that the RMSE for the second model is much larger than the RMSE of the first model.

Which of the following possible explanations for this difference is invalid?

Options:

A.

The second model is much more accurate than the first model

B.

The data scientist failed to exponentiate the predictions in the second model prior to computing the RMSE

C.

The data scientist failed to take the log of the predictions in the first model prior to computing the RMSE

D.

The first model is much more accurate than the second model

E.

The RMSE is an invalid evaluation metric for regression problems
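The scale mismatch described in this question can be illustrated numerically. The sketch below uses made-up prices and predictions (all values are hypothetical, not taken from the exam) to show how the RMSE behaves when predictions on the log scale are compared directly with prices on the original scale, and how it changes once those predictions are exponentiated.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical actual prices.
prices = [100.0, 200.0, 400.0]

# Model 1 predicts price directly.
pred_price = [110.0, 190.0, 420.0]

# Model 2 predicts log(price); these are log-scale values.
pred_log_price = [math.log(105.0), math.log(195.0), math.log(410.0)]

rmse_model_1 = rmse(prices, pred_price)

# Scale mismatch: log-scale predictions compared against raw prices.
rmse_model_2_raw = rmse(prices, pred_log_price)

# Same predictions, transformed back to the price scale first.
rmse_model_2_exp = rmse(prices, [math.exp(p) for p in pred_log_price])
```

In this toy data the second model is actually closer to the true prices, yet its RMSE looks far worse until its predictions are moved back onto the price scale.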

Question 16

A machine learning engineer wants to parallelize the inference of group-specific models using the Pandas Function API. They have developed the apply_model function that will look up and load the correct model for each group, and they want to apply it to each group of DataFrame df.

They have written the following incomplete code block:

Which piece of code can be used to fill in the above blank to complete the task?

Options:

A.

applyInPandas

B.

groupedApplyInPandas

C.

mapInPandas

D.

predict
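The code block for this question is not reproduced above, so the following is only a sketch of the general pattern for grouped inference with the Pandas Function API, not the exam's own code. It assumes PySpark's `groupBy(...).applyInPandas(func, schema)`; the column names `group` and `x`, the `GROUP_MODELS` lookup, the body of `apply_model`, and the `score_groups` wrapper are all hypothetical stand-ins.

```python
import pandas as pd

# Hypothetical stand-in for a per-group model store: one slope per group.
GROUP_MODELS = {"a": 2.0, "b": 10.0}


def apply_model(pdf: pd.DataFrame) -> pd.DataFrame:
    """Receive all rows of one group as a pandas DataFrame and score them."""
    group = pdf["group"].iloc[0]   # every row in pdf shares the same group key
    slope = GROUP_MODELS[group]    # "look up and load" the model for the group
    return pdf.assign(prediction=pdf["x"] * slope)


def score_groups(df):
    """Apply apply_model to each group of a Spark DataFrame (hypothetical wrapper)."""
    # The schema string describes the pandas DataFrame returned by apply_model.
    return df.groupBy("group").applyInPandas(
        apply_model, schema="group string, x double, prediction double"
    )
```

Because `apply_model` is an ordinary pandas function, it can be checked locally on a small pandas DataFrame before it is handed to Spark.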