Databricks ML Data Scientist Databricks-Machine-Learning-Associate New Questions

Databricks Certified Machine Learning Associate Exam Questions and Answers

Question 17

A data scientist has produced two models for a single machine learning problem. One of the models performs well when one of the features has a value of less than 5, and the other model performs well when the value of that feature is greater than or equal to 5. The data scientist decides to combine the two models into a single machine learning solution.

Which of the following terms is used to describe this combination of models?

Options:

A. Bootstrap aggregation

B. Support vector machines

C. Bucketing

D. Ensemble learning

E. Stacking
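
The scenario in Question 17 can be sketched in plain Python. This is only an illustration under stated assumptions: the two stand-in models, the helper name make_combined_model, and the feature position are hypothetical and do not come from the exam material; only the threshold of 5 is taken from the question.

```python
from typing import Callable, Sequence

Model = Callable[[Sequence[float]], float]


def make_combined_model(
    low_model: Model,
    high_model: Model,
    feature_index: int,
    threshold: float = 5.0,
) -> Model:
    """Return a predictor that routes each row to one of two models."""

    def predict(row: Sequence[float]) -> float:
        # Rows whose feature is below the threshold go to low_model;
        # rows at or above the threshold go to high_model.
        if row[feature_index] < threshold:
            return low_model(row)
        return high_model(row)

    return predict


# Hypothetical stand-in models (assumptions, not real trained models).
def low_model(row: Sequence[float]) -> float:
    return 2.0 * row[0]


def high_model(row: Sequence[float]) -> float:
    return row[0] + 10.0


combined = make_combined_model(low_model, high_model, feature_index=0)
predictions = [combined(row) for row in [[1.0], [5.0], [8.0]]]
```

Each row is scored by exactly one of the two models, chosen by the value of the single feature named in the question.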

Question 18

A data scientist wants to efficiently tune the hyperparameters of a scikit-learn model. They elect to use the Hyperopt library's fmin operation to facilitate this process. Unfortunately, the final model is not very accurate. The data scientist suspects that there is an issue with the objective_function being passed as an argument to fmin.

They use the following code block to create the objective_function:

Which of the following changes does the data scientist need to make to their objective_function in order to produce a more accurate model?

Options:

A. Add test set validation process

B. Add a random_state argument to the RandomForestRegressor operation

C. Remove the mean operation that is wrapping the cross_val_score operation

D. Replace the r2 return value with -r2

E. Replace the fmin operation with the fmax operation

Question 19

A data scientist is using Spark ML to engineer features for an exploratory machine learning project.

They decide they want to standardize their features using the following code block:

Upon code review, a colleague expressed concern with the features being standardized prior to splitting the data into a training set and a test set.

Which of the following changes can the data scientist make to address the concern?

Options:

A. Utilize the MinMaxScaler object to standardize the training data according to global minimum and maximum values

B. Utilize the MinMaxScaler object to standardize the test data according to global minimum and maximum values

C. Utilize a cross-validation process rather than a train-test split process to remove the need for standardizing data

D. Utilize the Pipeline API to standardize the training data according to the test data's summary statistics

E. Utilize the Pipeline API to standardize the test data according to the training data's summary statistics
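
The colleague's concern in Question 19 can be illustrated with a small plain-Python sketch. It does not reproduce the question's Spark ML code block, which is not shown here; the feature values, the split point, and the helper name standardize are made up for illustration, and the population standard deviation is used as a simplification.

```python
from statistics import mean, pstdev

# Made-up feature column; the last row is treated as the held-out test row.
values = [1.0, 2.0, 3.0, 4.0, 100.0]
train, test = values[:4], values[4:]


def standardize(xs, mu, sigma):
    """Subtract the mean and divide by the standard deviation."""
    return [(x - mu) / sigma for x in xs]


# Summary statistics computed before splitting include the test row.
mu_all, sigma_all = mean(values), pstdev(values)

# Summary statistics computed after splitting see only the training rows.
mu_train, sigma_train = mean(train), pstdev(train)

train_scaled_before_split = standardize(train, mu_all, sigma_all)
train_scaled_after_split = standardize(train, mu_train, sigma_train)
```

With these numbers the mean moves from 2.5 (training rows only) to 22.0 (all rows), so the scaled training values depend on a row the model is not supposed to have seen.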

Question 20

What is the name of the method that transforms categorical features into a series of binary indicator feature variables?

Options:

A. Leave-one-out encoding

B. Target encoding

C. One-hot encoding

D. Categorical

E. String indexing
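
The transformation described in Question 20, turning a categorical feature into a series of binary indicator features, can be sketched in plain Python. The helper name to_indicator_columns and the sample values are hypothetical, chosen only for illustration; libraries such as Spark ML and scikit-learn provide their own implementations.

```python
def to_indicator_columns(values):
    """Map each categorical value to a row of 0/1 indicator features."""
    # One indicator column per distinct category, in sorted order.
    categories = sorted(set(values))
    rows = [[1 if value == category else 0 for category in categories]
            for value in values]
    return categories, rows


colors = ["red", "green", "red", "blue"]  # made-up categorical feature
categories, encoded = to_indicator_columns(colors)
```

Each output row contains exactly one 1, in the column that corresponds to the row's original category.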