
Free and Premium CertNexus AIP-210 Dumps Questions Answers

Page: 1 / 7
Total 90 questions

CertNexus Certified Artificial Intelligence Practitioner (CAIP) Questions and Answers

Question 1

Which of the following approaches is best if a limited portion of your training data is labeled?

Options:

A.

Dimensionality reduction

B.

Probabilistic clustering

C.

Reinforcement learning

D.

Semi-supervised learning

Question 2

Which two of the following criteria are essential for machine learning models to achieve before deployment? (Select two.)

Options:

A.

Complexity

B.

Data size

C.

Explainability

D.

Portability

E.

Scalability

Question 3

A dataset can contain a range of values that depict a certain characteristic, such as grades on tests in a class during the semester. A specific student has so far received the following grades: 76, 81, 78, 87, 75, and 72. There is one final test in the semester. What minimum grade would the student need to achieve on the last test to get an 80% average?

Options:

A.

82

B.

89

C.

91

D.

94
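The arithmetic behind this question can be checked with a short sketch. It assumes that all seven tests are weighted equally and that "80% average" means an arithmetic mean of 80; the function name `min_final_grade` is illustrative only.

```python
def min_final_grade(grades, target_mean):
    """Score needed on one more equally weighted test to reach target_mean."""
    # The mean over len(grades) + 1 tests must equal target_mean,
    # so the total of all scores must equal target_mean * (n + 1).
    total_needed = target_mean * (len(grades) + 1)
    return total_needed - sum(grades)


grades_so_far = [76, 81, 78, 87, 75, 72]  # sums to 469
needed = min_final_grade(grades_so_far, 80)
print(needed)  # 91
```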

Question 4

Which of the following pieces of AI technology provides the ability to create fake videos?

Options:

A.

Generative adversarial networks (GAN)

B.

Long short-term memory (LSTM) networks

C.

Recurrent neural networks (RNN)

D.

Support-vector machines (SVM)

Question 5

In which of the following scenarios is lasso regression preferable over ridge regression?

Options:

A.

The number of features is much larger than the sample size.

B.

There are many features with no association with the dependent variable.

C.

There is high collinearity among some of the features associated with the dependent variable.

D.

The sample size is much larger than the number of features.
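As background for comparing the two penalties, here is a minimal sketch. It assumes the special case of an orthonormal design matrix, where each penalized coefficient has a closed form in terms of the ordinary least-squares estimate. The coefficient values and the penalty strength `lam` are made up for illustration.

```python
def lasso_coef(ols_coef, lam):
    """Soft-thresholding: shrink toward zero and clip small values to exactly zero.

    Assumes the objective 0.5 * ||y - Xb||^2 + lam * ||b||_1 with orthonormal X.
    """
    magnitude = max(abs(ols_coef) - lam, 0.0)
    return magnitude if ols_coef >= 0 else -magnitude


def ridge_coef(ols_coef, lam):
    """Proportional shrinkage: smaller, but nonzero whenever the input is nonzero.

    Assumes the objective 0.5 * ||y - Xb||^2 + 0.5 * lam * ||b||^2 with orthonormal X.
    """
    return ols_coef / (1.0 + lam)


ols = [2.5, -0.3, 0.1, -1.8]  # hypothetical least-squares estimates
lam = 0.5
lasso = [lasso_coef(b, lam) for b in ols]
ridge = [ridge_coef(b, lam) for b in ols]
print(lasso)
print(ridge)
```

In this sketch the two weak coefficients become exactly zero under the lasso penalty, while the ridge penalty only scales them down.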

Question 6

Which of the following tools would you use to create a natural language processing application?

Options:

A.

AWS DeepRacer

B.

Azure Search

C.

DeepDream

D.

NLTK

Question 7

In a self-driving car company, ML engineers want to develop a model for dynamic pathing. Which of the following approaches would be optimal for this task?

Options:

A.

Dijkstra's algorithm

B.

Reinforcement learning

C.

Supervised learning

D.

Unsupervised learning

Question 8

Which two techniques are used to build personas in the ML development lifecycle? (Select two.)

Options:

A.

Population estimates

B.

Population regression

C.

Population resampling

D.

Population triage

E.

Population variance

Question 9

A company is developing a merchandise sales application. The product team uses training data to train an AI model to predict sales, and discovers emergent bias. What caused the biased results?

Options:

A.

The AI model was trained in winter and applied in summer.

B.

The application was migrated from on-premises to a public cloud.

C.

The team set flawed expectations when training the model.

D.

The training data used was inaccurate.

Question 10

Which of the following are true about the transform-design pattern for a machine learning pipeline? (Select three.)

Options:

A.

It aims to separate inputs from features.

B.

It encapsulates the processing steps of ML pipelines.

C.

It ensures reproducibility.

D.

It represents steps in the pipeline with a directed acyclic graph (DAG).

E.

It seeks to isolate individual steps of ML pipelines.

F.

It transforms the output data after production.

Question 11

Which of the following is the primary purpose of hyperparameter optimization?

Options:

A.

Controls the learning process of a given algorithm

B.

Makes models easier to explain to business stakeholders

C.

Improves model interpretability

D.

Increases recall over precision

Question 12

The following confusion matrix is produced when a classifier is used to predict labels on a test dataset. How precise is the classifier?

Options:

A.

48/(48+37)

B.

37/(37+8)

C.

37/(37+7)

D.

(48+37)/100
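For reference, precision is the share of predicted positives that are actually positive: TP / (TP + FP). The confusion matrix itself is not reproduced in this text, so the counts in the sketch below are hypothetical and are not taken from it.

```python
def precision(true_positives, false_positives):
    """Fraction of positive predictions that are correct: TP / (TP + FP)."""
    predicted_positives = true_positives + false_positives
    if predicted_positives == 0:
        raise ValueError("precision is undefined when nothing is predicted positive")
    return true_positives / predicted_positives


# Hypothetical counts, chosen only to illustrate the formula.
example = precision(true_positives=30, false_positives=10)
print(example)  # 0.75
```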

Question 13

In general, models that perform their tasks:

Options:

A.

Less accurately are less robust against adversarial attacks.

B.

Less accurately are neither more nor less robust against adversarial attacks.

C.

More accurately are less robust against adversarial attacks.

D.

More accurately are neither more nor less robust against adversarial attacks.

Question 14

The graph is an elbow plot showing the inertia, or within-cluster sum of squares, on the y-axis and the number of clusters (also called K) on the x-axis. It shows how inertia changes as the number of clusters changes when using the k-means algorithm.

What would be an optimal value of K to ensure a good number of clusters?

Options:

A.

2

B.

3

C.

5

D.

9
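The quantity plotted on the y-axis of an elbow plot can be sketched as follows. This is an illustration only: the points are invented, and the cluster assignments are fixed by hand rather than produced by running k-means.

```python
def inertia(points, labels):
    """Within-cluster sum of squared distances from each point to its cluster centroid."""
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)

    total = 0.0
    for members in clusters.values():
        dims = len(members[0])
        centroid = [sum(p[d] for p in members) / len(members) for d in range(dims)]
        for p in members:
            total += sum((p[d] - centroid[d]) ** 2 for d in range(dims))
    return total


points = [(0, 0), (0, 2), (10, 0), (10, 2)]  # two visibly separate groups
one_cluster = inertia(points, [0, 0, 0, 0])
two_clusters = inertia(points, [0, 0, 1, 1])
print(one_cluster, two_clusters)  # 104.0 4.0
```

Inertia keeps falling as K grows, so an elbow plot is read by looking for the K after which the decrease flattens out.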

Question 15

When should you use semi-supervised learning? (Select two.)

Options:

A.

A small set of labeled data is available but not representative of the entire distribution.

B.

A small set of labeled data is biased toward one class.

C.

Labeling data is challenging and expensive.

D.

There is a large amount of labeled data to be used for predictions.

E.

There is a large amount of unlabeled data to be used for predictions.

Question 16

An AI system recommends New Year's resolutions. It has an ML pipeline without monitoring components. What retraining strategy would be BEST for this pipeline?

Options:

A.

Periodically before New Year's Day and after New Year's Day

B.

Periodically every year

C.

When concept drift is detected

D.

When data drift is detected

Question 17

Which of the following items should be included in a handover to the end user to enable them to use and run a trained model on their own system? (Select three.)

Options:

A.

Information on the folder structure in your local machine

B.

Intermediate data files

C.

Link to a GitHub repository of the codebase

D.

README document

E.

Sample input and output data files

Question 18

Which two of the following decrease technical debt in ML systems? (Select two.)

Options:

A.

Boundary erosion

B.

Design anti-patterns

C.

Documentation readability

D.

Model complexity

E.

Refactoring

Question 19

Given a feature set with rows that contain missing continuous values, and assuming the data is normally distributed, what is the best way to fill in these missing features?

Options:

A.

Delete entire rows that contain any missing features.

B.

Fill in missing features with random values for that feature in the training set.

C.

Fill in missing features with the average of observed values for that feature in the entire dataset.

D.

Delete entire columns that contain any missing features.
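One of the strategies listed above, filling gaps with the average of the observed values, can be sketched in a few lines. The column values are hypothetical, missing entries are represented as `None`, and `impute_with_mean` is an illustrative name rather than a library function.

```python
from statistics import mean


def impute_with_mean(column):
    """Replace None entries with the mean of the observed (non-missing) values."""
    observed = [value for value in column if value is not None]
    if not observed:
        raise ValueError("cannot impute a column with no observed values")
    fill_value = mean(observed)
    return [fill_value if value is None else value for value in column]


heights = [170.0, None, 165.0, 175.0, None]  # hypothetical feature column
filled = impute_with_mean(heights)
print(filled)  # [170.0, 170.0, 165.0, 175.0, 170.0]
```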

Question 20

Your dependent variable Y is a count, ranging from 0 to infinity. Because Y is approximately log-normally distributed, you decide to log-transform the data prior to performing a linear regression.

What should you do before log-transforming Y?

Options:

A.

Add 1 to all of the Y values.

B.

Divide all the Y values by the standard deviation of Y.

C.

Explore the data for outliers.

D.

Subtract the mean of Y from all the Y values.
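A practical detail behind this question is that a count variable can contain zeros, and the logarithm of zero is undefined. The sketch below uses made-up counts and shows one common workaround, a shift by one before taking the log, using the standard-library `math.log1p`.

```python
import math

counts = [0, 1, 4, 9]  # hypothetical count data that includes a zero

# math.log(0) raises ValueError, so a plain log-transform fails on zero counts.
try:
    math.log(0)
    log_of_zero_defined = True
except ValueError:
    log_of_zero_defined = False

# log1p(y) computes log(1 + y), which is defined for every count y >= 0.
shifted_logs = [math.log1p(y) for y in counts]
print(log_of_zero_defined)  # False
print(shifted_logs[0])      # 0.0
```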

Question 21

Which of the following is a type 1 error in statistical hypothesis testing?

Options:

A.

The null hypothesis is false, but fails to be rejected.

B.

The null hypothesis is false and is rejected.

C.

The null hypothesis is true and fails to be rejected.

D.

The null hypothesis is true, but is rejected.

Question 22

An HR solutions firm is developing software for staffing agencies that uses machine learning.

The team uses training data to teach the algorithm and discovers that it generates lower employability scores for women. Also, it predicts that women, especially with children, are less likely to get a high-paying job.

Which type of bias has been discovered?

Options:

A.

Automation

B.

Emergent

C.

Preexisting

D.

Technical

Question 23

Your dependent variable data is a proportion. The observed range of your data is 0.01 to 0.99. The instrument used to generate the dependent variable data is known to generate low quality data for values close to 0 and close to 1. A colleague suggests performing a logit-transformation on the data prior to performing a linear regression. Which of the following is a concern with this approach?

Definition of logit-transformation

If p is the proportion: logit(p) = log(p / (1 - p))

Options:

A.

After logit-transformation, the data may violate the assumption of independence.

B.

Noisy data could become more influential in your model.

C.

The model will be more likely to violate the assumption of normality.

D.

Values near 0.5 before logit-transformation will be near 0 after.
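The behavior of the transformation defined above can be explored numerically. This is a minimal sketch using the natural logarithm; the step size of 0.01 is an arbitrary choice for illustration.

```python
import math


def logit(p):
    """logit(p) = log(p / (1 - p)), defined for 0 < p < 1."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    return math.log(p / (1.0 - p))


step = 0.01  # the same small change in p, applied at two different places
shift_near_edge = logit(0.01 + step) - logit(0.01)
shift_near_middle = logit(0.50 + step) - logit(0.50)

print(logit(0.5))                           # 0.0
print(shift_near_edge > shift_near_middle)  # True
```

The same change in p moves the transformed value much further near the ends of the range than near the middle.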

Question 24

Which type of regression represents the following formula: y = c + b*x, where y = estimated dependent variable score, c = constant, b = regression coefficient, and x = score on the independent variable?

Options:

A.

Lasso regression

B.

Linear regression

C.

Polynomial regression

D.

Ridge regression
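The formula y = c + b*x can be fitted to data with ordinary least squares. The sketch below uses the closed-form estimates for a single independent variable; the data points are invented and lie exactly on a line so the result is easy to verify by hand.

```python
def fit_line(xs, ys):
    """Least-squares estimates (c, b) for the model y = c + b * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope: covariance of x and y divided by the variance of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx
    # Intercept: the fitted line passes through the point of means.
    c = mean_y - b * mean_x
    return c, b


xs = [0, 1, 2, 3, 4]
ys = [2, 5, 8, 11, 14]  # generated from y = 2 + 3 * x
c, b = fit_line(xs, ys)
print(c, b)  # 2.0 3.0
```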

Question 25

A big data architect needs to be cautious about personally identifiable information (PII) that may be captured with their new IoT system. What is the final stage of the Data Management Life Cycle, which the architect must complete in order to implement data privacy and security appropriately?

Options:

A.

De-Duplicate

B.

Destroy

C.

Detain

D.

Duplicate

Question 26

You are implementing a support-vector machine on your data, and a colleague suggests you use a polynomial kernel. In what situation might this help improve the prediction of your model?

Options:

A.

When it is necessary to save computational time.

B.

When the categories of the dependent variable are not linearly separable.

C.

When the distribution of the dependent variable is Gaussian.

D.

When there is high correlation among the features.
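To make the idea of a polynomial kernel concrete, the sketch below checks that a degree-2 kernel on two-dimensional inputs equals an ordinary dot product in an expanded feature space. The function names, the choice of degree 2 with no constant term, and the sample vectors are all assumptions made for illustration.

```python
import math


def poly_kernel(x, z, degree=2, coef0=0.0):
    """Polynomial kernel K(x, z) = (x . z + coef0) ** degree."""
    dot = sum(a * b for a, b in zip(x, z))
    return (dot + coef0) ** degree


def degree2_features(x):
    """Explicit feature map whose dot product reproduces (x . z) ** 2 in 2-D."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2) * x1 * x2, x2 * x2)


x, z = (1.0, 2.0), (3.0, -1.0)
via_kernel = poly_kernel(x, z)
via_features = sum(a * b for a, b in zip(degree2_features(x), degree2_features(z)))
print(math.isclose(via_kernel, via_features))  # True
```

The kernel gives the classifier access to squared and cross-product terms without computing the expanded features explicitly.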

Question 27

A healthcare company experiences a cyberattack in which hackers are able to reverse-engineer a dataset to break confidentiality.

Which of the following is TRUE regarding the dataset parameters?

Options:

A.

The model is overfitted and trained on a high quantity of patient records.

B.

The model is overfitted and trained on a low quantity of patient records.

C.

The model is underfitted and trained on a high quantity of patient records.

D.

The model is underfitted and trained on a low quantity of patient records.
