Free and Premium CertNexus AIP-210 Dumps Questions Answers

Page: 1 / 7
Total 92 questions

CertNexus Certified Artificial Intelligence Practitioner (CAIP) Questions and Answers

Question 1

You are developing a prediction model. Your team indicates they need an algorithm that is fast and requires low memory and low processing power. Assuming the following algorithms have similar accuracy on your data, which is most likely to be an ideal choice for the job?

Options:

A.

Deep learning neural network

B.

Random forest

C.

Ridge regression

D.

Support-vector machine

Question 2

Below are three tables: Employees, Departments, and Directors.

Employee_Table

Department_Table

Director_Table

ID    Firstname  Lastname  Age  Salary     DeptID
4566  Joey       Morin     62   $ 122,000  1
1230  Sam        Clarck    43   $ 95,670   2
9077  Lola       Russell   54   $ 165,700  3
1346  Lily       Cotton    46   $ 156,000  4
2088  Beckett    Good      52   $ 165,000  5

Which SQL query provides the Directors' Firstname, Lastname, the name of their departments, and the average employee's salary?

Options:

A.

SELECT m.Firstname, m.Lastname, d.Name, AVG(e.Salary) as Dept_avg_Salary
FROM Employee_Table as e
LEFT JOIN Department_Table as d on e.Dept = d.Name
LEFT JOIN Director_Table as m on d.ID = m.DeptID
GROUP BY m.Firstname, m.Lastname, d.Name

B.

SELECT m.Firstname, m.Lastname, d.Name, AVG(e.Salary) as Dept_avg_Salary
FROM Employee_Table as e
RIGHT JOIN Department_Table as d on e.Dept = d.Name
INNER JOIN Director_Table as m on d.ID = m.DeptID
GROUP BY d.Name

C.

SELECT m.Firstname, m.Lastname, d.Name, AVG(e.Salary) as Dept_avg_Salary
FROM Employee_Table as e
RIGHT JOIN Department_Table as d on e.Dept = d.Name
INNER JOIN Director_Table as m on d.ID = m.DeptID
GROUP BY e.Salary

D.

SELECT m.Firstname, m.Lastname, d.Name, AVG(e.Salary) as Dept_avg_Salary
FROM Employee_Table as e
RIGHT JOIN Department_Table as d on e.Dept = d.Name
INNER JOIN Director_Table as m on d.ID = m.DeptID
GROUP BY m.Firstname, m.Lastname, d.Name
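
As a rough illustration of the join-and-aggregate pattern this question describes, the sketch below runs a comparable query in an in-memory SQLite database. The department names and employee rows are hypothetical (the Department and Employee tables are not shown above), and the query starts from Department_Table with a LEFT JOIN rather than using RIGHT JOIN, on the assumption that some SQLite builds do not support RIGHT JOIN.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Department_Table (ID INTEGER, Name TEXT);
CREATE TABLE Employee_Table (ID INTEGER, Salary REAL, Dept TEXT);
CREATE TABLE Director_Table (ID INTEGER, Firstname TEXT, Lastname TEXT, DeptID INTEGER);

-- Hypothetical department names and employee rows, for illustration only.
INSERT INTO Department_Table VALUES (1, 'Sales'), (2, 'Research');
INSERT INTO Employee_Table VALUES (10, 50000, 'Sales'), (11, 70000, 'Sales'), (12, 90000, 'Research');
-- Two directors taken from the table above.
INSERT INTO Director_Table VALUES (4566, 'Joey', 'Morin', 1), (1230, 'Sam', 'Clarck', 2);
""")

# Every non-aggregated column in SELECT also appears in GROUP BY.
query = """
SELECT m.Firstname, m.Lastname, d.Name, AVG(e.Salary) AS Dept_avg_Salary
FROM Department_Table AS d
LEFT JOIN Employee_Table AS e ON e.Dept = d.Name
INNER JOIN Director_Table AS m ON d.ID = m.DeptID
GROUP BY m.Firstname, m.Lastname, d.Name
ORDER BY d.Name
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Keeping every department on the outer side of the join means a department with no employees would still appear, with a NULL average.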

Question 3

Which of the following statements are true regarding highly interpretable models? (Select two.)

Options:

A.

They are usually binary classifiers.

B.

They are usually easier to explain to business stakeholders.

C.

They are usually referred to as "black box" models.

D.

They are usually very good at solving non-linear problems.

E.

They usually compromise on model accuracy for the sake of interpretability.

Question 4

Your dependent variable Y is a count, ranging from 0 to infinity. Because Y is approximately log-normally distributed, you decide to log-transform the data prior to performing a linear regression.

What should you do before log-transforming Y?

Options:

A.

Add 1 to all of the Y values.

B.

Divide all the Y values by the standard deviation of Y.

C.

Explore the data for outliers.

D.

Subtract the mean of Y from all the Y values.
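
A minimal sketch of why zero counts matter here: the natural log is undefined at 0, while log(1 + y) is defined for every non-negative count. The function name log1p_transform and the sample counts are assumptions made for this example.

```python
import math

def log1p_transform(counts):
    """Return log(1 + y) for each non-negative count y."""
    return [math.log1p(y) for y in counts]

counts = [0, 1, 9, 99]  # hypothetical count data that includes a zero

# A plain log raises ValueError on a zero count.
try:
    math.log(counts[0])
    plain_log_ok = True
except ValueError:
    plain_log_ok = False

transformed = log1p_transform(counts)
print(plain_log_ok, transformed)
```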

Question 5

Which two of the following statements about the beta value in an A/B test are accurate? (Select two.)

Options:

A.

The Beta value is the rate of type II errors for the test.

B.

The Beta value is the rate of type I errors for the test.

C.

The statistical power of a test is the inverse of the Beta value, or 1 - Beta.

D.

The Beta in an Alpha/Beta test represents one of the two variants of the A/B test.
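
To make the relationship between beta and statistical power concrete, here is a hedged sketch for a one-sided, one-sample z-test with known standard deviation. The function name z_test_power and the effect size, sigma, and sample size are illustrative assumptions, not values from the question.

```python
from statistics import NormalDist

def z_test_power(effect, sigma, n, alpha=0.05):
    """Return (power, beta) for a one-sided one-sample z-test.

    beta is the probability of a type II error (missing a real effect);
    power is its complement, 1 - beta.
    """
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha)   # rejection threshold under H0
    shift = effect / (sigma / n ** 0.5)       # true effect in standard-error units
    beta = std_normal.cdf(z_alpha - shift)    # chance of failing to reject H0
    return 1 - beta, beta

power, beta = z_test_power(effect=0.5, sigma=1.0, n=25)
print(round(power, 3), round(beta, 3))
```

With no true effect (effect=0), the same function returns a power equal to alpha, which is the type I error rate.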

Question 6

When should you use semi-supervised learning? (Select two.)

Options:

A.

A small set of labeled data is available but not representative of the entire distribution.

B.

A small set of labeled data is biased toward one class.

C.

Labeling data is challenging and expensive.

D.

There is a large amount of labeled data to be used for predictions.

E.

There is a large amount of unlabeled data to be used for predictions.

Question 7

Which type of regression represents the following formula: y = c + b*x, where y = estimated dependent variable score, c = constant, b = regression coefficient, and x = score on the independent variable?

Options:

A.

Lasso regression

B.

Linear regression

C.

Polynomial regression

D.

Ridge regression
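
The formula y = c + b*x can be fitted by ordinary least squares. The sketch below is one simple way to estimate c and b without external libraries; the function name fit_line and the data points are assumptions for illustration.

```python
def fit_line(xs, ys):
    """Estimate c and b in y = c + b*x by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx              # regression coefficient (slope)
    c = mean_y - b * mean_x    # constant (intercept)
    return c, b

# Hypothetical points lying exactly on y = 2 + 3*x.
c, b = fit_line([0, 1, 2, 3], [2, 5, 8, 11])
print(c, b)
```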

Question 8

Workflow design patterns for the machine learning pipelines:

Options:

A.

Aim to explain how the machine learning model works.

B.

Represent a pipeline with directed acyclic graph (DAG).

C.

Seek to simplify the management of machine learning features.

D.

Separate inputs from features.

Question 9

Which of the following is NOT an activation function?

Options:

A.

Additive

B.

Hyperbolic tangent

C.

ReLU

D.

Sigmoid

Question 10

Which two encoders can be used to transform categorical data into numerical features? (Select two.)

Options:

A.

Count Encoder

B.

Log Encoder

C.

Mean Encoder

D.

Median Encoder

E.

One-Hot Encoder
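
For reference, a small sketch of how two of the listed encoders might be implemented in plain Python. The function names and the sample color values are hypothetical; libraries such as scikit-learn provide production versions of this kind of encoding.

```python
from collections import Counter

def one_hot_encode(values):
    """Map each value to a 0/1 vector with one position per category."""
    categories = sorted(set(values))
    return [[1 if v == c else 0 for c in categories] for v in values]

def count_encode(values):
    """Replace each value with how often that category occurs."""
    counts = Counter(values)
    return [counts[v] for v in values]

colors = ["red", "green", "red", "blue"]  # hypothetical categorical feature
one_hot = one_hot_encode(colors)          # columns ordered: blue, green, red
counted = count_encode(colors)
print(one_hot, counted)
```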

Question 11

Which of the following tests should be performed at the production level before deploying a newly retrained model?

Options:

A.

A/B test

B.

Performance test

C.

Security test

D.

Unit test

Question 12

Your dependent variable data is a proportion. The observed range of your data is 0.01 to 0.99. The instrument used to generate the dependent variable data is known to generate low quality data for values close to 0 and close to 1. A colleague suggests performing a logit-transformation on the data prior to performing a linear regression. Which of the following is a concern with this approach?

Definition of logit-transformation

If p is the proportion: logit(p) = log(p / (1 - p))

Options:

A.

After logit-transformation, the data may violate the assumption of independence.

B.

Noisy data could become more influential in your model.

C.

The model will be more likely to violate the assumption of normality.

D.

Values near 0.5 before logit-transformation will be near 0 after.
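
The definition above can be explored numerically. This sketch, with a hypothetical helper named logit, compares how far the same 0.01 step in p moves the transformed value near the edge of the range and near the middle.

```python
import math

def logit(p):
    """logit(p) = log(p / (1 - p)) for a proportion 0 < p < 1."""
    return math.log(p / (1 - p))

# The same 0.01 step in p, taken near 0 and near 0.5.
step_near_edge = logit(0.02) - logit(0.01)
step_near_middle = logit(0.51) - logit(0.50)
print(logit(0.5), step_near_edge, step_near_middle)
```

The step near the edge is many times larger, so small measurement errors close to 0 or 1 are stretched much more than errors near 0.5.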

Question 13

A healthcare company experiences a cyberattack, where the hackers were able to reverse-engineer a dataset to break confidentiality.

Which of the following is TRUE regarding the dataset parameters?

Options:

A.

The model is overfitted and trained on a high quantity of patient records.

B.

The model is overfitted and trained on a low quantity of patient records.

C.

The model is underfitted and trained on a high quantity of patient records.

D.

The model is underfitted and trained on a low quantity of patient records.

Question 14

Which two encoders can be used to transform categorical data into numerical features? (Select two.)

Options:

A.

Count Encoder

B.

Log Encoder

C.

Mean Encoder

D.

Median Encoder

E.

One-Hot Encoder

Question 15

A classifier has been implemented to predict whether or not someone has a specific type of disease. Considering that only 1% of the population in the dataset has this disease, which measures will work the BEST to evaluate this model?

Options:

A.

Mean squared error

B.

Precision and accuracy

C.

Precision and recall

D.

Recall and explained variance
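
To see how class imbalance affects these measures, consider a hypothetical classifier that never predicts the disease on a dataset with 1% positives. The sketch below computes precision, recall, and accuracy from scratch; the function name and the label lists are assumptions for illustration.

```python
def precision_recall_accuracy(y_true, y_pred):
    """Compute precision, recall, and accuracy for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0  # 0.0 by convention when undefined
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    accuracy = (tp + tn) / len(pairs)
    return precision, recall, accuracy

y_true = [1] * 1 + [0] * 99   # 1% of cases are positive
y_pred = [0] * 100            # classifier that always predicts "no disease"
precision, recall, accuracy = precision_recall_accuracy(y_true, y_pred)
print(precision, recall, accuracy)
```

Accuracy comes out at 0.99 even though the classifier finds none of the positive cases.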

Question 16

We are using the k-nearest neighbors algorithm to classify the new data points. The features are on different scales.

Which method can help us to solve this problem?

Options:

A.

Log transformation

B.

Normalization

C.

Square-root transformation

D.

Standardization
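
As background on the scaling methods listed, here is a sketch of min-max normalization and z-score standardization in plain Python. The function names and the age and income values are hypothetical.

```python
from statistics import mean, pstdev

def min_max_normalize(values):
    """Rescale values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

def standardize(values):
    """Rescale values to zero mean and unit (population) standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]

ages = [20, 30, 40]                 # hypothetical feature on a small scale
incomes = [20000, 60000, 100000]    # hypothetical feature on a large scale

print(min_max_normalize(ages), min_max_normalize(incomes))
print(standardize(ages), standardize(incomes))
```

After either rescaling, both features contribute comparably to a distance calculation, instead of income dominating because of its units.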

Question 17

An organization sells house security cameras and has asked their data scientists to implement a model to detect human faces, as distinguished from animals, so they can alert their customers only when a human gets close to their house.

Which of the following algorithms is an appropriate option with a correct reason?

Options:

A.

A decision tree algorithm, because the problem is a classification problem with a small number of features.

B.

k-means, because this is a clustering problem with a small number of features.

C.

Logistic regression, because this is a classification problem and our data is linearly separable.

D.

Neural network model, because this is a classification problem with a large number of features.

Question 18

Given a feature set with rows that contain missing continuous values, and assuming the data is normally distributed, what is the best way to fill in these missing features?

Options:

A.

Delete entire rows that contain any missing features.

B.

Fill in missing features with random values for that feature in the training set.

C.

Fill in missing features with the average of observed values for that feature in the entire dataset.

D.

Delete entire columns that contain any missing features.
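
For illustration, a minimal sketch of mean imputation, one of the approaches listed above. Missing values are represented as None, and the function name impute_mean and the sample data are assumptions.

```python
def impute_mean(values):
    """Replace None entries with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    mu = sum(observed) / len(observed)
    return [mu if v is None else v for v in values]

heights = [1.0, None, 3.0]  # hypothetical continuous feature with a gap
filled = impute_mean(heights)
print(filled)
```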

Question 19

An HR solutions firm is developing software for staffing agencies that uses machine learning.

The team uses training data to teach the algorithm and discovers that it generates lower employability scores for women. Also, it predicts that women, especially with children, are less likely to get a high-paying job.

Which type of bias has been discovered?

Options:

A.

Automation

B.

Emergent

C.

Preexisting

D.

Technical

Question 20

In general, models that perform their tasks:

Options:

A.

Less accurately are less robust against adversarial attacks.

B.

Less accurately are neither more nor less robust against adversarial attacks.

C.

More accurately are less robust against adversarial attacks.

D.

More accurately are neither more nor less robust against adversarial attacks.

Question 21

What is the open framework designed to help detect, respond to, and remediate threats in ML systems?

Options:

A.

Adversarial ML Threat Matrix

B.

MITRE ATT&CK® Matrix

C.

OWASP Threat and Safeguard Matrix

D.

Threat Susceptibility Matrix

Question 22

A dataset can contain a range of values that depict a certain characteristic, such as grades on tests in a class during the semester. A specific student has so far received the following grades: 76, 81, 78, 87, 75, and 72. There is one final test in the semester. What minimum grade would the student need to achieve on the last test to get an 80% average?

Options:

A.

82

B.

89

C.

91

D.

94
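
The arithmetic behind this question can be checked with a short sketch: the target average multiplied by the total number of tests gives the required sum, and subtracting the grades so far gives the grade still needed. The function name min_final_grade is an assumption for this example.

```python
import math

def min_final_grade(grades, target_average):
    """Smallest whole-number grade on one more test that reaches the target average."""
    total_tests = len(grades) + 1
    needed = target_average * total_tests - sum(grades)
    return math.ceil(needed)

grades = [76, 81, 78, 87, 75, 72]   # grades given in the question (sum is 469)
required = min_final_grade(grades, 80)
print(required)
```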

Question 23

Which of the following is TRUE about SVM models?

Options:

A.

They can be used only for classification.

B.

They can be used only for regression.

C.

They can take the feature space into higher dimensions to solve the problem.

D.

They use the sigmoid function to classify the data points.

Question 24

Which of the following is a common negative side effect of not using regularization?

Options:

A.

Overfitting

B.

Slow convergence time

C.

Higher compute resources

D.

Low test accuracy

Question 25

Which of the following is the correct definition of the quality criterion that describes completeness?

Options:

A.

The degree to which all required measures are known.

B.

The degree to which a set of measures are equivalent across systems.

C.

The degree to which a set of measures are specified using the same units of measure in all systems.

D.

The degree to which the measures conform to defined business rules or constraints.

Question 26

Which of the following algorithms is an example of unsupervised learning?

Options:

A.

Neural networks

B.

Principal components analysis

C.

Random forest

D.

Ridge regression

Question 27

Why do data skews happen in the ML pipeline?

Options:

A.

Test and evaluation data are designed incorrectly.

B.

There is a mismatch between live input data and offline data.

C.

There is a mismatch between live output data and offline data.

D.

There is insufficient training data for evaluation.
