This question will ask you to provide a missing option.
Complete the following syntax to test the homogeneity of variance assumption in the GLM procedure:
means Region /
An analyst knows that the categorical predictor, zip_code, is an important predictor of a binary target. However, zip_code has too many levels to be a feasible predictor in a model. The analyst uses PROC CLUSTER to implement Greenacre's method to reduce the number of categorical levels.
What is the correct application of Greenacre's method in this situation?
When mean imputation is performed after the data is partitioned for honest assessment, what is the most appropriate way to handle the imputation?
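For background, one common convention for honest assessment is to estimate imputation values on the training partition only and reuse them on the other partitions. The sketch below illustrates that convention in plain Python; the function names and the tiny data lists are hypothetical and are not taken from the exam material.

```python
from statistics import mean

def fit_mean(training_values):
    """Estimate the imputation value from non-missing training values only."""
    observed = [v for v in training_values if v is not None]
    return mean(observed)

def impute(values, fill_value):
    """Replace missing entries (None) with a previously estimated value."""
    return [fill_value if v is None else v for v in values]

# Hypothetical partitions; None marks a missing value.
train = [10.0, None, 14.0]
valid = [None, 20.0]

train_mean = fit_mean(train)               # estimated on training data only
train_imputed = impute(train, train_mean)
valid_imputed = impute(valid, train_mean)  # validation reuses the training mean
```

Under this convention the validation data never contributes to the imputation value, so it remains an untouched check on the model.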
Refer to the lift chart:
What does the reference line at lift = 1 correspond to?
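As a reminder of the quantity plotted, lift is usually defined as the response rate in a selected segment divided by the overall response rate. The following is a minimal sketch of that definition; the function name and all counts are hypothetical, since the chart itself is not reproduced here.

```python
def lift(responders_in_segment, segment_size, responders_overall, population_size):
    """Lift = response rate in the segment / response rate in the whole population."""
    segment_rate = responders_in_segment / segment_size
    baseline_rate = responders_overall / population_size
    return segment_rate / baseline_rate

# Hypothetical counts: 125 responders among 1000 cases overall.
top_segment = lift(25, 100, 125, 1000)    # segment responds at twice the baseline rate
whole_sample = lift(125, 1000, 125, 1000) # a segment identical to the population
```

A segment that responds at exactly the overall rate has a lift of 1 under this definition.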
The PROC LOGISTIC options SELECTION=SCORE and BEST=2 are used in a MODEL statement to generate a series of predictive models. The models are assigned numbers in order from 1 to 99 reflecting the fact that there are 50 candidate input variables. Results from the collection of derived models are used to generate the following plot of overall average profit by model number. Results are restricted to models with at least 9 inputs and at most 40 inputs.
The maximum value for the training data occurs for model number 46, and the maximum value for the validation data occurs for model number 43.
If you base model selection solely on overall average profit, what is the correct choice?
The SAS data set RESULT contains the following variables:
Which SAS programs can be used to find the p-value for comparing GrpA sales with GrpB sales? (Choose two.)
An analyst has a sufficient volume of data to perform a 3-way partition of the data into training, validation, and test sets to perform honest assessment during the model building process.
What is the purpose of the training data set?
A predictive model uses a data set that has several variables with missing values.
What two problems can arise with this model? (Choose two.)
The selection criterion used in the forward selection method in the REG procedure is:
Which characteristic of Studentized residuals indicates potential outliers?
Refer to the confusion matrix:
Calculate the sensitivity. (0 - negative outcome, 1 - positive outcome)
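Sensitivity is the proportion of actual positive cases (outcome 1) that the model classifies as positive. The sketch below shows the arithmetic; the function name is hypothetical and the counts are made up for illustration, since the confusion matrix from the exhibit is not reproduced here.

```python
def sensitivity(true_positives: int, false_negatives: int) -> float:
    """Sensitivity (true positive rate) = TP / (TP + FN)."""
    actual_positives = true_positives + false_negatives
    if actual_positives == 0:
        raise ValueError("sensitivity is undefined without actual positive cases")
    return true_positives / actual_positives

# Hypothetical counts, NOT the values from the exhibit.
example = sensitivity(true_positives=30, false_negatives=10)
print(example)  # 0.75
```

Only the row of actual positives enters the calculation; true negatives and false positives affect specificity instead.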
Refer to the exhibit.
Given alpha=0.02, which conclusion is justified regarding percentage of body fat, comparing small (S), medium (M), and large (L) wrist sizes?
Select the equivalent LOGISTIC procedure model statements. (Choose two.)
What is the default method in the LOGISTIC procedure to handle observations with missing data?