You are a data scientist trying to load data into your notebook session. You understand that the Accelerated Data Science (ADS) SDK supports loading various data formats. Which THREE of the following are ADS-supported data formats?
You have created a conda environment in your notebook session. This is the first time you are
working with published conda environments. You have also created an Object Storage bucket with
permission to manage the bucket.
Which two commands are required to publish the conda environment?
You want to evaluate the relationship between feature values and target variables. You have a
large number of observations having a near uniform distribution and the features are highly
correlated.
Which model explanation technique should you choose?
Six months ago, you created and deployed a model that predicts customer churn for a call
centre. Initially, it was yielding quality predictions. However, over the last two months, users are
questioning the credibility of the predictions.
Which two methods would you employ to verify the accuracy of the model?
You have trained a machine learning model on Oracle Cloud Infrastructure (OCI) Data Science,
and you want to save the code and associated pickle file in a Git repository. To do this, you have to
create a new SSH key pair to use for authentication. Which SSH command would you use to create
the public/private algorithm key pair in the notebook session?
As a data scientist, you create models for cancer prediction based on mammographic images.
Correct identification is crucial in this case. After evaluating two models, you arrive at the
following metrics from their confusion matrices:
• Model 1 has a test accuracy of 80% and a recall of 70%.
• Model 2 has a test accuracy of 75% and a recall of 85%.
Which model would you prefer and why?
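The two metrics quoted above can be related back to confusion-matrix counts with a short sketch. The counts below are hypothetical: they are chosen only so that they reproduce the stated accuracy and recall on an assumed test set of 100 images, and they do not come from the original question.

```python
from dataclasses import dataclass


@dataclass
class ConfusionMatrix:
    tp: int  # cancers correctly flagged
    fp: int  # healthy cases wrongly flagged
    fn: int  # cancers missed
    tn: int  # healthy cases correctly cleared

    def accuracy(self) -> float:
        total = self.tp + self.fp + self.fn + self.tn
        return (self.tp + self.tn) / total

    def recall(self) -> float:
        # Share of actual positive cases that the model detects.
        return self.tp / (self.tp + self.fn)


# Hypothetical counts (assumption): 100 test images per model.
model_1 = ConfusionMatrix(tp=35, fp=5, fn=15, tn=45)
model_2 = ConfusionMatrix(tp=34, fp=19, fn=6, tn=41)

for name, cm in (("Model 1", model_1), ("Model 2", model_2)):
    print(
        f"{name}: accuracy={cm.accuracy():.2f}, "
        f"recall={cm.recall():.2f}, missed positives={cm.fn}"
    )
```

Recall measures how many of the actual positive cases are caught, so a lower recall corresponds to more false negatives, which in this setting are missed cancers.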
For your next data science project, you need access to public geospatial images.
Which Oracle Cloud service provides free access to those images?
As a data scientist, you are working on a global health data set that has data from more than 50 countries. You want to encode three features, such as 'countries', 'race', and 'body organ' as categories. Which option would you use to encode the categorical feature?
When preparing your model artifact to save it to the Oracle Cloud Infrastructure (OCI) Data Science model catalog, you create a score.py file. What is the purpose of the score.py file?
As a data scientist, you use the Oracle Cloud Infrastructure (OCI) Language service to train custom
models. Which types of custom models can be trained?
You are a data scientist designing an air traffic control model, and you choose to leverage Oracle
AutoML. You understand that the Oracle AutoML pipeline consists of multiple stages and
automatically operates in a certain sequence. What is the correct sequence for the Oracle AutoML
pipeline?
You are a data scientist leveraging Oracle Cloud Infrastructure (OCI) Data Science to create a
model and need some additional Python libraries for processing genome sequencing data. Which of
the following THREE statements are correct with respect to installing additional Python libraries to
process the data?
You have trained three different models on your data set using Oracle AutoML. You want to
visualize the behavior of each of the models, including the baseline model, on the test set. Which
class should be used from the Accelerated Data Science (ADS) SDK to visually compare the models?
You are a data scientist working inside a notebook session and you attempt to pip install a
package from a public repository that is not included in your conda environment. After running this
command, you get a network timeout error.
What might be missing from your networking configuration?
You are creating an Oracle Cloud Infrastructure (OCI) Data Science job that will run on a recurring basis in a production environment. This job will pick up sensitive data from an Object Storage bucket, train a model, and save it to the model catalog. How would you design the authentication mechanism for the job?
What preparation steps are required to access an Oracle AI service SDK from a Data Science
notebook session?
The Accelerated Data Science (ADS) model evaluation classes support different types of machine
learning modeling techniques. Which three types of modeling techniques are supported by ADS
Evaluators?
You have an embarrassingly parallel or distributed batch job on a large amount of data that you
consider running using Data Science Jobs. What would be the best approach to run the workload?
You have a complex Python code project that could benefit from using Data Science Jobs as it is a
repeatable machine learning model training task. The project contains many subfolders and classes.
What is the best way to run this project as a Job?
As a data scientist, you are tasked with creating a model training job that is expected to take
different hyperparameter values on every run. What is the most efficient way to set those
parameters with Oracle Data Science Jobs?
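One common pattern for run-to-run parameters, offered here as a sketch rather than as the definitive answer, is to have the training script read its hyperparameters at start-up from environment variables (or command-line arguments) so that each run can supply different values without changing the code. The variable names and default values below are hypothetical.

```python
import os


def read_hyperparameters(env=os.environ):
    """Read hyperparameters from environment variables, with defaults.

    The variable names (LEARNING_RATE, MAX_DEPTH, N_ESTIMATORS) are
    hypothetical and chosen only for illustration.
    """
    return {
        "learning_rate": float(env.get("LEARNING_RATE", "0.01")),
        "max_depth": int(env.get("MAX_DEPTH", "6")),
        "n_estimators": int(env.get("N_ESTIMATORS", "100")),
    }


if __name__ == "__main__":
    params = read_hyperparameters()
    print(f"Training with: {params}")
```

Because the script only reads its configuration, the same job artifact can be reused for every run and only the values set for that run change.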
Which two statements are true about published conda environments?
Which Oracle Accelerated Data Science (ADS) classes can be used for easy access to data sets from
reference libraries and index websites such as scikit-learn?
While reviewing your data, you discover that your data set has a class imbalance. You are aware
that the Accelerated Data Science (ADS) SDK provides multiple built-in automatic transformation
tools for data set transformation. Which would be the right tool to correct any imbalance between
the classes?
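To make the idea of correcting a class imbalance concrete, the sketch below performs plain random oversampling of the minority class using only the Python standard library. It is not the ADS tool the question refers to and does not use the ADS API; the function name and the toy data are hypothetical.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen minority-class rows until all classes match the largest one."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        # Rows of this class in the original data, used as the sampling pool.
        pool = [r for r, y in zip(rows, labels) if y == cls]
        for _ in range(target - n):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels


# Toy data (assumption): four "no" rows and one "yes" row.
rows = [[0.1], [0.2], [0.3], [0.4], [0.9]]
labels = ["no", "no", "no", "no", "yes"]

balanced_rows, balanced_labels = random_oversample(rows, labels)
print(Counter(balanced_labels))
```

Oversampling duplicates existing minority rows; the alternative of downsampling instead discards majority rows, which trades information loss for a smaller data set.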